Communications in Mathematical Physics - Volume 192

Commun. Math. Phys. 192, 1 – 7 (1998) Communications in Mathematical Physics c Springer-Verlag 1998 Equilibrium Clas...

Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 192, 1 – 7 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

Equilibrium Classical Statistical Mechanics of Continuous Systems in the Kirkwood–Salsburg Approach

Marek Gorzelańczyk

Institute of Theoretical Physics, University of Wrocław, pl. M. Borna 9, 50-204 Wrocław, Poland. E-mail: [email protected]

Received: 22 July 1996 / Accepted: 9 June 1997

Abstract: We propose a new scheme for the analysis of the Kirkwood–Salsburg equation. The Kirkwood–Salsburg operator is considered on a space constructed in such a way that the spectral analysis of this operator is essentially simplified. We find a new region free of phase transitions of any order.

1. Introduction

The equilibrium states of classical statistical mechanics can be described by correlation functions. The purpose of this paper is to investigate the analytic properties of the correlation functions in the thermodynamical limit. These functions satisfy a system of integral equations (e.g. the Kirkwood–Salsburg (K–S) or Mayer–Montroll equations) on a suitable Banach space. The study of these integral equations yields information about the thermodynamical limit of the correlation functions. In approaching the problem we are free to choose both the integral equations and the Banach space. The K–S equation has caught our attention because the accompanying K–S operator is known to be linear, bounded and independent of the parameter (i.e. the chemical activity). Thus, the key question is how the spectrum of the K–S operator behaves as the thermodynamical limit is approached. The only existing general result describing the behaviour of the spectrum is the estimate of the spectral radius of the K–S operator by its norm. In the finite volume case Pastur [2] has shown that the spectrum coincides with the zeroes of the grand canonical partition function. Other results concern properties of the spectrum but do not localize it [6]. In that approach, the Banach space on which the K–S operator acts is so large that the analysis of the spectrum is very complicated. In contrast, in the present paper we take the smallest possible Banach space which contains all the correlation functions. In addition, this space is cyclic for the K–S operator (Theorem 1), and obviously the K–S operator restricted to the cyclic space is still bounded, with a norm not greater than the previous one. In Sect. 3 we show that the above restrictions


do simplify the spectral analysis of the K–S operator. Thus, from Theorem 2 it follows that we can investigate the spectral radius of the operator by means of its action on the cyclic vector. The latter takes a particularly simple form for the K–S operator. This allows us to prove the following result (see Sect. 3 for notation and all the details):

The sequence of the correlation functions tends to the thermodynamical limit uniformly on compact subsets for $z^{-1} \in G_1 \cup G_2$. So, the limit is an analytic function of $z$ in the set $G_1 \cup G_2$.

Since the set $\{z : 0 \le z \le [(e-1)C]^{-1}\}$ (where $C$ is the constant defined by Eq. (3)) is contained in $G_1 \cup G_2$, from this result we conclude that no phase transition occurs for nonnegative chemical activity $z$ less than $[(e-1)C]^{-1}$. According to [5], Sect. 4.5, it follows that for the density $\rho \le [eC]^{-1}$ there is no phase transition either. It is worth stressing that the region of analyticity obtained here is essentially larger than the previously known $\rho \le [(e+1)C]^{-1}$ (see [5]). Moreover, we have shown that all correlation functions are analytic in that region, whereas the result stated in [5] concerns only the one-particle correlation function (i.e. the density).

2. Cyclic space of the K–S operator

Let $x_i \in \mathbb{R}^\nu$ for $i = 1, \ldots, n$, $(x)_n = (x_1, \ldots, x_n)$ and let $\varphi(x)_n$ be a measurable complex-valued function on $\mathbb{R}^{n\nu}$. Let $E_\xi(\Lambda)$ ($\xi > 0$) denote the Banach space of sequences $\varphi = \{\varphi(x)_n\}_{n=1}^{\infty}$ with the norm
\[
\|\varphi\| = \sup_{n \ge 1}\, \xi^{-n} \operatorname*{ess\,sup}_{(x)_n \in \Lambda^n} |\varphi(x)_n| , \tag{1}
\]

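where $\Lambda$ is a bounded Lebesgue measurable set in $\mathbb{R}^\nu$. Let $K_\Lambda$ denote the Kirkwood–Salsburg operator
\[
\begin{aligned}
(K_\Lambda \varphi)(x)_1 &= \chi(x)_1 \sum_{m=1}^{\infty} \frac{1}{m!} \int_{\Lambda^m} K(x_1,(y)_m)\, \varphi(y)_m \, d(y)_m , \\
(K_\Lambda \varphi)(x)_n &= \chi(x)_n \, e^{-\beta W(x_1,(x)'_n)} \sum_{m=0}^{\infty} \frac{1}{m!} \int_{\Lambda^m} K(x_1,(y)_m)\, \varphi((x)'_n,(y)_m) \, d(y)_m \quad \text{for } n \ge 2 ,
\end{aligned}
\tag{2}
\]
where
\[
(x)'_n = (x_2, \ldots, x_n), \qquad W(x_1,(x)'_n) = \sum_{i=2}^{n} \Phi(x_1 - x_i),
\]
\[
K(x_1,(y)_m) = \prod_{j=1}^{m} \bigl(e^{-\beta\Phi(x_1 - y_j)} - 1\bigr), \qquad \chi(x)_n = \prod_{i=1}^{n} \chi_\Lambda(x_i) .
\]
$\chi_\Lambda$ is the characteristic function of the set $\Lambda \subset \mathbb{R}^\nu$. We assume that the potential $\Phi$ is stable and regular, so
\[
C = \int |e^{-\beta\Phi(x)} - 1| \, dx < \infty . \tag{3}
\]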

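Under these assumptions the operator $K_\Lambda$ is bounded [5].

Definition 1. Let $D_\xi(\Lambda)$ denote the subspace of $E_\xi(\Lambda)$ consisting of the following functions
\[
\rho_\Lambda\{a_l\}_{l=1}^{\infty}(x)_p = \sum_{l=0}^{\infty} \frac{a_{l+p}}{l!} \int_{\Lambda^l} e^{-\beta U((x)_p,(y)_l)} \, d(y)_l , \tag{4}
\]
where
\[
U((x)_p,(y)_l) = \sum_{i<j}^{p} \Phi(x_i - x_j) + \sum_{\substack{1 \le i \le p \\ 1 \le j \le l}} \Phi(x_i - y_j) + \sum_{i<j}^{l} \Phi(y_i - y_j),
\]
and the coefficients $a_l$ are complex numbers such that (4) converges for each $(x)_p$, $p = 1, 2, \ldots$.

Remark 1. If $\chi_\Lambda(x)_p \int_{\Lambda^l} e^{-\beta U((x)_p,(y)_l)} \, d(y)_l = 0$, we put $a_{l+p} = 0$.

Following [2], let us introduce two bounded operators defined as follows:
\[
\begin{aligned}
(A_\Lambda \varphi)(x)_p &= \sum_{m=0}^{\infty} \frac{1}{m!} \int_{\Lambda^m} \varphi((x)_p,(y)_m) \, d(y)_m , \\
(B_\Lambda \varphi)(x)_p &= \sum_{m=0}^{\infty} \frac{(-1)^m}{m!} \int_{\Lambda^m} \varphi((x)_p,(y)_m) \, d(y)_m ;
\end{aligned}
\tag{5}
\]
then we have
\[
A_\Lambda B_\Lambda = B_\Lambda A_\Lambda = I . \tag{6}
\]

Lemma 1. $D_\xi(\Lambda)$ is a Banach space.

Proof. It suffices to prove the completeness of $D_\xi(\Lambda)$. Let a sequence $\{\rho_\Lambda\{a^q_l\}_{l=1}^{\infty}\}_{q=1}^{\infty}$ be Cauchy. By the definition of the operators $A_\Lambda$, $B_\Lambda$ we can write
\[
\bigl\| \{ e^{-\beta U(x)_l} (a^q_l - a^p_l) \}_{l=1}^{\infty} \bigr\| = \bigl\| B_\Lambda \bigl( \rho_\Lambda\{a^q_l\}_{l=1}^{\infty} - \rho_\Lambda\{a^p_l\}_{l=1}^{\infty} \bigr) \bigr\| \le \| B_\Lambda \| \, \bigl\| \rho_\Lambda\{a^q_l\}_{l=1}^{\infty} - \rho_\Lambda\{a^p_l\}_{l=1}^{\infty} \bigr\| . \tag{7}
\]
Remark 1 guarantees that there exist points $(x)_l \in \Lambda$ such that $e^{-\beta U(x)_l} > 0$, so the sequences $\{a^q_l\}_{q=1}^{\infty}$ are Cauchy for any $l$. Let $b_l = \lim_{q\to\infty} a^q_l$; then $\rho_\Lambda\{b_l\}_{l=1}^{\infty}$ belongs to $E_\xi(\Lambda)$ since $E_\xi(\Lambda)$ is complete, and clearly $\rho_\Lambda\{b_l\}_{l=1}^{\infty}$ also belongs to $D_\xi(\Lambda)$.

The operator $K_\Lambda$ acts in the space of the coefficients $\{a_l\}_{l=1}^{\infty}$ in the following way:
\[
K_\Lambda \{a_l\}_{l=1}^{\infty} = \{a_{l-1}\}_{l=1}^{\infty}, \quad \text{where } a_0 = - \sum_{m=1}^{\infty} \frac{a_m}{m!} \int_{\Lambda^m} e^{-\beta U(x)_m} \, d(x)_m . \tag{8}
\]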

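Lemma 2. Let the sequence $\{a_l\}_{l=1}^{L}$ be finite. Then there exist a vector $\rho_\Lambda\{b_l\}_{l=1}^{L-1}$ and a complex number $\beta$ such that
\[
\rho_\Lambda\{a_l\}_{l=1}^{L} = K_\Lambda \rho_\Lambda\{b_l\}_{l=1}^{L-1} + \beta \mathbf{1}, \tag{9}
\]
where $\mathbf{1} = (1, 0, \ldots)$.

Proof. Equation (9) is equivalent to
\[
a_1 = \beta - \sum_{m=1}^{\infty} \frac{b_m}{m!} \int_{\Lambda^m} e^{-\beta U(x)_m} \, d(x)_m , \qquad a_l = b_{l-1} \quad \text{for } l = 2, \ldots, L .
\]
Thus we can find $\{b_l\}_{l=1}^{L-1}$ and $\beta$ such that $\rho_\Lambda\{b_l\}_{l=1}^{L-1}$ belongs to $D_\xi(\Lambda)$.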

Theorem 1. The vector $\mathbf{1}$ is a cyclic vector of the space $D_\xi(\Lambda)$.

Proof. The vectors with a finite number of coefficients are dense in $D_\xi(\Lambda)$. Applying Lemma 2 $(L-1)$ times to the vector $\rho_\Lambda\{a_l\}_{l=1}^{L}$ we get
\[
\rho_\Lambda\{a_l\}_{l=1}^{L} = P(K_\Lambda)\, \mathbf{1} , \tag{10}
\]
where $P(K_\Lambda)$ is a polynomial function of $K_\Lambda$.

Remark 2. For potentials with a hard core each vector belonging to $D_\xi(\Lambda)$ has the form (10).

Proposition 1. The spectrum of the operator $K_\Lambda$ on $D_\xi(\Lambda)$ coincides with the set of zeroes of the grand canonical partition function $Q(z)$.

Proof. It follows from the result of Pastur [2] and Lemma 1.

3. The Spectral Radius of the K–S Operator

In this section we shall study the spectral properties of the operator $K_\Lambda$. Since, for $z^{-1} \in \rho(K_\Lambda)$, where $\rho(K_\Lambda)$ is the resolvent set of $K_\Lambda$, we have
\[
\rho_\Lambda \Bigl\{ \frac{z^l}{Q_\Lambda(z)} \Bigr\}_{l=1}^{\infty} = (z^{-1} I - K_\Lambda)^{-1} \mathbf{1} , \tag{11}
\]
we can find analytic properties of the correlation functions from the spectral analysis of $K_\Lambda$. Let $T$ be a bounded linear operator on a Banach space $E$ with the cyclic vector $\mathbf{1}$. Let $P_k(T)$ denote a polynomial function of $T$, $f$ a rational complex function and $r(f(T))$ the spectral radius of the operator $f(T)$.

Theorem 2. We assume that for each vector $\psi$ of the space $E$ there exists a bounded sequence of operators $\{P_k(T)\}$ such that
\[
\psi = \lim_{k\to\infty} P_k(T)\, \mathbf{1} .
\]
Then we have
\[
r(f(T)) = \limsup_{n\to\infty} \| f(T)^n \mathbf{1} \|^{1/n} . \tag{12}
\]

Proof. Let $\psi \in E$, $\limsup_{n\to\infty} \| f(T)^n \mathbf{1} \|^{1/n} = r$ and $\| P_k(T) \| \le N$, so
\[
\limsup_{n\to\infty} \| f(T)^n \psi \|^{1/n} = \limsup_{n\to\infty} \lim_{k\to\infty} \| P_k(T) f(T)^n \mathbf{1} \|^{1/n} \le \lim_{n\to\infty} N^{1/n} \limsup_{n\to\infty} \| f(T)^n \mathbf{1} \|^{1/n} = r . \tag{13}
\]
Let us consider the following sequence:
\[
\Bigl\{ \frac{f(T)^n}{(r+\varepsilon)^n} \Bigr\}_{n=1}^{\infty} , \tag{14}
\]
where $\varepsilon$ is an arbitrary positive number. By the Banach–Steinhaus theorem [4] the family of operators (14) is bounded or there exists $\psi \in E$ such that
\[
\sup_n \Bigl\| \frac{f(T)^n}{(r+\varepsilon)^n} \psi \Bigr\| = \infty . \tag{15}
\]
But (13) implies that (15) is impossible. Thus there exists $M > 0$ such that
\[
\Bigl\| \frac{f(T)^n}{(r+\varepsilon)^n} \Bigr\| \le M \quad \text{for any } n . \tag{16}
\]
Hence
\[
\lim_{n\to\infty} \| f(T)^n \|^{1/n} \le r + \varepsilon . \tag{17}
\]
Since $\varepsilon$ is an arbitrary number,
\[
\lim_{n\to\infty} \| f(T)^n \|^{1/n} \le \limsup_{n\to\infty} \| f(T)^n \mathbf{1} \|^{1/n} ; \tag{18}
\]
we also have
\[
\limsup_{n\to\infty} \| f(T)^n \mathbf{1} \|^{1/n} \le \lim_{n\to\infty} \| f(T)^n \|^{1/n} \lim_{n\to\infty} \| \mathbf{1} \|^{1/n} , \tag{19}
\]
which completes the proof.
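As a finite-dimensional illustration of formula (12) with $f(T) = T$, the following Python sketch compares the growth rate of $\|T^n \mathbf{1}\|^{1/n}$ with the spectral radius computed from the eigenvalues. The matrix, the vector and the helper name `growth_rate` are assumptions chosen for this example only; they do not come from the paper, and a small matrix is merely a stand-in for a bounded operator with a cyclic vector.

```python
import numpy as np

def growth_rate(T, v, n):
    """Return ||T^n v||^(1/n) in the sup norm (v is assumed to have sup norm 1).

    The vector is renormalised at every step and the logarithms of the
    scaling factors are accumulated, which avoids overflow or underflow.
    """
    log_norm = 0.0
    w = v.astype(float)
    for _ in range(n):
        w = T @ w
        s = np.linalg.norm(w, ord=np.inf)
        log_norm += np.log(s)
        w = w / s
    return float(np.exp(log_norm / n))

# Hypothetical companion-type matrix: T e1 = e2 and T e2 = e3,
# so the first basis vector is cyclic for T.
T = np.array([[0.0, 0.0, 0.5],
              [1.0, 0.0, 0.2],
              [0.0, 1.0, 0.3]])
one = np.array([1.0, 0.0, 0.0])

r_exact = max(abs(np.linalg.eigvals(T)))   # spectral radius from eigenvalues
r_est = growth_rate(T, one, 2000)          # estimate from the cyclic vector
print(r_exact, r_est)
```

For this choice the characteristic polynomial factors as $(\lambda - 1)(\lambda^2 + 0.7\lambda + 0.5)$, so the spectral radius is 1 and the estimate from the cyclic vector approaches it as $n$ grows.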

By Remark 2 we see that the hard core potentials fulfill the assumption of Theorem 2, so we shall estimate the spectral radius of $K_\Lambda$ for nonnegative potentials with a hard core. Using the following inequalities
\[
\Bigl| \int_{\Lambda^n} K(x_1,(y)_n) \, d(y)_n \Bigr| \le C^n , \qquad e^{-\beta W(x_1,(y)_n)} \le 1 , \tag{20}
\]
we observe that
\[
\| K_\Lambda^n \mathbf{1} \| \le \sup_p \, \xi^{-p} C^{n-p+1} (A^n \mathbf{1})_p , \tag{21}
\]
where $A$ is the following infinite matrix:
\[
A = \begin{pmatrix}
1 & 1/2! & 1/3! & 1/4! & \cdots \\
1 & 1 & 1/2! & 1/3! & \cdots \\
0 & 1 & 1 & 1/2! & \cdots \\
0 & 0 & 1 & 1 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} . \tag{22}
\]
$(A^n \mathbf{1})_p$ denotes the $p$-th coordinate of $A^n \mathbf{1}$. The norm of the matrix $\{a_{jk}\}$ generated by the vector norm $\sup_p |v_p|$ is
\[
\| \{a_{jk}\} \| = \sup_j \sum_{k=1}^{\infty} |a_{jk}| . \tag{23}
\]
The norm (23) of the matrix $A$ is equal to $e$. Therefore
\[
\| K_\Lambda^n \mathbf{1} \|^{1/n} \le e \sup_{p \le n+1} \bigl( \xi^{-p} C^{n-p+1} \bigr)^{1/n} . \tag{24}
\]
When $C \ge \xi^{-1}$, we get by Theorem 2 and the inequality (24) that
\[
r(K_\Lambda) \le eC . \tag{25}
\]
Otherwise, when $C \le \xi^{-1}$,
\[
r(K_\Lambda) \le e\,\xi^{-1} . \tag{26}
\]
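The claim that the norm (23) of the matrix (22) equals $e$ can be checked numerically on a finite corner of $A$. The sketch below is an illustration under stated assumptions: the truncation size, the helper names `truncated_A`, `row_sum_norm` and `thresholds`, and the sample value $C = 1$ are all hypothetical choices, not part of the paper. The `thresholds` helper simply evaluates the three bounds quoted in the Introduction.

```python
import math
import numpy as np

def truncated_A(size):
    """Return the size x size upper-left corner of the matrix A in (22)."""
    A = np.zeros((size, size))
    for k in range(size):                  # first row: 1, 1/2!, 1/3!, ...
        A[0, k] = 1.0 / math.factorial(k + 1)
    for j in range(1, size):               # later rows: 1, 1, 1/2!, 1/3!, ...
        for k in range(j - 1, size):       # starting one column left of the diagonal
            A[j, k] = 1.0 / math.factorial(k - j + 1)
    return A

def row_sum_norm(A):
    """Operator norm induced by the sup norm on vectors: the largest absolute row sum."""
    return float(np.abs(A).sum(axis=1).max())

def thresholds(C):
    """Bounds quoted in the text for a given value of the constant C of Eq. (3)."""
    return {
        "activity": 1.0 / ((math.e - 1.0) * C),     # z < [(e-1)C]^{-1}
        "density_new": 1.0 / (math.e * C),          # rho <= [eC]^{-1}
        "density_old": 1.0 / ((math.e + 1.0) * C),  # rho <= [(e+1)C]^{-1}
    }

A = truncated_A(20)
print(row_sum_norm(A), math.e)
print(thresholds(1.0))
```

The first row sums to $e - 1$ and every later row of the infinite matrix sums to $e$, which is why the supremum in (23) is $e$; the truncation only drops terms of order $1/20!$.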

Remark 3. As an application we derive interesting properties of the correlation functions for the values of the chemical activity $z$ on the real positive axis. The correlation functions belong to $E_\xi(\Lambda)$ for $\lambda = z^{-1} \ge \xi^{-1}$ in the case of nonnegative potentials. Thus without loss of generality we can assume that $\xi^{-1} < C$. Then the set
\[
G_1 = \{ \lambda : |\lambda| > eC \} \tag{27}
\]
belongs to $\rho(K_\Lambda)$, where $\rho(K_\Lambda)$ is the resolvent set of $K_\Lambda$. If $|\lambda - \lambda_0| > r(K_\Lambda - \lambda_0 I)$, then
\[
(\lambda I - K_\Lambda)^{-1} = \sum_{n=0}^{\infty} \frac{1}{(\lambda - \lambda_0)^{n+1}} (K_\Lambda - \lambda_0 I)^n . \tag{28}
\]
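The role of the shifted centre $\lambda_0$ in (28) can be seen on a small matrix: the expansion about $\lambda_0$ may converge at a point $\lambda$ where the expansion about 0 would not. The following Python sketch is only an illustration; the $2 \times 2$ matrix, the values of `lam` and `lam0`, and the function name `shifted_resolvent` are assumptions made for the example and are unrelated to the actual operator $K_\Lambda$.

```python
import numpy as np

def shifted_resolvent(K, lam, lam0, terms):
    """Partial sum of sum_n (K - lam0 I)^n / (lam - lam0)^(n+1), cf. (28)."""
    dim = K.shape[0]
    shifted = K - lam0 * np.eye(dim)
    power = np.eye(dim)                    # (K - lam0 I)^0
    total = np.zeros((dim, dim))
    for n in range(terms):
        total += power / (lam - lam0) ** (n + 1)
        power = power @ shifted
    return total

# Hypothetical matrix with spectrum {-1, -0.5}; its spectral radius is 1.
K = np.array([[-1.0, 0.3],
              [0.0, -0.5]])
lam, lam0 = 0.6, -0.75   # |lam| < r(K), but |lam - lam0| = 1.35 > r(K - lam0 I) = 0.25

exact = np.linalg.inv(lam * np.eye(2) - K)
approx = shifted_resolvent(K, lam, lam0, 60)
print(np.max(np.abs(exact - approx)))
```

Here the spectrum lies on the negative axis, so shifting the centre to a negative $\lambda_0$ shrinks the spectral radius of the shifted matrix and enlarges the region of convergence towards small positive $\lambda$, which mirrors the choice $\lambda_0 = -C$ made in the text.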

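Let us assume $\lambda_0 = -C$. Taking into account the inequality
\[
\Bigl| \int_{\Lambda} e^{-\beta W(x_1,(x)'_n)} K(x_1, y_1) \, dy_1 + C \Bigr| \le C , \tag{29}
\]
we conclude in the same manner as above that the set
\[
G_2 = \{ \lambda : |\lambda + C| > eC \} \tag{30}
\]
also belongs to the resolvent set of the operator $K_\Lambda$ for $\xi \ge C^{-1}$. Now we can formulate the technical lemma.

Lemma 3. Let $D$ be a compact subset of the set $G_1 \cup G_2$. If $z^{-1} \in D$ then
\[
\lim_{k,j\to\infty} \Bigl\| \chi_{\Lambda'} \Bigl[ \rho_{\Lambda_k} \Bigl\{ \frac{z^l}{Q_{\Lambda_k}(z)} \Bigr\}_{l=1}^{\infty} - \rho_{\Lambda_j} \Bigl\{ \frac{z^l}{Q_{\Lambda_j}(z)} \Bigr\}_{l=1}^{\infty} \Bigr] \Bigr\| = 0 , \tag{31}
\]
where $\Lambda_1 \subset \Lambda_2 \subset \ldots$, $\bigcup_{k=1}^{\infty} \Lambda_k = \mathbb{R}^\nu$ are Lebesgue measurable and $\Lambda'$ is an arbitrary Lebesgue measurable compact set contained in $\mathbb{R}^\nu$.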

Proof. Assume that $z^{-1}$ belongs to the set $G_2$. Since the series (28) is absolutely convergent and the estimate of the spectral radius does not depend on $\Lambda$, for any $\varepsilon > 0$ there exists a number $N$ such that for $k, j > N$,
\[
\Bigl\| \chi_{\Lambda'} \Bigl[ \rho_{\Lambda_k} \Bigl\{ \frac{z^l}{Q_{\Lambda_k}(z)} \Bigr\}_{l=1}^{\infty} - \rho_{\Lambda_j} \Bigl\{ \frac{z^l}{Q_{\Lambda_j}(z)} \Bigr\}_{l=1}^{\infty} \Bigr] \Bigr\| \le \sum_{n=1}^{N} \frac{1}{|\lambda - \lambda_0|^{n+1}} \bigl\| \chi_{\Lambda'} \bigl[ (K_{\Lambda_k} + CI)^n \mathbf{1} - (K_{\Lambda_j} + CI)^n \mathbf{1} \bigr] \bigr\| + \varepsilon , \tag{32}
\]
where $\lambda = z^{-1}$. It is easy to see that the expression
\[
\bigl\| \chi_{\Lambda'} \bigl[ (K_{\Lambda_k} + CI)^n \mathbf{1} - (K_{\Lambda_j} + CI)^n \mathbf{1} \bigr] \bigr\| \tag{33}
\]
tends to zero as $k, j \to \infty$ for any $n$, so we can estimate (32) by $2\varepsilon$ for sufficiently large $k, j$. When $z^{-1}$ belongs to $G_1$ we put $\lambda_0 = 0$ and complete the proof in the same way as above.

Applying Lemma 3, we obtain

Theorem 3. The sequence of the correlation functions tends to the thermodynamical limit uniformly on compact subsets for $z^{-1} \in G_1 \cup G_2$. So, the limit is an analytic function of $z$ in the set $G_1 \cup G_2$.

Remark 4. For a pure hard core interaction we can assume $C = 1$. Then the density means the ratio between the total volume occupied by the spheres and the volume of the domain in which the spheres are contained. We note that the estimates on the density $\rho$ are consistent with the conjecture that the phase transition occurs for $\rho = 1/2$ (see [1]) in the hard core gas in dimension greater than 1.

References

1. Gorzelańczyk, M.: Phase transitions in the gas of hard core spheres. Commun. Math. Phys. 136, 43–52 (1991)
2. Pastur, L.A.: Spectral theory of the Kirkwood–Salsburg equation in finite volume. Theor. Math. Phys. 18, 233 (1974)
3. Petrina, D.Ya., Gerasimenko, V.I., Malyshev, P.V.: Mathematical foundation of classical statistical mechanics. Kiev (1985), English transl. Adv. Stud. Contemp. Math. Vol 6, New York: Gordon and Breach, 1989
4. Rudin, W.: Functional analysis. New York: McGraw-Hill, 1973
5. Ruelle, D.: Statistical mechanics. New York: Benjamin, 1969
6. Zagrebnov, V.A.: Spectral properties of Kirkwood–Salsburg and Kirkwood–Ruelle operators. J. Stat. Phys. 27, N 3, 577–591 (1982)

Communicated by D. Brydges

Commun. Math. Phys. 192, 9 – 28 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

Chaotic Expansions of Elements of the Universal Enveloping Superalgebra Associated with a $\mathbb{Z}_2$-graded Quantum Stochastic Calculus

T. M. W. Eyre*

Department of Mathematics, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom. E-mail: [email protected]

Received: 28 January 1997 / Accepted: 10 June 1997

Abstract: Given a polynomial function $f$ of classical stochastic integrator processes $\Lambda = (\Lambda_1, \ldots, \Lambda_N)$ whose differentials satisfy a closed Ito multiplication table, we can express the stochastic derivative of $f$ as $df(\Lambda) = f(\Lambda + d\Lambda) - f(\Lambda)$. We establish an analogue of this formula in the form of a chaotic decomposition for $\mathbb{Z}_2$-graded theories of quantum stochastic calculus based on the natural coalgebra structure of the universal enveloping superalgebra.

1. Introduction

In [HP1] a theory of quantum stochastic calculus was devised. The Bosonic and Fermionic versions of this calculus were unified for the one-dimensional case in [HP2] by means of the formula
\[
dB = (-1)^{\Lambda} \, dA . \tag{1}
\]
The full Ito algebra of one-dimensional Fermionic quantum stochastic calculus consists of the integrators $d\Lambda$, $dT$, $dB$, $dB^\dagger$, these being the gauge, time, Fermionic annihilation and Fermionic creation integrators respectively. Of these, $d\Lambda$, $dT$ commute about integrands whereas $dB$, $dB^\dagger$ commute or anticommute about an integrand $X$ depending on whether $(-1)^{\Lambda} X (-1)^{\Lambda} = X$ or $-X$ respectively. The Fermionic quantum Ito table [HP2] shows that the conditions for a $\mathbb{Z}_2$-graded algebra are satisfied by this Ito algebra when the obvious assignments $dB$, $dB^\dagger$ odd, $dT$, $d\Lambda$ even are made. The general theory of $\mathbb{Z}_2$-graded algebras is described in [C].

* T. M. W. E. is supported by an EPSRC studentship.

In [EH], an extension of (1) gave rise to a $\mathbb{Z}_2$-graded theory of multidimensional quantum stochastic calculus. Here we have a mixture of multidimensional Bosonic and Fermionic creation and annihilation integrators along with graded multidimensional gauge integrators. The commutation and anticommutation properties of these integrators provide representations of Lie superalgebras, a class of structures which has received considerable attention since its introduction in 1977 by V. Kac [K]. In particular, all Lie superalgebras of the form $gl(N, r)$ can be represented in this $\mathbb{Z}_2$-graded theory by means of the $\mathbb{Z}_2$-graded pure gauge processes.

The purpose of this paper is to give an explicit formula for the chaotic expansion of elements of the universal enveloping superalgebra of the Lie superalgebra associated with the $\mathbb{Z}_2$-graded multidimensional quantum stochastic calculus. This has been done for the ungraded case in [HPu].

Section 2 gives an outline of $\mathbb{Z}_2$-graded multidimensional quantum stochastic calculus and gives some elementary results to be used further on. In Sect. 3 we work with an abstract superalgebra-with-involution, $I$, called the Ito superalgebra. In the (Chevalley) tensor superalgebra of $I$, denoted by $T(I)$ and defined as
\[
T(I) := \mathbb{C} \oplus I \oplus (I \otimes I) \oplus (I \otimes I \otimes I) \oplus \cdots ,
\]
we construct a product denoted $\star$. Some results concerning $\star$ are proved, including associativity. Sect. 4 takes $I$ to be the superalgebra of $\mathbb{Z}_2$-graded quantum stochastic differentials under Ito multiplication. A map $I$ is defined on $T(I)$ that corresponds to iterated integration against the integrators of an element of $T(I)$ and it is shown that, in a certain sense, for $a, b \in T(I)$, $I(a)I(b) = I(a \star b)$. In Sect. 5 it is shown that the map $I$ is injective. This is a crucial result for Sect. 6 which gives the main result of the paper, this being the existence of a chaos expansion $\chi$ from the universal enveloping superalgebra $U$ associated with $I$ to $T(I)$. An explicit formula is given for this expansion.

Some of the arguments given in this paper are straightforward adaptations of those given in [HPu] but there are significant deviations, most notably in Sect. 6.

2. Preliminaries

2.1. Definitions and elementary results. Let $M_0(N)$ be the space of all complex $(N+1) \times (N+1)$ matrices, indexed from 0 to $N$, with multiplication defined by $A.B = A \Delta B$, where $\Delta \in M_0(N)$ has entries that are all zero except for the $1, \ldots, N$th diagonal entries:
\[
\Delta = \begin{pmatrix}
0 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} .
\]

This space may be used to index the integrators of ungraded quantum stochastic calculus [HP1]. Denote by $E^\alpha_\beta$ the element of $M_0(N)$ that has 1 in the $(\alpha, \beta)$th entry and is zero elsewhere. Note that Greek letters indicate indices varying from 0 to $N$ and Roman letters indicate indices varying from 1 to $N$. The Einstein repeated suffix summation convention is also in force throughout this paper. To the element $E^\alpha_\beta$ of $M_0(N)$ there corresponds the quantum stochastic process $\Lambda_{E^\alpha_\beta}$ which will be denoted by $\Lambda^\alpha_\beta$ and which will now be defined. Let the canonical basis element $(0, \ldots, 0, 1, 0, \ldots, 0)$ of $\mathbb{C}^N$ be denoted $e_i$, where the 1 appears in the $i$th position. We define $\Lambda^0_j(t)$ for $j \ge 1$ and $t \ge 0$ by its action on an arbitrary exponential vector $e(f)$ in the Boson Fock space $\Gamma(L^2(\mathbb{R}_+; \mathbb{C}^N))$:
\[
\Lambda^0_j(t) e(f) = \frac{d}{dz} e(f + \chi_{[0,t]} z e_j) \Big|_{z=0} . \tag{2}
\]
For $i, j$ such that $1 \le i, j \le N$ we define $\Lambda^i_j(t)$ for all $t \ge 0$ by
\[
\Lambda^i_j(t) e(f) = \frac{d}{dz} e\bigl( e^{\chi_{[0,t]} z |e_j\rangle\langle e_i|} f \bigr) \Big|_{z=0} , \tag{3}
\]
where $|e_j\rangle\langle e_i|$ is the Dirac dyad acting in $L^2(\mathbb{R}_+; \mathbb{C}^N)$ that maps a function $f$ to the function that has zero for all components except the $j$th component, which is equal to the $i$th component of $f$. The process $\Lambda^i_0(t)$, where $i \ge 1$, is defined by
\[
\Lambda^i_0(t) e(f) = \int_0^t f^i(s) \, ds \; e(f) . \tag{4}
\]
Throughout this paper the notation $f^\alpha$ indicates the $\alpha$th component of $f = (f^1, \ldots, f^N) \in L^2(\mathbb{R}_+; \mathbb{C}^N)$. We will denote by $f_\alpha$ the complex conjugate $\bar{f}^\alpha$ of $f^\alpha$. For $\alpha = 0$ we define $f^0 = f_0 = 1$. In the remaining case where $\alpha = \beta = 0$ we define $\Lambda^0_0(t)$ by
\[
\Lambda^0_0(t) e(f) = t \, e(f) . \tag{5}
\]
The vectors $\{e(f) : f \in L^2(\mathbb{R}_+; \mathbb{C}^N)\}$ form a total set in $\Gamma(L^2(\mathbb{R}_+; \mathbb{C}^N))$. The space of all finite linear combinations of the exponential vectors is known as the exponential domain and denoted $E$. All the $\Lambda^\alpha_\beta$ are defined on $E$. The processes defined in (2) are known as the $N$-dimensional creation processes and may also be written as $A^\dagger_i(t)$. The processes defined in (3) are called the $N$-dimensional gauge processes. The processes defined in (4) are the $N$-dimensional annihilation processes and may also be written $A^i(t)$. The process (5) is known as the time process. For an arbitrary element $A$ of $M_0(N)$ which decomposes as $\lambda^\beta_\alpha E^\alpha_\beta$ the process $(\Lambda_A(t))_{t \ge 0}$ is defined to be the process $(\lambda^\beta_\alpha \Lambda^\alpha_\beta(t))_{t \ge 0}$. The algebra $M_0(N)$ is associative and therefore if it is equipped with a Lie bracket defined as $[A, B] := A.B - B.A = A \Delta B - B \Delta A$ we obtain a Lie algebra $M_0(N)_{\mathrm{Lie}}$ which we write as $gl_0(N)$. The processes and integrators of quantum stochastic calculus form a representation of $gl_0(N)$, that is to say
\[
[\Lambda_A(t), \Lambda_B(t)] = \Lambda_{[A,B]}(t), \qquad [d\Lambda_A(t), d\Lambda_B(t)] = d\Lambda_{[A,B]}(t) . \tag{6}
\]

Eq. (6) involves unbounded operators and so the composition that might be considered implicit in the bracket must be interpreted in terms of adjoints and the inner product. When doing this it is convenient to restrict our attention to the processes $\{\Lambda^\alpha_\beta : 0 \le \alpha, \beta \le N\}$. It is straightforward to show that for all $\alpha, \beta$ the restriction to $E$ of the adjoint $\Lambda^\alpha_\beta(t)^\dagger$ of $\Lambda^\alpha_\beta(t)$ is $\Lambda^\beta_\alpha(t)$. We define the Evans delta $\hat{\delta}^\alpha_\beta$ to be zero for all $\alpha, \beta$ except where $\alpha = \beta \ne 0$, in which case it is equal to 1. Note that $\Delta = \{\hat{\delta}^\alpha_\beta\}$. With this notation, (6) is to be interpreted rigorously as
\[
\langle \Lambda^\beta_\alpha(t) e(f), \Lambda^\gamma_\delta(t) e(g) \rangle - \langle \Lambda^\delta_\gamma(t) e(f), \Lambda^\alpha_\beta(t) e(g) \rangle = \langle e(f), \bigl( \hat{\delta}^\alpha_\delta \Lambda^\gamma_\beta(t) - \hat{\delta}^\gamma_\beta \Lambda^\alpha_\delta(t) \bigr) e(g) \rangle . \tag{7}
\]
The right hand side of (7) may be expressed as $\langle e(f), \Lambda_{[E^\alpha_\beta, E^\gamma_\delta]} e(g) \rangle$.

In [EH] it was shown that the one-dimensional Fermionic theory of quantum stochastic calculus may be extended to an arbitrary number of dimensions. This extension provides representations of a broad category of Lie superalgebras. For convenience we give a brief overview of Lie superalgebras here. A superalgebra or $\mathbb{Z}_2$-graded algebra is an associative algebra $A$ that may be decomposed as an internal direct sum $A = A_0 + A_1$, where $A_0$ and $A_1$ satisfy the inclusions
\[
A_0 A_0, \; A_1 A_1 \subset A_0 , \qquad A_0 A_1, \; A_1 A_0 \subset A_1 . \tag{8}
\]

We say that $A_0$ is the even subspace and $A_1$ is the odd subspace. If an element $a \in A$ is an element of $A_0$ then we say that $a$ is even or of parity 0. Similarly, if $a$ is an element of $A_1$ then we say $a$ is odd or of parity 1. If $a$ is odd or even then it is said to be homogeneous or of definite parity. An element of $A$ is not, in general, of definite parity but any element of $A$ may be expressed uniquely as the sum of two such elements. On elements of $A$ that are of definite parity we define the parity function $\sigma$ that maps $a$ to its parity, i.e., $\sigma(a) = 0$ if $a$ is even and $\sigma(a) = 1$ if $a$ is odd. Note that $\sigma$ is only defined on $A_0 \cup A_1$, not on all of $A$. It is well known that an arbitrary associative algebra may be equipped with the standard commutator bracket to give a Lie algebra. An arbitrary associative superalgebra $A$ may be similarly equipped with a superbracket to give a Lie superalgebra. The bracket will be denoted $\{\,.\,,\,.\,\}$ and is defined by a bilinear extension of the following rule on $a, b \in A$ with $a, b$ of definite parity:
\[
\{a, b\} = ab - (-1)^{\sigma(a)\sigma(b)} ba . \tag{9}
\]

The bracket acts as a commutator when at least one of $a, b$ is even and as an anticommutator when $a$ and $b$ are both odd. When the bracket is applied to arbitrary elements of $A$ we can see that it acts as part commutator and part anticommutator. A more detailed description of Lie superalgebras, including the full definition, may be found in [K, S]. If an integer $r$ with $0 \le r < N$ is fixed then the algebra $M_0(N)$ described above may be graded as $M_0(N)_0 + M_0(N)_1$, where $M_0(N)_0$ is defined to be the subspace consisting of all matrices of the form
\[
\left( \begin{array}{cccc|ccc}
* & * & \cdots & * & 0 & \cdots & 0 \\
* & * & \cdots & * & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
* & * & \cdots & * & 0 & \cdots & 0 \\ \hline
0 & 0 & \cdots & 0 & * & \cdots & * \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & * & \cdots & *
\end{array} \right) ,
\]
where the dividing lines fall immediately after the row and the column of index $r$, the entries marked $*$ taking any value in $\mathbb{C}$. Similarly we define $M_0(N)_1$ to be the subspace of all matrices of the form
\[
\left( \begin{array}{cccc|ccc}
0 & 0 & \cdots & 0 & * & \cdots & * \\
0 & 0 & \cdots & 0 & * & \cdots & * \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & * & \cdots & * \\ \hline
* & * & \cdots & * & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
* & * & \cdots & * & 0 & \cdots & 0
\end{array} \right) ,
\]
with the dividing lines in the same positions.

It is easy to see that $M_0(N)_0$ and $M_0(N)_1$ satisfy the inclusions (8). If the superbracket is defined using the established multiplication for $M_0(N)$ in (9) then we have derived a Lie superalgebra from $M_0(N)$ which we denote $gl_0(N, r)$. This Lie superalgebra is the foundation of $\mathbb{Z}_2$-graded quantum stochastic calculus, a theory introduced in [EH]. An outline of this theory will now be given.

Let the value of $r$ used in $gl_0(N, r)$ be fixed once and for all. We define the grading process $G$ on $\Gamma(L^2(\mathbb{R}_+; \mathbb{C}^N))$ by its action on an arbitrary exponential vector $e(f)$. For time $t \ge 0$ and $f = (f^1, \ldots, f^N) \in L^2(\mathbb{R}_+; \mathbb{C}^N)$ we set
\[
G(t) e(f) = e\bigl( \chi_{[0,t]} (f^1, \ldots, f^r, -f^{r+1}, \ldots, -f^N) + \chi_{(t,\infty)} f \bigr) .
\]
The totality of the $e(f)$ in $\Gamma(L^2(\mathbb{R}_+; \mathbb{C}^N))$ and the evident isometry of each $G(t)$ means that $G$ is defined on the whole of that space. Note that for each $t$ the operator $G(t)$ is self-adjoint, involutive and leaves $E$ invariant. It is clear that for arbitrary $\alpha, \beta$ the element $E^\alpha_\beta$ is of definite parity. Thus we define $\sigma^\alpha_\beta$ to be 0 or 1 depending on whether $E^\alpha_\beta$ is even or odd respectively. Note that for all $\alpha, \beta$ we have $\sigma^\alpha_\beta = \sigma^\beta_\alpha$. We may now define the integrator processes of $\mathbb{Z}_2$-graded quantum stochastic calculus. The integrators are denoted $d\Xi^\alpha_\beta(t)$, where $0 \le \alpha, \beta \le N$, and are defined by
\[
d\Xi^\alpha_\beta(t) = G^{\sigma^\alpha_\beta}(t) \, d\Lambda^\alpha_\beta(t) .
\]
This definition is an $N$-dimensional extension of the Boson-Fermion unification formula $dB = (-1)^{\Lambda} dA$ of [HP2]. From this definition we may construct the processes $\{\Xi^\alpha_\beta : 0 \le \alpha, \beta \le N\}$ by defining for all $\alpha, \beta$ and for all $t \ge 0$,
\[
\Xi^\alpha_\beta(t) = \int_0^t d\Xi^\alpha_\beta(s) .
\]
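The statement that the even and odd subspaces of $M_0(N)$ satisfy the inclusions (8) under the multiplication $A.B = A \Delta B$ can be checked mechanically on matrix units. The Python sketch below does this for small $N$ and $r$; the function names and the sample values $N = 3$, $r = 1$ are hypothetical choices for illustration, and the parity rule used (two indices have even parity together when both are at most $r$ or both exceed $r$) is read off from the block pictures above.

```python
import itertools
import numpy as np

def delta(N):
    """The matrix Delta: identity on indices 1..N, zero in the (0, 0) entry."""
    D = np.eye(N + 1)
    D[0, 0] = 0.0
    return D

def unit(a, b, N):
    """Matrix unit with a single 1 in entry (a, b), indices running from 0 to N."""
    E = np.zeros((N + 1, N + 1))
    E[a, b] = 1.0
    return E

def parity(a, b, r):
    """0 if entry (a, b) lies in a diagonal block, 1 if it lies in an off-diagonal block."""
    return int((a <= r) != (b <= r))

def check_inclusions(N, r):
    """Check that the product of two matrix units is zero or has the summed parity mod 2."""
    D = delta(N)
    idx = range(N + 1)
    for a, b, c, d in itertools.product(idx, repeat=4):
        P = unit(a, b, N) @ D @ unit(c, d, N)
        expected = (parity(a, b, r) + parity(c, d, r)) % 2
        rows, cols = np.nonzero(P)
        for i, j in zip(rows, cols):
            if parity(int(i), int(j), r) != expected:
                return False
    return True

print(check_inclusions(3, 1))
```

Since every element of $M_0(N)$ is a linear combination of matrix units, checking the units is enough for the inclusions (8).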

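We see that when $\sigma^\alpha_\beta = 0$ the differential $d\Xi^\alpha_\beta$ is equal to $d\Lambda^\alpha_\beta$ and the process $\Xi^\alpha_\beta$ is equal to $\Lambda^\alpha_\beta$. When $\sigma^\alpha_\beta = 1$ the differential $d\Xi^\alpha_\beta$ is equal to $G \, d\Lambda^\alpha_\beta$ and the process $\Xi^\alpha_\beta(t)$ is equal to $\int_0^t G(s) \, d\Lambda^\alpha_\beta(s)$. For an arbitrary element $A$ of the Lie superalgebra $gl_0(N, r)$ we may write $A = \lambda^\beta_\alpha E^\alpha_\beta$. This enables us to define $\Xi_A = \lambda^\beta_\alpha \Xi^\alpha_\beta$. In [EH] it was shown that the $\Xi_A$ and their differentials form representations of the Lie superalgebra $gl_0(N, r)$, that is to say that for arbitrary $A, B \in gl_0(N, r)$ we have
\[
\{\Xi_A, \Xi_B\} = \Xi_{\{A,B\}}, \qquad \{d\Xi_A, d\Xi_B\} = d\Xi_{\{A,B\}} . \tag{10}
\]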

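As with the ungraded case, the left-hand side of (10) must be interpreted in terms of adjoints and inner products. The self-adjointness of each $G(t)$ provides us with $\Xi^\alpha_\beta(t)^\dagger = \Xi^\beta_\alpha(t)$ for each $t \ge 0$, so (10) is to be interpreted rigorously as
\[
\langle \Xi^\beta_\alpha(t) e(f), \Xi^\gamma_\delta(t) e(g) \rangle - (-1)^{\sigma^\alpha_\beta \sigma^\gamma_\delta} \langle \Xi^\delta_\gamma(t) e(f), \Xi^\alpha_\beta(t) e(g) \rangle = \langle e(f), \bigl( \hat{\delta}^\alpha_\delta \Xi^\gamma_\beta(t) - (-1)^{\sigma^\alpha_\beta \sigma^\gamma_\delta} \hat{\delta}^\gamma_\beta \Xi^\alpha_\delta(t) \bigr) e(g) \rangle ,
\]
where $t \ge 0$ is arbitrary and $f, g$ are arbitrary elements of $L^2(\mathbb{R}_+; \mathbb{C}^N)$.

In what follows we will need to make use of the First and Second Fundamental Formulae of [HP1]. In the $\mathbb{Z}_2$-graded case the First Fundamental Formula clearly takes the form
\[
\Bigl\langle e(f), \int_0^t P(s) \, d\Xi^\alpha_\beta(s) e(g) \Bigr\rangle = \int_0^t f_\beta(s) g^\alpha(s) \bigl\langle e(f), P(s) G^{\sigma^\alpha_\beta}(s) e(g) \bigr\rangle \, ds , \tag{11}
\]
where $t \ge 0$, $0 \le \alpha, \beta \le N$, $f, g \in L^2(\mathbb{R}_+; \mathbb{C}^N)$ and $P$ is an integrable process. If $Q$ is also an arbitrary integrable process then the $\mathbb{Z}_2$-graded version of the Second Fundamental Formula is easily seen to be
\[
\begin{aligned}
\Bigl\langle \int_0^t P(s) \, d\Xi^\beta_\alpha(s) e(f), \int_0^t Q(s) \, d\Xi^\gamma_\delta(s) e(g) \Bigr\rangle
&= \int_0^t f_\delta(s) g^\gamma(s) \Bigl\langle \int_0^s P(r) \, d\Xi^\beta_\alpha(r) e(f), Q(s) G^{\sigma^\gamma_\delta}(s) e(g) \Bigr\rangle \, ds \\
&\quad + \int_0^t f_\beta(s) g^\alpha(s) \Bigl\langle P(s) G^{\sigma^\beta_\alpha}(s) e(f), \int_0^s Q(r) \, d\Xi^\gamma_\delta(r) e(g) \Bigr\rangle \, ds \\
&\quad + \hat{\delta}^\alpha_\delta \int_0^t f_\beta(s) g^\gamma(s) \bigl\langle P(s) G^{\sigma^\beta_\alpha}(s) e(f), Q(s) G^{\sigma^\gamma_\delta}(s) e(g) \bigr\rangle \, ds .
\end{aligned}
\tag{12}
\]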

Eq. (12) may be summarised as
\[
d(EF) = E \, dF + (dE) F + dE.dF , \tag{13}
\]
where
\[
d\Xi_A . d\Xi_B = d\Xi_{A.B} = d\Xi_{A \Delta B} , \tag{14}
\]
this being the quantum Ito multiplication. Details concerning (13) and (14) can be found in [HP1]. Quantum stochastic integrals are themselves integrable processes and therefore we may form iterated quantum stochastic integrals of the form
\[
\int_{0 < t_1 < \cdots < t_n < t} d\Xi_{A_1}(t_1) \cdots d\Xi_{A_n}(t_n) .
\]
By using formula (11) it can be shown that
\[
G(t) \int_{0 < t_1 < \cdots < t_n < t} d\Xi_{A_1}(t_1) \cdots d\Xi_{A_n}(t_n) \, G(t) = (-1)^{\sum_{i=1}^{n} \sigma(A_i)} \int_{0 < t_1 < \cdots < t_n < t} d\Xi_{A_1}(t_1) \cdots d\Xi_{A_n}(t_n) . \tag{15}
\]
This relation is of great importance in what follows. Application of Eq. (15) along with (12) shows that for $A_1, \ldots, A_n \in gl_0(N, r)$ of definite parity with respective adjoint (i.e., conjugate transpose) matrices $B_1, \ldots, B_n$,

\[
\Bigl( \int_{0 < t_1 < \cdots < t_n < t} d\Xi_{A_1}(t_1) \cdots d\Xi_{A_n}(t_n) \Bigr)^{\dagger} = (-1)^{\sum_{i<j} \sigma(A_i)\sigma(A_j)} \int_{0 < t_1 < \cdots < t_n < t} d\Xi_{B_1}(t_1) \cdots d\Xi_{B_n}(t_n) . \tag{16}
\]
For $n \ge 1$ we denote by $M_n$ the set $\{1, \ldots, n\}$. For a finite set $S$ we denote by $|S|$ the cardinality of $S$. All tensor products of associative algebras in this paper are the Chevalley tensor product [C] unless stated otherwise. This $\mathbb{Z}_2$-graded version of the tensor product is described in 2.2.

2.2. The role of the Chevalley tensor product. The Chevalley tensor product plays a critical and entirely natural part in $\mathbb{Z}_2$-graded quantum stochastic calculus. To recall the theory detailed in [C], given two $\mathbb{Z}_2$-graded algebras (that is, superalgebras) $A$ and $B$ we denote their Chevalley tensor product by $A \otimes B$. Multiplication of elements of $A \otimes B$ is defined by linear extension of the following rule for product tensors with entries of definite parity:
\[
(a_1 \otimes a_2)(b_1 \otimes b_2) = (-1)^{\sigma(a_2)\sigma(b_1)} a_1 b_1 \otimes a_2 b_2 .
\]
This rule for multiplication of Chevalley product tensors may be extended to $n$-fold tensors by a linear extension of the following rule for $n$-fold product tensors with entries of definite parity:
\[
(a_1 \otimes \cdots \otimes a_n)(b_1 \otimes \cdots \otimes b_n) = (-1)^{\sum_{i>j} \sigma(a_i)\sigma(b_j)} a_1 b_1 \otimes \cdots \otimes a_n b_n .
\]
A homomorphism $f$ from one $\mathbb{Z}_2$-graded algebra to another is, in addition to being an algebra morphism, required to preserve the parity of elements, i.e., for an element $a$ of definite parity we require $\sigma(f(a)) = \sigma(a)$. Tensor products of parity preserving maps behave as would be expected, i.e., $f \otimes g(x \otimes y) = f(x) \otimes g(y)$. All superalgebra maps occurring in this work are parity preserving.

Consider an arbitrary implementation of ungraded quantum stochastic calculus where the space of integrable processes is denoted $B$ and the space of integrators is denoted $J$. Then, in strict notation and using the standard, ungraded tensor product, the quantum stochastic integral should be written as $\int E \otimes d\Lambda$, where $E \otimes d\Lambda \in B \otimes J$. The $\mathbb{Z}_2$-graded quantum stochastic calculus uses a $\mathbb{Z}_2$-graded integrator space spanned by $\{d\Xi^\lambda_\mu : 0 \le \lambda, \mu \le N\}$ which will be denoted by $J$ in this subsection.
We shall denote the processes integrable by the elements of $J$ by $B$. The elements of $B$ are $\mathbb{Z}_2$-graded by $G$ in the natural way. The $\mathbb{Z}_2$-graded quantum stochastic calculus may be embedded in the ungraded version by writing $\int E \, d\Xi^\alpha_\beta$ in the form $\int E G^{\sigma^\alpha_\beta} \, d\Lambda^\alpha_\beta$, or more strictly as $\int E G^{\sigma^\alpha_\beta} \otimes d\Lambda^\alpha_\beta$, again using the standard rather than Chevalley tensor product. If two such integrals are multiplied together and we assume the integrands $E, F$ to be of definite parity then the Ito correction term is
\[
\begin{aligned}
\int (E \, d\Xi^\alpha_\beta \, F \, d\Xi^\gamma_\delta) &= \int (E G^{\sigma^\alpha_\beta} \otimes d\Lambda^\alpha_\beta)(F G^{\sigma^\gamma_\delta} \otimes d\Lambda^\gamma_\delta) \\
&= \int E G^{\sigma^\alpha_\beta} F G^{\sigma^\gamma_\delta} \otimes d\Lambda^\alpha_\beta . d\Lambda^\gamma_\delta \\
&= (-1)^{\sigma^\alpha_\beta \sigma(F)} \int E F G^{\sigma^\alpha_\beta} G^{\sigma^\gamma_\delta} \otimes d\Lambda^\alpha_\beta . d\Lambda^\gamma_\delta \\
&= (-1)^{\sigma^\alpha_\beta \sigma(F)} \int E F \otimes d\Xi^\alpha_\beta . d\Xi^\gamma_\delta \\
&= (-1)^{\sigma^\alpha_\beta \sigma(F)} \int E F \, d\Xi^\alpha_\beta . d\Xi^\gamma_\delta .
\end{aligned}
\]
Here we have $E, F \in B$, $d\Xi^\alpha_\beta \in J$ but for the calculation we have switched to the more familiar non-graded case and used the ungraded tensor product. If we now consider the Chevalley tensor product $B \otimes I$, then the rule for multiplication of Chevalley product tensors gives
\[
\int (E \otimes d\Xi^\alpha_\beta)(F \otimes d\Xi^\gamma_\delta) = (-1)^{\sigma^\alpha_\beta \sigma(F)} \int E F \otimes d\Xi^\alpha_\beta . d\Xi^\gamma_\delta = (-1)^{\sigma^\alpha_\beta \sigma(F)} \int E F \, d\Xi^\alpha_\beta . d\Xi^\gamma_\delta .
\]
Similar results hold for the terms $\Xi_A \, d\Xi_B$ and $(d\Xi_A) \Xi_B$ of Eq. (13) and so it is natural to interpret a quantum stochastic integral $\int E \, d\Xi$ in the $\mathbb{Z}_2$-graded case using the Chevalley tensor product $\int E \otimes d\Xi$. Note that processes supercommute about differentials, that is to say $d\Xi_A (X) = (-1)^{\sigma(X)\sigma(A)} (X) \, d\Xi_A$. The reader is referred to [AH] for earlier work on the relationship between the Chevalley tensor product and Fermionic quantum stochastic calculus.

3. The $\star$ Product

3.1. Preliminaries. Take $I$ to be an arbitrary complex associative superalgebra of finite dimension equipped with involution $\dagger$. We call this the Ito superalgebra. Let $T(I)$ be the complex vector space of all tensors of finite rank formed from $I$:
\[
T(I) = \mathbb{C} \oplus I \oplus (I \otimes I) \oplus (I \otimes I \otimes I) \oplus \cdots .
\]
An element of $T(I)$ will be a sequence $(a_0, a_1, a_2, \ldots)$ with each $a_i \in \otimes^i I$ and only a finite number of non-zero terms. If we relax the finiteness restriction we have the strong sum denoted $T_S(I)$. Eq. (16) indicates the means of extending the involution $\dagger$ on $I$ to the whole of $T_S(I)$. The extended involution will also be denoted $\dagger$ and is defined by linear extension of the following rule for product elements with entries of definite parity:
\[
(a_1 \otimes \cdots \otimes a_n)^\dagger = (-1)^{\sum_{i<j} \sigma(a_i)\sigma(a_j)} a_1^\dagger \otimes \cdots \otimes a_n^\dagger . \tag{17}
\]
In this section we define a multiplication rule $\star$ which makes $T(I)$ and $T_S(I)$ into unital associative superalgebras for which $\dagger$ is an involution. The corresponding multiplication in the ungraded case is described in [HPu].

3.2. Definition of the $\star$ product. The $\star$ product must be defined in terms of a simpler multiplication rule.
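Take an arbitrary $n \in \mathbb{N}$ and let $A, B \subset M_n$ with $A = \{i_1 < \cdots < i_{|A|}\}$, $B = \{j_1 < \cdots < j_{|B|}\}$ and $A \cup B = M_n$. Then if $a = a^1 \otimes \cdots \otimes a^{|A|}$, $b = b^1 \otimes \cdots \otimes b^{|B|}$ are product tensors with entries of definite parity in $T_S(\mathcal{I})$ we define the product $\bullet$ by

\[ a^A \bullet b^B = (-1)^{\sum_{j_k < i_l} \sigma(a^l)\sigma(b^k)} \, c^1 \otimes \cdots \otimes c^n , \tag{18} \]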
Chaotic Expansions of Universal Enveloping Superalgebra

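where, for $m \in M_n$,

\[ c^m = \begin{cases} a^l & \text{if } m = i_l \in A \cap (M_n \setminus B); \\ b^k & \text{if } m = j_k \in (M_n \setminus A) \cap B; \\ a^l b^k & \text{if } m = i_l = j_k \in A \cap B. \end{cases} \]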

If we write for $(a_0, a_1, a_2, \ldots), (b_0, b_1, b_2, \ldots) \in T_S(\mathcal{I})$

\[ (a_0, a_1, a_2, \ldots) \star (b_0, b_1, b_2, \ldots) = (c_0, c_1, c_2, \ldots), \tag{19} \]

then the ⋆ product is defined explicitly by setting $c_0 = a_0 b_0$ and for each $k \ge 1$ setting

\[ c_k = \sum_{A \cup B = M_k} a^A_{|A|} \bullet b^B_{|B|} . \tag{20} \]
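To get a feel for the sum in (20), the index pairs $(A, B)$ can be enumerated directly. The following is only a minimal sketch: the helper names `covers` and `count_terms` are hypothetical and introduced here for illustration, $M_k$ is taken to be $\{1, \ldots, k\}$, and the code counts index pairs only, ignoring the Ito multiplication and the signs in (18). For product tensors of ranks $m$ and $n$, only pairs with $|A| = m$, $|B| = n$ contribute, so the component of rank $k$ can be non-zero only for $\max(m, n) \le k \le m + n$.

```python
from itertools import combinations


def covers(k, m, n):
    """Yield pairs (A, B) of subsets of {1, ..., k} with |A| = m, |B| = n, A ∪ B = {1, ..., k}."""
    M = range(1, k + 1)
    for A in combinations(M, m):
        # Elements not in A must all belong to B.
        missing = [x for x in M if x not in A]
        # Number of elements of A that B must also contain.
        extra = n - len(missing)
        if extra < 0:
            continue
        for shared in combinations(A, extra):
            B = tuple(sorted(missing + list(shared)))
            yield A, B


def count_terms(m, n):
    """Number of pairs (A, B) contributing to each rank k of the product."""
    return {k: sum(1 for _ in covers(k, m, n))
            for k in range(max(m, n), m + n + 1)}


if __name__ == "__main__":
    print(count_terms(1, 1))
    print(count_terms(2, 1))
```

For $m = n = 1$ this yields one pair at rank 1 and two pairs at rank 2, which corresponds to the three terms $a^1 b^1$, $a^1 \otimes b^1$ and $b^1 \otimes a^1$ (up to sign) in the product of two rank-one tensors.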

Theorem 3.1. When equipped with the multiplication ⋆, $T_S(\mathcal{I})$ is a complex associative superalgebra with involution † and unit element $(1, 0, 0, \ldots)$.

Proof. It is evident from (20) that the composition (19) is bilinear so it suffices to prove the properties for product tensors with entries of definite parity only. First we establish associativity. If $a, b, c \in T_S(\mathcal{I})$ then for all $n \ge 0$,

\begin{align*}
((a \star b) \star c)_n &= \sum_{D \cup C = M_n} (a \star b)^D_{|D|} \bullet c^C_{|C|} \\
&= \sum_{D \cup C = M_n} \; \sum_{A \cup B = M_{|D|}} (a^A_{|A|} \bullet b^B_{|B|})^D \bullet c^C_{|C|} \\
&= \sum_{A \cup D = M_n} \; \sum_{B \cup C = M_{|D|}} a^A_{|A|} \bullet (b^B_{|B|} \bullet c^C_{|C|})^D_{|D|} \\
&= \sum_{A \cup D = M_n} a^A_{|A|} \bullet (b \star c)^D_{|D|} \\
&= (a \star (b \star c))_n ,
\end{align*}

and so we have that ⋆ is associative. It is clear from (20) that in order to prove the relation $(a \star b)^\dagger = b^\dagger \star a^\dagger$ it suffices to show that for arbitrary $A, B \subset M_n$ with $A = \{i_1 < \cdots < i_{|A|}\}$, $B = \{j_1 < \cdots < j_{|B|}\}$, $A \cup B = M_n$ and $a = a^1 \otimes \cdots \otimes a^{|A|}$, $b = b^1 \otimes \cdots \otimes b^{|B|} \in T(\mathcal{I})$ we have

\[ (a^A \bullet b^B)^\dagger = b^{\dagger B} \bullet a^{\dagger A} . \tag{21} \]

From (17) and (18) we see that the entries of the tensors on either side of (21) will be the same. It remains to show that the powers of −1 introduced on each side Pof (21) by (17) σ(al )σ(bk )

. and (18) are equal. On the left-hand side, (18) gives a factor of (−1) jk

T. M. W. Eyre

Corollary 3.2. The space $T(\mathcal{I})$ is a unital associative superalgebra with involution † under the multiplication ⋆.

Proof. From (20) we see that $(a \star b)_n$ depends only on components of $a$ and $b$ of rank $\le n$ so the result follows.
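As a small worked instance of (18) and (20), consider two rank-one elements $a = (0, a^1, 0, \ldots)$ and $b = (0, b^1, 0, \ldots)$ with entries of definite parity. This is a sketch of the computation, assuming $M_k = \{1, \ldots, k\}$. At rank 1 the only admissible pair is $A = B = \{1\}$; at rank 2 the admissible pairs are $(A, B) = (\{1\}, \{2\})$ and $(\{2\}, \{1\})$, and only the second one moves $b^1$ in front of $a^1$ and so picks up a sign:

```latex
\begin{align*}
(a \star b)_1 &= a^1 b^1, \\
(a \star b)_2 &= a^1 \otimes b^1 + (-1)^{\sigma(a^1)\sigma(b^1)}\, b^1 \otimes a^1, \\
a \star b &= \bigl(0,\; a^1 b^1,\; a^1 \otimes b^1 + (-1)^{\sigma(a^1)\sigma(b^1)}\, b^1 \otimes a^1,\; 0, \ldots\bigr).
\end{align*}
```

This agrees with the expansion of $d\Xi_L \star d\Xi_M$ used in the proof of Theorem 6.1.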

4. A Higher Order Ito Product Formula

4.1. Preliminaries. We now take the †-superalgebra $\mathcal{I}$ of Sect. 3 to be the superalgebra $\{d\Xi_A : A \in gl_0(N, r)\}$ with multiplication given by (14). We might equally well use a sub-†-superalgebra of this superalgebra. To each integrator there corresponds a quantum stochastic process. Such a process will be integrable [P] and the operators that form the process will act on the exponential domain $\mathcal{E}$. We have then that iterated quantum stochastic integrals are well-defined as operators on $\mathcal{E}$. Evidently the process

\[ \int_{0 < t_1 < \cdots < t_n < t} d\Xi_1(t_1) \ldots d\Xi_n(t_n) \]

is multilinear for $d\Xi_1, \ldots, d\Xi_n \in \mathcal{I}$. We may therefore define uniquely a linear map $I^n$ on $T(\mathcal{I})$ by

\[ I^n : d\Xi_1 \otimes \cdots \otimes d\Xi_n \mapsto \int_{0 < t_1 < \cdots < t_n < t} d\Xi_1(t_1) \ldots d\Xi_n(t_n) . \]
The map $I^0$ is defined on $\mathbb{C}$ by $I^0(z) = z\,\mathrm{Id}$, where Id is the identity process in $\mathcal{P}$. We amalgamate all the $(I^n)_{n \ge 0}$ into a single map called the integrator map which is denoted by $I$ and is defined on the whole of $T(\mathcal{I})$. We will denote by $\mathcal{P}$ the space $I(T(\mathcal{I}))$ of all iterated quantum stochastic integrals formed by elements of $\mathcal{I}$. Note that convergence requirements prevent $I$ from being defined on the whole of $T_S(\mathcal{I})$. The map † on $T(\mathcal{I})$ is defined by (17) on a product tensor $d\Xi_1 \otimes \cdots \otimes d\Xi_n$ with each entry of definite parity as

\[ (d\Xi_1 \otimes \cdots \otimes d\Xi_n)^\dagger = (-1)^{\sum_{i<j} \sigma(d\Xi_i)\sigma(d\Xi_j)} \, d\Xi_1^\dagger \otimes \cdots \otimes d\Xi_n^\dagger . \]

This extends to the whole of $T(\mathcal{I})$ by linearity and it is clear from the original model (16) for this involution that $I$ is a † morphism on $T(\mathcal{I})$.

4.2. Fundamental property of ⋆. The purpose of this subsection is to show that the map $I$ is, in a certain sense, a unital †-superalgebra morphism from $T(\mathcal{I})$ to $\mathcal{P}$. Note that $\mathcal{P}$ consists of processes formed from unbounded operators and so a multiplication for $\mathcal{P}$ in the sense of composition of operators is not meaningful. We therefore express the theorem in terms of adjoints and the inner product. In Sect. 6 we shall exploit Theorem 4.1 to equip $\mathcal{P}$ with a rigorous multiplication.

Theorem 4.1. For arbitrary $\psi, \rho \in \mathcal{E}$ and $a, b \in T(\mathcal{I})$

\[ \langle I(a)^\dagger \psi, I(b)\rho \rangle = \langle \psi, I(a \star b)\rho \rangle . \tag{22} \]


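Proof. By linearity it suffices to prove the result for cases where

\[ a = (0, \ldots, 0, a^1 \otimes \cdots \otimes a^m, 0, \cdots), \qquad b = (0, \ldots, 0, b^1 \otimes \cdots \otimes b^n, 0, \cdots) \]

with each $a^i, b^j$ a basis element of $\mathcal{I}$ so that we may write $a^i = d\Xi^{\alpha_i}_{\beta_i}$, $b^j = d\Xi^{\gamma_j}_{\delta_j}$, $1 \le i \le m$, $1 \le j \le n$. We may also assume that $\psi, \rho$ are exponential vectors of the Fock space and we denote them $e(f), e(g)$ respectively, where $f, g$ are some elements of $L^2(\mathbb{R}_+; \mathbb{C}^N)$. If $m$ or $n = 0$ then (22) is obvious. Thus we may assume $m, n \ge 1$ and define $\dot{a} = a^1 \otimes \cdots \otimes a^{m-1}$, $\dot{b} = b^1 \otimes \cdots \otimes b^{n-1}$, and by a slight abuse of notation we declare $a = a^1 \otimes \cdots \otimes a^m$, $b = b^1 \otimes \cdots \otimes b^n$. We now state and prove a lemma.

Lemma 4.2. Using the notation already given

\[ a \star b = (a \star \dot{b}) \otimes b^n + (-1)^{\sum_{i=1}^{n} \sigma(a^m)\sigma(b^i)} (\dot{a} \star b) \otimes a^m + (-1)^{\sum_{i=1}^{n-1} \sigma(a^m)\sigma(b^i)} (\dot{a} \star \dot{b}) \otimes a^m b^n . \]

Proof of lemma. Consider an arbitrary component of $a \star b$,

\begin{align*}
(a \star b)_k &= \sum_{A \cup B = M_k} a^A_{|A|} \bullet b^B_{|B|} = \sum_{\substack{A \cup B = M_k \\ |A| = m,\ |B| = n}} a^A \bullet b^B \\
&= \sum_{\substack{A \cup D = M_{k-1} \\ |A| = m,\ |D| = n-1}} a^A \bullet \dot{b}^D \otimes b^n
 + (-1)^{\sum_{i=1}^{n} \sigma(a^m)\sigma(b^i)} \sum_{\substack{C \cup B = M_{k-1} \\ |C| = m-1,\ |B| = n}} \dot{a}^C \bullet b^B \otimes a^m \\
&\quad + (-1)^{\sum_{i=1}^{n-1} \sigma(a^m)\sigma(b^i)} \sum_{\substack{C \cup D = M_{k-1} \\ |C| = m-1,\ |D| = n-1}} \dot{a}^C \bullet \dot{b}^D \otimes a^m b^n \\
&= \Bigl( (a \star \dot{b}) \otimes b^n + (-1)^{\sum_{i=1}^{n} \sigma(a^m)\sigma(b^i)} (\dot{a} \star b) \otimes a^m + (-1)^{\sum_{i=1}^{n-1} \sigma(a^m)\sigma(b^i)} \, \dot{a} \star \dot{b} \otimes a^m b^n \Bigr)_k ,
\end{align*}

and so the lemma is proved.

We now continue with the main proof. Consider the case $m = n = 1$ so that $a = a^1 = d\Xi^\alpha_\beta$, $b = b^1 = d\Xi^\gamma_\delta$. We know that $I(a)^\dagger = I(a^\dagger)$ so at an arbitrary time $t \ge 0$ we have that $\langle I(a)_t^\dagger e(f), I(b)_t e(g) \rangle$ is equal to

\[ \langle \Xi^\beta_\alpha(t) e(f), \Xi^\gamma_\delta(t) e(g) \rangle . \tag{23} \]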


Applying the Second Fundamental Formula (12) to this expression and then applying the First Fundamental Formula (11) to the first and second of the resulting terms shows that (23) is equal to Z γ α σδγ σ0α fδ (t1 )g γ (t1 )fβ (t2 )g α (t2 )hGσβ (t2 )e(f ), Gσδ (t1 )e(g)i dt2 dt1 (−1) 0
Z

Z

γ

α

fβ (t1 )g α (t1 )fδ (t2 )g γ (t2 )hGσβ (t1 )e(f ), Gσδ (t2 )e(g)i dt2 dt1

+(−1)

0
fβ (t1 )g γ (t1 )he(f ), Gσβ (t1 )e(g)i dt1 .

+ δˆδα

0
By means of the First Fundamental Formula (11) and (15) we can see that this expression may be re-written as Z t1 Z t γ fδ (t1 )g γ (t1 )he(f ), dΞβα (t2 )Gσδ (t1 )e(g)i dt1 0

α γ σβ σδ

Z

Z

t1

α

fβ (t1 )g (t1 )he(f ),

+ (−1)

0 t

Z + δˆδα he(f ),

0

t

0

0

α

dΞδγ (t2 )Gσβ (t1 )e(g)i dt1

dΞβγ (t1 )e(g)i.

A further application of the First Fundamental Formula (11) shows that this expression is equal to

\[ \langle e(f), I(a^1 \otimes b^1) e(g) \rangle + (-1)^{\sigma(a^1)\sigma(b^1)} \langle e(f), I(b^1 \otimes a^1) e(g) \rangle + \langle e(f), I(a^1 b^1) e(g) \rangle = \langle e(f), I(a \star b) e(g) \rangle \]

by Lemma 4.2. Therefore we have that (22) holds in all cases where $n + m \le 2$. Now we make the inductive assumption that (22) holds for $m + n \le k - 1$. Setting $Y' := (-1)^{\sum_{i=1}^{m-1} \sigma(a^m)\sigma(a^i)}$ we may write

\[ \langle I(a^\dagger) e(f), I(b) e(g) \rangle = Y' \Bigl\langle \int_0^t I(\dot{a}^\dagger) \, d\Xi^{\beta_m}_{\alpha_m} e(f), \int_0^t I(\dot{b}) \, d\Xi^{\gamma_n}_{\delta_n} e(g) \Bigr\rangle . \]

By the Second Fundamental Formula this is equal to Z t γ ˙ σδnn (t1 )e(g)idt1 fδn (t1 )g γn (t1 )hI(a† )e(f ), I(b)G 0

+ Y0

Z

t 0

+Y

0 ˆ αm δ δn

fβm (t1 )g αm (t1 )hI(˙a† )Gσαm (t1 )e(f ), I(b)e(g)idt1 βm

Z

t 0

βm ˙ σδnn (t1 )e(g)i dt1 . fβm (t1 )g γn (t1 )hI(˙a† )Gσαm (t1 )e(f ), I(b)G γ


Invoking the inductive hypothesis we may re-write this as Z t γ ˙ σδnn (t1 )e(g)idt1 fδn (t1 )g γn (t1 )he(f ), I(a ? b)G 0

+ Y0

Z

t

αm

fβm (t1 )g αm (t1 )he(f ), Gσβm (t1 )I(˙a ? b)e(g)idt1

0

+ Y 0 δˆδαnm

Z

t

αm

γn

˙ σδn (t1 )e(g)i dt1 . fβm (t1 )g γn (t1 )he(f ), Gσβm (t1 )I(˙a ? b)G

0

By (15) this may be re-written as Z t γ ˙ σδnn (t1 )e(g)idt1 fδn (t1 )g γn (t1 )he(f ), I(a ? b)G 0

Pn

+ (−1)

j=1

σ(am )σ(bj )

Pn−1 + (−1)

j=1

Z

t 0

m

j

αm

fβm (t1 )g αm (t1 )he(f ), I(˙a ? b)Gσβm (t1 )e(g)idt1

σ(a )σ(b ) ˆ αm δ δn

Z

t

γn

˙ σβm (t1 )e(g)i dt1 fβm (t1 )g γn (t1 )he(f ), I(˙a ? b)G

0

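\begin{align*}
&= \langle e(f), I((a \star \dot{b}) \otimes b^n) e(g) \rangle + \langle e(f), (-1)^{\sum_{j=1}^{n} \sigma(a^m)\sigma(b^j)} I((\dot{a} \star b) \otimes a^m) e(g) \rangle \\
&\quad + \langle e(f), (-1)^{\sum_{j=1}^{n-1} \sigma(a^m)\sigma(b^j)} I((\dot{a} \star \dot{b}) \otimes a^m b^n) e(g) \rangle .
\end{align*}

By Lemma 4.2 this expression is equal to $\langle e(f), I(a \star b) e(g) \rangle$, as required.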

5. The Injectivity of $I$

In Sect. 6 we will require the map $I$ to be injective. This is shown using a result from [L].

Proposition 5.1. If $\{a^\alpha_\beta\}$ is a matrix of elements of $T(\mathcal{I})$ then $\int_0^{\cdot} I(a^\alpha_\beta) \, d\Xi^\beta_\alpha = 0$ implies that for each $\alpha, \beta$ we have $I(a^\alpha_\beta) = 0$.

Proof. The proposition follows as a special case of Corollary 1.3 in [L] which will be given here. First it is necessary to introduce some notation. The set $D_S$ described in [L] may be taken here to be the exponential domain $\mathcal{E}$ defined in Sect. 2. A process $X$ is said to be a $D_S$-process if

(a) $\mathrm{Dom}\, X(t) \supset D_S$ $\forall t \in \mathbb{R}_+$,
(b) $t \mapsto X(t)k$ is Borel measurable $\forall k \in D_S$, $t \in \mathbb{R}_+$.

A $D_S$-process $X$ is said to be locally square integrable if

\[ \int_0^T \| X(t) k \|^2 \, dt < \infty \qquad \forall k \in D_S,\ T > 0 . \]


Corollary 1.3 of [L] states that if $\{F^\alpha_\beta\}$ is a matrix of adapted, locally square integrable $D_S$-processes and each operator $F^\alpha_\beta(t)$, $t \in \mathbb{R}_+$, is closable then

\[ \int_0^{\cdot} F^\alpha_\beta \, d\Lambda^\beta_\alpha = 0 \;\Rightarrow\; \{F^\alpha_\beta\} = 0 . \]

It is well known that quantum stochastic integrals $I(a)$ are adapted processes and that for each $a \in T(\mathcal{I})$, $I(a)_t$ has an adjoint $I(a)_t^\dagger = I(a^\dagger)_t$. It follows that each $I(a^\alpha_\beta)_t$ is closable. We know from Sect. 2 that each $I(a)$ is defined on $\mathcal{E} = D_S$ so that condition (a) of the $D_S$-process property is satisfied. Furthermore, it is well known that quantum stochastic integrals are time-continuous and are therefore Borel measurable in the sense of (b). This continuity also provides the required square-integrability property. Thus we have that each $I(a)$ is a $D_S$-process and so, all the conditions on the $I(a)$ being satisfied, we may conclude that $\int_0^{\cdot} I(a^\alpha_\beta) \, d\Xi^\beta_\alpha = 0$ implies that for all $0 \le \alpha, \beta \le N$ we have

\[ I(a^\alpha_\beta) G^{\sigma^\alpha_\beta} = 0 , \tag{24} \]

where G is the grading function of Sect. 2. From (24) we may conclude that all I(aα β) with σβα = 0 are zero. The remaining terms have σβα = 1 and as the map G leaves the exponential domain E invariant we may multiply each side of (24) by G yielding I(aα β ) = 0 as required. Having established that all the terms are zero, the result follows. Proposition 5.2. The map I : T (I) → P is injective. Proof. The map I is linear so it suffices to show that, given an arbitrary a ∈ T (I), I(a) = 0 implies a = 0. R β α Suppose a is of the form (0, a1 , a2 , . . .). Then I(a) = I(aα β ) dΞα , where the aβ are R β some elements of T (I). If I(aα β ) dΞα = 0, then by Proposition 5.1 we know that each α I(aα ) = 0. Clearly each a is of order less than a so the result follows for all a of the β β form (0, a1 , a2 , . . .) by induction. If a is not of the form (0, a1 , a2 , . . .) then, evaluating I(a) at time zero, we obtain I(a)0 = a0 Id 6= 0. Thus I(a) 6= 0 and the result holds in all cases. 6. The Chaos Map 6.1. Preliminaries. Take an Ito superalgebra I of Z2 -graded multidimensional quantum stochastic integrators. Consider the corresponding Lie superalgebra ISLie realised by equipping I with the supercommutator Lie bracket as described in Sect. 2.1. Let L be an abstract Lie superalgebra isomorphic to ISLie by means of a map L 7→ dΞL . Let U be the universal enveloping superalgebra of L [K, S]. The space U is a complex unital associative †-superalgebra equipped with an injection ı : L → USLie . The injection ı has the universal property, that is to say that given a †-Lie superalgebra morphism φ : L → ASLie , where A is an arbitrary complex unital associative superalgebra there will exist a unique complex unital associative superalgebra morphism j : U → A such that j ◦ ı = φ. The map j is said to extend φ. The map ı is used to identify L as a subsuperalgebra of USLie . Recall that P = {I(a) : a ∈ T (I)}. The identity process Id = I((1, 0, 0, . . 
.)) is an element of P and we may equip P with a multiplication defined for A = I(a) ∈ P,


$B = I(b) \in \mathcal{P}$ by $A\,B := I(a \star b)$. From Sect. 3.2 and Theorem 4.1 we can see that this multiplication corresponds to the weak multiplication. The multiplication is well-defined by the injectivity of the map $I$ discussed in the previous section and it follows from the associativity of ⋆ proved in Theorem 3.1 that it is associative. That $\mathcal{P}$ is a superalgebra follows from the fact that the maps $\{G(t) : t \in \mathbb{R}\}$ leave the exponential domain invariant and can therefore be used to $\mathbb{Z}_2$-grade $\mathcal{P}$ as $\mathcal{P}_0 \oplus \mathcal{P}_1$, where elements of $\mathcal{P}_0$ commute about $G$ and elements of $\mathcal{P}_1$ anticommute about $G$. It is easy to see that $I$ is a superalgebra isomorphism from $T(\mathcal{I})$ to $\mathcal{P}$. Recall that one of the defining properties of a superalgebra homomorphism is that it preserves gradings. We have then that $\mathcal{P}$, with this multiplication, is a unital complex associative †-superalgebra.

Theorem 6.1. The map $\phi : \mathcal{L} \to \mathcal{P}_{SLie}$ defined by $L \mapsto \Xi_L$ is a †-Lie superalgebra morphism.

Proof. It suffices to show the result for $L, M$ of definite parity. Set $Y'' = (-1)^{\sigma(L)\sigma(M)}$. Then we have

\begin{align*}
\{\phi(L), \phi(M)\} &= \Xi_L \Xi_M - Y'' \, \Xi_M \Xi_L \\
&= I(d\Xi_L \star d\Xi_M) - Y'' I(d\Xi_M \star d\Xi_L) \\
&= I(0, d\Xi_L . d\Xi_M, d\Xi_L \otimes d\Xi_M + Y'' d\Xi_M \otimes d\Xi_L, 0, \ldots) \\
&\quad - Y'' I(0, d\Xi_M . d\Xi_L, d\Xi_M \otimes d\Xi_L + Y'' d\Xi_L \otimes d\Xi_M, 0, \ldots) \\
&= I(0, d\Xi_L . d\Xi_M - Y'' d\Xi_M . d\Xi_L, 0, \ldots) \\
&= I(0, d\Xi_{\{L,M\}}, 0, \ldots) \\
&= \Xi_{\{L,M\}} = \phi(\{L, M\}) .
\end{align*}

Therefore $\phi$ is a Lie superalgebra morphism. That $\phi$ is a linear †-morphism is obvious.

Corollary 6.2. There exists a unique complex unital associative †-superalgebra morphism $J$ from $\mathcal{U}$ to $\mathcal{P}$ such that $J(L) = \Xi_L$ for each $L \in \mathcal{L}$.

Proof. The universal property guarantees the existence of a map that extends the morphism $\phi$. Such an extension is the required map $J$.

6.2. A formula for the chaos map. The complex numbers $\mathbb{C}$ may be considered as a Lie superalgebra with even space $\mathbb{C}$, odd space $\{0\}$ and with bracket mapping each pair of elements to zero. Consider the map $\mathcal{L} \to \mathbb{C}_{SLie}$ that sends every element of $\mathcal{L}$ to 0.
The universal property guarantees the existence of a unique unital complex †-superalgebra morphism extending this zero map from L to U. This map is called the co-unit and is denoted η. The kernel of η is denoted K and is equal to the ideal of U generated by L, i.e., ULU . Proposition 6.3. Let B be a complex associative †-superalgebra not necessarily possessing a unit. Let φ1 : L → BSLie be a Lie †-superalgebra morphism. Then φ1 may be extended uniquely to a †-superalgebra morphism j1 : K → B. Proof. Denote by A the space B × C equipped with multiplication (b1 , z1 )(b2 , z2 ) = (b1 b2 + z1 b2 + b1 z2 , z1 z2 ).

(25)


If the even and odd subspaces of B are B0 and B1 respectively and we define A0 = B0 ×C, A1 = B1 × {0}, then it is a simple matter to verify that A0 A0 , A1 A1 ⊂ A0 ,

A0 A1 , A1 A0 ⊂ A1 .

From this it follows that A = A0 ⊕ A1 may be considered to be a Lie superalgebra with even space A0 , odd space A1 and bracket defined as in (9). Furthermore, (0, 1) is evidently a unit for A and we may define an involution on A by (b, z)† = (b† , z). ¯ Thus we have that A is a unital †-superalgebra. It is clear that the map φ : L → ASLie , L 7→ (φ1 (L), 0) is a Lie †-superalgebra morphism. By universality, φ extends to a morphism j acting on U. For arbitrary U ∈ U we may write j(U ) = (j1 (U ), η1 (U )). From (25) we can see that η1 : U → C is the co-unit map as η1 (L) = 0 for each L ∈ L. Thus η1 = η and for any U ∈ K we have that η1 (U ) = η(U ) = 0. Therefore, given an arbitrary U ∈ K, we have j(U ) = (j1 (U ), 0). From (25) we see that j1 restricted to K is a †-superalgebra morphism extending φ1 . It remains to establish uniqueness. Suppose j˜1 is a second such morphism. Every component of U can be written V + z1 with V ∈ K ˜ + z1) = (j˜1 (V ), η(z)), and z ∈ C. This enables us to define a map j˜ : U → A by j(V where V ∈ K, z ∈ C. Eq. (25) shows that j˜ is a unital †-superalgebra morphism from U to A extending φ. Universality forces j˜ = j from which it follows that j˜1 = j1 giving the required uniqueness of j1 . Let γ be the co-product of U, that is the extension to U of the †-Lie superalgebra morphism from L to (U ⊗U )SLie defined by L 7→ L⊗1+1⊗L. We define κ : U 7→ U ⊗U by κ(U ) = γ(U ) − U ⊗ 1. While κ is not a superalgebra homomorphism, it is parity preserving. Let (L1 , . . . , Ln ) be a basis of L consisting of homogeneous elements. Let X be the set of all finite sequences (i1 , . . . , ir ) so that r ≥ 0 is arbitrary, i1 ≤ i2 ≤ · · · ≤ ir and ip < ip+1 when Lip and Lip+1 are odd. The Z2 -graded version of the Poincar´e-BirkoffWitt theorem [S] states that the elements (Lα )α∈X , where Lα := Li1 Li2 . . . Lir form a basis for U. Given a finite sequence α we denote the length r of α by |α| and call it the degree of the basis element Lα . 
Any U ∈ U will have a component in the basis of U just described that is of maximal degree. The degree of any element U of U is taken to be the degree of this maximal component and is denoted deg U . Proposition 6.4. The morphism κ maps U into U ⊗ K. Furthermore, κ possesses the property that given α ∈ X there exist β1 , . . . , βm ∈ X with each |βj | < |α| and elements k1 , . . . , km ∈ K such that κ(Lα ) =

m X

Lβj ⊗ kj .

j=1

We shall refer to this property as the degree-reducing property.


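Proof. If $\alpha = (i_1, \ldots, i_r)$ then

\[ \kappa(L_\alpha) = (L_{i_1} \otimes 1 + 1 \otimes L_{i_1})(L_{i_2} \otimes 1 + 1 \otimes L_{i_2}) \cdots (L_{i_r} \otimes 1 + 1 \otimes L_{i_r}) - L_{i_1} \ldots L_{i_r} \otimes 1 . \tag{26} \]

The expansion of the right-hand side of (26) consists of $2^r - 1$ terms. It is clear that the first entry of each term must be of degree at most $r - 1$ and the second entry of each tensor must be of degree at least 1. The former observation gives the degree-reducing property for $L_\alpha$ and the latter observation shows that $\kappa(L_\alpha) \in \mathcal{U} \otimes \mathcal{K}$.

Denote the identity map from $\mathcal{U}$ to $\mathcal{U}$ by id. We may define maps $\kappa_n : \mathcal{U} \to \mathcal{U} \otimes \mathcal{K} \otimes \cdots \otimes \mathcal{K}$ ($n$ copies of $\mathcal{K}$) for $n \ge 1$ recursively by

\[ \kappa_1 := \kappa, \qquad \kappa_n := (\kappa_1 \otimes \mathrm{id} \otimes \cdots \otimes \mathrm{id}) \circ \kappa_{n-1} . \]

It is convenient to set $\kappa_0 = \mathrm{id}$. Since each application of $\kappa$ is degree-reducing, for each $U \in \mathcal{U}$ there exists an integer $n$ such that $\kappa_n(U) = 0$. Furthermore, we remark that

\[ (\kappa_n \otimes \mathrm{id}) \circ \kappa = \kappa_{n+1} . \tag{27} \]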
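The maps $\kappa$ and $\kappa_n$ can be made concrete in low degree. The following sketch assumes the standard sign rule $(X \otimes Y)(X' \otimes Y') = (-1)^{\sigma(Y)\sigma(X')} XX' \otimes YY'$ for the graded tensor product $\mathcal{U} \otimes \mathcal{U}$ (our assumption; the convention is not restated in this section), and takes $L, M \in \mathcal{L}$ of definite parity. Since $\gamma$ is unital, $\kappa(1) = \gamma(1) - 1 \otimes 1 = 0$.

```latex
\begin{align*}
\kappa(L) &= \gamma(L) - L \otimes 1 = 1 \otimes L, \\
\kappa_2(L) &= (\kappa \otimes \mathrm{id})(1 \otimes L) = \kappa(1) \otimes L = 0, \\
\kappa(LM) &= (L \otimes 1 + 1 \otimes L)(M \otimes 1 + 1 \otimes M) - LM \otimes 1 \\
           &= L \otimes M + (-1)^{\sigma(L)\sigma(M)} M \otimes L + 1 \otimes LM, \\
\kappa_2(LM) &= \kappa(L) \otimes M + (-1)^{\sigma(L)\sigma(M)} \kappa(M) \otimes L + \kappa(1) \otimes LM \\
             &= 1 \otimes L \otimes M + (-1)^{\sigma(L)\sigma(M)}\, 1 \otimes M \otimes L .
\end{align*}
```

For $r = 2$ the expansion of $\kappa(LM)$ has $2^2 - 1 = 3$ terms, each first entry has degree at most 1 and each second entry lies in $\mathcal{K}$, as the degree-reducing property requires.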

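Let $\xi$ be the complex associative †-superalgebra morphism from $\mathcal{K}$ to $\mathcal{I}$ that extends the Lie †-superalgebra morphism $L \mapsto d\Xi_L$ according to Proposition 6.3. We can therefore form the morphism $\eta \otimes \xi \otimes \cdots \otimes \xi$ ($n$ copies of $\xi$) from $\mathcal{U} \otimes \mathcal{K} \otimes \cdots \otimes \mathcal{K}$ ($n$ copies of $\mathcal{K}$) to $\mathbb{C} \otimes \mathcal{I} \otimes \cdots \otimes \mathcal{I}$ ($n$ copies of $\mathcal{I}$). We may identify the range of $\eta \otimes \xi \otimes \cdots \otimes \xi$ ($n$ copies of $\xi$) with $\otimes^n \mathcal{I}$. We now state the main theorem, giving an explicit formula for the map $J$ of Corollary 6.2. The ungraded version of this theorem is proved in [HPu] using a combinatorial approach.

Theorem 6.5. If $\chi : \mathcal{U} \to T(\mathcal{I})$ is defined component-wise for $\alpha \ge 0$ by

\[ \chi(U)_\alpha = (\eta \otimes \underbrace{\xi \otimes \cdots \otimes \xi}_{\alpha \text{ times}}) \circ \kappa_\alpha(U), \tag{28} \]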

then $J(U) = I(\chi(U))$. The map $\chi$ is known as the chaos map.

Proof. Let $\tilde{J}$ be defined to be the map $\mathcal{U} \to \mathcal{P}$ that sends $U$ to $I(\chi(U))$. By the uniqueness of the extension of a morphism to the universal enveloping superalgebra we show that $J = \tilde{J}$ by demonstrating that $\tilde{J}$ satisfies

(i) $\tilde{J}(L) = \Xi_L$ for each $L \in \mathcal{L}$,
(ii) $\tilde{J}(U)\,\tilde{J}(V) = \tilde{J}(UV)$ for all $U, V \in \mathcal{U}$.

To establish (i) we note that for all $n > 1$, $\kappa_n(L) = 0$. From this it follows that $\chi(L) = (0, d\Xi_L, 0, \ldots)$ giving $\tilde{J}(L) = I(d\Xi_L) = \Xi_L$ as required. To establish (ii) we take advantage of the classical relation $df(\Lambda) = f(\Lambda + d\Lambda) - f(\Lambda)$, where $f$ is a polynomial function. A rigorous integral form of this relation in our notation is

\[ \tilde{J}(U) = \eta(U)\,\mathrm{Id} + \int \tilde{J} \otimes \xi \circ \kappa(U) . \tag{29} \]

The proof of (ii) begins by establishing that $\tilde{J}$ satisfies relation (29). First we note that $\int \tilde{J} \otimes \xi \circ \kappa(U)$ may be re-written as $I(\chi \otimes \xi \circ \kappa(U))$. The injectivity of the map $I$ established in the previous section means that the problem is reduced to showing that

\[ \chi(U) = \eta(U) + \chi \otimes \xi \circ \kappa(U) . \tag{30} \]

The 0th order component of $\chi(U)$ is $\eta \circ \kappa_0(U) = \eta(U)$, which is also the 0th order component of the right-hand side of (30). For $n \ge 1$, the $n$th order component of $\chi(U)$ is $\eta \otimes \xi \otimes \cdots \otimes \xi \circ \kappa_n(U)$. On the right-hand side, the $n$th order component is

\[ \bigl( (\eta \otimes \underbrace{\xi \otimes \cdots \otimes \xi}_{n-1 \text{ times}}) \circ \kappa_{n-1} \bigr) \otimes \xi \circ \kappa(U) . \tag{31} \]

The rightmost $\xi$ in this expression operates on the second entry of the tensor $\kappa(U)$ while the map $\eta \otimes \xi \otimes \cdots \otimes \xi \circ \kappa_{n-1}$ operates on the first entry. Thus we may delay the operation of $\xi$ by one composition, include it in the leftmost map and so re-write (31) as

\[ (\eta \otimes \underbrace{\xi \otimes \cdots \otimes \xi}_{n \text{ times}}) \circ (\kappa_{n-1} \otimes \mathrm{id}) \circ \kappa(U) . \]

By (27) this is equal to

\[ (\eta \otimes \underbrace{\xi \otimes \cdots \otimes \xi}_{n \text{ times}}) \circ \kappa_n(U) , \]

which is the $n$th order component of $\chi(U)$ given in (28). Having proved relation (30) we can conclude that (29) holds.

We may now proceed with the main part of the proof of (ii). We proceed by induction on $\deg U + \deg V$. If at least one of $U$, $V$ is of degree zero then the result is immediate. If $\deg U = \deg V = 1$, then $U$ may be written as $z1 + L$ with $z \in \mathbb{C}$ and $L \in \mathcal{L}$. Similarly, $V$ may be written as $w1 + M$. Linearity allows us to assume that $L$ and $M$ are of definite parity. By (29) we have $\tilde{J}(U) = z\,\mathrm{Id} + \Xi_L$, $\tilde{J}(V) = w\,\mathrm{Id} + \Xi_M$ so we may write

\[ \tilde{J}(U)\,\tilde{J}(V) = zw\,\mathrm{Id} + z\,\Xi_M + w\,\Xi_L + \Xi_L \Xi_M . \]

On the other hand, $UV$ is of degree 2, so using the definition of $\tilde{J}$,

\[ \tilde{J}(UV) = I(\eta(UV),\ \eta \otimes \xi \circ \kappa(UV),\ \eta \otimes \xi \otimes \xi \circ \kappa \otimes \mathrm{id} \circ \kappa(UV),\ 0, 0, \ldots) . \]

A brief calculation enables us to re-write this as

\[ \tilde{J}(UV) = I(zw,\ z\,d\Xi_M + w\,d\Xi_L + d\Xi_{LM},\ d\Xi_L \otimes d\Xi_M + (-1)^{\sigma(L)\sigma(M)} d\Xi_M \otimes d\Xi_L) . \]

By the definition of the ⋆ product we conclude

\begin{align*}
\tilde{J}(UV) &= I(zw + z\,d\Xi_M + w\,d\Xi_L + d(\Xi_L \Xi_M)) \\
&= zw\,\mathrm{Id} + z\,\Xi_M + w\,\Xi_L + \Xi_L \Xi_M \\
&= \tilde{J}(U)\,\tilde{J}(V) .
\end{align*}

We now have that (ii) holds for all $U, V \in \mathcal{U}$ such that $\deg U + \deg V \le 2$. Assume, by way of induction, that the result holds for all $U, V \in \mathcal{U}$ such that $\deg U + \deg V < k$ for some positive integer $k$. Now suppose we have $U, V \in \mathcal{U}$ such that $\deg U + \deg V = k$. We must show that $\tilde{J}(U)\,\tilde{J}(V) = \tilde{J}(UV)$. Consider the processes $\tilde{J}(U)$, $\tilde{J}(V)$ at time zero. From (29) we have $\tilde{J}(U)_0 = \eta(U)\,\mathrm{Id}$, $\tilde{J}(V)_0 = \eta(V)\,\mathrm{Id}$. Therefore $(\tilde{J}(U)\,\tilde{J}(V))_0 = \eta(U)\,\mathrm{Id}\;\eta(V)\,\mathrm{Id} = \eta(UV)\,\mathrm{Id}$ by the definition of the multiplication on $\mathcal{P}$ and the morphism property of $\eta$. But $\eta(UV)\,\mathrm{Id} = \tilde{J}(UV)_0$ by (29) so we have $(\tilde{J}(U)\,\tilde{J}(V))_0 = \tilde{J}(UV)_0$. It will now suffice to show that

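\[ d(\tilde{J}(U)\,\tilde{J}(V)) = d\tilde{J}(UV) . \tag{32} \]

Applying relation (13) to the left-hand side of (32) gives

\[ d(\tilde{J}(U)\,\tilde{J}(V)) = \tilde{J}(U)\,d\tilde{J}(V) + (d\tilde{J}(U))\,\tilde{J}(V) + d\tilde{J}(U) . d\tilde{J}(V) . \tag{33} \]

The differential form of (29) gives $d\tilde{J}(W) = \tilde{J} \otimes \xi \circ \kappa(W)$ for arbitrary $W \in \mathcal{U}$. Thus we may re-write the right-hand side of (33) as

\[ \tilde{J}(U)(\tilde{J} \otimes \xi \circ \kappa(V)) + (\tilde{J} \otimes \xi \circ \kappa(U))\,\tilde{J}(V) + \tilde{J} \otimes \xi \circ \kappa(U) . \tilde{J} \otimes \xi \circ \kappa(V) . \tag{34} \]

The degree-reducing property of $\kappa$ allows us to invoke the inductive hypothesis and re-write (34) as

\begin{align*}
&\tilde{J} \otimes \xi \bigl( U \otimes 1\,\kappa(V) + \kappa(U)\,V \otimes 1 + \kappa(U)\kappa(V) \bigr) \\
&\quad = \tilde{J} \otimes \xi \bigl( U \otimes 1\,(\gamma(V) - V \otimes 1) + (\gamma(U) - U \otimes 1)\,V \otimes 1 + (\gamma(U) - U \otimes 1)(\gamma(V) - V \otimes 1) \bigr) \\
&\quad = \tilde{J} \otimes \xi (\gamma(UV) - UV \otimes 1) \\
&\quad = \tilde{J} \otimes \xi \circ \kappa(UV) \\
&\quad = d\tilde{J}(UV)
\end{align*}

as required.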

7. Remarks

In classical stochastic analysis, the chaos expansion of a polynomial function $f$ of a process $\Lambda$ may be obtained by iterating the relation $df(\Lambda) = f(\Lambda + d\Lambda) - f(\Lambda)$. In this work we have replaced the polynomial by an element of the universal enveloping superalgebra of $\mathcal{L}$ and through $\gamma$, $\kappa$, $\eta$, $\xi$ and $\chi$ we have produced a rigorous interpretation of this result in $\mathbb{Z}_2$-graded quantum stochastic calculus with the resulting chaos expansion $\chi$.

Acknowledgement. T.M.W.E. acknowledges the partial support of the British Council and Committee for Scientific Research for funding from the Joint British-Polish Collaboration Programme which made possible a visit to the University of Lodz where part of this work was carried out. Conversations with R.L. Hudson are also acknowledged.

References

[AH] Applebaum, D.B. and Hudson, R.L.: Fermion Ito's Formula and Stochastic Evolutions. Commun. Math. Phys. 96, 473–496 (1984)
[C] Chevalley, C.: The Construction And Study Of Certain Important Algebras. Publ. Math. Soc. Japan I, Princeton, NJ: Princeton University Press, 1955
[EH] Eyre, T.M.W. and Hudson, R.L.: Representations of Lie Superalgebras and Generalized Boson-Fermion Equivalence In Quantum Stochastic Calculus. Commun. Math. Phys. to appear
[HP1] Hudson, R.L. and Parthasarathy, K.R.: Quantum Ito's Formula and Stochastic Evolutions. Commun. Math. Phys. 93, 301–323 (1984)
[HP2] Hudson, R.L. and Parthasarathy, K.R.: Unification of Fermion and Boson Stochastic Calculus. Commun. Math. Phys. 104, 457–470 (1986)
[HPu] Hudson, R.L. and Pulmannova, S.: Chaotic Expansions of Elements of the Universal Enveloping Algebra of a Lie Algebra Associated with a Quantum Stochastic Calculus. Proc. LMS to appear
[K] Kac, V.G.: Lie Superalgebras. Adv. in Math. 26, 8–96 (1977)
[L] Lindsay, J.M.: Independence for Quantum Stochastic Integrators. In: Quantum Probability and Related Topics Vol. VI, 325–332 (1991)
[P] Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Basel: Birkhäuser, 1992
[S] Scheunert, M.: The Theory Of Lie Superalgebras. Lecture Notes In Mathematics, Vol. 716, Berlin: Springer, 1979

Communicated by H. Araki

Commun. Math. Phys. 192, 29 – 45 (1998)

Communications in

Mathematical Physics c Springer-Verlag 1998

Conservation Laws for Linear Equations on Quantum Minkowski Spaces

M. Klimek⋆

Institute of Mathematics and Computer Science, Technical University of Częstochowa, ul. Dąbrowskiego 73, 42-200 Częstochowa, Poland. E-mail: [email protected]

Received: 4 April 1997 / Accepted: 10 June 1997

Abstract: The general, linear equations with constant coefficients on quantum Minkowski spaces are considered and the explicit formulae for their conserved currents are given. The proposed procedure can be simplified for ∗-invariant equations. The derived method is then applied to Klein–Gordon, Dirac and wave equations on different classes of Minkowski spaces. In the examples also symmetry operators for these equations are obtained. They include quantum deformations of classical symmetry operators as well as an additional operator connected with deformation of the Leibnitz rule in non-commutative differential calculus.

⋆ Research partially supported by KBN grant 2 P03B 130 12.

1. Introduction

The investigation of conservation laws and invariants of motion for a given action or equation of motion is an important part of classical mechanics and field theory. The general problem is solved by the Noether theorem in which symmetry of the action yields conserved currents and integrals of motion. When we restrict our study to linear equations the method of Takahashi and Umezawa [1] is known which allows us to construct invariants for classical field-theoretic models. We have shown previously that it can be extended to discrete and mixed models on commutative spaces [2–4]. The equations of this kind appear also in realizations of generators and Casimir operators of quantum algebras on commutative spaces, namely for κ-deformed algebras [5–7]. The aim of this paper is to extend this method to linear equations on quantum Minkowski spaces. We have chosen to work within the framework of a class of Minkowski spaces endowed with the action of quantum Poincaré groups, which were introduced by Woronowicz and Podleś in [8–10] and use the differential calculus developed by Podleś in [11]. Let us however notice that once the explicit formula of Leibnitz rule for exterior derivatives for other quantum spaces and differential calculi is given,


one can extend the proposed procedure. The example of this type is the braided differential calculus on q-Minkowski space introduced in the work by Ogievetsky et al. [12] and formally developed by Majid [13, 14]. Let us sketch briefly the steps of the procedure:

– in derivation of the conserved currents the Leibnitz rule is used. In the case of non-commutative differential calculus it is deformed similarly as in the discrete calculus [2, 4]. We introduce the modified Leibnitz rule
– the special operator $\Gamma$ is built from derivatives acting on the right- and left-hand side. In the quantum case we must use derivatives and their conjugations
– for hermitian equation operators the hermitian currents can be derived. As the explicit form of the scalar product on quantum Minkowski space is not known we shall use throughout the paper the notion of a ∗-invariant equation operator for which the construction can be simplified
– to obtain different solutions for a given equation one needs its symmetry operators. Their algebras are well known in classical models; we show in examples the symmetry operators for Klein–Gordon, Dirac and wave equations without discussion of their algebraic properties. It was however shown in the example of the Klein–Gordon equation on quantum Minkowski space with $Z = 0$ that the algebra closes [15].

2. Modification of Leibnitz Rule in Differential Calculi on Quantum Minkowski Spaces

In the paper we shall work within the framework of differential calculi on quantum Minkowski spaces, in the general case introduced and investigated in [11]. Let us recall the fundamental rules of commutation for partial derivatives and variables:

\begin{align}
\partial^j (x^i) &= g^{ji}, \tag{1} \\
\partial^l \partial^k &= R^{lk}{}_{ij}\, \partial^i \partial^j, \tag{2} \\
\partial^i x^k &= g^{ik} + \bigl( R^{ik}{}_{ab}\, x^a - (RZ)^{ik}{}_{b} \bigr) \partial^b, \tag{3} \\
(R - 1)^{ij}{}_{kl} \bigl[ x^k x^l - Z^{kl}{}_{s}\, x^s + T^{kl} \bigr] &= 0, \tag{4}
\end{align}

where the R-matrix fulfills the quantum Yang–Baxter equation $R_{23} R_{12} R_{23} = R_{12} R_{23} R_{12}$ and the condition $R^2 = 1$. The metric tensor $g$ appearing in the above formulae is R-symmetric, that means $Rg = g$. The functions on quantum Minkowski space are understood as formal power series of monomials of variables. For the product of two such arbitrary functions the partial derivatives obey the following Leibnitz rule [11]:

\[ \partial^i f g = (\partial^i f) g + (\zeta^i_j f) \partial^j g \tag{5} \]

with the transformation operator $\zeta$ fulfilling the equality:

\[ \zeta^i_j (f g) = (\zeta^i_m f)(\zeta^m_j g) \tag{6} \]

and connected with the operator $\rho$ from [11] via the equation:

\[ \zeta^i_j = g^{ia} \rho_a{}^{b} g_{bj} . \tag{7} \]

Formula (3) yields the explicit form of the transformation operator acting on the monomial of the first order:

\[ \zeta^i_j x^k = R^{ik}{}_{aj}\, x^a - (RZ)^{ik}{}_{j} \tag{8} \]

and using (6) can be easily extended to an arbitrary function on quantum Minkowski space. It is an interesting point that (4) implies that an analogous Leibnitz rule is valid also for variables, namely:

\[ x^i (f g) = (x^i f) g = (x^i f)_{act}\, g + (\tilde{\zeta}^i_j f) x^j g, \tag{9} \]

where the action of the variable on the monomial is due to the selfinteraction of variables, is determined by properties of quantum Minkowski space (4) and looks as follows:

\[ (x^i [k_1, \ldots, k_n])_{act} = \sum_{l=1}^{n} (\tilde{\zeta}^i_j [k_1, \ldots, k_{l-1}])\, v^{j k_l}\, [k_{l+1}, \ldots, k_n] . \tag{10} \]

The initial and last term are of the form:

\[ l = 1: \quad v^{i k_1} [k_2, \ldots, k_n], \qquad l = n: \quad \tilde{\zeta}^i_j [k_1, \ldots, k_{n-1}]\, v^{j k_n}, \tag{11} \]

and the transformation operator $\tilde{\zeta}$ appearing in the formulae (9,10) is defined for an arbitrary function by its action on $x^k$ and multiplicity property:

\begin{align}
\tilde{\zeta}^i_j x^k &= R^{ik}{}_{aj}\, x^a + (Z - RZ)^{ik}{}_{j}, \tag{12} \\
\tilde{\zeta}^i_j (f g) &= (\tilde{\zeta}^i_m f)(\tilde{\zeta}^m_j g) . \tag{13}
\end{align}

In the case where the selfinteraction of variables determined by the tensor $v = RT - T$ vanishes, the Leibnitz rule for variables reduces to the formula:

\[ x^i (f g) = (x^i f) g = (\tilde{\zeta}^i_j f) x^j g . \tag{14} \]

Finally we can also write both Leibnitz rules (5,9) in the vector form:

\begin{align}
\partial f g &= (\partial f) g + (\zeta f) \partial g, \tag{15} \\
x f g &= (x f) g = (x f)_{act}\, g + (\tilde{\zeta} f) x g . \tag{16}
\end{align}
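As a consistency check on (3)–(14), one can look at the commutative case. The sketch below assumes that this case corresponds to the choice $R^{ik}{}_{ab} = \delta^i_b \delta^k_a$ (the flip), $Z = 0$, $T = 0$; this identification is our assumption for illustration and is not stated in the text. The flip satisfies $R^2 = 1$ and the Yang–Baxter equation, and $Rg = g$ then simply says that $g$ is symmetric.

```latex
\begin{align*}
(R - 1)^{ij}{}_{kl}\, x^k x^l &= x^j x^i - x^i x^j = 0
  && \text{(4): the variables commute}, \\
\zeta^i_j x^k = \tilde{\zeta}^i_j x^k &= R^{ik}{}_{aj}\, x^a = \delta^i_j\, x^k
  && \text{(8), (12)}, \\
\zeta^i_j (f g) &= (\zeta^i_m f)(\zeta^m_j g) = \delta^i_j\, f g
  && \text{(6): } \zeta \text{ acts as the identity}, \\
\partial^i (f g) &= (\partial^i f) g + f\, \partial^i g
  && \text{(5): the usual Leibnitz rule}, \\
x^i (f g) &= f\, x^i g
  && \text{(14)}.
\end{align*}
```

Under these assumptions both deformed rules collapse to the familiar ones, so the operators $\zeta$ and $\tilde{\zeta}$ measure exactly the departure from the commutative calculus.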

In the next sections we shall consider linear equations of motion and derive conservation laws for them. We have solved a similar problem in discrete models on commutative spaces [2–4]. The common feature of discrete and non-commutative models is the deformation of Leibnitz rules for partial derivatives. In both cases the transformation operators appear on the right-hand side of the formulae. Their form is different, depending on the kind of model – in discrete models it is simply a shift operator in a given direction while on quantum spaces it is described by (6,8). Thus the main obstacle in extending the Takahashi–Umezawa method is the fact that one of the operators on the right-hand side of (5) acts simultaneously on the first and second function in the product. In the discrete case we modified the Leibnitz rule using the inverse transformation operator which was simply the back-shift operator on the lattice. Investigating the special case of the Klein–Gordon equation on quantum Minkowski spaces with Z = 0 we also obtained the inverse operator ζ − , showed how it is connected with the transformation operator via ∗-operation and applied it to modification of the Leibnitz rule. This modification is also possible in the general case which we now investigate. It is easy to conjecture from the proof of Proposition 1.1 of [11] that the operator:


$$\zeta_j^{-\,i} := *\,\zeta_j{}^{i}\,* \tag{17}$$
is the inverse transformation operator fulfilling:
$$\zeta_j^{-\,m}\,\zeta_m{}^{i} = \zeta_j{}^{m}\,\zeta_m^{-\,i} = \delta_j^i. \tag{18}$$
The explicit form of the inverse operator for an arbitrary function results from its multiplicative property:
$$\zeta_j^{-\,i}(fg) = (\zeta_j^{-\,m} f)(\zeta_m^{-\,i} g) \tag{19}$$
and from its action on monomials of the first order:
$$\zeta_j^{-\,i}\, x^k = R^{ki}{}_{ja}\, x^a + Z^{ki}{}_{j}. \tag{20}$$

Now using formula (18) and changing the product on the left-hand side of the Leibnitz rule (5), we obtain its modification:
$$\partial^i[(\zeta_i^{-\,a} f)g] = (-\partial^{\dagger a} f)g + f\,\partial^a g, \tag{21}$$
where we use the following notation for the conjugated derivative:
$$\partial^{\dagger a} := -\partial^i\,\zeta_i^{-\,a} = -\partial^i * \zeta_i{}^{a} *. \tag{22}$$

The conjugated derivative $\partial^\dagger$ can be described using the ∗-operation. Let us notice that, similarly to the special case from [15], the Leibnitz rules for the conjugated derivative and the operator $*(-\partial)*$ are identical:
$$\partial^{\dagger a}(fg) = (\partial^{\dagger i} f)\,\zeta_i^{-\,a} g + f\,\partial^{\dagger a} g, \tag{23}$$
$$*(-\partial^a)*(fg) = (-*\partial^i * f)\,\zeta_i^{-\,a} g + f\,(-*\partial^a * g). \tag{24}$$
It is easy to check that the action of both operators on monomials of the first order is the same:
$$\partial^{\dagger a} x^k = -g^{ka}, \qquad *(-\partial^a)*\, x^k = -g^{ka}. \tag{25}$$
Here we have used the definition of $\partial^\dagger$ given in (22) and the property of the metric tensor $g^{ij} = g^{ji}$ from [11]. The Leibnitz rules (23,24) for both operators and their equality on monomials of the first order imply, by induction, that $\partial^{\dagger a}$ and $*(-\partial^a)*$ agree on arbitrary monomials. Therefore they act in the same way on all functions on quantum Minkowski space and thus are identical:
$$\partial^{\dagger a} = *(-\partial^a)*. \tag{26}$$


3. The Conservation Laws for Linear Equations of Motion on Quantum Minkowski Spaces

3.1. Equations of motion on quantum Minkowski space. Lately a number of equations of motion on quantum spaces were studied in the literature. They include the Klein–Gordon and Dirac equations and their solutions investigated by Podleś [11] as well as equations considered on q-Minkowski space from [16–22] built within the framework of braided differential calculus [12–14, 23]. In addition some quantum models on non-commutative spaces, in which the deformation of commutation relations is motivated by the Heisenberg principle and classical gravity, were studied by Doplicher et al. in [24, 25]. In this section we shall consider a general linear equation on the quantum Minkowski space defined by (4). The operators of such equations include the Klein–Gordon and Dirac operators from [11]. We construct the operator of the equation using partial derivatives fulfilling (2,3) in the following way:
$$\Lambda(\partial)\Phi = 0, \tag{27}$$
$$\Lambda(\partial) = \Lambda_0 + \sum_{l=1}^{N}\Lambda_{\mu_1\ldots\mu_l}\,\partial^{\mu_1}\ldots\partial^{\mu_l}. \tag{28}$$

As the derivatives in (28) R-commute, the coefficients (which may be matrices) have the following property with respect to permutation of indices:
$$R^{\mu_k\mu_{k+1}}{}_{\nu\rho}\,\Lambda_{\mu_1\ldots\mu_k\mu_{k+1}\ldots\mu_l} = \Lambda_{\mu_1\ldots\nu\rho\ldots\mu_l}, \tag{29}$$
where l = 1, ..., N and k = 1, ..., l − 1; that is, they are R-symmetric with respect to permutations. In analogy with classical field theory we consider constant coefficients, which means that they obey the following equations:
$$\zeta_j{}^{\mu_k}\,\Lambda_{\mu_1\ldots\mu_{k-1}\mu_k\ldots\mu_l} = \Lambda_{\mu_1\ldots\mu_{k-1}j\ldots\mu_l}, \tag{30}$$
$$\partial^{j}\,\Lambda_{\mu_1\ldots\mu_l} = 0, \tag{31}$$
where l = 1, ..., N, k = 1, ..., l, j = 1, 2, 3, 4. Let us notice that formulae (18,30) also imply:
$$\zeta_j^{-\,\mu_k}\,\Lambda_{\mu_1\ldots\mu_{k-1}\mu_k\ldots\mu_l} = \Lambda_{\mu_1\ldots\mu_{k-1}j\ldots\mu_l} \tag{32}$$
for l = 1, ..., N, k = 1, ..., l, j = 1, 2, 3, 4, while (26,31) and the condition of ∗-invariance of the equation operator (65) yield for ∗-invariant equations:
$$\partial^{\dagger j}\,\Lambda_{\mu_1\ldots\mu_l} = 0, \qquad l = 1, \ldots, N, \quad j = 1, 2, 3, 4. \tag{33}$$

3.2. The operators $\Gamma$ and $\hat\Gamma$. In order to derive the conservation law for Eq. (27) we need an operator $\Gamma$ which in the classical procedure of Takahashi–Umezawa fulfills the equality:
$$\sum_\mu(\overleftarrow{\partial}^{\mu} + \partial^\mu)\,\Gamma_\mu(\partial,\overleftarrow{\partial}) = \Lambda(\partial) - \Lambda(-\overleftarrow{\partial}). \tag{34}$$
In the above formula the partial derivatives obey the rules of classical differential calculus, so derivatives acting to the left and to the right commute.


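This is not the case in non-commutative differential calculi, in which the derivatives do not commute according to (2). Additionally we shall deal with conjugated derivatives introduced by modification of the Leibnitz rule. The set of derivatives and their conjugations in the sense of (22) becomes commutative only in the special case R = τ. As we consider the general case we should replace the equality (34) with the following condition for the operator $\Gamma$:
$$\sum_\mu(-\overleftarrow{\partial}^{\dagger\mu} + \partial^\mu)\circ\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \Lambda(\partial) - \Lambda(\overleftarrow{\partial}^{\dagger}), \tag{35}$$
where the operator $\Lambda(\overleftarrow{\partial}^{\dagger})$ looks as follows:
$$\Lambda(\overleftarrow{\partial}^{\dagger}) = \Lambda_0 + \sum_{l=1}^{N}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_l}\,\Lambda_{\mu_1\ldots\mu_l}. \tag{36}$$
We introduced the notation "◦" for the product to underline the way it acts on monomials of derivatives:
$$(-\overleftarrow{\partial}^{\dagger\mu} + \partial^\mu)\circ[\nu_1,\ldots,\nu_l]\,a(x)\,[\rho_1,\ldots,\rho_k] := -[\nu_1,\ldots,\nu_l,\mu]\,a(x)\,[\rho_1,\ldots,\rho_k] + [\nu_1,\ldots,\nu_l]\,\partial^\mu a(x)\,[\rho_1,\ldots,\rho_k], \tag{37}$$
where we have denoted the monomials of derivatives as follows:
$$[\rho_1,\ldots,\rho_k] := \partial^{\rho_1}\ldots\partial^{\rho_k}, \tag{38}$$
$$[\nu_1,\ldots,\nu_l] := \overleftarrow{\partial}^{\dagger\nu_1}\ldots\overleftarrow{\partial}^{\dagger\nu_l}. \tag{39}$$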

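Proposition 3.1. The unique solution of (35) in a class of polynomials of the derivatives $\partial$ and $\overleftarrow{\partial}^{\dagger}$ is of the form:
$$\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \Lambda_\mu + \sum_{l=1}^{N-1}\sum_{k=0}^{l}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\,\Lambda_{\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_l}\,\partial^{\mu_{k+1}}\ldots\partial^{\mu_l}. \tag{40}$$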
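The structure of (40) is easiest to see in a commutative, one-variable analogue, where condition (35) reduces to the telescoping identity $(p-q)\sum_{k=0}^{l} q^k p^{l-k} = p^{l+1}-q^{l+1}$. The following Python sketch checks that identity for a polynomial $\Lambda$. Here `p` and `q` are hypothetical commuting stand-ins for $\partial$ and $\overleftarrow{\partial}^{\dagger}$, and all function names are illustrative, so the check shows only the combinatorics of the double sum and not the non-commutative calculus itself.

```python
from collections import defaultdict

# A polynomial in two commuting symbols (p, q) is a dict {(deg_p, deg_q): coeff}.

def poly_mul(a, b):
    """Product of two polynomials in the commuting symbols p and q."""
    out = defaultdict(int)
    for (i, j), x in a.items():
        for (k, l), y in b.items():
            out[(i + k, j + l)] += x * y
    return {m: c for m, c in out.items() if c != 0}

def poly_sub(a, b):
    """Difference a - b, with zero coefficients dropped."""
    out = defaultdict(int)
    for m, c in a.items():
        out[m] += c
    for m, c in b.items():
        out[m] -= c
    return {m: c for m, c in out.items() if c != 0}

def lam_in_p(coeffs):
    """Lambda(p) = sum_n coeffs[n] * p**n (analogue of (28))."""
    return {(n, 0): c for n, c in enumerate(coeffs) if c != 0}

def lam_in_q(coeffs):
    """Lambda(q) = sum_n coeffs[n] * q**n (analogue of (36))."""
    return {(0, n): c for n, c in enumerate(coeffs) if c != 0}

def gamma(coeffs):
    """Analogue of (40): sum_{l=0}^{N-1} sum_{k=0}^{l} q**k * coeffs[l+1] * p**(l-k)."""
    out = defaultdict(int)
    for l in range(len(coeffs) - 1):
        for k in range(l + 1):
            out[(l - k, k)] += coeffs[l + 1]
    return {m: c for m, c in out.items() if c != 0}

def satisfies_condition(coeffs):
    """Check the analogue of (35): (p - q) * Gamma(p, q) == Lambda(p) - Lambda(q)."""
    lhs = poly_mul({(1, 0): 1, (0, 1): -1}, gamma(coeffs))
    rhs = poly_sub(lam_in_p(coeffs), lam_in_q(coeffs))
    return lhs == rhs
```

For a Klein–Gordon-like choice such as `coeffs = [7, 0, 1]` (that is, $\Lambda(p) = 7 + p^2$), `gamma` returns the two terms $p + q$, which mirrors the two-term structure of the operator (78) obtained later for the Klein–Gordon equation.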

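Proof. The technique we use is very similar to that applied in the proof of the analogous proposition for discrete models [4]. We denote the monomials of derivatives as described in (38,39). The modified Leibnitz rule (21) implies that in order to solve (35) we should consider the general polynomial of order N − 1 with functional coefficients, of the following form:
$$\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = a^0_\mu + \sum_{l=1}^{N-1}\sum_{k=0}^{l}[\mu_1,\ldots,\mu_k]\,a^k_{\mu\mu_1\ldots\mu_l}\,[\mu_{k+1},\ldots,\mu_l]. \tag{41}$$
We apply condition (35) to the general form of the solution (41) to derive the equations for the coefficients $a^k_{\mu\mu_1\ldots\mu_l}$:
$$\begin{aligned}
\sum_\mu(-\overleftarrow{\partial}^{\dagger\mu}+\partial^\mu)\circ\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})
={}& -\sum_{l=1}^{N-1}\sum_{k=0}^{l}\sum_\mu[\mu_1,\ldots,\mu_k,\mu]\,a^k_{\mu\mu_1\ldots\mu_l}\,[\mu_{k+1},\ldots,\mu_l] \\
&+\sum_{l=1}^{N-1}\sum_{k=0}^{l}[\mu_1,\ldots,\mu_k]\sum_\mu(\zeta_\nu{}^{\mu}a^k_{\mu\mu_1\ldots\mu_l})\,[\nu,\mu_{k+1},\ldots,\mu_l] \\
&+\sum_{l=1}^{N-1}\sum_{k=0}^{l}[\mu_1,\ldots,\mu_k]\sum_\mu(\partial^\mu a^k_{\mu\mu_1\ldots\mu_l})\,[\mu_{k+1},\ldots,\mu_l] \\
&-\sum_\mu[\mu]\,a^0_\mu+\sum_\mu(\zeta_\nu{}^{\mu}a^0_\mu)[\nu]+\sum_\mu(\partial^\mu a^0_\mu) \\
={}& \Lambda(\partial)-\Lambda(\overleftarrow{\partial}^{\dagger}).
\end{aligned}\tag{42}$$
We compare the coefficients of monomials of the same type on both sides of the above condition and obtain the following set of equations for the functions $a^k_{\mu\mu_1\ldots\mu_l}$:
$$a^0_\mu = \Lambda_\mu, \tag{43}$$
$$\partial^\alpha a^0_{\alpha\mu} + \zeta_\mu{}^{\nu}a^0_\nu = \Lambda_\mu, \tag{44}$$
$$\zeta_\mu{}^{\alpha}a^0_{\alpha\mu_1\ldots\mu_l} + \partial^\alpha a^0_{\alpha\mu\mu_1\ldots\mu_l} = \Lambda_{\mu\mu_1\ldots\mu_l}, \tag{45}$$
$$-a^k_{\mu\mu_1\ldots\mu_l} + \zeta_{\mu_{k+1}}{}^{\alpha}\,a^{k+1}_{\alpha\mu_1\ldots\mu_k\mu\mu_{k+2}\ldots\mu_l} + \partial^\alpha a^{k+1}_{\alpha\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_l} = 0, \tag{46}$$
$$a^l_{\mu\mu_1\ldots\mu_l} + \partial^\alpha a^l_{\alpha\mu_1\ldots\mu_l\mu} = \Lambda_{\mu_1\ldots\mu_l\mu}, \tag{47}$$
with l = 1, ..., N − 1, k = 0, ..., l − 1. Starting from (43,44,45) we get the unique solution for the coefficients $a^0$:
$$a^0_\mu = \Lambda_\mu, \qquad a^0_{\mu\mu_1\ldots\mu_l} = \Lambda_{\mu\mu_1\ldots\mu_l}, \qquad l = 1, \ldots, N-1. \tag{48}$$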

This solution for the initial coefficients, which is R-symmetric and constant in the sense of (30,31), allows us to evaluate the remaining ones using (46). We start from the equations for l = N − 1 and conclude that they reduce to the following set:
$$-a^k_{\mu\mu_1\ldots\mu_{N-1}} + \zeta_{\mu_{k+1}}{}^{\alpha}\,a^{k+1}_{\alpha\mu_1\ldots\mu_k\mu\mu_{k+2}\ldots\mu_{N-1}} = 0, \qquad 0 \le k < N-1. \tag{49}$$
As we know the explicit expressions for $a^0_{\mu_1\ldots\mu_N}$ from (48), we are able to derive from the above equation and from the properties (30,32) of the coefficients the solution for $a^1_{\mu_1\ldots\mu_N}$, which has the following form:
$$a^1_{\mu\mu_1\mu_2\ldots\mu_{N-1}} = \Lambda_{\mu_1\mu\mu_2\ldots\mu_{N-1}}. \tag{50}$$
Using the explicit form of the coefficients of type $a^1_{\mu_1\ldots\mu_N}$ we perform the next step of the solution of the set (49) and solve it for k = 1, applying the same method as before, namely rewriting the set of equations, putting the solution (50) into it and then using (32). After subsequent calculations for k = 1, ..., N − 1 we obtain the general form of the coefficients $a^k_{\mu_1\ldots\mu_N}$:
$$a^k_{\mu\mu_1\ldots\mu_{N-1}} = \Lambda_{\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_{N-1}}. \tag{51}$$


Now from the form of the coefficients $a^k_{\mu_1\ldots\mu_N}$ we conclude that they all fulfill the condition (31), so for l = N − 2 we also obtain a set of equations in which the part with the full divergence vanishes:
$$-a^k_{\mu\mu_1\ldots\mu_{N-2}} + \zeta_{\mu_{k+1}}{}^{\alpha}\,a^{k+1}_{\alpha\mu_1\ldots\mu_k\mu\mu_{k+2}\ldots\mu_{N-2}} = 0, \qquad 0 \le k < N-2. \tag{52}$$
The method of solving Eqs. (52) follows the calculations done for l = N − 1. We start from the known coefficient of type $a^0$ described by (48) and derive the remaining ones using the subsequent equations from (52). The results, as before, are expressions constant in the sense of (30,31), of the form:
$$a^k_{\mu\mu_1\ldots\mu_{N-2}} = \Lambda_{\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_{N-2}}. \tag{53}$$
The next steps are analogous, so we shall not present them in detail. The general and unique solution of the set (43–47) looks as follows:
$$a^k_{\mu\mu_1\ldots\mu_l} = \Lambda_{\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_l}, \qquad l = 1, \ldots, N-1, \quad k = 0, \ldots, l. \tag{54}$$
The derivation of the explicit formulae for the unique solution for the coefficients of the operator $\Gamma_\mu$ concludes the proof of Proposition 3.1.

From the above proof of Proposition 3.1 we conclude that the unique solution of Eq. (35) can also be derived for equations in which the coefficients obey the requirement (30), but the condition (31) is weakened as follows:
$$\sum_{\mu_k}\partial^{\mu_k}\Lambda_{\mu_1\ldots\mu_k\ldots\mu_l} = 0, \qquad l = 1, \ldots, N, \quad k = 1, \ldots, l. \tag{55}$$

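Corollary 3.2. The unique solution of (35) in the class of polynomials of the derivatives $\partial$ and $\overleftarrow{\partial}^{\dagger}$ for the equation operator $\Lambda$ fulfilling (30,55) is of the form:
$$\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \Lambda_\mu + \sum_{l=1}^{N-1}\sum_{k=0}^{l}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\,\Lambda_{\mu_1\ldots\mu_k\mu\mu_{k+1}\ldots\mu_l}\,\partial^{\mu_{k+1}}\ldots\partial^{\mu_l}. \tag{56}$$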

In contrast with the classical case, where it is sufficient to know the operator $\Gamma_\mu$ to construct the conserved currents, we must additionally modify $\Gamma_\mu$ due to the deformation of the Leibnitz rule (5). We introduce the operator $\hat\Gamma_\mu$ in the form:
$$\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_j + \sum_{l=1}^{N-1}\sum_{k=0}^{l}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_{\mu_1\ldots\mu_k j\mu_{k+1}\ldots\mu_l}\,\partial^{\mu_{k+1}}\ldots\partial^{\mu_l}. \tag{57}$$
As we see, the modification consists of introducing the inverse transformation operator $\zeta^{-}$ in the monomials, between the derivatives acting to the left and those acting to the right.

3.3. The conservation laws and conserved currents. In this section we derive the conservation laws for linear equations with constant coefficients described by (27,28). To this aim it is necessary to have the solutions of the initial equation and of its conjugation (22). Having these two functions we can formulate the following proposition, which describes the explicit form of conserved currents for Eq. (27).


Proposition 3.3. Let us assume that the function $\Phi$ is an arbitrary solution of Eq. (27) with coefficients fulfilling (30,31), that is:
$$\Lambda(\partial)\Phi = 0 \tag{58}$$
and the function F solves the conjugated equation:
$$F\,\Lambda(\overleftarrow{\partial}^{\dagger}) = 0. \tag{59}$$
Then
$$J_\mu = F\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{60}$$
where the operator $\hat\Gamma_\mu$ is defined by (57), is a current which obeys the conservation law in the given differential calculus on quantum Minkowski space:
$$\sum_\mu\partial^\mu J_\mu = 0. \tag{61}$$

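Proof. In the straightforward proof of the proposition we use the modified Leibnitz rule (21) and the property (35) of the operator $\Gamma_\mu$, as well as the properties (30,31) of the coefficients:
$$\begin{aligned}
\sum_\mu\partial^\mu J_\mu
&= \sum_\mu\partial^\mu\,F\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi \\
&= \sum_\mu\partial^\mu\,F\Big(\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_j + \sum_{l=1}^{N-1}\sum_{k=0}^{l}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_{\mu_1\ldots\mu_k j\mu_{k+1}\ldots\mu_l}\,\partial^{\mu_{k+1}}\ldots\partial^{\mu_l}\Big)\Phi \\
&= \sum_j F\big(-\overleftarrow{\partial}^{\dagger j}\Lambda_j + \Lambda_j\,\partial^j\big)\Phi \\
&\quad + \sum_j F\Big(\sum_{l=1}^{N-1}\sum_{k=0}^{l}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\big(-\overleftarrow{\partial}^{\dagger j}\Lambda_{\mu_1\ldots\mu_k j\mu_{k+1}\ldots\mu_l} + \Lambda_{\mu_1\ldots\mu_k j\mu_{k+1}\ldots\mu_l}\,\partial^j\big)\partial^{\mu_{k+1}}\ldots\partial^{\mu_l}\Big)\Phi \\
&= F\Big[\sum_j(-\overleftarrow{\partial}^{\dagger j}+\partial^j)\circ\Gamma_j(\partial,\overleftarrow{\partial}^{\dagger})\Big]\Phi \\
&= F\big[\Lambda(\partial)-\Lambda(\overleftarrow{\partial}^{\dagger})\big]\Phi = 0.
\end{aligned}\tag{62}$$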

Thus the conservation law for an arbitrary linear equation with constant coefficients is valid provided functions F and 8 are solutions of corresponding equations.


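Corollary 3.4. If the function $\Phi$ is an arbitrary solution of Eq. (27) with coefficients fulfilling (30,55):
$$\Lambda(\partial)\Phi = 0,$$
and the function F solves its conjugation (22):
$$F\,\Lambda(\overleftarrow{\partial}^{\dagger}) = 0,$$
then the current of the form:
$$J_\mu = F\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi$$
with $\hat\Gamma$ given by (57) is conserved:
$$\sum_\mu\partial^\mu J_\mu = 0.$$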

Proof. This follows from the proof of Proposition 3.3 and from Corollary 3.2, which describes the operator fulfilling (35) for equations obeying the weakened conditions (30, 55).

Let us observe that in the special case of the Klein–Gordon equation on quantum Minkowski space with Z = 0 we were able to connect the solution F of the conjugated equation with the ∗-transformation of $\Phi$ [15]. This possibility was due to the fact that the Klein–Gordon operator is ∗-invariant [11]. We shall now check the conditions of ∗-invariance for the operator of the equation of the form (27, 28). Taking into account $(i\partial)^* = i\partial$ from [11], we see that after the ∗-operation we obtain:
$$\Lambda(\partial)^* = \Lambda_0^* + \sum_{l=1}^{N}(\partial^{\mu_l})^*\ldots(\partial^{\mu_1})^*\,\Lambda^*_{\mu_1\ldots\mu_l} = \Lambda_0^* + \sum_{l=1}^{N}(-1)^l\,\partial^{\mu_l}\ldots\partial^{\mu_1}\,\Lambda^*_{\mu_1\ldots\mu_l} = \Lambda(\partial). \tag{63}$$
Comparing the coefficients of $\Lambda(\partial)^*$ with the coefficients of the initial operator $\Lambda(\partial)$, we conclude that the following proposition is valid.

Proposition 3.5. The operator of Eqs. (27,28) is ∗-invariant:
$$\Lambda(\partial)^* = \Lambda(\partial) \tag{64}$$
iff the coefficients fulfill the conditions:
$$\Lambda_0^* = \Lambda_0, \qquad \Lambda^*_{\mu_1\ldots\mu_l} = \Lambda_{\mu_l\ldots\mu_1}(-1)^l, \qquad l = 1, \ldots, N. \tag{65}$$

For equations fulfilling the ∗-invariance condition we can express the solution of the conjugated equation (36) in terms of solutions of the initial equation (27, 28). Therefore the conserved currents for such equations can be constructed using only the latter solution.


Proposition 3.6. For the linear equation with constant coefficients fulfilling the conditions (30,31,65) the current of the form:
$$J_\mu = \Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{66}$$
where $\Phi$ is an arbitrary solution of (27, 28), is conserved:
$$\sum_\mu\partial^\mu J_\mu = 0. \tag{67}$$

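Proof. Proposition 3.3 implies that we need only to show:
$$\Phi^*\,\Lambda(\overleftarrow{\partial}^{\dagger}) = 0, \tag{68}$$
provided $\Phi$ is an arbitrary solution of $\Lambda(\partial)\Phi = 0$. Let us check it using the property (26) of the conjugated derivative:
$$\begin{aligned}
\Phi^*\,\Lambda(\overleftarrow{\partial}^{\dagger})
&= \sum_{l=0}^{N}\Phi^*\,\overleftarrow{\partial}^{\dagger\mu_l}\ldots\overleftarrow{\partial}^{\dagger\mu_1}\,\Lambda_{\mu_1\ldots\mu_l} \\
&= \sum_{l=0}^{N}\Phi^*\,(-1)^l\,\overleftarrow{*}\,\overleftarrow{\partial}^{\mu_l}\ldots\overleftarrow{\partial}^{\mu_1}\,\overleftarrow{*}\,\Lambda_{\mu_1\ldots\mu_l} \\
&= \sum_{l=0}^{N} *\,(-1)^l\,\Lambda_{\mu_l\ldots\mu_1}\,\Phi\,(-1)^l\,\overleftarrow{\partial}^{\mu_1}\ldots\overleftarrow{\partial}^{\mu_l}\,\overleftarrow{*} = 0.
\end{aligned}\tag{69}$$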

When two solutions of the equation of motion (27) are known, one can construct the current according to the following corollary, which is the result of Propositions 3.3 and 3.6:

Corollary 3.7. For the linear equation with coefficients fulfilling (30, 31, 65) the current of the form:
$$J_\mu = i(\Phi')^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi - i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi', \tag{70}$$
where $\Phi$ and $\Phi'$ are arbitrary solutions of (27), is conserved:
$$\sum_\mu\partial^\mu J_\mu = 0. \tag{71}$$


4. Applications

We have developed a simple method of derivation of conserved currents for linear equations on a class of quantum Minkowski spaces. This procedure can be applied to equations with coefficients constant in the sense of (30, 31) or fulfilling the weakened conditions (30, 55). Now we shall apply the presented technique to a few ∗-invariant equations on different quantum Minkowski spaces. Some of these equations were studied earlier in [11], where their solutions were also constructed. Following classical field theory we shall obtain different solutions of the equation of motion using the symmetry transformation operators. In the examples we show that they are quantum deformations of classical operators plus the transformation operator (6, 8). The algebraic and possible co-algebraic properties of the set of symmetry operators are still to be investigated; nevertheless we wish to point out that in the special case studied in [15] they form a closed algebra, and we hope to obtain their co-algebraic structure from the Leibnitz rules of symmetry operators determined by the Leibnitz rules for derivatives and variables (5, 9).

4.1. Klein–Gordon equation. The Klein–Gordon equation on quantum Minkowski space in the sense of (4) was introduced by Podleś in [11], where its solutions were also studied. It looks as follows:
$$(\Box + m^2)\Phi = 0 \tag{72}$$
with d'Alembert's operator built using exterior partial derivatives from the non-commutative differential calculus (2, 3, 4):
$$\Box = g_{ab}\,\partial^a\partial^b = \partial^a\partial^b g_{ab}. \tag{73}$$

The consistency conditions which allow us to write the coefficients of the equation in front of or after the differential operators coincide with the requirements studied in the previous section (30, 31). In our construction we shall consider currents connected with symmetry transformations of solutions of the Klein–Gordon equation. The special case Z = 0 was solved earlier in [15], where the algebraic properties of the symmetry transformation operators were also investigated. Let us now assume that R, Z and T are arbitrary matrices and tensors allowed in the calculus on quantum Minkowski space and check the commutator of d'Alembert's operator with the variable $x^k$:
$$[\Box, x^k] = 2\partial^k. \tag{74}$$
Taking into account property (2) rewritten as $(R-1)^{kl}{}_{ab}\,\partial^a\partial^b = 0$, we can easily construct the symmetry transformation operator analogous to the angular momentum operator from classical field theory:
$$M^{kl} = i(R-1)^{kl}{}_{ab}\,x^a\partial^b. \tag{75}$$
As these operators commute with the Klein–Gordon operator, they transform a solution of the Klein–Gordon equation into another solution. The set of symmetry operators can also be completed with the momentum operators [11]:
$$P^l = i\partial^l. \tag{76}$$


In addition, using the properties of the transformation operator ζ given in Proposition 3.1 of [11], one concludes that there is an additional symmetry operator for the Klein–Gordon equation:
$$[\Box, \zeta_b{}^{a}] = 0 \tag{77}$$
due to the property of the R-matrix: $g^{jb}R^{dc}{}_{ba} = R^{jd}{}_{ak}\,g^{kc}$. Now we should construct the $\hat\Gamma$ operator for our equation using the general formula (57). It has the form identical with the one obtained in [15]:
$$\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \overleftarrow{\partial}^{\dagger a}\,g_{aj}\,\overleftarrow{\zeta}{}^{-\,j}_{\mu} + \overleftarrow{\zeta}{}^{-\,j}_{\mu}\,g_{ja}\,\partial^a. \tag{78}$$
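As a check, (78) can be read off from the general formula (57). The following is a sketch under the assumption that the operator of (72) is written in the form (28) as $\Lambda(\partial) = m^2 + g_{ab}\partial^a\partial^b$, so that N = 2, $\Lambda_0 = m^2$, $\Lambda_j = 0$ and $\Lambda_{ab} = g_{ab}$; only the l = 1 terms of the double sum then survive:

```latex
\begin{aligned}
\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})
 &= \overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_j
  + \sum_{k=0}^{1}\overleftarrow{\partial}^{\dagger\mu_1}\ldots\overleftarrow{\partial}^{\dagger\mu_k}\,
    \overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\Lambda_{\mu_1\ldots\mu_k j\mu_{k+1}\ldots\mu_1}\,
    \partial^{\mu_{k+1}}\ldots\partial^{\mu_1} \\
 &= 0
  + \underbrace{\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,g_{ja}\,\partial^{a}}_{k=0}
  + \underbrace{\overleftarrow{\partial}^{\dagger a}\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,g_{aj}}_{k=1}.
\end{aligned}
```

The k = 1 term agrees with the first term of (78) once the coefficient $g_{aj}$, which is constant in the sense of (30,31), is written between the two left-acting operators.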

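The results (70,71) presented earlier imply that the currents:
$$J_\mu^{kl} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,M^{kl}\Phi - i(M^{kl}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{79}$$
$$J_\mu^{l} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,P^{l}\Phi - i(P^{l}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{80}$$
$$J_{\mu}{}^{a}{}_{b} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\zeta_b{}^{a}\Phi - i(\zeta_b{}^{a}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi \tag{81}$$
are conserved quantities by virtue of Corollary 3.7:
$$\sum_\mu\partial^\mu J_\mu^{kl} = 0, \qquad \sum_\mu\partial^\mu J_\mu^{k} = 0, \qquad \sum_\mu\partial^\mu J_{\mu}{}^{a}{}_{b} = 0. \tag{82}$$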

4.2. Dirac equation on quantum Minkowski space with R = τ. Let us recall the Dirac equation from [11]:
$$(i\gamma_0\gamma_\mu\partial^\mu + m\gamma_0)\Psi = 0. \tag{83}$$
In our case, when R = τ, the γ matrices fulfill the following condition:
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}. \tag{84}$$
Similarly to the Klein–Gordon operator, the coefficients $\gamma_0\gamma_\mu$ (which are now matrices) obey the consistency conditions (30,31). For the Dirac equation we know the symmetry operators analogous to momentum from [11]:
$$P^l = i\partial^l. \tag{85}$$
In order to construct the angular momentum operator we check the commutator of the Dirac operator with the scalar angular momentum (75):
$$[i\gamma_\mu\partial^\mu, M^{kl}] = i\gamma_\rho g^{\rho l}\partial^k - i\gamma_\rho g^{\rho k}\partial^l + i\gamma_\rho Z_{\nu}{}^{l\rho}\,\partial^\nu\partial^k - i\gamma_\rho Z_{\nu}{}^{k\rho}\,\partial^\nu\partial^l. \tag{86}$$
The additional term appearing on the right-hand side implies that the spinorial part of the angular momentum must be extended:
$$M^{kl}_{\rm spin} = ix^k\partial^l - ix^l\partial^k + \tfrac{1}{2}[\gamma^k,\gamma^l] + \tfrac{1}{2}\,(Z_{\rho}{}^{kj}\,\partial^l - Z_{\rho}{}^{lj}\,\partial^k)\,\gamma_j\gamma^\rho. \tag{87}$$


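Let us notice that the above operator does not commute with the operator of Eq. (83). Nevertheless it is its symmetry operator, as it transforms a solution of (83) into a solution. Let $\Psi$ be a solution of (83); then we have:
$$(i\gamma_0\gamma_\mu\partial^\mu + m\gamma_0)\,M^{kl}_{\rm spin}\Psi = \gamma_0\,M^{kl}_{\rm spin}\,(i\gamma_\mu\partial^\mu + m)\Psi = \gamma_0\,M^{kl}_{\rm spin}\,\gamma_0\,(g_{00})^{-1}\,\gamma_0\,(i\gamma_\mu\partial^\mu + m)\Psi = 0.$$
This symmetry operator contains the scalar part and the spin part built using Dirac matrices, of identical form as in the classical commutative models, and is extended with a part depending on the Z matrices due to the commutation rule (3). Following the general method we construct the $\hat\Gamma$ operator for (83) using (57):
$$\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = i\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\gamma_0\gamma_j. \tag{88}$$
Thus the currents built using the symmetry operators (85,87) according to Corollary 3.7:
$$J_\mu^{kl} = i\Psi^*\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\gamma_0\gamma_j\,M^{kl}_{\rm spin}\Psi - i(M^{kl}_{\rm spin}\Psi)^*\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\gamma_0\gamma_j\,\Psi, \tag{89}$$
$$J_\mu^{l} = i\Psi^*\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\gamma_0\gamma_j\,P^{l}\Psi - i(P^{l}\Psi)^*\,\overleftarrow{\zeta}{}^{-\,j}_{\mu}\,\gamma_0\gamma_j\,\Psi, \tag{90}$$
are conserved:
$$\sum_\mu\partial^\mu J_\mu^{kl} = 0, \qquad \sum_\mu\partial^\mu J_\mu^{k} = 0. \tag{91}$$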

Let us notice that we have not included the symmetry operator connected with the transformation operator, because in the case R = τ it can be expressed by the momenta (85) [26].

4.3. Wave equation on quantum Minkowski spaces with Z = 0. Let us now assume that the mass in the Klein–Gordon equation (72) is equal to zero. The result is the wave equation of the form:
$$\Box\Phi = 0. \tag{92}$$
We shall study the symmetry operators for the class of Minkowski spaces (4) with Z = 0. Similarly to classical field theory, the set of symmetry operators of the Klein–Gordon equation can be extended by additional operators [27]. It is easy to check the following commutation relations:
$$[\Box, \tfrac{1}{2}\,g_{ab}(R^{ab}{}_{kl} + \delta^{ab}{}_{kl})\,x^k\partial^l] = 2\Box, \tag{93}$$
$$[\Box, \tfrac{1}{2}\,g_{ab}(R^{ab}{}_{kl} + \delta^{ab}{}_{kl})\,x^k x^l] = 2g_{ab}g^{ab} + 2g_{ab}(R^{ab}{}_{kl} + \delta^{ab}{}_{kl})\,x^k\partial^l. \tag{94}$$
Let us denote:
$$D := \tfrac{i}{2}\,g_{ab}(R^{ab}{}_{kl} + \delta^{ab}{}_{kl})\,x^k\partial^l, \tag{95}$$
$$\hat{x}^2 := \tfrac{1}{2}\,g_{ab}(R^{ab}{}_{kl} + \delta^{ab}{}_{kl})\,x^k x^l. \tag{96}$$
These operators allow us to construct additional symmetry operators, namely:
– the dilatation operator D given by (95),
– the conformal boosts $K^m$:
$$K^m = i\hat{x}^2\partial^m - 2Dx^m. \tag{97}$$

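Acting on an arbitrary solution of the wave equation (92) they produce another solution of this equation:
$$\Box D\Phi = (D + 2i)\,\Box\Phi = 0, \qquad \Box K^m\Phi = (K^m - 2ix^m)\,\Box\Phi = 0.$$
The $\hat\Gamma_\mu$ operator for the wave equation is given by (78):
$$\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger}) = \overleftarrow{\partial}^{\dagger a}\,g_{aj}\,\overleftarrow{\zeta}{}^{-\,j}_{\mu} + \overleftarrow{\zeta}{}^{-\,j}_{\mu}\,g_{ja}\,\partial^a.$$
Using this operator and the symmetry operators from the set:
$$P^l = i\partial^l, \tag{98}$$
$$M^{kl} = i(R-1)^{kl}{}_{ab}\,x^a\partial^b, \tag{99}$$
$$\zeta_b{}^{a}, \tag{100}$$
$$D = \tfrac{i}{2}\,g_{ab}(R+1)^{ab}{}_{kl}\,x^k\partial^l, \tag{101}$$
$$K^m = i\hat{x}^2\partial^m - 2Dx^m, \tag{102}$$
we can construct the full set of conserved currents for the wave equation:
$$J_\mu^{kl} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,M^{kl}\Phi - i(M^{kl}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{103}$$
$$J_\mu^{l} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,P^{l}\Phi - i(P^{l}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{104}$$
$$J_{\mu}{}^{a}{}_{b} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\zeta_b{}^{a}\Phi - i(\zeta_b{}^{a}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{105}$$
$$J_\mu^{D} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,D\Phi - i(D\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{106}$$
$$\tilde{J}_\mu^{l} = i\Phi^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,K^{l}\Phi - i(K^{l}\Phi)^*\,\hat\Gamma_\mu(\partial,\overleftarrow{\partial}^{\dagger})\,\Phi, \tag{107}$$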

which are conserved according to Corollary 3.7. Similarly to the Klein–Gordon and Dirac equations, in the case R = τ the transformation operator ζ is a symmetry operator described by momenta, so it can then be excluded from the set of independent symmetry operators.
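The commutation relation (93), and the resulting statement that the dilatation maps solutions to solutions, can be sanity-checked in the classical commutative limit. The Python sketch below assumes that in this limit the operator $\tfrac{1}{2}g_{ab}(R+\delta)^{ab}{}_{kl}x^k\partial^l$ reduces to the ordinary Euler operator $t\partial_t + x\partial_x$ and that $\Box = \partial_t^2 - \partial_x^2$ in 1+1 dimensions; both reductions, and all function names, are assumptions made for illustration and are not taken from the text.

```python
# Polynomials in two commuting variables (t, x): {(deg_t, deg_x): coeff}.

def clean(poly):
    """Drop monomials with zero coefficient."""
    return {m: c for m, c in poly.items() if c != 0}

def add(f, g, sign=1):
    """Return f + sign * g."""
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) + sign * c
    return clean(out)

def deriv(f, var):
    """Partial derivative with respect to t (var=0) or x (var=1)."""
    out = {}
    for m, c in f.items():
        if m[var] == 0:
            continue
        lowered = list(m)
        lowered[var] -= 1
        key = tuple(lowered)
        out[key] = out.get(key, 0) + c * m[var]
    return clean(out)

def times_var(f, var):
    """Multiply by t (var=0) or x (var=1)."""
    out = {}
    for m, c in f.items():
        raised = list(m)
        raised[var] += 1
        out[tuple(raised)] = c
    return out

def box(f):
    """Classical d'Alembertian in 1+1 dimensions: d_t^2 - d_x^2."""
    return add(deriv(deriv(f, 0), 0), deriv(deriv(f, 1), 1), sign=-1)

def euler(f):
    """Classical Euler operator t d_t + x d_x (assumed limit of D / i)."""
    return add(times_var(deriv(f, 0), 0), times_var(deriv(f, 1), 1))

def commutator_box_euler(f):
    """[box, euler] applied to f."""
    return add(box(euler(f)), euler(box(f)), sign=-1)
```

On any polynomial `f` the commutator equals `2 * box(f)`, the classical counterpart of (93); in particular, if `box(f)` vanishes then so does `box(euler(f))`, which is the classical version of the relation $\Box D\Phi = (D + 2i)\Box\Phi$.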

5. Final Remarks

The presented extension of the Takahashi–Umezawa procedure gives explicit formulae for the construction of conserved currents for linear equations of motion on quantum Minkowski spaces. There is an interesting technical analogy between non-commutative differential calculus and discrete calculus on commutative spaces: the appearance of the transformation operator in the Leibnitz rules, which was the main obstacle in our construction. For all presented equations this operator becomes an additional symmetry operator. This fact is trivial for all equations fulfilling (30) with R = τ due to Proposition 3.1 of [11]. In this case, however, it can be shown that the transformation operator can be expressed via the momentum operator [26].


The general case must be studied further, as two questions arise:
– whether the transformation operator is also a symmetry operator for a certain class of the equations considered above,
– whether the transformation operator can be expressed by deformations of classical symmetry operators, as in the case R = τ.
Let us notice that in the algebraic structure of the example studied in [15] the transformation operator is not necessary to close the algebra; however, without this operator one cannot close the co-algebra.

We wish to point out some other open questions. In classical theory (as well as in discrete models [2–4]) the consequence of conservation laws for equations is the existence of conserved quantities. They were constructed using integrals on continuous and discrete space-time obeying a Stokes-type theorem. Once the integral calculus on quantum space-time (4) compatible with the differential calculus (2, 3) is developed, we shall be able to derive conserved quantities for arbitrary linear models. Our aim is also the extension of the presented method to other types of non-commutative differential calculi; it should be interesting for the braided differential calculus studied by Majid in [13, 14, 23]. The promising feature is the existence of the integral calculus, which could be applied in the further construction of integrals of motion [14, 28, 29]. The other interesting problem is the systematic study of symmetry operators, and of their algebraic and co-algebraic structure, for equations on quantum Minkowski spaces. We hope to come back to these questions in a subsequent paper.

Acknowledgement. The author is thankful to Professor J. Lukierski and Dr. P. Podleś for valuable discussions.

References

1. Takahashi, Y.: An Introduction to Field Quantization. Oxford: Pergamon Press, 1969
2. Klimek, M.: Extension of q-deformed analysis and q-deformed models of classical mechanics. J. Phys. A: Math. & Gen. 26, 955–967 (1993)
3. Klimek, M.: The conservation laws for deformed classical models. Czechoslovak J. Phys. 44, 1049–1057 (1994)
4. Klimek, M.: The conservation laws and integrals of motion for a certain class of equations in discrete models. J. Phys. A: Math. & Gen. 29, 1747–1758 (1996)
5. Lukierski, J., Nowicki, A., Ruegg, H.: New quantum Poincaré algebra and κ-deformed field theory. Phys. Lett. B 293, 344–352 (1992)
6. Lukierski, J., Ruegg, H., Rühl, W.: From κ-Poincaré algebra to κ-Lorentz quasigroup. A deformation of relativistic symmetry. Phys. Lett. B 313, 357–366 (1993)
7. Nowicki, A., Sorace, E., Tarlini, M.: The quantum deformed Dirac equation from the κ-Poincaré algebra. Phys. Lett. B 302, 419–422 (1993)
8. Podleś, P., Woronowicz, S. L.: On the structure of inhomogeneous quantum groups. Preprint PAM-631 UC Berkeley, hep-th 9412058, to appear in Commun. Math. Phys.
9. Podleś, P., Woronowicz, S. L.: On the classification of quantum Poincaré groups. Commun. Math. Phys. 178, 61–82 (1996)
10. Podleś, P., Woronowicz, S. L.: Quantum deformation of Lorentz group. Commun. Math. Phys. 130, 381–431 (1990)
11. Podleś, P.: Solutions of Klein–Gordon and Dirac equations on quantum Minkowski spaces. Commun. Math. Phys. 181, 569–585 (1996)
12. Ogievetsky, O., Schmidke, W.B., Wess, J., Zumino, B.: q-Deformed Poincaré algebra. Commun. Math. Phys. 150, 495–518 (1992)
13. Majid, S.: Free braided differential calculus, braided binomial theorem and the braided exponential map. J. Math. Phys. 34, 4843–4856 (1993)
14. Majid, S.: Introduction to braided geometry and q-Minkowski space. In: "Quantum groups and their applications", Proceedings of the International School of Physics "Enrico Fermi", Varenna 1994, Editors L. Castellani and J. Wess, pp. 267–345. Amsterdam: IOS Press, 1996
15. Klimek, M.: The symmetry algebra and conserved currents for Klein–Gordon equation on quantum Minkowski space. In: "Quantum Groups and Quantum Spaces", Banach Center Publications 40 (1997), Editors R. Budzyński, W. Pusz and S. Zakrzewski. Warszawa: Inst. of Math., Polish Acad. Sci., 1997
16. Meyer, U.: Wave equations on q-Minkowski space. Commun. Math. Phys. 174, 457–475 (1996)
17. Pillin, M.: q-Deformed relativistic wave equations. J. Math. Phys. 35, 2804–2817 (1994)
18. Pillin, M., Schmidke, W.B., Wess, J.: q-Deformed relativistic one-particle states. Preprint MPI-Ph/92-75, August 1992
19. Carow-Watamura, U., Schlieker, M., Watamura, S.: SOq(N) covariant differential calculus on quantum space and quantum deformation of Schrödinger equation. Z. Phys. C 49, 439–446 (1991)
20. Hebecker, A., Weich, W.: Free particle in q-deformed configuration space. Lett. Math. Phys. 26, 245–258 (1992)
21. Schirrmacher, A.: Quantum groups, quantum spacetime and Dirac equation. In: "Low-Dimensional Topology and Quantum Field Theory", Editor H. Osborn. New York: Plenum Press, 1993
22. Azcárraga, J. A., Kulish, P.P., Rodenas, F.: On the physical contents of q-deformed Minkowski spaces. Phys. Lett. B 351, 123–130 (1995)
23. Majid, S.: Braided momentum in the q-Poincaré group. J. Math. Phys. 34, 2045–2058 (1993)
24. Doplicher, S., Fredenhagen, K., Roberts, J.E.: The quantum structure of spacetime at the Planck scale and quantum fields. Commun. Math. Phys. 172, 187–220 (1995)
25. Doplicher, S., Fredenhagen, K., Roberts, J.E.: Spacetime quantization induced by classical gravity. Phys. Lett. B 331, 39–44 (1994)
26. Klimek, M.: In preparation
27. Mack, G., Salam, Abdus: Component field representations of the conformal group. Ann. Phys. 53, 174–202 (1969)
28. Chryssomalakos, Ch.: Remarks on quantum integration. q-alg/9601014, Commun. Math. Phys. 184, 1–25 (1997)
29. Chryssomalakos, Ch.: Applications of quantum groups. Ph. D. Thesis, UC Berkeley, 1994

Communicated by H. Araki

Commun. Math. Phys. 192, 47–65 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Three-Manifold Invariants and Their Relation with the Fundamental Group

E. Guadagnini^{1,2}, L. Pilo^{3}
1 Dipartimento di Fisica dell'Università di Pisa, Piazza Torricelli 2, 56100 Pisa, Italy. E-mail: [email protected]
2 Istituto Nazionale di Fisica Nucleare, Pisa, Italy
3 Scuola Normale Superiore, Piazza dei Cavalieri 7, 56100 Pisa, Italy. E-mail: [email protected]

Received: 15 November 1996 / Accepted: 17 June 1997

Abstract: We consider the 3-manifold invariant I(M ) which is defined by means of the Chern–Simons quantum field theory and which coincides with the Reshetikhin–Turaev invariant. We present some arguments and numerical results supporting the conjecture that for nonvanishing I(M ), the absolute value |I(M )| only depends on the fundamental group π1 (M ) of the manifold M . For lens spaces, the conjecture is proved when the gauge group is SU (2). In the case in which the gauge group is SU (3), we present numerical computations confirming the conjecture.

1. Introduction

Recently, new 3-manifold invariants [1, 2] have been discovered; the algebraic aspects of their construction, which is based on the structure of simple Lie groups, are well understood [3–9]. However, the topological meaning of these invariants is still unclear. Let us denote by I(M) the invariant of the 3-manifold M which is closed, connected and orientable; I(M) is the invariant defined by means of the Chern–Simons quantum field theory [1, 9] and coincides with the Reshetikhin–Turaev invariant [2, 3]. In general, it is not known how I(M) is related, for instance, to the homotopy class of M or to the fundamental group of M. In this article we shall formulate the following:

Conjecture. For nonvanishing I(M), the absolute value |I(M)| only depends on the fundamental group $\pi_1(M)$.

In the absence of a general proof, we shall verify the validity of the conjecture for a particular class of manifolds: the lens spaces. There are examples of lens spaces $M_1$ and $M_2$ with the same fundamental group $\pi_1(M_1) \cong \pi_1(M_2)$ which are not homeomorphic; for all these manifolds, we shall prove that (for nonvanishing invariants) $|I(M_1)| = |I(M_2)|$ when I(M) is the invariant associated with the group SU(2). In the case in which the gauge group is SU(3), we will present numerical computations confirming


the conjecture. Our results are in agreement with the computer calculations for SU(2) of Freed and Gompf [10] and the expression of the SU(2) invariant obtained by Jeffrey [11]. Differently from [10] and [11], our approach is based exclusively on the properties of 3-dimensional Chern–Simons quantum field theory. We shall use general surgery rules to compute I(M) and, in our construction, invariance under Kirby moves is manifestly satisfied. Our notations and conventions are described in Sect. 2, where the expression of the invariant I(M) for a generic lens space is derived. The validity of our conjecture for the gauge group SU(2) is proved in Sect. 3. The numerical computations for the group SU(3) are reported in Sect. 4 and the conclusions are contained in Sect. 5.

2. Surgery Rules

The basic ingredient in the construction of the 3-manifold invariant I(M) is a polynomial invariant E(L) for oriented, framed and coloured links $\{L\} \subset S^3$. In the Chern–Simons field theory, this link invariant is defined by the expectation values of the Wilson line operators [1]; each link component is framed and its colour is given by an irreducible representation of a simple compact Lie group which is called the gauge group. When the gauge group is SU(N), the invariant E(L) is a finite Laurent polynomial in the variable $q^{1/2N}$, where $q = \exp(-i2\pi/k)$ is the deformation parameter and k is the renormalized coupling constant of the Chern–Simons field theory [12, 13]. In general, the colour which characterizes one link component is an element of the algebra $\mathcal{T}$ which coincides with the complex extension of the representation ring of the gauge group. The sum operation in this algebra extends by linearity to E(L), whereas the product operation in the colour algebra $\mathcal{T}$ simply corresponds to the satellites obtained from the companion links by standard cabling [6, 13]. For integer values of the Chern–Simons coupling constant k (k = 1, 2, 3, ...), the set of vanishing link invariants defines an ideal $I_k$ of $\mathcal{T}$. Thus, for fixed integer k, the colour states belong to the algebra [13] of the equivalence classes
$$\mathcal{T}_{(k)} = \mathcal{T} / I_k. \tag{1}$$

Usually, $T_{(k)}$ is of finite order [14] and, for appropriate values of $k$, $T_{(k)}$ is isomorphic with the Verlinde algebra [15], which is determined by the fusion rules of certain two-dimensional conformal models [1]. We shall now concentrate on $T_{(k)}$ when the gauge group $G$ is SU(2) [13] or SU(3) [16]. For $G$ = SU(2) and $k = 1$, $T_{(1)}$ is isomorphic with the group algebra of $Z_2$, which is the center of SU(2). For $G$ = SU(2) and $k \ge 2$, the ideal $I_k$ is generated by the representation with $J = (k-1)/2$ and $T_{(k)}$ is of order $(k-1)$. For $G$ = SU(3) and $k = 1, 2$, the algebra (1) is isomorphic with the group algebra of $Z_3$, which is the center of SU(3). For $G$ = SU(3) and $k \ge 3$, the ideal $I_k$ is generated by the two irreducible representations with Dynkin labels $(k-1, 0)$ and $(k-2, 0)$; in this case, $T_{(k)}$ is of order $(k-1)(k-2)/2$. We shall denote by $\{\psi[i]\}$ (with $i = 1, 2, \ldots, \dim(T_{(k)})$) the elements of a basis in $T_{(k)}$. When $G$ = SU(2) and $k \ge 2$ or when $G$ = SU(3) and $k \ge 3$, each $\psi[i]$ represents the equivalence class of an irreducible representation of the gauge group. For low values of $k$, $\psi[i]$ corresponds to an irreducible representation of the gauge group up to a nontrivial multiplicative factor [13, 16]. The unit in $T_{(k)}$ will be denoted by $\psi[1]$; $\psi[1]$ is the class defined by the trivial representation. Let us now consider the definition of the 3-manifold invariant $I(M)$. Each 3-manifold $M$, which is closed, connected and orientable, admits a surgery presentation [17] given

Three-Manifold Invariants and Their Relation with Fundamental Group

by Dehn surgery on $S^3$. Each “honest” [17] surgery instruction can be represented by a framed link $L \subset S^3$ with components $\{L_b\}$, where $b = 1, 2, \ldots$. The surgery link $L$ is not oriented and an integer surgery coefficient $r_b$ is attached to the component $L_b$. The framing $L_{bf}$ of $L_b$ is specified by the linking number

$$ \ell k(L_b, L_{bf}) = r_b . \qquad (2) $$

The surgery link associated to the manifold $M$ is not unique. Indeed, if the surgery links $L$ and $L'$ are related by a finite sequence of Kirby moves, the corresponding manifolds are homeomorphic [18]. Therefore, each 3-manifold $M$ is characterized by a class of “equivalent” surgery links in $S^3$, where “equivalent” links means links related by Kirby moves. Let $L \subset S^3$ be a surgery link for the manifold $M$. The invariant $I(M)$ is defined in terms of the expectation value $E(L)$ of the Wilson line operators associated with the surgery link $L$. More precisely, one introduces an (arbitrary) orientation and a particular colour state $\Psi_0$ for each component of $L$. For fixed integer $k$, the surgery colour state $\Psi_0 \in T_{(k)}$ is [2]

$$ \Psi_0 = a_k \sum_i E_0[i]\, \psi[i] , \qquad (3) $$

where the sum is performed over all the elements $\{\psi[i]\}$ of the basis of $T_{(k)}$. The coefficients $\{E_0[i]\}$ coincide with the expectation values of the unknot with preferred framing and colour $\psi[i]$. When the gauge group $G$ is SU(2), $a_k$ is given by [13]

$$ a_k = \begin{cases} 1/\sqrt{2} & k = 1 \\ \sqrt{2/k}\, \sin(\pi/k) & k \ge 2 ; \end{cases} \qquad (4) $$

whereas, when $G$ = SU(3), one has [16]

$$ a_k = \begin{cases} 1/\sqrt{3} & k = 1, 2 \\ 16 \cos(\pi/k) \sin^3(\pi/k) / (k\sqrt{3}) & k \ge 3 . \end{cases} \qquad (5) $$

We shall denote by $\sigma(L)$ the signature of the linking matrix associated with $L$; $\sigma(L)$ does not depend on the choice of the orientation of $L$. Let us define the function $I(L)$ by means of the relation [2]

$$ I(L) = \exp\left[ i\theta_k \sigma(L) \right] E(L) , \qquad (6) $$

where, for $G$ = SU(2), the phase factor $e^{i\theta_k}$ is [2, 13]

$$ e^{i\theta_k} = \begin{cases} \exp(-i\pi/4) & k = 1 \\ \exp\left[ i\pi 3(k-2)/(4k) \right] & k \ge 2 ; \end{cases} \qquad (7) $$

and, for $G$ = SU(3), the phase factor is [16]

$$ e^{i\theta_k} = \begin{cases} \exp(i\pi/2) & k = 1 \\ \exp(-i\pi/2) & k = 2 \\ \exp(-i6\pi/k) & k \ge 3 . \end{cases} \qquad (8) $$

It can be verified [2, 3, 13, 16] that I(L) is invariant under Kirby moves and then it represents a topological invariant for the 3-manifold M . In what follows, we shall denote this invariant by I(M ). It should be noted that the multiplicative phase factor

in (6) is not a matter of convention (or choice of framing); the presence of the term $\exp[i\theta_k \sigma(L)]$ in (6) guarantees the invariance of $I(L)$ under Kirby moves. According to the definition (6), the normalization of the 3-manifold invariant $I(M)$ is fixed by $I(S^3) = 1$. In order to compare $I(M)$ with the expressions obtained in [10, 11] for lens spaces, we need to produce the relation between the link invariants and the representation matrices of the mapping class group of the torus. Let us consider the Hopf link in $S^3$ in which the linking number of the two link components $C_1$ and $C_2$ is equal to unity. When the two link components have preferred framings and colours $\psi[i]$ and $\psi[j]$, the associated Chern–Simons expectation value is denoted by

$$ H_{ij} = E(C_1, \psi[i]; C_2, \psi[j]) . \qquad (9) $$

The complex numbers $\{H_{ij}\}$, where $i, j = 1, 2, \cdots, \dim(T_{(k)})$, can be understood as the matrix elements of the so-called Hopf matrix $H$. Note that $H$ is symmetric and that $E_0[i] = H_{1i} = H_{i1}$. Let $Q(i)$ be the value of the quadratic Casimir operator of the irreducible representation of the gauge group which is associated with an element of the class $\psi[i]$. One can show [14] that the matrices

$$ X_{ij} = a_k H_{ij} ; \qquad Y_{ij} = q^{Q(i)} \delta_{ij} ; \qquad C_{ij} = \delta_{i j^*} \qquad (10) $$

give a projective representation of the modular group

$$ X^2 = C ; \qquad (11) $$

$$ (XY)^3 = e^{-i\theta_k} C . \qquad (12) $$

This representation is isomorphic with the representation obtained in two-dimensional conformal field theories [1]; $X$ corresponds to the $S$ matrix of the conformal models and $Y$ is the analogue of the $T$ matrix.

Lens spaces, which are characterized by two integers $p$ and $r$, will be denoted by $\{L_{p/r}\}$. The fundamental group of $L_{p/r}$ is $Z_p$. Two lens spaces $L_{p/r}$ and $L_{p'/r'}$ are homeomorphic if and only if $|p| = |p'|$ and $r = \pm r' \pmod{p}$ or $r r' = \pm 1 \pmod{p}$. Thus, we only need [17] to consider the case in which $p > 1$ and $0 < r < p$; moreover, $r$ and $p$ are relatively prime. The lens spaces $L_{p/r}$ and $L_{p'/r'}$ are homotopic if and only if $|p| = |p'|$ and $r r' = \pm$ quadratic residue $\pmod{p}$. Consequently, one can find examples of lens spaces which are homotopic but are not homeomorphic; for instance, $L_{13/2}$ and $L_{13/5}$. One can also find examples of lens spaces which are not homeomorphic and are not homotopic but have the same fundamental group; for instance, $L_{13/2}$ and $L_{13/3}$.

One possible surgery instruction corresponding to the lens space $L_{p/r}$ is given by the unknot [17] with rational surgery coefficient $(p/r)$. From this surgery presentation one can derive [17] a “honest” surgery presentation of $L_{p/r}$ by using a continued fraction decomposition of the ratio $(p/r)$

$$ \frac{p}{r} = z_d - \cfrac{1}{z_{d-1} - \cfrac{1}{\ddots - \cfrac{1}{z_1}}} , \qquad (13) $$

where {z1 , z2 , · · · , zd } are integers. The new surgery link L corresponding to a “honest” surgery presentation of Lp/r is a chain with d linked components [17] and the integers {z1 , z2 , · · · , zd } are precisely the surgery coefficients. Let us now consider the invariant

$I(L_{p/r})$ given in Eq. (6); by using the properties of the link polynomial $E(L)$ for the connected sums of links [1, 13], expression (6) can be written as

$$ I(L_{p/r}) = e^{i\theta_k \sigma(L)} (a_k)^{-1} \left[ F(p/r) \right]_{11} , \qquad (14) $$

where $[F(p/r)]_{11}$ is the element corresponding to the first row and the first column of the following matrix

$$ F(p/r) = X Y^{z_d} X Y^{z_{d-1}} X \cdots X Y^{z_1} X . \qquad (15) $$
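As an illustration of Eqs. (13)–(15), the following Python sketch evaluates the modulus $|I_k(L_{p/r})| = |[F(p/r)]_{11}|/a_k$ numerically for $G$ = SU(2) and $k \ge 2$. It is only a sketch under stated assumptions: it uses the explicit SU(2) matrices of Eq. (16) in the equivalent real form $X_{mn} = \sqrt{2/k}\,\sin(\pi mn/k)$ and $Y_{mn} = \exp[i\pi(1-m^2)/(2k)]\,\delta_{mn}$, it takes the continued fraction with ceiling steps (one admissible choice of the integers $z_t$), and it drops the phase $e^{i\theta_k\sigma(L)}$ because only the modulus is computed. The function names are ours and are not part of the original construction.

```python
import cmath
import math


def continued_fraction(p, r):
    """Return [z_d, ..., z_1] with p/r = z_d - 1/(z_{d-1} - ... - 1/z_1), cf. Eq. (13)."""
    zs = []
    while r != 0:
        z = -((-p) // r)       # ceiling of p/r
        zs.append(z)
        p, r = r, z * r - p    # p/r = z - 1/(r / (z*r - p))
    return zs


def abs_lens_invariant(p, r, k):
    """|I_k(L_{p/r})| = |F(p/r)_{11}| / a_k for G = SU(2) and k >= 2, cf. Eqs. (14)-(15)."""
    n = k - 1                  # order of T_(k) for SU(2)
    idx = range(1, n + 1)
    X = [[math.sqrt(2.0 / k) * math.sin(math.pi * m * j / k) for j in idx] for m in idx]
    Y = [cmath.exp(1j * math.pi * (1 - m * m) / (2 * k)) for m in idx]  # diagonal of Y
    a_k = math.sqrt(2.0 / k) * math.sin(math.pi / k)
    row = list(X[0])           # first row of the leftmost factor X
    for z in continued_fraction(p, r):   # z_d first, z_1 last
        w = [row[m] * Y[m] ** z for m in range(n)]                       # times Y^z
        row = [sum(w[m] * X[m][j] for m in range(n)) for j in range(n)]  # times X
    return abs(row[0]) / a_k


if __name__ == "__main__":
    # L_{13/2} and L_{13/3} share the fundamental group Z_13 but are not homotopic.
    print(abs_lens_invariant(13, 2, 5), abs_lens_invariant(13, 3, 5))
```

Only the first row of the matrix product is propagated, so the cost per step is one diagonal scaling and one vector-matrix product.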

The invariant $I(L_{p/r})$ given in Eq. (14) is in agreement with the expressions obtained in [10, 11] apart from an overall normalization factor.

3. The SU(2) Case

In this section, we shall compute $I(L_{p/r})$ for the gauge group $G$ = SU(2). Then, we will show that in this case our conjecture is true; i.e. when $I(L_{p/r}) \neq 0$, the absolute value $|I(L_{p/r})|$ only depends on $p$. For $k \ge 2$, the standard basis of $T_{(k)}$ is $\{\psi[j]\}$; the index $j$ represents the dimension of the irreducible representation described by $\psi[j]$ and $1 \le j \le (k-1)$. The matrix elements of $X$ and $Y$ are

$$ (X)_{mn} = \frac{i}{\sqrt{2k}} \left[ \exp\left( -\frac{i\pi mn}{k} \right) - \exp\left( \frac{i\pi mn}{k} \right) \right] ; \qquad (Y)_{mn} = \xi \exp\left( -\frac{i\pi m^2}{2k} \right) \delta_{mn} ; \qquad (16) $$

with $\xi = \exp(i\pi/2k)$. When $k = 1$, one has

$$ X = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} , \qquad (17) $$

$$ Y = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} . \qquad (18) $$

The algebra $T_{(1)}$ is isomorphic with $T_{(3)}$ and it is easy to verify that

$$ I_{k=1}(L_{p/r}) = \left[ I_{k=3}(L_{p/r}) \right]^* . \qquad (19) $$

Therefore, we only need to consider the case $k \ge 2$. In order to compute $I(L_{p/r})$, we shall derive a recursive relation for the matrix (15); the argument that we shall use has been produced by Jeffrey [11] in a slightly different context. In fact, our final result for $I(L_{p/r})$ is essentially in agreement with the formulae obtained by Jeffrey. Since in our approach the invariance under Kirby moves is satisfied, our derivation of $I(L_{p/r})$ proves that the appropriate expressions given in [10, 11] really correspond to the values of a topological invariant of 3-manifolds. Let us introduce a few definitions; with the ordered set of integers $\{z_1, z_2, \cdots, z_d\}$ one can define the following partial continued fraction decompositions:

$$ \frac{\alpha_t}{\gamma_t} = z_t - \cfrac{1}{z_{t-1} - \cfrac{1}{\ddots - \cfrac{1}{z_1}}} , \qquad (20) $$

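where $1 \le t \le d$. The integers $\alpha_t$ and $\gamma_t$ satisfy the recursive relations

$$ \alpha_{m+1} = z_{m+1} \alpha_m - \gamma_m , \qquad \gamma_{m+1} = \alpha_m , \qquad (21) $$

$$ \gamma_1 = 1 , \qquad \alpha_1 = z_1 , \qquad \alpha_0 = 1 , \qquad (22) $$

and, clearly, $\alpha_d/\gamma_d = p/r$. Finally, let $F_t$ be the matrix

$$ F_t = X Y^{z_t} X Y^{z_{t-1}} X \cdots X Y^{z_1} X ; \qquad (23) $$

by definition, one has $F_d = F(p/r)$.

Lemma 1. The matrix element $(F_t)_{mn}$ is given by

$$ (F_t)_{mn} = B_t \sum_{s(m,k,|\alpha_t|)} \left[ e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s + \frac{n}{\gamma_t} \right)^2} - e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s - \frac{n}{\gamma_t} \right)^2} \right] , \qquad (24) $$

$$ B_t = \frac{(-i)^{t+1}}{\sqrt{2k|\alpha_t|}}\, \xi^{z_1 + z_2 + \cdots + z_t} \exp\left\{ -\frac{i\pi}{4} \left[ \mathrm{sign}(\alpha_0\alpha_1) + \cdots + \mathrm{sign}(\alpha_{t-1}\alpha_t) \right] \right\} \exp\left[ \frac{i\pi n^2}{2k} \left( \frac{1}{\alpha_0\alpha_1} + \cdots + \frac{1}{\alpha_{t-2}\alpha_{t-1}} \right) \right] , \qquad (25) $$

where $s(m, k, |\alpha_t|)$ stands for the sum over a complete residue system modulo $(2k|\alpha_t|)$ with the additional constraint $s \equiv m \pmod{2k}$.

Proof. The proof is based on induction. First of all we need to verify the validity of Eqs. (24) and (25) when $t = 1$. In this case, from definition (23) one gets

$$ (F_1)_{mn} = -\frac{1}{2} \frac{1}{2k}\, \xi^{z_1} \sum_{s=1}^{2k} e^{-i\pi s^2 z_1/(2k)} \left[ e^{-i\pi s(m+n)/k} - e^{-i\pi s(m-n)/k} + \mathrm{c.c.} \right] . \qquad (26) $$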

Since the sum (26) covers twice a complete residue system modulo $k$, i.e. $1 \le s \le 2k$, a multiplicative factor 1/2 has been introduced in (26). The change of variables $s \to -s$ shows that the last two terms in (26) are equal to the first two terms. Therefore, Eq. (26) can be written as

$$ (F_1)_{mn} = -\frac{1}{2k}\, \xi^{z_1} \sum_{s=1}^{2k} e^{-i\pi s^2 z_1/(2k)} \left[ e^{-i\pi s(m+n)/k} - e^{-i\pi s(m-n)/k} \right] . \qquad (27) $$

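At this point, one can use the reciprocity formula [19] reported in the appendix and one gets

$$ (F_1)_{mn} = \frac{-1}{\sqrt{2k|z_1|}}\, \xi^{z_1} \exp\left[ -\frac{i\pi}{4}\, \mathrm{sign}(\alpha_0\alpha_1) \right] \times \sum_{v=0}^{|z_1|-1} \left[ e^{\frac{i\pi}{2kz_1} (2kv+m+n)^2} - e^{\frac{i\pi}{2kz_1} (2kv+m-n)^2} \right] . \qquad (28) $$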

By introducing the new variable $s = 2kv + m$, one finds that in Eq. (28) the variable $s$ covers a complete residue system modulo $(2k|z_1|)$ with the constraint that $s \equiv m \pmod{2k}$. Therefore, Eq. (28) can be written in the form

$$ (F_1)_{mn} = B_1 \sum_{s(m,k,|z_1|)} \left[ e^{\frac{i\pi}{2kz_1} (s+n)^2} - e^{\frac{i\pi}{2kz_1} (s-n)^2} \right] . \qquad (29) $$

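This confirms the validity of Eq. (24) when $t = 1$. In order to complete the proof, suppose now that (24) is true for a given $t$; we shall show that (24) is true also in the case $t + 1$. Indeed, one has

$$ (F_{t+1})_{mn} = \sum_{v=1}^{k} (X Y^{z_{t+1}})_{mv} (F_t)_{vn} . \qquad (30) $$

From Eq. (24) one gets

$$ (F_{t+1})_{mn} = -B_t \frac{i \xi^{z_{t+1}}}{\sqrt{2k}} \frac{1}{2} \sum_{v=1}^{2k} \sum_{s(v,k,|\alpha_t|)} e^{-\frac{i\pi}{2k} v^2 z_{t+1}} \left[ e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s - \frac{n}{\gamma_t} \right)^2} e^{-i\pi mv/k} - e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s + \frac{n}{\gamma_t} \right)^2} e^{-i\pi mv/k} - e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s - \frac{n}{\gamma_t} \right)^2} e^{i\pi mv/k} + e^{\frac{i\pi\gamma_t}{2k\alpha_t} \left( s + \frac{n}{\gamma_t} \right)^2} e^{i\pi mv/k} \right] . \qquad (31) $$

Again, the last two terms can be omitted provided one introduces a multiplicative factor 2. Moreover, because of the constraint $v \equiv s \pmod{2k}$, one can set $v = s$, thus

$$ (F_{t+1})_{mn} = -B_t \frac{i \xi^{z_{t+1}}}{\sqrt{2k}}\, e^{\frac{i\pi n^2}{2k\alpha_t\gamma_t}} \sum_{s=0}^{2k|\alpha_t|-1} \left[ e^{-\frac{i\pi}{2k\alpha_t} \left[ \alpha_{t+1} s^2 + 2(\gamma_{t+1} m + n) s \right]} - e^{-\frac{i\pi}{2k\alpha_t} \left[ \alpha_{t+1} s^2 + 2(\gamma_{t+1} m - n) s \right]} \right] . \qquad (32) $$

By using the reciprocity formula, and recalling that $\gamma_t = \alpha_{t-1}$ and $\gamma_{t+1} = \alpha_t$, one obtains the final expression for $(F_{t+1})_{mn}$,

$$ (F_{t+1})_{mn} = -i B_t\, \xi^{z_{t+1}} \sqrt{\frac{|\alpha_t|}{|\alpha_{t+1}|}}\, e^{\frac{i\pi n^2}{2k\alpha_{t-1}\alpha_t}}\, e^{-\frac{i\pi}{4} \mathrm{sign}(\alpha_t\alpha_{t+1})} \sum_{v=1}^{|\alpha_{t+1}|} \left[ e^{\frac{i\pi\alpha_t}{2k\alpha_{t+1}} \left( 2kv + m + \frac{n}{\alpha_t} \right)^2} - e^{\frac{i\pi\alpha_t}{2k\alpha_{t+1}} \left( 2kv + m - \frac{n}{\alpha_t} \right)^2} \right] . \qquad (33) $$

In terms of the variable $s = 2kv + m$, Eq. (33) can be rewritten in the form (24) and this concludes the proof.

From the definition (14) and Lemma 1 it follows (see also [10, 11])

Theorem 1. Let SU(2) be the gauge group; the 3-manifold invariant $I_k(L_{p/r})$ for $k \ge 2$ is given by

$$ I_k(L_{p/r}) = \frac{e^{i\theta_k \sigma(L)} B_d}{a_k} \sum_{s \,(\mathrm{mod}\, p)} \left\{ \exp\left[ \frac{i\pi (r+1)^2}{2pkr} \right] \exp\left[ \frac{i2\pi}{p} \left( rks^2 + (r+1)s \right) \right] - \exp\left[ \frac{i\pi (r-1)^2}{2pkr} \right] \exp\left[ \frac{i2\pi}{p} \left( rks^2 + (r-1)s \right) \right] \right\} . \qquad (34) $$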

Proof. According to Eq. (14), the expression for the matrix element $[F(p/r)]_{11}$ has been written by means of a sum over a complete residue system modulo $p$.

As shown in Eq. (34), the expression for $I_k(L_{p/r})$ is rather involved; nevertheless, $|I_k(L_{p/r})|^2$ can be computed explicitly. Let us introduce the modulo-$p$ Kronecker delta symbol defined by

$$ \delta_p(x) = \begin{cases} 0 & x \not\equiv 0 \pmod{p} \\ 1 & x \equiv 0 \pmod{p} ; \end{cases} \qquad (35) $$

where $p$ and $x$ are integers. One can easily verify that, for integer $n$,

$$ \delta_p(xn) = \delta_p(x) \quad \text{if } (n, p) = 1 ; \qquad \delta_{pn}(xn) = \delta_p(x) . \qquad (36) $$

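Finally, we shall denote by $\phi(n)$ the Euler function [20] which is equal to the number of residue classes modulo $n$ which are coprime with $n$.

Theorem 2. The square of the absolute value of $I_k(L_{p/r})$ is given in the following list:

for $p = 2$,

$$ \left| I_k(L_{2/1}) \right|^2 = \left[ 1 + (-1)^k \right] \frac{\sin^2\left[ \pi/(2k) \right]}{\sin^2(\pi/k)} ; \qquad (37) $$

for $p > 2$ one has: when $p$ and $k$ are coprime integers, i.e. $(k, p) = 1$,

$$ \left| I_k(L_{p/r}) \right|^2 = \frac{1}{2} \left[ 1 - (-1)^p \right] \frac{\sin^2\left[ \pi \left( k^{\phi(p)} - 1 \right)/(kp) \right]}{\sin^2(\pi/k)} + \frac{1}{2} \left[ 1 + (-1)^p \right] \left[ 1 + (-1)^{p/2} \right] \frac{\sin^2\left[ \pi \left( k^{\phi(p/2)} - 1 \right)/(kp) \right]}{\sin^2(\pi/k)} ; \qquad (38) $$

when the greatest common divisor of $p$ and $k$ is greater than unity, i.e. $(k, p) = g > 1$, and $p/g$ is odd

$$ \left| I_k(L_{p/r}) \right|^2 = \frac{g}{4 \sin^2(\pi/k)} \left[ \delta_g(r-1) + \delta_g(r+1) \right] ; \qquad (39) $$

when $(k, p) = g > 1$ and $p/g$ is even

$$ \left| I_k(L_{p/r}) \right|^2 = \frac{g}{4 \sin^2(\pi/k)} \left\{ \delta_g(r+1) \left[ 1 + (-1)^{kp/2g^2} (-1)^{(r+1)/g} \right] + \delta_g(r-1) \left[ 1 + (-1)^{kp/2g^2} (-1)^{(r-1)/g} \right] \right\} . \qquad (40) $$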

Proof. From Theorem 1 it follows that the square of the absolute value of the lens space invariant is

$$ \left| I_k(L_{p/r}) \right|^2 = a(k)^{-2} (2kp)^{-1} S(k, p, r) , \qquad (41) $$

with

$$ S(k, p, r) = \sum_{s,t \,(\mathrm{mod}\, p)} \left\{ \exp\left[ \frac{i2\pi}{p} \left( kr(s^2 - t^2) + (r+1)(s-t) \right) \right] - \exp\left( \frac{i2\pi}{kp} \right) \exp\left[ \frac{i2\pi}{p} \left( kr(s^2 - t^2) + r(s-t) + s + t \right) \right] - \exp\left( -\frac{i2\pi}{kp} \right) \exp\left[ \frac{i2\pi}{p} \left( kr(s^2 - t^2) + r(s-t) - s - t \right) \right] + \exp\left[ \frac{i2\pi}{p} \left( kr(s^2 - t^2) + (r-1)(s-t) \right) \right] \right\} . \qquad (42) $$

The indices $s$ and $t$ run over a complete residue system modulo $p$. When $p = 2$, each sum contains only two terms and the evaluation of (42) is straightforward; the corresponding result is shown in Eq. (37). Let us now consider the case in which $p > 2$. By means of the change of variables $s \to s + t$, the sum in $t$ becomes a geometric sum and one obtains

$$ S(k, p, r) = p \sum_{s \,(\mathrm{mod}\, p)} \left\{ \exp\left[ \frac{i2\pi}{p} \left( krs^2 + (r+1)s \right) \right] \delta_p(2krs) - \exp\left( \frac{i2\pi}{kp} \right) \exp\left[ \frac{i2\pi}{p} \left( krs^2 + (r+1)s \right) \right] \delta_p(2krs + 2) - \exp\left( \frac{-i2\pi}{kp} \right) \exp\left[ \frac{i2\pi}{p} \left( krs^2 + (r-1)s \right) \right] \delta_p(2krs - 2) + \exp\left[ \frac{i2\pi}{p} \left( krs^2 + (r-1)s \right) \right] \delta_p(2krs) \right\} . \qquad (43) $$

By using properties (36), one can determine the values of $s$ which contribute to (43). Let us start with $(k, p) = 1$. Clearly, in this case one has

$$ \delta_p(2rks) \neq 0 \;\Rightarrow\; \begin{cases} s = p & p \text{ odd} \\ s = p,\; p/2 & p \text{ even} . \end{cases} \qquad (44) $$

When $(k, p) = 1$ and $p$ is odd, one gets

$$ \delta_p(2krs \mp 2) = \delta_p(krs \mp 1) . \qquad (45) $$

The delta gives a non-vanishing contribution if and only if the following congruence is satisfied:

$$ rks = \pm 1 \pmod{p} . \qquad (46) $$

The unique solution [20] to (46) is given by

$$ s = \pm (rk)^{\phi(p)-1} . \qquad (47) $$

When $(k, p) = 1$ and $p$ is even, one finds two solutions

$$ s_1 = \pm (rk)^{\phi(p/2)-1} , \qquad s_2 = \pm (rk)^{\phi(p/2)-1} + p/2 . \qquad (48) $$

Let us now examine the case $(p, k) = g > 1$. We introduce the integer $\beta$ defined by $p = g\beta$. For $\beta$ odd, one has

$$ \delta_p(2krs) = \delta_\beta(s) . \qquad (49) $$

Within the residues of a complete system modulo $p$, the values of $s$ giving non-vanishing contribution are of the form $s = \alpha\beta$ with $1 \le \alpha \le g$. When $\beta$ is even, one gets

$$ \delta_p(2krs) = \delta_\beta(2s) = \delta_{\beta/2}(s) . \qquad (50) $$

The solutions of the associated congruence are

$$ s = \alpha \frac{\beta}{2} , \qquad 1 \le \alpha \le 2g . \qquad (51) $$

When $(k, p) = g > 1$ and $p$ is odd, $\delta_p[2(rks \pm 1)]$ does not contribute because $rks = \pm 1 \pmod{p}$ has no solutions. On the other hand, if $p$ is even we have

$$ \delta_p[(2rks \pm 2)] = \delta_{p/2}(rks \pm 1) . \qquad (52) $$

The delta function (52) is non-vanishing when $(p/2, k) = 1$ and, in this case, the two solutions are $s_1 = \pm (rk)^{\phi(p/2)-1}$ and $s_2 = s_1 + p/2$. This exhausts the analysis of the modulo-$p$ Kronecker deltas when $p > 2$.

At this stage, Theorem 2 simply follows from the substitution of the values of $s$ for which the various Kronecker deltas modulo $p$ are non-vanishing. In the case $(k, p) = 1$ and $p$ odd, the algebraic manipulations are straightforward. When $(k, p) = 1$ and $p$ even, the evaluation of (43) needs some care. In this case, one has to deal with factors of the form

$$ \exp\left[ \frac{i\pi}{b} \left( a^{\phi(b)} - 1 \right) \right] ; \qquad (53) $$

with $b > 2$ even and $(a, b) = 1$. In Appendix B, it is shown that terms of the type (53) are trivial because actually

$$ a^{\phi(b)} \equiv 1 \pmod{2b} . \qquad (54) $$

Finally, the derivation of Eqs. (39) and (40) is straightforward.

Let us now consider the dependence of $|I(L_{p/r})|^2$ on $r$. As shown in Eqs. (39) and (40), $|I(L_{p/r})|^2$ depends on $r$. However, this dependence is rather peculiar: when $I(L_{p/r}) \neq 0$, $|I(L_{p/r})|^2$ does not depend on $r$. Indeed, when expression (39) is different from zero, its values are given by

$$ 0 \neq (39) = \begin{cases} \sin^{-2}(\pi/k) & \text{for } g = 2 ; \\ (g/4) \sin^{-2}(\pi/k) & \text{for } g > 2 . \end{cases} \qquad (55) $$

Similarly, when expression (40) is different from zero, its value is given by

$$ 0 \neq (40) = \frac{g}{2 \sin^2(\pi/k)} . \qquad (56) $$

To sum up, when $I(L_{p/r}) \neq 0$, $|I(L_{p/r})|^2$ only depends on $p$ and, therefore, it is a function of the fundamental group $\pi_1(L_{p/r}) = Z_p$. Thus, Theorem 2 proves the validity of our conjecture for the lens spaces when the gauge group is SU(2).
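The closed formulae of Theorem 2 are easy to evaluate. The following Python sketch does so under a deliberate restriction: it covers only odd $p > 2$, where Eq. (38) reduces to its first term when $(k, p) = 1$ and where $p/g$ is automatically odd, so that Eq. (39) applies when $(k, p) = g > 1$. The helper names are ours. Because $k^{\phi(p)} - 1$ is a multiple of $p$, the sine in Eq. (38) is evaluated at $\pi N/k$ with the integer $N = (k^{\phi(p)} - 1)/p$ reduced modulo $k$, which avoids feeding a huge argument to the floating-point sine.

```python
import math


def euler_phi(n):
    """Euler function: number of residue classes modulo n that are coprime with n."""
    return sum(1 for x in range(1, n + 1) if math.gcd(x, n) == 1)


def kronecker_delta(p, x):
    """Modulo-p Kronecker delta of Eq. (35)."""
    return 1 if x % p == 0 else 0


def abs_sq_odd_p(p, r, k):
    """|I_k(L_{p/r})|^2 from Eqs. (38)-(39); this sketch handles odd p > 2 and k >= 2 only."""
    if p <= 2 or p % 2 == 0 or k < 2:
        raise ValueError("this sketch only covers odd p > 2 and k >= 2")
    g = math.gcd(k, p)
    s2 = math.sin(math.pi / k) ** 2
    if g == 1:
        # sin^2(pi * N / k) only depends on N modulo k
        N = ((pow(k, euler_phi(p)) - 1) // p) % k
        return math.sin(math.pi * N / k) ** 2 / s2
    return g * (kronecker_delta(g, r - 1) + kronecker_delta(g, r + 1)) / (4.0 * s2)


if __name__ == "__main__":
    for r in (1, 2, 4):
        print(r, abs_sq_odd_p(15, r, 5))   # r = 2 vanishes, r = 1 and r = 4 agree
```

For $p = 15$ and $k = 5$ the value vanishes for $r = 2$ and coincides for $r = 1$ and $r = 4$, which is the pattern described in the text: whenever the invariant is nonzero, its modulus is a function of $p$ alone.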

4. The SU(3) Case

In this section we shall present numerical computations confirming the validity of our conjecture for lens spaces when the gauge group is SU(3). As in the SU(2) case, the SU(3) Chern–Simons field theory can be solved explicitly in any closed, connected and orientable three-manifold [16]. The general surgery rules for SU(3) and for any integer $k$ have been derived in [16]. In particular, it turns out that

$$ I_{k=1}(L_{p/r}) = \left[ I_{k=2}(L_{p/r}) \right]^* = I_{k=4}(L_{p/r}) . \qquad (57) $$

Therefore, we only need to consider the case $k \ge 3$. For $k \ge 3$, the matrices which give a projective representation of the modular group have the following form:

$$ X_{(m,n)(a,b)} = \frac{i}{k\sqrt{3}}\, q^{-2}\, q^{-[(m+n)(a+b+3) + (m+3)b + (n+3)a]/3} \left[ 1 + q^{(n+1)(a+b+2) + (m+1)(b+1)} + q^{(m+1)(a+b+2) + (n+1)(a+1)} - q^{(m+1)(b+1)} - q^{(n+1)(a+1)} - q^{(m+n+2)(a+b+2)} \right] ; \qquad (58) $$

$$ Y_{(a,b)(m,n)} = q^{[m^2 + n^2 + mn + 3(m+n)]/3}\, \delta_{am} \delta_{bn} ; \qquad (59) $$

$$ C_{(a,b)(m,n)} = \delta_{an} \delta_{bm} ; \qquad (60) $$

where each irreducible representation of SU (3) has been denoted by a couple of nonnegative integers (m, n) (Dynkin labels). By using Eq. (14), we have computed Ik (Lp/r ) numerically for some examples of lens spaces. In particular, we have worked out the value of the invariant for the lens spaces Lp/r , with p ≤ 20 and 3 ≤ k ≤ 50. In all these cases, the results are in agreement with our conjecture. Our calculations have been performed on a Pentium based PC running Linux. For instance, the results of the computations for the cases L8/1 , L8/3 , L15/1 , L15/2 , L15/4 with 20 ≤ k ≤ 50 are shown in Tables 1–5. It should be noted that for low values of k the invariant I(M ) is a function of the homology groups of M and so, in these cases, our conjecture is true. The spaces L8/1 and L8/3 are not homotopically equivalent; as shown in Tables 1 and 2, the phase of the invariant distinguishes these two manifolds. The case in which p = 15 is more interesting because there are two different spaces belonging to the same homotopy class; L15/1 and L15/4 are homotopically equivalent and L15/2 represents the other homotopy class. The phase of the invariant distinguishes the manifolds of the same homotopy class. The numerical computations show that Ik (L8/3 ) vanishes when k = 8n with integer n; similarly, Ik (L15/2 ) = 0 for k = 5n and Ik (L15/4 ) = 0 for k = 15n.
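A numerical evaluation of this kind can start from the matrix $X$ of Eq. (58). The following Python sketch builds it and is meant only as an illustration, not as the code used for the tables. Two points are assumptions of ours rather than statements of the text: the basis of $T_{(k)}$ for $k \ge 3$ is taken to be the Dynkin labels $(m, n)$ with $m + n \le k - 3$ (this reproduces the order $(k-1)(k-2)/2$ quoted in Sect. 2), and fractional powers of $q$ are defined by $q^{e} = \exp(-i2\pi e/k)$. The function names are hypothetical.

```python
import cmath
import math


def su3_basis(k):
    """Dynkin labels (m, n) with m + n <= k - 3 (assumed basis of T_(k) for k >= 3)."""
    return [(m, n) for m in range(k - 2) for n in range(k - 2 - m)]


def su3_X(k):
    """Matrix X of Eq. (58) as a dict keyed by ((m, n), (a, b))."""
    def q(e):
        # q**e with q = exp(-i 2 pi / k); the branch for fractional e is an assumption
        return cmath.exp(-2j * math.pi * e / k)

    pref = 1j / (k * math.sqrt(3.0))
    basis = su3_basis(k)
    X = {}
    for (m, n) in basis:
        for (a, b) in basis:
            e0 = -2 - ((m + n) * (a + b + 3) + (m + 3) * b + (n + 3) * a) / 3.0
            bracket = (1
                       + q((n + 1) * (a + b + 2) + (m + 1) * (b + 1))
                       + q((m + 1) * (a + b + 2) + (n + 1) * (a + 1))
                       - q((m + 1) * (b + 1))
                       - q((n + 1) * (a + 1))
                       - q((m + n + 2) * (a + b + 2)))
            X[(m, n), (a, b)] = pref * q(e0) * bracket
    return X


if __name__ == "__main__":
    k = 4
    X = su3_X(k)
    a_k = 16 * math.cos(math.pi / k) * math.sin(math.pi / k) ** 3 / (k * math.sqrt(3.0))
    print(X[(0, 0), (0, 0)], a_k)   # the entry for the trivial class should equal a_k of Eq. (5)
```

Two quick consistency checks are available: the entry labelled by the trivial representation should reproduce $a_k$ of Eq. (5), and the square of $X$ should equal the matrix $C$ of Eq. (60), in agreement with Eq. (11).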

Table 1. L8/1

k      Ik                                      |Ik|
20     −66.118464248 + i 0.000001734           66.118464248
21     −90.016155013 + i 0.000000715           90.016155013
22     43.367008373 − i 50.048195295           66.223253224
23     22.973052202 − i 3.157573698            23.189036183
24     219.054514271 − i 58.695485329          226.781922164
25     23.000602198 − i 5.905551129            23.746646829
26     84.434763344 + i 44.314782640           95.357376335
27     −185.409752744 + i 67.483626877         197.308936212
28     −161.491033989 + i 77.769978391         179.241523084
29     −214.342425435 + i 99.165358161         236.170369862
30     50.544948881 − i 155.561366312          163.566899299
31     47.834849646 − i 26.550506056           54.709251617
32     443.615385766 − i 296.414325177         533.531688523
33     46.894822945 − i 30.137471278           55.743982582
34     211.662329441 + i 39.566544008          215.328709440
35     −344.610722364 + i 250.374335362        425.962272715
36     −290.376668930 + i 243.654963011        379.059824907
37     −381.345093602 + i 307.914657384        490.138262785
38     27.064369107 − i 326.618342223          327.737732877
39     79.852315000 − i 70.742976321           106.681586554
40     734.287201006 − i 734.287220264         1038.438931957
41     78.053570702 − i 75.119028091           108.329258655
42     408.590935390 − i 0.000001624           408.590935390
43     −545.541472393 + i 565.843251992        785.998781122
44     −452.077098514 + i 521.724781996        690.340822456
45     −590.050989168 + i 655.318028318        881.817377951
46     −39.324647251 − i 574.905929557         576.249299975
47     118.947578576 − i 140.691854636         184.235513433
48     1090.316030520 − i 1420.927547540       1791.039960963
49     116.363242798 − i 145.914887153         186.632147733
50     687.196328608 − i 86.813088226          692.658145364

Table 2. L8/3

k      Ik                                      |Ik|
20     −20.431729095 + i 62.882396270          66.118464248
21     20.030478885 − i 87.759262069           90.016155013
22     −55.710545730 + i 35.802993757          66.223253224
23     −4.717948848 + i 22.704016336           23.189036183
24 *   0.000000000 + i 0.000000000             0.000000000
25     −4.449677900 + i 23.326028428           23.746646829
26     54.169163837 + i 78.477582217           95.357376335
27     34.262337211 − i 194.311370120          197.308936212
28     −39.884991120 + i 174.747563877         179.241523084
29     38.208113963 − i 233.059184818          236.170369862
30     −132.328401250 + i 96.142211171         163.566899299
31     −8.284500381 + i 54.078362271           54.709251617
32 *   0.000000000 + i 0.000000000             0.000000000
33     −7.933195866 + i 55.176589215           55.743982582
34     129.764538515 + i 171.836019661         215.328709440
35     57.178306982 − i 422.107212669          425.962272715
36     −65.823047822 + i 373.301054424         379.059824907
37     62.256293514 − i 486.168356194          490.138262785
38     −258.631121471 + i 201.300681961        327.737732877
39     −12.859044288 + i 105.903757675         106.681586554
40 *   0.000000000 + i 0.000000000             0.000000000
41     −12.423570453 + i 107.614511930         108.329258655
42     254.752281347 + i 319.449256739         408.590935390
43     85.965636475 − i 781.283554973          785.998781122
44     −98.245742501 + i 683.314148273         690.340822456
45     92.175015400 − i 876.986690089          881.817377951
46     −447.003088251 + i 363.663986140        576.249299975
47     −18.441181139 + i 183.310248618         184.235513433
48 *   0.000000000 + i 0.000000000             0.000000000
49     −17.920983557 + i 185.769741658         186.632147733
50     441.516918550 + i 533.702273720         692.658145364

Table 3. L15/1

k      Ik                                      |Ik|
20     −115.811330427 − i 84.141852139         143.150674244
21     −53.061367118 − i 135.569663547         145.583798394
22     36.010672960 + i 16.445523443           39.588177633
23     −40.070955963 − i 2.740927872           40.164588848
24     164.290883844 − i 129.064834808         208.923972053
25     274.923532195 − i 34.730921473          277.108616721
26     0.000000762 − i 277.056941014           277.056941014
27     152.288761117 + i 30.720691710          155.356453557
28     −0.000003253 + i 136.433611353          136.433611353
29     −4.270422302 + i 12.674160517           13.374260781
30     485.145409358 + i 667.745345688         825.378649414
31     13.416605095 + i 0.680413518            13.433847358
32     164.960256203 + i 68.328775084          178.551694562
33     −82.729230504 + i 287.059281883         298.742626511
34     −538.981501170 − i 268.380890132        602.104111256
35     −415.659227492 + i 629.696787632        754.513510650
36     −681.771583909 + i 223.335013764        717.419696551
37     101.630605413 − i 150.367371847         181.491395038
38     −59.482939675 + i 173.267881065         183.193828283
39     −279.481438371 − i 847.784737546        892.663898458
40     347.379313920 − i 1069.123643311        1124.143119192
41     −941.842709163 − i 512.157840150        1072.088308877
42     399.545316356 − i 429.453517605         586.572061733
43     390.071778264 + i 279.072428091         479.622155784
44     24.267771316 + i 37.761389387           44.887049949
45     2753.539308039 + i 289.408610795        2768.706568944
46     32.920210846 − i 30.745331135           45.044596443
47     536.002877882 − i 206.437637985         574.382784800
48     394.055675294 + i 817.737030229         907.729985094
49     −1754.712300120 + i 400.501606784       1799.837990828
50     137.556258644 + i 2186.393963508        2190.716843400

Table 4. L15/2

k      Ik                                      |Ik|
20 *   0.000000000 + i 0.000000000             0.000000000
21     −53.061366041 + i 135.569663969         145.583798394
22     −11.153278505 + i 37.984578277          39.588177633
23     −29.353726021 − i 27.414466365          40.164588848
24     −164.290886665 − i 129.064831217        208.923972053
25 *   0.000000000 + i 0.000000000             0.000000000
26     228.013392388 + i 157.386281027         277.056941014
27     56.698638283 + i 144.640561664          155.356453557
28     0.000000000 − i 136.433611353           136.433611353
29     11.816320486 + i 6.264616638            13.374260781
30 *   0.000000000 + i 0.000000000             0.000000000
31     −1.359079795 − i 13.364922631           13.433847358
32     164.960256101 + i 68.328775330          178.551694562
33     −224.792217910 − i 196.762841162        298.742626511
34     −110.636340119 − i 591.852144574        602.104111256
35 *   0.000000000 + i 0.000000000             0.000000000
36     147.472006710 + i 702.099015977         717.419696551
37     45.731849880 + i 175.635202563          181.491395038
38     −177.588145856 − i 44.971426177         183.193828283
39     −875.291194999 − i 175.254556482        892.663898458
40 *   0.000000000 + i 0.000000000             0.000000000
41     1052.478012596 + i 204.116082250        1072.088308877
42     399.545318063 + i 429.453516017         586.572061733
43     −171.335849479 − i 447.974819607        479.622155784
44     44.430164502 + i 6.388093254            44.887049949
45 *   0.000000000 + i 0.000000000             0.000000000
46     −17.945816315 − i 41.315412930          45.044596443
47     571.498128245 + i 57.493242102          574.382784800
48     −817.737034535 − i 394.055666358        907.729985094
49     −780.920437266 − i 1621.597997004       1799.837990828
50 *   0.000000000 + i 0.000000000             0.000000000

Table 5. L15/4

k      Ik                                      |Ik|
20     −44.235991098 + i 136.144381552         143.150674244
21     72.909410760 − i 126.011349400          145.583798394
22     −36.010673013 + i 16.445523326          39.588177633
23     −10.836276386 + i 38.675176941          40.164588848
24     164.290886665 − i 129.064831217         208.923972053
25     −257.649075864 + i 102.010485575        277.108616721
26     275.036883975 − i 33.395523912          277.056941014
27     −153.611719961 + i 23.217819719         155.356453557
28     106.668092622 + i 85.064965309          136.433611353
29     −9.197471902 + i 9.709653035            13.374260781
30 *   0.000000000 + i 0.000000000             0.000000000
31     −4.021598499 + i 12.817761129           13.433847358
32     −164.960256101 − i 68.328775330         178.551694562
33     85.599713692 + i 286.216431937          298.742626511
34     −217.505092368 − i 561.445362956        602.104111256
35     33.851120653 + i 753.753765751          754.513510650
36     147.472006710 − i 702.099015977         717.419696551
37     −136.240563896 + i 119.906777214        181.491395038
38     0.000000000 + i 183.193828283           183.193828283
39     538.949599873 − i 711.605343156         892.663898458
40     −909.450887536 + i 660.754746927        1124.143119192
41     1008.985242455 − i 362.383943545        1072.088308877
42     −546.310790198 + i 213.568031595        586.572061733
43     426.548198677 + i 219.303548819         479.622155784
44     −24.267771377 + i 37.761389348          44.887049949
45 *   0.000000000 + i 0.000000000             0.000000000
46     −6.133571758 + i 44.625048641           45.044596443
47     −558.735830232 − i 133.153503483        574.382784800
48     394.055666358 + i 817.737034536         907.729985094
49     −883.212092863 − i 1568.232505801       1799.837990828
50     410.499402001 + i 2151.913225229        2190.716843400

5. Conclusions

In this article, we have presented some arguments and numerical results supporting the conjecture that for nonvanishing $I(M)$, the absolute value $|I(M)|$ only depends on the fundamental group $\pi_1(M)$. Since the Turaev–Viro invariant [21] coincides [3] with $|I(M)|^2$, our conjecture gives some hints on the topological interpretation of the Turaev–Viro invariant. For the gauge group SU(2), $|I(M)|^2$ can be understood as the improved partition function of the Euclidean version of (2+1) gravity with positive cosmological constant [22, 23]. In this case, our conjecture suggests that, for instance, the semiclassical limit is uniquely determined by the fundamental group of the universe.

Lens spaces are a natural testing ground for our conjecture; unfortunately, there are not many interesting examples of non-homeomorphic manifolds with the same fundamental group. A class of manifolds of this kind can also be obtained by means of connected sums of two (or more) 3-manifolds. In this case, the fundamental group is freely generated by the fundamental groups of the summands and, by mixing the relative orientations of the summands, one can get spaces which are not homeomorphic. For all these manifolds, our conjecture is valid because the invariant $I(M)$ is just the product of the invariants of the summands.

The numerical computations can be extended to the case in which the gauge group is a generic unitary group SU(N). In this case, the analog of formula (6) for lens spaces has been produced by Jeffrey [11]. However, the time needed to perform the numerical computations increases rapidly with $N$. Our preliminary results for $N = 4$ and $N = 5$ are in agreement with our conjecture.

Finally, one may ask for which values of $k$ the equality $I_k(M) = 0$ is satisfied and what the meaning of this fact is. The complete solution to this problem is not known. A discussion on this issue can be found in [24], where lens spaces are analyzed when the gauge group is SU(2).
From the field theory point of view, gauge invariance of the factor $\exp(iS_{CS})$ (where $S_{CS}$ is the Chern–Simons action) in the functional measure gives nontrivial constraints on the admissible values of $k$ in a given manifold $M$. In certain cases [9] one finds that, in correspondence with the “forbidden” values of $k$, the invariant $I_k(M)$ vanishes. So, it is natural to expect that $I_k(M) = 0$ is related to a breaking of gauge invariance for large gauge transformations. From the mathematical point of view, $I_k(M) = 0$ signals the absence of the natural extension of $E(L)$ to an invariant $E_M(L)$ of links in the manifold $M$. More precisely, when $I_k(M) \neq 0$ for fixed integer $k$, one can define [13] an invariant $E_M(L)$ of oriented, framed and coloured links $\{L \subset M\}$ with the following property: if the link $L$ belongs to a three-ball embedded in $M$, then one has $E_M(L) = E(L)$. The values of the invariant $E_M(L)$ correspond to the vacuum expectation values of the Wilson line operators associated with links in the manifold $M$. When $I_k(M) = 0$, the invariant $E_M(L)$ cannot be constructed; consequently, for these particular values of $k$, the quantum Chern–Simons field theory is not well defined in $M$.

Acknowledgement. We wish to thank Thurston and Turaev for useful comments and discussions.

Appendix A

The generalized Gauss sums have a very useful property which can be expressed by means of the so-called reciprocity formula [19]

$$ \sum_{n=0}^{|c|-1} e^{i\frac{\pi}{c}(an^2 + bn)} = \sqrt{\left| \frac{c}{a} \right|}\; e^{i\frac{\pi}{4ac}(|ac| - b^2)} \sum_{n=0}^{|a|-1} e^{-i\frac{\pi}{a}(cn^2 + bn)} , \qquad (61) $$

where the integers $a$, $b$, $c$ satisfy the relations

$$ ac \neq 0 , \qquad ac + b \ \text{is even} . \qquad (62) $$
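The reciprocity formula (61) can be checked numerically on small integers. The following Python sketch, with function names of our own choosing, evaluates both sides for given $a$, $b$, $c$ satisfying (62); it takes the square root in (61) to be that of the absolute value $|c/a|$, so that negative $a$ or $c$ are allowed.

```python
import cmath
import math


def gauss_sum_lhs(a, b, c):
    """Left-hand side of Eq. (61): sum over n = 0, ..., |c| - 1."""
    return sum(cmath.exp(1j * math.pi * (a * n * n + b * n) / c) for n in range(abs(c)))


def gauss_sum_rhs(a, b, c):
    """Right-hand side of Eq. (61): prefactor times the dual sum over n = 0, ..., |a| - 1."""
    phase = cmath.exp(1j * math.pi * (abs(a * c) - b * b) / (4.0 * a * c))
    dual = sum(cmath.exp(-1j * math.pi * (c * n * n + b * n) / a) for n in range(abs(a)))
    return math.sqrt(abs(c / a)) * phase * dual


if __name__ == "__main__":
    # a = 1, b = 0, c = 2 satisfies (62); both sides equal 1 + i.
    print(gauss_sum_lhs(1, 0, 2), gauss_sum_rhs(1, 0, 2))
```

In the proof of Lemma 1 the formula is applied with $c = -2k\alpha_t$, $a = \alpha_{t+1}$ and even $b$, so condition (62) holds automatically there.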

Appendix B

Lemma 2. Let $a$ and $b$ be two integers, with $(a, b) = 1$ and $b > 2$ even; one has

$$ a^{\phi(b)} \equiv 1 \pmod{2b} . \qquad (63) $$

Proof. The proof consists of two parts: firstly, it is shown by induction that Lemma 2 holds when $b = 2^m$ with $m > 1$ integer. Secondly, Eq. (63) is proved when $b = 2^m c$ with $m \ge 1$ and $c$ odd integer. Since $b$ is even, $a$ is clearly odd and can be written in the form $a = (2f + 1)$. When $b$ is of the type $b = 2^m$, the condition $b > 2$ implies that $m \ge 2$. Let us now consider the case $m = 2$; one has $\phi(b) = \phi(2^2) = 2$, therefore

$$ a^{\phi(b)} = (2f + 1)^2 = 1 + 4f(f + 1) \equiv 1 \pmod{2^3} . \qquad (64) $$

Thus, Lemma 2 is satisfied when $b = 2^2$. Suppose now that Eq. (63) holds when $b = 2^n$ for a certain $n$. We need to prove that (63) is true also for $b = 2^{n+1}$. Indeed, $\phi(2^{n+1}) = 2^n$ and one gets

$$ (2f + 1)^{\phi(2^{n+1})} = \left[ (2f + 1)^{\phi(2^n)} \right]^2 . \qquad (65) $$

By using the induction hypothesis

$$ (2f + 1)^{\phi(2^n)} = 1 + N 2^{n+1} , \qquad (66) $$

one finds

$$ \left[ (2f + 1)^{\phi(2^n)} \right]^2 = 1 + 2^{n+2} N (1 + 2^n N) \equiv 1 \pmod{2^{n+2}} . \qquad (67) $$

Therefore, Eq. (63) is also satisfied when b = 2(n+1) . To sum up, for m > 1 and a odd, one has m (68) aφ(2 ) ≡ 1 (mod 2m+1 ). Let us now consider the general case in which b = 2m c with c odd integer. From the Euler Theorem [20] it follows that aφ(b) ≡ 1

⇒

(mod b)

aφ(b) ≡ 1

(mod c).

(69)

On the other hand, φ(2m c) = φ(2m )φ(c) and, for m > 1, Eq. (68) implies aφ(b) = aφ(c)φ(2

m

)

≡ 1

(mod 2m+1 ).

(70)

Since (2m+1 , c) = 1, from equations (69) and (70) one gets aφ(b) ≡ 1

(mod 2m+1 c) ≡ 1

(mod 2b).

(71)

Three-Manifold Invariants and Their Relation with Fundamental Group

65

Finally, we need to consider the case b = 2c. Since φ(c) is even, one gets aφ(2c) = [1 + 4f (f + 1)]φ(c)/2 ≡ 1

(mod 22 ).

(72)

Equations (68) and (72) imply aφ(2c) ≡ 1 This concludes the proof.

(mod 22 c).

(73)
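Lemma 2 lends itself to a brute-force check for small moduli. The sketch below is illustrative only (the helper names are not from the paper): it computes Euler's function by direct counting and tests congruence (63) for every admissible pair (a, b) up to a chosen bound.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient, counted directly (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def lemma2_holds(a, b):
    """Check the congruence a^phi(b) = 1 (mod 2b) of Eq. (63)."""
    return pow(a, euler_phi(b), 2 * b) == 1


def counterexamples(max_b=60):
    """Pairs (a, b) with b > 2 even, gcd(a, b) = 1, violating (63); expected to be empty."""
    return [
        (a, b)
        for b in range(4, max_b + 1, 2)
        for a in range(1, 2 * b)
        if gcd(a, b) == 1 and not lemma2_holds(a, b)
    ]
```

The hypothesis b > 2 is needed: for b = 2 and a = 3 one has φ(2) = 1 and 3 is not congruent to 1 modulo 4, so `lemma2_holds(3, 2)` is false.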

References

1. Witten, E.: Commun. Math. Phys. 121, 351–399 (1989)
2. Reshetikhin, N.Y., Turaev, V.G.: Commun. Math. Phys. 127, 1–26 (1990) and Invent. Math. 103, 547–597 (1991)
3. Turaev, V.G.: Quantum invariants of knots and three manifolds. Berlin: de Gruyter Studies in Mathematics, 18, 1994
4. Kirby, R. and Melvin, P.: Invent. Math. 105, 473–545 (1991)
5. Lickorish, W.B.R.: Pacific J. Math. 149, 337–347 (1991)
6. Morton, H.R., Strickland, P.M.: Satellites and Surgery Invariants. In: Knots 90, ed. Kawauchi; Berlin: de Gruyter, 1992
7. Kauffman, L.H. and Lins, S.L.: Temperley–Lieb recoupling theory and invariants of 3-manifolds. Princeton, NJ: Princeton University Press, 1994
8. Kohno, T.: Topology 31, 203–230 (1992)
9. Guadagnini, E. and Panicucci, S.: Nucl. Phys. B 388, 159 (1992)
10. Freed, D.S. and Gompf, R.E.: Commun. Math. Phys. 141, 79–117 (1991)
11. Jeffrey, L.C.: Commun. Math. Phys. 147, 563–604 (1992)
12. Guadagnini, E.: Int. Journ. Mod. Phys. A7, 877 (1992)
13. Guadagnini, E.: The Link Invariants of the Chern-Simons Field Theory. de Gruyter Expositions in Mathematics, Berlin: Walter de Gruyter 1993
14. Guadagnini, E. and Pilo, L.: Nucl. Phys. B 433, 597 (1995)
15. Verlinde, E.: Nucl. Phys. B 300, 360 (1988)
16. Guadagnini, E. and Pilo, L.: J. Geom. Phys. 14, 236 (1994); J. Geom. Phys. 14, 365 (1994)
17. Rolfsen, D.: Knots and Links, Berkeley, CA: Publish or Perish, 1976
18. Kirby, R.: Invent. Math. 45, 35–56 (1978); Fenn, R. and Rourke, C.: Topology 18, 1–15 (1979); Rolfsen, D.: Pacific J. Math. 110, 377–386 (1984)
19. Siegel, C.L.: Über das quadratische Reziprozitätsgesetz algebraischen Zahlkörpern. Nachr. Acad. Wiss. Göttingen Math. Phys. Kl. 1, 1–16 (1960)
20. Loo-Keng, H.: Introduction to number theory, New York: Springer-Verlag, 1982
21. Turaev, V.G. and Viro, O.Y.: Topology 31, 865 (1992)
22. Archer, F. and Williams, R.: Phys. Lett. B 273, 438 (1991)
23. Guadagnini, E. and Tomassini, P.: Phys. Lett. B 336, 330 (1994)
24. Yamada, S.: J. of Knot Theory and Ramifications 4, 319 (1995)

Communicated by A. Jaffe

Commun. Math. Phys. 192, 67 – 76 (1998)


Self-Duality and Schlesinger Chains for the Asymmetric d-PII and q-PIII Equations

A. Ramani1, Y. Ohta2, J. Satsuma3, B. Grammaticos4

1 CPT, Ecole Polytechnique, CNRS, UPR 14, 91128 Palaiseau, France
2 Department of Applied Mathematics, Faculty of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, 739 Japan
3 Department of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153, Japan
4 GMPIB (ex LPN), Université Paris VII, Tour 24-14, 5e étage, 75251 Paris, France

Received: 15 February 1997 / Accepted: 19 June 1997

Abstract: We analyse two asymmetric discrete Painlevé equations, namely d-PII and q-PIII, which are known to be discrete forms of PIII and PVI respectively. We show that both equations are self-dual. This means that the same equation governs the evolution along the discrete independent variable and the transformations under the action of the Schlesinger transforms along the parameters of the discrete Painlevé. A bilinear formulation of the self-dual equation is given as a system of nonautonomous Hirota–Miwa equations.

1. Introduction

Since the discovery of the discrete Painlevé (dP) equations [1] one question has been present in the minds of all practitioners: “up to which point can one push the analogy between discrete and continuous Painlevé's?” The answer to this question depends on the attitudes one can have with respect to discrete systems. Those who believe that continuous systems are more fundamental (merely because they are more familiar with them) try to establish the discrete analog of the known properties of the continuous Painlevé equations through discretisation. However, discretisation is a delicate procedure (in particular in the domain of integrable systems) and despite multiple efforts no systematic approach seems to exist to date. The converse attitude consists in considering the discrete systems as the most fundamental ones. Thus one does use the properties of the continuous systems only as a guide and tries to develop techniques which are specific to the discrete systems (the comparison with the continuous system being straightforward through the continuous limit implemented in a systematic way). The advantage of this point of view is clear: discrete systems may and, in fact, do possess properties with no continuous analog or rather properties the continuous avatar of which does not allow one to guess what the discrete property should be. Thus when one sticks too closely to the continuous systems one can easily miss these important discrete properties.


The main theme in this paper is such a purely discrete property: the self-duality of dP's. Let us illustrate what we mean by self-duality in the example where this notion first appeared: the alternate d-PII Eq. [2]:

$$ \frac{z_n}{x_{n+1}x_n + 1} + \frac{z_{n-1}}{x_n x_{n-1} + 1} = -x_n + \frac{1}{x_n} + z_n + a, \tag{1.1} $$

where zn = δn + z0 and a is a parameter. The Schlesinger transforms of (1.1) were presented in [3]. By denoting by x and x˜ the solutions of alt-d-PII corresponding to ˜ parameters a − δ and a + δ respectively, we have: 1 a(1 + xn xn−1 ) + xn = xn 1 + xn xn−1 − zn−1 xn ˜ and

x˜ n =

(a + δ)(1 + xn xn−1 ) xn − 1 + xn xn−1 − zn−1 xn−1

(1.2)

−1 .

(1.3)

Eliminating $x_{n-1}$ between (1.2) and (1.3) we obtain the dual equation of alt-d-PII, i.e. the equation where the parameter a is now the independent variable. We find:

$$ \frac{a+\delta}{x\tilde{x} - 1} + \frac{a}{x\underset{\sim}{x} - 1} = x + \frac{1}{x} - a - z, \tag{1.4} $$

where we have dropped the index n, and $z (\equiv z_n)$ is now just a parameter. We remark that (1.4) is essentially alt-d-PII itself. The only, minor, change is the fact that the x of the dual equation is multiplied by i with respect to the initial one. This is the self-duality property: the evolution equation in the discrete independent variable and in the space of the parameters is the same.

The interesting question is whether this property of self-duality is a general property of discrete Painlevé equations. The answer to this question is “yes” provided we do not restrict their freedom unnecessarily. As a matter of fact, when we obtained d-P's through the singularity confinement approach [4], it turned out that several d-P's possess nonautonomous parts having an even-odd asymmetry. This parity dependence was quite often neglected since it does not survive as such at the continuous limit. (Let us point out here that there is no way one can guess such parity-dependent terms if one tries to “discretize” [5] a given Painlevé equation.) It is our experience that these terms are crucial for self-duality. Indeed, the symmetrical forms of the d-P equations resulting from the omission of these parity-dependent terms are not self-dual. On the contrary once the full, asymmetric, forms are considered, self-duality is recovered.

In what follows, we are going to use the terminology: “asymmetric” d-P$_n$. By that we mean that when this d-P is symmetrized (dropping the $(-1)^n$ terms) the continuous limit is P$_n$. The “asymmetric” equation is the one where the full freedom is maintained. The equations we are going to analyse in this paper are the asymmetric d-PII and q-PIII. The first has already been identified in [6] as a d-PIII, quite distinct from the well-known (symmetric) q-PIII, while the asymmetric form of the latter was shown in [7] to be in fact a form of q-PVI.


2. The asymmetric d-PII

The singularity confinement analysis of (the standard form of) d-PII has been performed in [4] and led to the following form compatible with integrability:

$$ u_{p-1} + u_{p+1} = \frac{(2p\delta + \alpha)u_p + \beta + \gamma(-1)^p}{u_p^2 - 1}. \tag{2.1} $$

The usual approach consists in dropping the $(-1)^p$ term since, at first sight, this term does not appear to have a continuous limit. The equation then goes over to PII at the continuous limit. The Miura and Bäcklund transformations of the symmetric d-PII have been studied in detail in [8]. Let us summarize here briefly the derivation of the auto-Bäcklund in the symmetric case (γ = 0). We start from (2.1) and introduce the following Miura transformation:

$$ v_p = (u_p - 1)(u_{p+1} + 1), \tag{2.2a} $$

$$ u_p = \frac{\beta + v_{p-1} - v_p}{v_{p-1} + v_p - \zeta_p}, \tag{2.2b} $$

where $\zeta_p = 2\delta p + \alpha$. Eliminating v between the two equations leads to d-PII (2.1). Eliminating u and with the substitution $w = v - \zeta/2 - \delta/2$ we find the equation:

$$ (w_p + \zeta_p/2 + \delta/2)(w_p + w_{p+1})(w_p + w_{p-1}) = (\beta - \delta)^2 - 4w_p^2. \tag{2.3} $$
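The statement that eliminating v between (2.2a) and (2.2b) gives back (2.1) with γ = 0 can be illustrated with exact rational arithmetic. The following is a sketch under stated assumptions: the function names and the initial data are arbitrary choices made here, and (2.2b) is tested in cross-multiplied form to avoid dividing by a vanishing denominator.

```python
from fractions import Fraction as F


def iterate_dpii(u0, u1, alpha, beta, delta, steps):
    """Iterate the symmetric d-PII (2.1) with gamma = 0; returns {p: u_p} for p = 0..steps."""
    u = {0: u0, 1: u1}
    for p in range(1, steps):
        zeta = 2 * delta * p + alpha
        u[p + 1] = (zeta * u[p] + beta) / (u[p] ** 2 - 1) - u[p - 1]
    return u


def miura_residuals(u, alpha, beta, delta):
    """Residuals of (2.2b), cross-multiplied, with v_p defined by (2.2a)."""
    v = {p: (u[p] - 1) * (u[p + 1] + 1) for p in u if p + 1 in u}
    residuals = []
    for p in v:
        if p - 1 in v:
            zeta = 2 * delta * p + alpha
            lhs = u[p] * (v[p - 1] + v[p] - zeta)
            rhs = beta + v[p - 1] - v[p]
            residuals.append(lhs - rhs)
    return residuals


# Arbitrary rational data, chosen only for illustration.
PARAMS = dict(alpha=F(1, 5), beta=F(1, 7), delta=F(1, 11))
ORBIT = iterate_dpii(F(1, 2), F(1, 3), steps=6, **PARAMS)
RESIDUALS = miura_residuals(ORBIT, **PARAMS)
```

Every entry of `RESIDUALS` should vanish exactly, since (2.2b) is equivalent to (2.1) once v is expressed through (2.2a).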

Equation (2.3) is the discrete form of equation P34 which plays the role of a “modified” PII. The important remark is that d-P34 depends on β through the square $(\beta - \delta)^2$. The construction of the auto-Bäcklund becomes now straightforward. We start with a u(β), i.e. a solution of (2.1) with parameter β. We transform through (2.2) to v(β − δ), and since (2.3) is invariant under a sign change of its parameter we can come back through the Miura to u(−(β − δ) + δ) = u(2δ − β). Since (2.1) is invariant under a simultaneous sign change of u and β we have u(2δ − β) = −u(β − 2δ). Performing this chain of transformations we find:

$$ u_p(\beta - 2\delta) = -u_p(\beta) + \frac{(\beta - \delta)(u_p(\beta) + 1)}{(u_p(\beta) - 1)(u_{p-1}(\beta) + 1) + (\beta - \zeta_p)/2}, \tag{2.4} $$

which is the auto-Bäcklund of the symmetric d-PII.

Let us now turn to the asymmetric d-PII. The interpretation of the $(-1)^p$ term is, in fact, straightforward. Separating even and odd terms in (2.1) one can transform it into a system:

$$ u_{2m-1} + u_{2m+1} = \frac{(4m\delta + \alpha)u_{2m} + \beta + \gamma}{u_{2m}^2 - 1}, \tag{2.5a} $$

$$ u_{2m} + u_{2m+2} = \frac{((4m+2)\delta + \alpha)u_{2m+1} + \beta - \gamma}{u_{2m+1}^2 - 1}. \tag{2.5b} $$

Putting $u_{2m} = x_n$, $u_{2m+2} = x_{n+2}$, $u_{2m-1} = y_{n-1}$, $u_{2m+1} = y_{n+1}$, so that the indices of the x's are even, those of the y's are odd, and introducing $z_n = n\delta + \alpha/2$, $a = -(\beta + \gamma)/2$, $b = (\gamma - \beta)/2$ we find:

$$ y_{n-1} + y_{n+1} = \frac{2z_n x_n - 2a}{x_n^2 - 1}, \tag{2.6a} $$

$$ x_n + x_{n+2} = \frac{2z_{n+1} y_{n+1} - 2b}{y_{n+1}^2 - 1}. \tag{2.6b} $$

This system has been analyzed in detail in [6] where it was shown that at the continuous limit it goes over to PIII. We put y = u/, z = t/, x = t/u, a = α/, b = β/, and find at the limit → 0 the equation $u'' = u'^2/u - u'/t + u^3/t^2 - \alpha u^2/t^2 + \beta/t - 1/u$, i.e. PIII (although in noncanonical form).

In order to simplify the presentation, we shall introduce a shorthand notation for each of the three directions of evolution (along z and along each of the parameters a and b, the latter evolutions being induced by the Schlesinger transforms). Thus we denote $v_n = v$, $v_{n+1} = \bar{v}$, $v_{n-1} = \underline{v}$, etc., where v is any of the x, y, (and the variables to be introduced later, w, τ). The evolution along the a axis will be represented by a tilde, i.e. $v(a + \delta) = \tilde{v}$, $v(a - \delta) = \underset{\sim}{v}$, while that of b will be represented by a hat, i.e. $v(b + \delta) = \hat{v}$, $v(b - \delta) = \underset{\wedge}{v}$. Using these notations, we can transcribe (2.6) into:

$$ \underline{y} + \bar{y} = \frac{2zx - 2a}{x^2 - 1}, \tag{2.7a} $$

$$ x + \bar{\bar{x}} = \frac{2(z + \delta)\bar{y} - 2b}{\bar{y}^2 - 1}. \tag{2.7b} $$

Next, we consider the Miura transformations introducing the auxiliary variable w:

$$ \tilde{w} = \bar{y} - \frac{z - a}{x - 1} = -\underline{y} + \frac{z + a}{x + 1}. \tag{2.8} $$

Similarly

$$ \underset{\sim}{w} = -\bar{y} + \frac{z + a}{x + 1} = \underline{y} - \frac{z - a}{x - 1}. \tag{2.9} $$

The dual equations to (2.7) can be obtained in a straightforward way. Combining (2.8) with (2.9) we get:

$$ \underset{\sim}{w} + \tilde{w} = \frac{2ax - 2z}{x^2 - 1}. \tag{2.10a} $$

The second equation requires the knowledge of $\tilde{\tilde{x}}$. This in turn is based on the implementation of the Schlesinger of asymmetric d-PII. Its derivation follows the one for the symmetric d-PII. We can thus compute $\tilde{\tilde{x}}$ and it turns out that the resulting equation is

$$ x + \tilde{\tilde{x}} = \frac{2(a + \delta)\tilde{w} - 2b}{\tilde{w}^2 - 1}. \tag{2.10b} $$
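The compatibility of the Miura pair (2.8), (2.9) with (2.7a), and the way their sum produces (2.10a), can be checked with exact rationals. This is a minimal sketch: the function name and the sample values are illustrative assumptions, and (2.10b) is not tested because it relies on the Schlesinger transformation, which is not spelled out in the text.

```python
from fractions import Fraction as F


def miura_check(x, y_up, z, a):
    """Return three booleans: (2.8) consistent, (2.9) consistent, (2.10a) satisfied.

    x, y_up (the up-shifted y), z and a are free inputs; the down-shifted y
    is obtained from (2.7a).
    """
    y_dn = (2 * z * x - 2 * a) / (x * x - 1) - y_up      # (2.7a) solved for the down-shifted y
    w_up_first = y_up - (z - a) / (x - 1)                # first form of (2.8)
    w_up_second = -y_dn + (z + a) / (x + 1)              # second form of (2.8)
    w_dn_first = -y_up + (z + a) / (x + 1)               # first form of (2.9)
    w_dn_second = y_dn - (z - a) / (x - 1)               # second form of (2.9)
    dual_rhs = (2 * a * x - 2 * z) / (x * x - 1)         # right-hand side of (2.10a)
    return (
        w_up_first == w_up_second,
        w_dn_first == w_dn_second,
        w_up_first + w_dn_first == dual_rhs,
    )
```

For example, `miura_check(F(3), F(2, 5), F(7, 3), F(1, 4))` returns three `True` values; exchanging the roles of z and a in (2.7a) is exactly what turns it into (2.10a).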

The self-duality is apparent. It suffices to compare (2.7) to (2.10): z and a have exchanged roles. Similarly we can introduce the Miura along the b direction. We have:

$$ \hat{\bar{w}} = \bar{\bar{x}} - \frac{(z + \delta) - b}{\bar{y} - 1} = -x + \frac{(z + \delta) + b}{\bar{y} + 1} \tag{2.11} $$

and

$$ \underset{\wedge}{\bar{w}} = -\bar{\bar{x}} + \frac{(z + \delta) + b}{\bar{y} + 1} = x - \frac{(z + \delta) - b}{\bar{y} - 1}. \tag{2.12} $$

Combining (2.11) and (2.12) we obtain:

$$ \underset{\wedge}{\bar{w}} + \hat{\bar{w}} = \frac{2b\bar{y} - 2(z + \delta)}{\bar{y}^2 - 1}, \tag{2.13a} $$

and using the Schlesinger for y we have:

$$ \bar{y} + \hat{\hat{\bar{y}}} = \frac{2(b + \delta)\hat{\bar{w}} - 2a}{\hat{\bar{w}}^2 - 1}. \tag{2.13b} $$

Again self-duality is evident. Now b has exchanged its role with z.

Fig. 1. Two consecutive planes (corresponding to the values b − δ and b of the parameter) in the cubic lattice covered by the solution of the asymmetric d-PII under the action of the corresponding Schlesinger transformations. The nonlinear variables x, y are shown, together with the τ-function, at their corresponding vertices.

Before proceeding to the bilinear formulation of the asymmetric d-PII and its Schlesinger's we must introduce a most important ingredient of our analysis, namely the geometry of these transformations. The z, a, b can be thought of as defining a cubic lattice. We present in Fig. 1 two consecutive planes. In order to simplify the treatment of the bilinear equations we have assumed that the plane where the initial x, y live corresponds to a value (b − δ), which explains why these variables appear in the figure as “down-hatted” $\underset{\wedge}{x}$, $\underset{\wedge}{\bar{y}}$. (This is in fact the reason for the convention adopted: the b plane contains one τ-function that is not shifted in any direction.) With these notations we remark readily that the only τ's that exist have all symbols (¯, ˜, ˆ) appearing in even numbers, or all in odd numbers. Similar, although more complicated, rules can be formulated for the x, y and w. For x, the number of shifts in the ¯ and ˜ directions have the same parity, which is opposite to the parity of the number of shifts in the ˆ direction. For y, the privileged direction is ˜, while for w, it is the ¯ direction.

From Fig. 1 (complemented mentally by the other horizontal planes corresponding to b + δ, b ± 2δ, etc.) we remark that each of the x, y, w, has two nearest neighbouring τ's. They lie in the z direction for w, the a direction for y and the b direction for x. There exist also for each of the x, y, w, four next-nearest neighbouring τ's in the diagonal along the two other directions. The ansatz for the bilinearization consists in writing 1 + x and 1 − x (similarly for y, w) as a ratio of two products of two τ-functions. The denominator is the product of the two nearest neighbouring τ-functions. Thus for $\underset{\wedge}{x}$ we have $\tau\,\underset{\wedge\wedge}{\tau}$. In the numerator two of the next nearest neighbouring τ-functions appear and since there are four of them it is clear that some choice must be made (resulting into an unavoidable asymmetry). We have thus:

$$ \underset{\wedge}{x} = \frac{\underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,\sim\,-}{\tau}}{\tau\,\underset{\wedge\wedge}{\tau}} - 1 = 1 - \frac{\underset{\wedge\,\sim}{\bar{\tau}}\;\underset{\wedge\,-}{\tilde{\tau}}}{\tau\,\underset{\wedge\wedge}{\tau}}. \tag{2.14} $$

Similarly we have:

$$ \underset{\wedge}{\bar{y}} = \frac{\bar{\bar{\tau}}\;\underset{\wedge\wedge}{\tau}}{\underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,\sim}{\bar{\tau}}} - 1 = 1 - \frac{\tau\;\underset{\wedge\wedge}{\bar{\bar{\tau}}}}{\underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,\sim}{\bar{\tau}}} \tag{2.15} $$

and

$$ \underset{\wedge}{\tilde{w}} = \frac{\tilde{\tilde{\tau}}\;\underset{\wedge\wedge}{\tau}}{\underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,-}{\tilde{\tau}}} - 1 = 1 - \frac{\tau\;\underset{\wedge\wedge}{\tilde{\tilde{\tau}}}}{\underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,-}{\tilde{\tau}}}. \tag{2.16} $$

A first equation can be obtained by equating the two expressions for $\underset{\wedge}{x}$ (and two more starting from $\underset{\wedge}{\bar{y}}$ and $\underset{\wedge}{\tilde{w}}$). We have

$$ \underset{\wedge}{\tilde{\bar{\tau}}}\;\underset{\wedge\,\sim\,-}{\tau} + \underset{\wedge\,\sim}{\bar{\tau}}\;\underset{\wedge\,-}{\tilde{\tau}} - 2\,\tau\,\underset{\wedge\wedge}{\tau} = 0. \tag{2.17} $$

Note that this equation is of Hirota–Miwa form. Moreover it is autonomous: none of z, a, b appears explicitly. It is most probable that by now even the most dedicated reader has trouble keeping track of the symbols (¯, ˜, ˆ). So we introduce a notation that will simplify the situation. We associate the directions z, a, b, to the indices 0, 1, 2. Moreover an up-shift will be represented by an upper index (appearing a number of times equal to the number of up-shifts) and a down-shift by a lower index. Following this rule we can rewrite (2.17) as

$$ \tau^{\{01\}}_{\{2\}}\,\tau_{\{012\}} + \tau^{\{1\}}_{\{02\}}\,\tau^{\{0\}}_{\{12\}} - 2\,\tau\,\tau_{\{22\}} = 0. \tag{2.18a} $$

The two remaining autonomous Hirota–Miwa equations can be easily written

$$ \tau^{\{00\}}\,\tau_{\{22\}} + \tau\,\tau^{\{00\}}_{\{22\}} - 2\,\tau^{\{01\}}_{\{2\}}\,\tau^{\{0\}}_{\{12\}} = 0 \tag{2.18b} $$

and

$$ \tau^{\{11\}}\,\tau_{\{22\}} + \tau\,\tau^{\{11\}}_{\{22\}} - 2\,\tau^{\{01\}}_{\{2\}}\,\tau^{\{1\}}_{\{02\}} = 0. \tag{2.18c} $$

The duality of Eqs. (2.18abc) is not easy to perceive because of the various up- and down-shifts due to the fact that the τ-functions exist only at certain points of the lattice. Had we written objects that do not exist the self-duality would have become transparent. Shifting Eqs. (2.18) in various directions we can formally rewrite them in explicitly self-dual form:

$$ \tau^{\{01\}}\,\tau_{\{01\}} + \tau^{\{1\}}_{\{0\}}\,\tau^{\{0\}}_{\{1\}} - 2\,\tau^{\{2\}}\,\tau_{\{2\}} = 0, \tag{2.19a} $$

$$ \tau^{\{02\}}\,\tau_{\{02\}} + \tau^{\{2\}}_{\{0\}}\,\tau^{\{0\}}_{\{2\}} - 2\,\tau^{\{1\}}\,\tau_{\{1\}} = 0, \tag{2.19b} $$

$$ \tau^{\{12\}}\,\tau_{\{12\}} + \tau^{\{1\}}_{\{2\}}\,\tau^{\{2\}}_{\{1\}} - 2\,\tau^{\{0\}}\,\tau_{\{0\}} = 0. \tag{2.19c} $$
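The index bookkeeping behind (2.18) and (2.19) can be made explicit by treating each τ as a point of the cubic lattice. The sketch below is an illustration, not the authors' construction: the encoding of an equation as a list of (coefficient, pair of sites) and the translation offsets are choices made here. It checks that every τ in (2.18) obeys the parity rule for existence, that a rigid translation maps each of (2.18a,b,c) to the corresponding equation of (2.19), and that the sites appearing in (2.19) violate the parity rule, which is why those forms are only formal.

```python
def tau(up="", down=""):
    """Lattice site (n0, n1, n2) of a tau-function with the given upper and lower indices."""
    shift = [0, 0, 0]
    for d in up:
        shift[int(d)] += 1
    for d in down:
        shift[int(d)] -= 1
    return tuple(shift)


def exists(site):
    """Parity rule: all three shifts even, or all three odd."""
    return len({n % 2 for n in site}) == 1


def normalise(equation):
    """Order-independent form of a list of (coefficient, (site, site)) terms."""
    return sorted((coeff, tuple(sorted(pair))) for coeff, pair in equation)


def translate(equation, offset):
    """Shift every tau of the equation by the same lattice vector."""
    return [
        (coeff, tuple(tuple(s + o for s, o in zip(site, offset)) for site in pair))
        for coeff, pair in equation
    ]


EQ_218 = {
    "a": [(1, (tau("01", "2"), tau("", "012"))), (1, (tau("1", "02"), tau("0", "12"))), (-2, (tau(), tau("", "22")))],
    "b": [(1, (tau("00"), tau("", "22"))), (1, (tau(), tau("00", "22"))), (-2, (tau("01", "2"), tau("0", "12")))],
    "c": [(1, (tau("11"), tau("", "22"))), (1, (tau(), tau("11", "22"))), (-2, (tau("01", "2"), tau("1", "02")))],
}
EQ_219 = {
    "a": [(1, (tau("01"), tau("", "01"))), (1, (tau("1", "0"), tau("0", "1"))), (-2, (tau("2"), tau("", "2")))],
    "b": [(1, (tau("02"), tau("", "02"))), (1, (tau("2", "0"), tau("0", "2"))), (-2, (tau("1"), tau("", "1")))],
    "c": [(1, (tau("12"), tau("", "12"))), (1, (tau("1", "2"), tau("2", "1"))), (-2, (tau("0"), tau("", "0")))],
}
# Translations (in the z, a, b directions) assumed here to relate (2.18) to (2.19).
OFFSETS = {"a": (0, 0, 1), "b": (-1, 0, 1), "c": (0, -1, 1)}
```

In this encoding the three equations of (2.19) are visibly permutations of one another under relabelling of the indices 0, 1, 2, which is the self-duality the text refers to.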

The Hirota–Miwa Eqs. (2.19) are not sufficient in order to characterize the asymmetric d-PII. They must be complemented by the equation resulting from the Miura transform (2.8–9, 11–12). Starting from the first part of (2.8), for instance, we find:

$$ \tilde{\tilde{\tau}}\;\underset{\wedge\,\sim}{\bar{\tau}} - \bar{\bar{\tau}}\;\underset{\wedge\,-}{\tilde{\tau}} = (z - a)\,\tau\,\underset{\wedge}{\tilde{\bar{\tau}}}, \tag{2.20} $$

or, in the notation with indices:

$$ \tau^{\{11\}}\,\tau^{\{0\}}_{\{12\}} - \tau^{\{00\}}\,\tau^{\{1\}}_{\{02\}} = (z_0 - z_1)\,\tau\,\tau^{\{01\}}_{\{2\}}. \tag{2.21} $$

Here $z_0$, $z_1$ and $z_2$ denote the independent variables in the directions z, a, b, i.e. $z_0 \equiv z = n\delta + \alpha/2$, $z_1 \equiv a = m\delta - (\beta + \gamma)/2$, $z_2 \equiv b = k\delta + (\gamma - \beta)/2$ with integer m and k. Similarly we can obtain two more equations that will be the duals of (2.21). However, (2.21) and its duals are not the only equations one can write. Indeed using the second part of (2.8) we have:

$$ \tau^{\{11\}}\,\tau_{\{012\}} - \tau_{\{00\}}\,\tau^{\{01\}}_{\{2\}} = (z_0 + z_1)\,\tau\,\tau^{\{1\}}_{\{02\}} \tag{2.22} $$

(and, quite expectedly, two more dual equations). However, (2.22) is not an essentially new equation. It can be deduced from (2.21) based on a general parity property of asymmetric d-PII. In fact, if we reverse the signs of a, b, but not of z, then x, y change sign but not w. Under this change, the index {0} stays where it is while {1, 2} move from upper position to lower and vice versa. Implementing these changes into (2.22) we find

$$ \tau_{\{11\}}\,\tau^{\{12\}}_{\{0\}} - \tau_{\{00\}}\,\tau^{\{02\}}_{\{1\}} = (z_0 - z_1)\,\tau\,\tau^{\{2\}}_{\{01\}}. \tag{2.23} $$

Upshifting this equation once in directions 0 and 1 and downshifting it once in direction 2 (note that $(z_0 - z_1)$ does not change) we find exactly (2.21). Thus asymmetric d-PII can be expressed as a system of three self-dual Hirota–Miwa equations in three dimensions. Two of these Hirota–Miwa equations are nonautonomous, and are related through a parity transformation.

3. The asymmetric q-PIII

The asymmetric form of q-PIII, although obtained in the initial derivation of the discrete analog of PIII, did not receive much attention until Jimbo and Sakai [7] showed that it is in fact a discrete form of PVI. Let us start directly with the asymmetric form written as:

$$ \underline{y}\,\bar{y} = \frac{(x + a)(x + b)}{(1 + x/c)(1 + x/d)}, \tag{3.1a} $$

$$ x\,\bar{\bar{x}} = \frac{(\bar{y} + p)(\bar{y} + r)}{(1 + \bar{y}/s)(1 + \bar{y}/t)}, \tag{3.1b} $$

where $a, b, p, r \propto \lambda^{-n}$ and $c, d, s, t \propto \lambda^{n}$. The parameters a, b, . . ., t are not all free. In the gauge we are using in this paper we have the constraints $cd = \lambda^{-2} st$ and $ab = \lambda^{2} pr$. Moreover a scaling transformation on x and y can be used in order to ensure that the common value of abcd and prst is unity, further reducing the number of degrees of freedom to five. (We must point out here that our notation, using λ in the independent variable is at odds with the usual one where q appears. Still we prefer to use this notation, to avoid ambiguities, since q will enter later in the unified description of the evolution and the Schlesinger's.) The Schlesinger transformations of asymmetric q-PIII have been studied in [9]. Let us for instance consider the Miura:

$$ \tilde{w} = \frac{1}{\underline{y}}\,\frac{x + b}{(1 + x/c)\sqrt{bc}} = \bar{y}\,\frac{(1 + x/d)\sqrt{ad}}{x + a}, \tag{3.2} $$

$$ \underset{\sim}{w} = \frac{1}{\bar{y}}\,\frac{x + b}{(1 + x/c)\sqrt{bc}} = \underline{y}\,\frac{(1 + x/d)\sqrt{ad}}{x + a}. \tag{3.3} $$

(We recall that abcd = 1.) Let us point out that if in (3.2) we replace b by a and c by d we recover $1/\underset{\sim}{w}$. So this would not be a new Miura. On the other hand replacing just c by d but keeping b we do obtain a Miura in a new direction. Moreover around $\bar{y}$ one can define two more directions by coupling, for instance, r to s or r to t.

It is straightforward to obtain the dual equation for the w variable:

$$ \underset{\sim}{w}\,\tilde{w} = \frac{(x + b)(x + d)}{(1 + x/c)(1 + x/a)}. \tag{3.4} $$
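The algebra linking (3.1a), the Miura pair (3.2), (3.3) and the dual equation (3.4) can be spot-checked numerically. This is a sketch under stated assumptions: the function name is illustrative, all parameters are taken positive so that the square roots are real, and d is fixed by the normalisation abcd = 1 recalled in the text.

```python
import math


def q_miura_check(x, y_dn, a, b, c, rel_tol=1e-12):
    """Check (3.2), (3.3) and (3.4) for given x, down-shifted y and parameters a, b, c.

    d is fixed by abcd = 1 and the up-shifted y is obtained from (3.1a).
    """
    d = 1.0 / (a * b * c)
    y_up = (x + a) * (x + b) / ((1 + x / c) * (1 + x / d)) / y_dn    # (3.1a)
    left = (x + b) / ((1 + x / c) * math.sqrt(b * c))                # factor shared by (3.2), (3.3)
    right = (1 + x / d) * math.sqrt(a * d) / (x + a)
    w_up_first, w_up_second = left / y_dn, y_up * right              # two forms of (3.2)
    w_dn_first, w_dn_second = left / y_up, y_dn * right              # two forms of (3.3)
    dual = (x + b) * (x + d) / ((1 + x / c) * (1 + x / a))           # right-hand side of (3.4)
    return (
        math.isclose(w_up_first, w_up_second, rel_tol=rel_tol),
        math.isclose(w_dn_first, w_dn_second, rel_tol=rel_tol),
        math.isclose(w_up_first * w_dn_first, dual, rel_tol=rel_tol),
    )
```

A call such as `q_miura_check(0.7, 1.3, 2.0, 0.5, 4.0)` should return three `True` values; dropping the constraint abcd = 1 breaks all three equalities.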

However, the second equation for x, relating the product $x\tilde{\tilde{x}}$ to $\tilde{w}$ involves more complicated parameters. Thus it is time to dispense with the notation involving symbols (¯, ˜) and introduce the analog to the one we employed in Sect. 2, based on indices. The basic geometry in the present case is a hypercubic five-dimensional lattice. We associate the index 0 to the direction n, and the indices 1, 2, 3, 4 to the four directions related to the Schlesinger's of asymmetric q-PIII. The expressions of the parameters a, b, . . ., t are now:

$$
\begin{aligned}
a &= q_0^{-1} q_1 q_2, & p &= (\lambda q_0)^{-1}(q_3/\lambda)(q_4/\lambda), \\
b &= q_0^{-1} q_1^{-1} q_2^{-1}, & r &= (\lambda q_0)^{-1}(q_3/\lambda)^{-1}(q_4/\lambda)^{-1}, \\
c &= q_0 q_1 q_2^{-1}, & s &= (\lambda q_0)(q_3/\lambda)(q_4/\lambda)^{-1}, \\
d &= q_0 q_1^{-1} q_2, & t &= (\lambda q_0)(q_3/\lambda)^{-1}(q_4/\lambda).
\end{aligned}
\tag{3.5}
$$
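One can confirm directly that the parametrisation (3.5) satisfies the constraints stated after (3.1), namely $cd = \lambda^{-2}st$, $ab = \lambda^{2}pr$ and abcd = prst = 1. The sketch below uses exact rationals; the function names and the sample values of the $q_i$ and λ are illustrative assumptions.

```python
from fractions import Fraction as F


def parameters(q0, q1, q2, q3, q4, lam):
    """Parameters a, b, c, d, p, r, s, t as given in (3.5)."""
    k3, k4 = q3 / lam, q4 / lam   # the combinations (q3/lambda), (q4/lambda)
    return {
        "a": q1 * q2 / q0,
        "b": 1 / (q0 * q1 * q2),
        "c": q0 * q1 / q2,
        "d": q0 * q2 / q1,
        "p": k3 * k4 / (lam * q0),
        "r": 1 / (lam * q0 * k3 * k4),
        "s": lam * q0 * k3 / k4,
        "t": lam * q0 * k4 / k3,
    }


def constraints_hold(q0, q1, q2, q3, q4, lam):
    """Check cd = st/lambda^2, ab = lambda^2 pr and abcd = prst = 1."""
    v = parameters(q0, q1, q2, q3, q4, lam)
    return (
        v["c"] * v["d"] == v["s"] * v["t"] / lam**2
        and v["a"] * v["b"] == lam**2 * v["p"] * v["r"]
        and v["a"] * v["b"] * v["c"] * v["d"] == 1
        and v["p"] * v["r"] * v["s"] * v["t"] == 1
    )
```

The check holds identically in the $q_i$ and λ, so any nonzero rational inputs, for instance `constraints_hold(F(2), F(3), F(5), F(7), F(11), F(13, 4))`, give `True`.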

The reason why we have written two quantities as $(q_3/\lambda)$ and $(q_4/\lambda)$ rather than just $q_3$, $q_4$, is the same as why in the previous section the plane where the variables x and y lived was indexed by (b − δ) rather than b. When the bilinear form will be considered the τ's will have more symmetrical indices. Indeed with this choice the number of shifts appearing in the indices of the τ's have all the same parity. On the other hand the variables x, y and w are shifted in such a way that three shifts have one parity and the two remaining have the other one. Thus we have $x = X_{\{34\}}$, i.e. $X(n_0, n_1, n_2, n_3 - 1, n_4 - 1)$, $\underline{y} = X_{\{034\}}$, $\bar{y} = X^{\{0\}}_{\{34\}}$, $\tilde{w} = X^{\{1\}}_{\{34\}}$, etc. The asymmetric q-PIII can now be rewritten as:

$$ X_{\{034\}}\,X^{\{0\}}_{\{34\}} = \frac{(X_{\{34\}} + q_0^{-1} q_1 q_2)(X_{\{34\}} + q_0^{-1} q_1^{-1} q_2^{-1})}{(X_{\{34\}}\, q_0^{-1} q_1^{-1} q_2 + 1)(X_{\{34\}}\, q_0^{-1} q_1 q_2^{-1} + 1)}, \tag{3.6a} $$

$$ X_{\{34\}}\,X^{\{00\}}_{\{34\}} = \frac{(X^{\{0\}}_{\{34\}} + (\lambda q_0)^{-1}(q_3/\lambda)(q_4/\lambda))(X^{\{0\}}_{\{34\}} + (\lambda q_0)^{-1}(q_3/\lambda)^{-1}(q_4/\lambda)^{-1})}{(X^{\{0\}}_{\{34\}}\,(\lambda q_0)^{-1}(q_3/\lambda)^{-1}(q_4/\lambda) + 1)(X^{\{0\}}_{\{34\}}\,(\lambda q_0)^{-1}(q_3/\lambda)(q_4/\lambda)^{-1} + 1)}. \tag{3.6b} $$

Similarly (3.4) becomes now:

$$ X_{\{134\}}\,X^{\{1\}}_{\{34\}} = \frac{(X_{\{34\}} + q_1^{-1} q_0 q_2)(X_{\{34\}} + q_1^{-1} q_0^{-1} q_2^{-1})}{(X_{\{34\}}\, q_1^{-1} q_0^{-1} q_2 + 1)(X_{\{34\}}\, q_1^{-1} q_0 q_2^{-1} + 1)}. \tag{3.7a} $$

It is now straightforward to write the second equation in the “1” direction, for x, $\tilde{\tilde{x}}$:

$$ X_{\{34\}}\,X^{\{11\}}_{\{34\}} = \frac{(X^{\{1\}}_{\{34\}} + (\lambda q_1)^{-1}(q_3/\lambda)(q_4/\lambda))(X^{\{1\}}_{\{34\}} + (\lambda q_1)^{-1}(q_3/\lambda)^{-1}(q_4/\lambda)^{-1})}{(X^{\{1\}}_{\{34\}}\,(\lambda q_1)^{-1}(q_3/\lambda)^{-1}(q_4/\lambda) + 1)(X^{\{1\}}_{\{34\}}\,(\lambda q_1)^{-1}(q_3/\lambda)(q_4/\lambda)^{-1} + 1)}. \tag{3.7b} $$

The self-duality is evident: Eqs. (3.7) are obtained from (3.6) simply by exchanging 0 and 1. Similarly one can introduce the evolution equation in the direction 2. For directions 3 and 4 we must use as a central point $\bar{y}$ instead of x. Self-duality is, of course, present in all directions.

Let us now turn to the bilinear formulation of asymmetric q-PIII and its Schlesinger's. Now, the geometry is such that each x-site (and similarly for y, w) has four nearest-neighbouring τ's. Consider for instance the variable $x = X_{\{34\}}$. Its nearest neighbouring sites correspond to τ-functions $\tau$, $\tau_{\{3344\}}$, $\tau_{\{33\}}$ and $\tau_{\{44\}}$. (Any τ-function with odd shifts will be among the next nearest-neighbours at best.) With respect to the positions of $X_{\{34\}}$ two of the τ-functions (namely $\tau$, $\tau_{\{3344\}}$) correspond to shifts of both indices either upwards or downwards, the remaining two ($\tau_{\{33\}}$ and $\tau_{\{44\}}$) correspond to shifts of one index upward and the other one downwards. The choice for the representation of the X in terms of τ-functions will be to write it as a ratio with the product of the first two τ-functions in the numerator while the product of the remaining two appear at the denominator:

$$ X_{\{34\}} = \frac{\tau\,\tau_{\{3344\}}}{\tau_{\{33\}}\,\tau_{\{44\}}} \tag{3.8} $$

(but the reverse would have been just as acceptable). We must point out here that the choice of the expression of X is self-duality preserving. Thus we expect the bilinear equations for the τ's to be also self-dual.

In order to obtain the bilinear equations associated to asymmetric q-PIII we start with the Miura (3.2). With the notations that we introduced, $\tilde{w} = X^{\{1\}}_{\{34\}}$ and $\underline{y} = X_{\{034\}}$. The expressions of these variables in terms of τ-functions read:

$$ X^{\{1\}}_{\{34\}} = \frac{\tau^{\{1\}}_{\{0234\}}\,\tau^{\{012\}}_{\{34\}}}{\tau^{\{12\}}_{\{034\}}\,\tau^{\{01\}}_{\{234\}}}, \tag{3.9} $$

$$ X_{\{034\}} = \frac{\tau_{\{01234\}}\,\tau^{\{12\}}_{\{034\}}}{\tau^{\{2\}}_{\{0134\}}\,\tau^{\{1\}}_{\{0234\}}}. \tag{3.10} $$

Substituting into (3.2) we find:

$$ X_{\{034\}}\,X^{\{1\}}_{\{34\}} \equiv \frac{\tau_{\{01234\}}\,\tau^{\{012\}}_{\{34\}}}{\tau^{\{2\}}_{\{0134\}}\,\tau^{\{01\}}_{\{234\}}} = \frac{(\tau\,\tau_{\{3344\}} + b\,\tau_{\{33\}}\,\tau_{\{44\}})/\sqrt{b}}{(\tau_{\{33\}}\,\tau_{\{44\}} + \tau\,\tau_{\{3344\}}/c)\sqrt{c}}, \tag{3.11} $$

where we have used expression (3.8) for $x = X_{\{34\}}$. Using the expressions of b and c, we equate the numerators and denominators of both sides of (3.11) (with the factor $1/\sqrt{b}$ being considered as being part of the numerator). We finally obtain the bilinear nonautonomous Hirota–Miwa equations:

$$ \tau_{\{01234\}}\,\tau^{\{012\}}_{\{34\}} = q_0^{1/2} q_1^{1/2} q_2^{1/2}\,\tau\,\tau_{\{3344\}} + q_0^{-1/2} q_1^{-1/2} q_2^{-1/2}\,\tau_{\{33\}}\,\tau_{\{44\}}, \tag{3.12a} $$

$$ \tau^{\{2\}}_{\{0134\}}\,\tau^{\{01\}}_{\{234\}} = q_0^{-1/2} q_1^{-1/2} q_2^{1/2}\,\tau\,\tau_{\{3344\}} + q_0^{1/2} q_1^{1/2} q_2^{-1/2}\,\tau_{\{33\}}\,\tau_{\{44\}}. \tag{3.12b} $$

Similarly to the asymmetric d-PII case these two equations are related through an up-downward symmetry. All the remaining equations can be obtained by duality transformations from these two nonautonomous Hirota–Miwa equations. Both can be written in a compact way as:

$$
\begin{aligned}
\tau(n_i + \sigma_i, n_j + \sigma_j, n_k + \sigma_k)\,\tau(n_i - \sigma_i, n_j - \sigma_j, n_k - \sigma_k) ={}& q_i^{\sigma_j\sigma_k/2}\, q_j^{\sigma_i\sigma_k/2}\, q_k^{\sigma_i\sigma_j/2}\,\tau(n_l - 1, n_m - 1)\,\tau(n_l + 1, n_m + 1) \\
&+ q_i^{-\sigma_j\sigma_k/2}\, q_j^{-\sigma_i\sigma_k/2}\, q_k^{-\sigma_i\sigma_j/2}\,\tau(n_l - 1, n_m + 1)\,\tau(n_l + 1, n_m - 1),
\end{aligned}
\tag{3.13}
$$

where (i, j, k, l, m) is any permutation of (0, 1, 2, 3, 4), all the σ's are ±1, and the indices not explicitly mentioned are assumed to be unshifted. This expression is an explicitly self-dual nonautonomous Hirota–Miwa which represents q-PIII and its Schlesinger transformations.

4. Conclusion

In this paper we have presented a property of the discrete Painlevé equations that sets them apart from the continuous ones: self-duality. First obtained in the case of the alternate d-PII, self-duality did not appear at the time to be characteristic of all dP's. The present work shows that when one considers the most general form of a given dP (without unnecessary assumptions leading to symmetrisation) this form is naturally self-dual. This allows for a unified description of a given dP and its Schlesinger transformation, what we have dubbed [10] the “Grand Scheme”. In fact, self-duality means that the same equation governs the evolution in the direction of the independent variable and in the direction of the parameters. Written in bilinear form this equation turns out to be the Hirota–Miwa, discrete Toda equation. Thus the dP's can be represented as systems of, in general nonautonomous, Hirota–Miwa equations governing the evolution in any direction, be it the independent variable or the parameters.

Two equations have been treated in this paper: the asymmetric d-PII and q-PIII which are known to be discrete forms of PIII and PVI. One problem that we plan to address in the near future is that of d-PI. This equation is the only one among the dP's that does not possess a bilinear but, rather, a trilinear form. It would be interesting to investigate whether the notion of self-duality applies to the asymmetric d-PI leading to a description in terms of the Hirota–Miwa equation.

References

1. Ramani, A., Grammaticos, B. and Hietarinta, J.: Phys. Rev. Lett. 67, 1829 (1991)
2. Fokas, A.S., Grammaticos, B. and Ramani, A.: J. Math. An. and Appl. 180, 342 (1993)
3. Nijhoff, F., Satsuma, J., Kajiwara, K., Grammaticos, B., Ramani, A.: Inv. Probl. 12, 697–716 (1996)
4. Grammaticos, B., Ramani, A. and Papageorgiou, V.: Phys. Rev. Lett. 67, 1825 (1991)
5. Conte, R. and Musette, M.: Phys. Lett. A 223, 43 (1996)
6. Grammaticos, B., Nijhoff, F.W., Papageorgiou, V., Ramani, A. and Satsuma, J.: Phys. Lett. A185, 446–452 (1994)
7. Jimbo, M. and Sakai, H.: Lett. Math. Phys. 38, 145 (1996)
8. Ramani, A. and Grammaticos, B.: J. Phys. A25, L633 (1992)
9. Jimbo, M., Sakai, H., Ramani, A. and Grammaticos, B.: Phys. Lett. A 217, 111 (1996)
10. Ramani, A. and Grammaticos, B.: The Grand Scheme for discrete Painlevé equations. Lecture at the Toda symposium (1996)

Communicated by T. Miwa

Commun. Math. Phys. 192, 77–120 (1998)


Geometry and Classification of Solutions of the Classical Dynamical Yang–Baxter Equation*

Pavel Etingof1, Alexander Varchenko2

1 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
2 Department of Mathematics, Phillips Hall, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA. E-mail: [email protected]

Received: 24 March 1997 / Accepted: 20 June 1997

Abstract: The classical Yang–Baxter equation (CYBE) is an algebraic equation central in the theory of integrable systems. Its nondegenerate solutions were classified by Belavin and Drinfeld. Quantization of CYBE led to the theory of quantum groups. A geometric interpretation of CYBE was given by Drinfeld and gave rise to the theory of Poisson–Lie groups. The classical dynamical Yang–Baxter equation (CDYBE) is an important differential equation analogous to CYBE and introduced by Felder as the consistency condition for the differential Knizhnik–Zamolodchikov–Bernard equations for correlation functions in conformal field theory on tori. Quantization of CDYBE allowed Felder to introduce an interesting elliptic analog of quantum groups. It becomes clear that numerous important notions and results connected with CYBE have dynamical analogs. In this paper we classify solutions to CDYBE and give geometric interpretation to CDYBE. The classification and interpretation are remarkably analogous to the Belavin–Drinfeld picture.

0. Introduction

0.1. In 1984 Knizhnik and Zamolodchikov showed that correlation functions of conformal blocks on $P^1$ for the Wess–Zumino–Witten (WZW) conformal field theory for a simple Lie algebra g satisfy the differential equations

$$ \kappa \frac{\partial F}{\partial z_i} = \sum_{j \neq i} \frac{\Omega_{ij}}{z_i - z_j} F, \qquad i = 1, ..., N, \tag{1} $$

where F is an analytic function of N complex variables $z_1, ..., z_N$ with values in $V_1 \otimes ... \otimes V_N$, $V_1, ..., V_N$ are representations of g, $\Omega_{ij}$ is the Casimir operator $\Omega \in (S^2 g)^g$

* The authors were supported in part by an NSF postdoctoral fellowship and NSF grant DMS-9501290.

78

P. Etingof, A. Varchenko

acting in the ith and the j th component, and κ is a complex number. Equations (1) are called the Knizhnik–Zamolodchikov (KZ) equations. They play an important role in representation theory and mathematical physics. Later Cherednik [C] generalized the KZ equations: he considered differential equations of the form ∂F X ij = r (zi − zj )F, (2) κ ∂zi j6=i

where $r(z)$ is a meromorphic function on $\mathbb{C}$ with values in $\mathfrak{g} \otimes \mathfrak{g}$ satisfying the unitarity condition $r(z) = -r^{21}(-z)$. Cherednik pointed out that system (2) is consistent iff $r(z)$ satisfies the classical Yang–Baxter equation (CYBE):

$$[r^{12}(z_1 - z_2), r^{13}(z_1 - z_3)] + [r^{12}(z_1 - z_2), r^{23}(z_2 - z_3)] + [r^{13}(z_1 - z_3), r^{23}(z_2 - z_3)] = 0. \tag{3}$$

In particular, for the simplest solution of (3), Yang's solution $r(z) = \Omega/z$, we get the KZ equations.

The geometric meaning of Eq. (3) was found by Drinfeld [Dr]. Namely, he showed that any solution $r$ of (3), independent of $z$ and satisfying the condition $r + r^{21} \in (S^2\mathfrak{g})^{\mathfrak{g}}$, defines a natural Poisson–Lie structure on a Lie group $G$ whose Lie algebra is $\mathfrak{g}$. He also found that $z$-dependent solutions of (3) define analogous structures on the loop group $LG$.

If $\mathfrak{g}$ is a simple Lie algebra, then Eq. (3) has many interesting solutions, both $z$-independent and $z$-dependent. These solutions, satisfying an additional nondegeneracy condition, were classified by Belavin and Drinfeld [BD1, BD2]. In the $z$-dependent case, three types of solutions were found: rational, trigonometric, and elliptic (in terms of $z$).

0.2. In [B] Bernard found that correlation functions for WZW conformal blocks on an elliptic curve satisfy the differential equations

$$\kappa \frac{\partial F}{\partial z_i} = \sum_{j \ne i} r_{KZB}^{ij}(\lambda, z_i - z_j)\, F + \sum_l x_l^{(i)} \frac{\partial F}{\partial x_l}, \tag{4}$$

where $\lambda \in \mathfrak{h}^*$, $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}$, $F$ is an analytic function of complex variables $z_1, \dots, z_N$ and of a point $\lambda$, with values in $(V_1 \otimes \dots \otimes V_N)^{\mathfrak{h}}$, $r_{KZB}(\lambda, z)$ is a particular meromorphic (in fact, elliptic) function with values in $(\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$, and $\{x_l\}$ is a basis of $\mathfrak{h}$, which is also regarded as a linear system of coordinates on $\mathfrak{h}^*$. Equations (4) are called the Knizhnik–Zamolodchikov–Bernard (KZB) equations.

Equations (4) are not of type (2), since they contain derivatives with respect to $x_l$ on the right-hand side. Therefore, it is not surprising that $r_{KZB}$ does not satisfy the classical Yang–Baxter equation. However, as was pointed out by Felder [F1], $r_{KZB}$ satisfies a generalization of the classical Yang–Baxter equation:

$$\begin{aligned} &\sum_l x_l^{(1)} \frac{\partial r^{23}(z_2 - z_3)}{\partial x_l} + \sum_l x_l^{(2)} \frac{\partial r^{31}(z_1 - z_3)}{\partial x_l} + \sum_l x_l^{(3)} \frac{\partial r^{12}(z_1 - z_2)}{\partial x_l} \\ &\quad + [r^{12}(z_1 - z_2), r^{13}(z_1 - z_3)] + [r^{12}(z_1 - z_2), r^{23}(z_2 - z_3)] + [r^{13}(z_1 - z_3), r^{23}(z_2 - z_3)] = 0. \end{aligned} \tag{5}$$

Moreover, this equation is a necessary and sufficient condition for (4) to be consistent (for an arbitrary meromorphic function $r(\lambda, z)$ satisfying the unitarity condition $r(\lambda, z) = -r^{21}(\lambda, -z)$). This equation is called the classical dynamical Yang–Baxter equation (we

Classical Dynamical Yang–Baxter Equation


will abbreviate it as CDYBE, or CDYB equation). The word “dynamical” refers to the fact that (5) is a differential rather than an algebraic equation, so it reminds one of a dynamical system.

This paper has two goals:
1. To exhibit the geometric meaning of CDYBE, similarly to Drinfeld’s interpretation of CYBE within the framework of the theory of Poisson–Lie groups.
2. To classify solutions of CDYBE for simple Lie algebras and Kac–Moody algebras (over $\mathbb{C}$).

0.3. The first goal is attained in Chapters 1 and 2. Namely, there we consider solutions of (5), for any pair of finite-dimensional Lie algebras $\mathfrak{h} \subset \mathfrak{g}$, which are independent of $z$, $\mathfrak{h}$-invariant, and satisfy the generalized unitarity condition: $r + r^{21}$ is a constant, invariant element of $S^2\mathfrak{g}$. To give geometric meaning to such solutions, we define and study the notion of a dynamical Poisson groupoid, which is a special case of the notion of a Poisson groupoid introduced by Weinstein [W]. We show that any solution of (5) of the above type naturally defines a dynamical Poisson groupoid, which, as a manifold, equals $U \times G \times U$, where $U \subset \mathfrak{h}^*$ is an open set. This construction illustrates how Eq. (5) arises naturally in the theory of Poisson groupoids. When $\mathfrak{h} = 0$, this construction reduces to the Drinfeld construction of a Poisson–Lie group from a solution of CYBE. As in the case of the usual CYBE, $z$-dependent solutions of (5) define analogous structures on the loop group $LG$.

0.4. The second, more technically challenging goal is (partially) attained in Chapters 3 and 4. In Chapter 3, we consider $z$-independent solutions of CDYBE which satisfy the condition that $r + r^{21}$ is a constant invariant element, when $\mathfrak{g}$ is a simple finite-dimensional Lie algebra or, more generally, a Kac–Moody algebra. In this case, $r + r^{21} = \varepsilon\Omega$, where $\Omega$ is the Casimir element and $\varepsilon$ is a number called the coupling constant. For a simple Lie algebra $\mathfrak{g}$, we classify all solutions. It turns out that there are two types of solutions: rational (with zero coupling constant) and trigonometric (with nonzero coupling constant). If $\mathfrak{g}$ is an arbitrary Kac–Moody algebra, we classify solutions satisfying some additional conditions. Again, we find two types of solutions: rational and trigonometric. We also classify invariant solutions of CDYBE for a pair of Lie algebras $\mathfrak{l} \subset \mathfrak{g}$, where $\mathfrak{g}$ is a finite-dimensional simple Lie algebra, and $\mathfrak{l}$ is a reductive Lie subalgebra of $\mathfrak{g}$ containing the Cartan subalgebra $\mathfrak{h}$. The classification is obtained by reduction of CDYBE for the pair $\mathfrak{l} \subset \mathfrak{g}$ to CDYBE for the pair $\mathfrak{h} \subset \mathfrak{g}$.

In Chapter 4 we are concerned with $z$-dependent solutions of CDYBE. We consider such solutions satisfying the unitarity condition and the condition that the residue of $r(\lambda, z)$ at $z = 0$ equals $\varepsilon\Omega$. As before, $\varepsilon$ is called the coupling constant. We classify all solutions with nonzero coupling constant. It turns out that there are three types of solutions: rational, trigonometric, and elliptic. We also explain that $z$-independent solutions for the affine Lie algebra $\tilde{\mathfrak{g}}$ can be interpreted as $z$-dependent solutions of CDYBE for $\mathfrak{g}$ with $\varepsilon \ne 0$.

0.5. The CDYB equation has a quantum analogue. This quantum equation is called the quantum dynamical Yang–Baxter equation (QDYBE). It was first introduced by Gervais and Neveu [GN] and later by Felder [F1], as a quantization of (5). This equation has important applications in the theory of integrable systems [ABB].


QDYBE is a generalization of the usual quantum Yang–Baxter equation, and it gives rise to the notion of a dynamical Hopf algebroid (or dynamical quantum groupoid), in the same way as the usual quantum Yang–Baxter equation gives rise to the notion of a Hopf algebra (quantum group). The notion of a dynamical Hopf algebroid is a quantization of the notion of a dynamical Poisson groupoid, discussed in this paper. An example of a dynamical Hopf algebroid, which is not a Hopf algebra, is the elliptic quantum group of [F1, FV]. In the following papers we will define and study the notion of a dynamical Hopf algebroid, give examples, and show how such objects arise naturally in representation theory of affine Lie algebras and quantum groups, and in conformal field theory.
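As a sanity check on the statement in 0.1 that Yang's solution $r(z) = \Omega/z$ satisfies (3), the following minimal sketch evaluates the left-hand side of (3) exactly for $\mathfrak{g} = sl_2$. The choices here are assumptions made for illustration and are not taken from the paper: the Casimir tensor is written as $e \otimes f + f \otimes e + \tfrac12 h \otimes h$, the check is carried out in the two-dimensional representation (which is faithful, so a vanishing result there reflects vanishing in $\mathfrak{g}^{\otimes 3}$), and the helper names are hypothetical.

```python
from fractions import Fraction as Fr

def kron(a, b):
    """Kronecker product of square matrices stored as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def mul(a, b):
    """Matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...)."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Matrix commutator [a, b]."""
    return lin((1, mul(a, b)), (-1, mul(b, a)))

def is_zero(m):
    return all(x == 0 for row in m for x in row)

# Standard basis of sl_2 in the 2-dimensional representation (assumed normalization).
ONE = [[1, 0], [0, 1]]
E, F, H = [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[1, 0], [0, -1]]

def omega(i, j):
    """Omega = e(x)f + f(x)e + h(x)h/2 acting in legs i, j of a triple tensor product."""
    terms = []
    for c, a, b in [(Fr(1), E, F), (Fr(1), F, E), (Fr(1, 2), H, H)]:
        legs = [ONE, ONE, ONE]
        legs[i], legs[j] = a, b
        terms.append((c, kron(kron(legs[0], legs[1]), legs[2])))
    return lin(*terms)

O12, O13, O23 = omega(0, 1), omega(0, 2), omega(1, 2)

def cybe_lhs(z1, z2, z3):
    """Left-hand side of (3) for r(z) = Omega / z, at rational sample points."""
    r12 = lin((1 / (z1 - z2), O12))
    r13 = lin((1 / (z1 - z3), O13))
    r23 = lin((1 / (z2 - z3), O23))
    return lin((1, comm(r12, r13)), (1, comm(r12, r23)), (1, comm(r13, r23)))

residual = cybe_lhs(Fr(3), Fr(1, 2), Fr(-7, 5))
print(is_zero(residual))
```

Exact rational arithmetic is used so that the check does not depend on a floating-point tolerance. The individual commutators such as $[\Omega^{12}, \Omega^{13}]$ are nonzero; only the weighted sum in (3) cancels.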

1. Dynamical Poisson Groupoids

1.1. Definition of a dynamical Poisson groupoid. Recall that a groupoid is a category where all morphisms are isomorphisms. In this paper, we consider only groupoids whose objects and morphisms form a set, and not only a class (i.e. groupoids built on small categories). Thus, a groupoid is defined by the following data: a set $X$ (of morphisms, or arrows) called the groupoid itself, a set $P$ (of objects) called the base of $X$, two surjective maps $s, t : X \to P$ (source and target), a map $m : \{(a, b) \in X \times X : s(a) = t(b)\} \to X$ (called the arrow composition map), and an injective map $E : P \to X$ (the identity morphism, $E(p) = \mathrm{id}_p$), satisfying a number of obvious conditions. One of these conditions is the existence of an involution $i : X \to X$ defined by the conditions $s(i(x)) = t(x)$, $s(x) = t(i(x))$, $m(x, i(x)) = \mathrm{id}_{t(x)}$, $m(i(x), x) = \mathrm{id}_{s(x)}$. For brevity, when we talk about a particular groupoid, we will refer to its set of morphisms $X$.

A groupoid with one object is a group. Thus, the notion of a groupoid is a generalization of the notion of a group. The role of the unit in a groupoid is played by the map $E$, and the role of inversion by $i$.

A Lie groupoid is a groupoid with a smooth structure (the sets of objects and morphisms are smooth manifolds, the structure maps are smooth, and some additional conditions hold, see [M]).

According to [W], a Poisson groupoid is a Lie groupoid $X$ with a Poisson bracket such that the graph of the composition map (defined only on a subset of $X \times X$) is a coisotropic submanifold of $X \times X \times \bar{X}$, where $\bar{X}$ is the opposite Poisson manifold to $X$. For example, if $|P| = 1$ (i.e. $X$ is a Lie group), the structure of a Poisson groupoid is the same as the structure of a Poisson–Lie group.

In this section we will define a class of Poisson groupoids which we call dynamical Poisson groupoids. Let $G$ be a Lie group, and $\mathfrak{g}$ its Lie algebra. Let $H$ be a connected Lie subgroup of $G$, and $\mathfrak{h}$ the Lie algebra of $H$. Let $U$ be an open subset of $\mathfrak{h}^*$ which is invariant under the coadjoint action. Consider the manifold $X(G, H, U) = U \times G \times U$. This manifold has a natural structure of a Lie groupoid: $X = X(G, H, U)$, $P = U$, $s(u_1, g, u_2) = u_2$, $t(u_1, g, u_2) = u_1$, $E(u) = (u, 1, u)$, and $m((u_1, f, u_2), (u_2, g, u_3)) = (u_1, fg, u_3)$. In the theory of groupoids, this groupoid is called the direct product of the trivial groupoid with base $U$ and the group $G$. The inversion of the groupoid $X$ is given by $i(u_1, g, u_2) = (u_2, g^{-1}, u_1)$.
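The structure maps of $X(G, H, U) = U \times G \times U$ listed above can be sketched on a finite toy model. This is only an illustration of the set-theoretic groupoid axioms: the group $S_3$ and the three-element base are stand-ins chosen here (assumptions, not the paper's setting), and no smooth or Poisson structure is modelled.

```python
from itertools import permutations, product

G = list(permutations(range(3)))   # toy group S_3 (assumption)
U = ["u0", "u1", "u2"]             # toy base, a stand-in for the open set U
UNIT = (0, 1, 2)

def gmul(f, g):
    """Group product in S_3: (f g)(i) = f(g(i))."""
    return tuple(f[g[i]] for i in range(3))

def ginv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def s(x): return x[2]              # source of the arrow (u1, g, u2)
def t(x): return x[0]              # target of the arrow (u1, g, u2)
def E(u): return (u, UNIT, u)      # identity arrow at u
def inv(x): return (x[2], ginv(x[1]), x[0])

def m(a, b):
    """Composition of arrows, defined only when s(a) = t(b)."""
    if s(a) != t(b):
        raise ValueError("not composable: s(a) != t(b)")
    return (a[0], gmul(a[1], b[1]), b[2])

X = list(product(U, G, U))

def check_axioms():
    """Brute-force check of the unit, inverse and associativity axioms."""
    for x in X:
        assert m(E(t(x)), x) == x and m(x, E(s(x))) == x
        assert m(x, inv(x)) == E(t(x)) and m(inv(x), x) == E(s(x))
    for a, b, c in product(X, repeat=3):
        if s(a) == t(b) and s(b) == t(c):
            assert m(m(a, b), c) == m(a, m(b, c))
    return True

print(check_axioms())
```

The partial composition raises an error outside its domain, mirroring the fact that $m$ is defined only on pairs with $s(a) = t(b)$.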


The manifold $X$ carries a pair of commuting actions of $H$: the left action given by $l(h)(u_1, g, u_2) = (h u_1 h^{-1}, hg, u_2)$, and the right action given by $r(h)(u_1, g, u_2) = (u_1, gh, h^{-1} u_2 h)$ ($h u h^{-1}$ denotes, here and below, the coadjoint action of $h$ on $u$). The manifold $X \times X$ carries a (left) action of $H$ given by $\Delta(h)(x, y) = (r(h)^{-1} x, l(h) y)$. We will call this action the diagonal action. This action preserves the composition map.

For $a \in \mathfrak{h}$, let $a_1, a_2$ be the functions on $X$ given by $a_1(u_1, g, u_2) = a(u_1)$, $a_2(u_1, g, u_2) = a(u_2)$. Let $\{\,,\}$ be a Poisson bracket on $X$.

Definition. The pair $(X, \{\,,\})$ is said to be a dynamical Poisson groupoid if the following two conditions are satisfied:
(i) The actions $l, r$ are Hamiltonian, the maps $t, s$ are moment maps for them, and for any $a, b \in \mathfrak{h}$ one has $\{a_1, b_2\} = 0$.
(ii) Let $X \bullet X := X \times X /\!/ \Delta(H)$ be the Hamiltonian reduction of $X \times X$ by the diagonal action of $H$, and let $\bar{m} : X \bullet X \to X$ be the reduction by $H$ of the composition map $m$ of $X$. Then $\bar{m}$ is a Poisson map.

Remark. Let us explain condition (ii). If condition (i) is satisfied, the diagonal action of $H$ is Hamiltonian, and $\mu(x, y) = t(y) - s(x)$ is a moment map for this action. Therefore, the set of zeros of the moment map, $\mu^{-1}(0)$, is precisely the domain of the map $m$. Thus, the definition of $\bar{m}$ makes sense. The space $X \bullet X$ equals $U \times (Y/H) \times U$, where $Y = G \times U \times G$, and $H$ acts on $Y$ by $h \circ (f, u, g) = (f h^{-1}, h u h^{-1}, h g)$. The space $X \bullet X$ has the Poisson structure of the Hamiltonian reduction. Therefore, it makes sense to require that the map $\bar{m}$ is Poisson.

If $H = 1$, then $\{\,,\}$ is a Poisson bracket on $G$. Condition (i) is vacuous, and condition (ii) says that the multiplication map in $G$ is Poisson. Thus, a dynamical Poisson groupoid with $H = 1$ is simply a Poisson–Lie group.

Let us compute the general form of the Poisson bracket on a dynamical Poisson groupoid $X = U \times G \times U$. Let $f$ be any function on $X$ which is pulled back from $G$. By the definition, we have the following Poisson commutation relations:

$$\{a_1, b_2\} = 0, \quad \{a_1, b_1\} = -[a, b]_1, \quad \{a_2, b_2\} = [a, b]_2, \quad \{a_1, f\} = R_a f, \quad \{a_2, f\} = L_a f, \tag{1.1}$$

where $L_a, R_a$ are the left- and the right-invariant vector fields on $G$ corresponding to $a$. So the only freedom that we have is in the bracket of functions on $G$.

1.2. The Hamiltonian unit. For Poisson–Lie groups, it is known that the unit $E : \{1\} \to G$ is a Poisson map. In the case of dynamical Poisson groupoids, this property fails: the image of $E$ is not a Poisson submanifold of $X$. Therefore it makes sense to extend $E$ so that its image becomes the smallest Poisson manifold containing the image of $E$. This is done by introducing the Hamiltonian unit.

Let $(T^*H)_U$ be the set of all $(h, p) \in T^*H$ ($h \in H$, $p \in T_h^*H$) such that $h^{-1}p \in U$. We equip $(T^*H)_U$ with the standard symplectic structure.

Definition. The Hamiltonian unit of a dynamical Poisson groupoid $X$ is the map $e : (T^*H)_U \to X$ given by $e(h, p) = (p h^{-1}, h, h^{-1} p)$.

It is easy to check that this map is Poisson, and its image is the smallest Poisson submanifold of $X$ containing the image of $E$.


1.3. Poisson groupoids and the CDYB equation. In this section we will construct examples of dynamical Poisson groupoids, and will be naturally led to the classical dynamical Yang–Baxter equation (CDYBE).

We work in the setting of Sect. 1.1. Namely, we are considering the Lie groupoid $X = X(G, H, U)$. We want to make $X$ into a dynamical Poisson groupoid. As we know, the Poisson bracket on $X$ is partially defined by (1.1), and it remains to define the Poisson bracket of two functions pulled back from $G$.

Recall [Dr] that if $K$ is a Poisson–Lie group, and $\mathfrak{k}$ its Lie algebra, then a coboundary Poisson–Lie structure on $K$ is a Poisson–Lie structure with the Poisson bivector of the form

$$\pi = R(\rho) - L(\rho), \tag{1.2}$$

where $\rho \in \Lambda^2\mathfrak{k}$, and $L(\rho), R(\rho)$ are the left- and the right-invariant tensor fields equal to $\rho$ at the identity. It is known [Dr] that (1.2) defines a Poisson–Lie structure iff

$$\mathrm{CYB}(\rho) := [\rho^{12}, \rho^{13}] + [\rho^{12}, \rho^{23}] + [\rho^{13}, \rho^{23}] \in (\Lambda^3\mathfrak{k})^{\mathfrak{k}}. \tag{1.3}$$

By analogy with this definition, we will look for a Poisson bracket $\pi$ on $X$ such that for any functions $f_1, f_2$ pulled back from $G$,

$$\{f_1, f_2\} = (df_1 \otimes df_2)\big(R(\rho(u_1)) - L(\rho(u_2))\big), \tag{1.4}$$

where $\rho : U \to \Lambda^2\mathfrak{g}$ is a smooth function.

For a Lie algebra $\mathfrak{g}$ and a tensor $Z \in \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}$, define $\mathrm{Alt}(Z) = Z^{123} + Z^{231} + Z^{312}$. Let $\rho : U \to \Lambda^2\mathfrak{g}$ be a smooth function. Then the differential $d\rho$ is a 1-form on $U$ with coefficients in $\Lambda^2\mathfrak{g}$, so it can be considered as a function on $U$ with values in $\mathfrak{h} \otimes \Lambda^2\mathfrak{g} \subset \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}$. Define the classical dynamical Yang–Baxter functional

$$\mathrm{CDYB}(\rho) = \mathrm{Alt}(d\rho) + \mathrm{CYB}(\rho). \tag{1.5}$$

Theorem 1.1. Formulas (1.1), (1.4) define a Poisson structure on $X$ if and only if (i) $\rho$ is $H$-equivariant, and (ii) the element $Z = \mathrm{CDYB}(\rho(u))$ is constant (in $u$) and lies in $(\Lambda^3\mathfrak{g})^{\mathfrak{g}}$.

Proof. Property (i) is equivalent to the Jacobi identity for three functions $a_1, f_1, f_2$, where $a \in \mathfrak{h}$, and $f_1, f_2$ are pulled back from $G$ (here, as usual, $a$ is regarded as a linear function on $U$). Also, when (i) is satisfied, (ii) is equivalent to the Jacobi identity for three functions $f_1, f_2, f_3$ pulled back from $G$.

Now let $\rho$ satisfy conditions (i), (ii). Then $X$ equipped with the Poisson bracket $\{\,,\}$ defined by $\rho$ is a dynamical Poisson groupoid. Indeed, it is easy to see that the composition map $\bar{m} : X \bullet X \to X$ given by $(u_1, g_1, u_2) \bullet (u_2, g_2, u_3) = (u_1, g_1 g_2, u_3)$ is Poisson.

Definition. A dynamical Poisson groupoid defined by (1.4) is called a coboundary dynamical Poisson groupoid.
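To see condition (ii) of Theorem 1.1 at work, the functional (1.5) can be evaluated in the smallest example. The sketch below is an illustration under stated assumptions, not a computation from the paper: it takes $\mathfrak{g} = sl_2$ with $\mathfrak{h}$ spanned by $h$, uses $x = \lambda(h)$ as the coordinate on $\mathfrak{h}^*$ so that $d\rho$ corresponds to $h \otimes \partial\rho/\partial x$, restricts to the ansatz $\rho = \varphi(x)\,(e \otimes f - f \otimes e)$, and works in the two-dimensional (faithful) representation. All helper names are hypothetical.

```python
from fractions import Fraction as Fr

def kron(a, b):
    """Kronecker product of square matrices stored as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...)."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    return lin((1, mul(a, b)), (-1, mul(b, a)))

def is_zero(m):
    return all(x == 0 for row in m for x in row)

ONE = [[1, 0], [0, 1]]
E, F, H = [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[1, 0], [0, -1]]

def legs(a, b, c):
    """The tensor a(x)b(x)c as an 8x8 matrix."""
    return kron(kron(a, b), c)

def rot(idx):
    """Index map realizing the cyclic shift a(x)b(x)c -> c(x)a(x)b."""
    i1, i2, i3 = idx >> 2, (idx >> 1) & 1, idx & 1
    return (i2 << 2) | (i3 << 1) | i1

def cyc(z):
    return [[z[rot(r)][rot(c)] for c in range(8)] for r in range(8)]

def alt(z):
    """Alt(Z) = Z^{123} + Z^{231} + Z^{312}: sum over cyclic shifts of the legs."""
    return lin((1, z), (1, cyc(z)), (1, cyc(cyc(z))))

# rho_0 = e(x)f - f(x)e placed in legs (1,2), (1,3), (2,3).
R12 = lin((1, legs(E, F, ONE)), (-1, legs(F, E, ONE)))
R13 = lin((1, legs(E, ONE, F)), (-1, legs(F, ONE, E)))
R23 = lin((1, legs(ONE, E, F)), (-1, legs(ONE, F, E)))

CYB0 = lin((1, comm(R12, R13)), (1, comm(R12, R23)), (1, comm(R13, R23)))
ALT0 = alt(lin((1, legs(H, E, F)), (-1, legs(H, F, E))))   # Alt(h (x) rho_0)

def cdyb(phi, dphi):
    """CDYB(rho) at a point, for rho = phi(x) rho_0 with phi'(x) = dphi."""
    return lin((dphi, ALT0), (phi * phi, CYB0))

x = Fr(7, 3)
print(ALT0 == CYB0, is_zero(cdyb(1 / x, -1 / x**2)))
```

In this ansatz $\mathrm{Alt}(h \otimes \rho_0)$ coincides with $\mathrm{CYB}(\rho_0)$, so $\mathrm{CDYB}(\varphi\rho_0) = (\varphi' + \varphi^2)\,\mathrm{CYB}(\rho_0)$. Condition (ii) then reduces to $\varphi' + \varphi^2$ being constant, and $\varphi(x) = 1/x$ gives $\mathrm{CDYB}(\rho) = 0$, which appears consistent with the rational type mentioned in 0.4 (the normalization used here is an assumption).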


Recall [Dr] that a coboundary Poisson–Lie group $K$ defined by (1.2) is called quasitriangular if it is equipped with a constant element $T \in (S^2\mathfrak{g})^{\mathfrak{g}}$ such that $\mathrm{CYB}(\rho) = \frac{1}{4}[T^{12}, T^{23}]$. In this case the element $r = \rho + \frac{1}{2}T$ is a solution of the classical Yang–Baxter equation, and is called the classical r-matrix of $G$. If $T = 0$, then the quasitriangular structure defined by $T$ is called triangular. Thus, quasitriangular structures on $G$ are parametrized by solutions $r$ of the CYB equation such that $r + r^{21}$ is a $\mathfrak{g}$-invariant element in $S^2\mathfrak{g}$. Triangular structures are parametrized by skew-symmetric solutions of the CYB equation.

Definition. A coboundary dynamical Poisson groupoid is called quasitriangular if it is equipped with a constant element $T \in (S^2\mathfrak{g})^{\mathfrak{g}}$ such that $\mathrm{CDYB}(\rho) = \frac{1}{4}[T^{12}, T^{23}]$. If $T = 0$, the quasitriangular structure defined by $T$ is called triangular.

In the quasitriangular case the function $r = \rho + \frac{1}{2}T$ is a solution of the classical dynamical Yang–Baxter equation

$$\mathrm{CDYB}(r) = 0, \tag{1.6}$$

and is called the classical dynamical r-matrix of $X$. Thus, quasitriangular structures on $X$ are parametrized by solutions $r$ of the CDYB equation which are $\mathfrak{h}$-invariant and such that $r + r^{21}$ is a constant $\mathfrak{g}$-invariant element in $S^2\mathfrak{g}$. Triangular structures are parametrized by skew-symmetric, $\mathfrak{h}$-invariant solutions of the CDYB equation.

Remark. The material of this section trivially generalizes to the case when $G, H$ are complex Lie groups (algebraic groups, formal groups) rather than real Lie groups.

1.4. Gauge transformations of coboundary dynamical Poisson groupoids. Let $G$ be a complex Lie group, $H$ a commutative, connected complex Lie subgroup of $G$, $\mathfrak{g}, \mathfrak{h}$ their Lie algebras, and $U \subset \mathfrak{h}^*$ a connected open set. Let $G^H$ be the centralizer of $H$ in $G$, and $\mathfrak{g}^H$ its Lie algebra.

In this section we will use the following notation: if $\alpha$ is a $k$-form on $U$ with values in a vector space $W$, then $\bar\alpha$ is the corresponding function $U \to \Lambda^k\mathfrak{h} \otimes W$.

Let $CB(G, H, U)$ be the set of all coboundary dynamical Poisson structures on the groupoid $X = X(G, H, U)$. That is, $CB(G, H, U)$ is the set of all $\mathfrak{h}$-invariant holomorphic functions $\rho : U \to \Lambda^2\mathfrak{g}$ such that $\mathrm{CDYB}(\rho) = Z \in (\Lambda^3\mathfrak{g})^{\mathfrak{g}}$ is a constant.

It turns out that there is a natural (infinite-dimensional) group which acts on $CB(G, H, U)$, such that the space of its orbits is finite-dimensional.

Let $g : U \to G^H$ be a holomorphic function. Let $\eta = g^{-1}dg$ be a 1-form on $U$ with values in $\mathfrak{g}^H$. The form $\eta$ defines a function $\bar\eta : U \to \mathfrak{h} \otimes \mathfrak{g}^H$. For any function $\rho : U \to \Lambda^2\mathfrak{g}$, define

$$\rho^g := (g \otimes g)(\rho - \bar\eta + \bar\eta^{21})(g^{-1} \otimes g^{-1}). \tag{1.7}$$

Proposition 1.2. If $\rho \in CB(G, H, U)$ then $\rho^g \in CB(G, H, U)$, and $\mathrm{CDYB}(\rho^g) = \mathrm{CDYB}(\rho)$.

Proof. Fix a basis $\{x_i\}$ of $\mathfrak{h}$. We have

$$\begin{aligned} \mathrm{CDYB}(\rho^g) = (\mathrm{Ad}\, g)^{\otimes 3}\Big( & \mathrm{CDYB}(\rho) + \mathrm{CDYB}(\bar\eta^{21} - \bar\eta) \\ & - [\rho^{12}, \bar\eta^{23} - \bar\eta^{32}] - [\rho^{12}, \bar\eta^{13} - \bar\eta^{31}] - [\rho^{13}, \bar\eta^{23} - \bar\eta^{32}] \\ & + [\rho^{23}, \bar\eta^{12} - \bar\eta^{21}] + [\rho^{13}, \bar\eta^{12} - \bar\eta^{21}] + [\rho^{23}, \bar\eta^{13} - \bar\eta^{31}] \\ & + \mathrm{Alt}\Big( \sum_i x_i \otimes \Big[ g^{-1}\frac{\partial g}{\partial x_i} \otimes 1 + 1 \otimes g^{-1}\frac{\partial g}{\partial x_i},\ \rho + \bar\eta^{21} - \bar\eta \Big] \Big) \Big). \end{aligned} \tag{1.8}$$


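Using the facts that $\rho$ is invariant under $\mathfrak{h}$, and $\mathrm{CDYB}(\rho)$ is invariant under $G$, we have

$$\begin{aligned} (\mathrm{Ad}\, g^{-1})^{\otimes 3}\big(\mathrm{CDYB}(\rho^g) - \mathrm{CDYB}(\rho)\big) = {} & \mathrm{CDYB}(\bar\eta^{21} - \bar\eta) + \mathrm{Alt}\big([\rho^{23}, \bar\eta^{12} + \bar\eta^{13}]\big) \\ & + \mathrm{Alt}\Big( \sum_i x_i \otimes \Big[ g^{-1}\frac{\partial g}{\partial x_i} \otimes 1 + 1 \otimes g^{-1}\frac{\partial g}{\partial x_i},\ \rho - \bar\eta + \bar\eta^{21} \Big] \Big). \end{aligned} \tag{1.9}$$

Simplifying the last two terms in (1.9), we get

$$(\mathrm{Ad}\, g^{-1})^{\otimes 3}\big(\mathrm{CDYB}(\rho^g) - \mathrm{CDYB}(\rho)\big) = \mathrm{CDYB}(\bar\eta^{21} - \bar\eta) + \mathrm{Alt}\big([\bar\eta^{12} + \bar\eta^{13}, \bar\eta^{32} - \bar\eta^{23}]\big). \tag{1.10}$$

However, since $\bar\eta \in \mathfrak{h} \otimes \mathfrak{g}^H$, we have $[\bar\eta^{12}, \bar\eta^{13}] = [\bar\eta^{12}, \bar\eta^{23}] = 0$. Therefore, we have

$$\begin{aligned} (\mathrm{Ad}\, g^{-1})^{\otimes 3}\big(\mathrm{CDYB}(\rho^g) - \mathrm{CDYB}(\rho)\big) &= \mathrm{CDYB}(\bar\eta^{21} - \bar\eta) + \mathrm{Alt}\big([\bar\eta^{12}, \bar\eta^{32}] - [\bar\eta^{13}, \bar\eta^{23}]\big) \\ &= \mathrm{Alt}\big(d\bar\eta^{21} - d\bar\eta - [\bar\eta^{13}, \bar\eta^{23}]\big). \end{aligned} \tag{1.11}$$

Let $\bar{F}_\eta : U \to \mathfrak{h} \otimes \mathfrak{h} \otimes \mathfrak{g}^H$ be the function corresponding to the curvature form $F_\eta = d\eta + \frac{1}{2}[\eta, \eta]$. It is easy to see that the r.h.s. of (1.11) equals $-\mathrm{Alt}(\bar{F}_\eta)$. Finally, observe that by definition the form $\eta$ satisfies the zero-curvature condition $F_\eta = 0$. Thus, $\mathrm{CDYB}(\rho^g) = \mathrm{CDYB}(\rho)$.

It is easy to check that $(\rho^f)^g = \rho^{gf}$ for $f, g : U \to G^H$, so the assignment $\rho \to \rho^g$ defines a left action of the group $\Sigma := \mathrm{Map}(U, G^H)$ of holomorphic functions on $U$ with values in $G^H$ on $CB(G, H, U)$.

Definition. We will call transformations $\rho \to \rho^g$ the gauge transformations of the first kind.

Remark. The gauge transformations of the first kind are especially simple if $g$ takes values in $H \subset G^H$. In this case $\rho^g = \rho + \bar\eta^{21} - \bar\eta$.

Now let $\omega$ be a closed holomorphic 2-form on $U$. This form defines a holomorphic function $\bar\omega : U \to \Lambda^2\mathfrak{h} \subset \Lambda^2\mathfrak{g}$. To this function there corresponds a transformation of $CB(G, H, U)$, given by $\rho \to \rho + \bar\omega$. We will call such transformations gauge transformations of the second kind.

Proposition 1.3. If the form $\omega$ is exact, then the gauge transformation of the second kind $\rho \to \rho + \bar\omega$ is also of the first kind.

Proof. Let $\xi$ be a 1-form on $U$ such that $\omega = d\xi$. This 1-form defines a function $\bar\xi : U \to \mathfrak{h}$. Define a holomorphic function $g_\xi : U \to H$ by $g_\xi = e^{-\bar\xi}$. Then $\eta = g_\xi^{-1} dg_\xi = -d\bar\xi$. Thus, $\bar\eta^{21} - \bar\eta = \overline{d\xi} = \bar\omega$, as desired.

From now until the end of this section we will assume that $U$ is the formal polydisc. In this case by holomorphic functions we will mean arbitrary formal power series. Therefore, the constructions below make sense not only for the field $\mathbb{C}$ but for any field of characteristic zero.

Since $U$ is a formal polydisc, any closed form is exact. 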
Thus, gauge transformations of the second kind are also of the first kind, and we will call them simply gauge transformations. Now we will show that the quotient space CB(G, H, U )/Σ is finite-dimensional.
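The relation between the two kinds of gauge transformations can be made explicit in coordinates. The following is a sketch under assumptions made here for illustration: $x_1, \dots, x_n$ is a basis of $\mathfrak{h}$ used as coordinates on $U$, a 1-form $\xi = \sum_j \xi_j\, dx_j$ is identified with $\bar\xi = \sum_j \xi_j x_j$, and a form $\sum_i dx_i\, \eta_i$ with $\bar\eta = \sum_i x_i \otimes \eta_i$ (the scalar functions $\xi_j$ are notation introduced only for this example).

```latex
% H-valued gauge transformation g_\xi = e^{-\bar\xi}, with H commutative:
\eta = g_\xi^{-1}\, dg_\xi = -d\bar\xi
\quad\Longrightarrow\quad
\bar\eta = -\sum_{i,j} \frac{\partial \xi_j}{\partial x_i}\; x_i \otimes x_j ,
\\[4pt]
\bar\eta^{21} - \bar\eta
  = \sum_{i,j} \frac{\partial \xi_j}{\partial x_i}\,\bigl(x_i \otimes x_j - x_j \otimes x_i\bigr)
  = \sum_{i<j} \Bigl(\frac{\partial \xi_j}{\partial x_i} - \frac{\partial \xi_i}{\partial x_j}\Bigr)\,
    \bigl(x_i \otimes x_j - x_j \otimes x_i\bigr).
```

The coefficients on the right are the components of $d\xi$, which is the content of Proposition 1.3 in these coordinates. In particular, when $\dim \mathfrak{h} = 1$ the sum over $i < j$ is empty, so $H$-valued gauge transformations act trivially on $\rho$ and there are no nonzero 2-forms $\omega$ to begin with.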


Theorem 1.4. Let $\rho, r \in CB(G, H, U)$. Assume that the values of $\rho, r$ at the origin are equal, and $\mathrm{CDYB}(\rho) = \mathrm{CDYB}(r)$. Then $\rho = r^g$ for some $g \in \Sigma$.

Proof. Let $x_1, \dots, x_n$ be a basis of $\mathfrak{h}$. We regard it as a system of linear coordinates on $U$. The functions $r, \rho$ are by definition formal power series in $x_i$. We will prove, by induction in $N$, that the statement of the theorem holds modulo terms of order $N + 1$ of the power series. This is enough to prove the theorem.

For $N = 0$, the statement follows from the assumption of the theorem. Suppose we know it for $N = K$, and let us prove it for $N = K + 1$. We know that there exists a gauge transformation $g_K \in \Sigma$ such that the error term $\varepsilon_K := \rho - r^{g_K}$ is of order $K + 1$. Let $\varepsilon_K = E_K + \tilde\varepsilon_K$, where $E_K$ is the part of $\varepsilon_K$ of degree exactly $K + 1$. Since $\rho$ and $r_K := r^{g_K}$ satisfy the property $\mathrm{CDYB}(\rho) = \mathrm{CDYB}(r_K)$, we have $\mathrm{Alt}(dE_K) = [\mathrm{CYB}(r_K) - \mathrm{CYB}(\rho)]_K$, where $[\ast]_K$ denotes the degree $K$ homogeneous part of an expression $\ast$. But according to our assumption, $\rho$ and $r_K$ coincide in degrees $\le K$, so we get $\mathrm{Alt}(dE_K) = 0$.

Now we will study the equation $\mathrm{Alt}(dE) = 0$, which is the linearization of the CDYB equation near the zero solution.

Lemma 1.5. Let $E : U \to \Lambda^2\mathfrak{g}$ be a homogeneous polynomial function of degree $\ge 1$, invariant under $\mathfrak{h}$, such that $\mathrm{Alt}(dE) = 0$. Then $E$ takes values in $\mathfrak{h} \otimes \mathfrak{g}^H + \mathfrak{g}^H \otimes \mathfrak{h}$.

Proof. Because of the $\mathfrak{h}$-invariance of $E$, it is enough to show that $E \in \mathfrak{h} \otimes \mathfrak{g} + \mathfrak{g} \otimes \mathfrak{h}$. Let $V \subset \mathfrak{g}$ be a vector subspace which is a complement to $\mathfrak{h}$. Since $E(u) \in \Lambda^2\mathfrak{g}$, we can write $E$ uniquely as a sum $E = E_{VV} + E_{V\mathfrak{h}} + E_{\mathfrak{h}V} + E_{\mathfrak{h}\mathfrak{h}}$, where $E_{VV} \in \Lambda^2 V$, $E_{\mathfrak{h}\mathfrak{h}} \in \Lambda^2\mathfrak{h}$, $E_{V\mathfrak{h}} \in V \otimes \mathfrak{h}$, and $E_{\mathfrak{h}V} = -E_{V\mathfrak{h}}^{21}$. The equation $\mathrm{Alt}(dE) = 0$ splits in 3 parts: the $\mathfrak{h}VV$-part, the $\mathfrak{h}\mathfrak{h}V$-part, and the $\mathfrak{h}\mathfrak{h}\mathfrak{h}$-part. The $\mathfrak{h}VV$-part says that $dE_{VV} = 0$, i.e. $E_{VV}$ is a constant. Since $E_{VV}$ is homogeneous of degree $\ge 1$, $E_{VV} = 0$. The lemma is proved.

Lemma 1.6. In the situation of Lemma 1.5, there exists a closed 1-form $\eta$ on $U$ with values in $\mathfrak{g}^H$, homogeneous of degree $K$, such that $E = \bar\eta^{21} - \bar\eta$.

Proof. Let $V^H = V \cap \mathfrak{g}^H$. Then $V^H$ is a complement to $\mathfrak{h}$ in $V$. Let $\xi$ be the 1-form on $U$ with values in $V^H$ such that $\bar\xi = -E_{\mathfrak{h}V}$. The $\mathfrak{h}\mathfrak{h}V$-part of the equation $\mathrm{Alt}(dE) = 0$ says that $d\xi = 0$. Thus, $\xi$ satisfies the conditions of Lemma 1.6. Therefore, it is enough to prove Lemma 1.6 for $E' = E + \bar\xi^{21} - \bar\xi$.

By the definition, $E'$ takes values in $\Lambda^2\mathfrak{h}$, and $\mathrm{Alt}(dE') = 0$. Therefore, $E'$ defines a closed 2-form on $U$, which we denote by $\omega$ (we have $\bar\omega = E'$). Let $\zeta$ be a 1-form such that $\omega = d\zeta$, and let $\theta = d\bar\zeta$. Then $\bar\theta - \bar\theta^{21} = \bar\omega = E'$. The lemma is proved.

Lemma 1.7. Let $\eta$ be a closed 1-form on $U$ with coefficients in a complex finite-dimensional Lie algebra $\mathfrak{b}$, which has order $K$ at 0. Then there exists a 1-form $\tau$ satisfying the zero-curvature equation $d\tau + \frac{1}{2}[\tau, \tau] = 0$ such that $\tau - \eta$ is of order $\ge K + 1$.


Proof. Choose a function $\chi : U \to \mathfrak{b}$, of order $K + 1$, such that $\eta = d\chi$. Let $B$ be the Lie group of $\mathfrak{b}$, and consider the function $g = e^{\chi} : U \to B$. Set $\tau = g^{-1}dg$. Then $\tau$ is the desired form.

Now we return to the proof of the theorem. We start with the function $E = E_K$. Let $\eta$ be as in Lemma 1.6, and $\tau$ as in Lemma 1.7. Let $g : U \to G^H$ be the function such that $g^{-1}dg = \tau$. It is easy to see that for any $s : U \to \Lambda^2\mathfrak{g}$ the difference $s^g - s$ equals $E$ plus terms of order $K + 1$ and higher. In particular, $r^{g g_K} - r^{g_K}$ equals $E$ modulo terms of order $\ge K + 1$. Thus, if we set $g_{K+1} = g g_K$, we will get that $\rho - r^{g_{K+1}}$ is of order $\ge K + 1$. The theorem is proved.
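The proof of Lemma 1.7 leaves two verifications to the reader; a short sketch of both follows, using the standard formula for the logarithmic derivative of the exponential (for a matrix group, or formally). Nothing beyond the data of the lemma is assumed.

```latex
% Expansion of \tau = g^{-1} dg for g = e^{\chi}:
\tau = e^{-\chi}\, d\bigl(e^{\chi}\bigr)
     = \sum_{k \ge 0} \frac{(-1)^k}{(k+1)!}\, (\operatorname{ad}\chi)^k (d\chi)
     = d\chi - \tfrac{1}{2}[\chi, d\chi] + \tfrac{1}{6}[\chi, [\chi, d\chi]] - \dots
\\[4pt]
% Zero curvature, valid for any form of the type g^{-1} dg:
d\tau = d\bigl(g^{-1}\bigr) \wedge dg
      = -\,g^{-1} dg \wedge g^{-1} dg
      = -\tfrac{1}{2}[\tau, \tau].
```

Since $\chi$ has order $K + 1$ and $d\chi = \eta$ has order $K$, the first correction $[\chi, d\chi]$ has order at least $2K + 1 \ge K + 1$, and the later terms have higher order still. Hence $\tau - \eta$ is of order $\ge K + 1$, as the lemma requires.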

2. Biequivariant Poisson Manifolds and Groupoids

In this chapter we will introduce the notion of an $H$-biequivariant Poisson groupoid, which is a natural generalization of the notion of a dynamical Poisson groupoid.

2.1. Biequivariant Poisson manifolds.

Definition. An $H$-biequivariant Poisson manifold is a Poisson manifold $X$ equipped with two commuting, proper, free Hamiltonian actions of $H$ (a left action $l : H \times X \to X$ and a right action $r : X \times H \to X$) and two maps $\mu_l, \mu_r : X \to \mathfrak{h}^*$, which are moment maps for $l, r$, such that for any smooth functions $a, b$ on $\mathfrak{h}^*$ one has $\{a \circ \mu_l, b \circ \mu_r\} = 0$.

Remark. We recall ([GHV], v. 2, p. 135) that a smooth action of a group $H$ on a manifold $X$ is called proper if for any two compact sets $A, B \subset X$ the set of elements $a \in H$ such that $B \cap aA \ne \emptyset$ is compact. It is known that if an action is proper and free then the quotient $X/H$ has a unique structure of a smooth manifold such that the natural map $X \to X/H$ is a smooth submersion.

We will denote the left and the right action of $h \in H$ on $x \in X$ by $hx$ and $xh$, respectively. Observe that for any $H$-biequivariant Poisson manifold $X$ the maps $\mu_l, \mu_r$ are submersions, since the actions $l, r$ are free.

Let $U \subset \mathfrak{h}^*$ be an open set invariant under the coadjoint action. We will say that an $H$-biequivariant Poisson manifold $X$ is over $U$ if the images of $\mu_l, \mu_r$ coincide with $U$. Let $\mathcal{C}_U$ be the category of $H$-biequivariant Poisson manifolds over $U$, where morphisms are Poisson maps which commute with $l, r, \mu_l, \mu_r$.

We will now define the structure of a monoidal category on $\mathcal{C}_U$. Define the product $\bullet$ on $\mathcal{C}_U$ as follows. Let $X_1, X_2 \in \mathcal{C}_U$, with actions of $H$ $l_1, r_1, l_2, r_2$, and moment maps $\mu_l^1, \mu_r^1, \mu_l^2, \mu_r^2$. Consider the left action $\Delta$ of $H$ on $X_1 \times X_2$ defined by $\Delta(h)(x_1, x_2) = (x_1 h^{-1}, h x_2)$. The moment map for this action is $\mu_r^1 - \mu_l^2$. Let $X_1 \bullet X_2 = X_1 \times X_2 /\!/ H$ be the Hamiltonian reduction of $X_1 \times X_2$ with respect to the action $\Delta$ of $H$. That is, $X_1 \bullet X_2 = Z/H$, where $Z \subset X_1 \times X_2$ is the set of points $(x_1, x_2)$ such that $\mu_r(x_1) = \mu_l(x_2)$. It is easy to see that $Z$ is a smooth manifold (as $\mu_l, \mu_r$ are submersions), and $H$ acts freely and properly on $Z$, so $Z/H$ is smooth. The space $X_1 \bullet X_2$ has a natural structure of an object of $\mathcal{C}_U$: it has a Poisson structure of Hamiltonian reduction, two free commuting actions of $H$ ($l_1$ and $r_2$), and two moment maps $\mu_l^1$ and $\mu_r^2$ for them. Thus, $\bullet$ is a bifunctor $\mathcal{C}_U \times \mathcal{C}_U \to \mathcal{C}_U$.


It is easy to see that the operation $\bullet$ is associative: $X_1 \bullet (X_2 \bullet X_3) = (X_1 \bullet X_2) \bullet X_3 = X_1 \times X_2 \times X_3 /\!/ H \times H$, where $H \times H$ acts on $X_1 \times X_2 \times X_3$ by $(h_1, h_2)(x_1, x_2, x_3) = (x_1 h_1^{-1}, h_1 x_2 h_2^{-1}, h_2 x_3)$.

Define an object $\mathbf{1} \in \mathcal{C}_U$ to be $(T^*H)_U$ (see Chapter 1), with the obvious left and right actions of $H$, and the moment maps $\mu_l(p, h) = p h^{-1}$, $\mu_r(p, h) = h^{-1} p$, $h \in H$, $p \in T_h^*H$. It turns out that $\mathbf{1}$ is a unit object of $\mathcal{C}_U$. Indeed, let us check that $\mathbf{1} \bullet X$ is naturally isomorphic to $X$. We have $Z = \{(h, p, x) : h^{-1}p = \mu_l(x)\}$. Thus, $Z$ is naturally identified with $H \times X$, via $(h, p, x) \to (h, x)$. The action of $H$ on $Z$ is by $h(h', x) = (h' h^{-1}, hx)$. Thus, the quotient $Z/H$ is naturally isomorphic to $X$, and it is easy to check that the Poisson structure on $X$, the two actions of $H$, and the corresponding moment maps are the original ones. Similarly one checks that $X \bullet \mathbf{1}$ is naturally isomorphic to $X$. The unit object axioms are checked directly. Thus, $(\mathcal{C}_U, \bullet, \mathbf{1})$ is a monoidal category with a unit object $\mathbf{1}$.

Remark. Let $\mathcal{D}$ be the category whose objects are manifolds with two commuting, free, proper actions of $H$: a left action $l$ and a right action $r$. This category has a natural monoidal structure, with product being the fiber product $\times_H$ over $H$, and the unit object $H$ (with obvious $l$ and $r$). Then we have a natural functor $T^*$ from $\mathcal{D}$ to $\mathcal{C}_{\mathfrak{h}^*}$, the functor of cotangent bundle: $M \to T^*M$. (Indeed, it is well known that if $H$ acts on $M$, then its induced action on $T^*M$ is Hamiltonian, with moment map $\mu(m, p)(L) = \langle L(m), p \rangle$, where $m \in M$, $p \in T_m^*M$, $L \in \mathfrak{h}$, and $L(a)$, $a \in M$, is the corresponding vector field on $M$. Thus, $T^*M$ is naturally an object of $\mathcal{C}_{\mathfrak{h}^*}$.) The main property of the functor $T^*$ is that it is a monoidal functor: $T^*(M_1 \times_H M_2) = T^*M_1 \bullet T^*M_2$.

Let $X \in \mathcal{C}_U$. Denote by $\bar{X}$ the new object of $\mathcal{C}_U$ obtained as follows: $\bar{X}$ is $X$ as a manifold, with the opposite Poisson structure $-\{\,,\}$, the left and the right actions of $H$ permuted (i.e. the left, respectively right, action of $h$ on $\bar{X}$ is the right, respectively left, action of $h^{-1}$ on $X$), and the moment maps also permuted. We will call $\bar{X}$ the dual object to $X$. By a reflection on $X$ we will mean an involutive morphism $i : X \to \bar{X}$. We will often write $x^{-1}$ for $i(x)$.

Let $X \in \mathcal{C}_U$ and $i : X \to \bar{X}$ be a reflection map. Let $\varphi_+^i, \varphi_-^i : X \to X \times X$ be given by the formulas $\varphi_+^i(x) = (x, x^{-1})$, $\varphi_-^i(x) = (x^{-1}, x)$. It is easy to see that these maps descend to maps (not necessarily Poisson) $\psi_\pm^i : X \to X \bullet X$, as $\mu_r(x^{-1}) = \mu_l(x)$, $\mu_l(x^{-1}) = \mu_r(x)$.

2.2. Biequivariant Poisson groupoids.

Definition. Let $X \in \mathcal{C}_U$. We will say that $X$ is an $H$-biequivariant Poisson semigroupoid if it is equipped with an associative morphism $m : X \bullet X \to X$ (called multiplication). In this case, a unit in $X$ is a morphism $e : \mathbf{1} \to X$ such that the morphisms $m \circ (e \bullet \mathrm{id})$, $m \circ (\mathrm{id} \bullet e)$ are the canonical morphisms $\mathbf{1} \bullet X \to X$, $X \bullet \mathbf{1} \to X$. Further, such $X$ is called an $H$-biequivariant Poisson groupoid if it is equipped with a reflection $i : X \to \bar{X}$ (called the inversion map), such that $m(\psi_+^i(x)) = e(1, \mu_l(x))$, $m(\psi_-^i(x)) = e(1, \mu_r(x))$, where $(1, \mu_{l,r}(x)) \in (T_1^*H)_U$.

Remark 1. If $X$ has a unit, it is unique, as for any two units $e_1, e_2$, we have $e_1 = m \circ (e_1 \bullet e_2) = e_2$. If the inversion map $i$ on $X$ exists, it is also unique, as for any two inversion maps $i_1, i_2$, $m_3(i_1(x) \bullet x \bullet i_2(x)) = i_1(x) = i_2(x)$, where $m_3 : X \bullet X \bullet X \to X$ is the multiplication map.

Remark 2. If $H$ is trivial, the notion of an $H$-biequivariant Poisson groupoid coincides with the notion of a Poisson–Lie group.
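The unit-object argument of Sect. 2.1 has a purely set-theoretic core, which also underlies the fiber product $\times_H$ in the Remark on the category $\mathcal{D}$: the quotient of $H \times M$ by $h(h', x) = (h' h^{-1}, hx)$ is identified with $M$ via $(h', x) \to h'x$. The sketch below checks this on a finite toy example. The choices are assumptions made for illustration: $M = S_3$ with the cyclic subgroup $H$ of order 3 acting by left multiplication, and no smooth or Poisson structure is modelled.

```python
from itertools import permutations, product

def mul(f, g):
    """Product of permutations: (f g)(i) = f(g(i))."""
    return tuple(f[g[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

M = list(permutations(range(3)))            # toy "manifold": the finite set S_3
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]       # cyclic subgroup acting freely on M

def orbit(k, x):
    """Orbit of (k, x) in H x M under h.(k, x) = (k h^{-1}, h x)."""
    return frozenset((mul(k, inv(h)), mul(h, x)) for h in H)

orbits = {orbit(k, x) for k, x in product(H, M)}

def collapse(orb):
    """Candidate isomorphism (H x M)/H -> M, sending the class of (k, x) to k x."""
    images = {mul(k, x) for k, x in orb}
    if len(images) != 1:
        raise ValueError("k x is not constant on the orbit")
    return next(iter(images))

image = sorted(collapse(o) for o in orbits)
print(len(orbits) == len(M), image == sorted(M))
```

The map is well defined because $(k h^{-1})(h x) = kx$, and comparing the number of orbits with $|M|$ shows that it is a bijection in this example.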


If $X$ is a dynamical Poisson groupoid, it is automatically an $H$-biequivariant Poisson groupoid. Indeed, in this case $X$ is an $H$-biequivariant Poisson manifold, with $\mu_l = t$, $\mu_r = s$. The composition map is the map $\bar{m}$ defined in Sect. 1.1, which is obviously associative. The unit axiom is satisfied for the Hamiltonian unit $e : \mathbf{1} \to X$. Finally, it is easy to check that the inversion map is anti-Poisson and satisfies the inversion axiom.

As we mentioned, the notion of a Poisson groupoid is known in the literature [W]. So let us justify the usage of this term in our paper, by showing that $X$ is indeed a Poisson groupoid in the sense of [W] in a natural way.

Let $X$ be an $H$-biequivariant Poisson groupoid over $U$. Then $X$ has a natural structure of a groupoid in the usual sense. Namely, $P = U$, $s = \mu_r$, $t = \mu_l$, the composition map is given by the multiplication map $m$, and $E = e|_U$, where $U = \{(1, p) \in (T^*H)_U\}$. It is easy to check that this groupoid is in fact a Lie groupoid in the sense of [M].

Proposition 2.1. $X$ is a Poisson groupoid in the sense of [W].

Proof. We should show that the graph of composition is coisotropic. This follows from the following easy lemma.

Lemma. Let $X$ be a Poisson manifold with a proper, free Hamiltonian action of a connected Lie group $H$ with moment map $\mu : X \to \mathfrak{h}^*$. Let $X_0 = \mu^{-1}(0)$. Let $Y$ be another Poisson manifold, and $f : X_0 \to Y$ be an $H$-invariant smooth map. Then $f$ descends to a Poisson map $X /\!/ H \to Y$ (where $X /\!/ H$ is the Hamiltonian reduction) if and only if the graph of $f$ is coisotropic in $X \times \bar{Y}$ (where $\bar{Y}$ is $Y$ with the opposite Poisson structure).

Proof. Straightforward.

To prove Proposition 2.1, it is enough to apply this lemma to $f = m$, where $m$ is the multiplication map in the groupoid.

Remark. We have defined dynamical and $H$-biequivariant Poisson groupoids in the category of smooth manifolds. Similarly one can define the same objects in the categories of complex analytic, formal, and algebraic varieties. 
In the formal and algebraic settings, we can work over an arbitrary field of characteristic zero. These generalizations are straightforward, and we will not give them here.

2.3. H-biequivariant Poisson algebras. In this and the next section we will sketch the constructions of the previous two sections in the algebraic language, i.e. working with Poisson algebras rather than Poisson manifolds. This is related to the previous two sections by the operation of taking spectrum.

Let $k$ be a field of characteristic zero. Let $A$ be a Poisson algebra over $k$, $H$ a connected affine algebraic group, and $\psi : H \times A \to A$ be a right algebraic action of $H$ on $A$ by Poisson automorphisms. (“Algebraic” means that $A$ is a sum of finite-dimensional representations of $H$.) Let $\mathfrak{h}$ be the Lie algebra of $H$. Then the variety $\mathfrak{h}^*$ has a natural Poisson structure. Let $U \subset \mathfrak{h}^*$ be an $H$-invariant open set. A Poisson homomorphism $\mu : \mathcal{O}(U) \to A$ (where $\mathcal{O}(X)$ denotes the ring of algebraic functions on a variety $X$) is called a moment map for $\psi$ if for any regular function $g$ on $U$, and any $f \in A$ we have

$$\{\mu(g), f\} = \sum_j \mu\Big(\frac{\partial g}{\partial y_j}\Big) \cdot d\psi|_{h=1}(y_j, f).$$


Here $y_j \in \mathfrak{h}$ are a linear system of coordinates on $U$, $h \in H$, and $d\psi|_{h=1} : \mathfrak{h} \times A \to A$ is the differential of $\psi$ at $h = 1 \in H$. In particular, for a linear function on $U$ given by $a \in \mathfrak{h}$ the last equation is $\{\mu(a), f\} = d\psi|_{h=1}(a, f)$.

For a left action of $H$, a moment map is defined in the same way, with the only difference that it is anti-Poisson rather than Poisson.

Definition. An $H$-biequivariant Poisson algebra over $U$ is a 5-tuple $(A, l, r, \mu_l, \mu_r)$, where $A$ is a Poisson algebra with 1 over $k$, $l, r$ is a pair of commuting algebraic actions of $H$ on $A$ (a left action and a right action) by Poisson algebra automorphisms, and $\mu_l, \mu_r : \mathcal{O}(U) \to A$ are moment maps for $l, r$, such that
(i) $\mu_l, \mu_r$ are embeddings, and their images Poisson commute.
(ii) There exists an $l(H) \times r(H)$-invariant subspace $A_0^l$ of $A$ such that the multiplication map $\mu_r(\mathcal{O}(U)) \otimes A_0^l \to A$ is a linear isomorphism; there exists an $l(H) \times r(H)$-invariant subspace $A_0^r$ of $A$ such that the multiplication map $\mu_l(\mathcal{O}(U)) \otimes A_0^r \to A$ is a linear isomorphism.

A morphism of $H$-biequivariant Poisson algebras over $U$ is a morphism of Poisson algebras which preserves $l, r$ and $\mu_l, \mu_r$.

Remark 1. From $[l, r] = 0$ it follows that $\{\mu_l \circ x, \mu_r \circ y\}$ is a central element (in the Poisson sense) for any $x, y \in \mathfrak{h}$, but it does not follow that this commutator equals zero. So we require that it is zero by condition (i).

Remark 2. Condition (ii) is of technical nature, and is not very important in the discussion below.

Denote the category of $H$-biequivariant Poisson algebras over $U$ by $\mathcal{A}_U$. For convenience we will write $l(h)a$ as $ha$ and $r(h)a$ as $ah$.

Let us now describe the monoidal structure on $\mathcal{A}_U$. Let $A, B \in \mathcal{A}_U$. Let $l_A, r_A, l_B, r_B, \mu_l^A, \mu_r^A, \mu_l^B, \mu_r^B$ be the corresponding actions and moment maps. Consider the action of the group $H$ in $A \otimes B$ by $\Delta(h)(a \otimes b) = a h^{-1} \otimes h b$. We will construct a new $H$-biequivariant Poisson algebra $A \widetilde{\otimes} B$, which is obtained by Hamiltonian reduction of $A \otimes B$ by this action of $H$.

Denote by $A * B$ the product $A \otimes_{\mathcal{O}(U)} B$, where $\mathcal{O}(U)$ is mapped to $A$ via $\mu_r^A$ and to $B$ via $\mu_l^B$. The algebra $A * B$ has two commuting actions of $H$ ($l_A \otimes 1$ and $1 \otimes r_B$). But we cannot claim that $A * B \in \mathcal{A}_U$, since the Poisson structure on $A \otimes B$ does not, in general, descend to $A * B$. However, the action $\Delta$ of $H$ on $A \otimes B$ descends to one on $A * B$, so we can define $A \widetilde{\otimes} B := (A * B)^H$, where $H$ acts by $\Delta$. It is easy to check that the Poisson structure on $A \otimes B$ descends to one on $A \widetilde{\otimes} B$ (Hamiltonian reduction). The two actions of $H$ and their moment maps also descend to $A \widetilde{\otimes} B$. So, in order to check that $A \widetilde{\otimes} B \in \mathcal{A}_U$, it suffices to check properties (i) and (ii).

Using properties (i) and (ii) of the moment maps $\mu_l^A, \mu_r^A, \mu_l^B, \mu_r^B$, it is easy to see that $A * B$ is naturally identified with $\mu_l^A(\mathcal{O}(U)) \otimes A_0^r \otimes B_0^r$, and $A \widetilde{\otimes} B$ is identified with $\mu_l^A(\mathcal{O}(U)) \otimes (A_0^r \otimes B_0^r)^H$, where $H$ acts by $a \otimes b \to a h^{-1} \otimes h b$. This implies properties (i) and (ii) for the moment map $\mu_l^A \otimes 1 : \mathcal{O}(U) \to A \widetilde{\otimes} B$, corresponding to the left action of $H$ on $A \widetilde{\otimes} B$ (with $(A \widetilde{\otimes} B)_0^r = (A_0^r \otimes B_0^r)^H$). For the moment map $1 \otimes \mu_r^B : \mathcal{O}(U) \to A \widetilde{\otimes} B$ corresponding to the right action, these properties are proved analogously.


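Thus, $A \widetilde{\otimes} B \in \mathcal{A}_U$. It is clear that the assignment $A, B \to A \widetilde{\otimes} B$ is a bifunctor $\mathcal{A}_U \times \mathcal{A}_U \to \mathcal{A}_U$.

Consider the algebra $O((T^*H)_U)$, with the standard Poisson structure, equipped with the standard actions $l, r$ of $H$ on left and right given by $(x, p) \to (h_1 x h_2, h_1 p h_2)$. Let $M_l, M_r : (T^*H)_U \to U$ be given by $(h, p) \to ph^{-1}$, $(h, p) \to h^{-1}p$. Let $\mu_{l,r} = M_{l,r}^* : O(U) \to O((T^*H)_U)$. It is easy to check that $\mu_{l,r}$ are moment maps for $l, r$. Let $\mathbf{1} = (O((T^*H)_U), l, r, \mu_l, \mu_r) \in \mathcal{A}_U$. It is easy to check that we have natural isomorphisms $A \widetilde{\otimes} \mathbf{1} \equiv A \equiv \mathbf{1} \widetilde{\otimes} A$.

Proposition 2.2. (i) $(A \widetilde{\otimes} B) \widetilde{\otimes} C = A \widetilde{\otimes} (B \widetilde{\otimes} C)$. (ii) $\mathbf{1}$ is a unit object in $\mathcal{A}_U$ with respect to $\widetilde{\otimes}$, and $(\mathcal{A}_U, \widetilde{\otimes}, \mathbf{1})$ is a monoidal category.

Proof. Easy.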

Let $A \in \mathcal{A}_U$. Denote by $\bar{A}$ the new object of $\mathcal{A}_U$ obtained as follows: $\bar{A}$ is $A$ as an algebra, with the opposite Poisson structure, the left and the right actions of $H$ permuted (i.e. the left, respectively right, action of $h$ on $\bar{A}$ is the right, respectively left, action of $h^{-1}$ on $A$), and the moment maps are also permuted. We will call $\bar{A}$ the dual object to $A$. By a reflection on $A$ we will mean an involutive morphism $i : \bar{A} \to A$.

Let $A \in \mathcal{A}_U$ and $i : \bar{A} \to A$ be a reflection. Let $\varphi^i_+, \varphi^i_- : A \otimes A \to A$ be given by the formulas $\varphi^i_+(a \otimes b) = a\,i(b)$, $\varphi^i_-(a \otimes b) = i(a)\,b$. It is easy to see that these maps descend to maps (not necessarily Poisson) $\psi^i_\pm : A \widetilde{\otimes} A \to A$.

2.4. H-biequivariant Poisson-Hopf algebroids. Now let us define the algebraic version of the notion of an $H$-biequivariant Poisson groupoid: the notion of an $H$-biequivariant Poisson-Hopf algebroid.

Definition. Let $A$ be an $H$-biequivariant Poisson algebra. Then $A$ is called an $H$-biequivariant Poisson-Hopf algebroid over $U$ if it is equipped with a coassociative $\mathcal{A}_U$-morphism $\Delta : A \to A \widetilde{\otimes} A$ called the coproduct, an $\mathcal{A}_U$-morphism $\varepsilon : A \to \mathbf{1}$ called the counit, and a reflection $S : \bar{A} \to A$ called the antipode, such that
(i) (id • ε) ◦ Δ = (ε • id) ◦ Δ = id, and
(ii) $\psi^S_+ \circ \Delta = \mu_l \circ P \circ \varepsilon$, $\psi^S_- \circ \Delta = \mu_r \circ P \circ \varepsilon$, where $P : \mathbf{1} \to O(U)$ acts by $f(x, p) \to f(1, p)$.

Remark. In the above discussion, $U$ is a Zariski open set. If $k = \mathbb{R}$ or $\mathbb{C}$, then we can take $U$ to be an open set in the usual sense, and define $O(U)$ to be the algebra of smooth, respectively analytic, functions on $U$. Then we can repeat Sects. 2.3, 2.4 and thus define the notions of an $H$-biequivariant Poisson algebra and Poisson-Hopf algebroid over $U$. Similarly, one can take $U$ to be the infinitesimal neighborhood of zero in $\mathfrak{h}^*$ (i.e. $O(U) = k[[\mathfrak{h}]]$). The material of Sects. 2.3 and 2.4 can also be generalized to this case in a straightforward way.

The constructions of Sect. 1.3 can easily be put in the algebraic framework of Sects. 2.3, 2.4.
Let G be an affine algebraic group, and H a connected algebraic subgroup of G. Let X(G, H, U ) be a coboundary dynamical Poisson groupoid defined by (1.1), (1.4). Consider the algebra A = O(U ) ⊗ O(G) ⊗ O(U ), where O(G) is the algebra of polynomial functions on G. It is easy to see that A is closed under the Poisson bracket defined by (1.1), (1.4), so it is a Poisson algebra. The two actions of H on A are defined by


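$$
\begin{aligned}
l(h)[a \otimes f \otimes b](u_1, g, u_2) &= [a \otimes f \otimes b](h^{-1} u_1 h,\; h^{-1} g,\; u_2), \\
r(h)[a \otimes f \otimes b](u_1, g, u_2) &= [a \otimes f \otimes b](u_1,\; g h^{-1},\; h u_2 h^{-1}),
\end{aligned}
\tag{2.1}
$$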

and the corresponding moment maps are the maps corresponding to the projections of X to the first and second component of U . The coproduct, counit, and antipode in A are defined by the groupoid structure on X. Thus, A is an H-biequivariant Poisson-Hopf algebroid. We will call such an H-biequivariant Poisson-Hopf algebroid a dynamical Poisson-Hopf algebroid. The notion of a dynamical Poisson-Hopf algebroid will be useful to us in the next paper.

3. Classification of Classical Dynamical r-Matrices

3.1. Kac–Moody algebras [K, Ch. 2]. Let $A = (a_{i,j})_{i,j=1}^n$ be a symmetrizable generalized Cartan matrix. Let $\mathfrak{g}(A)$ be the associated Kac–Moody Lie algebra, $\mathfrak{h}$ its Cartan subalgebra, $\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha\in\Delta} \mathfrak{g}_\alpha$ the root decomposition of the Kac–Moody Lie algebra. Let $(\cdot,\cdot)$ be an invariant nondegenerate bilinear form on $\mathfrak{g}$ and $\Omega \in \mathfrak{g}\hat{\otimes}\mathfrak{g}$ the associated Casimir operator. Here $\mathfrak{g}\hat{\otimes}\mathfrak{g}$ denotes the completed tensor product.

Remark. In this work we consider the Kac–Moody Lie algebras associated with a symmetrizable generalized Cartan matrix, although all theorems, with the same proofs, are valid for more general Kac–Moody Lie algebras associated with a symmetric complex matrix, see [SV, V, 11.1.10].

3.2. Classical dynamical r-matrices. Let $\mathfrak{g}$ be a Kac–Moody Lie algebra, $\mathfrak{h}$ its Cartan subalgebra. A meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ is called a classical dynamical r-matrix associated with the pair $\mathfrak{h} \subset \mathfrak{g}$ if it satisfies the following three conditions:

1. The zero weight condition,
$$ [h \otimes 1 + 1 \otimes h,\; r(\lambda)] = 0 \tag{3.1} $$
for any $\lambda \in \mathfrak{h}^*$ and $h \in \mathfrak{h}$.

2. The generalized unitarity,
$$ r^{12}(\lambda) + r^{21}(\lambda) = \varepsilon\,\Omega \tag{3.2} $$
for some constant $\varepsilon \in \mathbb{C}$ and all $\lambda$.

3. The classical dynamical Yang–Baxter equation, CDYB,
$$ \mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0. \tag{3.3} $$


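Here we use the following notations and conventions. By a meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ satisfying the zero weight condition we mean a function of the form $r = r_{\mathfrak{h}} + \sum_{\alpha\in\Delta} r_\alpha$, where $\Delta$ is the set of roots and $r_{\mathfrak{h}} : \mathfrak{h}^* \to \mathfrak{h} \otimes \mathfrak{h}$, $r_\alpha : \mathfrak{h}^* \to \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$ are meromorphic maps of finite dimensional spaces.

If $X \in \mathrm{End}(V_i)$, then we denote by $X^{(i)} \in \mathrm{End}(V_1 \otimes \dots \otimes V_n)$ the operator $\cdots \otimes \mathrm{Id} \otimes X \otimes \mathrm{Id} \otimes \cdots$, acting non-trivially on the $i$th factor of a tensor product of vector spaces, and if $X = \sum_k X_k \otimes Y_k \in \mathrm{End}(V_i \otimes V_j)$, then we set $X^{ij} = \sum_k X_k^{(i)} Y_k^{(j)}$.

The differential of the r-matrix is considered in (3.3) as a meromorphic function
$$ dr : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}\hat{\otimes}\mathfrak{g}, \qquad \lambda \mapsto \sum_i x_i \otimes \frac{\partial r}{\partial x_i}(\lambda), $$
where $\{x_i\}$ is any basis in $\mathfrak{h}$. We denote by $\mathrm{Alt}(dr)$ the following symmetrization of $dr$ with respect to even permutations of 1, 2, 3,
$$ \mathrm{Alt}(dr) = \sum_i x_i^{(1)} \frac{\partial r^{23}}{\partial x_i} + \sum_i x_i^{(2)} \frac{\partial r^{31}}{\partial x_i} + \sum_i x_i^{(3)} \frac{\partial r^{12}}{\partial x_i}. $$
The CDYB equation is an equation in $\mathfrak{g}\hat{\otimes}\mathfrak{g}\hat{\otimes}\mathfrak{g}$. The constant $\varepsilon$ in (3.2) is called the coupling constant.

3.3. Classification of the classical dynamical r-matrices with nonzero coupling constant.

Theorem 3.1. 1. Let $x_i$, $i = 1, \dots, N$, be a basis in $\mathfrak{h}$. For any positive root $\alpha \in \Delta_+$ let $e^i_\alpha$, $i = 1, \dots, N_\alpha$, be a basis in the root space $\mathfrak{g}_\alpha$ and $e^i_{-\alpha}$, $i = 1, \dots, N_\alpha$, the dual basis in $\mathfrak{g}_{-\alpha}$. Let $\nu$ be an element in $\mathfrak{h}^*$, $C = \sum_{i,j} C_{i,j}\, dx_i \otimes dx_j$ a closed meromorphic 2-form on $\mathfrak{h}^*$, $\varepsilon$ a nonzero complex number. Then the function $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ defined by
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \frac{\varepsilon}{2}\,\Omega + \sum_{\alpha\in\Delta} \sum_{i=1}^{N_\alpha} \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big(\frac{\varepsilon}{2}(\alpha, \lambda - \nu)\Big)\, e^i_\alpha \otimes e^i_{-\alpha} \tag{3.4} $$
is a classical dynamical r-matrix with the coupling constant $\varepsilon$. Here cotanh is the hyperbolic cotangent.

2. Let $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ be a classical dynamical r-matrix with nonzero coupling constant, $r = r_{\mathfrak{h}} + \sum_{\alpha\in\Delta} r_\alpha$ its weight decomposition. If the function $r_\alpha$ is not constant for any simple positive root $\alpha$, then the function $r$ has the form indicated in (3.4).

Theorem 3.1 is proved in Sect. 3.6. For a simple Lie algebra $\mathfrak{g}$, the complete classification of r-matrices $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ with nonzero coupling constant is given in Sect. 3.7.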
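As a sanity check of the first statement of Theorem 3.1 in the smallest case, the following sketch evaluates the left-hand side of (3.3) numerically for g = sl2. Several choices here are assumptions made only for illustration and are not part of the text: the 2×2 fundamental representation, the trace form (x, y) = tr(xy), the basis x = H of the Cartan subalgebra (so that the coordinate on h* is a = (α, λ)), C = 0, and all helper names (`kron`, `place`, `cdyb_residual`).

```python
import math

I = [[1, 0], [0, 1]]
E = [[0, 1], [0, 0]]    # e_alpha
F = [[0, 0], [1, 0]]    # e_{-alpha}, dual to E under the trace form
H = [[1, 0], [0, -1]]   # h_alpha = [E, F]

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    return add(mul(A, B), scale(-1, mul(B, A)))

def place(terms, i, j):
    """Embed sum_k c_k X_k (x) Y_k into slots i and j of a triple tensor product."""
    total = [[0.0] * 8 for _ in range(8)]
    for c, X, Y in terms:
        slots = [I, I, I]
        slots[i - 1], slots[j - 1] = X, Y
        total = add(total, scale(c, kron(kron(slots[0], slots[1]), slots[2])))
    return total

def cdyb_residual(eps, phi, dphi):
    """Largest entry of the left side of (3.3) for
    r = (eps/2) Omega + phi (E (x) F - F (x) E), with dphi = d(phi)/da."""
    # (eps/2) Omega = (eps/4) H (x) H + (eps/2)(E (x) F + F (x) E)
    r = [(eps / 4, H, H), (eps / 2 + phi, E, F), (eps / 2 - phi, F, E)]
    dr = [(dphi, E, F), (-dphi, F, E)]
    r12, r13, r23 = place(r, 1, 2), place(r, 1, 3), place(r, 2, 3)
    h1, h2, h3 = (place([(1, H, I)], k, k % 3 + 1) for k in (1, 2, 3))
    alt = add(add(mul(h1, place(dr, 2, 3)), mul(h2, place(dr, 3, 1))),
              mul(h3, place(dr, 1, 2)))
    total = add(add(alt, comm(r12, r13)), add(comm(r12, r23), comm(r13, r23)))
    return max(abs(x) for row in total for x in row)

eps, a, nu = 0.7, 1.3, 0.4
t = 0.5 * eps * (a - nu)
good = cdyb_residual(eps, 0.5 * eps / math.tanh(t), -(0.5 * eps) ** 2 / math.sinh(t) ** 2)
# Control: the wrong scaling inside cotanh should not solve the equation.
bad = cdyb_residual(eps, 0.5 * eps / math.tanh(a - nu), -0.5 * eps / math.sinh(a - nu) ** 2)
```

Under these assumptions `good` vanishes up to rounding, while `bad` does not, which reflects that the argument of cotanh in (3.4) must carry the same factor ε/2 as the prefactor.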


3.4. Classification of the classical dynamical r-matrices with zero coupling constant. We shall use the notations of Theorem 3.1.

Theorem 3.2. 1. Let $X$ be a subset of the set of roots, $\Delta$, of a Kac–Moody Lie algebra $\mathfrak{g}$ such that 1) if $\alpha, \beta \in X$ and $\alpha + \beta$ is a root then $\alpha + \beta \in X$, and 2) if $\alpha \in X$ then $-\alpha \in X$. Let $\nu$ be an element in $\mathfrak{h}^*$, $C = \sum_{i,j} C_{i,j}\, dx_i \otimes dx_j$ a closed meromorphic 2-form on $\mathfrak{h}^*$. Then the function $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ defined by
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \sum_{\alpha\in X} \sum_{i=1}^{N_\alpha} \frac{1}{(\alpha, \lambda - \nu)}\, e^i_\alpha \otimes e^i_{-\alpha} \tag{3.5} $$
is a classical dynamical r-matrix with zero coupling constant.

2. If $\mathfrak{g}$ is a simple Lie algebra, then any classical dynamical r-matrix with zero coupling constant has this form.

3. Let $\mathfrak{g}$ be an arbitrary Kac–Moody Lie algebra. Let $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ be a classical dynamical r-matrix with zero coupling constant, $r = r_{\mathfrak{h}} + \sum_{\alpha\in\Delta} r_\alpha$ its weight decomposition. If the function $r_\alpha$ is not identically equal to zero for any simple positive root $\alpha$, then the function $r$ has the form indicated in (3.5) with $X = \Delta$.

Theorem 3.2 is proved in Sect. 3.5.

Example. Let $\mathfrak{g}$ be the Lie algebra of type $B_2$ with roots $(\pm 1, 0)$, $(0, \pm 1)$, $(\pm 1, \pm 1)$. Then the set of long roots $(\pm 1, \pm 1)$ gives an example of the set $X$.

Consider an r-matrix of type (3.5). Assume that the element $\nu \in \mathfrak{h}^*$ tends to infinity so that all terms of the matrix have a limit. Then the limiting function is an r-matrix of type (3.5) for a new set $X$. Notice that the r-matrix (3.5) corresponding to the example is not a limiting case of the r-matrix (3.5) with $X = \Delta$.

3.5. Proof of Theorem 3.2. First we prove Theorem 3.2 assuming that $\mathfrak{g}$ is a simple Lie algebra. In this case $\dim \mathfrak{g}_\alpha = 1$ for any root $\alpha$. For any positive root $\alpha$ fix elements $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ dual with respect to the bilinear form.

Let $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be a classical dynamical r-matrix with zero coupling constant. The zero weight condition and the unitarity condition imply that the r-matrix could be written in the form
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \sum_{\alpha\in\Delta} \varphi_\alpha(\lambda)\, e_\alpha \otimes e_{-\alpha}, $$
where $\varphi_\alpha$, $C_{i,j}$ are suitable scalar meromorphic functions such that $\varphi_{-\alpha}(\lambda) = -\varphi_\alpha(\lambda)$, and $C_{i,j}(\lambda) = -C_{j,i}(\lambda)$.

The CDYB equation is an equation in $\mathfrak{g}^{\otimes 3}$. The unitarity condition implies that the left hand side of the CDYB equation is skew-symmetric with respect to permutations of factors. This remark and the zero weight condition show that in order to solve the CDYB equation it is enough to solve its $\mathfrak{h} \otimes \mathfrak{h} \otimes \mathfrak{h}$-, $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$- and $\mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_\gamma$-parts, where $\alpha, \beta, \gamma \in \Delta$ and in the last case $\alpha + \beta + \gamma = 0$.
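Before the component-by-component analysis, it may help to see conditions 1) and 2) of Theorem 3.2 at work on the B2 example of Sect. 3.4. The sketch below is only an illustration: roots are stored as integer pairs in the coordinates of that example, and the helper name `is_closed` is hypothetical.

```python
from itertools import product

# Roots of B2 in the coordinates of the example in Sect. 3.4.
short = [(1, 0), (-1, 0), (0, 1), (0, -1)]
long_ = [(s, t) for s, t in product((1, -1), repeat=2)]
roots = set(short + long_)

def is_closed(X, roots):
    """Conditions 1) and 2) of Theorem 3.2 for a subset X of the root set."""
    X = set(X)
    # 2) X is stable under alpha -> -alpha
    neg_ok = all((-a, -b) in X for a, b in X)
    # 1) if alpha, beta are in X and alpha + beta is a root, then it lies in X
    add_ok = all((a + c, b + d) in X
                 for (a, b), (c, d) in product(X, repeat=2)
                 if (a + c, b + d) in roots)
    return neg_ok and add_ok

long_closed = is_closed(long_, roots)
short_closed = is_closed(short, roots)
```

With these definitions the long roots pass both conditions, since no sum of two long roots is a root. The short roots fail condition 1), because (1, 0) + (0, 1) = (1, 1) is a long root.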


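A basis in $\mathfrak{g}^{\otimes 3}$ is formed by the elements $x \otimes y \otimes z$, where $x, y, z$ run through $x_i$, $e_\alpha$, $i = 1, \dots, N$, $\alpha \in \Delta$. The $x_i \otimes x_j \otimes x_k$-part of the CDYB equation has the form
$$ \frac{\partial C_{j,k}}{\partial x_i} + \frac{\partial C_{k,i}}{\partial x_j} + \frac{\partial C_{i,j}}{\partial x_k} = 0 \tag{3.6} $$
and says that $\sum_{i,j=1}^{N} C_{i,j}(\lambda)\, dx_i \otimes dx_j$ is a closed differential form.

The $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$-part of the CDYB equation has the form
$$ \sum_k \frac{\partial \varphi_\alpha}{\partial x_k}\, x_k \otimes e_\alpha \otimes e_{-\alpha} + \varphi_\alpha^2\, h_\alpha \otimes e_\alpha \otimes e_{-\alpha} = 0, $$
where $h_\alpha = [e_\alpha, e_{-\alpha}]$. This equation could be written in the form
$$ d\varphi_\alpha + \varphi_\alpha^2\, dh_\alpha = 0. \tag{3.7} $$
Hence $\varphi_\alpha = 0$ or $\varphi_\alpha = (h_\alpha - \nu_\alpha)^{-1}$ for some $\nu_\alpha \in \mathbb{C}$. Here $h_\alpha$ is considered as a linear function on $\mathfrak{h}^*$, $h_\alpha(\lambda) = (\alpha, \lambda)$ for $\lambda \in \mathfrak{h}^*$.

The $\mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_{-\alpha-\beta}$-part of the CDYB equation has the form
$$ \varphi_\alpha \varphi_\beta - \varphi_\alpha \varphi_{\alpha+\beta} - \varphi_{\alpha+\beta} \varphi_\beta = 0. \tag{3.8} $$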

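Lemma 3.3. Let $X = \{\alpha \in \Delta \mid \varphi_\alpha \neq 0\}$. Then $X$ is closed with respect to multiplication by $-1$ and addition.

Proof. The set $X$ is closed with respect to multiplication by $-1$ because of the unitarity condition. If $\varphi_\alpha$ and $\varphi_\beta$ are different from zero, then $\varphi_{\alpha+\beta}(\varphi_\alpha + \varphi_\beta) = \varphi_\alpha \varphi_\beta$, and, hence, $\varphi_{\alpha+\beta}$ is different from zero.

Lemma 3.4.
$$ \nu_{\alpha+\beta} = \nu_\alpha + \nu_\beta. \tag{3.9} $$

Proof. Equation (3.8) implies
$$ \frac{1}{h_\alpha - \nu_\alpha} \cdot \frac{1}{h_\beta - \nu_\beta} = \Big( \frac{1}{h_\alpha - \nu_\alpha} + \frac{1}{h_\beta - \nu_\beta} \Big) \frac{1}{h_{\alpha+\beta} - \nu_{\alpha+\beta}}. $$
Since $h_{\alpha+\beta} = h_\alpha + h_\beta$, in order to cancel the last pole one needs (3.9).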
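The role of (3.9) can be seen in exact arithmetic. The sketch below evaluates the left-hand side of (3.8) for the rational solutions of (3.7) at sample values of $h_\alpha$ and $h_\beta$. The function names `phi` and `lhs_38` and the sample numbers are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def phi(h, nu):
    """Rational solution (h - nu)^(-1) of (3.7), evaluated at the number h = (alpha, lambda)."""
    return 1 / (Fraction(h) - Fraction(nu))

def lhs_38(h_a, h_b, nu_a, nu_b, nu_ab):
    """Left side of (3.8); the relation h_{alpha+beta} = h_alpha + h_beta is built in."""
    pa, pb = phi(h_a, nu_a), phi(h_b, nu_b)
    pab = phi(h_a + h_b, nu_ab)
    return pa * pb - pa * pab - pab * pb

# nu_{alpha+beta} = nu_alpha + nu_beta, as in (3.9)
additive = lhs_38(3, 5, 1, 2, 1 + 2)
# any other value of nu_{alpha+beta} leaves a nonzero remainder
shifted = lhs_38(3, 5, 1, 2, 4)
```

Here `additive` is exactly zero, whereas `shifted` is not, in line with Lemma 3.4.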

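Corollary 3.5. There is $\nu \in \mathfrak{h}^*$ such that $\nu_\alpha = (\alpha, \nu)$ for all $\alpha \in X$.

This finishes the proof of Theorem 3.2 for a simple Lie algebra.

Now we assume that $\mathfrak{g}$ is an arbitrary Kac–Moody algebra. Let $r : \mathfrak{h}^* \to \mathfrak{g}\hat{\otimes}\mathfrak{g}$ be a classical dynamical r-matrix with zero coupling constant. The zero weight condition and the unitarity condition imply that the r-matrix could be written in the form
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \sum_{\alpha\in\Delta} \sum_{i,j=1}^{N_\alpha} \varphi^{i,j}_\alpha(\lambda)\, e^i_\alpha \otimes e^j_{-\alpha}, $$
where $C_{i,j}(\lambda)$, $\varphi^{i,j}_\alpha(\lambda)$ are suitable scalar functions such that $C_{i,j} = -C_{j,i}$ and $\varphi^{j,i}_{-\alpha} = -\varphi^{i,j}_\alpha$.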


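The $\mathfrak{h} \otimes \mathfrak{h} \otimes \mathfrak{h}$-part of the CDYB equation is the same as for a simple Lie algebra and means that $\sum_{i,j=1}^{N} C_{i,j}(\lambda)\, dx_i \otimes dx_j$ is a closed differential form.

To analyze the $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$- and $\mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_{-\alpha-\beta}$-parts of the CDYB equation we shall introduce some useful linear operators $\varphi_\alpha(\lambda) : \mathfrak{g}_\alpha \to \mathfrak{g}_\alpha$, where $\alpha \in \Delta$ and $\lambda \in \mathfrak{h}^*$. Namely, for any $\alpha \in \Delta$ and $\lambda \in \mathfrak{h}^*$, we define $\varphi_\alpha(\lambda)$ by the formula
$$ \varphi_\alpha(\lambda) : e^j_\alpha \mapsto \sum_{i=1}^{N_\alpha} \varphi^{i,j}_\alpha(\lambda)\, e^i_\alpha. \tag{3.10} $$
Let $r_\alpha(\lambda)$ denote the $\mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$-part of the r-matrix,
$$ r_\alpha(\lambda) = \sum_{i,j=1}^{N_\alpha} \varphi^{i,j}_\alpha(\lambda)\, e^i_\alpha \otimes e^j_{-\alpha} = \sum_{j=1}^{N_\alpha} \varphi_\alpha(\lambda)\, e^j_\alpha \otimes e^j_{-\alpha}. $$

Lemma 3.6. The $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$-part of the CDYB equation has the form
$$ d\varphi_\alpha + \varphi_\alpha^2\, dh_\alpha = 0, \tag{3.11} $$
where the operator valued function $\varphi_\alpha : \mathfrak{h}^* \to \mathrm{End}(\mathfrak{g}_\alpha)$ is defined in (3.10).

Proof. The $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$-part of the CDYB equation has the form
$$ \sum_{i,j=1}^{N_\alpha} \sum_k \frac{\partial \varphi^{i,j}_\alpha}{\partial x_k}(\lambda)\, x_k \otimes e^i_\alpha \otimes e^j_{-\alpha} + [r^{12}_{-\alpha}(\lambda), r^{13}_\alpha(\lambda)] = 0. $$
To prove the lemma it is enough to show that
$$ [r^{12}_{-\alpha}(\lambda), r^{13}_\alpha(\lambda)] = \sum_{l=1}^{N_\alpha} h_\alpha \otimes \varphi_\alpha^2\, e^l_\alpha \otimes e^l_{-\alpha}. $$
We shall use the formula $[e^i_\alpha, e^j_{-\alpha}] = (e^i_\alpha, e^j_{-\alpha})\, h_\alpha = \delta_{i,j}\, h_\alpha$. Then
$$
\begin{aligned}
[r^{12}_{-\alpha}(\lambda), r^{13}_\alpha(\lambda)]
&= \Big[ \sum_{i,k} \varphi^{i,k}_{-\alpha}\, e^i_{-\alpha} \otimes e^k_\alpha \otimes 1,\; \sum_{j,l} \varphi^{j,l}_\alpha\, e^j_\alpha \otimes 1 \otimes e^l_{-\alpha} \Big] \\
&= \sum_{i,k,j,l} \varphi^{i,k}_{-\alpha} \varphi^{j,l}_\alpha\, [e^i_{-\alpha}, e^j_\alpha] \otimes e^k_\alpha \otimes e^l_{-\alpha} \\
&= \sum_{i,k,l} \varphi^{k,i}_\alpha \varphi^{i,l}_\alpha\, h_\alpha \otimes e^k_\alpha \otimes e^l_{-\alpha}
= \sum_l h_\alpha \otimes \varphi_\alpha^2\, e^l_\alpha \otimes e^l_{-\alpha}.
\end{aligned}
$$
The lemma is proved.

Introduce a linear map
$$ \Lambda : \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g} \to \mathbb{C}, \qquad x \otimes y \otimes z \mapsto (x, [y, z]). $$
Recall that the invariance of the bilinear form implies $(x, [y, z]) = ([x, y], z)$.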
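For a concrete instance of the map Λ and of the invariance identity, one can take g = sl2 with the trace form. The sketch below is a minimal illustration; the choice of the 2×2 representation, of the form (x, y) = tr(xy), and the helper names `bracket`, `form`, `Lam` are assumptions made for this example.

```python
from itertools import product

E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def form(A, B):
    """Trace form (A, B) = tr(AB), an invariant bilinear form on sl2."""
    P = mul(A, B)
    return P[0][0] + P[1][1]

def Lam(x, y, z):
    """The map x (x) y (x) z -> (x, [y, z]) on basis tensors."""
    return form(x, bracket(y, z))

# (x, [y, z]) = ([x, y], z) on all triples of basis elements
invariant = all(Lam(x, y, z) == form(bracket(x, y), z)
                for x, y, z in product((E, F, H), repeat=3))
```

Since both sides are trilinear, checking the identity on basis triples is enough in this example.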


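Lemma 3.7. Let $\alpha, \beta, \gamma \in \Delta$ be any roots such that $\alpha + \beta + \gamma = 0$. Then the $\mathfrak{g}_{-\alpha} \otimes \mathfrak{g}_{-\beta} \otimes \mathfrak{g}_{-\gamma}$-part of the CDYB equation is equivalent to the statement that the composition map
$$ \Lambda \circ (\varphi_\alpha \otimes \varphi_\beta \otimes 1 + \varphi_\alpha \otimes 1 \otimes \varphi_\gamma + 1 \otimes \varphi_\beta \otimes \varphi_\gamma) : \mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_\gamma \to \mathbb{C} \tag{3.12} $$
is the zero map.

Proof. The $\mathfrak{g}_{-\alpha} \otimes \mathfrak{g}_{-\beta} \otimes \mathfrak{g}_{-\gamma}$-part has the form
$$ [r^{13}_{-\alpha}(\lambda), r^{23}_{-\beta}(\lambda)] + [r^{12}_\beta(\lambda), r^{13}_\gamma(\lambda)] + [r^{12}_{-\alpha}(\lambda), r^{23}_\gamma(\lambda)] = 0. $$
Compute each of the terms.
$$
\begin{aligned}
[r^{13}_{-\alpha}(\lambda), r^{23}_{-\beta}(\lambda)]
&= \Big[ \sum_{i,j} \varphi^{i,j}_{-\alpha}\, e^i_{-\alpha} \otimes 1 \otimes e^j_\alpha,\; \sum_{k,l} \varphi^{k,l}_{-\beta}\, 1 \otimes e^k_{-\beta} \otimes e^l_\beta \Big] \\
&= \sum_{i,j,k,l} \varphi^{i,j}_{-\alpha} \varphi^{k,l}_{-\beta}\, e^i_{-\alpha} \otimes e^k_{-\beta} \otimes [e^j_\alpha, e^l_\beta] \\
&= \sum_{i,j,k,l,m} \varphi^{i,j}_{-\alpha} \varphi^{k,l}_{-\beta}\, e^i_{-\alpha} \otimes e^k_{-\beta} \otimes e^m_{-\gamma} \cdot (e^m_\gamma, [e^j_\alpha, e^l_\beta]).
\end{aligned}
$$
Using the equalities $\sum_j \varphi^{i,j}_{-\alpha}\, e^j_\alpha = -\varphi_\alpha e^i_\alpha$, $\sum_l \varphi^{k,l}_{-\beta}\, e^l_\beta = -\varphi_\beta e^k_\beta$, $(e^m_\gamma, [e^j_\alpha, e^l_\beta]) = (e^j_\alpha, [e^l_\beta, e^m_\gamma])$, we get
$$ [r^{13}_{-\alpha}(\lambda), r^{23}_{-\beta}(\lambda)] = \sum_{i,k,m} \Lambda(\varphi_\alpha e^i_\alpha \otimes \varphi_\beta e^k_\beta \otimes e^m_\gamma)\, e^i_{-\alpha} \otimes e^k_{-\beta} \otimes e^m_{-\gamma}. $$
Similarly
$$
\begin{aligned}
[r^{12}_\beta(\lambda), r^{13}_\gamma(\lambda)] &= \sum_{i,k,m} \Lambda(e^i_\alpha \otimes \varphi_\beta e^k_\beta \otimes \varphi_\gamma e^m_\gamma)\, e^i_{-\alpha} \otimes e^k_{-\beta} \otimes e^m_{-\gamma}, \\
[r^{12}_{-\alpha}(\lambda), r^{23}_\gamma(\lambda)] &= \sum_{i,k,m} \Lambda(\varphi_\alpha e^i_\alpha \otimes e^k_\beta \otimes \varphi_\gamma e^m_\gamma)\, e^i_{-\alpha} \otimes e^k_{-\beta} \otimes e^m_{-\gamma}.
\end{aligned}
$$
These formulae prove the lemma.

Lemmas 3.6 and 3.7 imply the first statement of Theorem 3.2 for an arbitrary Kac–Moody Lie algebra.

We shall show the third statement of Theorem 3.2 by induction on the complexity of the roots. Let $\alpha_1, \dots, \alpha_r$ be the simple positive roots of $\mathfrak{g}$ and $\alpha \in \Delta$ any root. Represent it as a linear combination of the simple roots, $\alpha = k_1 \alpha_1 + \dots + k_r \alpha_r$. The absolute value of the number $k_1 + \dots + k_r$ will be called the complexity of the root $\alpha$. Thus the only roots of complexity one are the simple roots.

Lemma 3.8. Let $\mathfrak{g}$ be a Kac–Moody Lie algebra. Let $r = r_{\mathfrak{h}} + \sum_{\alpha\in\Delta} r_\alpha$ be a classical dynamical r-matrix with zero coupling constant and with nonzero components $r_\alpha$ corresponding to the simple positive roots. Then the associated operator valued functions $\varphi_\alpha : \mathfrak{h}^* \to \mathrm{End}(\mathfrak{g}_\alpha)$ have the following form. There exists $\nu \in \mathfrak{h}^*$ such that
$$ \varphi_\alpha(\lambda) = \frac{1}{(\alpha, \lambda - \nu)}\, \mathrm{id} \tag{3.13} $$
for all $\alpha \in \Delta$ and $\lambda \in \mathfrak{h}^*$.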


Proof. For any simple root $\alpha$, $\dim \mathfrak{g}_\alpha = 1$. Then $\varphi_\alpha$ is a scalar function of the form $\varphi_\alpha(\lambda) = (h_\alpha - \nu_\alpha)^{-1}$ for some $\nu_\alpha \in \mathbb{C}$, see Lemma 3.6. Hence there exists $\nu \in \mathfrak{h}^*$ such that the operator $\varphi_\alpha(\lambda)$ has the form indicated in (3.13) for all simple roots $\alpha$. Our goal is to extend this formula to all roots.

Let $e_1, \dots, e_r, f_1, \dots, f_r$ be the Chevalley generators corresponding to the simple positive roots $\alpha_1, \dots, \alpha_r$. For any root $\alpha$, the space $\mathfrak{g}_\alpha$ is generated by commutators of the elements $e_1, \dots, e_r$, if $\alpha$ is positive, and by commutators of the elements $f_1, \dots, f_r$, if $\alpha$ is negative. Hence for any root $\alpha$ the space $\mathfrak{g}_\alpha$ is generated by commutators of the form $[x, y]$, where $x \in \mathfrak{g}_\beta$, $y \in \mathfrak{g}_\gamma$, $\alpha = \beta + \gamma$, and the complexity of $\beta$ and $\gamma$ is less than the complexity of $\alpha$.

Assume now that a root $\gamma$ has the following property. For any roots $\alpha$ and $\beta$ such that $\alpha + \beta + \gamma = 0$ and the complexity of $\alpha$ and $\beta$ is less than the complexity of $\gamma$, the operators $\varphi_\alpha$ and $\varphi_\beta$ have the form indicated in (3.13). Let us show that the operator $\varphi_\gamma$ has the same form. Formula (3.12) implies
$$ \Big( \frac{1}{(\alpha, \lambda - \nu)} + \frac{1}{(\beta, \lambda - \nu)} \Big) ([x, y], \varphi_\gamma(\lambda) z) = - \frac{1}{(\alpha, \lambda - \nu)\,(\beta, \lambda - \nu)}\, ([x, y], z). $$
This formula means that the operator $\varphi_\gamma(\lambda)$ is uniquely determined by the operators $\varphi_\alpha$ and $\varphi_\beta$ corresponding to the roots $\alpha$ and $\beta$ with complexity less than the complexity of $\gamma$. At the same time we see that if the operator $\varphi_\gamma$ has the form indicated in (3.13), then it satisfies both Eqs. (3.12) and (3.11). Lemma 3.8 and Theorem 3.2 are proved.

3.6. Proof of Theorem 3.1. First we prove Theorem 3.1 assuming that $\mathfrak{g}$ is a simple Lie algebra. As in the proof of Theorem 3.2, fix the dual generators $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ for all roots $\alpha \in \Delta$. Fix an orthonormal basis in $\mathfrak{h}$, $x_1, \dots, x_N$.

Let $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be a meromorphic map, $\varepsilon$ a nonzero complex number, $\Omega \in \mathfrak{g} \otimes \mathfrak{g}$ the Casimir operator of $\mathfrak{g}$ associated to the bilinear form, $\Omega = \sum_k x_k \otimes x_k + \sum_{\alpha\in\Delta} e_\alpha \otimes e_{-\alpha}$. Introduce a meromorphic map $s : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ by the formula $s(\lambda) = r(\lambda) - \varepsilon\Omega/2$.

Lemma 3.9. The map $r$ is a classical dynamical r-matrix with the coupling constant $\varepsilon$ if and only if the map $s$ satisfies the zero weight condition (3.1), the unitarity condition, $s^{12}(\lambda) + s^{21}(\lambda) = 0$, and the following analog of the CDYB equation,
$$ \mathrm{Alt}(ds) + [s^{12}, s^{13}] + [s^{12}, s^{23}] + [s^{13}, s^{23}] + \frac{\varepsilon^2}{4} \big( [\Omega^{12}, \Omega^{13}] + [\Omega^{12}, \Omega^{23}] + [\Omega^{13}, \Omega^{23}] \big) = 0. \tag{3.14} $$

Proof. The only thing that needs to be checked is the fact that the terms of the form $\varepsilon[\Omega^{(i,j)}, s^{(k,l)}]/2$ cancel in the CDYB equation. This can be verified by an easy direct calculation.

The zero weight condition and the unitarity condition imply that the matrix $s$ can be written in the form
$$ s(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \sum_{\alpha\in\Delta} \varphi_\alpha(\lambda)\, e_\alpha \otimes e_{-\alpha}, $$


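where $\varphi_\alpha$, $C_{i,j}$ are suitable scalar meromorphic functions such that $\varphi_{-\alpha}(\lambda) = -\varphi_\alpha(\lambda)$, and $C_{i,j}(\lambda) = -C_{j,i}(\lambda)$.

The CDYB Eq. (3.14) is an equation in $\mathfrak{g}^{\otimes 3}$. Its left hand side is invariant with respect to even permutations of factors. To solve the CDYB equation it suffices to solve its $\mathfrak{h} \otimes \mathfrak{h} \otimes \mathfrak{h}$-, $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$- and $\mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_\gamma$-parts, where $\alpha, \beta, \gamma \in \Delta$ and in the last case $\alpha + \beta + \gamma = 0$.

The $x_i \otimes x_j \otimes x_k$-part of the CDYB equation has the form indicated in (3.6) and says that $\sum_{i,j=1}^{N} C_{i,j}(\lambda)\, dx_i \otimes dx_j$ is a closed differential form.

The $\mathfrak{h} \otimes \mathfrak{g}_\alpha \otimes \mathfrak{g}_{-\alpha}$-part of the CDYB equation has the form
$$
\begin{aligned}
&\sum_k \frac{\partial \varphi_\alpha}{\partial x_k}\, x_k \otimes e_\alpha \otimes e_{-\alpha} + \varphi_\alpha^2\, h_\alpha \otimes e_\alpha \otimes e_{-\alpha} \\
&\quad + \frac{\varepsilon^2}{4} \Big( -h_\alpha \otimes e_\alpha \otimes e_{-\alpha} + \sum_k \big( x_k \otimes [x_k, e_\alpha] \otimes e_{-\alpha} + x_k \otimes e_\alpha \otimes [x_k, e_{-\alpha}] \big) \Big) = 0,
\end{aligned}
$$
where $h_\alpha = [e_\alpha, e_{-\alpha}]$. This equation can be written in the form
$$ d\varphi_\alpha + \Big( \varphi_\alpha^2 - \frac{\varepsilon^2}{4} \Big)\, dh_\alpha = 0. $$
Hence
$$ \varphi_\alpha(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\alpha - \nu_\alpha) \Big) $$
for some $\nu_\alpha \in \mathbb{C}$, or $\varphi_\alpha = \pm\varepsilon/2$. Here $h_\alpha$ is considered as a linear function on $\mathfrak{h}^*$, cf. (3.7).

Let $\alpha, \beta, \gamma \in \Delta$ be roots such that $\alpha + \beta + \gamma = 0$. The $\mathfrak{g}_\alpha \otimes \mathfrak{g}_\beta \otimes \mathfrak{g}_\gamma$-part of the CDYB Eq. (3.14) has the form
$$ \varphi_\alpha \varphi_\beta + \varphi_\alpha \varphi_\gamma + \varphi_\gamma \varphi_\beta + \frac{\varepsilon^2}{4} = 0. \tag{3.15} $$
If
$$ \varphi_\alpha(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\alpha - \nu_\alpha) \Big), \qquad \varphi_\beta(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\beta - \nu_\beta) \Big) $$
for some $\nu_\alpha, \nu_\beta \in \mathbb{C}$, then $\varphi_\gamma$ is not a constant, so
$$ \varphi_\gamma(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\gamma - \nu_\gamma) \Big) $$
for some $\nu_\gamma \in \mathbb{C}$. Starting from simple positive roots we conclude that all functions $\varphi_\alpha(\lambda)$ are not constant. Equation (3.15) also implies
$$ \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\gamma - \nu_\gamma) \Big) + \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(h_\alpha + h_\beta - \nu_\alpha - \nu_\beta) \Big) = 0. $$
Hence, there exists $\nu \in \mathfrak{h}^*$ such that $\nu_\alpha = (\alpha, \nu)$ and
$$ \varphi_\alpha(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(\alpha, \lambda - \nu) \Big) $$
for all roots $\alpha \in \Delta$. Theorem 3.1 is proved for a simple Lie algebra $\mathfrak{g}$.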
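The key identity behind (3.15) is the addition formula for the hyperbolic cotangent. The following sketch checks it in floating point at one sample point. The names `phi` and `lhs_315` and the sample values are illustrative assumptions; the numbers `h_a`, `h_b` stand for the values of $h_\alpha$, $h_\beta$ at a fixed λ.

```python
import math

def phi(eps, h, nu):
    """(eps/2) cotanh((eps/2)(h - nu)), the non-constant solution found above."""
    x = 0.5 * eps * (h - nu)
    return 0.5 * eps * math.cosh(x) / math.sinh(x)

def lhs_315(eps, h_a, h_b, nu_a, nu_b, nu_c):
    """Left side of (3.15) for gamma = -(alpha + beta), so h_gamma = -(h_alpha + h_beta)."""
    h_c = -(h_a + h_b)
    pa, pb, pc = phi(eps, h_a, nu_a), phi(eps, h_b, nu_b), phi(eps, h_c, nu_c)
    return pa * pb + pa * pc + pc * pb + eps ** 2 / 4

eps, h_a, h_b, nu_a, nu_b = 0.8, 1.1, 0.7, 0.2, -0.3
# nu_gamma = -(nu_alpha + nu_beta), i.e. nu_alpha = (alpha, nu) for a single nu
consistent = lhs_315(eps, h_a, h_b, nu_a, nu_b, -(nu_a + nu_b))
# a shifted nu_gamma breaks the identity
shifted = lhs_315(eps, h_a, h_b, nu_a, nu_b, -(nu_a + nu_b) + 0.5)
```

In this sketch `consistent` vanishes up to rounding and `shifted` does not, which matches the conclusion that the constants $\nu_\alpha$ must come from one element ν.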


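The generalization of the above proof to the case of a Kac–Moody algebra is word by word parallel to the generalization of the proof of Theorem 3.2 from the simple Lie algebra case to the Kac–Moody algebra case.

3.7. Classification of classical dynamical r-matrices with nonzero coupling constant, the simple Lie algebra case. Let $\mathfrak{g}$ be a simple Lie algebra, $\mathfrak{h}$ its Cartan subalgebra, $\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha\in\Delta} \mathfrak{g}_\alpha$ the root decomposition, $(\cdot,\cdot)$ an invariant nondegenerate bilinear form on $\mathfrak{g}$. For any positive root $\alpha \in \Delta(\mathfrak{h})$ fix basis elements $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ which are dual with respect to the bilinear form. Fix a basis in the Cartan subalgebra, $\{x_i\}$, orthonormal with respect to the bilinear form.

Let $\Delta = \Delta_+ \cup \Delta_-$ be a polarization of roots into positive and negative, $\Delta^s_+ \subset \Delta_+$ the set of simple positive roots. Fix a subset $X \subset \Delta^s_+$ of the set of positive simple roots. Fix a nonzero complex number $\varepsilon$ and an element $\mu \in \mathfrak{h}^*$. For any root $\alpha$ introduce a meromorphic function $\varphi_\alpha : \mathfrak{h}^* \to \mathbb{C}$ by the following rule. If a root $\alpha$ is a linear combination of simple roots from $X$, then we set
$$ \varphi_\alpha(\lambda) = \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(\alpha, \lambda - \mu) \Big). $$
Otherwise we set $\varphi_\alpha(\lambda) = \varepsilon/2$, if $\alpha$ is positive, and $\varphi_\alpha(\lambda) = -\varepsilon/2$, if $\alpha$ is negative. Let $C = \sum_{i,j} C_{i,j}\, dx_i \otimes dx_j$ be a closed meromorphic 2-form on $\mathfrak{h}^*$.

Theorem 3.10. 1. Introduce a function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ by the formula
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \frac{\varepsilon}{2}\,\Omega + \sum_{\alpha\in\Delta} \varphi_\alpha(\lambda)\, e_\alpha \otimes e_{-\alpha}, $$
where $C_{i,j}$ and $\varphi_\alpha$ are defined above. Then $r$ is a classical dynamical r-matrix with nonzero coupling constant $\varepsilon$.

2. Any classical dynamical r-matrix, $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$, with nonzero coupling constant has this form.

The proof of Theorem 3.10 is based on the following fact.

Theorem 3.11. Let $Y \subset \Delta$ be a subset of the set of roots with two properties.
A. If $\alpha, \beta \in Y$ and $\alpha + \beta \in \Delta$, then $\alpha + \beta \in Y$.
B. If $\alpha$ is an element of $Y$, then $-\alpha$ is not an element of $Y$.
Then there exists a polarization $\Delta = \Delta_+ \cup \Delta_-$ such that $Y \subset \Delta_+$.

Proof (of Theorem 3.11). Consider
$$ \mathfrak{n} = \bigoplus_{\alpha\in Y} \mathbb{C}\, e_\alpha, \qquad \mathfrak{m} = \mathfrak{h} \oplus \bigoplus_{\alpha\in Y} \mathbb{C}\, e_\alpha. $$
Then $\mathfrak{n}$, $\mathfrak{m}$ are Lie subalgebras of $\mathfrak{g}$.

Lemma 3.12. 1. $[\mathfrak{m}, \mathfrak{m}] = \mathfrak{n}$.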


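2. The Killing form $B$ of $\mathfrak{m}$ vanishes on $\mathfrak{n}$.

Proof (of Lemma 3.12). The first statement follows from (3.11.B). The Killing form $B$ of $\mathfrak{m}$ is defined by $B(x, y) = \mathrm{tr}|_{\mathfrak{m}}(\mathrm{ad}\, x|_{\mathfrak{m}} \cdot \mathrm{ad}\, y|_{\mathfrak{m}})$. Now the second statement follows from (3.11.B).

According to Theorem 2.1.2 in [GG], conditions 1 and 2 for a finite-dimensional Lie algebra $\mathfrak{m}$ imply that $\mathfrak{m}$ is solvable. Thus, $\mathfrak{m}$ is a solvable subalgebra of $\mathfrak{g}$. This, in particular, means that $\mathfrak{m}$ is contained in a Borel subalgebra $\mathfrak{b}$. Since $\mathfrak{h} \subset \mathfrak{b}$, the Borel subalgebra $\mathfrak{b}$ defines a polarization of roots $\Delta = \Delta_+ \cup \Delta_-$ such that $Y \subset \Delta_+$. Theorem 3.11 is proved.

Proof (of Theorem 3.10). To prove the first statement of Theorem 3.10 it is enough to check that for any roots $\alpha, \beta, \gamma \in \Delta$ such that $\alpha + \beta + \gamma = 0$, the functions $\varphi_\alpha, \varphi_\beta, \varphi_\gamma$ satisfy Eq. (3.15). This could be easily done by direct verification.

Now we prove the second statement of Theorem 3.10. Let $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be a classical dynamical r-matrix with nonzero coupling constant $\varepsilon$. According to Sect. 3.6, the r-matrix has the form
$$ r(\lambda) = \sum_{i,j=1}^{N} C_{i,j}(\lambda)\, x_i \otimes x_j + \frac{\varepsilon}{2}\,\Omega + \sum_{\alpha\in\Delta} \varphi_\alpha(\lambda)\, e_\alpha \otimes e_{-\alpha}, $$
where $\sum_{i,j} C_{i,j}\, dx_i \otimes dx_j$ is a closed meromorphic 2-form on $\mathfrak{h}^*$, and the functions $\varphi_\alpha$ are scalar meromorphic functions such that $\varphi_\alpha = \frac{\varepsilon}{2}\, \mathrm{cotanh}\big( \frac{\varepsilon}{2}((\alpha, \lambda) - \mu_\alpha) \big)$ for a suitable constant $\mu_\alpha$ or $\varphi_\alpha = \pm\varepsilon/2$. Moreover, for any roots $\alpha, \beta, \gamma \in \Delta$ such that $\alpha + \beta + \gamma = 0$, the functions $\varphi_\alpha, \varphi_\beta, \varphi_\gamma$ satisfy Eq. (3.15).

Let $Y \subset \Delta$ be the set of all roots $\alpha$ such that $\varphi_\alpha = \varepsilon/2$. Equation (3.15) and the unitarity condition easily imply that the set $Y$ has properties 3.11.A - 3.11.B. By Theorem 3.11 there exists a polarization of roots $\Delta = \Delta_+ \cup \Delta_-$ such that $Y \subset \Delta_+$. Introduce two sets $X$ and $Z$ by
$$ X = \Delta^s_+ - \Delta^s_+ \cap Y, \qquad Z = \Delta_+ - Y. $$

Lemma 3.13. $Z$ is the span of $X$ in $\Delta_+$, i.e. $Z = \mathbb{Z}_{\geq 0}[X] \cap \Delta_+$.

Proof. If $\alpha, \beta, \alpha + \beta \in \Delta_+$, $\varphi_\alpha \neq \pm\varepsilon/2$ and $\varphi_\beta \neq \pm\varepsilon/2$, then Eq. (3.15) implies that $\varphi_{\alpha+\beta} \neq \pm\varepsilon/2$. This statement implies the inclusion $\mathbb{Z}_{\geq 0}[X] \subset Z$. If $\alpha, \beta, \alpha + \beta \in \Delta_+$ and $\varphi_\alpha = \varepsilon/2$, then Eq. (3.15) implies that $\varphi_{\alpha+\beta} = \varepsilon/2$ (since $\varphi_\beta$ could not be equal to $-\varepsilon/2$). This statement implies the inclusion $Z \subset \mathbb{Z}_{\geq 0}[X]$. The lemma is proved.

For $\alpha \in Z$ the functions $\varphi_\alpha$ have the form $\varphi_\alpha = \frac{\varepsilon}{2}\, \mathrm{cotanh}\big( \frac{\varepsilon}{2}((\alpha, \lambda) - \mu_\alpha) \big)$, where $\mu_\alpha$ are suitable numbers. Moreover, if $\alpha, \beta, \alpha + \beta \in Z$, then the corresponding constants $\mu_\alpha, \mu_\beta, \mu_{\alpha+\beta}$ satisfy the equation $\mu_\alpha + \mu_\beta = \mu_{\alpha+\beta}$. Let $\mu \in \mathfrak{h}^*$ be any element such that $\mu_\alpha = (\alpha, \mu)$ for all $\alpha \in X$. Then for all $\alpha \in Z$, we have $\mu_\alpha = (\alpha, \mu)$.

Now we may conclude that the r-matrix $r$ has the form indicated in Theorem 3.10 and is associated to the polarization $\Delta = \Delta_+ \cup \Delta_-$, the set $X$ and the element $\mu$ constructed above. Theorem 3.10 is proved.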
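The "direct verification" in the first part of Theorem 3.10 can be illustrated for the root system A2 (that is, g = sl3). The sketch below is written under explicit assumptions: roots are stored by their coefficients in the simple roots, `X` is a set of indices of simple roots, and `c[i]` stands for the number $(\alpha_i, \lambda - \mu)$ at one fixed λ. The names `phi` and `max_residual` are hypothetical.

```python
import math
from itertools import product

# Roots of A2, written by their coefficients in the simple roots alpha_1, alpha_2.
roots = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]

def phi(root, X, eps, c):
    """The function phi_alpha of Sect. 3.7 at one point; c[i] = (alpha_i, lambda - mu)."""
    in_span = all(k == 0 for i, k in enumerate(root) if i not in X)
    if in_span:
        x = 0.5 * eps * sum(k * ci for k, ci in zip(root, c))
        return 0.5 * eps / math.tanh(x)
    positive = all(k >= 0 for k in root)
    return eps / 2 if positive else -eps / 2

def max_residual(X, eps, c):
    """Largest |left side of (3.15)| over all triples of roots with zero sum."""
    worst = 0.0
    for a, b in product(roots, repeat=2):
        g = (-a[0] - b[0], -a[1] - b[1])
        if g in roots:
            pa, pb, pg = (phi(r, X, eps, c) for r in (a, b, g))
            worst = max(worst, abs(pa * pb + pa * pg + pg * pb + eps ** 2 / 4))
    return worst

eps, c = 0.7, (0.9, 0.4)
residuals = {name: max_residual(X, eps, c)
             for name, X in [("empty", set()), ("alpha_1", {0}),
                             ("alpha_2", {1}), ("both", {0, 1})]}
```

For each of the four subsets of simple roots the residual vanishes up to rounding in this sketch. The case of the empty subset uses only the constants ±ε/2, and the case of both simple roots reduces to the cotanh identity used in Sect. 3.6.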


Now we show that each of the r-matrices indicated in Theorem 3.10 is a limiting case of the r-matrix (3.4) (unlike the example at the end of Sect. 3.4). Namely, let $\Delta = \Delta_+ \cup \Delta_-$ be a polarization, $X \subset \Delta^s_+$ a subset of positive simple roots, $\mu$ an element of $\mathfrak{h}^*$, $\varepsilon$ a nonzero complex number. Set
$$ \nu(t) = \mu + t \sum_{i\in X} \omega_i, $$
where $\omega_i$ are fundamental weights of $\mathfrak{g}$. Consider the r-matrix $r_t(\lambda)$ defined by (3.4) with parameter $\nu$ equal to $\nu(t)$. Then the limit of $r_t(\lambda)$ as $t$ tends to infinity is equal to the r-matrix of Theorem 3.10.

Example. For any polarization, the constant matrix
$$ r = \frac{1}{2} \sum_i x_i \otimes x_i + \sum_{\alpha\in\Delta_+} e_\alpha \otimes e_{-\alpha} \tag{3.16} $$
is a solution to the classical Yang–Baxter equation. In particular, it is a solution to the CDYB equation with coupling constant 1. This solution corresponds to $X = \Delta^s_+$.

Remark. Consider the r-matrix (3.4) with $C_{i,j} = 0$ and $\nu = 0$,
$$ r(\lambda) = \frac{\varepsilon}{2}\,\Omega + \sum_{\alpha\in\Delta} \frac{\varepsilon}{2}\, \mathrm{cotanh}\Big( \frac{\varepsilon}{2}(\alpha, \lambda) \Big)\, e_\alpha \otimes e_{-\alpha}. $$
Let $\Gamma \subset \mathfrak{h}^*$ be an alcove of the Lie algebra $\mathfrak{g}$ and $\Delta = \Delta_+ \cup \Delta_-$ the corresponding polarization. If $\lambda$ tends to infinity inside the alcove $\Gamma$ in a generic direction, then the r-matrix $r(\lambda)$ has a limit and this limit is given by (3.16). Thus the r-matrix $r(\lambda)$ extrapolates different solutions of the classical Yang–Baxter equation of type (3.16), labeled by different polarizations.

3.8. Classical dynamical r-matrices associated with a pair of Lie algebras. Let $\mathfrak{g}$ be a simple Lie algebra, $\mathfrak{h}$ its Cartan subalgebra, $\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha\in\Delta} \mathfrak{g}_\alpha$ the root decomposition, $(\cdot,\cdot)$ an invariant nondegenerate bilinear form on $\mathfrak{g}$ and $\Omega \in \mathfrak{g} \otimes \mathfrak{g}$ the associated Casimir operator.

Let $\mathfrak{l}$ be a Lie subalgebra of $\mathfrak{g}$ containing $\mathfrak{h}$. Assume that $\mathfrak{l}$ is reductive. This condition is equivalent to the condition that there is a subset $\Delta(\mathfrak{l})_+$ of the set $\Delta_+$ of positive roots such that $\mathfrak{l} = \mathfrak{h} \oplus \bigoplus_{\alpha\in\Delta(\mathfrak{l})_+} (\mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha})$. Let $H \subset L \subset G$ be the corresponding complex Lie groups.

A meromorphic function $r : \mathfrak{l}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is called a classical dynamical r-matrix associated with the pair $\mathfrak{l} \subset \mathfrak{g}$ if it satisfies the following three conditions:

1. The invariance condition. The meromorphic function $r$ is $\mathrm{Ad}\, L$-invariant, $r(x\lambda x^{-1}) = \mathrm{Ad}\, x\,( r(\lambda) )$ for any $\lambda \in \mathfrak{l}^*$ and $x \in L$.


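2. The generalized unitarity,
$$ r^{12}(\lambda) + r^{21}(\lambda) = \varepsilon\,\Omega \tag{3.17} $$
for some constant $\varepsilon \in \mathbb{C}$ and all $\lambda$.

3. The classical dynamical Yang–Baxter equation, CDYBE,
$$ \mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0. \tag{3.18} $$

The differential of the r-matrix is considered in (3.18) as a meromorphic function
$$ dr : \mathfrak{l}^* \to \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}, \qquad \lambda \mapsto \sum_i y_i \otimes \frac{\partial r^{23}}{\partial y_i}(\lambda), $$
where $\{y_i\}$ is any basis in $\mathfrak{l}$. As before we denote by $\mathrm{Alt}(dr)$ the following symmetrization of $dr$,
$$ \mathrm{Alt}(dr) = \sum_i y_i^{(1)} \frac{\partial r^{23}}{\partial y_i} + \sum_i y_i^{(2)} \frac{\partial r^{31}}{\partial y_i} + \sum_i y_i^{(3)} \frac{\partial r^{12}}{\partial y_i}. $$

It turns out that the classification of the classical dynamical r-matrices associated with a pair $\mathfrak{l} \subset \mathfrak{g}$ can be reduced to the classification of the classical dynamical r-matrices associated with the pair $\mathfrak{h} \subset \mathfrak{g}$. Namely, let us identify $\mathfrak{l}^*$ with $\mathfrak{h}^* \oplus \bigoplus_{\alpha\in\Delta(\mathfrak{l})_+} (\mathfrak{g}^*_\alpha \oplus \mathfrak{g}^*_{-\alpha})$. Let $r : \mathfrak{l}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be a classical dynamical r-matrix associated with a pair $\mathfrak{l} \subset \mathfrak{g}$. First notice that if $\lambda \in \mathfrak{l}^*$ is a semisimple element, then there exists $x \in L$ such that $x\lambda x^{-1} \in \mathfrak{h}^*$. Since semisimple elements are dense and since the r-matrix satisfies the invariance condition, the function $r$ is completely determined by its restriction to the dual to the Cartan subalgebra, $r|_{\mathfrak{h}^*}$.

For any $\alpha \in \Delta(\mathfrak{l})_+$ fix basis elements $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ which are dual with respect to the invariant bilinear form. Define a function $\rho : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ by the formula
$$ \rho(\lambda) = \sum_{\alpha\in\Delta(\mathfrak{l})_+} \frac{e_\alpha \otimes e_{-\alpha} - e_{-\alpha} \otimes e_\alpha}{(\alpha, \lambda)}. \tag{3.19} $$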

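Theorem 3.14. A function $r : \mathfrak{l}^* \to \mathfrak{g} \otimes \mathfrak{g}$ satisfying the invariance condition is a classical dynamical r-matrix associated with the pair $\mathfrak{l} \subset \mathfrak{g}$ if and only if the function $r|_{\mathfrak{h}^*} + \rho : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is a classical dynamical r-matrix associated with the pair $\mathfrak{h} \subset \mathfrak{g}$.

Theorem 3.14 is proved in Sect. 3.9.

3.9. Proof of Theorem 3.14. Let $\{x_i\}$ be any basis in $\mathfrak{h}$. The elements $\{x_i\}$, and $e_\alpha, e_{-\alpha}$, $\alpha \in \Delta(\mathfrak{l})_+$, form a basis in $\mathfrak{l}$. Then
$$ dr(\lambda) = \sum_i x_i \otimes \frac{\partial r^{23}}{\partial x_i}(\lambda) + \sum_{\alpha\in\Delta(\mathfrak{l})_+} \Big( e_\alpha \otimes \frac{\partial r^{23}}{\partial e_\alpha}(\lambda) + e_{-\alpha} \otimes \frac{\partial r^{23}}{\partial e_{-\alpha}}(\lambda) \Big) \tag{3.20} $$
for any $\lambda \in \mathfrak{l}^*$.

Lemma 3.15. Let $\lambda \in \mathfrak{h}^*$. Then the second sum in (3.20) is equal to $[\rho^{12}(\lambda) + \rho^{13}(\lambda), r^{23}(\lambda)]$.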


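Proof. Let $\{x^*_i\}$, and $e^*_\alpha, e^*_{-\alpha}$, $\alpha \in \Delta(\mathfrak{l})_+$, be the basis in $\mathfrak{l}^*$ dual to the basis $\{x_i\}$, $e_\alpha, e_{-\alpha}$, $\alpha \in \Delta(\mathfrak{l})_+$. The Lie algebra $\mathfrak{l}$ acts on $\mathfrak{l}^*$ by $\mathrm{ad}^*$. Compute this action on elements of $\mathfrak{h}^* \subset \mathfrak{l}^*$. For $\lambda \in \mathfrak{h}^*$ we have $((\mathrm{ad}^* e_\alpha)\lambda)(y) = -\lambda([e_\alpha, y])$. Hence $(\mathrm{ad}^* e_\alpha)\lambda = -\lambda(h_\alpha)\, e^*_{-\alpha} = -(\alpha, \lambda)\, e^*_{-\alpha}$. Similarly $(\mathrm{ad}^* e_{-\alpha})\lambda = (\alpha, \lambda)\, e^*_\alpha$ and $(\mathrm{ad}^* x_j)\lambda = 0$.

Now for any $\lambda \in \mathfrak{h}^*$ we have
$$ \frac{\partial r}{\partial e_\alpha}(\lambda) = \frac{1}{(\alpha, \lambda)}\, [e_{-\alpha} \otimes 1 + 1 \otimes e_{-\alpha},\; r(\lambda)]. \tag{3.21} $$
Indeed,
$$ r(e^{t e_{-\alpha}} \lambda\, e^{-t e_{-\alpha}}) = r(\lambda) + t\,(\alpha, \lambda)\, \frac{\partial r}{\partial e_\alpha}(\lambda) + O(t^2), $$
and the invariance condition gives
$$ r(e^{t e_{-\alpha}} \lambda\, e^{-t e_{-\alpha}}) = (e^{t e_{-\alpha}} \otimes e^{t e_{-\alpha}})\, r(\lambda)\, (e^{-t e_{-\alpha}} \otimes e^{-t e_{-\alpha}}) = r(\lambda) + t\, [e_{-\alpha} \otimes 1 + 1 \otimes e_{-\alpha},\; r(\lambda)] + O(t^2), $$
which implies (3.21). Similarly
$$ \frac{\partial r}{\partial e_{-\alpha}}(\lambda) = \frac{-1}{(\alpha, \lambda)}\, [e_\alpha \otimes 1 + 1 \otimes e_\alpha,\; r(\lambda)]. \tag{3.22} $$
Formulae (3.21) and (3.22) prove the lemma.

Denote the first sum in (3.20) by $d_{\mathfrak{h}} r$, then for $\lambda \in \mathfrak{h}^*$ we have
$$ dr(\lambda) = d_{\mathfrak{h}} r(\lambda) + [\rho^{12}(\lambda) + \rho^{13}(\lambda), r^{23}(\lambda)]. $$

Now we finish the proof of Theorem 3.14. Let $r : \mathfrak{l}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be a function satisfying the invariance condition. Restrict this function to $\mathfrak{h}^*$. The invariance condition implies that the restriction satisfies the zero weight condition (3.1). The restriction also satisfies the generalized unitarity condition (3.17). Introduce a function $\tilde{r} : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ by $r|_{\mathfrak{h}^*}(\lambda) = \tilde{r}(\lambda) - \rho(\lambda)$. Then the new function $\tilde{r}(\lambda)$ satisfies the zero weight condition (3.1) and the generalized unitarity condition (3.17).

Lemma 3.16. The function $r$ satisfies the CDYB Eq. (3.18) on $\mathfrak{h}^*$ if and only if the function $\tilde{r}$ satisfies the CDYB Eq. (3.3).

Lemma 3.16 implies Theorem 3.14.

Proof (of Lemma 3.16). The restriction of the CDYB equation on $\mathfrak{h}^*$ takes the form
$$
\begin{aligned}
&\mathrm{Alt}(d_{\mathfrak{h}} \tilde{r}) - \mathrm{Alt}(d_{\mathfrak{h}} \rho) - [\rho^{12} + \rho^{13}, \rho^{23}] - [\rho^{21} + \rho^{23}, \rho^{31}] \\
&\quad - [\rho^{31} + \rho^{32}, \rho^{12}] + [\rho^{12}, \rho^{13}] + [\rho^{12}, \rho^{23}] + [\rho^{13}, \rho^{23}] \\
&\quad - [\rho^{12} + \rho^{13}, \tilde{r}^{23}] - [\rho^{21} + \rho^{23}, \tilde{r}^{31}] - [\rho^{31} + \rho^{32}, \tilde{r}^{12}] \\
&\quad - [\tilde{r}^{12}, \rho^{13}] - [\rho^{12}, \tilde{r}^{13}] - [\tilde{r}^{12}, \rho^{23}] - [\rho^{12}, \tilde{r}^{23}] - [\tilde{r}^{13}, \rho^{23}] \\
&\quad - [\rho^{13}, \tilde{r}^{23}] + [\tilde{r}^{12}, \tilde{r}^{13}] + [\tilde{r}^{12}, \tilde{r}^{23}] + [\tilde{r}^{13}, \tilde{r}^{23}] = 0.
\end{aligned}
\tag{3.23}
$$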


Using the identity $\rho^{12}(\lambda) + \rho^{21}(\lambda) = 0$ we conclude that the terms containing only the function $\rho$ take the form of the CDYB Eq. (3.3). By Theorem 3.2 we know that the function $\rho$ satisfies Eq. (3.3), and hence the $\rho$-terms in (3.23) are canceled out. Using in addition the generalized unitarity condition for $\tilde{r}$ we see that the $[\rho, \tilde{r}]$-terms in (3.23) are canceled out too. The remaining part of (3.23) is the CDYB Eq. (3.3) for the function $\tilde{r}$. Lemma 3.16 and Theorem 3.14 are proved.

4. Classification of Classical Dynamical r-Matrices with Spectral Parameter

4.1. Classical dynamical r-matrices with spectral parameter. Let $\mathfrak{g}$ be a simple Lie algebra, $\mathfrak{h}$ its Cartan subalgebra, $\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha\in\Delta} \mathfrak{g}_\alpha$ the root decomposition, $(\cdot,\cdot)$ an invariant nondegenerate bilinear form on $\mathfrak{g}$ and $\Omega \in \mathfrak{g} \otimes \mathfrak{g}$ the associated Casimir operator. For any positive root $\alpha \in \Delta(\mathfrak{h})$ fix basis elements $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ which are dual with respect to the bilinear form. Fix a basis in the Cartan subalgebra, $\{x_i\}$, orthonormal with respect to the bilinear form.

Let $E$ be a neighborhood of 0 in $\mathbb{C}$, $0 \in E \subset \mathbb{C}$. A meromorphic function $r : \mathfrak{h}^* \times E \to \mathfrak{g} \otimes \mathfrak{g}$ is called a classical dynamical r-matrix with spectral parameter associated with the pair $\mathfrak{h} \subset \mathfrak{g}$ if it satisfies the following four conditions:

1. The zero weight condition,
$$ [h \otimes 1 + 1 \otimes h,\; r(\lambda, z)] = 0 \tag{4.1} $$
for all $\lambda \in \mathfrak{h}^*$, $z \in E$ and $h \in \mathfrak{h}$.

2. The generalized unitarity,
$$ r^{12}(\lambda, z) + r^{21}(\lambda, -z) = 0 \tag{4.2} $$
for all $\lambda \in \mathfrak{h}^*$ and $z \in E$.

3. The residue condition
$$ \mathrm{Res}_{z=0}\, r(\lambda, z) = \varepsilon\,\Omega \tag{4.3} $$
for some constant $\varepsilon \in \mathbb{C}$.

4. The classical dynamical Yang–Baxter equation, CDYBE,
$$ \mathrm{Alt}(d_{\mathfrak{h}} r) + [r^{12}(\lambda, z_{1,2}), r^{13}(\lambda, z_{1,3})] + [r^{12}(\lambda, z_{1,2}), r^{23}(\lambda, z_{2,3})] + [r^{13}(\lambda, z_{1,3}), r^{23}(\lambda, z_{2,3})] = 0, \tag{4.4} $$
where $z_{i,j} = z_i - z_j$.

Classical Dynamical Yang–Baxter Equation


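In (4.4) the differential of the r-matrix is considered with respect to the $\mathfrak h$-variables,
$$d_{\mathfrak h} r : \mathfrak h^* \times E \to \mathfrak g \otimes \mathfrak g \otimes \mathfrak g, \qquad (\lambda, z) \mapsto \sum_i x_i \otimes \frac{\partial r^{23}}{\partial x_i}(\lambda, z).$$
In (4.4) we denote by $\mathrm{Alt}(d_{\mathfrak h} r)$ the following symmetrization of $d_{\mathfrak h} r$,
$$\mathrm{Alt}(d_{\mathfrak h} r) = \sum_i x_i^{(1)} \frac{\partial r^{23}}{\partial x_i}(\lambda, z_{2,3}) + \sum_i x_i^{(2)} \frac{\partial r^{31}}{\partial x_i}(\lambda, z_{3,1}) + \sum_i x_i^{(3)} \frac{\partial r^{12}}{\partial x_i}(\lambda, z_{1,2}).$$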

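The variable $z$ is called the spectral parameter. The number $\varepsilon$ in (4.3) is called the coupling constant.

We classify the germs of classical dynamical r-matrices with spectral parameter at the subset $\mathfrak h^* \times 0 \subset \mathfrak h^* \times \mathbb C$. We assume that an r-matrix has a Laurent power series expansion of the form $r(\lambda, z) = \sum_m r_m(\lambda) z^m$, where $r_m(\lambda)$ are meromorphic functions on $\mathfrak h^*$. We assume that the Laurent expansion is convergent to a meromorphic function of $\lambda$ and $z$ in a punctured neighborhood of $\mathfrak h^* \times 0$ in $\mathfrak h^* \times \mathbb C$. Any function $r(\lambda, z)$ with these properties will be called a function with Laurent expansion.

4.2. Gauge transformations. In this subsection we introduce four transformations of maps $r : \mathfrak h^* \times \mathbb C \to \mathfrak g \otimes \mathfrak g$ called the gauge transformations. We assume that the map satisfies the zero weight condition and the generalized unitarity condition and, therefore, has the form
$$r(\lambda, z) = \sum_{i,j=1}^N S_{i,j}(\lambda, z)\, x_i \otimes x_j + \sum_{\alpha \in \Delta} \varphi_\alpha(\lambda, z)\, e_\alpha \otimes e_{-\alpha}, \tag{4.5}$$
where $\varphi_\alpha$, $S_{i,j}$ are suitable scalar meromorphic functions such that $\varphi_{-\alpha}(\lambda, -z) = -\varphi_\alpha(\lambda, z)$ and $S_{i,j}(\lambda, -z) = -S_{j,i}(\lambda, z)$.

1. Let $C = \sum_{i,j} C_{i,j}(\lambda)\, dx_i \otimes dx_j$ be a closed meromorphic 2-form. Set
$$r(\lambda, z) \mapsto r(\lambda, z) + \sum_{i,j=1}^N C_{i,j}(\lambda)\, x_i \otimes x_j.$$

2. For a vector $v \in \mathfrak h^*$ and a function $f$ on $\mathfrak h^*$, we denote by $L_v f$ the new function on $\mathfrak h^*$ which is the derivative of $f$ along the constant vector field defined by $v$. For a holomorphic function $\psi : \mathfrak h^* \to \mathbb C$, set
$$r(\lambda, z) \mapsto \sum_{i,j=1}^N \Big(S_{i,j}(\lambda, z) + z \frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda)\Big) x_i \otimes x_j + \sum_{\alpha \in \Delta} \varphi_\alpha(\lambda, z)\, e^{z L_\alpha \psi(\lambda)}\, e_\alpha \otimes e_{-\alpha}.$$

3. For $\nu \in \mathfrak h^*$, set
$$r(\lambda, z) \mapsto \sum_{i,j=1}^N S_{i,j}(\lambda - \nu, z)\, x_i \otimes x_j + \sum_{\alpha \in \Delta} \varphi_\alpha(\lambda - \nu, z)\, e_\alpha \otimes e_{-\alpha}.$$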


4. For nonzero complex numbers $a$, $b$, set $r(\lambda, z) \mapsto a\, r(a\lambda, bz)$.

Notice that the first three transformations do not change the residue of $r(\lambda, z)$ at $z = 0$ and the last transformation multiplies it by $a/b$.

Theorem 4.1. Any gauge transformation transforms an r-matrix with spectral parameter to an r-matrix with spectral parameter.

Theorem 4.1 is proved in Sect. 4.5. Two maps $r(\lambda, z)$ and $r'(\lambda, z)$ will be called equivalent if one of them can be transformed into the other by a sequence of gauge transformations.

4.3. Elliptic, trigonometric and rational r-matrices with spectral parameter. In this section we give examples of r-matrices with spectral parameter, and formulate a theorem that any r-matrix with nonzero coupling constant is equivalent to one of the examples. In order to describe r-matrices we use theta functions. Let $\tau$ be a complex number such that $\mathrm{Im}\, \tau > 0$. Let
$$\theta_1(z, \tau) = -\sum_{j=-\infty}^{\infty} e^{\pi i (j + \frac12)^2 \tau + 2\pi i (j + \frac12)(z + \frac12)}$$
be the Jacobi theta function. Following [FW] introduce the functions
$$\sigma_w(z, \tau) = \frac{\theta_1(w - z, \tau)\, \theta_1'(0, \tau)}{\theta_1(w, \tau)\, \theta_1(z, \tau)}, \qquad \rho(z, \tau) = \frac{\theta_1'(z, \tau)}{\theta_1(z, \tau)}, \tag{4.6}$$
where $'$ means the derivative with respect to the first argument. Notice that $\sigma_w(z, \tau) = z^{-1} + O(1)$, $\rho(z, \tau) = z^{-1} + O(z)$ as $z \to 0$, and $\sigma_{-w}(-z, \tau) = -\sigma_w(z, \tau)$, $\rho(-z, \tau) = -\rho(z, \tau)$.

Example of an elliptic r-matrix.
$$r(\lambda, z, \tau) = \rho(z, \tau) \sum_{i=1}^N x_i \otimes x_i + \sum_{\alpha \in \Delta} \sigma_{-(\alpha, \lambda)}(z, \tau)\, e_\alpha \otimes e_{-\alpha}. \tag{4.7}$$
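The properties of $\sigma_w$ and $\rho$ quoted after (4.6) can be checked numerically. The following is a minimal sketch, assuming a truncation of the series for $\theta_1$ at a few dozen terms is accurate enough for moderate $\mathrm{Im}\,\tau$; the function names, the truncation parameter `terms`, and the sample points are illustrative choices, not part of the original text.

```python
import cmath

def theta1(z, tau, terms=12):
    """Truncated series for theta_1(z, tau) in the normalization of (4.6)."""
    total = 0j
    # The range is symmetric under j -> -j-1, which keeps the truncation odd in z.
    for j in range(-terms - 1, terms + 1):
        h = j + 0.5
        total += cmath.exp(cmath.pi * 1j * h * h * tau + 2j * cmath.pi * h * (z + 0.5))
    return -total

def theta1_prime(z, tau, terms=12):
    """Termwise derivative of the truncated series with respect to z."""
    total = 0j
    for j in range(-terms - 1, terms + 1):
        h = j + 0.5
        total += 2j * cmath.pi * h * cmath.exp(
            cmath.pi * 1j * h * h * tau + 2j * cmath.pi * h * (z + 0.5))
    return -total

def sigma(w, z, tau):
    """sigma_w(z, tau) as defined in (4.6)."""
    return theta1(w - z, tau) * theta1_prime(0, tau) / (theta1(w, tau) * theta1(z, tau))

def rho(z, tau):
    """rho(z, tau) as defined in (4.6)."""
    return theta1_prime(z, tau) / theta1(z, tau)

# Arbitrary sample values (assumptions made only for this check).
tau, w, z = 0.2 + 1.0j, 0.31 + 0.12j, 0.17 - 0.05j

# Oddness: sigma_{-w}(-z) = -sigma_w(z) and rho(-z) = -rho(z).
odd_sigma = abs(sigma(-w, -z, tau) + sigma(w, z, tau))
odd_rho = abs(rho(-z, tau) + rho(z, tau))

# Simple pole with residue 1 at z = 0: z * sigma_w(z) -> 1 and z * rho(z) -> 1.
eps = 1e-4
pole_sigma = abs(eps * sigma(w, eps, tau) - 1)
pole_rho = abs(eps * rho(eps, tau) - 1)
```

The four quantities `odd_sigma`, `odd_rho`, `pole_sigma`, `pole_rho` should all be small; the last two shrink as `eps` decreases, in line with the expansions $z^{-1} + O(1)$ and $z^{-1} + O(z)$.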

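For every $\tau \in \mathbb C$, $\mathrm{Im}\, \tau > 0$, the function $r(\lambda, z, \tau)$ is a classical dynamical r-matrix with spectral parameter $z$ and coupling constant $\varepsilon = 1$ [FW].

Examples of trigonometric r-matrices. Let $\Delta = \Delta_+ \cup \Delta_-$ be a polarization of the set of roots. Fix a subset $X \subset \Delta^s_+$ of the set of simple positive roots. For any root $\alpha$ introduce a meromorphic function $\varphi_\alpha : \mathfrak h^* \to \mathbb C$ by the following rule. If a root $\alpha$ is a linear combination of simple roots from $X$, then we set
$$\varphi_\alpha(\lambda, z) = \frac{\sin((\alpha, \lambda) + z)}{\sin(\alpha, \lambda)\, \sin z},$$
otherwise we set
$$\varphi_\alpha(\lambda, z) = \frac{e^{-iz}}{\sin z} \quad \text{for } \alpha \in \Delta_+, \qquad \varphi_\alpha(\lambda, z) = \frac{e^{iz}}{\sin z} \quad \text{for } \alpha \in \Delta_-.$$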


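We introduce a trigonometric r-matrix by
$$r(\lambda, z) = \operatorname{cotan} z \sum_{i=1}^N x_i \otimes x_i + \sum_{\alpha \in \Delta} \varphi_\alpha(\lambda, z)\, e_\alpha \otimes e_{-\alpha}, \tag{4.8}$$
where $\operatorname{cotan} z = \cos z / \sin z$.

Examples of rational r-matrices. For a subset $X \subset \Delta$ of the set of roots closed with respect to addition and multiplication by $-1$, we introduce a rational r-matrix by
$$r(\lambda, z) = \frac{\Omega}{z} + \sum_{\alpha \in X} \frac{1}{(\alpha, \lambda)}\, e_\alpha \otimes e_{-\alpha}. \tag{4.9}$$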

Theorem 4.2. 1. Each of the r-matrices (4.7)–(4.9) described in this section is a classical dynamical r-matrix with coupling constant 1. 2. The germ at $\mathfrak h^* \times 0 \subset \mathfrak h^* \times \mathbb C$ of any classical dynamical r-matrix with spectral parameter and a nonzero coupling constant is equivalent to one of the r-matrices (4.7)–(4.9).

Corollary. Any such germ extends to a meromorphic function on $\mathfrak h^* \times \mathbb C$.

Theorem 4.2 is proved in Sects. 4.4 and 4.5.

4.4. Proof of part 1 of Theorem 4.2. According to [FW], the elliptic r-matrix (4.7) is a classical dynamical r-matrix with spectral parameter. Taking the limit of the r-matrix (4.7) when $\tau$ tends to $+i\infty$, we conclude that for any fixed element $\nu \in \mathfrak h$ the r-matrix
$$r_\nu(\lambda, z) = \operatorname{cotan} z \sum_{i=1}^N x_i \otimes x_i + \sum_{\alpha \in \Delta} \frac{\sin((\alpha, \lambda - \nu) + z)}{\sin(\alpha, \lambda - \nu)\, \sin z}\, e_\alpha \otimes e_{-\alpha}$$
is a classical dynamical r-matrix with spectral parameter. If $\nu = 0$, then this r-matrix has the form (4.8) with $X = \Delta^s_+$. To show that any matrix of the form (4.8) is a classical dynamical r-matrix with spectral parameter it is enough to apply to $r_\nu$ the limiting procedure with respect to $\nu$, cf. the end of Sect. 3.7.

Lemma 4.3. If $r_0(\lambda)$ is a classical dynamical r-matrix without spectral parameter and with zero coupling constant, then
$$r(\lambda, z) = \frac{\Omega}{z} + r_0(\lambda)$$
is a classical dynamical r-matrix with spectral parameter.

Proof by direct verification.

Lemma 4.3 and Theorem 3.2 show that any r-matrix (4.9) is a classical dynamical r-matrix with spectral parameter.

4.5. Proof of Theorem 4.1 and of part 2 of Theorem 4.2. Let $r : \mathfrak h^* \times \mathbb C \to \mathfrak g \otimes \mathfrak g$ be a germ at $\mathfrak h^* \times 0 \subset \mathfrak h^* \times \mathbb C$ of a classical dynamical r-matrix with spectral parameter. It


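follows from the zero weight condition that the r-matrix can be written in the form (4.5). The CDYB Eq. (4.4) is an equation in $\mathfrak g^{\otimes 3}$. Its left hand side is invariant with respect to even permutations of factors and simultaneous permutations of the variables $z_1, z_2, z_3$. In order to solve the CDYB equation it suffices to solve its $\mathfrak h \otimes \mathfrak h \otimes \mathfrak h$-, $\mathfrak h \otimes \mathfrak g_\alpha \otimes \mathfrak g_{-\alpha}$- and $\mathfrak g_\alpha \otimes \mathfrak g_\beta \otimes \mathfrak g_\gamma$-parts, where $\alpha, \beta, \gamma \in \Delta$ and in the last case $\alpha + \beta + \gamma = 0$.

4.5.1. The $\mathfrak h \otimes \mathfrak h \otimes \mathfrak h$-part of the CDYB equation. First we analyse the $\mathfrak h \otimes \mathfrak h \otimes \mathfrak h$-part of the CDYB equation (4.4), which has the form
$$\sum_i x_i^{(1)} \frac{\partial S^{23}}{\partial x_i}(\lambda, z_2 - z_3) + \sum_i x_i^{(2)} \frac{\partial S^{31}}{\partial x_i}(\lambda, z_3 - z_1) + \sum_i x_i^{(3)} \frac{\partial S^{12}}{\partial x_i}(\lambda, z_1 - z_2) = 0. \tag{4.10}$$

Lemma 4.4. The sum $\sum_i x_i^{(1)} \frac{\partial S^{23}}{\partial x_i}(\lambda, z)$ is a linear function of $z$.

Proof. Differentiating (4.10) with respect to $z_1$ and $z_2$ we conclude that the second derivative $\partial^2 / \partial z_1 \partial z_2$ of the third sum in (4.10) is equal to zero. This implies the lemma.

Corollary 4.5. Consider the $\mathfrak h \otimes \mathfrak h$-valued function $S(\lambda, z) = \sum_{i,j} S_{i,j}(\lambda, z)\, x_i \otimes x_j$. Let $S(\lambda, z) = \sum_n S^n(\lambda) z^n$ be its Laurent expansion. Then
$$\frac{\partial S^n}{\partial x_i}(\lambda) = 0$$
for all $\lambda$, $i$, and $n$, $n \neq 0, 1$.

Each of the three sums in (4.10) is a linear function of $z_1, z_2, z_3$. Hence Eq. (4.10) splits into four independent equations corresponding to the coefficients of $z_1, z_2, z_3$ and the constant coefficient. The constant coefficient part has the form
$$\frac{\partial S^0_{j,k}}{\partial x_i} + \frac{\partial S^0_{k,i}}{\partial x_j} + \frac{\partial S^0_{i,j}}{\partial x_k} = 0$$
and is equivalent to the fact that $\sum S^0_{i,j}(\lambda)\, dx_i \otimes dx_j$ is a closed differential form.

Lemma 4.6. There is a multivalued meromorphic function $\psi : \mathfrak h^* \to \mathbb C$ with univalued meromorphic second derivatives such that
$$\frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda) = S^1_{i,j}(\lambda)$$
for all $i, j, \lambda$.

Proof. The $z_1, z_2, z_3$-parts of Eq. (4.10) together with the unitarity condition $S^1_{i,j}(\lambda) = S^1_{j,i}(\lambda)$ have the form
$$\frac{\partial S^1_{i,j}}{\partial x_k}(\lambda) = \frac{\partial S^1_{k,j}}{\partial x_i}(\lambda)$$
for all $\lambda, i, j, k$. These equations imply that there exist functions $\varphi_j$ such that $S^1_{i,j} = \partial \varphi_j / \partial x_i$ and moreover, $\partial \varphi_j / \partial x_i = \partial \varphi_i / \partial x_j$. Hence, there exists a function $\psi$ with the properties indicated in the lemma.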
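The two integration steps in the proof of Lemma 4.6 can be spelled out as follows. This is only a sketch of the standard local argument (two applications of the Poincaré lemma); the 1-forms $\omega_j$ and $\eta$ are auxiliary names introduced here for illustration.

```latex
\begin{align*}
&\text{Step 1: for fixed } j, \text{ put } \omega_j = \sum_i S^1_{i,j}\, dx_i .
  \quad \frac{\partial S^1_{i,j}}{\partial x_k} = \frac{\partial S^1_{k,j}}{\partial x_i}
  \;\Longrightarrow\; d\omega_j = 0
  \;\Longrightarrow\; \omega_j = d\varphi_j \text{ locally, i.e. } S^1_{i,j} = \frac{\partial \varphi_j}{\partial x_i}. \\
&\text{Step 2: put } \eta = \sum_j \varphi_j\, dx_j .
  \quad S^1_{i,j} = S^1_{j,i}
  \;\Longrightarrow\; \frac{\partial \varphi_j}{\partial x_i} = \frac{\partial \varphi_i}{\partial x_j}
  \;\Longrightarrow\; d\eta = 0
  \;\Longrightarrow\; \eta = d\psi \text{ locally}. \\
&\text{Hence } \frac{\partial^2 \psi}{\partial x_i \partial x_j}
  = \frac{\partial \varphi_j}{\partial x_i} = S^1_{i,j}.
\end{align*}
```

Since the integration is only local and $S^1_{i,j}$ may have poles, $\psi$ is in general multivalued, while its second derivatives $S^1_{i,j}$ remain univalued, as stated in the lemma.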


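Remark. Later we will show that the function $\psi$ is in fact holomorphic in $\mathfrak h^*$.

Corollary 4.7. Let $s : \mathbb C \to \mathfrak h \otimes \mathfrak h$ be a germ at $\mathfrak h^* \times 0 \subset \mathfrak h^* \times \mathbb C$ of a meromorphic function with Laurent expansion. Assume that $s(z) + s^{21}(-z) = 0$ for all $z$, and assume that the Laurent expansion of $s$ does not contain the terms of degree 0, 1, $s(z) = \sum_{m \neq 0,1} s_m z^m$. Let $r : \mathfrak h^* \times \mathbb C \to \mathfrak g \otimes \mathfrak g$ be a germ at $\mathfrak h^* \times 0 \subset \mathfrak h^* \times \mathbb C$ of a function of the form
$$r(\lambda, z) = s(z) + \sum_{i,j=1}^N C_{i,j}(\lambda)\, x_i \otimes x_j + z \sum_{i,j=1}^N \frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda)\, x_i \otimes x_j + \sum_{\alpha \in \Delta} \varphi_\alpha(\lambda, z)\, e_\alpha \otimes e_{-\alpha}, \tag{4.11}$$
where $\sum C_{i,j}\, dx_i \otimes dx_j$ is a closed meromorphic form on $\mathfrak h^*$, $\psi$ is a multivalued meromorphic function with univalued meromorphic second derivatives, and the functions $\varphi_\alpha$ are such that $\varphi_{-\alpha}(\lambda, -z) = -\varphi_\alpha(\lambda, z)$. Then the function $r$ satisfies the zero weight condition (4.1), the unitarity condition (4.2) and the $\mathfrak h \otimes \mathfrak h \otimes \mathfrak h$-part of the CDYB Eq. (4.4). Moreover, any classical dynamical r-matrix with spectral parameter has this form.

4.5.2. The $\mathfrak h \otimes \mathfrak g_\alpha \otimes \mathfrak g_{-\alpha}$-part of the CDYB equation. Now we analyze the $\mathfrak h \otimes \mathfrak g_\alpha \otimes \mathfrak g_{-\alpha}$-part of the CDYB equation. This part has the form
$$\begin{aligned}
&\sum_i \frac{\partial \varphi_\alpha}{\partial x_i}\, x_i \otimes e_\alpha \otimes e_{-\alpha} + \varphi_{-\alpha}(\lambda, z_{1,2})\, \varphi_\alpha(\lambda, z_{1,3})\, [e_{-\alpha}, e_\alpha] \otimes e_\alpha \otimes e_{-\alpha} \\
&\quad + \sum_{i,j} \varphi_\alpha(\lambda, z_{2,3}) \Big(s_{i,j}(z_{1,2}) + C_{i,j}(\lambda) + z_{1,2} \frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda)\Big)\, x_i \otimes [x_j, e_\alpha] \otimes e_{-\alpha} \\
&\quad + \sum_{i,j} \varphi_\alpha(\lambda, z_{2,3}) \Big(s_{i,j}(z_{1,3}) + C_{i,j}(\lambda) + z_{1,3} \frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda)\Big)\, x_i \otimes e_\alpha \otimes [x_j, e_{-\alpha}] = 0.
\end{aligned}$$
This equation can be written as
$$\begin{aligned}
&-\varphi_{-\alpha}(\lambda, z_{1,2})\, \varphi_\alpha(\lambda, z_{1,3})\, h_\alpha \otimes e_\alpha \otimes e_{-\alpha} \\
&\quad + \sum_i \Big( \frac{\partial \varphi_\alpha}{\partial x_i} + \varphi_\alpha(\lambda, z_{2,3}) \sum_j \Big[ s_{i,j}(z_{1,2}) - s_{i,j}(z_{1,3}) + z_{3,2} \frac{\partial^2 \psi}{\partial x_i \partial x_j}(\lambda) \Big] \alpha(x_j) \Big)\, x_i \otimes e_\alpha \otimes e_{-\alpha} = 0.
\end{aligned} \tag{4.12}$$
We interpret this equation as an equation of differential 1-forms on $\mathfrak h^*$, identifying linear functions with their differentials and ignoring the factor $e_\alpha \otimes e_{-\alpha}$. Then the first term in (4.12) can be written as $\varphi_\alpha(\lambda, z_{2,1})\, \varphi_\alpha(\lambda, z_{1,3})\, dh_\alpha$. The second has the form $d\varphi_\alpha(\lambda, z_{2,3})$, where the differential is with respect to $\lambda$. For a fixed $z$ consider $\sum s_{i,j}(z)\, x_i \otimes x_j$ as a bilinear form on $\mathfrak h^*$,
$$s(z)\{\lambda, \mu\} = \sum s_{i,j}(z)\, \langle x_i, \lambda \rangle \langle x_j, \mu \rangle.$$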


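If the second argument of this form is equal to $\alpha$, then we get a linear function on $\mathfrak h^*$, $\sum s_{i,j}(z)\, \langle x_j, \alpha \rangle\, x_i$. Hence the third and the fourth terms in (4.12) have the form $\varphi_\alpha(\lambda, z_{2,3})\, (ds(z_{1,2})\{\lambda, \alpha\} - ds(z_{1,3})\{\lambda, \alpha\})$, where the differentials are with respect to $\lambda$. Finally the last term in (4.12) can be written as $z_{3,2}\, \varphi_\alpha(\lambda, z_{2,3})\, L_\alpha d\psi(\lambda)$, where $L_\alpha d\psi(\lambda)$ is the Lie derivative of the differential with respect to the constant vector field defined by $\alpha$.

Now the $\mathfrak h \otimes \mathfrak g_\alpha \otimes \mathfrak g_{-\alpha}$-part of the CDYB equation takes the form
$$d\varphi_\alpha(\lambda, z_{2,3}) + \varphi_\alpha(\lambda, z_{2,1})\, \varphi_\alpha(\lambda, z_{1,3})\, dh_\alpha + \varphi_\alpha(\lambda, z_{2,3}) \big( ds(z_{1,2})\{\lambda, \alpha\} - ds(z_{1,3})\{\lambda, \alpha\} + z_{3,2}\, L_\alpha d\psi(\lambda) \big) = 0.$$
Setting $u = z_{2,1}$ and $v = z_{1,3}$, so that $z_{2,3} = u + v$, we get
$$\frac{d\varphi_\alpha(\lambda, u + v)}{\varphi_\alpha(\lambda, u + v)} + \frac{\varphi_\alpha(\lambda, u)\, \varphi_\alpha(\lambda, v)}{\varphi_\alpha(\lambda, u + v)}\, dh_\alpha + ds(-u)\{\lambda, \alpha\} - ds(v)\{\lambda, \alpha\} - (u + v)\, L_\alpha d\psi(\lambda) = 0,$$
where the differentials are with respect to $\lambda$. Make a change of variables, $\varphi_\alpha(\lambda, z) = \Phi(\lambda, z)\, e^{z L_\alpha \psi(\lambda)}$. Then
$$\frac{d\Phi(\lambda, u + v)}{\Phi(\lambda, u + v)} + \frac{\Phi(\lambda, u)\, \Phi(\lambda, v)}{\Phi(\lambda, u + v)}\, dh_\alpha + ds(-u)\{\lambda, \alpha\} - ds(v)\{\lambda, \alpha\} = 0.$$
Taking the differential of both sides we see that the second ratio depends only on $h_\alpha$. Hence the function $\Phi$ has the form $\Phi(\lambda, z) = \mu(h_\alpha(\lambda), z)\, e^{z \nu(\lambda)}$ for suitable new functions $\mu(h_\alpha(\lambda), z)$ and $\nu(\lambda)$. Now the equation takes the form
$$\frac{\partial \mu(h_\alpha, u + v) / \partial h_\alpha}{\mu(h_\alpha, u + v)}\, dh_\alpha + (u + v)\, d\nu(\lambda) + \frac{\mu(h_\alpha, u)\, \mu(h_\alpha, v)}{\mu(h_\alpha, u + v)}\, dh_\alpha + ds(-u)\{\lambda, \alpha\} - ds(v)\{\lambda, \alpha\} = 0. \tag{4.13}$$
Consider a coordinate system on $\mathfrak h^*$, $y_1, \ldots, y_N \in \mathfrak h$, such that $y_1 = h_\alpha$ and $\langle \alpha, y_i \rangle = 0$ for $i > 1$. Then (4.13) gives the equations
$$\frac{\partial \mu(h_\alpha, u + v) / \partial h_\alpha}{\mu(h_\alpha, u + v)} + (u + v) \frac{\partial \nu}{\partial h_\alpha}(\lambda) + \frac{\mu(h_\alpha, u)\, \mu(h_\alpha, v)}{\mu(h_\alpha, u + v)} + \frac{s(-u)\{\alpha, \alpha\} - s(v)\{\alpha, \alpha\}}{(\alpha, \alpha)} = 0, \tag{4.14}$$
$$(u + v) \frac{\partial \nu}{\partial y_i}(\lambda) + s(-u)\Big\{\frac{\partial}{\partial y_i}, \alpha\Big\} - s(v)\Big\{\frac{\partial}{\partial y_i}, \alpha\Big\} = 0, \qquad i = 2, \ldots, N. \tag{4.15}$$
(Here $\{\partial / \partial y_i\}$ is the basis of $\mathfrak h^*$ dual to the basis $\{y_i\}$ of $\mathfrak h$.) Equations (4.15) imply that $\partial \nu / \partial y_i = 0$, since the Laurent expansion of the function $s(z)$ does not have the first order term. So we can assume that $\nu = 0$ and, therefore, $\varphi_\alpha(\lambda, z) = \mu(h_\alpha(\lambda), z)\, e^{z L_\alpha \psi(\lambda)}$.

Now Eqs. (4.15) imply that $s(z)\{\partial / \partial y_i, \alpha\}$ does not depend on $z$. But the Laurent expansion of this function does not contain the terms of degree zero and one. So the function $s(z)\{\partial / \partial y_i, \alpha\}$ is identically equal to zero for $i \geq 2$.

The fact that $s(z)\{\partial / \partial y_i, \alpha\}$ is identically equal to zero for $i \geq 2$ easily implies that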


$$s(z) = t(z) \sum_{i=1}^N x_i \otimes x_i,$$
where $t(z)$ is a scalar function. The function $t(z)$ is a nonzero function since the coupling constant is not zero. We have $t(-z) = -t(z)$, and the Laurent expansion of $t$ does not have the terms of degree zero and one. Then (4.14) takes the form
$$\frac{\partial \mu(h_\alpha, u + v)}{\partial h_\alpha} = (t(u) + t(v))\, \mu(h_\alpha, u + v) - \mu(h_\alpha, u)\, \mu(h_\alpha, v). \tag{4.16}$$
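The passage from (4.14) to (4.16) can be written out in one line. The sketch below assumes, as in the text, that $\nu = 0$ and that $s(z) = t(z) \sum_i x_i \otimes x_i$ with $t$ odd, and it uses that $\sum_i \langle x_i, \alpha \rangle^2 = (\alpha, \alpha)$ for the orthonormal basis $\{x_i\}$.

```latex
\begin{align*}
s(z)\{\alpha,\alpha\} &= t(z) \sum_i \langle x_i, \alpha\rangle^2 = t(z)\,(\alpha,\alpha),
\qquad
\frac{s(-u)\{\alpha,\alpha\} - s(v)\{\alpha,\alpha\}}{(\alpha,\alpha)} = t(-u) - t(v) = -\bigl(t(u) + t(v)\bigr), \\
0 &= \frac{\partial \mu(h_\alpha, u+v)/\partial h_\alpha}{\mu(h_\alpha, u+v)}
   + \frac{\mu(h_\alpha, u)\,\mu(h_\alpha, v)}{\mu(h_\alpha, u+v)}
   - \bigl(t(u) + t(v)\bigr).
\end{align*}
```

Multiplying the last line by $\mu(h_\alpha, u + v)$ and rearranging gives (4.16).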

4.5.3. Proof of Theorem 4.1. Before proceeding with the analysis of Eq. (4.16) let us write the $\mathfrak g_\alpha \otimes \mathfrak g_\beta \otimes \mathfrak g_\gamma$-part of the CDYB equation,
$$\varphi_\alpha(\lambda, z_{13})\, \varphi_\beta(\lambda, z_{23}) + \varphi_\beta(\lambda, z_{21})\, \varphi_\gamma(\lambda, z_{31}) + \varphi_\alpha(\lambda, z_{12})\, \varphi_\gamma(\lambda, z_{32}) = 0. \tag{4.17}$$
It is easy to see that Eqs. (4.10), (4.16), (4.17) are invariant with respect to the gauge transformations of Sect. 4.2. This proves Theorem 4.1.

4.5.4. The $\mathfrak h \otimes \mathfrak g_\alpha \otimes \mathfrak g_{-\alpha}$-part of the CDYB equation: classification of solutions. Now we will find all solutions of Eq. (4.16).

Lemma 4.8. The function $\mu(x, z)$ has at most a simple pole at $z = 0$.

Proof. Applying the operator $\partial / \partial u - \partial / \partial v$ to both sides of (4.16) we have
$$(t'(u) - t'(v))\, \mu(x, u + v) - \mu'(x, u)\, \mu(x, v) + \mu(x, u)\, \mu'(x, v) = 0,$$
where $'$ denotes the derivative with respect to the spectral variable. Set $v = -u + \delta$. Then using the fact that $t'(-z) = t'(z)$ we get
$$(t'(u) - t'(u - \delta))\, \mu(x, \delta) - \mu'(x, u)\, \mu(x, -u + \delta) + \mu(x, u)\, \mu'(x, -u + \delta) = 0.$$
The function $t(z)$ cannot be a linear function. Hence the first factor is of order $\delta$. The second and the third terms are regular at generic values of $u$. This shows the lemma.

The lemma easily implies that the function $t(z)$ also has at most a simple pole at $z = 0$. Thus,
$$\mu(x, z) = \sum_{n=-1}^{\infty} \mu_n(x)\, z^n, \qquad t(z) = \sum_{n=-1,\ n \text{ odd}}^{\infty} t_n z^n.$$
Notice that from the residue condition (4.3) we know that $t_{-1} = \varepsilon (\alpha, \alpha)$, where $\varepsilon$ is the coupling constant. Substitute the expansions into (4.16) and multiply both sides by $uv(u + v)$,
$$\sum_{n=-1}^{\infty} \mu_n'(x)\, (u + v)^{n+1} uv - \sum_{n,m=-1}^{\infty} t_n\, \mu_m(x)\, (u + v)^{m+1} (u^n + v^n)\, uv + \sum_{n,m=-1}^{\infty} \mu_n(x)\, \mu_m(x)\, u^{n+1} v^{m+1} (u + v) = 0, \tag{4.18}$$
where $\mu_n'$ denotes the derivative of $\mu_n$ with respect to $x$. This is an equality of power series in $u$ and $v$. Equating the homogeneous parts we get a sequence of equations for the numbers $\{t_n\}$ and the functions $\{\mu_n(x)\}$. We write the first equations. The equation of degree 1 has the form


$$\mu_{-1}(x) = t_{-1}. \tag{4.19}$$
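For the reader's convenience, here is how the degree 1 equation arises from (4.18). The sketch assumes that $\mu$ has a genuine pole at $z = 0$, that is $\mu_{-1} \neq 0$; this assumption is made here only to divide by $\mu_{-1}$.

```latex
\begin{align*}
&\text{first sum of (4.18): lowest term } \mu_{-1}'(x)\,uv, \text{ of degree } 2, \text{ so no contribution in degree } 1; \\
&\text{second sum, } n = m = -1:\quad t_{-1}\,\mu_{-1}(x)\,(u^{-1} + v^{-1})\,uv = t_{-1}\,\mu_{-1}(x)\,(u+v); \\
&\text{third sum, } n = m = -1:\quad \mu_{-1}(x)^2\,(u+v). \\
&\text{Degree 1 part:}\quad \bigl(\mu_{-1}(x)^2 - t_{-1}\,\mu_{-1}(x)\bigr)(u+v) = 0
\;\Longrightarrow\; \mu_{-1}(x) = t_{-1}.
\end{align*}
```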

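The equation of degree 2 has the form $\mu_{-1}'(x) = 0$ and follows from (4.19). The equations of degree 3 and 4 have the form
$$\mu_0' = 2 t_{-1} \mu_1 - \mu_0^2 + t_{-1} t_1, \qquad \mu_1' = 3 t_{-1} \mu_2 + t_1 \mu_0 - \mu_0 \mu_1.$$
The equation of degree 5 gives two scalar equations
$$\mu_2' = 4 t_{-1} \mu_3 + t_1 \mu_1 + t_3 t_{-1} - \mu_0 \mu_2, \qquad 2 \mu_2' = 6 t_{-1} \mu_3 + 2 t_1 \mu_1 - t_3 t_{-1} - \mu_1^2,$$
which imply
$$\mu_2' = t_1 \mu_1 - 5 t_{-1} t_3 + 3 \mu_0 \mu_2 - 2 \mu_1^2, \qquad 2 \mu_3 t_{-1} = 2 \mu_0 \mu_2 - 3 t_{-1} t_3 - \mu_1^2.$$
Thus we have
$$\begin{aligned}
\mu_0' &= t_{-1} t_1 + 2 t_{-1} \mu_1 - \mu_0^2, \\
\mu_1' &= 3 t_{-1} \mu_2 + t_1 \mu_0 - \mu_0 \mu_1, \\
\mu_2' &= t_1 \mu_1 - 5 t_{-1} t_3 + 3 \mu_0 \mu_2 - 2 \mu_1^2.
\end{aligned} \tag{4.20}$$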
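The relations (4.19) and (4.20) can be sanity-checked on an explicit solution. The sketch below assumes the trigonometric test pair $t(z) = -\cot z$, $\mu(x, z) = \cot x - \cot z$ (an illustrative choice made here, not taken from the text), whose Laurent coefficients in $z$ follow from $\cot z = 1/z - z/3 - z^3/45 - \cdots$. All variable names are hypothetical.

```python
import math

x = 0.8                                # arbitrary sample point
cot_x = math.cos(x) / math.sin(x)      # mu_0(x) = cot x
dcot_x = -1.0 / math.sin(x) ** 2       # mu_0'(x)

# Laurent coefficients in z of the assumed test solution
#   t(z) = -cot z,   mu(x, z) = cot x - cot z.
t_m1, t_1, t_3 = -1.0, 1.0 / 3.0, 1.0 / 45.0
mu_m1, mu_0, mu_1, mu_2, mu_3 = -1.0, cot_x, 1.0 / 3.0, 0.0, 1.0 / 45.0
dmu_0, dmu_1, dmu_2 = dcot_x, 0.0, 0.0  # x-derivatives of mu_0, mu_1, mu_2

# Each entry is (left side) - (right side) of one relation and should vanish.
checks = {
    "(4.19)": mu_m1 - t_m1,
    "degree 3": dmu_0 - (t_m1 * t_1 + 2 * t_m1 * mu_1 - mu_0 ** 2),
    "degree 4": dmu_1 - (3 * t_m1 * mu_2 + t_1 * mu_0 - mu_0 * mu_1),
    "degree 5, mu_2'": dmu_2 - (t_1 * mu_1 - 5 * t_m1 * t_3 + 3 * mu_0 * mu_2 - 2 * mu_1 ** 2),
    "degree 5, mu_3": 2 * mu_3 * t_m1 - (2 * mu_0 * mu_2 - 3 * t_m1 * t_3 - mu_1 ** 2),
}
max_defect = max(abs(value) for value in checks.values())
```

For this test pair every defect vanishes up to rounding, which also confirms the elimination of $\mu_3$ between the two scalar equations of degree 5.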

Lemma 4.9. Let $t_{-1} \neq 0$ and $n \geq 3$. Then the degree $n + 3$ equation determines $\mu_{n+1}$, $t_{n+1}$ uniquely in terms of $\mu_m$, $t_m$ with smaller $m$.

Proof. The degree $n + 3$ equation contains only $\mu_m$, $t_m$ with $m \leq n + 1$ and has the form
$$-t_{-1}\, \mu_{n+1}\, (u + v)^{n+2} \Big(\frac1u + \frac1v\Big) uv - t_{n+1}\, \mu_{-1}\, (u^{n+1} + v^{n+1})\, uv + \mu_{-1}\, \mu_{n+1}\, (u^{n+2} + v^{n+2})(u + v) + \ldots = 0,$$
where $\ldots$ denotes the terms containing only $t_m$ and $\mu_m$ with $m < n + 1$. The coefficients of $u^{n+2} v$ and $u^{n+1} v^2$ have the form
$$(n + 2)\, \mu_{n+1} + t_{n+1} + \ldots = 0, \qquad (n + 3)(n + 2)\, \mu_{n+1} + \ldots = 0.$$
The equations imply the lemma.

Corollary 4.10. If $t_{-1} \neq 0$, then a solution of Eq. (4.16) is uniquely determined by the six parameters $\mu_0(0), \mu_1(0), \mu_2(0), t_{-1}, t_1, t_3$.

Now we present a six parameter family of solutions. We shall use the solution of the CDYB Eq. (4.4) given in [FW],
$$r(\lambda, z, \tau) = -\rho(z, \tau) \sum_i x_i \otimes x_i - \sum_{\alpha \in \Delta} \sigma_{h_\alpha(\lambda)}(z, \tau)\, e_\alpha \otimes e_{-\alpha},$$
where the functions $\rho$ and $\sigma$ are defined in (4.6). For every $\tau \in \mathbb C$, $\mathrm{Im}\, \tau > 0$, the function $r(\lambda, z, \tau)$ is a classical dynamical r-matrix with spectral parameter $z$ and coupling constant $\varepsilon = -1$ [FW]. Hence, for every $\tau$, the functions $t(z) = -\rho(z, \tau)$ and $\mu(x, z) = -\sigma_x(z, \tau)$ form a solution of (4.16).
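Equation (4.16) is easy to test numerically on closed-form degenerations of this solution. The sketch below does not use theta functions; it assumes the trigonometric and rational analogues obtained by replacing $\theta_1(z)$ with $\sin z$ and with $z$ respectively, so that $\sigma_x(z)$ becomes $\sin(x - z) / (\sin x \sin z)$ or $(x - z)/(xz)$, and $\rho(z)$ becomes $\cot z$ or $1/z$. These substitutions, the helper `residual`, and the sample point are assumptions made here for illustration.

```python
import cmath

def residual(t, mu, dmu_dx, x, u, v):
    """Absolute value of (left side) - (right side) of (4.16) at (x, u, v)."""
    lhs = dmu_dx(x, u + v)
    rhs = (t(u) + t(v)) * mu(x, u + v) - mu(x, u) * mu(x, v)
    return abs(lhs - rhs)

def cot(z):
    return cmath.cos(z) / cmath.sin(z)

# Trigonometric analogue (assumed): t = -cot z, mu = -sin(x - z) / (sin x sin z).
trig = {
    "t": lambda z: -cot(z),
    "mu": lambda x, z: -cmath.sin(x - z) / (cmath.sin(x) * cmath.sin(z)),
    # mu = cot x - cot z, hence d(mu)/dx = -1 / sin^2 x.
    "dmu_dx": lambda x, z: -1 / cmath.sin(x) ** 2,
}

# Rational analogue (assumed): t = -1/z, mu = -(x - z) / (x z).
rational = {
    "t": lambda z: -1 / z,
    "mu": lambda x, z: -(x - z) / (x * z),
    # mu = 1/x - 1/z, hence d(mu)/dx = -1 / x^2.
    "dmu_dx": lambda x, z: -1 / x ** 2,
}

point = (0.7 + 0.2j, 0.3 - 0.1j, 0.45 + 0.25j)  # arbitrary generic (x, u, v)
trig_residual = residual(trig["t"], trig["mu"], trig["dmu_dx"], *point)
rational_residual = residual(rational["t"], rational["mu"], rational["dmu_dx"], *point)
```

Both residuals vanish up to rounding, and both test pairs have $t_{-1} = \mu_{-1} = -1$, matching the coupling constant $-1$ of the solution above.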


Lemma 4.11. Let $t(z)$ and $\mu(x, z)$ be a solution of (4.16). Let $A$ be a complex number; then each of the following pairs is a solution of (4.16):
$$\begin{array}{ll}
t(Az), & \mu(x, Az), \\
t(z), & \mu(x + A, z), \\
A\, t(z), & A\, \mu(Ax, z), \\
t(z) + Az, & e^{Azx}\, \mu(x, z), \\
t(z), & e^{Az}\, \mu(x, z).
\end{array}$$

Corollary 4.12. The functions
$$t(z) = -A \rho(Bz, \tau) + Dz, \qquad \mu(x, z) = -A\, \sigma_{Ax - C}(Bz, \tau)\, e^{z(Dx + E)} \tag{4.21}$$
form a solution of (4.16) depending on six parameters $A, B, C, D, E, \tau$.

Let $t$, $\mu$ be the solution of (4.16) defined by (4.21). Let $t_{-1}, t_1, t_3, \mu_0(0), \mu_1(0), \mu_2(0)$ be the corresponding Taylor coefficients of $t$ and $\mu$, cf. Corollary 4.10. These six Taylor coefficients are functions of the six parameters $A, B, C, D, E, \tau$ and define a meromorphic map $\chi : \mathbb C^5 \times H \to \mathbb C^6$, where $H$ is the upper half plane.

Lemma 4.13. The Jacobian of this map is not identically equal to zero, and the image of the map is dense in $\mathbb C^6$.

Proof. In order to prove that the Jacobian is not zero it suffices to show that all the six parameters $A, \ldots, \tau$ can be recovered from the function
$$\mu = -A\, \frac{\theta_1(Ax - C - Bz, \tau)\, \theta_1'(0, \tau)}{\theta_1(Ax - C, \tau)\, \theta_1(Bz, \tau)}\, e^{z(Dx + E)}.$$
In fact, the poles of this function are given by
$$Ax - C \in \mathbb Z \oplus \tau \mathbb Z \quad \text{and} \quad Bz \in \mathbb Z \oplus \tau \mathbb Z.$$
Knowing the poles we recover $\tau, A, B, C$. If $Bz \mapsto Bz + 1$, then $\mu \mapsto \mu\, e^{(Dx + E)/B}$; this property allows us to recover $D, E$ at least locally.

Now let us show the density of the image of $\chi$. Let $p(z) = \frac{d}{dz} \ln \mu(0, z) = -\frac1z + p_0 + p_1 z + p_2 z^2 + \cdots$. Consider the map $\tilde\chi : \mathbb C^5 \times H \to \mathbb C^6$ given by the formula $(A, B, D, E, C, \tau) \to (t_{-1}, t_1, t_3, p_0, p_1, p_2)$. It is enough to show that the image of $\tilde\chi$ is dense. We have
$$t(z) = -A \rho(Bz, \tau) + Dz, \qquad p(z) = E + \frac{d}{dz} \ln \frac{\theta_1(C + Bz, \tau)}{\theta_1(Bz, \tau)}.$$
Thus, $t_1 = D + f_1(A, B, \tau)$, $p_0 = E + f_2(B, C, \tau)$, for some meromorphic functions $f_1, f_2$, and $t_{-1}, t_3, p_1, p_2$ do not depend on $D$ and $E$. Thus, to show the density of the image of $\tilde\chi$, it is enough to show the density of the image of $\xi : \mathbb C^3 \times H \to \mathbb C^4$ given by $(A, B, C, \tau) \to (t_{-1}, t_3, p_1, p_2)$.

It is clear that the function $p'(z)$ is doubly periodic with respect to $C$, with periods $1, \tau$. Therefore, for any fixed $\tau$, the map $\xi$ is a rational map $\xi_\tau : \mathbb C^2 \times E_\tau \to \mathbb C^4$, where


$E_\tau$ is the elliptic curve corresponding to $\tau$. Denote by $I_\tau$ the closure of the image of $\xi_\tau$. As the Jacobian of $\xi$ is not identically zero, for generic $\tau$, the set $I_\tau$ is an irreducible algebraic hypersurface in $\mathbb C^4$. It is easy to see that each of the functions $t''(z)$, $p'(z)$ satisfies the modular invariance conditions
$$f(A, B, C, z, \tau + 2) = f(A, B, C, z, \tau), \qquad f(A, B, C, z, -1/\tau) = f(A\tau, B\tau, C\tau, z, \tau).$$
Therefore, the hypersurface $I_\tau$ is modular invariant with respect to the subgroup $\tilde\Gamma$ of the modular group $\Gamma$ generated by $\tau \to \tau + 2$ and $\tau \to -1/\tau$. This means that the coefficients of the equation of $I_\tau$ are modular functions on the modular curve $\Sigma = H / \tilde\Gamma$ (it is easy to see that they have power growth in $q$ as $q = e^{2\pi i \tau} \to 0$). This shows that the hypersurfaces $I_\tau$ form an algebraic family over $\Sigma$. Let $T$ be the total space of this family. We have a natural rational map $\psi : T \to \mathbb C^4$, and the closure $I$ of its image coincides with the closure of the image of $\xi$. Since the map is rational and has a nonzero Jacobian, we have $I = \mathbb C^4$, as desired.

Corollaries 4.10, 4.12 and Lemma 4.13 tell us that all solutions $t, \mu$ to Eq. (4.16) are limits of solutions given by (4.21). It is enough to list the limits of the function $\mu$ since then the function $t$ can be recovered from (4.16). Without loss of generality we assume that the coupling constant is equal to 1, i.e. $A = -B$. Let
$$f(x, z) = B\, \frac{\theta(B(x - x_0 - z), \tau)\, \theta'(0, \tau)}{\theta(B(x - x_0), \tau)\, \theta(Bz, \tau)}\, e^{z(Dx + E)}$$
be a function of $x, z$ depending on parameters $B, D, E, x_0, \tau$. A function $g(x, z)$ will be called a limit of the function $f$ if there exist sequences $B_n, D_n, E_n, x_{0,n}, \tau_n$ such that $g(x, z)$ is the limit of $f(x, z; B_n, D_n, E_n, x_{0,n}, \tau_n)$ when $n$ tends to infinity.

Proposition 4.14. Any limit $g$ of the function $f$ has one of the following three forms:

Rational type.
$$g = \frac1z\, e^{z(Dx + E)}, \tag{4.22}$$
$$g = \frac{x - x_0 - z}{(x - x_0)\, z}\, e^{z(Dx + E)}, \tag{4.23}$$
where $D, E, x_0$ are parameters.

Trigonometric type.
$$g = \frac{2\pi B}{\sin(2\pi B z)}\, e^{z(Dx + E)}, \tag{4.24}$$
$$g = \frac{2\pi B\, \sin(2\pi B (x - x_0 - z))}{\sin(2\pi B (x - x_0))\, \sin(2\pi B z)}\, e^{z(Dx + E)}, \tag{4.25}$$
where $B, D, E, x_0$ are parameters.

Elliptic type.
$$g = B\, \frac{\theta(B(x - x_0 - z), \tau)\, \theta'(0, \tau)}{\theta(B(x - x_0), \tau)\, \theta(Bz, \tau)}\, e^{z(Dx + E)}, \tag{4.26}$$
where $B, D, E, x_0, \tau$ are parameters.


Proof. Let $g$ be a limit of $f$. Introduce $v_1 = \partial_z (\partial_x + \partial_z) \ln g$, $v_2 = \partial_x (\partial_x + \partial_z) \ln g$, $v_3 = \partial_x \partial_z \ln g$.

Lemma 4.15. The functions $v_1, v_2, v_3$ have the following properties:
1. $v_1$ is a function of $z$, $v_2$ is a function of $x$, $v_3$ is a function of $x - z$.
2. After identifying the respective variables in these functions with a new variable $t$, we have $v_2(t) = v_3(t)$.
3. $v_1(t) = t^{-2} + O(t^{-1})$, if $t \to 0$.
4. The functions $v_1, v_2, v_3$ satisfy a common differential equation of the form
$$(v')^2 = 4v^3 + pv^2 + qv + r \tag{4.27}$$
for suitable numbers $p, q, r$. Such an equation is unique.

Proof. Properties 1, 2, 3 are satisfied for $g = f$ with any values of the parameters. Therefore, they are satisfied for any limit $g$. Property 4 is satisfied for $g = f$. Using property 3, we conclude that for any limit $g$ the function $v_1$ satisfies a unique equation of form (4.27). Since for $g = f$ the functions $v_2, v_3$ satisfy the same equation as $v_1$, this is also true for any limit. The lemma is proved.

Let $P(t)$ be the cubic polynomial on the right hand side of (4.27).

Lemma 4.16. Let the roots of $P$ be pairwise distinct. Then $g$ has form (4.26).

Proof. If the roots are distinct, then by Lemma 4.15 the function $v_1(t)$ has the form $B^2 \wp(Bt, \tau) + D$, where $\wp$ is the Weierstrass function, and $B, D, \tau$ are suitable constants. The functions $v_2 = v_3$ satisfy the same differential Eq. (4.27) as $v_1$. Therefore, either $v_2(t) = v_3(t) = v_1(t - t_0)$ for some $t_0 \in \mathbb C$, or $v_2 = v_3 = \mathrm{const}$. It is clear that the second situation cannot arise in a limit of $f$, so $v_2(t) = v_3(t) = v_1(t - t_0)$. If $v_1, v_2, v_3$ are known, then the second differential of the logarithm of the function $g$ is known, and hence the function $g$ is known up to a transformation of the form $g \mapsto g\, e^{ax + bz + c}$ for suitable constants $a, b, c$. Therefore, $g = G\, e^{ax + bz + c}$, where $G$ has form (4.26). The condition $\mathrm{Res}_{z=0}\, g = 1$ implies $a = c = 0$. Now $g = G\, e^{bz}$ and the parameter $b$ can be included into the parameter $E$ of (4.26). The lemma is proved.

Lemma 4.17. Let $P$ have a root of multiplicity 2. Then $g$ has form (4.24) or (4.25).

Proof. An equation of the form $(v')^2 = 4(v - \alpha)^2 (v - \beta)$ can be solved explicitly. This gives
$$v_1(z) = \frac{4\pi^2 B^2}{\sin^2(2\pi B z)} + D. \tag{4.28}$$
The function $v_2 = v_3$ has to be a solution of this equation, so either $v_2(t) = v_3(t) = v_1(t - t_0)$, or $v_2 = v_3 = \alpha$, or $v_2 = v_3 = \beta$. It is easy to see that the third case cannot arise as a limiting case of $f$. In the first case, $g$ has form (4.25) up to a factor $e^{ax + bz + c}$ for suitable numbers $a, b, c$. Reasoning as before we conclude that $g$ has form (4.25). Similarly, in the second case, $g$ has form (4.24). The lemma is proved.

Lemma 4.18. If the polynomial $P$ has a root of multiplicity 3, then $g$ has form (4.22) or (4.23).


Proof. Analogous to Lemma 4.17.
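The explicit solutions behind Lemmas 4.17 and 4.18 can be verified by direct differentiation. The following sketch uses the abbreviations $c = 2\pi B$ and $w = v_1 - D$, which are introduced here only for the computation; the resulting identification of the roots ($\alpha = D$ double, $\beta = D + 4\pi^2 B^2$ simple) is a consequence of this computation and is not stated in the original.

```latex
\begin{align*}
\text{Double root:}\quad
w &= \frac{c^2}{\sin^2(cz)}, \qquad
w' = -\frac{2c^3 \cos(cz)}{\sin^3(cz)}, \\
(w')^2 &= \frac{4c^6 \cos^2(cz)}{\sin^6(cz)}
        = 4w^3\bigl(1 - \sin^2(cz)\bigr)
        = 4w^3 - 4c^2 w^2 = 4w^2\,(w - c^2), \\
\text{so}\quad (v_1')^2 &= 4\,(v_1 - D)^2\,\bigl(v_1 - D - 4\pi^2 B^2\bigr). \\[4pt]
\text{Triple root:}\quad
v &= \alpha + \frac{1}{z^2}, \qquad
(v')^2 = \frac{4}{z^6} = 4\,(v - \alpha)^3 .
\end{align*}
```

Both functions have the leading behaviour $z^{-2}$ at $z = 0$ required by property 3 of Lemma 4.15.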

The proposition is proved.

4.5.5. End of proof of Theorem 4.2, part 2. In the previous section we have determined the possible forms of the function $\mu(x, z)$ for any root $\alpha$: they are given by (4.22)–(4.26) for coupling constant 1. Now we will determine the consistency conditions which are imposed on these functions for different roots by the $\mathfrak g_\alpha \otimes \mathfrak g_\beta \otimes \mathfrak g_\gamma$-part of the CDYB equation, where $\alpha + \beta + \gamma = 0$.

First of all, by our assumptions the function $\varphi_\alpha$ is a meromorphic function for any root $\alpha$. Since $\varphi_\alpha = \mu\, e^{z L_\alpha \psi}$, and $\mu$ is meromorphic, the function $\psi(\lambda)$ is holomorphic on $\mathfrak h^*$. Therefore, by using gauge transformations of type 1 and 2, it is possible to reduce any r-matrix with spectral parameter and coupling constant 1 to the form in which its $\mathfrak h \otimes \mathfrak h$-part is $T(z) \sum x_i \otimes x_i$, where $T(z) = \frac1z + T_1 z + O(z^3)$ is an odd, scalar-valued meromorphic function. We will call such r-matrices reduced, and from now on will work only with them.

For a reduced r-matrix $\varphi_\alpha(\lambda, z) = \mu^*_\alpha((\alpha, \lambda), z)$, where $\mu^*_\alpha(x, z) = \mu_\alpha(x, z)\, e^{T_1 x z}$, and $\mu_\alpha$ is the function $\mu$ introduced in Sect. 4.5.2. Observe that $\mu^*_\alpha$ is a function from family (4.21). Let $A_\alpha, B_\alpha, C_\alpha, D_\alpha, E_\alpha, \tau_\alpha$ be the parameters $A, B, C, D, E, \tau$ determined from (4.21), for $\mu = \mu^*_\alpha$. Since the coupling constant is 1, we have $A = -B$.

Lemma 4.19. All $\mu^*_\alpha$ are of the same type (rational, trigonometric, or elliptic).

Proof. From Eq. (4.16) we can find the function $t(u)$. We know that it is the same for all roots $\alpha$. It is easy to check that $\mu^*_\alpha$ is of rational, trigonometric, or elliptic type iff the set of poles of $t(z)$ is a lattice of rank 0, 1, 2 respectively. The lemma is proved.

So it remains to consider the rational, trigonometric, and elliptic cases separately.

Elliptic case.

Lemma 4.20. Let $r$ be a reduced dynamical r-matrix such that the functions $\mu^*_\alpha$ are of elliptic type. Then there exist complex numbers $a, b, c, \tau$, $\mathrm{Im}\, \tau > 0$, and elements $\nu, \kappa \in \mathfrak h^*$ such that
$$A_\alpha = a, \quad B_\alpha = b, \quad \tau_\alpha = \tau, \quad C_\alpha = (\alpha, \nu), \quad D_\alpha = c, \quad E_\alpha = (\alpha, \kappa),$$
for all roots $\alpha$.

Proof. Substitute formula (4.21) into the $\mathfrak g_\alpha \otimes \mathfrak g_\beta \otimes \mathfrak g_\gamma$-part of the CDYB equation, see (4.17). Considering the poles at the hyperplanes of the form $z_{1,2} = \mathrm{const}$, we observe that the functions $\varphi_\beta$ and $\varphi_\gamma$ have the same lattice of periods. This allows us to conclude that there exist numbers $b$ and $\tau$ such that $B_\alpha = b$ and $\tau_\alpha = \tau$ for all $\alpha$. The residue condition (4.3) implies the existence of a number $a$ such that $A_\alpha = a$ for all $\alpha$. The necessity to cancel the poles at the hyperplanes of the form $(\alpha, \lambda) = \mathrm{const}$ implies the existence of elements $\kappa, \nu \in \mathfrak h^*$, $c \in \mathbb C$ such that $C_\alpha = (\alpha, \nu)$, $D_\alpha = c$, $E_\alpha = (\alpha, \kappa)$ for all $\alpha$. This proves the lemma, and Theorem 4.2, part 2, in the elliptic case.

Now consider the trigonometric and rational cases.

Lemma 4.21. $D_\alpha$ are the same for all $\alpha$.

Proof. Easily follows from (4.17).


Thus, we can reduce to the situation $D_\alpha = 0$ by using a gauge transformation of type 2 with $\psi = -D(\lambda, \lambda)/2$.

Rational case.

Lemma 4.22. In the rational case $E_{\alpha + \beta} = E_\alpha + E_\beta$.

Proof. Follows directly from (4.17).

Thus, in the rational case we can reduce to the situation $E_\alpha = 0$ by a gauge transformation of type 2 with $\psi = -(E, \lambda)$, $E \in \mathfrak h^*$. If $r$ is a reduced rational dynamical r-matrix with $D_\alpha = E_\alpha = 0$, then it is easy to see from Proposition 4.14 that $r$ is of the form $\frac{\Omega}{z} + r_0(\lambda)$, where $r_0$ is skew-symmetric. Since $r_0 = \lim_{z \to \infty} r$, $r_0(\lambda)$ is a classical dynamical r-matrix without spectral parameter with zero coupling constant. Such r-matrices were classified in Theorem 3.2. Theorem 3.2 implies that $r(\lambda, z)$ is equivalent to (4.9) by gauge transformations of type 3. This proves Theorem 4.2, part 2, in the rational case.

Trigonometric case. In the trigonometric case, the functions $\mu^*_\alpha$ have form (4.24), (4.25). As in the elliptic case, it is easy to see that $B$ is the same for all $\alpha$ since $B^{-1}$ is the period of the lattice of poles. So by a gauge transformation of type 4 we can arrange $B = 1$.

Lemma 4.23. Let $\rho$ be the half-sum of positive roots. There exists a limit of $r(s\rho, z)$ as $s \to i\infty$.

Proof. The statement is clear from formulae (4.24), (4.25).

Denote this limit by $\bar r(z)$. This is a classical r-matrix with spectral parameter having the form $\frac{\Omega}{z} + O(1)$ as $z \to 0$ and invariant under the action of the Cartan subalgebra. A classification of such r-matrices was given by Belavin and Drinfeld [BD1], Theorem 6.1. Namely, consider an r-matrix
$$r_{tr}(z) = 2i\, \frac{\Omega_- e^{2iz} + \Omega_+}{e^{2iz} - 1}, \tag{4.29}$$
where $\Omega_\pm = \frac12 \sum x_i \otimes x_i + \sum_{\alpha \in \Delta_\pm} e_\alpha \otimes e_{-\alpha}$ are the half-Casimirs. According to [BD1], any classical r-matrix of the above type can be obtained from $r_{tr}(z)$ by a change of polarization of the Lie algebra, and by gauge transformations of type 4, and type 2 with a linear function $\psi$. Thus, in order to prove Theorem 4.2, part 2, in the trigonometric case, it is enough to assume that $\bar r(z) = r_{tr}(z)$. Under this assumption, it is easy to deduce from Proposition 4.14 that $\mu^*_\alpha(x, z)$ equals:
1. $\dfrac{\sin(x - x_0 + z)}{\sin(x - x_0)\, \sin z}$, if $\mu^*_\alpha$ depends nontrivially on $x$;
2. $\dfrac{e^{-iz}}{\sin z}$, if $\alpha > 0$ and $\mu^*_\alpha$ does not depend on $x$;
3. $\dfrac{e^{iz}}{\sin z}$, if $\alpha < 0$ and $\mu^*_\alpha$ does not depend on $x$.

Now let us send $z \to i\infty$. It is easy to see that the limit of $r(\lambda, z)$ exists. Denote this limit by $r_\infty(\lambda)$. Let $\mu^\infty_\alpha((\lambda, \alpha))$ be the coefficient of $e_\alpha \otimes e_{-\alpha}$ in $r_\infty(\lambda)$. From cases 1–3 above, we get that $\mu^\infty_\alpha(x)$ equals:
1'. $\dfrac{e^{-i(x - x_0)}}{\sin(x - x_0)}$, if $\mu^\infty_\alpha$ depends nontrivially on $x$;
2'. $-2i$, if $\alpha > 0$ and $\mu^\infty_\alpha$ does not depend on $x$;


3’. 0, if α < 0 and µ_α^∞ does not depend on x.

On the other hand, r^∞(λ) is a classical dynamical r-matrix without spectral parameter, and it is clear from cases 1’-3’ that it has coupling constant −2i. So, r^∞(λ) has to be of the form given by Theorem 3.10. This determines possible combinations of functions µ_α^∗, and shows that r(λ, z) is equivalent to an r-matrix (4.8) for a suitable subset X by a gauge transformation of type 3. Theorem 4.2, part 2 is proved.

4.6. Dynamical r-matrices without spectral parameter for affine Lie algebras. In this section we interpret dynamical classical r-matrices without spectral parameter for an affine Lie algebra as dynamical r-matrices with spectral parameter for the underlying simple Lie algebra.

Let g be a simple Lie algebra, and g̃ = g[t, t^{−1}] ⊕ Cc ⊕ Cd be the corresponding affine Lie algebra, where c is the central element, and d is the grading element. Let h ⊂ g be a Cartan subalgebra, and {x_i} an orthonormal basis of h. Let h̃ ⊂ g̃ be the Cartan subalgebra of g̃, h̃ = h ⊕ Cc ⊕ Cd. Recall that c, d are orthogonal to h with respect to the standard bilinear form, and (c, d) = 1, (c, c) = (d, d) = 0. The elements of h̃ have the form h + xc + yd, where h ∈ h, x, y ∈ C. The elements of h̃∗ are triples (λ, k, s) such that ⟨(λ, k, s), h + xc + yd⟩ = ⟨λ, h⟩ + kx + sy. Let δ ∈ h̃∗ be the positive imaginary root, ⟨δ, d⟩ = 1, ⟨δ, c⟩ = 0, ⟨δ, h⟩ = 0.

The roots of g̃ are α + nδ and nδ, where α is a root of g, n ∈ Z, and in the second case n ≠ 0. The positive roots are α + nδ and (n + 1)δ, where α is a root of g, n ≥ 0, and n > 0 for negative α. A basis of positive root elements of g̃ is formed by e_α t^n, e_{−α} t^{n+1}, and x_i t^{n+1}, where α > 0, n ≥ 0. The dual elements are respectively e_{−α} t^{−n}, e_α t^{−n−1}, and x_i t^{−n−1}, where α > 0, n ≥ 0. According to Theorem 3.1, the solution of CDYBE for g̃ with coupling constant 2 and C = 0, ν = 0 has the form

\[
r(\lambda,k,s) = \hat\Omega + \sum_{\alpha\in\Delta,\ n\in\mathbb{Z}} \operatorname{cotanh}\bigl((\alpha,\lambda)+kn\bigr)\, e_\alpha t^n\otimes e_{-\alpha}t^{-n}
+ \sum_i\sum_{n\neq 0} \operatorname{cotanh}(kn)\, x_i t^n\otimes x_i t^{-n},
\tag{4.30}
\]

where Ω̂ is the Casimir element of g̃.

For any z ∈ C∗, let π_z : g[t, t^{−1}] → g be the evaluation map at t = z. Consider the function r̄(y, w) = (π_y ⊗ π_w)(r|_{c=0}), where r|_{c=0} ∈ g[t, t^{−1}]^{⊗̂2} is the image of r under the reduction modulo c. It is easy to see that r̄ depends only on u = y/w, and equals

\[
\bar r(\lambda,k,u) = \sum_{\alpha\in\Delta}\Bigl[\sum_{n\in\mathbb{Z}} u^n\bigl(1+\operatorname{cotanh}((\alpha,\lambda)+kn)\bigr)\Bigr]\, e_\alpha\otimes e_{-\alpha}
+ \sum_i\Bigl[1+\sum_{n\neq 0} u^n\bigl(1+\operatorname{cotanh}(kn)\bigr)\Bigr]\, x_i\otimes x_i .
\tag{4.31}
\]

Set u = e^{2πiz}.

Lemma 4.24. For any k, the function r̂(λ, z) = r̄(λ, k, u) satisfies the CDYB equation with spectral parameter z.


Proof. Applying the operator π_{z1} ⊗ π_{z2} ⊗ π_{z3} to the CDYB Eq. (3.3) for g̃, one easily obtains the CDYB equation with spectral parameter.

Now we compute r̂(λ, z). Set τ = k/πi, assume that Im τ > 0, and use the classical formulae

\[
\sum_{n\in\mathbb{Z}} u^n\bigl(1+\operatorname{cotanh}(a+\pi i\tau n)\bigr) = -\frac{1}{\pi i}\,\sigma_{\frac{a}{\pi i}}(z,\tau),
\tag{4.32}
\]

and

\[
1+\sum_{n\in\mathbb{Z}\setminus 0} u^n\bigl(1+\operatorname{cotanh}(\pi i\tau n)\bigr) = -\frac{1}{\pi i}\,\rho(z,\tau),
\tag{4.33}
\]

where σ and ρ are defined in (4.6). Then r̂(λ, z) takes the form of the Felder solution of CDYBE with spectral parameter,

\[
\hat r(\lambda,z) = -\frac{1}{\pi i}\,\rho(z,\tau)\sum_{i=1}^{N} x_i\otimes x_i
- \frac{1}{\pi i}\sum_{\alpha\in\Delta}\sigma_{\frac{(\alpha,\lambda)}{\pi i}}(z,\tau)\, e_\alpha\otimes e_{-\alpha}.
\]
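As a consistency check, the Felder form can be read off by substituting (4.32) and (4.33) into (4.31) one bracket at a time. The sketch below assumes only the identification k = πiτ (which follows from τ = k/πi) and the choice a = (α, λ) in (4.32); it uses nothing beyond the formulas stated above.

```latex
% With k = \pi i\tau, each bracket in (4.31) is one of the two classical sums.
\begin{align*}
\sum_{n\in\mathbb{Z}} u^n\bigl(1+\operatorname{cotanh}((\alpha,\lambda)+kn)\bigr)
  &= -\frac{1}{\pi i}\,\sigma_{\frac{(\alpha,\lambda)}{\pi i}}(z,\tau)
  && \text{by (4.32) with } a=(\alpha,\lambda),\\
1+\sum_{n\neq 0} u^n\bigl(1+\operatorname{cotanh}(kn)\bigr)
  &= -\frac{1}{\pi i}\,\rho(z,\tau)
  && \text{by (4.33)}.
\end{align*}
% Placing the first coefficient in front of e_\alpha\otimes e_{-\alpha} and the
% second in front of x_i\otimes x_i in (4.31) gives the stated form of \hat r(\lambda,z).
```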

Appendix: Open Problems

In conclusion we would like to formulate two open problems. Let g be a simple Lie algebra, h its Cartan subalgebra, and h_0 a subspace in h. We will say that a meromorphic function r : h_0∗ → g ⊗ g is a classical dynamical r-matrix if it satisfies (3.2), (3.3), and (3.1) for h ∈ h_0. We will say that a meromorphic function r : h_0∗ × C → g ⊗ g is a classical dynamical r-matrix with a spectral parameter if it satisfies (4.2)–(4.4), and (4.1) for h ∈ h_0. The number in both cases is called the coupling constant.

Problem 1. Classify classical dynamical r-matrices on h_0 with a nonzero coupling constant.

Problem 2. Classify classical dynamical r-matrices on h_0 with a spectral parameter, with a nonzero coupling constant.

These problems are solved in two extreme cases: h_0 = 0 ([BD1, BD2]) and h_0 = h (this paper). We expect that they can be solved in the intermediate cases by combining the methods of our paper and the two papers of Belavin and Drinfeld.

Note added in the second version. While this paper was being revised, O. Schiffmann obtained a partial solution of Problem 1 [Sch].

Acknowledgement. We thank Giovanni Felder who raised the question of geometric interpretation of the classical dynamical Yang–Baxter equation. We are grateful to Olivier Schiffmann for careful reading of this manuscript and pointing out a number of errors. We thank Vladimir Drinfeld and Jiang-Hua Lu for useful discussions.

References

[ABB] Avan, J., Babelon, O., Billey, E.: The Gervais-Neveu-Felder equation and the quantum Calogero-Moser systems. hep-th/9505091, Commun. Math. Phys. 178, 2, 281–299 (1996)
[B] Bernard, D.: On the Wess-Zumino-Witten models on the torus. Nucl. Phys. B303, 77–93 (1988)
[BD1] Belavin, A.A. and Drinfeld, V.G.: Solutions of the classical Yang–Baxter equations for simple Lie algebras. Funct. Anal. Appl. 16, 159–180 (1982)
[BD2] Belavin, A.A. and Drinfeld, V.G.: Triangle equations for simple Lie algebras. Mathematical physics reviews (ed. Novikov et al), Harwood, New York, 93–165 (1984)
[C] Cherednik, I.V.: Generalized braid groups and local r-matrix systems. Soviet Math. Doklady 307, 43–47 (1990)
[Dr] Drinfeld, V.G.: Quantum groups. Proc. Int. Congr. Math. Berkeley, 1986, 1, pp. 798–820
[F1] Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Preprint hep-th/9407154, to appear in the Proceedings of the ICM, Zürich, 1994
[F2] Felder, G.: Elliptic quantum groups. Preprint hep-th/9412207, to appear in the Proceedings of the ICMP, Paris 1994
[FV] Felder, G. and Varchenko, A.: On representations of the elliptic quantum group E_{τ,η}(sl_2). q-alg 9601003 (1996)
[FW] Felder, G. and Wieczerkowski, C.: Conformal blocks on elliptic curves and the Knizhnik–Zamolodchikov–Bernard equations. Commun. Math. Phys. 176, 133 (1996)
[GG] Goto, M. and Grosshans, F.: Semisimple Lie Algebras. New York and Basel: Marcel Dekker, Inc., 1978
[GN] Gervais, J.-L., and Neveu, A.: Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B 238, 125 (1984)
[GHV] Greub, W., Halperin, S. and Vanstone, R.: Connections, curvature, and cohomology, vol II. New York: Academic Press, 1973
[M] Mackenzie, K.: Lie groupoids and Lie algebroids in differential geometry. Cambridge: Cambridge Univ. Press, 1997
[Sch] Schiffmann, O.: On classification of dynamical r-matrices. To appear in q-alg (1997)
[SV] Schechtman, V. and Varchenko, A.: Arrangements of hyperplanes and Lie algebra homology. Invent. Math. 106, 139–194 (1991)
[V] Varchenko, A.: Multidimensional Hypergeometric Functions and Representation Theory of Lie Algebras and Quantum Groups. Advanced series in Math. Physics, Vol 21, Singapore: World Scientific, 1995
[W] Weinstein, A.: Coisotropic calculus and Poisson groupoids. J. Math. Soc. Japan 40, 705–727 (1988)

Communicated by G. Felder

Commun. Math. Phys. 192, 121 – 144 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Dirac Structures and Poisson Homogeneous Spaces

Zhang-Ju Liu¹,*, Alan Weinstein²,**, Ping Xu³,***

¹ Department of Mathematics, Peking University, Beijing, 100871, China.
² Department of Mathematics, University of California, Berkeley, CA 94720, USA.
³ Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA.

Received: 31 December 1996 / Accepted: 20 June 1997

Abstract: Poisson homogeneous spaces for Poisson groupoids are classified in terms of Dirac structures for the corresponding Lie bialgebroids. Applications include Drinfel’d’s classification in the case of Poisson groups and a description of leaf spaces of foliations as homogeneous spaces of pair groupoids.

1. Introduction

Dirac structures on manifolds include closed 2-forms, Poisson structures, and foliations. They extend the flexibility of computations with such objects by permitting the passage to both submanifolds and quotients. The combination of these two operations is central to the theory of reduction in Poisson geometry.

Dirac structures were introduced by Courant and Weinstein [4] and thoroughly investigated by Courant in [3]. Dorfman [5] used Dirac structures in the context of the formal calculus of variations for the study of completely integrable systems of partial differential equations. Under a regularity assumption which is always satisfied on an open dense subset, a Dirac structure on a manifold P is locally the same thing as a Poisson structure on the leaf space of a foliation of P.¹

An essential object in the theory of Dirac structures is a natural antisymmetric bracket operation (see Eq. (7)) on the sections of TP ⊕ T∗P introduced by Courant. Although this Courant bracket does not satisfy the Jacobi identity, it does satisfy that identity on Γ(E) when E is a subbundle of TP ⊕ T∗P which is maximal isotropic for the symmetric form (X1 + ξ1, X2 + ξ2)_+ = ½(⟨ξ1, X2⟩ + ⟨ξ2, X1⟩) and whose sections are closed under the bracket. (For instance, we recover the usual bracket of vector fields on TP ⊕ 0 and the zero bracket on 0 ⊕ T∗P.)

* Research supported by NSF of China and a grant from SEC.
** Research partially supported by NSF grants DMS93-09653 and DMS96-25122.
*** Research partially supported by NSF Grant DMS95-04913.
¹ Dually, under a slightly different regularity assumption, it is the same as a smooth family of closed 2-forms on the leaves of a foliation of P (generally different from the foliation in the first description).

The theory of Dirac structures finds an echo in Drinfel’d’s theory of Lie bialgebras and Poisson homogeneous spaces [6, 8]. A Lie bialgebra (g, g∗) can be thought of as a pair of Lie algebra structures on a vector space g and its dual having a common extension (which turns out to be unique) to a Lie algebra structure on g ⊕ g∗ for which the symmetric form ( , )_+ is ad-invariant. The Lie algebra g ⊕ g∗ is called the double of the Lie bialgebra (g, g∗). The main result of [8] is that maximal isotropic subalgebras of the double correspond (modulo some details concerning closedness and connectedness of subgroups) to Poisson homogeneous G-spaces, where G is the Poisson Lie group whose linearization is the given Lie bialgebra.

The similarities between the Courant bracket and the bracket on the double of a Lie bialgebra were explained in our recent paper [11], where both were exhibited as special cases in a theory of doubles of Lie bialgebroids. Since the bracket on sections of a Lie algebroid A over P satisfies the Jacobi identity, while the Courant bracket does not, it is clear that one must look beyond Lie algebroids to find these doubles. Hence we introduced in [11] a notion of Courant algebroid, in which the Lie algebroid axioms for a bracket on Γ(A) and a bundle map a : A → TP are satisfied only modulo certain “coboundary anomalies,” explicitly described in terms of a nondegenerate bilinear form on E which is part of the Courant algebroid structure. When P is a point, the anomalies vanish, and a Courant algebroid is just a Lie algebra with a nondegenerate ad-invariant symmetric bilinear form.

Although more explicit descriptions are available (see Sect. 2), we can define a Lie bialgebroid as a pair of Lie algebroid structures on a vector bundle A and its dual having a common extension (which turns out to be unique) to a Courant algebroid structure on A ⊕ A∗ with the symmetric form ( , )_+. For the original Courant bracket, A = TP with the usual bracket of vector fields and A∗ = T∗P with the zero bracket on 1-forms. When P is a point, we recover the Lie bialgebras. Lie bialgebroids arise as the linearizations of (possibly local) Poisson groupoids G, in which the bracket on A determines G, while the bracket on A∗ determines a compatible Poisson structure.

If (A, A∗) is a Lie bialgebroid over P, we can now define an (A, A∗) Dirac structure on P to be a maximal isotropic subbundle of the Courant algebroid L ⊂ A ⊕ A∗ which is closed under the bracket. Since the Lie algebroid anomalies are defined in terms of ( , )_+, they vanish on L, which is therefore an ordinary Lie algebroid.

The main result of this paper, already announced in [11], is that there is a 1-1 correspondence between (A, A∗) Dirac structures on P satisfying a certain regularity condition and Poisson homogeneous spaces of the form G/H, where G is a Poisson groupoid whose tangent Lie bialgebroid is (A, A∗), and H is a subgroupoid of G which is closed and wide, i.e. containing all the identity elements. (We will assume throughout this paper that all our groupoids are α-connected; i.e. the fibres of the source map are connected.) Drinfel’d’s theorem is the special case of this for P a point, while for ordinary (i.e. (TP, T∗P)) Dirac structures, we recover their description as Poisson structures on quotient manifolds of P.

On the way to our main result, we develop several topics of independent interest. First of all, we extend to (A, A∗) Dirac structures the original application of Dirac structures to Poisson reduction. A technical complication here is that we must deal with quotient spaces for which the projection is not a Poisson mapping. This renders the streamlined methods of “coisotropic calculus” [19] inapplicable, and we need to do a number of computations by hand. Eventually, it will be useful to develop a modified coisotropic calculus to handle Dirac reductions directly. Second, we study pullbacks of Dirac structures under morphisms of Lie bialgebroids. Finally, we discuss the general notion of homogeneous spaces for groupoids.

Here is an outline of the paper. Sect. 2 is a review of basic definitions and properties of Lie bialgebroids, Courant algebroids, and Dirac structures. (We will often omit the prefix “(A, A∗)”, which is still implied.) In Sect. 3, we establish a correspondence between Dirac structures and Poisson structures on quotient manifolds. Using this correspondence, we characterize in Sect. 4 the Dirac structures which are invariant under Poisson actions of groups, and we prove Drinfel’d’s theorem by extending a (g, g∗) Dirac structure to a left-invariant (TG, T∗G) Dirac structure on G. (Here, the bialgebroid has the nontrivial bracket on 1-forms coming from the Poisson structure on G.) Sect. 5 contains a theorem about pullbacks of Dirac structures, which is used in Sect. 6 to extend (A, A∗) Dirac structures on P to “left-invariant” (TG, T∗G) Dirac structures on the Poisson groupoid G. These invariant structures are then related to Poisson structures on quotients of P. Sect. 7 establishes a characterization of Poisson actions of Poisson groupoids. Finally, in Sect. 8, we define Poisson homogeneous spaces and then use the results of Sect. 6 to prove our main theorem.

2. Dirac Structures on a Lie Bialgebroid

The notion of Lie bialgebroids is a natural generalization of that of Lie bialgebras. Roughly speaking, a Lie bialgebroid is a pair of Lie algebroids (A, A∗) satisfying a certain compatibility condition. Such a condition was given in [16]. We quote here an equivalent formulation from [10].

Definition 2.1. A Lie bialgebroid is a dual pair (A, A∗) of vector bundles equipped with Lie algebroid structures such that the differential d_∗ on Γ(∧∗A) coming from the structure on A∗ is a derivation of the Schouten-type bracket on Γ(∧∗A) obtained by extension of the structure on A. Equivalently, d_∗ is a derivation for sections of A, i.e.,

\[
d_*[X,Y] = [d_*X, Y] + [X, d_*Y], \qquad \forall X, Y\in\Gamma(A).
\tag{1}
\]

For a Lie bialgebroid (A, A∗), the base P inherits a natural Poisson structure:

\[
\{f,g\}_P = \langle df, d_*g\rangle, \qquad \forall f, g\in C^\infty(P),
\tag{2}
\]

where d_∗ : C∞(P) → Γ(A) and d : C∞(P) → Γ(A∗) are the usual differential operators associated to Lie algebroids [16]. It is easy to check the identity

\[
[df, dg] = d\{f,g\}_P .
\tag{3}
\]

Given a Lie bialgebroid (A, A∗) over the base P, with anchors a and a_∗ respectively, let E denote their vector bundle direct sum: E = A ⊕ A∗. On E, there exist two natural nondegenerate bilinear forms, one symmetric and another antisymmetric, which are defined as follows:

\[
(X_1+\xi_1, X_2+\xi_2)_\pm = \tfrac12\bigl(\langle\xi_1,X_2\rangle \pm \langle\xi_2,X_1\rangle\bigr).
\tag{4}
\]

On Γ(E), we introduce a bracket by

\[
[e_1,e_2] = \bigl([X_1,X_2] + L_{\xi_1}X_2 - L_{\xi_2}X_1 - d_*(e_1,e_2)_-\bigr)
+ \bigl([\xi_1,\xi_2] + L_{X_1}\xi_2 - L_{X_2}\xi_1 + d(e_1,e_2)_-\bigr),
\tag{5}
\]

where e1 = X1 + ξ1 and e2 = X2 + ξ2. Finally, we let ρ : E → TP be the bundle map defined by ρ = a + a_∗. That is,

\[
\rho(X+\xi) = a(X) + a_*(\xi), \qquad \forall X\in\Gamma(A) \text{ and } \xi\in\Gamma(A^*).
\tag{6}
\]

When (A, A∗) is a Lie bialgebra (g, g∗), the bracket above reduces to the famous Lie bracket of Manin on the double g ⊕ g∗. On the other hand, if A is the tangent bundle Lie algebroid TM and A∗ = T∗M with zero bracket, then Eq. (5) takes the form:

\[
[X_1+\xi_1, X_2+\xi_2] = [X_1,X_2] + \bigl\{L_{X_1}\xi_2 - L_{X_2}\xi_1 + d(e_1,e_2)_-\bigr\}.
\tag{7}
\]

This is the bracket first introduced by Courant [3], then generalized to the context of the formal variational calculus by Dorfman [5]. In general, E together with this bracket and the bundle map ρ satisfies certain properties as outlined in the following:

Proposition 2.2 ([11]). Given a Lie bialgebroid (A, A∗), let E = A ⊕ A∗. Then E, with the nondegenerate symmetric bilinear form (·, ·)_+, the skew-symmetric bracket [·, ·] on Γ(E) and the bundle map ρ : E → TP as introduced above, satisfies the following properties:
1. for any e1, e2, e3 ∈ Γ(E), [[e1, e2], e3] + c.p. = DT(e1, e2, e3);
2. for any e1, e2 ∈ Γ(E), ρ[e1, e2] = [ρe1, ρe2];
3. for any e1, e2 ∈ Γ(E) and f ∈ C∞(P), [e1, f e2] = f[e1, e2] + (ρ(e1)f)e2 − (e1, e2)Df;
4. ρ◦D = 0, i.e., for any f, g ∈ C∞(P), (Df, Dg) = 0;
5. for any e, h1, h2 ∈ Γ(E), ρ(e)(h1, h2) = ([e, h1] + D(e, h1), h2) + (h1, [e, h2] + D(e, h2)),
where

\[
T(e_1,e_2,e_3) = \tfrac13\,([e_1,e_2],e_3) + \text{c.p.},
\tag{8}
\]

and D : C∞(P) → Γ(E) is the map D = d_∗ + d. (The notation “+ c.p.” means to add the two terms obtained by circular permutation of the indices 1, 2, 3.)
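To see how Property 3 comes about, it can be checked by hand in the Courant case of Eq. (7). The following is a sketch under stated assumptions: A = TM and A∗ = T∗M with zero bracket, so that the anchor of A∗ vanishes, ρ(X + ξ) = X, and Df = df; the only outside input is the standard identity L_{fX}ξ = f L_X ξ + ⟨ξ, X⟩ df for a 1-form ξ.

```latex
% Property 3 of Proposition 2.2 for the bracket (7), with e_i = X_i + \xi_i.
\begin{align*}
[e_1, f e_2]
 &= [X_1, fX_2] + L_{X_1}(f\xi_2) - L_{fX_2}\xi_1 + d\bigl(f\,(e_1,e_2)_-\bigr)\\
 &= f[X_1,X_2] + (X_1 f)X_2 + f L_{X_1}\xi_2 + (X_1 f)\xi_2\\
 &\quad - f L_{X_2}\xi_1 - \langle\xi_1,X_2\rangle\,df
        + f\,d(e_1,e_2)_- + (e_1,e_2)_-\,df\\
 &= f[e_1,e_2] + (X_1 f)\,e_2
    + \Bigl(\tfrac12\langle\xi_1,X_2\rangle - \tfrac12\langle\xi_2,X_1\rangle
            - \langle\xi_1,X_2\rangle\Bigr)df\\
 &= f[e_1,e_2] + (\rho(e_1)f)\,e_2 - (e_1,e_2)_+\,df .
\end{align*}
% The last line uses (4): (e_1,e_2)_+ = \tfrac12(\langle\xi_1,X_2\rangle + \langle\xi_2,X_1\rangle).
```

The computation shows that the symmetric form appears as the obstruction to the usual Leibniz rule, which is why it disappears on isotropic subbundles.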

Objects satisfying the properties above are called Courant algebroids in [11]. In other words, we have:

Theorem 2.3. If (A, A∗) is a Lie bialgebroid, then E = A ⊕ A∗ together with ([·, ·], ρ, (·, ·)_+) is a Courant algebroid.

In this case, E is called the double of the Lie bialgebroid.

Definition 2.4. Let E = A ⊕ A∗ be the double of a Lie bialgebroid (A, A∗). A subbundle L of E is called isotropic if it is isotropic under the symmetric bilinear form (·, ·)_+. It is called integrable if Γ(L) is closed under the bracket [·, ·]. A Dirac structure, or Dirac subbundle, of the Lie bialgebroid (A, A∗) is a subbundle L ⊂ E which is maximally isotropic and integrable.


The following proposition follows immediately from the definition of Dirac structures, and Properties 1-5 of Proposition 2.2.

Proposition 2.5. Suppose that L is an integrable isotropic subbundle of a Courant algebroid (E, ρ, [·, ·], (·, ·)). Then (L, ρ|_L, [·, ·]) is a Lie algebroid. In particular, any Dirac subbundle of a Lie bialgebroid is a Lie algebroid.

3. Dirac Structures and Poisson Reduction

Suppose that (A, A∗) is a Lie bialgebroid over the base manifold P, with anchors a and a_∗ respectively. Let E = A ⊕ A∗ denote its double, and let L ⊂ E be a Dirac subbundle. Clearly, L ∩ A is a (singular) subalgebroid of A, and therefore D = a(L ∩ A) is an integrable (singular) distribution on P. We call D the characteristic distribution of L. Let F denote the corresponding (generally singular) foliation of P.

Definition 3.1. A Dirac subbundle L ⊂ E is called reducible if its characteristic distribution D induces a simple foliation. Here by a simple foliation, we mean a regular foliation F such that P/F is a nice manifold such that the projection is a submersion.

A function f ∈ C∞(P) is called L-admissible if there is a section in Γ(A), denoted by Y_f (Y_f may not be unique), such that Y_f + df ∈ Γ(L). We write C_L^∞(P) for the set of all L-admissible functions. Let L ⊂ E be a reducible Dirac structure. Then f is L-admissible iff f is constant along F, i.e., C_L^∞(P) ≅ C∞(P/F). For any f, g ∈ C_L^∞(P), define a bracket by

\[
\{f,g\} = \rho(e_f)\,g,
\tag{9}
\]

where e_f = Y_f + df, which is unique up to a section of L ∩ A.

Theorem 3.2. Suppose that L is a reducible Dirac structure. The bracket (9) defines a Poisson structure on C∞(P/F).

Proof. From the definition, we have

\[
\{f,g\} = \langle Y_f, dg\rangle + \langle df, d_*g\rangle, \qquad \forall f, g\in C_L^\infty(P).
\]

On the right hand side, the first term is skew-symmetric since L is isotropic. The second term is just the Poisson bracket on C∞(P) as defined by Eq. (2). Hence, {·, ·} is skew-symmetric. Next, we prove that C_L^∞(P) is closed under this bracket and the Jacobi identity holds. Let [e_f, e_g]^∗ denote the component of [e_f, e_g] in Γ(A∗). According to Eq. (5),

\begin{align*}
[e_f,e_g]^* &= [df,dg] + L_{Y_f}dg - L_{Y_g}df + d(e_f,e_g)_-\\
 &= [df,dg] + L_{Y_f}dg - L_{Y_g}df + d(e_f,e_g)_- + d(e_f,e_g)_+\\
 &= d\{f,g\}_P + d\langle Y_f,dg\rangle\\
 &= d\{f,g\}.
\end{align*}

This means that {f, g} is also L-admissible and one can take [e_f, e_g] as e_{f,g}. It follows that

\[
\{\{f,g\},h\} = \rho(e_{\{f,g\}})h = \rho([e_f,e_g])h = [\rho(e_f),\rho(e_g)]h = \{f,\{g,h\}\} - \{g,\{f,h\}\}.
\]

That is, {·, ·} defines a Poisson structure on P/F.
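The middle step of the computation of the A∗-component of [e_f, e_g] can be unpacked as follows. This is a sketch, assuming the Cartan formula L_Y = d i_Y + i_Y d and d² = 0 for the Lie algebroid differential d of A (standard facts not restated in the text), together with Eqs. (3) and (4).

```latex
% Why [df,dg] + L_{Y_f}dg - L_{Y_g}df + d(e_f,e_g)_- equals d\{f,g\}.
\begin{align*}
(e_f,e_g)_+ &= \tfrac12\bigl(\langle df,Y_g\rangle + \langle dg,Y_f\rangle\bigr) = 0
  && \text{(isotropy of } L\text{)},\\
(e_f,e_g)_- + (e_f,e_g)_+ &= \langle df, Y_g\rangle
  && \text{(by (4))},\\
L_{Y_f}dg = d\langle Y_f,dg\rangle,&\qquad L_{Y_g}df = d\langle Y_g,df\rangle
  && \text{(Cartan formula, } d^2=0\text{)},\\
L_{Y_f}dg - L_{Y_g}df + d\langle df,Y_g\rangle &= d\langle Y_f,dg\rangle .
\end{align*}
% Combined with [df,dg] = d\{f,g\}_P from (3), the component is
% d\bigl(\{f,g\}_P + \langle Y_f,dg\rangle\bigr) = d\{f,g\}.
```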


In what follows, we apply the result above to a special class of Lie bialgebroids: Lie bialgebroids of Poisson manifolds. Moreover, we will prove that in this case Dirac structures, roughly speaking, are in one-one correspondence with Poisson structures on quotient spaces of the Poisson manifold.

Given a Poisson manifold (P, π), its cotangent bundle T∗P inherits a natural Lie algebroid structure, called the cotangent Lie algebroid of the Poisson manifold P [2]. On the other hand, the tangent bundle TP is a Lie algebroid in an evident sense. It is known that they constitute a Lie bialgebroid [16]. For simplicity, we will use (TP, T∗P; π) to denote this Lie bialgebroid. As a special case of Theorem 3.2, therefore, any reducible Dirac structure L on its double TP ⊕ T∗P will induce a Poisson structure on the quotient space P/F.

Conversely, suppose that D is an integrable distribution on P with foliation F, which is simple. Assume that M = P/F has a Poisson structure. Let J : P → M denote the natural projection. To keep under control the fact that J is not a Poisson map, we define a “difference” bracket C∞(M) × C∞(M) → C∞(P) by:

\[
\{f,g\}_1 = J^*\{f,g\} - \{J^*f, J^*g\}_P, \qquad \forall f, g\in C^\infty(M).
\tag{10}
\]

It is easy to see, by using the Leibniz identity of Poisson brackets, that this bracket defines a skew-symmetric bilinear form on the conormal bundle D^⊥, which in turn induces a bundle map

\[
\Lambda : D^\perp \to TP/D.
\tag{11}
\]

Let pr : TP → TP/D be the natural projection. Define a subbundle L ⊂ TP ⊕ T∗P by

\[
L = \{(\nu,\xi) \mid \mathrm{pr}(\nu) = \Lambda\xi,\ \forall \nu\in TP,\ \xi\in D^\perp\}.
\tag{12}
\]

It is clear that L is a maximal isotropic subbundle of TP ⊕ T∗P, and C_L^∞(P) ≅ C∞(M). For any f ∈ C_L^∞(P), it is easy to see, from the definition, that there exists a vector field Y_f ∈ X(P) such that Y_f + df ∈ Γ(L). In fact, Γ(L) is spanned by all those sections of the form g(Y_f + df), for f ∈ C_L^∞(P) and g ∈ C∞(P). Since L is isotropic, Property 3 of Proposition 2.2 shows that, to prove that L is integrable, it suffices to show that the bracket is closed for sections of the form Y_f + df.

Given any f, g ∈ C_L^∞(P), let e_f = Y_f + df be a section in Γ(L). Similarly, let e_g = Y_g + dg and e_{f,g} = Y_{f,g} + d{f, g} ∈ Γ(L). It is easy to check that ρ(e_f)g = {f, g}. By virtue of the Jacobi identity, we have

\[
\rho\bigl([e_f,e_g] - e_{\{f,g\}}\bigr)h = 0, \qquad \forall f, g, h\in C_L^\infty(P).
\tag{13}
\]

Since the component of [e_f, e_g] on Γ(T∗P) is d{f, g} according to the proof of Theorem 3.2, [e_f, e_g] − e_{f,g} is a section in Γ(TP). Thus Eq. (13) implies that [e_f, e_g] − e_{f,g} ∈ Γ(L ∩ TP) ⊂ Γ(L). Hence, [e_f, e_g] ∈ Γ(L), and therefore L is integrable. This proves the following


Theorem 3.3. Suppose that P is a Poisson manifold. There is a one-one correspondence between reducible Dirac structures in the double E = TP ⊕ T∗P and Poisson structures on a quotient space P/F.

Remarks. (1) This is a generalization of a result of Courant [3]. As a main motivation for the introduction of Dirac structures, Courant proved that a Dirac structure on TP ⊕ T∗P, when P is equipped with the zero Poisson structure, induces a Poisson bracket on a quotient space. In fact, the first part of Theorem 3.3 can be obtained by reducing to the zero Poisson case. This can be seen as follows. If P has a non-trivial Poisson structure, the double TP ⊕ T∗P, as a Courant algebroid, is still isomorphic to the double studied by Courant. As a consequence, any Dirac structure will thus induce a Poisson structure on the quotient. However, the converse seems to be new.

(2) If the Poisson structure on the quotient P/F is induced from that on P, i.e., the projection J is a Poisson map, then it is simple to see that L = D ⊕ D^⊥. This is called a null Dirac structure (see [11]). Thus, we have proved: a foliation D on a Poisson manifold P is compatible with the Poisson structure iff L = D ⊕ D^⊥ is a Dirac subbundle of E = TP ⊕ T∗P.

(3) The result is interesting even when D = 0. In this case, a Poisson structure on the quotient is simply another Poisson structure π1 on P. Then, L is simply the graph of the bundle map T∗P → TP induced from the bivector field π1 − π.
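Remark (3) can be made explicit by specializing Eqs. (10)–(12) to D = 0. The sketch below writes Λ for the bundle map of Eq. (11) and assumes the sign convention ⟨π^♯ξ, η⟩ = π(ξ, η) with {f, g}_π = π(df, dg); that convention is a choice made here for illustration and is not fixed by the text.

```latex
% D = 0: then M = P, J = \mathrm{id}, D^\perp = T^*P and TP/D = TP.
\begin{align*}
\{f,g\}_1 &= \{f,g\}_{\pi_1} - \{f,g\}_{\pi} = (\pi_1-\pi)(df,dg),\\
\Lambda &= (\pi_1-\pi)^\sharp : T^*P \to TP,\\
L &= \{(\Lambda\xi,\ \xi) \mid \xi\in T^*P\}.
\end{align*}
% Isotropy is the skew-symmetry of \pi_1-\pi:
\[
(\Lambda\xi+\xi,\ \Lambda\eta+\eta)_+
 = \tfrac12\bigl(\langle\xi,\Lambda\eta\rangle + \langle\eta,\Lambda\xi\rangle\bigr)
 = \tfrac12\bigl((\pi_1-\pi)(\eta,\xi) + (\pi_1-\pi)(\xi,\eta)\bigr) = 0 .
\]
```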

4. Invariant Dirac Structures

This section is devoted to the study of invariant Dirac structures of a Poisson group. As an application, we will give a new proof for the Drinfel’d theorem on homogeneous spaces.

Lemma 4.1. Let P be a Poisson manifold with a Lie group G-action: {ϕ_k}_{k∈G}. Write ϕ_k(x) = kx for x ∈ P. Suppose that L ⊂ TP ⊕ T∗P is a reducible Dirac subbundle. Then L is G-invariant iff both the characteristic distribution D and the difference bracket {·, ·}_1 defined by Eq. (10) are G-invariant.

Proof. First, suppose that L is G-invariant. Then, D = L ∩ TP is clearly G-invariant. For any k ∈ G and Y_f + df ∈ Γ(L),

\[
(\varphi_{k*}^{-1} + \varphi_k^*)(Y_f + df) = \varphi_{k*}^{-1}(Y_f) + d(\varphi_k^* f) \in \Gamma(L).
\]

This means that ϕ_k^∗ f is also L-admissible and we may take Y_{ϕ_k^∗ f} = ϕ_{k∗}^{−1}(Y_f). Hence,

\begin{align*}
\{\varphi_k^* f, \varphi_k^* g\}_1(x) &= \langle Y_{\varphi_k^* f}(x),\ d(\varphi_k^* g)(x)\rangle\\
 &= \langle Y_{\varphi_k^* f}(x),\ \varphi_k^*(dg(kx))\rangle\\
 &= \langle Y_f(kx),\ dg(kx)\rangle\\
 &= \{f,g\}_1(kx)\\
 &= \varphi_k^*\{f,g\}_1(x).
\end{align*}

Conversely, from the assumption, it is easy to see that the bundle map Λ as defined by Eq. (11) is G-invariant. Thus L is G-invariant according to Eq. (12).


Remark. We note that in general the group action does not preserve the bracket on the double TP ⊕ T∗P unless it preserves the Poisson structure on P. However, as we shall see below, in most interesting cases, we need to study a Poisson group action which does not preserve the Poisson structure.

Theorem 4.2. With the notation above, suppose that G is a Poisson group and P is a Poisson G-space. Then the following statements are equivalent:
1. L is G-invariant.
2. The G-action can be reduced to the quotient space P/F such that the reduced action is also a Poisson action.

Proof. By definition, {·, ·}_1 is G-invariant iff

\[
\varphi_k^*\{f,g\}_P - \{\varphi_k^* f, \varphi_k^* g\}_P = \varphi_k^*\{f,g\} - \{\varphi_k^* f, \varphi_k^* g\}, \qquad \forall f, g\in C^\infty(P/F).
\tag{14}
\]

Recall that, for a Poisson Lie group G, a Poisson manifold P with a G-action is a Poisson G-space iff the equality

\[
\varphi_k^*\{f,g\}_P(x) - \{\varphi_k^* f, \varphi_k^* g\}_P(x) = \{f_x, g_x\}_G(k)
\tag{15}
\]

holds for all k ∈ G, x ∈ P, and f, g ∈ C∞(P), where the function f_x ∈ C∞(G) is defined by f_x(k) = f(kx). When L is G-invariant, its characteristic foliation F is also G-invariant. Hence, the action can be reduced to P/F. Moreover, combining Eqs. (14) and (15), we get

\[
\varphi_k^*\{f,g\}(x) - \{\varphi_k^* f, \varphi_k^* g\}(x) = \{f_x, g_x\}_G(k), \qquad \forall k\in G,\ x\in P, \text{ and } f, g\in C_L^\infty(P).
\tag{16}
\]

This means that P/F is a Poisson G-space. Conversely, if both P and P/F are Poisson G-spaces, then Eqs. (15) and (16) imply Eq. (14), which is equivalent to L being G-invariant.

Now, by means of the theory of Dirac structures, we are in a position to explain Drinfel’d’s more or less mysterious theorem regarding Poisson homogeneous spaces, outlined in his short paper [8] (also see [12] for the interpretation of the associated Dirac structures of Poisson homogeneous spaces in terms of Lie algebroids).

To a Poisson Lie group (G, π_G), there are associated two Lie bialgebroids. One is (TG, T∗G; π_G) with the canonical Lie bialgebroid structure induced from the Poisson structure on G, and the other is its tangent Lie bialgebra (g, g∗). Identifying TG ⊕ T∗G with the trivial vector bundle G × (g ⊕ g∗) by left translations, it is clear that there is a 1-1 correspondence between maximal isotropic subspaces L of g ⊕ g∗ and left invariant maximal isotropic subbundles L̄ of TG ⊕ T∗G. Moreover, we have

Lemma 4.3. Given any e1, e2 ∈ g ⊕ g∗, let ē1 and ē2 be their corresponding left invariant sections in TG ⊕ T∗G. Then,

\[
[\bar e_1, \bar e_2] = \overline{[e_1,e_2]} .
\]

Proof. Assume that e1 = X + ξ, e2 = Y + η ∈ g ⊕ g∗, and ē1 = X̄ + ξ̄, ē2 = Ȳ + η̄. Here X̄, Ȳ and ξ̄, η̄ denote the corresponding left invariant vector fields and 1-forms on G for X, Y ∈ g and ξ, η ∈ g∗ respectively. From the fact that [X̄, Ȳ] = \overline{[X, Y]} and [ξ̄, η̄] = \overline{[ξ, η]} (see [20]), it follows that

\[
L_{\bar X}\bar\xi = \overline{\mathrm{ad}^*_X\xi}, \qquad L_{\bar\xi}\bar X = \overline{\mathrm{ad}^*_\xi X}.
\]


This means that

\[
[\bar X,\bar\xi] = L_{\bar X}\bar\xi - L_{\bar\xi}\bar X + \tfrac12(d_* - d)\langle\bar X,\bar\xi\rangle
 = \overline{\mathrm{ad}^*_X\xi} - \overline{\mathrm{ad}^*_\xi X} = \overline{[X,\xi]} .
\]

Therefore,

\[
[\bar e_1,\bar e_2] = [\bar X,\bar Y] + [\bar X,\bar\eta] + [\bar\xi,\bar Y] + [\bar\xi,\bar\eta]
 = \overline{[X,Y]} + \overline{[X,\eta]} + \overline{[\xi,Y]} + \overline{[\xi,\eta]}
 = \overline{[e_1,e_2]} .
\]

An immediate consequence is the following:

Corollary 4.4. As above, L is a Lie subalgebra iff L̄ is integrable. That is, Dirac structures of g ⊕ g∗ are in 1-1 correspondence with left invariant Dirac structures of TG ⊕ T∗G.

Proof. Sections of L̄ are spanned by those of the form f ē, where f ∈ C∞(G). The conclusion thus follows immediately from Lemma 4.3 by using Property 3 of Proposition 2.2.

Given a left invariant TG ⊕ T∗G Dirac structure L̄ (for the Lie bialgebroid associated with the Poisson structure on G), its characteristic distribution L̄ ∩ TG is just the left translation of the subalgebra h = L ∩ g of g. So its quotient space is G/H, where H is the connected subgroup of G with Lie algebra h. On the other hand, it is well known that G/H is a nice manifold such that the projection is a submersion iff H is closed. In this situation, we call L a regular (g, g∗) Dirac structure. In other words, L is regular if the left translation of h = L ∩ g defines a simple foliation on G. It is simple to see that L is regular iff L̄ is reducible. Thus, by Theorem 4.2, we obtain the following:

Theorem 4.5. Regular Dirac structures of g ⊕ g∗ are in 1-1 correspondence with Poisson homogeneous spaces G/H, where H is a connected closed subgroup of G.

Every homogeneous G-space X is of the form X = G/H̃, where H̃ is a closed subgroup of G, with H being its connected component at the unit. Then D = H̃/H is a discrete group, and the projection p : G/H → X is a covering map with structure group D. Any Poisson structure on X can be pulled back to G/H such that p is a Poisson map. Moreover, if one is a Poisson homogeneous space, so is the other. Thus, any Poisson homogeneous G-space X = G/H̃ induces a regular Dirac structure L in g ⊕ g∗. It is easy to see that L is Ad_H̃-invariant.

Conversely, given a regular Dirac structure L of g ⊕ g∗, let G/H be its corresponding Poisson homogeneous G-space. Then, the Poisson structure on G/H can be reduced to a homogeneous G-space X = G/H̃ = (G/H)/D iff L is Ad_D-invariant, or equivalently, is Ad_H̃-invariant. Thus, we obtain the following:

Theorem 4.6. [8] Poisson homogeneous G-spaces bijectively correspond to pairs (L, K), where L is a regular Dirac structure of g ⊕ g∗ and K is a closed subgroup of G with Lie algebra L ∩ g such that L is invariant under the (adjoint, coadjoint) action of K.


5. Pullback of Dirac Structures

This section is devoted to a discussion of pullbacks of Dirac structures. It will be used in Sect. 6 to extend $(A, A^*)$ Dirac structures on $P$ to "left-invariant" $(TG, T^*G)$ Dirac structures on the Poisson groupoid $G$. These invariant structures are then related to Poisson structures on quotients of $G$.

Given two vector spaces $U$, $V$, and a surjective linear map $\Phi : U \to V$, its dual $\Phi^* : V^* \to U^*$ is injective and $\Phi^*(V^*) = (\ker\Phi)^\perp$. Write
$$\bar\Phi = \Phi \oplus (\Phi^*)^{-1} : U \oplus (\ker\Phi)^\perp \to V \oplus V^*.$$
Clearly, $\bar\Phi$ is a surjective linear map. Given any maximal isotropic subspace $L \subset V \oplus V^*$, we denote by $\bar L$ the inverse image $\bar\Phi^{-1}(L)$. Then $\bar L$ is a maximal isotropic subspace of $U \oplus U^*$, which is called the pullback of $L$.

Similarly, suppose that $A \to P$ and $B \to Q$ are vector bundles, and $\Phi : A \to B$ is a surjective bundle map covering a map $P \to Q$. Then, given any maximal isotropic subbundle $L \subset B \oplus B^*$, we may define its pullback $\bar L \subset A \oplus A^*$ as the fiberwise pullback. Then $\bar L$ is a maximal isotropic subbundle, and the restriction of $\bar\Phi$ to $\bar L$, denoted by the same symbol $\bar\Phi$, is a vector bundle morphism $\bar L \to L$.

Given a vector bundle morphism $\rho : E_1 \to E_2$, a section $\bar X$ of $E_1$ will be called admissible if $\rho\bar X$ corresponds to a section $X$ in $E_2$. In this case, $\bar X$ and $X$ are said to be $\rho$-related. If $\rho$ is surjective, then $\Gamma(E_1)$ is spanned over $\mathbb{R}$ (possibly by an infinite but locally finite sum) by all sections of the form $f\bar X$ for $f \in C^\infty(P)$ and $\bar X \in \Gamma(E_1)$ admissible, by a partition of unity argument.

Applying the observation above to the bundle morphism $\bar\Phi : \bar L \to L$, it follows that
$$\Gamma(\bar L) = \mathrm{span}\{ f\bar e \mid \bar e \in \Gamma(\bar L) \text{ admissible and } f \in C^\infty(P)\}. \tag{17}$$

Theorem 5.1. Let $(A, A^*)$, $(B, B^*)$ be two Lie bialgebroids, and $\Phi : A \to B$ a Lie bialgebroid morphism [16] which is a surjective bundle map. Thus the bundle map $\bar\Phi = \Phi \oplus (\Phi^*)^{-1} : A \oplus (\ker\Phi)^\perp \to B \oplus B^*$ is surjective. Given any maximal isotropic subbundle $L \subset B \oplus B^*$, its pullback $\bar L = \bar\Phi^{-1}(L)$ is a Dirac structure of $A \oplus A^*$ iff $L$ is a Dirac structure. Moreover, in this case, $\bar\Phi : \bar L \to L$ is a Lie algebroid morphism.

Proof. Since $\Gamma(\bar L)$ is spanned by sections of the form $f\bar e$, it suffices to prove the following identity:
$$\bar\Phi[\bar e_1, \bar e_2] = [e_1, e_2], \quad \forall \text{ admissible } \bar e_1, \bar e_2 \in \Gamma(\bar L), \tag{18}$$
according to Property 3 of Proposition 2.2.

Write $\bar e_1 = \bar X + \bar\xi$ and $\bar e_2 = \bar Y + \bar\eta$, where $\bar X$ and $\bar Y$ are admissible sections of $A$ under the map $\Phi$, and $\bar\xi$ and $\bar\eta$ are admissible sections of $(\ker\Phi)^\perp$ under the map $(\Phi^*)^{-1}$. Denote by $X$, $Y$ and $\xi$, $\eta$ their corresponding sections in $B$ and $B^*$, respectively. Since $\Phi$ is a Lie algebroid morphism, by definition (see [9]),
$$\Phi[\bar X, \bar Y] = [X, Y]. \tag{19}$$

Moreover, $\Phi$ is a Poisson map, where $A$ and $B$ are equipped with the Lie-Poisson structures corresponding to the Lie algebroids $A^*$ and $B^*$ respectively, since $\Phi$ is a Lie bialgebroid morphism. Thus,
$$\Phi^*\{l_\xi, l_\eta\} = \{\Phi^* l_\xi, \Phi^* l_\eta\},$$


where $l_\xi$ and $l_\eta$ are the linear functions on $B$ corresponding to $\xi, \eta \in \Gamma(B^*)$. Therefore, $\Phi^* l_{[\xi,\eta]} = \{l_{\bar\xi}, l_{\bar\eta}\} = l_{[\bar\xi,\bar\eta]}$. Thus it follows that
$$(\Phi^*)^{-1}[\bar\xi, \bar\eta] = [\xi, \eta]. \tag{20}$$
That is, $(\Phi^*)^{-1} : (\ker\Phi)^\perp \to B^*$ is also a Lie algebroid morphism, where $(\ker\Phi)^\perp$ is considered as a subalgebroid of $A^*$. Now
$$[\bar e_1, \bar e_2] = [\bar X, \bar Y] + [\bar X, \bar\eta] + [\bar\xi, \bar Y] + [\bar\xi, \bar\eta].$$
According to Eq. (5),
$$[\bar X, \bar\eta] = L_{\bar X}\bar\eta - L_{\bar\eta}\bar X + \tfrac{1}{2}(d_* - d)\langle \bar X, \bar\eta\rangle = i_{\bar X}\, d\bar\eta - i_{\bar\eta}\, d_*\bar X - \tfrac{1}{2}(d_* - d)\langle \bar X, \bar\eta\rangle.$$
Since $\Phi$ is a Lie algebroid morphism, $d\bar\eta$ is admissible and is $(\Phi^*)^{-1}$-related to $d\eta$. Similarly, $d_*\bar X$ is $\Phi$-related to $d_* X$. Finally, note that $\langle \bar X, \bar\eta\rangle = \varphi^*\langle X, \eta\rangle$, where $\varphi : P \to Q$ is the base map covered by $\Phi$, so $d_*\langle \bar X, \bar\eta\rangle$ and $d_*\langle X, \eta\rangle$ are $\Phi$-related while $d\langle \bar X, \bar\eta\rangle$ and $d\langle X, \eta\rangle$ are $(\Phi^*)^{-1}$-related. Hence,
$$\bar\Phi[\bar X, \bar\eta] = [X, \eta].$$
Similarly, $\bar\Phi[\bar\xi, \bar Y] = [\xi, Y]$. Hence, Eq. (18) follows. This concludes the proof of the theorem.

Example 5.2. Recall that a hamiltonian operator on a Lie bialgebroid $(B, B^*)$ is a skew-symmetric two-form $I \in \Gamma(\wedge^2 B^*)$ satisfying the following Maurer-Cartan type equation [11]:
$$dI + \tfrac{1}{2}[I, I] = 0.$$
In particular, $I$ is called a strong hamiltonian operator if both $dI$ and $[I, I]$ vanish. Given a two-form $I \in \Gamma(\wedge^2 B^*)$, the graph $L_I = \{X + I^\flat X \mid X \in B\}$ of its induced bundle map is a Dirac structure iff $I$ is a hamiltonian operator. Suppose that $\Phi : A \to B$ is a Lie bialgebroid morphism and $I \in \Gamma(\wedge^2 B^*)$ a (strong) hamiltonian operator. $\Phi^*$ pulls a two-form in $\Gamma(\wedge^2 B^*)$ back to a two-form in $\Gamma(\wedge^2 A^*)$. It is easy to see that $\Phi^* I$ is then a (strong) hamiltonian operator in $A$. Moreover, the pullback of the corresponding Dirac structure $L_I$ is exactly the graph $L_{\bar I}$ of the (strong) hamiltonian operator $\bar I = \Phi^* I$.

Example 5.3. Given a surjective submersion $\varphi : M \to N$ of manifolds $M$, $N$, its derivative defines a Lie algebroid morphism $\Phi : TM \to TN$. This is also a Lie bialgebroid morphism between $(TM, T^*M)$ and $(TN, T^*N)$, where $T^*M$ and $T^*N$ are considered as the cotangent algebroids for the zero Poisson structure. According to Courant [3], in this case, a Dirac structure simply corresponds to a foliation on the manifold together with a family of closed 2-forms on the leaves. The pullback of the Dirac structure is just the pullback of the foliation by $\varphi$ together with the pullback of the two-forms.

Remark. A Dirac structure on a vector space is equivalent to a two-form on a subspace [3]. Thus, we could also pull back an isotropic subbundle by any bundle map, not just a surjection. Of course, the pullback might not be continuous if the map is not of constant rank. Moreover, it is not clear whether the integrability condition is preserved.
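As a hedged illustration of the linear pullback defined at the start of this section, the following worked example takes the simplest surjection, the projection of $\mathbb{R}^2$ onto its first coordinate. The bases $e_1, e_2$, $e^1, e^2$, $f$, $f^*$ and the two choices of $L$ are assumptions made only for this example; they do not appear in the text.

```latex
% Assumed setup: U = R^2 with basis e_1, e_2 and dual basis e^1, e^2;
% V = R with basis f and dual basis f^*; Phi(e_1) = f, Phi(e_2) = 0.
\begin{align*}
\ker\Phi &= \operatorname{span}\{e_2\}, \qquad
\Phi^*(f^*) = e^1, \qquad
(\ker\Phi)^\perp = \operatorname{span}\{e^1\} = \Phi^*(V^*), \\
\bar\Phi(a\,e_1 + b\,e_2 + c\,e^1) &= a\,f + c\,f^* .
\end{align*}
% Case 1: L = V \oplus 0, the graph of the zero two-form on V.
\[
\bar L = \bar\Phi^{-1}(L) = \{a\,e_1 + b\,e_2 + c\,e^1 : c = 0\} = U \oplus 0 .
\]
% Case 2: L = 0 \oplus V^*.
\[
\bar L = \{a\,e_1 + b\,e_2 + c\,e^1 : a = 0\}
       = \operatorname{span}\{e_2\} \oplus \operatorname{span}\{e^1\}
       = \ker\Phi \oplus (\ker\Phi)^\perp .
\]
% Isotropy in Case 2 reduces to the single pairing <e^1, e_2> = 0.
```

In both cases $\bar L$ has dimension $2 = \dim U$ and the pairing vanishes on it, so $\bar L$ is maximal isotropic in $U \oplus U^*$, in agreement with the general statement. Both pullbacks contain $\ker\bar\Phi = \ker\Phi \oplus 0$, as any inverse image under $\bar\Phi$ must.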


6. Left Invariant Dirac Structures on Poisson Groupoids

To generalize Drinfel'd's theorem on homogeneous spaces from Poisson groups to Poisson groupoids, we will first extend the notion of left invariant Dirac structure from Poisson groups to groupoids.

Let $(G \rightrightarrows P; \alpha, \beta)$ be a Poisson groupoid, with Lie algebroid $A$. Here $\Gamma(A)$ is identified with left-invariant vector fields on the groupoid. The dual bundle $A^*$ can be naturally identified with the conormal bundle of the identity space $P$ in the groupoid, and therefore inherits a Lie algebroid structure according to Weinstein [19]. Moreover, it was shown in [16] that $(A, A^*)$ is a Lie bialgebroid, which is called the tangent Lie bialgebroid of $G$. By $T^\alpha G$, we denote the subbundle of $TG$ consisting of all vectors tangent to $\alpha$-fibers.

The group $B(G)$ of bisections of $G$ (submanifolds which project diffeomorphically to $P$ by $\alpha$ and $\beta$) acts naturally on $G$ by left multiplication:
$$l_K x = K \cdot x, \quad \forall K \in B(G),\ x \in G.$$
As usual, the action lifts naturally to actions on $TG$ and $T^*G$ which leave $T^\alpha G$ and $(T^\alpha G)^\perp$ invariant.

Define a map $\Phi : T^*G \to A^*$ as follows: given any $x \in G$ and $\xi \in T_x^*G$, set $\Phi(\xi) \in A_p^*$ with $p = \beta(x)$ such that
$$\langle \Phi\xi, v\rangle = \langle \xi, T l_x v\rangle, \quad \forall v \in A_p. \tag{21}$$
Then
$$\begin{array}{ccc} T^*G & \xrightarrow{\ \Phi\ } & A^* \\ \downarrow & & \downarrow \\ G & \xrightarrow{\ \beta\ } & P \end{array} \tag{22}$$
is a bundle map.

Remark. In terms of symplectic groupoids [2], $\Phi$ is just the $\beta$-map of the cotangent groupoid $T^*G \rightrightarrows A^*$ [2]. However, our symplectic structure on $T^*G$ differs by a minus sign from the one on the symplectic groupoid $T^*G \rightrightarrows A^*$. Therefore, $\Phi$ is a Poisson map.

Another interesting way to think of $\Phi$ is as the momentum map for the lifted right action of $B(G)$ on the cotangent bundle $T^*G$. Here $A^*$ is considered as a subset of $\Gamma(A)^*$ in the form of delta-distributions. It can be checked that the image of the momentum map is just $A^*$. This also indicates that $\Phi$ should be a Poisson map.

The following lemma lists some basic properties of $\Phi$.

Lemma 6.1.
1. $\ker\Phi = (T^\alpha G)^\perp$;
2. for any $X \in \Gamma(A)$, $\Phi^* X$, as a section in $TG$, is exactly the left invariant tangent vector field $\bar X$ obtained by left translating $X$ along $\alpha$-fibers;
3. $\Phi$ is $B(G)$-invariant, i.e., $\Phi(l_K^* \xi) = \Phi(\xi)$, $\forall K \in B(G)$, and $\xi \in T^*G$;


4. $\Phi$ is a Lie bialgebroid morphism, where $(T^*G, TG)$ is equipped with the natural Lie bialgebroid structure associated with the Poisson structure on $G$ while $(A^*, A)$ is the flipping of the tangent Lie bialgebroid $(A, A^*)$.$^2$

$^2$ In order to be consistent with previous notation for Lie bialgebroid morphisms, we have flipped both Lie bialgebroids here.

Proof. The proof for (1)–(3) is obvious, and is left for the reader. For (4), since $\Phi$ is already known to be a Poisson map, it suffices to show that it is a Lie algebroid morphism. This is, however, quite clear since the Lie algebroid structure on $A^*$ is defined in terms of the Lie algebroid structure on $T^*G$ by identifying $A^*$ with the conormal bundle of $P$ in $G$ [19]. In fact, $T^*G$ is an LA-groupoid in the terminology of Mackenzie [15].

Proposition 6.2. A Dirac structure $\bar L \subset TG \oplus T^*G$ is the pullback of a Dirac structure in $A \oplus A^*$ iff
1. $\bar L$ is $B(G)$-invariant, and
2. $(T^\alpha G)^\perp \subset \bar L$.

Proof. Recall that $l_K$, $\forall K \in B(G)$, denotes the $B(G)$-action on $TG \oplus T^*G$. As in the previous section, let $\bar\Phi$ be the map $(\Phi^*)^{-1} \oplus \Phi : T^\alpha G \oplus T^*G \to A \oplus A^*$. It is obvious that $\bar\Phi$ is also invariant under the $B(G)$-action, i.e.,
$$\bar\Phi \circ l_K = \bar\Phi, \quad \forall K \in B(G).$$
Suppose that $\bar L$ is a Dirac structure in $TG \oplus T^*G$. If $\bar L$ is the pullback of a Dirac structure $L$ in $A \oplus A^*$, then $\bar L = \bar\Phi^{-1}(L)$. Clearly, $\bar L$ is $B(G)$-invariant since $\bar\Phi$ is invariant. Since $(T^\alpha G)^\perp = \ker\Phi$, it follows that $(T^\alpha G)^\perp \subset \bar L$.

Conversely, the condition $(T^\alpha G)^\perp \subset \bar L$ implies that $\bar L \subset T^\alpha G \oplus T^*G$, since $\bar L$ is isotropic. Since $\bar L$ is $B(G)$-invariant, it follows that $\bar\Phi(\bar L|_{K\cdot x}) = \bar\Phi(l_K \bar L|_x) = \bar\Phi(\bar L|_x)$. Therefore, $\bar\Phi(\bar L|_x)$ depends only on the base point $p = \beta(x)$, and thus defines a subspace $L_p$ in $A_p \oplus A_p^*$, which is easily seen to be maximal isotropic. Thus we obtain a maximal isotropic subbundle $L$ in $A \oplus A^*$ such that $\bar L = \bar\Phi^{-1}(L)$. According to Theorem 5.1, $L$ must be a Dirac subbundle.

When $G$ is a Poisson group, pullback Dirac structures in the sense above correspond exactly to left invariant Dirac structures as discussed in Sect. 4. For this reason, we shall call any such pullback a left invariant Dirac structure.

Suppose that $F$ is a foliation on $G$ with distribution $D \subset TG$ such that $G/F$ is a nice manifold. According to Theorem 3.3, every Poisson structure on $G/F$ corresponds to a Dirac structure $\bar L \subset TG \oplus T^*G$.

Theorem 6.3. $\bar L$ is the pullback of a Dirac structure $L \subset A \oplus A^*$ iff
1. $F$ is $B(G)$-invariant;
2. $\{\cdot,\cdot\}_1$ is $B(G)$-invariant; and
3. $D \subset T^\alpha G$ and $\{\alpha^* f, g\}_1 = 0$, $\forall f \in C^\infty(P)$ and $g \in C^\infty(G/F) \cong C^\infty_{\bar L}(G)$.

Proof. Suppose that $\bar L$ is the pullback of a Dirac structure $L$ under $\Phi$. Then $\bar L$ is $B(G)$-invariant. Thus, $D = \bar L \cap TG$ is also $B(G)$-invariant. According to Lemma 4.1, $\{\cdot,\cdot\}_1$ is $B(G)$-invariant. For (3), we note that $\bar L \subset T^\alpha G \oplus T^*G$ by Proposition 6.2 since $\bar L$ is isotropic. Thus it follows that $D = \bar L \cap TG \subset T^\alpha G$. Also note that $\alpha^* f$ is constant along $\alpha$-fibers, so $D \subset T^\alpha G$ implies that $\alpha^* f$ is admissible. Therefore, $\{\alpha^* f, g\}_1$ is well-defined. Now since $d\alpha^* f \in (T^\alpha G)^\perp = \ker\Phi \subset \bar L$, we may choose $Y_{\alpha^* f} = 0$. Thus, $\{\alpha^* f, g\}_1 = Y_{\alpha^* f}\, g = 0$.

Conversely, assume that (1)–(3) hold. So, for any $f \in C^\infty(P)$, $\alpha^* f$ is $\bar L$-admissible. As in Sect. 3, let $Y_{\alpha^* f} \in \mathfrak{X}(G)$ be any vector field such that $Y_{\alpha^* f} + d\alpha^* f \in \Gamma(\bar L)$. Then, $Y_{\alpha^* f}\, g = \{\alpha^* f, g\}_1 = 0$ for all admissible $g$. This implies that $Y_{\alpha^* f} \in \Gamma(D) \subset \Gamma(\bar L)$. Thus, $d\alpha^* f \in \Gamma(\bar L)$. Since $\ker\Phi = (T^\alpha G)^\perp$ is spanned by all such vectors, it follows that $\ker\Phi \subset \bar L$. By Lemma 4.1, (1)–(2) imply that $\bar L$ is $B(G)$-invariant. Thus $\bar L$ is the pullback of $L$ according to Proposition 6.2.

7. Poisson Actions

Let $G \rightrightarrows P$ be a Poisson groupoid with Poisson tensor $\pi_G$. Suppose that $G$ acts on a Poisson manifold $(X, \pi)$ equipped with a moment map $J : X \to P$. Here, the action is a map $m : (g, x) \mapsto g \cdot x$ from $G \times_P X = \{(g, x) \in G \times X \mid \beta(g) = J(x)\}$ to $X$, satisfying the usual condition $g \cdot (h \cdot x) = (gh) \cdot x$. The action is a Poisson action if its graph $\Omega = \{(g, x, g \cdot x) \mid \beta(g) = J(x)\}$ is a coisotropic submanifold of $G \times X \times \bar X$ [18].

For example, consider a complete Poisson group $H$ with dual Poisson group $H^*$. Then $G = HH^*$ is a symplectic groupoid over $H^*$. Suppose that $X$ is a Poisson $H$-space with an equivariant momentum map $J : X \to H^*$. Then $(X, J)$ is a Poisson $G$-space under the $G$-action [22]:
$$(g, u) \cdot x = gx, \quad g \in H,\ u \in H^* \text{ and } x \in X \text{ such that } J(x) = u.$$

Any bisection $K$ of the groupoid induces a diffeomorphism of $G$ by left multiplication by $K$. This is denoted by $l_K$. On the other hand, we can also define a diffeomorphism of $X$ by $x \mapsto K \cdot x$, $\forall x \in X$. This is denoted by the same symbol $l_K$ when confusion is unlikely. A section of the moment map is a submanifold of $X$ to which the restriction of $J$ is a diffeomorphism. Similarly, a local section of $J$ is a submanifold $Y$ of $X$ to which the restriction of $J$ is a diffeomorphism between $Y$ and an open subset $J(Y)$ of $P$. If $J$ is a submersion, through any point of $X$ there always exists a local section of $J$.
A global section $Y$ induces a map, denoted by $r_Y$, from $G$ to $X$ given by $g \mapsto gY$. Similarly, given any compatible $(g, x) \in G * X$ (i.e. $\beta(g) = J(x)$), a local section $Y$ through the point $x$ induces a map $r_Y$ from a neighborhood of $g$ to a neighborhood of $x$ in the same way. The main theorem is the following:

Theorem 7.1. Suppose that $G \rightrightarrows P$ is a Poisson groupoid acting on a Poisson manifold $X$ equipped with a moment map $J : X \to P$. This is a Poisson action iff
1. For any $f \in C^\infty(P)$,
$$X_{J^* f}(x) = (r_x)_* X_{\alpha^* f}(u), \tag{23}$$
where $x \in X$ and $u = J(x)$.


2. For any compatible $(g, x) \in G * X$,
$$\pi(gx) = (l_K)_* \pi(x) + (r_Y)_* \pi_G(g) - (r_Y)_* (l_K)_* \pi_G(u), \tag{24}$$
where $u = \beta(g) = J(x)$, $K$ is any local bisection of $G$ through $g$, and $Y$ any local section through the point $x$.

Remarks. (1) Here $r_x$ denotes the map $g \mapsto g \cdot x$ from $\beta^{-1}(u)$ to $X$. Since $X_{\alpha^* f}(u)$ is tangent to $\beta^{-1}(u)$, the right hand side of Eq. (23) makes sense. (2) When $G$ is a Poisson group, i.e. $P$ reduces to a point, the first condition is satisfied automatically, and the second one reduces to the usual condition of Poisson action since $\pi_G(u) = 0$.

Proposition 7.2. Under the hypotheses of Theorem 7.1, if the action is a Poisson action, then $J : X \to P$ is a Poisson map.

Proof. Let $x \in X$, $J(x) = u$ and $\xi, \eta \in T_u^* P$ be any covectors at $u$. For any $g \in G$ with $\beta(g) = u$, as covectors at the point $(g, x, gx)$, both $(-\beta^*\xi, J^*\xi, 0)$ and $(-\beta^*\eta, J^*\eta, 0)$ are conormal to $\Omega$. Therefore,
$$\pi_G(-\beta^*\xi, -\beta^*\eta) + \pi(J^*\xi, J^*\eta) = 0,$$
which implies that $(J_*\pi(x))(\xi, \eta) = -(\beta_*\pi_G(g))(\xi, \eta) = \pi_P(u)(\xi, \eta)$. That is, $J_*\pi = \pi_P$. In other words, $J$ is a Poisson map.

Lemma 7.3. Let $x \in X$ and $J(x) = u$. Suppose that $\delta_x \in T_x X$ and $\delta_u \in T_u G$ such that $(\delta_u, \delta_x) \in T_{(u,x)}(G * X)$. Let $\delta'_x = m_*(\delta_u, \delta_x) \in T_x X$. Then,
$$\delta'_x = \delta_x + (r_x)_*(\delta_u - \epsilon_* J_* \delta_x), \tag{25}$$
where $\epsilon : P \to G$ is the inclusion of the unit space.

Proof. As a tangent vector in $T_{(u,x)}(G * X)$, $(\delta_u, \delta_x)$ can be split into the sum of $(\epsilon_* J_* \delta_x, \delta_x)$ and $(\delta_u - \epsilon_* J_* \delta_x, 0)$. It is easy to see that $m_*(\delta_u - \epsilon_* J_* \delta_x, 0) = (r_x)_*(\delta_u - \epsilon_* J_* \delta_x)$ and $m_*(\epsilon_* J_* \delta_x, \delta_x) = \delta_x$. This proves the lemma.

Lemma 7.4. Suppose that $(g, x, z)$ with $z = gx$ is any point in $\Omega$, and $(\delta_g, \delta_x, \delta_z)$ any tangent vector of $\Omega$ at this point. Let $K$ be any (local) bisection of $G$ through the point $g$ and $Y$ any (local) section of $J$ through the point $x$. Then,
$$\delta_z = r_{Y*}\delta_g + l_{K*}\delta_x - l_{K*} r_{Y*} \epsilon_* J_* \delta_x. \tag{26}$$

Proof. Let $u = \beta(g) = J(x)$, and $\delta_u = l_{K*}^{-1}\delta_g$. Then $\delta_u$ is a tangent vector of $G$ at $u$. It is clear that
$$\begin{aligned}
\delta_z = m_*(\delta_g, \delta_x) &= l_{K*}\, m_*(l_{K*}^{-1}\delta_g, \delta_x) \\
&= l_{K*}\, m_*(\delta_u, \delta_x) \quad \text{(by Lemma 7.3)} \\
&= l_{K*}\big(\delta_x + (r_x)_*(\delta_u - \epsilon_* J_* \delta_x)\big) \\
&= l_{K*}\big(\delta_x + r_{Y*}(\delta_u - \epsilon_* J_* \delta_x)\big) \\
&= l_{K*}\big(\delta_x + r_{Y*}(l_{K*}^{-1}\delta_g - \epsilon_* J_* \delta_x)\big) \\
&= r_{Y*}\delta_g + l_{K*}\delta_x - l_{K*} r_{Y*} \epsilon_* J_* \delta_x.
\end{aligned}$$
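Eq. (26) can be read dually, which is how it is used in what follows. The short computation below is a sketch of that step; it assumes only the notation of Lemma 7.4, with $\epsilon$ denoting the inclusion of the unit space and $\langle\cdot,\cdot\rangle$ the pairing of covectors with vectors.

```latex
% For \zeta \in T_z^*X and a tangent vector (\delta_g, \delta_x, \delta_z) of the graph,
% pair \zeta with Eq. (26) and move each pushforward onto \zeta as a pullback:
\begin{align*}
\langle \zeta, \delta_z\rangle
 &= \langle \zeta, r_{Y*}\delta_g\rangle + \langle \zeta, l_{K*}\delta_x\rangle
    - \langle \zeta, l_{K*} r_{Y*}\epsilon_* J_*\delta_x\rangle \\
 &= \langle r_Y^*\zeta, \delta_g\rangle
    + \langle l_K^*\zeta - J^*\epsilon^* r_Y^* l_K^*\zeta, \delta_x\rangle .
\end{align*}
% Collecting all terms on one side:
\[
\langle -r_Y^*\zeta, \delta_g\rangle
 + \langle J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta, \delta_x\rangle
 + \langle \zeta, \delta_z\rangle = 0 .
\]
```

So the covector $(-r_Y^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ \zeta)$ annihilates every tangent vector of the graph at $(g, x, z)$, which is the conormality statement of Corollary 7.5.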


Corollary 7.5. Suppose that $(g, x, z) \in \Omega$, $K$ any (local) bisection of $G$ through the point $g$, and $Y$ any (local) section of $J$ through the point $x$. For any $\zeta \in T_z^* X$, the covector $(-r_Y^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ \zeta) \in T^*_{(g,x,z)}(G \times X \times X)$ is conormal to $\Omega$.

Lemma 7.6. Suppose that $G \rightrightarrows P$ is a Poisson groupoid. Then for any $f \in C^\infty(P)$ and $u \in P$, $X_{\alpha^* f}(u) - X_{\beta^* f}(u)$ is tangent to $P$ and equals $X_f(u)$.

Proof. As a covector of $G$ at $u$, $\alpha^* df - \beta^* df$ is clearly conormal to the unit space $P$. Since $P$ is a coisotropic submanifold, it follows that $X_{\alpha^* f}(u) - X_{\beta^* f}(u)$ is tangent to $P$. Hence $X_{\alpha^* f}(u) - X_{\beta^* f}(u) = \alpha_*(X_{\alpha^* f}(u) - X_{\beta^* f}(u)) = X_f(u)$.

Proposition 7.7. Suppose that $G \rightrightarrows P$ is a Poisson groupoid acting on a Poisson manifold $X$ with moment map $J : X \to P$. Suppose that the action is a Poisson action. Then, for any $f \in C^\infty(P)$,
$$X_{J^* f}(x) = (r_x)_* X_{\alpha^* f}(u), \quad \forall x \in X,$$
where $u = J(x)$.

Proof. Take any $g \in G$ such that $\beta(g) = J(x)$. Then $(g, x, z)$ with $z = gx$ is in $\Omega$. For any $\zeta \in T_z^* X$ and $f \in C^\infty(P)$, as covectors at $(g, x, z)$, both $(-r_Y^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ \zeta)$ and $(-\beta^* df, J^* df, 0)$ are conormal to $\Omega$. Therefore,
$$\pi_G(g)(-r_Y^*\zeta, -\beta^* df) + \pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ J^* df) = 0.$$
Now
$$\begin{aligned}
\pi_G(g)(-r_Y^*\zeta, -\beta^* df) &= \langle -X_{\beta^* f}(g), r_Y^*\zeta\rangle \\
&= \langle -r_{Y*} X_{\beta^* f}(g), \zeta\rangle \\
&= \langle -r_{Y*} l_{K*} X_{\beta^* f}(u), \zeta\rangle,
\end{aligned}$$
where the last step uses the fact that $X_{\beta^* f}(g) = l_{K*} X_{\beta^* f}(u)$. Also,
$$\begin{aligned}
\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta, J^* df) &= (J_*\pi(x))(\epsilon^* r_Y^* l_K^*\zeta, df) \\
&= \pi_P(u)(\epsilon^* r_Y^* l_K^*\zeta, df) \\
&= -X_f(u)(\epsilon^* r_Y^* l_K^*\zeta) \\
&= \langle -l_{K*} r_{Y*} \epsilon_* X_f(u), \zeta\rangle \quad \text{(using Lemma 7.6)} \\
&= \langle -l_{K*} r_{Y*}(X_{\alpha^* f}(u) - X_{\beta^* f}(u)), \zeta\rangle,
\end{aligned}$$
and
$$\pi(x)(-l_K^*\zeta, J^* df) = \langle X_{J^* f}(x), l_K^*\zeta\rangle = \langle l_{K*} X_{J^* f}(x), \zeta\rangle.$$
Therefore, it follows that
$$\langle -l_{K*} r_{Y*} X_{\alpha^* f}(u), \zeta\rangle + \langle l_{K*} X_{J^* f}(x), \zeta\rangle = 0,$$
which implies immediately that $X_{J^* f}(x) = r_{Y*} X_{\alpha^* f}(u) = (r_x)_* X_{\alpha^* f}(u)$.


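Proposition 7.8. Suppose that $G \rightrightarrows P$ is a Poisson groupoid acting on a Poisson manifold $X$ with moment map $J : X \to P$. Suppose that the action is a Poisson action. Then,
$$\pi(gx) = (l_K)_* \pi(x) + (r_Y)_* \pi_G(g) - (r_Y)_* (l_K)_* \pi_G(u), \tag{27}$$
where $u = \beta(g) = J(x)$, $K$ is any bisection of $G$ through $g$, and $Y$ any local section through the point $x$.

Lemma 7.9. For any $u \in P$ and $\eta \in T_u^* G$,
$$X_{\alpha^*\epsilon^*\eta}(u) + \epsilon_*\alpha_* X_\eta(u) - \epsilon_* X_{\epsilon^*\eta}(u) = X_\eta(u). \tag{28}$$

Proof. It is clear that $\alpha^*\epsilon^*\eta - \eta$, as a covector of $G$ at $u$, is conormal to $P$. Hence, $X_{\alpha^*\epsilon^*\eta}(u) - X_\eta(u)$ is tangent to $P$. It follows that
$$X_{\alpha^*\epsilon^*\eta}(u) - X_\eta(u) = \epsilon_*\alpha_*\big(X_{\alpha^*\epsilon^*\eta}(u) - X_\eta(u)\big) = \epsilon_* X_{\epsilon^*\eta}(u) - \epsilon_*\alpha_* X_\eta(u).$$
This completes the proof of the lemma.

Lemma 7.10.
$$\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta - l_K^*\eta) = (l_{K*}\pi(x))(\zeta, \eta) - (l_{K*} r_{Y*}\pi_G(u))(\zeta, \eta).$$

Proof.
$$\begin{aligned}
\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta) &= (J_*\pi(x))(\epsilon^* r_Y^* l_K^*\zeta,\ \epsilon^* r_Y^* l_K^*\eta) \\
&= \pi_P(u)(\epsilon^* r_Y^* l_K^*\zeta,\ \epsilon^* r_Y^* l_K^*\eta) \\
&= \langle X_{\epsilon^* r_Y^* l_K^*\zeta}(u),\ \epsilon^* r_Y^* l_K^*\eta\rangle \\
&= \langle \epsilon_* X_{\epsilon^* r_Y^* l_K^*\zeta}(u),\ r_Y^* l_K^*\eta\rangle,
\end{aligned}$$
$$\begin{aligned}
\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta,\ l_K^*\eta) &= \langle X_{J^*\epsilon^* r_Y^* l_K^*\zeta}(x),\ l_K^*\eta\rangle \quad \text{(using Proposition 7.7)} \\
&= \langle r_{Y*} X_{\alpha^*\epsilon^* r_Y^* l_K^*\zeta}(u),\ l_K^*\eta\rangle \\
&= \langle X_{\alpha^*\epsilon^* r_Y^* l_K^*\zeta}(u),\ r_Y^* l_K^*\eta\rangle.
\end{aligned}$$
Using the above relations,
$$\begin{aligned}
\pi(x)(l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta) &= -\langle X_{\alpha^*\epsilon^* r_Y^* l_K^*\eta}(u),\ r_Y^* l_K^*\zeta\rangle \\
&= \langle X_{r_Y^* l_K^*\zeta}(u),\ \alpha^*\epsilon^* r_Y^* l_K^*\eta\rangle \\
&= \langle \epsilon_*\alpha_* X_{r_Y^* l_K^*\zeta}(u),\ r_Y^* l_K^*\eta\rangle,
\end{aligned}$$
and therefore
$$\begin{aligned}
&\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta - l_K^*\eta) \\
&\quad = \pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta) - \pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta,\ l_K^*\eta) \\
&\qquad - \pi(x)(l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta) + \pi(x)(l_K^*\zeta, l_K^*\eta) \\
&\quad = \langle \epsilon_* X_{\epsilon^* r_Y^* l_K^*\zeta}(u) - X_{\alpha^*\epsilon^* r_Y^* l_K^*\zeta}(u) - \epsilon_*\alpha_* X_{r_Y^* l_K^*\zeta}(u),\ r_Y^* l_K^*\eta\rangle + \pi(x)(l_K^*\zeta, l_K^*\eta) \\
&\quad = \langle -X_{r_Y^* l_K^*\zeta}(u),\ r_Y^* l_K^*\eta\rangle + \pi(x)(l_K^*\zeta, l_K^*\eta) \quad \text{(using Eq. (28))} \\
&\quad = -\pi_G(u)(r_Y^* l_K^*\zeta,\ r_Y^* l_K^*\eta) + \pi(x)(l_K^*\zeta, l_K^*\eta) \\
&\quad = -(l_{K*} r_{Y*}\pi_G(u))(\zeta, \eta) + (l_{K*}\pi(x))(\zeta, \eta).
\end{aligned}$$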


Proof of Proposition 7.8. For any $\zeta, \eta \in T_z^* X$, both $(-r_Y^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ \zeta)$ and $(-r_Y^*\eta,\ J^*\epsilon^* r_Y^* l_K^*\eta - l_K^*\eta,\ \eta)$ are conormal to $\Omega$. Therefore,
$$\pi_G(g)(-r_Y^*\zeta, -r_Y^*\eta) + \pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\eta - l_K^*\eta) - \pi(z)(\zeta, \eta) = 0.$$

The conclusion follows immediately by Lemma 7.10.
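To spell out this last step, here is a sketch of the substitution. It assumes the conormality identity displayed in the proof and Lemma 7.10 as stated, with $z = gx$ and $\epsilon$ the inclusion of the unit space.

```latex
% First term: the two minus signs cancel, and r_Y^* dualizes to r_{Y*}.
\[
\pi_G(g)(-r_Y^*\zeta, -r_Y^*\eta) = (r_{Y*}\pi_G(g))(\zeta,\eta).
\]
% Second term, by Lemma 7.10:
\[
\pi(x)(J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\; J^*\epsilon^* r_Y^* l_K^*\eta - l_K^*\eta)
 = (l_{K*}\pi(x))(\zeta,\eta) - (l_{K*} r_{Y*}\pi_G(u))(\zeta,\eta).
\]
% Substituting both into the conormality identity and solving for \pi(z) = \pi(gx):
\[
\pi(gx)(\zeta,\eta) = (l_{K*}\pi(x))(\zeta,\eta) + (r_{Y*}\pi_G(g))(\zeta,\eta)
                     - (l_{K*} r_{Y*}\pi_G(u))(\zeta,\eta).
\]
```

Since $\zeta$ and $\eta$ are arbitrary, and since left translation by $K$ commutes with $r_Y$, this is Eq. (27).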

Proof of Theorem 7.1. One direction has been proved by the above series of propositions. It remains to prove the other direction. First, we note that the first condition implies that $J : X \to P$ is a Poisson map. This can be seen as follows. As a map defined on $\beta^{-1}(u)$, we have $(J \circ r_x)(r) = J(rx) = \alpha(r)$, $\forall r \in \beta^{-1}(u)$. Then, it follows that $J_* X_{J^* f}(x) = J_* (r_x)_* X_{\alpha^* f}(u) = \alpha_* X_{\alpha^* f}(u) = X_f(u)$.

Consider any point $(g, x, z) \in \Omega \subset G \times X \times \bar X$. The conormal space of $\Omega$ at this point is spanned by two types of vectors: $(-\beta^* df, J^* df, 0)$ for any $f \in C^\infty(P)$ and $(-r_Y^*\zeta,\ J^*\epsilon^* r_Y^* l_K^*\zeta - l_K^*\zeta,\ \zeta)$ for any $\zeta \in T_z^* X$. Using the same arguments as in the proofs of Propositions 7.2, 7.7 and 7.8, it can be easily checked that the evaluation of the Poisson tensor on all these vectors vanishes. This concludes the proof.

Remark. In the proof above, we only used the fact that Eq. (24) holds for one, instead of all, such bisections. Consequently, under the rest of the assumptions of Theorem 7.1, if Eq. (24) holds for any one bisection, it holds for all.

8. Poisson homogeneous spaces

The notion of homogeneous space for a groupoid action is more subtle than for groups. (This point has already been made in [1], cited in [14].) One natural candidate for such a space is $G$ acting on itself by left translations, but this action is not transitive in the usual sense, since $\beta(gx) = \beta(x)$, so that the action is transitive only on each $\beta$-fibre. Instead, we define homogeneous $G$-spaces to be those which are isomorphic to $G/H$ for some wide (i.e. containing all the identities) subgroupoid $H$ of $G$. That is, we define $G/H$ by the equivalence relation $g \sim h \iff \exists n \in H$ such that $gn = h$, with the moment map $J([g]) = \alpha(g)$ and the action $g \cdot [h] = [gh]$. The following is an intrinsic characterization of such spaces.

Definition 8.1. A $G$-space $X$ over $P$ is homogeneous if there is a section $\sigma$ of the moment map $J$ which is saturating for the action in the sense that $G \cdot \sigma(P) = X$. The isotropy subgroupoid of the section $\sigma$ consists of those $g \in G$ for which $g \cdot \sigma(P) \subset \sigma(P)$.

Proposition 8.2. A $G$-space is homogeneous if and only if it is isomorphic to $G/H$ for some wide subgroupoid $H \subset G$.

Proof. It is easy to see that a $G$-space of the form $G/H$ is homogeneous, since the image of the identity section of $G$ is a saturating section of $G/H$. On the other hand, given a homogeneous space $X$ with saturating section $\sigma$, we define a map $\theta : G \to X$ by $\theta(g) = g \cdot \sigma(\beta(g))$. $\theta$ is surjective: for every $x \in X$ we have $x = g \cdot \sigma(p)$ for some $g \in G$ and $p \in P$; then $\beta(g) = J(\sigma(p)) = p$, so $x = g \cdot \sigma(\beta(g)) = \theta(g)$. On the other hand, if $\theta(g) = \theta(h)$, i.e. $g \cdot \sigma(\beta(g)) = h \cdot \sigma(\beta(h))$, we have $g^{-1}h \cdot \sigma(\beta(h)) = \sigma(\beta(g))$. Since


$g^{-1}h$ can act on only one element of $\sigma(P)$, it follows that $g^{-1}h$ belongs to the isotropy groupoid $H$ of the section $\sigma$. Also, for $n \in H$, $\theta(gn) = gn \cdot \sigma(\beta(gn)) = g \cdot \sigma(p)$ for some $p$. But $\beta(g) = J(\sigma(p))$, so $\theta(gn) = g \cdot \sigma(\beta(g)) = \theta(g)$. It follows that $\theta$ induces a bijection (which can be checked to be $G$-equivariant) from $G/H$ to $X$.

Let $(G \rightrightarrows P, \alpha, \beta)$ be a Poisson groupoid, and $H$ a connected closed subgroupoid. Suppose that $X = G/H$ is a Poisson manifold. Write $p$ for the natural projection $G \to X$. Let $\{\cdot,\cdot\}_1$ be the difference bracket from $C^\infty(X) \otimes C^\infty(X)$ to $C^\infty(G)$ as defined by Eq. (10):
$$\{\varphi, \psi\}_1 = p^*\{\varphi, \psi\} - \{p^*\varphi, p^*\psi\}_G, \quad \forall \varphi, \psi \in C^\infty(X).$$
$X$ is a homogeneous space of the groupoid $G$. Write $J$ for its moment map $X \to P$. Then $J \circ p = \alpha$. The main theorem of this section is

Theorem 8.3. $X$ is a Poisson homogeneous space iff
1. For any bisection $K$ of $G$,
$$\{l_K^*\varphi, l_K^*\psi\}_1 = l_K^*\{\varphi, \psi\}_1, \quad \forall \varphi, \psi \in C^\infty(X),$$
i.e., $\{\cdot,\cdot\}_1$ is left invariant; and
2. for any $f \in C^\infty(P)$ and $\psi \in C^\infty(X)$, $\{J^* f, \psi\}_1 = 0$.

We split its proof into two propositions.

Proposition 8.4. The following statements are equivalent:
1. For any $f \in C^\infty(P)$, $X_{J^* f}(x) = (r_x)_* X_{\alpha^* f}(u)$, $\forall x \in X$ with $u = J(x)$.
2. For any $f \in C^\infty(P)$ and $\psi \in C^\infty(X)$, $\{J^* f, \psi\}_1 = 0$.

Proof. For any $g \in G$, let $x = p(g) = [gH] \in X$. Then, $\forall f \in C^\infty(P)$,
$$\{p^* J^* f, p^*\psi\}_G(g) = \{\alpha^* f, p^*\psi\}_G(g) = X_{\alpha^* f}(g)(p^*\psi) = [p_* X_{\alpha^* f}(g)]\psi = [p_* r_{g*} X_{\alpha^* f}(u)]\psi.$$
Now, it is clear that
$$(p \circ r_g)(\gamma) = p(\gamma g) = [\gamma g H] = r_x(\gamma), \quad \forall \gamma \in \beta^{-1}(u).$$
Hence, $p \circ r_g = r_x$ on $\beta^{-1}(u)$. Therefore, $\{p^* J^* f, p^*\psi\}_G(g) = [(r_x)_* X_{\alpha^* f}(u)]\psi$. On the other hand,


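$$p^*\{J^* f, \psi\}(g) = \{J^* f, \psi\}(pg) = \{J^* f, \psi\}(x) = X_{J^* f}(x)\psi.$$
Thus, it follows immediately that $\{J^* f, \psi\}_1 = 0$, $\forall \psi \in C^\infty(X)$, is equivalent to the equation $X_{J^* f}(x) = (r_x)_* X_{\alpha^* f}(u)$, $\forall x \in X$.

Proposition 8.5. The following statements are equivalent:
1. For any bisection $K$ of $G$,
$$\{l_K^*\varphi, l_K^*\psi\}_1 = l_K^*\{\varphi, \psi\}_1, \quad \forall \varphi, \psi \in C^\infty(X),$$
i.e., $\{\cdot,\cdot\}_1$ is left invariant;
2. For any compatible $(g, x) \in G * X$,
$$\pi(gx) = (l_K)_* \pi(x) + (r_Y)_* \pi_G(g) - (r_Y)_* (l_K)_* \pi_G(u), \tag{29}$$
where $u = \beta(g) = J(x)$, $K$ is any local bisection of $G$ through $g$, and $Y$ any local section through the point $x$.

Proof. Let $\gamma$ be any point in $G$, $p(\gamma) = [\gamma H] = x \in X$, and $J(x) = \alpha(\gamma) = u \in P$. Also, let $g = K(u) = K \cap \beta^{-1}(u) \in G$. Then, $l_K\gamma = g\gamma$ and $l_K x = g \cdot x$. Thus,
$$\begin{aligned}
p^*\{l_K^*\varphi, l_K^*\psi\}(\gamma) &= \{l_K^*\varphi, l_K^*\psi\}(x) \\
&= \pi(x)(l_K^* d\varphi, l_K^* d\psi) \\
&= l_{K*}\pi(x)(d\varphi, d\psi),
\end{aligned}$$
and
$$\begin{aligned}
\{p^* l_K^*\varphi, p^* l_K^*\psi\}_G(\gamma) &= \pi_G(\gamma)(p^* l_K^* d\varphi, p^* l_K^* d\psi) \\
&= l_{K*} p_* \pi_G(\gamma)(d\varphi, d\psi).
\end{aligned}$$
That is, with the difference bracket $\{\varphi, \psi\}_1 = p^*\{\varphi, \psi\} - \{p^*\varphi, p^*\psi\}_G$,
$$\{l_K^*\varphi, l_K^*\psi\}_1(\gamma) = l_{K*}\pi(x)(d\varphi, d\psi) - l_{K*} p_* \pi_G(\gamma)(d\varphi, d\psi).$$
On the other hand,
$$\begin{aligned}
l_K^*\{\varphi, \psi\}_1(\gamma) = \{\varphi, \psi\}_1(g\gamma) &= (p^*\{\varphi, \psi\})(g\gamma) - \{p^*\varphi, p^*\psi\}_G(g\gamma) \\
&= \{\varphi, \psi\}(gx) - \{p^*\varphi, p^*\psi\}_G(g\gamma) \\
&= \pi(gx)(d\varphi, d\psi) - p_*\pi_G(g\gamma)(d\varphi, d\psi).
\end{aligned}$$
Therefore, $\{\cdot,\cdot\}_1$ is left invariant iff $l_{K*}\pi(x) - l_{K*} p_*\pi_G(\gamma) = \pi(gx) - p_*\pi_G(g\gamma)$, or
$$p_*\pi_G(g\gamma) - l_{K*} p_*\pi_G(\gamma) = \pi(gx) - l_{K*}\pi(x). \tag{30}$$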


Since $G$ is a Poisson groupoid, according to Theorem 2.4 in [23], we have
$$\pi_G(g\gamma) = l_{K*}\pi_G(\gamma) + r_{R*}\pi_G(g) - l_{K*} r_{R*}\pi_G(u),$$
where $R$ is any local bisection of $G$ through the point $\gamma$. Hence, it follows that
$$p_*\pi_G(g\gamma) - l_{K*} p_*\pi_G(\gamma) = p_* r_{R*}\pi_G(g) - l_{K*} p_* r_{R*}\pi_G(u).$$
Here, we have used the identity $p \circ l_K = l_K \circ p$, both being considered as maps from $G$ to $X$. Let $Y = p(R) \subset X$. Then $Y$ is a local section of $J$ through the point $x$. It is simple to see that, as maps from $G$ to $X$, $p \circ r_R = r_Y$. Hence,
$$p_* r_{R*}\pi_G(g) - l_{K*} p_* r_{R*}\pi_G(u) = r_{Y*}\pi_G(g) - l_{K*} r_{Y*}\pi_G(u).$$
This shows that Eq. (30) is equivalent to Eq. (24) with $Y = p(R)$. The conclusion thus follows from Theorem 7.1 together with the remark following its proof.

A Dirac structure $L$ of $A \oplus A^*$ is called regular if $L \cap A$ is a subalgebroid of $A$ whose left translation defines a simple foliation on $G$.$^3$ Then, $L$ is regular iff $\bar L$ is reducible. Thus, combining Theorem 6.3 and Theorem 8.3, we obtain the following main theorem, which is a generalization of Drinfel'd's theorem in the groupoid context.

$^3$ In the case of groups, this is equivalent to saying that $L \cap A$ can be integrated to a connected closed subgroup of $G$. However, when $G$ is a groupoid, the condition that $L \cap A$ can be integrated to a connected closed subgroupoid seems not sufficient to get a simple foliation.

Theorem 8.6. For a Poisson groupoid $G$, there is a 1-1 correspondence between Poisson homogeneous spaces $G/H$ and regular Dirac structures $L$ of its tangent Lie bialgebroid, where $H$ is the $\alpha$-connected closed subgroupoid of $G$ corresponding to the subalgebroid $L \cap A$.

We end this section with some examples.

Example 8.7. Under the same hypothesis as in Theorem 8.6, if moreover $L$ is the graph of a hamiltonian operator $H \in \Gamma(\wedge^2 A)$, its corresponding Poisson homogeneous space is still $G$, but equipped with a different Poisson structure, $\pi_G + \Phi^* H$. Here $\Phi^* H \in \Gamma(\wedge^2 TG)$ is the pullback of $H$ under the morphism $\Phi : T^*G \to A^*$. This Poisson structure in fact defines a Poisson affinoid structure in the terminology of Weinstein [21].

Example 8.8. For a Poisson manifold $(P, \pi)$, let $(TP, T^*P, \pi)$ be its canonical Lie bialgebroid. The corresponding Poisson groupoid is the pair groupoid $G = P \times \bar P$. It is easy to see that any homogeneous $G$-space in this case is always of the form $P \times P/F$, where $F$ is a simple foliation on $P$. The groupoid $G$-action is given by
$$(x, y) \cdot (y, [z]) = (x, [z]), \quad \forall x, y, z \in P.$$
Moreover, this becomes a Poisson homogeneous $G$-space iff $P$ is equipped with the original Poisson structure $\pi$ (the Poisson structure on $P/F$ may be arbitrary). In other words, in this case, Poisson homogeneous spaces are in 1-1 correspondence with Poisson structures on quotient manifolds of $P$. Thus, Theorem 8.6 reduces to Theorem 3.3.


Example 8.9. Dually, we may switch the order and consider the Lie bialgebroid $(T^*P, TP)$ for a Poisson manifold $P$. Its corresponding Poisson groupoid $G$, if it exists, is in fact a symplectic groupoid of $P$ (see Theorem 5.3 in [17]). It is not difficult to see that a homogeneous space $X$ becomes a Poisson homogeneous space iff the moment map $J : X \to P$ is a Poisson map. Thus we obtain the following

Corollary 8.10. Suppose that $P$ is an integrable Poisson manifold with symplectic groupoid $G$. There is a one-one correspondence between reducible Dirac structures in the double $E = T^*P \oplus TP$ and homogeneous $G$-spaces $X$ equipped with a compatible Poisson structure in the sense that the moment map $J : X \to P$ is a Poisson map.

It is worth noting that when $P$ is symplectic, for a given null Dirac structure on $E = TP \oplus T^*P$, the corresponding pair of Poisson homogeneous spaces in Examples 8.8 and 8.9 correspond to a Poisson dual pair. In other words, for a symplectic manifold $P$, a null Dirac structure on $E = TP \oplus T^*P$, under a certain regularity condition, corresponds to a Poisson dual pair. It would be interesting to explore what happens for a general Dirac structure, and also in the even more general situation when $P$ is degenerate.

The last example is the following

Example 8.11. As in Example 8.9, let $P$ be an integrable Poisson manifold with symplectic groupoid $G$, and $E = T^*P \oplus TP$. Assume that the Dirac structure arises from a hamiltonian operator, which is, in this case, a two-form $\theta$ on $P$ satisfying the equation
$$d\theta + \tfrac{1}{2}[\theta, \theta] = 0.$$
Its corresponding Poisson homogeneous space, as described in Example 8.7, is $G$ equipped with the "affine" Poisson structure $\pi_G + \Phi^*\theta$, where $\Phi : T^*G \to TP$ is the $\beta$-map of the cotangent symplectic groupoid as defined by Eq. (21). In general, it is not clear whether this is still nondegenerate. However, in the extreme case that $P$ carries the zero Poisson structure, we will see that it is still symplectic. To see this, we note that $\Phi$ fits into the following commutative diagram:
$$\begin{array}{ccc} T^*G & \xrightarrow{\ \Phi\ } & TP \\ {\scriptstyle \pi_G^\#}\downarrow & & \downarrow{\scriptstyle \mathrm{id}} \\ TG & \xrightarrow{\ \beta_*\ } & TP \end{array} \tag{31}$$
This implies that
$$(\Phi^*\theta)^\# = -\pi_G^\# \circ (\beta^*\theta)^\flat \circ \pi_G^\#.$$
Therefore,
$$(\Phi^*\theta)^\# \circ \omega^\flat = -\pi_G^\# \circ (\beta^*\theta)^\flat,$$
where $\omega$ denotes the symplectic structure on $G$. On the other hand, it follows from the fact that $\mathrm{Im}(\alpha^*\theta)^\flat \subset (T^\alpha G)^\perp$ and $(T^\alpha G)^\perp \subset \ker\Phi$ that


$$(\Phi^*\theta)^\# \circ (\alpha^*\theta)^\flat = 0.$$
Thus,
$$(\pi_G^\# + (\Phi^*\theta)^\#) \circ (\omega^\flat + (\alpha^*\theta)^\flat) = \mathrm{id} + \pi_G^\# \circ ((\alpha^*\theta)^\flat - (\beta^*\theta)^\flat).$$
That is, in case that $\beta^*\theta = \alpha^*\theta$, the affine Poisson structure $\pi_G + \Phi^*\theta$ is nondegenerate and the corresponding symplectic form is $\omega + \alpha^*\theta$. Thus, when $P$ is a zero Poisson manifold, its symplectic groupoid is $T^*P$ equipped with the canonical cotangent symplectic structure. In this case, $\alpha = \beta$ and is just the natural projection from $T^*P$ to $P$. A hamiltonian operator corresponds to any closed two-form $\theta$ on $P$. The homogeneous space corresponding to its induced Dirac structure is again $T^*P$, with the non-degenerate Poisson structure coming from the sum of the canonical 2-form and the pullback of $\theta$ by the projection $T^*P \to P$.

Acknowledgement. In addition to the funding sources mentioned in the first footnote, we would like to thank several institutions for their hospitality while work on this project was being done: the Isaac Newton Institute (Weinstein, Xu); the Nankai Institute for Mathematics (Liu, Weinstein, Xu); IHES, Max-Planck Institut and Peking University (Xu). Thanks go also to Yvette Kosmann-Schwarzbach, Jiang-hua Lu, and Kirill Mackenzie for their helpful comments.

References

1. Brown, R., Danesh-Naruie, G., and Hardy, G.P.L.: Topological groupoids: II. Covering morphisms and G-spaces. Math. Nachr. 74, 143–156 (1976)
2. Coste, A., Dazord, P. and Weinstein, A.: Groupoïdes symplectiques. Publications du Département de Mathématiques de l'Université de Lyon, I, 2/A, 1–65 (1987)
3. Courant, T.J.: Dirac manifolds. Trans. A.M.S. 319, 631–661 (1990)
4. Courant, T.J., and Weinstein, A.: Beyond Poisson structures. Séminaire sud-rhodanien de géométrie VIII. Travaux en Cours 27, Paris: Hermann, 1988
5. Dorfman, I.Ya.: Dirac structures and integrability of nonlinear evolution equations. Chichester: Wiley, 1993
6. Drinfel'd, V.G.: Hamiltonian structures on Lie groups, Lie bialgebras, and the geometric meaning of the classical Yang-Baxter equations. Soviet Math. Dokl. 27, 68–71 (1983)
7. Drinfel'd, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 2, 829–860 (1991)
8. Drinfel'd, V.G.: On Poisson homogeneous spaces of Poisson-Lie groups. Theor. Math. Phys. 95, 524–525 (1993)
9. Higgins, P.J. and Mackenzie, K.C.H.: Algebraic constructions in the category of Lie algebroids. J. Algebra 129, 194–230 (1990)
10. Kosmann-Schwarzbach, Y.: Exact Gerstenhaber algebras and Lie bialgebroids. Acta Appl. Math. 41, 153–165 (1995)
11. Liu, Z.-J., Weinstein, A. and Xu, P.: Manin triples for Lie bialgebroids. J. Diff. Geom. 45, 547–574 (1997)
12. Lu, J.-H.: Lie algebroids associated with Poisson actions. Duke Math. J. 86, 261–304 (1997)
13. Lu, J.-H. and Weinstein, A.: Poisson Lie groups, dressing transformations, and the Bruhat decomposition. J. Diff. Geom. 31, 501–526 (1990)
14. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry. LMS Lecture Notes Series, 124, Cambridge: Cambridge Univ. Press, 1987
15. Mackenzie, K.: Private communication
16. Mackenzie, K.C.H. and Xu, P.: Lie bialgebroids and Poisson groupoids. Duke Math. J. (73), 415–452 (1994)


17. Mackenzie, K.C.H. and Xu, P.: Integration of Lie bialgebroids. Preprint
18. Mikami, K., and Weinstein, A.: Moments and reduction for symplectic groupoid actions. Publ. RIMS Kyoto Univ. 24, 121–140 (1988)
19. Weinstein, A.: Coisotropic calculus and Poisson groupoids. J. Math. Soc. Japan 40, 705–727 (1988)
20. Weinstein, A.: Some remarks on dressing transformations. J. Fac. Sci. Univ. Tokyo 35, 163–167 (1988)
21. Weinstein, A.: Affine Poisson structures. Intl. J. Math. 1, 343–360 (1990)
22. Weinstein, A. and Xu, P.: Classical solutions of the quantum Yang-Baxter equations. Commun. Math. Phys. 148, 309–343 (1992)
23. Xu, P.: On Poisson groupoids. Intl. J. Math. 6, No. 1, 101–124 (1995)

Communicated by T. Miwa

Commun. Math. Phys. 192, 145 – 168 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

A KAM Theorem for Hyperbolic-Type Degenerate Lower Dimensional Tori in Hamiltonian Systems*

Jiangong You

Department of Mathematics, Nanjing University, Nanjing 210093, P.R. China

Received: 23 October 1996 / Accepted: 24 June 1997

Abstract: A KAM theorem for degenerate lower dimensional tori in nearly integrable Hamiltonian systems is given in this paper. For the non-degenerate cases, both hyperbolic and elliptic, the KAM theorem has been well established by many authors ([8, 9, 11, 13, 14, 17]).

* The work was supported by AvH Foundation in Germany.

1. Introduction and Result

Consider a real analytic Hamiltonian

$$H(x, y, u, v) = h_0(y, u, v) + P(x, y, u, v) \tag{1.1}$$

in a complex neighbourhood in $\mathbb{C}^{2n} \times \mathbb{C}^{2m}$, with the symplectic structure $\sum_{i=1}^{n} dx_i \wedge dy_i + \sum_{i=1}^{m} du_i \wedge dv_i$, of a $2n$-dimensional real domain $u = v = 0$, $(x, y) \in T^n \times D \subset T^n \times \mathbb{R}^n$, where $D$ is an open set of $\mathbb{R}^n$. Denote $z = (u_1, \dots, u_m, v_1, \dots, v_m) \in \mathbb{R}^{2m}$ for simplicity.

If $\frac{\partial h_0}{\partial z}(y, 0) = 0$, the unperturbed Hamiltonian system defined by $h_0$ possesses a $2n$-dimensional invariant subspace $u = v = 0$ foliated by a family of invariant tori $y = y_0$, $u = v = 0$, and the flow on each torus is given by $x(t) = x_0 + \frac{\partial h_0}{\partial y}(y_0, 0)\, t$. If $\det \frac{\partial^2 h_0}{\partial y^2}(y, 0) \neq 0$, i.e., if $h_0$ is non-degenerate, the frequencies $(\omega_1, \dots, \omega_n) = \frac{\partial h_0}{\partial y}(y, 0)$ can be regarded as parameters and one can equivalently consider perturbations of a family of linear integrable Hamiltonians, parameterised by the frequencies $\omega = \frac{\partial h_0}{\partial y}(y, 0) \in O \subset \mathbb{R}^n$ with positive Lebesgue measure,

$$H = N + P = \langle \omega, y \rangle + \frac{1}{2} \langle A(\omega) z, z \rangle + P, \tag{1.2}$$

in $(x, y, z) \in T^n \times \mathbb{R}^n \times \mathbb{R}^{2m}$, where $A(\omega) = \frac{\partial^2 h_0}{\partial z^2}\big( (\frac{\partial h_0}{\partial y})^{-1}(\omega), 0 \big)$ is a $2m \times 2m$ matrix. This setting has been frequently used by many authors. In the following we will treat $\omega$ as independent parameters varying over a positive measure set $O$.

The persistence of invariant tori for the perturbed Hamiltonians has been extensively studied in the case that $A$ is non-degenerate, i.e., $\det \frac{\partial^2 h_0}{\partial z^2}(y, 0) \neq 0$. If no eigenvalue of $A$ lies on the imaginary axis, the torus is called hyperbolic. For this case, if $\omega = (\omega_1, \dots, \omega_n) \in O$ satisfies the Diophantine condition

$$|\langle k, \omega \rangle| > \gamma |k|^{-\tau}, \tag{1.3}$$

for $\tau > n + 1$, $0 \neq k \in \mathbb{Z}^n$, Moser [14], Graff [8] and Zehnder [26] proved that there is an $\omega^*$ such that (1.2) at $\omega^*$ possesses an invariant torus with prescribed inner frequencies $\omega$ if the perturbation is sufficiently small. Back to the original system, it follows that (1.1) has a Cantor family of invariant tori if the perturbation is smooth and small.

If all the eigenvalues of $A(\omega)$ belong to $i\mathbb{R}^1 \setminus \{0\}$, the torus is called elliptic. More precisely, for the following system

$$H = N + P = \sum_{i=1}^{n} \omega_i y_i + \frac{1}{2} \sum_{j=1}^{m} \Omega_j(\omega)(u_j^2 + v_j^2) + P, \tag{1.4}$$

Melnikov ([13]) in 1967 announced that, for a positive Lebesgue measure subset $O_\gamma \subset O$, (1.4) with $\omega \in O_\gamma$ possesses an $n$-dimensional invariant torus with frequencies satisfying the non-resonant conditions

$$|\langle k, \tilde\omega \rangle + \langle l, \tilde\Omega \rangle| > \gamma |k|^{-\tau}, \quad |l| \le 2, \tag{1.5}$$

for $k \in \mathbb{Z}^n$, $l \in \mathbb{Z}^m$, $|k| + |l| \neq 0$, $\tau > n + 1$, where $(\tilde\omega, \tilde\Omega) = (\tilde\omega_1, \dots, \tilde\omega_n, \tilde\Omega_1, \dots, \tilde\Omega_m)$, provided the perturbation is sufficiently small. The complete proof was carried out fifteen years later by Eliasson, Kuksin and Pöschel ([9, 11, 17]). In this case, only the measure estimate is available; one cannot tell whether (1.4) has a torus with prescribed frequencies. More recently, developing Craig and Wayne's method [7], Bourgain [3] proved the existence of quasi-periodic solutions for various Hamiltonian PDEs under the first Melnikov non-resonance condition, i.e., (1.5) holds for $|l| \le 1$. The result was applied to some kinds of PDEs with periodic boundary conditions. His proof is based on the Liapounov-Schmidt reduction introduced in [7], and on some sophisticated estimates controlling the inverses of matrices with singular sites.

The remaining question is what happens when $A$ has zero eigenvalues. In this paper, we consider the simplest case of this degenerate problem: $m = 1$, the spectrum of $A$ vanishes and $\omega$ satisfies (1.3). In this case the invariant torus $T^n$ of the unperturbed system is resonant, and it may be eliminated by some special perturbations, for example $P = \varepsilon(u^3 + u)$, if no further assumption is made. Thus if we do not want to impose any restriction on the perturbation beyond smallness and smoothness, the higher order terms of the unperturbed integrable system have to be taken into account. The simplest degenerate case is the nilpotent case (see Takens [22]), i.e.,

$$H = \langle \omega, y \rangle + \frac{1}{2} v^2 + f(u, v) + P(x, y, u, v, \omega), \tag{1.6}$$

in $T^n \times \mathbb{R}^n \times \mathbb{R}^2$. If $T^n \times \{0\}$ is an isolated torus of the unperturbed system (1.6) with $P = 0$, it follows that $f(u, v) = o(|u|^2 + |v|^2) \neq 0$. By normal form theory, in this case, we can assume that $f(u, v) = u^3 G(u)$, where $G(u) \neq 0$ is a polynomial in $u$. The case with leading term $G(u) = c u^{2d-4} \neq 0$ $(d = 2, 3, \dots)$ is not of interest in this paper, because in this case the torus can be eliminated by a simple perturbation, $P = cu$. Thus we consider the case with leading term $G(u) = c u^{2d-3}$ $(d \ge 2)$. Concerning the unperturbed system, there are two different types of $n$-dimensional invariant tori:

1. $c < 0$, which we call a hyperbolic-type degenerate torus. In this case the origin in the $(u, v)$ plane is a saddle-type degenerate singular point. This is the main subject of this paper.
2. $c > 0$, which we call an elliptic-type degenerate torus. The origin in the $(u, v)$ plane is then a degenerate centre. This case is more complicated, and no persistence result is available so far. We shall comment further on this case to show where the difficulty lies.

In this paper, we consider the persistence of a hyperbolic-type degenerate lower dimensional torus for a real analytic, perturbed integrable Hamiltonian system depending continuously on the parameter $\omega$ in an open set $O \subset \mathbb{R}^n$. Without loss of generality, we normalise $c$ to $-1$ and consider the Hamiltonian of the following form:

$$H = \langle \omega, y \rangle + \frac{1}{2} v^2 - u^{2d} + P(x, y, u, v), \quad d \ge 2, \tag{1.7}$$

in $(x, y, u, v)$-space $T^n \times \mathbb{R}^n \times \mathbb{R}^2$ with the standard symplectic structure $dx \wedge dy + du \wedge dv$. The goal of this paper is to prove that, for any $\omega_0$ satisfying (1.3), there is an $\omega^*$ such that (1.7) at $\omega^*$ possesses an $n$-dimensional invariant torus carrying rotational flow of frequencies $\omega_0$, provided that the perturbation $P$ is analytic and small enough. The obtained torus might still be degenerate, but it is saddle-like in the $(u, v)$ plane.

From now on, we fix an $\omega_0$ satisfying (1.3) and consider the complex extension of the real Hamiltonian (1.7) on the complex neighbourhood

$$D(r, s, s_u) = \{(x, y, u, v) \mid |\mathrm{Im}\, x| < r,\ |y| < s^{2d},\ |v| < s^d,\ |u| < s_u\},$$

with parameter $\omega$ in $B(\omega_0, \delta) = \{\omega \in \mathbb{R}^n : |\omega - \omega_0| \le \delta\}$, where $\mathrm{Im}\, x$ is the imaginary part of $x$, $|\cdot|$ is the sup-norm for complex vectors, and $s \ge s_u > 0$. The norm of $P$ on $D(r, s, s_u) \times B$ is defined as

$$\|P\|_{D \times B} = \sup_{D \times B} |P|.$$
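The Diophantine condition (1.3) imposed on $\omega_0$ can be probed numerically on a finite range of $k$. The following is only a sketch under stated assumptions: the function name `diophantine_margin`, the sample frequencies, the values of $\gamma$, $\tau$ and the cut-off $K$ are all hypothetical, and $|k|$ is taken to be $|k_1| + \dots + |k_n|$. A finite check can refute (1.3) but can never prove it for all $k$.

```python
from itertools import product


def diophantine_margin(omega, gamma, tau, K):
    """Smallest value of |<k, omega>| - gamma * |k|**(-tau) over 0 < |k| <= K.

    Here |k| = |k_1| + ... + |k_n| (an assumed choice of norm).
    A positive return value means (1.3) holds for every tested k.
    """
    n = len(omega)
    worst = float("inf")
    for k in product(range(-K, K + 1), repeat=n):
        size = sum(abs(ki) for ki in k)
        if size == 0 or size > K:
            continue
        small_divisor = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        worst = min(worst, small_divisor - gamma * size ** (-tau))
    return worst


# Hypothetical sample frequencies for n = 2, so tau > n + 1 = 3.
golden = (1.0, (5 ** 0.5 - 1) / 2)   # badly approximable frequency ratio
resonant = (1.0, 0.5)                # <k, omega> = 0 for k = (1, -2)

golden_ok = diophantine_margin(golden, gamma=0.1, tau=3.5, K=30) > 0
resonant_ok = diophantine_margin(resonant, gamma=0.1, tau=3.5, K=30) > 0
```

With these sample values the golden-mean vector passes the truncated test, while the resonant vector fails it at $k = (1, -2)$, where the small divisor vanishes.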

Theorem 1.1. Suppose that the Hamiltonian (1.7) is analytic in $(x, y, u, v) \in D$ and continuous in $\omega \in B$. Moreover,

$$\frac{1}{s^{2d}} \|P\|_{D(r,s,s) \times B} \le \varepsilon_0 \le \frac{1}{2}\delta,$$

where $\varepsilon_0$ is a small constant depending on $n, d, r, \gamma, \tau$. Then there is an $\omega^* \in B$ with $|\omega^* - \omega_0| < 2\varepsilon_0$ such that $H$ at $\omega^*$ possesses an $n$-dimensional invariant torus carrying rotational flow with internal frequencies $\omega_0$.

As a consequence of Theorem 1.1, we have the following

Theorem 1.2. Suppose that the analytic Hamiltonian

$$H = h(y) + \frac{1}{2} v^2 - u^{2d} + P(x, y, u, v), \tag{1.8}$$

is defined on $\{|\mathrm{Im}\, x| < r\} \times \{y \in D \subset \mathbb{C}^n\} \times \{|u|, |v| < s\}$. Then there is an $\varepsilon^* = \varepsilon^*(n, d, D, s, r, \gamma, \tau)$ such that if $\varepsilon \le \varepsilon^*$, $\frac{\partial h}{\partial y}(y_0)$ satisfies (1.3) and $\mathrm{dist}\{y_0, D\} \ge M\varepsilon$ with $M = \sup_D |\frac{\partial h}{\partial y}|$, Hamiltonian (1.8) has an invariant torus carrying rotational flow of frequencies $\frac{\partial h}{\partial y}(y_0)$.

Remark. For general perturbations, the degenerate torus $T^n \times \{y = u = v = 0\}$ of the unperturbed system will break into $2d - 1$ non-degenerate tori. But we cannot expect more than one if we impose no restriction on the perturbation besides smoothness and smallness. Moreover, the normal behaviour of the obtained torus cannot be predetermined; it depends on the perturbation.

Remark. The analyticity is not necessary but it considerably simplifies the proof.

Example. The theorem applies to the Duffing equation

$$\frac{d^2 x}{dt^2} - x^3 = p(\omega_1 t, \dots, \omega_n t, x), \tag{1.9}$$

for proving the existence of quasi-periodic solutions with the frequencies $(\omega_1, \dots, \omega_n)$. Similar results have been obtained in the pioneering work of Moser ([15]) for the equation with a linear term $a(\varepsilon)x$, where $a(\varepsilon) \approx 1$, depending on the perturbation, is an artificial parameter.

We also mention that, in a different setting, C. Cheng ([5]) considered the perturbation of resonant tori in Hamiltonian systems. His result is a partial generalisation of the Poincaré-Birkhoff fixed point theorem to the invariant torus case. As far as the author knows, that is the first result dealing with the resonant problem without adding any restriction on the perturbation.

2. Outline of the Proof

The theorem will be proved by KAM iteration, which involves an infinite sequence of coordinate transformations. The procedure is more complicated in the normally resonant case. In the following, we outline the proof. Let $B_n = \{\omega : |\omega - \omega_0| \le \gamma / K_n^{\tau+1}\}$. As we will see, at each step of the KAM scheme, a family of Hamiltonians $H_n = N_n + P_n$ defined in $D(r_n, s_n, s_{un})$ with parameter $\omega \in B_n$ is considered near an $n$-dimensional torus $\{y = 0, u = v = 0\}$, where

$$N_n = \langle \omega_0, y_n \rangle + \frac{1}{2} v_n^2 + f_n(u_n),$$

with $f_n(u_n) = \sum_{i=2}^{i_n^*} a_i^n u_n^i$, $2 \le i_n^* \le 2d$, and where $a_i^n < 2$ are small constants coming from the perturbation. The Hamiltonians satisfy

$$\frac{1}{s_n^{2d}} \|P_n\|_{D(r_n, s_n, s_{un})} \le \varepsilon_n,$$
$$-3 s_n^{2d} \le f_n(u_n) \le \varepsilon_n s_n^{2d} \ \text{ for } u \in [-s_{1n}, s_{1n}], \qquad -3 s_n^{2d} \le f_n(s_{1n}),\ f_n(s_{2n}) \le -s_n^{2d}, \tag{2.1}$$

for some $s_{1n}, s_{2n}$ satisfying $-s_{un} \le s_{2n} < 0 < s_{1n} \le s_{un}$. We shall prove that there is a symplectic change of variables,

$$\Psi_n : (x_{n+1}, y_{n+1}, u_{n+1}, v_{n+1}) \to (x_n, y_n, u_n, v_n),$$

defined in a smaller domain $D_{n+1}$ with parameter in $B_{n+1}$, such that $H_n \circ \Psi_n = N_{n+1} + P_{n+1}$ in $D_{n+1} \times B_{n+1}$ satisfies $|P_{n+1}|_{D_{n+1} \times B_{n+1}} \le |P_n|^{\kappa}_{D_n \times B_n}$, $\kappa > 1$. Moreover, there is a continuous map $\phi_n^{-1} : \omega_{n+1} \to \omega_n$ which maps $B_{n+1}$ into $B_n$.

In what follows the Hamiltonian without subscript denotes the Hamiltonian in the $n$th step, while those with subscript $+$ denote the Hamiltonian of the $(n+1)$th step. We outline one step of the KAM iteration.

We first truncate the perturbation and defer its higher order terms to the next iteration step, since they can be made smaller by shrinking the domain of definition. More precisely, rewrite $H$ as

$$H = N + R + (P - R),$$

where $R$ is a higher order truncation of $P$ such that $\|P - R\|$ is less than $\|P\|_D^{\kappa}$, $\kappa > 1$, in a smaller domain (see the next section for details). Then we find a symplectic coordinate transformation to kill as many terms in $R$ as possible. The transformation is generated by a Hamiltonian function $F$ defined in a smaller domain $D(r - \rho, \frac{1}{8}s, \frac{1}{8}s_u)$. Let $X_F$ be the vector field with Hamiltonian $F$. Denote by $X_F^t$ the flow of $X_F$ and by $\Phi = X_F^{t=1}$ the time-1 map of the flow. By a Taylor series, we have (see [12, 17])

$$H \circ \Phi = (N + R) \circ \Phi + (P - R) \circ \Phi = (N + R) \circ X_F^{t=1} + (P - R) \circ X_F^{t=1}$$
$$= N + R + \{N, F\} + \int_0^1 (1 - t)\{\{N + R, F\}, F\} \circ X_F^t \, dt + \{R, F\} + (P - R) \circ \Phi \tag{2.2}$$
$$= N + R + \{N, F\} + \bar P = N + R + \{N_2, F\} + \{N_h, F\} + \bar P = \bar N + \{N_h, F\} + \bar P, \tag{2.3}$$

where

$$N_2 = \langle \omega, y \rangle + \frac{1}{2} v^2 + a_2 u^2, \qquad N_h = \sum_{i=3}^{i^*} a_i u^i, \quad i^* \le 2d,$$

$$\bar P = \int_0^1 (1 - t)\{\{N + R, F\}, F\} \circ X_F^t \, dt + \{R, F\} + (P - R) \circ \Phi.$$

The philosophy of the KAM method ([1, 2, 10, 14]) is to find a special $F$ defined in a shrunk domain which makes the new perturbation $\bar P$ in (2.2) much smaller and $N + R + \{N, F\}$ a new normal form. In the non-degenerate case, i.e., (1.5) is satisfied,

we need not put the higher order terms of $u$ into the normal form, i.e., $N_h = 0$, and $F$ is obtained by solving a linear partial differential equation

$$N + R + \{N, F\} = N_+, \tag{2.4}$$

with an $N_+$ similar to $N$, where

$$\{N, F\} = \frac{\partial F}{\partial x}\frac{\partial N}{\partial y} + \frac{\partial F}{\partial v}\frac{\partial N}{\partial u} - \frac{\partial F}{\partial u}\frac{\partial N}{\partial v}.$$

In the non-degenerate case, $N_+$ can be put into a normal form simultaneously. In the degenerate case, $N$ contains the higher order terms of $u$, and Eq. (2.4) is no longer a linear partial differential equation. We cannot solve it completely as in the non-degenerate case. Note that the purpose of solving (2.4) is to find a function $F$ so that (2.3) becomes a new normal form with a smaller perturbation. To this end, we set up the KAM step as follows.

Firstly, instead of solving (2.4), we solve

$$\{N_2, F\} + R - \int_0^{2\pi} R \, dx = 0, \tag{2.5}$$

and treat $\{N_h, F\}$ as a part of the new perturbation. The trouble is that $\bar N = N + \int R \, dx$ is no longer a normal form (see (3.16) for the precise formulation). In particular, it contains a linear term in $u$, and thus $T^n \times \{0\}$ might no longer be an invariant torus of $\bar N$. However, due to the existence of higher order terms, there does exist a hyperbolic-type torus in the effective domain, corresponding to a hyperbolic equilibrium in the $(u, v)$ plane.

Secondly, we find a linear coordinate transformation which moves the origin of the $(u, v)$ plane to the hyperbolic equilibrium and makes $\bar N$ a normal form, say $\tilde N$ (containing no linear terms). As a matter of fact, the linear transformation depends only continuously, not smoothly, on the parameters, even if the original Hamiltonians depend smoothly on them. This observation means that the method does not apply to the elliptic degenerate case, which requires measure estimates. For the hyperbolic case, however, the measure estimate is not necessary since we can fix the frequency vector at each KAM step.

One cycle of the iteration is complete once we prove that the new perturbation is smaller. This can be done by shrinking the domain of definition. More precisely, we have to shrink the domain so that $\{N_h, F\}$ and $P - R$ are smaller. Certainly, this is possible if the domain is sufficiently small. But it is meaningless unless we keep track of the position of the torus of $\bar N$: if all tori of $\bar N$ lie outside the shrunk domain, we lose the object at the next iteration step. Thus we have to balance these two conflicting requirements and find a domain which contains a hyperbolic torus of $\bar N$ and in which, at the same time, $\{N_h, F\}$ and $P - R$ are smaller.

Obviously, the frequencies of the invariant torus of $\tilde N$ differ slightly from those of $N$. After a KAM step, the frequencies of $\tilde N$ are $\omega_+ = \phi(\omega) = \omega + P_{0100}(\omega)$ for the Hamiltonian $H$ at $\omega$. However, since $P_{0100}(\omega)$ is continuous and small, $\phi(B)$ still contains a smaller ball $B_+$ centred at $\omega_0$ and the new perturbation is uniformly smaller in $B_+$, so that we can do the next iteration.

Besides, there is an additional technical difficulty. We have to discard some higher order terms in $\tilde N$ at each iteration step, since the coefficients of those terms might be big enough to destroy the convergence of the normal form series.
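The second part of the KAM step, moving the origin to the hyperbolic equilibrium, can be illustrated on a toy potential. The sketch below is an assumption-laden illustration, not the construction of Sect. 3: it takes $d = 2$ and the hypothetical model $f_1(u) = bu - u^4$ with a small linear coefficient $b$, locates the maximum $u_m$ of $f_1$ by bisection, and re-expands $f_1$ around $u_m$. The rescaling by $(1 + P_{0002})^{1/2}$ used later in the paper is omitted, and all function names are illustrative.

```python
from math import comb


def shift_poly(coeffs, um):
    """Coefficients of f(u + um), where coeffs[i] multiplies u**i."""
    out = [0.0] * len(coeffs)
    for i, a in enumerate(coeffs):
        for j in range(i + 1):
            out[j] += a * comb(i, j) * um ** (i - j)
    return out


def derivative(coeffs):
    """Coefficients of f'(u)."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def evaluate(coeffs, u):
    return sum(a * u ** i for i, a in enumerate(coeffs))


def critical_point(coeffs, lo, hi, iters=200):
    """Bisection for a zero of f' on [lo, hi]; assumes f' changes sign there."""
    d = derivative(coeffs)
    f_lo = evaluate(d, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = evaluate(d, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b = 1e-3                                # hypothetical small linear coefficient
f1 = [0.0, b, 0.0, 0.0, -1.0]           # f1(u) = b*u - u**4  (d = 2)
um = critical_point(f1, -0.5, 0.5)      # maximum of f1, here (b/4)**(1/3)
shifted = shift_poly(f1, um)            # expansion of f1 around um
```

After the shift, `shifted[1]` vanishes up to rounding and `shifted[2]` is negative, so for $H = \frac{1}{2}v^2 + f_1(u)$ the point $(u_m, 0)$ is a non-degenerate saddle in the $(u, v)$ plane, even though the unperturbed origin was degenerate.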

Iterating the above step, we get a family of symplectic changes of variables $\Psi_n$ and a family of Hamiltonians $H_n$ defined in nested domains $B_n \subset B$ such that

$$(D\Psi_n)^{*} X_H \circ \Psi_n = X_{N_n} + X_{P_n}.$$

By passing to the limit, it follows that $H$ at $\omega^* \in \cap B_n$ is conjugated to an integrable Hamiltonian system $N_\infty$ which has an $n$-torus carrying rotational flow of frequencies $\omega_0$. That means $H$ at $\omega^* \in \cap B_n$ has an invariant torus of frequencies $\omega_0$. We refer to Sect. 4 for more details. In the next section we describe one KAM step in detail. Throughout this paper, we denote by $c$ constants which depend only on $n, d, \gamma, \delta, \tau$.

3. KAM Step

In what follows, the Hamiltonian without subscript denotes the Hamiltonian in the $\nu$th step, while those with subscript $+$ denote the Hamiltonian of the $(\nu + 1)$th step. We consider one step of the KAM iteration in full detail. Throughout this section, we consider a family of Hamiltonians

$$H = N + P, \tag{3.1}$$

defined in $D(r, s, s_u)$, $\omega \in B(\omega_0, \frac{1}{2}\gamma K^{-\tau-1})$ with $\frac{1}{2}s^d \le s_u \le s$, where

$$N = N_2 + N_h = \langle \omega, y \rangle + \frac{1}{2} v^2 + a_2 u^2 + \sum_{i=3}^{i^*} a_i u^i, \quad 3 \le i^* \le 2d.$$

Since $\omega_0$ satisfies (1.3), we have that

$$|\langle k, \omega \rangle| > \frac{1}{2}\gamma |k|^{-\tau} \tag{3.2}$$

holds for $0 \neq |k| \le K$ and all $\omega \in B$, where $K$ is the minimum integer satisfying $2K^n e^{-K\rho} \le \varepsilon$ for a given $\rho$.

Remark. In this section, for given $s, \varepsilon$, we denote

$$\alpha = \varepsilon^{\frac{8d}{16d^2 + 2d - 1}}, \quad s_+ = \alpha s, \quad \varepsilon_+ = \varepsilon^{\kappa}, \quad \kappa = 1 + \frac{1}{4d + 1}. \tag{3.3}$$
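To get a feeling for the sizes involved in (3.2) and (3.3), the quantities can be evaluated for sample values. This is a sketch under two assumptions. First, it reads (3.3) as $\alpha = \varepsilon^{8d/(16d^2+2d-1)}$, $s_+ = \alpha s$, $\varepsilon_+ = \varepsilon^{\kappa}$, $\kappa = 1 + 1/(4d+1)$. Second, it interprets "the minimum integer $K$" as the smallest $K \ge n/\rho$, i.e., beyond the maximum of $K^n e^{-K\rho}$, so that the inequality also holds for every larger $K$. The function names and the sample values are hypothetical.

```python
import math


def truncation_order(eps, rho, n):
    """Smallest integer K >= n / rho with 2 * K**n * exp(-K * rho) <= eps."""
    K = max(1, math.ceil(n / rho))
    while 2 * K ** n * math.exp(-K * rho) > eps:
        K += 1
    return K


def step_scales(eps, s, d):
    """Return (alpha, s_plus, eps_plus, kappa) as read from (3.3)."""
    alpha = eps ** (8 * d / (16 * d * d + 2 * d - 1))
    kappa = 1 + 1 / (4 * d + 1)
    return alpha, alpha * s, eps ** kappa, kappa


# Hypothetical sample values.
eps, rho, n, d, s = 1e-6, 0.1, 2, 2, 0.2
K = truncation_order(eps, rho, n)
alpha, s_plus, eps_plus, kappa = step_scales(eps, s, d)

# With this alpha, alpha**(2d + (2d - 1)/(8d)) equals eps exactly; this is the
# power of alpha that appears at the end of the estimate (3.15).
identity_exponent = 2 * d + (2 * d - 1) / (8 * d)
```

The choice of the exponent of $\alpha$ is thus tied to the estimate of $P - R$: raising $\alpha$ to the power $2d + \frac{2d-1}{8d}$ gives back one full factor of $\varepsilon$.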

Since a constant factor $c$ independent of the iteration step is irrelevant, in the following "$< c$" is abbreviated to "$<\!\cdot$". We assume that $H$ satisfies

$$\frac{1}{s^{2d}} \|P\|_{D \times B} = \varepsilon, \tag{3.4}$$

and

$$\frac{\varepsilon s^{2d}}{s_u^{i}} <\!\cdot\ \alpha^{\frac{1}{4d}}, \quad \text{for } i = 2, \dots, i^*. \tag{3.5}$$

Moreover, there are $s_1, s_2$ with $-s_u \le s_2 < 0 < s_1 \le s_u$, such that

$$-3s^{2d} \le f(u) \le \varepsilon s^{2d} \ \text{ for } u \in [s_2, s_1], \qquad -3s^{2d} \le f(s_1),\ f(s_2) \le -s^{2d}, \tag{3.6}$$

$$a_2 \le 0, \tag{3.7}$$

$$\max_{|u| \le s_u} |N_h(u)| <\!\cdot\ s^{2d} \alpha^{-\frac{1}{d}}, \tag{3.8}$$

$$|a_2 s_u| <\!\cdot\ s^d, \tag{3.9}$$

where $f(u) = a_2 u^2 + N_h(u)$. It follows from (3.6) that

$$s_u \ge \frac{1}{2} s^d, \tag{3.10}$$

if $s_u \le s \le \frac{1}{4}$.

The purpose of this section is to find a change of variables defined in a smaller domain $D_+ \times B_+(\omega_0)$, such that the transformed Hamiltonian $H_+ = N_+ + P_+$ satisfies all the above iteration assumptions with smaller $s_+, s_{u+}, \alpha_+, c_+$ (see (3.43), (3.51), (3.37), (3.39), (3.31), (3.40), (3.41) below). In order to help the reader understand the complicated iteration assumptions, we add the following remarks:

1. Assumption (3.5) is used to prove the convergence of the normal form series.
2. Assumption (3.6) guarantees that the critical points of $f$ in $[-s_u, s_u]$ will not be completely moved out by the perturbation.
3. Assumption (3.7) means that the origin is a saddle-like singular point of the unperturbed system in the $(u, v)$ plane, which might be degenerate, i.e., $a_2 = 0$.
4. Assumption (3.8) is crucial for controlling the size of $\{N_h, F\}$, which is treated as a part of the new perturbation.
5. Assumption (3.9) is used to control the size of $F$, and thus the size of the generated symplectic transformation.

Truncating Perturbations. Expanding $P$ into a Fourier-Taylor series,

$$P(x, y, u, v) = \sum P_{klpq}\, e^{i\langle k, x \rangle} y^l u^p v^q,$$

where the sum is taken over $k = (k_1, \dots, k_n) \in \mathbb{Z}^n$, $l = (l_1, \dots, l_n) \in \mathbb{Z}_+^n$, $p, q \in \mathbb{Z}_+$. Let

$$R = \sum_{|k| \le K,\ 2|l| + q + 2\frac{p}{i^*} \le 2} P_{klpq}\, e^{i\langle k, x \rangle} y^l u^p v^q \tag{3.11}$$

be a truncation of $P$, where $K$ is the minimum integer satisfying $2K^n e^{-K\rho} \le \varepsilon$, and $|l| = \sum_{i=1}^{n} l_i$. Note that if $2|l| + q + 2\frac{p}{i^*} > 2$, then $2|l| + q + 2\frac{p}{i^*} \ge 2 + \frac{1}{i^*} \ge 2 + \frac{1}{2d}$, since all the quantities are integers. The following estimates come from the Cauchy estimates.

Lemma 3.1.

$$\|R\|_{D(r - \rho, \frac{1}{2}s, \frac{1}{2}s_u) \times B} \le c(\rho) \|P\|_{D \times B} = c(\rho)\, \varepsilon s^{2d}. \tag{3.12}$$

Moreover, in a smaller domain, D1 = D(r − ρ, αs, su+ ), we have ||P − R||D1 < · (K n e−Kρ + α2d+ 2i∗ )||P ||D < · s2d 2 , d

if α =

8d 16d2 +2d−1

, ssu+ <α u

2d i∗

1 − 4d

.

(3.13)

Proof. Equation (3.12) directly follows from the Cauchy inequality. Now we prove (3.13). Note that X X Pklpq ei(k,x) y l up v q + Pklpq ei(k,x) y l up v q . P −R = |k|≥K

|k|≤K,2|l|+q+2 ip∗ >2

Equation (3.13) follows from the following two estimates: X X Pklpq ei(k,x) y l up v q |D1 ≤ |P |e|k|r e|k|(r−ρ) | |k|≥K

|k|≥K

≤ s2d

X

κn e−κρ ≤ s2d 2

(3.14)

κ≥K

and

X

||

Pklpq ei(k,x) y l up v q ||D1

|k|≤K,2|l|+q+2 ip∗

≤

X

Z

||

0 2<2|l0 |+q 0 +2 ip∗

< · α(2|l|+q)d (

>2

≤3

su+ p ) | su

0

0

X

0

∂ |l |+q +p ( ∂y l0 ∂up0 ∂v q0 X

Pklpq ei(k,x) y l up v q )||D1

|k|≤K,2|l|+q+2 ip∗

>2

Pklpq ei(k,x) y l up v q |

|k|≤K,2|l|+q+2 ip∗ >2

2d+ 2d−1 8d

< · s2d 2 , Z y Z y1 Z u Z uZ v Z v R ··· ··· ··· . in D1 , where stands for 0 0 0 0 0 0 | {z } | {z } | {z } < · s2d α

p0

l0

(3.15)

q0

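Solving the linear equation. Let

$$\bar N = N + \int_0^{2\pi} R \, dx = N + P_{0000} + \sum_{1 \le j \le i^*} P_{00j0} u^j + \sum_{1 \le j \le \frac{i^*}{2}} P_{00j1} u^j v + P_{0002} v^2 + \langle P_{0100}, y \rangle. \tag{3.16}$$

In the following, the sum "$\sum$" is always taken over

$$\{(k, l, p, q) \mid 0 \neq |k| \le K,\ 2|l| + q + 2\frac{p}{i^*} \le 2\}$$

if there is no further claim. We denote $\Delta = \sqrt{-1}\,\langle k, \omega \rangle$ for simplicity. Let

$$F = \sum f_{klpq}\, e^{i\langle k, x \rangle} y^l u^p v^q$$

be the solution of the following partial differential equation:

$$\{N_2, F\} + R = \sum \Delta f_{klpq}\, e^{i\langle k, x \rangle} y^l u^p v^q + \frac{\partial F}{\partial v}\frac{\partial N_2}{\partial u} - \frac{\partial F}{\partial u}\frac{\partial N_2}{\partial v} + R = \bar N, \tag{3.17}$$

where

$$\frac{\partial F}{\partial v}\frac{\partial N_2}{\partial u} - \frac{\partial F}{\partial u}\frac{\partial N_2}{\partial v} = 2a_2 u \sum q f_{klpq}\, e^{i\langle k, x \rangle} y^l u^p v^{q-1} - v \sum p f_{klpq}\, e^{i\langle k, x \rangle} y^l u^{p-1} v^q.$$

That is,

$$f_{kl00} = -\Delta^{-1} P_{kl00},$$

$$f_{kl20} = (\Delta^3 + 8a_2\Delta)^{-1} \det \begin{pmatrix} -P_{kl20} & 2a_2 & 0 \\ -P_{kl11} & \Delta & 4a_2 \\ -P_{kl02} & -1 & \Delta \end{pmatrix},$$

$$f_{kl11} = (\Delta^3 + 8a_2\Delta)^{-1} \det \begin{pmatrix} \Delta & -P_{kl20} & 0 \\ -2 & -P_{kl11} & 4a_2 \\ 0 & -P_{kl02} & \Delta \end{pmatrix},$$

$$f_{kl02} = (\Delta^3 + 8a_2\Delta)^{-1} \det \begin{pmatrix} \Delta & 2a_2 & -P_{kl20} \\ -2 & \Delta & -P_{kl11} \\ 0 & -1 & -P_{kl02} \end{pmatrix}.$$

For $0 < p \neq 2$,

$$f_{klp0} = -(\Delta^2 + 2pa_2)^{-1}(\Delta P_{klp0} - 2a_2 P_{kl,p-1,1}), \qquad f_{kl,p-1,1} = -(\Delta^2 + 2pa_2)^{-1}(\Delta P_{kl,p-1,1} + p P_{klp0}).$$

Remark. Since we do not know the size of $a_2$, which comes from the perturbation, we have to keep all the terms with $k = 0$. This makes $\bar N$ a bit complicated. Later, we have to employ further transformations to put $\bar N$ into normal form.

We first give estimates for $F$. Let $D_i = D(r - 8\rho + i\rho, \frac{1}{i}s, \frac{1}{i}s_u) \subset D_1 = D$, $0 < i \le 4$, and

$$\Gamma(\rho) = \sup_{t \ge 0} t^{3\tau} e^{-\rho t}.$$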
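The small-divisor factor $\Gamma(\rho) = \sup_{t \ge 0} t^{3\tau} e^{-\rho t}$ used in the estimates for $F$ has a closed form: the function $t^{3\tau} e^{-\rho t}$ is maximal at $t = 3\tau/\rho$, which gives $\Gamma(\rho) = (3\tau/(e\rho))^{3\tau}$. The sketch below compares this formula with a brute-force maximum over a grid. The function names, the grid, and the sample values of $\rho$ and $\tau$ are assumptions made for illustration only.

```python
import math


def gamma_closed_form(rho, tau):
    """sup over t >= 0 of t**(3*tau) * exp(-rho*t), attained at t = 3*tau/rho."""
    a = 3 * tau
    return (a / (math.e * rho)) ** a


def gamma_grid(rho, tau, points=200000):
    """Brute-force maximum of t**(3*tau) * exp(-rho*t) on a uniform grid."""
    a = 3 * tau
    t_max = 10 * a / rho          # grid reaches well past the maximiser a / rho
    best = 0.0
    for j in range(1, points + 1):
        t = t_max * j / points
        best = max(best, t ** a * math.exp(-rho * t))
    return best


# Hypothetical sample values (tau > n + 1 with n = 2).
closed = gamma_closed_form(rho=0.5, tau=3.5)
numeric = gamma_grid(rho=0.5, tau=3.5)
```

The closed form makes explicit that $\Gamma(\rho)$ grows like $\rho^{-3\tau}$ as $\rho \to 0$, which is the price paid for each loss $\rho$ of analyticity width in $x$.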

Lemma 3.2.

$$\frac{1}{s^{2d}} \|F\|_{D_2 \times B} <\!\cdot\ \varepsilon \Gamma. \tag{3.18}$$

Proof. Rewrite F into the following: F = F1 + F2 + F3 + F4 + · · · + F9 , where F1 =

X

1 fklpq ei(k,x) y l up v q ,

(3.19)

with 1 = −1−1 Pkl00 , fkl00

1 = −(12 + 2pa2 )−1 1Pklpq , for 0 < p + q 6= 2, fklpq

1 = −(12 + 8a2 )−1 1Pkl11 , fkl11

1 fklpq = −(12 + 8a2 )−1 1−1 (12 − 4a2 )Pklpq , (p, q) = (2, 0), (0.2);

(3.20)

X F2 = − (12 + 8a2 )−1 8a22 Pkl02 ei(k,x) y l u2 ; X F3 = (12 + 8a2 )−1 2a2 Pkl11 ei(k,x) y l u2 ; X F4 = − (12 + 8a2 )−1 4a2 Pkl02 ei(k,x) y l uv; X F5 = − (12 + 8a2 )−1 2Pkl20 ei(k,x) y l uv; X F6 = − (12 + 8a2 )−1 1−1 2Pkl20 ei(k,x) y l v 2 ; X F7 = − (12 + 8a2 )−1 Pkl11 ei(k,x) y l v 2 ; X F8 = (12 + 2pa2 )−1 (2a2 )Pkl,p−1,1 ei(k,x) y l up , p 6= 2; X F9 = − p(12 + 2pa2 )−1 Pklp0 ei(k,x) y l up−1 v, p 6= 2. To prove Lemma 3.2, it suffices to prove that 1 ||Fi ||D2 ×B < · 0, i = 1, · · · , 9. s2d In the following, we only give the estimates for F1 , F8 , F9 , the estimates for the others are similar to one of F1 , F8 , F9 . By Cauchy estimates, for fixed k, X | Pklpq (ω)y l up v q |D×B ≤ s2d e−|k|r . In view of (3.1) - (3.9), we know that 21 sd < su ≤ s, a2 ≤ 0 and |a2 |su ≤ 2sd . It follows that || F1 (x, y, u, v)||D2 ×B XX 1 ≤ ( |fklpq (ω)y l up v q |D2 ×B )e|k|(r−ρ) k6=0

≤

X

|13 |e−|k|ρ s2d = 0s2d ,

k6=0

|| F8 (x, y, u, v)||D2 ×B X = || (12 + 2pa2 )−1 (2a2 )Pkl,p−1,1 ei(k,x) y l up ||D2 ×B d X 2 (1 + 2pa2 )−1 (2a2 u)Pkl,p−1,1 ei(k,x) y l up−1 v||D2 ×B = || dv X ≤ |2a2 u| · 2|s−d | · || (12 + 2pa2 )−1 Pkl,p−1,1 ei(k,x) y l up−1 v||D×B < · 0s2d , || F9 (x, y, u, v)||D2 ×B X = || (12 + 2pa2 )−1 pPklp0 ei(k,x) y l up−1 v||D2 ×B

d X 2 ( (1 + 2pa2 )−1 Pklp0 ei(k,x) y l up ) · v||D2 ×B du X d = 2s−1 (12 + 2pa2 )−1 Pklp0 ei(k,x) y l up )||D×B u s || = ||

d 2d 2d < · s−1 u s 0s < · 0s .

Again by the Cauchy inequality, it follows that ρ||Fx ||, s2d ||Fy ||, su ||Fu |, sd ||Fv || < · 0s2d ,

(3.21)

uniformly on D2 × B. Let ||Di F ||D×B = max{ρ−i1 |

∂iF |D×B , |i1 | + |i2 | + i3 + i4 = i}. ∂xi1 ∂y i2 ∂ui3 ∂v i4

Note that F is a polynomial in y of order 1, in v of order 2 and in u of order i∗ . From the Cauchy inequality and (3.5), it also follows that ||Di F ||D4 ×B < · 0

1 s2d 4d , ∗ < · 0α i su

(3.22)

for i = 2, · · · , i∗ . If is sufficiently small so that ·0 < ( 21 α)4d in (3.21), the symplectic coordinate transformation 8 = XFt |t=1 maps D 1 α into Dα . Moreover, 8 transforms H into H¯ = 2 N¯ + {Nh , F } + P¯ (see (2.3)). The new normal form. From (3.13), we know that P¯ is smaller only when the new 2d 1 definition domain of u is smaller, say ssu+ < α i∗ − 4d . In what follows we fix this domain u and find a symplectic change of variables 82 ◦ 81 which transforms N¯ of (3.16) into the normal form up to a smaller term; i.e., N+ = N¯ ◦ 82 ◦ 81 + O(+ ) satisfying (3.5)–(3.9) with s+ , su+ , + , where su+ is determined later. Firstly, we use X P00p1 uj1 , (3.23) 81 : u = u1 , v = v1 − (1 + 2P0002 )−1 ∗

1≤p≤ i2

P i2∗

P00p1 up v in N¯ ( see (3.16) ). It follows that X X N¯ ◦ 81 = N + P0000 + P00j0 up1 + (1 + P0002 )−1 ( P00p1 up1 )2

to kill the mixing terms

p=1

i∗ 2

i∗ ≥p≥1

+

P0002 v12

Denote by f1 (u1 ) = f (u1 ) +

X

≥p≥1

+ P0100 y. P00p0 up1 + (1 + P0002 )−1 (

i∗ ≥p≥1

(3.24) X i∗ 2

P00p1 up1 )2 .

(3.25)

≥p≥1

Now we fix the definition domain of the u1 variable of the next iteration step, which is a bit complicated. Let L = {u1 ∈ [s2 , s1 ], f1 (u1 ) ≥ −2s2d + }, where s1 , s2 are defined in (3.6).

Lemma 3.3.

2d

|L| ≤ C0 α i∗ s1 ,

(3.26)

where |L| denotes the Lebesgue measure of L and C0 = 4(2 + 3 + · · · + 2d + 4 ). d

Proof. In view of (3.3), (3.6), together with the fact < 21 α2d if ≤ 2−8d−1 , we have 1 − 3s2d − s2d ≤ f1 (u1 ) ≤ 2s2d , for s2 ≤ u ≤ s1 , −4s2d ≤ f1 (s1 ), f1 (s2 ) ≤ − s2d . 2 (3.27) It follows that L = {u1 ∈ [s2 , s1 ], |f1 (u1 )| ≤ 2s2d + }, and max

s2 ≤u1 ≤s1

|f (u1 )| ≥

1 2d s . 2

Now Lemma 5.2 in the Appendix leads to the estimate in Lemma 3.3.

¯ ¯2 , s¯1 ] the connected Note that f1 (0) = 0 ≥ −2s2d + implies 0 ∈ L . Denote by L = [s component of L which contains zero. Since f1 (0) = 0 > −2s2d = f1 (s¯1 ) = f1 (s¯2 ), f1 + ¯ say um . Let s˜1 = s¯1 −um , s˜2 = s¯2 −um . will reach its maximum at an interior point of L, We have (3.28) f1 (s˜1 + um ) = f1 (s˜2 + um ) = −2s2d + . Without loss of generality, we assume that |s˜2 | ≤ s˜1 . Corollary 1. 2d

s˜1 ≤ 4C0 α i∗ s1 .

(3.29)

Now we move the origin to the maximum by the following linear symplectic change of variables: 82 : v1 = (1 + P0002 )− 2 v+ , u1 = (1 + P0002 ) 2 u+ + um . 1

1

(3.30)

Let N˜ = N¯ ◦ 81 ◦ 82 . It is easy to see that 1 N˜ =< ω+ , y > + v+2 + f˜(u+ ) + f1 (um ), 2 with

∗

f˜(u+ ) =

i X

a+i ui+ , a2+ ≤ 0.

(3.31)

i=2

Remark. The above symplectic change of variables depends only continuously on the parameter ω. As a consequence, the resulting Hamiltonian depends only continuously on ω. The following estimate will be used to prove the convergence of the coefficients of f and thus the convergence of normal form series. Lemma 3.4.

|ai+ − ai | < · α 4d , i = 1, · · · , i∗ . 1

(3.32)

Proof. Equation (3.32) can be directly verified from (3.5), (3.25), (3.26), (3.30) by the formula i 1 di f˜ 1 di f1 |u+ =0 = |u=um (1 + P0002 ) 2 . ai+ = i i! du+ i! dui Now N˜ is already in normal form. But if s1+ is too small, we can not get the result of Lemma 3.4 in the next KAM step which is crucial for proving the convergence of the normal form series. Since we don’t know the exact size of s1+ , we have to discard some higher order terms in f˜ according to s1+ . For the same reason we also need to shift s1+ up a little bit when necessary. Certainly, this modification will make the new perturbation a bit larger, but it does not influence the convergence of the iteration. Let 1 1 (3.33) [s2+ , s1+ ] = [(1 + P0002 )− 2 s˜2 , (1 + P0002 )− 2 s˜1 ]. More precisely, let i∗+ be the integer such that 1 1 1 1 2d 4d i∗ 4d i∗ + , (s + +1 α ) α ) , s1+ ∈ (s2d + + + + + +

(3.34)

1

∗

∗ ∗ 4d if si1+ ≤ s2d + + α+ , otherwise let i+ = i . In view of (3.28) and (3.30), it follows that that f˜(s1+ ) = f˜(s2+ ) = f1 (s¯1 ) = f1 (s¯2 ) = ¯1 , |s¯2 | ≥ sd+ . Note that (3.34) implies 2s2d + . Together with (3.32), it follows that s

3 3 3 s˜1 ≥ (s¯1 − um ) ≥ [(s¯1 − um ) − (s¯2 − um )] 4 4 4 1 1 3 3 4d 2 > (s¯1 − s¯2 ) ≥ sd+ > (s2d + + α+ ) . 4 4

s1+ >

(3.35) It follows that Define

i∗+

≥ 2. (

su+ =

1 − 2di ∗

10s1+ α+ 10s1+ ,

+

,

i∗

1 − 4d

1

2d 4d if s1++ ∈ (s2d + + α+ , s + + α+ otherwise,

), ;

(3.36)

here we assume that |s2+ | ≤ s1+ , otherwise we replace s1+ in (3.36) by |s2+ |. In view of (3.34), (3.36), we have s2d + + i∗ + su+

1

≤ α+4d .

(3.37)

By definition (3.36), i∗ +1

∗

3 − 4d

+ ≤ max{(10i+ +1 (s2d su+ + + α+

)

i∗ + +1 i∗ +

1

1

2d 4d 4d , s2d + + α + } < · s+ + α + .

It follows that ∗

max |

|u+ |≤su+

i X i=i∗ + +1

∗

a+i ui+ |

i X

≤ max

|u+ |≤su

|a+i ui+ |

i=i∗ + +1

∗

≤ 2|

i X i=i∗ + +1

i∗ +1

+ siu+ | ≤ 8su+ < · s2d + + .

(3.38)

Pi ∗ Thus we can put the higher order terms i=i∗+ +1 a+i ui+ in N˜ into the new perturbation and keep only the terms which are bigger than + . More precisely, we set ∗

f+ (u+ ) =

i+ X

ai+ ui+ ,

i=2

for the next iteration step. As a matter of fact, we mention that su+ < · α i∗ − 4d su , 2d

1

by (3.33) and (3.36), which will lead to the estimate of (3.13) in D1 . Now we prove Lemma 3.5. 2d 2d 2d − 3s2d + ≤ f+ (u+ ) ≤ s+ + , for u ∈ [s2+ , s1+ ], and − 3s+ ≤ f+ (s1+ ), f (s2+ ) ≤ −s+ , (3.39) 2d ¯ we have Proof. In fact, since −2s2d ¯2 , s¯1 ], by the definition of L, + ≤ f1 ≤ s in [s 5 2d 2d − 2 s+ ≤ −2s+ − f1 (um ) ≤ f˜ ≤ 0 in [s˜1 , s˜2 ] . Combining with (3.33) and (3.38), it 2d follows that −3s2d + ≤ f+ ≤ s+ + in [s1+ , s2+ ]. Another half of (3.39) is proved similarly.

Lemma 3.6. ∗

max |Nh+ | = max |

|u+ |≤su+

|u+ |≤su+

i+ X

1 − 2d

a+i ui+ | ≤ c0 s2d + α+

,

(3.40)

i=3

where c0 = 10C 2d , the constant C is defined in Lemma 5.3. Proof. In view of (3.39 ), Lemma 5.3 and the definition of su+ , we have ∗

max |Nh+ | ≤

|u+ |≤su+

i+ X

∗

|a+i siu+ |

≤

i=2

i+ X

i − 2di ∗

|a+i |10i si1+ α+

+

i=2 ∗

≤ 10

i∗ +

− 1 α+ 2d

i+ X

∗

|a+i |si1+

≤ 10

i=2

≤

− 1 c0 α+ 2d s2d + .

i∗ +

− 1 Cα+ 2d

max |

0≤u≤s1+

i+ X

a+i ui |

i=2

Lemma 3.7. |a2+ su+ | < · sd+ . 2d+1

(3.41)

Proof. If i∗+ ≥ 3, then su+ ≥ s+ 3 by (3.34), (3.36). Combining with (3.39), it follows that

∗

|a2+ s2u+ |

≤

−1 |a2+ s21+ |α+ d

≤ max

|u|≤s1+

i∗ +

< · max | |u|≤s1+

X

− d1

ai+ ui |α+

i+ X

− d1

|ai+ ui |α+

i=2 −1

d < · s2d + α+ .

i=2

(3.42) Thus

4d−1

−1

−1

3 d −1 α+ d ≤ sd+ . |a2+ su+ | < · s2d + α+ su < · s+ ∗ In case that i+ = 2, we have su+ = 10s1+ in (3.36) since s1+ ≥ sd+ . It follows that

−d 2 −d |a2+ su+ | ≤ 10|a2+ |s21+ s−1 max |f+ (u)| < · sd+ . 1+ ≤ 10s+ |a2+ s1+ | ≤ 10Cs+ |u|≤s1+

So far, we have found a symplectic change of variables such that the normal form part N+ of the transformed Hamiltonian H+ = N+ + P+ satisfies all the iteration assumptions with s+ , r+ = r − ρ, + uniformly for ω ∈ B. Before proving the smallness of P+ , we first note that after a cycle of iteration the frequencies of the normal form is shifted to ω+ = φ(ω) = ω +P0100 (ω). Since K n e−Kρ ≤ , |P0100 | ≤ , the range of B under φ still contains the ball B(ω0 , 21 γK+−τ −1 ) in ω+ space. Treating ω+ as new parameters, we have | < k, ω+ > | >

1 γK+−τ 2

(3.43)

for ω+ ∈ B(ω0 , 21 γK+−τ −1 ) and 0 6= k ≤ K+ , where K is the minimum integer satisfying 2K+n e−K+ ρ ≤ + , if is sufficiently small. Now we arrive at a new normal form 1 N+ = e+ + (ω+ , y+ ) + v+2 + f+ (u+ ), 2 with e+ = e + P0000 + f1 (um ), satisfying all the inductive assumptions (3.1) - (3.9) with s+ , su+ , + . Estimates for the new perturbation. To finish one cycle of iteration, the only thing which remains is to estimate the new error term. Firstly, we give some estimates for XFt , 81 , 82 . The following (3.44), which will be used to prove our coordinate transformations is well defined. Equation (3.45) is useful for proving the convergence of the composition of coordinate transformations. Let D i8 α = D(r − 9ρ + iρ, 8i s+ , 8i su+ ). Especially, we denote D 1 α by D+ . 8

Lemma 3.8. 8 2 : D+ → D 1 α , 4

81 : D 1 α → D 1 α , 4

2

XFt : D 1 α → Dα , 0 ≤ t ≤ 1, 2

(3.44)

if s, is sufficiently small. Moreover, ||D9 − Id||D+ < E, ||D2 9||D+ < c, where 9 = 8 ◦ 81 ◦ 82 .

(3.45)

Proof. Note that 82 is a linear transformation defined by (3.30). It follows that, in D+ , P0002

1 d 1 Es+ ≤ sd+ , 4 8 (1 + P0002 ) 1 P0002 |u − u+ | = |(1 + P0002 ) 2 u+ + um − u+ | = |um + 1 u+ | 1 + (1 + P0002 ) 2 1 1 1 ≤ su+ + su+ ≤ su+ , 10 4 8

|v1 − v+ | = |(1 + P0002 )− 2 v − v| = | 1

1 2

v| ≤

which implies 82 : D+ → D 1 α . Moreover, 4

||Di 82 ||D+ < 2, i ≥ 2. P Since | 1≤p≤ i∗ P00p1 up | ≤ sd = sd ≤ 2 81 : D 1 α → D 1 α . Moreover, by (3.37), 4

1 d 4 s+

(3.46)

if ≤

1 16 ,

It follows from (3.23),

2

||Di 81 − Id||D 1 α ≤ | 4

di dui

X ∗

P00p1 up |D 1 α ≤ C

1≤p≤ i2

4

sd i∗ 2

1

≤ C 3 ,

su

for i ≥ 2. Thus ||Di 81 || < 2, i ≥ 2.

(3.47)

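To get the estimates for $X_F^t$, we start from the integral equation

\[ X_F^t = \mathrm{id} + \int_0^t J(\nabla F) \circ X_F^s \, ds. \]

That $X_F^t : D_{\frac{1}{2}\alpha} \to D_\alpha$, $0 \le t \le 1$, follows directly from (3.21). Since

\[ DX_F^t = \mathrm{Id} + \int_0^t J D^2F \circ X_F^s \cdot DX_F^s \, ds, \]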

It follows that ||DXFt − Id|| ≤ 2 max |D2 F | ≤ 2.

(3.48)

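The estimates of the higher order derivatives $D^i X_F^t$ ($i \ge 2$) follow from (3.22). Equation (3.45) can be verified, in view of (3.46), (3.47), (3.48), from the following formulas:

\[ D\Psi = (D\Phi) \circ \Phi_1 \circ \Phi_2 \cdot (D\Phi_1) \circ \Phi_2 \cdot D\Phi_2, \]

\[ D^2\Psi = (D^2\Phi) \circ \Phi_1 \circ \Phi_2 \cdot \big((D\Phi_1) \circ \Phi_2 \cdot D\Phi_2\big)^2 + (D\Phi) \circ \Phi_1 \circ \Phi_2 \cdot (D^2\Phi_1) \circ \Phi_2 \cdot (D\Phi_2)^2 + (D\Phi) \circ \Phi_1 \circ \Phi_2 \cdot (D\Phi_1) \circ \Phi_2 \cdot D^2\Phi_2. \]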


By the definitions of 8, 81 , 82 , 9 and Lemma 3.8, we know that H ◦ 9 = H ◦ 8 ◦ 8 1 ◦ 82 = N + + P + ∗

= N+ + (P¯ + {F, Nh }) ◦ 81 ◦ 82 +

i X

ai+ ui ,

i∗+ +1

is well defined in D+ × B+ . Moreover, we have the following estimates: ∗

|| P+ ||D+ ×B+ Z

i X i∗+

ai+ ui ||D 1 α ×B+

1

= ||( +

= ||(P¯ + {F, Nh }) ◦ 81 ◦ 82 +

{Rt , F } ◦ XFt dt + {R, F } + (P − R) ◦ 8 −

0 ∗ i X i∗+ 1

8

∂F ∂Nh ) ◦ 81 ◦ 82 ∂v ∂u

ai+ ui+ ||D 1 α ×B+ 8

Z < · ||(

0

{Rt , F } ◦ XFt dt + {R, F } + (P − R) ◦ 8)||D 1 α ×B 2

∂F ∂Nh ||D 1 α ×B + s2d + || + + ∂v ∂u 4 < · ||{Rt , F }||Dα ×B + ||{R, F }||Dα ×B + ||(P − R)||Dα ×B ∂F ∂Nh ||D 1 α ×B + s2d + || + + ∂v ∂u 2 ∂F ∂Nh ||D 1 α ×B + s2d < · ρ−1 (02 s2d + s2d 2 + || + + ) ∂v ∂u 2 ∂Nh ||D 1 α ×B + s2d < · ρ−1 (02 s2d + s2d 2 + 0sd || + + ), ∂u 2 where Rt = (1 − t){N + R, F }. Since su ≥ s ||

2d+1 3

(3.49)

( otherwise Nh = 0), we have

∂Nh ||D 1 α ×B ≤ 2s−1 max ||Nh (u)|| u s∈[−su+ ,su+ ] ∂u 2

2d − d − < · s2d α− d s−1 s u < ·s α 1

< · sd s

d−1 3

1

2d+1 3

α− d ≤ sd , 1

(3.50)

if s is sufficiently small. Combining with (3.49) and (3.50), we have 2d ||P+ ||D+ ×B+ < · ρ−1 0s2d 2 + s2d + + < · 0s+ + .

(3.51)

It means there is a constant c depending on d, n, m, γ, δ, but not on the iteration steps such that 1 ||P+ ||D 1 α ×B+ ≤ cρ−1 0+ . s2d 8 + One circle of the KAM step is finished. Remark. Equations (3.43), (3.51), (3.37), (3.39), (3.31),(3.40),(3.41) play the same roles as (3.2), (3.4), (3.5), (3.6),(3.7),(3.8), (3.9) in the next KAM step.


4. Iteration and Convergence For any given r, let ρi =

1 2i+4 r,

9(r) =

∞ Y

1

0(ρi ) κi+1 ,

i=1 1 , is a well defined finite function of r (see [17, 21]). For any given where κ = 1 + 4d+1 s0 , 0 , r0 , we define some sequences inductively in the following:

rν = rν−1 − 8ρν−1 = r0 − 8

ν X

ρi ,

i=1 κ ν = cρ−1 ν−1 0(ρν−1 )ν−1 ,

αν = ν16d

8d 2 +2d−1

,

Y 1 αν−1 sν−1 = 8−ν ( i )s0 , 8 ν−1

sν =

i=0 n −Kρν

Kν = min{K : 2K e 1 Bν = B(ω0 , γKν−τ −1 ), 2 Dν = D(rν , sν , suν ),

≤ ν },

where sdν < suν ≤ sν depends on each KAM step and can not be defined uniformly. However, because all the estimates are made in sν , the convergence can be proved even if suν can not be defined explicitly, Let Hν = Nν + Pν (ων , xν , yν , uν , vν ). Summarizing conclusions (3.43), (3.37), (3.39), (3.31),(3.40),(3.41), (3.51), we have the following iteration lemma: Lemma 4.1. If s0 , 0 are sufficiently small, then the following holds for ν ≥ 0. Let Hν = N2ν + Nhν + Pν satisfies (3.1)- (3.9) in Dν × Bν with = ν , s = sν , su = suν . Then there is a frequency map φν+1 : Bν+1 → Bν and a symplectic change of variables (4.1) 9ν : Dν+1 → Dν , depending on ω ∈ φν+1 (Bν+1 ), such that Hν+1 = Hν ◦ 9ν , defined on Dν+1 × Bν+1 , has the form (4.2) Hν+1 = N2,ν+1 + Nh,ν+1 + Pν+1 , with i∗

N2,ν+1

ν+1 X 1 2 = eν+1 +(ων+1 , yν+1 )+ vν+1 +a2,ν+1 u2ν+1 , Nh,ν+1 (uν+1 ) = ai,ν+1 uiν+1 . (4.3) 2

i=3

Moreover, Hν+1 satisfies the estimates 1 ||Pν+1 ||Dν+1 ≤ ν+1 , s2d ν+1 1

(4.4)

1

|ai,ν+1 − ai,ν | < cαν4d , c+d for i = 2, · · · , i∗ν+1 .

(4.5)


Moreover, there are s1,ν+1 , s2,ν+1 with −su,ν+1 ≤ s2,ν+1 ≤ 0 ≤ s1,ν+1 ≤ su,ν+1 , such that 2d −3s2d ν+1 ≤ fν+1 (uν+1 ) ≤ sν+1 ν+1 , for u ∈ [s2,ν+1 , s1,ν+1 ], 2d −3s2d ν+1 ≤ fν+1 (s1,ν+1 ), fν+1 (s2,ν+1 ) ≤ −sν+1 ,

(4.6)

a2,ν+1 ≤ 0,

(4.7) − d1

max |Nh (uν+1 )| ≤ c0 s2d ν+1 αν+1 ,

(4.8)

u≤su,ν+1

where fν+1 (uν+1 ) =

Pi∗ν+1 i=2

|a2,ν+1 uν+1 | ≤ c0 sdν ,

(4.9)

ai,ν+1 uiν+1 .

Now we are in the position to prove the main theorems. We only give the proof of Theorem 1.1. Theorem 1.2 is an immediate consequence of Theorem 1.1. Proof of Theorem 1.1. Since the assumptions of Theorem 1.1 are satisfied, the iteration lemma applies for ν = 0 if we set s0 = s, su0 = s, N0 = N, P0 = P, B0 = B, and 0 is sufficiently small. Inductively, we obtain the following sequences: 9ν = 90 ◦ · · · ◦ 9ν : Dν × Bν → D0 × B0 , ν ≥ 1, −1 φν = φ−1 0 ◦ · · · ◦ φν : Bν → B0 , ν ≥ 0,

H ◦ 9ν = Hν = N2ν + Nhν + Pν

D ν × Bν ,

on

where Bν = φ Bν are nested domains. By same argument as in [17], in view of Lemma 3.8, 9ν , D9ν converges uniformly on D∞ × B∞ = D(r − 8ρ, 0, 0) × ∩∞ ν=0 Bν . In view of (4.5), N2ν + Nhν converges to, say N2∞ + Nh∞ , with ν

∞ X 1 N2∞ + Nh∞ = e∞ + (ω0 , y) + v 2 + ai,∞ ui , 2

i

i=2

where i∞ varies from 2 to 2d depending on the perturbation. Since ν = c0(ρν−1 )κν−1 = cν

ν−1 Y i=1

ν

ν

0(ρi )ν−i κ0 = (c κν

ν−1 Y

0(ρi )κ

−i−1

ν

ν

0 )κ ≤ (C9(r)0 )κ ,

i=1

ν κν

where C = supν c . It follows that ν → 0 if 0 is sufficiently small. The convergence of 9ν , D9ν , XHν implies that we can take the limit for XH◦9ν = (D9ν )∗ XH ◦ 9ν = XN2ν +Nhν + XPν ,

(4.10)

and arrive at (D9∞ )∗ XH◦9∞ = XN2∞ +Nh∞ on D∞ = D(r − 8ρ, 0, 0) uniformly for ω ∗ ∈ B ∞ = ∩∞ ν=0 Bν , where


9∞ : T n → R n × T n , depending on ω ∈ B∞ and XN∞ is an integrable system on T n × {0} carrying the rotation flow of frequencies ω0 . It follows from (4.10), for any ω ∗ ∈ B∞ , (D9∞ )∗ XHω∗ ◦ 9∞ ({ω0 } × T n )) = XN∞ ({ω0 } × T n )), or equivalently,

φtHω∗ (9∞ ({ω0 } × T n )) = 9∞ ({ω0 } × T n ),

where $\phi^t_{H_{\omega^*}}$ is the flow of $X_{H_{\omega^*}}$. That means $\Psi_\infty(\{\omega_0\} \times T^n)$ is an embedded invariant torus of the original perturbed Hamiltonian system at $\omega^* \in B_\infty$.

5. Appendix

The following Lemma 5.1 has been proved in [25]. For the sake of completeness, we repeat the proof here.

Lemma 5.1. Suppose that $g(u)$ is an $m$ times differentiable function on the closure $\bar{I}$ of $I$, where $I \subset R^1$ is an interval. Let $I_h = \{u \mid |g(u)| < h\}$, $h > 0$. If for some constant $d > 0$, $|g^{(m)}(u)| \ge d$ for all $u \in I$, then $|I_h| \le c h^{1/m}$, where $|I_h|$ denotes the Lebesgue measure of $I_h$ and $c = 2(2 + 3 + \cdots + m + d^{-1})$.

Proof. Let $I_h^{m-1} = \{u \mid |g^{(m-1)}(u)| < h\}$. Since for all $u \in I$, $|(g^{(m-1)}(u))'| = |g^{(m)}(u)| \ge d > 0$, the set $I_h^{m-1}$ has at most one connected component and it follows that $|I_h^{m-1}| \le \frac{2h}{d}$.

Let $I_h^{m-2} = \{u \mid |g^{(m-2)}(u)| < h^2\}$. From the above, $I - I_h^{m-1} = \{u \mid |g^{(m-1)}(u)| \ge h,\ u \in I\}$ has at most two connected components. Denote these components by $I_{(1)}^{m-1}$ and $I_{(2)}^{m-1}$. Thus,

\[ |(g^{(m-2)}(u))'| = |g^{(m-1)}(u)| \ge h, \qquad u \in I_{(1)}^{m-1} \cup I_{(2)}^{m-1}. \]

In the same way as above, since $I_h^{m-2} \cap I_{(1)}^{m-1}$ and $I_h^{m-2} \cap I_{(2)}^{m-1}$ have at most one connected component in $I_{(1)}^{m-1}$ and $I_{(2)}^{m-1}$ respectively, we have

\[ |I_h^{m-2} \cap I_{(1)}^{m-1}| \le 2h, \qquad |I_h^{m-2} \cap I_{(2)}^{m-1}| \le 2h. \]

Thus,

\[ |I_h^{m-2}| \le |I_h^{m-2} \cap (I - I_h^{m-1})| + |I_h^{m-2} \cap I_h^{m-1}| \le |I_h^{m-2} \cap I_{(1)}^{m-1}| + |I_h^{m-2} \cap I_{(2)}^{m-1}| + |I_h^{m-2} \cap I_h^{m-1}| \le 4h + 2d^{-1}h = 2(2 + d^{-1})h. \]

Let $I_h^1 = \{u \mid |g'(u)| < h^{m-1}\}$. In the same way as above, it follows that

\[ |I_h^1| \le 2(2 + 3 + \cdots + (m-1) + d^{-1})h, \]

and $I - I_h^1$ has at most $m$ connected components. Denote these components by $I_{(1)}^1, I_{(2)}^1, \ldots, I_{(m)}^1$ and let $I_h^0 = \{u \mid |g(u)| < h^m\}$. Then $|I_h^0 \cap I_{(1)}^1| \le 2h, \ldots, |I_h^0 \cap I_{(m)}^1| \le 2h$. So

\[ |I_h^0| \le |I_h^0 \cap (I - I_h^1)| + |I_h^0 \cap I_h^1| \le [2m + 2(2 + 3 + \cdots + (m-1) + d^{-1})]h \le 2(2 + 3 + \cdots + m + d^{-1})h \le ch. \]

Noticing that $I_h = I^0_{h^{1/m}}$, we have $|I_h| \le c h^{1/m}$.
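As an illustration of Lemma 5.1 (not part of the original proof), the following minimal sketch estimates the measure of a sublevel set on a grid and compares it with the bound $c h^{1/m}$. The test function $g(u) = u^3$ on $[-1, 1]$, the grid size and the helper names are assumptions chosen for this example only.

```python
import math

def lemma_constant(m, d):
    """c = 2(2 + 3 + ... + m + 1/d), the constant of Lemma 5.1."""
    return 2.0 * (sum(range(2, m + 1)) + 1.0 / d)

def sublevel_measure(g, a, b, h, cells=200_000):
    """Midpoint-grid estimate of the measure of {u in [a, b] : |g(u)| < h}."""
    width = (b - a) / cells
    hits = sum(1 for i in range(cells) if abs(g(a + (i + 0.5) * width)) < h)
    return hits * width

# Hypothetical example: g(u) = u**m has constant m-th derivative m!,
# so the hypothesis |g^(m)| >= d holds with d = m!.
m, h = 3, 1e-3
d = math.factorial(m)
measure = sublevel_measure(lambda u: u ** m, -1.0, 1.0, h)
bound = lemma_constant(m, d) * h ** (1.0 / m)
print(measure, bound)
```

For this choice the sublevel set is the interval $|u| < h^{1/3}$, so the estimate can be compared with the exact value $2h^{1/3}$ as well as with the bound of the lemma.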

Lemma 5.2. Suppose that f1 (u) =

Pm i=2

ai ui is a polynomial of order m, which satisfies

max |f1 (u)| ≥

u∈[s2 ,s1 ]

Let

1 2d s . 2

2d L = {u ∈ [s2 , s1 ], |f (u)| ≤ 2s2d + = 2(αs) }.

Then

2d s1+ 2d ) m = C0 s 1 α m , s where |L| denotes the Lebesgue measure of L, C0 ≤ 4(2 + 3 + · · · + m + 2m ).

|L| ≤ C(

Proof. Without loss of generality, we assume |s2 | ≤ s1 . Consider an auxiliary polynomial m X ai si1 i U P (U ) = s2d i=2

with U = 1 2.

s2 u s1 . Then P (U ) is well defined in [ s1 , 1]

⊂ [−1, 1] with maxU ∈[ s2 ,1] |P (U )| ≥ s1

It follows that there must be a m0 ≤ m such that the coefficients of P (U ), 0 −2d ai sm ≥ (m!)−1 2−m0 . 1 s

Let m0 be the largest number so that (5.1) is satisfied. Then m X 1 d m0 m0 −2d 0 −2d P (U )| ≥ |(m0 )!am0 s1 s | − |i!ai sm U i| | 1 s dU (m0 )! i=m0 +1

≥ 2−m0

m 1 X −i − 2 ≥ 2−m0 −1 , m0 ! i=m0 +1

for U ∈ [ ss21 , 1]. Since L = {s1 U.|U ∈ [

s2 s2d , 1], P (U ) ≥ +2d }, s1 s

we have, by Lemma 5.1, |L| = s1 |{U |U ∈ [

2d 2d s2 s2d , 1], P (U ) ≥ +2d }| ≤ C0 s1 α m0 ≤ C0 s1 α m . s1 s

(5.1)


Lemma 5.3. Suppose that $X$ is the space of polynomials of order $m$ defined on a finite interval $I$. Then the two norms

\[ |f(u)| = \max_{u \in I} \Big| \sum_{i=2}^{m} a_i u^i \Big|, \qquad \|f(u)\| = \max_{u \in I} \sum_{i=2}^{m} |a_i u^i|, \]

on $X$ are equivalent, i.e., there is a constant $C$ depending only on $m$ such that $|f(u)| \le \|f(u)\| \le C|f(u)|$.

Acknowledgement. The work was done when the author visited the Mathematisches Institut der Universität zu Köln and the Mathematisches Institut der Technischen Universität München. He acknowledges the hospitality of both institutes. He would like to thank H.W. Broer, H. Hanssmann, T. Küpper, J. Pöschel, J. Scheurle and F. Takens for their interest and helpful comments. He also thanks the referee of this paper for his patient and careful reading of the manuscript and for pointing out a number of errors in the previous version of the paper.

References

1. Arnold, V.I.: Proof of A. N. Kolmogorov's theorem on the preservation of quasi periodic motions under small perturbations of the Hamiltonian. Usp. Math. USSR 18, 13–40 (1963)
2. Arnold, V.I.: Small denominators and problems of stability of motions in classical and celestial mechanics. Russ. Math. Surv. 18:6, 85–191 (1963)
3. Bourgain, J.: Construction of quasi-periodic solutions for Hamiltonian perturbations of linear equations and applications to nonlinear PDE. Int. Math. Res. Notices. 475–497 (1994)
4. Bruno, A.D.: Analytic form of differential equations. Trudy MMO 25 (1971), Trans. Moscow Math. Soc. 25, 131–288 (1971)
5. Cheng, C.Q.: Birkhoff-Kolmogorov-Arnold-Moser tori in convex Hamiltonian systems. Commun. Math. Phys. 177, 529–559 (1996)
6. Cheng, C.-Q. and Sun, Y.-S.: Existence of KAM tori in degenerate Hamiltonian systems. J. Differ. Eqs. 114, 288–335 (1994)
7. Craig, W. and Wayne, C.E.: Newton's method and periodic solutions of nonlinear wave equations. Commun. Pure Appl. Math. 46, 1409–1498 (1993)
8. Graff, S.M.: On the continuation of stable invariant tori for Hamiltonian systems. J. Differ. Eqs. 15, 1–69 (1974)
9. Eliasson, L.H.: Perturbations of stable invariant tori for Hamiltonian systems. Ann. Sc. Norm. Sup. Pisa. 15, 115–147 (1988)
10. Kolmogorov, A.N.: On quasi-periodic motions under small perturbations of the Hamiltonian. Dokl. Akad. Nauk. USSR. 98, 1–20 (1954)
11. Kuksin, S.B.: Nearly integrable infinite dimensional Hamiltonian systems. Lect. Notes in Math. 1556. Berlin: Springer, 1993
12. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasiperiodic oscillations for a nonlinear Schrödinger equation. Ann. of Math. 142, 149–179 (1995)
13. Melnikov, V.K.: On some cases of the conservation of conditionally periodic motions under a small change of the Hamiltonian function. Sov. Math. Dokl. 6, 1592–1596 (1965)
14. Moser, J.: Convergent series expansions for quasiperiodic motions. Math. Ann. 169(1), 136–176 (1967)
15. Moser, J.: Combination tones for Duffing's equation. Commun. Pure Appl. Math. 18 (1965)
16. Moser, J. and Pöschel, J.: An extension of a result by Dinaburg and Sinai on quasi-periodic potentials. Comment. Math. Helv. 59, 39–85 (1984)
17. Pöschel, J.: On elliptic lower dimensional tori in Hamiltonian systems. Math. Z. 202, 559–608 (1989)
18. Pöschel, J.: A KAM-Theorem for some Nonlinear Partial Differential Equations. Ann. Scuola. Norm. Sup. Pisa Cl. 23, 119–148 (1996)
19. Pöschel, J.: Integrability of Hamiltonian systems on Cantor Sets. Commun. Pure Appl. Math. 25, 653–695 (1982)
20. Rüssmann, H.: On twist Hamiltonians. Talk on the Colloque Int.: Mécanique céleste et systèmes hamiltoniens, Marseille, 1990
21. Rüssmann, H.: On the one dimensional Schrödinger equation with a quasiperiodic potential. Annals of the New York Acad. Sci. 375, 90–107 (1980)
22. Takens, F.: Singularities of vector fields. Publ. Math. I.H.E.S. 43, 479–528 (1974)
23. Wayne, C.E.: Periodic and quasi-periodic solutions for nonlinear wave equations via KAM theory. Commun. Math. Phys. 127, 479–528 (1990)
24. Xia, Z.: The existence of the invariant tori in the volume preserving diffeomorphism. Erg. Th. & Dyn. Sys. 12, 621–631 (1992)
25. Xu, J., You, J., Qiu, Q.: Invariant tori for nearly integrable Hamiltonian systems with degeneracy. To appear in Math. Z. (1996)
26. Zehnder, E.: Generalized implicit function theorems with applications to some small divisor problem, I and II. Commun. Pure Appl. Math. 28, 91–140 (1975), 49–111 (1976)

Communicated by A. Jaffe

Commun. Math. Phys. 192, 169 – 182 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

α-Continuity Properties of One-Dimensional Quasicrystals

David Damanik

Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, 60054 Frankfurt/Main, Germany

Received: 3 April 1997 / Accepted: 25 June 1997

Abstract: We apply the Jitomirskaya-Last extension of the Gilbert-Pearson theory to discrete one-dimensional Schrödinger operators with potentials arising from generalized Fibonacci sequences. We prove for certain rotation numbers that for every value of the coupling constant, there exists an α > 0 such that the corresponding operator has purely α-continuous spectrum. This result follows from uniform upper and lower bounds for the $\|\cdot\|_L$-norm of the solutions corresponding to energies from the spectrum of the operator.

1. Introduction

In this paper we consider a discrete one-dimensional Schrödinger operator on $l^2(\mathbb{Z})$ with potential arising from a generalized Fibonacci sequence already studied by Bellissard et al. [2] and Iochum et al. [16], namely,

\[ (H_{\lambda,\theta,\beta}u)(n) = u(n+1) + u(n-1) + \lambda v_{\theta,\beta}(n)u(n), \tag{1} \]

where $v_{\theta,\beta}(n) = \chi_{[1-\theta,1)}(n\theta + \beta)$, $\lambda > 0$, $\theta \in (0,1)$ irrational and $\beta \in [0,1]$, together with the corresponding difference equation

\[ H_{\lambda,\theta,\beta}u = Eu, \tag{2} \]

where the sequence $(u(n))_{n \in \mathbb{Z}}$ will always be normalized in the sense that

\[ |u(0)|^2 + |u(1)|^2 = 1. \tag{3} \]


In [2] it is shown by Bellissard et al. that $H_{\lambda,\theta,\beta}$ has purely singular spectrum which is supported on a Cantor set of zero Lebesgue measure (see also [33] for the golden case). They show that the spectrum of $H_{\lambda,\theta,\beta}$ is independent of β and study $H_{\lambda,\theta} \equiv H_{\lambda,\theta,0}$. In this case the potential has the form

\[ v_{\theta,0}(n) = \lfloor (n+1)\theta \rfloor - \lfloor n\theta \rfloor \tag{4} \]

and one can prove recursive relations for the transfer matrices at certain sites. One can then show that the spectrum coincides with the set of energies where the Lyapunov exponent vanishes. By [23], this set has zero Lebesgue measure. Let us remark that the absence of absolutely continuous spectrum also follows in a more general context for minimally ergodic operators with aperiodic potentials taking finitely many values by combining the results of Kotani [23] and Last-Simon [27]. For the concrete operator $H_{\lambda,\theta}$ we can also show continuity of the spectrum:

Theorem 1. The point spectrum of $H_{\lambda,\theta}$ is empty for all λ, θ.

Remarks.
1. This result follows quite easily from the symmetry properties of the potential which were deduced in [2]. However, this generalization of the corresponding result in the golden case [32] was not stated in [2]. We therefore state and prove it for the sake of completeness.
2. Related results were obtained by Delyon-Petritis [8] and Hof et al. [14].

Thus, $H_{\lambda,\theta}$ has purely singular continuous spectrum for all parameter values. The natural question arises whether estimates on the Hausdorff dimensional properties of the spectral measures of $H_{\lambda,\theta}$ can be obtained, which in turn give information about the dynamical properties of the associated quantum systems as shown by Guarneri [12, 13], Combes [5] and Last [26]. The related question about control of the Hausdorff dimension of the spectrum as a set was stated as an open problem in [2]. Recently, Jitomirskaya and Last provided an approach for such investigations in terms of estimates on the behavior of the solutions u of (2) [18, 19, 20]. This approach is a natural extension of the Gilbert-Pearson theory relating the existence (resp. non-existence) of subordinate solutions to supports of the singular (resp. absolutely continuous) part of a given Schrödinger operator [11, 10, 22]. Consequently, using an upper bound result proved by Iochum et al. [15, 16] we apply this approach to the operator $H_{\lambda,\theta}$. We therefore provide an extension of a similar result about the original Fibonacci operator which was announced in [18].

The organization is as follows: After collecting known properties of $H_{\lambda,\theta}$ we prove in Sect. 2 for certain rotation numbers θ uniform bounds on

\[ \|u\|_L \equiv \Big( \sum_{n=0}^{\lfloor L \rfloor} |u(n)|^2 + (L - \lfloor L \rfloor)|u(\lfloor L \rfloor + 1)|^2 \Big)^{1/2} \tag{5} \]

for every solution u of (2) which imply Theorem 1. In Sect. 3 we show how α-continuity of the spectral measures of Hλ,θ follows from these bounds, proving Theorem 2 below. In the appendix we recall some well-known facts about spectral measures and m-functions along with the key inequality from the Jitomirskaya-Last approach.


Theorem 2. Suppose θ is a bounded density number. Then for every λ > 0 there exists an $\alpha_{\lambda,\theta} > 0$ such that $H_{\lambda,\theta}$ has purely $\alpha_{\lambda,\theta}$-continuous spectrum, namely, the spectral measures do not give weight to sets of zero $h^{\alpha_{\lambda,\theta}}$ measure.

Remarks.
1. The α-dimensional Hausdorff measure $h^\alpha$ is defined by
\[ h^\alpha(S) \equiv \lim_{\delta \to 0} \ \inf_{\delta\text{-covers of } S} \ \sum_{n=1}^{\infty} |b_n|^\alpha, \]
where a δ-cover of a set $S \subseteq \mathbb{R}$ is a countable collection of intervals $\{b_n\}_{n \in \mathbb{N}}$ with $S \subseteq \bigcup_{n=1}^{\infty} b_n$ and $|b_n| < \delta$ for all n, $|\cdot|$ denoting Lebesgue measure. See [9, 29] for a general exposition on this topic and [30, 31, 26] for decompositions of singular continuous spectral measures with respect to Hausdorff measures.
2. Theorem 6.1 of [26] gives bounds on the time evolution of initial states with nontrivial $\alpha_{\lambda,\theta}$-continuous component. Therefore, these bounds apply to every state in our model.
3. If the characteristic function $\chi_{[1-\theta,1)}$ in the potential is replaced by a nicer function, for example a trigonometric polynomial, then the spectral measures are purely zero-dimensional for λ large enough [17].
4. In particular, this result implies that for this set of θ's the spectrum $\sigma(H_{\lambda,\theta})$ as a set has positive Hausdorff dimension.
5. The set of θ's satisfying the assumption of Theorem 2 contains all constant-type numbers (golden mean, silver mean, . . . ) and is of zero Lebesgue measure [24].
6. Extending this result to a larger class of θ's essentially amounts to extending the upper bound result of [16] since the lower bound result which will be derived in Sect. 2 holds for almost every θ.

Using the constancy of the spectrum we have the following:

Corollary 1. Suppose that θ is a bounded density number. Then for every λ and every β the spectrum of $H_{\lambda,\theta,\beta}$ has positive Hausdorff dimension.

Remarks.
1. Since the dimensionality of the spectral measures only gives a lower bound for the dimensionality of the spectrum as a set, upper bounds for the latter dimensionality are most conveniently investigated by considering the spectra of the canonical periodic approximants introduced in [32, 2]. In fact, the Almost Mathieu operator gives an example where it is known for certain parameters that the spectrum has positive Lebesgue measure while the spectral measures are purely zero-dimensional [25, 26, 18].
2. The investigation of the canonical periodic approximants has been carried out in [28] for the Fibonacci case ($\theta = \theta_F \equiv \frac{\sqrt{5}-1}{2}$), yielding an upper bound for the Hausdorff dimension of the spectrum. It has been shown that there exists a value $\lambda_0$ such that this dimension is strictly smaller than 1 for all $\lambda > \lambda_0$. This in turn implies for these parameter values that the spectral measures must also be β(λ)-singular (i.e., they are supported on a set S with $h^{\beta(\lambda)}(S) = 0$, simply take $S = \sigma(H_{\lambda,\theta_F})$), where β(λ) < 1.

2. Uniform Bounds for $\|u\|_L$, Proof of Theorem 1

Consider the continued fraction expansion of θ:

\[ \theta = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} \equiv [a_1, a_2, \ldots]. \]

Then, best rational approximations $\frac{p_k}{q_k}$ of θ are given by (see, e.g., [24])

\[ p_{k+1} = a_{k+1}p_k + p_{k-1}, \quad p_0 = 0, \ p_1 = 1, \tag{6} \]

\[ q_{k+1} = a_{k+1}q_k + q_{k-1}, \quad q_0 = 1, \ q_1 = a_1. \tag{7} \]

Recall that θ is called a bounded density number if the following condition holds:

\[ \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} a_n < \infty. \]

As already remarked in [2], $v_{\theta,0}$ has the following symmetry properties:

\[ v_{\theta,0}(n + q_k) = v_{\theta,0}(n), \quad 1 \le n < q_{k+1} - 1, \tag{8} \]

\[ v_{\theta,0}(-n) = v_{\theta,0}(n-1), \quad n \ge 2. \tag{9} \]
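The recursion (7) and the symmetry properties (8) and (9) can be checked numerically for specific rotation numbers. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, the partial quotients are supplied by hand, and the floor function is evaluated in floating point, which is assumed to be accurate for the small indices used here.

```python
import math

def denominators(partial_quotients):
    """q_0 = 1, q_1 = a_1, q_{k+1} = a_{k+1} q_k + q_{k-1}, as in (7)."""
    q = [1, partial_quotients[0]]
    for a in partial_quotients[1:]:
        q.append(a * q[-1] + q[-2])
    return q

def potential(theta, n):
    """v_{theta,0}(n) = floor((n+1) theta) - floor(n theta), as in (4)."""
    return math.floor((n + 1) * theta) - math.floor(n * theta)

def check_symmetries(theta, partial_quotients):
    """Return (periodicity (8) holds, reflection (9) holds) on a finite range."""
    q = denominators(partial_quotients)
    periodic = all(
        potential(theta, n + q[k]) == potential(theta, n)
        for k in range(len(q) - 1)
        for n in range(1, q[k + 1] - 1)  # 1 <= n < q_{k+1} - 1
    )
    reflected = all(
        potential(theta, -n) == potential(theta, n - 1)
        for n in range(2, q[-1])
    )
    return periodic, reflected

golden = (math.sqrt(5) - 1) / 2  # continued fraction [1, 1, 1, ...]
silver = math.sqrt(2) - 1        # continued fraction [2, 2, 2, ...]
print(check_symmetries(golden, [1] * 10))
print(check_symmetries(silver, [2] * 6))
```

Both rotation numbers are of constant type, so they fall under the assumption of Theorem 2.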

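In the following, let u denote a solution of the difference equation (2), where β = 0, $E \in \sigma(H_{\lambda,\theta})$ and u is normalized in the sense of (3). Define

\[ U(n) \equiv \begin{pmatrix} u(n) \\ u(n-1) \end{pmatrix} \]

and

\[ \|U\|_L \equiv \Big( \sum_{n=1}^{\lfloor L \rfloor} \|U(n)\|^2 + (L - \lfloor L \rfloor)\|U(\lfloor L \rfloor + 1)\|^2 \Big)^{1/2}, \]

where

\[ \|U(n)\| \equiv (|u(n)|^2 + |u(n-1)|^2)^{1/2}. \]

The quantities $\|u\|_L$ and $\|U\|_L$ have the same asymptotic behavior as $L \to \infty$ since

\[ \tfrac{1}{2}\|U\|_L^2 \le \|u\|_L^2 \le \|U\|_L^2. \tag{10} \]

Let $T_{\lambda,\theta}(n, E)$ be the transfer matrix

\[ T_{\lambda,\theta}(n, E) \equiv \begin{pmatrix} E - \lambda v_{\theta,0}(n) & -1 \\ 1 & 0 \end{pmatrix}, \]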

which provides the standard reformulation of the eigenvalue equation (2), namely U (n + 1) = Tλ,θ (n, E)U (n). Leaving the dependence on λ, θ and E implicit for notational convenience, we define M (n) ≡ Tλ,θ (n, E) × · · · × Tλ,θ (1, E) and Mn ≡ M (qn ). From (8) we obtain recursive relations for the matrices Mn :

\[ M_{n+1} = M_{n-1} M_n^{a_{n+1}}. \tag{11} \]
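The recursion (11) can be tested directly by multiplying out transfer matrices. The sketch below is an illustration under assumptions, not the paper's argument: integer values of E and λ are used so that all matrix entries are exact integers, the helper names are hypothetical, and indices n with $q_n = 1$ are skipped because the periodicity (8) gives no information there.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def potential(theta, n):
    """v_{theta,0}(n) = floor((n+1) theta) - floor(n theta)."""
    return math.floor((n + 1) * theta) - math.floor(n * theta)

def monodromy(theta, energy, coupling, length):
    """M(length) = T(length) ... T(1), T(n) = [[E - lambda v(n), -1], [1, 0]]."""
    product = ((1, 0), (0, 1))
    for n in range(1, length + 1):
        step = ((energy - coupling * potential(theta, n), -1), (1, 0))
        product = matmul(step, product)
    return product

def recursion_holds(theta, partial_quotients, energy=1, coupling=2):
    """Check M_{n+1} = M_{n-1} M_n^{a_{n+1}} for all available n with q_n >= 2."""
    q = [1, partial_quotients[0]]
    for a in partial_quotients[1:]:
        q.append(a * q[-1] + q[-2])
    for n in range(1, len(q) - 1):
        if q[n] < 2:
            continue  # assumption of this sketch: skip the degenerate start
        m_n = monodromy(theta, energy, coupling, q[n])
        power = ((1, 0), (0, 1))
        for _ in range(partial_quotients[n]):  # partial_quotients[n] is a_{n+1}
            power = matmul(power, m_n)
        expected = matmul(monodromy(theta, energy, coupling, q[n - 1]), power)
        if monodromy(theta, energy, coupling, q[n + 1]) != expected:
            return False
    return True

print(recursion_holds((math.sqrt(5) - 1) / 2, [1] * 10))
print(recursion_holds(math.sqrt(2) - 1, [2] * 6))
```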

We would like to remark at this point that for the original Fibonacci operator ($a_n = 1$ for all n), this follows more directly from an alternative description of the potential by means of a substitution rule. The recursive relation (11) in this case arises naturally from the self-similarity of the potential which is induced by the substitution rule; compare, for example, [3, 6]. The potential can also be generated by a substitution rule for other values of θ; compare [1] for a related class of operators. The recursion (11) induces a recursion for the traces $x_n \equiv \mathrm{tr}(M_n)$ of the transfer matrices at sites $q_n$. In [2] it is shown that E is in the spectrum of $H_{\lambda,\theta}$ if and only if the sequence $(|x_n|)_{n \in \mathbb{N}}$ is bounded. Moreover, for every λ there exists a uniform bound $c_\lambda$ for all sequences $(|x_n|)_{n \in \mathbb{N}}$ corresponding to energies $E \in \sigma(H_{\lambda,\theta})$. Using this result, Iochum et al. derive upper bounds for solutions u of (2) in the case that θ is a bounded density number [15, 16]. Keeping track of the constants and using the uniform bounds on the traces, one verifies

\[ \|u\|_L \le C_{\lambda,\theta} L^{\alpha^{(2)}_{\lambda,\theta}} \tag{12} \]

for all solutions u, normalized in the sense of (3), corresponding to $E \in \sigma(H_{\lambda,\theta})$ with suitably chosen $\alpha^{(2)}_{\lambda,\theta}$, $C_{\lambda,\theta}$. We will now prove a similar lower bound:

Proposition 2.1. Suppose the sequence $(q_n)$ associated to θ satisfies $q_n \le B_\theta^n$. Then for every λ > 0, there exist $\alpha^{(1)}_{\lambda,\theta} > 0$ and $L_0 \ge 1$ such that

\[ \|u\|_L \ge \frac{1}{\sqrt{2}} L^{\alpha^{(1)}_{\lambda,\theta}} \quad \forall L \ge L_0 \tag{13} \]

for all normalized solutions u corresponding to $E \in \sigma(H_{\lambda,\theta})$.

Remark. The set of θ's satisfying the assumption has full Lebesgue measure [21].

This proposition will be proved in three steps:
• We consider $\|U\|_{q_{5n}}$ and show that this quantity grows at least exponentially in n.
• We show that the claimed lower bound holds at the sites $q_{5n}$.
• The lower bound also holds at all other sites.

Lemma 2.2. For every irrational θ and every λ we have

\[ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-5}} \]

for any normalized solution u corresponding to $E \in \sigma(H_{\lambda,\theta})$.

Proof. The lemma will be proved by exploiting squares in the transfer matrix arrangements. We will use the following mass-reproduction technique: Suppose the 2-vector U(m) sees a square in the transfer matrix arrangement, that is,

\[ U(m + 2l) = A_l^2 U(m). \]

Then the following inequality holds (compare, e.g., [32]):

\[ \max\big(\|U(m + 2l)\|, \ |\mathrm{tr}\, A_l| \, \|U(m + l)\|\big) \ge \tfrac{1}{2}\|U(m)\|. \]

Typically, $A_l$ will be a cyclic permutation of a product

\[ T_{\lambda,\theta}(l, E) \times \cdots \times T_{\lambda,\theta}(1, E), \]

where l is equal to some $q_n$. Thus $\mathrm{tr}\, A_l = \mathrm{tr}\, M_n = x_n$ and we can use the uniform upper bound for these traces. This technique will be applied to a certain number of consecutive sites. Hence, if the sites $k = 1, \ldots, q_{n_1}$ see the squares of cyclic permutations of a matrix product $M_{n_2}$, we have

\[ \|U\|_{q_{n_1} + 2q_{n_2}} \ge C_\lambda \|U\|_{q_{n_1}}, \tag{14} \]

where

\[ C_\lambda \equiv \Big(1 + \big(\tfrac{1}{2c_\lambda}\big)^2\Big)^{1/2} > 1, \]

since

\[ \|U\|^2_{q_{n_1} + 2q_{n_2}} = \sum_{m=1}^{q_{n_1}} \|U(m)\|^2 + \sum_{m=q_{n_1}+1}^{q_{n_1}+2q_{n_2}} \|U(m)\|^2 \ge \sum_{m=1}^{q_{n_1}} \|U(m)\|^2 + \big(\tfrac{1}{2c_\lambda}\big)^2 \Big(\sum_{m=1}^{q_{n_1}} \|U(m)\|^2\Big) = \Big(1 + \big(\tfrac{1}{2c_\lambda}\big)^2\Big)\|U\|^2_{q_{n_1}}. \]
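The mass-reproduction inequality used above is a consequence of the Cayley-Hamilton theorem for 2×2 matrices of determinant one, $A^2 - (\mathrm{tr}\,A)A + I = 0$, which gives $U(m) = (\mathrm{tr}\,A_l)\,U(m+l) - U(m+2l)$. The following minimal sketch tests the inequality on randomly generated products of transfer matrices; the random diagonal entries, the number of factors and the function names are assumptions made for illustration only.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def apply(matrix, vector):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return (
        matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
        matrix[1][0] * vector[0] + matrix[1][1] * vector[1],
    )

def mass_reproduction_holds(trials=1000, seed=0):
    """Test max(|A^2 U|, |tr A| |A U|) >= |U| / 2 on random determinant-one blocks."""
    rng = random.Random(seed)
    for _ in range(trials):
        block = ((1.0, 0.0), (0.0, 1.0))
        for _ in range(rng.randint(1, 6)):
            # Each factor has the transfer-matrix shape and determinant 1.
            step = ((rng.uniform(-3.0, 3.0), -1.0), (1.0, 0.0))
            block = matmul(step, block)
        start = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        middle = apply(block, start)   # plays the role of U(m + l)
        end = apply(block, middle)     # plays the role of U(m + 2l)
        trace = block[0][0] + block[1][1]
        lhs = max(math.hypot(*end), abs(trace) * math.hypot(*middle))
        if lhs < 0.5 * math.hypot(*start) * (1 - 1e-9):
            return False
    return True

print(mass_reproduction_holds())
```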

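We have to consider three cases on different levels.

Case 1: $a_n \ge 3$
\[ \Rightarrow M_n = M_{n-2} M_{n-1}^{a_n} = \widehat{M}_n M_{n-1}^3 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-1}}, \]
by (14).

Case 2: $a_n = 2$
\[ \Rightarrow M_n = M_{n-2} M_{n-1}^2 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-2}}, \]
again by (14), applied to the sites $1, \ldots, q_{n-2}$.

Case 3: $a_n = 1$
\[ \Rightarrow M_n = M_{n-2} M_{n-1} = M_{n-2} M_{n-3} M_{n-2}^{a_{n-1}}. \]
Now, we have to consider the three cases on the next level:

Case 3.1: $a_{n-1} \ge 3$
\[ \Rightarrow M_n = \widetilde{M}_n M_{n-2}^3 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-2}}, \]
and we are done in this case.

Case 3.2: $a_{n-1} = 2$
\[ \Rightarrow M_n = M_{n-2} M_{n-3} M_{n-2}^2 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-3}}, \]
by the argument from Case 2.

Case 3.3: $a_{n-1} = 1$
\[ \Rightarrow M_n = M_{n-2} M_{n-3} M_{n-2}. \]
Again, we have to consider the three cases on the next level:

Case 3.3.1: $a_{n-2} \ge 3$
\[ \Rightarrow M_n = \overline{M}_n M_{n-3}^3 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-3}}. \]

Case 3.3.2: $a_{n-2} = 2$
\[ \Rightarrow M_n = M_{n-2} M_{n-3} M_{n-4} M_{n-3}^2 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-4}}. \]

Case 3.3.3: $a_{n-2} = 1$
\[ \Rightarrow M_n = M_{n-2} M_{n-3} M_{n-4} M_{n-3}. \]
For the last time we have to consider the three cases on the next level:

Case 3.3.3.1: $a_{n-3} \ge 3$
\[ \Rightarrow M_n = M_n' M_{n-4}^3 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-4}}. \]

Case 3.3.3.2: $a_{n-3} = 2$
\[ \Rightarrow M_n = M_{n-2} M_{n-3} M_{n-4} M_{n-5} M_{n-4}^2 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-5}}. \]

Case 3.3.3.3: $a_{n-3} = 1$
\[ \Rightarrow M_n = M_{n-2} M_{n-1} = M_{n-4} M_{n-3}^2 M_{n-2} = M_{n-4} M_{n-5} M_{n-4} M_{n-3} M_{n-2} = M_{n-4} M_{n-5} M_{n-2}^2 \ \Rightarrow \ \|U\|_{q_n} \ge C_\lambda \|U\|_{q_{n-5}}. \]

Hence, we obtain the desired inequality in all cases.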

Proof of Theorem 1. From Lemma 2.2, we see that for all parameters λ, θ, there exist no $l^2$-solutions.

Now, we need an estimate on the growth of the $q_n$ in the case that θ is a bounded density number in order to show that these numbers satisfy the assumption of Proposition 2.1.

Lemma 2.3. Suppose that θ is a bounded density number. Then there exists a constant C such that $q_n \le C^n$.

Proof. We compare the sequence $(q_n)$ with the sequence $(r_n)$ which is generated by the recursion $r_{n+1} = 2a_{n+1}r_n$ with initial condition $r_1 = a_1$. We have $q_n \le r_n$ and

\[ r_n = \prod_{i=1}^{n} 2a_i, \]

yielding

\[ \frac{1}{n}\ln(r_n) = \frac{1}{n}\sum_{i=1}^{n} \ln 2a_i. \]

The right-hand side is bounded by assumption. Hence, the assertion follows.

Lemma 2.4. Suppose the sequence $(q_n)$ associated to θ satisfies $q_n \le B_\theta^n$. Then for every λ > 0, there exists a $\gamma_{\lambda,\theta} > 0$ such that

\[ \|U\|_{q_{5n}} \ge q_{5n}^{\gamma_{\lambda,\theta}} \]

for any normalized solution u corresponding to $E \in \sigma(H_{\lambda,\theta})$.

Proof. By assumption we have

\[ C_{\theta,1}^n \le q_{5n} \le C_{\theta,2}^n, \]

and, by Lemma 2.2,

\[ \|U\|_{q_{5n}} \ge C_\lambda^n. \]

Choose $\gamma_{\lambda,\theta} > 0$ such that

\[ C_{\theta,2}^{\gamma_{\lambda,\theta}} \le C_\lambda. \]

We have

\[ \frac{\|U\|_{q_{5n}}}{q_{5n}^{\gamma_{\lambda,\theta}}} \ge \Big(\frac{C_\lambda}{C_{\theta,2}^{\gamma_{\lambda,\theta}}}\Big)^n \ge 1, \]

and the proof is complete.

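Proof of Proposition 2.1. Choose

\[ \varepsilon \in \Big( \frac{\ln C_{\theta,2} - \ln C_{\theta,1}}{\ln C_{\theta,2}} \, \gamma_{\lambda,\theta}, \ \gamma_{\lambda,\theta} \Big). \]

We have $\alpha^{(1)}_{\lambda,\theta} \equiv \gamma_{\lambda,\theta} - \varepsilon > 0$ and

\[ \frac{C_{\theta,2}^{\alpha^{(1)}_{\lambda,\theta}}}{C_{\theta,1}^{\gamma_{\lambda,\theta}}} < 1. \]

Choose $n_0$ such that

\[ \Big( \frac{C_{\theta,2}^{\alpha^{(1)}_{\lambda,\theta}}}{C_{\theta,1}^{\gamma_{\lambda,\theta}}} \Big)^{n_0} \le C_{\theta,2}^{-\alpha^{(1)}_{\lambda,\theta}}, \]

and let $L_0 \equiv q_{5n_0}$. Now, let $L \ge L_0$. Then $q_{5n} \le L < q_{5(n+1)}$ for a suitably chosen $n \ge n_0$. With these definitions the desired lower bound follows:

\[ \|U\|_L \ge \|U\|_{q_{5n}} \ge q_{5n}^{\gamma_{\lambda,\theta}} \ge C_{\theta,1}^{n\gamma_{\lambda,\theta}} \ge C_{\theta,2}^{(n+1)\alpha^{(1)}_{\lambda,\theta}} \ge q_{5(n+1)}^{\alpha^{(1)}_{\lambda,\theta}} \ge L^{\alpha^{(1)}_{\lambda,\theta}}. \]

Now use (10) to get the claimed bound for $\|u\|_L$.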


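3. Non-Existence of α-Subordinate Solutions, Proof of Theorem 2

In Sect. 2 we obtained estimates on the behavior of eigenfunctions on the right half-line. Following the lines of Jitomirskaya-Last [18, 19, 20], we will now show that these estimates imply the absence of α-subordinate solutions yielding bounds on the possible growth of the half-line m-functions. Using the symmetry (9) of the potential, one can then prove α-continuity of the spectral measure of the line operator.

Proposition 3.1. Let

\[ \alpha_{\lambda,\theta} \equiv \frac{2\alpha^{(1)}_{\lambda,\theta}}{\alpha^{(1)}_{\lambda,\theta} + \alpha^{(2)}_{\lambda,\theta}}. \]

Then, for every pair $(u_1, u_2)$ of normalized solutions of (2) corresponding to $E \in \sigma(H_{\lambda,\theta})$, the following holds:

\[ \liminf_{L \to \infty} \frac{\|u_1\|_L}{\|u_2\|_L^{\alpha_{\lambda,\theta}/(2 - \alpha_{\lambda,\theta})}} > 0. \]

Proof. Using the uniform bounds we obtain for $L \ge L_0$,

\[ \frac{\|u_1\|_L}{\|u_2\|_L^{\alpha/(2-\alpha)}} \ge \frac{\frac{1}{\sqrt{2}} L^{\alpha^{(1)}_{\lambda,\theta}}}{\big(C_{\lambda,\theta} L^{\alpha^{(2)}_{\lambda,\theta}}\big)^{\alpha/(2-\alpha)}} = \frac{1}{\sqrt{2}\, C_{\lambda,\theta}^{\alpha/(2-\alpha)}} \, L^{\alpha^{(1)}_{\lambda,\theta} - \alpha^{(2)}_{\lambda,\theta}\frac{\alpha}{2-\alpha}}. \]

For $\alpha = \alpha_{\lambda,\theta}$ the right-hand side is uniformly bounded away from 0.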
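The last step rests on the fact that the exponent of L vanishes for the stated choice of α, since then α/(2 − α) equals the ratio of the lower and upper growth exponents. A minimal sketch with exact rational arithmetic, assuming hypothetical sample values for the two exponents, makes this explicit:

```python
from fractions import Fraction

def critical_alpha(alpha1, alpha2):
    """alpha = 2 alpha1 / (alpha1 + alpha2), as in Proposition 3.1."""
    return 2 * alpha1 / (alpha1 + alpha2)

def growth_exponent(alpha1, alpha2, alpha):
    """Exponent of L in the lower bound: alpha1 - alpha2 * alpha / (2 - alpha)."""
    return alpha1 - alpha2 * alpha / (2 - alpha)

# Hypothetical exponents with 0 < alpha1 <= alpha2 (lower and upper growth rates).
alpha1, alpha2 = Fraction(1, 7), Fraction(3, 2)
alpha = critical_alpha(alpha1, alpha2)
print(alpha, growth_exponent(alpha1, alpha2, alpha))
```

For any smaller α the exponent is positive, so the quotient even diverges; for larger α the bound degenerates.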

Once the positivity of this lim inf is obtained, one can employ the following general procedure from the study of the Fibonacci case in [20]: Let $m^\phi_+$ be the family of Weyl-Titchmarsh m-functions on the right half-line corresponding to boundary conditions indexed by φ and let M be the trace of the Weyl matrix. M has the form

\[ M = \frac{-1}{m_+ + m_-} + \frac{1}{\frac{1}{m_+} + \frac{1}{m_-}}, \]

where $m_\pm$ are m-functions on the right and left half-line respectively corresponding to suitably chosen boundary conditions. The appendix recalls definitions and basic properties of these standard quantities. In order to prove Theorem 2 it suffices to show

\[ \limsup_{\varepsilon \to 0} \varepsilon^{1-\alpha}|M(E + i\varepsilon)| < \infty \tag{15} \]

for spectral-almost every energy E. Combining the result of Gilbert [10], the Jitomirskaya-Last inequality (26) and the absence of absolutely continuous spectrum, we deduce that for spectral-almost every energy E there exists a phase $\phi \in (-\frac{\pi}{2}, \frac{\pi}{2}]$ such that $m^\phi_+(E + i\varepsilon)$ tends to infinity as ε → 0. We shall prove (15) for these energies. To this end, we will only consider the first term; the second term is treated in a similar way. We therefore want to show

\[ \limsup_{\varepsilon \to 0} \varepsilon^{1-\alpha} \frac{1}{|m_+(E + i\varepsilon) + m_-(E + i\varepsilon)|} < \infty, \tag{16} \]

or, equivalently,

\[ \liminf_{\varepsilon \to 0} \varepsilon^{\alpha-1}|m_+(E + i\varepsilon) + m_-(E + i\varepsilon)| > 0. \tag{17} \]

Using the symmetry (9) and the Riccati equation (22), one can express $m_-(E + i\varepsilon)$ in terms of $m_+(E + i\varepsilon)$ and $E + i\varepsilon$ (i.e., $m_-(E + i\varepsilon) = g(m_+(E + i\varepsilon), E + i\varepsilon)$). A straightforward calculation yields

\[ g(m, z) = \frac{1}{-m + 1 - z} + z. \tag{18} \]

Thus,

\[ m_+(E + i\varepsilon) + m_-(E + i\varepsilon) = f(m_+(E + i\varepsilon), E + i\varepsilon), \]

where

\[ f(m, z) = m + \frac{1}{-m + 1 - z} + z. \]

Consider the case

\[ f(m_+(E + i\varepsilon), E + i\varepsilon) \to 0 \]

as ε → 0 (otherwise (17) holds trivially). Then, $m_+(E + i\varepsilon)$ converges to a real value. Moreover, by Propositions 3.1, 4.1 and 4.2, we infer that $m_+(E + i\varepsilon)$ converges more slowly than ε and obeys the desired lower bound on the speed of convergence. In order to complete the proof of Theorem 2, we are therefore left to show that

\[ \frac{\partial f}{\partial m}(m_+(E + i0), E + i0) \ne 0. \tag{19} \]
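The derivative in (19) is elementary: from f(m, z) = m + 1/(−m + 1 − z) + z one obtains ∂f/∂m = 1 + 1/(1 − z − m)², which vanishes only when (1 − z − m)² = −1. A minimal numerical sketch of this computation follows; the sample energy and the function name are assumptions made for illustration.

```python
def df_dm(m, z):
    """Partial derivative of f(m, z) = m + 1/(-m + 1 - z) + z with respect to m."""
    return 1 + 1 / (1 - z - m) ** 2

E = 0.3  # hypothetical real energy

# The derivative vanishes exactly at the two non-real points m = 1 - E +/- i.
critical_values = [abs(df_dm(1 - E + sign * 1j, E)) for sign in (+1, -1)]

# For real m (away from the pole m = 1 - E) the derivative is at least 1.
real_values = [df_dm(m, E) for m in (-5.0, -0.2, 0.1, 2.5, 40.0)]

print(critical_values, real_values)
```

In particular, the derivative cannot vanish at a real limiting value of the m-function.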

A simple calculation shows

$$\frac{\partial f}{\partial m}(m, E + i0) = 0 \iff m = 1 - E \pm i.$$

Since $m_+(E + i0)$ is real, (19) follows.

4. Appendix: Line Operators, m-Functions, Spectral Measures and the Jitomirskaya-Last Inequality

In this section, we summarize the part of the Jitomirskaya-Last approach which relates α-continuity properties of the spectral measure of a line operator to an analysis of the solutions on the half-lines. We remark at this point that the half-line case is treated in [18] (proofs appear in [19]), along with some remarks on the line case. A comprehensive discussion of the line case will be provided in [20]. We first recall some basic facts which, for example, can be found in [4]. Let

$$(Hu)(n) = u(n + 1) + u(n - 1) + V(n)u(n) \tag{20}$$

be a discrete one-dimensional Schrödinger operator on $\ell^2(\mathbb{Z})$ with a real-valued bounded potential $V$. Consider for $z = E + i\varepsilon$, $\varepsilon \ge 0$, the difference equation

$$Hu = zu. \tag{21}$$

If $\varepsilon > 0$, this equation has solutions $u_\pm(z, \cdot)$, where $u_\pm(z, \cdot)$ is square-summable at $\pm\infty$. These solutions are unique up to a constant factor. Define the pair of m-functions $m_\pm$ by

$$m_\pm(z) \equiv \mp \frac{u_\pm(z, 1)}{u_\pm(z, 0)}.$$

If one defines more generally

$$m_{n,\pm}(z) \equiv \mp \frac{u_\pm(z, n + 1)}{u_\pm(z, n)},$$

then the following well-known Riccati equation holds (simply because $u_\pm$ are solutions):

$$\mp m_{n+1,\pm}(z) \mp m_{n,\pm}(z)^{-1} + V(n + 1) - z = 0. \tag{22}$$
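The Riccati equation (22) uses only the fact that $u$ solves $Hu = zu$, so it can be checked numerically on any solution. The sketch below is an illustration under assumptions: it takes an arbitrary two-valued potential and arbitrary initial data (not the square-summable solution $u_+$), builds a solution by forward recursion, and evaluates the left-hand side of (22) with the upper sign.

```python
import random

def solve_forward(V, z, u0, u1, n_max):
    """Solve u(n+1) = (z - V(n)) u(n) - u(n-1) for n = 1..n_max-1, given u(0), u(1)."""
    u = [u0, u1]
    for n in range(1, n_max):
        u.append((z - V[n]) * u[n] - u[n - 1])
    return u

random.seed(0)
n_max = 12
# Illustrative bounded real potential taking two values (an assumption).
V = [random.choice([0.0, 1.0]) for _ in range(n_max + 1)]
z = 0.4 + 0.25j
u = solve_forward(V, z, 1.0, 0.3 - 0.7j, n_max)

# m_n = -u(n+1)/u(n): the "+" sign convention of the text.
m = [-u[n + 1] / u[n] for n in range(n_max)]

# Left-hand side of (22) with the upper sign; it should vanish up to rounding.
riccati = [abs(-m[n + 1] - 1 / m[n] + V[n + 1] - z) for n in range(n_max - 1)]
```

The residuals vanish because $-m_{n+1} - m_n^{-1} = (u(n+2) + u(n))/u(n+1)$, which equals $z - V(n+1)$ by the difference equation.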

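The function $m_+$ corresponds to the case $\phi = 0$ in the family of m-functions $\{m_+^{\phi}\}_{\phi \in (-\frac{\pi}{2}, \frac{\pi}{2}]}$, which is defined as follows. Let $u_1^{\phi} = u_1^{\phi}(z, \cdot)$, $u_2^{\phi} = u_2^{\phi}(z, \cdot)$ be the solutions of (21) satisfying

$$\begin{pmatrix} u_1^{\phi}(z, 1) & u_2^{\phi}(z, 1) \\ u_1^{\phi}(z, 0) & u_2^{\phi}(z, 0) \end{pmatrix} = \begin{pmatrix} \cos(\phi) & \sin(\phi) \\ -\sin(\phi) & \cos(\phi) \end{pmatrix}.$$

$m_+^{\phi}$ is then uniquely defined by the equation $u_+^{\phi}(z, \cdot) = u_2^{\phi}(z, \cdot) + m_+^{\phi} u_1^{\phi}(z, \cdot)$, where $u_+^{\phi}(z, \cdot)$ is the multiple of $u_+(z, \cdot)$ obeying $u_+^{\phi}(z, 0) \cos\phi + u_+^{\phi}(z, 1) \sin\phi = 1$. We have

$$\operatorname{Im} m_+^{\phi}(z) > 0 \tag{23}$$

and

$$m_+^{\phi_2} = \frac{-\sin(\phi_2 - \phi_1) + \cos(\phi_2 - \phi_1)\, m_+^{\phi_1}}{\cos(\phi_2 - \phi_1) + \sin(\phi_2 - \phi_1)\, m_+^{\phi_1}}. \tag{24}$$

Let

$$M(z) \equiv \frac{-1}{m_+(z) + m_-(z)} + \frac{1}{\frac{1}{m_+(z)} + \frac{1}{m_-(z)}}.$$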

The measure $\mu$ in the Herglotz representation of $M(z)$ is equivalent to the spectral measure of the operator $H$ (the spectral measure is a matrix measure and $\mu$ is its trace). Consider for $\alpha \in [0, 1]$ the α-derivative of $\mu$ at $E$,

$$D_\mu^\alpha(E) \equiv \limsup_{\eta \to 0} \frac{\mu((E - \eta, E + \eta))}{(2\eta)^\alpha},$$

and the set

$$F \equiv \{E \mid D_\mu^\alpha(E) < \infty\}.$$

Then, by [30, 31],

$$h^\alpha(\mathbb{R} \setminus F) = 0$$

and

$$\mu(S \cap F) = 0 \quad \text{for any } S \text{ with } h^\alpha(S) = 0.$$

In particular, the restriction $\mu_{\alpha c}(\cdot) \equiv \mu(F \cap \cdot)$ is α-continuous. In [7], del Rio et al. prove

$$D_\mu^\alpha(E) < \infty \iff \limsup_{\varepsilon \to 0} \varepsilon^{1-\alpha} |M(E + i\varepsilon)| < \infty.$$

In order to show α-continuity of the spectral measure it therefore suffices to show that this lim sup is finite almost everywhere with respect to $\mu$. Assign to each $\varepsilon > 0$ a length $L(\varepsilon)$ via the equality

$$\|u_1^{\phi}(E + i0, \cdot)\|_{L(\varepsilon)}\, \|u_2^{\phi}(E + i0, \cdot)\|_{L(\varepsilon)} = \frac{1}{2\varepsilon}. \tag{25}$$

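Now, the Jitomirskaya–Last inequality [18, 19] reads as follows:

$$\frac{5 - \sqrt{24}}{|m_+^{\phi}(E + i\varepsilon)|} < \frac{\|u_1^{\phi}(E + i0, \cdot)\|_{L(\varepsilon)}}{\|u_2^{\phi}(E + i0, \cdot)\|_{L(\varepsilon)}} < \frac{5 + \sqrt{24}}{|m_+^{\phi}(E + i\varepsilon)|}. \tag{26}$$

A straightforward application of this inequality yields

Proposition 4.1. Suppose

$$\liminf_{L \to \infty} \frac{\|u_1^{\phi}(E + i0, \cdot)\|_L}{\|u_2^{\phi}(E + i0, \cdot)\|_L^{\frac{\alpha}{2-\alpha}}} > 0.$$

Then

$$\limsup_{\varepsilon \to 0} \varepsilon^{1-\alpha} |m_+^{\phi}(E + i\varepsilon)| < \infty.$$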

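Proof. By assumption we have

$$\limsup_{L \to \infty} \left( \frac{1}{\|u_1^{\phi}(E + i0, \cdot)\|_L\, \|u_2^{\phi}(E + i0, \cdot)\|_L} \right)^{1-\alpha} \frac{\|u_2^{\phi}(E + i0, \cdot)\|_L}{\|u_1^{\phi}(E + i0, \cdot)\|_L} < \infty.$$

Apply (25) and (26).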
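The algebra behind this proof can be spelled out in a few lines. The Python sketch below is an illustration with hypothetical names and random test values: it writes `n1` and `n2` for the two norms at length $L(\varepsilon)$, recovers $\varepsilon$ from (25), bounds $|m_+^{\phi}|$ from above using (26), and checks that $\varepsilon^{1-\alpha}$ times that bound equals a constant multiple of $n_2^{\alpha} / n_1^{2-\alpha}$, which is bounded exactly when the lim inf in the hypothesis is positive.

```python
import math
import random

C = 5 + math.sqrt(24)  # upper constant in (26)

def eps_from_norms(n1, n2):
    """epsilon defined through (25): n1 * n2 = 1 / (2 * eps)."""
    return 1.0 / (2.0 * n1 * n2)

def bound_via_26(n1, n2, alpha):
    """eps^(1 - alpha) times the upper bound |m| < C * n2 / n1 taken from (26)."""
    return eps_from_norms(n1, n2) ** (1 - alpha) * C * n2 / n1

def closed_form(n1, n2, alpha):
    """The same quantity rewritten as C * 2^(alpha - 1) * n2^alpha / n1^(2 - alpha)."""
    return C * 2 ** (alpha - 1) * n2 ** alpha / n1 ** (2 - alpha)

random.seed(1)
max_rel_err = 0.0
for _ in range(100):
    n1, n2 = random.uniform(0.5, 50), random.uniform(0.5, 50)
    alpha = random.random()
    a, b = bound_via_26(n1, n2, alpha), closed_form(n1, n2, alpha)
    max_rel_err = max(max_rel_err, abs(a - b) / b)
```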

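Proposition 4.2. Consider the case $m_+^{\phi_1}(E + i\varepsilon) \to \infty$ for a certain phase $\phi_1$ and a certain energy $E$. Let $\phi_2 \neq \phi_1$.

a) $m_+^{\phi_2}(E + i\varepsilon)$ converges to $\cot(\phi_2 - \phi_1)$.

b) If

$$\limsup_{\varepsilon \to 0} \varepsilon^{1-\alpha} |m_+^{\phi_1}(E + i\varepsilon)| < \infty,$$

then

$$\liminf_{\varepsilon \to 0} \varepsilon^{\alpha-1} |m_+^{\phi_2}(E + i\varepsilon) - \cot(\phi_2 - \phi_1)| > 0.$$

Proof. Both assertions follow immediately from (24).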
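To make the step from (24) explicit: writing $\delta = \phi_2 - \phi_1$, a direct computation gives $m_+^{\phi_2} - \cot\delta = -1 / (\sin\delta\,(\cos\delta + \sin\delta\, m_+^{\phi_1}))$, so the distance to $\cot\delta$ behaves like $1 / (\sin^2\delta\, |m_+^{\phi_1}|)$. The Python sketch below checks this identity and the limiting behaviour for a few illustrative values; the sample numbers and function names are assumptions.

```python
import math

def m_phi2(m1, delta):
    """Relation (24) with delta = phi2 - phi1."""
    return (-math.sin(delta) + math.cos(delta) * m1) / (
        math.cos(delta) + math.sin(delta) * m1
    )

def gap(m1, delta):
    """Closed form of m_phi2 - cot(delta) obtained by putting (24) over a common denominator."""
    return -1 / (math.sin(delta) * (math.cos(delta) + math.sin(delta) * m1))

delta = 0.8  # illustrative phase difference
cot = math.cos(delta) / math.sin(delta)

checks = []
for m1 in (10 + 3j, 1e3 + 50j, 1e6 + 1e2j):
    lhs = m_phi2(m1, delta) - cot
    # (error of the closed form, |m_phi2 - cot| * |m1|)
    checks.append((abs(lhs - gap(m1, delta)), abs(lhs) * abs(m1)))

limit_value = 1 / math.sin(delta) ** 2
```

As $|m_+^{\phi_1}|$ grows, the product in the second slot of `checks` approaches `limit_value`, which is the quantitative content of parts a) and b).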

Acknowledgement. The author would like to thank S. Jitomirskaya and Y. Last for useful discussions and D. Buschmann for an earlier collaboration on this subject. The author would also like to thank A. Kechris and B. Simon for the hospitality of the Mathematics Department at Caltech where this work was done. Financial support from DAAD (Doktorandenstipendium HSP III) is gratefully acknowledged.


References

1. Aubry, S., Godreche, C., Luck, J. M.: Scaling properties of a structure intermediate between quasiperiodic and random. J. Stat. Phys. 51, 1033–1075 (1988)
2. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasicrystals. Commun. Math. Phys. 125, 527–543 (1989)
3. Bovier, A., Ghez, J.-M.: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 158, 45–66 (1993); Erratum Commun. Math. Phys. 166, 431–432 (1994)
4. Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
5. Combes, J. M.: Connections between quantum dynamics and spectral properties of time-evolution operators. In Differential Equations with Applications to Mathematical Physics, W. F. Ames, E. M. Harrell, and J. V. Herold, Eds., Boston: Academic Press, 1993
6. Damanik, D.: Schrödinger operators with potentials generated by primitive substitutions: An invitation. To appear in Univ. Jagel. Acta Math 34, (1997)
7. del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank one perturbations, and localization. J. d’Analyse Math. 69, 153–200 (1996)
8. Delyon, F., Petritis, D.: Absence of localization in a class of Schrödinger operators with quasiperiodic potential. Commun. Math. Phys. 103, 441–444 (1986)
9. Falconer, K. J.: Fractal Geometry. Chichester: Wiley, 1990
10. Gilbert, D. J.: On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints. Proc. Roy. Edinburgh 112A, 213–229 (1989)
11. Gilbert, D. J., Pearson, D. B.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. Appl. 128, 30–56 (1987)
12. Guarneri, I.: Spectral properties of quantum diffusion on discrete lattices. Europhys. Lett. 10, 95–100 (1989)
13. Guarneri, I.: On an estimate concerning quantum diffusion in the presence of a fractal spectrum. Europhys. Lett. 21, 729–733 (1993)
14. Hof, A., Knill, O., Simon, B.: Singular continuous spectrum for palindromic Schrödinger operators. Commun. Math. Phys. 174, 149–159 (1995)
15. Iochum, B., Testard, D.: Power law growth for the resistance in the Fibonacci model. J. Stat. Phys. 65, 715–723 (1991)
16. Iochum, B., Raymond, L., Testard, D.: Resistance of one-dimensional quasicrystals. Physica A187, 353–368 (1992)
17. Jitomirskaya, S.: In preparation
18. Jitomirskaya, S., Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996)
19. Jitomirskaya, S., Last, Y.: Power law subordinacy and singular spectra, I. Half-line operators. In preparation
20. Jitomirskaya, S., Last, Y.: Power law subordinacy and singular spectra, II. Line operators. In preparation
21. Khintchine, A.: Continued Fractions. Groningen: Noordhoff, 1963
22. Khan, S., Pearson, D. B.: Subordinacy and spectral theory for infinite matrices. Helv. Phys. Acta 65, 505–527 (1992)
23. Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1990)
24. Lang, S.: Introduction to Diophantine Approximations. New York: Addison-Wesley, 1966
25. Last, Y.: A relation between a.c. spectrum of ergodic Jacobi matrices and the spectra of periodic approximants. Commun. Math. Phys. 151, 183–192 (1993)
26. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996)
27. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Preprint
28. Raymond, L.: A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain. Preprint
29. Rogers, C. A.: Hausdorff Measures. London: Cambridge Univ. Press, 1970
30. Rogers, C. A., Taylor, S. J.: The analysis of additive set functions in Euclidean space. Acta Math. Stock 101, 273–302 (1959)
31. Rogers, C. A., Taylor, S. J.: Additive set functions in Euclidean space, II. Acta Math. Stock 109, 207–240 (1963)
32. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987)
33. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989)

Communicated by B. Simon

Commun. Math. Phys. 192, 183 – 215 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Quantum Algebras and q-Special Functions Related to Coherent States Maps of the Disc

Anatol Odzijewicz*

Warsaw University Branch in Bialystok, Institute of Physics, Lipowa 41, PL-15-424 Bialystok, Poland. E-mail: [email protected]

Received: 4 April 1996 / Accepted: 29 June 1997

Abstract: The quantum algebras generated by the coherent states maps of the disc are investigated. It is shown that the analytic realization of these algebras leads to a generalized analysis which includes standard analysis as well as q-analysis. The applications of the analysis to star-product quantizations and q-special functions theory are given. Among others the meromorphic continuation of the generalized basic hypergeometric series is found and a reproducing measure is constructed, when the series is treated as a reproducing kernel.

1. Introduction

The coherent states method is one of the most powerful instruments of contemporary physics. It has extensive applications in quantum mechanics, statistical physics, nuclear physics, quantum optics and other domains of physics, see [16] for an exhaustive presentation and complete references. In general one can understand the coherent states map K as a map of the classical phase space into quantum phase space, i.e. complex projective Hilbert space. Analysing the notion of the physical system, one obtains its description in terms of a triple which consists of a differential manifold, complex separable Hilbert space H and the map of the manifold into the projective space of H. All other physical and mathematical ingredients which one needs in order to describe the system are derived from that triple, see [25]. In particular one can assign to it an operator algebra, which is a generalization of the Heisenberg algebra to a general phase space. Depending on the coherent states map, that algebra could be a C∗-algebra, von Neumann algebra or an algebra realized by unbounded operators with common domain containing all coherent states. We will discuss the general case of such algebras in a separate paper.

* The paper was partially supported by KBN Grant No 2 Po3A 074 10.


In this paper we shall focus our attention on the case where the classical phase space is given by the disc $D_R \subset \mathbb{C}$ of radius $R \le \infty$ and the coherent states map is a complex analytic map of the disc $D_R$ into the complex separable Hilbert space without the zero vector. In that case the quantum algebra $\mathcal{A}_K$ (the K-Heisenberg algebra) is generated by the annihilation operator and its conjugate (the creation operator). Since the annihilation operator is completely determined by the coherent states map, one obtains the family of quantum algebras functionally parametrized by the latter. These algebras satisfy certain relations, which justify their geometric interpretation as noncommutative curves, see Sect. 2 and Ref. [27].

There is also a very nice and natural link of the K-Heisenberg algebra with complex analysis. It is given by the anti-linear monomorphism of the Hilbert space $\mathcal{H}$ in the vector space $\mathcal{O}(D_R)$ of complex analytic functions on the disc. This monomorphism follows from the analyticity of the coherent states map, see Sect. 2. After passing to the analytic realization of $\mathcal{H}$ one can interpret the annihilation operator as a generalized derivative $\partial_K$, i.e. the K-derivative, by definition. The K-derivative operator, its right inverse (the K-integral) and other elements of $\mathcal{A}_K$, such as the operator of multiplication by analytic functions, satisfy interesting relations (e.g. a K-version of the Leibniz rule) which, taken together, give rise to non-standard analysis on the disc. This analysis is the K-deformation of the standard analysis, which one obtains if K is the Gauss coherent states map, see Sect. 2.

The main purpose of this article is the analytic investigation of a class of K-quantum algebras. That class corresponds to the diagonal coherent states map, see Sect. 2, and naturally generalizes the q-Heisenberg algebra, see [21]; (q, p)-Heisenberg algebra, see [8]; the algebras discussed in [11, 12, 26, 30, 36, 37] and the algebra investigated in [17] as a non-commutative deformation of the disc. In numerous articles physicists discuss the natural q-deformations of the harmonic oscillator, see [21, 32, 33]. The algebras related to those deformations and algebras used in [32, 33] for the integration of the Schrödinger equation are also included in our scheme. Therefore, the K-algebras give a natural framework for the classification of the integrable physical systems for which the Schrödinger operator could be given as a function $\hat{H} = H(AA^\dagger, A^\dagger A)$ of the products of K-annihilation $A$ and K-creation $A^\dagger$ operators.

The application of K-analysis to the ∗-product quantization is also very useful. Using the K-analysis one can find a K-analytical expression, see (41), for the K-star product of the two $\mathbb{R}$-analytical functions on the disc $D_R$. If one wants to calculate the ∗-product in terms of standard analysis one ends up with intricate computations and the final formulae are unwieldy. If, however, one does the same by applying K-analysis, the final ∗-product formulae are obtained quickly and efficiently. A good illustration of this is given by formulae (42) and (43) which are q-deformation and (q, p)-deformation respectively of the Moyal-Wick product.

In Sect. 5 we show that to each meromorphic function on $\mathbb{C}$, which does not possess a pole at the origin, but could possess an isolated singularity, some unique K-algebra is related. If the function is rational, the K-analysis reduces to the theory of the basic hypergeometric series and, thus, in the limit one obtains the theory of hypergeometric series, see Sect. 6 and Sect. 7. For example, basic hypergeometric series could be considered as K-exponential functions, i.e. functions which satisfy the K-exponential equation, see (15). In that case we also obtain a generalization of the q-binomial formula and Euler formulae, see [7].

One of the problems of the investigated theory is the problem of finding the reproducing measure for the kernels given by the K-exponential function. That problem is equivalent to the moments problem, see [1], when moments are given as the inverses of the coefficients of the Taylor expansion of the K-exponential map. In Sect. 3 and Sect. 6 we propose a way to solve the moment problem and obtain the exact solution for most of the cases of interest. The result is that we find the reproducing measure for the reproducing kernel given by the generalized basic hypergeometric series $\Phi_R(zv)$. In the limit $q \to 1$, this yields the reproducing measure for the ordinary hypergeometric series treated as a reproducing kernel.

Finally, let us mention that in this paper we address our attention mainly to the analytical aspects of the theory of K-quantum algebras. We do not try to answer many interesting questions concerning the algebraic (i.e. C∗-algebraic and von Neumann algebraic) aspects of the problem. We also do not discuss here the possible ways of extending the notion of K-algebra to the case of general phase spaces and the connection of the above with geometric quantization and the ∗-product quantization. Some aspects concerning the C∗-algebras generated by the coherent states map are investigated in [23].

2. The Coherent States Map of the Disc

In this section we define an algebra which generalizes the Heisenberg algebra. The definition will be based on the coherent states map. In the considered case by coherent states map we mean a complex analytic map $K : D_R \to \mathcal{H} \setminus \{0\}$ of the disc $D_R = \{z \in \mathbb{C} : |z| < R\}$ into a complex separable Hilbert space $\mathcal{H}$. Additionally we assume that the vector coefficients $f_n \in \mathcal{H}$ in the expansion

$$K(z) = \sum_{n=0}^{\infty} f_n z^n \tag{1}$$

are given by $f_n = c_n C e_n$, where $C$ is a bounded operator with bounded inverse $C^{-1}$ and $\{e_n\}_{n=0}^{\infty}$ is an orthonormal basis in $\mathcal{H}$. The coefficients $c_n > 0$, $n \in \mathbb{N} \cup \{0\}$, are subjected to the conditions

$$R^{-1} = \limsup_{n \to \infty} \sqrt[n]{c_n} \quad \text{and} \quad \sup_{n \in \mathbb{N}} \frac{c_{n-1}}{c_n} < +\infty. \tag{2}$$

After the expansion $C e_n = \sum_{m=0}^{\infty} C_{mn} e_m$ one finds

$$K(z) = \sum_{m=0}^{\infty} k_m(z) e_m, \tag{3}$$

where the complex analytic functions $k_m$ are defined by

$$k_m(z) = \sum_{n=0}^{\infty} c_n C_{mn} z^n \tag{4}$$

for $|z| < R$. Let $P_k$ be the orthogonal projection to the subspace orthogonal to the vectors $\{f_0, f_1, \ldots, f_{k-1}, f_{k+1}, \ldots\}$. From the equality

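$$\lim_{N \to \infty} \left\langle \frac{1}{c_n} f_n \,\middle|\, \left( \sum_{k=0}^{N} \frac{c_k^2}{\langle f_k | P_k f_k \rangle} P_k \right) \frac{1}{c_m} f_m \right\rangle = \delta_{nm} = \left\langle \frac{1}{c_n} C^{-1} f_n \,\middle|\, \frac{1}{c_m} C^{-1} f_m \right\rangle = \left\langle \frac{1}{c_n} f_n \,\middle|\, (CC^\dagger)^{-1} \frac{1}{c_m} f_m \right\rangle, \tag{5}$$

one concludes that

$$\lim_{N \to \infty} \left\langle v \,\middle|\, \sum_{k=0}^{N} \frac{c_k^2}{\langle f_k | P_k f_k \rangle} P_k\, w \right\rangle = \left\langle v \,\middle|\, (CC^\dagger)^{-1} w \right\rangle \tag{6}$$

for vectors $v = \sum_{n=0}^{\infty} v_n \frac{1}{c_n} f_n$ and $w = \sum_{n=0}^{\infty} w_n \frac{1}{c_n} f_n$ with $\sum_{n=0}^{\infty} |v_n| < \infty$ and $\sum_{n=0}^{\infty} |w_n| < \infty$. Since all vectors $v, w \in \mathcal{H}$ are decomposable in this way, the series

$$(CC^\dagger)^{-1} = \sum_{k=0}^{\infty} \frac{c_k^2}{\langle f_k | P_k f_k \rangle} P_k \tag{7}$$

converges in the weak sense to the operator $(CC^\dagger)^{-1}$. Formula (7) will be used below to compute $\partial z^n$, see (18). Let us now define the annihilation operator $A$ by the condition that the coherent states $K(z)$ are its eigenstates with eigenvalues $z \in D_R$, i.e.

$$A K(z) = z K(z). \tag{8}$$

Using (8) one finds

$$C^{-1} A C e_0 = 0 \quad \text{and} \quad C^{-1} A C e_n = \frac{c_{n-1}}{c_n} e_{n-1}. \tag{9}$$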

From (9) and the conditions (2) one concludes that $C^{-1} A C$ is a bounded operator. In consequence the annihilation operator $A$ and the creation operator $A^\dagger$ (which is the hermitian conjugate of $A$) are bounded, too. The closure of the algebra spanned by $A$ and $A^\dagger$, in the proper topology, gives a C∗-algebra $\mathcal{A}_K$. By proper topology we mean the weak or the norm topology; for the purposes of this article it is not necessary to specify which one. In the case $C = \mathbb{1}$ and $c_n = \frac{1}{\sqrt{n!}}$ one obtains the Heisenberg algebra. (Let us remark here that in the Heisenberg case $\|A\| = \infty$ and $R = \infty$, i.e. there is no C∗-algebra.) The above motivates us to call $\mathcal{A}_K$ the Heisenberg algebra generated by the coherent states map $K$. In brief we shall call it the K-Heisenberg algebra. Namely, one arrives at a family of noncommutative curves $\mathcal{A}_K$ parametrized by coherent states maps. The creation and annihilation operators $(A^\dagger, A)$ play the role of noncommutative coordinates which replace the commutative ones $(\bar{z}, z)$. The relation between $A^\dagger$ and $A$ is determined by $K$. The appropriate examples of such a relation will be presented subsequently. Since

$$f_{N+1} = \lim_{z \to 0} \frac{1}{z^{N+1}} \left( K(z) - \sum_{k=0}^{N} f_k z^k \right) \quad \text{and} \quad f_0 = K(0), \tag{10}$$

one finds by induction that $f_N$ belongs to the closure of the linear span of the subset $\{K(z), |z| < \varepsilon < R\} \subset \mathcal{H}$. Hence, the coherent states $K(z)$, $z \in D_R$, span a dense linear subset of the Hilbert space $\mathcal{H}$. Therefore, the map $I_K : \mathcal{H} \to \mathcal{O}(D_R)$ defined by

$$I_K(v) := \langle v | K(\cdot) \rangle, \quad v \in \mathcal{H}, \tag{11}$$

is an anti-linear monomorphism of the vector space $\mathcal{H}$ into the vector space $\mathcal{O}(D_R)$ of holomorphic functions on the disc. Then one can identify the Hilbert space $\mathcal{H}$ with the subspace $I_K(\mathcal{H}) \hookrightarrow \mathcal{O}(D_R)$ and consider the analytic representation $I_K \circ \mathcal{A}_K \circ I_K^{-1}$ of the algebra $\mathcal{A}_K$. From the defining property (8) one concludes that the creation operator $A^\dagger$ in the analytic representation is realized as the operator of multiplication by the complex coordinate $z$,

$$(I_K \circ A^\dagger \circ I_K^{-1} \varphi)(z) = z \varphi(z), \tag{12}$$

where $\varphi \in I_K(\mathcal{H})$. The analytic representation

$$I_K \circ A \circ I_K^{-1} =: \partial \tag{13}$$

of the annihilation operator will be called a K-derivative. This terminology is justified by the fact that for the Heisenberg case ($C = \mathbb{1}$ and $c_n = \frac{1}{\sqrt{n!}}$) $\partial$ is the ordinary derivative with respect to the complex variable $z \in D_R$. The same motivation leads us to the definition of the K-exponential function

$$\mathrm{Exp}(\bar{v}, z) := \langle K(v) | K(z) \rangle \tag{14}$$

which, as it is easy to show, satisfies the equation

$$\partial\, \mathrm{Exp}(\bar{v}, z) = \bar{v}\, \mathrm{Exp}(\bar{v}, z). \tag{15}$$

Let us remark here that $\partial$ acts on the coordinate $z$, whereas we shall denote by $\bar{\partial}$ the K-derivative acting on the complex conjugate coordinate $\bar{z}$. From

$$I_K\!\left(c_n^{-1} (C^\dagger)^{-1} e_n\right)(z) = z^n \tag{16}$$

one concludes that the monomials $z^n$, $n \in \mathbb{N} \cup \{0\}$, belong to $I_K(\mathcal{H})$. So, one can apply the K-derivative to them:

$$\partial z^n = \frac{1}{c_n^2} \left\langle A (CC^\dagger)^{-1} f_n \,\middle|\, K(z) \right\rangle. \tag{17}$$

Substituting (7) into (17) one finds

$$\partial z^n = \frac{\langle A P_n f_n | K(z) \rangle}{\langle P_n f_n | P_n f_n \rangle}. \tag{18}$$

The general coherent states map is the superposition of the diagonal coherent states map (i.e. when $C = \mathbb{1}$) with the bounded operator $C : \mathcal{H} \to \mathcal{H}$. If $C$ is a unitary operator one deals with the change of an orthonormal basis only. Therefore, the diagonal case apart from its own interest is also crucial for the investigation of the general one. Let us now turn our attention to the diagonal case. For this purpose we denote $q_n := \left( \frac{c_{n-1}}{c_n} \right)^2$, $n \in \mathbb{N}$, and $q_0 = 0$. Then (9) and (17) reduce to

$$AA^\dagger e_n = q_{n+1} e_n, \qquad A^\dagger A e_n = q_n e_n \tag{19}$$

and

$$\partial z^n = q_n z^{n-1} \tag{20}$$

respectively. Taking into account the conditions (2) and the equality

$$c_n = \frac{c_0}{\sqrt{q_1 \cdots q_n}}, \tag{21}$$

we obtain the lower estimates

$$c_n \ge \frac{c_0}{\|A\|^n} \quad \text{and} \quad \|A\| \ge R \tag{22}$$

for the coefficients $c_n$ and the norm of the annihilation operator. If the sequence $\{q_n\}$ is convergent then $\lim_{n \to \infty} \sqrt{q_n} = R$. Since $A^\dagger A$ and $AA^\dagger$ are diagonalized by the orthonormal basis $\{e_n\}$, any relation $\mathcal{R}(q_{n+1}, q_n) = 0$ between the factors given by the function $\mathcal{R}$ of two real arguments is also valid for the operators:

$$\mathcal{R}(AA^\dagger, A^\dagger A) = 0. \tag{23}$$

If $\mathcal{R}(q_{n+1}, q_n) = q_{n+1} - \Phi(q_n)$, the factors $q_n = \underbrace{\Phi \circ \cdots \circ \Phi}_{n}(0)$ are defined iteratively.

Therefore, in that case the algebra $\mathcal{A}_K$ is determined by the relation (23). The subcase when $\Phi$ is a fractional map will be investigated in Sect. 4. The two subsequent examples concern the K-Heisenberg algebras which are related to q-deformed and (q, p)-deformed harmonic oscillators, see [21, 8].

Example 1. The simplest and most well known example (e.g. see [21]) of the algebra $\mathcal{A}_K$ is given by a linear map $\Phi(x) = qx + h$, where $0 \le q \le 1$ and $h \ge 0$. From $q_{n+1} = q q_n + h$ one obtains that the $q_n$-factors and the (q, h)-exponential function are given by

$$q_n = h\, \frac{1 - q^n}{1 - q}, \tag{24}$$

$$\mathrm{Exp}_{q,h}(\bar{v}, z) = \sum_{n=0}^{\infty} \frac{1}{n!_q} \left( \frac{\bar{v} z}{h} \right)^n, \tag{25}$$

where $z, v \in D_R$, $R = \sqrt{\frac{h}{1-q}}$ and $n!_q = 1(1 + q) \cdots (1 + \cdots + q^{n-1})$. The relation (23) is in this case identical with the q-deformed Heisenberg commutation relations

$$AA^\dagger - q A^\dagger A = h \tag{26}$$

and $\|A\| = \sqrt{\frac{h}{1-q}}$. The K-derivative has the following functional realization

$$(\partial_{q,h} \varphi)(z) = h\, \frac{\varphi(z) - \varphi(qz)}{(1 - q) z}. \tag{27}$$
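The statements of Example 1 can be checked numerically. The Python sketch below is an illustration with hypothetical function names and sample parameter values: it verifies the recursion $q_{n+1} = q q_n + h$ for (24), the action $\partial_{q,h} z^n = q_n z^{n-1}$ of the realization (27) on monomials, and the K-exponential equation (15) for a truncated version of the series (25).

```python
def q_factor(n, q, h):
    """q_n = h (1 - q^n) / (1 - q), formula (24); requires q != 1."""
    return h * (1 - q ** n) / (1 - q)

def k_derivative(phi, q, h):
    """Functional realization (27): h (phi(z) - phi(q z)) / ((1 - q) z)."""
    return lambda z: h * (phi(z) - phi(q * z)) / ((1 - q) * z)

def exp_qh(vbar, z, q, h, terms=80):
    """Truncated series (25): sum over n of (vbar z / h)^n / n!_q."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # n!_q grows by the factor (1 - q^(n+1)) / (1 - q) at each step
        term *= (vbar * z / h) / ((1 - q ** (n + 1)) / (1 - q))
    return total

q, h = 0.6, 0.25          # illustrative parameters
z, vbar = 0.3 + 0.2j, 0.2 - 0.1j

recursion_err = max(
    abs(q_factor(n + 1, q, h) - (q * q_factor(n, q, h) + h)) for n in range(10)
)
monomial_err = max(
    abs(k_derivative(lambda w, n=n: w ** n, q, h)(z) - q_factor(n, q, h) * z ** (n - 1))
    for n in range(1, 8)
)
exp_err = abs(
    k_derivative(lambda w: exp_qh(vbar, w, q, h), q, h)(z) - vbar * exp_qh(vbar, z, q, h)
)
```

The sample points are chosen inside the disc of radius $R = \sqrt{h/(1-q)}$, where the series converges.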

Example 2. Let us consider the coherent states map $K_{q,p} : D_R \to \mathcal{H} \setminus \{0\}$ defined by factors

$$q_n = h\, \frac{q^n - p^{-n}}{q - p^{-1}}, \tag{28}$$

where $0 < q < 1$ and $0 < p < 1$. Then we have

$$\mathrm{Exp}_{q,p}(\bar{v}, z) = \sum_{n=0}^{\infty} \frac{p^{\binom{n}{2}}}{n!_{qp}} (\bar{v} z)^n \tag{29}$$

and

$$(\partial_{q,p} \varphi)(z) = \frac{\varphi(qz) - \varphi(p^{-1} z)}{(q - p^{-1}) z}. \tag{30}$$

The relation for $A$ and $A^\dagger$ in this case is

$$(AA^\dagger - q A^\dagger A)^{\log q} = (AA^\dagger - p^{-1} A^\dagger A)^{\log p^{-1}}. \tag{31}$$
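A short numerical check of Example 2 is sketched below in Python. It assumes $h = 1$ in (28), which is the normalization under which (29) to (31) are stated without $h$; the function names and sample values are illustrative. The check of (31) is carried out on the eigenvalues of $AA^\dagger$ and $A^\dagger A$, which by (19) are $q_{n+1}$ and $q_n$ on the basis vector $e_n$.

```python
import math

def q_factor(n, q, p):
    """q_n = (q^n - p^(-n)) / (q - 1/p): formula (28) with h = 1 (an assumption)."""
    return (q ** n - p ** (-n)) / (q - 1 / p)

def qp_derivative(phi, q, p):
    """Functional realization (30)."""
    return lambda z: (phi(q * z) - phi(z / p)) / ((q - 1 / p) * z)

q, p = 0.7, 0.8           # illustrative parameters in (0, 1)
z = 0.4 - 0.1j

# Action on monomials: the (q, p)-derivative of z^n is q_n z^(n-1), as in (20).
monomial_err = max(
    abs(qp_derivative(lambda w, n=n: w ** n, q, p)(z) - q_factor(n, q, p) * z ** (n - 1))
    for n in range(1, 8)
)

# Relation (31) evaluated on eigenvalues: both bases are positive real numbers.
relation_err = 0.0
for n in range(8):
    lhs = (q_factor(n + 1, q, p) - q * q_factor(n, q, p)) ** math.log(q)
    rhs = (q_factor(n + 1, q, p) - q_factor(n, q, p) / p) ** math.log(1 / p)
    relation_err = max(relation_err, abs(lhs - rhs) / abs(rhs))
```

Under this normalization the two bases reduce to $p^{-n}$ and $q^n$, and both sides of (31) equal $\exp(-n \log p \log q)$.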

Let us introduce now the operators

$$(Q\varphi)(z) = \varphi(qz) \quad \text{and} \quad (P\varphi)(z) = \varphi(p^{-1} z). \tag{32}$$

Then instead of (31) one can take its “uniformized” version with the operator $Q$ as a “parameter”

$$A^\dagger A = \frac{1}{q - p^{-1}} (Q - Q^{-\alpha}), \qquad AA^\dagger = \frac{1}{q - p^{-1}} (qQ - (qQ)^{-\alpha}), \tag{33}$$

where $0 < \alpha = \frac{\log p}{\log q}$. Hence, the algebra $\mathcal{A}_{q,p}$ is included in a more general class of K-Heisenberg algebras investigated in Sect. 5 which are parametrized by functions $\mathcal{R}$, see (130). In the considered case the function $\mathcal{R}$ is given by

$$\mathcal{R}(x) = \frac{x - x^{-\alpha}}{q - p^{-1}} \tag{34}$$

for which $\lim_{x \to 0} \mathcal{R}(x) = +\infty$. Thus the exponential function $\mathrm{Exp}_{q,p}$ is defined on the whole complex plane $\mathbb{C} = D_R$ and the quantum algebra $\mathcal{A}_{q,p}$, like the Heisenberg algebra, is generated by unbounded operators. The algebra $\mathcal{A}_{q,p}$ and (q, p)-analysis are investigated in [8]. Below we will use the notation $\partial_q := \partial_{q,1}$, $\partial_1 := \partial_{1,1}$ and $\partial_0 := \partial_{0,1}$, where $\partial_{q,h}$ is given by (27). After a simple computation we find two formal expansions for the general K-derivative and the $\partial_0$-derivative

$$\partial = \sum_{k=1}^{\infty} q_k z^{k-1} (\partial_0 z - z \partial_0)\, \partial_0^{\,k}, \tag{35}$$

$$\partial_0 = \sum_{k=1}^{\infty} \frac{(-z)^k}{k!}\, \partial_1^{\,k}, \tag{36}$$


where $z^k$ acts on $\varphi \in I_K(\mathcal{H})$ as a multiplication operator. The series (35) and (36) for some concrete coherent states map $K : D_R \to \mathcal{H} \setminus \{0\}$ could be convergent in the weak operator topology. Since the standard derivative $\partial_1$ and the derivative $\partial_0$ have functional realizations, the expansions (35), (36) and (27) show that the K-derivative $\partial$ has a functional realization too. Substituting (36) into (35) one obtains the expression of the K-derivative $\partial$ by the standard differential operator of infinite rank. This exhibits the merit of the K-derivative analysis on the disc. Namely, making the computations in terms of $\partial$-analysis one can avoid the operations with extremely complicated standard (i.e. $\partial_1$) differential expressions. In order to see how it works in a particular case, let us compare the versions of the $*_{q,h}$-star product formula given by (42) with its $\partial_1$-version which one obtains after substituting (35) and (36).

3. Covariant Symbols and Reproducing Property

Among various quantizations there are two of them which are more or less familiar to physicists; these are the ∗-product quantization [3, 4, 5] and the positive operator measure method [28, 29]. In the case considered here the phase-space is taken to be a disc $D_R$ and the two above mentioned additional structures, i.e., the ∗-product and a positive operator measure, are defined by the coherent states map $K : D_R \to \mathcal{H} \setminus \{0\}$. In this section an explicit “K-analytic” formula of the ∗-product for any two $\mathbb{R}$-analytic physical quantities is delivered. The comparison of this formula (41) with the Berezin integral form of the ∗-product leads to an interesting integral-analytic identity for covariant symbols of operators from the quantum algebra $\mathcal{A}_K$. We derive also an explicit formula for the Moyal–Wick product using (41), for the cases of Example 1 and Example 2 of the previous section.

Apart from that we discuss such a reproducing measure for which the corresponding reproducing kernel is given by the K-exponential function (14), and we also reveal the relation of this reproducing measure with the positive operator measure via the resolution of identity. Finally we formulate the K-moments problem, the solution of which is then found in Sect. 6.

The purpose of this section is the description of the algebra $\mathcal{A}_K$ in terms of the ∗-product quantization (see [3–5]) on the disc. With the help of the coherent states map $K : D_R \to \mathcal{H} \setminus \{0\}$ one can define the covariant Berezin symbol (see [3, 4]) for each bounded operator and thus for $X \in \mathcal{A}_K$,

$$\langle X \rangle(\bar{z}, z) := \frac{\langle K(z) | X K(z) \rangle}{\langle K(z) | K(z) \rangle}. \tag{37}$$

Since $K$ is an analytic map, the symbol $\langle X \rangle$ is a real analytic function on the disc $D_R$; moreover, $X$ is determined by its symbol $\langle X \rangle$ uniquely. Therefore, the K-Heisenberg algebra $\mathcal{A}_K$ can be realized as the ∗-product algebra of the covariant symbols. For $X, Y \in \mathcal{A}_K$ the ∗-product of their symbols $\langle X \rangle$, $\langle Y \rangle$ is defined by

$$\langle X \rangle * \langle Y \rangle(\bar{z}, z) := \frac{\langle K(z) | XY K(z) \rangle}{\langle K(z) | K(z) \rangle}. \tag{38}$$

The star multiplication depends on the coherent states map and can be considered as a noncommutative deformation of the usual commutative multiplication of the real analytic functions $f, g \in \mathcal{O}_{\mathbb{R}}(\Omega)$ defined on the open subset $\Omega$ which contains $D_{\|A\|}$. In order to prove this, let us notice that the operators

$$:f(A^\dagger, A): \;:=\; \sum_{n,m=0}^{\infty} f_{nm}\, A^{\dagger n} A^m, \qquad :g(A^\dagger, A): \;:=\; \sum_{n,m=0}^{\infty} g_{nm}\, A^{\dagger n} A^m, \tag{39}$$

where $f_{nm}$ and $g_{nm}$ are coefficients in the Taylor expansion of $f$ and $g$, respectively, belong to $\mathcal{A}_K$. The functions $f$ and $g$ are covariant symbols of these operators

$$\left\langle :f(A^\dagger, A): \right\rangle = f \quad \text{and} \quad \left\langle :g(A^\dagger, A): \right\rangle = g. \tag{40}$$

Hence the star product $f * g$ has a well defined meaning for $f, g \in \mathcal{O}_{\mathbb{R}}(\Omega)$. Below we will use the normal ordering also for the analytic representation of $A$ and $A^\dagger$. Namely, the notations $:g(\partial, z):$ and $:f(\bar{z}, \bar{\partial}):$ mean that after the Taylor expansion, one keeps the K-derivatives $\partial$ and $\bar{\partial}$ on the right hand side of the coordinate multiplication operators. Applying (14) and (15) to (38) we find the ∗-product multiplication formula for $f, g \in \mathcal{O}_{\mathbb{R}}(\Omega)$

$$(f * g)(\bar{z}, z) = \frac{1}{\mathrm{Exp}(\bar{z}, z)} :g(\partial, z): \left[ f(\bar{z}, z)\, \mathrm{Exp}(\bar{z}, z) \right] = \frac{1}{\mathrm{Exp}(\bar{z}, z)} :f(\bar{z}, \bar{\partial}): \left[ g(\bar{z}, z)\, \mathrm{Exp}(\bar{z}, z) \right]. \tag{41}$$

The formulae (41) are extremely useful in the case when K-derivative ∂ and its complex conjugate are functionally realized. The above occurs, for example, in the cases described by Example 1 and Example 2 of the previous section. For those cases one can convert the expression (41) into the following ∗-product multiplication formulae (f ∗q,h g)(z, z) = (f ∗q,p,h g)(z, z) =

∞ X hk Expq,h (q k z, z) k k k (∂q f )(z, z)(∂ q Q g)(z, z), k!q Expq,h (z, z) k=0 ∞ X k=0

(42)

hk k(k−1) Expq,p,h (p−k z, z) × p 2 k!qp Expq,p,h (z, z)

k

k

k ×∂ q,p (g(z, zQ))Q ∂q,p f.

(43)

The dash over $Q$ or $P$ means that the operator acts on $\bar{z}$ only. To obtain (42) and (43) from (41) we used the formulae

$$\partial_q^{\,n}(fg) = \sum_{k=0}^{n} \binom{n}{k}_{\!q} (Q^k \partial_q^{\,n-k} f)\, \partial_q^{\,k} g, \qquad \partial_{q,p}^{\,n}(fg) = \sum_{k=0}^{n} \binom{n}{k}_{\!qp} (Q^k \partial_{q,p}^{\,n-k} f)\, \partial_{q,p}^{\,k} P^{n-k} g, \tag{44}$$

where $\binom{n}{k}_{\!q} = \frac{n!_q}{k!_q\,(n-k)!_q}$ is the q-Newton symbol. The formulae (42) and (43) give the q and (q, p) deformations, respectively, of the Moyal–Wick star product. In the limit they reproduce the Poisson bracket

$$\{f, g\} = \lim_{\substack{q,p \to 1 \\ h \to 0}} \frac{1}{h} \left[ f *_{q,p,h} g - g *_{q,p,h} f \right] \tag{45}$$

of the functions. Therefore, the above results provide examples of the ∗-product quantization (see [3, 4, 5]) on the disc. In (42) and (43) one can replace the derivative operators $\partial_{q,h}$ and $\partial_{q,p,h}$ by the standard derivative $\partial_1$ if one uses the formulae (35) and (36). Consequently we arrive at $\partial_1$-differential expressions for the considered examples of ∗-product formulae, which are usually expressed in a much more complicated manner.

Let us discuss now the quantum probabilistic aspects of the algebra $\mathcal{A}_K$. This is related to the integral realization of the scalar product in the Hilbert space $I_K(\mathcal{H})$. Therefore, we shall assume the existence of a positive regular measure $\mu$ on the disc $D_R$ for which the resolution of the identity

$$\mathbb{1} = \int_{D_R} P(\bar{z}, z)\, d\mu(\bar{z}, z), \tag{46}$$

where

$$P(\bar{z}, z) = \frac{|K(z)\rangle \langle K(z)|}{\langle K(z) | K(z) \rangle}, \tag{47}$$

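holds. Thus, identifying the space $I_K(\mathcal{H})$ with a subspace of the Hilbert space $L^2(D_R, \frac{1}{\mathrm{Exp}} d\mu)$, the scalar product takes the form

$$\langle \varphi | \psi \rangle = \int_{D_R} \overline{\varphi(z)}\, \psi(z)\, \frac{1}{\mathrm{Exp}(\bar{z}, z)}\, d\mu(\bar{z}, z) \tag{48}$$

for $\varphi, \psi \in I_K(\mathcal{H}) \subset \mathcal{O}(D_R)$. Functions from $I_K(\mathcal{H})$ are distinguished in $L^2(D_R, \frac{1}{\mathrm{Exp}} d\mu)$ by the reproducing property

$$\varphi(z) = \int_{D_R} \varphi(v)\, \mathrm{Exp}(\bar{v}, z)\, \frac{1}{\mathrm{Exp}(\bar{v}, v)}\, d\mu(\bar{v}, v), \tag{49}$$

where the K-exponential function Exp plays the role of the reproducing kernel. The resolution of the identity allows one to express the operator $X \in L(\mathcal{H})$ in terms of its covariant symbol (see [23])

$$X = \int_{D_R} d\mu(\bar{z}, z) \int_{D_R} d\mu(\bar{v}, v)\, \langle X \rangle(\bar{z}, v)\, P(\bar{z}, z) P(\bar{v}, v). \tag{50}$$

From (46) one finds the integral formula for the ∗-product of covariant symbols,

$$(\langle X \rangle * \langle Y \rangle)(\bar{z}, z) = \int_{D_R} \langle X \rangle(\bar{z}, v)\, \langle Y \rangle(\bar{v}, z)\, |a(\bar{z}, v)|^2\, d\mu(\bar{v}, v), \tag{51}$$

where

$$a(\bar{z}, v) = \frac{\mathrm{Exp}(\bar{v}, z)}{\mathrm{Exp}(\bar{z}, z)^{\frac{1}{2}}\, \mathrm{Exp}(\bar{v}, v)^{\frac{1}{2}}} \tag{52}$$

is the transition amplitude between the coherent states $[K(z)]$ and $[K(v)]$. Interpreting the covariant symbols $\langle X \rangle$ and $\langle Y \rangle$ as the mean value functions of the operators $X$ and $Y$ respectively and $|a(\bar{z}, v)|^2$ as the density of the transition probability, one obtains a quantum probabilistic interpretation of the above formulae, see e.g. [23, 24]. It was mentioned in Sect. 2 that the monomials $\{z^n\}$, $n \in \mathbb{N} \cup \{0\}$, belong to $I_K(\mathcal{H})$. Hence, one can evaluate the scalar product on them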

$$\int_{D_R} \bar{z}^m z^n\, \frac{1}{\mathrm{Exp}(\bar{z}, z)}\, d\mu(\bar{z}, z) = \frac{1}{c_n c_m} \left\langle e_n \,\middle|\, (CC^\dagger)^{-1} e_m \right\rangle. \tag{53}$$

For the special case $C = \mathbb{1}$ (53) reduces to

$$\int_{D_R} \bar{z}^m z^n\, \frac{1}{\mathrm{Exp}(\bar{z}, z)}\, d\mu(\bar{z}, z) = \frac{1}{|c_n|^2}\, \delta_{nm}. \tag{54}$$

Upon passing to polar coordinates $z = \sqrt{x}\, e^{i\varphi}$ one can reduce (54) to the problem of moments on the interval $[0, R^2]$

$$\int_0^{R^2} x^n\, \frac{1}{\mathrm{Exp}\, x}\, d\nu(x) = \frac{1}{c_n^2} \tag{55}$$

if one puts $d\mu(\bar{z}, z) = d\nu(x)\, d\varphi$. The equality (55) could be treated as the defining property of the measure $\nu$. This is the famous moment problem of finding $\nu$ from the knowledge of the fixed moments (see [1]). Let us now reformulate the moment problem in a way more adequate for K-analysis. Namely, in (55) we replace the integration with respect to the measure $\frac{1}{\mathrm{Exp}} d\nu$ by the K-integration

$$I x^n := \frac{1}{q_{n+1}}\, x^{n+1} \tag{56}$$

with some unknown analytic weight function $\sigma$. The K-integral $I$ is defined as the right inverse of the K-differential

$$\partial I = \mathbb{1}. \tag{57}$$

We represent the function $\sigma$ by its Taylor expansion

$$\sigma(x) = \sum_{n=0}^{\infty} a_n x^n, \tag{58}$$

where

$$\limsup_{n \to \infty} \sqrt[n]{|a_n|} < \frac{1}{R}. \tag{59}$$

(We consider the case when $R < \infty$.) Then, instead of looking for a measure $d\nu$ which satisfies (55) we will be looking for an analytic function $\sigma$ which satisfies the moment condition

$$I x^n \sigma(x) \big|_{x = R^2} = \frac{1}{c_n^2} \tag{60}$$

for $n = 0, 1, \ldots$. Since one can always rescale: $x \to \frac{1}{R^2} x$, $q_n \to \frac{1}{R^2} q_n$ and $a_k \to R^{2k} a_k$, we assume in (60), without loss of generality, that $R = 1$. Finally, we obtain the following infinite system of equations:

$$\sum_{k=0}^{\infty} \frac{1}{q_{n+k+1}}\, a_k = q_1 \cdots q_n, \qquad \sum_{k=0}^{\infty} \frac{1}{q_{k+1}}\, a_k = 1 \tag{61}$$


A. Odzijewicz

for the coefficients $a_k$, where $n \in \mathbb{N}$. The infinite matrix $[q_{n+k+1}]_{n,k=0}^{\infty}$ is a matrix of the Hankel type. In Sect. 6 we discuss in more detail the case when the factor $q_n$ is an analytic function of $q^n$, where $0 < q < 1$. This is so, for example, in the case presented in Example 1, where $q_n = \frac{1-q^n}{1-q}$. Here the moment problem (61) is solved by the function
\[ \sigma(x) = \frac{1}{1-q}\cdot\frac{1}{\mathrm{Exp}_q\!\left(\frac{q}{1-q}\,x\right)}. \tag{62} \]
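The system (61) can be checked numerically in a simple normalized instance. The sketch below is only an illustration under stated assumptions: it takes the rescaled factors $q_m = 1 - q^m$ (so that $R = 1$) and the candidate coefficients $a_k = (-1)^k q^{k(k+1)/2}/(q;q)_k$, i.e. $\sigma(x) = (qx;q)_\infty$. These choices, and the helper names `qpoch`, `moment_lhs` and `moment_rhs`, are not taken from the text and do not reproduce the normalization of (62).

```python
def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a; q)_n = prod_{j<n} (1 - a q^j)."""
    result = 1.0
    for j in range(n):
        result *= 1.0 - a * q**j
    return result


def moment_lhs(n, q, terms=80):
    """Truncated left-hand side of (61): sum_k a_k / q_{n+k+1}.

    Assumptions (illustrative only): q_m = 1 - q^m and
    a_k = (-1)^k q^{k(k+1)/2} / (q; q)_k.
    """
    total = 0.0
    for k in range(terms):
        a_k = (-1) ** k * q ** (k * (k + 1) // 2) / qpoch(q, q, k)
        total += a_k / (1.0 - q ** (n + k + 1))
    return total


def moment_rhs(n, q):
    """Right-hand side of (61): q_1 ... q_n = (q; q)_n for q_m = 1 - q^m."""
    return qpoch(q, q, n)
```

For $n = 0$ the right-hand side is the empty product, so the same function also covers the second equation in (61).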

For a detailed discussion of this case, and of others related to the basic hypergeometric series, see Sect. 6.

4. The Algebras Generated by Fractional Maps

The study of this class of algebras is motivated by [17]. The authors of that paper investigate exhaustively the parabolic case (see Subsect. B) and interpret the corresponding quantum algebra as the quantum disc. Here we show that the hyperbolic case (see Subsect. C) is the q-deformation of their model. In this section we treat briefly all three subcases which emerge. We also present explicit forms of the reproducing kernels and of the reproducing measures corresponding to these kernels. Finally, we give explicit formulae for the ∗-product. The formulae for the hyperbolic case seem to be new.

An interesting three-parameter class of quantum algebras is related to the family of fractional maps
\[ \varphi_g(x) = \frac{ax + b}{cx + d}, \tag{63} \]
where $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{Mat}_{2\times 2}(\mathbb{R})$. For the existence of a fixed point of the map $\varphi_g$ we assume, without loss of generality, that $(\mathrm{Tr}\,g)^2 \ge 4\det g$. Let us also assume that $\det g = 0$ or $\det g = \pm 1$. Finally, one obtains three subcases:

A. The case $\det g = 0$, for which $\varphi_g$ is a constant map.
B. The parabolic case: $\det g = 1$ and $(\mathrm{Tr}\,g)^2 = 4$, i.e. $\varphi_g$ has one fixed point.
C. The hyperbolic case: $\det g = -1$, or $\det g = 1$ and $(\mathrm{Tr}\,g)^2 > 4$, i.e. $\varphi_g$ has two fixed points.

The subcase of affine maps $\varphi_g(x) = qx + h$, i.e. $c = 0$, $q = \frac{a}{d}$ and $h = \frac{b}{d}$, which intersects with all the subcases A, B and C, has been described as an example in the two previous sections. Henceforth we shall assume that $c \neq 0$. Although the case A is a subcase of the affine one, we prefer to consider it independently, because of its exceptional character. The hyperbolic case is generic and it is possible to obtain the parabolic subcase from it by a limiting process.

A. Let us put $\varphi_g(x) = h > 0$. Then $q_n = \underbrace{\varphi_g \circ\cdots\circ \varphi_g}_{n}(0) = h$ and the coherent states map $K_h : D_{\sqrt{|h|}} \to H\setminus\{0\}$ is given by
\[ K_h(z) = c_0 \sum_{n=0}^{\infty} h^{-\frac{n}{2}}\,z^n e_n. \tag{64} \]

Since the case considered is specialized from the q-Heisenberg algebra by $q = 0$, from (25), (26), (27) and (43) one obtains
\[ \mathrm{Exp}_h(\bar z, z) = \frac{1}{1 - \frac{\bar z z}{h}}, \tag{65} \]
\[ AA^\dagger = h, \tag{66} \]
\[ \partial = \partial_0, \tag{67} \]
\[ (f *_h g)(\bar z, z) = \sum_{k=0}^{\infty} h^k \Big(1 + \frac{\bar z z}{h}\Big)(\partial_0^k f)(\bar z, z)\,(\bar\partial_0^k g)(0, z) \tag{68} \]

respectively. The reproducing measure $\frac{1}{\mathrm{Exp}_h}\,d\mu_h$ is given by
\[ \frac{1}{\mathrm{Exp}_h}\,d\mu_h = \lim_{q\to 0} \frac{1}{\mathrm{Exp}_{q,h}}\,d\mu_{q,h}, \tag{69} \]

where $d\mu_{q,h}$ is the reproducing measure for the affine case, see (62). We conclude from the above that the Hilbert space $L^2O(D_{\sqrt{|h|}}, \frac{1}{\mathrm{Exp}_h}\,d\mu_h)$ of square integrable complex analytic functions on $D_{\sqrt{|h|}}$ is a Hardy–Lebesgue space, and that the reproducing property (49) becomes, in the case considered, the Cauchy integral formula.

B. Since the parabolic case is characterized by the fractional map $\varphi_g$ which possesses only one fixed point $r = \varphi_g(r) > 0$, we will use the following parametrization of the matrix $g$:
\[ g(\mu, r) = \begin{pmatrix} 1-\mu & r\mu \\ -\frac{\mu}{r} & 1+\mu \end{pmatrix}, \tag{70} \]
where $\mu \in \mathbb{R}\setminus\{0\}$. From the property $g(\mu, r)^n = g(n\mu, r)$, $n \in \mathbb{Z}$, one finds
\[ q_n = \varphi_{g^n}(0) = r\,\frac{n}{n + \frac{1}{\mu}} \tag{71} \]
and
\[ c_n^2 = r^{-n}\,\frac{(k)_n}{(1)_n}\,c_0^2, \tag{72} \]
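The relations (70)–(72) lend themselves to an exact check with rational arithmetic. The following sketch is an illustration only; the helper names (`g_parabolic`, `matmul`, `matpow`, `mobius`, `rising`) are hypothetical, and $c_0 = 1$ is assumed so that $c_n^2 = 1/(q_1\cdots q_n)$ with $k = 1 + \frac{1}{\mu}$.

```python
from fractions import Fraction


def g_parabolic(mu, r):
    """Matrix g(mu, r) of (70), stored as a tuple of rows."""
    return ((1 - mu, r * mu), (-mu / r, 1 + mu))


def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def matpow(m, n):
    """n-th power (n >= 0) of a 2x2 matrix by repeated multiplication."""
    result = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for _ in range(n):
        result = matmul(result, m)
    return result


def mobius(m, x):
    """Fractional map phi_g(x) = (a x + b) / (c x + d) of (63)."""
    (a, b), (c, d) = m
    return (a * x + b) / (c * x + d)


def rising(k, n):
    """Pochhammer symbol (k)_n = k (k + 1) ... (k + n - 1)."""
    result = Fraction(1)
    for j in range(n):
        result *= k + j
    return result
```

With, say, `mu = Fraction(1, 3)` and `r = Fraction(5, 2)`, one can compare `matpow(g_parabolic(mu, r), n)` with `g_parabolic(n * mu, r)` and `mobius(..., 0)` with the closed form (71).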

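where $k = 1 + \frac{1}{\mu}$ and $(k)_n = k(k+1)\cdots(k+n-1)$. Thus the coherent states map and the exponential function for the parabolic case are given by
\[ K_{r,\mu}(z) = c_0 \sum_{n=0}^{\infty} \sqrt{\frac{(k)_n}{(1)_n}}\,\Big(\frac{z}{\sqrt{r}}\Big)^n e_n \tag{73} \]
and
\[ \mathrm{Exp}_{r,\mu}(\bar z, z) = c_0^2 \sum_{n=0}^{\infty} \frac{(k)_n (1)_n}{(1)_n\,n!}\Big(\frac{\bar z z}{r}\Big)^n = c_0^2\;{}_2F_1\!\Big(\begin{matrix} k,\ 1 \\ 1 \end{matrix};\ \frac{\bar z z}{r}\Big) = c_0^2\Big(1 - \frac{\bar z z}{r}\Big)^{-k}, \tag{74} \]
respectively. The Gauss hypergeometric series is convergent for $z \in D_R$, where $R = \sqrt{r}$, and $k > 0$. The moment problem (55) for the coefficients (72) is solved by the measure
\[ d\mu_{k,r}(\bar z, z) = \frac{k-1}{\pi r}\Big(1 - \frac{\bar z z}{r}\Big)^{-2} d(\varrho^2)\,d\varphi, \tag{75} \]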


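where $z = \varrho e^{i\varphi}$. Therefore the Hilbert space $L^2O(D_R, \frac{1}{\mathrm{Exp}_{k,r}}\,d\mu_{k,r})$ of square integrable complex analytic functions on $D_R$ is the Hilbert space of the discrete series representation
\[ (U_{k,r}(g)\varphi)(z) = \Big(\bar b\,\frac{z}{\sqrt{r}} + \bar a\Big)^{-k} \varphi(\sigma_g(z)) \tag{76} \]
of the group $SU(1,1) \ni g = \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$, $|a|^2 - |b|^2 = 1$, where $\sigma_g(z) = \dfrac{az + \sqrt{r}\,b}{\frac{1}{\sqrt{r}}\,\bar b z + \bar a}$. From the irreducibility of the representation $U_{k,r}$ and the equivariance
\[ U_{k,r}(g)\circ I_K\circ K_{k,r} = I_K\circ K_{k,r}\circ\sigma(g) \qquad \forall\, g \in SU(1,1) \tag{77} \]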

of the coherent states map $[K_{k,r}] : D_R \to \mathbb{CP}(H)$, it follows that $I_K(H) \cong L^2O(D_R, d\mu_{k,r})$. Here $\mathbb{CP}(H)$ denotes the complex projective space over $H$ and $[K_{k,r}]$ is what we call the projectivisation of $K_{k,r}$. Equivalently, the discrete series representation $U_{k,r}$ can be considered as a homomorphism of the group $SU(1,1)$ into the group of invertible elements of the algebra $A_{K_{k,r}}$. Namely, in the covariant symbols representation one has
\[ \langle U_{k,r}(g_1 g_2)\rangle = \langle U_{k,r}(g_1)\rangle * \langle U_{k,r}(g_2)\rangle, \tag{78} \]
where the symbols are given by
\[ \langle U_{k,r}(g)\rangle(\bar z, z) = \frac{\mathrm{Exp}_{k,r}(\bar z, \sigma_g(z))}{\mathrm{Exp}_{k,r}(\bar z, z)}. \tag{79} \]
The relation between the annihilation and creation operators of the algebra $A_{K_{k,r}}$ is given by
\[ (1+\mu)AA^\dagger - (1-\mu)A^\dagger A = r\mu J + \frac{\mu}{r}\,AA^\dagger A^\dagger A. \tag{80} \]
For the case $r = 1$, see [17]. The r-deformation is not essential, because it can be eliminated by the scale transformation $z \to \frac{1}{\sqrt{r}}\,z$. The K-derivative $\partial_{\mu,r} = I_{K_{\mu,r}}\circ A\circ I_{K_{\mu,r}}^{-1}$ possesses the following functional realization
\[ \partial_{\mu,r} = r\mu\,\partial_1\,\frac{1}{\mu z\partial_1 + 1} = r\mu\,\frac{1}{\mu z\partial_1 + \mu + 1}\,\partial_1 \tag{81} \]

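including the standard derivative. Hence, in the parabolic case we can specialize the star product formula (41) to the form
\[ (f *_{r,\mu} g)(\bar z, z) = \frac{1}{\mathrm{Exp}_{r,\mu}(\bar z, z)} \sum_{i=0}^{\infty}\sum_{n=0}^{\infty} \frac{r^i}{i!}\,\bar\partial_1^{\,i}\big(g_n(z)\bar z^n\big) \times \Big\{\frac{1}{(z\partial_1 + k)_n}\big[(\partial_1^{\,i} f)\,(z\partial_1 + k)_{n-i}\,\mathrm{Exp}_{r,\mu}\big]\Big\}(\bar z, z). \tag{82} \]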

C. The matrix $g$ which defines the hyperbolic fractional map $\varphi_g$ has the spectral decomposition
\[ g = q^{\frac12} P_+ + q^{-\frac12} P_-, \tag{83} \]


where $q \in \mathbb{R}\setminus\{0\}$ and $P_+^2 = P_+$, $P_-^2 = P_-$, $P_+P_- = P_-P_+ = 0$. Let us take the following parametrization
\[ t = \frac{r_1}{r_1 - r_2}, \qquad x = \frac{r_1 r_2}{r_1 - r_2}, \qquad y = \frac{1}{r_1 - r_2}, \qquad z = \frac{r_2}{r_1 - r_2} \tag{84} \]
of the projectors $P_+ = \begin{pmatrix} t & -x \\ y & -z \end{pmatrix}$ and $P_- = \begin{pmatrix} -z & x \\ -y & t \end{pmatrix}$, where $zt = yx$ and $(t - z)^2 = 1$. The following properties are immediate:
\[ \varphi_g(r_2) = r_2, \qquad \varphi_g(r_1) = r_1, \tag{85} \]
\[ g(q, r_1, r_2)^n = g(q^n, r_1, r_2), \tag{86} \]
\[ P_+(r_1, r_2) = P_-(r_2, r_1), \tag{87} \]
\[ g(q, r_1, r_2) = g(q^{-1}, r_2, r_1). \tag{88} \]
Because of (87), it is enough to consider the case $r_2 < r_1$ only. From (86) one finds
\[ q_n = \varphi_{g^n}(0) = r_2\,\frac{1 - q^n}{1 - \frac{r_2}{r_1}q^n}, \tag{89} \]
which gives
\[ c_n^2 = c_0^2\,\frac{1}{q_1\cdots q_n} = c_0^2\,r_2^{-n}\,\frac{(\frac{r_2}{r_1}q; q)_n}{(q; q)_n}. \tag{90} \]
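The properties (85), (86) and the closed form (89) can be verified exactly for sample parameters. The sketch below is an illustration under the assumption that $q = s^2$ with a rational $s$, so that $g = sP_+ + s^{-1}P_-$ has rational entries; the function names are hypothetical. It also checks the recurrence $q_{n+1} = \varphi_g(q_n)$, which follows from $\varphi_{g^{n+1}} = \varphi_g\circ\varphi_{g^n}$.

```python
from fractions import Fraction


def projectors(r1, r2):
    """Projectors P_+ and P_- built from the parametrization (84)."""
    t = r1 / (r1 - r2)
    x = r1 * r2 / (r1 - r2)
    y = 1 / (r1 - r2)
    z = r2 / (r1 - r2)
    return ((t, -x), (y, -z)), ((-z, x), (-y, t))


def g_hyperbolic(s, r1, r2):
    """g = s P_+ + s^{-1} P_- with s = q^{1/2}, cf. (83)."""
    p_plus, p_minus = projectors(r1, r2)
    return tuple(
        tuple(s * p_plus[i][j] + p_minus[i][j] / s for j in range(2))
        for i in range(2)
    )


def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mobius(m, x):
    """Fractional map phi_g(x) = (a x + b) / (c x + d) of (63)."""
    (a, b), (c, d) = m
    return (a * x + b) / (c * x + d)


def q_factor(n, q, r1, r2):
    """Closed form (89) for q_n."""
    return r2 * (1 - q**n) / (1 - (r2 / r1) * q**n)
```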

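Hence, in the hyperbolic case the coherent states map is
\[ K_{q,r_1,r_2}(z) = c_0 \sum_{n=0}^{\infty} \Big[ r_2^{-n}\,\frac{(\frac{r_2}{r_1}q; q)_n}{(q; q)_n} \Big]^{\frac12} z^n e_n. \tag{91} \]
The exponential function $\mathrm{Exp}_{q,r_1,r_2}$ is given by the basic hypergeometric series
\[ \mathrm{Exp}_{q,r_1,r_2}(\bar v, z) = c_0^2 \sum_{n=0}^{\infty} \frac{(\frac{r_2}{r_1}q; q)_n}{(q; q)_n}\Big(\frac{\bar v z}{r_2}\Big)^n = c_0^2\;{}_2\Phi_1\!\Big(\begin{matrix} \frac{r_2}{r_1}q,\ q \\ q \end{matrix};\ q;\ \frac{\bar v z}{r_2}\Big), \tag{92} \]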

$z, v \in D_R$, $R = \sqrt{r_2}$. The moment problem for the coefficients (90) is solved by applying the q-analogue of the Euler integral representation of the hypergeometric series (see [7]). Then the exponential function $\mathrm{Exp}_{q,r_1,r_2}$ satisfies the reproducing property (49) with the measure $\frac{1}{\mathrm{Exp}_{q,r_1,r_2}}\,d\mu_{q,r_1,r_2}$, where

dµq,r1 ,r2 (z, z) =

∞ zz 1 − q ( r2 q; q)∞ X n zz q δ( − q n )d(zz)dϕ. r2 πc0 2 r22 ( zz q; q) ∞ n=0 r

(93)

1

The algebraic relation for $A$ and $A^\dagger$ assumes the form
\[ (q^{\frac12} - q^{-\frac12})\,[AA^\dagger A^\dagger A + r_1 r_2 J] = (q^{\frac12} r_1 - q^{-\frac12} r_2)\,A^\dagger A - (q^{-\frac12} r_1 - q^{\frac12} r_2)\,AA^\dagger. \tag{94} \]


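The norm of $A$ is $\|A\| = \sqrt{(1-q)r_2}$. The K-derivative in the hyperbolic case is
\[ \partial = r_2(1-q)\,\partial_q\,\frac{1}{1 - \frac{r_2}{r_1}Q} \tag{95} \]
and
\[ \partial^n = \frac{((1-q)r_2)^n}{(\frac{r_2}{r_1}qQ; q)_n}\,\partial_q^{\,n}. \tag{96} \]
Hence, the Leibniz formula for $\partial$ takes the following form:
\[ \partial^n(f\cdot g) = \frac{((1-q)r_2)^n}{(\frac{r_2}{r_1}qQ; q)_n} \sum_{k=0}^{n} \binom{n}{k}_{\!q} (Q^k\partial_q^{\,n-k} f)\,\partial_q^{\,k} g. \tag{97} \]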

Using the identity
\[ \frac{1}{(x; q)_n} = \sum_{i=0}^{\infty} \binom{n-1+i}{n-1}_{\!q} x^i, \tag{98} \]

for x =

r2 r1 qQ,

and formulae (zz; q)k Expq,r1 ,r2 (zz), ( rr21 qzz; q)k

(99)

(zz; q)n r2 qQ; q)n Expq,r1 ,r2 )(zz) = r2 Expq,r1 ,r2 (zz) r1 ( r1 qzz; q)n

(100)

(Qk Expq,r1 ,r2 )(zz) = ((

we obtain the ∗-product formula for the hyperbolic subcase (f ∗ g)(z, z) =

∞ X k,i=0

r2 1 (q )i (zz; q)i+k (1k,i g)(z, z)(Qi ∂q k f )(z, z), k!q i!q r1

(101)

r 1 − q n (n + i − 1)!q ( r21 q; q)n−k 1 − q (n − k)!q ( rr21 qzz; q)n+i

(102)

where (1k,i g)(z, z)

∞ X n=k

gn (z)z n−k [(1 − q)r2 ]n

and $g(\bar z, z) = \sum_{n=0}^{\infty} g_n(z)\bar z^n$. The case described in Subsect. A is related to the theory of Hardy spaces. The essence of this theory relies on the fertile interplay of functional analysis and the theory of analytic functions of one variable, see [22]. Many constructions, concepts and statements of Hardy space theory can be transferred to the parabolic and hyperbolic quantum algebra $A_K$ cases. The hyperbolic case may be viewed as a q-deformation of Hardy space theory. Altogether one is led into a domain of interesting and nontrivial mathematics. This is to be developed further in a subsequent paper.
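The expansion (98) of $1/(x;q)_n$ in Gaussian binomial coefficients, used above for the hyperbolic ∗-product, is easy to confirm numerically for scalar $x$. The sketch below is an illustration only (helper names are hypothetical, and the series is truncated, which is harmless for $|x| < 1$).

```python
def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a; q)_n = prod_{j<n} (1 - a q^j)."""
    result = 1.0
    for j in range(n):
        result *= 1.0 - a * q**j
    return result


def qbinom(n, k, q):
    """Gaussian binomial coefficient [n choose k]_q."""
    return qpoch(q, q, n) / (qpoch(q, q, k) * qpoch(q, q, n - k))


def inverse_qpoch_series(x, q, n, terms=200):
    """Truncated right-hand side of (98): sum_i [n-1+i choose n-1]_q x^i."""
    return sum(qbinom(n - 1 + i, n - 1, q) * x**i for i in range(terms))
```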


5. Algebras Related to Meromorphic Functions and Basic Hypergeometric Series

In what follows we shall focus our attention on a class of coherent states maps and the quantum algebras related to them. This class is parametrized by the meromorphic functions $R$ defined on $\mathbb{C}$. We shall assume that $R$ may have an isolated singularity at zero, i.e.
\[ R(z) = \sum_{k=-N}^{\infty} r_k z^k, \tag{103} \]
where $N \in \mathbb{N}\cup\{0\}$, $r_{-N} \neq 0$. Additionally, we shall assume in what follows that $R(q^n) > 0$ for $n > 0$, $R(1) = 0$ and $R(0) > 0$. If one admits that $R(q^n) = 0$ for some $n \in \mathbb{N}$, then the algebra $A_K$ will be realized by endomorphisms of a finite dimensional vector space. We shall not consider this case in our paper. Once $R$ is given, we define the K-derivative according to
\[ \partial_R := \partial_q\,\frac{1-q}{1-Q}\,R(Q) = \frac{1-q}{1-qQ}\,R(qQ)\,\partial_q. \tag{104} \]

This K-derivative is a natural generalization of the K-derivative (95) described in Subsect. C of the previous section. As in that case, one can carry out the program of finding explicit formulae for the other objects of the theory, i.e. the exponential function, the relation between the annihilation and creation operators, the reproducing measure and the ∗-product. The reason for this is the fact that (104) gives an explicit functional realization of the K-derivative. The main results of this section are: a generalization of the q-binomial theorem (see Proposition 1); the explicit realization (see (136)) of the ∗-product formula (41) for the “R-class” of physical systems; and, finally, the normal ordering for the operator $AA^\dagger$. Applying the derivative (104) to the monomials $z^n$ one finds
\[ q_n = R(q^n). \tag{105} \]
Then, from (14) and (21) one has
\[ \mathrm{Exp}_R(\bar v, z) = \sum_{n=0}^{\infty} \frac{1}{R(q)\cdots R(q^n)}\,(\bar v z)^n, \tag{106} \]
where we put $c_0 = 1$, which is equivalent to the normalization $\mathrm{Exp}_R(0) = 1$. By definition we assume that for $n = 0$ the coefficient in (106) is equal to 1. The convergence radius of the series (106) is equal to $R(0)$. Thus it is infinite in the case when $N > 0$. Since $\mathrm{Exp}_R$ depends on the product $\bar v z$ only, we shall put $\bar v = 1$. Substituting $\mathrm{Exp}_R$ and $\partial_R$ into Eq. (15) then gives
\[ \Big[\frac{1-q}{1-qQ}\,R(qQ)\,\partial_q\,\mathrm{Exp}_R\Big](z) = \mathrm{Exp}_R(z). \tag{107} \]
After a simple calculation one can then reduce (107) to the following q-difference equation
\[ R(Q)\,\mathrm{Exp}_R(z) = z\,\mathrm{Exp}_R(z), \tag{108} \]
of infinite rank in general.


Proposition 1. The following generalization of the q-binomial theorem is true:
\[ \mathrm{Exp}_R(z) = \sum_{n=0}^{\infty} \frac{1}{R(q)\cdots R(q^n)}\,z^n = \prod_{n=0}^{\infty} \big(1 - F(zq^n)\,G(Q)\big)\cdot 1, \tag{109} \]
where
\[ F(z) = \frac{z}{z - R(0)}, \qquad G(Q) = \frac{(Q-1)R(qQ) + (1-qQ)R(0)}{Q\,R(qQ)} \tag{110} \]
for $N = 0$, and
\[ F(z) = z, \qquad G(Q) = \frac{qQ - 1}{Q\,R(qQ)} \tag{111} \]
for $N > 0$.

Proof. From the preceding considerations it follows that the left-hand side of formula (109) is a solution of Eq. (108). However, let us solve it again by the iteration method. In order to do so, we reformulate (108), ending up with the following form:
\[ \mathrm{Exp}_R(z) = \big[(1 - F(z)G(Q))\,\mathrm{Exp}_R\big](qz), \tag{112} \]
where the functions $F$ and $G$ are given by (110) or (111), depending on whether the function $R$ is regular ($N = 0$) or singular ($N > 0$) at $z = 0$. Since
\[ \partial_q = \frac{1}{z(1-q)}\,(1 - Q), \tag{113} \]
one can rewrite (108) as follows:
\[ [(1-Q)\,\mathrm{Exp}_R](z) = z\Big[\frac{1-qQ}{R(qQ)} - \frac{1}{R(0)}\Big]\frac{1}{Q}\,\mathrm{Exp}_R(qz) + z\,\frac{1}{R(0)}\,\mathrm{Exp}_R(z) \tag{114} \]
for $N = 0$, or
\[ [(1-Q)\,\mathrm{Exp}_R](z) = z\,\frac{1-qQ}{R(qQ)\,Q}\,\mathrm{Exp}_R(qz) \tag{115} \]
for $N > 0$. The above expressions can be transformed into
\[ \mathrm{Exp}_R(z) = \Big[1 - \frac{z}{z - R(0)}\Big(1 + \frac{1}{Q}\Big(\frac{1-qQ}{R(qQ)}\,R(0) - 1\Big)\Big)\Big]\mathrm{Exp}_R(qz) \tag{116} \]
or
\[ \mathrm{Exp}_R(z) = \Big[\Big(1 - z\,\frac{qQ - 1}{Q\,R(qQ)}\Big)\mathrm{Exp}_R\Big](qz), \tag{117} \]
respectively. This proves the formula (112). We now arrive at the right-hand side of the identity (109) via iteration of (112).


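Substituting the function
\[ R(x) = \frac{1-x}{1-q} \tag{118} \]
into (109) one obtains the Euler formula
\[ \frac{1}{(z; q)_\infty} = \sum_{n=0}^{\infty} \frac{1}{(q; q)_n}\,z^n. \tag{119} \]
If one puts
\[ R(x) = \frac{x-1}{x} \tag{120} \]
into (109), one obtains another Euler formula
\[ (z; q)_\infty = \sum_{n=0}^{\infty} \frac{(-1)^n q^{\binom{n}{2}}}{(q; q)_n}\,z^n. \tag{121} \]
For
\[ R(x) = \frac{1-x}{1 - \frac{a}{q}x} \tag{122} \]
the identity (109) reduces to the q-binomial theorem
\[ \sum_{n=0}^{\infty} \frac{(a; q)_n}{(q; q)_n}\,z^n = \frac{(az; q)_\infty}{(z; q)_\infty}. \tag{123} \]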
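The reduction to the q-binomial theorem can be illustrated numerically: with $R$ taken from (122), the coefficients $1/(R(q)\cdots R(q^n))$ of the series (106) coincide with $(a;q)_n/(q;q)_n$, and the truncated series agrees with the product side of (123). The sketch below is an illustration only; the function names and the truncation orders are assumptions.

```python
def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a; q)_n = prod_{j<n} (1 - a q^j)."""
    result = 1.0
    for j in range(n):
        result *= 1.0 - a * q**j
    return result


def exp_R_coefficients(R, q, terms):
    """Coefficients 1 / (R(q) ... R(q^n)) of the series (106), n < terms."""
    coeffs = [1.0]
    for n in range(1, terms):
        coeffs.append(coeffs[-1] / R(q**n))
    return coeffs


def exp_R(R, q, z, terms=200):
    """Truncated series (106) with v = 1."""
    return sum(c * z**n for n, c in enumerate(exp_R_coefficients(R, q, terms)))


def make_R(a, q):
    """Rational function (122): R(x) = (1 - x) / (1 - a x / q)."""
    return lambda x: (1.0 - x) / (1.0 - a * x / q)
```

For example, `exp_R(make_R(0.3, 0.5), 0.5, 0.4)` can be compared with `qpoch(0.3 * 0.4, 0.5, 200) / qpoch(0.4, 0.5, 200)`, where the finite products stand in for the infinite ones.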

Therefore, the identity (109) is a generalization of classical identities well known in the theory of basic hypergeometric series (see [2, 7]), as expected. Let us now restrict our attention to the subcase when $R$ is a rational function regular at infinity and without a pole at zero. If it possesses $(s+1)$ poles and $r$ singularities one can represent it in the following form:
\[ R(x) = {}_rR_s(x) := \frac{(1-x)(1-b_1q^{-1}x)\cdots(1-b_sq^{-1}x)}{(-x)^{s-r+1}(1-a_1q^{-1}x)\cdots(1-a_rq^{-1}x)}, \tag{124} \]
where $a_1, \ldots, a_r, b_1, \ldots, b_s$ are numbers such that ${}_rR_s(q^n) > 0$ for $n \in \mathbb{N}$ and $s - r + 1 \ge 0$. Applying the derivative $\partial_{r,s} := \partial_{{}_rR_s}$ to the monomial $z^n$ one finds the q-factors
\[ q_n = \frac{(1-q^n)(1-b_1q^{n-1})\cdots(1-b_sq^{n-1})}{(1-a_1q^{n-1})\cdots(1-a_rq^{n-1})\,(-q)^{1+s-r}}, \tag{125} \]
and thus
\[ c_n^2 = \frac{1}{q_1\cdots q_n} = \frac{(a_1; q)_n\cdots(a_r; q)_n}{(q; q)_n (b_1; q)_n\cdots(b_s; q)_n}\Big[(-1)^n q^{\binom{n}{2}}\Big]^{1+s-r}. \tag{126} \]
Hence, we find that the K-exponential function for this subcase is given by the basic hypergeometric series ${}_r\Phi_s$:


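\[ \mathrm{Exp}_{r,s}(\bar v, z) = \langle K_{r,s}(v)|K_{r,s}(z)\rangle = \sum_{n=0}^{\infty} \frac{(a_1; q)_n\cdots(a_r; q)_n}{(q; q)_n (b_1; q)_n\cdots(b_s; q)_n}\Big[(-1)^n q^{\binom{n}{2}}\Big]^{1+s-r}(\bar v z)^n = {}_r\Phi_s\!\Big(\begin{matrix} a_1, \ldots, a_r \\ b_1, \ldots, b_s \end{matrix};\ q,\ \bar v z\Big). \tag{127} \]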

(See the book by G. Gasper and M. Rahman [7] for the definition of ${}_r\Phi_s$ and for its fundamental properties.) Then, for $r = s + 1$ the coherent states map $K_{r,s}$, and thus the exponential function $\mathrm{Exp}_{r,s}$, are defined on the unit disc $D_1$. If $r < s + 1$, they are defined on the whole complex plane. From the above it follows that the exponential function $\mathrm{Exp}_R$ is a natural generalization of the basic hypergeometric series. Therefore we shall also call it the generalized basic hypergeometric series. The basic hypergeometric series ${}_r\Phi_s$, being a K-exponential function, satisfies Eq. (108), which in that case assumes the form (−1)r

s X

r+n r+1−n (−1)n σn (b1 , . . . , bs )[8 z) − q8 z)] = r s (q r s (q

n=0

= (−1)s+1

r X

n+s+1 σn (a1 , . . . , ar )q n+s+1 z8 z), r s (q

(128)

n=0

where $\sigma_n(y_1, \ldots, y_m)$ is the symmetric polynomial defined by the identity
\[ (1 - y_1x)\cdots(1 - y_mx) = \sum_{n=0}^{m} (-1)^n \sigma_n(y_1, \ldots, y_m)\,x^n. \tag{129} \]

Let us now discuss briefly the relation between $A$, $A^\dagger$ and $Q$, where $Q$ is defined by $QK(z) := K(qz)$. From (19) and (105) one finds the defining relations
\[ AA^\dagger = R(qQ), \qquad A^\dagger A = R(Q) \tag{130} \]
for the quantum algebra $A_{K_R}$. Using the above identities and $\|A^\dagger A\| = \|A\|^2$ one obtains, for $N = 0$, the following expression for the norms:
\[ \|A\| = \|A^\dagger\| = \sqrt{\sup_{n\in\mathbb{N}} R(q^n)}. \tag{131} \]

Due to (130) one finds also that the annihilation and creation operators generating the quantum algebra AKR are unbounded if N > 0. Considering A and A† as a mutually conjugate non-commuting set of coordinates on the quantum curve (23) we could interpret (130) as a parametrization of the last one by the operator Q. In such a way the meromorphic function R “uniformizes” the considered curve. So, by analogy with algebraic geometry, where algebraic curves are “coordinatized” by the field of rational functions on them (see [31]), we will interpret the generalized Heisenberg algebra AK as a “coordinatization” of the quantum curve to which it is related.


In [17], the authors call the algebra $A_{\mu,r}$ (from the subcase B in Sect. 4) the “quantum disc”. However, we think that it is more suitable to reserve this term for the set of coherent states $K(z)$, $z \in D_R$, and to interpret the map $K : D_R \to H\setminus\{0\}$ as a quantization of the classical states $z \in D_R$ (see [25] for an exhaustive discussion). Let us remark here that $K(z)$, $z \in D_R$, are the eigenvectors of the commutative subalgebra of $A_K$ generated by the annihilation operator $A$. The ∗-product formula (41) for the quantum algebra $A_{K_R}$ can be expressed in a functional way in terms of the q-derivative $\partial_q$ and the operator $Q$. In order to do this we expand the function $g$ in a Taylor series with respect to the anti-holomorphic coordinate,
\[ g(\bar z, z) = \sum_{n=0}^{\infty} g_n(z)\,\bar z^n, \tag{132} \]

and use the generalized Leibniz formula for the derivative ∂R , n X n (qQ; q)n−k Qk n−k n n ∂R (g · f )(x) = R(qQ) · · · R(q Q) ∂ g (x) × k q R(qQ) · · · R(q n−k Q) R k=0 (qQ; q)k k ∂ f (x). (133) × R(qQ) · · · R(q k Q) R Since

n−k ∂R ExpR (z, z) = z n−k ExpR (z, z)

1 k n n (1 − q)k ∂R z = R(qQ) · · · R(q k Q)z n−k , k!q k q (qQ; q)k after a simple calculation one comes to the following ∗-product formula:

(134)

and

∞

(135)

∞

X 1 X k 1 (1 − q)k ∂q (gn (z)z n ) × ExpR (z, z) k!q (qQ; q)n n=0 k=0 (qQ; q)n−k Qk k Exp (∂ f ) (z, z), (136) ×R(qQ) · · · R(q n Q) q R R(qQ) · · · R(q n−k Q)

(f ∗ g)(z, z) =

where $\partial_q$ and $Q$ act on the coordinate $z$ and $\bar\partial_q$ acts on the coordinate $\bar z$. In the special case when $f(z) = z$ and $g(\bar z) = \bar z$, one finds from (41) the normal ordering for the operator $AA^\dagger$, i.e.
\[ AA^\dagger = {:}\,F(A^\dagger, A)\,{:} \tag{137} \]
where the function $F$ is given by
\[ F(\bar z, z) = \frac{1}{\mathrm{Exp}_R(\bar z, z)}\,\partial_R\big(z\,\mathrm{Exp}_R(\bar z, z)\big) = \sum_{k=-N}^{\infty} r_k q^k\,\frac{\mathrm{Exp}_R(q^k\bar z, z)}{\mathrm{Exp}_R(\bar z, z)}. \tag{138} \]
For the definition of $r_k$ see formula (103). In Sect. 6 we will find the reproducing measure (see (182) and (187)) for $R$ which satisfies $R(0) = 1$, $R(q^n) > 0$ and for which the function
\[ R(x; q)_\infty := R(x)R(qx)\cdots R(q^nx)\cdots \tag{139} \]
is analytic, without poles, in the disc $D_\varrho$ for some $\varrho > q$. So in this case one can recalculate the star product (136) using (51) with the measure $d\mu_S$ given by (187).


6. Generalized Basic Hypergeometric Series

In this section we shall study some properties of the generalized basic hypergeometric series $\Phi_R$, i.e. the exponential function $\mathrm{Exp}_R$ defined in Sect. 5. We shall restrict our attention to the subcase $R = S$, where $S$ satisfies the conditions (167). These conditions are very natural and not very restrictive. Hence, they admit an ample class of meromorphic functions. We will prove the identities (177) and (179) which, respectively, provide us with the meromorphic continuation to the whole complex plane and the Mittag–Leffler decomposition of the generalized basic hypergeometric series $\mathrm{Exp}_S$. In order to achieve this we solve the infinite system of linear Eqs. (140). As a result one additionally obtains the new identities (173) and (174) involving the function $S$. With the help of these one may then express the function $\mathrm{Exp}_S$ in terms of the integral formula (188) and the operator formula (189). Finally, due to the identity (180) we find the reproducing measure (187) for the kernel $\mathrm{Exp}_S(\bar z v)$.

A key role in the subsequent considerations will be played by the equation
\[ \sum_{k=0}^{\infty} q^{(n+1)(k+1)}\alpha_k = \beta_n, \qquad \forall\, n \in \mathbb{N}\cup\{0\}, \tag{140} \]
for the complex sequence $\{\alpha_n\}_{n=0}^{\infty}$, when the sequence $\{\beta_n\}_{n=0}^{\infty}$ is fixed. Below we shall assume that $0 < q < 1$. In that case the infinite matrix
\[ \big[q^{(n+1)(k+1)}\big]_{n,k=0}^{\infty} \tag{141} \]
defines a self-adjoint Hilbert–Schmidt operator in the Hilbert space $l^2$ with the kernel equal to $\{0\}$. However, we will look for solutions from a class of sequences which contains all convergent sequences and thus the sequences belonging to $l^2$. The sequences $\{\alpha_n\}_{n=0}^{\infty}$ which solve Eq. (140) for some sequence $\{\beta_n\}_{n=0}^{\infty}$ form a vector space $A$. From the Cauchy criterion one obtains the necessary
\[ \limsup_{k\to\infty} \sqrt[k]{|\alpha_k|} \le \frac{1}{q} \tag{142} \]
and sufficient
\[ \limsup_{k\to\infty} \sqrt[k]{|\alpha_k|} < \frac{1}{q} \tag{143} \]
conditions for a sequence $\{\alpha_k\}_{k=0}^{\infty}$ to be a solution of (140) for a given $\{\beta_n\}_{n=0}^{\infty}$. In accordance with this, let $A_0 \subset A$ denote the set of sequences satisfying the condition (143). The set $A_0$ is a star-shaped set (i.e. $\{c\alpha_n\}_{n=0}^{\infty} \in A_0$ if $\{\alpha_n\}_{n=0}^{\infty} \in A_0$) and $\{|\alpha_n|\}_{n=0}^{\infty} \in A_0$ if $\{\alpha_n\}_{n=0}^{\infty} \in A_0$. The vector space of all convergent sequences is contained in $A_0$.

Proposition 2. Equation (140) has a solution from $A_0$ if and only if the power series
\[ \beta(z) = \sum_{n=0}^{\infty} \beta_n z^n \tag{144} \]
admits a meromorphic extension to the whole complex plane with the Mittag-Leffler decomposition
\[ \beta(z) = \sum_{k=0}^{\infty} \alpha_k\,\frac{1}{q^{-(k+1)} - z} \tag{145} \]
such that $\{\alpha_n\}_{n=0}^{\infty} \in A_0$. The sequence $\{\alpha_n\}_{n=0}^{\infty} \in A_0$ solves Eq. (140) with $\{\beta_n\}_{n=0}^{\infty}$ given by (144).
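The correspondence between (140) and (145) rests on the expansion $\frac{1}{q^{-(k+1)} - z} = \sum_{n} q^{(n+1)(k+1)} z^n$ for $|z| < q^{-(k+1)}$. The sketch below illustrates it for a finite sequence $\alpha$ with exact rational arithmetic; the sample values and the function names are assumptions made for the example.

```python
from fractions import Fraction


def beta_from_alpha(alpha, q, n_max):
    """beta_n = sum_k q^{(n+1)(k+1)} alpha_k for n < n_max, cf. (140)."""
    return [
        sum(q ** ((n + 1) * (k + 1)) * a for k, a in enumerate(alpha))
        for n in range(n_max)
    ]


def mittag_leffler(alpha, q, z):
    """Right-hand side of (145) for a finite sequence alpha."""
    return sum(a / (q ** (-(k + 1)) - z) for k, a in enumerate(alpha))


def beta_series(alpha, q, z, n_max=60):
    """Truncated power series (144) built from the solution of (140)."""
    return sum(b * z**n for n, b in enumerate(beta_from_alpha(alpha, q, n_max)))
```

For a finite $\alpha$ both sides are rational functions of $z$, so the truncated series converges geometrically to the Mittag-Leffler sum inside the disc $|z| < 1/q$.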

Proof. From the Cauchy criterion one obtains
\[ \limsup_{k\to\infty} \sqrt[k]{|\alpha_k|} < \frac{1}{q} \;\Rightarrow\; \beta_n < +\infty \;\Rightarrow\; \limsup_{k\to\infty} \sqrt[k]{|\alpha_k|} \le \frac{1}{q}, \]
and thus
\[ q < R \;\Rightarrow\; \beta_n = \alpha(q^{n+1}) \;\Rightarrow\; q \le R \]
for $n \in \mathbb{N}\cup\{0\}$, where $R$ is the convergence radius of the series
\[ \alpha(z) := \sum_{k=0}^{\infty} \alpha_k z^{k+1}. \tag{146} \]
Thus one has
\[ \limsup_{n\to\infty} \sqrt[n]{|\beta_n|} = \limsup_{n\to\infty} \sqrt[n]{|\alpha(q^{n+1})|} = q^{N+1}, \]
where $N$ is the number for which $\alpha_0 = \cdots = \alpha_{N-1} = 0$ and $\alpha_N \neq 0$. So, $\beta$ is convergent for $|z| < \frac{1}{q^{N+1}}$. Now, multiplying (140) by $z^n$ and summing with respect to the index $n$ one obtains
\[ \beta(z) = \sum_{n=0}^{\infty} \Big(\sum_{k=0}^{\infty} q^{(n+1)(k+1)}\alpha_k\Big) z^n. \tag{147} \]
Since $q < R$, the series $\sum_{k=0}^{\infty} q^{(n+1)(k+1)}|\alpha_k|$ is convergent and $\beta(|z|) < +\infty$ for $|z| < \frac{1}{q^{N+1}}$. So, one can change the order of summation in (147). As a result one arrives at (145). Conversely, applying the operator $\partial_0^{\,n}$ to (145) and putting $z = 0$ one obtains Eq. (140). This is possible since the series (144) and (145) converge uniformly on compact subsets of the disc $\{z \in \mathbb{C} : |z| < \frac{1}{q}\}$, and thus the $\partial_0$-differentiation commutes with the summation in (145).

Corollary 3. Equation (145) is another form of Eq. (140).

From the above we conclude that solving Eq. (140) is equivalent to finding the analytic extension of the series (144) to a meromorphic function defined on the whole complex plane. So, in order to find this extension, let us multiply $\beta$, which possesses poles of order one at the points $\frac{1}{q^{k+1}}$, where $k \in \mathbb{N}\cup\{0\}$, by the integral function $(qz; q)_\infty$. Hence, the function
\[ \gamma(z) := (qz; q)_\infty\,\beta(z) \tag{148} \]


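is analytic on $\mathbb{C}$. Now, taking into account that the coefficients of the Taylor expansion of $\gamma$ are given by $(\partial_0^{\,n}\gamma)(0)$ and applying the $\partial_0$-Leibniz formula (see (44)) to (148) one obtains
\[ \gamma(z) = \sum_{n=0}^{\infty} (\partial_0^{\,n}\gamma)(0)\,z^n = \sum_{n=0}^{\infty} \Big(\sum_{k=0}^{n} \frac{(-1)^k q^{\binom{k+1}{2}}}{(q; q)_k}\,\beta_{n-k}\Big) z^n. \tag{149} \]
Since the meromorphic function
\[ \gamma_n(z) := \frac{(1 - q^{n+1}z)}{(qz; q)_\infty}\,\gamma(z) \tag{150} \]
is analytic in a neighbourhood of the point $q^{-(n+1)}$, from the Laurent expansion of $\beta$ at $q^{-(n+1)}$ it follows that
\[ \alpha_n = \frac{1}{q^{n+1}}\,\gamma_n(q^{-(n+1)}) = \frac{\gamma(q^{-(n+1)})}{q^{n+1}\,(\frac{1}{q}; \frac{1}{q})_n\,(q; q)_\infty} = \frac{\gamma(q^{-(n+1)})\,(-1)^n q^{\binom{n}{2}}}{(q; q)_n\,(q; q)_\infty\,q}, \tag{151} \]
where $(\frac{1}{q}; \frac{1}{q})_0 = 1$. Finally, from the expansion (149) one finds the solution of Eq. (140)
\[ \alpha_n = \frac{(-1)^n q^{\binom{n}{2}}}{(q; q)_n\,(q; q)_\infty\,q} \sum_{l=0}^{\infty} \Big(\sum_{k=0}^{l} \frac{(-1)^k q^{\binom{k+1}{2}}}{(q; q)_k}\,\beta_{l-k}\Big) q^{-l(n+1)} \tag{152} \]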

for fixed $\{\beta_n\}_{n=0}^{\infty}$ which satisfies the conditions of Proposition 2.

Let us now present two other formulations of Eq. (140). They have a functional form and can be used for finding new solutions of (140) from given ones. The first formulation has the form of an integral equation. In order to obtain it, let us notice that for $\{\alpha_k\}_{k=0}^{\infty} \in A_0$ the series $\alpha$ defined by (146) converges on the disc of the radius
\[ \Big(\limsup_{k\to\infty} \sqrt[k]{|\alpha_k|}\Big)^{-1} > q. \tag{153} \]
For
\[ q < \varrho < \Big(\limsup_{k\to\infty} \sqrt[k]{|\alpha_k|}\Big)^{-1} \tag{154} \]
one has
\[ \beta_n = \alpha(q^{n+1}) = \frac{1}{2\pi i} \int_{\partial D_\varrho} \frac{\alpha(\xi)}{\xi - q^{n+1}}\,d\xi. \tag{155} \]
Thus, for $|z| < 1$ one obtains the integral equation for the function $\alpha$:
\[ \beta(z) = \frac{1}{2\pi i} \int_{\partial D_\varrho} K_q(z, \xi)\,\alpha(\xi)\,d\xi, \tag{156} \]
which is equivalent to Eq. (140). The kernel in (156) is given by
\[ K_q(z, \xi) = \sum_{n=0}^{\infty} \frac{z^n}{\xi - q^{n+1}}. \tag{157} \]
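The inversion formula (152) can be tested on a finite sequence: one computes $\beta_n$ from (140), forms the Taylor coefficients of $\gamma$ as in (149), and recovers $\alpha_n$. The sketch below is an illustration only. It uses exact rational arithmetic because the sum over $l$ involves strong cancellations, and it truncates both the sum over $l$ and the infinite product $(q;q)_\infty$; the truncation orders and function names are assumptions.

```python
from fractions import Fraction


def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a; q)_n with exact arithmetic."""
    result = Fraction(1)
    for j in range(n):
        result *= 1 - a * q**j
    return result


def gamma_coefficients(beta, q):
    """Taylor coefficients of gamma(z) = (qz; q)_inf * beta(z), cf. (149)."""
    return [
        sum(
            (-1) ** k * q ** (k * (k + 1) // 2) / qpoch(q, q, k) * beta[l - k]
            for k in range(l + 1)
        )
        for l in range(len(beta))
    ]


def alpha_from_beta(beta, q, n, product_terms=80):
    """Truncated version of the inversion formula (152)."""
    gamma = gamma_coefficients(beta, q)
    series = sum(g * q ** (-l * (n + 1)) for l, g in enumerate(gamma))
    q_inf = qpoch(q, q, product_terms)  # truncation of (q; q)_infinity
    prefactor = (-1) ** n * q ** (n * (n - 1) // 2) / (qpoch(q, q, n) * q_inf * q)
    return prefactor * series
```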


The second way to formulate Eq. (140) is the operator equation
\[ \alpha(qQ)\,\frac{1}{1-z} = \beta(z), \tag{158} \]
where $|z| < 1$. Since $\|qQ\| = q < \varrho$, the operator $\alpha(qQ) = \sum_{k=0}^{\infty} \alpha_k q^{k+1}Q^{k+1}$ is well defined. Acting by $\partial_0$ and $qQ$ on both sides of (158) and using the relation $\partial_0 Q = qQ\partial_0$ we arrive at the following statements.

Proposition 4. If the couple $(\alpha, \beta)$, where $\alpha \in A_0$, satisfies (158), then $z\alpha, Q\alpha \in A_0$ and the couples
\[ (z\alpha,\ qQ\beta) \quad\text{and}\quad (Q\alpha,\ \partial_0\beta) \tag{159} \]
satisfy Eq. (158), too.

Proposition 5. If the couple $(\alpha, \beta)$, where $\alpha \in A_0$, satisfies (158), then $F(z, Q)\alpha \in A_0$, and the couple
\[ (F(z, Q)\alpha,\ F(qQ, \partial_0)\beta) \tag{160} \]
also satisfies Eq. (158). One assumes here that $F$ is analytic on a polydisc $D_\varrho \times D_\varrho$, where $\varrho > 1$.

In particular, we see from (160) that the function $\alpha' = Q^r\alpha$ solves the equation
\[ \alpha'(q^{n+1}) = \beta_{r+n} \tag{161} \]
if $\alpha$ solves the equation
\[ \alpha(q^{n+1}) = \beta_n. \tag{162} \]
Thus, instead of the full system (162) one can solve the reduced system (161), and one obtains $\alpha$ from $\alpha = Q^{-r}\alpha'$. Since $R = q^rR'$, where $R'$ is the convergence radius of $\alpha'$, the function $\alpha'$ belongs to the domain of the operator $Q^{-r}$. If $\beta_{n_i} = 0$ for some infinite subset of indices $n_i$, then $\beta_n = 0$ for all $n \in \mathbb{N}\cup\{0\}$. So, in the nontrivial case, there exists $r$ such that $\beta_n \neq 0$ for $n \ge r$. From the above consideration it follows that the case $\beta_n \neq 0$, for $n \in \mathbb{N}\cup\{0\}$, is the crucial one. Therefore, we will consider that case below. If $\alpha_0 = \cdots = \alpha_{N-1} = 0$ and $\alpha_N \neq 0$, then the function $S$ defined by
\[ \alpha(z) = \frac{1}{q^{N+1}}\,S(z)\,\alpha(qz) \tag{163} \]
is meromorphic on the disc $D_\varrho$, where $R > \varrho > q$, and has inside $D_\varrho$ a finite number of singularities. This is so since $\alpha \in O(D_\varrho)$ and $\varrho < R$. In addition $S(0) = 1$, and from $\beta_n \neq 0$ it follows that $S(q^n) \neq 0$ for $n \in \mathbb{N}\cup\{0\}$. The function $S(\cdot\,; q)_n$, defined for $n \ge 1$ by
\[ S(z; q)_n = S(z)S(qz)\cdots S(q^{n-1}z), \tag{164} \]
and for $n = 0$ by $S(z; q)_0 = 1$, is holomorphic on $D_\varrho$ for sufficiently large $n$. Iterating (163) we find that


\[ \alpha(z) = S(z; q)_\infty\,z^{N+1}. \tag{165} \]
Thus from (162) and (165) one obtains
\[ \beta_n = \frac{S(q; q)_\infty}{S(q; q)_n}\,(q^{n+1})^{N+1}. \tag{166} \]
Concluding, we have the following statement.

Proposition 6. Equation (163) for the holomorphic function $\alpha$, with $S$ being a fixed meromorphic function on $D_R$, $R > q$, which satisfies the conditions
\[ S(\cdot\,; q)_\infty \in O(D_R), \qquad S(q^n) \neq 0 \quad\text{and}\quad S(0) = 1, \tag{167} \]
is equivalent to Eq. (140) with $\beta_n \neq 0$. The formula (166) gives the condition for $\{\beta_n\}$ which is a solvability condition for the system (140). Hence, Eq. (140) is solved by the coefficients of the Taylor expansion of $S(z; q)_\infty$ at $z = 0$.

Since $S(0) = 1$, the function $S$ is holomorphic in some neighbourhood of $z = 0$. Taking its expansion
\[ S(z) = \sum_{k=0}^{\infty} s_k z^k, \tag{168} \]
one finds from (163) the recurrence
\[ \alpha_{N+n} = \frac{1}{1 - q^n} \sum_{k=0}^{n-1} s_{n-k}\,q^k\,\alpha_{N+k} \tag{169} \]
for $n \ge 1$. Let us recall here that $\alpha_0 = \cdots = \alpha_{N-1} = 0$. This recurrence is solved by
\[ \alpha_{N+n} = \alpha_N\,x_n\,q^{-n} \sum_{i_1 + 2i_2 + \cdots + ni_n = n} \sigma_{i_1\ldots i_n}(x_1, \ldots, x_{n-1})\,s_1^{i_1}\cdots s_n^{i_n}, \tag{170} \]
where $x_k = \frac{q^k}{1 - q^k}$ and the polynomial $\sigma_{i_1\ldots i_n}(x_1, \ldots, x_{n-1})$ is defined as a linear combination of the monomials $x_1^{j_1}\cdots x_{n-1}^{j_{n-1}}$ with coefficients equal to 0 or 1 and $j_1 + j_2 + \cdots + j_{n-1} = i_1 + i_2 + \cdots + i_n - 1$. The explicit form of $\sigma_{i_1\ldots i_n}(x_1, \ldots, x_{n-1})$ can be calculated from
\[ \sum_{i_1 + 2i_2 + \cdots + ni_n = n} \sigma_{i_1\ldots i_n}(x_1, \ldots, x_{n-1})\,s_1^{i_1}\cdots s_n^{i_n} = \sum_{k=0}^{n-1} \Big(\sum_{i_1 + 2i_2 + \cdots + ki_k = k} \sigma_{i_1\ldots i_k}(x_1, \ldots, x_{k-1})\,s_1^{i_1}\cdots s_k^{i_k}\Big) s_{n-k}. \tag{171} \]
One finds the factor $\alpha_N$ from
\[ \alpha(q) = \beta_0 = S(q; q)_\infty\,q^{N+1}. \tag{172} \]
Now, because of the equivalence of Eqs. (140) and (163) one obtains, as a result of the comparison of (152) with (170), the following identities:


n k ∞ l S(q; q)∞ (−1)n q ( 2 ) X X (−1)k q ( 2 ) ( )q −ln = (q; q)∞ (q; q)n S(q; q)l−k (q; q)k

l=0 k=0

q = σ0 1 − qn

X

σi1 ...in (

i1 +2i2 +···+nin =n

q n−1 q ,..., )si1 . . . sinn . (173) 1−q 1 − q n−1 1

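Using (165) and (152) we find other identities
\[ S(z; q)_\infty = \frac{S(q; q)_\infty}{(q; q)_\infty} \sum_{n=0}^{\infty} \frac{(-1)^n q^{\binom{n}{2}}}{(q; q)_n} \sum_{l=0}^{\infty}\sum_{k=0}^{l} \Big(\frac{(-1)^k q^{\binom{k}{2}}}{S(q; q)_{l-k}\,(q; q)_k}\Big)(q^{-l}z)^n = \frac{S(q; q)_\infty}{(q; q)_\infty} \sum_{l=0}^{\infty}\sum_{k=0}^{l} \frac{(-1)^k q^{\binom{k}{2}}}{S(q; q)_{l-k}\,(q; q)_k}\,(q^{-l}z; q)_\infty. \tag{174} \]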

Because of (166) the power series $\beta$ assumes the form
\[ \beta(z) = S(q; q)_\infty\,q^{N+1}\,\mathrm{Exp}_S(q^{N+1}z), \tag{175} \]
where $\mathrm{Exp}_S$ is the exponential function defined by (106). The exponential functions $\mathrm{Exp}_S$ for which $S$ satisfies the conditions (167) form an ample subclass of the class of all generalized exponential functions defined in Sect. 5. For example, the conditions (167) are satisfied by the rational function
\[ {}_rS_s(z) = \frac{(1-z)(1-b_1z)\cdots(1-b_sz)}{(1-a_1z)\cdots(1-a_rz)} \tag{176} \]
if $b_iq^n \neq 1$ for $i = 1, \ldots, s$ and $n \in \mathbb{N}\cup\{0\}$, and $\min_i \frac{1}{|a_i|} =: \varrho > q$. Let us remark that after setting some of the parameters $a_1, \ldots, a_r$ and $b_1, \ldots, b_s$ to zero in (124) one can consider ${}_rS_s$ as a special case of the rational function ${}_{s+1}R_s$. If $S \in O(D_R)$ then $S(\cdot\,; q)_\infty \in O(D_R)$. Therefore, functions holomorphic on $D_R$, where $R > q$, such that $S(0) = 1$ and $S(q^n) \neq 0$ also satisfy (167). The other properties of $\mathrm{Exp}_S$ follow from Proposition 2. Namely, from (148) and (149) we obtain for $|z| < 1$
\[ \mathrm{Exp}_S(z) = \frac{\gamma_S(z)}{(z; q)_\infty}, \tag{177} \]
where
\[ \gamma_S(z) = \sum_{n=0}^{\infty} \Big(\sum_{k=0}^{n} \frac{(-1)^k q^{\binom{k}{2}}}{(q; q)_k\,S(q; q)_{n-k}}\Big) z^n \tag{178} \]
is analytic on the whole complex plane. Hence the right-hand side of the identity (177) gives the analytic continuation of $\mathrm{Exp}_S$ to a meromorphic function defined on $\mathbb{C}$. This continuation (which it is natural to call the generalized basic hypergeometric function) has the Mittag-Leffler decomposition given by
\[ \mathrm{Exp}_S(z) = \sum_{k=0}^{\infty} \frac{(-1)^k q^{\binom{k}{2}}}{(q; q)_k\,(q; q)_\infty}\,\gamma_S(q^{-k})\,\frac{1}{q^{-k} - z}, \tag{179} \]
see the identity (145).


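After the substitution $z = q^{n+1}$ the identity (174) leads to the following two identities:
\[ \frac{1}{S(q; q)_n} = \frac{1}{(q; q)_\infty} \sum_{k=0}^{\infty} \frac{(-1)^k q^{\binom{k}{2}}}{(q; q)_k}\,\gamma_S(q^{-k})\,q^{(n+1)k} \tag{180} \]
and
\[ \frac{1}{S(q; q)_n} = \sum_{l=0}^{n}\sum_{k=0}^{l} \frac{(-1)^k q^{\binom{k}{2}}}{S(q; q)_{l-k}\,(q; q)_k\,(q; q)_{n-l}}. \tag{181} \]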

Taking into account the equality k

a−k (a; q)∞ (−1)k q ( 2 ) = , a→∞ (aq k ; q)∞ (q; q)k

(182)

lim

one can rewrite (180) in the following way: 1 = S(q; q)n

Z

1

log x

1 1 a− log q (a; q)∞ x γS ( ) lim dq x, (1 − q)(q; q)∞ x a→∞ (ax; q)∞ n

0

(183)

where the dq x := (1 − q)

∞ X

xδ(x − q k )dx

(184)

k=0

is the so called q-measure, see [7, 18]. In the case when the function S(·; q)∞ has no poles in the disc D% with % > q, the identity (183) is also valid for the function S1 . Hence in that case the measure log x

1 1 a− log q (a; q)∞ dνS = γ 1 ( ) lim dq x (1 − q)(q; q)∞ S x a→∞ (ax; q)∞

(185)

solves the moment problem (55) for ExpS . Therefore, the kernel ExpS (zv) acquires the reproducing property Z ExpS (zw)ExpS (wv)dµS (w, w), (186) ExpS (zv) = D1

with respect to the measure dµS (w, w) =

1 1 dνS (x)dϕ, 2π ExpS (x)

(187)

√ where w = xeiϕ and D1 is the unit disc. In the rational case, i.e. when S = rSs (see (176)) Eq. (186) gives the reproducing formula for the kernel given by the basic hypergeometric series r8s (zv). After the substitution of (165) and (175) into the integral equation (156) and the operator equation (158), respectively, we obtain the following two expressions for the generalized exponential function: Z 1 Kq (z, ξ)S(ξ, q)∞ ξ N +1 dξ (188) ExpS (zq N +1 ) = 2πiS(q; q)∞ q N +1 ∂ D%


and

\[ \mathrm{Exp}_S(z) = \frac{1}{S(q;q)_\infty}\,S(qQ;q)_\infty\,\frac{1}{1-z}, \tag{189} \]

where $|z| < 1$ and $q < \varrho < R$. The last formula may be considered as a generalization of the q-binomial theorem.

7. Examples and Applications

In the case of $q_n = R(q^n)$ the quantum algebra $\mathcal A_K =: \mathcal A_R$ is “uniformized” by the relations (130), where the operator Q plays the role of a “coordinate”. If R is invertible on the interval [0, 1], which contains the spectrum of Q, then one can express Q by A and A†. Therefore in that case Q belongs to the algebra $\mathcal A_R$. Thus the algebra $\mathcal A_{R,q}$ generated by A, A† and Q is a subalgebra of $\mathcal A_R$. In the general case one can consider $\mathcal A_{R,q}$ as an algebra naturally related to $\mathcal A_R$. Taking into account (130) we find that $\mathcal A_{R,q}$ is determined by the relations

\[ AA^\dagger = R(qQ), \tag{190} \]
\[ A^\dagger A = R(Q), \tag{191} \]
\[ QA^\dagger = qA^\dagger Q, \tag{192} \]
\[ AQ = qQA. \tag{193} \]

It is easy to obtain the last two relations (192) and (193) if one passes to the holomorphic representation in which Q is given by (32) and A† acts as the multiplication by z. Now if one replaces in (190) the operator Q by the occupation number operator N, defined by $Ne_n = ne_n$ and related to Q by $Q = q^N$, one finds the equivalent relations

\[ AA^\dagger = R(q^{N+1}), \tag{194} \]
\[ A^\dagger A = R(q^N), \tag{195} \]
\[ [N, A] = -A, \tag{196} \]
\[ [N, A^\dagger] = A^\dagger. \tag{197} \]
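The relations above can be checked numerically on a finite truncation. The following is only a sketch under stated assumptions: the basis convention $Ae_n = \sqrt{R(q^n)}\,e_{n-1}$, the sample structure function $R(x) = (1-x)/(1-q)$, the cut-off `dim` and all helper names are illustrative choices, not taken from the text.

```python
import math

def R(x, q):
    # Assumed sample structure function R(x) = (1 - x)/(1 - q); note R(1) = 0.
    return (1.0 - x) / (1.0 - q)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def build(q, dim):
    # Q e_n = q^n e_n, N e_n = n e_n, A e_n = sqrt(R(q^n)) e_{n-1} (assumed).
    Q = [[q ** i if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    N = [[float(i) if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    A = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        A[n - 1][n] = math.sqrt(R(q ** n, q))
    Adag = [[A[j][i] for j in range(dim)] for i in range(dim)]
    return Q, N, A, Adag

q, dim = 0.5, 6
Q, N, A, Adag = build(q, dim)
RQ = [[R(Q[i][i], q) if i == j else 0.0 for j in range(dim)] for i in range(dim)]

err_191 = max_abs(sub(matmul(Adag, A), RQ))                               # A†A = R(Q)
err_192 = max_abs(sub(matmul(Q, Adag), scale(q, matmul(Adag, Q))))        # QA† = qA†Q
err_193 = max_abs(sub(matmul(A, Q), scale(q, matmul(Q, A))))              # AQ = qQA
err_196 = max_abs(sub(sub(matmul(N, A), matmul(A, N)), scale(-1.0, A)))   # [N, A] = -A
err_197 = max_abs(sub(sub(matmul(N, Adag), matmul(Adag, N)), Adag))       # [N, A†] = A†
```

All five residuals vanish up to rounding. The relation $AA^\dagger = R(qQ)$ is deliberately left out of the check: on a truncated space it fails on the last basis vector, which is an artifact of the cut-off and not of the algebra.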

The above equivalence is valid due to 0 < q < 1, which is not the case in general, as an example of q being a root of unity shows. As is readily seen, the relations (196) and (197) do not depend on R. Therefore, in what follows, we take these relations for granted and we shall not rewrite them for each subcase separately. Let us now consider the following two subcases described by the simplest rational functions:


\[ R(x) = -\frac{q}{(1-q)(1-q^2)}\Big(\frac{1}{x} + x\Big) + \frac{1}{(1-q)^2}, \tag{198} \]
\[ R(x) = r_2\,\frac{1-x}{1-\frac{r_2}{r_1}x}. \tag{199} \]

In the subcase (198) the relations (194) give rise to the quantum algebra $U_q(sl(2))$, which is a q-deformation of $U(sl(2))$, see [6, 19]. Let us mention also that the quantum groups $SU_q(2)$ and $SU_q(1,1)$ can be considered as special subcases of the algebra (190)–(193). Hence we suppose that the identities satisfied by the generalized exponential (see Sect. 6) may be useful in the representation theory of these quantum groups, see [18, 13]. It is easy to see that the relations (190)–(193) for R given by (199) take the form

\[ AA^\dagger = r_2\,\frac{1-q^{N+1}}{1-\frac{r_2}{r_1}q^{N+1}}, \qquad A^\dagger A = r_2\,\frac{1-q^{N}}{1-\frac{r_2}{r_1}q^{N}}. \tag{200} \]

The algebra (200) is the q-deformation of the algebra

\[ AA^\dagger = r_2\,\frac{N+1}{N+k+1}, \qquad A^\dagger A = r_2\,\frac{N}{N+k}, \tag{201} \]

where we put $\frac{r_2}{r_1} = q^k$. The above two quantum algebras are related to hyperbolic and parabolic algebras, respectively, which are investigated in Sect. 4. A natural interpretation of the algebra $\mathcal A_{R,q}$ arises if one studies a physical system with Schrödinger operator $H = H(Q)$ being a function of the operator Q. Then the algebra $\mathcal A_{R,q}$ can be considered as a symmetry algebra of that system. Since Q is diagonal in the holomorphic realization, such systems are integrable. The energy operator of the form $H = H\big(\frac{1-Q}{1-q}\big)$, where H does not depend on the parameter q, corresponds to the operator $H = H(N)$ for q → 1. Therefore systems described by $H = H(Q)$ are q-deformed versions of the ones described by $H = H(N)$. In the physical literature there exist many examples of systems of the above type, see e.g. [21, 32]. The problem now is how to find out whether the energy operators defined in the Schrödinger representation are of the form $H = H(Q)$. One can formulate that problem in the inverse way. Namely, one may ask how to construct a Schrödinger representation of the algebra (190)–(193). Following V. Spiridonov, see [32, 33], we shall do that for the case when R is a polynomial of degree N

\[ R(x) = (x-\lambda_1)\cdots(x-\lambda_N), \tag{202} \]

where $\lambda_1, \ldots, \lambda_N \in \mathbb R$. The algebra (190) with R given by (202) appeared in [32, 33] as the symmetry algebra for the one-dimensional Schrödinger operator. In order to find its representation in $L^2(\mathbb R, dx)$ let us introduce the mutually conjugated operators


\[ A_j = \frac{d}{dx} + f_j(x), \qquad A_j^\dagger = -\frac{d}{dx} + f_j(x), \tag{203} \]

where the functions $f_j$ satisfy the chain of differential equations

\[ f_j'(x) + f_{j+1}'(x) + f_j^2(x) - f_{j+1}^2(x) = \lambda_{j+1} - \lambda_j = \mu_j. \tag{204} \]

This chain is equivalent to the following intertwining relations:

\[ L_jA_j^\dagger = A_j^\dagger L_{j+1}, \qquad A_jL_j = L_{j+1}A_j, \tag{205} \]

where

\[ L_j = A_j^\dagger A_j + \lambda_j, \tag{206} \]

see [33]. If we restrict ourselves to the class of solutions invariant under the transformation

\[ f_{j+N}(x) = \sqrt{q}\,f_j(\sqrt{q}\,x) =: (Tf_j)(x), \qquad \mu_{j+N} = q\mu_j, \tag{207} \]

then the operators

\[ A := T^{-1}A_{N-1+j}\cdots A_j, \qquad A^\dagger := A_j^\dagger\cdots A_{N-1+j}^\dagger T \tag{208} \]

satisfy the relations (190)–(193). Since in that case the operator Q is, up to an additive constant, a Schrödinger operator

\[ Q + \mathrm{const} = -\frac{d^2}{dx^2} + f_j^2(x) - f_j'(x) + \lambda_j = L_j, \tag{209} \]

we discover that the algebra $\mathcal A_{R,q}$ is its symmetry algebra. The spectral problem of the operator (209) generates a certain solution of the system given by (204) and (207); for details we refer to [34]. From the above considerations we conclude that the holomorphic representation of the algebra $\mathcal A_{R,q}$ leads to the explicit description of coherent states of the system given by the Hamiltonian (209). If $R(x) = \frac{1-x}{1-q}$, the physical system is a q-deformed harmonic oscillator. The standard harmonic oscillator is obtained as the limit case for q → 1. Other examples of such physical systems may be found in [30, 11, 12]. The interesting question is what kind of structures one obtains in the limit q → 1. We investigate this problem for the case when $R = {}_rS_s$ is a rational function given by (176). Let us assume that $a_i = q^{\alpha_i-1}$, $b_j = q^{\beta_j-1}$. Then the generalized derivative will correspond to the operator

\[ \partial_{r,s} = \frac{(\beta_1 + z\frac{d}{dz})\cdots(\beta_s + z\frac{d}{dz})}{(\alpha_1 + z\frac{d}{dz})\cdots(\alpha_r + z\frac{d}{dz})}\,\frac{d}{dz} = \lim_{q\to1}(1-q)^{r-s-1}\,\partial_{{}_rS_s}, \tag{210} \]


where r ≤ 1 + s, and the boundary exponential function is given by the hypergeometric series

\[ \lim_{q\to1}\mathrm{Exp}_{{}_rS_s}\big[(1-q)^{r-s-1}z\big] = {}_rF_s(z) = \sum_{n=0}^{\infty}\frac{(\alpha_1)_n\cdots(\alpha_r)_n}{n!\,(\beta_1)_n\cdots(\beta_s)_n}\,z^n. \tag{211} \]
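As a rough illustration of the series (211), the sketch below sums it term by term and compares the result with a few classical closed forms. The function name `hyp_series`, the cut-off `terms` and the sample parameters are assumptions made for this example only; the closed forms used for comparison are standard facts about ${}_0F_0$, ${}_1F_0$ and ${}_2F_1$ and are not taken from the text.

```python
import math

def hyp_series(alphas, betas, z, terms=200):
    """Partial sum of sum_n (a_1)_n...(a_r)_n / (n! (b_1)_n...(b_s)_n) z^n."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: z * prod(a + n) / ((n + 1) * prod(b + n)).
        ratio = z / (n + 1)
        for a in alphas:
            ratio *= a + n
        for b in betas:
            ratio /= b + n
        term *= ratio
    return total

exp_check = hyp_series([], [], 1.0)            # 0F0(z) = exp(z)
binom_check = hyp_series([0.5], [], 0.3)       # 1F0(a; z) = (1 - z)^(-a), |z| < 1
log_check = hyp_series([1.0, 1.0], [2.0], 0.5) # 2F1(1, 1; 2; z) = -log(1 - z)/z
```

The sample values are chosen inside the region of convergence (for r = s + 1 the series converges only for |z| < 1), so a fixed number of terms is sufficient here; no attempt is made at analytic continuation.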

Equation (15) or (108) takes the form of the differential equation

\[ \prod_{i=1}^{s}\Big(\beta_i + z\frac{d}{dz}\Big)\frac{d}{dz}\,{}_rF_s = \prod_{j=1}^{r}\Big(\alpha_j + z\frac{d}{dz}\Big)\,{}_rF_s, \tag{212} \]

see [20]. In the considered case the algebra $\mathcal A_{R,q}$ determined by the relations (190)–(193) corresponds to the algebra $\mathcal A_{r,s}$ generated by the operators A, A† and N which satisfy the relations

\[ AA^\dagger = \frac{(N+1)(\beta_1+N)\cdots(\beta_s+N)}{(\alpha_1+N)\cdots(\alpha_r+N)}, \qquad A^\dagger A = \frac{N(\beta_1-1+N)\cdots(\beta_s-1+N)}{(\alpha_1-1+N)\cdots(\alpha_r-1+N)}. \tag{213} \]

Algebras of this sort (when the right-hand side of (213) reduces to a polynomial) appear in problems of quantum optics as well as in quantum many-body physics. For a detailed description of these interrelations the reader is referred to the papers of V.P. Karassiov, see [14, 15]. Finally, let us remark that the correspondence between quantum algebras and the theory of special functions survives if one considers a general complex manifold, e.g. a noncompact Riemann surface or a bounded domain in $\mathbb C^N$, instead of the disc. The general case is to be investigated in a subsequent publication.

Acknowledgement. Part of the article was prepared during the stay of the author at the Institut für Reine Mathematik der Humboldt Universität in Berlin. The hospitality of Professor T. Friedrich at the Institute is gratefully acknowledged. The author highly appreciates all suggestions and remarks of both Referees, thanks to whom the paper acquired a more desirable form.

References

[1] Akhiezer, N.I.: Klassicheskaja Problema Momentov i Nekotorye Voprosy Analiza Sviazannye s Neju. Moskva: Gosudarstvennoe Izdatelstvo Fizik. Matematicheskoj Literatury, 1961
[2] Bailey, W.N.: Generalized Hypergeometric Series. Cambridge: Cambridge University Press, 1935
[3] Berezin, F.A.: Commun. Math. Phys. 40, 153 (1975)
[4] Berezin, F.A.: Commun. Math. Phys. 63, 131 (1978)
[5] Flato, M., Lichnerowicz, A., Sternheimer, D.: Deformation of Poisson brackets, Dirac brackets and application. J. Math. Phys. 17, 1762–1794 (1976)
[6] Drinfeld, V.G.: Quantum groups. Proceedings of the International Congress of Mathematicians (Berkeley), Providence, RI: Am. Math. Soc. 1987, pp. 789–820
[7] Gasper, G., Rahman, M.: Basic Hypergeometric Series. Cambridge: Cambridge University Press, 1990
[8] Gong, Ren-Shan: A completeness relation for the coherent states of the (p, q)-oscillator by (p, q)-integration. J. Phys. A: Math. Gen. 27, 375–379 (1994)
[9] Jimbo, M.: A q-difference analogue of Uq(G) and the Yang-Baxter equation. Lett. Math. Phys. 10, 63–69 (1985)
[10] Jimbo, M.: A q-analogue of U(gl(N + 1)), Hecke algebra and the Yang-Baxter equation. Lett. Math. Phys. 11, 247–252 (1986)
[11] Jorgensen, P.E.T., Werner, R.F.: Coherent states of the q-canonical commutation relations. Preprint Osnabrück, (1993)
[12] Jorgensen, P.E.T., Schmit, R.F., Werner, R.F.: q-Relations and Stability of C*-Isomorphism Classes. Algebraic Methods in Operator Theory. Basel: Birkhäuser-Verlag, 1993
[13] Kakehi, T., Masuda, T., Ueno, K.: Spectral analysis of a Q-difference Operator which Arises from the Quantum SU(1, 1) Group. J. Operator Theory 33, 159–196 (1995)
[14] Karassiov, V.P.: Polynomial Deformations of the Lie Algebra sl(2) in Problems of Quantum Optics. Translated from Teoreticheskaya i Matematicheskaya Fizika, Vol. 95, No. 1, 3–19, April, (1993)
[15] Karassiov, V.P.: G-invariant polynomial extensions of Lie algebras in quantum many-bodies physics. J. Phys. A: Math. Gen. 27, 153–165 (1994)
[16] Klauder, R., Skagerstam, Bo-Sture: Coherent States – Applications in Physics and Mathematical Physics. Singapore: World Scientific, (1985)
[17] Klimek, S., Lesniewski, A.: Quantum Riemann Surfaces I. The Unit Disc. Commun. Math. Phys. 146, 103–122 (1992)
[18] Klimyk, A.V., Vilenkin, N.I.: Representations of Lie groups and Special Functions. London: KAP, 1992
[19] Kulish, P.P., Reshetikhin, N.Yu.: J. Soviet. Math. 23, 2435 (1983)
[20] Matematicheskaja Enciklopedia T. 1. Moskva: 1977, pp. 1004–1005
[21] Maximov, V.M., Odzijewicz, A.: The q-deformation of quantum mechanics of one degree of freedom. J. Math. Phys. 36(4), April (1995)
[22] Nikolski, N.K.: Lekcii ob operatore sdviga. Moskva: Nauka, 1980
[23] Odzijewicz, A.: Covariant and Contravariant Berezin Symbols of Bounded Operators. Quantization and Infinite-Dimensional Systems. New York–London: Plenum Press, 1994, pp. 99–108
[24] Odzijewicz, A.: On reproducing kernels and quantization of states. Commun. Math. Phys. 114, 577–597 (1988)
[25] Odzijewicz, A.: Coherent States and Geometric Quantization. Commun. Math. Phys. 150, 385–413 (1992)
[26] O’Raifeartaigh, L., Ryan, C.: On generalised commutation relation. Proceedings of the Royal Irish Academy, vol. 62, Sect. A
[27] Ostrovski, V.L., Samoilenko, Yu.S.: Infinite dimensional representations of quadratic and polynomial ∗-algebras. Academy of Science of the Ukrainian SSR, Institute of Mathematics, Preprint 91.4
[28] Prugovečki, E.: Stochastic Quantum Mechanics and Quantum Spacetime. Dordrecht: Reidel, 1986
[29] Prugovečki, E. and Ali, S.T.: Nuovo. Cim. A63, 171 (1981)
[30] Pusz, W., Woronowicz, L.S.: Rep. Math. Phys. 27, 231 (1989)
[31] Shafarevich, I.R.: Osnovnye poniatia algebry. Sovremennye problemy matematiki T. 11. Moskva: 1986
[32] Spiridonov, V.: Deformation of Supersymmetric and Conformal Quantum Mechanics Through Affine Transformations. In: Proc. of the Intern. Workshop on Harmonic Oscillators, College Park, USA, March 1992. Eds. D. Han, Y.S. Kim, and W.W. Zachary, NASA Conf. Publ. 3197, 1993, pp. 93–108
[33] Spiridonov, V.: Universal Superposition of Coherent States and Self-Similar Potentials. Phys. Rev. A, 1909–1935, (1995)
[34] Veselov, A.P., Shabat, A.B.: Dressing chain and spectral theory of Schrödinger operator. Funk. Anal. i ego Pril. 27, n.2, 1–21 (1993)
[35] Woronowicz, L.S.: Unbounded Elements Affiliated with C∗-Algebras and Non-Compact Quantum Groups. Commun. Math. Phys. 136, 399–432 (1991)
[36] Woronowicz, L.S.: Lett. Math. Phys. 23, 251 (1991)
[37] Woronowicz, L.S.: Commun. Math. Phys. 114, 417 (1992)

Communicated by T. Miwa

Commun. Math. Phys. 192, 217 – 244 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Extensions of Conformal Nets and Superselection Structures

D. Guido1,⋆, R. Longo1, H.-W. Wiesbrock2,⋆⋆

1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, I–00133 Roma, Italy. E-mail: [email protected], [email protected]
2 Freie Universität Berlin, Institut für Theoretische Physik, Arnimallee 14, D-14195 Berlin, Germany. E-mail: [email protected]

Received: 19 March 1997 / Accepted: 1 July 1997

⋆ Supported in part by Ministero della Pubblica Istruzione and CNR-GNAFA.
⋆⋆ Supported by the DFG, SFB 288 “Differentialgeometrie und Quantenphysik”.

Abstract: Starting with a conformal Quantum Field Theory on the real line, we show that the dual net is still conformal with respect to a new representation of the Möbius group. We infer from this that every conformal net is normal and conormal, namely the local von Neumann algebra associated with an interval coincides with its double relative commutant inside the local von Neumann algebra associated with any larger interval. The net and the dual net together give rise to an infinite dimensional symmetry group, of which we study a class of positive energy irreducible representations. We mention how superselection sectors extend to the dual net and we illustrate by examples how, in general, this process generates solitonic sectors. We describe the free theories associated with the lowest weight n representations of PSL(2, R), showing that they violate 3-regularity for n > 2. When n ≥ 2, we obtain examples of non Möbius-covariant sectors of a 3-regular (non 4-regular) net.

0. Introduction

Haag duality is one of the most important properties in Quantum Field Theory for the analysis of the superselection structure. It basically says that the locality principle holds maximally. Concerning Quantum Field Theory on the usual Minkowski spacetime, duality may always be assumed, in a Wightman theory, because wedge duality automatically holds and one can enlarge the net to a dual net without affecting the superselection structure [3, 26]. Nevertheless there might be good reasons in lower dimensional theories for Haag duality not to be satisfied, [24]. An important case occurs in Conformal QFT: as such a theory naturally lives on a larger spacetime, duality may fail on the original spacetime

because contributions at infinity are possibly not detectable there. Moreover in low spacetime dimensions, the superselection structure of the dual net may change due to the occurrence of soliton sectors [31, 26], with new information being contained in the inclusion of the two nets.

This paper is devoted to an analysis of conformal QFT on the real line, namely one-dimensional components of a two-dimensional chiral conformal QFT. The first aspect that we discuss concerns the symmetries of the dual net. Starting with a conformal net $\mathcal A$ on R, the Bisognano–Wichmann property holds automatically true [5, 14], thus the dual net

\[ \mathcal A^d(a,b) \doteq \mathcal A(-\infty,b)\cap\mathcal A(a,\infty), \qquad a<b, \]

on R is local and obviously translation-dilation covariant with respect to the same translation-dilation unitary representation. We shall however prove that $\mathcal A^d$ is even conformally covariant with respect to a new unitary representation of PSL(2, R). The construction of the new symmetries is achieved by a new characterization of local conformal precosheaves on the circle in terms of what we call a +hsm (half-sided modular) factorization, namely a quadruple $(\mathcal N_i, i\in\mathbb Z_3; \Omega)$, where the $\mathcal N_i$’s are mutually commuting von Neumann algebras with a joint cyclic separating vector Ω such that $\mathcal N_i\subset\mathcal N_{i+1}'$ is a +hsm modular inclusion in the sense of [35] for all $i\in\mathbb Z_3$. This characterization makes use only of the modular operators and not of the modular conjugations as in [36].

As a second result we shall deduce a structural property for any conformal net $\mathcal A$: $\mathcal A$ is automatically normal and conormal, namely if $I\subset\tilde I$ is an inclusion of proper intervals then

\[ \mathcal A(I)^{cc} = \mathcal A(I), \tag{1} \]
\[ \mathcal A(\tilde I) = \mathcal A(I)\vee\mathcal A(I)^c, \tag{2} \]

where $X^c = X'\cap\mathcal A(\tilde I)$ denotes the relative commutant in $\mathcal A(\tilde I)$ and $X^{cc} = (X^c)^c$. This property is useful in the analysis of the superselection structure, as discussed below.

The next issue will be to compare the net $\mathcal A$ with the dual net $\mathcal A^d$. We shall give a detailed study of the inclusion $\mathcal A\subset\mathcal A^d$ in a particular model, namely when $\mathcal A$ is the net associated with the nth derivative of the U(1)-current algebra, the latter turning out to coincide with $\mathcal A^d$. This is a first quantization analysis and to this end we shall give formulas relating two irreducible lowest weight representations of PSL(2, R) that agree on the upper triangular matrix subgroup $P_0$. In other words we are studying a class of representations of a certain infinite dimensional Lie group, the amalgamated free product $PSL(2,\mathbb R) *_{P_0} PSL(2,\mathbb R)$, where a classification is simply obtained. For example an explicit formula for the unitary γ whose second quantization implements the canonical endomorphism associated with the inclusion $\mathcal A(-1,1)\subset\mathcal A^d(-1,1)$ (the product of the ray inversion unitaries of the two nets) will be given as a function of the skew-adjoint generator E of the dilation one-parameter subgroup

\[ \gamma = \frac{(E-1)(E-2)\cdots(E-n)}{(E+1)(E+2)\cdots(E+n)}. \]

We shall show that the net A is not 4-regular if n ≥ 2 and not 3-regular if n ≥ 3, where A is said to be k-regular if A remains irreducible after removing k − 1 points from R. The 4-regularity property played a role in the covariance analysis in [16] where the problem of its general validity remained open.
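The formula for γ is consistent with γ being unitary: since E is skew-adjoint, its spectrum lies on the imaginary axis, and each factor (E − k)/(E + k) has modulus one there. The sketch below checks this pointwise on spectral values E = iλ; the function name `gamma_at`, the sample values of λ and the choice n = 3 are illustrative assumptions, not part of the text.

```python
def gamma_at(lam, n):
    """Value of prod_{k=1..n} (E - k)/(E + k) at the spectral point E = i*lam."""
    e = 1j * lam
    value = 1.0 + 0.0j
    for k in range(1, n + 1):
        # |i*lam - k| = |i*lam + k| for real lam, so each factor is a phase.
        value *= (e - k) / (e + k)
    return value

samples = (-5.0, -0.3, 0.0, 1.0, 42.0)
moduli = [abs(gamma_at(lam, 3)) for lam in samples]
value_at_zero = gamma_at(0.0, 3)   # product of n factors equal to -1
```

At λ = 0 every factor equals −1, so the value there is $(-1)^n$; for large |λ| the value tends to 1.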


We generalize the construction of the Buchholz, Mack and Todorov sectors for the current algebra ([7]) to the nth derivative of the current algebra, showing that they form a group isomorphic to $\mathbb R^{2n+1}$, and that none of them is covariant w.r.t. the conformal group (if n ≠ 0). In particular this shows the results in [16] to be optimal, at least on the real line. Finally we shall illustrate by examples how sectors of $\mathcal A$ localized in a bounded interval may have extensions to $\mathcal A^d$ with soliton localization.

1. Structural Properties of Conformal Local Precosheaves on $S^1$

In the algebraic approach, see [18], chiral conformal field theories are described as conformally covariant local precosheaves $\mathcal A$ of von Neumann algebras on proper intervals of the circle $S^1$. We start by reviewing some aspects of this framework. An open interval I of $S^1$ is called proper if I and the interior I′ of its complement are not empty. The circle will be explicitly described either as the points with modulus one in C or as the one-point compactification of R, these two descriptions being related by the Cayley transform $C : S^1\to\mathbb R\cup\{\infty\}$ given by $z\mapsto i(z+1)(z-1)^{-1}$. The group PSL(2, R) acts on $S^1$ via its action on R ∪ {∞} as fractional transformations. Intervals are labeled either by the coordinates on R or by complex coordinates of the endpoints in $S^1\subset\mathbb C$, where in the latter case intervals are represented in positive cyclic order. A precosheaf $\mathcal A$ is a covariant functor from the category $\mathcal J$ of proper intervals with inclusions as arrows to the category of von Neumann algebras on a Hilbert space $\mathcal H$ with inclusions as arrows, i.e., a map $I\mapsto\mathcal A(I)$ that satisfies:

A. Isotony. If $I_1\subset I_2$ are proper intervals, then $\mathcal A(I_1)\subset\mathcal A(I_2)$.

The precosheaf $\mathcal A$ will be a (local) conformal precosheaf if in addition it satisfies the following properties:

B. Locality. If $I_1$ and $I_2$ are disjoint proper intervals, then $\mathcal A(I_1)\subset\mathcal A(I_2)'$.

C. Conformal invariance. There exists a strongly continuous unitary representation U of PSL(2, R) on $\mathcal H$ such that

\[ U(g)\mathcal A(I)U(g)^* = \mathcal A(gI), \qquad g\in PSL(2,\mathbb R),\ I\in\mathcal J. \]

D. Positivity of the energy. The generator of the rotation subgroup U(R(·)) (conformal Hamiltonian) is positive. Here R(ϑ) denotes the rotation of angle ϑ on $S^1$.

E. Existence of the vacuum. There exists a unit vector $\Omega\in\mathcal H$ (vacuum vector) which is U(PSL(2, R))-invariant and cyclic for $\vee_{I\in\mathcal J}\mathcal A(I)$.

The Reeh-Schlieder Theorem now states that the vacuum vector is cyclic and separating for any local algebra $\mathcal A(I)$. Let us recall that uniqueness of the vacuum is equivalent both to the irreducibility of the precosheaf and to the factoriality property for local algebras. We shall denote by Mob


the Möbius group, namely the group of conformal transformations in C that leave the unit circle globally invariant. The group PSL(2, R) is then identified with the subgroup of orientation preserving transformations and Mob is generated by PSL(2, R) and an involution. Let $I_1$ be the upper semi-circle parameterized as (0, +∞); we associate to $I_1$ the following two one-parameter subgroups of Mob: first the dilations (relative to $I_1$),

\[ \Lambda_{I_1}(t) = \begin{pmatrix} e^{-t/2} & 0 \\ 0 & e^{t/2} \end{pmatrix}, \]

leaving $I_1$ globally stable, second the translations

\[ T_{I_1}(s) = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \]

mapping $I_1$ into itself for s ≥ 0. In general, if I is any interval in $S^1$, there exists a g ∈ PSL(2, R) such that $I = gI_1$ and we set

\[ \Lambda_I = g\Lambda_{I_1}g^{-1}, \qquad T_I = gT_{I_1}g^{-1}. \]
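The interplay between the dilations and translations of $I_1$ can be made concrete with 2×2 matrices acting by fractional linear transformations. The sketch below assumes the convention in which the dilation acts on the real line as x ↦ e^{−t}x (the sign in the exponent is an assumption of this example, as are all function names); it checks the relation Λ(t)T(s)Λ(−t) = T(e^{−t}s), which exhibits the two subgroups as generating a copy of the translation-dilation group.

```python
import math

def dilation(t):
    # Assumed convention: acts on the real line as x -> exp(-t) * x.
    return ((math.exp(-t / 2), 0.0), (0.0, math.exp(t / 2)))

def translation(s):
    # Acts on the real line as x -> x + s.
    return ((1.0, s), (0.0, 1.0))

def matmul(m, n):
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def act(m, x):
    """Fractional linear action of a real 2x2 matrix on a real point x."""
    (a, b), (c, d) = m
    return (a * x + b) / (c * x + d)

def close(m, n, tol=1e-12):
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))

t, s = 0.7, 2.0
conjugated = matmul(matmul(dilation(t), translation(s)), dilation(-t))
relation_holds = close(conjugated, translation(math.exp(-t) * s))
```

With this convention `act(dilation(t), x)` stays in (0, +∞) for every positive x, and `act(translation(s), x)` stays in (0, +∞) for s ≥ 0, matching the statements that the dilations leave $I_1$ globally stable and the positive translations map it into itself.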

The definition of the dilations does not depend on g, while the translations of I are defined up to a rescaling of the parameter, which however does not play any role in the following, because we are only interested in the subgroups generated by them. The subgroup generated by $\Lambda_{I_1}(\cdot)$ and $T_{I_1}(\cdot)$, denoted by $P_0$, is the subgroup of upper triangular matrices of PSL(2, R) and plays an important role in the following, especially in the next section. We shall associate with any proper interval I a diffeomorphism $r_I$ of $S^1$, the reflection mapping I onto the causal complement I′, i.e. fixing the boundary points of I. In the case of $I_1 = (0, +\infty)$, $r_{I_1}x = -x$, and one can extend this definition to a generic I as before. Notice that $r_{I_1}$ is orientation reversing. By a (anti-)representation U of Mob we shall mean the obvious generalization of the notion of unitary representation where $U(r_I)$ is anti-unitary. For a general conformal precosheaf the Bisognano–Wichmann Property holds [5, 14]: U extends to a unitary (anti-)representation of Mob such that, for any $I\in\mathcal J$,

\[ U(\Lambda_I(2\pi t)) = \Delta_I^{it}, \tag{3} \]
\[ U(r_I) = J_I, \tag{4} \]

where $\Delta_I$ and $J_I$ are the Tomita-Takesaki modular operator and modular conjugation associated with $(\mathcal A(I), \Omega)$. This implies Haag duality:

\[ \mathcal A(I)' = \mathcal A(I'), \qquad I\in\mathcal J. \]

Let now $(\mathcal N\subset\mathcal M, \Omega)$ be a triple where $\mathcal N\subset\mathcal M$ is an inclusion of von Neumann algebras acting on a Hilbert space $\mathcal H$ and $\Omega\in\mathcal H$ is a cyclic and separating vector for $\mathcal N$ and $\mathcal M$.

1. $(\mathcal N\subset\mathcal M, \Omega)$ is said to be standard if Ω is cyclic also for the relative commutant $\mathcal N^c \doteq \mathcal M\cap\mathcal N'$ of $\mathcal N$ in $\mathcal M$, see [10].


2. If $\sigma_t^{\mathcal M}$ denotes the modular automorphism group associated to $(\mathcal M, \Omega)$, then the triple $(\mathcal N\subset\mathcal M, \Omega)$ is said to be ± half-sided modular (± hsm) if $\sigma_{-t}^{\mathcal M}(\mathcal N)\subset\mathcal N$ for, respectively, all t ≥ 0 or all t ≤ 0.

3. A ±hsm factorization of von Neumann algebras is a quadruple $(\mathcal N_0, \mathcal N_1, \mathcal N_2, \Omega)$, where $\{\mathcal N_i, i\in\mathbb Z_3\}$ is a set of pairwise commuting von Neumann algebras, Ω is a cyclic separating vector for each $\mathcal N_i$ and $(\mathcal N_i\subset\mathcal N_{i+1}', \Omega)$ is a ±hsm inclusion for each $i\in\mathbb Z_3$.

In the work [37], local conformal precosheaves have been characterized in terms of ±hsm standard inclusions of von Neumann algebras and the adjoint action of the modular conjugations. (This work is based on a statement about hsm modular inclusions [34], whose correct proof is contained in [2].) We shall give here below an alternative characterization in terms of a ±hsm factorization, which has the advantage of using only the modular groups and not the modular conjugations.

Lemma 1.1. Let G be the universal group (algebraically) generated by 3 one-parameter subgroups $\Lambda_i(\cdot)$, $i\in\mathbb Z_3$, such that $\Lambda_i$ and $\Lambda_{i+1}$ have the same commutation relations as $\Lambda_{I_i}$ and $\Lambda_{I_{i+1}}$ for each $i\in\mathbb Z_3$, where $I_0, I_1, I_2$ are intervals forming a partition of $S^1$. Then G is isomorphic to $\widetilde{PSL}(2,\mathbb R)$, the universal covering group of PSL(2, R), and the $\Lambda_i$’s are continuous one-parameter subgroups naturally corresponding to $\Lambda_{I_i}$.

Proof. Obviously G has a quotient isomorphic to $\widetilde{PSL}(2,\mathbb R)$, and we denote by q the quotient map. As the exponential map is a local diffeomorphism between the Lie algebra of a Lie group and the Lie group itself, there exists a neighbourhood $\mathcal U$ of the origin of $\mathbb R^3$ such that the map $(t_0,t_1,t_2)\mapsto\Lambda_{I_0}(2\pi t_0)\Lambda_{I_1}(2\pi t_1)\Lambda_{I_2}(2\pi t_2)$ is a diffeomorphism of $\mathcal U$ with a neighbourhood of the identity of $\widetilde{PSL}(2,\mathbb R)$. Therefore the map $\Phi : (t_0,t_1,t_2)\in\mathcal U\mapsto\Lambda_0(2\pi t_0)\Lambda_1(2\pi t_1)\Lambda_2(2\pi t_2)\in G$ is still one-to-one. It is easily checked that the maps $g\Phi : \mathcal U\to G$, g ∈ G, form an atlas on G, thus G is a manifold. In fact G is a Lie group since the group operations are smooth, as they are locally smooth. Now G is connected by construction and q is a local diffeomorphism of G with $\widetilde{PSL}(2,\mathbb R)$, hence a covering map, which has to be an isomorphism because of the universality property of $\widetilde{PSL}(2,\mathbb R)$.

Theorem 1.2. Let $(\mathcal N_0, \mathcal N_1, \mathcal N_2, \Omega)$ be a +hsm factorization of von Neumann algebras and let $I_0, I_1, I_2$ be intervals forming a partition of $S^1$ in counter-clockwise order. There exists a unique local conformal precosheaf $\mathcal A$ on $S^1$ such that $\mathcal A(I_i) = \mathcal N_i$, $i\in\mathbb Z_3$, with Ω the vacuum vector. The (unique) positive energy unitary representation U of PSL(2, R) is determined by the modular prescription $U(\Lambda_{I_i}(2\pi t)) = \Delta_{I_i}^{it}$.

Notice that every +hsm factorization of von Neumann algebras arises by considering the von Neumann algebras associated to 3 intervals of $S^1$ as in the above theorem, due to the geometric property of the modular group (3).

Proof. The subgroup of PSL(2, R) generated by the one-parameter subgroups $\Lambda_{I_i}(2\pi t)$ and $\Lambda_{I_{i+1}}(2\pi s)$, $i\in\mathbb Z_3$, is a two-dimensional Lie group $P_i$ isomorphic to the translation-dilation group $P_0$. As $(\mathcal N_i, \mathcal N_{i+1}', \Omega)$ is a +hsm standard inclusion, by a result first stated in [34] with an erroneous proof and whose correct proof is given in [2], the unitary group generated by $\Delta_{I_i}^{it}$ and $\Delta_{I_{i+1}}^{is}$ is isomorphic to $P_i$; indeed there exists a unitary representation $U_i$ of $P_i$ determined by $U_i(\Lambda_{I_i}(2\pi t)) = \Delta_{I_i}^{it}$ and $U_i(\Lambda_{I_{i+1}}(-2\pi s)) = \Delta_{I_{i+1}}^{is}$. Therefore by Lemma 1.1, there exists a unitary representation U of $\widetilde{PSL}(2,\mathbb R)$ such that $U|_{P_i} = U_i$.


Let $t_0 = \frac{1}{2\pi}\ln 2$. Then we have ([34, 2], see the remarks above)

\[ \mathrm{Ad}\,\Delta_{I_0}^{it_0}\Delta_{I_1}^{it_0}(\mathcal N_0) = \mathcal N_1', \tag{5} \]

and similarly

\[ \mathrm{Ad}\,\Delta_{I_2}^{it_0}\Delta_{I_0}^{it_0}\Delta_{I_1}^{it_0}\Delta_{I_2}^{it_0}\Delta_{I_0}^{it_0}\Delta_{I_1}^{it_0}(\mathcal N_0) = \mathcal N_0'. \tag{6} \]

The element $\Lambda_{I_2}(2\pi t_0)\Lambda_{I_0}(2\pi t_0)\Lambda_{I_1}(2\pi t_0)\Lambda_{I_2}(2\pi t_0)\Lambda_{I_0}(2\pi t_0)\Lambda_{I_1}(2\pi t_0)$ is easily seen to be conjugate to the rotation by π in PSL(2, R), hence Eq. (6) entails that U(2π) implements an automorphism of $\mathcal N_0$. Set $\mathcal A(I_0) := \mathcal N_0$. If I is an interval of $S^1$, then $I = gI_0$ for some $g\in\widetilde{PSL}(2,\mathbb R)$, and we set $\mathcal A(I) = U(g)\mathcal A(I_0)U(g)^*$. Since the group $G_{I_0}$ of all $g\in\widetilde{PSL}(2,\mathbb R)$ such that $gI_0 = I_0$ is generated by $\Lambda_{I_0}(t)$, t ∈ R, and by rotations of 2kπ, k ∈ Z, we have $U(g)\mathcal A(I_0)U(g)^* = \mathcal A(I_0)$ for all $g\in G_{I_0}$ and the von Neumann algebra $\mathcal A(I)$ is well defined. The isotony of $\mathcal A$ follows if we show that $gI_0\subset I_0$ implies $\mathcal A(gI_0)\subset\mathcal A(I_0)$. Indeed any such g is a product of an element in $G_{I_0}$ and translations $T_{I_0}(\cdot)$ and $T_{I_0'}(\cdot)$ mapping $I_0$ into itself, hence the isotony follows by the half-sided modular conditions. By (5) we have

\[ \mathrm{Ad}\,\Delta_{I_1}^{it_0}\Delta_{I_2}^{it_0}\Delta_{I_0}^{it_0}\Delta_{I_1}^{it_0}(\mathcal N_0) = \mathcal N_2, \]

and since the corresponding element in PSL(2, R) maps $I_0$ onto $I_2$, we get $\mathcal N_2 = \mathcal A(I_2)$ and analogously $\mathcal N_1 = \mathcal A(I_1)$. The locality of $\mathcal A$ now follows by the factorization property. Finally U is a true representation of PSL(2, R) by the vacuum conformal spin-statistics theorem [17], and the positivity of the energy follows by the Bisognano–Wichmann property (3), see [36, 37].

Although a conformal precosheaf satisfies Haag duality on $S^1$, duality on R does not necessarily hold.

Lemma 1.3. Let $\mathcal A$ be a local conformal precosheaf on $S^1$. The following are equivalent:
(i) The restriction of $\mathcal A$ to R satisfies Haag duality: $\mathcal A(I) = \mathcal A(\mathbb R\setminus I)'$.
(ii) $\mathcal A$ is strongly additive: if $I_1, I_2$ are the connected components of the interval I with one internal point removed, then $\mathcal A(I) = \mathcal A(I_1)\vee\mathcal A(I_2)$.
(iii) If $I, I_1, I_2$ are intervals as above, $\mathcal A(I_1)'\cap\mathcal A(I) = \mathcal A(I_2)$.

Proof. Note that by Möbius covariance we may suppose that the point removed in (i) and (ii) is the point ∞. Now (i) ⇔ (ii) because $\mathbb R\setminus I$ consists of two contiguous intervals in $S^1$ whose union has closure equal to I′, and by Haag duality $\mathcal A(I) = \mathcal A(I')'$.
Similarly (ii) ⇔ (iii) because, after taking commutants and renaming the intervals, one relation becomes equivalent to the other one.


Examples of conformal precosheaves on $S^1$ that are not strongly additive, i.e. not Haag dual on the line, were first given in [19, 8] and [38]. We will look in some detail at an example of [38] in Sect. 3. Haag duality on $S^1$ entails duality for half-lines on R hence essential duality, namely the dual net of the restriction $\mathcal A_0$ to R is local:

\[ I\mapsto\mathcal A_0^d(I) \doteq \mathcal A(\mathbb R\setminus I)', \qquad I\subset\mathbb R. \]

Due to locality the net $\mathcal A_0^d$ is larger than the original one, namely

\[ \mathcal A_0(I)\subset\mathcal A_0^d(I), \qquad I\subset\mathbb R. \]

$\mathcal A_0^d$ is usually called the dual net, see [3, 8], and its main feature is that it obeys Haag duality on R. The dual net does not in general transform covariantly under the covariance representation of the starting net.

Theorem 1.4. Let $\mathcal A$ be a local net of von Neumann algebras on the intervals of R, Ω a cyclic and separating vector for the von Neumann algebra $\mathcal A(I)$ associated with each interval I ⊂ R and U an Ω-fixing unitary representation of the translation-dilation group acting covariantly on $\mathcal A$. The following are equivalent:
(i) $\mathcal A$ extends to a conformal precosheaf on $S^1$.
(ii) The Bisognano–Wichmann property holds for $\mathcal A$, namely

\[ \Delta_{\mathbb R_+}^{it} = U(\Lambda_{\mathbb R_+}(2\pi t)). \tag{7} \]

Proof. (i) ⇒ (ii): See [5, 14]. (ii) ⇒ (i): Note first that, by translation covariance, $\Delta_{(a,\infty)}^{it} = U(\Lambda_{(a,\infty)}(2\pi t))$ for all a ∈ R. Hence $\mathcal A(-\infty, a)$ is a von Neumann subalgebra of $\mathcal A(a,\infty)'$ that is cyclic on Ω and globally invariant under the modular group of $\mathcal A(a,\infty)'$ with respect to Ω, hence, by the Tomita-Takesaki theory, duality for half-lines holds: $\mathcal A(a,\infty)' = \mathcal A(-\infty,a)$. Recall now that if $\mathcal N\subset\mathcal M$ is an inclusion of von Neumann algebras and Ω is a cyclic and separating vector for both $\mathcal N$ and $\mathcal M$, then $(\mathcal N\subset\mathcal M, \Omega)$ is +hsm iff $(\mathcal M'\subset\mathcal N', \Omega)$ is −hsm [34, 2]. Then it is immediate to check $(\mathcal A(-\infty,-1), \mathcal A(-1,1), \mathcal A(1,\infty), \Omega)$ to be a +hsm factorization of von Neumann algebras, so we get a conformal precosheaf from Theorem 1.2. Due to the Bisognano–Wichmann property this is indeed an extension to $S^1$ of the original net.

Note as a consequence that a local net on R as above with property (7) automatically has a PCT symmetry, namely

\[ J_{\mathbb R_+}\mathcal A(I)J_{\mathbb R_+} = \mathcal A(-I), \qquad \forall\ \text{interval } I\subset\mathbb R. \]

Now, if $\mathcal A$ is a local conformal precosheaf on $S^1$, its restriction $\mathcal A_0$ to R does not depend, up to isomorphism, on the point at which we cut $S^1$, because of Möbius covariance. The local precosheaf on $S^1$ extending $\mathcal A_0^d$ is thus well defined up to isomorphism. We call it the dual precosheaf of $\mathcal A$ and denote it by $\mathcal A^d$.

Corollary 1.5. The dual precosheaf of a conformal precosheaf on $S^1$ is a strongly additive conformal precosheaf on $S^1$.


Proof. By construction, the dual net satisfies Haag duality on R, hence strong additivity by Lemma 1.3. Remark. Let us compare the precosheaves A and Ad on S 1 . First we observe that equality holds if and only if the conformal precosheaf A is strongly additive. As mentioned above locality implies / I, A(I) ⊂ Ad (I) if − 1 ∈ i.e. if I does not contain the point infinity, while Haag duality on S 1 implies Ad (I) ⊂ A(I)

if − 1 ∈ I.

Therefore the observable algebras associated with bounded intervals of the real line are enlarged while the others, associated with intervals containing the point at infinity, decrease. The observable algebras associated with half-lines, i.e. with intervals having −1 as a boundary point, remain fixed. Due to the Bisognano–Wichmann property, which holds for all conformal precosheaves, altering the algebras implies a change in the representation of the conformal group PSL(2, R). Moreover, since the algebras associated with half-lines coincide, both representations agree on the isotropy group of the point at infinity, i.e. on the subgroup P0 of PSL(2, R) generated by translations and dilations. An inclusion N ⊂ M of von Neumann algebras is said to be normal if N = N cc , where X c = X 0 ∩ M denotes the relative commutant of X in M, and conormal if M is generated by N and its relative commutant w.r.t. M, i.e., M = N ∨ N c (i.e. M0 ⊂ N 0 is normal). We shall then say that a local conformal precosheaf A is (co-)normal if the inclusion A(I1 ) ⊂ A(I2 ) is (co-)normal for any pair I1 ⊂ I2 of proper intervals of S 1 . By Haag duality, normality and conormality are equivalent properties of conformal precosheaves. Theorem 1.6. Any conformal precosheaf on S 1 is normal and conormal. Proof. Let us consider first an inclusion of two proper intervals I1 ⊂ I2 with a common boundary point. If A is strongly additive, the inclusion of von Neumann algebras A(I1 ) ⊂ A(I2 ) is conormal as in this case A(I1 )0 ∩ A(I2 ) = A(I2 \I1 ). In the general case, by conformal invariance we may assume that I1 and I2 are respectively the intervals of the real line (1, +∞) and (0, +∞). By definition then A(I1 ) = Ad (I1 ), A(I2 ) = Ad (I2 ), with Ad the dual net, hence the inclusion A(I1 ) ⊂ A(I2 ) is conormal by Corollary 1.5 and the above argument. As A(I2 )0 ⊂ A(I1 )0 is conormal, A(I1 ) ⊂ A(I2 ) is also normal. 
It remains to show the normality of $\mathcal{A}(I_1) \subset \mathcal{A}(I_2)$ when $I_1 \subset I_2$ are intervals with no common boundary point, e.g. $I_1 = (b,c)$ and $I_2 = (a,d)$, with $a < b < c < d$. Then we set $I_3 = (a,c)$ and $I_4 = (b,d)$, therefore $I_1 = I_3 \cap I_4$ and both $I_3$ and $I_4$ are subintervals of $I_2$ with a common boundary point. Then the double relative commutant of $\mathcal{A}(I_1)$ in $\mathcal{A}(I_2)$ is given by
$$\mathcal{A}(I_1)^{cc} \subset \mathcal{A}(I_3)^{cc} \cap \mathcal{A}(I_4)^{cc} = \mathcal{A}(I_3) \cap \mathcal{A}(I_4) = \mathcal{A}(I_1), \tag{8}$$

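where the last equality is a consequence of duality and additivity and implies the first inclusion; the opposite inclusion is elementary.

Corollary 1.7. Let $(\mathcal{N} \subset \mathcal{M}, \Omega)$ be a +hsm standard inclusion of von Neumann algebras. In this case:
• The inclusion $\mathcal{N} \subset \mathcal{M}$ is normal and conormal.
• There exists a unique strongly additive local conformal precosheaf $\mathcal{A}$ of von Neumann algebras on $S^1$ with $\mathcal{M} = \mathcal{A}(0,+\infty)$, $\mathcal{N} = \mathcal{A}(1,+\infty)$, and $\Omega$ the vacuum vector.
• There exists a bijection between local conformal precosheaves $\mathcal{A}$ of von Neumann algebras on $S^1$ with $\mathcal{M} = \mathcal{A}(0,+\infty)$, $\mathcal{N} = \mathcal{A}(1,+\infty)$, $\Omega$ the vacuum vector, and von Neumann subalgebras $\mathcal{N}_0$ of $\mathcal{N}' \cap \mathcal{M}$ cyclic on $\Omega$ such that $(\mathcal{N}_0 \subset \mathcal{M}, \Omega)$ is a −hsm inclusion and $(\mathcal{N}_0 \subset \mathcal{N}', \Omega)$ is a +hsm inclusion.

Proof. Starting with the last point, notice that $(\mathcal{M}', \mathcal{N}_0, \mathcal{N}, \Omega)$ is a +hsm factorization of von Neumann algebras, and clearly any +hsm factorization arises in this way, therefore the thesis is a consequence of Theorem 1.2. In the special case $\mathcal{N}_0 = \mathcal{N}' \cap \mathcal{M}$ we then obtain the second statement by Lemma 1.3 (ii) ⟺ (iii). The first point is then a consequence of Theorem 1.6.

Note as a consequence that a hsm modular standard inclusion $(\mathcal{N} \subset \mathcal{M}, \Omega)$ is also pseudonormal: $\mathcal{N} \vee J\mathcal{N}J = \mathcal{M} \cap J\mathcal{M}J$, where $J$ is the modular conjugation of $(\mathcal{N}' \cap \mathcal{M}, \Omega)$, and has the continuous interpolation property (see [10]). The irreducibility of $\mathcal{A}$ in Corollary 1.7 is equivalent in particular to the factoriality of $\mathcal{N}$ and $\mathcal{M}$, see [17]. This is also equivalent to the centers of $\mathcal{N}$ and $\mathcal{M}$ having trivial intersection. Thus we have the following.

Corollary 1.8. Let $(\mathcal{N} \subset \mathcal{M}, \Omega)$ be +hsm and standard. Then $\mathcal{N}$ and $\mathcal{M}$ have the same center $\mathcal{Z}$ and $(\mathcal{N} \subset \mathcal{M}, \Omega)$ has a direct integral decomposition
$$\mathcal{N} = \int_{\mathcal{Z}}^{\oplus} \mathcal{N}_\lambda \, d\mu(\lambda), \qquad \mathcal{M} = \int_{\mathcal{Z}}^{\oplus} \mathcal{M}_\lambda \, d\mu(\lambda), \qquad \Omega = \int_{\mathcal{Z}}^{\oplus} \Omega_\lambda \, d\mu(\lambda),$$
where each $(\mathcal{N}_\lambda \subset \mathcal{M}_\lambda, \Omega_\lambda)$ is either a +hsm standard inclusion of III$_1$ factors or trivial ($\mathcal{N} = \mathcal{M} = \mathbb{C}$).

Proof. The modular group acts trivially on the center, so that
$$\mathrm{Ad}\,\Delta_{\mathcal{M}}^{it}(\mathcal{N} \cap \mathcal{M}') = \mathcal{N} \cap \mathcal{M}', \qquad \forall t \in \mathbb{R}.$$
Since $\mathrm{Ad}\,\Delta_{\mathcal{M}}^{it_0}\mathcal{N} = \mathcal{M}$ for a suitable $t_0$, we immediately obtain $\mathcal{N} \cap \mathcal{M}' = \mathcal{M} \cap \mathcal{M}'$ and
$$\mathcal{M} \cap \mathcal{M}' = \mathcal{N} \cap \mathcal{M}' \subset \mathcal{N} \cap \mathcal{N}',$$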

i.e. $Z(\mathcal{M}) \subset Z(\mathcal{N})$. Using the commutants we obtain the equality and the direct integral decomposition as stated. Applying [34], Theorem 12, and [2], we finish the proof.

The following corollary summarizes part of the above discussion, based on results in [4, 34, 2].

Corollary 1.9. There exists a one-to-one correspondence between:
• Isomorphism classes of standard +half-sided modular inclusions $(\mathcal{N} \subset \mathcal{M}, \Omega)$.
• Isomorphism classes of Borchers triples $(\mathcal{M}, U, \Omega)$ (i.e. $\mathcal{M}$ is a von Neumann algebra with a cyclic separating unit vector $\Omega$ and $U$ is a one-parameter $\Omega$-fixing unitary group with positive generator s.t. $U(t)\mathcal{M}U(-t) \subset \mathcal{M}$, $t > 0$) such that $\Omega$ is cyclic for $U(t)\mathcal{M}'U(-t) \cap \mathcal{M}$ for some, hence for all, $t > 0$.
• Isomorphism classes of translation-dilation covariant, Haag dual nets on $\mathbb{R}$ with the Bisognano–Wichmann property $\Delta^{it}_{\mathbb{R}_+} = U(\Lambda(2\pi t))$.
• Isomorphism classes of strongly additive local conformal precosheaves of von Neumann algebras on $S^1$.

The notion of isomorphism in the above setting has an obvious meaning. Note however that an isomorphism between local conformal precosheaves can be defined as an isomorphism of precosheaves relating the vacuum states, as in this case it will automatically intertwine the Möbius representations, since these are unique, being fixed by the modular prescriptions.

2. Representations of PSL(2, R) and Derivatives of the U(1)-Current

2.1. A class of representations of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$.

In Sect. 1 we have seen that we may associate with any conformal precosheaf on $S^1$ another conformal precosheaf on $S^1$ which is its Haag dual net on $\mathbb{R}$. This amounts to “cutting the circle,” namely to fixing a special point (“∞”) and redefining the local algebras associated to intervals which are relatively compact in $S^1 \setminus \{\infty\}$ in such a way that Haag duality holds on $S^1 \setminus \{\infty\}$. The representations of $PSL(2,\mathbb{R})$ associated with the two nets coincide when restricted to the group $P_0$ generated by translations and dilations, and therefore give a representation of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$, i.e., the free product of two copies of $PSL(2,\mathbb{R})$ amalgamated by the subgroup $P_0$. Let us denote by $i_1$, resp. $i_2$, the embeddings of $PSL(2,\mathbb{R})$ into the first, resp. the second, component of the free product, and by $i$ the immersion of $P_0$ in the amalgamated free product. Then we shall consider on $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$ the topology generated by the maps $i_1$, $i_2$, namely a unitary representation $U$ of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$ is strongly continuous if and only if $U \circ i_1$ and $U \circ i_2$ are strongly continuous.
We shall classify the class of strongly continuous unitary positive energy irreducible representations of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$ whose restrictions $U \circ i_k$ are still irreducible or, equivalently, such that $U((H_1 - H_2)T)$ is a scalar, where $T$ is the generator of the translations belonging to $\mathfrak{p}_0$, the Lie algebra of $P_0$, and $H_k = i_k(H)$ are the generators of the rotation subgroup. This amounts to classifying the unitary positive energy representations with $U((H_1 - H_2)T)$ central, as these decompose into a direct integral of irreducible representations in the previous class. As we shall see, this is the general case in a free theory.

Theorem 2.1. Let $U$ be an irreducible unitary representation of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$ with positive energy, namely $-iU(T)$ is positive. Then $U \circ i_k$ is irreducible for some $k = 1, 2$ if and only if both the $U \circ i_k$ are irreducible, and if and only if $U((i_1(H) - i_2(H))\,i(T)) \in \mathbb{C}$, where $H$ resp. $T$ generate rotations resp. translations in $PSL(2,\mathbb{R})$. Moreover, such representations are classified by pairs of natural numbers $(n_1, n_2)$, where $n_k$ is the lowest weight of the representation $U \circ i_k$ of $PSL(2,\mathbb{R})$.

As is known, the matrices
$$E = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad S = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}$$
form a basis for the Lie algebra $sl(2,\mathbb{R})$ and verify the commutation relations
$$[E, T] = T, \qquad [E, S] = -S, \qquad [T, S] = -2E. \tag{9}$$
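The commutation relations (9) can be checked mechanically on the 2×2 matrices. The following is only a sketch in exact rational arithmetic; the helper names `matmul`, `commutator` and `scale` are introduced here for illustration and are not part of the paper.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

half = Fraction(1, 2)
E = [[half, 0], [0, -half]]   # E = (1/2) diag(1, -1)
T = [[0, 1], [0, 0]]
S = [[0, 0], [-1, 0]]

# The three relations in (9).
relations_hold = (
    commutator(E, T) == T
    and commutator(E, S) == scale(-1, S)
    and commutator(T, S) == scale(-2, E)
)
print(relations_hold)
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point tolerance check.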


Let us recall that the conformal Hamiltonian is $H = \frac{i}{2}(T + S)$ and that the lowest weight of the representation is its lowest eigenvalue. The Casimir operator
$$\lambda = E(E - 1) - TS \tag{10}$$

is a central element of the universal enveloping Lie algebra, thus its value in an irreducible unitary representation is a scalar. If $U$ is a unitary irreducible non-trivial lowest weight representation of $PSL(2,\mathbb{R})$, then the selfadjoint generator $-iU(T)$ of $U(e^{tT})$ is positive and non-singular, therefore $U(e^{tE})$ and $e^{it\log(-iU(T))}$ give a representation of the Weyl commutation relations, namely $U$ restricts to a unitary representation of $P_0$, which has to be irreducible because any bounded operator commuting with $E$ and $T$ also commutes with $S$ due to formula (10). The von Neumann uniqueness theorem then implies that the restriction of $U$ to $P_0$ is unitarily equivalent to the Schrödinger representation, therefore $E \mapsto d/dx$, $T \mapsto -ie^x$ on $L^2(\mathbb{R})$.

We now describe all lowest weight representations of $PSL(2,\mathbb{R})$ (or its universal covering group $\widetilde{PSL}(2,\mathbb{R})$) as extensions of the representation of $P_0$. Let us fix now the unitary irreducible representation of $PSL(2,\mathbb{R})$ with lowest weight 1 and denote by $E_0$, $T_0$ and $S_0$ the image in this representation of the above Lie algebra generators $E$, $T$ and $S$.

Proposition 2.2.
• Each non-trivial irreducible unitary representation $U$ of $PSL(2,\mathbb{R})$ with lowest weight $\geq 1$ is unitarily equivalent to the representation obtained by exponentiation of the operators $T_\lambda = T_0$, $E_\lambda = E_0$, $S_\lambda = S_0 - \lambda T_0^{-1}$, for some $\lambda > 0$ (see footnote 1).
• All $\lambda > 0$ appear and $\lambda = \alpha(\alpha - 1)$ if $U$ has lowest weight $\alpha$.
• $\lambda$ may be written as $\lambda = \frac{m}{n}\left(\frac{m}{n} - 1\right)$, $m, n \in \mathbb{N}$, if and only if $U$ is a representation of the $n$th covering of $PSL(2,\mathbb{R})$.

Proof. The first two statements follow from the above discussion since the value of the Casimir operator in the unitary representation with lowest weight $\alpha$ is equal to $\lambda = \alpha(\alpha - 1)$, see [20], and one gets the formula for $S_\lambda$ by multiplying both sides of (10) by $T^{-1}$.
To check the last point, first observe that when $\lambda \geq 0$, $\lambda = \alpha(\alpha - 1)$, $\alpha \geq 1$, we get an orthonormal set of eigenvectors for the (self-adjoint) conformal Hamiltonian
$$H_\lambda = \frac{1}{2}\left(e^{x} - \frac{d}{dx}\,e^{-x}\,\frac{d}{dx} + \lambda e^{-x}\right).$$
In fact, set $\varphi_\alpha = e^{\alpha x} e^{-e^{x}}$ and define the operators $a^\alpha_\pm = 2E_\lambda \mp i(T_\lambda - S_\lambda)$. We also set, for simplicity of notation, $H := H_{\alpha(\alpha-1)}$. Since $H a^\alpha_\pm = a^\alpha_\pm (H \pm 1)$, $H\varphi_\alpha = \alpha\varphi_\alpha$ and $a^\alpha_- \varphi_\alpha = 0$, the vectors $\varphi^n_\alpha := (a^\alpha_+)^n \varphi_\alpha$ form an orthogonal set of eigenvectors of $H$ with eigenvalues $\alpha + n$. An application of the Stone–Weierstrass theorem shows that it is actually a basis, and the generated vector space is a Gårding domain for $H_\lambda$, $T$, $E$. The rest of the statement follows easily.

Footnote 1. If $A$ and $B$ are linear operators with closable sum, the closure of their sum is denoted simply by $A + B$.

Proof of Theorem 2.1. If $U((i_1(H) - i_2(H))\,i(T))$ is a scalar, $U \circ i_2(e^{tH})$ belongs to $(U \circ i_1(PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})))''$ and $U \circ i_1(e^{tH})$ belongs to $(U \circ i_2(PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})))''$, therefore, since $U$ is irreducible, $U \circ i_k$ is irreducible too, $k = 1, 2$. On the other hand, if say $U \circ i_1$ is irreducible, we may identify it with one of the representations described in Proposition 2.2 for some $\alpha \in \mathbb{R}$. Then, since $U$ is irreducible and $U \circ i_1|_{P_0} = U \circ i_2|_{P_0}$, $U \circ i_2$ too has to be of the form described in Proposition 2.2, hence $U((i_1(H) - i_2(H))\,i(T))$ is a scalar. The rest of the statement is now obvious.

Corollary 2.3. Let $I \to \mathcal{A}(I)$ be a second quantization conformal precosheaf on $S^1$ as described in the following subsection, $I \to \mathcal{A}^d(I)$ be its dual net and $U$ be the above representation of $PSL(2,\mathbb{R}) *_{P_0} PSL(2,\mathbb{R})$. Then the irreducible components of $U$ belong to the family described in Theorem 2.1.

Proof. Since the dual net may be described in terms of the local algebras $\mathcal{A}^d(a,b) := \mathcal{A}(-\infty, b) \cap \mathcal{A}(a, \infty)$ and the map which associates a local algebra with a local subspace is an isomorphism of complemented lattices (cf. [1]), the representation $U$ is indeed a second quantization. On the one-particle space, the construction of the dual net may be done on any irreducible component, and the result follows.

Corollary 2.4. Every irreducible lowest weight representation of $PSL(2,\mathbb{R})$ extends to an (anti-)representation of Möb in a unique (up to a phase) way.

Proof. Let $E_\lambda$, $T_\lambda$ and $S_\lambda$ be the generators of the representation of lowest weight $\alpha$ as above. Since Möb is generated by $PSL(2,\mathbb{R})$ and e.g. the matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, which corresponds to the change of sign on $\mathbb{R}$, we need an antiunitary $C$ which satisfies $CE_\lambda C = E_\lambda$, $CT_\lambda C = -T_\lambda$ and $CS_\lambda C = -S_\lambda$. (Because $E_\lambda$, $T_\lambda$ and $S_\lambda$ are generators, $C$ is then uniquely defined up to a phase.) Since in the Schrödinger representation the complex conjugation $C$ satisfies the mentioned commutation relations with $T_0$, $S_0$ and $E_0$, it trivially has the prescribed commutation relations with $T_\lambda$ and $E_\lambda$, and the last relation follows from the formula $S_\lambda = S_0 - \lambda T_0^{-1}$.

2.2. A modular construction of free conformal fields on $S^1$.
For a certain class of pairs $(M, G)$, where $M$ is a homogeneous space for the symmetry group $G$, modular theory may be used to construct a net of local algebras on $M$ starting from a suitable (anti-)representation of the symmetry group $G$ [6]. For related works, pointing also to other directions, the interested reader should consult [27–29]. We sketch here the case of the action of the Möbius group on $S^1$.

We recall that a real subspace $\mathcal{K}$ of a complex Hilbert space $\mathcal{H}$ is called standard if $\mathcal{K} \cap i\mathcal{K} = \{0\}$ and $\mathcal{K} + i\mathcal{K}$ is dense in $\mathcal{H}$, and Tomita operators $j$, $\delta$ are canonically associated with any standard space (cf. [25]). One may easily show that the subspaces $\mathcal{K}'$, $i\mathcal{K}$ and $i\mathcal{K}'$ are standard subspaces if $\mathcal{K}$ is such, where the symplectic complement $\mathcal{K}'$ is defined by $\mathcal{K}' = \{h \in \mathcal{H} : \mathrm{Im}(h, g) = 0 \ \forall g \in \mathcal{K}\}$.

As shown in [6], with any positive energy representation of Möb on a Hilbert space $\mathcal{H}$ we may uniquely associate a family $I \to \mathcal{K}(I)$ of standard subspaces attached to proper intervals $I$ in $S^1$ satisfying the following properties:
1) $I_1 \subset I_2 \Rightarrow \mathcal{K}(I_1) \subset \mathcal{K}(I_2)$ (isotony),
2) $\mathcal{K}(I)' = \mathcal{K}(I')$ (duality),
3) $U(g)\mathcal{K}(I) = \mathcal{K}(gI)$ (conformal covariance),
4) $\delta_I^{it} = U(\Lambda_I(t))$, $j_I = U(r_I)$ (Bisognano–Wichmann property),
that is to say, $I \to \mathcal{K}(I)$ is a local conformal precosheaf of standard subspaces of $\mathcal{H}$ on the proper intervals of $S^1$. The subspaces $\mathcal{K}(I)$ are defined as
$$\mathcal{K}(I) := \{h \in \mathcal{H} \mid j_I \delta_I^{1/2} h = h\}.$$

We notice that the precosheaf is irreducible, i.e. $\bigvee \mathcal{K}(I)$ is dense in $\mathcal{H}$, if and only if $U$ does not contain the trivial representation. Applying the second quantization functor, we then get a local conformal precosheaf of von Neumann algebras acting on the Fock space $e^{\mathcal{H}}$.

Now we observe that we may extend the lowest weight representations (with integral $\alpha = n$) described in Proposition 2.2 to an (anti-)representation of Möb (cf. Corollary 2.4) in a coherent way, e.g. choosing the complex conjugation $C$ for any $\alpha$ as in the proof of Corollary 2.4, and we get a family of conformal precosheaves $I \to \mathcal{K}_n(I)$ of standard spaces on $S^1$. The groups $e^{tE}$ and $e^{tT}$ have a unique fixed point, namely $\{\infty\}$, in $S^1$, and we may therefore identify $S^1 \setminus \{\infty\}$ with $\mathbb{R}$. Then, since the modular groups of the half-lines do not depend on $n$ by construction, we get $\mathcal{K}_n(I) = \mathcal{K}_1(I)$ when $I$ is a half-line, namely when $\{\infty\}$ is one of its edges.

Theorem 2.5. With the above notations, let $I \to \mathcal{K}(I)$ be a conformal precosheaf of standard subspaces of $\mathcal{H}$ on $S^1$ such that $\mathcal{K}(I) = \mathcal{K}_1(I)$ for any half-line $I$. Then there exists $n \in \mathbb{N}$ such that $\mathcal{K}(I) = \mathcal{K}_n(I)$ for any interval $I$.

Proof. By the Bisognano–Wichmann Theorem for conformal precosheaves on $S^1$ (cf. [5, 14]) we get that $U|_{P_0}$ coincides with the restriction to $P_0$ of the $n = 1$ lowest weight representation of $PSL(2,\mathbb{R})$. Therefore, since $U$ is a positive energy representation, it should be of the form described in Proposition 2.2.

Suppose now we start with the unique irreducible positive energy unitary representation $U$ of the translation-dilation group, with non-trivial restriction to the translation subgroup, on a Hilbert space $\mathcal{H}$. According to [6], we may then consider the associated precosheaf of standard subspaces $I \to \mathcal{K}(I)$ on the half-lines $I \subset \mathbb{R}$. The following corollary summarizes some properties discussed in Sect. 1 and some results of the previous subsection.

Corollary 2.6. Let $I \to \mathcal{K}(I)$ be the above described precosheaf on the half-lines of $\mathbb{R}$. Then there exists a bijective correspondence between
• Extensions of $\mathcal{K}$ to a local conformal precosheaf on the intervals of $S^1$.
• Real standard subspaces of $\mathcal{K}(-\infty, 1)' \cap \mathcal{K}(0, \infty)$ that are −halfsided invariant w.r.t. the subgroup of dilations centered in 0 and +halfsided invariant w.r.t. the subgroup of dilations centered in 1.
• The real linear spaces $\mathcal{K}_n(0, 1)$, $n \in \mathbb{N}$.

2.3. Multiplicative perturbations: a formula for the canonical endomorphism.

We now give an alternative way to pass from the representation of lowest weight 1 to the representation with lowest weight $\alpha \geq 1$. In this subsection we denote by $E, T, S$ the Lie algebra generators in the lowest weight 1 representation, and by $E, T, S_\alpha$ the corresponding generators in the lowest weight $\alpha$ case. Instead of defining the generator $S_\alpha$ as $S - \lambda T^{-1}$, $\lambda = \alpha(\alpha - 1)$, we will define the unitary $R_\alpha$ corresponding to the ray inversion or, equivalently, the unitary
$$\gamma = \gamma_\alpha = R_\alpha R = J_\alpha J, \tag{11}$$

where $J$, resp. $J_\alpha$, is the modular conjugation of $\mathcal{K}(-1, 1)$, resp. $\mathcal{K}_\alpha(-1, 1)$, as $J = CR$ and $J_\alpha = CR_\alpha$ with the same anti-unitary conjugation commuting with them. In the examples below the second quantization of $\gamma$ will implement the canonical endomorphism of the inclusion of algebras $\mathcal{A}_\alpha(-1, 1) \subset \mathcal{A}(-1, 1)$ given by the $(\alpha - 1)$-derivative of the current algebra (in case $\alpha$ is an integer).

We now make some formal motivating calculations, which may however be given a rigorous meaning. First note that $\gamma$ commutes with $E$, because both $J$ and $J_\alpha$ commute with $E$, hence $\gamma$ must be a bounded Borel function of $E$,
$$\gamma = f_\alpha(E), \tag{12}$$

because the bounded Borel functions of $E$ form a maximal abelian von Neumann algebra. In order to determine $f = f_\alpha$, note that the formulas
$$RTR = S, \tag{13}$$
$$R_\alpha T R_\alpha = S_\alpha = S - \lambda T^{-1} \tag{14}$$
imply $\gamma^* T \gamma = T - \lambda R T^{-1} R$, hence
$$\gamma^* T \gamma T^{-1} = 1 - \lambda R T^{-1} R T^{-1} = 1 - \lambda (TRTR)^{-1}. \tag{15}$$
On the other hand, by (10),
$$TRTR = TS = E(E - 1), \tag{16}$$
and since $TET^{-1} = E - 1$, thus
$$T f(E) T^{-1} = f(E - 1), \tag{17}$$
formula (15) implies that $f$ satisfies the functional equation
$$\frac{f(z - 1)}{f(z)} = 1 - \frac{\lambda}{z(z - 1)} \tag{18}$$

and $|f(z)| = 1$ for all $z \in i\mathbb{R}$.

Proposition 2.7. If $\alpha = n$ is an integer, then
$$\gamma = \frac{(E - 1)(E - 2) \cdots (E - n + 1)}{(E + 1)(E + 2) \cdots (E + n - 1)}. \tag{19}$$
In the general case, $\gamma = f_\alpha(E)$ with
$$f_\alpha(z) = \frac{\Gamma(z + 1)\Gamma(z)}{\Gamma(z + \alpha)\Gamma(z - \alpha + 1)}, \tag{20}$$
where $\Gamma$ is the Euler Gamma-function.

Proof. Let $\gamma_\alpha$ be given by the formula (19). In order to check that $\gamma_\alpha$ gives (up to a phase) the unitary (11) it is enough to check that
$$\gamma_\alpha E \gamma_\alpha^* = R_\alpha E R_\alpha = E, \tag{21}$$
$$\gamma_\alpha S \gamma_\alpha^* = R_\alpha T R_\alpha = S_\alpha, \tag{22}$$
because the representation generated by $E$ and $S$ is irreducible, see (10) and the remarks below it. The first equation is obvious because $\gamma_\alpha$ is a function of $E$. To verify the second equation we notice that from $S = RTR$ we get that $S$ is positive and non-singular, and (10), together with $[T, S] = -2E$, shows that this also holds for $E(E + 1) = ST$. Since $SES^{-1} = E + 1$, the functional equation for $f_\alpha$ implies
$$f_\alpha(E)\, S\, f_\alpha(E)^* = f_\alpha(E) f_\alpha(E + 1)^* S = \left(1 - \frac{\lambda}{E(E + 1)}\right) S = S - \lambda T^{-1} = S_\alpha.$$
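The algebra behind Proposition 2.7 can be spot-checked numerically. The sketch below replaces the operator $E$ by a number $z$ and tests three things: that the Gamma-function expression (20) satisfies the functional equation (18), that for integer $\alpha = n$ it reduces to the product (19), and that the product (19) has modulus 1 on the imaginary axis. The function names and the sample points are choices made here for illustration; real sample points are used for (20) only because the standard-library `math.gamma` accepts real arguments, and the modulus check is carried out for the integer case only.

```python
import math

def f_gamma(z, alpha):
    """Formula (20) with E replaced by a real number z."""
    return (math.gamma(z + 1) * math.gamma(z)
            / (math.gamma(z + alpha) * math.gamma(z - alpha + 1)))

def f_product(z, n):
    """Formula (19) with E replaced by a (real or complex) number z."""
    value = 1
    for k in range(1, n):
        value *= (z - k) / (z + k)
    return value

def functional_equation_residual(z, alpha):
    """Left side minus right side of (18), with lambda = alpha (alpha - 1)."""
    lam = alpha * (alpha - 1)
    return f_gamma(z - 1, alpha) / f_gamma(z, alpha) - (1 - lam / (z * (z - 1)))

# (20) satisfies (18) at a sample point with non-integer alpha.
residual = functional_equation_residual(5.3, 2.4)

# For integer alpha = n, (20) agrees with the product (19).
difference = f_gamma(5.3, 3) - f_product(5.3, 3)

# On the imaginary axis each factor (z - k)/(z + k) has modulus 1.
modulus = abs(f_product(0.7j, 4))

print(abs(residual) < 1e-9, abs(difference) < 1e-9, abs(modulus - 1) < 1e-9)
```

A numerical check at a few points is of course no substitute for the argument in the proof; it only guards against sign or index slips when the formulas are transcribed.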

2.4. Lowest weight representations of PSL(2, R) and derivatives of the U(1)-current.

On the space $C^\infty(S^1, \mathbb{R})$ of real valued smooth functions on the circle $S^1$, we consider the seminorm
$$\|\varphi\|^2 = \sum_{k=1}^{\infty} k\,|\hat\varphi_k|^2$$
and the operator $I$: $\widehat{(I\varphi)}_k = -i\,\mathrm{sign}(k)\,\hat\varphi_k$, where the $\hat\varphi_k$'s denote the Fourier coefficients of $\varphi$. Since $I^2 = -1$ and $I$ is an isometry w.r.t. $\|\cdot\|$, $(C^\infty(S^1, \mathbb{R}), I, \|\cdot\|)$ becomes a complex vector space with a positive bilinear form, defined by polarization. Thus, taking the quotient by constant functions and completing, we get a complex Hilbert space $\mathcal{H}$. We note that the symplectic form $\omega$ may be written as
$$\omega(f, g) = \mathrm{Im}(f, g) = \frac{-i}{2} \sum_{k \in \mathbb{Z}} k\,\hat f_{-k}\,\hat g_k = \frac{1}{2} \int_{S^1} g\,df.$$
One might recognize this form as coming from the commutation relations for U(1)-currents.

The natural action of $PSL(2,\mathbb{R})$ on $S^1$ gives rise to a unitary representation on $\mathcal{H}$: $U(g)\varphi(t) = \varphi(g^{-1}t)$. Then, observing that $I\cos kt = \sin kt$ for $k \geq 1$, it is easy to see that $\cos kt$ is an eigenvector of the rotation subgroup $U(\theta)$:
$$U(\theta)\cos kt = \cos k(t - \theta) = (\cos k\theta + \sin k\theta\, I)\cos kt = e^{ik\theta}\cos kt, \qquad k \geq 1,$$
and that all the eigenvectors have this form. Therefore the representation has lowest weight 1.

We need another description of the Hilbert space $\mathcal{H}$ which is more suitable to be generalized. First we choose another coordinate on $S^1$, namely $x = \tan(t/2)$, $x \in \mathbb{R}$, and therefore identify $C^\infty(S^1, \mathbb{R})$ with $C^\infty(\dot{\mathbb{R}})$, $\dot{\mathbb{R}}$ being the one-point compactification of $\mathbb{R}$. Since the symplectic form is the integral of a differential form, it does not depend on the coordinate:
$$\omega(f, g) = \frac{1}{2} \int_{\mathbb{R}} g(x)\,df(x).$$
A computation shows that the anti-unitary $I$ applied to a function $f$ coincides up to an additive constant with the convolution of $f$ with the distribution $1/(x + i0)$ on $\mathbb{R}$, therefore, since the symplectic form is trivial on the constants, the (real) scalar product may be written as
$$\langle f, g\rangle = \omega(f, Ig) = \frac{1}{2} \int \left(\frac{1}{x + i0} * g\right)(x)\, f'(x)\,dx = \frac{1}{4\pi} \iint \frac{f(x)g(y)}{|x - y + i0|^2}\,dx\,dy = \mathrm{const} \int_0^\infty p\,\hat f(-p)\,\hat g(p)\,dp \tag{23}$$
and $\mathcal{H}$ may be identified with the completion of $C^\infty(\dot{\mathbb{R}})$ w.r.t. this norm.

Note that since $If = -if$ if $\mathrm{supp}\,\hat f \subset [0, +\infty)$, $\mathcal{H}$ is also the completion of $C^\infty(\mathbb{R}, \mathbb{C})$ modulo $\{f : \hat f|_{(-\infty, 0]} = 0\}$ with scalar product $(f, g) = \int_0^\infty p\,\hat f(p)\,\hat g(p)\,dp$.

Let us now consider the space $X^n := C^\infty(\dot{\mathbb{R}}) + \mathbb{R}_{2(n-1)}[x]$, $n \geq 1$, where $\mathbb{R}_p[x]$ denotes the space of real polynomials of degree $p$, and the bilinear form on it given by
$$\langle f, g\rangle_n = \frac{1}{4\pi} \iint \frac{f(x)g(y)}{|x - y + i0|^{2n}}\,dx\,dy.$$
It turns out that $\langle \cdot, \cdot\rangle_n$ is a well defined positive semi-definite bilinear form on $X^n$ which degenerates exactly on $\mathbb{R}_{2(n-1)}[x]$. On this space one may define also a symplectic form by
$$\omega_n(f, g) = \frac{1}{2} \iint f(x)g(y)\,\delta_0^{(2n-1)}(x - y)\,dx\,dy.$$
This form might be read as the restriction of $\omega_1$ to the $n$th derivatives. Therefore we can recognize this symplectic form as coming from the commutation relations for the $n$th derivatives of U(1)-currents. This form again degenerates exactly on $\mathbb{R}_{2(n-1)}[x]$, and the operator $I$ defined before connects the positive form with the symplectic form for any $n$ in such a way that $(\cdot, \cdot)_n := \langle \cdot, \cdot\rangle_n + i\omega_n(\cdot, \cdot)$ becomes a complex bilinear form on $(X^n, I)$. We shall denote by $\mathcal{H}^n$ the complex Hilbert space obtained by completing the quotient $X^n / \mathbb{R}_{2(n-1)}[x]$.

With any matrix $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $SL(2,\mathbb{R})$ we may associate the rational transformation $x \to gx = \frac{ax + b}{cx + d}$ and then, for any $n \geq 1$, the operators $U^n(g)$ on $X^n$:
$$U^n(g)f(x) = (cx - a)^{2(n-1)} f(g^{-1}x).$$
It turns out that $g \to U^n(g)$ is a representation of $PSL(2,\mathbb{R})$, $n \geq 1$, and that the positive form is preserved (cf. [38]) as well as the symplectic form and the operator $I$, therefore $U^n$ extends to a unitary representation of $PSL(2,\mathbb{R})$ on $\mathcal{H}^n$. We remark that while $X^n$ and $\mathbb{R}_{2(n-1)}[x]$ are globally preserved by $U^n$, the space $C^\infty(\dot{\mathbb{R}})$ is not, and that explains why the space $X^n$ had to be introduced.

By definition the space $\mathcal{H}^1$ coincides with the space $\mathcal{H}$ and the representation $U^1$ with the representation $U$, which we proved to be lowest weight 1.
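That $g \to U^n(g)$ respects matrix multiplication can be illustrated on a concrete example. The sketch below evaluates $U^n(g)U^n(h)f$ and $U^n(gh)f$ at one rational point, using $g^{-1}x = (dx - b)/(-cx + a)$ for $\det g = 1$. The matrices `g`, `h`, the polynomial `f` and the point `x0` are arbitrary sample data chosen here, not taken from the paper, and agreement at sample points is only a sanity check of the cocycle factor $(cx - a)^{2(n-1)}$, not a proof.

```python
from fractions import Fraction

def mat_mul(g, h):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

def U(n, g, f):
    """Return the function U^n(g) f, where
    U^n(g) f(x) = (c x - a)^(2(n-1)) f(g^{-1} x) and det g = 1 is assumed."""
    (a, b), (c, d) = g
    def image(x):
        factor = (c * x - a) ** (2 * (n - 1))
        # g^{-1} x = (d x - b) / (-c x + a)
        return factor * f((d * x - b) / (a - c * x))
    return image

# Sample data (hypothetical, chosen so that no denominator vanishes).
g = ((2, 1), (1, 1))          # det = 1
h = ((1, 2), (0, 1))          # det = 1
f = lambda x: x**3 - 2 * x + 5
x0 = Fraction(3, 7)

def homomorphism_defect(n):
    """U^n(g) U^n(h) f - U^n(gh) f evaluated at x0; zero if U^n is multiplicative."""
    return U(n, g, U(n, h, f))(x0) - U(n, mat_mul(g, h), f)(x0)

print([homomorphism_defect(n) == 0 for n in range(1, 5)])
```

Since the exponent $2(n-1)$ is even, the factor is insensitive to the overall sign of the matrix, which is consistent with $U^n$ being defined on $PSL(2,\mathbb{R})$.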
We observe that, for functions in $C^\infty(\dot{\mathbb{R}})$, one gets
$$\langle f, g\rangle_n = \frac{1}{2} \int_{\mathbb{R}} |p|^{2n-1}\,\hat f(-p)\,\hat g(p)\,dp, \qquad \omega_n(f, g) = \frac{1}{2} \int_{\mathbb{R}} p^{2n-1}\,\hat f(-p)\,\hat g(p)\,dp,$$
hence $(f, g)_n = (D^{n-1}f, D^{n-1}g)_1$, i.e. $D^{n-1}$ is a unitary between $\mathcal{H}^n$ and $\mathcal{H}^1 \equiv \mathcal{H}$, where $D$ is the derivative operator. The following holds:

Theorem 2.8. The representation $U^n$ has lowest weight $n$.

Proof. Making use of the results of Proposition 2.7, we have to show that
$$R_n R = \prod_{k=1}^{n-1} \frac{E - k}{E + k}, \qquad n \geq 1,$$
where $R_n = D^{n-1}U^n(r)(D^{n-1})^*$ with $r$ the ray inversion, $R = R_1$. This amounts to proving
$$D^{n-1}U^n(r) = \prod_{k=1}^{n-1} \frac{E - k}{E + k}\; U(r)D^{n-1}. \tag{24}$$
Now we take Eq. (24) as an inductive hypothesis. Then Eq. (24) for $n + 1$ can be rewritten, using the inductive hypothesis and the relation $U^{n+1}(r) = x^2 U^n(r)$, as
$$D^n(x^2 U^n(r)) = \frac{E - n}{E + n}\, D^{n-1}U^n(r)D. \tag{25}$$
Finally we observe that $U^n(r)D = x^2 D U^n - 2(n - 1)xU^n$, hence Eq. (25) is equivalent to
$$(E + n)D^n(x^2\,\cdot) = (E - n)D^{n-1}(x^2 D\,\cdot - 2(n - 1)x\,\cdot). \tag{26}$$
Since $E = -xD$, Eq. (26) follows by a straightforward computation.
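The "straightforward computation" behind Eq. (26) can be carried out on monomials, since both sides are linear and map $x^k$ to a multiple of $x^{k+2-n}$. The sketch below compares the two integer coefficients, using that $E = -xD$ acts on $x^j$ as multiplication by $-j$. The helper names are illustrative choices made here.

```python
def falling(k, r):
    """k (k-1) ... (k-r+1): the coefficient of x^(k-r) in D^r x^k
    (it vanishes automatically when r > k >= 0)."""
    out = 1
    for i in range(r):
        out *= (k - i)
    return out

def lhs_coeff(n, k):
    """Coefficient of x^(k+2-n) in (E + n) D^n (x^2 * x^k)."""
    j = k + 2 - n
    return (-j + n) * falling(k + 2, n)

def rhs_coeff(n, k):
    """Coefficient of x^(k+2-n) in (E - n) D^(n-1) (x^2 D x^k - 2(n-1) x x^k).
    Here x^2 D x^k - 2(n-1) x^(k+1) = (k - 2(n-1)) x^(k+1)."""
    j = k + 2 - n
    return (-j - n) * (k - 2 * (n - 1)) * falling(k + 1, n - 1)

identity_holds = all(
    lhs_coeff(n, k) == rhs_coeff(n, k)
    for n in range(1, 8)
    for k in range(0, 15)
)
print(identity_holds)
```

Both coefficients equal $(2n - k - 2)(k + 2)$ times the falling factorial $(k+1)k\cdots(k-n+3)$, which is the closed form one obtains by hand.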

Proposition 2.9. The unitary representations of $PSL(2,\mathbb{R})$ on $\mathcal{H}$ given by
$$D^{n-1}U^n(D^{n-1})^*, \qquad n \geq 1,$$
coincide when restricted to the subgroup of translations and dilations on $\mathbb{R}$.

Proof. We have to prove that $D^{n-1}U^n(g) = U(g)D^{n-1}$ when $g$ is a translation or a dilation. For translations, $U^n(t)f(x) = f(x - t)$, and the equality is obvious; for dilations, $g = \begin{pmatrix} e^{\lambda/2} & 0 \\ 0 & e^{-\lambda/2} \end{pmatrix}$, the definition of $U^n$ gives $U^n(\lambda)f(x) = e^{\lambda(n-1)} f(e^{-\lambda}x)$, hence $D^{n-1}U^n(\lambda)f(x) = f^{(n-1)}(e^{-\lambda}x) = U(\lambda)D^{n-1}f(x)$.

The family of representations $D^{n-1}U^n(D^{n-1})^*$ on the Hilbert space $\mathcal{H}$ constitutes a concrete realization of the family of (integral) lowest weight representations described in Proposition 2.2, therefore we may construct a family of local conformal precosheaves of standard subspaces of $\mathcal{H}$ as explained in Subsect. 2. In the next subsection we shall give another description of these precosheaves, showing that they coincide with the ones described in [38].

2.5. Relations among local spaces.

Let us fix an $n \geq 1$ and, for any proper interval $I$ of $\dot{\mathbb{R}}$, let us set
$$X^n(I) = \{f \in X^n : f|_{I'} \equiv 0\}.$$
It is easy to check that these spaces satisfy the properties
1. $I_1 \subset I_2 \Longrightarrow X^n(I_1) \subset X^n(I_2)$ (isotony),
2. $I_1 \cap I_2 = \emptyset \Longrightarrow X^n(I_1) \subset X^n(I_2)'$ (locality),
3. $U^n(g)X^n(I) = X^n(gI)$, $\forall g \in PSL(2,\mathbb{R})$ (covariance),
and that the immersion $i^I_n : X^n(I) \to \mathcal{H}^n$ is injective. Therefore the spaces $\mathcal{K}^n(I) := (i^I_n X^n(I))^-$, where the closure is taken w.r.t. $\|\cdot\|_n$, form a local conformal precosheaf of subspaces of $\mathcal{H}^n$, and the following property obviously holds:
4. $\bigvee_{I \subset \dot{\mathbb{R}}} \mathcal{K}^n(I) = \mathcal{H}^n$ (irreducibility).

Therefore, by the first quantization version of results mentioned in Sect. 1, these spaces are standard, and the Bisognano–Wichmann property and duality on the circle hold.

Now we identify $\mathcal{H}^n$ with $\mathcal{H}$ via the unitary $D^{n-1}$, and set $\mathcal{K}_n(I) := D^{n-1}\mathcal{K}^n(I)$. Then, if $I$ is compact in $\mathbb{R}$ and $f \in \mathcal{K}_n(I)$, $f$ may be integrated $n - 1$ times, giving a function which still has support in $I$, therefore
$$\mathcal{K}_n(I) = \Big\{[f] \in \mathcal{H} : f|_{I'} = 0,\ \int t^j f = 0,\ j = 0, \ldots, n - 2\Big\}, \qquad I \subset\subset \mathbb{R}, \tag{a}$$
where $[f]$ denotes the equivalence class of $f$ modulo polynomials. If $I$ is a half-line in $\mathbb{R}$, $\mathcal{K}_n(I)$ is an invariant subspace of the dilation subgroup, which is the modular group of $\mathcal{K}(I)$. Using Takesaki's result, see [32], this implies that
$$\mathcal{K}_n(I) = \mathcal{K}(I), \qquad I \text{ a half-line}. \tag{b}$$
(For an alternative proof of this fact, see [38].) Then, by duality and the formula for the compact case, we obtain
$$\mathcal{K}_n(I) = \{[f] \in \mathcal{H} : f|_{I'} = p_{f,I} \in \mathbb{R}_{n-1}[x]\}, \qquad I' \subset\subset \mathbb{R}. \tag{c}$$
Finally we observe that, since the Bisognano–Wichmann property holds, these precosheaves coincide with those abstractly constructed in Subsect. 2.

Now, we fix a bounded interval in $\mathbb{R}$, e.g. $(-1, 1)$, and consider the family $\mathcal{K}_n := \mathcal{K}_n((-1, 1))$. The concrete characterization of $\mathcal{K}_n$ given in the preceding subsection shows that $\mathcal{K}_m \subseteq \mathcal{K}_n$ if $m \geq n$. Now, we may show

Theorem 2.10. The following dimensional relations hold:
$$\mathrm{codim}(\mathcal{K}_m \subset \mathcal{K}_n) = m - n, \qquad m \geq n,$$
$$\dim(\mathcal{K}_m' \cap \mathcal{K}_n) = \max(m - n - 1,\ 0).$$

Before proving Theorem 2.10, we discuss some of its consequences.

Definition 1. A precosheaf $\mathcal{K}$ is said to be $n$-regular if, for any partition of $S^1$ into $n$ intervals $I_1, \ldots, I_n$, the linear space $\bigvee_{j=1}^{n} \mathcal{K}(I_j)$ is dense in $\mathcal{H}$.

We recall that irreducible conformal precosheaves are 2-regular, because duality holds and local algebras are factors.

Corollary 2.11. The conformal precosheaf $\mathcal{K}_1$ is $n$-regular for any $n$. The conformal precosheaf $\mathcal{K}_2$ is 3-regular but it is not 4-regular. The conformal precosheaves $\mathcal{K}_n$, $n \geq 3$, are not 3-regular. Moreover, strong additivity and duality on the line hold for the precosheaf $I \to \mathcal{K}_1(I)$ only, therefore it is the dual precosheaf of $I \to \mathcal{K}_n(I)$ for any $n$.


Proof. First we recall that a precosheaf is strongly additive if and only if it coincides with its dual precosheaf. Then, the precosheaf $\mathcal{K} \equiv \mathcal{K}_1$ is strongly additive because its dual net should be of the form $\mathcal{K}_n$ (cf. Corollary 2.6) and should satisfy $\mathcal{K}^d(-1, 1) \supseteq \mathcal{K}(-1, 1)$. As a consequence, $\mathcal{K}$ is $n$-regular for any $n$. Then, since the spaces for the half-lines do not depend on $n$, the dual net of $\mathcal{K}_n$ does not depend on $n$ either, hence coincides with $\mathcal{K}$.

Since $PSL(2,\mathbb{R})$ acts transitively on the triples of distinct points, we may study 3-regularity for the special triple $(-1, 1, \infty)$ in $\mathbb{R} \cup \{\infty\}$. Then,
$$(\mathcal{K}_n(\infty, -1) \vee \mathcal{K}_n(-1, 1) \vee \mathcal{K}_n(1, \infty))' = (\mathcal{K}_1(\infty, -1) \vee \mathcal{K}_n(-1, 1) \vee \mathcal{K}_1(1, \infty))' = (\mathcal{K}_1(-1, 1)' \vee \mathcal{K}_n(-1, 1))' = \mathcal{K}_n(-1, 1)' \wedge \mathcal{K}_1(-1, 1),$$
where we used strong additivity and duality for $\mathcal{K}_1$. By Theorem 2.10, 3-regularity holds if and only if $n = 1, 2$.

Violation of 4-regularity for $\mathcal{K}_2$ may be proved by exhibiting a function which is localized in the complement of any of the intervals $(\infty, -1)$, $(-1, 0)$, $(0, 1)$, $(1, \infty)$, i.e. belongs to $\mathcal{K}_2(-1, 0)' \cap \mathcal{K}_2(0, 1)' \cap \mathcal{K}_1(-1, 1)$:
$$\varphi(x) = \begin{cases} 1 + x & \text{if } -1 \leq x \leq 0, \\ 1 - x & \text{if } 0 \leq x \leq 1, \\ 0 & \text{if } |x| \geq 1. \end{cases}$$
In the same way we may construct a function which violates 3-regularity for $\mathcal{K}_3$, namely
$$\varphi(x) = \begin{cases} x^2 - 1 & \text{if } |x| < 1, \\ 0 & \text{if } |x| \geq 1. \end{cases}$$
Clearly, $\varphi \in \mathcal{K}_3(\infty, -1)' \cap \mathcal{K}_3(-1, 1)' \cap \mathcal{K}_3(1, \infty)' = \mathcal{K}_3' \cap \mathcal{K}_1$.

Lemma 2.12. $\mathrm{codim}(\mathcal{K}_{m+1} \subset \mathcal{K}_m) = 1$.

Proof. Since $\mathcal{K}_{m+1} = \{\varphi \in \mathcal{K}_m : \int x^{m-1}\varphi(x)\,dx = 0\}$, and we may find a function $\psi_{m-1} \in C_0^\infty(\mathbb{R})$ with $\psi_{m-1}'(x) = x^{m-1}$, $x \in (-1, 1)$, we get $\mathcal{K}_{m+1} = \{\varphi \in \mathcal{K}_m : \omega(\psi_{m-1}, \varphi) = 0\}$. Because the functional $\varphi \to \omega(\psi_{m-1}, \varphi)$ is continuous and non zero on $\mathcal{K}_m$, the thesis follows.

Proof of Theorem 2.10. The first statement of the theorem easily follows from Lemma 2.12. Now, let us consider the relative commutants $\mathcal{K}_{m+p}' \cap \mathcal{K}_m$. We observe that, by the Poincaré inequality, the norm on $\mathcal{K}_1$ is equivalent to the Sobolev norm for the space $H^{1/2}$, i.e., we may identify $\mathcal{K}_1 \simeq H^{1/2}(-1, 1)$ as real Hilbert spaces. We also recall that the Dirac measure $\delta$ does not belong to $H^{-1/2}$, but belongs to $H^{-1/2-\varepsilon}$ for each $\varepsilon > 0$ (see, e.g., [33]). Then
$$\mathcal{K}_{m+p}' \cap \mathcal{K}_m = \{\varphi \in \mathcal{K}_m : \langle \varphi', \psi\rangle = 0\ \ \forall \psi \in \mathcal{K}_{m+p}\}$$
$$= \Big\{\varphi \in H^{1/2}(-1, 1) : \int t^j\varphi = 0,\ j = 0, \ldots, m - 2,\ \ \langle \varphi^{(m+p)}, \psi\rangle = 0\ \ \forall \psi \in H^{m+p-1/2}(-1, 1)\Big\}.$$
Then $f := \varphi^{(m+p)} \in H^{1/2-m-p}\{-1, 1\}$, i.e., $f$ should be a combination of Dirac's $\delta$ measures with supports in $\{-1, 1\}$ and their derivatives. Since $f \in H^{1/2-m-p}$, it has the form $f = \sum_{j=0}^{m+p-2} (c_j \delta^{(j)}_{(-1)} + d_j \delta^{(j)}_{(1)})$. The condition $\varphi \in \mathcal{K}_m$ may be written as
$$\langle f, t^q\rangle = 0, \qquad q = 0, \ldots, 2m + p - 2. \tag{27}$$
The dimension of $\mathcal{K}_{m+p}' \cap \mathcal{K}_m$ will be the difference between the dimension of the space $\Delta := \{\sum_{j=0}^{m+p-2} (c_j \delta^{(j)}_{(-1)} + d_j \delta^{(j)}_{(1)}) : c_j, d_j \in \mathbb{R}\}$, which is $2(m + p - 1)$, and the number of independent conditions in Eq. (27). We may also write the conditions in Eq. (27) as
$$\langle f, P\rangle = 0 \quad \text{where } P \text{ is a polynomial of degree } 2m + p - 2.$$
They are independent if the only polynomial $P$ of degree $\leq 2m + p - 2$ satisfying $\langle f, P\rangle = 0$ for any $f \in \Delta$ is the null polynomial. Indeed, such a polynomial should have zeroes of multiplicity at least $m + p - 1$ at the points $-1$ and $1$, therefore either $p = 0$, and then there exists (up to a scalar factor) exactly one non-trivial such polynomial, or $p > 0$, and the null polynomial is the unique solution. In conclusion, if $p > 0$, the conditions in Eq. (27) are independent, and the dimension of $\mathcal{K}_{m+p}' \cap \mathcal{K}_m$ is $2(m + p - 1) - (2m + p - 1) = p - 1$. If $p = 0$ the independent conditions in Eq. (27) are $2m - 2$, and the dimension is $2(m - 1) - (2m - 2) = 0$, which corresponds to the general fact (see [17]) that local algebras of irreducible conformal theories are factors.
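The dimension count at the end of the proof is a finite linear-algebra statement and can be reproduced exactly. The sketch below builds the matrix of the conditions (27), with one column per unknown coefficient $c_j$, $d_j$ and one row per monomial $t^q$, and computes its rank over the rationals. It assumes the usual distributional convention $\langle \delta^{(j)}_a, P\rangle = (-1)^j P^{(j)}(a)$; the sign convention does not affect the rank. All function names are illustrative choices made here.

```python
from fractions import Fraction

def falling(q, j):
    """q (q-1) ... (q-j+1), the coefficient produced by (d/dt)^j t^q."""
    out = 1
    for i in range(j):
        out *= (q - i)
    return out

def pairing(point, j, q):
    """<delta^{(j)} at `point`, t^q> = (-1)^j (d/dt)^j t^q evaluated at `point`."""
    if q < j:
        return 0
    return (-1) ** j * falling(q, j) * point ** (q - j)

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def relative_commutant_dim(m, p):
    """Dimension of the solution space of (27): number of unknowns c_j, d_j
    (j = 0..m+p-2) minus the rank of the conditions (q = 0..2m+p-2)."""
    unknowns = [(pt, j) for pt in (-1, 1) for j in range(m + p - 1)]
    conditions = [[pairing(pt, j, q) for (pt, j) in unknowns]
                  for q in range(2 * m + p - 1)]
    return len(unknowns) - rank(conditions)

table = {(m, p): relative_commutant_dim(m, p)
         for m in range(1, 5) for p in range(0, 5)}
print(all(dim == max(p - 1, 0) for (m, p), dim in table.items()))
```

With $m = 1$ this reproduces the count used in Corollary 2.11: the space $\mathcal{K}_n' \cap \mathcal{K}_1$ has dimension $\max(n - 2, 0)$, which vanishes exactly for $n = 1, 2$.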

3. Examples of Superselection Sectors for the First Derivative of the U (1)-current In this section we shall discuss examples of superselection sectors of the first-derivative theory. All these sectors are abelian, i.e. are equivalence classes of automorphisms, and we will see that they are non covariant under the conformal group. In particular, recalling that the first-derivative precosheaf is 3-regular but not 4-regular, this shows that the assumption of 4-regularity in [16] cannot be avoided in general in order to obtain the automatic covariance of superselection sectors. As we shall see, all these sectors will be obtained by generalizing methods of the Buchholz-Mack-Todorov approach to sectors (see [7]). On the one hand the conformal net on R associated with the current algebra contains as a subnet the one associated with the first (nth ) derivative of the current algebra (cf. Corollary 2.11 and the Remark after Corollary 1.5), therefore BMT sectors may be restricted to the conformal net on R associated with the first (nth ) derivative of the current algebra. The sectors described here will be extensions (of such restrictions) to the conformal precosheaf on S 1 associated with the first (nth ) derivative of the current algebra. On the other hand they may be seen as sectors on a suitable global algebra in a way which is formally identical to the BMT procedure. Now let A be a local conformal precosheaf on R and Ad its Bisognano–Wichmann dual net. Consider the unitary 0 = JJd , where J and Jd are the modular conjugations of A(−1, 1) and Ad (−1, 1) with respect to the vacuum vector , in other words 0 is the product of the two ray inversion unitaries of the nets A and Ad . The unitary 0 implements the canonical endomorphism γ of Ad (−1, 1) into A(−1, 1). Let now ρ be a morphism of A. We define the “extension” of ρ to Ad by ρ˜ = Ad0∗ ργ . If I contains the origin (possibly at the boundary), γ sends Ad (I) into A(I):

$$\mathrm{Ad}\,\Gamma(\mathcal{A}^d(I)) = J\mathcal{A}^d(\hat I)'J \subset J\mathcal{A}(\hat I)'J = \mathcal{A}(I), \tag{28}$$

where $\hat I$ is the image of $I$ under the ray inversion map. Therefore, if $\{\rho_I\}$ is the family of representations defining $\rho$ and $I$ is an interval containing the origin, then $\tilde\rho_I = \gamma^{-1}\rho_I\gamma$ gives a representation of $\mathcal{A}(I)$, and these representations are coherent. However, if $I$ does not contain the origin, there are two minimal intervals containing both $I$ and the origin, one in which they are in clockwise order and the other in which they are in counterclockwise order, and the two corresponding representations do not necessarily agree on the algebra of the intersection. If they do not, $\tilde\rho$ is not a representation of the dual precosheaf. Nevertheless, if the point at infinity is removed and $\rho$ is localized in a compact interval, only one choice remains, and we get a representation of the net $\mathcal{A}^d_0$ on the line. Clearly, equivalent endomorphisms give rise to equivalent representations, and if we choose the localization region $I_0$ not containing the origin, say $I_0 = (a,b)$, $b > a > 0$, then $\tilde\rho$ is localized in $(a,\infty)$. We have therefore shown that any transportable sector on $\mathcal{A}(I)$ gives rise to a (possibly solitonic) sector on $\mathcal{A}^d_0(I)$. (In this lower dimensional theory one might also interpret these sectors as coming from order variables, [30], Chapter 3.8.) In Subsect. 3 we show examples of this phenomenon.

Conversely, if we assume that the two above-mentioned representations agree, we get a representation of the precosheaf $\mathcal{A}^d$ and, assuming again that the localization interval $I_0$ does not contain the origin, $\tilde\rho$ is localized in $I_0$. If we further assume that $\rho$ is covariant and has finite statistics, we obtain that $\tilde\rho$ has finite statistics too, because the index may be computed by looking at the endomorphisms of the von Neumann algebra $\mathcal{A}^d(0,\infty) = \mathcal{A}(0,\infty)$. Hence $\tilde\rho$ is covariant by the strong additivity of the dual net (see [16]), and this implies that $\rho$ and $\tilde\rho$ determine equivalent representations of the net $\mathcal{A}_0$ on the line.
In fact, by the construction of the dual net, the product $JJ_d$ of the modular conjugations for the interval $(-1,1)$ relative to the two theories coincides with the product of the two unitaries implementing the conformal transformation $t \to -1/t$. Then, denoting by $r$, $r_d$ the corresponding automorphisms and by $u_d$ and $u$ the unitaries such that $\mathrm{Ad}(u_d)\tilde\rho = r_d\tilde\rho r_d$ and $\mathrm{Ad}(u)\rho = r\rho r$, we have

$$\mathrm{Ad}(u_d)\tilde\rho = r_d\tilde\rho r_d = r\rho r = \mathrm{Ad}(u)\rho.$$

3.1. Buchholz, Mack and Todorov approach to sectors of the current algebra. We defined the one-particle space for the current algebra as the completion of the space $X = X^1 = C^\infty(\dot{\mathbb{R}})$ modulo constant functions w.r.t. the norm given in (23). We may then define $\mathcal{A}(S^1)$ as the $*$-algebra generated by $W(h)$, $h \in X$, with the relations $W(h)W(h)^* = 1$ (unitarity) and $W(h)W(k) = \exp\big(\tfrac{i}{2}\omega(h,k)\big)W(h+k)$ (CCR). BMT automorphisms of $\mathcal{A}(S^1)$ are then given in terms of differential forms $\phi$ on $S^1$. Setting

$$\alpha_\phi(W(h)) \doteq e^{i\int h\phi}\,W(h),$$

it is easy to see that $\alpha_\phi$ extends to an automorphism of $\mathcal{A}(S^1)$. By CCR, it follows that $\alpha_\phi$ is inner if and only if the form $\phi$ is exact, i.e. there exists a function $f \in X$ s.t. $\phi = df$, and that two automorphisms $\alpha_\phi$, $\alpha_\psi$ are equivalent if and only if $\int\phi = \int\psi$, i.e. if the two forms give the same cohomology class in $H^1(S^1)$. The constant $Q(\alpha_\phi) := \int\phi$ will be called the charge of $\alpha_\phi$.

For any open interval $I$ in $S^1$ we set $\mathcal{A}(I)$ to be the subalgebra of $\mathcal{A}(S^1)$ generated by the Weyl unitaries $W(h)$ such that the support of $h$ is contained in $I$. Clearly the algebras associated with disjoint intervals commute and $\beta_g\mathcal{A}(I) = \mathcal{A}(gI)$, where $\beta_g(W(f)) \doteq W(U(g)f)$.

We observe that BMT automorphisms are locally inner, i.e. for any interval $I$ and any form $\phi$ there exists a function $f$ with support in some larger interval $\hat I$ such that $df|_I \equiv \phi|_I$, therefore $\alpha_\phi|_{\mathcal{A}(I)} \equiv \mathrm{ad}\,W(f)|_{\mathcal{A}(I)}$. Also, the superselection sectors corresponding to a given charge are conformally covariant w.r.t. the adjoint action of the conformal group on $X$, i.e. the automorphisms $\alpha_\phi$ and $\beta_g\cdot\alpha_\phi\cdot\beta_g^{-1}$ are in the same class for any conformal transformation $g$. Indeed, since the class of inner automorphisms is globally stable under the action of the conformal group and the charge is additive, namely $Q(\alpha_\phi\circ\alpha_\psi) = Q(\alpha_\phi) + Q(\alpha_\psi)$, the action of $PSL(2,\mathbb{R})$ on BMT automorphisms gives a linear action on BMT charges, i.e. a one-dimensional linear representation of $PSL(2,\mathbb{R})$. Any such representation being trivial, BMT sectors are covariant.

Now we give a local description of these sectors. We observe that the second quantization algebra associated with the standard space $K(I)$ coincides with $\pi(\mathcal{A}(I))'' = \mathcal{R}(I)$, where $\pi$ is the vacuum representation of $\mathcal{A}(S^1)$ on the Fock space $e^{H}$. Moreover, the map $\pi|_{\mathcal{A}(I)}$ is faithful, and the restriction of $\alpha_\phi$ to $\mathcal{A}(I)$, being implemented in $\mathcal{A}(\hat I)$, uniquely extends to a normal automorphism of $\mathcal{R}(I)$. As a consequence, $\alpha_\phi$ gives rise to a representation $I \to \alpha_\phi^I$ of the precosheaf $\mathcal{A}$ in the sense of [17], where

$$\alpha_\phi^I(\pi(W(h))) = e^{i\int h_I\phi}\,W(h)$$

and, recalling that $h \in H$ is localized in $I$ if it is equal to a constant $c_I$ in $I'$, we have set $h_I \doteq h - c_I$.

We described BMT locally normal representations via automorphisms of $\mathcal{A}(S^1)$. Conversely, the global algebra $\mathcal{A}(S^1)$ plays the role of the universal algebra w.r.t. the family of the locally normal BMT representations, in the sense that the classes of such representations modulo unitary equivalence appear as classes of global automorphisms of $\mathcal{A}(S^1)$ up to inners.

3.2. Restriction of localized sectors.
As we have already seen, local algebras associated with compact intervals on the line for the first-derivative net may be described as

$$\mathcal{R}_2(I) \doteq \Big\{W(h) \in \mathcal{R}(I) : \int h(x)\,dx = 0\Big\}''.$$

Of course these algebras form a net of local algebras on the real line which is covariant with respect to the action of translations and dilations, but Haag duality does not hold on $\mathbb{R}$. The quasi-local algebra $\mathcal{A}_2(\mathbb{R})$ generated by the algebras of compact intervals is a subalgebra of the quasi-local algebra $\mathcal{A}(\mathbb{R})$ of the current algebra on the line. Therefore any BMT automorphism of $\mathcal{A}$ localized in some compact interval $I$ gives a representation of $\mathcal{A}_2(\mathbb{R})$ which is equivalent to the vacuum representation if and only if it has zero charge but, due to the failure of Haag duality, the intertwining unitary is not necessarily localized in $I$. Such a unitary exhibits instead a solitonic localization, i.e. it necessarily belongs to the von Neumann algebra of any half line containing the localization region. The restrictions of BMT sectors are then translation and dilation covariant.

On the contrary, if we consider classes of automorphisms of $\mathcal{A}_2(\mathbb{R})$ modulo inners, a new charge appears, i.e. two automorphisms $\alpha_\phi$ and $\alpha_\psi$ are equivalent if and only if both $\int\phi = \int\psi$ and $\int t\phi = \int t\psi$. As a consequence, such sectors are no longer translation covariant.

3.3. Conformal solitonic sectors. In the first-derivative theory, the automorphism $\alpha_\phi$ is localized in a compact interval $I$ of $\mathbb{R}$ when $\phi$ is constant outside $I$; therefore solitonic sectors on $\mathcal{A}(\mathbb{R})$ may become localized when restricted to $\mathcal{A}_2(\mathbb{R})$. This shows that, conversely, sectors on $\mathcal{A}_2$ may become solitonic when extended to $\mathcal{A}$ as described at the beginning of this section. Here we shall consider $\phi$ as a function on $\mathbb{R}$ rather than as a differential form, identifying $\phi(t)$ with $\phi(t)\,dt$. If $\phi$ is constant outside $I$, $\alpha_\phi$ is equivalent to the vacuum (as a representation) when both $\phi(+\infty) = \phi(-\infty)$ $(= 0)$ and $\int_{-\infty}^{+\infty}\phi = 0$. As a consequence, superselection sectors are described by two charges:

$$Q_0 = \phi(+\infty) - \phi(-\infty) = \int\phi'(t)\,dt, \qquad Q_1 = \int t\,\phi'(t)\,dt.$$

These sectors are clearly transportable, but a simple computation shows that they are covariant under translations and dilations if and only if $Q_0 = 0$, i.e. only if they are restrictions of BMT sectors. Restrictions to $\mathcal{A}_2(\mathbb{R})$ of solitonic sectors on $\mathcal{A}(\mathbb{R})$ then give an example of localized non-covariant sectors on the line. As we shall see, these sectors may be extended to transportable sectors on the circle.

3.4. Generalized BMT approach to sectors: Local description. We recall that the conformal precosheaf $\mathcal{R}_2$ of the first-derivative theory may be described as second quantization algebras on the same Fock space as the current algebra, $\mathcal{R}_2(I) = \{W(h) : h \in K_2(I)\}''$. We have seen that

$$K_2(I) = \Big\{[f] \in H : f|_{I'} \equiv p_{f,I},\ p_{f,I} \in \mathbb{R}_0[x],\ \int (f(x) - p_{f,I})\,dx = 0\Big\}, \qquad I \subset\subset \mathbb{R},$$

$$K_2(I) = \{[f] \in H : f|_{I'} \equiv p_{f,I},\ p_{f,I} \in \mathbb{R}_0[x]\}, \qquad I \text{ half line},$$

$$K_2(I) = \{[f] \in H : f|_{I'} \equiv p_{f,I},\ p_{f,I} \in \mathbb{R}_1[x]\}, \qquad I' \subset\subset \mathbb{R}.$$

In order to extend a BMT automorphism $\alpha_\phi$ to the first-derivative theory on the circle we have to choose a real number $\lambda$ and then set

$$\alpha^I_{\phi,\lambda}(W(f)) = e^{i\langle\phi,f\rangle_{I,\lambda}}\,W(f),$$

where $f$ belongs to $K_2(I)$, with

$$\langle\phi,f\rangle_{I,\lambda} = \int\phi(x)\,(f(x) - p_{f,I}(\lambda))\,dx.$$

Taking $\phi$, $\psi$ such that $Q = \int\phi = \int\psi$ we may compute $\alpha_{\phi,\lambda}\cdot\alpha_{\psi,\mu}^{-1}$:

$$(\alpha_{\phi,\lambda}\cdot\alpha_{\psi,\mu}^{-1})^I(W(f)) = e^{i(\langle\phi,f\rangle_{I,\lambda} - \langle\psi,f\rangle_{I,\mu})}\,W(f) = e^{-iQ(\lambda-\mu)\langle\delta'_x,f\rangle}\,\mathrm{ad}\,W(h)(W(f)), \tag{29}$$

with $h' = \phi - \psi$, where $x$ is any point in $I'$. Therefore we have proved that two BMT automorphisms with the same (nonzero) charge extend to equivalent automorphisms if and only if $\lambda = \mu$. Since a simple calculation shows that the translated automorphism $\beta_{T(t)}\,\alpha_{\phi,\lambda}\,\beta_{T(-t)}$ is equal to $\alpha_{\phi(\cdot+t),\lambda-t}$, we conclude that nontrivial BMT sectors give rise to a one-parameter family of non-covariant sectors on the circle.

Moreover, when $\lambda \neq \mu$, the automorphism $\alpha_{\phi,\lambda}\cdot\alpha_{\psi,\mu}^{-1}$ is equivalent to a new automorphism $\theta_c$, $c = -Q(\lambda-\mu)$:

$$\theta_c^I(W(f)) = e^{ic\langle\delta'_x,f\rangle}\,W(f), \qquad x \in I'. \tag{30}$$

Since $p_{f,I}$ is constant whenever $I \subset \mathbb{R}$, and hence $\langle\delta'_x, f\rangle$ vanishes, we conclude that $\theta_c$ is localized in only one point, the point at infinity. We may easily show that $\theta_c$ is invariant under translations and that dilations act on the charge $c$. By conjugating $\theta_c$ with a conformal transformation we get a family of automorphisms localized in different points of the real line. In particular, requiring the automorphisms to be localized in zero, we get a family $\zeta_c$:

$$\zeta_c(W(f)) \doteq e^{ic\int_0^x (f(y) - p_{f,I}(y))\,dy}\,W(f), \qquad x \in I'. \tag{31}$$

Indeed it is easy to see that, if $f \in K_2(I)$, then $\int_0^x (f(y) - p_{f,I}(y))\,dy$ does not depend on $x \in I'$, and is equal to zero when $0 \notin I$. Now let $0$ be in $I$, and suppose for simplicity that $I \subset \mathbb{R}$. Then formula (31) may be obtained from formula (30) using $R(\pi)$, the rotation by $\pi$, which in the real line coordinates is $t \to -1/t$. As in Proposition 2.9, such a rotation is implemented by $D\,U^{(2)}(R(\pi))\,D^*$ on the Hilbert space $H$ and, since $f$ has compact support, we may set $x = \infty$ in formula (31). Hence the correspondence between (31) and (30) follows from the equality

$$2\int_0^\infty (f - p_{f,I})(y)\,dy = \Big\langle\delta'_0,\ 2x\int_0^{-1/x}(f - p_{f,I})(y)\,dy + f\big(-\tfrac{1}{x}\big)\Big\rangle.$$

When $f$ is localized in $\mathbb{R}$ and we choose $x = \infty$ as before, the automorphisms $\zeta_c$ furnish extensions to the circle of the restrictions to $\mathcal{A}_2(\mathbb{R})$ of solitonic sectors on $\mathcal{A}(\mathbb{R})$. It is not difficult to see that dilations act on these automorphisms by dilating the charge; therefore these sectors are non-covariant too.

In the following subsection we shall see that the Weyl algebra on the symplectic space $(X^2,\omega_2)$ is a global algebra for all these sectors, i.e. the given automorphisms of the precosheaf modulo unitaries on the Fock space are described by automorphisms of this global algebra modulo inners. In doing so we shall see that the described sectors form a group isomorphic to $\mathbb{R}^3$.

3.5. Generalized BMT approach to sectors: Global description. Now we describe some natural automorphisms of the Weyl algebra $\mathcal{A}_2(\dot{\mathbb{R}})$ on $(X^2,\omega_2)$. If $\phi$ is a measure on $\dot{\mathbb{R}}$ such that $\int(1+t^2)\,d|\phi|(t) < \infty$, we set

$$\alpha^2_\phi(W(h)) = e^{i\int h\,d\phi}\,W(h), \qquad h \in X^2,$$

and $\alpha^2_\phi$ extends to an automorphism of $\mathcal{A}_2(\dot{\mathbb{R}})$. This automorphism is inner if and only if it is of the form $\mathrm{ad}\,W(h)$, where $h''' = \phi$, i.e. if and only if the first three moments (charges) of $\phi$ vanish:

$$Q_k(\phi) \doteq \int t^k\,d\phi(t) = 0, \qquad k = 0, 1, 2.$$

As a consequence, two such automorphisms are equivalent if and only if all their charges coincide. We now consider the corresponding sectors, i.e. classes of automorphisms modulo inners.

Proposition 3.1. The only covariant sector on $\mathcal{A}_2(\mathbb{R})$ in the above class is the identity sector.

Proof. First we look at the behavior of the automorphisms under translations:

$$Q_0(\beta_{T(t)}\,\alpha^2_\phi\,\beta_{T(-t)}) = Q_0(\alpha^2_\phi),$$

$$Q_1(\beta_{T(t)}\,\alpha^2_\phi\,\beta_{T(-t)}) = \int (x-t)\,d\phi(x) = Q_1(\alpha^2_\phi) - tQ_0(\alpha^2_\phi),$$

$$Q_2(\beta_{T(t)}\,\alpha^2_\phi\,\beta_{T(-t)}) = \int (x-t)^2\,d\phi(x) = Q_2(\alpha^2_\phi) - 2tQ_1(\alpha^2_\phi) + t^2Q_0(\alpha^2_\phi).$$

Then we compute the charges of automorphisms transformed with the ray inversion $r : x \to -1/x$:

$$Q_0(\beta_r\,\alpha^2_\phi\,\beta_r) = \int x^2\,d\phi(x) = Q_2(\alpha^2_\phi),$$

$$Q_1(\beta_r\,\alpha^2_\phi\,\beta_r) = \int x^2(-1/x)\,d\phi(x) = -Q_1(\alpha^2_\phi),$$

$$Q_2(\beta_r\,\alpha^2_\phi\,\beta_r) = \int x^2(-1/x)^2\,d\phi(x) = Q_0(\alpha^2_\phi).$$

From the first equations we derive that a translation covariant sector has $Q_0 = Q_1 = 0$, while covariance under ray inversion amounts to $Q_0 = Q_2$ and $Q_1 = 0$, from which the thesis easily follows.

Remark. If we generalize the preceding construction to the case of $n$ derivatives, thus obtaining sectors parameterized by $2n+1$ charges, the preceding proof generalizes as well, showing that the identity sector is the only covariant sector in that case also.

Remark. Like BMT sectors, the sectors described above are additive, in the sense that the vector charge $(Q_0, Q_1, Q_2)$ of the composition of two sectors is just the sum of the two charges. Then the action of $PSL(2,\mathbb{R})$ on these sectors gives a linear representation of this group on $\mathbb{R}^3$. The absence of covariant sectors means that the action is free and therefore has no one-dimensional representations, i.e. it is irreducible.

The local subalgebras for $\mathcal{A}_2(\dot{\mathbb{R}})$ are the subalgebras generated by the Weyl unitaries whose test functions are zero outside $I$. Then the representation of $\mathcal{A}_2(\mathbb{R})$ on the Fock space $e^{H^2}$ is faithful when restricted to local algebras, i.e. $\mathcal{A}_2(I)$ may be seen as a weakly dense subalgebra of the second quantization algebra of the space $K_2(I)$. By a classical Sobolev embedding argument, the functions in $K_2(I)$ are continuous; therefore the automorphisms $\alpha^2_\phi|_{\mathcal{A}_2(I)}$ uniquely extend to normal automorphisms of $\mathcal{R}_2(I)$, so that $\alpha^2_\phi$ gives rise to an automorphism of the precosheaf $I \to \mathcal{R}_2(I)$.

Proposition 3.2.
All sectors described above may be localized in two points.

Proof. First we observe that some of them may be localized even in one point: in fact the multiples of the $\delta_0$ function give sectors with $Q_1 = Q_2 = 0$, while for the measures $c\,x^{-2}\delta_\infty(x)$ we have $Q_0 = 0$, $Q_1 = 0$, $Q_2 = c$. Therefore we may restrict to the case $(Q_0, Q_1) \neq (0,0)$. Then we have to show that for any triple $Q_i$, $i = 0, 1, 2$, with $(Q_0, Q_1) \neq (0,0)$, we may find a measure $\phi = \lambda\delta_a + \mu\delta_b$ with the given moments for some $\lambda, \mu, a, b \in \mathbb{R}$, or equivalently solve the system

$$\begin{cases} Q_0 = \lambda + \mu \\ Q_1 = \lambda a + \mu b \\ Q_2 = \lambda a^2 + \mu b^2 \end{cases},$$

whose solutions are obtained by choosing a $b$ for which $Q_1 - Q_0b \neq 0$ and $Q_0b^2 - 2Q_1b + Q_2 \neq 0$ and then setting $a = (Q_2 - Q_1b)(Q_1 - Q_0b)^{-1}$, $\lambda = (Q_1 - Q_0b)^2(Q_0b^2 - 2Q_1b + Q_2)^{-1}$, $\mu = Q_0 - \lambda$.

The preceding proposition constitutes indeed another proof that these sectors are non-covariant, since the following holds:

Proposition 3.3. Let $I \to \mathcal{A}(I)$ be a local conformal precosheaf on $S^1$. Then a covariant sector $\rho$ with finite index which may be localized in two points is trivial.

Proof. We may suppose that the two points are $\{0, \infty\}$. We first observe that $\rho$ is indeed an automorphism because, if $j$ is the antiunitary modular conjugation for $\mathcal{R}(0,\infty)$, $j\rho j$ is still localized in the same two points, and then any intertwiner between $\rho\,j\rho j$ and the identity (which exists e.g. by [17]) is localized in these two points and is therefore a number by two-regularity. The same argument shows that $\rho$ commutes with the dilations, because the cocycle in the covariance equation for the dilation group is then trivial. Now the state $\omega_0\cdot\rho^{-1}$ is dilation invariant and therefore, by a cluster argument, coincides with the vacuum state, which ends the proof.

In this last part of the subsection we show the relation between the local and the global picture of the superselection sectors of the first-derivative theory or, more precisely, we show that all the sectors described in Subsect. 3.4 are (normal extensions of) the sectors of $\mathcal{A}_2(\dot{\mathbb{R}})$ described here.

Proposition 3.4. The sectors $[\alpha_{\phi,\lambda}]$, $[\theta_c]$, $[\zeta_c]$ are of the form $[\alpha^2_\mu]$, $\mu$ a measure.

Proof. Given a nontrivial sector $[\alpha_{\phi,\lambda}]$, we may choose a representative s.t. $\mathrm{supp}\,\phi \subset I_0$, where $I_0$ is a given compact interval in $\mathbb{R}$. In order for $[\alpha_{\phi,\lambda}]$ to be localized in $I_0$, we should have $\langle\phi,f\rangle_{I_0,\lambda} = 0$ for any $f$ localized in $I_0'$, i.e., since on $I_0'$ $f$ coincides with $p_{f,I_0}$, we get $\int\phi(x)(x-\lambda)\,dx = 0$, i.e.

$$\lambda = \lambda_0 = \frac{\int x\,\phi(x)\,dx}{\int\phi(x)\,dx}$$

(the denominator does not vanish since $[\alpha_{\phi,\lambda}]$ is nontrivial). Then, according to formulas (29) and (30), one has $[\alpha_{\phi,\lambda}] = [\theta_{-Q(\lambda-\lambda_0)}]\cdot[\alpha_{\phi,\lambda_0}]$; hence, since the class $\{\alpha_\mu,\ \mu \text{ measure}\}$ is closed under composition, it is enough to prove the statement for $[\alpha_{\phi,\lambda_0}]$, $[\theta_c]$ and $[\zeta_c]$.

As far as $\alpha_{\phi,\lambda_0}$ is concerned, we have $\int\phi(x)\,p_{f,I}(\lambda_0)\,dx = \int\phi(x)\,p_{f,I}(x)\,dx$, therefore $\langle\phi,f\rangle_{I,\lambda_0}$ coincides with $\int\phi(x)(f(x) - p_{f,I}(x))\,dx$ and therefore, integrating by parts, with $-\int\big(\int(f - p_{f,I})\big)\,d\phi(x)$. Observing that $\int(f - p_{f,I})$ is exactly the representative in $X^2$ of $D^{-1}[f]$ which vanishes outside $I$, we conclude that the automorphism $\alpha_{\phi,\lambda_0}$ comes from the automorphism $\alpha^2_{-d\phi}$ on $\mathcal{A}_2(\dot{\mathbb{R}})$, and the relation among the charges is

$$Q_0(\alpha^2_{-d\phi}) = 0, \qquad Q_1(\alpha^2_{-d\phi}) = Q(\alpha_{\phi,\lambda}), \qquad Q_2(\alpha^2_{-d\phi}) = 2\lambda_0\,Q(\alpha_{\phi,\lambda}).$$

In the same way we may show that the "solitonic" automorphisms $\zeta_c$ in Eq. (31) come from the automorphisms $\alpha^2_\mu$ on $\mathcal{A}_2(\mathbb{R})$ with $\mu = -c\,\delta_0$. Conjugating $\zeta_c$ with the ray inversion we see that the automorphisms $\theta_c$ in Eq. (30), localized at infinity, come from the automorphisms $\alpha^2_\mu$ on $\mathcal{A}_2(\dot{\mathbb{R}})$ with $\mu = c\,x^{-2}\delta_\infty$.

Remark. In [9] it is shown that in any diffeomorphism covariant theory on $S^1$, in physics terms a theory with a stress-energy tensor, superselection sectors are covariant. It is well known that the usual way to associate a stress-energy tensor to the derivative of the U(1)-current formally leads to a conformal charge $c = \infty$. Our result then shows that the U(1)-current derivative theories indeed do not have a stress-energy tensor.

Acknowledgement. H.-W. W. wishes to thank the University of Rome II for the kind hospitality and the CNR for financial support during a visit in Rome where this collaboration started. He also wants to thank Bert Schroer warmly for various helpful discussions.

References

1. Araki, H.: A lattice of von Neumann algebras associated with the quantum field theory of a free Bose field. J. Math. Phys. 4, 1343 (1963)
2. Araki, H., Zsido, L.: Extension of the structure theorem of Borchers and its application to half-sided modular inclusions. Manuscript, preliminary version (1995), to appear
3. Bisognano, J., Wichmann, E.: On the duality condition for a Hermitean scalar field. J. Math. Phys. 16, 985 (1975)
4. Borchers, H.-J.: The CPT Theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315 (1992)
5. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal Quantum Field Theory. Commun. Math. Phys. 156, 201–219 (1993)
6. Brunetti, R., Guido, D., Longo, R.: In preparation
7. Buchholz, D., Mack, G., Todorov, I.: The current algebra on the circle as a germ of local field theories. Nucl. Phys. B, Proc. Suppl. 56, 20 (1988)
8. Buchholz, D., Schulz-Mirbach, H.: Haag duality in conformal quantum field theory. Rev. Math. Phys. 2, 105 (1990)
9. D'Antoni, C., Fredenhagen, K.: In preparation
10. Doplicher, S., Longo, R.: Standard and split inclusions of von Neumann algebras. Inv. Math. 75, 493 (1984)
11. Fredenhagen, K.: Generalization of the theory of superselection sectors. In: The algebraic theory of superselection sectors. Introduction and recent results. D. Kastler, ed. Singapore: World Scientific, 1990, p. 379
12. Fredenhagen, K., Jörss, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansion. Commun. Math. Phys. 176, 541 (1996)
13. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebra II: Geometric aspects and conformal covariance. Rev. Math. Phys. Special Issue, 113 (1992)
14. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569 (1993)
15. Guido, D.: Modular covariance, PCT, Spin and Statistics. Ann. Inst. H. Poincaré 63, 383 (1995)
16. Guido, D., Longo, R.: Relativistic Invariance and Charge Conjugation in Quantum Field Theory. Commun. Math. Phys. 148, 521 (1992)
17. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11 (1996)
18. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1996
19. Hislop, P.D., Longo, R.: Modular structure of the local algebras associated with the free massless scalar field theory. Commun. Math. Phys. 84, 71 (1982)
20. Lang, S.: SL(2, R). Berlin–Heidelberg–New York: Springer-Verlag, 1985
21. Longo, R.: Solution of the factorial Stone-Weierstrass Conjecture. Invent. Math. 76, 145 (1984)
22. Longo, R.: An analogue of the Kac-Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 451 (1997)
23. Longo, R., Rehren, R.: Net of subfactors. Rev. Math. Phys. 7, 567 (1995)
24. Müger, M.: Superselection Structure of massive Quantum Field Theories in 1+1 Dimensions. hep-th/9705019
25. Rieffel, M., Van Daele, A.: A bounded operator approach to Tomita-Takesaki theory. Pacific J. Math. 69, 187–221 (1976)
26. Roberts, J.E.: Spontaneously broken gauge symmetry and superselection rules. Proc. Int. School of Math. Physics, Camerino (1974), G. Gallavotti, ed., Univ. di Camerino, 1976
27. Schroer, B.: Motivations and Physical Aims of Algebraic QFT. Ann. of Phys. 255, No. 2, 270 (1997)
28. Schroer, B.: Wigner Representation Theory of the Poincaré Group, Localization, Statistics and the S-Matrix. Nucl. Phys. B 499, 519 (1997)
29. Schroer, B.: Modular Localization and the Bootstrap-Formfactor Program. Nucl. Phys. B 499, 547 (1997)
30. Schroer, B.: A Course on: Quantum Field Theory and Local Observables. Manuscript, based on Lectures held in Rio de Janeiro and Berlin
31. Streater, R.F., Wilde, I.F.: Fermion states of a Boson field. Nucl. Phys. B24, 561 (1970)
32. Stratila, S., Zsido, L.: Lectures on von Neumann algebras. Tunbridge Wells: Abacus Press, 1979
33. Trèves, F.: Topological vector spaces, distributions and kernels. New York: Academic Press, 1967
34. Wiesbrock, H.-W.: Half-Sided Modular Inclusions of von Neumann algebras. Commun. Math. Phys. 157, 83 (1993); Erratum. Commun. Math. Phys. 184, 683–685 (1997)
35. Wiesbrock, H.-W.: Symmetries and Half-Sided Modular Inclusions of von Neumann algebras. Lett. Math. Phys. 29, 107 (1993)
36. Wiesbrock, H.-W.: A comment on a recent work of Borchers. Lett. Math. Phys. 25, 157–159 (1992)
37. Wiesbrock, H.-W.: Conformal Quantum Field Theory and Half-Sided Modular Inclusions of von Neumann algebras. Commun. Math. Phys. 158, 537 (1993)
38. Yngvason, J.: A Note on Essential Duality. Lett. Math. Phys. 31, 127 (1995)

Communicated by H. Araki

Commun. Math. Phys. 192, 245 – 260 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

A Trinomial Analogue of Bailey's Lemma and N = 2 Superconformal Invariance*

G. E. Andrews, A. Berkovich

Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected], berkov [email protected]

Received: 8 March 1997 / Accepted: 29 June 1997

Dedicated to Dora Bitman on her 70th birthday

Abstract: We propose and prove a trinomial version of the celebrated Bailey's lemma. As an application we obtain new fermionic representations for characters of some unitary as well as nonunitary models of N = 2 superconformal field theory (SCFT). We also establish interesting relations between N = 1 and N = 2 models of SCFT with central charges $\frac{3}{2}\left(1 - \frac{2(2-4\nu)^2}{2(4\nu)}\right)$ and $3\left(1 - \frac{2}{4\nu}\right)$. A number of new mock theta function identities are derived.

* Partially supported by National Science Foundation Grant: DMS-9501101.

1. Brief Review of Bailey's Method and its Generalizations

It may come as a surprise that Manchester, England was an ideal setting for pure mathematics during the height of World War II. However, a number of historical coincidences conspired to make this the case. In particular, mathematics that would later prove extremely valuable in the development of statistical mechanics and conformal field theory (CFT) flourished there. Essentially, Bailey, extending the original ideas of Rogers, came up with a variety of new Rogers-Ramanujan type identities during the winter of 1943–44 [1]. Hardy, who was then editor of the Journal of the London Mathematical Society, got Freeman Dyson to referee the paper. Realizing that Bailey and Dyson were the only people in England interested in the subject, Hardy put Bailey in contact with Dyson. The resulting correspondence led to [2], the first paper which, implicitly at least, contained Bailey's lemma and the first hint at the "Bailey chain". A charming account of the Dyson-Bailey collaboration appears in Dyson's article, "A Walk Through Ramanujan's Garden" [3]. A few years later, Slater, in a study building on Bailey's work, systematically derived 130 identities of the Rogers-Ramanujan type [4,5]. In the last decade, Bailey's technique
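was streamlined and generalized by Andrews [6] and further extended by Agarwal, Andrews and Bressoud [7,8].

Bailey's method may be summarized as follows. Let $\alpha = \{\alpha_r\}_{r\ge0}$, $\beta = \{\beta_L\}_{L\ge0}$ be sequences related by the identities

$$\beta_L = \sum_{r=0}^{\infty}\frac{\alpha_r}{(q)_{L-r}\,(aq)_{L+r}}, \qquad L \in \mathbb{Z}_{\ge0}, \tag{1.1}$$

where

$$(a)_n = \begin{cases} \dfrac{1}{(1-aq^{-1})(1-aq^{-2})\cdots(1-aq^{n})}, & n \in \mathbb{Z}_{<0}, \\[1ex] 1, & n = 0, \\[1ex] (1-a)(1-aq)\cdots(1-aq^{n-1}), & n \in \mathbb{Z}_{>0}, \end{cases} \tag{1.2}$$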

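and let $\gamma = \{\gamma_L\}_{L\ge0}$, $\delta = \{\delta_r\}_{r\ge0}$ be another pair of sequences related by

$$\gamma_L = \sum_{r=L}^{\infty}\frac{\delta_r}{(q)_{r-L}\,(aq)_{r+L}}. \tag{1.3}$$

Then the new identity

$$\sum_{L=0}^{\infty}\alpha_L\gamma_L = \sum_{L=0}^{\infty}\beta_L\delta_L \tag{1.4}$$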

holds. A pair of sequences $(\alpha,\beta)$ that satisfies (1.1) is called a Bailey pair relative to $a$. Analogously, a pair of sequences $(\gamma,\delta)$ subject to (1.3) is referred to as a conjugate Bailey pair relative to $a$. In [2], Bailey proved that

$$\gamma_L = \frac{(\rho_1,\rho_2)_L\,(aq/\rho_1\rho_2)^L}{(aq/\rho_1,aq/\rho_2)_L}\,\frac{1}{(q)_{M-L}\,(aq)_{M+L}}, \tag{1.5}$$

$$\delta_L = \frac{(\rho_1,\rho_2)_L\,(aq/\rho_1\rho_2)^L\,(aq/\rho_1\rho_2)_{M-L}}{(aq/\rho_1,aq/\rho_2)_M\,(q)_{M-L}}, \tag{1.6}$$

with $(a_1,a_2)_L \equiv (a_1)_L(a_2)_L$, $L \le M \in \mathbb{Z}_{\ge0}$, satisfy (1.3) for any choice of parameters $\rho_1$, $\rho_2$. Combining (1.4), (1.5) and (1.6) yields

$$\sum_{L=0}^{\infty}\frac{(\rho_1,\rho_2)_L\,(aq/\rho_1\rho_2)_{M-L}}{(aq/\rho_1,aq/\rho_2)_M\,(q)_{M-L}}\left(\frac{aq}{\rho_1\rho_2}\right)^{L}\beta_L = \sum_{L=0}^{\infty}\frac{(\rho_1,\rho_2)_L}{(aq/\rho_1,aq/\rho_2)_L}\left(\frac{aq}{\rho_1\rho_2}\right)^{L}\frac{\alpha_L}{(q)_{M-L}\,(aq)_{M+L}}. \tag{1.7}$$

From the last equation, we deduce immediately:

Bailey's Lemma. Sequences $(\alpha',\beta')$ defined by

$$\alpha'_L = \frac{(\rho_1,\rho_2)_L\,(aq/\rho_1\rho_2)^L}{(aq/\rho_1,aq/\rho_2)_L}\,\alpha_L, \tag{1.8}$$

$$\beta'_L = \sum_{r=0}^{\infty}\frac{(\rho_1,\rho_2)_r\,(aq/\rho_1\rho_2)^r\,(aq/\rho_1\rho_2)_{L-r}}{(aq/\rho_1,aq/\rho_2)_L\,(q)_{L-r}}\,\beta_r \tag{1.9}$$

form again a Bailey pair relative to $a$.
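The lemma can be spot-checked numerically. The following is a minimal sketch in exact rational arithmetic, assuming the reading of (1.1), (1.8) and (1.9) in which $(x)_n$ is the symbol (1.2) and $(aq/\rho_1\rho_2)^L$ is an ordinary power. The helper names (`qpoch`, `bailey_beta`, `bailey_step`) and the sample parameter values are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def qpoch(a, n, q):
    """(a)_n = (1 - a)(1 - a q)...(1 - a q^(n-1)) for n >= 0, cf. (1.2)."""
    out = Fraction(1)
    for k in range(n):
        out *= 1 - a * q**k
    return out


def bailey_beta(alpha, a, q):
    """beta_L built from alpha via (1.1); terms with r > L vanish
    because 1/(q)_{L-r} = 0 for negative index."""
    return [sum(alpha[r] / (qpoch(q, L - r, q) * qpoch(a * q, L + r, q))
                for r in range(L + 1))
            for L in range(len(alpha))]


def bailey_step(alpha, beta, a, q, rho1, rho2):
    """One application of Bailey's lemma, Eqs. (1.8) and (1.9)."""
    c = a * q / (rho1 * rho2)

    def ratio(n, m):
        # (rho1, rho2)_n / (aq/rho1, aq/rho2)_m
        return (qpoch(rho1, n, q) * qpoch(rho2, n, q)
                / (qpoch(a * q / rho1, m, q) * qpoch(a * q / rho2, m, q)))

    new_alpha = [ratio(L, L) * c**L * alpha[L] for L in range(len(alpha))]
    new_beta = [sum(ratio(r, L) * c**r * qpoch(c, L - r, q) * beta[r]
                    / qpoch(q, L - r, q) for r in range(L + 1))
                for L in range(len(beta))]
    return new_alpha, new_beta


# Arbitrary sample data (assumed values, chosen only to avoid zero denominators).
q, a = Fraction(1, 3), Fraction(1, 5)
alpha = [Fraction(r + 1, r + 2) for r in range(6)]
beta = bailey_beta(alpha, a, q)
alpha1, beta1 = bailey_step(alpha, beta, a, q, rho1=2, rho2=7)

# The new pair should again satisfy (1.1) with the same parameter a.
print(beta1 == bailey_beta(alpha1, a, q))
```

Because every term of (1.1) and (1.9) with $r > L$ vanishes, truncating the sequences at a finite length does not affect the comparison for the retained indices.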

Obviously, Bailey's lemma can be iterated ad infinitum, leading to a Bailey chain [6,9] of new identities

$$(\alpha,\beta) \to (\alpha',\beta') \to (\alpha'',\beta'') \to (\alpha''',\beta''') \to \cdots \tag{1.10}$$

with the parameter $a$ remaining unchanged throughout the chain. P. Paule [10] independently discovered the essentials of the iterative process formalized by the Bailey chain. The notion of a Bailey chain was upgraded to a "Bailey lattice" in [7,8], where it was shown how to pass from a Bailey pair with given parameter $a$ to another pair with an arbitrary new parameter.

Further important developments have taken place in the last few years. In [11], Milne and Lilly found higher-rank generalizations of Bailey's lemma. Many new polynomial identities of Rogers-Ramanujan type were discovered in [12-19] as the result of recent progress in CFT and Statistical Mechanics initiated by the Stony Brook group [20-22]. Following an observation made by Foda-Quano [23], these identities were recognized as new $(\alpha,\beta)$ pairs. New $(\gamma,\delta)$ pairs were discovered in [24,25]. Intriguing connections between Bailey's lemma and the so-called renormalization group flows connecting different models of CFT were discussed in [18,26–28].

This paper is intended as the first step towards a multinomial (or higher-spin) generalization of Bailey's lemma. Here we concentrate on the trinomial case. Our main assertion is

Theorem 1 (Trinomial analogue of Bailey's lemma). If for $L \ge 0$, $a = 0, 1$,

$$\tilde\beta_a(L) = \sum_{r=0}^{L}\tilde\alpha_a(r)\,\frac{T_a(L,r,q)}{(q)_L}, \tag{1.11}$$

then for $M \in \mathbb{Z}_{\ge0}$

$$\sum_{L=0}^{M}(-1)_L\,q^{\frac{L}{2}}\,\tilde\beta_0(L) = \sum_{r=0}^{\infty}\tilde\alpha_0(r)\,\frac{(-1)_{M+1}}{q^{\frac{r}{2}} + q^{-\frac{r}{2}}}\,\frac{T_1(M,r,q)}{(q)_M} \tag{1.12}$$

and

$$\sum_{L=0}^{M}(-q^{-1})_L\,q^{L}\,\tilde\beta_1(L) = \sum_{r=0}^{\infty}\tilde\alpha_1(r)\,\frac{(-1)_M}{(q)_M}\left\{T_1(M,r,q) - \frac{1-q^M}{1+q^{-1-r}}\,T_1(M-1,r+1,q) - \frac{1-q^M}{1+q^{r-1}}\,T_1(M-1,r-1,q)\right\}, \tag{1.13}$$

where $(-1)_L$ and $(-q^{-1})_L$ denote the symbol (1.2) with $a = -1$ and $a = -q^{-1}$, and $T_a(L,r,q)$ are q-trinomial coefficients [28] to be defined in the next section.

The pair of sequences $(\tilde\alpha_a,\tilde\beta_a)$ that satisfies identities (1.11) will be called a trinomial Bailey pair.

The rest of this paper is organized as follows. In Sect. 2 we shall collect the necessary background on q-trinomials and then prove Theorem 1. In Sect. 3 we shall exploit this theorem to derive a number of new q-series identities related to characters of N = 2 SCFT. We conclude with a brief discussion of the physical significance of our results and some comments about possible generalizations.

2. q-Trinomial Coefficients and a Trinomial Analogue of Bailey's Lemma

2.1. Preliminaries. Before turning our attention to the q-trinomial coefficients, let us briefly recall the ordinary trinomials $\binom{L}{A}_2$ [29], defined by

$$\left(x + 1 + \frac{1}{x}\right)^{L} = \sum_{A=-L}^{L}\binom{L}{A}_2 x^A, \tag{2.1}$$

$$\binom{L}{A}_2 = 0, \qquad |A| > L. \tag{2.2}$$

By applying the binomial theorem twice to (2.1) we find by coefficient comparison that

$$\binom{L}{A}_2 = \sum_{j\ge0}\frac{L!}{j!\,(j+A)!\,(L-2j-A)!}. \tag{2.3}$$

Furthermore, it is easy to deduce from (2.1) the following recurrences:

$$\binom{L}{A}_2 = \binom{L-1}{A-1}_2 + \binom{L-1}{A}_2 + \binom{L-1}{A+1}_2, \tag{2.4}$$

which along with (2.2) and

$$\binom{0}{0}_2 = 1 \tag{2.5}$$

specify trinomials uniquely. Equations (2.4) and (2.5) lead to the Pascal-like triangle for the numbers $\binom{L}{A}_2$

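$$\begin{array}{ccccccccc}
 & & & & 1 & & & & \\
 & & & 1 & 1 & 1 & & & \\
 & & 1 & 2 & 3 & 2 & 1 & & \\
 & 1 & 3 & 6 & 7 & 6 & 3 & 1 & \\
- & - & - & - & - & - & - & - & -
\end{array}$$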
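As a small illustration, the rows of this triangle can be generated directly from the sum (2.3) and compared with the recurrence (2.4). This is a sketch only; the function names `trinomial` and `triangle_row` are illustrative and do not appear in the paper.

```python
from math import factorial


def trinomial(L, A):
    """Ordinary trinomial coefficient computed from the sum (2.3)."""
    if L < 0 or abs(A) > L:
        return 0  # vanishing condition (2.2)
    total = 0
    j = max(0, -A)  # both j and j + A must be non-negative
    while L - 2 * j - A >= 0:
        total += factorial(L) // (factorial(j) * factorial(j + A)
                                  * factorial(L - 2 * j - A))
        j += 1
    return total


def triangle_row(L):
    """Row L of the Pascal-like triangle, for A = -L, ..., L."""
    return [trinomial(L, A) for A in range(-L, L + 1)]


for L in range(4):
    print(triangle_row(L))
# [1]
# [1, 1, 1]
# [1, 2, 3, 2, 1]
# [1, 3, 6, 7, 6, 3, 1]
```

Setting $x = 1$ in (2.1) shows that row $L$ sums to $3^L$, which gives a quick consistency check on any row.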

q-Analogues of trinomials were forced into existence in the work of Andrews and Baxter on the generalized Hard Hexagon model [29]. These analogues proved to play an important role in Partition Theory [30-32] and Statistical Mechanics [33-35]. Unlike binomials, trinomials admit not one but many q-analogues, which we now proceed to describe.

2.2. Definitions and properties of q-trinomials. The straightforward q-deformation of (2.3) is as follows:

$$\binom{L;B;q}{A}_2 = \sum_{j\ge0}\frac{q^{j(j+B)}\,(q)_L}{(q)_j\,(q)_{j+A}\,(q)_{L-2j-A}}. \tag{2.6}$$

Let us further define another useful q-analogue of (2.3):

$$T_n(L,A,q) = q^{\frac{L(L-n)-A(A-n)}{2}}\binom{L;A-n;q^{-1}}{A}_2, \qquad n \in \mathbb{Z}. \tag{2.7}$$

The polynomials $T_n(L,A,q)$ are symmetric under $A \to -A$,

$$T_n(L,A,q) = T_n(L,-A,q), \tag{2.8}$$

and vanish for $|A| > L$:

$$T_n(L,A,q) = 0 \qquad \text{if} \qquad |A| > L. \tag{2.9}$$

The generalization of the Pascal-triangle type recurrences (2.4) found in [32, 34] is

$$T_n(L,A,q) = T_n(L-1,A-1,q) + T_n(L-1,A+1,q) + q^{L-\frac{1+n}{2}}\,T_n(L-1,A,q) + (q^{L-1}-1)\,T_n(L-2,A,q). \tag{2.10}$$
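Definitions (2.6) and (2.7) translate directly into code. The sketch below works in exact rational arithmetic by writing $q = t^2$, so that the half-integer powers of $q$ become integer powers of $t$; this substitution, the helper names and the sample value of $t$ are assumptions made for illustration. It then checks (2.8), (2.9) and (2.10) for small $L$.

```python
from fractions import Fraction


def qpoch(a, n, q):
    """(a)_n = (1 - a)(1 - a q)...(1 - a q^(n-1)) for n >= 0."""
    out = Fraction(1)
    for k in range(n):
        out *= 1 - a * q**k
    return out


def round_trinomial(L, B, A, q):
    """The q-trinomial of Eq. (2.6); terms with a negative index
    in the denominator are dropped, since 1/(q)_n = 0 for n < 0."""
    total = Fraction(0)
    for j in range(max(0, -A), L + 1):
        rest = L - 2 * j - A
        if rest < 0:
            break
        total += (q ** (j * (j + B)) * qpoch(q, L, q)
                  / (qpoch(q, j, q) * qpoch(q, j + A, q) * qpoch(q, rest, q)))
    return total


def T(n, L, A, t):
    """T_n(L, A, q) of Eq. (2.7) with q = t**2 (t a nonzero Fraction)."""
    if L < 0 or abs(A) > L:
        return Fraction(0)  # Eq. (2.9); also covers negative L
    q = t * t
    return t ** (L * (L - n) - A * (A - n)) * round_trinomial(L, A - n, A, 1 / q)


def recurrence_rhs(n, L, A, t):
    """Right-hand side of the recurrence (2.10)."""
    q = t * t
    return (T(n, L - 1, A - 1, t) + T(n, L - 1, A + 1, t)
            + t ** (2 * L - 1 - n) * T(n, L - 1, A, t)
            + (q ** (L - 1) - 1) * T(n, L - 2, A, t))


t = Fraction(1, 2)  # sample value, i.e. q = 1/4
ok = all(T(n, L, A, t) == T(n, L, -A, t) == recurrence_rhs(n, L, A, t)
         for n in (-1, 0, 1) for L in range(1, 5) for A in range(-L, L + 1))
print(ok)
```

Using `Fraction` rather than floating point means the identities are compared exactly at the chosen value of $t$, so a mismatch cannot be hidden by rounding.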

Additionally, there are four more identities needed:

$$T_n(L, A, q) = T_{n+2}(L, A, q) + (q^L - 1)\, q^{-\frac{1+n}{2}}\, T_n(L-1, A, q), \tag{2.11}$$

$$q^{\frac{L-A}{2}}\, T_{n+1}(L, A, q) = T_n(L, A, q) + (q^L - 1)\, T_n(L-1, A+1, q), \tag{2.12}$$

$$T_1(L, A, q) - T_1(L-1, A, q) = q^{\frac{L+A}{2}}\, T_0(L-1, A+1, q) + q^{\frac{L-A}{2}}\, T_0(L-1, A-1, q), \tag{2.13}$$

$$T_1(L, A, q) + T_1(L-1, A, q) = T_{-1}(L-1, A+1, q) + T_{-1}(L-1, A-1, q) + 2T_{-1}(L-1, A, q). \tag{2.14}$$

Identities (2.11), (2.12), (2.13) follow from Eqs. (2.24), (2.23), (2.16) of [29] and identity (2.14) is Eq. (4.5) of [34]. Next we shall require the limiting formula

$$\lim_{L\to\infty} T_1(L, A, q) = \frac{(-q)_\infty}{(q)_\infty}, \tag{2.15}$$

which is equation (2.51) of [29]. Let us combine (2.11) and (2.12) with n = −1 to obtain

$$q^{\frac{L-A}{2}}\, T_0(L, A, q) - T_1(L, A, q) = (q^L - 1)\{T_{-1}(L-1, A+1, q) + T_{-1}(L-1, A, q)\}. \tag{2.16}$$

We now replace A by −A in the above equation to get, with the help of (2.8),

$$q^{\frac{L+A}{2}}\, T_0(L, A, q) - T_1(L, A, q) = (q^L - 1)\{T_{-1}(L-1, A-1, q) + T_{-1}(L-1, A, q)\}. \tag{2.17}$$

If we add (2.16) and (2.17) and use (2.14), the result is

$$q^{\frac{L}{2}}\left(q^{\frac{A}{2}} + q^{-\frac{A}{2}}\right) T_0(L, A, q) - 2T_1(L, A, q) = (q^L - 1)\{T_1(L, A, q) + T_1(L-1, A, q)\}, \tag{2.18}$$

which may be conveniently rewritten as

$$T_0(L, A, q) = \frac{q^{-\frac{L}{2}}}{q^{\frac{A}{2}} + q^{-\frac{A}{2}}}\,(1 + q^L)\, T_1(L, A, q) - \frac{q^{-\frac{L}{2}}\,(1 - q^L)}{q^{\frac{A}{2}} + q^{-\frac{A}{2}}}\, T_1(L-1, A, q). \tag{2.19}$$

We are now ready to prove Theorem 1.

2.3. Proof of Theorem 1. We shall prove Theorem 1 in two steps. First, we find a trinomial analogue of a conjugate Bailey pair and then use a standard Bailey Transform argument. To this end let us introduce an auxiliary function φ(L, q),

$$\varphi(L, q) = q^{\frac{L}{2}}\, \frac{(-1)_L}{(q)_L}, \tag{2.20}$$

with the easily verifiable property

$$\varphi(L+1, q) = \sqrt{q}\; \frac{1 + q^L}{1 - q^{L+1}}\; \varphi(L, q). \tag{2.21}$$

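Next we multiply both sides of (2.19) by φ(L, q) and sum both extremes of the result on L from A to M to obtain, with the aid of (2.21), the following:

$$\sum_{L=A}^{M} q^{\frac{L}{2}}\, \frac{(-1)_L}{(q)_L}\, T_0(L, A, q) = \frac{(-1)_{M+1}}{(q)_M}\, \frac{T_1(M, A, q)}{q^{\frac{A}{2}} + q^{-\frac{A}{2}}}, \tag{2.22}$$

which can be restated as a trinomial analogue of the conjugate relation (1.3),

$$\tilde\gamma_0(A, M) = \sum_{L=A}^{\infty} \tilde\delta_0(L, M)\, \frac{T_0(L, A, q)}{(q)_L}, \tag{2.23}$$

with conjugate pair $(\tilde\gamma_0, \tilde\delta_0)$ in this case being

$$\tilde\gamma_0(A, M) = \frac{(-1)_{M+1}}{(q)_M}\, \frac{T_1(M, A, q)}{q^{\frac{A}{2}} + q^{-\frac{A}{2}}}, \tag{2.24}$$

$$\tilde\delta_0(L, M) = \theta(L \le M)\, q^{\frac{L}{2}}\, (-1)_L, \tag{2.25}$$

where

$$\theta(a \le b) = \begin{cases} 1 & \text{if } a \le b \\ 0 & \text{otherwise.} \end{cases} \tag{2.26}$$

The proof of the first statement of Theorem 1 (1.12) now easily follows by a Bailey Transform argument:

$$\sum_{r=0}^{\infty} \tilde\alpha_0(r)\, \tilde\gamma_0(r, M) = \sum_{r=0}^{\infty} \tilde\alpha_0(r) \sum_{L=r}^{\infty} \tilde\delta_0(L, M)\, \frac{T_0(L, r, q)}{(q)_L} = \sum_{L=0}^{\infty} \tilde\delta_0(L, M) \sum_{r=0}^{L} \tilde\alpha_0(r)\, \frac{T_0(L, r, q)}{(q)_L} = \sum_{L=0}^{\infty} \tilde\delta_0(L, M)\, \tilde\beta_0(L). \tag{2.27}$$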

Substituting (2.24) and (2.25) into (2.27) we arrive at the desired result (1.12).
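The finite sum (2.22), on which the conjugate pair (2.24), (2.25) rests, is easy to test numerically. The sketch below is our own illustration in floating-point arithmetic; the helper names and the sample value q = 0.3 are arbitrary choices, and $(a)_n$ is taken to be the usual q-shifted factorial $(a; q)_n$.

```python
def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n = (1 - a)(1 - a q)...(1 - a q^(n-1))."""
    out = 1.0
    for k in range(n):
        out *= 1.0 - a * q**k
    return out


def trinomial(L, B, A, q):
    """q-trinomial coefficient from definition (2.6)."""
    total = 0.0
    for j in range(max(0, -A), L + 1):
        if L - 2 * j - A < 0:
            break  # all later j give a negative index as well
        total += (q ** (j * (j + B)) * qpoch(q, q, L)
                  / (qpoch(q, q, j) * qpoch(q, q, j + A)
                     * qpoch(q, q, L - 2 * j - A)))
    return total


def T(n, L, A, q):
    """T_n(L, A, q) from definition (2.7)."""
    exponent = (L * (L - n) - A * (A - n)) / 2
    return q**exponent * trinomial(L, A - n, A, 1.0 / q)


def both_sides_of_2_22(A, M, q):
    """Left- and right-hand side of (2.22) for 0 <= A <= M."""
    lhs = sum(q ** (L / 2) * qpoch(-1.0, q, L) / qpoch(q, q, L) * T(0, L, A, q)
              for L in range(A, M + 1))
    rhs = (qpoch(-1.0, q, M + 1) / qpoch(q, q, M)
           * T(1, M, A, q) / (q ** (A / 2) + q ** (-A / 2)))
    return lhs, rhs


for A, M in [(0, 0), (0, 2), (1, 4), (3, 6)]:
    print(A, M, both_sides_of_2_22(A, M, 0.3))
```

For A = 0, M = 1 both sides reduce to $(1+q)/(1-q)$, which can also be confirmed by hand from (2.6) and (2.7).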


Similar to the binomial case, (1.12) can be interpreted as a defining relation for a new trinomial Bailey pair $(\tilde\alpha_1, \tilde\beta_1)$. However, unlike the binomial case, the second analogue of (1.3),

$$\tilde\gamma_1(A, M) = \sum_{L=A}^{\infty} \tilde\delta_1(L, M)\, \frac{T_1(L, A, q)}{(q)_L}, \tag{2.28}$$

is now needed to iterate further. To find a $(\tilde\gamma_1, \tilde\delta_1)$ pair we multiply Eq. (2.22) by $q^{\frac{A}{2}}$ ($q^{-\frac{A}{2}}$) and then replace A by A + 1 (A − 1) to get

$$\sum_{L=A+1}^{M} \frac{(-1)_L}{(q)_L}\, q^{\frac{L+A+1}{2}}\, T_0(L, A+1, q) = \frac{(-1)_{M+1}}{(q)_M}\, \frac{T_1(M, A+1, q)}{1 + q^{-1-A}}, \tag{2.29}$$

$$\sum_{L=A-1}^{M} \frac{(-1)_L}{(q)_L}\, q^{\frac{L-A+1}{2}}\, T_0(L, A-1, q) = \frac{(-1)_{M+1}}{(q)_M}\, \frac{T_1(M, A-1, q)}{1 + q^{-1+A}}. \tag{2.30}$$

Adding (2.29) and (2.30) and using (2.9), (2.13) gives

$$\sum_{L=A-1}^{M} \frac{(-1)_L}{(q)_L}\, \{T_1(L+1, A, q) - T_1(L, A, q)\} = \frac{(-1)_{M+1}}{(q)_M} \left\{ \frac{T_1(M, A+1, q)}{1 + q^{-1-A}} + \frac{T_1(M, A-1, q)}{1 + q^{-1+A}} \right\}. \tag{2.31}$$

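Next we treat the sum in (2.31) as follows:

$$\begin{aligned}
\sum_{L=A-1}^{M} \frac{(-1)_L}{(q)_L}\, \{T_1(L+1, A, q) - T_1(L, A, q)\}
&= \sum_{L=A-1}^{M} \left\{ \frac{(-1)_L}{(q)_L} - \frac{(-1)_{L+1}}{(q)_{L+1}} \right\} T_1(L+1, A, q) \\
&\quad + \sum_{L=A-1}^{M} \left\{ \frac{(-1)_{L+1}}{(q)_{L+1}}\, T_1(L+1, A, q) - \frac{(-1)_L}{(q)_L}\, T_1(L, A, q) \right\} \\
&= -\sum_{L=A}^{M+1} (-q^{-1})_L\, q^L\, \frac{T_1(L, A, q)}{(q)_L} + \frac{(-1)_{M+1}}{(q)_{M+1}}\, T_1(M+1, A, q).
\end{aligned} \tag{2.32}$$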

Combining (2.31), (2.32) and replacing M by M − 1 yields

$$\sum_{L=A}^{M} (-q^{-1})_L\, q^L\, \frac{T_1(L, A, q)}{(q)_L} = \frac{(-1)_M}{(q)_M} \left\{ T_1(M, A, q) - \frac{(1 - q^M)\, T_1(M-1, A+1, q)}{1 + q^{-1-A}} - \frac{(1 - q^M)\, T_1(M-1, A-1, q)}{1 + q^{-1+A}} \right\}, \tag{2.33}$$

which is nothing else but (2.28) with

$$\tilde\delta_1(L, M) = \theta(L \le M)\, (-q^{-1})_L\, q^L, \tag{2.34}$$

$$\tilde\gamma_1(A, M) = \frac{(-1)_M}{(q)_M} \left\{ T_1(M, A, q) - \frac{(1 - q^M)\, T_1(M-1, A+1, q)}{1 + q^{-1-A}} - \frac{(1 - q^M)\, T_1(M-1, A-1, q)}{1 + q^{-1+A}} \right\}. \tag{2.35}$$

The proof of the second statement of Theorem 1 (1.13) follows again by the Bailey Transform argument (2.27) (with subindex 0 replaced by 1 everywhere). Unlike (1.12), Eq. (1.13) does not appear to be a defining relation for a new Bailey pair and therefore cannot be iterated further. Finally, letting M tend to infinity in (1.12), (1.13) and using the limiting formula (2.15) we find

Theorem 2. If a pair of sequences $(\tilde\alpha_a, \tilde\beta_a)$, a = 0, 1, is subject to identities (1.11), then

$$\sum_{L=0}^{\infty} (-1)_L\, q^{\frac{L}{2}}\, \tilde\beta_0(L) = \frac{(-1)_\infty (-q)_\infty}{(q)_\infty^2} \sum_{r=0}^{\infty} \frac{\tilde\alpha_0(r)}{q^{\frac{r}{2}} + q^{-\frac{r}{2}}} \tag{2.36}$$

and

$$\sum_{L=0}^{\infty} (-q^{-1})_L\, q^L\, \tilde\beta_1(L) = \frac{(-1)_\infty (-q)_\infty}{(q)_\infty^2} \sum_{r=0}^{\infty} \tilde\alpha_1(r) \left\{ \frac{1}{1 + q^{r+1}} - \frac{1}{1 + q^{r-1}} \right\} \tag{2.37}$$

hold.

3. Applications

3.1. Preliminaries. Recently it was shown [26] that Bailey’s lemma “connects” M(p, p + 1) models of CFT with N = 1 SM(p + 1, p + 3) and N = 2 SM(p + 1, 1) models of SCFT¹. In this section we shall demonstrate that Theorem 2 leads to very different relations between these models. We begin by collecting necessary definitions and formulas. For A, B ∈ Z, q-binomial coefficients $\begin{bmatrix} A \\ B \end{bmatrix}_q$ are defined as

$$\begin{bmatrix} A \\ B \end{bmatrix}_q = \begin{cases} \dfrac{(q)_A}{(q)_B\,(q)_{A-B}} & \text{for } 0 \le B \le A \\ 0 & \text{otherwise.} \end{cases} \tag{3.1}$$

¹ Throughout this paper notations M(p, p′), N = 1 SM(p, p′), N = 2 SM(p, p′) stand for models of CFT and SCFT with central charges $1 - \frac{6(p'-p)^2}{pp'}$, $\frac{3}{2}\left(1 - \frac{2(p-p')^2}{pp'}\right)$, $3\left(1 - \frac{2p'}{p}\right)$, respectively.

The following properties of q-binomials,

$$\begin{bmatrix} A \\ B \end{bmatrix}_{1/q} = q^{B(B-A)} \begin{bmatrix} A \\ B \end{bmatrix}_q, \tag{3.2}$$

$$\lim_{A\to\infty} \begin{bmatrix} A \\ B \end{bmatrix}_q = \frac{1}{(q)_B}, \tag{3.3}$$

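are well known. Next we state some bosonic character formulas.

M(p, p′) [36-38]:

$$\chi^{p,p'}_{r,s}(q) = \frac{1}{(q)_\infty} \sum_{j=-\infty}^{\infty} \left\{ q^{j(jpp' + rp' - sp)} - q^{(jp+r)(jp'+s)} \right\}, \tag{3.4}$$

where p′ > p ≥ 2 are positive coprime integers and r ∈ {1, 2, . . . , p − 1}, s ∈ {1, 2, . . . , p′ − 1} are labels of irreducible highest weight representations.

N = 1 SM($\hat p$, $\hat p'$) [38-41]:

$$\hat\chi^{\hat p,\hat p'}_{\hat r,\hat s}(q) = \frac{(-q^{\epsilon_{\hat r - \hat s}})_\infty}{(q)_\infty} \sum_{j=-\infty}^{\infty} \left\{ q^{\frac{j(j\hat p\hat p' + \hat r\hat p' - \hat s\hat p)}{2}} - q^{\frac{(j\hat p + \hat r)(j\hat p' + \hat s)}{2}} \right\}, \tag{3.5}$$

where

$$\epsilon_a = \begin{cases} 1/2 & \text{for } a \equiv 0 \pmod 2 \\ 1 & \text{for } a \equiv 1 \pmod 2 \end{cases} \tag{3.6}$$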

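and $\hat p' > \hat p \ge 2$ are positive integers (with $\frac{\hat p' - \hat p}{2}$ and $\hat p$ being coprime integers) and $\hat r \in \{1, 2, \ldots, \hat p - 1\}$, $\hat s \in \{1, 2, \ldots, \hat p' - 1\}$.

N = 2 SM($\tilde p$, 1)² [42-44]:

$$\tilde\chi^{\tilde p,1}_{\tilde r,\tilde s}(q, y) = \frac{(-q^{\tilde\epsilon} y)_\infty\, (-q^{\tilde\epsilon} y^{-1})_\infty}{(q)_\infty^2} \sum_{j=-\infty}^{\infty} q^{j^2 \tilde p + j(\tilde r + \tilde s)}\, \frac{1 - q^{2\tilde p j + \tilde r + \tilde s}}{(1 + y^{-1} q^{\tilde p j + \tilde r})(1 + y\, q^{\tilde p j + \tilde s})}, \tag{3.7}$$

where $\tilde p \ge 2$ is a positive integer and
(1) in the A sector, $\tilde r, \tilde s$ are half integers with $0 < \tilde r, \tilde s, \tilde r + \tilde s \le \tilde p - 1$; $\tilde\epsilon = 1/2$,
(2) in the P sector, $\tilde r, \tilde s$ are integers with $0 < \tilde r - 1, \tilde s, \tilde r + \tilde s \le \tilde p - 1$; $\tilde\epsilon = 1$.

² See [45] for the latest discussion regarding (3.7).

All N = 2 SM($\tilde p$, $\tilde p' > 1$) characters were calculated by Ahn et al. [46] in terms of fractional level string functions. However, for the vacuum sector of the N = 2 SM($\tilde p$, $\tilde p' > 1$) model (with $\tilde p > \tilde p' \ge 2$; $\tilde p$, $\tilde p'$ coprime) a character formula similar to (3.7),

$$\tilde\chi^{\tilde p,\tilde p'}_{\tilde r,\tilde s}(q, y) = \frac{(-q^{1/2} y)_\infty\, (-q^{1/2} y^{-1})_\infty}{(q)_\infty^2} \sum_{j=-\infty}^{\infty} q^{j^2 \tilde p \tilde p' + j(\tilde r + \tilde s)\tilde p'}\, \frac{1 - q^{2\tilde p j + \tilde r + \tilde s}}{(1 + y^{-1} q^{\tilde p j + \tilde r})(1 + y\, q^{\tilde p j + \tilde s})}; \qquad \tilde r = \tilde s = 1/2, \tag{3.8}$$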

was recently found in [47]. Presumably, (3.8) also holds for sufficiently small $\tilde r, \tilde s \in \mathbb{Z} + \frac{1}{2}$, such that the embedding diagram is the same as in the vacuum case $\tilde r = \tilde s = 1/2$. There are many important differences between N = 2 SM(p, 1) and N = 2 SM(p, p′ > 1) models. In particular, in contrast to the N = 2 SM(p, 1) case, N = 2 SM(p, p′ > 1) models are neither unitary nor rational [46, 47]. Moreover, while characters (3.7) have nice modular properties [48], those of N = 2 SM(p, p′ > 1) do not. Nevertheless, one can show that the characters (3.8) are, in fact, mock theta functions³, i.e. they exhibit sharp asymptotic behaviour when q (|q| < 1) tends to a rational point of the unit circle.

While bosonic characters (3.4, 3.5, 3.7) were known for quite some time, new fermionic expressions for these characters became available only in the last few years. Existence of the fermionic representations suggests that the Hilbert space of (S)CFT can be described in terms of quasi-particles obeying Pauli’s exclusion principle. The equivalence of the bosonic and fermionic character formulas gives rise to many new q-series identities of Rogers-Ramanujan type. Remarkably, in many known cases these identities admit polynomial analogues which can be written as defining relations (1.11) for trinomial Bailey pairs.

3.2. Polynomial analogues of generalized Göllnitz-Gordon identities. N = 1 SM(2, 4ν), N = 2 SM(4ν, 1) relation. Many polynomial Fermi-Bose character identities for N = 1 SM(2, 4ν), ν ≥ 2 were derived in [34, 35]. Not to overburden our narrative with cumbersome notations we shall consider here only the simplest of these identities,

$$\sum_{n_1,\ldots,n_\nu=0}^{\infty} q^{\frac{n_1^2}{2} - n_1 N_2 + \sum_{j=2}^{\nu} N_j^2} \begin{bmatrix} N_2 \\ n_1 \end{bmatrix}_q \prod_{i=2}^{\nu} \begin{bmatrix} n_i + L + n_1 - 2\sum_{j=2}^{i} N_j \\ n_i \end{bmatrix}_q = \sum_{j=-\infty}^{\infty} (-1)^j q^{\nu j^2 + j/2}\, \{T_0(L, 2\nu j, q) + T_0(L, 2\nu j + 1, q)\}, \tag{3.9}$$

where

$$N_j = n_j + n_{j+1} + \ldots + n_\nu. \tag{3.10}$$

Letting L in (3.9) tend to infinity yields

$$\sum_{n_1,\ldots,n_\nu=0}^{\infty} q^{\frac{n_1^2}{2} - n_1 N_2 + \sum_{j=2}^{\nu} N_j^2} \begin{bmatrix} N_2 \\ n_1 \end{bmatrix}_q \frac{1}{(q)_{n_2} \cdots (q)_{n_\nu}} = \frac{(-q^{1/2})_\infty}{(q)_\infty} \sum_{j=-\infty}^{\infty} (-1)^j q^{\nu j^2 + j/2} = \hat\chi^{2,4\nu}_{1,2\nu-1}(q), \tag{3.11}$$

where we used (3.3) and a limiting formula

$$\lim_{L\to\infty} \{T_0(L, A, q) + T_0(L, A+1, q)\} = \frac{(-q^{1/2})_\infty}{(q)_\infty} \tag{3.12}$$

proven in [29]. Identity (3.11) is nothing else but Andrews’ generalization of the Göllnitz-Gordon identities [49]. A moment’s reflection shows that (3.9) is in the form (1.11) with

$$\tilde\alpha_0(r) = \begin{cases} (-1)^j q^{\nu j^2} (q^{j/2} + q^{-j/2}) & \text{for } r = 2\nu j,\ j > 0 \\ 1 & \text{for } r = 0 \\ (-1)^j q^{\nu j^2 + j/2} & \text{for } r = 2\nu j + 1,\ j \ge 0 \\ (-1)^j q^{\nu j^2 - j/2} & \text{for } r = 2\nu j - 1,\ j \ge 1, \end{cases} \tag{3.13}$$

$$\tilde\beta_0(L) = \frac{1}{(q)_L} \sum_{n_1,\ldots,n_\nu=0}^{\infty} q^{\frac{n_1^2}{2} - n_1 N_2 + \sum_{j=2}^{\nu} N_j^2} \begin{bmatrix} N_2 \\ n_1 \end{bmatrix}_q \prod_{i=2}^{\nu} \begin{bmatrix} n_i + L + n_1 - 2\sum_{j=2}^{i} N_j \\ n_i \end{bmatrix}_q. \tag{3.14}$$

³ The notion of mock theta function was introduced by Ramanujan in his last letter to Hardy, dated January 1920.

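Substituting (3.13), (3.14) into (2.36) gives

$$\begin{aligned}
&\sum_{L,n_1,\ldots,n_\nu=0}^{\infty} \frac{(-1)_L}{(q)_L}\, q^{\frac{L + n_1^2}{2} - n_1 N_2 + \sum_{j=2}^{\nu} N_j^2} \begin{bmatrix} N_2 \\ n_1 \end{bmatrix}_q \prod_{i=2}^{\nu} \begin{bmatrix} n_i + L + n_1 - 2\sum_{j=2}^{i} N_j \\ n_i \end{bmatrix}_q \\
&\qquad = \frac{(-1)_\infty (-q)_\infty}{(q)_\infty^2} \sum_{j=-\infty}^{\infty} (-1)^j q^{\nu j^2 + j/2} \left\{ \frac{q^{\nu j}}{1 + q^{2\nu j}} + \frac{q^{\nu j + 1/2}}{1 + q^{2\nu j + 1}} \right\} \\
&\qquad = \tilde\chi^{4\nu,1}_{1/2,\,2\nu+1/2}(q, q^{1/2}) + q^{1/2}\, \tilde\chi^{4\nu,1}_{3/2,\,2\nu-1/2}(q, q^{1/2}),
\end{aligned} \tag{3.15}$$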

which establishes an advertised relation between N = 1 SM(2, 4ν) and N = 2 SM(4ν, 1) models of SCFT. Moreover, the left-hand side of Eq. (3.15) provides a new fermionic companion form for N = 2 SM(4ν, 1) characters. This form is quite different from the known fermionic representation [22, 26] given in terms of the $D_{4\nu}$-Cartan matrix.

3.3. Trinomial Bailey flow from M(3, 4) (Ising) model to N = 2 SM(6, 1) model of SCFT. In [30] the following polynomial identity

$$\sum_{j=0}^{L} q^{j^2/2} \begin{bmatrix} L \\ j \end{bmatrix}_q = \sum_{j=-\infty}^{\infty} q^{6j^2 + j}\, (T_0(L, 6j, q) + T_0(L, 6j+1, q)) - \sum_{j=-\infty}^{\infty} q^{6j^2 + 5j + 1}\, (T_0(L, 6j+2, q) + T_0(L, 6j+3, q)) \tag{3.16}$$

was proven. One may check that in the limit L → ∞ this identity reduces to the Fermi-Bose character identity for the M(3, 4) (Ising) model

$$\sum_{j \ge 0} \frac{q^{j^2/2}}{(q)_j} = \frac{(-q^{1/2})_\infty}{(q)_\infty} \sum_{j=-\infty}^{\infty} \left( q^{6j^2 + j} - q^{6j^2 + 5j + 1} \right) = \chi^{3,4}_{1,1}(q) + q^{1/2}\, \chi^{3,4}_{2,1}(q). \tag{3.17}$$
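The first equality in (3.17) can be confirmed numerically by truncating the sums and infinite products. This is a rough sketch of ours; the truncation orders and the sample value q = 0.2 are arbitrary assumptions chosen so that the neglected terms are far below double precision.

```python
def qpoch_inf(a, q, terms=200):
    """Truncated infinite product (a; q)_infinity."""
    out = 1.0
    for k in range(terms):
        out *= 1.0 - a * q**k
    return out


def fermionic_side(q, terms=40):
    """Left-hand side of (3.17): sum over j >= 0 of q^(j^2/2) / (q)_j."""
    total, denom = 0.0, 1.0  # denom tracks (q; q)_j
    for j in range(terms):
        if j > 0:
            denom *= 1.0 - q**j
        total += q ** (j * j / 2) / denom
    return total


def bosonic_side(q, terms=10):
    """Middle expression of (3.17)."""
    series = sum(q ** (6 * j * j + j) - q ** (6 * j * j + 5 * j + 1)
                 for j in range(-terms, terms + 1))
    return qpoch_inf(-q**0.5, q) / qpoch_inf(q, q) * series


print(fermionic_side(0.2), bosonic_side(0.2))
```

Both expressions should also agree with the product $(-q^{1/2})_\infty$, since the bosonic sum is Euler's pentagonal series for $(q)_\infty$ in disguise.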

The middle expression in (3.17) is remarkably similar to (3.5) with $\hat p = 3$, $\hat p' = 4$. This similarity suggests an interpretation of (3.17) as a character of some extended Virasoro algebra. It is straightforward to verify that (3.16) is the defining relation (1.11) for the trinomial pair

$$\tilde\alpha_0(r) = \begin{cases} q^{6j^2}(q^j + q^{-j}) & \text{for } r = 6j,\ j > 0 \\ 1 & \text{for } r = 0 \\ q^{6j^2 + j} & \text{for } r = 6j + 1,\ j \ge 0 \\ q^{6j^2 - j} & \text{for } r = 6j - 1,\ j > 0 \\ -q^{6j^2 + 5j + 1} & \text{for } r = 6j + 2 \text{ and } r = 6j + 3,\ j \ge 0 \\ -q^{6j^2 - 5j + 1} & \text{for } r = 6j - 2 \text{ and } r = 6j - 3,\ j > 0, \end{cases} \tag{3.18}$$

$$\tilde\beta_0(L) = \frac{1}{(q)_L} \sum_{j \ge 0} q^{\frac{j^2}{2}} \begin{bmatrix} L \\ j \end{bmatrix}_q. \tag{3.19}$$

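Next we apply Theorem 2 to the pair (3.18, 3.19); the result is

$$\begin{aligned}
\sum_{L,j \ge 0} q^{\frac{L + j^2}{2}}\, \frac{(-1)_L}{(q)_L} \begin{bmatrix} L \\ j \end{bmatrix}_q
&= \frac{(-1)_\infty (-q)_\infty}{(q)_\infty^2} \left\{ \sum_{j=-\infty}^{\infty} q^{6j^2 + j} \left( \frac{q^{3j}}{1 + q^{6j}} + \frac{q^{3j + 1/2}}{1 + q^{6j+1}} \right) - \sum_{j=-\infty}^{\infty} q^{6j^2 + 5j + 1} \left( \frac{q^{3j+1}}{1 + q^{6j+2}} + \frac{q^{3j + 3/2}}{1 + q^{6j+3}} \right) \right\} \\
&= \tilde\chi^{6,1}_{\frac{1}{2},\frac{3}{2}}\!\left(q, q^{\frac{1}{2}}\right) + q^{\frac{1}{2}}\, \tilde\chi^{6,1}_{\frac{1}{2},\frac{3}{2}}\!\left(q, q^{\frac{3}{2}}\right).
\end{aligned} \tag{3.20}$$

Recently, Warnaar proposed polynomial identities similar to (3.16) for all models M(p, p + 1), p ≥ 3 [17]. The p = 3 case is the one treated above. We have checked that his conjecture implies the following identities:  p−2 P L+m2 1 −Lm + m1 m2 + 1  mi (2mi −mi−1 −mi+1 ) 1 X 2 4 4 i=2 q  L,m1 ,...,mp−2 ≥0 ) p−2 (−1)L h L i Y h mi−12+mi+1 i = mi (q)L m1 q q i=2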

=q

∞ (−1)∞ (−q)∞ X j 2 p(p−1)+j(p−1) 1 − q 4pj+2 , q 2 (q)∞ (1 + q 2pj )(1 + q 2pj+2 ) j=−∞

(3.21)

where $m_{p-1} \equiv 0$. Therefore the Trinomial Bailey flow for p ≡ 1 (mod 2) is

$$M(p, p+1) \longleftrightarrow N = 2\ SM\!\left(2p, \frac{p-1}{2}\right). \tag{3.22}$$

This is to be contrasted with the Bailey flow discussed in [26] where one has

$$M(p, p+1) \longleftrightarrow N = 2\ SM(p+1, 1). \tag{3.23}$$

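3.4. Results related to Rogers-Ramanujan identities. It is well known that the Rogers-Ramanujan identities

$$\sum_{j \ge 0} \frac{q^{j(j+a)}}{(q)_j} = \frac{1}{(q)_\infty} \sum_{j=-\infty}^{\infty} \left\{ q^{j(10j + 1 + 2a)} - q^{(2j+1)(5j + 2 - a)} \right\} = \chi^{2,5}_{1,2-a}(q); \qquad a = 0, 1 \tag{3.24}$$

admit the polynomial analogues

$$\sum_{j \ge 0} q^{j(j+a)} \begin{bmatrix} 2L - j - a \\ j \end{bmatrix}_q = \sum_{j=-\infty}^{\infty} \left\{ q^{j(10j + 1 + 2a)} \begin{bmatrix} 2L \\ L - 5j - a \end{bmatrix}_q - q^{(2j+1)(5j + 2 - a)} \begin{bmatrix} 2L \\ L - 5j - 2 \end{bmatrix}_q \right\}, \tag{3.25}$$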

which reduce to (3.24) as L → ∞. It is rather surprising that polynomials appearing in (3.25) have a q-trinomial representation as well [30]. In particular, for a = 0, one has

$$\begin{aligned}
\sum_{j \ge 0} q^{j^2} \begin{bmatrix} 2L - j \\ j \end{bmatrix}_q = \sum_{j=-\infty}^{\infty} \Bigg\{ & q^{60j^2 - 4j} \binom{L, 10j; q^2}{10j}_2 - q^{60j^2 + 44j + 8} \binom{L, 10j + 4; q^2}{10j + 4}_2 \\
& + q^{60j^2 + 16j + 1} \binom{L, 10j + 1; q^2}{10j + 1}_2 - q^{60j^2 + 64j + 17} \binom{L, 10j + 5; q^2}{10j + 5}_2 \Bigg\}.
\end{aligned} \tag{3.26}$$

Identity (3.26) is not of the form (1.11). However, if we replace q by $\frac{1}{\sqrt{q}}$ in (3.26) and multiply the result by $q^{L^2/2}$ we obtain, with the help of (2.7), (3.2),

$$\sum_{j \ge 0} q^{\frac{j^2}{2}} \begin{bmatrix} L + j \\ 2j \end{bmatrix}_{\sqrt{q}} = \sum_{j=-\infty}^{\infty} q^{20j^2 + 2j}\, (T_0(L, 10j, q) + T_0(L, 10j + 1, q)) - \sum_{j=-\infty}^{\infty} q^{20j^2 + 18j + 4}\, (T_0(L, 10j + 4, q) + T_0(L, 10j + 5, q)), \tag{3.27}$$

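which gives rise to the trinomial Bailey pair

$$\tilde\beta_0(L) = \frac{1}{(q)_L} \sum_{j \ge 0} q^{\frac{j^2}{2}} \begin{bmatrix} L + j \\ 2j \end{bmatrix}_{\sqrt{q}}, \tag{3.28}$$

$$\tilde\alpha_0(r) = \begin{cases} q^{20j^2}(q^{2j} + q^{-2j}) & \text{for } r = 10j,\ j > 0 \\ 1 & \text{for } r = 0 \\ q^{20j^2 + 2j} & \text{for } r = 10j + 1,\ j \ge 0 \\ q^{20j^2 - 2j} & \text{for } r = 10j - 1,\ j > 0 \\ -q^{20j^2 + 18j + 4} & \text{for } r = 10j + 4 \text{ and } r = 10j + 5,\ j \ge 0 \\ -q^{20j^2 - 18j + 4} & \text{for } r = 10j - 4 \text{ and } r = 10j - 5,\ j > 0. \end{cases} \tag{3.29}$$

Next we apply Theorem 2 to derive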

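$$\sum_{L,j \ge 0} q^{\frac{L + j^2}{2}}\, \frac{(-1)_L}{(q)_L} \begin{bmatrix} L + j \\ 2j \end{bmatrix}_{\sqrt{q}} = \frac{(-1)_\infty (-q)_\infty}{(q)_\infty^2} \left\{ \sum_{j=-\infty}^{\infty} q^{20j^2 + 2j} \left( \frac{q^{5j}}{1 + q^{10j}} + \frac{q^{5j + \frac{1}{2}}}{1 + q^{10j+1}} \right) - \sum_{j=-\infty}^{\infty} q^{20j^2 + 18j + 4} \left( \frac{q^{5j+2}}{1 + q^{10j+4}} + \frac{q^{5j + \frac{5}{2}}}{1 + q^{10j+5}} \right) \right\}. \tag{3.30}$$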

We note that the expression on the right-hand side of (3.30) bears a strong resemblance to formula (3.8) with $\tilde p = 10$, $\tilde p' = 2$. It is also similar in form to the Φ(q) and Ψ(q) considered by Ramanujan in his development of the fifth-order mock theta functions [50]. Expressions in (3.30) are not modular functions. Nevertheless, using a Poisson summation formula one can show that the asymptotic behaviour of (3.30) can be neatly expressed in terms of exponential forms. For instance, when $q = e^{-t}$ and t → 0 we have for (3.30) r π 9π2 2 cos e 20t . (3.31) 5πt 10 Proof of (3.31) along with asymptotic analysis of nonunitary characters (3.8) will be given elsewhere.

4. Discussion

It is widely believed that different fermionic expressions for the (super) conformal character are related to different integrable perturbations of the same (super) conformal model. Thus, it would be interesting to identify perturbations which correspond to the new fermionic representations for N = 2 SM(4ν, 1) characters found in Sect. 3.2. Furthermore, following [26, 27], it is tempting to interpret the relations

$$\begin{aligned} N = 1\ SM(2, 4\nu) &\longleftrightarrow N = 2\ SM(4\nu, 1), \\ M(p, p+1) &\longleftrightarrow N = 2\ SM\!\left(2p, \frac{p-1}{2}\right), \quad p \equiv 1 \pmod 2, \end{aligned} \tag{4.1}$$

established here as massless renormalization group flows. If such an interpretation is indeed correct, then one should be able to carry out a Thermodynamic Bethe Ansatz (TBA) analysis of these flows along the lines of [51]. We expect that related TBA systems will have the same incidence structure as that of the fermionic forms discussed in Sect. 3. Also, we would like to point out that a “folding in half” relation between N = 2 SM(4ν, 1) and N = 1 SM(2, 4ν) models has already been noticed in [52, 53]. A partition theoretical interpretation of our results will undoubtedly lead to the construction of subtractionless bases for N = 2 super Virasoro modules.

From the mathematical point of view it is highly desirable to find an appropriate q-hypergeometric background for the Trinomial analogue of Bailey’s lemma. Recall that the classical Bailey’s lemma is intimately related to the q-Pfaff-Saalschütz formula

$$\sum_{j=0}^{n} \frac{(q^{-n})_j\, (c/a)_j\, (c/b)_j}{(q)_j\, (c)_j\, (cq^{1-n}/ab)_j}\, q^j = \frac{(a)_n\, (b)_n}{(c)_n\, (ab/c)_n}, \tag{4.2}$$


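which was first derived by Jackson [54]. In this direction we have already determined that

$$\sum_{L=0}^{\infty} (-q^{-n})_L\, q^{\frac{(1+2n)L}{2}}\, \tilde\beta_0(L) = \frac{(-q^{n+1})_\infty^2}{(q)_\infty\, (q^{2n+1})_\infty} \sum_{r=0}^{\infty} q^{\frac{(1+2n) r}{2}}\, \frac{(-q^{-n})_r}{(-q^{n+1})_r}\, \tilde\alpha_0(r), \tag{4.3}$$

$$\sum_{L=0}^{\infty} (-q^{-n})_L\, q^{nL}\, \tilde\beta_1(L) = \frac{(-q^{n+1})_\infty^2}{(1 - q^n)\, (q)_\infty\, (q^{2n+1})_\infty} \sum_{r=0}^{\infty} q^{nr}\, (1 + q^r)\, \frac{(-q^{-n})_r}{(-q^{n+1})_r}\, \tilde\alpha_1(r) \tag{4.4}$$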

with nonnegative integral n in (4.3) and positive integral n in (4.4). Identities (4.3, 4.4) can be derived from (2.12, 2.13, 2.15, 2.22) after a bit of labour. It immediately follows that $q^n$ in (4.3, 4.4) may be replaced by an arbitrary parameter, say ρ. Details will be given elsewhere [55]. Building on a proposal made in [32], Schilling and Warnaar defined and extensively studied q-multinomials [56-58]. One may wonder if these new objects will lead to additional generalizations of Bailey’s lemma. We strongly believe that the answer is “yes” and hope to say more about it in a subsequent paper.

Note Added. Soon after this paper was completed, Warnaar [59] provided a simple and elegant proof of the conjecture from [17] used in deriving (3.21). Moreover, he has shown that each ordinary Bailey pair gives rise to a trinomial Bailey pair. In particular, he demonstrated that the trinomial Bailey pair (3.18-19) is a “descendant” of the A(1) and A(2) Bailey pairs of Slater’s list [4].

Acknowledgement. We would like to thank O. Foda, B.M. McCoy, Z. Reti, K. Voss, and S.O. Warnaar for interesting discussions and helpful comments.

References
1. Bailey, W.N.: Proc. London Math. Soc. (2) 49, 421 (1947)
2. Bailey, W.N.: Proc. London Math. Soc. (2) 50, 1 (1949)
3. Dyson, F.J.: In: Ramanujan Revisited. Ed. by G. E. Andrews et al., London-New York: Academic Press, 1988, p. 7
4. Slater, L.J.: Proc. London Math. Soc. (2) 53, 460 (1951)
5. Slater, L.J.: Proc. London Math. Soc. (2) 54, 147 (1952)
6. Andrews, G.E.: Pac. Journ. Math. 114, 267 (1984)
7. Agarwal, A.K., Andrews, G.E., Bressoud, D.M.: J. Indian Math. Soc. 51, 57 (1987)
8. Bressoud, D.M.: In: Ramanujan Revisited. Ed. by G. E. Andrews et al., London-New York: Academic Press, 1988, p. 57
9. Andrews, G.E.: q-series: Their development and application in analysis, number theory, combinatorics, physics and computer algebra. Providence, Rhode Island: American Math. Society, 1986
10. Paule, P.: J. Math. Anal. Appl. 107, 225 (1985)
11. Milne, S.C., Lilly, G.M.: Bull. Amer. Math. Soc. 26, 258 (1992)
12. Melzer, E.: Int. J. Mod. Phys. A9, 1115 (1994)
13. Berkovich, A.: Nucl. Phys. B431, 315 (1994)
14. Foda, O., Quano, Y.-H.: Int. J. Mod. Phys. A10, 2291 (1995)
15. Kirillov, A.N.: Prog. Theor. Phys. Suppl. 118, 61 (1995)
16. Warnaar, S.O.: J. Stat. Phys. 82, 657 (1996)
17. Warnaar, S.O.: J. Stat. Phys. 84, 49 (1996)
18. Berkovich, A., McCoy, B.M.: Lett. Math. Phys. 37, 49 (1996)
19. Berkovich, A., McCoy, B.M., Schilling, A.: Rogers-Schur-Ramanujan type identities for M(p, p′) minimal models of conformal field theory. q-alg/9607020, Commun. Math. Phys. (to appear)
20. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Phys. Lett. B304, 263 (1993)
21. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Phys. Lett. B307, 68 (1993)
22. Dasmahapatra, S., Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Int. J. Mod. Phys. B7, 3677 (1993)
23. Foda, O., Quano, Y.-H.: Int. J. of Mod. Phys. A12, 1651 (1997)
24. Schilling, A., Warnaar, S.O.: Int. J. Mod. Phys. B11, 189 (1997)
25. Schilling, A., Warnaar, S.O.: A higher level Bailey lemma: Proof and Applications. q-alg/9607019, Ramanujan J. (to appear)
26. Berkovich, A., McCoy, B.M., Schilling, A.: Physica A228, 33 (1996)
27. Chin, L.: Central charge and the Andrews-Bailey Constructions. hep-th/9607168
28. Berkovich, A., McCoy, B.M., Schilling, A., Warnaar, S.O.: Bailey flows and Bose-Fermi identities for the conformal coset models $(A_1^{(1)})_N \times (A_1^{(1)})_{N'} / (A_1^{(1)})_{N+N'}$. hep-th/9702026, Nucl. Phys. B (to appear)
29. Andrews, G.E., Baxter, R.J.: J. Stat. Phys. 47, 297 (1987)
30. Andrews, G.E.: J. Amer. Math. Soc. 3, 653 (1990)
31. Andrews, G.E.: In: Analytic Number Theory. B. Berndt et al. eds., Boston: Birkhäuser, 1990, pp. 1-11
32. Andrews, G.E.: Contemp. Math. 166, 141 (1994)
33. Warnaar, S.O., Pearce, P.A.: J. Phys. A27, L891 (1994)
34. Berkovich, A., McCoy, B.M., Orrick, W.P.: J. Stat. Phys. 83, 795 (1996)
35. Berkovich, A., McCoy, B.M.: Generalizations of Andrews-Bressoud Identities for N = 1 Superconformal Model SM(2, 4ν). hep-th/9508110, Int. J. of Math. and Comp. Modelling (to appear)
36. Feigin, B.L., Fuchs, D.B.: Funct. Anal. Appl. 17, 241 (1983)
37. Rocha-Caridi, A.: In: Vertex Operators in Mathematics and Physics. Ed. J. Lepowsky et al. Berlin: Springer, 1985
38. Dobrev, V.K.: Suppl. Rendiconti Circolo Matematici di Palermo, Serie II, Numero 14, 25 (1987)
39. Feigin, B.L., Fuchs, D.B.: Func. Anal. Appl. 16, 114 (1982)
40. Goddard, P., Kent, A., Olive, D.: Commun. Math. Phys. 103, 105 (1986)
41. Meurman, A., Rocha-Caridi, A.: Commun. Math. Phys. 107, 263 (1986)
42. Dobrev, V.K.: Phys. Lett. B186, 43 (1987)
43. Matsuo, Y.: Prog. Theor. Phys. 77, 793 (1987)
44. Kiritsis, E.B.: Int. J. Mod. Phys. A3, 1871 (1988)
45. Dörrzapf, M.: Commun. Math. Phys. 180, 195 (1996)
46. Ahn, C., Chung, S., Tye, S.-H.: Nucl. Phys. B365, 191 (1991)
47. Eholzer, W., Gaberdiel, M.R.: Unitarity of rational N = 2 superconformal theories. hep-th/9601163
48. Ravanini, F., Yang, S.-K.: Phys. Lett. B195, 202 (1987)
49. Andrews, G.E.: The theory of partitions. London: Addison-Wesley, 1967
50. Andrews, G.E., Garvan, F.G.: Adv. in Math. 73, 242 (1989)
51. Zamolodchikov, A.: Nucl. Phys. B358, 524 (1991)
52. Melzer, E.: Supersymmetric Analogs of the Gordon-Andrews Identities, and related TBA systems. hep-th/9412154
53. Moriconi, M., Schoutens, K.: Nucl. Phys. B464, 472 (1996)
54. Jackson, F.H.: Messenger of Math. 39, 745 (1910)
55. Andrews, G.E., Berkovich, A.: In preparation
56. Schilling, A.: Nucl. Phys. B467, 247 (1996)
57. Warnaar, S.O.: Commun. Math. Phys. 184, 203 (1997)
58. Schilling, A., Warnaar, S.O.: Supernomial coefficients, polynomial identities and q-series. q-alg/9701007
59. Warnaar, S.O.: A note on the trinomial analogue of Bailey’s lemma. q-alg/9702021

Communicated by T. Miwa

Commun. Math. Phys. 192, 261 – 285 (1998)

Communications in

Mathematical Physics c Springer-Verlag 1998

Operator of Fractional Derivative in the Complex Plane

Petr Závada

Institute of Physics, Academy of Sciences of Czech Republic, Na Slovance 2, CZ-180 40 Prague 8, Czech Republic. E-mail: [email protected]

Received: 31 July 1996 / Accepted: 30 June 1997

Abstract: The paper deals with a fractional derivative introduced by means of the Fourier transform. The explicit form of the kernel of the general derivative operator acting on the functions analytic on a curve in the complex plane is deduced and the correspondence with some well known approaches is shown. In particular, it is shown how the uniqueness of the operation depends on the derivative order type (integer, rational, irrational, complex) and the number of poles of the considered function in the complex plane.

1. Introduction

Fractional differentiation and integration (also called fractional calculus) is a notion almost as old as the ordinary differential and integral calculus. Naturally, any slightly gifted student who has just understood what the first and second derivatives are can ask the question: well, but what is, for example, a 1.5-fold derivative? There are several ways to answer such a question. An excellent review of the theory of arbitrary order differentiation (generally of complex order), including also interesting historical notes and a comprehensive list of references to original papers (more than a thousand items), is given in a recently published monograph [8]. Some new results were also presented at a recent conference [5] dedicated to this topic. The fractional calculus also has plenty of applications, see e.g. [6, 4, 9, 10] and citations therein. Possible use in quantum mechanics and field theory is discussed in [11]. Recently, the fractional derivative was also mentioned in [2] as a particular case of pseudo-differential operators applied in non-local field theory.

Apparently, the general prescription for a definition of the fractional derivative is to use some representation of an ordinary n-fold derivative (primitive function) which can be interpolated in some natural way to non-integer n. Actually, following the mentioned monograph, all the known approaches are always somehow connected with some of the following relations.

1) The well known formula for an n-fold integral

$$\int_a^{x_1} dx_2 \int_a^{x_2} dx_3 \ldots \int_a^{x_n} f(t)\,dt = \frac{1}{\Gamma(n)} \int_a^{x_1} (x_1 - t)^{n-1} f(t)\,dt \tag{1.1}$$

allows the substitution of n by some real α > 0. In this way fractional integration is introduced. Then fractional derivatives can be obtained by ordinary differentiation of fractional integrals. This is the basis of the construction known as the Riemann-Liouville fractional calculus. Let us note that in this approach the resulting function, even in the case of fractional derivatives, in general depends on fixing the integration limit a on the right hand side of (1.1).

2) The Cauchy formula for analytic functions in some region of the complex plane

$$f^{(n)}(z_0) = \frac{\Gamma(n+1)}{2i\pi} \int_C \frac{f(z)\,dz}{(z - z_0)^{n+1}} \tag{1.2}$$

in principle enables generalization to fractional derivatives; nevertheless, the direct extension to non-integer values of n leads to difficulties arising from multivaluedness of the term $(z - z_0)^{\alpha+1}$, and the result also depends on the choice of the cut and integration curve.

3) Analytic continuation of the derivative (integral) of the exponential and power function

$$\frac{d^\alpha}{dz^\alpha} \exp(cz) = c^\alpha \exp(cz), \qquad \frac{d^\alpha}{dz^\alpha} (z - c)^\beta = \frac{\Gamma(\beta + 1)}{\Gamma(\beta - \alpha + 1)} (z - c)^{\beta - \alpha}. \tag{1.3}$$

Obviously these relations allow one to define fractional derivatives of the functions which can be expressed as linear combinations of power and exponential functions. This approach is also not completely consistent, as can be illustrated by the fractional derivative of the exponential function expanded in the power series

$$\exp(cz) = \sum_{k=0}^{\infty} \frac{(cz)^k}{\Gamma(k+1)}, \tag{1.4}$$

but for non-integer α,

$$\frac{d^\alpha}{dz^\alpha} \exp(cz) = c^\alpha \exp(cz) \ne c^\alpha \sum_{k=0}^{\infty} \frac{(cz)^{k-\alpha}}{\Gamma(k - \alpha + 1)} = \frac{d^\alpha}{dz^\alpha} \left( \sum_{k=0}^{\infty} \frac{(cz)^k}{\Gamma(k+1)} \right). \tag{1.5}$$
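The mismatch in (1.5) is easy to see numerically. The following sketch is our own illustration with arbitrarily chosen sample values α = 1/2, c = 1, z = 1; it compares the exponential rule from (1.3) with the term-by-term power rule applied to the series (1.4). It is meant for non-integer α only, since `math.gamma` is undefined at non-positive integers.

```python
import math


def exponential_rule(alpha, c, z):
    """Left-hand side of (1.5): c^alpha * exp(c z), from the first relation in (1.3)."""
    return c**alpha * math.exp(c * z)


def termwise_power_rule(alpha, c, z, terms=80):
    """Right-hand side of (1.5): the power rule in (1.3) applied to each term of (1.4)."""
    return c**alpha * sum((c * z) ** (k - alpha) / math.gamma(k - alpha + 1)
                          for k in range(terms))


alpha, c, z = 0.5, 1.0, 1.0
print(exponential_rule(alpha, c, z), termwise_power_rule(alpha, c, z))
```

For these sample values the term-by-term result has the closed form $1/\sqrt{\pi z} + e^{z}\,\mathrm{erf}(\sqrt{z})$, which differs visibly from $e^{z}$; the two prescriptions therefore really are distinct extensions of the integer-order derivative.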

So the task of extrapolation of the integer derivative order to an arbitrary one has no unique solution. The approach proposed in this paper is based on the fractional derivative of the exponential function (1.3) entering the Fourier transform of a given function. On this basis in Sect. 2 the explicit form of the kernel of the fractional derivative operator is deduced. In Sect. 3 the composition relation for the derivative operator is proved. The generalization of the case of the functions on the real axis to the case of the function on the complex plane is done in Sect. 4, which is concluded by the theorem summarizing the results. The last section is devoted to the discussion of some consequences following from the theorem and to a comparison with known approaches as well.


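2. Definition of the Fractional Derivative by Means of Fourier Transform

Let f(x) be a function having the Fourier picture $\tilde f(k)$:

$$\tilde f(k) = \int_{-\infty}^{+\infty} f(x) \exp(ikx)\,dx, \qquad f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde f(k) \exp(-ikx)\,dk. \tag{2.1}$$

Then let us create the function

$$f^\alpha(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} (-ik)^\alpha \tilde f(k) \exp(-ikx)\,dk, \qquad \alpha > -1, \tag{2.2}$$

and define

$$D^\alpha(w) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} (-ik)^\alpha \exp(-ikw)\,dk. \tag{2.3}$$

The function $f^\alpha(x)$ can be formally expressed

$$f^\alpha(x) = D^\alpha f = \int_{-\infty}^{+\infty} D^\alpha(x - y) f(y)\,dy. \tag{2.4}$$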

Now let us calculate the integral (2.3), which depends on the way of passing about the singularity k = 0 and the choice of the branch and cut orientation of the function $k^\alpha$. To begin let us assume the cut is given by the half line either (0, −∞) or (0, +∞). For complex functions $\xi^\alpha = (\xi_1 + i\xi_2)^\alpha$ we shall accept the phase convention

$$\lim_{\xi_2 \to 0+} (\xi_1 + i\xi_2)^\alpha = |\xi_1^\alpha|, \qquad \xi_1, \xi_2 \ge 0, \tag{2.5}$$

i.e. it holds

$$\begin{array}{lccc}
 & \xi < 0 & \xi \ge 0 & \text{cut orientation} \\
\lim_{\epsilon \to 0+} (\xi + i\epsilon)^\alpha / |\xi|^\alpha = & \exp(+i\pi\alpha) & 1 & (0, +\infty) \\
 & \exp(+i\pi\alpha) & 1 & (0, -\infty) \\
\lim_{\epsilon \to 0+} (\xi - i\epsilon)^\alpha / |\xi|^\alpha = & \exp(+i\pi\alpha) & \exp(2i\pi\alpha) & (0, +\infty) \\
 & \exp(-i\pi\alpha) & 1 & (0, -\infty).
\end{array} \tag{2.6}$$

Let us define

$$D^\alpha_\pm(w) = \frac{(-1)^\alpha}{2\pi} \int_{-\infty \pm i0}^{+\infty \pm i0} (ik)^\alpha \exp(-ikw)\,dk. \tag{2.7}$$

If we accept the phase convention (2.6) for $k^\alpha$ in the integral (2.7), then the uncertainty of phase of the expression (2.7) is involved only in the factor

$$(-1)^\alpha = \exp(i\alpha [2n + 1] \pi), \tag{2.8}$$

where n is any integer number. Within the arbitrariness given by (2.8) the functions $D^\alpha_+$, $D^\alpha_-$ do not depend on the cut orientation and it holds

$$D^\alpha_+(w) = 0 \ \text{ for } w < 0, \qquad D^\alpha_-(w) = 0 \ \text{ for } w > 0, \tag{2.9}$$

which is evident from the corresponding integrals having paths closed at infinity, for $D^\alpha_+$ ($D^\alpha_-$) in the upper (lower) half-plane. Now we shall calculate the integrals (2.7) in the remaining regions of w. Let us split them into two parts:

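$$D^\alpha_\pm(w) = \frac{(-1)^\alpha}{2\pi} \left[ \int_{-\infty \pm i0}^{0 \pm i0} (ik)^\alpha \exp(-ikw)\,dk + \int_{0 \pm i0}^{+\infty \pm i0} (ik)^\alpha \exp(-ikw)\,dk \right], \tag{2.10}$$

and replace the real parameter w by the complex one

$$z_1 = w + i\epsilon \ \text{ for } k < 0, \qquad z_2 = w - i\epsilon \ \text{ for } k > 0, \tag{2.11}$$

where ε > 0. This substitution ensures absolute convergence of both integrals in (2.10). Next let us make in (2.10) the further substitution

$$\xi = ikz_1 \ \text{ for } k < 0, \qquad \xi = ikz_2 \ \text{ for } k > 0. \tag{2.12}$$

In this way instead of $D^\alpha_\pm(w)$ we get functions depending also on ε:

$$E^\alpha_\pm(w, \epsilon) = \frac{(-1)^\alpha}{2\pi i} \left[ \frac{1}{z_1^{\alpha+1}} \int_{K_1} \xi^\alpha \exp(-\xi)\,d\xi + \frac{1}{z_2^{\alpha+1}} \int_{K_2} \xi^\alpha \exp(-\xi)\,d\xi \right]. \tag{2.13}$$

We assume:
a) The cut of the complex function $k^\alpha$ is given by the half-line (0, +∞).
b) Function values $z_1^\alpha$, $z_2^\alpha$ are considered values of one complex function $z^\alpha$ in two different points. According to the cut orientation of $z^\alpha$ one can get either $z_2^\alpha = (z_1^\alpha)^*$ or $z_2^\alpha = (z_1^\alpha)^* \exp(2i\pi\alpha)$. The cut of the function $z^\alpha$ is assumed

$$(0, +\infty) \ \text{ for } E^\alpha_+, \qquad (0, -\infty) \ \text{ for } E^\alpha_-. \tag{2.14}$$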

Later on we shall come back to these assumptions and judge how they affected our result. Assumptions concerning the cuts of $k^\alpha$, $z^\alpha$ correspond to phases or to the intervals of phases of variables k, z and correspondingly to phases of the variable ξ. All considered possibilities are summarized in Table 1.

Table 1. The phase correspondence of variables ξ, z, k depending on the cut orientation

integral    variable    k < 0           k > 0
E^α_+       k           π               0
            z           (0, π)          (π, 2π)
            ξ           (3π/2, 5π/2)    (3π/2, 5π/2)
E^α_−       k           π               2π
            z           (0, π)          (−π, 0)
            ξ           (3π/2, 5π/2)    (3π/2, 5π/2)

The corresponding integration paths are shown in Fig. 1. Instead of the interval (3π/2, 5π/2) for arg ξ we took the interval (−π/2, π/2) since the corresponding phase shift is the same for both integrals in (2.13) and can be included in the factor $(-1)^\alpha$ ahead of the integrals. Obviously, it holds

$$\int_{K_2} \xi^\alpha \exp(-\xi)\,d\xi = -\int_{K_1} \xi^\alpha \exp(-\xi)\,d\xi = \int_0^\infty \xi^\alpha \exp(-\xi)\,d\xi = \Gamma(\alpha + 1). \tag{2.15}$$

[Figure: integration paths K1, K2 and auxiliary paths K1′, K1″, K2′, K2″ in the complex ξ plane; axes Re ξ, Im ξ.]

Fig. 1. Integration paths in Eqs. (2.13), (2.15)

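After inserting into (2.13) we get

$$ E^{\alpha}_{\pm}(w,\varepsilon) = \frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\frac{1}{(w+i\varepsilon)^{\alpha+1}} - \frac{1}{(w-i\varepsilon)^{\alpha+1}}\right], \qquad \varepsilon > 0. \tag{2.16} $$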

We shall regard the integrals (2.7) as the limits

$$ D^{\alpha}_{\pm}(w) = \lim_{\varepsilon\to 0+} E^{\alpha}_{\pm}(w,\varepsilon), \tag{2.17} $$

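hence the kernel of the operator (2.3), which can also be considered a generalized function, is symbolically written

$$ D^{\alpha}_{\pm}(w) = \frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\frac{1}{(w+i0)^{\alpha+1}} - \frac{1}{(w-i0)^{\alpha+1}}\right], \tag{2.18} $$

where the two modes correspond to different cuts of $(w+i\varepsilon)^{\alpha+1}$:

$$ D^{\alpha}_{+}(w) \ \text{for cut } (0,+\infty), \qquad D^{\alpha}_{-}(w) \ \text{for cut } (0,-\infty). \tag{2.19} $$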

Let us note that the function (2.18) is well defined for any complex $\alpha = \alpha_1 + i\alpha_2 \neq -1, -2, \ldots$, but for the beginning we assume $\alpha_2 = 0$. Now let us go back to the assumptions a), b) which imply the result (2.16).

a) Assuming the opposite cut orientation $(0,-\infty)$ for $k^{\alpha}$ in Eq. (2.10) and repeating the corresponding sequence of steps does not change anything for $E^{\alpha}_{+}(w,\varepsilon)$, whereas in the case of $E^{\alpha}_{-}(w,\varepsilon)$ the phase $\arg\xi$ will shift by $-2\pi$ in both integrals in (2.13), but this change can be included in the factor $(-1)^{\alpha}$ ahead of the integral.

b) Let us assume the cuts in $z^{\alpha}$ have the opposite orientation to that in (2.14). Let us take e.g. the function $E^{\alpha}_{+}(w,\varepsilon)$; then the phase of $\xi$ complies with

$$ 3\pi/2 < \arg\xi < 5\pi/2 \quad \text{for } k > 0, \qquad -\pi/2 < \arg\xi < \pi/2 \quad \text{for } k < 0, \tag{2.20} $$

i.e. the phase of the second integral in (2.13) is now shifted by $-2\pi\alpha$, and instead of (2.16) we get

$$ E^{\alpha}_{+}(w,\varepsilon) = \frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\frac{1}{(w+i\varepsilon)^{\alpha+1}} - \frac{\exp(-2i\pi\alpha)}{(w-i\varepsilon)^{\alpha+1}}\right], \qquad \varepsilon > 0. \tag{2.21} $$

Obviously this function for $w \in (-\infty,+\infty)$ with an assumed cut of $z^{\alpha}$ on $(0,-\infty)$ is identically equal to the function (2.16) with the cut on $(0,+\infty)$. The analogous result can be obtained also for $E^{\alpha}_{-}$.

So we have shown that the result (2.16) of the integration (2.7) does not depend on the choice of cut orientation of the functions $k^{\alpha}$ and $w^{\alpha}$. The cut orientation of $w^{\alpha}$ in (2.16) and (2.18) is dictated only by the way of passing around the singularity $k = 0$ in the initial integrals (2.7), and the correspondence (2.19) always holds. Next let us notice an important property of the functions $D^{\alpha}_{\pm}$, $E^{\alpha}_{\pm}$. Obviously it holds

$$ \frac{d}{dw}E^{\alpha}_{\pm}(w,\varepsilon) = E^{\alpha+1}_{\pm}(w,\varepsilon), \qquad \frac{d}{dw}D^{\alpha}_{\pm}(w) = D^{\alpha+1}_{\pm}(w). \tag{2.22} $$
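The recurrence (2.22) can be checked numerically on the regularized kernel (2.16). The following is only a sketch under stated assumptions: the powers are evaluated on Python's principal branch (cut along the negative real axis, i.e. the case labelled $E^{\alpha}_{-}$ above), the factor $(-1)^{\alpha+1}$ is read as $\exp[i\pi(\alpha+1)]$, and the helper names `kernel_E` and `central_difference` are hypothetical.

```python
import cmath
import math


def kernel_E(alpha, w, eps):
    """Regularized kernel of Eq. (2.16), principal-branch powers (assumed convention)."""
    # (-1)^(alpha+1) is interpreted as exp(i*pi*(alpha+1)); this phase choice is an assumption.
    prefactor = cmath.exp(1j * math.pi * (alpha + 1)) * math.gamma(alpha + 1) / (2j * math.pi)
    upper = (w + 1j * eps) ** (-(alpha + 1))
    lower = (w - 1j * eps) ** (-(alpha + 1))
    return prefactor * (upper - lower)


def central_difference(func, w, step=1e-5):
    """Second-order finite-difference approximation of d(func)/dw."""
    return (func(w + step) - func(w - step)) / (2 * step)


alpha, w, eps = 0.5, 0.7, 0.3
lhs = central_difference(lambda u: kernel_E(alpha, u, eps), w)  # d/dw E^alpha
rhs = kernel_E(alpha + 1, w, eps)                               # E^(alpha+1)
print(abs(lhs - rhs))  # small: limited by the finite-difference error
```

The check works for any real $w \neq 0$ and $\varepsilon > 0$, since then $w \pm i\varepsilon$ stays away from the cut.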

Now let us go back to Eq. (2.4) and consider how to calculate this integral. It is possible either

a) first to find the difference of both terms in (2.16), then take the integration and the limit for $\varepsilon \to 0$, or

b) first to calculate both integrals independently, then take the limit of their difference.

Let us discuss both ways and compare the results.

a) The action of the operator $\mathbf{D}^{\alpha}_{\pm}$ as the integral of the difference. We make the calculation separately for two cases:

a1) $\alpha = n \geq 0$ is an integer number. In that case the complex function $w^{\alpha}$ has no cuts (we can omit the subscript $\pm$) and for $n = 0$ we can write

$$ E^{0}(w,\varepsilon) = \frac{1}{\pi}\cdot\frac{\varepsilon}{w^{2}+\varepsilon^{2}}, \tag{2.23} $$

i.e. for $\varepsilon \to 0$ we get the known representation of the δ-function (see e.g. [3], p. 35)

$$ \delta(w) = \lim_{\varepsilon\to 0+}\frac{1}{\pi}\cdot\frac{\varepsilon}{w^{2}+\varepsilon^{2}}, \tag{2.24} $$

which acts on the function $f$,

$$ \int_{-\infty}^{+\infty}\delta(x-y)f(y)\,dy = \lim_{\varepsilon\to 0+}\frac{1}{\pi}\int_{-\infty}^{+\infty}\frac{\varepsilon}{(x-y)^{2}+\varepsilon^{2}}\,f(y)\,dy = f(x). \tag{2.25} $$

Using Eqs. (2.22)-(2.25) one can easily show

$$ \frac{d^{n}f(x)}{dx^{n}} = \lim_{\varepsilon\to 0+}\int_{-\infty}^{+\infty}E^{n}(x-y,\varepsilon)f(y)\,dy \tag{2.26} $$

for any $n \geq 0$ for which the integral converges. So it is possible to identify

$$ D^{n}(w) = \frac{d^{n}}{dw^{n}}\delta(w), \qquad \frac{d^{n}f(x)}{dx^{n}} = \int_{-\infty}^{+\infty}D^{n}(x-y)f(y)\,dy, \tag{2.27} $$

i.e. the action of the operator $\mathbf{D}^{n}$ corresponds to an n-fold derivative.

a2) $\alpha$ is a real, non-integer number. Taking into account the cut orientations (2.19), the calculation of the limits (2.17) gives

$$
\begin{aligned}
D^{\alpha}_{-}(w) &= 0, & w &> 0,\\
D^{\alpha}_{+}(w) &= 0, & w &< 0,\\
D^{\alpha}_{-}(w) &= -\frac{\Gamma(\alpha+1)\sin([\alpha+1]\pi)}{\pi w^{\alpha+1}}, & w &< 0,\\
D^{\alpha}_{+}(w) &= +\frac{\Gamma(\alpha+1)\sin([\alpha+1]\pi)}{\pi w^{\alpha+1}}, & w &> 0.
\end{aligned}
\tag{2.28}
$$
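The limits (2.28) for $D^{\alpha}_{+}$ can be illustrated numerically by evaluating (2.16) at a very small $\varepsilon$. This is a sketch, not the author's procedure: the cut on $(0,+\infty)$ is implemented by taking $\arg z \in [0, 2\pi)$, the factor $(-1)^{\alpha+1}$ is read as $\exp[i\pi(\alpha+1)]$, and all function names are hypothetical.

```python
import cmath
import math


def power_cut_positive(z, exponent):
    """z**exponent with the cut on (0, +inf): arg z is taken in [0, 2*pi)."""
    phase = cmath.phase(z) % (2 * math.pi)
    return abs(z) ** exponent * cmath.exp(1j * exponent * phase)


def kernel_E_plus(alpha, w, eps):
    """Eq. (2.16) for the '+' mode, i.e. with the cut (0, +inf) of Eq. (2.19)."""
    prefactor = cmath.exp(1j * math.pi * (alpha + 1)) * math.gamma(alpha + 1) / (2j * math.pi)
    upper = 1 / power_cut_positive(w + 1j * eps, alpha + 1)
    lower = 1 / power_cut_positive(w - 1j * eps, alpha + 1)
    return prefactor * (upper - lower)


def limit_D_plus(alpha, w):
    """Right-hand side of Eq. (2.28) for D_+^alpha."""
    if w < 0:
        return 0.0
    return math.gamma(alpha + 1) * math.sin((alpha + 1) * math.pi) / (math.pi * w ** (alpha + 1))


alpha, eps = 0.5, 1e-7
value_right = kernel_E_plus(alpha, 1.0, eps)   # should approach the w > 0 branch of (2.28)
value_left = kernel_E_plus(alpha, -1.0, eps)   # should approach zero
print(value_right, limit_D_plus(alpha, 1.0), value_left)
```

For $w > 0$ the two terms of (2.16) sit on opposite lips of the cut and do not cancel; for $w < 0$ they approach the same value and the difference vanishes with $\varepsilon$.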

The first two equations in (2.28) correspond to Eqs. (2.9), which again confirms the correct correspondence of cuts in (2.19). After inserting (2.28) into Eq. (2.4) we get

$$ f^{\alpha}_{\pm}(x) = -\frac{\Gamma(\alpha+1)\sin([\alpha+1]\pi)}{\pi}\int_{x}^{\mp\infty}\frac{f(y)\,dy}{(x-y)^{\alpha+1}}. \tag{2.29} $$

For $y = x$ and $\alpha \geq 0$ the integral has a singularity. The method for regularization of this integral will become apparent in the next part.

b) The action of the operator $\mathbf{D}^{\alpha}_{\pm}$ as a difference of two integrals. After inserting (2.18) into (2.4) we get

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\int_{-\infty}^{+\infty}\frac{f(y)\,dy}{(x-y+i\varepsilon)^{\alpha+1}} - \int_{-\infty}^{+\infty}\frac{f(y)\,dy}{(x-y-i\varepsilon)^{\alpha+1}}\right]. \tag{2.30} $$

We assume for the present that the function $f(y)$ is analytic on the whole real axis. The last equation can be rewritten

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\int_{-\infty-i\varepsilon}^{+\infty-i\varepsilon}\frac{f(z)\,dz}{(x-z)^{\alpha+1}} - \int_{-\infty+i\varepsilon}^{+\infty+i\varepsilon}\frac{f(z)\,dz}{(x-z)^{\alpha+1}}\right]. \tag{2.31} $$

Again let us separate two cases:

b1) $\alpha = n \geq 0$ is an integer number. In Eq. (2.31) we can link up both integration paths at $z = \pm\infty$ and write

$$ f^{n}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{n+1}\,\Gamma(n+1)}{2i\pi}\oint_{C}\frac{f(z)\,dz}{(x-z)^{n+1}} = \frac{d^{n}f}{dx^{n}}, \tag{2.32} $$

where $C$ is any closed curve enclosing the singular point. Therefore for $n \geq 0$ the operator $\mathbf{D}^{n}$ can be again identified with an ordinary n-fold derivative.

b2) $\alpha$ is a real, non-integer number. Similarly as in case b1) we get the integrals

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C_{\pm}}\frac{f(z)\,dz}{(x-z)^{\alpha+1}}, \tag{2.33} $$

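where the paths $C_{\pm}$ pass about the corresponding cut as shown in Fig. 2. The integrals converge provided that

$$ \lim_{z\to\pm\infty}\frac{f(z)}{z^{\alpha}} = 0. \tag{2.34} $$

The integration paths can be decomposed into three parts $L_{1\pm}$, $L_{2\pm}$, $L_{3\pm}$ and after evaluation of the corresponding integrals one gets

$$ f^{\alpha}_{\pm}(x) = -\lim_{\varepsilon\to 0+}\frac{\Gamma(\alpha+1)\sin([\alpha+1]\pi)}{\pi}\left[\int_{x\mp\varepsilon}^{\mp\infty}\frac{f(z)\,dz}{(x-z)^{\alpha+1}} + \frac{(\pm 1)^{\alpha}f(x)}{\alpha\,\varepsilon^{\alpha}}\right], \tag{2.35} $$

or, using the relation (A.10) from the Appendix,

$$ f^{\alpha}_{\pm}(x) = -\lim_{\varepsilon\to 0+}\frac{1}{\Gamma(-\alpha)}\left[\int_{x\mp\varepsilon}^{\mp\infty}\frac{f(z)\,dz}{(x-z)^{\alpha+1}} + \frac{(\pm 1)^{\alpha}f(x)}{\alpha\,\varepsilon^{\alpha}}\right]. \tag{2.36} $$

Fig. 2. Integration paths in Eq. (2.33): a) $C_{+}$, b) $C_{-}$ (complex z plane; each path consists of parts $L_{1\pm}$, $L_{2\pm}$, $L_{3\pm}$ passing at distance ε around the cut starting at x)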

For $\alpha < 0$ the integrals are finite and the second terms vanish. In this case Eq. (2.35) is identical with Eq. (2.29). On the other hand we have assumed the initial integral (2.33) is finite, therefore the sum of both terms in (2.35) is also finite for any $\alpha > 0$. In this sense, by addition of the second term, the integral in (2.29) can be regularized. Note that in contradistinction to (2.18) the relation (2.36) is well defined also for $\alpha = -1, -2, -3, \ldots$, but not for $\alpha = 0, 1, 2, 3, \ldots$. The last equation for a negative integer $\alpha = -n$,

$$ f^{-n}_{\pm}(x) = \frac{1}{\Gamma(n)}\int_{\mp\infty}^{x}(x-z)^{n-1}f(z)\,dz, \tag{2.37} $$

is a special case of the n-fold integral formula (1.1). Now, let us assume $0 < \alpha < 1$ and calculate the integral (2.33) by parts. Obviously

$$ \int_{C_{\pm}}\frac{f(z)\,dz}{(x-z)^{\alpha+1}} = \frac{1}{\alpha}\left[-\int_{C_{\pm}}\frac{f'(z)\,dz}{(x-z)^{\alpha}} + \left.\frac{f(z)}{(x-z)^{\alpha}}\right|_{\mp\infty\mp i0}^{\mp\infty\pm i0}\right] \tag{2.38} $$

and the integral on the right side is finite. Any $\alpha$ can be written as the sum of an integer and a fractional part,

$$ \alpha = n + \Delta\alpha, \qquad n = [\alpha], \qquad 0 \leq \Delta\alpha < 1. \tag{2.39} $$

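For $n \geq 0$ we can repeat the integration by parts $n + 1$ times and if the function $f$ meets the requirements

$$ \lim_{z\to\pm\infty}\frac{f^{p}(z)}{(x-z)^{\alpha-p}} = 0 \qquad \text{for } p = 0, 1, \ldots, n, \tag{2.40} $$

then instead of (2.35) we get

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\Delta\alpha}\,\Gamma(\Delta\alpha)}{2i\pi}\int_{C_{\pm}}\frac{f^{n+1}(z)\,dz}{(x-z)^{\Delta\alpha}} = -\frac{\Gamma(\Delta\alpha)\sin(\Delta\alpha\,\pi)}{\pi}\int_{x}^{\mp\infty}\frac{f^{n+1}(z)\,dz}{(x-z)^{\Delta\alpha}}. \tag{2.41} $$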

Let us note that the integrals (2.33) can be modified also in another way. We assume that all integrals

$$ I_{\pm}(\gamma,x) = \int_{C_{\pm}}\frac{f(z)\,dz}{(x-z)^{\gamma}}, \qquad \gamma = \Delta\alpha,\ \Delta\alpha+1,\ \ldots,\ \alpha+1 \tag{2.42} $$

converge; then the recurrence relation holds:

$$ \frac{d}{dx}I_{\pm}(\gamma,x) = -\gamma\,I_{\pm}(\gamma+1,x). \tag{2.43} $$

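Application of this relation to the integral (2.33) with $\alpha = n + \Delta\alpha$ gives

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\Delta\alpha}\,\Gamma(\Delta\alpha)}{2i\pi}\left(\frac{d}{dx}\right)^{n+1}\int_{C_{\pm}}\frac{f(z)\,dz}{(x-z)^{\Delta\alpha}} = -\frac{\Gamma(\Delta\alpha)\sin(\Delta\alpha\,\pi)}{\pi}\left(\frac{d}{dx}\right)^{n+1}\int_{x}^{\mp\infty}\frac{f(z)\,dz}{(x-z)^{\Delta\alpha}}. \tag{2.44} $$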

Actually in Eq. (2.41) we apply the operation $\mathbf{D}^{\Delta\alpha-1}_{\pm}$ on the (n+1)-fold derivative of the function $f$, whereas in Eq. (2.44) both operations are interchanged. Using relation (A.10), obviously we can modify both equations:

$$ f^{\alpha}_{\pm}(x) = \frac{1}{\Gamma(1-\Delta\alpha)}\int_{\mp\infty}^{x}\frac{f^{n+1}(z)\,dz}{(x-z)^{\Delta\alpha}} = \frac{1}{\Gamma(1-\Delta\alpha)}\left(\frac{d}{dx}\right)^{n+1}\int_{\mp\infty}^{x}\frac{f(z)\,dz}{(x-z)^{\Delta\alpha}}. \tag{2.45} $$

So we have shown that the operator $\mathbf{D}^{\alpha}$ with the kernel (2.18) can be considered a continuous interpolation of the ordinary n-fold derivative (integral) of the functions analytic on the real axis and fulfilling the condition (2.34).
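As an illustration of the first form of (2.45), the half-derivative $f^{1/2}_{+}$ of $f(x) = 1/(1+x^{2})$ can be evaluated by direct quadrature. This is a sketch under assumptions not taken from the text: the substitution $z = x - u^{2}$ (to remove the endpoint singularity), the composite Simpson rule, the truncation of the integration range, and the comparison value $\Gamma(\alpha+1)\,\mathrm{Re}[e^{i\pi\alpha/2}(1-ix)^{-(\alpha+1)}]$, which follows from applying the Fourier multiplier $(ik)^{\alpha}$ to $f$. All function names are hypothetical.

```python
import cmath
import math


def f_prime(z):
    """Derivative of the test function f(z) = 1 / (1 + z^2)."""
    return -2.0 * z / (1.0 + z * z) ** 2


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def half_derivative(x, u_max=40.0, n=20000):
    """f_+^{1/2}(x) from the first form of Eq. (2.45) with n = 0, fractional part 1/2.

    With z = x - u^2 the integral of f'(z) (x - z)^(-1/2) over (-inf, x)
    becomes the integral of 2 f'(x - u^2) over u in (0, inf), truncated at u_max.
    """
    integral = simpson(lambda u: 2.0 * f_prime(x - u * u), 0.0, u_max, n)
    return integral / math.gamma(0.5)


def reference(x, alpha):
    """Comparison value obtained from the Fourier symbol (ik)^alpha (assumed cross-check)."""
    value = cmath.exp(1j * math.pi * alpha / 2) * (1 - 1j * x) ** (-(alpha + 1))
    return math.gamma(alpha + 1) * value.real


for x in (0.0, 0.3, -1.2):
    print(x, half_derivative(x), reference(x, 0.5))
```

The two columns agree to quadrature accuracy; only the `+` mode (integration from $-\infty$) is covered here, since for it all quantities are real.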

3. Composition of Fractional Derivatives

In this section we shall investigate how the composition of the operators $\mathbf{D}^{\alpha}$ is realized in the representation given by Eq. (2.18). Therefore we shall deal with the integrals

$$ I = \int_{-\infty}^{+\infty}D^{\alpha}(x-y)\,D^{\beta}(y-z)\,dy. \tag{3.1} $$

Let us denote

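$$ h^{\bullet}(\gamma,w) = \frac{1}{(w+i\tau)^{\gamma}}, \qquad h_{\bullet}(\gamma,w) = -\frac{1}{(w-i\tau)^{\gamma}}, \qquad \tau > 0, \tag{3.2} $$

and

$$
\begin{aligned}
I_1 &= \int_{-\infty}^{+\infty}h^{\bullet}(\alpha+1,x-y)\,h^{\bullet}(\beta+1,y-z)\,dy,\\
I_2 &= \int_{-\infty}^{+\infty}h_{\bullet}(\alpha+1,x-y)\,h_{\bullet}(\beta+1,y-z)\,dy,\\
I_3 &= \int_{-\infty}^{+\infty}h^{\bullet}(\alpha+1,x-y)\,h_{\bullet}(\beta+1,y-z)\,dy,\\
I_4 &= \int_{-\infty}^{+\infty}h_{\bullet}(\alpha+1,x-y)\,h^{\bullet}(\beta+1,y-z)\,dy,
\end{aligned}
\tag{3.3}
$$

then

$$ I = \frac{(-1)^{\alpha+\beta+1}\,\Gamma(\alpha+1)\Gamma(\beta+1)}{4\pi^{2}}\,(I_1 + I_2 + I_3 + I_4). \tag{3.4} $$

Fig. 3. Integration paths in Eq. (3.5): a) case when the integral vanishes, b) case leading to the result (3.8)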

Now let us calculate the more general integral

$$ J = \int_{K}\frac{dz}{(z_2-z)^{\alpha+1}(z-z_1)^{\beta+1}}, \tag{3.5} $$

where $K$ is an arbitrary line in the complex plane, $z_1$, $z_2$ are any two distinct points and $\alpha + \beta > -1$. We also assume the cuts of the function in the integral do not intersect the line $K$. Then there are two possibilities:

a) $K$ does not pass between the points $z_1$, $z_2$, see Fig. 3a. Then obviously $J = 0$, since the line $K$ can be closed at infinity by the arc in the half plane which does not contain the singularities $z_1$, $z_2$.

b) $K$ passes between the points $z_1$, $z_2$, see Fig. 3b. $K'$ denotes the line crossing the segment $\langle z_1, z_2\rangle$ perpendicularly at its center. If we assume that the cuts do not intersect either of the lines $K$, $K'$, then in the integral (3.5) the path $K$ can be substituted by $K'$. Further, if we denote

$$ z_0 = (z_1+z_2)/2, \qquad r\exp(i\varphi) = (z_2-z_1)/2, \tag{3.6} $$

and substitute $z = z_0 + it\exp(i\varphi)$, then

$$ J = \frac{i}{\exp[i\varphi(\alpha+\beta+1)]}\int_{-\infty}^{+\infty}\frac{dt}{(r-it)^{\alpha+1}(r+it)^{\beta+1}}. \tag{3.7} $$

The last integral can be found in tables (see e.g. [7], p. 301); using (3.6) we obtain

$$ J = \frac{2i\pi}{(z_2-z_1)^{\alpha+\beta+1}}\cdot\frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha+1)\Gamma(\beta+1)}. \tag{3.8} $$
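The tabulated integral behind the step from (3.7) to (3.8) can be cross-checked by quadrature. Combining (3.6)-(3.8), the real-line integral in (3.7) should equal $2\pi(2r)^{-(\alpha+\beta+1)}\Gamma(\alpha+\beta+1)/[\Gamma(\alpha+1)\Gamma(\beta+1)]$. The sketch below assumes principal-branch powers (harmless here, because $r \pm it$ has positive real part), a truncated integration range and the composite Simpson rule; the helper names are hypothetical.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals (complex-valued func allowed)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def integrand(t, r, alpha, beta):
    """Integrand of Eq. (3.7)."""
    return 1.0 / ((r - 1j * t) ** (alpha + 1) * (r + 1j * t) ** (beta + 1))


alpha, beta, r = 0.5, 1.2, 1.0
# The integrand decays like |t|^-(alpha+beta+2), so truncating at |t| = 100 is adequate here.
numeric = simpson(lambda t: integrand(t, r, alpha, beta), -100.0, 100.0, 100000)
closed = (2 * math.pi * (2 * r) ** (-(alpha + beta + 1))
          * math.gamma(alpha + beta + 1) / (math.gamma(alpha + 1) * math.gamma(beta + 1)))
print(numeric, closed)
```

The imaginary part of `numeric` cancels by the symmetry $t \to -t$, and the real part matches `closed` up to the truncation error.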

Let us note that the opposite orientation of the line $K$ (i.e. point $z_2$ on the left side with respect to the direction of $K$, as for the line $L$ in Fig. 3b) should give the result (3.8) with opposite sign. Obviously the integrals $I_3$, $I_4$ vanish and for $I_1$, $I_2$ we get

$$ I_1 = \frac{-2i\pi}{(x-z+2i\tau)^{\alpha+\beta+1}}\cdot\frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha+1)\Gamma(\beta+1)}, \tag{3.9} $$

$$ I_2 = \frac{+2i\pi}{(x-z-2i\tau)^{\alpha+\beta+1}}\cdot\frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha+1)\Gamma(\beta+1)}, \tag{3.10} $$

and inserting into (3.4) gives

$$ I = \frac{(-1)^{\alpha+\beta+1}\,\Gamma(\alpha+\beta+1)}{2i\pi}\left[\frac{1}{(x-z+2i\tau)^{\alpha+\beta+1}} - \frac{1}{(x-z-2i\tau)^{\alpha+\beta+1}}\right]. \tag{3.11} $$

For $\tau \to 0$ this expression corresponds to $D^{\alpha+\beta}(x-z)$ in (2.18). This result is formally correct; nevertheless its drawback is that it does not reflect the correspondence of the cut orientations in the initial expression (3.1) and the final one (3.11). A more rigorous discussion of the cuts is postponed to the Appendix, and here we give only the result. The following composition relation holds for the operators (2.18) with equally oriented cuts:

$$ D^{\alpha+\beta}_{\pm}(x-z) = \int_{-\infty}^{+\infty}D^{\alpha}_{\pm}(x-y)\,D^{\beta}_{\pm}(y-z)\,dy, \qquad \alpha,\beta \neq -1,-2,-3,\ldots;\quad \alpha+\beta > -1. \tag{3.12} $$

Let us note that the validity of relation (3.12) can be verified in the initial representation (2.7) as well. Further, considering $D^{\alpha}_{\pm}$ as a generalized function, the condition $\alpha + \beta > -1$ can be omitted and the composition relation has the form

$$ \int_{-\infty}^{+\infty}D^{\alpha+\beta}_{\pm}(x-\xi)f(\xi)\,d\xi = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}D^{\alpha}_{\pm}(x-y)\,D^{\beta}_{\pm}(y-\xi)f(\xi)\,d\xi\,dy, \qquad \alpha,\beta \neq -1,-2,\ldots \tag{3.13} $$

for all functions analytic on the real axis and fulfilling

$$ \lim_{z\to\pm\infty}\frac{f(z)}{z^{\alpha+\beta}} = 0. \tag{3.14} $$

Equation (3.13) follows from (3.12) and the relation $\frac{d}{dw}D^{\gamma-1}_{\pm}(w) = D^{\gamma}_{\pm}(w)$. Repeated integration by parts on both sides can reduce the sum $\alpha + \beta$ by any natural number, so in this way the validity of (3.13) is proved. All our previous considerations concerned the action of the operator $\mathbf{D}^{\alpha}$ on the real axis. Next we shall try to extend them to the whole complex plane.
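For integer orders, where no cuts are present, the regularized composition (3.11) can be verified by direct quadrature: convolving $E^{a}(\cdot,\tau)$ with $E^{b}(\cdot,\tau)$ should give $E^{a+b}(\cdot,2\tau)$ in the notation of (2.16). The sketch below covers only this integer case; the Simpson rule, the truncated integration range and the helper names are assumptions of the example.

```python
import math


def kernel_E_int(n, w, eps):
    """E^n(w, eps) from Eq. (2.16) for integer n >= 0 (no branch cuts involved)."""
    prefactor = (-1) ** (n + 1) * math.factorial(n) / (2j * math.pi)
    value = prefactor * ((w + 1j * eps) ** (-(n + 1)) - (w - 1j * eps) ** (-(n + 1)))
    return value.real  # the imaginary part vanishes for real w


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


a, b, tau, x, z = 1, 2, 0.5, 0.7, 0.0
convolution = simpson(
    lambda y: kernel_E_int(a, x - y, tau) * kernel_E_int(b, y - z, tau),
    -60.0, 60.0, 60000,
)
direct = kernel_E_int(a + b, x - z, 2 * tau)  # right-hand side of (3.11)
print(convolution, direct)
```

The doubling $\tau \to 2\tau$ seen in (3.11) is the familiar fact that the convolution of two Lorentzians (2.23) of width $\tau$ is a Lorentzian of width $2\tau$.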


4. Fractional Derivative in Complex Plane

First, let us illustrate the notion of fractional derivative introduced in the previous part with a particular example. Take the function

$$ f(x) = \frac{1}{1+x^{2}} = -\frac{1}{2i}\left[\frac{1}{x+i} - \frac{1}{x-i}\right], \tag{4.1} $$

for which (2.33) gives

$$ f^{\alpha}_{\pm}(x) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C_{\pm}}\frac{dz}{(x-z)^{\alpha+1}(1+z^{2})}. \tag{4.2} $$

Obviously for $\alpha > -2$ the integration paths in Fig. 2 can be closed by arcs at infinity as shown in Fig. 4, and

$$ f^{\alpha}_{\pm}(x) = \frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i}\left[\frac{1}{(x+i)^{\alpha+1}} - \frac{1}{(x-i)^{\alpha+1}}\right]. \tag{4.3} $$

Fig. 4. Integration paths in Eq. (4.2) closed in infinity: a) $C_{+}$, b) $C_{-}$ (complex z plane with the poles at +i and −i and the arcs C(∞))

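Even though the result formally does not depend on a given subscript (+ or −), it is necessary to take into account a different cut orientation for $f^{\alpha}_{+}$, $f^{\alpha}_{-}$. In accordance with the phase convention (2.6), we take

$$
\begin{aligned}
&\left.\begin{aligned}
(x+i)^{\alpha+1} &= R^{\alpha+1}\exp[i\varphi(\alpha+1)]\\
(x-i)^{\alpha+1} &= R^{\alpha+1}\exp[i(2\pi-\varphi)(\alpha+1)]
\end{aligned}\right\}\ \text{for } f^{\alpha}_{+},\\
&\left.\begin{aligned}
(x+i)^{\alpha+1} &= R^{\alpha+1}\exp[i\varphi(\alpha+1)]\\
(x-i)^{\alpha+1} &= R^{\alpha+1}\exp[-i\varphi(\alpha+1)]
\end{aligned}\right\}\ \text{for } f^{\alpha}_{-},
\end{aligned}
\tag{4.4}
$$

where

$$ R = \sqrt{1+x^{2}}, \qquad \varphi = \frac{\pi}{2} - \arcsin\frac{x}{R}. \tag{4.5} $$

After inserting into (4.3) and a simple rearrangement, we obtain

$$ f^{\alpha}_{\pm}(x) = \left(\frac{\pm 1}{\sqrt{1+x^{2}}}\right)^{\alpha+1}\Gamma(\alpha+1)\,\sin\!\left(\left[\arcsin\frac{x}{\sqrt{1+x^{2}}} \pm \frac{\pi}{2}\right](\alpha+1)\right). \tag{4.6} $$

Fig. 5. Tentative modification of integration paths in Eq. (4.2) and Fig. 4 (complex z plane with paths C1, C2 starting at x and the pole +i)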

Remark. From the last relation it can be easily shown that e.g.

a) for $\alpha$ integer, $\alpha = n > -1$, it holds $f^{n}_{+}(x) = f^{n}_{-}(x) \equiv f^{n}(x)$, where $f^{n}$ is an ordinary n-fold derivative,

b) $f^{\alpha}_{-}(x) = (-1)^{\alpha}f^{\alpha}_{+}(-x)$,

c) for $\alpha \to -1$ we get a primitive function of $f$: $f^{\alpha}_{\pm}(x) \to \arctan(x) \pm \pi/2$. The same results can be obtained also from (2.45) for $n = -1$, $\Delta\alpha = 0$.
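The three statements of the Remark can be spot-checked numerically from the closed form (4.6). The sketch below is illustrative only: the factor $(-1)^{\alpha+1}$ of the `−` mode is read as $\exp[i\pi(\alpha+1)]$ (an assumed phase choice, so that mode is returned as a complex number), the limit $\alpha \to -1$ is approximated by $\alpha = -1 + 10^{-8}$, and the function names are hypothetical.

```python
import cmath
import math


def fractional_derivative_46(x, alpha, branch):
    """Eq. (4.6) for f(x) = 1/(1+x^2); branch = +1 for f_+ and -1 for f_-."""
    radius = math.sqrt(1.0 + x * x)
    theta = math.asin(x / radius)
    # (+-1)^(alpha+1): 1 for '+', exp(i*pi*(alpha+1)) for '-' (assumed phase convention).
    sign_factor = 1.0 if branch > 0 else cmath.exp(1j * math.pi * (alpha + 1))
    return (sign_factor * radius ** (-(alpha + 1)) * math.gamma(alpha + 1)
            * math.sin((theta + branch * math.pi / 2) * (alpha + 1)))


def second_derivative(x):
    """Ordinary second derivative of 1/(1+x^2)."""
    return (6 * x * x - 2) / (1 + x * x) ** 3


x = 0.4
# a) integer order: both modes reduce to the ordinary derivative
print(fractional_derivative_46(x, 2, +1), fractional_derivative_46(x, 2, -1), second_derivative(x))
# b) f_-^alpha(x) = (-1)^alpha f_+^alpha(-x)
lhs_b = fractional_derivative_46(x, 0.5, -1)
rhs_b = cmath.exp(1j * math.pi * 0.5) * fractional_derivative_46(-x, 0.5, +1)
print(lhs_b, rhs_b)
# c) alpha -> -1 gives the primitive functions arctan(x) +- pi/2
near_minus_one = -1 + 1e-8
print(fractional_derivative_46(x, near_minus_one, +1), math.atan(x) + math.pi / 2)
```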

Using formula (2.33) we shall try to generalize the operator of the fractional derivative on the real axis to the whole complex plane. For this operator we shall demand again:

$$ \mathbf{D}^{\alpha}f(z) = \frac{d^{n}f}{dz^{n}} \qquad \text{for } \alpha = n \geq 0, \tag{4.7} $$

$$ \mathbf{D}^{\alpha}\circ\mathbf{D}^{\beta} = \mathbf{D}^{\alpha+\beta}. \tag{4.8} $$

To make the generalization more transparent, we shall do it in several steps.

1) Let us go back to Fig. 4 and ask how the results (4.3)-(4.6) will change if the cut (and correspondingly the integration path) is oriented otherwise than along the real axis, e.g. as shown in Fig. 5. Obviously for Eq. (4.3) nothing will change, but the form of Eqs. (4.4), (4.6) will depend on the mutual position of the cut and both poles. Because we have accepted the phase convention (2.6) only for cuts on the real axis, it is now necessary to make a more general consideration to determine the phases of both terms in (4.3). Instead of (4.1) let us take a complex function

$$ f(z) = \frac{a_1}{z-z_1} + \frac{a_2}{z-z_2} \tag{4.9} $$

and the integration path $C$ displayed in Fig. 6a. The corresponding integral will be

$$ f^{\alpha}(z_0) = -(-1)^{\alpha+1}\,\Gamma(\alpha+1)\left[\frac{a_1}{(z_0-z_1)^{\alpha+1}} + \frac{a_2}{(z_0-z_2)^{\alpha+1}}\right]. \tag{4.10} $$

The path $C$ passes about the cut of the function $1/w^{\alpha+1} = 1/(z_0-z)^{\alpha+1}$ in the variable $z$, i.e. the cut of the function $1/w^{\alpha+1}$ is oriented in the opposite direction, see Fig. 6b. The phases of $w_1^{\alpha+1} = (z_0-z_1)^{\alpha+1}$ and $w_2^{\alpha+1} = (z_0-z_2)^{\alpha+1}$ must be fixed with respect to this cut. Fig. 6b prompts the following rule.

Fig. 6. a) Integration paths in Eq. (2.33) for the function (4.9). $\varphi_1$, $\varphi_2$ are phases corresponding to the terms $w_k^{\alpha+1} = (z_0-z_k)^{\alpha+1}$ in the result of integration (4.11). b) The same phases represented for the variable $w$

Rule 1. The phase of the complex variable $w$ is given by the angle $\varphi$ of the arc leading from the positive real half axis and measured in the positive direction (counterclockwise) to the point $w$, and if the arc intersects the cut, $\varphi$ is reduced by $2\pi$.

This rule can be applied also directly to the situation in Fig. 6a for fixing the phase of $(z_0-z)^{\alpha+1}$. The only modification is that $\varphi$ is measured from the half line $(z_0, z_0-\infty)$. Therefore the angles $\varphi_1$, $\varphi_2$ in Fig. 6 fix the phases in (4.10),

$$ f^{\alpha}(z_0) = (-1)^{\alpha}\,\Gamma(\alpha+1)\left[\frac{a_1\exp(-i\varphi_1[\alpha+1])}{|(z_0-z_1)^{\alpha+1}|} + \frac{a_2\exp(-i\varphi_2[\alpha+1])}{|(z_0-z_2)^{\alpha+1}|}\right]. \tag{4.11} $$

Now the introduced rule can be applied also for integration on the paths $C_1$ ($C_2$) in Fig. 5. Doing this, we get the same result as in the case of integration on the paths $C_{+}$ ($C_{-}$) in Fig. 4. Therefore, depending on the position of the chosen derivative cut with respect to both poles, the function (4.1) has (up to the factor (2.8)) at a given point $x$ two different values of the fractional derivative, given by Eq. (4.6).

2) We shall now generalize the prescription for calculating the fractional derivative of a function having the form (4.9) to functions having a finite number of poles,

$$ h(z) = \sum_{k=1}^{N}\frac{a_k}{(z-z_k)^{n_k+1}}, \qquad n_k \geq 0. \tag{4.12} $$

Let the derivative cut be given by the half line $L \equiv [z_0, z_0+\exp(i\theta)\infty]$ which does not go through any of the poles $z_k$. Then

$$
\begin{aligned}
h^{\alpha}(z_0) &= \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\sum_{k=1}^{N}\int_{C(L)}\frac{a_k\,dz}{(z_0-z)^{\alpha+1}(z-z_k)^{n_k+1}}\\
&= (-1)^{\alpha}\sum_{k=1}^{N}\frac{\Gamma(\alpha+n_k+1)}{\Gamma(n_k+1)}\cdot\frac{a_k\exp(-i\varphi_k[\alpha+1])}{|z_0-z_k|^{\alpha+1}(z_0-z_k)^{n_k}},
\end{aligned}
\tag{4.13}
$$

where the integration path $C(L)$ is shown in Fig. 7. The angles $\varphi_k$ are calculated using Rule 1; therefore it is obvious that the function $h^{\alpha}(z_0)$ in (4.13) can have, according to the cut orientation, as many values as the function (4.12) has different poles $z_k$ (number of values = number of poles of the function (4.12)).

Fig. 7. Integration paths in Eq. (2.33) with the function (4.12) and phases $\varphi_k$ appearing in the result of integration (4.13)

3) Now let us consider the functions (4.12), (4.13) in the case when the cut is not a line, but some general curve connecting the points $z_0$ and $\exp(i\theta)\infty$. Then the result (4.13) will be formally the same; only the prescription for fixing the angles $\varphi_k$ must be modified. General cases are illustrated in Fig. 8, where it is apparent that the arc which "measures" the angle may intersect the cut several times. The consistent assignment of the angles $\varphi_k$ corresponding to the points $z_k$ in Eq. (4.13) can be ensured by the following prescription.

Rule 2. Let us choose on the half line $(z_0, z_0-\infty)$ any reference point $z_R$. The angle $\varphi_k$ is given by the angle $z_R z_0 z_k$ measured in the positive sense and is then corrected for each intersection with the cut as follows. Place the palm of the right or left hand at an intersection in such a way that the fingers lead from $z_R$ towards $z_k$ and the thumb points in the direction of the cut away from the branching point $z_0$. If this condition is met by the right (left) hand, $\varphi_k$ is increased (reduced) by $2\pi$.

Remark. It is essential to choose one common reference point $z_R$ for all $z_k$. A shift of this point results at most in equal shifts of all angles $\varphi_k \to \varphi_k + 2n\pi$, which do not change Eq. (4.13).

Fig. 8. a), b) Examples of the curvilinear cuts generating integration paths for Eq. (2.33) with the function (4.12). Other symbols are defined in Rule 2

Therefore for any curvilinear cut the relation (4.13) can be written

$$ h^{\alpha}(z_0) = (-1)^{\alpha}\sum_{k=1}^{N}\frac{\Gamma(\alpha+n_k+1)}{\Gamma(n_k+1)}\cdot\frac{a_k\exp(-i[\varphi_k+2m_k\pi][\alpha+1])}{|z_0-z_k|^{\alpha+1}(z_0-z_k)^{n_k}}. \tag{4.14} $$

The set of integer numbers $m_k$ (or more exactly their differences) characterizes the way the curvilinear cut passes among the poles $z_k$. If we accept curvilinear cuts for the derivative of the function (4.1), then using (4.14) one can obtain

$$ f^{\alpha}(x) = \frac{(-1)^{k(\alpha+1)}\,\Gamma(\alpha+1)}{\left(\sqrt{1+x^{2}}\right)^{\alpha+1}}\,\sin\!\left(\left[\arcsin\frac{x}{\sqrt{1+x^{2}}} + (2k+1)\frac{\pi}{2}\right](\alpha+1)\right), \tag{4.15} $$

where $k = m_2 - m_1$. Let us note that

$$ \lim_{\alpha\to -1}f^{\alpha}(x) = \arctan(x) + (2k+1)\frac{\pi}{2}, \tag{4.16} $$

i.e. we obtain the infinite (but countable) set of primitive functions for the function (4.1).

4) So far we have considered only analytic functions of the form (4.12), for which the corresponding integral on the whole circle $C(\infty)$ vanishes. Now we are going to consider the general case, when this integral vanishes only on a part of the circle. But first, let us go back to the operator (2.18) acting on analytic functions on the real axis (or part of this axis) and try to generalize it to analytic functions on a curve in the complex plane connecting a pair of points on the circle $C(\infty)$, see Fig. 9a. Let this curve be given as a continuous complex function of the real parameter $t \in (-\infty,+\infty)$,

$$ z = x(t) + iy(t) \equiv \psi(t), \qquad \psi(-\infty) = \infty_1, \qquad \psi(+\infty) = \infty_2; \tag{4.17} $$

then its derivative

$$ \psi'(t) = \frac{dx}{dt} + i\frac{dy}{dt} \tag{4.18} $$

determines in the complex plane the vector tangent to the curve $\psi$ and oriented in the direction of increasing $t$. The function

$$ \nu(t) = \frac{\psi'(t)}{\sqrt{\psi'(t)\cdot\psi^{*\prime}(t)}} = \exp[i\omega(t)] \tag{4.19} $$

represents this vector normalized to unit length ($\omega(t)$ is the phase of this vector), and the function $i\nu(t)$ represents the unit vector perpendicular to $\psi$ and oriented to the left with respect to the direction of $\psi$. Now let us define the function of the complex variable $z \in \psi$ with complex parameters $z_0 \in \psi$ and $\alpha \neq -1, -2, -3, \ldots$,

$$ D^{\alpha}_{\psi}(z_0-z) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\frac{1}{(z_0-z+i\varepsilon\nu)^{\alpha+1}} - \frac{1}{(z_0-z-i\varepsilon\nu)^{\alpha+1}}\right], \tag{4.20} $$

having the curvilinear cuts (for $\alpha \neq 0, 1, 2, \ldots$) corresponding to both terms coming out of the points $z_0 \pm i\varepsilon\nu$ and going jointly along the curve $\psi$ to the points $\infty_1$ or $\infty_2$, see Fig. 9b. So, in contradistinction to the linear cuts, the form of the cut of $w^{\alpha+1} = (z_0-z)^{\alpha+1}$ does depend on the position of the point $z_0$ on $\psi$. Next, using this function we define the operator

Fig. 9. a) Integration paths for the operator (4.20) in complex plane. b) Corresponding curvilinear cuts in Eq. (4.20). c) Corresponding integration path in Eq. (4.23)

$$ \mathbf{D}^{\alpha}_{\psi}f = f^{\alpha}_{\psi\pm}(z_0) = \int_{\psi}D^{\alpha}_{\psi}(z_0-z)f(z)\,dz = \int_{-\infty}^{+\infty}D^{\alpha}_{\psi}(\psi(t_0)-\psi(t))\,f(\psi(t))\,\psi'(t)\,dt \tag{4.21} $$

acting on the functions analytic on the curve $\psi$ for which the integral converges.

Remark. Let us note that the integral (4.21) depends on the choice of the cut end-point ($\infty_1$ or $\infty_2$) but does not depend on the direction ($\infty_1 \to \infty_2$ or $\infty_1 \leftarrow \infty_2$) in which the integration is done. That is a consequence of the fact that the change of integration direction $\psi(t) \to \psi(-t)$, $dz \to -dz$ implies the change $\nu(t) \to -\nu(t)$, which implies the change $D^{\alpha}_{\psi}(z_0-z) \to -D^{\alpha}_{\psi}(z_0-z)$; therefore the integral does not change.

If we calculate (4.21) as the difference

$$ \mathbf{D}^{\alpha}_{\psi}f = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\left[\int_{\psi}\frac{f(z)\,dz}{(z_0-z+i\varepsilon\nu)^{\alpha+1}} - \int_{\psi}\frac{f(z)\,dz}{(z_0-z-i\varepsilon\nu)^{\alpha+1}}\right], \tag{4.22} $$

then this difference can be expressed as

$$ \int_{\psi}D^{\alpha}_{\psi}(z_0-z)f(z)\,dz = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C(\psi,\varepsilon)}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}}, \tag{4.23} $$

where the path, at "distance" $\varepsilon$, passes about the cut coming out of the branching point $z_0$ along the curve $\psi$ to infinity, see Fig. 9c. To label the integration cuts ending either at $\infty_1$ or $\infty_2$ we can accept the following convention. Let $\infty_1 = \exp(i\theta_1)\infty$ and $\infty_2 = \exp(i\theta_2)\infty$; then

$$
\begin{aligned}
\text{for } \cos\theta_1 \neq \cos\theta_2 \text{ we define}\quad
&\cos\theta_1 < \cos\theta_2:\ \ \psi_{+} = (z_0,\infty_1),\ \psi_{-} = (z_0,\infty_2),\\
&\cos\theta_1 > \cos\theta_2:\ \ \psi_{+} = (z_0,\infty_2),\ \psi_{-} = (z_0,\infty_1),\\
\text{for } \cos\theta_1 = \cos\theta_2 \text{ we define}\quad
&\sin\theta_1 < \sin\theta_2:\ \ \psi_{+} = (z_0,\infty_1),\ \psi_{-} = (z_0,\infty_2),\\
&\sin\theta_1 > \sin\theta_2:\ \ \psi_{+} = (z_0,\infty_2),\ \psi_{-} = (z_0,\infty_1),
\end{aligned}
\tag{4.24}
$$

and correspondingly we index the related symbols $\mathbf{D}^{\alpha}_{\psi\pm}$, $D^{\alpha}_{\psi\pm}$, $f^{\alpha}_{\psi\pm}$, $C_{\pm}(\psi,\varepsilon)$.

Let us note that the definition of the operator $\mathbf{D}^{\alpha}_{\psi}$ by Eqs. (4.20), (4.21) ensures that both curves $C_{\pm}(\psi,\varepsilon)$ are oriented in such a way that after their closing by the circle $C(\infty)$, see Fig. 9c, there arises a closed curve having always clockwise orientation. We denote these curves as $K_{\pm}(\psi,\varepsilon)$. Now, after all these preparatory steps, we can come to the formulation and proof of the following theorem.

Fig. 10. Domain G (grey area) and the curves defined in the assumptions of the Theorem

Theorem. Assumptions:

i) $G$ is a domain in the complex plane containing the part (or parts) $C_1 \equiv G \cap C(\infty)$ of the circle $C(\infty)$. The curve $C_1$ forms one part, the curve $C_0$ is the second part, and the closed curve $C_G \equiv C_0 \cup C_1$ constitutes the complete domain boundary (Fig. 10).

ii) $\psi$ is a curve defined in (4.17) and $\psi \subset G$, $\psi \cap C_0 = \emptyset$.

iii) The function $f(z)$ is analytic in $G$ except for the poles $z_k \notin \psi$, $z_k \notin C_G$, $k = 1 \ldots N$, and for a given $\alpha \equiv \alpha_1 + i\alpha_2 \neq -1, -2, -3, \ldots$ and any $z_1 \in C_1$ meets the condition

$$ \lim_{z\to z_1}\frac{f(z)}{z^{\alpha_1}} = 0. \tag{4.25} $$

Statements:

1. The operation

$$ \mathbf{D}^{\alpha}_{\psi\pm}f = f^{\alpha}_{\psi\pm}(z_0) = \int_{\psi}D^{\alpha}_{\psi\pm}(z_0-z)f(z)\,dz, \qquad z_0 \in \psi, \tag{4.26} $$

is a fractional derivative satisfying the conditions (4.7), (4.8). In particular that means the operation does not depend on $\psi$ if $\alpha = n \geq 0$ is an integer.

2. $f^{\alpha}_{\psi\pm}$ is given also equivalently by

$$ f^{\alpha}_{\psi\pm}(z_0) = \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C_{\pm}(\psi,\varepsilon)}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}}. \tag{4.27} $$

3. For $N = 0$ the value $f^{\alpha}_{\psi\pm}(z_0)$ does not depend on $\psi$, i.e. up to the factor $(-1)^{\alpha}$ it is uniquely defined.

4. For $N \geq 1$ and $\alpha$ non-integer there remain the cases:

4.1 $\alpha_1$ is rational and $\alpha_2 = 0$; then $f^{\alpha}_{\psi\pm}(z_0)$ does depend on $\psi$, but the set of values is finite.

4.2 $\alpha_1$ is irrational or $\alpha_2 \neq 0$; then $f^{\alpha}_{\psi\pm}(z_0)$ does depend on $\psi$ as well and in general the set of values is infinite, but countable.

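Proof. We shall start from Statement 2. Obviously, its validity follows from Eqs. (4.22), (4.23). Now let us consider Statement 4. If we express the function $f(z)$ as the sum $f(z) = g(z) + h(z)$, where $g(z)$ is analytic in $G$ and $h(z)$ has the form (4.12), then obviously

$$
\begin{aligned}
f^{\alpha}_{\psi\pm}(z_0) &= \lim_{\varepsilon\to 0+}\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C_{\pm}(\psi,\varepsilon)}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}}\\
&= \lim_{\varepsilon\to 0+}\left[\frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{K_{\pm}(\psi,\varepsilon)}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}} - \frac{(-1)^{\alpha+1}\,\Gamma(\alpha+1)}{2i\pi}\int_{C_0}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}}\right]\\
&= (-1)^{\alpha}\left(\sum_{k=1}^{N}\frac{\Gamma(\alpha+n_k+1)}{\Gamma(n_k+1)}\cdot\frac{a_k\exp(-i[\varphi_k+2m_k\pi][\alpha+1])}{|z_0-z_k|^{\alpha+1}(z_0-z_k)^{n_k}} + I(z_0)\right),
\end{aligned}
\tag{4.28}
$$

where we have denoted

$$ I(z_0) = \frac{\Gamma(\alpha+1)}{2i\pi}\int_{C_0}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}}. \tag{4.29} $$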

The angles $\varphi_k + 2m_k\pi$ corresponding to the poles $z_k$ are evaluated according to Rule 2. In the function $I(z_0)$ the orientation of the path $C_0$ is in accordance with $K_{\pm}$, and the phase of $(z_0-z)^{\alpha+1}$ must also, according to Rule 2, be related to the same reference point $z_R$. Now, if $\alpha_1 = p/q$ ($p$, $q$ coprime) and $\alpha_2 = 0$, then each term in the last sum has the same value also for $m'_k = m_k + q$, i.e. depending on the form of the cut and the corresponding set $\{m_k, 1 \leq k \leq N\}$ the expression (4.28) has only a finite number of values for a given $z_0$. On the other hand, for $\alpha$ irrational different sets $\{m_k\}$ give different values of the sum, therefore in general the number of values is infinite. For $\alpha_2 \neq 0$ the multi-valued factors in the sum (4.28) are expanded as

$$ 1^{\alpha} = \exp(2i\pi m\alpha_1)\cdot\exp(-2\pi m\alpha_2), \tag{4.30} $$

i.e. $m\alpha_1$, $m\alpha_2$ determine the phase and scale of individual terms in the sum. Obviously for a complex $\alpha$ the set of values $f^{\alpha}(z_0)$ depending on $\psi$ is in general infinite, similarly as for $\alpha$ irrational. Therefore Statements 4.1, 4.2 are proved. For $N = 0$, Eq. (4.28) is simplified:

$$ f^{\alpha}_{\psi\pm}(z_0) = (-1)^{\alpha}I(z_0), \tag{4.31} $$

and Statement 3 is evident. Finally, let us consider Statement 1. Validity of the condition (4.7) follows from Statement 2. For $\alpha$ integer the path $C_{\pm}(\psi,\varepsilon)$ can be closed around the pole $z_0$ (having positive orientation) and we get the Cauchy integral

$$ \frac{n!}{2i\pi}\oint_{C(z_0)}\frac{f(z)\,dz}{(z-z_0)^{n+1}} = f^{n}(z_0). \tag{4.32} $$

The condition (4.8) requires the validity of

$$ \int_{\psi}D^{\alpha+\beta}_{\psi\pm}(x-\xi)f(\xi)\,d\xi = \int_{\psi}\int_{\psi}D^{\alpha}_{\psi\pm}(x-y)\,D^{\beta}_{\psi\pm}(y-\xi)f(\xi)\,d\xi\,dy. \tag{4.33} $$

This relation follows from (4.21) and (3.13), in which the substitutions $\xi \to \psi(\xi)$ and $y \to \psi(y)$ are applied. So the whole proof is completed.


5. Discussion

5.1. Some remarks regarding the theorem. Now, let us look into the statements of the Theorem more comprehensively. The function $I(z_0)$ in (4.28) can be expressed as

$$ I(z_0) = \frac{\Gamma(\alpha+1)}{2i\pi}\int_{C_0}\frac{f(z)\,dz}{(z_0-z)^{\alpha+1}} = \frac{\Gamma(\alpha+1)\exp(-i2m\pi[\alpha+1])}{2i\pi}\int_{C_0}\frac{f(z)\exp[-i\varphi(z)(\alpha+1)]\,dz}{|(z_0-z)^{\alpha+1}|}. \tag{5.1} $$

Since the cut $\psi$ does not intersect the curve $C_0$, the phases of all points on $C_0$ are "corrected" by the same factor standing ahead of the integral. Now it is obvious that for $\alpha = p/q$ the function (4.28) can have at most $q^{N+1}$ different values, including the multi-valued factor $(-1)^{\alpha}$. Actually, the function (4.28) can be written as

$$ f^{\alpha}(z) = (-1)^{\alpha}\left(\sum_{k=1}^{N}\frac{\Gamma(\alpha+n_k+1)}{\Gamma(n_k+1)}\cdot\frac{a_k}{(z-z_k)^{n_k+\alpha+1}} + \frac{\Gamma(\alpha+1)}{2i\pi}\int_{C_0}\frac{f(\xi)\,d\xi}{(z-\xi)^{\alpha+1}}\right) \tag{5.2} $$

and considered a multi-valued function with the value at the point $z$ depending on the choice of the cut $\psi$ in (4.26). At a given point $z$ any two cuts are equivalent if the region enclosed by them and the curve $C_1$ does not contain any pole $z_k$. Obviously, except for the points $z_k$ the function $f^{\alpha}(z)$ is (like the original function $f$) analytic in the domain $G$. Moreover, in the region of $\alpha$ in which $f^{\alpha} = \mathbf{D}^{\alpha}f$ exists, this function is apparently analytic also with respect to $\alpha$.

A special case takes place when $\alpha$ is a negative integer. Then due to the singularity $\Gamma(-n)$ the kernel (4.20) loses its sense. Nevertheless, assuming in (4.20) initially $\alpha \neq -n$ one can proceed to the representation (4.27); then taking the limit $\varepsilon \to 0$ (as in Eq. (2.35)) gives

$$ f^{\alpha}_{\psi\pm}(z_0) = -\lim_{\varepsilon\to 0+}\frac{\Gamma(\alpha+1)\sin([\alpha+1]\pi)}{\pi}\left[\int_{\psi\pm}\frac{f(z+\varepsilon\nu_0)\,dz}{(z_0-z-\varepsilon\nu_0)^{\alpha+1}} + \frac{f(z_0)}{\alpha(-\varepsilon\nu_0)^{\alpha}}\right], \tag{5.3} $$

where $\nu_0$ is given by Eq. (4.19) and represents the direction of the cut at $z_0$ (the orientation is assumed from $z_0$ to infinity). This formula already makes sense for $\alpha = -n$ (if condition (4.25) holds). Using relation (A.10) gives

$$ f^{-n}_{\psi\pm}(z_0) = -\frac{1}{(n-1)!}\int_{\psi\pm}(z_0-z)^{n-1}f(z)\,dz, \tag{5.4} $$

which is a modification of the known formula (1.1) for an n-fold primitive function.
The dependence of the set of values $f^{-n}(z_0)$ on $\psi$ is as follows. If $f(z)$ has no poles in the domain $G$, then any two integration paths $\psi_1$, $\psi_2$ in (5.4) can be connected by a fragment of $C_1$; in this way there arises a closed curve $C$ and it holds

$$ 0 = -\frac{1}{(n-1)!}\oint_{C}(z_0-z)^{n-1}f(z)\,dz = f^{-n}_{\psi_2}(z_0) - f^{-n}_{\psi_1}(z_0), \tag{5.5} $$

i.e. $f^{-n}(z_0)$ is determined uniquely. Now, let us suppose $f(z)$ has inside the curve $C$ just one pole: for $z \to z_p$,

$$ f(z) \to \frac{a_p}{(z-z_p)^{n_p+1}}. \tag{5.6} $$

Then obviously any curve $C$ in the integral (5.5) can always be substituted by a number of curves $K_0$; each of them is closed in the phase range $\langle 0, 2\pi\rangle$ and has the pole $z_p$ inside (e.g. circles centered at $z_p$). Instead of (5.5) one gets the difference

$$ \Delta_p \equiv f^{-n}_{\psi_2}(z_0) - f^{-n}_{\psi_1}(z_0) = \frac{m\,a_p}{(n-1)!}\oint_{K_0}\frac{(z_0-z)^{n-1}}{(z-z_p)^{n_p+1}}\,dz, \tag{5.7} $$

where $m$ is an integer depending on the shape of the curve $C$ ($m$ represents the number of "twists" on $C$). Using the Cauchy formula, the last equation gives

$$
\Delta_p =
\begin{cases}
0, & n_p \geq n,\\[4pt]
\dfrac{2i\pi\,m\,a_p\,(z_0-z_p)^{n-n_p-1}}{(n-n_p-1)!\;n_p!}, & n_p < n.
\end{cases}
\tag{5.8}
$$

For more poles all the corresponding terms (5.8) are simply added. For example, the integration constants in (4.16), representing the case $n = 1$, $n_p = 0$, fulfill (5.8). The composition relation for $\alpha = -n$, $\beta < 0$ and $\psi \equiv$ real axis is proved in the Appendix (Eq. (A.17)) and apparently can be transformed to any curvilinear path $\psi$.

Let us note that for $\alpha$ a negative integer only the representation of $\mathbf{D}^{\alpha}$ given by Eq. (5.3) (or (5.4)) makes sense, and conversely for $\alpha$ a non-negative integer only the representation given by Eq. (4.26) with the kernel (4.20) (or equivalently by Eq. (4.27)) is well defined. For any complex but non-integer $\alpha$ both representations are well defined. These two representations differ in the corresponding integration paths:

i) Eq. (4.27): integration on some curve enveloping the cut; the path can be closed.

ii) Eq. (5.3): integration on the cut itself; the path cannot be closed.

For $\alpha$ integer it is specific that the cuts disappear.

Perhaps the most restrictive assumption in the Theorem is the condition (4.25). The question whether one can in some consistent way apply $\mathbf{D}^{\alpha}$ to functions not obeying this condition on any part of $C(\infty)$ (and simultaneously having the ordinary derivatives and primitive functions) requires further study. Obviously one possible way is to consider such functions as generalized functions as well.

5.2. Concluding remarks. Have we said something new?
First let us show what is not new. Apparently:
a) The content of Eq. (2.45) is almost identical with the Liouville definition of right- and left-handed fractional derivatives in [8], p. 95. The only distinction is in the phase of α, since in the cited definition for the real functions the real value of the derivatives f− is postulated ad hoc.
b) Equation (4.27) is the Cauchy type integral which ordinarily serves as one of the possible starting points for the definition of the fractional derivative in the complex plane, see [8], p. 415. In our approach the integration path is uniquely defined by the chosen cut.

On the other hand, what is new seems to be the following:
1) The general form of the kernel (4.20) from which both above mentioned formulae follow.


2) The construction based on the integration paths enveloping curvilinear cuts, which in the result allows one to identify the fractional derivative-integral with the multivalued function and to determine how the number of values depends on the derivative order type and the number of poles which the given function has in the considered region.

6. Appendix: The Correspondence of Cuts in the Composition Relation Similarly as in (3.2) we define 1 h± (γ, w) = (w+iτ )γ 1 h± (γ, w) = − (w−iτ )γ

τ > 0,

(A.1)

where indices ± denote the cut orientations $(0, \pm\infty)$. Let us consider the integrals
\[ I = \lim_{\tau\to0+}\int_{-\infty}^{+\infty} h(a, x-y)\,h(b, y-z)\,dy \tag{A.2} \]
for various combinations of the cut orientations and their locations above or below the real axis. First let us assume
\[ a < 1, \qquad b < 1, \qquad a + b > 1. \tag{A.3} \]

For the more general integral (3.5) we have shown the integral vanishes, when the integration line does not separate off both singularities, which is the case of the eight integrals (A.2) involving combinations $h^{\pm}h_{\pm}$, $h^{\pm}h_{\mp}$, $h_{\pm}h^{\pm}$, $h_{\mp}h^{\pm}$. Let us calculate (A.2) when e.g. $x < z$ and both singularities are above the real axis. Then
\[ |I| = |\exp(ia\pi)\,I_1 + I_2 + \exp(-ib\pi)\,I_3| = 0, \tag{A.4} \]
where
\[ I_k = \int_{L_k}\frac{dy}{|(x-y)^a (y-z)^b|}, \qquad L_1 \equiv (-\infty, x),\quad L_2 \equiv (x, z),\quad L_3 \equiv (z, +\infty). \tag{A.5} \]
Using simple substitutions in the known relation
\[ \int_0^1 x^{\lambda-1}(1-x)^{\mu-1}\,dx = \frac{\Gamma(\lambda)\Gamma(\mu)}{\Gamma(\lambda+\mu)}, \tag{A.6} \]
and denoting $d^{a+b-1} \equiv (x-z)^{a+b-1}$ one can get
\[ I_1 = \frac{1}{d^{a+b-1}}\cdot\frac{\Gamma(1-a)\Gamma(a+b-1)}{\Gamma(b)}, \qquad I_2 = \frac{1}{d^{a+b-1}}\cdot\frac{\Gamma(1-a)\Gamma(1-b)}{\Gamma(2-a-b)}, \qquad I_3 = \frac{1}{d^{a+b-1}}\cdot\frac{\Gamma(1-b)\Gamma(a+b-1)}{\Gamma(a)}. \tag{A.7} \]
After inserting into (A.4) we get

\[ \cos(a\pi)\,\frac{\Gamma(1-a)\Gamma(a+b-1)}{\Gamma(b)} + \frac{\Gamma(1-a)\Gamma(1-b)}{\Gamma(2-a-b)} + \cos(b\pi)\,\frac{\Gamma(1-b)\Gamma(a+b-1)}{\Gamma(a)} = 0, \tag{A.8} \]
\[ \frac{\sin(a\pi)\,\Gamma(1-a)}{\Gamma(b)} = \frac{\sin(b\pi)\,\Gamma(1-b)}{\Gamma(a)}. \tag{A.9} \]
The last identity also follows from the known formula (see e.g. [1], p. 256)
\[ \Gamma(\gamma)\Gamma(1-\gamma)\sin(\gamma\pi) = \pi. \tag{A.10} \]
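The identities (A.8)-(A.10) can be checked numerically. The following is only an illustrative sketch, not part of the original derivation: the function names are hypothetical, the common factor $1/d^{a+b-1}$ of (A.7) is dropped because it cancels, and the sample values of $a$, $b$ are arbitrary choices satisfying (A.3).

```python
import math


def gamma_factors(a, b):
    """Gamma-function factors of I1, I2, I3 in (A.7), without the common 1/d^(a+b-1)."""
    g = math.gamma
    i1 = g(1 - a) * g(a + b - 1) / g(b)
    i2 = g(1 - a) * g(1 - b) / g(2 - a - b)
    i3 = g(1 - b) * g(a + b - 1) / g(a)
    return i1, i2, i3


def identity_residuals(a, b):
    """Residuals of (A.8), (A.9) and (A.10); all three should be close to zero."""
    i1, i2, i3 = gamma_factors(a, b)
    # (A.8): real part of exp(i a pi) I1 + I2 + exp(-i b pi) I3
    res_a8 = math.cos(a * math.pi) * i1 + i2 + math.cos(b * math.pi) * i3
    # (A.9): sin(a pi) Gamma(1-a) / Gamma(b) = sin(b pi) Gamma(1-b) / Gamma(a)
    res_a9 = (math.sin(a * math.pi) * math.gamma(1 - a) / math.gamma(b)
              - math.sin(b * math.pi) * math.gamma(1 - b) / math.gamma(a))
    # (A.10): reflection formula, evaluated at gamma = a
    res_a10 = math.gamma(a) * math.gamma(1 - a) * math.sin(a * math.pi) - math.pi
    return res_a8, res_a9, res_a10


if __name__ == "__main__":
    # Sample exponents obeying (A.3): a < 1, b < 1, a + b > 1.
    for a, b in [(0.75, 0.75), (0.6, 0.7), (0.9, 0.3)]:
        print(a, b, identity_residuals(a, b))
```

The residuals are at the level of floating-point rounding for the sampled exponents, which is consistent with (A.8) and (A.9) being the real and imaginary parts of (A.4).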

Now let us calculate (A.2) for the remaining combinations $h_{\pm}h_{\pm}$, $h_{\pm}h_{\mp}$, $h^{\pm}h^{\pm}$, $h^{\mp}h^{\pm}$. For $\tau \to 0$ it holds
\[ \lim_{\tau\to0+} h(\gamma, w) = \frac{\exp(i\pi c)}{|w^{\gamma}|}, \tag{A.11} \]
where the phases $c$ of functions $h$ entering the integral (A.2) are in accordance with the convention (2.6) summarized in Table 2. This table enables one to obtain the phases $c_k$ of their products which are listed in the first three columns of Table 3.

Table 2. The phases c depending on the cut location

            h(a, x − y)            h(b, y − z)
            y < x      y > x       y < z      y > z
  h_+       0          −a          −b         0
  h_−       0          −a          −b         0
  h^+       −2a        −a          −b         −2b
  h^−       0          a           b          0

The integrals of all the combinations summarized in the table can be expressed like (A.4),
\[ I = \sum_{k=1}^{3}\exp(ic_k\pi)\,I_k. \tag{A.12} \]

If we denote
\[ G \equiv \frac{2i\pi}{d^{a+b-1}}\cdot\frac{\Gamma(a+b-1)}{\Gamma(a)\Gamma(b)}, \tag{A.13} \]
then using identities (A.8), (A.10) the sum (A.12) can be evaluated. The results are given in the last column of Table 3. Let us compare the corresponding rows in the upper and lower part of the table. Obviously for any $x$, $z$ one can write
\[ \lim_{\tau\to0+}\int_{-\infty}^{+\infty} h^{\pm}(a, x-y)\,h^{\pm}(b, y-z)\,dy = \frac{2i\pi\,\Gamma(a+b-1)}{\Gamma(a)\Gamma(b)}\,\lim_{\tau\to0+} h^{\pm}(a+b-1, x-z) \tag{A.14} \]
and equally for $h_{\pm}$. So far we have assumed (A.3), however using the identity
\[ \frac{d}{dw}h(\gamma, w) = -\gamma\,h(\gamma+1, w), \tag{A.15} \]
we can enlarge the validity of (A.14) to any $a$, $b$,
\[ a + b > 1, \qquad a, b \ne 0, -1, -2, -3, \ldots. \tag{A.16} \]

Table 3. The phases c_k of products h(a, x − y)h(b, y − z) and the resulting integral I (last column) for the case x < z (upper part) and x > z (lower part)

               y < x       x < y < z    z < y       I

  x < z (upper part)
  h_+ h_+      −b          −a − b       −a          + exp(−iπ[a + b]) G
  h^+ h^+      −2a − b     −a − b       −a − 2b     − exp(−iπ[a + b]) G
  h_+ h_−      −b          −a − b       −a          + exp(−iπ[a + b]) G
  h^+ h^−      −2a + b     −a + b       −a          − exp(−iπ[a − b]) G
  h_− h_+      −b          −a − b       −a          + exp(−iπ[a + b]) G
  h^− h^+      −b          +a − b       +a − 2b     − exp(+iπ[a − b]) G
  h_− h_−      −b          −a − b       −a          + exp(−iπ[a + b]) G
  h^− h^−      +b          +a + b       +a          − exp(+iπ[a + b]) G

  x > z (lower part)
  h_+ h_+      −b          0            −a          − G
  h^+ h^+      −2a − b     −2a − 2b     −a − 2b     + exp(−2iπ[a + b]) G
  h_+ h_−      −b          0            −a          − G
  h^+ h^−      −2a + b     −2a          −a          + exp(−2iπa) G
  h_− h_+      −b          0            −a          − G
  h^− h^+      −b          −2b          +a − 2b     + exp(−2iπb) G
  h_− h_−      −b          0            −a          − G
  h^− h^−      +b          0            +a          + G

In this way we have proven the composition relation (3.12). (Note that $a = \alpha + 1$, $b = \beta + 1$.) Alternatively, the composition relation can be easily proved in the representation given by Eq. (2.28) for $\alpha, \beta < 0$. The relations
\[ D_{\pm}^{\alpha+\beta}(x-z) = \int_{-\infty}^{+\infty} D_{\pm}^{\alpha}(x-y)\,D_{\pm}^{\beta}(y-z)\,dy, \qquad \alpha, \beta < 0, \tag{A.17} \]
after inserting from (2.28) and some simple substitutions, immediately follow from (A.6).

Acknowledgement. I would like to express my gratitude to P. Kolář for critical reading of the manuscript and valuable comments.

References

1. Abramowitz, M., Stegun, I. (eds): Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Washington, D.C., New York: National Bureau of Standards, NBS, 1964
2. Barci, D.G., Bollini, C.G., Oxman, L.E., Rocca, M.C.: Non-Local Pseudo-Differential Operators. e-print hep-th/9606183
3. Gel'fand, I.M., Shilov, G.E.: Generalized Functions, Vol. 1: Properties and Operations. New York: Academic Press, 1964
4. McBride, A.: Fractional Calculus and Integral Transforms of Generalized Functions. San Francisco: Pitman, 1979
5. Nishimoto, K. (ed.): Fractional Calculus and Applications. Proceedings of the 3rd Conference on Fractional Calculus, Tokyo 1989, Nihon Univ., Japan, 1990
6. Oldham, K., Spanier, J.: The Fractional Calculus: Theory and Applications of Differentiation to Arbitrary Order. New York: Academic Press, 1974
7. Prudnikov, A.P., Brychkov, Yu.A., Marichev, O.I.: Integrals and Series, Vol. 1. London–New York: Gordon and Breach, 1986
8. Samko, S.G., Kilbas, A.A., Marichev, O.I.: Fractional Integrals and Derivatives: Theory and Application. London–New York: Gordon and Breach, 1993
9. Shogenov, V.Kh., Shchanukov-Lafishev, M.Kh., Beshtoev, Kh.M.: Communication of the JINR Dubna, 1997, P4-97-81 (in Russian)
10. Woon, S.C.: Analytic Continuation of Operators. Applications: from Number Theory and Group Theory to Quantum Field and String Theories. Preprint DAMTP-R-97/33, e-print hep-th/9707206
11. Zaganescu, M.: Some applications of fractional integration and differentiation in quantum mechanics and field theory. Preprint U.T.F.T. 5/82, Univ. Timisoara, 1982

Communicated by H. Araki

Commun. Math. Phys. 192, 287 – 307 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Driven Tracer Particle in One Dimensional Symmetric Simple Exclusion

C. Landim (1,2), S. Olla (3), S. B. Volchan (1)

1 IMPA, Estrada Dona Castorina 110, CEP 22460 Rio de Janeiro, Brasil. E-mail: [email protected], [email protected]
2 CNRS URA 1378, Université de Rouen, 76128 Mont Saint Aignan, France
3 Département de Mathématiques, Université de Cergy-Pontoise, 2 Av. Adolphe Chauvin, Pontoise 95302 Cergy-Pontoise Cedex, France and Centre de Mathématiques Appliquées, Ecole Polytechnique, 91128 Palaiseau cedex, France. E-mail: [email protected]

Received: 5 December 1996 / Accepted: 30 June 1997

Abstract: Consider an infinite system of particles evolving in a one dimensional lattice according to symmetric random walks with hard core interaction. We investigate the behavior of a tagged particle under the action of an external constant driving force. We prove that the diffusively rescaled position of the test particle $\varepsilon X(\varepsilon^{-2}t)$, $t > 0$, converges in probability, as $\varepsilon \to 0$, to a deterministic function $v(t)$. The function $v(\cdot)$ depends on the initial distribution of the random environment through a non-linear parabolic equation. This law of large numbers for the position of the tracer particle is deduced from the hydrodynamical limit of an inhomogeneous one dimensional symmetric zero range process with an asymmetry at the origin. An Einstein relation is satisfied asymptotically when the external force is small.

Introduction

The one dimensional, nearest neighbor symmetric simple exclusion process can be described as follows: particles evolve on the one dimensional lattice $\mathbb{Z}$ with an exclusion rule that prevents more than one particle occupying the same site. Each particle jumps after a mean one exponential time to the right or left with probability 1/2. If the chosen site is already occupied, the jump is suppressed to conform to the exclusion rule.

We add to this system an extra particle and refer to it as the tagged particle. This particle is subject to the same exclusion rule that forbids more than one particle at the same site and, in contrast with the other particles, experiences the action of a constant external driving force. As a result, the tagged particle jumps with probability $1/2 < p \le 1$ to the right and $q = 1 - p$ to the left. Without the presence of the environment, the tagged particle would perform a simple asymmetric random walk. In particular, if $X_t$ stands for its position at time $t$, $t^{-1}(X_t - X_0)$ would converge almost surely to $p - q$ as $t \uparrow \infty$.

The presence of the symmetric environment affects dramatically the evolution of the tagged particle. Since the untagged particles behave as symmetric random walks, we expect an accumulation of particles at the right of the tagged particle and a rarefaction at the left. Thus the environment slows down the motion of the tagged particle and tends to confine it. In fact we prove in this article that if the initial configuration of the system is a Bernoulli measure with slowly varying density, then
\[ \lim_{t\to\infty}\frac{X_t - X_0}{\sqrt{t}} = v \tag{0.1} \]
in probability, where $v$ is a real number depending on the given density. A heuristic derivation of (0.1) can be found in Burlatsky et al. ([BMMO, BMOR]). The diffusive scale $\sqrt{t}$ is peculiar to the nearest neighbor assumption, which prevents the tagged particle from jumping over the symmetric particles. In higher dimension or without the nearest neighbor assumption, one would expect the tagged particle to move in the scale $t$.

In the not driven case ($p = q$) Arratia [Ar] showed that $t^{-1/4}(X_t - X_0)$ converges in distribution, as $t \uparrow \infty$, to a Gaussian variable with variance
\[ \frac{1-\alpha}{\alpha}\sqrt{\frac{2}{\pi}}. \tag{0.2} \]
A corresponding invariance principle, i.e. the convergence of the properly rescaled process $\varepsilon(X_{\varepsilon^{-4}t} - X_0)$ to a fractional Brownian motion of parameter 1/2, is proven in [RV]. This behavior should be characteristic of every one dimensional nearest neighbor model [Spo]. The first results of this type were established by Harris ([H]) in the case of Brownian particles with hard core interaction in dimension 1.

In Sect. 6 we prove that if we start with a constant profile of density $\alpha$ then
\[ \lim_{p-q\to0}\frac{v}{p-q} = \frac{1-\alpha}{\alpha}\sqrt{\frac{2}{\pi}}, \]
which is the Einstein relation between the mobility $v/(p-q)$ given by (0.1) and the diffusivity given by (0.2). This is in agreement with the heuristic results of [BMOR]. Einstein relations can be established for a large class of weakly asymmetric models (i.e. the asymmetry is rescaled with the parameter relating the microscopic and the macroscopic scales (cf. [LR])). If the asymmetry is strong (i.e. not rescaled in the macroscopic limit) rigorous results on the Einstein relations are rare, essentially because of the difficulty of computing the stationary state of the environment as seen from the particle.

Here is the idea of our approach. First we have to understand that this is a nonstationary problem: the tracer will start to push the particles in front and generate an inhomogeneous density profile that will evolve deterministically under a diffusive rescaling of space and time. The proper way to formulate the problem is thus to prove that $\varepsilon(X_{\varepsilon^{-2}t} - X_0) \to v(t)$, where $v(t)$ is a deterministic function of the (macroscopic) time $t$. This suggests that the problem is basically a hydrodynamic limit (cf. [KL]) with a moving boundary. There is a natural map that transforms a one dimensional nearest neighbor exclusion process into a zero range process. This map transforms the moving boundary problem into a fixed boundary problem. Thus we need to prove the hydrodynamic limit for a zero range process with boundary conditions.

Herbert Spohn pointed out to us a connection between this problem and the evolution of the random interfaces in a 3-phase Potts model at zero temperature under a Glauber dynamics. For some particular initial conditions the triple point of intersection of the three phases evolves macroscopically exactly like (0.1).
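The dynamics described in this Introduction can be explored with a small Monte Carlo simulation. The sketch below is illustrative only and is not the construction used in the paper: it assumes a finite ring of `num_sites` sites instead of the infinite lattice, a Bernoulli environment of constant density, and hypothetical function and parameter names. Each particle attempts a jump at rate 1; the tagged particle chooses the right neighbor with probability `p`, the others with probability 1/2, and jumps onto occupied sites are suppressed.

```python
import random


def simulate_driven_tracer(num_sites=200, density=0.5, p=0.75, t_max=50.0, seed=0):
    """Simulate symmetric exclusion with one driven tagged particle on a ring.

    Returns the unwrapped displacement of the tagged particle at time t_max
    and the final list of particle positions (index 0 is the tagged particle).
    The ring geometry is a finite-size assumption made only for this sketch.
    """
    rng = random.Random(seed)
    occupied = [False] * num_sites
    occupied[0] = True
    particles = [0]  # tagged particle starts at site 0
    for site in range(1, num_sites):
        if rng.random() < density:
            occupied[site] = True
            particles.append(site)

    n = len(particles)
    displacement = 0
    t = 0.0
    while True:
        # Every particle carries a rate-one exponential clock, so the total rate is n.
        t += rng.expovariate(n)
        if t > t_max:
            break
        k = rng.randrange(n)
        prob_right = p if k == 0 else 0.5
        step = 1 if rng.random() < prob_right else -1
        target = (particles[k] + step) % num_sites
        if not occupied[target]:  # exclusion rule: suppress jumps onto occupied sites
            occupied[particles[k]] = False
            occupied[target] = True
            particles[k] = target
            if k == 0:
                displacement += step
    return displacement, particles


if __name__ == "__main__":
    for t_max in (25.0, 100.0, 400.0):
        disp, _ = simulate_driven_tracer(num_sites=2000, t_max=t_max, seed=1)
        print(t_max, disp)
```

Averaging the displacement over many seeds and comparing different `t_max` values gives a rough empirical view of the $\sqrt{t}$ scaling in (0.1), as long as the ring is large compared with the distance explored.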

1. Statements of the Results

Consider a family of indistinguishable particles moving according to continuous time, symmetric, nearest neighbor random walks on $\mathbb{Z}$ with an exclusion rule. We add a tagged particle that moves according to an asymmetric random walk, jumping with probability $p$ to the right, probability $q$ to the left, and that respects the exclusion rule. The configuration of the system is denoted by $(X, \xi)$, where $X \in \mathbb{Z}$ is the position of the tagged asymmetric particle, and $\xi \in \{0,1\}^{\mathbb{Z}}$ is the configuration of all other particles. Clearly $\xi(X) = 0$, because that site is already occupied by the asymmetric particle. The system just described is a Markov process whose generator acts on local functions $F: \mathbb{Z}\times\{0,1\}^{\mathbb{Z}} \to \mathbb{R}$ as
\[ \begin{aligned} LF(X,\xi) = {} & (1/2)\sum_{z\ne X-1,\,X}\big[F(X,\xi^{z,z+1}) - F(X,\xi)\big] \\ & + p\,(1-\xi(X+1))\big[F(X+1,\xi) - F(X,\xi)\big] \\ & + q\,(1-\xi(X-1))\big[F(X-1,\xi) - F(X,\xi)\big], \end{aligned} \tag{1.1} \]

where $\xi^{z,z+1}$ is the configuration obtained from $\xi$, exchanging the occupation variables $\xi(z)$, $\xi(z+1)$. To fix ideas set $p > 1/2$.

Denote by $\mathbb{Z}_*$ the set of integers distinct from 0. For $0 \le \alpha \le 1$, denote by $\mu_\alpha$ the Bernoulli product measure on $\{0,1\}^{\mathbb{Z}_*}$ with density $\alpha$: $\mu_\alpha\{\xi : \xi(x) = 1\} = \alpha$, for every $x$ in $\mathbb{Z}_*$. More generally, for a positive integer $N$ and a profile $\kappa_0: \mathbb{R} \to [0,1]$, denote by $\mu^N_{\kappa_0(\cdot)}$ the Bernoulli product measure associated to $\kappa_0$: $\mu^N_{\kappa_0(\cdot)}\{\xi : \xi(x) = 1\} = \kappa_0(x/N)$ for $x$ in $\mathbb{Z}_*$, and by $P_{\mu^N_{\kappa_0(\cdot)}}$ the probability measure on the path space $D(\mathbb{R}_+, \mathbb{Z}\times\{0,1\}^{\mathbb{Z}})$ induced by the Markov process with generator $L$ defined in (1.1) and the initial measure $\delta_0 \times \mu^N_{\kappa_0(\cdot)}$.

Before stating the theorem, we introduce some notation required to define the limit $v_t$. Fix a strictly positive profile $\kappa_0$. Denote by $H: \mathbb{R}\to\mathbb{R}$, $F: \mathbb{R}\to\mathbb{R}$ the functions defined by
\[ H(A) = \int_0^A \kappa_0(u)\,du, \qquad F(B) = \frac{1}{\kappa_0(H^{-1}(B))} - 1. \tag{1.2} \]
Here $H^{-1}$ stands for the inverse of the strictly increasing, absolutely continuous function $H$. Consider the non-linear parabolic equation with boundary condition on $\mathbb{R}_+\times\mathbb{R}_+$,
\[ \begin{cases} \partial_t\rho = (1/2)\Delta\Phi(\rho), \\ \rho(t,0) = 0, \\ \rho(0,\cdot) = F_+(\cdot), \end{cases} \tag{1.3} \]
where $F_+$ stands for the restriction of $F$ on $\mathbb{R}_+$ and $\Phi(\rho) = \rho/(1+\rho)$; and the nonlinear parabolic equation on $\mathbb{R}_+\times\mathbb{R}$ with boundary condition at the origin
\[ \begin{cases} \partial_t\rho = (1/2)\Delta\Phi(\rho), \\ p\,\Phi(\rho(t,0+)) = q\,\Phi(\rho(t,0-)), \\ \partial_u\Phi(\rho(t,0+)) = \partial_u\Phi(\rho(t,0-)), \\ \rho(0,\cdot) = F(\cdot). \end{cases} \tag{1.4} \]

A precise definition of solutions of these differential equations is given in Sects. 3 and 4. In Sect. 6 we show that this equation can be transformed into a linear Stefan problem by a Lagrangian coordinate transformation. As a consequence the original exclusion process with an asymmetric particle has a hydrodynamic behavior described by the solution of a Stefan problem.

Theorem 1.1. Assume $p = 1$. Fix a profile $\kappa_0: \mathbb{R}_+ \to [0,1]$ such that $\sigma \le \kappa_0 \le 1 - \sigma$ for some $\sigma > 0$. Then, for every $\delta > 0$,
\[ \lim_{N\to\infty} P_{\mu^N_{\kappa_0(\cdot)}}\Big[\,\Big|\frac{X_{tN^2}}{N} - v_t\Big| > \delta\,\Big] = 0, \tag{1.5} \]
where
\[ v_t = \int_0^\infty \big\{F(u) - \rho(t,u)\big\}\,du \tag{1.6} \]

and $\rho$ is the solution of equation (1.3).

Theorem 1.2. Assume $p < 1$. For $\alpha < 1$ define $\psi_\alpha(u) = \alpha\,1\{u < 0\} + (q\alpha/p)\,1\{u > 0\}$. Fix a profile $\kappa_0: \mathbb{R}\to[0,1]$ such that $\psi_\alpha \le \kappa_0 \le 1 - \sigma$ for some $\sigma > 0$, $0 < \alpha < 1$. Then, for every $\delta > 0$, (1.5) holds provided $v_t$ is given by (1.6) and $\rho$ is the solution of Eq. (1.4).

The integral defining $v_t$ in (1.6) must be understood in the following sense: consider the sequence $\{H_n, n \ge 1\}$ of real functions defined by
\[ H_n(u) = (1 - u\,n^{-1})^+. \tag{1.7} \]
It follows from the equation satisfied by $\rho$ that $\int_0^{+\infty} H_n(u)\{F(u) - \rho(t,u)\}\,du$ converges as $n \uparrow \infty$. This limit defines the right-hand side of (1.6).

In the case where the initial state is a Bernoulli product measure $\mu_\alpha$ with a fixed density $\alpha$, we can make more explicit computations:

Theorem 1.3. If the initial state is $\mu_\alpha$, then
\[ \lim_{p-q\to0}\frac{v_t}{p-q} = \frac{1-\alpha}{\alpha}\sqrt{\frac{2t}{\pi}}. \]
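The $\sqrt{t}$ growth of $v_t$ can be illustrated numerically in the case $p = 1$ with a constant profile $\kappa_0 = \alpha$, for which (1.2) gives the constant $F = (1-\alpha)/\alpha$. The following is a rough sketch under assumptions that are not in the paper: an explicit finite-difference scheme for (1.3), a truncated domain $[0, u_{\max}]$, and hypothetical function and parameter names. Since the initial data are constant, the solution depends on $u/\sqrt{t}$ only, so $v_{4t}/v_t$ should be close to 2.

```python
def phi(rho):
    """Phi(rho) = rho / (1 + rho), as in (1.3)."""
    return rho / (1.0 + rho)


def tracer_displacement(alpha=0.5, t_checkpoints=(1.0, 4.0), u_max=8.0, dx=0.05):
    """Approximate v_t of (1.6) for p = 1 and constant profile kappa_0 = alpha.

    Explicit scheme for d_t rho = (1/2) d_uu Phi(rho) on [0, u_max] with
    rho(t, 0) = 0 and rho(t, u_max) held at its initial value (an assumption
    that is harmless while the disturbance has not reached u_max).
    """
    f0 = (1.0 - alpha) / alpha            # F for a constant profile
    n = int(round(u_max / dx))
    dt = 0.4 * dx * dx                    # within the stability bound, since Phi' <= 1
    rho = [f0] * (n + 1)
    rho[0] = 0.0                          # boundary condition at the origin
    results = []
    t = 0.0
    for t_stop in sorted(t_checkpoints):
        steps = int(round((t_stop - t) / dt))
        for _ in range(steps):
            ph = [phi(r) for r in rho]    # snapshot of Phi(rho) before the update
            for i in range(1, n):
                rho[i] += 0.5 * dt * (ph[i + 1] - 2.0 * ph[i] + ph[i - 1]) / (dx * dx)
        t = t_stop
        # Riemann sum for the integral of F - rho over (0, u_max]
        results.append(dx * sum(f0 - r for r in rho[1:]))
    return results


if __name__ == "__main__":
    v1, v4 = tracer_displacement()
    print("v_1 =", v1, " v_4 =", v4, " ratio =", v4 / v1)
```

The ratio is only approximately 2 because of the grid resolution near the origin; refining `dx` should bring it closer.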

Theorems 1.1 and 1.2 are proven in Sect. 5. Theorem 1.3 and more asymptotic results are proven in Sect. 6.

We now explain why in Theorems 1.1 and 1.2 the asymmetric tagged particle moves at scale $\sqrt{t}$ and why the displacement is related to the solution of the differential equations (1.3), (1.4). We start by labeling all particles. The tagged asymmetric particle is labeled 0. For $j \ge 1$, we label the $j$-th particle at the right (left) of the tagged particle by $j$ ($-j$). For $x$ in $\mathbb{Z}$, denote by $\eta(x)$ the number of holes between particle $x$ and particle $x+1$. In this way we transform a configuration of $\{0,1\}^{\mathbb{Z}}$ with a particle at some site $X$ into a configuration $\{\eta(x), x \in \mathbb{Z}\} \in \mathbb{N}^{\mathbb{Z}}$. Denote by $T: \mathbb{Z}\times\{0,1\}^{\mathbb{Z}} \to \mathbb{N}^{\mathbb{Z}}$ the transformation just described. $T$ induces a transformation on the space of functions (resp. probability measures) of $\mathbb{Z}\times\{0,1\}^{\mathbb{Z}}$ to the space of functions (resp. probability measures) of $\mathbb{N}^{\mathbb{Z}}$ still denoted by $T$.

The dynamics of the process $(X_t, \xi_t)$ induces a dynamics for $\eta_t$ that can be informally described as follows. For every $x \ne -1$, if there is at least one particle at site $x$, at rate 1/2 one of them jumps to site $x+1$ and, symmetrically, if there is at least one particle at site $x+1$, at rate 1/2 one of them jumps to site $x$. The picture is slightly different between sites $-1$ and 0 due to the behavior of the asymmetric tagged particle. A particle jumps at rate $q$ from site $-1$ to site 0 if there is a particle at $-1$ and a particle jumps at rate $p$ from site 0 to site $-1$ if there is a particle at the origin. This process is the so-called zero range process, with an asymmetry at the origin. The position at time $t$ of the asymmetric tagged particle corresponds in the zero range model to the total number of jumps between 0 and $t$ from 0 to $-1$ minus the total number of jumps in the same interval from $-1$ to 0:
\[ X_t = \sum_{x\ge0}\{\eta_0(x) - \eta_t(x)\}. \tag{1.8} \]
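The transformation $T$ and the identity (1.8) can be made concrete on a finite window of the lattice. The sketch below is illustrative only: the function names are hypothetical, and it assumes a finite sorted list of particle positions in which the rightmost listed particle does not move, so that the sum in (1.8) can be truncated.

```python
def exclusion_to_zero_range(positions, tagged_index):
    """Map sorted particle positions to hole counts between consecutive particles.

    positions    -- increasing list of occupied sites (tagged particle included)
    tagged_index -- index of the tagged particle in `positions`; it gets label 0
    Returns a dict {label x: eta(x)}, where eta(x) is the number of holes
    between particle x and particle x + 1.
    """
    eta = {}
    for i in range(len(positions) - 1):
        label = i - tagged_index
        eta[label] = positions[i + 1] - positions[i] - 1
    return eta


def tagged_displacement(eta_before, eta_after):
    """Finite-window version of (1.8): sum over x >= 0 of eta_0(x) - eta_t(x)."""
    return sum(eta_before[x] - eta_after[x] for x in eta_before if x >= 0)


if __name__ == "__main__":
    # Tagged particle at site 0, blocked on its right by the particle at site 1.
    before = exclusion_to_zero_range([-5, -2, 0, 1, 4], tagged_index=2)
    # The tagged particle jumps one step to the left, to site -1.
    after = exclusion_to_zero_range([-5, -2, -1, 1, 4], tagged_index=2)
    print(before, after, tagged_displacement(before, after))
```

In this example the left jump of the tagged particle lowers $\eta(-1)$ by one and raises $\eta(0)$ by one, i.e. it is a zero range jump from site $-1$ to site 0, which is the move that occurs at rate $q$ in the description above.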

The right-hand side of (1.8) is to be understood in the same sense as the right-hand side of (1.6) by the use of the functions (1.7) (with the limit in the $L^2$ sense) (cf. [RV]). Since in the zero range process the jumps of particles over all bonds, except the bond $\{-1, 0\}$, are symmetric, we expect the process to have a diffusive hydrodynamic behavior, i.e., that for a large class of initial profiles, the process accelerated by $N^2$ is such that for all continuous functions with compact support $G$,
\[ N^{-1}\sum_x G(x/N)\,\eta_{tN^2}(x) \tag{1.9} \]
converges in probability to $\int_{\mathbb{R}} G(u)\rho(t,u)\,du$, where $\rho$ is the solution of a nonlinear heat equation. In particular, approximating $1\{u > 0\}$ by the sequence defined in (1.7), it follows from (1.8) and (1.9) that
\[ \frac{X_{tN^2}}{N} = N^{-1}\sum_{x\ge0}\{\eta_0(x) - \eta_{tN^2}(x)\} \]
converges in probability to $v_t$ given by (1.6).

2. The Case p = 1

In the case where the asymmetric tagged particle jumps only to the right, the evolution of the medium on its left is irrelevant for its motion. For the corresponding zero range dynamics, $p = 1$ means that at rate 1 a particle at the origin jumps to $-1$ and no particle jumps from $-1$ to 0. We may therefore assume that there is at $-1$ an infinite reservoir or an absorption point to which particles from the origin jump at rate 1 and from which no particle jumps. Moreover, the position of the tagged particle at time $t$ corresponds in the zero range process to the number of particles that left the system before time $t$.

Consider the zero-range process on $\mathbb{N}$ whose generator acts on cylinder functions as
\[ L = L_b + \sum_{x\ge0}\big\{L_{x,x+1} + L_{x+1,x}\big\}, \tag{2.1} \]

where
\[ L_{x,y}f(\eta) = (1/2)\,g(\eta(x))\,[f(\sigma^{x,y}\eta) - f(\eta)] \]
and
\[ L_b f(\eta) = g(\eta(0))\,[f(\eta - d_0) - f(\eta)]. \]
Here, for a site $x$, $d_x$ stands for the configuration with no particles but one at $x$, summation is performed site by site, $\sigma^{x,y}\eta$ is the configuration $\eta$ with one particle less at $x$ and one more at $y$: $\sigma^{x,y}\eta = \eta - d_x + d_y$, and $g(k) = 1\{k \ge 1\}$. For $\alpha > 0$, denote by $\nu_\alpha^+$ the product measure on $\mathbb{N}^{\mathbb{N}}$ with marginals given by
\[ \nu_\alpha^+\{\eta,\ \eta(x) = k\} = \frac{1}{1+\alpha}\Big(\frac{\alpha}{1+\alpha}\Big)^k. \tag{2.2} \]
It is easy to check that these measures are reversible with respect to the generators $L_{x,x+1} + L_{x+1,x}$ defined above and that $E_{\nu_\alpha^+}[\eta(x)] = \alpha$.

We need to introduce some terminology on weak solutions of non linear parabolic equations. Fix a bounded initial profile $\rho_0: \mathbb{R}_+ \to \mathbb{R}$. A bounded function $\rho: [0,T]\times\mathbb{R}_+ \to \mathbb{R}$ is said to be a weak solution of the partial differential equation (1.3) with initial condition $\rho_0$ in the layer $[0,T]\times\mathbb{R}_+$ if
(a) $\Phi(\rho(t,u))$ is absolutely continuous in the space variable and
\[ \int_0^T ds\int_{\mathbb{R}_+} du\ e^{-u}\,\{\partial_u\Phi(\rho(s,u))\}^2 < \infty, \]
(b) $\rho(t,0) = 0$ for almost every $0 \le t \le T$, and
(c) for every smooth function with compact support $G: \mathbb{R}_+ \to \mathbb{R}$ vanishing at the origin and for every $0 \le t \le T$,
\[ \int_{\mathbb{R}_+} du\ \rho(t,u)G(u) - \int_{\mathbb{R}_+} du\ \rho_0(u)G(u) = -(1/2)\int_0^t ds\int_{\mathbb{R}_+} du\ G'(u)\,\partial_u\Phi(\rho(s,u)). \]

Uniqueness of weak solutions of (1.3) can be proved with methods similar to the ones presented in [ELS]; we outline the argument in the appendix. The existence for special initial conditions $\rho_0$ follows from the tightness of the sequence $Q^N_{\mu^N}$ defined below in Theorem 2.2.

We now describe the initial states considered in this section. Fix a sequence of probability measures $\{\mu^N, N \ge 1\}$ on $\mathbb{N}^{\mathbb{N}}$. We assume that
(H1) The sequence $\mu^N$ is bounded above (resp. below) by $\nu_\alpha^+$ (resp. $\nu_\lambda^+$) for some $0 < \lambda < \alpha$.
(H2) There exists a bounded function $\rho_0: \mathbb{R}_+ \to \mathbb{R}_+$ such that for each continuous function $G: \mathbb{R}_+ \to \mathbb{R}$ with compact support and each $\delta > 0$,
\[ \lim_{N\to\infty}\mu^N\Big[\,\Big|N^{-1}\sum_x G(x/N)\eta(x) - \int du\ G(u)\rho_0(u)\Big| \ge \delta\,\Big] = 0. \]

The first assumption is needed in order to prove the two block estimates for zero range processes with bounded jump rate (cf. [KL]). The second one just imposes a hydrodynamic behavior at time 0. For each probability measure $\mu$ on $\mathbb{N}^{\mathbb{N}}$, denote by $P^N_\mu$ the probability measure on the path space $D(\mathbb{R}_+, \mathbb{N}^{\mathbb{N}})$ induced by the Markov process with generator (2.1) accelerated by $N^2$ and the initial measure $\mu$. Expectation with respect to $P^N_\mu$ is denoted by $E^N_\mu$.
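The two properties of the geometric marginals (2.2) that enter (H1) and the hydrodynamic equation, namely that the mean occupation is $\alpha$ and that the mean jump rate $E[g(\eta(x))]$ equals $\Phi(\alpha) = \alpha/(1+\alpha)$, can be checked directly. The sketch below is only an illustration: the function names and the truncation level `k_max` are assumptions introduced here, not part of the paper.

```python
def geometric_marginal(alpha, k):
    """One-site marginal of nu_alpha^+ from (2.2): P(eta(x) = k)."""
    return (1.0 / (1.0 + alpha)) * (alpha / (1.0 + alpha)) ** k


def phi(rho):
    """Phi(rho) = rho / (1 + rho), the flux function of (1.3)."""
    return rho / (1.0 + rho)


def truncated_moments(alpha, k_max=2000):
    """Total mass, mean occupation, and mean of g(k) = 1{k >= 1}, summed up to k_max."""
    total = 0.0
    mean = 0.0
    mean_rate = 0.0
    for k in range(k_max + 1):
        weight = geometric_marginal(alpha, k)
        total += weight
        mean += k * weight
        if k >= 1:
            mean_rate += weight
    return total, mean, mean_rate


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, truncated_moments(alpha), phi(alpha))
```

The agreement of the mean jump rate with $\Phi(\alpha)$ is the reason the function $\Phi(\rho) = \rho/(1+\rho)$ appears in the limiting equation for a jump rate $g(k) = 1\{k \ge 1\}$.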

Theorem 2.1. Fix a sequence of initial measures satisfying assumptions (H1) and (H2). For any continuous function $G: \mathbb{R}_+ \to \mathbb{R}$ with compact support and any $\delta > 0$,
\[ \lim_{N\to\infty} P^N_{\mu^N}\Big[\,\Big|N^{-1}\sum_x G(x/N)\eta_t(x) - \int du\ G(u)\rho(t,u)\Big| \ge \delta\,\Big] = 0, \]
where $\rho$ is the unique solution of (1.3).

For each positive integer $N$ and each configuration $\eta$, define the empirical distribution $\pi^N = \pi^N(\eta)$ as the positive Radon measure on $\mathbb{R}_+$ obtained by assigning a mass $N^{-1}$ to each particle: $\pi^N = N^{-1}\sum_{z\ge0}\eta(z)\delta_{z/N}$, and set $\pi^N_t = \pi^N(\eta_t)$. Fix $T > 0$. Theorem 2.1 follows from the convergence in distribution of the process $\{\pi^N_t, 0 \le t \le T\}$, stated below in Theorem 2.2, and some standard topology arguments (cf. Chap. IV of [KL]).

To state the convergence in distribution of the empirical measure we need some notation. Denote by $M_+ = M_+(\mathbb{R}_+)$ the space of positive Radon measures on $\mathbb{R}_+$ endowed with the vague topology, a metrizable topology. For each probability measure $\mu$ on $\mathbb{N}^{\mathbb{N}}$, denote by $Q^N_\mu$ the probability measure on the path space $D([0,T], M_+)$ induced by $P^N_\mu$ and the empirical measure $\pi^N$.

Theorem 2.2. The sequence $Q^N_{\mu^N}$ converges to the probability measure concentrated on the absolutely continuous path $\pi(t,du) = \rho(t,u)\,du$ whose density is the solution of (1.3).

Guo, Papanicolaou and Varadhan introduced in [GPV] a method, well known by now, to prove Theorem 2.2 provided one has a bound on the entropy and on the Dirichlet form of the system with respect to some invariant measure. These bounds are usually obtained by computing the time derivative of the entropy of the distribution of particles at time $t$ relative to the equilibrium distribution. In the present context, however, there is only one invariant measure: the trivial one $\delta_0$ concentrated on the configuration 0 with no particles. Since all other probability measures on $\mathbb{N}^{\mathbb{N}}$ are orthogonal with respect to this one, the entropy of any reasonable measure with respect to $\delta_0$ is infinite and the entropy method does not apply straightforwardly. To overcome this problem, we compute the relative entropy with respect to an inhomogeneous product measure that is not invariant but close to the invariant measure.

To obtain an estimate on the entropy and on the Dirichlet form, we first assume that there exists a parameter $\beta > 0$ for which the relative entropy $H(\mu^N | \nu_\beta^+)$ is bounded by $C_0 N$ for some finite constant $C_0$. Coupling arguments allow us to remove this assumption. This is explained at the end of this section.

To deduce an estimate on the entropy of the system, we need to introduce a class of inhomogeneous product measures. For $x \ge 0$, define $\gamma_x$ by $\gamma_x = \beta(1+x)/N$ for $0 \le x \le N-1$ and $\gamma_x = \beta$ for $x \ge N$. Denote by $\nu^N_{\gamma(\cdot)}$ the product measure on $\mathbb{N}^{\mathbb{N}}$ with marginals given by
\[ \nu^N_{\gamma(\cdot)}\{\eta,\ \eta(x) = k\} = (1 - \gamma_x)\gamma_x^k \tag{2.3} \]
for all $x \ge 0$ and $k \ge 0$. A simple computation relying on the entropy inequality shows that the entropy of $\mu^N$ with respect to $\nu^N_{\gamma(\cdot)}$ is bounded by $C_1 N$ for some finite constant $C_1$ depending only on $C_0$, $\alpha$ and $\beta$: $H(\mu^N | \nu^N_{\gamma(\cdot)}) \le C_1 N$ (cf. Remark V.1.2 in [KL]).

For each probability density $f$ with respect to $\nu^N_{\gamma(\cdot)}$, define the Dirichlet form $D_\gamma(f)$ by

\[ D_\gamma(f) = D_{\gamma,b}(f) + D_{\gamma,i}(f) = D_{\gamma,b}(f) + \sum_{x\ge0} D_{x,x+1}(f), \]
where
\[ \begin{aligned} D_{\gamma,b}(f) &= (1/2)\int g(\eta(0))\Big[\sqrt{f(\eta - d_0)} - \sqrt{f(\eta)}\Big]^2 d\nu^N_{\gamma(\cdot)}, \\ D_{x,x+1}(f) &= (1/2)\int g(\eta(x))\Big[\sqrt{f(\eta + d_{x+1} - d_x)} - \sqrt{f(\eta)}\Big]^2 d\nu^N_{\gamma(\cdot)}. \end{aligned} \tag{2.4} \]

Proposition 2.3. Let $S^N_t$ be the semigroup associated to the generator $L$ introduced in (2.1) accelerated by $N^2$. Denote by $f_t = f^N_t$ the Radon–Nikodym derivative of $\mu^N S^N_t$ with respect to $\nu^N_{\gamma(\cdot)}$. There exists a finite constant $C = C(\beta)$ such that
\[ \partial_t H(\mu^N S^N_t\,|\,\nu^N_{\gamma(\cdot)}) \le -N^2 D_\gamma(f_t) + CN. \]

Proof. Denote by $L^*_\gamma$ the adjoint operator of $L$ with respect to $\nu^N_{\gamma(\cdot)}$. It is easy to check that $f_t$ is the solution of the forward equation
\[ \begin{cases} \partial_t f_t = N^2 L^*_\gamma f_t, \\ f_0 = d\mu^N / d\nu^N_{\gamma(\cdot)}. \end{cases} \tag{2.5} \]

Then by explicit calculation
\[ \begin{aligned} \partial_t H(\mu^N S^N_t\,|\,\nu^N_{\gamma(\cdot)}) &= N^2\int L^*_\gamma f_t\,\log f_t\ d\nu^N_{\gamma(\cdot)} + N^2\int L^*_\gamma f_t\ d\nu^N_{\gamma(\cdot)} \\ &= \int f_t\,N^2 L\log f_t\ d\nu^N_{\gamma(\cdot)} = N^2\int f_t\Big(L\log f_t - \frac{Lf_t}{f_t}\Big)d\nu^N_{\gamma(\cdot)} + N^2\int Lf_t\ d\nu^N_{\gamma(\cdot)}. \end{aligned} \tag{2.6} \]
Notice that the last term would vanish if $\nu^N_{\gamma(\cdot)}$ were an invariant measure.

Since for every $a, b > 0$, $a\log(b/a) - (b-a)$ is less than or equal to $-(\sqrt{b} - \sqrt{a})^2$, for every $x, y \ge 0$, we have that
\[ \begin{aligned} f_t L_{x,y}\log f_t - L_{x,y}f_t &\le -(1/2)\,g(\eta(x))\Big[\sqrt{f_t(\eta + d_y - d_x)} - \sqrt{f_t(\eta)}\Big]^2, \\ f_t L_b\log f_t - L_b f_t &\le -g(\eta(0))\Big[\sqrt{f_t(\eta - d_0)} - \sqrt{f_t(\eta)}\Big]^2. \end{aligned} \]
Recall the definition of the Dirichlet form $D_\gamma(\cdot)$ introduced in (2.4). The previous estimate shows that the first term on the rightmost expression of (2.6) is bounded above by $-2N^2 D_\gamma(f_t)$.

To estimate the term $N^2\int Lf_t\ d\nu^N_{\gamma(\cdot)}$, which corresponds to the price we are paying for not using an invariant distribution as a reference measure, let us write it explicitly:
\[ N^2\int Lf_t\ d\nu^N_{\gamma(\cdot)} = N^2\sum_{x\ge0}\int\big\{L_{x,x+1}f_t + L_{x+1,x}f_t\big\}\,d\nu^N_{\gamma(\cdot)} + N^2\int L_b f_t\ d\nu^N_{\gamma(\cdot)}. \tag{2.7} \]
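The elementary inequality used above, $a\log(b/a) - (b-a) \le -(\sqrt{b} - \sqrt{a})^2$ for $a, b > 0$, follows from $\log x \le 2(\sqrt{x} - 1)$ applied to $x = b/a$. A quick numerical sanity check is sketched below; the function name and the sampling grid are assumptions made for this illustration only.

```python
import math


def entropy_gap(a, b):
    """Return rhs - lhs for the inequality a*log(b/a) - (b - a) <= -(sqrt(b) - sqrt(a))**2.

    The gap is non-negative exactly when the inequality holds.
    """
    lhs = a * math.log(b / a) - (b - a)
    rhs = -(math.sqrt(b) - math.sqrt(a)) ** 2
    return rhs - lhs


if __name__ == "__main__":
    # Grid of positive values spanning several orders of magnitude.
    grid = [10.0 ** (k / 4.0) for k in range(-12, 9)]
    worst = min(entropy_gap(a, b) for a in grid for b in grid)
    print("smallest gap on the grid:", worst)
```

The gap vanishes only at $a = b$, which matches the fact that both sides of the inequality are zero there.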

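Performing the change of variables $\xi = \eta - d_x + d_y$, the measures change as
\[ \frac{d\nu^N_{\gamma(\cdot)}(\eta)}{d\nu^N_{\gamma(\cdot)}(\xi)} = \frac{\gamma_x\,g(\xi(y))}{\gamma_y\,g(\eta(x))}. \]
In particular, we have that
\[ \int\big\{L_{x,x+1}f_t + L_{x+1,x}f_t\big\}\,d\nu^N_{\gamma(\cdot)} = (1/2)\Big(\frac{\gamma_x}{\gamma_{x+1}} - 1\Big)\int g(\eta(x+1))f_t(\eta)\,d\nu^N_{\gamma(\cdot)} + (1/2)\Big(\frac{\gamma_{x+1}}{\gamma_x} - 1\Big)\int g(\eta(x))f_t(\eta)\,d\nu^N_{\gamma(\cdot)}. \]
We may thus rewrite the right-hand side of (2.7) as
\[ \begin{aligned} & (1/2)\sum_{x\ge1}\frac{(\Delta_N\gamma)(x)}{\gamma_x}\int g(\eta(x))f_t(\eta)\,d\nu^N_{\gamma(\cdot)} \\ & + (N^2/2)\Big(\frac{\gamma_1}{\gamma_0} - 1\Big)\int g(\eta(0))f_t(\eta)\,d\nu^N_{\gamma(\cdot)} + N^2\int g(\eta(0))\big[f_t(\eta - d_0) - f_t(\eta)\big]\,d\nu^N_{\gamma(\cdot)}. \end{aligned} \tag{2.8} \]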

In this formula, $(\Delta_N\gamma)(x)$ stands for $N^2\{\gamma_{x+1} + \gamma_{x-1} - 2\gamma_x\}$. By definition of $\gamma$, $(\Delta_N\gamma)(x) = 0$ for all $x$ except at $x = N-1$, where $(\Delta_N\gamma)(N-1) = N^2(\gamma_{N-2} - \gamma_{N-1})$, which is negative because $\gamma$ is non decreasing. The first line of (2.8) is therefore negative. A change of variables $\xi = \eta - d_0$ allows us to write the second term of the second line as
\[ N^2\int[\gamma_0 - g(\eta(0))]\,f_t(\eta)\,d\nu^N_{\gamma(\cdot)}. \]
The second line of (2.8) is therefore equal to
\[ \beta N + (1/2)N^2\Big(\frac{\gamma_1}{\gamma_0} - 3\Big)\int g(\eta(0))f_t(\eta)\,d\nu^N_{\gamma(\cdot)} \le \beta N \]
because $\gamma_0 = \beta/N$, $f$ is a density and $\gamma_1/\gamma_0 = 2$. This concludes the proof of the proposition.

With the previous estimate on the entropy and on the Dirichlet form, we are in a position to apply the classical entropy method to prove the hydrodynamic behavior of the system (cf. Chapter V in [KL]). We just point out here the main difference coming from the absorption point at the origin.

Lemma 2.4. For every $0 \le t \le T$,
\[ \lim_{N\to\infty} E^N_{\mu^N}\Big[\int_0^t g(\eta_s(0))\,ds\Big] = 0. \]

Proof. Recall that we denote by $f_t$ the Radon–Nikodym derivative of $\mu^N S^N_t$ with respect to $\nu^N_{\gamma(\cdot)}$. Set $\bar f_t = t^{-1}\int_0^t f_s\,ds$. With this notation, the expectation in the statement writes
\[ t\int\bar f_t(\eta)\,g(\eta(0))\,d\nu^N_{\gamma(\cdot)}. \]
Adding and subtracting $\bar f_t(\eta - d_0)$ and changing variables, we obtain that this integral is equal to
\[ t\int g(\eta(0))\big[\bar f_t(\eta) - \bar f_t(\eta - d_0)\big]\,d\nu^N_{\gamma(\cdot)} + t\gamma_0. \]
The second term vanishes as $N \uparrow \infty$ because $\gamma_0 = \beta/N$. The first one, by Schwarz inequality and a change of variables, is bounded above by
\[ \frac{t}{A}\{\|g\|_\infty + \gamma_0\} + tA\,D_{\gamma,b}(\bar f_t) \]
for every $A > 0$. Choosing $A = \sqrt{N}$, we conclude the proof of the lemma by virtue of Proposition 2.3 and the convexity of the Dirichlet form.

Lemma 2.5. For every $0 \le t \le T$,
\[ \limsup_{\varepsilon\to0}\limsup_{N\to\infty} E^N_{\mu^N}\Big[\int_0^t ds\ \Phi\big(2\eta_s^{\varepsilon N}(0)\big)\Big] = 0. \]

Notice that in this last expression we multiply $\eta_t^{\varepsilon N}(0)$ by 2 to obtain the density of particles on the box $[0, \varepsilon N]$. The proof of Lemma 2.5 is performed in three steps. We first show that we may replace the cylinder function $g(\eta(0))$ by an average over a small macroscopic box around the origin. We then replace this average by $\Phi(2\eta^{\varepsilon N}(0))$ and recall Lemma 2.4 to conclude.

Lemma 2.6. For each $0 \le t \le T$ and smooth $G: \mathbb{R}_+ \to \mathbb{R}$,
\[ \limsup_{\varepsilon\to0}\limsup_{N\to\infty} E^N_{\mu^N}\Big[\int_0^t ds\ G(s)\Big\{g(\eta_s(0)) - (\varepsilon N)^{-1}\sum_{y=0}^{\varepsilon N} g(\eta_s(y))\Big\}\Big] = 0. \]

Proof. Denote by $V(\eta_s)$ the expression inside braces in the previous formula:
\[ V(\eta) = g(\eta(0)) - (\varepsilon N)^{-1}\sum_{y=0}^{\varepsilon N} g(\eta(y)). \tag{2.9} \]

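Since $\bar f_t = t^{-1}\int_0^t f_s\,ds$, we may rewrite the expectation in the statement of the lemma as
\[ t\int V(\eta)\,\bar f_t(\eta)\,\nu^N_{\gamma(\cdot)}(d\eta). \]
A change of variables $\xi = \eta - d_x$ gives that $\int V(\eta)\bar f_t(\eta)\,\nu^N_{\gamma(\cdot)}(d\eta)$ is equal to
\[ (\varepsilon N)^{-1}\sum_{x=0}^{\varepsilon N}\sum_{y=0}^{x-1}\Big\{\gamma_y\int\big\{\bar f_t(\eta + d_y) - \bar f_t(\eta + d_{y+1})\big\}\,\nu^N_{\gamma(\cdot)}(d\eta) + [\gamma_y - \gamma_{y+1}]\int\bar f_t(\eta + d_{y+1})\,\nu^N_{\gamma(\cdot)}(d\eta)\Big\}. \]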

Since $\gamma_x$ is increasing in $x$, the second term is negative. On the other hand, rewriting the difference $\{a - b\} = \{\bar f_t(\eta + d_y) - \bar f_t(\eta + d_{y+1})\}$ as $\{\sqrt{a} - \sqrt{b}\}\{\sqrt{a} + \sqrt{b}\}$ and applying the Schwarz inequality, we bound the first term by

Z nq N x−1 q o2 A XX N ¯ ft (η + dy ) − f¯t (η + dy+1 ) νγ(·) γy (dη) 2N x=0 y=0

Z n N x−1 o 1 XX N f¯t (η + dy ) + f¯t (η + dy+1 ) νγ(·) + γy (dη) AN x=0 y=0

for every A > 0. Changing variables back, keeping in mind that γx is a non decreasing function and inverting the order of summation, we show that this expression is bounded above by N −1 2kgk∞ A X N Dy,y+1 (f¯t ) + 2 A y=0

for every positive A. Recalling Proposition 2.3 and taking A = proof of the lemma.

√

N , we conclude the

In view of Lemma 2.4, to conclude the proof of Lemma 2.5, it remains to replace the average of the cylinder function $g(\eta(x))$ by $\Phi(2\eta^{\varepsilon N}(0))$, but this is the classical two blocks estimate. This concludes the proof of Theorem 2.2 under the assumption that the entropy of the initial state with respect to the product measure $\nu_\beta^+$ is bounded above by $C_0 N$ for some finite constant $C_0$.

A coupling argument allows us to remove Assumption (H2'). Consider a sequence $\mu^N$ satisfying assumptions (H1) and (H2). Fix $A > 0$ and let $\mu^{N,A}$ be the probability measure on $\mathbb{N}^{\mathbb{N}}$ defined by
\[ \mu^{N,A} = \mu^N\big|_{\Lambda_{AN}} \otimes \nu_\beta^+\big|_{\Lambda_{AN}^c}, \]
where $\Lambda_{AN} = \{0, \dots, AN\}$ and $\nu|_\Lambda$ is the marginal of the probability measure $\nu$ on $\Lambda$. Since $\nu_\lambda^+ \le \mu^N \le \nu_\alpha^+$ and since all cylinder functions can be decomposed as the difference of two monotone functions (cf. [KL]), a simple computation and the explicit formula for the relative entropy give that
\[ H(\mu^{N,A} | \nu_\beta^+) \le \frac{1}{2} \Big\{ H(\nu^+_{\alpha,AN} | \nu^+_{\beta,AN}) + H(\nu^+_{\lambda,AN} | \nu^+_{\beta,AN}) \Big\}, \]

where $\nu^+_{\gamma,m}$ is the marginal of $\nu_\gamma^+$ on $\{0, \dots, m\}$. In particular, the entropy $H(\mu^{N,A} | \nu_\beta^+)$ is bounded above by $C_0 N$ for some finite constant $C_0$ depending only on $A$, $\alpha$ and $\lambda$. Let $\rho^A(t,u)$ denote the solution of (1.3) with initial condition $\rho^A_0(u) = \rho_0(u) 1\{u \le A\} + \beta 1\{u > A\}$. Investigating the time evolution of the integral $\int_{\mathbb{R}_+} du\, e^{-u} \rho^A(t,u)^2$ we obtain a priori estimates, uniform in $A$, showing that $\rho^A$ converges to the unique solution of (1.3) with initial condition $\rho_0$. Since the jump rate $g$ is nondecreasing, we may couple a zero range process starting from $\mu^N$ with another one starting from $\mu^{N,A}$ and show that as $A \uparrow \infty$ both behave exactly in the same way on compact sets. This coupling, the hydrodynamic behavior of the empirical measure for a process starting from $\mu^{N,A}$ and the convergence of $\rho^A$ to $\rho$ allow us to extend Theorem 2.2 to sequences of measures satisfying assumptions (H1) and (H2).


3. The Case p < 1

We turn in this section to the case where the asymmetric tagged particle jumps at rate $p$ to the right and at rate $q$ to the left. The corresponding zero range process has jumps at rate $1/2$ over all bonds but $\{-1, 0\}$. From the origin, particles jump at rate $p$ to $-1$ and from $-1$ particles jump at rate $q$ to $0$. Recall that to fix ideas we assumed $p > q$. The purpose of this section is to deduce the hydrodynamic behavior of the space inhomogeneous process just described. Consider the zero range process on $\mathbb{Z}$ with generator given by
\[ L = \sum_{x \ne -1} \{ L_{x,x+1} + L_{x+1,x} \} + 2p L_{0,-1} + 2q L_{-1,0}, \tag{3.1} \]

where $L_{x,y}$ is the generator defined just after (2.1). In contrast with the previous section, this system possesses a one parameter family of invariant measures. For each $\varphi < p^{-1}$, denote by $\bar\nu^i_\varphi$ the product measure on $\mathbb{N}^{\mathbb{Z}}$ with marginals given by
\[ \bar\nu^i_\varphi \{ \eta,\ \eta(x) = k \} = \frac{1}{Z(\varphi_x)} \frac{\varphi_x^k}{g(k)!}, \tag{3.2} \]

where $\varphi_x = p\varphi$ for $x \le -1$ and $\varphi_x = q\varphi$ for $x \ge 0$. A direct computation shows that the Markov process with generator given by (3.1) is reversible with respect to these product measures.

Before stating the main result of this section, we introduce some terminology on weak solutions of non-linear parabolic equations. Fix a bounded function $\rho_0: \mathbb{R} \to \mathbb{R}$. A bounded function $\rho: \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ is said to be a weak solution of the partial differential equation (1.4) with initial condition $\rho_0$ if

(a) $\Phi(\rho(t,u))$ is absolutely continuous in the space variable and for every $t > 0$,
\[ \int_0^t ds \int_{\mathbb{R}} du\, e^{-|u|} \{ \partial_u \Phi(\rho(s,u)) \}^2 < \infty, \]
(b) $p\Phi(\rho(t,0+)) = q\Phi(\rho(t,0-))$ for almost every $t \ge 0$, and
(c) for every smooth function with compact support $G: \mathbb{R} \to \mathbb{R}$ and for every $t > 0$,
\[ \int_{\mathbb{R}} du\, \rho(t,u) G(u) - \int_{\mathbb{R}} du\, \rho_0(u) G(u) = - \int_0^t ds \int_{\mathbb{R}} du\, G'(u)\, \partial_u \Phi(\rho(s,u)). \]

Since $\rho(t,u)$ is only a measurable function, requirement (b) must be understood as
\[ \lim_{\varepsilon \to 0} \int_0^t h(s) \Big\{ p\Phi\Big( \frac{1}{\varepsilon} \int_0^{\varepsilon} \rho(s,u)\, du \Big) - q\Phi\Big( \frac{1}{\varepsilon} \int_{-\varepsilon}^{0} \rho(s,u)\, du \Big) \Big\}\, ds = 0 \tag{3.3} \]
for every $t \ge 0$ and any continuous function $h(t)$. The third property in (1.4) just states that there is conservation of the total mass at the origin. Uniqueness of weak solutions of (1.4) is proved with techniques similar to the ones presented in [ELS] (cf. Appendix). The existence for special initial conditions $\rho_0$ follows from the tightness of the sequence $Q^N_{\mu^N}$ defined below.
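The reversibility claimed for the product measures (3.2) reduces, on the special bond $\{-1, 0\}$, to a detailed balance identity that can be checked numerically. The following is a minimal sketch under stated assumptions: the jump rate `g` below is a hypothetical bounded nondecreasing example, and the bond rates $p\,g(\eta(0))$ and $q\,g(\eta(-1))$ are read off from the description of the dynamics above, since the definition of $L_{x,y}$ is not reproduced in this section. The function names are illustrative only.

```python
from itertools import product

def g(k):
    # Hypothetical jump rate (an assumption): bounded, nondecreasing, g(0) = 0.
    return 0.0 if k == 0 else 1.0 - 0.5 ** k

def weight(k, phi_x):
    # Unnormalised one-site weight phi_x^k / g(k)!, with g(k)! = g(1) * ... * g(k).
    w = 1.0
    for j in range(1, k + 1):
        w *= phi_x / g(j)
    return w

def bond_defect(p, q, phi_left, phi_right, kmax=6):
    """Largest relative violation of detailed balance across the bond {-1, 0}.

    a = eta(-1), b = eta(0). A particle leaves 0 for -1 at rate p * g(b) and
    returns from -1 to 0 at rate q * g(a + 1). The normalising constants
    Z(phi_x) are the same on both sides and therefore omitted.
    """
    worst = 0.0
    for a, b in product(range(kmax), range(1, kmax)):
        forward = weight(a, phi_left) * weight(b, phi_right) * p * g(b)
        backward = weight(a + 1, phi_left) * weight(b - 1, phi_right) * q * g(a + 1)
        worst = max(worst, abs(forward - backward) / max(forward, backward))
    return worst

p, q, phi = 0.7, 0.3, 1.0
print(bond_defect(p, q, p * phi, q * phi))  # fugacities of (3.2): p*phi left, q*phi right
print(bond_defect(p, q, phi, phi))          # constant fugacity, shown for contrast
```

With the fugacities of (3.2) the defect is at the level of rounding error, whereas a spatially constant fugacity fails detailed balance as soon as $p \ne q$. This is one way to see why the invariant measures here are inhomogeneous in space.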


For each probability measure $\mu$ on $\mathbb{N}^{\mathbb{Z}}$, denote by $P^N_\mu$ the probability measure on the path space $D(\mathbb{R}_+, \mathbb{N}^{\mathbb{Z}})$ induced by the Markov process with generator (3.1) accelerated by $N^2$ and the initial measure $\mu$. Expectation with respect to $P^N_\mu$ is denoted by $E^N_\mu$. We now define the initial states considered in the first main theorem of this section. Fix a sequence of initial measures $\mu^N$ on $\mathbb{N}^{\mathbb{Z}}$. We assume that

(IS1) The sequence $\mu^N$ is bounded above (resp. below) by some invariant state $\bar\nu^i_\alpha$ (resp. $\bar\nu^i_\lambda$) for some $0 < \lambda < \alpha$.
(IS2) There exists a function $\rho_0: \mathbb{R} \to \mathbb{R}_+$ such that for each continuous function $G: \mathbb{R} \to \mathbb{R}_+$ and each $\delta > 0$,
\[ \lim_{N \to \infty} \mu^N \Big[ \Big| N^{-1} \sum_x G(x/N) \eta(x) - \int du\, G(u) \rho_0(u) \Big| \ge \delta \Big] = 0. \]

Notice that it follows from assumption (IS1) that the function $\rho_0$ in (IS2) is necessarily bounded.

Theorem 3.1. Consider a sequence of initial states $\mu^N$ satisfying assumptions (IS1), (IS2). For any continuous function $G: \mathbb{R} \to \mathbb{R}$ with compact support and any $\delta > 0$,
\[ \lim_{N \to \infty} P^N_{\mu^N} \Big[ \Big| N^{-1} \sum_x G(x/N) \eta_t(x) - \int du\, G(u) \rho(t,u) \Big| \ge \delta \Big] = 0, \]
where $\rho$ is the unique solution of (1.4).

As in Sect. 2 (cf. also Chap. IV of [KL]), we deduce this result from the convergence in distribution of the empirical measure $\pi^N = \pi^N(\eta)$ defined as the positive Radon measure on $\mathbb{R}$ obtained by assigning a mass $N^{-1}$ to each particle: $\pi^N = N^{-1} \sum_{z \in \mathbb{Z}} \eta(z) \delta_{z/N}$. Set $\pi^N_t = \pi^N(\eta_t)$ and denote by $\mathcal{M}_+ = \mathcal{M}_+(\mathbb{R})$ the space of positive Radon measures on $\mathbb{R}$ endowed with the vague topology, a metrizable topology. Fix $T > 0$. For each probability measure $\mu$ on $\mathbb{N}^{\mathbb{Z}}$, denote by $Q^N_\mu$ the probability measure on the path space $D([0,T], \mathcal{M}_+)$ induced by $P^N_\mu$ and the empirical measure $\pi^N$.

Theorem 3.2. The sequence $Q^N_{\mu^N}$ converges to the probability measure concentrated on the absolutely continuous path $\pi(t, du) = \rho(t,u)\, du$ whose density is the solution of (1.4).

Coupling arguments similar to the ones presented at the end of the previous section show that it is enough to prove Theorem 3.2 under the assumption that there exist a density $\beta > 0$ and a finite constant $C_0$ such that the entropy of $\mu^N$ with respect to $\bar\nu^i_\beta$ is bounded by $C_0 N$: $H(\mu^N | \bar\nu^i_\beta) \le C_0 N$ for every $N \ge 1$. We therefore assume until the end of this section the existence of such constants $\beta$ and $C_0$.

The main difference between the proof of the hydrodynamic limit of this model and the classical proof for space homogeneous systems resides in the behavior at the boundary $u = 0$. The next four lemmas settle this question. For a site $x$, a configuration $\eta$ and a positive integer $\ell$, denote by $M^\pm_\ell(x, \eta)$ the density of particles for the configuration $\eta$ on a box of size $\ell$ at the right (left) of $x$:
\[ M^+_\ell(x, \eta) = \frac{1}{\ell + 1} \sum_{y=x}^{x+\ell} \eta(y), \qquad M^-_\ell(x, \eta) = \frac{1}{\ell + 1} \sum_{y=x-\ell}^{x} \eta(y). \]


Lemma 3.3. For every continuous function $H: [0,T] \to \mathbb{R}$,
\[ \limsup_{\varepsilon \to 0} \limsup_{N \to \infty} E^N_{\mu^N} \Big[ \Big| \int_0^T dt\, H(t) \Big\{ g(\eta_t(0)) - \Phi(M^+_{\varepsilon N}(0, \eta_t)) \Big\} \Big| \Big] = 0. \]
The same result holds if $g(\eta_t(0))$ is replaced by $g(\eta_t(-1))$ and $M^+_{\varepsilon N}(0, \eta_t)$ by $M^-_{\varepsilon N}(-1, \eta_t)$.

This result follows from the next lemma and the two blocks estimate.

Lemma 3.4. For every continuous function $H: [0,T] \to \mathbb{R}$,
\[ \limsup_{\varepsilon \to 0} \limsup_{N \to \infty} E^N_{\mu^N} \Big[ \Big| \int_0^T dt\, H(t) \Big\{ g(\eta_t(0)) - (\varepsilon N)^{-1} \sum_{x=0}^{\varepsilon N} g(\eta_t(x)) \Big\} \Big| \Big] = 0. \]
The same result holds if $g(\eta_t(0))$ is replaced by $g(\eta_t(-1))$ and the average over $\{0, \dots, \varepsilon N\}$ is replaced by the average over $\{-\varepsilon N, \dots, 0\}$.

Proof. Recall from (2.9) the definition of $V(\eta_t)$. By the entropy inequality,
\[ E_{\mu^N} \Big[ \Big| \int_0^T ds\, H(s) V(\eta_s) \Big| \Big] \le \frac{H(\mu^N | \bar\nu^i_\beta)}{NA} + \frac{1}{AN} \log E_{\bar\nu^i_\beta} \Big[ \exp \Big\{ \Big| \int_0^T ds\, H(s) AN V(\eta_s) \Big| \Big\} \Big] \]
for every $A > 0$. By assumption, the first term on the right-hand side is bounded by $C_0 A^{-1}$. To prove the lemma it is therefore enough to show that the limit of the second one is less than or equal to 0 for every $A > 0$. Since $e^{|x|} \le e^x + e^{-x}$ and $\limsup_N N^{-1} \log\{a_N + b_N\} \le \max\{\limsup_N N^{-1} \log a_N, \limsup_N N^{-1} \log b_N\}$, replacing $H$ by $-H$ we deduce that we only need to prove the previous statement without the absolute value in the exponent. By the Feynman–Kac formula and the variational formula for the largest eigenvalue of an operator,
\[ \limsup_{N \to \infty} \frac{1}{AN} \log E_{\bar\nu^i_\beta} \Big[ \exp \Big\{ \int_0^T ds\, H(s) AN V(\eta_s) \Big\} \Big] \le \int_0^T dt\, \sup_f \Big\{ \int H(t) V(\eta) f(\eta)\, \bar\nu^i_\beta(d\eta) + A^{-1} N D(f) \Big\}. \tag{3.4} \]
In this formula, the supremum is taken over all densities $f$ with respect to $\bar\nu^i_\beta$ and $D(f)$ is the Dirichlet form
\[ D(f) = \int \sqrt{f}\, L \sqrt{f}\, d\bar\nu^i_\beta. \]
We are now ready to integrate by parts the cylinder function $V$. The rest of the proof is similar to the proof of Lemma 2.6 and is omitted for this reason.

The same argument yields the following result.

Lemma 3.5. For every continuous function $H: [0,T] \to \mathbb{R}$,
\[ \limsup_{N \to \infty} E^N_{\mu^N} \Big[ \Big| \int_0^T dt\, H(t) \{ p g(\eta_t(0)) - q g(\eta_t(-1)) \} \Big| \Big] = 0. \]


The next result follows from Lemma 3.3 and Lemma 3.5.

Corollary 3.6. For every continuous function $H: [0,T] \to \mathbb{R}$,
\[ \limsup_{\varepsilon \to 0} \limsup_{N \to \infty} E^N_{\mu^N} \Big[ \Big| \int_0^T dt\, H(t) \Big\{ p\Phi(M^+_{\varepsilon N}(0, \eta_t)) - q\Phi(M^-_{\varepsilon N}(-1, \eta_t)) \Big\} \Big| \Big] = 0. \]

These technical lemmas allow us to adapt the classical proof of the hydrodynamic behavior of reversible systems to the present context. Details are left to the reader.

Remark 3.7. In Sects. 2 and 3 only the monotonicity and the boundedness of the jump rate $g(\cdot)$ were used. The same arguments therefore yield the hydrodynamic behavior of a more general class of processes.

4. The Asymmetric Tagged Particle

We prove in this section Theorems 1.1 and 1.2 through the hydrodynamic behavior of the inhomogeneous zero range processes considered in the previous two sections. We have seen in the first section that the displacement of the asymmetric tagged particle corresponds in the zero range process to the total flux of particles through the origin. For this reason, we start by deducing the total flux through the origin from the hydrodynamic limit proved in the previous two sections.

Proposition 4.1. In the case $p = 1$, consider a sequence of probability measures $\mu^N$ satisfying assumptions (H1) and (H2). Then, for every $t \ge 0$ and $\delta > 0$,
\[ \lim_{N \to \infty} P^N_{\mu^N} \Big[ \Big| N^{-1} \sum_{x \ge 0} \{ \eta_t(x) - \eta_0(x) \} - \int_0^\infty du\, \{ \rho(t,u) - \rho_0(u) \} \Big| > \delta \Big] = 0, \tag{4.1} \]
where $\rho$ is the solution of (1.3). In the case $p < 1$, consider a sequence of probability measures $\mu^N$ satisfying assumptions (IS1), (IS2). Then, for every $t \ge 0$ and $\delta > 0$, (4.1) holds, where $\rho$ is now the solution of (1.4).

Proposition 4.1 follows from the hydrodynamic behavior of the inhomogeneous processes considered in Sects. 2, 3 and from the definition of the infinite sums appearing in (4.1). Theorems 1.1 and 1.2 follow from Proposition 4.1 if we prove the following proposition:

Proposition 4.2. Fix a sequence of initial states $\mu^N_{\rho_0(\cdot)}$ satisfying the assumptions of Theorem 1.1 or 1.2. The sequence $T\mu^N_{\rho_0(\cdot)}$ satisfies assumptions (H1), (H2) in the case $p = 1$ or (IS1), (IS2) in the case $p < 1$, where $T$ is the transformation defined in Sect. 1.

Proof. We start with the case $p = 1$. A simple computation shows that $T$ transforms the Bernoulli product measure $\mu_\rho$ into the product measure $\nu^+_{(1-\rho)/\rho}$ defined by (2.2). Fix a profile $\rho_0: \mathbb{R}_+ \to [0,1]$ for which there exists $\sigma > 0$ such that $\sigma \le \rho_0 \le 1 - \sigma$. Recall that we denote by $\mu^N_{\rho_0(\cdot)}$ the inhomogeneous product measure associated to $\rho_0$. Let $T\mu^N_{\rho_0(\cdot)} = \nu^N_{\rho_0(\cdot)}$. We shall now show that $\nu^N_{\rho_0(\cdot)}$ fulfills assumptions (H1), (H2). We first claim that if $\mu$ is a product measure on $\{0,1\}^{\mathbb{N}_*}$ bounded above (resp. below) by $\mu^+_\rho$ for some $0 < \rho < 1$, then $T\mu$ is bounded below (resp. above) by $\nu^+_{(1-\rho)/\rho}$. Here


$\mu^+_\rho$ stands for the restriction to $\mathbb{N}$ of the measure $\mu_\rho$. Notice that the inequalities are reversed by the application of $T$. To fix ideas assume that $\mu \le \mu^+_\rho$. For $x \ge 1$, denote by $\gamma_x$ the probability of finding a particle at $x$ under $\mu$, so that $\gamma_x \le \rho$. For $j \ge 1$, denote by $N_j$ the position of the $j$th particle at the right of the origin. Since $\gamma_x \le \rho$ for every $x$ and $\mu$, $\mu_\rho$ are product measures, it is possible to couple $\mu$ and $\mu_\rho$ in such a way that $N_1^\mu \ge N_1^\rho$ and $N_{j+1}^\mu - N_j^\mu \ge N_{j+1}^\rho - N_j^\rho$ for all $j \ge 1$. In this formula, $N_j^\mu$ (resp. $N_j^\rho$) stands for the position of the $j$th particle under the distribution $\mu$ (resp. $\mu_\rho$). Applying the transformation $T$ to this coupling measure, we construct a measure on $\mathbb{N}^{\mathbb{N}} \times \mathbb{N}^{\mathbb{N}}$ with first marginal equal to $T\mu$, second marginal equal to $T\mu^+_\rho = \nu^+_{(1-\rho)/\rho}$ and concentrated on configurations $(\eta^1, \eta^2)$ below the diagonal. This shows that $T\mu \ge \nu^+_{(1-\rho)/\rho}$, which concludes the proof of the claim. In particular, $\nu^+_{\sigma/(1-\sigma)} \le \nu^N_{\rho_0(\cdot)} \le \nu^+_{(1-\sigma)/\sigma}$ for every $N \ge 1$ and assumption (H1) is verified.

Notice, however, that the claim "$\mu_1 \le \mu_2$ implies $T\mu_1 \ge T\mu_2$" is not correct. Consider, for instance, the configurations $\xi^1$, $\xi^2$ such that $\xi^1(x) = 1$ if and only if $x \ne 1, 2, 3$ and $\xi^2(x) = 1$ if and only if $x \ne 1, 3$. In this case the deterministic measures $\delta_{\xi^i}$ are such that $\delta_{\xi^1} \le \delta_{\xi^2}$ but it is not correct that $\delta_{T\xi^1}$ is above $\delta_{T\xi^2}$.

We turn now to the second assumption (H2). It follows from (1.2) that
\[ \int_0^B F(u)\, du = H^{-1}(B) - B \tag{4.2} \]
for every $B > 0$. In order to check (H2), we just need to show that under $\nu^N_{\rho_0(\cdot)}$, $N^{-1} \sum_{x=0}^{[BN]} \eta(x)$ converges in probability to $\int_0^B F(u)\, du$ for every $B > 0$. Fix a positive integer $n$. The following inequalities state that for the exclusion process the total number of sites in $\Lambda_n = \{0, \dots, n\}$ is equal to the total number of particles plus the total number of holes (which corresponds to the total number of particles for the zero range process):
\[ \sum_{x=0}^{n} \xi(x) + \sum_{y=0}^{-1 + \sum_{x=0}^{n} \xi(x)} \eta(y) \ \le\ n + 1 \ \le\ \sum_{x=0}^{n} \xi(x) + \sum_{y=0}^{\sum_{x=0}^{n} \xi(x)} \eta(y). \]
The convergence of $N^{-1} \sum_{x=0}^{[BN]} \eta(x)$ follows from these inequalities, the fact that under the measure $\mu^N_{\rho_0(\cdot)}$, $N^{-1} \sum_{0 \le x \le [nN]} \xi(x)$ converges to $\int_0^n \rho_0(u)\, du$, and identity (4.2). Details are left to the reader.

In exactly the same way, assumptions (IS1), (IS2) can be checked in the case $p < 1$. The only difference is that we assume in (IS1) that the sequence of initial measures is bounded below by an invariant measure $\bar\nu^i_\varphi$ which is inhomogeneous in space. This forces the initial profile $\rho_0$ to be bounded below by the function $\psi_\alpha(u) = (1 - \alpha) 1\{u < 0\} + [1 - (q/p)\alpha] 1\{u > 0\}$ for some $0 < \alpha < 1$.

5. Einstein Relation

We consider in this section initial profiles for which the solution of equation (1.4) is selfscaling. For two fixed densities $\rho_-$ and $\rho_+$ consider, for instance, the initial condition $\rho_0(\cdot)$ given by


\[ \rho_0(u) = \rho_+ 1\{u \ge 0\} + \rho_- 1\{u < 0\}. \]
The solution of (1.4) takes the form $\rho(t,u) = \varphi(u/\sqrt{t})$, where $\varphi(\cdot)$ is the solution of
\[ \begin{cases} -z\varphi'(z) = \partial_z^2 \Phi(\varphi(z)), \\[4pt] \dfrac{\varphi'(0+)}{(1 + \varphi(0+))^2} = \dfrac{\varphi'(0-)}{(1 + \varphi(0-))^2}, \qquad \varphi(\pm\infty) = \rho_\pm, \\[4pt] p\Phi(\varphi(0+)) = q\Phi(\varphi(0-)). \end{cases} \tag{5.1} \]
It is easy to see that in this case $v_t = v\sqrt{t}$, where $v$ is given by
\[ v = \int_0^{+\infty} \{ \rho_+ - \varphi(y) \}\, dy. \]
Moreover, since $\rho_+ = \varphi(\infty)$, we may write the expression inside braces as $\int_{[y,\infty)} \partial_z \varphi(z)\, dz$. Performing an integration by parts and keeping in mind that $\varphi$ is the solution of (5.1), we obtain that
\[ v = \frac{\varphi'(0+)}{(1 + \varphi(0+))^2}. \]
We now transform (5.1) into a linear equation through the following Lagrangian change of coordinates:
\[ x(z) = \int_0^z (1 + \varphi(y))\, dy, \qquad m(x) = \frac{1}{1 + \varphi(z(x))}. \]
We leave it to the reader to check that this transformation is in fact the inverse of the transformation $T$ described in (1.2). Moreover, a simple computation shows that $m(x)$ is the solution of the linear equation
\[ \begin{cases} m''(x) = -(x + v) m'(x), \\[4pt] -v = \dfrac{m'(0+)}{m(0+)} = \dfrac{m'(0-)}{m(0-)}, \\[4pt] p(1 - m(0+)) = q(1 - m(0-)), \\[4pt] m(\pm\infty) = \alpha_\pm = \dfrac{1}{1 + \rho_\pm}. \end{cases} \tag{5.2} \]
In fact (5.2) describes the selfscaling solution of the Stefan problem:
\[ \begin{cases} \partial_t m^*(x,t) = \dfrac{1}{2} \partial_{xx} m^*(x,t), \\[4pt] -v_t = \dfrac{\partial_x m^*(v_t+, t)}{m^*(v_t+, t)} = \dfrac{\partial_x m^*(v_t-, t)}{m^*(v_t-, t)}, \\[4pt] p\{1 - m^*(v_t+, t)\} = q\{1 - m^*(v_t-, t)\}, \\[4pt] m^*(x, 0) = \alpha_+ 1\{x \ge 0\} + \alpha_- 1\{x < 0\}. \end{cases} \tag{5.3} \]
In other words, $m(x/\sqrt{t})$ is the macroscopic profile of density as seen from the tagged asymmetric particle.


The solution of (5.2) can be written as
\[ m(x) = \begin{cases} A_+ + B_+ \displaystyle\int_0^x e^{-(1/2)y^2 - vy}\, dy & \text{for } x > 0, \\[6pt] A_- + B_- \displaystyle\int_0^x e^{-(1/2)y^2 - vy}\, dy & \text{for } x < 0, \end{cases} \]
where the parameters are related by the equations
\[ p(1 - A_+) = q(1 - A_-); \qquad -v = \frac{B_+}{A_+} = \frac{B_-}{A_-}; \qquad \alpha_\pm = A_\pm J(\pm v), \]
where $J(v) = 1 - v \int_0^{+\infty} e^{-(1/2)y^2 - vy}\, dy$. It follows from the previous identities that the parameters $p$, $\alpha_+$, $\alpha_-$ and $v$ satisfy the equation
\[ p \Big( 1 - \frac{\alpha_+}{J(v)} \Big) = q \Big( 1 - \frac{\alpha_-}{J(-v)} \Big). \tag{5.4} \]
This equation was obtained heuristically by [BDMO]. In particular, we cannot write $v$ as an explicit function of $p$, $\alpha_+$, $\alpha_-$, but we can study some asymptotic relations. We consider three distinct asymptotics.

We first investigate the case of a constant initial profile: $\alpha_+ = \alpha_- = \alpha$. In this case elementary computations give the identity
\[ (p - q) \frac{1 - \alpha}{\alpha} = \frac{p J(-v) - q J(v) - (p - q) J(v) J(-v)}{J(v) J(-v)}. \tag{5.5} \]
For small asymmetry $p - q$, we have a small displacement $v$. Replacing in (5.5) $J(v)$ by its expansion for $v$ small, $J(\pm v) = 1 \mp v\sqrt{\pi/2} + O(v^2)$, gives, for fixed $\alpha$ and small $p - q$, that
\[ v = (p - q) \sqrt{\frac{2}{\pi}}\, \frac{1 - \alpha}{\alpha} + o(p - q). \]
This proves the validity of the Einstein relation for small drifts.

In the case $\alpha_+ \ne \alpha_-$ one can expand around the equilibrium, i.e., for small $p(1 - \alpha_+) - q(1 - \alpha_-)$. The same expansions, inserted in (5.4), show that
\[ v = \frac{p(1 - \alpha_+) - q(1 - \alpha_-)}{p\alpha_+ + q\alpha_-} \sqrt{\frac{2}{\pi}} + o\big( p(1 - \alpha_+) - q(1 - \alpha_-) \big), \]
which reduces to the previous formula when $\alpha_+ = \alpha_- = \alpha$ and $p + q = 1$.

A third possible asymptotics is given when the initial profile is constant and the density $\alpha = \alpha_+ = \alpha_-$ is small. In this case, for a fixed drift $p - q$, the displacement $v$ is very large. Asymptotically, for $|v|$ close to $\infty$, a simple computation shows that
\[ J(v) \sim v^{-2}, \qquad J(-v) \sim v e^{v^2/2} \sqrt{2\pi}. \]
Using these expansions in (5.4) one obtains that
\[ v \sim \sqrt{\frac{p - q}{\alpha_+}} + o\Big( \frac{1}{\sqrt{\alpha_+}} \Big). \]
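Since (5.4) has no closed-form solution for $v$, the small-drift regime can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $q = 1 - p$, uses the closed form $\int_0^\infty e^{-y^2/2 - vy}\, dy = \sqrt{\pi/2}\, e^{v^2/2}\, \mathrm{erfc}(v/\sqrt{2})$ obtained by completing the square, and solves (5.4) by bisection on a bracket $[-5, 5]$ chosen by hand. The function names are hypothetical.

```python
import math

def J(v):
    # J(v) = 1 - v * int_0^inf exp(-y^2/2 - v*y) dy, with the integral written
    # as sqrt(pi/2) * exp(v^2/2) * erfc(v / sqrt(2)).
    integral = math.sqrt(math.pi / 2) * math.exp(v * v / 2) * math.erfc(v / math.sqrt(2))
    return 1.0 - v * integral

def residual(v, p, alpha_plus, alpha_minus):
    # Left-hand side minus right-hand side of (5.4), assuming q = 1 - p.
    q = 1.0 - p
    return p * (1 - alpha_plus / J(v)) - q * (1 - alpha_minus / J(-v))

def solve_v(p, alpha_plus, alpha_minus, lo=-5.0, hi=5.0, iters=200):
    # Plain bisection; requires a sign change of the residual on [lo, hi].
    f_lo = residual(lo, p, alpha_plus, alpha_minus)
    f_hi = residual(hi, p, alpha_plus, alpha_minus)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket; widen [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid, p, alpha_plus, alpha_minus)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def einstein_prediction(p, alpha):
    # Leading-order displacement for a constant profile alpha and small p - q.
    q = 1.0 - p
    return (p - q) * math.sqrt(2 / math.pi) * (1 - alpha) / alpha

p, alpha = 0.51, 0.5
print(solve_v(p, alpha, alpha), einstein_prediction(p, alpha))
```

For a weak drift such as $p = 0.51$ the two printed values should agree closely, and the gap widens as $p - q$ grows, which is the expected behavior of a first-order expansion.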


6. Appendix: Uniqueness

Case p = 1. This is an extension to infinite volume of an argument presented in [ELS2]. Fix a weak solution $\rho(t,u)$ of the differential Eq. (1.3). Since $\rho(t, \cdot)$ is in $L^1_{loc}(\mathbb{R}_+)$, we may define $R_t: \mathbb{R}^2_+ \to \mathbb{R}$ by
\[ R_t(u,v) = \int_u^v \rho(t,w)\, dw. \tag{6.1} \]
Denote by $[\cdot, \cdot]$ the inner product in $L^2(\mathbb{R}^2_+)$. Fix a smooth function $H: \mathbb{R}^2_+ \to \mathbb{R}$ with compact support. Changing the order of integration we obtain that
\[ [R_t, H] = \int_{\mathbb{R}_+} dw\, \rho(t,w) h(w), \tag{6.2} \]
where
\[ h(w) = \int_0^w du \int_w^\infty dv\, H(u,v) - \int_w^\infty du \int_0^w dv\, H(u,v). \]
Notice that $h$ is a smooth function with compact support that vanishes at the origin. Moreover, its derivative is given by
\[ h'(w) = \int_0^\infty du\, \{ H(w,u) - H(u,w) \}. \]
Therefore, by virtue of (6.2), property (c) of weak solutions and a change of variables, for every smooth function $H$ with compact support,
\[ [R_t, H] = [R_0, H] + \int_0^t ds \int_{\mathbb{R}_+} du \int_{\mathbb{R}_+} dv\, H(u,v) \{ \partial_v \Phi(\rho(s,v)) - \partial_u \Phi(\rho(s,u)) \}. \]
In particular, we have that
\[ R_t(u,v) - R_0(u,v) = \int_0^t ds\, \{ \partial_v \Phi(\rho(s,v)) - \partial_u \Phi(\rho(s,u)) \} \tag{6.3} \]

for almost all $(u,v)$ in $\mathbb{R}^2_+$. Consider now two solutions $\rho^1$, $\rho^2$ of Eq. (1.3), denote by $R^1_t$, $R^2_t$ the respective functions associated to $\rho^1$, $\rho^2$ through (6.1) and set $W_t = R^1_t - R^2_t$, $\bar\rho_t = \rho^1_t - \rho^2_t$. Denote by $[\cdot, \cdot]_e$ the inner product on $L^2(\mathbb{R}^2_+)$ associated to the measure $e^{-(u+v)}\, du\, dv$. In view of property (a) of weak solutions and identity (6.3), $R: [0,T] \to L^2(\mathbb{R}^2_+, e^{-(u+v)}\, du\, dv)$ is almost everywhere differentiable. Therefore,
\[ \frac{d}{dt} [W_t, W_t]_e = 2 \int du \int dv\, e^{-(u+v)} W_t(u,v) \{ \partial_v \bar\Phi_t(v) - \partial_u \bar\Phi_t(u) \}, \]
where $\bar\Phi_t(v)$ stands for $\Phi(\rho^1(t,v)) - \Phi(\rho^2(t,v))$. An integration by parts gives that the right-hand side is equal to
\[ -2 \int_{\mathbb{R}_+} du\, e^{-u} \bar\Phi_t(u) \bar\rho(t,u) + 2 \int du \int dv\, e^{-(u+v)} W_t(u,v) \bar\Phi_t(v) \tag{6.4} \]
because $\int_{\mathbb{R}_+} du\, \exp\{-u\} = 1$. By the Schwarz inequality, the second term is bounded above by
\[ \|\Phi'\|_\infty [W_t, W_t]_e + \frac{1}{\|\Phi'\|_\infty} \int_{\mathbb{R}_+} du\, e^{-u} \bar\Phi_t(u)^2 \le \|\Phi'\|_\infty [W_t, W_t]_e + \int_{\mathbb{R}_+} du\, e^{-u} \bar\Phi_t(u) \bar\rho(t,u) \]
because $\Phi$ is an increasing function with a bounded first derivative. Adding this expression to the first term of (6.4), we obtain that the time derivative of $[W_t, W_t]_e$ is bounded above by
\[ \|\Phi'\|_\infty [W_t, W_t]_e - \int_{\mathbb{R}_+} du\, e^{-u} \bar\Phi_t(u) \bar\rho(t,u) \le \|\Phi'\|_\infty [W_t, W_t]_e \]
because $\Phi$ is nondecreasing. By the Gronwall inequality, we deduce that $[W_t, W_t]_e$ is bounded above by $[W_0, W_0]_e \exp\{\|\Phi'\|_\infty t\}$, which concludes the proof of the uniqueness of weak solutions of Eq. (1.3).

The case p < 1. The argument is similar to the one presented for $p = 1$. For $t \ge 0$, define $R_t: \mathbb{R}^2 \to \mathbb{R}$ as in (6.1). It can be shown that
\[ R_t(u,v) - R_0(u,v) = \int_0^t ds\, \{ \partial_v \Phi(\rho(s,v)) - \partial_u \Phi(\rho(s,u)) \} \]
for almost all $(u,v)$ in $\mathbb{R}^2$. Denote by $m(du) = m(u)\, du$ the absolutely continuous measure with density $m(u) = p 1\{u < 0\} + q 1\{u > 0\}$ and fix a smooth function $\theta: \mathbb{R} \to \mathbb{R}_+$ such that $\theta(0) = 0$, $\theta(u) = |u|$ for $u$ large enough and $\int_{\mathbb{R}} m(du) \exp\{-\theta(u)\} = 1$. Let $[\cdot, \cdot]_m$ stand for the inner product in $L^2(\mathbb{R}^2)$ with respect to the measure $m(du)\, m(dv) \exp\{-\theta(u) - \theta(v)\}$. Fix two solutions $\rho^1$, $\rho^2$ of Eq. (1.4), denote by $R^1_t$, $R^2_t$ the respective functions associated to $\rho^1$, $\rho^2$ through (6.1) and set $W_t = R^1_t - R^2_t$. With the same arguments presented above one can show that $[W_t, W_t]_m$ is bounded above by $[W_0, W_0]_m \exp\{C(\theta, \|\Phi'\|_\infty) t\}$. In this deduction the use of the measure $m(du)$ instead of the Lebesgue measure is fundamental in the integration by parts performed in (6.4) for the boundary term to cancel. □

Acknowledgement. The authors would like to thank H. Rost and H. Spohn for fruitful discussions on the Einstein relation and Potts models. We also thank G. S. Oshanin for pointing out a computational mistake in Sect. 5 of a previous version of this paper.

References

[A] Andjel, E.: Invariant measures for the zero-range processes. Ann. Probab. 10, 525–547 (1982)
[Ar] Arratia, R.: The motion of a tagged particle in the simple symmetric exclusion system in Z. Ann. Probab. 11, 362–373 (1983)
[BDMO] Burlatsky, S.F., De Coninck, J., Moreau, M., Oshanin, G.S.: Dynamics of the shock front propagation in a one-dimensional hard core lattice gas. Preprint (1997)
[BMMO] Burlatsky, S.F., Mogutov, A.V., Moreau, M., Oshanin, G.S.: Directed walk in a one-dimensional gas. Phys. Lett. A 166, 230–234 (1992)
[BMOR] Burlatsky, S.F., Moreau, M., Oshanin, G., Reinhardt, W.P.: Motion of a driven tracer particle in a one-dimensional symmetric lattice gas. Phys. Rev. E 54, no. 4, 3165–3172 (1996)
[ELS] Eyink, G., Lebowitz, J.L., Spohn, H.: Lattice gas models in contact with stochastic reservoirs: local equilibrium and relaxation to steady states. Commun. Math. Phys. 140, 119–131 (1991)
[GPV] Guo, M.Z., Papanicolaou, G.C., Varadhan, S.R.S.: Nonlinear diffusion limit for a system with nearest neighbor interactions. Commun. Math. Phys. 118, 31–59 (1988)
[H] Harris, T.E.: Diffusions with collisions between particles. J. Appl. Prob. 2, 323–338 (1965)
[KL] Kipnis, C., Landim, C.: Hydrodynamical Limit of Interacting Particle Systems. Preprint (1996)
[LM] Landim, C., Mourragui, M.: Hydrodynamic limit of mean zero asymmetric zero range processes in infinite volume. Ann. Inst. H. Poincaré, Prob. et Stat. 33, 65–82 (1997)
[LR] Lebowitz, J.L., Rost, H.: The Einstein relation for the displacement of a test particle in a random environment. Stoch. Proc. Appl. 54, 183–196 (1994)
[RV] Rost, H., Vares, M.E.: Hydrodynamics of a one dimensional nearest neighbor model. Contemp. Math. 41, 329–342 (1985)
[Spo] Spohn, H.: Large Scale Dynamics of Interacting Particles. Texts and Monographs in Physics. New York: Springer-Verlag, 1991

Communicated by J. L. Lebowitz

Commun. Math. Phys. 192, 309 – 347 (1998)

Communications in

Mathematical Physics c Springer-Verlag 1998

Relative Zeta Functions, Relative Determinants and Scattering Theory

Werner Müller
Universität Bonn, Mathematisches Institut, Beringstrasse 1, D-53115 Bonn, Germany. E-mail: [email protected]
Received: 15 January 1997 / Accepted: 2 July 1997

Abstract: We use the method of zeta function regularization to regularize the ratio $\det A / \det A_0$ of the determinants of two elliptic self-adjoint operators $A$, $A_0$ satisfying certain natural assumptions. This is of interest, especially, if the regularized determinants of the individual operators do not exist as, for example, in the case of elliptic operators on a noncompact manifold.

0. Introduction

In the present paper we use the method of zeta function regularization to regularize the ratio $\det A / \det A_0$ of the determinants of two elliptic self-adjoint operators $A$ and $A_0$, acting in the space of smooth sections of a vector bundle over a $C^\infty$ manifold $X$. If $X$ is closed, then the zeta function technique can be used to regularize the determinants of the individual operators $A$ and $A_0$, and the above ratio is simply the ratio of these regularized determinants. However, if $X$ is noncompact then, in general, an elliptic self-adjoint operator $A$ on $X$ has continuous spectrum and therefore the zeta function of $A$ cannot be defined. Since regularized determinants of elliptic operators play an important role in various fields of mathematics and physics (see [Mu6] for some references concerning applications), it is interesting to see how this technique can be extended to the case of elliptic operators on noncompact manifolds.

First, we recall some facts about the zeta function regularization of determinants of elliptic operators on compact manifolds. Let $E$ be a complex vector bundle over a closed $n$-dimensional $C^\infty$ manifold $M$ and let $A: C^\infty(E) \to C^\infty(E)$ be an elliptic pseudodifferential operator of order $m > 0$. Suppose that $A$ is symmetric and nonnegative (with respect to an inner product in $C^\infty(E)$ induced by the choice of a metric on $M$ and a fibre metric in $E$). Then the zeta function of $A$ is defined by
\[ \zeta_A(s) = \sum_{\lambda_j > 0} \lambda_j^{-s}, \tag{0.1} \]


where the $\lambda_j$'s run over the eigenvalues of $A$, and each eigenvalue is counted with its multiplicity. The series (0.1) is absolutely convergent in the half-plane $\mathrm{Re}(s) > n/m$ and admits a meromorphic extension to the entire complex plane [Se]. Moreover, $s = 0$ is not a pole of $\zeta_A(s)$. We note that the zeta function can also be expressed in terms of the heat operator by
\[ \zeta_A(s) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} \big( \mathrm{Tr}(e^{-tA}) - \dim \ker A \big)\, dt, \qquad \mathrm{Re}(s) > n/m, \tag{0.2} \]
and the well known heat expansion can be used to obtain the analytic continuation of $\zeta_A(s)$ (see e.g. [Gi]). Using the zeta function, the regularized determinant of $A$ is defined by
\[ \det A = \exp\Big( -\frac{d}{ds} \zeta_A(s) \Big|_{s=0} \Big). \tag{0.3} \]
The zeta function regularization was first introduced by Ray and Singer [RS1] to define a regularized determinant for the Laplacian on differential forms twisted by a flat bundle. Hawking [H] has used the same method to regularize quadratic path integrals on a curved background spacetime.

In some cases it is also possible to define a determinant for self-adjoint elliptic operators that are unbounded from below. For example, let $P: C^\infty(E) \to C^\infty(E)$ be an elliptic self-adjoint differential operator of order $m > 0$. For such operators, Atiyah, Patodi and Singer [APS1, APS3] introduced the so-called $\eta$-function which in the half-plane $\mathrm{Re}(s) > n/m$ is given by
\[ \eta_P(s) = \sum_{\lambda_j \ne 0} \frac{\mathrm{sign}\, \lambda_j}{|\lambda_j|^s} = \frac{1}{\Gamma((s+1)/2)} \int_0^\infty t^{(s-1)/2}\, \mathrm{Tr}\big( P e^{-tP^2} \big)\, dt. \tag{0.4} \]
Here $\lambda_j$ runs over the nonzero eigenvalues of $P$. The analytic function $\eta_P(s)$ has a meromorphic extension to $\mathbb{C}$ with isolated simple poles. The residues at all the poles are locally computable, and $s = 0$ is not a pole of $\eta_P(s)$ [APS3, Gi]. The fact that $\eta_P(s)$ is regular at $s = 0$ is less obvious than in the case of the zeta function. Using $\eta_P(0)$, one can define a regularized determinant for $P$ by
\[ \det P = \det |P| \cdot \exp\Big( \frac{\pi i}{2} \big( \eta_P(0) - \zeta_{|P|}(0) \big) \Big) \tag{0.5} \]
[Si].

As explained above, for elliptic self-adjoint operators on a noncompact manifold $X$, in general, the zeta function (0.1) cannot be defined. To introduce an appropriate zeta function, we consider pairs $(A, A_0)$ of elliptic self-adjoint operators such that $A$ is a compactly supported perturbation of $A_0$. Then $A_0$ serves as a reference operator and our goal is to regularize $\det A / \det A_0$. In some special cases, this question has been studied in [Br, GMS, Lu, JL] and [Mu2] (see §4). Assume, in addition, that $A$ and $A_0$ are nonnegative. Then, under some natural assumptions, we can expect that $e^{-tA} - e^{-tA_0}$ is a trace class operator for each $t > 0$, and that $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ has an asymptotic expansion as $t \to 0$. This holds, for example, for spinor Laplace operators on complete Riemannian manifolds [Bu]. In order to define the relative zeta function $\zeta(s; A, A_0)$ by a formula which is analogous to (0.2), we have to investigate the behaviour of $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ as $t \to \infty$. For this purpose we use Krein's theory of the spectral shift function [Kr, BY, Y], which implies that the spectral shift function $\xi(\lambda) = \xi(\lambda; A, A_0)$ of $(A, A_0)$ exists and the following trace formula holds:
\[ \mathrm{Tr}\big( e^{-tA} - e^{-tA_0} \big) = -t \int_0^\infty e^{-t\lambda} \xi(\lambda)\, d\lambda. \tag{0.6} \]

The spectral shift function is locally integrable and locally constant on the complement of $\sigma(A) \cup \sigma(A_0)$. Moreover, $e^{-t\lambda} \xi(\lambda)$ is absolutely integrable on $\mathbb{R}_+$. Thus, if the essential spectrum of $A_0$ has a positive lower bound, then the same holds for $A$ and it follows from (0.6) that there exists $c > 0$ such that
\[ \mathrm{Tr}\big( e^{-tA} - e^{-tA_0} \big) = \dim \ker A - \dim \ker A_0 + O(e^{-ct}) \tag{0.7} \]
as $t \to \infty$. This resembles the behaviour of the trace of the heat operator in the compact case and we can define a relative zeta function by
\[ \zeta(s; A, A_0) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} \big( \mathrm{Tr}(e^{-tA} - e^{-tA_0}) - b \big)\, dt, \qquad \mathrm{Re}(s) \gg 0, \]
where $b = \dim \ker A - \dim \ker A_0$. However, if the continuous spectrum of $A_0$ extends down to zero, (0.7) does not hold. Then the large time behaviour of $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ is more complicated and this is where scattering theory comes into play. First of all, by (0.6), it follows that the large time behaviour of $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ is related to the behaviour of the spectral shift function $\xi(\lambda)$ near $\lambda = 0$. To study $\xi(\lambda)$ near $\lambda = 0$ we shall use scattering theory. Since $e^{-tA} - e^{-tA_0}$ is a trace class operator, the Birman-Krein invariance principle for wave operators [K1, K2] implies that the wave operators $W_\pm(A, A_0)$ exist and are complete and therefore the scattering matrix $S(\lambda) = S(\lambda; A, A_0)$, $\lambda \in \sigma_{ac}(A_0)$, exists. Furthermore, the determinant of the scattering matrix is related to the spectral shift function by
\[ \det S(\lambda) = e^{-2\pi i \xi(\lambda)}, \qquad \lambda \in \sigma_{ac}(A_0), \]
[BK, BY, Y]. Thus, the investigation of the large time behaviour of $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ can be reduced to the study of $\log \det S(\lambda)$ near $\lambda = 0$. For typical examples (see Sect. 4), $\det S(\lambda)$ is differentiable near zero and (0.6) specializes to a more explicit trace formula involving the eigenvalues and the scattering phase shift $\frac{d}{d\lambda} \log \det S(\lambda)$. This trace formula can be used to determine the asymptotic behaviour of $\mathrm{Tr}(e^{-tA} - e^{-tA_0})$ as $t \to \infty$. Then, by a modification of (0.2), we can define a relative zeta function $\zeta(s; A, A_0)$ which is a meromorphic function of $s \in \mathbb{C}$. This zeta function has some new features that do not occur in the compact case. For example, there are poles whose residues are related to the derivatives of $\det S(\lambda)$ at $\lambda = 0$ and therefore these residues are not locally computable as in the compact case. If $\zeta(s; A, A_0)$ is holomorphic at $s = 0$, we can introduce the regularized ratio
\[ \det(A, A_0) = \frac{\det A}{\det A_0} \]
by the analogous formula (0.3) with $\zeta_A(s)$ replaced by $\zeta(s; A, A_0)$.
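As a finite-dimensional sanity check of this definition, one can take two positive definite matrices, for which both determinants exist and nothing needs regularizing. The sketch below is a toy illustration only: the spectra are made-up numbers, the function names are hypothetical, and the derivative at $s = 0$ is approximated by a central difference. In this setting the relative zeta function is just the difference of two finite sums of the form (0.1), and the recipe of (0.3) returns the ordinary ratio of determinants.

```python
import math

def relative_zeta(s, spec_a, spec_a0):
    # zeta(s; A, A0) for finite, strictly positive spectra: the difference of
    # the two eigenvalue sums of the form (0.1).
    return sum(lam ** (-s) for lam in spec_a) - sum(mu ** (-s) for mu in spec_a0)

def relative_det(spec_a, spec_a0, h=1e-5):
    # det(A, A0) = exp(-d/ds zeta(s; A, A0) at s = 0), with the derivative
    # replaced by a central difference of step h.
    dzeta = (relative_zeta(h, spec_a, spec_a0) - relative_zeta(-h, spec_a, spec_a0)) / (2 * h)
    return math.exp(-dzeta)

spec_a = [1.0, 2.5, 4.0]    # hypothetical eigenvalues of A
spec_a0 = [0.5, 3.0, 4.0]   # hypothetical eigenvalues of A0
print(relative_det(spec_a, spec_a0), math.prod(spec_a) / math.prod(spec_a0))
```

The point of the relative construction is that the difference of heat traces can remain meaningful when the two individual sums do not, which this toy case cannot show; it only confirms that the definition is consistent with the ordinary determinant where both exist.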


We note that we may think of $A_0$ as being a reference operator and regard $\det(A, A_0)$ as a function of $A$. In particular, we can use $\det(A, A_0)$ in much the same way as $\det A$ is used in the case of elliptic operators on a compact manifold.

Next we briefly describe the content of this paper. In Sect. 1, we introduce relative zeta functions for pairs $(H, H_0)$ of self-adjoint operators in a Hilbert space. In Sect. 2, we use Krein's theory of the spectral shift function to relate the large time behaviour of $\operatorname{Tr}\bigl(e^{-tA} - e^{-tA_0}\bigr)$ to properties of the scattering matrix near zero. Then relative determinants are introduced and studied in Sect. 3. Finally, in Sect. 4, we discuss several natural examples: the Schrödinger operator in $\mathbb{R}^n$, perturbations of the Euclidean Laplacian, manifolds with cylindrical ends and surfaces with hyperbolic ends.

1. Relative Zeta Functions

In this section we study relative zeta functions in the abstract setting. Let $\mathcal{H}$ be a separable Hilbert space and let $H$, $H_0$ be two self-adjoint nonnegative linear operators in $\mathcal{H}$. We assume that the following conditions hold:

1) Let $e^{-tH}$ and $e^{-tH_0}$ be the heat semigroups associated to $H$ and $H_0$, respectively. Then

$$e^{-tH} - e^{-tH_0} \ \text{is a trace class operator for } t > 0. \tag{1.1}$$

2) As $t \to 0$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) \sim \sum_{j=0}^{\infty} \sum_{k=0}^{k(j)} a_{jk}\, t^{\alpha_j} \log^k t, \tag{1.2}$$

where $-\infty < \alpha_0 < \alpha_1 < \cdots$ and $\alpha_k \to \infty$. Moreover, if $\alpha_j = 0$ we assume that $a_{jk} = 0$ for $k > 0$.

3) As $t \to \infty$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) \sim \sum_{k=0}^{\infty} b_k\, t^{-\beta_k}, \tag{1.3}$$

where $0 = \beta_0 < \beta_1 < \cdots$.

It follows from (1.2) that the integral

$$\int_0^1 t^{s-1} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt$$

is absolutely convergent in the half-plane $\operatorname{Re}(s) > -\alpha_0$ and has a meromorphic continuation to the whole complex plane. The possible poles occur at $s = -\alpha_j$, $j \in \mathbb{N}$, and at a given pole $s = -\alpha_j$, the Laurent expansion is given by

$$\sum_{k=0}^{k(j)} a_{jk}\, \frac{k!}{(s + \alpha_j)^{k+1}} + O(1). \tag{1.4}$$

Set

$$\zeta_1(s; H, H_0) = \frac{1}{\Gamma(s)} \int_0^1 t^{s-1} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt. \tag{1.5}$$

Then $\zeta_1(s; H, H_0)$ is a meromorphic function of $s \in \mathbb{C}$. Poles may occur at $s = -\alpha_j$, $j \in \mathbb{N}$, and the corresponding Laurent expansion is determined by (1.2). By our assumption about the asymptotic expansion (1.2), $\zeta_1(s; H, H_0)$ is holomorphic at $s = 0$. Next consider the integral

$$\int_1^\infty t^{s-1} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt.$$

By (1.3), this integral is absolutely convergent in the half-plane $\operatorname{Re}(s) < \beta_0$ and admits a meromorphic continuation to $\mathbb{C}$. Set

$$\zeta_2(s; H, H_0) = \frac{1}{\Gamma(s)} \int_1^\infty t^{s-1} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt. \tag{1.6}$$

Then $\zeta_2(s; H, H_0)$ is a meromorphic function of $s \in \mathbb{C}$ with possible poles at $s = \beta_k$, $k \in \mathbb{N}^+$. All poles are simple and we have

$$\operatorname*{Res}_{s=\beta_k} \zeta_2(s; H, H_0) = \frac{b_k}{\Gamma(\beta_k)}, \quad k > 0.$$

Furthermore, $\zeta_2(s; H, H_0)$ is holomorphic at $s = 0$. Put

$$\zeta(s; H, H_0) = \zeta_1(s; H, H_0) + \zeta_2(s; H, H_0). \tag{1.7}$$

We call $\zeta(s; H, H_0)$ the relative zeta function of $(H, H_0)$. Summarizing, we have proved

Proposition 1.1. Assume that the pair $(H, H_0)$ of nonnegative self-adjoint operators satisfies conditions (1.1)–(1.3). Then the relative zeta function $\zeta(s; H, H_0)$ is a meromorphic function of $s \in \mathbb{C}$. The set of poles of $\zeta(s; H, H_0)$ is contained in the set $\{-\alpha_j \mid j \in \mathbb{N}\} \cup \{\beta_k \mid k \in \mathbb{N}^+\}$ and $s = 0$ is a regular point.

Example 1.1. Let $M$ be a closed Riemannian manifold of dimension $n$, let $E$ be a Hermitian vector bundle over $M$ and let $D_i : C^\infty(M, E) \to C^\infty(M, E)$, $i = 0, 1$, be an elliptic self-adjoint differential operator of order $m_i > 0$ which is nonnegative. Let $H_i$, $i = 0, 1$, be the unique self-adjoint extension of $D_i$ in $L^2(M, E)$. It is well known that $\exp(-tH_i)$ is a trace class operator for $t > 0$ and, as $t \to 0$, there exists an asymptotic expansion

$$\operatorname{Tr}\bigl(e^{-tH_i}\bigr) \sim t^{-n/m_i} \sum_{j \ge 0} a_{ij}\, t^{j}. \tag{1.8}$$

Let $h_i = \dim \operatorname{Ker} D_i$, $i = 0, 1$. Then, as $t \to \infty$, we have

$$\operatorname{Tr}\bigl(e^{-tH_i}\bigr) = h_i + O(e^{-ct}). \tag{1.9}$$

Thus, (1.2) and (1.3) are a consequence of (1.8) and (1.9). In the present case, the zeta function $\zeta_{H_i}(s)$ of $H_i$ is defined by (0.1) and the relative zeta function of $(H_1, H_0)$ can be expressed in terms of the absolute zeta functions by

$$\zeta(s; H_1, H_0) = \zeta_{H_1}(s) - \zeta_{H_0}(s), \tag{1.10}$$

which justifies the name relative zeta function.

We note that (1.8) is the general form of the short time asymptotic expansion of the trace of the heat operator of an elliptic operator on a compact manifold. In particular, there occur no logarithmic terms in this expansion. However, if we pass to the noncompact setting, then, in general, logarithmic terms naturally arise in the short time asymptotic expansion of the trace of the difference of two heat semigroups. Examples are hyperbolic manifolds or, more generally, manifolds with cusps. Furthermore, the large time behaviour (1.9) of the trace of the heat semigroup also has a very simple form. In the noncompact case, (1.3) is related to the behaviour of the continuous spectrum near zero. If the essential spectrum of $H_0$ has a positive lower bound, then the expansion (1.3) is similar to (1.9) (see Lemma 2.2).

Finally, we observe that a relative version of the η-function (0.4) can be defined in the same way. Suppose that $D$ and $D_0$ are self-adjoint operators in $\mathcal{H}$ such that the following three conditions hold:

1) Let $e^{-tD^2}$ and $e^{-tD_0^2}$ be the heat semigroups of the nonnegative self-adjoint operators $D^2$ and $D_0^2$, respectively. Then

$$D e^{-tD^2} - D_0 e^{-tD_0^2} \ \text{is a trace class operator for } t > 0. \tag{1.11}$$

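2) As $t \to 0$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr) \sim \sum_{j=0}^{\infty} \sum_{k=0}^{k(j)} A_{jk}\, t^{\kappa_j} \log^k t, \tag{1.12}$$

where $-\infty < \kappa_0 < \kappa_1 < \cdots \to \infty$.

3) As $t \to \infty$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr) \sim \sum_{k=0}^{\infty} B_k\, t^{-\theta_k}, \tag{1.13}$$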

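where $0 < \theta_0 < \theta_1 < \cdots \to \infty$. Then we can introduce partial η-functions by

$$\eta_1(s; D, D_0) = \frac{1}{\Gamma\bigl((s+1)/2\bigr)} \int_0^1 t^{(s-1)/2} \operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr)\, dt, \quad \operatorname{Re}(s) > -\kappa_0; \tag{1.14}$$

$$\eta_2(s; D, D_0) = \frac{1}{\Gamma\bigl((s+1)/2\bigr)} \int_1^\infty t^{(s-1)/2} \operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr)\, dt, \quad \operatorname{Re}(s) < \theta_0. \tag{1.15}$$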

Using the asymptotic expansions (1.12) and (1.13), the partial η-functions can be continued analytically to the whole complex plane. Set

$$\eta(s; D, D_0) = \eta_1(s; D, D_0) + \eta_2(s; D, D_0). \tag{1.16}$$

This is the relative η-function associated to $(D, D_0)$. If $\kappa_j \neq -1/2$ for all $j \in \mathbb{N}$, then $\eta(s; D, D_0)$ is regular at $s = 0$.


2. The Spectral Shift Function and the Large Time Asymptotic Expansion

In this section, we study the spectral shift function of $(H, H_0)$ and we investigate the connection between its behaviour at $0$ (resp. $\infty$) and the asymptotic expansion (1.3) (resp. (1.2)). First, we introduce some notation. Let $A$ be a self-adjoint operator in a Hilbert space. Then we denote by $\sigma(A)$ the spectrum of $A$. Moreover, by $\sigma_{pp}(A)$, $\sigma_{sc}(A)$, $\sigma_{ac}(A)$ and $\sigma_{ess}(A)$ we shall denote the pure point spectrum, the singular continuous, the absolutely continuous and the essential spectrum of $A$, respectively.

Next we recall some basic results of Krein [Kr] and Birman–Krein [BK] concerning the spectral shift function. See also [BY, Y]. Let $A$ and $A_0$ be bounded self-adjoint operators in $\mathcal{H}$ and suppose that $V = A - A_0$ is a trace class operator. Let $R_0(z) = (A_0 - z)^{-1}$. Then the spectral shift function

$$\xi(\lambda) = \xi(\lambda; A, A_0) = \pi^{-1} \lim_{\varepsilon \to 0} \arg\det\bigl(1 + V R_0(\lambda + i\varepsilon)\bigr) \tag{2.1}$$

exists for a.e. $\lambda \in \mathbb{R}$. Note that $\arg\det\bigl(1 + V R_0(\lambda + i\varepsilon)\bigr)$ is well defined by the condition that it should tend to zero as $\varepsilon \to \infty$. The spectral shift function $\xi(\lambda)$ is real valued, belongs to $L^1(\mathbb{R})$ and satisfies

$$\operatorname{Tr}(A - A_0) = \int_{\mathbb{R}} \xi(\lambda)\, d\lambda, \qquad \|\xi\|_{L^1} \le \|A - A_0\|_1, \tag{2.2}$$

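where $\|\cdot\|_1$ denotes the trace norm. Moreover, let $I(A, A_0)$ be the smallest interval containing $\sigma(A) \cup \sigma(A_0)$. Then

$$\xi(\lambda) = 0 \quad \text{for } \lambda \notin I(A, A_0). \tag{2.3}$$

Let

$$G = \Bigl\{ f : \mathbb{R} \to \mathbb{R} \ \Big|\ f \in L^1 \text{ and } \int_{\mathbb{R}} |\hat f(p)|\,(1 + |p|)\, dp < \infty \Bigr\}.$$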

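Then for every $\varphi \in G$, $\varphi(A) - \varphi(A_0)$ is a trace class operator and

$$\operatorname{Tr}\bigl(\varphi(A) - \varphi(A_0)\bigr) = \int_{\mathbb{R}} \varphi'(\lambda)\, \xi(\lambda)\, d\lambda. \tag{2.4}$$

Note that (2.4) determines the spectral shift function up to an additive constant and this constant is fixed by (2.3). Applied to our setting, we obtain

Proposition 2.1. Let $H$, $H_0$ be two nonnegative self-adjoint operators in $\mathcal{H}$ and assume that $e^{-tH} - e^{-tH_0}$ is a trace class operator for $t > 0$. Then there exists a unique real valued locally integrable function $\xi(\lambda) = \xi(\lambda; H, H_0)$ on $\mathbb{R}$ such that for each $t > 0$, $e^{-t\lambda}\xi(\lambda) \in L^1(\mathbb{R})$ and the following conditions hold:

1) $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = -t \int_0^\infty e^{-t\lambda}\, \xi(\lambda)\, d\lambda$.

2) For every $\varphi \in G$, $\varphi(H) - \varphi(H_0)$ is a trace class operator and

$$\operatorname{Tr}\bigl(\varphi(H) - \varphi(H_0)\bigr) = \int_{\mathbb{R}} \varphi'(\lambda)\, \xi(\lambda)\, d\lambda.$$

3) $\xi(\lambda) = 0$ for $\lambda < 0$.

Proof. First, we note that 2) and 3) determine $\xi(\lambda)$ uniquely. Indeed, if $\xi_1$ and $\xi_2$ are any locally integrable functions on $\mathbb{R}$ satisfying 2) and 3), then $\xi_1 - \xi_2$ is a constant which must be zero by 3). To prove existence, let $\xi_t(\lambda) = \xi(\lambda; e^{-tH}, e^{-tH_0})$ be the spectral shift function of $(e^{-tH}, e^{-tH_0})$. Then $\xi_t \in L^1(\mathbb{R})$ and by (2.3), $\xi_t(\lambda) = 0$ for $\lambda \notin [0, 1]$. Let $\varphi \in C_0^\infty(\mathbb{R})$ and set

$$f(\mu) = \begin{cases} \varphi\bigl(-\tfrac{1}{t} \ln \mu\bigr), & \text{if } 0 < \mu; \\ 0, & \text{if } \mu \le 0. \end{cases}$$

Then $f \in C_0^\infty(\mathbb{R})$ with $\operatorname{supp} f \subset (0, \infty)$. Changing variables by $\mu = e^{-t\lambda}$ and using (2.4), we obtain

$$\operatorname{Tr}\bigl(\varphi(H) - \varphi(H_0)\bigr) = \int_{\mathbb{R}} \varphi'(\lambda)\, \xi_t(e^{-t\lambda})\, d\lambda,$$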

which holds for every $\varphi \in C_0^\infty(\mathbb{R})$. Furthermore, we note that $\xi_t(e^{-t\lambda})$ is locally integrable and $\xi_t(e^{-t\lambda}) = 0$ for $\lambda < 0$. Hence, by uniqueness, $\xi_t(e^{-t\lambda})$ is independent of $t$. We denote this function by $\xi(\lambda) = \xi(\lambda; H, H_0)$. It satisfies 2) and 3). Moreover, using (2.2) we get 1) and $e^{-t\lambda}\xi(\lambda)$ is integrable on $\mathbb{R}$. The extension to functions in $G$ is straightforward.

The function $\xi(\lambda; H, H_0)$ is called the spectral shift function of $H$, $H_0$. From Proposition 2.1, it follows that, in order to determine the asymptotic behaviour of $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)$ as $t \to \infty$, we have to investigate the behaviour of the spectral shift function $\xi(\lambda)$ near $\lambda = 0$. As a first consequence of Proposition 2.1 we obtain

Lemma 2.2. Suppose that $\sigma_{ess}(H_0) \subset [c, \infty)$, where $c > 0$. Then $\ker H$ and $\ker H_0$ are both finite dimensional and there exists $c_1 > 0$ such that

$$\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = \dim\ker H - \dim\ker H_0 + O(e^{-c_1 t}) \tag{2.5}$$

as $t \to \infty$.

Proof. First, we observe that by (1.1) and the invariance of the essential spectrum under compact perturbations [RS], it follows that $\sigma_{ess}(H) = \sigma_{ess}(H_0) \subset [c, \infty)$, $c > 0$. Therefore, $H_0$ and $H$ have pure point spectrum in $[0, c)$. In particular, $\ker H_0$ and $\ker H$ are finite dimensional. Let $\lambda_1$ (resp. $\tilde\lambda_1$) be the infimum of the nonzero spectrum of $H$ (resp. $H_0$) and set $\mu = \tfrac{1}{2}\min(\lambda_1, \tilde\lambda_1)$. By assumption, we have $0 < \mu \le c$ and it follows from Proposition 2.1, 2), that

$$\xi(\lambda) = \dim\ker H_0 - \dim\ker H, \quad 0 \le \lambda \le \mu.$$

Put

$$\tilde\xi(\lambda) = \xi(\lambda) + \dim\ker H - \dim\ker H_0, \quad \lambda \in \mathbb{R}.$$

Then we have $\tilde\xi(\lambda) = 0$ for $\lambda < \mu$. Moreover, $\tilde\xi(\lambda) e^{-t\lambda}$ is integrable as a function of $\lambda \in \mathbb{R}$. Hence, we obtain

$$-t \int_0^\infty e^{-t\lambda}\, \xi(\lambda)\, d\lambda = \dim\ker H - \dim\ker H_0 - t \int_\mu^\infty e^{-t\lambda}\, \tilde\xi(\lambda)\, d\lambda.$$

The integral on the right hand side can be estimated by

$$e^{-t\mu/2} \int_\mu^\infty |\tilde\xi(\lambda)|\, e^{-t\lambda/2}\, d\lambda \le C e^{-t\mu/2}$$

for $t \ge 1$. Using Proposition 2.1, 1), we obtain (2.5).
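For finite matrices the statements of Proposition 2.1, 1) and Lemma 2.2 can be checked numerically. The sketch below is only an illustration and is not taken from the paper: it assumes that $H$ and $H_0$ act on the same finite-dimensional space and are described by hypothetical eigenvalue lists, in which case conditions 2) and 3) of Proposition 2.1 force $\xi(\lambda)$ to be the number of eigenvalues of $H_0$ not exceeding $\lambda$ minus the corresponding number for $H$. The function names are made up for this example.

```python
import math


def heat_trace_difference(eigs_h, eigs_h0, t):
    """Tr(e^{-tH} - e^{-tH0}) for operators given by finite eigenvalue lists."""
    return (sum(math.exp(-t * lam) for lam in eigs_h)
            - sum(math.exp(-t * lam) for lam in eigs_h0))


def spectral_shift(eigs_h, eigs_h0, lam):
    """Finite-dimensional spectral shift function (assumed form, see lead-in):
    #{eigenvalues of H0 <= lam} - #{eigenvalues of H <= lam}, and 0 for lam < 0."""
    if lam < 0:
        return 0
    return (sum(1 for e in eigs_h0 if e <= lam)
            - sum(1 for e in eigs_h if e <= lam))


def laplace_side(eigs_h, eigs_h0, t, upper=10.0, steps=50000):
    """Midpoint-rule approximation of -t * int_0^upper e^{-t lam} xi(lam) d lam.
    `upper` must exceed all eigenvalues so that xi vanishes beyond it."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += math.exp(-t * lam) * spectral_shift(eigs_h, eigs_h0, lam)
    return -t * total * h


if __name__ == "__main__":
    eigs_h = [0.0, 0.0, 1.0, 2.5]    # hypothetical spectrum of H, dim ker H = 2
    eigs_h0 = [0.0, 0.5, 1.5, 3.0]   # hypothetical spectrum of H0, dim ker H0 = 1
    print(heat_trace_difference(eigs_h, eigs_h0, 1.0))
    print(laplace_side(eigs_h, eigs_h0, 1.0))
    print(heat_trace_difference(eigs_h, eigs_h0, 30.0))
```

The first two printed values should agree up to quadrature error, and the third should be close to $\dim\ker H - \dim\ker H_0 = 1$, in line with (2.5).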

Thus, if the essential spectrum of $H_0$, and consequently also the essential spectrum of $H$, has a positive lower bound, then as $t \to \infty$, $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)$ behaves in the same way as the trace of the heat operator of an elliptic nonnegative self-adjoint differential operator on a compact manifold. Let $h = \dim\ker H - \dim\ker H_0$. It follows from (2.5) that under the assumption of Lemma 2.2, the relative zeta function can be defined by

$$\zeta(s; H, H_0) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} \Bigl( \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) - h \Bigr)\, dt$$

for $\operatorname{Re}(s) > -\alpha_0$. This is the analogue of (0.2). A special case is that $H, H_0 > 0$. Then (1.2) implies that for $\operatorname{Re}(s) > -\alpha_0$,

$$H^{-s} - H_0^{-s} = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} \bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt,$$

and $H^{-s} - H_0^{-s}$ is a trace class operator for $\operatorname{Re}(s) > -\alpha_0$. Hence,

$$\zeta(s; H, H_0) = \operatorname{Tr}\bigl(H^{-s} - H_0^{-s}\bigr), \quad \operatorname{Re}(s) > -\alpha_0.$$

Now assume that the continuous spectrum of $H_0$ extends down to zero. Then the behaviour of the spectral shift function at $\lambda = 0$, in general, can be very complicated. We assume that near $\lambda = 0$, $\xi(\lambda)$ admits an asymptotic expansion in powers of $\lambda$, $\lambda \ge 0$. This implies (1.3). More precisely, we have

Lemma 2.3. Suppose there exist $\varepsilon > 0$ and a sequence $0 \le \gamma_0 < \gamma_1 < \gamma_2 < \cdots$ with $\gamma_i \to \infty$ such that for every $N \in \mathbb{N}$,

$$\xi(\lambda) = \sum_{j=0}^{N} c_j\, \lambda^{\gamma_j} + O(\lambda^{\gamma_{N+1}}) \tag{2.6}$$

uniformly for $\lambda \in [0, \varepsilon]$. Then there exists an asymptotic expansion of the form

$$\int_0^\infty \xi(\lambda)\, e^{-t\lambda}\, d\lambda \sim t^{-1} \Bigl( \frac{b_0}{t^{\gamma_0}} + \frac{b_1}{t^{\gamma_1}} + \cdots \Bigr)$$

as $t \to \infty$.


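Proof. We split the integral as $\int_0^\infty = \int_0^\varepsilon + \int_\varepsilon^\infty$. Since $\xi(\lambda) e^{-t\lambda}$ is integrable we obtain

$$\Bigl| \int_\varepsilon^\infty \xi(\lambda)\, e^{-t\lambda}\, d\lambda \Bigr| \le C e^{-t\varepsilon/2}, \quad t \ge 1.$$

Furthermore, note that for $\gamma \ge 0$ and $t \ge 1$, we have

$$\int_0^\varepsilon \lambda^\gamma e^{-t\lambda}\, d\lambda = \int_0^\infty \lambda^\gamma e^{-t\lambda}\, d\lambda - \int_\varepsilon^\infty \lambda^\gamma e^{-t\lambda}\, d\lambda = \frac{C(\gamma)}{t^{\gamma+1}} + O(e^{-t\varepsilon/2}).$$

Replacing $\xi(\lambda)$ by (2.6), we get the desired result.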
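The key step of this proof can be checked numerically. The following sketch is an illustration under stated assumptions, not part of the paper: it identifies the constant $C(\gamma)$ with $\Gamma(\gamma + 1)$, which follows from the standard Gamma integral, and compares a midpoint-rule approximation of the truncated integral with the leading term $\Gamma(\gamma+1)/t^{\gamma+1}$. The function names and sample parameters are hypothetical.

```python
import math


def truncated_moment(gamma, t, eps, steps=200000):
    """Midpoint-rule approximation of int_0^eps lam^gamma e^{-t lam} d lam."""
    h = eps / steps
    return h * sum(((i + 0.5) * h) ** gamma * math.exp(-t * (i + 0.5) * h)
                   for i in range(steps))


def leading_term(gamma, t):
    """Gamma(gamma + 1) / t^(gamma + 1), i.e. the integral over all of [0, infinity)."""
    return math.gamma(gamma + 1.0) / t ** (gamma + 1.0)


if __name__ == "__main__":
    gamma, t, eps = 1.5, 40.0, 1.0   # sample values chosen for illustration
    approx = truncated_moment(gamma, t, eps)
    exact = leading_term(gamma, t)
    # The relative difference is governed by the exponentially small tail.
    print(approx, exact, abs(approx - exact) / exact)
```

For large $t$ the two values differ only by the tail $\int_\varepsilon^\infty$, which is exponentially small, so each power $\lambda^{\gamma_j}$ in (2.6) contributes a term of order $t^{-\gamma_j - 1}$ to the expansion.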

To investigate the asymptotic behaviour of the spectral shift function near $\lambda = 0$, one may use the relation between the spectral shift function and the scattering phase [BK, BY]. Let $\mathcal{H}_{ac}$ (resp. $\mathcal{H}_{0,ac}$) be the absolutely continuous subspace for $H$ (resp. $H_0$) and let $H_{ac}$ (resp. $H_{0,ac}$) be the restriction of $H$ (resp. $H_0$) to $\mathcal{H}_{ac}$ (resp. $\mathcal{H}_{0,ac}$). Furthermore, let $P_0$ be the orthogonal projection of $\mathcal{H}$ onto $\mathcal{H}_{0,ac}$. It follows from (1.1) and the Birman–Krein invariance principle [BK, K1] that the wave operators

$$W_\pm = W_\pm(H, H_0) = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0} P_0 \tag{2.7}$$

exist and are complete. This means that the strong limit (2.7) exists and the operators $W_\pm$ define isometries of $\mathcal{H}_{0,ac}$ onto $\mathcal{H}_{ac}$, intertwining $H_{0,ac}$ and $H_{ac}$. In this context, the scattering operator $S$ is defined by

$$S = W_-^* W_+. \tag{2.8}$$

This is a unitary operator on $\mathcal{H}_{0,ac}$ that commutes with $H_{0,ac}$. Let $\sigma_0 = \sigma_{ac}(H_0)$ and let $\{E_0(\lambda)\}_{\lambda \in \sigma_0}$ be the spectral family of $H_{0,ac}$. Since $S$ commutes with $H_{0,ac}$ we have

$$S = \int_{\sigma_0} S(\lambda)\, dE_0(\lambda),$$

where $S(\lambda) = S(\lambda; H, H_0)$ acts in the Hilbert space $\mathcal{H}(\lambda)$. The operator $S(\lambda)$ is called the on-shell scattering matrix. We observe that $S(\lambda)$ is a unitary operator in $\mathcal{H}(\lambda)$. Let $I_\lambda$ be the identity of $\mathcal{H}(\lambda)$. Since $S(\lambda; H, H_0) = S(e^{-t\lambda}; e^{-tH}, e^{-tH_0})$ and $\exp(-tH) - \exp(-tH_0)$ is a trace class operator, it follows that

$$T(\lambda) = S(\lambda) - I_\lambda \tag{2.9}$$

is a trace class operator (cf. [BK, BY]). The operator $T(\lambda)$ is called the scattering amplitude. Thus, the Fredholm determinant $\det S(\lambda)$ of $S(\lambda)$ exists. Since $S(\lambda)$ is unitary, we have $|\det S(\lambda)| = 1$, $\lambda \in \sigma_{ac}(H_0)$. Hence, the scattering phase $s(\lambda) = \frac{1}{2\pi i} \log\det S(\lambda)$ is defined mod $\mathbb{Z}$. It was proved in [BK] that there is a distinguished choice for the scattering phase, namely the spectral shift function $\xi(\lambda)$. More precisely, the following identity holds

$$\det S(\lambda) = e^{-2\pi i \xi(\lambda)}, \quad \text{a.e. } \lambda \in \sigma_{ac}(H_0), \tag{2.10}$$

(see [BY, Y]). This equality can be used to derive an expansion of $\xi(\lambda)$ near $\lambda = 0$. For $\varepsilon > 0$, let $D_\varepsilon = \{z \in \mathbb{C} \mid |z| < \varepsilon\}$ and denote by $\hat D_{\varepsilon,m} \to D_\varepsilon$ the ramified covering of order $m$, defined by $z \mapsto z^m$. Then, as a consequence of (2.10), we obtain


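Proposition 2.4. Let $\varepsilon > 0$ and suppose that $(0, \varepsilon) \subset \sigma_{ac}(H_0)$. Furthermore, assume that

1) $\xi(\lambda)$ is continuous on $(0, \varepsilon)$;

2) There exists $m \in \mathbb{N}$ such that $\det S(\lambda)$, $\lambda \in (0, \varepsilon)$, extends to a holomorphic function on $\hat D_{\varepsilon,m}$.

Then there exists an expansion of the form

$$\xi(\lambda) = \sum_{k=0}^{\infty} c_k\, \lambda^{k/m}, \quad \lambda \in [0, \varepsilon_1), \tag{2.11}$$

for some $0 < \varepsilon_1 < \varepsilon^{1/m}$.

Proof. According to our assumption, $\det S(\lambda)$ is smooth on $(0, \varepsilon)$. Since $|\det S(\lambda)| = 1$, $\lambda \in \sigma_{ac}(H_0)$, it follows that there exists $\delta \in C^\infty((0, \varepsilon))$ such that $\det S(\lambda) = \exp(-2\pi i \delta(\lambda))$, $\lambda \in (0, \varepsilon)$. On the other hand, $\xi$ is continuous on $(0, \varepsilon)$ and therefore, (2.10) implies that there exists $k \in \mathbb{Z}$ such that $\xi(\lambda) = \delta(\lambda) + k$, $\lambda \in (0, \varepsilon)$. Hence, $\xi$ is smooth on $(0, \varepsilon)$. Differentiating (2.10) and using that $S(\lambda)$ is unitary, we get

$$\frac{d\xi(\lambda)}{d\lambda} = -\frac{1}{2\pi i} \operatorname{Tr}\Bigl( S^*(\lambda) \frac{dS(\lambda)}{d\lambda} \Bigr), \quad \lambda \in (0, \varepsilon). \tag{2.12}$$

Furthermore, since $\delta(\lambda)$ has a finite limit as $\lambda \to 0$, we get

$$\xi(\lambda) = \xi(0+) - \frac{1}{2\pi i} \int_0^\lambda \operatorname{Tr}\Bigl( S^*(u) \frac{dS(u)}{du} \Bigr)\, du = \xi(0+) - \frac{1}{2\pi i} \int_0^{\lambda^{1/m}} \operatorname{Tr}\Bigl( S^*(u^m) \frac{dS(u^m)}{du} \Bigr)\, du \tag{2.13}$$

for $\lambda \in [0, \varepsilon)$. By assumption, $\operatorname{Tr}\bigl( S^*(z^m)\, dS(z^m)/dz \bigr)$ is analytic on the disc $|z| < \varepsilon_1 \le \varepsilon^{1/m}$. Inserting the Taylor expansion on the right hand side of (2.13) we obtain the desired expansion (2.11).

Let

$$\operatorname{Tr}\Bigl( S^*(z^m) \frac{dS(z^m)}{dz} \Bigr) = \sum_{k=0}^{\infty} d_k\, z^k, \quad |z| < \varepsilon_1,$$

be the Taylor expansion at $z = 0$. Then by (2.13), the coefficients of the expansion (2.11) are related to the $d_k$'s by $c_k = d_{k-1}/k$, $k \ge 1$. Thus, under the assumptions of Proposition 2.4, the asymptotic expansion of $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)$ for $t \to \infty$ determines $\det S(\lambda)$ near $\lambda = 0$, up to an additive constant. This means that the coefficients $b_k$ in (1.3) and therefore, the residues of the poles of $\zeta(s; H, H_0)$ at $s = k/m$, $k \in \mathbb{N}^+$, are nonlocal. If we insert $\xi(\lambda)$ in the trace formula of Proposition 2.1 and integrate by parts, we obtain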


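$$\begin{aligned} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) &= -\xi(0+) + \frac{1}{2\pi i} \int_0^{\varepsilon} e^{-t\lambda} \operatorname{Tr}\Bigl( S^*(\lambda) \frac{dS(\lambda)}{d\lambda} \Bigr)\, d\lambda + O(e^{-t\varepsilon/2}) \\ &= -\xi(0+) + \frac{1}{2\pi i} \int_0^{\varepsilon_1} e^{-t\lambda^m} \operatorname{Tr}\Bigl( S^*(\lambda^m) \frac{dS(\lambda^m)}{d\lambda} \Bigr)\, d\lambda + O(e^{-t\varepsilon/2}) \end{aligned} \tag{2.14}$$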

as $t \to \infty$. This formula shows more explicitly how the scattering matrix determines the asymptotic behaviour of $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)$ as $t \to \infty$. Furthermore, in special cases, $\xi(0+)$ can be expressed in terms of spectral data. It follows from (2.14) that $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = -\xi(0+) + O(t^{-1/m})$ as $t \to \infty$. Inserting this equality in (1.6), we obtain the analytic extension of $\zeta_2(s; H, H_0)$ to the half-plane $\operatorname{Re}(s) < 1/m$.

Now, in order to define the relative determinant of $H$, $H_0$ by zeta function regularization, it is sufficient to know that $\zeta(s; H, H_0)$ is analytic in some half-plane $\operatorname{Re}(s) < \varepsilon$, $\varepsilon > 0$. Therefore, we may replace (1.3) by the following weaker assumption: There exist $b_0 \in \mathbb{C}$ and $\rho > 0$ such that

$$\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = b_0 + O(t^{-\rho}) \tag{2.15}$$

as $t \to \infty$. As above, this condition can be restated in terms of the behaviour of the spectral shift function near $\lambda = 0$. In fact, by Proposition 2.1, 1), (2.15) is equivalent to

$$\xi(\lambda) = -b_0 + O(\lambda^\rho) \tag{2.16}$$

as $\lambda \to 0+$. For this condition to hold, it is not necessary that the scattering matrix extends to an analytic function on some covering of a neighborhood of the origin. Much less stringent assumptions are sufficient. Now, suppose that (2.16) holds. Then $\zeta_2(s; H, H_0)$ has a meromorphic extension to the half-plane $\operatorname{Re}(s) < \rho$ which is given by

$$\zeta_2(s; H, H_0) = -\frac{b_0}{\Gamma(s+1)} + \frac{1}{\Gamma(s)} \int_1^\infty t^{s-1} \Bigl( \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) - b_0 \Bigr)\, dt. \tag{2.17}$$

This formula implies that ζ2 (s; H, H0 ) is regular at s = 0. Analogous results can be obtained for the small time asymptotic expansion (1.2). If we impose smoothness assumptions at λ = ∞ similar to those at λ = 0, then it follows that the asymptotic expansion (1.2) corresponds to an asymptotic expansion of the scattering phase shift d/dλ(log det S(λ)) as λ → ∞. Next, we briefly discuss relative η-functions. Details will appear elsewhere. Let D, D0 be self-adjoint operators in H, satisfying (1.11) and (1.12). Moreover, suppose that the spectra of D and D0 have a common gap, i.e. there exists an interval [a, b], a < b, such that (σ(D) ∪ σ(D0 )) ∩ [a, b] = ∅.

(2.18)


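Then, by a modification of the proof of Proposition 2.1, one can introduce a spectral shift function $\xi(\lambda) = \xi(\lambda; D, D_0)$. This function has the following properties:

1) $\xi \in L^1_{loc}(\mathbb{R})$ and $\xi(\lambda) = 0$ for $\lambda \in [a, b]$;

2) for all $\varphi \in C_0^\infty(\mathbb{R})$, $\varphi(D) - \varphi(D_0)$ is a trace class operator and

$$\operatorname{Tr}\bigl(\varphi(D) - \varphi(D_0)\bigr) = \int_{\mathbb{R}} \varphi'(\lambda)\, \xi(\lambda)\, d\lambda; \tag{2.19}$$

3)

$$\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr) = \int_{\mathbb{R}} \frac{d}{d\lambda}\bigl(\lambda e^{-t\lambda^2}\bigr)\, \xi(\lambda)\, d\lambda. \tag{2.20}$$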

We note that 1) and 2) determine $\xi$ uniquely. If, however, (2.18) is not satisfied, i.e. if $\sigma(D) = \sigma(D_0) = \mathbb{R}$, then the construction of the spectral shift function is slightly more complicated. In particular, $\xi$ cannot be normalized by fixing its values outside the spectrum and therefore, $\xi$ is determined by (2.19) only up to an additive constant.

By (2.20), the study of the behaviour of $\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr)$ as $t \to \infty$ can be reduced to the investigation of the spectral shift function near $\lambda = 0$. To handle this problem, one can use, in the same way as above, the connection between the spectral shift function and the scattering phase. Indeed, since $D e^{-tD^2} - D_0 e^{-tD_0^2}$ is a trace class operator for $t > 0$, it follows from the Birman–Kato invariance principle that the wave operators $W_\pm(D, D_0)$ exist and are complete [K2]. Hence, the scattering matrix $S(\lambda; D, D_0)$ also exists and it follows from [BK] that

$$\det S(\lambda; D, D_0) = e^{-2\pi i \xi(\lambda; D, D_0)}, \quad \lambda \in \sigma_{ac}(D_0).$$

As above, this equality can be used to get an expansion of $\xi(\lambda; D, D_0)$ near $\lambda = 0$, provided that the scattering matrix extends to a holomorphic function on some finite covering of a neighborhood of the origin. But, in order to define the η-invariant, we only need that there exists $\varepsilon > 0$ such that

$$\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr) = O\bigl(t^{-(1/2+\varepsilon)}\bigr) \tag{2.21}$$

as $t \to \infty$. Then (1.15) is absolutely convergent in the half-plane $\operatorname{Re}(s) < 2\varepsilon$. Suppose, for example, that $D$ and $D_0$ are invertible. Then there exists $\varepsilon > 0$ such that (2.18) holds with respect to the interval $[-\varepsilon, \varepsilon]$. Thus, the spectral shift function vanishes on $[-\varepsilon, \varepsilon]$ and by (2.20), we obtain

$$\operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr) = O\bigl(e^{-t\varepsilon^2/2}\bigr). \tag{2.22}$$

Hence, in this case, the relative η-function can be defined by a single Mellin transform:

$$\eta(s; D, D_0) = \frac{1}{\Gamma\bigl((s+1)/2\bigr)} \int_0^\infty t^{(s-1)/2} \operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr)\, dt, \quad \operatorname{Re}(s) > -\kappa_0. \tag{2.23}$$


3. Relative Determinants

Suppose that $(H, H_0)$ are nonnegative self-adjoint operators which satisfy (1.1), (1.2) and (2.15). Then the relative zeta function $\zeta(s; H, H_0)$ is well defined as a meromorphic function in the half-plane $\operatorname{Re}(s) < \rho$, $\rho > 0$, and $s = 0$ is not a pole of $\zeta(s; H, H_0)$. Therefore, generalizing (0.3), we may define the regularized relative determinant of $(H, H_0)$ by

$$\det(H, H_0) = \exp\Bigl( -\frac{d}{ds} \zeta(s; H, H_0) \Big|_{s=0} \Bigr). \tag{3.1}$$

To justify the name "relative determinant", consider two self-adjoint operators $H_0$ and $H_1$ such that $\exp(-tH_i)$ is trace class for $t > 0$ and assume that (1.8) holds for both operators. Then the absolute determinants $\det H_i$ are defined and it follows from (1.10) that

$$\det(H_1, H_0) = \frac{\det H_1}{\det H_0}. \tag{3.2}$$

Using (2.17), the logarithm of the relative determinant can be written as

$$-\log\det(H, H_0) = \frac{d}{ds} \zeta_1(s; H, H_0) \Big|_{s=0} - b_0 \Gamma'(1) + \int_1^\infty t^{-1} \Bigl( \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) - b_0 \Bigr)\, dt. \tag{3.3}$$

Furthermore, we may use the asymptotic expansion (1.2) to replace the derivative of $\zeta_1(s; H, H_0)$ at $s = 0$ by a more explicit formula. Namely, let $N \in \mathbb{N}$ be such that the exponent $\alpha_N$ in the expansion (1.2) satisfies $\alpha_N > 0$. Then it follows that − log det(H, H0 ) =

j=0 k=0



Z

1

+

k(j) N X X

ajk

k! − aj0 00 (1) αjk

t−1 Tr(e−tH − e−tH0 ) −

0

− b0 00 (1) +

k(j) N X X

 ajk tαj log t dt k

j=0 k=0

Z

∞

t−1 Tr(e−tH − e−tH0 ) − b0 dt.

1

(3.4)

We shall now summarize some of the elementary properties of the determinant. As an immediate consequence of the definition we get

Lemma 3.1. Suppose that $H_0$, $H_1$ and $H_2$ are self-adjoint operators in $\mathcal{H}$ such that both $(H_1, H_0)$ and $(H_2, H_1)$ satisfy (1.1), (1.2) and (2.15). Then the following holds:

1) $\det(H_0, H_1) = \det(H_1, H_0)^{-1}$.

2) The operators $(H_2, H_0)$ also satisfy (1.1), (1.2) and (2.15) and

$$\det(H_2, H_0) = \det(H_2, H_1) \cdot \det(H_1, H_0). \tag{3.5}$$


Next, we consider some special cases. Let $H$ be a nonnegative self-adjoint operator in $\mathcal{H}$ and suppose that $H$ has eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_m$. Let $P_m$ denote the orthogonal projection of $\mathcal{H}$ onto the subspace of $\mathcal{H}$ spanned by the eigenvectors that correspond to $\lambda_1, \ldots, \lambda_m$. Put $H_0 = (I - P_m)H$. Then $\exp(-tH)$ is a finite-rank perturbation of $\exp(-tH_0)$. In particular, $(H, H_0)$ satisfies (1.1)–(1.3) and we have

$$\det(H, H_0) = \prod_{i=1}^{m} \lambda_i. \tag{3.6}$$
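Formula (3.6) can be illustrated numerically. The sketch below is not from the paper and rests on one assumption spelled out here: for $H_0 = (I - P_m)H$ one has $\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = \sum_i (e^{-t\lambda_i} - 1)$, and the analytic continuation of the corresponding Mellin transform gives $\zeta(s; H, H_0) = \sum_i \lambda_i^{-s}$. The code then evaluates (3.1) with a central difference in $s$. The function names and sample eigenvalues are hypothetical.

```python
import math


def relative_zeta(eigs, s):
    """Assumed closed form zeta(s; H, H0) = sum_i lambda_i^{-s} for H0 = (I - P_m) H."""
    return sum(lam ** (-s) for lam in eigs)


def relative_det(eigs, step=1e-5):
    """exp(-zeta'(0)), with zeta'(0) approximated by a central difference."""
    derivative = (relative_zeta(eigs, step) - relative_zeta(eigs, -step)) / (2.0 * step)
    return math.exp(-derivative)


if __name__ == "__main__":
    eigs = [0.5, 2.0, 3.0]           # sample positive eigenvalues split off from H
    print(relative_det(eigs))        # numerically close to the product below
    print(math.prod(eigs))
```

Since $\frac{d}{ds}\lambda^{-s}\big|_{s=0} = -\log\lambda$, the exponent in (3.1) becomes $\sum_i \log\lambda_i$, which is the product in (3.6).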

More generally, if the pair $(H, H_0)$ satisfies (1.1)–(1.3), then the pair $((I - P_m)H, H_0)$ satisfies the same conditions and by (3.5), we get

$$\det(H, H_0) = \prod_{i=1}^{m} \lambda_i \cdot \det\bigl((I - P_m)H, H_0\bigr). \tag{3.7}$$

This means that we may split off any finite number of eigenvalues from the determinant. Next let H = Hp ⊕ Hc be the orthogonal decomposition into the subspaces that correspond to the point spectrum and the continuous spectrum of H, respectively. Let Hp and Hc denote the restriction of H to Hp and Hc , respectively. Suppose that for each t > 0, exp(−tHp ) is a trace class operator and as t → 0, Tr(exp(−tHp )) admits an asymptotic expansion of the form (1.2). Then det Hp can be defined by zeta function regularization and using (3.5), one has det(H, H0 ) = det Hp · det(Hc , H0 ).

(3.8)

Now let $a > 0$. Then $(H + a, H_0 + a)$ also satisfies (1.1)–(1.3) and the relative zeta function is given by

$$\zeta(s; H + a, H_0 + a) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1} e^{-ta} \operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr)\, dt$$

for $\operatorname{Re}(s) > -\alpha_0$. The right hand side also makes sense if we replace $a$ by any complex number $z$ with $\operatorname{Re}(z) > 0$. We denote the corresponding analytic function by $\zeta(s, z; H, H_0)$. Using (1.2) it follows that for every $z \in \mathbb{C}$ with $\operatorname{Re}(z) > 0$, $\zeta(s, z; H, H_0)$, regarded as a function of $s$, admits a meromorphic continuation to $\mathbb{C}$ which is holomorphic at $s = 0$. Thus, generalizing (3.1) we set

$$\det(H + z, H_0 + z) = \exp\Bigl( -\frac{\partial}{\partial s} \zeta(s, z; H, H_0) \Big|_{s=0} \Bigr). \tag{3.9}$$

Again, if the operators $H_0$ and $H_1$ are as in (3.2), then

$$\det(H_1 + z, H_0 + z) = \frac{\det(H_1 + z)}{\det(H_0 + z)}. \tag{3.10}$$

We note that if H is a nonnegative self-adjoint operator such that exp(−tH) is trace class for t > 0 and (1.8) holds, then H has pure discrete spectrum consisting of a sequence of eigenvalues 0 ≤ λ0 < λ1 < λ2 < · · · → ∞ of finite multiplicities and det(H + z) is an entire function of z whose set of zeros equals {−λi }i∈N and the multiplicities coincide. In particular, this shows that (3.10) is the quotient of two entire functions. We don’t know yet, if this holds in general. Here we prove a weaker result.


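Proposition 3.2. Suppose that $(H, H_0)$ satisfies (1.2). Then the relative determinant $\det(H + z, H_0 + z)$ is a holomorphic function of $z \in \mathbb{C} - (-\infty, 0]$.

Proof. Let $\xi(\lambda)$ be the spectral shift function associated to $(H, H_0)$. Let $c > 0$. By Proposition 2.1, 1), we have

$$\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = -t \int_0^{2c} e^{-t\lambda}\, \xi(\lambda)\, d\lambda - t \int_{2c}^\infty e^{-t\lambda}\, \xi(\lambda)\, d\lambda.$$

Since $e^{-t\lambda}\xi(\lambda) \in L^1(\mathbb{R})$ for $t > 0$, the second integral is $O(e^{-tc})$ as $t \to \infty$. Furthermore, for every $N \in \mathbb{N}$, we have

$$\int_0^{2c} e^{-t\lambda}\, \xi(\lambda)\, d\lambda = \sum_{k=0}^{N} c_k\, t^k + O(t^{N+1})$$

as $t \to 0$. Together with (1.2) it follows that as $t \to 0$, $\int_{2c}^\infty e^{-t\lambda}\, \xi(\lambda)\, d\lambda$ has a complete asymptotic expansion similar to (1.2). This implies that the double integral

$$F(s, z) = \frac{1}{\Gamma(s)} \int_0^\infty t^s e^{-tz} \int_{2c}^\infty e^{-t\lambda}\, \xi(\lambda)\, d\lambda\, dt$$

is absolutely convergent in the half-planes $\operatorname{Re}(s) > -\alpha_0$ and $\operatorname{Re}(z) > -c$. Moreover, as a function of $s$, it admits a meromorphic continuation to $\mathbb{C}$ which is holomorphic at $s = 0$. Next consider the first integral. Note that $\xi$ is absolutely integrable on $[0, 2c]$. Hence for $\operatorname{Re}(z) > 0$, we get

$$\begin{aligned} -\frac{1}{\Gamma(s)} \int_0^\infty t^s e^{-tz} \int_0^{2c} e^{-t\lambda}\, \xi(\lambda)\, d\lambda\, dt &= -\frac{1}{\Gamma(s)} \int_0^{2c} \xi(\lambda) \int_0^\infty t^s e^{-t(z+\lambda)}\, dt\, d\lambda \\ &= -s \int_0^{2c} (z + \lambda)^{-(s+1)}\, \xi(\lambda)\, d\lambda. \end{aligned}$$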

Thus, for $\operatorname{Re}(z) > 0$, we have

$$\det(H + z, H_0 + z) = \exp\Bigl\{ \int_0^{2c} (z + \lambda)^{-1}\, \xi(\lambda)\, d\lambda + \frac{\partial F}{\partial s}(0, z) \Bigr\}. \tag{3.11}$$

The integral has an obvious extension to an analytic function of $z \in \mathbb{C} - (-\infty, 0]$. Moreover, $\frac{\partial F}{\partial s}(0, z)$ is holomorphic in $\operatorname{Re}(z) > -c$. Since $c$ is arbitrary, (3.11) gives the analytic continuation to $\mathbb{C} - (-\infty, 0]$.

We now shall study the variation of the relative determinant. So, we consider a differentiable 1-parameter family $H_u$, $u \in (-\varepsilon, \varepsilon)$, of nonnegative self-adjoint operators in $\mathcal{H}$ and a nonnegative self-adjoint operator $H$ in $\mathcal{H}$ such that for each $u \in (-\varepsilon, \varepsilon)$, the pair $(H_u, H)$ satisfies (1.1), (1.2) and (2.15). By (3.5) we have

$$\det(H_u, H) = \det(H_u, H_0) \cdot \det(H_0, H),$$


so that we may take $H = H_0$. Let $\dot H_u = \frac{d}{du} H_u$. By Duhamel's principle, $\exp(-tH_u)$ is differentiable in $u$ and

$$\frac{d}{du} e^{-tH_u} = -\int_0^t e^{-vH_u}\, \dot H_u\, e^{-(t-v)H_u}\, dv. \tag{3.12}$$

Now assume that $\dot H_u e^{-tH_u}$ is a trace class operator for $t > 0$ with trace norm uniformly bounded on compact subsets of $(0, \infty)$. Then it follows from (3.12) that $\frac{d}{du} \exp(-tH_u)$ is also a trace class operator for $t > 0$ and

$$\frac{d}{du} \operatorname{Tr}\bigl(e^{-tH_u} - e^{-tH_0}\bigr) = -t \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr). \tag{3.13}$$

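We shall now make the following two assumptions about the family $H_u$:

i) $H_u$ is invertible for each $u \in (-\varepsilon, \varepsilon)$,

ii) As $t \to 0$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(\dot H_u H_u^{-1} e^{-tH_u}\bigr) \sim \sum_{j=0}^{\infty} \sum_{k=0}^{k(j)} C_{jk}(u)\, t^{\gamma_j} \log^k t + C_1(u) + C_2(u) \log t \tag{3.14}$$

and the exponents $\gamma_j$ are such that $-\infty < \gamma_1 < \gamma_2 < \cdots$, $\gamma_j \neq 0$, $j \in \mathbb{N}$, and $\gamma_j \to \infty$.

Note that by i), there exists $c > 0$ such that

$$\operatorname{Tr}\bigl(\dot H_u H_u^{-1} e^{-tH_u}\bigr) = O(e^{-ct}) \tag{3.15}$$

as $t \to \infty$. Using i) and (3.13)–(3.15), it follows that for $\operatorname{Re}(s) \gg 0$,

$$\begin{aligned} \frac{d}{du} \zeta(s; H_u, H_0) &= \frac{1}{\Gamma(s)} \int_0^\infty t^s\, \frac{\partial}{\partial t} \operatorname{Tr}\bigl(\dot H_u H_u^{-1} e^{-tH_u}\bigr)\, dt \\ &= -\frac{s}{\Gamma(s)} \int_0^\infty t^{s-1} \operatorname{Tr}\bigl(\dot H_u H_u^{-1} e^{-tH_u}\bigr)\, dt. \end{aligned} \tag{3.16}$$

By (3.14) and (3.15),

$$\int_0^\infty t^{s-1} \operatorname{Tr}\bigl(\dot H_u H_u^{-1} e^{-tH_u}\bigr)\, dt$$

has an analytic continuation to a meromorphic function of $s \in \mathbb{C}$. The location of the poles is determined by the asymptotic expansion (3.14). In particular, the pole at $s = 0$ has order $\le 2$ and the Laurent expansion is given by

$$-\frac{C_2(u)}{s^2} + \frac{C_1(u)}{s} + \cdots,$$

where $C_1(u)$ and $C_2(u)$ are the corresponding coefficients in (3.14). Using (3.16), we obtain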


Proposition 3.3. Suppose that the family $H_u$, $u \in (-\varepsilon, \varepsilon)$, satisfies assumptions i) and ii). Then $\det(H_u, H_0)$ is differentiable and

$$\frac{d}{du} \log\det(H_u, H_0) = C_1(u) + \Gamma'(1)\, C_2(u), \tag{3.17}$$

where $C_1(u)$ denotes the constant term and $C_2(u)$ the coefficient of $\log t$ in the asymptotic expansion (3.14).

Corollary 3.4. Let $H_u$, $u \in [0, 1]$, be a differentiable family of self-adjoint operators in $\mathcal{H}$ such that for all $u \in [0, 1]$, (1.1) and (1.2) hold for $(H_u, H_0)$ and $H_u$ satisfies conditions i) and ii). Then

$$\det(H_1, H_0) = \exp\Bigl\{ \int_0^1 \bigl( C_1(u) + \Gamma'(1)\, C_2(u) \bigr)\, du \Bigr\}. \tag{3.18}$$

Remark. A similar formula was proved in [GMS] for a certain class of elliptic invertible pseudodifferential operators on a compact manifold. For each of these operators the determinant can be defined via zeta function regularization and the relative determinant of any two such operators is the quotient of the determinants of the individual operators.

The assumption i) is, of course, rather restrictive and excludes many cases that are of interest for applications. However, using (3.4), differentiability can be established under more general assumptions. For example, we have

Proposition 3.5. Let $H_u$, $u \in (-\varepsilon, \varepsilon)$, be a differentiable family of self-adjoint operators which satisfy (1.1), (1.2), (2.15) and in addition, the following conditions hold:

a) The constant $b_0$ and the exponent $\rho$ in (2.15) are independent of $u$, and the coefficients of the expansion (1.2) are differentiable in $u$.

b) For $t > 0$, $\dot H_u e^{-tH_u}$ is a trace class operator and there exist $C > 0$ and $\delta > 0$ such that

$$\bigl| \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr) \bigr| \le C t^{-(1+\delta)}$$

for $t \ge 1$ and $u \in (-\varepsilon, \varepsilon)$.

c) As $t \to 0$, there exists an asymptotic expansion of the form

$$\operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr) \sim \sum_{j=0}^{\infty} \sum_{k=0}^{k(j)} C_{jk}(u)\, t^{\nu_j} \log^k t,$$

where the exponents $\nu_j$ are such that $-\infty < \nu_1 < \nu_2 < \cdots$, $\nu_j \neq 1$, $j \in \mathbb{N}$, and $\nu_j \to \infty$.

Then the integral $\int_0^1 t^s \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt$ is absolutely convergent in the half-plane $\operatorname{Re}(s) > -\nu_1 - 1$ and has a meromorphic extension to the half-plane $\operatorname{Re}(s) < \delta$ which is holomorphic at $s = 0$. Moreover, $\det(H_u, H_0)$ is a differentiable function of $u$ and

$$\frac{d}{du} \log\det(H_u, H_0) = \int_0^1 t^s \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt\, \Big|_{s=0} + \int_1^\infty \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt.$$


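Proof. We use formula (3.3). Since by b), $\operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)$ is absolutely integrable on $[1, \infty)$, it follows from a) and (3.13) that

$$\frac{d}{du} \int_1^\infty t^{-1} \Bigl( \operatorname{Tr}\bigl(e^{-tH_u} - e^{-tH_0}\bigr) - b_0 \Bigr)\, dt = -\int_1^\infty \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt.$$

Furthermore, by c), $\int_0^1 t^s \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt$ has a meromorphic extension to $\mathbb{C}$ which is holomorphic at $s = 0$, and

$$\frac{\partial}{\partial u} \zeta_1(s; H_u, H_0) = -\frac{1}{\Gamma(s)} \int_0^1 t^s \operatorname{Tr}\bigl(\dot H_u e^{-tH_u}\bigr)\, dt.$$

Taking derivatives at $s = 0$ gives the desired result.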

To conclude this section, we consider the case of two self-adjoint operators $D$, $D_0$ which are not necessarily bounded from below. Typical examples are Dirac type operators on a complete manifold. Suppose that (1.11) and (1.12) hold for $(D, D_0)$. For simplicity, we also assume that $D$ and $D_0$ are invertible, i.e. $0$ is not in the spectrum of $D$ and $D_0$. Then the relative η-function $\eta(s; D, D_0)$ is defined by (2.23) and has a meromorphic extension to $\mathbb{C}$. If $-1/2$ does not occur among the exponents of the asymptotic expansion (1.12), then $\eta(s; D, D_0)$ is regular at $s = 0$ and therefore, we may define the relative η-invariant as $\eta(0; D, D_0)$. For example, if $D$ and $D_0$ are Dirac type operators, then the asymptotic expansion (1.12) contains no negative exponents so that the relative η-invariant is given by

$$\eta(0; D, D_0) = \frac{1}{\sqrt{\pi}} \int_0^\infty t^{-1/2} \operatorname{Tr}\bigl(D e^{-tD^2} - D_0 e^{-tD_0^2}\bigr)\, dt.$$
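In a finite-dimensional toy model this integral can be evaluated numerically. The following sketch is an illustration under stated assumptions and is not taken from the paper: $D$ and $D_0$ are replaced by invertible matrices given through hypothetical lists of nonzero eigenvalues, and the substitution $t = u^2$ turns the integral into $\frac{2}{\sqrt{\pi}} \int_0^\infty \operatorname{Tr}\bigl(D e^{-u^2 D^2} - D_0 e^{-u^2 D_0^2}\bigr)\, du$. Each eigenvalue $\lambda$ then contributes $\operatorname{sign}(\lambda)$, so the result is the difference of the signatures.

```python
import math


def eta_integrand(eigs_d, eigs_d0, u):
    """Tr(D e^{-u^2 D^2} - D0 e^{-u^2 D0^2}) for operators given by eigenvalue lists."""
    return (sum(lam * math.exp(-(u * lam) ** 2) for lam in eigs_d)
            - sum(lam * math.exp(-(u * lam) ** 2) for lam in eigs_d0))


def relative_eta0(eigs_d, eigs_d0, upper=40.0, steps=200000):
    """Midpoint-rule approximation of the relative eta-invariant after t = u^2."""
    h = upper / steps
    total = sum(eta_integrand(eigs_d, eigs_d0, (i + 0.5) * h) for i in range(steps))
    return 2.0 / math.sqrt(math.pi) * total * h


if __name__ == "__main__":
    eigs_d = [-2.0, 0.5, 1.0]     # hypothetical spectrum of D, signature +1
    eigs_d0 = [-1.5, -0.7, 3.0]   # hypothetical spectrum of D0, signature -1
    print(relative_eta0(eigs_d, eigs_d0))
```

For the sample spectra the computed value should be close to $(+1) - (-1) = 2$, which matches the interpretation of the η-invariant as a measure of spectral asymmetry.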

Suppose also that $D^2$, $D_0^2$ satisfy (1.1) and (1.2). Since $D$ and $D_0$ are invertible, Lemma 2.2 implies that (1.3) holds and the relative zeta function is given by
\[
\zeta(s; D^2, D_0^2) = \operatorname{Tr}\bigl((D^2)^{-s} - (D_0^2)^{-s}\bigr)
= \operatorname{Tr}\bigl(|D|^{-2s} - |D_0|^{-2s}\bigr), \qquad \operatorname{Re}(s) > -\kappa_0.
\]
Hence, the relative zeta function $\zeta(s; |D|, |D_0|)$ also exists as a meromorphic function on $\mathbb{C}$, and we have $\zeta(s; D^2, D_0^2) = \zeta(2s; |D|, |D_0|)$. This formula implies that $\zeta(s; |D|, |D_0|)$ is regular at $s = 0$ and therefore $\det(|D|, |D_0|)$ can be defined by (3.1). Generalizing (0.5), we introduce the relative determinant for $D$, $D_0$ by
\[
\det(D, D_0) = \det(|D|, |D_0|)\cdot\exp\Bigl(\frac{\pi i}{2}\bigl(\eta(0; D, D_0) - \zeta(0; |D|, |D_0|)\bigr)\Bigr). \tag{3.19}
\]
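As a sanity check on the phase factor in (3.19), one can look at the finite-dimensional situation, where every quantity is elementary. The following is only an illustrative sketch under assumptions that are not part of the paper: $D$ and $D_0$ are replaced by invertible real symmetric matrices given through their eigenvalue lists, so that $\eta(0; D, D_0)$ is the difference of signatures, $\zeta(0; |D|, |D_0|)$ is the difference of dimensions, and $\det(|D|, |D_0|)$ is the quotient of the products of absolute values. The function names and the sample eigenvalues are hypothetical.

```python
import cmath
import math


def relative_determinant(eigs, eigs0):
    """Finite-dimensional analogue of (3.19).

    eigs, eigs0: nonzero real eigenvalues of two invertible symmetric
    matrices D and D0 (toy data chosen for illustration only).
    """
    if any(x == 0 for x in list(eigs) + list(eigs0)):
        raise ValueError("D and D0 must be invertible")
    # eta(0; D, D0): difference of the signatures of D and D0.
    eta0 = sum(math.copysign(1.0, x) for x in eigs) - sum(
        math.copysign(1.0, x) for x in eigs0
    )
    # zeta(0; |D|, |D0|) = Tr(|D|^0 - |D0|^0): difference of the dimensions.
    zeta0 = len(eigs) - len(eigs0)
    # det(|D|, |D0|) = exp(-zeta'(0)) = prod |lambda| / prod |lambda0|.
    det_abs = math.prod(abs(x) for x in eigs) / math.prod(abs(x) for x in eigs0)
    return det_abs * cmath.exp(0.5j * math.pi * (eta0 - zeta0))


def plain_determinant_ratio(eigs, eigs0):
    """Ordinary quotient det(D) / det(D0) of the two matrices."""
    return math.prod(eigs) / math.prod(eigs0)
```

In this toy setting the exponential in (3.19) reduces to $(-1)^{q_0 - q}$, where $q$ and $q_0$ count the negative eigenvalues, so the formula reproduces the ordinary quotient $\det D / \det D_0$, including its sign.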


W. Müller

4. Examples

In this section we discuss a number of typical cases of operators $(H, H_0)$ which satisfy the assumptions (1.1)–(1.3). Such operators arise very naturally both in geometry and physics. In particular, we will show how the general trace formula of Proposition 2.1 specializes in all these cases to a rather explicit trace formula involving the eigenvalues of the operator $H$ and the scattering phase shift.

4.1. The Schrödinger operator in $\mathbb{R}^n$. A typical example is the Schrödinger operator $\Delta + V$ in $\mathbb{R}^n$, where $V \in C_0^\infty(\mathbb{R}^n)$ and $\Delta$ is the negative of the usual Laplacian of $\mathbb{R}^n$. I shall limit myself to this simple case, where $V$ is both smooth and has compact support, although most of the results have generalizations to potentials with weaker regularity assumptions and with the support property replaced by growth conditions. For details we refer the reader to [CV, Gu1, Gu2]. Regarded as an operator in $L^2(\mathbb{R}^n)$ with domain $C_0^\infty(\mathbb{R}^n)$, $\Delta + V$ is essentially self-adjoint. Let $H$ be the unique self-adjoint extension of $\Delta + V$ and let $H_0 = \Delta$. The spectrum of $H$ consists of a finite number of eigenvalues
\[
\lambda_1 < \lambda_2 \le \lambda_3 \le \cdots \le \lambda_N < \lambda_{N+1} = \cdots = \lambda_{N'} = 0,
\]
where each eigenvalue is repeated according to its multiplicity, and an absolutely continuous spectrum which has multiplicity 2 if $n = 1$, and infinite multiplicity if $n > 1$. Using Duhamel's formula
\[
e^{-tH} - e^{-tH_0} = -\int_0^t e^{-sH}\, V\, e^{-(t-s)H_0}\,ds,
\]
it is easy to see that $e^{-tH} - e^{-tH_0}$ is a trace class operator for $t > 0$ (see also [CV, Gu1, Gu2]). Hence, the wave operators $W_\pm(H, H_0)$ exist and are complete, and the scattering operator $S$ is defined by (2.8). Let
\[
L^2(\mathbb{R}^n) \cong L^2\bigl(\mathbb{R}^+, L^2(S^{n-1}); \lambda^{n-1}d\lambda\bigr)
\]
be the spectral resolution of $H_0$ defined by the Fourier transform
\[
\hat f(\lambda)(\omega) = (2\pi)^{-n/2}\int_{\mathbb{R}^n} e^{-i\lambda\langle\omega, x\rangle} f(x)\,dx.
\]
Then the “on-shell scattering matrix” $S(\lambda) = S(\lambda; H, H_0)$ is given by
\[
\widehat{Sf}(\lambda) = S(\lambda^2)\hat f(\lambda), \qquad \lambda > 0.
\]
Furthermore, $S(\lambda)$ is a unitary operator in $L^2(S^{n-1})$ which is of the form $S(\lambda) = \mathrm{Id} + T(\lambda)$, where $T(\lambda)$ is an integral operator with a smooth kernel $T(\lambda; \omega, \omega')$. Since $V \in C_0^\infty(\mathbb{R}^n)$, it follows that for odd $n$, $S(\lambda^2)$ has an analytic continuation to a meromorphic operator valued function of $\lambda \in \mathbb{C}$. In the even-dimensional case a similar result is valid, except that $S(\lambda)$ extends to a meromorphic function on the logarithmic covering of $\mathbb{C}$, i.e. as a function of the variable $\log\lambda$. In particular, this implies

Proposition 4.1. The Fredholm determinant $\det S(\lambda)$ of $S(\lambda)$ is real analytic on $(0, \infty)$.


Furthermore, since $e^{-tH} - e^{-tH_0}$, $t > 0$, is a trace class operator, the spectral shift function $\xi(\lambda; H, H_0)$ exists. In the present case, it can also be defined using the resolvents. Let $R_0(\lambda)$ and $R(\lambda)$ denote the resolvents of $H_0$ and $H$, respectively. Let $\mu \in \mathbb{R} - \sigma(H)$ and let $k > n/2$. Then $R(\mu)^k - R_0(\mu)^k$ is of the trace class. Hence, there exists the spectral shift function $\xi_\mu(\lambda) = \xi(\lambda; R(\mu)^k, R_0(\mu)^k)$ defined by (2.1) and, by the uniqueness of the spectral shift function, we get
\[
\xi(\lambda; H, H_0) = \xi_\mu\bigl((\lambda - \mu)^{-k}; R(\mu)^k, R_0(\mu)^k\bigr).
\]
Using Proposition 1 in [Gu1] (or Theorem III.2 in [Gu2]), it follows that $\xi(\lambda)$ is locally constant on the complement of $\sigma(H)$ and continuous on $(0, \infty)$. If $n \ne 2, 4$, then the discontinuity of $\xi(\lambda)$ at 0 is given by
\[
\xi(0-) - \xi(0+) = m(0, H) + \varepsilon, \tag{4.1}
\]
where $m(0, H)$ is the multiplicity of the eigenvalue 0 of $H$, and $\varepsilon = 0$ if there is no resonance at 0 and $\varepsilon = 1/2$ otherwise. If $n = 2, 4$, the description is more difficult. Let
\[
\det S(0) = \lim_{\lambda\to 0+}\det S(\lambda).
\]
We note that $\xi(0-) \in \mathbb{Z}$. Hence, using (2.10), we obtain $\det S(0) = (-1)^{2\varepsilon}$. This implies that we may pick the branch of $\log\det S(\lambda)$ such that
\[
\xi(0-) - \xi(0+) = m(0, H) + \frac{1}{2\pi i}\log\det S(0).
\]

Then by Proposition 2.1 we get the following trace formula for $n \ne 2, 4$:
\[
\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) = \sum_{i=1}^{N'} e^{-t\lambda_i} + \frac{\log\det S(0)}{2\pi i}
+ \frac{1}{2\pi i}\int_0^\infty e^{-t\lambda}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda \tag{4.2}
\]
[Gu1, Gu2]. If $n = 2, 4$, $(2\pi i)^{-1}\log\det S(0)$ has to be replaced by a different constant. Let $K(t, x, y)$ be the kernel of $\exp(-tH)$. Using the heat equation method [Gi], it follows that as $t \to 0$, there exists an asymptotic expansion
\[
\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) \sim (4\pi t)^{-n/2}\sum_{j=1}^\infty a_{j,n}\, t^j. \tag{4.3}
\]
The first coefficients can be determined explicitly:
\[
a_{1,n} = -\int_{\mathbb{R}^n} V(x)\,dx, \qquad
a_{2,n} = \frac{1}{2}\int_{\mathbb{R}^n} V^2(x)\,dx, \qquad
a_{3,n} = -\frac{1}{6}\int_{\mathbb{R}^n}\Bigl(V^3(x) + \frac{1}{2}\|DV(x)\|^2\Bigr)dx.
\]
The asymptotic expansion (4.3) corresponds to an asymptotic expansion of the scattering phase


\[
s(\lambda) = \frac{1}{2\pi i}\log\det S(\lambda)
\]
for high energy. Namely, for $\lambda \to \infty$ one has
\[
\frac{ds}{d\lambda}(\lambda) \sim \lambda^{n/2-1}\sum_{j=1}^\infty \alpha_j(V)\lambda^{-j}, \tag{4.4}
\]
and the coefficients $\alpha_j(V)$ are related to the coefficients in (4.3). We observe that the asymptotic expansion (4.4) holds for more general potentials (cf. [R]). It remains to investigate the large time behaviour of $\operatorname{Tr}(e^{-tH} - e^{-tH_0})$. For this purpose we have to assume that $n$ is odd. Then $\det S(\lambda^2)$ is real analytic on $(-c, \infty)$ for some $c > 0$ and therefore it has a power series expansion at $\lambda = 0$, which implies that there exists $\varepsilon > 0$ such that the series
\[
\frac{ds}{d\lambda}(\lambda) = \frac{1}{2\pi i}\operatorname{Tr}\Bigl(S^*(\lambda)\frac{dS}{d\lambda}(\lambda)\Bigr) = \sum_{j=0}^\infty b_j\lambda^{j/2} \tag{4.5}
\]

is convergent for $0 \le \lambda \le \varepsilon$. Now let $P_1$ be the orthogonal projection of $L^2(\mathbb{R}^n)$ onto the orthogonal complement of the subspace spanned by the eigenfunctions of $H$ that correspond to the negative eigenvalues. Let $H_1$ be the restriction of $H$ to $\operatorname{Ran}P_1$. Then $H_1 \ge 0$ and $H - H_1$ has finite rank. Let $n$ be odd. Using (4.2), (4.3) and (4.5), it follows that $(H_1, H_0)$ satisfies (1.1)–(1.3). Hence, the relative zeta function $\zeta(s; H_1, H_0)$ is well defined. Poles may occur at the points $s = k/2$, $k \in \mathbb{Z}$. All poles are simple and the residues can be determined from the expansions (4.3) and (4.5). The relative determinant $\det(H_1, H_0)$ is then given by (3.1). Using this determinant, we define the relative determinant $\det(\Delta + V, \Delta)$ by
\[
\det(\Delta + V, \Delta) = \prod_{i=1}^N \lambda_i\cdot\det(H_1, H_0). \tag{4.6}
\]
Remark. We recall from Sect. 2 that in order to define the determinant, the full expansion (4.5) is not needed and may be replaced by weaker assumptions like (2.16). For example, there exist asymptotic expansions of the scattering matrix $S(\lambda)$ as $\lambda \to 0$ under more general assumptions about the potential (see [J1, JK]).

If $W \in C_0^\infty(\mathbb{R}^n)$ is another potential, we may define the relative determinant $\det(\Delta + V + W, \Delta + V)$ in the same way, and by Lemma 3.1 we have
\[
\det(\Delta + V + W, \Delta + V) = \det(\Delta + V + W, \Delta)\cdot\det(\Delta + V, \Delta)^{-1}. \tag{4.7}
\]
If $n$ is even, there is no expansion of the scattering phase near 0 in powers of $\lambda$. This problem can be eliminated if we shift the spectra of $H$ and $H_0$ by a sufficiently large positive constant $m$. Let $n$ be arbitrary. Pick $m > 0$ such that $H + m > 0$. Then there exists $c > 0$ such that as $t \to \infty$, one has $\operatorname{Tr}\bigl(e^{-t(H+m)} - e^{-t(H_0+m)}\bigr) = O(e^{-tc})$. If $t \to 0$, this trace has an asymptotic expansion similar to (4.3). Hence, the relative zeta function can be defined in the half-plane $\operatorname{Re}(s) > n/2$ in the usual way by

\[
\zeta(s; H + m, H_0 + m) = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1}\operatorname{Tr}\bigl(e^{-t(H+m)} - e^{-t(H_0+m)}\bigr)\,dt
= \operatorname{Tr}\bigl((H + m)^{-s} - (H_0 + m)^{-s}\bigr). \tag{4.8}
\]
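The second equality in (4.8) rests on the elementary Mellin transform identity $\Gamma(s)^{-1}\int_0^\infty t^{s-1}e^{-ta}\,dt = a^{-s}$ for $a > 0$. The sketch below checks this numerically in a toy situation that is an assumption made only for illustration: the two operators are replaced by finite lists of positive eigenvalues (playing the role of the spectra of $H + m$ and $H_0 + m$), and the integral is approximated by a midpoint rule on a truncated interval. The function names and parameters are hypothetical.

```python
import math


def zeta_via_heat_trace(eigs, eigs0, s, t_max=60.0, steps=60000):
    """(1/Gamma(s)) * int_0^{t_max} t^(s-1) (sum e^{-t a} - sum e^{-t b}) dt.

    Midpoint rule; intended for s >= 1 and eigenvalues bounded away from 0,
    so that the integrand is tame at t = 0 and negligible beyond t_max.
    """
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        heat_diff = sum(math.exp(-t * a) for a in eigs) - sum(
            math.exp(-t * b) for b in eigs0
        )
        total += t ** (s - 1) * heat_diff
    return total * h / math.gamma(s)


def zeta_via_trace(eigs, eigs0, s):
    """Tr(A^{-s} - B^{-s}) for positive diagonal A, B."""
    return sum(a ** (-s) for a in eigs) - sum(b ** (-s) for b in eigs0)
```

For instance, with the made-up spectra `[1.0, 2.5]` and `[1.5, 3.0]` and `s = 2`, both functions should agree up to the quadrature error.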

This zeta function was first introduced and studied by Guillopé [Gu1]. Using the trace formula (2.4), it follows that for $\operatorname{Re}(s) > n/2$,
\[
\zeta(s; H + m, H_0 + m) = \sum_{j=1}^{N'}(\lambda_j + m)^{-s} + \alpha m^{-s}
+ \frac{1}{2\pi i}\int_0^\infty(\lambda + m)^{-s}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda, \tag{4.9}
\]
where $\alpha$ is determined by the singularity of $\xi(\lambda)$ at $\lambda = 0$. If $n \ne 2, 4$, then $\alpha = (2\pi i)^{-1}\log\det S(0)$. The analytic continuation of $\zeta(s; H + m, H_0 + m)$ is holomorphic at $s = 0$ and we can define the relative determinant by (3.1). Again, we denote this relative determinant by $\det(\Delta + V + m, \Delta + m)$. We now consider the variation of $\det(\Delta + V + m, \Delta + m)$ with respect to $V$, which can be computed using (3.17). First we note that
\[
(H + m)^{-1}e^{-t(H+m)} = \int_t^\infty e^{-s(H+m)}\,ds
= (H + m)^{-1}e^{-(H+m)} + \int_t^1 e^{-s(H+m)}\,ds.
\]

Let $W \in C_0^\infty(\mathbb{R}^n)$. Using this formula together with the local heat expansion, it follows that as $t \to 0$, there is an asymptotic expansion of the form
\[
\operatorname{Tr}\bigl(W(H + m)^{-1}e^{-t(H+m)}\bigr) \sim \sum_{\substack{j=1\\ j\ne n/2}}^\infty d_j(W)\,t^{-n/2+j} + c_1(W) + c_2(W)\log t,
\]
where
\[
d_j(W) = \frac{1}{n/2 - j}\int_{\mathbb{R}^n} W(x)\,a_{j-1,n}(x)\,dx,
\]
\[
c_1(W) = \operatorname{Tr}\bigl(W(H + m)^{-1}e^{-(H+m)}\bigr)
+ \sum_{\substack{j=0\\ j\ne n/2-1}}^p (j + 1 - n/2)^{-1}\int_{\mathbb{R}^n} W(x)\,a_{j,n}(x)\,dx
+ \int_{\mathbb{R}^n} W(x)\int_0^1\Bigl(K(s, x, x) - \sum_{j=0}^p a_{j,n}(x)\,s^{-n/2+j}\Bigr)ds\,dx,
\]
\[
c_2(W) = -\int_{\mathbb{R}^n} W(x)\,a_{n/2-1,n}(x)\,dx.
\]
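The shape of the coefficients $d_j(W)$, $c_1(W)$ and $c_2(W)$ can be traced back to one elementary integral. As a hedged sketch, assuming the local expansion $K(s, x, x) \sim \sum_j a_{j,n}(x)\,s^{-n/2+j}$ that is implicit in the formula for $c_1(W)$, each term of the expansion is integrated over $[t, 1]$:
\[
\int_t^1 s^{-n/2+j}\,ds =
\begin{cases}
\dfrac{1 - t^{-n/2+j+1}}{j + 1 - n/2}, & j \ne n/2 - 1,\\[2mm]
-\log t, & j = n/2 - 1.
\end{cases}
\]
Multiplying by $\int_{\mathbb{R}^n} W(x)\,a_{j,n}(x)\,dx$, the power $t^{-n/2+(j+1)}$ receives the coefficient $-(j + 1 - n/2)^{-1}\int W a_{j,n} = d_{j+1}(W)$, the constants $(j + 1 - n/2)^{-1}\int W a_{j,n}$ are collected in $c_1(W)$, and the exceptional index $j = n/2 - 1$ produces the term $c_2(W)\log t$.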

Thus, by (3.17) we have
\[
\frac{\delta}{\delta V}\log\det(\Delta + V + m, \Delta + m) = c_1(\delta V) + \Gamma'(1)\,c_2(\delta V).
\]
Now assume that $n$ is odd and consider $\det(\Delta + V, \Delta)$. Since the continuous spectrum of $\Delta + V$ contains 0, $\det(\Delta + V, \Delta)$ may not be differentiable in all directions. But we can use Proposition 3.5 to determine under what assumptions about the potential and the variation, $\det(\Delta_u, \Delta_0)$ will be differentiable. Suppose, for example, that $V \ge 0$, $V \ne 0$, and let $W \in C_0^\infty(\mathbb{R}^n)$ be such that $V + uW \ge 0$ for $|u| \le \varepsilon$. Let $\Delta_u = \Delta + V + uW$ and let $K_u(x, y, t)$ be the kernel of $e^{-t\Delta_u}$. Then $\Delta_u$ has no $L^2$-eigenfunctions for $|u| \le \varepsilon$ and by Duhamel's principle, it follows that
\[
K_u(x, y, t) - (4\pi t)^{-n/2}e^{-\|x-y\|^2/4t}
= -(4\pi)^{-n/2}\int_0^t\int_{\mathbb{R}^n} s^{-n/2}K_u(x, z, t - s)\,(V + uW)(z)\,e^{-\|z-y\|^2/4s}\,dz\,ds. \tag{4.10}
\]
Hence, if $|u| \le \varepsilon$, we get
\[
K_u(x, y, t) \le (4\pi t)^{-n/2}e^{-\|x-y\|^2/4t}. \tag{4.11}
\]
Inserting this estimate into (4.10), we obtain
\[
\operatorname{Tr}\bigl(e^{-t\Delta_u} - e^{-t\Delta}\bigr) = O(t^{-n/2+1}) \tag{4.12}
\]
as $t \to \infty$ and $|u| \le \varepsilon$. Using (4.11) and (4.12), it is easy to verify that the assumptions of Proposition 3.5 are satisfied. Thus the relative determinant $\det(\Delta + V + uW, \Delta)$ is differentiable for $|u| \le \varepsilon$.

4.2. Perturbations of the Euclidean Laplacian. Let $g$ be a complete Riemannian metric on $\mathbb{R}^n$ and let $g_{ij}(x)$, $1 \le i, j \le n$, be the components of $g(x)$ with respect to the standard basis $dx_i$ of $T_x^*\mathbb{R}^n$. Let $G = \det(g)$ and let $g^{kl}(x)$ denote the components of $(g_{ij}(x))^{-1}$. Furthermore, let $W$ be a complex vector space of dimension $m$ with Hermitian inner product $\langle\cdot, \cdot\rangle$. Let $A = (A_1, \dots, A_n)$ be a smooth Yang–Mills potential, i.e. the $A_j$ are $C^\infty$ functions on $\mathbb{R}^n$ with values in the Hermitian operators on $W$. Then we consider the following differential operator:
\[
\Delta = \Delta_{g,A} = -G^{-1/4}\sum_{j,k=1}^n\Bigl(\frac{\partial}{\partial x_j} + iA_j\Bigr)\sqrt{G}\,g^{jk}\Bigl(\frac{\partial}{\partial x_k} + iA_k\Bigr)G^{-1/4}, \tag{4.13}
\]

acting in the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^n)\otimes W$. Note that this class of operators contains, for example, all spinor Laplacians on $\mathbb{R}^n$, i.e. the squares of twisted Dirac operators. For $p > 0$, we make the following assumption: For every $\alpha \in \mathbb{N}^n$, there exists $C_\alpha > 0$ such that
\[
\sum_{i,j}\bigl|\partial^\alpha(g_{ij}(x) - \delta_{ij})\bigr| + \sum_i\bigl|\partial^\alpha A_i(x)\bigr| \le C_\alpha(1 + \|x\|^2)^{-p-|\alpha|}. \tag{4.14}
\]

In [CKS] it was proved that $\Delta_{g,A}$, regarded as an operator in $\mathcal{H}$ with domain $C_0^\infty(\mathbb{R}^n)\otimes W$, is essentially self-adjoint. Let $H = H_{g,A}$ be the corresponding self-adjoint extension. Furthermore let
\[
\Delta_0 = -\sum_{i=1}^n\frac{\partial^2}{\partial x_i^2}\otimes\mathrm{Id}_W,
\]
and let $H_0$ be the self-adjoint extension of $\Delta_0$. $H$ and $H_0$ are nonnegative self-adjoint operators in $\mathcal{H}$. The following results, which we summarize as a proposition, were established in [CKS].

Proposition 4.2. Suppose that $p > 1$. Then
a) The wave operators $W_\pm(H, H_0)$ exist and are complete.
b) $H$ has no point spectrum.
c) The singular continuous spectrum of $H$ is empty.

If $p > n$, it follows from [Ro, Br] that for $t > 0$, $e^{-tH} - e^{-tH_0}$ is a trace class operator and as $t \to 0$, there is an asymptotic expansion of the form
\[
\operatorname{Tr}\bigl(e^{-tH} - e^{-tH_0}\bigr) \sim \sum_{j\ge 0} a_j\,t^{-n/2+j}. \tag{4.15}
\]

Thus for $m > 0$, the pair of operators $(H + m, H_0 + m)$ satisfies (1.1)–(1.3) and the relative zeta function $\zeta(s; H + m, H_0 + m)$ can be defined in $\operatorname{Re}(s) > n/2$ by the usual formula
\[
\zeta(s; H + m, H_0 + m) = \operatorname{Tr}\bigl((H + m)^{-s} - (H_0 + m)^{-s}\bigr). \tag{4.16}
\]
Remark. In a more general context, such zeta functions have been studied by Bruneau [Br] in his thesis. Bruneau considers a certain class of elliptic operators in $\mathbb{R}^n$ which, for example, includes the Laplacians for asymptotically equal metrics and also the operators of the type (4.13). He then studies relative zeta functions like (4.16).

In order to define the relative zeta function for $(H, H_0)$ itself, we need some information about the behaviour of the scattering matrix at $\lambda = 0$. This requires more stringent assumptions. For example, we may assume that
\[
g_{ij}(x) = \delta_{ij}, \quad i, j = 1, \dots, n, \qquad A_j(x) = 0, \quad j = 1, \dots, n,
\]
for $\|x\| \ge R$. Then $\Delta_{g,A} = \Delta_0$ outside a compact set. If $n$ is odd, it follows from Theorem 1.1 of [SZ] that $(H - \lambda^2)^{-1}$, $\operatorname{Im}(\lambda) > 0$, has a continuation to a meromorphic function of $\lambda \in \mathbb{C}$ with values in the operators from $L^2_{\mathrm{comp}}(\mathbb{R}^n)\otimes W$ to $L^2_{\mathrm{loc}}(\mathbb{R}^n)\otimes W$. Let $S(\lambda) = S(\lambda; H, H_0)$ be the scattering matrix. Then, using the analytic continuation of the resolvent, it follows that $S(\lambda^2)$ extends to a meromorphic function of $\lambda \in \mathbb{C}$. We can now proceed in essentially the same way as in the case of the Schrödinger operator in odd dimensions. In particular, the relative determinant $\det(\Delta_{g,A}, \Delta_0)$ exists and we can study it as a function of the metric $g$ and the gauge field $A$.

In the same way, one can study Dirac operators on $\mathbb{R}^n$ coupled to a gauge field. For example, consider the Dirac operator on $\mathbb{R}^3$:


\[
D_A = c\sum_{j=1}^3\gamma_j\Bigl(\frac{\partial}{\partial x_j} + iA_j(x)\Bigr) + mc^2\gamma_4 + V(x), \qquad x \in \mathbb{R}^3,
\]
where $c > 0$ is the speed of light, $\gamma_1, \dots, \gamma_4$ are the Dirac matrices, $V$ is the electric potential, $m$ is the mass and $A = (A_1, A_2, A_3)$ is the magnetic field. We assume that the potential $V$ and the magnetic field $A$ are $C^\infty$ and satisfy the following assumption: For each $\alpha \in \mathbb{N}^3$, there exists $C_\alpha > 0$ such that
\[
|\partial^\alpha A(x)| + |\partial^\alpha V(x)| \le C_\alpha(1 + |x|^2)^{-p-|\alpha|}, \qquad x \in \mathbb{R}^3,
\]
where $p > 0$ is independent of $\alpha$. Then $D_A$ is essentially self-adjoint in $L^2(\mathbb{R}^3)^4$. Let $D_0$ be the free Dirac operator, i.e. the Dirac operator with magnetic field $A = 0$. If $p > 1$, then the wave operators $W_\pm(D_A, D_0)$ exist and are complete. The singular continuous spectrum of $D_A$ is empty and the essential spectrum of $D_A$ equals $(-\infty, -mc^2]\cup[mc^2, \infty)$ (cf. [Th]). In particular, if $m > 0$, then the continuous spectrum of $D_A$ has a gap at zero. It is proved in [Br, Br1] that the operators $(D_A, D_0)$ satisfy (1.11) and (1.12). Due to the gap of the continuous spectrum, $\operatorname{Tr}\bigl(D_A\exp(-tD_A^2) - D_0\exp(-tD_0^2)\bigr)$ is exponentially decreasing for $t \to \infty$. Thus we can define the relative eta invariant $\eta(D_A, D_0)$. Moreover, $\det(|D_A|, |D_0|)$ also exists and therefore $\det(D_A, D_0)$ exists too and is given by (3.19).

4.3. Manifolds with cylindrical ends. Let $M$ be a compact $n$-dimensional $C^\infty$ manifold with smooth boundary $Y$. Let $X$ be the non-compact manifold obtained from $M$ by gluing the bottom of the half-cylinder $\mathbb{R}^+\times Y$ to the boundary of $M$, i.e. $X = M\cup_Y(\mathbb{R}^+\times Y)$. We equip $X$ with a Riemannian metric $g$ which is a product on $\mathbb{R}^+\times Y$, i.e. on $\mathbb{R}^+\times Y$, $g$ takes the form $g = du^2 + h$, where $h$ is the pull-back of a metric on $Y$. Then $X$ equipped with $g$ is called a “manifold with a cylindrical end”. In this way, we obtain a natural class of complete Riemannian metrics on $X$. We also may consider larger classes of metrics which are obtained by perturbations of cylindrical end metrics. For example, we may require that the perturbation together with all its derivatives is exponentially decreasing as $u \to \infty$, where $u$ is the radial variable on the cylinder. If we change coordinates by $v = e^{-u}$, then we get a metric on the compact manifold $M$. Let $x$ be a defining function of $\partial M$. Then near the boundary, the metric $g$ has the form
\[
\frac{dx^2}{x^2} + h,
\]
where $h$ is a semi-positive metric on $M$ which restricts to a non-degenerate metric on $\partial M$. These are the exact b-metrics studied by Melrose [Me1, Me2].
Another possible condition would be to require that the perturbation and a finite number of its derivatives decay like $(1 + u^2)^{-p}$, for some $p > 1$. Here we shall restrict attention to metrics which are exact products on the cylinder. But most of the results extend to larger classes of metrics; for example, to the class of exact b-metrics. Let $X$ be equipped with a cylindrical end metric and let $\Delta$ be the


corresponding Laplacian. Then $\Delta$ is essentially self-adjoint and we continue to denote its unique self-adjoint extension by $\Delta$. Now, we shall briefly describe some facts concerning the spectral decomposition of $\Delta$. For details see [Me1, Mu3]. The spectrum of $\Delta$ is the union of a pure point spectrum $\sigma_{pp}(\Delta)$ and the absolutely continuous spectrum $\sigma_{ac}(\Delta)$. The point spectrum $\sigma_{pp}(\Delta)$ consists of eigenvalues of finite multiplicity with no finite point of accumulation [Do]. Let $0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \to \infty$ be the sequence of eigenvalues, where each eigenvalue is repeated according to its multiplicity. Let $N_{pp}(\lambda) = \#\{j \mid \lambda_j \le \lambda\}$. It was proved by Christiansen and Zworski [CZ] that there exists $C > 0$ such that
\[
N_{pp}(\lambda) \le C(1 + \lambda)^{n/2}, \qquad \lambda \ge 0, \tag{4.17}
\]
$n = \dim X$. The first result of this type is due to Donnelly [Do], who proved a similar bound with $n$ replaced by $2n - 1$. We note that (4.17) is the optimal bound [CZ]. The estimate (4.17) implies that for $t > 0$,
\[
\sum_i e^{-t\lambda_i} < \infty. \tag{4.18}
\]
Let $L^2_p(X) \subset L^2(X)$ be the subspace spanned by the $L^2$-eigenfunctions of $\Delta$. Then it follows from (4.18) that $e^{-t\Delta}|L^2_p(X)$ is a trace class operator and
\[
\operatorname{Tr}\bigl(e^{-t\Delta}|L^2_p(X)\bigr) = \sum_j e^{-t\lambda_j}. \tag{4.19}
\]
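The step from (4.17) to (4.18) is a standard summation by parts. As a brief sketch, writing the sum as a Stieltjes integral against the counting function and using that $e^{-t\lambda}N_{pp}(\lambda) \to 0$ as $\lambda \to \infty$ by the polynomial bound (4.17), one obtains for $t > 0$
\[
\sum_j e^{-t\lambda_j} = \int_0^\infty e^{-t\lambda}\,dN_{pp}(\lambda)
= t\int_0^\infty e^{-t\lambda}N_{pp}(\lambda)\,d\lambda
\le C\,t\int_0^\infty e^{-t\lambda}(1 + \lambda)^{n/2}\,d\lambda < \infty.
\]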

Now note that on $\mathbb{R}^+\times Y$ one has
\[
\Delta = -\frac{\partial^2}{\partial u^2} + \Delta_Y. \tag{4.20}
\]
Therefore, standard perturbation theory implies that the continuous spectrum of $\Delta$ is governed by the Laplacian $\Delta_Y$ of $Y$. We regard the right-hand side of (4.20) as an operator in $L^2(\mathbb{R}^+\times Y)$ with domain $C_0^\infty(\mathbb{R}^+\times Y)$ and impose Dirichlet boundary conditions at the bottom of the half-cylinder. Let $\Delta_0$ be the corresponding self-adjoint extension. Then $e^{-t\Delta} - e^{-t\Delta_0}$ is a trace class operator for $t > 0$. Hence, by the Birman–Kato invariance principle, the wave operators $W_\pm(\Delta, \Delta_0)$ exist and are complete. Therefore, the absolutely continuous part $\Delta_{ac}$ of $\Delta$ is unitarily equivalent to $\Delta_0$. Let $0 = \mu_0 < \mu_1 < \mu_2 < \cdots \to \infty$ be the sequence of eigenvalues of $\Delta_Y$. Then it follows that
\[
\sigma_{ac}(\Delta) = \bigcup_{j=0}^\infty[\mu_j, \infty),
\]
i.e. the $\mu_j$'s are the thresholds of the continuous spectrum. At each threshold there starts a new branch of the continuous spectrum with multiplicity equal to the dimension of the corresponding eigenspace.


A more explicit description of the continuous spectrum can be given in terms of generalized eigenfunctions. Let $E(\mu_k)$ be the eigenspace for the eigenvalue $\mu_k$ of $\Delta_Y$. Let $\phi \in E(\mu_k)$ and let $\lambda > \mu_k$ be different from all thresholds. Then there exists a unique eigenfunction $E(\phi, \lambda) \in C^\infty(X)$ of $\Delta$ with eigenvalue $\lambda$ which on $\mathbb{R}^+\times Y$ has the following expansion:
\[
E(\phi, \lambda, (u, y)) = e^{iu\sqrt{\lambda-\mu_k}}\phi(y) + \sum_{\mu_l<\lambda} e^{-iu\sqrt{\lambda-\mu_l}}\,(T_{kl}(\lambda)\phi)(y) + \Psi, \tag{4.21}
\]
where $\Psi \in L^2(\mathbb{R}^+\times Y)$ and $T_{kl}(\lambda) : E(\mu_k) \to E(\mu_l)$ is a linear operator. Furthermore, each generalized eigenfunction $E(\phi, \lambda)$ extends to a meromorphic function of $\lambda \in \Sigma$, where $\Sigma$ is the minimal Riemann surface to which all functions $\sqrt{\lambda - \mu_k}$, $k \in \mathbb{N}$, extend to be holomorphic. This is a consequence of the extension of the resolvent $(\Delta - \lambda)^{-1}$, as an operator in certain weighted $L^2$ spaces, to a meromorphic function of $\lambda \in \Sigma$ (see [Mu4, Me1]). More precisely, let $\rho \in C^\infty(X)$ be such that $\rho(u, y) = u$ for $(u, y) \in [1, \infty)\times Y$. For $\delta \in \mathbb{R}$, let $L^2_\delta(X)$ be the Hilbert space of all measurable functions on $X$ which are square integrable with respect to the measure $e^{2\delta\rho(x)}dx$. Then the following holds:

Theorem 4.3. Let $\delta > 0$. Then the resolvent $(\Delta - \lambda)^{-1/2}$, as an operator from $L^2_\delta(X)$ to $L^2_{-\delta}(X)$, extends to a holomorphic function of $\lambda \in \Sigma$.

For the proof see [Me1]. One also can adapt the method described in [Mu4]. Let
\[
S_{kl}(\lambda) = \Bigl(\frac{\lambda - \mu_k}{\lambda - \mu_l}\Bigr)^{1/4}T_{kl}(\lambda), \tag{4.22}
\]
and put
\[
S(\lambda) = \bigoplus_{\mu_k,\mu_l<\lambda} S_{kl}(\lambda).
\]
Then
\[
S(\lambda) : \bigoplus_{\mu_k<\lambda} E(\mu_k) \to \bigoplus_{\mu_k<\lambda} E(\mu_k)
\]
is equal to the scattering matrix $S(\lambda; \Delta, \Delta_0)$. Since the generalized eigenfunctions are meromorphic functions of $\lambda \in \Sigma$, the same holds for $T_{kl}(\lambda)$ and therefore for $S_{kl}(\lambda)$ too. This leads to the following result about the structure of the scattering matrix.

Proposition 4.4. The scattering matrix $S(\lambda)$ is a unitary operator of finite rank which depends smoothly on $\lambda \in (\mu_{j-1}, \mu_j)$ for all $j \in \mathbb{N}$, and its rank changes when $\lambda$ crosses a threshold. Moreover, the coefficients of $S(\lambda)$ extend to meromorphic functions on $\Sigma$.

We note that if $0 < \lambda < \mu_1$, then $S(\lambda)$ is a scalar and $S(\lambda^2)$ extends to a meromorphic function on $\{z \in \mathbb{C} \mid |z| < \mu_1\}$ which is regular at $\lambda = 0$. In particular, $S(\lambda)$ is a holomorphic function in a small neighborhood of the origin. Thus, condition 2) of Proposition 2.4 is satisfied and it remains to show that the spectral shift function $\xi(\lambda) = \xi(\lambda; \Delta, \Delta_0)$ is continuous in some interval $(0, \varepsilon)$, $\varepsilon > 0$. To study $\xi(\lambda)$ one may use the definition (2.1) of the spectral shift function and proceed along lines similar to [JK2]. However, for our present purpose, we also can use the trace formula proved by


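Christiansen [Ch] for exact b-metrics (see [Me2, Chapter 7]). The trace that is used by Christiansen is a regularization of the trace for operators with continuous kernels, called b-trace, which was introduced by Melrose [Me1]. The trace formula computes the b-trace of $\cos(t\sqrt{\Delta})$, as a tempered distribution, in terms of spectral data. But the b-trace of $\cos(t\sqrt{\Delta})$ is closely related to the trace of $\cos(t\sqrt{\Delta}) - \cos(t\sqrt{\Delta_0})$, taken in the distributional sense. Therefore, the b-trace formula gives rise to a relative trace formula. For example, the b-trace of $e^{-t\Delta}$ is given by
\[
\operatorname{b-Tr}\bigl(e^{-t\Delta}\bigr) = \lim_{R\to\infty}\Bigl(\int_{X_R} e^{-t\Delta}(x, x)\,dx - \frac{R}{\sqrt{4\pi t}}\operatorname{Tr}\bigl(e^{-t\Delta_Y}\bigr)\Bigr),
\]
where $X_R = M\cup_Y([0, R]\times Y)$. Now note that the kernel of $e^{-t\Delta_0}$ equals
\[
\frac{1}{\sqrt{4\pi t}}\Bigl(\exp\Bigl(-\frac{(u - u')^2}{4t}\Bigr) - \exp\Bigl(-\frac{(u + u')^2}{4t}\Bigr)\Bigr)e^{-t\Delta_Y}(y, y').
\]
Thus, we get
\[
\begin{aligned}
\operatorname{Tr}\bigl(e^{-t\Delta} - e^{-t\Delta_0}\bigr)
&= \lim_{R\to\infty}\Bigl(\int_{X_R} e^{-t\Delta}(x, x)\,dx - \int_{[0,R]\times Y} e^{-t\Delta_0}(x, x)\,dx\Bigr)\\
&= \lim_{R\to\infty}\Bigl(\int_{X_R} e^{-t\Delta}(x, x)\,dx - \frac{R}{\sqrt{4\pi t}}\operatorname{Tr}\bigl(e^{-t\Delta_Y}\bigr)\Bigr)
+ \frac{1}{\sqrt{4\pi t}}\int_0^\infty e^{-u^2/t}\,du\,\operatorname{Tr}\bigl(e^{-t\Delta_Y}\bigr)\\
&= \operatorname{b-Tr}\bigl(e^{-t\Delta}\bigr) + \frac{1}{4}\operatorname{Tr}\bigl(e^{-t\Delta_Y}\bigr).
\end{aligned}
\]
Since $e^{-t\Delta}$ can be expressed in terms of $\cos(t\sqrt{\Delta})$ by
\[
e^{-t\Delta} = \frac{1}{\sqrt{4\pi t}}\int_{\mathbb{R}} e^{-u^2/4t}\cos\bigl(u\sqrt{\Delta}\bigr)\,du,
\]
the application of the b-trace formula leads to the following relative trace formula:
\[
\operatorname{Tr}\bigl(e^{-t\Delta} - e^{-t\Delta_0}\bigr) = \sum_j e^{-t\lambda_j} + \frac{1}{2}\sum_{k=1}^\infty\dim E(\mu_k)\,e^{-t\mu_k} + \frac{1}{4}\operatorname{Tr}(S(0))
+ \frac{1}{2\pi i}\int_0^\infty e^{-t\lambda}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda. \tag{4.23}
\]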
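Two elementary Gaussian integrals enter the derivation of (4.23): the constant $\frac{1}{4}$ in front of $\operatorname{Tr}(e^{-t\Delta_Y})$, which comes from the reflected term of the Dirichlet heat kernel on the half-line, and the formula expressing $e^{-t\Delta}$ through $\cos(u\sqrt{\Delta})$. The following is a small numerical sketch of both, under the simplifying assumption (made here only for illustration) that $\sqrt{\Delta}$ is replaced by a nonnegative number $k$, i.e. that one works on a single spectral value. The helper names and cut-offs are hypothetical choices.

```python
import math


def midpoint(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


def reflection_constant(t):
    """(4 pi t)^(-1/2) * int_0^inf exp(-u^2 / t) du.

    The integral is truncated at 12 * sqrt(t), where the Gaussian is
    far below double precision.
    """
    cutoff = 12.0 * math.sqrt(t)
    integral = midpoint(lambda u: math.exp(-u * u / t), 0.0, cutoff)
    return integral / math.sqrt(4.0 * math.pi * t)


def heat_from_cosine(t, k):
    """(4 pi t)^(-1/2) * int_R exp(-u^2 / (4t)) cos(u k) du.

    Scalar version of the formula for exp(-t Delta), with sqrt(Delta)
    replaced by the number k.
    """
    cutoff = 24.0 * math.sqrt(t)
    integral = midpoint(
        lambda u: math.exp(-u * u / (4.0 * t)) * math.cos(u * k), -cutoff, cutoff
    )
    return integral / math.sqrt(4.0 * math.pi * t)
```

Under these assumptions, `reflection_constant(t)` should be close to `0.25` for every `t > 0`, and `heat_from_cosine(t, k)` should be close to `math.exp(-t * k * k)`.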


Remark. An analogous trace formula holds for $\varphi(\Delta) - \varphi(\Delta_0)$, where $\varphi$ is any function in $C_0^\infty(\mathbb{R})$. It follows from Proposition 2.1 that in order to establish the trace formula, it is sufficient to determine the spectral shift function up to an additive constant. On the other hand, one can compute the spectral shift function explicitly using the eigenfunction expansion of the corresponding kernels. This approach was used in [Mu5] to compute the spectral shift function for spinor Laplacians on finite volume locally symmetric spaces of $\mathbb{Q}$-rank one. The computations for manifolds with cylindrical ends are similar; in fact, they are easier.

Using the trace formula (4.23) together with the fact that $S(\lambda^2)$ is holomorphic in a neighborhood of $\lambda = 0$, it follows that as $t \to \infty$, we have an asymptotic expansion of the form
\[
\operatorname{Tr}\bigl(e^{-t\Delta} - e^{-t\Delta_0}\bigr) \sim \dim\ker\Delta + \frac{1}{4}\operatorname{Tr}(S(0)) + \sum_{j=1}^\infty c_j\,t^{-j/2}.
\]
To determine the asymptotic behaviour as $t \to 0$, we may use the standard parametrix construction, which implies that there is an asymptotic expansion of the form
\[
\operatorname{Tr}\bigl(e^{-t\Delta} - e^{-t\Delta_0}\bigr) \sim \sum_{j=0}^\infty a_j\,t^{-n/2+j}
\]
as $t \to 0$. Hence, conditions (1.1)–(1.3) hold and the relative zeta function $\zeta(s; \Delta, \Delta_0)$ is a meromorphic function of $s \in \mathbb{C}$. Using the trace formula (4.23), the zeta function can be expressed in terms of the eigenvalues and the scattering phase shift. First we note that by the main result of [CZ], the scattering phase satisfies
\[
\int_0^\Lambda\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda = O(\Lambda^n)
\]
as $\Lambda \to \infty$. Let $z \in \mathbb{C}$, $\operatorname{Re}(z) > 0$. Then, using (4.23), it follows that for $\operatorname{Re}(s) > n$, we have
\[
\zeta(s; \Delta + z, \Delta_0 + z) = \sum_j(\lambda_j + z)^{-s} + \sum_k(\mu_k + z)^{-s}
+ \frac{1}{2\pi i}\int_0^\infty(\lambda + z)^{-s}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda. \tag{4.24}
\]
The series and the integral are absolutely convergent for $s$ in the given half-plane. There is a similar formula for $z = 0$, except that the integral has to be defined by analytic continuation using the expansion
\[
\log\det S(\lambda) = \sum_{j=0}^\infty d_j\lambda^{j/2}, \qquad |\lambda| < \varepsilon.
\]
This expansion is a consequence of the analyticity of $S(\lambda^2)$ in a neighborhood of $\lambda = 0$.

Everything that has been said for the Laplacian $\Delta$ on functions extends without any difficulties to generalized Dirac operators. Suppose, for example, that $X$ is a spin manifold and let $F \to X$ be a Hermitian vector bundle over $X$ with a Hermitian


connection. We assume that all structures are adapted to the product structure on $\mathbb{R}^+\times Y$. Let $S$ be the spinor bundle of $X$ and let $D : C^\infty(S\otimes F) \to C^\infty(S\otimes F)$ be the associated twisted Dirac operator. Then, on $\mathbb{R}^+\times Y$, $D$ has the following form:
\[
D = \gamma\Bigl(\frac{\partial}{\partial u} + D_Y\Bigr), \tag{4.25}
\]
where $\gamma$ denotes the Clifford multiplication by the exterior normal vector field to $Y$ and $D_Y$ is a twisted Dirac operator on $Y$. We note that $D$ is essentially self-adjoint in $L^2(S\otimes F)$ and we shall denote its unique self-adjoint extension also by $D$. Let $H = D^2$ and let $H_0$ be the closure of
\[
-\frac{\partial^2}{\partial u^2} + D_Y^2 : C_0^\infty(\mathbb{R}^+\times Y, S\otimes F) \to L^2(\mathbb{R}^+\times Y, S\otimes F)
\]
obtained by imposing Dirichlet boundary conditions. Then, as above, $H$, $H_0$ satisfy the conditions (1.1)–(1.3) and therefore $\zeta(s; H, H_0)$ and $\det(H, H_0)$ are well defined. Since, by (4.25), $D$ determines uniquely the operator $H_0$, $\det(H, H_0)$ depends only on $D$ and we shall denote it simply by $\det(D^2)$. Now assume that $\dim X$ is even. Then the spinor bundle splits as $S = S_+\oplus S_-$ and, with respect to this splitting, $D$ takes the form
\[
D = \begin{pmatrix} 0 & D_-\\ D_+ & 0 \end{pmatrix}.
\]
Then $D_-D_+$ and $D_+D_-$ have unique self-adjoint extensions $H^+$ and $H^-$, respectively. Note that $H = H^+\oplus H^-$. Similarly, we get $H_0^\pm$ such that $H_0 = H_0^+\oplus H_0^-$. Then the relative determinants $\det(H^\pm, H_0^\pm)$ can be defined as above and we shall denote them by $\det(D_-D_+)$ and $\det(D_+D_-)$, respectively.

Next we consider families of Dirac operators on manifolds with cylindrical ends. Let $M$ and $B$ be connected Riemannian manifolds and assume that $B$ is compact. Let $\pi : M \to B$ be a Riemannian submersion whose fibres $Z_b = \pi^{-1}(b)$ are $2k$-dimensional manifolds with cylindrical ends. We assume that $Z_b$ is an oriented spin manifold. Such families can be constructed from fibrations of manifolds with boundary along the lines of [BC]. Let $F \to M$ be a Hermitian vector bundle which is adapted to the product structure and let $D_b$ be the twisted Dirac operator on the fibre $Z_b$. Let $\mathbb{R}^+\times Y_b$ be the cylindrical end of $Z_b$. Then on $\mathbb{R}^+\times Y_b$, $D_b$ has the form
\[
D_b = \gamma_b\Bigl(\frac{\partial}{\partial u} + D_{Y_b}\Bigr).
\]
Let $D$ denote the corresponding family of Dirac operators. Assume that $0 \notin \operatorname{Spec}(D_{Y_b})$ for all $b \in B$. Then for each $b \in B$, $D_b$ has discrete spectrum near zero, and we may construct the determinant line bundle
\[
\mathcal{L} = \det(\ker D_+)^*\otimes\det(\ker D_-)
\]


in the same way as in [BF1]. Using the determinant $\det(D_{-,b}D_{+,b})$ defined as above, we can equip $\mathcal{L}$ with the Quillen metric. More generally, we may assume that $\dim\ker D_{Y_b}$ is constant. This means that the orthogonal projections $P_b$ of $L^2(S\otimes F|Y_b)$ onto $\ker D_{Y_b}$ are a smooth family, i.e. $\ker D_{Y_b}$, $b \in B$, gives rise to a smooth vector bundle $\ker D_Y$ over $B$. Then we modify the Dirac operator as follows. Let $f \in C^\infty(\mathbb{R})$ be such that $f(u) = 0$ for $u \le 1$ and $f(u) = 1$ for $u \ge 2$. Put
\[
\widetilde D_b = D_b - \gamma_b f(u)P_b.
\]
By our assumption, we get an operator with smooth coefficients
\[
\widetilde D : C^\infty(M, S\otimes F) \to C^\infty(M, S\otimes F).
\]
Although $\widetilde D$ is not a differential operator, it is easy to see that $\widetilde D$ has similar spectral properties as $D$. Furthermore, note that on $[2, \infty)\times Y_b$, we have
\[
\widetilde D_b = \gamma_b\Bigl(\frac{\partial}{\partial u} + D_{Y_b} - P_b\Bigr)
\]
and by definition, $\widetilde D_{Y_b} = D_{Y_b} - P_b$ is invertible for all $b \in B$. This means that the continuous spectrum of $\widetilde D_b$ has a gap at zero. Thus, we may proceed as above and construct the determinant line bundle $\widetilde{\mathcal{L}}$ for the family $\widetilde D$ and equip $\widetilde{\mathcal{L}}$ with the Quillen metric. The following lemma implies that the two line bundles are in fact the same.

Lemma 4.5. Let $\ker\widetilde D_b$ and $\ker D_b$ denote the spaces of $L^2$-solutions of $\widetilde D_b$ and $D_b$, respectively. Then we have
\[
\ker\widetilde D_b = \ker D_b, \qquad b \in B.
\]
Proof. To simplify notation, we skip the subscript $b$. Let $\phi_j$, $j \in \mathbb{Z}$, be an orthonormal basis of eigenfunctions of $D_Y$ and let $\lambda_j$ be the corresponding eigenvalues. If $\varphi \in \ker D$, then on $\mathbb{R}^+\times Y$, $\varphi$ has the following expansion:
\[
\varphi(u, y) = \sum_{\lambda_j>0} a_j e^{-\lambda_j u}\phi_j(y).
\]
Thus $P\varphi = 0$ and therefore $\varphi \in \ker\widetilde D$. Now let $\psi \in \ker\widetilde D$. The restriction of $\psi$ to $\mathbb{R}^+\times Y$ is given by
\[
\psi(u, y) = \sum_{\lambda_j>0} b_j e^{-\lambda_j u}\phi_j(y) + \sum_{\lambda_k=0} c_k\exp\Bigl(\int_0^u f(v)\,dv\Bigr)\phi_k(y).
\]
Since $f(u) = 1$ for $u > 2$ and $\psi$ is square integrable, it follows that $c_k = 0$ for all $k$. Thus $P\psi = 0$ and hence $D\psi = 0$.

Thus, the determinant line bundle $\widetilde{\mathcal{L}}$ associated to $\widetilde D$ coincides with $\mathcal{L}$. Since for each $b \in B$, the spectrum of $\widetilde D_b$ has a gap at zero, we can construct the Quillen metric as above.

4.4. Surfaces with hyperbolic ends. Let $(X, g)$ be a complete surface of finite area such that the Gaussian curvature $K$ of $(X, g)$ satisfies $K \equiv -1$ in the complement of a


compact set. We call $(X, g)$ a surface with hyperbolic ends. Any such surface $X$ admits a decomposition of the form
\[
X = X_0\cup Y_1\cup\cdots\cup Y_m,
\]
where $X_0$ is a compact surface with smooth boundary and $Y_i \cong [1, \infty)\times S^1$, $i = 1, \dots, m$, and the metric on $Y_i$ equals
\[
ds^2 = \frac{dy^2 + dx^2}{y^2},
\]
where $(y, x) \in [1, \infty)\times S^1$. Each end $Y_i$ is called a cusp of $X$. Special cases are hyperbolic surfaces of finite area, i.e. surfaces with $K \equiv -1$ everywhere and $\operatorname{Area}(X) < \infty$. Then there exists a discrete torsion free subgroup $\Gamma \subset SL(2, \mathbb{R})$ of finite co-volume such that $X = \Gamma\backslash\mathbb{H}$, where $\mathbb{H}$ denotes the upper half-plane equipped with the Poincaré metric. Examples are the well known surfaces $\Gamma(N)\backslash\mathbb{H}$, where $\Gamma(N) \subset SL(2, \mathbb{Z})$ is the principal congruence subgroup of level $N$.

Let $\Delta$ be the Laplacian of $X$. Then $\Delta$ is essentially self-adjoint in $L^2(X)$. The structure of the spectrum $\sigma(\Delta)$ of $\Delta$ is well known (see e.g. [Mu2]). Namely, $\sigma(\Delta)$ is the union of the absolutely continuous spectrum $\sigma_{ac}(\Delta)$ and the pure point spectrum $\sigma_{pp}(\Delta)$. The point spectrum consists of a sequence of eigenvalues $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$ of finite multiplicity. The only possible point of accumulation of the eigenvalue sequence is $\infty$ and, as proved by Colin de Verdière [CV2], for a generic metric there exist only finitely many eigenvalues, which are all contained in $[0, 1/4)$. The absolutely continuous spectrum equals $[1/4, \infty)$ with multiplicity equal to the number of cusps $m$. Let $\Delta_0$ be the self-adjoint extension of the operator
\[
-\sum_{i=1}^m y_i^2\frac{d^2}{dy_i^2} : \bigoplus_{i=1}^m C_0^\infty([1, \infty)) \to \bigoplus_{i=1}^m L^2\bigl([1, \infty), y_i^{-2}dy_i\bigr)
\]
with respect to Dirichlet boundary conditions. If we regard a function on $[1, \infty)$ as a function on $[1, \infty)\times S^1 \cong Y_i$ which is independent of the second variable, then we get a canonical inclusion
\[
\bigoplus_{i=1}^m L^2\bigl([1, \infty), y_i^{-2}dy_i\bigr) \subset L^2(X). \tag{4.26}
\]
Hence, we may consider $e^{-t\Delta_0}$ as an operator in $L^2(X)$ which is equal to zero in the orthogonal complement of the image of the inclusion (4.26). With this identification, it was proved in [Mu1] that $e^{-t\Delta} - e^{-t\Delta_0}$ is a trace class operator for every $t > 0$. Thus the absolutely continuous part $\Delta_{ac}$ of $\Delta$ and $\Delta_0$ are unitarily equivalent. Moreover, there exists a complete set of generalized eigenfunctions $E_i(z, s)$, $i = 1, \dots, m$, which are meromorphic functions of $s \in \mathbb{C}$ satisfying
\[
\Delta E_i(z, s) = s(1 - s)E_i(z, s), \qquad s \in \mathbb{C}.
\]
Furthermore, each $E_i(z, s)$ is regular on the line $\operatorname{Re}(s) = 1/2$. The restriction of $E_i(z, s)$ to the cusp $Y_j$ can be expanded in a Fourier series and the zeroth Fourier coefficient has the form
\[
\delta_{ij}\,y_j^s + C_{ij}(s)\,y_j^{1-s}.
\]


Put C(s) = Cij (s) . Then C(s) is a meromorphic matrix valued function which satisfies the following functional equation: C(s)C(1 − s) = Id. Moreover, the scattering matrix S(λ) = S(λ; 1, 10 ) is related to C(s) by 1 1 + λ2 = C + iλ , λ ∈ R. S 4 2

(4.27)

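Hence S(λ) has an extension to a meromorphic function on the double covering of C defined by λ = s(1 − s) and we have S(s(1 − s)) = C(s). Using the generalized eigenfunctions, the spectral shift function ξ(λ) = ξ(λ; Δ, Δ_0) can be computed explicitly. Together with Proposition 2.1, this leads to the following trace formula:

$$\begin{aligned}
\mathrm{Tr}\left(e^{-t\Delta} - e^{-t\Delta_0}\right) &= \sum_j e^{-t\lambda_j} + \frac{1}{4}\,\mathrm{Tr}\,S(1/4)\,e^{-t/4} - \frac{1}{2\pi i}\int_{1/4}^{\infty} e^{-t\lambda}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda \\
&= \sum_j e^{-t\lambda_j} + \frac{1}{4}\,\mathrm{Tr}\,C(1/2)\,e^{-t/4} - \frac{1}{4\pi i}\int_{-\infty}^{\infty} e^{-(1/4+\lambda^2)t}\,\frac{d}{d\lambda}\log\det C(1/2 + i\lambda)\,d\lambda,
\end{aligned} \tag{4.28}$$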

where we have used (4.27).
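The passage between the two lines of (4.28) can be spelled out as a short check. The following is only a sketch, assuming (4.27) and the functional equation C(s)C(1 − s) = Id; the symbol μ for the spectral variable of the first integral is introduced here to avoid a clash with the λ of the second one.

```latex
% Substitute \mu = 1/4 + \lambda^2 with \lambda \ge 0, so that S(\mu) = C(1/2 + i\lambda) by (4.27):
\int_{1/4}^{\infty} e^{-t\mu}\,\frac{d}{d\mu}\log\det S(\mu)\,d\mu
  = \int_{0}^{\infty} e^{-(1/4+\lambda^2)t}\,\frac{d}{d\lambda}\log\det C(1/2+i\lambda)\,d\lambda .
% The functional equation gives \det C(1/2+i\lambda)\,\det C(1/2-i\lambda) = 1, hence
% \frac{d}{d\lambda}\log\det C(1/2+i\lambda) is an even function of \lambda and
\int_{0}^{\infty}(\cdots)\,d\lambda = \frac{1}{2}\int_{-\infty}^{\infty}(\cdots)\,d\lambda ,
% which turns the prefactor 1/(2\pi i) into 1/(4\pi i).
% At \lambda = 0 the relation (4.27) also gives S(1/4) = C(1/2), matching the middle terms.
```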

Remark. There is a similar trace formula for $\cos(t\sqrt{\Delta}) - \cos(t\sqrt{\Delta_0})$. If X is hyperbolic, i.e. X = Γ\H, then the left-hand side of this trace formula can be rewritten as a sum of orbital integrals where the sum runs over the conjugacy classes in Γ. This is then the Selberg trace formula. It follows from (4.28) that as t → ∞, we have $\mathrm{Tr}\left(e^{-t\Delta} - e^{-t\Delta_0}\right) = 1 + O(e^{-ct})$ for some c > 0. Furthermore, using Theorem 8.20 of [Mu1], we obtain the following asymptotic expansion for t → 0: $\mathrm{Tr}\left(e^{-t\Delta} - e^{-t\Delta_0}\right) = \frac{\mathrm{Area}(X)}{4\pi t} + \frac{m \log t}{2\sqrt{4\pi t}} +$

√ 3γm 1 χ(X) √ + O( t). + 2 6 4πt

(4.29)

Here γ is Euler's constant and χ(X) is the Euler characteristic of X. The higher order terms are similar. This is an example where logarithmic terms arise naturally. Summarizing, it follows that Δ, Δ_0 satisfy (1.1)–(1.3) and therefore the relative zeta function ζ(s; Δ, Δ_0) exists.
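How the heat trace (4.28) feeds into the zeta function can be illustrated term by term. This is a sketch under an assumption that is not restated in this section, namely that the relative zeta function is the Mellin transform of the heat trace difference with the contribution of the zero eigenvalue removed.

```latex
% Assumed definition (for Re(s) sufficiently large):
\zeta(s;\Delta,\Delta_0) = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1}
  \Bigl(\mathrm{Tr}\bigl(e^{-t\Delta}-e^{-t\Delta_0}\bigr) - 1\Bigr)\,dt .
% Elementary Mellin transform, valid for a > 0 and Re(s) > 0:
\frac{1}{\Gamma(s)}\int_0^\infty t^{s-1} e^{-ta}\,dt = a^{-s}.
% Applied to (4.28): each e^{-t\lambda_j} with \lambda_j > 0 contributes \lambda_j^{-s},
% the term \tfrac14\,\mathrm{Tr}\,C(1/2)\,e^{-t/4} contributes
\tfrac14\,(1/4)^{-s}\,\mathrm{Tr}\,C(1/2) = 4^{s-1}\,\mathrm{Tr}\,C(1/2),
% and e^{-t\lambda} under the integral sign contributes \lambda^{-s}.
```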


Using (4.28), it follows that for Re(s) > 1, the relative zeta function is given by

$$\zeta(s; \Delta, \Delta_0) = \sum_{\lambda_j > 0} \lambda_j^{-s} + 4^{s-1}\,\mathrm{Tr}(C(1/2)) - \frac{1}{2\pi i}\int_{1/4}^{\infty} \lambda^{-s}\,\frac{d}{d\lambda}\log\det S(\lambda)\,d\lambda.$$

By (4.29), ζ(s; Δ, Δ_0) has a meromorphic continuation to C which is holomorphic at s = 0. In particular, we can define the relative determinant det(Δ, Δ_0) by (3.1). Since Δ_0 is uniquely determined by Δ, we shall denote the determinant by det Δ.

Remark. A different regularization of a relative determinant for surfaces with hyperbolic ends was introduced by Lundelius [Lu]. We recall his definition. Let X_i, i = 1, 2, be two surfaces with hyperbolic ends and assume that the number of ends is the same for both surfaces. Let Δ_i, i = 1, 2, be the Laplacian of X_i. By assumption there exist open relatively compact subsets K_i ⊂ X_i such that X_1 − K_1 = X_2 − K_2 and Δ_1|(X_1 − K_1) = Δ_2|(X_2 − K_2). Let H = L²(X_1) ⊕ L²(K_2) ≅ L²(X_2) ⊕ L²(K_1). Then we may regard Δ_i as an unbounded operator in H which is zero in L²(K_j), i ≠ j, i, j = 1, 2. Moreover, $e^{-t\Delta_1} - e^{-t\Delta_2}$ is a trace class operator for t > 0 and (1.2), (1.3) hold for (Δ_1, Δ_2). Thus, one can define det(Δ_1, Δ_2). This is the relative determinant used by Lundelius. It is closely related to our definition. In fact, it follows from the definitions that

$$\det(\Delta_1, \Delta_2) = \frac{\det \Delta_1}{\det \Delta_2}.$$

Next consider the differentiability of the determinant. It follows from Proposition 3.5 that det Δ is a differentiable function on the space of all metrics with hyperbolic ends. For example, let g_0 be any metric with hyperbolic ends and let f ∈ C_0^∞(X). Put

$$g_u = e^{uf} g_0, \quad u \in \mathbb{R}.$$

Then g_u is also a metric with hyperbolic ends. Let Δ_{g_u} be the Laplacian with respect to g_u. Then $\Delta_{g_u} = e^{-uf}\Delta_{g_0}$, and it follows as in [OPS] that

$$-\frac{d}{du}\log\det\Delta_{g_u}\Big|_{u=0} = \int_X f(x)\left(\frac{K_0(x)}{12\pi} - \frac{1}{A_0}\right) d\mu_0(x),$$

where K_0(x), A_0 and dµ_0 are the Gaussian curvature, the area and the volume element of (X, g_0), respectively. Thus one can study critical points of det Δ for finite area surfaces with hyperbolic ends in the same way as in [OPS]. In the present case, det Δ can be expressed in a different way using the eigenvalues of Δ and the resonances. Here resonances are defined in terms of an analytic continuation of the resolvent of Δ to a meromorphic function with values in the linear operators from L²_comp(X) to L²_loc(X) (see [Mu4]). A resonance is then, by definition, a pole of the analytic continuation of the resolvent which does not correspond to an eigenvalue in the sense that the pole is either not an eigenvalue or its multiplicity (defined in a proper sense) is bigger than the dimension of the eigenspace. Let R(Δ) be the set of all poles of the analytic continuation of the resolvent. For each η ∈ R(Δ) one can define an algebraic


multiplicity m(η). In fact, there is a non-self-adjoint operator B (the generator of the Lax–Phillips semigroup) such that R(Δ) equals the set of generalized eigenvalues of B and m(η) is then the algebraic multiplicity of this eigenvalue [Mu2]. We note that resonances can also be defined as poles of the scattering matrix, which we regard as a function on the double covering of C defined by λ = s(1 − s). Now the resonance set R(Δ) can be used as a discrete set of spectral parameters in the same way as the eigenvalues are used in the case of a compact Riemannian manifold. For example, there is an analogue of Weyl's formula [Mu2, Pa], and also a trace formula [Mu2]. In particular, one can introduce the resonance zeta function which is defined by

$$\zeta_{\mathrm{Res}}(s) = \sum_{\eta \in R(\Delta),\ \eta \neq 1} \frac{m(\eta)}{(1 - \eta)^s}. \tag{4.30}$$

This series is absolutely convergent for Re(s) > 2 and admits a meromorphic continuation to C which is holomorphic at s = 0. Using this zeta function, we define a second regularized determinant by

$$\det\nolimits_{\mathrm{Res}} \Delta = \exp\left(-\frac{d}{ds}\zeta_{\mathrm{Res}}(s)\Big|_{s=0}\right).$$

Then formally, one has

$$\det\nolimits_{\mathrm{Res}} \Delta = \prod_{\eta \in R(\Delta),\ \eta \neq 1} |1 - \eta|^{2m(\eta)}.$$

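Furthermore, the two determinants are closely related. Namely, we have

$$\det \Delta = \exp\left(\frac{\mathrm{Area}(X)}{8\pi} - \frac{3\pi\gamma}{2}\,m\right) \det\nolimits_{\mathrm{Res}} \Delta,$$

where γ is Euler's constant [Mu2]. More generally, for Re(z) > 1, we can define

$$\zeta_{\mathrm{Res}}(s; z) = \sum_{\eta \in R(\Delta),\ \eta \neq 1} \frac{m(\eta)}{(z - \eta)^s}, \quad \mathrm{Re}(s) > 2. \tag{4.31}$$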

Then, as a function of s, ζ_Res(s; z) has a meromorphic continuation to C and s = 0 is not a pole. Put

$$\det\nolimits_{\mathrm{Res}}(\Delta + z - 1) = \exp\left(-\frac{\partial}{\partial s}\zeta_{\mathrm{Res}}(s; z)\Big|_{s=0}\right).$$

For a hyperbolic surface Γ\H of finite area, det_Res(Δ + z − 1) can be expressed in terms of the Selberg zeta function Z_Γ(s) and the determinant of the scattering matrix S(λ). First recall that

$$Z_\Gamma(s) = \prod_{\gamma}\prod_{k=0}^{\infty}\left(1 - e^{-(s+k)\ell(\gamma)}\right), \quad \mathrm{Re}(s) > 1,$$


where γ runs through the primitive closed geodesics and ℓ(γ) is the length of γ. Furthermore, we write the spectral parameter λ as usual as λ = z(1 − z), z ∈ C, and regard the scattering matrix as a function of z. Then

$$\det\nolimits_{\mathrm{Res}}^{2}(\Delta + z - 1) = \det S(z)\, Z_\Gamma(z)^2\, Z_\infty(z)^2\, \Gamma(z + 1/2)^{-2m}\, e^{c(2z-1)+d}, \tag{4.32}$$

where c and d are certain constants that depend on Area(Γ\H) and m, and

$$Z_\infty(s) = \left(\frac{(2\pi)^s\,\Gamma_2(s)^2}{\Gamma(s)}\right)^{\mathrm{Area}(\Gamma\backslash H)/2\pi}$$

with Γ_2(s) being the double Gamma function (see [Mu2]). It follows from (4.32) that det_Res(Δ + z − 1) extends to an analytic function on C with the set of zeros equal to R(Δ). We claim that this holds in general. If Γ = SL(2, Z), then the scattering matrix is a function S(z) which is given in terms of the Riemann zeta function ζ(s) by

$$S(z) = \sqrt{\pi}\,\frac{\Gamma(z - 1/2)\,\zeta(2z - 1)}{\Gamma(z)\,\zeta(2z)}$$

[He]. Thus in this case, the resonances are precisely the numbers ρ/2, where ρ runs over the nontrivial zeros of the Riemann zeta function. Furthermore, det_Res(Δ + z − 1) can be factorized into the product of two determinants, where the first one is defined in terms of the eigenvalues similar to (0.3) and the second one is defined in terms of the nontrivial zeros ρ of ζ(s). We describe the second determinant. For Re(z) > 1 consider the Dirichlet series

$$\xi(s, z) = (2\pi)^s \sum_{\rho} (z - \rho)^{-s}, \tag{4.33}$$

where arg(z − ρ) ∈ (−π/2, π/2). It was proved in [De] that (4.33) converges absolutely for Re(s) > 1 and that, for fixed z, it has an analytic continuation to a holomorphic function of s ∈ C\{1}. Moreover, one has

$$2^{-1/2}(2\pi)^{-2}\,\pi^{-z/2}\,\Gamma(z/2)\,\zeta(z)\,z(z-1) = \exp\left(-\frac{\partial}{\partial s}\xi(s, z)\Big|_{s=0}\right). \tag{4.34}$$

The right hand side is then the determinant associated with the resonances. Note that (4.33) resembles (4.31). This analogy becomes even closer if we recall Colin de Verdière's result that for a generic metric, the number of eigenvalues is finite [CV2].

Remark. In conclusion, one can say that for a surface with hyperbolic ends, the resonances can be used as a substitute for the eigenvalues of the Laplacian of a compact surface. It would be very interesting to see if in other cases the resonances play a similar role. Everything that we described here for surfaces can be extended to the case of manifolds with ends of hyperbolic type as studied in [Mu1]. Furthermore, the Laplacian on forms can be treated in the same way. In particular, we can introduce the L² analytic torsion for hyperbolic manifolds of finite volume. Another interesting problem is the investigation of the finer structure of the distribution of resonances in the case of surfaces with hyperbolic ends. This should be seen in connection with quantum chaos [Sa].
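For Γ = SL(2, Z), the functional equation C(s)C(1 − s) = Id and the location of the resonances can be checked directly from the explicit scattering function. The following is a sketch; the completed zeta function Λ(s) is notation introduced here for the check and is not taken from the text.

```latex
% Completed Riemann zeta function and its functional equation:
\Lambda(s) := \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s), \qquad \Lambda(s) = \Lambda(1-s).
% The scattering function for SL(2,Z) is a quotient of completed zeta functions:
S(z) = \sqrt{\pi}\,\frac{\Gamma(z-1/2)\,\zeta(2z-1)}{\Gamma(z)\,\zeta(2z)}
     = \frac{\Lambda(2z-1)}{\Lambda(2z)} .
% Functional equation, using \Lambda(1-2z) = \Lambda(2z) and \Lambda(2-2z) = \Lambda(2z-1):
S(z)\,S(1-z) = \frac{\Lambda(2z-1)}{\Lambda(2z)}\cdot\frac{\Lambda(1-2z)}{\Lambda(2-2z)} = 1 .
% The zeros of \Lambda are exactly the nontrivial zeros \rho of \zeta, so the zeros of the
% denominator \Lambda(2z) sit at z = \rho/2.
```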


References

[APS1] Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry, I. Math. Proc. Camb. Phil. Soc. 77, 43–69 (1975)
[APS3] Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry, III. Math. Proc. Camb. Phil. Soc. 79, 71–99 (1976)
[BC] Bismut, J.-M. and Cheeger, J.: Families index for manifolds with boundary, super connections, and cones, I. J. Funct. Anal. 90, 306–354 (1990)
[BF1] Bismut, J.-M. and Freed, D.S.: The analysis of elliptic families, I. Metrics and connections on determinant bundles. Commun. Math. Phys. 106, 159–176 (1986)
[BK] Birman, M.Sh. and Krein, M.G.: On the theory of wave operators and scattering operators. Dokl. Akad. Nauk SSSR 144, 475–478 (1962); English transl. in Soviet Math. Dokl. 3 (1962)
[Br] Bruneau, V.: Propriétés asymptotiques du spectre continu d'opérateurs de Dirac. Thèse de Doctorat, Université de Nantes, 1995
[Br1] Bruneau, V.: Sur le spectre continu de l'opérateur de Dirac: formule de Weyl, limite non-relativiste. C.R. Acad. Sci. Paris 322, 43–48 (1996)
[BY] Birman, M.Sh. and Yafaev, D.R.: The spectral shift function. The work of M.G. Krein and its further development. St. Petersburg Math. J. 4, 833–870 (1993)
[Bu] Bunke, U.: Relative index theory. J. Funct. Anal. 105, 63–76 (1992)
[Ch] Christiansen, T.: Scattering theory for manifolds with asymptotically cylindrical ends. J. Funct. Anal. 131, 499–530 (1995)
[CZ] Christiansen, T. and Zworski, M.: Spectral asymptotics for manifolds with cylindrical ends. Ann. Inst. Fourier, Grenoble 45, 251–263 (1995)
[CKS] Cotta-Ramusino, P., Krüger, W. and Schrader, R.: Quantum scattering by external metrics and Yang-Mills potentials. Ann. Inst. Henri Poincaré 31, 43–71 (1979)
[CV] Colin de Verdière, Y.: Une formule de traces pour l'opérateur de Schrödinger dans R^3. Ann. scient. Éc. Norm. Sup. 4e série, 14, 27–39 (1981)
[CV2] Colin de Verdière, Y.: Pseudo-Laplaciens II. Ann. Inst. Fourier, Grenoble 33, 87–113 (1983)
[De] Deninger, C.: Local L-factors of motives and regularized determinants. Invent. math. 107, 135–151 (1992)
[Do] Donnelly, H.: Eigenvalue estimates for certain noncompact manifolds. Michigan Math. J. 31, 349–357 (1984)
[Gi] Gilkey, P.B.: Invariance theory, the heat equation, and the Atiyah-Singer index theorem. Second edition, Boca Raton, Ann Arbor: CRC Press, 1995
[GMS] Gamboa Saravi, R.E., Muschietti, M.A., Schaposnik, F.A. and Solomin, J.E.: ζ-function method and the evaluation of fermion currents. J. Math. Phys. 26, 2045–2049 (1985)
[Gu1] Guillopé, L.: Une formule de trace pour l'opérateur de Schrödinger dans R^n. Thèse de 3ème cycle, Grenoble, 1981
[Gu2] Guillopé, L.: Asymptotique de la phase de diffusion pour l'opérateur de Schrödinger avec potentiel. C.R. Acad. Sci. Paris 293, 601–603 (1981)
[H] Hawking, S.W.: Zeta function regularization of path integrals in curved space time. Commun. Math. Phys. 55, 133–148 (1977)
[He] Hejhal, D.A.: The Selberg trace formula and the Riemann zeta function. Duke Math. J. 43, 441–482 (1976)
[J1] Jensen, A.: Spectral properties of Schrödinger operators and time-decay of the wave functions, results in L²(R^m), m ≥ 5. Duke Math. J. 47, 57–80 (1980)
[JK] Jensen, A. and Kato, T.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Duke Math. J. 46, 583–611 (1979)
[JK2] Jensen, A. and Kato, T.: Asymptotic behaviour of the scattering phase for exterior domains. Commun. Partial Diff. Equations 3, 1165–1195 (1978)
[JL] Jorgenson, J. and Lundelius, R.: Continuity of relative hyperbolic spectral theory through metric degeneration. Duke Math. J. 84, 47–81 (1996)
[K1] Kato, T.: Perturbation theory for linear operators. Berlin: Springer-Verlag, 1966
[K2] Kato, T.: Wave operators and unitary equivalence. Pacific J. Math. 15, 171–180 (1965)
[Kr] Krein, M.G.: On the trace formula in perturbation theory. Mat. Sbornik 33(75), 597–626 (1953) (Russian)
[Lu] Lundelius, R.: Asymptotics of the determinant of the Laplacian on hyperbolic surfaces of finite volume. Duke Math. J. 71, 211–242 (1993)
[Me1] Melrose, R.B.: The Atiyah-Patodi-Singer index theorem. Boston: A.K. Peters, 1993
[Me2] Melrose, R.B.: Geometric scattering theory. Cambridge: Cambridge University Press, 1995
[Mu1] Müller, W.: Spectral theory for Riemannian manifolds with cusps and a related trace formula. Math. Nachrichten 111, 197–288 (1983)
[Mu2] Müller, W.: Spectral theory and scattering theory for certain complete surfaces of finite volume. Invent. math. 109, 265–305 (1992)
[Mu3] Müller, W.: Eta invariants and manifolds with boundary. J. Diff. Geom. 40, 311–377 (1994)
[Mu4] Müller, W.: On the analytic continuation of rank one Eisenstein series. Geom. Funct. Anal. 6, 572–586 (1996)
[Mu5] Müller, W.: Manifolds with cusps of rank one. Lecture Notes in Math. 1244, Berlin: Springer-Verlag, 1987
[Mu6] Müller, W.: Relative determinants of elliptic operators and scattering theory. Journées “Équations aux dérivées partielles”, Saint-Jean-de-Monts 1996, École Polyt.
[OPS] Osgood, B., Phillips, R. and Sarnak, P.: Extremals of determinants of Laplacians. J. Funct. Anal. 80, 148–211 (1988)
[Pa] Parnovski, L.B.: Spectral asymptotics of the Laplace operator on surfaces with cusps. Math. Annalen 303, 281–296 (1995)
[RS1] Ray, D.B. and Singer, I.M.: R-torsion and the Laplacian on Riemannian manifolds. Adv. Math. 7, 145–210 (1971)
[RS] Reed, M. and Simon, B.: Methods of mathematical physics. IV, London: Academic Press, 1978
[R] Robert, D.: Asymptotique à grande énergie de la phase de diffusion pour un potentiel. Asymptotic Anal. 3, 301–320 (1991)
[Ro] Robert, D.: Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du Laplacien. Ann. scient. Éc. Norm. Sup. 4e série, 25, 107–134 (1992)
[Sa] Sarnak, P.: Arithmetic quantum chaos. Israel Math. Conf. Proc. 8, 183–236 (1995)
[Se] Seeley, R.T.: Complex powers of an elliptic operator. Proc. Symp. Pure Math. 10, 288–307 (1967)
[Si] Singer, I.M.: Families of Dirac operators with applications to physics. In: Élie Cartan et les Mathématiques d'aujourd'hui, Astérisque 1985, Numéro Hors Série.
[SZ] Sjöstrand, J. and Zworski, M.: Complex scaling and the distribution of scattering poles. J. Am. Math. Soc. 4, 729–769 (1991)
[Th] Thaller, B.: The Dirac equation, Texts and Monographs in Physics, Berlin–Heidelberg–New York: Springer-Verlag, 1992
[Y] Yafaev, D.R.: Mathematical scattering theory. Transl. Math. Monographs, Vol. 105, Princeton, NJ: AMS, 1992

Communicated by P. Sarnak

Commun. Math. Phys. 192, 349 – 403 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

New Braided Endomorphisms from Conformal Inclusions

Feng Xu

Department of Mathematics, University of California, Los Angeles, CA 90095-1555, USA

Received: 12 September 1996 / Accepted: 3 July 1997

Abstract: Various puzzles about subfactors and integrable lattice models associated with conformal inclusions are resolved in the framework of constructive quantum field theory in two dimensions. In particular, a new class of braided endomorphisms is obtained for a general class of conformal inclusions and their properties are analyzed. The existence of subfactors with principal graphs E_6 or E_8 follows from a rather simple argument in our construction. The fusion graphs of many new examples are given.

0. Introduction

Let G ⊂ H be a conformal inclusion and π^0 be the vacuum representation of the loop group LH (see Sect. 1.6). Denote by e the identity element in H and by I an interval on S¹. Let L_I G = {f ∈ LG | f = e on I^c} and L_I H = {f ∈ LH | f = e on I^c}. Then π^0(L_I G)'' ⊂ π^0(L_I H)'' is called the subfactor from conformal inclusions. Both π^0(L_I G)'' and π^0(L_I H)'' are hyperfinite III_1 factors and when G = SU(N), π^0(L_I G)'' ⊂ π^0(L_I H)'' is of finite depth (see Sect. 2). In [X3] and [X4] attempts are made to construct subfactors from conformal inclusions from some known integrable lattice models in [PZ]. The advantage of such an approach is that one can determine the principal graphs of the subfactors completely from an explicit recursive formula. However, one has to check some local conditions. The calculations which are needed are tedious but straightforward (unlike global conditions such as the flatness condition [OCN1], which is impossible to check directly) when the rank of G is small. But when the rank of G is big, the corresponding integrable lattice models become harder to construct. In fact, the main motivation of this paper is to explain and generalize some of the observations in [X3] and [PZ]. On the other hand, in [Ka1] it is suggested that one can construct certain “quotient” subfactors by subfactors from conformal inclusions. In particular, the role of intertwining YBE is emphasized in [Ka1]. However, it is clear that one can only hope to be able to check intertwining YBE in very limited cases.


In this paper we will show the existence of “quotient” subfactors for all maximal conformal inclusions G ⊂ H when G = SU(N) and H is a simple group (for other simple G, see the comments in Sect. 5). In fact, we will show the existence of a new class of braided endomorphisms for all maximal conformal inclusions SU(N) ⊂ H with H being simple. There are two main ingredients which make such a general statement possible. The first one is the work of A. Wassermann in [W1] which clarifies the fusion ring structure arising from the positive energy representations of LSU(N) (cf. Sect. 1.6). The second one is the work of R. Longo and K.-H. Rehren in [LR] which applies to the study of conformal inclusions.

Let us describe in more detail the content of the paper. Section 1 is a preliminary section on the general theory of sectors, correspondences and constructive conformal field theories. Sections 1.2 and 1.3 are contained in [GL1] and we have included them to set up the notations and concepts. In Sect. 1.4, we prove the Yang–Baxter Equation (YBE) and the Braiding–Fusion Equation (BFE). We also give a proof of the monodromy equations. These results are scattered in the literature and their proof is not new. We have included them for future reference. Our treatment follows that of [X6]. In Sect. 1.5 we use diagrams to represent the YBE and BFE proved in Sect. 1.4. These diagrammatic representations make many algebraic relations more transparent. Part of the results in [W1] and [FG] are sketched in Sect. 1.6.

In Sect. 2 we study the subfactors from conformal inclusions, following the ideas of [W2] and [LR]. Section 2.1 is contained in [LR]. We have included this section to introduce concepts and notations compatible with the previous sections. In Sect. 2.2, the general theory of [LR] is applied to the subfactors associated with the conformal inclusions. Proposition 2.5 shows that such subfactors naturally give rise to a standard net. Proposition 2.6 enables us to study certain endomorphisms associated with conformal inclusions by using the theory sketched in Sect. 1.6. The local property established in Proposition 2.7 and Proposition 2.8 plays a crucial role in proving Theorem 3.3 and Theorem 3.9. The idea of the proof of Proposition 2.7 and Proposition 2.8 is contained in [LR].

Section 3 is the main part of this paper. We use the results of previous sections to prove six main theorems. Theorem 3.1 and Corollary 3.2 show the existence of “quotient” endomorphisms. Theorem 3.3 implies that these endomorphisms are braided and gives more information on the higher relative commutants of these braided endomorphisms. Theorem 3.6 shows some commutation relations which imply, for example, that for the subfactors associated to these endomorphisms, the principal graphs and the dual principal graphs are isomorphic as abstract graphs (Corollary 3.7). Connections with [X3] and [PZ] are made in Theorem 3.8, which implies that certain graphs “support” (in the sense of [PZ]) representations of Hecke algebras. Two explicit examples are given at the end of Sect. 3. The existence of subfactors with principal graphs E_6, E_8 follows from Corollary 3.7 and a rather simple counting argument. In Sect. 3.2 we prove some further properties of our braided endomorphisms and establish their relations to solvable lattice models in [PZ]. Among them, Theorem 3.9 determines the spectrum of the fusion graphs completely, and Theorem 3.10 proves most of the assumptions of [PZ] in generality. In fact, the work of [PZ] has been an important source of inspiration for our work. The main part of the proof in Sect. 3 consists of computations of algebraic relations established in Sects. 1 and 2. We have used diagrammatic representations which may help to “visualize” the algebraic identities.


By using the results of Sect. 3, especially Theorem 3.3 and Lemma 3.5, the fusion graphs of the braided endomorphisms associated to G = SU(3) are obtained in Sect. 4. One particularly interesting example associated to the conformal inclusion $\widehat{SU(4)}_4 \subset \widehat{SU(15)}$ is also given. It shows how commutativity among the subsectors of our braided endomorphisms may fail. It should be mentioned that all the graphs first appear in [PZ] with a different meaning and are constructed there under a certain hypothesis. Our results provide a rigorous foundation for the insights of [PZ]. In Sect. 5, we present our conclusions and some questions which arise naturally from our approach.

1. Preliminaries

1.1. Sectors and correspondences. Let M, N be von Neumann algebras, which we always assume to have separable preduals, and H an M − N correspondence, namely H is a (separable) Hilbert space on which M acts on the left, N acts on the right and the actions are normal. We denote by xξy, x ∈ M, y ∈ N, ξ ∈ H the relative actions. The trivial M − M correspondence is the Hilbert space L²(M) with the standard actions given by the modular theory

$$x\xi y = xJy^*J\xi, \quad x, y \in M, \quad \xi \in L^2(M),$$

where J is the modular conjugation of M; the unitary correspondence is well defined modulo unitary equivalence. If ρ is a normal homomorphism of M into M we let H_ρ be the Hilbert space L²(M) with actions: x · ξ · y ≡ ρ(x)ξ · y, x ∈ M, y ∈ M, ξ ∈ L²(M). Denote by End(M) the semigroup of the endomorphisms of M and by Corr(M) the set of all M − M correspondences. The following proposition is proved in [L4] (Corollary 2.2 in [L4]).

Proposition 1.1.1. Let M be an infinite factor. There exists a bijection between the unitary equivalence classes of End(M) and the unitary equivalence classes of Corr(M), i.e., given ρ, ρ′ ∈ End(M), H_ρ is unitarily equivalent to H_ρ′ iff there exists a unitary u ∈ M with ρ′(x) = uρ(x)u*.

Let Sect(M) denote the quotient of End(M) modulo unitary equivalence in M as in Proposition 1.1.1. We call sectors the elements of the semigroup Sect(M); if ρ ∈ End(M) we denote by [ρ] its class in Sect(M). By Proposition 1.1.1, Sect(M) may be naturally identified with Corr(M)^∼, the quotient of Corr(M) modulo unitary equivalence. It follows from [L3] and [L4] that Sect(M), with M a properly infinite (on Hilbert space H) von Neumann algebra, is endowed with a natural involution θ → θ̄ that commutes with all natural operations of direct sum, tensor product and others (the tensor product of correspondences corresponds to the composition of sectors). Suppose ρ ∈ End(M) is given together with a normal faithful conditional expectation ε : M → ρ(M). We define a number d_ε (possibly ∞) such that:

$$d_\epsilon^{-2} := \mathrm{Max}\{\lambda \in [0, +\infty) \mid \epsilon(m_+) \geq \lambda m_+,\ \forall m_+ \in M_+\}.$$

Now assume ρ ∈ End(M) is given together with a normal faithful conditional expectation ε : M → ρ(M), and assume d_ε < +∞. We define

$$d = \mathrm{Min}_\epsilon\{d_\epsilon\}.$$


d is called the statistical dimension of ρ. It is clear from the definition that the statistical dimension of ρ depends only on the unitary equivalence class of ρ. The properties of the statistical dimension can be found in [L1, L3 and L4]. Denote by Sect_0(M) those elements of Sect(M) with finite statistical dimensions. For λ, µ ∈ Sect_0(M), let Hom(λ, µ) denote the space of intertwiners from λ to µ, i.e. a ∈ Hom(λ, µ) iff aλ(x) = µ(x)a for any x ∈ M. Hom(λ, µ) is a finite dimensional vector space and we use ⟨λ, µ⟩ to denote the dimension of this space. ⟨λ, µ⟩ depends only on [λ] and [µ]. Moreover we have

$$\langle \nu\lambda, \mu\rangle = \langle \lambda, \bar{\nu}\mu\rangle, \qquad \langle \nu\lambda, \mu\rangle = \langle \nu, \mu\bar{\lambda}\rangle,$$

which follows from Frobenius duality (see [L2] or [Y]). We will also use the following notation: if µ is a subsector of λ, we will write µ ≺ λ or λ ≻ µ.

1.2. General properties of conformal precosheaves on S¹. In this section we recall the basic properties enjoyed by the family of the von Neumann algebras associated with a conformal Quantum Field Theory on S¹. All the propositions in this section and Sect. 1.3 are proved in [GL1]. Our goal in this section is to give the definition of a conformal precosheaf. Let us first make some preparations. By an interval, in this section only, we shall always mean an open connected subset I of S¹ such that I and the interior I′ of its complement are non-empty. We shall denote by $\mathcal{I}$ the set of intervals in S¹. We shall denote by PSL(2, R) the group of conformal transformations on the complex plane that preserve the orientation and leave the unit circle S¹ globally invariant. Denote by G the universal covering group of PSL(2, R). Notice that G is a simple Lie group and has a natural action on the unit circle S¹. Denote by R(ϑ) the (lifting to G of the) rotation by an angle ϑ. We may associate two one-parameter groups with any interval I in the following way. Let I_1 be the upper semi-circle, i.e. the interval {e^{iϑ}, ϑ ∈ (0, π)}. By using the Cayley transform C : S¹ → R ∪ {∞} given by z → −i(z − 1)(z + 1)^{−1}, we may identify I_1 with the positive real line R_+. Then we consider the one-parameter groups Λ_{I_1}(s) and T_{I_1}(t) of diffeomorphisms of S¹ (cf. Appendix B of [GL1]) such that

$$C\Lambda_{I_1}(s)C^{-1}x = e^{s}x, \qquad CT_{I_1}(t)C^{-1}x = x + t, \qquad t, s, x \in \mathbb{R}.$$
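The identification of the upper semicircle with the positive real line can be checked numerically. The sketch below is an illustration only and is not taken from [GL1]: the function names `cayley`, `cayley_inv`, `dilation` and `translation` are hypothetical, and the inverse transform is obtained here by solving x = C(z) for z.

```python
import cmath
import math


def cayley(z: complex) -> complex:
    """Cayley transform C(z) = -i (z - 1) / (z + 1); z = -1 is sent to infinity."""
    return -1j * (z - 1) / (z + 1)


def cayley_inv(x: complex) -> complex:
    """Inverse Cayley transform, obtained by solving x = C(z) for z."""
    return (1j - x) / (1j + x)


def dilation(s: float, z: complex) -> complex:
    """Dilation x -> e^s x on the line, conjugated back to the circle by C."""
    return cayley_inv(math.exp(s) * cayley(z))


def translation(t: float, z: complex) -> complex:
    """Translation x -> x + t on the line, conjugated back to the circle by C."""
    return cayley_inv(cayley(z) + t)


if __name__ == "__main__":
    z = cmath.exp(1j * 1.0)           # a point of the upper semicircle
    print(cayley(z), math.tan(0.5))   # C(e^{i theta}) agrees with tan(theta / 2)
    print(dilation(0.7, z))           # remains on the upper semicircle
    print(translation(2.0, z))        # for t >= 0 the upper semicircle maps into itself
```

On the circle one has C(e^{iϑ}) = tan(ϑ/2), which is positive exactly for ϑ ∈ (0, π); this is why dilations and non-negative translations of the half-line correspond to maps of the upper semicircle into itself. The point z = −1 is excluded in the sketch because it corresponds to ∞.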

We also associate with I_1 the reflection r_{I_1} given by

$$r_{I_1}z = \bar{z},$$

where z̄ is the complex conjugate of z. It follows from the definition that Λ_{I_1} restricts to an orientation preserving diffeomorphism of I_1, r_{I_1} restricts to an orientation reversing diffeomorphism of I_1 onto I_1′ and T_{I_1}(t) is an orientation preserving diffeomorphism of I_1 into itself if t ≥ 0. Then, if I is an interval and we choose g ∈ G such that I = gI_1, we may set

$$\Lambda_I = g\Lambda_{I_1}g^{-1}, \qquad r_I = g\,r_{I_1}g^{-1}, \qquad T_I = g\,T_{I_1}g^{-1}.$$

The elements Λ_I(s), s ∈ R, and r_I are well defined, while the one-parameter group T_I is defined up to a scaling of the parameter. However, such a scaling plays no role in this paper. We note also that T_{I′}(t) is an orientation preserving diffeomorphism of I into itself if t ≤ 0. Let r be an orientation reversing isometry of S¹ with r² = 1 (e.g. r_{I_1}). The action of r on PSL(2, R) by conjugation lifts to an action σ_r on G, therefore we may consider the


semidirect product G ×_{σ_r} Z_2. Since G ×_{σ_r} Z_2 is a covering of the group generated by PSL(2, R) and r, G ×_{σ_r} Z_2 acts on S¹. We call (anti-)unitary a representation U of G ×_{σ_r} Z_2 by operators on H such that U(g) is unitary, resp. antiunitary, when g is orientation preserving, resp. orientation reversing. Now we are ready to define a conformal precosheaf. A conformal precosheaf A of von Neumann algebras on the intervals of S¹ is a map I → A(I) from $\mathcal{I}$ to the von Neumann algebras on a Hilbert space H that verifies the following properties:

A. Isotony. If I_1, I_2 are intervals and I_1 ⊂ I_2, then A(I_1) ⊂ A(I_2).

B. Conformal invariance. There is a unitary representation U of G (the universal covering group of PSL(2, R)) on H such that

$$U(g)A(I)U(g)^* = A(gI), \qquad g \in G, \quad I \in \mathcal{I}.$$

C. Positivity of the energy. The generator of the rotation subgroup U(R(·)) is positive.

D. Locality. If I_0, I are disjoint intervals then A(I_0) and A(I) commute. The lattice symbol ∨ will denote “the von Neumann algebra generated by”.

E. Existence of the vacuum. There exists a unit vector Ω (vacuum vector) which is U(G)-invariant and cyclic for $\vee_{I \in \mathcal{I}} A(I)$.

We have the following (cf. Proposition 1.1 of [GL1]):

Proposition 1.2.1. Let A be a conformal precosheaf. The following hold:
(a) Reeh-Schlieder theorem: Ω is cyclic and separating for each von Neumann algebra A(I), I ∈ $\mathcal{I}$.
(b) Bisognano-Wichmann property: U extends to an (anti-)unitary representation of G ×_{σ_r} Z_2 such that, for any I ∈ $\mathcal{I}$, $U(\Lambda_I(2\pi t)) = \Delta_I^{it}$, U(r_I) = J_I, where Δ_I, J_I are the modular operator and the modular conjugation associated with (A(I), Ω) [29]. For each g ∈ G ×_{σ_r} Z_2, U(g)A(I)U(g)* = A(gI).
(c) Additivity: if a family of intervals I_i covers the interval I, then A(I) ⊂ ∨_i A(I_i).
(d) Spin and statistics for the vacuum sector [16]: U is indeed a representation of PSL(2, R), i.e. U(2π) = 1.
(e) Haag duality: A(I)′ = A(I′).

A conformal precosheaf is called irreducible if it also satisfies the following:

F. Uniqueness of the vacuum (or irreducibility). The only U(G)-invariant vectors are the scalar multiples of Ω.

The term irreducibility is due to the following (see Proposition 1.2 of [GL1]):


Proposition 1.2.2. The following are equivalent:
(i) CΩ are the only U(G)-invariant vectors.
(ii) The algebras A(I), I ∈ $\mathcal{I}$, are factors. In this case they are type III_1 factors.
(iii) If a family of intervals I_i intersects at only one point ζ, then ∩_i A(I_i) = C.
(iv) The von Neumann algebra ∨A(I) generated by the local algebras coincides with B(H) (A is irreducible).

Let I be an interval and denote by Ī its closure on the circle. The conformal precosheaf A constructed in Sect. 1.6 has the following additional property: the map I → A(I) extends to all the connected subsets of S¹ such that the interior of I and the interior I′ of its complement I^c are non-empty, and satisfies the Isotony property; moreover, A(I) = A(Ī). We shall only consider a conformal precosheaf A with the above property in this paper.

1.3. Superselection structure. In this section A is an irreducible conformal precosheaf of von Neumann algebras as defined in Section 1.2. Our goal is to define the covariant representation of A and the associated concepts which will be important later on. Again all the definitions and propositions are given in [GL1] and we refer the reader to [GL1] for further details and unexplained notations. A covariant representation π of A is a family of representations π_I of the von Neumann algebras A(I), I ∈ $\mathcal{I}$, on a Hilbert space H_π and a unitary representation U_π of the covering group G of PSL(2, R), with positive energy, i.e. the generator of the rotation unitary subgroup is positive, such that the following properties hold:

$$I \subset \tilde{I} \Rightarrow \pi_{\tilde{I}}|_{A(I)} = \pi_I \quad \text{(isotony)},$$
$$\mathrm{ad}\,U_\pi(g)\cdot\pi_I = \pi_{gI}\cdot\mathrm{ad}\,U(g) \quad \text{(covariance)}.$$

A unitary equivalence class of representations of A is called a superselection sector. Assuming H_π to be separable, the representations π_I are normal because the A(I)'s are factors. Therefore for any given I_0, π_{I_0′} is unitarily equivalent to id_{A(I_0′)} because A(I_0′) is a type III factor. By identifying H_π and H, we can thus assume that π is localized in a given interval I_0 ∈ $\mathcal{I}$, i.e. π_{I_0′} = id_{A(I_0′)} (cf. [Fro]). By Haag duality we then have π_I(A(I)) ⊂ A(I) if I ⊃ I_0. In other words, given I_0 ∈ $\mathcal{I}$ we can choose in the same sector of π a localized endomorphism with localization support in I_0, namely a representation ρ equivalent to π such that

$$I \in \mathcal{I},\ I \supset I_0 \Rightarrow \rho_I \in \mathrm{End}\,A(I), \qquad \rho_{I_0'} = \mathrm{id}_{A(I_0')}.$$

To capture the global point of view we may consider the universal algebra C*(A). Recall that C*(A) is a C*-algebra canonically associated with the precosheaf A (see [Fre]). C*(A) has the following properties: there are injective embeddings ι_I : A(I) → C*(A) so that the local von Neumann algebras A(I), I ∈ $\mathcal{I}$, are identified with subalgebras of C*(A) and together generate a dense ∗-subalgebra of C*(A), and every representation of the precosheaf A factors through a representation of C*(A). Conversely, any representation of C*(A) restricts to a representation of A. The vacuum representation π_0

New Braided Endomorphisms from Conformal Inclusions

355

of C ∗ (A) corresponds to the identity representation of A on H, thus π0 acts identically on the local von Neumann algebras. We shall often drop the symbols ιI and π0 when no confusion arises. By the universality property, for each g ∈ P SL(2, R) the isomorphism adU (g) : A(I) → A(gI), I ∈ I lifts to an automorphism αg of C ∗ (A). We shall lift the map g → αg to a representation, still denoted by α, of the universal covering group G of P SL(2, R) by automorphisms of C ∗ (A). The covariance property for an endomorphism ρ of C ∗ (A) localized in I0 means that αg · ρ · αg−1 is adzρ (g)∗ · ρ = αg · ρ · αg−1 g ∈ G for a suitable unitary zρ (g) ∈ C ∗ (A). We define ρg = αg · ρ · αg−1

, g ∈ G.

$\rho_{g,J}$ is the restriction of $\rho_g$ to $\mathcal{A}(J)$. The map $g \to z_\rho(g)$ can be chosen to be a localized $\alpha$-cocycle, i.e.

$$z_\rho(g) \in \mathcal{A}(I_0 \cup gI_0) \quad \forall g \in G : I_0 \cup gI_0 \in \mathcal{I},$$
$$z_\rho(gh) = z_\rho(g)\,\alpha_g(z_\rho(h)), \quad g, h \in G.$$

To compare with the result of [FG], let us define $\Gamma_\rho(g) = \pi_0(z_\rho(g)^*)$. This notation will be used in Sect. 1.4.

An endomorphism $\rho$ of $C^*(\mathcal{A})$ localized in an interval $I_0$ is said to have finite index if $\rho_I$ ($= \rho|_{\mathcal{A}(I)}$) has finite index, $I_0 \subset I$ (see [L2, L3]). The index is independent of $I$ due to the following (see Proposition 2.1 of [GL1]):

Proposition 1.3.1. Let $\rho$ be an endomorphism localized in the interval $I_0$. Then the index $\mathrm{Ind}(\rho) := \mathrm{Ind}(\rho_I)$, the minimal index of $\rho_I$, does not depend on the interval $I \supset I_0$.

The following Proposition is Proposition 2.2 of [GL1]:

Proposition 1.3.2. Let $\rho$ be a covariant (not necessarily irreducible) endomorphism with finite index. Then the representation $U_\rho$ described before is unique. In particular, any irreducible component of $\rho$ is a covariant endomorphism.

By the above proposition the univalence of an endomorphism $\rho$ is well defined by

$$S_\rho = U_\rho(2\pi).$$

When $S_\rho$ is a complex number of modulus one, since $U_{\rho'}(g) := \pi_0(u)U_\rho(g)\pi_0(u)^*$, where $\rho'(\cdot) := u\rho(\cdot)u^*$, $u \in C^*(\mathcal{A})$, $S_\rho$ depends only on the superselection class of $\rho$. When $\rho$ is irreducible, $S_\rho$ is a complex number of modulus one, since by definition $S_\rho$ belongs to $\pi(C^*(\mathcal{A}))'$ and $\rho$ is irreducible, and we have

$$S_\rho = e^{2\pi i \Delta_\rho},$$

with $\Delta_\rho$ the lowest weight of $U_\rho$. $\Delta_\rho$ is also referred to as the conformal dimension. Examples of calculations of $\Delta_\rho$ can be found at the beginning of Sect. 3.2.

Let $\rho_1, \rho_2$ be endomorphisms of an algebra $B$. Recall from Sect. 1.1 that their intertwiner space is defined by

$$\mathrm{Hom}(\rho_1, \rho_2) = \{T \in B : \rho_2(x)T = T\rho_1(x),\ x \in B\}.$$


In the case $B = C^*(\mathcal{A})$, $\rho_i$ localized in the interval $I_i$ and $T \in (\rho_1, \rho_2)$, then $\pi_0(T)$ is an intertwiner between the representations $\pi_0 \cdot \rho_i$. If $I \supset I_1 \cup I_2$, then by Haag duality its embedding $\iota_I \cdot \pi_0(T)$ is still an intertwiner in $(\rho_1, \rho_2)$ and a local operator. We shall denote by $(\rho_1, \rho_2)_I$ the space of such local intertwiners:

$$(\rho_1, \rho_2)_I = (\rho_1, \rho_2) \cap \mathcal{A}(I).$$

If $I_1$ and $I_2$ are disjoint, we may cover $I_1 \cup I_2$ by an interval $I$ in two ways: we adopt the convention that, unless otherwise specified, a local intertwiner is an element of $(\rho_1, \rho_2)_I$, where $I_2$ follows $I_1$ inside $I$ in the clockwise sense.

We now define the statistics. Given the endomorphism $\rho$ of $\mathcal{A}$ localized in $I \in \mathcal{I}$, choose an equivalent endomorphism $\rho_0$ localized in an interval $I_0 \in \mathcal{I}$ with $\bar I_0 \cap \bar I = \emptyset$ and let $u$ be a local intertwiner in $(\rho, \rho_0)$ as above, namely $u \in (\rho, \rho_0)_{\tilde I}$ with $I_0$ following $I$ clockwise inside $\tilde I$. The statistics operator $\sigma := u^*\rho(u) = u^*\rho_{\tilde I}(u)$ belongs to $(\rho_{\tilde I}^2, \rho_{\tilde I}^2)$. Recall that if $\rho$ is an endomorphism of a $C^*$-algebra $B$, a left inverse of $\rho$ is a completely positive map $\Phi$ from $B$ to itself such that $\Phi \cdot \rho = \mathrm{id}$. It follows from Cor. 2.12 of [GL1] that if $\rho$ is irreducible there exists a unique left inverse $\Phi$ of $\rho$ and that the statistics parameter

$$\lambda_\rho := \Phi(\sigma)$$

depends only on the sector of $\rho$. The statistical dimension $d(\rho)$ and the statistics phase $\kappa_\rho$ are then defined by

$$d(\rho) = |\lambda_\rho|^{-1}, \qquad \kappa_\rho = \frac{\lambda_\rho}{|\lambda_\rho|}.$$

In [GL1], the following remarkable theorem is proved:

Theorem 1.1. $\kappa_\rho = S_\rho$ if $\rho$ has finite statistics.

1.4. Coherence equations. In this section, we assume $\Delta$ is a set of localized covariant endomorphisms of $\mathcal{A}$ with localization support in $I_0$. Let $h, g$ be elements of $G$. We assume $hI_0 \cap I_0 = \emptyset$, $gI_0 \cap I_0 = \emptyset$, $hI_0 \cap gI_0 = \emptyset$. Choose $J_1, J_2 \in \mathcal{I}$ such that $J_1 \cup J_2 \subsetneq S^1$, $J_1 \supset I_0 \cup g.I_0$, $J_2 \supset I_0 \cup h.I_0$, $J_1 \cap h.I_0 = \emptyset$, $J_2 \cap g.I_0 = \emptyset$ and $J_1 \cap J_2 = I_0$. We assume that in $J_1$ (resp. $J_2$), $g.I_0$ (resp. $h.I_0$) lies clockwise (resp. anti-clockwise) from $I_0$.

Lemma 1.1. For any $J \supset J_1 \cup J_2$, $J \in \mathcal{I}$, $\gamma, \lambda \in \Delta$ and $x \in \mathcal{A}(J)$, we have

(0) $\Gamma_\lambda(g) \in \mathcal{A}(J_1)$.
(1) $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g))\, \gamma_J \cdot \lambda_J(x) = \lambda_J \cdot \gamma_J(x)\, \Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g))$.
(2) $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g)) = \lambda_{J_2}(\Gamma_\gamma(h)^*)\Gamma_\gamma(h)$.
(3) $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g)) \in \mathcal{A}(I_0)$.

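Proof. Recall $\lambda_J(x) = \Gamma_\lambda(g)^* \lambda_{g,J}(x) \Gamma_\lambda(g)$ for any $x \in \mathcal{A}(J)$, $S^1 \supsetneq J \supset J_1$. Since $\lambda_J$ (resp. $\lambda_{g,J}$) is localized on $I_0$ (resp. $g.I_0$), it follows that $\Gamma_\lambda(g) \in \mathcal{A}(J \cap J_1^c)'$ for any $S^1 \supsetneq J \supset J_1$. Let us choose $J_4 \supset J_1$, $J_3 \supset J_1$ so that $I_2 = J_4 \cap J_1^c$, $I_3 = J_3 \cap J_1^c$ are closed intervals and $I_2^c \cap I_3^c = J_1$. Then we have:

$$\Gamma_\lambda(g) \in \mathcal{A}(I_2^c) \cap \mathcal{A}(I_3^c).$$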


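We claim that

$$\mathcal{A}(I_2^c) \cap \mathcal{A}(I_3^c) = \mathcal{A}(J_1).$$

In fact it is clear that $\mathcal{A}(I_2^c) \cap \mathcal{A}(I_3^c) \supset \mathcal{A}(J_1)$. By Haag duality, it is sufficient to prove

$$\mathcal{A}(I_2) \vee \mathcal{A}(I_3) \supset \mathcal{A}(J_1^c).$$

But $I_2 \cup I_3 = J_1^c$ and the inclusion above follows by (c) of Proposition 1.2.1. So we have $\Gamma_\lambda(g) \in \mathcal{A}(J_1)$.

By (0), $\gamma_{J_1}(\Gamma_\lambda(g))$ is well defined. To prove (1), we can calculate the left hand side as follows:

$$\begin{aligned}
\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g))\, \gamma_J \cdot \lambda_J(x)
&= \Gamma_\lambda(g)^* \gamma_J(\Gamma_\lambda(g)\lambda_J(x)) \\
&= \Gamma_\lambda(g)^* \gamma_J(\lambda_{g,J}(x)\Gamma_\lambda(g)) \\
&= \Gamma_\lambda(g)^* \gamma_J(\lambda_{g,J}(x))\, \gamma_{J_1}(\Gamma_\lambda(g)) \\
&= \Gamma_\lambda(g)^* \lambda_{g,J}(\gamma_J(x))\, \gamma_{J_1}(\Gamma_\lambda(g)) \\
&= \lambda_J \cdot \gamma_J(x)\, \Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g)),
\end{aligned}$$

where in the first "=" we used $\gamma_J(x) = \gamma_{J_1}(x)$ if $x \in \mathcal{A}(J_1)$ and $J \supset J_1$. In the fourth "=" we used $\lambda_{g,J}(\gamma_J(x)) = \gamma_J(\lambda_{g,J}(x))$ for $x \in \mathcal{A}(J)$, since $\lambda_{g,J}$ and $\gamma_J$ have disjoint support.

To prove (2), it is sufficient to prove:

$$\Gamma_\gamma(h)^*\, \Gamma_\gamma(h)\gamma_{J_1}(\Gamma_\lambda(g))\Gamma_\gamma(h)^* = \Gamma_\lambda(g)\lambda_{J_2}(\Gamma_\gamma(h)^*)\Gamma_\lambda(g)^*\, \Gamma_\lambda(g),$$

i.e.

$$\Gamma_\gamma(h)^* \gamma_{h,J_1}(\Gamma_\lambda(g)) = \lambda_{g,J_2}(\Gamma_\gamma(h)^*)\Gamma_\lambda(g).$$

This follows from

$$\gamma_{h,J_1}(\Gamma_\lambda(g)) = \Gamma_\lambda(g), \qquad \lambda_{g,J_2}(\Gamma_\gamma(h)^*) = \Gamma_\gamma(h)^*,$$

since $\Gamma_\lambda(g)$ (resp. $\Gamma_\gamma(h)^*$) is in $\mathcal{A}(J_1)$ (resp. $\mathcal{A}(J_2)$) and $J_1$ (resp. $J_2$) is disjoint from the support $h.I_0$ (resp. $g.I_0$) of $\gamma_{h,J_1}$ (resp. $\lambda_{g,J_2}$).

It follows from (1) and the proof of (0) that $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g)) \in \mathcal{A}(J_1)$. Similarly, $\lambda_{J_2}(\Gamma_\gamma(h)^*)\Gamma_\gamma(h) \in \mathcal{A}(J_2)$. From (2) we deduce that

$$\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g)) \in \mathcal{A}(J_1) \cap \mathcal{A}(J_2) = \mathcal{A}(I_0),$$

where the last "=" follows as in the proof of (0).

Because of the property (1) of Lemma 1.4.1, $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g))$ is called the braiding operator. We shall use $\sigma_{\gamma,\lambda}$ to denote $\Gamma_\lambda(g)^* \gamma_{J_1}(\Gamma_\lambda(g))$. We are now ready to prove the following equations. For simplicity we will drop the subscript $I_0$ and write $\mu_{I_0}$ as $\mu$ for any $\mu \in \Delta$ in the following.

Proposition 1.4.1. (1) Yang-Baxter-Equation (YBE):

$$\sigma_{\mu,\gamma}\,\mu(\sigma_{\lambda,\gamma})\,\sigma_{\lambda,\mu} = \gamma(\sigma_{\lambda,\mu})\,\sigma_{\lambda,\gamma}\,\lambda(\sigma_{\mu,\gamma}).$$

(2) Braiding-Fusion-Equation (BFE): For any $w \in \mathrm{Hom}(\mu\gamma, \delta)$,

(a) $\sigma_{\lambda,\delta}\,\lambda(w) = w\,\mu(\sigma_{\lambda,\gamma})\,\sigma_{\lambda,\mu}$,
(b) $\sigma_{\delta,\lambda}\,w = \lambda(w)\,\sigma_{\mu,\lambda}\,\mu(\sigma_{\gamma,\lambda})$,
(c) $\sigma^*_{\delta,\lambda}\,\lambda(w) = w\,\mu(\sigma^*_{\gamma,\lambda})\,\sigma^*_{\mu,\lambda}$,
(d) $\sigma^*_{\lambda,\delta}\,w = \lambda(w)\,\sigma^*_{\lambda,\mu}\,\mu(\sigma^*_{\lambda,\gamma})$.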


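Proof. To prove (1), let us first calculate the left-hand side of (1) as follows:

$$\sigma_{\mu,\gamma}\,\mu(\sigma_{\lambda,\gamma})\,\sigma_{\lambda,\mu} = \sigma_{\mu,\gamma}\,\mu(\gamma_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h))\,\mu_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h) = \sigma_{\mu,\gamma}\,\mu(\gamma_{J_2}(\Gamma_\lambda(h)^*))\,\Gamma_\lambda(h).$$

For the right-hand side of (1), we have:

$$\begin{aligned}
\gamma(\sigma_{\lambda,\mu})\,\sigma_{\lambda,\gamma}\,\lambda(\sigma_{\mu,\gamma})
&= \gamma(\mu_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h)) \cdot \gamma_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h) \cdot \lambda(\sigma_{\mu,\gamma}) \\
&= \gamma(\mu_{J_2}(\Gamma_\lambda(h)^*))\,\lambda_h(\sigma_{\mu,\gamma}) \cdot \Gamma_\lambda(h) \\
&= \gamma(\mu_{J_2}(\Gamma_\lambda(h)^*))\,\sigma_{\mu,\gamma}\,\Gamma_\lambda(h) \\
&= \sigma_{\mu,\gamma}\,\mu(\gamma_{J_2}(\Gamma_\lambda(h)^*))\,\Gamma_\lambda(h),
\end{aligned}$$

where in the second "=" we have used $\Gamma_\lambda(h)\lambda(\sigma_{\mu,\gamma}) = \lambda_h(\sigma_{\mu,\gamma})\Gamma_\lambda(h)$; in the third "=" we have used $\lambda_h(\sigma_{\mu,\gamma}) = \sigma_{\mu,\gamma}$, since $\sigma_{\mu,\gamma} \in \mathcal{A}(I_0)$ and $\lambda_h$ has support on $h.I_0$, which is disjoint from $I_0$; and in the fourth "=" we have used (1) of Lemma 1.4.1.

To prove (a) of (2), let us calculate, starting from the right-hand side of (a), as follows:

$$\begin{aligned}
w\,\mu(\sigma_{\lambda,\gamma})\,\sigma_{\lambda,\mu}
&= w\,\mu(\gamma_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h)) \cdot \mu(\Gamma_\lambda(h)^*)\Gamma_\lambda(h) \\
&= w\,\mu(\gamma_{J_2}(\Gamma_\lambda(h)^*))\,\Gamma_\lambda(h) \\
&= \delta(\Gamma_\lambda(h)^*)\,w\,\Gamma_\lambda(h) \\
&= \sigma_{\lambda,\delta}\,\lambda(w).
\end{aligned}$$

To prove (b), we make use of (2) in Lemma 1.4.1 to calculate, starting from the right-hand side of (b), as follows:

$$\begin{aligned}
\lambda(w)\,\sigma_{\mu,\lambda}\,\mu(\sigma_{\gamma,\lambda})
&= \lambda(w)\,\Gamma_\lambda(g)^*\mu(\Gamma_\lambda(g)) \cdot \mu(\Gamma_\lambda(g)^*\gamma(\Gamma_\lambda(g))) \\
&= \lambda(w)\,\Gamma_\lambda(g)^*\mu(\gamma(\Gamma_\lambda(g))) \\
&= \Gamma_\lambda(g)^*\,w\,\mu(\gamma(\Gamma_\lambda(g))) \\
&= \Gamma_\lambda(g)^*\,\delta(\Gamma_\lambda(g))\,w \\
&= \sigma_{\delta,\lambda}\,w.
\end{aligned}$$

(c) (resp. (d)) is proved in exactly the same way as (a) (resp. (b)) with $h$ replaced by $g$ (resp. $g$ replaced by $h$).

Suppose $\xi_1 \in I_{\xi_1} \subset J_1$, $I_{\xi_1} \cap gI_1 \cap I_1 = \emptyset$, $\xi_2 \in I_{\xi_2} \subset J_2$, and $I_{\xi_2} \cap I_1 \cap h.I_1 = \emptyset$. Here $g, h, J_1, J_2$ are defined as at the beginning of this section. It follows from (2) of Lemma 1.4.1 that:

$$\gamma_{I_{\xi_1}^c}(\Gamma_\lambda(g))^*\,\Gamma_\lambda(g) = \gamma_{J_2}(\Gamma_\lambda(h)^*)\Gamma_\lambda(h) = \sigma_{\lambda,\gamma}.$$

Hence

$$\sigma_{\lambda,\gamma}\sigma_{\gamma,\lambda} = \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(g)^*)\,\gamma_{I_{\xi_2}^c}(\Gamma_\lambda(g)).$$

$\sigma_{\lambda,\gamma}\sigma_{\gamma,\lambda}$ is called the monodromy operator. Let $T_e : \delta \to \gamma\lambda$ be an intertwiner. Recall that $S_\rho = U_\rho(2\pi)$ is the univalence of a covariant endomorphism. When $\rho$ is irreducible, $S_\rho$ is a complex number.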


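Proposition 1.4.2 (Monodromy Equation). If $S_\delta$, $S_\gamma$, $S_\lambda$ are complex numbers, then

$$T_e^*\,\gamma_{I_{\xi_1}^c}(\Gamma_\lambda(g)^*)\,\gamma_{I_{\xi_2}^c}(\Gamma_\lambda(g))\,T_e = T_e^*\,\sigma_{\lambda,\gamma}\sigma_{\gamma,\lambda}\,T_e = \frac{S_\delta}{S_\lambda S_\gamma}.$$

Proof. The proof is essentially contained in [FRS]. We define

$$W = \{g \in G \mid I_0 \cup gI_0 \text{ is a proper interval contained in } S^1 \setminus I_{\xi_1}\}.$$

For $g \in W$, we define $U_{\gamma\lambda}(g) = \gamma_{I_0 \cup g.I_0}(\Gamma_\lambda(g))U_\gamma(g)$. Then it is easy to check that if $g_1 \in W$, $g_2 \in W$ and $g_1 g_2 \in W$, then we have:

$$U_{\gamma\lambda}(g_1 g_2) = U_{\gamma\lambda}(g_1)U_{\gamma\lambda}(g_2).$$

For any $g \in G$, since $G$ is connected, $g$ can be decomposed as $g = g_1 \cdots g_n$ with $g_i \in W$. Define

$$U_{\gamma\lambda}(g) = U_{\gamma\lambda}(g_1) \cdots U_{\gamma\lambda}(g_n).$$

By a standard deformation argument, using the fact that $G$ is simply connected (see Proposition 8.2 in [GL2]), $U_{\gamma\lambda}(g)$ is independent of the decomposition of $g$. It follows from the proof of (v) of Lemma 4.8 in [Fro] that:

$$U_{\gamma\lambda}(g)\,(\gamma \cdot \lambda)_J(x)\,U_{\gamma\lambda}(g)^* = (\gamma \cdot \lambda)_{gJ}(\alpha_g.x)$$

for any $x \in \mathcal{A}_J$. Since $T_e^* U_{\gamma\lambda}(g) T_e$ is a representation of $G$ associated with $\delta$, it follows from Proposition 1.3.2 that

$$U_\delta(g) = T_e^* U_{\gamma\lambda}(g) T_e.$$

We may assume, for simplicity, that $I_0$ is so small that $I_0 \cap \pi.I_0 = \emptyset$. Notice that in particular $U_\delta(2\pi) = S_\delta = T_e^* U_{\gamma\lambda}(2\pi) T_e$. Choose $I_{\xi_1}, I_{\xi_2}$ such that $I_{\xi_1}, I_{\xi_2}, I_0, \pi.I_0$ do not intersect and the anti-clockwise order of the intervals on the circle is $I_0, I_{\xi_2}, \pi.I_0, I_{\xi_1}$. We have:

$$\begin{aligned}
U_{\gamma\lambda}(2\pi) &= U_{\gamma\lambda}(\pi) \cdot U_{\gamma\lambda}(-\pi)^* \\
&= \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(\pi)^*)U_\gamma(\pi) \cdot \left[\gamma_{I_{\xi_2}^c}(U_\lambda(-\pi)U_0(-\pi)^*)U_\gamma(-\pi)\right]^* \\
&= \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(\pi)^*)U_\gamma(2\pi) \cdot \gamma_{I_{\xi_1}^c}(U(\pi)U_\lambda(\pi)) \\
&= \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(\pi)^*)\,S_\gamma \cdot S_\lambda \cdot \gamma_{I_{\xi_2}^c}(\Gamma_\lambda(\pi)).
\end{aligned}$$

So we have:

$$\frac{S_\delta}{S_\gamma S_\lambda} = T_e^* \cdot \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(\pi)^*)\,\gamma_{I_{\xi_2}^c}(\Gamma_\lambda(\pi))\,T_e.$$

It is clear, by Lemma 1.4.1, that as long as $g.I_0 \cap I_0 = \emptyset$,

$$\gamma_{I_{\xi_1}^c}(\Gamma_\lambda(g)^*)\,\gamma_{I_{\xi_2}^c}(\Gamma_\lambda(g)) = \gamma_{I_{\xi_1}^c}(\Gamma_\lambda(\pi)^*)\,\gamma_{I_{\xi_2}^c}(\Gamma_\lambda(\pi)).$$

The proof of the proposition is now completed.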
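As a hedged numerical illustration of the Monodromy Equation, the following sketch evaluates the scalar $S_\delta/(S_\lambda S_\gamma)$ for $SU(2)$ at level $K$. It assumes two standard facts that are not derived in this section: the conformal dimension $\Delta_\lambda = \lambda(\lambda+2)/(4(K+2))$ for a weight $\lambda \in \{0, \dots, K\}$, and the usual truncated $SU(2)_K$ fusion rules. All function names are illustrative only.

```python
from fractions import Fraction


def conformal_dimension_su2(lam, K):
    """Assumed standard formula: Delta = lam (lam + 2) / (4 (K + 2)), lam = 0..K."""
    return Fraction(lam * (lam + 2), 4 * (K + 2))


def fusion_channels_su2(lam, gam, K):
    """Assumed SU(2)_K fusion rules: delta runs from |lam - gam| up to
    min(lam + gam, 2K - lam - gam) in steps of 2."""
    top = min(lam + gam, 2 * K - lam - gam)
    return list(range(abs(lam - gam), top + 1, 2))


def monodromy_exponent(delta, lam, gam, K):
    """Return t in [0, 1) such that S_delta / (S_lam S_gam) = exp(2 pi i t),
    using S_rho = exp(2 pi i Delta_rho)."""
    t = (conformal_dimension_su2(delta, K)
         - conformal_dimension_su2(lam, K)
         - conformal_dimension_su2(gam, K))
    return t % 1


if __name__ == "__main__":
    K, lam, gam = 2, 1, 1
    for delta in fusion_channels_su2(lam, gam, K):
        print(delta, monodromy_exponent(delta, lam, gam, K))
```

Exact rational arithmetic is used for the exponent so that the phase is only determined modulo 1, as it should be for a quantity of the form $\exp(2\pi i t)$.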


1.5. Diagrammatic representations. Let $\gamma, \alpha, \beta \in \Delta$, where $\Delta$ is as at the beginning of Sect. 1.4. A choice of basis of $\mathrm{Hom}(\gamma, \alpha\beta)$ which consists of isometries is called a gauge choice. The elements of the basis are called spin variables. We will denote the elements of a basis of $\mathrm{Hom}(\gamma, \alpha\beta)$ by $P^\gamma_{\alpha\beta}(x) \in \mathrm{Hom}(\gamma, \alpha\beta)$, where $x$ runs over a finite index set which consists of $n$ elements, and $n = \dim \mathrm{Hom}(\gamma, \alpha\beta)$. It is instructive to represent $P^\gamma_{\alpha\beta}(x)$ (resp. $P^\gamma_{\alpha\beta}(x)^*$) by the following diagrams:

[Diagrams: trivalent vertices labeled $x$ joining the strands $\gamma$, $\alpha$, $\beta$, representing $P^\gamma_{\alpha\beta}(x)$ and $P^\gamma_{\alpha\beta}(x)^*$.]

We represent $\sigma_{\alpha,\beta}$ (resp. $\sigma^*_{\beta,\alpha}$) by an over crossing (resp. under crossing) diagram as follows:

[Diagrams: crossings of the strands $\alpha$ and $\beta$, representing $\sigma_{\alpha,\beta}$ and $\sigma^*_{\beta,\alpha}$.]

The diagram of the composition $ba$ of two operators with diagrammatic representations is defined to be the diagram obtained by putting the diagram of $a$ on top of the diagram of $b$. It will be extremely convenient to represent the YBE and BFE relations in terms of diagrams. For a similar treatment, see [MS or X2]. For the reader's convenience, we have used dotted horizontal lines in the following diagrams. Between each two adjacent horizontal lines, there is only one crossing or trivalent graph indicating the corresponding operator. All the diagrams should be read from the top to the bottom, meaning the composition of the corresponding operators from the top to the bottom. We will omit the spin variables $x$ in the following for simplicity. However, it should be clear that it is straightforward to include the spin variables at each stage. Now the Yang-Baxter-Equation (YBE) as in Proposition 1.4.1 can be represented as:

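[Diagram: the Yang-Baxter-Equation as an equality of two three-strand braid diagrams with strands labeled $\lambda$, $\mu$, $\gamma$.]

The Braiding-Fusion-Equation (BFE) corresponding to (a) and (b) of Proposition 1.4.1 can be represented as:

[Diagrams: the Braiding-Fusion-Equations (a) and (b), each an equality of two diagrams in which the strand $\lambda$ crosses the strands $\mu$, $\gamma$ and the trivalent vertex joining them to $\delta$.]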

The diagrams corresponding to (c) and (d) of Proposition 1.4.1 are the same as above, except that the over-crossing is replaced by an under-crossing. We will use these diagrammatic identities to prove most of the theorems in Sect. 3.

1.6. Conformal precosheaf from representation of Loop groups. Let $G = SU(N)$. We denote by $LG$ the group of smooth maps $f : S^1 \to G$ under pointwise multiplication. The diffeomorphism group of the circle $\mathrm{Diff}\,S^1$ is naturally a subgroup of $\mathrm{Aut}(LG)$ with the action given by reparametrization. In particular the group of rotations $\mathrm{Rot}\,S^1 \simeq U(1)$ acts on $LG$. We will be interested in the projective unitary representations $\pi : LG \to U(H)$ that are both irreducible and have positive energy. This means that $\pi$ should extend to $LG \rtimes \mathrm{Rot}\,S^1$ so that $H = \oplus_{n \ge 0} H(n)$, where the $H(n)$ are the eigenspaces for the action of $\mathrm{Rot}\,S^1$, i.e., $r_\theta \xi = \exp(in\theta)\xi$ for $\xi \in H(n)$, and $\dim H(n) < \infty$ with $H(0) \ne 0$. It follows from [PS] that for a fixed level $K$, which is a positive integer, there are only a finite number of such irreducible representations, indexed by the finite set

$$P_{++}^K = \Big\{ \lambda \in P \;\Big|\; \lambda = \sum_{i=1,\cdots,N-1} \lambda_i \Lambda_i,\ \lambda_i \ge 0,\ \sum_{i=1,\cdots,N-1} \lambda_i \le K \Big\},$$

where $P$ is the weight lattice of $SU(N)$ and $\Lambda_i$ are the fundamental weights. We will use $\Lambda_0$ to denote the trivial representation of $SU(N)$. For $\lambda, \mu, \nu \in P_{++}^K$, define

$$N_{\lambda\mu}^\nu = \sum_{\delta \in P_{++}^K} \frac{S_\lambda^{(\delta)} S_\mu^{(\delta)} S_\nu^{(\delta^*)}}{S_{\Lambda_0}^{(\delta)}},$$

where $S_\lambda^{(\delta)}$ is given by the Kac-Peterson formula:

$$S_\lambda^{(\delta)} = c \sum_{w \in S_N} \varepsilon_w \exp(i w(\delta) \cdot \lambda\, 2\pi/n),$$

where $\varepsilon_w = \det(w)$ and $c$ is a normalization constant fixed by the requirement that $S_\mu^{(\delta)}$ is an orthonormal system. It is shown in [Kac2], p. 288 that the $N_{\lambda\mu}^\nu$ are non-negative integers. Moreover, define $Gr(C_K)$ to be the ring whose basis consists of the elements of $P_{++}^K$, with structure constants $N_{\lambda\mu}^\nu$. The natural involution $*$ on $P_{++}^K$ is defined by $\lambda \mapsto \lambda^* =$ the conjugate of $\lambda$ as a representation of $SU(N)$. We shall use $f$ to denote the positive energy representation of $LSU(N)$ at level $K$ corresponding to the vector representation of $SU(N)$. The vacuum representation of a loop group $LG$ is the positive energy representation of $LG$ corresponding to the trivial representation of $G$. We shall also denote $S_{\Lambda_0}^{(\Lambda)}$ by $S_1^{(\Lambda)}$. Define

$$d_\lambda = \frac{S_1^{(\lambda)}}{S_1^{(\Lambda_0)}}.$$

We shall call $(S_\nu^{(\delta)})$ the S-matrix of $LSU(N)$.

We shall encounter the $\mathbb{Z}_N$ group of automorphisms of this set of weights, generated by

$$\sigma : \lambda = (\lambda_1, \lambda_2, \cdots, \lambda_{N-1}) \to \sigma(\lambda) = (K - 1 - \lambda_1 - \cdots - \lambda_{N-1}, \lambda_1, \cdots, \lambda_{N-2}).$$

Define the color $\tau(\lambda) \equiv \sum_i (\lambda_i - 1)\, i \mod (N)$. The central element $\omega = \exp\frac{2\pi i}{N}$ of $SU(N)$ acts on the representation of $SU(N)$ labeled by $\lambda$ as $\exp\left(\frac{2\pi i \tau(\lambda)}{N}\right)$. We also make use of the $N$ (linearly dependent) vectors $e_i$:

$$e_1 = \hat\Lambda_1, \quad e_i = \hat\Lambda_i - \hat\Lambda_{i-1},\ i = 1, \cdots, N-1, \quad e_N = -\hat\Lambda_{N-1}.$$

The irreducible positive energy representations of $LSU(N)$ at level $K$ give rise to an irreducible conformal precosheaf $\mathcal{A}$ (see Sect. 2) and its covariant representations in the following way. First note that if $\pi_\lambda$ is a representation of the central extension of $\mathrm{Diff}^+(S^1)$ on $H_\lambda$, then $\pi_\lambda$ induces an action of $G$ as follows. Let us denote the central extension of $PSL(2,\mathbb{R}) \subset \mathrm{Diff}^+(S^1)$ induced from that of $\mathrm{Diff}^+(S^1)$ by $\widetilde{PSL}(2,\mathbb{R})$, which is a circle bundle, denoted by $L$, over $PSL(2,\mathbb{R})$ (cf. Sect. 4.5 of [PS]). Let $\pi_2 : \widetilde{PSL}(2,\mathbb{R}) \to PSL(2,\mathbb{R})$ and $\pi_1 : G \to PSL(2,\mathbb{R})$ be the natural covering maps. Since $G$ is a simply connected simple Lie group, the pull back circle bundle $\pi_1^*(L)$ on $G$ is homeomorphic to $G \times S^1$. It follows that there exists a homomorphism $\varphi$ from $G$ to $\widetilde{PSL}(2,\mathbb{R})$ such that $\pi_2.\varphi = \pi_1$. We shall fix $\varphi$ and denote by $\pi_\lambda(g)$ the operator $\pi_\lambda(\varphi(g))$ for any $g \in G$ in the following. The conformal precosheaf is defined by

$$\mathcal{A}(I) = \pi_0(L_I G)''.$$

In fact, by the results in Chapter 2, Theorems A, B, C, E and F of [W2], or Theorem 3.2 of [FG], $\mathcal{A}(I)$ satisfies A to F of Sect. 1.2 and therefore is indeed an irreducible conformal precosheaf.
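The fusion coefficients $N_{\lambda\mu}^\nu$ and the numbers $d_\lambda$ defined above can be checked numerically in the simplest case $N = 2$. The sketch below is an illustration only: it assumes the well-known closed form $S_{ab} = \sqrt{2/(K+2)}\,\sin\big(\pi(a+1)(b+1)/(K+2)\big)$ for the $SU(2)$ level-$K$ S-matrix, with weights labeled by $a, b \in \{0, \dots, K\}$, rather than evaluating the Kac-Peterson sum directly, and the function names are hypothetical.

```python
import math


def s_matrix_su2(K):
    """S-matrix of SU(2) at level K (assumed closed form), indexed by a, b = 0..K."""
    n = K + 2
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(K + 1)]
            for a in range(K + 1)]


def fusion_coefficient(S, a, b, c):
    """N_{ab}^c = sum_d S[a][d] S[b][d] S[c][d] / S[0][d].

    For SU(2) every weight is self-conjugate and S is real, so the
    conjugation delta -> delta* in the general formula is trivial here.
    """
    return sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in range(len(S)))


def quantum_dimension(S, a):
    """d_a = S[a][0] / S[0][0], where index 0 labels the vacuum."""
    return S[a][0] / S[0][0]


if __name__ == "__main__":
    S = s_matrix_su2(2)
    # Fusion of the weight 1 with itself at level 2, channel by channel.
    print([round(fusion_coefficient(S, 1, 1, c)) for c in range(3)])
    print(quantum_dimension(S, 1))
```

Floating-point sums are rounded only for display; the comparison with integers should be made with a small tolerance, since the sines are not exact.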


Let $U(\lambda, I)$ be a unitary operator from $H_\lambda$ to $H_0$ such that:

$$\pi_\lambda(x) = U(\lambda, I)^* \pi_0(x) U(\lambda, I)$$

for any $x \in L_I G$. Fix $I_0 \subset S^1$. We define a collection of maps as follows. For any interval $J \subset S^1$, $x \in \mathcal{A}(J)$,

$$\lambda_J(x) = U(\lambda, I_0^c)\, U^*(\lambda, J)\, x\, U(\lambda, J)\, U(\lambda, I_0^c)^*.$$

It follows that if $J \supset I_0$, then $\lambda_J(x)$ commutes with $\mathcal{A}(J^c)$ for any $x \in \mathcal{A}(J)$. By Haag duality, if $J \supset I_0$, then $\lambda_J(\mathcal{A}(J)) \subset \mathcal{A}(J)$. Define:

$$U_\lambda(g) = U(\lambda, I^c)\,\pi_\lambda(g)\,U(\lambda, I^c)^*.$$

It is easy to check that $\{\lambda_J\}$ gives a covariant representation of the conformal precosheaf $\mathcal{A}$. Let us note that the intervals in Sect. 1.2 are defined to be open intervals. We can actually choose the interval to be closed, since we shall be concerned with the conformal precosheaves from positive energy representations of $LG$ and by Theorem E of [W2], $\pi(L_I G)'' = \pi(L_{\bar I} G)''$, where $\bar I$ is the closure of $I$.

The collection of maps $\{\lambda_J\}$ defines an endomorphism $\lambda$ of $C^*(\mathcal{A})$ (see [GL2], Sect. 8). The relation between $\lambda$ and $\lambda_J$ is given by

$$\pi_0(\lambda(i_J(x))) = \lambda_J(x) \quad \text{for any } x \in \mathcal{A}(J).$$

Here $i_J : \mathcal{A}(J) \to C^*(\mathcal{A})$ is the embedding of $\mathcal{A}(J)$ in $C^*(\mathcal{A})$, and $\pi_0$ is the vacuum representation of $C^*(\mathcal{A})$. This makes the definition of the composition $\lambda \cdot \mu$ of two covariant representations $\{\lambda_J\}$ and $\{\mu_J\}$ straightforward. One simply defines $\lambda \cdot \mu$ as the composition of $\lambda, \mu$ as endomorphisms of $C^*(\mathcal{A})$. It is easy to check that if $J \supset I_1$,

$$\pi_0((\lambda \cdot \mu)(i_J(x))) = \lambda_J \cdot \mu_J(x).$$

An equivalent definition can be found in Sect. 4.2 of [?].

We shall also be concerned with the positive energy representations at level $K$ of $LSU(N)$ which are not irreducible but may be decomposed into a direct sum of finitely many irreducible representations. We will denote the set of such representations by $C_K$. By abuse of notation we shall use $\lambda$ to denote an element of $C_K$. It is easy to see that the previous construction will give us a covariant (not necessarily irreducible) representation of $\mathcal{A}$ for any $\lambda \in C_K$. We shall also use $\lambda$ to denote the localized endomorphism of $\mathcal{A}(I_0)$. All the sectors $[\lambda]$ with $\lambda$ irreducible generate a ring (see Sect. 1.1). We will call such a ring the fusion ring.

For $\lambda$ irreducible, the univalence $S_\lambda$ is given by an explicit formula (cf. 9.4 of [PS]). Let us first define

$$\Delta_\lambda = \frac{c_2(\lambda)}{K + N},$$

where $c_2(\lambda)$ is the value of the Casimir operator on the representation of $SU(N)$ labeled by the dominant weight $\lambda$. $\Delta_\lambda$ is usually called the conformal dimension. Then we have: $S_\lambda = \exp(2\pi i \Delta_\lambda)$. The following remarkable result is proved in [W2] (see Corollary 1 of Chapter V in [W2]).

Theorem 1.2. Each $\lambda \in P_{++}^{(K)}$ has a finite index with index value $d_\lambda^2$. The fusion ring generated by all $\lambda \in P_{++}^{(K)}$ is isomorphic to $Gr(C_K)$.
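The formula $\Delta_\lambda = c_2(\lambda)/(K+N)$ above can be evaluated exactly once a normalization of the Casimir operator is fixed. The sketch below is a minimal illustration under the assumption, not stated in the text, that $c_2(\lambda) = \langle \lambda, \lambda + 2\rho \rangle / 2$ with long roots of squared length 2, so that the adjoint representation has $c_2 = N$. Weights are given by their Dynkin labels $(\lambda_1, \dots, \lambda_{N-1})$, and the function names are hypothetical.

```python
from fractions import Fraction


def casimir_su_n(labels):
    """c_2(lambda) = <lambda, lambda + 2 rho> / 2 for SU(N), N = len(labels) + 1.

    Assumed normalization: <Lambda_i, Lambda_j> = min(i, j) (N - max(i, j)) / N
    for the fundamental weights, and rho = Lambda_1 + ... + Lambda_{N-1}.
    """
    N = len(labels) + 1

    def form(i, j):
        return Fraction(min(i, j) * (N - max(i, j)), N)

    total = Fraction(0)
    for i, a in enumerate(labels, start=1):
        for j, b in enumerate(labels, start=1):
            # a = lambda_i, and (b + 2) is the j-th Dynkin label of lambda + 2 rho
            total += a * (b + 2) * form(i, j)
    return total / 2


def conformal_dimension(labels, K):
    """Delta_lambda = c_2(lambda) / (K + N)."""
    N = len(labels) + 1
    return casimir_su_n(labels) / (K + N)


if __name__ == "__main__":
    print(conformal_dimension([1], 2))      # SU(2), level 2, weight 1
    print(conformal_dimension([1, 1], 9))   # SU(3), level 9, adjoint weight
```

Since the univalence is $S_\lambda = \exp(2\pi i \Delta_\lambda)$, only the fractional part of the returned value matters for $S_\lambda$; for instance the weight $\lambda = 6$ of $SU(2)$ at level 10 gives $\Delta_\lambda = 1$ in this normalization and hence $S_\lambda = 1$.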


Proof. It is easy to check that the tensor product of bimodules defined in Sect. 30 of [W1] corresponds precisely to the composition of the corresponding endomorphisms defined in this section. By Sect. 1.1 the fusion ring generated by the $\lambda$'s is exactly the same as the fusion ring defined in Sect. 33 of [W1], and its structure is given by Corollary 1 of Chapter V in [W2]. The equivalence between the ring structure described in Corollary 1 of Chapter V in [W2] and $Gr(C_K)$ described above is proved as an exercise on p. 288 of [?].

In Sect. 3, we will also need a similar but much simpler result than Theorem 1.6 for positive energy representations of the loop group $LH$ at level 1, where $H = E_6, E_7, E_8, Spin(N)$. We shall use $\sigma_i$ to denote such representations, and $S_{\sigma_i \sigma_j}$ to denote the S matrix of $LH$ at level 1, where the $S_{\sigma_i, \sigma_j}$ are defined on p. 264 of [Kac2] similarly as above. For each $\sigma_i$ we can define a covariant representation of an irreducible conformal precosheaf from the vacuum representation of $LH$ exactly as above, using Theorem 3.2 of [FG]. By abuse of notation, we shall use the same $\sigma_i$ to denote the endomorphism which is localized on a fixed interval $I_0$. The ring structure associated to the $H = Spin(N)$ case is determined by Theorem 7.1 of [JB]. When $H = E_8$, there is only one representation, the vacuum representation at level 1, and the fusion ring is generated by the identity element. When $H = E_6$ (resp. $H = E_7$), the center of $H$ is $\mathbb{Z}_3$ (resp. $\mathbb{Z}_2$). The irreducible level 1 representations are in one-to-one correspondence with the elements of the center of $H$ (cf. p. 181 of [PS]). Denote by $\pi_0$, $\pi_\sigma$ the vacuum representation and the representation corresponding to the generator $\sigma$ in the center with order $n$. Then (cf. p. 181 of [PS]):

$$\pi_\sigma \cong \pi_0 \cdot \mathrm{Ad}\,\alpha_\sigma,$$

where $\alpha_\sigma$ is a smooth path from the identity element $e$ of $H$ to $\sigma$ along $S^1$, and $\mathrm{Ad}\,\alpha_\sigma$ is the adjoint action of $\alpha_\sigma$ on $LH$. Choose $\alpha_\sigma$ in such a way that $\alpha_\sigma|_{I_0^c} \equiv e$. Then the sector $[\mathrm{Ad}\,\alpha_\sigma]$ generates the fusion ring, and since $\alpha_\sigma^n \in L_{I_0}H$, it follows that $[\mathrm{Ad}^n \alpha_\sigma] = \mathrm{id}$ and the fusion ring is isomorphic to the group ring of $\mathbb{Z}_n$.

In all the cases above, i.e., when $H = E_6, E_7, E_8, Spin(N)$, the fusion rings determined above have also been computed in [Wal], pp. 784-788, by using the formula on p. 288 of [Kac2]. In particular this shows that the irreducible representations of the fusion ring are given by (cf. p. 288 of [Kac2]):

$$\sigma_i \to \frac{S_{\sigma_i \sigma_j}}{S_{1 \sigma_j}}.$$

Here the subscript 1 in $S_{1\sigma_j}$ is used to denote the vacuum representation of $LH$.

2. Subfactors from Conformal Inclusions

2.1. Nets of subfactors. In this section we sketch some of the results of [LR] which will be used in this paper. For the details of all the proofs and unexplained terminology, we refer the reader to [LR]. We have changed some of the notations in [LR] since they have been used to denote different objects in this paper.

Let $N \subset M$ be an inclusion of type III von Neumann algebras on a Hilbert space $H$. Let $\phi \in H$ be a joint cyclic and separating vector (which always exists if $N$ is type III and $H$ is separable). Let $j_N = \mathrm{Ad}\,J_N$ and $j_M = \mathrm{Ad}\,J_M$ be the modular conjugations w.r.t. $\phi$ and the respective algebra. Then

$$\alpha = j_N j_M|_M \in \mathrm{End}(M)$$

maps $M$ into a subalgebra of $N$. We call $\alpha$ the canonical endomorphism associated with the subfactor, and denote by

$$N_1 := j_N j_M(M) \subset N, \qquad M \subset M_1 := j_M j_N(N)$$

the canonical extension resp. restriction. $\phi$ is again joint cyclic and separating for the new inclusions above, giving rise to new canonical endomorphisms $\beta = j_{N_1} j_N \in \mathrm{End}(N)$ and $\alpha_1 = j_M j_{M_1} \in \mathrm{End}(M_1)$. We have the following formula for the canonical endomorphism (cf. Proposition 2.9 of [LR]):

Proposition 2.1. Let $N \subset M$ be an inclusion of properly infinite factors, $\varepsilon : M \to N$ a faithful normal conditional expectation, and $e_N \in N'$ the associated Jones projection. The canonical endomorphism $\alpha : M \to N$ is given by

$$\alpha = \Psi^{-1} \cdot \Phi,$$

where $\Phi(m) = U m U^*$ ($m \in M$) is the isomorphism of $M$ into $N e_N$ implemented by an isometry $U \in M_1 = \langle M, e_N \rangle$ with $U U^* = e_N$, and $\Psi$ is the isomorphism of $N$ with $N e_N$ given by $\Psi(n) = n e_N$ ($n \in N$). Every canonical endomorphism of $M$ into $N$ arises in this way.

Definition. A net of von Neumann algebras over a partially ordered set $\mathcal{J}$ is an assignment $\mathcal{M} : i \to M_i$ of von Neumann algebras on a Hilbert space to $i \in \mathcal{J}$ which preserves the order relation, i.e., $M_i \subset M_k$ if $i \le k$. A net of subfactors consists of two nets $\mathcal{N}$ and $\mathcal{M}$ such that for every $i \in \mathcal{J}$, $N_i \subset M_i$ is an inclusion of subfactors. We simply write $\mathcal{N} \subset \mathcal{M}$. The net $\mathcal{M}$ is called standard if there is a vector $\Omega \in H$ which is cyclic and separating for every $M_i$. The net of subfactors $\mathcal{N} \subset \mathcal{M}$ is called standard if $\mathcal{M}$ is standard and $\mathcal{N}$ is standard on a subspace $H_0 \subset H$ with the same cyclic and separating vector $\Omega \in H_0$.

For a net of subfactors $\mathcal{N} \subset \mathcal{M}$, let $\varepsilon$ be a consistent assignment $i \to \varepsilon_i$ of normal conditional expectations. Consistency means that $\varepsilon_i = \varepsilon_k|_{M_i}$ whenever $i \le k$. Then we call $\varepsilon$ a normal conditional expectation from $\mathcal{M}$ onto $\mathcal{N}$. $\varepsilon$ is called standard if it preserves the vector state $\omega = (\Omega, \cdot\,\Omega)$.

If the index set $\mathcal{J}$ is directed, i.e., for $j, k \in \mathcal{J}$ there is $m \in \mathcal{J}$ with $j, k \le m$, we associate with a net $\mathcal{M}$ of von Neumann algebras the inductive limit $C^*$-algebra $(\bigcup_{i \in \mathcal{J}} M_i)^-$ and denote it by the same symbol $\mathcal{M}$. Then we have (cf. Cor. 3.3 of [LR]):

Proposition 2.2. Let $\mathcal{N} \subset \mathcal{M}$ be a directed standard net of subfactors (w.r.t. the vector $\Omega \in H$) over a directed set $\mathcal{J}$, and $\varepsilon$ a standard conditional expectation. For every $i \in \mathcal{J}$ there is an endomorphism $\alpha_i$ of the $C^*$-algebra $\mathcal{M}$ into $\mathcal{N}$ such that $\alpha_i|_{M_j}$ is a canonical endomorphism of $M_j$ into $N_j$ whenever $i \le j$. Furthermore, $\alpha_i$ acts trivially on $M_i' \cap \mathcal{N}$. As $i \in \mathcal{J}$ varies to $k$, the corresponding endomorphisms $\alpha_i$ and $\alpha_k$ are inner equivalent by a unitary in $N_l$ whenever $i, k \le l$.


If the index set $\mathcal{J}$ is directed, by Cor. 4.2 of [LR] the index is constant in a directed standard net of subfactors with a standard conditional expectation. We have (cf. Cor. 4.3 of [LR]):

Proposition 2.3. Let $\mathcal{N} \subset \mathcal{M}$ be a directed standard net of subfactors (w.r.t. the vector $\Omega \in H$) over a directed set $\mathcal{J}$, and $\varepsilon$ a standard conditional expectation. If the index $d^2 = \mathrm{Ind}(\varepsilon)$ is finite, then for any $i \in \mathcal{J}$, there is an isometric intertwiner $v_1 : \mathrm{id} \to \alpha$ in $M_i$ which satisfies the following identity with the isometric intertwiner $w_0 : \mathrm{id} \to \beta$ ($\beta := \alpha|_{\mathcal{N}}$) in $N_i$:

$$w_0^* v_1 = d^{-1}\,\mathrm{id} = w_0^* \alpha(v_1).$$

$\alpha$ is given on $\mathcal{M}$ by the formula

$$\alpha(m) = d^2\,\varepsilon(v_1 m v_1^*) \quad (m \in \mathcal{M}).$$

Furthermore, every element in $\mathcal{M}$ is of the form $n v_1$ with $n \in \mathcal{N}$, namely

$$m = d^2\,\varepsilon(m v_1^*)\,v_1 = d^2\,v_1^*\,\varepsilon(v_1 m).$$

2.2. Subfactors from Conformal Inclusions. Let $G \subset H$ be inclusions of compact simple Lie groups. $LG \subset LH$ is called a conformal inclusion if the level 1 projective positive energy representations of $LH$ decompose as a finite number of irreducible projective representations of $LG$. $LG \subset LH$ is called a maximal conformal inclusion if there is no proper subgroup $G'$ of $H$ containing $G$ such that $LG \subset LG'$ is also a conformal inclusion. A list of maximal conformal inclusions can be found in [GNO].

Let $H^0$ be the vacuum representation of $LH$, i.e., the representation of $LH$ associated with the trivial representation of $H$. Then $H^0$ decomposes as a direct sum of irreducible projective representations of $LG$ at level $K$. $K$ is called the Dynkin index of the conformal inclusion. Assume

$$H^0 = \oplus_{\lambda \in P_0} m_\lambda H_\lambda,$$

where $P_0 \subset C_K$ is finite and $m_\lambda$ is the multiplicity of $H_\lambda$ in $H^0$. We shall write the conformal inclusion as $\hat G_K \subset \hat H$.

We shall limit our consideration to the following conformal inclusions, though most of the arguments apply to other cases as well. For $SU(2)$: $\widehat{SU(2)}_{10} \subset \widehat{SO(5)}_1$, $\widehat{SU(2)}_{28} \subset \hat G_2$. For $SU(3)$: $\widehat{SU(3)}_5 \subset \widehat{SU(6)}_1$, $\widehat{SU(3)}_9 \subset \hat E_6$, $\widehat{SU(3)}_{21} \subset \hat E_7$; $(\hat A_8)_1 \subset (\hat E_8)_1$; and four infinite series:

(a) $\widehat{SU(N)}_{N-2} \subset \widehat{SU\left(\frac{N(N-1)}{2}\right)}_1$, $N \ge 4$;
(b) $\widehat{SU(N)}_{N+2} \subset \widehat{SU\left(\frac{N(N+1)}{2}\right)}_1$;
(c) $\widehat{SU(2N)}_{2N} \subset \widehat{Spin(4N^2 - 1)}_1$, $N \ge 2$;
(d) $\widehat{SU(2N+1)}_{2N+1} \subset \widehat{Spin(4N(N+1))}_1$.
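A quick consistency check on this list: a standard necessary condition for a conformal inclusion $\hat G_K \subset \hat H_1$, not stated in this section, is that the central charges $c = k \dim \mathfrak{g}/(k + h^\vee)$ of the two affine algebras agree. The sketch below verifies this for each listed inclusion, reading series (c) as $SU(2N)_{2N}$ and $A_8$ as $SU(9)$. The dimensions and dual Coxeter numbers are the standard values, and all names are illustrative.

```python
from fractions import Fraction


def central_charge(dim, dual_coxeter, level):
    """Central charge c = level * dim / (level + dual Coxeter number)."""
    return Fraction(level * dim, level + dual_coxeter)


def su(n):
    """(dimension, dual Coxeter number) of SU(n)."""
    return (n * n - 1, n)


def so(m):
    """(dimension, dual Coxeter number) of SO(m) or Spin(m), for m >= 5."""
    return (m * (m - 1) // 2, m - 2)


EXCEPTIONAL = {"G2": (14, 4), "E6": (78, 12), "E7": (133, 18), "E8": (248, 30)}


def matches(small, K, big):
    """True if the small group at level K and the big group at level 1 have equal c."""
    return central_charge(*small, K) == central_charge(*big, 1)


SPORADIC = [
    (su(2), 10, so(5)),
    (su(2), 28, EXCEPTIONAL["G2"]),
    (su(3), 5, su(6)),
    (su(3), 9, EXCEPTIONAL["E6"]),
    (su(3), 21, EXCEPTIONAL["E7"]),
    (su(9), 1, EXCEPTIONAL["E8"]),
]


def series(N):
    """The four infinite series (a)-(d) evaluated at a given N."""
    return [
        (su(N), N - 2, su(N * (N - 1) // 2)),
        (su(N), N + 2, su(N * (N + 1) // 2)),
        (su(2 * N), 2 * N, so(4 * N * N - 1)),
        (su(2 * N + 1), 2 * N + 1, so(4 * N * (N + 1))),
    ]


if __name__ == "__main__":
    print(all(matches(*t) for t in SPORADIC))
    print(all(matches(*t) for N in range(4, 10) for t in series(N)))
```

Equality of central charges is only a necessary check on the levels; it does not by itself establish the decomposition of $H^0$ into finitely many irreducible representations of $LG$.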

These cover all the maximal conformal inclusions of the form $SU(N) \subset H$ with $H$ being a simple group.

Let $G = SU(N)$ and let $I$ be an interval of $S^1$. Set $I^c = S^1 \setminus I$. Let $L_I G = \{f \in LG \mid f = e \text{ on } I^c\}$ and $L_I H = \{f \in LH \mid f = e \text{ on } I^c\}$, where $e$ is the identity element of $H$. Let $\pi^0$ be the vacuum representation of $LH$ on the Hilbert space $H^0$ with vacuum vector $\Omega$. By Theorem 3.3 of [FG], $\pi^0(L_I H)''$ is isomorphic to the unique hyperfinite III$_1$ factor $M$.

Proposition 2.4. $\pi^0(L_I G)'' \subset \pi^0(L_I H)''$ is an irreducible inclusion with finite index.


Proof. Since $G \subset H$ is a conformal inclusion, we can assume $H^0 = \oplus_{\lambda \in P_0} m_\lambda H_\lambda$, where $P_0 \subset C_K$ is finite and $m_\lambda$ is the multiplicity of $H_\lambda$. Let $H_0$ be the vacuum representation of $LG$. Denote by $q_\lambda$ the projections from $H^0$ to $H_\lambda$. By Theorem 3.3 of [FG], $\Omega$ is the unique $U(G)$ invariant vector in $H^0$ of lowest weight, so $\Omega$ must be in $H_0$ with multiplicity $m_0$, and therefore $m_0 = 1$. Assume $q \in \pi^0(L_I G)' \cap \pi^0(L_I H)''$. Then $q \in \pi^0(L_I G)' \cap \pi^0(L_{I^c} G)'$. By Theorem F in [W1], $\pi^0(L_I G)' \cap \pi^0(L_{I^c} G)' = \pi^0(LG)'$. Notice that $q_0$ is a minimal abelian projection in $\pi^0(LG)'$, so we have:

$$q\Omega = q q_0 \Omega = a\Omega$$

for some complex constant $a$. Since $q \in \pi^0(L_I H)''$ and $\Omega$ is separating for $\pi^0(L_I H)''$, we must have $q = a \cdot \mathrm{id}$.

To prove that the inclusion is of finite index, consider the following inclusions:

$$\pi^0(L_I G)'' \subset \pi^0(L_I H)'' = \pi^0(L_{I^c} H)' \subset \pi^0(L_{I^c} G)',$$

where the "=" in the middle follows from Haag duality (cf. Theorem 3.3 of [FG]). By [L1], the statistical dimension of

$$\pi^0(L_I G)'' \subset \pi^0(L_{I^c} G)'$$

is $\sum_{\lambda \in P_0} m_\lambda d_\lambda$, which is finite by the definition of conformal inclusion and Theorem 1.6. Hence $\pi^0(L_I G)'' \subset \pi^0(L_I H)''$ is of finite index.

The subfactor in Proposition 2.4 is called the subfactor from conformal inclusions. Now fix intervals $I_0, I_1, J$ such that $\bar J = \bar I_0 \cup \bar I_1$, $I_0 \cap I_1 = \emptyset$ and $I_1$ lies in the anti-clockwise direction from $I_0$ in $J$. We define a directed set $\mathcal{J}$ to be the set which consists of all the intervals $I'$ such that $I' \subset \bar J$. For any two elements $I', I'' \in \mathcal{J}$, $I' \le I''$ iff $I' \subset I''$. For simplicity, we shall use the following notations in the rest of the paper:

$$M(I) := \pi^0(L_I H)'', \quad N(I) := \pi^0(L_I G)'', \quad M_1(I) := \pi^0(L_{I^c} G)', \quad \mathcal{A}(I) := \pi_0(L_I G)'', \quad M := \mathcal{A}(I_0).$$

Let $e_N := q_0$ be the unique projection from $H^0$ to $H_0$ as in the proof of Proposition 2.4.

Proposition 2.5. (1) $N(I) \subset M(I)$ is a directed standard net of subfactors w.r.t. the vector $\Omega \in H^0$ over the set $\mathcal{J}$ for any $I \in \mathcal{J}$, and there exists a unique standard conditional expectation from $M(I)$ to $N(I)$. (2) $M_1(I) = \langle M(I), e_N \rangle$.

Proof. By Theorem 3.2 of [FG], $\Omega$ is cyclic and separating for $M(I)$. Similarly $\Omega$ is cyclic and separating for $N(I)$ restricted to $H_0$. Since the modular group $\Delta^{it}$ of $M(I)$ associated to the state $\omega = (\Omega, \cdot\,\Omega)$ is geometric and is given by (b) of Proposition 1.2.1, it follows that $\Delta^{it} N(I) \Delta^{-it} = N(I)$, and by Takesaki's theorem (cf. Sect. 12 of [W1]) there exists a normal conditional expectation $\varepsilon_I$ from $M(I)$ to $N(I)$ which preserves the state $\omega$. In fact such an $\varepsilon_I$ is unique and faithful since $N(I) \subset M(I)$ is irreducible and of finite index. Recall that $e_N$ is the projection from $H^0$ to $H_0$. Then

$$\varepsilon_I(m)\Omega = e_N m \Omega.$$

It follows that if $I \subset I'$ and $m \in M(I)$, then $\varepsilon_I(m)\Omega = \varepsilon_{I'}(m)\Omega$. So we have $\varepsilon_I(m) = \varepsilon_{I'}(m)$, since $\Omega$ is separating for $M(I')$. Thus $\varepsilon_I$ is a standard conditional expectation from $M(I)$ to $N(I)$.

To prove (2), notice that $\langle M(I), e_N \rangle$ is the basic construction associated with $N(I) \subset M(I)$ and the unique conditional expectation $\varepsilon_I$, so the statistical dimension of $N(I) \subset \langle M(I), e_N \rangle$ is $d^2$, where $d$ is the statistical dimension of $N(I) \subset M(I)$. It is obvious that $\langle M(I), e_N \rangle \subset M_1(I)$, so we just have to show that the statistical dimension of $N(I) \subset M_1(I)$ is also $d^2$. The subfactor $M(I) \subset M_1(I)$ is anti-conjugate to $N(I^c) \subset M(I^c)$, which has the same statistical dimension $d$ as that of $N(I) \subset M(I)$, since $N(I^c) \subset M(I^c)$ is conjugate to $N(I) \subset M(I)$, using conjugation by $U(g)$, where $U(g)$ is the representation of $G$ on $H^0$ and $g.I = I^c$. It follows that the statistical dimension of $N(I) \subset M_1(I)$ is also $d^2$ and $\langle M(I), e_N \rangle = M_1(I)$.

By Proposition 2.4 and Proposition 2.3, we have a canonical endomorphism $\alpha_{I_0} : M(I) \to N(I)$ for any $I \supset I_0$. Moreover, we have $v_1 \in M(I_0)$ which satisfies the properties stated in Proposition 2.3. Let $w_1 = \alpha_{I_0}(v_1)$. Recall $\beta_{I_0} := \alpha_{I_0}|_{N(I)}$. For simplicity, we shall drop the labels and write $\alpha_{I_0}, \beta_{I_0}$ simply as $\alpha, \beta$ in the following.

Define $U(\gamma, I) : H^0 \to H_0$ to be a unitary operator which commutes with the action of $L_I G$ as in Sect. 1.6. Notice that such an operator always exists since $\pi^0|_{L_I G}$ is equivalent to that of $\pi_0$. We shall think of $U(\gamma, I)$ as an element of $B(H^0)$ by identifying $H_0$ as a subspace of $H^0$. Define $\phi_I : B(H^0) \to B(H_0)$ by $\phi_I(m) = U(\gamma, I)\, m\, U(\gamma, I)^*$, $m \in B(H^0)$. We claim that $\alpha$ can be chosen to be

$$\alpha(m) = \phi_I^{-1}(\phi_{I_0^c}(m)). \tag{1}$$

In fact, by Proposition 2.1 and the proof of Proposition 2.2 in [LR], we just have to show

$$\phi_I^{-1}(\phi_{I_0^c}(m))\,e_N = \phi_{I_0^c}(m) \tag{2}$$

for any $I \supset I_0$ and $m \in M(I)$. Notice that for any $m \in M(I)$ and $I \supset I_0$, $\phi_{I_0^c}(m) \in \mathcal{A}(I)$ as operators on $H_0$ by the definition of $\phi_{I_0^c}$ and Haag duality for $\mathcal{A}(I)$. So it is sufficient to check (2) for the case $\phi_{I_0^c}(m) = \pi_0(x)$ with $x \in L_I G$, since $\pi_0(L_I G)$ generates $\mathcal{A}(I)$. From the definitions we have


∗ φ−1 I (π0 (x))eN = U (γ, I) π0 (x)U (γ, I)eN

= U (γ, I)∗ U (γ, I)π 0 (x)eN = π0 (x)eN , and hence (2) and (1) hold. For any a ∈ A(I), define: γI (a) := φI · β · φ−1 I (a). Notice since φ−1 I (A(I)) = N (I), γ is well defined. By using (1), we have: γI (a) = φI0c (φ−1 I (a)).

(3)

which is precisely the definition of the localized (localized on $I_0$) covariant endomorphism associated with the representation $H^0$ of $LG$ as in Sect. 1.6. We shall denote $\gamma_{I_0}$ for the fixed interval $I_0$ by $\gamma$. Notice $\gamma \in \mathrm{End}(M)$. Since $\beta|_{N(I_0)}$ is a canonical endomorphism, we can find $\rho_1 \in \mathrm{End}(N(I_0))$ such that $\beta|_{N(I_0)} = \rho_1\bar\rho_1$ with $\bar\rho_1(N(I_0)) \subset N(I_0)$ conjugate to $N(I_0) \subset M(I_0)$. Moreover, $\rho_1(N(I_0)) = \alpha(M(I_0))$. Define $\rho \in \mathrm{End}(M)$ and $w \in M$ by
\[ \rho := \phi_{I_0}\cdot\rho_1\cdot\phi_{I_0}^{-1}, \qquad w := \phi_{I_0}(w_1). \]

Proposition 2.6. (1) $\gamma = \rho\bar\rho$; (2) The subfactor $\bar\rho(M) \subset M$ is conjugate to the subfactor $N(I_0) \subset M(I_0)$; (3) There exist isometries $v, v_2$ in $M$ with $w = \rho(v_2)$ such that
\[ vx = \gamma(x)v, \quad v_2x = \bar\rho\rho(x)v_2, \quad w\gamma(x) = \gamma^2(x)w, \quad \gamma(\rho(x))w = w\rho(x) \quad \text{for any } x \in M, \]
and
\[ ww^* = \gamma(w^*)w, \qquad ww = \gamma(w)w, \qquad w^*v = d_\rho^{-1}1 = w^*\gamma(v). \]
Moreover, any element $x \in \rho(M)$ can be written as $x = x_1w$ with $x_1 \in \gamma(M)$. Similarly, any element $y \in M$ can be written as $y = y_1v_2$ with $y_1 \in \bar\rho(M)$.

Proof. (1) and (2) follow from the definitions. (3) follows from Theorem 5.1 of [L2] and Cor. 5.6 of [L4].

Let us choose $h \in G$ with $h.I_0 = I_1$. Recall from Sect. 1.3 that the braiding operator $\sigma_{\gamma,\gamma} \in M$ is defined by $\sigma_{\gamma,\gamma} = \gamma_J(\Gamma_\gamma(h)^*)\Gamma_\gamma(h)$.

Proposition 2.7. $\sigma_{\gamma,\gamma}w = w$.


Proof. The proof is contained in the proof of Cor.4.4 of [LR]. Let us explain the proof in our notations. Notice that ∗ φ−1 J σγ,γ φJ = β(u0 )u0 −1 with u0 = φ−1 J (0γ (h)), and φJ (w) = α(v1 ). It is sufficient to show that

β(u∗0 )u0 α(v1 ) = α(v1 ). It follows from the intertwining property (see Sect. 1.3) that u0 β(n) = nu0 (∀n ∈ N (I0 )). Let u = λ(v10 v1∗ ), where v10 ∈ M (I1 ) is the choice of isometry similar to v1 but corresponds to the interval I1 in Proposition 2.3. By Proposition 2.3, uv1 = v10 , u0 β(n) = nu0 (∀n ∈ N (I0 )). It follows that

uu∗0 ∈ N (I0 )0 ∩ N (J).

So φJ (uu∗0 ) ∈ A(I0 )0 ∩ A(J) = A(I1 ), where A(I0 )0 ∩ A(J) = A(I1 ) follows from the proof of (0) in Lemma 1.4.1. By applying φ−1 J to both sides, we have proved that: uu∗0 ∈ N (I1 ). Since β is localized on I0 , we conclude that β(uu∗0 ) = uu∗0 , i.e., β(u∗0 )u0 = β(u∗ )u. Since uv1 = v10 ∈ M (I1 ) and I0 ∩ I1 = ∅, by locality we have uv1 v1 = v1 uv1 = β(u)v1 v1 . Using α(v1 )v1 = v1 v1 we have uα(v1 )v1 = β(u)α(v1 )v1 . Now multiply v1∗ from the right on both sides and apply , using (v1 v1∗ ) = λ−1 , we have β(u∗ )uα(v1 ) = β(u∗0 )u0 α(v1 ) = α(v1 ).

Recall σj is the localized endomorphism (localized on I0 ) as in the end of Sect. 1.6. Let us recall how σj is defined. Let U (j, I) : Hσj → H 0 be a unitary map which commutes with the action of LI H. Define: ψI : B(Hσj ) → B(H 0 ) by ψI (x) = U (j, I)xU (j, I)∗ . Then σj is given by: σj (m) = ψI0c (ψI−1 (m)) (∀m ∈ M (I)). Let γj be the reducible representation of LG on Hσj . Then the localized endomorphism, denoted by the same γj , is given by: γj = φI0c · ψI0c ψI−1 · φ−1 I . Define σj0 ∈ End(M ) by: σj0 = ρ−1 · φI0c · σj · φ−1 I c · ρ. 0

c Notice by the definition of ρ1 , ρ1 (N (I0 )) = α(M (I0 )) = φ−1 I0 φI0 (M (I0 )), and it follows 0 that φ−1 ρ(M ) = M (I ), so σ as above is well defined. 0 j Ic 0


Proposition 2.8. (1) As elements in End(M ), we have ρσj0 ρ¯ = γj ; (2) σγ,γj w = ρσj0 ρ−1 (w). Proof. (1) follows directly from the definitions and the formula γ = ρρ. ¯ −1 ∗ 0 −1 (σ w) = α(σ(u ))u α(v ) and φ To prove (2), notice φ−1 γ,γ 0 1 j 0 J J (ρσj ρ (w)) = α(σj (v1 )), it is sufficient to show that α(σj (u∗0 ))u0 α(v1 ) = α(σj (v1 )). Here u0 = φ−1 J (0γ (h)) as in the proof of Proposition 2.7. Recall from the proof of Proposition 2.7 that u = λ(v10 v1∗ ), where v10 ∈ M (I1 ) is the choice of isometry similar to v1 but corresponds to the interval I1 in Proposition 2.3. Since ασj as an endomorphism of N (I) is localized on I0 , exactly the same argument as in the proof of Proposition 2.6 shows that α(σj (u∗0 ))u0 = α(σj (u∗ ))u. So we just have to show α(σj (u∗ ))uα(v1 ) = α(σj (v1 )). By Proposition 2.7 we have α(u∗ )uα(v1 ) = α(v1 ), i.e., uα(v1 ) = α(u)α(v1 ), so α(σj (u∗ ))uα(v1 ) = α(σj (u∗ ))α(u)α(v1 ) = α(σj (u∗ )uv1 ) = α(σj (u∗ )σj (uv1 )) = α(σj (v1 )), where in the third = we have used that uv1 ∈ M (I1 ) and σj is localized on I0 .qe In the next chapter all the endomorphisms will be in End(M ). For this reason we shall write σj0 in Proposition 2.8 simply as σj .

3. New Braided Endomorphisms

In this chapter all the endomorphisms will be in End(M). By formula (3) of Sect. 2, $[\gamma] = \sum_{\lambda\in P_0} m_\lambda[\lambda] \in Gr(C_K)$. By Proposition 2.6, $\gamma = \rho\bar\rho$. So Proposition 2.6 will allow us to study $\rho$ by $\gamma$, which is an element in $Gr(C_K)$. By Proposition 2.6, $w \in \mathrm{Hom}(\gamma, \gamma^2)$. Hence we can represent $w$ and $w^*$ as:


[Diagrams: $w$ and $w^*$ drawn as vertices whose strands are all labelled $\gamma$.]

Let σλ,µ : λµ → µλ be the braiding operator as in Sect. 1.4. We will drop the subscript of σλ,µ and write σλ,µ simply as σ. In any case the subscript can always be recovered by tracing which spaces it acts on. 3.1. New braided endomorphisms. Fix λ ∈ CK (λ is not necessarily an irreducible representation). Let σ : γ · λ → λ · γ, be the braiding operator as defined in Sect. 1.4. Denote by λσ ∈ End(M ) which is defined by λσ (x) = σ ∗ λ(x)σ for any x ∈ M . Let : M → ρ(M ) be the minimal conditional expectation (see [L4]) given by (x) = w∗ γ(x)w for any x ∈ M . Theorem 3.1. (1) λσ (ρ(x)) ∈ ρ(M ) for any x ∈ M . (2) (λσ (x)) = λσ ((x)) for any x ∈ M . Proof. By the definitions of λσ we have λσ γ(x) = γ(λ(x)) ⊂ ρ(M ). Since by Proposition 3.1, every element x ∈ ρ(M ) can be written as x = x1 w with x1 ∈ γ(M ). It is sufficient to show λσ (w) ⊂ ρ(M ). We claim λσ (w) = γ(σ)w i.e. σ ∗ λ(w)σ = γ(σ)w. It is equivalent to λ(w)σ = σγ(σ)w which follows from (c) of Proposition 1.4.2. Hence we have proved λσ (w) = γ(σ)w. Since w ∈ ρ(M ), γ(σ) ∈ ρ(M ), therefore λσ (w) ∈ ρ(M ) and the proof of (1) is complete. As for (2), By using λσ (w) = γ(σ)w we have λσ ((x)) = λσ (w∗ )λσ (γ(x))λσ (w) = w∗ γ(σ)∗ γ(λ(x))γ(σ)w = (λσ (x)) Now let v be the isometry in M with vx = ρρ(x)v ¯ for any x ∈ M . Define aλ (x) = v ∗ ρ¯ · λσ · ρ(x)v. Corollary 3.2. aλ ∈ End(M ) and ρ · aλ = λσ · ρ, aλ · ρ¯ = ρ¯ · λ. If we denote by daλ , dλ the statistical dimension of aλ and λ, then daλ = dλ . Proof. Let x, ∈ M be an arbitrary element. We have ρ · aλ (x) = ρ(v ∗ ρ¯ · λσ · ρ(x)v) = ρ(v ∗ ρ¯ · ρ(z)v) = ρ(z) = λσ ρ(x). And

aλ · ρ(x) ¯ = v ∗ ρ¯ · λσ ρ · ρ(x)v ¯ ¯ = ρ¯ · λ(x). = v ∗ ρ¯ · ρ · ρ(λ(x))v


Since ρ · aλ = λσ · ρ and ρ is one-to-one, it makes sense to write aλ = ρ−1 · λσ · ρ. It follows immediately that aλ ∈ End(M ). By the multiplicative property of the statistical dimension and aλ · ρ¯ = ρ¯ · λ, we have daλ dρ¯ = dρ¯ dλ . Since dρ¯ > 1, it follows that daλ = dλ . Remark. (1) As in the proof above, since ρ · aλ = λσ · ρ and ρ is one-to-one, it makes sense to write aλ = ρ−1 ·λσ ·ρ. One can think of aλ as measuring the non-commutativity of ρ and λσ : in fact, it follows from Theorem 3.3 that aλ is in general different from λσ as sectors. Following [Ka1] and (2) of Theorem 3.1, aλ is referred to as quotient endomorphisms as it can be thought as obtained from λ “divided” by the symmetry provided by ρ. Remark. (2): There are two kinds of braidings one may use in theorem 3.1. The one we used corresponds to over crossing (see Sect. 2). If we choose to use under crossing braiding, then we get a˜ λ which is in general different from aλ (See the beginning of Sect. 3.2). However, it is clear from the proof that all the results in this section about aλ hold true for a˜ λ . Next we claim that aλ is a braided endomorphism of M . (See [L1]), i.e., there exists a unitary σ in a2λ (M )0 ∩ M that satisfies the braiding relation σaλ (σ)σ = aλ (σ)σaλ (σ). Let us first prove the following: Theorem 3.3. Let µ, λ ∈ CK (µ, λ are not necessarily irreducible). Then: ¯ aλ ρ) ¯ = Hom(aµ , aλ ) . (1) Hom(aµ ρ, (2) Hom(ρaµ , ρaλ ) = ρ(Hom(aµ , aλ )). ¯ aλ ρ) ¯ ⊃ Hom(aµ , aλ ). We need to show Proof. It is obvious that Hom(aµ ρ, ¯ aλ ρ) ¯ ⊂ Hom(aµ , aλ ). Notice y ∈ Hom(aµ ρ, ¯ aλ ρ) ¯ iff: Hom(aµ ρ, ∀x ∈ M,

¯ = yaµ · ρ(x). ¯ aλ · ρ(x)y

Since by Corollary 3.2, aλ · ρ(x) ¯ = ρ¯ · λ(x), aµ · ρ(x) ¯ = ρ¯ · µ(x), by applying ρ to both sides of the equation the above is equivalent to: γ · λ(x)ρ(y) = ρ(y)γ · µ(x). Recall γ · λ = λσ · γ, γ · µ = µσ · γ (we have omitted the subscript of σ, but it ¯ aλ ρ) ¯ iff: should be clear from the context what they should be), hence y ∈ Hom(aµ ρ, λσ · γ(x)ρ(y) = ρ(y)µσ · γ(x) for any x ∈ M . Similarly it follows from Corollary 3.2 that z ∈ Hom(aµ , aλ ) iff λσ · ρ(x) ρ(z) = ¯ aλ ρ) ¯ ⊂ Hom(aµ , aλ ), it is ρ(z) µσ · ρ(x) for any x ∈ M . Hence to show Hom(aµ ρ, sufficient to show if ρ(z) satisfies λσ · γ(x) ρ(z) = ρ(z) µσ · γ(x) for any x ∈ M , then λσ · ρ(x) ρ(z) = ρ(z) λµ · ρ(x) for any x ∈ M . By Proposition 3.1, all we need to show is if ρ(z) ∈ Hom(µσ · γ, λσ · γ), then λσ (w)ρ(z) = ρ(z)µσ (w). Notice from the proof of Theorem 3.1 we have λσ (w) = γ(σ)w, µσ (w) = γ(σ)w. In terms of diagrams, ρ(z)µσ (w) = ρ(z)γ(σ)w can be represented as:


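[Diagram: $\rho(z)\gamma(\sigma)w$, built from $w$, $\gamma(\sigma)$ and $\rho(z)$, with strands labelled $\gamma$, $\lambda$, $\mu$.]

By using YBE and BFE, the above diagram is equal to:

[Diagram: built from $w$, $\sigma^*$, $\gamma(\rho(z))$, $\sigma$ and $\gamma(\sigma)$, with strands labelled $\gamma$, $\lambda$, $\mu$.]

Using the fact $\sigma_{\gamma,\gamma}^* w = w$, we see the above diagram corresponds to: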

γ(σ)σγ,γ γ(ρ(z))w. Since γ(ρ(z))w = wρ(z), we have γ(σ)σγ,γ γ(ρ(z))w = γ(σ)σγ,γ wρ(z) = γ(σ)wρ(z) = λσ (w)ρ(z). As for (2), it is clear that Hom(ρaµ , ρaλ ) ⊃ ρ(Hom(aµ , aλ )). To prove the equality, we just have to show the dimension of ρ(Hom(aµ , aλ )) is equal to that of Hom(ρaµ , ρaλ ). Since ρ is one to one, all we need to show is: haµ , aλ i = hρaµ , ρaλ i. By (1) Frobenius duality: ¯ aλ ρi ¯ haµ , aλ i = haµ ρ, = hρµ, ¯ ρλi ¯ = hγµ, λi. On the other hand, from Corollary 3.2 we have


hρaµ , ρaλ i = hµσ ρ, λσ ρi = hµρ, λρi = hµγ, λi = hγµ, λi. So we have haµ , aλ i = hρaµ , ρaλ i = hγµ, λi.

Remark. It follows immediately from the above theorem that every irreducible subsector of aµ remains irreducible when multipy from the right (resp. from the left) by ρ¯ (resp. ρ). Corollary 3.4. (1) For any λ, µ ∈ CK , aλµ = aλ aµ ; (2) aλ is a braided endomorphism. Proof. (1) Apply ρ to both sides, we just have to show ρaλµ = ρaλ aµ . By using Cor.3.2, it is sufficient to show: λ(σγ,µ )σγ,λ = σγ,λµ , which follows directly from the definitions of the braiding operator before Proposition 1.4.2 in Sect. 1.4. (2) Denote by τ = σλ,λ the braiding unitaries in λ20 (M ) ∩ M . It follows from YBE that τ λ(τ )τ = λ(τ )τ λ(τ ). Let σ = ρ(τ ¯ ) ∈ ρλ ¯ 20 (M ) ∩ M = a2λ ρ¯0 (M ) ∩ M . By Theorem 3.3, 2 0 20 σ ∈ aλ ρ¯ (M ) ∩ M = aλ (M ) ∩ M . By applying ρ¯ to the equation τ λ(τ )τ = λ(τ )τ λ(τ ) and using aλ · ρ¯ = ρ¯ · λ, we get: τ aλ (τ )τ = aλ (τ )τ aλ (τ ). Hence aλ is a braided endomorphism. Recall from Sect. 2.1 that f denotes the vector representation of SU (N ). Corollary 3.5. (1) af is irreducible, i.e. a0f (M ) ∩ M = C1. (2) [aλ¯ ] = [¯aλ ]. ¯ 0 (M ) ∩ M it is sufficient to show Proof. (1) Since, a0f (M ) ∩ M ⊂ af ρ¯0 (M ) ∩ M = ρf 0 ρ¯ (M ) ∩ M = C1. By Frobenius duality (see [L2] or [Y]), ¯ af ρi ¯ = hρf, ¯ ρf ¯ i haf ρ, = hρρ, ¯ f f¯i X mλ λ, 1 + adi, =h λ∈P0

where ad denotes the adjoint representation of SU (N ). ¯ af ρi ¯ = 1, it is sufficient to show ad ∈ / P0 . By [Kac1], Since m1 = 1, to show haf ρ, Cλ any λ ∈ P0 has the property that K+N ∈ Z, where Cλ is the eigenvalue of the Casimir operator. Cad N Since 0 < P / P0 . K+N = K+N < 1, (recall K > 0) it follows that ad ∈ (2) Let [aλ ] = i mi xi , where xi are irreducible subsectors of [aλ ] and mi are positive integers. ¯ ρi ¯ = haλ aλ¯ , 1i, ¯ ρ, ¯ by (1) of Cor.3.4. By Theorem 3.3, haλ aλ λλ P Notice aλ aλ¯ = aP ¯ ρi ¯ = i mi hxi aλ¯ , 1i. Since hxi aλ¯ ρ, ¯ ρi ¯ ≥ hxi aλ¯ , 1i, it follows that i.e., i mi hxi aλ¯ ρ, hxi aλ¯ ρ, ¯ ρi ¯ = hxi aλ¯ , 1i. On the other hand,


¯ ρi hxi aλ¯ ρ, ¯ ρi ¯ = hxi ρ¯λ, ¯ ¯ = hxi , ρλρi ¯ λi = hxi , ρρa ≥ hxi , aλ i = mi .

P So we have hxi aλ¯ , 1i = haλ¯ , x¯ i i ≥ mi and we can write [aλ¯ ] = i mi x¯ i + y = [¯aλ ] + y. But the statistical dimension of aλ¯ is dλ¯ = dλ = the statistical dimension of [¯aλ ], therefore y = 0. ¯ Let Aρ ⊂ Sect(M ) which is generated by all the irreducible subsectors of [ρλρ], where λ ∈ CK . Since ρρ¯ ∈ CK and the elements of CK generate a finite subring of Sect(M ), it follows Aρ is a finite ring. Many examples suggest Aρ is a commutative ring but we have an example in the next section showing that is not the case. However, we have the following theorem. Theorem 3.6. (1) Let [b] be any subsector of [A(aµ )], where A is an arbitrary polynomial in aµ , µ ∈ CK , then [aλ ][b] = [b][aλ ] for any λ ∈ CK ; (2) Let [c] be any subsector of [ρρ], ¯ then [aλ ][c] = [c][aλ ] for any λ ∈ CK . Proof. (1) For simplicity we denote A(aµ ) by A. It follows from Corollary 3.2 that ˜ there exists A(µ) ∈ CK (denote by A˜ in the following) such that: Aρ¯ = A˜ ρ¯ and ˜ is the braiding operator. Since b is a subsector of ˜ ρA = Aσ ρ, where σ : γ A˜ → Aγ A, there exists partial isometry v ∈ M with b(x) = v ∗ A(x)v , ∀x ∈ M . Denote by p = vv ∗ ∈ A0 (M ) ∩ M . Then aλ b(x) = aλ (v ∗ )aλ A(x)aλ (v) baλ (x) = v ∗ Aaλ (x)v. Notice aλ · A · ρ¯ = ρ¯ · λA˜ = ρ¯ · A˜ 1 · λ = ρ(σ ¯ 1 )ρ¯ · A˜ · λ ρ(σ ¯ 1∗ ) = ρ(σ ¯ 1 )A · aλ · ρ¯ ρ(σ ¯ 1∗ ), ∗ where A˜ 1 = σ1 Aσ1 and σ1 : A˜ · λ → λ · A˜ is the under crossing braiding operator. Let y = ρ(σ ¯ 1∗ ). First we claim y aλ · A(x) = A · aλ (x)y, ∀x ∈ M . Since ρ · aλ · A = λσ · A˜ σ · ρ, ρ · A · aλ = A˜ σ · λσ · ρ, it is sufficient to show ρ(y) λσ · A˜ σ · ρ(x) = A˜ σ · λσ · ρ(x)ρ(y), ∀x ∈ M . ˜ ˜ = γ(σ1∗ λ· A(x)) = Recall ρ(y) λσ · A˜ σ ·γ(x) = γ(σ1∗ )λσ · A˜ σ ·γ(x) = γ(σ1∗ )γ ·λ· A(x) ∗ ˜ ˜ γ(A · λ(x)σ1 ) = Aσ · λσ · γ(x)ρ(y), ∀x ∈ M . By Proposition 2.6, ρ(M ) is generated by γ(M ) and w, it is sufficient to show γ(σ1∗ )λσ · A˜ σ (w) = A˜ σ · λσ (w)γ(σ1∗ ). In terms of diagrams (using δσ (w) = γ(σ)w, ∀δ as in the proof of Theorem 3.1): γ

[Diagrams: $\gamma(\sigma_1^*)\cdot\lambda_\sigma\cdot\tilde A_\sigma(w)$ and $\tilde A_\sigma\cdot\lambda_\sigma(w)\cdot\gamma(\sigma_1^*)$, with strands labelled $\gamma$, $\lambda$, $\tilde A$.]


It follows from YBE that γ(σ1∗ )λσ · A˜ σ (w) = A˜ σ · λσ (w)γ(σ1∗ ). Hence b · aλ (x) = v ∗ A · aλ (x)v = v ∗ yaλ · A(x)y ∗ v, ∀x ∈ M . Recall aλ · b(x) = aλ (v ∗ )aλ · A(x)aλ (v). To show [aλ · b] = [b · aλ ], it is sufficient to ¯ 1 )pρ(σ ¯ 1∗ ). Apply ρ to both sides and use show that aλ (vv ∗ ) = y ∗ vv ∗ y, i.e., aλ (p) = ρ(σ Corollary 3.2, it is sufficient to show: λσ (ρ(p)) = γ(σ ∗ )ρ(p)γ(σ). Recall p ∈ A0 (M ) ∩ M = Aρ¯0 (M ) ∩ M = ρ¯A˜ 0 (M ) ∩ M . Hence ρ(p) ∈ γ A˜ 0 (M ) ∩ M . In terms of diagrams: A˜

[Diagrams: $\lambda_\sigma(\rho(p))$ and $\gamma(\sigma_1)\rho(p)\gamma(\sigma_1^*)$, with strands labelled $\gamma$, $\lambda$, $\tilde A$.]

γ(σ1 ) By YBE and BFE, λσ (ρ(p)) = γ(σ1 )ρ(p)γ(σ1∗ ). (2) The proof is quite similar to that of (1). Let u be the partial isometry with c(x) = u∗ ρρ(x)u ¯ for any x ∈ M and let q = uu∗ . Then aλ · c(x) = aλ (u∗ )aλ · ρ¯ · ρ(x)aλ (u) , c · aλ (x) = u∗ ρ¯ · ρ · aλ (x)u. By Corollary 3.2, ρ¯ · ρ · aλ (x) = ρ¯ · λσ · ρ(x) = ρ(σ ¯ ∗ )ρ¯ · λ · ρ(x)ρ(σ) ¯ ∗ ¯ = ρ(σ ¯ )aλ · ρ¯ · ρ(x)ρ(σ). ¯ ∗ )aλ · ρ¯ · ρ(x)ρ(σ)u. ¯ To show [aλ · c] = [c · aλ ], it is sufficient Hence c · aλ (x) = u∗ ρ(σ to show aλ (q) = ρ(σ)q ¯ ρ(σ ¯ ∗ ), which is equivalent to (applying ρ to both sides and using Corollary 3.2) λσ (ρ(q)) = γ(σ)ρ(q)γ(σ ∗ ). Notice ρ(q) ∈ γρ0 (M ) ∩ M ⊂ γ 20 (M ) ∩ M . Hence


[Diagrams: $\lambda_\sigma(\rho(q))$ and $\gamma(\sigma)\rho(q)\gamma(\sigma^*)$, with strands labelled $\gamma$, $\lambda$, $\gamma$.]

It follows from YBE and BFE that λσ (ρ(q)) = γ(σ)ρ(q)γ(σ ∗ ).

Notice since [af ] ∈ Aρ (see the definition after Corollary 3.5), the subfactor M ⊃ af (M ) has finite depth. Theorem 3.6 gives us more information on the higher relative commutants of this subfactor. We have: ¯ (In particular this shows [af ] Corollary 3.7. (1) h(af a¯ f )n , (af a¯ f )n i = hf 2n f¯2n , ρρi. is different from [f ] in general.) (2) h(af a¯ f )n af , (af a¯ f )n af i = hf 2n+1 f¯2n+1 , ρρi. ¯ (3) The principal graph of M ⊃ af (M ) is isomorphic to its dual principal graph as abstract graphs. Proof. h(af a¯ f )n , (af a¯ f )n i = h(af af¯ )n , (af af¯ )n i = ha(f f¯)n , a(f f¯)n i = hρ(f ¯ f¯)n , ρ(f ¯ f¯)n i ¯ = hf 2n f¯2n , ρρi, where we have used (2) of Cor.3.5, (1) of Cor.3.4, (1) of Theorem 3.3 in the first, the second and the third "=". We have also used Frobenius duality and Theorem 1.6 in the last 2 identities. (2) is proved in a similar way as in (1).


To prove (3), let G, G0 , G1 (resp. H, H 0 , H 1 ) denote the principal graph (resp. the dual principal graph), the even nodes and the odd nodes of the principal graphs (the dual principal graphs). Then G0 , H 0 correspond to irreducible subsectors of (af a¯ f )n , (where n > the depth of M ⊃ af (M )) and G1 , (resp. H 1 ) corresponds to irreducible subsectors of (af a¯ f )n af (resp. (af a¯ f )n a¯ f ). The isomorphism ϕ from G to H is defined to be: for any x ∈ G0 ∪ G1 , ϕ(x) = x¯ ∈ H 0 ∪ H 1 , where x¯ is the conjugate sector of x. ϕ is an isomorphism between G and H since by (1) of Theorem 3.6 and (2) of Corollary 3.5, ¯ = [af¯ x] ¯ = [xa ¯ f¯ ] and hx af , yi = h¯af x, ¯ yi ¯ = hx¯ a¯ f , yi ¯ = hϕ(x)ϕ(af ), ϕ(y)i. [¯af x] In [X3] , certain subfactors are constructed from some exceptional representations of Hecke-algebras. The theory developed in this section allow us to generalize some of the observations of [X3]. In fact, part of the motivation of this paper is to give a better explanation of some results in [X3]. Let us first introduce some notations. ¯ n (M )0 ∩ ρ(M ¯ ) and Bn = anf (M )0 ∩ M . Let n be a positive integer. Set An = ρf S S Denote by A = n An and B = n Bn . Let bk = f k−1 (σ), where σ = σf,f is the braiding operator. Let hk = ρ(b ¯ k ). Theorem 3.8. (1) An ⊂ Bn ; (2) There exists a Markov trace tr defined on B, i.e., tr(hn−1 x) = tr(hn−1 ) tr(x) for any x ∈ Bn−1 ; 2πi (3) Let q = exp( K+N ) and h0i = q 2N hi . If K + N > 4, then h0i satisfies the following type A Hecke algebra relations: N +1

\[ h'_i h'_{i+1} h'_i = h'_{i+1} h'_i h'_{i+1}, \qquad h'_i h'_j = h'_j h'_i \ (|i-j| > 1), \qquad {h'_i}^2 = (q-1)h'_i + q; \]

(4) An as an algebra is generated by 1, h1 , ...hn−1 ; (5) The Markov trace tr as in (1) makes the following inclusions: An ⊂ An+1 ∩ ∩ Bn ⊂ Bn+1 a periodic commuting square in the sense of [GHJ] (see Chapter 4 of [GHJ]) when n is sufficiently large. Proof. (1) By Theorem 3.3 and Corollary 3.2, n 0 ¯ (M ) ∩ M = ρf ¯ n0 (M ) ∩ M ⊃ ρf ¯ n0 (M ) ∩ ρ(M ¯ ) = An . Bn = an0 f (M ) ∩ M = af ρ

(2) Let us define tr as in Corollary 2.5 of [L1]. Let εn : M → ρf ¯ n (M ) be the minimal expectations. For any b ∈ B, define tr(b) = limi→+∞ εi (b). By Corollary 3.5, a0f (M ) ∩ M = C1. It follows from Corollary 2.5 of [L1] that tr is really a trace. Let us show for any x ∈ Bn−1 , εn (hn−1 x) = tr(x) tr(hn−1 ), which implies tr is a Markov trace. Notice ¯ n−1 ) ∈ ρf ¯ n−2 (M ). Therefore hn−1 = ρ(g εn (hn−1 x) = εn (εn−2 (hn−1 x)) = εn (hn−1 εn−2 (x)).

Since

∀x ∈ Bn−1 , εn−2 (x) ∈ ρf ¯ n−2 (M ) ∩ ρf ¯ n−10 (M ) = ρf ¯ n−2 (M ∩ ρf ¯ 0 (M )) = ρf ¯ n−2 (M ∩ a0f (M )) = C1.

So we have $\varepsilon_{n-2}(x) = \mathrm{tr}(x)$. Hence $\varepsilon_n(h_{n-1}x) = \varepsilon_n(h_{n-1})\cdot\mathrm{tr}(x) = \mathrm{tr}(h_{n-1})\,\mathrm{tr}(x)$. (3) The first two relations follow immediately from the definitions of $h'_i$ and YBE. We just have to show the third relation, and it is sufficient to prove it for $h' := h'_1$. By Theorem 1.6, $f^2(M)' \cap M \simeq M_2(\mathbb{C})$, where $M_2(\mathbb{C})$ denotes the set of two-by-two matrices. So the unitary operator $b := b_1 \in f^2(M)' \cap M$ satisfies a quadratic relation. Denote by Spec(A) the set of eigenvalues of an operator A. It follows from Proposition 1.4.3 and the formula given at the beginning of the proof of Lemma 3.1 that
\[ \mathrm{Spec}(b^2) = \{ q^{\frac{N-1}{N}},\ q^{\frac{-N-1}{N}} \}. \]
It follows that
\[ \mathrm{Spec}(h'^2) = \{ q^2, 1 \}. \]

¯ (M )). Let f (resp. ρf ¯ ) be the minimal conditional expectation from M to f (M ) (resp. ρf By Sect. 2 of [L1] and the end of Sect. 1.3, ¯ f (b)) tr(h) = ρf ¯ (h) = ρ( = f (b) = f −1 f (b) = 8f (b) = d−1 f κf , where we have also used the fact that both ρf ¯ (h) and f (g) are scalars. By Theorem 1.3.3, κf = Sf , and using the explicit formula of Sf given in Sect. 1.6, tr(h0 ) =

\[ \frac{q-1}{1-q^{-N}}. \]

We claim that if K + N > 4, then Spec(h0 ) = {q, −1}. We just have to show that Spec(h0 ) cannot be {q, 1}, {−q, 1}, {−q, −1}. If Spec(h0 ) = {q, 1}, then gi → −h0i gives a C ∗ representation of the Hecke algebra H∞ (−q) of [We1]. By the statement on p. 378 of [We1] this is possible only if −q = exp( 2πi l ) for some integer l with |l| ≥ 4. One checks easily that this is impossible if K + N > 4. Similar argument shows that Spec(h0 ) 6= {−q, −1}. If Spec(h0 ) = {−q, 1}, then gi → −h0i gives a C ∗ representation of the Hecke algebra H∞ (q) of [We1]. Since the Markov trace tr factors through such a representation, by (b) q−1 of Theorem 3.6 of [We1], there exists 1 ≤ k 0 ≤ K + N − 1 such that tr(−h0 ) = 1−q −k0 0

which leads to 2 = q −N + q −k , a contradiction. So if K + N > 4, then Spec(h0 ) = {q, −1}, and (3) is proved. When K+N ≤ 4, one checks easily from the argument above that the only possibility is N = 3, K = 1 and Spec(h0 ) = {i, −1} or Spec(h0 ) = {−i, −1}. (4) Assume K + N > 4 or N = 3, K = 1 and Spec(h0 ) = {i, −1}. Let Aˆ n be the subalgebra of An generated by 1, h1 , ...hn−1 . By the proof of (3) above and (b) of Theorem 3.6 of [We1], the simple ideals of Aˆ n are given by (N, K + N ) diagrams (cf. p. 367 of [We1]), which are in one-to-one correspondence with the irreducible descendants of f n by Theorem 1.6. The inclusion matrix of Aˆ n ⊂ Aˆ n+1 , given by (2.14) on P.369 of


[We1], is precisely the same as the inclusion matrix of An ⊂ An+1 by Theorem 1.6. Since Aˆ 0 = A0 ' C, (4) is proved. If K = 1, N = 3 and Spec(h0 ) = {−i, −1}, the simple ideals of Aˆ n are given by (1, 4) diagrams (cf. p. 367 of [We1]). But it is easy to check that such diagrams are in one-to-one correspondence with the irreducible descendants of f n by Theorem 1.6, and the inclusion matrix of Aˆ n ⊂ Aˆ n+1 , given by (2.14) on P.369 of [We1], is precisely the same as the inclusion matrix of An ⊂ An+1 by Theorem 1.6. So the same proof as above shows (4) in this case. (5) It follows from the proof of Proposition 3.1 in [X3] and (1) to (4) that the square in (5) of the theorem is a commuting square. ¯ has finite Finally let us prove the periodicity. For any irreducible λ ∈ CK , since [ρ·λ] statistical dimension, it decomposes as a direct sum of a finite number of irreducible P sectors. Let us write [ρ¯ · λ] = a Vρ¯λa [a], where Vρ¯λa are non-negative integers. Let P be the set of all [a]’s which appear in the above decompositions, i.e. Vρ¯λa 6= 0 for some λ ∈ CK . P is a finite set andPwe denote by ]P the number of elements in P . We claim µ µ [b] for some non-negative integers Vab . In fact for any µ ∈ CK , [a · µ] = [b]∈P Vab if [b] is any irreducible subsector of [a · µ], then [b] is also an irreducible subsector of ν ν λ [ρ¯ · λ · µ] = Σν [ρ¯ · ν]Nλµ , where Nλµ is as in Sect. 1.6. Hence [b] ∈ P . Notice λ → Vab gives a representation of Gr(CK ) by ]P × ]P matrices whose entries are non-negative integers. Recall from Sect. 1.6 that each irreducible λ ∈ CK can be assigned one of the N -colors C(λ) ∈ 0, 1, 2, · · · N − 1. We denote by Pi the set of those elements a ∈ P such that Vρ¯λa 6= 0 for some λ ∈ CK of color i. Notice for n > m (m is a fixed number which depends on N and K), all irreducible λ with color ≡ n(mod N ) appears as irreducible components of f n (recall f has color 1). 
Hence the minimal projections of $B_n$, which are in one-to-one correspondence with the irreducible subsectors of $\bar\rho f^n$, are in one-to-one correspondence with the elements of $P_i$ with $i \equiv n \pmod N$. The inclusion matrix for $B_n \subset B_{n+1}$ is given by $V^f_{ab}$, where $a \in P_i$, $b \in P_{i+1}$. It is the same as that of $B_{n+N} \subset B_{n+N+1}$ for $n > m$. Next let us show that $B_n \subset B_{n+N}$ is primitive, i.e. given any $a \in P_i$, $b \in P_i$, there exists $r$ such that $b$ is contained in $af^{Nr}$ (we use the notation $b \prec af^{Nr}$ in the following). Since $a \prec \bar\rho f^n$, it follows from Frobenius duality that $\bar\rho \prec a\bar f^n$. Therefore $b \prec \bar\rho f^{n+N} \prec a\bar f^n f^{n+N} \prec af^{Nr}$, where we use the fact that $\bar f^n f^n \prec f^{N(r-1)}$ for some positive integer $r$. The fact that $A_n \subset A_{n+1}$ is periodic with period $N$ and that $A_n \subset A_{n+N}$ is primitive follows from the lemma on p. 369 of [We1] and (4).

In the following we will give two well studied examples when $G = SU(2)$. In this case there are two nontrivial conformal inclusions: $SU(2)_{10} \subset Spin(5)$ and $SU(2)_{28} \subset G_2$. A representation of $SU(2)$ is denoted by its spin $j$, which is a half integer. Let us consider these two cases separately. The fusion ring for $SU(2)$ is well known. The rule which will be used is: $\frac12 \cdot j = (j - \frac12) \oplus (j + \frac12)$.

Example 1. $SU(2)_{10} \subset Spin(5)$. In this case $K = 10$ and the subfactor $M \supset a_f(M)$ has index $4\cos^2\frac{\pi}{12}$. $\gamma = 0 \oplus 3$ (cf. [CIZ]). The principal graph can be either $A_{11}$ or $E_6$ (see [GHJ]). From (2) of Corollary 3.7 we have $\langle a_f\bar a_f a_f, a_f\bar a_f a_f\rangle = \langle f^6, 0 \oplus 3\rangle = 6$, while it is easy to check that if the principal graph is $A_{11}$, then $\langle a_f\bar a_f a_f, a_f\bar a_f a_f\rangle = 5$. Therefore the principal graph of $M \supset a_f(M)$ is $E_6$.

Example 2. $SU(2)_{28} \subset G_2$. In this case $K = 28$ and the subfactor $M \supset a_f(M)$ has index $4\cos^2\frac{\pi}{30}$. $\gamma = 0 \oplus 5 \oplus 9 \oplus 14$ (cf. [CIZ]). The principal graph can only be $A_{29}$, $D_{16}$ or $E_8$ by [GHJ].
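The multiplicities that enter these two $SU(2)$ examples can be checked by iterating the fusion rule with $f$ (spin 1/2). The following is only a minimal sketch and not part of the original argument: it assumes the usual level-$K$ truncation, in which sectors are labelled by twice the spin and labels outside $0, \dots, K$ are dropped, and the function names `tensor_with_f`, `f_power` and `pairing` are hypothetical.

```python
from collections import Counter

def tensor_with_f(mult, K):
    """Multiply a formal sum of SU(2)_K sectors by f (spin 1/2).

    Sectors are labelled by a = 2 * spin, with 0 <= a <= K (assumed
    level-K truncation). The rule f * j = (j - 1/2) + (j + 1/2)
    becomes a -> a - 1, a + 1, dropping labels outside 0..K.
    """
    out = Counter()
    for a, m in mult.items():
        for b in (a - 1, a + 1):
            if 0 <= b <= K:
                out[b] += m
    return out

def f_power(n, K):
    """Multiplicities of the irreducible sectors in f^n at level K."""
    mult = Counter({0: 1})  # start from the trivial sector
    for _ in range(n):
        mult = tensor_with_f(mult, K)
    return mult

def pairing(n, K, spins):
    """<f^n, sum of the sectors with the given integer spins>."""
    p = f_power(n, K)
    return sum(p[2 * s] for s in spins)

# Example 1 (K = 10): <f^6, 0 + 3>, and the A_11 value <f^3, f^3>.
print(pairing(6, 10, [0, 3]))                          # 6
print(sum(m * m for m in f_power(3, 10).values()))     # 5

# Example 2 (K = 28): <f^10, 0 + 5 + 9 + 14>, and <f^5, f^5> = <f^10, 0>.
print(pairing(10, 28, [0, 5, 9, 14]))                  # 43
print(sum(m * m for m in f_power(5, 28).values()))     # 42
```

In this labelling spin 3 is the label 6 and spin 5 is the label 10, so each contributes exactly one path from the trivial sector, which is where the extra 1 in both examples comes from.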


From (2) of Corollary 3.7 we have $\langle (a_f\bar a_f)^2 a_f, (a_f\bar a_f)^2 a_f\rangle = \langle f^{10}, \rho\bar\rho = 0 \oplus 5 \oplus 9 \oplus 14\rangle$. Notice that the spin 5 representation appears in the irreducible decomposition of $f^{10}$ with multiplicity 1. Hence $\langle (a_f\bar a_f)^2 a_f, (a_f\bar a_f)^2 a_f\rangle = \langle f^{10}, 0\rangle + 1 = 43$. If $M \supset a_f(M)$ has principal graph $A_{29}$ or $D_{16}$, it is easy to check that $\langle (a_f\bar a_f)^2 a_f, (a_f\bar a_f)^2 a_f\rangle = \langle f^{10}, 0\rangle = 5^2 + 4^2 + 1 = 42$. Therefore the principal graph of $M \supset a_f(M)$ is $E_8$. Hence there exist hyperfinite type II$_1$ subfactors with principal graph $E_8$ by [Po]. The original proof of this fact (see [I2]) is completed by some tedious but straightforward calculations of about 15 matrices.

3.2. More properties. As mentioned in Sect. 3.1 (after Corollary 3.2), there are in fact two kinds of braided endomorphisms $a_\lambda$ and $\tilde a_\lambda$ associated to each $\lambda \in C_K$. In general $[a_\lambda] \neq [\tilde a_\lambda]$ (see Lemma 3.2 below). Recall from Sect. 1.6 that $\Delta_\lambda = \frac{c_2(\lambda)}{K+N}$, where $c_2(\lambda)$ is the value of the Casimir operator on the representation of $SU(N)$ labeled by $\lambda$. Then we have:

Lemma 3.1. Suppose $\lambda \in P_{++}^{(K)}$. If for any $\lambda' \prec \lambda\cdot f$, $\Delta_{\lambda'} - \Delta_\lambda - \Delta_f$ are integers, then $\lambda$ is the trivial representation.

Proof. Recall for $\lambda = \sum_{1\le j\le N-1} \lambda_j\Lambda_j$, where $\lambda_j \ge 0$, $\sum_{1\le j\le N-1}\lambda_j \le K$ (we also write such $\lambda$ as $[\lambda_1, \dots, \lambda_{N-1}]$),

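\[ c_2(\lambda) = \frac{1}{2N}\sum_{1\le j\le N-1} j(N-j)\lambda_j^2 + \frac{1}{N}\sum_{1\le j<k\le N-1} j(N-k)\lambda_j\lambda_k + \frac12\sum_{1\le j\le N-1} j(N-j)\lambda_j \]
and $\Delta_\lambda = \frac{c_2(\lambda)}{K+N}$. Also
\[ \lambda\cdot f = [\lambda_1+1, \lambda_2, \dots, \lambda_{N-1}] + [\lambda_1-1, \lambda_2+1, \dots, \lambda_{N-1}] + [\lambda_1, \lambda_2-1, \lambda_3+1, \dots, \lambda_{N-1}] + \cdots + [\lambda_1, \dots, \lambda_{N-2}-1, \lambda_{N-1}+1] + [\lambda_1, \dots, \lambda_{N-2}, \lambda_{N-1}-1], \]
where any term on the right-hand side which is outside $P_{++}^{(K)}$ is defined to be 0. There are several cases to consider: (1) If $\lambda' = [\lambda_1+1, \lambda_2, \dots, \lambda_{N-1}]$, then by the formula given above it is easy to check: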

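\[ \Delta_{\lambda'} - \Delta_\lambda - \Delta_f = \frac{1}{N(K+N)}\sum_{1\le j\le N-1} (N-j)\lambda_j. \]
If $\lambda \neq [0, \dots, 0]$, then
\[ 0 < \frac{1}{N(K+N)}\sum_{1\le j\le N-1}(N-j)\lambda_j < \frac{1}{K+N}\sum_{1\le j\le N-1}\lambda_j < 1. \]
Therefore if $\Delta_{\lambda'} - \Delta_\lambda - \Delta_f$ is an integer, then $\lambda = [0, \dots, 0]$. (2) If $\lambda' = [\lambda_1, \lambda_2, \dots, \lambda_{N-1}-1]$ and $\lambda \neq [0, \dots, 0]$, one checks that
\[ \Delta_{\lambda'} - \Delta_\lambda - \Delta_f = -\frac{1}{N(K+N)}\sum_{1\le j\le N-1}\lambda_j j - \frac{1}{N+K}(N-1). \]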


It follows that
\[ 0 < |\Delta_{\lambda'} - \Delta_\lambda - \Delta_f| < \frac{1}{N+K}(K+N-1) < 1. \]
Hence if $\Delta_{\lambda'} - \Delta_\lambda - \Delta_f$ is an integer, then $\lambda = [0, \dots, 0]$. (3) Assume $[\lambda_1+1, \lambda_2, \dots, \lambda_{N-1}]$, $[\lambda_1, \lambda_2, \dots, \lambda_{N-1}-1]$ are not contained in $\lambda\cdot f$. Let $\lambda' = [\lambda_1, \dots, \lambda_i-1, \lambda_{i+1}+1, \dots, \lambda_{N-1}]$ with $1 \le i \le N-2$. One checks that
\[ c_2(\lambda') - c_2(\lambda) - c_2(f) = \frac{1}{N}\sum_{1\le j\le N-1}\lambda_j(N-j) - \sum_{1\le j\le i}\lambda_j - i. \]
It follows that
\[ |c_2(\lambda') - c_2(\lambda) - c_2(f)| = \Big|\frac{1}{N}\sum_{i<j\le N-1}\lambda_j(N-j) - \frac{1}{N}\sum_{1\le j\le i}\lambda_j j - i\Big| \le \sum_{1\le j\le N-1}\lambda_j + i < K+N. \]
Hence if $\Delta_{\lambda'} - \Delta_\lambda - \Delta_f$ is an integer, then $c_2(\lambda') - c_2(\lambda) - c_2(f) = 0$, i.e.,
\[ \frac{1}{N}\sum_{1\le j\le N-1}\lambda_j(N-j) = \sum_{1\le j\le i}(\lambda_j+1). \]
Denote by $G(i) = \sum_{1\le j\le i}(\lambda_j+1)$. $G(i)$ is a strictly increasing function of $i$. So there is at most one solution to the above equation. It follows that $\lambda\cdot f$ is irreducible. By the fusion rules with $f$, it follows that $\lambda = \sigma^m(0)$ with $0 \le m \le N-1$. In this case $\lambda' = \sigma^m(f) = (0, 0, \dots, K-1, 1, 0, \dots, 0)$, where there are $m-1$ zeros before $K-1$. One checks that $c_2(\lambda') - c_2(\lambda) - c_2(f) = 0$ is equivalent to
\[ \frac{1}{N}K(N-m) = m + K, \]
which has a unique solution only for $m = 0$, i.e., $\lambda$ is the trivial representation.
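The three Casimir differences used in cases (1) to (3) of this proof can be verified with exact rational arithmetic. The sketch below is an illustration only: it assumes the formula for $c_2(\lambda)$ quoted in the proof, treats the identities as purely algebraic (so the level bound is used only to enumerate test weights), and the helper names `c2`, `weights_up_to_level` and `check_case_formulas` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def c2(lam):
    """Casimir value of the SU(N) weight lam = [lam_1, ..., lam_{N-1}]."""
    N = len(lam) + 1
    quad = sum(Fraction(j * (N - j), 2 * N) * lam[j - 1] ** 2
               for j in range(1, N))
    cross = sum(Fraction(j * (N - k), N) * lam[j - 1] * lam[k - 1]
                for j in range(1, N) for k in range(j + 1, N))
    lin = sum(Fraction(j * (N - j), 2) * lam[j - 1] for j in range(1, N))
    return quad + cross + lin

def weights_up_to_level(N, K):
    """All weights with non-negative entries and lam_1 + ... <= K."""
    for lam in product(range(K + 1), repeat=N - 1):
        if sum(lam) <= K:
            yield list(lam)

def check_case_formulas(N, K):
    """Check c2(lam') - c2(lam) - c2(f) in cases (1), (2), (3)."""
    cf = c2([1] + [0] * (N - 2))  # f is the vector representation
    for lam in weights_up_to_level(N, K):
        base = Fraction(sum(lam[j - 1] * (N - j) for j in range(1, N)), N)
        # Case (1): lam' = [lam_1 + 1, lam_2, ...]
        up = lam.copy()
        up[0] += 1
        if c2(up) - c2(lam) - cf != base:
            return False
        # Case (2): lam' = [lam_1, ..., lam_{N-1} - 1]
        if lam[N - 2] >= 1:
            down = lam.copy()
            down[N - 2] -= 1
            expected = -Fraction(sum(j * lam[j - 1] for j in range(1, N)), N) - (N - 1)
            if c2(down) - c2(lam) - cf != expected:
                return False
        # Case (3): lam' = [..., lam_i - 1, lam_{i+1} + 1, ...], 1 <= i <= N-2
        for i in range(1, N - 1):
            if lam[i - 1] >= 1:
                mid = lam.copy()
                mid[i - 1] -= 1
                mid[i] += 1
                expected = base - sum(lam[:i]) - i
                if c2(mid) - c2(lam) - cf != expected:
                    return False
    return True

print(check_case_formulas(3, 4))
print(check_case_formulas(4, 3))
```

Dividing each difference by $K+N$ gives the corresponding value of $\Delta_{\lambda'} - \Delta_\lambda - \Delta_f$ used in the proof.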

Now we are ready to prove: Lemma 3.2. If the conformal inclusion SU (N ) ⊂ H is nontrivial, i.e. if γ is not a trivial representation of SU (N ), then [af ] is different from [˜af ]. ¯ af ρi ¯ = haf ρ, ¯ a˜ f ρi, ¯ Proof. Suppose [af ] = [˜af ]. Then haf , a˜ f i = haf , af i = haf ρ, where we have used Theorem 3.3 and the definitions of af and a˜ f . Since Hom(af , a˜ f ) ¯ a˜ f ρ), ¯ it follows Hom(af , a˜ f ) = Hom(af ρ, ¯ a˜ f ρ) ¯ = Hom(ρf, ¯ ρf ¯ ). Since ⊂ Hom(af ρ, 1 ∈ Hom(ρf, ¯ ρf ¯ ), it follows af = a˜ f . Hence fσ ρ = fσ˜ ρ. Let s = σ˜ ∗ σ ∈ Hom(γf, γf ). Then fσ (w) = fσ˜ (w), i.e. sfσ (w)s∗ = fσ (w). Since s ∈ Hom(γf, γf ), it follows that s ∈ Hom(fσ ρ, fσ ρ) = Hom(ρaf , ρaf ) ⊂ ρ(M ) (this last inclusion follows from (2) of Theorem 3.3). On the other hand, it follows from the proof of Theorem 3.1 that fσ (w) = γ(σ)w, fσ˜ (w) = γ(σ)w, ˜ so γ(s)w = w. It follows that s = w∗ γ(s)w = 1, where we use the fact s ∈ ρ(M ). By [Kac1], for any λ ≺ γ, 1λ ∈ Z. It follows that the


univalence of γ is constant 1. By Proposition 1.4.3 the eigenvalues of s ∈ Hom(γf, γf ) are given by exp 2πi(1λ0 − 1λ − 1f ) with λ ≺ γ, λ0 ≺ λ · f . Since s = 1, it follows that 1λ0 − 1λ − 1f = 0 for any λ ≺ γ, λ0 ≺ λ · f . By Lemma 3.1, we have γ is a sum of trivial representations. But from the proof of Proposition 2.4 γ contains the trivial representation exactly once, so γ is trivial, a contradiction. Notice [aλ ] and [˜aλ ] belong to Aρ . We have the following commutation relations between their descendants. Lemma 3.3. Let [x] and [y] be subsectors of [˜aλ ] and [aµ ] respectively. Then [xy] = [yx]. Proof. Recall from Sect. 1.4 that σ denotes the over crossing braiding operator. We shall use σ˜ to denote the under crossing braiding operator only in this proof. ¯ )0 , vv ∗ = q ∈ Assume x = u∗ a˜ λ u, y = v ∗ aµ v with uu∗ = ρ ∈ aλ˜ (M )0 = ρλ(M 0 0 aµ = ρµ(M ¯ ) . Then xy = u∗ a˜ λ (v ∗ )˜aλ aµ a˜ λ (v)u, yx = v ∗ aµ (u∗ )aµ a˜ λ aµ (u)v. Notice

a˜ λ aµ ρ(x) ¯ = ρλµ(x) ¯ = ρµ ¯ σ λ(x) = ρ(σ ¯ ∗ )ρµλ(x) ¯ ρ(σ) ¯ ∗ ¯ ρ(σ) ¯ , ∀x ∈ M, = ρ(σ ¯ )aµ a˜ λ ρ(x)

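where $\sigma \in \mathrm{Hom}(\lambda\mu, \mu\lambda)$ is the braiding operator. Let us show
\[ \tilde a_\lambda a_\mu(x) = \bar\rho(\sigma^*)\,a_\mu\tilde a_\lambda(x)\,\bar\rho(\sigma), \quad \forall x \in M. \]
By applying $\rho$ to both sides, it is sufficient to show $\lambda_{\tilde\sigma}\mu_\sigma\rho = \gamma(\sigma^*)\mu_\sigma\lambda_{\tilde\sigma}\rho\,\gamma(\sigma)$. Since $\gamma(\sigma) \in \mathrm{Hom}(\gamma\lambda\mu, \gamma\mu\lambda)$, it is sufficient to show $\lambda_{\tilde\sigma}(\mu_\sigma(w)) = \gamma(\sigma^*)\mu_\sigma\lambda_{\tilde\sigma}(w)\gamma(\sigma)$. In terms of diagrams (using $\delta_\sigma(w) = \gamma(\sigma)w$, $\forall\delta$, as in the proof of Theorem 3.1):

[Figure: braid diagram on the strands $\gamma$, $\lambda$, $\mu$ representing $\lambda_{\tilde\sigma}(\mu_\sigma(w))$.]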

New Braided Endomorphisms from Conformal Inclusions

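$\gamma(\sigma^*)\mu_\sigma\lambda_{\tilde\sigma}(w)\gamma(\sigma)$ is represented by:

[Figure: braid diagram on the strands $\gamma$, $\lambda$, $\mu$.]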

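It follows from YBE that $\lambda_{\tilde\sigma}(\mu_\sigma(w)) = \gamma(\sigma^*)\mu_\sigma\lambda_{\tilde\sigma}(w)\gamma(\sigma)$. So we have $\tilde a_\lambda a_\mu(x) = \bar\rho(\sigma^*)\,a_\mu\tilde a_\lambda(x)\,\bar\rho(\sigma)$, $\forall x \in M$. Hence $xy = u^*\tilde a_\lambda(v^*)\,\bar\rho(\sigma^*)\,a_\mu\tilde a_\lambda\,\bar\rho(\sigma)\,\tilde a_\lambda(v)u$. To show $[xy] = [yx]$, it is sufficient to show:
\[ \bar\rho(\sigma)\,\tilde a_\lambda(v)uu^*\tilde a_\lambda(v^*)\,\bar\rho(\sigma^*) = a_\mu(u)vv^*a_\mu(u^*) \]
or
\[ \bar\rho(\sigma)\,\tilde a_\lambda(q)\,p\,\bar\rho(\sigma^*) = a_\mu(p)\,q. \]
We claim $\bar\rho(\sigma)\tilde a_\lambda(q)\bar\rho(\sigma^*) = q$ and $\bar\rho(\sigma)p\bar\rho(\sigma^*) = a_\mu(p)$. Since $\tilde a_\lambda(q)p = p\tilde a_\lambda(q)$ and $a_\mu(p)q = qa_\mu(p)$, the above claim will prove the lemma. Applying $\rho$ to both sides, it is sufficient to show
\[ \lambda_{\tilde\sigma}\rho(q) = \gamma(\sigma^*)\rho(q)\gamma(\sigma) \quad\text{and}\quad \gamma(\sigma)\rho(p)\gamma(\sigma^*) = \mu_\sigma\rho(p). \]
Notice $\rho(p) \in \gamma\lambda(M)'$, $\rho(q) \in \gamma\mu(M)'$. In terms of diagrams:

[Figure: diagrams on the strands $\gamma$, $\lambda$, $\mu$ representing $\lambda_{\tilde\sigma}\rho(q)$ and $\gamma(\sigma^*)\rho(q)\gamma(\sigma)$.]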


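It follows from YBE and BFE that $\lambda_{\tilde\sigma}\rho(q) = \gamma(\sigma^*)\rho(q)\gamma(\sigma)$. Similarly $\gamma(\sigma)\rho(p)\gamma(\sigma^*) = \mu_\sigma\rho(p)$ follows from the following diagrammatic identity:

[Figure: diagrammatic identity on the strands $\gamma$, $\lambda$, $\mu$.]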

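For any $\lambda \in C_k$, define $\varphi: [\lambda] \to A_\rho$ to be $\varphi([\lambda]) = [a_\lambda]$. This defines a map from $Gr(C_k)$ to $A_\rho$.

Lemma 3.4. $\varphi$ is a ring homomorphism.

Proof. It is clear from the definition that $[a_{\lambda\mu}] = [a_\lambda][a_\mu]$. It follows from (2) of Corollary 3.5 that $\varphi([\bar\lambda]) = \overline{\varphi([\lambda])}$. All we have to show is $[a_{\lambda+\mu}] = [a_\lambda] + [a_\mu]$. Let $[a_\lambda] + [a_\mu] = \sum_i m_i Y_i$, where $Y_i$ are irreducible subsectors of $[a_\lambda] + [a_\mu]$ and $m_i$ are positive integers. By Theorem 3.3, $\langle a_\mu\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle a_\mu, a_{\lambda+\mu}\rangle$ and $\langle a_\lambda\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle a_\lambda, a_{\lambda+\mu}\rangle$. It follows that $\langle([a_\lambda] + [a_\mu])\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle[a_\lambda] + [a_\mu], a_{\lambda+\mu}\rangle$, i.e. $\langle\sum_i m_i Y_i\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle\sum_i m_i Y_i, a_{\lambda+\mu}\rangle$. Since $\langle Y_i\bar\rho, a_{\lambda+\mu}\bar\rho\rangle \ge \langle Y_i, a_{\lambda+\mu}\rangle$, it follows that $\langle Y_i\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle Y_i, a_{\lambda+\mu}\rangle$. Hence $\langle Y_i, a_{\lambda+\mu}\rangle = \langle Y_i\bar\rho, a_{\lambda+\mu}\bar\rho\rangle = \langle Y_i\bar\rho, \bar\rho(\lambda+\mu)\rangle = \langle Y_i\bar\rho, ([a_\lambda] + [a_\mu])\bar\rho\rangle = \langle Y_i, [a_\lambda] + [a_\mu]\rangle$. Hence $[a_{\lambda+\mu}] = [a_\lambda] + [a_\mu] + X$. Since $d_{a_{\lambda+\mu}} = d_\lambda + d_\mu = d_{a_\lambda} + d_{a_\mu}$, where $d$ is the statistical dimension, $X$ must be 0.

Let $C \subset A_\rho$ be the subring generated by all the subsectors of $a_\lambda$. It follows from Lemma 3.3 that $C$ is generated by all the subsectors of $a_f^n$. We'd like to study the subring $C$. In the rest of this section, we shall restrict ourselves to the conformal inclusions in Sect. 2.1 except $\widehat{SU(2)}_{28} \subset \hat G_2$. We cannot prove Theorem 3.9 following the argument below for $\widehat{SU(2)}_{28} \subset \hat G_2$, since the analog of Theorem 1.6 for $G_2$ has not been proved, at least to our knowledge. But one checks easily by Example 2 at the end of Sect. 3.3 that Theorem 3.9 holds for $\widehat{SU(2)}_{28} \subset \hat G_2$ as well.

Lemma 3.5. $\mathrm{Hom}(a_\lambda\bar\rho, \sigma_j\bar\rho) = \mathrm{Hom}(a_\lambda, \sigma_j)$.

Proof. It is obvious that $\mathrm{Hom}(a_\lambda, \sigma_j) \subset \mathrm{Hom}(a_\lambda\bar\rho, \sigma_j\bar\rho)$. We need to show that if $T \in \mathrm{Hom}(a_\lambda\bar\rho, \sigma_j\bar\rho)$, then $T \in \mathrm{Hom}(a_\lambda, \sigma_j)$. Applying $\rho$ to both sides of $T a_\lambda\bar\rho(x) = \sigma_j\bar\rho(x)T$, $\forall x \in M$, we have $\rho(T)\gamma\lambda(x) = \gamma_j(x)\rho(T)$, i.e. $\rho(T) \in \mathrm{Hom}(\gamma\lambda, \gamma_j)$. To show $T a_\lambda(x) = \sigma_j(x)T$, $\forall x \in M$, it is sufficient to show $\rho(T)\lambda_\sigma\rho(x) = \rho\sigma_j(x)\rho(T)$ for all $x \in M$. Since $\rho(T) \in \mathrm{Hom}(\gamma\lambda, \gamma_j)$, $\rho(T)\lambda_\sigma\rho(x) = \rho\sigma_j(x)\rho(T)$ for any $x \in \bar\rho(M)$. By Proposition 2.6 any element $x \in M$ can be written as $x = x_1\cdot v_2$ with $\rho(v_2) = w$ and $x_1 \in \bar\rho(M)$. It is sufficient to show $\rho(T)\lambda_\sigma(w) = \rho\sigma_j(v_2)\rho(T)$. Let $\sigma \in$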


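$\mathrm{Hom}(\gamma\gamma_j, \gamma_j\gamma)$ be the unitary braiding operator. By Proposition 2.8, $\sigma w = \rho\sigma_j(v_2)$. The proof of $\rho(T)\lambda_\sigma(w) = \rho\sigma_j(v_2)\rho(T)$ is now rather similar to the proof of Theorem 3.3. We can represent $\rho(T)\lambda_\sigma(w)$ as:

[Figure: diagram on the strands $\gamma$, $\lambda$, $\gamma_j$ involving $w$, $\sigma^*_{\gamma,\gamma}$, $\rho(T)$ and $\gamma(\rho(T))$.]

By using $\sigma^*_{\gamma,\gamma}w = w$ we have

[Figure: simplified diagram for $\rho(T)\lambda_\sigma(w)$ on the strands $\gamma$, $\lambda$, $\gamma_j$.]

i.e.
\[ \rho(T)\lambda_\sigma(w) = \sigma_{\gamma,\gamma_j}\gamma(\rho(T))w = \sigma_{\gamma,\gamma_j}w\rho(T) = \rho\sigma_j(v_2)\rho(T), \]
where we have used $\gamma(\rho(T))w = w\rho(T)$.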

Let $V$ denote the set of all irreducible subsectors of $a_f^n$, where $n$ is an arbitrary integer. We use 1 to denote the identity sector. We have a representation of $Gr(C_k)$ on $V$ as follows: for any $[\lambda] \in Gr(C_k)$ and $a \in V$, $[\lambda]\cdot a := [a_\lambda]\cdot a = \sum_b V^\lambda_{ab}\,b$, where $V^\lambda_{ab}$ are nonnegative integers. In fact, $V^\lambda_{ab} = \langle a_\lambda a, b\rangle$.

For the conformal inclusion $SU(N) \subset H$, recall we denote by $\{\gamma_j\}$ the set of all positive energy representations of $LG$ obtained as the restriction of a level 1 irreducible representation. By Lemma 3.5,
\[ \langle a_\lambda, \sigma_j\rangle = \langle a_\lambda\bar\rho, \sigma_j\bar\rho\rangle = \langle\bar\rho\lambda, \sigma_j\bar\rho\rangle = \langle\lambda, \gamma_j\rangle. \]
Hence if $\langle\lambda, \gamma_j\rangle \neq 0$, $\sigma_j$ is a subsector of $a_\lambda$ (recall $\sigma_j$ is irreducible). In particular $\sigma_j \in V$. These $\sigma_j \in V$ will be called special nodes (cf. [PZ]) in the following. Since in the proof of Lemma 3.5 we can replace over crossing braidings by under crossing braidings, it follows that if $\langle\lambda, \gamma_j\rangle \neq 0$, $\sigma_j$ is a subsector of $\tilde a_\lambda$. By Lemma 3.3, $[\sigma_j a_\mu] = [a_\mu\sigma_j]$ for any $\mu$. The $[\sigma_j]$'s form a commutative subalgebra (with identity 1)


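which is denoted by $\Sigma$. The irreducible representations of $\Sigma$ are given by (cf. Sect. 1.6 and Sect. 2.2)
\[ \sigma_i \to \frac{S_{\sigma_i\sigma_j}}{S_{1\sigma_j}}, \]
where $S_{\sigma_i\sigma_j}$ are the S-matrices of $LH$ at level 1. Define the matrix $N_c$ by $N^b_{ca} = \langle ca, b\rangle$ for $a, b, c \in V$. Then $V^\lambda = \sum_c V^\lambda_{1c}N_c$. Since $[a_{\bar\lambda}] = [\bar a_\lambda]$ and $[\sigma_j a_\lambda] = [a_\lambda\sigma_j]$, the matrices $V^\lambda$, $N_{\sigma_j}$ are commuting normal matrices, so they can be simultaneously diagonalized. Recall the irreducible representations of $Gr(C_k)$ are given by
\[ \lambda \to \frac{S^{(\mu)}_\lambda}{S^{(\mu)}_1}. \]
Assume
\[ V^\lambda_{ab} = \sum_{\mu,i,s \in (\mathrm{Exp})} \frac{S^{(\mu)}_\lambda}{S^{(\mu)}_1}\cdot\psi_a^{(\mu,i,s)}\psi_b^{(\mu,i,s)*}, \]
where $\psi_a^{(\mu,i,s)}$ are normalized orthogonal eigenvectors of $V^\lambda$ (resp. $N_{\sigma_i}$) with eigenvalue $S^{(\mu)}_\lambda/S^{(\mu)}_1$ (resp. $S_{\sigma_i\sigma_j}/S_{1\sigma_j}$). (Exp) is a set of $\mu, i, s$'s and $s$ is an index indicating the multiplicity of $\mu, i$. We denote by Exp the set of different $\mu$ in (Exp). Recall if a representation is denoted by 1, it will always be the vacuum representation.

Theorem 3.9.
(1) $\mu \in \mathrm{Exp}$ if and only if $\langle\mu, \gamma_j\rangle \neq 0$ for some $\gamma_j$;
(2) $\psi_1^{(1)} = \sqrt{S_1^{(1)}S_{11}}$;
(3) $\sum_{a \in V} d_a^2 = \dfrac{1}{S_1^{(1)}S_{11}}$;
(4) $\psi_{\sigma_j}^{(\mu,i,s)} = \dfrac{S_{\sigma_j\sigma_i}}{S_{1\sigma_i}}\,\psi_1^{(\mu,i,s)}$.

Proof. (1) By Lemma 3.5, $\langle a_\lambda, \sigma_j\rangle = \langle a_\lambda\bar\rho, \sigma_j\bar\rho\rangle = \langle\bar\rho\lambda, \sigma_j\bar\rho\rangle = \langle\lambda, \gamma_j\rangle$. Hence if $\langle\lambda, \gamma_j\rangle \neq 0$, $\sigma_j$ is a subsector of $a_\lambda$ (recall $\sigma_j$ is irreducible). In particular $\sigma_j \in V$. Now $V^\lambda_{1\sigma_j} = \langle a_\lambda, \sigma_j\rangle = \langle\lambda, \gamma_j\rangle$. Denote by $\chi_j(\tau)$ the character of $\gamma_j$ and by $\chi_\lambda(\tau)$ the character of $\lambda$. Then we have $\chi_j(\tau) = \sum_\lambda V^\lambda_{1\sigma_j}\chi_\lambda(\tau)$. It follows from p. 161 of [Kac1] that
\[ \sum_j S_{1\sigma_j}V^\nu_{1\sigma_j} = \sum_\lambda V^\lambda_{11}S^{(\nu)}_\lambda, \]
where $S_{1\sigma_j}$ is the S-matrix for $LH$ at level 1 and $S_{1\sigma_j} > 0$ for all $\sigma_j$. But
\[ \sum_j S_{1\sigma_j}V^\nu_{1\sigma_j} = \sum_\lambda V^\lambda_{11}S^{(\nu)}_\lambda = \sum_{\mu,i,s \in (\mathrm{Exp}),\,\lambda} \frac{S^{(\mu)*}_\lambda}{S^{(\mu)}_1}\,S^{(\nu)}_\lambda\,\psi_1^{(\mu,i,s)}\psi_1^{(\mu,i,s)*} = \frac{1}{S^{(\nu)}_1}\cdot\sum_{i,s}|\psi_1^{(\nu,i,s)}|^2 \qquad (*), \]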


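where we have used $\sum_\lambda S^{(\mu)*}_\lambda S^{(\nu)}_\lambda = \delta_{\mu,\nu}$ and $V^\lambda_{11} = (V^\lambda_{11})^*$, since $V^\lambda_{11}$ is an integer. Hence if $V^\nu_{1\sigma_j} \neq 0$ for some $\sigma_j$, then $\psi_1^{(\nu,i,s)} \neq 0$ for some $i, s$, i.e., $\nu \in \mathrm{Exp}$. Next we need to show that if $\zeta \in \mathrm{Exp}$, then $V^\zeta_{1\sigma_j} \neq 0$ for some $\sigma_j$. In view of the above identity, all we need to show is that $\sum_{j,t}|\psi_1^{(\zeta,j,t)}|^2 \neq 0$. Recall $V^\lambda = \sum_c V^\lambda_{1c}N_c$, i.e.,
\[ V^\lambda_{ab} = \sum_{\mu,i,s \in (\mathrm{Exp})} \frac{S^{(\mu)}_\lambda}{S^{(\mu)}_1}\cdot\psi_a^{(\mu,i,s)}\psi_b^{(\mu,i,s)*} = \sum_{\nu,j,t \in (\mathrm{Exp}),\,c} \frac{S^{(\nu)}_\lambda}{S^{(\nu)}_1}\cdot\psi_1^{(\nu,j,t)}\psi_c^{(\nu,j,t)*}N^b_{ca}. \]
Multiply both sides of the above equation by $S^{(\zeta)*}_\lambda$ and sum over $\lambda$; we get:
\[ \sum_{i,s}\psi_a^{(\zeta,i,s)}\psi_b^{(\zeta,i,s)*} = \sum_{j,t,c}\psi_1^{(\zeta,j,t)}\psi_c^{(\zeta,j,t)*}N^b_{ca}. \]
Let $a = b$; we get
\[ \sum_{i,s}|\psi_a^{(\zeta,i,s)}|^2 = \sum_{j,t,c}\psi_1^{(\zeta,j,t)}\psi_c^{(\zeta,j,t)*}N^a_{ca}. \]
Hence if $\psi_1^{(\zeta,j,t)} = 0$ for all $(j,t)$, we will have $\psi_a^{(\zeta,i,s)} = 0$ for any $a \in V$, which contradicts $\sum_a|\psi_a^{(\zeta,i,s)}|^2 = 1$.

(2) Let $G_{ab} = V^f_{ab}$. Since every $a \in V$ is an irreducible descendant of some $a_f^n$ ($n > 0$), it follows that $G_{ab}$ is irreducible (see [GHJ] for definition). By the Perron–Frobenius theorem, there exists a unique unit eigenvector with nonnegative entries corresponding to the largest eigenvalue $S^{(1)}_f/S^{(1)}_1$ of $G_{ab}$. In our notation this eigenvector is $\{\psi_a^{(1)}\}$. Using (*) of (1) above we have:
\[ \frac{(\psi_1^{(1)})^2}{S_1^{(1)}} = \sum_j S_{1\sigma_j}V^1_{1\sigma_j} = S_{11}. \]
Hence $\psi_1^{(1)} = \sqrt{S_1^{(1)}S_{11}}$.

(3) From $a_f a = \sum_b V^f_{ab}\,b$ we have $d_f d_a = \sum_b V^f_{ab}d_b$. It follows that $\{d_b\}_{b \in V}$ is a Perron–Frobenius eigenvector of $V^f_{ab}$. Hence $d_b = x\psi_b^{(1)}$, where $x > 0$ is a fixed constant. Since $d_1 = 1$, $x = 1/\psi_1^{(1)}$ and $d_b = \psi_b^{(1)}/\psi_1^{(1)}$. So
\[ \sum_{a \in V}d_a^2 = \frac{1}{(\psi_1^{(1)})^2}\sum_{a \in V}(\psi_a^{(1)})^2 = \frac{1}{(\psi_1^{(1)})^2} = \frac{1}{S_1^{(1)}S_{11}}. \]

(4) Notice that
\[ N^b_{c\sigma_j} = \langle c\sigma_j, b\rangle = \langle\sigma_j c, b\rangle = N^b_{\sigma_j c}. \]
So
\[ V^\lambda_{\sigma_j b} = \sum_c V^\lambda_{1c}N^b_{\sigma_j c}, \]
i.e.,
\[ \sum_{\mu,i,s \in (\mathrm{Exp})} \frac{S^{(\mu)}_\lambda}{S^{(\mu)}_1}\cdot\psi_{\sigma_j}^{(\mu,i,s)}\psi_b^{(\mu,i,s)*} = \sum_{\nu,k,t \in (\mathrm{Exp})} \frac{S^{(\nu)}_\lambda}{S^{(\nu)}_1}\cdot\frac{S_{\sigma_j\sigma_k}}{S_{1,\sigma_k}}\cdot\psi_1^{(\nu,k,t)}\psi_b^{(\nu,k,t)*}. \]
It follows immediately that $\psi_{\sigma_j}^{(\mu,i,s)} = \dfrac{S_{\sigma_j\sigma_i}}{S_{1\sigma_i}}\,\psi_1^{(\mu,i,s)}$.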
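As a numerical sanity check of the Perron–Frobenius relation $d_f d_a = \sum_b V^f_{ab}d_b$ and of statement (3), one can use the $E_6$ example of Sect. 4.1 ($SU(2)$ at level 10). The sketch below is only an illustration under several assumptions: the dimensions $d_{a_j}$ are taken to be the standard quantum dimensions $\sin((2j+1)\pi/12)/\sin(\pi/12)$, the adjacency of the fusion diagram of $a_{1/2}$ is read off from Sect. 4.1 (an $E_6$ Dynkin diagram), $S_1^{(1)}$ is the standard $SU(2)_{10}$ value, and $S_{11} = 1/2$ is assumed for $LH$ at level 1 (an Ising-type S-matrix, in line with the remark in Sect. 4.1 that $(0, b, a_5)$ generate the Ising fusion ring). All identifiers are hypothetical.

```python
import math

K = 10          # level of SU(2) in the E6 example
H = K + 2       # k + 2

def qdim(two_j):
    """Quantum dimension of the spin-j representation at level K (two_j = 2j)."""
    return math.sin((two_j + 1) * math.pi / H) / math.sin(math.pi / H)

# Assumed vertex set V with b = a_{3/2} - a_{9/2} and c = a_{9/2} (Sect. 4.1).
dims = {
    "0": qdim(0),
    "a1/2": qdim(1),
    "a1": qdim(2),
    "b": qdim(3) - qdim(9),
    "c": qdim(9),
    "a5": qdim(10),
}

# Assumed adjacency of the fusion diagram of a_{1/2} (E6 Dynkin diagram).
edges = [("0", "a1/2"), ("a1/2", "a1"), ("a1", "b"), ("a1", "c"), ("c", "a5")]
neighbours = {v: [] for v in dims}
for u, v in edges:
    neighbours[u].append(v)
    neighbours[v].append(u)

def perron_frobenius_defect():
    """Largest violation of d_f * d_a = sum_b V^f_{ab} d_b over a in V."""
    d_f = dims["a1/2"]
    return max(abs(d_f * dims[a] - sum(dims[b] for b in neighbours[a]))
               for a in dims)

def global_index():
    """Left-hand side of statement (3): sum of d_a^2 over a in V."""
    return sum(d * d for d in dims.values())

def predicted_index():
    """Right-hand side 1 / (S_1^{(1)} S_11) under the stated assumptions."""
    s_su2 = math.sqrt(2.0 / H) * math.sin(math.pi / H)  # S_1^{(1)} for SU(2)_10
    s_h = 0.5                                           # assumed S_11 of LH at level 1
    return 1.0 / (s_su2 * s_h)

if __name__ == "__main__":
    print(perron_frobenius_defect(), global_index(), predicted_index())
```

Under these assumptions both sides of (3) agree numerically, and the vector of dimensions is an eigenvector of the adjacency matrix with eigenvalue $d_{a_{1/2}} = 2\cos(\pi/12)$.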


Let $(G_i)_{ab} = V^{\Lambda_i}_{ab}$ with $a, b \in V$. In [PZ], certain postulations on graphs are given based on the connections with integrable lattice models. We shall show that our $G_i$ satisfy all the postulations i) to ix) of 2.2 in [PZ] except (2) of ix). Let $\{\sigma_j\}$ denote all the level 1 representations of $LH$ and $\chi_{\sigma_j}$ the corresponding character. Then $\chi_{\sigma_j} = \sum_{\lambda \in P^{(K)}_{++}}\langle\lambda, \nu_j\rangle\chi_\lambda$, and $Z = \sum_j|\sum_{\lambda \in P^{(K)}_{++}}\langle\lambda, \nu_j\rangle\chi_\lambda|^2$ is called the partition function associated with $G \subset H$. Recall $\gamma$ is the representation of $LSU(N)$ obtained by restriction from the vacuum representation of $LH$ to $LSU(N)$.

Lemma 3.6. For all the conformal inclusions listed in Sect. 2.2 except $(\hat A_8)_1 \subset \hat E_8$, $\gamma$ has color 0, i.e., for any $\lambda$ with $\langle\lambda, \gamma\rangle \ge 1$, $\tau(\lambda) \equiv 0 \bmod N$.

Proof. For the $SU(2)$ and $SU(3)$ cases, one can check the lemma is true by using the explicit form of $\nu$ given in [PZ]. For series (a) and (b), notice in both cases the generator $w = \exp\frac{2\pi i}{N}$ of the center of $SU(N)$ is embedded as $w \to w^2$ in $H$, which acts trivially on the vacuum representation of $LH$. Hence the center of $SU(N)$ acts trivially on any $\lambda$ with $\langle\lambda, \gamma\rangle \ge 1$, so $\tau(\lambda) \equiv 0 \bmod N$. For series (c) and (d), the generator $w = \exp\frac{2\pi i}{N}$ is embedded trivially (i.e. $w \to 1$ in $H$). Hence for any $\lambda$ with $\langle\lambda, \gamma\rangle \ge 1$, $\tau(\lambda) \equiv 0 \bmod N$.

We are now ready to prove the following:

Theorem 3.10. The following statements (i) and (ii) are true for all the conformal inclusions in Sect. 2.2 except $(\hat A_8)_1 \subset \hat E_8$. The remaining statements are true for all the conformal inclusions in Sect. 2.2.
(i) To each $a \in V$ may be attached a $\mathbb{Z}/N\mathbb{Z}$ grading $\tau(a)$ such that $(G_i)_{ab} \neq 0$ only if $\tau(b) = \tau(a) + i \bmod N$.
(ii) The complex conjugation $a \to \bar a$ satisfies $\tau(\bar a) = -\tau(a)$ and $(G_i)_{ab} = (G_i)_{\bar b\bar a}$.
(iii) The matrices $G_i$ are pairwise transposed of one another: ${}^tG_i = G_{N-i}$.
(iv) The matrices $G_i$ commute among themselves and are diagonalizable in an orthonormal basis common to all of them.
(v) The corresponding eigenvalues of $G_1, \ldots, G_{N-1}$ are given by
\[ \frac{S^{(\lambda)}_{\Lambda_1}}{S^{(\lambda)}_1}, \ldots, \frac{S^{(\lambda)}_{\Lambda_i}}{S^{(\lambda)}_1}, \ldots, \frac{S^{(\lambda)}_{\Lambda_{N-1}}}{S^{(\lambda)}_1}; \]
some of these $\lambda$ may occur with multiplicities larger than 1 but $\langle\lambda, \gamma_j\rangle \neq 0$ for some $\gamma_j$.
(vi) The graph defined by $G_1$ admits one extremal vertex, i.e., a vertex on which only one edge is ending and from which only one edge is starting.

some of these λ may occur with multiplicities larger than 1 but hλ, γj i 6= 0 for some γj . (vi) The graph defined by G1 admits one extremal vertex, i.e., a vertex on which only one edge is ending and from which only one edge is starting. Proof. (i) Let us first show if a is an irreducible subsector of anf and am f , then n ≡ m mod(N ). Notice under the assumption, 1 is a subsector of [anf a¯ m f ] = [af n f¯m ]. By ¯ af n f¯m ρi ¯ = hρ, ¯ ρf ¯ n f¯m i = hγ, f n f¯m i ≥ 1, Theorem 3.3, we have h1, af n f¯m i = hρ, m where we have also used [¯af ] = [af¯m ] (see (2) of Corollary 3.5). By Lemma


3.6 $\gamma$ has color 0, but all descendants of $f^n\bar f^m$ have color $n - m$, so it follows that $n \equiv m \bmod N$. So it makes sense to define $\tau(a) \equiv n \bmod N$ if $a$ is an irreducible subsector of $a_f^n$. It follows from such a definition that $\tau(ab) \equiv \tau(a) + \tau(b) \bmod N$. Since $a_{\Lambda_i}$ has color $i$, it follows that $(G_i)_{ab} \neq 0$ only if $\tau(b) = \tau(a) + i \bmod N$.

(ii) $\tau(\bar a) = -\tau(a)$ follows from the definition of $\tau(a)$ in (i). $(G_i)_{ab} = \langle a_{\Lambda_i}a, b\rangle = \langle\bar b a_{\Lambda_i}, \bar a\rangle = \langle a_{\Lambda_i}\bar b, \bar a\rangle = (G_i)_{\bar b\bar a}$, where we have used $[\bar b a_{\Lambda_i}] = [a_{\Lambda_i}\bar b]$, which follows from (1) of Theorem 3.6.

(iii) $(G_i)_{ab} = \langle a_{\Lambda_i}a, b\rangle = \langle a, \bar a_{\Lambda_i}b\rangle = \langle a, a_{\bar\Lambda_i}b\rangle = \langle a, a_{\Lambda_{N-i}}b\rangle = ({}^tG_{N-i})_{ab}$, where we have used $[\bar a_{\Lambda_i}] = [a_{\bar\Lambda_i}]$, which follows from (2) of Corollary 3.5.

(iv) and (v) follow trivially from the definitions of $G_i$, Lemma 3.3 and (1) of Theorem 3.9. As for (vi), we claim the extremal vertex is the identity sector. All we need to show is that $a_f$ is irreducible, which follows from (1) of Corollary 3.5.

Remark. The only postulate in [PZ] which remains to be proved or disproved is (2) of ix) in [PZ], which says $\phi^{(\lambda)}_1 > 0$ for any $\lambda \in (\mathrm{Exp})$.

4. More Examples

In this section we give some examples as an application of the general theory developed in the previous sections. We will be mainly interested in the subring $C$ which is generated by all the irreducible descendants of $a_f^n$, where $n$ is a positive integer. What we mean by the fusion diagram of $a_{\Lambda_i}$ is the oriented graph determined by $G_i = V^{\Lambda_i}$. (In the case $\bar\Lambda_i = \Lambda_i$, the graph is unoriented.) Clearly these graphs determine the multiplication of $a_\lambda$ with the elements of $C$ for any $\lambda$. The following simple lemma will be used repeatedly in Sect. 4.1 and Sect. 4.2.

Lemma 4.1.
(1) If $\langle\lambda\gamma, \lambda\rangle = m \le 3$, then $a_\lambda$ decomposes into a sum of $m$ different irreducible subsectors.
(2) If $\langle\lambda\gamma, \mu\rangle = n$ and $\langle\mu\gamma, \mu\rangle = 1$, then $a_\mu$ is irreducible and $na_\mu$ is a subsector of $a_\lambda$.
(3) Let $\sigma_j$ be the special nodes (see the definition before Theorem 3.9) corresponding to $H_{\sigma_j}$ of the level 1 representation of $LH$. Let $\gamma_j$ denote the representation of $LSU(N)$ obtained by restriction of $H_{\sigma_j}$ to $LSU(N)$. Then $\langle\sigma_j, a_\lambda\rangle = \langle\gamma_j, \lambda\rangle$.

Proof. (1) By Theorem 3.3,
\[ \langle a_\lambda, a_\lambda\rangle = \langle a_\lambda\bar\rho, a_\lambda\bar\rho\rangle = \langle\bar\rho\lambda, \bar\rho\lambda\rangle = \langle\gamma\lambda, \lambda\rangle = m \le 3. \]
If $a_\lambda = \sum_{i=1}^S m_i x_i$, where $m_i \ge 1$ are integers, then $\sum_{i=1}^S m_i^2 = m \le 3$. It follows that $m_i = 1$ and $S = m$.

(2) It follows from (1) that $a_\mu$ is irreducible if $\langle\mu\gamma, \mu\rangle = 1$. Again by Theorem 3.3,
\[ \langle a_\lambda, a_\mu\rangle = \langle a_\lambda\bar\rho, a_\mu\bar\rho\rangle = \langle\bar\rho\lambda, \bar\rho\mu\rangle = \langle\gamma\lambda, \mu\rangle = n. \]
Therefore $na_\mu$ is a subsector of $a_\lambda$.

(3) It follows from Lemma 3.5 that
\[ \langle\sigma_j, a_\lambda\rangle = \langle\sigma_j\bar\rho, a_\lambda\bar\rho\rangle = \langle\sigma_j\bar\rho, \bar\rho\lambda\rangle = \langle\rho\sigma_j\bar\rho, \lambda\rangle = \langle\gamma_j, \lambda\rangle. \]
We have also used $a_\lambda\bar\rho = \bar\rho\lambda$ and $\rho\sigma_j\bar\rho = \gamma_j$ (cf. Cor. 3.2 and Proposition 2.8).
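The counting step in the proof of (1) of Lemma 4.1 can be illustrated by enumerating all multiplicity patterns $(m_1, \ldots, m_S)$ of positive integers with $\sum_i m_i^2 = m$. The following is a minimal sketch; the function name is hypothetical and not part of the original argument.

```python
import math

def multiplicity_patterns(m):
    """All non-increasing tuples of positive integers whose squares sum to m."""
    def rec(remaining, largest):
        if remaining == 0:
            yield ()
            return
        # The next multiplicity is bounded by the previous one and by sqrt(remaining).
        for first in range(min(largest, math.isqrt(remaining)), 0, -1):
            for rest in rec(remaining - first * first, first):
                yield (first,) + rest
    return list(rec(m, m))

if __name__ == "__main__":
    for m in range(1, 5):
        print(m, multiplicity_patterns(m))
```

For $m \le 3$ the only pattern is $(1, \ldots, 1)$ with $S = m$, which is the content of (1). For $m = 4$ the pattern $(2)$ appears as well, which shows why the bound $m \le 3$ is needed.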


We will also need some of the structure constants $N^\nu_{\mu\lambda}$ of $Gr(C_k)$. The general formula is given in Sect. 1.6, but it is quite tedious to calculate $N^\nu_{\mu\lambda}$ using such a formula. A very efficient algorithm is known and can be found on p. 288 of [Kac2] or in [Wal]. We will use $a_\lambda$ instead of $[a_\lambda]$ to simplify notation. However it should be clear that the ring structure is about sectors rather than endomorphisms. We shall denote the weight $(\lambda_1 - \lambda_2)\Lambda_1 + (\lambda_2 - \lambda_3)\Lambda_2 + \cdots + \lambda_{N-1}\Lambda_{N-1}$ of $SU(N)$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{N-1}$, by $(\lambda_1, \ldots, \lambda_{N-1})$. If a sector is denoted by 0, it will always be the identity sector in the rest of this section.
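For $SU(2)$ at level $k$ the structure constants are simple enough to compute directly. The sketch below is an illustration only and is not the algorithm of [Kac2] or [Wal]: it implements the standard truncated Clebsch–Gordan rule and, independently, the Verlinde formula with the standard $SU(2)_k$ S-matrix, and it evaluates pairings of the form $\langle\gamma\lambda, \mu\rangle$ that appear in Lemma 4.1. Spins are stored as doubled integers $2j$; all function names are hypothetical.

```python
import math
from collections import Counter

def su2_fusion(a, b, k):
    """Doubled spins c = 2j occurring in a x b at level k (truncated Clebsch-Gordan)."""
    return list(range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2))

def verlinde(a, b, c, k):
    """N_{ab}^c computed from the SU(2)_k S-matrix via the Verlinde formula."""
    h = k + 2
    def s(x, y):
        return math.sqrt(2.0 / h) * math.sin(math.pi * (x + 1) * (y + 1) / h)
    return round(sum(s(a, m) * s(b, m) * s(c, m) / s(0, m) for m in range(k + 1)))

def times(rep, a, k):
    """Multiply a representation (Counter of doubled spins) by the irreducible a."""
    out = Counter()
    for b, mult in rep.items():
        for c in su2_fusion(a, b, k):
            out[c] += mult
    return out

def pairing(x, y):
    """<x, y>: dimension of the space of intertwiners between x and y."""
    return sum(mult * y[c] for c, mult in x.items())

if __name__ == "__main__":
    k = 10
    gamma = Counter({0: 1, 6: 1})              # gamma = 0 + 3 for SU(2) at level 10
    print(times(gamma, 3, k))                  # decomposition of (3/2) . gamma
    print(pairing(times(gamma, 3, k), Counter({3: 1})))   # <gamma . 3/2, 3/2>
```

With $\gamma = 0 \oplus 3$ at level 10, this gives $\frac32\cdot\gamma = \frac32 \oplus \frac32 \oplus \frac52 \oplus \frac72 \oplus \frac92$ and $\langle\gamma\cdot\frac32, \frac32\rangle = 2$, the values used in Sect. 4.1.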

4.1. E6 and E8 revisited. As in Sect. 3.2, we use a half integer $j$ to denote the representation of $SU(2)$. The ring structure of $C$ is given by the following fusion diagram of $a_{1/2}$:

[Figure: fusion diagram of $a_{1/2}$ with vertices $0$, $a_{1/2}$, $a_1$, $b = a_{3/2} - a_{9/2}$, $c = a_{9/2}$, $d = a_5$.]

where the vertices correspond to the irreducible sectors in $C$. Let us explain how we obtain $b = a_{3/2} - a_{9/2}$. By Lemma 3.4, $a_1\cdot a_{1/2} = a_{3/2} + a_{1/2} = b + a_{1/2} + c$. Hence $a_{3/2} = b + c$. That $a_{3/2}$ has two irreducible sectors can be seen from
\[ \langle a_{3/2}, a_{3/2}\rangle = \langle a_{3/2}\bar\rho, a_{3/2}\bar\rho\rangle = \langle\bar\rho\tfrac32, \bar\rho\tfrac32\rangle = \langle\gamma, \tfrac32\cdot\tfrac32\rangle = \langle 0 \oplus 3, 0 \oplus 1 \oplus 2 \oplus 3\rangle = 2. \]
On the other hand,
\[ \tfrac32\cdot\gamma = \tfrac32\,(0 \oplus 3) = \tfrac32 \oplus \tfrac32 \oplus \tfrac52 \oplus \tfrac72 \oplus \tfrac92. \]
It follows that $\langle a_{3/2}, a_{9/2}\rangle = \langle\tfrac32\cdot\gamma, \tfrac92\rangle = 1$. But $\langle a_{9/2}, a_{9/2}\rangle = \langle 0 \oplus 3, 0 \oplus 1\rangle = 1$, which means $a_{9/2}$ is irreducible. So $b$ or $c$ must be equal to $a_{9/2}$. However $d_c = d_{9/2} \neq d_b$. It follows that $c = a_{9/2}$ and $b = a_{3/2} - a_{9/2}$. It is amusing to check $b^2 = 0 + a_5$, $b\cdot a_5 = b$, which means $(0, b, a_5)$ generates a subring isomorphic to the fusion ring of the Ising model. These are the special nodes (see the definition before Theorem 3.9). Notice it is impossible to have such results in the bimodule picture since $b$ would correspond to the $N$–$M$ bimodule (see [K]). In exactly the same way we can obtain the ring structure of all irreducible subsectors of $\tilde a_f^n$ ($n > 0$):

[Figure: fusion diagram of $\tilde a_{1/2}$ with vertices $0$, $\tilde a_{1/2}$, $\tilde a_1$, $b$, $\tilde a_{9/2}$, $a_5$.]

where $b = a_{3/2} - a_{9/2} = \tilde a_{3/2} - \tilde a_{9/2}$. It follows from [K] that the principal graph and dual principal graph of the subfactor $\rho(M) \subset M$ are given by:

[Figure: principal graph and dual principal graph of $\rho(M) \subset M$; the vertex labels include $0, 1, \ldots, 5$, $\rho$, $\bar\rho$, $x$, $y$, $\bar x$, $\bar y$, $\alpha$, $\beta$, $\beta_1$, $\delta$.]

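where we have used $\beta_1$ instead of $\gamma$ ($\gamma$ is reserved for the vacuum representation of $LH$) compared to [K]. By using the results of [K], we can even determine the ring structure of $A_\rho$, whose irreducible elements are given by: $0$, $a_{1/2}$, $\tilde a_{1/2}$, $a_1$, $\tilde a_1$, $b$, $a_{9/2}$, $\tilde a_{9/2}$, $a_5$, $\alpha = a_{1/2}\tilde a_{1/2}$, $a_5\alpha$ and $\alpha b$. Also $\beta = a_1$, $\beta_1 = \tilde a_1$, $\delta = a_5\cdot\alpha$, $\varepsilon = a_5$. That $\beta = a_1$, $\beta_1 = \tilde a_1$ agrees with the observation in [K]. The ring structure of $A_\rho$ is completely determined by the following formulas:
\[ a_{1/2}\tilde a_{1/2} = \alpha, \quad a_{1/2}\tilde a_1 = b\alpha, \quad a_{1/2}\tilde a_{9/2} = a_5\alpha; \quad a_{1/2}\alpha = \tilde a_{1/2} + b\alpha, \quad \tilde a_{1/2}\alpha = a_{1/2} + b\alpha. \]
Perhaps the most surprising identity is $\alpha = a_{1/2}\tilde a_{1/2}$.² Let us explain why $\alpha = a_{1/2}\tilde a_{1/2}$. First of all, $\langle a_{1/2}\tilde a_{1/2}, a_{1/2}\tilde a_{1/2}\rangle = \langle a_{1/2}a_{1/2}, \tilde a_{1/2}\tilde a_{1/2}\rangle = \langle 0 + a_1, 0 + \tilde a_1\rangle = 1$, hence $a_{1/2}\tilde a_{1/2}$ is irreducible. Then $\langle a_{1/2}\tilde a_{1/2}, \bar\rho\rho\rangle = \langle\rho a_{1/2}\tilde a_{1/2}, \rho\rangle = \langle(0 \oplus 1)\rho, \rho\rangle = \langle 0 \oplus 1, 0 \oplus 3\rangle = 1$. This shows that $\bar\rho\rho = 0 + \alpha$ contains $a_{1/2}\tilde a_{1/2}$. But $d_\alpha = 2 + \sqrt3 = d_{a_{1/2}}\cdot d_{\tilde a_{1/2}}$, so we have $\alpha = a_{1/2}\tilde a_{1/2}$. The rest of the irreducible sectors are obtained by a similar or simpler calculation.

As for the multiplication rule, we explain how we obtain $a_{1/2}\alpha = \tilde a_{1/2} + b\alpha$, which requires more computations than the others. Notice $a_{1/2}\bar\rho\rho = \bar\rho\cdot\tfrac12\cdot\rho = a_{1/2}\cdot(0 + \alpha) = a_{1/2} + a_{1/2}\alpha$. Since $\langle\tilde a_{1/2}, \bar\rho\cdot\tfrac12\cdot\rho\rangle \ge 1$, and by Lemma 3.2, $\tilde a_{1/2} \neq a_{1/2}$, we must have $\langle\tilde a_{1/2}, a_{1/2}\alpha\rangle \ge 1$. But $\langle\bar\rho\cdot\tfrac12\cdot\rho, \bar\rho\cdot\tfrac12\cdot\rho\rangle = \langle\gamma\cdot\tfrac12, \tfrac12\cdot\gamma\rangle = \langle(0 \oplus 3)\cdot\tfrac12, (0 \oplus 3)\cdot\tfrac12\rangle = 3$. This shows $a_{1/2}\cdot\alpha = \tilde a_{1/2} + c$ where $c$ is an irreducible sector. Since $\langle a_{1/2}\cdot\alpha, b\alpha\rangle = \langle a_{1/2}\cdot b, \alpha\bar\alpha\rangle = \langle\beta, 0 + \alpha + \beta + \beta_1 + \delta\rangle = 1$ and $\langle b\alpha, b\alpha\rangle = \langle b^2, \alpha^2\rangle = \langle 0 + a_5, 0 + \alpha + \beta + \beta_1 + \delta\rangle = 1$, it follows that $b\alpha$ is irreducible and $a_{1/2}\cdot\alpha = \tilde a_{1/2} + b\alpha$. The commutativity of $A_\rho$, for example $\tilde a_{1/2}b = b\tilde a_{1/2}$, $a_{1/2}\tilde a_{1/2} = \tilde a_{1/2}a_{1/2}$, follows from Lemma 3.3 and Theorem 3.6.

For the $E_8$ case the structure of $C$ is given by the fusion diagram of $a_{1/2}$:

[Figure: fusion diagram of $a_{1/2}$ with vertices $0$, $a_{1/2}$, $a_1$, $a_{3/2}$, $a_2$, $a$, $b$, $c$.]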

² This identity is an indication of the relevance of $a_f$ and $\tilde a_f$ to the study of the still mysterious dual subfactor $\bar\rho(M) \subset M$ in the general case.


with $a = a_3 - a_2$, $b = a_{7/2} - a_{3/2}$ and $c = a_{5/2} + a_{3/2} - a_{7/2}$. Again it is easy to check that $a^2 = a + 0$. The structure of $A_\rho$ is more complicated and we haven't done the calculation.

4.2. The case of SU(3) and SU(4). The weights of $SU(3)$ are labeled by $(\lambda_1, \lambda_2)$ with $\lambda_1 \ge \lambda_2 \ge 0$.

Example 0. $\widehat{SU(3)}_3 \subset \widehat{Spin(8)}$. The partition function is $Z = |\chi_{(0,0)} + \chi_{(3,0)} + \chi_{(3,3)}|^2 + 3|\chi_{(2,1)}|^2$, and $\gamma = (0,0) + (3,0) + (3,3)$. The special nodes (see the definition before Theorem 3.9) corresponding to the blocks of $Z$ from the left to the right are denoted by $0, b_1, b_2, b_3$. The fusion graph $G_1$ is given by:

[Figure: fusion graph $G_1$ with vertices $0$, $a_f$, $\bar a_f$, $b_1$, $b_2$, $b_3$.]

$G_1$ is determined as follows. We have
\[ (1,0)(1,1) = (0,0) + (2,1), \quad (1,0)(1,0) = (2,0) + (1,1), \quad (1,1)(1,1) = (1,0) + (2,2), \quad (1,1)\gamma = (1,1) + (2,0) + (3,2). \]
By using Lemma 4.1, the first identity says $a_{(1,0)}a_{(1,1)} = 0 + a_{(2,1)} = 0 + b_1 + b_2 + b_3$. (In fact we only have $a_{(2,1)} \succ b_1 + b_2 + b_3$, but by calculating statistical dimensions we conclude that $a_{(2,1)} = b_1 + b_2 + b_3$.) The fourth identity says $a_{(1,1)}$ is irreducible and $a_{(1,1)} = a_{(2,0)}$. The graph $G_1$ is then completely determined as above.

Example 1. $\widehat{SU(3)}_5 \subset \widehat{SU(6)}$. The special nodes are denoted by $e_0, e_1, e_2, a_{(5,0)}, a_{(5,5)}$ and $a_{(0,0)}$, which correspond to the blocks in
\[ Z = |\chi_{(5,2)} + \chi_{(2,2)}|^2 + |\chi_{(3,0)} + \chi_{(3,3)}|^2 + |\chi_{(2,0)} + \chi_{(5,3)}|^2 + |\chi_{(3,2)} + \chi_{(5,0)}|^2 + |\chi_{(3,1)} + \chi_{(5,5)}|^2 + |\chi_{(0,0)} + \chi_{(4,2)}|^2 \]
from the left to the right. $f = (1,0)$ and $\bar f = (1,1)$ correspond to the vector representation and its dual respectively. The multiplication by $(1,0)$ and $(1,1)$ is given by
\[ (\lambda_1, \lambda_2)(1,0) = (\lambda_1 + 1, \lambda_2) + (\lambda_1, \lambda_2 + 1) + (\lambda_1 - 1, \lambda_2 - 1), \]
\[ (\lambda_1, \lambda_2)(1,1) = (\lambda_1 - 1, \lambda_2) + (\lambda_1, \lambda_2 - 1) + (\lambda_1 + 1, \lambda_2 + 1). \]
Let $\sigma$ be the action of $\mathbb{Z}_3$ on the weights. Then $N^{\sigma(c)}_{\sigma(a)b} = N^c_{ab}$ (cf. p. 783 of [Wal]). First we claim the principal graph for the subfactor $M \supset a_f(M)$ is given by the following:

[Figure: principal graph of $M \supset a_f(M)$ with vertices $0$, $a_f$, $a_{(5,1)}$, $a_{(5,4)}$, $a_{(4,0)}$, $a_{(5,5)}$, $e_0$, $e_1$.]


In fact from $(2,1)\gamma = (2,1) + (2,1) + (5,1) + (5,4) + \cdots$ (where $\cdots$ denotes the rest of the sectors, which are different from $(2,1)$, $(5,1)$ and $(5,4)$), by Lemma 4.1, $a_{(2,1)}$ decomposes into two irreducible sectors. Since $\sigma(1,0) = (5,1)$, $\sigma^2(1,1) = (5,4)$ and $a_f$, $a_{\bar f}$ are irreducible, it follows that $a_{(5,1)}$, $a_{(5,4)}$ are irreducible and are subsectors of $a_{(2,1)}$. Hence $a_{(2,1)} = a_{(5,1)} + a_{(5,4)}$. Since $(5,1)(1,0) = (5,2) + (4,0)$ and $(5,4)(1,0) = (5,5) + (4,3)$, it remains to show that $a_{(4,3)} = a_{(1,0)} + a_{(4,0)}$, $a_{(5,2)} = a_{(1,0)} + e_0$, $a_{(3,0)} = a_{(5,4)} + e_1$. One checks
\[ (5,2)\gamma = \sigma(2,0)\gamma = \sigma((2,0)\gamma) = (5,2) + (5,2) + (1,0) + \cdots. \]
Hence $a_{(5,2)} = a_{(1,0)} + e_0$ (that $e_0$ is a subsector of $a_{(5,2)}$ follows from Lemma 4.1 and the fact that $(5,2)$ belongs to the block in $Z$ corresponding to $e_0$). Similarly one has
\[ (4,3)\gamma = (4,3) + (4,3) + (1,0) + (4,0) + \cdots, \qquad (3,0)\gamma = (3,0) + (3,0) + (5,4) + \cdots, \]
which leads to $a_{(4,3)} = a_{(1,0)} + a_{(4,0)}$, $a_{(3,0)} = a_{(5,4)} + e_1$. That $a_{(4,0)}$, $a_{(5,4)}$ are irreducible follows from $a_{(4,0)}\cdot\gamma = \sigma(a_{(1,1)}\gamma) = a_{(4,0)} + \cdots$ and $a_{(5,4)}\cdot\gamma = \sigma^2(a_{(1,1)}\gamma) = a_{(5,4)} + \cdots$. By using the $\mathbb{Z}_3$ symmetry, we can immediately determine the fusion graph of $a_f$:

[Figure: fusion graph of $a_f$ with vertices $0$, $a_f$, $a_{(4,0)}$, $a_{(4,4)}$, $a_{(5,0)}$, $a_{(5,1)}$, $a_{(5,4)}$, $a_{(5,5)}$, $e_0$, $e_1$, $e_2$.]
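The multiplication rules by $(1,0)$ and $(1,1)$ used in Example 1 are easy to mechanize. The sketch below is illustrative and rests on two assumptions that are not spelled out in the text: the level-$k$ truncation is implemented by simply dropping candidate weights outside $k \ge \lambda_1 \ge \lambda_2 \ge 0$, and the $\mathbb{Z}_3$ action is taken to be $\sigma(\lambda_1, \lambda_2) = (k - \lambda_2, \lambda_1 - \lambda_2)$, a form chosen because it reproduces the values $\sigma(1,0) = (5,1)$, $\sigma(2,0) = (5,2)$ and $\sigma^2(1,1) = (5,4)$ quoted above. Function names are hypothetical.

```python
def in_alcove(lam, k):
    """True if (l1, l2) is an admissible SU(3) weight at level k."""
    return k >= lam[0] >= lam[1] >= 0

def times_f(lam, k):
    """Summands of (l1, l2) x (1, 0) at level k."""
    l1, l2 = lam
    cands = [(l1 + 1, l2), (l1, l2 + 1), (l1 - 1, l2 - 1)]
    return [c for c in cands if in_alcove(c, k)]

def times_fbar(lam, k):
    """Summands of (l1, l2) x (1, 1) at level k."""
    l1, l2 = lam
    cands = [(l1 - 1, l2), (l1, l2 - 1), (l1 + 1, l2 + 1)]
    return [c for c in cands if in_alcove(c, k)]

def sigma(lam, k):
    """Assumed form of the Z_3 action on level-k weights."""
    l1, l2 = lam
    return (k - l2, l1 - l2)

def weights(k):
    """All admissible weights (l1, l2) at level k."""
    return [(l1, l2) for l1 in range(k + 1) for l2 in range(l1 + 1)]

if __name__ == "__main__":
    k = 5
    print(times_f((5, 1), k), times_f((5, 4), k))
    print(sigma((1, 0), k), sigma(sigma((1, 1), k), k))
```

With these conventions the products $(5,1)(1,0) = (5,2) + (4,0)$ and $(5,4)(1,0) = (5,5) + (4,3)$ are recovered, and multiplication by $f$ commutes with $\sigma$, which is the instance of $N^{\sigma(c)}_{\sigma(a)b} = N^c_{ab}$ used in the example.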


Example 2. $\widehat{SU(3)}_9 \subset \hat E_6$. The three special nodes are denoted by $1_0, 1_2, 1_1$, which correspond to the three blocks (from the left to the right) in the partition function
\[ Z = 2|\chi_{(4,2)} + \chi_{(7,2)} + \chi_{(7,5)}|^2 + |\chi_{(0,0)} + \chi_{(9,0)} + \chi_{(9,9)} + \chi_{(8,4)} + \chi_{(5,1)} + \chi_{(5,4)}|^2. \]
The first 2 blocks are identical as representations of $\widehat{SU(3)}_9$ but correspond to two different representations of $\hat E_6$ at level 1. We claim the principal graph for $a_f(M) \subset M$ is given by

[Figure: principal graph of $a_f(M) \subset M$ with vertices including $a_f$, $a_{(2,1)}$, $a_{(2,2)}$, $1_0$, $1_1$, $1_2$, $2_0$, $2_2$.]

with $5 = a_{(2,1)}$, $6 = a_{(2,2)}$, $2_0 = 1_0a_f$, $2_2 = 1_2a_f$. First it is clear that $(2,1)\gamma = (2,1) + \cdots$, so $a_{(2,1)}$ is irreducible and $5 = a_{(2,1)}$. $(2,1)(1,0) = (3,1) + (2,2) + (1,0)$. Let us show $a_{(3,1)} = a_{(2,2)} + 1_0a_f + 1_2a_f$. Notice $(3,1)\gamma = 3(3,1) + (2,2) + \cdots$ and $(3,1)(1,1) = (2,1) + (3,0) + (4,2)$. Remember $1_0 + 1_2$ are subsectors of $a_{(4,2)}$ by (3) of Lemma 4.1. It follows that $a_{(3,1)}$ decomposes into 3 irreducible subsectors and $\langle a_{(3,1)}a_{\bar f}, 1_0\rangle = \langle a_{(3,1)}a_{\bar f}, 1_2\rangle = 1$. So $a_{(2,2)}$, $1_0a_f$, $1_2a_f$ are exactly the irreducible subsectors of $a_{(3,1)}$. The statistical dimension $d$ of $a_f$ is $\sqrt3 + 1$, and $d_{a_{(2,2)}} = 2\left(d - \frac1d\right)$. If $a_{(2,2)}\cdot a_{(1,1)} = 2a_{(2,1)} + x$, it follows that $d_x = 0$ and therefore $a_{(2,2)}a_{(1,1)} = 2a_{(2,1)}$. (In fact it follows that $a_{(2,1)} = a_{(3,3)}$.) The principal graph is completely determined as above. The fusion graph of $a_f$ is determined similarly as:

[Figure: fusion graph of $a_f$ with vertices $1_0, 1_1, 1_2, 2_0, 2_1, 2_2, 3_0, 3_1, 3_2, 4, 5, 6$.]


where $3_0 = 1_0\cdot a_{\bar f}$, $3_2 = 1_2\cdot a_{\bar f}$, $4 = a_{(2,0)}$.

Example 3. $\widehat{SU(3)}_{21} \subset \hat E_7$. The partition function is given by:
\[ Z = |\chi_{(0,0)} + \chi_{(21,0)} + \chi_{(21,21)} + \chi_{(8,4)} + \chi_{(17,12)} + \chi_{(17,4)} + \chi_{(20,10)} + \chi_{(11,1)} + \chi_{(11,9)} + \chi_{(12,5)} + \chi_{(15,9)} + \chi_{(15,5)}|^2 \]
\[ \quad + |\chi_{(6,0)} + \chi_{(21,6)} + \chi_{(15,15)} + \chi_{(6,6)} + \chi_{(21,15)} + \chi_{(15,0)} + \chi_{(11,7)} + \chi_{(14,4)} + \chi_{(11,4)} + \chi_{(14,10)} + \chi_{(17,7)} + \chi_{(17,10)}|^2. \]
The special nodes are $1_0$ and $1_1$. The fusion graph of $a_f$ is given by:

[Figure: fusion graph of $a_f$ with vertices $i_0$ and $i_1$, $1 \le i \le 12$.]

where $2_0 = a_f$, $3_0 = a_{(1,1)}$, $4_0 = a_{(2,0)}$, $5_0 = a_{(2,1)}$, $6_0 = a_{(2,2)}$, $7_0 = a_{(4,1)} - 1_1\cdot a_{(3,2)}$, $8_0 = a_{(3,1)}$, $9_0 = a_{(3,2)}$, $10_0 = a_{(4,3)} - 1_1\cdot a_{(3,1)}$, $11_0 = a_{(3,0)}$, $12_0 = 7_0\cdot a_{(1,0)} - 11_0$ and $i_1 = 1_1\cdot i_0$ for $1 \le i \le 12$. The checking of the above formulas is tedious for $7_0$, $10_0$, $12_0$ but quite straightforward. The determination of the fusion graphs in Examples 1 to 3 proves a conjecture on p. 20 of [X3].

Example 4. $\widehat{SU(5)}_3 \subset \widehat{SU(10)}_1$. The partition function is
\[ Z = |\chi_{(0,0,0,0)} + \chi_{(2,2,1,0)}|^2 + |\chi_{(3,1,1,0)} + \chi_{(3,3,2,2)}|^2 + (|\chi_{(1,1,0,0)} + \chi_{(3,2,2,0)}|^2 + |\chi_{(3,3,0,0)} + \chi_{(2,2,1,1)}|^2 + |\chi_{(2,2,0,0)} + \chi_{(3,2,2,2)}|^2 + |\chi_{(3,0,0,0)} + \chi_{(3,2,2,1)}|^2 + \mathrm{c.c.}), \]


where c.c. indicates the conjugate of the terms inside the parenthesis (the conjugate of $|\chi_\lambda + \chi_\mu|^2$ is defined to be $|\chi_{\bar\lambda} + \chi_{\bar\mu}|^2$). The fusion graphs $G_1$ and $G_2$ are given by:

[Figure: the fusion graphs $G_1$ and $G_2$.]

where the broken lines indicate double edges. The arrows are omitted but can be recovered from the following rule (see (i) of Theorem 3.10): the color of vertex $a$ is $a - 1 \pmod 5$, and in $G_i$, $i = 1, 2$, $a$ points to $b$ if and only if $a = b + i \pmod 5$. The determination of $G_1$ is rather similar to Example 1. One determines the principal graph of the subfactor $a_f(M) \subset M$ first and then uses the $\mathbb{Z}_5$ symmetry to obtain $G_1$. $G_2$ is determined by tedious but rather straightforward calculations. The special nodes are labeled from 1 to 10 in the above graphs.

Example 5. $\widehat{SU(4)}_2 \subset \widehat{SU(6)}_1$. The partition function is
\[ Z = |\chi_{(0,0,0)} + \chi_{(2,2,0)}|^2 + |\chi_{(2,2,2)} + \chi_{(2,0,0)}|^2 + 2|\chi_{(1,1,0)}|^2 + 2|\chi_{(2,1,1)}|^2, \]
and $\gamma = (0,0,0) + (2,2,0)$. The special vertices corresponding to the blocks in $Z$ from the left to the right are labeled by 1, 4, 2, 6, 5, 3. The fusion graphs $G_1$ and $G_2$ are determined from the following formulas and Lemma 4.1:
\[ (1,0,0)(1,0,0) = (2,0,0) + (1,1,0), \quad (1,1,1)(1,1,1) = (1,1,0) + (2,2,2); \quad (1,1,1)\gamma \succ (2,1,0). \]
Graphs $G_1$, $G_2$ are the following:

[Figure: the fusion graphs $G_1$ and $G_2$, each with vertices labeled 1 to 8.]

Example 6. $\widehat{SU(4)}_4 \subset \widehat{Spin(15)}_1$. The partition function is
\[ Z = |\chi_{(0,0,0)} + \chi_{(4,4,0)} + \chi_{(3,3,2)} + \chi_{(3,1,0)}|^2 + |\chi_{(4,4,4)} + \chi_{(4,0,0)} + \chi_{(2,1,1)} + \chi_{(4,3,1)}|^2 + |2\chi_{(3,2,1)}|^2. \]
The corresponding special nodes are denoted by $1$, $\varepsilon$ and $\tau$, which correspond to the level 1 representations of $LSO(15)$. The fusion graphs of $a_f$ and $a_{(1,1,0)}$ determined by $G_1$ and $G_2$ respectively are given by:

[Figure: the fusion graphs $G_1$ and $G_2$ with vertices $1, \ldots, 10$, $9'$, $10'$.]

where $2 = \varepsilon$, $3 = \tau$, $6 = a_{(2,0,0)} = a_{(1,1,0)}$, $\tau\cdot a_{(1,1,0)} = 4 + 5$, $\tau\cdot a_f = 9 + 9'$, $\tau\cdot a_{\bar f} = 10 + 10'$. It is interesting to note $d_9 = d_{9'} = d_{10} = d_{10'} = 2\cos\frac\pi8$. Here as in the previous examples, we have enumerated the nodes in the graph and we have used the number to denote the corresponding sector. Let us explain how the graphs are obtained. Notice $(1,0,0)(1,0,0) = (2,0,0) + (1,1,0)$. One checks $(2,0,0)\gamma = (2,0,0) + (1,1,0) + \cdots$, which implies $a_{(2,0,0)} = a_{(1,1,0)} = 6$ by Lemma 4.1. Note $(1,0,0)(1,1,1) = (2,1,1) + (0,0,0)$, and $\langle\tau a_f, \tau a_f\rangle = \langle 1 + \varepsilon, a_fa_{\bar f}\rangle = \langle 1, a_fa_{\bar f}\rangle + \langle\varepsilon, a_fa_{\bar f}\rangle = 1 + \langle\varepsilon, a_{(2,1,1)}\rangle = 2$. Hence we can write $\tau a_f = 9 + 9'$, and similarly $\tau\cdot\bar a_f = 10 + 10'$. Since $\langle\tau a_f, \tau\bar a_f\rangle = \langle 1 + \varepsilon, a_fa_f\rangle = \langle 1 + \varepsilon, a_{(2,0,0)}\rangle = 0$, the sectors $9, 9', 10, 10'$ are all different. Notice $\bar 9 + \bar 9' = 10 + 10'$, so we may choose our notation such that $\bar 9 = 10$, $\bar 9' = 10'$. Similarly one has $\tau\cdot a_{(1,1,0)} = 4 + 5$. Let us show $a_{(2,1,0)} = \tau\cdot a_f + a_{(1,1,1)}$. First one checks $(2,1,0)\cdot\gamma = 3(2,1,0) + (1,1,1) + \cdots$. Then $\langle a_{(2,1,0)}, \tau\cdot a_f\rangle = \langle a_{(2,1,0)}a_{\bar f}, \tau\rangle = \langle a_{(3,2,1)}, \tau\rangle = 2$. It follows that $a_{(2,1,0)} = \tau\cdot a_f + a_{(1,1,1)}$. In the same way $6\cdot a_{\bar f} = 2a_f + \tau\cdot a_{\bar f}$ and $a_{(2,1,1)} = \varepsilon + \tau\cdot a_{(1,1,0)}$. From $\tau\cdot a_f = 9 + 9'$ it follows that $d_9 + d_{9'} = 4\cos\frac\pi8$. Without loss of generality, let us assume $d_9 \le 2\cos\frac\pi8$. It follows that the fusion graph of 9 is an A or D graph (see [GHJ]).


Hence at most $9\cdot\bar 9 = 0 + X + Y$ and $\langle a_f\cdot 9, a_f\cdot 9\rangle = \langle 0 + \varepsilon + 4 + 5, 0 + X + Y\rangle \le 3$. On the other hand $\langle a_f\cdot 9, \tau\cdot a_{(1,1,0)}\rangle = \langle\tau\cdot 9, a_{(1,1,0)}a_{\bar f}\rangle = \langle\tau\cdot 9, 2a_f + \tau\cdot a_{\bar f}\rangle = \langle 9, 2\tau\cdot a_f + a_{\bar f} + a_{\bar f}\rangle = 2$ (here we used $\varepsilon a_{\bar f} = a_{\bar f}$, which follows from $\langle\varepsilon a_{\bar f}, a_{\bar f}\rangle = \langle\varepsilon, a_fa_{\bar f}\rangle = 1$). If $a_f\cdot 9 = 4 + 4$ or $5 + 5$, then $\langle a_f\cdot 9, a_f\cdot 9\rangle \ge 4$, which is impossible, so $a_f\cdot 9 = 4 + 5 = \tau\cdot a_{(1,1,0)}$, and it follows that $d_9 = d_{9'} = 2\cos\frac\pi8$. Similarly one can show $a_f\cdot 9' = 4 + 5$, $\bar a_f\cdot 10 = 4 + 5 = \bar a_f\cdot 10'$. It follows that $a_f\cdot 4 = 7 + 10 + 10' + W$. However, from the explicit formula for the statistical dimensions it follows that $d_W = 0$, therefore $a_f\cdot 4 = 7 + 10 + 10'$. The above calculations determine $G_1$ completely. $G_2$ is determined in a similar way. Since $d_9 = d_{10} = 2\cos\frac\pi8$, the principal graphs of the subfactors associated with 9 and 10 are given as follows:

[Figure: principal graphs of the subfactors associated with 9 and 10; the vertex labels include $0$, $4$, $5$, $a_f$, $\bar a_f$, $9$, $9'$, $10$, $10'$.]

The nodes in the diagram are determined by the fact that they must be among the nodes of $G_1$ and by calculations of statistical dimensions. A somewhat surprising fact is that $9\cdot 10 = 0 + 4$ but $10\cdot 9 = 0 + 5$; in particular $9\cdot 10 \neq 10\cdot 9$! Let us explain how we derive it. In fact, $\langle 9\cdot 9, a_f\cdot a_f\rangle = \langle a_{\bar f}\cdot 9, a_f\cdot\bar 9\rangle = \langle 6 + \tau, 6 + \tau\rangle = 2$. On the other hand, $\langle 9\cdot 9, a_f\cdot a_f\rangle = \langle 9\cdot 9, 6 + 6\rangle$, so it follows that $9\cdot 9 \succ 6$, and by computing the statistical dimensions we conclude that $9\cdot 9 = 6$. So $\langle 9\cdot 9, 9\cdot 9\rangle = \langle 9\cdot 10, 10\cdot 9\rangle = 1$. So if we choose our notation such that $9\cdot 10 = 0 + 4$, then $10\cdot 9 \neq 0 + 4$, but the only node of $G_1$ other than 4 which has the same statistical dimension as $d_4$ is 5. It follows that $10\cdot 9 = 0 + 5$. This is the first indication that the subring generated by all the subsectors of $a_f^n$, where $n$ runs over all positive integers, is noncommutative! It follows from the above principal graphs that $\varepsilon\cdot 9 = 9'$, since $\varepsilon\cdot(9 + 9') = (9 + 9')$ and $9\cdot\bar 9 = 0 + 4$.
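The statistical dimensions quoted in Example 6 can be cross-checked numerically. The sketch below is an illustration under stated assumptions: $d_{a_\lambda} = d_\lambda$ (as in the proof of Lemma 3.4), the dimensions of the vector representation and of $(1,1,0)$ of $SU(4)$ at level 4 are given by the standard q-number and q-binomial expressions with $q = e^{i\pi/8}$, and $d_\tau = \sqrt2$, which is what $\tau\bar\tau = 1 + \varepsilon$ implies. Function names are hypothetical.

```python
import math

def qnum(m, h):
    """q-number [m] = sin(m*pi/h) / sin(pi/h), with h = k + N."""
    return math.sin(m * math.pi / h) / math.sin(math.pi / h)

def dim_fundamental(n, k):
    """Assumed statistical dimension of the vector representation of SU(n) at level k."""
    return qnum(n, k + n)

def dim_second_exterior(n, k):
    """Assumed statistical dimension of the second exterior power: q-binomial [n choose 2]."""
    h = k + n
    return qnum(n, h) * qnum(n - 1, h) / qnum(2, h)

if __name__ == "__main__":
    d_f = dim_fundamental(4, 4)        # d of a_f for SU(4) at level 4
    d_6 = dim_second_exterior(4, 4)    # d of 6 = a_(1,1,0)
    d_tau = math.sqrt(2.0)             # assumed, from tau * tau-bar = 1 + epsilon
    d_9 = d_tau * d_f / 2.0            # from tau * a_f = 9 + 9' with d_9 = d_9'
    print(d_9, 2 * math.cos(math.pi / 8))
    print(d_9 ** 2, d_6)               # consistent with 9 * 9 = 6
    print(d_f * d_9, d_6 + d_tau)      # consistent with a_fbar * 9 = 6 + tau
    print(dim_fundamental(3, 9), 1 + math.sqrt(3))   # d of a_f in Example 2
```

Under these assumptions $d_9 = 2\cos\frac\pi8$, $d_9^2 = d_6 = 2 + \sqrt2$ and $d_fd_9 = d_6 + d_\tau$, in agreement with the decompositions used above.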
In fact, one can determine the complete multiplication table among the nodes of $G_1$ quite easily from the formulas we presented above. We omit the details.

All the diagrams in this section first appeared in [Fran] and [PZ]. They are constructed by spectral analysis based on certain assumptions. Our theory explains and generalizes many observations in [Fran] and [PZ], and furthermore proves that these observations apply to a general class of conformal inclusions. In particular, all the fusion graphs $G_1$ constructed in Examples 1 to 6 support a representation of Hecke algebras by Theorem 3.8 in the sense of [PZ]. This fact is established for Examples 1 to 5 in [Fran, Sch] by explicit but rather tedious calculations, and for Example 6 no such explicit calculation has been done.

Finally let us note that the principal graph of the subfactor $M \supset \rho(M)$ is given as the connected part (containing 1) of the graph determined by $V^\lambda_{1b}$. In fact, since $\rho\bar\rho = \gamma$, the principal graph of $M \supset \rho(M)$ is given by the connected part containing $\bar\rho$ of the Bratteli diagram of $\gamma^n(M)' \cap M \subset \gamma^n\rho(M)' \cap M$ for $n$ big enough. Notice the minimal projections of $\gamma^n(M)' \cap M$ and $\gamma^n\rho(M)' \cap M$ are in one to one correspondence with $[\lambda]$'s and irreducible subsectors of $[\lambda\rho]$'s respectively, since $[\lambda\rho] = [\rho a_\lambda]$, and by (2) of


Theorem 3.3, the irreducible subsectors of $[\rho a_\lambda]$ are in one to one correspondence with the irreducible subsectors of $[a_\lambda]$. (Recall the set of such sectors is denoted by $V$.) Under such a correspondence, the Bratteli diagram is determined by the right multiplication of $a_\lambda$ on $V$, which is given by $V^\lambda_{ab}$. In Sect. 4.1 we have determined $V^f_{ab}$. In the first four examples of Sect. 4.2, we have determined $V^f_{ab}$, $V^{\bar f}_{ab}$. In Example 4, we have determined $V^f_{ab}$, $V^{\bar f}_{ab}$, $V^{(1,1,0,0)}_{ab}$, $V^{(1,1,1,0)}_{ab} = V^{(1,1,0,0)}_{ba}$, and in the last two examples, $V^f_{ab}$, $V^{\bar f}_{ab}$, $V^{(1,1,0)}_{ab}$. Since $\lambda \to V^\lambda$ is a representation of the fusion ring and in each case we have determined the image of the generators, it follows that $V^\lambda_{ab}$ are completely determined for all the examples in this chapter (see [X3] and [X4] for similar considerations). Hence the principal graphs of the subfactor $M \supset \rho(M)$ are completely determined. The principal graph of $M \supset \rho(M)$ for the case $\widehat{SU(3)}_5 \subset \widehat{SU(6)}$ can be determined in exactly the same way as in [X3] and is the same as the graph given there. The following is the principal graph of $M \supset \rho(M)$ for the case $\widehat{SU(3)}_5 \subset \widehat{SU(6)}$ from [X3].

5. Conclusions and Questions

In this paper we have shown the existence of a new class of braided endomorphisms for all the maximal conformal inclusions $G = SU(N) \subset H$ with $H$ simple. The properties of these braided endomorphisms are analyzed and many new examples are given. Our approach suggests some natural questions. It is clear that our method can be applied to other conformal inclusions (where $G$ is not necessarily of type A) once the analog of Theorem 1.6 for $LG$, where $G$ is any simple, simply connected compact group, is established. In fact, in [X4], certain subfactors associated to the case when $G$ is of type B are constructed (see footnote 3). Such a theory has been sketched in [W1], but the details have not appeared, as far as we know. Another question is the ring structure of $A_\rho$. Such a ring structure not only encodes the structure of the braided endomorphisms $a_\lambda$ and $\tilde a_\lambda$, it also contains information about the ring structure of the subfactors from conformal inclusions. It is worth noticing that the fusion graph of $a_f$ (see the examples in Sect. 4) also appears in a different context, namely in the construction of some integrable $N = 2$ supersymmetric models (see [PZ] and [Z]). It would be very interesting to establish such connections. It is also a very interesting question to study the TQFT associated to these new braided endomorphisms along the lines of [OCN2] and [EK]. Finally, based on the relations between subfactors and quantum groups at roots of unity, it is interesting to see whether our new braided endomorphisms have any implications in the quantum group context.

Footnote 3: We have checked that, by assuming Theorem 1.6 for $LG$ where $G = B_2$ and $G_2$, the fusion graphs of $a_f$ are those given by [X4] and [Fran] for the conformal inclusions $\widehat{so(5)}_3 \subset \widehat{so(10)}$ and $\widehat{(G_2)}_3 \subset \widehat{E_6}$, by using Theorem 3.3 of Sect. 3.


Acknowledgement. I’d like to thank Prof. Kawahigashi and Prof. A. Wassermann for useful correspondences via e-mail. I’d also like to thank the referee for helpful suggestions. This work is partially supported by NSF grant DMS-9500882.

References

[L1] Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992)
[L2] Longo, R.: Duality for Hopf algebras and for subfactors, I. Commun. Math. Phys. 159, 133–150 (1994)
[L3] Longo, R.: Index of subfactors and statistics of quantum fields, I. Commun. Math. Phys. 126, 217–247 (1989)
[L4] Longo, R.: Index of subfactors and statistics of quantum fields, II. Commun. Math. Phys. 130, 285–309 (1990)
[L5] Longo, R.: Proceedings of International Congress of Mathematicians, 1281–1291 (1994)
[LR] Longo, R. and Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
[Po] Popa, S.: Classification of subfactors and of their endomorphisms. CBMS Lecture Notes Series, 86
[W1] Wassermann, A.: Operator algebras and Conformal field theories III. To appear
[W2] Wassermann, A.: Proceedings of International Congress of Mathematicians, 966–979 (1994)
[PZ] Petkova, V.B. and Zuber, J.-B.: From CFT to graphs. hep-th-9510198
[Fran] Di Francesco, P. and Zuber, J.-B.: Integrable lattice models associated with SU(N). Nucl. Phys. B 338, 602–623 (1990)
[Ka1] Kawahigashi, Y.: Classification of paragroup actions on subfactors. Publ. RIMS, Kyoto Univ. 31, 481–517 (1995)
[EK] Evans, D.E. and Kawahigashi, Y.: From subfactors to 3-dimensional topological quantum field theories and back – A detailed account of Ocneanu’s theory. Internat. J. Math. 6, 537–558 (1995)
[OCN1] Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45 (Notes recorded by Y. Kawahigashi), 1991
[OCN2] Ocneanu, A.: An invariant couplings between 3-manifolds and subfactors. Preprint (1991)
[GHJ] Goodman, F.M., de la Harpe, P. and Jones, V.: Towers of algebras and Coxeter graphs. MSRI publication, no. 14
[PS] Pressley, A. and Segal, G.: Loop groups. Oxford: Clarendon Press, 1986
[I1] Izumi, M.: Applications of fusion rules to classification of subfactors. Publ. RIMS, Kyoto Univ. 27, 953–994 (1991)
[I2] Izumi, M.: On Flatness of the Coxeter graph E8. Pacific J. Math. 166, 305–327 (1994)
[JB] Bockenhauer, J.: An algebraic formulation of level 1 WZW models. hep-th 9507047, Rev. Math. Phys. 8, 925–948 (1996)
[GNO] Goddard, P., Nahm, W. and Olive, D.: Symmetric spaces, Sugawara’s energy momentum tensor in two dimensions and free fermions. Phys. Lett. 160B, 111–116 (1985)
[We1] Wenzl, H.: Hecke algebras of type An and subfactors. Invent. Math. 92, 349–383 (1988)
[M] Maclane, S.: Categories for the working mathematicians. Graduate Texts in Mathematics 5, Berlin: Springer 1977
[FG] Fröhlich, J. and Gabbiani, F.: Operator algebras and CFT. Commun. Math. Phys. 155, 569–640 (1993)
[Kac1] Kac, V.G. and Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. in Math. 70, 156–234 (1988)
[Kac2] Kac, V.G.: Infinite dimensional algebras, 3rd Edition. Cambridge: Cambridge University Press, 1990
[Sch] Sochen, N.: Integrable models from Hecke algebras. Nucl. Phys. B 360, 613–637 (1991)
[MS] Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 77–184 (1989)
[X1] Xu, F.: Orbifold construction in subfactors. Commun. Math. Phys. 166, 237–253 (1994)
[X2] Xu, F.: A new series of subfactors. Ph.D. thesis, Berkeley, 1995
[X3] Xu, F.: Generalized Goodman-Harpe-Jones construction of subfactors, I. Commun. Math. Phys. 184, 475–491 (1997)
[X4] Xu, F.: Generalized Goodman-Harpe-Jones construction of subfactors, II. Commun. Math. Phys. 184, 493–508 (1997)
[X5] Xu, F.: The flat part of non-flat orbifold. Pac. J. Math. 172(1), 299–306 (1996)
[X6] Xu, F.: Jones-Wassermann subfactors for Disconnected Intervals. q-alg 9704003
[Wal] Walton, M.A.: Fusion rules in WZW models. Nucl. Phys. B 340, 777–790 (1990)
[CIZ] Cappeli, A., Itzykson, C. and Zuber, J.-B.: Commun. Math. Phys. 116, 1–23 (1987)
[Y] Yamagami, S.: A note on Ocneanu’s approach to Jones index theory. Int. J. Math. 4, 859–871 (1993)
[FRS] Fredenhagen, K., Rehren, K.-H. and Schroer, B.: Superselection sectors with braid group statistics and exchange algebras, II. Rev. Math. Phys. Special issue (1992), 113–157
[Fre] Fredenhagen, K.: Generalizations of the theory of superselection sectors. In: The algebraic theory of superselection sectors, D. Kastler ed., Singapore: World Scientific, 1990
[GL1] Guido, D. and Longo, R.: The Conformal Spin and Statistics Theorem. hep-th 9505059
[GL2] Guido, D. and Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1995)
[GL3] Guido, D. and Longo, R.: An Algebraic Spin and Statistics Theorem. To appear in Commun. Math. Phys.

Communicated by H. Araki

Commun. Math. Phys. 192, 405 – 432 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

R-Matrix Quantization of the Elliptic Ruijsenaars–Schneider Model

G.E. Arutyunov, L.O. Chekhov, S.A. Frolov

Steklov Mathematical Institute, Gubkina 8, GSP-1, 117966, Moscow, Russia. E-mail: [email protected]; [email protected]; [email protected]

Received: 17 March 1997 / Accepted: 8 July 1997

Abstract: It is shown that the classical L-operator algebra of the elliptic Ruijsenaars–Schneider model can be realized as a subalgebra of the algebra of functions on the cotangent bundle over the centrally extended current group in two dimensions. It is governed by two dynamical $r$- and $\bar r$-matrices satisfying a closed system of equations. The corresponding quantum $R$- and $\bar R$-matrices are found as solutions to quantum analogs of these equations. We present the quantum L-operator algebra and show that the system of equations on $R$ and $\bar R$ arises as the compatibility condition for this algebra. It turns out that the $R$-matrix is twist-equivalent to the Felder elliptic $R^F$-matrix with $\bar R$ playing the role of the twist. The simplest representation of the quantum L-operator algebra corresponding to the elliptic Ruijsenaars–Schneider model is obtained. The connection of the quantum L-operator algebra to the fundamental relation $RLL = LLR$ with Belavin’s elliptic $R$-matrix is established. As a byproduct of our construction, we find a new $N$-parameter elliptic solution to the classical Yang–Baxter equation.

1. Introduction

The appearance of classical dynamical r-matrices [1, 2] in the theory of integrable many-body systems raises an interesting problem of their quantization. In this way one may hope to separate the variables explicitly.

At present, the classical dynamical r-matrices are known for the rational, trigonometric [2, 3] and elliptic [4, 5] Calogero–Moser (CM) systems, as well as for their relativistic generalizations, the rational, trigonometric [6, 7] and elliptic [8, 9] Ruijsenaars–Schneider (RS) systems [10]. It is recognized that dynamical systems of the Calogero type can be naturally understood in the framework of the Hamiltonian reduction procedure [11, 12]. Moreover, the reduction procedure provides an effective scheme to compute the corresponding dynamical r-matrices [3, 13].
Depending explicitly on the phase variables, the dynamical r-matrices do not satisfy a single closed equation of the Yang–Baxter type, which makes the problem of their quantization rather nontrivial. In [14], the spin generalization of the Calogero–Sutherland system was quantized by using the particular solution [15] of the Gervais–Neveu–Felder equation [16, 15] and in [17], it was interpreted in terms of quasi-Hopf algebras. This system is not integrable, but zero-weight representations of the quantum L-operator algebra admit a proper number of commuting integrals of motion. However, it seems to be important to find, for the Calogero-type systems, a quantum L-operator algebra that possesses a sufficiently large abelian subalgebra.

Recently, an algebraic scheme for quantizing the rational RS model in the R-matrix formalism was proposed [18]. We introduced a special parameterization of the cotangent bundle over $GL(N, \mathbb{C})$. In the new variables, the standard symplectic structure was described by a classical (Frobenius) $r$-matrix and by a new dynamical $\bar r$-matrix. The classical L-operator was introduced as a special matrix function on the cotangent bundle. The Poisson algebra of $L$ inherited from the cotangent bundle coincided with the L-operator algebra of the rational RS model. It is for this reason that we called $L$ the L-operator. Quantizing the Poisson structure for $L$, we found the quantum L-operator algebra and constructed its particular representation corresponding to the rational RS system. This quantum algebra has a remarkable property, namely, it possesses a family of $N$ mutually commuting operators directly on the algebra level.

It is well known that the elliptic RS model is the most general among the systems of the CM and RS types. In this paper, we aim to include this model in our scheme. Recall that the classical L-operator algebra for the elliptic RS model can be obtained by means of the Poissonian [19] or the Hamiltonian reduction schemes [20].
In the first scheme, the affine Heisenberg double is used as the initial phase space and in the second one, the cotangent bundle over the centrally extended current group in two dimensions is considered. Thus, the appropriate phase space we choose to deal with the elliptic RS model is the cotangent bundle $T^*\widehat{GL(N)}(z,\bar z)$ over the centrally extended group $\widehat{GL(N)}(z,\bar z)$ of double loops. The application of our approach [18] is not straightforward since one should work with the infinite-dimensional phase space and, therefore, the correct description of the Poisson structure on $T^*\widehat{GL(N)}(z,\bar z)$ in the desired parameterization requires an intermediate regularization.

We describe briefly the content of the paper and the results obtained. In the second section we start by describing the Poisson structure on $T^*\widehat{GL(N)}(z,\bar z)$ that depends on two complex parameters $k$ and $\alpha$. Then we parametrize $T^*\widehat{GL(N)}(z,\bar z)$ in a special way. The Poisson structure in the new variables is ill defined due to the presence of singularities. To overcome this problem, we introduce an intermediate regularization. Removing the regularization we find that only for $\alpha = 1/N$ is the resulting Poisson structure well defined. The value $\alpha = 1/N$ corresponds, in fact, to the case where only the $SL(N)(z,\bar z)$ subgroup is centrally extended. The corresponding Poisson structure is described by two matrices $\mathbf{r}$ and $\bar{\mathbf{r}}$, which depend on $N$ dynamical variables $q_i$. It follows from the Jacobi identity (see Appendix A) that $\mathbf{r}$ is an $N$-parameter elliptic solution to the classical Yang–Baxter equation (CYBE). It is worthwhile to mention that the main elliptic identities (see Appendix B) follow from the fulfillment of the CYBE for $\mathbf{r}$. We expect that the matrix $\mathbf{r}$ is related to a special Frobenius subgroup in $GL(N)(z,\bar z)$, as it was in the finite-dimensional case [18]. The Jacobi identity also implies a closed system of equations on $\mathbf{r}$ and $\bar{\mathbf{r}}$.

We define a special matrix function $L$ on $T^*\widehat{GL(N)}(z,\bar z)$. We call this function the "L-operator" since, as we show, the Poisson algebra of $L$ inherited from $T^*\widehat{GL(N)}(z,\bar z)$ literally coincides with the one for the elliptic RS model [9, 20]. Thus, the classical L-operator algebra can be realized as a subalgebra of the algebra of functions on $T^*\widehat{GL(N)}(z,\bar z)$. It turns out that the L-operator as a function on $T^*\widehat{GL(N)}(z,\bar z)$ admits a factorization $L = WP$, where $p_i = \log P_i$ are the variables canonically conjugated to $q_i$ and $W$ belongs to some special subgroup in $GL(N)(z,\bar z)$. The Poisson bracket for the entries of $W$ is given by the matrix $\mathbf{r}$ and coincides with the Sklyanin bracket defining the structure of the Poisson–Lie group.

Although the quantum analogs of the equations on $\mathbf{r}$ and $\bar{\mathbf{r}}$ can be easily established, it is rather difficult to find the corresponding quantum $\mathbf{R}$- and $\bar{\mathbf{R}}$-matrices. The point is that the matrices $\mathbf{r}$ and $\bar{\mathbf{r}}$ have a complicated structure, $\mathbf{r} = r - s \otimes I + I \otimes s$ and $\bar{\mathbf{r}} = \bar r - s \otimes I$, due to the presence of the $s$-matrix. However, we observe that the classical L-operator algebra does not depend on $s$ and, moreover, the matrices $r$ and $\bar r$ also obey a closed system of equations. We show that this system arises as the compatibility condition for a new Poisson algebra. This algebra contains both the Poisson algebra of $T^*\widehat{GL(N)}(z,\bar z)$ and the classical L-operator algebra as its subalgebras.

In the third section, using this key observation, we pass to the quantization. We find the corresponding quantum $R$- and $\bar R$-matrices as solutions to quantum analogs of the equations for $r$ and $\bar r$. In particular, the $R$-matrix satisfies a novel triangle relation that differs from the standard quantum Yang–Baxter equation by shifting the spectral parameters in a special way. The Felder elliptic $R^F$-matrix naturally arises in our construction. It turns out that the $R$-matrix is twist-equivalent to the $R^F$-matrix with the $\bar R$-matrix playing the role of the twist. Then we derive a new quadratic algebra satisfied by the "quantum" L-operator. This algebra is described by the quantum dynamical R-matrices, namely, $R$, $R^F$, and $\bar R$:

$$R_{12} L_1 \bar R_{21} L_2 = L_2 \bar R_{12} L_1 R^F_{12}.$$

We show that the system of equations on the $R$-, $R^F$-, and $\bar R$-matrices arises as the compatibility condition for this algebra. We present the simplest representation of the quantum L-operator algebra corresponding to the elliptic RS model. We note that when performing a simple canonical transformation, the quantum L-operator coincides essentially with the classical L-operator found in [10].

The quantum integrals of motion for the elliptic RS model were obtained in [10]. In [21], it was shown that any operator from the Ruijsenaars commuting family can be realized as the trace of a proper transfer matrix for the special $\hat L$-operator that obeys the relation $R\hat L\hat L = \hat L\hat L R$ with Belavin’s elliptic $R$-matrix [26]. We note that our L-operator is gauge-equivalent to $\hat L$. It follows from this observation that the determinant formula for the commuting family [21] is also valid for $L$. We show that any representation of our L-operator algebra is gauge equivalent to a representation of the relations $R\hat L\hat L = \hat L\hat L R$.

In the Conclusion we discuss some problems to be solved.

2. Classical L-Operator Algebra

2.1. Poisson structure of $T^*\widehat{GL(N)}(z,\bar z)$. Let $T_\tau$ be a torus endowed with the standard complex structure and periods $1$ and $\tau$. Denote by $G$ a group of smooth mappings from $T_\tau$ into the group $GL(N, \mathbb{C})$. Then $g \in G$ is a double-periodic matrix function $g(z,\bar z)$. The dual space to the Lie algebra of $G$ is spanned by double-periodic functions $A(z,\bar z)$ with values in $\mathrm{Mat}(N, \mathbb{C})$. In what follows, we often use the concise notation $g(z,\bar z) = g(z)$ and $A(z,\bar z) = A(z)$. The group $G$ admits central extensions $\hat G$ [22]. The Poisson structure on $T^*\hat G$ with fixed central charges reads

$$\{A_1(z), A_2(w)\} = \frac{1}{2}\,[C, A_1(z) - A_2(w)]\,\delta(z-w) - k\,(C - \alpha I)\,\frac{\partial}{\partial \bar z}\,\delta(z-w), \tag{2.1}$$

$$\{g_1(z), g_2(w)\} = 0, \tag{2.2}$$

$$\{A_1(z), g_2(w)\} = g_2(w)\,C\,\delta(z-w), \tag{2.3}$$

where $k$, $\alpha$ are central charges and $\delta(z)$ is the two-dimensional $\delta$-function. Here we use the standard tensor notation, and $C$ is the permutation operator.

One can consider the following Hamiltonian action of $G$ on $T^*\hat G$:

$$A(z) \to T^{-1}(z)A(z)T(z) + k\,T^{-1}(z)\,\bar\partial T(z), \qquad g(z) \to T^{-1}(z)\,g(z)\,T(z). \tag{2.4}$$
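As a quick consistency check, not spelled out in the text, two successive transformations of the form (2.4) compose into a single transformation of the same form. The symbols $S$, $A'$ and $A''$ are introduced here only for illustration.

```latex
% Assumption: T and S are smooth double-periodic GL(N,C)-valued functions.
\begin{align*}
A'  &= T^{-1} A\, T + k\, T^{-1}\bar\partial T, \\
A'' &= S^{-1} A' S + k\, S^{-1}\bar\partial S \\
    &= (TS)^{-1} A\, (TS) + k\, S^{-1} T^{-1} (\bar\partial T)\, S + k\, S^{-1}\bar\partial S \\
    &= (TS)^{-1} A\, (TS) + k\, (TS)^{-1}\, \bar\partial (TS),
\end{align*}
% using the Leibniz rule \bar\partial(TS) = (\bar\partial T)\, S + T\, \bar\partial S.
% The same composition law holds trivially for g -> T^{-1} g T.
```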

We restrict our consideration to the case of smooth elements $A(z)$. Then the generic element $A(z)$ can be diagonalized by the transformation (2.4) [23]:

$$A(z) = T(z)\,D\,T^{-1}(z) - k\,\bar\partial T(z)\,T^{-1}(z). \tag{2.5}$$
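To see in what sense (2.5) is a diagonalization, one can substitute it into the transformation (2.4) and check that the result is the matrix $D$. The short computation below is only a sketch of this substitution and uses nothing beyond (2.4) and (2.5).

```latex
\begin{align*}
T^{-1} A\, T + k\, T^{-1}\bar\partial T
  &= T^{-1}\bigl( T D T^{-1} - k\, \bar\partial T\, T^{-1} \bigr) T + k\, T^{-1}\bar\partial T \\
  &= D - k\, T^{-1}\bar\partial T + k\, T^{-1}\bar\partial T \\
  &= D .
\end{align*}
```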

In (2.5), $D$ is a constant diagonal matrix with entries $D_i$, $D_i \neq D_j$ for $i \neq j$, and $T(z)$ is double-periodic. The matrix $D$ is defined up to the action of the elliptic Weyl group. One can fix $D$ by choosing the fundamental Weyl chamber. The matrix $T(z)$ in Eq. (2.5) is not uniquely defined. Any element $\tilde T(z,\bar z) = T(z,\bar z)\,h(z)$, where the diagonal matrix $h(z)$ is an entire function of $z$, also satisfies (2.5). Demanding $\tilde T(z,\bar z)$ to be double-periodic, we obtain that $h(z)$ is a constant matrix. We can remove this ambiguity by imposing the condition

$$T(\varepsilon)\,e = e, \tag{2.6}$$

where $e$ is a vector such that $e_i = 1$ for all $i$, and $\varepsilon$ is an arbitrary point on $T_\tau$. In what follows, we denote the matrix $T(z)$ that solves Eq. (2.5) and satisfies Eq. (2.6) by $T^\varepsilon(z)$. Such matrices evidently form a group.

Now we try to rewrite the Poisson structure (2.1) in terms of the variables $T$ and $D$. Since the $D_i$ are $G$-invariant functions, they belong to the center of (2.1) and, therefore, it is enough to calculate the bracket $\{T^\varepsilon(z), T^\varepsilon(w)\}$. However, the straightforward calculation reveals that this bracket is ill defined. So, we begin with calculating the bracket $\{T^\varepsilon(z), T^\eta(w)\}$, where $T^\varepsilon(z)$ and $T^\eta(w)$ satisfy (2.6) at different points $\varepsilon$ and $\eta$,

$$\{T^\varepsilon_{ij}(z), T^\eta_{kl}(w)\} = \sum_{mnps} \int d^2z'\, d^2w'\, \frac{\delta T^\varepsilon_{ij}(z)}{\delta A_{mn}(z')}\, \frac{\delta T^\eta_{kl}(w)}{\delta A_{ps}(w')}\, \{A_{mn}(z'), A_{ps}(w')\}. \tag{2.7}$$

To calculate the functional derivative $\dfrac{\delta T^\varepsilon_{ij}(z)}{\delta A_{mn}(z')}$, we consider the variation of (2.5):

$$X(z) = t(z)D - D\,t(z) - k\,\bar\partial t(z) + d, \tag{2.8}$$

where $X(z) = T^{-1}(z)\,\delta A(z)\,T(z)$, $t(z) = T^{-1}(z)\,\delta T(z)$ and $d = \delta D$. First, from (2.8), we immediately obtain

$$\frac{\delta D_i}{\delta A_{kl}(z)} = \frac{1}{\tau - \bar\tau}\,\bigl(T^{\varepsilon}\bigr)^{-1}_{ik}(z)\, T^\varepsilon_{li}(z). \tag{2.9}$$

Let us introduce the function $\Phi(z, s)$ of two complex variables

$$\Phi(z, s) = \frac{\sigma(z+s)}{\sigma(z)\,\sigma(s)}\; e^{-2\zeta(\frac12)\,zs}\; e^{2\pi i s\,\frac{z - \bar z}{\tau - \bar\tau}}. \tag{2.10}$$

Here $\sigma(z)$ and $\zeta(z)$ are the Weierstrass $\sigma$- and $\zeta$-functions with periods equal to $1$ and $\tau$. The function $\Phi(z, s)$ is the only double-periodic solution to the following equation:

$$\bar\partial\,\Phi(z, s) + \frac{2\pi i s}{\tau - \bar\tau}\,\Phi(z, s) = 2\pi i\,\delta(z). \tag{2.11}$$
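The double periodicity of $\Phi(z,s)$ in $z$ can be checked directly from (2.10). The sketch below assumes $\mathrm{Im}\,\tau > 0$ and uses two standard facts about the Weierstrass functions: the quasi-periodicity of $\sigma$ and the Legendre relation. The abbreviations $\eta_1 = \zeta(1/2)$ and $\eta_2 = \zeta(\tau/2)$ are introduced here for convenience and are not notation from the text.

```latex
% Standard facts (half-periods 1/2 and tau/2, Im tau > 0):
%   sigma(z+1)   = -exp(2 eta_1 (z + 1/2))   sigma(z),
%   sigma(z+tau) = -exp(2 eta_2 (z + tau/2)) sigma(z),
%   Legendre relation: eta_1 * tau - eta_2 = pi i.
\begin{align*}
\frac{\Phi(z+1,s)}{\Phi(z,s)}
  &= \underbrace{e^{2\eta_1 s}}_{\sigma(z+s+1)/\sigma(z+1)}
     \cdot \underbrace{e^{-2\eta_1 s}}_{e^{-2\zeta(1/2)(z+1)s}}
     \cdot \underbrace{1}_{z-\bar z \ \text{unchanged}} \;=\; 1, \\[4pt]
\frac{\Phi(z+\tau,s)}{\Phi(z,s)}
  &= e^{2\eta_2 s}\cdot e^{-2\eta_1 \tau s}\cdot e^{2\pi i s}
   = e^{\,2s\,(\eta_2 - \eta_1\tau + \pi i)} \;=\; 1 .
\end{align*}
% Away from z = 0 the only non-holomorphic factor in (2.10) is the last
% exponential, so \bar\partial\Phi = -\frac{2\pi i s}{\tau-\bar\tau}\,\Phi there,
% in agreement with the left-hand side of (2.11).
```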

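It is also convenient to define $\Phi(z, 0)$ as follows:

$$\Phi(z, 0) = \lim_{\varepsilon \to 0}\Bigl(\Phi(z, \varepsilon) - \frac{1}{\varepsilon}\Bigr) = \zeta(z) - 2\zeta(\tfrac12)\,z + 2\pi i\,\frac{z - \bar z}{\tau - \bar\tau}.$$

This function solves the equation

$$\bar\partial\,\Phi(z, 0) = 2\pi i\,\delta(z) - \frac{2\pi i}{\tau - \bar\tau}.$$

We introduce the notation $q_{ij} \equiv q_i - q_j$, where $q_i = \dfrac{\tau - \bar\tau}{2\pi i k}\,D_i$. Using these functions, one can write the solution to (2.8) obeying the condition $t(\varepsilon)e = 0$ [20]:

$$t(z) = \kappa \sum_{i,j} \int d^2w\,\bigl(\Phi(\varepsilon - w, q_{ij})\,X_{ij}(w)\,E_{ii} - \Phi(z - w, q_{ij})\,X_{ij}(w)\,E_{ij}\bigr). \tag{2.12}$$

Hereafter, we denote $\dfrac{1}{2\pi i k}$ by $\kappa$. Performing the variation of Eq. (2.12) with respect to $A_{mn}(w)$ one gets

$$\frac{\delta T^\varepsilon_{ij}(z)}{\delta A_{mn}(w)} = \kappa\Bigl(\sum_k \Phi(\varepsilon - w, q_{jk})\,T^\varepsilon_{ij}(z)\,\bigl(T^\varepsilon\bigr)^{-1}_{jm}(w)\,T^\varepsilon_{nk}(w) - \sum_k \Phi(z - w, q_{kj})\,T^\varepsilon_{ik}(z)\,\bigl(T^\varepsilon\bigr)^{-1}_{km}(w)\,T^\varepsilon_{nj}(w)\Bigr).$$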
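The intermediate steps behind this functional derivative can be sketched as follows. The computation uses only the definitions $t = T^{-1}\delta T$ and $X = T^{-1}\delta A\,T$ from (2.8) and the solution (2.12); the superscript $\varepsilon$ on $T$ is suppressed for readability.

```latex
\begin{align*}
\delta T_{ij}(z) &= \sum_k T_{ik}(z)\, t_{kj}(z), \\
t_{kj}(z) &= \kappa \int d^2w \,\Bigl( \delta_{kj} \sum_l \Phi(\varepsilon - w, q_{jl})\, X_{jl}(w)
             \;-\; \Phi(z - w, q_{kj})\, X_{kj}(w) \Bigr), \\
X_{ab}(w) &= \sum_{m,n} T^{-1}_{am}(w)\, \delta A_{mn}(w)\, T_{nb}(w)
  \quad\Longrightarrow\quad
  \frac{\delta X_{ab}(w)}{\delta A_{mn}(w)} = T^{-1}_{am}(w)\, T_{nb}(w).
\end{align*}
% Substituting the last two lines into the first one gives
%   kappa * [ T_{ij}(z) sum_l Phi(eps - w, q_{jl}) T^{-1}_{jm}(w) T_{nl}(w)
%            - sum_k T_{ik}(z) Phi(z - w, q_{kj}) T^{-1}_{km}(w) T_{nj}(w) ],
% which is the displayed formula after renaming the summation index l -> k.
```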

To compute the bracket (2.7), one needs the following relation between $T^\varepsilon(z)$ and $T^\eta(z)$:

$$T^\varepsilon(z) = T^\eta(z)\,H^{\eta\varepsilon}, \tag{2.13}$$

where $H^{\eta\varepsilon}$ is a constant diagonal matrix. By direct computation, one finds

$$\frac{1}{\kappa}\{T^\varepsilon_1(z), T^\eta_2(w)\} = T^\varepsilon_1(z)\,T^\eta_2(w)\,\bigl(H^{\varepsilon\eta}_1 H^{\eta\varepsilon}_2\, r^{\varepsilon\eta}_{12}(z, w) - \alpha f^{\varepsilon\eta}(z, w)\bigr), \tag{2.14}$$

where

$$r^{\varepsilon\eta}_{12}(z, w) = \sum_{ij} \Phi(\varepsilon - \eta, q_{ij})\,E_{ii} \otimes E_{jj} + \sum_{ij} \Phi(z - w, q_{ij})\,E_{ij} \otimes E_{ji} \tag{2.15}$$

$$\qquad\qquad - \sum_{ij} \Phi(z - \eta, q_{ij})\,E_{ij} \otimes E_{jj} + \sum_{ij} \Phi(w - \varepsilon, q_{ij})\,E_{jj} \otimes E_{ij} \tag{2.16}$$

and

$$f^{\varepsilon\eta}(z, w) = \Phi(\varepsilon - \eta, 0) + \Phi(w - \varepsilon, 0) + \Phi(z - w, 0) - \Phi(z - \eta, 0).$$

The bracket (2.14) has the r-matrix form with the r-matrix depending not only on the coordinates $q_i$ but also on the additional variables $H$. In the limit $\eta \to \varepsilon$, one encounters a singularity. This shows that the variable $T(z)$ is not a good candidate to describe the Poisson structure (2.1). However, one can use the freedom to multiply $T(z)$ by any functional of $A$. So, we introduce a new variable

$$\mathbf{T}^\varepsilon(z) = T^\varepsilon(z)\,\bigl(\det T^\varepsilon(\varepsilon)\bigr)^{\beta}. \tag{2.17}$$

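We use $\det T^\varepsilon(\varepsilon)$ in the definition of $\mathbf{T}^\varepsilon(z)$ in order to have the group structure for the new variables. Using the Poisson bracket (2.14) one immediately finds

$$\frac{1}{\kappa}\{\mathbf{T}^\varepsilon_1(z), \mathbf{T}^\eta_2(w)\} = \mathbf{T}^\varepsilon_1(z)\,\mathbf{T}^\eta_2(w)\,\Bigl(H^{\varepsilon\eta}_1 H^{\eta\varepsilon}_2\, r^{\varepsilon\eta}_{12}(z, w) - \alpha f^{\varepsilon\eta}(z, w) + \beta\, I \otimes \mathrm{tr}_3\, H^{\varepsilon\eta}_3 H^{\eta\varepsilon}_2\, r^{\varepsilon\eta}_{32}(\varepsilon, w) + \beta\, \mathrm{tr}_3\, H^{\varepsilon\eta}_1 H^{\eta\varepsilon}_3\, r^{\varepsilon\eta}_{13}(z, \eta) \otimes I + \beta^2\, \mathrm{tr}_{34}\, H^{\varepsilon\eta}_3 H^{\eta\varepsilon}_4\, r^{\varepsilon\eta}_{34}(\varepsilon, \eta)\; I \otimes I\Bigr), \tag{2.18}$$

since $f^{\varepsilon\eta}(\varepsilon, w) = f^{\varepsilon\eta}(z, \eta) = 0$. Now we are going to pass to the limit $\eta \to \varepsilon$ (see footnote 1). For this purpose one should take into account the following behavior of $H^{\varepsilon\eta}$ when $\eta$ goes to $\varepsilon$: $H^{\varepsilon\eta} = 1 + (\varepsilon - \eta)\,h + o(\varepsilon - \eta)$, where $h$ is a constant diagonal matrix that is a functional of $A$. It turns out that there exists a unique choice for $\alpha$ and $\beta$, namely, $\alpha = 1/N$, $\beta = -1/N$, for which the singularities cancel and there is no contribution from the matrix $h$. In the limit $\eta \to \varepsilon = 0$ (see footnote 2), for these values of $\alpha$ and $\beta$, one gets

$$\frac{1}{\kappa}\{\mathbf{T}_1(z), \mathbf{T}_2(w)\} = \mathbf{T}_1(z)\,\mathbf{T}_2(w)\,\mathbf{r}_{12}(z, w). \tag{2.19}$$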
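The vanishing of $f^{\varepsilon\eta}$ at $z = \varepsilon$ and at $w = \eta$, used in deriving (2.18), follows from the fact that $\Phi(z, 0)$ is an odd function of $z$. The check below is a sketch that assumes only the explicit formula for $\Phi(z,0)$ and the expression for $f^{\varepsilon\eta}(z,w)$ given after (2.14).

```latex
% Phi(z,0) = zeta(z) - 2 zeta(1/2) z + 2 pi i (z - \bar z)/(tau - \bar tau)
% is odd in z, because zeta is odd and the other two terms are linear in (z, \bar z).
\begin{align*}
f^{\varepsilon\eta}(\varepsilon, w)
  &= \Phi(\varepsilon-\eta,0) + \Phi(w-\varepsilon,0) + \Phi(\varepsilon-w,0) - \Phi(\varepsilon-\eta,0) \\
  &= \Phi(w-\varepsilon,0) + \Phi(\varepsilon-w,0) = 0, \\[4pt]
f^{\varepsilon\eta}(z, \eta)
  &= \Phi(\varepsilon-\eta,0) + \Phi(\eta-\varepsilon,0) + \Phi(z-\eta,0) - \Phi(z-\eta,0) = 0 .
\end{align*}
```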

Here the limiting r-matrix is given by

$$\mathbf{r}_{12}(z, w) = r_{12}(z, w) - s(z) \otimes I + I \otimes s(w) - \frac{1}{N}\,\Phi(z - w, 0)\, I \otimes I, \tag{2.20}$$

where

$$r(z, w) = \sum_{i \neq j} \Phi(q_{ij})\,E_{ii} \otimes E_{jj} + \sum_{ij} \Phi(z - w, q_{ij})\,E_{ij} \otimes E_{ji} - \sum_{ij} \Phi(z, q_{ij})\,E_{ij} \otimes E_{jj} + \sum_{ij} \Phi(w, q_{ij})\,E_{jj} \otimes E_{ij} \tag{2.21}$$

and

$$s(z) = \frac{1}{N} \sum_{ij} \bigl(\Phi(q_{ij})\,E_{ii} - \Phi(z, q_{ij})\,E_{ij}\bigr). \tag{2.22}$$

Here we denote by $\Phi(q_{ij})$ the regular part of $\Phi(\varepsilon, q_{ij})$ for $\varepsilon \to 0$:

$$\Phi(q_{ij}) = \zeta(q_{ij}) - 2\zeta(\tfrac12)\,q_{ij}, \quad i \neq j, \qquad \Phi(q_{ii}) = 0.$$

Footnote 1: Without loss of generality we assume that $\varepsilon$ and $\eta$ are real.

Footnote 2: The $\varepsilon$-dependence can be easily restored by shifting both $z$ and $w$ by $\varepsilon$.

Note that both $\mathbf{r}$ and $r$ are skew-symmetric: $\mathbf{r}_{12}(z, w) = -\mathbf{r}_{21}(w, z)$. A natural conjecture is that the r-matrix obtained satisfies the classical Yang–Baxter equation

$$[[\mathbf{r}, \mathbf{r}]] \equiv [\mathbf{r}_{12}(z_1, z_2), \mathbf{r}_{13}(z_1, z_3) + \mathbf{r}_{23}(z_2, z_3)] + [\mathbf{r}_{13}(z_1, z_3), \mathbf{r}_{23}(z_2, z_3)] = 0. \tag{2.23}$$

It can be verified either by direct calculation or by considering the limiting case of the Jacobi identity for the bracket (2.18), as is done in Appendix A. Thereby, the r-matrix (2.20) is an $N$-parameter solution of the classical Yang–Baxter equation.

Let us note that, as one could expect, the condition $\det \mathbf{T}(0) = 1$ is compatible with the bracket (2.19), since $\det \mathbf{T}(z)$ is a central element of algebra (2.19). Note also that the choice $\alpha = 1/N$ corresponds to the case where only the $sl(N)(z,\bar z)$ subalgebra is centrally extended. In terms of $\mathbf{T}(z)$ ($\beta = -1/N$), the boundary condition looks like $\mathbf{T}(0)e = \lambda e$ and $\det \mathbf{T}(0) = 1$. One can also check that the field $A(z)$ defined by (2.5), with $\mathbf{T}(z)$ substituted for $T(z)$, obeys Poisson algebra (2.1).

The next step is to consider the special parameterization for the field $g(z)$. To this end, we introduce $\tilde A(z)$:

$$\tilde A = g A g^{-1} - k\,\bar\partial g\, g^{-1} + \frac{k}{N}\,\mathrm{tr}\,\bar\partial g\, g^{-1}. \tag{2.24}$$

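One can check that $\tilde A(z)$ Poisson commutes with $A(w)$ and obeys the Poisson algebra:

$$\{\tilde A_1(z), \tilde A_2(w)\} = -\frac{1}{2}\,[C, \tilde A_1(z) - \tilde A_2(w)]\,\delta(z - w) + k\,\Bigl(C - \frac{1}{N} I\Bigr)\frac{\partial}{\partial \bar z}\,\delta(z - w), \qquad \{\tilde A_1(z), g_2(w)\} = C\,g_2(w)\,\delta(z - w). \tag{2.25}$$

Now we factorize $\tilde A(z)$ in the same manner as was done for $A(z)$,

$$\tilde A(z) = \mathbf{U}(z)\,D\,\mathbf{U}^{-1}(z) - k\,\bar\partial \mathbf{U}(z)\,\mathbf{U}^{-1}(z), \tag{2.26}$$

where $\mathbf{U}(z)$ satisfies the boundary condition $\mathbf{U}(0)e = \lambda e$ and $\det \mathbf{U}(0) = 1$. Obviously, $\mathbf{U}(z)$ Poisson commutes with $\mathbf{T}(w)$ and satisfies the Poisson algebra

$$\frac{1}{\kappa}\{\mathbf{U}_1(z), \mathbf{U}_2(w)\} = -\mathbf{U}_1(z)\,\mathbf{U}_2(w)\,\mathbf{r}_{12}(z, w). \tag{2.27}$$

One can find from (2.24) and (2.26) the representation for the field $g$,

$$g(z) = (\det g(z))^{\frac{1}{N}}\;\mathbf{U}(z)\,\mathbf{P}\,\mathbf{T}^{-1}(z), \tag{2.28}$$

where $\mathbf{P}$ is a constant diagonal matrix. Computing the determinants of both sides of Eq. (2.28) one gets $\det \mathbf{P} = \det \mathbf{T}(z) / \det \mathbf{U}(z)$. Since the l.h.s. does not depend on $z$ and $\det \mathbf{T}(0) = \det \mathbf{U}(0) = 1$, we obtain that $\det \mathbf{P} = 1$ and $\det \mathbf{T}(z) = \det \mathbf{U}(z)$. Calculating the Poisson brackets of $\mathbf{P}$ with $\mathbf{P}$ and $Q = \mathrm{diag}(q_1, \ldots, q_N)$ in the same manner as above, one finds that

$$\{\mathbf{P}_1, \mathbf{P}_2\} = 0, \tag{2.29}$$

$$\frac{1}{\kappa}\{Q_1, \mathbf{P}_2\} = \mathbf{P}_2\Bigl(\sum_i E_{ii} \otimes E_{ii} - \frac{1}{N}\, I \otimes I\Bigr). \tag{2.30}$$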
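Written in components, the bracket (2.30) makes the compatibility with $\det \mathbf{P} = 1$ explicit. The sketch below only unpacks the tensor notation; the entries $\mathbf{P}_j$ denote the diagonal elements of $\mathbf{P}$.

```latex
\begin{align*}
\frac{1}{\kappa}\{q_i, \mathbf{P}_j\} &= \mathbf{P}_j \Bigl(\delta_{ij} - \frac{1}{N}\Bigr), \\
\frac{1}{\kappa}\Bigl\{q_i, \prod_{j=1}^{N} \mathbf{P}_j\Bigr\}
  &= \Bigl(\prod_{j=1}^{N} \mathbf{P}_j\Bigr) \sum_{j=1}^{N}\Bigl(\delta_{ij} - \frac{1}{N}\Bigr)
   = \Bigl(\prod_{j=1}^{N} \mathbf{P}_j\Bigr)\,(1 - 1) = 0 .
\end{align*}
% Hence det P Poisson commutes with every q_i, as it must if det P = 1
% is to be imposed consistently.
```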

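In fact, it means that $\log \mathbf{P}_i = p_i - \frac{1}{N}\sum_j p_j$, where the $p_i$ are canonically conjugated to the $q_i$. For the remaining Poisson brackets of $\mathbf{P}$ with the fields $\mathbf{T}$, $\mathbf{U}$, we have

$$\frac{1}{\kappa}\{\mathbf{T}_1(z), \mathbf{P}_2\} = \mathbf{T}_1(z)\,\mathbf{P}_2\,\bar{\mathbf{r}}_{12}(z), \tag{2.31}$$

$$\frac{1}{\kappa}\{\mathbf{U}_1(z), \mathbf{P}_2\} = \mathbf{U}_1(z)\,\mathbf{P}_2\,\bar{\mathbf{r}}_{12}(z). \tag{2.32}$$

Here

$$\bar{\mathbf{r}}_{12}(z) = \bar r_{12}(z) - s(z) \otimes I - \frac{1}{N}\, I \otimes \sum_{ij} \Phi(q_{ij})\,E_{jj}, \tag{2.33}$$

where we introduced the $\bar r$-matrix:

$$\bar r_{12}(z) = \sum_{ij} \Phi(q_{ij})\,E_{ii} \otimes E_{jj} - \sum_{ij} \Phi(z, q_{ij})\,E_{ij} \otimes E_{jj}. \tag{2.34}$$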

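To complete the description of the classical Poisson structure of the cotangent bundle we present the Poisson bracket of $\det g$ with the other variables:

$$\frac{1}{\kappa}\{Q, \det g(w)\} = \det g(w), \qquad \{\mathbf{P}, \det g(w)\} = 0,$$

$$\frac{1}{\kappa}\{\mathbf{T}(z), \det g(w)\} = -\det g(w)\,\mathbf{T}(z)\,\bigl(\Phi(z - w, 0) + \Phi(w, 0)\bigr),$$

$$\frac{1}{\kappa}\{\mathbf{U}(z), \det g(w)\} = -\det g(w)\,\mathbf{U}(z)\,\bigl(\Phi(z - w, 0) + \Phi(w, 0)\bigr).$$

Recall that the Jacobi identity for bracket (2.19) reduces to the classical Yang–Baxter equation for the $\mathbf{r}$-matrix. As to the Poisson relations (2.31) and (2.32), one finds that the Jacobi identity is equivalent to the following equation, quadratic in $\bar{\mathbf{r}}$:

$$[\bar{\mathbf{r}}_{12}(z), \bar{\mathbf{r}}_{13}(z)] - \mathbf{P}_3^{-1}\{\bar{\mathbf{r}}_{12}(z), \mathbf{P}_3\} + \mathbf{P}_2^{-1}\{\bar{\mathbf{r}}_{13}(z), \mathbf{P}_2\} = 0 \tag{2.35}$$

and the equation involving $\mathbf{r}$ and $\bar{\mathbf{r}}$,

$$[\mathbf{r}_{12}(z, w), \bar{\mathbf{r}}_{13}(z) + \bar{\mathbf{r}}_{23}(w)] + [\bar{\mathbf{r}}_{13}(z), \bar{\mathbf{r}}_{23}(w)] - \mathbf{P}_3^{-1}\{\mathbf{r}_{12}(z, w), \mathbf{P}_3\} = 0. \tag{2.36}$$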

One can check by direct calculations that the matrices $\mathbf{r}$ and $\bar{\mathbf{r}}$ given by (2.20) and (2.33) do solve these equations. Finally, we remark that the fields $A(z)$ and $g(z)$ defined by (2.5) and (2.28) obey Poisson relations (2.1–2.3).

Let us note that the Poisson relation for the generator $\mathbf{W}(z) = \mathbf{T}^{-1}(z)\,\mathbf{U}(z)$ turns out to be the Sklyanin bracket:

$$\frac{1}{\kappa}\{\mathbf{W}_1(z), \mathbf{W}_2(w)\} = [\mathbf{r}_{12}(z, w), \mathbf{W}_1(z)\,\mathbf{W}_2(w)], \tag{2.37}$$

which, therefore, defines the structure of a Poisson–Lie group. This group is an infinite-dimensional analog of the Frobenius group that appeared in [18], where the Poisson–Lie group structure was related to the existence of a non-degenerate two-cocycle on the corresponding Lie algebra. It would be interesting to find a similar interpretation in the infinite-dimensional case.
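The Sklyanin bracket (2.37) can be recovered from (2.19) and (2.27) in a few lines. The following is a sketch of that computation, assuming, as stated above, that $\mathbf{U}$ Poisson commutes with $\mathbf{T}$; arguments $z$, $w$ are suppressed, with index 1 carrying $z$ and index 2 carrying $w$.

```latex
\begin{align*}
\{\mathbf{T}_1^{-1}, \mathbf{T}_2^{-1}\}
  &= \mathbf{T}_1^{-1}\mathbf{T}_2^{-1}\,\{\mathbf{T}_1, \mathbf{T}_2\}\,\mathbf{T}_1^{-1}\mathbf{T}_2^{-1}
   = \kappa\, \mathbf{r}_{12}\, \mathbf{T}_1^{-1}\mathbf{T}_2^{-1}, \\[4pt]
\{\mathbf{W}_1, \mathbf{W}_2\}
  &= \{\mathbf{T}_1^{-1}\mathbf{U}_1,\ \mathbf{T}_2^{-1}\mathbf{U}_2\} \\
  &= \{\mathbf{T}_1^{-1}, \mathbf{T}_2^{-1}\}\,\mathbf{U}_1\mathbf{U}_2
     + \mathbf{T}_1^{-1}\mathbf{T}_2^{-1}\,\{\mathbf{U}_1, \mathbf{U}_2\} \\
  &= \kappa\, \mathbf{r}_{12}\, \mathbf{W}_1\mathbf{W}_2 - \kappa\, \mathbf{W}_1\mathbf{W}_2\, \mathbf{r}_{12}
   = \kappa\,[\mathbf{r}_{12}, \mathbf{W}_1\mathbf{W}_2] .
\end{align*}
% The cross terms vanish because {T, U} = 0, and
% T_1^{-1} T_2^{-1} U_1 U_2 = W_1 W_2 since objects acting in
% different tensor factors commute as matrices.
```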

2.2. Classical L-operator. In this subsection, we define a special function on the cotangent bundle, which we call the classical L-operator. The motivation to treat this function as the L-operator is that the Poisson algebra of $L$ is equivalent to the one found in [9] for the L-operator of the elliptic RS model. Denote by $L$ the following function

$$L(z) = \mathbf{T}^{-1}(z)\,g(z)\,\mathbf{T}(z) = (\det g(z))^{\frac{1}{N}}\;\mathbf{T}^{-1}(z)\,\mathbf{U}(z)\,\mathbf{P}. \tag{2.38}$$

By using the formulas of the previous subsection, one can easily derive the Poisson bracket of $L$ and $Q$:

$$\frac{1}{\kappa}\{Q_1, L_2(z)\} = L_2(z) \sum_i E_{ii} \otimes E_{ii}, \tag{2.39}$$

and the Poisson algebra of the L-operator,

$$\frac{1}{\kappa}\{L_1(z), L_2(w)\} = \mathbf{r}_{12}(z, w)\,L_1(z)L_2(w) + L_1(z)L_2(w)\,\bigl(\bar{\mathbf{r}}_{12}(z) - \bar{\mathbf{r}}_{21}(w) - \mathbf{r}_{12}(z, w)\bigr) + L_1(z)\,\bar{\mathbf{r}}_{21}(w)\,L_2(w) - L_2(w)\,\bar{\mathbf{r}}_{12}(z)\,L_1(z). \tag{2.40}$$

Clearly, the generators $Q$ and $L$ form a Poisson subalgebra in the Poisson algebra of the cotangent bundle. An important feature of this subalgebra is that the $I_n(z) = \mathrm{tr}\,L^n(z)$ form a set of mutually commuting variables. Just as in the finite-dimensional case [18], one can see from (2.39) that the L-operator admits the following factorization: $L(z) = W(z)P$, where $Q$ and $\log P$ are canonically conjugated variables, the $W$-algebra coincides with (2.37), and the bracket of $W$ and $P$ is

$$\frac{1}{\kappa}\{W_1(z), P_2\} = -P_2\,[\bar{\mathbf{r}}_{12}(z), W_1(z)]. \tag{2.41}$$

In fact, everything that we need to quantize the L-operator algebra (2.40) is prepared. The problem of quantization is reduced to finding the quantum $\mathbf{R}$- and $\bar{\mathbf{R}}$-matrices satisfying the quantum analogs of Eqs. (2.23), (2.35), and (2.36),

$$\mathbf{R}_{12}(z_1, z_2)\,\mathbf{R}_{21}(z_2, z_1) = 1, \tag{2.42}$$

$$\mathbf{R}_{12}(z_1, z_2)\,\mathbf{R}_{13}(z_1, z_3)\,\mathbf{R}_{23}(z_2, z_3) = \mathbf{R}_{23}(z_2, z_3)\,\mathbf{R}_{13}(z_1, z_3)\,\mathbf{R}_{12}(z_1, z_2), \tag{2.43}$$

$$\mathbf{R}_{12}(z_1, z_2)\,\bar{\mathbf{R}}_{13}(z_1)\,\bar{\mathbf{R}}_{23}(z_2) = \bar{\mathbf{R}}_{23}(z_2)\,\bar{\mathbf{R}}_{13}(z_1)\,\mathbf{P}_3^{-1}\,\mathbf{R}_{12}(z_1, z_2)\,\mathbf{P}_3, \tag{2.44}$$

$$\bar{\mathbf{R}}_{12}(z)\,\mathbf{P}_2^{-1}\,\bar{\mathbf{R}}_{13}(z)\,\mathbf{P}_2 = \bar{\mathbf{R}}_{13}(z)\,\mathbf{P}_3^{-1}\,\bar{\mathbf{R}}_{12}(z)\,\mathbf{P}_3. \tag{2.45}$$

These matrices are assumed to have the standard behavior near $\hbar = 0$:

$$\mathbf{R} = 1 + \hbar\,\mathbf{r} + o(\hbar), \qquad \bar{\mathbf{R}} = 1 + \hbar\,\bar{\mathbf{r}} + o(\hbar),$$

where $\hbar$ is a quantization parameter. The problem formulated seems to be rather complicated due to the presence of the $s$-matrix in the classical $\mathbf{r}$- and $\bar{\mathbf{r}}$-matrices. However, the Poisson algebra (2.40) possesses an important property that allows one to avoid this problem. Namely, the matrix $s(z)$ entering both $\mathbf{r}$ and $\bar{\mathbf{r}}$ drops out from the r.h.s. of (2.40). Thereby, Eq. (2.40) can be eventually rewritten as

$$\frac{1}{\kappa}\{L_1(z), L_2(w)\} = r_{12}(z, w)\,L_1(z)L_2(w) - L_1(z)L_2(w)\,\bigl(r_{12}(z, w) + \bar r_{21}(w) - \bar r_{12}(z)\bigr) + L_1(z)\,\bar r_{21}(w)\,L_2(w) - L_2(w)\,\bar r_{12}(z)\,L_1(z). \tag{2.46}$$

Moreover, if we denote by $r^F_{12}$ the sum

$$r^F_{12}(z, w) = r_{12}(z, w) + \bar r_{21}(w) - \bar r_{12}(z), \tag{2.47}$$

then using (2.21) and (2.34) we obtain

$$r^F_{12}(z - w) = -\sum_{ij} \Phi(q_{ij})\,E_{ii} \otimes E_{jj} + \sum_{ij} \Phi(z - w, q_{ij})\,E_{ij} \otimes E_{ji}. \tag{2.48}$$
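The passage from (2.47) to (2.48) is a term-by-term cancellation. The sketch below spells it out, using (2.21), (2.34), the convention $\Phi(q_{ii}) = 0$, and the oddness $\Phi(q_{ji}) = -\Phi(q_{ij})$, which follows from the explicit formula $\Phi(q_{ij}) = \zeta(q_{ij}) - 2\zeta(\tfrac12) q_{ij}$. All sums run over $i, j$.

```latex
\begin{align*}
r_{12}(z,w) &= \sum \Phi(q_{ij})\, E_{ii}\otimes E_{jj}
             + \sum \Phi(z-w, q_{ij})\, E_{ij}\otimes E_{ji}
             - \sum \Phi(z, q_{ij})\, E_{ij}\otimes E_{jj}
             + \sum \Phi(w, q_{ij})\, E_{jj}\otimes E_{ij}, \\
\bar r_{21}(w) &= \sum \Phi(q_{ij})\, E_{jj}\otimes E_{ii}
               - \sum \Phi(w, q_{ij})\, E_{jj}\otimes E_{ij}
             = -\sum \Phi(q_{ij})\, E_{ii}\otimes E_{jj}
               - \sum \Phi(w, q_{ij})\, E_{jj}\otimes E_{ij}, \\
-\bar r_{12}(z) &= -\sum \Phi(q_{ij})\, E_{ii}\otimes E_{jj}
                + \sum \Phi(z, q_{ij})\, E_{ij}\otimes E_{jj}.
\end{align*}
% Adding the three lines: the Phi(z, q_ij) and Phi(w, q_ij) terms cancel pairwise,
% the E_ii (x) E_jj terms add up to  -sum Phi(q_ij) E_ii (x) E_jj,
% and the Phi(z-w, q_ij) term survives, which is exactly (2.48).
```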

In this expression one can recognize the elliptic solution to the classical Gervais–Neveu– Felder equation [16, 15]: F F F F F (z1 − z2 ), r13 (z1 − z3 ) + r23 (z2 − z3 )] + [r13 (z1 − z3 ), r23 (z2 − z3 )] [r12 F (z1 −P3−1 {r12

− z2 ), P3 } +

F P2−1 {r13 (z1

− z3 ), P2 } −

F P1−1 {r23 (z2

(2.49)

− z3 ), P1 } = 0.

In fact, $r^F$ emerges as the semiclassical limit of the quantum R-matrix found in [15]. The absence of the $s$-matrix in the resulting L-operator algebra and the appearance of the $r^F$-matrix show that there may exist a closed system of equations involving only the $r$- and $\bar r$-matrices in the classical case, and the $R$- and $\bar R$-matrices in the quantum one. In the next subsection we find the desired system of equations and describe a Poisson structure for which these equations ensure the fulfillment of the Jacobi identity. Note that the algebra (2.46) literally coincides with the one obtained in [20] by using the Hamiltonian reduction procedure. A mere similarity transformation of $L$ turns algebra (2.46) into the one previously found in [9]. In contrast to [9], where (2.46) was derived by direct calculation using the particular form of the L-operator for the RS model, our treatment does not appeal to the particular form of $L$.

2.3. Quadratic Poisson algebra with derivatives. In the first subsection, we obtained matrices, containing the $s$-matrix, that obey the system of equations (2.23), (2.35) and (2.36). Clearly, these equations are not satisfied when the $s$-independent matrices $r$ and $\bar r$ entering the L-operator algebra are substituted for them. However, computing the l.h.s. of these equations after this substitution we arrive at the surprisingly simple result:
\[ \begin{aligned} &[r_{12}(z_1,z_2),\, r_{13}(z_1,z_3) + r_{23}(z_2,z_3)] + [r_{13}(z_1,z_3),\, r_{23}(z_2,z_3)] \\ &\quad = -(\partial_1+\partial_2)\, r_{12}(z_1,z_2) + (\partial_1+\partial_3)\, r_{13}(z_1,z_3) - (\partial_2+\partial_3)\, r_{23}(z_2,z_3), \end{aligned} \tag{2.50} \]
\[ [\bar r_{12}(z), \bar r_{13}(z)] - P_3^{-1}\{\bar r_{12}(z), P_3\} + P_2^{-1}\{\bar r_{13}(z), P_2\} = -\partial\bigl(\bar r_{12}(z) - \bar r_{13}(z)\bigr), \tag{2.51} \]
and
\[ \begin{aligned} &[r_{12}(z_1,z_2),\, \bar r_{13}(z_1) + \bar r_{23}(z_2)] + [\bar r_{13}(z_1),\, \bar r_{23}(z_2)] - P_3^{-1}\{r_{12}(z_1,z_2), P_3\} \\ &\quad = -(\partial_1+\partial_2)\, r_{12}(z_1,z_2) + \partial_1 \bar r_{13}(z_1) - \partial_2 \bar r_{23}(z_2). \end{aligned} \tag{2.52} \]
Here $\partial = \partial/\partial x$, where $x = \mathrm{Re}\, z$. Note that Eqs. (2.35) and (2.36) are formulated with the help of P. However, since all the matrices depend only on the difference $q_{ij} = q_i - q_j$, we simply replace P by P.
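As a consistency check of the derivative terms in (2.50), one can use the elementary commutator $[f, \partial_a] = -\partial_a f$ for a matrix function $f$ of the spectral parameters. The computation below is only a sketch in our own shorthand ($D_{ab}$ is not a symbol of the paper); it rearranges (2.50) and assumes nothing beyond it.

```latex
% Shorthand (ours): r_{ab} = r_{ab}(z_a, z_b),  D_{ab} = r_{ab} - \partial_a + \partial_b.
\begin{align*}
[D_{12}, D_{13}] + [D_{12}, D_{23}] + [D_{13}, D_{23}]
 &= [r_{12},\, r_{13} + r_{23}] + [r_{13},\, r_{23}] \\
 &\quad + (\partial_1 + \partial_2)\, r_{12} - (\partial_1 + \partial_3)\, r_{13} + (\partial_2 + \partial_3)\, r_{23}.
\end{align*}
% Cross terms used above:
%   [r_{12}, -\partial_1 + \partial_3] =  \partial_1 r_{12},   [-\partial_1 + \partial_2, r_{13}] = -\partial_1 r_{13},
%   [r_{12}, -\partial_2 + \partial_3] =  \partial_2 r_{12},   [-\partial_1 + \partial_2, r_{23}] =  \partial_2 r_{23},
%   [r_{13}, -\partial_2 + \partial_3] = -\partial_3 r_{13},   [-\partial_1 + \partial_3, r_{23}] =  \partial_3 r_{23}.
```

Hence the left-hand side vanishes exactly when (2.50) holds, so (2.50) has the form of a classical Yang–Baxter equation for the first-order operators $r_{ab} - \partial_a + \partial_b$.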

R-Matrix Quantization of Elliptic Ruijsenaars–Schneider Model


Comparing Eqs. (2.50)–(2.52) with (2.23), (2.35), and (2.36), we come to the conclusion that the $s(z)$-matrix entering the matrices of the first subsection effectively plays the role of the derivative with respect to the spectral parameter. It is worth mentioning that Eqs. (2.50)–(2.52) can be rewritten in the same form as Eqs. (2.23), (2.35), and (2.36) if we replace the matrices there by $r_{12} - \partial_1 + \partial_2$ and $\bar r_{12} - \partial_1$. In particular, for (2.50), we have
\[ [r_{12} - \partial_1 + \partial_2,\, r_{13} - \partial_1 + \partial_3] + [r_{13} - \partial_1 + \partial_3,\, r_{23} - \partial_2 + \partial_3] + [r_{12} - \partial_1 + \partial_2,\, r_{23} - \partial_2 + \partial_3] = 0. \tag{2.53} \]
Thus, $r_{12} - \partial_1 + \partial_2$ is a matrix first-order differential operator satisfying the standard classical Yang–Baxter equation. Using this fact we write down the Poisson algebra generated by the fields $T(z)$, $U(z)$, $Q$ and $P$, having Eqs. (2.50)–(2.52) as the consistency conditions:
\[ \frac{1}{\kappa}\{T_1(z), T_2(w)\} = T_1(z)T_2(w)\, r_{12}(z,w) + T_1' T_2 - T_1 T_2', \tag{2.54} \]
\[ \frac{1}{\kappa}\{U_1(z), U_2(w)\} = -U_1(z)U_2(w)\, r_{12}(z,w) - U_1' U_2 + U_1 U_2', \tag{2.55} \]
\[ \frac{1}{\kappa}\{T_1(z), P_2\} = P_2 T_1(z)\, \bar r_{12}(z) + P_2 T_1'(z), \tag{2.56} \]
\[ \frac{1}{\kappa}\{U_1(z), P_2\} = P_2 U_1(z)\, \bar r_{12}(z) + P_2 U_1'(z), \tag{2.57} \]
\[ \{Q_1, P_2\} = P_2 \sum_i E_{ii}\otimes E_{ii}, \qquad \{Q_1, T_2\} = \{Q_1, U_2\} = 0, \tag{2.58} \]

where $T' = \partial T$. It is worth mentioning that the Poisson structure (2.54)–(2.58) is not compatible with the boundary condition $T(0)e = \lambda e$. Let us note that there exists a Poisson subalgebra of the Poisson algebra (2.54)–(2.58), formed by the generators
\[ A(z) = T(z) D T^{-1}(z) - k\, \bar\partial T(z)\, T^{-1}(z), \qquad g(z) = U(z) P T^{-1}(z), \]
that coincides with the Poisson algebra of the cotangent bundle with the central charge $\alpha = 0$. Defining the L-operator as $L(z) = T^{-1}(z) g(z) T(z) = T^{-1}(z) U(z) P$, we get for $L$ the algebra (2.46) obtained previously. As in the previous subsection, the commutativity of $I_n(z)$ follows again from that of $g(z)$. The main advantage of the Poisson algebra (2.54)–(2.58) is that it can be easily quantized.

3. Quantization

3.1. Quantum R-matrices. In this section, following the ideology of the Quantum Inverse Scattering Method [24, 25], we quantize the classical $r$- and $\bar r$-matrices and derive the quantum L-operator algebra. We start with quantization of the relations (2.50)–(2.52). Let $T(z)$, $U(z)$ be matrix generating functions that are formal Fourier series in the variables $x$ and $y$:
\[ T(z) = \sum_{mn} T_{mn}\, e^{2\pi i(mx+ny)}, \qquad U(z) = \sum_{mn} U_{mn}\, e^{2\pi i(mx+ny)}, \]


G.E. Arutyunov, L.O. Chekhov, S.A. Frolov

where $z = x + \tau y$. Denote by A a free associative unital algebra over the field $\mathbb{C}$ generated by matrix elements of the Fourier modes of $T(z)$, $U(z)$, and by the entries of the diagonal matrices $P$ and $Q$ modulo the relations
\[ T_1(z)\, T_2(w-\hbar) = T_2(w)\, T_1(z-\hbar)\, R_{12}(-\hbar, z, w), \tag{3.1} \]
\[ U_1(z)\, U_2(w+\hbar) = U_2(w)\, U_1(z+\hbar)\, R_{12}(\hbar, z, w), \tag{3.2} \]
\[ T_1(z+\hbar)\, P_2\, \bar R_{12}(\hbar, z) = P_2\, T_1(z), \tag{3.3} \]
\[ U_1(z+\hbar)\, P_2\, \bar R_{12}(\hbar, z) = P_2\, U_1(z), \tag{3.4} \]
\[ [Q_1, P_2] = -\hbar\, P_2 \sum_i E_{ii}\otimes E_{ii}, \tag{3.5} \]
\[ [T_1(z), U_2(w)] = [T_1(z), Q_2] = [U_1(z), Q_2] = [P_1, P_2] = [Q_1, Q_2] = 0. \]
Here $R(\hbar, z, w)$ and $\bar R_{12}(\hbar, z)$ are double-periodic matrix functions of the spectral parameters. These functions also depend on the coordinates $q_i$ and have the following semiclassical behavior at $\hbar = 0$:
\[ R = 1 + \hbar r + o(\hbar), \qquad \bar R = 1 + \hbar \bar r + o(\hbar). \tag{3.6} \]
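To see how (3.6) ties the quantum relations to the classical brackets, one can expand relation (3.1) to first order in $\hbar$. The following is a sketch under stated assumptions: that $R_{12}(-\hbar, z, w) = 1 - \hbar\, r_{12}(z,w) + o(\hbar)$, that the generators admit a Taylor expansion in the shifted arguments, and that products may be reordered at leading order.

```latex
% Arguments suppressed: T_1 = T_1(z), T_2 = T_2(w), primes denote \partial.
\begin{align*}
T_1(z)\, T_2(w-\hbar) &= T_1 T_2 - \hbar\, T_1 T_2' + o(\hbar), \\
T_2(w)\, T_1(z-\hbar)\, R_{12}(-\hbar, z, w) &= T_2 T_1 - \hbar\, T_2 T_1' - \hbar\, T_2 T_1\, r_{12}(z,w) + o(\hbar), \\
\frac{1}{\hbar}\,[T_1(z), T_2(w)] &= -\bigl( T_1 T_2\, r_{12}(z,w) + T_1' T_2 - T_1 T_2' \bigr) + o(1).
\end{align*}
```

The expression in parentheses is the right-hand side of (2.54), so at leading order the commutator divided by $\hbar$ reproduces $-\frac{1}{\kappa}\{T_1(z), T_2(w)\}$.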

The next step is to find the conditions on $R$ and $\bar R$ that ensure the consistency of the defining relations for A. In the sequel we often use $R(z, w)$ as a shorthand notation for $R(\hbar, z, w)$. First, we write down the compatibility condition for algebra (3.1) or (3.2), which reduces to the Quantum Yang–Baxter equation with spectral parameters shifted by $\hbar$,
\[ R_{12}(z,w)\, R_{13}(z-\hbar, s-\hbar)\, R_{23}(w,s) = R_{23}(w-\hbar, s-\hbar)\, R_{13}(z,s)\, R_{12}(z-\hbar, w-\hbar). \tag{3.7} \]
Analogously to the classical case, one can introduce the following matrix differential operator $\mathcal{R}(z,w) = e^{\hbar \frac{\partial}{\partial w}} R(z,w)\, e^{-\hbar \frac{\partial}{\partial z}}$, in terms of which Eq. (3.7) reads as the standard Quantum Yang–Baxter equation:
\[ \mathcal{R}_{12}(z,w)\, \mathcal{R}_{13}(z,s)\, \mathcal{R}_{23}(w,s) = \mathcal{R}_{23}(w,s)\, \mathcal{R}_{13}(z,s)\, \mathcal{R}_{12}(z,w). \]
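The equivalence of (3.7) and (3.8) can be checked with the shift rule $e^{a\partial_x} f(x) = f(x+a)\, e^{a\partial_x}$. In the sketch below, $\mathcal{R}_{ab} = e^{\hbar\partial_b} R_{ab}\, e^{-\hbar\partial_a}$ denotes the differential operator (the calligraphic letter is our notation), $\partial_1, \partial_2, \partial_3$ are the derivatives acting on $z, w, s$, and $\hbar$ is assumed real so that the exponentials shift the spectral parameters by $\hbar$.

```latex
% Move every shift operator to the right, shifting the arguments it passes through.
\begin{align*}
\mathcal{R}_{12}\, \mathcal{R}_{13}\, \mathcal{R}_{23}
 &= R_{12}(z, w+\hbar)\, R_{13}(z-\hbar, s+\hbar)\, R_{23}(w+\hbar, s+2\hbar)\;
    e^{-2\hbar\partial_1 + 2\hbar\partial_3}, \\
\mathcal{R}_{23}\, \mathcal{R}_{13}\, \mathcal{R}_{12}
 &= R_{23}(w, s+\hbar)\, R_{13}(z, s+2\hbar)\, R_{12}(z-\hbar, w)\;
    e^{-2\hbar\partial_1 + 2\hbar\partial_3}.
\end{align*}
% Equating the two lines and substituting  w -> w - \hbar,  s -> s - 2\hbar  gives
%   R_{12}(z,w) R_{13}(z-\hbar, s-\hbar) R_{23}(w,s)
%     = R_{23}(w-\hbar, s-\hbar) R_{13}(z,s) R_{12}(z-\hbar, w-\hbar),
% which is (3.7).
```

Expanding $\mathcal{R}_{12} = 1 + \hbar\,(r_{12} - \partial_1 + \partial_2) + o(\hbar)$ recovers the classical operator of (2.53).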

Relation (3.1) also requires the fulfillment of the "unitarity" condition for $R$,
\[ R_{12}(z,w)\, R_{21}(w,z) = 1. \tag{3.9} \]
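At first order in $\hbar$ the unitarity condition constrains the classical $r$-matrix. A minimal worked expansion, using only the semiclassical form (3.6):

```latex
\begin{align*}
R_{12}(z,w)\, R_{21}(w,z)
 &= \bigl(1 + \hbar\, r_{12}(z,w)\bigr)\bigl(1 + \hbar\, r_{21}(w,z)\bigr) + o(\hbar) \\
 &= 1 + \hbar\,\bigl(r_{12}(z,w) + r_{21}(w,z)\bigr) + o(\hbar),
\end{align*}
% so (3.9) forces  r_{12}(z,w) = -r_{21}(w,z),  i.e. r is skew-symmetric.
```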

Analogously, we find the following compatibility conditions for (3.3):
\[ P_3^{-1} \bar R_{12}(z) P_3\, \bar R_{13}(z-\hbar) = P_2^{-1} \bar R_{13}(z) P_2\, \bar R_{12}(z-\hbar) \tag{3.10} \]
and
\[ P_3^{-1} R_{12}(z,w) P_3\, \bar R_{13}(z-\hbar)\, \bar R_{23}(w) = \bar R_{23}(w-\hbar)\, \bar R_{13}(z)\, R_{12}(z-\hbar, w-\hbar). \tag{3.11} \]
Now taking into account (3.6) one can easily see that in the semiclassical limit
\[ -\frac{1}{\kappa}\{\cdot,\cdot\} = \lim_{\hbar\to 0} \frac{1}{\hbar}[\cdot,\cdot], \]
relations (3.1)–(3.5) determine the Poisson structure (2.54)–(2.58), while Eqs. (3.7), (3.10), and (3.11) turn into (2.50), (2.51), and (2.52), respectively, in order $\hbar^2$. In the first order in $\hbar$, the unitarity condition (3.9) requires $r$ to be skew-symmetric. Hence, the algebra A with


defining relations (3.1)–(3.5), where $R$ and $\bar R$ are the solutions of (3.7)–(3.11) obeying (3.6), is a quantization of the Poisson structure (2.54)–(2.58). Now we are in a position to find the matrices $R$ and $\bar R$ explicitly. We start with the $R$-matrix, for which we assume the following natural ansatz:
\[ \begin{aligned} f\, R(\hbar, z, w) ={}& \sum_{ij} \Phi(\hbar_1, q_{ij} + \hbar_2)\, E_{ii}\otimes E_{jj} + \sum_{ij} \Phi(z - w + \hbar_3, q_{ij} + \hbar_4)\, E_{ij}\otimes E_{ji} \\ &- \sum_{ij} \Phi(z + \hbar_5, q_{ij} + \hbar_6)\, E_{ij}\otimes E_{jj} + \sum_{ij} \Phi(w + \hbar_7, q_{ij} + \hbar_8)\, E_{jj}\otimes E_{ij}. \end{aligned} \tag{3.12} \]

This form is compatible with the structure of the classical r-matrix. Here $\hbar_1, \ldots, \hbar_8$ are arbitrary parameters that should be specified by Eqs. (3.7) and (3.9), and $f$ is a scalar function that may depend only on $\hbar_i$ and the spectral parameters. It turns out that the parameters $\hbar_i$ are almost uniquely fixed by the unitarity condition (3.9). Substituting (3.12) into (3.9) and using the elliptic function identities we obtain
\[ \hbar_2 = \hbar_3 = \hbar_4 = \hbar_6 = \hbar_8 = 0, \qquad \hbar_5 = \hbar_1 + \hbar_7, \qquad f^2(z,w) = \wp(\hbar_1) - \wp(z-w), \]
where $\wp(z)$ is the Weierstrass $\wp$-function. Now it is a matter of direct calculation to check that Eq. (3.7) holds for $\hbar_1 = \hbar$. The remaining parameter $\hbar_7$ is inessential since it corresponds to an arbitrary common shift of the spectral parameters $z$ and $w$. In the sequel, we choose $\hbar_7 = 0$. Therefore, the obtained solution to (3.7) and (3.6) reads as follows:
\[ \begin{aligned} f(z,w)\, R(\hbar, z, w) ={}& \sum_{ij} \Phi(\hbar, q_{ij})\, E_{ii}\otimes E_{jj} + \sum_{ij} \Phi(z-w, q_{ij})\, E_{ij}\otimes E_{ji} \\ &- \sum_{ij} \Phi(z+\hbar, q_{ij})\, E_{ij}\otimes E_{jj} + \sum_{ij} \Phi(w, q_{ij})\, E_{jj}\otimes E_{ij}, \end{aligned} \tag{3.13} \]
where $f(z,w) = \sqrt{\wp(\hbar) - \wp(z-w)}$. One must be careful in the definition of $R(-\hbar, z, w)$. This matrix is defined by (3.13) with the replacement $\hbar \to -\hbar$ and $f \to -f$. Therefore, $R(\hbar, z, w)$ and $R(-\hbar, z, w)$ are related as
\[ R_{12}(-\hbar, z, w) = R_{21}(\hbar, w-\hbar, z-\hbar). \tag{3.14} \]

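To find the $\bar R$-matrix, we adopt the following ansatz:
\[ \frac{1}{\sigma(\hbar)}\, \bar R_{12}(\hbar, z) = \sum_{ij} \Phi(\hbar_1, q_{ij} + \hbar_2)\, E_{ii}\otimes E_{jj} - \sum_{ij} \Phi(z + \hbar_3, q_{ij} + \hbar_4 + \delta_{ij}\hbar_5)\, E_{ij}\otimes E_{jj}. \tag{3.15} \]
It has almost the same matrix structure as the classical $\bar r$-matrix. Since Eq. (3.10) is easier to deal with than Eq. (3.11), we first substitute (3.15) into Eq. (3.10), thus obtaining $\bar R$:
\[ \frac{1}{\sigma(\hbar)}\, \bar R_{12}(\hbar, z) = \sum_{i\neq j} \Phi(\hbar, q_{ij})\, E_{ii}\otimes E_{jj} - \sum_{i\neq j} \Phi(z + \hbar_3, q_{ij})\, E_{ij}\otimes E_{jj} - \Phi(z + \hbar_3, -\hbar) \sum_i E_{ii}\otimes E_{ii}, \tag{3.16} \]
where $\hbar_3$ remains unfixed.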


Equation (3.11) involves both the $R$- and $\bar R$-matrices and is independent of (3.7) and (3.10). One can verify by direct calculation that $R$ and $\bar R$ given by Eqs. (3.13) and (3.16) also satisfy (3.11) as soon as $\hbar_3 = \hbar$. One can easily check that in the case of real $\hbar$, the matrices $R$ and $\bar R$ have the proper semiclassical behavior (3.6). In what follows we also need the $\bar R^{-1}$-matrix,
\[ \frac{1}{\sigma(\hbar)}\, \bar R^{-1}_{12}(\hbar, z) = -\sum_{ij} \Phi(-\hbar, q_{ij} + \hbar)\, E_{ii}\otimes E_{jj} + \sum_{ij} \Phi(z, q_{ij} + \hbar)\, E_{ij}\otimes E_{jj}. \tag{3.17} \]
It is of interest to mention that, just as in the rational case without the spectral parameter [18], one can introduce the formal variable $W(z) = T^{-1}(z) U(z)$ with permutation relations following from (3.1)–(3.5):
\[ R_{12}(z,w)\, W_1(z)\, W_2(w+\hbar) = W_2(w)\, W_1(z+\hbar)\, R_{12}(z,w), \tag{3.18} \]
\[ W_1(z+\hbar)\, P_2\, \bar R_{12}(z) = P_2\, \bar R_{12}(z)\, W_1(z). \tag{3.19} \]

In analogy with the rational case, it is natural to treat Eq. (3.18) as the defining relation of the quantum elliptic Frobenius group.

3.2. Quantum L-operator algebra. Just as in the classical case, we introduce a new variable:
\[ L(z) = T^{-1}(z)\, U(z)\, P = W(z) P, \tag{3.20} \]
which we call a quantum L-operator. Using the relations of the algebra A one can formally derive the following algebraic relations satisfied by the quantum L-operator:
\[ [Q_1, L_2(z)] = -\hbar\, L_2(z) \sum_i E_{ii}\otimes E_{ii}, \tag{3.21} \]
\[ R_{12}(z,w)\, L_1(z)\, \bar R_{21}(w)\, L_2(w) = L_2(w)\, \bar R_{12}(z)\, L_1(z)\, \bar R_{21}(w-\hbar)\, R_{12}(z-\hbar, w-\hbar)\, \bar R^{-1}_{12}(z-\hbar). \tag{3.22} \]

In spite of the fact that $L$ has the form $L(z) = W(z) P$, we cannot reconstruct from Eqs. (3.21) and (3.22) the relations (3.18) and (3.19) for $W$ and $P$. So, in the sequel, we do not assume any relations on $W$ and $P$. Let us define
\[ R^F_{12}(z,w) = \bar R_{21}(w)\, R_{12}(z,w)\, \bar R^{-1}_{12}(z). \tag{3.23} \]
Then, by using the explicit form of the $R$- and $\bar R$-matrices and elliptic function identities, we obtain
\[ f\, R^F_{12}(z-w) = -\sum_{i\neq j} \Phi(-\hbar, q_{ij})\, E_{ii}\otimes E_{jj} + \sum_{i\neq j} \Phi(z-w, q_{ij})\, E_{ij}\otimes E_{ji} + \Phi(z-w, \hbar) \sum_i E_{ii}\otimes E_{ii}, \tag{3.24} \]

which is nothing but the R-matrix by Felder [15], i.e., an elliptic solution to the quantum Gervais–Neveu–Felder equation [16, 15]:


\[ P_1^{-1} R^F_{23}(w-s) P_1\, R^F_{13}(z-s)\, P_3^{-1} R^F_{12}(z-w) P_3 = R^F_{12}(z-w)\, P_2^{-1} R^F_{13}(z-s) P_2\, R^F_{23}(w-s). \tag{3.25} \]
Recall that one feature of $R^F$ is the "weight zero" condition:
\[ [P_1 P_2, R^F_{12}(z-w)] = 0. \tag{3.26} \]
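The weight-zero condition can be checked term by term on an expression with the matrix structure of (3.24). The sketch below assumes that the entries of $P$ act as shifts, $P_k\, c(q) = c(q + \hbar e_k)\, P_k$, which is how we read relation (3.5); here $e_k$ (the $k$-th unit vector) and the generic coefficient $c(q)$ are our notation.

```latex
% P_1 P_2 = \sum_{a,b} P_a P_b \, E_{aa} \otimes E_{bb};  c(q) is any coefficient function.
\begin{align*}
[P_1 P_2,\; c(q)\, E_{ii}\otimes E_{jj}]
  &= \bigl(c(q + \hbar e_i + \hbar e_j) - c(q)\bigr)\, P_i P_j\, E_{ii}\otimes E_{jj}, \\
[P_1 P_2,\; c(q)\, E_{ij}\otimes E_{ji}]
  &= \bigl(c(q + \hbar e_i + \hbar e_j) - c(q)\bigr)\, P_i P_j\, E_{ij}\otimes E_{ji}.
\end{align*}
% Both commutators vanish when c depends on q only through q_{ij} = q_i - q_j
% (invariant under the simultaneous shift of q_i and q_j) or does not depend on q at all.
```

Every coefficient in (3.24) is of one of these two types, since $f$ does not depend on $q$.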

Developing $R^F$ in powers of $\hbar$, we have $R^F = 1 + \hbar r^F + o(\hbar)$, where $r^F$ is given by (2.48). Let us stress that in our consideration $R^F$ arises from the explicit form of $R$ and $\bar R$, and that the Gervais–Neveu–Felder equation does not follow from system (3.7)–(3.11). Formula (3.23) shows that the matrix $\bar R$ plays the role of the twist, which transforms the matrix $R(z,w)$, a particular solution of (3.7), into a solution of (3.25) that depends only on the difference $z - w$. Thus, the quantum L-operator algebra (3.22) can be presented in the following form:
\[ R_{12}(z,w)\, L_1(z)\, \bar R_{21}(w)\, L_2(w) = L_2(w)\, \bar R_{12}(z)\, L_1(z)\, R^F_{12}(z-w). \tag{3.27} \]

The quantum L-operator algebra seems to be automatically compatible as A is compatible. However, a simple analysis shows that A and the algebra (3.27) admit different supplies of representations. In particular, the simplest representation for $L$ we present below does not realize the algebra (3.18), (3.19). Therefore, we find it necessary to give a direct proof of the compatibility of (3.27). In this way, we come across Eq. (3.25) and discover a new relation involving $R^F$ and $\bar R$. To this end, let us multiply both sides of (3.27) by $P_2^{-1} \bar R_{31}(s+\hbar) P_2\, \bar R_{32}(s) L_3(s)$ and, subsequently using Eq. (3.27), transform the string $L_1 \cdots L_2 \cdots L_3$ into $L_3 \cdots L_2 \cdots L_1$. For the l.h.s., we have
\[ \begin{aligned} &R_{12}(z,w)\, L_1 \bar R_{21}(w) L_2\, P_2^{-1} \bar R_{31}(s+\hbar) P_2\, \bar R_{32}(s) L_3 \\ &\quad = R_{12}(z,w)\, L_1 \bar R_{21}(w)\, \bar R_{31}(s+\hbar)\, L_2 \bar R_{32}(s) L_3 \\ &\quad = R_{12}(z,w)\, L_1 \bar R_{21}(w)\, \bar R_{31}(s+\hbar)\, R_{32}(s,w)\, L_3 \bar R_{23}(w) L_2\, R^F_{23}(w-s) \\ &\quad = R_{12}(z,w)\, R_{32}(s+\hbar, w+\hbar)\, L_1 \bar R_{31}(s) L_3\, P_3^{-1} \bar R_{21}(w+\hbar) P_3\, \bar R_{23}(w) L_2\, R^F_{23}(w-s) \\ &\quad = R_{12}(z,w)\, R_{32}(s+\hbar, w+\hbar)\, R_{31}(s,z)\, L_3 \bar R_{13}(z) L_1 \\ &\qquad \times R^F_{13}(z-s)\, P_3^{-1} \bar R_{21}(w+\hbar) P_3\, \bar R_{23}(w) L_2\, R^F_{23}(w-s). \end{aligned} \tag{3.28} \]
At this point, we interrupt the chain of calculations by remarking that the next step requires pushing $R^F$ through $P_3^{-1} \bar R_{21}(w+\hbar) P_3\, \bar R_{23}(w)$. This can be done by virtue of the following new relation involving $R^F$ and $\bar R$:
\[ R^F_{12}(z)\, P_2^{-1} \bar R_{31}(w) P_2\, \bar R_{32}(w-\hbar) = P_1^{-1} \bar R_{32}(w) P_1\, \bar R_{31}(w-\hbar)\, R^F_{12}(z), \tag{3.29} \]
which can be checked directly by using the explicit forms (3.24) and (3.16) of $R^F$ and $\bar R$ respectively. Now we continue the calculation of (3.28) with the relation (3.29) at hand:
\[ \begin{aligned} &R_{12}(z,w)\, R_{32}(s+\hbar, w+\hbar)\, R_{31}(s,z)\, L_3 \bar R_{13}(z)\, \bar R_{23}(w+\hbar) \\ &\qquad \times L_1 \bar R_{21}(w) L_2\, P_2^{-1} R^F_{13}(z-s) P_2\, R^F_{23}(w-s). \end{aligned} \]
As to the r.h.s., the same technique yields


\[ \begin{aligned} &L_2 \bar R_{12}(z) L_1 R^F_{12}(z-w)\, P_2^{-1} \bar R_{31}(s+\hbar) P_2\, \bar R_{32}(s) L_3 \\ &\quad = L_2 \bar R_{12}(z)\, \bar R_{32}(s+\hbar)\, L_1 \bar R_{31}(s) L_3\, P_3^{-1} R^F_{12}(z-w) P_3 \\ &\quad = L_2 \bar R_{12}(z)\, \bar R_{32}(s+\hbar)\, R_{31}(s,z)\, L_3 \bar R_{13}(z) L_1\, R^F_{13}(z-s)\, P_3^{-1} R^F_{12}(z-w) P_3 \\ &\quad = R_{31}(s+\hbar, z+\hbar)\, L_2 \bar R_{32}(s) L_3\, P_3^{-1} \bar R_{12}(z+\hbar) P_3 \\ &\qquad \times \bar R_{13}(z) L_1\, R^F_{13}(z-s)\, P_3^{-1} R^F_{12}(z-w) P_3 \\ &\quad = R_{31}(s+\hbar, z+\hbar)\, R_{32}(s,w)\, L_3 \bar R_{23}(w)\, \bar R_{13}(z+\hbar)\, L_2 \bar R_{12}(z) L_1 \\ &\qquad \times P_1^{-1} R^F_{23}(w-s) P_1\, R^F_{13}(z-s)\, P_3^{-1} R^F_{12}(z-w) P_3 \\ &\quad = R_{31}(s+\hbar, z+\hbar)\, R_{32}(s,w)\, R_{12}(z+\hbar, w+\hbar)\, L_3 \bar R_{13}(z)\, \bar R_{23}(w+\hbar)\, L_1 \\ &\qquad \times \bar R_{21}(w) L_2\, R^F_{21}(w-z)\, P_1^{-1} R^F_{23}(w-s) P_1\, R^F_{13}(z-s)\, P_3^{-1} R^F_{12}(z-w) P_3. \end{aligned} \tag{3.30} \]
Therefore, comparing the resulting expressions we conclude that the compatibility condition for the L-operator algebra (3.27) reduces to four equations: (3.7), (3.11), (3.25), and (3.29).

The existence of the Poisson commuting functions $I_n(z)$ in the classical case implies that a commuting family should exist in the quantum case as well. It should be an intrinsic property of the algebra (3.27) itself, without reference to the explicit form of its representations. Let us demonstrate the commutativity of the simplest quantities $\mathrm{tr}\, L(z)$ and $\mathrm{tr}\, L^{-1}(z)$, postponing the discussion of the general case to the next section. To this end, we need one more relation involving the matrices $R^F$, $R$ and $\bar R$. In analogy with the rational case, it is useful to introduce the variable $g(z) = U(z) P T^{-1}(z)$. Calculation of the commutator $[g_1(z), g_2(w)]$ with the help of the defining relations of A results in
\[ \begin{aligned} [g_1(z), g_2(w)] = U_2(w) U_1(z+\hbar) \bigl( & R_{12}(\hbar, z, w)\, P_1 \bar R_{21}(w)\, P_2 \bar R_{12}(z-\hbar)\, R_{12}(-\hbar, z, w) \\ & - P_2 \bar R_{12}(z)\, P_1 \bar R_{21}(w-\hbar) \bigr)\, T_2^{-1}(w-\hbar)\, T_1^{-1}(z). \end{aligned} \]
When the spectral parameter is absent, the algebra A allows one to establish a connection with the quantum cotangent bundle (see [18] for details). Then, in particular, the quantity $[g_1, g_2]$ is equal to zero. In the case at hand, we cannot construct a subalgebra of A that is isomorphic to the quantum cotangent bundle. However, one can note that in the elliptic case, the commutativity of $g(z)$ with $g(w)$ follows from the identity
\[ R_{12}(\hbar, z, w)\, P_1 \bar R_{21}(w)\, P_2 \bar R_{12}(z-\hbar)\, R_{12}(-\hbar, z, w) = P_2 \bar R_{12}(z)\, P_1 \bar R_{21}(w-\hbar). \]
Using the definition of $R^F$, Eq. (3.14), and the "weight zero" condition (3.26), the last formula can be written in the following elegant form:
\[ R_{12}(z,w) = P_2 \bar R_{12}(z) P_2^{-1}\, R^F_{12}(z-w)\, P_1 \bar R^{-1}_{21}(w) P_1^{-1}. \tag{3.31} \]
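The passage from the commutativity identity for $g(z)$ to (3.31) can be spelled out as follows. This is a sketch under one stated assumption: we read definition (3.23) as $R^F_{12}(z-w) = \bar R_{21}(w)\, R_{12}(z,w)\, \bar R^{-1}_{12}(z)$, which is the reading consistent with (2.47) in the semiclassical limit.

```latex
% Step 1: by (3.14) and unitarity (3.9),
%   R_{12}(-\hbar,z,w) = R_{21}(\hbar,w-\hbar,z-\hbar) = R_{12}(\hbar,z-\hbar,w-\hbar)^{-1},
% so the identity can be multiplied from the right by R_{12}(z-\hbar,w-\hbar):
\begin{align*}
R_{12}(z,w)\, P_1 \bar R_{21}(w)\, P_2 \bar R_{12}(z-\hbar)
 &= P_2 \bar R_{12}(z)\, P_1\, \bar R_{21}(w-\hbar)\, R_{12}(z-\hbar, w-\hbar) \\
 &= P_2 \bar R_{12}(z)\, P_1\, R^F_{12}(z-w)\, \bar R_{12}(z-\hbar).
\end{align*}
% Step 2: cancel \bar R_{12}(z-\hbar) on the right and solve for R_{12}(z,w):
\begin{align*}
R_{12}(z,w)
 &= P_2 \bar R_{12}(z)\, P_1 R^F_{12}(z-w) P_2^{-1}\, \bar R^{-1}_{21}(w)\, P_1^{-1} \\
 &= P_2 \bar R_{12}(z) P_2^{-1}\, R^F_{12}(z-w)\, P_1 \bar R^{-1}_{21}(w) P_1^{-1}.
\end{align*}
% The last equality uses (3.26): since P_1 P_2 commutes with R^F_{12},
%   P_1 R^F_{12} P_2^{-1} = P_2^{-1} R^F_{12} P_1.
```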

Identity (3.31) plays the primary role in proving the commutativity of the family $\mathrm{tr}\, L(z)$. To prove the commutativity, let us multiply both sides of
\[ L_2(w)\, \bar R_{12}(z)\, L_1(z) = R_{12}(z,w)\, L_1(z)\, \bar R_{21}(w)\, L_2(w)\, R^F_{21}(w-z) \]
by $P_2 \bar R^{-1}_{12}(z) P_2^{-1}$ and take the trace in the first and the second matrix spaces. We get
\[ \mathrm{tr}_{12}\, P_2 \bar R^{-1}_{12}(z) P_2^{-1}\, L_2(w)\, \bar R_{12}(z)\, L_1(z) = \mathrm{tr}_{12}\, P_2 \bar R^{-1}_{12}(z) P_2^{-1}\, R_{12}(z,w)\, L_1(z)\, \bar R_{21}(w)\, L_2(w)\, R^F_{21}(w-z). \tag{3.32} \]


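It is useful to write $\bar R^{-1}_{12}$ in the factorized form
\[ \bar R^{-1}_{12}(z) = \sum_{ij} \bar R^{-1}_{ij} \otimes E_{jj}, \tag{3.33} \]
where
\[ \bar R^{-1}_{ij} = -\sigma(\hbar)\, \Phi(-\hbar, q_{ij} + \hbar)\, E_{ii} + \sigma(\hbar)\, \Phi(z, q_{ij} + \hbar)\, E_{ij}. \]
Then the l.h.s. of (3.32) reads as
\[ \sum_{ij} \mathrm{tr}_{12}\, (P_j \bar R^{-1}_{ij} P_j^{-1} \otimes E_{jj})\, L_2\, \bar R_{12}(z)\, L_1. \]
Taking into account that $\bar R_{12}$ is diagonal in the second matrix space and using the cyclic property of the trace, we obtain
\[ \sum_{ij} \mathrm{tr}_{12}\, (P_j \bar R^{-1}_{ij} P_j^{-1} \otimes I)(I \otimes L E_{jj})\, \bar R_{12}(z)\, L_1. \]
Since $L = WP$, where all entries of $W$ commute with $q_i$, we arrive at
\[ \sum_{ij} \mathrm{tr}_{12}\, (P_j \bar R^{-1}_{ij} P_j^{-1} \otimes I)(I \otimes W P_j E_{jj})\, \bar R_{12}(z)\, L_1 = \sum_{ij} \mathrm{tr}_{12}\, L_2\, (\bar R^{-1}_{ij} \otimes E_{jj})\, \bar R_{12}(z)\, L_1 = \mathrm{tr}\, L(w)\; \mathrm{tr}\, L(z). \]
As to the r.h.s. of (3.32), we use identity (3.31) to rewrite it as
\[ \mathrm{tr}_{12}\, R^F_{12}(z-w)\, P_1 \bar R^{-1}_{21}(w) P_1^{-1}\, L_1(z)\, \bar R_{21}(w)\, L_2(w)\, R^F_{21}(w-z). \]
Having in mind that $\bar R_{21}$ is diagonal in the first matrix space and taking into account the property (3.26), one can easily see that under the trace sign the matrix $R^F$ can be pushed to the right, where it cancels with $R^F_{21}$. Therefore, we get
\[ \mathrm{tr}_{12}\, P_1 \bar R^{-1}_{21}(w) P_1^{-1}\, L_1(z)\, \bar R_{21}(w)\, L_2(w). \tag{3.34} \]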

Now applying to this expression the technique we used above for the l.h.s. of (3.32), we conclude that Eq. (3.34) is equal to $\mathrm{tr}\, L(z)\; \mathrm{tr}\, L(w)$. Thus, we proved that $\mathrm{tr}\, L(z)$ commutes with $\mathrm{tr}\, L(w)$. Quite analogously one can prove that $\mathrm{tr}\, L^{-1}(z)$ commutes with $\mathrm{tr}\, L^{-1}(w)$ and with $\mathrm{tr}\, L(w)$.

Now we give an example of the simplest representation of algebra (3.21) and (3.27) associated with the elliptic RS model. Namely, the following L-operator satisfies algebra (3.27):
\[ L(z) = \sum_{ij} \Phi(z, q_{ij} + \gamma)\, b_j P_j E_{ij}, \tag{3.35} \]
where
\[ b_j = \prod_{a\neq j} \Phi(\gamma, q_{aj}). \tag{3.36} \]
Here the parameter $\gamma$ is a coupling constant of the elliptic RS model. This can be checked by straightforward calculations. Some comments are in order. The L-operator of the form


(3.35) already appeared in [20] as a result of the Hamiltonian reduction procedure applied to $T^*\widehat{GL}(N)(z, \bar z)$. To guess the explicit form of $b_j$ one should note that in the rational limit algebra (3.27) tends to the one obtained in [18], where the coefficients $b_j$ are found to be
\[ b_j = \prod_{a\neq j} \frac{q_{aj} + \gamma}{q_{aj}}. \tag{3.37} \]

Therefore, it is natural to assume that the elliptic analog of (3.37) is given by (3.36). It is worthwhile to mention that the $b_j$ are not uniquely defined since one can perform a canonical transformation of the $(q, p)$-variables. In particular, the variables
\[ \tilde b_j = \prod_{a\neq j} \frac{\sigma(q_{aj} + \gamma)}{\sigma(\gamma)\, \sigma(q_{aj})} \tag{3.38} \]
are related to $b_j$ by the canonical transformation $q_i \to q_i$ and $P_i \to e^{\alpha \sum_a q_{ai}} P_i$, where $\alpha = 2\zeta(1/2)\gamma - i\, \frac{\gamma - \bar\gamma}{\tau - \bar\tau}$. We call this $L$ the quantum L-operator of the elliptic RS model. Indeed, taking the Hamiltonian to be $H = \mathrm{tr}\, L(z)$ one can see that the quantum canonical transformation of the form
\[ P_i = \prod_{a\neq i} \left( \frac{\sigma(q_{ai} + \gamma)}{\sigma(q_{ai})} \right)^{1/2} \prod_{a\neq i} \left( \frac{\sigma(q_{ai})}{\sigma(q_{ai} - \gamma)} \right)^{1/2} P_i^R, \tag{3.39} \]
where $P_i^R$ is the momentum in the Ruijsenaars Hamiltonian, turns $H$ into the first integral $S_1$ from the Ruijsenaars commuting family [10]. Moreover, after the canonical transformation (3.39), the L-operator (3.35) coincides essentially with the classical L-operator of the RS model. The generating function for the commuting family in terms of $L$ can be written as
\[ I(z, \mu) = {:}\det(L(z) - \mu){:} = \sum_k (-\mu)^{N-k} I_k(z), \tag{3.40} \]

where the normal ordering $:\,:$ means that all momentum operators are pushed to the right. It follows from the results obtained in the next section.

4. Connection to the Fundamental Relation RLL = LLR

In this section we establish a connection of the quantum L-operator algebra (3.27) with the fundamental relation $RLL = LLR$. In [21], the operators from the Ruijsenaars commuting family were obtained by using a special representation $\hat L$ of the algebra
\[ R^B_{12}(z-w)\, \hat L_1(z)\, \hat L_2(w) = \hat L_2(w)\, \hat L_1(z)\, R^B_{12}(z-w), \tag{4.1} \]
where $R^B(z)$ is Belavin's R-matrix, an elliptic solution to the quantum Yang–Baxter equation [26]. The explicit form of $R^B$ we use here can be found in [27]. For the reader's convenience we recall a construction of $\hat L$ [28, 29].


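Denote by $h^*$ the weight space for $sl_N(\mathbb{C})$ that can be realized in $\mathbb{C}^N$ with a basis $\epsilon_i$, $\langle \epsilon_i, \epsilon_j \rangle = \delta_{ij}$, as the orthogonal complement to $\sum_{i=1}^N \epsilon_i$. Let $\bar\epsilon_k$ be the orthogonal projection of $\epsilon_k$: $\bar\epsilon_k = \epsilon_k - \frac{1}{N} \sum_{i=1}^N \epsilon_i$. For each $q \in h^*$ one can introduce the intertwining vectors [32, 33]
\[ \phi(z)^{q + \hbar\bar\epsilon_k}_{q\; j} = \theta_j\Bigl( \frac{z}{N} - \langle q, \bar\epsilon_k \rangle \Bigr) \Big/ i\eta(\tau), \tag{4.2} \]
where
\[ \theta_j(z) = \sum_{n \in \frac{N}{2} - j + N\mathbb{Z}} \exp 2\pi i \Bigl[ n\Bigl(z + \frac{1}{2}\Bigr) + \frac{n^2}{2N}\, \tau \Bigr] \]
and $\eta(\tau) = p^{1/24} \prod_{m=1}^{\infty} (1 - p^m)$ is the Dedekind eta function with $p = \exp 2\pi i\tau$. Following [28] we denote by $\bar\phi(z)^{q + \hbar\bar\epsilon_k\; j}_{q}$ the entries of the matrix inverse to $\phi(z)^{q + \hbar\bar\epsilon_k}_{q\; j}$. Then the orthogonality relations read as follows:
\[ \sum_{j=1}^{n} \bar\phi(z)^{q + \hbar\bar\epsilon_k\; j}_{q}\, \phi(z)^{q + \hbar\bar\epsilon_{k'}}_{q\; j} = \delta_{kk'}, \qquad \sum_{k=1}^{n} \phi(z)^{q + \hbar\bar\epsilon_k}_{q\; j}\, \bar\phi(z)^{q + \hbar\bar\epsilon_k\; j'}_{q} = \delta_{jj'}. \tag{4.3} \]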

In the sequel, the following formula [21] will be of intensive use n X

0

¯i ¯ q0 +~¯j m φ(z)q+~ φ(z) m q q

m=1

=

θ(z+ < q 0 , ¯j > − < q, ¯i >) Y θ(< q 0 , ¯j 0 > − < q, ¯i >) θ(z) θ(< q 0 , ¯j 0 > − < q 0 , ¯j >) 0

(4.4)

j 6=j

Here $\theta(z)$ denotes the Jacobi $\theta$-function
\[ \theta(z) = -\sum_{n} e^{2\pi i (z + \frac{1}{2})(n + \frac{1}{2}) + i\pi\tau (n + \frac{1}{2})^2} = \theta'(0)\, e^{-\zeta(\frac{1}{2}) z^2}\, \sigma(z). \]

ˆ It is shown in [29, 30] that the L-operator Lˆ ij (z) =

N X

¯k ¯ q+~¯k j ~¯k ∂qk φ(z + γN )q+~ e , i φ(z)q q ∂

(4.5)

k=1

acting on the space of functions on h∗ satisfies relation (4.1). This Lˆ is an N × N generalization of the 2 × 2 Sklyanin L-operator [31]. ¯k ˆ The intertwining vectors φ(z)q+~ j coming in the definition of L relate the matrix q (1) B R with the Boltzmann weights for the An−1 face model. Recall [33] that the nonzero Boltzmann weights depending on the spectral parameter z are explicitly given by   q + ~¯i ˇ  q z q + 2~¯i  = θ(z + ~) , (4.6) W θ(~) q + ~¯i   q + ~¯i ˇ  q z q + ~(¯i + ¯j )  = θ(−z + qij ) (i 6= j), W θ(qij ) q + ~¯i   q + ~¯i θ(z) θ(~ + qij ) ˇ q z q + ~(¯i + ¯j )  = W (i 6= j), θ(~) θ(qij ) q + ~¯ j


where qij =< q, ¯i − ¯j >. The relation between RB and the face weights is given by X i0 j 0

X

0 0

q+~(¯k +¯m ) ¯k RB (z − w)iijj φ(z)q+~ i0 φ(w)q+~¯k j0 = q

(4.7)



q+~(¯k +¯m ) ¯s φ(w)q+~ j φ(z)q+~¯s i q

s

 q + ~¯k ˇ  q z − w q + ~(¯k + ¯m )  . W q + ~¯s

In what follows we use the concise notation   q + ~¯k ˇ  q z − w q + ~(¯k + ¯m )  . Wsk [k + m] = W q + ~¯s Then the dual relation to (4.7) is X q+~(¯k +¯m ) i0 B ¯k j 0 ¯ ¯ q+~ φ(w) φ(z)q+~ R (z − w)ij q i0 j 0 ¯k i0 j 0

=

X

q+~(¯k +¯m ) j ¯s i ¯ ¯ q+~ φ(w)q+~ Wks [k + m]φ(z) . q ¯s

(4.8)

s

In [21] another L-operator L˜ appeared. It is related to Lˆ in the following way: Lˆ ij (z) → L˜ ij (z) =

X

¯i i0 ¯j ˆ ¯ q+~ φ(z) φ(z)q+~ j 0 Li0 j 0 (z) q q

(4.9)

i0 j 0

=

θ(z + γ + qij ) Y θ(γ + qnj ) ~¯j ∂q∂j e . θ(z) θ(qni ) n6=i

In what follows we need to remove from the quantum L-operator algebra (3.27) the nonholomorphic dependence on the spectral and quantization parameters. This can be achieved by considering the following transformation of the L-operator:
\[ L(z) \to e^{\alpha(z) Q}\, e^{-\beta Q}\, L(z)\, e^{\beta Q}\, e^{-\alpha(z) Q}, \tag{4.10} \]
where $\alpha(z)$ is an arbitrary function of the spectral parameter and $\beta$ is a complex number. Since the transformed L-operator also has the structure $WP$, the following formula is valid:
\[ L_2(w)\, e^{\alpha(z) Q_1} = e^{\alpha(z) Q_1}\, L_2(w)\, e^{\hbar\alpha(z) r_0}, \tag{4.11} \]
where the notation $r_0 = \sum_i E_{ii}\otimes E_{ii}$ was used. Recalling that the L-operator (3.35) satisfies the quantum L-operator algebra (3.27) and using Eq. (4.11) one can easily establish the algebra satisfied by the transformed $L$:
\[ \check R_{12}(z,w)\, L_1(z)\, \check{\bar R}_{21}(w)\, L_2(w) = L_2(w)\, \check{\bar R}_{12}(z)\, L_1(z)\, \check R^F_{12}(z-w), \]
where the matrices $\check R$, $\check{\bar R}$ and $\check R^F$ are
\[ \check R_{12}(z,w) = e^{-\alpha(z) Q_1 - \alpha(w) Q_2 + \beta Q_2}\, R_{12}(z,w)\, e^{\alpha(z) Q_1 + \alpha(w) Q_2 - \beta Q_1}, \tag{4.12} \]
\[ \check{\bar R}_{12}(z) = e^{\hbar\alpha(z) r_0 - \alpha(z) Q_1 + \beta Q_2}\, \bar R_{12}(z)\, e^{\alpha(z) Q_1 - \beta Q_1}, \tag{4.13} \]
\[ \check R^F_{12}(z,w) = e^{-\alpha(z) Q_1 - \alpha(w) Q_2 + \frac{\hbar}{2}(\alpha(w) - \alpha(z)) r_0 + \beta Q_1}\, R^F_{12}(z,w)\, e^{\alpha(z) Q_1 + \alpha(w) Q_2 + \frac{\hbar}{2}(\alpha(w) - \alpha(z)) r_0 - \beta Q_2}. \tag{4.14} \]

Since the transformation in question keeps the form of the quantum L-operator algebra intact, the transformed matrices $\check R$, $\check{\bar R}$ and $\check R^F$ also satisfy all the compatibility conditions. In particular, the transformation (4.14) defines another solution of (3.25). For $\beta = 0$ this was observed in [14]. For the particular choice $\beta = 2\pi i\, \frac{\hbar - \bar\hbar}{\tau - \bar\tau}$ and $\alpha(z) = 2\pi i\, \frac{z - \bar z}{\tau - \bar\tau} + \beta$ we find that the matrices $\check R$, $\check{\bar R}$ and $\check R^F$ are given by the same formulae (3.13), (3.16), (3.24) with the change of $\Phi(z,s) = \frac{\theta(z+s)}{\theta(z)\theta(s)}\, \theta'(0)\, e^{2\pi i s \frac{z - \bar z}{\tau - \bar\tau}}$ by the meromorphic function $\frac{\theta(z+s)}{\theta(z)\theta(s)}$. The total transformation with such a choice of $\alpha$ and $\beta$ transforms (up to an unessential multiplier) the L-operator (3.35) into
\[ L_{ij}(z) = \frac{\theta(z + q_{ij} + \gamma)}{\theta(z)\, \theta(q_{ij} + \gamma)} \prod_{n\neq j} \frac{\theta(q_{nj} + \gamma)}{\theta(q_{nj})}\, P_j, \tag{4.15} \]

which is a quasi-periodic meromorphic matrix function of the spectral parameter: L(z + 1) = L(z), L(z + τ ) = e−2πi(γ+~) e−2πiQ L(z)e2πiQ . ~¯

∂

We assume that the L-operator is of the form W P, where Pi = e i ∂qi . The replacement of P by P preserves all the consistency conditions because the R-matrices depend only on the difference qi − qj . Thus, Eq. (3.27) with R-matrices defined via 8(z, s) = θ(z+s) θ(z)θ(s) refers to the meromorphic version of the quantum L-operator algebra while (4.15) provides its particular meromorphic representation. In what follows we use only this meromorphic version. Comparing (4.15) to (4.9) we can read off that L and L˜ are related in the following way: Q n6=j θ(qnj ) Lij (z). (4.16) L˜ ij (z) = Q n6=i θ(qni ) Since the combined transformation (4.9), (4.16) from Lˆ to L depends only on q we can conjecture that any representation L of the quantum L-operator algebra (3.27) is gauge equivalent to some representation Lˆ of (4.1) with a gauge-equivalence defined as Q X 0 θ(qnj 0 ) q+~¯j 0 j q+~¯i0 ¯ ˆ Qn6=j Li0 j 0 (z). φ(z)q (4.17) Lij (z) = i φ(z)q n6=i0 θ(qni0 ) 0 0 ij

Now we are in a position to prove this conjecture. Suppose Lˆ is an abstract L-operator satisfying algebra (4.1) and introduce L˜ by Eq. (4.9). Assume that L˜ has the structure ~¯ ∂ W P, where the entries of the diagonal matrix P are Pi = e i ∂qi and the entries of W commute with qi . Then substituting Lˆ expressed via L˜ in (4.1) and performing the ˜ straightforward calculation with the use of (4.8) one finds an algebra satisfied by L:


X i0 j 0

¯a +¯c ) j ¯a i0 ¯b ¯ ¯ q+~ φ(w)q+~( Wka [a + c]δa+c,i+k φ(z) φ(z)q+~ q q q+~¯a i0

0

abcd

¯ q+~(¯j +¯d ) L˜ bj (z)L˜ dl (w) = φ(w) q+~¯j j0 X ¯k +¯i ) j 0 ¯k i0 ¯d ¯ ¯ q+~ φ(z)q+~( Waj [j + l]δj+l,a+c φ(w) φ(w)q+~ q q q+~¯k i0 i0 j 0

abcd

¯ q+~(¯c +¯b ) L˜ dc (w)L˜ ba (z). φ(z) q+~¯c j0 Performing the summation in i0 and j 0 with the help of (4.4) we obtain X

Wkb [b + a]δa+b,i+k

abc

θ(w + qac + ~δab − ~δjc ) θ(w)

Y θ(qnc + ~δnb − ~δjc ) L˜ bj (z)L˜ cl (w) = θ(qna + ~δnb − ~δab )

n6=a

X

Waj [j + l]δj+l,a+c

abc

θ(z + qib + ~δik − ~δbc ) θ(z)

Y θ(qnb + ~δnk − ~δbc ) n6=i

θ(qni + ~δnk − ~δik )

L˜ kc (w)L˜ ba (z).

Let us introduce an operator L by inverting (4.16). Substituting this L in the last formula, taking into account the nonzero components of the face weights and multiplying both sides by the function Q Q θ(qni + ~δnk − ~δik ) n6=k θ(qnk ) Q Qn6=i , θ(q ) nj n6=j n6=l θ(qnl + ~δnj − ~δjl ) we finally arrive at the algebra satisfied by L: Q X n6=k θ(qns − ~δjs ) k k θ(w + qks + ~ − ~δjs ) Q Lkj (z)Lsl (w) + Wk [2k]δi θ(w) n6=s θ(qns + ~δnj − ~δjs ) s Q X θ(w + qis − ~δjs ) n6=i θ(qns + ~δnk − ~δjs ) k Q Lkj (z)Lsl (w) + Wk [k + i] θ(w) n6=s θ(qns + ~δnj − ~δjs ) s Q X θ(qki +~) θ(w+qks −~δjs ) n6=k θ(qns +~δni −~δjs ) i Q Lij (z)Lsl (w) Wk [i + k] θ(qki −~) θ(w) n6=s θ(qns +~δnj −~δjs ) s X j θ(z + qis + ~δik − ~δls ) θ(qlj + ~) = + 2δlj × Wj [j + l] θ(q − ~) θ(z) lj s Q n6=i θ(qns + ~δnk − ~δls ) Lkl (w)Lsj (z) + ×Q n6=s θ(qns + ~δnl − ~δls ) Q X j θ(z + qis + ~δik − ~δjs ) n6=i θ(qns + ~δnk − ~δjs ) Q Lkj (w)Lsl (z). Wl [j + l] θ(z) n6=s θ(qns + ~δnj − ~δjs ) s (4.18)


Here in the second and the third lines i 6= k. The ratio of products of theta-functions occurring in each term in (4.18) allows one to take off the sum over s, e.g., when i 6= k, we have Q θ(qns + ~δnk − ~δjs ) Qn6=i = n6=s θ(qns + ~δnj − ~δjs ) θ(qki + ~)θ(qji ) θ(qki ) |j6=k + δkj (1 − δij ) + | j6=i + δis δij θ(qki − ~) θ(qki )θ(qji + ~) j6=k θ(qjk )θ(~) θ(~) + |j6=i + δks (1 − δkj ) δij θ(qik + ~) θ(qjk + ~)θ(qik ) θ(qkj )θ(~) δjs |j6=i . θ(qkj − ~)θ(qji + ~) To compare (4.18) to (3.27) we rewrite relation (3.27) with the help of Eq. (3.31) in the following form: −1

\[ R^F_{12}(z-w)\, P_1 \bar R^{-1}_{21}(w) P_1^{-1}\, L_1(z)\, \bar R_{21}(w)\, L_2(w) = P_2 \bar R^{-1}_{12}(z) P_2^{-1}\, L_2(w)\, \bar R_{12}(z)\, L_1(z)\, R^F_{12}(z-w). \tag{4.19} \]

In the component form algebra (4.19) is presented in Appendix C. Comparing the components of (4.18) to the ones of (4.19) we establish that they coincide up to the overall multiplicative factor θ(z − w)θ(~)2 . Thus, we have shown that any representation of algebra (3.27) by the transformation (4.17) turns into a representation of (4.1). The connection established gives the right to assert that algebra (3.27) possesses a family of N -commuting integrals and that the formula of the determinant type (3.40) for the commuting family proved in [21] is also valid for the L-operator (3.35).

5. Conclusion In this paper, we described the dynamical R-matrix structure of the quantum elliptic RS model. The quantum L-operator algebra possesses a family of commuting operators. It turns out that this algebra has a surprisingly simple structure and can be analysed explicitly in the component form. Furthermore, one can hope that the problem of finding new representations of the algebra obtained is simpler than the corresponding problem for the algebra RLL = LLR. There are several interesting problems to be discussed. First, we recall that in the classical case we obtained two different Poisson algebras, which lead to the same classical L-operator algebra. Only one of them was quantized. It is desirable to quantize the second one and to show that the corresponding quantum L-operator algebra is isomorphic to the algebra obtained in the paper. The elliptic RS model we dealt with in the paper corresponds to the AN −1 root system. It seems to be possible to extend our approach to other root systems and to derive the corresponding L-operator algebras. To this end, one should find a proper parameterization of the corresponding cotangent bundle. Generalizing our approach to the cotangent bundle over a centrally extended group of smooth mappings from a higher-genus Riemann surface into a Lie group, one may expect to obtain new integrable systems.


It is known that the CM systems admit spin generalizations [34–36]. Recently, the spin generalization was found for the elliptic RS model [37]. However, the Hamiltonian formulation for the model is not found yet. One may hope that in our approach the spin models can arise as higher representations of the L-operator algebra. Probably, the most interesting and complicated problem is to separate variables for the quantum elliptic RS model. Up to now only the three-particle case for the trigonometric RS model was solved explicitly [38]. One could expect that the L-operator algebra obtained in the paper may shed light on the problem. Acknowledgement. The authors are grateful to M.A.Olshanetsky and N.A.Slavnov for the valuable discussions. This work is supported in part by the RFBI grants N96-01-00608, N96-02-19085 and N96-01-00551 and by the ISF grant a96-1516.

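Appendix A

In this Appendix, we prove that the limiting r-matrix (2.20) satisfies the classical Yang–Baxter equation. For this, let us write the equation which follows from the Jacobi identity for the bracket (2.18):
\[ \begin{aligned} &[r^{\varepsilon\eta}_{12}(z,w),\, r^{\varepsilon\rho}_{13}(z,s) + r^{\eta\rho}_{23}(w,s)] + [r^{\varepsilon\rho}_{13}(z,s),\, r^{\eta\rho}_{23}(w,s)] \\ &\quad + \frac{1}{\kappa}\, (T_1^{\varepsilon})^{-1}(z)\{T_1^{\varepsilon}(z), r^{\eta\rho}_{23}(w,s)\} + \frac{1}{\kappa}\, (T_3^{\rho})^{-1}(s)\{T_3^{\rho}(s), r^{\varepsilon\eta}_{12}(z,w)\} \\ &\quad - \frac{1}{\kappa}\, (T_2^{\eta})^{-1}(w)\{T_2^{\eta}(w), r^{\varepsilon\rho}_{13}(z,s)\} = 0. \end{aligned} \tag{A.1} \]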

Here εη εη (z, w) = H1εη H2ηε r12 (z, w) − αf εη (z, w) r12 εη εη (z, w) + β tr 3 H1εη H3ηε r13 (z, w) ⊗ I + βI ⊗ tr 3 H3εη H2ηε r32 εη (z, w) I ⊗ I. + β 2 tr 34 H3εη H4ηε r34

(A.2) ε

ηρ

To study (A.1), one needs to know the Poisson bracket of T (z) with H . This bracket can be easily derived from Eqs. (2.13), (2.14), and (2.17): X ερ ηε 1 1 ε −1 T1 (z){Tε1 (z), H2ηρ } = Hi Hj 8(ε − ρ, qij )(Eii − I) ⊗ Ejj (A.3) κ N −

X

i6=j

i6=j

−

X i6=j

1 I) ⊗ Ejj N X εη ηε ηρ + Hi Hj Hj 8(z − η, qij )Eij ⊗ Ejj

Hiεη Hjηε Hjηρ 8(ε − η, qij )(Eii − Hiερ Hjηε 8(z − ρ, qij )Eij ⊗ Ejj

X 1 ηρ H (8(ε − ρ, 0) + N j j

i6=j

1 I) ⊗ Ejj . (A.4) N Equation (A.1) holds at arbitrary values of all the parameters. Without loss of generality one can put ρ = 0 and η = aε, where a and ε are real. Let us now perform the change of variables Hiεη : +8(η − ε, 0) + 8(z − η, 0) − 8(z − ρ, 0)) (Ejj −


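$H_i^{\varepsilon\eta} = 1 + h_i^{\varepsilon\eta}$ and consider the expansion of the l.h.s. of (A.1) in powers of $\varepsilon$ and $h$. Let us note that the matrix $r^{\varepsilon\eta}(z,w)$ has the following expansion in powers of $h$ at $\varepsilon - \eta \to 0$:

$$ r^{\varepsilon\eta}(z,w) = r(z,w) + r^{(1)}_{\rm reg}(z,w,\varepsilon)\,h + r^{(2)}(z,w,\varepsilon)\,h^2 + o(\varepsilon-\eta), \tag{A.5} $$

where $r(z,w)$ is given by (2.20). The matrix $r^{(1)}_{\rm reg}(z,w,\varepsilon)$ is regular when $\varepsilon \to 0$ and the $\varepsilon$-dependence of $r^{(2)}(z,w,\varepsilon)$ is inessential. Substituting (A.5) into the bracket $T^{-1}\{T, r\}$ one gets:

$$ T^{-1}\{T, r\} = r^{(1)}_{\rm reg}(\varepsilon)\,T^{-1}\{T, h\} + r^{(2)}(\varepsilon)\,h\,T^{-1}\{T, h\}. \tag{A.6} $$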

Clearly, Eq. (A.1) should be satisfied in any order in $h$ and $\varepsilon$. Since we are interested in finding an equation for $r$, we consider only the terms independent of $h$ and $\varepsilon$ in the expansion of Eq. (A.1). The terms of zero order in $h$ and $\varepsilon$ occurring in (A.1) come from $r^{\varepsilon\eta}$ and from the first term in (A.6). However, from the explicit expression for the bracket $\{T, H\}$ one can see that $\{T, h\}|_{h=0} = o(\varepsilon)$. Now, taking into account that $r^{(1)}_{\rm reg}(\varepsilon)$ is regular at $\varepsilon \to 0$, we conclude that the last three terms in (A.1) do not contribute. Thus, in the zero order in $h$ and $\varepsilon$, Eq. (A.1) reduces to the CYBE for $r(z,w)$. The remarkable thing is that the main elliptic identities (see Appendix B) follow from the Jacobi identity for the bracket (2.18) or, equivalently, from the Yang–Baxter equation for $r(z,w)$.

Appendix B

Here we present the basic elliptic function identities, formulated as a set of functional relations on $\Phi(z,w)$ [8]:

$$ \Phi(z,x)\Phi(w,y) = \Phi(z,x-y)\Phi(z+w,y) + \Phi(z+w,x)\Phi(w,y-x), \tag{B.1} $$

$$ \Phi(z,x)\Phi(z,y) = \Phi(z,x+y)\big(\Phi(z,0) + \Phi(x,0) + \Phi(y,0) - \Phi(z+x+y,0)\big), \tag{B.2} $$

$$
\begin{aligned}
&\Phi(z-w,a-b)\Phi(z,x+b)\Phi(w,y+a) - \Phi(z-w,x-y)\Phi(z,y+a)\Phi(w,x+b) \\
&\quad = \Phi(z,x+a)\Phi(w,y+b)\big(\Phi(a-b,0) + \Phi(x+b,0) - \Phi(x-y,0) - \Phi(a+y,0)\big).
\end{aligned}
\tag{B.3}
$$

Equation (B.2) is the limiting case of Eq. (B.1) where $w \to z$, and Eq. (B.3) is a consequence of (B.1) and (B.2). Note that the exponent term $e^{-2\zeta(1/2)zs + 2\pi i s\frac{z-\bar z}{\tau-\bar\tau}}$ in $\Phi(z,s)$ as well as the linear term $-2\zeta(1/2)z + 2\pi i\frac{z-\bar z}{\tau-\bar\tau}$ in $\Phi(z,0)$ are irrelevant since they drop out from (B.1–B.2). To establish the unitarity relation for R, one also needs the identity involving the Weierstrass $\wp$-function

$$ \Phi(z,s)\Phi(z,-s) = \wp(z) - \wp(s), $$

and to prove Eq. (2.50), the following relation between the derivatives of $\Phi$ is of use:

$$ \frac{\partial\Phi(z,q_{ij})}{\partial z} = \frac{\partial\Phi(z,q_{ij})}{\partial q_{ij}} - \big(\Phi(z,0) - \Phi(q_{ij})\big)\Phi(z,q_{ij}). $$
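As a quick sanity check on the algebraic shape of (B.1)–(B.3) and of the unitarity identity, one can test them in a degenerate case. The sketch below is not the elliptic function of the paper: it assumes the rational degeneration in which the two-argument function becomes 1/z + 1/s, its regularized value at second argument 0 becomes 1/z, and the Weierstrass function becomes 1/z^2. The names `phi`, `phi0` and `wp` are hypothetical helpers introduced only for this illustration; exact rational arithmetic avoids round-off.

```python
from fractions import Fraction as Fr


def phi(z, s):
    """Assumed rational stand-in for the two-argument function: 1/z + 1/s."""
    return 1 / Fr(z) + 1 / Fr(s)


def phi0(z):
    """Assumed rational stand-in for the regularized value at second argument 0: 1/z."""
    return 1 / Fr(z)


def wp(z):
    """Assumed rational stand-in for the Weierstrass function: 1/z^2."""
    return 1 / Fr(z) ** 2


def b1_residual(z, w, x, y):
    """Left side minus right side of (B.1)."""
    rhs = phi(z, x - y) * phi(z + w, y) + phi(z + w, x) * phi(w, y - x)
    return phi(z, x) * phi(w, y) - rhs


def b2_residual(z, x, y):
    """Left side minus right side of (B.2)."""
    bracket = phi0(z) + phi0(x) + phi0(y) - phi0(z + x + y)
    return phi(z, x) * phi(z, y) - phi(z, x + y) * bracket


def b3_residual(z, w, a, b, x, y):
    """Left side minus right side of (B.3)."""
    lhs = (phi(z - w, a - b) * phi(z, x + b) * phi(w, y + a)
           - phi(z - w, x - y) * phi(z, y + a) * phi(w, x + b))
    bracket = phi0(a - b) + phi0(x + b) - phi0(x - y) - phi0(a + y)
    return lhs - phi(z, x + a) * phi(w, y + b) * bracket


def unitarity_residual(z, s):
    """Left side minus right side of the identity with the Weierstrass function."""
    return phi(z, s) * phi(z, -s) - (wp(z) - wp(s))
```

All four residuals vanish at generic rational points (any point where no denominator is zero), which is consistent with the identities as written above but of course does not prove the elliptic case.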


Appendix C In this Appendix we present the quantum L-operator algebra −1

F (z − w)P1 R21 (w)P1−1 L1 (z)R21 (w)L2 (w) R12 −1

F = P2 R12 (z)P2−1 L2 (w)R12 (z)L1 (z)R12 (z − w)

(C.1)

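in the component form. The l.h.s. of (C.1) has the form

$$
\begin{aligned}
&\sum_{i\neq k;\,j,l} \Phi(\hbar,q_{ki})\Phi(\hbar,q_{ik})\Phi(\hbar,q_{kj}-\hbar)\,L_{ij}(z)L_{kl}(w)\,E_{ij}\otimes E_{kl} \\
&- \sum_{i\neq k;\,j,l} \Phi(\hbar,q_{ki})\Phi(\hbar,q_{ij}-\hbar)\Phi(w,q_{kj}-\hbar)\,L_{ij}(z)L_{jl}(w)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j,l} \Phi(\hbar,q_{ki})\Phi(w,q_{ki})\Phi(\hbar,q_{ij}-\hbar)\,L_{ij}(z)L_{il}(w)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j,l} \Phi(z-w,q_{ik})\Phi(\hbar,q_{ki})\Phi(\hbar,q_{ij}-\hbar)\,L_{kj}(z)L_{il}(w)\,E_{ij}\otimes E_{kl} \\
&- \sum_{i\neq k;\,j,l} \Phi(z-w,q_{ik})\Phi(\hbar,q_{kj}-\hbar)\Phi(w,q_{ij}-\hbar)\,L_{kj}(z)L_{jl}(w)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j,l} \Phi(z-w,q_{ik})\Phi(w,q_{ik})\Phi(\hbar,q_{kj}-\hbar)\,L_{kj}(z)L_{kl}(w)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{j,k,l} \Phi(z-w,\hbar)\Phi(w,\hbar)\Phi(\hbar,q_{kj}-\hbar)\,L_{kj}(z)L_{kl}(w)\,E_{kj}\otimes E_{kl} \\
&- \sum_{j,k,l} \Phi(z-w,\hbar)\Phi(w,\hbar)\Phi(w+\hbar,q_{kj}-\hbar)\,L_{kj}(z)L_{jl}(w)\,E_{kj}\otimes E_{kl}.
\end{aligned}
$$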

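The r.h.s. of (C.1) reads

$$
\begin{aligned}
&\sum_{i\neq k;\,j\neq l} \Phi(\hbar,q_{ki})\Phi(\hbar,q_{il}-\hbar)\Phi(\hbar,q_{lj})\,L_{kl}(w)L_{ij}(z)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j,l} \Phi(\hbar,q_{ki})\Phi(\hbar,q_{ij}-\hbar)\Phi(z-w,q_{lj}+\hbar\delta_{lj})\,L_{kj}(w)L_{il}(z)\,E_{ij}\otimes E_{kl} \\
&- \sum_{i\neq k;\,j\neq l} \Phi(\hbar,q_{kl}-\hbar)\Phi(z,q_{il}-\hbar)\Phi(\hbar,q_{lj})\,L_{kl}(w)L_{lj}(z)\,E_{ij}\otimes E_{kl} \\
&- \sum_{i\neq k;\,j,l} \Phi(\hbar,q_{kj}-\hbar)\Phi(z,q_{ij}-\hbar)\Phi(z-w,q_{lj}+\hbar\delta_{lj})\,L_{kj}(w)L_{jl}(z)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j\neq l} \Phi(z,q_{ik})\Phi(\hbar,q_{kl}-\hbar)\Phi(\hbar,q_{lj})\,L_{kl}(w)L_{kj}(z)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{i\neq k;\,j,l} \Phi(z,q_{ik})\Phi(\hbar,q_{kj}-\hbar)\Phi(z-w,q_{lj}+\hbar\delta_{lj})\,L_{kj}(w)L_{kl}(z)\,E_{ij}\otimes E_{kl} \\
&+ \sum_{j\neq l;\,i} \Phi(z,\hbar)\Phi(\hbar,q_{il}-\hbar)\Phi(\hbar,q_{lj})\,L_{il}(w)L_{ij}(z)\,E_{ij}\otimes E_{il} \\
&+ \sum_{i,j,l} \Phi(z,\hbar)\Phi(\hbar,q_{ij}-\hbar)\Phi(z-w,q_{lj}+\hbar\delta_{lj})\,L_{ij}(w)L_{il}(z)\,E_{ij}\otimes E_{il} \\
&- \sum_{j\neq l;\,i} \Phi(z,\hbar)\Phi(z+\hbar,q_{il}-\hbar)\Phi(\hbar,q_{lj})\,L_{il}(w)L_{lj}(z)\,E_{ij}\otimes E_{il} \\
&- \sum_{i,j,l} \Phi(z,\hbar)\Phi(z+\hbar,q_{ij}-\hbar)\Phi(z-w,q_{lj}+\hbar\delta_{jl})\,L_{ij}(w)L_{jl}(z)\,E_{ij}\otimes E_{il}.
\end{aligned}
$$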

References

1. Babelon, O. and Viallet, C.M.: Phys. Lett. B237, 411 (1990)
2. Avan, J. and Talon, M.: Phys. Lett. B303, 33–37 (1993)
3. Avan, J., Babelon, O. and Talon, M.: Alg. Anal. 6(2), 67 (1994)
4. Sklyanin, E.K.: Alg. Anal. 6(2), 227 (1994)
5. Braden, H.W. and Suzuki, T.: Lett. Math. Phys. 30, 147 (1994)
6. Avan, J. and Rollet, G.: The classical r-matrix for the relativistic Ruijsenaars-Schneider system. Preprint BROWN-HET-1014 (1995)
7. Suris, Yu.B.: Why are the rational and hyperbolic Ruijsenaars-Schneider hierarchies governed by the same R-matrix as the Calogero-Moser ones? hep-th/9602160
8. Nijhoff, F.W., Kuznetsov, V.B., Sklyanin, E.K. and Ragnisco, O.: J. Phys. A: Math. Gen. 29, 333–340 (1996)
9. Suris, Yu.B.: Elliptic Ruijsenaars-Schneider and Calogero-Moser hierarchies are governed by the same r-matrix. solv-int/9603011
10. Ruijsenaars, S.N.: Commun. Math. Phys. 110, 191 (1987)
11. Olshanetsky, M.A., Perelomov, A.M.: Phys. Reps. 71, 313 (1981)
12. Gorsky, A. and Nekrasov, N.: Nucl. Phys. B414, 213 (1994); Nucl. Phys. B436, 582 (1995); Gorsky, A.: Integrable many body systems in the field theories. Prep. UUITP-16/94 (1994)
13. Arutyunov, G.E. and Medvedev, P.B.: Phys. Lett. A 223, 66–74 (1996)
14. Avan, J., Babelon, O. and Billey, E.: The Gervais-Neveu-Felder equation and the quantum Calogero-Moser systems. Commun. Math. Phys. 178, 281–300 (1996)
15. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. hep-th/9407154
16. Gervais, J.L. and Neveu, A.: Nucl. Phys. B238, 125 (1984)
17. Babelon, O., Bernard, D. and Billey, E.: Phys. Lett. B 375, 89–97 (1996)
18. Arutyunov, G.E., Frolov, S.A.: Quantum dynamical R-matrices and quantum Frobenius group. To appear in Commun. Math. Phys.
19. Arutyunov, G.E., Frolov, S.A. and Medvedev, P.B.: J. Phys. A: Math. Gen. 30, 5051–5063 (1997)
20. Arutyunov, G.E., Frolov, S.A. and Medvedev, P.B.: J. Math. Phys. 38, 5682–5689 (1997)
21. Hasegawa, K.: Ruijsenaars' commuting difference operators as commuting transfer matrices. q-alg/9512029
22. Etingof, P.I., Frenkel, I.B.: Commun. Math. Phys. 165, 429–444 (1994)
23. Falceto, F. and Gawedski, K.: Commun. Math. Phys. 159, 549 (1994)
24. Faddeev, L.D.: Integrable models in (1+1)-dimensional quantum field theory. In: Recent advances in field theory and statistical mechanics. Eds. Zuber, J.B., Stora, R., Les Houches Summer School Proc. session XXXIX, 1982, Elsevier Sci. Publ., 1984, pp. 561
25. Kulish, P.P., Sklyanin, E.K.: Quantum spectral transform method. Recent developments. In: Integrable quantum field theories. Eds. Hietarinta, J., Montonen, C., Lect. Not. Phys. 51, 1982, pp. 61
26. Belavin, A.A.: Nucl. Phys. B180 [FS2], 189–200 (1981)
27. Richey, M.P. and Tracy, C.A.: J. Stat. Phys. 42, 311–348 (1986)
28. Quano, Y. and Fujii, A.: Modern Phys. Lett. A 8, No. 17, 1585–1597 (1993)
29. Hasegawa, K.: J. Phys. A: Math. Gen. 26, 3211–3228 (1993)
30. Hasegawa, K.: J. Math. Phys. 35 (11), 6158–6171 (1994)
31. Sklyanin, E.K.: Funct. Anal. and Appl. (Engl. transl.) 17, 273–284 (1983)
32. Baxter, R.J.: I. Ann. Phys. 76, 1–24 (1973); II. ibid. 25–47; III. ibid. 48–71
33. Jimbo, M., Miwa, T. and Okado, M.: Lett. Math. Phys. 14, 123–131 (1987); Nucl. Phys. B300, 74–108 (1988)
34. Krichever, I.M., Babelon, O., Billey, E., Talon, M.: Spin generalization of Calogero-Moser system and the matrix KP equation. hep-th/9411160
35. Nekrasov, N.: Commun. Math. Phys. 180, 587–604 (1996)
36. Enriquez, B. and Rubtsov, V.: Hitchin systems, higher Gaudin operators and r-matrices. Math. Research Lett. 3, 343–358 (1996)
37. Krichever, I.M., Zabrodin, A.V.: Spin generalizations of the Ruijsenaars-Schneider model, non-abelian 2D Toda chain and representations of Sklyanin algebra. hep-th/9505039
38. Kuznetsov, V.B. and Sklyanin, E.K.: J. Phys. A: Math. Gen. 29, 2779–2804 (1996)

Communicated by G. Felder

Commun. Math. Phys. 192, 433 – 461 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Zero Viscosity Limit for Analytic Solutions, of the Navier-Stokes Equation on a Half-Space. I. Existence for Euler and Prandtl Equations

Marco Sammartino (1,*), Russel E. Caflisch (2,**)

(1) Dipartimento di Matematica, University of Palermo, Via Archirafi 34, 90123 Palermo, Italy. E-mail: [email protected]
(2) Mathematics Department, UCLA, Los Angeles, CA 90096-1555, USA. E-mail: [email protected]

Received: 5 September 1995 / Accepted: 14 July 1997

Abstract: This is the first of two papers on the zero-viscosity limit for the incompressible Navier-Stokes equations in a half-space. In this paper we prove short time existence theorems for the Euler and Prandtl equations with analytic initial data in either two or three spatial dimensions. The main technical tool in this analysis is the abstract Cauchy-Kowalewski theorem. For the Euler equations, the projection method is used in the primitive variables, to which the Cauchy-Kowalewski theorem is directly applicable. For the Prandtl equations, Cauchy-Kowalewski is applicable once the diffusion operator in the vertical direction is inverted.

1. Introduction

The zero-viscosity limit for the incompressible Navier-Stokes equations in a half-space is a challenging problem due to the formation of a boundary layer whose thickness is proportional to the square root of the viscosity. Boundary layer separation, which is difficult to control, may cause singularities in the boundary layer equations. In this and the companion paper Part II, we overcome these difficulties by imposing analyticity on the initial data. Under this condition, we prove that, in the zero-viscosity limit and for a short time, the Navier-Stokes solution in a half-space goes to an Euler solution outside a boundary layer in either two or three spatial dimensions, and that it is close to a solution of the Prandtl equations within the boundary layer.

The construction of the Navier-Stokes solution is performed as a composite asymptotic expansion involving an Euler solution, a Prandtl boundary layer solution and a correction term. It follows the earlier, unpublished analysis of Asano [1], who also restricted the data to be analytic, but our work contains a considerably simplified exposition, explicit use of the Prandtl equations, and several other technical differences:

* Research supported in part by the NSF under grant #DMS-9306720.
** Research supported in part by DARPA under URI grant number # N00014092-J-1890.


Asano used a sup-estimate on the divergence-free projection operator, which we have been unable to verify. He also used high order derivative norms in the y (normal) variable, whereas we find it necessary to use only second derivatives.

An earlier attempt to analyze this problem, without the requirement of analyticity and without explicit use of the Prandtl equations, was made by Kato [8]. It was not completely successful, since it required some unverified assumptions on the Navier-Stokes solution. Analysis of the zero-viscosity limit for the Navier-Stokes solution in an unbounded domain was performed in [3, 7, 13].

In this first part, we present short-time existence results for the Euler and Prandtl equations in a half-space with analytic initial data. The main significance of the Euler result is that it is stated in terms of the function spaces used in the Navier-Stokes result of Part II. For the Euler equations, of course, this is not an optimal result since analyticity is not needed for existence of a solution. Moreover, a more general existence result for analytic solutions of the incompressible Euler equations was proved earlier by Bardos and Benachour [2]. The present proof is somewhat different, since it uses the projection method on the primitive variables, rather than the vorticity formulation.

For the Prandtl equations, on the other hand, our result on existence for short time and analytic initial data is the first general existence theorem for the unsteady problem. To the best of our knowledge, the only previous existence theorem for the unsteady Prandtl equations was by Oleinik [10]. For the Prandtl equations with upstream velocity prescribed at the left (x = 0) as well as at infinity and at t = 0, she proved existence for either a short time for all x > 0 or for a short distance and for all time, without the analyticity assumption.
The proof required conditions that the prescribed horizontal velocities are all positive and strictly increasing, which are not required in our result. For a review of related mathematical results on both the steady and unsteady Prandtl equations, see [9].

In fact, we conjecture that the general initial value problem for the Prandtl equations is ill-posed in Sobolev space. Although ill-posedness has not been proved, there is some evidence in its favor: First, previous attempts to construct such solutions have failed. Second, there are numerical solutions of Prandtl that develop singularities associated with boundary layer separation in finite time [4–6]. Most recently, E and Engquist [14] have proved existence of Prandtl solutions with singularities. This is not enough to show ill-posedness, however, because in these computations and analysis the singularity time is not small.

The main technical tool of our analysis is the abstract Cauchy-Kowalewski Theorem (ACK), the optimal form of which is due to Safonov [11]. This theorem, which is for systems that are first order in some sense, is directly applicable to the Euler equations. Since the Prandtl equations are diffusive rather than first order, the classical Cauchy-Kowalewski Theorem cannot be applied, and it may at first seem surprising that the ACK Theorem is applicable to them. As pointed out by Asano [1], however, the ACK Theorem may be used for a nonlinear diffusion equation after inversion of the diffusion operator. We show below that this strategy works for the Prandtl equations, and in Part II, we shall also apply it to the Navier-Stokes equations. The main simplification of this analytic method over Sobolev methods is that it uses Cauchy estimates to bound derivatives rather than energy estimates.

In Sect. 2 the Euler and Prandtl equations are stated and a number of function spaces and norms are defined. The abstract Cauchy-Kowalewski Theorem is formulated in Sect. 3.
The existence theorem for the Euler equations is stated and proved in Sect. 4, which includes a convenient formulation and some useful bounds for the projection method. The existence theorem for the Prandtl equations is stated and proved in Sect. 5,


using properties of the heat operator, which are proved in an appendix. The analysis for Prandtl is completely independent of that for Euler. Some concluding remarks are made in Sect. 6. For convenience, the formulation and analysis will often be written in 2D, but the extension to 3D is straightforward. Key points in the 3D extension will be noted and the main results will be stated for 2D and 3D.

2. Statement of the Problem and Notation

2.1. Euler equations. The Euler equations for a velocity field $\boldsymbol{u}^E = (u^E, v^E)$ are

$$ \partial_t \boldsymbol{u}^E + \boldsymbol{u}^E\cdot\nabla\boldsymbol{u}^E + \nabla p^E = 0, \tag{2.1} $$
$$ \nabla\cdot\boldsymbol{u}^E = 0, \tag{2.2} $$
$$ \gamma_n \boldsymbol{u}^E \equiv v^E(x, y=0, t) = 0, \tag{2.3} $$
$$ \boldsymbol{u}^E(x, y, t=0) = \boldsymbol{u}^E_0(x, y). \tag{2.4} $$

Here $\boldsymbol{u}^E$ depends on the variables $(x, y, t)$, where $x$ is the transversal variable going from $-\infty$ to $\infty$, $y$ is the normal variable going from 0 to $\infty$, and $t$ is the time. The operator $\gamma_n$ acting on vectorial functions gives the normal component calculated at the boundary. In the rest of this paper we shall also use the trace operator $\gamma$, defined by

$$ \gamma\boldsymbol{u} = \big(u(x, y=0, t),\, v(x, y=0, t)\big). \tag{2.5} $$

In Sect. 4 we shall prove that under suitable hypotheses for the initial condition $\boldsymbol{u}^E_0$ (essentially analyticity in $x$ and $y$) the Euler equations admit a unique solution. Although stated here for 2D, the analysis works equally well in 3D. The existence result is only for a short time.

2.2. Prandtl equations. The Euler equations are a particular case of the Navier-Stokes (N-S) equations, when the fluid has zero viscosity. Therefore, the Navier-Stokes solution for small viscosity $\nu$ is expected to be well approximated by an Euler solution, at least away from boundaries, which is confirmed by numerical and experimental observations. An analysis of the short time, spatially global behavior (in presence of a boundary) of the N-S equations will be the subject of Part II of this work, [12]. In the vicinity of the boundaries, on the other hand, the effect of viscosity is O(1) even as the viscosity goes to zero. The no-slip condition causes the creation of vorticity; moreover in a small layer there is an adjustment of the flow to the outer (inviscid) flow. Due to the resulting rapid variation of the fluid velocity, the velocity depends on a scaled normal variable $Y = y/\varepsilon$ in which $\varepsilon = \sqrt{\nu}$. Also the vertical velocity is of size $\varepsilon$. The resulting equations governing the velocity field $\boldsymbol{u}^P = (u^P, \varepsilon v^P)$ are the Prandtl equations:

$$ (\partial_t - \partial_{YY})\,u^P + u^P\partial_x u^P + v^P\partial_Y u^P + \partial_x p^P = 0, \tag{2.6} $$
$$ \partial_Y p^P = 0, \tag{2.7} $$
$$ \partial_x u^P + \partial_Y v^P = 0, \tag{2.8} $$
$$ u^P(x, Y=0, t) = 0, \tag{2.9} $$
$$ u^P(x, Y\to\infty, t) \longrightarrow u^E(x, y=0, t), \tag{2.10} $$
$$ u^P(x, Y, t=0) = u^P_0(x, Y). \tag{2.11} $$


Equation (2.10) is the matching condition between the flow inside the boundary layer and the outer Euler flow. In Sect. 5 we shall prove that the Prandtl solution approaches the boundary value of the Euler solution at an exponential rate as $Y$ goes to infinity. Equation (2.7) implies that the pressure is constant across the boundary layer; to match with the Euler pressure $p^E$, it must satisfy

$$ \partial_x p^P = \partial_x p^E(x, y=0, t) = -\gamma\,(\partial_t + u^E\partial_x)\,u^E. \tag{2.12} $$

The normal component of the velocity $v^P$ can be found, using the incompressibility condition, to be

$$ v^P = -\int_0^Y \partial_x u^P(x, Y', t)\,dY'. \tag{2.13} $$

In 3D, $\partial_x u^P$ is replaced by $\nabla'\cdot\boldsymbol{u}^P$ in this integral, where $\nabla'$ is the gradient with respect to the transversal variables. Therefore Eq. (2.6) can be considered as an equation for the transversal component $u^P$, with $v^P$ given by Eq. (2.13), with boundary conditions (2.9)–(2.10), and with initial conditions (2.11). Moreover there must be compatibility between the boundary conditions and initial conditions; i.e.

$$ \gamma u^P_0 = 0, \tag{2.14} $$
$$ u^P_0(Y\to\infty) - \gamma u^E_0 \longrightarrow 0. \tag{2.15} $$
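Formula (2.13) says that, once the transversal velocity is known, the normal velocity is recovered by a single integration in Y. The following sketch illustrates this numerically with a composite trapezoid rule. The profile u^P = sin(x)(1 - exp(-Y)) is a hypothetical test function chosen only because its integral is known in closed form; it is not a solution taken from the paper, and the function names are illustrative.

```python
import math


def normal_velocity(du_dx, x, Y, n=2000):
    """Approximate v^P(x, Y) = -int_0^Y d_x u^P(x, Y') dY' by the trapezoid rule.

    du_dx(x, Y) must return the x-derivative of the transversal velocity.
    """
    h = Y / n
    total = 0.5 * (du_dx(x, 0.0) + du_dx(x, Y))
    for k in range(1, n):
        total += du_dx(x, k * h)
    return -h * total


def du_dx_example(x, Y):
    """x-derivative of the hypothetical profile u^P = sin(x) * (1 - exp(-Y))."""
    return math.cos(x) * (1.0 - math.exp(-Y))


def exact_v_example(x, Y):
    """Closed-form value of (2.13) for the hypothetical profile above."""
    return -math.cos(x) * (Y + math.exp(-Y) - 1.0)
```

Note that v^P vanishes at Y = 0 by construction, so the no-penetration condition at the wall is built into (2.13), while for large Y the normal velocity of this test profile grows linearly in Y.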

In this paper we shall prove the existence and the uniqueness of the solutions for Eqs. (2.1)–(2.4) and (2.6)–(2.11). We now introduce the appropriate function spaces.

2.3. Function spaces. Let us introduce the "strip", the angular sector and the "conoid" in the complex plane

$D(\rho) = \mathbb{R}\times(-\rho,\rho) = \{x\in\mathbb{C} : \Im x\in(-\rho,\rho)\}$, (2.16)
Σ(θ) = {y ∈ C :
(2.17) (2.18)

In the sequel we shall always be dealing with functions that are analytic in either the single complex variable $x$ or the two complex variables $x$ and $y$. The functions will be either $L^2$ in the transversal variable $x$ and bounded in the normal variable $y$, or $L^2$ in both the transversal and normal variable. Next introduce the paths along which the $L^2$ integration is performed:

$\Gamma(b) = \{x\in\mathbb{C} : \Im x = b\}$, (2.19)
Γ(θ', a) = {y ∈ C : 0 ≤
(2.20)

Some of the norms below are defined in terms of the unscaled variable $y$, while others use the scaled variable $Y = y/\varepsilon$. Throughout this paper, the values of the angle $\theta$ and the parameter $l$ counting the number of derivatives will always be restricted to $0 < \theta < \pi/4$, $4 \le l$.


We have not attempted to make an optimal choice of $l$. Given a Banach scale $\{X_\rho\}_{0\le\rho\le\rho_0}$ we define $B^k_\beta(A, X_\rho)$ as the space of all $C^k$ functions from $A$ to $X_\rho$ with the norm

$$ |f|_{k,\rho,\beta} = \sum_{j=0}^{k}\,\sup_{t\in A}\,|\partial_t^j f(t)|_{\rho-\beta t}. \tag{2.21} $$

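Here $A$ is supposed to be an interval $[0, T]$ of time, and $\rho$ may be a vector of parameters such as $(\rho, \theta)$ or $(\rho, \theta, \mu)$, in which case $\rho - \beta t$ is replaced by $(\rho - \beta t, \theta - \beta t)$ or $(\rho - \beta t, \theta - \beta t, \mu - \beta t)$. Due to the large number of function spaces and norms, we do not have a separate notation for every norm. Instead, we will always state the function space under consideration, and then the norm will be the one for that space.

Table 1. Table of Function Spaces (E = Euler, E1 = first order Euler, P = Prandtl, S = Stokes, NS = Navier-Stokes)

| Function space | $L^2$ in | sup in | $O(\partial_x)$ | $O(\partial_y)$ or $O(\partial_Y)$ | $O(\partial_t)$ | Equations |
|---|---|---|---|---|---|---|
| $H'^{\,l,\rho}$ | $x$ | | $l$ | | | E, P, S, NS |
| $H^{l,\rho,\theta}$ | $x, y$ | | $j \le l$ | $l - j$ | | E |
| $H^{l,\rho,\theta}_{\beta,T}$ | $x, y$ | $t$ | $j \le l$ | $k \le l - j$ | $l - k - j$ | E |
| $K^{l,\rho,\theta,\mu}$ | $x$ | $Y$ | $l - k$ | $k \le 2$ | | P, S |
| $K'^{\,l,\rho}_{\beta,T}$ | $x$ | $t$ | $l - j$ | | $j \le 1$ | P, S |
| $K^{l,\rho,\theta,\mu}_{\beta,T}$ | $x$ | $Y, t$ | $l - k$ | $k \le 2$ | 0 | P |
| | | | $l - 1$ | 0 | 1 | |
| $L^{l,\rho,\theta}$ | $x, Y$ | | $l$ | 0 | | S, NS |
| | | | $l - 2$ | $0 < k \le 2$ | | |
| $L^{l,\rho,\theta}_{\beta,T}$ | $x, Y$ | $t$ | $l - 2j$ | 0 | $j \le 1$ | S, NS |
| | | | $l - 2$ | $0 < k \le 2$ | 0 | |
| $N^{l,\rho,\theta}$ | $x, y$ | | $l$ | 0 | | E1 |
| | | | $l - 2$ | $0 < k \le 2$ | | |
| $N^{l,\rho,\theta}_{\beta,T}$ | $x, y$ | $t$ | $l - 2j$ | 0 | $j \le 1$ | E1 |
| | | | $l - 2$ | $0 < k \le 2$ | 0 | |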

The following are function spaces that will be used in the sequel. We begin with the space of functions depending only on the transversal variable (the $x$-variable). All the functions used below will belong, for a fixed time $t$ and for a fixed value of the normal coordinate $y$, to this space. A summary of these function spaces is made in Table 1.

Definition 2.1. $H'^{\,l,\rho}$ is the set of all complex functions $f(x)$ such that
• $f$ is analytic in $D(\rho)$.
• $\partial_x^\alpha f \in L^2(\Gamma(\Im x))$ for $\Im x \in (-\rho,\rho)$, $\alpha \le l$; i.e. if $\Im x$ is inside $(-\rho,\rho)$, then $\partial_x^\alpha f(\Re x + i\Im x)$ is a square integrable function of $\Re x$.
• $|f|_{l,\rho} = \sum_{\alpha\le l}\,\sup_{\Im x\in(-\rho,\rho)} \|\partial_x^\alpha f(\cdot + i\Im x)\|_{L^2(\Gamma(\Im x))} < \infty$.

We now introduce the dependence on the normal variable:

Definition 2.2. $H^{l,\rho,\theta}$ is the set of all functions $f(x, y)$ such that
• $f$ is analytic inside $D(\rho)\times\Sigma(\theta, a)$.
• $\partial_y^{\alpha_1}\partial_x^{\alpha_2} f(x, y) \in L^2\big(\Gamma(\theta', a);\, H'^{\,0,\rho}\big)$ with $|\theta'| \le \theta$, $\alpha_1 + \alpha_2 \le l$.
• $|f|_{l,\rho,\theta} = \sum_{\alpha_1+\alpha_2\le l}\,\sup_{|\theta'|\le\theta} \big\|\,|\partial_y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot, y)|_{0,\rho}\,\big\|_{L^2(\Gamma(\theta', a))} < \infty$.

For a fixed time $t$ all the functions used in the proof of existence and uniqueness for the Euler equations will belong to the above space. We now introduce the functions depending on time:

Definition 2.3. The space $H^{l,\rho,\theta}_{\beta,T}$ is defined as

$$ H^{l,\rho,\theta}_{\beta,T} = \bigcap_{j=0}^{l} B^j_\beta\big([0, T],\, H^{l-j,\rho,\theta}\big). $$

If a function $f(x, y, t)$ belongs to this space its norm is

$$ |f|_{l,\rho,\theta,\beta,T} = \sum_{j=0}^{l}\,\sup_{0\le t\le T}\,|\partial_t^j f(\cdot,\cdot,t)|_{l-j,\,\rho-\beta t,\,\theta-\beta t}. $$

In the above spaces we shall prove the existence of a solution of the Euler equations. We now pass to the function spaces for the Prandtl equations. The main difference with respect to the Euler equations is the presence of the heat operator $(\partial_t - \partial_{YY})$, which breaks the symmetry between the normal and transversal coordinates. We can require differentiability with respect to the normal coordinate $Y$ only up to the second order, and with respect to time $t$ only up to the first order. Moreover, for the Prandtl equations we shall use, in the $Y$ variable, the sup norm. This will allow us to observe the behavior of the Prandtl solution outside the boundary layer. The Prandtl solution will in fact turn out to exponentially match the boundary data of the Euler solution.

Definition 2.4. $K^{l,\rho,\theta,\mu}$, with $\mu > 0$, is the set of all functions $f(x, Y)$ such that
• $f$ is analytic inside $D(\rho)\times\Sigma(\theta)$.
• $\partial_Y^{\alpha_1}\partial_x^{\alpha_2} f(x, Y) \in C^0\big(\Sigma(\theta);\, H'^{\,0,\rho}\big)$ with $\alpha_1 \le 2$ and $\alpha_1 + \alpha_2 \le l$.
• |f|_{l,ρ,θ,µ} = Σ_{α1 ≤ 2} Σ_{α2 ≤ l−α2} sup_{Y ∈ Σ(θ)} e^{µ
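$$ \bigcap_{j=0}^{1} B^j_\beta\big([0, T],\, H'^{\,l-j,\rho}\big). $$

If $f(x, t)$ belongs to this space its norm is

$$ |f|_{l,\rho,\beta,T} = \sum_{j=0}^{1}\,\sum_{\alpha\le l-j}\,\sup_{0\le t\le T}\,|\partial_t^j\partial_x^\alpha f(t,\cdot)|_{0,\rho-\beta t}. $$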

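Definition 2.6. The space $K^{l,\rho,\theta,\mu}_{\beta,T}$ is the set of all functions $f(x, Y, t)$ such that
• $f \in C^0\big([0, T],\, K^{l,\rho,\theta,\mu}\big)$ and $\partial_t\partial_x^\alpha f \in C^0\big([0, T],\, K^{0,\rho,\theta,\mu}\big)$ with $\alpha \le l - 1$.
• $|f|_{l,\rho,\theta,\mu,\beta,T} = \sum_{\alpha_1\le 2}\sum_{\alpha_1+\alpha_2\le l}\,\sup_{0\le t\le T}\,|\partial_Y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot, Y, t)|_{0,\,\rho-\beta t,\,\theta-\beta t,\,\mu-\beta t} + \sum_{\alpha\le l-1}\,\sup_{0\le t\le T}\,|\partial_t\partial_x^\alpha f(\cdot,\cdot,t)|_{0,\,\rho-\beta t,\,\theta-\beta t,\,\mu-\beta t} < \infty$.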


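We now pass to the spaces for the correction terms in the Navier-Stokes solution. We shall use the spaces below only in Part II.

Definition 2.7. $N^{l,\rho,\theta}$ is the set of all functions $f(x, y)$ such that
• $f$ is analytic inside $D(\rho)\times\Sigma(\theta, a)$.
• $\partial_y^{\alpha_1}\partial_x^{\alpha_2} f(x, y) \in L^2\big(\Gamma(\theta', a);\, H'^{\,0,\rho}\big)$ with $|\theta'| \le \theta$, $\alpha_1 \le 2$, $\alpha_1 + \alpha_2 \le l$ and $\alpha_2 \le l - 2$ when $\alpha_1 > 0$.
• $|f|_{l,\rho,\theta} = \sum_{\alpha\le l}\,\sup_{|\theta'|\le\theta} \big\|\,|\partial_x^\alpha f(\cdot, y)|_{0,\rho}\,\big\|_{L^2(\Gamma(\theta', a))} + \sum_{0<\alpha_1\le 2}\sum_{\alpha_2\le l-2}\,\sup_{|\theta'|\le\theta} \big\|\,|\partial_y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot, y)|_{0,\rho}\,\big\|_{L^2(\Gamma(\theta', a))} < \infty$.

Definition 2.8. The space $N^{l,\rho,\theta}_{\beta,T}$ is defined as the set of all functions $f(x, y, t)$ such that:
• $f \in C^0([0, T], N^{l,\rho,\theta})$ and $\partial_t\partial_x^j f \in C^0([0, T], N^{0,\rho,\theta})$ with $j \le l - 2$.
• $|f|_{l,\rho,\theta,\beta,T} = \sum_{0\le j\le 1}\sum_{\alpha\le l-2j}\,\sup_{0\le t\le T}\,|\partial_t^j\partial_x^\alpha f(\cdot,\cdot,t)|_{0,\,\rho-\beta t,\,\theta-\beta t} + \sum_{0<\alpha_1\le 2}\sum_{\alpha_2\le l-2}\,\sup_{0\le t\le T}\,|\partial_y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot,\cdot,t)|_{0,\,\rho-\beta t,\,\theta-\beta t} < \infty$, in which the norms on the right are in $N^{l,\rho,\theta}$.

In the above two spaces we shall prove the existence of a solution for the first order correction of the Euler flow (see Sect. 5 of Part II).

Definition 2.9. $L^{l,\rho,\theta}$ is the set of all functions $f(x, Y)$ such that
• $f$ is analytic inside $D(\rho)\times\Sigma(\theta, a/\varepsilon)$.
• $\partial_Y^{\alpha_1}\partial_x^{\alpha_2} f(x, Y) \in L^2\big(\Gamma(\theta', a/\varepsilon);\, H'^{\,0,\rho}\big)$ with $|\theta'| \le \theta$, $\alpha_1 \le 2$, $\alpha_1 + \alpha_2 \le l$ and $\alpha_2 \le l - 2$ when $\alpha_1 > 0$.
• $|f|_{l,\rho,\theta} = \sum_{\alpha\le l}\,\sup_{|\theta'|\le\theta} \big\|\,|\partial_x^\alpha f(\cdot, Y)|_{0,\rho}\,\big\|_{L^2(\Gamma(\theta', a/\varepsilon))} + \sum_{0<\alpha_1\le 2}\sum_{0\le\alpha_2\le l-2}\,\sup_{|\theta'|\le\theta} \big\|\,|\partial_Y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot, Y)|_{0,\rho}\,\big\|_{L^2(\Gamma(\theta', a/\varepsilon))} < \infty$.

Definition 2.10. The space $L^{l,\rho,\theta}_{\beta,T}$ is the set of functions $f(x, Y, t)$ such that
• $f \in C^0\big([0, T],\, L^{l,\rho,\theta}\big)$ and $\partial_t\partial_x^\alpha f \in C^0\big([0, T],\, L^{0,\rho,\theta}\big)$ with $\alpha \le l - 2$.
• $|f|_{l,\rho,\theta,\beta,T} = \sum_{0\le j\le 1}\sum_{\alpha\le l-2j}\,\sup_{0\le t\le T}\,|\partial_t^j\partial_x^\alpha f(\cdot,\cdot,t)|_{0,\,\rho-\beta t,\,\theta-\beta t} + \sum_{0<\alpha_1\le 2}\sum_{\alpha_2\le l-2}\,\sup_{0\le t\le T}\,|\partial_Y^{\alpha_1}\partial_x^{\alpha_2} f(\cdot,\cdot,t)|_{0,\,\rho-\beta t,\,\theta-\beta t} < \infty$.

The above two spaces are the spaces where we shall prove existence and uniqueness for the overall Navier-Stokes correction (see Sect. 7 of Part II).

The large number of function spaces is needed because of the various types of data and equations that are considered here:
• The $H$ function spaces, which are $L^2(x, y)$, are natural function spaces for the Euler equations.
• The $K$ function spaces, which are $L^\infty(L^2(x), Y)$ with decay in $Y$, are natural for the Prandtl equations.
• For Navier-Stokes, the $L$ spaces, which are $L^2(x, Y)$ with a restricted number of derivatives in $Y$, are used to allow combination of the Euler and Prandtl results.
• For the first order Euler terms, the $N$ spaces, which are $L^2(x, y)$ with a restricted number of derivatives in $y$, are used.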


Table 2. Table of Operators (E = Euler, E1 = first order Euler, P = Prandtl, S = Stokes, NS = Navier-Stokes, I = this paper, II = Ref. [12])

| Operator | Description | Definition | Use |
|---|---|---|---|
| $P^\infty$ | projection | I (4.4) | E, E1 |
| $P$ | half-space projection | I (4.24), (4.25) | E, E1 |
| $P_t$ | integrated (in time) half-space projection | I (4.35) | E, E1 |
| $E_0(t)$ | heat op with IC, diff in $Y$ | I (5.4) | P |
| $E_1$ | heat op with BC, diff in $Y$ | I (5.8) | P |
| $E_2$ | heat op with force, diff in $Y$ | I (5.11) | P |
| $N'$ | $i\xi'/\lvert\xi'\rvert$ | I (4.10) | E, S, NS |
| $\tilde E_1$ | heat op with BC, diff in $x, Y$ | II (3.21) | S, NS |
| $\tilde E_2$ | heat op with force, diff in $x, Y$ | II (7.4) | NS |
| $\overline{P^\infty}$ | rescaled projection | II (7.12)-(7.13) | NS |
| $S$ | Stokes op | II (3.36) | S, NS |
| $N_0$ | projected heat op | II (7.15) | NS |
| $N^*$ | Navier-Stokes operator | II (7.20) | NS |

We shall also use a large number of operators in this analysis. For convenience and clarity, we list the operators used in this paper as well as in the second paper in Table 2, with reference to the location of their definitions.

In the following sections we shall often estimate products of functions belonging to the above spaces; if $l \ge 4$ such products can be estimated using the following Sobolev inequalities:

Proposition 2.1. Let $f, g \in H^{l,\rho,\theta}_{\beta,T}$ and $l \ge 4$. Then $f\cdot g \in H^{l,\rho,\theta}_{\beta,T}$, and

$$ |f\cdot g|_{l,\rho,\theta,\beta,T} \le c\,|f|_{l,\rho,\theta,\beta,T}\,|g|_{l,\rho,\theta,\beta,T}. \tag{2.22} $$

Proposition 2.2. Let $f, g \in L^{l,\rho,\theta}_{\beta,T}$ and $l \ge 4$. Then $f\cdot g \in L^{l,\rho,\theta}_{\beta,T}$, and

$$ |f\cdot g|_{l,\rho,\theta,\beta,T} \le c\,|f|_{l,\rho,\theta,\beta,T}\,|g|_{l,\rho,\theta,\beta,T}. \tag{2.23} $$

A similar statement holds if we are using the sup norm in the $Y$ variable:

Proposition 2.3. Let $f \in K^{l,\rho,\theta,0}_{\beta,T}$, $g \in K^{l,\rho,\theta,\mu}_{\beta,T}$ and $l \ge 3$ ($l \ge 4$ in 3D). Then $f\cdot g \in K^{l,\rho,\theta,\mu}_{\beta,T}$, and

$$ |f\cdot g|_{l,\rho,\theta,\beta,\mu,T} \le c\,|f|_{l,\rho,\theta,\beta,0,T}\,|g|_{l,\rho,\theta,\beta,\mu,T}. \tag{2.24} $$

3. Cauchy-Kowalewski Theorem

By a Banach scale $\{X_\rho : 0 < \rho \le \rho_0\}$ with norms $|\cdot|_\rho$ we mean a collection of Banach spaces such that $X_{\rho'} \subset X_{\rho''}$ and $|\cdot|_{\rho''} \le |\cdot|_{\rho'}$ when $\rho'' \le \rho' \le \rho_0$. Let $\tau > 0$, $0 < \rho \le \rho_0$ and $R > 0$.


Definition 3.1. $X_{\rho,\tau}$ is the set of all functions $u(t)$ from $[0, \tau]$ to $X_\rho$ endowed with the norm

$$ |u|_{\rho,\tau} = \sup_{0\le t\le\tau} |u(t)|_\rho. \tag{3.1} $$

Definition 3.2. $X_{\rho,\tau}(R)$ is the set of all functions $u(t)$ from $[0, \tau]$ to $X_\rho$ such that

$$ |u|_{\rho,\tau} \le R. \tag{3.2} $$

Definition 3.3. $Y_{\rho,\beta,\tau}$ is the set of all functions $u(t)$ from $[0, \tau]$ to $X_\rho$ endowed with the norm

$$ |u|_{\rho,\beta,\tau} = \sup_{0\le t\le\tau} |u(t)|_{\rho-\beta t}. \tag{3.3} $$

Definition 3.4. $Y_{\rho,\beta,\tau}(R)$ is the set of all functions $u(t)$ from $[0, \tau]$ to $X_\rho$ such that

$$ |u|_{\rho,\beta,\tau} \le R. \tag{3.4} $$

For $t$ in $[0, T]$, consider the equation

$$ u + F(t, u) = 0. \tag{3.5} $$

The basic existence theorem for this system is the following Abstract Cauchy-Kowalewski (ACK) Theorem, which is a slight modification of the version proved by Safonov [11].

Theorem 3.1. Suppose that $\exists R > 0$, $T > 0$, $\rho_0 > 0$, and $\beta_0 > 0$ such that if $0 < t \le T$, the following hold:
1. $\forall\, 0 < \rho' < \rho \le \rho_0 - \beta_0 T$ and $\forall u \in X_{\rho,T}(R)$ the function $F(t, u) : [0, T] \to X_{\rho'}$ is continuous.
2. $\forall\, 0 < \rho \le \rho_0 - \beta_0 T$ the function $F(t, 0) : [0, T] \to X_{\rho,T}(R)$ is continuous in $[0, T]$ and

$$ |F(t, 0)|_{\rho_0-\beta_0 t} \le R_0 < R. \tag{3.6} $$

3. $\forall\, 0 < \rho' < \rho(s) \le \rho_0 - \beta_0 s$ and $\forall\, u^1$ and $u^2 \in Y_{\rho_0,\beta_0,T}(R)$,

$$ |F(t, u^1) - F(t, u^2)|_{\rho'} \le C\int_0^t \frac{|u^1 - u^2|_{\rho(s)}}{\rho(s) - \rho'}\,ds. \tag{3.7} $$

Then $\exists \beta > \beta_0$ and $T_1 > 0$ such that Eq. (3.5) has a unique solution in $Y_{\rho_0,\beta,T_1}$.

In the applications below, this theorem will be applied with $\rho$ replaced by a vector of parameters $(\rho, \theta)$ or $(\rho, \theta, \mu)$, and the fraction $(\rho(s) - \rho')^{-1}$ is replaced by $(\rho(s) - \rho')^{-1} + (\theta(s) - \theta')^{-1}$ or $(\rho(s) - \rho')^{-1} + (\theta(s) - \theta')^{-1} + (\mu(s) - \mu')^{-1}$. This does not change the proof of the theorem.

The Euler, Prandtl and Navier-Stokes equations will be solved in a time integrated form. That is, the system $w_t = A(w)$ will be replaced by $u = A(\int u\,dt)$ in which $u = w_t$. In this form the natural estimate on the difference of $F$ is the right hand side of (3.7) plus an additional term like

$$ \int_0^t |u^1 - u^2|_{\rho}\,ds \int_0^t (\rho(s) - \rho')^{-1}\,ds. \tag{3.8} $$


This additional term can be bounded by the right hand side of (3.7) as follows: First replace $|u(t)|_\rho$ by $|u|_{\rho,t}$ everywhere, so that the norms are increasing in $t$. Also restrict to $\rho(s)$ which is decreasing in $s$, so that $(\rho(s) - \rho')^{-1}$ is also increasing. Then use the following simple lemma:

Lemma 3.1. Suppose that $a(s)$ and $b(s)$ are positive increasing functions. Then

$$ \int_0^t a(s)\,ds \int_0^t b(s)\,ds \le t\int_0^t a(s)\,b(s)\,ds. \tag{3.9} $$
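Lemma 3.1 is an instance of Chebyshev's integral inequality: for two functions that are monotone in the same direction, the double integral of (a(s) - a(r))(b(s) - b(r)) over the square [0, t] x [0, t] is nonnegative, and expanding it gives 2t times the integral of ab minus 2 times the product of the integrals. The sketch below checks the inequality numerically with a midpoint rule; the test functions and helper names are illustrative assumptions, not taken from the paper.

```python
def integral(f, t, n=4000):
    """Midpoint-rule approximation of the integral of f over [0, t]."""
    h = t / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def lemma_gap(a, b, t):
    """Return t * int(a*b) - int(a) * int(b) over [0, t].

    Lemma 3.1 asserts that this quantity is >= 0 when a and b are
    positive and increasing.
    """
    return t * integral(lambda s: a(s) * b(s), t) - integral(a, t) * integral(b, t)


def a_example(s):
    """Illustrative increasing function, playing the role of the norm |u|_{rho, s}."""
    return 1.0 + s


def b_example(s):
    """Illustrative increasing function, playing the role of (rho(s) - rho')^{-1}."""
    return 1.0 / (1.0 - 0.4 * s)
```

For instance, `lemma_gap(a_example, b_example, 2.0)` is positive; here `b_example` mimics a factor of the form (rho(s) - rho')^{-1} with rho(s) decreasing linearly in s, which is the situation in which the lemma is applied above.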

4. Existence and Uniqueness for the Euler Equations

In this section we shall prove the following theorem:

Theorem 4.1. Suppose that $\boldsymbol{u}^E_0 \in H^{l,\rho,\theta}$, $l \ge 4$, with $\nabla\cdot\boldsymbol{u}^E_0 = 0$ and $\gamma_n\boldsymbol{u}^E_0 = 0$. Then the Euler equations (2.1)–(2.4) in either 2D or 3D admit a unique solution $\boldsymbol{u}^E$ in $H^{l,\rho_0,\theta_0}_{\beta_0,T}$ for some $0 < \rho_0 < \rho$, $0 < \theta_0 < \theta$, $0 < \beta_0$, $0 < T$. This solution satisfies the following bound in $H^{l,\rho_0,\theta_0}_{\beta_0,T}$:

$$ |\boldsymbol{u}^E|_{l,\rho_0,\theta_0,\beta_0,T} < c\,|\boldsymbol{u}^E_0|_{l,\rho,\theta}. \tag{4.1} $$

The proof of this theorem will be based on the ACK Theorem in the function spaces $X_\rho = H^{l,\rho,\theta}$ and $Y_{\rho,\beta,T} = H^{l,\rho,\theta}_{\beta,T}$. The key idea in recasting the Euler equations into a form suitable for an iterative procedure is to introduce a new variable $\boldsymbol{u}^\star$, essentially a projected velocity (see Eq. (4.36) below), so that the boundary, initial and incompressibility conditions are automatically satisfied. The core of this section is devoted to the introduction of the half space projection operator, and to the estimate of this operator (Subsections 4.1 and 4.2). After that we shall introduce an estimate on the convective part of the Euler equation, Eq. (4.34), as a consequence of the Cauchy estimate for the derivative of an analytic function. In Subsect. 4.4 we shall solve the Euler equation, and recast it in the form given by Eq. (4.37). Using the estimates on the projection operator, and the Cauchy estimate, it will then be straightforward to verify all the hypotheses of the ACK Theorem.

4.1. The projection operator. To prove the above theorem we need to define several operators. We start with the Fourier transform of the functions $f(x)$ and $g(x, y)$:

$$ \hat f(\xi') = \frac{1}{(2\pi)^{1/2}}\int dx\, f(x)\,e^{-ix\xi'}, \qquad \hat g(\xi', \xi_n) = \frac{1}{2\pi}\int dx\,dy\, g(x, y)\,e^{-ix\xi' - iy\xi_n}, \tag{4.2} $$

where the above integrals are on the whole real line and real plane respectively. Later we shall restrict $y$ to be nonnegative. In the rest of this paper we shall adopt the convention of using $\xi'$ and $\xi_n$ as the dual of $x$ and $y$ respectively. In 3D, $x$ and $\xi'$ are vectors, and $x\xi'$ is replaced by $x\cdot\xi'$. Suppose that $\sigma(T)(\xi')$ is a function of $\xi'$ such that

$$ \widehat{Tf}(\xi') = \sigma(T)(\xi')\,\hat f(\xi'), \tag{4.3} $$

where $T$ is an operator acting on functions of one variable. Then $\sigma(T)$ is called the symbol of the pseudodifferential operator $T$. If $T$ acts on functions of two (or more) variables, the definition is analogous.

Zero Viscosity Limit for Analytic Solutions, of N-S Equation. I.


We can now define the free space projection operator $P^\infty$ acting on vectorial functions $\mathbf{u}(x,y)$, as the operator whose symbol is (here and in the rest of the paper we shall often omit the distinction between the operator and its symbol):

$P^\infty = \frac{1}{\xi'^2+\xi_n^2}\begin{pmatrix} \xi_n^2 & -\xi'\xi_n \\ -\xi'\xi_n & \xi'^2 \end{pmatrix}.$ (4.4)

It is easy to see that the action of the above operator consists in projecting vectorial functions onto their divergence-free part, i.e.

$\nabla\cdot P^\infty\mathbf{u} = 0.$ (4.5)

The operator $P^\infty$ can be thought of as

$P^\infty = 1 - \nabla\Delta^{-1}\nabla\cdot$ (4.6)
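The two properties claimed for the symbol in Eq. (4.4), namely that it is a projection and that its range is divergence free, can be checked numerically at a sample frequency. The following is a minimal sketch; the function names and the sample frequency are illustrative assumptions, not part of the paper.

```python
def free_space_projection(xi_t, xi_n):
    """2x2 symbol of P^infinity from Eq. (4.4) at real frequencies (xi', xi_n)."""
    norm2 = xi_t ** 2 + xi_n ** 2
    return [[xi_n ** 2 / norm2, -xi_t * xi_n / norm2],
            [-xi_t * xi_n / norm2, xi_t ** 2 / norm2]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def idempotency_error(xi_t, xi_n):
    """Largest entry of P*P - P; zero when P is a projection."""
    p = free_space_projection(xi_t, xi_n)
    p2 = matmul2(p, p)
    return max(abs(p2[i][j] - p[i][j]) for i in range(2) for j in range(2))


def divergence_error(xi_t, xi_n):
    """Largest entry of (xi', xi_n) P, the symbol of div P up to the factor i."""
    p = free_space_projection(xi_t, xi_n)
    return max(abs(xi_t * p[0][j] + xi_n * p[1][j]) for j in range(2))


print(idempotency_error(1.5, -0.7), divergence_error(1.5, -0.7))
```

Both printed numbers should be at the level of rounding error, in agreement with Eq. (4.5) and with the representation (4.6).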

We define the operators $N$, $D$ as the operators whose symbols are:

$D = e^{-|\xi'|y}, \qquad N = -\frac{1}{|\xi'|}\,e^{-|\xi'|y}.$ (4.7)

The above operators solve the Laplace equation in the half plane with Dirichlet and Neumann boundary conditions respectively. In fact the problems

$\Delta u(x,y) = 0, \qquad u(x,y=0) = f(x)$ (4.8)

and

$\Delta u(x,y) = 0, \qquad \partial_y u(x,y=0) = f(x)$ (4.9)

admit the solutions $u = Df$ and $u = Nf$ respectively. Another operator which is useful to introduce is

$N' = \frac{i\xi'}{|\xi'|}.$ (4.10)

Now we express the components of the projection operator on the free space in a different form; i.e.

$P^\infty_n = \frac{1}{\xi'^2+\xi_n^2}\left(-\xi'\xi_n,\ \xi'^2\right) = \frac{1}{2}\left(-N'\left[\frac{|\xi'|}{|\xi'|+i\xi_n} - \frac{|\xi'|}{|\xi'|-i\xi_n}\right],\ \frac{|\xi'|}{|\xi'|+i\xi_n} + \frac{|\xi'|}{|\xi'|-i\xi_n}\right).$ (4.11)
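The splitting in Eq. (4.11) is a partial-fraction identity, so it can be confirmed with complex arithmetic at a few sample frequencies. The sketch below is only a sanity check; the helper names and the sample points are assumptions made for illustration.

```python
def normal_row_direct(xi_t, xi_n):
    """Second row of the symbol (4.4): (-xi' xi_n, xi'^2) / (xi'^2 + xi_n^2)."""
    norm2 = xi_t ** 2 + xi_n ** 2
    return (-xi_t * xi_n / norm2, xi_t ** 2 / norm2)


def normal_row_split(xi_t, xi_n):
    """Same row written with the factors |xi'| / (|xi'| +- i xi_n) of Eq. (4.11)."""
    a = abs(xi_t)
    n_prime = 1j * xi_t / a            # symbol N' of Eq. (4.10)
    plus = a / (a + 1j * xi_n)
    minus = a / (a - 1j * xi_n)
    return (0.5 * (-n_prime) * (plus - minus), 0.5 * (plus + minus))


def max_mismatch(points):
    """Largest difference between the two forms over the given frequencies."""
    worst = 0.0
    for xi_t, xi_n in points:
        direct = normal_row_direct(xi_t, xi_n)
        split = normal_row_split(xi_t, xi_n)
        worst = max(worst, abs(direct[0] - split[0]), abs(direct[1] - split[1]))
    return worst


print(max_mismatch([(1.5, -0.7), (-2.0, 0.3), (0.4, 5.0)]))
```

The mismatch should be at rounding level for both signs of $\xi'$, which is why the factor $N'$ rather than $i$ appears in the first component.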

This will be useful in proving existence and uniqueness for the Euler equations, as well as for the Navier-Stokes error equation in Part II. We now give an expression for the projection operator involving the Fourier transform only in the $x$ variable, which will simplify the subsequent estimates.

Lemma 4.1. Let $f(\xi',y)$ be a function admitting the Fourier transform in $y$. Then

$\frac{|\xi'|}{|\xi'|+i\xi_n}\,f(\xi',y) = |\xi'|\int_{-\infty}^{y} e^{-|\xi'|(y-y')} f(\xi',y')\,dy'$ (4.12)

and

$\frac{|\xi'|}{|\xi'|-i\xi_n}\,f(\xi',y) = |\xi'|\int_{y}^{\infty} e^{|\xi'|(y-y')} f(\xi',y')\,dy'.$ (4.13)


M. Sammartino, R. E. Caflisch

We prove the second of these equalities. Define the function

$k_2(\xi',y) = \begin{cases} |\xi'|\,e^{|\xi'|y}, & y \le 0, \\ 0, & y > 0, \end{cases}$ (4.14)

so that the integral operator in Eq. (4.13) can be written as the convolution of $k_2$ and $f$; i.e.

$|\xi'|\int_{y}^{\infty} e^{|\xi'|(y-y')} f(\xi',y')\,dy' = k_2(\xi',y) * f(\xi',y).$ (4.15)

To prove Eq. (4.13) it is enough to apply the Fourier transform with respect to the $y$ variable to Eq. (4.15), and notice that

$\hat k_2 = \frac{1}{(2\pi)^{1/2}}\,\frac{|\xi'|}{|\xi'|-i\xi_n}.$ (4.16)

The proof of Eq. (4.12) is similar, and is done using the function

$k_1(\xi',y) = \begin{cases} |\xi'|\,e^{-|\xi'|y}, & y \ge 0, \\ 0, & y < 0. \end{cases}$ (4.17)
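The transform in Eq. (4.16) can be reproduced by direct quadrature of the kernel $k_2$ of Eq. (4.14). The sketch below uses a composite Simpson rule on a truncated interval; the truncation length, the number of nodes and the sample values of $|\xi'|$ and $\xi_n$ are assumptions chosen only to keep the example small.

```python
import cmath
import math


def simpson(func, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals; complex integrands allowed."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3


def k2_hat_numeric(a, xi_n, cutoff=60.0, n=20000):
    """Fourier transform in y of k2(y) = a*exp(a*y), y <= 0, with a = |xi'| > 0.

    The half line is truncated to [-cutoff/a, 0], where the neglected tail is
    of size exp(-cutoff).
    """
    def integrand(y):
        return a * math.exp(a * y) * cmath.exp(-1j * y * xi_n)
    return simpson(integrand, -cutoff / a, 0.0, n) / math.sqrt(2 * math.pi)


def k2_hat_exact(a, xi_n):
    """Closed form stated in Eq. (4.16)."""
    return a / (a - 1j * xi_n) / math.sqrt(2 * math.pi)


print(abs(k2_hat_numeric(2.0, 3.0) - k2_hat_exact(2.0, 3.0)))
```

The analogous check for $k_1$ only changes the sign in the exponent and the integration interval to $[0, \infty)$.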

As a result of Lemma 4.1, the normal and transversal components ($P^\infty_n$ and $P^{\infty\prime}$) of the projection operator can be written as

$P^\infty_n\mathbf{u} = \frac{1}{2}\left[\,|\xi'|\int_{-\infty}^{y} dy'\, e^{-|\xi'|(y-y')}\left(-N'u + v\right) + |\xi'|\int_{y}^{\infty} dy'\, e^{|\xi'|(y-y')}\left(N'u + v\right)\right],$ (4.18)

$P^{\infty\prime}\mathbf{u} = u + \frac{1}{2}\left[-|\xi'|\int_{-\infty}^{y} dy'\, e^{-|\xi'|(y-y')}\left(u + N'v\right) - |\xi'|\int_{y}^{\infty} dy'\, e^{|\xi'|(y-y')}\left(u - N'v\right)\right],$ (4.19)

in which $\mathbf{u} = (u, v)$. We now define the projection operator $P$ on the half plane $y \ge 0$, with vanishing normal component at the boundary, i.e. for $\gamma_n\mathbf{u}(y=0) = 0$, as

$P = P^\infty - \nabla N\gamma_n P^\infty.$ (4.20)

It is easy to see that the following properties hold for all $\mathbf{u}$:

$\nabla\cdot P\mathbf{u} = 0,$ (4.21)
$\gamma_n P\mathbf{u} = 0,$ (4.22)
$P^2 = P.$ (4.23)

Explicit formulas for the half-space projection $P$ are given by

$P'\mathbf{u} = u - \frac{1}{2}|\xi'|\left[\int_{0}^{y} dy'\, e^{-|\xi'|(y-y')}\left(u + N'v\right) + \int_{0}^{y} dy'\, e^{-|\xi'|(y+y')}\left(u - N'v\right) + \left(1 + e^{-2|\xi'|y}\right)\int_{y}^{\infty} dy'\, e^{|\xi'|(y-y')}\left(u - N'v\right)\right],$ (4.24)

$P_n\mathbf{u} = \frac{1}{2}|\xi'|\left[\int_{0}^{y} dy'\, e^{-|\xi'|(y-y')}\left(-N'u + v\right) - \int_{0}^{y} dy'\, e^{-|\xi'|(y+y')}\left(N'u + v\right) + \left(1 - e^{-2|\xi'|y}\right)\int_{y}^{\infty} dy'\, e^{|\xi'|(y-y')}\left(N'u + v\right)\right].$ (4.25)
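Formulas (4.24) and (4.25) can be tested against properties (4.21) and (4.22) for one fixed tangential frequency. The sketch below evaluates the integrals by Simpson quadrature and the $y$-derivative by a central difference; the data $u$, $v$, the frequency, the truncation of the infinite integral and the step sizes are all assumptions made for illustration.

```python
import math


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule (n even); accepts complex-valued integrands."""
    if hi == lo:
        return 0.0
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3


XI = 1.3                 # sample tangential frequency xi' (assumption)
A = abs(XI)              # |xi'|
N_PRIME = 1j * XI / A    # symbol N' of Eq. (4.10)
Y_MAX = 40.0             # numerical stand-in for the upper limit "infinity"


def u(y):
    """Sample tangential datum, already Fourier transformed in x (assumption)."""
    return math.exp(-y)


def v(y):
    """Sample normal datum (assumption)."""
    return y * math.exp(-2.0 * y)


def tangential(y):
    """P'u as written in Eq. (4.24)."""
    i1 = simpson(lambda s: math.exp(-A * (y - s)) * (u(s) + N_PRIME * v(s)), 0.0, y)
    i2 = simpson(lambda s: math.exp(-A * (y + s)) * (u(s) - N_PRIME * v(s)), 0.0, y)
    i3 = simpson(lambda s: math.exp(A * (y - s)) * (u(s) - N_PRIME * v(s)), y, Y_MAX)
    return u(y) - 0.5 * A * (i1 + i2 + (1 + math.exp(-2 * A * y)) * i3)


def normal(y):
    """P_n u as written in Eq. (4.25)."""
    i1 = simpson(lambda s: math.exp(-A * (y - s)) * (-N_PRIME * u(s) + v(s)), 0.0, y)
    i2 = simpson(lambda s: math.exp(-A * (y + s)) * (N_PRIME * u(s) + v(s)), 0.0, y)
    i3 = simpson(lambda s: math.exp(A * (y - s)) * (N_PRIME * u(s) + v(s)), y, Y_MAX)
    return 0.5 * A * (i1 - i2 + (1 - math.exp(-2 * A * y)) * i3)


def divergence(y, h=1e-3):
    """i xi' P'u + d/dy P_n u, with a central difference in y."""
    return 1j * XI * tangential(y) + (normal(y + h) - normal(y - h)) / (2 * h)


print(abs(normal(0.0)), abs(divergence(0.7)))
```

The first number checks the boundary condition (4.22), the second the incompressibility condition (4.21); the latter is limited by the finite-difference step rather than by the formulas.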

If $y$ is complex, these must be understood as contour integrals. The derivation above, based on the free space projection operator, is simple, but one can also directly check that the formulas (4.24) and (4.25) satisfy the conditions (4.21), (4.22) and (4.23). In the following subsection we shall introduce some bounds on the norm of the projection operator which we shall use in the rest of this section. The formulas Eq. (4.24) and Eq. (4.25) can be extended to 3D by the following modifications: Replace $u$, $\xi'$ and $N' = i\xi'/|\xi'|$ by vectors. In Eq. (4.24), replace $u$ in the integrals by $N'N'\cdot u$. In Eq. (4.25) replace $N'u$ by $N'\cdot u$.

4.2. Estimates on the projection operators. Let $f \in H'^{\,l,\rho}$ and consider the norm

$|f|_{l,\rho} = \sum_{k\le l}\left(\int d\xi'\,\left|e^{\rho|\xi'|}\,|\xi'|^{k}\,\hat f(\xi')\right|^{2}\right)^{1/2}.$ (4.26)

It is easy to see that the above norm is equivalent to the one we have previously introduced in $H'^{\,l,\rho}$. In the rest of this paper we shall use both of them according to convenience, sometimes switching from one to the other during the same estimate. Occasionally we shall omit the distinction between the function and its Fourier transform. Using the norm we just introduced, Jensen's inequality, and the expressions (4.24), (4.25) for $P$, one can easily prove the estimates in the next two lemmas.

Lemma 4.2. Let $\mathbf{u} \in H^{l,\rho,\theta}$. Then $P\mathbf{u} \in H^{l,\rho,\theta}$ and

$|P\mathbf{u}|_{l,\rho,\theta} \le c\,|\mathbf{u}|_{l,\rho,\theta}.$ (4.27)

We now define the function $\chi$ as

$\chi(y) = \min(1, |y|).$ (4.28)

What we want to show is that the normal component of $P$ goes to zero linearly fast near the origin. A precise statement is the following:

Lemma 4.3. Let $\mathbf{u} \in H^{l,\rho,\theta}$ with $l > 0$. Then

$\sup_{y\in\Sigma(\theta)} \frac{|P_n\mathbf{u}|_{0,\rho}}{\chi(y)} \le c\,|\mathbf{u}|_{l,\rho,\theta}.$ (4.29)

The significance of Lemma 4.3 will be clear only when we introduce the Cauchy estimate for normal derivatives. We shall use it in this paper to estimate the convective part of the Euler equation, and it will be crucial in Part II to handle the large (i.e. $O(\nu^{-1/2})$) generation of vorticity in the boundary layer. The proof of (4.29) can be easily achieved by distinguishing the two cases: $|y| \ge 1$ and $|y| < 1$.

4.3. The Cauchy estimates. To deal with the convective part of the Euler equations we introduce the Cauchy estimate of the derivative of an analytic function:

Lemma 4.4. Let $f \in H'^{\,l,\rho''}$. If $\rho' < \rho''$ then

$|\partial_x f|_{l,\rho'} \le \frac{|f|_{l,\rho''}}{\rho'' - \rho'}.$ (4.30)
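In terms of the Fourier norm (4.26), one way to see the loss factor in Eq. (4.30) is that $\partial_x$ multiplies $\hat f$ by $i\xi'$, while passing from $\rho''$ to $\rho'$ gains the factor $e^{-(\rho''-\rho')|\xi'|}$. The supremum of $|\xi'|e^{-\delta|\xi'|}$ with $\delta = \rho''-\rho'$ is $1/(e\delta)$, which is below $1/\delta$. The sketch below evaluates this supremum on a grid; it is an illustration of the mechanism under these assumptions, not the proof used in the paper, and the grid parameters are arbitrary.

```python
import math


def multiplier_sup(delta, xi_max=200.0, n=200000):
    """Grid approximation of sup over xi >= 0 of xi * exp(-delta * xi)."""
    step = xi_max / n
    return max((j * step) * math.exp(-delta * j * step) for j in range(n + 1))


delta = 0.25                               # sample value of rho'' - rho'
print(multiplier_sup(delta), 1.0 / (math.e * delta), 1.0 / delta)
```

The first two printed values should agree, and both are smaller than the third, which is the constant appearing in Eq. (4.30).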

If the derivative is with respect to the $y$ variable, because of the angular shape of the region of analyticity, we must multiply by $|y|$ for $y$ near 0.

Lemma 4.5. Let $f \in H^{l,\rho,\theta''}$. If $\theta' < \theta'' < \pi/4$ then

$|\chi(y)\,\partial_y f|_{l,\rho,\theta'} \le \frac{|f|_{l,\rho,\theta''}}{\theta'' - \theta'}.$ (4.31)

With the above two Cauchy estimates and using the Sobolev inequality (see e.g. Proposition 2.1), it is easy to prove the following two lemmas.

Lemma 4.6. Let $f$ and $g \in H^{l,\rho'',\theta''}$ with $l \ge 4$, and let $\rho' < \rho''$. Then

$|g\,\partial_x f|_{l,\rho',\theta''} \le c\,|g|_{l,\rho',\theta''}\,\frac{|f|_{l,\rho'',\theta''}}{\rho'' - \rho'}.$ (4.32)

Lemma 4.7. Let $f$ and $g \in H^{l,\rho'',\theta''}$, with $l \ge 4$ and with $g(y=0) = 0$, and let $\theta' < \theta''$. Then

$|g\,\partial_y f|_{l,\rho'',\theta'} \le c\,|g|_{l,\rho'',\theta'}\,\frac{|f|_{l,\rho'',\theta''}}{\theta'' - \theta'}.$ (4.33)

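We can finally estimate the convective part of the Euler equations.

Lemma 4.8. Suppose that $\mathbf{u}^1$ and $\mathbf{u}^2$ are in $H^{l,\rho,\theta}_{\beta,T}$, $l \ge 4$, and that $\gamma_n\mathbf{u}^1 = \gamma_n\mathbf{u}^2 = 0$. Moreover let $\rho'$, $\rho''$, $\theta'$ and $\theta''$ satisfy

$\rho - \beta t \ge \rho'' > \rho', \qquad \theta - \beta t \ge \theta'' > \theta'$

for $0 \le t \le T$. Then

$|\mathbf{u}^1\cdot\nabla\mathbf{u}^1 - \mathbf{u}^2\cdot\nabla\mathbf{u}^2|_{l,\rho',\theta'} \le c\left(\frac{|\mathbf{u}^1 - \mathbf{u}^2|_{l,\rho'',\theta'}}{\rho'' - \rho'} + \frac{|\mathbf{u}^1 - \mathbf{u}^2|_{l,\rho',\theta''}}{\theta'' - \theta'}\right),$ (4.34)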

where the constant $c$ depends only on $|\mathbf{u}^1|_{l,\rho,\theta,\beta,T}$ and $|\mathbf{u}^2|_{l,\rho,\theta,\beta,T}$.

4.4. Pressure-free Euler equations. The usual problem with the Euler equations is the presence of the pressure gradient in the conservation of momentum equations and the corresponding coupling of these evolution type equations to the incompressibility equation. There are two ways to circumvent these problems: the projection method, which is employed here, and the vorticity formulation. First we define the operator $P_t$, whose action on a vector function $\mathbf{u}(x,y,t)$ is given by

$P_t\mathbf{u}(x,y,t) = P\int_0^t ds\,\mathbf{u}(x,y,s),$ (4.35)

and pose

$\mathbf{u}^E(x,y,t) = \mathbf{u}^E_0(x,y) + P_t\mathbf{u}^*.$ (4.36)

It is clear that once $\mathbf{u}^E$ is expressed in the above form, the initial and boundary conditions for the Euler equations and the incompressibility condition are automatically satisfied. If we put (4.36) into the conservation of momentum equation we get

$\mathbf{u}^* + H(\mathbf{u}^*, t) = 0,$ (4.37)

where

$H(\mathbf{u}^*, t) = \left(\mathbf{u}^E_0 + P_t\mathbf{u}^*\right)\cdot\nabla\left(\mathbf{u}^E_0 + P_t\mathbf{u}^*\right).$ (4.38)

Existence and uniqueness of the solution $\mathbf{u}^*$ of Eq. (4.37), which implies existence and uniqueness for the Euler equations, is stated in the following theorem:

Theorem 4.2. Suppose $\mathbf{u}^E_0 \in H^{l,\rho,\theta}$, $l \ge 4$, with $\nabla\cdot\mathbf{u}^E_0 = 0$ and $\gamma_n\mathbf{u}^E_0 = 0$. Then Eq. (4.37) admits a unique solution $\mathbf{u}^*$ in $H^{l,\rho_0,\theta_0}_{\beta_0,T}$ for some $0 < \rho_0 < \rho$, $0 < \theta_0 < \theta$, $\beta_0 > 0$, $T > 0$.

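Theorem 4.1 follows directly from Theorem 4.2, using the following proposition, which is a consequence of Lemma 4.2:

Proposition 4.1. Let $\mathbf{u}^* \in H^{l,\rho_0,\theta_0}_{\beta_0,T}$. Then $P_t\mathbf{u}^* \in H^{l,\rho_0,\theta_0}_{\beta_0,T}$ and

$|P_t\mathbf{u}^*|_{l,\rho_0,\theta_0,\beta_0,T} \le c\,|\mathbf{u}^*|_{l,\rho_0,\theta_0,\beta_0,T}.$ (4.39)

We have also the following bound on $P_t$:

Proposition 4.2. Let $\mathbf{u}^* \in H^{l,\rho_0,\theta_0}_{\beta_0,T}$ and let $\rho' < \rho_0 - \beta_0 T$ and $\theta' < \theta_0 - \beta_0 T$. Then $P_t\mathbf{u}^* \in H^{l,\rho',\theta'}$ for each $0 < t < T$ and

$|P_t\mathbf{u}^*|_{l,\rho',\theta'} \le c\int_0^t ds\,|\mathbf{u}^*(\cdot,\cdot,s)|_{l,\rho',\theta'} \le c\,|\mathbf{u}^*|_{l,\rho_0,\theta_0,\beta_0,T}.$ (4.40)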

In the rest of this section we shall be concerned with proving Theorem 4.2. To do this we shall verify that the operator $H$ satisfies all the hypotheses of the ACK Theorem in the function spaces $X_{\rho,\theta} = H^{l,\rho,\theta}$ (at each fixed $t$) and $Y_{\rho,\theta,\beta,T} = H^{l,\rho,\theta}_{\beta,T}$ (as a function of $t$), and with $\rho$ replaced by the vector $(\rho,\theta)$.

4.5. The forcing term. It is obvious that $H$ satisfies the first condition of the ACK Theorem in the norms $H^{l,\rho,\theta}$. In this subsection we shall prove that there exists a constant $R_0$ such that

$|H(t,0)|_{l,\rho_0-\beta t,\theta_0-\beta t} \le R_0$ (4.41)

in $H^{l,\rho,\theta}$ for $0 \le t \le T$, which verifies the second assumption of the theorem. The constant $R_0$ will of course depend on $|\mathbf{u}^E_0|_{l,\rho,\theta}$ and on the difference between $\rho$ and $\rho_0$, $\theta$ and $\theta_0$. From Eq. (4.38), we see that

$H(t,0) = \mathbf{u}^E_0\cdot\nabla\mathbf{u}^E_0,$ (4.42)

and Lemmas 4.6 and 4.7 imply

$|\mathbf{u}^E_0\cdot\nabla\mathbf{u}^E_0|_{l,\rho_0-\beta t,\theta_0-\beta t} \le c\,|\mathbf{u}^E_0|^2_{l,\rho,\theta},$ (4.43)

which gives the desired bound (4.41). We now pass to the Cauchy estimate.


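4.6. The Cauchy estimate. In this subsection we shall be concerned with proving that the operator $H$ satisfies the last hypothesis of the ACK Theorem. We have to show that, if $\rho' < \rho(s) \le \rho_0 - \beta s$, $\theta' < \theta(s) \le \theta_0 - \beta s$, and if $\mathbf{u}^{*1}$ and $\mathbf{u}^{*2}$ are in $H^{l,\rho_0,\theta_0}_{\beta,T}$, $l \ge 4$, with

$|\mathbf{u}^{*1}|_{l,\rho_0,\theta_0,\beta,T} \le R, \qquad |\mathbf{u}^{*2}|_{l,\rho_0,\theta_0,\beta,T} \le R,$ (4.44)

then, in the $H^{l,\rho,\theta}$ norm,

$|H(t,\mathbf{u}^{*1}) - H(t,\mathbf{u}^{*2})|_{l,\rho',\theta'} \le C\int_0^t ds\left(\frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho',\theta(s)}}{\theta(s) - \theta'}\right).$ (4.45)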

First estimate the nonlinear term of $H$. Using Lemma 4.8 to estimate the convective part of the operator $H$ and then Proposition 4.2 leads to

$\left|P_t\mathbf{u}^{*1}\cdot\nabla P_t\mathbf{u}^{*1} - P_t\mathbf{u}^{*2}\cdot\nabla P_t\mathbf{u}^{*2}\right|_{l,\rho',\theta'}$
$\quad\le c\int_0^t ds\left(\frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho',\theta(s)}}{\theta(s) - \theta'}\right)$
$\qquad + \int_0^t ds\,|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho_0,\theta_0,\beta_0,T} \times \int_0^t ds\left(\frac{|\mathbf{u}^{*1}|_{l,\rho(s),\theta'} + |\mathbf{u}^{*2}|_{l,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|\mathbf{u}^{*1}|_{l,\rho',\theta(s)} + |\mathbf{u}^{*2}|_{l,\rho',\theta(s)}}{\theta(s) - \theta'}\right)$
$\quad\le C\int_0^t ds\left(\frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|\mathbf{u}^{*1} - \mathbf{u}^{*2}|_{l,\rho',\theta(s)}}{\theta(s) - \theta'}\right)$ (4.46)

in $H^{l,\rho,\theta}$, using Lemma 3.1 and the bound (4.44) in the last step. The estimate of the linear part is similar.

4.7. Conclusion of the proof of Theorem 4.1. Since all of the hypotheses of the ACK Theorem have been verified, the proof of Theorem 4.2 has been achieved. There exist $0 < \rho_0 < \rho$, $0 < \theta_0 < \theta$, and a $\beta_0 > 0$ such that Eq. (4.37) admits a unique solution in $H^{l,\rho_0,\theta_0}_{\beta_0,T}$. This also concludes the proof of Theorem 4.1 for the Euler equations.

5. Existence and Uniqueness for Prandtl's Equations

We want to prove that the Prandtl equations (2.6)-(2.11) admit a unique solution in an appropriate function space. The main result of this section is the following theorem:

Theorem 5.1. Suppose that $u^P_0$ satisfies the compatibility conditions (2.14) and (2.15), that $\mathbf{u}^E_0 \in H^{l+1,\rho,\theta}$, and that $u^P_0 - \gamma u^E_0 \in K^{l+1,\rho_0,\theta_0,\mu_0}$, $l \ge 3$ ($l \ge 4$ in 3D). Then there exists a unique solution $u^P$ of the Prandtl equations (2.6)-(2.11). This solution can be written as:

$u^P(x,Y,t) = \tilde u^P(x,Y,t) + \gamma u^E,$ (5.1)

where $\tilde u^P \in K^{l,\rho_1,\theta_1,\mu_1}_{\beta_1,T}$, with $0 < \rho_1 < \rho_0$, $0 < \theta_1 < \theta_0$, $0 < \mu_1 < \mu_0$, $\beta_1 > \beta_0 > 0$. This solution satisfies the following bound in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_1,T}$:

$|\tilde u^P|_{l,\rho_1,\theta_1,\mu_1,\beta_1,T} < c\left(|u^P_0 - \gamma u^E_0|_{l+1,\rho_0,\theta_0,\mu_0} + |\mathbf{u}^E_0|_{l+1,\rho,\theta}\right).$ (5.2)


In particular this shows that if the initial condition for the Prandtl equations exponentially approaches the initial value of the Euler flow calculated at the boundary, the same property will be true for the Prandtl solution at least for a short time. The proof of this theorem will occupy the remainder of this section. As in the proof of existence and uniqueness for the Euler equation, we shall recast the Prandtl equations in a form suitable for the use of the ACK Theorem (see Eq. (5.38) below). In Prandtl's equations a second order operator (the heat operator) is present. The key idea is to invert this operator, taking into account boundary and initial conditions. Therefore we shall first introduce the heat operators, and prove some bounds on them. In Subsect. 5.2 we find an operator form for the Prandtl equations, Eq. (5.38). The resulting operator $F$ consists of two terms: The first is a forcing term that accounts for BC and IC. The second is the composition of a convective operator and the inverse of the heat operator with zero BC and IC. With the bounds on the heat and convective operators, it is then straightforward to get the desired bounds, which is performed in Subsects. 5.3 and 5.4. In the rest of this section we shall always suppose $l \ge 4$ ($l \ge 5$ in 3D), as needed for Proposition 2.3.

5.1. Estimates on heat operators. To solve the Prandtl equations we introduce the heat kernel:

$E_0(Y,t) = \frac{1}{(4\pi t)^{1/2}}\exp\left(-Y^2/4t\right),$ (5.3)

and the heat operators acting on functions f (Y ) with
The operator E0 (t) is obtained by convolution with the heat kernel E0 (Y, t), with respect to Y , once the function f (Y ) is extended in an odd manner to
(5.5) (5.6) (5.7)

We need the following operator $E_1$, acting on functions defined on the boundary:

$E_1 g(x,t) = \int_0^t ds\, h(Y,t-s)\,g(x,s),$ (5.8)

where $h(Y,t)$ is defined by:

$h(Y,t) = \frac{Y}{t}\,\frac{\exp(-Y^2/4t)}{(4\pi t)^{1/2}}.$ (5.9)

The function $E_1 g$ solves the heat equation with zero initial data and with boundary value $g$; i.e.

$(\partial_t - \partial_{YY})\,E_1 g = 0, \qquad E_1 g|_{t=0} = 0, \qquad \gamma E_1 g = g.$ (5.10)
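For constant boundary data the integral in Eq. (5.8) has a closed form: with $g \equiv 1$ the substitution $\zeta = Y/\sqrt{4(t-s)}$ gives $E_1 g = \mathrm{erfc}(Y/\sqrt{4t})$, the classical solution of the heat equation on the half line with unit boundary value. The sketch below compares a midpoint-rule evaluation of (5.8) with this closed form; the choice $g \equiv 1$, the sample point and the number of nodes are assumptions made for illustration.

```python
import math


def h(Y, t):
    """Boundary kernel of Eq. (5.9)."""
    return (Y / t) * math.exp(-Y * Y / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def e1_of_one(Y, t, n=20000):
    """Midpoint rule for int_0^t h(Y, t - s) ds, i.e. E_1 applied to g = 1.

    The midpoint rule never evaluates h at t - s = 0, where the formula
    for h is singular in appearance but the kernel itself vanishes for Y > 0.
    """
    ds = t / n
    return sum(h(Y, t - (j + 0.5) * ds) for j in range(n)) * ds


Y, t = 0.8, 1.0
print(e1_of_one(Y, t), math.erfc(Y / math.sqrt(4.0 * t)))
```

As $Y \to 0$ the closed form tends to 1, which is the boundary condition $\gamma E_1 g = g$ in Eq. (5.10).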

Using $E_0(t)$ we define the operator $E_2$ by

$E_2 f = \int_0^t ds\, E_0(t-s)f(s) = \int_0^t ds\int_0^\infty dY'\left[E_0(Y-Y',t-s) - E_0(Y+Y',t-s)\right]f(Y',s).$ (5.11)

The operator $E_2$ inverts the heat operator with zero initial data and boundary data; i.e.

$(\partial_t - \partial_{YY})\,E_2 f = f, \qquad E_2 f|_{t=0} = 0, \qquad \gamma E_2 f = 0.$ (5.12)
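The inner integral in Eq. (5.11) is the heat kernel reflected oddly across $Y = 0$, which is what makes the boundary value vanish. A simple sanity check is to apply this kernel to $f(Y') = Y'$: its odd extension is $Y'$ on the whole line, which the heat flow leaves unchanged, so the result should be $Y$ for every $t$. The sketch below does this by Simpson quadrature; the test function (which is not in the decaying class used in the paper), the truncation and the node count are assumptions chosen only for illustration.

```python
import math


def heat_kernel(Y, t):
    """Heat kernel E_0(Y, t) of Eq. (5.3)."""
    return math.exp(-Y * Y / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def apply_reflected_kernel(f, Y, t, y_max=30.0, n=6000):
    """Simpson approximation of int_0^inf [E0(Y-Y',t) - E0(Y+Y',t)] f(Y') dY'."""
    step = y_max / n

    def integrand(s):
        return (heat_kernel(Y - s, t) - heat_kernel(Y + s, t)) * f(s)

    total = integrand(0.0) + integrand(y_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * step)
    return total * step / 3.0


def linear(s):
    """Hypothetical test function f(Y') = Y'."""
    return s


print(apply_reflected_kernel(linear, 0.5, 0.3), apply_reflected_kernel(linear, 0.0, 0.3))
```

The first value should be close to 0.5 and the second is zero, reflecting the boundary condition $\gamma E_2 f = 0$ in Eq. (5.12).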

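We now recall some basic properties of the heat operators. In the estimates below, $c$ is a constant depending (at most) only on $\rho$, $\theta$, $\beta$, $\mu$ and $T$. Notice that the restriction $\theta < \pi/4$ is needed here. Proofs of the results in this subsection are presented in Appendix A.

Lemma 5.1. Let $f(Y)$ and $g(Y)$ be two continuous bounded functions, and let $g$ be exponentially decaying at infinity; i.e. there exists a positive $\mu$ such that $\sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}|g(Y)| < \infty$. Then

$\sup_{Y\in\Sigma(\theta)}\left|\int_0^\infty dY'\,|E_0(Y\pm Y',t)|\,f(Y')\right| \le c\sup_{Y\in\Sigma(\theta)}|f(Y)|,$ (5.13)

$\sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}\left|\int_0^\infty dY'\,|E_0(Y\pm Y',t)|\,g(Y')\right| \le c\sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}|g(Y)|,$ (5.14)

in which the constant $c$ depends only on $\theta$ and $\mu$.

Lemma 5.2. Let $f \in C^1([0,T])$, with $f(0) = 0$, $0 < \theta < \pi/4$ and $j = 1, 2$. Then

$\sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}\left|E_1 f\right| \le c\sup_{t>0}|f(t)|,$ (5.15)

$\sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}\left|\partial_Y^j E_1 f\right| \le c\sup_{t>0}|f'(t)|.$ (5.16)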

The following bounds on analytic norms of the heat operators will be used throughout the rest of this paper:

Proposition 5.1. Let $u \in K^{l,\rho,\theta,\mu}$ with $\gamma u = 0$. Then $E_0(t)u \in K^{l,\rho,\theta,\mu}$ for all $t$ and

$\sup_{0\le t\le T}|E_0(t)u|_{l,\rho,\theta,\mu} \le c\,|u|_{l,\rho,\theta,\mu}.$ (5.17)

The above estimate obviously implies that $E_0(t)u \in K^{l,\rho,\theta,\mu}_{\beta,T}$ for all $\beta$ and $T$ and that

$|E_0(t)u|_{l,\rho,\theta,\mu,\beta,T} \le c\,|u|_{l,\rho,\theta,\mu}.$ (5.18)


Corollary 5.1. Let $u_0 = u + f$ with $u \in K^{l,\rho,\theta,\mu}$, $f \in H'^{\,l,\rho}$ constant with respect to $Y$ and $t$; moreover $\gamma u = -f$. Then $E_0(t)u_0 - f \in K^{l,\rho,\theta,\mu}$ for all $t$ and

$\sup_{0\le t\le T}|E_0(t)u_0 - f|_{l,\rho,\theta,\mu} \le c\left(|u|_{l,\rho,\theta,\mu} + |f|_{l,\rho}\right).$ (5.19)

The above estimate obviously implies that $E_0(t)u_0 - f \in K^{l,\rho,\theta,\mu}_{\beta,T}$ for all $\beta$ and $T$ and that

$|E_0(t)u_0 - f|_{l,\rho,\theta,\mu,\beta,T} \le c\left(|u|_{l,\rho,\theta,\mu} + |f|_{l,\rho}\right).$ (5.20)

In the next proposition we give an estimate of $E_0(t)u$ in a different space, namely in $L^{l,\rho,\theta}$; we shall not use this estimate in this paper but in Part II.

Proposition 5.2. Let $u \in L^{l,\rho,\theta}$ with $\gamma u = 0$. Then $E_0(t)u \in L^{l,\rho,\theta}$ for all $t$ and

$\sup_{0\le t\le T}|E_0(t)u|_{l,\rho,\theta} \le c\,|u|_{l,\rho,\theta}.$ (5.21)

The above estimate obviously implies that $E_0(t)u \in L^{l,\rho,\theta}_{\beta,T}$ for all $\beta$ and $T$ and that

$|E_0(t)u|_{l,\rho,\theta,\beta,T} \le c\,|u|_{l,\rho,\theta}.$ (5.22)

Proposition 5.3. Let $\phi \in K'^{\,l,\rho}_{\beta,T}$ with $\phi(t=0) = 0$. Then $E_1\phi \in K^{l,\rho,\theta,\mu}_{\beta,T}$ and

$|E_1\phi|_{l,\rho,\theta,\mu,\beta,T} \le c\,|\phi|_{l,\rho,\beta,T}.$ (5.23)

We have the following estimate for $E_2$:

Proposition 5.4. Let $u \in K^{l,\rho,\theta,\mu}_{\beta,T}$. Then $E_2 u \in K^{l,\rho,\theta,\mu}_{\beta,T}$ and

$|E_2 u|_{l,\rho,\theta,\mu,\beta,T} \le c\,|u|_{l,\rho,\theta,\mu,\beta,T}.$ (5.24)

The following estimates will also be useful:

Proposition 5.5. Let $u \in K^{l,\rho,\theta,\mu}_{\beta,T}$ with $\gamma u = 0$. If $\rho' < \rho - \beta t$, $\theta' < \theta - \beta t$ and $\mu' < \mu - \beta t$, then

$|E_2 u|_{l,\rho',\theta',\mu'} \le c\int_0^t ds\,|u(\cdot,\cdot,s)|_{l,\rho',\theta',\mu'} \le c\,|u|_{l,\rho,\theta,\mu,\beta,T}.$ (5.25)

5.2. The final form of Prandtl's equations. It is useful to introduce the new variable $\tilde u^P$:

$\tilde u^P = u^P - \gamma u^E.$ (5.26)

It is more natural to write the Prandtl equations in terms of this new variable. First, the matching condition with the outer Euler flow, Eq. (2.10), will be simply a consequence of the fact that $\tilde u^P$ is exponentially decaying in $Y$, i.e. of the fact that $\tilde u^P \in K^{l,\rho_1,\theta_1,\mu_1}_{\beta_1,T}$. Second, the gradient of the pressure will not show up in the equation. Equation (2.6) in terms of $\tilde u^P$ becomes:

$(\partial_t - \partial_{YY})\,\tilde u^P + \tilde u^P\partial_x\gamma u^E + \gamma u^E\partial_x\tilde u^P + \tilde u^P\partial_x\tilde u^P - \left[\int_0^Y \partial_x\tilde u^P\,dY' + Y\partial_x\gamma u^E\right]\partial_Y\tilde u^P = 0,$ (5.27)

where we have used

$v^P = -\int_0^Y \partial_x u^P\,dY' = -\left[\int_0^Y \partial_x\tilde u^P\,dY' + Y\partial_x\gamma u^E\right]$ (5.28)

and the Euler equation at the boundary

$\gamma\left(\partial_t u^E + u^E\partial_x u^E + \partial_x p^E\right) = 0.$ (5.29)

The initial condition for Eq. (5.27) is

$\tilde u^P(x,Y,t=0) = u^P_0(x,Y) - \gamma u^E_0 = \tilde u^P_0$ (5.30)

while the boundary condition is

$\gamma\tilde u^P = -\gamma u^E.$ (5.31)

Equation (5.27) for $\tilde u^P$ with (5.30) as initial condition and (5.31) as boundary condition, and with $v^P$ given by (5.28), is equivalent to (2.6)-(2.11) for $u^P$. To prove existence and uniqueness for (5.27)-(5.31), we shall use the ACK Theorem with the norms $K^{l,\rho,\theta,\mu}$ and $K^{l,\rho,\theta,\mu}_{\beta,T}$. To put Eq. (5.27) in a suitable form for the application of the ACK Theorem, we have to invert the heat operator in Eq. (5.27), taking into account the IC and BC. We define $U$ to be

$U = -\gamma u^E_0 - E_1\left(\gamma u^E - \gamma u^E_0\right) + E_0(t)\left(\tilde u^P_0 + \gamma u^E_0\right).$ (5.32)

It is easy to see that $U$ solves the heat equation with (5.30) as IC and (5.31) as BC; i.e.

$(\partial_t - \partial_{YY})\,U = 0,$ (5.33)
$U(t=0) = \tilde u^P_0,$ (5.34)
$\gamma U = -\gamma u^E.$ (5.35)

Define the operators $K(\tilde u^P,t)$, which is (minus) the convective part of Eq. (5.27), and $F$ as

$K(\tilde u^P,t) = -\left(\tilde u^P\partial_x\gamma u^E + \gamma u^E\partial_x\tilde u^P + \tilde u^P\partial_x\tilde u^P - \left[\int_0^Y \partial_x\tilde u^P\,dY' + Y\partial_x\gamma u^E\right]\partial_Y\tilde u^P\right),$ (5.36)

$F(t,\tilde u^P) = E_2 K(\tilde u^P,t) + U.$ (5.37)

The following equation is then equivalent to Eqs. (5.27)–(5.31):

$\tilde u^P = F(t,\tilde u^P).$ (5.38)

The rest of this section is devoted to proving that the operator $F(t,\tilde u^P)$ satisfies all the hypotheses of the ACK Theorem with $X = K^{l,\rho,\theta,\mu}$ and $Y = K^{l,\rho,\theta,\mu}_{\beta,T}$.

5.3. The forcing term. It is obvious that the operator $F$ satisfies the first condition of the ACK Theorem. In this subsection we shall prove that the operator $F$ satisfies the second condition of the ACK Theorem. Namely we prove that

$|F(t,0)|_{l,\rho_1-\beta_0 t,\theta_1-\beta_0 t,\mu_1-\beta_0 t} \le R_0$ (5.39)

in $K^{l,\rho_1-\beta_0 t,\theta_1-\beta_0 t,\mu_1-\beta_0 t}$ for $0 \le t \le T$, where $R_0$ is a constant. Since

$F(t,0) = U,$ (5.40)

Corollary 5.1 and Proposition 5.3 show that

Proposition 5.6. Given that $\tilde u^P_0 \in K^{l+1,\rho_1,\theta_1,\mu_1}$ with $\gamma\tilde u^P_0 = -\gamma u^E_0$ and $\mathbf{u}^E_0 \in H^{l+1,\rho_1,\theta_1}$, then $\gamma u^E \in K'^{\,l,\rho_1}_{\beta_0,T}$ and $U \in K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$ satisfying

$|U|_{l,\rho_1,\theta_1,\mu_1,\beta_0,T} \le c\left(|\mathbf{u}^E_0|_{l+1,\rho_1,\theta_1} + |\tilde u^P_0|_{l+1,\rho_1,\theta_1,\mu_1}\right).$ (5.41)

Notice how, to get that $\gamma u^E \in K'^{\,l,\rho_1}_{\beta_0,T}$, a Sobolev estimate in the $y$ variable has been used. With this proposition one sees that the forcing term is estimated in terms of the initial conditions for the Prandtl equations and of the outer Euler flow. This concludes the proof of the estimate (5.39).

5.4. The Cauchy estimate. In this and in the next subsections we shall prove that the operator $F$ as given by Eq. (5.37) satisfies the last hypothesis of the ACK Theorem. Namely we want to show that if $\rho' < \rho(s) \le \rho_1 - \beta_0 s$, $\theta' < \theta(s) \le \theta_1 - \beta_0 s$, $\mu' < \mu(s) \le \mu_1 - \beta_0 s$, and if $u^{(1)}$ and $u^{(2)}$ are in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$ with

$|u^{(1)}|_{l,\rho_1,\theta_1,\mu_1,\beta_0,T} < R \quad\text{and}\quad |u^{(2)}|_{l,\rho_1,\theta_1,\mu_1,\beta_0,T} < R,$ (5.42)

then

$|F(t,u^{(1)}) - F(t,u^{(2)})|_{l,\rho',\theta',\mu'} \le C\int_0^t ds\left(\frac{|u^{(1)} - u^{(2)}|_{l,\rho(s),\theta',\mu'}}{\rho(s) - \rho'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta(s),\mu'}}{\theta(s) - \theta'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta',\mu(s)}}{\mu(s) - \mu'}\right).$ (5.43)

In this subsection we shall be concerned with the operator $K$. The operator $K$ involves three different kinds of terms:

1. the nonlinear term involving the $x$-derivative, $\tilde u^P\partial_x\tilde u^P$;
2. the nonlinear term involving the $Y$-derivative, $\int_0^Y dY'\,\partial_x\tilde u^P\cdot\partial_Y\tilde u^P$;
3. the linear terms.

Before going into the details we anticipate that term (1) will be estimated using the Cauchy estimate in the $x$-variable, term (2) will be estimated using the Cauchy estimate in the $Y$-variable, and the linear growth in $Y$ of the coefficient of term (3) will be estimated using the exponential decay in $Y$ of the solution. Here and in the rest of this section $0 < \rho' < \rho'' \le \rho_1 - \beta_0 t$, $0 < \theta' < \theta'' \le \theta_1 - \beta_0 t$, $0 < \mu' < \mu'' \le \mu_1 - \beta_0 t$. We now state some lemmas used to estimate the convective operator.

Lemma 5.3. Suppose that $u^{(1)}$ and $u^{(2)}$ are in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$. Then

$|u^{(1)}\partial_x u^{(1)} - u^{(2)}\partial_x u^{(2)}|_{l,\rho',\theta',\mu'} \le c\,\frac{|u^{(1)} - u^{(2)}|_{l,\rho'',\theta',\mu'}}{\rho'' - \rho'}$ (5.44)

in the $K^{l,\rho,\theta,\mu}$ norm, where the constant $c$ depends only on $|u^{(1)}|_{l,\rho_1,\theta_1,\mu_1,\beta_0,T}$ and $|u^{(2)}|_{l,\rho_1,\theta_1,\mu_1,\beta_0,T}$. In fact

$|u^{(1)}\partial_x u^{(1)} - u^{(2)}\partial_x u^{(2)}|_{l,\rho',\theta',\mu'} \le |u^{(1)}\partial_x(u^{(1)} - u^{(2)})|_{l,\rho',\theta',\mu'} + |u^{(2)}\partial_x(u^{(1)} - u^{(2)})|_{l,\rho',\theta',\mu'} + |\partial_x u^{(2)}\,(u^{(1)} - u^{(2)})|_{l,\rho',\theta',\mu'} \le c\,\frac{|u^{(1)} - u^{(2)}|_{l,\rho'',\theta',\mu'}}{\rho'' - \rho'}.$ (5.45)

We now pass to the estimation of terms involving the $Y$-derivative. First we give a version of the Cauchy estimate Lemma 4.5 for analytic functions exponentially decaying in the $Y$-variable.

Lemma 5.4. Let $f \in H^{l,\rho',\theta'',\mu''}$. Then

$|\chi(Y)\,\partial_Y f|_{l,\rho',\theta',\mu'} \le \frac{|f|_{l,\rho',\theta'',\mu'}}{\theta'' - \theta'} + \mu'\,|f|_{l,\rho',\theta',\mu'},$ (5.46)

$|Y\partial_Y f|_{l,\rho',\theta',\mu'} \le \frac{|f|_{l,\rho',\theta'',\mu'}}{\theta'' - \theta'} + \mu'\,\frac{|f|_{l,\rho',\theta',\mu''}}{\mu'' - \mu'} + |f|_{l,\rho',\theta',\mu'}.$ (5.47)

To estimate the nonlinear term involving the $Y$-derivative we have to use the fact that the normal component of the velocity, as expressed by the integral from 0 to $Y$, goes to zero linearly fast. We now state a lemma similar to Lemma 5.3.

Lemma 5.5. Suppose that $u^{(1)}$ and $u^{(2)}$ are in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$. Then

$\left|\partial_Y u^{(1)}\int_0^Y dY'\,\partial_x u^{(1)} - \partial_Y u^{(2)}\int_0^Y dY'\,\partial_x u^{(2)}\right|_{l,\rho',\theta',\mu'} \le c\left(\frac{|u^{(1)} - u^{(2)}|_{l,\rho'',\theta',\mu'}}{\rho'' - \rho'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta'',\mu'}}{\theta'' - \theta'}\right)$ (5.48)

in the $K^{l,\rho,\theta,\mu}$ norm.

The proof of Lemma 5.5 goes like the proof of Lemma 5.3. The only thing to be noticed is the fact that, because of the presence of a derivative in both the terms, one cannot use the Sobolev estimate right away, but has to pay attention to the way the $l$ derivatives distribute between them. If all the $l$ derivatives hit the term involving the integral one has to Cauchy estimate the $x$-derivatives inside it. If instead all derivatives hit the term involving the $Y$-derivative one has to Cauchy estimate that derivative. The estimate of the linear term whose coefficient grows linearly in $Y$ is expressed in the following lemma.

Lemma 5.6. Suppose that $u^{(1)}$ and $u^{(2)}$ are in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$. Then

$|Y\partial_x\gamma u^E\,\partial_Y u^{(1)} - Y\partial_x\gamma u^E\,\partial_Y u^{(2)}|_{l,\rho',\theta',\mu'} \le c\left(\frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta'',\mu'}}{\theta'' - \theta'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta',\mu''}}{\mu'' - \mu'} + |u^{(1)} - u^{(2)}|_{l,\rho',\theta',\mu'}\right).$ (5.49)

Using Lemmas 5.3, 5.5 and 5.6 we can conclude this subsection with the following estimate on the convective operator of the Prandtl equations:

Proposition 5.7. Suppose $u^{(1)}$ and $u^{(2)}$ are in $K^{l,\rho_1,\theta_1,\mu_1}_{\beta_0,T}$. Then

$\left|K(u^{(1)},t) - K(u^{(2)},t)\right|_{l,\rho',\theta',\mu'} \le c\left(\frac{|u^{(1)} - u^{(2)}|_{l,\rho'',\theta',\mu'}}{\rho'' - \rho'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta'',\mu'}}{\theta'' - \theta'} + \frac{|u^{(1)} - u^{(2)}|_{l,\rho',\theta',\mu''}}{\mu'' - \mu'} + |u^{(1)} - u^{(2)}|_{l,\rho',\theta',\mu'}\right).$ (5.50)

5.5. Conclusion of the Proof of Theorem 5.1. To conclude the proof of estimate (5.43), first notice that in the iterative construction of the solution of (5.38), each term satisfies $\gamma\tilde u^P = -\gamma u^E$. The difference $K(u^{(1)}) - K(u^{(2)})$ need be considered only for these functions. As a result, we may assume that

$\gamma\left[K(u^{(1)},t) - K(u^{(2)},t)\right] = 0.$ (5.51)

This will allow us to use Proposition 5.5. In fact:

$\left|F(u^{(1)},t) - F(u^{(2)},t)\right|_{l-1,\rho',\theta',\mu'} = \left|E_2\left[K(t,u^{(1)}) - K(t,u^{(2)})\right]\right|_{l-1,\rho',\theta',\mu'}$
$\quad\le c\int_0^t ds\,\left|K(u^{(1)},t) - K(u^{(2)},t)\right|_{l-1,\rho',\theta',\mu'}$
$\quad\le c\int_0^t ds\left(\frac{|u^{(1)}(\cdot,\cdot,s) - u^{(2)}(\cdot,\cdot,s)|_{l-1,\rho(s),\theta',\mu'}}{\rho(s) - \rho'} + \frac{|u^{(1)}(\cdot,\cdot,s) - u^{(2)}(\cdot,\cdot,s)|_{l-1,\rho',\theta(s),\mu'}}{\theta(s) - \theta'} + \frac{|u^{(1)}(\cdot,\cdot,s) - u^{(2)}(\cdot,\cdot,s)|_{l-1,\rho',\theta',\mu(s)}}{\mu(s) - \mu'}\right).$ (5.52)

With the above estimate we conclude this subsection. The proof of the estimate (5.43) has been finally achieved. Therefore operator F satisfies all the hypotheses of the ACK Theorem, and Theorem 5.1 has been proved.


5.6. A final remark. The main result of this section is Theorem 5.1, stating the existence and the uniqueness of a solution $u^P$ of Eqs. (2.6)–(2.11), and that this solution is the sum of a function exponentially decaying outside the boundary layer and of the value at the boundary of the Euler flow. What about the normal velocity $v^P$? Corresponding to $\tilde u^P$ define $\tilde v^P$ by

$\tilde v^P = -\int_0^Y \partial_x\tilde u^P\,dY'.$ (5.53)

Using this expression, the fact that $\tilde u^P$ is exponentially decaying in the $Y$ variable, and a Cauchy estimate in the $x$ variable, it follows that $\tilde v^P$ differs by a constant (in $Y$) from a function in $K^{l-1,\rho_1',\theta_1,\mu_1}_{\beta_1,T}$ with $\rho_1' < \rho_1$. Renaming $\rho_1'$, just to simplify the notation, we can therefore conclude that

$\tilde u^P \in K^{l-1,\rho_1,\theta_1,\mu_1}_{\beta_1,T},$ (5.54)

$\bar v^P = \tilde v^P - \tilde v^P(Y=\infty) \in K^{l-1,\rho_1,\theta_1,\mu_1}_{\beta_1,T}.$ (5.55)

6. Conclusions This concludes the proofs of existence for the Euler and Prandtl equations with analytic initial data. These results will be used in Part II [12] as the leading order terms in an asymptotic expansion for the solution of the Navier-Stokes equations with small viscosity. The solution will be found as a composite expansion, using the Prandtl solution near the boundary and the Euler solution far from the boundary.

Appendix A: The estimates for the heat operators Proof of Lemma 5.1. We prove √ the estimate (5.14); the estimate (5.13) can be proved analogously. Set η = (Y 0 ± Y )/ 4t so that  Z   ∞ −(Y ±Y 0 )2 /4t e   √ dY 0 g(Y 0 ) sup eµ
(A.1)

Y ∈Σ(θ)

This uses the restriction that 0 < θ < π/4, so that e−η ≤ e−k<η , for some constant k and thus Z ∞ √ −η 2 exp −(∓Y + η 4t) ≤ c. (A.2) √ dηe 2

2

±Y / 4t

Proof of √ Lemma 5.2. We begin with the estimate (5.15). Use the change of variable ζ = Y / 4(t − s) to obtain


Z  2  t  Y e−Y /4(t−s)   √ sup e ds f (s)    t − s 4π(t − s) Y ∈Σ(θ) 0   Z   ∞ 2  −ζ 2 2  dζe f (t − Y /4ζ ) = sup eµ
t

Y ∈Σ(θ)

Y / 4t

= c sup |f (t)|.

(A.3)

t

We now pass to the estimate (5.16) with j = 1. Since f (0) = 0, then sup eµ
Y ∈Σ(θ)

 Z  2   t Y e−Y /4(t−s)   √ ds f (s)  ∂Y   t − s 4π(t − s) 0

= 2 sup eµ
 Z   ∞ 2 Y   dζe−ζ 2 f 0 (t − Y 2 /4ζ 2 )  √   Y / 4t ζ

0

≤ sup |f (t)| sup e t

Y ∈Σ(θ)

µ
Z Y

∞ √

dζe−ζ /ζ 2 2

Y / 4t

= c sup |f 0 (t)|.

(A.4)

t

The estimate (5.16) with $j = 2$ can be proved in a similar way, using $\partial_Y^2 E_1 f = \partial_t E_1 f$.

Proof of Proposition 5.1. Denote with $\partial^j$ a $j$th derivative in $x$ and $Y$ where the derivative in $Y$ does not show up more than twice (we stress again that in our functional setting a $Y$ derivative is required up to order two, see e.g. Definition 2.4). Then

$|E_0(t)u|_{l,\rho,\theta,\mu} = \sum_{j\le l}\ \sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}\sup_{|\Im x|<\rho}\left\|\partial^j\int_0^\infty dY'\left[E_0(Y-Y',t) - E_0(Y+Y',t)\right]u(x,Y')\right\|_{L^2}$
$\quad\le \sum_{j\le l}\ \sup_{Y\in\Sigma(\theta)} e^{\mu\Re Y}\int_0^\infty dY'\left[|E_0(Y-Y',t)| + |E_0(Y+Y',t)|\right]\sup_{|\Im x|<\rho}\left\|\partial^j u(\cdot,Y')\right\|_{L^2}$
$\quad\le c\,|u|_{l,\rho,\theta,\mu},$ (A.5)

where Lemma 5.1 has been used in the last step. For the first derivative in $Y$, the boundary terms at $Y = 0$ vanished because $u(Y=0) = 0$; for the second derivative, they vanished due to cancellation of the two $E_0$ factors.


Proof of Proposition 5.2 . Denote by ∂ j a j th derivative in x and Y , where the derivative in Y does not show up more than twice. Also if a Y derivative does show up, the order of the x derivative is at most l − 2 (as required in H l,ρ,θ ). Then |E0 (t)u|l,ρ,θ = (

Z

j sup

∂

|=x|<ρ

X

X

Z dY

sup

0 j≤l θ ≤θ

∞

dY

0

0

0(θ 0 ,a/ε)

0 0 0 E0 (Y − Y , t) − E0 (Y + Y , t) u(x, Y ) (

Z dY

sup

0 j≤l θ ≤θ

0(θ 0 ,a/ε)

√

∞

X Z ≤ 2 j≤l

∞ −∞

L2 (<x)



(

Z

dηe−η sup  2

L2 (<x)

"Z

∞ √ 2

sup ∂ j dηe−η u(x, Y + η 4t) √ |=x|<ρ −Y / 4t

#

−z 2 u(x, −Y + z 4t) − √ dze

Y / 4t Z

)2 1/2 

θ 0 ≤θ

dY 0(θ 0 ,a/ε)

2 1/2    

sup ∂ j u(·, Y 0 ) L2

)2 1/2 

|=x|<ρ

≤ c|u|l,ρ,θ .

(A.6)

Proof of Proposition 5.3 . We have |E1 φ|l,ρ,θ,µ,β,T ≤ c

X

X

sup |=x|≤ρ−βt

≤c

sup

sup

e(µ−βt)
α1 ≤2 α2 ≤l−α1 0≤t≤T Y ∈Σ(θ−βt)

X

k∂Yα1 ∂xα2 E1 φkL2

X

sup

sup

e(µ−βt)
α1 ≤2 α2 ≤l−α1 0≤t≤T Y ∈Σ(θ−βt)

     α1  α2 sup k∂x φkL2   ∂ Y E1   |=x|<ρ−βt ≤ c|φ|l,ρ,β,T .

(A.7)

In passing from the first to the second line we have used the fact that E1 u solves the heat equation (so that ∂t E1 u = ∂Y Y E1 u); in passing from the second to the third line we have used Lemma 5.2 with f (t) = sup|=x|<ρ−βt k∂xα2 φkL2 .


Proof of Proposition 5.4 . To estimate |E2 u|l,ρ,θ,µ,β,T we must estimate |∂xα E2 u|0,ρ,θ,µ,β,T with α ≤ l, |∂Y ∂xα E2 u|0,ρ,θ,µ,β,T with α ≤ l − 1, |∂t ∂xα E2 u|0,ρ,θ,µ,β,T with α ≤ l − 1 and |∂Y Y ∂xα E2 u|0,ρ,θ,µ,β,T with α ≤ l − 2. We begin with |∂xα E2 u|0,ρ,θ,µ,β,T : |∂xα E2 u|0,ρ,θ,µ,β,T = sup

e(µ−βt)
sup

0≤t≤T Y ∈Σ(θ−βT )

Z "Z

t ∞ p 2

α sup ∂x ds dηe−η u(x, Y + η 4(t − s), s) √ |=x|≤ρ−βt 0 −Y / 4(t−s) # Z ∞

p

−z 2 + dze u(x, −Y + z 4(t − s), s) √

2 Y / 4(t−s) L

≤

sup

e(µ−βt)
sup

0≤t≤T Y ∈Σ(θ−βT )

Z

"Z

t

∞

ds

+

∞ √ Y / 4(t−s)

"Z

t

ds

−z 2

dxe

sup |=x|≤ρ−βt

k∂xα u(·, Y + η

k∂xα u(·, −Y

+z

p

∞

p

4(t − s), s)kL2 #

4(t − s), s)kL2

sup

0≤t≤T Y ∈Σ(θ−βT )

dηe−η + 2

√ −Y / 4(t−s)

0

sup |=x|≤ρ−βt

≤ |∂xα u|0,ρ,θ,µ,β,T sup Z

2

−Y / 4(t−s)

0

Z

dηe−η

√

Z

#

∞ √ Y / 4(t−s)

dze−z

2

= |∂xα u|0,ρ,θ,µ,β,T .

(A.8)

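We now pass to $|\partial_Y\partial_x^{\alpha}E_2u|_{0,\rho,\theta,\mu,\beta,T}$; the only difference from the above estimate will be the appearance of a boundary term, behaving like $1/\sqrt{t-s}$, which is nevertheless bounded using the regularizing property of the integration in time,

$$
\begin{aligned}
|\partial_Y\partial_x^{\alpha}E_2u|_{0,\rho,\theta,\mu,\beta,T}
&\le \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \int_0^t ds \bigg[ \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\eta\, e^{-\eta^2} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, Y+\eta\sqrt{4(t-s)}, s\big)\big\|_{L^2} \\
&\qquad + \int_{Y/\sqrt{4(t-s)}}^{\infty} dz\, e^{-z^2} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, -Y+z\sqrt{4(t-s)}, s\big)\big\|_{L^2} \\
&\qquad + 2\,\frac{e^{-Y^2/4(t-s)}}{\sqrt{4(t-s)}} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_x^{\alpha}u(\cdot, 0, s)\big\|_{L^2} \bigg] \\
&\le |\partial_Y\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T} + c\,|\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T}\ \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} \int_0^t ds\, \frac{e^{-Y^2/4(t-s)}}{\sqrt{4(t-s)}} \\
&\le |\partial_Y\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T} + c\,|\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T}.
\end{aligned}
\tag{A.9}
$$

We now pass to $|\partial_t\partial_x^{\alpha}E_2u|_{0,\rho,\theta,\mu,\beta,T}$: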


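$$
\begin{aligned}
|\partial_t\partial_x^{\alpha}E_2u|_{0,\rho,\theta,\mu,\beta,T}
&\le \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} \Bigg\{ e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_x^{\alpha}u(\cdot, Y, t)\big\|_{L^2} \\
&\qquad + \int_0^t ds \bigg[ \frac{Y}{t-s}\,\frac{e^{-Y^2/4(t-s)}}{\sqrt{4(t-s)}} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_x^{\alpha}u(\cdot, 0, s)\big\|_{L^2} \\
&\qquad + \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\eta\, e^{-\eta^2}\, \frac{\eta}{\sqrt{t-s}} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, Y+\eta\sqrt{4(t-s)}, s\big)\big\|_{L^2} \\
&\qquad + \int_{Y/\sqrt{4(t-s)}}^{\infty} dz\, e^{-z^2}\, \frac{z}{\sqrt{t-s}} \sup_{|\Im x|\le\rho-\beta t} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, -Y+z\sqrt{4(t-s)}, s\big)\big\|_{L^2} \bigg] \Bigg\} \\
&\le c\,|\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T} + c\,|\partial_Y\partial_x^{\alpha}u|_{0,\rho,\theta,\mu,\beta,T} \le c\,|u|_{l,\rho,\theta,\mu,\beta,T}.
\end{aligned}
\tag{A.10}
$$

The term $|\partial_{YY}\partial_x^{\alpha}E_2u|_{0,\rho,\theta,\mu,\beta,T}$ can be bounded using the estimate (A.10) and the fact that $\partial_{YY}E_2u = \partial_tE_2u - u$. This concludes the proof of Proposition 5.4.

Proof of Proposition 5.5. The proof of Proposition 5.5 is very similar to the proof of Proposition 5.4, the main difference being that one is not allowed to use the regularizing properties of the integration in time; no singular term appears, though, because of the requirement that $u = 0$ at the boundary. Here we present the estimate of the term $|\partial_Y\partial_x^{\alpha}E_2u|_{0,\rho',\theta',\mu'}$ with $\alpha\le l-1$:

$$
\begin{aligned}
|\partial_Y\partial_x^{\alpha}E_2u|_{0,\rho',\theta',\mu'}
&\le \int_0^t ds\ \sup_{Y\in\Sigma(\theta')} e^{\mu'\Re Y} \bigg[ \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\eta\, e^{-\eta^2} \sup_{|\Im x|\le\rho'} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, Y+\eta\sqrt{4(t-s)}, s\big)\big\|_{L^2} \\
&\qquad + \int_{Y/\sqrt{4(t-s)}}^{\infty} dz\, e^{-z^2} \sup_{|\Im x|\le\rho'} \big\|\partial_Y\partial_x^{\alpha}u\big(\cdot, -Y+z\sqrt{4(t-s)}, s\big)\big\|_{L^2} \bigg] \\
&\le \int_0^t ds\, |\partial_Y\partial_x^{\alpha}u(\cdot,\cdot,s)|_{0,\rho',\theta',\mu'}.
\end{aligned}
\tag{A.11}
$$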
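The "regularizing property of the integration in time" invoked for (A.9) amounts to the fact that the boundary kernel $e^{-Y^2/4(t-s)}/\sqrt{4(t-s)}$, although singular like $1/\sqrt{t-s}$, has a time integral bounded by $\sqrt{t}$. A minimal numerical sketch for real $Y$ follows; the function names, the midpoint rule and the closed-form expression used for comparison are choices made here for illustration and are not taken from the text.

```python
import math


def boundary_time_integral(Y: float, t: float, n: int = 4000) -> float:
    """Midpoint-rule value of the integral over s in (0, t) of
    exp(-Y**2 / (4 (t - s))) / sqrt(4 (t - s)).

    The substitution t - s = r**2 turns it into the integral of
    exp(-Y**2 / (4 r**2)) over r in (0, sqrt(t)), whose integrand is bounded by 1.
    """
    h = math.sqrt(t) / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h  # midpoint of the k-th subinterval, always > 0
        total += math.exp(-(Y * Y) / (4.0 * r * r))
    return total * h


def closed_form(Y: float, t: float) -> float:
    """Closed-form value of the same integral for real Y and t > 0."""
    a = abs(Y)
    return (math.sqrt(t) * math.exp(-a * a / (4.0 * t))
            - 0.5 * math.sqrt(math.pi) * a * math.erfc(a / (2.0 * math.sqrt(t))))


if __name__ == "__main__":
    for Y in (0.0, 0.5, 2.0):
        print(Y, boundary_time_integral(Y, 1.0), closed_form(Y, 1.0))
```

Since the transformed integrand never exceeds 1, the integral is at most $\sqrt{t}\le\sqrt{T}$ uniformly in real $Y$, which is the kind of bound needed to absorb the boundary term into the constant $c$.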

Acknowledgement. Part of this work has been done while Marco Sammartino was visiting the Mathematics Department of UCLA. He wishes to express his gratitude for the warm hospitality that he received. We greatly benefited from discussions with a number of people including Kiyoshi Asano, Antonio Greco, Tom Hou, Mario Pulvirenti and Zhouping Xin.

References

1. Asano, K.: Zero-viscosity limit of the incompressible Navier-Stokes equations 1 and 2. Preprint (1988)
2. Bardos, C. and Benachour, S.: Domaine d'analycité des solutions de l'équation d'Euler dans un ouvert de R^n. Annali della Scuola Normale Superiore di Pisa IV, 4, 647–687 (1977)
3. Beale, J.T. and Majda, A.: Rates of convergence for viscous splitting of the Navier-Stokes equations. Math. Comp. 37, 243–259 (1981)
4. Cowley, S.J.: Computer extension and analytic continuation of Blasius' expansion for impulsive flow past a circular cylinder. J. Fluid Mech. 135, 389–405 (1983)
5. Van Dommelen, L.L. and Cowley, S.J.: On the Lagrangian description of unsteady boundary-layer separation. Part 1. General theory and Part 2. The spinning sphere. J. Fluid Mech. 210, 593–626 and 627–645 (1990)
6. Van Dommelen, L.L. and Shen, S.F.: The spontaneous generation of the singularity in a separating laminar boundary layer. J. Comput. Phys. 38, 125–140 (1980)
7. Ebin, D. and Marsden, J.: Groups of diffeomorphisms and the motion of an incompressible fluid. Ann. Math. 92, 102–163 (1970)
8. Kato, T.: Remarks on zero viscosity limit for nonstationary Navier-Stokes flows with boundary. In: Seminar on Partial Differential Equations. Berkeley: MSRI, 1984, pp. 85–98
9. Nickel, K.: Prandtl's boundary-layer theory from the viewpoint of a mathematician. Ann. Rev. of Fluid Mech. 5, 405–428 (1973)
10. Oleinik, O.A.: On the mathematical theory of boundary layer for an unsteady flow of incompressible fluid. J. Appl. Math. Mech. 30, 951–974 (1966)
11. Safonov, M.V.: The abstract Cauchy-Kovalevskaya theorem in a weighted Banach space. Comm. Pure and Appl. Math. 48, 629–637 (1995)
12. Sammartino, M. and Caflisch, R.E.: Zero Viscosity Limit for Analytic Solutions of the Navier-Stokes Equation on a Half-Space II. Construction of Navier-Stokes Solution. Commun. Math. Phys. 192, 463–491 (1998)
13. Swann, H.: The convergence with vanishing viscosity of non-stationary Navier-Stokes flow to ideal flow in R^3. Trans. AMS 157, 373–397 (1971)
14. E, W. and Engquist, B.: Blowup of solutions of the unsteady Prandtl's equation. Comm. Pure and Appl. Math. 50, 1287–1293 (1997)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 192, 463–491 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

Zero Viscosity Limit for Analytic Solutions of the Navier-Stokes Equation on a Half-Space. II. Construction of the Navier-Stokes Solution

Marco Sammartino¹,⋆, Russel E. Caflisch²,⋆⋆

¹ Dipartimento di Matematica, University of Palermo, Via Archirafi 34, 90123 Palermo, Italy. E-mail: [email protected]
² Mathematics Department, UCLA, Los Angeles, CA 90096-1555, USA. E-mail: [email protected]

Received: 5 September 1996 / Accepted: 14 July 1997

Abstract: This is the second of two papers on the zero-viscosity limit for the incompressible Navier-Stokes equations in a half-space in either 2D or 3D. Under the assumption of analytic initial data, we construct solutions of Navier-Stokes for a short time which is independent of the viscosity. The Navier-Stokes solution is constructed through a composite asymptotic expansion involving the solutions of the Euler and Prandtl equations, which were constructed in the first paper, plus an error term. This shows that the Navier-Stokes solution goes to an Euler solution outside a boundary layer and to a solution of the Prandtl equations within the boundary layer. The error term is written as a sum of first order Euler and Prandtl corrections plus a further error term. The equation for the error term is weakly nonlinear; its linear part is the time dependent Stokes equation. This error equation is solved by inversion of the Stokes equation, through expressing the solution as a regular (Euler-like) part plus a boundary layer (Prandtl-like) part. The main technical tool in this analysis is the Abstract Cauchy-Kowalewski Theorem.

1. Introduction

This is the second of two papers on the zero viscosity limit of the incompressible Navier-Stokes equations in a half-space with analytic initial data, and in either two or three spatial dimensions. Under the analyticity restriction and for small viscosity, we prove that the Navier-Stokes equations have a solution for a short time (independent of the viscosity). In the zero-viscosity limit, we show that this Navier-Stokes solution goes to an Euler solution outside a boundary layer and to a solution of the Prandtl equations within the boundary layer. As argued in the Introduction of Part I [6], we believe that the imposition of analyticity is needed to make this problem well-posed, by preventing boundary layer separation, but there is no proof of this.

⋆ Research supported in part by the NSF under grant #DMS-9306720.
⋆⋆ Research supported in part by the ARPA under URI grant number #N00014092-J-1890.


In the first paper [6], we proved short time existence of solutions for the Euler equations and the Prandtl equations with analytic initial data. In this second paper, we construct the Navier-Stokes solution as a sum of the Euler solution, the Prandtl solution and an error term. Existence and bounds of size $\varepsilon$ (the square root of the viscosity) for the error term are the main results of this paper. The error equation is weakly nonlinear, since its solution is small. Its linear part is exactly the time-dependent Stokes equation, with forcing terms and with boundary and initial data. As for the solution of the Euler equations in [6], the incompressibility of the solution is ensured by use of the projection method in order to avoid dealing directly with the pressure.

The main technical tool here is the Abstract Cauchy-Kowalewski (ACK) Theorem, which is invoked to establish existence for the error equation. As discussed in the Introduction to Part I, the abstract version of this theorem applies to dissipative equations, even though the classical version does not. A discussion of related references from the literature is presented in the Introduction to Part I.

In Sect. 2 we state the Navier-Stokes equations and discuss how the Euler equations and Prandtl equations, in the limit of small viscosity, can be formally derived from Navier-Stokes through different scalings and asymptotic expansions. The introduction of two different scalings, typical in singular perturbation theory, is formally necessary to describe two different regimes of the flow: the inviscid regime (far away from the boundary) and the viscous regime (close to the boundary) where the viscous forces cannot be neglected even for small viscosity. The meaning of Theorem 1, which is the main result of this paper, is to rigorously establish this formal result; i.e. to show that the Euler and Prandtl equations are each a good approximation of the Navier-Stokes equations in their respective domains of validity. In particular, the solution of the full Navier-Stokes equations is divided into Euler, Prandtl and error terms, and the error term is further divided into first order Euler, first order Prandtl and a higher order correction.

Section 3 contains an analysis of the time-dependent Stokes equations with prescribed boundary data. For this linear problem, which we shall solve explicitly, we also show that the solution is the superposition of an inviscid part, a boundary layer part, and a small correction. Section 4 contains the decomposition of the error equation Eqs. (4.1)–(4.4) into first order Euler and Prandtl equations, which are solved in Sections 5 and 6. The analysis of the equations for the remaining error takes all of Sect. 7. These "Navier-Stokes error equations" contain terms of size $O(\varepsilon^{-1})$ due to the generation of vorticity at the boundary. They are solved using what we call the "Navier-Stokes operator," which solves Stokes equations with a forcing term (see Eqs. (7.22)–(7.25)). It is suitable for solving the error equation (and thus the original Navier-Stokes equations) with an iterative procedure. With the bounds on this operator, and with the use of the abstract version of the Cauchy-Kowalewski Theorem, we can prove existence, uniqueness and boundedness (in a suitable norm) for the error. Final conclusions are stated in Sect. 8.

The function spaces that are used in this paper are all defined in Part I. For convenience, tables of function spaces and operators are presented there. As in Part I, the exposition is presented for the two-dimensional problem, but the results are all expressed for 3D as well as 2D.

2. Navier-Stokes Equations

2.1. A singular perturbation problem. The Navier-Stokes equations on the half plane for a velocity field $\boldsymbol{u}^{NS} = (u^{NS}, v^{NS})$ are

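\begin{align}
(\partial_t - \nu\Delta)\,\boldsymbol{u}^{NS} + \boldsymbol{u}^{NS}\cdot\nabla\boldsymbol{u}^{NS} + \nabla p^{NS} &= 0, \tag{2.1}\\
\nabla\cdot\boldsymbol{u}^{NS} &= 0, \tag{2.2}\\
\gamma\boldsymbol{u}^{NS} &= 0, \tag{2.3}\\
\boldsymbol{u}^{NS}(t=0) &= \boldsymbol{u}^{NS}_0. \tag{2.4}
\end{align}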

Here, $\nu = \varepsilon^2$ is the viscosity coefficient, and $\gamma$ is the trace operator, i.e. $\gamma f(x,t) = f(x, y=0, t)$. The initial velocity $\boldsymbol{u}^{NS}_0(x,y)$ must satisfy the incompressibility condition and the compatibility condition with the BC Eq. (2.3):

\begin{align}
\nabla\cdot\boldsymbol{u}^{NS}_0 &= 0, \tag{2.5}\\
\gamma\boldsymbol{u}^{NS}_0 &= 0. \tag{2.6}
\end{align}

In this paper we are interested in the behavior of the solution of N-S equations in the limit of small viscosity $\nu \ll 1$. As usual in perturbation theory, it is natural to write the solution as an asymptotic series of the form

$$
\boldsymbol{u}^{NS} = \boldsymbol{u}^0 + \varepsilon\boldsymbol{u}^1 + \varepsilon^2\boldsymbol{u}^2 + \dots, \tag{2.7}
$$

where all the terms $\boldsymbol{u}^i$ satisfy equations that are independent of $\varepsilon$ (the reason for expanding in $\varepsilon = \sqrt{\nu}$ comes from the boundary layer expansion, which is described below). The equation for the leading order term $\boldsymbol{u}^0$ comes from just neglecting the viscous term in the Navier-Stokes equations, which yields the Euler equations

\begin{align}
\partial_t\boldsymbol{u}^E + \boldsymbol{u}^E\cdot\nabla\boldsymbol{u}^E + \nabla p^E &= 0, \tag{2.8}\\
\nabla\cdot\boldsymbol{u}^E &= 0, \tag{2.9}\\
\gamma_n\boldsymbol{u}^E = v^E(x, y=0, t) &= 0, \tag{2.10}\\
\boldsymbol{u}^E(x, y, t=0) &= \boldsymbol{u}^E_0(x,y). \tag{2.11}
\end{align}
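The step from the expansion (2.7) to the Euler equations can be spelled out by substituting (2.7) into (2.1) with $\nu = \varepsilon^2$ and collecting powers of $\varepsilon$. The sketch below assumes a matching pressure expansion $p^{NS} = p^0 + \varepsilon p^1 + \varepsilon^2 p^2 + \dots$, which is not written out in the text and is introduced here only for illustration.

```latex
\begin{align*}
O(1):\quad & \partial_t\boldsymbol{u}^0 + \boldsymbol{u}^0\cdot\nabla\boldsymbol{u}^0 + \nabla p^0 = 0,\\
O(\varepsilon):\quad & \partial_t\boldsymbol{u}^1 + \boldsymbol{u}^0\cdot\nabla\boldsymbol{u}^1
      + \boldsymbol{u}^1\cdot\nabla\boldsymbol{u}^0 + \nabla p^1 = 0,\\
O(\varepsilon^2):\quad & \partial_t\boldsymbol{u}^2 + \boldsymbol{u}^0\cdot\nabla\boldsymbol{u}^2
      + \boldsymbol{u}^1\cdot\nabla\boldsymbol{u}^1 + \boldsymbol{u}^2\cdot\nabla\boldsymbol{u}^0
      + \nabla p^2 = \Delta\boldsymbol{u}^0 .
\end{align*}
```

The viscous term first enters at order $\varepsilon^2$, so the leading-order term obeys an inviscid equation, and the order-$\varepsilon$ equation is a linearization of Euler about $\boldsymbol{u}^0$.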

This procedure works well, at least for short times far away from the boundary, but gives unsatisfactory answers close to the boundary. Comparison of the boundary conditions Eqs. (2.3) and (2.10) for the Navier-Stokes and Euler equations, respectively, shows the cause of the failure. For Euler equations we can only impose zero normal velocity, since the equations are first order; while for Navier-Stokes the no-slip condition requires both normal and tangential velocities to vanish. We must therefore allow a region in the vicinity of the boundary where viscous forces are comparable to inertial forces, and where there is an adjustment of the tangential velocity from zero at the boundary to the value predicted by the Euler equations. This boundary layer should have size $\varepsilon = \sqrt{\nu}$, so that the viscous term $\nu u_{yy}$ is of size $O(1)$. Thus it is natural to write all quantities in terms of a rescaled normal variable $Y = y/\varepsilon$. Next, the incompressibility condition requires that $v_y = \varepsilon^{-1}v_Y = O(1)$, which requires the vertical velocity $v$ to be size $O(\varepsilon)$. Imposing this scaling in the Navier-Stokes equations, and again neglecting terms which are first order in $\varepsilon$, one gets Prandtl's equations for the fluid velocity $\boldsymbol{u}^P(x,Y,t) = (u^P, \varepsilon v^P)$ in the vicinity of the boundary; i.e.

\begin{align}
(\partial_t - \partial_{YY})\,u^P + u^P\partial_xu^P + v^P\partial_Yu^P + \partial_xp^P &= 0, \tag{2.12}\\
\partial_Yp^P &= 0, \tag{2.13}\\
\partial_xu^P + \partial_Yv^P &= 0, \tag{2.14}\\
\gamma u^P = \gamma v^P &= 0, \tag{2.15}\\
u^P(x, Y\to\infty) &\longrightarrow \gamma u^E, \tag{2.16}\\
u^P(x, Y, t=0) &= u^P_0(x,Y). \tag{2.17}
\end{align}
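The scaling argument that leads from the Navier-Stokes equations to (2.12)–(2.14) can be summarized in a few lines. This is a sketch of the bookkeeping only, using $y = \varepsilon Y$, $\nu = \varepsilon^2$ and the ansatz $(u, v) = (u^P, \varepsilon v^P)$ stated above.

```latex
\begin{align*}
\nu\,\partial_{yy}u &= \varepsilon^{2}\cdot\varepsilon^{-2}\,\partial_{YY}u^P = \partial_{YY}u^P = O(1),\\
\nu\,\partial_{xx}u &= \varepsilon^{2}\,\partial_{xx}u^P = O(\varepsilon^{2}),\\
v\,\partial_yu &= \varepsilon v^P\cdot\varepsilon^{-1}\partial_Yu^P = v^P\,\partial_Yu^P,\\
\partial_xu + \partial_yv &= \partial_xu^P + \varepsilon^{-1}\partial_Y(\varepsilon v^P)
      = \partial_xu^P + \partial_Yv^P .
\end{align*}
```

The normal diffusion and the normal convection both survive at leading order, while the tangential diffusion is dropped; this is why (2.12) contains $\partial_{YY}$ but no $\partial_{xx}$.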

Equation (2.16) is the matching condition between the inner (viscous) flow and the outer (inviscid) flow. This condition is equivalent to the existence of an intermediate region (e.g. a region where $y = O(\varepsilon^{\alpha})$ with $0 < \alpha < 1$), where there is a smooth transition between the viscous and inviscid regimes. As already noticed (see Subsect. 5.2 of [6]), it is natural to introduce the new variable $\tilde{\boldsymbol{u}}^P = (\tilde u^P, \varepsilon\tilde v^P)$ defined as

\begin{align}
\tilde u^P &= u^P - \gamma u^E, \tag{2.18}\\
\tilde v^P &= v^P + Y\partial_x\gamma u^E = -\int_0^Y dY'\,\partial_x\tilde u^P, \tag{2.19}
\end{align}

and write Prandtl equations in terms of $\tilde{\boldsymbol{u}}^P$ as

\begin{align}
(\partial_t - \partial_{YY})\,\tilde u^P + \tilde u^P\partial_x\gamma u^E + \gamma u^E\,\partial_x\tilde u^P + \tilde u^P\partial_x\tilde u^P + \big(\tilde v^P - Y\partial_x\gamma u^E\big)\partial_Y\tilde u^P &= 0, \tag{2.20}\\
\gamma\tilde u^P &= -\gamma u^E, \tag{2.21}\\
\tilde u^P(x, Y\to\infty) &\longrightarrow 0, \tag{2.22}\\
\tilde u^P(x, Y, t=0) = u^P_0(x,Y) - \gamma u^E_0 &= \tilde u^P_0. \tag{2.23}
\end{align}

We also define the normal velocity $v^P$ to be the velocity $\tilde v^P$ minus its value at infinity; i.e.

$$
v^P(Y) = \tilde v^P(Y) - \tilde v^P(Y=\infty) = \int_Y^{\infty} dY'\,\partial_x\tilde u^P. \tag{2.24}
$$
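The two integral identities in (2.19) and (2.24) follow from the incompressibility condition (2.14), the boundary condition $\gamma v^P = 0$ in (2.15), and the fact that $\gamma u^E$ does not depend on $Y$. A short worked check, in which $v^P$ on the first two lines denotes the original Prandtl normal velocity of (2.12)–(2.17):

```latex
\begin{align*}
v^P(Y) &= -\int_0^Y dY'\,\partial_xu^P
        = -\int_0^Y dY'\,\partial_x\tilde u^P - Y\,\partial_x\gamma u^E,\\
\tilde v^P(Y) &= v^P(Y) + Y\,\partial_x\gamma u^E = -\int_0^Y dY'\,\partial_x\tilde u^P,\\
\tilde v^P(Y) - \tilde v^P(\infty)
       &= -\int_0^Y dY'\,\partial_x\tilde u^P + \int_0^{\infty} dY'\,\partial_x\tilde u^P
        = \int_Y^{\infty} dY'\,\partial_x\tilde u^P .
\end{align*}
```

The last expression decays as $Y \to \infty$ whenever $\partial_x\tilde u^P$ is integrable in $Y$, which is the reason for subtracting the value at infinity.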

In [6] we have proved that, under suitable hypotheses on the initial conditions, i.e. analyticity, incompressibility and compatibility with boundary conditions, both the Euler and Prandtl equations admit a unique solution in the appropriate space of analytic functions (see Theorems 4.1 and 5.1 in [6]). To be more specific, we found the existence and the uniqueness of an analytic solution for Euler equations which is $L^2$ in both the $x$ and $y$ variable. For Prandtl, on the other hand, we proved existence and uniqueness for a solution $\tilde u^P$ which is $L^2$ in the $x$ variable, and exponentially decaying in the $Y$ variable (i.e. outside the boundary layer); the normal component $\tilde v^P$ of the velocity is $O(\varepsilon)$, but not decaying in $Y$, and in fact goes to a constant outside the boundary layer.

At this point, a natural question is whether one can use the solutions of the Euler and Prandtl equations to build a zeroth order approximation to the solution of Navier-Stokes equations. The following theorem, which is the main result of this paper, gives a positive answer to this question:

Theorem 1 (Informal Statement). Suppose that $\boldsymbol{u}^E(x,y,t)$ and $\boldsymbol{u}^P(x,Y,t)$ are solutions of the Euler and Prandtl equations, respectively, which are analytic in the spatial variables $x, y, Y$. Then for a short time $T$, independent of $\varepsilon$, there is a solution $\boldsymbol{u}^{NS}(x,y,t)$ of the Navier-Stokes equations with

$$
\boldsymbol{u}^{NS} =
\begin{cases}
\boldsymbol{u}^E + O(\varepsilon) & \text{outside boundary layer,}\\
\boldsymbol{u}^P + O(\varepsilon) & \text{inside boundary layer.}
\end{cases}
\tag{2.25}
$$

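A formal version of this result, with a complete specification of the possible initial data for the Navier-Stokes solution is given in the following theorem:

Theorem 1. Suppose the initial condition for the Navier-Stokes equations is given in the following form

$$
\boldsymbol{u}^{NS}_0 = \boldsymbol{u}^E_0(x,y) + \boldsymbol{u}^P_0(x,Y) + \varepsilon\,\big[\boldsymbol{\omega}_0(x,y) + \boldsymbol{\Omega}_0(x,Y) + \boldsymbol{e}_0(x,Y)\big], \tag{2.26}
$$

where

(i) $\boldsymbol{u}^E_0 = (u^E_0, v^E_0) \in H^{l,\rho,\theta}$ and $\nabla\cdot\boldsymbol{u}^E_0 = 0$, $\gamma_n\boldsymbol{u}^E_0 = 0$,

(ii) $\boldsymbol{u}^P_0 = (\tilde u^P_0, \varepsilon v^P_0) \in K^{l,\rho,\theta,\mu}$ and $v^P_0 = \int_Y^{\infty} dY'\,\partial_x\tilde u^P_0$, $\gamma\tilde u^P_0 = -\gamma u^E_0$,

(iii) $\boldsymbol{\omega}_0 = (\omega^1_0, \omega^2_0) \in N^{l,\rho,\theta}$, $\nabla\cdot\boldsymbol{\omega}_0 = 0$, $\gamma\omega^2_0 = -\gamma v^P_0$,

(iv) $\boldsymbol{\Omega}_0 = (\Omega^1_0, \varepsilon\Omega^2_0) \in K^{l,\rho,\theta,\mu}$ and $\Omega^2_0 = \int_Y^{\infty} dY'\,\partial_x\Omega^1_0$, $\gamma\Omega^1_0 = -\gamma\omega^1_0$,

(v) $\boldsymbol{e}_0 = (e^1_0, e^2_0) \in L^{l,\rho,\theta}$ and $\nabla\cdot\boldsymbol{e}_0 = 0$, $\gamma\boldsymbol{e}_0 = (0, -\gamma\Omega^2_0)$,

with $l \ge 6$. Then there exist $\bar\rho < \rho$, $\bar\theta < \theta$, $\bar\mu < \mu$, $\beta > 0$, and $T > 0$, all independent of $\varepsilon$, such that the solution of the Navier-Stokes equations can be written in the form

$$
\boldsymbol{u}^{NS} = \boldsymbol{u}^E(x,y,t) + \boldsymbol{u}^P(x,Y,t) + \varepsilon\,\big[\boldsymbol{\omega}(x,y,t) + \boldsymbol{\Omega}(x,Y,t) + \boldsymbol{e}(x,Y,t)\big] \tag{2.27}
$$

in which

(i) $\boldsymbol{u}^E \in H^{l,\bar\rho,\bar\theta}_{\beta,T}$ is the solution of the Euler equations (2.8)–(2.11),

(ii) $\boldsymbol{u}^P = (\tilde u^P, \varepsilon v^P) \in K^{l,\bar\rho,\bar\theta,\bar\mu}_{\beta,T}$ is the modified Prandtl solution as defined in (2.18) and (2.24), exponentially decaying outside the boundary layer,

(iii) $\boldsymbol{\omega} \in N^{l,\bar\rho,\bar\theta}_{\beta,T}$ is the first order correction to the inviscid flow; it solves Eqs. (4.7)–(4.10) below,

(iv) $\boldsymbol{\Omega} \in K^{l,\bar\rho,\bar\theta,\bar\mu}_{\beta,T}$ is the first order correction to the boundary layer flow; it solves Eqs. (4.11)–(4.14) below,

(v) $\boldsymbol{e} \in L^{l,\bar\rho,\bar\theta}_{\beta,T}$ is an overall correction; it solves Eqs. (4.15)–(4.18) below.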

The norms of $\boldsymbol{\omega}$, $\boldsymbol{\Omega}$ and $\boldsymbol{e}$ in the above spaces are bounded by a constant that does not depend on the viscosity.

2.2. Discussion of the Theorem. Since $\boldsymbol{u}^P$ is exponentially decaying for large $Y = y/\varepsilon$, the expression (2.27) shows that $\boldsymbol{u}^{NS} = \boldsymbol{u}^E + O(\varepsilon)$ for $y$ outside of the boundary layer (i.e. $y \gg \varepsilon$). For $y$ inside the boundary layer (i.e. $y \le \varepsilon$), $\boldsymbol{u}^E = (\gamma u^E, 0) + O(\varepsilon)$, so that $\boldsymbol{u}^{NS} = \boldsymbol{u}^P + O(\varepsilon)$. This shows that the informal statement of the theorem follows from the rigorous statement.

In this theorem the Navier-Stokes solution is represented in terms of a composite expansion of the form (2.27), which includes a regular (Euler) term $\boldsymbol{u}^E$, a boundary layer term $\boldsymbol{u}^P$ and a correction term. Since the Euler solution has non-zero boundary values, the Prandtl solution must be modified so that the sum of the two is zero at the boundary and approaches the Euler solution at the outer edge of the boundary layer. The theorem says that if the initial condition is a function which is $L^2$ in the transversal and normal components (together with its derivatives up to order $l$), then the solution of the Navier-Stokes equations will have the composite expansion form given in Eq. (2.27), at least for a short time.

There are several other ways to represent the Navier-Stokes solution for small viscosity. The most common method in perturbation theory [3] is to write the solution as a matched asymptotic expansion in which

\begin{align}
\boldsymbol{u}^{NS} &= \boldsymbol{u}^P + O(\varepsilon) \quad \text{for $y$ small enough,} \tag{2.28}\\
\boldsymbol{u}^{NS} &= \boldsymbol{u}^E + O(\varepsilon) \quad \text{for $y$ not too small.} \tag{2.29}
\end{align}

The formal validity of this representation is usually demonstrated by showing that the $O(\varepsilon)$ terms are small, and that there is a region of overlap for the validity of the two expansions. While this representation is more easily understood than the composite expansion, it is much more difficult to rigorously analyze due to the two spatial regimes.

A second method for representing the solution, which has been used for example in [4, 8], is to introduce a cut off function $m = m(y/\varepsilon^{\alpha})$ with $m(0) = 1$, $m(\infty) = 0$, and $0 < \alpha < 1$. The solution is then written as

$$
\boldsymbol{u}^{NS} = m\,\boldsymbol{u}^P + (1-m)\,\boldsymbol{u}^E + O(\varepsilon^{\alpha}). \tag{2.30}
$$

This method has two difficulties: It introduces an artificial length scale $\varepsilon^{\alpha}$ which makes the error terms artificially large. It also requires error terms in the incompressibility equation, since $m\,\boldsymbol{u}^P + (1-m)\,\boldsymbol{u}^E$ is not divergence-free. For these reasons we have found the composite expansion method to be the most convenient for analysis.

The rest of this paper is devoted to proving Theorem 1. Unless otherwise stated, $l \ge 6$ throughout.

2.3. The error equation. If we pose

\begin{equation}
\begin{aligned}
u^{NS} &= u^E + \tilde u^P + \varepsilon w^1,\\
v^{NS} &= v^E + \varepsilon\int_Y^{\infty} dY'\,\partial_x\tilde u^P + \varepsilon w^2 = v^E + \varepsilon v^P + \varepsilon w^2,\\
p^{NS} &= p^E + \varepsilon p^w,
\end{aligned}
\tag{2.31}
\end{equation}

and use these expressions in the N-S equations, we get the following equation for the error $\boldsymbol{w} = (w^1, w^2)$:

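\begin{align}
\big(\partial_t - \varepsilon^2\Delta\big)\boldsymbol{w} + \boldsymbol{w}\cdot\nabla\boldsymbol{u}^0 + \boldsymbol{u}^0\cdot\nabla\boldsymbol{w} + \varepsilon\,\boldsymbol{w}\cdot\nabla\boldsymbol{w} + \nabla p^w &= \boldsymbol{f} + \big(g\cdot\partial_y\tilde u^P,\ 0\big), \tag{2.32}\\
\nabla\cdot\boldsymbol{w} &= 0, \tag{2.33}\\
\gamma\boldsymbol{w} &= (0, g), \tag{2.34}\\
\boldsymbol{w}(t=0) &= \boldsymbol{\omega}_0 + \boldsymbol{\Omega}_0 + \boldsymbol{e}_0, \tag{2.35}
\end{align}

in which $\boldsymbol{u}^0 = (u^0, v^0)$ is defined by

\begin{equation}
\begin{aligned}
u^0 &= u^E + \tilde u^P,\\
v^0 &= v^E + \varepsilon v^P = v^E + \varepsilon\int_Y^{\infty} dY'\,\partial_x\tilde u^P.
\end{aligned}
\tag{2.36}
\end{equation}

The forcing term is $\boldsymbol{f} = (f^1, f^2)$ given by

\begin{align}
f^1 &= -\varepsilon^{-1}\Big[\tilde u^P\big(\partial_xu^E - \partial_x\gamma u^E\big) + (\partial_x\tilde u^P)\big(u^E - \gamma u^E\big) + (\partial_y\tilde u^P)\big(v^E + y\,\partial_x\gamma u^E\big)\Big] \notag\\
&\qquad - v^P\partial_yu^E + \varepsilon\Delta u^E + \varepsilon\,\partial_x^2\tilde u^P, \tag{2.37}\\
f^2 &= -\big(\partial_tv^P + u^0\partial_xv^P + v^0\partial_yv^P + v^P\partial_yv^E\big) - \varepsilon^{-1}\tilde u^P\partial_xv^E + \varepsilon\Delta v^0, \tag{2.38}
\end{align}

and also

$$
g = -\int_0^{\infty} dY'\,\partial_x\tilde u^P. \tag{2.39}
$$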

We want to show that the forcing term $\boldsymbol{f}$ is in $L^{l-2,\rho_1,\theta_1}_{\beta_1,T}$, and that in this space it has $O(1)$ norm, namely that

$$
|\boldsymbol{f}|_{l-2,\rho_1,\theta_1,\beta_1,T} \le c\,\Big(|\boldsymbol{u}^E_0|_{l,\rho,\theta} + |\tilde{\boldsymbol{u}}^P_0|_{l,\rho,\theta,\mu} + 1\Big)^2, \tag{2.40}
$$

where the constant $c$ does not depend on $\varepsilon$. Let us consider $f^1$. From Theorems 4.1 and 5.1 of Part I [6], it is clear that the terms $\varepsilon\Delta u^E$ and $\varepsilon\,\partial_x^2\tilde u^P$ satisfy the estimate (2.40). Each of the remaining terms in $f^1$ has a similar form: They are each $\varepsilon^{-1}$ times the product of a function which is exponentially decaying (with respect to $Y = y/\varepsilon$) outside the boundary layer (terms containing $\tilde u^P$ and $v^P$), and a function that is $O(\varepsilon)$ inside the boundary layer (e.g. $u^E - \gamma u^E$). It follows that they all satisfy (2.40). In an analogous way one can see that $f^2$ is $O(1)$ and satisfies the estimate (2.40).

Thus Eqs. (2.32)–(2.35) for the error term $\boldsymbol{w}(x,Y,t)$ have bounded forcing terms. In Sects. 4–7 we shall prove that this system admits a solution $\boldsymbol{w}$ which can be represented in the following form:

$$
\boldsymbol{w} = \boldsymbol{\omega} + \boldsymbol{\Omega} + \boldsymbol{e}, \tag{2.41}
$$

where the norms (in the appropriate function spaces) of $\boldsymbol{\omega}$, $\boldsymbol{\Omega}$ and $\boldsymbol{e}$ remain bounded by a constant independent of $\varepsilon$. The difficulty of this proof is the presence in Eq. (2.32) of terms like $\partial_y\tilde u^P$, which are $O(\varepsilon^{-1})$ inside the boundary layer.

3. The Boundary Layer Analysis for Stokes Equations

Before addressing the problem of solving Eqs. (2.32)–(2.35), it is useful to consider a somewhat simpler problem, the Stokes equations with zero initial condition and boundary data $\boldsymbol{g}$. This problem is of intrinsic interest, and the results will be used in the analysis of the Navier-Stokes equations. The time-dependent Stokes equations are

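\begin{align}
(\partial_t - \nu\Delta)\,\boldsymbol{u}^S + \nabla p^S &= 0, \tag{3.1}\\
\nabla\cdot\boldsymbol{u}^S &= 0, \tag{3.2}\\
\gamma\boldsymbol{u}^S &= \boldsymbol{g}(x,t), \tag{3.3}\\
\boldsymbol{u}^S(x, y, t=0) &= 0. \tag{3.4}
\end{align}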

Here $\boldsymbol{g}$ is a vectorial function $\boldsymbol{g} = (g', g_n)$. Primed quantities denote the tangential components of a vector, while the subscript $n$ denotes the normal component. The compatibility condition $\boldsymbol{g}(x, t=0) = 0$ is required for the boundary data. In this section we shall show that the solution of the above problem has a structure similar to that for the Navier-Stokes solution Eq. (2.27); i.e. it is the superposition of an inviscid (Euler) part, a boundary layer (Prandtl) part which exponentially decays to zero outside a region of size $\varepsilon = \sqrt{\nu}$, and a correction term which is size $O(\varepsilon)$ everywhere. The Stokes problem Eqs. (3.1)–(3.4) has already been addressed by Ukai in [7] (where even the case of non-zero initial data was considered), without making the distinction between inviscid part, boundary layer part and correction term.

We seek a solution of the form

\begin{equation}
\begin{aligned}
u^S &= u^E + \tilde u^P + w^1,\\
v^S &= v^E + \varepsilon v^P + w^2,\\
p^S &= p^E + p^w,
\end{aligned}
\tag{3.5}
\end{equation}

so that $(u^E, v^E)$ represents an inviscid solution, $(\tilde u^P, v^P)$ is a boundary layer solution decaying (in both components) outside the boundary layer, $(w^1, w^2)$ is a small correction, and the pressures $p^E$ and $p^w$ are bounded at infinity. Please note that in this section $\boldsymbol{u}^E$, $\tilde{\boldsymbol{u}}^P$ and $\boldsymbol{w}$ refer to the "Euler", "Prandtl" and correction components of the Stokes solution; everywhere else in the paper, this notation is used for the usual Euler and Prandtl solutions and for the correction in the Navier-Stokes solution. These quantities solve the following equations:

\begin{align}
\partial_t\boldsymbol{u}^E + \nabla p^E &= 0, \tag{3.6}\\
\nabla\cdot\boldsymbol{u}^E &= 0, \tag{3.7}\\
\gamma_n\boldsymbol{u}^E &= g_n, \tag{3.8}\\
\boldsymbol{u}^E(x, y, t=0) &= 0,\tag{3.9}
\end{align}

\begin{align}
(\partial_t - \nu\Delta)\,\tilde u^P &= 0, \tag{3.10}\\
\partial_x\tilde u^P + \partial_Yv^P &= 0, \qquad v^P \to 0 \ \text{as}\ Y \to \infty, \tag{3.11}\\
\gamma\tilde u^P &= g' - \gamma u^E, \tag{3.12}\\
\tilde u^P(x, y, t=0) &= 0, \tag{3.13}
\end{align}

\begin{align}
(\partial_t - \nu\Delta)\,\boldsymbol{w} + \nabla p^w &= 0, \tag{3.14}\\
\nabla\cdot\boldsymbol{w} &= 0, \tag{3.15}\\
\gamma\boldsymbol{w} &= (0, -\varepsilon\gamma v^P), \tag{3.16}\\
\boldsymbol{w}(x, y, t=0) &= 0. \tag{3.17}
\end{align}

Note that Eq. (3.10)–Eq. (3.13) use the fast variable $Y = y/\varepsilon$ with $\nu = \varepsilon^2$, in terms of which $\nu\Delta = \varepsilon^2\partial_{xx} + \partial_{YY}$. Also, there is no term $\Delta\boldsymbol{u}^E$, since it is identically zero. We now solve these equations explicitly.


3.1. Convective equation. Take the divergence of (3.6) to obtain $\Delta p^E = 0$. Then apply $\Delta$ to (3.6) and use the initial condition $\boldsymbol{u}^E = 0$, to obtain

$$
\Delta\boldsymbol{u}^E = 0. \tag{3.18}
$$

Therefore the solution of Euler problem is

$$
\boldsymbol{u}^E = \nabla Ng_n, \tag{3.19}
$$

where the operator $N = -\dfrac{1}{|\xi'|}\exp(-|\xi'|y)$ solves the Laplace equation with Neumann boundary condition; i.e.

\begin{equation}
\begin{aligned}
\Delta Ng_n &= 0,\\
\gamma\partial_yNg_n &= g_n.
\end{aligned}
\tag{3.20}
\end{equation}

3.2. Boundary Layer Problem. To solve Eqs. (3.10)–(3.13) it is useful to introduce the operator $\tilde E_1$ acting on functions $f(x,t)$ defined on the boundary

$$
\tilde E_1f(x,Y,t) = 2\int_0^t ds\,\frac{Y}{t-s}\,\frac{\exp\!\big(-Y^2/4(t-s)\big)}{\big(4\pi(t-s)\big)^{1/2}} \int_{-\infty}^{\infty} dx'\,\frac{\exp\!\big(-(x-x')^2/4\varepsilon^2(t-s)\big)}{\big(4\pi\varepsilon^2(t-s)\big)^{1/2}}\,f(x',s). \tag{3.21}
$$

This operator solves the heat equation with boundary conditions $f$ and zero initial conditions

\begin{equation}
\begin{aligned}
\big(\partial_t - \varepsilon^2\partial_{xx} - \partial_{YY}\big)\tilde E_1f &= 0,\\
\gamma\tilde E_1f &= f,\\
\tilde E_1f(x,Y,t=0) &= 0.
\end{aligned}
\tag{3.22}
\end{equation}

Note that the operator $\tilde E_1$ differs from the operator $E_1$ (defined in Sect. 5.1 of Part I) by the fact that it involves an integration on the transversal component $x$ also. Define

$$
M\boldsymbol{g} = g' + N'g_n. \tag{3.23}
$$

The solution of the boundary layer equations is written as

$$
\tilde u^P = \tilde E_1M\boldsymbol{g}. \tag{3.24}
$$

Using the incompressibility condition and the limiting condition, the normal component is

$$
v^P = \int_Y^{\infty} dY'\,\partial_x\tilde u^P. \tag{3.25}
$$

3.3. The Correction Term. Here we shall use the Fourier transform variable with respect to $x$. As in Part I, the corresponding transform variable is denoted $\xi'$. As in Subsect. 3.1, $\Delta p^w = 0$. Since $p^w$ is bounded at $\infty$, then

$$
\big(\partial_y + |\xi'|\big)p^w = 0. \tag{3.26}
$$

Define $\tau = \big(\partial_y + |\xi'|\big)w^2$, so that Eqs. (3.14)–(3.16) imply


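\begin{align}
\big(\partial_t - \varepsilon^2\Delta\big)\tau &= 0, \tag{3.27}\\
\gamma\tau = \gamma\big(\partial_y + |\xi'|\big)w^2 = \gamma\big(-\nabla'w^1 + |\xi'|w^2\big) &= |\xi'|\alpha, \tag{3.28}
\end{align}

in which

$$
\alpha = -\varepsilon\int_0^{\infty} dY'\,\partial_x\tilde u^P. \tag{3.29}
$$

Since $\tau$ solves the heat equation with the above boundary condition, then

$$
\tau = |\xi'|\tilde E_1\alpha. \tag{3.30}
$$

From the definition of $\tau$, $w^2$ satisfies

$$
\partial_yw^2 + |\xi'|w^2 = |\xi'|\tilde E_1\alpha, \tag{3.31}
$$

which leads to

$$
w^2(x,Y,t) = e^{-|\xi'|y}\alpha + U\tilde E_1\alpha \tag{3.32}
$$

in which $U$ is defined as

$$
Uf(\xi',Y) = \varepsilon|\xi'|\int_0^Y e^{-\varepsilon|\xi'|(Y-Y')}f(\xi',Y')\,dY'. \tag{3.33}
$$

Notice that a similar operator occurs in Eq. (4.12) in [6]. Finally, the incompressibility condition implies that

$$
w^1 = -N'e^{-|\xi'|y}\alpha + N'(1-U)\tilde E_1\alpha. \tag{3.34}
$$

These above results can be summarized as follows: The solution of the Stokes problem Eqs. (3.1)–(3.4) is denoted by $S\boldsymbol{g}$, with

\begin{equation}
\begin{aligned}
\boldsymbol{u}^S = S\boldsymbol{g} &= S^E\boldsymbol{g} + S^P\boldsymbol{g} + S^C\boldsymbol{g}\\
&= \begin{pmatrix} -N'Dg_n\\ Dg_n \end{pmatrix}
+ \begin{pmatrix} \tilde E_1M\boldsymbol{g}\\ \varepsilon\int_Y^{\infty} dY'\,\partial_x\tilde E_1M\boldsymbol{g} \end{pmatrix}
+ \begin{pmatrix} -N'e^{-|\xi'|y} + N'(1-U)\tilde E_1\\ e^{-|\xi'|y} + U\tilde E_1 \end{pmatrix}\alpha.
\end{aligned}
\tag{3.35}
\end{equation}

After some manipulation, this can be simplified, as in [7], to

$$
\boldsymbol{u}^S = S\boldsymbol{g} = \begin{pmatrix} -N'e^{-|\xi'|y}g_n + N'(1-U)\tilde E_1V_1\boldsymbol{g}\\ e^{-|\xi'|y}g_n + U\tilde E_1V_1\boldsymbol{g} \end{pmatrix}, \tag{3.36}
$$

in which

$$
V_1\boldsymbol{g} = g_n - N'g'. \tag{3.37}
$$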
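As a consistency check on the correction term, one can verify directly that the expression (3.32) for $w^2$ solves (3.31) with boundary value $\alpha$. The computation below is a sketch that uses only the definition (3.33) of $U$, the relation $\partial_y = \varepsilon^{-1}\partial_Y$, and the fact that $\alpha$ does not depend on the normal variable.

```latex
\begin{align*}
\partial_Y(Uf) &= \varepsilon|\xi'|\,f - \varepsilon|\xi'|\,Uf
   \quad\Longrightarrow\quad
   \big(\partial_y + |\xi'|\big)Uf = |\xi'|\,f,\\
\big(\partial_y + |\xi'|\big)\,e^{-|\xi'|y}\alpha &= 0,\\
\big(\partial_y + |\xi'|\big)\big(e^{-|\xi'|y}\alpha + U\tilde E_1\alpha\big) &= |\xi'|\,\tilde E_1\alpha,
   \qquad
   \gamma\big(e^{-|\xi'|y}\alpha + U\tilde E_1\alpha\big) = \alpha .
\end{align*}
```

The boundary value follows because the integral in (3.33) vanishes at $Y = 0$, so the whole trace of $w^2$ is carried by the exponential term.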

3.4. Estimates. In this subsection we prove some basic simple estimates on the operators $S^E$, $S^P$, and $S^C$. Propositions 3.1, 3.2 and 3.3 are presented as results on the time-dependent Stokes equations, but are not used in the sequel. For analysis of the Navier-Stokes equations, only Proposition 3.4 and Lemma 3.2 will be used.

We cannot in general give an estimate for the operator $S^E$ in a space involving the $L^2$ norm in $y$. Nevertheless it is possible to give such an estimate for a special class of boundary data.

Proposition 3.1. Suppose that $\boldsymbol{g}$ satisfies

$$
g_n = |\xi'|\int_0^{\infty} dy'\,f(\xi',y',t)\,k(\xi',y') \tag{3.38}
$$

with $|\xi'|\int_0^{\infty} dy'\,|k(\xi',y')| \le 1$ and $f \in H^{l,\rho,\theta}_{\beta,T}$. Then $S^E\boldsymbol{g} \in H^{l,\rho,\theta}_{\beta,T}$, and the following estimate holds:

$$
|S^E\boldsymbol{g}|_{l,\rho,\theta,\beta,T} \le c\,|f|_{l,\rho,\theta,\beta,T}. \tag{3.39}
$$

Using Jensen's inequality and replacing a factor of $|\xi'|\int_0^{\infty} dy'\,|k(\xi',y')|$ by 1,

$$
\begin{aligned}
&\sup_{\theta'\le\theta}\int_{\Gamma(\theta')} dy \int d\xi'\,e^{2\rho|\xi'|}\,\bigg|\int_0^{\infty} dy'\,|\xi'|\,e^{-|\xi'|y}\,k(\xi',y')\,f(\xi',y')\bigg|^2 \\
&\qquad\le \sup_{\theta'\le\theta}\int_{\Gamma(\theta')} dy \int d\xi'\,e^{2\rho|\xi'|}\int_0^{\infty} dy'\,|\xi'|\,e^{-2|\xi'|y}\,\big|k(\xi',y')\big|\,\big|f(\xi',y')\big|^2 \\
&\qquad= \int d\xi'\,e^{2\rho|\xi'|}\int_0^{\infty} dy'\,\big|k(\xi',y')\big|\,\big|f(\xi',y')\big|^2 \\
&\qquad\le |f|^2_{0,\rho,\theta,\beta,T}.
\end{aligned}
\tag{3.40}
$$

Analogous bounds can be proved for the differentiated terms in the norm.

Now consider the "Prandtl" part. We first state an estimate for the operator $\tilde E_1$.

Lemma 3.1. Let $f \in K'^{\,l,\rho}_{\beta,T}$ with $f(t=0) = 0$. Then $\tilde E_1f \in L^{l,\rho,\theta}_{\beta,T}$ for some $\theta$, and the following estimate holds in $L^{l,\rho,\theta}_{\beta,T}$:

$$
|\tilde E_1f|_{l,\rho,\theta,\beta,T} \le c\,|f|_{l,\rho,\beta,T}. \tag{3.41}
$$

A much stronger estimate actually holds. One can in fact prove the exponential decay of $\tilde E_1f$ in the normal variable away from the boundary; see the proof given in the Appendix. Using Lemma 3.1, the following estimates on $S^{P\prime}$ and $S^P_n$ (respectively the transversal and the normal components of the operator $S^P$) are obvious:

Proposition 3.2. Suppose $\boldsymbol{g} \in K'^{\,l,\rho}_{\beta,T}$ with $\boldsymbol{g}(t=0) = 0$. Then $S^{P\prime}\boldsymbol{g} \in L^{l,\rho,\theta}_{\beta,T}$ and $S^P_n\boldsymbol{g} \in L^{l-1,\rho,\theta}_{\beta,T}$ for some $\theta$, and

\begin{align}
|S^{P\prime}\boldsymbol{g}|_{l,\rho,\theta,\beta,T} &\le c\,|\boldsymbol{g}|_{l,\rho,\beta,T}, \tag{3.42}\\
|S^P_n\boldsymbol{g}|_{l-1,\rho,\theta,\beta,T} &\le c\,|\boldsymbol{g}|_{l,\rho,\beta,T}. \tag{3.43}
\end{align}

Again a stronger estimate could be proved, namely that $S^P\boldsymbol{g}$ is exponentially decaying when $Y \to \infty$ (i.e. outside the boundary layer). The loss of one derivative in the normal component is due to the incompressibility condition (see e.g., Eq. (3.25)). The estimate on $S^C$ will be a consequence of the following bound on the operator $U$:

Lemma 3.2. Let $f \in L^{l,\rho,\theta}_{\beta,T}$. Then $Uf \in L^{l,\rho,\theta}_{\beta,T}$ and

$$
|Uf|_{l,\rho,\theta,\beta,T} \le c\,|f|_{l,\rho,\theta,\beta,T}. \tag{3.44}
$$


The proof of Lemma 3.2 is like the proof of Proposition 3.1, and is based on the fact that $Uf$ can be written as a derivative with respect to the normal variable. Lemma 3.2 leads to the following proposition for $S^C$:

Proposition 3.3. Suppose $\boldsymbol{g} \in K'^{\,l,\rho}_{\beta,T}$. Then $S^C\boldsymbol{g} \in L^{l-1,\rho,\theta}_{\beta,T}$ for some $\theta$, and

$$
|S^C\boldsymbol{g}|_{l-1,\rho,\theta,\beta,T} \le \varepsilon^{1/2}c\,|\boldsymbol{g}|_{l,\rho,\beta,T}. \tag{3.45}
$$

This estimate on the size of the error is not optimal. In fact, a more careful analysis of $S^C$ would reveal that the error term is made up of two parts: a Eulerian part (namely $e^{-|\xi'|y}\alpha$) depending on the unscaled variable $y$, which is of size $\varepsilon$ in $H^{l-1,\rho,\theta}_{\beta,T}$, and a part (namely $U\tilde E_1\alpha$) depending on the scaled variable $Y$, which is of size $\varepsilon$ in $L^{l-1,\rho,\theta}_{\beta,T}$. Something similar occurs in the analysis of the error for the Navier-Stokes equations (see Sect. 4); to prove that the error $\boldsymbol{w}$ is size $\varepsilon$ we shall break it up in several parts (see Eq. (4.6) below) and estimate them in the appropriate function spaces.

We now give an estimate on the Stokes operator $S$. Combine Lemmas 3.1 and 3.2 with the representation (3.36) to obtain the following bound on $S$:

Proposition 3.4. Suppose that $g' \in K'^{\,l,\rho}_{\beta,T}$, with $\boldsymbol{g}(t=0) = 0$ and $g_n = |\xi'|\int_0^{\infty} dy'\,f(\xi',y',t)\,k(\xi',y')$ with $|\xi'|\int_0^{\infty} dy'\,|k(\xi',y')| \le 1$ and $f \in L^{l,\rho,\theta}_{\beta,T}$. Then $S\boldsymbol{g} \in L^{l,\rho,\theta}_{\beta,T}$, and

$$
|S\boldsymbol{g}|_{l,\rho,\theta,\beta,T} \le c\,\big(|g'|_{l,\rho,\beta,T} + |f|_{l,\rho,\theta,\beta,T}\big). \tag{3.46}
$$

In addition, for each $t \le T$, $S\boldsymbol{g} \in K^{l,\rho',\theta'}$, and satisfies

$$
\sup_{0\le t\le T}|S\boldsymbol{g}|_{l,\rho',\theta'} \le c\,\big(|g'|_{l,\rho,\beta,T} + |f|_{l,\rho,\theta,\beta,T}\big) \tag{3.47}
$$

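in which $0 < \rho' < \rho - \beta T$ and $0 < \theta' < \theta - \beta t$. The proof of this proposition uses Jensen's inequality as in the proof of Proposition 3.1. Proposition 3.4 and Lemma 3.2 are the only results from this section that will be used in the rest of this paper.

4. The Error Equation

The equation for the error is

\begin{align}
\big(\partial_t - \varepsilon^2\Delta\big)\boldsymbol{w} + \boldsymbol{w}\cdot\nabla\boldsymbol{u}^0 + \boldsymbol{u}^0\cdot\nabla\boldsymbol{w} + \varepsilon\,\boldsymbol{w}\cdot\nabla\boldsymbol{w} + \nabla p^w &= \boldsymbol{f} + \big(g\cdot\partial_y\tilde u^P,\ 0\big), \tag{4.1}\\
\nabla\cdot\boldsymbol{w} &= 0, \tag{4.2}\\
\gamma\boldsymbol{w} &= (0, g), \tag{4.3}\\
\boldsymbol{w}(t=0) &= \boldsymbol{\omega}_0 + \boldsymbol{\Omega}_0 + \boldsymbol{e}_0, \tag{4.4}
\end{align}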

1 ,θ1 in which the forcing term f is in Ll−2,ρ , and is O(1) (see Eq. (2.40)). Notice that β1 ,T in Eqs. (4.1) and (4.2), and in the rest of this paper, the divergence and the gradient are taken with respect to the unscaled variable y; i.e. (4.5) ∇ = ∂x , ∂ y .

The rest of this paper is concerned with proving that equations (4.1)–(4.4) admit a unique solution, and that this solution is O(1). We shall prove the following Theorem:


Theorem 2. Suppose that $u^E \in H^{l,\rho,\theta}_{\beta,T}$, that $\tilde u^P \in K^{l,\rho,\theta,\mu}_{\beta,T}$, so that $f$ has norm in $L^{l-2,\rho,\theta}_{\beta,T}$ bounded by a constant independent of $\varepsilon$. Then there exist $\rho_2 < \rho$, $\theta_2 < \theta$, $\beta_2 > \beta$ and $\mu_2 > 0$ such that Eqs. (4.1)–(4.4) admit a solution which can be written in the form:

$$w = \omega + \Omega + e, \tag{4.6}$$

where
• $\omega \in N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ satisfies Eqs. (4.7)–(4.10);
• $\Omega \in K^{l-2,\rho_2,\theta_2,\mu_2}_{\beta_2,T}$ satisfies Eqs. (4.11)–(4.14), and
• $e \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ satisfies Eqs. (4.15)–(4.18).

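The quantity $\omega$ represents the first order correction to the Euler flow. It satisfies the following equations:

$$\begin{aligned}
\partial_t \omega + \omega\cdot\nabla u^E + u^E\cdot\nabla\omega + \nabla p^\omega &= 0, && (4.7)\\
\nabla\cdot\omega &= 0, && (4.8)\\
\gamma_n\omega &= g, && (4.9)\\
\omega(t=0) &= \omega_0. && (4.10)
\end{aligned}$$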

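In addition the initial data $\omega_0$ is required to satisfy the condition (iii) of Theorem 2.1. The quantity $\Omega = (\Omega^1, \Omega^2)$ represents the first order correction inside the boundary layer, with the convective terms omitted. It satisfies the following equations:

$$\begin{aligned}
(\partial_t - \partial_{YY})\,\Omega^1 &= 0, && (4.11)\\
\Omega^2 &= \varepsilon \int_Y^\infty dY'\, \partial_x \Omega^1, && (4.12)\\
\gamma\Omega^1 &= -\gamma\omega^1, && (4.13)\\
\Omega^1(t=0) &= \Omega^1_0. && (4.14)
\end{aligned}$$

The third part of the error, $e$, satisfies the following equations:

$$\begin{aligned}
(\partial_t - \varepsilon^2\Delta)\, e + e\cdot\nabla\left[u^0 + \varepsilon(\omega+\Omega)\right] + \left[u^0 + \varepsilon(\omega+\Omega)\right]\cdot\nabla e + \varepsilon\, e\cdot\nabla e + \nabla p^e &= \Xi, && (4.15)\\
\nabla\cdot e &= 0, && (4.16)\\
\gamma e &= \left(0, -\gamma\Omega^2\right), && (4.17)\\
e(t=0) &= e_0. && (4.18)
\end{aligned}$$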

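The forcing term $\Xi$ is given by:

$$\begin{aligned}
\Xi = {} & -\left[u^0\cdot\nabla\Omega + \omega\cdot\nabla\left(u^P + \varepsilon\Omega\right) + \Omega\cdot\nabla u^0 + \left(u^P + \varepsilon\Omega + \varepsilon\omega\right)\cdot\nabla\omega + \varepsilon\,\Omega\cdot\nabla\Omega\right] \\
& + \varepsilon^2\left[\Delta\omega + \left(\partial_{xx}\Omega^1, 0\right)\right] - \left(0, (\partial_t - \varepsilon^2\Delta)\Omega^2\right) + f + \left(g\cdot\partial_y\tilde u^P, 0\right).
\end{aligned} \tag{4.19}$$

The initial data $\Omega^1_0$ and $e_0$ are required to satisfy conditions (iv) and (v) of Theorem 2.1.

The reason for the complicated representation Eq. (4.6) for the error $w$ is the following: To solve Eqs. (4.1)–(4.4) one has to use the projection operator due to the incompressibility condition. The natural ambient space is therefore the space of functions which are $L^2$ in both transversal and normal components. In the right-hand side of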


Eq. (4.1), there are terms which are rapidly varying inside the boundary layer, and thus depend on the rescaled variable $Y$. So, in taking the $L^2$ norm with respect to the normal variable we are forced to use the variable $Y$ instead of $y$. The boundary condition (4.3), on the other hand, gives rise to terms which depend on the variable $y$. Their $L^2$ norm evaluated using the rescaled variable $Y$ would be $O(\varepsilon^{-1/2})$. To avoid such a catastrophic error, we use the decomposition (4.6): $\omega$, which is $L^2$ in $y$, takes care of the boundary condition (4.3) (see Eq. (4.9)); $e$, which is $L^2$ in $Y$, takes care of the rapidly varying forcing term; $\Omega$ cancels the transversal component of $\omega$ at the boundary (see Eq. (4.13)).

5. The Correction to the Euler Flow

In this section we shall prove the following theorem:

Theorem 3. Suppose that $g \in K'^{\,l-1,\rho}_{\beta,T}$. Then there exist $\rho_2 < \rho$, $\theta_2 < \theta$ and $\beta_2 > \beta$ such that Eqs. (4.7)–(4.10) admit a unique solution $\omega \in N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. The following estimate in $N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ holds:

$$|\omega|_{l-2,\rho_2,\theta_2,\beta_2,T} \le c\left(|u^E_0|_{l,\rho,\theta} + |\tilde u^P_0|_{l,\rho,\theta,\mu} + |\omega_0|_{l,\rho,\theta}\right), \tag{5.1}$$

where the norms of $u^E_0$, $\tilde u^P_0$ and $\omega_0$ are taken in $H^{l,\rho,\theta}$, $K^{l,\rho,\theta,\mu}$ and $N^{l,\rho,\theta}$ respectively.

The structure of Eqs. (4.7)–(4.10) is somewhat similar to the structure of the Euler equations, and the proof of the above theorem closely follows the proof of Theorem 4.1 in [6]. The functional setting here is slightly different; in fact Theorem 3 above is stated in the space $N^{l,\rho,\theta}_{\beta,T}$, where only the first derivative with respect to time is taken, instead of the space $H^{l,\rho,\theta}_{\beta,T}$, where time derivatives up to order $l$ are allowed. This is due to the presence of the boundary condition $g$ deriving from the Prandtl equations. We shall prove the above theorem using the ACK Theorem. The solution of Eqs. (4.7)–(4.10) can be written as

$$\omega = \omega_0 + \left(-N', 1\right) e^{-|\xi'|y}\,(g - g_0) + P_t\,\omega^*, \tag{5.2}$$

where the operator $P_t$ is the integrated (with respect to time) half space projection operator defined in Eq. (4.35) of [6]. The first term in this expression provides the correct initial data, the second term the correct boundary data, and the third term the correct forcing terms. The projection operator $P_t$ satisfies the following bounds in $N^{l,\rho,\theta}_{\beta,T}$:

Proposition 5.1. Let $u^* \in N^{l,\rho,\theta}_{\beta,T}$. Then $P_t u^* \in N^{l,\rho,\theta}_{\beta,T}$ and

$$|P_t u^*|_{l,\rho,\theta,\beta,T} \le c\,|u^*|_{l,\rho,\theta,\beta,T}. \tag{5.3}$$

Proposition 5.2. Let $u^* \in N^{l,\rho,\theta}_{\beta,T}$. Let $\rho' < \rho - \beta T$ and $\theta' < \theta - \beta T$. Then $P_t u^* \in N^{l,\rho',\theta'}$ for all $0 \le t \le T$, and

$$|P_t u^*|_{l,\rho',\theta'} \le c \int_0^t ds\, |u^*(\cdot,\cdot,s)|_{l,\rho',\theta'} \le c\,|u^*|_{l,\rho,\theta,\beta,T}. \tag{5.4}$$


Using Eq. (5.2) one sees that (4.7)–(4.10) are equivalent to the following equation for $\omega^*$:

$$\omega^* + H'(\omega^*, t) = 0, \tag{5.5}$$

where

$$\begin{aligned}
H'(\omega^*, t) = {} & \left[\omega_0 + (-N', 1)\, e^{-|\xi'|y}(g - g_0) + P_t\,\omega^*\right]\cdot\nabla u^E \\
& + u^E\cdot\nabla\left[\omega_0 + (-N', 1)\, e^{-|\xi'|y}(g - g_0) + P_t\,\omega^*\right].
\end{aligned} \tag{5.6}$$

Using the Cauchy estimate, and with the same procedure we used to prove existence and uniqueness for the Euler equations in [6], one can see that the operator $H'$ satisfies all the hypotheses of the ACK Theorem; therefore there exist $\rho_2 < \rho$, $\theta_2 < \theta$ and $\beta_2 > \beta$ such that Eq. (5.5) admits a unique solution $\omega^* \in N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. Equation (5.2) and Proposition 5.1 also imply $\omega \in N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. Theorem 3 is thus proved.

6. The Boundary Layer Correction

We prove the following theorem:

Theorem 4. Let $\omega$ be the solution of Eqs. (4.7)–(4.10) found in Theorem 5.1. Then there exist $\rho_2' > \rho_2$, $\theta_2' > \theta_2$, $\beta_2' > \beta_2$, and $\mu_2 > 0$ such that Eqs. (4.11)–(4.14) admit a unique solution $\Omega \in K^{l-2,\rho_2',\theta_2',\mu_2}_{\beta_2',T}$. It satisfies the following estimate in $K^{l-2,\rho_2',\theta_2',\mu_2}_{\beta_2',T}$:

$$|\Omega|_{l-2,\rho_2',\theta_2',\mu_2,\beta_2',T} \le c\left(|u^E_0|_{l,\rho,\theta} + |\tilde u^P_0|_{l,\rho,\theta,\mu} + |\omega_0|_{l,\rho,\theta} + |\Omega_0|_{l,\rho,\theta,\mu_2}\right), \tag{6.1}$$

where the norms of $u^E_0$, $\tilde u^P_0$, $\omega_0$ and $\Omega_0$ are taken in $H^{l,\rho,\theta}$, $K^{l,\rho,\theta,\mu}$, $N^{l,\rho,\theta}$ and $K^{l,\rho,\theta,\mu}$ respectively.

The proof of this theorem uses the following lemma:

Lemma 6.1. There exists $\rho_2' < \rho_2$ such that the boundary data $\gamma\omega^1$ is in $K'^{\,l-2,\rho_2'}_{\beta_2,T}$. The following estimate holds in $K'^{\,l-2,\rho_2'}_{\beta_2,T}$:

$$|\gamma\omega^1|_{l-2,\rho_2',\beta_2,T} \le c\,|\omega|_{l-2,\rho_2,\theta_2,\beta_2,T}. \tag{6.2}$$

The above lemma can be proved using a Sobolev estimate to bound the sup with respect to $y$ of $\omega^1$, and then a Cauchy estimate on the $x$ derivative to bound the term $\partial_y\partial_x^{l-2}\omega^1$. The solution of Eqs. (4.11)–(4.14) can be explicitly written as

$$\Omega^1 = E_0(t)\,\Omega^1_0 - E_1\gamma\omega^1 = E_0(t)\left(\Omega^1_0 + \gamma\omega^1_0\right) - E_1\gamma\left(\omega^1 - \omega^1_0\right) - \gamma\omega^1_0, \tag{6.3}$$

where the operators $E_0(t)$ and $E_1$ have been defined in [6]. Proposition 5.1 and Proposition 5.3 of [6] imply that $\Omega^1 \in K^{l-2,\rho_2',\theta_2',\mu_2}_{\beta_2',T}$. Using the expression (4.12) for $\Omega^2$ and again shrinking the domain of analyticity in $x$, and renaming $\rho_2'$, we obtain also $\Omega^2 \in K^{l-2,\rho_2',\theta_2',\mu_2}_{\beta_2',T}$. The proof of Theorem 4 is thus complete.

By a redefinition of $\rho_2$, $\theta_2$, $\beta_2$, we may take $\rho_2' = \rho_2$, $\theta_2' = \theta_2$, $\beta_2' = \beta_2$.


7. The Navier-Stokes Operator

In this section we shall prove the following theorem:

Theorem 5. Under the hypotheses of Theorem 2, there exist $\rho_2$, $\theta_2$, $\beta_2$ such that Eqs. (4.15)–(4.18) admit a unique solution $e \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. This solution satisfies the following estimate in $L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$:

$$|e|_{l-2,\rho_2,\theta_2,\beta_2,T} \le c\left(|u^E_0|_{l,\rho,\theta} + |\tilde u^P_0|_{l,\rho,\theta,\mu} + |\omega_0|_{l,\rho,\theta} + |\Omega_0|_{l,\rho,\theta,\mu} + |e_0|_{l,\rho,\theta}\right), \tag{7.1}$$

where the norms of $u^E_0$, $\tilde u^P_0$, $\omega_0$, $\Omega_0$ and $e_0$ are taken in $H^{l,\rho,\theta}$, $K^{l,\rho,\theta,\mu}$, $N^{l,\rho,\theta}$, $K^{l,\rho,\theta,\mu}$ and $L^{l,\rho,\theta}$ respectively.

We shall prove this theorem using the ACK Theorem. In the same way as for the Euler and Prandtl equations, we first invert the second order heat operator, taking into account the incompressibility condition and the BC and IC. This is performed using the heat operator, defined in Subsect. 7.1, which inverts $\partial_t - \partial_{YY} - \varepsilon^2\partial_{xx}$. Then in Subsect. 7.2 we insert the divergence-free projection and obtain the operator $N_0$. Using the Stokes operator from Sect. 3 to handle the boundary data, in Subsect. 7.3 we define the operator $N^*$, which is suitable for the iterative solution of the Navier-Stokes equations (i.e. treating initial data and nonlinearities as forcing terms). Bounds on this operator are given in Propositions 7.6 and 7.7. With the use of this Navier-Stokes operator, and taking into account initial and boundary data Eq. (4.17) and Eq. (4.18), in Subsect. 7.4 we finally solve the error equation. In Subsects. 7.5 and 7.6 we prove by the ACK Theorem that this iterative procedure converges to a unique solution.

7.1. The heat operator. We have already introduced the operator $\tilde E_1$ in (3.21) which solves the heat equation with boundary data. We now want to solve the heat equation with a source and with zero initial and boundary data on the half plane $Y \ge 0$; i.e.

$$\left(\partial_t - \varepsilon^2\partial_{xx} - \partial_{YY}\right) u = w(x, Y, t), \qquad u(x, Y, t=0) = 0, \qquad \gamma u = 0. \tag{7.2}$$

First introduce the heat kernel $\tilde E_0(x, Y, t)$, defined by

$$\tilde E_0(x, Y, t) = \frac{e^{-x^2/4t\varepsilon^2}}{\sqrt{4\pi t\varepsilon^2}}\, \frac{e^{-Y^2/4t}}{\sqrt{4\pi t}}. \tag{7.3}$$

We solve the problem (7.2) on the half plane with the following operator:

$$u(x, Y, t) = \tilde E_2 w = \int_0^t ds \int_0^\infty dY' \int_{-\infty}^\infty dx' \left[\tilde E_0(x - x', Y - Y', t - s) - \tilde E_0(x - x', Y + Y', t - s)\right] w(x', Y', s). \tag{7.4}$$
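The kernel in (7.4) is the usual method-of-images construction: the free-space Gaussian minus its reflection across $Y = 0$. The following is a minimal sketch, not code from the paper, that evaluates this kernel and illustrates why it enforces the zero boundary condition $\gamma u = 0$. The function names and the sample values of $\varepsilon$ and $t$ are hypothetical, chosen only for illustration.

```python
import math


def heat_kernel(x, Y, t, eps):
    """Free-space kernel of (7.3): a Gaussian in x (variance 2*eps^2*t)
    times a Gaussian in Y (variance 2*t)."""
    gauss_x = math.exp(-x * x / (4.0 * t * eps * eps)) / math.sqrt(4.0 * math.pi * t * eps * eps)
    gauss_y = math.exp(-Y * Y / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)
    return gauss_x * gauss_y


def half_plane_kernel(x, xp, Y, Yp, t, eps):
    """Kernel inside (7.4): direct source at (xp, Yp) minus its mirror image at (xp, -Yp)."""
    direct = heat_kernel(x - xp, Y - Yp, t, eps)
    image = heat_kernel(x - xp, Y + Yp, t, eps)
    return direct - image


if __name__ == "__main__":
    eps, t = 0.1, 0.5  # illustrative values, not taken from the paper
    # On the boundary Y = 0 the direct and image Gaussians coincide, so the kernel vanishes.
    print("kernel on the boundary:", half_plane_kernel(0.3, -0.1, 0.0, 0.7, t, eps))
    # For Y, Y' > 0 the direct term is larger than the image term, so the kernel is positive.
    print("kernel in the interior:", half_plane_kernel(0.3, 0.25, 0.4, 0.7, t, eps))
```

Because the kernel vanishes identically at $Y = 0$, any superposition of it against a source $w$ also vanishes there, which is the boundary condition required in (7.2).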

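We now state some estimates on this operator. In these estimates $w$ is defined for $Y \ge 0$.

Proposition 7.1. Let $w \in L^{l,\rho,\theta}_{\beta,T}$. Then $\tilde E_2 w \in L^{l,\rho,\theta}_{\beta,T}$ and

$$|\tilde E_2 w|_{l,\rho,\theta,\beta,T} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.5}$$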


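Proposition 7.2. Suppose $w \in L^{l,\rho,\theta}_{\beta,T}$ with $\gamma w = 0$ and that $\rho' \le \rho - \beta t$, $\theta' \le \theta - \beta t$. Then $\tilde E_2 w(t) \in L^{l,\rho',\theta'}$ and

$$|\tilde E_2 w|_{l,\rho',\theta'} \le c \int_0^t ds\, |w(\cdot,\cdot,s)|_{l,\rho',\theta'} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.6}$$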

The proofs of these two propositions are given in the Appendix.

7.2. The projected heat operator. In [6] we introduced the divergence-free projection operator $P^\infty$. Here we employ a similar operator with the normal variable rescaled by a factor $\varepsilon$. The projection operator in the $x$ and $Y$ variables, $\overline{P}^\infty$, is the pseudodifferential operator whose symbol is

$$\overline{P}^\infty = \frac{1}{\varepsilon^2\xi'^2 + \xi_n^2} \begin{pmatrix} \xi_n^2 & -\varepsilon\xi'\xi_n \\ -\varepsilon\xi'\xi_n & \varepsilon^2\xi'^2 \end{pmatrix}, \tag{7.7}$$

where $\xi'$ and $\xi_n$ denote the Fourier variables corresponding to $x$ and $Y$ respectively. For all $w$ this operator satisfies

$$\nabla\cdot\overline{P}^\infty w = \partial_x \overline{P}^{\infty\prime} w + \varepsilon\,\partial_Y \overline{P}^\infty_n w = 0. \tag{7.8}$$
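As a quick sanity check on the symbol (7.7), the sketch below verifies numerically that the 2×2 matrix is a rank-one projection: it is idempotent, has trace one, and annihilates the vector $(\varepsilon\xi', \xi_n)$, which is the direction the symbol projects away from. This is only a linear-algebra illustration under hypothetical sample values; the helper names are ours and do not come from the paper.

```python
def projection_symbol(xi_t, xi_n, eps):
    """2x2 symbol of (7.7); xi_t stands for xi' (tangential) and xi_n for the normal frequency."""
    denom = eps * eps * xi_t * xi_t + xi_n * xi_n
    return [
        [xi_n * xi_n / denom, -eps * xi_t * xi_n / denom],
        [-eps * xi_t * xi_n / denom, eps * eps * xi_t * xi_t / denom],
    ]


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_vec(a, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]


def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    xi_t, xi_n, eps = 1.3, -0.7, 0.05  # illustrative values only
    P = projection_symbol(xi_t, xi_n, eps)
    print("idempotency defect |P*P - P|:", max_abs_diff(mat_mul(P, P), P))
    print("image of (eps*xi', xi_n):", mat_vec(P, [eps * xi_t, xi_n]))
    print("trace:", P[0][0] + P[1][1])
```

Idempotency together with unit trace means the symbol projects onto a one-dimensional subspace at each frequency, namely the orthogonal complement of $(\varepsilon\xi', \xi_n)$.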

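In [6], to avoid the Fourier transform in $y$, we expressed $P^\infty$ as an integration in the normal variable. For $\overline{P}^\infty$ one can similarly see that

$$\overline{P}^\infty_n w = \frac{1}{2}\left[\varepsilon|\xi'| \int_{-\infty}^{Y} dY'\, e^{-\varepsilon|\xi'|(Y-Y')}\left(-N'w^1 + w^2\right) + \varepsilon|\xi'| \int_{Y}^{\infty} dY'\, e^{\varepsilon|\xi'|(Y-Y')}\left(N'w^1 + w^2\right)\right], \tag{7.9}$$

$$\overline{P}^{\infty\prime} w = w^1 + \frac{1}{2}\left[-\varepsilon|\xi'| \int_{-\infty}^{Y} dY'\, e^{-\varepsilon|\xi'|(Y-Y')}\left(w^1 + N'w^2\right) - \varepsilon|\xi'| \int_{Y}^{\infty} dY'\, e^{\varepsilon|\xi'|(Y-Y')}\left(w^1 - N'w^2\right)\right]. \tag{7.10}$$

Next we present estimates on the projection operator. In these estimates $w$ is defined on $Y \ge 0$, but we write $\overline{P}^\infty w$ to mean the following: First extend $w$ oddly to $Y < 0$, i.e.

$$w(x, Y) = -w(x, -Y) \quad \text{when } Y \le 0; \tag{7.11}$$

then apply $\overline{P}^\infty$, and finally restrict the result to $Y \ge 0$ for application of the norm. The resulting expressions for $\overline{P}^\infty$ are

$$\begin{aligned}
\overline{P}^\infty_n w = \frac{1}{2}\,\varepsilon|\xi'| \Bigg[ & \int_0^Y dY' \left(e^{-\varepsilon|\xi'|(Y-Y')} - e^{-\varepsilon|\xi'|(Y+Y')}\right)\left(-N'w^1 + w^2\right) \\
& + \int_Y^\infty dY' \left(e^{\varepsilon|\xi'|(Y-Y')}\left(N'w^1 + w^2\right) - e^{\varepsilon|\xi'|(-Y-Y')}\left(-N'w^1 + w^2\right)\right) \Bigg],
\end{aligned} \tag{7.12}$$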


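$$\begin{aligned}
\overline{P}^{\infty\prime} w = w^1 - \frac{1}{2}\,\varepsilon|\xi'| \Bigg[ & \int_0^Y dY' \left(e^{-\varepsilon|\xi'|(Y-Y')} - e^{-\varepsilon|\xi'|(Y+Y')}\right)\left(w^1 + N'w^2\right) \\
& + \int_Y^\infty dY' \left(e^{\varepsilon|\xi'|(Y-Y')}\left(w^1 - N'w^2\right) - e^{\varepsilon|\xi'|(-Y-Y')}\left(w^1 + N'w^2\right)\right) \Bigg].
\end{aligned} \tag{7.13}$$

The following estimate is easily proved:

Proposition 7.3. Let $w \in L^{l,\rho,\theta}$ with $\gamma w = 0$. Then $\overline{P}^\infty w \in L^{l,\rho,\theta}$ and

$$|\overline{P}^\infty w|_{l,\rho,\theta} \le c\,|w|_{l,\rho,\theta}. \tag{7.14}$$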

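We are now ready to introduce the projected heat operator $N_0$, acting on vectorial functions, defined as

$$N_0 = \overline{P}^\infty \tilde E_2. \tag{7.15}$$

One can easily show that $\overline{P}^\infty$ commutes with the heat operator $\partial_t - \partial_{YY} - \varepsilon^2\partial_{xx}$. It then follows that for each $w$ such that $\gamma w = 0$,

$$\nabla\cdot N_0 w = 0, \tag{7.16}$$

$$\left(\partial_t - \partial_{YY} - \varepsilon^2\partial_{xx}\right) N_0 w = \overline{P}^\infty w. \tag{7.17}$$

The following estimates are a consequence of the properties of $\overline{P}^\infty$ and $\tilde E_2$ separately:

Proposition 7.4. Suppose $w \in L^{l,\rho,\theta}_{\beta,T}$. Then $N_0 w \in L^{l,\rho,\theta}_{\beta,T}$ and

$$|N_0 w|_{l,\rho,\theta,\beta,T} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.18}$$

Proposition 7.5. Suppose $w \in L^{l,\rho,\theta}_{\beta,T}$ with $\gamma w = 0$ and that $\rho' \le \rho - \beta t$, $\theta' \le \theta - \beta t$. Then $w$ and $N_0 w$ are in $L^{l,\rho',\theta'}$ for each $t$, and

$$|N_0 w|_{l,\rho',\theta'} \le c \int_0^t ds\, |w(\cdot,\cdot,s)|_{l,\rho',\theta'} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.19}$$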

Note that $\tilde E_2$ has zero boundary data; thus the conditions in Proposition 7.3 are all satisfied.

7.3. The Navier-Stokes operator. With the Stokes operator defined in Sect. 3 and the projected heat operator of the previous subsection, we now introduce the Navier-Stokes operator $N^*$ defined as

$$N^* = N_0 - S\gamma N_0. \tag{7.20}$$

This operator is used to solve the time-dependent Stokes equations with forcing, which is equivalent to the Navier-Stokes equations if the nonlinear terms are put into the forcing. In fact

$$w = N^* w^\star \tag{7.21}$$

solves the system

$$\begin{aligned}
\left(\partial_t - \partial_{YY} - \varepsilon^2\partial_{xx}\right) w + \nabla p^w &= w^\star, && (7.22)\\
\nabla\cdot w &= 0, && (7.23)\\
\gamma w &= 0, && (7.24)\\
w(t=0) &= 0, && (7.25)
\end{aligned}$$

and satisfies the following bound:


Proposition 7.6. Suppose $w \in L^{l,\rho,\theta}_{\beta,T}$. Then $N^* w \in L^{l,\rho,\theta}_{\beta,T}$ and

$$|N^* w|_{l,\rho,\theta,\beta,T} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.26}$$

We already know, from Proposition 7.4, that $N_0$ obeys an estimate like (7.26). Therefore the only part of $N^*$ which has to be estimated is that involving the Stokes operator $S$. To bound this term it is enough to notice that $\gamma N_0 w$ is a boundary datum for which the assumptions of Proposition 3.4 hold. In fact, since $\tilde E_2 w$ has been extended oddly for $Y < 0$, $\gamma_n N_0 w = \gamma\overline{P}^\infty_n \tilde E_2 w$ is (see Eq. (7.12))

$$\gamma\overline{P}^\infty_n \tilde E_2 w = \varepsilon|\xi'| \int_0^\infty dY'\, e^{-\varepsilon|\xi'|Y'}\, N' \tilde E_2 w^1. \tag{7.27}$$

According to Proposition 7.1, this is of the form required in Proposition 3.4 for the normal part $g_n$ of $g = N_0 w$. The tangential part $g'$ satisfies the bound

$$|g'|_{l,\rho,\theta,\beta,T} \le c\,|\tilde E_2 w|_{l,\rho,\theta,\beta,T}. \tag{7.28}$$

Therefore

$$|S\gamma N_0 w|_{l,\rho,\theta,\beta,T} \le c\,|\tilde E_2 w|_{l,\rho,\theta,\beta,T} \le c\,|w|_{l,\rho,\theta,\beta,T}, \tag{7.29}$$

which concludes the proof of Proposition 7.6. We shall also use the following Proposition, which is proved in the same way as the previous result, using Proposition 7.5:

Proposition 7.7. Suppose $w \in L^{l,\rho,\theta}_{\beta,T}$ with $\gamma w = 0$ and that $\rho' \le \rho - \beta t$, $\theta' \le \theta - \beta t$. Then in $L^{l,\rho',\theta'}$,

$$|N^* w|_{l,\rho',\theta'} \le c \int_0^t ds\, |w(\cdot,\cdot,s)|_{l,\rho',\theta'} \le c\,|w|_{l,\rho,\theta,\beta,T}. \tag{7.30}$$

7.4. The solution of the error equation. We can now solve Eqs. (4.15)–(4.18). If one looks at these equations one sees that they are of the form (7.22)–(7.25) (where all forcing and nonlinear terms are in $w^\star$, see Eq. (7.37) below) plus boundary and initial data. We therefore express $e$ as the sum of two terms: the first involving the Navier-Stokes operator and the second carrying all boundary and initial data. In fact we write

$$e = N^* e^* + \sigma, \tag{7.31}$$

where $\sigma$ solves the following time-dependent Stokes problem with initial and boundary data:

$$\begin{aligned}
(\partial_t - \partial_{YY})\,\sigma + \nabla\phi &= 0, && (7.32)\\
\nabla\cdot\sigma &= 0, && (7.33)\\
\gamma\sigma &= (0, \varepsilon G), && (7.34)\\
\sigma(t=0) &= e_0, && (7.35)
\end{aligned}$$

having denoted

$$G = -\int_0^\infty dY'\, \partial_x \Omega^1, \tag{7.36}$$

and where $e^*$ satisfies the following equation:


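$$e^* = \Xi - \left\{ e\cdot\nabla\left[u^0 + \varepsilon(\omega+\Omega)\right] + \left[u^0 + \varepsilon(\omega+\Omega)\right]\cdot\nabla e + \varepsilon\, e\cdot\nabla e \right\} - \varepsilon^2\partial_{xx}\sigma. \tag{7.37}$$

Equations (7.32)–(7.35) can be solved explicitly. First note that $\phi$ is harmonic, so that, imposing it to be bounded at infinity,

$$\left(\partial_y + |\xi'|\right)\phi = 0. \tag{7.38}$$

Apply $(\partial_y + |\xi'|)$ to the normal component of Eq. (7.32), and define

$$\tau = \left(\partial_y + |\xi'|\right)\sigma^2, \tag{7.39}$$

which satisfies

$$\begin{aligned}
(\partial_t - \partial_{YY})\,\tau &= 0, && (7.40)\\
\gamma\tau &= \varepsilon|\xi'|G, && (7.41)\\
\tau(t=0) &= |\xi'|\,V_1 e_0, && (7.42)
\end{aligned}$$

in which $V_1 e_0 = e^2_0 - N' e^1_0$. Denote $G_0 = G(t=0)$. Then the solution of the system (7.40)–(7.42) is

$$\tau = E_0(t)\left(|\xi'|V_1 e_0 - \varepsilon|\xi'|G_0\right) + E_1\left(\gamma\,\varepsilon|\xi'|G - \varepsilon|\xi'|G_0\right) + \varepsilon|\xi'|G_0 = |\xi'|\,\tilde\tau. \tag{7.43}$$

The initial condition $e_0$ is in $L^{l,\rho,\theta}$; this obviously implies $e_0 \in L^{l-2,\rho_2,\theta_2}$. One has the following proposition:

Proposition 7.8. Given that $e_0 \in L^{l-2,\rho_2,\theta_2}$, that $G \in K'^{\,l-2,\rho_2}_{\beta_2,T}$, and the compatibility condition $\gamma_n e_0 = \varepsilon G_0$, then $\tilde\tau \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ and

$$|\tilde\tau|_{l-2,\rho_2,\theta_2,\beta_2,T} \le c\left(|e_0|_{l-2,\rho_2,\theta_2} + |G|_{l-2,\rho_2,\beta_2,T}\right). \tag{7.44}$$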

The proof of this proposition is based on the estimates on the operators $E_0(t)$ and $E_1$ given in Propositions 5.2 and 5.3 of [6]; regarding the estimate in Proposition 5.3, we notice in fact that if a function is in $K^{l,\rho,\theta,\mu}_{\beta,T}$ it is a fortiori in $L^{l,\rho,\theta}_{\beta,T}$.

Now, the expression (7.43) for $\tau$ in (7.39) and the boundary condition (7.34) on $\sigma^2$ imply that

$$\sigma^2 = \varepsilon\, e^{-\varepsilon|\xi'|Y} G + U\tilde\tau, \tag{7.45}$$

where $U$ has been defined in (3.33). The incompressibility condition then leads to

$$\sigma^1 = -\varepsilon N' e^{-\varepsilon|\xi'|Y} G + N'(1 - U)\tilde\tau. \tag{7.46}$$

A bound for $\sigma$ is given by

Proposition 7.9. Suppose that $G = |\xi'|\tilde G$, with $\tilde G \in K'^{\,l-2,\rho_2}_{\beta_2,T}$. Then $\sigma \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ and

$$\begin{aligned}
|\sigma|_{l-2,\rho_2,\theta_2,\beta_2,T} &\le c\left(|e_0|_{l-2,\rho_2,\theta_2} + |\tilde G|_{l-2,\rho_2,\beta_2,T}\right) \\
&\le c\left(|u^E_0|_{l,\rho,\theta} + |\tilde u^P_0|_{l,\rho,\theta,\mu} + |\omega_0|_{l,\rho,\theta} + |\Omega_0|_{l,\rho,\theta,\mu} + |e_0|_{l,\rho,\theta}\right).
\end{aligned} \tag{7.47}$$


The proof of this proposition is based on Lemma 3.2 and Proposition 7.8 for the estimate of the terms involving $\tilde\tau$, and on the fact that if $\tilde G \in K'^{\,l-2,\rho_2}_{\beta_2,T}$, then $\varepsilon|\xi'| e^{-\varepsilon|\xi'|Y}\tilde G \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$.

We are now ready to prove existence and uniqueness for Eqs. (4.15)–(4.18). Use Eq. (7.31) in (7.37), interpret this equation as an equation for $e^*$, and use the abstract version of the Cauchy–Kowalewski Theorem, in the function spaces $X_\rho = L^{l,\rho,\theta}$ and $Y_{\rho,\beta,T} = L^{l,\rho,\theta}_{\beta,T}$, to prove existence and uniqueness for the solution. This is similar to the procedure used in [6] to prove existence and uniqueness for the Euler and Prandtl equations. Rewrite Eq. (7.37) as

$$e^* = F(e^*, t), \tag{7.48}$$

where $F(e^*, t)$ is

$$F(e^*, t) = k - \left\{ \left[u^0 + \varepsilon(\omega + \Omega + \sigma)\right]\cdot\nabla N^* e^* + N^* e^*\cdot\nabla\left[u^0 + \varepsilon(\omega + \Omega + \sigma)\right] + \varepsilon N^* e^*\cdot\nabla N^* e^* \right\} \tag{7.49}$$

and $k$ is the forcing term

$$\begin{aligned}
k = {} & \Xi - \left\{ \left[u^0 + \varepsilon(\omega+\Omega)\right]\cdot\nabla\sigma + \sigma\cdot\nabla\left[u^0 + \varepsilon(\omega+\Omega)\right] + \varepsilon\,\sigma\cdot\nabla\sigma \right\} \\
= {} & f - \left[ \left(u^0 + \varepsilon(\omega+\Omega+\sigma)\right)\cdot\nabla\Omega + \omega\cdot\nabla u^P - \left(g\,\partial_y\tilde u^P, 0\right) + (\Omega+\sigma)\cdot\nabla u^P \right] \\
& - \left[ (\Omega+\sigma)\cdot\nabla u^E + \left(u^P + \varepsilon(\omega+\Omega+\sigma)\right)\cdot\nabla\omega + \left(u^0 + \varepsilon(\omega+\Omega+\sigma)\right)\cdot\nabla\sigma \right] \\
& + \varepsilon^2\left[\Delta\omega + \left(\partial_{xx}\Omega^1, 0\right) + \partial_{xx}\sigma\right] - \left(0, (\partial_t - \varepsilon^2\Delta)\Omega^2\right).
\end{aligned} \tag{7.50}$$

The rest of this section is concerned with proving that the operator $F$ satisfies all the hypotheses of the ACK Theorem.

7.5. The forcing term. In this subsection we shall prove the following proposition, asserting that the forcing term is bounded in $L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ and $O(1)$:

Proposition 7.10. There exists a constant $R_0$ such that

$$|F(0, t)|_{l-2,\rho_2-\beta_2 t,\theta_2-\beta_2 t} \le R_0. \tag{7.51}$$

Equation (7.49) shows that

$$F(0, t) = k \tag{7.52}$$

with $k$ given by (7.50). We already know that $f \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ (see the discussion after Eq. (2.40)). The terms in the first square brackets are exponentially decaying outside the boundary layer. Inside the boundary layer they can be shown to be $O(1)$ with a Cauchy estimate on the terms where $\partial_y$ is present: this is possible because they go linearly fast to zero at the boundary. All terms inside the second square brackets are more easily handled because no $O(\varepsilon^{-1})$ terms appear. Proposition 7.10 is thus proved.

7.6. The Cauchy estimate. In this subsection we shall prove that the operator $F$ satisfies the last hypothesis of the ACK Theorem. Here and in the rest of this section $\rho' < \rho(s) \le \rho_2 - \beta_2 s$, $\theta' < \theta(s) \le \theta_2 - \beta_2 s$.


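Proposition 7.11. Suppose $\rho' < \rho(s) \le \rho_2 - \beta_2 s$ and $\theta' < \theta(s) \le \theta_2 - \beta_2 s$. If $e^{*1}$ and $e^{*2}$ are in $L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ with

$$|e^{*1}|_{l-2,\rho_2,\theta_2,\beta_2,T} \le R, \qquad |e^{*2}|_{l-2,\rho_2,\theta_2,\beta_2,T} \le R, \tag{7.53}$$

then

$$\begin{aligned}
|F(e^{*1}, t) - F(e^{*2}, t)|_{l-2,\rho',\theta'} \le {} & C \int_0^t ds \left( \frac{|e^{*1} - e^{*2}|_{l-2,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|e^{*1} - e^{*2}|_{l-2,\rho',\theta(s)}}{\theta(s) - \theta'} \right) \\
& + C\,|e^{*1} - e^{*2}|_{l-2,\rho',\theta',\beta_2,t} \int_0^t ds \sum_{i=1,2} \left( \frac{|e^{*i}|_{l-2,\rho(s),\theta'}}{\rho(s) - \rho'} + \frac{|e^{*i}|_{l-2,\rho',\theta(s)}}{\theta(s) - \theta'} \right)
\end{aligned} \tag{7.54}$$

in which all the norms are in $L^{l,\rho,\theta}$ and $L^{l,\rho,\theta}_{\beta,T}$.

The proof of the above proposition occupies the remainder of this section. First introduce the Cauchy estimates in $L^{l,\rho,\theta}$.

Lemma 7.1. Let $f(x, Y) \in L^{l,\rho,\theta}$. Then for $0 < \rho' < \rho$ and $0 < \theta' < \theta$,

$$|\partial_x f|_{l,\rho',\theta} \le c\,\frac{|f|_{l,\rho,\theta}}{\rho - \rho'}, \tag{7.55}$$

$$|\chi(Y)\,\partial_Y f|_{l,\rho,\theta'} \le c\,\frac{|f|_{l,\rho,\theta}}{\theta - \theta'}. \tag{7.56}$$

In the above lemma $\chi(Y)$ is a monotone, bounded function, going to zero linearly fast near the origin (see e.g. Eq. (4.28) of [6]). The Sobolev inequality implies the following lemmas:

Lemma 7.2. Let $f(x, Y)$ and $g(x, Y)$ be in $L^{l,\rho,\theta}$. Then for $0 < \rho' < \rho$,

$$|g\,\partial_x f|_{l,\rho',\theta} \le c\,|g|_{l,\rho',\theta}\, \frac{|f|_{l,\rho,\theta}}{\rho - \rho'}. \tag{7.57}$$

Lemma 7.3. Let $f(x, Y)$ and $g(x, Y)$ be in $L^{l,\rho,\theta}$ with $g(x, Y=0) = 0$. Then for $0 < \theta' < \theta$,

$$|g\,\partial_Y f|_{l,\rho,\theta'} \le c\,|g|_{l,\rho,\theta'}\, \frac{|f|_{l,\rho,\theta}}{\theta - \theta'}. \tag{7.58}$$

Lemmas 7.2 and 7.3 then imply

Lemma 7.4. Suppose $e^1$ and $e^2$ are in $L^{l-2,\rho,\theta}_{\beta,T}$ with $\gamma_n e^1 = \gamma_n e^2 = 0$. Then for $0 < \rho' < \rho$ and $0 < \theta' < \theta$,

$$|e^1\cdot\nabla e^1 - e^2\cdot\nabla e^2|_{l-2,\rho',\theta'} \le c\left( \frac{|e^1 - e^2|_{l-2,\rho,\theta'}}{\rho - \rho'} + \frac{|e^1 - e^2|_{l-2,\rho',\theta}}{\theta - \theta'} \right), \tag{7.59}$$

where the constant $c$ depends only on $|e^1|_{l-2,\rho,\theta,\beta,T}$ and $|e^2|_{l-2,\rho,\theta,\beta,T}$.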
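The mechanism behind Lemma 7.1 is the classical Cauchy estimate: a derivative on a narrower strip of analyticity is controlled by the function on a wider strip, at the price of a factor $1/(\rho - \rho')$. The sketch below illustrates this in a simplified sup-norm setting, which is an assumption made here for illustration (the norms of the paper are $L^2$-based Sobolev norms), using the test function $f(z) = e^{iz}$. All names and sample values are hypothetical.

```python
import cmath


def strip_sup(func, half_width, re_max=5.0, n_re=101, n_im=41):
    """Approximate sup of |func| over the strip |Im z| <= half_width,
    sampled on a grid with |Re z| <= re_max (grid includes both strip edges)."""
    best = 0.0
    for a in range(n_re):
        re = -re_max + 2.0 * re_max * a / (n_re - 1)
        for b in range(n_im):
            im = -half_width + 2.0 * half_width * b / (n_im - 1)
            best = max(best, abs(func(complex(re, im))))
    return best


def f(z):
    """Entire test function; |f(x + iy)| = exp(-y)."""
    return cmath.exp(1j * z)


def df(z):
    """Exact derivative of f."""
    return 1j * cmath.exp(1j * z)


def cauchy_sides(rho, rho_prime):
    """Return (sup of |f'| on the narrow strip, sup of |f| on the wide strip / (rho - rho'))."""
    lhs = strip_sup(df, rho_prime)
    rhs = strip_sup(f, rho) / (rho - rho_prime)
    return lhs, rhs


if __name__ == "__main__":
    rho = 1.0
    for rho_prime in (0.0, 0.5, 0.9, 0.99):
        lhs, rhs = cauchy_sides(rho, rho_prime)
        print(f"rho' = {rho_prime}: sup|f'| = {lhs:.4f} <= {rhs:.4f}")
```

The right-hand side blows up as $\rho' \to \rho$, which is why the estimates (7.54) and (7.60) carry the singular factors $1/(\rho(s) - \rho')$ and $1/(\theta(s) - \theta')$ under the time integral.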


We are now ready to prove Proposition 7.11. We first take into consideration the nonlinear part $N^* e^*\cdot\nabla N^* e^*$. From the estimates (7.26) and (7.30) on the Navier-Stokes operator, the estimate (7.59) on the convective operator and the fact that $\gamma_n N^* e^* = 0$, it follows that

$$\begin{aligned}
& |N^* e^{*1}\cdot\nabla N^* e^{*1} - N^* e^{*2}\cdot\nabla N^* e^{*2}|_{l-2,\rho',\theta'} \\
& \quad \le C \int_0^t ds \left( \frac{|e^{*1}(\cdot,\cdot,s) - e^{*2}(\cdot,\cdot,s)|_{l-2,\rho(s),\theta_2}}{\rho(s) - \rho'} + \frac{|e^{*1}(\cdot,\cdot,s) - e^{*2}(\cdot,\cdot,s)|_{l-2,\rho_2,\theta(s)}}{\theta(s) - \theta'} \right) \\
& \qquad + C\,|e^{*1} - e^{*2}|_{l-2,\rho_2,\theta_2,\beta_2,T} \int_0^t ds \sum_{i=1,2} \left( \frac{|e^{*i}(\cdot,\cdot,s)|_{l-2,\rho(s),\theta_2}}{\rho(s) - \rho'} + \frac{|e^{*i}(\cdot,\cdot,s)|_{l-2,\rho_2,\theta(s)}}{\theta(s) - \theta'} \right) \\
& \quad \le C \int_0^t ds \left( \frac{|e^{*1}(\cdot,\cdot,s) - e^{*2}(\cdot,\cdot,s)|_{l-2,\rho(s),\theta_2}}{\rho(s) - \rho'} + \frac{|e^{*1}(\cdot,\cdot,s) - e^{*2}(\cdot,\cdot,s)|_{l-2,\rho_2,\theta(s)}}{\theta(s) - \theta'} \right).
\end{aligned} \tag{7.60}$$

Since $\gamma_n\left[u^0 + \varepsilon(\omega + \Omega + \sigma)\right] = 0$, one can estimate the term $\left[u^0 + \varepsilon(\omega + \Omega + \sigma)\right]\cdot\nabla N^* w^\star$ in a similar fashion. The term $N^* w^\star\cdot\nabla\left[u^0 + \varepsilon(\omega + \Omega + \sigma)\right]$ is easily estimated. The proof of Proposition 7.11 is thus achieved.

7.7. Conclusion of the Proof of Theorem 5. The operator $F(e^*, t)$ satisfies all the hypotheses of the ACK Theorem. Therefore, there exists a $\beta_2 > 0$ such that Eq. (7.48) has a unique solution $e^* \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. Because of Proposition 7.6, then $N^* e^* \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$. Given the expression (7.31) for the error $e$ and Proposition 7.9 for $\sigma$, the proof of Theorem 5 is achieved.

7.8. Conclusion of the Proof of Theorem 1. We have thus proved that $u^E \in H^{l,\rho,\theta}_{\beta,T}$ (Theorem 4.1 of [6]), that $u^P \in K^{l-1,\rho,\theta,\mu}_{\beta,T}$ (Theorem 3 of [6]), that $\omega \in N^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ (Theorem 5.1), that $\Omega \in K^{l-2,\rho_2,\theta_2,\mu_2}_{\beta_2,T}$ (Theorem 4), and that $e \in L^{l-2,\rho_2,\theta_2}_{\beta_2,T}$ (Theorem 5). By a redefinition of the parameters, we may take $(\rho_2, \theta_2, \beta_2, \mu_2) = (\rho, \theta, \beta, \mu)$, and the proof of Theorem 1 is achieved.

8. Conclusions

In the analysis above, we have proved existence of solutions of the Navier-Stokes equations in two and three dimensions for a time that is short but independent of the viscosity. As the viscosity goes to zero, the Navier-Stokes solution has been shown to approach an Euler solution away from the boundary and a Prandtl solution in a thin boundary layer. The initial data were assumed to be analytic: although this restriction is severe, we believe that it might be optimal. In fact separation of the boundary layer is related to development of a singularity in the solution of the time-dependent Prandtl equations, as discussed in [2]. We conjecture that the time of separation (and thus the singularity time) cannot be controlled by a Sobolev bound on the initial data, unless some positivity and monotonicity is assumed as in [5]. It would be very important to verify this by an explicit singularity construction, or to refute it by an existence theorem in Sobolev spaces for Prandtl.

This result suggests further work on several related problems: Analysis of the zero-viscosity limit for Navier-Stokes equations in the exterior of a ball is presented in [1]. An alternative derivation of this result may be possible by a more direct analysis of the Navier-Stokes solution. In two dimensions, a solution is known to exist for a time that is independent of the viscosity. Thus by writing the solution as a Stokes operator times the nonlinear terms and analyzing the Stokes operator, it should be possible to recognize the regular (Euler) and boundary layer (Prandtl) parts directly. We believe that the method of the present paper could be used to prove convergence of the Navier-Stokes solution to an Euler solution with a vortex sheet, in the zero viscosity limit outside a boundary layer around the sheet. Note that the problem with a vortex sheet should be easier because the boundary layer is weaker since tangential slip is allowed, but it is more complicated since the boundary is curved and moving.

Appendix A: The Estimates for the Heat Operators

Proof of Lemma 3.1. To prove Lemma 3.1 it is useful to introduce the following changes of variables into the expression (3.21) for the operator $\tilde E_1$:

$$\zeta = \frac{Y}{[4(t-s)]^{1/2}}, \qquad \eta = \frac{x' - x}{[4(t-s)]^{1/2}}. \tag{A.1}$$

One has

$$\tilde E_1 f = \frac{2}{\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, f\!\left(x + \eta Y/\zeta,\; t - Y^2/4\zeta^2\right). \tag{A.2}$$

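To get an estimate in $L^{l,\rho,\theta}_{\beta,T}$ one has to bound the appropriate $L^2$ norm in $x$ and $Y$ of $\partial_x^i \tilde E_1 f$ with $i \le l$, $\partial_t\partial_x^i \tilde E_1 f$ with $i \le l-2$ and $\partial_x^i\partial_Y^j \tilde E_1 f$ with $i \le l-2$, $j \le 2$. We shall in fact prove a stronger estimate, namely that these terms are exponentially decaying in the $Y$ variable. Let us first bound $\partial_x^i \tilde E_1 f$:

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \|\partial_x^i \tilde E_1 f\|_{L^2(\Re x)} \\
& \quad = \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \left\{ \int_{-\infty}^{\infty} d\Re x \left[ \frac{2}{\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \frac{\eta Y}{\zeta},\, t - \frac{Y^2}{4\zeta^2}\right) \right]^2 \right\}^{1/2} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \left\{ \frac{2}{\sqrt\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \left[ \sup_{0\le t\le T}\ \sup_{|\Im x|\le\rho-\beta t} \|\partial_x^i f(\cdot + i\Im x, t)\|_{L^2(\Re x)} \right]^2 \right\}^{1/2} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{|\Im x|\le\rho-\beta t} \|\partial_x^i f(\cdot + i\Im x, t)\|_{L^2(\Re x)}\ \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \left\{ \frac{2}{\sqrt\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \right\}^{1/2} \\
& \quad \le |\partial_x^i f|_{0,\rho,\theta,\beta,T}.
\end{aligned} \tag{A.3}$$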
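The $\zeta$-integral that survives in the last steps of (A.3), namely $\frac{2}{\sqrt\pi}\int_{Y/\sqrt{4t}}^\infty e^{-\zeta^2}\,d\zeta$, is the complementary error function evaluated at $Y/\sqrt{4t}$, and for real $Y \ge 0$ it is bounded by $e^{-Y^2/4t}$. The following is a small sketch, under the simplifying assumption of real $Y$, that checks both statements numerically. The function names and sample values are hypothetical.

```python
import math


def tail_factor(Y, t):
    """(2/sqrt(pi)) * integral of exp(-zeta^2) from Y/sqrt(4t) to infinity, via math.erfc."""
    return math.erfc(Y / math.sqrt(4.0 * t))


def tail_factor_quadrature(Y, t, n=20000, cutoff=8.0):
    """Same quantity by a midpoint rule on [z, z + cutoff]; the remaining tail is negligible."""
    z = Y / math.sqrt(4.0 * t)
    h = cutoff / n
    total = sum(math.exp(-(z + (k + 0.5) * h) ** 2) for k in range(n))
    return 2.0 / math.sqrt(math.pi) * total * h


if __name__ == "__main__":
    t = 0.5  # illustrative value only
    for Y in (0.0, 0.5, 2.0, 4.0):
        exact = tail_factor(Y, t)
        approx = tail_factor_quadrature(Y, t)
        gaussian_bound = math.exp(-Y * Y / (4.0 * t))
        print(f"Y = {Y}: erfc = {exact:.6e}, quadrature = {approx:.6e}, bound = {gaussian_bound:.6e}")
```

The Gaussian bound is what makes the factor compatible with an exponential weight in $Y$: for bounded $t$ the product of a weight $e^{\mu Y}$ with $e^{-Y^2/4t}$ stays bounded as $Y \to \infty$.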

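In passing from the second to the third line, we used the Jensen inequality to pass the square inside the integrals in $\zeta$ and $\eta$, and performed the integration in $\Re x$. We now bound $\partial_t\partial_x^i \tilde E_1 f$ with $i \le l-2$ by

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \|\partial_t\partial_x^i \tilde E_1 f\|_{L^2(\Re x)} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{|\Im x|\le\rho-\beta t} \|\partial_t\partial_x^i f(\cdot + i\Im x, t)\|_{L^2(\Re x)}\ \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \left\{ \frac{2}{\sqrt\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \right\}^{1/2} \\
& \quad \le |\partial_t\partial_x^i f|_{0,\rho,\theta,\beta,T}.
\end{aligned} \tag{A.4}$$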

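The procedure for the above bound is essentially the same that was used for $\partial_x^i \tilde E_1 f$. The only thing to note is that the derivative with respect to time passed through the integral in $\zeta$ because $f(x, t=0) = 0$. We now bound $\partial_Y\partial_x^i \tilde E_1 f$ with $i \le l-2$ by

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y\partial_x^i \tilde E_1 f\|_{L^2(\Re x)} \\
& \quad = \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y} \sup_{|\Im x|\le\rho-\beta t} \Bigg\| \frac{2}{\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2} \\
& \qquad\qquad \times \left[ \frac{\eta}{\zeta}\, \partial_x^{i+1} f\!\left(x + \eta Y/\zeta,\, t - Y^2/4\zeta^2\right) - \frac{Y}{2\zeta^2}\, \partial_t\partial_x^i f\!\left(x + \eta Y/\zeta,\, t - Y^2/4\zeta^2\right) \right] \Bigg\|_{L^2(\Re x)} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y}\, \frac{2}{\sqrt\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2}\, \frac{Y}{2\zeta^2} \\
& \qquad\qquad \times \sup_{|\Im x|\le\rho-\beta t} \left[ \left\|\partial_x^{i+2} f(\cdot + i\Im x,\, t - Y^2/4\zeta^2)\right\|^2_{L^2(\Re x)} + \left\|\partial_t\partial_x^i f(\cdot + i\Im x,\, t - Y^2/4\zeta^2)\right\|^2_{L^2(\Re x)} \right]^{1/2} \\
& \quad \le \left( |\partial_x^{i+2} f|_{0,\rho,\theta,\beta,T} + |\partial_t\partial_x^i f|_{0,\rho,\theta,\beta,T} \right) \left[ \sup_{0\le t\le T}\ \sup_{Y\in\Sigma(\theta-\beta t)} e^{(\mu-\beta t)\Re Y}\, \frac{2}{\sqrt\pi} \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2}\, \frac{Y}{2\zeta^2} \right]^{1/2} \\
& \quad \le c\,|f|_{l,\rho,\theta,\beta,T}.
\end{aligned} \tag{A.5}$$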

In passing from the second to the third line of the above estimate, we first integrated by parts in $\eta$ the term $\partial_x^{i+1} f$ and then used Jensen’s inequality to pass the $L^2$ norm in $\Re x$ inside the integral in $\zeta$ and $\eta$. To bound $\partial_{YY}\partial_x^i \tilde E_1 f$ with $i \le l-2$, note that $\tilde E_1 f$ satisfies the heat equation and use the bounds above. The proof of Lemma 3.1 is thus achieved.

Proof of Proposition 7.1. To prove Proposition 7.1, it is useful to make the following changes of variables into the expression (7.4) for the operator $\tilde E_2$:

$$\zeta = \frac{Y' - Y}{\sqrt{4(t-s)}}, \qquad z = \frac{Y' + Y}{\sqrt{4(t-s)}}, \qquad \eta = \frac{x' - x}{\sqrt{4(t-s)}}. \tag{A.6}$$

These lead to

$$\begin{aligned}
\tilde E_2 f = \int_0^t ds \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2} \Bigg[ & \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2}\, f\!\left(x + \eta\sqrt{4(t-s)},\; Y + \zeta\sqrt{4(t-s)},\; s\right) \\
& - \int_{Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2}\, f\!\left(x + \eta\sqrt{4(t-s)},\; -Y + \zeta\sqrt{4(t-s)},\; s\right) \Bigg].
\end{aligned} \tag{A.7}$$

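To get an estimate in $L^{l,\rho,\theta}_{\beta,T}$, bound $\partial_x^i \tilde E_2 f$ with $i \le l$, $\partial_t\partial_x^i \tilde E_2 f$ with $i \le l-2$ and $\partial_x^i\partial_Y^j \tilde E_2 f$ with $i \le l-2$, $j \le 2$. First bound $\partial_x^i \tilde E_2 f$ by

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_x^i \tilde E_2 f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta',a/\varepsilon))} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_{\Gamma(\theta',a/\varepsilon)} dY \sup_{|\Im x|\le\rho-\beta t} \int_{-\infty}^{\infty} d\Re x \\
& \qquad\qquad \Bigg[ \int_0^t ds \Bigg( \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, Y + \zeta\sqrt{4(t-s)},\, s\right) \\
& \qquad\qquad\quad - \int_{Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, -Y + \zeta\sqrt{4(t-s)},\, s\right) \Bigg) \Bigg]^2 \Bigg\}^{1/2} \\
& \quad \le c \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_0^T ds \int_{\Gamma(\theta',a/\varepsilon)} dY \Bigg[ \int_{-\infty}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2} \sup_{|\Im x|\le\rho-\beta t} \left\|\partial_x^i f\!\left(\cdot + i\Im x,\, Y + \zeta\sqrt{4(t-s)}\right)\right\|^2_{L^2(\Re x)} \\
& \qquad\qquad + \int_{-\infty}^{\infty} dz\, e^{-z^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2} \sup_{|\Im x|\le\rho-\beta t} \left\|\partial_x^i f\!\left(\cdot + i\Im x,\, -Y + z\sqrt{4(t-s)}\right)\right\|^2_{L^2(\Re x)} \Bigg] \Bigg\}^{1/2} \\
& \quad \le c \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_x^i f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta',a/\varepsilon))}.
\end{aligned} \tag{A.8}$$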

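In passing from the second to the third line of the above estimate, we used Jensen’s inequality and overestimated the integrals in $s$, $\zeta$ and $z$. Now bound $\partial_Y\partial_x^i \tilde E_2 f$ with $i \le l-2$ by

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y\partial_x^i \tilde E_2 f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta',a/\varepsilon))} \\
& \quad \le \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_{\Gamma(\theta',a/\varepsilon)} dY \sup_{|\Im x|\le\rho-\beta t} \int_{-\infty}^{\infty} d\Re x \\
& \qquad\qquad \Bigg[ \int_0^t ds \Bigg( \int_{-Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_Y\partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, Y + \zeta\sqrt{4(t-s)},\, s\right) \\
& \qquad\qquad\quad - \int_{Y/\sqrt{4(t-s)}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_Y\partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, -Y + \zeta\sqrt{4(t-s)},\, s\right) \Bigg) \\
& \qquad\qquad\quad + 2 \int_0^t ds\, \frac{e^{-Y^2/4(t-s)}}{\sqrt{4(t-s)}} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, 0,\, s\right) \Bigg]^2 \Bigg\}^{1/2} \\
& \quad \le c \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y\partial_x^i f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta',a/\varepsilon))} \\
& \qquad + \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_{\Gamma(\theta',a/\varepsilon)} dY \sup_{|\Im x|\le\rho-\beta t} \int_{-\infty}^{\infty} d\Re x \left[ 2 \int_0^t ds\, \frac{e^{-Y^2/4t}}{\sqrt{4(t-s)}} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, 0,\, s\right) \right]^2 \Bigg\}^{1/2} \\
& \quad \le c\,|\partial_Y\partial_x^i f|_{0,\rho,\theta,\beta,T} + c \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\{ \int_{\Gamma(\theta',a/\varepsilon)} dY\, e^{-Y^2/4t} \sup_{0\le t\le T}\ \sup_{|\Im x|\le\rho-\beta t} \left\|\partial_x^i f(\cdot + i\Im x, 0, t)\right\|^2_{L^2(\Re x)} \right\}^{1/2} \int_0^t ds\, \frac{1}{\sqrt{4(t-s)}} \\
& \quad \le c\,|\partial_Y\partial_x^i f|_{0,\rho,\theta,\beta,T} + c \sum_{0\le j\le 1} \sup_{0\le t\le T} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y^j\partial_x^i f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta'=0,a/\varepsilon))} \\
& \quad \le c\,|f|_{l,\rho,\theta,\beta,T}.
\end{aligned} \tag{A.9}$$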

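In passing from the third to the fourth line, we estimated the value of $\partial_x^i f$ at the boundary with the $L^2$ (in $Y$) estimate of $\partial_x^i f$ and $\partial_Y\partial_x^i f$. Now bound $\partial_{YY}\partial_x^i \tilde E_2 f$ with $i \le l-2$ by

$$\begin{aligned}
& \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y^2\partial_x^i \tilde E_2 f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta',a/\varepsilon))} \\
& \quad \le c\,|\partial_{YY}\partial_x^i f|_{0,\rho,\theta,\beta,T} + c \sum_{1\le j\le 2} \sup_{0\le t\le T} \left\| \sup_{|\Im x|\le\rho-\beta t} \|\partial_Y^j\partial_x^i f\|_{L^2(\Re x)} \right\|_{L^2(\Gamma(\theta'=0,a/\varepsilon))} \\
& \qquad + \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_{\Gamma(\theta',a/\varepsilon)} dY \sup_{|\Im x|\le\rho-\beta t} \int_{-\infty}^{\infty} d\Re x \left[ \int_0^t ds\, \frac{Y e^{-Y^2/4(t-s)}}{\sqrt{4(t-s)^3}} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta\sqrt{4(t-s)},\, 0,\, s\right) \right]^2 \Bigg\}^{1/2} \\
& \quad \le c\,|f|_{l,\rho,\theta,\beta,T} + \sup_{0\le t\le T}\ \sup_{\theta'\le\theta-\beta t} \Bigg\{ \int_{\Gamma(\theta',a/\varepsilon)} dY \sup_{|\Im x|\le\rho-\beta t} \int_{-\infty}^{\infty} d\Re x \left[ \int_{Y/\sqrt{4t}}^{\infty} d\zeta\, e^{-\zeta^2} \int_{-\infty}^{\infty} d\eta\, e^{-\eta^2}\, \partial_x^i f\!\left(x + \eta Y/\zeta,\, 0,\, s\right) \right]^2 \Bigg\}^{1/2} \\
& \quad \le c\,|f|_{l,\rho,\theta,\beta,T}.
\end{aligned} \tag{A.10}$$

In passing from the third to the fourth line, we used Jensen’s inequality to pass the square inside the integral in $\zeta$ and $\eta$. Then we used the fact that the integral in $\zeta$ from $Y/\sqrt{4t}$ to infinity is an exponentially decaying function of $Y$ to perform the integration in $Y$. Finally we estimated the value of $\partial_x^i f$ at the boundary with the $L^2$ (in $Y$) estimate of $\partial_x^i f$ and $\partial_Y\partial_x^i f$.

Proof of Proposition 7.2. The proof of Proposition 7.2 uses the same calculations as in the previous proof, except that in Proposition 7.2 the boundary terms with $Y = 0$ are all zero. With these terms absent, the result (7.6) follows from the estimates (A.8)–(A.10).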

Acknowledgement. Part of this work has been done while Marco Sammartino was visiting the Mathematics Department of UCLA. He wishes to express his gratitude for the warm hospitality that he received. As in Part I, we are happy to acknowledge the help that we received through discussions with a number of people, including Kiyoshi Asano, Antonio Greco, Tom Hou and Mario Pulvirenti.

References

1. Caflisch, R.E. and Sammartino, M.: Navier-Stokes Equation on an exterior circular domain: Construction of the solution and the zero viscosity limit. Comptes Rendus Acad. Sci. I-Math. 324, 861–866 (1997)
2. Van Dommelen, L.L. and Shen, S.F.: The spontaneous generation of the singularity in a separating laminar boundary layer. J. Comput. Phys. 38, 125–140 (1980)
3. Van Dyke: Perturbation methods in fluid mechanics. New York: Academic Press, 1964
4. Goodman, J. and Xin, Z.P.: Viscous limits for piecewise smooth solutions to systems of conservation laws. Arch. Rat. Mech. 121, 235–265 (1992)
5. Oleinik, O.A.: On the mathematical theory of boundary layer for an unsteady flow of incompressible fluid. J. Appl. Math. Mech. 30, 951–974 (1966)
6. Sammartino, M. and Caflisch, R.E.: Zero Viscosity Limit for Analytic Solutions of the Navier-Stokes Equation on a Half-Space I. Existence for Euler and Prandtl equations. Commun. Math. Phys. 192, 433–461 (1998)
7. Ukai, S.: A solution formula for the Stokes equations in $\mathbb{R}^n_+$. Commun. Pure and Appl. Math. 40, 611–621 (1987)
8. Xin, Z.P.: The fluid-dynamic limit of the Broadwell model of the nonlinear Boltzmann equation in the presence of shocks. Commun. Pure and Appl. Math. 44, 679–713 (1991)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 192, 493 – 517 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Geometric Quantization and Two Dimensional QCD

S. G. Rajeev, O. T. Turgut*

University of Rochester, Rochester, NY 14623, USA. E-mail: [email protected]

Received: 13 August 1996 / Accepted: 20 May 1997

* Present address: IHES, 35 route de Chartres, 91440, Bures-sur-Yvette, France. E-mail: [email protected]

Abstract: In this article, we will discuss geometric quantization of two dimensional Quantum Chromodynamics with fermionic or bosonic matter fields. We identify the respective large-$N_c$ phase spaces as the infinite dimensional Grassmannian and the infinite dimensional Disc. The Hamiltonians are quadratic functions, and the resulting equations of motion for these classical systems are nonlinear. In [33], it was shown that the linearization of the equations of motion for the Grassmannian gave the ’t Hooft equation. We will see that the linearization in the bosonic case leads to the scalar analog of the ’t Hooft equation found in [36].

1. Introduction

In the large $N_c$ limit of various quantum field theories (e.g., Quantum Chromodynamics or QCD) the quantum fluctuations become small and the theories tend to a classical limit. This classical limit however is different from the conventional one, in that many of the essential non-perturbative features of the quantum theory survive the large $N_c$ limit [16, 17, 38]. These classical theories are somewhat unusual in that they are not local field theories. Moreover the phase spaces are curved manifolds, and some of the nonlinearities (interactions) of the theory arise from this curvature. An approach to these theories that starts from the geometry of these phase spaces is therefore natural. Pioneering work in this direction was done by Berezin [8]. The first author has revived this point of view more recently [33]. (See also [13, 14, 10, 22, 23].)

The situation is analogous to that in atomic physics. In the conventional classical limit the atom is unstable due to radiation from the orbiting electrons. However, in the mean field (or Hartree–Fock) approximation, the atom is stable. In this approximation quantum fluctuations in the density matrix of electrons are ignored. What is perhaps less well appreciated is that the Hartree–Fock approximation of atomic physics is equivalent

to a classical theory whose phase space is an infinite dimensional complex manifold [33]. (Indeed this approximation can be thought of as a large Nc limit of the atomic system.) The density matrix itself is treated as a dynamical variable: the set of values it can take is the phase space of the theory. We will develop the geometric point of view to these theories further, by treating fermionic and bosonic systems in parallel. It turns out that many of the mathematical ideas are simpler for bosons, since the phase space admits a global co-ordinate system. In the case of fermionic systems, the appropriate phase spaces are coset spaces of unitary groups called the Grassmannian. These are well known objects in algebraic geometry in the finite dimensional case [12, 15]. The corresponding infinite dimensional generalization is essentially due to G. Segal [31, 34]. We will show that the phase space of the large Nc limit of the bosonic systems is a coset space of the pseudo-unitary group. This space is a non-commutative analog of the open disc in the complex plane: it is the space of matrices Z such that Z † Z < 1. The pseudo-unitary group acts by fractional linear transformations on this space. We will find also the appropriate infinite dimensional version of this space relevant to two dimensional field theory. First we will describe a classical dynamics on these coset spaces. We will then quantize these classical systems using geometrical ideas. The special case of the large Nc limit of scalar QCD in two space-time dimensions will be discussed in more detail. The linear approximation to this theory can then be obtained. We will see that this agrees with the traditional large Nc analysis, which in this case was carried out by Tomaras [36] following the original ideas of ’t Hooft [17]. It is important to emphasize that the large Nc limit of QCD we obtain is a highly nonlinear classical theory. 
It is only the linear approximation that can be obtained by conventional methods of summing planar diagrams. The geometrical approach encodes the nonlinearities in the curvature of the underlying phase space.

2. The Disc and the Grassmannian

We will describe first the geometry of two homogeneous symplectic manifolds, the Disc and the Grassmannian. The quantization of the dynamical systems with these as phase spaces leads to a bosonic and a fermionic system respectively. If the dimension of the homogeneous space is finite, this will be a quantum mechanical system with a finite number of degrees of freedom. But our discussion will be tailored to apply to infinite dimensional cases subject to some convergence conditions; these correspond to bosonic or fermionic quantum field theory in two space-time dimensions. It is at the moment not known how to generalize this discussion to the (less convergent) infinite dimensional manifolds that correspond to quantum field theories in higher dimensional space-times. Our approach is much influenced by the discussion of the Grassmannian in the book by Pressley and Segal [31].

Let $H$ be a separable complex Hilbert space; $H_-$ and $H_+$ are two orthogonal subspaces with $H = H_- \oplus H_+$. Define the Disc $D_1(H_-, H_+)$ to be the set of all operators $Z : H_+ \to H_-$ such that $1 - Z^\dagger Z > 0$ and $Z$ is Hilbert–Schmidt: $\mathrm{tr}\, Z^\dagger Z < \infty$. The last condition is automatically satisfied if either $H_-$ or $H_+$ is finite dimensional. Next, we define the pseudo-unitary group to be a subset of the invertible operators from $H$ to $H$:

\[ U_1(H_-, H_+) = \{ g \mid g \epsilon g^\dagger = \epsilon,\ g^{-1} \text{ exists and } [\epsilon, g] \text{ is Hilbert–Schmidt} \}. \tag{1} \]


Again, if the dimension of $H_+$ or $H_-$ is finite, the last condition is unnecessary. Here

\[ \epsilon = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]

with respect to the decomposition $H = H_- \oplus H_+$. If we decompose the matrix $g$ into block form,

\[ g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{2} \]

we have $a : H_- \to H_-$, $b : H_+ \to H_-$, $c : H_- \to H_+$ and $d : H_+ \to H_+$. Then the off-diagonal elements $b$ and $c$ are Hilbert–Schmidt and the diagonal elements $a$ and $d$ are bounded and invertible operators. The space of Hilbert–Schmidt operators forms a two-sided ideal (which we will denote by $I_2$) in the algebra of bounded operators. Thus the condition on the off-diagonal elements is preserved by multiplication and inversion, so that $U_1$ is indeed a group. The geometric meaning of the condition on $[\epsilon, g]$ is that the linear transformation $g$ does not mix the subspaces $H_\pm$ by “too much”. We define an action of $U_1(H_-, H_+)$ on the Disc $D_1$:

\[ Z \mapsto g \circ Z = (aZ + b)(cZ + d)^{-1}. \tag{3} \]

The condition $1 - Z^\dagger Z > 0$ implies that $cZ + d$ is invertible and bounded. Since the space of Hilbert–Schmidt operators is a two-sided ideal, $(aZ + b)(cZ + d)^{-1}$ is still Hilbert–Schmidt. Thus our action is well-defined. We note that the stability subgroup of the point $Z = 0$ is $U(H_-) \times U(H_+)$, $U(H_\pm)$ being the group of all unitary operators on $H_\pm$. Moreover, any point $Z$ is the image of $0$ under the action of the group, $g \circ (Z = 0) = b d^{-1}$. (Note that $b d^{-1}$ is in $I_2$ and $d^\dagger d = 1 + b^\dagger b$ implies that $1 - (b d^{-1})^\dagger b d^{-1} > 0$.) We therefore see that $D_1$ is a homogeneous space, given by the quotient

\[ D_1 = \frac{U_1(H_-, H_+)}{U(H_-) \times U(H_+)}. \tag{4} \]

It will prove convenient to parametrize the Disc by operators $\Phi : H \to H$,

\[ \Phi = 1 - 2 \begin{pmatrix} (1 - ZZ^\dagger)^{-1} & -(1 - ZZ^\dagger)^{-1} Z \\ Z^\dagger (1 - ZZ^\dagger)^{-1} & -Z^\dagger (1 - ZZ^\dagger)^{-1} Z \end{pmatrix}. \tag{5} \]
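As a sanity check of (3)–(5), the smallest case can be worked out numerically: if $H_-$ and $H_+$ are both one-dimensional, $Z$ is a single complex number in the unit disc and $g$ is a $2 \times 2$ matrix. The sketch below is only an illustration under that assumption; the helper names (`pseudo_unitary`, `act`, `phi`) and the particular parametrization of $g$ are hypothetical and do not come from the text.

```python
import cmath
import math

EPS = ((-1, 0), (0, 1))  # epsilon = diag(-1, 1) on H = H_- (+) H_+


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(x):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(x[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def inverse(x):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = x
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def close(x, y, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


def pseudo_unitary(t, alpha, delta):
    """Hypothetical parametrization: a hyperbolic rotation times diagonal phases."""
    boost = ((math.cosh(t), math.sinh(t)), (math.sinh(t), math.cosh(t)))
    phases = ((cmath.exp(1j * alpha), 0), (0, cmath.exp(1j * delta)))
    return matmul(boost, phases)


def act(g, z):
    """Fractional linear action (3): Z -> (aZ + b)(cZ + d)^{-1}."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)


def phi(z):
    """Phi(Z) of Eq. (5) when Z is a single complex number."""
    z = complex(z)
    A = 1 / (1 - abs(z) ** 2)
    return ((1 - 2 * A, 2 * A * z),
            (-2 * z.conjugate() * A, 1 + 2 * A * abs(z) ** 2))


g = pseudo_unitary(0.7, 0.3, -1.1)
z = 0.4 + 0.3j
w = act(g, z)

print(close(matmul(matmul(g, EPS), dagger(g)), EPS))         # g eps g^dagger = eps
print(abs(w) < 1)                                             # image stays in the Disc
print(close(matmul(phi(z), phi(z)), ((1, 0), (0, 1))))        # Phi^2 = 1
print(close(phi(0), EPS))                                     # Z = 0 corresponds to eps
print(close(phi(w), matmul(matmul(g, phi(z)), inverse(g))))   # Phi(g o Z) = g Phi(Z) g^{-1}
```

In this one-dimensional case the action (3) is the familiar Möbius action on the unit disc, which is why the Disc is described as a non-commutative analog of the open disc in the complex plane.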

One can see that under the transformation $Z \mapsto g \circ Z$, $\Phi \mapsto g \Phi g^{-1}$. $\Phi$ satisfies $\epsilon \Phi^\dagger \epsilon = \Phi$ and $\Phi^2 = 1$. Also, $\Phi - \epsilon \in I_2$, so that as an operator $\Phi$ does not differ from $\epsilon$ by “too much”. Many physical quantities are best described as functions of the deviation of $\Phi$ from the standard value $\epsilon$, $M = \Phi - \epsilon$. We will see that $\epsilon$ corresponds to the vacuum state, so this vacuum subtraction is the geometric analogue of normal ordering in quantum field theory. On occasion, we may use the self-adjoint operator $\tilde{M} = M\epsilon$.

Given a complex Hilbert space $H = H_- \oplus H_+$ and orthogonal subspaces $H_\pm$ as before, we can define another homogeneous space, the Grassmannian. We define the Grassmannian to be the following set of operators on $H$:

\[ Gr_1 = \{ \Phi \mid \Phi = \Phi^\dagger;\ \Phi^2 = 1;\ \Phi - \epsilon \in I_2 \}. \tag{6} \]

This is the same as the restricted Grassmannian of Ref. [31] if $H_\pm$ are infinite dimensional. To each point in the Grassmannian there corresponds a subspace of $H$, the eigenspace of $\Phi$ with eigenvalue $-1$. In fact the Grassmannian is usually viewed as the set of subspaces of a Hilbert space. If $H$ is finite dimensional, the Grassmannian is a compact manifold. It is disconnected, with each connected component labelled by $\mathrm{tr}\,\frac{1 - \Phi}{2} = 0, 1, \cdots, \dim H$.


$Gr_1$ is the homogeneous space of a unitary group. If $H_\pm$ are infinite dimensional, in order to have a well-defined action on $Gr_1$, we must restrict to an appropriate sub-group of $U(H)$. We define

\[ U_1(H) = \{ g \mid g^\dagger g = 1;\ [\epsilon, g] \in I_2 \}. \tag{7} \]

Let us split $g$ into $2 \times 2$ blocks,

\[ g = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}. \tag{8} \]

The convergence condition on $[\epsilon, g]$ is the statement that the off-diagonal blocks $g_{12}$ and $g_{21}$ are in $I_2$. It then follows, in the case where $H_\pm$ are both infinite dimensional, that $g_{11}$ and $g_{22}$ are Fredholm operators. (To see this, we recall that an operator is Fredholm if it is invertible modulo a compact operator. Any operator in $I_2$ is compact and moreover, $g$ is invertible.) The Fredholm index of $g_{11}$ is opposite to that of $g_{22}$; this integer is a homotopy invariant of $g$ and we can decompose $U_1(H)$ into connected components labeled by this integer. With the projection $g \to g \epsilon g^\dagger$, we see that $Gr_1$ is a homogeneous space of $U_1(H)$:

\[ Gr_1 = U_1(H) / \big( U(H_-) \times U(H_+) \big). \tag{9} \]

For, any $\Phi \in Gr_1$ can be diagonalized by an element of $U_1(H)$, $\Phi = g \epsilon g^\dagger$; this $g$ is ambiguous up to right multiplication by an element that commutes with $\epsilon$. Such elements form the subgroup

\[ U(H_-) \times U(H_+) = \Big\{ h \;\Big|\; h = \begin{pmatrix} h_{11} & 0 \\ 0 & h_{22} \end{pmatrix};\ h_{11}^\dagger h_{11} = 1 = h_{22}^\dagger h_{22} \Big\}. \tag{10} \]

Each point $\Phi \in Gr_1$ corresponds to a subspace of $H$: the eigenspace of $\Phi$ with eigenvalue $-1$. Thus $Gr_1$ consists of all subspaces obtained from $H_-$ by an action of $U_1$. It is also possible to view $Gr_1$ as a coset space of complex Lie groups: this defines a complex structure on $Gr_1$ which will be useful for geometric quantization. Define the restricted general linear group

\[ GL_1 = \{ \gamma \mid \gamma \text{ is invertible};\ [\epsilon, \gamma] \in I_2 \}. \tag{11} \]

Again, if we decompose $\gamma$ into $2 \times 2$ blocks, then $\gamma_{12}, \gamma_{21} \in I_2$ while $\gamma_{11}$ and $\gamma_{22}$ are Fredholm. Define also the subgroup (‘Borel subgroup’) of matrices which are upper triangular in this decomposition,

\[ B_1 = \Big\{ \beta = \begin{pmatrix} \beta_{11} & \beta_{12} \\ 0 & \beta_{22} \end{pmatrix} \;\Big|\; \beta \in GL_1 \Big\}. \tag{12} \]

This is the stability group of $H_-$ under the action of $GL_1$ on $H$. Thus the Grassmannian (which is the orbit of $H_-$) is the complex coset space

\[ Gr_1 = GL_1 / B_1. \tag{13} \]

The tangent space to the Grassmannian at $\epsilon$ may be identified with the Hilbert space $I_2(H_-; H_+)$. In finite dimensions the Grassmannian and Disc have a symplectic structure $\omega = \frac{i}{4}\,\mathrm{tr}\,\Phi\, d\Phi\, d\Phi$. It is obvious that this two-form is invariant under the action of the unitary group on the Grassmannian and the pseudo-unitary group on the Disc, since they can both be written as $\Phi \mapsto g \Phi g^{-1}$. Now, recall that in both cases $\Phi^2 = 1$ so that $\Phi\, d\Phi = -d\Phi\, \Phi$.


Hence, up to the constant factor $i/4$, $d\omega = \mathrm{tr}(d\Phi)^3 = \mathrm{tr}\,\Phi^2 (d\Phi)^3 = -\mathrm{tr}\,\Phi (d\Phi)^3 \Phi = -\mathrm{tr}(d\Phi)^3 \Phi^2 = -d\omega$, which proves that $\omega$ is closed. Due to the homogeneity, it is enough to prove that $\omega$ is non-degenerate at one point, say $\Phi = \epsilon$, which is also straightforward. It is not clear that this symplectic form exists in the infinite dimensional case; the trace could diverge. But if we think in terms of the variable $M$, obtained from $\Phi$ through a “vacuum subtraction”, we see that

\[ \omega = \frac{i}{4}\,\mathrm{Tr}\,(\epsilon + M)\, dM\, dM \tag{14} \]

is well-defined because $dM \in I_2$. Indeed this is why we imposed the convergence conditions. It is possible to weaken the convergence condition (which is interesting for quantum field theories in dimensions greater than two [29]) without changing much of the structure, but we will lose the symplectic form. The above form is invariant under the action of $U_1(H)$ for the Grassmannian and invariant under the action of $U_1(H_-, H_+)$ for the Disc. Thus, $Gr_1$ and $D_1$ are both homogeneous symplectic manifolds just as in the finite dimensional case.

We can look for the moment maps, which generate the infinitesimal action of $U_1(H_-, H_+)$ and $U_1(H)$ respectively. In the finite dimensional case, this is just the function $-\mathrm{Tr}\, u\Phi$, where $u$ is a hermitian matrix for the Grassmannian and a pseudo-hermitian ($u^\dagger = \epsilon u \epsilon$) matrix for the Disc. Indeed, the infinitesimal change of $\Phi$ under the group is $[u, \Phi]$, and $2\,\mathrm{Tr}\,\Phi [u, \Phi]\, d\Phi = -d\,\mathrm{Tr}\, u\Phi$. We cannot take $f_u = -\mathrm{Tr}\, u\Phi$ in the infinite dimensional case, because the trace diverges. However, we do a vacuum subtraction from this expression and get instead $-\mathrm{Tr}(\Phi - \epsilon)u = -\mathrm{Tr}\, M u$; this trace is conditionally convergent, so we have a chance of obtaining a generating function. We will now describe this procedure in more detail. Let us decompose the operator $M$ into block form,

\[ M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \tag{15} \]

where $M_{11} : H_- \to H_-$, $M_{12} : H_+ \to H_-$, $M_{21} : H_- \to H_+$, and $M_{22} : H_+ \to H_+$; since $\Phi - \epsilon$ is in $I_2$ in both cases, we have $M_{12}, M_{21} \in I_2$. The quadratic constraint $M^2 + [\epsilon, M]_+ = 0$ gives

\[ M_{11}^2 + M_{12} M_{21} - 2 M_{11} = 0, \qquad M_{22}^2 + M_{21} M_{12} + 2 M_{22} = 0. \tag{16} \]
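For the reader's convenience, here is a short sketch of how the block equations (16) arise. It assumes only $\Phi^2 = 1$, $\Phi = \epsilon + M$ and $\epsilon = \mathrm{diag}(-1, 1)$ as stated earlier; the off-diagonal relations in the last line are not quoted in the text and are included only for completeness.

```latex
\begin{align*}
1 = \Phi^2 &= (\epsilon + M)^2 = \epsilon^2 + \epsilon M + M \epsilon + M^2
  \;\Longrightarrow\; M^2 + [\epsilon, M]_+ = 0, \\[4pt]
[\epsilon, M]_+ &= \begin{pmatrix} -2 M_{11} & 0 \\ 0 & 2 M_{22} \end{pmatrix}, \qquad
M^2 = \begin{pmatrix}
  M_{11}^2 + M_{12} M_{21} & M_{11} M_{12} + M_{12} M_{22} \\
  M_{21} M_{11} + M_{22} M_{21} & M_{22}^2 + M_{21} M_{12}
\end{pmatrix}, \\[4pt]
\text{diagonal blocks:}\quad & M_{11}^2 + M_{12} M_{21} - 2 M_{11} = 0, \qquad
  M_{22}^2 + M_{21} M_{12} + 2 M_{22} = 0, \\
\text{off-diagonal blocks:}\quad & M_{11} M_{12} + M_{12} M_{22} = 0, \qquad
  M_{21} M_{11} + M_{22} M_{21} = 0.
\end{align*}
```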

Next we use the fact that $M_{12}, M_{21} \in I_2$ and the above equations to get $M_{11}, M_{22} \in I_1$. Now,

\[ u \in \begin{pmatrix} B & I_2 \\ I_2 & B \end{pmatrix}, \qquad M \in \begin{pmatrix} I_1 & I_2 \\ I_2 & I_1 \end{pmatrix}. \]

(Here $B$ is the space of bounded operators.) Thus the diagonal blocks in $M u$ are both trace-class. We now define the conditional trace $\mathrm{Tr}_\epsilon$ of an operator to be the sum of the traces of its diagonal submatrices: $\mathrm{Tr}_\epsilon X = \frac{1}{2}\,\mathrm{Tr}[X + \epsilon X \epsilon]$. (Such conditional traces have been used recently to study anomalies [28].) This conditional trace exists for $M u$ and we define

\[ f_u = -\mathrm{Tr}_\epsilon\, M u. \tag{17} \]

If we restrict to finite rank matrices $u$, this function differs by a constant from the previous moment map; therefore it generates the same Hamiltonian vector fields:

\[ \omega(V_{f_u}, \cdot) = -df_u \;\mapsto\; V_{f_u} = i[u, \epsilon + M]. \tag{18} \]

However, there is an important change in the Poisson bracket relations; they will differ by a constant term from the previous ones:


\[ \{ f_u, f_v \} = f_{-i[u,v]} - i\,\mathrm{Tr}_\epsilon\, \epsilon [u, v]. \tag{19} \]

In the finite dimensional case we can remove the extra term by adding a constant term to $f_u$. However this is not possible in the infinite dimensional case, as the term we must add to $f_u$ will diverge. This is, in fact, the Lie algebra of the non-trivial central extension of $GL_1$.

It will be convenient to identify the Hilbert space $H$ with $L^2(\mathbb{R})$ of square integrable functions on the real line. (Since all abstract infinite dimensional Hilbert spaces are isomorphic, in fact there is no loss of generality. But this realization is convenient for application to two-dimensional quantum field theories.) The operator $\epsilon$ can be thought of as an integral operator with kernel $\epsilon(p, q)$. We can choose $\epsilon(p, q) = \mathrm{sgn}(p)\,\delta(p - q)$. A natural choice of basis for the operators on this space is the set of the Weyl operators $e(p, q)$ defined by the integral kernels $e(p, q)(r, s) = \delta(p - s)\,\delta(q - r)$. Define also $\lambda(p, q) = \int \epsilon(p, r)\, e(r, q)\,[dr]$. We can express the Poisson brackets in terms of the expansions $M = \int M(p, q)\, e(q, p)\,[dp\, dq]$ for the Grassmannian and $M = \int \tilde{M}(p, q)\, \lambda(q, p)\,[dp\, dq]$ for the Disc$^1$:

\[ \{ M(p, q), M(r, s) \} = i\big( M(r, q)\,\delta(p - s) - M(p, s)\,\delta(r - q) \big) + i\big( \epsilon(r, q)\,\delta(p - s) - \epsilon(p, s)\,\delta(r - q) \big), \tag{20} \]

and

\[ \{ \tilde{M}(p, q), \tilde{M}(r, s) \} = i\big( \tilde{M}(r, q)\,\epsilon(p, s) - \tilde{M}(p, s)\,\epsilon(r, q) \big) + i\big( \epsilon(p, s)\,\delta(r - q) - \epsilon(r, q)\,\delta(p - s) \big). \tag{21} \]

3. Canonical Quantization

So far we have discussed the geometrical structures associated to classical physics, symplectic geometry. The passage to the quantum theory can be made most directly by an algebraic method: find an irreducible representation for the above commutation relations of the moment maps. In the infinite dimensional case, the central extension will play an important role in this story. Let us start with operators satisfying fermionic anticommutation relations:

\[ [\chi^\alpha(p), \chi^\dagger_\beta(q)]_+ = \delta(p - q)\,\delta^\alpha_\beta; \tag{22} \]

all the other anticommutators vanish. (We have introduced an extra label (“color”) $\alpha$ taking values $1, \cdots, N_c$ in order to get the most general interesting representations of the algebra.) Here, $\chi^\alpha(p)$ and $\chi^\dagger_\alpha(p)$ are to be thought of as operator valued distributions. The fermionic Fock space $\mathcal{F}$ can be built from the vacuum state $|0\rangle$, defined by $\chi^\alpha(p)|0\rangle = 0$ for $p \ge 0$ and $\chi^\dagger_\alpha(p)|0\rangle = 0$ for $p < 0$, by the action of the $\chi^\alpha(p)$'s and $\chi^\dagger_\alpha(q)$'s. We define the hermitian operators acting on the Fock space $\mathcal{F}$: $\hat{M}(p, q) = -\frac{2}{N_c} \sum_\alpha : \chi^\dagger_\alpha(p)\, \chi^\alpha(q) :$. As is common in quantum field theory, we define the normal ordering of pairs of fermionic operators by

\[ : \chi^\dagger_\alpha(p)\, \chi^\alpha(q) : \;=\; \begin{cases} \chi^\dagger_\alpha(p)\, \chi^\alpha(q) & \text{if } p \ge 0, \\ -\chi^\alpha(q)\, \chi^\dagger_\alpha(p) & \text{if } p < 0. \end{cases} \tag{23} \]

$^1$ To be precise we must use a countable basis; this can be done by using functions on the circle and letting the radius go to infinity.


Let us pause to explain why such an ordering rule is necessary. In order to represent the Lie algebra of the restricted unitary group, $\int \hat{M}(p, q)\, u(p, q)\,[dp\, dq]$ must be well-defined on the Fock space, where $u \in \begin{pmatrix} B & I_2 \\ I_2 & B \end{pmatrix}$. Now the vacuum expectation value $\langle 0 | \int \hat{M}(p, q)\, u(p, q)\,[dp\, dq]\, | 0 \rangle$ would have diverged if we had not used the normal ordered product. With the normal ordered product, this expectation value vanishes. Moreover, the norm of the state $\| \int \hat{M}(p, q)\, u(p, q)\,[dp\, dq]\, | 0 \rangle \|^2$ is finite since only the off-diagonal, Hilbert–Schmidt, part of $u$ contributes to it. This is indeed why we imposed the convergence conditions in the definition of the restricted unitary group. After some algebra one can check that

\[ [\hat{M}(p, q), \hat{M}(r, s)] = \frac{2}{N_c} \big\{ \hat{M}(r, q)\,\delta(p - s) - \hat{M}(p, s)\,\delta(r - q) + (\mathrm{sgn}(r) - \mathrm{sgn}(s))\,\delta(r - q)\,\delta(p - s) \big\}. \tag{24} \]

We see that if we identify $\epsilon(r, q) = \delta(r - q)\,\mathrm{sgn}(r)$ and $\hbar = \frac{2}{N_c}$ we obtain a unitary representation of the central extension of $U_1(H)$ in the fermionic Fock space. (The same central extension was obtained in the mathematics literature by Kac and Peterson [21].)

We introduce, in a similar fashion, operators satisfying the bosonic commutation relations,

\[ [a^\beta(p), a^\dagger_\alpha(s)] = \mathrm{sgn}(p)\,\delta(p - s)\,\delta^\beta_\alpha; \tag{25} \]

where $\alpha, \beta = 1, \ldots, N_c$ and all the other commutators vanish. The bosonic Fock space is constructed from a vacuum state, defined through $a^\alpha(p)|0\rangle = 0$ for $p \ge 0$ and $a^\dagger_\alpha(p)|0\rangle = 0$ for $p < 0$, by applying the operators $a^\alpha(p)$ and $a^\dagger_\alpha(p)$. We introduce operators $\hat{\tilde{M}}(p, q) = \frac{2}{N_c} \sum_\alpha : a^\dagger_\alpha(p)\, a^\alpha(q) :$ with the normal ordering prescription given by

\[ : a^\dagger_\alpha(p)\, a^\alpha(q) : \;=\; \begin{cases} a^\dagger_\alpha(p)\, a^\alpha(q) & \text{if } p \ge 0, \\ a^\alpha(q)\, a^\dagger_\alpha(p) & \text{if } p < 0. \end{cases} \tag{26} \]

We can calculate the commutation relations;

\[ [\hat{\tilde{M}}(p, q), \hat{\tilde{M}}(r, s)] = \frac{2}{N_c} \big\{ \hat{\tilde{M}}(r, q)\,\epsilon(p, s) - \hat{\tilde{M}}(p, s)\,\epsilon(r, q) + \epsilon(p, s)\,\delta(r - q) - \epsilon(r, q)\,\delta(p - s) \big\}, \tag{27} \]

where we identified $\epsilon(p, s) = -\mathrm{sgn}(p)\,\delta(p - s)$. Note that here $\hat{\tilde{M}}(p, q)^\dagger = \hat{\tilde{M}}(q, p)$. This shows that there is a quantization of the Poisson brackets on the bosonic Hilbert space.

As they stand, these will not be irreducible representations: the particle number $\hat{N}$ and the ‘color’ operators $\hat{Q}^\alpha_\beta$, which generate an $SU(N_c)$ symmetry, commute with them. We can check that for the fermionic operators, $\hat{N} = \int : \chi^\dagger_\alpha(p)\, \chi^\alpha(p) : [dp]$ and $\hat{Q}^\alpha_\beta = \int \big( : \chi^\dagger_\beta(p)\, \chi^\alpha(p) : - \frac{1}{N_c}\,\delta^\alpha_\beta : \chi^\dagger_\gamma(p)\, \chi^\gamma(p) : \big) [dp]$. To obtain an irreducible representation, we need to fix the color number to zero and the fermion number to a fixed finite value. Similarly, in the bosonic Fock space $\int : a^\dagger_\alpha(p)\, a^\alpha(q) : \epsilon(q, p)\,[dp\, dq]$ has to be fixed to a finite value and the color operator $\hat{Q}^\alpha_\beta = \int \big( : a^\dagger_\beta(p)\, a^\alpha(q) : - \frac{1}{N_c}\,\delta^\alpha_\beta : a^\dagger_\gamma(p)\, a^\gamma(q) : \big)\, \epsilon(q, p)\,[dp\, dq]$ will be put equal to zero.


4. Geometric Quantization

Although it is possible to use this algebraic method to complete the quantization of the system, many ideas are much clearer in the geometric method. We will start by describing classical mechanics in geometric terms [2, 1]. Let us assume that the phase space, $\Gamma$, is a smooth manifold on which a closed and non-degenerate 2-form $\omega$ is defined. The observables of a classical system are smooth functions on the phase space. Since $\omega$ is non-degenerate, given a smooth function $f$ on $\Gamma$ we can find a vector field $V_f$ generated by $f$,

\[ -df = i_{V_f}\, \omega. \tag{28} \]

This allows us to define the Poisson brackets of two functions $f_1$ and $f_2$,

\[ \{ f_1, f_2 \} = \omega(V_{f_1}, V_{f_2}). \tag{29} \]

Time evolution for the observables is defined if we choose a Hamiltonian function $E$,

\[ \frac{df}{dt} = \{ f, E \}. \tag{30} \]
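The conventions (28)–(30) can be made concrete on the simplest phase space. The following is a minimal sketch, assuming $\Gamma = \mathbb{R}^2$ with coordinates $(q, p)$ and $\omega = dq \wedge dp$ (a choice made here for illustration, not taken from the text); derivatives are approximated by central differences and all function names are hypothetical.

```python
def partials(f, q, p, h=1e-5):
    """Central-difference approximations to (df/dq, df/dp) at the point (q, p)."""
    fq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    fp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    return fq, fp


def hamiltonian_vector_field(f, q, p):
    """Solve -df = i_V omega for omega = dq ^ dp, giving V = (-df/dp, df/dq)."""
    fq, fp = partials(f, q, p)
    return (-fp, fq)


def omega(v, w):
    """Evaluate omega = dq ^ dp on two tangent vectors v = (v_q, v_p), w = (w_q, w_p)."""
    return v[0] * w[1] - v[1] * w[0]


def poisson(f1, f2, q, p):
    """Poisson bracket (29): {f1, f2} = omega(V_f1, V_f2)."""
    return omega(hamiltonian_vector_field(f1, q, p),
                 hamiltonian_vector_field(f2, q, p))


def coord_q(q, p):
    return q


def coord_p(q, p):
    return p


def energy(q, p):
    """Harmonic oscillator Hamiltonian, used only as an example."""
    return 0.5 * (p * p + q * q)


q0, p0 = 0.3, -1.2
print(poisson(coord_q, coord_p, q0, p0))  # canonical bracket {q, p}
print(poisson(coord_q, energy, q0, p0))   # dq/dt from (30)
print(poisson(coord_p, energy, q0, p0))   # dp/dt from (30)
```

With this choice of $\omega$ the bracket (29) reduces to the familiar $\partial_q f_1\, \partial_p f_2 - \partial_p f_1\, \partial_q f_2$, so (30) reproduces Hamilton's equations $\dot{q} = \partial_p E$, $\dot{p} = -\partial_q E$.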

Quantization of a given classical system requires finding an appropriate Hilbert space such that the functions on phase space are replaced by self-adjoint operators acting on this Hilbert space. There is no unique prescription for quantization. The first step in geometric quantization is to find a pre-quantum Hilbert space [3, 19, 27]. If the symplectic form $\frac{-i\omega}{\hbar}$ belongs to the integral cohomology $H^2(\Gamma, \mathbb{Z})$, one can construct a line bundle on $\Gamma$ with Chern class $\frac{-i\omega}{\hbar}$. There is then a connection on this line bundle with curvature given by $\frac{-i\omega}{\hbar}$ [37, 30]. Here $\hbar$ denotes the quantization parameter, $\hbar \to 0$ being the classical limit. Square integrable sections of this line bundle provide the “prequantum Hilbert space” $H_{Pre}$. If we denote the correspondence between real functions and self-adjoint operators acting on this Hilbert space by $f \mapsto \tilde{f}$, then we require

\[ \widetilde{\{ f_1, f_2 \}} = \frac{i}{\hbar}\, [\tilde{f}_1, \tilde{f}_2]. \tag{31} \]

It is possible to realize the above representation of the Poisson algebra by using the connection on $H_{Pre}$, and the answer is given by

\[ \tilde{f} = -i\hbar\, \nabla_{V_f} + f, \tag{32} \]

where $V_f$ denotes the vector field generated by $f$. The mathematical disadvantage of prequantization is that this representation of the Poisson algebra is highly reducible. Intuitively the wave functions depend on both momenta and coordinates, which is incompatible with the uncertainty principle. To remedy this we can restrict to a subspace of sections which are independent of “half” of the degrees of freedom. This procedure is called picking a polarization. In this article our applications will be on homogeneous Kähler spaces for which there is a natural choice of polarization, the complex polarization; thus we will only talk about this special case. (Quantization on Kähler manifolds was first considered in detail by Berezin, using symbols of operators [5, 6, 7].) We require $\nabla_{\bar{i}}\, \psi = 0$ on sections as our choice of polarization, where $\bar{i}$ denotes the antiholomorphic coordinates. The integrability condition for holomorphic sections of line bundles is given by

\[ [\nabla_{\bar{i}}, \nabla_{\bar{j}}]\, \psi = 0. \tag{33} \]


One can split the tangent and cotangent spaces into tensors of type $(r, s)$ according to the occurrence of holomorphic and antiholomorphic components [25]. This expression is equal to the $(0, 2)$ component of the curvature and in our case is proportional to $\omega$. If the symplectic form $\omega$ is a multiple of the Kähler form, it is of type $(1, 1)$ and the above condition is satisfied. This is only a local necessary condition; the existence of globally holomorphic sections is a harder question. We will in fact construct holomorphic sections in our examples. This way we can reduce the size of our Hilbert space and restrict ourselves to holomorphic sections. We define the quantum Hilbert space $H_Q$ as the space of holomorphic sections of the prequantum line bundle. There is a projection $K$ from $H_{Pre}$ into $H_Q$ and a given prequantum operator $\tilde{f}$ can be projected to an operator $\hat{f} = K \tilde{f} K$ acting on $H_Q$. Although the operators $\hat{f}$ act on the correct Hilbert space, they no longer satisfy Eq. (31) in general. We will, in fact, use another prescription to get the quantum operator, which differs from this one by higher order terms. In the language of geometric quantization, we will obtain the quantum operators corresponding to the moment maps and also see that the above commutation relations correspond to the unitary representations of the central extensions of the unitary and pseudo-unitary Lie algebras.

We have seen that $D_1$ is a contractible complex manifold. The prequantum line bundle has a connection on it; we choose the same expression as in the finite dimensional case, given by

\[ \Theta = \frac{1}{\hbar} \big( \mathrm{Tr}\,(1 - Z^\dagger Z)^{-1}\, dZ^\dagger Z - \mathrm{Tr}\,(1 - Z^\dagger Z)^{-1} Z^\dagger\, dZ \big). \tag{34} \]

Note that all the traces and inverses are well-defined here. It also satisfies $d\Theta = -\frac{i}{\hbar}\, \omega$; in the above coordinate system $\omega = -2i\, \partial_Z \partial_{Z^\dagger} \log\det(1 - Z^\dagger Z)$. Since $Z \in I_2$ the expression inside the determinant is of type $1 + I_1$ and hence is well-defined. We use the homogeneity of $\omega$ and the equality of the two expressions at $\Phi = \epsilon$, or $Z = 0$, to fix the normalization. The quantum Hilbert space is given, as before, by the holomorphicity requirement

\[ H_Q = \{ \psi \mid \nabla_{Z^\dagger}\, \psi = 0,\ \psi \in H_{Pre} \}, \tag{35} \]

which has solutions $\psi = \det{}^{\frac{1}{\hbar}}(1 - Z^\dagger Z)\, \Psi(Z)$, where $\Psi(Z)$ is an ordinary holomorphic function of the variables $Z$. It is possible to establish an inner product on this space (and by completion, turn it into a Hilbert space) following ideas of G. Segal [34]. Alternatively, we can establish an inner product by Pickrell’s measure on the Grassmannian [32], which should also have a counterpart on the Disc. We will not address this issue in this paper. Of course in the finite dimensional cases, standard measures exist which can be used to construct the Hilbert spaces.

We will obtain the action of the prequantum operators corresponding to the moment maps on H. Suppose that $f_{-u}$, where $u \in \underline{U}_1(H_-, H_+)$, is the moment map and $V_u = V_u^Z\, \partial_Z + V_u^{Z^\dagger}\, \partial_{Z^\dagger}$ is the vector field generated by it. We calculate the action of $\tilde{f}_{-u}$ on the wave function $\psi = \det{}^{\frac{1}{\hbar}}(1 - Z^\dagger Z)\, \Psi(Z)$. We note that the action of $V_u$ on $Z$ is given by $L_{V_u} Z = \alpha Z + \beta - Z \gamma Z - Z \delta$, where

\[ u = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \tag{36} \]

is the decomposition of $u$ into block form, $\alpha^\dagger = -\alpha$, $\beta^\dagger = \gamma$ and $\delta^\dagger = -\delta$, and further $\gamma, \beta \in I_2$. If we use the moment map as $\mathrm{Tr}_\epsilon\, u M$ we can write the action explicitly as


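\[ \begin{aligned} \tilde{f}_{-u}\, \psi ={}& -i\hbar \Big[ -\frac{2}{\hbar}\, \mathrm{Tr}\,(1 - Z^\dagger Z)^{-1} Z^\dagger (\alpha Z + \beta - Z \gamma Z - Z \delta) \Big] \psi \\ &+ 2i \big[ \mathrm{Tr}\, \alpha \big( 1 - (1 - ZZ^\dagger)^{-1} \big) - \mathrm{Tr}\, \beta Z^\dagger (1 - ZZ^\dagger)^{-1} + \mathrm{Tr}\, \gamma (1 - ZZ^\dagger)^{-1} Z + \mathrm{Tr}\, \delta Z^\dagger (1 - ZZ^\dagger)^{-1} Z \big] \psi \\ &- i\hbar\, \det{}^{\frac{1}{\hbar}}(1 - Z^\dagger Z)\, L_{V_u} \Psi(Z). \end{aligned} \tag{37} \]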

Note that because of the vacuum subtraction, all the traces in the above expression are well-defined. A careful calculation is needed to verify that at each step all the traces are convergent. This can indeed be done with the result

\[ \tilde{f}_{-u}\, \psi(Z, Z^\dagger) = \det{}^{\frac{1}{\hbar}}(1 - Z^\dagger Z)\, (-i\hbar) \Big[ L_{V_u} \Psi(Z) - \frac{2}{\hbar}\, \mathrm{Tr}(\gamma Z)\, \Psi \Big]. \tag{38} \]

This differs from the finite dimensional case by a constant term. We see that the trace in this formula is finite. Furthermore, the changes introduced are all holomorphic. This shows that the action of the moment maps in fact preserves the holomorphicity requirement. One can define a representation of the central extension of $\underline{U}_1(H_-, H_+)$ on the space of holomorphic functions generated by $\tilde{f}_{-u}$,

\[ \tilde{f}_{-u}\, \Psi = -i\hbar \Big[ L_{V_u} \Psi - \frac{2}{\hbar}\, \mathrm{Tr}(\gamma Z)\, \Psi \Big]. \tag{39} \]

This can be exponentiated to a group action given by

\[ \rho(g^{-1})\, \Psi = \det{}^{-\frac{2}{\hbar}}(d^{-1} c Z + 1)\; \Psi\big( (aZ + b)(cZ + d)^{-1} \big). \tag{40} \]
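As a consistency check, one can expand (40) to first order around the identity and compare with (39). The sketch below assumes $g = 1 + t u + O(t^2)$ with $u$ in the block form (36), and uses $\det(1 + tX) = 1 + t\, \mathrm{Tr}\, X + O(t^2)$ for trace-class $X$; the parameter $t$ is introduced here only for the expansion.

```latex
\begin{align*}
a &= 1 + t\alpha, \quad b = t\beta, \quad c = t\gamma, \quad d = 1 + t\delta
  \qquad (\text{up to } O(t^2)), \\
g \circ Z &= \big( Z + t(\alpha Z + \beta) \big) \big( 1 + t(\gamma Z + \delta) \big)^{-1}
  = Z + t\, (\alpha Z + \beta - Z \gamma Z - Z \delta) + O(t^2)
  = Z + t\, L_{V_u} Z + O(t^2), \\
\det{}^{-\frac{2}{\hbar}} (d^{-1} c Z + 1) &= \det{}^{-\frac{2}{\hbar}} \big( 1 + t \gamma Z + O(t^2) \big)
  = 1 - \frac{2t}{\hbar}\, \mathrm{Tr}(\gamma Z) + O(t^2), \\
\frac{d}{dt}\, \rho(g^{-1}) \Psi \Big|_{t = 0} &= L_{V_u} \Psi - \frac{2}{\hbar}\, \mathrm{Tr}(\gamma Z)\, \Psi
  = \frac{i}{\hbar}\, \tilde{f}_{-u}\, \Psi .
\end{align*}
```

The first-order term of the fractional linear action reproduces $L_{V_u} Z$, and the determinant prefactor supplies exactly the trace term of (39).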

In finite dimensions this action gives a representation of the group, but due to the vacuum subtraction it is not clear that the above action gives a representation. In fact if we calculate $\rho(g_2^{-1})\, \rho(g_1^{-1})\, \Psi(Z)$ and compare this with $\rho((g_1 g_2)^{-1})\, \Psi(Z)$, after some calculations, we see that they differ by a factor

\[ c(g_1, g_2) = \det{}^{\frac{2}{\hbar}} \big[ (d_1 d_2)^{-1} c_1 b_2 + 1 \big]. \tag{41} \]

This is well defined since $c_1 b_2 \in I_1$. We will see that the representation we have corresponds to the representation of the central extension of $U_1(H_-, H_+)$. Let us recall that a central extension of a group $G$ by $\mathbb{C}^*$ is given by the following exact sequence:

\[ 1 \to \mathbb{C}^* \xrightarrow{\;i\;} \hat{G} \xrightarrow{\;\pi\;} G \to 1, \tag{42} \]

where $i$ and $\pi$ are group homomorphisms. The extension can be nontrivial algebraically and also topologically, if the above sequence also generates a nontrivial principal fiber bundle. If the extension is topologically trivial, as in the case of $U_1(H_-, H_+)$, there is an equivalent description. In this case we can find a globally defined map $s : G \to \hat{G}$ such that $\pi(s(g)) = g$. Note that $\hat{G} \approx G \times \mathbb{C}^*$ as a topological space, and $\mathbb{C}^*$ is identified with its image in the center of $\hat{G}$. Thus we can think of $s$ as $s(g) = (g, \lambda(g))$, and see that $s(g_1)\, s(g_2) = (g_1 g_2, \lambda(g_1) \lambda(g_2)) = (g_1 g_2, c^{-1}(g_1, g_2)\, \lambda(g_1 g_2))$; here $c$ is a map $c : G \times G \to \mathbb{C}^*$ which measures how much $s$ differs from a group homomorphism. For this $c$ must satisfy the so-called co-cycle condition, $c(g_1 g_2, g_3)\, c(g_1, g_2) = c(g_1, g_2 g_3)\, c(g_2, g_3)$. An extension will be algebraically non-trivial if there is no function $\phi : G \to \mathbb{C}^*$ such that $c(g_1, g_2) = \phi(g_1)\, \phi(g_2)\, (\phi(g_1 g_2))^{-1}$. We can see that our formula for $c$ satisfies the cocycle condition:


\[ c(g_1 g_2, g_3)\, c(g_1, g_2) = c(g_1, g_2 g_3)\, c(g_2, g_3) = \det{}^{\frac{2}{\hbar}} \big[ (d_1 d_2 d_3)^{-1} c_1 a_2 b_3 + (d_2 d_3)^{-1} c_2 b_3 + (d_1 d_2)^{-1} c_1 b_2 + 1 \big]. \tag{43} \]
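The cocycle formulas (41) and (43) can be checked numerically in the smallest case, where $H_\pm$ are one-dimensional so that $a, b, c, d$ and $Z$ are complex numbers and the determinants reduce to ordinary powers. This is only a sketch under that assumption; $2/\hbar$ is taken to be the integer `K = 4` so that the powers are single-valued, and the helper names and the parametrization of $g$ are hypothetical.

```python
import cmath
import math


def pseudo_unitary(t, alpha, delta):
    """Hypothetical parametrization of g = (a, b, c, d) with g eps g^dagger = eps."""
    ea, ed = cmath.exp(1j * alpha), cmath.exp(1j * delta)
    return (math.cosh(t) * ea, math.sinh(t) * ed,
            math.sinh(t) * ea, math.cosh(t) * ed)


def mul(g1, g2):
    """Matrix product of two 2x2 matrices given as (a, b, c, d)."""
    a1, b1, c1, d1 = g1
    a2, b2, c2, d2 = g2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def act(g, z):
    """Fractional linear action (3) for 1x1 blocks."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)


def rho_inv(g, psi, k):
    """Eq. (40) for 1x1 blocks: Z -> (cZ/d + 1)^(-k) psi(g o Z), with k = 2/hbar."""
    a, b, c, d = g
    return lambda z: (c * z / d + 1) ** (-k) * psi(act(g, z))


def cocycle(g1, g2, k):
    """Eq. (41) for 1x1 blocks: c(g1, g2) = (c1 b2 / (d1 d2) + 1)^k."""
    return (g1[2] * g2[1] / (g1[3] * g2[3]) + 1) ** k


def phi(g, k):
    """In finite dimensions c is a coboundary of a power of det(d); here d^(-k)."""
    return g[3] ** (-k)


def psi(z):
    """An arbitrary holomorphic test function."""
    return cmath.exp(z) + z ** 2


K = 4  # assumed integer value of 2/hbar
g1 = pseudo_unitary(0.5, 0.2, 1.0)
g2 = pseudo_unitary(0.9, -0.7, 0.4)
g3 = pseudo_unitary(0.3, 1.3, -0.6)
z = 0.2 - 0.1j

lhs = cocycle(mul(g1, g2), g3, K) * cocycle(g1, g2, K)
rhs = cocycle(g1, mul(g2, g3), K) * cocycle(g2, g3, K)
composed = rho_inv(g2, rho_inv(g1, psi, K), K)(z)
direct = rho_inv(mul(g1, g2), psi, K)(z)

print(cmath.isclose(lhs, rhs, rel_tol=1e-9))                               # cocycle condition
print(cmath.isclose(direct, cocycle(g1, g2, K) * composed, rel_tol=1e-9))  # factor (41)
print(cmath.isclose(cocycle(g1, g2, K),
                    phi(g1, K) * phi(g2, K) / phi(mul(g1, g2), K),
                    rel_tol=1e-9))                                         # trivial in finite dim
```

In this finite dimensional toy case the cocycle is a coboundary, so the extension is trivial; the point made in the text is that the corresponding $\phi$ built from $\det(d)$ no longer exists in infinite dimensions.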

Note that the expression inside the square brackets in (43) is of type $1 + I_1$, hence the determinant makes sense. The cocycle $c$, in the finite dimensional case, can be obtained from $\phi(g) = \det(d)$. This is not well-defined in infinite dimensions; in fact, the extension is nontrivial. Thus, we obtain a representation of a central extension $\hat{U}_1(H_-, H_+)$ in the quantum Hilbert space of holomorphic sections.

Now we continue with the case of $Gr_1$. It turns out to be better to view the Grassmannian as the coset of yet another pair of groups. This is because the extension is nontrivial both topologically and algebraically. The cocycle does not exist as a continuous function. Essentially, we will enlarge “numerator” and “denominator” by the same amount so that the quotient is still the Grassmannian. In terms of these larger groups, we can find an explicit description of the cocycle. Let us consider first the picture in terms of unitary Lie groups. Define

\[ \tilde{E}_1 = \{ (g, q) \mid g \in U_1(H);\ q \in U(H_-);\ g_{11} q^{-1} - 1 \in I_1 \}, \tag{44} \]

where we use the same decomposition as in (8). Group multiplication is just the pairwise product:

\[ (g, q)(g', q') = (g g', q q'). \tag{45} \]

Define also

\[ \tilde{F}_1 = \{ (b, q) \mid b \in U(H_-) \times U(H_+);\ q \in U(H_-);\ b_{11} q^{-1} - 1 \in I_1 \}, \tag{46} \]

so that the quotient remains the same:

$$Gr_1 = \tilde E_1 / \tilde F_1. \tag{47}$$

Now, we can construct a line bundle over $Gr_1$ starting with a representation of $\tilde F_1$ on $\mathbb{C}$. We choose the representation

$$\rho(b, q) = {\det}^{N_c}[b_{11} q^{-1}]. \tag{48}$$

The determinant exists due to the condition $b_{11} q^{-1} - 1 \in I_1$. Now we see the reason for enlarging $GL_1$ and $B_1$. The determinant $\det b_{11}$ does not exist in general; we need to factor out a part of it. This is the role of the operator $q$. The line bundle $\tilde L_\rho$ is defined as

$$\tilde L_\rho = (\tilde E_1 \times_\rho \mathbb{C})/\tilde F_1. \tag{49}$$

A section of this line bundle is then a function

$$\psi : \tilde E_1 \to \mathbb{C} \tag{50}$$

such that

$$\psi(gb, qr) = \rho(b, r)\,\psi(g, q). \tag{51}$$

An example would be the function

$$\psi_0(g, q) = {\det}^{N_c}[g_{11} q^{-1}]. \tag{52}$$

Of course if we were to restrict to finite dimensions, the dependence of ψ on the additional variable q is just an overall factor of det q −1 due to the equivariance condition. Thus L˜ ρ can be identified as the pre-quantum line bundle.


S. G. Rajeev, O. T. Turgut

This new point of view on $Gr_1$ forces us to re-examine the classical theory. It is best to view $Gr_1$ as a coset space of $\tilde E_1$ there as well. The action of $\tilde E_1$ on $Gr_1$ is just

$$(g, q): \Phi \mapsto g \Phi g^{-1}; \tag{53}$$

the new variable $q$ has no effect on $\Phi$. However, the moment maps are now defined in terms of the Lie algebra of $\tilde E_1$. Consider the function

$$f_{(u,r)}(\Phi) = 2\,\mathrm{Tr}\left[\frac{1-\Phi}{2}\, u\, \frac{1-\Phi}{2} - \begin{pmatrix} r & 0 \\ 0 & 0 \end{pmatrix}\right]. \tag{54}$$

In the finite dimensional case, this will be just the previous moment map except for a piece independent of $\Phi$. We can view this as a “vacuum subtraction” of the moment map. If the operators are decomposed into $2 \times 2$ blocks, we can check that the conditional trace exists. We can also view this as a function on the group $\tilde E_1$ invariant under $\tilde F_1$:

$$f_{(u,r)}(g, q) = 2\,\mathrm{Tr}\left[\frac{1-\epsilon}{2}\, g^\dagger u g\, \frac{1-\epsilon}{2} - q^{-1} r q\right]. \tag{55}$$

It takes a careful calculation to check that this is in fact invariant under $\tilde F_1$. This moment map induces the Hamiltonian vector field generating the infinitesimal action of $\tilde E_1$:

$$-df_{(u,r)} = \omega(V_{f_{(u,r)}}, \cdot\,), \qquad V_{f_{(u,r)}} = i[u, \Phi]. \tag{56}$$

Now we can proceed with finding the pre-quantum operators. We start by looking for a connection on the principal bundle

$$\tilde F_1 \to \tilde E_1 \to Gr_1. \tag{57}$$

In finite dimensions there is a connection on this principal bundle induced from an invariant metric. In general it is not possible to use an invariant metric on the total space to find the connection. The traces in the inner product, which is used in finite dimensions, will diverge. Therefore, we instead postulate an expression for the connection one-form and check that it indeed satisfies the necessary conditions [25]. We can regard a vector field $Y = (X, T)$ on $\tilde E_1$ as an ordered pair of operator-valued functions on $\tilde E_1$ generating a left action,

$$Y(g, q) = (X(g, q), T(g, q)). \tag{58}$$

We will have $X_{11}(g, q) - T(g, q) \in I_1$. Now, define the connection using the right action,

$$\Omega(Y)(g, q) = \Big(\frac{i}{4}\big[\epsilon, [\epsilon, g^\dagger X(g, q) g]_+\big]_+,\ q^{-1} T(g, q) q\Big). \tag{59}$$

This is valued in the Lie algebra of $\tilde F_1$. The first component agrees with the expression for the connection one-form in finite dimensions coming from the invariant inner product on the total space. The second component is chosen in such a way that $\Omega$ satisfies

$$\Omega(V_{(u,r)}) = (u, r) \tag{60}$$

on vertical vector fields. This will allow us to define the covariant derivatives acting on the sections of L˜ ρ . We will obtain the prequantum operators, corresponding to the


moment maps we constructed, acting on $H_{Pre}$. We define the covariant derivative along $Y(g, q) = (X, T)$,

$$\nabla_{(X,T)}\psi(g, q) = L_{(X,T)}\psi(g, q) - \rho(\Omega[(X, T)])\,\psi(g, q), \tag{61}$$

where $\rho$ refers to the infinitesimal form of the representation $\rho$. This expression reduces to the covariant derivative we would obtain in the finite dimensional case. Let us write down the prequantum operator corresponding to a moment map:

$$\hat f_{(u,r)}\psi = -i\hbar\nabla_{V_{(u,r)}}\psi + f_{(u,r)}\psi = -i\hbar L_{(u,r)}\psi - \hbar N_c\,\mathrm{Tr}\left[\frac{1-\epsilon}{2}\, g^\dagger u g\, \frac{1-\epsilon}{2} - q^{-1} r q\right]\psi + 2\,\mathrm{Tr}\left[\frac{1-\epsilon}{2}\, g^\dagger u g\, \frac{1-\epsilon}{2} - q^{-1} r q\right]\psi; \tag{62}$$

here the Lie derivative is defined through

$$L_{(u,r)}\psi(g, q) = \lim_{t\to 0}\frac{\psi((1+itu)g, (1+itr)q) - \psi(g, q)}{t}$$

and preserves the determinant condition since $u - r \in I_1$. Since $\hbar N_c = 2$ the last two terms cancel out and we end up with

$$\hat f_{(u,r)}\psi(g, q) = -i\hbar L_{(u,r)}\psi(g, q). \tag{63}$$

These operators provide a representation of the central extension of the Lie algebra of $U_1(H)$, as we will see.

We introduce the quantum Hilbert space $H_Q$ using holomorphicity. First, we will show that the Grassmannian is a complex manifold. The complexification of $U_1(H)$ is given by

$$GL_1(H) = \{\gamma \mid \gamma \in GL(H),\ [\epsilon, \gamma] \in I_2\}. \tag{64}$$

We define the closed complex subgroup $B_1$ of $GL_1(H)$ as the set of $\beta$'s such that $\beta$ has the decomposition into the block form

$$\beta = \begin{pmatrix} \beta_{11} & \beta_{12} \\ 0 & \beta_{22} \end{pmatrix},$$

where $\beta_{11} : H_- \to H_-$ and similarly for the others. Using the same argument as in finite dimensions, we see that the Grassmannian is a complex homogeneous manifold given by

$$Gr_1 = GL_1(H)/B_1. \tag{65}$$

Even though we have a complex structure, because of the divergences we need to extend the complex general linear group $GL_1(H)$ to another group $\tilde G_1$ in order to define holomorphic line bundles:

$$\tilde G_1 = \{(\gamma, q) \mid q \in GL(H_-);\ \gamma \in GL_1(H);\ \gamma_{11} q^{-1} - 1 \in I_1\}. \tag{66}$$

Here, similar to the previous cases, $\gamma_{11}$ denotes the mapping $\gamma_{11} : H_- \to H_-$ in the block form of the matrix $\gamma \in GL_1(H)$. $\tilde G_1$ is a complex Banach-Lie group under the multiplication $(\gamma, q)(\gamma', q') = (\gamma\gamma', qq')$. We introduce $\tilde B_1$, a closed complex subgroup of $\tilde G_1$, as

$$\tilde B_1 = \{(\beta, t) \mid \beta \in B_1,\ t \in GL(H_-),\ \beta_{11} t^{-1} - 1 \in I_1\}. \tag{67}$$

There is an action of $\tilde B_1$ on $\tilde G_1$, and this action is holomorphic too. We enlarged $GL_1(H)$ and $B_1$ with the same set of elements, thus the quotient is still the same:

$$\tilde B_1 \to \tilde G_1 \to Gr_1. \tag{68}$$


Now, we can introduce the holomorphic line bundle corresponding to the representation $\rho(\beta, r) = {\det}^{N_c}(\beta_{11} r^{-1})$. Since $\beta_{11}$ is invertible it is enough to require $N_c$ to be an integer for the holomorphicity. We can denote the associated line bundle as $(\tilde G_1 \times_\rho \mathbb{C})/\tilde B_1$. A section of this line bundle can be identified with equivariant functions

$$\psi : \tilde G_1 \to \mathbb{C} \quad \text{such that} \quad \psi(\gamma\beta, qr) = \rho(\beta, r)\,\psi(\gamma, q). \tag{69}$$

One such function is $\psi_0(\gamma, q) = {\det}^{N_c}[\gamma_{11} q^{-1}]$. We see that for this function to be globally holomorphic we need to restrict ourselves to positive values of $N_c$. We can write down a general expression for the holomorphic sections; imagine that we decompose $\gamma$ as

$$\gamma = \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix}. \tag{70}$$

Assume that we label the rows of $\gamma_{11}$ in some basis as $(0, -1, -2, \ldots, -k, \ldots)$ and also the rows of $\gamma_{21}$ as $(1, 2, \ldots, k, \ldots)$. We define a matrix $\gamma_A$ which consists of rows of $\gamma_{11}$ and $\gamma_{21}$. Here $A$ denotes the rows we pick, as a sequence $(a_1, a_2, \ldots, a_k, \ldots)$ that differs from $(0, -1, -2, \ldots, -k, \ldots)$ only for finitely many $a_i$'s. If we think of $\gamma_A$ as $\gamma_A : H_- \to H$ and extend $\gamma_{11}$ trivially to a map from $H_-$ to $H$, then $\gamma_A - \gamma_{11}$ is a finite rank operator. Because $\gamma_A = \gamma_{11} + R_A$, where $R_A$ is the finite rank piece, $\gamma_A q^{-1} - 1$ is also in $I_1$. This implies that the determinant $\det(\gamma_A q^{-1})$ is still well-defined. If we consider a set of such sequences $A_1, A_2, \ldots, A_p$, we can construct a general solution:

$$\psi(\gamma, q) = {\det}^{w_1}(\gamma_{A_1} q^{-1})\,{\det}^{w_2}(\gamma_{A_2} q^{-1}) \cdots {\det}^{w_p}(\gamma_{A_p} q^{-1}), \tag{71}$$

where $w_1 + w_2 + \ldots + w_p = N_c$ and they all must be positive integers for the holomorphicity. This, in turn, fixes the value of $\hbar$ to be a positive number, $\frac{2}{N_c}$. These sections are not all linearly independent, but one can find a linearly independent family among them. As we remarked before, a suitable completion of the set of such holomorphic sections constitutes the quantum Hilbert space $H_Q$.

If we look back at the formula (62), giving the action of moment maps on sections, we see that the moment maps act as Lie derivatives. When we restrict them to the holomorphic sections, they preserve the holomorphicity condition, since the Lie derivative $L_{(u,r)}$ generates the infinitesimal action on the left by constant operators $(u, r)$. It is possible to exponentiate this to a group action. Consider the left action of $(\lambda, s) \in \tilde G_1$ on the holomorphic sections: $r(\lambda, s)\psi(\gamma, q) = \psi(\lambda^{-1}\gamma, s^{-1}q)$. Since $\lambda$ and $s$ are constant matrices, the product still satisfies the holomorphicity, and because the action is on the left, equivariance is also preserved. We consider the following diagram:

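$$\begin{array}{ccccccccc}
 & & 1 & & 1 & & & & \\
 & & \downarrow & & \downarrow & & & & \\
 & & SL_1 & \to & \tilde T_1 & & & & \\
 & & \downarrow & & \downarrow & & & & \\
1 & \to & GL_1 & \to & \tilde E_1 & \to & GL_1 & \to & 1 \\
 & & \downarrow & & \downarrow & & | & & \\
1 & \to & \mathbb{C}^* & \to & \widehat{GL}_1 & \to & GL_1 & \to & 1 \\
 & & \downarrow & & \downarrow & & & & \\
 & & 1 & & 1 & & & &
\end{array} \tag{72}$$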

In the above diagram, $GL_1$ is the subgroup of operators in $GL(H_-)$ for which the determinant exists. $SL_1$ is its subgroup of operators with determinant one. $\tilde T_1$ denotes the subgroup of $\tilde E_1$ which consists of elements of the form $(1, q)$, where the determinant of $q$ is one. The first horizontal map is the identification of the two groups. The second one is an exact sequence of groups, which comes from the fact that the second map is a natural imbedding and the third is a projection to the first factor. The first vertical sequence is exact, if we define the second map as the imbedding and the third to be the determinant. The second vertical sequence is also exact if we define $\widehat{GL}_1 = \tilde E_1/\tilde T_1$. In the last horizontal sequence, we introduce the second map in a way that makes the diagram commutative. If we combine all this information, it follows that the second horizontal sequence of maps is also exact; that is to say, we have a central extension $\widehat{GL}_1$ of the group $GL_1$. If we look at the action of $\tilde E_1$ on the sections, we note that if we multiply an element $(\lambda, s)$ with $(1, t)$, where $\det t = 1$, the representation satisfies $r(\lambda, s) = r(\lambda, st)$. This shows that the representation $r$ on the space of holomorphic sections factors through $\tilde T_1$. Hence, we have a representation of $\widehat{GL}_1$ on $H_Q$. If we restrict ourselves to the real compact form we get a representation of the central extension $\hat U_1(H)$. This is the representation generated by the moment maps.

We can check that for an arbitrary function $F(\Phi)$ on the Grassmannian the associated prequantum operator will not, in general, preserve the holomorphicity. In the infinite dimensional case, projecting to the holomorphic subspace by an operator $K$ is even more complicated. We are interested in polynomial functions of $\Phi$, or $M$; in this case we can introduce a simpler quantization scheme. We replace each $M$ in the polynomial by the quantum operator we get using the moment maps. This replacement requires an operator ordering rule, such as normal ordering, since the operators associated to different matrix elements of $M$ typically do not commute.
For quadratic functions of $M$ we can state the normal ordering rule in Fourier expansion:

$${}^\circ_\circ\, \hat M(p, q)\hat M(r, s)\, {}^\circ_\circ = \begin{cases} \hat M(r, s)\hat M(p, q) & \text{if } p < q \text{ and } r > s, \\ \hat M(p, q)\hat M(r, s) & \text{otherwise,} \end{cases} \tag{73}$$

and similarly for the variables $\hat{\tilde M}(p, q)$. This will ensure that the interaction will not change the energy by an infinite amount with respect to the free Hamiltonian. The quantum operators we obtain for the polynomial functions will continue to preserve the holomorphicity condition. Hence we have a prescription for quantizing polynomial functions of $\Phi$ through operators acting on the space of holomorphic sections. The price one pays is that the commutators of operators no longer form a representation of the Poisson algebra. The deviations are of higher order in $\hbar$, and this situation is common in quantum theories.

5. Application to Scalar QCD

In this section we will apply some of the previous ideas to scalar quantum chromodynamics. We will show that the large-$N_c$ limit of this theory is a somewhat unusual classical theory, and its phase space is the infinite dimensional Disc, $D_1$. We will obtain the equations of motion, and show that their linearization gives the analog of the 't Hooft equation in QCD, first obtained in reference [36]. The infinite dimensional Grassmannian, $Gr_1$, was shown to be the phase space of fermionic QCD in [33] by the first author. In that case also, the linearization of the equations of motion was shown to give the 't Hooft equation. The equations of motion are nonlinear in general; in the fermionic case, these equations have soliton solutions, which are interpreted as the baryons of that theory. The estimate of their masses is given through a numerical solution in [4] and by a variational argument in [33].


We start with the Hilbert space of square integrable functions $H = L^2(\mathbb{R})$. The elements of this space will be thought of as functions of a real variable which has the physical interpretation of the null component of momentum (see below). There is a decomposition of $H$ into $H_-$ and $H_+$, where $H_-$ corresponds to the positive momentum components and $H_+$ to the negative momentum components in the Fourier expansion. They can be thought of as eigensubspaces of the operator $\epsilon(p, q) = -\mathrm{sgn}(p)\delta(p, q)$. (Notice that we have a different sign convention here, as compared to the previous section.) We introduce $D_1$, the set of operators $Z : H_+ \to H_-$, to be

$$D_1 = \{Z \mid \mathrm{Tr}\, Z^\dagger Z < \infty,\ 1 - Z^\dagger Z > 0\}. \tag{74}$$

From the previous discussion we know that it is convenient to introduce the variables $\tilde M(p, q)$ through $M = \int \tilde M(p, q)\lambda(q, p)\,dp\,dq$. Here, $M$ was defined through a non-linear transformation:

$$M = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} - 2\begin{pmatrix} (1 - ZZ^\dagger)^{-1} & -(1 - ZZ^\dagger)^{-1} Z \\ Z^\dagger(1 - ZZ^\dagger)^{-1} & -Z^\dagger(1 - ZZ^\dagger)^{-1} Z \end{pmatrix}, \tag{75}$$

where the expression is written in the decomposition $H = H_- \oplus H_+$. We know that the Disc is a homogeneous symplectic complex manifold under the action of $U_1(H_-, H_+)$. We can think of it as the phase space of a dynamical system. In that case the fundamental Poisson brackets satisfied by the coordinates can be expressed through

$$\{\tilde M(p, q), \tilde M(r, s)\} = i\big(\tilde M(r, q)\epsilon(p, s) - \tilde M(p, s)\epsilon(r, q)\big) + i\big(\epsilon(p, s)\delta(r - q) - \epsilon(r, q)\delta(p - s)\big). \tag{76}$$

The above realization, using complex valued functions, has a physical interpretation in terms of the free scalar field theory. If we use the light cone coordinates, $x^\pm = \frac{x^0 \pm x^1}{\sqrt 2}$, the metric of two dimensional Minkowski space becomes $ds^2 = 2\,dx^- dx^+$. The Lorentz transformations in this language can be written as $x^- \to e^\theta x^-$ and $x^+ \to e^{-\theta} x^+$, where $\theta$ is the rapidity. Here, we consider $x^-$ to be the “space” coordinate and $x^+$ to be the “time”. That is, initial data are given on a surface $x^+ = \text{constant}$, and the equations of motion predict the evolution of the fields off this surface. $L^2(\mathbb{R}; \mathbb{C})$ is the space of complex valued functions of $x^-$, and the momentum variable refers to $p_-$. Let us write down the action of the complex scalar field with $N_c$ components in this language:

$$S = \int (\partial_-\phi^*_\alpha\,\partial_+\phi^\alpha + \partial_+\phi^*_\alpha\,\partial_-\phi^\alpha - m^2\phi^*_\alpha\phi^\alpha)\,dx^- dx^+. \tag{77}$$

We see that in the light cone formalism, the action is first order in the time variable. The phase space of the theory is the set of functions $\phi : \mathbb{R} \to \mathbb{C}^{N_c}$. Poisson brackets satisfied by the variables $\phi$ can be read off, if we think of $\partial_-$ as the symplectic form. Imposing antisymmetry, we can calculate the inverse of this operator, and obtain the relation

$$\{\phi^\alpha(x^-, x^+), \phi^*_\beta(y^-, x^+)\} = \frac{1}{2}\,\mathrm{sgn}(x^- - y^-)\,\delta^\alpha_\beta. \tag{78}$$
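The nonlinear map from $Z$ to $M$ can be sanity-checked in the smallest possible case. The sketch below assumes that $H_-$ and $H_+$ are both one dimensional, so that $Z$ is a single complex number with $|Z| < 1$, takes $\epsilon = \mathrm{diag}(-1, +1)$ in the decomposition $H_- \oplus H_+$, and reads the quadratic constraint as $M^2 + \epsilon M + M\epsilon = 0$; the function names are illustrative only.

```python
def mat_mul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))


# Sign operator in the decomposition H_- (+) H_+ (assumed: -1 on H_-, +1 on H_+).
EPSILON = ((-1, 0), (0, 1))


def disc_to_M(z):
    """One-dimensional-block version of the map Z -> M."""
    z = complex(z)
    if abs(z) >= 1:
        raise ValueError("need |z| < 1 so that 1 - Z^dagger Z > 0")
    a = 1.0 / (1.0 - abs(z) ** 2)      # (1 - Z Z^dagger)^{-1}
    zc = z.conjugate()                 # Z^dagger
    p = ((a, -a * z), (zc * a, -zc * a * z))
    # M = diag(2, 0) - 2 p
    return ((2 - 2 * p[0][0], -2 * p[0][1]),
            (-2 * p[1][0], -2 * p[1][1]))


def constraint_residual(m):
    """Largest entry (in modulus) of M^2 + eps M + M eps."""
    total = mat_add(mat_mul(m, m),
                    mat_add(mat_mul(EPSILON, m), mat_mul(m, EPSILON)))
    return max(abs(total[i][j]) for i in range(2) for j in range(2))


def trace(m):
    return m[0][0] + m[1][1]


m = disc_to_M(0.3 + 0.4j)
print(constraint_residual(m), abs(trace(m)))
```

Both printed numbers should vanish up to rounding. The point $Z = 0$ maps to $M = 0$, the vacuum configuration around which the theory is later linearized.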

The other Poisson brackets, such as $\{\phi, \phi\}$ and $\{\phi^\dagger, \phi^\dagger\}$, vanish. The Hamiltonian, which describes the null component $P_+$ of the total momentum of the field, is

$$H_0 = \frac{m^2}{2}\int \phi^*_\alpha\phi^\alpha\,dx^-. \tag{79}$$
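The light-cone conventions used in this section are easy to verify numerically. The following sketch assumes the standard two dimensional boost, with the sign of the rapidity chosen so that $x^- \to e^{\theta}x^-$ and $x^+ \to e^{-\theta}x^+$; the function names are illustrative.

```python
import math


def light_cone(x0, x1):
    """Return (x_plus, x_minus) with x^pm = (x^0 pm x^1) / sqrt(2)."""
    return (x0 + x1) / math.sqrt(2), (x0 - x1) / math.sqrt(2)


def boost(x0, x1, theta):
    """Lorentz boost; sign of theta chosen so that x^- scales by e^{+theta}."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return x0 * ch - x1 * sh, x1 * ch - x0 * sh


def interval(x0, x1):
    """Minkowski interval (x^0)^2 - (x^1)^2."""
    return x0 * x0 - x1 * x1


x0, x1, theta = 1.7, -0.4, 0.9
xp, xm = light_cone(x0, x1)
xp_b, xm_b = light_cone(*boost(x0, x1, theta))
print(xp_b / xp, math.exp(-theta))    # x^+ scales by e^{-theta}
print(xm_b / xm, math.exp(theta))     # x^- scales by e^{+theta}
print(2 * xp * xm, interval(x0, x1))  # the interval equals 2 x^+ x^-
```

Because the boost only rescales $x^+$ and $x^-$ in opposite directions, the product $x^+x^-$ is invariant, which is the content of $ds^2 = 2\,dx^-dx^+$.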


The quantization of this system would replace functions by operators and the Poisson brackets by commutators:

$$[\hat\phi^\alpha(x^-), \hat\phi^\dagger_\beta(y^-)] = -\frac{i}{2}\,\mathrm{sgn}(x^- - y^-)\,\delta^\alpha_\beta, \tag{80}$$

and the other combinations vanish. Note that the above commutator is invariant under conjugate-transpose. It is convenient to expand $\hat\phi^\alpha$ into Fourier modes; we will denote $p_-$ as $p$ for brevity:

$$\hat\phi^\alpha = \int_{-\infty}^{\infty} \frac{a^\alpha(p)}{\sqrt{2|p|}}\,e^{ipx^-}\,[dp].$$

The hermitian conjugate of this expansion will give the expansion for $\hat\phi^\dagger_\alpha$. We note that in order to satisfy the commutation relations we need to take

$$[a^\alpha(p), a^\dagger_\beta(q)] = \mathrm{sgn}(p)\,\delta(p - q)\,\delta^\alpha_\beta, \tag{81}$$

and the other commutators vanish. We can build a bosonic Fock space using the above set of operators. We define the ‘vacuum’ state $|0\rangle$ by

$$a^\dagger_\alpha(p)|0\rangle = 0 \ \text{ if } p \le 0 \quad \text{and} \quad a^\alpha(p)|0\rangle = 0 \ \text{ if } p > 0. \tag{82}$$

It is known that due to divergences we need to use a normal ordering, and we define it as follows:

$$: a^\dagger_\alpha(p) a^\beta(q) : \; = \begin{cases} a^\dagger_\alpha(p) a^\beta(q) & \text{if } p > 0, \\ a^\beta(q) a^\dagger_\alpha(p) & \text{if } p \le 0. \end{cases} \tag{83}$$

We can express the quantum Hamiltonian acting on this Fock space as

$$\hat H_0 = \frac{m^2}{2}\int \sum_\alpha : a^\dagger_\alpha(p) a^\alpha(p) : \frac{dp}{|p|}. \tag{84}$$

The equation of motion that follows from this is

$$(2\partial_+\partial_- + m^2)\hat\phi^\alpha = 0. \tag{85}$$

Let us introduce the variables $\hat{\tilde M}(p, q) = \frac{2}{N_c}\sum_\alpha : a^\dagger_\alpha(p) a^\alpha(q) :$ satisfying the commutation relations

$$[\hat{\tilde M}(p, q), \hat{\tilde M}(r, s)] = \frac{2}{N_c}\big(\hat{\tilde M}(r, q)\epsilon(p, s) - \hat{\tilde M}(p, s)\epsilon(r, q) + \epsilon(p, s)\delta(r - q) - \epsilon(r, q)\delta(p - s)\big). \tag{86}$$

These form a quantization of Eq. (76) if $\hbar = \frac{2}{N_c}$. One can also check by direct computation that this realization satisfies the quadratic constraint up to terms of order $\frac{1}{N_c}$ in the color invariant sector. This is just a special case of what we discussed in Sect. 3. We can write down the equations of motion for the free field in terms of the variable $\hat{\tilde M}(p, q)$:

$$\partial_+\hat{\tilde M}(k, l; x^+) = i\,\frac{m^2}{2}\Big[\frac{1}{l} - \frac{1}{k}\Big]\hat{\tilde M}(k, l; x^+). \tag{87}$$

In the classical limit, $N_c \to \infty$, $\hat{\tilde M}(p, q)$ tends to the classical variable $\tilde M(p, q)$, satisfying

$$\frac{\partial\tilde M(k, l; x^+)}{\partial x^+} = \frac{i}{2}\,m^2\Big[\frac{1}{l} - \frac{1}{k}\Big]\tilde M(k, l; x^+). \tag{88}$$
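Since the free equation of motion is linear and diagonal in $(k, l)$, each component simply rotates by a phase. A minimal sketch, assuming the solution $\tilde M(k, l; x^+) = \tilde M(k, l; 0)\exp\big(\tfrac{i}{2}m^2[\tfrac1l - \tfrac1k]x^+\big)$ and checking it against the equation by a central finite difference (the function names are illustrative):

```python
import cmath


def free_frequency(k, l, m):
    """Coefficient (m^2 / 2) * (1/l - 1/k) multiplying i*M in the free equation."""
    if k == 0 or l == 0:
        raise ValueError("momenta must be nonzero")
    return 0.5 * m * m * (1.0 / l - 1.0 / k)


def evolve_free(M0, k, l, m, x_plus):
    """Free evolution of a single component M(k, l) in light-cone time x^+."""
    return M0 * cmath.exp(1j * free_frequency(k, l, m) * x_plus)


k, l, m, M0 = 1.5, -0.5, 1.0, 0.2 - 0.1j
x, h = 0.7, 1e-5
derivative = (evolve_free(M0, k, l, m, x + h) - evolve_free(M0, k, l, m, x - h)) / (2 * h)
expected = 1j * free_frequency(k, l, m) * evolve_free(M0, k, l, m, x)
print(abs(derivative - expected))
```

The modulus of every component is constant in $x^+$, and the diagonal components $k = l$ do not evolve at all under the free dynamics.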


This is the same as the equation of motion we get through the Poisson brackets if we take the Hamiltonian $H_0$, a linear function of $\tilde M(p, q)$, as

$$H_0 = \frac{m^2}{2}\int \tilde M(p, q)\,\frac{\delta(p - q)}{\sqrt{|p||q|}}\,dp\,dq. \tag{89}$$

One can, in fact, motivate this choice of the Hamiltonian independently. We note that since the Poisson brackets are at equal null time $x^+$, they must be Lorentz invariant. This implies that $\tilde M(p, q) \to e^{-\theta}\tilde M(e^{-\theta}p, e^{-\theta}q)$ under Lorentz transformations. Since the Hamiltonian is the null component of the total momentum of the field, it must transform as $H_0 \to e^{\theta}H_0$. Let us write $H_0 = \int h(p, q)\tilde M(p, q)\,dp\,dq$, where we choose $h(p, q) = h(p)\delta(p - q)$ when we require translation invariance (or momentum conservation). This implies that $h(e^{-\theta}p) = e^{\theta}h(p)$ under Lorentz transformations. Since the original Poisson brackets are invariant under complex conjugation, we have $\tilde M(p, q)^* = \tilde M(q, p)$. If we impose the reality of the Hamiltonian we should have $h(p)$ real. If we also assume parity invariance, this implies that $\tilde M(p, q)$ must transform under parity as $\tilde M(p, q) \to \tilde M(-q, -p)$. Thus, parity invariance gives us $h(p) = h(-p)$, and combining all these we see that $h(p) \sim \frac{1}{|p|}$. This implies that $H_0 = \frac{m^2}{2}\int \tilde M(p, q)\frac{\delta(p - q)}{\sqrt{|p||q|}}\,dp\,dq$, where we put a constant of proportionality $\frac{m^2}{2}$. This is a moment map, as we have seen in the second section.

We see that a linear function of $\tilde M(p, q)$ (a moment map) corresponds to free field theory. The same equation of motion in the language of $Z$ is highly nonlinear. Thus, even the free field theory has a complicated description in terms of $Z$. From our discussion we conclude that the two quantum theories, quantization on $D_1$ and the complex scalar free field with $N_c$ components in the color singlet sector, are identical. The large-$N_c$ limit, being a classical theory, is rigorously defined through the classical dynamics on the infinite dimensional Disc, $D_1$.

The next question in this direction is to introduce interactions, and to see their meaning from the more conventional point of view. The general interaction term one can think of can be written with an integral kernel,

$$H_I = \int \tilde G(p, q; s, t)\,\tilde M(p, q)\tilde M(s, t)\,dp\,dq\,ds\,dt. \tag{90}$$

There is an immediate symmetry, $\tilde G(p, q; s, t) = \tilde G(s, t; p, q)$. Translational invariance will require $\tilde G(p, q; s, t) = G(p, q; s, t)\delta(p + s - q - t)$. Lorentz invariance implies that $G(p, q; s, t) = e^{-2\theta}G(e^{-\theta}p, e^{-\theta}q; e^{-\theta}s, e^{-\theta}t)$. Using the transformation of $\tilde M(p, q)$, the reality condition implies that $G(p, q; s, t) = G^*(q, p; t, s)$. It is also reasonable to demand it to be non-separable, that is to say we cannot write the interaction as a sum of terms which are not related to each other by any symmetry. Parity invariance puts a further restriction: $G(p, q; s, t) = G(-q, -p; -t, -s)$. We can see that these are satisfied for a simple non-separable kernel,

$$H_I = \tilde\lambda\int \frac{1}{\sqrt{|p||q||s||t|}}\,\delta(p + s - q - t)\,\tilde M(p, q)\tilde M(s, t)\,dp\,dq\,ds\,dt. \tag{91}$$

The choice of $\delta$-function is due to momentum conservation; the momentum dependence is the one necessary for the correct Lorentz transformation property. This, in fact, corresponds to the large-$N_c$ limit of $\lambda\phi^*_\alpha\phi^\alpha\phi^*_\beta\phi^\beta$ theory, with appropriate rescalings of the coupling constant $\tilde\lambda$. Let us see this in more detail.


We assume that the classical theory is given by the free Hamiltonian and the following interaction added to it:

$$H_I = \lambda\int \phi^*_\alpha\phi^\alpha\phi^*_\beta\phi^\beta\,dx^-. \tag{92}$$

To write down the quantum theory we need to use normal ordering, and it is better to express it in terms of the creation and annihilation operators:

$$\hat H_I = \frac{\lambda}{4}\int_{-\infty}^{\infty} \frac{1}{\sqrt{|p||q||s||t|}}\,\delta(p + s - t - q) : a^\dagger_\alpha(p) a^\alpha(q) a^\dagger_\beta(s) a^\beta(t) : dp\,dq\,ds\,dt, \tag{93}$$

where we use the same conventions as in the free field theory. We normalize the free Hamiltonian and the interaction by $\frac{2}{N_c}$, redefine the coupling constant as $\frac{\lambda}{8}N_c = \tilde\lambda$, and take the limit $N_c \to \infty$ while keeping $\tilde\lambda$ fixed. In this limit the normal ordered product decomposes as a product of two normal ordered color singlet operators. More explicitly, as $N_c \to \infty$, the normal ordered products $\frac{1}{N_c^2} : a^\dagger_\alpha(p) a^\alpha(q) a^\dagger_\beta(s) a^\beta(t) :$ decompose into $\frac{1}{N_c} : a^\dagger_\alpha(p) a^\alpha(q) : \frac{1}{N_c} : a^\dagger_\beta(s) a^\beta(t) : + O(\frac{1}{N_c})$. If we introduce the variable $\hat{\tilde M}(p, q) = \frac{2}{N_c} : a^\dagger_\alpha(p) a^\alpha(q) :$, in the limit $N_c \to \infty$ the interaction Hamiltonian (93) goes to the expression in (91). The constraint on the variable $\tilde M$ implies that we are looking at the color invariant sector of this theory. This is unlike the case of 2d QCD, where the Hamiltonian diverges except in the color invariant sector. One can calculate the equations of motion for the basic variables $\tilde M(k, l)$, for this choice of the Hamiltonian:

$$\begin{aligned} \partial_+\tilde M(k, l) ={}& i\,\frac{m^2}{2}\Big[\frac{1}{l} - \frac{1}{k}\Big]\tilde M(k, l) \\ &+ 2i\tilde\lambda\int \Big[\frac{\mathrm{sgn}(k)}{\sqrt{|k(k + p)|}}\,\tilde M(k + p, l) - \frac{\mathrm{sgn}(l)}{\sqrt{|l(l + p)|}}\,\tilde M(k, l + p)\Big]\,\tilde M\Big(s - \frac{p}{2}, s + \frac{p}{2}\Big)\,\frac{dp\,ds}{\sqrt{|(s + \frac{p}{2})(s - \frac{p}{2})|}} \\ &+ 2i\tilde\lambda\,\frac{[\mathrm{sgn}(k) - \mathrm{sgn}(l)]}{\sqrt{|kl|}}\int \tilde M\Big(p + \frac{k - l}{2}, p - \frac{k - l}{2}\Big)\,\frac{dp}{\sqrt{|(p + \frac{k - l}{2})(p - \frac{k - l}{2})|}}. \end{aligned} \tag{94}$$

Later on we will comment on the linear approximation to this model. Another choice, consistent with the Lorentz transformation property, translational invariance and parity invariance, is

$$H_I = \frac{\tilde g^2}{2}\int_{-\infty}^{\infty} \delta(q + t - p - s)\Big(\frac{1}{(p - t)^2} + \frac{1}{(q - s)^2}\Big)\frac{qt + ps + st + pq}{\sqrt{|p||q||s||t|}}\,\tilde M(p, q)\tilde M(s, t)\,dq\,dp\,ds\,dt. \tag{95}$$

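We can calculate the equations of motion by using the Poisson brackets when we add the free part to this interaction Hamiltonian. The result can be written as

$$\begin{aligned} \partial_+\tilde M(k, l) ={}& i\,\frac{m^2}{2}\Big[\frac{1}{l} - \frac{1}{k}\Big]\tilde M(k, l) \\ &+ i\tilde g^2\int \Big[\tilde M(p, l)\,\mathrm{sgn}(k)\,\delta(p + s - k - t)\Big(\frac{1}{(p - t)^2} + \frac{1}{(s - k)^2}\Big)\frac{kt + ps + st + pk}{\sqrt{|pkst|}} \\ &\qquad - \tilde M(k, p)\,\mathrm{sgn}(l)\,\delta(p + t - l - s)\Big(\frac{1}{(p - s)^2} + \frac{1}{(l - t)^2}\Big)\frac{pt + ls + st + lp}{\sqrt{|lpst|}}\Big]\tilde M(s, t)\,dp\,dt\,ds \\ &+ i\tilde g^2\int (\mathrm{sgn}(l) - \mathrm{sgn}(k))\,\delta(l + s - k - t)\,\frac{kt + ls + kl + st}{\sqrt{|lkst|}}\Big(\frac{1}{(l - t)^2} + \frac{1}{(k - s)^2}\Big)\tilde M(s, t)\,ds\,dt. \end{aligned} \tag{96}$$

One can simplify the equations of motion by some redefinition of the variables:

$$\begin{aligned} \partial_+\tilde M(k, l) ={}& i\,\frac{m^2}{2}\Big[\frac{1}{l} - \frac{1}{k}\Big]\tilde M(k, l) \\ &+ 2i\tilde g^2\int \Big[\tilde M(k + v, l)\,\frac{\mathrm{sgn}(k)}{\sqrt{|k(k + v)|}}\,\frac{(s - v/2 + k)^2}{(s - v/2 - k)^2} - \tilde M(k, l - v)\,\frac{\mathrm{sgn}(l)}{\sqrt{|l(l - v)|}}\,\frac{(s + v/2 + l)^2}{(s + v/2 - l)^2}\Big] \\ &\qquad \times \frac{\tilde M(s - v/2, s + v/2)}{\sqrt{|(s - v/2)(s + v/2)|}}\,ds\,dv \\ &+ 2i\tilde g^2\,\frac{\mathrm{sgn}(k) - \mathrm{sgn}(l)}{\sqrt{|kl|}}\int \frac{[s + (k + l)/2]^2}{[s - (k + l)/2]^2}\,\frac{\tilde M(s - (l - k)/2, s + (l - k)/2)}{\sqrt{|(s - (l - k)/2)(s + (l - k)/2)|}}\,ds. \end{aligned} \tag{97}$$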

This is a rather complicated, nonlinear equation. Using the above form we can show that in fact $\mathrm{Tr}\,M$ is conserved by the time evolution, so we can think of it as a conserved charge which has the meaning of baryon number. We recall that in two dimensional QCD $\mathrm{Tr}\,M$ is a topological invariant and hence conserved by the equations of motion.

One can see that the above theory in more conventional language corresponds to the large-$N_c$ limit of a complex scalar field with a current-current interaction through a propagator. We can write the quantum theory, using the same conventions as in the case of the free field,

$$\hat H = \frac{1}{2}m^2\int_{-\infty}^{\infty} : a^\dagger_\alpha(p) a^\alpha(p) : \frac{dp}{|p|} + \frac{1}{4}g^2\int_{-\infty}^{\infty} : \hat j^\alpha_\beta(x^-)\,|x^- - y^-|\,\hat j^\beta_\alpha(y^-) : dx^- dy^-; \tag{98}$$


here $\hat j^\alpha_\beta(x^-) = i : \partial_-\hat\phi^{*\alpha}(x^-)\hat\phi_\beta(x^-) - \hat\phi^{*\alpha}(x^-)\partial_-\hat\phi_\beta(x^-) :$ denotes the current and we sum over repeated indices. In terms of the creation and annihilation operators, the normal ordered product of two currents has the following term:

$$: [p\,a^\dagger_\alpha(p) a^\beta(t) + t\,a^\dagger_\alpha(p) a^\beta(t)]\,[s\,a^\dagger_\beta(s) a^\alpha(q) + q\,a^\dagger_\beta(s) a^\alpha(q)] : . \tag{99}$$

Dividing $\hat H$ by $N_c$ and redefining the coupling constant as $g^2 N_c \to g^2$, we can take the large-$N_c$ limit. In this limit, the normal ordered expression (99) actually splits into a product of four color invariant operators, up to corrections of order $\frac{1}{N_c}$, as in the previous case. With the identification $\hat{\tilde M}(p, q) = \frac{2}{N_c} : a^\dagger_\alpha(p) a^\alpha(q) :$, as $N_c \to \infty$, $\hat H$ tends to the expression in (95) plus the free part (89).

It is possible to obtain this Hamiltonian from scalar QCD, by eliminating the gauge degrees of freedom. Let us consider the spin zero bosonic matter fields, $\phi^\alpha$, which are in the fundamental representation of $U(N_c)$. We write down the Lagrangian in $1 + 1$ dimensions, for $\phi$ coupled to $U(N_c)$ Yang-Mills theory. $A_\mu$ is the Yang-Mills field, a Lie algebra valued vector. It can be written as $A_\mu = A^a_\mu T_a$, where the $T_a$'s are the generators and $a = 1, \ldots, \mathrm{rank}\,U(N_c)$. We use hermitian generators in the fundamental representation, and normalize them with respect to the Killing form, $\mathrm{Tr}\,T_a T_b = \delta_{ab}$. The action for Yang-Mills fields is defined through the field strength tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + ig[A_\mu, A_\nu]$. Here, commutator refers to the Lie algebra commutator. We also need to write down the gauge invariant coupling, through the covariant derivatives $D_\mu\phi^\alpha = \partial_\mu\phi^\alpha + ig(A_\mu)^\alpha_\beta\phi^\beta$, and similarly for the complex conjugate field, $\phi^*_\alpha$. The action is given by

$$S = \int \Big(-\frac{1}{4}\mathrm{Tr}\,F_{\mu\nu}F^{\mu\nu} + (D_\mu\phi)^\dagger(D^\mu\phi) - m^2\phi^\dagger\phi\Big)\,dx^- dx^+, \tag{100}$$

where $\dagger$ denotes the conjugate-transpose of a matrix. We write down the action in the light-cone coordinates, and we pick the light-cone gauge, $A^a_- = 0$. Then the action becomes

$$S = \int dx^- dx^+\Big(\frac{1}{2}(\partial_- A^a_+)^2 + ig(\partial_-\phi^\dagger A_+\phi - \phi^\dagger A_+\partial_-\phi) - m^2\phi^\dagger\phi + \partial_+\phi^\dagger\partial_-\phi + \partial_-\phi^\dagger\partial_+\phi\Big). \tag{101}$$

We see that $A_+$ has no time derivatives, thus it is not a dynamical field. We can eliminate $A^a_+$ through its equation of motion. We introduce $j^\alpha_\beta = i(\partial_-\phi^{*\alpha}\phi_\beta - \phi^{*\alpha}\partial_-\phi_\beta)$ and substitute the solution for $A^a_+$. Then, we obtain the following action:

$$S = \int (\partial_-\phi^*_\alpha\,\partial_+\phi^\alpha + \partial_+\phi^*_\alpha\,\partial_-\phi^\alpha)\,dx^- dx^+ - \int \Big(m^2\phi^\dagger\phi - \frac{1}{2}g^2\,j^\alpha_\beta\,\frac{1}{\partial_-^2}\,j^\beta_\alpha\Big)\,dx^- dx^+. \tag{102}$$

We see that this action is first order in its “time” variable and it is written completely in terms of the field $\phi$. We can read off the Hamiltonian

$$H = m^2\int dx^-\,\phi^*_\alpha\phi^\alpha + \frac{1}{4}g^2\int j^\alpha_\beta(x^-)\,|x^- - y^-|\,j^\beta_\alpha(y^-)\,dx^- dy^-. \tag{103}$$

The quantization of this theory through conventional methods will lead to the Hamiltonian (98). Therefore, the above choice of the Hamiltonian has fundamental importance.

Let us remark that the true large-$N_c$ theory, although it is classical, is not a conventional field theory. Its dynamical variable is non-local and satisfies a complicated nonlinear equation. In fact, the nonlinearity is what is necessary for the theory to accommodate soliton solutions. Baryons will arise as solitons of the above set of nonlinear


equations. Mesons correspond to small oscillations around the vacuum configuration. As is shown by the first author [33], this point of view is valid in fermionic QCD in $1 + 1$ dimensions. The usual 't Hooft equation for the meson spectrum was obtained through the linearization of the equations of motion and by choosing a simple ansatz for the variable $M(p, q)$. First, we will obtain the analog of the 't Hooft equation by linearization. Then we will show how to make a variational estimate for the baryon mass in scalar QCD.

To start the linearization of the theory around the vacuum configuration $\tilde M(p, q) = 0$, one should note that it is also necessary to linearize the constraint equation, $M^2 + [\epsilon, M]_+ = 0$. Since we assume that $\tilde M(p, q)$ represents small oscillations, we can disregard the quadratic term. Explicitly,

$$\int [dp][ds]\,\tilde M(p, s)\,[\epsilon\lambda(s, p) + \lambda(s, p)\epsilon](k, l) = 0. \tag{104}$$

This implies that

$$\tilde M(k, l)\,[1 + \mathrm{sgn}(k)\mathrm{sgn}(l)] = 0, \tag{105}$$

which is to say that $\tilde M(k, l) = 0$ unless $k > 0$ and $l < 0$, or $k < 0$ and $l > 0$. We start with the Hamiltonian $H = H_0 + H_I$ and calculate the Poisson bracket of $\tilde M(k, l)$ with $H$ in the linear approximation. To perform this, we drop the quadratic terms in the equations of motion, and write down the interaction part only,

$$\partial_{x^+}\tilde M(k, l) = \tilde g^2\int \delta(q - s + t - p)\,\frac{1}{(p - t)^2}\,\frac{qt + ps + 2st}{\sqrt{|p||q||s||t|}}\,(\mathrm{sgn}(l) - \mathrm{sgn}(k))\,[\delta(p - l)\delta(q - k)\tilde M(s, t) + \delta(k - t)\delta(l - s)\tilde M(p, q)]\,dq\,dp\,ds\,dt. \tag{106}$$

We know that in the linear approximation $k$ and $l$ have opposite signs. We define $P = k - l$ and consider the case for which $k > 0$ and $l < 0$. Let us concentrate on the first term, and perform the delta-function integrations,

$$I_1 = \int \frac{1}{(l - t)^2}\,\frac{kt + l(P + t) + 2t(P + t)}{\sqrt{(-kl)|t||P + t|}}\,\tilde M(t + P, t)\,dt.$$

Note that in the above expression we can find the limits of integration, since either $t + P > 0$ and $t < 0$, or $t + P < 0$ and $t > 0$. The second alternative is impossible since $P > 0$. We have $-P < t < 0$ as the integration region. If we make a change of variable $t \to -t$ and redefine $x = \frac{k}{P}$, $y = \frac{t}{P}$, we see that

$$I_1 = \frac{1}{P}\int_0^1 \frac{(y - 1)x + y(x - 1) + 2y(y - 1)}{(x - y)^2}\,\frac{1}{\sqrt{x(1 - x)y(1 - y)}}\,\tilde M(Py, P(y - 1))\,dy. \tag{107}$$

The second term is very similar and we can apply the same set of transformations to rewrite it as

$$I_2 = \frac{1}{P}\int_0^1 \frac{(y - 1)x + y(x - 1) + 2x(x - 1)}{(x - y)^2}\,\frac{1}{\sqrt{x(1 - x)y(1 - y)}}\,\tilde M(Py, P(y - 1))\,dy. \tag{108}$$

If we sum the two expressions and insert the result into the full equation of motion, we obtain:


$$\partial_{x^+}\tilde M(Px, P(x - 1)) = i\,\frac{m^2}{2P}\Big[\frac{1}{x - 1} - \frac{1}{x}\Big]\tilde M(Px, P(x - 1)) - i\,\frac{4\tilde g^2}{P}\int_0^1 \frac{1}{\sqrt{(1 - x)x(1 - y)y}}\,\frac{(x + y)(2 - x - y)}{(x - y)^2}\,\tilde M(Py, P(y - 1))\,dy. \tag{109}$$
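The change of variables used in the linearization, $P = k - l$ and $x = k/P$ for $k > 0$, $l < 0$, can be checked with exact arithmetic. The sketch below (helper names are illustrative) verifies that $l = P(x - 1)$, that $0 < x < 1$, and that the free coefficient $\frac1l - \frac1k$ becomes $-\frac1P\big[\frac1x + \frac1{1-x}\big]$, which is where the mass term $m^2[\frac1x + \frac1{1-x}]$ of the meson equation comes from.

```python
from fractions import Fraction


def to_meson_variables(k, l):
    """Map momenta (k > 0, l < 0) to total momentum P and fraction x = k / P."""
    k, l = Fraction(k), Fraction(l)
    if not (k > 0 and l < 0):
        raise ValueError("linearized sector considered here needs k > 0 and l < 0")
    P = k - l
    return P, k / P


def free_coefficient(k, l):
    """The factor 1/l - 1/k multiplying (i m^2 / 2) in the free equation."""
    return 1 / Fraction(l) - 1 / Fraction(k)


def meson_form(P, x):
    """The same factor written in terms of P and x."""
    return -(1 / x + 1 / (1 - x)) / P


P, x = to_meson_variables(3, -1)
print(P, x)                                      # 4 3/4
print(free_coefficient(3, -1), meson_form(P, x))  # -4/3 -4/3
```

The exact equality holds for any admissible pair $(k, l)$, since $\frac{1}{P(x-1)} - \frac{1}{Px} = -\frac{1}{P}\cdot\frac{1}{x(1-x)}$ and $\frac1x + \frac1{1-x} = \frac{1}{x(1-x)}$.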

The above form suggests that we define a quark-antiquark wave function $\xi(x)e^{iP_+x^+} = \tilde M(Px, P(x - 1); x^+)$, in a definite “energy” eigenstate $P_+$. We also express the coupling constant in terms of the coupling constant of the gauge theory: $\tilde g^2 = \frac{g^2}{32\pi}$. When this is substituted into the equations of motion, we see that

$$\mu^2\xi(x) = m^2\Big[\frac{1}{x} + \frac{1}{1 - x}\Big]\xi(x) - \frac{g^2}{4\pi}\int_0^1 \frac{1}{\sqrt{(1 - x)x(1 - y)y}}\,\frac{(x + y)(2 - x - y)}{(x - y)^2}\,\xi(y)\,dy, \tag{110}$$

where µ2 = 2P P+ is the invariant mass of the meson in the linear approximation. This is the equation obtained by Tomaras in [36], as the analog of the ‘t Hooft equation for scalar QCD. The mass spectrum one will get from the semiclassical techniques is similar to the fermionic QCD. We emphasize that we have obtained all the nonlinearities in the large Nc limit of scalar QCD. In principle we can obtain the interactions of the mesons with each other by going to the next order in the linearization. This would be the analogue of the work of Callan, Coote and Gross [9] in scalar QCD. Our approach also describes phenomena that cannot be seen to any order of such an expansion, such as solitons. The simplest such soliton is the baryon, more complicated ones describe the analogues of nuclei in scalar QCD. We can obtain an estimate of the baryon mass by a variational approach. If we choose, (as in [33]) M = −2uu† , where u satisfies u = u, ||u||2 = u† u = 1, we will satisfy all the constraints on M : M 2 + [, M ] = 0, M † = M . Moreover, being rank one, it obviously satisfies the convergence conditions. In the variable M , the baryon number is just − 21 trM which is equal to one for our choice. In momentum space, u will be represented by a function u(p) ˜ which vanishes for p < 0. We can get a formula for the energy of this configuration by substituting it into our Hamiltonian. The “kinetic energy” is best written in momentum space and the “potential energy” in position space: Z ∞ Z 2 [dp] + g˜ 2 dxdy|x − y|=v ∗ (x)v 0 (x)=v ∗ (y)v 0 (y), (111) |v(p)| ˜ H(v) = m2 p 0 R ∞ ipx ˜ √ and v(x) = where = refers to the imaginary part, v(p) ˜ = u(p) e v(p)[dp]. ˜ The p 0 R 2 p[dp] = 1 is the mass of the lowest energy minimum of this over all v˜ satisfying |v(p)| ˜ baryon, in the large N limit. This can be found by numerically solving the resultant integral equation. A reasonable estimate can be made using a variational ansatz, such as v(p) ˜ = Cpe−pκ . 
($C$ is a normalization constant.) In the linear approximation the theory defined by the interaction Hamiltonian [11] given in (91) reduces to a similar integral equation,

$$\mu^2 \xi(x) = m^2 \Big[\frac{1}{x} + \frac{1}{1 - x}\Big] \xi(x) + \tilde\lambda \int_0^1 \frac{1}{\sqrt{(1 - x)x(1 - y)y}}\, \xi(y)\, dy, \qquad (112)$$


with the same identifications for all the variables as before. One can see that this equation can be solved analytically and it gives an equation for the square of the invariant mass of the small excitations, $\mu^2$:

$$\frac{1}{\mu \sqrt{4m^2 - \mu^2}}\, \arctan \frac{\mu}{\sqrt{4m^2 - \mu^2}} = -\frac{1}{\tilde\lambda}, \qquad (113)$$

where we assumed that the spectrum satisfies $\mu^2 < 4m^2$. In addition to the discrete spectrum, this theory also has a continuous spectrum of scattering states; unlike in two-dimensional QCD, the particles are not confined.

Acknowledgement. We would like to thank S. Guruswamy and P. Vitale for discussions at the beginning stages, and R. Henderson and C. W. H. Lee for reading the article and for discussions. The second author would like to thank IHES, where he is presently an EPDI fellow, for the excellent environment provided during the completion of this work. Our research is also supported by the grant DE-FG02-91ER40685.

References

1. Abraham, R. and Marsden, J.E.: Foundations of Mechanics. 2nd Ed., New York: The Benjamin/Cummings Publ. Comp., Inc., 1978
2. Arnold, V.I.: Mathematical Methods of Classical Mechanics. 2nd Ed., New York: Springer Verlag, 1986
3. Axelrod, S., Pietra, S.D. and Witten, E.: Jour. Diff. Geom. 33, 787 (1991)
4. Bedaque, P., Horvath, I. and Rajeev, S.G.: Mod. Phys. Lett. A7, 3347 (1992)
5. Berezin, F.A.: Commun. Math. Phys. 40, 153 (1975)
6. Berezin, F.A.: USSR Izv. 6, No. 5 (1972)
7. Berezin, F.A.: USSR Izv. 9, No. 2 (1975)
8. Berezin, F.A.: Commun. Math. Phys. 63, 131 (1978)
9. Callan, C.G., Coote, N. and Gross, D.J.: Phys. Rev. D13, 1649 (1976)
10. Cavicchi, M.: Int. Jour. Mod. Phys. A10, 167 (1995)
11. Coleman, S., Jackiw, R. and Politzer, H.D.: Phys. Rev. D10, 2491 (1974)
12. Chern, S.S.: Complex Manifolds without Potential Theory. 2nd Ed., New York: Springer Verlag, 1979
13. Dhar, A., Mandal, G. and Wadia, S.R.: Nucl. Phys. B436, 487 (1994)
14. Dhar, A., Lakdawala, P., Mandal, G. and Wadia, S.: Int. Jour. Mod. Phys. A10, 2189 (1995)
15. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley Classics, 1995
16. 't Hooft, G.: Nucl. Phys. B72, 461 (1974)
17. 't Hooft, G.: Nucl. Phys. B75, 461 (1974)
18. Humphreys, J.E.: Introduction to Lie Algebras and Representation Theory. New York: Springer Verlag, 1980
19. Hurt, N.E.: Geometric Quantization in Action. Dordrecht: D. Reidel, 1983
20. Jacobson, N.: Lie Algebras. New York: Dover Publications, 1979
21. Kac, V. and Peterson, D.H.: Proc. Nat. Acad. Sci. USA 78, 3308 (1981)
22. Kikkawa, K.: Phys. Lett. B92, 315 (1980)
23. Kikkawa, K.: Ann. Phys. 135, 222 (1981)
24. Knapp, A.W.: Representation Theory of Semisimple Groups. Princeton, NJ: Princeton Univ. Press, 1986
25. Kobayashi, S. and Nomizu, K.: Foundations of Differential Geometry. 2 vols, New York: Wiley Interscience, 1968
26. Kirillov, A.: Elements of the Theory of Representations. New York: Springer Verlag, 1976
27. Kirillov, A.: Geometric Quantization. In: Dynamical Systems IV, ed. by V.I. Arnold, Berlin–Heidelberg–New York: Springer Verlag, 1988
28. Langmann, E.: J. Math. Phys. 36, 3822 (1995); hep-th/9507088 and hep-th/9508003
29. Mickelsson, J. and Rajeev, S.G.: Commun. Math. Phys. 116, 365 (1988)
30. Milnor, J. and Stasheff, J.: Lectures on Characteristic Classes. Ann. of Math. Stud. 76, Princeton, NJ: Princeton Univ. Press, 1974
31. Pressley, A. and Segal, G.: Loop Groups. Oxford: Oxford University Press, 1988
32. Pickrell, D.: J. Funct. Anal. 70, 323 (1987)
33. Rajeev, S.G.: Int. J. Mod. Phys. A9, 5583 (1994)
34. Segal, G.: Commun. Math. Phys. 80, 301 (1981)
35. Simon, B.: Trace Ideals and Their Applications. Cambridge: Cambridge Univ. Press, 1979
36. Tomaras, T.N.: Nucl. Phys. B163, 79 (1980)
37. Wells, R.O.: Differential Analysis on Complex Manifolds. New York: Springer Verlag, 1980
38. Witten, E.: Nucl. Phys. B160, 57 (1979)

Communicated by G. Felder


Commun. Math. Phys. 192, 519 – 541 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Generalized Hermite Polynomials and the Heat Equation for Dunkl Operators

Margit Rösler*

Zentrum Mathematik, Technische Universität München, Arcisstr. 21, D-80333 München, Germany. E-mail: [email protected]

Received: 10 March 1997 / Accepted: 7 July 1997

Abstract: Based on the theory of Dunkl operators, this paper presents a general concept of multivariable Hermite polynomials and Hermite functions which are associated with finite reflection groups on $\mathbb{R}^N$. The definition and properties of these generalized Hermite systems extend naturally those of their classical counterparts; partial derivatives and the usual exponential kernel are here replaced by Dunkl operators and the generalized exponential kernel $K$ of the Dunkl transform. In the case of the symmetric group $S_N$, our setting includes the polynomial eigenfunctions of certain Calogero-Sutherland type operators. The second part of this paper is devoted to the heat equation associated with Dunkl's Laplacian. As in the classical case, the corresponding Cauchy problem is governed by a positive one-parameter semigroup; this is assured by a maximum principle for the generalized Laplacian. The explicit solution to the Cauchy problem involves again the kernel $K$, which is, on the way, proven to be nonnegative for real arguments.

* This paper was written while the author held a Forschungsstipendium of the DFG at the University of Virginia, Charlottesville, VA, USA.

1. Introduction

Dunkl operators are differential-difference operators associated with a finite reflection group, acting on some Euclidean space. They provide a useful framework for the study of multivariable analytic structures which reveal certain reflection symmetries. During the last years, these operators have gained considerable interest in various fields of mathematics and also in physical applications; they are, for example, naturally connected with certain Schrödinger operators for Calogero-Sutherland-type quantum many body systems, see [L-V] and [B-F2, B-F3]. For a finite reflection group $G \subset O(N, \mathbb{R})$ on $\mathbb{R}^N$ the associated Dunkl operators are defined as follows: For $\alpha \in \mathbb{R}^N \setminus \{0\}$, denote by $\sigma_\alpha$ the reflection corresponding to $\alpha$, i.e. in the hyperplane orthogonal to $\alpha$. It is given by


$$\sigma_\alpha(x) = x - 2\, \frac{\langle \alpha, x \rangle}{|\alpha|^2}\, \alpha,$$

where $\langle . , . \rangle$ is the Euclidean scalar product on $\mathbb{R}^N$ and $|x| := \sqrt{\langle x, x \rangle}$. (We use the same notations for the standard Hermitian inner product and norm on $\mathbb{C}^N$.) Let $R$ be the root system associated with the reflections of $G$, normalized such that $\langle \alpha, \alpha \rangle = 2$ for all $\alpha \in R$. Now choose a multiplicity function $k$ on the root system $R$, that is, a $G$-invariant function $k : R \to \mathbb{C}$, and fix some positive subsystem $R_+$ of $R$. The Dunkl operators $T_i$ $(i = 1, \ldots, N)$ on $\mathbb{R}^N$ associated with $G$ and $k$ are then given by

$$T_i f(x) := \partial_i f(x) + \sum_{\alpha \in R_+} k(\alpha)\, \alpha_i \cdot \frac{f(x) - f(\sigma_\alpha x)}{\langle \alpha, x \rangle}, \qquad f \in C^1(\mathbb{R}^N);$$

here $\partial_i$ denotes the $i$th partial derivative. In case $k = 0$, the $T_i$ reduce to the corresponding partial derivatives. In this paper, we shall assume throughout that $k \ge 0$ (i.e. all values of $k$ are non-negative), though several results of Sect. 3 may be extended to larger ranges of $k$. The most important basic properties of the $T_i$, proved in [D2], are as follows: Let $\mathcal{P} = \mathbb{C}[\mathbb{R}^N]$ denote the algebra of polynomial functions on $\mathbb{R}^N$ and $\mathcal{P}_n$ $(n \in \mathbb{Z}_+ = \{0, 1, 2, \ldots\})$ the subspace of homogeneous polynomials of degree $n$. Then

The set $\{T_i\}$ generates a commutative algebra of differential-difference operators on $\mathcal{P}$. (1.1)

Each $T_i$ is homogeneous of degree $-1$ on $\mathcal{P}$, that is, $T_i\, p \in \mathcal{P}_{n-1}$ for $p \in \mathcal{P}_n$. (1.2)
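To make property (1.2) concrete, the following minimal sketch treats the rank-one case $G = \mathbb{Z}_2$ on $\mathbb{R}$. With the normalization $\langle \alpha, \alpha \rangle = 2$ and a single multiplicity value $\mu = k(\alpha)$, the definition above reduces to $Tf(x) = f'(x) + \mu\,(f(x) - f(-x))/x$. Polynomials are stored as coefficient lists; the names `dunkl` and `dunkl_laplacian` are illustrative choices and do not come from the paper.

```python
from fractions import Fraction


def dunkl(p, mu):
    """Apply the rank-one Dunkl operator T to a polynomial.

    p is a coefficient list (p[n] multiplies x**n), mu is the multiplicity.
    On monomials, T x^n = (n + mu*(1 - (-1)^n)) x^(n-1): the difference
    quotient (f(x) - f(-x))/x only sees the odd powers.
    """
    out = [Fraction(0)] * max(len(p) - 1, 1)
    for n in range(1, len(p)):
        out[n - 1] = (n + mu * (1 - (-1) ** n)) * Fraction(p[n])
    return out


def dunkl_laplacian(p, mu):
    """Rank-one generalized Laplacian, i.e. T applied twice."""
    return dunkl(dunkl(p, mu), mu)


mu = Fraction(3, 2)
x4 = [0, 0, 0, 0, 1]                 # the monomial x^4
print(dunkl(x4, mu))                 # a multiple of x^3: degree drops by one
print(dunkl_laplacian(x4, mu))       # a multiple of x^2: degree drops by two
print(dunkl(x4, 0) == [0, 0, 0, 4])  # mu = 0 recovers the ordinary derivative
```

In rank one there is only a single operator, so the commutativity statement (1.1) is trivial here; the sketch only illustrates the degree-lowering behaviour and the reduction to $d/dx$ for $\mu = 0$.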

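Of particular importance in this paper is the generalized Laplacian associated with $G$ and $k$, which is defined as $\Delta_k := \sum_{i=1}^N T_i^2$. It is homogeneous of degree $-2$ on $\mathcal{P}$ and given explicitly by

$$\Delta_k f(x) = \Delta f(x) + 2 \sum_{\alpha \in R_+} k(\alpha) \Big[ \frac{\langle \nabla f(x), \alpha \rangle}{\langle \alpha, x \rangle} - \frac{f(x) - f(\sigma_\alpha x)}{\langle \alpha, x \rangle^2} \Big].$$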

(Here $\Delta$ and $\nabla$ denote the usual Laplacian and gradient respectively.) The operators $T_i$ were introduced and first studied by Dunkl in a series of papers ([D1–4]) in connection with a generalization of the classical theory of spherical harmonics: Here the uniform spherical surface measure on the $(N - 1)$-dimensional unit sphere is modified by a weight function which is invariant under the action of some finite reflection group $G$, namely

$$w_k(x) = \prod_{\alpha \in R_+} |\langle \alpha, x \rangle|^{2k(\alpha)},$$

where $k \ge 0$ is some fixed multiplicity function on the root system $R$ of $G$. Note that $w_k$ is homogeneous of degree $2\gamma$, with

$$\gamma := \sum_{\alpha \in R_+} k(\alpha).$$

In this context, in [D3] the following bilinear form on $\mathcal{P}$ is introduced:

$$[p, q]_k := (p(T)\, q)(0) \quad \text{for } p, q \in \mathcal{P}.$$


Here $p(T)$ is the operator derived from $p(x)$ by replacing $x_i$ by $T_i$. Property (1.1) assures that $[. , .]_k$ is well-defined. A useful collection of its properties can be found in [D-J-O]. We recall that $[. , .]_k$ is symmetric and positive-definite (in case $k \ge 0$), and that $[p, q]_k = 0$ for $p \in \mathcal{P}_n$, $q \in \mathcal{P}_m$ with $n \ne m$. Moreover, for all $i = 1, \ldots, N$, $p, q \in \mathcal{P}$ and $g \in G$,

$$[x_i\, p, q]_k = [p, T_i\, q]_k \quad \text{and} \quad [g(p), g(q)]_k = [p, q]_k, \qquad (1.3)$$

where $g(p)(x) = p(g^{-1}(x))$. The pairing $[. , .]_k$ is closely related to the scalar product on $L^2(\mathbb{R}^N, w_k(x) e^{-|x|^2/2} dx)$: In fact, according to [D3],

$$[p, q]_k = n_k \int_{\mathbb{R}^N} e^{-\Delta_k/2} p(x)\; e^{-\Delta_k/2} q(x)\; w_k(x)\, e^{-|x|^2/2}\, dx \quad \text{for all } p, q \in \mathcal{P}, \qquad (1.4)$$

with some normalization constant $n_k > 0$. Given an orthonormal basis $\{\varphi_\nu, \nu \in \mathbb{Z}_+^N\}$ of $\mathcal{P}$ with respect to $[. , .]_k$, an easy rescaling of (1.4) shows that the polynomials

$$H_\nu(x) := 2^{|\nu|}\, e^{-\Delta_k/4} \varphi_\nu(x)$$

are orthogonal with respect to $w_k(x) e^{-|x|^2} dx$ on $\mathbb{R}^N$. We call them the generalized Hermite polynomials on $\mathbb{R}^N$ associated with $G$, $k$ and $\{\varphi_\nu\}$.

The first part of this paper is devoted to the study of such Hermite polynomial systems and associated Hermite functions. They generalize their classical counterparts in a natural way: these are just obtained for $k = 0$ and $\varphi_\nu(x) = (\nu!)^{-1/2} x^\nu$. In the one-dimensional case, associated with the reflection group $G = \mathbb{Z}_2$ on $\mathbb{R}$, our generalized Hermite polynomials coincide with those introduced in [Chi] and studied in [Ros]. Our setting also includes, for the symmetric group $G = S_N$, the so-called non-symmetric generalized Hermite polynomials which were recently introduced by Baker and Forrester in [B-F2, B-F3]. These are non-symmetric analogues of the symmetric, i.e. permutation-invariant, generalized Hermite polynomials associated with the group $S_N$, which were first introduced by Lassalle in [L2]. Moreover, the "generalized Laguerre polynomials" of [B-F2, B-F3], which are non-symmetric analogues of those in [L1], can be considered as a subsystem of Hermite polynomials associated with a reflection group of type $B_N$. We refer to [B-F1 and vD] for a thorough study of the symmetric multivariable Hermite and Laguerre systems.

After a short collection of notations and basic facts from Dunkl's theory in Sect. 2, the concept of generalized Hermite polynomials is introduced in Sect. 3, along with a discussion of the above-mentioned special classes. We derive generalizations for many of the well-known properties of the classical Hermite polynomials and Hermite functions: a Rodrigues formula, a generating relation and a Mehler formula for the Hermite polynomials, analogues of the second order differential equations and a characterization of the generalized Hermite functions as eigenfunctions of the Dunkl transform. Parts of this section may be seen as a unifying treatment of results from [B-F2, B-F3 and Ros] for their particular cases.

In Sect. 4, which makes up the second major part of this paper, we turn to the Cauchy problem for the heat operator associated with the generalized Laplacian: Given an initial distribution $f \in C_b(\mathbb{R}^N)$, one has to find a function $u \in C^2(\mathbb{R}^N \times (0, T)) \cap C(\mathbb{R}^N \times [0, T])$ satisfying

$$H_k u := \Delta_k u - \partial_t u = 0 \ \text{ on } \mathbb{R}^N \times (0, \infty), \qquad u(. , 0) = f. \qquad (1.5)$$

For smooth and rapidly decreasing initial data f an explicit solution is easy to obtain; it involves the generalized heat kernel


$$\Gamma_k(x, y, t) = \frac{M_k}{t^{\gamma + N/2}}\, e^{-(|x|^2 + |y|^2)/4t}\, K\Big( \frac{x}{\sqrt{2t}}, \frac{y}{\sqrt{2t}} \Big), \qquad x, y \in \mathbb{R}^N,\ t > 0.$$

Here $M_k$ is a positive constant and $K$ denotes the generalized exponential kernel associated with $G$ and $k$ as introduced in [D3]. In the theory of Dunkl operators and the Dunkl transform, it takes over the role of the usual exponential kernel $e^{\langle x, y \rangle}$. Some of its properties are collected in Sect. 2. Without knowing whether $K$ is nonnegative, it seems difficult to obtain a solution of (1.5) for arbitrary initial data. However, one can prove a maximum principle for the generalized Laplacian $\Delta_k$, which is the key ingredient to assure that $\Delta_k$ leads to a positive one-parameter contraction semigroup on the Banach space $(C_0(\mathbb{R}^N), \|.\|_\infty)$. Positivity of this semigroup enforces positivity of $K$ and allows one to determine the explicit solution of (1.5) in the general case. We finish this section with an extension of a well-known maximum principle for the classical heat operator to our situation. This in particular implies a uniqueness result for solutions of the above Cauchy problem.

2. Preliminaries

The purpose of this section is to establish our basic notations and collect some further facts on Dunkl operators and the Dunkl transform which will be of importance later on. General references here are [D3, D4, and dJ]. First of all we note the following product rule, which is confirmed by a short calculation: For each $f \in C^1(\mathbb{R}^N)$ and each $g \in C^1(\mathbb{R}^N)$ which is invariant under the natural action of $G$,

$$T_i(fg) = (T_i f)\, g + f\, (T_i g) \quad \text{for } i = 1, \ldots, N. \qquad (2.1)$$

We use the common multi-index notation; in particular, for $\nu = (\nu_1, \ldots, \nu_N) \in \mathbb{Z}_+^N$ and $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$ we set $x^\nu := x_1^{\nu_1} \cdot \ldots \cdot x_N^{\nu_N}$, $\nu! := \nu_1! \cdot \ldots \cdot \nu_N!$ and $|\nu| := \nu_1 + \ldots + \nu_N$. If $f : \mathbb{R}^N \to \mathbb{C}$ is analytic with $f(x) = \sum_\nu a_\nu x^\nu$, the operator $f(T)$ is defined by

$$f(T) := \sum_\nu a_\nu T^\nu = \sum_\nu a_\nu\, T_1^{\nu_1} \cdot \ldots \cdot T_N^{\nu_N}.$$

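We restrict its action to $C^k(\mathbb{R}^N)$ if $f$ is a polynomial of degree $k$ and to $\mathcal{P}$ otherwise. The following formula will be used frequently:

Lemma 2.1. Let $p \in \mathcal{P}_n$. Then for $c \in \mathbb{C}$ and $a \in \mathbb{C} \setminus \{0\}$,

$$\big(e^{c \Delta_k} p\big)(ax) = a^n \big(e^{a^{-2} c\, \Delta_k} p\big)(x) \quad \text{for all } x \in \mathbb{R}^N.$$

In particular, for $p \in \mathcal{P}_n$ we have

$$\big(e^{-\Delta_k/2} p\big)(\sqrt{2}\, x) = \sqrt{2}^{\,n}\, \big(e^{-\Delta_k/4} p\big)(x). \qquad (2.2)$$

Proof. For $m \in \mathbb{Z}_+$ with $2m \le n$, the polynomial $\Delta_k^m p$ is homogeneous of degree $n - 2m$. Hence

$$\big(e^{c \Delta_k} p\big)(ax) = \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{c^m}{m!}\, (\Delta_k^m p)(ax) = a^n \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{c^m}{m!}\, a^{-2m}\, (\Delta_k^m p)(x) = a^n \big(e^{a^{-2} c\, \Delta_k} p\big)(x).$$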
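The scaling identity of Lemma 2.1 can be checked with exact rational arithmetic in the simplest non-trivial setting. The sketch below assumes the rank-one case $G = \mathbb{Z}_2$ on $\mathbb{R}$ with multiplicity $\mu$, where $T x^n = (n + \mu(1 - (-1)^n))\, x^{n-1}$ and $\Delta_\mu = T^2$; the helper names (`exp_laplacian`, `scaling_gap`) are hypothetical and only serve this illustration.

```python
from fractions import Fraction
from math import factorial


def dunkl(p, mu):
    """Rank-one Dunkl operator on a coefficient list (p[n] multiplies x**n)."""
    out = [Fraction(0)] * max(len(p) - 1, 1)
    for n in range(1, len(p)):
        out[n - 1] = (n + mu * (1 - (-1) ** n)) * Fraction(p[n])
    return out


def exp_laplacian(p, c, mu):
    """Return e^{c * Delta_mu} p; the exponential series is finite on polynomials."""
    total = [Fraction(0)] * len(p)
    term = [Fraction(a) for a in p]          # holds Delta_mu^m p, starting at m = 0
    m = 0
    while any(term):
        for i, a in enumerate(term):
            total[i] += Fraction(c) ** m / factorial(m) * a
        term = dunkl(dunkl(term, mu), mu)    # next power of the Laplacian
        m += 1
    return total


def evaluate(p, x):
    """Evaluate a coefficient list at the point x."""
    return sum(a * x ** n for n, a in enumerate(p))


def scaling_gap(n, c, a, x, mu):
    """Left side minus right side of Lemma 2.1 for p(x) = x^n."""
    p = [0] * n + [1]
    lhs = evaluate(exp_laplacian(p, c, mu), a * x)
    rhs = a ** n * evaluate(exp_laplacian(p, Fraction(c) / a ** 2, mu), x)
    return lhs - rhs


# With rational inputs the two sides agree exactly, so the gap is zero.
print(scaling_gap(5, Fraction(-1, 4), Fraction(3), Fraction(2, 7), Fraction(1, 3)))
```

Working with `Fraction` avoids any rounding, so a non-zero gap would indicate a genuine mismatch rather than floating-point noise.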


A major tool in this paper is the generalized exponential kernel $K(x, y)$ on $\mathbb{R}^N \times \mathbb{R}^N$, which generalizes the usual exponential function $e^{\langle x, y \rangle}$. It was first introduced in [D3] by means of a certain intertwining operator. By a result of [O1] (see also [dJ]), the function $x \mapsto K(x, y)$ may be characterized as the unique analytic solution of the system $T_i f = y_i f$ $(i = 1, \ldots, N)$ on $\mathbb{R}^N$ with $f(0) = 1$. Moreover, $K$ is symmetric in its arguments and has a holomorphic extension to $\mathbb{C}^N \times \mathbb{C}^N$. Its power series can be written as $K = \sum_{n=0}^\infty K_n$, where $K_n(x, y) = K_n(y, x)$ and $K_n$ is a homogeneous polynomial of degree $n$ in each of its variables. Note that $K_0 = 1$ and $K(z, 0) = 1$ for all $z \in \mathbb{C}^N$. For the reflection group $G = \mathbb{Z}_2$ on $\mathbb{R}$, the multiplicity function $k$ is characterized by a single parameter $\mu \ge 0$, and the kernel $K$ is given explicitly by

$$K(z, w) = j_{\mu - 1/2}(izw) + \frac{zw}{2\mu + 1}\, j_{\mu + 1/2}(izw), \qquad z, w \in \mathbb{C},$$

where for $\alpha \ge -1/2$, $j_\alpha$ denotes the normalized spherical Bessel function

$$j_\alpha(z) = 2^\alpha\, \Gamma(\alpha + 1)\, \frac{J_\alpha(z)}{z^\alpha} = \Gamma(\alpha + 1) \cdot \sum_{n=0}^\infty \frac{(-1)^n (z/2)^{2n}}{n!\, \Gamma(n + \alpha + 1)}.$$
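As a numerical illustration for $G = \mathbb{Z}_2$, the characterization $Tf = yf$, $f(0) = 1$ can be turned into a power series: writing $f = \sum_n a_n x^n$ and using $T x^n = (n + \mu(1 - (-1)^n))\, x^{n-1}$ gives $a_n = y\, a_{n-1} / (n + \mu(1 - (-1)^n))$. The sketch below compares this series with the Bessel expression above for real arguments. The function names and the truncation at 60 terms are assumptions made for the example only.

```python
from math import exp, factorial, gamma


def dunkl_factor(n, mu):
    """Factor c_n in T x^n = c_n x^(n-1) for the rank-one Dunkl operator."""
    return n + mu * (1 - (-1) ** n)


def kernel_series(x, y, mu, terms=60):
    """K(x, y) from the recurrence a_n = a_{n-1} * x*y / c_n implied by T f = y f."""
    total, coeff = 0.0, 1.0
    for n in range(terms):
        if n > 0:
            coeff *= x * y / dunkl_factor(n, mu)
        total += coeff
    return total


def j_imag(alpha, z, terms=60):
    """j_alpha(i z) for real z; the factors (-1)^n cancel against i^(2n)."""
    series = sum((z / 2) ** (2 * n) / (factorial(n) * gamma(n + alpha + 1))
                 for n in range(terms))
    return gamma(alpha + 1) * series


def kernel_bessel(x, y, mu):
    """The explicit Z_2 formula K = j_{mu-1/2}(i xy) + xy/(2mu+1) j_{mu+1/2}(i xy)."""
    t = x * y
    return j_imag(mu - 0.5, t) + t / (2 * mu + 1) * j_imag(mu + 0.5, t)


print(kernel_series(0.7, 1.3, 0.0), exp(0.7 * 1.3))                   # mu = 0: plain exponential
print(kernel_series(0.7, 1.3, 1.5), kernel_bessel(0.7, 1.3, 1.5))     # mu = 3/2
```

For $\mu = 0$ both expressions collapse to $e^{xy}$, in line with the remark that $K$ generalizes the usual exponential function.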

For details and related material we refer to [D4, R, R-V and Ro]. We list some further general properties of $K$ and the $K_n$ (all under the assumption $k \ge 0$) from [D3, D4 and dJ]: For all $z, w \in \mathbb{C}^N$ and $\lambda \in \mathbb{C}$,

$$K(\lambda z, w) = K(z, \lambda w); \qquad (2.3)$$

$$|K_n(z, w)| \le \frac{1}{n!}\, |z|^n |w|^n \quad \text{and} \quad |K(z, w)| \le e^{|z||w|}. \qquad (2.4)$$

For all $x, y \in \mathbb{R}^N$ and $j = 1, \ldots, N$,

$$|K(ix, y)| \le \sqrt{|G|}; \qquad (2.5)$$

$$T_j^x K_n(x, y) = y_j\, K_{n-1}(x, y) \quad \text{and} \quad T_j^x K(x, y) = y_j\, K(x, y); \qquad (2.6)$$

here the superscript $x$ denotes that the operators act with respect to the $x$-variable. In [dJ], exponential bounds for the usual partial derivatives of $K$ are given. They imply in particular that for each $\nu \in \mathbb{Z}_+^N$ there exists a constant $d_\nu > 0$ such that

$$|\partial_x^\nu K(x, z)| \le d_\nu\, |z|^{|\nu|}\, e^{|x|\, |\mathrm{Re}\, z|} \quad \text{for all } x \in \mathbb{R}^N,\ z \in \mathbb{C}^N. \qquad (2.7)$$

Let us finally recall a useful reproducing kernel property of $K$ from [D4] (it is rescaled with respect to the original one, thus fitting better in our context of generalized Hermite polynomials): Define the probability measure $\mu_k$ on $\mathbb{R}^N$ by

$$d\mu_k(x) := c_k\, e^{-|x|^2} w_k(x)\, dx, \quad \text{with } c_k = \Big( \int_{\mathbb{R}^N} e^{-|x|^2} w_k(x)\, dx \Big)^{-1}.$$

Moreover, for $z \in \mathbb{C}^N$ set $l(z) := \sum_{i=1}^N z_i^2$. Then for all $z, w \in \mathbb{C}^N$,

$$\int_{\mathbb{R}^N} K(2z, x)\, K(2w, x)\, d\mu_k(x) = e^{l(z) + l(w)}\, K(2z, w). \qquad (2.8)$$


The generalized exponential function $K$ gives rise to an integral transform, called the Dunkl transform on $\mathbb{R}^N$, which was introduced in [D4] and has been thoroughly studied in [dJ] for a large range of parameters $k$. The Dunkl transform associated with $G$ and $k \ge 0$ is defined by

$$D_k : L^1(\mathbb{R}^N, w_k(x) dx) \to C(\mathbb{R}^N); \qquad D_k f(\xi) := \int_{\mathbb{R}^N} f(x)\, K(-i\xi, x)\, w_k(x)\, dx \quad (\xi \in \mathbb{R}^N).$$

In [dJ], many of the important properties of Fourier transforms on locally compact abelian groups are proved to hold true for $D_k$. In particular, $D_k f \in C_0(\mathbb{R}^N)$ for $f \in L^1(\mathbb{R}^N, w_k(x) dx)$, and there holds an $L^1$-inversion theorem, which we recall for later reference: If $f \in L^1(\mathbb{R}^N, w_k(x) dx)$ with $D_k f \in L^1(\mathbb{R}^N, w_k(x) dx)$, then $f = 4^{-\gamma - N/2}\, c_k^2\, E_k D_k f$ a.e., where $E_k f(x) = D_k f(-x)$. (Note that $D_k(e^{-|x|^2/2})(0) = 2^{\gamma + N/2} c_k^{-1}$, which gives the connection of our constant $c_k$ with that of de Jeu.) Moreover, the Schwartz space $S(\mathbb{R}^N)$ of rapidly decreasing functions on $\mathbb{R}^N$ is invariant under $D_k$, and $D_k$ can be extended to a Plancherel transform on $L^2(\mathbb{R}^N, w_k(x) dx)$. For details see [dJ].

3. Generalized Hermite Polynomials and Hermite Functions

Let $\{\varphi_\nu, \nu \in \mathbb{Z}_+^N\}$ be an orthonormal basis of $\mathcal{P}$ with respect to the scalar product $[. , .]_k$ such that $\varphi_\nu \in \mathcal{P}_{|\nu|}$ and the coefficients of the $\varphi_\nu$ are real. As $\mathcal{P} = \bigoplus_{n \ge 0} \mathcal{P}_n$ and $\mathcal{P}_n \perp \mathcal{P}_m$ for $n \ne m$, the $\varphi_\nu$ with $|\nu| = n$ can for example be constructed by Gram-Schmidt orthogonalization within $\mathcal{P}_n$ from an arbitrary ordered real-coefficient basis of $\mathcal{P}_n$. If $k = 0$, the Dunkl operator $T_i$ reduces to the usual partial derivative $\partial_i$, and the canonical choice of the basis $\{\varphi_\nu\}$ is just $\varphi_\nu(x) := (\nu!)^{-1/2} x^\nu$. As in the classical case, we have the following connection of the basis $\{\varphi_\nu\}$ with the generalized exponential function $K$ and its homogeneous parts $K_n$:

Lemma 3.1. (i) $K_n(z, w) = \sum_{|\nu| = n} \varphi_\nu(z)\, \varphi_\nu(w)$ for all $z, w \in \mathbb{C}^N$.

(ii) $K(x, y) = \sum_{\nu \in \mathbb{Z}_+^N} \varphi_\nu(x)\, \varphi_\nu(y)$ for all $x, y \in \mathbb{R}^N$, where the convergence is absolute and locally uniform on $\mathbb{R}^N \times \mathbb{R}^N$.

Proof. (i) It suffices to consider the case $z, w \in \mathbb{R}^N$. So fix some $w \in \mathbb{R}^N$. As a function of $z$, the polynomial $K_n(z, w)$ is homogeneous of degree $n$. Hence we have

$$K_n(z, w) = \sum_{|\nu| = n} c_{\nu, w}\, \varphi_\nu(z) \quad \text{with } c_{\nu, w} = [K_n(. , w), \varphi_\nu]_k.$$

Repeated application of formula (2.6) for $K_n$ gives

$$c_{\nu, w} = \varphi_\nu(T^z) K_n(z, w) = \varphi_\nu(w)\, K_0(z, w) = \varphi_\nu(w).$$

Thus part (i) is proved. For (ii), first note that by (2.4) we have $|K_n(x, x)| \le \frac{1}{n!} |x|^{2n}$ and hence, as the $\varphi_\nu(x)$ are real, $|\varphi_\nu(x)| \le \frac{1}{\sqrt{n!}} |x|^n$ for all $x \in \mathbb{R}^N$ and all $\nu$ with $|\nu| = n$. It follows that for each $M > 0$ the sum $\sum_{\nu \in \mathbb{Z}_+^N} |\varphi_\nu(x) \varphi_\nu(y)|$ is majorized on $\{(x, y) : |x|, |y| \le M\}$ by the convergent series $\sum_{n \ge 0} \binom{n + N - 1}{n} M^{2n}/n!$. This yields the assertion.


For homogeneous polynomials $p, q \in \mathcal{P}_n$, relation (1.4) can be rescaled (by use of formula (2.2)):

$$[p, q]_k = 2^n \int_{\mathbb{R}^N} e^{-\Delta_k/4} p(x)\; e^{-\Delta_k/4} q(x)\; d\mu_k(x). \qquad (3.1)$$

This suggests defining a generalized multivariable Hermite polynomial system on $\mathbb{R}^N$ as follows:

Definition 3.2. The generalized Hermite polynomials $\{H_\nu, \nu \in \mathbb{Z}_+^N\}$ associated with the basis $\{\varphi_\nu\}$ on $\mathbb{R}^N$ are given by

$$H_\nu(x) := 2^{|\nu|}\, e^{-\Delta_k/4} \varphi_\nu(x) = 2^{|\nu|} \sum_{n=0}^{\lfloor |\nu|/2 \rfloor} \frac{(-1)^n}{4^n\, n!}\, \Delta_k^n \varphi_\nu(x). \qquad (3.2)$$

Moreover, we define the generalized Hermite functions on $\mathbb{R}^N$ by

$$h_\nu(x) := e^{-|x|^2/2}\, H_\nu(x), \qquad \nu \in \mathbb{Z}_+^N. \qquad (3.3)$$

Note that $H_\nu$ is a polynomial of degree $|\nu|$ satisfying $H_\nu(-x) = (-1)^{|\nu|} H_\nu(x)$ for all $x \in \mathbb{R}^N$. A standard argument shows that $\mathcal{P}$ is dense in $L^2(\mathbb{R}^N, d\mu_k)$. Thus by virtue of (3.1) the $\{2^{-|\nu|/2} H_\nu, \nu \in \mathbb{Z}_+^N\}$ form an orthonormal basis of $L^2(\mathbb{R}^N, d\mu_k)$. Let us give two immediate examples:

Examples 3.3. (1) In the classical case $k = 0$ and $\varphi_\nu(x) := (\nu!)^{-1/2} x^\nu$, we obtain

$$H_\nu(x) = \frac{2^{|\nu|}}{\sqrt{\nu!}} \prod_{i=1}^N e^{-\partial_i^2/4} (x_i^{\nu_i}) = \frac{1}{\sqrt{\nu!}} \prod_{i=1}^N \widehat{H}_{\nu_i}(x_i),$$

where the $\widehat{H}_n$, $n \in \mathbb{Z}_+$ denote the classical Hermite polynomials on $\mathbb{R}$ defined by

$$e^{-x^2}\, \widehat{H}_n(x) = (-1)^n\, \frac{d^n}{dx^n}\, e^{-x^2}.$$

(2) For the reflection group $G = \mathbb{Z}_2$ on $\mathbb{R}$ and multiplicity parameter $\mu \ge 0$, the polynomial basis $\{\varphi_n\}$ on $\mathbb{R}$ with respect to $[. , .]_\mu$ is determined uniquely (up to sign-changes) by suitable normalization of the monomials $\{x^n, n \in \mathbb{Z}_+\}$. One obtains $H_n(x) = d_n H_n^\mu(x)$, where $d_n \in \mathbb{R} \setminus \{0\}$ are constants and the $H_n^\mu$, $n \in \mathbb{Z}_+$ are the generalized Hermite polynomials on $\mathbb{R}$ as introduced e.g. in [Chi] and studied in [Ros] (in some different normalization). They are orthogonal with respect to $|x|^{2\mu} e^{-|x|^2}$ and can be written as

$$H_{2n}^\mu(x) = (-1)^n\, 2^{2n}\, n!\, L_n^{\mu - 1/2}(x^2), \qquad H_{2n+1}^\mu(x) = (-1)^n\, 2^{2n+1}\, n!\, x\, L_n^{\mu + 1/2}(x^2);$$

here the $L_n^\alpha$ are the Laguerre polynomials of index $\alpha \ge -1/2$, given by

$$L_n^\alpha(x) = \frac{1}{n!}\, x^{-\alpha} e^x\, \frac{d^n}{dx^n} \big( x^{n + \alpha} e^{-x} \big).$$
526

M. R¨osler

Before discussing further examples, we are going to establish generalizations of the classical second order differential equations for Hermite polynomials and Hermite functions. For their proof we shall employ the $sl(2)$-commutation relations of the operators

$$E := \frac{1}{2} |x|^2, \qquad F := -\frac{1}{2} \Delta_k \qquad \text{and} \qquad H := \sum_{i=1}^N x_i \partial_i + \gamma + N/2$$

on $\mathcal{P}$, which can be found e.g. in [H]; they are

$$[H, E] = 2E, \qquad [H, F] = -2F, \qquad [E, F] = H. \qquad (3.4)$$

(As usual, $[A, B] = AB - BA$ for operators $A, B$ on $\mathcal{P}$.) The first two relations are immediate consequences of the fact that the Euler operator $\rho := \sum_{i=1}^N x_i \partial_i$ satisfies $\rho(p) = np$ for each homogeneous $p \in \mathcal{P}_n$. We have the following general result:

Theorem 3.4. (1) For $n \in \mathbb{Z}_+$ set $V_n := \{e^{-\Delta_k/4} p : p \in \mathcal{P}_n\}$. Then $\mathcal{P} = \bigoplus_{n \in \mathbb{Z}_+} V_n$, and $V_n$ is the eigenspace of the operator $\Delta_k - 2\rho$ on $\mathcal{P}$ corresponding to the eigenvalue $-2n$.

(2) For $q \in V_n$, the function $f(x) := e^{-|x|^2/2} q(x)$ satisfies $\big(\Delta_k - |x|^2\big) f = -(2n + 2\gamma + N) f$.

Proof. (1) It is clear that $\mathcal{P} = \bigoplus V_n$. By induction from (3.4) we obtain the commutation relations $[2\rho, \Delta_k^n] = -4n \Delta_k^n$ for all $n \in \mathbb{Z}_+$, hence $[2\rho, e^{-\Delta_k/4}] = \Delta_k e^{-\Delta_k/4}$. For arbitrary $q \in \mathcal{P}$ and $p := e^{\Delta_k/4} q$ it now follows that

$$2\rho(q) = (2\rho\, e^{-\Delta_k/4})(p) = 2 e^{-\Delta_k/4} \rho(p) + \Delta_k e^{-\Delta_k/4} p = 2 e^{-\Delta_k/4} \rho(p) + \Delta_k q.$$

Hence for $a \in \mathbb{C}$ the following are equivalent:

$$(\Delta_k - 2\rho)(q) = -2aq \iff \rho(p) = ap \iff a = n \in \mathbb{Z}_+ \text{ and } p \in \mathcal{P}_n.$$

This yields the assertion.

(2) From (3.4) it is easily verified by induction that $[\Delta_k, E^n] = 2n E^{n-1} H + 2n(n - 1) E^{n-1}$ for all $n \in \mathbb{N}$, and thus $[\Delta_k, e^{-E}] = -2 e^{-E} H + 2E e^{-E}$. It follows that

$$(\Delta_k - |x|^2) f = \Delta_k (e^{-E} q) - 2E e^{-E} q = e^{-E} \Delta_k q - 2 e^{-E} (\rho + \gamma + N/2)\, q.$$

The stated relation is now a consequence of (1).
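The commutation relations (3.4) can be verified directly on polynomials in the rank-one case, where $\gamma = \mu$ and $N = 1$, so that $H = x\, d/dx + \mu + 1/2$. The sketch below is one way to do this with exact arithmetic; the operator names `op_E`, `op_F`, `op_H` and the helper `commutator` are assumptions introduced for the illustration.

```python
from fractions import Fraction


def dunkl(p, mu):
    """Rank-one Dunkl operator on a coefficient list (p[n] multiplies x**n)."""
    out = [Fraction(0)] * max(len(p) - 1, 1)
    for n in range(1, len(p)):
        out[n - 1] = (n + mu * (1 - (-1) ** n)) * Fraction(p[n])
    return out


def op_E(p, mu):
    """E = |x|^2 / 2: multiplication by x^2 / 2."""
    return [Fraction(0), Fraction(0)] + [Fraction(a) / 2 for a in p]


def op_F(p, mu):
    """F = -Delta_mu / 2 with Delta_mu = T^2."""
    return [-a / 2 for a in dunkl(dunkl(p, mu), mu)]


def op_H(p, mu):
    """H = x d/dx + gamma + N/2, here with gamma = mu and N = 1."""
    return [(n + mu + Fraction(1, 2)) * a for n, a in enumerate(p)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def commutator(A, B, p, mu):
    """[A, B] p = A(B p) - B(A p) as a trimmed coefficient list."""
    ab, ba = A(B(p, mu), mu), B(A(p, mu), mu)
    size = max(len(ab), len(ba))
    ab = ab + [Fraction(0)] * (size - len(ab))
    ba = ba + [Fraction(0)] * (size - len(ba))
    return trim([s - t for s, t in zip(ab, ba)])


mu = Fraction(2, 3)
p = [Fraction(c) for c in (1, 2, 3, 4, 5)]   # 1 + 2x + 3x^2 + 4x^3 + 5x^4
print(commutator(op_E, op_F, p, mu) == trim(op_H(p, mu)))
```

Testing on a polynomial with both even and odd powers matters here, since the Dunkl operator treats the two parities differently.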

Corollary 3.5. (i) The generalized Hermite polynomials satisfy the following differential-difference equation:

$$\Big( \Delta_k - 2 \sum_{i=1}^N x_i \partial_i \Big) H_\nu = -2|\nu|\, H_\nu, \qquad \nu \in \mathbb{Z}_+^N.$$


(ii) The generalized Hermite functions $\{h_\nu, \nu \in \mathbb{Z}_+^N\}$ form a complete set of eigenfunctions for the operator $\Delta_k - |x|^2$ on $L^2(\mathbb{R}^N, w_k(x) dx)$ with

$$\big( \Delta_k - |x|^2 \big) h_\nu = -(2|\nu| + 2\gamma + N)\, h_\nu.$$

Note also that as a consequence of the above theorem, the operator $\Delta_k - 2\rho$ has for each $p \in \mathcal{P}_n$ a unique polynomial eigenfunction $q$ of the form $q = p + r$, where the degree of $r$ is less than $n$; it is given by $q = e^{-\Delta_k/4} p$.

Examples 3.6. (3) The $S_N$-case. For the symmetric group $G = S_N$ (acting on $\mathbb{R}^N$ by permuting the coordinates), the multiplicity function is characterized by a single parameter which is often denoted by $1/\alpha > 0$, and the corresponding weight function is given by $w_S(x) = \prod_{i<j} |x_i - x_j|^{2/\alpha}$. The associated Dunkl operators are

$$T_i^S = \partial_i + \frac{1}{\alpha} \sum_{j \ne i} \frac{1 - s_{ij}}{x_i - x_j} \qquad (i = 1, \ldots, N),$$

where $s_{ij}$ denotes the operator transposing $x_i$ and $x_j$. The operator $\Delta_S - 2\rho$ is a Schrödinger operator of Calogero-Sutherland type, involving exchange terms and an external harmonic potential, see [B-F2 and B-F3]. It is given explicitly by

$$\Delta_S - 2\rho = \Delta - 2 \sum_{i=1}^N x_i \partial_i + \frac{2}{\alpha} \sum_{i<j} \frac{1}{x_i - x_j} \Big[ (\partial_i - \partial_j) - \frac{1 - s_{ij}}{x_i - x_j} \Big]. \qquad (3.5)$$

In [B-F2], Baker and Forrester study "non-symmetric generalized Hermite polynomials" $E_\nu^{(H)}$, which they define as the unique eigenfunctions of (3.5) of the form

$$E_\nu^{(H)} = E_\nu + \sum_{|\mu| < |\nu|} c_{\mu, \nu}\, E_\mu,$$

where the $E_\nu$, $\nu \in \mathbb{Z}_+^N$ are the non-symmetric Jack polynomials (associated with $S_N$ and $\alpha$) as defined e.g. in [O2] (see also [K-S]). Thus $E_\nu^{(H)} = e^{-\Delta_S/4} E_\nu$ (just by Theorem 3.4), and indeed, up to some normalization factors, the $E_\nu^{(H)}$ make up a system of generalized Hermite polynomials for $S_N$ in our sense. This follows from the fact that the non-symmetric Jack polynomials $E_\nu$, being homogeneous of degree $|\nu|$ and forming a vector space basis of $\mathcal{P}$, are also orthogonal with respect to Dunkl's scalar product $[. , .]_S$. This was proved in [B-F3] via orthogonality of the $E_\nu^{(H)}$. A short direct proof can be given as follows: According to [O2], Prop. 2.10, the $E_\nu$ are simultaneous eigenfunctions of the Cherednik operators $\xi_i$ for $S_N$, which were introduced in [C] and can be written as

$$\xi_i = \alpha\, x_i T_i^S + 1 - N + \sum_{j > i} s_{ij} \qquad (i = 1, \ldots, N). \qquad (3.6)$$

In fact, the $E_\nu$ satisfy $\xi_i E_\nu = \bar\nu_i E_\nu$, where the eigenvalues $\bar\nu = (\bar\nu_1, \ldots, \bar\nu_N)$ are given explicitly in [O2]. They are distinct, i.e. if $\nu \ne \mu$, then $\bar\nu \ne \bar\mu$. On the other hand, it follows at once from (3.6) together with properties (1.3) for $[. , .]_S$ that the Cherednik operators $\xi_i$ are symmetric with respect to $[. , .]_S$. Together, this proves that the $E_\nu$ are orthogonal with respect to $[. , .]_S$. Hence a possible choice for the basis $\{\varphi_\nu\}$ is to set $\varphi_\nu = d_\nu E_\nu$, with some normalization constants $d_\nu > 0$.


We finally remark that in this case the locally uniform convergence of the series in Lemma 3.1(ii) extends to $\mathbb{C}^N \times \mathbb{C}^N$, see also [B-F3], Prop. 3.10. This is because the coefficients of the non-symmetric Jack polynomials $E_\nu$ in their monomial expansions are known to be nonnegative ([K-S], Theorem 4.11), hence $|E_\nu(z)| \le E_\nu(|z|)$ for all $z \in \mathbb{C}^N$.

(4) A remark on the $B_N$-case. Suppose that $G$ is the Weyl group of type $B_N$, generated by sign-changes and permutations. Here the multiplicity function is characterized by two parameters $k_0, k_1 \ge 0$. The weight function is

$$w_B(x) = \prod_{i=1}^N |x_i|^{2k_1} \prod_{i<j} |x_i^2 - x_j^2|^{2k_0}.$$

Let $T_i^B$ and $\Delta_B$ denote the associated Dunkl operators and Laplacian. We consider the space

$$W := \{ f \in C^1(\mathbb{R}^N) : f(x) = F(x^2) \text{ for some } F \in C^1(\mathbb{R}^N) \}$$

of "completely even" functions; here $x^2 = (x_1^2, \ldots, x_N^2)$. It is easily checked that for completely even $f$, $\Delta_B f$ is also completely even. The restriction of $\Delta_B$ to $W$ is given by

$$\Delta_B|_W = \Delta + 2k_1 \sum_{i=1}^N \frac{1}{x_i}\, \partial_i + 2k_0 \sum_{i<j} \Big[ \frac{1}{x_i - x_j} (\partial_i - \partial_j) + \frac{1}{x_i + x_j} (\partial_i + \partial_j) \Big] - 2k_0 \sum_{i<j} \Big[ \frac{1}{(x_i - x_j)^2} + \frac{1}{(x_i + x_j)^2} \Big] (1 - s_{ij}).$$

Again, the operator $(\Delta_B - 2\rho)|_W$ is of Calogero-Sutherland type. Its completely even polynomial eigenfunctions are discussed in [B-F2 and B-F3] separately from the Hermite case; they are called "non-symmetric Laguerre polynomials" and denoted by $E_\nu^{(L)}(x^2)$. It is easy to see that they make up the completely even subsystem of a suitably chosen generalized Hermite system $\{H_\nu\}$ for $B_N$ (and parameters $k_0, k_1$, where we assume $k_0 > 0$): To this end, let again $E_\nu$ denote the $S_N$-type non-symmetric Jack polynomials, corresponding to $\alpha = 1/k_0$. For $\nu \in \mathbb{Z}_+^N$ set $\widehat{E}_\nu(x) := E_\nu(x^2)$. These modified Jack polynomials form a basis of $\mathcal{P} \cap W$. The non-symmetric Laguerre polynomials of Baker and Forrester can be written as

$$E_\nu^{(L)}(x^2) = e^{-\Delta_B/4}\, \widehat{E}_\nu(x).$$

(Note that the polynomials on the right side are in fact completely even and eigenfunctions of $\Delta_B - 2\rho$.) Involving again the $S_N$-type Cherednik operators from (3), it is easily checked that the $\widehat{E}_\nu$ are orthogonal with respect to Dunkl's pairing $[. , .]_B$: The $\xi_i$ induce operators $\widehat\xi_i$ $(i = 1, \ldots, N)$ on $W$ by

$$\widehat\xi_i f(x) := (\xi_i F)(x^2) \quad \text{if } f(x) = F(x^2),$$

cf. [B-F3]. Thus $\widehat\xi_i \widehat{E}_\nu = \bar\nu_i \widehat{E}_\nu$, and a short calculation gives

$$\widehat\xi_i f(x) = \alpha\, x_i^2\, (T_i^S F)(x^2) + 1 - N + \sum_{j > i} s_{ij} = \frac{\alpha}{2}\, x_i T_i^B f(x) + 1 - N + \sum_{j > i} s_{ij}.$$


Together with (1.3), this shows that the $\widehat\xi_i$ are symmetric with respect to $[. , .]_B$ on $\mathcal{P} \cap W$, and yields our assertion by the same argument as in the previous example. We therefore obtain an orthonormal basis $\{\varphi_\nu\}$ of $\mathcal{P}$ with respect to $[. , .]_B$ by setting $\varphi_\nu := d_\nu \widehat{E}_\eta$ for $\nu = (2\eta_1, \ldots, 2\eta_N)$ and completing the set $\{\varphi_\nu, \nu \in (2\mathbb{Z}_+)^N\}$ by a Gram-Schmidt procedure.

Many properties of the classical Hermite polynomials and Hermite functions on $\mathbb{R}^N$ have natural extensions to our general setting. We start with a Rodrigues formula; for $S_N$-type symmetric Hermite polynomials such a formula, involving the (symmetric) Jack polynomials, is known, see e.g. [K].

Theorem 3.7. For all $\nu \in \mathbb{Z}_+^N$ and $x \in \mathbb{R}^N$ we have

$$H_\nu(x) = (-1)^{|\nu|}\, e^{|x|^2}\, \varphi_\nu(T)\, e^{-|x|^2}. \qquad (3.7)$$

Proof. First note that if $p$ is a polynomial of degree $n \ge 0$, then

$$p(T)\, e^{-|x|^2} = q(x)\, e^{-|x|^2}$$

with a polynomial $q$ of the same degree. This follows easily by induction on the degree of $p$, together with the product rule (2.1). In particular, the function

$$Q_\nu(x) := (-1)^{|\nu|}\, e^{|x|^2}\, \varphi_\nu(T)\, e^{-|x|^2} = e^{|x|^2}\, \varphi_\nu(-T)\, e^{-|x|^2}$$

is a polynomial of degree $|\nu|$. In order to prove that $Q_\nu = H_\nu$, it therefore suffices to show that for each $\eta \in \mathbb{Z}_+^N$ with $|\eta| \le |\nu|$,

$$2^{-|\eta|} \int_{\mathbb{R}^N} Q_\nu(x)\, H_\eta(x)\, d\mu_k(x) = \delta_{\nu, \eta}, \qquad (3.8)$$

where $\delta_{\nu, \eta}$ denotes the Kronecker delta. Using the antisymmetry of the $T_i$ with respect to $L^2(\mathbb{R}^N, w_k(x) dx)$ (Lemma 2.9 of [D4]) as well as the commutativity of $\{T_i\}$, we can write

$$2^{-|\eta|} \int_{\mathbb{R}^N} Q_\nu(x)\, H_\eta(x)\, d\mu_k(x) = c_k \int_{\mathbb{R}^N} \varphi_\nu(-T) \big( e^{-|x|^2} \big)\, e^{-\Delta_k/4} \varphi_\eta(x)\, w_k(x)\, dx$$

$$= c_k \int_{\mathbb{R}^N} e^{-|x|^2}\, \varphi_\nu(T)\, e^{-\Delta_k/4} \varphi_\eta(x)\, w_k(x)\, dx = \int_{\mathbb{R}^N} e^{-\Delta_k/4}\, \varphi_\nu(T)\, \varphi_\eta(x)\, d\mu_k(x).$$

But as $|\eta| \le |\nu|$, we have $\varphi_\nu(T)\, \varphi_\eta = [\varphi_\nu, \varphi_\eta]_k = \delta_{\nu, \eta}$, from which (3.8) follows.

There is also a generating function for the generalized Hermite polynomials:

Proposition 3.8. For $n \in \mathbb{Z}_+$ and $z, w \in \mathbb{C}^N$ put $L_n(z, w) := \sum_{|\nu| = n} H_\nu(z)\, \varphi_\nu(w)$. Then

$$\sum_{n=0}^\infty L_n(z, w) = e^{-l(w)}\, K(2z, w),$$

the convergence of the series being locally uniform on $\mathbb{C}^N \times \mathbb{C}^N$.


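Proof. Suppose first that $z, w \in \mathbb{R}^N$. By definition of the $H_\nu$ and in view of formula (2.6) for $K_n$ we may write

$$L_n(z, w) = 2^n\, e^{-\Delta_k^z/4} K_n(z, w) = 2^n \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{(-1)^m}{4^m\, m!}\, l(w)^m\, K_{n-2m}(z, w) = \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{(-1)^m}{m!}\, l(w)^m\, K_{n-2m}(2z, w)$$

for all $n \in \mathbb{Z}_+$. By analytic continuation, this holds for all $z, w \in \mathbb{C}^N$ as well. Using estimation (2.4), one obtains

$$S_n(z, w) := \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{1}{m!}\, |l(w)|^m\, |K_{n-2m}(2z, w)| \le \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{1}{m!}\, |w|^{2m} \cdot \frac{|2z|^{n-2m}\, |w|^{n-2m}}{(n - 2m)!}.$$

If $n$ is even, set $k := n/2$ and estimate further as follows:

$$S_n(z, w) \le \frac{|w|^{2k}}{k!} \sum_{m=0}^k \binom{k}{m} (2|z|^2)^{k-m} = \frac{1}{k!} \big( |w|^2 (1 + 2|z|^2) \big)^k.$$

A similar estimation holds if $n$ is odd. This entails the locally uniform convergence of the series $\sum_{n=0}^\infty L_n(z, w)$ on $\mathbb{C}^N \times \mathbb{C}^N$, and also that

$$\sum_{n=0}^\infty L_n(z, w) = \sum_{n=0}^\infty \sum_{m=0}^\infty \frac{(-1)^m}{m!}\, l(w)^m\, K_{n-2m}(2z, w) \qquad (\text{with } K_j := 0 \text{ for } j < 0)$$

$$= \sum_{m=0}^\infty \frac{(-1)^m}{m!}\, l(w)^m \sum_{n=0}^\infty K_{n-2m}(2z, w) = e^{-l(w)}\, K(2z, w)$$

for all $z, w \in \mathbb{C}^N$.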
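In the classical case $k = 0$, $N = 1$ one has $K(a, b) = e^{ab}$, $l(w) = w^2$ and $L_n(z, w) = \widehat{H}_n(z)\, w^n / n!$, so Proposition 3.8 reduces to the familiar generating function $\sum_n \widehat{H}_n(z)\, w^n/n! = e^{2zw - w^2}$. The following floating-point sketch checks this special case numerically; the truncation at 80 terms and the function names are assumptions made for the example.

```python
from math import exp


def generating_sum(z, w, terms=80):
    """Partial sum of sum_n Hhat_n(z) w^n / n! for the classical Hermite polynomials."""
    h_prev, h_cur = 1.0, 2.0 * z       # Hhat_0(z) and Hhat_1(z)
    total = h_prev                     # contribution of n = 0
    weight = 1.0                       # running value of w^n / n!
    for n in range(1, terms):
        weight *= w / n
        total += h_cur * weight
        # three-term recurrence Hhat_{n+1} = 2 z Hhat_n - 2 n Hhat_{n-1}
        h_prev, h_cur = h_cur, 2.0 * z * h_cur - 2.0 * n * h_prev
    return total


def closed_form(z, w):
    """e^{-l(w)} K(2z, w) with l(w) = w^2 and K(a, b) = e^{ab}."""
    return exp(-w * w + 2.0 * z * w)


print(generating_sum(0.8, -0.6), closed_form(0.8, -0.6))
```

The check covers real arguments only; the proposition itself is stated for complex $z, w$.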

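Applying Lemma 2.1 to $p = \varphi_\nu$ with $c = -1/4$ and $a = 1/\lambda$, we obtain the following formula for the generalized Hermite polynomials:

Lemma 3.9. For $\lambda \in \mathbb{C} \setminus \{0\}$, $\nu \in \mathbb{Z}_+^N$ and $x \in \mathbb{R}^N$,

$$\Big( \frac{\lambda}{2} \Big)^{|\nu|} H_\nu\Big( \frac{x}{\lambda} \Big) = e^{-\lambda^2 \Delta_k/4}\, \varphi_\nu(x).$$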

Proposition 3.10. The generalized Hermite functions $\{h_\nu,\ \nu \in \mathbb{Z}_+^N\}$ are a basis of eigenfunctions of the Dunkl transform $D_k$ on $L^2(\mathbb{R}^N, w_k(x)dx)$, satisfying
\[
D_k(h_\nu) = 2^{\gamma+N/2}\, c_k^{-1}\, (-i)^{|\nu|}\, h_\nu.
\]

Proof. We use Prop. 2.1 from [D4], which says that for all $p \in \mathcal{P}$ and $z \in \mathbb{C}^N$,
\[
\frac{c_k}{2^{\gamma+N/2}} \int_{\mathbb{R}^N} e^{-\Delta_k/2} p(x)\, K(x,z)\, w_k(x)\, e^{-|x|^2/2}\, dx = e^{l(z)/2}\, p(z). \tag{3.9}
\]
Here again, $l(z) = \sum_{i=1}^{N} z_i^2$. Let $p_\nu(x) := e^{\Delta_k/2} H_\nu(x)$. In view of (3.9) we can write
\[
D_k(h_\nu)(\xi) = \int_{\mathbb{R}^N} H_\nu(x)\, K(-i\xi, x)\, w_k(x)\, e^{-|x|^2/2}\, dx = 2^{\gamma+N/2}\, c_k^{-1}\, e^{-|\xi|^2/2}\, p_\nu(-i\xi)
\]
for all $\xi \in \mathbb{R}^N$. By definition of $H_\nu$ we have $p_\nu(x) = 2^{|\nu|} e^{\Delta_k/4} \varphi_\nu(x)$. So we arrive at
\[
D_k(h_\nu)(\xi) = 2^{\gamma+N/2}\, c_k^{-1}\, e^{-|\xi|^2/2}\, 2^{|\nu|}\, e^{\Delta_k/4} \varphi_\nu(-i\xi).
\]
Application of Lemma 3.9 with $\lambda = -i$ now yields that $e^{\Delta_k/4} \varphi_\nu(-i\xi) = (-i/2)^{|\nu|} H_\nu(\xi)$, hence
\[
D_k(h_\nu)(\xi) = 2^{\gamma+N/2}\, c_k^{-1}\, (-i)^{|\nu|}\, h_\nu(\xi).
\]
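In the classical case $k = 0$, $N = 1$ the eigenvalue relation of Proposition 3.10 reduces to the familiar statement that Hermite functions are eigenfunctions of the Fourier transform. The following sketch checks this numerically under assumptions that are not stated in this section: $K(-i\xi, x) = e^{-i\xi x}$, $w_0 \equiv 1$, $\gamma = 0$, $c_0 = \pi^{-1/2}$ (so that $2^{\gamma+N/2} c_k^{-1} = \sqrt{2\pi}$), $h_n(x) = e^{-x^2/2} H_n(x)$, and $H_n$ the classical Hermite polynomial up to a factor $1/\sqrt{n!}$ that cancels on both sides. The function names are hypothetical.

```python
import cmath
import math


def hermite_value(n, x):
    """Classical (physicists') Hermite polynomial H_n(x) via the three-term recurrence."""
    previous, current = 1.0, 2.0 * x
    if n == 0:
        return previous
    for m in range(1, n):
        previous, current = current, 2.0 * x * current - 2.0 * m * previous
    return current


def hermite_function(n, x):
    """h_n(x) = exp(-x^2 / 2) H_n(x)."""
    return math.exp(-x * x / 2.0) * hermite_value(n, x)


def transform_k0(n, xi, half_width=12.0, steps=6000):
    """Trapezoid approximation of int h_n(x) exp(-i xi x) dx (assumed k = 0 Dunkl transform)."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * hermite_function(n, x) * cmath.exp(-1j * xi * x)
    return total * h


def predicted(n, xi):
    """Right-hand side sqrt(2 pi) (-i)^n h_n(xi) of the eigenvalue relation for k = 0, N = 1."""
    return math.sqrt(2.0 * math.pi) * (-1j) ** n * hermite_function(n, xi)


print(transform_k0(3, 0.7), predicted(3, 0.7))
```

Both values should agree up to quadrature and rounding error; the check covers only the classical special case.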

We finish this section with a Mehler-type formula for the generalized Hermite polynomials. For this, we need the following integral representation:

Lemma 3.11. For all $x, y \in \mathbb{R}^N$ and $\nu \in \mathbb{Z}_+^N$ we have
\[
e^{-|x|^2} H_\nu(x) = 2^{|\nu|} \int_{\mathbb{R}^N} K(x, -2iy)\, \varphi_\nu(iy)\, d\mu_k(y).
\]

Proof. A short calculation, using again relation (2.2), shows that for homogeneous polynomials $p$ formula (3.9) may be rewritten as
\[
\int_{\mathbb{R}^N} e^{-\Delta_k/4} p(x)\, K(x, 2z)\, d\mu_k(x) = e^{l(z)}\, p(z) \qquad (z \in \mathbb{C}^N). \tag{3.10}
\]
By linearity, this holds for all $p \in \mathcal{P}$. Lemma 3.9 with $\lambda = i$ further shows that
\[
e^{\Delta_k/4} \varphi_\nu(x) = \Bigl(\frac{i}{2}\Bigr)^{|\nu|} H_\nu(-ix).
\]
As $\varphi_\nu$ is homogeneous of degree $|\nu|$, we thus can write $\varphi_\nu(2iy) = (-i)^{|\nu|} e^{-\Delta_k/4} H_\nu^*(y)$ with $H_\nu^*(y) = H_\nu(iy)$. From (3.10) it now follows that
\[
\int_{\mathbb{R}^N} K(x, -2iy)\, \varphi_\nu(2iy)\, d\mu_k(y) = e^{-|x|^2} H_\nu^*(-ix),
\]
which yields the assertion.

Theorem 3.12. (Mehler-formula for the $H_\nu$). For $r \in \mathbb{C}$ with $|r| < 1$ and all $x, y \in \mathbb{R}^N$,
\[
\sum_{\nu \in \mathbb{Z}_+^N} \frac{H_\nu(x) H_\nu(y)}{2^{|\nu|}}\, r^{|\nu|}
= \frac{1}{(1-r^2)^{\gamma+N/2}} \exp\Bigl(-\frac{r^2 (|x|^2 + |y|^2)}{1-r^2}\Bigr)\, K\Bigl(\frac{2rx}{1-r^2},\, y\Bigr).
\]
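Before the proof, the statement can be compared with the classical Mehler formula. The sketch below assumes the case $k = 0$, $N = 1$ with the conventions $\gamma = 0$, $K(x,y) = e^{xy}$ and $H_n = H_n^{\mathrm{cl}}/\sqrt{n!}$, where $H_n^{\mathrm{cl}}$ is the physicists' Hermite polynomial; these identifications are assumptions of the illustration, not statements from this section, and the function names are hypothetical.

```python
import math


def hermite_values(n_max, x):
    """Classical (physicists') Hermite polynomials H_0..H_{n_max} at x."""
    values = [1.0, 2.0 * x]
    for n in range(1, n_max):
        values.append(2.0 * x * values[n] - 2.0 * n * values[n - 1])
    return values[: n_max + 1]


def mehler_series(x, y, r, n_max=80):
    """Partial sum of sum_n H_n(x) H_n(y) (r/2)^n / n! (left-hand side for k = 0, N = 1)."""
    hx, hy = hermite_values(n_max, x), hermite_values(n_max, y)
    total, scale = 0.0, 1.0  # scale tracks (r/2)^n / n!
    for n in range(n_max + 1):
        total += hx[n] * hy[n] * scale
        scale *= (r / 2.0) / (n + 1)
    return total


def mehler_closed_form(x, y, r):
    """(1 - r^2)^{-1/2} exp(-r^2 (x^2 + y^2) / (1 - r^2)) exp(2 r x y / (1 - r^2))."""
    s = 1.0 - r * r
    return math.exp(-r * r * (x * x + y * y) / s) * math.exp(2.0 * r * x * y / s) / math.sqrt(s)


x, y, r = 0.7, -0.4, 0.3
print(mehler_series(x, y, r), mehler_closed_form(x, y, r))
```

For moderate $|r| < 1$ the partial sum converges quickly; closer to $|r| = 1$ more terms are needed and the floating-point recurrence becomes less reliable.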

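Proof. Consider the integral
\[
M(x,y,r) := c_k^2 \int_{\mathbb{R}^N \times \mathbb{R}^N} K(-2rz, v)\, K(-2iz, x)\, K(-2iv, y)\, w_k(z)\, w_k(v)\, e^{-(|z|^2 + |v|^2)}\, d(z,v).
\]
The bounds (2.4) and (2.5) on $K$ assure that it converges for all $r \in \mathbb{C}$ with $|r| < 1$ and all $x, y \in \mathbb{R}^N$. Now write $K(-2rz, v) = \sum_{n=0}^{\infty} (2r)^n K_n(iz, iv)$ in the integral above. As
\[
\sum_{n=0}^{\infty} |2r|^n\, |K_n(iz, iv)| \le e^{2|r||z||v|},
\]
the dominated convergence theorem yields that
\[
\begin{aligned}
M(x,y,r) &= \sum_{n=0}^{\infty} (2r)^n \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} K_n(iz, iv)\, K(-2iz, x)\, K(-2iv, y)\, d\mu_k(z)\, d\mu_k(v) \\
&= \sum_{n=0}^{\infty} (2r)^n \sum_{|\nu|=n} \Bigl( \int_{\mathbb{R}^N} K(-2iz, x)\, \varphi_\nu(iz)\, d\mu_k(z) \Bigr) \Bigl( \int_{\mathbb{R}^N} K(-2iv, y)\, \varphi_\nu(iv)\, d\mu_k(v) \Bigr).
\end{aligned}
\]
From the above lemma we thus obtain
\[
M(x,y,r) = e^{-(|x|^2 + |y|^2)} \sum_{\nu \in \mathbb{Z}_+^N} r^{|\nu|}\, \frac{H_\nu(x) H_\nu(y)}{2^{|\nu|}}, \tag{3.11}
\]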

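and this series, as a power series in $r$, converges absolutely for all $x, y \in \mathbb{R}^N$. On the other hand, iterated integration and repeated application of formula (2.3) and the reproducing formula (2.8) show that for real $r$ with $|r| < 1$ we have
\[
\begin{aligned}
M(x,y,r) &= c_k \int_{\mathbb{R}^N} \Bigl( \int_{\mathbb{R}^N} K(-2rz, v)\, K(-2iy, v)\, d\mu_k(v) \Bigr) K(-2iz, x)\, e^{-|z|^2}\, w_k(z)\, dz \\
&= c_k\, e^{-|y|^2} \int_{\mathbb{R}^N} e^{(r^2-1)|z|^2}\, K(2iry, z)\, K(-2ix, z)\, w_k(z)\, dz \\
&= c_k\, (1-r^2)^{-(\gamma+N/2)}\, e^{-|y|^2} \int_{\mathbb{R}^N} e^{-|u|^2}\, K\Bigl(u, \frac{2iry}{\sqrt{1-r^2}}\Bigr)\, K\Bigl(u, \frac{-2ix}{\sqrt{1-r^2}}\Bigr)\, w_k(u)\, du \\
&= (1-r^2)^{-(\gamma+N/2)} \exp\Bigl(-\frac{|x|^2 + |y|^2}{1-r^2}\Bigr)\, K\Bigl(\frac{2rx}{1-r^2},\, y\Bigr).
\end{aligned}
\]
By analytic continuation, this holds for $\{r \in \mathbb{C} : |r| < 1\}$ as well. Together with (3.11), this finishes the proof.

4. The Heat Equation for Dunkl Operators

As before, let $\Delta_k$ denote the generalized Laplacian associated with some finite reflection group $G$ on $\mathbb{R}^N$ and a multiplicity function $k \ge 0$ on its root system $R$. Recall that its action on $C^2(\mathbb{R}^N)$ is given by
\[
\Delta_k f = \Delta f + 2 \sum_{\alpha \in R_+} k(\alpha)\, \delta_\alpha f,
\qquad \text{where} \qquad
\delta_\alpha f(x) = \frac{\langle \nabla f(x), \alpha \rangle}{\langle \alpha, x \rangle} - \frac{f(x) - f(\sigma_\alpha x)}{\langle \alpha, x \rangle^2}.
\]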


Its action may as well be restricted to $C^2(\Omega)$, where $\Omega \subset \mathbb{R}^N$ is open and invariant under the group operation of $G$. We call a function $f \in C^2(\Omega)$ $k$-subharmonic on $\Omega$ if $\Delta_k f \ge 0$ on $\Omega$. The generalized Laplacian satisfies the following maximum principle, which will be important later on:

Lemma 4.1. Let $\Omega \subseteq \mathbb{R}^N$ be open and $G$-invariant. If a real-valued function $f \in C^2(\Omega)$ attains an absolute maximum at $x_0 \in \Omega$, i.e. $f(x_0) = \sup_{x \in \Omega} f(x)$, then $\Delta_k f(x_0) \le 0$.

Proof. Let $D^2 f(x)$ denote the Hessian of $f$ in $x \in \Omega$. The given situation enforces that $\nabla f(x_0) = 0$ and $D^2 f(x_0)$ is negative semi-definite; in particular, $\Delta f(x_0) \le 0$. Moreover, $f(x_0) \ge f(\sigma_\alpha x_0)$ for all $\alpha \in R$, so the statement is obvious in the case that $\langle \alpha, x_0 \rangle \ne 0$ for all $\alpha \in R$. If $\langle \alpha, x_0 \rangle = 0$ for some $\alpha \in R$, we have to argue more carefully: Choose an open ball $B \subseteq \Omega$ with center $x_0$. Then $\sigma_\alpha x \in B$ for $x \in B$, and $\sigma_\alpha x - x = -\langle \alpha, x \rangle \alpha$. Now Taylor's formula yields
\[
f(\sigma_\alpha x) - f(x) = -\langle \alpha, x \rangle \langle \nabla f(x), \alpha \rangle + \frac{1}{2} \langle \alpha, x \rangle^2\, \alpha^t D^2 f(\xi)\, \alpha,
\]
with some $\xi$ on the line segment between $x$ and $\sigma_\alpha x$. It follows that for $x \in B$ with $\langle \alpha, x \rangle \ne 0$ we have $\delta_\alpha f(x) = \frac{1}{2} \alpha^t D^2 f(\xi) \alpha$. Passing to the limit $x \to x_0$ now leads to $\delta_\alpha f(x_0) = \frac{1}{2} \alpha^t D^2 f(x_0) \alpha \le 0$, which finishes the proof.

At this stage it is not much effort to gain a weak maximum principle for $k$-subharmonic functions on bounded, $G$-invariant subsets of $\mathbb{R}^N$, which we want to include here before passing over to the heat equation. Its range of validity is quite general, in contrast to the strong maximum principle in [D1], which is restricted to $k$-harmonic polynomials on the unit ball. Our proof follows the classical one for the usual Laplacian, as it can be found e.g. in [J].

Theorem 4.2. Let $\Omega \subset \mathbb{R}^N$ be open, bounded and $G$-invariant, and let $f \in C^2(\Omega) \cap C(\overline{\Omega})$ be real-valued and $k$-subharmonic on $\Omega$. Then
\[
\max_{\overline{\Omega}}(f) = \max_{\partial\Omega}(f).
\]

Proof. Fix $\varepsilon > 0$ and put $g := f + \varepsilon |x|^2$. A short calculation gives $\Delta_k(|x|^2) = 2N + 4\gamma > 0$. Hence $\Delta_k g > 0$ on $\Omega$, and Lemma 4.1 shows that $g$ cannot achieve its maximum on $\overline{\Omega}$ at any $x_0 \in \Omega$. It follows that $\max_{\overline{\Omega}}(f + \varepsilon |x|^2) = \max_{\partial\Omega}(f + \varepsilon |x|^2)$ for each $\varepsilon > 0$. Consequently,
\[
\max_{\overline{\Omega}}(f) + \varepsilon \min_{\overline{\Omega}} |x|^2 \le \max_{\partial\Omega}(f) + \varepsilon \max_{\partial\Omega} |x|^2.
\]
The assertion now follows with $\varepsilon \to 0$.
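The identity $\Delta_k(|x|^2) = 2N + 4\gamma$ used in the proof can be checked exactly in the rank-one case $N = 1$, $G = \mathbb{Z}_2$, $\gamma = k$. The sketch assumes the rank-one Dunkl operator in the form $Tf(x) = f'(x) + k\,(f(x) - f(-x))/x$ with $\Delta_k = T^2$, which is consistent with the formula for $\delta_\alpha$ above when the root is normalized to $|\alpha|^2 = 2$; the function names and the coefficient-list representation are hypothetical choices for this illustration.

```python
from fractions import Fraction


def dunkl_T(coeffs, k):
    """Apply T f = f' + k (f(x) - f(-x)) / x to the polynomial sum_n coeffs[n] x^n.

    On monomials, T x^n = (n + 2k) x^(n-1) for odd n and n x^(n-1) for even n.
    """
    result = []
    for n in range(1, len(coeffs)):
        reflection_part = 2 * k if n % 2 == 1 else 0
        result.append((n + reflection_part) * coeffs[n])
    return result or [Fraction(0)]


def dunkl_laplacian(coeffs, k):
    """Rank-one generalized Laplacian, taken here as T applied twice."""
    return dunkl_T(dunkl_T(coeffs, k), k)


k = Fraction(3, 2)
x_squared = [Fraction(0), Fraction(0), Fraction(1)]
# Expect the constant polynomial 2N + 4*gamma = 2 + 4k.
print(dunkl_laplacian(x_squared, k), 2 + 4 * k)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.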


In this section we consider the generalized heat operator $H_k := \Delta_k - \partial_t$ on function spaces $C^2(\Omega \times (0,T))$, where $T > 0$ and $\Omega \subseteq \mathbb{R}^N$ is open and $G$-invariant. Among the variety of initial- and boundary value problems which may be posed for $H_k$ in analogy to the corresponding classical problems, we here focus on the homogeneous Cauchy problem: Find $u \in C^2(\mathbb{R}^N \times (0,T))$ which is continuous on $\mathbb{R}^N \times [0,T]$ and satisfies
\[
\begin{cases}
H_k u = 0 & \text{on } \mathbb{R}^N \times (0,T), \\
u(\cdot, 0) = f \in C_b(\mathbb{R}^N).
\end{cases} \tag{4.1}
\]
First of all, let us note some basic solutions of the generalized heat equation $H_k u = 0$. Again we set $\gamma := \sum_{\alpha \in R_+} k(\alpha) \ge 0$.

Lemma 4.3. For parameters $a \ge 0$ and $b \in \mathbb{R} \setminus \{0\}$, the function
\[
u(x,t) = \frac{1}{(a - bt)^{\gamma+N/2}} \exp\Bigl( \frac{b |x|^2}{4(a - bt)} \Bigr)
\]
solves $H_k u = 0$ on $\mathbb{R}^N \times (-\infty, a/b)$ in case $b > 0$, and on $\mathbb{R}^N \times (a/b, \infty)$ in case $b < 0$.

Proof. The product rule (2.1) together with $\sum_{i=1}^{N} T_i x_i = N + 2\gamma$ shows that for each $\lambda > 0$,
\[
\Delta_k e^{\lambda |x|^2} = \sum_{i=1}^{N} T_i \bigl( 2\lambda x_i\, e^{\lambda |x|^2} \bigr) = 2\lambda\, (N + 2\gamma + 2\lambda |x|^2)\, e^{\lambda |x|^2}.
\]

From this the statement is obtained readily by a short calculation.
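The short calculation can be spelled out as follows. The abbreviations $s$ and $\lambda(t)$ are introduced only for this sketch, and the identity from the proof is applied for either sign of $\lambda$, which the underlying computation permits even though it is stated above for $\lambda > 0$.

```latex
% Worked version of the "short calculation" in the proof of Lemma 4.3.
% Notation introduced for this sketch only: s := a - bt, \lambda := b/(4s).
\[
u(x,t) = s^{-(\gamma+N/2)}\, e^{\lambda |x|^2}, \qquad s = a - bt, \quad \lambda = \frac{b}{4s}.
\]
% Space part: apply \Delta_k e^{\lambda|x|^2} = 2\lambda (N + 2\gamma + 2\lambda|x|^2) e^{\lambda|x|^2}.
\[
\Delta_k u = 2\lambda\,(N + 2\gamma + 2\lambda |x|^2)\, u
           = \Bigl( \frac{b\,(\gamma + N/2)}{s} + \frac{b^2 |x|^2}{4 s^2} \Bigr) u .
\]
% Time part: ds/dt = -b, so d(s^{-(\gamma+N/2)})/dt = (\gamma+N/2)\, b\, s^{-(\gamma+N/2)-1}
% and d(b|x|^2/(4s))/dt = b^2 |x|^2 / (4 s^2).
\[
\partial_t u = \Bigl( \frac{b\,(\gamma + N/2)}{s} + \frac{b^2 |x|^2}{4 s^2} \Bigr) u .
\]
% Hence H_k u = \Delta_k u - \partial_t u = 0 wherever s = a - bt > 0.
```

The condition $s > 0$ is exactly $t < a/b$ for $b > 0$ and $t > a/b$ for $b < 0$, which matches the two domains in the lemma.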

In particular, the function
\[
F_k(x,t) = \frac{M_k}{t^{\gamma+N/2}}\, e^{-|x|^2/4t}, \qquad \text{with } M_k = 4^{-\gamma-N/2}\, c_k,
\]
is a solution of the heat equation $H_k u = 0$ on $\mathbb{R}^N \times (0,\infty)$. It generalizes the fundamental solution for the classical heat equation $\Delta u - \partial_t u = 0$, which is given by $F_0(x,t) = (4\pi t)^{-N/2} e^{-|x|^2/4t}$. The normalization constant $M_k$ is chosen such that
\[
\int_{\mathbb{R}^N} F_k(x,t)\, w_k(x)\, dx = 1 \qquad \text{for all } t > 0.
\]

In order to solve the Cauchy problem (4.1), it suggests itself to apply Fourier transform methods (in our case, the Dunkl transform) under suitable decay assumptions on the initial data $f$. In fact, in the classical case $k = 0$ a bounded solution of (4.1) is obtained by convolving $f$ with the fundamental solution $F_0$, and its uniqueness is a consequence of a well-known maximum principle for the heat operator. It is not much effort to extend this maximum principle to the generalized heat operator $H_k$ in order to obtain uniqueness results; we shall do this in Prop. 4.12 and Theorem 4.13 at the end of this section. However, in our general situation it is not known whether there exists a reasonable convolution structure on $\mathbb{R}^N$ matching the action of the Dunkl transform $D_k$, i.e. making it a homomorphism on suitable function spaces. In the one-dimensional case this is true: there is an $L^1$-convolution algebra associated with the reflection group $\mathbb{Z}_2$ on $\mathbb{R}$ and the multiplicity parameter $k = \mu \ge 0$; this convolution enjoys many properties of a group convolution. It is studied in [R] (see also [R-V] and [Ros]). In the $N$-dimensional case, we may introduce the notion of a generalized translation at least on the Schwartz space $S(\mathbb{R}^N)$ (and similarly on $L^2(\mathbb{R}^N, w_k(x)dx)$), as follows:
\[
L_k^y f(x) := \frac{c_k^2}{4^{\gamma+N/2}} \int_{\mathbb{R}^N} D_k f(\xi)\, K(ix, \xi)\, K(iy, \xi)\, w_k(\xi)\, d\xi; \qquad y \in \mathbb{R}^N,\ f \in S(\mathbb{R}^N). \tag{4.2}
\]
Note that in case $k = 0$, we simply have $L_0^y f(x) = f(x+y)$, while in the one-dimensional case, (4.2) matches the above-mentioned convolution structure on $\mathbb{R}$. Clearly, $L_k^y f(x) = L_k^x f(y)$; moreover, the inversion theorem for the Dunkl transform assures that $L_k^y f = f$ for $y = 0$ and $D_k(L_k^y f)(\xi) = K(iy, \xi)\, D_k f(\xi)$. From this it is not hard to see (by use of the bounds (2.7)) that $L_k^y f$ belongs to $S(\mathbb{R}^N)$ again. Let us now consider the "fundamental solution" $F_k(\cdot, t)$ for $t > 0$. A short calculation, using Prop. 3.10 or Lemma 4.11 of [dJ], shows that
\[
(D_k F_k)(\xi, t) = e^{-t|\xi|^2}. \tag{4.3}
\]
By use of the reproducing formula (2.8) one therefore obtains from (4.2) the representation
\[
L_k^{-y} F_k(x,t) = \frac{M_k}{t^{\gamma+N/2}}\, e^{-(|x|^2 + |y|^2)/4t}\, K\Bigl( \frac{x}{\sqrt{2t}}, \frac{y}{\sqrt{2t}} \Bigr). \tag{4.4}
\]

Definition 4.4. The generalized heat kernel $\Gamma_k$ is given by
\[
\Gamma_k(x,y,t) := \frac{M_k}{t^{\gamma+N/2}}\, e^{-(|x|^2 + |y|^2)/4t}\, K\Bigl( \frac{x}{\sqrt{2t}}, \frac{y}{\sqrt{2t}} \Bigr), \qquad x, y \in \mathbb{R}^N,\ t > 0.
\]

Lemma 4.5. The heat kernel $\Gamma_k$ has the following properties on $\mathbb{R}^N \times \mathbb{R}^N \times (0,\infty)$:

(1) $\displaystyle \Gamma_k(x,y,t) = \frac{c_k^2}{4^{\gamma+N/2}} \int_{\mathbb{R}^N} e^{-t|\xi|^2}\, K(ix, \xi)\, K(-iy, \xi)\, w_k(\xi)\, d\xi.$

(2) For fixed $y \in \mathbb{R}^N$, the function $u(x,t) := \Gamma_k(x,y,t)$ solves the generalized heat equation $H_k u = 0$ on $\mathbb{R}^N \times (0,\infty)$.

(3) $\displaystyle \int_{\mathbb{R}^N} \Gamma_k(x,y,t)\, w_k(x)\, dx = 1.$

(4) $\displaystyle |\Gamma_k(x,y,t)| \le \frac{M_k}{t^{\gamma+N/2}}\, e^{-(|x| - |y|)^2/4t}.$

Proof. (1) is clear from the above derivation. For (2), remember that $\Delta_k^x K(ix, \xi) = -|\xi|^2 K(ix, \xi)$. Hence the assertion follows at once from representation (1) by taking the differentiations under the integral. This is justified by the decay properties of the integrand and its derivatives in question (use estimation (2.7) for the partial derivatives of $K(ix, \xi)$ with respect to $x$). To obtain (3), we employ formula (4.4) as well as (4.2) and write
\[
\int_{\mathbb{R}^N} \Gamma_k(x,y,t)\, w_k(x)\, dx = D_k\bigl( L_k^{-y} F_k \bigr)(0, t) = K(-iy, 0)\, (D_k F_k)(0, t) = 1.
\]

Finally, (4) is a consequence of the estimate (2.4) for K.
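Properties (3) and (4) can be explored numerically in the rank-one case $N = 1$, $G = \mathbb{Z}_2$. The sketch below rests on assumptions that are not taken from this section: the weight $w_k(x) = |x|^{2k}$, the constant $c_k^{-1} = \int e^{-x^2} |x|^{2k} dx = \Gamma(k + 1/2)$ (Euler's Gamma function), and a power-series form of the kernel $K$ obtained from $T x^n = (n + 2k) x^{n-1}$ for odd $n$ and $T x^n = n x^{n-1}$ for even $n$. All function names are hypothetical, and the truncated series is only suitable for moderate values of $|xy|$.

```python
import math


def dunkl_kernel(x, y, k, n_terms=150):
    """Assumed rank-one kernel K(x, y) = sum_n (xy)^n / g_n,
    with g_n = prod_{j<=n} (j + 2k if j is odd else j), so that T_x K(x, y) = y K(x, y)."""
    total, term = 1.0, 1.0
    for n in range(1, n_terms):
        term *= x * y / (n + (2.0 * k if n % 2 == 1 else 0.0))
        total += term
    return total


def heat_kernel(x, y, t, k):
    """Gamma_k(x, y, t) as in Definition 4.4, specialized to N = 1 and gamma = k."""
    exponent = k + 0.5                   # gamma + N/2
    c_k = 1.0 / math.gamma(k + 0.5)      # 1 / int exp(-x^2) |x|^(2k) dx
    m_k = 4.0 ** (-exponent) * c_k       # M_k = 4^(-gamma - N/2) c_k
    scale = math.sqrt(2.0 * t)
    gauss = math.exp(-(x * x + y * y) / (4.0 * t))
    return m_k * t ** (-exponent) * gauss * dunkl_kernel(x / scale, y / scale, k)


def total_mass(y, t, k, half_width=10.0, steps=4000):
    """Trapezoid approximation of int Gamma_k(x, y, t) |x|^(2k) dx over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * heat_kernel(x, y, t, k) * abs(x) ** (2.0 * k)
    return total * h


print(total_mass(0.8, 0.5, 1.0))  # property (3) predicts a value close to 1
```

Evaluating `heat_kernel` on a grid of real arguments also gives only positive values in this setting, in line with the positivity result obtained later in the section (Corollary 4.9).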


Remarks. 1. For integer-valued multiplicity functions, Berest and Molchanov [B-M] constructed the heat kernel for the $G$-invariant part of $H_k$ (in a conjugated version) by shift-operator techniques.

2. In contrast to the classical case, it is not yet clear at this stage that the kernel $\Gamma_k$ is generally nonnegative. In fact, it is still an open conjecture that the function $K(iy, \cdot)$ is positive-definite on $\mathbb{R}^N$ for each $y \in \mathbb{R}^N$ (cf. the remarks in [dJ] and [D3]). This would imply a Bochner-type integral representation of $K(iy, \cdot)$ and positivity of $K$ on $\mathbb{R}^N \times \mathbb{R}^N$ as an immediate consequence. In the one-dimensional case this conjecture is true, and the Bochner-type integral representation is explicitly known (see [Ros] or [R]). By one-parameter semigroup techniques, it will however soon turn out that $K$ is at least positive on $\mathbb{R}^N \times \mathbb{R}^N$.

Definition 4.6. For $f \in C_b(\mathbb{R}^N)$ and $t \ge 0$ set
\[
H(t)f(x) :=
\begin{cases}
\displaystyle \int_{\mathbb{R}^N} \Gamma_k(x,y,t)\, f(y)\, w_k(y)\, dy & \text{if } t > 0, \\[2mm]
f(x) & \text{if } t = 0.
\end{cases} \tag{4.5}
\]

Part (4) of Lemma 4.5 assures that for each $t \ge 0$, $H(t)f$ is well-defined and continuous on $\mathbb{R}^N$. It provides a natural candidate for the solution to our Cauchy problem. Indeed, when restricting to initial data from the Schwartz space $S(\mathbb{R}^N)$, we easily obtain the following:

Theorem 4.7. Suppose that $f \in S(\mathbb{R}^N)$. Then $u(x,t) := H(t)f(x)$, $(x,t) \in \mathbb{R}^N \times [0,\infty)$, solves the Cauchy problem (4.1) for each $T > 0$. Moreover, it has the following properties:
(i) $H(t)f \in S(\mathbb{R}^N)$ for each $t > 0$.
(ii) $H(t+s)f = H(t)H(s)f$ for all $s, t \ge 0$.
(iii) $\|H(t)f - f\|_{\infty,\mathbb{R}^N} \to 0$ with $t \to 0$.

Proof. Using formula (1) of Lemma 4.5 and Fubini's theorem, we can write
\[
\begin{aligned}
u(x,t) = H(t)f(x) &= \frac{c_k^2}{4^{\gamma+N/2}} \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} K(ix, \xi)\, K(-iy, \xi)\, e^{-t|\xi|^2} f(y)\, w_k(\xi)\, w_k(y)\, d\xi\, dy \\
&= \frac{c_k^2}{4^{\gamma+N/2}} \int_{\mathbb{R}^N} e^{-t|\xi|^2}\, D_k f(\xi)\, K(ix, \xi)\, w_k(\xi)\, d\xi
\end{aligned} \tag{4.6}
\]
for all $t > 0$. (Remember that $S(\mathbb{R}^N)$ is invariant under the Dunkl transform.) This makes clear that (i) is satisfied. As before, it is seen that differentiations may be taken under the integral in (4.6), and that $H_k u = 0$ on $\mathbb{R}^N \times (0,\infty)$. Moreover, in view of the inversion theorem for the Dunkl transform, (4.6) holds for $t = 0$ as well. Using (2.5), we thus obtain the estimation
\[
\|H(t)f - f\|_{\infty,\mathbb{R}^N} \le \sqrt{|G|}\, \frac{c_k^2}{4^{\gamma+N/2}} \int_{\mathbb{R}^N} |D_k f(\xi)|\, \bigl( 1 - e^{-t|\xi|^2} \bigr)\, w_k(\xi)\, d\xi,
\]
and this integral tends to $0$ with $t \to 0$. This yields (iii). In particular, it follows that $u$ is continuous on $\mathbb{R}^N \times [0,\infty)$. To prove (ii), note that $D_k\bigl(H(t)f\bigr)(\xi) = e^{-t|\xi|^2} D_k f(\xi)$. Therefore
\[
D_k\bigl(H(t+s)f\bigr)(\xi) = e^{-t|\xi|^2}\, D_k\bigl(H(s)f\bigr)(\xi) = D_k\bigl(H(t)H(s)f\bigr)(\xi).
\]
The statement now follows from the injectivity of the Dunkl transform on $S(\mathbb{R}^N)$.

We are now going to show that in fact the linear operators $H(t)$ on $S(\mathbb{R}^N)$ extend to a positive contraction semigroup on the Banach space $C_0(\mathbb{R}^N)$, equipped with its uniform norm $\|\cdot\|_\infty$. To this end, we consider the generalized Laplacian $\Delta_k$ as a densely defined linear operator on $C_0(\mathbb{R}^N)$ with domain $S(\mathbb{R}^N)$.

Theorem 4.8. (1) The operator $\Delta_k$ on $C_0(\mathbb{R}^N)$ is closable, and its closure $\overline{\Delta}_k$ generates a positive, strongly continuous contraction semigroup $\{T(t),\ t \ge 0\}$ on $C_0(\mathbb{R}^N)$.
(2) The action of $T(t)$ on $S(\mathbb{R}^N)$ is given by $T(t)f = H(t)f$.

Proof. (1) We apply a variant of the Lumer-Phillips theorem, which characterizes generators of positive one-parameter contraction semigroups (see e.g. [A], Cor. 1.3). It requires two properties:
(i) The operator $\Delta_k$ satisfies the following "dispersivity condition": Suppose that $f \in S(\mathbb{R}^N)$ is real-valued with $\max\{f(x) : x \in \mathbb{R}^N\} = f(x_0)$. Then $\Delta_k f(x_0) \le 0$.
(ii) The range of $\lambda I - \Delta_k$ is dense in $C_0(\mathbb{R}^N)$ for some $\lambda > 0$.
Property (i) is an immediate consequence of Lemma 4.1. Condition (ii) is also satisfied, because $\lambda I - \Delta_k$ maps $S(\mathbb{R}^N)$ onto itself for each $\lambda > 0$; this follows from the fact that the Dunkl transform is a homeomorphism of $S(\mathbb{R}^N)$ and $D_k\bigl((\lambda I - \Delta_k)f\bigr)(\xi) = (\lambda + |\xi|^2)\, D_k f(\xi)$. The assertion now follows by the above-mentioned theorem.

(2) It is known from semigroup theory that for every $f \in S(\mathbb{R}^N)$, the function $t \mapsto T(t)f$ is the unique solution of the abstract Cauchy problem
\[
\begin{cases}
\dfrac{d}{dt} u(t) = \overline{\Delta}_k\, u(t) & \text{for } t > 0, \\[1mm]
u(0) = f
\end{cases} \tag{4.7}
\]
within the class of all (strongly) continuously differentiable functions $u$ on $[0,\infty)$ with values in the Banach space $(C_0(\mathbb{R}^N), \|\cdot\|_\infty)$. By property (i) of Theorem 4.7 we have $H(t)f \in C_0(\mathbb{R}^N)$ for $f \in S(\mathbb{R}^N)$. Moreover, from representation (4.6) of $H(t)f$ it is readily seen that $t \mapsto H(t)f$ is continuously differentiable on $[0,\infty)$ and solves (4.7). This finishes the proof.

Corollary 4.9. The heat kernel $\Gamma_k$ is strictly positive on $\mathbb{R}^N \times \mathbb{R}^N \times (0,\infty)$. In particular, the generalized exponential kernel $K$ satisfies
\[
K(x,y) > 0 \qquad \text{for all } x, y \in \mathbb{R}^N.
\]

Proof. For any initial distribution $f \in S(\mathbb{R}^N)$ with $f \ge 0$ the last theorem implies that
\[
\int_{\mathbb{R}^N} \Gamma_k(x,y,t)\, f(y)\, w_k(y)\, dy = T(t)f(x) \ge 0 \qquad \text{for all } (x,t) \in \mathbb{R}^N \times [0,\infty).
\]
As $y \mapsto \Gamma_k(x,y,t)$ is continuous on $\mathbb{R}^N$ for each fixed $x \in \mathbb{R}^N$ and $t > 0$, it follows that $\Gamma_k(x,y,t) \ge 0$ for all $x, y \in \mathbb{R}^N$ and $t > 0$. Hence $K$ is nonnegative as well. Now recall again the reproducing identity (2.8), which says that
\[
e^{(|x|^2 + |y|^2)}\, K(2x, y) = c_k \int_{\mathbb{R}^N} K(x, 2z)\, K(y, 2z)\, w_k(z)\, e^{-|z|^2}\, dz
\]
for all $x, y \in \mathbb{R}^N$. The integrand on the right side is continuous, non-negative and not identically zero (because $K(x,0)K(y,0) = 1$). Therefore the integral itself must be strictly positive.

Corollary 4.10. The semigroup $\{T(t)\}$ on $C_0(\mathbb{R}^N)$ is given explicitly by $T(t)f = H(t)f$, $f \in C_0(\mathbb{R}^N)$.

Proof. This is clear from part (2) of Theorem 4.8 and the previous corollary, which implies that the operators $H(t)$ are continuous, even contractive, on $C_0(\mathbb{R}^N)$.

Remark. The generalized Laplacian also leads to a contraction semigroup on the Hilbert space $H := L^2(\mathbb{R}^N, w_k(x)dx)$; this generalizes the results of [Ros] for the one-dimensional case. In fact, let $M$ denote the multiplication operator on $H$ defined by $Mf(x) = -|x|^2 f(x)$ and with domain $D(M) = \{f \in H : |x|^2 f(x) \in H\}$. $M$ is self-adjoint and generates the strongly continuous contraction semigroup $M(t)f(x) = e^{-t|x|^2} f(x)$ $(t \ge 0)$ on $H$. For $f \in S(\mathbb{R}^N)$, we have the identity $D_k(\Delta_k f) = M(D_k f)$. As $S(\mathbb{R}^N)$ is dense in $D(M)$, this shows that $\Delta_k$ has a self-adjoint extension $\widetilde{\Delta}_k$ on $H$, namely $\widetilde{\Delta}_k = D_k^{-1} M D_k$, where here $D_k$ denotes the Plancherel extension of the Dunkl transform to $H$. The domain of $\widetilde{\Delta}_k$ is the Sobolev-type space $D(\widetilde{\Delta}_k) = \{f \in H : |\xi|^2 D_k f(\xi) \in H\}$. Being unitarily equivalent with $M$, the operator $\widetilde{\Delta}_k$ also generates a strongly continuous contraction semigroup $\{\widetilde{T}(t)\}$ on $H$ which is unitarily equivalent with $\{M(t)\}$; it is given by
\[
\widetilde{T}(t)f(x) = \int_{\mathbb{R}^N} e^{-t|\xi|^2}\, D_k f(\xi)\, K(ix, \xi)\, w_k(\xi)\, d\xi.
\]

The knowledge that $\Gamma_k$ is nonnegative also allows us to solve the Cauchy problem (4.1) in its general setting:

Theorem 4.11. Let $f \in C_b(\mathbb{R}^N)$. Then $u(x,t) := H(t)f(x)$ is bounded on $\mathbb{R}^N \times [0,\infty)$ and solves the Cauchy problem (4.1) for each $T > 0$.

Proof. In order to see that $u$ is twice continuously differentiable on $\mathbb{R}^N \times (0,\infty)$ with $H_k u = 0$, we only have to make sure that the necessary differentiations of $u$ may be taken under the integral in (4.5). One has to use again the estimations (2.7) for the partial derivatives of $K$; these provide sufficient decay properties of the derivatives of $\Gamma_k$, allowing the necessary differentiations of $u$ under the integral by use of the dominated convergence theorem. Boundedness of $u$ is clear from the positivity and normalization (Lemma 4.5(3)) of $\Gamma_k$; in fact, $|u(x,t)| \le \|f\|_{\infty,\mathbb{R}^N}$ on $\mathbb{R}^N \times [0,\infty)$. Finally, we have to show that $H(t)f(x) \to f(\xi)$ with $x \to \xi$ and $t \to 0$. We start by the usual method: For fixed $\varepsilon > 0$, choose $\delta > 0$ such that $|f(y) - f(\xi)| < \varepsilon$ for $|y - \xi| < 2\delta$ and let $M := \|f\|_{\infty,\mathbb{R}^N}$. Keeping in mind the positivity and normalization of $\Gamma_k$, we obtain for $|x - \xi| < \delta$ the estimation
\[
\begin{aligned}
|H(t)f(x) - f(\xi)| &\le \Bigl| \int_{\mathbb{R}^N} \Gamma_k(x,y,t)\, \bigl( f(y) - f(\xi) \bigr)\, w_k(y)\, dy \Bigr| \\
&\le \int_{|y-x|<\delta} \Gamma_k(x,y,t)\, |f(y) - f(\xi)|\, w_k(y)\, dy + \int_{|y-x|>\delta} \Gamma_k(x,y,t)\, |f(y) - f(\xi)|\, w_k(y)\, dy \\
&< \varepsilon + 2M \int_{|y-x|>\delta} \Gamma_k(x,y,t)\, w_k(y)\, dy.
\end{aligned}
\]
It thus remains to show that for each $\delta > 0$,
\[
\lim_{(x,t) \to (\xi,0)} \int_{|y-x|>\delta} \Gamma_k(x,y,t)\, w_k(y)\, dy = 0.
\]
For abbreviation put
\[
I(x,t) := \int_{|y-x| \le \delta} \Gamma_k(x,y,t)\, w_k(y)\, dy.
\]
As $I(x,t) \le 1$, it suffices to prove that $\liminf_{(x,t) \to (\xi,0)} I(x,t) \ge 1$. For this, choose some positive constant $\delta' < \delta$ and $h \in S(\mathbb{R}^N)$ with $0 \le h \le 1$, $h(\xi) = 1$ and such that $h(y) = 0$ for all $y$ with $|y - \xi| > \delta - \delta'$. Then for each $x$ with $|x - \xi| < \delta'$ the support of $h$ is contained in $\{y \in \mathbb{R}^N : |y - x| \le \delta\}$; therefore
\[
\int_{\mathbb{R}^N} h(y)\, \Gamma_k(x,y,t)\, w_k(y)\, dy \le I(x,t)
\]
for all $(x,t)$ with $|x - \xi| < \delta'$. But according to Theorem 4.7 we have
\[
\lim_{(x,t) \to (\xi,0)} \int_{\mathbb{R}^N} h(y)\, \Gamma_k(x,y,t)\, w_k(y)\, dy = h(\xi) = 1.
\]
This finishes the proof.

It is still open at this point whether our solution of the Cauchy problem (4.1) is unique within an appropriate class of functions. As in the classical case, uniqueness follows from a maximum principle for the generalized heat operator on $\mathbb{R}^N \times (0,\infty)$. The first step is the following weak maximum principle for $H_k$ on bounded domains. It is proved by a similar method as used in Theorem 4.2. By virtue of Lemma 4.1, this proof is literally the same as the standard proof in the classical case (see e.g. [J]) and is therefore omitted here.

Proposition 4.12. Suppose that $\Omega \subset \mathbb{R}^N$ is open, bounded and $G$-invariant. For $T > 0$ set
\[
\Omega_T := \Omega \times (0,T) \qquad \text{and} \qquad \partial_* \Omega_T := \{(x,t) \in \partial\Omega_T : t = 0 \text{ or } x \in \partial\Omega\}.
\]
Assume further that $u \in C^2(\Omega_T) \cap C(\overline{\Omega}_T)$ satisfies $H_k u \ge 0$ in $\Omega_T$. Then
\[
\max_{\overline{\Omega}_T}(u) = \max_{\partial_* \Omega_T}(u).
\]

Under a suitable growth condition on the solution, this maximum principle may be extended to the case where $\Omega = \mathbb{R}^N$. The proof is adapted from the one in [dB] for the classical case.


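Theorem 4.13. (Weak maximum principle for $H_k$ on $\mathbb{R}^N$). Let $S_T := \mathbb{R}^N \times (0,T)$ and suppose that $u \in C^2(S_T) \cap C(\overline{S}_T)$ satisfies
\[
\begin{cases}
H_k u \ge 0 & \text{in } S_T, \\
u(\cdot, 0) = f,
\end{cases}
\]
where $f \in C_b(\mathbb{R}^N)$ is real-valued. Assume further that there exist positive constants $C, \lambda, r$ such that
\[
u(x,t) \le C \cdot e^{\lambda |x|^2} \qquad \text{for all } (x,t) \in S_T \text{ with } |x| > r.
\]
Then
\[
\sup_{\overline{S}_T}(u) \le \sup_{\mathbb{R}^N}(f).
\]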

Proof. Let us first assume that $8\lambda T < 1$. For fixed $\varepsilon > 0$ set
\[
v(x,t) := u(x,t) - \varepsilon \cdot \frac{1}{(2T - t)^{\gamma+N/2}} \exp\Bigl( \frac{|x|^2}{4(2T - t)} \Bigr), \qquad (x,t) \in \mathbb{R}^N \times [0, 2T).
\]
By Lemma 4.3, $v$ satisfies $H_k v = H_k u \ge 0$ in $S_T$. Now fix some constant $\rho > r$ and consider the bounded cylinder $\Omega_T = \Omega \times (0,T)$ with $\Omega = \{x \in \mathbb{R}^N : |x| < \rho\}$. Setting $M := \sup_{\mathbb{R}^N}(f)$, we have $v(x,0) < u(x,0) \le M$ for $x \in \overline{\Omega}$. Moreover, for $|x| = \rho$ and $t \in (0,T]$,
\[
v(x,t) \le C e^{\lambda \rho^2} - \varepsilon \cdot \frac{1}{(2T)^{\gamma+N/2}}\, e^{\rho^2/8T}.
\]
As $\lambda < (8T)^{-1}$, we see that $v(x,t) \le M$ on $\partial_* \Omega_T$, provided that $\rho$ is large enough. Then by Prop. 4.12 we also have $v(x,t) \le M$ on $\overline{\Omega}_T$. As $\rho > r$ was arbitrary, it follows that $v(x,t) \le M$ on $\overline{S}_T$. As $\varepsilon > 0$ was arbitrary as well, this implies that $u(x,t) \le M$ on $\overline{S}_T$. If $8\lambda T \ge 1$, we may subdivide $S_T$ into finitely many adjacent open strips of width less than $1/8\lambda$ and apply the above conclusion repeatedly.

Corollary 4.14. The solution of the Cauchy problem (4.1) according to Theorem 4.11 is unique within the class of functions $u \in C^2(S_T) \cap C(\overline{S}_T)$ which satisfy the following exponential growth condition: There exist positive constants $C, \lambda, r$ such that
\[
|u(x,t)| \le C \cdot e^{\lambda |x|^2} \qquad \text{for all } (x,t) \in S_T \text{ with } |x| > r.
\]

Acknowledgement. It is a pleasure to thank Charles F. Dunkl and Michael Voit for some valuable comments and discussions.

Note added in proof. It has recently been proven by the author that for $k \ge 0$, the Dunkl kernel $x \mapsto K(iy, x)$ is in fact positive-definite on $\mathbb{R}^N$ for each $y \in \mathbb{R}^N$ (cf. Remark 2 after Lemma 4.5); this result will be published in a forthcoming paper.


References

[A] Arendt, W.: Characterization of positive semigroups on Banach lattices. In: Nagel, R. (ed.) One-parameter Semigroups of Positive Operators. Lecture Notes in Math. 1184, Berlin: Springer-Verlag, 1986, pp. 247–291
[B-F1] Baker, T.H., Forrester, P.J.: The Calogero-Sutherland model and generalized classical polynomials. Commun. Math. Phys. 188, 175–216 (1997)
[B-F2] Baker, T.H., Forrester, P.J.: The Calogero-Sutherland model and polynomials with prescribed symmetry. Nucl. Phys. B 492, 682–716 (1997)
[B-F3] Baker, T.H., Forrester, P.J.: Non-symmetric Jack polynomials and integral kernels. Duke J. Math., to appear
[B-M] Berest, Y.Y., Molchanov, Y.: Fundamental solutions for partial differential equations with reflection group invariance. J. Math. Phys. 36, no. 8, 4324–4339 (1995)
[C] Cherednik, I.: A unification of the Knizhnik-Zamolodchikov equations and Dunkl operators via affine Hecke algebras. Invent. Math. 106, 411–432 (1991)
[Chi] Chihara, T.S.: An Introduction to Orthogonal Polynomials. New York: Gordon and Breach, 1978
[dB] DiBenedetto, E.: Partial Differential Equations. Boston, Basel, Berlin: Birkhäuser Verlag, 1995
[vD] van Diejen, J.F.: Confluent hypergeometric orthogonal polynomials related to the rational quantum Calogero system with harmonic confinement. Commun. Math. Phys. 188, 467–497 (1997)
[D1] Dunkl, C.F.: Reflection groups and orthogonal polynomials on the sphere. Math. Z. 197, 33–60 (1988)
[D2] Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. Am. Math. Soc. 311, 167–183 (1989)
[D3] Dunkl, C.F.: Integral kernels with reflection group invariance. Can. J. Math. 43, 1213–1227 (1991)
[D4] Dunkl, C.F.: Hankel transforms associated to finite reflection groups. In: Proc. of the special session on hypergeometric functions on domains of positivity, Jack polynomials and applications. Proceedings, Tampa 1991, Contemp. Math. 138, 123–138 (1992)
[D-J-O] Dunkl, C.F., de Jeu, M.F.E., Opdam, E.M.: Singular polynomials for finite reflection groups. Trans. Am. Math. Soc. 346, 237–256 (1994)
[dJ] de Jeu, M.F.E.: The Dunkl transform. Invent. Math. 113, 147–162 (1993)
[H] Heckman, G.J.: A remark on the Dunkl differential-difference operators. In: Barker, W., Sally, P. (eds.), Harmonic analysis on reductive groups. Progress in Math. 101, Basel: Birkhäuser Verlag, 1991, pp. 181–191
[J] John, F.: Partial Differential Equations. New York–Heidelberg–Berlin: Springer-Verlag, 1986
[K] Kakei, S.: Common algebraic structure for the Calogero-Sutherland models. J. Phys. A 29, L619–L624 (1996)
[K-S] Knop, F., Sahi, S.: A recursion and combinatorial formula for Jack polynomials. Invent. Math. 128, no. 1, 9–22 (1997)
[L-V] Lapointe, L., Vinet, L.: Exact operator solution of the Calogero-Sutherland model. Commun. Math. Phys. 178, 425–452 (1996)
[L1] Lassalle, M.: Polynômes de Laguerre généralisés. C.R. Acad. Sci. Paris, t. 312, Série I, 725–728 (1991)
[L2] Lassalle, M.: Polynômes de Hermite généralisés. C.R. Acad. Sci. Paris, t. 313, Série I, 579–582 (1991)
[O1] Opdam, E.M.: Dunkl operators, Bessel functions and the discriminant of a finite Coxeter group. Compos. Math. 85, 333–373 (1993)
[O2] Opdam, E.M.: Harmonic analysis for certain representations of graded Hecke algebras. Acta Math. 175, 75–121 (1995)
[P] Pasquier, V.: A lecture on the Calogero-Sutherland models. In: Integrable models and strings (Espoo, 1993). Lecture Notes in Phys. 436, Berlin: Springer-Verlag, 1994, pp. 36–48
[R] Rösler, M.: Bessel-type signed hypergroups on R. In: Heyer, H., Mukherjea, A. (eds.), Probability measures on groups and related structures XI. Proceedings, Oberwolfach 1994, Singapore: World Scientific, 1995, pp. 292–304
[R-V] Rösler, M., Voit, M.: An uncertainty principle for Hankel transforms. Proc. Am. Math. Soc., to appear
[Ros] Rosenblum, M.: Generalized Hermite polynomials and the Bose-like oscillator calculus. In: Operator Theory: Advances and Applications. Vol. 73, Basel: Birkhäuser Verlag, 1994, pp. 369–396

Communicated by T. Miwa

Commun. Math. Phys. 192, 543 – 554 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

The Zero-Mach Limit of Compressible Flows

David Hoff*

Department of Mathematics, Rawles Hall, Indiana University, Bloomington, IN 47495-5701, USA

Received: 10 June 1997 / Accepted: 15 July 1997

* This research was supported in part by the NSF under grant No. DMS-9322274.

Abstract: We prove that compressible Navier-Stokes flows in two and three space dimensions converge to incompressible Navier-Stokes flows in the limit as the Mach number tends to zero. No smallness restrictions are imposed on the external force, the initial velocity, or the time interval. We assume instead that the incompressible flow exists and is reasonably smooth on a given time interval, and prove that compressible flows with compatible initial data converge uniformly on that time interval. Our analysis shows that the essential mechanism in this process is a hyperbolic effect which becomes stronger with smaller Mach number and which ultimately drives the density to a constant.

1. Introduction

We prove that compressible Navier-Stokes flows in two and three space dimensions converge to incompressible Navier-Stokes flows in the limit as the Mach number tends to zero. No smallness restrictions are imposed on the external force, the initial velocity, or the time interval. We assume instead that the incompressible flow exists and is reasonably smooth on a given time interval, which may be infinite, and we prove that compressible flows with compatible initial data converge uniformly on that time interval as the Mach number goes to zero. The precise statement is given in the theorem below. The proof relies on routine energy estimates together with an application of a certain Helmholtz decomposition of the force density, which, via the effective viscous flux, reveals an important underlying hyperbolic dissipation. This dissipation becomes stronger as the Mach number decreases, ultimately driving the density to a constant. These ideas are discussed in somewhat greater detail at the end of this section.

We begin by giving a brief description of the zero-Mach problem. The Navier-Stokes equations for a compressible, isentropic or isothermal fluid express the conservation of mass and the balance of momentum as follows:
\[
\begin{cases}
\rho_t + \operatorname{div}(\rho u) = 0 \\
(\rho u^j)_t + \operatorname{div}(\rho u^j u) + P(\rho)_{x_j} = \varepsilon \Delta u^j + \lambda \operatorname{div} u_{x_j} + \rho f^j, \qquad j = 1, \ldots, n.
\end{cases} \tag{1.1}
\]

Here $\rho$ and $u = (u^1, \ldots, u^n)$ are the fluid density and velocity, which are the unknown functions of $x \in \mathbb{R}^n$, $n = 2$ or $3$, and $t \in [0,\infty)$. $P = P(\rho)$ is the pressure, assumed here to satisfy
\[
P(\rho),\ P'(\rho) > 0 \quad \text{for } \rho > 0. \tag{1.2}
\]
$\varepsilon > 0$ and $\lambda \ge 0$ are viscosity constants, and $f = f(x,t)$ is the external force vector.

We can rewrite the system (1.1) in dimensionless form by taking $x = L\bar{x}$, $t = T\bar{t}$, $\rho = R\bar{\rho}$, and $u = L\bar{u}/T$, where $L$, $T$, and $R$ are typical values of $x$, $t$, and $\rho$, and the quantities with bars are dimensionless. Making these substitutions into (1.1), say for the representative case of an ideal, polytropic fluid $P = K\rho^\gamma$, $\gamma \ge 1$, then dropping the bars, we find that, for new, dimensionless variables $x, t, \rho, u, f$, and new dimensionless viscosity constants $\varepsilon$ and $\lambda$,
\[
\begin{cases}
\rho_t + \operatorname{div}(\rho u) = 0 \\
(\rho u^j)_t + \operatorname{div}(\rho u^j u) + \delta^{-2} P(\rho)_{x_j} = \varepsilon \Delta u^j + \lambda \operatorname{div} u_{x_j} + \rho f^j,
\end{cases} \tag{1.3}
\]
where now $P = \rho^\gamma$ and $\delta$ is the Mach number, given by
\[
\delta^2 = \frac{(L/T)^2}{K \gamma R^{\gamma-1}}.
\]
$\delta$ is thus the ratio of the typical particle speed to the typical sound speed, and is therefore small, by definition, for highly subsonic flows. We shall study the system (1.3) for small $\delta$ with fixed, general $P$ satisfying (1.2), and make no further reference to the special equation of state for a polytropic fluid.

Evidently, when $\delta$ is small, the pressure gradient is the dominant term in the system. This suggests the following asymptotic expansion for the solution $(\rho, u) = (\rho^\delta, u^\delta)$:
\[
\begin{cases}
\rho^\delta(x,t) = \rho_0 + \delta^2 \rho_1(x,t) + \varphi^\delta(x,t) \\
u^\delta(x,t) = w(x,t) + \psi^\delta(x,t),
\end{cases} \tag{1.4}
\]
where $\varphi^\delta = o(\delta^2)$ and $\psi^\delta = o(1)$ as $\delta \to 0$. Substituting (1.4) into (1.3) and gathering like powers of $\delta$, we find that, under the assumption that $\rho_0$ is a positive constant, the functions $\rho_1$ and $w$ must satisfy
\[
\begin{cases}
\rho_0 (w^j_t + \nabla w^j \cdot w) + (c^2 \rho_1)_{x_j} = \varepsilon \Delta w^j + \rho_0 f^j \\
\operatorname{div} w = 0,
\end{cases} \tag{1.5}
\]
where $c^2 = P'(\rho_0)$. Thus $\rho_0$, $w$, and $c^2 \rho_1$ are respectively the density, velocity, and pressure in an incompressible Navier-Stokes flow. The zero-Mach limit problem in the present context then consists of fixing an incompressible flow $(\rho_0, w, c^2 \rho_1)$, that is, a solution of (1.5), taking initial data $(\rho^\delta(\cdot,0), u^\delta(\cdot,0))$ close to $(\rho_0, w(\cdot,0))$, and establishing that, for small $\delta$, the corresponding solution $(\rho^\delta, u^\delta)$ of (1.3) remains close to $(\rho_0, w)$ for positive time.

This problem has been studied from a number of points of view, among which we mention the work of Klainerman and Majda [2], who establish the zero-Mach limit for small time, but without assuming the existence of the underlying incompressible flow; Kreiss, Lorenz, and Naughton [3], who treat periodic solutions of a "slightly compressible" system; and Lions [5], who establishes certain compactness properties of $(\rho^\delta, u^\delta)$, on a bounded region in space or for periodic solutions, for certain polytropic fluids, and with minimal regularity hypotheses. Our results overlap with these to some extent, but by assuming the existence of the underlying incompressible flow, we succeed in establishing a convergence result which is global in time, with no smallness assumptions on either the external force or on the underlying incompressible flow. Furthermore, our analysis makes quite clear that the essential mechanism driving the flow to incompressibility in the zero-Mach limit is the hyperbolicity of the inviscid system corresponding to (1.1).

We now give a precise formulation of our results. First we fix $\rho_0 > 0$ and initial data $(w(\cdot,0), \rho_1(\cdot,0))$ for (1.5) satisfying
\[
w(\cdot,0) \in W^{2,2}, \qquad \rho_1(\cdot,0) \in W^{1,2}, \tag{1.6}
\]
where $W^{2,2} = W^{2,2}(\mathbb{R}^n)$, etc.; we assume that the force $f$ satisfies
\[
\begin{cases}
f \in L^\infty(I; L^2) \cap L^2(I; L^2) \\
f_t \in L^2(I; L^2), \quad \nabla f \in L^2(I; L^\infty),
\end{cases} \tag{1.7}
\]
where $I = [0,T)$ and $T \le \infty$ ($T$ is not to be confused with the scaling time which occurred earlier); and we assume finally that (1.5) has a corresponding solution $(w, \rho_1)$ defined on $\mathbb{R}^n \times I$, satisfying the following conditions:
\[
\begin{cases}
\rho_1 \in L^\infty(I; L^\infty \cap W^{1,2}) \cap L^2(I; L^\infty) \\
\rho_{1t},\ \nabla \rho_1 \in (L^1 \cap L^2)(I; L^\infty) \cap L^2(I; L^2) \\
w \in L^\infty(I; W^{2,2} \cap W^{1,\infty}) \\
\nabla w,\ D_x^2 w \in L^2(I; L^2 \cap L^\infty) \\
\rho_1 \Delta w \in L^1(I; L^2 \cap L^\infty).
\end{cases} \tag{1.8}
\]

When T is finite, the requirements (1.8) are simply that the solution of (1.5) be reasonably smooth. The relevant existence and regularity theory is well-known, and differs considerably in the case that n = 3, where smallness conditions are needed, from the case that n = 2. When T = ∞ on the other hand, the conditions in (1.8) require that the incompressible solution decays in a certain sense. See Schonbek [7], Wiegner [9], and the references therein for various results concerning rates of decay of solutions of (1.5). Of course, we do not exclude here the important case that the incompressible flow $(w, \rho_1)$ exists for all time and satisfies the conditions in (1.8) for all finite T but not for T = ∞. We now let δ-dependent initial data $(\rho^\delta(\cdot,0), u^\delta(\cdot,0))$ be given for some or all values of δ ∈ (0, 1], and assume that the perturbations $\varphi^\delta$ and $\psi^\delta$ defined in (1.4) satisfy

$$\|\psi^\delta(\cdot,0)\|_{L^2} + \delta\,\|\nabla\psi^\delta(\cdot,0)\|_{L^2} + \delta^2\,\|D_x^2\psi^\delta(\cdot,0)\|_{L^2} \le C_0\,\delta, \tag{1.9}$$

$$\|\varphi^\delta(\cdot,0)\|_{L^2} + \delta\,\|\nabla\varphi^\delta(\cdot,0)\|_{L^2} \le C_0\,\delta^2, \tag{1.10}$$

$$\|\varphi^\delta(\cdot,0)\|_{L^\infty} \le h(\delta), \tag{1.11}$$

where h is a strictly increasing, continuous function on [0, 1] satisfying h(0) = 0 and h(1) ≤ C0 . We then obtain the following convergence estimates, which are the main results of this paper:


Theorem. Fix n = 2 or 3, let an external force f satisfying (1.7) be given, and let $(\rho_0, w, c^2\rho_1)$ be a corresponding solution of (1.5) as described above, satisfying the conditions in (1.8), and with all the norms indicated in (1.7) and (1.8) bounded by $C_1$. Let the function P = P(ρ) be given, satisfying (1.2), and let $C_0 > 0$ be given. Then there are positive constants $\delta_0$ and C, depending on $C_0$, $C_1$, ε, λ, P, $\rho_0$, and, in the case that T < ∞, on T, such that: given initial data $(\rho^\delta(\cdot,0), u^\delta(\cdot,0))$ for which the perturbations $\varphi^\delta$ and $\psi^\delta$ defined in (1.4) satisfy (1.9)–(1.11), there is a corresponding solution $(\rho^\delta, u^\delta)$ of (1.3) defined on $\mathbb{R}^n \times [0, T)$. This solution satisfies

$$\sup_{0\le t<T}\|\rho^\delta(\cdot,t) - \rho_0\|_{L^2} \le C\delta^2, \tag{1.12}$$

$$\sup_{0\le t<T}\|\rho^\delta(\cdot,t) - \rho_0\|_{L^\infty} \le C\left[h(\delta) + \delta^{1/4}\right], \tag{1.13}$$

$$\sup_{0\le t<T}\|u^\delta(\cdot,t) - w(\cdot,t)\|_{L^2} \le C\delta, \tag{1.14}$$

$$\sup_{0\le t<T}\|\nabla(u^\delta - w)(\cdot,t)\|_{L^2} \le C. \tag{1.15}$$

Thus in particular, as δ → 0, $\rho^\delta \to \rho_0$ in $L^p$ for p ∈ [2, ∞] and $u^\delta \to w$ in $L^p$ for p ∈ [2, ∞) when n = 2 and p ∈ [2, 6] when n = 3; in each case the convergence is uniform in t for t ∈ [0, T). Notice that there are no explicit smallness conditions on either T, the force, or the incompressible flow (of course, the assumptions in (1.8) are not known to hold in general when n = 3 except for small initial data or small time). Notice also that the zero-Mach limit $(\rho_0, w, c^2\rho_1)$ depends on $c^2 = P'(\rho_0)$, but is otherwise independent of the particular form of the pressure function P. The theorem also establishes the existence of (some) large-data solutions of the Navier-Stokes equations of compressible flow: when the initial data is sufficiently close to that of a large-data, incompressible flow, then the compressible solution exists on the same time interval as does the incompressible solution, and its density remains bounded above and below away from zero. We recall in this regard the result of Lions [4], which gives the global existence of solutions of (1.1) for certain pressures with general, large initial data.

The proof of the theorem is given in a sequence of three lemmas in Sect. 2. The key idea is that, because of the assumption that $P' > 0$, the Navier-Stokes system (1.1) retains in some sense the hyperbolicity of the underlying Euler equations of inviscid flow. As we shall see, this hyperbolicity exerts a damping effect on the solution, and, because of the way in which δ enters the equations, this damping effect increases as δ → 0, thereby driving the density to the constant value $\rho_0$. The mechanics of this argument begin in Lemma 2.1 with some routine energy estimates for $\varphi^\delta$, $\psi^\delta$, $\nabla\psi^\delta$, and $\dot u^\delta$ (the dot here denotes a convective derivative; thus $\dot u^j = \frac{du^j}{dt} = u^j_t + \nabla u^j\cdot u$, etc.). These estimates are nearly identical to those given in the existence theory of Hoff [1]; we therefore omit most of the details, except for estimates for the pressure terms, because it is these which involve the key parameter δ. These energy estimates give bounds for various $L^2$ norms, but only in terms of the $L^4$ norm of $\nabla u^\delta$. To obtain an estimate for the latter, we make important use of the effective viscous flux F introduced in [1], defined by

$$F = (\varepsilon + \lambda)\operatorname{div} u^\delta - \delta^{-2}\left[P(\rho^\delta) - P(\rho_0)\right]. \tag{1.16}$$


To understand the role of F here, we first rewrite the momentum equation in (1.3) by substituting for P in terms of F:

$$\rho^\delta(\dot u^\delta)^j = F_{x_j} + \varepsilon\,\omega^{j,k}_{x_k} + \rho^\delta f^j, \tag{1.17}$$

where the vorticity ω is defined by $\omega^{j,k} = u^j_{x_k} - u^k_{x_j}$, and summation over repeated indices is understood. Observe that (1.17) gives a Helmholtz decomposition of the vector $\rho^\delta(\dot u^\delta - f)$. In particular, $\omega^{j,k}_{x_k x_j} = 0$ by antisymmetry, so that

$$\Delta F = \operatorname{div}(\rho^\delta\dot u^\delta - \rho^\delta f). \tag{1.18}$$

Equation (1.18) is the compressible analog of the well-known elliptic equation for the incompressible pressure associated with the system (1.5). Bounds for $\dot u^\delta$ then translate via (1.18) into bounds for F, and the required $L^4$ bounds for $\nabla u^\delta$ then follow easily from these estimates and the observation that $\Delta u^\delta$ is a linear combination of elements of $\nabla F$, $\nabla\omega$, and $\nabla P(\rho^\delta)$. These estimates are carried out in detail in Lemma 2.2. Finally in Lemma 2.3 we give the key step in the entire argument, which is the derivation of a pointwise bound for $\rho^\delta$. It is here that the hyperbolicity alluded to above enters in a crucial way. The idea may be explained briefly as follows. First we write the mass equation in (1.3) in the form

$$\frac{d}{dt}(\rho^\delta - \rho_0) = -\rho^\delta\operatorname{div} u^\delta,$$

and, substituting for $\operatorname{div} u^\delta$ in terms of F from (1.16),

$$(\varepsilon + \lambda)\frac{d}{dt}(\rho^\delta - \rho_0) + \delta^{-2}\rho^\delta\left[P(\rho^\delta) - P(\rho_0)\right] = -\rho^\delta F. \tag{1.19}$$
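The damping mechanism in (1.19) can be illustrated on a linear model problem. The following Python sketch is not taken from the paper: it replaces (1.19) by the scalar ode y' = −(a/δ²) y − g(t), where y stands in for the density perturbation, the constant a > 0 stands in for a lower bound on P' divided by ε + λ, and g is an arbitrary bounded driving term playing the role of F. All three are assumptions made purely for illustration. With y(0) = 0 one has |y| ≤ (δ²/a) sup|g|, so the damping gains a factor δ² over the size of the forcing.

```python
import math

def damped_sup(delta, a=1.0, t_end=1.0, steps=20000):
    """Return (sup |y|, sup |g|) for y' = -(a/delta**2) * y - g(t), y(0) = 0.

    Implicit Euler is used because the problem is stiff for small delta.
    """
    k = a / delta**2          # damping rate, grows like delta**-2
    h = t_end / steps
    y, sup_y, sup_g = 0.0, 0.0, 0.0
    for n in range(1, steps + 1):
        g = math.cos(7.0 * n * h)         # hypothetical bounded forcing
        y = (y - h * g) / (1.0 + h * k)   # implicit Euler step
        sup_y = max(sup_y, abs(y))
        sup_g = max(sup_g, abs(g))
    return sup_y, sup_g

for delta in (0.4, 0.2, 0.1, 0.05):
    sup_y, sup_g = damped_sup(delta)
    print(f"delta={delta:5.2f}  sup|y|={sup_y:.3e}  bound={delta**2 * sup_g:.3e}")
```

In the actual argument the forcing F itself degrades as δ → 0, which is why the rate of that degradation has to be computed and compared with the δ² gain.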

The fact that P is increasing implies that the second term on the left here is dissipative for this ode; evidently, this dissipative effect is greater for smaller δ. On the other hand, the pointwise bounds that we obtain from (1.18) for the driving term F degrade as δ → 0, but at a specific rate which we can compute. As we shall see in Lemma 2.3, the dissipative effect in (1.19) is just sufficient to control F in the zero-Mach limit. To summarize, by substituting for $\operatorname{div} u^\delta$ in terms of the effective viscous flux F, we have uncovered a crucial hyperbolic dissipation in the Navier-Stokes equations (1.1); this dissipation is stronger for smaller δ, and ultimately drives the density to the constant value $\rho_0$.

2. Proof of the Theorem

In this section we prove the above theorem in a sequence of three lemmas, as described in Sect. 1. We shall drop the superscript δ on the functions ρ, u, ϕ, and ψ, and we shall denote both $L^p(\mathbb{R}^n)$ norms and $x_j$-derivatives with simple subscripts. Thus $\|w\|_p = \|w\|_{L^p(\mathbb{R}^n)}$ and $w_j = w_{x_j}$, etc. To begin, local existence theories show that there is a unique local solution (ρ, u) of (1.3) on $\mathbb{R}^n \times [0, \bar t)$ for some positive time $\bar t \le T$, satisfying $\rho(\cdot,t) - \rho_0 \in W^{1,2}$, $u(\cdot,t) \in W^{2,2}$, and

$$\tfrac12\rho_0 < \rho(x,t) < 2\rho_0, \qquad 0 \le t < \bar t. \tag{2.1}$$


See Kawashima’s result, for example, described in Racke [6]. We shall derive a priori bounds for this solution sufficient to establish its existence up to time T; the convergence estimates (1.12)–(1.15) will then follow directly from these bounds. Specifically, we let ϕ and ψ be the perturbations defined in (1.4), and we define, for $t < \bar t$,

$$A(t) = \sup_{0\le s\le t}\left[\|\psi(\cdot,s)\|_2^2 + \delta^{-2}\|\varphi(\cdot,s)\|_2^2\right] + \int_0^t\!\!\int |\nabla\psi|^2\,dx\,ds, \tag{2.2}$$

$$B(t) = \sup_{0\le s\le t}\|\nabla\psi(\cdot,s)\|_2^2 + \int_0^t\!\!\int |\dot\psi|^2\,dx\,ds, \tag{2.3}$$

$$D(t) = \sup_{0\le s\le t}\|\dot u(\cdot,s)\|_2^2 + \int_0^t\!\!\int |\nabla\dot u|^2\,dx\,ds, \tag{2.4}$$

where again the dot denotes a convective derivative, $\dot u^j = u^j_t + u^j_k u^k$, etc., and summation over repeated indices is understood. We shall show in a sequence of energy estimates that

$$A(t) \le C\delta^2, \qquad B(t) \le C, \qquad D(t) \le C\delta^{-2}, \tag{2.5}$$

and then derive as a consequence that

$$\|\varphi(\cdot,t)\|_\infty \le C\left[h(\delta) + \delta^{1/4}\right].$$

(C denotes a generic positive constant as described in the theorem.) We begin with the following coupled estimates for A, B, and D:

Lemma 2.1. Let (ρ, u) be a solution of (1.3) on $\mathbb{R}^n \times [0, \bar t)$ satisfying (2.1), and let A, B, and D be as defined above in (2.2)–(2.4). Then for $t < \bar t$ and for δ sufficiently small, as described in the theorem of Sect. 1,

$$A(t) \le C\left[\delta^2 + \delta^{-4}\int_0^t\!\!\int |\varphi|^4\,dx\,ds\right]; \tag{2.6}$$

$$B(t) \le C\left[1 + \delta^{-2}A(t) + \int_0^t\!\!\int |\nabla\psi|^3\,dx\,ds\right]; \tag{2.7}$$

$$D(t) \le C\left[\delta^{-2} + \delta^{-4}A(t) + B(t) + \int_0^t\!\!\int |\nabla\psi|^4\,dx\,ds\right]. \tag{2.8}$$

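Proof. To prove (2.6), we subtract the momentum equations in (1.3) and (1.5), computing that

$$\delta^{-2}\left(P(\rho) - P(\rho_0) - c^2\delta^2\rho_1\right)_j = c^2\delta^{-2}\varphi_j + \delta^{-2}\left(P[\rho_0,\rho_0,\rho]\,(\delta^2\rho_1 + \varphi)^2\right)_j,$$

where $P[\rho_0, \rho_0, \rho]$ is a standard divided difference. The result is that

$$\rho\dot\psi^j + c^2\delta^{-2}\varphi_j + \delta^{-2}\left(P[\rho_0,\rho_0,\rho]\,(\delta^2\rho_1 + \varphi)^2\right)_j = \varepsilon\Delta\psi^j + \lambda\operatorname{div}\psi_j + \rho_0^{-1}(\delta^2\rho_1 + \varphi)(c^2\rho_{1j} - \varepsilon\Delta w^j) - \rho\,\psi^k w^j_k. \tag{2.9}$$

Next we obtain from the mass equation in (1.3) and the definition (1.4) of ϕ that

$$\varphi_t + \rho_0\operatorname{div}\psi = -\operatorname{div}(\varphi u) - \delta^2\left[\rho_{1t} + \operatorname{div}(\rho_1 u)\right]. \tag{2.10}$$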
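The pressure identity used at the start of this proof can be checked numerically. The sketch below assumes a polytropic law P(ρ) = ρ^γ with γ = 1.4 and some arbitrary sample values for ρ_0, δ, ρ_1, and ϕ; none of these choices come from the paper, which treats general P satisfying (1.2). It takes the divided difference with a repeated node to be P[ρ_0, ρ_0, ρ] = (P(ρ) − P(ρ_0) − P'(ρ_0)(ρ − ρ_0)) / (ρ − ρ_0)², which is the standard definition.

```python
def P(rho, gamma=1.4):
    """Hypothetical polytropic pressure law P(rho) = rho**gamma."""
    return rho**gamma

def dP(rho, gamma=1.4):
    """Derivative P'(rho) of the pressure law above."""
    return gamma * rho**(gamma - 1.0)

def divided_difference(rho0, rho):
    """P[rho0, rho0, rho] = (P(rho) - P(rho0) - P'(rho0)(rho - rho0)) / (rho - rho0)^2."""
    d = rho - rho0
    return (P(rho) - P(rho0) - dP(rho0) * d) / d**2

rho0, delta, rho1, phi = 1.0, 0.1, 0.7, 0.002   # illustrative values only
rho = rho0 + delta**2 * rho1 + phi              # the expansion (1.4)
c2 = dP(rho0)                                   # c^2 = P'(rho0)

# Both sides of P - P(rho0) - c^2 delta^2 rho1 = c^2 phi + P[..](delta^2 rho1 + phi)^2.
lhs = P(rho) - P(rho0) - c2 * delta**2 * rho1
rhs = c2 * phi + divided_difference(rho0, rho) * (delta**2 * rho1 + phi)**2
print(lhs, rhs)
```

Differentiating this identity in $x_j$ and multiplying by δ^{-2} gives the first display of the proof; the quadratic term is what makes the remaining pressure contribution of higher order.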

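Integrating and combining (2.9) and (2.10), and noting the cancellation of the leading pressure terms, we obtain that

$$\begin{aligned}
\int \tfrac12\left[\rho|\psi|^2 + c^2\rho_0^{-1}\delta^{-2}\varphi^2\right]dx\,\Big|_0^t &+ \int_0^t\!\!\int \left[\varepsilon|\nabla\psi|^2 + \lambda(\operatorname{div}\psi)^2\right]dx\,ds \\
&= \int_0^t\!\!\int P[\rho_0,\rho_0,\rho]\left(\delta^2\rho_1^2 + 2\rho_1\varphi + \delta^{-2}\varphi^2\right)\operatorname{div}\psi\,dx\,ds \\
&\quad + \int_0^t\!\!\int \left[\rho_0^{-1}\left(\delta^2\rho_1 + \varphi\right)\left(c^2\rho_{1j} - \varepsilon\Delta w^j\right)\psi^j - \rho\,\psi^j\psi^k w^j_k\right]dx\,ds \\
&\quad - c^2\rho_0^{-1}\int_0^t\!\!\int \left[\tfrac12\delta^{-2}\varphi^2\operatorname{div}\psi + \left(\rho_{1t} + \operatorname{div}(\rho_1 u)\right)\varphi\right]dx\,ds.
\end{aligned} \tag{2.11}$$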

We apply standard techniques to estimate each of the terms on the right here. For example, when n = 3, Z tZ Z t 1/4 3/4 j k j ≤C ρψ ψ w dxds k∇w(·, s)k2 kψ(·, s)k2 k∇ψ(·, s)k2 ds k 0 0 Z tZ Z t 3 2 −1 4 |∇ψ| dxds + η k∇w(·, s)k2 kψ(·, s)k2 ds . ≤C η 0

0

The first term on the right here can be absorbed into the left side of (2.11) for small η, and the second is bounded by $C\int_0^t \|\nabla w(\cdot,s)\|_2^4\,A(s)\,ds$. Estimating the other terms in (2.11) in a similar way, we thus find that

$$A(t) \le C\left[\delta^2 + \delta^{-4}\int_0^t\!\!\int |\varphi|^4\,dx\,ds + \int_0^t \|\nabla w(\cdot,s)\|_2^4\,A(s)\,ds\right].$$

Gronwall’s inequality, together with our assumptions in (1.8) that

$$\int_0^T \|\nabla w(\cdot,s)\|_2^4\,ds \le C\,C_1,$$

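then enables us to eliminate the last term on the right here. This proves (2.6). To prove (2.7) we multiply (2.9) by $\dot\psi^j$ and integrate to obtain

$$\begin{aligned}
\int_0^t\!\!\int |\dot\psi|^2\,dx\,ds + \int |\nabla\psi(x,t)|^2\,dx &\le C\int |\nabla\psi(x,0)|^2\,dx + C\int_0^t\!\!\int |\nabla\psi|^2|\nabla u|\,dx\,ds \\
&\quad + \int_0^t\!\!\int \nabla\left(c^2\rho_1 - \delta^{-2}P\right)\cdot\dot\psi\,dx\,ds \\
&\quad + \int_0^t\!\!\int \left[\left(\delta^4\rho_1^2 + \varphi^2\right)\left(|\nabla\rho_1|^2 + |\Delta w|^2\right) + |\psi|^2|\nabla w|^2\right]dx\,ds.
\end{aligned} \tag{2.12}$$
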

Again, we can bound each of the terms on the right here in a fairly straightforward way by applying the hypotheses (1.7)–(1.8), with the exception of the pressure term

$$J = \int_0^t\!\!\int \nabla\left(c^2\rho_1 - \delta^{-2}P(\rho)\right)\cdot\dot\psi\,dx\,ds,$$

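which is somewhat more complicated. First, integrating by parts in both x and t and letting $\zeta = P(\rho) - P(\rho_0) - c^2\delta^2\rho_1$, we find that

$$|J| \le C\delta^{-2}\left|\int \zeta\operatorname{div}\psi\,dx\,\Big|_0^t\right| + C\delta^{-2}\left|\int_0^t\!\!\int \left[(\operatorname{div}\psi)\,\zeta_t + \psi^j_k u^k\zeta_j\right]dx\,ds\right|. \tag{2.13}$$

The first term on the right here is easily seen to be bounded by $C\left[\delta^{-2}A(t) + \delta^4\right]$, plus a term which can be absorbed into the left side of (2.12). Applying (1.3) to compute $\zeta_t$ and $\nabla\zeta$, we find that the other term is bounded by

$$C\delta^{-2}\int_0^t\!\!\int (\operatorname{div}\psi)^2\,dx\,ds + C\delta^{-2}\left|\int_0^t\!\!\int \left[P_j u^k\psi^j_k - P_j u^j\psi^k_k\right]dx\,ds\right|,$$

plus terms which are of higher order in δ. The first term above is bounded by $C\delta^{-2}A(t)$; and an integration by parts shows that the second is bounded by

$$C\delta^{-2}\iint |P(\rho) - P(\rho_0)|\,|\nabla u|\,|\nabla\psi| \le C\delta^{-2}\iint \left(\delta^2|\rho_1| + |\varphi|\right)\left(|\nabla w| + |\nabla\psi|\right)|\nabla\psi|.$$

It now follows easily that $|J| \le C\left[1 + \delta^{-2}A(t)\right]$ plus terms which can be absorbed into the left side of (2.12). The other terms on the right side of (2.12) are estimated in a similar way; details are omitted. This proves (2.7). Finally, to prove (2.8), we apply the operator $D_t + \nabla\cdot u$ to the momentum equation in (1.3), obtaining

$$\rho\ddot u^j + \delta^{-2}\left[P_{jt} + \operatorname{div}(P_j u)\right] = \varepsilon\left[\Delta u^j_t + \operatorname{div}(\Delta u^j\,u)\right] + \lambda\left[\operatorname{div} u_{jt} + \operatorname{div}((\operatorname{div} u_j)\,u)\right] + (\rho f^j)_t + \operatorname{div}(\rho f^j u).$$

Multiplying by $\dot u^j$ and integrating and simplifying, we then get

$$\begin{aligned}
\int \tfrac12\rho|\dot u|^2\,dx\,\Big|_0^t + \int_0^t\!\!\int \left[\varepsilon|\nabla\dot u|^2 + \lambda\,(\dot{\operatorname{div}}\,u)^2\right]dx\,ds &\le C\int_0^t\!\!\int |\nabla u|^2\left(|\nabla\dot u| + |\dot{\operatorname{div}}\,u|\right)dx\,ds \\
&\quad + \delta^{-2}\left|\int_0^t\!\!\int \dot u^j\left[P_{jt} + \operatorname{div}(P_j u)\right]dx\,ds\right| \\
&\quad + \left|\int_0^t\!\!\int \dot u^j\left[(\rho f^j)_t + \operatorname{div}(\rho f^j u)\right]dx\,ds\right|.
\end{aligned} \tag{2.14}$$
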

Again, routine estimates apply to each of the terms on the right here, with the exception of the pressure term

$$K = \delta^{-2}\iint \dot u^j\left[P_{jt} + \operatorname{div}(P_j u)\right] = -\delta^{-2}\iint \left(\dot u^j_j P_t + \dot u^j_k P_j u^k\right).$$

Writing $P_t = -P'\operatorname{div}(\rho u)$ in the first term and interchanging the $x_k$ and $x_j$ integrations in the second term, we find that the ∇ρ terms cancel, so that

$$|K| = \delta^{-2}\left|\iint \left[(\rho P' - P)\,\dot u^j_j\operatorname{div}\psi + (P - P_0)\,\dot u^j_k u^k_j\right]\right| \le C\left[\eta\iint |\nabla\dot u|^2 + \eta^{-1}\delta^{-4}\iint \left(|\nabla\psi|^2 + |\rho - \rho_0|^2|\nabla w|^2\right)\right].$$

The first term on the right here can be absorbed into the left side of (2.14) when η is small, and the second is easily seen to be bounded by $C\left[1 + \delta^{-4}A(t)\right]$. The other terms on the right side of (2.14) are estimated in a similar way; details are omitted.

In the following lemma we derive several auxiliary estimates required for closing the bounds of Lemma 2.1. These arguments make important use of the effective flux F described above in (1.16)–(1.18).

Lemma 2.2. Assume that the hypotheses of Lemma 2.1 are in force, and define

$$E(t) = 1 + \delta^{-2}A(t) + B(t) + \delta^2 D(t). \tag{2.15}$$

Then

$$\int_0^t\!\!\int \left[F^4 + |\omega|^4\right]dx\,ds \le C\delta^{-1}E(t)^2, \tag{2.16}$$

$$\int_0^t\!\!\int \varphi^4\,dx\,ds \le C\delta^7 E(t)^2, \tag{2.17}$$

$$\int_0^t\!\!\int |\nabla\psi|^3\,dx\,ds \le C\delta^{1/2}E(t)^{3/2}, \tag{2.18}$$

$$\int_0^t\!\!\int |\nabla\psi|^4\,dx\,ds \le C\delta^{-1}E(t)^2. \tag{2.19}$$

Proof. We give the proof for the case that n = 3; the proof for n = 2 is similar and, in fact, results in slightly more favorable dependence on δ. First, from the definition (1.16) of F,

$$\|F(\cdot,s)\|_2^2 \le C\int \left[(\operatorname{div} u)^2 + \delta^{-4}(\rho - \rho_0)^2\right]dx \le C\int \left[|\nabla\psi|^2 + \delta^{-4}(\delta^2\rho_1 + \varphi)^2\right]dx \le CE(s);$$

and from (1.18),

$$\|\nabla F(\cdot,s)\|_2^2 \le C\left[\|\dot u(\cdot,s)\|_2^2 + \|f(\cdot,s)\|_2^2\right] \le C\left[\|\dot w(\cdot,s)\|_2^2 + \|\dot\psi(\cdot,s)\|_2^2 + \|f(\cdot,s)\|_2^2\right] \le C\delta^{-2}E(s),$$

and

$$\int_0^t\!\!\int |\nabla F|^2\,dx\,ds \le CE(t).$$

Then by a standard Sobolev imbedding,

$$\int_0^t\!\!\int F^4\,dx\,ds \le C\int_0^t \|F(\cdot,s)\|_2\,\|\nabla F(\cdot,s)\|_2^3\,ds \le C\delta^{-1}E(t)^2.$$

This proves the bound in (2.16) for F; the bound for ω follows easily from the identity

$$\varepsilon|\nabla\omega|^2 = \varepsilon\operatorname{div}(\omega\nabla\omega) + (\rho\,\omega\,\dot u^k)_j - (\rho\,\omega\,\dot u^j)_k + \rho\left(\dot u^j\omega_k - \dot u^k\omega_j\right) + \left[(\rho f^j)_k - (\rho f^k)_j\right]\omega$$

for $\omega = \omega^{j,k}$, which can be derived from the momentum equation in (1.3). To prove (2.17), we substitute from (1.16) for div u in terms of F in the mass equation in (1.3) to obtain

$$(\varepsilon + \lambda)\left[(\rho - \rho_0)_t + \nabla(\rho - \rho_0)\cdot u\right] + \delta^{-2}\rho\left[P(\rho) - P(\rho_0)\right] = -\rho F.$$

Multiplying by $(\rho - \rho_0)^3$ and applying the assumptions (1.2) and (2.1), we then get

$$\left[\rho(\rho - \rho_0)^4\right]_t + \operatorname{div}\left[\rho(\rho - \rho_0)^4 u\right] + C^{-1}\delta^{-2}(\rho - \rho_0)^4 \le C|\rho - \rho_0|^3|F| \le C\left[\eta\,\delta^{-2}(\rho - \rho_0)^4 + \eta^{-1}\delta^6 F^4\right].$$

Taking η small and integrating, we then obtain

$$\int_0^t\!\!\int (\rho - \rho_0)^4\,dx\,ds \le C\delta^8\int_0^t\!\!\int F^4\,dx\,ds + C\delta^2\int (\delta^2\rho_1 + \varphi)^4(x,0)\,dx,$$

which together with (2.16), (1.8), and (1.10), proves (2.17). Next, as in [1], we write

$$\Delta u^j = u^j_{kk} = u^k_{kj} + (u^j_k - u^k_j)_k = (\varepsilon + \lambda)^{-1}\left[F + \delta^{-2}(P - P_0)\right]_j + \omega^{j,k}_k,$$

so that, by the Marcinkiewicz theorem (Stein [8], p. 96),

$$\int_0^t\!\!\int |\nabla\psi|^4\,dx\,ds \le C\int_0^t\!\!\int \left[|\nabla w|^4 + F^4 + \delta^{-8}(\rho - \rho_0)^4 + |\omega|^4\right]dx\,ds \le C\delta^{-1}E(t)^2,$$

by (1.8), (2.16), and (2.17). This proves (2.19), and (2.18) follows by interpolation with the bound

$$\iint |\nabla\psi|^2 \le A \le \delta^2 E.$$

We are now in a position to derive our last preliminary estimate, which gives a pointwise bound for $\varphi = \rho - \rho_0 - \delta^2\rho_1$.

Lemma 2.3. Assume that the hypotheses of Lemma 2.1 are in force. Then

$$\sup_{0\le s\le t}\|\varphi(\cdot,s)\|_\infty \le C\left[h(\delta) + \delta^{1/4}E(t)^{1/2}\right], \tag{2.20}$$

where E is as defined in (2.15).

Proof. Again we substitute from (1.16) for div u in terms of F in the mass equation in (1.3) to obtain

$$(\varepsilon + \lambda)\frac{d}{dt}\left[\log\rho - \log\rho_0\right] + \delta^{-2}\left[P(\rho) - P(\rho_0)\right] = -F,$$

where the convective derivative on the left is along a particle path x(t). Defining

$$\alpha(t) = (\varepsilon + \lambda)^{-1}\,\frac{P(\rho(x(t),t)) - P(\rho_0)}{\log\rho(x(t),t) - \log\rho_0},$$

we then get

$$\log\frac{\rho(x(t),t)}{\rho_0} = \exp\left(-\delta^{-2}\int_0^t \alpha(s)\,ds\right)\log\frac{\rho(x(0),0)}{\rho_0} - (\varepsilon + \lambda)^{-1}\int_0^t \exp\left(-\delta^{-2}\int_s^t \alpha(\tau)\,d\tau\right)F(x(s),s)\,ds.$$

Now, (2.1) shows that $C^{-1} \le \alpha \le C$, so that by (1.11),

$$|\rho(x(t),t) - \rho_0| \le C\left[h(\delta) + \delta^{1/4}\right] + C\int_0^t e^{(s-t)/C\delta^2}\left[\|F(\cdot,s)\|_4 + \|\nabla F(\cdot,s)\|_4\right]ds. \tag{2.21}$$

We apply (1.18) to bound the ∇F term here, in the case that n = 3, by

$$C\int_0^t e^{(s-t)/C\delta^2}\left[\|\dot u(\cdot,s)\|_4 + \|f(\cdot,s)\|_4\right]ds.$$

The $\dot u$ term here is bounded by

$$C\int_0^t e^{(s-t)/C\delta^2}\,\|\dot u(\cdot,s)\|_2^{1/4}\,\|\nabla\dot u(\cdot,s)\|_2^{3/4}\,ds \le C\delta\left(\iint \left(|\dot w|^2 + |\dot\psi|^2\right)\right)^{1/8}\left(\iint |\nabla\dot u|^2\right)^{3/8} \le C\delta\,(1 + B)^{1/8}D^{3/8} \le C\delta^{1/4}E^{1/2},$$

and similarly for the f term. Estimating the other term in (2.21) in a similar way, we thus obtain that

$$\|\rho - \rho_0\|_\infty \le C\left[h(\delta) + \delta^{1/4}E^{1/2}\right].$$

The result (2.20) then follows from this and the definition $\rho = \rho_0 + \delta^2\rho_1 + \varphi$.

Proof of the theorem. Substituting the estimates (2.18)–(2.19) into (2.6)–(2.8) and taking δ small, we obtain that

$$A \le C(\delta^2 + \delta^3E^2), \qquad B \le C(1 + \delta^{-2}A + \delta^{1/2}E^{3/2}), \qquad D \le C(\delta^{-2} + \delta^{-4}A + B + \delta^{-1}E^2),$$

where again $E = 1 + \delta^{-2}A + B + \delta^2D$. It follows easily that $E(t) \le C\left[1 + \delta^{1/4}E(t)^2\right]$. Now, since E(0) ≤ C and E is continuous in t, there are positive constants $\delta_0$ and C, as described in the theorem, such that, if $\delta \le \delta_0$, then E(t) ≤ C. This proves (1.12), (1.14), and (1.15), and (1.13) follows from Lemma 2.3. The existence of the solution (ρ, u) on $\mathbb{R}^n \times [0, T)$ then follows from these estimates together with a routine continuation argument.

Acknowledgement. The author expresses with pleasure his thanks to Zhouping Xin and to David Levermore for interesting and helpful conversations related to this work.

References

1. Hoff, D.: Global solutions of the Navier-Stokes equations for multidimensional compressible flow with discontinuous initial data. J. of Differential Equations 120, 215–254 (1995)
2. Klainerman, S. and Majda, A.: Compressible and incompressible fluids. Commun. Pure Appl. Math. 35, 629–651 (1982)
3. Kreiss, H.O., Lorenz, J. and Naughton, M.J.: Convergence of the solutions of the compressible to the solutions of the incompressible Navier-Stokes equations. Adv. Appl. Math. 12, 187–214 (1991)
4. Lions, P.L.: Existence globale de solutions pour les équations de Navier-Stokes compressibles isentropiques. C.R. Acad. Sci. Paris 316, 1335–1340 (1993)
5. Lions, P.L.: Limites incompressible et acoustique pour des fluides visqueux compressibles isentropiques. C.R. Acad. Sci. Paris 317, 1197–1202 (1993)
6. Racke, R.: Lectures on Nonlinear Evolution Equations. Vieweg Verlag, 1992
7. Schonbek, M.E.: Large time behavior of solutions to the Navier-Stokes equations in $H^m$ spaces. Commun. Partial Diff. Eqs. 20, no. 1–2, 103–117 (1995)
8. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton: Princeton Univ. Press, 1970
9. Wiegner, M.: Decay results for weak solutions of the Navier-Stokes equations on $\mathbb{R}^n$. J. London Math. Soc. 35, no. 2, 303–313 (1987)

Communicated by A. Jaffe

Commun. Math. Phys. 192, 555 – 567 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Demazure Modules and Perfect Crystals

Atsuo Kuniba¹, Kailash C. Misra², Masato Okado³,⋆, Jun Uchiyama⁴

¹ Institute of Physics, University of Tokyo, Komaba, Tokyo 153, Japan
² Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205, USA
³ Department of Mathematics, The University of Melbourne, Parkville, Victoria, 3052, Australia
⁴ Department of Physics, Rikkyo University, Nishi-Ikebukuro, Tokyo 171, Japan

Received: 29 August 1996 / Accepted: 18 July 1997

Abstract: We give a criterion for the Demazure crystal $B_w(\lambda)$ defined by Kashiwara to have a tensor product structure. We study the $\widehat{\mathfrak{sl}}_n$ symmetric tensor case, and see some Demazure characters are expressed using Kostka-Foulkes polynomials.

0. Introduction

In [S] Sanderson calculated the character of the Demazure module $V_w(\lambda)$ of $\widehat{\mathfrak{sl}}_2$ specialized at $e^{-\delta} = 1$ (δ: the null root) using Littelmann’s path [L]. To state it more precisely, we set

$$w^{(k)} = \underbrace{r_{1-i}\cdots r_1 r_0}_{k},$$

where $r_0$, $r_1$ are the simple reflections, and i = 0 (k: even), = 1 (k: odd). Let $V_{w^{(k)}}(\lambda)$ be the $U(\mathfrak{n}_+)$-module generated by an extremal vector of weight $w^{(k)}\lambda$ of the irreducible highest weight $U(\widehat{\mathfrak{sl}}_2)$-module of highest weight λ. Let $V^l$ be the (l + 1)-dimensional irreducible module of $\mathfrak{sl}_2$. Taking $\lambda = l\Lambda_0$ for simplicity, her result reads as

$$\operatorname{ch} V_{w^{(k)}}(l\Lambda_0)\big|_{e^{-\delta}=1} = e^{l\Lambda_i}\,(\operatorname{ch} V^l)^k.$$

This suggests that there exists a tensor product structure in the Demazure module at a combinatorial level.

Now let g be any affine Lie algebra. As for the irreducible highest weight $U_q(\mathfrak{g})$-module V(λ) at q = 0, we have a combinatorial object path. (This path is different from Littelmann’s path.) It has emerged in the study of solvable lattice models (cf. [DJKMO1, DJKMO2]). Its study is accomplished with the aid of crystal base theory (cf. [KMN1, KMN2]). Let B be a perfect crystal of level l. Then, for any dominant integral weight λ of level l, the crystal B(λ) of V(λ) is represented as a set of paths. Roughly speaking, a path is an element of the semi-infinite tensor product ⋯ ⊗ B ⊗ B with some stability condition. On the other hand, Kashiwara [Ka] gave a recursive formula to obtain the Demazure crystal $B_w(\lambda)$ for any symmetrizable Kac-Moody algebra. Denoting the Bruhat order by ≺, it reads as follows. If $r_i w \succ w$, then

$$B_{r_i w}(\lambda) = \bigcup_{n\ge 0} \tilde f_i^{\,n} B_w(\lambda) \setminus \{0\}.$$

Thus, it seems natural to ask how $B_w(\lambda)$ is characterized in the set of paths in the case of affine Lie algebras. In this article, we present a criterion for the Demazure crystal $B_w(\lambda)$ to have a tensor product structure. To explain more precisely, let $w^{(k)}$ (k = 1, 2, ⋯) be an increasing sequence of affine Weyl group elements with respect to the Bruhat order, and let $u_\lambda$ be a highest weight vector of B(λ). Then, under the assumptions (I–IV) in Sect. 2.3, we can show the following isomorphism of crystals:

$$B_{w^{(k)}}(\lambda) \simeq u_{\lambda_j} \otimes B_a^{(j,\cdots,j-\kappa+1)} \otimes B^{\otimes(j-\kappa)} \qquad \text{if } j \ge \kappa.$$

Here j, a, κ ≥ 1, $B_a^{(j,\cdots,j-\kappa+1)} \subset B^{\otimes\kappa}$ are determined when checking the assumptions, and $\lambda_j = \sigma^j(\lambda)$ with an automorphism σ defined from the perfect crystal B. In Sanderson’s case above, we have j = k, a = 1, κ = 1, $B_a^{(j)} = B$ and B is the crystal of the (l + 1)-dimensional irreducible $U_q(\mathfrak{sl}_2)$-module. As a corollary, we can see

$$\operatorname{ch} B_{w^{(k)}}(\lambda)\big|_{e^{-\delta}=1} = e^{\lambda_j}\,(\operatorname{ch} B_a^{(j,\cdots,j-\kappa+1)})\,(\operatorname{ch} B)^{j-\kappa} \qquad \text{if } j \ge \kappa.$$

The plan of this article is as follows. In Sect. 1, we summarize the theory of perfect crystal. We introduce Kashiwara’s Demazure crystal $B_w(\lambda)$, and state the main result in Sect. 2. We apply our result to the $\widehat{\mathfrak{sl}}_n$ symmetric tensor case in Sect. 3. A relation between Demazure characters and Kostka-Foulkes polynomials and some other discussions are included in Sect. 4.

⋆ Permanent address: Department of Mathematical Science, Faculty of Engineering Science, Osaka University, Toyonaka, Osaka 560, Japan.

1. Perfect Crystal

1.1. Notation. We follow the notations of the quantized universal enveloping algebra and the crystal base in [KMN1]. In particular, $\{\alpha_i\}_{i\in I}$ is the set of simple roots, $\{h_i\}_{i\in I}$ is the set of simple coroots, P is the weight lattice and $P_+ = \{\lambda \in P \mid \langle\lambda, h_i\rangle \ge 0 \text{ for any } i\}$. $U_q(\mathfrak{g})$ is the quantized universal enveloping algebra of an affine Lie algebra g. V(λ) is the irreducible highest weight module of highest weight $\lambda \in P_+$, $u_\lambda$ is its highest weight vector, (L(λ), B(λ)) is its crystal base.

For the notation of a finite-dimensional representation of $U'_q(\mathfrak{g})$, we follow Sect. 3 in [KMN1]. For instance, $P_{cl}$ is the classical weight lattice, $U'_q(\mathfrak{g})$ is the subalgebra of $U_q(\mathfrak{g})$ generated by $e_i$, $f_i$, $q^h$ ($h \in (P_{cl})^*$) and $\operatorname{Mod}^f(\mathfrak{g}, P_{cl})$ is the category of finite-dimensional $U'_q(\mathfrak{g})$-modules which have the weight decompositions. We set $P_{cl}^+ = \{\lambda \in P_{cl} \mid \langle\lambda, h_i\rangle \ge 0 \text{ for any } i\} \simeq \sum_i \mathbb{Z}_{\ge 0}\Lambda_i$ and $(P_{cl}^+)_l = \{\lambda \in P_{cl}^+ \mid \langle\lambda, c\rangle = l\}$, where c is the canonical central element. Assume V in $\operatorname{Mod}^f(\mathfrak{g}, P_{cl})$ has a crystal base (L, B). For an element b of B, we set

$$\varepsilon_i(b) = \max\{n \ge 0 \mid \tilde e_i^{\,n} b \in B\}, \qquad \varepsilon(b) = \sum_i \varepsilon_i(b)\Lambda_i,$$

and

$$\varphi_i(b) = \max\{n \ge 0 \mid \tilde f_i^{\,n} b \in B\}, \qquad \varphi(b) = \sum_i \varphi_i(b)\Lambda_i.$$

1.2. Perfect crystal. In [KMN1] Kang et al. introduced the notion of perfect crystal. For the definition of the perfect crystal, see Definition 4.6.1 in [KMN1]. Let B be a perfect crystal of level l. For $\lambda \in (P_{cl}^+)_l$, let b(λ) ∈ B be the element such that φ(b(λ)) = λ. From the definition of perfect crystal, such a b(λ) exists and is unique. Let σ be the automorphism of $(P_{cl}^+)_l$ given by σλ = ε(b(λ)). We set $b_k = b(\sigma^{k-1}\lambda)$ and $\lambda_k = \sigma^k\lambda$. Then perfectness assures that we have the isomorphism of crystals

$$B(\lambda_{k-1}) \simeq B(\lambda_k) \otimes B.$$

Iterating this isomorphism, we have

$$\psi_k : B(\lambda) \simeq B(\lambda_k) \otimes B^{\otimes k}.$$

Defining the set of paths $\mathcal{P}(\lambda, B)$ by

$$\mathcal{P}(\lambda, B) = \{p = \cdots \otimes p(2) \otimes p(1) \mid p(j) \in B,\ p(k) = b_k \text{ for } k \gg 0\},$$

we see that B(λ) is isomorphic to $\mathcal{P}(\lambda, B)$. Under this isomorphism, the highest weight vector in B(λ) corresponds to the path $\bar p = \cdots \otimes b_k \otimes \cdots \otimes b_2 \otimes b_1$. We call $\bar p$ the ground-state path.

1.3. Actions of $\tilde e_i$ and $\tilde f_i$. We need to know the actions of $\tilde e_i$ and $\tilde f_i$ on the set of paths $\mathcal{P}(\lambda, B)$. To see this, we consider the following isomorphism:

$$\mathcal{P}(\lambda, B) \simeq B(\lambda_k) \otimes B^{\otimes k}. \tag{1.1}$$

For each $p \in \mathcal{P}(\lambda, B)$, if we take k sufficiently large, we can assume that p corresponds to $u_{\lambda_k} \otimes p(k) \otimes \cdots \otimes p(1)$ with $u_{\lambda_k}$ the highest weight vector of $B(\lambda_k)$ and p(n) ∈ B (n = 1, ⋯, k). Then we apply Proposition 2.1.1 in [KN] to see on which component $\tilde e_i$ or $\tilde f_i$ acts. Let us suppose that $\tilde e_i$ or $\tilde f_i$ acts on the j-th component from the right end. If j < k + 1, we have

$$\tilde e_i p = \cdots \otimes \tilde e_i p(j) \otimes \cdots \otimes p(2) \otimes p(1) \tag{1.2}$$

or

$$\tilde f_i p = \cdots \otimes \tilde f_i p(j) \otimes \cdots \otimes p(2) \otimes p(1). \tag{1.3}$$

If j = k + 1, we see we should have taken k larger. This happens only for $\tilde f_i$.

This determination of the component can be rephrased using the notion of signature. Let $p \in \mathcal{P}(\lambda, B)$ correspond to $u_{\lambda_k} \otimes p(k) \otimes \cdots \otimes p(1)$ under (1.1) as above. With p(j) (1 ≤ j ≤ k), we associate

$$\epsilon^{(j)} = (\epsilon^{(j)}_1, \cdots, \epsilon^{(j)}_m), \qquad m = \varepsilon_i(p(j)) + \varphi_i(p(j)),$$

$$\epsilon^{(j)}_a = \begin{cases} - & 1 \le a \le \varepsilon_i(p(j)), \\ + & \varepsilon_i(p(j)) < a \le m. \end{cases}$$

For the highest weight vector $u_{\lambda_k}$, we take

$$\epsilon^{(k+1)} = (\underbrace{+, \cdots, +}_{\langle\lambda_k, h_i\rangle}).$$


We then append these $\epsilon^{(j)}$’s so that we have

$$\epsilon = (\epsilon^{(k+1)}, \epsilon^{(k)}, \cdots, \epsilon^{(1)}).$$

We call it the signature of p truncated at the k-th position. Next we consider a sequence of signatures

$$\epsilon = \eta_0, \eta_1, \cdots, \eta_{\max}.$$

Here $\eta_{j+1}$ is obtained from $\eta_j$ by deleting the leftmost adjacent (+, −) pair of $\eta_j$. Eventually, we arrive at the following signature

$$\eta_{\max} = (\underbrace{-, \cdots, -}_{n_-}, \underbrace{+, \cdots, +}_{n_+}),$$

with $n_\pm \ge 0$. We call it the reduced signature and denote it by $\bar\epsilon$. The component on which $\tilde e_i$ or $\tilde f_i$ acts in (1.2) or (1.3) reads as follows. If $n_- = 0$ (resp. $n_+ = 0$), we set $\tilde e_i p = 0$ (resp. $\tilde f_i p = 0$). Otherwise, take the rightmost − (resp. leftmost +), and find the component $\epsilon^{(j)}$ to which it belonged. Then, this j is the position in (1.2) (resp. (1.3)) we looked for. Note that if k is large enough, the position j does not depend on the choice of k.

Remark 1. Of course, this signature rule can be applied to the tensor product of two crystals $B_1 \otimes B_2$.

Example 1. Let $\mathfrak{g} = \widehat{\mathfrak{sl}}_2$, B be the classical crystal of the 3-dimensional irreducible representation. Its crystal graph is described as follows (an arrow labelled i is the action of $\tilde f_i$):

B:  00 --1--> 01 --1--> 11,   11 --0--> 01 --0--> 00.

It is known that B is perfect of level 2. We have isomorphisms $B(\lambda) \simeq \mathcal{P}(\lambda, B)$ for $\lambda = 2\Lambda_0,\ \Lambda_0 + \Lambda_1,\ 2\Lambda_1$. Let $\lambda = 2\Lambda_0$. We see $\lambda_k = 2\Lambda_0$ (k: even), $2\Lambda_1$ (k: odd). The ground-state path is given by

$$\bar p = \cdots \otimes 11 \otimes 00 \otimes 11 \otimes 00 \otimes 11.$$

Consider a path

$$p = \cdots \otimes 11 \otimes 01 \otimes 01 \otimes 01 \otimes 00.$$

The dotted part of p is the same as that of $\bar p$. Then the signature of p truncated at the 5th position with respect to i = 1 and its reduced signature read as follows:

$$\epsilon = (++,\ --,\ -+,\ -+,\ -+,\ ++), \qquad \bar\epsilon = (\overset{4}{-}\ \overset{2}{+}\ \overset{1}{+}\ \overset{1}{+}).$$

Here the number above each sign shows the component to which it belonged. Consequently, we have

$$\tilde e_1 p = \cdots \otimes 11 \otimes 00 \otimes 01 \otimes 01 \otimes 00, \qquad \tilde f_1 p = \cdots \otimes 11 \otimes 01 \otimes 01 \otimes 11 \otimes 00.$$

We will need the following simple but useful lemma in the sequel.
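The signature rule of Sect. 1.3 is easy to mechanize, and Example 1 gives a concrete case to check it against. The following Python sketch is an illustration, not code from the paper: the function names are hypothetical, only the color i = 1 is treated, and the values of ε_1, ϕ_1 on B = {00, 01, 11} are read off from the signatures listed in Example 1. Cancelling adjacent (+, −) pairs is done with a stack, which leaves the same surviving signs as repeatedly deleting the leftmost adjacent pair.

```python
# Data for i = 1 on B = {00, 01, 11}: eps_1, phi_1 and the arrows of f_1, e_1.
EPS = {"00": 0, "01": 1, "11": 2}
PHI = {"00": 2, "01": 1, "11": 0}
F = {"00": "01", "01": "11"}
E = {"11": "01", "01": "00"}

def signature(path, top_plus):
    """Signed list [(sign, position)] for u (x) p(k) (x) ... (x) p(1).

    `path` lists p(k), ..., p(1) from left to right; `top_plus` is the number
    of +'s contributed by the highest weight vector (position k + 1).
    """
    k = len(path)
    sig = [("+", k + 1)] * top_plus
    for pos, b in zip(range(k, 0, -1), path):
        sig += [("-", pos)] * EPS[b] + [("+", pos)] * PHI[b]
    return sig

def reduce_signature(sig):
    """Cancel adjacent (+, -) pairs until only (-,...,-,+,...,+) remains."""
    stack = []
    for sign, pos in sig:
        if sign == "-" and stack and stack[-1][0] == "+":
            stack.pop()            # this '-' cancels the nearest '+' on its left
        else:
            stack.append((sign, pos))
    return stack

def act(path, top_plus, op):
    """Apply e_1 (op='e') or f_1 (op='f'); return None for the zero element."""
    red = reduce_signature(signature(path, top_plus))
    minus = [pos for sign, pos in red if sign == "-"]
    plus = [pos for sign, pos in red if sign == "+"]
    if op == "e":
        if not minus:
            return None
        pos, table = minus[-1], E      # rightmost '-'
    else:
        if not plus:
            return None
        pos, table = plus[0], F        # leftmost '+'
    if pos == len(path) + 1:
        raise ValueError("truncate the path further to the left")
    new = list(path)
    new[len(path) - pos] = table[new[len(path) - pos]]
    return new

p = ["11", "01", "01", "01", "00"]      # p(5), ..., p(1) of Example 1
reduced = reduce_signature(signature(p, top_plus=2))
print(reduced)
print(act(p, 2, "e"), act(p, 2, "f"))
```

Here `top_plus=2` encodes ⟨λ_5, h_1⟩ = 2 for λ_5 = 2Λ_1. The reduced signature comes out as (− + + +) with components 4, 2, 1, 1, and the two actions agree with the paths displayed in Example 1.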


Lemma 1. Let $B_1$, $B_2$ be crystals, and let $b_1 \in B_1$, $b_2 \in B_2$. For any n ≥ 0, there exist p, q ≥ 0 such that

$$\tilde f_i^{\,p}(b_1 \otimes \tilde e_i^{\,q} b_2) = \tilde f_i^{\,n} b_1 \otimes b_2.$$

Proof. For a reduced signature $\bar\epsilon$, let $(\bar\epsilon)_+$ (resp. $(\bar\epsilon)_-$) denote the number of +’s (resp. −’s) in $\bar\epsilon$. Let $\bar\epsilon_i$ be the reduced signature of $b_i$ (i = 1, 2). We set $(\bar\epsilon_1)_+ = \alpha$, $(\bar\epsilon_2)_- = \beta$. We divide the proof into two cases: (a) α − n ≥ β, (b) α − n < β. Consider the case (a). In this case, we simply take p = n, q = 0. Consider the case (b). In this case, we take p = β − α + 2n, q = β − α + n.

1.4. Weight of a path. A map $H : B \otimes B \to \mathbb{Z}$ is called an energy function if for any b, b′ ∈ B and i ∈ I such that $\tilde e_i(b \otimes b') \ne 0$, it satisfies

$$H(\tilde e_i(b \otimes b')) = \begin{cases} H(b \otimes b') & \text{if } i \ne 0, \\ H(b \otimes b') + 1 & \text{if } i = 0 \text{ and } \varphi_0(b) \ge \varepsilon_0(b'), \\ H(b \otimes b') - 1 & \text{if } i = 0 \text{ and } \varphi_0(b) < \varepsilon_0(b'). \end{cases}$$

It is shown (Proposition 4.5.4 in [KMN1]) that under the isomorphism between B(λ) and $\mathcal{P}(\lambda, B)$, the weight of a path p = ⋯ ⊗ p(2) ⊗ p(1) is given by

$$\operatorname{wt} p = \lambda + \sum_{k=1}^{\infty}\left(af(\operatorname{wt} p(k)) - af(\operatorname{wt} b_k)\right) - \left(\sum_{k=1}^{\infty} k\left(H(p(k+1) \otimes p(k)) - H(b_{k+1} \otimes b_k)\right)\right)\delta. \tag{1.4}$$

For the definition of af, along with cl, see Sect. 3 of [KMN1].

2. Crystal of the Demazure Module

2.1. Demazure module. Let $\{r_i\}_{i\in I}$ be the set of simple reflections, and let W be the Weyl group. For w ∈ W, l(w) denotes the length of w, and ≺ denotes the Bruhat order on W. Let $U_q^+(\mathfrak{g})$ be the subalgebra of $U_q(\mathfrak{g})$ generated by the $e_i$’s. We consider the irreducible highest weight $U_q(\mathfrak{g})$-module V(λ) ($\lambda \in P_{cl}^+$). It is known that for w ∈ W, the extremal weight space $V(\lambda)_{w\lambda}$ is one dimensional. Let $V_w(\lambda)$ denote the $U_q^+(\mathfrak{g})$-module generated by $V(\lambda)_{w\lambda}$. These modules $V_w(\lambda)$ (w ∈ W) are called the Demazure modules. They are finite-dimensional subspaces of V(λ).

2.2. Kashiwara’s recursive formula. Let (L(λ), B(λ)) be the crystal base of V(λ). In [Ka] Kashiwara showed that for each w ∈ W, there exists a subset $B_w(\lambda)$ of B(λ) such that

$$\frac{V_w(\lambda) \cap L(\lambda)}{V_w(\lambda) \cap qL(\lambda)} = \bigoplus_{b \in B_w(\lambda)} \mathbb{Q}\,b.$$

Furthermore, $B_w(\lambda)$ has the following recursive property. If $r_i w \succ w$, then

$$B_{r_i w}(\lambda) = \bigcup_{n\ge 0} \tilde f_i^{\,n} B_w(\lambda) \setminus \{0\}. \tag{2.1}$$
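The recursion (2.1) can be run directly on truncated paths. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it takes the level-2 crystal B = {00, 01, 11} of Example 1 with λ = 2Λ_0, assumes the i-strings f_1: 00 → 01 → 11 and f_0: 11 → 01 → 00, truncates every path at K = k_max + 2 factors, and applies the closure under powers of f_i for the sequence w^(k) = ⋯ r_1 r_0. The function names are hypothetical.

```python
# i-strings of the level-2 perfect crystal B = {00, 01, 11} of Example 1:
# f_1: 00 -> 01 -> 11 and f_0: 11 -> 01 -> 00.
EPS = {0: {"00": 2, "01": 1, "11": 0}, 1: {"00": 0, "01": 1, "11": 2}}
PHI = {0: {"00": 0, "01": 1, "11": 2}, 1: {"00": 2, "01": 1, "11": 0}}
F = {0: {"11": "01", "01": "00"}, 1: {"00": "01", "01": "11"}}

def f_tilde(path, i):
    """Apply f_i to a truncated path (p(K), ..., p(1)); None is the zero element."""
    K = len(path)
    # lambda_K = 2*Lambda_0 for even K and 2*Lambda_1 for odd K (lambda = 2*Lambda_0).
    top_plus = 2 if i == K % 2 else 0
    stack = [("+", K + 1)] * top_plus
    for pos, b in zip(range(K, 0, -1), path):
        for sign in "-" * EPS[i][b] + "+" * PHI[i][b]:
            if sign == "-" and stack and stack[-1][0] == "+":
                stack.pop()                 # cancel an adjacent (+, -) pair
            else:
                stack.append((sign, pos))
    plus = [pos for sign, pos in stack if sign == "+"]
    if not plus:
        return None
    pos = plus[0]                           # leftmost '+' of the reduced signature
    if pos == K + 1:
        raise ValueError("truncation length K is too small")
    new = list(path)
    new[K - pos] = F[i][new[K - pos]]
    return tuple(new)

def demazure_sizes(k_max):
    """Sizes of B_{w^(k)}(2*Lambda_0), k = 0..k_max, via the recursion (2.1)."""
    K = k_max + 2                           # keep two untouched ground-state factors
    ground = tuple("11" if j % 2 else "00" for j in range(K, 0, -1))
    crystal, sizes = {ground}, [1]
    for k in range(1, k_max + 1):
        i = (k - 1) % 2                     # w^(k) = r_i w^(k-1): r_0 first, then r_1, ...
        frontier = list(crystal)
        while frontier:                     # close the set under powers of f_i
            q = f_tilde(frontier.pop(), i)
            if q is not None and q not in crystal:
                crystal.add(q)
                frontier.append(q)
        sizes.append(len(crystal))
    return sizes

print(demazure_sizes(4))
```

The sizes grow as powers of 3, which is consistent with Sanderson's formula quoted in the Introduction: at $e^{-\delta} = 1$ the character of $V_{w^{(k)}}(l\Lambda_0)$ is $e^{l\Lambda_i}(\operatorname{ch} V^l)^k$, so its dimension is $(l+1)^k$ with l = 2 here.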


This beautiful formula is essential in our consideration below. 2.3. Theorem. We list the assumptions required for our theorem. Let λ be an element of (Pcl+ )l , and let B be a classical crystal. Firstly, we assume (I)

B is perfect of level l.

Therefore, B(λ) is isomorphic to the set of paths P(λ, B). Let p = · · · ⊗ b2 ⊗ b1 be the ground-state path. We fix positive integers d, κ. For a set of elements i(j) a in I (j ≥ 1, 1 ≤ a ≤ d), we define Ba(j) (j ≥ 1, 0 ≤ a ≤ d) by [ (j) f˜in(j) Ba−1 Ba(j) = \ {0} (a = 1, · · · , d). (2.2) B0(j) = {bj }, a

n≥0

Using these, we next define Ba(j+1,j) (j ≥ 1, 0 ≤ a ≤ d) by B0(j+1,j) = B0(j+1) ⊗ Bd(j) , [ (j+1,j) f˜in(j+1) Ba−1 \ {0} Ba(j+1,j) = a

n≥0

(a = 1, · · · , d).

Similar definitions continue until we define B0(j+κ−1,···,j) = B0(j+κ−1) ⊗ Bd(j+κ−2,···,j) , [ (j+κ−1,···,j) f˜in(j+κ−1) Ba−1 Ba(j+κ−1,···,j) = \ {0} n≥0

a

(a = 1, · · · , d).

Let us now assume (II) For any j ≥ 1, we have Bd(j+κ−1,···,j) = Bd(j+κ−1,···,j+1) ⊗ B. If κ = 1, the right hand side should be understood as B. We call κ a mixing index. We normally take the minimal one. The following proposition is sometimes useful. It is proved in the same way as in the theorem using Lemma 1. Thus we omit the proof. = Ba(j+κ−1,···,j+1) Proposition 1. If Ba(j+κ−1,···,j) = Ba(j+κ−1,···,j+1) ⊗ B, then Ba(j+κ−1,···,j) 0 0 0 ⊗B for a ≤ a ≤ d. We introduce the third condition. (j) i ≤ εi(j) (b) for b ∈ Ba−1 . (III) For any j ≥ 1 and a = 1, · · · , d, we have hλj , hi(j) a a

Remark 2. Since the automorphism $\sigma$ is of finite order, it suffices to check the conditions (II), (III) for finitely many $j$'s.

For the last assumption, we define $w^{(k)}$ by

$$w^{(0)} = 1, \qquad w^{(k)} = r_{i_a^{(j)}} w^{(k-1)},$$

where $j$ and $a$ are fixed by $k = (j-1)d + a$, $j \ge 1$, $1 \le a \le d$.

(IV) The sequence of Weyl group elements $w^{(0)}, w^{(1)}, \dots$ is increasing with respect to the Bruhat order.


This condition is equivalent to $l(w^{(k)}) = k$. The following proposition is convenient to check (IV).

Proposition 2. If $\langle w\mu, h_j \rangle > 0$ for some $\mu \in P_{cl}^+$, then $r_j w \succ w$.

Proof. From $\langle \mu, w^{-1} h_j \rangle > 0$, we see $w^{-1} h_j$ is a positive coroot. This is equivalent to $w^{-1}\alpha_j \in \Sigma_+$, where $\Sigma_+$ is the set of positive roots. To see this is equivalent to $r_j w \succ w$, we refer, for example, to Proposition 4(i) in Sect. 5.2 (p.411) of [MP].

Now we define a subset $\mathcal{P}^{(k)}(\lambda, B)$ of $\mathcal{P}(\lambda, B)$ as follows. We set $\mathcal{P}^{(0)}(\lambda, B) = \{p\}$. For $k > 0$, we take $j$ and $a$ such that $k = (j-1)d + a$, $j \ge 1$, $1 \le a \le d$, and set

$$\mathcal{P}^{(k)}(\lambda, B) = \begin{cases} \cdots \otimes B_0^{(j+2)} \otimes B_0^{(j+1)} \otimes B_a^{(j,\dots,1)} & (j < \kappa) \\ \cdots \otimes B_0^{(j+2)} \otimes B_0^{(j+1)} \otimes B_a^{(j,\dots,j-\kappa+1)} \otimes B^{\otimes(j-\kappa)} & (j \ge \kappa). \end{cases} \tag{2.3}$$

Theorem 1. Under the assumptions (I-IV), we have $B_{w^{(k)}}(\lambda) \simeq \mathcal{P}^{(k)}(\lambda, B)$.

2.4. Proof of the theorem. In view of assumption (IV), we prove by induction on the length of $w^{(k)}$. If $k = 0$, i.e., $w^{(k)} = 1$, the statement is true. Next assume $k > 0$ and take $j$ and $a$ such that $k = (j-1)d + a$, $j \ge 1$, $1 \le a \le d$. From the recursive formula (2.1) and $w^{(k-1)} \prec w^{(k)}$, we have

$$B_{w^{(k)}}(\lambda) = \bigcup_{n \ge 0} \tilde f_{i_a^{(j)}}^{\,n} B_{w^{(k-1)}}(\lambda) \setminus \{0\}.$$

On the other hand, from the induction hypothesis, we have

$$B_{w^{(k-1)}}(\lambda) \simeq \mathcal{P}^{(k-1)}(\lambda, B) = \begin{cases} \cdots \otimes B_0^{(j+2)} \otimes B_0^{(j+1)} \otimes B_{a-1}^{(j,\dots,1)} & (j < \kappa) \\ \cdots \otimes B_0^{(j+2)} \otimes B_0^{(j+1)} \otimes B_{a-1}^{(j,\dots,j-\kappa+1)} \otimes B^{\otimes(j-\kappa)} & (j \ge \kappa). \end{cases}$$

Note that by definition, this is valid even for $a = 1$. In view of assumption (III), we see the action of $\tilde f_{i_a^{(j)}}$ does not give any effect on the part $\cdots \otimes B_0^{(j+2)} \otimes B_0^{(j+1)}$. Thus we ignore this part in the following consideration. Then, in the case of $j < \kappa$, the proof turns out to be trivial. Assume $j \ge \kappa$, set $i = i_a^{(j)}$, $B_1 = B^{\otimes\kappa}$, $B_2 = B^{\otimes(j-\kappa)}$, and take any elements $b_1$ and $b_2$ from $B_{a-1}^{(j,\dots,j-\kappa+1)} \subset B_1$ and $B_2$. From Lemma 1, for any $n \ge 0$, there exist $p, q \ge 0$ such that $\tilde f_i^{\,n} b_1 \otimes b_2 = \tilde f_i^{\,p}(b_1 \otimes \tilde e_i^{\,q} b_2)$. Noting that $\tilde e_i^{\,q} b_2 \in B^{\otimes(j-\kappa)}$, we can conclude

$$\cdots \otimes B_0^{(j+1)} \otimes B_a^{(j,\dots,j-\kappa+1)} \otimes B^{\otimes(j-\kappa)} \subset B_{w^{(k)}}(\lambda).$$

The other direction of the inclusion is clear.

2.5. Application to characters. Since we have established the weight preserving bijection between $B_{w^{(k)}}(\lambda)$ and $\mathcal{P}^{(k)}(\lambda, B)$, the following proposition turns out to be an immediate corollary.


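Corollary 1. Under the assumptions (I-IV), we have

$$\mathrm{ch}\, B_{w^{(k)}}(\lambda) = \sum_{p \in \mathcal{P}^{(k)}(\lambda, B)} e^{\mathrm{wt}\, p},$$

where $\mathrm{wt}\, p$ is given in (1.4).

In view of the tensor product structure of $\mathcal{P}^{(k)}(\lambda, B)$ (2.3), it would be more interesting to consider a classical character, which we define by

$$\mathrm{cl\,ch}\, B_w(\lambda) = \sum_{\mu \in P} \sharp B_w(\lambda)_\mu \, e^{\mathrm{cl}(\mu)}, \qquad B_w(\lambda)_\mu = \{ b \in B_w(\lambda) \mid \mathrm{wt}\, b = \mu \}.$$

Note that for $B(\lambda)$, the classical character does not make sense.

Corollary 2. Assume (I-IV). For $k \ge 1$, take $j$ and $a$ such that $k = (j-1)d + a$, $j \ge 1$, $1 \le a \le d$. Then,

$$\mathrm{cl\,ch}\, B_{w^{(k)}}(\lambda) = \begin{cases} e^{\lambda_j}\, \mathrm{ch}\, B_a^{(j,\dots,1)} & (j < \kappa) \\ e^{\lambda_j}\, (\mathrm{ch}\, B_a^{(j,\dots,j-\kappa+1)})(\mathrm{ch}\, B)^{j-\kappa} & (j \ge \kappa). \end{cases}$$

Proof. Let $p$ be an element in $\mathcal{P}^{(k)}(\lambda, B)$. From (1.4), the classical weight of $p$ is given by

$$\lambda + \sum_{i=1}^{\infty} (\mathrm{wt}\, p(i) - \mathrm{wt}\, b_i). \tag{2.4}$$

Noting that $p(i) = b_i$ ($i > j$), $\mathrm{wt}\, b_i = \lambda_{i-1} - \lambda_i$ and $\lambda_0 = \lambda$, (2.4) reads as

$$\lambda_j + \sum_{i=1}^{j} \mathrm{wt}\, p(i),$$

which immediately implies the statement.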
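The last step of the proof is a telescoping sum. As a worked sketch, using only the three facts quoted in the proof ($p(i) = b_i$ for $i > j$, $\mathrm{wt}\, b_i = \lambda_{i-1} - \lambda_i$, $\lambda_0 = \lambda$), the terms with $i > j$ cancel and the remaining differences collapse:

```latex
\begin{align*}
\lambda + \sum_{i=1}^{\infty}\bigl(\mathrm{wt}\,p(i) - \mathrm{wt}\,b_i\bigr)
 &= \lambda + \sum_{i=1}^{j}\mathrm{wt}\,p(i)
    - \sum_{i=1}^{j}(\lambda_{i-1}-\lambda_i) \\
 &= \lambda + \sum_{i=1}^{j}\mathrm{wt}\,p(i) - (\lambda_0-\lambda_j)
  = \lambda_j + \sum_{i=1}^{j}\mathrm{wt}\,p(i).
\end{align*}
```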

3. $\widehat{\mathfrak{sl}}_n$ Symmetric Tensor Case

In this section, we apply our theorem to the case of symmetric tensor representations of $U_q'(\widehat{\mathfrak{sl}}_n)$.

3.1. Preliminaries. In what follows, $\equiv$ always means $\equiv$ mod $n$. Define $\delta_i^{(n)}$ by $\delta_i^{(n)} = 1$ ($i \equiv 0$), $= 0$ ($i \not\equiv 0$). In the case of $\mathfrak{g} = \widehat{\mathfrak{sl}}_n$, we have $\langle \alpha_i, h_j \rangle = 2\delta_{i-j}^{(n)} - \delta_{i-j-1}^{(n)} - \delta_{i-j+1}^{(n)}$ ($i, j \in I$), where $I = \{0, 1, \dots, n-1\}$. For our purpose, it is convenient to use the notations $\alpha_i, h_i, \Lambda_i, r_i$ for $i \in \mathbb{Z}$ by defining $\alpha_i = \alpha_{i'}$, etc., if $i \equiv i'$. Let $B^l$ be the classical crystal of the level $l$ symmetric tensor representation of $U_q'(\widehat{\mathfrak{sl}}_n)$. As a set, $B^l$ is described as $B^l = \{(x_0, \dots, x_{n-1}) \in \mathbb{Z}_{\ge 0}^n \mid \sum_{i=0}^{n-1} x_i = l\}$. The actions of $\tilde e_i$, $\tilde f_i$ are defined as follows.

$$\tilde e_i (x_0, \dots, x_{n-1}) = \begin{cases} (x_0, \dots, x_{i-1} + 1, x_i - 1, \dots, x_{n-1}) & (i \ne 0) \\ (x_0 - 1, \dots, x_{n-1} + 1) & (i = 0), \end{cases} \tag{3.1}$$

$$\tilde f_i (x_0, \dots, x_{n-1}) = \begin{cases} (x_0, \dots, x_{i-1} - 1, x_i + 1, \dots, x_{n-1}) & (i \ne 0) \\ (x_0 + 1, \dots, x_{n-1} - 1) & (i = 0). \end{cases} \tag{3.2}$$

If the right-hand side contains a negative component, we should understand it as 0. Following Sect. 1.2 of [KMN2], we describe the perfect crystal structure of $B^l$. (Note that we deal with the case of $(A_{n-1}^{(1)}, B(l\Lambda_1))$ in their notation.) From (3.1) and (3.2), it is easy to see

$$\varepsilon(x_0, \dots, x_{n-1}) = \sum_{i=0}^{n-1} x_i \Lambda_i, \qquad \varphi(x_0, \dots, x_{n-1}) = \sum_{i=0}^{n-1} x_i \Lambda_{i+1}.$$

Setting $\lambda = \sum_{i=0}^{n-1} m_i \Lambda_i$, we have $b(\lambda) = (m_1, \dots, m_{n-1}, m_0)$ and the automorphism $\sigma$ is given by $\sigma\lambda = \sum_{i=0}^{n-1} m_i \Lambda_{i-1}$. Thus, the order of $\sigma$ is $n$. It is often convenient to consider automorphisms on $I$ and $B^l$. By abuse of notation, we also denote them by $\sigma$. They are defined by

$$\sigma(i) \equiv i - 1, \quad 0 \le \sigma(i) \le n-1 \quad \text{for } i \in I,$$
$$\sigma(x_0, \dots, x_{n-1}) = (x_1, \dots, x_{n-1}, x_0) \quad \text{for } (x_0, \dots, x_{n-1}) \in B^l.$$

It is easy to see the following properties.

$$\tilde e_{\sigma(i)}\sigma(b) = \sigma(\tilde e_i b), \qquad \tilde f_{\sigma(i)}\sigma(b) = \sigma(\tilde f_i b) \qquad \text{for } b \in B^l.$$

Using $\sigma$, we have $b_j = \sigma^j(m_0, \dots, m_{n-1})$.
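The operators (3.1), (3.2) and the compatibility with $\sigma$ can be checked by brute force on small cases. The following is a minimal sketch, not part of the original paper: the function names (`crystal`, `e_tilde`, `f_tilde`, `sigma`, `sigma_commutes`) are hypothetical, tuples stand for elements of $B^l$, and `None` stands for the element 0.

```python
from itertools import product

def crystal(n, l):
    """All n-tuples of non-negative integers summing to l (the set B^l)."""
    return [x for x in product(range(l + 1), repeat=n) if sum(x) == l]

def f_tilde(i, x):
    """f_i as in (3.2): move one unit from x_{i-1} to x_i (indices mod n)."""
    if x is None:
        return None
    n = len(x)
    y = list(x)
    y[(i - 1) % n] -= 1   # for i = 0 this is the last component x_{n-1}
    y[i % n] += 1
    return tuple(y) if min(y) >= 0 else None   # None plays the role of 0

def e_tilde(i, x):
    """e_i as in (3.1): move one unit from x_i to x_{i-1} (indices mod n)."""
    if x is None:
        return None
    n = len(x)
    y = list(x)
    y[(i - 1) % n] += 1
    y[i % n] -= 1
    return tuple(y) if min(y) >= 0 else None

def sigma(x):
    """Cyclic shift (x_0, ..., x_{n-1}) -> (x_1, ..., x_{n-1}, x_0)."""
    return None if x is None else x[1:] + x[:1]

def sigma_commutes(n, l):
    """Check e_{sigma(i)} sigma(b) = sigma(e_i b), and likewise for f, on all of B^l."""
    for b in crystal(n, l):
        for i in range(n):
            si = (i - 1) % n   # sigma(i) = i - 1 mod n
            if e_tilde(si, sigma(b)) != sigma(e_tilde(i, b)):
                return False
            if f_tilde(si, sigma(b)) != sigma(f_tilde(i, b)):
                return False
    return True

print(sigma_commutes(3, 2), sigma_commutes(4, 3))
```

The sketch assumes $n \ge 2$; for $n = 1$ the two indices $i-1$ and $i$ coincide and the operators above degenerate.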

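3.2. Sequence of Weyl group elements. We define $[x]_+$ by

$$[x]_+ = \begin{cases} \text{the largest integer not exceeding } x & (x \ge 0) \\ 0 & (x < 0), \end{cases}$$

and set

$$w^{(k)} = \begin{cases} 1 & (k = 0) \\ r_{k-1} \cdots r_1 r_0 & (k > 0). \end{cases} \tag{3.3}$$

We are to prove

Proposition 3. The sequence of Weyl group elements $w^{(0)}, w^{(1)}, \dots$ is increasing with respect to the Bruhat order.

For the proof, we prepare a lemma.

Lemma 2. $w^{(k)}\Lambda_0 = \Lambda_0 - \sum_{i=0}^{k-1} \left[\frac{i+d}{d}\right]_+ \alpha_i$, with $d = n - 1$.

Proof. We prove by induction on $k$. Assuming the statement for $k$ and setting $\gamma_i = \left[\frac{i+d}{d}\right]_+$, we have

$$w^{(k+1)}\Lambda_0 = w^{(k)}\Lambda_0 - \langle w^{(k)}\Lambda_0, h_k \rangle \alpha_k = \Lambda_0 - \sum_{i=0}^{k-1} \gamma_i \alpha_i - \Bigl( \delta_k^{(n)} + \gamma_{k-1} + \sum_{j \ge 1} (\gamma_{k-1-nj} - 2\gamma_{k-nj} + \gamma_{k+1-nj}) \Bigr) \alpha_k.$$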


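Therefore, it is sufficient to show

$$\delta_k^{(n)} + \gamma_{k-1} + \sum_{j \ge 1} (\gamma_{k-1-nj} - 2\gamma_{k-nj} + \gamma_{k+1-nj}) = \gamma_k.$$

Note that $\gamma_a - \gamma_{a-1} = \theta(a)$, where $\theta(a) = 1$ if $a/d \in \mathbb{Z}_{\ge 0}$, $= 0$ otherwise. We have

$$\gamma_k - \gamma_{k-1} - \sum_{j \ge 1} (\gamma_{k-1-nj} - 2\gamma_{k-nj} + \gamma_{k+1-nj}) = \theta(k) + \sum_{j \ge 1} \bigl( \theta(k - nj) - \theta(k + 1 - nj) \bigr) = \sum_{j \ge 0} \bigl( \theta(k - nj) - \theta(k - nj - d) \bigr) = \delta_k^{(n)}.$$

This completes the proof.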
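Lemma 2 lends itself to a quick numerical sanity check. The sketch below is an illustration only (the names `gamma` and `pairing_increments` are hypothetical): it stores a weight of the form $\Lambda_0 - \sum_i c_i \alpha_i$ by its coefficients $c_i$ with $i$ taken mod $n$, applies $r_0, r_1, \dots$ in turn using $r_i\mu = \mu - \langle\mu, h_i\rangle\alpha_i$ and the pairing $\langle\alpha_i, h_j\rangle$ from Sect. 3.1, and records $\langle w^{(k)}\Lambda_0, h_k\rangle$, which by the lemma should equal $\gamma_k = [(k+d)/d]_+$.

```python
def gamma(i, d):
    """[(i + d) / d]_+ : the integer part for a non-negative argument, 0 otherwise."""
    return (i + d) // d if i + d >= 0 else 0

def pairing_increments(n, steps):
    """Apply r_0, r_1, ..., r_{steps-1} (indices mod n) to Lambda_0.

    The current weight is Lambda_0 - sum_i c[i] * alpha_i.  The returned list
    holds the pairings <w^(k) Lambda_0, h_k> for k = 0, ..., steps - 1.
    """
    c = [0] * n
    out = []
    for k in range(steps):
        i = k % n
        # <Lambda_0, h_i> = delta_i^(n); the alpha-part contributes
        # 2 c_i - c_{i-1} - c_{i+1} with indices mod n (this also covers n = 2).
        pair = (1 if i == 0 else 0) - (2 * c[i] - c[(i - 1) % n] - c[(i + 1) % n])
        out.append(pair)
        c[i] += pair   # r_i mu = mu - <mu, h_i> alpha_i
    return out

for n in (2, 3, 4, 5):
    d = n - 1
    assert pairing_increments(n, 40) == [gamma(k, d) for k in range(40)]
print("Lemma 2 pairing check passed for n = 2, ..., 5")
```

Since every recorded pairing is positive, the same loop also illustrates the hypothesis of Proposition 2 that is used to prove Proposition 3.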

Proof of the proposition. We take $\Lambda_0$ for $\mu$, and apply Proposition 2. It suffices to show $\langle w^{(k)}\Lambda_0, h_k \rangle > 0$. The left hand side was already calculated to be $\left[\frac{k+d}{d}\right]_+$ in the proof of the previous lemma.

3.3. $\lambda = l\Lambda_0$ case. We have seen that $B^l$ is perfect of level $l$. We have shown that the sequence of Weyl group elements $w^{(k)}$ (3.3) is increasing with respect to the Bruhat order. Thus, taking $d = n - 1$ and $i_a^{(j)} \equiv a - j$ ($0 \le i_a^{(j)} \le n-1$), the assumptions (I) and (IV) are already satisfied. Let us take $\lambda = l\Lambda_0$. Then, we have $\lambda_j = l\Lambda_{-j}$ and the ground-state path is given by $p = \cdots \otimes b_2 \otimes b_1$, $b_j = (0, \dots, 0, l, 0, \dots, 0)$, where the entry $l$ sits in the $i$-th position with $i \equiv -j$ ($0 \le i \le n-1$). Let us describe $B_a^{(j)}$. Thanks to the automorphism $\sigma$, it is sufficient to consider the case of $j = n$. From (2.2) and $i_a^{(n)} = a$, we easily have

$$B_a^{(n)} = \Bigl\{ (x_0, \dots, x_{n-1}) \in B^l \Bigm| \sum_{i=0}^{a} x_i = l \Bigr\}.$$

Checking the assumption (III) is also trivial. Therefore, we have the following proposition.
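The description of $B_a^{(n)}$ can be reproduced mechanically from (2.2). The following sketch is an illustration under the conventions of this section, with hypothetical names `f_tilde` and `demazure_chain`: it starts from $b_n = (l, 0, \dots, 0)$, applies $\tilde f_a$ repeatedly for $a = 1, \dots, n-1$, and compares each resulting set with the closed form $\{x \in B^l \mid x_0 + \cdots + x_a = l\}$.

```python
from itertools import product

def f_tilde(i, x):
    """f_i as in (3.2): move one unit from x_{i-1} to x_i; None stands for 0."""
    n = len(x)
    y = list(x)
    y[(i - 1) % n] -= 1
    y[i % n] += 1
    return tuple(y) if min(y) >= 0 else None

def demazure_chain(n, l):
    """Return [B_0, ..., B_{n-1}] built from b_n = (l, 0, ..., 0) via (2.2) with i_a = a."""
    current = {(l,) + (0,) * (n - 1)}
    chain = [set(current)]
    for a in range(1, n):
        nxt = set()
        for x in current:
            y = x
            while y is not None:   # collect x, f_a x, f_a^2 x, ... until 0 is reached
                nxt.add(y)
                y = f_tilde(a, y)
        current = nxt
        chain.append(set(current))
    return chain

def closed_form(n, l, a):
    """The set {x in B^l : x_0 + ... + x_a = l} stated in the text."""
    return {x for x in product(range(l + 1), repeat=n)
            if sum(x) == l and sum(x[:a + 1]) == l}

n, l = 4, 3
chain = demazure_chain(n, l)
print(all(chain[a] == closed_form(n, l, a) for a in range(n)))
```

For $a = d = n - 1$ the closed form is all of $B^l$, which is the statement that assumption (II) holds with $\kappa = 1$.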

3.4. $\lambda$ arbitrary case. We show that even if $\lambda$ is any element in $(P_{cl}^+)_l$, we have $\kappa = 2$ with the same choice of $d$ and $i_a^{(j)}$ as in the previous subsection. For $\lambda = \sum_{i=0}^{n-1} m_i \Lambda_i$, define a subset $B_\lambda^l$ of $B^l$ by

$$B_\lambda^l = \{ (x_0, \dots, x_{n-1}) \in B^l \mid x_0 + \cdots + x_{i-1} \le m_0 + \cdots + m_{i-1} \text{ for } 1 \le i \le n-1 \}.$$

We now prepare a lemma.

Lemma 3. Let $\lambda = \sum_{i=0}^{n-1} m_i \Lambda_i$, $\sum_{i=0}^{n-1} m_i = l_1$. For any $b_1 \in B_\lambda^{l_1}$ and $b_2 \in B^{l_2}$, there exist an element $\check b_2 \in B^{l_2}$ and integers $p_1, \dots, p_{n-1} \ge 0$ such that

$$\tilde f_{n-1}^{\,p_{n-1}} \cdots \tilde f_1^{\,p_1} (\check b_1 \otimes \check b_2) = b_1 \otimes b_2, \qquad z_i \le x_{i-1} \quad (i = 1, \dots, n-1),$$

where we set $\check b_1 = (m_0, \dots, m_{n-1})$, $\check b_2 = (z_0, \dots, z_{n-1})$, $b_1 = (x_0, \dots, x_{n-1})$.


Proof. We prove by induction on $n$. If $n = 1$, the statement is trivial. Now assume the statement is valid for $n - 1$, and set $b_2 = (y_0, \dots, y_{n-1})$. We divide it into two cases: (a) $x_{n-2} > y_{n-1}$, (b) $x_{n-2} \le y_{n-1}$. Consider the case (a). Assume $(x_0, \dots, x_{n-1}) \in B_\lambda^{l_1}$, then we have $x_{n-1} \ge m_{n-1}$. Consider elements $(x_0, \dots, x_{n-3}, x_{n-2} + x_{n-1} - m_{n-1}, m_{n-1}) \in B_\lambda^{l_1}$, $(y_0, \dots, y_{n-1}) \in B^{l_2}$. Ignoring the last component and using the induction hypothesis, we see there exist $z_0, \dots, z_{n-2}$ and $p_1, \dots, p_{n-2} \ge 0$ such that $(z_0, \dots, z_{n-2}, y_{n-1}) \in B^l$, and

$$\tilde f_{n-2}^{\,p_{n-2}} \cdots \tilde f_1^{\,p_1} \bigl( (m_0, \dots, m_{n-1}) \otimes (z_0, \dots, z_{n-2}, y_{n-1}) \bigr) = (x_0, \dots, x_{n-3}, x_{n-2} + x_{n-1} - m_{n-1}, m_{n-1}) \otimes (y_0, \dots, y_{n-1}).$$

Apply $\tilde f_{n-1}^{\,x_{n-1} - m_{n-1}}$ further, then we get $(x_0, \dots, x_{n-1}) \otimes (y_0, \dots, y_{n-1})$. Checking another condition is easy. We move to (b). The proof goes similarly. Take $\check b_2 = (z_0, \dots, z_{n-2}, x_{n-2})$ ($z_0, \dots, z_{n-2}$ are determined from the induction hypothesis) and $p_{n-1} = x_{n-1} - m_{n-1} + y_{n-1} - x_{n-2}$.

Proposition 5. For any $\lambda \in (P_{cl}^+)_l$, take $d = n - 1$ and $i_a^{(j)} \equiv a - j$ ($0 \le i_a^{(j)} \le n-1$). Then, the assumptions (II) and (III) are satisfied with $\kappa = 2$.

Proof. We again reduce the proof for all $j$ to the $j = n$ case by the automorphism $\sigma$. Set $\lambda = \sum_{i=0}^{n-1} m_i \Lambda_i$. Note that $i_a^{(n)} = a$. Firstly, we claim that $B_a^{(n)}$ consists of the elements $(x_0, \dots, x_{n-1}) \in B^l$ satisfying

$$x_0 + \cdots + x_{i-1} \le m_0 + \cdots + m_{i-1} \quad (1 \le i \le a), \qquad x_i = m_i \quad (a + 1 \le i \le n-1).$$

This is easily shown by induction on $a$. Thus, we have $\langle \lambda_n, h_a \rangle = \varepsilon_a(B_{a-1}^{(n)}) = m_a$, and (III) is checked. It remains to show $B_d^{(n,n-1)} = B_d^{(n)} \otimes B^l$. Noting that $B_d^{(n)} = B_\lambda^l$, it is equivalent to check

$$\bigcup_{p_1, \dots, p_{n-1} \ge 0} \tilde f_{n-1}^{\,p_{n-1}} \cdots \tilde f_1^{\,p_1} \bigl( \{b_n\} \otimes \sigma^{-1}(B_\lambda^l) \bigr) \setminus \{0\} = B_\lambda^l \otimes B^l. \tag{3.4}$$

Take any $b_1 = (x_0, \dots, x_{n-1}) \in B_\lambda^l$ and $b_2 \in B^l$. From the lemma, there exist $\check b_2 = (z_0, \dots, z_{n-1}) \in B^l$ and $p_1, \dots, p_{n-1} \ge 0$ such that $\tilde f_{n-1}^{\,p_{n-1}} \cdots \tilde f_1^{\,p_1}(b_n \otimes \check b_2) = b_1 \otimes b_2$, $z_i \le x_{i-1}$ ($i = 1, \dots, n-1$). Noting that $z_1 + \cdots + z_i \le x_0 + \cdots + x_{i-1} \le m_0 + \cdots + m_{i-1}$, we see $\check b_2 \in \sigma^{-1}(B_\lambda^l)$. Thus, we have shown the inclusion $\supset$ in (3.4). The other direction of the inclusion is clear.

4. Discussion

We explain the suggestion by Kirillov. Consider the case of $\mathfrak{g} = \widehat{\mathfrak{sl}}_n$, $B = B^l$. We retain the notations in Sect. 3. If $L$ is divisible by $n$, we already know $B_{w^{(Ld)}}(l\Lambda_0) \simeq u_{l\Lambda_0} \otimes (B^l)^{\otimes L}$. Therefore, $B_{w^{(Ld)}}(l\Lambda_0)$ is invariant under the action of $\tilde e_i$, $\tilde f_i$ ($i \ne 0$). On the other hand, it has been shown recently that the Kostka-Foulkes polynomial $K_{\lambda(l^L)}(q)$ has the following expression (cf. [DF, D, NY]).


$$K_{\lambda(l^L)}(q) = \sum q^{\sum_{j=1}^{L-1} j H(b_{j+1} \otimes b_j)},$$

where the sum is over all highest weight vectors $b_L \otimes \cdots \otimes b_1$ with respect to $\tilde e_i$ ($i \ne 0$) with highest weight $\sum_{i=1}^{n-1} (\lambda_i - \lambda_{i+1})\Lambda_i$. Now Kirillov's suggestion reads as

$$e^{-l\Lambda_0}\, \mathrm{ch}\, B_{w^{(Ld)}}(l\Lambda_0)(z_1, \dots, z_{n-1}; q) = \sum_{\lambda} q^{-E_0} K_{\lambda(l^L)}(q)\, s_\lambda(x_1, \dots, x_n) \Big|_{x_1 \cdots x_n = 1},$$

where $\lambda$ runs over all partitions of $lL$ of depth less than or equal to $n$, $E_0 = \frac{lL}{2}\bigl(\frac{L}{n} - 1\bigr)$, $q = e^{-\delta}$, $z_i = e^{-\alpha_i} = x_{i+1}/x_i$ ($i = 1, \dots, n-1$), and $s_\lambda$ is the Schur function. Note that an expression of $K_{\lambda\mu}(q)$ using Gaussian polynomials is known [Ki].

In this article, we have presented a criterion for the Demazure crystal to have a tensor product structure. One may ask if there are other cases than the $\widehat{\mathfrak{sl}}_n$ symmetric tensor case which have a similar property. We claim that if $\mathfrak{g}$ is non-exceptional and $\lambda = l\Lambda_0$ (plus some other $\lambda$), we can find a sequence of Weyl group elements which satisfies the assumptions (II-III) with $\kappa = 1$. We would like to report on that in the near future.

We are interested not only in the classical character, but also in the full character. One of the advantages of relating the Demazure crystal to a set of paths is that we can utilize the results on so-called 1D sums in solvable lattice models. In [FMO] we have seen that for the $\widehat{\mathfrak{sl}}_2$ case, all characters of the Demazure modules are expressed in terms of q-multinomial coefficients. We also would like to deal with this subject for any non-exceptional affine Lie algebra (though the level will be 1) in another publication.

Acknowledgement. K.C.M. and M.O. would like to thank Omar Foda for discussions, and collaboration in our earlier work [FMO]. We would like to thank Anatol N. Kirillov for suggesting to us that in the $\widehat{\mathfrak{sl}}_n$ symmetric tensor case, Demazure characters can be expressed using Kostka-Foulkes polynomials. We would also like to thank Noriaki Kawanaka for introducing the book [MP]. M.O. would like to thank Peter Littelmann for stimulating discussions. K.C.M. is supported in part by NSA/MSP Grant No. 96-1-0013. A.K. and M.O. are supported in part by Grant-in-Aid for Scientific Research on Priority Areas, the Ministry of Education, Science and Culture, Japan. M.O. is supported in part by the Australian Research Council.

References

[D] Dasmahapatra, S.: On the combinatorics of row and corner transfer matrices of the $A_{n-1}^{(1)}$ restricted models. hep-th/9512095
[DF] Dasmahapatra, S. and Foda, O.: Strings, paths, and standard tableaux. q-alg/9601011
[DJKMO1] Date, E., Jimbo, M., Kuniba, A., Miwa, T. and Okado, M.: One dimensional configuration sums in vertex models and affine Lie algebra characters. Lett. Math. Phys. 17, 69–77 (1989)
[DJKMO2] Date, E., Jimbo, M., Kuniba, A., Miwa, T. and Okado, M.: Paths, Maya diagrams and representations of $\widehat{\mathfrak{sl}}(r, \mathbb{C})$. Adv. Stud. Pure Math. 19, 149–191 (1989)
[FMO] Foda, O., Misra, K.C. and Okado, M.: Demazure modules and vertex models: the $\widehat{\mathfrak{sl}}(2)$ case. q-alg/9602018
[L] Littelmann, P.: Paths and root operators in representation theory. Ann. Math. 142, 499–525 (1995)
[Ka] Kashiwara, M.: The crystal base and Littelmann's refined Demazure character formula. Duke Math. J. 71, 839–858 (1993)
[Ki] Kirillov, A.N.: Dilogarithm identities. Lectures in Mathematical Sciences 7, The University of Tokyo, 1995
[KMN1] Kang, S-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T. and Nakayashiki, A.: Affine crystals and vertex models. Int. J. Mod. Phys. A 7 (suppl. 1A), 449–484 (1992)
[KMN2] Kang, S-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T. and Nakayashiki, A.: Perfect crystals of quantum affine Lie algebras. Duke Math. J. 68, 499–607 (1992)
[KN] Kashiwara, M. and Nakashima, T.: Crystal graphs for representations of the q-analogue of classical Lie algebras. J. Algebra 165, 295–345 (1994)
[MP] Moody, R.V. and Pianzola, A.: Lie algebras with triangular decompositions. New York: Wiley-Interscience, 1995
[NY] Nakayashiki, A. and Yamada, Y.: Kostka polynomials and energy functions in solvable lattice models. q-alg/9512027
[S] Sanderson, Y.B.: On characters of Demazure modules. Ph. D. Thesis, Rutgers University (1995)

Communicated by T. Miwa

Commun. Math. Phys. 192, 569 – 604 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Hilbert Modules in Quantum Electro Dynamics and Quantum Probability*

Michael Skeide**

Centro Vito Volterra, Università degli Studi di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Rome, Italy

Received: 28 October 1996 / Accepted: 21 July 1997

* This work has been supported by the Deutsche Forschungsgemeinschaft.
** Present address: Lehrstuhl für Wahrscheinlichkeitstheorie und Statistik, Brandenburgische Technische Universität Cottbus, Postfach 10 13 44, D-03013, Germany, E-mail: [email protected]

Abstract: A physical system of the form R ⊗ S with a distinguished state on B(R) may be described in a natural way on a Hilbert B(S)-module. Following the ideas of Accardi and Lu [1], we apply this possibility to a concrete system consisting of a boson field in the vacuum state coupled to a free electron. We show that the physical system is described adequately on a new type of Fock module: the symmetric Fock module. It turns out that a module has to fulfill an algebraic condition in order to allow for the construction of a symmetric Fock module. We prove in a central limit theorem that in the stochastic limit the moments of the collective operators (i.e. more or less the time-integrated interaction Hamiltonian) converge to the moments of free creators and annihilators on a full Fock module. In the sense of Voiculescu [22] and Speicher [20] these operators form a free white noise over the algebra B(S).

1. Introduction

In elementary particle physics and statistical physics, usually, one is interested in the evolution of a system (usually one or a small number of elementary particles) described on a Hilbert space S subject to interaction with a system (usually a field or a heat bath) described on a Hilbert space R. The system R is assumed to be in a distinguished state (usually the vacuum state of the field or some temperature state). The goal of these notes is to show how the language of Hilbert modules can be used advantageously to describe physical systems of this type. As an example we consider a single free electron coupled to the electromagnetic field in the vacuum state. Like Accardi and Lu in [1] we compute the stochastic limit of this system. However, we do this from the beginning in terms of Hilbert modules. Along the computations we are led in a natural manner to the concepts of a symmetric and a full Fock module. These concepts are direct generalizations of the Bose and full Fock space well-known in quantum probability. The full Fock module introduced by Pimsner [14] is closely related to Voiculescu’s operator-valued free independence [22]. The symmetric Fock module is new. It is for operator-valued Bose independence what the full Fock module is for operator-valued free independence. We investigate this point in [18].

The interacting system is described on R ⊗ S. The vacuum $\tilde\Omega$ of the field space R determines a conditional expectation $\mathbb{E}$ from the ∗-algebra of operators on R ⊗ S to the ∗-algebra of operators on S; see Eq. (2.6). The dynamics of the full system is determined by the interaction Hamiltonian. However, actually we are interested only in the reduced dynamics of S which is obtained by ‘projecting down’ the full dynamics from R ⊗ S to S via $\mathbb{E}$. Hilbert modules arise in a natural manner, namely, by GNS-construction. This GNS-construction is, however, not based on a state, but on the completely positive mapping $\mathbb{E}$. The necessary facts about Hilbert modules and GNS-construction are explained in Sect. 3. In our example R is the space of the electromagnetic field, i.e. the symmetric or Bose Fock space over the one-photon sector. In Section 4 we introduce a Hilbert module which turns out to be the space of the GNS-representation of $\mathbb{E}$. This module has a close formal similarity to a symmetric Fock space. For this reason we call it a symmetric Fock module. In Sect. 6 we justify this name by showing that this module, indeed, arises by a systematic construction out of its one-particle sector. This construction completely parallels the construction of the symmetric Fock space.
However, the possibility for such a construction depends on an additional algebraic condition on the one-particle sector, namely, it has to be a centered Hilbert module; see Section 6. This condition turns out to be crucial for operator-valued Bose independence; see [18]. The proof that the GNS-representation of $\mathbb{E}$ on the symmetric Fock module is faithful is postponed mainly to the appendix. The methods developed in the appendix are applicable to arbitrary R and S as long as the state on R is pure. The physical model is discussed in Sect. 2. The most important objects are the collective operators defined by (2.3). When translated into the language of modules, they appear just as creators and annihilators on the symmetric Fock module. In the stochastic limit one is interested in a certain scaling limit of the moments of the collective operators in the vacuum conditional expectation $\mathbb{E}$. In [1] the limit has been calculated in terms of matrix elements of operators on S. Our first computational result (basically Lemma 5.6 and Corollary 5.8) going beyond [1] tells us that the limits of the matrix elements, indeed, define operators on S. In a central limit theorem we show that the moments of collective operators converge to moments of free creators and annihilators on a full Fock module. These free operators form operator-valued free white noises; see Speicher [20]. The proof of the central limit theorem is split into several steps. In the first step we compute the limit of the one-particle sector. This is done in Sect. 5. We provide a definition of convergence of Hilbert modules. All technical difficulties already arise in the limit of the one-particle sector. In Sect. 7 we generalize our techniques to the full system. The basic idea is to show in the proof of Theorem 7.3 that the limits of the collective operators fulfill the free commutation relations; see Eq. (6.1). For this aim it is indispensable to know from the general construction in Section 6 that the symmetric Fock module is contained in the full Fock module as the submodule being spanned by symmetric tensors. Already Pimsner has shown that Relations (6.1) determine more or less the ∗-algebra generated by the free operators. This fact has the more probabilistic interpretation that the moments of the free operators are already determined by their second moments. The second moments, in turn, are known, if we know the one-particle sector. Following Speicher [20] Shlyakhtenko calls in [16] the one-particle sector a covariance matrix. He also finds the distributions of free creators as limit distributions of certain operator-valued random variables in the totally different setup of random Gaussian band matrices (these are operator-valued generalizations of the usual random matrices leading to the Wigner semi-circle law). Our example shows that operator-valued free noise also can emerge from a purely physical setup. We remark that the appearance of free noise is due to the choice of the physical system. In particular, if the particle has discrete spectrum, it is very well possible to obtain operator-valued Bose white noise in the limit. For a detailed discussion we refer the reader to Gough [4]. He shows that the stochastic limit and the infinite-volume limit are interchangeable. This means that the limit does not change, if we first compute the stochastic limit for an electron in a finite volume and then let the volume go to infinity. Gough finds diagram rules of how to write down concrete expressions for the limit. In his diagrams the “dying out” of the non-identical permutations has a beautiful interpretation as an additional rule of energy conservation between certain vertices. According to this rule, like in the Wigner semi-circle law, only the non-crossing diagrams survive. A right Hilbert module already was constructed in [1]. However, the construction of Fock modules is based on two-sided modules. The left multiplication introduced in Sect.
4 and its limit calculated in Sect. 5 is a further computational result exceeding [1]. It turns out that the left multiplication makes the limit module much bigger than the module given in [1]. These notes are organized such that rather abstract sections where general notions are introduced alternate with sections where the abstract notions are applied to the concrete problem. The reader who prefers to have the general theory at the beginning can very well start reading Sections 3, 6 and the appendix and then the remaining sections.

Conventions and notations. All our vector spaces are vector spaces over $\mathbb{C}$. All our algebras are unital algebras. All our modules are modules over algebras (compatible with the unit of the algebra) and carry a vector space structure which is compatible with the algebra structure. All tensor products are algebraic tensor products. If $V$ is a vector space and $B$ an algebra, then we denote by $V_B^f = B \otimes V \otimes B$, $V_B^r = V \otimes B$, $V_B^l = V \otimes B$, and $V_B = V \otimes B$ the free two-sided, free right, free left and free centered $B$-modules, respectively, generated by $V$ equipped with their natural structures. (See Sect. 6 for the definition of a centered module.) Spaces of linear mappings between linear spaces are denoted by using the letter $L$. For right linear mappings between right modules we use $L^r$. For adjointable mappings between inner product spaces or modules we use $L^a$. Adjoint always means mutually adjoint; cf. also Remark 3.5. Notice that in the case of modules the adjointable mappings are always right linear. In the case of mappings between normed spaces we also use the letter $B$ instead of $L$ with the same use of the superscripts to denote spaces of bounded mappings. Spaces of continuous functions on $\mathbb{R}^d$ with values in a normed space $B$ are denoted by $C(\mathbb{R}^d, B)$. A subscript $b$ means bounded functions, a subscript 0 means functions vanishing at infinity and a subscript $c$ means functions with compact support. If the range is not specified, we mean $\mathbb{C}$. In this case a superscript $n$ means $n$-fold continuously differentiable. The Schwartz functions on $\mathbb{R}^d$ are denoted by $\mathcal{S}(\mathbb{R}^d)$. The simple functions on $\mathbb{R}$ are denoted by $A_0(\mathbb{R})$.

2. The Physical Model

In [1] and also in [2] Accardi and Lu investigate the so-called stochastic limit or Friedrichs-van Hove scaling limit for the non-relativistic QED-Hamiltonian in $d \in \mathbb{N}$ dimensions of a single free electron coupled to the photon field without dipole approximation. Originally, the photon field has $d$ components. However, since we neglect the possibility of polarization, we may restrict to a single component. From the mathematical point of view this is not a serious simplification. Our results can be generalized easily to $d$ components. In addition, we forget about the fact that the electron couples to the field via the component of $p$ into the direction of the field. The $p$ may be reinserted after the computations easily, because we work in Coulomb gauge. Throughout these notes we assume $d \ge 3$. In the sequel, we describe our simplified setup and refer to [1, 4] for a detailed description.

The Hilbert space R of the field is the symmetric or boson Fock space $\Gamma(L^2(\mathbb{R}^d))$ over $L^2(\mathbb{R}^d)$ with the Hamiltonian $H_R = \int dk\, |k|\, a_k^+ a_k$. (By $a_k^+$ and $a_k$ we denote the usual creator and annihilator densities which fulfill $a_k a_{k'}^+ - a_{k'}^+ a_k = \delta(k - k')$.) The particle space S is the representation space $L^2(\mathbb{R}^d)$ of the $d$-dimensional Weyl algebra $W$ in momentum representation with the usual free Hamiltonian $H_S = \frac{p^2}{2}$. The interaction is described on the compound system R ⊗ S by the interaction Hamiltonian

$$H_I = \lambda \int dk\, a_k^+ \otimes e^{ik\cdot q} c(k) + \mathrm{h.c.}$$

$\lambda$ is a (positive) coupling constant. In the original physical model the function $c$ is given by $c(k) = \frac{1}{\sqrt{|k|}}$, see [1]. As in [1] we replace it by a suitable cut-off function $c \in C_c(\mathbb{R}^d)$. Since we will identify operators on R and S, respectively, with their ampliations to R ⊗ S, we omit in the sequel the ⊗-sign in between such operators.

The time-dependent interaction Hamiltonian in the interaction picture, defined by $H_I(t) = e^{it(H_R + H_S)} H_I e^{-it(H_R + H_S)}$, takes the form

$$H_I(t) = \lambda \int dk\, a_k^+ e^{ik\cdot q} e^{itk\cdot p} e^{it(|k| + \frac{1}{2}|k|^2)} c(k) + \mathrm{h.c.}$$

This follows directly from the commutation relations fulfilled by $a_k^+$ and $a_k$ and from the basic relation $f(p) e^{ik\cdot q} = e^{ik\cdot q} f(p + k)$ for all $f \in L^\infty(\mathbb{R}^d)$. In the sequel, the special case

$$e^{ik\cdot p} e^{ik'\cdot q} = e^{ik'\cdot q} e^{ik\cdot p} e^{ik\cdot k'} \tag{2.1}$$

will be of particular interest. The wave operators $U(t)$ defined by $U(t) = e^{it(H_R + H_S)} e^{-it(H_R + H_S + H_I)}$ are the objects of main physical interest. They fulfill the differential equation

$$\frac{dU}{dt}(t) = -iH_I(t)U(t) \quad \text{and} \quad U(0) = 1. \tag{2.2}$$
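The interaction-picture form of $H_I(t)$ quoted in this section can be traced in two lines. The following is a sketch of that step, using only the commutation relation of the densities (which gives $[H_R, a_k^+] = |k|\, a_k^+$) and the basic relation $f(p) e^{ik\cdot q} = e^{ik\cdot q} f(p + k)$ applied to $f(p) = e^{\frac{it}{2}p^2}$:

```latex
\begin{align*}
e^{itH_R}\,a_k^+\,e^{-itH_R} &= e^{it|k|}\,a_k^+ ,\\
e^{itH_S}\,e^{ik\cdot q}\,e^{-itH_S}
 &= e^{ik\cdot q}\,e^{\frac{it}{2}(p+k)^2}\,e^{-\frac{it}{2}p^2}
  = e^{ik\cdot q}\,e^{itk\cdot p}\,e^{\frac{it}{2}|k|^2}.
\end{align*}
```

Multiplying the two lines reproduces the factor $a_k^+ e^{ik\cdot q} e^{itk\cdot p} e^{it(|k| + \frac{1}{2}|k|^2)}$ in the integrand of $H_I(t)$; the hermitian conjugate term is handled in the same way.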


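For the stochastic limit the time $t$ is replaced by $\frac{t}{\lambda^2}$ and one considers the limit $\lambda \to 0$. So we define the rescaled wave operators $U_\lambda(t) = U(\frac{t}{\lambda^2})$. The problem is to give sense to $U_0 = \lim_{\lambda \to 0} U_\lambda$. For this aim one usually proceeds in the following way. Let $V$ denote the vector space which is linearly spanned by all functions $f : \mathbb{R} \times \mathbb{R}^d \to \mathbb{C}$ of the form $f(\tau, k) = \chi_{[t,T]}(\tau) \tilde f(k)$ ($t < T$, $\tilde f \in C_c(\mathbb{R}^d)$). Obviously, we have $V = A_0(\mathbb{R}) \otimes C_c(\mathbb{R}^d)$. For $f \in V$ define the collective creators

$$A_\lambda^+(f) = \int d\tau \int dk\, a_k^+ \gamma_\lambda(\tau, k) f(\tau, k) \tag{2.3}$$

and their adjoints $A_\lambda(f)$, the collective annihilators. Here we set

$$\gamma_\lambda(\tau, k) = \frac{1}{\lambda}\, e^{ik\cdot q}\, e^{i\frac{\tau}{\lambda^2} k\cdot p}\, e^{i\frac{\tau}{\lambda^2}\omega(|k|)} \quad \text{and} \quad \omega(r) = r + \frac{1}{2} r^2.$$

In view of Remark 4.3 and Lemma 5.6 we keep in mind that in some respects also more general choices for $\gamma_\lambda$ and $\omega$ are possible. Obviously, we have

$$A_\lambda^+(\chi_{[t,T]} c) + A_\lambda(\chi_{[t,T]} c) = \int_{t/\lambda^2}^{T/\lambda^2} d\tau\, H_I(\tau). \tag{2.4}$$

With the definition $A_\lambda^+(t) = A_\lambda^+(\chi_{[0,t]} c)$ Eq. (2.2) transforms into

$$\frac{dU_\lambda}{dt}(t) = -i \Bigl( \frac{dA_\lambda^+}{dt}(t) + \frac{dA_\lambda}{dt}(t) \Bigr) U_\lambda(t) \quad \text{and} \quad U_\lambda(0) = 1. \tag{2.5}$$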
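Equations (2.4) and (2.5) follow from a change of variables. As a sketch, reading the time variable in the exponents of $\gamma_\lambda$ as the integration variable $\tau$ (an assumption on our part, made so that the formula agrees with (2.4)) and substituting $\tau = \lambda^2 \tau'$:

```latex
\begin{align*}
A_\lambda^+(\chi_{[t,T]}c)
 &= \int_t^T d\tau \int dk\; a_k^+\,\frac{1}{\lambda}\,
    e^{ik\cdot q}\,e^{i\frac{\tau}{\lambda^2}k\cdot p}\,
    e^{i\frac{\tau}{\lambda^2}\omega(|k|)}\,c(k) \\
 &= \int_{t/\lambda^2}^{T/\lambda^2} d\tau'\;
    \lambda\int dk\; a_k^+\,e^{ik\cdot q}\,e^{i\tau' k\cdot p}\,
    e^{i\tau'\omega(|k|)}\,c(k).
\end{align*}
```

The inner integral is the creation part of $H_I(\tau')$, so adding the adjoint gives (2.4). Differentiating (2.4) with $t = 0$ and upper limit $t$ yields $\frac{d}{dt}\bigl(A_\lambda^+(t) + A_\lambda(t)\bigr) = \lambda^{-2} H_I(t/\lambda^2)$, while (2.2) gives $\frac{dU_\lambda}{dt}(t) = -i\lambda^{-2} H_I(t/\lambda^2)\, U_\lambda(t)$; combining the two is (2.5).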

Hence, if we are able to give sense to the limit not only of $U_\lambda$ but also of $A_\lambda^+(f)$ for fixed $f \in V$, we may expect this differential equation to hold also in the limit $\lambda \to 0$. On the other hand, if we find a limit of the $A_\lambda^+(f)$ and a quantum stochastic calculus in which (2.5) makes sense and has a solution $U_0$, we may hope that $U_0$ is the limit of the $U_\lambda$. In these notes we will be concerned exclusively with the limit of the collective operators. However, we provide a natural language which promises to allow to describe several types of rich quantum stochastic calculi.

Let $\tilde\Omega$ denote the vacuum in R. Then we define the vacuum conditional expectation $\mathbb{E} : L^a(R \otimes S) \to L^a(S)$ by setting

$$\mathbb{E}(a) = (\langle \tilde\Omega | \otimes \mathrm{id})\, a\, (| \tilde\Omega \rangle \otimes \mathrm{id}). \tag{2.6}$$

In [1] the limit $\lim_{\lambda \to 0} \langle \xi, \mathbb{E}(M_\lambda) \zeta \rangle$, for $M_\lambda$ being an arbitrary monomial in collective operators and $\xi$, $\zeta$ being Schwartz functions, has been calculated. We repeat the major ideas of the proof in a new formulation. Moreover, we show as a new result that the limit considered as a sesquilinear form in $\xi$ and $\zeta$, indeed, determines an element of $B(S)$.


3. GNS–Construction and Hilbert Modules

The idea to generalize the GNS-construction based on a state to a construction based on a completely positive mapping between ∗-algebras A and B is not new. The first step in this direction is the Stinespring construction for the case when A is a $C^*$-algebra and B is an operator $C^*$-algebra over a Hilbert space G. Also in the Stinespring construction we obtain a representation of A on a Hilbert space H. However, the cyclic vector is replaced by an adjointable mapping from G into H; cf. Eq. (2.6) and also the appendix. In a generalization of the Stinespring construction due to Kasparov one may replace G by a Hilbert module; see e.g. the book [8] of Lance. A more direct generalization of the GNS-construction goes back to Paschke [12]. He also obtains a representation of A. However, this representation acts on a Hilbert B-module E rather than on a Hilbert space. One benefit from this description is the existence of a cyclic vector in E. Moreover, B may be an arbitrary $C^*$-algebra and A may be any ∗-algebra which is spanned linearly by its unitaries. The last condition automatically leads to bounded representation operators. Since the collective operators are unbounded, we have to further generalize Paschke’s GNS-construction in order to describe the ∗-algebra $A_\lambda$, which is generated by the collective operators, and the conditional expectation (2.6) on a Hilbert module. Before we do that we recall the basic definitions which can be found in an equivalent form e.g. in [8].

Definition 3.1. A pre-Hilbert B-module over a $C^*$-algebra B is a right B-module E with a sesquilinear inner product $\langle \bullet, \bullet \rangle : E \times E \to B$, such that $\langle x, x \rangle \ge 0$ for $x \in E$ (positivity), that $\langle x, yb \rangle = \langle x, y \rangle b$ for $x, y \in E$; $b \in B$ (right linearity), and that $\langle x, x \rangle = 0$ implies $x = 0$ (strict positivity). If $\langle \bullet, \bullet \rangle$ is not necessarily strictly positive, we speak of a semi-inner product and of a semi-Hilbert B-module.
We remark that sesquilinearity and positivity imply $\langle x, y\rangle = \langle y, x\rangle^*$ (symmetry), and that right linearity and symmetry imply $\langle xb, y\rangle = b^*\langle x, y\rangle$ (left anti-linearity).

Let A be a ∗-algebra. A pre- (or semi-) Hilbert A-B-module is a two-sided A-B-module E which is also a pre- (or semi-) Hilbert B-module, such that $\langle x, ay\rangle = \langle a^*x, y\rangle$ for $x, y \in E$; $a \in A$ (∗-property).

A mapping  : A → B is called completely positive, if $\sum_{i,j} b_i^*\,(a_i^* a_j)\,b_j \ge 0$ for all $a_i \in A$; $b_i \in B$; $i = 1, \dots, n$; $n \in \mathbb{N}$. As was noted by Paschke [12, Remark 5.1] this is equivalent to the usual definition of complete positivity.

A representation of a ∗-algebra A over a C∗-algebra B is a pre-Hilbert A-B-module E. For $B = \mathbb{C}$ we recover the usual notion of a representation of A on a pre-Hilbert space.

Lemma 3.2. Let E be a semi-Hilbert A-B-module and denote by N the set of all $x \in E$ having square length $\langle x, x\rangle = 0$. Then N is an A-B-submodule of E and the quotient E/N inherits a pre-Hilbert A-B-module structure by $\langle x + N, y + N\rangle = \langle x, y\rangle$.

Proof. We have to show that N is stable under all module operations (including addition) and that the definition of the inner product on E/N does not depend on the choice of a representative $x + n$ ($n \in N$) of an element $x + N$. Both assertions follow easily, if we establish the equivalence

\[ \langle x, x\rangle = 0 \iff \langle y, x\rangle = 0 \quad \forall\, y \in E. \tag{3.1} \]

But this follows by applying an arbitrary separating family of states to (3.1) and the Cauchy–Schwarz inequality (cf. also Remark 3.4).
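The last step of the proof can be spelled out as follows. This is only a sketch of the standard argument; the symbol $\varphi$ for a state taken from the separating family is the notation of Remark 3.4, and no other assumption is made.

```latex
% For a state \varphi on B the map (y,x) -> \varphi(<y,x>) is a positive
% semi-definite sesquilinear form on E, so the scalar Cauchy-Schwarz
% inequality applies to it:
\[
  \bigl|\varphi(\langle y, x\rangle)\bigr|^2
  \;\le\; \varphi(\langle y, y\rangle)\,\varphi(\langle x, x\rangle)
  \qquad (x, y \in E).
\]
% If <x,x> = 0, the right-hand side vanishes for every \varphi, hence
% \varphi(<y,x>) = 0 for all \varphi in the separating family, i.e.
\[
  \langle x, x\rangle = 0 \;\Longrightarrow\; \langle y, x\rangle = 0
  \quad \forall\, y \in E .
\]
% The converse direction is the special case y = x.
```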

Hilbert Modules in Quantum Electro Dynamics and Quantum Probability


Corollary 3.3. To any completely positive mapping  : A → B from a ∗-algebra A into a C∗-algebra B there exists a pre-Hilbert A-B-module E and a cyclic vector $\eta \in E$ (i.e. $\operatorname{span} A\eta B = E$), such that $(a) = \langle\eta, a\eta\rangle$. The pair $(E, \eta)$ is determined up to two-sided pre-Hilbert module isomorphism (i.e. an isomorphism of two-sided modules, which is also an isometry). $(E, \eta)$ is called the GNS-representation of .

Proof. Consider A ⊗ B with its natural A-B-module structure. Since  is completely positive, we turn A ⊗ B into a semi-Hilbert A-B-module by setting

\[ \Bigl\langle \sum_i a_i \otimes b_i, \sum_j a_j' \otimes b_j' \Bigr\rangle = \sum_{i,j} b_i^*\,(a_i^* a_j')\,b_j' \qquad (a_i, a_j' \in A;\ b_i, b_j' \in B). \]

By the preceding lemma $E = (A \otimes B)/N$ is a pre-Hilbert A-B-module. Setting $\eta = 1 \otimes 1 + N$, the pair $(E, \eta)$ has the claimed properties. Uniqueness follows in the usual way.

Remark 3.4. Up to now we did not use the C∗-algebra structure of B. The above definitions and results are formulated in such a way that they can be generalized to the much wider class of ∗-algebras which admit a separating family $S_0$ of states (i.e. $\varphi(b) = 0$ for all $\varphi \in S_0$ implies $b = 0$). In this case an element b is called positive, if $\varphi(b) \ge 0$ for all $\varphi \in S_0$.

In particular, B may be only a pre-C∗-algebra. In this case on a semi-Hilbert B-module E we have the generalized Cauchy–Schwarz inequality

\[ \langle x, y\rangle\langle y, x\rangle \le \|\langle y, y\rangle\|\,\langle x, x\rangle. \]

(This follows either by the same method by which we proved (3.1), or by an investigation of the length of $x - \frac{y\langle y, x\rangle}{\|\langle y, y\rangle\|}$ when $\|\langle y, y\rangle\| \ne 0$. We emphasize, however, that the case $\|\langle y, y\rangle\| = \|\langle x, x\rangle\| = 0$ requires a different argument.) As a consequence a pre-Hilbert B-module E is turned into a normed module (i.e. a normed vector space fulfilling $\|xb\| \le \|x\|\,\|b\|$), by setting $\|x\| = \sqrt{\|\langle x, x\rangle\|}$. If E is complete, then it is called a Hilbert C∗-module or Hilbert B-module. A Hilbert C∗-module always admits an extension of the module operation to the completion of B. If E is a semi-Hilbert B-module, then by $\overline{E}$ we mean the Hilbert B-module associated with E, i.e. the completion of E after having divided out elements of length 0.

Remark 3.5. The set $L^a(E)$ of adjointable mappings on a pre-Hilbert B-module E (i.e. mappings T on E for which there exists a mapping $T^*$ on E, such that $\langle x, Ty\rangle = \langle T^*x, y\rangle$) is a subset of the set $L^r(E)$ of right module mappings. Since the adjoint is unique, the elements of $L^a(E)$ form a ∗-algebra. If E is a pre-Hilbert A-B-module, then A has a ∗-homomorphic image in $L^a(E)$. The elements of $L^a(E)$ extend closably to the completion of E.
Therefore, on a Hilbert B-module E the algebra $L^a(E)$ and the algebra $B^a(E)$ of bounded adjointable operators on E coincide. However, in general $B^a(E)$ and $B^r(E)$ do not coincide; see [8, 12]. $B^a(E)$ is a C∗-algebra with respect to the operator norm. If A is a C∗- or a Banach ∗-algebra and E a Hilbert A-B-module, then $\|ax\| \le \|a\|\,\|x\|$.
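The inclusion of the adjointable mappings in the right module mappings, stated at the beginning of Remark 3.5, can be checked in one line. The following is a sketch of the standard argument using only right linearity and strict positivity from Definition 3.1; nothing beyond the notation of the remark is assumed.

```latex
% Let T be adjointable on the pre-Hilbert B-module E, and let x, y in E, b in B.
\begin{align*}
  \langle x, T(yb)\rangle
    &= \langle T^*x, yb\rangle
     = \langle T^*x, y\rangle\, b
     = \langle x, Ty\rangle\, b
     = \langle x, (Ty)b\rangle .
\end{align*}
% Hence <x, T(yb) - (Ty)b> = 0 for all x. Choosing x = T(yb) - (Ty)b and
% using strict positivity gives T(yb) = (Ty)b, i.e. T is right B-linear.
```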


Definition 3.6. The tensor product E ⊗ F of a semi-Hilbert B-module E and a semi-Hilbert C-module F is turned into a semi-Hilbert B ⊗ C-module, called exterior tensor product, by setting

\[ \Bigl\langle \sum_i x_i \otimes y_i, \sum_j x_j' \otimes y_j' \Bigr\rangle = \sum_{i,j} \langle x_i, x_j'\rangle \otimes \langle y_i, y_j'\rangle \qquad (x_i, x_j' \in E;\ y_i, y_j' \in F). \]

We show that if E and F are pre-Hilbert modules, then so is E ⊗ F. Indeed, let $z = \sum_i x_i \otimes y_i$ be an arbitrary element of E ⊗ F. We may assume that the $x_i$ form a $\mathbb{C}$-linearly independent set. If $\langle z, z\rangle = 0$, then by (3.1) we have $\langle u \otimes v, z\rangle = \sum_i \langle u, x_i\rangle \otimes \langle v, y_i\rangle = 0$ for all $u \in E$, $v \in F$. For an arbitrary state $\varphi$ on C we define $\Phi_v : z \mapsto \sum_i x_i\,\varphi(\langle v, y_i\rangle)$. We have $\langle u, \Phi_v(z)\rangle = (\mathrm{id} \otimes \varphi)(\langle u \otimes v, z\rangle) = 0$ for all $u \in E$, hence $\Phi_v(z) = 0$. From linear independence of the $x_i$ we conclude that $\varphi(\langle v, y_i\rangle) = 0$ for all i. Since $\varphi$ and v are arbitrary, we find $y_i = 0$ for all i, i.e. $z = 0$. (Confer also the proof of this fact given in [8].)

If E and F have a two-sided structure with ∗-algebras A and B, respectively, acting from the left, then E ⊗ F inherits a left structure over A ⊗ B which turns it into a semi- or pre-Hilbert A ⊗ B-B ⊗ C-module.

Definition 3.7. The module tensor product $E \odot F$ of an A-B-module E and a B-C-module F is the vector space $(E \otimes F)/(xb \otimes y - x \otimes by)$ equipped with its natural A-C-module structure. By $x \odot y$ we denote the image of $x \otimes y$ under the quotient map.

A mapping $j : E \times F \to G$ into an A-C-module G is called A-C-bilinear, if it is left A-linear in the first and right C-linear in the second argument. j is called balanced, if $j(xb, y) = j(x, by)$ for all $x \in E$, $b \in B$, $y \in F$. Obviously, the mapping $i : (x, y) \mapsto x \odot y$ is balanced and A-C-bilinear. $E \odot F$ together with i is determined uniquely up to A-C-module isomorphism by the universal property: For an arbitrary balanced A-C-bilinear mapping $j : E \times F \to G$ there exists a unique A-C-linear mapping $\tilde{j} : E \odot F \to G$ fulfilling $j = \tilde{j} \circ i$.

Uniqueness follows in the usual way. We sketch this once for all other cases in these notes. Assume we have two tensor products G and $G'$ which, together with mappings i and $i'$, have the universal property. Denote by $\tilde{i'} : G \to G'$ and $\tilde{i} : G' \to G$ the unique mappings determined by the universal property of G applied to $i'$ and conversely. We have $\tilde{i} \circ \tilde{i'} \circ i = \tilde{i} \circ i' = i$. By the universal property of G there is precisely one A-C-linear mapping $j : G \to G$ fulfilling $j \circ i = i$, namely $j = \mathrm{id}_G$. We conclude $\tilde{i} \circ \tilde{i'} = \mathrm{id}_G$ and, similarly, $\tilde{i'} \circ \tilde{i} = \mathrm{id}_{G'}$. This means that G and $G'$ are isomorphic A-C-modules.

We state some results which may be checked by using the universal property.
Firstly, building the module tensor product is an associative operation. Secondly, given E and F as above, we find

\[ E \odot V_B^f = E \otimes V_B^r, \qquad E \odot V_B = E \otimes V, \]
\[ V_B^f \odot F = V_B^l \otimes F, \qquad V_B \odot F = V \otimes F. \]

The ⊗-signs are those of the exterior tensor product. The right formulae applied to $V = \mathbb{C}$ show explicitly that B plays the role of the complex numbers.

Given an A-B-linear mapping $j : E \to E'$ and a B-C-linear mapping $k : F \to F'$, by the universal property there exists a unique A-C-linear mapping $j \odot k : E \odot F \to E' \odot F'$, fulfilling $(j \odot k)(x \odot y) = j(x) \odot k(y)$. We call $j \odot k$ the tensor product of


j and k. If, for instance, E and F are submodules of $E'$ and $F'$ and j and k are the canonical embeddings, respectively, then $j \odot k$ defines a canonical embedding of $E \odot F$ into $E' \odot F'$. However, unlike the vector space case, this embedding is, in general, not injective. This may happen, because the number of relations to be divided out in the definition of $E \odot F$ will usually be much smaller than the corresponding number for $E' \odot F'$.

Definition 3.8. If E and F are, in addition, two-sided semi-Hilbert modules, we may define a two-sided semi-Hilbert module structure also on the tensor product, by setting

\[ \langle x \odot y, x' \odot y'\rangle = \langle y, \langle x, x'\rangle y'\rangle. \tag{3.2} \]

By a twofold application of the universal property we see that this is, indeed, well-defined. To see positivity it is necessary to know that B is a C∗-algebra. (In this case also $M_n(B)$ is a C∗-algebra and we may refer to the positive square root of positive elements $(\langle x_i, x_j\rangle)_{i,j=1,\dots,n} \in M_n(B)$; see Lance [8, Chapter 4] for details.) $E \odot F$ is called the interior tensor product. If no confusion can arise, we say just tensor product or B-tensor product. Lance also shows that $E \odot F$ is a pre-Hilbert module, if E and F are Hilbert modules. We remark that the same is true (without changing a single word in the proof), if F is only a pre-Hilbert module. The same construction may be performed, when B is only a pre-C∗-algebra, if any $b \in B$ acts as a bounded operator on F. In this case the argument to see positivity applies to the C∗-algebra $B^a(F)$.

Remark 3.9. Assume again that E and F are submodules of $E'$ and $F'$, respectively. We already remarked that the canonical embedding from $E \odot F$ into $E' \odot F'$ need not be injective. It is, however, always an isometry. The same statement is true, if we consider the tensor product $E \odot_{B'} F$ over a subalgebra $B' \subset B$. Obviously, (3.2) defines a (B-valued) inner product also on $E \odot_{B'} F$. The tensor product over a smaller algebra is always isometric to the tensor product over the bigger algebra. In the extreme case, one may even choose the tensor product of vector spaces (i.e. the tensor product over $\mathbb{C}$). Often, we are interested in semi-Hilbert modules only up to isometry.

Remark 3.10. Let us have another look at Corollary 3.3. Suppose that  is a conditional expectation. In this case A is an example for a B-algebra; see Definition 6.2. It is easily checked that the semi-Hilbert module A ⊗ B may be replaced by the isometric semi-Hilbert module $A \odot B = A$. This is a strong hint that a ∗-B-algebra with a positive B-B-linear mapping into B is the proper generalization of a ∗-algebra with a positive functional. See also [20, 22].
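One half of the well-definedness claim for (3.2) is that the inner product respects the relation $xb \otimes y = x \otimes by$ divided out in Definition 3.7. A short sketch of this check, using only left anti-linearity in E and the ∗-property in F (the symbol $\odot$ for the module tensor product is the notation assumed here):

```latex
% x, x' in E;  y, y' in F;  b in B.
\begin{align*}
  \langle xb \odot y,\; x' \odot y'\rangle
    &= \langle y, \langle xb, x'\rangle\, y'\rangle
     = \langle y, b^*\langle x, x'\rangle\, y'\rangle
       && \text{(left anti-linearity in } E\text{)} \\
    &= \langle by, \langle x, x'\rangle\, y'\rangle
     = \langle x \odot by,\; x' \odot y'\rangle
       && \text{($*$-property in } F\text{)} .
\end{align*}
% So the form (3.2) takes the same value on xb (x) y and x (x) by;
% the second argument is treated in the same way.
```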
Definition 3.11. Let E denote a pre-Hilbert module over a von Neumann algebra of operators on a Hilbert space (B(S), for simplicity). By the strong Hilbert module topology on E we understand the locally convex Hausdorff topology generated by the family of semi-norms $x \mapsto \sqrt{\langle\xi, \langle x, x\rangle\xi\rangle}$ ($\xi \in S$). Denote by $\overline{E}^s$ the space obtained from E by completing the unit-ball of E in the strong Hilbert module topology.

We observe that $\langle\bullet,\bullet\rangle : E \times E \to B(S)$ is jointly continuous in the strong Hilbert module topology of E and the weak topology of B(S). It follows that the inner product extends strongly-weakly continuously to $\overline{E}^s$. The strong topology is weaker than the norm topology, so that $\overline{E}^s$ is complete also in the norm topology. It can be shown that $\overline{E}^s$ is a self-dual Hilbert module (see the appendix for the definition of self-dual). Moreover, it can be shown that an arbitrary element of $B^a(E)$ extends uniquely to an element of $B^a(\overline{E}^s)$; see [12, 19]. Therefore, if E is a pre-Hilbert A-B(S)-module then so is $\overline{E}^s$.

Example 3.12. The modules $C_0(\mathbb{R}^d, B(S))^s$ and $C_c(\mathbb{R}^d, B(S))^s$. The examples of pre-Hilbert B-modules most important for us are modules of functions with values in B. The inner product is given by a straightforward generalization of the usual inner product of $L^2$-spaces. To reduce technical difficulties it seems convenient to consider continuous functions. Unfortunately, it turns out that we need functions which are only strongly continuous. We close this section with some technical details. Although many of our statements can be generalized to other setups, we only consider the C∗-algebra B(S).

By $C_0(\mathbb{R}^d, B(S))^s = \{f : \mathbb{R}^d \to B(S) \mid f\xi \in C_0(\mathbb{R}^d, S)\ (\xi \in S)\}$ we denote the module of all strongly continuous functions on $\mathbb{R}^d$ with values in B(S) vanishing strongly at infinity, equipped with a B(S)-B(S)-module structure defined by the pointwise operations. Observe that $C_0(\mathbb{R}^d, S)$ is turned into a Banach space by the usual supremum norm. The strong topology on $C_0(\mathbb{R}^d, B(S))^s$ is defined by the family $f \mapsto \|f\xi\|$ ($\xi \in S$) of semi-norms. In addition, it follows by an application of the principle of uniform boundedness that the supremum norm exists also on $C_0(\mathbb{R}^d, B(S))^s$. In the same manner, it follows that $C_0(\mathbb{R}^d, B(S))^s$ is an algebra (but not a ∗-algebra). We observe that the unit-ball of $C_0(\mathbb{R}^d, B(S))^s$ is complete in both topologies, the uniform topology defined by the norm and the strong topology.

The B(S)-B(S)-module $C_0(\mathbb{R}^d) \otimes B(S)$ is contained in $C_0(\mathbb{R}^d, B(S))^s$, if we identify $f \otimes b$ with the function $fb$. It is a standard result that $C_0(\mathbb{R}^d) \otimes B(S)$ is dense in $C_0(\mathbb{R}^d, B(S))$ with respect to the uniform topology; see e.g. Murphy [11].
Replacing in the proof uniform neighbourhoods by strong neighbourhoods, we obtain that $C_0(\mathbb{R}^d) \otimes B(S)$ is dense in $C_0(\mathbb{R}^d, B(S))^s$ with respect to the strong topology.

By $C_c(\mathbb{R}^d, B(S))^s$ we denote the ideal of $C_0(\mathbb{R}^d, B(S))^s$ consisting of those functions with compact support. We turn $C_c(\mathbb{R}^d, B(S))^s$ into a pre-Hilbert B(S)-B(S)-module by defining the inner product

\[ \langle f, g\rangle = \int dk\, f^*(k)\,g(k). \]

This integral is to be understood as the weak limit of Riemann sums. Observe that this inner product is jointly continuous with respect to the topology arising by restriction from $C_0(\mathbb{R}^d, B(S))^s$ to functions with support in the same compact subset of $\mathbb{R}^d$ and the weak topology of B(S). We denote $L^2(\mathbb{R}^d, B(S))^s = \overline{C_c(\mathbb{R}^d, B(S))^s}^{\,s}$.

Suppose $(f_n)_{n\in\mathbb{N}}$ is a net of elements $f_n \in C_0(\mathbb{R}^d, B(S))^s$ converging in the strong topology of $C_0(\mathbb{R}^d, B(S))^s$ to an element $f \in C_c(\mathbb{R}^d, B(S))^s$. Let $\chi \in C_c(\mathbb{R}^d)$ be such that $\chi$ is 1 on the support of f. Then also $f_n\chi \to f$. From continuity of the inner product we conclude that $f_n\chi \to f$ also in the strong topology of $C_c(\mathbb{R}^d, B(S))^s$. Therefore, $C_c(\mathbb{R}^d) \otimes B(S)$ is dense in the strong Hilbert module topology of $L^2(\mathbb{R}^d, B(S))^s$. Actually, it can be shown that $C_c(\mathbb{R}^d) \otimes B(S)$ is sequentially dense in both strong topologies.

The mapping $b \mapsto \langle f, bg\rangle$ is weakly continuous on bounded subsets of B(S). This follows from the observation that if a certain Riemann sum for $\langle\xi, \langle f, f\rangle\xi\rangle$ differs from its limit at most by $\varepsilon$, then for $b \ge 0$ the corresponding Riemann sum for $\langle\xi, \langle f, bf\rangle\xi\rangle$ differs from its limit at most by $\varepsilon\|b\|$, and that, of course, $b \mapsto \sum_i a_i^* b a_i$ is a weakly continuous mapping. (Notice also $\langle f, bf\rangle \le \|b\|\,\langle f, f\rangle$.) Furthermore, if $f \in C_c(\mathbb{R}^d, B(S))^s$ and $g \in C_c(\mathbb{R}^{d+e}, B(S))^s$, then the mapping $\ell \mapsto \int dk\, f^*(k)\,g(k, \ell)$ is an element of $C_c(\mathbb{R}^e, B(S))^s$. Finally, for $f \in C_c(\mathbb{R}^d, B(S))^s$ and $g \in C_c(\mathbb{R}^e, B(S))^s$ the mapping $f \cdot g : (k, \ell) \mapsto f(k)g(\ell)$ is an element of $C_c(\mathbb{R}^{d+e}, B(S))^s$. These remarks show that $f \odot g \mapsto f \cdot g$ defines an isometry from $C_c(\mathbb{R}^d, B(S))^s \odot C_c(\mathbb{R}^e, B(S))^s$ into $C_c(\mathbb{R}^{d+e}, B(S))^s$. Moreover, given an arbitrary element of $C_c(\mathbb{R}^{d+e}, B(S))^s$ the order of k- and $\ell$-integration does not matter.

4. The Module for Finite λ

In this section we explicitly define a pre-Hilbert module on which the ∗-algebra $A_\lambda$ of collective operators may be represented. The module does not depend on λ. In view of the appendix it may be considered (up to completion) as the GNS-representation of the ∗-algebra $L^a(R \otimes S)$ in the vacuum conditional expectation as defined in Sect. 2. In view of Sect. 6 we call this module a symmetric Fock module.

For the time being we fix on the algebra B(S) which contains the Weyl algebra W as a strongly dense subalgebra. In the limit λ → 0 the algebra B(S) turns out to be too big. However, in view of Remark 3.9 it is easy to restrict to submodules as well as to subalgebras. We consider the module

\[ \Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr) = \bigoplus_{n=0}^{\infty} C_c^{\mathrm{sym}}\bigl((\mathbb{R}^d)^n, B(S)\bigr)^s. \]

$C_c^{\mathrm{sym}}\bigl((\mathbb{R}^d)^n, B(S)\bigr)^s$ are the strongly continuous symmetric functions on $(\mathbb{R}^d)^n$ with compact support and values in B(S). (A function $F(k_n, \dots, k_1)$ depends on n arguments in $\mathbb{R}^d$. F is symmetric, if it is invariant under all exchanges $k_i \leftrightarrow k_j$.) With the pointwise multiplication and the inner product

\[ \langle F, G\rangle = \int dk_n \cdots \int dk_1\, F^*(k_n, \dots, k_1)\,G(k_n, \dots, k_1), \tag{4.1} \]

$\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ is turned into a pre-Hilbert B(S)-B(S)-module; see Example 3.12 for details. We call this module the symmetric Fock module over $C_c(\mathbb{R}^d, B(S))^s$.

For any function $f \in C_c(\mathbb{R}^d, B(S))^s$ we define the creator $a^+(f)$ on $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$, by setting

\[ [a^+(f)F](k_{n+1}, \dots, k_1) = \frac{1}{\sqrt{n+1}} \sum_{i=1}^{n+1} f(k_i)\,F(k_{n+1}, \dots, \widehat{k_i}, \dots, k_1) \]

for $F \in C_c^{\mathrm{sym}}\bigl((\mathbb{R}^d)^n, B(S)\bigr)^s$. The creator has an adjoint in $L^a\bigl(\Gamma_{B(S)}(C_c(\mathbb{R}^d, B(S))^s)\bigr)$, namely, the annihilator $a(f)$, which is defined by setting

\[ [a(f)F](k_{n-1}, \dots, k_1) = \sqrt{n} \int dk\, f^*(k)\,F(k, k_{n-1}, \dots, k_1). \]

Under the condition $[f^*(k), g(k)] = 0$ for all k one easily checks the relations

\[ a(f)a^+(g) - a^+(g)a(f) = \langle f, g\rangle \tag{4.2} \]


which parallel the relations fulfilled by the creators and annihilators on the usual symmetric Fock space. The condition on f and g is fulfilled, if, for instance, one of them takes values only in the complex multiples of 1. Such a function commutes with all algebra elements. The vector subspace of a B-B-module E consisting of all $x \in E$ which commute with all elements of B is called the B-center of E and denoted by $C_B(E)$. We remark that both $C_c(\mathbb{R}^d, B(S))^s$ and $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ are topologically generated by their B(S)-center (i.e. the submodule generated by the B(S)-center is dense); see Example 3.12. In Sect. 6 this property proves to be essential for a systematic construction of a symmetric Fock module.

The left-hand side of (4.2) is an operator. Consequently, also the algebra element $\langle f, g\rangle$ on the right-hand side is an operator, namely, multiplication of a module element by $\langle f, g\rangle$ from the left. Hence, it is impossible to understand Relations (4.2) without explicit reference to the left module structure of $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$. Notice also that $a^+(bfb') = b\,a^+(f)\,b'$ for all $b, b' \in B(S)$. This means $f \mapsto a^+(f)$ is a B(S)-B(S)-linear mapping and $f \mapsto a(f)$ is a B(S)-B(S)-anti-linear mapping $C_c(\mathbb{R}^d, B(S))^s \to L^a\bigl(\Gamma_{B(S)}(C_c(\mathbb{R}^d, B(S))^s)\bigr)$. From these remarks it follows that the ∗-algebra generated by $a^+\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ is an example for a ∗-B(S)-algebra; see Definition 6.2.

Now our aim is to represent the collective operators as suitable creators and annihilators on the symmetric Fock module $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$. Let f be an element in V (see Sect. 2). We define the mapping $\varphi_\lambda : V \to C_c(\mathbb{R}^d, B(S))^s$, by setting

\[ [\varphi_\lambda(f)](k) = \int d\tau\, \gamma_\lambda(\tau, k)\,f(\tau, k). \]

(Since $\gamma_\lambda$ is only strongly continuous, $\varphi_\lambda$ maps, indeed, into $C_c(\mathbb{R}^d, B(S))^s$ and not into $C_c(\mathbb{R}^d, B(S))$.) Having a look at (2.3), we get the impression that $A_\lambda^+(f)$ wants to create the function $\varphi_\lambda(f)$. This impression is fully confirmed by the language of modules.

Theorem 4.1. The equation $\alpha(A_\lambda^+(f)) = a^+(\varphi_\lambda(f))$ defines a ∗-algebra monomorphism $\alpha : A_\lambda \to L^a\bigl(\Gamma_{B(S)}(C_c(\mathbb{R}^d, B(S))^s)\bigr)$. Moreover, identifying $1 \in B(S) = C_c^{\mathrm{sym}}\bigl((\mathbb{R}^d)^0, B(S)\bigr)^s \subset \Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$, we have

\[ \;(M_\lambda) = \langle 1, \alpha(M_\lambda)1\rangle \tag{4.3} \]

for any monomial $M_\lambda$ in collective operators.

Proof. We could check this directly by applying a state of the form $\langle\xi, \bullet\,\xi\rangle$ to (4.3) and realizing that the left-hand side is 0, if and only if the right-hand side is 0. However, we prefer to show how our methods from the appendix work.

Denote by H a partial completion of $R \otimes S = \bigoplus_{n=0}^{\infty} \bigl(L^2(\mathbb{R}^d)^{\otimes_{\mathrm{sym}} n} \otimes S\bigr)$. The completion is partial in the sense that each of the direct summands is completed, but the direct sum remains algebraic. H is a common invariant domain for the collective operators.

According to Theorem A.5 the ∗-algebra $L^a(H)$ of adjointable operators on H is isomorphic to the ∗-algebra $L^a\bigl(L^a(S, H)\bigr)$ of adjointable operators on the $L^a(H)$-$L^a(S)$-module $L^a(S, H)$. The isomorphism, denoted by $\bar\alpha$, maps the element $a \in L^a(H)$ to the map $\tilde a : L \mapsto aL$ for $L \in L^a(S, H)$.


An element $F \in \Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ may be considered as an element of $L^a(S, H)$ by letting F act pointwise on elements of S. It is easily checked that $\bar\alpha(A_\lambda^+(f))$ restricted to $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ is $\alpha(A_\lambda^+(f))$. This implies that α is a homomorphism.

By Corollary A.6 an element of $L^a\bigl(L^a(S, H)\bigr)$ is already determined by its restriction to $R \otimes L^a(S)$. $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$ contains $R \otimes L^a(S)$. Therefore, α is injective.

Eq. (4.3) follows from (A.1) and from the fact that $(\; \otimes \mathrm{id}) \in L^a(S, H)$ corresponds to $1 \in \Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$.

1 is not yet necessarily a cyclic vector for the range of α. However, if we denote by V(B(S)) the module spanned by functions $f : \mathbb{R} \times \mathbb{R}^d \to B(S)$ of the form $f(\tau, k) = \chi_{[t,T]}(\tau)\,\breve f(\tau, k)$ $\bigl(\breve f \in C_c(\mathbb{R} \times \mathbb{R}^d, B(S))^s\bigr)$, then it is possible to extend the definitions of the collective operators and of $\varphi_\lambda$ to V(B(S)). Also α extends to the bigger ∗-algebra generated by $A_\lambda^+\bigl(V(B(S))\bigr)$ and Theorem 4.1 remains true. We will see later that now 1 is at least topologically cyclic.

Notice that $\varphi_\lambda$ is right linear automatically. We turn V(B(S)) into a semi-Hilbert B(S)-module, by defining the semi-inner product $\langle f, g\rangle_\lambda = \langle\varphi_\lambda(f), \varphi_\lambda(g)\rangle$. By defining the left multiplication

\[ [b.f](t, k) = \gamma_\lambda^{-1}(t, k)\,b\,\gamma_\lambda(t, k)\,f(t, k) \tag{4.4} \]

V(B(S)) becomes a semi-Hilbert B(S)-B(S)-module and $\varphi_\lambda$ a (B(S)-B(S)-linear) isometry. Notice that this left multiplication makes $f \mapsto A_\lambda^+(f)$ a B(S)-B(S)-linear mapping, where $L^a(H)$ is equipped with its natural B(S)-B(S)-module structure. (See the proof of Theorem 4.1 for the definition of H.)

Proposition 4.2. $\varphi_\lambda$ extends to an isomorphism between the Hilbert B(S)-B(S)-modules $\overline{V(B(S))}^s$ and $L^2(\mathbb{R}^d, B(S))^s$. A fortiori, all $\overline{V(B(S))}^s$ for different λ > 0 are isomorphic.

Proof. Observe that $\gamma_\lambda^{-1} V(B(S)) \subset V(B(S))$. Let $b \in B(S)$ and $f \in C_c(\mathbb{R}^d)$. We have $\varphi_\lambda(\gamma_\lambda^{-1}\chi_{[0,1]} f b) = fb$, so that $\varphi_\lambda\bigl(V(B(S))\bigr) \supset C_c(\mathbb{R}^d) \otimes B(S)$. From Example 3.12 we conclude that $\overline{\varphi_\lambda\bigl(V(B(S))\bigr)}^s = L^2(\mathbb{R}^d, B(S))^s$.

$\varphi_\lambda$ is an isometry and extends as a surjective isometry from $\overline{V(B(S))}^s$ to $L^2(\mathbb{R}^d, B(S))^s$. Clearly, this extension is an isomorphism.

Remark 4.3. The preceding proof shows that the operators $A_\lambda^+\bigl(V(B(S))\bigr)$ applied successively to 1 generate a strongly dense subspace of $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d, B(S))^s\bigr)$. Therefore, 1 is topologically cyclic. Notice that all results obtained so far remain valid, if we choose for $\gamma_\lambda$ an arbitrary invertible element of $C_b(\mathbb{R} \times \mathbb{R}^d, B(S))^s$ (the bounded strongly continuous functions).

Remark 4.4. The two pictures $L^2(\mathbb{R}^d, B(S))^s$ and $\overline{V(B(S))}^s$ of the same Hilbert module are useful for two different purposes. $L^2(\mathbb{R}^d, B(S))^s$ shows more explicitly the algebraic structure which appears simply as the pointwise operations on a two-sided module of functions with values in an algebra. The property that the module is generated by its centered elements can be seen clearly only in this picture. In Sect. 6 this leads to a systematic construction of the symmetric Fock module out of its one-particle sector $L^2(\mathbb{R}^d, B(S))^s$. For the limit λ → 0, however, we concentrate on the elements of the generating subset $V \subset \overline{V(B(S))}^s$. (The image of $f \in V$ in $L^2(\mathbb{R}^d, B(S))^s$ under $\varphi_\lambda$ does not converge to anything.)
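Two computations used in this section follow directly from the definition of $\varphi_\lambda$ and the left multiplication (4.4). The following sketch writes them out; it assumes only that $\gamma_\lambda(\tau, k)$ is invertible in B(S) and that the scalar function $f \in C_c(\mathbb{R}^d)$ commutes with every operator.

```latex
% (i) The identity used in the proof of Proposition 4.2,
%     for f in C_c(R^d) and b in B(S):
\begin{align*}
  \bigl[\varphi_\lambda(\gamma_\lambda^{-1}\chi_{[0,1]} f b)\bigr](k)
    &= \int d\tau\, \gamma_\lambda(\tau,k)\,\gamma_\lambda^{-1}(\tau,k)\,
       \chi_{[0,1]}(\tau)\, f(k)\, b
     = f(k)\, b \int_0^1 d\tau
     = f(k)\, b .
\end{align*}
% (ii) Left linearity of \varphi_\lambda with respect to (4.4),
%      for g in V(B(S)) and b in B(S):
\begin{align*}
  \bigl[\varphi_\lambda(b.g)\bigr](k)
    &= \int d\tau\, \gamma_\lambda(\tau,k)\,\gamma_\lambda^{-1}(\tau,k)\,
       b\,\gamma_\lambda(\tau,k)\, g(\tau,k)
     = b \int d\tau\, \gamma_\lambda(\tau,k)\, g(\tau,k)
     = b\,\bigl[\varphi_\lambda(g)\bigr](k) .
\end{align*}
```

The second line is the reason why the conjugation by $\gamma_\lambda$ appears in (4.4): it is exactly what is needed for $\varphi_\lambda$ to intertwine the left multiplications.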


5. The Limit of the One-Particle Sector

This section is the analytical heart of these notes. We compute the limit of the module V(B(S)). In Sect. 7 we point out how the results of this section can be generalized, using the algebraic results of Sect. 6, to the full system. Motivated by Remark 4.4 we give the definition of what we understand by a limit of Hilbert modules.

Definition 5.1. Let V denote a vector space. A family of semi-Hilbert B-B-modules $(E_\lambda)_{\lambda\in\Lambda}$ with linear embeddings $i_\lambda : V \to E_\lambda$ is called V-related, if the B-B-submodule generated by $i_\lambda(V)$ is $E_\lambda$. In this case $i_\lambda$ extends to a B-B-linear mapping from $V_B^f$ onto $E_\lambda$. We turn $V_B^f$ into a semi-Hilbert B-B-module $V_\lambda$, by defining the semi-inner product $\langle f, g\rangle_\lambda = \langle i_\lambda(f), i_\lambda(g)\rangle$ for $f, g \in V_B^f$.

Let $T_1$ and $T_2$ be locally convex Hausdorff topologies on B. A semi-Hilbert B-B-module E is called sequentially $T_1$-$T_2$-continuous, if for all $f, g \in E$ any of the four functions $b \mapsto \langle f, gb\rangle$, $b \mapsto \langle f, bg\rangle$, $b \mapsto \langle fb, g\rangle$ and $b \mapsto \langle bf, g\rangle$ on B is sequentially $T_1$-$T_2$-continuous.

Let Λ be a net converging to $\lambda_0 \in \Lambda$ and $B_0$ a ∗-subalgebra of B which is sequentially $T_1$-dense. We say a V-related family of sequentially $T_1$-$T_2$-continuous semi-Hilbert B-B-modules $(E_\lambda)_{\lambda\in\Lambda}$ converges to $E_{\lambda_0}$, if

\[ \lim_\lambda \langle f, g\rangle_\lambda = \langle f, g\rangle_{\lambda_0} \tag{5.1} \]

in the topology $T_2$ for all $f, g \in V_{B_0}^f$. We write $\lim_\lambda E_\lambda = E_{\lambda_0}$.

Remark 5.2. Some comments on this definition are in order. We are interested in the limits of the semi-inner products of elements of $V_B^f$. However, it turns out that the limit may be calculated only on the submodule $V_{B_0}^f$, where $B_0$ is a sufficiently small subalgebra of B, and in a sufficiently weak topology $T_2$. (If this limit took values also in $B_0$, we could stay with $V_{B_0}^f$ and forget about B. Unfortunately, this will not be the case.) By the requirement that $B_0$ is a sequentially dense subalgebra of B in a sufficiently strong topology $T_1$ and by the $T_1$-$T_2$-continuity conditions we assure that the semi-inner product on $V_{B_0}^f$ (with values in B) already determines the semi-inner product on $V_B^f$.

Suppose that $(E_\lambda)_{\lambda\in\Lambda\setminus\{\lambda_0\}}$ is V-related and sequentially $T_1$-$T_2$-continuous and that Eq. (5.1) holds. Furthermore, suppose that also the limit semi-inner product fulfills the continuity conditions and its extension to elements of $V_B^f$ still takes values in B. Then $V_B^f$ with extension of the semi-inner product (5.1) by $T_1$-$T_2$-continuity is a sequentially $T_1$-$T_2$-continuous semi-Hilbert B-B-module. Letting $E_{\lambda_0} = V_B^f$, the family $(E_\lambda)_{\lambda\in\Lambda}$ is V-related, sequentially $T_1$-$T_2$-continuous and we have $\lim_\lambda E_\lambda = E_{\lambda_0}$.

Obviously, after dividing out all null-spaces, Definition 5.1 may be restricted to the case of pre-Hilbert modules. If B is a pre-C∗-algebra and left multiplication is norm continuous on all $E_\lambda$, we may perform a completion. Convergence of a family of Hilbert modules means that there is a family of dense submodules for which Definition 5.1 applies.


Remark 5.3. If the ∗ is continuous in both topologies, then it is sufficient to check the $T_1$-$T_2$-continuity conditions only for either the left or the right argument of the semi-inner product. Furthermore, if the multiplication in $B_0$ is separately $T_2$-continuous, then it is sufficient to compute (5.1) on elements of the left generate of V in $V_{B_0}^f$. However, there is no way around the necessity to compute the limit on every single element in the left generate. This had been avoided in [1], so that the convergence used therein is at most a convergence of right Hilbert modules. However, notice that, in particular, the left multiplication will later on cause a big growth of the limit module. The algebraic structures in Sects. 4 and 6 cannot even be formulated without the left multiplication.

Now we start choosing the ingredients of Definition 5.1 for our problem. For $B_0$ we choose the ∗-algebra $W_0 = \operatorname{span}\{e^{i\kappa\cdot p}e^{i\rho\cdot q} : \kappa, \rho \in \mathbb{R}^d\}$ of Weyl operators.

In order to proceed, we have to recall some basic facts about the Weyl algebra. For reference see e.g. the book [13] of Petz. The Weyl algebra W is the C∗-algebra generated by unitary groups of elements of a C∗-algebra subject to Relations (2.1). By Slawny's theorem this C∗-algebra is unique, so that the definition makes sense. A representation of W on a Hilbert space induces a weak topology on W. However, this topology depends highly on the representation under consideration. For instance, we identify elements $b \in W$ always as operators in B(S). In this representation the operators depend strongly continuously on the parameters κ and ρ. (Such a representation is called regular. An irreducible regular representation of W is determined up to unitary equivalence.)

Denote by $W_p$ and $W_q$ the ∗-subalgebras of $W_0$ spanned by all $e^{i\kappa\cdot p}$ and spanned by all $e^{i\rho\cdot q}$, respectively. The Weyl operators are linearly independent, i.e. as a vector space we may identify $W_0$ with $W_p \otimes W_q$ via $e^{i\kappa\cdot p}e^{i\rho\cdot q} \equiv e^{i\kappa\cdot p} \otimes e^{i\rho\cdot q}$.
Since $\{e^{i\rho\cdot q}\}_{\rho\in\mathbb{R}^d}$ is a basis for $W_q$, we may identify $B_0$ with $\bigoplus_{\rho\in\mathbb{R}^d} W_p$. We identify $W_p$ as a subalgebra of $L^\infty(\mathbb{R}^d) \subset B(S)$. By the momentum algebra P we mean the ∗-subalgebra $C_b(\mathbb{R}^d)$ of $L^\infty(\mathbb{R}^d)$. Notice that the C∗-algebra P contains $W_p$. For B we choose $\bigoplus_{\rho\in\mathbb{R}^d} P$. We have $B \subset \bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d) \subset B(S)$.

In order to define the topology $T_1$, we need the weak topology arising from a different representation. We define a representation π of W on $\bigoplus_{\rho\in\mathbb{R}^d} L^2(\mathbb{R}^d)$ (consisting of families $(f_\rho)_{\rho\in\mathbb{R}^d}$, where $f_\rho \in L^2(\mathbb{R}^d)$) by setting

\[ \pi\bigl(e^{i\kappa\cdot p}e^{i\rho'\cdot q}\bigr)\,(f_\rho)_{\rho\in\mathbb{R}^d} = \bigl(e^{i\kappa\cdot p}e^{i\rho'\cdot q} f_{\rho-\rho'}\bigr)_{\rho\in\mathbb{R}^d}. \]

This representation extends to elements $b \in \bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d)$. It is, roughly speaking, regular with respect to κ, however, 'discrete' with respect to $\rho'$.

Let I denote a finite subset of $\mathbb{R}^d$. We equip $\bigoplus_{\rho\in I} L^\infty(\mathbb{R}^d)$ with the restriction of the weak topology on $\bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d)$ induced by the representation π. We equip $\bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d)$ with a different topology by considering it as the strict inductive limit of the spaces $\bigoplus_{\rho\in I} L^\infty(\mathbb{R}^d)$ ($I \subset \mathbb{R}^d$); see e.g. Yosida [23, Definition I.1.6]. Clearly, a sequence

\[ \Bigl(\sum_{\rho\in\mathbb{R}^d} e^{i\rho\cdot q} h_\rho^n\Bigr)_{n\in\mathbb{N}} = \Bigl((h_\rho^n)_{\rho\in\mathbb{R}^d}\Bigr)_{n\in\mathbb{N}} \]

in $\bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d)$, where $h_\rho^n \in L^\infty(\mathbb{R}^d)$, converges, if and only if the $h_\rho^n$ are different from zero only for a finite number of $\rho \in \mathbb{R}^d$ and if any of the sequences $(h_\rho^n)_{n\in\mathbb{N}}$ ($\rho \in \mathbb{R}^d$) converges in the weak topology of $L^\infty(\mathbb{R}^d)$. Notice that $\bigoplus_{\rho\in\mathbb{R}^d} L^\infty(\mathbb{R}^d)$ is sequentially complete and that $B_0$ is sequentially dense in this topology. By restriction to B, we obtain the topology $T_1$. Notice that convergence of a sequence in the topology $T_1$ also implies convergence in the weak topology of B(S).

The topology $T_2$ is the topology induced by matrix elements with respect to the Schwartz functions $S(\mathbb{R}^d)$. Thus, $\langle f, g\rangle_\lambda$ converges to $b \in B(S)$, if and only if $\langle\xi, \langle f, g\rangle_\lambda\zeta\rangle$ converges to $\langle\xi, b\zeta\rangle$ for all $\xi, \zeta \in S(\mathbb{R}^d)$. Since an element in $B_0$ leaves invariant the domain of Schwartz functions, the multiplication with elements of $B_0$ is a $T_2$-continuous operation. Also the ∗ is continuous in both topologies, i.e. Remark 5.3 applies.

Of course, we choose Λ = [0, ∞), ordered decreasingly, and $\lambda_0 = 0$. We return to $V = A_0(\mathbb{R}) \otimes C_c(\mathbb{R}^d)$. Fix λ > 0 and consider V(B(S)) equipped with its semi-inner product $\langle\bullet,\bullet\rangle_\lambda$, the left multiplication (4.4) and the embedding $i_\lambda$ being the extension of the canonical embedding $i : V \to V(B(S))$. Then our $E_\lambda$ are $i_\lambda(V_B^f)$.

Proposition 5.4. The $(E_\lambda)_{\lambda>0}$ form a V-related, sequentially $T_1$-$T_2$-continuous family of semi-Hilbert B-B-modules.

Proof. First, we show $T_1$-$T_2$-continuity. Notice that for sequences convergence in $T_1$ implies convergence in the weak topology and that convergence in the weak topology implies convergence in $T_2$. Therefore, it suffices to show that for all $f, g \in V_B^f$ the mappings $b \mapsto \langle f, gb\rangle_\lambda$ and $b \mapsto \langle f, b.g\rangle_\lambda$ are sequentially weakly continuous. However, by right B-linearity, continuity of the first mapping is a triviality. The second mapping, actually, is an inner product of elements of $C_c(\mathbb{R}^d, B(S))^s$. By the concluding remarks in Example 3.12 we know that the mapping depends weakly continuously on b on bounded subsets. In particular, it is sequentially weakly continuous.

It remains to show that the inner product maps into B. For $f = \chi_{[t,T]}\tilde f$, $g = \chi_{[s,S]}\tilde g \in V$ we have

\begin{align*}
\langle f, g\rangle_\lambda
 &= \int dk \int_t^T d\tau \int_s^S d\sigma\; \overline{\tilde f(k)}\,\tilde g(k)\,\gamma_\lambda^*(\tau, k)\,\gamma_\lambda(\sigma, k) \\
 &= \frac{1}{\lambda^2} \int dk \int_t^T d\tau \int_s^S d\sigma\; \overline{\tilde f(k)}\,\tilde g(k)\, e^{i\frac{\sigma-\tau}{\lambda^2}(p\cdot k+\omega(|k|))} \\
 &= \int dk \int_t^T d\tau \int_{\frac{s-\tau}{\lambda^2}}^{\frac{S-\tau}{\lambda^2}} du\; \overline{\tilde f(k)}\,\tilde g(k)\, e^{iu(p\cdot k+\omega(|k|))}. \tag{5.2}
\end{align*}
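The passage from the second to the third line of (5.2) is a change of variables in the σ-integral at fixed τ and k. As a sketch, writing $X = p\cdot k + \omega(|k|)$ as a shorthand introduced here only for readability:

```latex
% Substitute u = (\sigma - \tau)/\lambda^2, so that d\sigma = \lambda^2 du,
% and the bounds \sigma = s, S become u = (s-\tau)/\lambda^2, (S-\tau)/\lambda^2.
\[
  \frac{1}{\lambda^2}\int_s^S d\sigma\;
     e^{\,i\frac{\sigma-\tau}{\lambda^2}X}
  \;=\;
  \int_{\frac{s-\tau}{\lambda^2}}^{\frac{S-\tau}{\lambda^2}} du\; e^{\,iuX},
  \qquad X = p\cdot k + \omega(|k|) .
\]
% The prefactor 1/\lambda^2 is absorbed by the Jacobian; all dependence on
% \lambda now sits in the integration bounds, which tend to +-infinity as
% \lambda -> 0 according to the signs of s-\tau and S-\tau.
```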

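This is the weak limit of elements in $W_p$ and, therefore, an element of $P \subset B$. Automatically, we have $\langle f, gb\rangle_\lambda \in B$ for $b \in B$. Now consider $b = h(p)e^{i\rho\cdot q} \in B$ ($h \in P$). By Eq. (2.1) and manipulations similar to (5.2) we find

\[ \langle f, b.g\rangle_\lambda = \int dk \int_t^T d\tau \int_{\frac{s-\tau}{\lambda^2}}^{\frac{S-\tau}{\lambda^2}} du\; \overline{\tilde f(k)}\,\tilde g(k)\, e^{-i\frac{\tau}{\lambda^2}\rho\cdot k}\, e^{iu((p-\rho)\cdot k+\omega(|k|))}\, h(p + k)\,e^{i\rho\cdot q}. \tag{5.3} \]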


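The integral without the factor $e^{i\rho\cdot q}$ is a continuous bounded function of $p$, i.e. an element of the momentum algebra $P\subset B$. It follows that also $\langle f,b.g\rangle_\lambda\in B$.

Next we evaluate the limit in (5.1). The following proposition is just a repetition of a result in [1]. However, notice that the integrations have to be performed precisely in the order indicated (i.e. the $p$-integration first).

Proposition 5.5 ([1]). Let $f,g\in V$ be given as for Eq. (5.2) and $\xi,\zeta\in S(\mathbb{R}^d)$. Then
\[
\lim_{\lambda\to0}\langle\xi,\langle f,g\rangle_\lambda\zeta\rangle
=\langle\chi_{[t,T]},\chi_{[s,S]}\rangle\int dk\,\tilde f(k)\tilde g(k)\int du\int dp\,\xi(p)\zeta(p)\,e^{iu(p\cdot k+\omega(|k|))}.
\tag{5.4}
\]
The factor $\langle\chi_{[t,T]},\chi_{[s,S]}\rangle$ is the inner product of elements of $L^2(\mathbb{R})$.

Proof. The matrix element of Eq. (5.2) is
\[
\begin{aligned}
\langle\xi,\langle f,g\rangle_\lambda\zeta\rangle
&=\int dk\,\tilde f(k)\tilde g(k)\int_t^T d\tau\int_{\frac{s-\tau}{\lambda^2}}^{\frac{S-\tau}{\lambda^2}} du\,e^{iu\omega(|k|)}\int dp\,\xi(p)\zeta(p)\,e^{iup\cdot k}\\
&=\int dk\,\tilde f(k)\tilde g(k)\int_t^T d\tau\int_{\frac{s-\tau}{\lambda^2}}^{\frac{S-\tau}{\lambda^2}} du\,e^{iu\omega(|k|)}\,\widehat{\xi\zeta}(uk).
\end{aligned}
\]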

For $\lambda>0$ the order of integrations does not matter, so we may, indeed, decide to perform the $p$-integration first. $\widehat{\xi\zeta}$, the Fourier transform of $\xi\zeta$, is a rapidly decreasing function. Therefore, the $\lambda$-limit in the bounds of the $u$-integral may be performed for almost all $k$ (namely $k\neq0$) and all $\tau$. Depending on the sign of $s-\tau$ and $S-\tau$, respectively, the bounds converge to $\pm\infty$. A careful analysis, involving the theorem of dominated convergence, yields the scalar product of the indicator functions in front of (5.4). The resulting function of $k$ is bounded by a positive multiple of the function $|f(k)|/|k|$, which is integrable for $d\geq2$. By another application of the theorem of dominated convergence and a resubstitution of $\widehat{\xi\zeta}$ the formula follows.

Now we will show as one of our main results that the sesquilinear form on $S(\mathbb{R}^d)$ given by (5.4) indeed defines an element of $B$. In [1] it was not clear whether (5.4) defines any operator on $S$. Denote by $e_k$ the unit vector in the direction of $k\neq0$ and by $\int de_k$ the angular part of an integration over $k$ in polar coordinates.

Lemma 5.6. Let $f$ be an element of $C_c(\mathbb{R}^d)$, $\xi$ be an element of $S(\mathbb{R}^d)$ and $d\geq3$. Furthermore, let $\omega$ be a $C^1$-function $\mathbb{R}_+\to\mathbb{R}_+$ of the form $\omega(r)=r\omega_0(r)$, where $0\leq\omega_0(0)<\infty$ and $\omega_0'$ is bounded below by a constant $c>0$. Denote by $\omega_0^{-1}$ the inverse function of $\omega_0$ extended by zero to arguments less than $\omega_0(0)$. Then
\[
\int dk\,f(k)\int du\int dp\,\xi(p)\,e^{iu(p\cdot k+\omega(|k|))}
=2\pi\int dp\,\xi(p)\int de_k\,\frac{\omega_0^{-1}(-p\cdot e_k)^{d-2}}{\omega_0'\bigl(\omega_0^{-1}(-p\cdot e_k)\bigr)}\,f\bigl(\omega_0^{-1}(-p\cdot e_k)e_k\bigr).
\]
Moreover,
\[
\int de_k\,\frac{\omega_0^{-1}(-p\cdot e_k)^{d-2}}{\omega_0'\bigl(\omega_0^{-1}(-p\cdot e_k)\bigr)}\,f\bigl(\omega_0^{-1}(-p\cdot e_k)e_k\bigr)
\]
as a function of $p$ is an element of $C_b(\mathbb{R}^d)$.


Remark 5.7. Formally, we can perform the $u$-integration and obtain
\[
2\pi\int dk\,f(k)\int dp\,\xi(p)\,\delta\bigl(p\cdot k+\omega(|k|)\bigr),
\]
where the $\delta$-distribution is one-dimensional (not $d$-dimensional). The statement of the lemma arises by performing the integration over $|k|$ first and use of the formal rules for $\delta$-functions. However, $f$ is in general not a test function and the domain of the $|k|$-integration is $\mathbb{R}_+$, not $\mathbb{R}$. Therefore, some attention has to be paid. We will use this formal $\delta$-notation whenever it is justified by Lemma 5.6.

Proof of Lemma 5.6. Let us write $k$ in polar coordinates, i.e. $k=re_k$. For fixed $k\neq0$ we write the $p$-integral in cartesian coordinates with the first coordinate $p_0$ being the component of $p$ along $e_k$. Then $p$ has the form $p=p_0e_k+p_\perp$ with $p_\perp$ the unique component of $p$ perpendicular to $e_k$. In this representation the exponent has the form $iur(p_0+\omega_0(r))$ and we may apply the inversion formula of the theory of Fourier transforms to the $p_0$-integration followed by the $u$-integration. The result may be described formally by the $\delta$-function $2\pi\delta\bigl(r(p_0+\omega_0(r))\bigr)$ for the $p_0$-integration. We obtain
\[
\int du\int dp\,\xi(p)\,e^{iu(p\cdot k+\omega(|k|))}
=2\pi\int dp\,\xi(p)\,\delta\bigl(r(p_0+\omega_0(r))\bigr)
=\lim_{\varepsilon\to0}\frac{2\pi}{\varepsilon}\int dp\,\xi(p)\,\chi_{[0,\varepsilon]}\bigl(p\cdot k+\omega(|k|)\bigr).
\]
It is routine to check that the right-hand side is bounded uniformly in $\varepsilon\in(0,1]$ by a positive multiple of $\frac{1}{|k|}$. Therefore, again by the theorem of dominated convergence we may postpone the $\varepsilon$-limit also for the $k$-integration and obtain
\[
\int dk\,f(k)\int du\int dp\,\xi(p)\,e^{iu(p\cdot k+\omega(|k|))}
=\lim_{\varepsilon\to0}\frac{2\pi}{\varepsilon}\int dk\,f(k)\int dp\,\xi(p)\,\chi_{[0,\varepsilon]}\bigl(p\cdot k+\omega(|k|)\bigr).
\]
Now the order of integrations no longer matters. We choose polar coordinates for the $k$-integration and perform first the integral over $r=|k|$. The above formula for finite $\varepsilon$ becomes
\[
2\pi\int dp\,\xi(p)\int de_k\,\frac{1}{\varepsilon}\int dr\,r^{d-1}f(re_k)\,\chi_{[0,\varepsilon]}\bigl(r(p\cdot e_k+\omega_0(r))\bigr).
\]
Consider the function $F(r)=r(p\cdot e_k+\omega_0(r))$. From the properties of $\omega_0$ it follows that $\omega_0(r)\geq\omega_0(0)+cr$. Consequently, $F(r)\geq r(p\cdot e_k+\omega_0(0))+cr^2$.

If $p\cdot e_k+\omega_0(0)\geq0$, then $F(r)\geq cr^2$ and, because $d\geq3$, the integral $\frac{1}{\varepsilon}\int dr\,r^{d-1}f(re_k)\chi_{[0,\varepsilon]}\bigl(r(p\cdot e_k+\omega_0(r))\bigr)$ converges to $0$ for $\varepsilon\to0$ uniformly in $p\cdot e_k\geq-\omega_0(0)$. On the other hand, if $p\cdot e_k+\omega_0(0)<0$, then $F(r)$ starts with $0$ at $r=0$, is negative until the second zero $r_0=\omega_0^{-1}(-p\cdot e_k)$ and increases monotonically faster than $cr^2$. We make the substitution $\mu=F(r)$ and obtain
\[
\frac{1}{\varepsilon}\int dr\,r^{d-1}f(re_k)\,\chi_{[0,\varepsilon]}\bigl(r(p\cdot e_k+\omega_0(r))\bigr)
=\frac{1}{\varepsilon}\int_0^{\varepsilon}d\mu\,\frac{r(\mu)^{d-1}}{p\cdot e_k+\omega_0(r(\mu))+r(\mu)\omega_0'(r(\mu))}\,f(r(\mu)e_k).
\]
The integrand is bounded by $\frac{r(\mu)^{d-2}}{\omega_0'(r(\mu))}\sup_{k\in\mathbb{R}^d}|f(k)|$. Therefore, the integral converges uniformly in $p\cdot e_k<-\omega_0(0)$ to the limit
\[
\frac{r_0^{d-2}}{\omega_0'(r_0)}\,f(r_0e_k).
\]
Substituting the concrete form of $r_0$ and extending $\omega_0^{-1}$ by $\omega_0^{-1}(F)=0$ for $F\leq\omega_0(0)$, we obtain the claimed formula. The last statement of the lemma follows from the observation that $\omega_0^{-1}$ is a continuous function and that if $\omega_0^{-1}(-p\cdot e_k)$ is big, then $f(\omega_0^{-1}(-p\cdot e_k)e_k)=0$.

Corollary 5.8. The sesquilinear form on $S(\mathbb{R}^d)$ given by (5.4) defines an element of $B$. Formally, we denote this element by
\[
\langle f,g\rangle_0=2\pi\langle\chi_{[t,T]},\chi_{[s,S]}\rangle\int dk\,\tilde f(k)\tilde g(k)\,\delta\bigl(p\cdot k+\omega(|k|)\bigr).
\]
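The weight appearing in Lemma 5.6 is what the formal one-dimensional $\delta$-rule predicts. The following is only a formal consistency check under the assumption $p\cdot e_k+\omega_0(0)<0$, with $F$ and $r_0$ as in the proof of Lemma 5.6; it ignores the zero of $F$ at $r=0$ and the fact that $f$ need not be a test function, both of which the proof handles by the $\varepsilon$-regularisation.

```latex
% Formal check: \delta(F(r)) contributes r^{d-1} f(r e_k) / |F'(r)| at the zero r_0 > 0.
\begin{align*}
F(r) &= r\bigl(p\cdot e_k+\omega_0(r)\bigr), &
F'(r) &= p\cdot e_k+\omega_0(r)+r\,\omega_0'(r),\\
F(r_0) &= 0 \iff \omega_0(r_0) = -p\cdot e_k, &
F'(r_0) &= r_0\,\omega_0'(r_0) > 0,
\end{align*}
\[
\int_0^\infty dr\, r^{d-1} f(re_k)\,\delta\bigl(F(r)\bigr)
  = \frac{r_0^{d-1}}{F'(r_0)}\,f(r_0e_k)
  = \frac{r_0^{d-2}}{\omega_0'(r_0)}\,f(r_0e_k).
\]
```

Here $F'(r_0)>0$ uses $\omega_0'\geq c>0$, and the last expression agrees with the limit obtained in the proof.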

Notice also the commutation relations
\[
2\pi\int dk\,\tilde f(k)\tilde g(k)\,\delta\bigl((p-\rho)\cdot k+\omega(|k|)\bigr)
=e^{i\rho\cdot q}\Bigl(2\pi\int dk\,\tilde f(k)\tilde g(k)\,\delta\bigl(p\cdot k+\omega(|k|)\bigr)\Bigr)e^{-i\rho\cdot q}.
\]
Again it is clear that the limit extends to the right $B_0$-generate of $V$ and that the function $b\mapsto\langle f,gb\rangle_0$ extends weakly continuously, i.e. a fortiori $T_1$-$T_2$-continuously, from $B_0$ to $B$. It remains to show this also for the left $B_0$-generate.

Proposition 5.9. Let again $f,g\in V$ be given as for Eq. (5.2) and $\xi,\zeta\in S(\mathbb{R}^d)$. Furthermore, let $b=e^{i\kappa\cdot p}e^{i\rho\cdot q}\in B_0$. Then
\[
\lim_{\lambda\to0}\langle\xi,\langle f,b.g\rangle_\lambda\zeta\rangle
=\delta_{\rho0}\,\langle\chi_{[t,T]},\chi_{[s,S]}\rangle\int dk\,\tilde f(k)\tilde g(k)\,e^{i\kappa\cdot k}
\int du\int dp\,\xi(p)\,e^{iu((p-\rho)\cdot k+\omega(|k|))}\bigl(e^{i\kappa\cdot p}e^{i\rho\cdot q}\zeta\bigr)(p).
\tag{5.5}
\]

Remark 5.10. $\delta_{\rho0}$ is, indeed, the Kronecker $\delta$. So the left multiplication in the limit is no longer weakly continuous. This is the reason for our rather complicated choice of the topology $T_1$.

Proof of Proposition 5.9. Our starting point is Eq. (5.3). If $\rho=0$ the statement follows precisely as in the proof of Proposition 5.5. If $\rho\neq0$, the expression is similar to the case $\rho=0$ (where $\zeta$ is replaced by $e^{i\rho\cdot q}\zeta$). The only difference is the oscillating factor $e^{-i\frac{\tau}{\lambda^2}\rho\cdot k}$. Similarly, one argues that the $\lambda$-limit in the bounds of the $u$-integral may be performed first. By an application of the Riemann–Lebesgue lemma the resulting integral over the oscillating factor converges to 0.

Remark 5.11. The proposition shows in a particularly simple example how the Riemann–Lebesgue lemma makes a lot of matrix elements disappear in the limit. This fundamental idea is due to [1]. However, in [1] the idea was not applied to the left multiplication.
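The mechanism behind the Kronecker $\delta_{\rho0}$ can be seen in a one-dimensional caricature. The sketch below is not the computation of the proof: it only evaluates the $\tau$-integral of the oscillating factor in closed form, with the scalar `a` standing in for $\rho\cdot k$ (the function name and parameters are illustrative assumptions). For `a != 0` the modulus is at most $2\lambda^2/|a|$, so it vanishes as $\lambda\to0$, whereas for `a == 0` the integral stays equal to $T-t$.

```python
import cmath


def oscillating_integral(t, T, a, lam):
    """Integral of exp(-1j * tau * a / lam**2) over tau in [t, T].

    `a` plays the role of rho . k; it is a hypothetical scalar stand-in.
    """
    if a == 0:
        # No oscillation: the integrand is identically 1.
        return complex(T - t)
    w = a / lam ** 2
    # Antiderivative of exp(-1j*w*tau) is exp(-1j*w*tau) / (-1j*w).
    return (cmath.exp(-1j * w * T) - cmath.exp(-1j * w * t)) / (-1j * w)


if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.01):
        value = oscillating_integral(0.0, 1.0, 0.7, lam)
        print(f"lambda={lam}: |integral|={abs(value):.3e}")
```

In the actual proof the oscillating factor is integrated together with the remaining integrand, which is why the Riemann–Lebesgue lemma rather than an explicit bound is invoked.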


Corollary 5.12. The sesquilinear form on $S(\mathbb{R}^d)$, defined by (5.5), determines the element
\[
\langle f,b.g\rangle_0=\delta_{\rho0}\,e^{i\rho\cdot q}\langle f,(e^{i\kappa\cdot k}g)\rangle_0\,e^{-i\rho\cdot q}\,e^{i\kappa\cdot p}e^{i\rho\cdot q}
\]
in $Pe^{i\rho\cdot q}\subset B$. Moreover, the mapping $b\mapsto\langle f,b.g\rangle_0$ extends sequentially $T_1$-$T_2$-continuously from $B_0$ to $B$.

Proof. Like for $\rho=0$, it follows that (5.5), indeed, defines an element of $Pe^{i\rho\cdot q}$. Now we observe that a matrix element $\langle f,h.g\rangle_0$, written in the form according to Lemma 5.6, may be extended from elements in $W_p$ to all elements $h\in P$. It suffices to show that the mapping $h\mapsto\langle f,h.g\rangle_0$ is sequentially weakly continuous on $P$. To see this we perform first the $p$-integral and obtain a bounded function on $e_k$. Inserting a sequence $(h_n)_{n\in\mathbb{N}}$, the resulting sequence of functions on $e_k$ is uniformly bounded. By the theorem of dominated convergence we may exchange limit and $e_k$-integration.

The following theorem is proved just by collecting all the results.

Theorem 5.13. The $(E_\lambda)_{\lambda\geq0}$ form a $V$-related, sequentially $T_1$-$T_2$-continuous family of semi-Hilbert $B$-$B$-modules and $\lim_{\lambda\to0}E_\lambda=E_0$.

Now we are going to understand the structure of $E_0$ better. Consider
\[
E_0=B\otimes V\otimes B=\bigoplus_{\rho\in\mathbb{R}^d}P\otimes V\otimes B.
\]
Any of the summands $P\otimes V\otimes B$ inherits a semi-Hilbert $P$-$B$-module structure just by restriction of the operations of $E_0$. Notice that the inner products differ for different indices $\rho$. However, the left multiplications by elements $h\in P$ coincide. Of course, multiplication of an element in the $\rho$th summand by $e^{i\rho'\cdot q}$ from the left is not only pointwise multiplication, but shifts this element into the $(\rho+\rho')$th summand.

Next we recall that $V=A_0(\mathbb{R})\otimes C_c(\mathbb{R}^d)$. The factor $\langle\chi_{[t,T]},\chi_{[s,S]}\rangle$ tells us that $E_0$ is the exterior tensor product of the pre-Hilbert $\mathbb{C}$-$\mathbb{C}$-module $A_0(\mathbb{R})$ and $B\otimes C_c(\mathbb{R}^d)\otimes B$ with a suitable semi-Hilbert $B$-$B$-module structure.

In order to combine both observations we make the following definition. Fix $\rho\in\mathbb{R}^d$. We turn $C_c(\mathbb{R}^d,B)^s$ into a $P$-$B$-module by pointwise multiplication by elements of $B$ from the right and the left multiplication defined by setting $[h.f](k)=h(p+k)f(k)$. Denote by $V_\rho^r$ the $P$-$B$-submodule of $C_c(\mathbb{R}^d,B)^s$ generated by $C_c(\mathbb{R}^d)$. We turn $V_\rho^r$ into a semi-Hilbert $P$-$B$-module by setting
\[
\langle f,g\rangle^{(\rho)}=2\pi\int dk\,f(k)^*\,\delta\bigl((p-\rho)\cdot k+\omega(|k|)\bigr)\,g(k).
\]
Set $E=\bigoplus_{\rho\in\mathbb{R}^d}V_\rho^r$. For an element $(f_\rho)_{\rho\in\mathbb{R}^d}\in E$ we define the left action of $e^{i\rho'\cdot q}$ by
\[
e^{i\rho'\cdot q}.(f_\rho)_{\rho\in\mathbb{R}^d}=\bigl(e^{i\rho'\cdot q}f_{\rho-\rho'}\bigr)_{\rho\in\mathbb{R}^d}.
\]
The following theorem may be checked simply by inspection.

Theorem 5.14. The mapping
\[
\sum_{\rho\in\mathbb{R}^d}(h_\rho e^{i\rho\cdot q})\otimes\chi_{[t,T]}f_\rho\otimes b_\rho\;\longmapsto\;\chi_{[t,T]}\otimes\bigl(h_\rho.(e^{i\rho\cdot q}f_\rho)b_\rho\bigr)_{\rho\in\mathbb{R}^d},
\]


where $\chi_{[t,T]}\in A_0(\mathbb{R})$, $f_\rho\in C_c(\mathbb{R}^d)\subset V_\rho^r$, $h_\rho\in P$, and $b_\rho\in B$ (all different from 0 only for finitely many $\rho\in\mathbb{R}^d$), defines a surjective $B$-$B$-linear isometry
\[
E_0=B\otimes V\otimes B\longrightarrow A_0(\mathbb{R})\otimes E.
\]
The $\otimes$-sign on the right-hand side is that of the exterior tensor product.

Remark 5.15. $C_c(\mathbb{R}^d,B)^s$ may be considered as a completion of $C_c(\mathbb{R}^d)\otimes B$. The left multiplication by elements of $P$ leaves invariant $C_c(\mathbb{R}^d,B)^s$ and the inner product of $E_0$, first restricted to $P\otimes V\otimes B=P\otimes V_B^r$ and then extended to $P\otimes\overline{V_B^r}^{\,s}$, does not distinguish between elements $h\otimes f$ and $1\otimes(h.f)$. Therefore, already the comparably small spaces $V_\rho^r$ are sufficient to obtain an isometry.

We find the commutation relations
\[
\Bigl[(e^{i\kappa\cdot p}e^{i\rho'\cdot q}).(f_\rho)_{\rho\in\mathbb{R}^d}\Bigr](t,k)
=\bigl(e^{i\kappa\cdot k}f_{\rho-\rho'}(t,k)\bigr)_{\rho\in\mathbb{R}^d}\,(e^{i\kappa\cdot p}e^{i\rho'\cdot q})
\tag{5.6}
\]

for elements $f_\rho\in V$. This means that for an arbitrary element $f\in A_0(\mathbb{R})\otimes E$ and $\rho',\kappa\in\mathbb{R}^d$ there exists $f'\in A_0(\mathbb{R})\otimes E$ such that $(e^{i\kappa\cdot p}e^{i\rho'\cdot q}).f=f'e^{i\kappa\cdot p}e^{i\rho'\cdot q}$ and conversely. (The same is true already for $E$.) Such a possibility is not unexpected, because it already occurs for finite $\lambda$.

Remark 5.16. Of course, it is true that $E$ is non-separable. This is not remarkable due to the non-separability of $W_0$. However, the separability condition usually imposed on Hilbert modules is that of being countably generated. Clearly, $E$ fulfills this condition, because $V$ is separable. A much more remarkable feature is that the left multiplication is no longer weakly continuous. However, also this behaviour is not completely unexpected. It often happens in certain limits of representations of algebras that certain elements in the representation space, fixed for the limit, become orthogonal. Consider, for instance, the limit $\hbar\to0$ for the canonical commutation relations or the limits $q\to\pm1$ for the quantum group $SU_q(2)$, see [17]. In both examples suitably normalized coherent vectors become orthogonal in the limit.

Remark 5.17. Since $B\subset B(S)$ is a pre-$C^*$-algebra, $E$ has a semi-norm and right multiplication fulfills $\|fb\|\leq\|f\|\,\|b\|$. We show that left multiplication by an element of $B$ acts at least boundedly on $E$. Indeed, we have
\[
\|f\|^2=\sup_{\xi\in S(\mathbb{R}^d),\,\|\xi\|=1}\langle\xi,\langle f,f\rangle\xi\rangle.
\]
Any element $b=(h_\rho)_{\rho\in\mathbb{R}^d}\in B$ may be $T_1$-approximated by a sequence $(b_n)_{n\in\mathbb{N}}$ of elements in $B_0$, where $b_n=(h^n_\rho)_{\rho\in\mathbb{R}^d}$. By the Kaplansky density theorem and weak separability of the unit-ball of $L^\infty(\mathbb{R}^d)$, we may assume that $\|h^n_\rho\|=\|h_\rho\|$ ($\rho\in\mathbb{R}^d$, $n\in\mathbb{N}$). We have
\[
\langle\xi,\langle b_n\varphi_\lambda(f),b_n\varphi_\lambda(f)\rangle\xi\rangle
\leq\|b_n\|^2\,\langle\xi,\langle\varphi_\lambda(f),\varphi_\lambda(f)\rangle\xi\rangle
\leq\langle\xi,\langle\varphi_\lambda(f),\varphi_\lambda(f)\rangle\xi\rangle\Bigl(\sum_{\rho\in\mathbb{R}^d}\|h_\rho\|\Bigr)^2.
\]
The number of $\rho$'s for which $h^n_\rho\neq0$ for at least one $n\in\mathbb{N}$ is finite. Our claim follows, performing the limits first $\lambda\to0$ and then $n\to\infty$. Therefore, if necessary, we may change to the Hilbert $B$-$B$-module $E$ where, however, $B$ is only a pre-$C^*$-algebra. This is all that is needed for the construction of the interior tensor products in Sect. 6.


Remark 5.18. It is not difficult to see that the left and right multiplication, actually, are sequentially T1 -continuous. Therefore, all our results in this section and in Sect. 7 may be extended to the sequentially T1 -complete algebra L∞ (Rd ) ⊗ Wq .

6. Full and Symmetric Fock Modules

This section together with the appendix is the algebraic heart of these notes. We recall the construction of the full Fock module which is the Hilbert module carrying Pimsner's generalized Cuntz-Krieger algebras, see [14]. Then we show that for a suitable subcategory of the two-sided Hilbert modules, the so-called centered Hilbert modules, a construction paralleling the construction of the symmetric Fock space can be performed. The module from Sect. 4 which describes the physical system will be identified as the symmetric Fock module over $E_\lambda$. In Sect. 7 we will see that the full Fock module over $A_0(\mathbb{R})\otimes E$ will be the representation space on which the limits of the collective operators may be represented as elements of the corresponding generalized Cuntz-Krieger algebra.

We explain the connection with $B$-algebras. A $B$-algebra with a conditional expectation is for operator-valued free probability what an algebra with a state is for usual quantum probability; see Voiculescu [22] and also [18]. The subcategory of centered $B$-algebras is the basis for operator-valued Bose probability; see [18].

Definition 6.1. Let $E$ denote a semi-Hilbert $B$-$B$-module. By the full Fock module $\mathcal{F}_B(E)$ over $E$ we mean the semi-Hilbert $B$-$B$-module
\[
\mathcal{F}_B(E)=\bigoplus_{n\in\mathbb{N}_0}E^{\odot n}
\]
($E^{\odot0}=B$). By our remark in Definition 3.3 $\mathcal{F}_B(E)$ is a pre-Hilbert module, if $E$ is a Hilbert module and $B$ is a $C^*$-algebra. On $\mathcal{F}_B(E)$ we define the creators $\ell^+(x)$ ($x\in E$) by setting
\[
\ell^+(x)\,x_n\odot\ldots\odot x_1=x\odot x_n\odot\ldots\odot x_1,\qquad\ell^+(x)1=x,
\]
and the annihilators $\ell(x)$ ($x\in E$) by setting
\[
\ell(x)\,x_n\odot\ldots\odot x_1=\langle x,x_n\rangle\,x_{n-1}\odot\ldots\odot x_1,\qquad\ell(x)1=0.
\]
The creator and annihilator to the same $x\in E$ are adjoint elements of $L^a(\mathcal{F}_B(E))$. Moreover, we have the relations
\[
\ell(x)\ell^+(y)=\langle x,y\rangle,
\tag{6.1}
\]
where the algebra element $\langle x,y\rangle$ again acts as multiplication from the left. Notice that, unlike Relations (4.2), the above relations hold for all $x,y\in E$. The above definition has been introduced by Pimsner [14]. However, notice that Pimsner only considers complete modules. The first use in quantum probability occurred in Speicher [20].
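Relation (6.1) and the adjointness of creator and annihilator can be checked by hand in the simplest special case. The sketch below assumes $B=\mathbb{C}$ and $E=\mathbb{C}^d$, so that the full Fock module reduces to the ordinary full Fock space; a vector is stored as a dictionary from index tuples $(i_n,\ldots,i_1)$ to coefficients, with the empty tuple standing for the vacuum $1$. All function names are illustrative and not taken from the text.

```python
def create(x, v):
    """Free creator l^+(x): tensor x onto the left of every component of v."""
    out = {}
    for idx, c in v.items():
        for i, xi in enumerate(x):
            key = (i,) + idx
            out[key] = out.get(key, 0) + xi * c
    return out


def annihilate(x, v):
    """Free annihilator l(x): pair x with the leftmost tensor factor."""
    out = {}
    for idx, c in v.items():
        if not idx:
            continue  # l(x) 1 = 0 on the vacuum
        key = idx[1:]
        out[key] = out.get(key, 0) + x[idx[0]].conjugate() * c
    return out


def inner(x, y):
    """Inner product on E = C^d, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def fock_inner(u, w):
    """Inner product of two finite Fock vectors."""
    return sum(c.conjugate() * w[k] for k, c in u.items() if k in w)


if __name__ == "__main__":
    x = [1 + 2j, 0, -1j]
    y = [0.5, 1j, 2]
    v = {(): 1.0, (0,): 2j, (1, 2): -1.0}
    lhs = annihilate(x, create(y, v))
    # Relation (6.1): l(x) l^+(y) v = <x, y> v
    print(all(abs(lhs[k] - inner(x, y) * v[k]) < 1e-12 for k in v))
```

Because $\ell(x)$ only ever touches the factor that $\ell^+(y)$ has just created, no symmetry assumption on the vector is needed, which is why (6.1) holds for all $x,y$.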


Definition 6.2. A $B$-algebra is a $B$-$B$-module $A$ with a $B$-$B$-linear multiplication $M\colon A\odot A\to A$ and a $B$-$B$-linear unit mapping $m\colon B\to A$, such that the associativity condition $M\circ(M\odot\mathrm{id})=M\circ(\mathrm{id}\odot M)$ and the unit property $M\circ(m\odot\mathrm{id})=\mathrm{id}=M\circ(\mathrm{id}\odot m)$ are fulfilled. We use the notation $M(a\odot b)=ab$ and $m(1)=1$ (i.e. $1_B=1_A=1$). A $*$-$B$-algebra is a $B$-algebra with a $B$-$B$-anti-linear involution fulfilling the usual properties.

Remark 6.3. For $B=\mathbb{C}$ we recover the usual abstract definition of a unital complex algebra. In [22] a $B$-algebra is introduced as an algebra $A$ which contains $B$ as a subalgebra and $1_B=1_A$. One easily checks that this case is included in our definition as the case when $m$ is injective, because $m(B)$ always is a subalgebra of the algebra $A$.

Example 6.4. The full Fock module $\mathcal{F}_B(E)$ is turned into a $B$-algebra by setting $(x_n\odot\cdots\odot x_1)(y_m\odot\cdots\odot y_1)=x_n\odot\cdots\odot x_1\odot y_m\odot\cdots\odot y_1$ and $m(b)=b\in E^{\odot0}$. Forgetting about the Hilbert module structure, we denote this $B$-algebra by $T_B(E)$ and call it the tensor $B$-algebra over $E$. For $B=\mathbb{C}$ we recover the usual tensor algebra over a vector space. As a $B$-algebra $T_B(E)$ with the canonical embedding $i\colon E\to E^{\odot1}$ is determined up to $B$-algebra isomorphism by the following universal property: An arbitrary $B$-$B$-linear mapping $j\colon E\to A$ into an arbitrary $B$-algebra $A$ extends to a unique $B$-algebra homomorphism $\tilde j\colon T_B(E)\to A$, such that $j=\tilde j\circ i$.

Example 6.5. The $*$-algebra $A_\ell$ generated by all creators and annihilators on $\mathcal{F}_B(E)$ is not necessarily unital. However, the elements of $B$ act as double centralizers on $A_\ell$. Denote by $m$ the embedding of $B$ into the multiplier algebra $M(A_\ell)$. The $*$-subalgebra of $M(A_\ell)$ generated by $A_\ell$ and $m(B)$ is an example for a $*$-$B$-algebra where $m$ is not necessarily injective. The generalized Cuntz-Krieger algebra is the $*$-$B$-subalgebra of $L^a(\mathcal{F}_B(E))$ generated by $A_\ell$ and $B$.
In [14] it is shown that this algebra may be considered as the $*$-algebra generated by the module $E$, the $*$-algebra $B$, Relations (6.1) and $\ell^+(axa'+byb')=a\ell^+(x)a'+b\ell^+(y)b'$ for all $x,y\in E$; $a,a',b,b'\in B$.

Example 6.6. Let $V$ be a vector space. Then $\mathcal{F}_B(V_B)=(\mathcal{F}(V))_B$. If, for instance, $V=C_c(\mathbb{R}^d)$, then $V_B$ is the subspace of those functions in $C_c(\mathbb{R}^d,B)$ which take values only in a finite-dimensional subspace of $B$. Notice that, in general, $C_c(\mathbb{R}^d,B)\odot C_c(\mathbb{R}^d,B)\not\subset C_c(\mathbb{R}^{2d},B)$. However, as semi-Hilbert modules $C_c(\mathbb{R}^d,B)\odot C_c(\mathbb{R}^d,B)$ and the submodule of $C_c(\mathbb{R}^{2d},B)$ generated by elements $f(k_2)g(k_1)$ are isometric.

The (unital) free product of (unital) algebras $A$, $B$ is defined as $A\star B=T(A\oplus B)/(a\otimes a'-aa',\,b\otimes b'-bb',\,1_T-1_A,\,1_T-1_B)$, see [20, 21]. One easily checks that $T_B(V_B^f)/(1_T-1_B)=T(V)\star B$. This shows a close connection between free two-sided modules and free products of algebras. By the


universal property one checks the functorial property TB (E ⊕ F ) = TB (E) ?B TB (F ), where ?B denotes the B-free product of B-algebras; see [20]. (Roughly speaking, in the definition of ? one has to replace T by TB and ⊗ by . The B-free product is the coproduct in the category of B-algebras.) Thinking again about Hilbert modules, the functorial property remains true, if the tensor products and direct sums are understood as those of Hilbert modules. Now we proceed from the full to the symmetric Fock module. The symmetric Fock space may be obtained from the full Fock space via symmetrization of any n-particle sector. We want to generalize this to modules. However, in general the tensor product of modules depends on the order of the factors. Example 6.7. Denote by 4 a vector space with a basis en n∈N0 . By 4+ and 4− we denote the modules which are obtained by letting act the algebra of polynomials Chxi in one indeterminate x as creators (i.e. xen = en x = en+1 ) and as annihilators (i.e. xen = en x = en−1 , e−1 := 0), respectively. Obviously, 4+ may be identified with Chxi via en = xn . Hence, 4+ Chxi = 4+ . On the other hand, 4− Chxi = {0}. We obtain (4− ⊗ 4+ ) (4+ ⊗ 4− ) = 4− ⊗ 4+ ⊗ 4− , but (4+ ⊗ 4− ) (4− ⊗ 4+ ) = {0}. This means that, for instance, the B-tensor product of two B-algebras does not necessarily have a canonical B-algebra structure. In order to overcome this obstacle we introduce a new subcategory of the category of B-B-modules. Definition 6.8. The B-center of a B-B-module E is the set CB (E) = {x ∈ E|xb = bx (b ∈ B)}. A B-B-module E is called centered, if it is generated by its B-center. This means that for any x ∈ E there exist n ∈ N, xk ∈ CB (E), bk ∈ B (k = 1, . . . , n), such that x=

$\sum_{k=1}^{n}x_kb_k=\sum_{k=1}^{n}b_kx_k.$

The following properties are checked immediately.

Proposition 6.9.
1. A $B$-$B$-linear mapping maps the $B$-center into the $B$-center.
2. Any element of a centered $B$-$B$-module commutes with any element of the center of $B$.
3. Consequently, a $B$-$B$-module over a commutative algebra $B$ is centered, if and only if the left and right action coincide.
4. For two centered $B$-$B$-modules $E$, $F$ we have $C_B(E)\odot C_B(F)\subset C_B(E\odot F)$. Therefore, also $E\odot F$ is a centered $B$-$B$-module.

Theorem 6.10. Let $E$, $F$ be two centered $B$-$B$-modules. There is a unique $B$-$B$-module isomorphism $\mathscr{F}\colon E\odot F\to F\odot E$, called flip isomorphism, fulfilling
\[
\mathscr{F}(x\odot y)=y\odot x
\tag{6.2}
\]
for all $x\in C_B(E)$ and $y\in C_B(F)$.


Proof. Let $(x,y)$; $x\in E$, $y\in F$ denote an arbitrary element of $E\times F$. Since $E$ and $F$ are centered, we have $x=\sum_i a_ix_i$ and $y=\sum_j y_jb_j$ for suitable $x_i\in C_B(E)$; $y_j\in C_B(F)$; $a_i,b_j\in B$. Let $x_i'\in C_B(E)$; $y_j'\in C_B(F)$; $a_i',b_j'\in B$ denote another suitable choice. We find
\[
\begin{aligned}
\sum_{ij}a_iy_j\odot x_ib_j
&=\sum_{ij}y_j\odot a_ix_ib_j
=\sum_{ij}y_j\odot a_i'x_i'b_j
=\sum_{ij}a_i'y_j\odot x_i'b_j\\
&=\sum_{ij}a_i'y_jb_j\odot x_i'
=\sum_{ij}a_i'y_j'b_j'\odot x_i'
=\sum_{ij}a_i'y_j'\odot x_i'b_j'.
\end{aligned}
\]
Therefore,
\[
\mathscr{F}^\times\colon(x,y)\longmapsto\sum_{ij}a_iy_j\odot x_ib_j
\]
is a well-defined mapping $E\times F\to F\odot E$. Obviously, $\mathscr{F}^\times$ is $B$-$B$-bilinear. We show that it is balanced. Indeed, for an arbitrary $a\in B$ we find
\[
\mathscr{F}^\times(xa,y)=\sum_{ij}a_iay_j\odot x_ib_j=\sum_{ij}a_iy_j\odot x_iab_j=\mathscr{F}^\times(x,ay).
\]
Thus, by the universal property of the $B$-tensor product there exists a unique $B$-$B$-linear mapping $\mathscr{F}\colon E\odot F\to F\odot E$ fulfilling
\[
\mathscr{F}(x\odot y)=\mathscr{F}^\times(x,y).
\]
Of course, $\mathscr{F}$ fulfills (6.2). By applying $\mathscr{F}$ a second time (now to $F\odot E$), we find $\mathscr{F}\circ\mathscr{F}=\mathrm{id}$. Combining this with surjectivity, we conclude that $\mathscr{F}$ is an isomorphism.

Our procedure in the foregoing proof is typical for all statements concerning centered $B$-$B$-modules and $B$-$B$-module homomorphisms between them. It shows that for many statements it is sufficient to check them only on elements of the center. The $B$-tensor products of elements of the center behave formally like tensor products of vectors in usual tensor products. This means that all statements concerning centered $B$-algebras (i.e. $B$-algebras whose module structure is centered) are clear, if the corresponding statements for algebras are understood. In particular, centered $B$-algebras have a natural $B$-tensor product. The $B$-tensor product of centered $B$-algebras is for operator-valued Bose probability what the $B$-free product is for operator-valued free probability, see [18].

Proposition 6.11. If $E$ is a centered semi-Hilbert $B$-$B$-module, then $\langle C_B(E),C_B(E)\rangle$ is contained in the center of $B$. Consequently, the flip of two centered semi-Hilbert $B$-$B$-modules $E$ and $F$ is an isometry. Moreover, it extends as an isometry to $\overline{E\odot F}$. Thus, we have $\overline{E\odot F}\cong\overline{F\odot E}$. If $B=B(S)$, the same is true for strong completions.

A topological $B$-$B$-module is called topologically centered, if it contains a dense centered $B$-$B$-submodule.


Example 6.12. $V_B$ is a centered $B$-$B$-module. It is easy to see that a $B$-$B$-module is of the form $V_B$, if and only if it admits a module basis consisting of elements of its center. If the center of $B$ is trivial, then the identification is canonical. If $V$ is a pre-Hilbert space, then the exterior tensor product $V_B=V\otimes B$ is a centered pre-Hilbert $B$-$B$-module. Assume that $V$ is infinite-dimensional and separable. In view of Kasparov's absorption theorem any countably generated Hilbert $B$-module may be considered as a complemented $B$-$B$-submodule of $V_B$, see [8].

With the help of the flip isomorphism it is possible to define permutations on the $n$-fold tensor product $E^{\odot n}$ of a centered $B$-$B$-module $E$ in an obvious way. Any permutation is an element of $B^a(E^{\odot n})$ with the inverse permutation being the adjoint. If $E$ is only topologically centered, then some attention has to be paid: We restrict to tensor products of pre-Hilbert modules in order to have a Hausdorff topology. Since the flip is an isometry, it may be extended (strongly) continuously from $(C_B(E))_B\odot(C_B(E))_B$ to $E\odot E$. However, in general we have $E\odot E\neq\mathscr{F}(E\odot E)$.

Example 6.13. Consider $E=(C_c(\mathbb{R}^d))_B$ with its strong Hilbert module completion $L^2(\mathbb{R}^d,B)^s$. On elements $f,g\in E$ we have $[\mathscr{F}(f\odot g)](k_2,k_1)=f(k_1)g(k_2)$. However, for $f,g\in C_c(\mathbb{R}^d,B)^s\subset\overline{E}^{\,s}$ there is in general no possibility to write $f(k_1)g(k_2)$ in the form $f'(k_2)g'(k_1)$ for suitable $f',g'\in\overline{E}^{\,s}$.

Definition 6.14. Let $E$ be a centered semi-Hilbert $B$-$B$-module. We define the number operator $N$ on $\mathcal{F}_B(E)$ by setting $N\restriction E^{\odot n}=n$. The symmetrization operator $P$ is defined by $P\restriction E^{\odot n}$ being the mean over all permutations on $E^{\odot n}$. We define the symmetric Fock module $\Gamma_B(E)$ over $E$ by setting $\Gamma_B(E)=P\mathcal{F}_B(E)$. On $\Gamma_B(E)$ we define the creators $a^+(x)=P\sqrt{N}\,\ell^+(x)$ and the annihilators $a(x)=\ell(x)\sqrt{N}\,P$ for all $x\in E$. Sometimes we call $\ell^+$ and $\ell$ the free creators and annihilators and $a^+$ and $a$ the symmetric creators and annihilators, respectively.

Remark 6.15. $P$ is a bounded self-adjoint projection. $N$ is self-adjoint. Therefore, $a^+(x)$ and $a(x)$ are adjoints. One easily checks $PN=NP$, $a^+(x)P=a^+(x)$ and $N\ell^+(x)=\ell^+(x)(N+1)$. By these relations the creators and annihilators fulfill the analogue of Relation (4.2)
\[
a(f)a^+(g)-a^+(g)a(f)=\langle f,g\rangle,
\]
if at least one of the arguments is in the $B$-center of $E$.

Remark 6.16. The modules we are interested in are only topologically centered. Therefore, we have to generalize the preceding definition slightly. Suppose that $E$ is a topologically centered pre-Hilbert $B$-$B$-module. Then the definitions of $N$, $P$ and $a^+(x)$, $a(x)$ extend to any submodule $\tilde H$ of $\bigoplus_{n\in\mathbb{N}_0}\overline{E^{\odot n}}^{(s)}$ which is closed under $P$ and big enough to contain the image of $E^{\odot n}$ ($n\in\mathbb{N}_0$) under the canonical embedding. Also in this case we will speak of a symmetric Fock module $\Gamma_B(E)=P\tilde H$. The concrete form of $\tilde H$ has to be clear from the context. With this agreement the module $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d,B(S))^s\bigr)$ from Sect. 4 is, indeed, the symmetric Fock module over the topologically centered pre-Hilbert $B(S)$-$B(S)$-module $C_c(\mathbb{R}^d,B(S))^s$. However, notice that the $E_\lambda$ themselves are not centered, because the $B(S)$-center of $\overline{E_\lambda}^{\,s}$ is not in the image of $E_\lambda$ in $\overline{E_\lambda}^{\,s}$.
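The commutation relation of Remark 6.15 can be verified numerically in the scalar case. The sketch below again assumes $B=\mathbb{C}$ and $E=\mathbb{C}^d$ (so every element is central) and implements $a^+(x)=P\sqrt{N}\,\ell^+(x)$ and $a(x)=\ell(x)\sqrt{N}\,P$ literally on dictionary-valued Fock vectors; the helper names are illustrative assumptions and do not appear in the text.

```python
from itertools import permutations
from math import factorial, sqrt


def create(x, v):
    """Free creator l^+(x)."""
    out = {}
    for idx, c in v.items():
        for i, xi in enumerate(x):
            key = (i,) + idx
            out[key] = out.get(key, 0) + xi * c
    return out


def annihilate(x, v):
    """Free annihilator l(x); it kills the vacuum component."""
    out = {}
    for idx, c in v.items():
        if idx:
            key = idx[1:]
            out[key] = out.get(key, 0) + x[idx[0]].conjugate() * c
    return out


def symmetrize(v):
    """P: mean over all permutations on each n-particle sector."""
    out = {}
    for idx, c in v.items():
        n = len(idx)
        for perm in permutations(range(n)):
            key = tuple(idx[p] for p in perm)
            out[key] = out.get(key, 0) + c / factorial(n)
    return out


def sqrt_number(v):
    """sqrt(N): multiply the n-particle sector by sqrt(n)."""
    return {idx: sqrt(len(idx)) * c for idx, c in v.items()}


def a_plus(x, v):
    """Symmetric creator a^+(x) = P sqrt(N) l^+(x)."""
    return symmetrize(sqrt_number(create(x, v)))


def a_minus(x, v):
    """Symmetric annihilator a(x) = l(x) sqrt(N) P."""
    return annihilate(x, sqrt_number(symmetrize(v)))


def commutator(f, g, v):
    """(a(f) a^+(g) - a^+(g) a(f)) applied to v."""
    first = a_minus(f, a_plus(g, v))
    second = a_plus(g, a_minus(f, v))
    return {k: first.get(k, 0) - second.get(k, 0) for k in set(first) | set(second)}


def inner(x, y):
    """Inner product on C^d, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))
```

Applied to a symmetric vector `v`, `commutator(f, g, v)` should agree with `inner(f, g)` times `v` up to rounding. The same helpers also reproduce, for small `n`, the identity $a^+(g_n)\cdots a^+(g_1)1=\sqrt{n!}\,P\,\ell^+(g_n)\cdots\ell^+(g_1)1$ for symmetric creators.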


We conclude this section with some algebraic remarks. A $B$-algebra $A$ is called $B$-commutative, if the $B$-center is a commutative subalgebra of $A$. Notice that such an algebra is, in general, far from being commutative. Clearly, $\Gamma_B(E)$ with the multiplication $(P\sqrt{N}F)(P\sqrt{N}G)=P\sqrt{N}(F\odot G)$ ($F,G\in\mathcal{F}_B(E)$) is a centered $B$-commutative $B$-algebra. Forgetting about the Hilbert module structure, we denote this $B$-algebra by $S_B(E)$ and call it the symmetric tensor $B$-algebra over $E$. The symmetric tensor $B$-algebra is characterized up to $B$-algebra isomorphism by the following universal property: An arbitrary $B$-$B$-linear mapping $j\colon E\to A$ into an arbitrary centered $B$-commutative $B$-algebra $A$ extends to a unique $B$-algebra homomorphism $\tilde j\colon S_B(E)\to A$, such that $j=\tilde j\circ i$.

One easily checks the functorial property $S_B(E\oplus F)=S_B(E)\odot S_B(F)$ by looking at elements of the center. Thinking again about centered Hilbert modules, the functorial property remains true, if the tensor product is understood as that of Hilbert modules. The functorial property of the usual Fock space is crucial for the quantum stochastic calculus of Hudson and Parthasarathy [5]. We remark that it is very well possible to introduce also the symmetric Fock module via exponential vectors. However, with the domain of exponential vectors we definitely leave the algebraic framework.

Finally, we mention that the $B$-tensor product is the coproduct in the category of centered $B$-commutative $B$-algebras. In [18] we investigate a new type of Bose independence of centered $B$-valued quantum random variables based on centered $B$-algebras. We obtain the Bose analogue of Voiculescu's operator-valued free independence, see [22].

Example 6.17. We apply our notion of centered $B$-algebras to the paper [6] by Kümmerer and Maassen. The authors study the so-called "essentially commutative Markov dilations", which live on the algebra $W_B=W\otimes B$ where $W$ is a commutative von Neumann algebra and $B=M_n$.
Clearly, $W_B$ is a centered $B$-commutative $B$-algebra. $W$ as a commutative algebra of 'classical white noises' may be represented on a symmetric Fock space. Thus, $W_B$ is a centered $B$-commutative $B$-subalgebra of the centered $B$-algebra of adjointable operators on a suitable symmetric Fock module over $B$. It should be possible to generalize the results of [6] to the full $B$-algebra of adjointable operators on this Fock module by allowing non-commutative "white noises".

7. The Central Limit Theorem

In this section we prove in a central limit theorem that the moments of the collective creators and annihilators in the vacuum conditional expectation, represented in Sect. 4 by symmetric creators and annihilators on the symmetric Fock module $\Gamma_{B(S)}\bigl(C_c(\mathbb{R}^d,B(S))^s\bigr)$, converge to the moments of the corresponding free creators and annihilators on the full Fock module $\mathcal{F}_B(A_0(\mathbb{R})\otimes E)$ over the limit of the one-particle sector computed in Sect. 5.

In a first step we show that in a pyramidally ordered product (i.e., so to speak, an anti-normally ordered product) the moments of the free operators for finite $\lambda$ converge to the moments of the free operators for $\lambda=0$. In the next step we show that nothing


changes, if we replace for finite $\lambda$ the free operators by symmetric operators. For this step an explicit knowledge of the embedding of the symmetric Fock module into the full Fock module as described in Sect. 6 is indispensable. The final step consists in showing that the limits for arbitrary monomials respect the free commutation relations given by (6.1).

In the course of this section we compute a couple of $T_2$-limits of elements of $B(S)$. For the sesquilinear forms on $S(\mathbb{R}^d)$, defined by these algebra elements, all the limits already have been calculated by Accardi and Lu in [1]. Since the combinatorial problems of, for instance, how to write down an arbitrary monomial in creators and annihilators and so on, have been treated in [1] very carefully, we keep the proofs brief. Sometimes, we give only the main idea of a proof in a typical example.

New is that the limit sesquilinear forms define operators. This means that the limit conditional expectation, indeed, takes values in $B(S)$. Also new is the interpretation of the limit of the moments of the collective operators as moments of free operators on a full Fock module in the vacuum expectation. The idea to see this, roughly speaking, by checking Relations (6.1) (see the proof of Theorem 7.3), also has repercussions for the computation of the limit of the sesquilinear forms. The structure of the proof is simplified considerably.

Theorem 7.1. Let $f_i=\chi_{[t_i,T_i]}\tilde f_i$, $g_i=\chi_{[s_i,S_i]}\tilde g_i$ be in $V$ ($i=1,\ldots,n$; $n\in\mathbb{N}$). Then
\[
\lim_{\lambda\to0}\langle1,\ell(f_1)\cdots\ell(f_n)\ell^+(g_n)\cdots\ell^+(g_1)1\rangle_\lambda
=\langle1,\ell(f_1)\cdots\ell(f_n)\ell^+(g_n)\cdots\ell^+(g_1)1\rangle_0.
\]

Proof. First, we show that
\[
\begin{aligned}
\langle f_n\odot\ldots\odot f_1,\,g_n\odot\ldots\odot g_1\rangle_\lambda
={}&\int dk_n\ldots\int dk_1\,\tilde f_n(k_n)\tilde g_n(k_n)\cdots\tilde f_1(k_1)\tilde g_1(k_1)\\
&\int_{t_n}^{T_n}d\tau_n\int_{s_n}^{S_n}d\sigma_n\ldots\int_{t_1}^{T_1}d\tau_1\int_{s_1}^{S_1}d\sigma_1\,
\gamma_\lambda^*(\tau_1,k_1)\cdots\gamma_\lambda^*(\tau_n,k_n)\gamma_\lambda(\sigma_n,k_n)\cdots\gamma_\lambda(\sigma_1,k_1)
\end{aligned}
\tag{7.1}
\]
converges to the inner product on $\mathcal{F}_0$ in $T_2$. We proceed precisely as in the proof of Proposition 5.5. Here we are not very explicit, because we have been explicit there. We consider matrix elements of (7.1) with Schwartz functions $\xi$, $\zeta$. The $q$'s in the $\gamma_\lambda$'s disappear by extensive use of Relations (2.1); however, they cause some shifts to the $p$'s. We make the substitutions $u_i=\frac{\sigma_i-\tau_i}{\lambda^2}$ and after performing the $p$-integral we obtain the function $\widehat{\xi\zeta}(u_nk_n+\ldots+u_1k_1)$. Its modulus is for almost all $k_n,\ldots,k_1$ and all $\tau_n,\ldots,\tau_1$ a rapidly decreasing upper bound for the $u_i$-integrations. Similarly, as in the proof of Proposition 5.5 one checks that the $\lambda$-limits for the $u_i$-integrals may be performed first. We obtain the result
\[
\begin{aligned}
\lim_{\lambda\to0}\langle\xi,\langle f_n\odot\ldots\odot f_1,\,g_n\odot\ldots\odot g_1\rangle_\lambda\zeta\rangle
={}&\langle\chi_{[t_n,T_n]},\chi_{[s_n,S_n]}\rangle\cdots\langle\chi_{[t_1,T_1]},\chi_{[s_1,S_1]}\rangle\\
&\int dk_1\,\tilde f_1(k_1)\tilde g_1(k_1)\int du_1\cdots\int dk_n\,\tilde f_n(k_n)\tilde g_n(k_n)\int du_n\\
&\int dp\,\xi(p)\zeta(p)\,e^{iu_n((p+k_{n-1}+\ldots+k_1)\cdot k_n+\omega(|k_n|))}\cdots e^{iu_1(p\cdot k_1+\omega(|k_1|))}
\end{aligned}
\]
of [1]. Now we proceed as in Lemma 5.6 and bring the $p$-integration step by step to the outer position. (Take into account that after performing the integrals over $p$ and over $u_i$, $k_i$


(i = m + 1, …, n) the result is still a rapidly decreasing function of $u_i$ (i = 1, …, m) for almost all $k_i$ (i = 1, …, m). Therefore, Fubini's theorem applies.) We obtain (by the same notational use of the δ-functions) that (7.1) converges to

\[ \langle f_n\odot\cdots\odot f_1, g_n\odot\cdots\odot g_1\rangle_0 = (2\pi)^n\,\langle\chi_{[t_n,T_n]},\chi_{[s_n,S_n]}\rangle\cdots\langle\chi_{[t_1,T_1]},\chi_{[s_1,S_1]}\rangle \int dk_n\cdots\int dk_1\,\tilde f_n(k_n)\tilde g_n(k_n)\cdots\tilde f_1(k_1)\tilde g_1(k_1)\; \delta\big((p+k_{n-1}+\ldots+k_1)\cdot k_n+\omega(|k_n|)\big)\cdots\delta\big(p\cdot k_1+\omega(|k_1|)\big). \]

Theorem 7.2. Theorem 7.1 remains true, if we replace on the left-hand side the free creators and annihilators by the symmetric creators and annihilators, i.e.

\[ \lim_{\lambda\to 0}\,(A_\lambda(f_1)\cdots A_\lambda(f_n)A_\lambda^+(g_n)\cdots A_\lambda^+(g_1)) = \lim_{\lambda\to 0}\langle 1, a(\varphi_\lambda(f_1))\cdots a(\varphi_\lambda(f_n))a^+(\varphi_\lambda(g_n))\cdots a^+(\varphi_\lambda(g_1))1\rangle = \langle 1, \ell(f_1)\cdots\ell(f_n)\ell^+(g_n)\cdots\ell^+(g_1)1\rangle_0 . \]

Proof. Notice that

\[ a^+(\varphi_\lambda(g_n))\cdots a^+(\varphi_\lambda(g_1))1 = \sqrt{n!}\,P\,\ell^+(\varphi_\lambda(g_n))\cdots\ell^+(\varphi_\lambda(g_1))1 . \]

Therefore, we are done if we show that in the sum over the permutations only the identity permutation contributes to the limit of the inner product. Applying the flip to two neighbouring elements $\varphi_\lambda(g_{i+1})\odot\varphi_\lambda(g_i)$ means exchanging the arguments $k_{i+1}\leftrightarrow k_i$. (The $\sigma_i$ are dummies and may be labeled arbitrarily.) We find

\[ F\big(\varphi_\lambda(g_{i+1})\odot\varphi_\lambda(g_i)\big)(k_{i+1},k_i) = \int_{s_i}^{S_i}d\sigma_i\int_{s_{i+1}}^{S_{i+1}}d\sigma_{i+1}\; e^{i\frac{\sigma_i-\sigma_{i+1}}{\lambda^2}k_{i+1}\cdot k_i}\,\gamma_\lambda(\sigma_{i+1},k_{i+1})\gamma_\lambda(\sigma_i,k_i)\,\tilde g_i(k_{i+1})\tilde g_{i+1}(k_i). \]

This differs only by the oscillating factor $e^{i\frac{\sigma_i-\sigma_{i+1}}{\lambda^2}k_{i+1}\cdot k_i}$ from the expression $\varphi_\lambda(\chi_{[s_{i+1},S_{i+1}]}\tilde g_i)\odot\varphi_\lambda(\chi_{[s_i,S_i]}\tilde g_{i+1})$, whose inner products are known to have finite limits. This oscillating factor cannot be neutralized by any other flip operation on a different pair of neighbours. Assume, for instance, for a certain permutation π that i is the first position, counting from the right, which is changed by π. Then π may be written in the form $\pi' F^{(i,i+1)}\pi''$, where $F^{(i,i+1)}$ is the flip of positions i and i + 1 and $\pi'$, $\pi''$ are permutations involving only the positions i + 1, …, n. A look at the concrete form of the exponents in the oscillating factors tells us that the oscillating factor arising from $F^{(i,i+1)}$ will be neutralized at most on a null-set for the $k_j$-$\sigma_j$-integrations (j = i + 1, …, n). Therefore, no non-identical permutation contributes to the sum over all permutations. (Notice that also here a proper argument involves the theorem of dominated convergence.)
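The reduction to a sum over permutations used in this proof can be written out explicitly. The following is only a sketch: it assumes that P denotes the symmetrization projection $P = \frac{1}{n!}\sum_\pi \pi$ on the n-particle sector (self-adjoint and idempotent), that ⊙ denotes the tensor product of one-particle elements, and it introduces the shorthand $\phi_i$, $\psi_i$ purely for brevity.

```latex
% Shorthand (not from the paper): \phi_i := \varphi_\lambda(f_i), \psi_i := \varphi_\lambda(g_i).
% Assumption: P = (1/n!) \sum_{\pi \in S_n} \pi with P = P^* = P^2.
\[
  \langle 1, a(\phi_1)\cdots a(\phi_n)\,a^+(\psi_n)\cdots a^+(\psi_1)\,1\rangle
  = n!\,\langle P(\phi_n\odot\cdots\odot\phi_1),\, P(\psi_n\odot\cdots\odot\psi_1)\rangle
  = \sum_{\pi\in S_n}\langle \phi_n\odot\cdots\odot\phi_1,\ \pi(\psi_n\odot\cdots\odot\psi_1)\rangle .
\]
% For n = 2 the sum has exactly two terms, the identity and the flip F:
\[
  \langle \phi_2\odot\phi_1,\ \psi_2\odot\psi_1\rangle
  + \langle \phi_2\odot\phi_1,\ F(\psi_2\odot\psi_1)\rangle .
\]
```

In this form the claim of the proof is that, as λ → 0, only the first term (π = id) survives, because every other term carries an oscillating factor.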


Central Limit Theorem 7.3. Theorem 7.2 remains true, if we replace on the left-hand side $A_\lambda(f_1)\cdots A_\lambda(f_n)A_\lambda^+(g_n)\cdots A_\lambda^+(g_1)$ by an arbitrary monomial in collective creators and annihilators and on the right-hand side $\ell(f_1)\cdots\ell(f_n)\ell^+(g_n)\cdots\ell^+(g_1)$ by the corresponding monomial in the free creators and annihilators. In other words, we expressed the limit of arbitrary moments of collective operators in the vacuum conditional expectation as the moments of the corresponding free operators in the vacuum expectation on the limit full Fock module.

Proof. We will show that, in a certain sense, $A_\lambda(f)A_\lambda^+(g)\to\langle f,g\rangle_0$ for λ → 0; cf. Relations (6.1). Indeed, one easily checks that

\[ [a(f)a^+(g)F](k_n,\ldots,k_1) = \langle f,g\rangle F(k_n,\ldots,k_1) + \sum_{i=1}^{n}\int dk\, f^*(k)\,g(k_i)\,F(k,k_n,\ldots,\widehat{k_i},\ldots,k_1) \]

for $f, g\in C_c(\mathbb{R}^d,B)^s$. Replacing f, g by $\varphi_\lambda(f)$, $\varphi_\lambda(g)$ (f, g ∈ V), the first summand converges precisely to what we want, namely, $\langle f,g\rangle_0F$. In the remaining sum we may, as in the proof of Theorem 7.2, exchange the position of $f^*$ and g. This produces an oscillating factor which makes the k-integral disappear in the limit. It has to be shown that in a concrete expression, e.g. like

\[ A_\lambda(f_1)\cdots A_\lambda(f_n)A_\lambda^+(g_n)\cdots A_\lambda^+(g_{m+1})\,A_\lambda(f)A_\lambda^+(g)\,A_\lambda^+(g_m)\cdots A_\lambda^+(g_1) = \langle 1, a(\varphi_\lambda(f_1))\cdots a(\varphi_\lambda(f_n))a^+(\varphi_\lambda(g_n))\cdots a^+(\varphi_\lambda(g_{m+1}))\,a(\varphi_\lambda(f))a^+(\varphi_\lambda(g))\,a^+(\varphi_\lambda(g_m))\cdots a^+(\varphi_\lambda(g_1))1\rangle, \]

the limit $a(\varphi_\lambda(f))a^+(\varphi_\lambda(g))\to\langle f,g\rangle_0$ for the inner pairing may be computed first. But this follows in the usual way using arguments involving the theorem of dominated convergence and the Riemann-Lebesgue lemma.

Remark 7.4. It is possible to extend the preceding results in an obvious manner to elements f in the B0-generate of V. This means that the moments of both $A^+(f)$ and $\ell^+(f)$ for finite λ converge to the moments of $\ell^+(f)$ on $z_{B(S)}(A_0(\mathbb{R})\otimes E)$. By a slight weakening of Definition 5.1 in the sense that the generating set needs only to be topologically generating, one can show that $\lim_{\lambda\to 0} z_B(E_\lambda) = z_B(A_0(\mathbb{R})\otimes E)$ and more or less also $\lim_{\lambda\to 0}\Gamma_B(C_c(\mathbb{R}^d,B)^s) = z_B(A_0(\mathbb{R})\otimes E)$. However, since the notational effort and a

precise reasoning would take a lot of time, we content ourselves with the central limit theorem. Since the moments of all creators and, hence, the inner products on the full Fock module are already determined by Relations (6.1) (see [14, 16]), we do not really lose information on the limit module.

8. Résumé

Why do we consider the preceding theorem as a type of central limit theorem? By Eq. (2.4) the collective operators are related to a time integral over the interaction Hamiltonian. They collect, so to speak, the interaction over a certain time interval. In this sense they also may be considered as a “continuous sum” over certain random variables. By the rescaling $t\mapsto t/\lambda^2$ for the stochastic limit the “number of elements” in this “sum” increases like $1/\lambda^2$ for λ → 0. Therefore, $1/\lambda^2$ plays the role of the number n in a usual central limit theorem. This interpretation is reconfirmed by the fact that $H_I(t)$ contains λ as a factor which, therefore, plays the role of the usual normalization $1/\sqrt{n}$.
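The analogy can be displayed schematically. The following dictionary is purely heuristic: the symbols N, $X_k$ and X(s) are placeholders introduced here for illustration and are not objects defined in the paper.

```latex
% Heuristic dictionary between the stochastic limit and a classical
% central limit theorem. N, X_k and X(s) are illustrative placeholders.
\[
  \lambda\int_{0}^{t/\lambda^{2}} X(s)\,ds
  \quad\longleftrightarrow\quad
  \frac{1}{\sqrt{N}}\sum_{k=1}^{N} X_k ,
  \qquad N \sim \frac{1}{\lambda^{2}}, \qquad \lambda \sim \frac{1}{\sqrt{N}} .
\]
```

The prefactor λ on the left corresponds to the factor contained in the interaction Hamiltonian, and the upper limit $t/\lambda^2$ to the growing number of summands.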

Let us recall the functorial property $\Gamma_B(E\oplus F) = \Gamma_B(E)\odot\Gamma_B(F)$ of the symmetric Fock module. It is easy to check that creators and annihilators to orthogonal elements e ∈ E and f ∈ F factorize according to this decomposition, i.e. $a^\#(e)a^\#(f) = a^\#(e)\odot a^\#(f) := (a^\#(e)\odot\mathrm{id})\circ F\circ(a^\#(f)\odot\mathrm{id})\circ F$, where # means ∗ or not ∗. (Actually, this is true for arbitrary elements of $L^a(\Gamma_B(E))$ and $L^a(\Gamma_B(F))$. However, notice that in general $a^+(e)a^+(f)\neq a^+(f)a^+(e)$. This follows, for instance, from $a^+(be)a^+(b'f) = bb'\,a^+(e)a^+(f)$, but $a^+(b'f)a^+(be) = b'b\,a^+(f)a^+(e) = b'b\,a^+(e)a^+(f)$ for centered elements e and f.) It seems to be a natural idea to consider the creators and annihilators to orthogonal elements as (Bose) B-independent random variables, see [18]. However, the collective operators to different time intervals are not independent random variables, because elements of the one-particle sector to different time intervals are not necessarily orthogonal. Thus, the analogy to the usual central limit theorems is not complete. Nevertheless, the limits $\ell^\#(\chi_{[0,t]}\tilde f)$ of the collective operators $A_\lambda^\#(\chi_{[0,t]}\tilde f)$ form stochastic processes with freely B-independent (additive) increments in the sense of Speicher [20]. For the symmetric and the full Fock space there exist the quantum stochastic calculi of Hudson-Parthasarathy [5] and Kümmerer-Speicher [7], respectively. In both cases it is possible to tensor the Fock space R with an initial space S. We made the initial space disappear, or better, the information about the structure of the initial space has been transferred into the algebra of operators on S. Of course, our Fock modules allow for a faithful representation of all processes which are represented as quantum stochastic integrals on R⊗S. (The methods in the appendix indicate how algebras of unbounded operators can also be included in our description. 
The only serious obstacle consists in the fact that we are restricted to adjointable mappings.) However, in the language of Fock modules a much bigger variety of operators may be identified as creator or annihilator. For instance, we have seen that the collective operators on R ⊗ S appear naturally as creators and annihilators on the symmetric Fock module. Thus, we did not only make the initial space superfluous, but also increased the number of quantum stochastic processes which may be thought of as white noises considerably. It is noteworthy that the sets of adapted processes coincide (up to algebra isomorphism). It is natural to ask, if the more general classes of creators and annihilators (and also gauge operators) on a symmetric or full Fock module allow for a quantum stochastic calculus (being more general than the calculi in [5] and [7]). We will follow this question in the future and expect from our first rudimentary investigations that it should be possible to find rich quantum stochastic calculi. Notice that our symmetric Fock module over an arbitrary centered pre-Hilbert B-B-module is different from the symmetric Fock module introduced by Lu on which he develops in [9, 10] his quantum stochastic calculus. Lu’s symmetric Fock module is only over centered modules over a commutative algebra. In this case, we can define an isometric (but not injective) embedding from his module to ours. Our module, however, is nothing but a direct integral of a family of symmetric Fock spaces over the spectrum of the commutative algebra. We will have to check, if his apparently more general notion of “essential adaptedness”, actually allows for more quantum stochastic processes. A second question is based on our notion of centered modules. We pointed out that in order to define a tensor product of B-algebras it is necessary for them to be centered. For centered modules, however, we are able to define a natural generalization


of bialgebras to centered B-bialgebras. We consider it an interesting question whether Schürmann’s theory of “White Noise on Bialgebras” [15] can be generalized to centered B-bialgebras, using a potential calculus on the symmetric Fock module over a centered Hilbert module. We remark that presently in [3] an attempt is made to describe the limit module again in terms of creator and annihilator densities. The algebraic relations, e.g. like (5.6), are transferred into the densities and then the “algebra” generated by the densities is described as an “algebra” generated by relations. We insist, however, on calling our limit module just a “full Fock module” and not an “interacting Fock module” as is done in [3, 1, 2]. In our opinion, the name “interacting” is given to Fock-like objects, if the inner product on the n-particle sector cannot be understood as the canonical inner product of the n-fold tensor product of one and the same (including all structures) one-particle sector by itself. One of our main results says, however, that precisely this is possible. Last but not least, we make a remark on the algebra over which our modules are defined. We chose a suitably dense subalgebra B of the algebra B(S) of all bounded operators on the initial space. We observed that for any λ > 0 the algebra $A_\lambda$ may be represented on the same symmetric Fock module over the one-particle sector $L^2(\mathbb{R}^d, B(S))^s$ which is a centered Hilbert module. Also in view of the appendix this module appears as the equivalent description of the physical system in terms of modules. The one-particle sector is generated by the subspace V. The inner product restricted to this subspace takes values only in the momentum algebra P. Therefore, as indicated by the GNS-construction or by Remark 3.9, it is possible to restrict at any time to the P-P-submodules generated by V. However, if we do so we lose a lot. Firstly, the one-particle sector is no longer centered and depends on λ. 
Thus, we do not have the possibility to interpret the GNS-module as a symmetric Fock module. Moreover, the GNS-module will no longer appear as an adequate description of the physical system. Secondly, the left multiplication no longer appears as the pointwise multiplication in a module of functions. We have to introduce it as an a priori operation. Finally, the apparent technical advantage of a module over a commutative algebra, actually, does not exist. The limit one-particle sector is not centered. The non-commutativity can be seen easily from the commutation relations which are fulfilled by algebra elements and elements of V. There are two possible advantages of the restriction to P. Firstly, the uncountable direct sum disappears and the limit module is now separable. Secondly, P is a C∗-algebra. Presently, we are not in a position to see whether our results extend to the C∗-completion of B or not. Thus, if one insists on considering two-sided modules over C∗-algebras, restriction to P is a possible way out. In this case T1 is just the weak topology of $L^\infty(\mathbb{R}^d)$.

Appendix: Connections Between Vector Spaces and Modules

In this appendix we work out the algebraic basis for our purpose to represent a quite general class of operators between Hilbert spaces on Hilbert modules. In particular, we will see that this possibility does not depend essentially on the existence of an inner product.

Lemma A.1. Let G, H denote vector spaces, B a subalgebra of L(G). Furthermore, assume that $g_0\in G$ is cyclic for B and that B contains a rank-one projection $P_0$ such that $P_0g_0 = g_0$. Then the mapping $L\odot g\longmapsto Lg$


establishes an isomorphism $L(G,H)\odot G\longrightarrow H$, where ⊙ denotes the tensor product over B of the L(H)-B-module L(G, H) and the B-C-module G.

Proof. The mapping $L\odot g\mapsto Lg$ is surjective (and, of course, well-defined). We show that it is also injective. Indeed, denote by $\sum_i L_i\odot g_i$ an arbitrary element of $L(G,H)\odot G$. Then there exist $\gamma_i\in B$ such that $g_i = \gamma_ig_0$. Denote $L = \sum_i L_i\gamma_i$. Then

\[ \sum_i L_i\odot g_i = \sum_i L_i\odot\gamma_ig_0 = \sum_i L_i\gamma_i\odot g_0 = L\odot g_0 . \]

Suppose that $L\odot g_0\neq 0$. Then $L\odot P_0g_0 = LP_0\odot g_0\neq 0$, i.e. $LP_0\neq 0$, i.e. $Lg_0\neq 0$. Therefore, the mapping is injective.

Theorem A.2. Let $H_1$, $H_2$ be vector spaces and G and B as before. For $a\in L(H_1,H_2)$ define the mapping $\tilde a\in L^r(L(G,H_1),L(G,H_2))$ by setting $\tilde aL = aL$. Then the mapping $a\mapsto\tilde a$ establishes an isomorphism $L(H_1,H_2)\longrightarrow L^r(L(G,H_1),L(G,H_2))$. Moreover, if $H_1 = H_2 = H$, then the mapping is also an algebra isomorphism $L(H)\to L^r(L(G,H))$.

Proof. By the preceding lemma we may identify $L(G,H_i)\odot G$ with $H_i$. Therefore, the mapping $\tilde a\odot\mathrm{id}: L(G,H_1)\odot G\to L(G,H_2)\odot G$ may be identified with a unique element in $L(H_1,H_2)$. Obviously, this element is a. On the other hand, the mapping $A\mapsto(A\odot\mathrm{id})$ is injective on the whole of $L^r(L(G,H_1),L(G,H_2))$. (Suppose A ≠ 0. Then there exists $L\in L(G,H_1)$ such that AL ≠ 0, i.e. $ALg = AL\odot g = (A\odot\mathrm{id})(L\odot g)\neq 0$ for a suitable g ∈ G. Hence, $A\odot\mathrm{id}\neq 0$.) It follows that $a\mapsto\tilde a$ is an isomorphism and $A\mapsto(A\odot\mathrm{id})$ its inverse. The last statement of the theorem is obvious.

Now we consider H ⊗ G rather than H itself. Notice that the module L(G, H ⊗ G) contains H ⊗ B as a submodule, if we identify an element h ⊗ b in the latter with the mapping g ↦ h ⊗ bg in the former.

Theorem A.3. Any mapping $A\in L^r(H_1\otimes B, L(G,H_2))$ extends uniquely to an element of $L^r(L(G,H_1\otimes G),L(G,H_2))$.

Proof. Clearly, also $(H_1\otimes B)\odot G = H_1\otimes G$. Therefore, $(A\odot\mathrm{id})$ determines a unique element a of $L(H_1\otimes G,H_2)$. Then $\tilde a$ is an extension of A to $L(G,H_1\otimes G)$. Obviously, this extension is unique.

Remark A.4. The result can be generalized easily to the case when G is a right C-module. In this case B is a subalgebra of $L^r(G)$.
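A finite-dimensional sanity check of Lemma A.1 may be helpful. The sketch below assumes the concrete choices $G = \mathbb{C}^n$, $H = \mathbb{C}^m$, $B = L(G) = M_n(\mathbb{C})$, $g_0 = e_1$ (the first standard basis vector) and $P_0 = e_1e_1^*$; these choices are illustrative and not taken from the paper. Here ⊙ denotes the tensor product over B.

```latex
% Illustrative special case of Lemma A.1 (choices are assumptions):
% G = C^n, H = C^m, B = M_n(C), g_0 = e_1, P_0 = e_1 e_1^*.
% e_1 is cyclic for B, and P_0 is a rank-one projection with P_0 e_1 = e_1.
\[
  L(G,H)\odot G \;=\; M_{m\times n}(\mathbb{C})\odot_{M_n(\mathbb{C})}\mathbb{C}^n
  \;\longrightarrow\; \mathbb{C}^m , \qquad L\odot g\longmapsto Lg .
\]
% Every elementary tensor reduces to one with second factor e_1:
% write g = \gamma e_1 with \gamma \in M_n(C); then
\[
  L\odot g \;=\; L\odot\gamma e_1 \;=\; L\gamma\odot e_1
  \;=\; L\gamma\odot P_0e_1 \;=\; L\gamma P_0\odot e_1 .
\]
% The matrix L\gamma P_0 has first column L\gamma e_1 = Lg and zeros elsewhere,
% so L \odot g is determined by the vector Lg \in C^m alone.
```

Consequently the balanced tensor product has dimension $m = \dim H$, which agrees with the isomorphism asserted by the lemma.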


In the sequel, we assume G, H, $H_1$, $H_2$ to be pre-Hilbert spaces and $B = L^a(G)$. We may equip $L^a(G,H)$ with an $L^a(G)$-valued inner product by setting $\langle L,M\rangle = L^*M$. If $a\in L^a(H_1,H_2)$, then $\tilde a$ maps $L^a(G,H_1)$ into $L^a(G,H_2)$. Moreover, $\widetilde{a^*}$ is the unique adjoint mapping in $L^r(L^a(G,H_2),L^a(G,H_1))$, fulfilling $\langle L,\tilde aM\rangle = \langle\widetilde{a^*}L,M\rangle$. On the other hand, if we denote by $L^a(L^a(G,H_1),L^a(G,H_2))$ the set of all elements $A\in L^r(L^a(G,H_1),L^a(G,H_2))$ which fulfill this adjointability condition, then $(A\odot\mathrm{id})$ is an element of $L^a(H_1,H_2)$ for all $A\in L^a(L^a(G,H_1),L^a(G,H_2))$. We easily find the specializations of our foregoing theorems to adjointable mappings.

Theorem A.5. The mapping $a\mapsto\tilde a\restriction L^a(G,H_1)$ is an isomorphism $L^a(H_1,H_2)\to L^a(L^a(G,H_1),L^a(G,H_2))$. Moreover, for $H_1 = H_2$ it is a ∗-algebra isomorphism. An element $A\in L^a(H_1\otimes L^a(G),L^a(G,H_2))$ extends uniquely to an element of $L^a(L^a(G,H_1\otimes G),L^a(G,H_2))$.

Corollary A.6. Suppose $\tilde H_1$ is a pre-Hilbert space which contains $H_1\otimes G$ as a dense subspace. If an element $A\in L^a(H_1\otimes L^a(G),L^a(G,H_2))$ extends to $L^a(L^a(G,\tilde H_1),L^a(G,H_2))$, then this extension is unique.

Proof. If $A'$ is such an extension, then $(A'\odot\mathrm{id})$ is an extension of $(A\odot\mathrm{id})$ to an element of $L^a(\tilde H_1,H_2)$. Since $(A\odot\mathrm{id})$ is closable, this extension is unique.

Let e ∈ H. We observe that $\eta = |e\rangle\otimes\mathrm{id}$ is an element of $L^a(G,H\otimes G)$ with adjoint $\eta^* = \langle e|\otimes\mathrm{id}$. Define the conditional expectation $L^a(H\otimes G)\to L^a(G)$ by setting $a\mapsto\eta^*a\eta$.

Proposition A.7. The conditional expectation may be represented as the inner product

\[ a\longmapsto\langle\eta,\tilde a\eta\rangle . \tag{A.1} \]

In the sense of Remark 3.4 $L^a(G,H\otimes G)$ is a pre-Hilbert module over the ∗-algebra $L^a(G)$ with the separating set S0 consisting of all states which may be written as matrix elements. Then $L^a(G,H)$, η is the GNS-representation of the conditional expectation defined by Eq. (A.1).

We mention an interesting application of the above methods to self-dual Hilbert modules. A Hilbert B-module E is self-dual, if the analogue of the Riesz-Fréchet theorem holds, i.e. if any element $f\in B^r(E,B)$ may be represented (uniquely) as an inner product ⟨x, •⟩ (x ∈ E). An application of Theorem A.5 in the case $H_1 = H$ and $H_2 = G$ yields $L^a(L^a(G,H),L^a(G))\cong L^a(H,G)$. In other words, any element in $L^a(L^a(G,H),L^a(G))$ is of the form ⟨L, •⟩ for a unique element $L\in L^a(G,H)$. Let us restrict Theorems A.2 and A.5 further to bounded mappings between Hilbert spaces. (The isomorphism is, of course, an isometry.) Recognizing that such mappings are adjointable, we obtain

\[ B^r(B(G,H_1),B(G,H_2))\cong B(H_1,H_2)\cong B^a(B(G,H_1),B(G,H_2)). \]

It is easy to check that the pre-Hilbert module norm and the operator norm on E = B(G, H) coincide. Therefore E is a self-dual Hilbert module. We recover for our special


case Paschke’s result [12] that bounded right linear mappings between self-dual Hilbert modules are adjointable. Moreover, a right linear mapping A defined on a submodule F of $E_1$ into $E_2$ allows an extension to $E_1$, if and only if the sesquilinear form ⟨•, A•⟩ on $E_2\times F$ is bounded. Now we replace H by H⊗G. Then E contains $H_{B(G)} = H\otimes B(G)$ and its completion $\overline{H_{B(G)}}$. Clearly, E is the smallest self-dual Hilbert module which contains $H_{B(G)}$. Now assume that H is infinite-dimensional and separable. By the absorption theorem (see e.g. [8]) any countably generated Hilbert B(G)-module F is contained in $H_{B(G)}$ as a complemented submodule. The projection onto this submodule is bounded, hence, extends to a projection on E. The image of E under this projection is the smallest self-dual Hilbert module which contains F. Since $B^r(E)\cong B(H\otimes G)$, we are in a position to translate a couple of results from the theory of operators on Hilbert spaces, e.g. the spectral theorem, to bounded operators on countably generated Hilbert B(G)-modules. We will come back to this possibility in more detail in the note [19] to be published later.

Note added: In the meantime (see [19]) we know that an arbitrary centered Hilbert B-B-module E is contained in $H\otimes^{s}B$ for a suitable Hilbert space H. If E is strongly complete, then it is complemented in $H\otimes^{s}B$. More generally, if a Hilbert B-module E is generated by elements x for which ⟨x, x⟩ is an element of the center of B, then E has a left multiplication which turns it into a centered Hilbert B-B-module. This left multiplication need not be unique. Also an arbitrary Hilbert B-module E may be embedded into $H\otimes^{s}B$. However, the left multiplication induced from $H\otimes^{s}B$ need not leave E invariant. Finally, there are two-sided Hilbert modules which cannot be embedded into $H\otimes^{s}B$ preserving also the left multiplication. We call such modules essentially non-centered.

Acknowledgement. The author wants to express his gratitude to L. Accardi for the possibility to come to the “Centro Vito Volterra”. Also the financial support from the “Deutsche Forschungsgemeinschaft” is acknowledged gratefully. It would not have been possible to enter the deep paper [1] without learning its spirit in countless discussions directly from the authors. Several mistakes from the first version have been corrected. For a couple of remarks added to the revised version the author owes thanks to M. Schürmann, R. Speicher and, in particular, to V. Liebscher. Many valuable remarks by two unknown referees were included in the second revision.

References

1. Accardi, L., Lu, Y.G.: The Wigner semi-circle law in quantum electro dynamics. Commun. Math. Phys. 180, 605–632 (1996)
2. Accardi, L., Lu, Y.G.: Wiener noise versus Wigner noise in quantum electrodynamics. Accardi, L. (ed.), Quantum Probability & Related Topics VIII, Singapore, New Jersey, London, Hong Kong, World Scientific, 1993
3. Accardi, L., Lu, Y.G., Volovich, I.V.: The QED interacting free Fock module. Preprint, Rome, in preparation
4. Gough, J.: On the emergence of a free noise limit from quantum field theory. Preprint, Rome, 1996
5. Hudson, R.L., Parthasarathy, K.R.: Quantum Ito’s formula and stochastic evolutions. Commun. Math. Phys. 93, 301–323 (1984)
6. Kümmerer, B., Maassen, H.: The essentially commutative dilations of dynamical semigroups on Mn. Commun. Math. Phys. 109, 1–22 (1987)
7. Kümmerer, B., Speicher, R.: Stochastic integration on the Cuntz algebra O∞. J. Funct. Anal. 103, 372–408 (1992)
8. Lance, E.C.: Hilbert C∗-modules. Cambridge: Cambridge University Press, 1995
9. Lu, Y.G.: Quantum stochastic calculus on Hilbert modules. Submitted to Math. Z.
10. Lu, Y.G.: Quantum Poisson processes on Hilbert modules. Submitted to Ann. I.H.P. Prob. Stat.
11. Murphy, G.J.: C∗-Algebras and operator theory. Academic Press, 1990
12. Paschke, W.L.: Inner product modules over B∗-algebras. Trans. Amer. Math. Soc. 182, 443–468 (1973)
13. Petz, D.: An invitation to the algebra of canonical commutation relations. Leuven: Leuven University Press, 1990
14. Pimsner, M.V.: A class of C∗-algebras generalizing both Cuntz-Krieger algebras and crossed products by Z. Preprint, Pennsylvania 1993, to appear in: Fields Institute Communications (D.V. Voiculescu, ed.), Memoirs of the American Mathematical Society
15. Schürmann, M.: White noise on bialgebras. (Lect. Notes Math., vol. 1544), Berlin–Heidelberg–New York: Springer, 1993
16. Shlyakhtenko, D.: Random gaussian band matrices and freeness with amalgamation. International Mathematics Research Notices 20, 1013–1025 (1996)
17. Skeide, M.: Infinitesimal generators on SUq(2) in the classical and anti-classical limit. Preprint, Cottbus, 1997
18. Skeide, M.: A note on Bose Z-independent random variables fulfilling q-commutation relations. Preprint, Rome, 1996
19. Skeide, M.: Generalized matrix C∗-algebras and representations of Hilbert modules. Preprint, Cottbus, 1997
20. Speicher, R.: Combinatorial theory of the free product with amalgamation and operator-valued free probability theory. Habilitation, Heidelberg, 1994, to appear in Memoirs of the American Mathematical Society
21. Voiculescu, D.: Dual algebraic structures on operator algebras related to free products. J. Operator Theory 17, 85–98 (1987)
22. Voiculescu, D.: Operations on certain non-commutative operator-valued random variables. Preprint, Berkeley 1992
23. Yosida, K.: Functional analysis. Springer, Berlin Heidelberg New York 1980

Communicated by H. Araki

Commun. Math. Phys. 192, 605 – 629 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Drinfeld–Sokolov Reduction for Difference Operators and Deformations of W-Algebras I. The Case of Virasoro Algebra

E. Frenkel1,⋆, N. Reshetikhin2, M.A. Semenov-Tian-Shansky3,4

1 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
2 Department of Mathematics, University of California, Berkeley, CA 94720, USA
3 Université de Bourgogne, Dijon, France
4 Steklov Mathematical Institute, St. Petersburg, Russia

Received: 20 April 1997 / Accepted: 22 July 1997

Abstract: We propose a q-difference version of the Drinfeld–Sokolov reduction scheme, which gives us q-deformations of the classical W-algebras by reduction from Poisson–Lie loop groups. We consider in detail the case of SL2. The nontrivial consistency conditions fix the choice of the classical r-matrix defining the Poisson–Lie structure on the loop group LSL2, and this leads to a new elliptic classical r-matrix. The reduced Poisson algebra coincides with the deformation of the classical Virasoro algebra previously defined in [19]. We also consider a discrete analogue of this Poisson algebra. In the second part [31] the construction is generalized to the case of an arbitrary semisimple Lie algebra.

⋆ Packard Fellow

1. Introduction

It is well-known that the space of ordinary differential operators of the form $\partial^n + u_1\partial^{n-2} + \ldots + u_{n-1}$ has a remarkable Poisson structure, often called the (second) Adler–Gelfand–Dickey bracket [1, 12]. Drinfeld–Sokolov reduction [11] gives a natural realization of this Poisson structure via the hamiltonian reduction of the dual space to the affine Kac–Moody algebra $\widehat{sl}_n$. Drinfeld and Sokolov [11] have applied an analogous reduction procedure to the dual space of the affinization $\hat{g}$ of an arbitrary semisimple Lie algebra g. The Poisson algebra W(g) of functionals on the corresponding reduced space is called the classical W-algebra. Thus, one can associate a classical W-algebra to an arbitrary semisimple Lie algebra g. In particular, the classical W-algebra associated to sl2 is nothing but the classical Virasoro algebra, i.e., the Poisson algebra of functionals on the dual space to the Virasoro algebra (see, e.g., [19]). It is interesting that W(g) admits another description as the center of the universal enveloping algebra of an affine algebra. More precisely, let $Z(\hat{g})_{-h^\vee}$ be the center of a


completion of the universal enveloping algebra $U(\hat{g})_{-h^\vee}$ at the critical level $k = -h^\vee$ (minus the dual Coxeter number). This center has a canonical Poisson structure. It was conjectured by V. Drinfeld and proved by B. Feigin and E. Frenkel [14, 18] that as a Poisson algebra $Z(\hat{g})_{-h^\vee}$ is isomorphic to the classical W-algebra $W({}^Lg)$ associated with the Langlands dual Lie algebra ${}^Lg$ of g. In [19] two of the authors used this second realization of W-algebras to obtain their q-deformations. For instance, the q-deformation $W_q(sl_n)$ of $W(sl_n)$ was defined as the center $Z_q(\widehat{sl}_n)$ of a completion of the quantized universal enveloping algebra $U_q(\widehat{sl}_n)_{-h^\vee}$. The Poisson structure on $Z_q(\widehat{sl}_n)$ was explicitly described in [19] using results of [26]. It was shown that the underlying Poisson manifold of $Z_q(\widehat{sl}_n) = W_q(sl_n)$ is the space of q-difference operators of the form $D_q^n + t_1D_q^{n-1} + \ldots + t_{n-1}D_q + 1$. Furthermore, in [19] a q-deformation of the Miura transformation, i.e., a homomorphism from $W_q(sl_n)$ to a Heisenberg-Poisson algebra, was defined. The construction [19] of $W_q(sl_n)$ was followed by further developments: it was quantized [32, 15, 4] and the quantum algebra was used in the study of lattice models [25, 3]; the Yangian analogue of $W_q(sl_2)$ was considered in [8]; q-deformations of the generalized KdV hierarchies were introduced [17]. In this paper we first formulate the results of [19] in terms of first order q-difference operators and q-gauge action. This naturally leads us to a generalization of the Drinfeld–Sokolov scheme to the setting of q-difference operators. The initial Poisson manifold is the loop group $LSL_n$ of $SL_n$, or more generally, the loop group of a simply-connected simple Lie group G. Much of the needed Poisson formalism has already been developed by one of the authors in [29, 30]. Results of these works allow us to define a Poisson structure on the loop group, with respect to which the q-gauge action is Poisson. 
We then have to perform a reduction of this Poisson manifold with respect to the q-gauge action of the loop group LN of the unipotent subgroup N of G. At this point we encounter a new kind of anomaly in the Poisson bracket relations, unfamiliar from the linear (i.e., undeformed) situation. To describe it in physical terms, recall that the reduction procedure consists of two steps: (1) imposing the constraints and (2) passing to the quotient by the gauge group. An important point in the ordinary Drinfeld–Sokolov reduction is that these constraints are of first class, according to Dirac, i.e., their Poisson bracket vanishes on the constraint surface. In the q-difference case we have to choose carefully the classical r-matrix defining the initial Poisson structure on the loop group so as to make all constraints first class. If we use the standard r-matrix, some of the constraints are of second class, and so we have to modify the r-matrix. In this paper we do that in the case of SL2. We show that there is essentially a unique classical r-matrix compatible with the q-difference Drinfeld–Sokolov scheme. To the best of our knowledge, this classical r-matrix is new; it yields an elliptic deformation of the Lie bialgebra structure on the loop algebra of sl2 associated with the Drinfeld “new” realization of quantized affine algebras [10, 22]. The result of the corresponding Drinfeld–Sokolov reduction is the q-deformation of the classical Virasoro algebra defined in [19]. We also construct a finite difference version of the Drinfeld–Sokolov reduction in the case of SL2. This construction gives us a discrete version of the (classical) Virasoro algebra. We explain in detail the connection between our discrete Virasoro algebra and the lattice Virasoro algebra of Faddeev–Takhtajan–Volkov [34–36, 13]. We hope that our results will help to clarify further the meaning of the discrete Virasoro algebra and its relation to various integrable models.


The construction presented here can be generalized to the case of an arbitrary simply-connected simple Lie group. This is done in the second part of the paper [31] written by A. Sevostyanov and one of us. The paper is arranged as follows. In Sect. 2 we recall the relevant facts of [11] and [19]. In Sect. 3 we interpret the results of [19] from the point of view of q-gauge transformations. Section 4 reviews some background material on Poisson structures on Lie groups following [29, 30]. In Sect. 5 we apply the results of Sect. 4 to the q-deformation of the Drinfeld–Sokolov reduction in the case of SL2. In Sect. 6 we discuss the finite difference analogue of this reduction and compare its results with the Faddeev–Takhtajan–Volkov algebra.

2. Preliminaries

2.1. The differential Drinfeld–Sokolov reduction in the case of $sl_n$. Let $M_n$ be the manifold of differential operators of the form

\[ L = \partial^n + u_1(s)\partial^{n-2} + \ldots + u_{n-2}(s)\partial + u_{n-1}(s), \tag{2.1} \]

where $u_i(s)\in\mathbb{C}((s))$. Adler [1] and Gelfand-Dickey [12] have defined a remarkable two-parameter family of Poisson structures on $M_n$, with respect to which the corresponding KdV hierarchy is hamiltonian. In this paper we will only consider one of them, the so-called second bracket. There is a simple realization of this structure in terms of the Drinfeld–Sokolov reduction [11], Sect. 6.5. Let us briefly recall this realization.

Consider the affine Kac-Moody algebra $\widehat{sl}_n$ associated to $sl_n$; this is the central extension

\[ 0\to\mathbb{C}K\to\widehat{sl}_n\to Lsl_n\to 0, \]

see [20]. Let $M_n$ be the hyperplane in the dual space to $\widehat{sl}_n$, which consists of linear functionals taking value 1 on K. Using the differential dt and the bilinear form tr AB on $sl_n$, we identify $M_n$ with the manifold of first order differential operators

\[ \partial_s + A(s),\qquad A(s)\in Lsl_n . \]

The coadjoint action of the Lie group $\widehat{SL}_n$ on $\widehat{sl}_n^{\,*}$ factors through the loop group $LSL_n$ and preserves the hyperplane $M_n$. The corresponding action of $g(s)\in LSL_n$ on $M_n$ is given by

\[ g(s)\cdot(\partial_s + A(s)) = g(s)(\partial_s + A(s))g(s)^{-1}, \tag{2.2} \]

or

\[ A(s)\mapsto g(s)A(s)g(s)^{-1} - \partial_sg(s)\cdot g(s)^{-1}. \]

Consider now the submanifold MnJ of Mn which consists of operators ∂s + A(s), where A(s) is a traceless matrix of the form   ∗ ∗ ∗ ... ∗ ∗ −1 ∗ ∗ . . . ∗ ∗    0 −1 ∗ . . . ∗ ∗ (2.3) . . . . . . . . . . . . . . . . . .    0 0 0 . . . ∗ ∗ 0 0 0 . . . −1 ∗


E. Frenkel, N. Reshetikhin, M.A. Semenov-Tian-Shansky

To each element $\mathcal{L}$ of $M_n^J$ one can naturally attach an $n$th order scalar differential operator as follows. Consider the equation $\mathcal{L} \cdot \Psi = 0$, where

$$\Psi = \begin{pmatrix} \Psi_n \\ \Psi_{n-1} \\ \ldots \\ \Psi_1 \end{pmatrix}.$$

Due to the special form (2.3) of $\mathcal{L}$, this equation is equivalent to an $n$th order differential equation $L \cdot \Psi_1 = 0$, where $L$ is of the form (2.1). Thus, we obtain a map $\pi : M_n^J \to \mathcal{M}_n$ sending $\mathcal{L}$ to $L$.

Let $N$ be the subgroup of $SL_n$ consisting of the upper triangular unipotent matrices, and $LN$ be its loop group. If $g \in LN$ and $\Psi$ is a solution of $\mathcal{L} \cdot \Psi = 0$, then $\Psi' = g\Psi$ is a solution of $\mathcal{L}' \cdot \Psi' = 0$, where $\mathcal{L}' = g\mathcal{L}g^{-1}$. But $\Psi_1$ does not change under the action of $LN$. Therefore $\pi(\mathcal{L}') = \pi(\mathcal{L})$, and we see that $\pi$ factors through the quotient of $M_n^J$ by the action of $LN$. The following proposition describes this quotient.

Proposition 1 ([11], Proposition 3.1). The action of $LN$ on $M_n^J$ is free, and each orbit contains a unique operator of the form

$$\partial_s + \begin{pmatrix} 0 & u_1 & u_2 & \ldots & u_{n-2} & u_{n-1} \\ -1 & 0 & 0 & \ldots & 0 & 0 \\ 0 & -1 & 0 & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 0 & \ldots & -1 & 0 \end{pmatrix}. \tag{2.4}$$

But for $\mathcal{L}$ of the form (2.4), $\pi(\mathcal{L})$ is equal to the operator $L$ given by formula (2.1). Thus, we have identified the map $\pi$ with the quotient of $M_n^J$ by $LN$ and identified $\mathcal{M}_n$ with $M_n^J/LN$.

The quotient $M_n^J/LN$ can actually be interpreted as the result of hamiltonian reduction. Denote by $\mathfrak{n}_+$ (resp., $\mathfrak{n}_-$) the upper (resp., lower) nilpotent subalgebra of $\mathfrak{sl}_n$; thus, $\mathfrak{n}_+$ is the Lie algebra of $N$. The manifold $M_n$ has a canonical Poisson structure, which is the restriction of the Lie-Poisson structure on $\widehat{\mathfrak{sl}}_n^*$ (such a structure exists on the dual space to any Lie algebra). The coadjoint action of $LN$ on $M_n$ is hamiltonian with respect to this structure. The corresponding moment map $\mu : M_n \to L\mathfrak{n}_- \simeq L\mathfrak{n}_+^*$ sends $\partial_s + A(s)$ to the lower-triangular part of $A(s)$. Consider the one-point orbit of $LN$,

$$J = \begin{pmatrix} 0 & 0 & 0 & \ldots & 0 & 0 \\ -1 & 0 & 0 & \ldots & 0 & 0 \\ 0 & -1 & 0 & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 0 & \ldots & -1 & 0 \end{pmatrix}.$$

Then $M_n^J = \mu^{-1}(J)$. Hence $\mathcal{M}_n$ is the result of hamiltonian reduction of $M_n$ by $LN$ with respect to the one-point orbit $J$. The Lie-Poisson structure on $M_n$ gives rise to a canonical Poisson structure on $\mathcal{M}_n$, which coincides with the second Adler-Gelfand-Dickey bracket, see [11], Sect. 6.5. The Poisson algebra of local functionals on $\mathcal{M}_n$ is called the classical W-algebra associated to $\mathfrak{sl}_n$, and is denoted by $\mathcal{W}(\mathfrak{sl}_n)$.


Remark 1. For $\alpha \in \mathbb{C}$, let $M_{\alpha,n}$ be the hyperplane in the dual space to $\widehat{\mathfrak{sl}}_n$, which consists of linear functionals on $\widehat{\mathfrak{sl}}_n$ taking the value $\alpha$ on $K$. In the same way as above (for $\alpha = 1$) we identify $M_{\alpha,n}$ with the space of first order differential operators

$$\alpha\partial_s + A(s), \qquad A(s) \in L\mathfrak{sl}_n.$$

The coadjoint action is given by the formula

$$A(s) \mapsto g(s)A(s)g(s)^{-1} - \alpha\,\partial_s g(s) \cdot g(s)^{-1}.$$

The straightforward generalization of Proposition 1 is true for any $\alpha \in \mathbb{C}$. In particular, for $\alpha = 0$ we obtain a description of the orbits in $M_n^J$ under the adjoint action of $LN$. This result is due to B. Kostant [23].

Drinfeld and Sokolov [11] gave a generalization of Proposition 1 when $\mathfrak{sl}_n$ is replaced by an arbitrary semisimple Lie algebra $\mathfrak{g}$. The special case of their result, corresponding to $\alpha = 0$, is also due to Kostant [23].

The Drinfeld–Sokolov reduction can be summarized by the following diagram:

$$\begin{array}{ccc} M_n^J & \longrightarrow & M_n \\ \downarrow & & \downarrow \\ \mathcal{M}_n = M_n^J/LN & \longrightarrow & M_n/LN \end{array}$$

There are three essential properties of the Lie-Poisson structure on $M_n$ that make the reduction work:

(i) The coadjoint action of $LSL_n$ on $M_n$ is hamiltonian with respect to this structure.

(ii) The subgroup $LN$ of $LSL_n$ is admissible in the sense that the space $S$ of $LN$-invariant functionals on $M_n$ is a Poisson subalgebra of the space of all functionals on $M_n$.

(iii) Denote by $\mu_{ij}$ the function on $M_n$, whose value at $\partial + A \in M_n$ equals the $(i, j)$ entry of $A$. The ideal in $S$ generated by $\mu_{ij} + \delta_{i-1,j}$, $i > j$, is a Poisson ideal.

We will generalize this picture to the q-difference case.

2.2. The Miura transformation. Let $F_n$ be the manifold of differential operators of the form

$$\partial_s + \begin{pmatrix} v_1 & 0 & 0 & \ldots & 0 & 0 \\ -1 & v_2 & 0 & \ldots & 0 & 0 \\ 0 & -1 & v_3 & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & v_{n-1} & 0 \\ 0 & 0 & 0 & \ldots & -1 & v_n \end{pmatrix}, \tag{2.5}$$

where $\sum_{i=1}^n v_i = 0$. We have a map $m : F_n \to \mathcal{M}_n$, which is the composition of the embedding $F_n \to M_n^J$ and the projection $\pi : M_n^J \to \mathcal{M}_n$. Using the definition of $\pi$ above, $m$ can be described explicitly as follows: the image of the operator (2.5) under $m$ is the $n$th order differential operator

$$\partial_s^n + u_1(s)\partial_s^{n-2} + \ldots + u_{n-1}(s) = (\partial_s + v_1(s)) \ldots (\partial_s + v_n(s)).$$

The map $m$ is called the Miura transformation.

We want to describe the Poisson structure on $F_n$ with respect to which the Miura transformation is Poisson. To this end, let us consider the restriction of the gauge action (2.2) to the opposite triangular subgroup $LN_-$; let $\bar{\mu} : M_n \to L\mathfrak{n}_+ \simeq L\mathfrak{n}_-^*$ be the corresponding moment map. The manifold $F_n$ is the intersection of two level surfaces, $F_n = \mu^{-1}(J) \cap \bar{\mu}^{-1}(0)$. It is easy to see that it gives a local cross-section for both actions (in other words, the orbits of $LN$ and $LN_-$ are transversal to $F_n$). Hence $F_n$ simultaneously provides a local model for the reduced spaces $\mathcal{M}_n = \mu^{-1}(J)/LN$ and $\bar{\mu}^{-1}(0)/LN_-$. The Poisson bracket on $F_n$ that we need to define is the so-called Dirac bracket (see, e.g., [16]), where we may regard the matrix coefficients of $\bar{\mu}$ as subsidiary conditions, which fix the local gauge. The computation of the Dirac bracket for the diagonal matrix coefficients $v_i$ is very simple, since their Poisson brackets with the matrix coefficients of $\bar{\mu}$ all vanish on $F_n$. The only correction arises due to the constraint $\sum_{i=1}^n v_i = 0$.

Denote by $v_{i,m}$ the linear functional on $F_n$, whose value on the operator (2.5) is the $m$th Fourier coefficient of $v_i(s)$. We obtain the following formula for the Dirac bracket on $F_n$:

$$\{v_{i,m}, v_{i,k}\} = \frac{n-1}{n}\, m\,\delta_{m,-k}, \qquad \{v_{i,m}, v_{j,k}\} = -\frac{1}{n}\, m\,\delta_{m,-k}, \quad i < j.$$

Since $F_n$ and $\mathcal{M}_n$ both are models of the same reduced space, we immediately obtain:

Proposition 2 ([11], Proposition 3.26). With respect to this Poisson structure the map $m : F_n \to \mathcal{M}_n$ is Poisson.

2.3. The q-deformations of $\mathcal{W}(\mathfrak{sl}_n)$ and Miura transformation. In this section we summarize relevant results of [19]. Let $q$ be a non-zero complex number, such that $|q| < 1$. Consider the space $\mathcal{M}_{n,q}$ of q-difference operators of the form

$$L = D^n + t_1(s)D^{n-1} + \ldots + t_{n-1}(s)D + 1, \tag{2.6}$$

where $t_i(s) \in \mathbb{C}((s))$ for each $i = 1, \ldots, n-1$, and $[D \cdot f](s) = f(sq)$.

Denote by $t_{i,m}$ the functional on $\mathcal{M}_{n,q}$, whose value at $L$ is the $m$th Fourier coefficient of $t_i(s)$. Let $R_{n,q}$ be the completion of the ring of polynomials in $t_{i,m}$, $i = 1, \ldots, N-1$; $m \in \mathbb{Z}$, which consists of finite linear combinations of expressions of the form

$$\sum_{m_1 + \ldots + m_k = M} c(m_1, \ldots, m_k) \cdot t_{i_1,m_1} \ldots t_{i_k,m_k}, \tag{2.7}$$

where $c(m_1, \ldots, m_k) \in \mathbb{C}$. Given an operator of the form (2.6), we can substitute the coefficients $t_{i,m}$ into an expression like (2.7) and get a number. Therefore elements of $R_{n,q}$ define functionals on the space $\mathcal{M}_{n,q}$.

In order to define the Poisson structure on $\mathcal{M}_{n,q}$, it suffices to specify the Poisson brackets between the generators $t_{i,m}$. Let $T_i(z)$ be the generating series of the functionals $t_{i,m}$:

$$T_i(z) = \sum_{m \in \mathbb{Z}} t_{i,m} z^{-m}.$$

We define the Poisson brackets between $t_{i,m}$'s by the formulas [19]

$$\begin{aligned} \{T_i(z), T_j(w)\} = {} & \sum_{m \in \mathbb{Z}} \left(\frac{w}{z}\right)^m \frac{(1 - q^{im})(1 - q^{m(N-j)})}{1 - q^{mN}}\, T_i(z)T_j(w) \\ & + \sum_{r=1}^{\min(i,N-j)} \delta\!\left(\frac{wq^r}{z}\right) T_{i-r}(w)T_{j+r}(z) \\ & - \sum_{r=1}^{\min(i,N-j)} \delta\!\left(\frac{w}{zq^{j-i+r}}\right) T_{i-r}(z)T_{j+r}(w), \qquad i \leq j. \end{aligned} \tag{2.8}$$

In these formulas $\delta(x) = \sum_{m \in \mathbb{Z}} x^m$, and we use the convention that $t_0(z) \equiv 1$.

Remark 2. Note the difference between $t_i(s)$ and $T_i(z)$. The former is a Laurent power series, whose coefficients are numbers. The latter is a power series infinite in both directions, whose coefficients are functionals on $\mathcal{M}_{n,q}$. Thus, $T_i(z)$ is just the generating function for the functionals $t_{i,m}$. We use these generating functions merely to simplify our formulas for the Poisson brackets (so that we do not have to write a Poisson bracket between each individual pair $t_{i,m}$ and $t_{j,k}$).

Remark 3. The parameter $q$ in formula (2.8) corresponds to $q^{-2}$ in the notation of [19].

Now consider the space $F_{n,q}$ of $n$-tuples of q-difference operators

$$(D + \lambda_1(s), \ldots, D + \lambda_n(s)). \tag{2.9}$$

Denote by $\lambda_{i,m}$ the functional on $F_{n,q}$, whose value is the $m$th Fourier coefficient of $\lambda_i(s)$. We will denote by $\Lambda_i(z)$ the generating series of the functionals $\lambda_{i,m}$:

$$\Lambda_i(z) = \sum_{m \in \mathbb{Z}} \lambda_{i,m} z^{-m}.$$

We define a Poisson structure on $F_{n,q}$ by the formulas [19]:

$$\{\Lambda_i(z), \Lambda_i(w)\} = \sum_{m \in \mathbb{Z}} \left(\frac{w}{z}\right)^m \frac{(1 - q^m)(1 - q^{m(N-1)})}{1 - q^{mN}}\, \Lambda_i(z)\Lambda_i(w), \tag{2.10}$$

$$\{\Lambda_i(z), \Lambda_j(w)\} = -\sum_{m \in \mathbb{Z}} \left(\frac{wq^{N-1}}{z}\right)^m \frac{(1 - q^m)^2}{1 - q^{mN}}\, \Lambda_i(z)\Lambda_j(w), \qquad i < j. \tag{2.11}$$

Now we define the q-deformation of the Miura transformation as the map $m_q : F_{n,q} \to \mathcal{M}_{n,q}$, which sends the $n$-tuple (2.9) to

$$L = (D + \lambda_1(s))(D + \lambda_2(sq^{-1})) \ldots (D + \lambda_n(sq^{-n+1})), \tag{2.12}$$

i.e.

$$t_i(s) = \sum_{j_1 < \ldots < j_i} \lambda_{j_1}(s)\lambda_{j_2}(sq^{-1}) \ldots \lambda_{j_i}(sq^{-i+1}). \tag{2.13}$$

Proposition 3 ([19]). The map $m_q$ is Poisson.

2.4. q-deformation of the Virasoro algebra. Here we specialize the formulas of the previous subsection to the case of $\mathfrak{sl}_2$ (we will omit the index 1 in these formulas). We have the following Poisson bracket on $T(z)$:

$$\{T(z), T(w)\} = \sum_{m \in \mathbb{Z}} \left(\frac{w}{z}\right)^m \frac{1 - q^m}{1 + q^m}\, T(z)T(w) + \delta\!\left(\frac{wq}{z}\right) - \delta\!\left(\frac{w}{zq}\right). \tag{2.14}$$

The q-deformed Miura transformation reads:

$$\Lambda(z) \mapsto T(z) = \Lambda(z) + \Lambda(zq)^{-1}. \tag{2.15}$$

The Poisson bracket on $\Lambda(z)$:

$$\{\Lambda(z), \Lambda(w)\} = \sum_{m \in \mathbb{Z}} \left(\frac{w}{z}\right)^m \frac{1 - q^m}{1 + q^m}\, \Lambda(z)\Lambda(w). \tag{2.16}$$

3. Connection with q-Gauge Transformations

In this section we present the results of [19] in the form of q-difference Drinfeld–Sokolov reduction.

3.1. Presentation via first order q-difference operators. By analogy with the differential case, it is natural to consider the manifold $M_{n,q}$ of first order difference operators $D + A(s)$, where $A(s)$ is an element of the loop group $LSL_n$ of $SL_n$. The group $LSL_n$ acts on this manifold by the q-gauge transformations

$$g(s) \cdot (D + A(s)) = g(sq)(D + A(s))g(s)^{-1}, \tag{3.1}$$

i.e. $g(s) \cdot A(s) = g(sq)A(s)g(s)^{-1}$.

Now we consider the submanifold $M_{n,q}^J \subset M_{n,q}$ which consists of operators $D + A(s)$, where $A(s)$ is of the form (2.3). It is preserved under the q-gauge action of the group $LN$.

In the same way as in the differential case, we define a map $\pi_q : M_{n,q}^J \to \mathcal{M}_{n,q}$, which sends each element of $M_{n,q}^J$ to an $n$th order q-difference operator $L$ of the form (2.6). It is clear that the map $\pi_q$ factors through the quotient of $M_{n,q}^J$ by $LN$. Now we state the q-difference analogue of Proposition 1.


Lemma 1. The action of $LN$ on $M_{n,q}^J$ is free and each orbit contains a unique operator of the form

$$D + \begin{pmatrix} t_1 & t_2 & t_3 & \ldots & t_{n-1} & 1 \\ -1 & 0 & 0 & \ldots & 0 & 0 \\ 0 & -1 & 0 & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 0 & \ldots & -1 & 0 \end{pmatrix}. \tag{3.2}$$

Proof. The proof is an exercise in elementary matrix algebra. For $\alpha = 1, \ldots, n$, denote by $M_{n,q}^{\alpha}$ the subset of matrices from $M_{n,q}^J$ satisfying the property that all entries in their rows $i = \alpha + 1, \ldots, n$ are zero except for the $(i, i-1)$ entry that is equal to $-1$. We will prove that given $A(s) \in M_{n,q}^{\alpha}$, $\alpha > 1$, there exists $g(s) \in LN$, such that $g(sq)A(s)g(s)^{-1} \in M_{n,q}^{\alpha-1}$. Since the condition is vacuous for $\alpha = n$, i.e. $M_{n,q}^n = M_{n,q}^J$, this will imply that each $LN$-orbit in $M_{n,q}^J$ contains an element of the form (3.2).

To prove the statement for a given $\alpha$, we will recursively eliminate all entries of the $\alpha$th row of $A(s)$ (except the $(\alpha, \alpha-1)$ entry), from right to left, using elementary unipotent matrices. Denote by $E_{i,j}(x)$ the upper unipotent matrix whose only non-zero entry above the diagonal is the $(i, j)$ entry equal to $x$. At the first step, we eliminate the $(\alpha, n)$ entry $A_{\alpha,n}$ of $A(s)$ by applying the q-gauge transformation (3.1) with $g(s) = E_{\alpha-1,n}(-A_{\alpha,n}(s))$. Then we obtain a new matrix $A'(s)$, which still belongs to $M_{n,q}^{\alpha}$, but whose $(\alpha, n)$ entry is equal to 0. Next, we apply the q-gauge transformation by $E_{\alpha-1,n-1}(-A'_{\alpha,n-1}(s))$ to eliminate the $(\alpha, n-1)$ entry of $A'(s)$, etc. It is clear that at each step we do not spoil the entries that have already been set to 0. The product of the elementary unipotent matrices constructed at each step gives us an element $g(s) \in LN$ with the desired property that $g(sq)A(s)g(s)^{-1} \in M_{n,q}^{\alpha-1}$.

To complete the proof, it suffices to remark that if $A(s)$ and $A'(s)$ are of the form (3.2), and $g(sq)A(s)g(s)^{-1} = A'(s)$ for some $g(s) \in LN$, then $A(s) = A'(s)$ and $g(s) = 1$.

For $\mathcal{L}$ of the form (3.2), $\pi_q(\mathcal{L})$ equals the operator $L$ given by formula (2.6). Thus, we have identified the map $\pi_q$ with the quotient of $M_{n,q}^J$ by $LN$ and $\mathcal{M}_{n,q}$ with $M_{n,q}^J/LN$.

Remark 4. In the same way as above we can prove the following more general statement. Let $R$ be a ring with an automorphism $\tau$. It gives rise to an automorphism of $SL_n(R)$ denoted by the same character. Define $M_{\tau,n}^J(R)$ as the set of elements of $SL_n(R)$ of the form (2.3). Let the group $N(R)$ act on $M_{\tau,n}^J(R)$ by the formula $g \cdot A = (\tau \cdot g)Ag^{-1}$. Then this action of $N(R)$ is free, and the quotient is isomorphic to the set $\mathcal{M}_{\tau,n}^J(R)$ of elements of $SL_n(R)$ of the form (3.2) (i.e. each orbit contains a unique element of the form (3.2)). Note that the proof is not sensitive to whether $\tau = \mathrm{Id}$ or not.

When $\tau = \mathrm{Id}$, this result is well-known. It gives the classical normal form of a linear operator. Moreover, in that case R. Steinberg has proved that the subset $\mathcal{M}_{\mathrm{Id},n}^J(K)$ of $SL_n(K)$, where $K$ is an algebraically closed field, is a cross-section of the collection of regular conjugacy classes in $SL_n(K)$ [33], Theorem 1.4. Steinberg defined an analogous cross-section for any simply-connected semisimple algebraic group [33]. His results can be viewed as group analogues of Kostant's results on semisimple Lie algebras [23] (cf. Remark 1). Steinberg's cross-section is used in the definition of the discrete Drinfeld–Sokolov reduction in the general semisimple case (see [31]).¹

¹ We are indebted to B. Kostant for drawing our attention to [33].


3.2. Deformed Miura transformation via q-gauge action. Let us attach to each element of $F_{n,q}$ the q-difference operator

$$\Lambda = D + \begin{pmatrix} \lambda_1(s) & 0 & \ldots & 0 & 0 \\ -1 & \lambda_2(sq^{-1}) & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & \lambda_{n-1}(sq^{-n+2}) & 0 \\ 0 & 0 & \ldots & -1 & \lambda_n(sq^{-n+1}) \end{pmatrix}, \tag{3.3}$$

where $\prod_{i=1}^n \lambda_i(sq^{-i+1}) = 1$.

Let $\widetilde{m}_q : F_{n,q} \to \mathcal{M}_{n,q}$ be the composition of the embedding $F_{n,q} \to M_{n,q}^J$ and $\pi_q : M_{n,q}^J \to M_{n,q}^J/LN \simeq \mathcal{M}_{n,q}$. Using the definition of $\pi_q$ above, one easily finds that for $\Lambda$ given by (3.3), $\widetilde{m}_q(\Lambda)$ is the operator (3.2), where $t_i(s)$ is given by formula (2.13). Therefore we obtain

Lemma 2. The map $\widetilde{m}_q$ coincides with the q-deformed Miura transformation $m_q$.

Remark 5. Let $G$ be a simply-connected semisimple algebraic group over $\mathbb{C}$. Let $V_i$ be the $i$th fundamental representation of $G$ (in the case $G = SL_n$, $V_i = \Lambda^i\mathbb{C}^n$), and $\chi_i : G \to \mathbb{C}$ be the corresponding character, $\chi_i(g) = \operatorname{Tr}(g, V_i)$. Define a map $p : G \to \mathbb{C}^n$ by the formula $p(g) = (\chi_1(g), \ldots, \chi_n(g))$. By construction, $p$ is constant on conjugacy classes. In the case $G = SL_n$ the map $p$ has a cross-section $r : \mathbb{C}^n \to SL_n(\mathbb{C})$:

$$(a_1, \ldots, a_n) \mapsto \begin{pmatrix} a_1 & a_2 & a_3 & \ldots & a_{n-1} & 1 \\ -1 & 0 & 0 & \ldots & 0 & 0 \\ 0 & -1 & 0 & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & 0 & 0 \\ 0 & 0 & 0 & \ldots & -1 & 0 \end{pmatrix}.$$

The composition $r \circ p$, restricted to $M_{n,1}^J$, coincides with the map $\pi_1$. Moreover, $\widetilde{m}_1$ can be interpreted as the restriction of $p$ to the subset of $SL_n$ consisting of matrices of the form (3.3). Hence $\widetilde{m}_1$ sends $(\lambda_1, \ldots, \lambda_n)$ to the elementary symmetric polynomials

$$t_i = \sum_{j_1 < \ldots < j_i} \lambda_{j_1}\lambda_{j_2} \ldots \lambda_{j_i},$$

which are the characters of the fundamental representations of $SL_n$. As we mentioned above, Steinberg has defined an analogue of the cross-section $r$ for an arbitrary simply-connected semisimple algebraic group [33].

Formula (2.13) expressing $t_i(z)$ in terms of the $\lambda_j(z)$'s can be thought of as a q-deformation of the character formula of the $i$th fundamental representation of $SL_n$. It is interesting that the same interpretation is also suggested by the definition of $\mathcal{W}_q(\mathfrak{sl}_n)$ as the center of a completion of the quantized universal enveloping algebra $U_q(\widehat{\mathfrak{sl}}_n)_{-h^\vee}$ [19]. Namely, $t_i(z)$ is then defined as the (q-deformed) trace of the so-called L-operator acting on $\Lambda^i\mathbb{C}^n$ considered as a representation of $U_q(\widehat{\mathfrak{sl}}_n)$, see [26, 19] (note also that $t_i(z)$ is closely connected with a transfer-matrix of the corresponding integrable spin model).

Thus, we have now represented $\mathcal{M}_{n,q}$ as the quotient of the submanifold $M_{n,q}^J$ of the manifold $M_{n,q}$ of first order q-difference operators by the action of the group $LN$ (acting by q-gauge transformations). We have also interpreted the q-deformed Miura transformation in these terms. In the next sections we discuss the Poisson structure on $M_{n,q}$, which gives rise to the Poisson structure on $\mathcal{M}_{n,q}$ given by explicit formula (2.8).
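For $n = 2$ the elimination step in the proof of Lemma 1 reduces to a single q-gauge transformation by $E_{1,2}(-A_{2,2}(s))$, and the resulting normal form can be verified directly. The following Python sketch is only an illustration under stated assumptions: the value of `q`, the sample loop `A` and the helper names are hypothetical choices, not part of the construction above. It checks that $g(sq)A(s)g(s)^{-1}$ has the shape (3.2) with $t_1(s) = A_{1,1}(s) + A_{2,2}(sq)$.

```python
from fractions import Fraction

q = Fraction(2, 5)  # illustrative value of q (an assumption)

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def A(s):
    """Hypothetical sample matrix of the form (2.3): lower-left entry -1, determinant 1."""
    a = 1 + s
    d = 3 - s * s
    b = 1 - a * d  # forces det A = a*d + b = 1
    return [[a, b], [-1, d]]

def g(s):
    """g = E_{1,2}(x) with x(s) = -A_{2,2}(s), as in the proof of Lemma 1."""
    return [[1, -A(s)[1][1]], [0, 1]]

def g_inverse(s):
    """Inverse of g(s): E_{1,2}(x)^{-1} = E_{1,2}(-x)."""
    return [[1, A(s)[1][1]], [0, 1]]

def q_gauge(s):
    """g(sq) A(s) g(s)^{-1}, the q-gauge transformation (3.1)."""
    return matmul(matmul(g(s * q), A(s)), g_inverse(s))

def normal_form(s):
    """Representative of the form (3.2) with t_1(s) = A_{1,1}(s) + A_{2,2}(sq)."""
    return [[A(s)[0][0] + A(s * q)[1][1], 1], [-1, 0]]

# The gauge-transformed matrix agrees with the normal form at sample points.
for point in (Fraction(1, 2), Fraction(-4, 3), Fraction(7)):
    assert q_gauge(point) == normal_form(point)
```

Applied to a matrix of the form (3.3), where $A_{1,1}(s) = \lambda_1(s)$ and $A_{2,2}(s) = \lambda_2(sq^{-1})$, the same computation gives $t_1(s) = \lambda_1(s) + \lambda_2(s)$, in agreement with (2.13).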

4. Poisson Structures

4.1. Overview. In view of the previous section, the following diagram is the q-difference analogue of the diagram presented at the end of Sect. 2.

$$\begin{array}{ccc} M_{n,q}^J & \longrightarrow & M_{n,q} = (LG, \eta_*^q) \\ \downarrow & & \downarrow \\ \mathcal{M}_{n,q} = M_{n,q}^J/LN & \longrightarrow & M_{n,q}/LN \end{array}$$

As in the differential case, in order to define a q-deformation of the Drinfeld–Sokolov reduction we need to find a Poisson structure $\eta_*^q$ on $M_{n,q}$ and a Poisson-Lie structure $\eta$ on $LSL_n$ satisfying the following properties:

(i) The action $LSL_n \times M_{n,q} \to M_{n,q}$ by q-gauge transformations is Poisson.

(ii) The subgroup $LN$ of $LSL_n$ is admissible in the sense that the algebra $S_q$ of $LN$-invariant functionals on $M_{n,q}$ is a Poisson subalgebra of the algebra of all functionals on $M_{n,q}$.

(iii) Denote by $\mu_{ij}$ the function on $M_{n,q}$, whose value at $D + A \in M_{n,q}$ equals the $(i, j)$ entry of $A$. The ideal in $S_q$ generated by $\mu_{ij} + \delta_{i-1,j}$, $i > j$, is a Poisson ideal.

Geometrically, the last condition means that $\mathcal{M}_{n,q}$ is a Poisson submanifold of the quotient $M_{n,q}/LN$.

For the sake of completeness, we recall the notions mentioned above. Let $M$ be a Poisson manifold, and $H$ be a Lie group, which is itself a Poisson manifold. An action of $H$ on $M$ is called Poisson if $H \times M \to M$ is a Poisson map (here we equip $H \times M$ with the product Poisson structure). In particular, if the multiplication map $H \times H \to H$ is Poisson, then $H$ is called a Poisson-Lie group.

In this section we describe the general formalism concerning problems (i)–(iii) above. Then in the next section we specialize to $M_{2,q}$ and give an explicit solution of these problems.

4.2. Lie bialgebras. Let $\mathfrak{g}$ be a Lie algebra. Recall [9] that $\mathfrak{g}$ is called a Lie bialgebra, if $\mathfrak{g}^*$ also has a Lie algebra structure, such that the dual map $\delta : \mathfrak{g} \to \Lambda^2\mathfrak{g}$ is a one-cocycle. We will consider factorizable Lie bialgebras $(\mathfrak{g}, \delta)$ satisfying the following conditions:


(1) There exists a linear map $r_+ : \mathfrak{g}^* \to \mathfrak{g}$, such that both $r_+$ and $r_- = -r_+^*$ are Lie algebra homomorphisms.

(2) The endomorphism $t = r_+ - r_-$ is $\mathfrak{g}$-equivariant and induces a linear isomorphism $\mathfrak{g}^* \to \mathfrak{g}$.

Instead of the linear operator $r_+ \in \operatorname{Hom}(\mathfrak{g}^*, \mathfrak{g})$ one often considers the corresponding element $r$ of $\mathfrak{g}^{\otimes 2}$ (or a completion of $\mathfrak{g}^{\otimes 2}$ if $\mathfrak{g}$ is infinite-dimensional). The element $r$ (or its image in the tensor square of a particular representation of $\mathfrak{g}$) is called a classical r-matrix. It satisfies the classical Yang-Baxter equation:

$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0. \tag{4.1}$$

In terms of $r$, $\delta(x) = [r, x]$, $\forall x \in \mathfrak{g}$ (here $[a \otimes b, x] = [a, x] \otimes b + a \otimes [b, x]$). The maps $r_\pm : \mathfrak{g}^* \to \mathfrak{g}$ are given by the formulas: $r_+(y) = (y \otimes \mathrm{id})(r)$, $r_-(y) = -(\mathrm{id} \otimes y)(r)$. Property (2) above means that $r + \sigma(r)$, where $\sigma(a \otimes b) = b \otimes a$, is a non-degenerate $\mathfrak{g}$-invariant symmetric bilinear form on $\mathfrak{g}^*$. Set $\mathfrak{g}_\pm = \operatorname{Im}(r_\pm)$. Property (1) above implies that $\mathfrak{g}_\pm \subset \mathfrak{g}$ is a Lie subalgebra. The following statement is essentially contained in [5] (cf. also [28]).

Lemma 3. Let $(\mathfrak{g}, \mathfrak{g}^*)$ be a factorizable Lie bialgebra. Then

(1) The subspace $\mathfrak{n}_\pm = r_\pm(\operatorname{Ker} r_\mp)$ is a Lie ideal in $\mathfrak{g}_\pm$.

(2) The map $\theta : \mathfrak{g}_+/\mathfrak{n}_+ \to \mathfrak{g}_-/\mathfrak{n}_-$ which sends the residue class of $r_+(X)$, $X \in \mathfrak{g}^*$, modulo $\mathfrak{n}_+$ to that of $r_-(X)$ modulo $\mathfrak{n}_-$ is a well-defined isomorphism of Lie algebras.

(3) Let $\mathfrak{d} = \mathfrak{g} \oplus \mathfrak{g}$ be the direct sum of two copies of $\mathfrak{g}$. The map

$$i : \mathfrak{g}^* \to \mathfrak{d}, \qquad X \mapsto (r_+(X), r_-(X))$$

is a Lie algebra embedding; its image $\mathfrak{g}^* \subset \mathfrak{d}$ is

$$\mathfrak{g}^* = \{(X_+, X_-) \in \mathfrak{g}_+ \oplus \mathfrak{g}_- \subset \mathfrak{d} \mid \overline{X}_- = \theta(\overline{X}_+)\},$$

where $\overline{Y}_\pm = Y \bmod \mathfrak{n}_\pm$.

Remark 6. The connection between our notation and that of [29] is as follows: the operator $r \in \operatorname{End}\mathfrak{g}$ of [29] coincides with the composition of $r_+ + r_-$ up to the isomorphism $t = r_+ - r_- : \mathfrak{g}^* \to \mathfrak{g}$; the bilinear form used in [29] is induced by $t$.

4.3. Poisson-Lie groups and gauge transformations. Let $(G, \eta)$ (resp., $(G^*, \eta^*)$) be a Poisson-Lie group with factorizable tangent Lie bialgebra $(\mathfrak{g}, \delta)$ (resp., $(\mathfrak{g}^*, \delta^*)$). Let $G_\pm$ and $N_\pm$ be the Lie subgroups of $G$ corresponding to the Lie subalgebras $\mathfrak{g}_\pm$ and $\mathfrak{n}_\pm$. We denote by the same symbol $\theta$ the isomorphism $G_+/N_+ \to G_-/N_-$ induced by $\theta : \mathfrak{g}_+/\mathfrak{n}_+ \to \mathfrak{g}_-/\mathfrak{n}_-$. Then the group $G^*$ is isomorphic to

$$\{(g_+, g_-) \in G_+ \times G_- \mid \theta(\overline{g}_+) = \overline{g}_-\},$$

and we have a map $i : G^* \to G$ given by $i((g_+, g_-)) = g_+(g_-)^{-1}$.

Explicitly, the Poisson bracket on $(G, \eta)$ can be written as follows:

$$\{\varphi, \psi\} = \langle r, \nabla\varphi \wedge \nabla\psi - \nabla'\varphi \wedge \nabla'\psi\rangle, \tag{4.2}$$

where for $x \in G$, $\nabla\varphi(x), \nabla'\varphi(x) \in \mathfrak{g}^*$ are defined by the formulas:

$$\langle\nabla\varphi(x), \xi\rangle = \frac{d}{dt}\,\varphi\!\left(e^{t\xi}x\right)\Big|_{t=0}, \tag{4.3}$$

$$\langle\nabla'\varphi(x), \xi\rangle = \frac{d}{dt}\,\varphi\!\left(xe^{t\xi}\right)\Big|_{t=0}, \tag{4.4}$$

for all $\xi \in \mathfrak{g}$. An analogous formula can be written for the Poisson bracket on $(G^*, \eta^*)$. In formula (4.2) we use the standard notation $a \wedge b = (a \otimes b - b \otimes a)/2$.

By definition, the action of $G$ on itself by left translations is a Poisson group action. There is another Poisson structure $\eta_*$ on $G$ which is covariant with respect to the adjoint action of $G$ on itself and such that the map $i : (G^*, \eta^*) \to (G, \eta_*)$ is Poisson. It is given by the formula

$$\{\varphi, \psi\} = \langle r, \nabla\varphi \wedge \nabla\psi + \nabla'\varphi \wedge \nabla'\psi\rangle - \langle r, \nabla'\varphi \otimes \nabla\psi - \nabla'\psi \otimes \nabla\varphi\rangle. \tag{4.5}$$

Proposition 4. (1) The map $i : G^* \to G$ is a Poisson map between the Poisson manifolds $(G^*, \eta^*)$ and $(G, \eta_*)$;

(2) The Poisson structure $\eta_*$ on $G$ is covariant with respect to the adjoint action, i.e. the map $(G, \eta) \times (G, \eta_*) \to (G, \eta_*) : (g, h) \mapsto ghg^{-1}$ is a Poisson map.

These results are proved in [29], § 3 (see also [30], § 2), using the notion of the Heisenberg double of $G$. Formula (4.5) can also be obtained directly from the explicit formulas for the Poisson structure $\eta^*$ and for the embedding $i$.

More generally, let $\tau$ be an automorphism of $G$, such that the corresponding automorphism of $\mathfrak{g}$ satisfies $(\tau \otimes \tau)(r) = r$. Define a twisted Poisson structure $\eta_*^\tau$ on $G$ by the formula

$$\{\varphi, \psi\} = \langle r, \nabla\varphi \wedge \nabla\psi + \nabla'\varphi \wedge \nabla'\psi\rangle - \langle(\tau \otimes \mathrm{id})(r), \nabla'\varphi \otimes \nabla\psi - \nabla'\psi \otimes \nabla\varphi\rangle, \tag{4.6}$$

and the twisted adjoint action of $G$ on itself by the formula $g \cdot h = \tau(g)hg^{-1}$.

Theorem 1. The Poisson structure $\eta_*^\tau$ on $G$ is covariant with respect to the twisted adjoint action, i.e. the map $(G, \eta) \times (G, \eta_*^\tau) \to (G, \eta_*^\tau) : (g, h) \mapsto \tau(g)hg^{-1}$ is a Poisson map.

This result was proved in [29], § 3 (see also [30], § 2), using the notion of the twisted Heisenberg double of $G$.

We will use Theorem 1 in two cases. In the first, the group to which the theorem is applied is the loop group of a finite-dimensional simple Lie group $G$, and $\tau$ is the automorphism $g(s) \to g(sq)$, $q \in \mathbb{C}^\times$. In the second, the group is $G^{\mathbb{Z}/N\mathbb{Z}}$, and $\tau$ is the automorphism $(\tau(g))_i \to g_{i+1}$. In the first case twisted conjugations coincide with q-gauge transformations, and in the second case they coincide with lattice gauge transformations.

4.4. Admissibility and constraints. Let $M$ be a Poisson manifold, $G$ a Poisson-Lie group and $G \times M \to M$ be a Poisson action. A subgroup $H \subset G$ is called admissible if the space $C^\infty(M)^H$ of $H$-invariant functions on $M$ is a Poisson subalgebra in the space $C^\infty(M)$ of all functions on $M$.


Proposition 5 ([29], Theorem 6). Let $(\mathfrak{g}, \mathfrak{g}^*)$ be the tangent Lie bialgebra of $G$. A connected Lie subgroup $H \subset G$ with Lie algebra $\mathfrak{h} \subset \mathfrak{g}$ is admissible if $\mathfrak{h}^\perp \subset \mathfrak{g}^*$ is a Lie subalgebra.

In particular, $G$ itself is admissible. Note that $H \subset G$ is a Poisson subgroup if and only if $\mathfrak{h}^\perp \subset \mathfrak{g}^*$ is an ideal; in that case the tangent Lie bialgebra of $H$ is $(\mathfrak{h}, \mathfrak{g}^*/\mathfrak{h}^\perp)$.

Let $H \subset G$ be an admissible subgroup, and $I$ be a Poisson ideal in $C^\infty(M)^H$, i.e. $I$ is an ideal in the ring $C^\infty(M)^H$, and $\{f, g\} \in I$ for all $f \in I$, $g \in C^\infty(M)^H$. Then $C^\infty(M)^H/I$ is a Poisson algebra.

More geometrically, the Poisson structure on $C^\infty(M)^H/I$ can be described as follows. Assume that the quotient $M/H$ exists as a smooth manifold. Then there exists a Poisson structure on $M/H$ such that the canonical projection $\pi : M \to M/H$ is a Poisson map. Hamiltonian vector fields $\xi_\varphi$, $\varphi \in \pi^*C^\infty(M/H)$, generate an integrable distribution $H_\pi$ in $TM$. The following result is straightforward.

Lemma 4. Let $V \subset M$ be a submanifold preserved by $H$. Then $V/H$ is a Poisson submanifold of $M/H$ if and only if $V$ is an integral manifold of $H_\pi$.

The integrality condition means precisely that the ideal $I$ of all $H$-invariant functions on $M$ vanishing on $V$ is a Poisson ideal in $C^\infty(M)^H$, and that $C^\infty(V/H) = C^\infty(V)^H = C^\infty(M)^H/I$. If this property holds, we will say that the Poisson structure on $M/H$ can be restricted to $V/H$.

$$\begin{array}{ccc} V & \longrightarrow & M \\ \downarrow & & \downarrow \\ V/H & \longrightarrow & M/H \end{array}$$

The Poisson structure on $V/H$ can be described as follows. Let $N_V \subset T^*M|_V$ be the conormal bundle of $V$. Clearly, $T^*V \simeq T^*M|_V/N_V$. Let $\varphi, \psi \in C^\infty(V)^H$ and $\overline{d\varphi}, \overline{d\psi} \in T^*M|_V$ be any representatives of $d\varphi, d\psi \in T^*V$. Let $P_M \in \bigwedge^2 TM$ be the Poisson tensor on $M$.

Lemma 5. We have

$$\{\varphi, \psi\} = \langle P_M, \overline{d\varphi} \otimes \overline{d\psi}\rangle; \tag{4.7}$$

in particular, the right hand side does not depend on the choice of $\overline{d\varphi}, \overline{d\psi}$.

Remark 7. In the case of Hamiltonian action (i.e. when the Poisson structure on $H$ is trivial), one can construct submanifolds $V$ satisfying the condition of Lemma 4 using the moment map. Although a similar notion of the nonabelian moment map in the context of Poisson group theory is also available [24], it is less convenient. The reason is that the nonabelian moment map is "less functorial" than the ordinary moment map. Namely, if $G \times M \to M$ is a Hamiltonian action with moment map $\mu_G : M \to \mathfrak{g}^*$, its restriction to a subgroup $H \subset G$ is also Hamiltonian with moment $\mu_H = p \circ \mu_G$ (here $p : \mathfrak{g}^* \to \mathfrak{h}^*$ is the canonical projection). If $G$ is a Poisson-Lie group, $G^*$ its dual, $G \times M \to M$ a Poisson group action with moment $\mu_G : M \to G^*$, and $H \subset G$ a Poisson subgroup, the action of $H$ still admits a moment map. But if $H \subset G$ is only admissible, then the restricted action does not usually have a moment map. This is precisely the case which is encountered in the study of the q-deformed Drinfeld–Sokolov reduction.


5. The q-Deformed Drinfeld–Sokolov Reduction in the Case of $SL_2$

In this section we apply the general results of the previous section to formulate a q-analogue of the Drinfeld–Sokolov reduction when $G = SL_2$.

5.1. Choice of r-matrix. Let $\mathfrak{g} = L\mathfrak{sl}_2$. We would like to define a factorizable Lie bialgebra structure on $\mathfrak{g}$ in such a way that the resulting Poisson-Lie structure $\eta$ on $LSL_2$ and the Poisson structure $\eta_*^q$ on $M_{2,q}$ satisfy the conditions (ii)–(iii) of Sect. 4.

Let $\{E, H, F\}$ be the standard basis in $\mathfrak{sl}_2$ and $\{E_n, H_n, F_n\}$ be the corresponding (topological) basis of $L\mathfrak{sl}_2 = \mathfrak{sl}_2 \otimes \mathbb{C}((s))$ (here for each $A \in \mathfrak{sl}_2$ we set $A_n = A \otimes s^n \in L\mathfrak{sl}_2$). Let $\tau$ be the automorphism of $L\mathfrak{sl}_2$ defined by the formula $\tau(A(s)) = A(sq)$ (we assume that $q$ is generic). We have: $\tau \cdot A_n = q^n A_n$.

To be able to use Theorem 1, the r-matrix $r \in L\mathfrak{sl}_2^{\otimes 2}$ defining the Lie bialgebra structure on $L\mathfrak{sl}_2$ has to satisfy the condition $(\tau \otimes \tau)(r) = r$. Hence the invariant bilinear form on $L\mathfrak{sl}_2$ defined by the symmetric part of $r$ should also be $\tau$-invariant. The Lie algebra $L\mathfrak{sl}_2$ has a unique (up to a non-zero constant multiple) invariant non-degenerate bilinear form, which is invariant under $\tau$. It is defined by the formulas

$$(E_n, F_m) = \delta_{n,-m}, \qquad (H_n, H_m) = 2\delta_{n,-m},$$

with all other pairings between the basis elements equal to 0. This fixes the symmetric part of the element $r$. Another condition on $r$ is that the subgroup $LN$ is admissible. According to Proposition 5, this means that $L\mathfrak{n}_+^\perp$ should be a Lie subalgebra of $L\mathfrak{sl}_2^*$.

A natural example of $r$ satisfying these two conditions is given by the formula:

$$r_0 = \sum_{n \in \mathbb{Z}} E_n \otimes F_{-n} + \frac{1}{4} H_0 \otimes H_0 + \frac{1}{2}\sum_{n > 0} H_n \otimes H_{-n}. \tag{5.1}$$

It is easy to verify that this element defines a factorizable Lie bialgebra structure on $\mathfrak{g}$. We remark that this Lie bialgebra structure gives rise to Drinfeld's "new" realization of the quantized enveloping algebra associated to $L\mathfrak{sl}_2$ [10, 22, 21]. As we will see in the next subsection, $r_0$ cannot be used for the q-deformed Drinfeld–Sokolov reduction. However, the following crucial fact will enable us to perform the reduction. Let $L\mathfrak{h}$ be the loop algebra of the Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{sl}_2$.

Lemma 6. For any $\rho \in \wedge^2 L\mathfrak{h}$, $r_0 + \rho$ defines a factorizable Lie bialgebra structure on $L\mathfrak{sl}_2$, such that $L\mathfrak{n}_+^\perp$ is a Lie subalgebra of $L\mathfrak{sl}_2^*$.

The fact that $r_0 + \rho$ still satisfies the classical Yang-Baxter equation is a general property of factorizable r-matrices discovered in [5]. Lemma 6 allows us to consider the class of elements $r$ given by the formula

$$r = \sum_{n \in \mathbb{Z}} E_n \otimes F_{-n} + \frac{1}{2}\sum_{m,n \in \mathbb{Z}} \phi_{n,m} \cdot H_n \otimes H_m, \tag{5.2}$$

where $\phi_{n,m} + \phi_{m,n} = \delta_{n,-m}$. The condition $(\tau \otimes \tau)(r) = r$ imposes the restriction $\phi_{n,m} = \phi_n\delta_{n,-m}$, so that (5.2) takes the form

$$r = \sum_{n \in \mathbb{Z}} E_n \otimes F_{-n} + \frac{1}{2}\sum_{n \in \mathbb{Z}} \phi_n \cdot H_n \otimes H_{-n}, \tag{5.3}$$

where $\phi_n + \phi_{-n} = 1$.


5.2. The reduction. Recall that M2,q = LSL2 = SL2 ((s)) consists of the 2 × 2 matrices a(s) b(s) M (s) = , ad − bc = 1. (5.4) c(s) d(s) J and We want to impose the constraint c(s) = −1, i.e. consider the submanifold M2,q take its quotient by the (free) action of the group 1 x(s) LN = . 0 1

Let η be the Poisson-Lie structure on LSL2 induced by r given by formula (5.3). Let η∗q be the Poisson structure on M2,q defined by formula (4.6), corresponding to the automorphism τ : g(s) → g(sq). The following is an immediate corollary of Theorem 1, Proposition 5 and Lemma 6. Proposition 6. (1) The q-gauge action of (LSL2 , η) on (M2,q , η∗q ) given by formula g(s) · M (s) = g(sq)M (s)g(s)−1 is Poisson; (2) The subgroup LN ⊂ LSL2 is admissible. Thus, we have satisfied properties (i) and (ii) of Sect. 4. Now we have to choose the remaining free parameters φn so as to satisfy property (iii). The Fourier coefficients of the matrix elements of the matrix M (s) given by (5.4) define functions on M2,q . We will use the notation am for the mth Fourier coefficient of a(s). Let R2,q be the completion of the ring of polynomials in am , bm , cm , dm , m ∈ Z, defined in the same way as the ring Rn,q of Sect. 2.3. Let S2,q ⊂ R2,q be the subalgebra of LN -invariant functions. Denote by I be the ideal of S2,q generated by {cn +δn,0 , n ∈ Z} J (the defining ideal of M2,q ). Property (iii) means that I is a Poisson ideal of S2,q , which is equivalent to the J condition that {cn , cm } ∈ I, i.e. that if {cn , cm } vanishes on M2,q . This condition means that the Poisson bracket of the constraint functions vanishes on the constraint surface, i.e. the constraints are of first class according to Dirac. Let us compute the Poisson bracket between cn ’s. First, we list the left and right gradients for the functions an , bn , cn , dn (for this computation we only need the gradients of cn ’s, but we will soon need other gradients as well). It will be convenient for us to identify Lsl2 with its dual using the bilinear form introduced in the previous section. Note that with respect to this bilinear form the dual basis elements to En , Hn , and Fn are F−n , H−n /2, and E−n , respectively. 
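Explicit computation gives (for shorthand, we write $a$ for $a(s)$, etc.):
\begin{align*}
\nabla a_m &= s^{-m}\begin{pmatrix} \frac12 a & 0 \\ c & -\frac12 a \end{pmatrix}, &
\nabla b_m &= s^{-m}\begin{pmatrix} \frac12 b & 0 \\ d & -\frac12 b \end{pmatrix}, \\
\nabla c_m &= s^{-m}\begin{pmatrix} -\frac12 c & a \\ 0 & \frac12 c \end{pmatrix}, &
\nabla d_m &= s^{-m}\begin{pmatrix} -\frac12 d & b \\ 0 & \frac12 d \end{pmatrix}, \\
\nabla' a_m &= s^{-m}\begin{pmatrix} \frac12 a & b \\ 0 & -\frac12 a \end{pmatrix}, &
\nabla' b_m &= s^{-m}\begin{pmatrix} -\frac12 b & 0 \\ a & \frac12 b \end{pmatrix}, \\
\nabla' c_m &= s^{-m}\begin{pmatrix} \frac12 c & d \\ 0 & -\frac12 c \end{pmatrix}, &
\nabla' d_m &= s^{-m}\begin{pmatrix} -\frac12 d & 0 \\ c & \frac12 d \end{pmatrix}.
\end{align*}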

Drinfeld–Sokolov Reduction for Difference Operators I.


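Now we can compute the Poisson bracket between $c_n$'s using formula (4.6):
\[ \{c_m, c_k\} = \frac12 \sum_{n \in \mathbb{Z}} \left( \phi_n - \phi_{-n} + \phi_n q^n - \phi_{-n} q^{-n} \right) c_{-n+m}\, c_{n+k}. \tag{5.5} \]
Restricting to $M^J_{2,q}$, i.e. setting $c_n = -\delta_{n,0}$, only the term $n = m$ survives and we obtain:
\[ \{c_m, c_k\}\big|_{M^J_{2,q}} = \frac12 \left( \phi_m - \phi_{-m} + \phi_m q^m - \phi_{-m} q^{-m} \right) \delta_{m,-k}. \]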

This gives us the following equation on $\phi_m$'s:
\[ \phi_m - \phi_{-m} + \phi_m q^m - \phi_{-m} q^{-m} = 0. \]
Together with the previous condition $\phi_m + \phi_{-m} = 1$, this determines $\phi_m$'s uniquely:

Theorem 2. The Poisson structure $\eta^q_*$ satisfies property (iii) of the q-deformed Drinfeld–Sokolov reduction if and only if
\[ \phi_n = \frac{1}{1 + q^n}. \]
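The two linear conditions behind Theorem 2 can be checked with exact rational arithmetic. The following is only a sketch: the sample value of $q$ and the helper names (`phi`, `solve`, `first_class_defect`) are our own choices for the illustration, not notation from the paper.

```python
from fractions import Fraction

def phi(n, q):
    """Candidate solution of Theorem 2: phi_n = 1 / (1 + q^n)."""
    return 1 / (1 + q ** n)

def first_class_defect(m, q):
    """phi_m - phi_{-m} + phi_m q^m - phi_{-m} q^{-m}; must vanish for first-class constraints."""
    return phi(m, q) - phi(-m, q) + phi(m, q) * q ** m - phi(-m, q) * q ** (-m)

def solve(m, q):
    """Solve x + y = 1, (1 + q^m) x - (1 + q^{-m}) y = 0 for (x, y) = (phi_m, phi_{-m})."""
    a, b = 1 + q ** m, 1 + q ** (-m)
    return b / (a + b), a / (a + b)

q = Fraction(2, 7)  # sample value, chosen only for the illustration
modes = range(-6, 7)
normalization = [phi(n, q) + phi(-n, q) for n in modes]
defects = [first_class_defect(n, q) for n in modes]
```

Every entry of `normalization` equals 1 and every entry of `defects` equals 0, and `solve(m, q)` returns the pair `(phi(m, q), phi(-m, q))`, which reflects the uniqueness statement.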

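Consider the r-matrix (5.2) with $\phi_n = (1 + q^n)^{-1}$. For this r-matrix, the Lie algebras defined in Sect. 4 are as follows: $g_\pm = Lb_\mp$, $n_\pm = Ln_\mp$, where $n_+ = \mathbb{C}E$, $n_- = \mathbb{C}F$, $b_\pm = h \oplus n_\pm$. We have: $g_\pm / n_\pm \simeq Lh$. The transformation $\theta$ on $Lh$ induced by this r-matrix is equal to $-\tau$. Explicitly, on the tensor product of the two 2-dimensional representations of $sl_2((t))$, the r-matrix looks as follows:
\[ \begin{pmatrix}
\phi\left(\frac{t}{s}\right) & 0 & 0 & 0 \\
0 & -\phi\left(\frac{t}{s}\right) & \delta\left(\frac{t}{s}\right) & 0 \\
0 & 0 & -\phi\left(\frac{t}{s}\right) & 0 \\
0 & 0 & 0 & \phi\left(\frac{t}{s}\right)
\end{pmatrix}, \tag{5.6} \]
where
\[ \phi(x) = \frac12 \sum_{n \in \mathbb{Z}} \frac{1}{1 + q^n}\, x^n. \]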
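The matrix (5.6) can be handled one Fourier mode at a time. The sketch below forms the combination $r(x) - \sigma(r)(1/x)$, where $\sigma$ exchanges the two tensor factors (this is the matrix called $r_-$ in Remark 9 of Sect. 5.3), and compares it with a diagonal built from $\frac12 \frac{1-q^n}{1+q^n}$ and off-diagonal entries $\pm 1$. The basis ordering $e_1 \otimes e_1, e_1 \otimes e_2, e_2 \otimes e_1, e_2 \otimes e_2$ and all function names are assumptions made for this illustration.

```python
from fractions import Fraction

def phi(n, q):
    return 1 / (1 + q ** n)

def r_mode(n, q):
    """Coefficient of x^n in the 4x4 matrix (5.6), with x = t/s."""
    h = phi(n, q) / 2          # coefficient of x^n in phi(x)
    return [[h, 0, 0, 0],
            [0, -h, 1, 0],     # delta(x) contributes 1 to every mode
            [0, 0, -h, 0],
            [0, 0, 0, h]]

SWAP = [0, 2, 1, 3]  # sigma exchanges the two tensor factors of C^2 (x) C^2

def sigma(m):
    return [[m[SWAP[i]][SWAP[j]] for j in range(4)] for i in range(4)]

def r_minus_mode(n, q):
    """Coefficient of x^n in r(x) - sigma(r)(1/x)."""
    a, b = r_mode(n, q), sigma(r_mode(-n, q))
    return [[a[i][j] - b[i][j] for j in range(4)] for i in range(4)]

def expected_mode(n, q):
    v = (1 - q ** n) / (1 + q ** n) / 2   # half of the x^n coefficient of varphi(x)
    return [[v, 0, 0, 0], [0, -v, 1, 0], [0, -1, -v, 0], [0, 0, 0, v]]

q = Fraction(1, 4)  # sample value of q
```

Comparing `r_minus_mode(n, q)` with `expected_mode(n, q)` for a range of `n` confirms the antisymmetric pattern mode by mode.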

Note that $2\pi\phi(xq^{1/2})$ coincides with the power series expansion of the Jacobi elliptic function dn (delta of amplitude). Now we have satisfied all the necessary properties on the Poisson structures and hence can perform the q-Drinfeld–Sokolov reduction of Sect. 3 at the level of Poisson algebras. In the next subsection we check that it indeed gives us the Poisson bracket (2.14) on the reduced space $\mathcal{M}_{2,q} = M^J_{2,q}/LN$.

Remark 8. It is straightforward to identify the $q \to 1$ limit of the reduced Poisson algebra with the classical Virasoro algebra.


E. Frenkel, N. Reshetikhin, M.A. Semenov-Tian-Shansky

5.3. Explicit computation of the Poisson brackets. Introduce the generating series
\[ A(z) = \sum_{n \in \mathbb{Z}} a_n z^{-n}, \]
and the same for other matrix elements of $M(s)$ given by formula (5.4). We fix the element $r$ by setting $\phi_n = (1 + q^n)^{-1}$ in formula (5.3) in accordance with Theorem 2. Denote
\[ \varphi(z) = \sum_{n \in \mathbb{Z}} (\phi_n - \phi_{-n}) z^n = \sum_{n \in \mathbb{Z}} \frac{1 - q^n}{1 + q^n}\, z^n. \tag{5.7} \]
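The second equality in (5.7) is a one-line identity in each mode. As a small sanity check (a sketch with a sample value of $q$; the function names are ours), one can compare both forms of the coefficient and confirm that the series is odd under $n \to -n$:

```python
from fractions import Fraction

def phi(n, q):
    return 1 / (1 + q ** n)

def varphi_coeff(n, q):
    """n-th coefficient of varphi(z) in (5.7), written as phi_n - phi_{-n}."""
    return phi(n, q) - phi(-n, q)

def closed_form(n, q):
    return (1 - q ** n) / (1 + q ** n)

q = Fraction(3, 5)  # sample value of q
table = {n: varphi_coeff(n, q) for n in range(-4, 5)}
```

In particular `table[0]` vanishes and `table[-n] == -table[n]`, so $\varphi(1/z) = -\varphi(z)$ as formal series.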

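Using the formulas for the gradients of the matrix elements given in the previous section and formula (4.6) for the Poisson bracket, we obtain the following explicit formulas for the Poisson brackets:
\begin{align*}
\{A(z), A(w)\} &= \varphi\left(\frac{w}{z}\right) A(z)A(w), \\
\{A(z), B(w)\} &= -\delta\left(\frac{w}{z}\right) A(z)B(w), \\
\{A(z), C(w)\} &= \delta\left(\frac{w}{z}\right) A(z)C(w), \\
\{A(z), D(w)\} &= -\varphi\left(\frac{w}{z}\right) A(z)D(w), \\
\{B(z), B(w)\} &= 0, \\
\{B(z), C(w)\} &= \delta\left(\frac{w}{z}\right) A(z)D(w) - \delta\left(\frac{wq}{z}\right) A(z)A(w), \\
\{B(z), D(w)\} &= -\delta\left(\frac{wq}{z}\right) A(z)B(w), \\
\{C(z), C(w)\} &= 0, \\
\{C(z), D(w)\} &= \delta\left(\frac{w}{zq}\right) A(z)C(w), \\
\{D(z), D(w)\} &= \varphi\left(\frac{w}{z}\right) D(z)D(w) - \delta\left(\frac{wq}{z}\right) C(z)B(w) + \delta\left(\frac{w}{zq}\right) B(z)C(w).
\end{align*}

Remark 9. The relations above can be presented in matrix form as follows. Let
\[ L(z) = \begin{pmatrix} A(z) & B(z) \\ C(z) & D(z) \end{pmatrix}, \]
and consider the operators $L_1 = L \otimes \mathrm{id}$, $L_2 = \mathrm{id} \otimes L$ acting on $\mathbb{C}^2 \otimes \mathbb{C}^2$. The r-matrix (5.6) also acts on $\mathbb{C}^2 \otimes \mathbb{C}^2$. Formula (4.6) can be written as follows:
\begin{align*}
\{L_1(z), L_2(w)\} ={}& \frac12\, r_-\left(\frac{w}{z}\right) L_1(z)L_2(w) + \frac12\, L_1(z)L_2(w)\, r_-\left(\frac{w}{z}\right) \\
& - L_1(z)\, r\left(\frac{wq}{z}\right) L_2(w) + L_2(w)\, \sigma(r)\left(\frac{zq}{w}\right) L_1(z),
\end{align*}
where
\[ r_-\left(\frac{w}{z}\right) = r\left(\frac{w}{z}\right) - \sigma(r)\left(\frac{z}{w}\right) =
\begin{pmatrix}
\frac12\varphi\left(\frac{w}{z}\right) & 0 & 0 & 0 \\
0 & -\frac12\varphi\left(\frac{w}{z}\right) & \delta\left(\frac{w}{z}\right) & 0 \\
0 & -\delta\left(\frac{w}{z}\right) & -\frac12\varphi\left(\frac{w}{z}\right) & 0 \\
0 & 0 & 0 & \frac12\varphi\left(\frac{w}{z}\right)
\end{pmatrix}. \]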

5.4. Reduced Poisson structure. We know that $\mathcal{M}_{2,q} = M^J_{2,q}/LN$ is isomorphic to
\[ \left\{ \begin{pmatrix} t(s) & 1 \\ -1 & 0 \end{pmatrix} \right\} \]
(see Sect. 3). The ring $\mathcal{R}_{2,q}$ of functionals on $\mathcal{M}_{2,q}$ is generated by the Fourier coefficients of $t(s)$. In order to compute the reduced Poisson bracket between them, we have to extend them to $LN$-invariant functions on the whole $M_{2,q}$. Set
\[ \tilde{t}(s) = a(s)c(sq) + d(sq)c(s). \tag{5.8} \]
It is easy to check that the Fourier coefficients $\tilde{t}_m$ of $\tilde{t}(s)$ are $LN$-invariant, and their restrictions to $M^J_{2,q}$ coincide with the corresponding Fourier coefficients of $t(s)$. Let us compute the Poisson bracket between $\tilde{t}_m$'s. Set
\[ \tilde{T}(z) = \sum_{m \in \mathbb{Z}} \tilde{t}_m z^{-m}. \]
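The $LN$-invariance of (5.8) can be checked pointwise. The sketch below evaluates $g(sq)M(s)g(s)^{-1}$ for $g(s) = \begin{pmatrix} 1 & x(s) \\ 0 & 1 \end{pmatrix}$ at a few rational sample points; the particular functions `M` and `x`, the value of `Q`, and the sample points are arbitrary choices made for the illustration.

```python
from fractions import Fraction

Q = Fraction(1, 3)  # sample value of q

def M(s):
    """A sample element of LSL_2 evaluated at s (entries chosen so that ad - bc = 1)."""
    a, b, c = s + 2, s, s - 1
    d = (1 + b * c) / a
    return (a, b, c, d)

def x(s):
    """Sample parameter of the unipotent loop g(s) = [[1, x(s)], [0, 1]]."""
    return 5 * s * s - Fraction(1, 2)

def gauge(s):
    """Entries (a, b, c, d) of g(sq) M(s) g(s)^{-1} at the point s."""
    a, b, c, d = M(s)
    xq, x0 = x(s * Q), x(s)
    a1, b1 = a + xq * c, b + xq * d            # left multiplication by g(sq)
    return (a1, b1 - x0 * a1, c, d - x0 * c)   # right multiplication by g(s)^{-1}

def t_tilde(matrix, s):
    """t~(s) = a(s) c(sq) + d(sq) c(s), formula (5.8)."""
    a, _, c, _ = matrix(s)
    _, _, cq, dq = matrix(s * Q)
    return a * cq + dq * c

samples = [Fraction(k, 7) for k in (1, 2, 3, 5, 8)]
```

At every sample point `t_tilde(gauge, s)` equals `t_tilde(M, s)`: the terms proportional to $x(sq)$ coming from $a$ and from $d$ cancel, and the entry $c$ is untouched by the action.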

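Using the explicit formulas above, we find
\[ \{\tilde{T}(z), \tilde{T}(w)\} = \varphi\left(\frac{w}{z}\right) \tilde{T}(z)\tilde{T}(w) + \delta\left(\frac{wq}{z}\right) \Delta(z)\, c(w)c(wq^2) - \delta\left(\frac{w}{zq}\right) \Delta(w)\, c(z)c(zq^2), \tag{5.9} \]
where $\Delta(z) = A(z)D(z) - B(z)C(z) = 1$. Hence, restricting to $M^J_{2,q}$ (i.e. setting $c(z) = -1$ in formula (5.9), so that each product of two $c$'s equals 1), we obtain:
\[ \{T(z), T(w)\} = \varphi\left(\frac{w}{z}\right) T(z)T(w) + \delta\left(\frac{wq}{z}\right) - \delta\left(\frac{w}{zq}\right). \]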

This indeed coincides with formula (2.14).

Remark 10. Consider the subring $\tilde{S}_{2,q}$ of the ring $R_{2,q}$, generated by $c_m, \tilde{t}_m$, $m \in \mathbb{Z}$. The ring $\tilde{S}_{2,q}$ consists of $LN$-invariant functionals on $M_{2,q}$, and hence it can serve as a substitute for the ring of functions on $M_{2,q}/LN$. Let us compute the Poisson brackets in $\tilde{S}_{2,q}$. The Poisson brackets of $\tilde{t}_m$'s are given by formula (5.9), and by construction $\{c_m, c_k\} = 0$. It is also easy to find that $\{c_m, \tilde{t}_k\} = 0$. Hence $\tilde{S}_{2,q}$ is a Poisson subalgebra of $R_{2,q}$. Thus, the q-deformed Drinfeld–Sokolov reduction can be interpreted as follows. The initial Poisson algebra is $R_{2,q}$. We consider its Poisson subalgebra $\tilde{S}_{2,q}$ generated by $c_m$'s and $\tilde{t}_m$'s. The ideal $I$ of $\tilde{S}_{2,q}$ generated by $\{c_m + \delta_{m,0},\ m \in \mathbb{Z}\}$ is a Poisson ideal. The quotient $\tilde{S}_{2,q}/I$ is isomorphic to the q-Virasoro algebra $\mathcal{R}_{2,q}$ defined in Sect. 3.


5.5. q-deformation of Miura transformation. As was explained in Sect. 3.2, the q-Miura transformation of [19] is the map between two (local) cross-sections of the projection $\pi_q : M^J_{n,q} \to M^J_{n,q}/LN$. In the case of $LSL_2$, the first cross-section
\[ \begin{pmatrix} \lambda(s) & 0 \\ -1 & \lambda(s)^{-1} \end{pmatrix} \]
is defined by the subsidiary constraint $b(s) = 0$, and the second
\[ \begin{pmatrix} t(s) & 1 \\ -1 & 0 \end{pmatrix} \]
is defined by the subsidiary constraint $d(s) = 0$. The map between them is given by the formula
\[ m_q : \lambda(s) \mapsto t(s) = \lambda(s) + \lambda(sq)^{-1}. \]
Now we want to recover formula (2.16) for the Poisson brackets between the Fourier coefficients $\lambda_n$ of $\lambda(s)$, which makes the map $m_q$ Poisson.

We have already computed the Poisson bracket on the second (canonical) cross-section from the point of view of Poisson reduction. Now we need to compute the Poisson bracket between the functions $a_n$'s on the first cross-section, with respect to which the map $m_q$ is Poisson. This computation is essentially similar to the one outlined in Sect. 3.2. The Poisson structure on the local cross-section is given by the Dirac bracket, which is determined by the choice of the subsidiary conditions, which fix the cross-section.

The Dirac bracket has the following property (see [16]). Suppose we are given constraints $\xi_n$, $n \in I$, and subsidiary conditions $\eta_n$, $n \in I$, on a Poisson manifold $M$, such that $\{\xi_k, \xi_l\} = \{\eta_k, \eta_l\} = 0$, $\forall k, l \in I$. Let $f, g$ be two functions on $M$, such that $\{f, \xi_k\}$ and $\{g, \xi_k\}$ vanish on the common level surface of all $\xi_k, \eta_k$. Then the Dirac bracket of $f$ and $g$ coincides with their ordinary Poisson bracket.

In our case, the constraint functions are $c_m + \delta_{m,0}$, $m \in \mathbb{Z}$, and the subsidiary conditions are $b_m$, $m \in \mathbb{Z}$, which fix the local model of the reduced space. We have: $\{b_m, b_k\} = 0$, $\{c_m, c_k\} = 0$, and $\{a_m, b_k\} = 0$, if we set $b_m = 0$, $\forall m \in \mathbb{Z}$. Therefore we are in the situation described above, and the Dirac bracket between $a_m$ and $a_k$ coincides with their ordinary bracket. In terms of the generating function $A(z) = \sum_{m \in \mathbb{Z}} a_m z^{-m}$ it is given by the formula
\[ \{A(z), A(w)\} = \varphi\left(\frac{w}{z}\right) A(z)A(w), \]
which coincides with formula (2.16). Thus, we have proved the Poisson property of the q-deformation of the Miura transformation from the point of view of the deformed Drinfeld–Sokolov reduction.
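The formula $t(s) = \lambda(s) + \lambda(sq)^{-1}$ for the map between the two cross-sections of Sect. 5.5 can be recovered by an explicit gauge transformation. The sketch below assumes the gauge parameter $x(s) = -\lambda(s)^{-1}$ (our choice, made so that the transformed entry $d$ vanishes) and evaluates everything at rational sample points; `lam`, `Q` and the sample points are arbitrary.

```python
from fractions import Fraction

Q = Fraction(2, 5)  # sample value of q

def lam(s):
    """Sample function lambda(s), nonzero at the points used below."""
    return s * s + 1

def first_section(s):
    """Cross-section b = 0: [[lambda, 0], [-1, 1/lambda]]."""
    return (lam(s), 0, -1, 1 / lam(s))

def x(s):
    """Gauge parameter chosen so that the transformed d-entry vanishes."""
    return -1 / lam(s)

def to_second_section(s):
    """Entries of g(sq) M(s) g(s)^{-1}, g = [[1, x], [0, 1]], M on the first section."""
    a, b, c, d = first_section(s)
    xq, x0 = x(s * Q), x(s)
    a1, b1 = a + xq * c, b + xq * d
    return (a1, b1 - x0 * a1, c, d - x0 * c)

def miura(s):
    """t(s) = lambda(s) + 1 / lambda(sq)."""
    return lam(s) + 1 / lam(s * Q)

samples = [Fraction(k, 3) for k in range(1, 6)]
```

For each sample point `to_second_section(s)` is the tuple `(miura(s), 1, -1, 0)`, i.e. the matrix lands on the cross-section $d = 0$ with upper-left entry given by the q-Miura formula.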

6. Lattice Virasoro Algebra

In this section we consider the lattice counterpart of the Drinfeld–Sokolov reduction. Our group is thus $\mathbf{G} = (SL_2)^{\mathbb{Z}/N\mathbb{Z}}$, where $N$ is an integer, and $\tau$ is the automorphism of $\mathbf{G}$ which maps $(g_i)$ to $(g_{i+1})$. Poisson structures on $\mathbf{G}$ which are covariant with respect to lattice gauge transformations $x_n \mapsto g_{n+1} x_n g_n^{-1}$ have been studied already in [29] (cf. also [2]). In order to make the reduction over the nilpotent subgroup $\mathbf{N} \subset \mathbf{G}$ feasible, we have to be careful in our choice of the r-matrix.

6.1. Discrete Drinfeld–Sokolov reduction. By analogy with the continuous case, we choose the element $r$ defining the Lie bialgebra structure on $g = sl_2^{\oplus \mathbb{Z}/N\mathbb{Z}}$ as follows:
\[ r = \sum_{n \in \mathbb{Z}/N\mathbb{Z}} E_n \otimes F_n + \frac14 \sum_{m,n \in \mathbb{Z}/N\mathbb{Z}} \phi_{n,m}\, H_n \otimes H_m, \]

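where $\phi_{n,m} + \phi_{m,n} = 2\delta_{m,n}$. It is easy to see that $r$ defines a factorizable Lie bialgebra structure on $g$. For Theorem 1 to be applicable, $r$ has to satisfy the condition $(\tau \otimes \tau)(r) = r$, which implies that $\phi_{n,m} = \phi_{n-m}$. An element of $\mathbf{G}$ is an $N$-tuple $(g_i)$ of elements of $SL_2$:
\[ g_k = \begin{pmatrix} a_k & b_k \\ c_k & d_k \end{pmatrix}. \]
We consider $a_k, b_k, c_k, d_k$, $k \in \mathbb{Z}/N\mathbb{Z}$, as the generators of the ring of functions on $\mathbf{G}$. The discrete analogue of the Drinfeld–Sokolov reduction consists of taking the quotient $\mathcal{M} = \mathbf{G}^J/\mathbf{N}$, where $\mathbf{G}^J = (G^J)^{\mathbb{Z}/N\mathbb{Z}}$,
\[ G^J = \left\{ \begin{pmatrix} a & b \\ -1 & d \end{pmatrix} \right\}, \]
and $\mathbf{N} = N^{\mathbb{Z}/N\mathbb{Z}}$, acting on $\mathbf{G}^J$ by the formula
\[ (h_i) \cdot (g_i) = (h_{i+1}\, g_i\, h_i^{-1}). \tag{6.1} \]
It is easy to see that
\[ \mathcal{M} \simeq \left\{ \begin{pmatrix} t_i & 1 \\ -1 & 0 \end{pmatrix} \right\}_{i \in \mathbb{Z}/N\mathbb{Z}}. \]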

The element $r$ with $\phi_{n,m} = \phi_{n-m}$, $\phi_k + \phi_{-k} = 2\delta_{k,0}$, defines a Lie bialgebra structure on $g$ and Poisson structures $\eta$, $\eta^\tau_*$ on $\mathbf{G}$. According to Theorem 1, the action of $(\mathbf{G}, \eta)$ on $(\mathbf{G}, \eta^\tau_*)$ given by formula (6.1) is Poisson. As in the continuous case, for the Poisson structure $\eta^\tau_*$ to be compatible with the discrete Drinfeld–Sokolov reduction, we must have:
\[ \{c_n, c_m\}\big|_{\mathbf{G}^J} = 0. \tag{6.2} \]

Explicit calculation analogous to the one made in the previous subsection shows that (6.2) holds if and only if
\[ \phi_{n-1} + 2\phi_n + \phi_{n+1} = 2\delta_{n,0} + 2\delta_{n+1,0}. \]
The initial condition $\phi_0 = 1$ and periodicity condition give us a unique solution: for odd $N$, $\phi_k = (-1)^k$; for even $N$, $\phi_k = (-1)^k \left(1 - \frac{2k}{N}\right)$. In what follows we restrict ourselves to the case of odd $N$ (note that in this case the linear operator $\mathrm{id} + \tau$ is invertible). Continuing as in the previous subsection, we define
\[ \tilde{t}_n = a_n c_{n+1} + d_{n+1} c_n, \qquad n \in \mathbb{Z}/N\mathbb{Z}. \]


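These are $\mathbf{N}$-invariant functions on $\mathbf{G}$. We find in the same way as in the continuous case:
\begin{align*}
\{\tilde{t}_n, \tilde{t}_m\} &= \varphi_{n-m}\, \tilde{t}_n \tilde{t}_m + \delta_{n,m+1}\, c_m c_{m+2} - \delta_{n+1,m}\, c_n c_{n+2}, \\
\{\tilde{t}_n, c_m\} &= 0, \tag{6.3} \\
\{c_n, c_m\} &= 0,
\end{align*}
where
\[ \varphi_k = \frac12 (\phi_k - \phi_{-k}) = \begin{cases} 0, & k = 0, \\ (-1)^k, & k \neq 0. \end{cases} \]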

The discrete Virasoro algebra $\mathbb{C}[t_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$ is by definition the quotient of the Poisson algebra $\mathbb{C}[\tilde{t}_i, c_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$ by its Poisson ideal generated by $c_i + 1$, $i \in \mathbb{Z}/N\mathbb{Z}$. From formula (6.3) we obtain the following Poisson bracket between the generators $t_i$:
\[ \{t_n, t_m\} = \varphi_{n-m}\, t_n t_m + \delta_{n,m+1} - \delta_{n+1,m}. \tag{6.4} \]
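The lattice parameters $\phi_k$ quoted in Sect. 6.1 can be tested directly against the recursion $\phi_{n-1} + 2\phi_n + \phi_{n+1} = 2\delta_{n,0} + 2\delta_{n+1,0}$. The sketch below assumes that the index $k$ in the closed formulas is taken as the representative $0 \le k < N$ of its residue class (an assumption on our part; the text does not spell out the representative), and uses our own helper names.

```python
from fractions import Fraction

def phi(k, N):
    """Claimed solution, with k reduced to the representative 0 <= k < N."""
    k %= N
    sign = (-1) ** k
    return Fraction(sign) if N % 2 else sign * (1 - Fraction(2 * k, N))

def delta(i, N):
    return 1 if i % N == 0 else 0

def recursion_holds(N):
    """phi_{n-1} + 2 phi_n + phi_{n+1} = 2 delta_{n,0} + 2 delta_{n+1,0} for all n mod N."""
    return all(phi(n - 1, N) + 2 * phi(n, N) + phi(n + 1, N)
               == 2 * delta(n, N) + 2 * delta(n + 1, N) for n in range(N))

def skew_part(k, N):
    """varphi_k = (phi_k - phi_{-k}) / 2."""
    return (phi(k, N) - phi(-k, N)) / 2
```

With this convention `recursion_holds(N)` is true for both odd and even `N`, and for odd `N` the values `skew_part(k, N)` reproduce the structure constants $\varphi_k$ appearing in (6.3) and (6.4).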

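The discrete Miura transformation is the map from the local cross-section
\[ \begin{pmatrix} \lambda_n & 0 \\ -1 & \lambda_n^{-1} \end{pmatrix} \]
to $\mathcal{M}$,
\[ \lambda_n \mapsto t_n = \lambda_n + \lambda_{n+1}^{-1}. \tag{6.5} \]
It defines a Poisson map $\mathbb{C}[\lambda_i^{\pm}]_{i \in \mathbb{Z}/N\mathbb{Z}} \to \mathbb{C}[t_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$, where the Poisson structure on the former is given by the formula
\[ \{\lambda_n, \lambda_m\} = \varphi_{n-m}\, \lambda_n \lambda_m. \tag{6.6} \]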
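That the substitution (6.5) turns the bracket (6.6) into the bracket (6.4) can be verified mechanically for a small odd lattice. The following sketch represents Laurent polynomials in $\lambda_0, \dots, \lambda_{N-1}$ as dictionaries from exponent tuples to integer coefficients; this representation, the choice $N = 5$ and all function names are ours.

```python
from itertools import product

N = 5  # odd lattice size (chosen for the illustration)

def varphi(k):
    """Structure constants of (6.6): 0 for k = 0 mod N, (-1)^k for k = 1..N-1."""
    k %= N
    return 0 if k == 0 else (-1) ** k

def clean(p):
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return clean(out)

def mul(p, q):
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0) + c1 * c2
    return clean(out)

def bracket(p, q):
    """Log-canonical bracket {l_i, l_j} = varphi(i - j) l_i l_j, extended by the Leibniz rule."""
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        w = sum(e1[i] * e2[j] * varphi(i - j) for i in range(N) for j in range(N))
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0) + w * c1 * c2
    return clean(out)

def lam(n, power=1):
    exps = [0] * N
    exps[n % N] = power
    return {tuple(exps): 1}

def t(n):
    """Discrete Miura transformation (6.5)."""
    return add(lam(n), lam(n + 1, -1))

def delta(i, j):
    return 1 if (i - j) % N == 0 else 0

def rhs(n, m):
    """Right-hand side of (6.4) written in the lambda variables."""
    scaled = {e: varphi(n - m) * c for e, c in mul(t(n), t(m)).items()}
    const = {tuple([0] * N): delta(n, m + 1) - delta(n + 1, m)}
    return add(scaled, const)
```

For all pairs `(n, m)` the dictionaries `bracket(t(n), t(m))` and `rhs(n, m)` coincide, which is the Poisson property of the discrete Miura map on this small lattice.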

Remark 11. The Poisson algebra $\mathbb{C}[t_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$ can be considered as a regularized version of the q-deformed Virasoro algebra when $q = \epsilon$, where $\epsilon$ is a primitive $N$th root of unity. Indeed, we can then consider $t(\epsilon^i)$, $i \in \mathbb{Z}/N\mathbb{Z}$, as generators and truncate, in all power series appearing in the relations, summations over $\mathbb{Z}$ to summations over $\mathbb{Z}/N\mathbb{Z}$ divided by $N$. This means that we replace $\varphi(\epsilon^n)$ given by formula (5.7) by
\[ \tilde{\varphi}(\epsilon^n) = \frac{1}{N} \sum_{i \in \mathbb{Z}/N\mathbb{Z}} \frac{1 - \epsilon^i}{1 + \epsilon^i}\, \epsilon^{ni}, \]
and $\delta(\epsilon^n)$ by $\delta_{n,0}$. The formula for the Poisson bracket then becomes:
\[ \{t(\epsilon^n), t(\epsilon^m)\} = \tilde{\varphi}(\epsilon^{m-n})\, t(\epsilon^n) t(\epsilon^m) + \delta_{n,m+1} - \delta_{n+1,m}. \]
If we set $t(\epsilon^i) = t_i$, we recover the Poisson bracket (6.4), since it is easy to check that $\tilde{\varphi}(\epsilon^{m-n}) = \varphi_{n-m}$.

One can apply the same procedure to the q-deformed W-algebras associated to $sl_n$ and obtain lattice Poisson algebras. It would be interesting to see whether they are related to the lattice W-algebras studied in the literature, e.g., in [6, 7]. In the case of $sl_2$, this connection is described in the next subsection.
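The identity $\tilde{\varphi}(\epsilon^{m-n}) = \varphi_{n-m}$ used in Remark 11 is a finite Fourier inversion and can be checked numerically for odd $N$. The sketch below takes $\epsilon = e^{2\pi i/N}$ (one particular primitive root, our choice) and works in floating point, so the comparison is only up to rounding error.

```python
import cmath

def varphi(k, N):
    """Lattice structure constants for odd N: 0 for k = 0 mod N, (-1)^k for k = 1..N-1."""
    k %= N
    return 0 if k == 0 else (-1) ** k

def varphi_tilde(n, N):
    """(1/N) * sum over i mod N of (1 - eps^i)/(1 + eps^i) * eps^(n i), eps = exp(2 pi i / N)."""
    eps = cmath.exp(2j * cmath.pi / N)
    return sum((1 - eps ** i) / (1 + eps ** i) * eps ** (n * i) for i in range(N)) / N

def max_error(N):
    """Largest deviation between varphi_tilde(eps^(-k)) and varphi_k over all residues k."""
    return max(abs(varphi_tilde(-k, N) - varphi(k, N)) for k in range(N))
```

For odd `N` the denominators $1 + \epsilon^i$ never vanish, and `max_error(N)` stays at the level of floating-point noise.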


6.2. Connection with Faddeev–Takhtajan–Volkov algebra. The Poisson structures (6.4) and (6.6) are nonlocal, i.e. the Poisson brackets between distant neighbors on the lattice are nonzero. However, one can define closely connected Poisson algebras possessing local Poisson brackets; these Poisson algebras can actually be identified with those studied by L. Faddeev, L. Takhtajan, and A. Volkov.

Let us first recall some results of [19] concerning the continuous case. As was explained in [19], one can associate a generating series of elements of the q-Virasoro algebra to an arbitrary finite-dimensional representation of $sl_2$. The series $T(z)$ considered in this paper corresponds to the two-dimensional representation. Let $T^{(2)}(z)$ be the series corresponding to the three-dimensional irreducible representation of $sl_2$. We have the following identity [19]:
\[ T(z)T(zq) = T^{(2)}(z) + 1, \]
which can be taken as the definition of $T^{(2)}(z)$. From formula (2.15) we obtain:
\begin{align*}
T^{(2)}(z) &= \Lambda(z)\Lambda(zq) + \Lambda(z)\Lambda(zq^2)^{-1} + \Lambda(zq)^{-1}\Lambda(zq^2)^{-1} \\
&= A(z) + A(z)A(zq)^{-1} + A(zq)^{-1},
\end{align*}
where
\[ A(z) = \Lambda(z)\Lambda(zq) \tag{6.7} \]
(note that the series $A(z)$ was introduced in Sect. 7 of [19]). From formula (2.16) we find:
\[ \{A(z), A(w)\} = \left( \delta\left(\frac{w}{zq}\right) - \delta\left(\frac{wq}{z}\right) \right) A(z)A(w). \]
It is also easy to find
\begin{align*}
\{T^{(2)}(z), T^{(2)}(w)\} ={}& \left( \delta\left(\frac{w}{zq}\right) - \delta\left(\frac{wq}{z}\right) \right) \left( T^{(2)}(z)T^{(2)}(w) - 1 \right) \\
& + \delta\left(\frac{wq^2}{z}\right) T(w)T(wq^3) - \delta\left(\frac{w}{zq^2}\right) T(z)T(zq^3).
\end{align*}

We can use the same idea in the lattice case. Let $\nu_n = \lambda_n \lambda_{n+1}$; this is the analogue of $A(z)$. We have:
\[ \{\nu_n, \nu_m\} = (\delta_{n+1,m} - \delta_{n,m+1})\, \nu_n \nu_m, \tag{6.8} \]
and hence $\mathbb{C}[\nu_i^{\pm}]$ is a Poisson subalgebra of $\mathbb{C}[\lambda_n^{\pm}]$ with local Poisson brackets. We can also define $t^{(2)}_n = t_n t_{n+1} - 1$. The Poisson bracket of $t^{(2)}_n$'s is local:
\[ \{t^{(2)}_n, t^{(2)}_m\} = (\delta_{n+1,m} - \delta_{n,m+1}) \left( t^{(2)}_n t^{(2)}_m - 1 \right) + \delta_{n,m+2}\, t_m t_{m+3} - \delta_{n+2,m}\, t_n t_{n+3}. \tag{6.9} \]
Unfortunately, it does not close on $t^{(2)}_n$'s, so that $\mathbb{C}[t^{(2)}_i]$ is not a Poisson subalgebra of $\mathbb{C}[t_i]$. But let us define formally
\[ s_n = \frac{1}{1 + t^{(2)}_n} = t_n^{-1} t_{n+1}^{-1} = \frac{1}{(1 + \nu_n)(1 + \nu_{n+1}^{-1})}. \tag{6.10} \]
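Two of the statements above reduce to elementary checks. The coefficient in (6.8) follows from (6.6) by the Leibniz rule, since $\nu_n = \lambda_n\lambda_{n+1}$ is a monomial, and the chain of equalities in (6.10) is an algebraic identity in the $\lambda_n$. The sketch below tests both for $N = 7$ with rational sample values of $\lambda_n$; the lattice size, the sample values and the helper names are our own choices.

```python
from fractions import Fraction

N = 7  # odd lattice size, chosen for the illustration

def varphi(k):
    k %= N
    return 0 if k == 0 else (-1) ** k

def delta(i, j):
    return 1 if (i - j) % N == 0 else 0

def nu_bracket_coeff(n, m):
    """Coefficient of nu_n nu_m in {nu_n, nu_m}, obtained from (6.6) by the Leibniz rule."""
    return sum(varphi(i - j) for i in (n, n + 1) for j in (m, m + 1))

lam = [Fraction(k + 2, k + 1) for k in range(N)]         # sample nonzero values of lambda_n
t = [lam[n] + 1 / lam[(n + 1) % N] for n in range(N)]     # (6.5)
nu = [lam[n] * lam[(n + 1) % N] for n in range(N)]        # nu_n = lambda_n lambda_{n+1}
t2 = [t[n] * t[(n + 1) % N] - 1 for n in range(N)]        # t^{(2)}_n
s = [1 / (1 + t2[n]) for n in range(N)]                   # first expression in (6.10)
```

Here `nu_bracket_coeff(n, m)` equals `delta(n + 1, m) - delta(n, m + 1)` for all residues, and each `s[n]` agrees with the other two expressions in (6.10).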


Then from formulas (6.10) and (6.8) we find:
\[ \{s_n, s_m\} = s_n s_m \Big( (\delta_{n+1,m} - \delta_{n,m+1})(1 - s_n - s_m) - s_{n+1}\, \delta_{n+2,m} + s_{m+1}\, \delta_{n,m+2} \Big). \tag{6.11} \]

Thus, the Poisson bracket closes among $s_n$'s and defines a Poisson structure on $\mathbb{C}[s_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$. The Poisson algebra $\mathbb{C}[s_i]_{i \in \mathbb{Z}/N\mathbb{Z}}$ with Poisson bracket (6.11) was first introduced by Faddeev and Takhtajan in [34] (see formula (54)). We see that it is connected with our version of the discrete Virasoro algebra, $\mathbb{C}[t_i]$, by a change of variables (6.10). The Poisson algebra $\mathbb{C}[\nu_i^{\pm}]$ and the Poisson map $\mathbb{C}[\nu_i^{\pm}] \to \mathbb{C}[s_n]$ given by formula (6.10) were introduced by Volkov in [35] (see formulas (2) and (23)) following [34]; see also related papers [36, 13]. This map is connected with our version (6.5) of the discrete Miura transformation by a change of variables.

Acknowledgement. E. Frenkel thanks P. Schapira for his hospitality at Université Paris VI in June of 1996, when this collaboration began. Some of the results of this paper have been reported in E. Frenkel's lecture course on Soliton Theory given at Harvard University in the Spring of 1996. The research of E. Frenkel was supported by grants from the Packard and Sloan Foundations, and by the NSF grants DMS 9501414 and DMS 9304580. The research of N. Reshetikhin was supported by the NSF grant DMS 9296120.

References

1. Adler, M.: On a trace functional for formal pseudodifferential operators and the symplectic structure of the Korteweg–de Vries type equations. Invent. Math. 50, 219–248 (1979)
2. Alekseev, A., Faddeev, L., Semenov-Tian-Shansky, M.: Hidden quantum group inside Kac–Moody algebras. Commun. Math. Phys. 149, 335–345 (1992)
3. Asai, Y., Jimbo, M., Miwa, T., Pugai, Ya.: Bosonization of vertex operators for the $A^{(1)}_{n-1}$ vertex models. J. Phys. A29, 6595–6616 (1996)
4. Awata, H., Kubo, H., Odake, S., Shiraishi, J.: Quantum $W_n$ algebras and Macdonald polynomials. Comm. Math. Phys. 179, 401–416 (1996)
5. Belavin, A.A., Drinfeld, V.G.: Solutions of the classical Yang–Baxter equation for simple Lie algebras. Funct. Anal. Appl. 16, 159–80 (1981)
6. Belov, A.D., Chaltikian, K.D.: Lattice analogues of W-algebras and classical integrable equations. Phys. Lett., 268–274 (1993)
7. Bonora, L., Colatto, L.P., Constantinidis, C.P.: Toda lattice field theories, discrete W-algebras, Toda lattice hierarchies and quantum groups. Phys. Lett. B387, 759 (1996)
8. Ding, X.-M., Hou, B.-Y., Zhao, L.: ħ-(Yangian) deformation of Miura map and Virasoro algebra. Preprint q-alg/9701014
9. Drinfeld, V.G.: Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang–Baxter equation. Sov. Math. Dokl. 27, 68–71 (1983)
10. Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Sov. Math. Dokl. 36, 212–216 (1988)
11. Drinfeld, V.G., Sokolov, V.V.: Lie algebras and equations of Korteweg–de Vries type. Sov. Math. Dokl. 23, 457–462 (1981); J. Sov. Math. 30, 1975–2035 (1985)
12. Gelfand, I.M., Dickey, L.A.: Family of Hamiltonian structures connected with integrable nonlinear equations. Collected papers of I.M. Gelfand, vol. 1, Berlin–Heidelberg–New York: Springer-Verlag, 1987, pp. 625–646
13. Faddeev, L.D., Volkov, A.Yu.: Abelian current algebras and the Virasoro algebra on the lattice. Phys. Lett. B315, 311–8 (1993)
14. Feigin, B., Frenkel, E.: Affine Lie algebras at the critical level and Gelfand–Dikii algebras. Int. J. Math. Phys. A7, suppl. A1, 197–215 (1992)
15. Feigin, B., Frenkel, E.: Quantum W-algebras and elliptic algebras. Comm. Math. Phys. 178, 653–678 (1996); q-alg/9508009
16. Flato, M., Lichnerowicz, A., Sternheimer, D.: Deformation of Poisson brackets, Dirac brackets and applications. J. Math. Phys. 17, 1754 (1976)
17. Frenkel, E.: Deformations of the KdV hierarchy and related soliton equations. Int. Math. Res. Notices 2, 55–76 (1996); q-alg/9511003
18. Frenkel, E.: Affine Kac–Moody algebras at the critical level and quantum Drinfeld–Sokolov reduction. PhD Thesis, Harvard University, 1991
19. Frenkel, E., Reshetikhin, N.: Quantum affine algebras and deformations of the Virasoro algebra and W-algebras. Comm. Math. Phys. 178, 237–264 (1996); q-alg/9505025
20. Kac, V.: Infinite-dimensional Lie algebras. Third edition, Cambridge: Cambridge University Press, 1990
21. Kedem, R.: Singular R-matrices and Drinfeld's comultiplication. Preprint q-alg/9611001
22. Khoroshkin, S.M., Tolstoy, V.N.: On Drinfeld's realization of quantum affine algebras. J. Geom. Phys. 11, 445–452 (1993)
23. Kostant, B.: On Whittaker vectors and representation theory. Invent. Math. 48, 101–184 (1978)
24. Lu, J.H.: Momentum mapping and reduction of Poisson actions. In: Symplectic geometry, groupoids and integrable systems, Berkeley, 1989. P. Dazord and A. Weinstein (eds.), Berlin–Heidelberg–New York: Springer-Verlag, pp. 209–226
25. Lukyanov, S., Pugai, Ya.: Multi-point local height probabilities in the integrable RSOS models. Nucl. Phys. B473, 631 (1996)
26. Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Central extensions of quantum current groups. Lett. Math. Phys. 19, 133–142 (1990)
27. Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Quantum R-matrices and factorization problems. In: Geometry and Physics, essays in honor of I.M. Gelfand. S. Gindikin and I.M. Singer (eds.), pp. 533–550. Amsterdam–London–New York: North Holland, 1991
28. Semenov-Tian-Shansky, M.A.: What is a classical r-matrix. Funct. Anal. Appl. 17, 17–33 (1983)
29. Semenov-Tian-Shansky, M.A.: Dressing action transformations and Poisson–Lie group actions. Publ. RIMS. 21, 1237–1260 (1985)
30. Semenov-Tian-Shansky, M.A.: Poisson Lie groups, quantum duality principle and the quantum double. Contemporary Math. 175, 219–248
31. Semenov-Tian-Shansky, M.A., Sevostyanov, A.V.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras II. General semisimple case. Preprint q-alg/9702016
32. Shiraishi, J., Kubo, H., Awata, H., Odake, S.: A quantum deformation of the Virasoro algebra and the Macdonald symmetric functions. Lett. Math. Phys. 38, 33–51 (1996)
33. Steinberg, R.: Regular elements of semisimple algebraic Lie groups. Publ. Math. I.H.E.S. 25, 49–80 (1965)
34. Takhtajan, L.A., Faddeev, L.D.: Liouville model on the lattice. Lect. Notes in Phys. 246, 166–179 (1986)
35. Volkov, A.Yu.: Miura transformation on a lattice. Theor. Math. Phys. 74, 96–99 (1988)
36. Volkov, A.Yu.: Quantum Volterra model. Preprint HU-TFT-92-6 (1992)
37. Weinstein, A.: Local structure of Poisson manifolds. J. Different. Geom. 18, 523–558 (1983)

Communicated by T. Miwa

Commun. Math. Phys. 192, 631 – 647 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Drinfeld–Sokolov Reduction for Difference Operators and Deformations of W-Algebras II. The General Semisimple Case

M. A. Semenov-Tian-Shansky (1,3), A. V. Sevostyanov (2,3)

1 Université de Bourgogne, Dijon, France
2 Institute of Theoretical Physics, Uppsala University, Uppsala, Sweden
3 Steklov Mathematical Institute, St. Petersburg, Russia

Received: 27 April 1997 / Accepted: 22 August 1997

Abstract: The paper is the sequel to [9]. We extend the Drinfeld–Sokolov reduction procedure to q-difference operators associated with arbitrary semisimple Lie algebras. This leads to a new elliptic deformation of the Lie bialgebra structure on the associated loop algebra. The related classical r-matrix is explicitly described in terms of the Coxeter transformation. We also present a cross-section theorem for q-gauge transformations which generalizes a theorem due to R. Steinberg. Introduction The present paper is the sequel to [9]; we refer the reader to this paper for a general introduction. Our goal is to extend the results of [9] to arbitrary semisimple Lie algebras. As an intermediate step we develop a group-theoretic framework for abstract difference equations associated with arbitrary semisimple Lie groups. A similar problem for differential equations which was solved by Drinfeld and Sokolov is linear, since it involves only the structure of the corresponding semisimple Lie algebra. Difference equations lead to the study of specific submanifolds in Lie groups which are closely related to some of its Bruhat cells. Our main technical result is a cross-section theorem for the q-gauge transformations which generalizes a theorem due to R. Steinberg. The reduction scheme outlined in [9] extends to the abstract case as well and the general conclusion remains the same: the consistency condition for the reduction imposes very rigid conditions on the underlying classical r-matrix which fix it completely. The resulting classical r-matrix is new; it yields an elliptic deformation of the Lie bialgebra structure on the loop algebras associated with so-called Drinfeld’s new realizations of quantized affine algebras [3, 10]. The explicit characterization of the set of abstract q-difference operators leads to a very simple formula for this r-matrix in terms of the Coxeter element of the corresponding Weyl group. 
(The role of Coxeter transformations in the theory of q-difference operators should be compared with the role of the dual Coxeter transformations in the representation theory of affine Lie algebras at the critical level which is implicit in [7, 8]). The


Coxeter element also plays a key role in the definition and the study of the generalized Miura transformation.

The structure of the paper is as follows. Section 1 gives a description of the set of abstract q-difference operators associated to an arbitrary complex semisimple Lie group $G$. This description is based on the cross-section theorem referred to above; its proof is postponed until Sect. 3. In Sect. 2 we define a class of Poisson covariant Poisson structures on the set of q-difference $G$-valued connections; this definition is preceded by the description of a class of Lie bialgebra structures on loop algebras. As compared to [9], we need more details on the Poisson theory of q-gauge transformations, including the theory of the double and the twisted factorization. We then formulate our main theorem which gives an explicit description of the (unique) classical r-matrix which is compatible with the Drinfeld–Sokolov reduction for q-difference operators associated with $G$. At the end of Sect. 2 we also briefly discuss the reduction for finite difference operators on the lattice which yields a definition of the classical lattice $W(g)$-algebra extending the definition of the lattice Virasoro algebra discussed in [9]. Section 4 contains a description of the generalized Miura transformations. Finally, in Sect. 5 we compare our formulae with the formulae of Frenkel and Reshetikhin [8] for the $sl(n)$ case.

1. Abstract q-Difference Operators

The following notation will be used throughout the paper. Let $G$ be a connected simply connected finite-dimensional complex semisimple Lie group, $g$ its Lie algebra. Fix a Cartan subalgebra $h \subset g$ and let $\Delta$ be the set of roots of $(g, h)$. Choose an ordering in the root system; let $\Delta_+$ be the system of positive roots and $\{\alpha_1, \dots, \alpha_l\}$, $l = \operatorname{rank} g$, the corresponding set of simple roots. For $\alpha \in \Delta$, $\alpha = \sum_{i=1}^{l} n_i \alpha_i$, we define its height by
\[ \operatorname{height} \alpha = \sum_{i=1}^{l} n_i. \]
Let $e_\alpha \in g$ be a root vector which corresponds to $\alpha \in \Delta$. Let
\[ b = h + \sum_{\alpha \in \Delta_+} \mathbb{C} e_\alpha \]
be the corresponding Borel subalgebra and
\[ \bar{b} = h + \sum_{\alpha \in \Delta_+} \mathbb{C} e_{-\alpha} \]
the opposite Borel subalgebra; let $n = [b, b]$ and $\bar{n} = [\bar{b}, \bar{b}]$ be their respective nil-radicals. We shall fix a nondegenerate invariant bilinear form $(\,,\,)$ on $g$. Let $H = \exp h$, $N = \exp n$, $\bar{N} = \exp \bar{n}$, $B = HN$, $\bar{B} = H\bar{N}$ be the Cartan subgroup, the maximal unipotent subgroups and the Borel subgroups of $G$ which correspond to the Lie subalgebras $h$, $n$, $\bar{n}$, $b$ and $\bar{b}$, respectively.

Let $W$ be the Weyl group of $(g, h)$; we shall denote a representative of $w \in W$ in $G$ by the same letter. Let $s_1, \dots, s_l$ be the reflections which correspond to simple roots; let $s = s_1 s_2 \cdots s_l$ be a Coxeter element. Put $N' = \{v \in N;\ s v s^{-1} \in \bar{N}\}$; it is easy to see that $N' \subset N$ is an abelian subgroup, $\dim N' = l$. Put $M^s = N s^{-1} N$; it is clear that $N s^{-1} N = N' s^{-1} N$ and that $N' \times N \to M^s : (n', n) \mapsto n' s^{-1} n$ is a diffeomorphism.


Let $\mathcal{G} = LG$ be the loop group of $G$;¹ the group $G$ will be identified with the subgroup of constant loops in $\mathcal{G}$. Fix $q \in \mathbb{C}$, $|q| < 1$, and let $\tau$ be the automorphism of $\mathcal{G}$ defined by $g^\tau(z) = g(qz)$. We shall denote the corresponding automorphism of the loop algebra $Lg$ by the same letter. Let $\mathcal{C}$ be another copy of $\mathcal{G}$ equipped with the q-gauge action of $\mathcal{G}$,
\[ \mathcal{G} \times \mathcal{C} \to \mathcal{C}; \qquad (v, L) \mapsto v^\tau L v^{-1}. \tag{1.1} \]
The space $\mathcal{C}$ will be sometimes referred to as the space of q-difference connections (with values in $G$). Let $\mathcal{M}^s$ be the cell in $\mathcal{C}$ consisting of loops with values in $M^s$, and $\mathcal{S} = LN' s^{-1}$.

¹ We may define $\mathcal{G}$, e.g., as the group of points of $G$ over the local field $\mathbb{C}((z))$, or over the ring of holomorphic functions regular in $\mathbb{C} \setminus \{0\}$. The concrete choice is in fact irrelevant for the validity of Theorem 1.1.

Theorem 1.1. The restriction of the gauge action (1.1) to $LN \subset \mathcal{G}$ leaves the cell $\mathcal{M}^s \subset \mathcal{C}$ invariant. The action of $LN$ on $\mathcal{M}^s$ is free and $\mathcal{S}$ is a cross-section of this action.

The proof will be given in Sect. 3. (Its special case which corresponds to $G = SL(n)$ is presented in [9].)

Remark. A closely related theorem is due to Steinberg [17] who discussed the action of an algebraic semisimple Lie group on itself by conjugations. (In other words, the situation considered in [17] corresponds to the trivial automorphism $\tau = \mathrm{id}$.) Steinberg's theorem asserts that if $G$ is defined over an algebraically closed field, $N' s^{-1} \subset G$ is a cross-section of the set of regular conjugacy classes in $G$. In Theorem 1.1 we replace the action of the entire group on itself by the action of its unipotent subgroup on its affine subvariety $\mathcal{M}^s$. In the context of Lie algebras a similar problem was studied by B. Kostant [12], again in the case of trivial $\tau$.

To motivate the above definitions let us discuss the case $G = SL(n)$. Let us choose an order in the root system of $sl(n)$ in such a way that positive root vectors correspond to lower triangular matrices. We may order the simple roots and choose the matrices $s_k$, $k = 1, \dots, n-1$, representing the corresponding reflections in such a way that the Coxeter element $s = s_1 s_2 \cdots s_{n-1}$ is represented by the matrix
\[ s = \begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
-1 & 0 & \cdots & 0 & 0 \\
0 & -1 & \cdots & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & -1 & 0
\end{pmatrix}. \]
Then the group $N'$ consists of lower triangular unipotent matrices of the form
\[ u = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & \cdots & 1 & 0 \\
* & * & \cdots & * & 1
\end{pmatrix}, \]
the set $M^s$ consists of all unimodular matrices of the form
\[ A = \begin{pmatrix}
* & -1 & 0 & \cdots & 0 \\
* & * & -1 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
* & * & * & \cdots & -1 \\
* & * & * & \cdots & *
\end{pmatrix}, \]
and the set $N' s^{-1}$ is the set of all companion matrices of the form
\[ L = \begin{pmatrix}
0 & -1 & 0 & \cdots & 0 & 0 \\
0 & 0 & -1 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & -1 \\
1 & u_1 & u_2 & \cdots & u_{n-2} & u_{n-1}
\end{pmatrix}. \]

Let us associate to $L$ a first order difference equation
\[ \tau \cdot \psi + L\psi = 0, \]
where $\psi = (\psi_1, \psi_2, \dots, \psi_n)^t$ is a column vector, $\psi_k \in \mathbb{C}((z))$. It is easy to see that its first component $\phi := \psi_1$ satisfies an $n$th order difference equation,
\[ \tau^n \phi + u_{n-1} \tau^{n-1} \cdot \phi + u_{n-2} \tau^{n-2} \cdot \phi + \dots + u_1 \tau \cdot \phi + \phi = 0, \]
and, moreover, $\psi_k = \tau^{k-1}\phi$, $k = 1, 2, \dots, n$. Hence the set $LN' s^{-1}$ may be identified with the set $\mathcal{M}_{n,q}$ of all $n$th order q-difference operators.² In the general case we set, by definition, $\mathcal{M}_q(G) = \mathcal{M}^s/LN$; as we shall see, the manifold $\mathcal{M}_q(G)$ carries a natural Poisson structure and may be regarded as the spectrum of a classical q-W-algebra.

Remark. In [4] Drinfeld and Sokolov use a similar approach to define the set of abstract higher order differential operators associated to a given semisimple Lie algebra; the key observation that motivates their definition is that for $g = sl(n)$ the matrix
\[ f = \begin{pmatrix}
0 & -1 & 0 & \cdots & 0 & 0 \\
0 & 0 & -1 & \cdots & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & -1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix} \in g \]
is a principal nilpotent element. Accordingly, in the context of Lie algebras the set $\mathcal{M}^s \subset \mathcal{G}$ is replaced by the affine manifold $\mathcal{M}_f = f + Lb \subset Lg$; the main cross-section theorem then asserts that the gauge action of the subgroup $LN \subset LG$ leaves $\mathcal{M}_f$ invariant; its restriction to $\mathcal{M}_f$ is free and admits a cross-section which is an affine submanifold in $Lg$. This cross-section provides a model for the space of higher order differential operators. Notice that there exists a well known link between principal nilpotent elements in a semisimple Lie algebra and Coxeter elements of the corresponding Weyl group provided by a theorem due to Kostant [11].

² Note that in this paper we use slightly different conventions, as compared to [9]; in particular, we consider the q-gauge action by lower triangular matrices as opposed to upper triangular matrices.
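The $SL(n)$ picture described above can be tested on small examples. The sketch below checks, for $n = 4$, that a matrix $u \in N'$ times $s^{-1}$ is a companion matrix (with $u_k$ equal to minus the free entries of $u$), and that for $\psi_k = \tau^{k-1}\phi$ the first order system $\tau\psi + L\psi$ reduces to the scalar $n$th order operator applied to $\phi$. The coefficients $u_k$ are taken constant, and the sample function `phi`, the value of `Q` and all names are assumptions made for the illustration.

```python
from fractions import Fraction

def coxeter(n):
    """Matrix of s = s_1 ... s_{n-1}: 1 in the top-right corner, -1 on the subdiagonal."""
    s = [[0] * n for _ in range(n)]
    s[0][n - 1] = 1
    for i in range(1, n):
        s[i][i - 1] = -1
    return s

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def unipotent(last_row):
    """Element of N': identity matrix whose last row is (last_row..., 1)."""
    n = len(last_row) + 1
    u = [[int(i == j) for j in range(n)] for i in range(n)]
    u[n - 1][:n - 1] = last_row
    return u

def companion(coeffs):
    """Companion matrix: -1 on the superdiagonal, last row (1, u_1, ..., u_{n-1})."""
    n = len(coeffs) + 1
    c = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        c[i][i + 1] = -1
    c[n - 1] = [1] + list(coeffs)
    return c

Q = Fraction(1, 2)  # sample value of q

def phi(z):
    """Arbitrary sample function; it is not required to solve any equation."""
    return z ** 3 - 2 * z + 1 / (z + 1)

def residual(coeffs, z):
    """Components of tau psi + L psi at z, for psi_k = tau^{k-1} phi, (tau f)(z) = f(qz)."""
    n = len(coeffs) + 1
    L = companion(coeffs)
    psi = [phi(Q ** k * z) for k in range(n)]             # psi_{k+1}(z)
    tau_psi = [phi(Q ** (k + 1) * z) for k in range(n)]   # psi_{k+1}(qz)
    return [tau_psi[i] + sum(L[i][j] * psi[j] for j in range(n)) for i in range(n)]

def scalar_operator(coeffs, z):
    """(tau^n + u_{n-1} tau^{n-1} + ... + u_1 tau + 1) phi evaluated at z."""
    n = len(coeffs) + 1
    return phi(Q ** n * z) + sum(c * phi(Q ** (k + 1) * z) for k, c in enumerate(coeffs)) + phi(z)

v = [3, -2, 5]                                   # free entries of the last row of u
s_inv = transpose(coxeter(4))                    # s is a signed permutation, so s^{-1} = s^T
product_matrix = matmul(unipotent(v), s_inv)
```

Here `product_matrix` equals `companion([-3, 2, -5])`, the first $n-1$ components of `residual` vanish identically, and the last one equals `scalar_operator`, so the system holds exactly when $\phi$ solves the scalar equation.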


2. Poisson Structures for q-Difference Equations In this section we shall construct a class of Poisson structures on G = LG and on the space C of q-difference connections. Our main theorem asserts that there is a unique Poisson structure in this class which is compatible with the Poisson reduction over LN . We start with the description of a family of Lie bialgebra structures on loop algebras. 2.1. Factorizable Lie bialgebras associated with Lg. Let g = Lg be the loop algebra; we equip it with the standard invariant bilinear form, hX, Y i = Resz=0 (X (z) , Y (z)) dz/z. Notice that the automorphism τ satisfies hτ X, τ Y i = hX, Y i. Let d = g ⊕ g be the direct sum of two copies of g with the bilinear form hh(X1 , X2 ) , (Y1 , Y2 )ii = hX1 , Y1 i − hX2 , Y2 i .

(2.1)

Put $b = Lb$, $\bar b = L\bar b$, $n = Ln$, $\bar n = L\bar n$, $h = Lh$; let $\pi : b \to b/n$, $\bar\pi : \bar b \to \bar b/\bar n$ be the canonical homomorphisms; the quotient algebras $b/n$, $\bar b/\bar n$ may be canonically identified with h. Fix a linear operator θ ∈ End h which satisfies the following conditions:

1. $\langle \theta X, \theta Y\rangle = \langle X, Y\rangle$ for any X, Y ∈ h.
2. I − θ is invertible.

We shall assume, moreover, that θ extends to an automorphism of LH which we denote by the same letter. Put
\[ g^*_\theta = \{ (X_+, X_-) \in b \oplus \bar b \subset d;\ \bar\pi(X_-) = \theta\circ\pi(X_+) \}; \tag{2.2} \]
let ${}^\delta g \subset d$ be the diagonal subalgebra. The following assertion is well known.

Proposition 2.1. $(d, {}^\delta g, g^*_\theta)$ is a Manin triple.

Thus we get a family of Lie bialgebras $(g, g^*_\theta)$ with common double d = g ⊕ g parametrized by θ ∈ End h; all these bialgebras are factorizable. Let
\[ g = n \,\dot{+}\, h \,\dot{+}\, \bar n \]
be the 'pointwise Bruhat decomposition' of the loop algebra. Let $P_+, P_0, P_-$ be the corresponding projection operators which map g onto $n$, $h$, $\bar n$, respectively. The classical r-matrix associated with $(g, g^*_\theta)$ is the kernel of the linear operator ${}^\theta r_+ \in \operatorname{Hom}(g^*_\theta, g)$,
\[ {}^\theta r_+ = P_+ + (I - \theta)^{-1} P_0. \tag{2.3} \]

Let us also set ${}^\theta r_- := -\,{}^\theta r_+^{*}$; the classical Yang-Baxter identity implies that both ${}^\theta r_+$ and ${}^\theta r_-$ are Lie algebra homomorphisms from $g^*_\theta$ into g. In the definition of the Poisson structures it is sometimes convenient to use their skew-symmetric combination,
\[ {}^\theta r = \tfrac{1}{2}\left({}^\theta r_+ + {}^\theta r_-\right) = \tfrac{1}{2}\left(P_+ - P_- + \frac{1+\theta}{1-\theta}\,P_0\right); \tag{2.4} \]
the "perturbation term" in (2.4) is the Cayley transform of θ,
\[ {}^\theta r_0 = \frac{1+\theta}{1-\theta}\,P_0. \tag{2.5} \]
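The appearance of the Cayley transform can be checked on a single eigenvalue of θ. The sketch below is an illustration under stated assumptions: the Cartan part of $r_+$ is taken to be $(I-\theta)^{-1}$, the adjoint of an orthogonal θ is $\theta^{-1}$, and θ is replaced by a unimodular eigenvalue t ≠ 1. The function names are hypothetical.

```python
import cmath

def r_plus_cartan(t):
    """Eigenvalue of (I - theta)^{-1} on an eigenvector of theta with eigenvalue t."""
    return 1 / (1 - t)

def r_minus_cartan(t):
    """Eigenvalue of -((I - theta)^{-1})^*; for orthogonal theta the adjoint has eigenvalue 1/t."""
    return -1 / (1 - 1 / t)

def cayley(t):
    """Cayley transform (1 + t) / (1 - t)."""
    return (1 + t) / (1 - t)

for k in range(1, 12):
    t = cmath.exp(2j * cmath.pi * k / 12)
    total = r_plus_cartan(t) + r_minus_cartan(t)
    # The sum equals the Cayley transform, which is purely imaginary for |t| = 1,
    # in line with the skew-symmetry of the combination.
    print(k, abs(total - cayley(t)) < 1e-12, abs(cayley(t).real) < 1e-12)
```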

The mappings ${}^\theta r_\pm$ give rise to group homomorphisms ${}^\theta r_\pm : (LG)^* \to LG$ (which we denote by the same letters). The double d = g ⊕ g has a natural structure of a factorizable Lie bialgebra associated with the Manin triple $(d, {}^\delta g, g^*_\theta)$. Hence D = G × G is a Poisson Lie group which contains both G and its dual group as Poisson subgroups. More precisely, let $\pi : LB \to LB/LN$, $\bar\pi : L\bar B \to L\bar B/L\bar N$ be the canonical projections; the quotient groups $LB/LN$, $L\bar B/L\bar N$ may be canonically identified with LH. Let ${}^\delta G \subset G \times G$ be the diagonal subgroup and $G^* \subset D$ the subgroup defined by
\[ G^* = \{ (x_+, x_-) \in LB \times L\bar B;\ \theta\circ\pi(x_+) = \bar\pi(x_-) \}. \tag{2.6} \]

Proposition 2.2. (i) ${}^\delta G, G^* \subset D$ are Poisson Lie subgroups with tangent Lie bialgebras $(g, g^*_\theta)$ and $(g^*_\theta, g)$, respectively. (ii) Almost all elements x ∈ G admit a factorization $x = x_+ x_-^{-1}$, where $(x_+, x_-) \in G^*$.

We shall also need the related notion of twisted factorization.

Proposition 2.3. Suppose that the automorphism τ commutes with θ. Then almost all elements x ∈ G admit a twisted factorization
\[ x = x_+^{\tau}\, x_-^{-1}, \quad \text{where } (x_+, x_-) \in G^*. \tag{2.7} \]

Factorizations (2.6) and (2.7) are unique if we assume that both x and $x_+, x_-$ are sufficiently close to the unit element.

2.2. Gauge covariant Poisson structures and reduction. Assume that θ ∈ End h commutes with τ. In that case the space C of q-difference connections admits a natural Poisson structure which is covariant with respect to the q-gauge action G × C → C. We refer the reader to [15, 16] for its construction which is based on the notion of the twisted Heisenberg double. Let A be the affine ring of G = LG; by definition, A is generated by the coefficients of the (formal) Laurent expansion of the matrix coefficients of L ∈ G in some faithful finite dimensional representation of G. Speaking informally, we may regard the elements of A as smooth functions on G or on C; the difference between the two options is in the choice of the Poisson structure. We shall write $C^\infty(C)$ instead of A to stress the choice of this underlying Poisson structure on A. For any $\varphi \in C^\infty(C)$ we denote by $\nabla\varphi$, $\nabla'\varphi$ its left and right gradients.

Theorem 2.4. (i) For any θ ∈ End h satisfying conditions (1), (2) the bracket
\[ \{\varphi, \psi\}_\tau = \langle {}^\theta r(\nabla\varphi), \nabla\psi\rangle + \langle {}^\theta r(\nabla'\varphi), \nabla'\psi\rangle - \langle \tau\circ{}^\theta r_+(\nabla'\varphi), \nabla\psi\rangle - \langle {}^\theta r_-\circ\tau^{-1}(\nabla\varphi), \nabla'\psi\rangle \tag{2.8} \]

satisfies the Jacobi identity. (ii) Equip the space C of q-difference connections with the Poisson structure (2.8); then the q-gauge action defines a Poisson mapping G × C → C. (iii) The subgroup N = LN ⊂ G is admissible and hence N -invariant functions form a Poisson subalgebra in C ∞ (C).


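(We refer the reader to [9] for a general definition of admissible subgroups.) Below we shall need another formula for the Poisson bracket $\{\,,\}_\tau$ which is related to the twisted factorization in G. Let $\varphi \in C^\infty(C)$; we define $Z_\varphi \in g$ by the following relation:
\[ r_+ Z_\varphi - \tau^{-1}\cdot r_- Z_\varphi = \nabla'\varphi. \]
Let h ∈ C be an element admitting twisted factorization, $h = h_+^{\tau} h_-^{-1}$; then
\[ \{\varphi, \psi\}_\tau(h) = \langle \operatorname{Ad} h_+\cdot\tau^{-1}\cdot Z_\varphi - \operatorname{Ad} h_-\, Z_\varphi,\ \nabla\psi\rangle - \langle \nabla\varphi,\ \operatorname{Ad} h_+\cdot\tau^{-1}\cdot Z_\psi - \operatorname{Ad} h_-\, Z_\psi\rangle. \tag{2.9} \]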

We can now state our second main theorem. For w ∈ W let $R_w \in \operatorname{End} Lh$ be the linear operator acting in the loop algebra, $(R_w H)(z) = \operatorname{Ad} w\cdot(H(z))$.

Theorem 2.5. The quotient space $M_s/N = S$ is a Poisson submanifold in C/N if and only if the endomorphism θ is given by $\theta = R_s\cdot\tau$, where s ∈ W is the Coxeter element.

Notice that the Coxeter transformation satisfies conditions (1), (2) imposed above; since it preserves the root and weight lattices in h, it gives rise to an automorphism of LH. In the case when g = sl(2), Theorem 2.5 was proved in [9] (Theorem 2); in this case $R_s = -\mathrm{Id}$. For future reference let us write down explicitly the twisted factorization problem associated with the r-matrix ${}^\theta r$.

Proposition 2.6. Assume that $\theta = R_s\cdot\tau$. The twisted factorization problem in the loop group G associated with the r-matrix ${}^\theta r$ amounts to the relation
\[ x = y_+ y_-^{-1}, \quad y_+ \in LB,\ y_- \in L\bar B,\ \bar\pi(y_-) = s(\pi(y_+)). \tag{2.10} \]
We shall denote by $G' \subset G$ the open subset consisting of elements admitting twisted factorization described in (2.10). We shall now explicitly describe the kernel of the corresponding classical r-matrix. The relevant part of this kernel is the 'perturbation term' $r_0$ which was defined in (2.5). In the present situation we have
\[ r_0 = \frac{I + R_s\cdot\tau}{I - R_s\cdot\tau}\,P_0. \tag{2.11} \]
Let $H_p \in h$, p = 1, ..., l, be the eigenvectors of the Coxeter element, $\operatorname{Ad} s(H_p) = e^{2\pi i k_p/h} H_p$ (here h is the Coxeter number and $k_1, \ldots, k_l$ are the exponents of g). There exists a permutation σ of the set {1, ..., l} such that the basis $\{H_{\sigma p}\}$ is biorthogonal to $\{H_p\}$; we may assume that $\langle H_p, H_{\sigma p}\rangle = 1$. Then
\[ (r_0 X)(z) = \sum_{n=-\infty}^{\infty}\sum_{p=1}^{l} z^n\, \frac{1 + q^n \exp\frac{2\pi i k_p}{h}}{1 - q^n \exp\frac{2\pi i k_p}{h}}\, \langle X_n, H_{\sigma p}\rangle\, H_p, \quad q \in \mathbb{C},\ |q| < 1. \tag{2.12} \]

Note that the Lie bialgebra studied by Drinfeld in [3] (see also [10]) corresponds to the “crystalline” limit q → 0 in (2.12); in this case r0 amounts to the Hilbert transform in h. It is convenient to write

\[ r_0(q, z) = \sum_{p=1}^{l} \psi_p(q, z)\, H_p \otimes H_{\sigma p}, \tag{2.13} \]
where
\[ \psi_p(q, z) = \sum_{n=-\infty}^{\infty} \frac{1 + q^n \exp\frac{2\pi i k_p}{h}}{1 - q^n \exp\frac{2\pi i k_p}{h}}\, z^n. \tag{2.14} \]
The functions $\psi_p$ satisfy the q-difference equations
\[ \psi_p(q, z) - \exp\frac{2\pi i k_p}{h}\cdot\psi_p(q, qz) = \delta(z) + \exp\frac{2\pi i k_p}{h}\cdot\delta(qz), \tag{2.15} \]
where
\[ \delta(z) = \sum_{n=-\infty}^{\infty} z^n. \tag{2.16} \]
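The q-difference equation for $\psi_p$ is a coefficientwise statement. Writing $\omega = \exp(2\pi i k_p/h)$ and $a_n$ for the coefficient of $z^n$ in (2.14), the coefficient of $z^n$ in $\psi_p(q,z) - \omega\,\psi_p(q,qz)$ is $a_n(1 - \omega q^n)$, while that of $\delta(z) + \omega\,\delta(qz)$ is $1 + \omega q^n$. The sketch below checks this identity numerically for sample values; the values of h, k and q are arbitrary choices made only for the illustration.

```python
import cmath

def a(n, q, omega):
    """Coefficient of z**n in psi_p, read off from (2.14)."""
    return (1 + q**n * omega) / (1 - q**n * omega)

h, k = 6, 5                       # sample Coxeter number and exponent (illustrative values)
omega = cmath.exp(2j * cmath.pi * k / h)
q = 0.37                          # any 0 < |q| < 1

for n in range(-8, 9):
    lhs = a(n, q, omega) * (1 - omega * q**n)   # z**n coefficient of psi(z) - omega*psi(qz)
    rhs = 1 + omega * q**n                      # z**n coefficient of delta(z) + omega*delta(qz)
    print(n, abs(lhs - rhs) <= 1e-9 * max(1.0, abs(rhs)))
```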

2.3. Proof of Theorem 2.5. We briefly recall the geometric criterion that allows one to check that a submanifold of a quotient Poisson manifold is itself a Poisson manifold. Let M be a Poisson manifold, π : M → B a Poisson submersion. Hamiltonian vector fields $\xi_\varphi$, $\varphi \in \pi^* C^\infty(B)$, generate an integrable distribution $H_\pi$ in TM.

Proposition 2.7. Let V ⊂ M be a submanifold; W = π(V) ⊂ B is a Poisson submanifold if and only if V is an integral manifold of $H_\pi$.

Assume that this condition holds true; let $N_V \subset T^*M|_V$ be the conormal bundle of V; clearly, $T^*V \simeq T^*M|_V / N_V$. Let $\varphi, \psi \in C^\infty(W)$; put $\varphi^* = \pi^*\varphi|_V$, $\psi^* = \pi^*\psi|_V$. Let $d\varphi, d\psi \in T^*M|_V$ be any representatives of $d\varphi^*, d\psi^* \in T^*V$. Let $P_M \in \bigwedge^2 \operatorname{Vect} M$ be the Poisson tensor.

Proposition 2.8. We have
\[ \pi^*\{\varphi, \psi\}|_V = \langle P_M,\ d\varphi \wedge d\psi\rangle; \tag{2.17} \]

in particular, the r.h.s. does not depend on the choice of $d\varphi$, $d\psi$.

Let us now apply Proposition 2.7 in the setting of Theorem 2.5. It is sufficient to check that the Hamiltonian vector fields generated by N-invariant functions on C are tangent to $M_s$ if and only if $r_0$ is given by (2.11). Let $\varphi \in C^\infty(C)^N$; then $\varphi(v^\tau L) = \varphi(Lv)$ for all v ∈ N, and hence $Z := \nabla\varphi - \tau\cdot\nabla'\varphi \in b$. Since $\nabla'\varphi(L) = \operatorname{Ad} L^{-1}\cdot\nabla\varphi(L)$, we rewrite the Poisson bracket on C in the following form:
\[ \{\varphi, \psi\}_\tau(L) = \langle rZ + Z - \operatorname{Ad} L\cdot r\cdot\tau^{-1}\cdot Z + \operatorname{Ad} L\cdot\tau^{-1}\cdot Z,\ \nabla\psi(L)\rangle; \]
thus in the left trivialization of TC the Hamiltonian field generated by ϕ has the following form:
\[ \xi_\varphi(L) = rZ + Z - \operatorname{Ad} L\cdot r\cdot\tau^{-1}\cdot Z + \operatorname{Ad} L\cdot\tau^{-1}\cdot Z. \]
Assume that $L \in M_s$, $L = v s^{-1} u$, $v \in N'$, $u \in N$. Put $Z = Z_0 + Z_+$, $Z_0 \in h$, $Z_+ \in n$. Then
\[ \xi_\varphi(L) = r_0 Z_0 + Z_0 + s^{-1}\tau^{-1}\cdot Z_0 - s^{-1}\tau^{-1}\cdot r_0 Z_0 + X + \operatorname{Ad} v\cdot s^{-1}\cdot Y, \]
where $X \in n'$, $Y \in n$. On the other hand, in the left trivialization of TC the tangent space $T_L M_s$ is identified with $n' + \operatorname{Ad} v\cdot s^{-1}\cdot n$. Hence $\xi_\varphi$ is tangent to $M_s$ if and only if its h-component vanishes, i.e., if
\[ r_0 + \mathrm{Id} + \operatorname{Ad} s^{-1}\tau^{-1} - \operatorname{Ad} s^{-1}\tau^{-1}\, r_0 = 0, \]
which is equivalent to (2.11).

2.4. Lattice W-algebras. Let Γ = Z/NZ be a finite periodic lattice. Set $G = G^\Gamma$, $g = g^\Gamma$. Let τ be the automorphism of G induced by the cyclic permutation on Γ, $(x^\tau)_i = x_{i+1 \bmod N}$. We define lattice gauge transformations by $g\cdot x = g^\tau x g^{-1}$. The definition of gauge covariant Poisson brackets on the space of q-difference connections has its obvious lattice counterpart. In an equally obvious way one may construct a class of Lie bialgebra structures on $g = g^\Gamma$ which is compatible with reduction over the unipotent subgroup $N = N^\Gamma$. Namely, let us consider the "pointwise Bruhat decomposition"
\[ g = n \,\dot{+}\, h \,\dot{+}\, \bar n, \quad n = n^\Gamma,\ h = h^\Gamma,\ \bar n = \bar n^\Gamma, \]
and set
\[ {}^\theta r = P_+ - P_- + \frac{I + \theta}{I - \theta}\,P_0, \quad \theta \in \operatorname{End} h. \]
We shall omit the details and formulate only the lattice counterpart of the main theorems. Let as usual s ∈ G be a Coxeter element.

Theorem 2.9. (i) The restriction of the gauge action to $N = N^\Gamma$ leaves the subset $M_s = N s^{-1} N$ invariant. (ii) The restricted action is free and $S = N' s^{-1}$, $N' = (N')^\Gamma$, is its cross-section.

We shall assume for simplicity that the lattice length N is relatively prime with the Coxeter number.

Theorem 2.10. The quotient $M_s/N$ is a Poisson submanifold in the reduced space if and only if $\theta = R_s\cdot\tau$.

Remark. The condition on Γ assures that I − θ is invertible; it is likely that reduction is possible even without this assumption, but this question needs further study. For G = SL(2) this reduction was studied in detail in [9], Sect. 6. It gives a discrete version of the Virasoro algebra, which is closely connected to the lattice Virasoro algebra of [5].
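The role of the coprimality assumption can be made concrete for G = SL(n). The sketch below rests on assumptions that are not spelled out in the text: the Coxeter number is h = n, the Coxeter element has eigenvalues exp(2πik/n), k = 1, ..., n − 1, on the Cartan subalgebra, the cyclic shift τ on Z/NZ has eigenvalues exp(2πij/N), and θ = R_s·τ acts on $h^\Gamma$ with the products of these as eigenvalues. Under these assumptions I − θ is invertible exactly when no product equals 1.

```python
import cmath
from math import gcd

def min_gap(n, N):
    """Smallest |1 - lam| over the assumed eigenvalues lam of theta = R_s . tau for SL(n) on Z/NZ."""
    best = float("inf")
    for k in range(1, n):          # Coxeter eigenvalues exp(2*pi*i*k/n)
        for j in range(N):         # cyclic-shift eigenvalues exp(2*pi*i*j/N)
            lam = cmath.exp(2j * cmath.pi * (k / n + j / N))
            best = min(best, abs(1 - lam))
    return best

def i_minus_theta_invertible(n, N, tol=1e-9):
    """True when no eigenvalue of theta is (numerically) equal to 1."""
    return min_gap(n, N) > tol

for n, N in [(2, 3), (2, 4), (3, 4), (3, 6), (5, 7)]:
    print(n, N, gcd(n, N) == 1, i_minus_theta_invertible(n, N))
```

In this model the two printed flags agree, which is consistent with the remark that the condition on the lattice length guarantees invertibility of I − θ.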

3. The Cross-Section Theorem

We shall prove the following assertion.

Theorem 3.1. For each $L \in N' s^{-1} N$ there exists a unique element n ∈ N such that $n^\tau\cdot L\cdot n^{-1} \in N' s^{-1}$.


Let $C_h \subset W$ be the cyclic subgroup generated by the Coxeter element. $C_h$ has exactly l = rank g different orbits in the root system $\Delta(g, h)$. The proof depends on the structure of these orbits; for this reason we have to distinguish several cases.³

Proposition 3.2. The theorem is true for g of type $A_l$.

An elementary proof which is based on the matrix algebra is given in [9]; below we present a different proof which uses only the properties of the corresponding root system.

Lemma 3.3. (i) Each orbit of $C_h$ in $\Delta(g, h)$ consists of exactly h elements; one can order these orbits in such a way that the k-th orbit contains all positive roots of height k and all negative roots of height h − k. Put
\[ n_k = \bigoplus_{\{\alpha\in\Delta_+,\ \operatorname{ht}\alpha = k\}} n_\alpha, \quad N_k = \exp n_k, \]
and let $LN_k$ be the corresponding loop group. For each k we can choose $\alpha_k \in \Delta_+$ in such a way that
\[ n_k = \bigoplus_{p=0}^{h-k-1} n_{s^p(\alpha_k)}. \]
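The orbit structure described in Lemma 3.3 can be verified directly for sl(n), where h = n and l = n − 1. The following sketch assumes the usual model in which roots are $e_i - e_j$ (encoded as index pairs), positive roots are those with i < j, and the Coxeter element acts as the cyclic shift of indices; this choice of positive roots is an assumption of the illustration and differs from the lower triangular convention of the text only by relabelling.

```python
def coxeter_orbits(n):
    """Orbits of the cyclic shift i -> i+1 (mod n) on the roots e_i - e_j of sl(n)."""
    roots = [(i, j) for i in range(n) for j in range(n) if i != j]
    seen, orbits = set(), []
    for root in roots:
        if root in seen:
            continue
        orbit, cur = [], root
        while cur not in seen:
            seen.add(cur)
            orbit.append(cur)
            cur = ((cur[0] + 1) % n, (cur[1] + 1) % n)
        orbits.append(orbit)
    return orbits

def height(root):
    """Signed height of e_i - e_j: j - i, positive for positive roots."""
    i, j = root
    return j - i

n = 5
for orbit in coxeter_orbits(n):
    positive = sorted(height(r) for r in orbit if height(r) > 0)
    negative = sorted(-height(r) for r in orbit if height(r) < 0)
    # Each orbit has n elements: positive roots of one height k, negative roots of height n - k.
    print(len(orbit), positive, negative)
```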

Put $n_k^p = n_{s^p(\alpha_k)}$, $N_k^p = \exp n_k^p$, and let $LN_k^p$ be the corresponding loop group. Let $L = v\cdot s^{-1}\cdot u$, $v \in N'$, $u \in N$; we must find n ∈ N such that
\[ n^\tau\cdot v\cdot s^{-1}\cdot u = v_0\cdot s^{-1}\cdot n. \tag{3.1} \]
For any n ∈ N there exists a factorization $n = n_1 n_2 \cdots n_l$, where $n_k \in LN_k$; moreover, each $n_k$ may be factorized as
\[ n_k = n_k^0\, n_k^1 \cdots n_k^{h-k-1}, \quad n_k^p \in LN_k^p. \]
For any n ∈ N the element $n^\tau\cdot v\cdot s^{-1}\cdot u$ admits a representation
\[ n^\tau\cdot v\cdot s^{-1}\cdot u = \tilde v\, s^{-1}\, \tilde u, \quad \tilde v \in N',\ \tilde u \in N; \]
let
\[ \tilde u = \overrightarrow{\prod_{k=1}^{l}}\ \overrightarrow{\prod_{p=0}^{h-k-1}} \tilde u_k^p, \quad \tilde u_k^p \in LN_k^p, \]
be the corresponding factorization of $\tilde u$. For x ∈ G we write $s(x) := s\cdot x\cdot s^{-1}$ (this notation will be frequently used in the sequel).

Lemma 3.4. We have $\tilde u_k^p = \tau\cdot s(n_k^{p-1})\, V_k^p$, where the factors $V_k^p \in LN_k^p$ depend only on u, v and on $n_j^q$ with j < k.

³ The proofs given below do not apply when g is a simple Lie algebra of type $E_6$.


Assume now that n satisfies (3.1); then we have $\tilde v = v_0$, $\tilde u = n$. This leads to the following relations:
\[ \tau\cdot s(n_k^{p-1})\, V_k^p = n_k^p, \tag{3.2} \]
where we set formally $n_k^{-1} = 1$.

Proposition 3.5. The system (3.2) may be solved recursively starting with k = 1, p = 0. Clearly, the solution is unique.

This concludes the proof for g of type $A_l$. Let now g be a simple Lie algebra of type other than $A_l$ and $E_6$, l its rank.

Lemma 3.6. (i) The Coxeter number h(g) is even. (ii) Each orbit of $C_h$ in $\Delta(g, h)$ consists of exactly h elements and contains an equal number of positive and negative roots. (iii) Put
\[ \Delta_+^p = \{\alpha \in \Delta_+;\ s^{-p}\cdot\alpha \notin \Delta_+\}, \quad n^p = \bigoplus_{\alpha\in\Delta_+^p} \mathbb{C}\cdot e_\alpha; \]
then $n^p \subset n$ is an abelian subalgebra, $\dim n^p = l$.

When g is not of type $D_{2k+1}$ this assertion follows from [2] (Chapter 6, No 1.11, Prop. 33 and Chapter 3, No 6.2, Corr. 3). For g of type $D_{2k+1}$ it may be checked directly. Put $N^p = \exp n^p$; let $LN^p$ be the corresponding subgroup of the loop group G. Let $L = v\cdot s^{-1}\cdot u$, $v \in N'$, $u \in N$; we must find n ∈ N such that $v\cdot s^{-1}\cdot u = n^\tau\cdot v_0\cdot s^{-1}\cdot n^{-1}$. Put
\[ n = n_1 n_2 \cdots n_{h/2}, \quad n_p \in LN^p; \tag{3.3} \]
the elements $n_p$ will be determined recursively. Put $s^{-1}(w) = s^{-1}\cdot w\cdot s$, w ∈ N. We have
\[ v\cdot s^{-1}(u) = \tau\Big(\overrightarrow{\prod_p}\, n_p\Big)\cdot v_0\cdot s^{-1}\Big(\overleftarrow{\prod_p}\, n_p^{-1}\Big). \tag{3.4} \]
We shall say that an element x ∈ G is in the big cell in G if, for all values of the argument z, the value x(z) is in the big Bruhat cell $B\bar N \subset G$.

Lemma 3.7. $v\cdot s^{-1}(u)$ is in the big cell in G and admits a factorization $v\cdot s^{-1}(u) = x_+^1\cdot x_-^1$, $x_+^1 \in N$, $x_-^1 \in \bar N$.

Indeed, let $u = u_{h/2}\, u_{h/2-1} \cdots u_1$, $u_p \in LN^p$, be a similar decomposition of u; then we have simply $x_-^1 = s^{-1}(u_1)$. (It is clear that $x_+^1 \in B$ actually does not have an H-component and so belongs to N.)

A comparison of the r.h.s. in (3.4) with the Bruhat decomposition of the l.h.s. immediately yields that the first factor in (3.3) is given by $n_1 = s\big((x_-^1)^{-1}\big)$. Assume that $n_1, n_2, \ldots, n_{k-1}$ are already computed. Put $m_k = n_1 n_2 \cdots n_{k-1}$ and consider the element
\[ L^k := s^{-k+1}\big(\tau(m_k^{-1})\cdot v\cdot s^{-1}(u)\cdot s^{-1}(m_k)\big). \tag{3.5} \]


Lemma 3.8. $L^k$ is in the big cell in G and admits a factorization
\[ L^k = x_+^k\, x_-^k, \quad x_+^k \in N,\ x_-^k \in \bar N. \tag{3.6} \]
The elements $x_\pm^k$ are computed recursively from the known quantities. By applying a similar transform to the r.h.s. of (3.4) we get
\[ L^k = s^{-k+1}\Big(\tau(m_k^{-1})\cdot\tau\Big(\overrightarrow{\prod_p}\, n_p\Big)\cdot v_0\cdot s^{-1}\Big(\overleftarrow{\prod_p}\, n_p^{-1}\Big)\cdot s^{-1}(m_k)\Big) = s^{-k+1}\Big(\tau\Big(\overrightarrow{\prod_{p\ge k}}\, n_p\Big)\, v_0\Big)\cdot s^{-k}\Big(\overleftarrow{\prod_{p\ge k+1}}\, n_p^{-1}\Big)\cdot s^{-k}\big(n_k^{-1}\big). \tag{3.7} \]
A comparison of (3.7) and (3.6) yields $x_-^k = s^{-k}(n_k^{-1})$; hence $n_k = s^k\big((x_-^k)^{-1}\big)$, which concludes the induction.

4. Generalized Miura Transform

Our construction of the Miura transform for q-difference operators may be regarded as a nonlinear version of the corresponding construction for differential operators, due to Drinfeld and Sokolov [4]. Recall that the space of abstract differential operators associated with a given semisimple Lie algebra g is realized as the quotient space of the affine manifold $M_f = f + Lb \subset Lg$ (the translate of Lb by a fixed principal nilpotent element f ∈ n) over the gauge action of LN. The cross-section theorem of Drinfeld and Sokolov provides a global model S for this quotient space. It is easy to see that the affine submanifold f + Lh ⊂ f + Lb is a local cross-section of the gauge action $LN \times M_f \to M_f$ (i.e., the orbits of LN are transversal to f + Lh) and hence f + Lh provides a local model of the same quotient space. Thus we get a Poisson structure on f + Lh and a Poisson mapping f + Lh → S. The computation of the induced Poisson structure on f + Lh follows the general prescription of Dirac (see, e.g., [6]), but is in fact greatly simplified, since all correction terms in the Dirac formula identically vanish. One may notice that the affine manifold $f + Lh \subset M_f$ is the intersection of the level surfaces of two moment maps associated with the gauge actions of the opposite triangular subgroups $LN$ and $L\bar N$; it is this symmetry between $LN$ and $L\bar N$ that accounts for cancellations in the Dirac formula. The situation in the nonlinear case is exactly similar.

We pass to the formal description of our construction. Let $\bar B \subset G$ be the opposite Borel subgroup, $\bar N \subset \bar B$ its nilradical, $\bar B = L\bar B$, $\bar N = L\bar N$. Let us consider the Poisson reduction of the space C of q-difference connections over the action of the opposite gauge group $\bar N$. We equip C with the Poisson structure (2.8), where the choice of θ may be arbitrary.

Proposition 4.1. (i) The q-gauge action $\bar N \times C \to C$ leaves $\bar B \subset C$ invariant. (ii) Hamiltonian vector fields on C generated by gauge invariant functions $\varphi \in C^\infty(C)^{\bar N}$ are tangent to $\bar B \subset C$.

Corollary 4.2. $\bar B/\bar N \subset C/\bar N$ is a Poisson submanifold.

Remark. Heuristically, the submanifold $\bar B \subset C$ corresponds to reduction at the "zero level" of the moment, hence the constraints are automatically of the first class.


We shall now define an embedding $i : H \to M_s \cap \bar B$ into the intersection of two "level surfaces". Let $w_0 \in W$ be the longest element; let $\pi \in \operatorname{Aut}\Delta_+$ be the automorphism defined by $\pi(\alpha) = -w_0\cdot\alpha$, $\alpha \in \Delta_+$. Let $N_i \subset N$ be the 1-parameter subgroup generated by the root vector $e_{\pi(\alpha_i)}$, $\alpha_i \in P$. Choose an element $u_i \in N_i$, $u_i \ne 1$.

Lemma 4.3. [17] $w_0 u_i w_0^{-1} \in B s_i B$.

We may choose $u_i$ in such a way that $w_0 u_i w_0^{-1} \in N s_i N$. Put $x = u_l u_{l-1} \cdots u_1$; then $f := w_0 x w_0^{-1} \in N s^{-1} N \cap \bar N$.

Remark. The choice of $u_i$ is not unique; however, this non-uniqueness does not affect the arguments below.

Define the immersion $i : H \to G : x \mapsto x\cdot f\cdot s(x^{-1})$.

Proposition 4.4. $i(H) \subset N s^{-1} N \cap \bar B$.

Remark. By dimension count it is easy to see that i(H) is open in $N s^{-1} N \cap \bar B$. For G = SL(n) we have simply $i(H) = N s^{-1} N \cap \bar B$; it seems plausible that this is true in the general case as well.

We define the corresponding embedding i : H → G for loop groups associated with H, G by the same formula. Clearly, $i(H) \subset M_s \cap \bar B$.

Proposition 4.5. i(H) is a local cross-section of the gauge actions $N \times M_s \to M_s$, $\bar N \times \bar B \to \bar B$.

In other words, gauge orbits of $N$, $\bar N$ are transversal to $i(H) \subset M_s \cap \bar B$. Let us now assume that the Poisson structure on the space of q-difference connections is the one described in Theorem 2.5. We may consider i(H) as a (local) model of the reduced space $M_s/N$ obtained by "fixing the gauge" by means of the "subsidiary condition" $L \in \bar B$, or, alternatively, as a model of $\bar B/\bar N$ obtained by choosing the subsidiary condition $L \in M_s$. The choice of $r_0$ assures that both the "constraints" and the "subsidiary conditions" are of the first class. The reduced Poisson structure on i(H) may be expressed in terms of the Dirac bracket. As it appears, it is possible to avoid the actual computation of the "correction terms". We shall prove the following assertion.

Proposition 4.6. The quotient Poisson structure on i(H) is given by
\[ \{\varphi, \psi\}_{i(H)} = \langle \tilde P_H \nabla\varphi, \nabla\psi\rangle, \quad \tilde P_H = \frac{(\mathrm{Id} - \tau)(\mathrm{Id} - R_s)}{\mathrm{Id} - R_s\cdot\tau}. \tag{4.1} \]
It will be convenient to introduce another parametrization of the Cartan subgroup which is related to the twisted factorization problem (2.10) in G.

Lemma 4.7. $\bar B \subset G'$.

Proof. The twisted factorization problem in $\bar B$ (cf. (2.10)) amounts to the relation
\[ \bar b = x_+\cdot x_-^{-1}\, n_-^{-1}, \quad \text{where } x_+ \in H,\ x_- \in H,\ n_- \in \bar N \text{ and } x_- = s(x_+), \]
or, equivalently,
\[ \bar b = x\cdot s(x)^{-1}\, n_-^{-1}. \tag{4.2} \]


The same assertion of course holds true for $H \subset \bar B$; in that case we have $n_- = 1$. Let $\pi : \bar B \to H$ be the projection map which assigns to $h \in \bar B$ the element x ∈ H satisfying (4.2).

Lemma 4.8. Let H ⊂ G be the Cartan subgroup. The mapping $p : H \to H : x \mapsto x\cdot s(x)^{-1}$ is an immersion.

Put
\[ P_H = \frac{R_s\cdot(\tau - \mathrm{Id})}{(\mathrm{Id} - R_s)(\mathrm{Id} - R_s\cdot\tau)} \tag{4.3} \]
and define the Poisson bracket on H by
\[ \{\varphi, \psi\}_H = \langle P_H D\varphi, D\psi\rangle. \tag{4.4} \]
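The operators (4.1) and (4.3) are related in a simple way that can be tested on eigenvalues. The sketch below assumes that the differential of $p : x \mapsto x\cdot s(x)^{-1}$ acts on the Cartan subalgebra as $\mathrm{Id} - R_s$, with adjoint $\mathrm{Id} - R_s^{-1}$ with respect to the invariant form; pushing $P_H$ forward then gives $(\mathrm{Id} - R_s)\,P_H\,(\mathrm{Id} - R_s^{-1})$, which should coincide with $\tilde P_H$. Here w stands for an eigenvalue of $R_s$ and t for an eigenvalue $q^m$ of τ; the sample values are illustrative.

```python
import cmath

def P_H(w, t):
    """Eigenvalue of (4.3): R_s (tau - Id) / ((Id - R_s)(Id - R_s tau))."""
    return w * (t - 1) / ((1 - w) * (1 - w * t))

def P_H_tilde(w, t):
    """Eigenvalue of (4.1): (Id - tau)(Id - R_s) / (Id - R_s tau)."""
    return (1 - t) * (1 - w) / (1 - w * t)

def pushed_forward(w, t):
    """Eigenvalue of (Id - R_s) P_H (Id - R_s^{-1}), the assumed push-forward of P_H under p."""
    return (1 - w) * P_H(w, t) * (1 - 1 / w)

n, q = 5, 0.4                      # sample values: sl(5), |q| < 1
for k in range(1, n):
    w = cmath.exp(2j * cmath.pi * k / n)
    for m in range(-4, 5):
        t = q**m
        gap = abs(pushed_forward(w, t) - P_H_tilde(w, t))
        print(k, m, gap <= 1e-9 * max(1.0, abs(P_H_tilde(w, t))))
```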

Lemma 4.9. $p : (H, \{\,,\}_{P_H}) \to (H, \{\,,\}_{i(H)})$ is a Poisson mapping.

Hence to prove Proposition 4.6 we may use the Poisson bracket (4.4) instead of (4.1). Let $\varphi, \psi \in C^\infty(H)$. Let $\varphi^* = \varphi\circ\pi$, $\psi^* = \psi\circ\pi \in C^\infty(\bar B)$ be their lifts to $\bar B$ defined via the twisted factorization map. (In other words, $\varphi^*(\bar b) = \varphi(x)$, where $\bar b = x\cdot s(x)^{-1} n_-^{-1}$, x ∈ H.) In the right trivialization of the cotangent bundle of $\bar B$ the differential $d\varphi^*(h) \in \bar b^*$ of $\varphi^*$ is given by
\[ d\varphi^*(h) = -\tau^{-1} r_- \nabla\varphi, \tag{4.5} \]
where $\nabla\varphi \in h$ is the right invariant differential of ϕ evaluated at x = π(h), and similarly for $\psi^*$. The standard embedding $\bar b^* \subset g$ allows one to regard $d\varphi^*(h)$ as an element of g. To compute the Poisson bracket {ϕ, ψ} we may apply Proposition 2.8. We have
\[ \{\varphi, \psi\}(\pi(h)) = \langle P_C,\ d\varphi^*(h) \wedge d\psi^*(h)\rangle. \tag{4.6} \]
Using formula (2.9) and inserting the expression (4.5) for the differentials we get (4.4).

Let $c : M_s \to S$ be the canonical mapping which assigns to each $L \in M_s$ the unique element $L_0 \in S$ lying in the same N-orbit. The generalized Miura transform m is defined by $m = c\circ i : H \to S$.

Theorem 4.10. The generalized Miura transform is a Poisson mapping. The Poisson structure in H is given by (4.4); the Poisson structure in the target space is the reduced Poisson structure described in Theorem 2.5.

The proof immediately follows from the fact that i(H) and S are different models of the quotient space $M_s/N$. Note that for G = SL(2) our construction of the Miura transform coincides with the one described in [9].


5. The SL(n) Case

Our aim in this section is to compare the Poisson structures arising via the q-Drinfeld–Sokolov reduction with the results in [8]. (The case of n = 2 has been discussed in detail in [9]. Our analysis for general n is parallel to that of [9], Sect. 3, though our conventions are slightly different.)

To begin with, let us list the standard facts concerning the structure of SL(n). We keep to the choice of order in the root system of sl(n) made in Sect. 1, that is, positive root vectors correspond to lower triangular matrices. We order simple roots in such a way that the Coxeter element $s = s_1 s_2 \cdots s_{n-1}$ is acting on the Cartan subalgebra h as a cyclic permutation,
\[ s\cdot\operatorname{diag}(H_1, H_2, \ldots, H_n) = \operatorname{diag}(H_n, H_1, \ldots, H_{n-1}); \]
its representative in G = SL(n) is given by
\[ s^{-1} = \begin{pmatrix} 0 & -1 & 0 & \cdots & 0 \\ 0 & 0 & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & -1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}. \]
The automorphism $\pi = -w_0$ of h is given by
\[ \pi\cdot\operatorname{diag}(H_1, H_2, \ldots, H_n) = \operatorname{diag}(-H_n, -H_{n-1}, \ldots, -H_1). \]
We may choose the unipotent elements $u_i$, i = 1, 2, ..., n − 1, in such a way that the principal nilpotent element f constructed in Lemma 4.3 is given by
\[ f = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & -1 \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}; \]
the manifold $M_s$ consists of matrices of the form
\[ L = \begin{pmatrix} * & -1 & 0 & \cdots & 0 \\ * & * & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ * & * & * & \cdots & -1 \\ * & * & * & \cdots & * \end{pmatrix}. \]
Let $x = \operatorname{diag}(x_1, x_2, \ldots, x_n)$; the embedding $i : H \to M_s$ defined in Section 4 is given by
\[ i(x) := \Lambda = \begin{pmatrix} x_1 x_n^{-1} & -1 & 0 & \cdots & 0 \\ 0 & x_2 x_1^{-1} & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & -1 \\ 0 & 0 & 0 & \cdots & x_n x_{n-1}^{-1} \end{pmatrix}. \]
It is convenient to introduce affine coordinates on i(H) in the following way:
\[ \Lambda(z) = \begin{pmatrix} \Lambda_1(z) & -1 & 0 & \cdots & 0 \\ 0 & \Lambda_2(qz) & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & -1 \\ 0 & 0 & 0 & \cdots & \Lambda_n(q^{n-1}z) \end{pmatrix}. \]

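Let $L_{can} = m(\Lambda) \in S$ be the canonical form of Λ,
\[ L_{can} = \begin{pmatrix} 0 & -1 & 0 & \cdots & 0 \\ 0 & 0 & -1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & -1 \\ 1 & u_1(z) & u_2(z) & \cdots & u_{n-1}(z) \end{pmatrix}; \]
we put $u_p(z) := (-1)^{n-p-1} s_{n-p}(q^{n-p} z)$.

Proposition 5.1. ([9], Lemma 2) We have
\[ s_p(z) = \sum_{1 \le j_1 < j_2 < \ldots < j_p \le n} \Lambda_{j_1}(z)\, \Lambda_{j_2}(qz) \cdots \Lambda_{j_p}(q^{p-1} z). \tag{5.1} \]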
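Formula (5.1) is a q-shifted version of the elementary symmetric polynomials and is easy to evaluate. The sketch below is only an illustration: the coordinate functions $\Lambda_j$ are passed as ordinary Python callables, and the name `s_p` and the sample functions are assumptions of the example.

```python
from itertools import combinations
from math import prod

def s_p(lams, p, q, z):
    """Evaluate (5.1): sum over j1 < ... < jp of Lam_{j1}(z) Lam_{j2}(q z) ... Lam_{jp}(q^{p-1} z)."""
    return sum(
        prod(lams[j](q**r * z) for r, j in enumerate(idx))
        for idx in combinations(range(len(lams)), p)
    )

# Constant functions: (5.1) reduces to the elementary symmetric polynomials of 2, 3, 5.
constants = [lambda z, c=c: c for c in (2, 3, 5)]
print([s_p(constants, p, 7, 1) for p in (1, 2, 3)])

# All Lam_j(z) = z: every term of s_2 equals z * (q z).
linear = [lambda z: z] * 3
print(s_p(linear, 2, 2, 1))
```

For q = 1 the shifts disappear and $s_p$ becomes the p-th elementary symmetric polynomial in the values $\Lambda_j(z)$.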

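Proposition 5.2. The Poisson bracket on H is given by
\[ \{\Lambda_p(z), \Lambda_p(w)\} = \sum_{m=-\infty}^{\infty} \frac{(1 - q^m)\left(1 - q^{m(n-1)}\right)}{1 - q^{nm}} \left(\frac{z}{w}\right)^m \Lambda_p(z)\,\Lambda_p(w), \tag{5.2} \]
\[ \{\Lambda_p(z), \Lambda_s(w)\} = \sum_{m=-\infty}^{\infty} \frac{(1 - q^m)\left(1 - q^{-m}\right)}{1 - q^{nm}} \left(\frac{z}{w}\right)^m \Lambda_p(z)\,\Lambda_s(w), \quad s > p. \]
Proof. Let $\omega = \exp\frac{2\pi i}{n}$ be the primitive root of unity. The eigenvectors of s in h are
\[ e_k = \operatorname{diag}(1, \omega^{-k}, \ldots, \omega^{-(n-1)k}), \quad s\cdot e_k = \omega^k e_k, \quad k = 1, \ldots, n-1. \]
The kernel of the Poisson operator $\tilde P_H = \frac{(\mathrm{Id} - \tau)(\mathrm{Id} - R_s)}{\mathrm{Id} - \tau\cdot R_s}$ is given by the formal Laurent series
\[ \sum_{m=-\infty}^{\infty} \sum_{k=1}^{n-1} \frac{1}{n}\, \frac{(1 - q^m)\left(1 - \omega^k\right)}{1 - q^m \omega^k} \left(\frac{z}{w}\right)^m e_k \otimes e_{n-k}. \]
We have
\[ \{\Lambda_p(z), \Lambda_s(w)\} = \sum_{m=-\infty}^{\infty} \sum_{k=1}^{n-1} \frac{1}{n}\, \frac{(1 - q^m)\left(1 - \omega^k\right)}{1 - q^m \omega^k} \left(\frac{z}{w}\right)^m q^{m(s-p)}\, (e_k\cdot\Lambda(z))_{pp}\, (e_{n-k}\cdot\Lambda(w))_{ss} \]
\[ = \sum_{m=-\infty}^{\infty} \sum_{k=1}^{n-1} \frac{1}{n}\, \frac{(1 - q^m)\left(1 - \omega^k\right)}{1 - q^m \omega^k} \left(\frac{z}{w}\right)^m q^{m(s-p)}\, \omega^{k(s-p)}\, \Lambda_p(z)\,\Lambda_s(w). \tag{5.3} \]
Observe that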

\[ \frac{1}{n} \sum_{k} \frac{1 - \omega^k}{1 - q^m \omega^k}\, \omega^{k(s-p)} = \begin{cases} \dfrac{1 - q^{m(n-1)}}{1 - q^{mn}}, & \text{if } s = p, \\[2ex] q^{-m(s-p)}\, \dfrac{1 - q^{-m}}{1 - q^{mn}}, & \text{if } s > p. \end{cases} \tag{5.4} \]
Substituting (5.4) into (5.3), we get (5.2).

Formula (5.1) coincides with the q-deformed Miura transformation defined in [8]. Formula (5.2) coincides with the Poisson bracket on the $\Lambda_i(z)$'s derived in [8]. Therefore in the case of sl(n) the Poisson algebra obtained by the difference Drinfeld–Sokolov reduction coincides with the q-deformed W-algebra introduced in [8].

Acknowledgement. The present paper is part of a joint research project which was started by E. Frenkel and N. Reshetikhin together with the first author. We are indebted to B. Kostant who pointed out the paper [17] to us; one of the authors (M.A.S.T.S.) would like to thank G. Arutyunov for useful discussion. The work of the second author was partially supported by the Royal Swedish Academy of Sciences grant no. 1240.

References

1. Belavin, A.A., Drinfeld, V.G.: Solutions of the classical Yang-Baxter equation for simple Lie algebras. Funct. Anal. Appl. 16, 159–80 (1981)
2. Bourbaki, N.: Groupes et algèbres de Lie, ch. 4 à 6. Paris: Hermann, 1968
3. Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Sov. Math. Dokl. 36 (1988)
4. Drinfeld, V.G., Sokolov, V.V.: Lie algebras and equations of Korteweg–de Vries type. Sov. Math. Dokl. 23, 457–62 (1981); J. Sov. Math. 30, 1975–2035 (1985)
5. Faddeev, L.D., Volkov, A.Yu.: Abelian current algebras and the Virasoro algebra on the lattice. Phys. Lett. B315, 311–8 (1993)
6. Flato, M., Lichnerowicz, A., Sternheimer, D.: Deformation of Poisson brackets, Dirac brackets and applications. J. Math. Phys. 17, 1754 (1976)
7. Frenkel, E.: Affine Kac-Moody algebras at the critical level and quantum Drinfeld–Sokolov reduction. PhD Thesis, Harvard University, 1991
8. Frenkel, E., Reshetikhin, N.: Quantum affine algebras and deformations of the Virasoro algebra and W-algebras. Commun. Math. Phys. 178, 237–264 (1996); q-alg/9505025
9. Frenkel, E., Reshetikhin, N., Semenov-Tian-Shansky, M.A.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras I. The case of Virasoro algebra. Commun. Math. Phys. 192, 605–629 (1998)
10. Khoroshkin, S.M., Tolstoy, V.N.: On Drinfeld's realization of quantum affine algebras. J. Geom. Phys. 11, 445–452 (1993)
11. Kostant, B.: The principal three-dimensional subgroups and the Betti numbers of a complex simple Lie group. Am. J. Math. 81, 973–1032 (1959)
12. Kostant, B.: On Whittaker vectors and representation theory. Invent. Math. 48, 101–184 (1978)
13. Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 13–53 (1980)
14. Semenov-Tian-Shansky, M.A.: What is a classical r-matrix? Funct. Anal. Appl. 17, 17–33 (1983)
15. Semenov-Tian-Shansky, M.A.: Dressing action transformations and Poisson–Lie group actions. Publ. Math. RIMS 21, 1237–1260 (1985)
16. Semenov-Tian-Shansky, M.A.: Poisson Lie groups, quantum duality principle and the quantum double. Contemp. Math. 175, 219–248
17. Steinberg, R.: Regular elements of semisimple algebraic Lie groups. Publ. Math. I.H.E.S. 25, 49–80 (1965)

Communicated by G. Felder

Commun. Math. Phys. 192, 649 – 669 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

On the Aubry–Mather Theory in Statistical Mechanics

A. Candel (1), R. de la Llave (2,*)

(1) Department of Mathematics, California Institute of Technology, Pasadena, CA 91125, USA
(2) Department of Mathematics, University of Texas at Austin, Austin, TX 78712, USA

Received: 15 May 1996 / Accepted: 22 July 1997

Abstract: We generalize Aubry-Mather theory for configurations on the line to general sets with a group action. Cocycles on the group play the role of rotation numbers. The notion of Birkhoff configuration can be generalized to this setting. Under mild conditions on the group, we show how to find Birkhoff ground states for many-body interactions which are ferromagnetic, invariant under the group action and having periodic phase space.

1. Introduction

In its motivation from solid state physics, Aubry-Mather theory describes the structure of solutions to the following problem. Frenkel and Kontorova proposed a very simple model for a one-dimensional crystal to describe the structure of dislocations. They considered a one-dimensional chain of identical atoms, connected by springs, placed in a periodic substrate potential V with period p. The potential energy of such a system or configuration u (indexed by the integers Z) describes the interaction of a particle u(j) with its neighbors, and is given by the formal expression
\[ S(u) = \sum_{j\in\mathbb{Z}} \left[ \tfrac{1}{2}\,\big(u(j+1) - u(j) - a\big)^2 + V(u(j)) \right]. \]
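The formal sum can be made concrete on a finite segment of the chain. The sketch below is an illustration under stated assumptions: the spring term is taken as $\tfrac12(u(j+1) - u(j) - a)^2$ with a the rest length of the springs, the sum is truncated to finitely many sites, and the cosine substrate is a hypothetical example of a periodic potential, not one taken from the paper.

```python
import math

def fk_energy(u, a, V):
    """Energy of a finite chain segment: springs between neighbours plus substrate terms."""
    springs = sum(0.5 * (u[j + 1] - u[j] - a) ** 2 for j in range(len(u) - 1))
    substrate = sum(V(x) for x in u)
    return springs + substrate

def no_substrate(x):
    """The case V = 0."""
    return 0.0

def cosine_substrate(x, strength=0.1, period=1.0):
    """A hypothetical periodic substrate potential with minima at multiples of the period."""
    return strength * (1.0 - math.cos(2.0 * math.pi * x / period))

a = 0.3
evenly_spaced = [a * j + 0.7 for j in range(10)]      # u(j) = a*j + k
print(fk_energy(evenly_spaced, a, no_substrate))      # spring energy vanishes up to rounding
print(fk_energy(evenly_spaced, a, cosine_substrate))  # substrate penalises this spacing
print(fk_energy([float(j) for j in range(10)], a, cosine_substrate))  # atoms at the minima, springs stretched
```

The three evaluations illustrate the competition between the spring spacing a and the substrate period p that the ground states have to resolve.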

The parameter a is the length of the connecting springs. The problem is then to find configurations u which minimize this potential, and to describe their structure. The first part of the sum above is the energy of the internal interaction between particles, the second is the external energy. In the absence of substrate potential, that is, when V = 0, the configurations of minimal energy are given by u(j) = aj + k. The mean spacing is the rest distance of the spring. (This situation corresponds to the integrable case in the dynamical systems interpretation.) When V is large, it is reasonable to expect, and indeed it can be proved, that the particles settle around the minima of the potential V, hence that the configurations that minimize the energy are spaced with period p. Thus ground states appear as a compromise between two different periods, and one is then interested in describing their structure.

* R. Ll. partially supported by research grants from the NSF

The various possible configurations of the atoms in the chain are characterized by trajectories in phase space, which turn out to be orbits of a twist map of an annulus. This is obtained by looking at the variational equations that define the map. In general one finds periodic orbits, to which one can assign a rational rotation number. Other orbits have irrational rotation number, corresponding to incommensurate structures.

Several authors (see [2, 6, 9]) have considered extensions of the Aubry-Mather theory to higher dimensions, that is, configurations on lattices $Z^n$. In [2], a study is made of minimal configurations of the variational problem with nearest neighbor potentials; while in [6] the authors study variational solutions which are well-ordered (Birkhoff configurations). In fact, it is shown in [2] that in higher dimensions, there may be minimal configurations that are not well ordered. In [9] the Aubry-Mather theory is discussed for nearest neighbor potential for lattices in the plane. One can find there some interesting physical applications of this theory. For a more detailed survey of the physical significance of the theory of incommensurate crystals in several dimensions, the reader may consult [8].

The main goal of this paper is to generalize the setting of Aubry-Mather theory to configurations on sets more general than $Z^d$ and for general many-body interactions. We consider a set Λ, together with potentials $H_B$ associated to finite subsets B of Λ.
We will need to assume that Λ admits a group action by a group G satisfying some mild assumptions. The potentials $H_B$ describe the interaction of a particle lying in Λ with its neighbors. The potential energy of such a system can be modeled by a formal sum
\[ S(u) = \sum_{B \subset \Lambda} H_B(u), \]
where u : Λ → R is a configuration on Λ. We will require that the $H_B$ satisfy a ferromagnetic or twist condition, that they are invariant under the action of G and also satisfy some other periodicity assumption. Physically, we could think of the points of Λ as atoms, whose state is characterized by one number. The configurations describe the situation of all the atoms. We look for configurations u on Λ which are stationary points or minimizers for S and which have a prescribed rotation number. In our situation, the rotation number is a cocycle for the group G. Solutions to the variational equations with given rotation number always exist. But in order to find ground state configurations we have to require that the group G satisfies a (very natural) property. Essentially, this property says that we can exhaust Λ by fundamental domains for subgroups of G of finite index. We will discuss the well-order properties of the solutions that we find. These properties are the exact analog of the non-crossing properties of codimension one minimal surfaces of Riemannian manifolds, which obviously is the natural generalization of the relation between solutions to the one-dimensional Frenkel–Kontorova model and geodesics on a torus.

Aubry–Mather Theory in Statistical Mechanics


We think that the generality emphasizes the relevant features of the theory and that it would be of interest even if we were interested only in the one dimensional case. Nevertheless, we had several concrete examples that we wanted to consider. One of them was the following. Consider a completely homogeneous discrete subset of the hyperbolic plane, something we may call a hyperbolic crystal. This set, say 3, is assumed to be left invariant by a discrete cocompact group G0 of hyperbolic isometries, that is G0 ⊂ PSl(2, R) is its isotropy group. There is a natural potential in this situation, namely the one whose solutions are discrete harmonic functions. Our theory applies to this case, and to perturbations of it. Although hyperbolic crystals may not seem very realistic, let us mention the work of Kleiman and Saduc [5]. They suggest that amorphous structures could be regarded as projections of ordered patterns in hyperbolic space. They could be seen as models for disclinations in ordered Euclidean structures. Imagine we add wedges to a crystal. This operation decreases the curvature, thus the new crystal resembles a periodic pattern in hyperbolic space. Finally let us mention that the theory presented here applies to the Bethe lattice. Besides its applications in statistical mechanics, this seems a reasonable model for hierarchical networks, for example computer networks. It is not hard to imagine situations in which there is a penalty for computers to be out of phase and also for not being in phase with the environment that they are occupying. In the next section, we will describe the properties of the geometry of the lattice that we will need; in the following one, we will describe the properties of interactions. Then, we will state and prove the main theorem that states that there are solutions for the variational equations satisfying extra properties. Finally, we will briefly discuss some other problems. 2. 
Generalities on Groups and Cocycles

The situation we will consider is as follows. There is a countable space Λ on which a finitely generated group G acts. We impose some conditions on the action of G on Λ. First, it is effective, that is, only the identity element of G acts trivially (this is no restriction because we can always pass to a quotient of G satisfying this property). Second, there is a finite fundamental domain for the action, that is, a finite subset F of Λ which intersects each orbit of G in exactly one point. For instance, Λ could be the set of vertices of some graph (we would like to keep in mind the edges as well, as we may use them for the path metric). To avoid unnecessary complications we should perhaps assume that the graph is locally finite or uniformly locally finite (it does not become more and more populated as one goes to infinity). This more general situation includes the Cayley graph of a finitely generated group, graphs which are lifts of the 1-skeleton of a compact manifold to its universal covering, graphs which are quasi-isometric to manifolds of bounded geometry, etc. For example, it includes the Bethe lattice as the universal cover of the figure 8. Consult [3] for the thermodynamic properties of the Bethe lattice.

In the non-commutative situation that we will consider, the role of rotation numbers in Aubry–Mather theory is played by cocycles on the group G, that is, maps $\sigma : G \to \mathbb{R}$ such that

$$\sigma(\gamma\gamma') = \sigma(\gamma) + \sigma(\gamma').$$

The space of cocycles on G forms a real vector space, which will be denoted by $H^1(G; \mathbb{R})$.


A. Candel, R. de la Llave

We will assume that G acts on Λ with small stabilizers, meaning that if an element γ of G fixes a point of Λ, then σ(γ) = 0 for any cocycle σ on G. For this to hold it is sufficient that the stabilizer of each point of Λ is a torsion group or, more generally, a group with trivial first cohomology. One technical result that we will need is the possibility of approximating arbitrary cocycles by simpler ones, namely

Proposition 1. Let σ : G → ℝ be a cocycle. Then there is a sequence of cocycles $\sigma_n$ which converges point-wise to σ and such that $\sigma_n$ has integral values when restricted to some subgroup $G_n$ of finite index in G.

Proof. Since G is finitely generated, $H^1(G; \mathbb{R})$ is a finite dimensional real vector space. Take a basis for it formed by cocycles with integral values, say $\{\tau_1, \dots, \tau_e\}$. Then we can write $\sigma = a_1\tau_1 + \dots + a_e\tau_e$ for some real numbers $a_j$. For each k = 1, 2, ..., choose rational numbers $b_{i,k}$, i = 1, ..., e, such that $|a_i - b_{i,k}| < 1/k^2$. Then

$$|\sigma(\gamma) - b_{1,k}\tau_1(\gamma) - \dots - b_{e,k}\tau_e(\gamma)| < |\gamma|_1/k^2,$$

where $|\gamma|_1$ is the $\ell^1$-norm of γ in $H^1(G; \mathbb{R})$ with respect to the basis $\{\tau_i\}$. Furthermore, the cocycle $\sigma_k = \sum_i b_{i,k}\tau_i$ takes on rational values only. Since G is finitely generated, its image under $\sigma_k$, say $P_k$, is a finitely generated subgroup of ℚ. Thus there is an integer m such that $mP_k \subset \mathbb{Z}$, and $P_k/mP_k$ is a finite group. In conclusion, there is a finite index subgroup of G such that $\sigma_k$ has integer values when restricted to it. Finally, if γ ∈ G has $|\gamma|_1 < k$, then for n > k we get $|\sigma(\gamma) - \sigma_n(\gamma)| < |\gamma|_1/k^2 < 1/k$, so that $\sigma_n$ converges point-wise to σ as required. Also,

$$\|\sigma - \sigma_k\| = \sup_{\gamma\in G}\frac{|\sigma(\gamma) - \sigma_k(\gamma)|}{|\gamma|} \le \sup_{\gamma\in G}\frac{|\sigma(\gamma) - \sigma_k(\gamma)|}{|\gamma|_1} < 1/k^2,$$

because $|\gamma|_1 \le |\gamma|$, as one easily verifies.
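The approximation step in the proof can be tried out numerically. The sketch below assumes the simplest case G = ℤ², with $\tau_1, \tau_2$ the coordinate projections, so that $|\gamma|_1$ is the sum of the absolute values of the coordinates. The helper names and the particular choice of denominators are illustrative and not taken from the paper.

```python
from fractions import Fraction
import math

def approximate_cocycle(a, k):
    """Rational coefficients b_i with 0 <= a_i - b_i < 1/k**2, all with denominator D."""
    D = k * k + 1                      # 1/D < 1/k**2
    b = [Fraction(math.floor(Fraction(x) * D), D) for x in a]
    return b, D

def evaluate(coeffs, gamma):
    """Value of the cocycle sum_i coeffs[i] * tau_i on gamma, with tau_i(gamma) = gamma[i]."""
    return sum(c * g for c, g in zip(coeffs, gamma))

a = [math.sqrt(2), math.pi]            # coefficients of sigma in the basis tau_1, tau_2
k = 10
b, D = approximate_cocycle(a, k)

gamma = (7, -3)
l1 = sum(abs(g) for g in gamma)
error = abs(evaluate([Fraction(x) for x in a], gamma) - evaluate(b, gamma))
print(float(error), l1 / k ** 2)       # the error stays below |gamma|_1 / k^2

# On the finite index subgroup D * Z^2 the approximating cocycle is integer valued.
print(evaluate(b, (2 * D, -5 * D)))
```

Here the finite index subgroup is simply the set of elements whose coordinates are multiples of the common denominator D.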

There is one property of the group that we will need to consider when proving existence of ground states, namely that of being residually finite. This means that for each element of G other than the identity there is a finite index subgroup of G which does not contain it. Most familiar groups are residually finite; for instance, if R is a field and n ≥ 1, then a finitely generated subgroup of GL(n, R) is residually finite (see [12]). On the other hand, the group with presentation $\langle x, y \,;\, x y^2 x^{-1} = y^3 \rangle$ is not residually finite (Baumslag–Solitar). The reason for requiring this property is that it allows us to do a kind of renormalization needed later in our discussion. More specifically we need the following:

Proposition 2. Let G, Λ be as above with G residually finite. Then there is a sequence $G_1 \supset G_2 \supset \cdots$ of finite index subgroups of G whose fundamental domains exhaust Λ.

Proof. If F is a fundamental domain for G, we can write $\Lambda = F \cup g_1 F \cup \cdots$, where the $g_i$ are the nontrivial elements of G. By induction, choose a subgroup $G_n$ of finite index in $G_{n-1}$ which does not contain $g_n$. (Recall that residual finiteness is inherited by finite index subgroups.) Since a fundamental domain for $G_n$ may be constructed by adding the translates of a fundamental domain of $G_{n-1}$ by representatives of elements of $G_{n-1}/G_n$, the proof is complete.


3. Configurations, the Birkhoff Property and Rotation Cocycles

A configuration on Λ is a map $u : \Lambda \to \mathbb{R}$. The set of all configurations on Λ has the structure of a real vector space with the obviously defined operations, and we denote it by C(Λ). The space of configurations is partially ordered: we say that two configurations u, v satisfy u ≤ v if u(p) ≤ v(p) for all p in Λ. This makes C(Λ) into a partially ordered vector space. One metric we want to consider on the space of configurations is the one induced by the family of semi-norms |u(p)|, p ∈ Λ, which gives rise to the pointwise convergence topology. If we enumerate the points of Λ as $p_1, p_2, \dots, p_n, \dots$, an explicit form for this metric is

$$d(u, v) = \sum_{p_n \in \Lambda} \frac{1}{2^n}\,\frac{|u(p_n) - v(p_n)|}{1 + |u(p_n) - v(p_n)|}.$$
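A truncated version of this metric is easy to evaluate. The sketch below assumes that configurations are given as Python functions and that only finitely many points of the enumeration are used; the sample configurations and names are illustrative.

```python
def d(u, v, points):
    """Truncated metric: sum over the enumerated points p_1, p_2, ... of
    2**(-n) * |u - v| / (1 + |u - v|)."""
    total = 0.0
    for n, p in enumerate(points, start=1):
        diff = abs(u(p) - v(p))
        total += diff / (1.0 + diff) / 2 ** n
    return total

def zero(p):
    return 0.0

def u(p):
    return 0.3 * p

def v(p):
    return 0.3 * p + 1.0

def scaled(p):
    return -2.5 * u(p)

points = list(range(30))               # a finite piece of the enumeration
norm_u = d(zero, u, points)
print(d(u, v, points))                 # always smaller than 1
print(d(zero, scaled, points), (1 + 2.5) * norm_u)
```

The last line compares the distance from zero of a rescaled configuration with (1 + |t|) times the distance of the original one, for t = -2.5.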

We denote ‖u‖ = d(0, u) and note that ‖tu‖ ≤ (1 + |t|)‖u‖ for any real number t. The action of G on Λ extends to one on C(Λ), defined in the following manner: if γ ∈ G and u ∈ C(Λ), then $T_\gamma u(p) = u(\gamma(p))$ for any element p of Λ. It is evident that this action of G on C(Λ) preserves the partial order on configurations. We will denote by $\chi_A$ the characteristic configuration associated with a subset A of Λ, defined by $\chi_A(p) = 1$ if p ∈ A and $\chi_A(p) = 0$ otherwise. We define operators $R_A : C(\Lambda) \to C(\Lambda)$ by $R_A(u) = u - \chi_A$. It is clear that they are invertible and, furthermore, that if A, B are subsets of Λ, then the operators $R_A$ and $R_B$ commute. We denote by $\Lambda_\alpha$, α = 1, ..., l, the equivalence classes of Λ modulo G, and by $R_\alpha$ the operators corresponding to the sets $\Lambda_\alpha$. Given any $s = (s_1, \dots, s_l)$ in $\mathbb{Z}^l$, we denote by $R^s$ the operator $R_1^{s_1} \cdots R_l^{s_l}$.

A cocycle σ defines a configuration $u_\sigma$ on Λ as follows. Let F be a fundamental domain for the action of G on Λ. Any p of Λ can be written as p = γq for a unique q in F. Then define $u_\sigma(p) = \sigma(\gamma)$. It is elementary that $u_\sigma$ is well-defined, for if also $p = \gamma_1 q$, then $\gamma^{-1}\gamma_1$ stabilizes q, so $\sigma(\gamma) = \sigma(\gamma_1)$ by our hypothesis. Note that $u_\sigma(q) = 0$ for all the elements q of F. With these conventions, we say that a configuration u is of type σ if

$$\sup_{p \in \Lambda} |u(p) - u_\sigma(p)| < \infty.$$

The set of configurations of type σ will be denoted by $O_\sigma$. Put in another way, once a fundamental domain F ⊂ Λ is fixed, any cocycle σ on G defines a configuration $u_\sigma$, and $O_\sigma$ is the subspace of C(Λ) formed by those configurations u at bounded distance from $u_\sigma$. Thus $O_\sigma$ is an affine space modeled on $\ell^\infty(\Lambda, \mathbb{R})$. The induced metric in $O_\sigma$ is

$$|u - v|_\infty = \sup_{p \in \Lambda} |u(p) - v(p)|,$$

which makes it into a Banach space.


Proposition 3. The space Oσ is independent of the fundamental domain. This is an elementary consequence of the cocycle property. Also, the same property implies: Proposition 4. The operators Tγ , γ ∈ G, and R leave Oσ invariant. Next we introduce the notion of Birkhoff configuration. A configuration u of type σ is called a Birkhoff configuration for the group G if σ(γ) ≥ (≤)sα for γ ∈ G, s ∈ Nl only when Tγ (u) ≥ (≤)Rs (u). The set of Birkhoff configurations in Oσ is denoted by Bσ . Since Rs (u)(p) = u(p) − sα if p belongs to the equivalence class 3α of 3/G, we see that u is a Birkhoff configuration if either Tγ u + σ(γ) ≥ u or Tγ u + σ(γ) ≤ u for the partial order of configurations. It also follows immediately that: Proposition 5. For every γ in G there exists s+ , s− such that Rs− (u) ≤ Tγ (u) ≤ Rs+ (u) for any u in Bσ . Although we have defined the Birkhoff property of a configuration by referring to a given cocycle, we could have avoided it. We show next that the non-intersecting property of a configuration implies that there is a cocycle for which it is Birkhoff. That is, suppose that u : 3 → R is a configuration which satisfies the following property: for each γ ∈ G and any integer k ∈ Z, we have u(γp) + k ≤ u(p) for all p ∈ 3, or

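$$u(\gamma p) + k \ge u(p)$$

for all p ∈ Λ. We are going to define a cocycle σ : G → ℝ such that $u \in O_\sigma$. First, let

$$\tau^+(\gamma) = \sup_p \big(u(\gamma p) - u(p)\big) \quad\text{and}\quad \tau^-(\gamma) = \inf_p \big(u(\gamma p) - u(p)\big).$$

Note that they are finite numbers. Furthermore, $\tau^+$ is a sub-cocycle, that is, $\tau^+(\gamma_1\gamma_2) \le \tau^+(\gamma_1) + \tau^+(\gamma_2)$, and $\tau^-$ is a super-cocycle. This implies that we can define

$$\sigma(\gamma) = \lim_{n\to\infty} \frac{\tau^+(\gamma^n)}{n} = \lim_{n\to\infty} \frac{\tau^-(\gamma^n)}{n},$$

because both limits exist and are equal. We also see that

$$\sigma(\gamma_1\gamma_2) = \lim_{n\to\infty} \frac{\tau^+((\gamma_1\gamma_2)^n)}{n} = \lim_{n\to\infty} \frac{\tau^+(\gamma_1(\gamma_2\gamma_1)^{n-1}\gamma_2)}{n} \le \sigma(\gamma_2\gamma_1)$$

by elementary use of the sub-cocycle and super-cocycle properties and of the definition of σ. Exchanging the roles of $\gamma_1$ and $\gamma_2$ gives the reverse inequality. Therefore $\sigma(\gamma_1\gamma_2) = \sigma(\gamma_2\gamma_1)$.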
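The definition of σ as a limit can be illustrated in the classical one-dimensional case. The sketch below assumes G = ℤ acting on Λ = ℤ by translations and a sample well-ordered configuration u(p) = ⌊pω⌋; the value of ω, the finite window used for the supremum and infimum, and the function names are all assumptions of the example.

```python
from fractions import Fraction
import math

OMEGA = Fraction(3, 8)   # assumed rotation number of the sample configuration

def u(p):
    """Sample well-ordered configuration on Z: u(p) = floor(p * OMEGA)."""
    return math.floor(p * OMEGA)

def tau(n, window=range(-200, 200)):
    """Return (tau_minus, tau_plus) for the translation by n over a finite window.
    For this sample u the differences are periodic in p, so the window is enough."""
    diffs = [u(p + n) - u(p) for p in window]
    return min(diffs), max(diffs)

n = 1001
tau_minus, tau_plus = tau(n)
sigma_lower = Fraction(tau_minus, n)
sigma_upper = Fraction(tau_plus, n)
print(float(sigma_lower), float(OMEGA), float(sigma_upper))
```

Both quotients squeeze the rotation number, and their difference is at most 1/n, in line with the equality of the two limits.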


It is also elementary to check that $\sigma(\gamma^{-1}) = -\sigma(\gamma)$, and that, if $\gamma_1$ and $\gamma_2$ commute, then $\sigma(\gamma_1\gamma_2) = \sigma(\gamma_1) + \sigma(\gamma_2)$. From all these properties it follows that σ is a cocycle. Finally we show that the configuration u has σ as rotation cocycle, that is, that

$$\sup_{p \in \Lambda} |u(p) - \sigma(\tau_p)| < \infty,$$

which is the same as showing that

$$\sup_{\gamma \in G,\ q \in F} |u(\gamma q) - \sigma(\gamma)| < \infty.$$

Here F ⊂ Λ denotes a finite fundamental domain for the action of G. Fix q ∈ F and γ ∈ G, and consider the configuration $n \in \mathbb{Z} \mapsto u(\gamma^n q)$. Since $\tau^-(\gamma^n) \le u(\gamma^n q) - u(q) \le \tau^+(\gamma^n)$, it follows from the definitions that it has rotation number σ(γ). As is well-known, this implies that $u(q) + n\sigma(\gamma) \le u(\gamma^n q) < u(q) + n\sigma(\gamma) + 1$, from which it follows that |u(γq) − σ(γ)| ≤ 1 + |u(q)|. Since the fundamental domain F is finite, this implies that the rotation cocycle of u is σ, as we had claimed.

Proposition 6. If u is a configuration satisfying the well-ordering property above, then there is a cocycle σ on G for which $u \in B_\sigma$.

In particular,

Corollary 1. If the group has no nontrivial cocycles, then every Birkhoff configuration is bounded.

A simple example of a group without cocycles is $G = \langle a, b \,;\, a^p = b^q = (a^{-1}b)^r = 1 \rangle$ with 1/p + 1/q + 1/r < 1. This group can be realized faithfully as a discrete group of isometries of the hyperbolic plane. However, it has finite index subgroups which do have nontrivial cocycles. We also mention the following elementary property of Birkhoff configurations.

Proposition 7. The space of Birkhoff configurations is closed for the product topology on $\mathbb{R}^\Lambda$. Furthermore, for any cocycle σ, the set $B_\sigma$ is non-empty and convex.

Proof. For the first part, simply note that by the previous proposition the space of Birkhoff configurations is characterized by the inequalities $T_\gamma u + n \ge u$ or $T_\gamma u + n \le u$, so it can be written as an intersection of closed sets. The configuration $u_\sigma$ associated to the cocycle is Birkhoff, so $B_\sigma$ is non-empty. The convexity property is obvious.


Similarly, we also have Proposition 8. Let σ be a cocycle in G, and consider Oσ with the Banach space topology given by uniform convergence. Then the space Bσ of Birkhoff configurations in Oσ is closed. In analogy with rational rotation numbers, we say that a configuration u has rational rotation cocycle if there is a normal finite index subgroup G0 of G and an integer valued cocycle σ : G0 → Z such that u(γ(p)) = u(p) + σ(γ) for all p ∈ 3. This definition could have been formulated in a slightly different way by using the following two propositions. Proposition 9. Let σ : G0 → Z, where G0 is a finite index normal subgroup of G with G/G0 commutative. Then σ extends to a rational cocycle on G. Proof. For γ in G, let n be any nonzero integer such that γ n ∈ G0 . Then define σ(γ) = σ(γ n )/n. It is immediate to check that σ : G → Q is well defined and, because G/G0 is commutative, it is a homomorphism. Proposition 10. Let u ∈ C(3) be a configuration such that u(γ(p)) = u(p) + σ(γ) for all p in 3 and all elements γ of a finite index normal subgroup G0 of G, and for σ : G0 → Z. Suppose that σ extends to a cocycle on G. Then u ∈ Oσ , σ the extension of the cocycle to G. Proof. Let p0 , . . . , pr be a finite set such that the translates {γ(pi ); γ ∈ G0 } fill up 3. Choose an upper bound M for the finite sets of numbers {|u(p0 )|, . . . , |u(pr )|} and {|σ(τp0 )|, . . . , |σ(τpr )|}. Then, for any p ∈ 3 we have: |u(p) − σ(τp )| = |u(γ(pk )) − σ(τp )| = |u(pk ) − σ(τpk )| ≤ 2M. We will use another property of Birkhoff configurations that we now explain. Proposition 11. Suppose that u is a Birkhoff configuration that has integer valued rotation cocycle σ : G → Z. Suppose that u is periodic with respect to some finite index subgroup G0 of G, that is, u(γ(p)) = u(p) + σ(γ) for all p in 3 and all γ in G0 . Then u is also G-periodic. The proof of this fact goes as follows. Let γ be an element of G. Then γ k ∈ G0 for some integer k because G/G0 is finite. 
If u is not γ-periodic, then let p be such that u(γ(p)) > u(p) + σ(γ). Using the Birkhoff property of u this implies that u(γ k (p)) > u(p) + kσ(γ), which contradicts the G0 -periodicity of u.


4. Interaction Potentials and the Variational Problem

Let S denote the collection of finite nonempty subsets of Λ. An interaction potential is a collection of maps $H = \{H_B ;\ B \in S\}$, where each $H_B : C(\Lambda) \to \mathbb{R}$, such that $H_B(u) = H_B(v)$ whenever u and v agree on B (so $H_B$ is interpreted as a function $H_B : \mathbb{R}^B \to \mathbb{R}$), and such that for any finite subset X of Λ the series

$$H(u, X) = \sum_{B \cap X \neq \emptyset} H_B(u)$$

converges. We call H(u, X) the (total) energy of u in X. A potential is absolutely summable if the series

$$\sum_{B,\ p \in B} |H_B|_\infty$$

converges for all p ∈ Λ. Denoting by $|H|_p$ the value of the series above, we have a family of semi-norms on the space P of absolutely summable potentials, making it into a Fréchet space. A potential H is said to be of finite range if for each p ∈ Λ there is some finite set $B_p \subset \Lambda$ such that $H_B = 0$ unless $B \subset B_p$ for all p ∈ B. We say that H is of bounded range if there is a number M such that $H_B = 0$ if diam(B) > M. Clearly, if H is of finite range and all $H_B$ are bounded, then H ∈ P. Moreover, absolutely summable finite range potentials are dense in P.

A configuration u is a ground state for the interaction potential $H = \{H_B\}$ if H(u, X) ≤ H(v, X) for any finite set X and any configuration v such that u = v on Λ \ X. Thus u is a ground state if the energy of any finite perturbation of u is at least that of u. Our goal is to seek minimal configurations which belong to the spaces $O_\sigma$. Note that if u is a minimal configuration for the energy problem in $O_\sigma$, it is also a minimal configuration for the global problem in C(Λ), simply because if $u \in O_\sigma$ and u = v outside some finite set X, then also $v \in O_\sigma$. On the other hand, note that a configuration that minimizes H(·, X) in $B_\sigma$ may not necessarily be a ground state configuration. Some other properties of ground state configurations that we will use later are collected in the following proposition.

Proposition 12. The set of all ground state configurations for a continuous potential H is closed in $\mathbb{R}^\Lambda$ with the product topology. If O denotes the space of configurations at bounded distance from a fixed one, the space of ground states in O is closed in the Banach space topology.

Proof. Suppose that a sequence $u_n$ of ground state configurations converges point-wise to the configuration u. If u is not a ground state configuration, then there is a finite set X and a configuration v which equals 0 outside X and such that H(u, X) − H(u + v, X) ≥ a > 0. On the finite set X, the convergence $u_n \to u$ is uniform, and the continuity of the potential implies that, for large n, we have

$$H(u_n, X) - H(u_n + v, X) > 0,$$


contradicting the minimality of the $u_n$'s. The same argument also proves the second part, because if $u_n \in O$ converges to u in the Banach space topology then it also converges uniformly on each finite subset X of Λ.

We introduce some more definitions in order to restrict the type of potentials to be considered. An interaction potential $H = \{H_B\}$ is G-invariant if $H_{\gamma B}(u) = H_B(T_\gamma u)$ for all γ ∈ G, all B and all configurations u on Λ. If the interaction potential H satisfies $H_B(R^s u) = H_B(u)$ for all configurations u, all B and all $s \in \mathbb{Z}^l$, then we say that H has periodic phase. This periodic phase property allows us to consider the $H_B$'s as maps C(Λ)/R → ℝ. The Banach space structure of $O_\sigma$ permits us to write down the variational equations if the potential is differentiable. Thus, with the assumption of differentiability of potentials, a necessary condition for u to be a minimal configuration is that it satisfies the variational equations

$$\nabla H(u, X) = \sum_{X \cap B \neq \emptyset,\ p \in B} \frac{\partial}{\partial p} H_B(u) = 0.$$

This sum converges for finite range interactions. In general, we have to make some further restrictions. Let O be a convex subset of the set of configurations. We say that an interaction H is $C^r$-bounded on O if

$$\|H\|_{r,O} = \sup_{u \in O} \sum_{|j| \le r} |D^j H_B(u)| < \infty.$$

Note that the space O in the definition above may be a proper subset of one of the $O_\sigma$'s. Examples show that interactions may not be bounded in $O_\sigma$ but they are in subspaces of the form $O = \{u ;\ |u - u_\sigma|_\infty \le K\}$. If H is a $C^1$-bounded potential on O, we define a map u ↦ V(u) on configurations u ∈ O with values in $\ell^\infty(\Lambda)$ by

$$V(u)(p) = -\sum_{B \ni p} \frac{\partial}{\partial p} H_B(u),$$

and therefore the variational equations can be written as V (u) = 0. In the cases that we will consider O will be an affine space over `∞ (3), in particular one of the Oσ , and the hypothesis of C r boundedness of H will ensure that the r − 1 derivative of V exists and is uniformly bounded in the sense of derivatives of Banach spaces. Since Oσ is a Banach space isomorphic to `∞ (3) the natural interpretation of V is as a vector field on Oσ . In any case, the above definitions show that the function V associated to a C r bounded potential H on O is globally Lipschitz, that is:


Corollary 2. If V is defined as above, then $|V(u) - V(v)|_\infty \le \|H\|_2\, |u - v|_\infty$.

Before we state the main theorem on the existence of minimal configurations we still need a new definition concerning the potentials to be considered. This is the twist condition of Hamiltonian mechanics or the ferromagnetic property in statistical mechanics. More precisely, we say that an interaction potential H which is $C^2$-bounded on O satisfies the twist condition if

$$\sum_{B \ni q} \frac{\partial^2}{\partial p\, \partial q} H_B(u) \le 0$$

for all configurations u in O and all p, q in Λ with p ≠ q. This property is obviously implied by the strong twist condition, that is,

$$\frac{\partial^2}{\partial p\, \partial q} H_B(u) \le 0$$

for all u and all p ≠ q.
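To make the twist condition concrete, the sketch below checks it numerically for a nearest neighbor potential of Frenkel–Kontorova type, in which each bond carries an elastic term plus half of the on-site term of each endpoint. The specific form of the bond energy, the constant K and the finite difference step are assumptions chosen for the illustration.

```python
import math

K = 0.8  # assumed strength of the on-site potential

def on_site(t):
    """Periodic on-site term with period 1."""
    return K * (1.0 - math.cos(2.0 * math.pi * t)) / (4.0 * math.pi ** 2)

def bond_energy(x, y):
    """H_B for a bond B = {p, q}, written in terms of x = u(p) and y = u(q)."""
    return 0.5 * (x - y) ** 2 + 0.5 * (on_site(x) + on_site(y))

def mixed_partial(f, x, y, h=1e-3):
    """Central finite-difference estimate of the mixed second derivative of f."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)

samples = [(0.1 * i, 0.37 * j) for i in range(-5, 6) for j in range(-5, 6)]
worst = max(mixed_partial(bond_energy, x, y) for x, y in samples)
print(worst)   # close to -1, so the strong twist condition holds at all samples
```

Only the elastic term couples the two sites, so the mixed derivative equals -1 regardless of the on-site potential.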

5. Existence of Solutions to Variational Equations

In this section we prove the existence of solutions to the variational equations of a twist potential. The idea is inspired by [4] and [6]. There the strategy of the proof is to consider the flow $\phi_t : O_\sigma \to O_\sigma$ defined by the differential equation

$$\frac{d}{dt}\phi_t(u) = -V(\phi_t(u)).$$

Since the right-hand side is globally Lipschitz, solutions exist for all times. Then one argues that this flow preserves the Birkhoff condition on configurations, so it is actually a flow on $B_\sigma$. Furthermore, it commutes with the operators R, so that the end result is a flow on $B_\sigma/R$. At this point one can use compactness to obtain fixed points for the flow. Unfortunately, for this outline to work we would need to add a further condition on the space Λ, namely that it has polynomial growth function with respect to the action of G. In general this case is the exception rather than the rule. To overcome this difficulty we consider a sequence of gradient flows that converge to the one defined by the variational equations.

Theorem 1. Let G be a finitely generated group acting on Λ with finitely many orbits, and with small stabilizers. Let σ be a cocycle on G and let H be an interaction potential which is $C^2$-bounded on $O_\sigma$ and which is G-invariant and has periodic phase. Assume that either H satisfies the strong twist condition or that H satisfies the twist condition and is of finite range. Then there is a solution of the variational equations which lies in $B_\sigma$.


Suppose that H is of finite range. For any finite subset F in Λ consider the function $H_F$ on configuration space defined by

$$H_F(u) = \sum_{B \cap F \neq \emptyset} H_B(u).$$

Let $-\nabla H_F$ be the gradient vector field defined by this function and let $\phi_F$ be the flow defined by the differential equation

$$\frac{d}{dt}\phi_F(u, t) = -\nabla H_F(\phi_F(u, t)).$$

Proposition 13. The flow $\phi_F$ is defined on $O_\sigma$ for all times.

Indeed, the gradient $\nabla H_F$ is a Lipschitz vector field in $O_\sigma$ with the Banach space topology.

Proposition 14. The flow $\phi_F(t, u)$ defined by the differential equation

$$\frac{d}{dt}\phi_F(t, u) = -\nabla H_F(\phi_F(t, u))$$

preserves the order structure on the space of configurations.

The proof of this proposition is along the lines of the one in [6]. Briefly, the idea is that the twist condition implies that the corresponding linearized equation is given by a matrix with positive entries off the diagonal. This is shown to imply the monotonicity of the gradient flow. A consequence of this and the fact that the interaction potential has periodic phase is that $\phi_t$ induces a flow on $B_\sigma/R$.

Proposition 15. If the interaction potential H is ferromagnetic, then $H_F$ is non-increasing along the flow, that is,

$$\frac{d}{dt} H_F(u) \le 0.$$

We compute the derivative along the flow φ : O × ℝ → O defined by the differential equation. By the chain rule we have

$$\frac{d}{dt} H_F(\phi_F(u, t)) = \nabla H_F(u) \cdot \frac{d}{dt}\phi_F(u, t) = -\sum_{p \in B_F} \Big(\frac{\partial}{\partial p} \sum_B H_B(u)\Big) \cdot \Big(\frac{\partial}{\partial p} \sum_B H_B(u)\Big) = -\sum_{p \in B_F} \Big(\frac{\partial}{\partial p} \sum_B H_B(u)\Big)^2 \le 0.$$
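The two properties of the gradient flow used here, monotonicity of the energy and preservation of the order, can be observed in a small numerical experiment. The sketch below assumes a periodic Frenkel–Kontorova chain with twisted boundary condition u[i + N] = u[i] + M and replaces the flow by explicit Euler steps with a small step size; the model, the constants and the function names are assumptions of the example, not the construction used in the paper.

```python
import math

K = 1.0   # assumed on-site strength
N = 12    # sites in one period of the chain
M = 5     # twist: u[i + N] = u[i] + M

def energy(u):
    """Energy of one period: elastic bonds plus a periodic on-site term."""
    total = 0.0
    for i in range(N):
        nxt = u[i + 1] if i + 1 < N else u[0] + M
        total += 0.5 * (nxt - u[i]) ** 2
        total += K * (1.0 - math.cos(2.0 * math.pi * u[i])) / (4.0 * math.pi ** 2)
    return total

def gradient(u):
    """Partial derivatives of the energy with respect to each u[i]."""
    grad = []
    for i in range(N):
        nxt = u[i + 1] if i + 1 < N else u[0] + M
        prv = u[i - 1] if i > 0 else u[N - 1] - M
        grad.append(2.0 * u[i] - nxt - prv
                    + K * math.sin(2.0 * math.pi * u[i]) / (2.0 * math.pi))
    return grad

def euler_step(u, h=0.05):
    """One explicit Euler step of du/dt = -grad H(u)."""
    return [x - h * g for x, g in zip(u, gradient(u))]

u = [i * M / N + 0.2 * math.sin(i) for i in range(N)]
v = [x + 0.3 for x in u]               # v >= u pointwise at time 0
energies = [energy(u)]
for _ in range(400):
    u, v = euler_step(u), euler_step(v)
    energies.append(energy(u))

print(energies[0], energies[-1])
print(all(a <= b + 1e-9 for a, b in zip(u, v)))
```

With this step size each Euler step has a Jacobian with non-negative entries, which is the discrete counterpart of the positivity off the diagonal mentioned in the proof of Proposition 14.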

By the order preserving property, we can consider the flow on Bσ /R. The last detail we need is: Proposition 16. The space Bσ /R is compact (for the pointwise convergence metric).


Proof. Let F be a fundamental domain for the action of G on Λ. If u is a configuration, we can apply one of the operators R so that u(p) ∈ [0, 1] for p ∈ F. If furthermore u is a Birkhoff configuration, then σ(γ) ≤ u(γp) − u(p) ≤ σ(γ) + 1. Hence we can view $B_\sigma/R$ as a subspace of an infinite product of circles. It is also closed, hence compact.

Note that the potential $H_F$ induces a map on this quotient space because of the invariance of the interactions $H_B$ under the transformations R. Since $H_F$ is bounded below on this compact space, and it is non-increasing along the orbits of the flow, there must be a sequence $t_n \to \infty$ for which

$$\frac{d}{dt}\phi_F(u, t_n) \to 0.$$

Therefore, if we start with a Birkhoff configuration u, we obtain a sequence of Birkhoff configurations $\phi(u, t_n)$ which lies in the compact space $B_\sigma/R$. Thus there is a convergent subsequence whose limit is a critical point $u_F$.

Consider now an increasing sequence $F_0 \subset F_1 \subset \cdots$ of finite sets exhausting Λ. For each of them we have a critical point $u_n$ in $B_\sigma$ for the gradient flow of $H_n$. After modification by the operators R, which preserves the property of being a critical point, we see that this sequence of points has a subsequence which converges point-wise to u, a configuration in $B_\sigma$. To finish the proof of the theorem we only need the following:

Proposition 17. This configuration u satisfies the variational equations V(u) = 0.

Proof. We have to check that V(u)(p) = 0 for any p in Λ. Fix p and choose n large enough so that if B contains p then $B \subset F_n$. The convergence $u_n \to u$ is uniform on $F_n$, and $\nabla H_n(w)(p) = V(w)(p)$. It follows that u satisfies the variational equations.

6. Properties of Ground State Configurations

For a configuration u and a point p in Λ define the energy of u due to p as

$$E_p(u) = \sum_{B \ni p} \frac{1}{|B|} H_B(u).$$

For X a finite subset of Λ let

$$h_X(u) = \frac{1}{|X|} \sum_{p \in X} E_p(u).$$
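These two quantities are straightforward to compute for a finite model. The sketch below assumes that the interaction is stored as a dictionary from finite sets B to functions $H_B$ and uses nearest neighbor springs on a cycle as sample data; the data layout and all names are hypothetical.

```python
def site_energy(p, u, interactions):
    """E_p(u): sum over the sets B containing p of H_B(u) / |B|."""
    return sum(H(u) / len(B) for B, H in interactions.items() if p in B)

def specific_energy(X, u, interactions):
    """h_X(u): average of E_p(u) over the finite set X."""
    return sum(site_energy(p, u, interactions) for p in X) / len(X)

def spring(p, q):
    """H_B for the bond B = {p, q}: a harmonic spring between the two sites."""
    return lambda conf: 0.5 * (conf[p] - conf[q]) ** 2

# Illustrative data: nearest neighbor springs on a cycle with 6 sites.
n = 6
u = {p: 0.25 * p * p for p in range(n)}
interactions = {frozenset({p, (p + 1) % n}): spring(p, (p + 1) % n) for p in range(n)}

X = list(range(n))
total = sum(H(u) for H in interactions.values())
print(specific_energy(X, u, interactions) * len(X), total)
```

The weights 1/|B| distribute each $H_B$ evenly among the points of B, so summing $E_p$ over the whole cycle recovers the total energy.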

We would like to take the limit as X approaches Λ, but there is no canonical way to do so. We will regard it as a function on finite parts of Λ, $X \in F(\Lambda) \mapsto h_X(u)$. One possibility would be to take a fundamental domain X for G, list the elements of G as $\{e, \gamma_1, \gamma_2, \dots\}$ and set $X_n = X \cup \dots \cup \gamma_n X$. Then, if u is a configuration which has integer periods with respect to a finite index subgroup $G_0$ of G, we see that the limit

$$\lim_{n \to \infty} h_{X_n}(u)$$


exists and is equal to the specific energy of u in a fundamental domain for $G_0$. We want to mention one property of the specific energy of configurations. In [2] this property is proved under the assumption that the configuration is minimal and satisfies the Birkhoff condition. It appears that considerably less is needed. We include a proof for completeness.

Proposition 18. Let the interaction potential H be of finite range, G-invariant and satisfy the strong ferromagnetic condition. Let σ be a cocycle on G. If u is a configuration which belongs to $O_\sigma$ then there is a constant K(u, σ) such that H(u, X) ≤ K(u, σ) Card(X) for any finite subset X of Λ.

Proof. Let u be a configuration in $O_\sigma$. Let $u_\sigma$ be a configuration defined by the cocycle σ. Applying the ferromagnetic hypothesis in the expansion provided by Taylor's theorem, we have that for any B with $H_B \neq 0$ there are positive constants α(B), β(B) such that

$$H_B(u) \le \alpha(B) \sum_{p, q \in B} (u(p) - u(q))^2 + \beta(B).$$

The constants α and β depend on σ, on the distance between u and $u_\sigma$, and on B. There are only finitely many B's up to translation, so the invariance of H allows us to choose α and β independently of B. Next note that for any p, q and for u in $O_\sigma$ we have previously shown that $|u(p) - u(q)| \le C + |\sigma(\gamma_{pq})|$, where C is a constant and $\gamma_{pq}$ is any transformation that takes p to q. Therefore, if the interaction potential is of bounded range, there is a constant C such that |u(p) − u(q)| ≤ C for any p, q belonging to the same set B with $H_B \neq 0$. We now put all this information together. Let X be a finite set and $u \in O_\sigma$. We have

$$H(u, X) = \sum_{B \cap X \neq \emptyset} H_B(u) \le \sum_B \Big(\alpha \sum_{p, q \in B} (u(p) - u(q))^2 + \beta\Big) \le \sum_B n_B^2\, (\alpha C^2 + \beta) \le K(u, \sigma)\, \mathrm{Card}(X).$$

Aubry's fundamental lemma for the Frenkel–Kontorova model shows the non-intersecting property of ground state configurations. The analysis of Aubry and Mather of configurations on the line which are minimal for the Lagrangian shows that they satisfy a non-intersecting property, namely, if u : ℤ → ℝ is a ground state configuration, then it is Birkhoff. In higher dimensions this need not be true, and Blank [2] provides the following example:


$$H(u_1, u_2, u_3) = (u_1 - u_2 + b_1)^2 + (u_1 - u_3 + b_2)^2,$$

which admits $f(x_1, x_2) = x_1 x_2$ as a solution to the variational equations. Blank showed that a similar property holds for minimum energy configurations for the model he studies in [2]. The following proposition exhibits a similar property for the models we study. From a geometric point of view, it is essentially the maximum principle for minimal surfaces.

The space Λ has no structure, but given an interaction potential $H = \{H_B\}$, those sets B for which $H_B \neq 0$ define a sort of topology on Λ. So if X is a finite subset of Λ, we will denote by N(X) the union of those sets B which intersect X and with $H_B \neq 0$, and by X' the subset of those elements p of X for which there is some B which meets X in some other points different from p. Although these definitions may look strange, they are easily verified for models whose supporting set Λ corresponds to the vertices of a graph and whose interaction potential is defined in terms of the connecting edges.

First we need some new terminology. Say that a subset X of Λ is connected if for any pair of elements p, q of it, there is a sequence $p_0, \dots, p_k$ in X with $p_0 = p$, $p_k = q$, and sets $B_i$, each containing a pair $p_i, p_{i+1}$, and for which $H_{B_i} \neq 0$. The meaning of this condition is obvious. Each connected component of Λ behaves in its own independent way, and there is no restriction in assuming our models have connected state space.

Proposition 19. Let u and v be ground state configurations for the ferromagnetic potential H on the configuration space of Λ. Suppose that there is a finite connected set X such that u ≥ v on N(X). Then, either u = v or u > v in all of X.

Proof. We use the standard technique of the Hilbert integral in calculus of variations. Any ground state configuration must satisfy the variational equations. Thus, if p ∈ X we get

$$\sum_{B \ni p} \frac{\partial}{\partial p} \big[H_B(u) - H_B(v)\big] = 0,$$

and replacing the terms in the sum by their integrals,

$$\sum_{B \ni p} \int_0^1 \frac{d}{dt} \frac{\partial}{\partial p} H_B(tu + (1 - t)v)\, dt = 0.$$

This expression may be written as

$$\sum_{B \ni p} \sum_{q \in B} \int_0^1 [u(q) - v(q)]\, \frac{\partial^2}{\partial q\, \partial p} H_B(tu + (1 - t)v)\, dt = 0.$$

Let p ∈ X' be such that u(p) = v(p). For q ≠ p in X', the term u(q) − v(q) in the sum above is ≥ 0 by hypothesis, while the integral term is strictly negative due to the ferromagnetic condition. This forces u = v throughout N(p). Now the fact that X is connected allows us to extend the equality u = v to all of X.

Corollary 3. Let u, v be two critical configurations on Λ, which is assumed to be connected. If u ≤ v, then either u = v or u < v.

Corollary 4. Let X be a connected finite subset of Λ and let u, v be two configurations which agree outside X and such that they both minimize H(·, X) with boundary conditions $u_{\Lambda - X}$. Then u = v on X.


7. Existence of Ground State Configurations

The result we just proved guarantees the existence of Birkhoff critical configurations with arbitrary rotation number. In this section we show that Birkhoff ground states exist with arbitrary rotation cocycle. Here we have to assume that the group is residually finite, and that the space Λ is connected with respect to the interaction potential to be considered.

Theorem 2. Let G be a finitely generated residually finite group acting on the set Λ with finitely many orbits and with small stabilizers. Let H be a finite range interaction potential on Λ, making Λ connected, which is G-invariant, has periodic phase, and satisfies the strong ferromagnetic condition. If σ is a cocycle on G, then there are Birkhoff configurations in $B_\sigma$ which are ground states.

We start with a rational cocycle σ : G → ℚ, integer valued on the finite index subgroup $G_0$. We let F be a connected fundamental domain for $G_0$. Consider the Hamiltonian

$$H_F = \sum_{B \cap F \neq \emptyset} H_B.$$

We also consider an exhaustion $F = F_0 \subset F_1 \subset \cdots$ of $\Lambda$ by fundamental domains with the property that the sets $B$ which intersect $F_n$ are contained in $F_{n+1}$, and they correspond to a tower of finite index subgroups of $G_0$. We denote by $H_n$ the Hamiltonian corresponding to $F_n$. The problem is to minimize $H_n$ with periodic boundary conditions. These boundary conditions give us the constraints for a Lagrange multiplier problem. A configuration $u$ which is $G_n$-periodic is completely determined by its values on $F_n$. If $q$ is a point outside $F_n$ then there is a unique $p$ in $F$ such that $u(q) = u(p) + \sigma(\gamma)$ for some $\gamma$ in $G_n$. (There could be several $\gamma$'s but the result is independent of that because of the small stabilizers condition.) Let $N$ denote the union of all $B$'s with $H_B \neq 0$ which intersect $F$. The problem is then to minimize $H$ for configurations $u$ on $N$ subject to the constraints determined by the periodicity condition. We obtain one linear equation of the form $g_q(u) = u(q) - u(p) - \sigma(\gamma)$ for each point $q$ in $N \setminus F$. For $p$ in $F$ denote by $\eta(p)$ the number of points equivalent to it in $N$.

We start by discussing the problem of minimizing $H_n$ over $G_n$-periodic configurations. First, we consider $H_n$ as a function on $\mathbb{R}^{F_n}$. This can be done because any element in this space extends in a unique way to a configuration on $\Lambda$, and then we apply $H_n$. Now the twist condition implies that this function attains its minimum over a compact set. It follows that we obtain a periodic configuration $u_n$ which minimizes $H_n$ among periodic configurations.

Proposition 20. The function $H_n$ has a minimum on the space of $G_n$-periodic configurations.

We want to show that critical configurations of this system are ordered. The method of Lagrange multipliers provides a system of equations which we now describe. First there are real numbers $\lambda_q$, one for each $q$ in $N \setminus F$, and $\lambda_0$ which is either 0 or 1 (and so that not all $\lambda$'s are zero). The equations are as follows: if $p \in F$ has no other point of its orbit in $N$, then
\[ \frac{\partial}{\partial p} H(u, N) = \sum_{B \ni p} \frac{\partial}{\partial p} H_B(u) = 0, \]
if the orbit of $p \in F$ meets $N$ in the points $q_1, \ldots, q_k$ (other than $p$), then
\[ \lambda_0 \sum_{B \ni p} \frac{\partial}{\partial p} H_B(u) = \lambda_1 + \cdots + \lambda_k \]
for the scalars $\lambda$ corresponding to the points in $N \setminus F$ equivalent to $p$, and if $q \in N \setminus F$ is in the orbit of $p \in F$ we get
\[ \lambda_0 \sum_{A \ni q} \frac{\partial}{\partial q} H_A(u) = -\lambda_*, \]
where the sum is over the sets $A$ that meet $F$. We then see that we can take $\lambda_0 = 1$ and the last two equations can be combined into one which is independent of the scalars $\lambda$ and holds for any critical configuration:
\[ \sum_{B \ni p} \frac{\partial}{\partial p} H_B(u) + \sum_{q \in O_p \cap N} \sum_{A \ni q} \frac{\partial}{\partial q} H_A(u) = 0 \]
(here $O_p$ denotes the orbit of $p$ in $\Lambda$). This last equation can be rewritten in the form
\[ \sum_{B \ni p} n_{B,p} \frac{\partial}{\partial p} H_B(u) = 0 \]
because of the periodicity of the configuration and of the interaction potential. The index $n_{B,p}$ is a positive integer which comes out of the rearrangement of the sum over the $A$'s. Now if the interaction potential satisfies the strong ferromagnetic condition, the same technique of the Hilbert integral allows us to obtain a sort of Aubry's fundamental lemma.

Proposition 21. If the interaction potential satisfies the strong twist condition, then two $G_n$-periodic configurations $u$ and $v$ that minimize $H_n$ and such that $u \geq v$ are either equal or $u > v$.

Proof. The proof is essentially the same as the one in the previous section. First start with a point in $F_n$ which is equivalent to no point of $N(F_n)$. If $u = v$ at this point, the proof in the previous section shows that $u = v$ in the whole neighborhood of the point. For points of $F_n$ that have some equivalent in $N(F_n)$, the last equation we have written allows us to use again the Hilbert integral technique. In conclusion, if $u(p) = v(p)$ at some point of $F_n$, we obtain that $u = v$ through all of $F_n$, as we are assuming the fundamental domains to be connected.


Hence, if $u, v$ are periodic and minimize $H_n$ we can consider the configurations $u \vee v = \max\{u, v\}$ and $u \wedge v = \min\{u, v\}$ which are also periodic, and the twist condition implies
\[ H_n(u) + H_n(v) \geq H_n(u \vee v) + H_n(u \wedge v). \]
Hence these configurations also minimize $H_n$. By applying the argument of the previous paragraph to the pairs $u$, $u \vee v$, etc., we conclude that either $u > v$ or $v > u$ or they are equal. Before proceeding we note the following fact.

Lemma 1. If $u$ is a $G_n$-periodic configuration, so is $T_\gamma u$ for any $\gamma$ in $G$.

Proof. If $\gamma_1 \in G_n$, then by normality of $G_n$ in $G$, there is $\gamma_2$ in $G_n$ such that $\gamma\gamma_1 = \gamma_2\gamma$. Hence $\sigma(\gamma_1) = \sigma(\gamma_2)$ and
\[ T_\gamma u(\gamma_1 p) = u(\gamma\gamma_1 p) = u(\gamma_2\gamma p) = u(\gamma p) + \sigma(\gamma_2). \]
From this the Birkhoff property follows as we now show.

Proposition 22. Suppose that $u$ is $G_n$-periodic and minimizes $H_n$. Then it is a Birkhoff configuration.

Proof. We show that for any $\gamma$ in $G$ we have $|T_\gamma u - u - \sigma(\gamma)| \leq 1$. If $[\sigma(\gamma)]$ denotes the integer part of $\sigma(\gamma)$, it follows from the previous lemma that the configurations $\bar{u} = T_\gamma u - [\sigma(\gamma)]$ and $\underline{u} = \bar{u} - 1$ are also $G_n$-periodic and minimize $H_n$. Hence the differences $\bar{u} - u$ and $\underline{u} - u$ have constant sign. Clearly these differences are invariant by $G_n$, and so we can determine their sign by adding over a fundamental domain for $G_n$. To do this, observe that for a $G_n$-invariant configuration $v$ we have
\[ \sum_{F_n} (T_\gamma v - v) = \sigma(\gamma)\,|F_n| \]
and in fact the sum can be taken over any fundamental domain for $G_n$. Hence
\[ \sum_{F_n} (\bar{u} - u) = (\sigma(\gamma) - [\sigma(\gamma)])\,|F_n| \geq 0, \]
and similarly
\[ \sum_{F_n} (\underline{u} - u) = (\sigma(\gamma) - [\sigma(\gamma)] - 1)\,|F_n| \leq 0. \]
It follows that $0 \leq T_\gamma u - u - [\sigma(\gamma)] \leq 1$, which is equivalent to the Birkhoff property.

This discussion has the following consequence. We start with a tower of connected fundamental domains $F = F_0 \subset \cdots \subset F_n \subset \cdots$ corresponding to normal finite index subgroups $G_0 \supset G_1 \supset \cdots$ of $G$; then we obtain a sequence of configurations $u_0, u_1, \ldots$, where $u_n$ is periodic with respect to $G_n$ and minimizes $H_n$. Furthermore, since these configurations satisfy the Birkhoff property and the cocycle is integer valued on $G_0$, they are all periodic with respect to $G_0$. By invoking the operators $R$, we may assume that this sequence $\{u_n\}$ has a subsequence which converges pointwise to a configuration $u$. Hence $u$ is a Birkhoff configuration. Furthermore,


Proposition 23. The configuration $u$ is a ground state for $H$.

Indeed, let $v$ be a configuration that agrees with $u$ outside a finite subset $X$ of $\Lambda$. Then $X \subset F_n$ for some large $n$. We can find a larger $m$ such that every set $B$ which meets $F_n$ is inside $F_m$. Now restrict $v$ to $F_m$ and extend to $\Lambda$ by periodicity with respect to $G_m$. This does not change the value of $H(v, F_n)$. Then, for $k \geq n$, $H(v, F_n) \geq H(u_k, F_n)$. Since $u_k$ converges uniformly to $u$ on $F_n$, we take the limit with respect to $k$ and the ground state property of $u$ is verified.

The case in which the cocycle $\sigma$ is arbitrary is now immediate. We approximate $\sigma$ by rational cocycles. For each one the argument just described produces a Birkhoff ground state. By using the operators $R$, we may assume this sequence converges. The limit is then a Birkhoff ground state for $\sigma$.

8. Examples and Problems

The typical example is the following. Let $G$ be a group with a finite generating set $S = S^{-1}$. With it we construct a graph $\Gamma$ whose vertices are the elements of $G$ and whose edges are labeled by the elements of $S$, that is, there is an edge from $g$ to $h$ if $h = gs$, and we denote it by $(g, s, gs)$. This graph is made into a metric space by taking each edge to be isometric to the unit interval in the real line. Furthermore, there is a natural left action of $G$ on $\Gamma$, namely, $g(v, s, vs) = (gv, s, gvs)$, which is effective and transitive. There is a natural potential $H$ defined by $H_B(u) = (u(p) - u(q))^2$ if $p$ and $q$ are connected by an edge, and trivial otherwise. The ground state configurations of this model are harmonic functions on the group $G$, where the discrete Laplacian is defined by
\[ \Delta u(p) = \frac{1}{n(p)} \sum_{q \sim p} u(q) - u(p), \]
where $n(p)$ is the number of edges emanating from $p$, and $q \sim p$ means that $q$ is connected to $p$ by an edge. It is then elementary that any cocycle on $G$ defines a harmonic function. This corresponds to the integrable case.
The Frenkel–Kontorova model in the group $G$ involves also the site potentials $H_p(u) = V(u(p))$, where $V$ is a periodic function satisfying $V'' < 0$. The variational equations are
\[ n(g)\,u(g) - \sum_{s \in S} u(gs) + V'(u(g)) = 0. \]
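As an illustration of the variational equations above, the following minimal sketch evaluates their left-hand side on the lattice $\mathbb{Z}^2$ with the four standard generators. The choice of group, the cocycle $\sigma(g) = \omega \cdot g$, the site potential $V(x) = -(k/2\pi)\cos(2\pi x)$ and all function names are assumptions made only for this example; they are not taken from the paper. Without a site potential the residual vanishes for a cocycle (the harmonic, integrable case); with the site potential switched on it generally does not.

```python
import math

# Generators S = S^{-1} of Z^2 (assumed example group); every vertex has n(g) = 4.
GENERATORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def residual(u, dV, g):
    """Left-hand side n(g) u(g) - sum_{s in S} u(gs) + V'(u(g)) at the vertex g."""
    neighbours = sum(u((g[0] + s[0], g[1] + s[1])) for s in GENERATORS)
    return len(GENERATORS) * u(g) - neighbours + dV(u(g))


def cocycle(omega):
    """Additive map sigma(g) = omega . g, viewed as a configuration on Z^2."""
    return lambda g: omega[0] * g[0] + omega[1] * g[1]


def zero_force(x):
    """V' for the model without site potential."""
    return 0.0


def site_force(x, k=0.3):
    """V'(x) for the illustrative choice V(x) = -(k / (2 pi)) cos(2 pi x)."""
    return k * math.sin(2 * math.pi * x)


u = cocycle((0.25, 0.5))
sites = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]

# Largest residual over a patch of sites, without and with the site potential.
max_free = max(abs(residual(u, zero_force, g)) for g in sites)
max_pinned = max(abs(residual(u, site_force, g)) for g in sites)
```

Here `max_free` measures how far the cocycle is from being harmonic, and `max_pinned` shows that the same configuration is no longer critical once the hypothetical site potential is added.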

Most of what was done for $\mathbb{Z}$ carries over to this general situation. In physics there are two types of lattices which are frequently used. One is the Euclidean lattice $\mathbb{Z}^d$, which is amenable, and the other goes by the name of Bethe lattice, which is non-amenable. This one is typically used to exemplify strange phenomena in comparison with the Euclidean lattices (and non-amenability is responsible for this), and also because computations for models on it are simplified by the fact that Bethe lattices are trees. From our point of view, these lattices are the graphs of free products of groups of the form $G_1 * \cdots * G_n$, where each $G_i$ is a copy of the infinite cyclic group $\mathbb{Z}$ or of the two element group $\mathbb{Z}/2\mathbb{Z}$. They are finitely generated groups and it is elementary how to compute their first cohomology group. Therefore our theory applies to them.

To conclude, we mention some problems that appear to have some interest. Of course, the theorem of existence of quasi-periodic solutions is only the first step in the very rich Aubry-Mather theory and it would be interesting to see how much generalizes to the present setting. On the other hand, there are problems that seem to be suggested by the present formulation. The first concerns Gibbs states for the model $\Lambda, H, G$. It would be interesting to know what type of ground states are contained in the support of a Gibbs state. For instance, under what conditions is it true that the Dirac mass supported on a Birkhoff ground state is a Gibbs state? Examples related to this question are the Shlosman staircases (see [3]).

In a more geometrical vein, it would be interesting to develop a theory of minimal surfaces in Riemannian manifolds corresponding to the relation between the Frenkel–Kontorova model and geodesics on a torus. For manifolds of dimension three the theory of minimal surfaces admits a simplicial version (Jaco and Rubinstein). Given a compact three manifold $M$ with a triangulation, a simplicial minimal surface is described by how it should intersect the simplices of the triangulation. Passing to the universal cover of $M$ we obtain a space $\Lambda$, namely, the set of simplices, as well as an action of the fundamental group of $M$ on $\Lambda$.
An interaction potential can be written down in terms of the gluing data, so that solutions to the corresponding variational equations are simplicial minimal surfaces.

We also note that, given the interpretation of $S$ as energy and of the rotation numbers as density, it is interesting physically to study the function that associates the energy per particle of the minimal configuration to a rotation number. This has been studied when $\Lambda = \mathbb{Z}^d$ numerically in [9] and rigorously in [11].

We also call attention to the models of quasi-crystals based on aperiodic tilings [10]. These have actions of semi-groups that correspond to dilations. It would be interesting to know whether some version of Aubry-Mather theory could be developed for them.

References

1. Aubry, S. and Le Daeron, P.Y.: The discrete Frenkel–Kontorova model and its extensions; I. Exact results for the ground states. Physica D 8, 381–422 (1983)
2. Blank, M.L.: Metric properties of minimal solutions of discrete periodic variational problems. Nonlinearity 2, 1–22 (1989)
3. Georgii, H.-O.: Gibbs measures and phase transitions. de Gruyter Studies in Mathematics 9, Berlin–New York: Walter de Gruyter, 1988
4. Golé, C.: A new proof of the Aubry-Mather's theorem. Math. Z. 210, 441–448 (1992)
5. Kléman, M. and Sadoc, J.F.: A tentative description of the crystallography of amorphous solids. J. Physique Lett. 40, 569–574 (1979)
6. Koch, H., de la Llave, R. and Radin, C.: Aubry-Mather theory for configurations on lattices. Preprint
7. Mather, J.: Existence of quasiperiodic orbits for twists homeomorphisms of the annulus. Topology 21, 457–467 (1982)
8. Pokrovsky, V. and Talapov, A.: Theory of incommensurate crystals. Sov. Sci. Reviews, Supplement series 1 (1984)
9. Vallet, F.: Thermodynamique unidimensionelle, et structures bidimensionelles de quelques modeles pour des systemes incommensurables. Thesis, Univ. Paris VI (1986)
10. Senechal, M.: Quasicrystals and Geometry. Cambridge: Cambridge University Press, 1995
11. Sen, W.M.: Phase Locking in Multi-dimensional Frenkel–Kontorova models. Preprint (1994)
12. Wehrfritz, W.M.: Infinite linear groups. Ergebnisse der Mathematik, Band 76, New York: Springer-Verlag, 1973

Communicated by Ya. G. Sinai

Commun. Math. Phys. 192, 671 – 685 (1998)

Communications in Mathematical Physics
© Springer-Verlag 1998

On Eigenfunction Decay for Two Dimensional Magnetic Schrödinger Operators

H. D. Cornean (1), G. Nenciu (1,2)

(1) Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, 70700 Bucharest, Romania. E-mail: [email protected]
(2) Dept. of Theor. Phys., Univ. of Bucharest, P.O. Box MG 11, 76900 Bucharest, Romania. E-mail: [email protected]

Received: 1 January 1997 / Accepted: 25 July 1997

Abstract: For two dimensional Schrödinger operators with a nonzero constant magnetic field perturbed by a magnetic field and a scalar potential, both vanishing arbitrarily slowly at infinity, it is proved that eigenfunctions corresponding to the discrete spectrum decay faster than any exponential. Under more restrictive conditions on the perturbations, even quicker decay is obtained.

1. Introduction

The aim of this paper is to study the decay properties of the eigenfunctions for the two dimensional magnetic Schrödinger operators
\[ H = (p - a(x))^2 + V(x), \qquad x \in \mathbb{R}^2. \tag{1.1} \]
The two dimensional case is interesting both from the physical (e.g. the theory of the quantum Hall effect) and the mathematical point of view; in particular, since for $V = 0$ and constant magnetic field $B_0$ the spectrum of $H$ is the well known “Landau spectrum”
\[ \sigma_L(B_0) = \{(2n + 1)B_0 \mid n = 0, 1, \ldots\}, \tag{1.2} \]
by adding a perturbation one obtains a very rich spectral structure: eigenvalues in the gaps of the essential spectrum, “bands” created by an infinite number of magnetic and/or scalar “wells”, etc. In this paper we shall consider a particle in a constant magnetic field perturbed by a magnetic field and a scalar potential vanishing at infinity, i.e. the “one well” problem (we stress the fact that we impose conditions on the magnetic field and not on the corresponding vector potential); in this case it is known [I, H] that the essential spectrum of $H$ coincides with the Landau spectrum. More precisely, we shall obtain a strong decay at infinity of the eigenfunctions $\psi_E(x)$, corresponding to discrete eigenvalues $E$, irrespective of the fact that they lie below the essential spectrum or in between the Landau levels. In this context, a fairly straightforward application of Combes–Thomas–Agmon theory gives an exponential decay, $|\psi_E(x)| \leq \mathrm{const}\, \exp(-\mu(E)|x|)$, where $\mu(E)$ depends upon the distance between $E$ and the essential spectrum of $H$ and moreover $\mu(E) \to 0$ as $E$ approaches the essential spectrum (see however the remark at the end of the proof of Theorem 4.1). For a thorough review in this direction, including the many-body problem, see [C-F-K-S, A-H-S 1, A-H-S 2, H, Hu] and references therein.

On the other hand, if the scalar potential and the perturbing magnetic field are compactly supported or have rotational symmetry, it is not hard to see that the eigenfunctions have a gaussian decay irrespective of the position of $E$. In this context, T. Hoffmann-Ostenhof posed the problem of the decay of eigenfunctions under less stringent conditions on the perturbation. Results concerning this question were obtained by Erdős [E] for energies below the essential spectrum. He proved that if $V$ decays at infinity and is analytic in the angle variable then the gaussian decay of the eigenfunctions is preserved and, more importantly, he gave an example of a potential decaying at infinity for which the decay of the ground state is slower than a gaussian.

Our main result (see Theorem 4.1 for the precise formulation) is that if the perturbations satisfy
\[ B'(x),\ V(x) \to 0 \quad \text{when} \quad |x| \to \infty \tag{1.3} \]
then the eigenfunctions corresponding to $E \notin \sigma_L(B_0)$ decay faster than any exponential. We believe our result to be optimal; additional conditions have to be added in order to ensure a quicker decay of eigenfunctions and we give some results in this direction (see Theorem 4.2): if the perturbing magnetic field is compactly supported and the scalar potential has a decay of the form $\exp(-\delta|x|^\beta)$ with $0 < \beta \leq 2$ then the eigenfunctions decay like $\exp(-\mu|x|^{1+\beta/2})$. These results can be applied to the “magnetic multiple wells” problem (see [H-S, H-H, Na 1, Na 2]) in order to have a better control on the width of the mini-bands which appear in the tight-binding limit (see [B-C-D]); this will be done in a companion paper.

The content of the paper is as follows. Section 2 has a preparatory character; we write down some simple properties of the family of “local” transversal gauges we shall use as well as the closed formula for the integral kernel $K_0(x, x'; z)$ of the resolvent of $H$ in the case of constant magnetic field $B_0$ and $z \notin \sigma_L(B_0)$. Section 3 contains one of the main results of the paper: in the case when $V = 0$ and the magnetic field is of the form ($b > 0$):
\[ B(x) = B_0 + b\, B'(x) \tag{1.4} \]
with $B_0 > 0$ and $B'$ uniformly bounded and smooth, we prove that for fixed $z \notin \sigma_L(B_0)$ and $b$ sufficiently small, $z \in \rho(H)$, and we obtain an upper bound on the integral kernel $K(x, x'; z)$ of $(H - z)^{-1}$ (see Theorem 3.1 for the precise statement), which essentially says that as $|x - x'| \to \infty$, we have
\[ |K(x, x'; z)| \approx \exp[-\mu(b, z)|x - x'|], \qquad \lim_{b \to 0} \mu(b, z) = \infty. \tag{1.5} \]

Our method, which is interesting in itself, used for proving the above result is a “regularised” magnetic perturbation theory: the main idea is the fact that most of the “singularity” of the perturbation is given by a (nonvanishing at infinity) “gauge phase”; by an appropriate “factorisation” of this phase one obtains a much more regular perturbation theory (see [N] for another use of the same idea).


Section 4 contains the main results of the paper concerning the decay of eigenfunctions. Theorem 4.1 follows (with the technical complications due to the magnetic field) from Theorem 3.1 by substantiating the general idea that the decay at infinity of the eigenfunctions is governed by the “asymptotic” hamiltonian and its resolvent’s integral kernel behaviour. The proof of Theorem 4.2 is based on the fact that while the gaussian decay of the “free kernel” $K_0$ may not “propagate” itself in the Neumann series when adding a perturbing scalar potential, one can use the decay properties of the potential in order to obtain a decay which is stronger than that of the potential.

2. Preliminaries

In this paper we shall consider only the bidimensional case (i.e. the particle is confined in the plane $x_3 = 0$ and the magnetic field is orthogonal to that plane). For $b \in \mathbb{R}$ and $d = (d_1, d_2) \in \mathbb{R}^2$, define
\[ b \wedge d \equiv (-b\, d_2,\ b\, d_1) \in \mathbb{R}^2. \tag{2.1} \]
Let $B(x) \in C^1(\mathbb{R}^2)$. We shall use the following family of vector potentials corresponding to $B(x)$:
\[ a(x, x') = \int_0^1 ds\, s\, B(x' + s(x - x')) \wedge (x - x'). \tag{2.2} \]
For $x' = 0$, this is nothing but the usual transversal gauge (see e.g. [T]):
\[ a(x, 0) \equiv a(x) = \int_0^1 ds\, s\, B(s x) \wedge x. \tag{2.3} \]
If we define
\[ f(x, x') = a(x) - a(x, x'), \tag{2.4} \]
then there exists $\varphi(x, x')$ such that
\[ \nabla_x \varphi(x, x') = f(x, x'). \tag{2.5} \]
The additional requirement
\[ \varphi(x', x') = 0 \tag{2.6} \]
gives
\[ \varphi(x, x') = \int_{x'_1}^{x_1} dt\, f_1(t, x_2; x') + \int_{x'_2}^{x_2} dt\, f_2(x'_1, t; x'), \tag{2.7} \]
where $x_i$, $x'_i$, $f_i$ are the cartesian components of $x$, $x'$ and $f$ respectively. From the explicit formulae (2.2), (2.4) and (2.7), it follows at once that

Lemma 2.1. If $|B(x)| \leq M < \infty$, then

i)
\[ |\varphi(x, x')| \leq 2 M |x - x'| (|x| + |x - x'|). \tag{2.8} \]
ii)
\[ a(x, x') \cdot (x - x') = 0. \tag{2.9} \]
iii)
\[ \nabla_x a(x, x') = \int_0^1 ds\, s \left( \frac{\partial}{\partial x_2} B(x' + s(x - x')),\ -\frac{\partial}{\partial x_1} B(x' + s(x - x')) \right) \cdot (x - x'). \tag{2.10} \]
iv) If $B(x) = B_0$ is constant, then
\[ \varphi_0(x, x') = -\frac{1}{2} B_0 (x_1 x'_2 - x'_1 x_2), \qquad a_0(x, x') = \frac{1}{2} B_0 \wedge (x - x'). \tag{2.11} \]
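The constant-field formulas (2.11) can be checked numerically against the defining relations (2.4), (2.5), (2.6) and (2.9). The sketch below is only an illustration: the function names, the sample points and the finite-difference step are assumptions made for this example.

```python
def a0_two_point(x, xp, B0):
    """a_0(x, x') = (1/2) B0 ∧ (x - x'), with b ∧ d = (-b d_2, b d_1) as in (2.1)."""
    return (-0.5 * B0 * (x[1] - xp[1]), 0.5 * B0 * (x[0] - xp[0]))


def phi0(x, xp, B0):
    """phi_0(x, x') = -(1/2) B0 (x_1 x'_2 - x'_1 x_2), formula (2.11)."""
    return -0.5 * B0 * (x[0] * xp[1] - xp[0] * x[1])


def grad_phi0(x, xp, B0, h=1e-6):
    """Central-difference gradient of phi_0 with respect to x."""
    d1 = (phi0((x[0] + h, x[1]), xp, B0) - phi0((x[0] - h, x[1]), xp, B0)) / (2 * h)
    d2 = (phi0((x[0], x[1] + h), xp, B0) - phi0((x[0], x[1] - h), xp, B0)) / (2 * h)
    return (d1, d2)


B0 = 2.0
x, xp = (0.7, -1.3), (-0.4, 0.9)

# f(x, x') = a_0(x) - a_0(x, x'), where a_0(x) = a_0(x, 0) is the transversal gauge.
a_x = a0_two_point(x, (0.0, 0.0), B0)
a_xxp = a0_two_point(x, xp, B0)
f = (a_x[0] - a_xxp[0], a_x[1] - a_xxp[1])

g = grad_phi0(x, xp, B0)
gauge_error = max(abs(g[0] - f[0]), abs(g[1] - f[1]))                   # checks (2.5)
transversality = a_xxp[0] * (x[0] - xp[0]) + a_xxp[1] * (x[1] - xp[1])  # checks (2.9)
normalisation = phi0(xp, xp, B0)                                        # checks (2.6)
```

All three quantities should vanish up to rounding, which is consistent with $\varphi_0$ being the gauge phase relating the transversal gauge centred at the origin to the one centred at $x'$.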

The hamiltonian of a particle in the presence of the magnetic field and a scalar potential $V$ is (in the transversal gauge):
\[ H = (p - a(x))^2 + V(x), \quad p = \left( -i\frac{\partial}{\partial x_1},\ -i\frac{\partial}{\partial x_2} \right), \quad a(x) = \left( -x_2 \int_0^1 ds\, s\, B(s x),\ x_1 \int_0^1 ds\, s\, B(s x) \right). \tag{2.12} \]
In the case of constant magnetic field, one has the hamiltonian
\[ H_0 = (p - a_0(x))^2, \tag{2.13} \]
where $p = -i\nabla_x$ and
\[ a_0(x) = \left( -\frac{1}{2} B_0 x_2,\ \frac{1}{2} B_0 x_1 \right), \tag{2.14} \]
which is essentially self-adjoint on $C_0^\infty(\mathbb{R}^2)$ and its spectrum is the well known Landau spectrum
\[ \sigma(H_0) = \sigma_{ess}(H_0) = \{(2n + 1)B_0 \mid n = 0, 1, 2, \ldots\}. \tag{2.15} \]
For $z \notin \sigma(H_0)$ and $g \in L^2(\mathbb{R}^2)$, we write
\[ \left[ (H_0 - z)^{-1} g \right](x) = \int dx'\, K_0(x, x'; z)\, g(x'), \qquad (H_0 - z) K_0(x, x'; z) = \delta(x - x'). \tag{2.16} \]
The following holds (see e.g. [J-P]):

Lemma 2.2. Let
\[ \varphi_0(x, x') = -\frac{B_0}{2} (x_1 x'_2 - x_2 x'_1), \qquad \psi(x, x') = \frac{B_0}{4} |x - x'|^2, \qquad \alpha = -\frac{1}{2}\left( \frac{z}{B_0} - 1 \right) \neq -1, -2, \ldots. \]
Then
\[ K_0(x, x'; z) = e^{i\varphi_0(x, x')} G_0(x, x'; z) = \frac{\Gamma(\alpha)}{4\pi}\, e^{i\varphi_0(x, x')}\, e^{-\psi(x, x')}\, U(\alpha, 1; 2\psi(x, x')), \tag{2.17} \]
where $\Gamma$ is the Euler gamma function and $U(\alpha, \gamma; \xi)$ is the confluent hypergeometric function [A-S].

For a proof using the eigenfunctions of $H_0$, see [J-P]. Alternatively, for $\Re(z) < B_0$ one can obtain (2.17) by taking the Laplace transform of the formula for the “heat” kernel (see e.g. [F-H]) and using an appropriate integral representation for the confluent hypergeometric function (see formula (13.2.5) in [A-S]). The formula (2.17), for arbitrary $z \in \rho(H_0)$, follows by analytic continuation. The holomorphy and asymptotic behaviour of $U(\alpha, \gamma; \xi)$ (see e.g. [A-S]) imply the following inequality which we need later on: if $\beta, m > 0$, $[c, d] \subset \rho(H_0)$ and $z \in [c, d]$, then:
\[ \sup_{t \in \mathbb{R}_+} t^\beta e^{-m t} \left| U\!\left(\alpha, 1, \frac{B_0}{2} t\right) \right| \leq M(\beta, m, c, d) < \infty. \tag{2.18} \]
From Lemma 2.2 it follows that $K_0(x, x'; z)$ has a gaussian decay as $|x - x'| \to \infty$. We shall use this in the following form:

Corollary 2.1. Let $\mu > 0$ and $T(\mu, x_0)$ be the operator of multiplication with $e^{\mu|x - x_0|^2}$, $x_0 \in \mathbb{R}^2$. Then for all $0 < \delta < B_0/8$ and $z \in \rho(H_0)$, one has that the operators
\[ T(\delta, x_0)\, (H_0 - z)^{-1}\, T(-2\delta, x_0) : L^2(\mathbb{R}^2) \to L^2(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2) \tag{2.19} \]
are bounded.

Proof. Since for all $x, x', x_0 \in \mathbb{R}^2$, $\delta|x - x_0|^2 \leq 2\delta|x - x'|^2 + 2\delta|x' - x_0|^2$, one has
\[ \left| e^{\delta|x - x_0|^2} K_0(x, x'; z)\, e^{-2\delta|x' - x_0|^2} f(x') \right| \leq \frac{|\Gamma(\alpha)|}{4\pi}\, e^{-2\left(\frac{B_0}{8} - \delta\right)|x - x'|^2}\, |U(\alpha, 1; 2\psi(x, x'))|\, |f(x')|. \tag{2.20} \]
Since (2.18) holds, one has that
\[ e^{-2\left(\frac{B_0}{8} - \delta\right) y^2}\, U\!\left(\alpha, 1; \frac{B_0 y^2}{2}\right) \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2) \tag{2.21} \]
and the use of Young's inequality ([R-S 2]) finishes the proof.

Remark. Since under a gauge transformation $U_\chi f(x) = e^{i\chi(x)} f(x)$ and
\[ \left[ U_\chi^* (H_0 - z)^{-1} U_\chi f \right](x) = \int_{\mathbb{R}^2} dx'\, e^{-i\chi(x)} K_0(x, x'; z)\, e^{i\chi(x')} f(x') = \int_{\mathbb{R}^2} dx'\, K_\chi(x, x'; z) f(x'), \tag{2.22} \]
one has
\[ |K_\chi(x, x'; z)| = |K_0(x, x'; z)|, \tag{2.23} \]
i.e. the gaussian decay is valid for an arbitrary gauge.


3. Schrödinger Operators with Nonconstant Magnetic Field

Suppose that we are given a magnetic field
\[ B(x) = (0, 0, B_0 + B'(x)), \quad B_0 > 0, \quad \|B'\|_{C^1(\mathbb{R}^2)} \equiv \max_{|\alpha| \leq 1} \{ \|D^\alpha B'\|_\infty \} \leq 1. \tag{3.1} \]
A vector potential corresponding to $B'$ is (see Sect. 2):
\[ a'(x, x') = \left( -(x_2 - x'_2) \int_0^1 ds\, s\, B'(x' + s(x - x')),\ (x_1 - x'_1) \int_0^1 ds\, s\, B'(x' + s(x - x')) \right). \tag{3.2} \]
Then $a(x, x') = a_0(x, x') + a'(x, x')$ and
\[ |a'(x, x')| < |x - x'|, \qquad |\nabla_x a'(x, x')| < |x - x'|. \tag{3.3} \]
We showed that there exist $\varphi(x, x')$ and $\varphi_0(x, x')$ such that:
\[ a(x) = a(x, x') + \nabla_x \varphi(x, x') = a_0(x) + a'(x, x') + \nabla_x \varphi'(x, x'), \tag{3.4} \]
where
\[ \varphi'(x, x') = \varphi(x, x') - \varphi_0(x, x'). \tag{3.5} \]
We consider hamiltonians of the form:
\[ H_b = (p - a_0 - b a')^2, \qquad b > 0, \tag{3.6} \]
which are essentially self-adjoint on $C_0^\infty(\mathbb{R}^2)$ (see e.g. [R-S]). The main result of this section is the following perturbative expansion of the integral kernel $K_b(x, x'; z)$ of the resolvent of $H_b$:

Theorem 3.1. Let $[c, d] \subset \rho(H_0)$. Then there exists $b_0 > 0$, depending on $[c, d]$, such that for any $0 \leq b \leq b_0$ we have that $[c, d] \subset \rho(H_b)$ and for any $z \in [c, d]$, $(H_b - z)^{-1}$ has an integral kernel given by the following perturbative expansion:
\[ K_b(x, x'; z) = e^{i\varphi'_b(x, x')} K_0(x, x'; z) + \sum_{n \geq 0} F_n(x, x'; z), \tag{3.7} \]
where $F_n(x, x'; z)$ are integral kernels corresponding to the operators $F_n \in B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2))$ with
\[ \|F_n\|_{B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2))} \leq [\mathrm{const}([c, d])\, b]^{n+1}. \tag{3.8} \]


Proof. Denote with $H_{b,x'} = (p - a_0(x) - b a'(x, x'))^2$. Then by gauge covariance:
\[ H_b\, e^{i\varphi'_b(x, x')} = e^{i\varphi'_b(x, x')} H_{b,x'}. \tag{3.9} \]
The equation satisfied by $K_b$ takes the form:
\[ \delta(x - x') = (H_b - z) K_b(x, x'; z) = e^{i\varphi'_b(x, x')} (H_{b,x'} - z)\, e^{-i\varphi'_b(x, x')} K_b(x, x'; z). \tag{3.10} \]
The main idea of the proof is to factorise the “gauge phases”. This is achieved by making the following ansatz:
\[ K_b(x, x'; z) = e^{i\varphi'_b(x, x')} K_0(x, x'; z) + F(x, x'; z) \tag{3.11} \]
with $F$ to be found. One can rewrite (3.10) as:
\[ \delta(x - x') = (H_b - z) \left[ e^{i\varphi'_b(x, x')} K_0(x, x'; z) + F(x, x'; z) \right] = (H_b - z) F(x, x'; z) + e^{i\varphi'_b(x, x')} (H_{b,x'} - z) K_0(x, x'; z). \tag{3.12} \]
But $H_{b,x'} = H_0 - R_b(x, x')$, where
\[ R_b(x, x') = 2b\, a'(x, x') \cdot (p - a_0(x)) - i b\, \nabla_x a'(x, x') - b^2 a'(x, x')^2. \tag{3.13} \]
Since $\varphi'_b(x', x') = 0$, (3.12) becomes (using (2.16)):
\[ (H_b - z) F(x, x'; z) = e^{i\varphi'_b(x, x')} R_b(x, x') K_0(x, x'; z). \tag{3.14} \]
Using the explicit form of $K_0(x, x'; z)$ given in Lemma 2.2, the equality in (3.3) and the equality (2.9), one can derive that
\[ a'(x, x') \cdot (p - a_0(x)) K_0(x, x'; z) = -a'(x, x') \cdot a_0(x, x') K_0(x, x'; z). \tag{3.15} \]
Moreover, using (2.18) with $m = B_0/8$, one can obtain the following estimate:
\[ |R_b(x, x') K_0(x, x'; z)| \leq b\, C(c, d) \exp\left( -\frac{B_0 |x - x'|^2}{8} \right). \tag{3.16} \]
Write as $S_b(z)$ the operator corresponding to $e^{i\varphi'_b(x, x')} K_0(x, x'; z)$ and as $T_b(z)$ the operator linked to $e^{i\varphi'_b(x, x')} R_b(x, x') K_0(x, x'; z)$. Using Young inequalities as in Corollary 2.1, one can see that
\[ T_b(z),\ S_b(z) \in B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2)) \cap B(L^2(\mathbb{R}^2), L^\infty(\mathbb{R}^2)) \tag{3.17} \]
and
\[ \|S_b(z)\|_{B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2))} \leq \|K_0(y; z)\|_1 \leq C_1(c, d) < \infty, \qquad \|T_b(z)\|_{B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2))} \leq C_2(c, d)\, b. \tag{3.18} \]
Define
\[ F_n(z) = S_b(z) [T_b(z)]^{n+1}, \qquad n \geq 0. \tag{3.19} \]
From (3.14) and (3.11) one has:
\[ (H_b - z)^{-1} (1 - T_b(z)) = S_b(z) \tag{3.20} \]
and if
\[ b < \frac{1}{C_2(c, d)}, \tag{3.21} \]
then (3.20) becomes:
\[ (H_b - z)^{-1} = S_b(z) + \sum_{n \geq 0} F_n(z). \tag{3.22} \]
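The passage from (3.20) to (3.22) is a Neumann series argument. As a finite-dimensional illustration only, the sketch below replaces $H_b - z$ by a $2 \times 2$ matrix `A` and $S_b(z)$ by an approximate inverse `S`, sets $T = 1 - AS$ (so that $A^{-1}(1 - T) = S$, mirroring (3.20)), and sums $S \sum_n T^n$. The matrices, helper names and number of terms are assumptions chosen for this example; nothing here reproduces the actual operators of the paper.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def combine(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neumann_inverse(A, S, terms=60):
    """Approximate A^{-1} by S * sum_{n < terms} T^n with T = 1 - A S."""
    n = len(A)
    T = combine(identity(n), matmul(A, S), sign=-1.0)
    power = identity(n)
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = combine(total, power)
        power = matmul(power, T)
    return matmul(S, total)


# Stand-in for H_b - z, and the inverse of its diagonal part as stand-in for S_b(z).
A = [[2.0, 0.3], [0.1, 1.5]]
S = [[0.5, 0.0], [0.0, 1.0 / 1.5]]

A_inv = neumann_inverse(A, S)
check = matmul(A, A_inv)  # should be close to the identity matrix
```

The series converges here because the stand-in for $T_b(z)$ has small norm, which is the role played by condition (3.21) in the proof.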

Corollary 3.1. Let $K$ be a compact in $\mathbb{R}$, such that $K \cap \sigma(H_0) = \emptyset$. Define:
\[ B_n(x) = B_0 + B'_n(x), \quad B_0 > 0, \qquad b_n = \|B'_n\|_{C^1(\mathbb{R}^2)}, \quad \lim_{n \to \infty} b_n = 0. \tag{3.23} \]
Let $\hat{v}$ be a unit vector in $\mathbb{R}^2$ and fix $y \in \mathbb{R}^2$. Denote with $T(u, y)$ and $D(u, y)$ the multiplication operators with $e^{u|\hat{v} \cdot (x - y)|}$ and $e^{u|x - y|}$. Fix $u > 0$. Define
\[ H_n = (p - a_0(x) - a'_n(x))^2. \tag{3.24} \]
Then there exists $N(u) \geq 1$ such that for any $n \geq N(u)$ we have $K \cap \sigma(H_n) = \emptyset$ and

i)
\[ T(u, y) (H_n - z)^{-1} T(-u, y) \in B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2)), \tag{3.25} \]
ii)
\[ T(u, y) (H_n - z)^{-1} T(-u, y) \in B(L^2(\mathbb{R}^2), L^\infty(\mathbb{R}^2)), \tag{3.26} \]
iii)
\[ D(u, y) (H_n - z)^{-1} D(-u, y) \in B(L^2(\mathbb{R}^2), L^2(\mathbb{R}^2)). \tag{3.27} \]
Moreover, the norms are uniformly bounded in $y$, $\hat{v}$, in $n \geq N(u)$ and $z \in K$.

Proof. We treat only the case i), the other ones being analogous. We can apply Theorem 3.1 if $n$ is sufficiently large. One obtains:
\[ (H_n - z)^{-1} = S_n(z) \sum_{m \geq 0} T_n^m(z). \tag{3.28} \]
Because both $S_n(z)$ and $T_n(z)$ are integral operators, using (3.16) and the explicit form of $K_0(x, x'; z)$, one can prove (as in Corollary 2.1) that for any $\psi \in L^2(\mathbb{R}^2)$ and $m \geq 0$,
\[ \|T(u, y) S_n(z) T_n^m(z) T(-u, y)\, \psi\|_{2\,(\infty)} \leq \frac{|\Gamma(\alpha)|}{4\pi} \sup_{t \in \mathbb{R}_+} \exp\left( u t - \frac{B_0 t^2}{8} \right) \left\| U(\alpha, 1, (B_0 x^2)/2)\, e^{-\frac{B_0 x^2}{8}} \right\|_{1\,(2)} \times \left[ C_2(K)\, b_n \sup_{t \in \mathbb{R}_+} \exp\left( u t - \frac{B_0 t^2}{16} \right) \right]^m \|\psi\|_2. \tag{3.29} \]
Take now $n \geq N(u)$ sufficiently large such that
\[ b_n < \frac{e^{-\frac{4u^2}{B_0}}}{C_2(K)}, \tag{3.30} \]
and the proof follows.
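The threshold in (3.30) comes from the elementary identity $\sup_{t \geq 0} \exp(ut - B_0 t^2/c) = \exp(c u^2 / (4 B_0))$, attained at $t = c u / (2 B_0)$; for $c = 16$ this gives $e^{4u^2/B_0}$, so (3.30) makes the bracket in (3.29) smaller than one. The sketch below compares the closed form with a brute-force grid maximum. The function names, the sample values of $u$, $B_0$, $C_2$ and the grid are assumptions made for this illustration.

```python
import math


def sup_closed_form(u, B0, c):
    """Closed form of sup_{t >= 0} exp(u t - B0 t^2 / c), attained at t = c u / (2 B0)."""
    return math.exp(c * u * u / (4.0 * B0))


def sup_on_grid(u, B0, c, t_max=20.0, steps=20000):
    """Brute-force maximum of exp(u t - B0 t^2 / c) on a uniform grid in [0, t_max]."""
    h = t_max / steps
    return max(math.exp(u * (k * h) - B0 * (k * h) ** 2 / c) for k in range(steps + 1))


def threshold(u, B0, C2):
    """Right-hand side of (3.30): exp(-4 u^2 / B0) / C2."""
    return math.exp(-4.0 * u * u / B0) / C2


u, B0, C2 = 1.5, 2.0, 3.0
closed = sup_closed_form(u, B0, 16.0)
grid = sup_on_grid(u, B0, 16.0)

# With b_n equal to half the threshold, the ratio of the geometric series is 1/2.
ratio = C2 * (0.5 * threshold(u, B0, C2)) * closed
```

With these sample values the grid maximum agrees with the closed form, and the series in $m$ appearing in (3.29) is dominated by a convergent geometric series.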


4. Decay of Eigenfunctions

In this section we consider the decay of the eigenfunctions of
\[ H = (p - a)^2 + V, \tag{4.1} \]
where the scalar potential $V$ and the magnetic field which corresponds to $a$,
\[ \nabla \wedge a(x) = B_0 + B'(x), \qquad B_0 > 0, \tag{4.2} \]
satisfy the following conditions:
\[ B' \in C^1(\mathbb{R}^2); \quad \lim_{n \to \infty} \|B'\|_{C^1(\mathbb{R}^2 \setminus \{|x| \leq n\})} = 0, \qquad V = V_1 + V_2; \quad V_1 \in L^2(\mathbb{R}^2),\ V_2 \in L^\infty(\mathbb{R}^2), \quad \lim_{n \to \infty} \sup_{|x| \geq n} |V_2(x)| = 0. \tag{4.3} \]
Under these conditions, $H$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^2)$ (see e.g. [C-F-K-S]). Moreover, $V$ is relatively compact with respect to $(p - a)^2$ [C-F-K-S] which, together with the results in [I, H], implies that
\[ \sigma_{ess}(H) = \sigma(H_0) = \{(2n + 1)B_0 \mid n = 0, 1, 2, \ldots\}. \tag{4.4} \]
Let $E \in \sigma_{disc}(H)$ (the discrete spectrum of $H$) and let $\psi$ be a normalised eigenfunction corresponding to $E$. The main result of this section is that $\psi$ decays faster than any exponential:

Theorem 4.1. Suppose $B_0 > 0$ and $B'$, $V$ satisfy (4.3). Let $K \subset \mathbb{R} \setminus \sigma(H_0)$, $K$ compact, $E \in K$, $(H - E)\psi = 0$, $\psi \in L^2(\mathbb{R}^2)$. Then for any $0 \leq u < \infty$ there exists a constant $C(u, K) < \infty$ such that:
\[ |\psi(x)| \leq C(u, K)\, e^{-u|x|}. \tag{4.5} \]
Remark. As proved by Erdős [E] (see also Theorem 4.2 below), under more restrictive conditions one can prove the existence of $\mu > 0$ such that
\[ |\psi(x)| \leq \mathrm{const}\ e^{-\mu|x|^r}, \qquad 1 < r \leq 2. \tag{4.6} \]
The problem whether under the conditions (4.3) the decay given by Theorem 4.1 is optimal remains open.

In what follows,
\[ g \in C^\infty(\mathbb{R}^2); \quad g = \bar{g}; \quad \|g\|_{C^2(\mathbb{R}^2)} = M < \infty. \tag{4.7} \]
Under the conditions (4.3), one has $D(H) = D((p - a)^2)$ and $(p_j - a_j)(H + i)^{-1}$ is bounded, $j \in \{1, 2\}$. Moreover, because $[H, g] = -i\{(p - a) \cdot \nabla g + \nabla g \cdot (p - a)\}$, it follows that $\|[H, g](H + i)^{-1}\| \leq \mathrm{const}(M)$. Denote with $\Phi = [H, g]\psi = (E + i)[H, g](H + i)^{-1}\psi$. Then
\[ \|\Phi\|_2 \leq \mathrm{const}([c, d], M). \tag{4.8} \]
In what follows, we shall need an $L^\infty$ estimate on $\psi$.


Lemma 4.1. |ψ(x)| ≤ C(E),

(4.9)

where C(E) is bounded on compacts. Proof. As is known, (4.9) follows from the general results (Harnack type inequalities); actually in our case, for sufficiently large λ, one has (H + λ)−1 ∈ B L2 , L∞ and then the result follows from ψ = (E + λ)(H + λ)−1 ψ. Proof of Theorem 4.1. Let now vˆ be a unit vector in R2 and 0 if x ≤ 21 g ∈ C ∞ (R) , 0 ≤ g(x) ≤ 1 , g(x) = , 1 if x ≥ 1 vˆ · x gn,v ≡ g ; n ≡ x ∈ R2 | vˆ · x ≥ n . n

(4.10)

From now on, we shall denote gn,v with gn . Let Bn (x) = B0 + gn B 0 (x) , Vn = gn V (x), Z 1 an (x) = ds s Bn (s x) ∧ x,

(4.11)

0

and Hn = (p − an (x))2 + Vn (x).

(4.12)

For x(n) ∈ n , we have for all x ∈ n (remember that gn (x) = 1 on n ): an (x) = an (x, x(n) ) + ∇x ϕn (x, x(n) ) = = a(x) + ∇x δϕn (x, x(n) ),

(4.13)

δϕn (x, x(n) ) = ϕn (x, x(n) ) − ϕ(x, x(n) ).

(4.14)

where As a consequence (see (4.12) and (4.13) ): Hg2n = (p − a)2 + V g2n = = ei δϕn (.,x

(n)

)

Hn e−i δϕn (.,x

(n)

)

g2n .

(4.15)

From (4.3) and (4.10) it follows that, uniformly in vˆ , lim ||Bn 0 ||C 1 (R2 ) = 0

n→∞

lim ||gn V2 ||∞ = 0,

n→∞

;

lim ||gn V1 ||2 = 0,

n→∞

(4.16)

and then from Corollary 3.1 it follows that if E ∈ K then E ∈ ρ (p − an )2 and in addition

On Eigenfunction Decay for 2D Magnetic Schr¨odinger Operators

681

−1 ||Vn (p − an )2 − E ||B(L2 ,L2 ) ≤ −1 ≤ ||gn V1 ||2 || (p − an )2 − E ||B(L2 ,L∞ ) + −1 + ||gn V2 ||∞ || (p − an )2 − E ||B(L2 ,L2 ) → 0 , when n → ∞

(4.17)

It follows that there exists $n(K)$ such that for $n \ge n(K)$,
$$(H_n - E)^{-1} = \big((p-a_n)^2 - E\big)^{-1} \sum_{k \ge 0} \Big\{ V_n \big((p-a_n)^2 - E\big)^{-1} \Big\}^k, \tag{4.18}$$
where the series in the R.H.S. of (4.18) is $B(L^2, L^2)$ and $B(L^2, L^\infty)$ convergent. Fix now $u > 0$. Corollary 3.1 implies the existence of $N_1(u, K)$ such that
$$\big\|T(u,0)\big((p-a_n)^2 - E\big)^{-1} T(-u,0)\big\|_{B(L^2,L^2)} \le C_1(u,K), \qquad \big\|T(u,0)\big((p-a_n)^2 - E\big)^{-1} T(-u,0)\big\|_{B(L^2,L^\infty)} \le C_2(u,K), \tag{4.19}$$
as soon as $n \ge N_1(u, K)$ and $E \in K$. From (4.15) it follows that:
$$(H - E) g_{2n}\psi = [H, g_{2n}]\psi \equiv \Phi_{2n} = e^{i\,\delta\varphi_n}(H_n - E)\, e^{-i\,\delta\varphi_n} g_{2n}\psi. \tag{4.20}$$

Together with (4.18), one has ($n$ large enough):
$$g_{2n}\psi = e^{i\,\delta\varphi_n} \big((p-a_n)^2 - E\big)^{-1} \sum_{k\ge 0} \Big\{ V_n \big((p-a_n)^2 - E\big)^{-1} \Big\}^k e^{-i\,\delta\varphi_n}\, \Phi_{2n}. \tag{4.21}$$
Because $T(u,0)$ commutes with $V$, (4.21) becomes:
$$\|T(u,0) g_{2n}\psi\|_\infty \le C_2(u,K) \sum_{k\ge 0} \big\|V_n T(u,0)\big((p-a_n)^2 - E\big)^{-1} T(-u,0)\big\|^k_{B(L^2,L^2)} \times \|T(u,0)\Phi_{2n}\|_2. \tag{4.22}$$
Since $\operatorname{supp} \Phi_{2n} \subset \Omega_n \setminus \Omega_{2n}$, one has:
$$\|T(u,0)\Phi_{2n}\|_2 \le e^{2nu}\, \|\Phi_{2n}\|_2. \tag{4.23}$$
Using (4.19), there exists $N(u,K) > 0$ such that:
$$\big\|V_n T(u,0)\big((p-a_n)^2 - E\big)^{-1} T(-u,0)\big\|_{B(L^2,L^2)} \le \|V_1 g_n\|_2\, C_2(u,K) + \|V_2 g_n\|_\infty\, C_1(u,K) < 1 \quad \text{if } n \ge N(u,K). \tag{4.24}$$
Because $\|g_{2n}\|_{C^2(\mathbb{R}^2)} \le \mathrm{const}\,\frac{1}{n}$ and using (4.24), (4.23), (4.22) and (4.8) one obtains:
$$e^{u\,\hat v \cdot x}\, |\psi(x)|\, g_{2N(u,K)} \le C(u,K)\, e^{2uN(u,K)}, \tag{4.25}$$
and since $\hat v$ is arbitrary, the proof of Theorem 4.1 is completed.


Remark. To obtain a proof of Theorem 4.1 in the Combes-Thomas scheme, in order to apply O'Connor's Lemma, one needs to show that the essential spectrum of $\exp(\alpha(1+x^2)^{1/2})\, H \exp(-\alpha(1+x^2)^{1/2}) \equiv H(\alpha)$ is (at least) included in $\sigma_L(B_0)$ independently of $\alpha$. One can try to prove this by showing that $\sigma_{ess}(H(\alpha)) \subset \sigma_{ess}(H_0(\alpha)) \subset \sigma_L(B_0)$. The second inclusion follows at once from the gaussian decay of the kernel of $(H_0 - z)^{-1}$. If $B'(x) = 0$ the first inclusion (in fact equality) follows from a standard compactness argument. The hard case (at least for us) is $B'(x) \neq 0$ and actually Theorem 3.1 can be used to prove the inclusion in this case.

In the rest of this section,
$$B \in C^\infty \quad \text{and} \quad B(x) = B_0 \ \text{ if } |x| \ge 1. \tag{4.26}$$

The scalar potential $V$ belongs to $L^2 + (L^\infty)$ and we suppose that there exist $\delta > 0$ and $0 < \beta \le 2$ such that:
$$|V(x)| \le C e^{-\delta |x|^\beta}, \quad \text{for } |x| \ge 1, \quad C > 0. \tag{4.27}$$
Then the following holds.

Theorem 4.2. Let $r = 1 + \frac{\beta}{2}$ $(1 < r \le 2)$. Then there exists $\mu > 0$ such that
$$|\psi(x)| \le \mathrm{const}(E)\, e^{-\mu |x|^r}. \tag{4.28}$$
One can write (see (4.11)):
$$|V_n(x)| \le C_n e^{-\frac{\delta}{2}|x|^\beta}, \qquad \lim_{n\to\infty} C_n = 0. \tag{4.29}$$

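Take $N_1$ sufficiently large such that if $n \ge N_1$, then:
$$C_n < \operatorname{dist}(E, \sigma(H_0)), \tag{4.30}$$
and by an argument similar to that made in Theorem 4.1, one has:
$$g_{2n}\psi = e^{i\,\delta\varphi_n} (H_0 - E)^{-1} \sum_{k\ge 0} \Big\{ V_n (H_0 - E)^{-1} \Big\}^k e^{-i\,\delta\varphi_n}\, \Phi_{2n}. \tag{4.31}$$
Denote with $x_v = \hat v \cdot x$, with $T(u,r)$ the multiplication operator with $e^{u|x_v|^r}$ (where $u > 0$ and $1 \le r \le 2$) and with $\chi = \chi_{\Omega_{n/2}}$. Because $\operatorname{supp} V_n \subset \Omega_{n/2}$, one has
$$g_{2n}\psi = e^{i\,\delta\varphi_n} \chi (H_0 - E)^{-1} \chi \sum_{k\ge 0} \Big\{ \chi V_n (H_0 - E)^{-1} \chi \Big\}^k e^{-i\,\delta\varphi_n}\, \Phi_{2n}. \tag{4.32}$$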

Lemma 4.2. Let $z \in \rho(H_0)$ and $0 \le u < \frac{B_0}{2^{r+1}}$. Then
$$T(u,r)\,\chi (H_0 - z)^{-1} \chi\, T(-2^{r-1}u, r) \in B(L^2, L^2) \cap B(L^2, L^\infty). \tag{4.33}$$

Proof. One has:
$$\frac{B_0}{4}|x - x'|^2 \ge \frac{B_0}{4}|x_v - x'_v|^r - \frac{B_0}{4}. \tag{4.34}$$
If $x, x' \in \Omega_{n/2}$, then $x_v$ and $x'_v$ are positive, therefore
$$u x_v^r \le 2^{r-1} u |x_v - x'_v|^r + 2^{r-1} u\, x_v'^{\,r}. \tag{4.35}$$
Using Young inequalities (as in Corollary 3.1), the result is straightforward.
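The two estimates in the proof of Lemma 4.2 rest on the elementary inequalities $t^2 \ge t^r - 1$ for $t \ge 0$ and $(a+b)^r \le 2^{r-1}(a^r + b^r)$ for $a, b \ge 0$, both valid for $1 \le r \le 2$. The following is only a numerical spot check on a grid, not a proof; the function names, the grid and the sampled exponents are illustrative assumptions and do not come from the paper.

```python
import itertools


def gap_434(t, r):
    """t^2 - (t^r - 1): should be nonnegative for t >= 0, 1 <= r <= 2 (cf. (4.34), t = |x - x'|)."""
    return t**2 - (t**r - 1.0)


def gap_435(a, b, r):
    """2^(r-1) (a^r + b^r) - (a + b)^r: should be nonnegative by convexity of s -> s^r (cf. (4.35))."""
    return 2.0 ** (r - 1) * (a**r + b**r) - (a + b) ** r


# Illustrative sample points (assumption): t, a, b in [0, 10], a few exponents r in [1, 2].
grid = [0.05 * i for i in range(201)]
exponents = [1.0, 1.25, 1.5, 1.75, 2.0]

min_434 = min(gap_434(t, r) for t, r in itertools.product(grid, exponents))
min_435 = min(
    gap_435(a, b, r)
    for a, b, r in itertools.product(grid[::4], grid[::4], exponents)
)

print("smallest gap for (4.34)-type inequality:", min_434)
print("smallest gap for (4.35)-type inequality:", min_435)
```

The second gap vanishes exactly when $a = b$, so values of the order of rounding error are expected there.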

The technical core of the proof of Theorem 4.2 is contained in the following

Lemma 4.3. Let $r = 1 + \frac{\beta}{2}$ $(1 < r \le 2)$. Then there exists $u > 0$ such that:
$$T(u,r)\,\chi V_n (H_0 - E)^{-1} \chi\, T(-u,r) \in B(L^2, L^2) \tag{4.36}$$
and has a norm which goes to zero when $n$ goes to infinity.

Proof. We intend to use Young inequalities again, therefore we shall estimate for an arbitrary $\psi \in L^2$ the term:
$$\big|T(u,r)\chi V_n (H_0 - E)^{-1}\chi T(-u,r)\psi\big|(x) \le e^{u x_v^r}\, C_n\, \chi(x)\, \frac{|\Gamma(\alpha)|}{4\pi} \int dx'\, e^{-\frac{\delta}{2}|x|^\beta}\, e^{-\frac{B_0 |x-x'|^2}{8}}\, e^{-\frac{B_0 |x-x'|^2}{8}} \times |U(|x-x'|^2)|\, \chi(x')\, e^{-u x_v'^{\,r}}\, |\psi(x')|. \tag{4.37}$$
We want to prove the existence of $u > 0$ such that for any $x, x' \in \Omega_{n/2}$ one has
$$\frac{B_0}{8}|x - x'|^2 - u x_v^r + \frac{\delta}{2} x_v^\beta + u x_v'^{\,r} \ge 0. \tag{4.38}$$

First of all, $|x - x'|^2 \ge (x_v - x'_v)^2$, so if $x_v \le x'_v$, the inequality is trivial. Therefore, the only interesting zone is $0 \le x'_v \le x_v$. Define
$$f(y) = \frac{B_0}{8u}(x_v - y)^2 + y^r - x_v^r, \qquad 0 \le y \le x_v, \qquad \frac{n}{2} \le x_v. \tag{4.39}$$
Then
$$f'(y) = -\frac{B_0}{4u}(x_v - y) + r y^{r-1}; \qquad f''(y) = \frac{B_0}{4u} + r(r-1) y^{r-2}. \tag{4.40}$$
Because $f''(y)$ is positive on the considered domain and $f'(y)$ changes sign between the endpoints of the interval, there is only one point $z \in (0, x_v)$ at which $f$ reaches its minimum. Let $t = \frac{x_v - y}{x_v} \in [0, 1]$; then $y = (1-t)x_v$ and
$$f(t) = \frac{B_0 x_v^2}{8u} \left\{ t^2 - \frac{8u\, x_v^{r-2}}{B_0} \big[ 1 - (1-t)^r \big] \right\}. \tag{4.41}$$

The equation $f'(t_m) = 0$ can be easily rewritten as:
$$t_m = \lambda (1 - t_m)^{r-1}; \qquad \lambda = \frac{4ur}{B_0\, x_v^{2-r}}. \tag{4.42}$$
If $\frac{8u}{B_0} < 1$, an approximate solution for $t_m$ is
$$t_m = \lambda - (r-1)\lambda^2 + o(\lambda^3). \tag{4.43}$$
Introducing $t_m$ in $f(t)$ one obtains:
$$f(t_m) = -\frac{2ur^2}{B_0}\, x_v^\beta \left[ 1 + o\!\left(\frac{8u}{B_0}\right) \right]. \tag{4.44}$$
Choosing $u$ sufficiently small, one has $u f(y) + \frac{\delta}{2} x_v^\beta \ge 0$. This implies:
$$\big\|T(u,r)\chi V_n (H_0 - E)^{-1}\chi T(-u,r)\psi\big\|_2 \le \frac{|\Gamma(\alpha)|}{4\pi}\, C_n\, \big\| e^{-\frac{B_0 x^2}{8}} |U(x^2)| \big\|_1\, \|\psi\|_2, \tag{4.45}$$
and the proof of Lemma 4.3 is complete.
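The expansion of the minimum in (4.42)–(4.44) can be checked numerically. The sketch below minimises $f(t)$ from (4.41) by a ternary search (a generic method for convex functions, chosen here for illustration and not taken from the paper) and compares the result with the leading terms $\lambda - (r-1)\lambda^2$ and $-2ur^2x_v^\beta/B_0$. The parameter values $B_0 = 1$, $u = 0.01$, $r = 1.5$, $x_v = 10$ are arbitrary assumptions.

```python
def f_of_t(t, u, B0, xv, r):
    """f from (4.41), written as a function of t = (xv - y)/xv in [0, 1]."""
    return B0 * xv**2 / (8.0 * u) * t**2 - xv**r * (1.0 - (1.0 - t) ** r)


def argmin_convex(func, lo, hi, iters=200):
    """Ternary search for the minimiser of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Illustrative parameters (assumptions, not values from the paper).
B0, u, r, xv = 1.0, 0.01, 1.5, 10.0
beta = 2.0 * r - 2.0                      # from r = 1 + beta/2

lam = 4.0 * u * r / (B0 * xv ** (2.0 - r))           # lambda in (4.42)
t_m = argmin_convex(lambda t: f_of_t(t, u, B0, xv, r), 0.0, 1.0)
t_m_approx = lam - (r - 1.0) * lam**2                # leading terms of (4.43)

f_min = f_of_t(t_m, u, B0, xv, r)
f_min_leading = -2.0 * u * r**2 * xv**beta / B0      # leading term of (4.44)

print("t_m (numerical)  :", t_m)
print("t_m (4.43)       :", t_m_approx)
print("f(t_m)/leading   :", f_min / f_min_leading)
```

For these values $\lambda \approx 0.019$, so the ratio in the last line is expected to be close to 1, with a correction of order $\lambda$.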

Proof of Theorem 4.2. With the results of Lemma 4.2 and Lemma 4.3, (4.32) becomes:
$$\Big\|T\Big(\frac{u}{2^{r-1}}, r\Big) g_{2n}\psi\Big\|_\infty \le \Big\|T\Big(\frac{u}{2^{r-1}}, r\Big)\chi (H_0 - E)^{-1}\chi\, T(-u,r)\Big\|_{B(L^2,L^\infty)} \times \sum_{k\ge 0} \big\|T(u,r)\chi V_n (H_0 - E)^{-1}\chi\, T(-u,r)\big\|^k_{B(L^2,L^2)} \times \|T(u,r)\Phi_{2n}\|_2. \tag{4.46}$$
Choose now $N_0$ large enough, such that the above series converges. Then:
$$|\psi(x)| \le \mathrm{const}\, \exp\Big(-\frac{u}{2^{r-1}}\, x_v^r\Big), \qquad x_v \ge 2N_0, \tag{4.47}$$
where the constant is independent of $\hat v$.

Acknowledgement. We thank the referee for valuable suggestions and for pointing out that (2.18) was known.

Communicated by B. Simon

Commun. Math. Phys. 192, 687 – 706 (1998)

Communications in Mathematical Physics, © Springer-Verlag 1998

Finite Dimensional Unitary Representations of Quantum Anti–de Sitter Groups at Roots of Unity

Harold Steinacker*

Theoretical Physics Group, Ernest Orlando Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720, USA and Department of Physics, University of California, Berkeley, California 94720, USA. E-mail: [email protected]

Received: 27 November 1996 / Accepted: 28 July 1997

* Present address: Sektion Physik, Ludwig-Maximilians Universität München, Lehrstuhl Prof Wess, Theresienstr. 37, D-80333 München, Germany. E-mail: [email protected]

Abstract: We study irreducible unitary representations of $U_q(SO(2,1))$ and $U_q(SO(2,3))$ for $q$ a root of unity, which are finite dimensional. Among others, unitary representations corresponding to all classical one-particle representations with integral weights are found for $q = e^{i\pi/M}$, with $M$ being large enough. In the "massless" case with spin bigger than or equal to 1 in 4 dimensions, they are unitarizable only after factoring out a subspace of "pure gauges" as classically. A truncated associative tensor product describing unitary many-particle representations is defined for $q = e^{i\pi/M}$.

1. Introduction

In recent years, the development of Noncommutative Geometry has sparked much interest in formulating physics and in particular quantum field theory on quantized, i.e. noncommutative spacetime. The idea is that if there are no more "points" in spacetime, such a theory should be well behaved in the UV. Quantum groups [8, 13, 7], although discovered in a different context, can be understood as generalized "symmetries" of certain quantum spaces. Thinking of elementary particles as irreducible unitary representations of the Poincaré group, it is natural to try to formulate a quantum field theory based on some quantum Poincaré group, i.e. on some quantized spacetime. There have been many attempts (e.g. [25, 20]) in this direction. One of the difficulties with many versions of a quantum Poincaré group comes from the fact that the classical Poincaré group is not semisimple. This forbids using the well developed theory of (semi)simple quantum groups, which is e.g. reviewed in [2, 23, 11]. In this paper, we consider instead the quantum Anti–de Sitter group $U_q(SO(2,3))$, resp. $U_q(SO(2,1))$ in 2 dimensions, thus taking advantage of much well-known mathematical machinery. In the classical case, these groups (as opposed to e.g. the de Sitter group $SO(4,1)$) are known to have positive-energy representations for any spin [10], and e.g. allow supersymmetric extensions [29]. Furthermore, one could argue that the usual choice of flat spacetime is a singular choice, perhaps subject to some mathematical artefacts.

With this motivation, we study unitary representations of $U_q(SO(2,3))$. Classically, all unitary representations are infinite-dimensional since the group is noncompact. It is well known that at roots of unity, the irreducible representations (irreps) of quantum groups are finite dimensional. In this paper, we determine if they are unitarizable, and show in particular that for $q = e^{i\pi/M}$, all the irreps with positive energy and integral weights are unitarizable, as long as the rest energy $E_0 \ge s + 1$ where $s$ is the spin, and $E_0$ is below some ($q$-dependent, large) limit. There is an intrinsic high-energy cutoff, and only finitely many such "physical" representations exist for given $q$. At low energies and for $q$ close enough to 1, the structure is the same as in the classical case. Furthermore, unitary representations exist only at roots of unity (if $q$ is a phase). For generic roots of unity, their weights are non-integral. Analogous results are found for $U_q(SO(2,1))$. In general, there is a cell-like structure of unitary representations in weight space. In the "massless" case, the naive representations with spin bigger than or equal to 1 are reducible and contain a null-subspace corresponding to "pure gauge" states. It is shown that they can be consistently factored out to obtain unitary representations with only the physical degrees of freedom ("helicities"), as in the classical case [10].
The existence of finite-dimensional unitary representations of noncompact quantum groups at roots of unity has already been pointed out in [6], where several representations of $U_q(SU(2,2))$ and $U_q(SO(2,3))$ (with multiplicity of weights equal to one) are shown to be unitarizable. In the latter case they correspond to the Dirac singletons [5], which are recovered here as well. We also show that the class of "physical" (unitarizable) representations is closed under a new kind of associative truncated tensor product for $q = e^{i\pi/M}$, i.e. there exists a natural way to obtain many-particle representations. Besides being very encouraging from the point of view of quantum field theory, this shows again the markedly different properties of quantum groups at roots of unity from the case of generic $q$ and $q = 1$. The results are clearly not restricted to the groups considered here and should be of interest on purely mathematical grounds as well. We develop a method to investigate the structure of representations of quantum groups at roots of unity and determine the structure of a large class of representations of $U_q(SO(2,3))$. Throughout this paper, $U_q(SO(2,3))$ will be equipped with a non-standard Hopf algebra star structure.

The idea to find a quantum Poincaré group from $U_q(SO(2,3))$ is not new: Already in [20], the so-called $\kappa$-Poincaré group was constructed by a contraction of $U_q(SO(2,3))$. This contraction however essentially takes $q \to 1$ (in a nontrivial way) and destroys the properties of the representations which we emphasize, in particular the finite dimensionality. Although it is not considered here, we want to mention that there exists a (space of functions on) quantum Anti–de Sitter space on which $U_q(SO(2,1))$ resp. $U_q(SO(2,3))$ operates, with an intrinsic mass parameter $m^2 = i(q - q^{-1})/R^2$ where $R$ is the "radius" of Anti–de Sitter space (and the usual Minkowski signature for $q = 1$) [28].

This paper is organized as follows: In Sect. 2, we investigate the unitary representations of $U_q(SO(2,1))$, and define a truncated tensor product. In Sect. 3, the most important facts about quantized universal enveloping algebras of higher rank are reviewed. In Sect. 4, we consider $U_q(SO(5))$ and $U_q(SO(2,3))$, determine the structure of the relevant irreducible representations (which are finite dimensional) and investigate


which ones are unitarizable. The truncated tensor product is generalized to the case of $U_q(SO(2,3))$. Finally we conclude and look at possible further developments.

2. Unitary Representations of $U_q(SO(2,1))$

We first consider the simplest case of $U_q(SO(2,1))$, which is a real form of $U \equiv U_q(Sl(2,\mathbb{C}))$, the Hopf algebra defined by [8, 13],
$$\begin{aligned} &[H, X^\pm] = \pm 2X^\pm, \qquad [X^+, X^-] = [H], \\ &\Delta(H) = H \otimes 1 + 1 \otimes H, \qquad \Delta(X^\pm) = X^\pm \otimes q^{H/2} + q^{-H/2} \otimes X^\pm, \\ &S(X^+) = -qX^+, \qquad S(X^-) = -q^{-1}X^-, \qquad S(H) = -H, \\ &\varepsilon(X^\pm) = \varepsilon(H) = 0, \end{aligned} \tag{1}$$
where $[n] \equiv [n]_q = \frac{q^n - q^{-n}}{q - q^{-1}}$. To talk about a real form of $U_q(SL(2,\mathbb{C}))$, one has to impose a reality condition, i.e. a star structure, and there may be several possibilities. Since we want the algebra to be implemented by a unitary representation on a Hilbert space, the star operation should be an antilinear antihomomorphism of the algebra. Furthermore, we will see that to get finite dimensional unitary representations, $q$ must be a root of unity, so $|q| = 1$. Only at roots of unity the representation theory of quantum groups differs essentially from the classical case, and new features such as finite dimensional unitary representations of noncompact groups can appear. This suggests the following star structure corresponding to $U_q(SO(2,1))$:

$$H^* = H, \qquad (X^+)^* = -X^-, \tag{2}$$
which is simply
$$x^* = e^{-i\pi H/2}\, \theta(x^{c.c.})\, e^{i\pi H/2}, \tag{3}$$
where $\theta$ is the usual (linear) Cartan–Weyl involution and $x^{c.c.}$ is the complex conjugate of $x \in U$. Since $q$ is a phase, $q^{c.c.} = q^{-1}$, and
$$(\Delta(x))^* = \Delta(x^*) \tag{4}$$
provided
$$(a \otimes b)^* = b^* \otimes a^*. \tag{5}$$
Then $(S(x))^* = S(x^*)$, which is a non-standard Hopf algebra star structure. In particular, (5) is chosen as e.g. in [24], which is different from the standard definition. Nevertheless, this is perfectly consistent with a many-particle interpretation in Quantum Mechanics or Quantum Field Theory as discussed in [28], where it is shown e.g. how to define an invariant inner product on the tensor product with the "correct" classical limit.

The irreps of $U$ at roots of unity are well known (see e.g. [15], whose notations we largely follow), and we list some facts. Let
$$q = e^{2\pi i n/m} \tag{6}$$
for positive relatively prime integers $m, n$ and define $M = m$ if $m$ is odd, and $M = m/2$ if $m$ is even. Then it is consistent and appropriate in our context to set
$$(X^\pm)^M = 0 \tag{7}$$


(if one uses $q^H$ instead of $H$, then $(X^\pm)^M$ is central). All finite dimensional irreps are highest weight (h.w.) representations with dimension $d \le M$. There are two types of irreps:

• $V_{d,z} = \{e^j_h;\ j = (d-1) + \frac{m}{2n} z,\ h = j, j-2, \dots, -(d-1) + \frac{m}{2n} z\}$ with dimension $d$, for any $1 \le d \le M$ and $z \in \mathbb{Z}$, where $H e^j_h = h\, e^j_h$,
• $I^1_z$ with dimension $M$ and h.w. $(M-1) + \frac{m}{2n} z$, for $z \in \mathbb{C} \setminus \{\mathbb{Z} + \frac{2n}{m} r,\ 1 \le r \le M-1\}$.

Note that in the second type, $z \in \mathbb{Z}$ is allowed, in which case we will write $V_{M,z} \equiv I^1_z$ for convenience. We will concentrate on the $V_{d,z}$-representations from now on. Furthermore, the fusion rules at roots of unity state that $V_{d,z} \otimes V_{d',z'}$ decomposes into $\bigoplus_{d''} V_{d'',z+z'} \oplus \bigoplus_p I^p_{z+z'}$, where $I^p_z$ are the well-known reducible, but indecomposable representations of dimension $2M$, see Fig. 1 and [15].

If $q$ is not a root of unity, then the universal $R \in U \otimes U$ given by
$$R = q^{\frac12 H \otimes H} \sum_{l=0}^{\infty} q^{-\frac12 l(l+1)}\, \frac{(q - q^{-1})^l}{[l]!}\, q^{lH/2} (X^+)^l \otimes q^{-lH/2} (X^-)^l \tag{8}$$
defines the quasitriangular structure of $U$. It satisfies e.g.
$$\sigma(\Delta(u)) = R\, \Delta(u)\, R^{-1}, \qquad u \in U, \tag{9}$$
where $\sigma(a \otimes b) = b \otimes a$. We will only consider representations with dimension $\le M$; then $R$ restricted to such representations is well defined for roots of unity as well, since the sum in (8) only goes up to $(M-1)$. Furthermore
$$R^* = (R)^{-1}. \tag{10}$$
To see this, (3) is useful.

Let us consider a hermitian invariant inner product $(u, v)$ for $u, v \in V_{d,z}$. A hermitian inner product satisfies $(u, \lambda v) = \lambda (u, v) = (\lambda^{c.c.} u, v)$ for $\lambda \in \mathbb{C}$, $(u, v)^{c.c.} = (v, u)$, and it is invariant if
$$(u, x \cdot v) = (x^* \cdot u, v), \tag{11}$$
i.e. $x^*$ is the adjoint of $x$. If $(\ ,\ )$ is also positive definite, we have a unitary representation.

Proposition 2.1. The representations $V_{d,z}$ are unitarizable w.r.t. $U_q(SO(2,1))$ if and only if
$$(-1)^{z+1} \sin(2\pi n k/m)\, \sin(2\pi n (d-k)/m) > 0 \tag{12}$$
for all $k = 1, \dots, (d-1)$. For $d - 1 < \frac{m}{2n}$, this holds precisely if $z$ is odd. For $d - 1 \ge \frac{m}{2n}$, it holds for isolated values of $d$ only, i.e. if it holds for $d$, then it (generally) does not hold for $d \pm 1, d \pm 2, \dots$. The representations $V_{d,z}$ are unitarizable w.r.t. $U_q(SU(2))$ if $z$ is even and $d - 1 < \frac{m}{2n}$.

Proof. Let $e^j_h$ be a basis of $V_{d,z}$ with highest weight $j$. After a straightforward calculation, invariance implies
$$\big((X^-)^k \cdot e^j_j,\ (X^-)^k \cdot e^j_j\big) = (-1)^k [k]!\, [j][j-1] \cdots [j-k+1]\, \big(e^j_j, e^j_j\big) \tag{13}$$

for $k = 1, \dots, (d-1)$, where $[n]! = [1][2] \cdots [n]$. Therefore we can have a positive definite inner product $(e^j_h, e^j_l) = \delta_{h,l}$ if and only if $a_k \equiv (-1)^k [k]!\, [j][j-1] \cdots [j-k+1]$ is a positive number for all $k = 1, \dots, (d-1)$, in which case $e^j_{j-2k} = (a_k)^{-1/2} (X^-)^k \cdot e^j_j$. Now $a_k = -[k][j-k+1]\, a_{k-1}$, and
$$-[k][j-k+1] = -[k]\big[d - k + \tfrac{m}{2n} z\big] = -[k][d-k]\, e^{i\pi z} \tag{14}$$
$$= (-1)^{z+1} \sin(2\pi n k/m)\, \sin(2\pi n (d-k)/m)\, \frac{1}{\sin(2\pi n/m)^2}, \tag{15}$$
since $z$ is an integer. Then the assertion follows. The compact case is known [15].
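As an illustration of Proposition 2.1, the following sketch evaluates the norms $a_k$ through the recursion $a_k = -[k][j-k+1]\,a_{k-1}$ and compares the resulting positivity test with condition (12). It relies on the fact that for $q = e^{2\pi i n/m}$ the $q$-number $[x]$ equals $\sin(2\pi n x/m)/\sin(2\pi n/m)$; the function names and the tolerance are assumptions made for this example only.

```python
import math


def q_number(x, n, m):
    """[x]_q for q = exp(2*pi*i*n/m); real-valued: sin(2 pi n x/m) / sin(2 pi n/m)."""
    return math.sin(2 * math.pi * n * x / m) / math.sin(2 * math.pi * n / m)


def norms(d, z, n, m):
    """a_1, ..., a_{d-1} from a_k = -[k][j-k+1] a_{k-1}, a_0 = 1, j = (d-1) + m z/(2n)."""
    j = (d - 1) + m * z / (2 * n)
    a, out = 1.0, []
    for k in range(1, d):
        a *= -q_number(k, n, m) * q_number(j - k + 1, n, m)
        out.append(a)
    return out


def is_unitarizable(d, z, n, m, tol=1e-12):
    """Positive definiteness of the invariant inner product on V_{d,z} (all a_k > 0)."""
    return all(a > tol for a in norms(d, z, n, m))


def condition_12(d, z, n, m):
    """Condition (12), checked for k = 1, ..., d-1."""
    sign = (-1) ** (z + 1)
    return all(
        sign * math.sin(2 * math.pi * n * k / m) * math.sin(2 * math.pi * n * (d - k) / m) > 0
        for k in range(1, d)
    )


# Example: n = 1, m = 10 (so M = 5). V_{d,z} should be unitarizable exactly for odd z.
for z in (0, 1, 2):
    print(z, [is_unitarizable(d, z, 1, 10) for d in range(2, 6)])
```

For $d = 1$ the condition is vacuous, which is why the example loop starts at $d = 2$.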

In particular, all of them are finite dimensional, and clearly if $q$ is not a root of unity, none of the representations are unitarizable. We will be particularly interested in the case of (half)integer representations of type $V_{d,z}$ and $n = 1$, $m$ even, for reasons to be discussed below. Then $d - 1 < \frac{m}{2n} = M$ always holds, and the $V_{d,z}$ are unitarizable if and only if $z$ is odd. These representations are centered around $Mz$, with dimension less than or equal to $M$.

Let us compare this with the classical case. For the Anti–de Sitter group $SO(2,1)$, $H$ is nothing but the energy (cp. Sect. 3). At $q = 1$, the unitary irreps of $SO(2,1)$ are lowest weight representations with lowest weight $j > 0$, resp. highest weight representations with highest weight $j < 0$. For any given such lowest, resp. highest weight, we can now find a finite dimensional unitary representation with the same lowest, resp. highest weight, provided $M$ is large enough (we only consider the (half)integer $j$ here). These are unitary representations which for low energies look like the classical one-particle representations, but have an intrinsic high-energy cutoff if $q \neq 1$, which goes to infinity as $q \to 1$. The same will be true in the 4 dimensional case.

So far we only considered what could be called one-particle representations. To talk about many-particle representations, there should be a tensor product of 2 or more such irreps, which gives a unitary representation as well and agrees with the classical case for low energies. Since $U$ is a Hopf algebra, there is a natural notion of a tensor product of two representations, given by the coproduct $\Delta$. However, it is not unitary a priori. As mentioned above, the tensor product of two irreps of type $V_{d,z}$ is
$$V_{d,z} \otimes V_{d',z'} = \bigoplus_{d''} V_{d'',z+z'} \ \oplus \bigoplus_{p=r,r+2,\dots}^{d+d'-M} I^p_{z+z'}, \tag{16}$$
where $r = 1$ if $d + d' - M$ is odd or else $r = 2$, and $I^p_z$ is an indecomposable representation of dimension $2M$ whose structure is shown in Fig. 1. The arrows indicate the raising and lowering operators.

[Fig. 1. Indecomposable representation $I^p_z$, built from $V_{p-1,z-1}$, $W$ and $V_{p-1,z+1}$]


In the case of $U_q(SU(2))$, one usually defines a truncated tensor product $\otimes$ by omitting all indecomposable $I^p_z$ representations [24]. Then the remaining representations are unitary w.r.t. $U_q(SU(2))$; $\otimes$ is associative only from the representation theory point of view [24]. This is not the right thing to do for $U_q(SO(2,1))$. Let $n = 1$ and $m$ even, and consider e.g. $V_{M-1,1} \otimes V_{M-1,1}$. Both factors have lowest energy $H = 2$, and the tensor product of the two corresponding classical representations is the sum of representations with lowest weights $4, 6, 8, \dots$. In our case, these weights are in the $I^p_z$ representations, while the $V_{d'',z''}$ have $H \ge M \to \infty$ and are not unitarizable. So we have to keep the $I^p_z$'s and throw away the $V_{d'',z''}$'s in (16). A priori however, the $I^p_z$'s are not unitarizable, either. To get a unitary tensor product, note that as a vector space,
$$I^p_z = V_{p-1,z-1} \oplus W \oplus V_{p-1,z+1} \tag{17}$$
(for $p \neq 1$) where
$$W = V_{M-p+1,z} \oplus V_{M-p+1,z} \tag{18}$$
as vector space. Now $(X^+)^{p-1} \cdot v_l$ is a lowest weight vector where $v_l$ is the vector with lowest weight of $I^p_z$, and similarly $(X^-)^{p-1} \cdot v_h$ is a highest weight vector with $v_h$ being the vector of $I^p_z$ with highest weight (see Fig. 1). It is therefore consistent to consider the submodule of $I^p_z$ generated by $v_l$, and factor out its submodule generated by $(X^+)^{p-1} \cdot v_l$; the result is an irreducible representation equivalent to $V_{p-1,z-1}$ realized on the left summand in (17). Similarly, one could consider the submodule of $I^p_z$ generated by $v_h$, factor out its submodule generated by $(X^-)^{p-1} \cdot v_h$, and obtain an irreducible representation equivalent to $V_{p-1,z+1}$. In short, one can just "omit" $W$ in (17). The two $V$-type representations obtained this way are unitarizable provided $n = 1$ and $m$ is even, and one can either keep both (notice the similarity with band structures in solid-state physics), or for simplicity keep the low-energy part only, in view of the physical application we have in mind. We therefore define a truncated tensor product as

Definition 2.2. For $n = 1$ and even $m$,
$$V_{d,z} \,\tilde\otimes\, V_{d',z'} := \bigoplus_{\tilde d = r, r+2, \dots}^{d+d'-M} V_{\tilde d,\, z+z'-1}. \tag{19}$$

This can be stated as follows: Notice that any representation naturally decomposes as a vector space into sums of $V_{d,z}$'s, cp. (18); the definition of $\tilde\otimes$ simply means that only the smallest value of $z$ in this decomposition is kept, which is the submodule of irreps with lowest weights less than or equal to $\frac{m}{2n}(z + z' - 1)$. (Incidentally, $z$ is the eigenvalue of $D_3$ in the classical $su(2)$ algebra generated by $\{D^\pm = \pm\frac{(X^\pm)^M}{[M]!},\ 2D_3 = [D^+, D^-]\}$, where $\frac{(X^\pm)^M}{[M]!}$ is understood by some limiting procedure.) With this in mind, it is obvious that $\tilde\otimes$ is associative: both in $(V_1 \tilde\otimes V_2) \tilde\otimes V_3$ and in $V_1 \tilde\otimes (V_2 \tilde\otimes V_3)$, the result is simply the $V$'s with minimal $z$, which is the same space, because the ordinary tensor product is associative and $\Delta$ is coassociative. This is in contrast with the "ordinary" truncated tensor product $\otimes$ [24]. Of course, one could give a similar definition for negative-energy representations. See also Definition 4.8 in the case of $U_q(SO(2,3))$.

$V_{d,z} \tilde\otimes V_{d',z'}$ is unitarizable if all the $V$'s on the rhs of (19) are unitarizable. This is certainly true if $n = 1$ and $m$ is even. In all other cases, there are no terms on the rhs of (19) if the factors on the lhs are unitarizable, since no $I^p_z$-type representations are


generated (they are too large). This is the reason why we concentrate on this case, and furthermore on $z = z' = 1$ which corresponds to low-energy representations. Then $\tilde\otimes$ defines a two-particle Hilbert space with the correct classical limit. To summarize, we have the following:

Proposition 2.3. $\tilde\otimes$ is associative, and $V_{d,1} \tilde\otimes V_{d',1}$ is unitarizable.

How an inner product is induced from the single-particle Hilbert spaces is explained in [28].

3. The Quantum Group $U_q(SO(2,3))$

In order to generalize the above results to the 4-dimensional case, one has to use the general machinery of quantum groups, which is briefly reviewed (cp. e.g. [2]): Let $q \in \mathbb{C}$ and $A_{ij} = 2\frac{(\alpha_i,\alpha_j)}{(\alpha_j,\alpha_j)}$ be the Cartan matrix of a classical simple Lie algebra $g$ of rank $r$, where $(\ ,\ )$ is the Killing form and $\{\alpha_i,\ i = 1, \dots, r\}$ are the simple roots. Then the quantized universal enveloping algebra $U_q(g)$ is the Hopf algebra generated by the elements $\{X_i^\pm, H_i;\ i = 1, \dots, r\}$ and relations [8, 13, 7]
$$\begin{aligned} &[H_i, H_j] = 0, \qquad [H_i, X_j^\pm] = \pm A_{ji} X_j^\pm, \\ &[X_i^+, X_j^-] = \delta_{i,j}\, \frac{q^{d_i H_i} - q^{-d_i H_i}}{q^{d_i} - q^{-d_i}} = \delta_{i,j}\, [H_i]_{q_i}, \\ &\sum_{k=0}^{1-A_{ji}} \begin{bmatrix} 1 - A_{ji} \\ k \end{bmatrix}_{q_i} (X_i^\pm)^k X_j^\pm (X_i^\pm)^{1-A_{ji}-k} = 0, \qquad i \neq j, \end{aligned} \tag{20}$$
where $d_i = (\alpha_i, \alpha_i)/2$, $q_i = q^{d_i}$, $[n]_{q_i} = \frac{q_i^n - q_i^{-n}}{q_i - q_i^{-1}}$ and
$$\begin{bmatrix} n \\ m \end{bmatrix}_{q_i} = \frac{[n]_{q_i}!}{[m]_{q_i}!\, [n-m]_{q_i}!}. \tag{21}$$
The comultiplication is given by
$$\Delta(H_i) = H_i \otimes 1 + 1 \otimes H_i, \qquad \Delta(X_i^\pm) = X_i^\pm \otimes q^{d_i H_i/2} + q^{-d_i H_i/2} \otimes X_i^\pm. \tag{22}$$
Antipode and counit are
$$S(H_i) = -H_i, \qquad S(X_i^+) = -q^{d_i} X_i^+, \qquad S(X_i^-) = -q^{-d_i} X_i^-, \qquad \varepsilon(H_i) = \varepsilon(X_i^\pm) = 0 \tag{23}$$
(we use the conventions of [16], which differ slightly from e.g. [2]). For $U \equiv U_q(SO(5,\mathbb{C}))$, $r = 2$ and
$$A_{ij} = \begin{pmatrix} 2 & -2 \\ -1 & 2 \end{pmatrix}, \qquad (\alpha_i, \alpha_j) = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}, \tag{24}$$
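The two matrices in (24) are related by $A_{ij} = 2(\alpha_i,\alpha_j)/(\alpha_j,\alpha_j)$, and the numbers $d_i = (\alpha_i,\alpha_i)/2$ follow from the diagonal of the Gram matrix. A small sketch with exact rational arithmetic, where the helper names are illustrative and not taken from the paper:

```python
from fractions import Fraction

# Gram matrix (alpha_i, alpha_j) of the simple roots, as given in (24).
GRAM = [[Fraction(2), Fraction(-1)],
        [Fraction(-1), Fraction(1)]]


def cartan_matrix(gram):
    """A_ij = 2 (alpha_i, alpha_j) / (alpha_j, alpha_j)."""
    size = len(gram)
    return [[2 * gram[i][j] / gram[j][j] for j in range(size)] for i in range(size)]


def half_lengths(gram):
    """d_i = (alpha_i, alpha_i) / 2."""
    return [gram[i][i] / 2 for i in range(len(gram))]


A = cartan_matrix(GRAM)
d = half_lengths(GRAM)
print("Cartan matrix:", [[int(entry) for entry in row] for row in A])
print("d_i:", [str(value) for value in d])
```

With the Gram matrix of (24) this reproduces the Cartan matrix stated there, together with $d_1 = 1$ and $d_2 = 1/2$.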


so $d_1 = 1$, $d_2 = 1/2$, to have the standard physics normalization (a rescaling of $(\ ,\ )$ can be absorbed by a redefinition of $q$). The weight diagrams of the vector and the spinor representations are given in Fig. 2 for illustration. The Weyl element is $\rho = \frac12 \sum_{\alpha > 0} \alpha = \frac32 \alpha_1 + 2\alpha_2$.

[Fig. 2. Vector and spinor representations of $SO(2,3)$; axes $H_3$ and $H_2$]

The possible reality structures on $U$ have been investigated in [19]. As in Sect. 2, in order to obtain finite dimensional unitary representations, $q$ must be a root of unity. Furthermore, on physical grounds we insist upon having positive-energy representations; already in the classical case, that rules out e.g. $SO(4,1)$, cp. the discussion in [10]. It appears that then there is only one possibility, namely
$$(H_i)^* = H_i, \qquad (X_1^+)^* = -X_1^-, \qquad (X_2^+)^* = X_2^-, \tag{25}$$
$$(a \otimes b)^* = b^* \otimes a^*, \qquad (\Delta(u))^* = \Delta(u^*), \qquad (S(u))^* = S(u^*), \tag{26}$$
for $|q| = 1$, which corresponds to the Anti–de Sitter group $U_q(SO(2,3))$. Again with $E \equiv d_1 H_1 + d_2 H_2$, $(-1)^E x^* (-1)^E = \theta(x^{c.c.})$, where $\theta$ is the usual Cartan–Weyl involution corresponding to $U_q(SO(5))$.

Although it will not be used in the present paper, this algebra has the very important property of being quasitriangular, i.e. there exists a universal $R \in U \otimes U$. It satisfies $R^* = (R)^{-1}$, which can be seen e.g. from uniqueness theorems, cp. [17, 2]. In the mathematical literature, usually a rational version of the above algebra, i.e. using $q^{d_i H_i}$ instead of $H_i$, is considered. Since we are only interested in specific representations, we prefer to work with $H_i$. We essentially work in the "unrestricted" specialization, i.e. the divided powers $(X_i^\pm)^{(k)} = \frac{(X_i^\pm)^k}{[k]_{q_i}!}$ are not included if $[k]_{q_i} = 0$, although our results will only concern representations which are small enough so that the distinction is not relevant.

Often the following generators are more useful:
$$h_i = d_i H_i, \qquad e_{\pm i} = \sqrt{[d_i]}\, X_i^\pm, \tag{27}$$
so that
$$[h_i, e_{\pm j}] = \pm(\alpha_i, \alpha_j)\, e_{\pm j}, \qquad [e_i, e_{-j}] = \delta_{i,j}\, [h_i]. \tag{28}$$
In the present case, i.e. $h_1 = H_1$, $h_2 = \frac12 H_2$, $e_{\pm 1} = X_1^\pm$ and $e_{\pm 2} = \sqrt{[\tfrac12]}\, X_2^\pm$. So far we only have the generators corresponding to the simple roots. A Cartan–Weyl basis corresponding to all roots can be obtained e.g. using the braid group action


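introduced by Lusztig [21], (see also [2, 11]) resp. the quantum Weyl group [16, 26, 18, 2]. If $\omega = \tau_{i_1} \cdots \tau_{i_N}$ is a reduced expression for the longest element of the Weyl group, where $\tau_i$ is the reflection along $\alpha_i$, then $\{\alpha_{i_1}, \tau_{i_1}\alpha_{i_2}, \dots, \tau_{i_1} \cdots \tau_{i_{N-1}} \alpha_{i_N}\}$ is an ordered set of positive roots. We will use $\omega = \tau_1 \tau_2 \tau_1 \tau_2$ and denote them $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2$, $\beta_3 = \alpha_1 + \alpha_2$, $\beta_4 = \alpha_1 + 2\alpha_2$. A Cartan–Weyl basis of root vectors of $U$ can then be defined as $\{e_{\pm 1}, e_{\pm 3}, e_{\pm 4}, e_{\pm 2}\} = \{e_{\pm 1}, T_1 e_{\pm 2}, T_1 T_2 e_{\pm 1}, T_1 T_2 T_1 e_{\pm 2}\}$ and similarly for the $h_i$'s, where the $T_i$ represent the braid group on $U$ [21]:
$$\begin{aligned} &T_i(H_j) = H_j - A_{ij} H_i, \qquad T_i X_i^+ = -X_i^- q_i^{H_i}, \\ &T_i(X_j^+) = \sum_{r=0}^{-A_{ji}} (-1)^{r - A_{ji}}\, q_i^{-r}\, (X_i^+)^{(-A_{ji}-r)} X_j^+ (X_i^+)^{(r)}, \end{aligned} \tag{29}$$
where $T_i(\theta(x^{c.c.})) = \theta(T_i(x))^{c.c.}$. We find
$$\begin{aligned} &e_3 = q^{-1} e_2 e_1 - e_1 e_2, \qquad e_{-3} = q\, e_{-1} e_{-2} - e_{-2} e_{-1}, \qquad h_3 = h_1 + h_2, \\ &e_4 = e_2 e_3 - e_3 e_2, \qquad e_{-4} = e_{-3} e_{-2} - e_{-2} e_{-3}, \qquad h_4 = h_1 + 2h_2. \end{aligned} \tag{30}$$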
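The ordered set of positive roots attached to the reduced word $\omega = \tau_1\tau_2\tau_1\tau_2$ can be generated mechanically from the Gram matrix in (24). The sketch below represents a root by its coordinates in the basis $(\alpha_1, \alpha_2)$ and applies the reflections $\tau_i(\beta) = \beta - 2\frac{(\beta,\alpha_i)}{(\alpha_i,\alpha_i)}\alpha_i$; indices are zero-based in the code and all function names are illustrative assumptions.

```python
from fractions import Fraction

# Gram matrix (alpha_i, alpha_j) from (24); roots are coordinate pairs in the basis (alpha_1, alpha_2).
GRAM = ((Fraction(2), Fraction(-1)),
        (Fraction(-1), Fraction(1)))


def inner(a, b):
    """Bilinear form (a, b) in simple-root coordinates."""
    return sum(a[i] * GRAM[i][j] * b[j] for i in range(2) for j in range(2))


def simple_root(i):
    return tuple(Fraction(int(k == i)) for k in range(2))


def reflect(i, beta):
    """tau_i(beta) = beta - 2 (beta, alpha_i)/(alpha_i, alpha_i) alpha_i."""
    alpha = simple_root(i)
    c = 2 * inner(beta, alpha) / inner(alpha, alpha)
    return tuple(beta[k] - c * alpha[k] for k in range(2))


def ordered_positive_roots(word):
    """For the word (i_1, ..., i_N) return alpha_{i_1}, tau_{i_1} alpha_{i_2}, ..."""
    roots = []
    for pos, i in enumerate(word):
        beta = simple_root(i)
        for j in reversed(word[:pos]):   # innermost reflection acts first
            beta = reflect(j, beta)
        roots.append(beta)
    return roots


roots = ordered_positive_roots([0, 1, 0, 1])          # omega = tau_1 tau_2 tau_1 tau_2
rho = tuple(sum(beta[k] for beta in roots) / 2 for k in range(2))
print([tuple(int(c) for c in beta) for beta in roots])
print("rho =", [str(c) for c in rho])
```

The resulting order $\alpha_1,\ \alpha_1+\alpha_2,\ \alpha_1+2\alpha_2,\ \alpha_2$ matches the order of the root vectors $\{e_{\pm1}, e_{\pm3}, e_{\pm4}, e_{\pm2}\}$, and half the sum of the positive roots gives $\rho = \frac32\alpha_1 + 2\alpha_2$.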

Similarly one defines the root vectors $X^\pm_{\beta_l}$. This can be used to obtain a Poincaré–Birkhoff–Witt basis of $U = U^- U^0 U^+$ where $U^\pm$ is generated by the $X_i^\pm$ and $U^0$ by the $H_i$: for $k := (k_1, \dots, k_N)$ where $N$ is the number of positive roots, let $X_k^+ = (X^+_{\beta_1})^{k_1} \cdots (X^+_{\beta_N})^{k_N}$. Then the $X_k^\pm$ form a P.B.W. basis of $U^+$, and similarly for $U^-$ [22] (assuming $q^4 \neq 1$).

Up to a trivial automorphism, (30) agrees with the basis used in [20]. The identification of the usual generators of the Poincaré group has also been given there and will not be repeated here, except for pointing out that $h_3$ is the energy and $h_2$ is a component of angular momentum, see also [10]. All of the above form $U_{\tilde q}(SL(2,\mathbb{C}))$ subalgebras with appropriate $\tilde q$ (but not as coalgebras), because the $T_i$'s are algebra homomorphisms. The reality structure is
$$e_1^* = -e_{-1}, \qquad e_2^* = e_{-2}, \qquad e_3^* = -e_{-3}, \qquad e_4^* = -e_{-4}. \tag{31}$$
So the set $\{e_{\pm 2}, h_2\}$ generates a $U_{\tilde q}(SU(2))$ algebra, and the other three $\{e_{\pm\alpha}, h_\alpha\}$ generate noncompact $U_{\tilde q}(SO(2,1))$ algebras, as discussed in Sect. 2.

4. Unitary Representations of $U_q(SO(2,3))$ and $U_q(SO(5))$

In this section, we consider representations of $U_q(SO(2,3))$ and show that for suitable roots of unity $q$, the irreducible positive, resp. negative, energy representations are again unitarizable, if the highest, resp. lowest, weight lies in some "bands" in weight space. Their structure for low energies is exactly as in the classical case including the appearance of "pure gauge" subspaces for spin bigger than or equal to 1 in the "massless" case, which have to be factored out to obtain the physical, unitary representations. At high energies, there is an intrinsic cutoff.

From now on $q = e^{2\pi i n/m}$. Most facts about representations of quantum groups we will use can be found e.g. in [4]. It is useful to consider the Verma modules $M(\lambda)$ for a highest weight $\lambda$, which is the (unique) $U$-module having a highest weight vector $w_\lambda$ such that


$$U^+ w_\lambda = 0, \qquad H_i w_\lambda = \frac{(\lambda, \alpha_i)}{d_i}\, w_\lambda, \tag{32}$$

and the vectors $X_k^- w_\lambda$ form a P.B.W. basis of $M(\lambda)$. On a Verma module, one can define a unique invariant inner product $(\ ,\ )$, which is hermitian and satisfies $(w_\lambda, w_\lambda) = 1$ and $(u, x \cdot v) = (\theta(x^{c.c.}) \cdot u, v)$ for $x \in U$, as in Sect. 2 [4]. $\theta$ is again the (linear) Cartan–Weyl involution corresponding to $U_q(SO(5))$. The irreducible highest weight representations can be obtained from the corresponding Verma module by factoring out all submodules in the Verma module. All submodules are null spaces w.r.t. the above inner product, i.e. they are orthogonal to any state in $M(\lambda)$. Therefore one can consistently factor them out, and obtain a hermitian inner product on the quotient space $L(\lambda)$, which is the unique irrep with highest weight $\lambda$. To see that they are null, let $w_\mu \in M(\lambda)$ be in some submodule, so $w_\lambda \notin U^+ w_\mu$. Now for $v \in U^- w_\lambda$, it follows $(w_\mu, v) \in (U^+ w_\mu, w_\lambda) = 0$.

The following discussion until the paragraph before Definition 4.4 is technical and may be skipped upon first reading. Let $Q = \sum \mathbb{Z}\alpha_i$ be the root lattice and $Q^+ = \sum \mathbb{Z}_+ \alpha_i$, where $\mathbb{Z}_+ = \{0, 1, 2, \dots\}$. We will write

$$\lambda \succ \mu \quad \text{if} \quad \lambda - \mu \in Q^+. \tag{33}$$

For $\eta \in Q$, denote (see [4])

$$\mathrm{Par}(\eta) := \Big\{k \in \mathbb{Z}_+^N;\ \sum k_i \beta_i = \eta\Big\}. \tag{34}$$

Let $M(\lambda)_\eta$ be the weight space with weight $\lambda - \eta$ in $M(\lambda)$. Then its dimension is given by $|\mathrm{Par}(\eta)|$. If $M(\lambda)$ contains a highest weight vector with weight $\sigma$, then the multiplicity of the weight space $(M(\lambda)/M(\sigma))_\eta$ is given by $|\mathrm{Par}(\eta)| - |\mathrm{Par}(\eta + \sigma - \lambda)|$, and so on. We will see how this allows one to determine the structure, i.e. the characters of the irreducible highest weight representations. As usual, the character of a representation $V(\lambda)$ with maximal weight $\lambda$ is the function on weight space defined by

$$\mathrm{ch}(V(\lambda)) = e^{\lambda} \sum_{\eta \in Q^+} \dim V(\lambda)_\eta\, e^{-\eta}, \tag{35}$$

where $e^{\lambda-\eta}(\mu) := e^{(\lambda-\eta,\mu)}$, and $V(\lambda)_\eta$ is the weight space of $V(\lambda)$ at weight $\lambda - \eta$. The characters of inequivalent highest weight irreps (which are finite dimensional at roots of unity) are linearly independent. Furthermore, the characters of Verma modules are the same as in the classical case [12, 4],

$$\mathrm{ch}(M(\lambda)) = e^{\lambda} \sum_{\eta \in Q^+} |\mathrm{Par}(\eta)|\, e^{-\eta}. \tag{36}$$
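The counting function $|\mathrm{Par}(\eta)|$ of (34), which fixes the weight multiplicities in (36), is easy to tabulate. The following is only a minimal sketch: it assumes that the four positive roots are $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2$, $\beta_3 = \alpha_1 + \alpha_2$, $\beta_4 = \alpha_1 + 2\alpha_2$ (as suggested by $h_3 = h_1 + h_2$ and $h_4 = h_1 + 2h_2$ in (30)), and the function names are illustrative, not taken from the paper.

```python
from functools import lru_cache

# Assumed positive roots in simple-root coordinates (coefficient of alpha_1,
# coefficient of alpha_2), read off from h3 = h1 + h2 and h4 = h1 + 2*h2.
POSITIVE_ROOTS = ((1, 0), (0, 1), (1, 1), (1, 2))


def par_count(eta, roots=POSITIVE_ROOTS):
    """Return |Par(eta)|: the number of k in Z_+^N with sum_i k_i*beta_i == eta."""

    @lru_cache(maxsize=None)
    def count(a, b, idx):
        if a < 0 or b < 0:
            return 0
        if idx == len(roots):
            return 1 if (a, b) == (0, 0) else 0
        r1, r2 = roots[idx]
        total = 0
        k = 0
        # Use the root number idx exactly k times, then recurse on the rest.
        while a - k * r1 >= 0 and b - k * r2 >= 0:
            total += count(a - k * r1, b - k * r2, idx + 1)
            k += 1
        return total

    return count(eta[0], eta[1], 0)


def quotient_multiplicity(eta, delta):
    """Multiplicity |Par(eta)| - |Par(eta - delta)| of the weight space at
    lambda - eta in M(lambda)/M(sigma), where sigma = lambda - delta."""
    shifted = (eta[0] - delta[0], eta[1] - delta[1])
    return par_count(eta) - par_count(shifted)
```

For instance, `par_count((1, 1))` counts the two ways of writing $\alpha_1 + \alpha_2$ (as $\beta_1 + \beta_2$ or as $\beta_3$), so the weight space of $M(\lambda)$ at $\lambda - \alpha_1 - \alpha_2$ is two-dimensional under the stated assumption.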

In general, the structure of Verma modules is quite complicated, and the proper technical tool to describe it is its composition series. For a $U$-module $M$ with a maximal weight, consider a sequence of submodules $\dots \subset W_2 \subset W_1 \subset W_0 = M$ such that $W_k/W_{k+1}$ is irreducible, and thus $W_k/W_{k+1} \cong L(\mu_k)$ for some $\mu_k$. (If the series is finite, it is sometimes called a Jordan–Hölder series. For roots of unity it is infinite, but this is not a problem for our arguments. $W_{k+1}$ can be constructed inductively by fixing a maximal submodule of $W_k$.) While the submodules $W_k$ may not be unique, it is obvious that we always have $\mathrm{ch}(M) = \sum \mathrm{ch}(W_k/W_{k+1}) = \sum \mathrm{ch}(L(\mu_k))$. Since the


characters of irreps are linearly independent, this decomposition of $\mathrm{ch}(M)$ is unique, and so are the subquotients $L(\mu_k)$. We will study the composition series of the Verma module $M(\lambda)$, in order to determine the structure of the corresponding irreducible highest weight representation. Our main tool to achieve this is a remarkable formula by De Concini and Kac for $\det(M(\lambda)_\eta)$, the determinant of the inner product matrix of $M(\lambda)_\eta$. Before stating it, we point out its use for determining irreps:

Lemma 4.1. Let $w_\lambda$ be the highest weight vector in an irreducible highest weight representation $L(\lambda)$ with invariant inner product. If $(w_\lambda, w_\lambda) \neq 0$, then $(\ ,\ )$ is non-degenerate, i.e.

$$\det(L(\lambda)_\eta) \neq 0 \tag{37}$$

for every weight space with weight $\lambda - \eta$ in $L(\lambda)$.

Proof. Assume to the contrary that there is a vector $v_\mu$ which is orthogonal to all vectors of the same weight, and therefore to all vectors of any weight. Because $L(\lambda)$ is irreducible, there exists a $u \in U$ such that $w_\lambda = u \cdot v_\mu$. But then $(w_\lambda, w_\lambda) = (w_\lambda, u \cdot v_\mu) = (u^\dagger \cdot w_\lambda, v_\mu) = 0$, which is a contradiction.

Now we state the result of De Concini and Kac [4]:

$$\det(M(\lambda)_\eta) = \prod_{\beta \in R^+} \prod_{m_\beta \in \mathbb{N}} \left( [m_\beta]_{d_\beta}\, \frac{q^{(\lambda+\rho-m_\beta\beta/2,\,\beta)} - q^{-(\lambda+\rho-m_\beta\beta/2,\,\beta)}}{q^{d_\beta} - q^{-d_\beta}} \right)^{|\mathrm{Par}(\eta - m_\beta\beta)|} \tag{38}$$

in a P.B.W. basis for arbitrary highest weight $\lambda$, where $R^+$ denotes the positive roots (cp. Sect. 3), and $d_\beta = (\beta,\beta)/2$. To get some insight, notice first of all that due to $|\mathrm{Par}(\eta - m_\beta\beta)|$ in the exponent, the product is finite. Now for some positive root $\beta$, let $k_\beta$ be the smallest integer such that

$$D(\lambda)_{k_\beta,\beta} := [k_\beta]_{d_\beta}\, \frac{q^{(\lambda+\rho-k_\beta\beta/2,\,\beta)} - q^{-(\lambda+\rho-k_\beta\beta/2,\,\beta)}}{q^{d_\beta} - q^{-d_\beta}} = 0,$$

and consider the weight space at weight $\lambda - k_\beta\beta$, i.e. $\eta_\beta = k_\beta\beta$. Then $|\mathrm{Par}(\eta_\beta - k_\beta\beta)| = 1$ and $\det(M(\lambda)_{\eta_\beta})$ is zero, so there is a highest weight vector $w_\beta$ with weight $\lambda - \eta_\beta$ (assuming for now that there is no other null state with weight larger than $\lambda - \eta_\beta$). It generates a submodule which is again a Verma module (because $U$ does not have zero divisors [4]), with dimension $|\mathrm{Par}(\eta - k_\beta\beta)|$ at weight $\lambda - \eta$. This is the origin of the exponent. However the submodules generated by the $w_{\beta_i}$ are in general not independent, i.e. they may contain common highest weight vectors, and other highest weight vectors besides these $w_{\beta_i}$ might exist. Nevertheless, all the highest weights $\mu_k$ in the composition series of $M(\lambda)$ are precisely obtained in this way. This "strong linkage principle" will be proven below, adapting the arguments in [12] for the classical case. While it is not a new insight for the quantum case either [6, 1], it seems that no explicit proof has been given at least in the case of even roots of unity, which is most interesting from our point of view, as we will see.

To make the structure more transparent, let $\mathbb{N}_\beta^T$ be the set of positive integers $k$ with $[k]_\beta = 0$, and $\mathbb{N}_\beta^R$ the positive integers $k$ such that $(\lambda + \rho - \frac{k}{2}\beta, \beta) \in \frac{m}{2n}\mathbb{Z}$. Then

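$$D(\lambda)_{k,\beta} = 0 \iff k \in \mathbb{N}_\beta^T \quad \text{or} \quad k \in \mathbb{N}_\beta^R. \tag{39}$$

The second condition is $k = 2\frac{(\lambda+\rho,\beta)}{(\beta,\beta)} + \frac{m}{2n}\frac{2}{(\beta,\beta)}\mathbb{Z}$, which means that

$$\lambda - k\beta = \sigma_{\beta,l}(\lambda), \tag{40}$$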


where $\sigma_{\beta,l}(\lambda)$ is the reflection of $\lambda$ by a plane perpendicular to $\beta$ through $-\rho + \frac{m}{4nd_\beta} l\beta$, for some integer $l$. For general $l$, $\sigma_{\beta,l}(\lambda) \notin \lambda + Q$; but $k$ should be an integer, so it is natural to define the affine Weyl group $W_\lambda$ of reflections in weight space to be generated by those reflections $\sigma_{\beta_i,l_i}$ in weight space which map $\lambda$ into $\lambda + Q$. For $q = e^{2\pi i n/m}$, two such allowed reflection planes perpendicular to $\beta_i$ will differ by multiples of $\frac{1}{2} M_{(i)}\beta_i$; here $M_{(i)} = m$ for $d_i = \frac{1}{2}$, while for $d_i = 1$, $M_{(i)} = m$ or $m/2$ if $m$ is odd or even, respectively. Thus $W_\lambda$ is generated by all reflections by these planes. Alternatively, it is generated by the usual Weyl group with reflection center $-\rho$, and translations by $M_{(i)}\beta_i$. Now the strong linkage principle states the following:

Proposition 4.2. $L(\mu)$ is a composition factor of the Verma module $M(\lambda)$ if and only if $\mu$ is strongly linked to $\lambda$, i.e. if there is a descendant sequence of weights related by the affine Weyl group as

$$\lambda \succ \lambda_i = \sigma_{\beta_i,l_i}(\lambda) \succ \dots \succ \lambda_{kj\dots i} = \sigma_{\beta_k,l_k}(\lambda_{j\dots i}) = \mu. \tag{41}$$
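The periods $M_{(i)}$ quoted before Proposition 4.2 can be checked by finding the smallest positive $k$ for which the $q$-number $[k]_{d}$ vanishes at $q = e^{2\pi i n/m}$. The sketch below assumes the common convention $[k]_d = (q^{kd} - q^{-kd})/(q^{d} - q^{-d})$, which is not spelled out in this section, and assumes $n$ and $m$ coprime; the function names are hypothetical. Exact rational arithmetic is used so that no floating-point tolerance is needed.

```python
from fractions import Fraction
from math import gcd


def q_number_vanishes(k, d, n, m):
    """True if [k]_d = 0 at q = exp(2*pi*i*n/m).

    Assumed convention: [k]_d = (q^{kd} - q^{-kd}) / (q^d - q^{-d}),
    i.e. sin(2*pi*n*k*d/m) / sin(2*pi*n*d/m).
    """
    # sin(pi * x) vanishes exactly when x is an integer.
    numerator_arg = Fraction(2 * n * k) * Fraction(d) / m
    denominator_arg = Fraction(2 * n) * Fraction(d) / m
    return numerator_arg.denominator == 1 and denominator_arg.denominator != 1


def period(d, n, m):
    """Smallest positive k with [k]_d = 0, for d in {1/2, 1}."""
    d = Fraction(d)
    if d not in (Fraction(1, 2), Fraction(1)):
        raise ValueError("d is expected to be 1/2 or 1")
    if gcd(n, m) != 1:
        raise ValueError("n and m are assumed to be coprime")
    if (Fraction(2 * n) * d / m).denominator == 1:
        raise ValueError("q^d - q^{-d} vanishes; [k]_d is undefined")
    k = 1
    # k = m always gives a zero for d in {1/2, 1}, so the loop terminates.
    while not q_number_vanishes(k, d, n, m):
        k += 1
    return k
```

Under these assumptions, `period(Fraction(1, 2), 1, 6)` gives $m$, while `period(1, 1, 6)` gives $m/2$ and `period(1, 1, 5)` gives $m$, in line with the values of $M_{(i)}$ stated in the text.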

Proof. The main tool to show this is the formula (38). Consider the inner product matrix $M_{k,k'} := (X_k^- w_\lambda, X_{k'}^- w_\lambda)$; it is hermitian, since $q$ is a phase. One can define an analytic continuation of it as follows: for the same P.B.W. basis, let $B_{k,k'}(q,\lambda) := (X_k^- w_\lambda, X_{k'}^- w_\lambda)_b$ be the matrix of the invariant bilinear form defined as in [4], which is manifestly analytic in $q$ and $\lambda$ (one considers $q$ as a formal variable and replaces $q \to q^{-1}$ in the first argument of $(\ ,\ )_b$). Then (38) holds for all $q \in \mathbb{C}$ and arbitrary complexified $\lambda$ [4]. For $|q| = 1$ and real $\lambda$, $B_{k,k'}(q,\lambda) = M_{k,k'}$. Let $\lambda' = \lambda + h\rho$ and $q' = q e^{i\pi h}$ for $h \in \mathbb{C}$; then $B_{k,k'}(q',\lambda')$ is analytic in $h$, and hermitian for $h \in \mathbb{R}$. Furthermore, one can identify the modules $M(\lambda')$ for different $h$ via the P.B.W. basis. In this sense, the action of $X_i^{\pm}$ is analytic in $h$ (it only depends on the commutation relations of the $X_\beta^{\pm}$). Now it follows (see Theorem 1.10 in [14], chapter 2 on matrices which are analytic in $h$ and normal for real $h$) that the eigenvalues $e_j$ of $B_{k,k'}(q',\lambda')$ are analytic in $h$, and there exist analytic projectors $P_{e_j}$ on the eigenspaces $V_{e_j}$ which span the entire vector space (except possibly at isolated points where some eigenvalues coincide; for $h \in \mathbb{R}$ however, the generic eigenspaces are orthogonal and therefore remain independent even at such points). These projectors provide an analytic basis of eigenvectors of $B_{k,k'}(q',\lambda')$. Now let

$$V_k := \bigoplus_{e_j \propto h^k} V_{e_j}, \tag{42}$$

i.e. the sum of the eigenspaces whose eigenvalues $e_j$ have a zero of order $k$ (precisely) at $h = 0$. Of course, $(V_k, V_{k'})_b = 0$ for $k \neq k'$. The $V_k$ span the entire space, they have an analytic basis as discussed, and have the following properties:

Lemma 4.3.
1) $(v_k, v)_b = o(h^k)$ for $v_k \in V_k$ and any (analytic) $v \in M(\lambda')$.
2) $X_i^{\pm} v_k = \sum_{l \geq k} a_l v_l + \sum_{l=1}^{k} h^l b_l v_{k-l}$ for $v_l \in V_l$ and $a_l, b_l$ analytic. In particular at $h = 0$,

$$M^k := \oplus_{n \geq k} V_n \ \text{ is invariant.} \tag{43}$$


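Proof. 1) Decomposing $v$ according to $\oplus_l V_l$, only the (analytic) component in $V_k$ contributes in $(v_k, v)_b$, with a factor $h^k$ by the definition of $V_k$ ($o(h^k)$ means at least $k$ factors of $h$).
2) Decompose $X_i^{\pm} v_k = \sum_{e_j} a_{e_j} v_{e_j}$ with analytic coefficients $a_{e_j}$ corresponding to the eigenvalue $e_j$. For any $v_{e_j}$ appearing on the rhs, consider $(v_{e_j}, X_i^{\pm} v_k)_b = a_{e_j} (v_{e_j}, v_{e_j})_b = c\, a_{e_j} e_j$ (with $c \neq 0$ at $h = 0$, since $v_{e_j}$ might not be normalized). But the lhs is $(X_i^{\mp} v_{e_j}, v_k)_b = o(h^k)$ as shown above. Therefore $a_{e_j} e_j = o(h^k)$, which implies 2).

In particular, $M(\lambda)/M^1$ is irreducible and nothing but $L(\lambda)$. (The sequence of submodules $\dots \subset M^2 \subset M^1 \subset M(\lambda)$ is similar to the Jantzen filtration [12].) By the definition of $M^k$, we have

$$\mathrm{ord}(\det(M(\lambda)_\eta)) = \sum_{k \geq 1} \dim M^k_{\lambda-\eta}, \tag{44}$$

where $M^k_{\lambda-\eta}$ is the weight space of $M^k$ at weight $\lambda - \eta$, and $\mathrm{ord}(\det(M(\lambda)_\eta))$ is the order of the zero of $\det(M(\lambda)_\eta)$ as a function of $h$, i.e. the maximal power of $h$ it contains. Now from (38), it follows that

$$\begin{aligned}
\sum_{k \geq 1} \mathrm{ch}(M^k) &= e^{\lambda} \sum_{\eta \in Q^+} \Big( \sum_{k \geq 1} \dim M^k_{\lambda-\eta} \Big) e^{-\eta} \\
&= e^{\lambda} \sum_{\eta \in Q^+} \mathrm{ord}(\det(M(\lambda)_\eta))\, e^{-\eta} \\
&= \sum_{\beta \in R^+} \Big( \sum_{n \in \mathbb{N}_\beta^T} + \sum_{n \in \mathbb{N}_\beta^R} \Big) e^{\lambda} \sum_{\eta \in Q^+} |\mathrm{Par}(\eta - n\beta)|\, e^{-\eta} \\
&= \sum_{\beta \in R^+} \Big( \sum_{n \in \mathbb{N}_\beta^T} + \sum_{n \in \mathbb{N}_\beta^R} \Big) \mathrm{ch}(M(\lambda - n\beta)),
\end{aligned} \tag{45}$$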

where we used (36). Now we can show (4.2) inductively. Both the left and the right side of (45) can be decomposed into a sum of characters of highest weight irreps, according to their composition series. These characters are linearly independent. Suppose that $L(\lambda - \eta)$ is a composition factor of $M(\lambda)$. Then the corresponding character is contained in the lhs of (45), since $M(\lambda)/M^1$ is irreducible. Therefore it is also contained in one of the $\mathrm{ch}(M(\lambda - n\beta))$ on the rhs. Therefore $L(\lambda - \eta)$ is a composition factor of one of these $M(\lambda - n\beta)$, and by the induction assumption we obtain that $\mu \equiv \lambda - \eta$ is strongly linked to $\lambda$ as in (41). Conversely, assume that $\mu$ satisfies (41). By the induction assumption, there exists an $n \in \mathbb{N}_\beta^T \cup \mathbb{N}_\beta^R$ such that $L(\mu)$ is a subquotient of $M(\lambda - n\beta)$. Then (45) shows that $L(\mu)$ is a subquotient of $M(\lambda)$.

Obviously this applies to other quantum groups as well. In particular, we recover the well-known fact that for $q = e^{2\pi i n/m}$, all $(X_i^-)^{M_{(i)}} w_\lambda$ are highest weight vectors, and zero in an irrep.

Now we can study the irreps of $U_q(SO(5))$ and $U_q(SO(2,3))$. First, there exist remarkable nontrivial one dimensional representations $w_{\lambda_0}$ with weights $\lambda_0 = \frac{m}{2n} \sum k_i \alpha_i$ for integers $k_i$. By tensoring any representation with $w_{\lambda_0}$, one obtains another representation with the same structure, but all weights shifted by $\lambda_0$. We will see below that


by such a shift, representations which are unitarizable w.r.t. $U_q(SO(2,3))$ are in one to one correspondence with representations which are unitarizable w.r.t. $U_q(SO(5))$. It is therefore enough to consider highest weights in the following domain:

Definition 4.4. A weight $\lambda = E_0\beta_3 + s\beta_2$ is called basic if

$$0 \leq (\lambda, \beta_3) = E_0 < \frac{m}{2n}, \qquad 0 \leq (\lambda, \beta_4) = (E_0 + s) < \frac{m}{2n}. \tag{46}$$

In particular, $\lambda \succ 0$. It is said to be compact if in addition it is integral (i.e. $(\lambda, \beta_i) \in \mathbb{Z} d_i$),

$$s \geq 0 \quad \text{and} \quad (\lambda, \beta_1) \geq 0. \tag{47}$$

An irrep with compact highest weight will be called compact.

The region of basic weights is drawn in Fig. 3, together with the lattice of $w_{\lambda_0}$'s. The compact representations are centered around 0, and the (quantum) Weyl group acts on them [16], as classically (it is easy to see that the action of the quantum Weyl group, resp. braid group, on the compact representations is well defined at roots of unity as well).

[Figure: weight diagram with axes $h_2$ and $h_3$, a region labelled "basic weights", and scale marks at $\frac{m}{2n}$.]

Fig. 3. Envelope of compact representations, basic weights and the lattice of $w_{\lambda_0}$
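The conditions of Definition 4.4 can be turned into a small membership test. The sketch below is an illustration under assumptions: it takes $(\lambda, \beta_3) = E_0$ and $(\lambda, \beta_4) = E_0 + s$ from (46), and additionally assumes $(\lambda, \beta_1) = E_0 - s$ and $(\lambda, \beta_2) = s$, which follow if $\beta_3 = \beta_1 + \beta_2$, $\beta_4 = \beta_1 + 2\beta_2$, $d_1 = d_4 = 1$ and $d_2 = d_3 = \frac{1}{2}$ (values inferred from the $M_{(i)}$ quoted in the text, not stated explicitly here). The function names are hypothetical.

```python
from fractions import Fraction

# Assumed values d_i = (beta_i, beta_i)/2 for the four positive roots.
D = {1: Fraction(1), 2: Fraction(1, 2), 3: Fraction(1, 2), 4: Fraction(1)}


def pairings(E0, s):
    """Inner products (lambda, beta_i) for lambda = E0*beta_3 + s*beta_2.

    (lambda, beta_3) and (lambda, beta_4) are as in (46); the other two
    rely on the assumed root geometry described in the lead-in.
    """
    E0, s = Fraction(E0), Fraction(s)
    return {1: E0 - s, 2: s, 3: E0, 4: E0 + s}


def is_basic(E0, s, n, m):
    """Condition (46): 0 <= E0 < m/2n and 0 <= E0 + s < m/2n."""
    bound = Fraction(m, 2 * n)
    p = pairings(E0, s)
    return 0 <= p[3] < bound and 0 <= p[4] < bound


def is_compact(E0, s, n, m):
    """Basic, integral ((lambda, beta_i) in Z*d_i), s >= 0, (lambda, beta_1) >= 0."""
    p = pairings(E0, s)
    integral = all((p[i] / D[i]).denominator == 1 for i in D)
    return is_basic(E0, s, n, m) and integral and p[2] >= 0 and p[1] >= 0
```

As a usage example with $n = 1$ and $m = 8$, the weight $\lambda = \frac{7}{2}\beta_3$ passes `is_basic` but fails `is_compact` because it is not integral under these assumptions, whereas $\lambda = \frac{3}{2}\beta_3 + \frac{1}{2}\beta_2$ passes both.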

A representation with basic highest weight can be unitarizable w.r.t. $U_q(SO(5))$ (with conjugation $\theta^{c.c.}$) only if all the $U_q(SU(2))$'s are unitarizable. For compact $\lambda$, all the $U_q(SU(2))$'s are indeed unitarizable according to Sect. 2, where $M_{(2)} = M_{(3)} = m$ and $M_{(1)} = M_{(4)} = m$ or $m/2$ if $m$ is odd or even, respectively. This alone however is not enough to show that they are unitarizable w.r.t. the full group. Although it may not be expected, there exist unitary representations with non-integral basic highest weight, namely for

$$\lambda = \frac{m-1}{2}\beta_3 \quad \text{and} \quad \lambda = \Big(\frac{m}{2} - 1\Big)\beta_3 + \frac{1}{2}\beta_2, \tag{48}$$

if n = 1 and m even. It follows from Proposition 4.2 that they contain a highest weight vector at λ − 2β3 and λ − β3 respectively, and all the multiplicities in the irreps turn


out to be one. Furthermore all $U_q(SU(2))$ modules in the $\beta_1$, $\beta_4$ direction have maximal length $M_{(1)} = m/2$, which implies that they are unitarizable. The structure is that of shifted Dirac singletons which were already studied in [6], and we will return to this later. It appears that all other irreps must have integral highest weight in order to be unitarizable w.r.t. $U_q(SO(5))$. If the highest weight is not compact, some of the $U_q(SU(2))$'s will not be unitarizable. On the other hand, all irreps with compact highest weight are indeed unitarizable:

Theorem 4.5. The structure of the irreps $V(\lambda)$ with compact highest weight $\lambda$ is the same as in the classical case except if
a) $\lambda = (m/2 - 1 - s)\beta_3 + s\beta_2$ for $s \geq 1$ and $\frac{m}{2n}$ integer, where one additional highest weight state at weight $\lambda - \beta_4$ appears and no others, and
b) $\lambda = \frac{m-1}{2}\beta_3$ and $\lambda = (\frac{m}{2} - 1)\beta_3 + \frac{1}{2}\beta_2$ for $n = 1$ and $m$ odd, where one additional highest weight state at weight $\lambda - 2\beta_3$ resp. $\lambda - \beta_3$ appears and no others,
which are factored out in the irrep. They are unitarizable w.r.t. $U_q(SO(5))$ (with conjugation $\theta^{c.c.}$). The irreps with nonintegral highest weights (48) discussed above are unitarizable as well.

Proof. The statements on the structure follow easily from Proposition 4.2. To show that these irreps are unitarizable, consider the compact representation with highest weight $\lambda$ before factoring out the additional highest weight state, so that the space is the same as classically. For $q = 1$, they are known to be unitarizable, so the inner product is positive definite. Consider the eigenvalues of the (hermitian) inner product matrix $M_{k,k'}$ as $q$ goes from 1 to $e^{2\pi i n/m}$ along the unit circle. The only way an eigenvalue could become negative is that it is zero for some $q' \neq q$. This can only happen if $q'$ is a root of unity, $q' = e^{2i\pi n'/m'}$ with $n'/m' < n/m$. Then the "nonclassical" reflection planes of $W_\lambda$ are further away from the origin and are relevant only in the case $\lambda = \frac{m-1}{2}\beta_3$ for $n = 1$ and $m$ odd; but as pointed out above, no additional eigenvector appears in this case for $q' \neq q$. Thus the eigenvalues might only become zero at $q$. This happens precisely if a new highest weight vector appears, i.e. in the cases listed. Since there is no null vector in the remaining irrep, all its eigenvalues are positive by continuity.

So far, all results were stated for highest weight modules; of course the analogous statements for lowest weight modules are true as well. Now we want to find the "physical", positive-energy representations which are unitarizable w.r.t. $U_q(SO(2,3))$. They are most naturally considered as lowest weight representations, and can be obtained from the compact case by a shift, as indicated above: if $V(\lambda)$ is a compact highest weight representation, then

$$V(\lambda) \cdot \omega := V(\lambda) \otimes \omega \tag{49}$$

with $\omega \equiv w_{\lambda_0}$, $\lambda_0 = \frac{m}{2n}\beta_3$, has lowest weight $\mu = -\lambda + \lambda_0 \equiv E_0\beta_3 - s\beta_2$ (short: $\mu = (E_0, s)$). It is a positive-energy representation, i.e. the eigenvalues of $h_3$ are positive. For $\frac{m}{2n}$ integer, these representations will correspond precisely to classical positive-energy representations with the same lowest weight [10]. The states with smallest energy $h_3$ correspond to the particle at rest, so $E_0$ is the rest energy and $s$ the spin. For $h_3 \leq m/4n$, the structure is the same as classically, see Fig. 4. The irreps with nonintegral


highest weights (48) upon this shift correspond to the Dirac singleton representations "Rac" with lowest weight $\mu = (1/2, 0)$ and "Di" with $\mu = (1, 1/2)$, as discussed in [6]. If $\frac{m}{2n}$ is not integer, the weights of shifted compact representations are not integral. For $n = 1$ and $m$ odd, the irreps in b) of Theorem (4.5) now correspond to the singletons, again in agreement with [6]. We will see however that this case does not lead to an interesting tensor product.

The case $\mu = (s + 1, s)$ for $s \geq 1$ and $\frac{m}{2n}$ integer will be called "massless" for two reasons. First, $E_0$ is the smallest possible rest energy for a unitarizable representation with given $s$ (see below). The main reason however is the fact that as in the classical case [10], an additional lowest weight state with $E_0' = E_0 + 1$ and $s' = s - 1$ appears, which generates a null subspace of what should be called "pure gauge" states. This corresponds precisely to the classical phenomenon in gauge theories, which ensures that the massless photon, graviton, etc. have only their appropriate number of degrees of freedom (generally, the concept of mass in Anti–de Sitter space is not as clear as in flat space. Also notice that while "at rest" there are actually still $2s + 1$ states, the representation is nevertheless reduced by one irrep of spin $s - 1$). In the present case, all these representations are finite-dimensional! Thus we are led to the following.

Definition 4.6. An irreducible representation $V_{(\mu)}$ with lowest weight $\mu = (E_0, s) \equiv E_0\beta_3 - s\beta_2$ (resp. $\mu$ itself) is called physical if it is unitarizable w.r.t. $U_q(SO(2,3))$ (with conjugation as in (26)). It is called massless if $E_0 = s + 1$ for $s \geq 1$, $s \in \frac{1}{2}\mathbb{Z}$ and $\frac{m}{2n}$ integer. For $n = 1$, $V_{(\mu)}$ is called Di if $\mu = (1, 1/2)$ and Rac if $\mu = (1/2, 0)$.

Theorem 4.7. The irreducible representation $V_{(\mu)}$ with lowest weight $\mu$ is physical, i.e. unitarizable w.r.t. $U_q(SO(2,3))$, if and only if the (shifted) irreducible representation with lowest weight $\mu - \frac{m}{2n}\beta_3$ is unitarizable w.r.t. $U_q(SO(5))$. All $V_{(\mu)}$ where $-(\mu - \frac{m}{2n}\beta_3)$ is compact are physical, as well as the singletons Di and Rac. For $h_3 \leq \frac{m}{4n}$, $V_{(\mu)}$ is obtained by factoring out from a (lowest weight) Verma module a submodule with lowest weight $(E_0, -(s + 1))$, except for the massless case, where one additional lowest weight state with weight $(E_0 + 1, s - 1)$ appears, and for the Di and Rac, where one additional lowest weight state with weight $(E_0 + 1, s)$ and $(E_0 + 2, s)$ appears, respectively. This is the same as for $q = 1$, see Fig. 4. For the singletons, this was already shown in [6].

Proof. As mentioned before, we can write every vector in such a representation uniquely as $a \cdot \omega$, where $a$ belongs to a unitarizable irrep of $U_q(SO(5))$. Consider the inner product

$$\langle a \cdot \omega, b \cdot \omega \rangle \equiv (a, b), \tag{50}$$

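where $(a, b)$ is the hermitian inner product on the compact (shifted) representation. Then

$$\langle a \cdot \omega, e_1 (b \cdot \omega) \rangle = \langle a \cdot \omega, (e_1 \otimes q^{h_1/2} + q^{-h_1/2} \otimes e_1)\, b \otimes \omega \rangle = q^{h_1/2}|_\omega\, (a, e_1 b) = i (a, e_1 b), \tag{51}$$

using $h_1|_\omega = \frac{m}{2n}$. Similarly,

$$\langle e_{-1} (a \cdot \omega), b \cdot \omega \rangle = \langle (e_{-1} \otimes q^{h_1/2} + q^{-h_1/2} \otimes e_{-1})\, a \otimes \omega, b \otimes \omega \rangle = q^{-h_1/2}|_\omega\, (e_{-1} a, b) = -i (e_{-1} a, b) \tag{52}$$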


because $\langle\ ,\ \rangle$ is antilinear in the first argument and linear in the second. Therefore

$$\langle a \cdot \omega, e_1 (b \cdot \omega) \rangle = -\langle e_{-1} (a \cdot \omega), b \cdot \omega \rangle. \tag{53}$$

Similarly $\langle a \cdot \omega, e_2 (b \cdot \omega) \rangle = \langle e_{-2} (a \cdot \omega), b \cdot \omega \rangle$. This shows that $x^*$ is indeed the adjoint of $x$ w.r.t. $\langle\ ,\ \rangle$, which is positive definite, because $(\ ,\ )$ is positive definite by definition. Theorem (4.5) now completes the proof.

[Figure: weight diagram with axes $h_2$ and $h_3$, showing the lowest weight, the subspace of pure gauges, and a dashed line at $h_3 = \frac{m}{4n}$.]

Fig. 4. Physical representation with subspace of pure gauges (for $\frac{m}{2n}$ integer), schematically. For $h_3 \leq \frac{m}{4n}$, the structure is the same as for $q = 1$

As a consistency check, one can see again from Sect. 2 that all the $U_{\tilde q}(SO(2,1))$ resp. $U_{\tilde q}(SU(2))$ subgroups are unitarizable in these representations, but this is not enough to show the unitarizability for the full group. Note that for $n = 1$, one obtains the classical one-particle representations for given $s$, $E_0$ as $m \to \infty$. We have therefore also proved the unitarizability at $q = 1$ for (half)integer spin, which appears to be nontrivial in itself [10]. Furthermore, all representations obtained from the above by shifting $E_0$ or $s$ by a multiple of $\frac{m}{n}$ are unitarizable as well. One obtains in weight space a cell-like structure of representations which are unitarizable w.r.t. $U_q(SO(2,3))$ resp. $U_q(SO(5))$.

Finally we want to consider many-particle representations, i.e. find a tensor product such that the tensor product of unitary representations is unitarizable, as in Sect. 2. The idea is the same as there: the tensor product of 2 such representations will be a direct sum of representations, of which we only keep the appropriate physical lowest weight "submodules". To make this more precise, consider two physical irreps $V_{(\mu)}$ and $V_{(\mu')}$ as in Definition 4.6. For a basis $\{u_{\lambda'}\}$ of lowest weight vectors in $V_{(\mu)} \otimes V_{(\mu')}$ with physical $\lambda'$, consider the linear span $\oplus\, U u_{\lambda'}$ of its lowest weight submodules, and let $Q_{\mu,\mu'}$ be the quotient of this after factoring out all proper submodules of the $U u_{\lambda'}$. Let $\{u_{\lambda''}\}$ be a basis of lowest weight vectors of $Q_{\mu,\mu'}$. Then $Q_{\mu,\mu'} = \oplus V_{(\lambda'')}$, where $V_{(\lambda'')}$ are the corresponding (physical) irreducible lowest weight modules, i.e. $Q_{\mu,\mu'}$ is completely reducible. Now we define the following:

Definition 4.8. In the above situation, let $\{u_{\lambda''}\}$ be a basis of physical lowest weight states of $Q_{\mu,\mu'}$, and let $V_{(\lambda'')}$ be the corresponding physical lowest weight irreps. Then define


$$V_{(\mu)} \tilde\otimes V_{(\mu')} := \bigoplus_{\lambda''} V_{(\lambda'')}. \tag{54}$$

Notice that if $\frac{m}{2n}$ is not an integer, then the physical states have non-integral weights, and the full tensor product of two physical irreps $V_{(\mu)} \otimes V_{(\mu')}$ does not contain any physical lowest weights. Therefore $V_{(\mu)} \tilde\otimes V_{(\mu')}$ is zero in that case. Again as in Sect. 2, one might also include a second "band" of high-energy states.

Theorem 4.9. If all weights in the factors are integral, then $\tilde\otimes$ is associative, and $V_{(\mu)} \tilde\otimes V_{(\mu')}$ is unitarizable w.r.t. $U_q(SO(2,3))$.

Proof. First, notice that the $\lambda''$ are all integral and none of them gives rise to a massless representation or a singleton. Thus none of the $U u_{\lambda'}$ contain a physical lowest weight vector according to Proposition 4.2. Also, lowest weight vectors for generic $q$ cannot disappear at roots of unity. Therefore $Q_{\mu,\mu'}$ contains all the physical lowest weight vectors of the full tensor product. Furthermore, no physical lowest weight vectors are contained in products of the form (discarded vectors) $\otimes$ (any vectors). Associativity now follows from the associativity of the full tensor product and the coassociativity of the coproduct, and the structure for energies $h_3 \leq \frac{m}{4n}$ is the same as classically (observe that since there are no massless representations, classically inequivalent physical representations cannot recombine into indecomposable ones).

In particular, none of the low-energy states have been discarded. Therefore our definition is physically sensible, and the case of $q = e^{2\pi i/m}$ with $m$ even appears to be most interesting physically.

5. Conclusion

We have shown that in contrast to the classical case, there exist unitary representations of noncompact quantum groups at roots of unity. In particular, we have found finite dimensional unitary representations of $U_q(SO(2,3))$ corresponding to all classical "physical" representations, with the same structure at low energies as in the classical case. Thus they could be used to describe elementary particles with arbitrary spin. This generalizes earlier results of [6] on the singletons. Representations for many non-identical particles are found. Apart from purely mathematical interest, this is very encouraging for applications in QFT. In particular the appearance of pure gauge states should be a good guideline to construct gauge theories on quantum Anti–de Sitter space.
If this is possible, one should expect it to be finite in light of these results. However to achieve that goal, more ingredients are needed, such as implementing a symmetrization axiom (cp. [9]), a dynamical principle (which would presumably involve integration over such a quantum space, cp. [27]), and efficient methods to do calculations in such a context. These are areas of current research. Acknowledgement. It is a pleasure to thank Bruno Zumino for many useful discussions, encouragement and support. I would also like to thank Chong–Sun Chu, Pei–Ming Ho and Bogdan Morariu for useful comments, as well as Paolo Aschieri, Nicolai Reshetikhin, Michael Schlieker and Peter Schupp. In particular, I am grateful to V. K. Dobrev for pointing out the papers [6] to me. This work was supported in part by the Director, Office of Energy Research, Office of High Energy and Nuclear Physics, Division of High Energy Physics of the U.S. Department of Energy under Contract DE-AC03-76SF00098 and in part by the National Science Foundation under grant PHY-9514797.


References

1. Anderson, H.H., Polo, P., Kexin, W.: Invent. Math. 104, 1 (1991)
2. Chari, V., Pressley, A.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
3. Connes, A.: Publ. IHES 62, 257 (1986)
4. De Concini, C., Kac, V.G.: Representations of Quantum Groups at Roots of 1. In: Operator Algebras, Unitary Representations, Enveloping Algebras and Invariant Theory, A. Connes et al. (eds), Boston: Birkhaeuser, p. 471
5. Dirac, P.: J. Math. Phys. 4, 901 (1963); Flato, M., Fronsdal, C.: Lett. Math. Phys. 2, 421 (1978)
6. Dabrovski, L., Dobrev, V.K., Floreanini, R., Husain, V.: Phys. Lett 302B, 215 (1993); Dobrev, V.K., Moyan, P.J.: Phys. Lett. 315B, 292 (1993); Dobrev, V.K., Floreanini, R.: J. Phys. A: Math. Gen. 27, 4831 (1994); Dobrev, V.K.: Invited talk at the Workshop on Generalized Symmetries in Physics (Clausthal, July 1993), Proceedings, eds. H.-D. Doebner, V. K. Dobrev, A. G. Ushveridze, Singapore: World Sci., 1994, p. 90
7. Drinfeld, V.: Quantum Groups. In: Proceedings of the International Congress of Mathematicians, Berkeley, 1986, A.M. Gleason (ed.), Providence, RI: AMS, p. 798
8. Faddeev, L.D., Reshetikhin, N.Yu., Takhtajan, L.A.: Quantization of Lie Groups and Lie Algebras. Algebra Anal. 1, 178 (1989)
9. Fiore, G., Schupp, P.: Identical Particles and Quantum Symmetries. hep-th/9508047; Fiore, G.: Deforming Maps from Classical to Quantum Group Covariant Creation and Annihilation Operators. q-alg/9610005
10. Fronsdal, C.: Rev. Mod. Phys. 37 Nr. 1, 221 (1965); Fronsdal, C.: Phys. Rev. D 10, Nr. 2, 589 (1974); Fronsdal, C., Haugen, R.: Phys. Rev. D 12 Nr. 12, 3810 (1975); Fronsdal, C., Haugen, R.: Phys. Rev. D 12 Nr. 12, 3819 (1975); Fronsdal, C.: Phys. Rev. D 20, 848 (1979); Fang, J., Fronsdal, C.: Phys. Rev. D 22, 1361 (1980)
11. Jantzen, J.C.: Lectures on Quantum Groups. Graduate Studies in Mathematics Vol. 6, Providence, RI: AMS, 1996
12. Jantzen, J.C.: Moduln mit höchstem Gewicht. Lecture Notes in Mathematics, Berlin–Heidelberg–New York: Springer, vol. 750; Kac, V., Kazhdan, D.: Structure of Representations with Highest Weight of infinite-dimensional Lie Algebras. Adv. Math. 34, 97 (1979)
13. Jimbo, M.: A q-Difference Analogue of U(g) and the Yang–Baxter Equation. Lett. Math. Phys. 10, 63 (1985)
14. Kato, T.: Perturbation Theory for linear Operators. 2nd edition, Berlin–Heidelberg–New York: Springer, 1980
15. Keller, G.: Fusion Rules of $U_q(sl(2,\mathbb{C}))$, $q^m = 1$. Lett. Math. Phys. 21, 273 (1991)
16. Kirillov, A.N., Reshetikhin, N.: q-Weyl group and a Multiplicative Formula for Universal R-Matrices. Commun. Math. Phys. 134, 421 (1990)
17. Khoroshkin, S.M., Tolstoy, V.N.: Universal R-Matrix for Quantized (Super)Algebras. Commun. Math. Phys. 141, 599 (1991)
18. Levendorskii, S.Z., Soibelman, Y.S.: Some applications of the quantum Weyl group. J. Geom. Phys. 7 (2), 241 (1990)
19. Lukierski, J., Nowicki, A., Ruegg, H.: Real forms of complex quantum Anti-de-Sitter algebra $U_q(Sp(4;\mathbb{C}))$ and their contraction schemes. Phys. Lett. B 271, 321 (1991)
20. Lukierski, J., Ruegg, H., Nowicki, A., Tolstoy, V.: q-deformation of Poincare algebra. Phys. Lett. B 264, 331 (1991); Lukierski, J., Nowicki, A., Ruegg, H.: New quantum Poincare algebra and κ-deformed field theory. Phys. Lett. B 293, 344 (1992)
21. Lusztig, G.: Quantum deformations of certain simple modules ..., Adv. in Math. 70, 237 (1988); Lusztig, G.: On quantum groups. J. Algebra 131, 466 (1990)
22. Lusztig, G.: Quantum Groups at roots of 1. Geom. Ded. 35, 89 (1990)
23. Lusztig, G.: Introduction to Quantum Groups. Progress in Mathematics Vol. 110, Basel–Boston: Birkhaeuser, 1993
24. Mack, G., Schomerus, V.: Quasi Hopf quantum symmetry in quantum theory. Nucl. Phys. B 370, 185 (1992)
25. Ogievetsky, O., Schmidke, W., Wess, J., Zumino, B.: Commun. Math. Phys. 150, 495 (1992); Aschieri, P., Castellani, L.: Lett. Math. Phys. 36, 197 (1996); Chaichian, M., Demichev, A.: Phys. Lett. B 304, 220 (1993); Podles P., Woronowicz, S.: hep-th/9412059, UCB preprint PAM-632; Dobrev, V.K.: J. Phys. A: Math. Gen. 26, 1317 (1993)
26. Soibelman, Y.S.: Quantum Weyl group and some of its applications. Rend. Circ. Mat. Palermo Suppl 26 (2), 233 (1991)
27. Steinacker, H.: Integration on quantum Euclidean space and sphere. J. Math. Phys. 37 (Nr. 9), 7438 (1996)
28. Steinacker, H.: Quantum Groups, Roots of Unity and Particles on quantized Anti–de Sitter Space. Ph.D. Thesis, Berkeley, May 1997; hep-th/9705211
29. Zumino, B.: Nucl. Phys. B 127, 189 (1977); Deser, S., Zumino, B.: Phys. Rev. Lett. 38, 1433 (1977)

Communicated by A. Connes

Commun. Math. Phys. 192, 707 – 736 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Quasi Linear Flows on Tori: Regularity of their Linearization

F. Bonetto1, G. Gallavotti2, G. Gentile3, V. Mastropietro4

1 Matematica, Università di Roma, P.le Moro 2, 00185 Roma, Italy. E-mail: [email protected]
2 Fisica, Università di Roma, P.le Moro 2, 00185 Roma, Italy. E-mail: [email protected]
3 IHES, 35 Route de Chartres, 91440 Bures sur Yvette, France. E-mail: [email protected]
4 Matematica, Università di Roma II, Via Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]

Received: 5 September 1996 / Accepted: 4 August 1997

Abstract: Under suitable conditions a flow on a torus $C^{(p)}$-close, with $p$ large enough, to a quasi periodic diophantine rotation is shown to be "linearizable", i.e. conjugable to the quasi periodic rotation, by a map that is analytic in the perturbation size. This result is parallel to Moser's theorem stating conjugability in class $C^{(p')}$ for some $p' < p$. The extra conditions restrict the class of perturbations that are allowed.

1. Introduction

1.1. The perturbation of the Hamiltonian of a system of $\ell$ harmonic oscillators with frequencies $\frac{1}{2\pi}(\omega_{01}, \dots, \omega_{0\ell})$ is described by the Hamiltonian

$$H_0(A, \alpha) = \omega_0 \cdot A + \varepsilon A \cdot f(\alpha) + \varepsilon\, A \cdot F(A, \alpha) A, \tag{1.1}$$

where $(A, \alpha) \in \mathbb{R}^\ell \times \mathbb{T}^\ell$ are the "action–angle" variables of the oscillators, $\cdot$ denotes the scalar product, $f$ is a vector and $F$ a matrix that describe the perturbation structure and $\varepsilon$ is the "intensity" of the perturbation. The Hamiltonian system (1.1) is not integrable in general (see for instance (4.10) in [G3]). Nevertheless, if the unperturbed rotation vector $\omega_0$ of the oscillators verifies a "diophantine condition" and if the perturbation is analytic, it is possible to add to the Hamiltonian a suitable "counterterm" $A \cdot N_\varepsilon(A)$, divisible and analytic in the perturbative parameter $\varepsilon$ and depending only on the action variables $A$, so that the modified Hamiltonian $H_0 + A \cdot N_\varepsilon(A)$ is integrable. This was conjectured in [G1] and proven first in [E2], then also in [GM2] (with techniques of [G4,GM1]), by exhibiting the details of the cancellation mechanisms operating, order by order, in the perturbative series for the counterterm and for the solution of the equations of motion for the modified Hamiltonian; a third method is in [EV]. Note that integrability of the problem (1.1) with $F = 0$ is equivalent to the problem of "linearizability" of the flow on the torus generated by the differential equations

$$ \frac{d\alpha}{dt} = \omega_0 + \varepsilon f(\alpha) , \tag{1.2} $$

i.e. the problem of finding a change of coordinates, $\alpha = \psi + h_\varepsilon(\psi)$, on the torus $\mathbb{T}^\ell = [0,2\pi]^\ell$ such that Eqs. (1.2) become the trivial quasi periodic linear flow $d\psi/dt = \omega_0$. The equations for $h$, $H$ are derived below (see (1.6), (1.7) and (1.8)).

A previous partial result about the existence of the counterterm $N_\varepsilon(A)$ is in [DS,R,PF], where the one-dimensional Schrödinger equation with a quasi periodic potential is studied: the latter problem can be shown to be equivalent to the problem of the existence of a counterterm which makes integrable the Hamiltonian of a system of interacting oscillators, provided $F = 0$ and $f$ has a very special form (see [G2] and Sect. 5 below): restricted to this case, the proof of existence of a counterterm making integrable the Hamiltonian follows also from the analysis of [M1] (see Appendix A4 below).

But the “beginning” of the interest in the above questions goes back to a problem similar to (1.2) investigated by Moser in [M1]. In the latter paper more general nonlinear ordinary differential equations are perturbatively studied under the hypothesis that the “characteristic vector” (defining the linear part of the equations) verifies a generalized diophantine condition (see Appendix A4 for details): under the assumption that the nonlinear part $f$ is analytic it is proved that one can add to the equations counterterms depending (analytically) only on the perturbative parameter $\varepsilon$ and so that the modified equations admit a solution analytic in $\varepsilon$ for $\varepsilon$ small. The conjecture in [G1] envisages the possibility of introducing a counterterm depending (analytically) also on the actions in the Hamiltonian in such a way that the equations of motion admit an analytic solution: the two problems (the one studied by Moser and the one studied in [G2]) deal in general with different equations.

Nevertheless, if the perturbation is linear in the action variables in (1.1), i.e. $F = 0$, then the counterterm $N_\varepsilon(A)$ turns out to be $A$-independent and analytic in $\varepsilon$ if $f$ is analytic on $\mathbb{T}^\ell$: in that case the constant value of $N_\varepsilon(A)$ will be denoted $N_\varepsilon$ or $N(\varepsilon)$ and, as pointed out above, the existence and analyticity of $N_\varepsilon$ is then implied by Moser’s theorem [M1], Theorem 1. In [M1] it is also pointed out that the latter result is the “core” of the proof of the KAM theorem in the analytic case. However Moser’s theorem gives no analyticity result when the perturbation is nonanalytic.

So, in this paper we consider (1.1) with $F = 0$ or, equivalently, (1.2) and $\varepsilon f$ modified into $\varepsilon f + N_\varepsilon$, with the aim of proving existence and analyticity of the counterterm $N_\varepsilon$ if the perturbation $f$ belongs to a special class, specified below (see Sect. 1.3), of functions depending non-analytically on the angles. Convergence of the perturbative series for the solution of the equations of motion and for the counterterm is obtained by taking into account cancellations which include, besides the ones necessary to treat the analytic case as in [E2] (the “infrared cancellations”), also new cancellations called “ultraviolet cancellations” (see [BGGM]). The analyticity (of the solution of the equations of motion and of the counterterm) in the perturbative parameter $\varepsilon$, even though the interaction $f(\alpha)$ is non-analytic in $\alpha$, is a result that we believe would be difficult to obtain by other techniques which one could use to face this problem, like the Moser-Nash smoothing technique [M2]. The tools are inherited from [BGGM], where interaction potentials belonging to the same class of functions are considered and analyticity of the KAM invariant tori as functions of the perturbative parameter is proved (near zero).

1.2. The paper is organized as follows. In the remaining part of this section we introduce the notations and state the result (Theorem 1.4). In Sect. 2 we reintroduce our

diagrammatic formalism, referring for details to Appendix A1 and simply outlining the differences with respect to [BGGM], Sect. 3. In Sect. 3 the so-called infrared cancellations, [E1], are discussed following [GM2]: such cancellations allow us to solve the so-called “small divisors problem”, and they are sufficient to prove convergence of the perturbative series in the analytic case. Since there are some differences with respect to [GM2], mostly the use of Siegel-Eliasson’s lemma (see Lemma 3.4 below) instead of Siegel-Bryuno’s lemma ([GM2], Lemma 5.3), we provide a self-contained discussion although the problem is the same as the one treated in [GM2], Sect. 8. In Sect. 4, we study the “ultraviolet cancellations” which, together with the infrared ones, make the perturbative series convergent for the class of interactions introduced in Sect. 1.3. In Sect. 5, we note briefly that Theorem 1.4 for non-analytic interactions does not really give more results than the corresponding theorem for the analytic case (see [E2], and [GM2], Theorem 1.4), if applied to the one-dimensional Schrödinger equation with a non-analytic quasi periodic potential; this is a little disappointing, but not quite unexpected (see comments in Sect. 5). As a byproduct of the proof, analyticity of eigenvalues and eigenfunctions in the perturbative parameter is proved under the assumption that the interaction potential is at least in the class $C^{(p)}$, $p > 3\tau + \ell$: this result improves some aspects of [Pa], see also [PF] and Sect. 5. Note that compared to [BGGM] the discussion of the infrared cancellations appears remarkably less involved (and therefore more suitable for a first approach to the techniques employed).
On the contrary the analysis of the ultraviolet cancellations is essentially unchanged compared to the one in [BGGM], notwithstanding the simplified expression of the Hamiltonian (1.3) below: it is repeated for the sake of self-containedness and because pointing out the (slight but many) variations would take the same amount of work.

1.3. The Hamiltonian is

$$ H = H_0 + A\cdot N(\varepsilon) \tag{1.3} $$

with $H_0 = \omega_0\cdot A + \varepsilon A\cdot f(\alpha)$ given by (1.1) with $F = 0$, where

(1) $A\in\mathbb{R}^\ell$ and $\alpha\in\mathbb{T}^\ell$ are canonically conjugated variables (respectively action and angle variables), and $\cdot$ denotes the scalar product;

(2) $\omega_0$ is a rotation vector satisfying the “diophantine condition”

$$ C_0\,|\omega_0\cdot\nu| > |\nu|^{-\tau} \qquad \forall\nu\in\mathbb{Z}^\ell ,\ \nu\neq 0 , \tag{1.4} $$

for some positive constants $C_0$ and $\tau$ (here and henceforth $|\nu| = \sqrt{\nu\cdot\nu}$, while $\|\nu\| = \sum_{j=1}^{\ell}|\nu_j|$, if $\nu = (\nu_1,\ldots,\nu_\ell)$);

(3) $f$ has the form $f = (f_1,\ldots,f_\ell)$, with each $f_j$ of the class $\hat C^{(p)}(\mathbb{T}^\ell)$ introduced in [BGGM], for some $p$: namely $f_j(\alpha) = \sum_{\nu\in\mathbb{Z}^\ell} f_{j\nu}\,e^{i\nu\cdot\alpha}$, $f_\nu = f_{-\nu}$, with $f_{j0} = 0$ and, for $\nu\neq 0$,

$$ f_{j\nu} = \sum_{n\ge p+\ell}^{N} \frac{c^{(j)}_n + d^{(j)}_n\,(-1)^{\|\nu\|}}{|\nu|^{n}} , \tag{1.5} $$

for some $N \ge p+\ell$ and some constants $c^{(j)}_n$, $d^{(j)}_n$; and

(4) $N(\varepsilon)$, called a “counterterm”, has to be fixed in order to make the equations of motion for the model (1.3) linearizable.

For instance we can choose $f_{j\nu} = a_j|\nu|^{-b}$, with $b = p+\ell$ and $a = (a_1,\ldots,a_\ell)\in\mathbb{R}^\ell$. In the following we shall deal explicitly with such a function: the proof can be trivially extended to the class of functions (1.5).
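As a purely illustrative aside (not part of the original argument), the diophantine condition (1.4) and the model coefficients $f_{j\nu} = a_j|\nu|^{-b}$ can be probed numerically on a finite box of integer vectors. The sketch below is a minimal one under stated assumptions: the function names, the golden-mean rotation vector and the values of $C_0$, $\tau$ are hypothetical choices made for the example, and a finite scan can only fail to find a violation; it cannot prove (1.4).

```python
import itertools
import math


def diophantine_margin(omega, c0, tau, max_component):
    """Smallest value of C0 * |omega.nu| * |nu|^tau over nonzero integer nu in a box.

    Condition (1.4) reads C0 |omega.nu| > |nu|^(-tau), i.e. this product exceeds 1.
    Only the finite box |nu_i| <= max_component is scanned (an assumption of the sketch).
    """
    worst = math.inf
    box = range(-max_component, max_component + 1)
    for nu in itertools.product(box, repeat=len(omega)):
        if not any(nu):
            continue  # nu = 0 is excluded in (1.4)
        dot = sum(o * n for o, n in zip(omega, nu))
        euclid = math.sqrt(sum(n * n for n in nu))  # |nu| = sqrt(nu.nu)
        worst = min(worst, c0 * abs(dot) * euclid ** tau)
    return worst


def power_law_coefficient(a, nu, b):
    """Model Fourier coefficients f_{j nu} = a_j |nu|^(-b) for nu != 0."""
    euclid = math.sqrt(sum(n * n for n in nu))
    return tuple(aj * euclid ** (-b) for aj in a)


if __name__ == "__main__":
    golden = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)  # hypothetical rotation vector
    print(diophantine_margin(golden, c0=2.0, tau=1.0, max_component=30))
    print(power_law_coefficient((1.0, 0.5), (3, 4), b=2))
```

A margin larger than 1 means that no violation of (1.4) was found in the scanned box for the chosen $C_0$ and $\tau$.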


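Theorem 1.4. Given the Hamiltonian (1.1), with $\omega_0$ satisfying the diophantine condition (1.4), $F \equiv 0$ and $f = (f_1,\ldots,f_\ell)$, with each $f_j\in\hat C^{(p)}(\mathbb{T}^\ell)$, there exist two positive constants $\varepsilon_0$ and $p_0 = 2 + 3\tau$, and a function $N(\varepsilon)$ analytic in $\varepsilon$ for $|\varepsilon| < \varepsilon_0$, such that the equations of motion corresponding to the Hamiltonian (1.3) admit linearizable solutions in $C^{(0)}(\mathbb{T}^\ell)$ analytic in $\varepsilon$ (i.e. conjugable to the linear flow defined by $\dot\psi = \omega$) for $|\varepsilon| < \varepsilon_0$, provided $p > p_0$, i.e. solutions described by (1.7), (1.8) below.

1.5. The equations of motion for the Hamiltonian (1.3) are given by

$$ \frac{d\alpha_j}{dt} = \omega_{0j} + \varepsilon f_j(\alpha) + N_j(\varepsilon) , \qquad \frac{dA_j}{dt} = -\varepsilon A\cdot\partial_{\alpha_j} f(\alpha) . \tag{1.6} $$

We look for motions of the form

$$ \alpha(t) = \omega_0 t + h(\omega_0 t) , \qquad h(\psi) = \sum_{k=1}^{\infty}\sum_{\nu\in\mathbb{Z}^\ell} h^{(k)}_\nu\, e^{i\nu\cdot\psi}\,\varepsilon^k , $$

$$ A(t) = A_0 + H(\omega_0 t) , \qquad H(\psi) = \sum_{k=1}^{\infty}\sum_{\nu\in\mathbb{Z}^\ell} H^{(k)}_\nu\, e^{i\nu\cdot\psi}\,\varepsilon^k , \tag{1.7} $$

with $h$ odd and $H$ even in $\psi$ so that the equations for $h$ and $H$ become

$$ (\omega_0\cdot\partial_\psi)\,h_j(\psi) = \varepsilon f_j(\psi + h(\psi)) + N_j(\varepsilon) , \qquad (\omega_0\cdot\partial_\psi)\,H_j(\psi) = -\varepsilon\,[A_0 + H(\psi)]\cdot\partial_{\alpha_j} f(\psi + h(\psi)) , \tag{1.8} $$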

where $\partial_\alpha$ denotes derivative with respect to the argument, and $N(\varepsilon)$ has to be so chosen that the right-hand side of the first equation in (1.8) has vanishing average (see Sect. 2.1 below). A “solution” to (1.8) can be given a meaning as soon as $h$, $H$ are continuous by requiring equality of the Fourier transforms of both sides (regarded as distributions, see [BGGM]).

We see from (1.8) that the equation for $h$ is closed, so that, as long as we are interested only in the function $h$, i.e. in the analytic linearizability of (1.6), we can confine ourselves to studying only the first equation in (1.8). This is the equation that one has to solve to linearize the flow generated by $d\alpha/dt = \omega_0 + \varepsilon f(\alpha) + N(\varepsilon)$: hence it is not surprising that the equation for $H$ can be easily solved once $h$ is known: see Sect. 2.3 below.

Note that since $f$ is supposed even, we expect that $h$ is odd and $H$ is even: hence while the equation for $H$ does not seem to hit any obvious compatibility problems we see that the equation for $h$ does, unless $N(\varepsilon)$ is suitably chosen. In fact the function $\varepsilon f(\psi + h(\psi))$, being even, has no a priori reason to have a vanishing integral over $\psi$ (as it should, being equal to $(\omega_0\cdot\partial_\psi)h(\psi)$).

2. Formal Solution and Graph Representation

2.1. We study now Eqs. (1.8) with $f_{j\nu} = a_j|\nu|^{-b}$ replaced with $f_{j\nu}\,e^{-\kappa|\nu|}$. The parameter $\kappa$ is taken $\kappa > 0$, and, after computing the coefficients $h^{(k)}_\nu$ in (1.7), which will depend on $\kappa$, one will perform the limit $\kappa\to 0$ (“Abel’s summation”).


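The formal solubility of (1.8) with $f_{j\nu}$ replaced with $f_{j\nu}\,e^{-\kappa|\nu|}$ follows from [E2], where more general interaction potentials are considered (see also [GM2], Sect. 8.1, where the formalism is similar to the one used here). One has $h^{(k)}_{j0} = H^{(k)}_{j0} = 0$ $\forall k\ge 1$, while, when $\nu\neq 0$, for $k = 1$,

$$ h^{(1)}_{j\nu} = \frac{f_{j\nu}}{i\omega_0\cdot\nu} , \qquad H^{(1)}_{j\nu} = -\frac{i\nu_j\,A_0\cdot f_\nu}{i\omega_0\cdot\nu} , \tag{2.1} $$

and, for $k\ge 2$,

$$ h^{(k)}_{j\nu} = \frac{1}{i\omega_0\cdot\nu}\sum_{p>0}\frac{1}{p!}\sum_{\nu_0+\nu_1+\ldots+\nu_p=\nu} f_{j\nu_0}\sum_{k_1+\ldots+k_p=k-1}\ \prod_{s=1}^{p} i\nu_0\cdot h^{(k_s)}_{\nu_s} , $$

$$ H^{(k)}_{j\nu} = -\frac{1}{i\omega_0\cdot\nu}\sum_{p>0}\frac{1}{p!}\sum_{\tilde\nu+\nu_0+\nu_1+\ldots+\nu_p=\nu} (i\nu_{0j})\; H^{(\tilde k)}_{\tilde\nu}\cdot f_{\nu_0}\sum_{\tilde k+k_1+\ldots+k_p=k-1}\ \prod_{s=1}^{p} i\nu_0\cdot h^{(k_s)}_{\nu_s} $$

$$ \qquad\quad - \frac{1}{i\omega_0\cdot\nu}\sum_{p>0}\frac{1}{p!}\sum_{\nu_0+\nu_1+\ldots+\nu_p=\nu} (i\nu_{0j})\; A_0\cdot f_{\nu_0}\sum_{k_1+\ldots+k_p=k-1}\ \prod_{s=1}^{p} i\nu_0\cdot h^{(k_s)}_{\nu_s} , \tag{2.2} $$

provided $N_j(\varepsilon) = \sum_{k=1}^{\infty} N^{(k)}_j\varepsilon^k$, with $N^{(k)}_j$ defined by $N^{(1)}_j = -f_{j0}$ and, for $k\ge 2$,

$$ N^{(k)}_j = -\sum_{p>0}\frac{1}{p!}\sum_{\nu_0+\nu_1+\ldots+\nu_p=0} f_{j\nu_0}\sum_{k_1+\ldots+k_p=k-1}\ \prod_{s=1}^{p} i\nu_0\cdot h^{(k_s)}_{\nu_s} . \tag{2.3} $$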
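To make the lowest orders of (2.1) and (2.3) concrete, the following minimal sketch evaluates $h^{(1)}_{j\nu}$ and a truncated version of $N^{(2)}_j$ for the regularised model coefficients $a_j|\nu|^{-b}e^{-\kappa|\nu|}$. It is an illustration only: the function names, the truncation of the sum to a finite box and the numerical parameters are assumptions of the example, not part of the paper. For $k = 2$ only $p = 1$, $k_1 = 1$ contributes in (2.3), so $N^{(2)}_j = -\sum_{\nu_0+\nu_1=0} f_{j\nu_0}\,(i\nu_0\cdot h^{(1)}_{\nu_1})$.

```python
import itertools
import math


def f_coeff(a, nu, b, kappa):
    """Regularised coefficients f_{j nu} = a_j |nu|^(-b) exp(-kappa |nu|), nu != 0."""
    norm = math.sqrt(sum(n * n for n in nu))
    damp = norm ** (-b) * math.exp(-kappa * norm)
    return [aj * damp for aj in a]


def h1(a, omega, nu, b, kappa):
    """First-order coefficients h^(1)_{j nu} = f_{j nu} / (i omega.nu), see (2.1)."""
    dot = sum(o * n for o, n in zip(omega, nu))
    return [fj / (1j * dot) for fj in f_coeff(a, nu, b, kappa)]


def counterterm_order2(a, omega, b, kappa, max_component):
    """Truncated N^(2)_j = - sum_{nu0 + nu1 = 0} f_{j nu0} (i nu0 . h^(1)_{nu1}).

    The sum over nu0 is restricted to the box |nu0_i| <= max_component
    (a truncation assumed for this sketch).
    """
    ell = len(omega)
    total = [0j] * ell
    box = range(-max_component, max_component + 1)
    for nu0 in itertools.product(box, repeat=ell):
        if not any(nu0):
            continue  # f_{j0} = 0
        nu1 = tuple(-n for n in nu0)  # momentum conservation nu0 + nu1 = 0
        h = h1(a, omega, nu1, b, kappa)
        contraction = sum(1j * n * hj for n, hj in zip(nu0, h))
        f0 = f_coeff(a, nu0, b, kappa)
        for j in range(ell):
            total[j] -= f0[j] * contraction
    return total


if __name__ == "__main__":
    omega = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)  # hypothetical rotation vector
    a = (1.0, 0.5)                               # hypothetical amplitudes
    print(h1(a, omega, (1, 0), b=4, kappa=0.1))
    print(counterterm_order2(a, omega, b=4, kappa=0.1, max_component=10))
```

In this example the second-order counterterm is real and does not vanish, in line with the remark that an even $f$ gives no a priori reason for the average of $\varepsilon f(\psi + h(\psi))$ to be zero.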

Equality (2.3) assures the formal solubility of (1.8). The function $f$ is even, hence $h$ is odd and $H$ is even. If $f$ is analytic ($\kappa > 0$) the convergence of the series defining the functions $h$ and $H$ is a corollary of [E1,E2] (see also [GM2], Theorem 1.4), but the convergence radius is not uniform in $\kappa$ (it shrinks to zero when $\kappa\to 0$). The aim of the present paper is to show that, if $f$ belongs to the class of functions $\hat C^{(p)}(\mathbb{T}^\ell)$, then there are cancellation mechanisms that imply convergence of the series and, therefore, analyticity in $\varepsilon$ of the solution of the equations of motion.

2.2. We shall use a representation of (2.2) in terms of “Feynman graphs” following the rules in [BGGM], Sect. 3: the reader not familiar with [BGGM] can find in Appendix A1 below a brief but self-contained description of the graphs. See [GGM] for the motivation of the terminology. The only difference will be that the “value” of a graph $\vartheta$ is now given by

$$ \mathrm{Val}(\vartheta) = \prod_{v}\frac{(i\nu_{v'}\cdot f_{\nu_v})}{i\omega_0\cdot\nu_{\lambda_v}} , \tag{2.4} $$

where

(1) $v'$ is the node immediately following $v$ in $\vartheta$, and $\lambda_v = v'v$ is the line (or branch) emerging from $v$ and entering $v'$;

(2) $r$ is the “root” of the graph, and $i\nu_r$ denotes the unit vector in the $j$-th direction, $i\nu_r = e_j$, $j = 1,\ldots,\ell$;

(3) $\nu_v$ is the “external momentum” associated with the “node” $v$, $\nu_{\lambda_v} = \sum_{w\le v}\nu_w$ is the “momentum” flowing through the line $\lambda_v$, and $\nu(\vartheta)$ is the momentum flowing through the line entering the root (“root branch”).

The momentum $\nu_\lambda$ must be $\neq 0$ for all lines $\lambda\in\vartheta$: this has to be regarded as a restriction on the possible values of the node momenta $\{\nu_v\}$ that we allow attributing to the nodes. It can be convenient to introduce the notations

$$ g_\lambda \equiv \frac{1}{\omega_0\cdot\nu_\lambda} , \qquad D(\vartheta) = \prod_{\lambda\in\vartheta} g_\lambda , \tag{2.5} $$

so that (2.4) becomes

$$ \mathrm{Val}(\vartheta) = D(\vartheta)\prod_{v}\nu_{v'}\cdot f_{\nu_v} ; \tag{2.6} $$
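and $g_\lambda$ will be called the “propagator” of the line $\lambda$. Let us denote by $T(k,\nu)$ the set of non-equivalent labeled graphs of order $k$ with $\nu(\vartheta) = \nu$ and $i\nu_r = e_j$; then

$$ h^{(k)}_{j\nu} = \frac{1}{k!}\sum_{\vartheta\in T(k,\nu)}\mathrm{Val}(\vartheta) = \frac{1}{k!}\sum_{\vartheta^0\in T^0(k)} W(\vartheta^0,\nu) , \tag{2.7} $$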

where $\vartheta = (\vartheta^0,\{\nu_v\})$, if $\vartheta$ is a labeled graph, while $\vartheta^0$ is a graph bearing no external momentum labels, $T^0(k)$ is the set of such graphs of order $k$, and

$$ W(\vartheta^0,\nu) \equiv \sum_{\{\nu_v\}:\ \nu(\vartheta)=\nu}\mathrm{Val}(\vartheta) . \tag{2.8} $$

By comparing the expression of $h^{(k)}_{j\nu}$ in (2.2) with (2.3), one realizes that $N^{(k)}_j$ admits the same description as $h^{(k)}_{j\nu}$ in terms of graph values, with the only difference that $\nu(\vartheta) = 0$ and no propagator is associated to the root branch. Then the bound we shall find for $h^{(k)}_{j\nu}$ will hold also for $N^{(k)}_j$ (this will appear from the analysis of Sect. 3 and Sect. 4).

2.3. The function $H^{(k)}_{j\nu}$ can be expressed in terms of the same graphs as in Sect. 2.2, but the value associated with a graph $\vartheta$ is no longer given by (2.6). One defines, instead,

$$ \mathrm{Val}^*(\vartheta) = D(\vartheta)\sum_{\tilde v\in\vartheta}(-i\nu_{v_1 j})\prod_{v\notin C(v_1,\tilde v)}(\nu_{v'}\cdot f_{\nu_v})\ \cdot\prod_{v\in C(v_1,\tilde v)\setminus\tilde v}(-f_{\nu_v}\cdot\nu_{v''})\ (f_{\nu_{\tilde v}}\cdot A_0) , \tag{2.9} $$

where

(1) $v_1$ is the highest node in $\vartheta$, i.e. $v_1' = r$,

(2) $C(v_1,\tilde v)$ is the collection of vertices crossed by the connected path of branches in $\vartheta$ linking the node $v_1$ with a node $\tilde v\le v_1$, with $C(v_1,v_1) = v_1$,

(3) $v''$ is the node on $C(v_1,\tilde v)$ immediately preceding $v$.
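As an aside on the graph values of Sect. 2.2, the following minimal sketch evaluates (2.4) on a small rooted tree. The data layout is an assumption made for the illustration: `parent[v]` gives the node $v'$ following $v$ (with `None` standing for the root $r$), `momenta[v]` is the external momentum $\nu_v$, and the model coefficients $f_{j\nu} = a_j|\nu|^{-b}$ are used; none of these names come from the paper.

```python
import math


def f_vec(a, nu, b):
    """Model coefficients f_{j nu} = a_j |nu|^(-b) for nu != 0."""
    norm = math.sqrt(sum(n * n for n in nu))
    return [aj * norm ** (-b) for aj in a]


def line_momenta(parent, momenta):
    """Momentum nu_{lambda_v} = sum of nu_w over all nodes w <= v (v and its predecessors)."""
    flow = {v: list(nu) for v, nu in momenta.items()}
    for v, nu in momenta.items():
        u = parent[v]
        while u is not None:  # None marks the root r, which carries no momentum label
            flow[u] = [x + y for x, y in zip(flow[u], nu)]
            u = parent[u]
    return flow


def graph_value(parent, momenta, omega, a, b, j):
    """Val = prod_v (i nu_{v'} . f_{nu_v}) / (i omega.nu_{lambda_v}), with i nu_r = e_j.

    The component index j is 0-based here (a convention of this sketch).
    """
    flow = line_momenta(parent, momenta)
    value = 1 + 0j
    for v, nu in momenta.items():
        if not any(flow[v]):
            raise ValueError("line momenta must be nonzero")
        f = f_vec(a, nu, b)
        u = parent[v]
        if u is None:
            numerator = f[j]  # i nu_r = e_j selects the j-th component
        else:
            numerator = 1j * sum(n * fi for n, fi in zip(momenta[u], f))
        dot = sum(o * n for o, n in zip(omega, flow[v]))
        value *= numerator / (1j * dot)
    return value


if __name__ == "__main__":
    omega = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)  # hypothetical rotation vector
    parent = {1: None, 2: 1}                     # node 2 precedes node 1, node 1 enters the root
    momenta = {1: (1, 0), 2: (0, 1)}
    print(graph_value(parent, momenta, omega, a=(1.0, 0.5), b=4, j=0))
```

For a graph with a single node the value reduces to $f_{j\nu}/(i\omega_0\cdot\nu)$, i.e. to the first-order coefficient in (2.1).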


With the above definition (2.9), one has

$$ H^{(k)}_{j\nu} = \frac{1}{k!}\sum_{\vartheta\in T(k,\nu)}\mathrm{Val}^*(\vartheta) , \tag{2.10} $$

where $T(k,\nu)$ is defined as after (2.6).

3. Infrared Cancellations

The cancellations discussed in this section are sufficient to treat the analytic case (see [E2,GM2]). Hence they are not really characteristic of the problems that we address in this paper. Nevertheless they must be taken into account and their compatibility with the cancellations that are typical of the differentiable problem will have to be, eventually, discussed.

3.1. Let us define $\chi(x)$ as the characteristic function of the set $\{x\in\mathbb{R} : |x|\in[1/2,1)\}$, and $\chi_1(x)$ as the characteristic function of the set $\{x\in\mathbb{R} : |x| > 1\}$. Then each propagator in (2.6) can be decomposed as

$$ g_\lambda = \frac{\chi_1(\omega_0\cdot\nu_\lambda)}{\omega_0\cdot\nu_\lambda} + \sum_{n=-\infty}^{0}\frac{\chi(2^{-n}\omega_0\cdot\nu_\lambda)}{\omega_0\cdot\nu_\lambda} = \sum_{n=-\infty}^{1} g^{(n)}_\lambda , \tag{3.1} $$

and, inserting the above decompositions in the definition of the value of a graph (2.6), we see that the value of each graph is naturally decomposed into various addends. We can identify the addends simply by attaching to each line $\lambda$ an “infrared scale” (or simply “scale”) label $n_\lambda\le 1$, thus obtaining a new “more decorated” graph that we still call $\vartheta$. It has to be noted that, given a graph $\vartheta$, there is only one set of scales $\{n_\lambda\}$ for which all the propagators $g^{(n_\lambda)}_\lambda$ are not identically zero: nevertheless if one uses (3.1) the scale labels $\{n_\lambda\}$ and the momentum labels $\{\nu_v\}$ are considered as independent labels, which is useful for combinatorial purposes.

Definition 3.2 (Cluster). Given a graph $\vartheta$, a “cluster” of scale $n\le 1$ is a maximal set of nodes connected by lines of scale $\ge n$ with at least one line of scale exactly $n$. A line $\lambda$ which connects nodes both located inside a cluster $T$ is said to be “internal” to the cluster, and we write $\lambda\in T$; the lines which connect a node inside with a node outside the cluster are called “external” to the cluster; if a line $\lambda$ is internal or external to a cluster $T$, we say that $\lambda$ intersects $T$, and we write $\lambda\cap T\neq\emptyset$. A line is “outside” the cluster $T$ if it is neither internal nor external. The nodes of a cluster $V$ of scale $n_V$ may be linked to other nodes by lines of lower scale. Such lines are called “incoming” if they point at a node in the cluster or “outgoing” otherwise; there may be several incoming lines (or zero) but at most one outgoing line, because of the tree structure of the graphs.

Definition 3.3 (Resonance). We call “resonance” a cluster $V$ such that: (1) there is only one incoming line $\lambda_V$ and one outgoing line $\lambda'_V$ and $|\nu_{\lambda_V}| = |\nu_{\lambda'_V}|$; (2) if $n_V$ is the scale of the cluster and $n_{\lambda_V}$ is the scale of the line $\lambda_V$, one has $n_V\ge n_{\lambda_V} + 3$. If $\nu_{\lambda_V} = \nu_{\lambda'_V}$, the resonance is called a “real resonance”; if $\nu_{\lambda_V} = -\nu_{\lambda'_V}$, it is called a “virtual resonance”.
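The scale assignment behind (3.1) can be sketched in a few lines: for a nonzero value $x = \omega_0\cdot\nu_\lambda$ exactly one $g^{(n)}_\lambda$ is nonzero, namely $n = 1$ if $|x|$ is large and otherwise the integer $n\le 0$ with $2^{n-1}\le|x|<2^{n}$. The helper names below are hypothetical, and the boundary value $|x| = 1$ (which the definitions of $\chi$ and $\chi_1$ as stated leave uncovered) is assigned to scale 1 as an assumption of the sketch.

```python
import math


def infrared_scale(x):
    """Scale n <= 1 of a nonzero divisor x: n = 1 if |x| >= 1, else 2^(n-1) <= |x| < 2^n."""
    ax = abs(x)
    if ax == 0.0:
        raise ValueError("the momentum flowing through a line must give a nonzero divisor")
    if ax >= 1.0:
        return 1  # assumption: |x| = 1 is lumped together with |x| > 1
    # frexp writes ax = m * 2**e with 0.5 <= m < 1, hence 2**(e-1) <= ax < 2**e
    _, exponent = math.frexp(ax)
    return exponent


def line_scale(omega, nu):
    """Scale label n_lambda attached to a line carrying momentum nu."""
    return infrared_scale(sum(o * n for o, n in zip(omega, nu)))


if __name__ == "__main__":
    omega = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)  # hypothetical rotation vector
    for nu in [(1, 0), (-2, 1), (-3, 2), (5, -3)]:
        print(nu, line_scale(omega, nu))
```

With these labels, condition (2) of Definition 3.3 is the integer inequality $n_V\ge n_{\lambda_V}+3$ between the scale of the cluster and the scale of its incoming line.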


Note that $V$ can be a resonance only if $n_{\lambda_V}\le -2$. Then the following result holds (see [S,E,BGGM]).

Lemma 3.4 (Siegel-Eliasson’s bound). If we consider only graphs with no real resonances then

$$ \prod_{\lambda\in\vartheta}\frac{1}{|\omega_0\cdot\nu_\lambda|} \le C^k\,\frac{\prod_{v\in\vartheta}|\nu_v|^{\frac{\eta}{2}\tau}}{\big(\sum_{v\in\vartheta}|\nu_v|\big)^{\tau}} , \tag{3.2} $$

for some positive constant $C$ and $\eta = 6$.

3.5. Consider a graph $\vartheta$ and call $\hat\vartheta$ the graph obtained by deleting the infrared scale labels $\{n_\lambda\}$ and $\vartheta^0$ the graph obtained by deleting the scale and external momentum labels: we shall write $\vartheta = (\hat\vartheta,\{n_\lambda\})$, or $(\vartheta^0,\{n_\lambda\},\{\nu_v\})$.

Suppose that the set of scales $\{n_\lambda\}$ for $\vartheta^0$ is consistent with the existence of a fixed family $\mathbf{V}_1$ of maximal (real) resonances, i.e. of real resonances not contained in any larger real resonance.¹ If $V\in\mathbf{V}_1$ we call $\lambda_V = v^b_V v^1_V$ the line incoming into the real resonance and $n_{\lambda_V}$ its scale; likewise $\lambda'_V = v^0_V v^a_V$ is the outgoing line ($v^a_V, v^b_V\in V$ while $v^0_V, v^1_V$ are out of it).

We consider the graph values at a fixed set of scales for the lines not in any $V\in\mathbf{V}_1$ and arbitrary values assigned to the scales of the lines inside the resonances $V\in\mathbf{V}_1$, and we say, in general, that a set of scales is “compatible” with $\mathbf{V}_1$, denoting this property by $\{n_\lambda\}\,\&\,\mathbf{V}_1$.²

We introduce the momentum flowing on $\lambda_v\in V$ intrinsic to the cluster $V$ as $\nu^0_{\lambda_v} = \sum_{v\ge w\in V}\nu_w$, and define the “resonance path” $Q_V$ as the totally ordered path of lines joining the line coming into the real resonance $V$ with the outgoing line and not including the latter two lines. Then

$$ \sum_{\{n_\lambda\}}\mathrm{Val}(\hat\vartheta,\{n_\lambda\}) = \sum_{\mathbf{V}_1}\ \sum_{\{n_\lambda\}\&\mathbf{V}_1}\Big[\prod_{\substack{\lambda\cap\mathbf{V}_1=\emptyset\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi(2^{-n_\lambda}\omega_0\cdot\nu_\lambda)}{\omega_0\cdot\nu_\lambda}\Big]\ \cdot $$

$$ \cdot\ \prod_{V\in\mathbf{V}_1}\Big\{\Big[(\nu_{v^0_V}\cdot f_{\nu_{v^a_V}})\,(\nu_{v^b_V}\cdot f_{\nu_{v^1_V}})\,\frac{\chi(2^{-n_{\lambda_V}}\omega_0\cdot\nu_{\lambda_V})}{(\omega_0\cdot\nu_{\lambda_V})^2}\Big]\ \mathcal{V}\big(\omega_0\cdot\nu_{\lambda_V}\,\big|\,V,\{n_\lambda\}_{\lambda\in V}\big)\Big\} , \tag{3.3} $$

where the sums are performed at fixed $\vartheta^0$, fixed $\{\nu_v\}$ and fixed values of the $n_\lambda$ if $\lambda$ is not in any $V\in\mathbf{V}_1$ and with the scale labels $n_\lambda$ compatible with the cluster structure given by the resonances $V\in\mathbf{V}_1$; therefore they run over many terms, as the labels take all possible values: all vanishing but one, because $\hat\vartheta$ is a graph with given node momenta and therefore with only one possible set $\{n_\lambda\}$ of line scales for which the addend does not vanish; the “resonance value” $\mathcal{V}$ is defined by

$$ \mathcal{V}\big(\zeta\,\big|\,V,\{n_\lambda\}_{\lambda\in V}\big) = \prod_{\substack{\lambda\in V\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi\big(2^{-n_\lambda}(\omega_0\cdot\nu^0_\lambda + \sigma_\lambda\zeta)\big)}{\omega_0\cdot\nu^0_\lambda + \sigma_\lambda\zeta} , \tag{3.4} $$

with $\sigma_\lambda = 1$ if $\lambda$ is on the resonance path $Q_V$ ($\lambda\in Q_V$), otherwise $\sigma_\lambda = 0$.

¹ This means that the $V\in\mathbf{V}_1$ are clusters corresponding to the labeling $\{n_\lambda\}$ of the branches of $\vartheta^0$; furthermore they have one entering line and one exiting line and such lines have the same scale.

² This means that there is at least one graph $(\vartheta^0,\{n_\lambda\},\{\nu_v\})$ such that among its clusters there are the resonances $V\in\mathbf{V}_1$ not contained in any other resonances and each $V\in\mathbf{V}_1$ is a real resonance.


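Let $\chi^{(n,n')}(x) \equiv \sum_{j=n}^{n'}\chi(2^{-j}x)$ be the characteristic function of the set $|x|\in[2^{n-1},2^{n'})$. We shall use the notation $\{n_\lambda\}_{\lambda\notin\mathbf{V}_1}$ to denote the collection of the $\{n_\lambda\}$ corresponding to the lines $\lambda$ not internal to any resonance $V\in\mathbf{V}_1$. Then (3.3) implies

$$ \sum_{\{n_\lambda\}}\mathrm{Val}(\hat\vartheta,\{n_\lambda\}) = \sum_{\mathbf{V}_1}\ \sum_{\{n_\lambda\}_{\lambda\notin\mathbf{V}_1}\&\mathbf{V}_1} X_1(\hat\vartheta,\{n_\lambda\}_{\lambda\notin\mathbf{V}_1}) , $$

where

$$ X_1(\hat\vartheta,\{n_\lambda\}_{\lambda\notin\mathbf{V}_1}) = \Big[\prod_{\substack{\lambda\cap\mathbf{V}_1=\emptyset\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi(2^{-n_\lambda}\omega_0\cdot\nu_\lambda)}{\omega_0\cdot\nu_\lambda}\Big]\ \cdot $$

$$ \cdot\ \prod_{V\in\mathbf{V}_1}\Big\{\Big[(\nu_{v^0_V}\cdot f_{\nu_{v^a_V}})\,(\nu_{v^b_V}\cdot f_{\nu_{v^1_V}})\,\frac{\chi(2^{-n_{\lambda_V}}\omega_0\cdot\nu_{\lambda_V})}{(\omega_0\cdot\nu_{\lambda_V})^2}\Big]\ \mathcal{V}'(\omega_0\cdot\nu_{\lambda_V}\,|\,V)\Big\} , \tag{3.5} $$

with

$$ \mathcal{V}'(\zeta\,|\,V) = \prod_{\substack{\lambda\in V\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi^{(n_{\lambda_V}+3,+\infty)}(\omega_0\cdot\nu^0_\lambda + \sigma_\lambda\zeta)}{\omega_0\cdot\nu^0_\lambda + \sigma_\lambda\zeta} , \tag{3.6} $$

which can be rewritten, by using the interpolation formula $\mathcal{V}'(\zeta\,|\,V) = \mathcal{V}'(0\,|\,V) + \mathcal{V}'_1(\zeta\,|\,V)$, where

$$ \mathcal{V}'_1(\zeta\,|\,V) = \Big[\prod_{\substack{\lambda\in V\setminus Q_V\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi^{(n_{\lambda_V}+3,+\infty)}(\omega_0\cdot\nu^0_\lambda)}{\omega_0\cdot\nu^0_\lambda}\Big]\ \cdot $$

$$ \cdot\ \zeta\int_0^1 dt_V\,\frac{\partial}{\partial t_V}\Big[\prod_{\substack{\lambda\in Q_V\\ \lambda=xy}}(\nu_x\cdot f_{\nu_y})\,\frac{\chi^{(n_{\lambda_V}+3,+\infty)}(\omega_0\cdot\nu^0_\lambda + t_V\zeta)}{\omega_0\cdot\nu^0_\lambda + t_V\zeta}\Big] . \tag{3.7} $$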

A key remark, [E1,G4], will be that for the purpose of computing the sums over all scale and momentum labels, i.e. for our purposes, we can consider the value $X_1$ with the real resonance value (3.6) corresponding to $V\in\mathbf{V}_1$ always replaced by the expression defined in (3.7); this follows from the following “cancellation”:

Lemma 3.6. Fixed $\vartheta^0$ and $\mathbf{V}_1$, when all graph values in (3.3), with the real resonance value $\mathcal{V}'(\zeta\,|\,V)$ replaced with $\mathcal{V}'(0\,|\,V)$, are summed together over $\{\nu_v\}$ and $\{n_\lambda\}$ with $\{n_\lambda\}\,\&\,\mathbf{V}_1$, they give a vanishing contribution.

The proof is in Appendix A2.

3.7. We can perform explicitly the derivative in (3.7): we obtain (see also the Remark after (5.6) in [BGGM]), neglecting the terms with $\zeta = 0$,

$$ X_1(\hat\vartheta,\{n_\lambda\}_{\lambda\notin\mathbf{V}_1}) = \prod_{\substack{\lambda\in\hat\vartheta\\ \lambda=xy}}\nu_x\cdot f_{\nu_y}\ \cdot\prod_{\lambda\in\hat\vartheta/\mathbf{V}_1}\frac{\chi(2^{-n_\lambda}\omega_0\cdot\nu_\lambda)}{\omega_0\cdot\nu_\lambda}\ \cdot $$

$$ \cdot\ \prod_{V\in\mathbf{V}_1}\ \sum_{\lambda'_V\in Q_V}\ \sum_{z=0}^{1}\int_0^1 dt_V\ p(\lambda'_V,z,t_V)\cdot\Big[\prod_{\lambda\in V}\frac{\chi^{(n_{\lambda_V}+3,+\infty)}(\omega_0\cdot\nu_\lambda(t_V))}{\omega_0\cdot\nu_\lambda(t_V)}\Big] , \tag{3.8} $$


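where, for $\lambda\in V$, we adopt the notation $\nu_\lambda(t_V) = \nu^0_\lambda + t_V\nu_{\lambda_V}$, and

$$ p(\lambda'_V,z,t_V) = \begin{cases} -\dfrac{\omega_0\cdot\nu_{\lambda_V}}{\omega_0\cdot\nu_{\lambda'_V}(t_V)} , & \text{if } z = 1 ,\\[2mm] \sum_{t^*_V}\delta(t_V - t^*_V) , & \text{if } z = 0 , \end{cases} \tag{3.9} $$

where $t^*_V$ are the solutions to the equation $|\omega_0\cdot\nu_{\lambda'_V}(t_V)| = 2^{n_{\lambda_V}+2}$, if any (there are at most 2 solutions). The two values of $z$ correspond to the two terms obtained by differentiating the denominator of the term in square brackets in (3.8) ($z = 1$), or the numerator ($z = 0$).

We then redecompose in (3.8) the characteristic functions of the lines inside the real resonances into individual scales from $n_{\lambda_V} + 3$ up, so that (3.5) holds with $\mathcal{V}'$ replaced by $\mathcal{V}'_1$, see (3.7), and with $X_1$ replaced by

$$ \tilde X_1(\hat\vartheta,\{n_\lambda\}) = \int_0^1\prod_{V\in\mathbf{V}_1}dt_V\ \cdot\prod_{\substack{\lambda\in\hat\vartheta\\ \lambda=xy}}\nu_x\cdot f_{\nu_y}\ \cdot\prod_{\substack{\lambda\in\hat\vartheta\\ \lambda\notin\{\lambda'_V\}_{V\in\mathbf{V}_1}}}\frac{\chi(2^{-n_\lambda}\omega_0\cdot\nu_\lambda(t))}{\omega_0\cdot\nu_\lambda(t)}\ \cdot\prod_{V\in\mathbf{V}_1}\ \sum_{\lambda'_V\in Q_V}\ \sum_{z_V=0}^{1}\frac{d^{z_V}_{\lambda'_V}}{\omega_0\cdot\nu_{\lambda'_V}(t)} , \tag{3.10} $$

with $t = \{t_V\}_{V\in\mathbf{V}_1}$ and we set $\nu_\lambda(t) = \nu^0_\lambda + t_V\nu_{\lambda_V}$ if $\lambda\in Q_V$, and $\nu_\lambda(t) = \nu^0_\lambda\equiv\nu_\lambda$ if $\lambda\notin\cup_{V\in\mathbf{V}_1}Q_V$, and

$$ d^{1}_{\lambda'_V} = -1 , \qquad d^{0}_{\lambda'_V} = \frac{\omega_0\cdot\nu_{\lambda'_V}(t)}{\omega_0\cdot\nu_{\lambda_V}}\sum_{t^*_V}\delta(t_V - t^*_V) , \tag{3.11} $$

where $t^*_V$ is defined as in (3.9).

Each addend in (3.10), with fixed $\vartheta = (\hat\vartheta,\{n_\lambda\})$ and $\{\lambda'_V,z_V,t_V\}_{V\in\mathbf{V}_1}$, is said to be “superficially renormalized” on the real resonances $\mathbf{V}_1$, on the line $\lambda'_V$ and on the choices $z_V$.

Remark 3.8. Note that the case $z_V = 0$ is special as it forces $n_{\lambda'_V} = n_{\lambda_V} + 3$, so that the ratio in the definition (3.11) of $d^{0}_{\lambda'_V}$ is bounded above by $2^4$.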

3.9. Having dealt with the maximal real resonances (“first generation” real resonances) we perform again the same operations, i.e. fixed $\hat\vartheta$, $\mathbf{V}_1$, $\{\lambda'_V,z_V,t_V\}_{V\in\mathbf{V}_1}$ and the scales $\{n_\lambda\}$ for $\lambda\notin\cup_{V\in\mathbf{V}_1}V$, we identify the “second generation” real resonances as the maximal real resonances inside each $V\in\mathbf{V}_1$; call $\mathbf{V}_2$ the set of the real resonances of the first and second generations and proceed in a similar way to “renormalize” superficially the newly considered real resonances $W\in\mathbf{V}_2\setminus\mathbf{V}_1$. This means that we fix the scale labels of the lines outside the second generation real resonances, and sum over the other scale labels $\{n_\lambda\}$, $\lambda\in\mathbf{V}_2\setminus\mathbf{V}_1$, consistent with the cluster structure of $\mathbf{V}_2$.

We obtain that the product in (3.10) of the terms coming from the lines $\lambda\in W\in\mathbf{V}_2$, $W\subset V\in\mathbf{V}_1$, can be written in a form very close to (3.6), with the difference that the momenta flowing through the lines $\lambda\in Q_W\cap Q_V$ are $\nu_\lambda(t) = \nu^0_\lambda + t_W(\nu^0_{\lambda_W} + t_V\nu_{\lambda_V})$, and $n_{\lambda_V} + 3$ is replaced by $n_{\lambda_W} + 3$ in the characteristic functions.

We proceed to perform a Taylor expansion as above, in the variables $\zeta_W = \omega_0\cdot(\nu^0_{\lambda_W} + t_V\nu_{\lambda_V})$ if $\lambda_W\in Q_V$ or $\zeta_W = \omega_0\cdot\nu^0_{\lambda_W}$ otherwise. However this time we


modify the renormalization procedure: if $W$ contains the line $\lambda'_V$ we do nothing, while if $\lambda'_V\notin W$ we write the first order remainder as above.

We then perform the derivatives with respect to the new interpolation parameters $t_W$, generated by the expression of the Taylor series remainders, hence redevelop the characteristic functions and rearrange, along the lines that generated (3.10), the various terms and get an expression very similar to (3.10), for a quantity that we could call $X_2(\hat\vartheta,\{n_\lambda\}_{\lambda\notin\mathbf{V}_2\setminus\mathbf{V}_1})$, such that

$$ \sum_{\{n_\lambda\}}\mathrm{Val}(\hat\vartheta,\{n_\lambda\}) = \sum_{\mathbf{V}_2}\ \sum_{\{n_\lambda\}_{\lambda\notin\mathbf{V}_2\setminus\mathbf{V}_1}\&\mathbf{V}_2} X_2(\hat\vartheta,\{n_\lambda\}_{\lambda\notin\mathbf{V}_2\setminus\mathbf{V}_1}) . $$

Note that the order zero term of each Taylor expansion can be neglected, because of Lemma 3.6, which still holds if $V$ is not a maximal resonance. In the same way as the $X_1$ was already used to start the second renormalizations, the latter $X_2$ can be subsequently used, for the superficial renormalization of the third generation of real resonances. Then we iterate step by step the procedure until there are no more real resonances inside the maximal real resonances found at the last step performed and all the $n_\lambda$ have been fixed.³

³ Note that we, in fact, proceeded by first fixing the scales of the lines outside the maximal resonances, and at the first step we fixed the scales of lines just inside such first generation resonances, at the second step also the scales of the lines inside the second generation of resonances were fixed, and so on.

To write down the final expression we need some notations.

(1) Let us call $\mathbf{V}$ the collection of all real resonances selected along the iterative procedure. For each $V\in\mathbf{V}$ choose a line $\lambda'_V$, on the resonance path of $V$, with the “compatibility condition” that if $V\subset Z\in\mathbf{V}$, $\lambda'_Z\in V$ implies $\lambda'_V = \lambda'_Z$.

(2) Then if $\lambda'_Z\in V$ (so that $\lambda'_Z = \lambda'_V$) we say that $\lambda'_V$ and $V$ are “old”, and define

$$ \pi_V(dt_V) = \delta(t_V - 1)\,dt_V , \qquad d^{1}_{\lambda'_V} = 1 , \qquad d^{0}_{\lambda'_V} = 0 . \tag{3.12} $$

(3) If $\lambda'_Z\notin V$ we say that the line $\lambda'_V$ is “new” and that the real resonance $V$ is “new”, and define

$$ \pi_V(dt_V) = dt_V , \qquad d^{1}_{\lambda'_V} = -1 , \qquad d^{0}_{\lambda'_V} = \frac{\omega_0\cdot\nu_{\lambda'_V}(t)}{\omega_0\cdot\nu_{\lambda_V}(t)}\sum_{t^*_V}\delta(t_V - t^*_V) , \tag{3.13} $$

where $t^*_V$ are the solutions (at most 2) of the equation $|\omega_0\cdot\nu_{\lambda'_V}(t)| = 2^{n_{\lambda_V}+2}$ for $t_V$, and the interpolated momenta $\nu_\lambda(t)$ are defined as $\nu_\lambda(t) = \nu^0_\lambda$ if $\lambda$ is not contained in any resonance paths and, otherwise

$$ \nu_\lambda(t) = \nu^0_\lambda + \sum_{V:\ \lambda\in Q_V}\nu^0_{\lambda_V}\prod_{W\subseteq V:\ \lambda\in Q_W} t_W , \tag{3.14} $$

where the sum and the product are over all resonances $V$, $W$ verifying the conditions indicated and $\nu^0_\lambda$ is the sum of all the momenta of nodes preceding $\lambda$ and inside the smallest resonance containing it when $\lambda$ is inside a resonance (see (5.13) in [BGGM]).

(4) Recalling that $\lambda'_V$ denotes the line exiting the resonance $V$, define

$$ P_0(\vartheta) = \prod_{\lambda\notin\cup_V\lambda'_V}\frac{\chi(2^{-n_\lambda}\omega_0\cdot\nu_\lambda(t))}{\omega_0\cdot\nu_\lambda(t)} , \qquad \mathcal{N}(\vartheta) = \prod_{\substack{\lambda\in\hat\vartheta\\ \lambda=xy}}\nu_x\cdot f_{\nu_y} , \tag{3.15} $$

and denote by $\Lambda$ the function $V\to\{\lambda'_V,z_V\}$.

(5) Define

$$ \mathcal{R}\,\mathrm{Val}(\vartheta) = \mathcal{N}(\vartheta)\,\mathcal{R}D(\vartheta) , \tag{3.16} $$

where

$$ \mathcal{R}D(\vartheta) = \sum_{\Lambda}\int_0^1\prod_{V\in\mathbf{V}}\pi(dt_V)\ P_0(\vartheta)\prod_{V\in\mathbf{V}}\frac{d^{z_V}_{\lambda'_V}}{(\omega_0\cdot\nu_{\lambda'_V}(t))^*} , \tag{3.17} $$

and the $*$ means that $(\omega_0\cdot\nu_{\lambda'_V}(t))^* = \omega_0\cdot\nu_{\lambda'_V}(t)$ if the resonance is new, and $(\omega_0\cdot\nu_{\lambda'_V}(t))^* = \omega_0\cdot\nu_{\lambda_V}(t)$ if the resonance is old.

Remark 3.10. By Definition 3.3, we have $|\omega_0\cdot\nu_\lambda(t)|\ge\frac{2}{3}|\omega_0\cdot\nu^0_\lambda|$, uniformly in $t$.

3.11. Then, from the iterative procedure and with the just introduced notations, we obtain

$$ \sum_{\vartheta}\mathrm{Val}(\vartheta) = \sum_{\vartheta}\mathcal{R}\,\mathrm{Val}(\vartheta) , \tag{3.18} $$

as all terms discarded in each Taylor expansion add to zero when summed together (as a corollary of Lemma 3.6). The number of terms thus generated is, at fixed $\mathbf{V}$, bounded by the product over $V\in\mathbf{V}$ of 2 times the number of pairs that are in $V\setminus\cup_{W\subset V,\,W\in\mathbf{V}}W$ and therefore it is bounded by $2^k\prod_V k(V)^2$ if $k(V)$ is the number of nodes in $V$ which are not in real resonances inside $V$. Hence this number is $\le(2^4)^k$. The number of families of real resonances in $\hat\vartheta$ (hence at fixed $\{\nu_v\}$) is also bounded by $2^k$.

3.12. After applying the $\mathcal{R}$ operations, we see that the contribution to the new “renormalized value” from the divisors in (3.10) will be bounded by the same product appearing in the non-renormalized values of the graphs deprived of the divisors due to the lines $\lambda'_V$ exiting resonances times a factor

$$ 2^{4k}\prod_{V\subset\vartheta}\frac{1}{\min_{\lambda\in V_0}|\omega_0\cdot\nu_\lambda(t)|} \le C_2^k\prod_{V\subset\vartheta}\frac{1}{\min_{\lambda\in V_0}|\omega_0\cdot\nu^0_\lambda|} \le C_1^k\prod_{V\subset\vartheta}\Big[\sum_{v\in V_0}|\nu_v|\Big]^{\tau} , \tag{3.19} $$

where the factor $2^{4k}$ arises from Remark 3.8, $C_2 = 2^4\cdot(3/2)$ from Remark 3.10, $C_1 = C_0C_2$ and $V_0$ is the set of nodes inside the real resonance $V$ not contained in the real resonances internal to $V$.

In order to bound the factor $P_0(\vartheta)$ given by (3.15), we can identify the real resonances $V\in\mathbf{V}$ of different generations; the set $\mathbf{V}_j$ of real resonances of the $j$-th generation, $j\ge 1$, just consists of the real resonances which are contained in $(j-1)$-th generation real resonances (of lower scale) but not in any $(j+1)$-th generation real resonances.

If $V$ is a real resonance in $\mathbf{V}_j$ with entering line $\lambda_V = v^b_V v^1_V$ and outgoing line $\lambda'_V = v^0_V v^a_V$ with momentum $\nu_{\lambda_V}$ we can construct a “$V$-contracted graph” by replacing the cluster $V$ together with the incoming and outgoing lines by the single line $v^0_V v^1_V$: i.e. by deleting the resonance $V$ and replacing it by a line.

We can also construct the “$V$-cut graphs” by deleting everything but the lines of the resonance $V$ and its entering and outgoing lines and, furthermore, by deleting the outgoing line $\lambda'_V$ as well as the node $v^a_V$ and attributing to the node $v^1_V$ an external momentum equal to the momentum flowing into the entering line in the original graph $\vartheta$: thus we get $p_{v^a_V}$ disconnected graphs.


We repeat the above two operations until we are left only with graphs $\vartheta_i$, $i = 1,2,\ldots$ without real resonances: by construction the product $\prod_{\lambda\in\vartheta}|\omega_0\cdot\nu_\lambda(t)|^{-1}$ is the same as the $\prod_i\prod_{\lambda\in\vartheta_i}|\omega_0\cdot\nu_\lambda(t)|^{-1}$. Then we imagine deleting as well the lines of the various $\vartheta_i$ which were generated by the old entering lines (not all $\vartheta_i$ contain such lines, but some do) and we call $\vartheta'_i$ the graphs so obtained. By doing so we change the momenta flowing into the lines of the graphs $\vartheta_i$ by an amount which is either 0 or the old momentum $\nu_{\lambda_V}(t)$ entering a real resonance $V$, and from Remark 3.10, we have

$$ |P_0(\vartheta)| \le C_1^k\prod_i\prod_{\lambda\in\vartheta'_i}|\omega_0\cdot\nu^0_\lambda|^{-1} , \tag{3.20} $$

and using Lemma 3.4 it follows

$$ \prod_i\prod_{\lambda\in\vartheta'_i}|\omega_0\cdot\nu^0_\lambda|^{-1} \le C^k\prod_i\frac{\prod_{v\in\vartheta'_i}|\nu_v|^{\frac{\eta}{2}\tau}}{\big(\sum_{v\in\vartheta'_i}|\nu_v|\big)^{\tau}} , \tag{3.21} $$

with $\eta = 6$ (see also [BGGM]). Then (3.17)–(3.21), bounding the last product in (3.17) by (3.19), imply

$$ \mathcal{R}D(\vartheta) \le C^k C_1^k\prod_{v\in\vartheta}|\nu_v|^{\frac{\eta}{2}\tau} , \qquad \eta = 6 . \tag{3.22} $$

Note that (3.22) and the discussion leading to it are a version of the proofs in [E1,E2]; in the case of analytic $f$ the Fourier coefficients $f_\nu$ are exponentially bounded as $|\nu| \to \infty$, so that (3.22) yields the convergence of the perturbation series for $h$.

4. Ultraviolet Cancellations

The ultraviolet cancellations are characteristic of the linearization problems relative to (1.1), (1.2). They are quite different from the infrared ones discussed in Sect. 3 and the main technical problem, besides their identification, is their compatibility with the infrared cancellations. Exhibiting the two cancellations simultaneously may not be possible, in the sense that the first cancellations may require grouping graphs in classes that are completely different from the groupings that are necessary to exhibit the second cancellations. If this happens one says that the cancellations are not independent, or “overlap”, and it is clear that one runs into serious problems. Hence the following analysis will be mostly devoted to showing that, besides an obvious incompatibility that can be explicitly resolved, the two cancellations are in fact independent.

4.1. Given a graph $\vartheta$, we can define the (ultraviolet) “scale” $h_v$ of the node $v$ to be the integer $h_v \ge 1$ such that $2^{h_v-1} \le |\nu_v| < 2^{h_v}$. We say that the labels $\{\nu_x\}$ and $\{h_x\}$ are “compatible” if $|\nu_v| \in [2^{h_v-1}, 2^{h_v})$ for all $v \in \vartheta$. The compatibility relationship between $\{\nu_x\}$ and $\{h_x\}$ will be denoted $\{\nu_x\}$ comp $\{h_x\}$. Then we can write

$$ \sum_{\vartheta} R\,{\rm Val}(\vartheta) = \sum_{\vartheta^0} \sum_{\{\nu_x\}} R\,{\rm Val}(\vartheta^0, \{\nu_x\}) = \sum_{\vartheta^0} \sum_{\{h_x\}} \sum_{\{\nu_x\}\ {\rm comp}\ \{h_x\}} R\,{\rm Val}(\vartheta^0, \{\nu_x\}) . \tag{4.1} $$
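The definition of the ultraviolet scale and of compatibility in Sect. 4.1 can be illustrated by a minimal Python sketch. The function names are hypothetical, and the use of the Euclidean length for $|\nu|$ is an assumption made only for this example; the definition itself works with any norm.

```python
import math

def norm(nu):
    """Euclidean length |nu| of an integer vector (assumed choice of norm)."""
    return math.sqrt(sum(c * c for c in nu))

def uv_scale(nu):
    """Integer h >= 1 with 2**(h-1) <= |nu| < 2**h, for a nonzero integer vector."""
    n = norm(nu)
    if n == 0:
        raise ValueError("external momenta are nonzero")
    # frexp writes n = m * 2**e with 0.5 <= m < 1, i.e. 2**(e-1) <= n < 2**e
    return math.frexp(n)[1]

def compatible(momenta, scales):
    """True if |nu_v| lies in [2**(h_v - 1), 2**h_v) for every node v."""
    return all(uv_scale(momenta[v]) == scales[v] for v in momenta)

# Hypothetical labels for a graph with three nodes
momenta = {"v1": (3, 4), "w": (1, 0), "z": (-4, 0)}
scales = {v: uv_scale(nu) for v, nu in momenta.items()}
```

In the sum (4.1) only the compatible pairs of labels contribute, so that `compatible` plays the role of the constraint $\{\nu_x\}$ comp $\{h_x\}$.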


F. Bonetto, G. Gallavotti, G. Gentile, V. Mastropietro

Here $\vartheta^0$ denotes a graph with infrared scale labels only, $\vartheta = (\vartheta^0, \{\nu_x\})$. The infrared scale labels need not be explicitly declared. None of the following operations will modify the infrared scale of the momentum flowing in a line, thus it is possible to use a notation in which the infrared scale labels (needed in the discussion of the infrared cancellations) do not appear explicitly.

Set $Q = \cup_{V\in\mathcal V} Q_V$, where $\mathcal V$ is the set of all resonances of $\vartheta$ and $Q_V$ is the resonance path of the resonance $V$ (see Sect. 3.5): however from now on we shall consider not only real resonances but also virtual resonances, referring to them occasionally just as “resonances”. The notion of resonance path makes sense also for virtual resonances: see below (end of Sect. 4.6, or [BGGM]) for a discussion of why the consideration of virtual as well as real resonances (which seem to be objects of a purely infrared problem) is necessary in the ultraviolet problem.

Define $B_v$ as the subset of the nodes $w$ among the $p_v$ nodes immediately preceding $v$ such that the branch $vw$ is not on the resonance paths $Q$ of real or virtual resonances. The generally larger set of all nodes immediately preceding $v$ is denoted $\bar B_v$: $B_v \subseteq \bar B_v$. Given a set of momenta and a fixed node $\bar v \in \vartheta^0$, we define the change of variables $U^{\sigma_w}_{\bar v w} : Z^\ell \leftrightarrow Z^\ell$, where $w \in B_{\bar v}$, by fixing a sign $\sigma_w = \pm 1$ and defining $U^{\sigma_w}_{\bar v w}(\{\nu_x\}) = \{\nu'_x\}$ as:

$$ \nu'_z = \sigma_w \nu_z , \quad z \le w , \qquad \nu'_z = \nu_z \quad \text{for all other } z \ne \bar v , \qquad \nu'_{\bar v} = \nu_{\bar v} + (1 - \sigma_w) \sum_{z\le w} \nu_z = \nu_{\bar v} + (1 - \sigma_w)\, \nu_{\lambda_w} , \tag{4.2} $$

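so that, for any choice of the subset $B_{1\bar v} \subseteq B_{\bar v}$ of nodes immediately preceding $\bar v$, there are cancellations, see Remark 4.2 below, which allow us to write

$$ \sum_{\{\sigma_w\}_{w\in B_{1\bar v}}} R\,{\rm Val}\Big(\vartheta^0, \prod_{w\in B_{1\bar v}} U^{\sigma_w}_{\bar v w} \{\nu_x\}\Big) \equiv \sum_{\|m_{\bar v}\| = p_{\bar v}} \int_1^0 \prod_{w\in B_{1\bar v}} dt_w \cdot \Big( \prod_{w\in B_{1\bar v}} \frac{\partial}{\partial t_w} \Big) f^j_{\nu_{\bar v}(t_{\bar v})} \big(\nu_{\bar v}(t_{\bar v})\big)^{m_{\bar v}} \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu_{\lambda_{\bar v}}}\, {\rm Val}'(\vartheta^0) \Big\} , \tag{4.3} $$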

where:

(i) ${\rm Val}'(\vartheta^0)$ is a tensor containing all the other factors of the graph value relative to nodes $v$ different from $\bar v$.

(ii) The free indices of the $p_{\bar v}$-order tensor $\nu_{\bar v}(t_{\bar v})^{m_{\bar v}}$ are contracted (by performing the $\sum_{m_{\bar v}}$) with the ones that appear in the tensor ${\rm Val}'(\vartheta^0)$, and $m_{\bar v}$ is an $\ell$-dimensional vector with positive integer components (with $\|m_{\bar v}\|$ denoting the sum of the components; see comments after (1.4)) and, given a vector $b$, we put $b^{m_{\bar v}} = b_1^{m_{\bar v 1}} \cdots b_\ell^{m_{\bar v \ell}}$; furthermore $t_{\bar v} = (t_{w_1}, \ldots, t_{w_{|B_{1\bar v}|}})$ and $\nu_{\bar v}(t_{w_1}, \ldots, t_{w_{|B_{1\bar v}|}}) \equiv \nu_{\bar v}(t_{\bar v})$ is defined as

$$ \nu_{\bar v}(t_{\bar v}) = \nu_{\bar v}(t_{w_1}, \ldots, t_{w_{|B_{1\bar v}|}}) = \nu_{\bar v} + \sum_{w\in B_{1\bar v}} 2 t_w \nu_{\lambda_w} = \nu_{\bar v} + \sum_{w\in B_{1\bar v}} t_w \sum_{z\le w} 2\nu_z , \tag{4.4} $$

where $\nu_{\bar v}(t_{\bar v}) = \nu_{\bar v}$ if $B_{1\bar v} = \emptyset$.

(iii) The assumed form (1.5) of the $f_\nu$ allows us to think that $f_\nu$ is defined on $R^\ell$ rather than on $Z^\ell$ and hence to give a meaning to the derivatives of $f_{\nu_v(t_v)}$.

Quasi Linear Flows on Tori


(iv) We use here and henceforth that $R$ acts only on the product of propagators $D(\vartheta)$ (see (2.5) and (3.16)), and the fact that the definition of $B_v$ after (4.1) implies that all real or virtual resonances remain such under the action of the change of variables (4.2).

Remark 4.2. The cancellations are due to the fact that the change of variables (4.2) leaves unchanged each factor $(\nu_{v'} \cdot f_{\nu_v})[\omega_0\cdot\nu_{\lambda_v}]^{-1}$, except for the nodes $w$ and $\bar v$, whose factors are modified in the following way:

$$ \frac{\nu_{\bar v}\cdot f_{\nu_w}}{\omega_0\cdot\nu_{\lambda_w}} \to - \frac{(\nu_{\bar v} + \zeta_w)\cdot f_{\nu_w}}{\omega_0\cdot\nu_{\lambda_w}} \quad (\text{node } w) , \qquad \frac{\nu_{\bar v'}\cdot f_{\nu_{\bar v}}}{\omega_0\cdot\nu_{\lambda_{\bar v}}} \to \frac{\nu_{\bar v'}\cdot f_{\nu_{\bar v} + \zeta_w}}{\omega_0\cdot\nu_{\lambda_{\bar v}}} \quad (\text{node } \bar v) , $$

with $\zeta_w = 2\nu_{\lambda_w}$: then, if we set $\zeta_w = 0$, the two graph values cancel exactly; hence their sum can be written via an interpolation formula like (4.3).

4.3. We can study the sum $S_k(\vartheta^0) = \sum_\nu |\nu|^s |RW(\vartheta^0, \nu)|$, where $W(\vartheta^0, \nu)$ is defined in (2.8) and $RW(\vartheta^0, \nu)$ is defined as in (2.8) with ${\rm Val}(\vartheta)$ replaced by $R\,{\rm Val}(\vartheta)$.⁴ From the tree structure of the graphs defining the “value” it follows that

$$ S_k(\vartheta^0) = \sum_\nu |\nu|^s RW(\vartheta^0, \nu) = \sum_\nu \Big\{ |\nu|^s f^j_{\nu_{v_1}} (\nu_{v_1})^{m_{v_1}} R \Big[ \frac{1}{\omega_0\cdot\nu} \prod_{w\in B_{v_1}} \Big[ \sum_{\nu_{\lambda_w}} W(\vartheta^0_{v_1 w}, \nu_{\lambda_w}) \Big] \Big] \Big\} , \tag{4.5} $$

where $v_1$ is the highest node and $\nu_{v_1} = \tilde\nu - \sum_{w\in B_{v_1}} \nu_{\lambda_w}$ with $\tilde\nu = \nu - \sum_{w\in \bar B_{v_1}\setminus B_{v_1}} \nu_{\lambda_w}$. Fixed $\tilde\nu$ and $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$, let $h_{v_1} \equiv h_{v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}})$ be the scale of $\nu_{v_1}$: i.e. $\nu_{v_1}$ is such that $2^{h_{v_1}-1} \le |\nu_{v_1}| < 2^{h_{v_1}}$. We shall say, see Sect. 3.5 and Sect. 4.1, that $h_{v_1}$ is “compatible” with $\nu_{v_1}$ if $\nu_{v_1}$ has scale $h_{v_1}$. Given $w$ with $w' = v_1$ we say that $w$ is “out of order” with respect to $v_1$ if, for a suitably large $o$, which below we fix to be $o = 5$ as this turns out to be sufficient, one has

$$ 2^{h_{v_1}} > 2^o\, p_{v_1} |\nu_{\lambda_w}| , \tag{4.6} $$

where $p_v$ is the number of branches entering $v$. We denote by $B_{1v_1} \equiv B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}) \subseteq B_{v_1}$ the set of nodes $w \in B_{v_1}$ which are out of order with respect to $v_1$. The number of elements in $B_{1v_1}$ will be denoted $q_{v_1} = |B_{1v_1}|$. Note that the notion of $w$ being out of order with respect to $v_1$ depends on $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$ and $\tilde\nu$. Given a set $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$, for all choices of $\sigma_w = \pm 1$ we define the transformation

$$ U(\{\nu_{\lambda_w}\}_{w\in B_{v_1}}) \equiv \{\sigma_w \nu_{\lambda_w}\}_{w\in B_{v_1}} , \tag{4.7} $$

and given a set $C \subseteq B_{v_1}$ we call $\mathcal U(C)$ the set of all transformations $U$ such that $\sigma_w = 1$ for $w \notin C$. If $[2^{h-1}, 2^h)$ is a scale interval $I_h$, $h = 1, 2, \ldots$, we call
• the “first quarter” of $I_h$ the “lower part” $I_h^- = [2^{h-1}, \frac54 2^{h-1})$ of $I_h$,
• the “fourth quarter” of $I_h$ the “upper part” $I_h^+ = [\frac78 2^h, 2^h)$ of $I_h$, and
• the remaining part the “central part” $I_h^c$ of $I_h$.

⁴ The reader familiar with [BGGM] can skip the following discussion, which is essentially identical to the one in [BGGM], Sect. 4, and leap directly to the final expression (4.27) in Sect. 4.7.

We group the set of branch momenta $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$ into collections by proceeding iteratively in the way described below. The collections will be built so that in each collection the cancellation discussed in Remark 4.2 above can be exhibited. Fixed $\tilde\nu$ and $h$, choose $\{\nu^1_{\lambda_w}\}_{w\in B_{v_1}}$ such that $|\nu^1_{v_1}| \in I_h^c$: such $\{\nu^1_{\lambda_w}\}_{w\in B_{v_1}}$ is called a “representative”. Given the representative we define
• the “branch momenta collection” to be the set of the $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$ of the form

$$ U(\{\nu^1_{\lambda_w}\}_{w\in B_{v_1}}) , \qquad U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu^1_{\lambda_w}\}_{w\in B_{v_1}})) ; \tag{4.8} $$

• the “external momenta collection” to be the set of momenta

$$ \nu^{1U}_{v_1} = \nu - \sum_{w\in \bar B_{v_1}} \sigma_w \nu^1_{\lambda_w} , \qquad \text{for } U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu^1_{\lambda_w}\}_{w\in B_{v_1}})) , \tag{4.9} $$

where, here and below, we set $\sigma_w \equiv 1$ if $w \in \bar B_{v_1} \setminus B_{v_1}$ to unify the notation. The elements of the above constructed external momenta collection need not be contained in $I_h^c$.

We consider then another representative $\{\nu^2_{\lambda_w}\}_{w\in B_{v_1}}$ such that $|\nu^2_{v_1}| \in I_h^c$ and not belonging to the branch momenta collection associated with $\{\nu^1_{\lambda_w}\}_{w\in B_{v_1}}$, if there are any left; and we consider the corresponding branch momenta and external momenta collections as above. We proceed in this way until all the representatives such that $\nu_{v_1}$ is in $I_h^c$, for the given $h$, have been put into some collection of branch momenta.

We then repeat the above construction with the interval $I_h^-$ replacing $I_h^c$, always being careful not to consider representatives $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$ that appeared as members of previously constructed collections. It is worth pointing out that not all the external momenta $\nu^U_{v_1}$, $U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}))$, are in $I_h^-$, but they are all in the corridor $I^+_{h-1} \cup I_h^-$, by (4.6).

Finally we consider the interval $I^+_{h-1}$ (if $h = 1$ we simply skip this step). The construction is repeated for such intervals. Proceeding iteratively in this way starting from $h = 1$ and, after exhausting all the $h = 1$ cases, continuing with the $h = 2, 3, \ldots$ cases, we shall have grouped the sets of branch momenta into collections obtainable from a representative $\{\nu_{\lambda_w}\}_{w\in B_{v_1}}$ by applying the operations $U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}))$ to it. Note that, in this way, when the interval $I^+_{h-1}$ is considered, all the remaining representatives are such that $|\nu^U_{v_1}| \in I^+_{h-1}$ for all $U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}))$.

The graphs with momenta in each collection are just the graphs involved in the parity cancellation described in the previous section. In fact if $U$ is generated by the signs $\{\sigma_w\}_{w\in B_{v_1}}$, we have

$$ \nu^U_{v_1} = \Big( \Big( \prod_{w\in B_{1v_1}} U^{\sigma_w}_{v_1 w} \Big) \{\nu_x\} \Big)_{v_1} , \qquad \big( U(\{\nu_{\lambda_{\tilde w}}\}_{\tilde w\in B_{v_1}}) \big)_w = \sum_{z\le w} \Big( \Big( \prod_{\tilde w\in B_{1v_1}} U^{\sigma_{\tilde w}}_{v_1 \tilde w} \Big) \{\nu_x\} \Big)_z , \tag{4.10} $$

where, given the sets $\{\nu_x\}$ and $\{\nu_{\lambda_{\tilde w}}\}$, $(\{\nu_x\})_v$ denotes the external momentum in $\{\nu_x\}$ corresponding to the node $v$ and $(\{\nu_{\lambda_{\tilde w}}\})_w$ denotes the branch momentum in $\{\nu_{\lambda_{\tilde w}}\}$ corresponding to the branch $\lambda_w$.
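The construction of an external momenta collection from a representative can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, $|\nu|$ is taken to be the Euclidean length, and all entering branches are assumed to lie off the resonance paths (so that $B_{v_1} = \bar B_{v_1}$). The sketch applies the out-of-order test (4.6) with $o = 5$ and flips the signs of the out-of-order branch momenta only, as in (4.9).

```python
import itertools
import math

O = 5  # the constant o fixed after (4.6)

def norm(nu):
    """Euclidean length (assumed choice of norm)."""
    return math.sqrt(sum(c * c for c in nu))

def uv_scale(nu):
    """Integer h with 2**(h-1) <= |nu| < 2**h."""
    return math.frexp(norm(nu))[1]

def out_of_order(h, p, nu_lambda, o=O):
    """Condition (4.6): 2**h > 2**o * p * |nu_lambda|."""
    return 2 ** h > 2 ** o * p * norm(nu_lambda)

def external_momenta_collection(nu, branch_momenta):
    """Momenta nu - sum_w sigma_w nu_lambda_w, with sigma_w = -1 allowed only
    on the branches that are out of order for the representative."""
    p = len(branch_momenta)
    rep = tuple(n - sum(b[i] for b in branch_momenta) for i, n in enumerate(nu))
    h = uv_scale(rep)
    flippable = [out_of_order(h, p, b) for b in branch_momenta]
    choices = [(1, -1) if f else (1,) for f in flippable]
    collection = []
    for signs in itertools.product(*choices):
        collection.append(tuple(
            n - sum(s * b[i] for s, b in zip(signs, branch_momenta))
            for i, n in enumerate(nu)))
    return h, collection

# Hypothetical data: the representative (150, 100) has scale 8 and lies in
# the central part [160, 224) of the interval [128, 256).
nu = (192, 129)
branches = [(1, 0), (1, -1), (40, 30)]
h, collection = external_momenta_collection(nu, branches)
```

Since each out-of-order branch has $|\nu_{\lambda_w}| < 2^{h-5}/p_{v_1}$ and there are at most $p_{v_1}$ of them, the sign flips move $\nu_{v_1}$ by less than $2^{h-4}$; for a representative in the central part all members of the collection therefore keep the scale $h$, as the example shows.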


Remark 4.4. The complexity of the above construction is due to the necessity of avoiding overcountings, called “overlapping divergences” in the usual language of field theory. In fact it is possible that, for some $U \in \mathcal U(B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}))$, one has

$$ B_{1v_1}(\tilde\nu, U(\{\nu_{\lambda_w}\}_{w\in B_{v_1}})) \ne B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}}) , \tag{4.11} $$

because the scale of $\nu^U_{v_1}$ may be $h - 1$, while that of $\nu_{v_1}$ may be $h$; so that if one considered, for instance, $I^+_{h-1}$ before $I_h^-$, overcountings would be possible, and in fact they would occur.

4.5. A convenient way to rewrite (4.5) is the following:

$$ S_k(\vartheta^0) = \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \sum_{U\in \mathcal U(B_{1v_1})} f^j_{\nu^U_{v_1}} (\nu^U_{v_1})^{m_{v_1}} \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} W(\vartheta^0_{v_1 w}, \sigma_w \nu_{\lambda_w}) \Big\} , \tag{4.12} $$

where $\sum^*_{\{\nu_{\lambda_w}\}_{w\in B_{v_1}}}$ means sum over the above defined representatives such that $\nu_{v_1}$ is compatible with $h_{v_1}$; and we abbreviate $B_{1v_1}(\tilde\nu, \{\nu_{\lambda_w}\}_{w\in B_{v_1}})$ by $B_{1v_1}$ in conformity with the notations introduced after (4.6). The explicit sum over the scales $h_{v_1}$ is introduced to simplify the analysis of the bounds that we perform later, see Sect. 4.8. Note that $\nu^U_{v_1}$ is, in general, not compatible with $h_{v_1}$, i.e. we are grouping together also terms with a different scale label (but the difference in scale is at most one, see (4.16) below). The parity properties of $f$,

$$ W(\vartheta^0_{v_1 w}, \sigma_w \nu_{\lambda_w}) = \sigma_w W(\vartheta^0_{v_1 w}, \nu_{\lambda_w}) , \tag{4.13} $$

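and (4.12) imply

$$ S_k(\vartheta^0) = \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \sum_{U\in \mathcal U(B_{1v_1})} f^j_{\nu^U_{v_1}} (\nu^U_{v_1})^{m_{v_1}} \cdot R \Big\{ \prod_{w\in B_{1v_1}} \sigma_w\, \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} W(\vartheta^0_{v_1 w}, \nu_{\lambda_w}) \Big\} . \tag{4.14} $$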

We can apply the interpolation in (4.3) to the node $v_1$ and rewrite (4.14) as

$$ S_k(\vartheta^0) = \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \Big[ \sum_{\|m_{v_1}\| = p_{v_1}} \int_1^0 \prod_{w\in B_{1v_1}} dt_w \cdot \Big( \prod_{w\in B_{1v_1}} \frac{\partial}{\partial t_w} \Big) f^j_{\nu_{v_1}(t_{v_1})} \big(\nu_{v_1}(t_{v_1})\big)^{m_{v_1}} \Big] \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} W(\vartheta^0_{v_1 w}, \nu_{\lambda_w}) \Big\} , \tag{4.15} $$

where if $B_{1v_1} = \emptyset$ no interpolation is made; and we note that by (4.3), by the definition of nodes out of order and by the iterative grouping of the representatives,

$$ 2^{h_{v_1}-2} \le |\nu_{v_1}(t_{v_1})| < 2^{h_{v_1}} , \tag{4.16} $$

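so that the interpolation formulae discussed in Sect. 4.1 can be used because no singularity arises in performing the $t_{v_1}$-integrations. By the definition of $W(\vartheta^0, \nu)$ (see (2.8)), we can write (4.15) as

$$ S_k(\vartheta^0) = \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \Big[ \sum_{\|m_{v_1}\| = p_{v_1}} \int_1^0 \prod_{w\in B_{1v_1}} dt_w \cdot \Big( \prod_{w\in B_{1v_1}} \frac{\partial}{\partial t_w} \Big) f^j_{\nu_{v_1}(t_{v_1})} \big(\nu_{v_1}(t_{v_1})\big)^{m_{v_1}} \Big] \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} \sum_{\{\nu_x\}_{x\le w} :\ \nu(\vartheta_{v_1 w}) = \nu_{\lambda_w}} {\rm Val}(\vartheta_{v_1 w}) \Big\} . \tag{4.17} $$

If we use (see (4.3))

$$ \frac{\partial}{\partial t_w} \equiv 2\nu_{\lambda_w} \cdot \frac{\partial}{\partial\nu} \Big|_{\nu = \nu_v(t_v)} \equiv \sum_{z\le w} 2\nu_z \cdot \frac{\partial}{\partial\nu} \Big|_{\nu = \nu_v(t_v)} , \tag{4.18} $$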

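to compute differentiations with respect to $t_w$, we can write (4.15) as

$$ \begin{aligned} S_k(\vartheta^0) = {} & \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \Big\{ \sum_{\|m_{v_1}\| = p_{v_1}} \int_1^0 \prod_{w\in B_{1v_1}} dt_w \cdot \\ & \cdot \Big[ |\bar\nu|^{-(p_{v_1} - q_{v_1})} \frac{\partial^{q_{v_1}}}{\partial \bar\nu^{q_{v_1}}} f^j_{\bar\nu}\, \bar\nu^{m_{v_1}} \Big]_{\bar\nu = \nu_{v_1}(t_{v_1})} \cdot \\ & \cdot \Big[ \prod_{w\in B_{1v_1}} \sum_{z\le w} 2\nu_z \Big] \Big[ \prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} |\nu_{v_1}(t_{v_1})| \Big] \Big\} \cdot \\ & \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} \Big[ \sum_{\{\nu_x\}_{x\le w} :\ \nu(\vartheta_{v_1 w}) = \nu_{\lambda_w}} {\rm Val}(\vartheta_{v_1 w}) \Big] \Big\} , \end{aligned} \tag{4.19} $$

where we recall that $q_{v_1} = |B_{1v_1}|$; here the factor $|\bar\nu|^{-(p_{v_1} - q_{v_1})}$ (which, computed for $\bar\nu = \nu_{v_1}(t_{v_1})$, is identical to the inverse of $[\prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} |\nu_{v_1}(t_{v_1})|]$) has been introduced so that a “dimensional” estimate (i.e. an estimate based on the homogeneity of the functions involved) of the factor in the second line of (4.19) can be taken proportional to $2^{-h_{v_1} b}$ (see the homogeneous form of the function $f$, Sect. 1.3, and (4.16)). If $w \in \bar B_{v_1} \setminus B_{1v_1}$ we have

$$ \prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} |\nu_{v_1}(t_{v_1})| = \prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} (2^o p_{v_1})\, \tilde x_{v_1 w}(t_{v_1}) \cdot \nu_{\lambda_w} = \prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} (2^{o-1} p_{v_1})\, \tilde x_{v_1 w}(t_{v_1}) \cdot \sum_{z\le w} (2\nu_z) , \tag{4.20} $$

where $\tilde x_{v_1 w}(t_{v_1})$ is a suitable vector depending on $\nu_{\lambda_w}$ but not on the individual terms $\nu_z$, and such that $|\tilde x_{v_1 w}(t_{v_1})| < 1$. We obtain, with the above notations:

$$ \begin{aligned} S_k(\vartheta^0) = {} & \sum_\nu |\nu|^s \sum_{h_{v_1}} \sum^*_{\{\nu_{\lambda_w}\}_{w\in \bar B_{v_1}}} \Big\{ \sum_{\|m_{v_1}\| = p_{v_1}} \int_1^0 \prod_{w\in B_{1v_1}} dt_w \cdot \\ & \cdot \Big[ \frac{Y_{v_1}(t_{v_1})}{|\bar\nu|^{p_{v_1} - q_{v_1}}} \frac{\partial^{|B_{1v_1}|}}{\partial \bar\nu^{|B_{1v_1}|}} f^j_{\bar\nu}\, \bar\nu^{m_{v_1}} \Big]_{\bar\nu = \nu_{v_1}(t_{v_1})} \Big\} \Big[ \prod_{w\in \bar B_{v_1}} \sum_{z\le w} 2\nu_z \Big] \cdot \\ & \cdot R \Big\{ \frac{1}{\omega_0\cdot\nu} \prod_{w\in \bar B_{v_1}} \Big[ \sum_{\{\nu_x\}_{x\le w} :\ \nu(\vartheta_{v_1 w}) = \nu_{\lambda_w}} {\rm Val}(\vartheta_{v_1 w}) \Big] \Big\} , \end{aligned} \tag{4.21} $$

where the tensor

$$ Y_{v_1}(t_{v_1}) = \prod_{w\in \bar B_{v_1}\setminus B_{1v_1}} 2^4 p_{v_1}\, \tilde x_{v_1 w}(t_{v_1}) \tag{4.22} $$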

depends also on ν˜ and {ν λw }w∈Bv1 (although this dependence is not shown, to simplify the notation), and has to be contracted with the external momenta ν z , z ≤ w ∈ B¯v1 \Bv1 . P 4.6. Developing the sum z≤w 2ν z in (4.21), the quantity Sk (ϑ0 ) is given by a sum of terms corresponding to a collection of nodes lying on the paths P (v1 , z(v1 , w)) leading from v1 to a node Pz: the collection is defined by the “choices" of one particular addend 2ν z in the sum z≤w 2ν z , with z = z(v1 , w), w ∈ B¯v1 . Therefore, in general, we can think that (4.21) corresponds to a sum over a collection of paths P (v1 , z(v1 , w)) for the w ∈ B¯v1 . The paths are regarded as totally ordered (and gapless) sequences of nodes on ϑ0 . We can call P1Pthe family of the possible collections of paths that arise when expanding the sums z≤w in (4.21): each element P 1 of P1 can be identified with one contribution to (4.21). And, by using the notation tv = {tw }w∈B1v as in (4.3), the result is the following more explicit interpolation formula reexpressing the r.h.s. of (4.21), ∗ X X n X Z 0 Y X X 0 s |ν| dtw · Sk (ϑ ) = ν

hv1 P 1 ∈P1 {ν λw }w∈Bv

1

||mv1 ||=pv1

1

w∈B1v1

Y (t ) ∂ qv1 o Y mv1 v 1 v1 · · 2ν z · ¯ qv1 fj ν¯ ν pv1 −qv1 ∂ ν¯ |ν| ¯ ν=ν ¯ v1 (tv1 ) z:P (v1 ,z)∈P 1 io n 1 Y h X Val(ϑv1 w ) , ·R ω0 · ν ¯ w∈Bv1

(4.23)

{ν x }x≤w : ν(ϑv1 w )=ν λw

where the interpolation is considered when B1v1 6= ∅ (i.e. when it makes sense), and the indices have to be contracted suitably. The above formula can be rewritten as ∗ X X X Y X X 0 s |ν| Sk (ϑ ) = ν

R

n

X ||mv1 ||=pv1

·

Y

hv1 {ν λw }w∈B¯ v P 1 ∈P1

Z

0 1

X

v∈[P 1 ] ||mv ||=pv

w∈B1v1

(2ν v )ηv

v∈[P 1 ] {ν λy }y∈B¯ v

mv1 Y (t ) ∂ qv1 f ¯ j ν¯ ν v1 v1 dtw · (4.24) ω0 · ν |ν| ¯ pv1 −qv1 ∂ ν¯ qv1 ν=ν ¯ v1 (tv1 ) 1

Y

fν1 v (ν v )mv +1 ω 0 · ν λv

Y y∈Bv /[P 1 ]

o Val(ϑvy ) ,


where S • [P 1 ] = w∈B¯ v P (v1 , z(v1 , w))/{v1 }, 1

1` 11 . . . f`ν , with k1k = 1, is contracted with a factor in (ν v0 )mv0 , and • fν1 v = f1ν v v • ηv is equal to 1 if v = z(v1 , w) for some w ∈ Bv1 and 0 otherwise. We are now in position to iterate the resummation done in the previous section leading from (4.5) to (4.21) and “concerning" the highest node v1 . For each P v˜ ∈ P 1 , v˜ < v1 , let hv˜ = hv˜ (ν˜ λv˜ , {ν λw }w∈Bv˜ ) be the scale of ν v˜ , i.e. ν v˜ = ν λv˜ − w∈B¯ v˜ ν λw is such that 2hv˜ −1 ≤ |ν v˜ | < 2hv˜ . Here we again denote B¯v˜ the set of all the nodes immediately preceding v. ˜ Given an immediate predecessor w of v˜ we say that w is “out of order” with respect to v˜ if (4.25) 2hv˜ > 25 pv˜ |ν λw | , ¯ ˜ Let Bv˜ ⊆ Bv˜ be the subset of those that where pv˜ is the number of branches entering v. are not on a resonance path. Following the definitions in Sect. 4.1 we also call B1v˜ ≡ B1v˜ (ν˜ λv˜ , {ν λw }w∈Bv˜ ) ⊆ Bv˜ the nodes w ∈ Bv˜ which are “out of order” with respect to v. ˜ Given a set {ν λw }w∈Bv˜ for all choices of σw = ±1 we define

U ({ν λw }w∈Bv˜ ) ≡ {σw ν λw }w∈Bv˜ ,

(4.26)

and given a set C ⊆ Bv˜ we call U(C) the set of all transformations such that σw = 1 for w 6∈ C. Again we set, for uniformity of notations, σw ≡ 1 for w ∈ B¯v˜ /Bv˜ . We group the set of branch momenta {ν λw }w∈Bv˜ and the external momenta into collections by proceeding, very closely following the preceding construction, with ν λv˜ playing the role of ν, in the way described below. Fixed ν λv˜ and h we choose a {ν 1λw }w∈Bv˜ such that |ν 1v˜ | ∈ Ihc where ν 1v˜ = ν λv˜ − P 1 w∈B¯ v˜ ν λw . Then {ν 1λw }w∈Bv˜ is called a “representative”. For such a representative we define the “branch momenta collection”, associated with it to be the set of the {ν 1λw }w∈Bv˜ having the form U ({ν 1λw }w∈Bv˜ ) and the “external momenta collection” to be the set P 1 of momenta ν 1U ˜ λv˜ , {ν 1λw }w∈Bv˜ )/[P 1 ]). v˜ = ν λv˜ − w∈Bv˜ σw ν λw , for U ∈ U(B1v˜ (ν Note again that the above constructed external momenta collection is not necessarily contained in Ihc . We consider then another representative {ν 2λw }w∈Bv˜ such that |ν 2v˜ | ∈ Ihc and does not belong to the just constructed branch momenta collection associated with {ν 1λw }w∈Bv˜ , if there is any; and then we consider the branch momenta collections and external momenta collections obtained from {ν 2λw }w∈Bv˜ by the corresponding U transformations. And, as previously done, we proceed in this way until all the representatives such that ν v˜ is in Ihc are in some external momenta collections. The construction is repeated for the interval Ih− , always being careful not to consider + , see {ν λw }w∈Bv˜ that have been already considered, and finally for the interval Ih−1 Sect. 4.3. Proceeding iteratively in this way and considering the same sequence of h’s as in the previous case (i.e. the natural h = 1, 2, . . .), at the end we shall have grouped the set of branch momenta into collections obtainable from a representative {ν λw }w∈Bv˜ by applying the operations U ∈ U (B1v˜ (ν˜ λv˜ , {ν λw }w∈Bv˜ ) \ [P1 ]) to it. 
In other words the definition of the representatives {ν λw }w∈Bv˜ is identical to the one for v1 except that the collections are defined only by transformations changing the branch momentum of the lines emerging from the nodes in B1v˜ but not in P 1 .


We repeat the above construction for all v˜ ∈ P 1 until all the v˜ ∈ P 1 are considered starting from the v˜ with v˜ 0 = v and, after exhausting them, continuing with vˆ with vˆ 0 = v˜ and so on. We call B¯v˜ (P 1 ) the nodes w immediately preceding v˜ but which are not on the union of the paths P ∈ P 1 , Bv˜ (P 1 ) the nodes w in B¯v˜ immediately preceding v˜ which are not in any resonance paths, and B1v˜ (P 1 ) the nodes in Bv˜ (P 1 ) which are out of order with respect to v; ˜ the set of just described transformations will be denoted by U (B1v˜ (P 1 )). Proceeding as we did for the highest node v1 and by performing the analogues of the transformations leading from (4.15) to (4.24), we construct for each v˜ ∈ P 1 new paths P 2 which, by construction, will not have common branches with those in P 1 ; call P2 the collection of the pairs P 1 , P 2 . The crucial point is that the factors x˜ vw (tv ) are the same for all the terms generated by the action of U ∈ U (B1v (P 1 )), by (4.20). We iterate then this procedure. Eventually we end up by constructing a “pavement” P of the graph with nonoverlapping paths (and the union of the paths does cover the graph); note that the paths are “ordered”, in the sense that they are formed only by comparable lines. We call P the collection of all such pavements; Bv (P ), P ∈ P, will be the set of nodes w immediately preceding v and such that (1) vw is not in any resonance path, and (2) a path P (v, z(v, w)) ∈ P starting from v passes through w, and B1v (P ) is the collection of nodes in Bv (P ) out of order with respect to v. Note that in general Bv (P ) ⊆ Bv (unless v is the highest node v1 , when Bv1 (P ) = Bv1 ), and B¯v (P ) ⊆ B¯v . Note also that for all P ∈ P the change of variables U ∈ U (B1v (P )) changes Q σw )ν x ) with the same resonant a graph (ϑ0 , ν x ) into a new graph (ϑ0 , ( w∈B1v (P ) Uvw clusters (virtual or real). The set of “path head" nodes v, i.e. 
the upper end nodes of paths in P , will be denoted Mh (P ): hence if v 6∈ Mh (P ) (i.e. if no path in P has v as path head) then Bv (P ) = ∅; likewise Me (P ) will denote the set of “path end” nodes, i.e. the nodes z such that P (v, z) is a path in P . The necessity of excluding real and virtual resonance paths from the renormalization procedures should now be clear, see [BGGM]: it may happen that a pair of successive nodes vw, v > w, has vw on the path of a real or virtual resonance V . Then the Q σw ν x) change of variables U ∈ U(B1v (P )) constructs a graph (ϑ0 , {ν λ }, w∈B1v (P ) Uvw in which the line incoming into the resonance carries some momentum −ν while the outgoing line carries a momentum ν: hence in the new graph the cluster V is no longer a resonance; or, viceversa, it can happen that a virtual resonance becomes real. To avoid this “interference between ultraviolet and infrared cancellations" we must exclude the resonances (virtual or real) from the interpolations. 4.7. Then we see that (4.22) leads to the following “path expansion” for Sk (ϑ0 ) summarizing our analysis X X X∗ Y X X |ν|s |W (ϑ0 , ν)| = |ν|s (4.27) ν

ν

nZ

0 1

Y w∈B1v (P )

{hx } P ∈P {ν λ } v∈Mh (P )

dtw

X

o Ov fν1 v (tv ) (ν v (tv ))mv Yv (tv ) RD(ϑ0 ) ,

||mv ||=pv

where (1) P is a partial pavement of the graph with non overlapping “paths” such that: (1.1) a path P (v, z) is a connected set of comparable lines (“ordered paths”) connecting the


node v to the node z < v; (1.2) the resonance paths are not contained in any path; (1.3) for any line λ which is not contained in any resonance path there is one path in P covering it: λ ∈ P (v, z) for some P (v, z) ∈ P ; (2) Mh (P ) is the collection of upper end nodes of the paths in P , and Me (P ) of lower end nodes; (3) Bv (P ) is the set of nodes w immediately preceding v such that (3.1) vw is not in any resonance path, and (3.2) a path P (v, z(v, w)) ∈ P starting from v passes through w, and B1v (P ) is the set of nodes w ∈ Bv (P ) which are out of order with respect to v, i.e. such that (4.28) 2hv > 25 pv |ν λw | ; (4) Yv (tv ) is defined as Q 4 ˜ vw (tv )) , w∈B¯ v (P )\B1v (P ) (2 pv x Yv (tv ) = 1,

if v ∈ Mh (P ) , otherwise ,

(4.29)

if x ˜ vw (tv ) is the vector defined via the implicit relation ˜ vw (tv ) · ν λw , |ν v (tv )| = 25 pv x

(4.30)

˜ vw (tv )) depends on ν λw but not on the individual external so that |x ˜ vw (tv ))| ≤ 1 and x momenta which add to ν λw and B˜v (P ) is the set of nodes verifying (3.2) in item (3); (5) the operator Ov is defined as Ov (ν v (tv ))mv Yv (tv ) fν v (tv ) = (4.31) ∂ |B1v (P )| 1 Yv (tv ) mv ηv f ( ν) ¯ (2 ν) ¯ , = ν ¯ ν=ν ¯ v (tv ) |ν| ¯ |Bv (P )|−|B1v (P )| ∂ ν¯ |B1v (P )| with ηv = 1 if v ∈ Me (P ), and ηv = 0 otherwise, and fν1 defined after (4.24); (6) the sum over {ν λ } has the restriction that the external momentum configuration {ν x } is compatible with the scales {hx }; (7) RD(ϑ0 ) is the same for all graphs involved in the cancellations mechanisms, as the moduli of the momenta do not change under the action of the change of variables (4.2), and the signs are taken into account by the interpolation formula (4.3) (see Remark 4.2). 4.8. We can bound Y

|ν|s

Ov fν1 v (tv ) (ν v (tv ))mv Yv (tv ) ≤ v∈ϑ0

≤

Y

D1 D2pv qv ! ppvv −qv 2hv (1−b+s+ηv ) ≤

v∈ϑ0

n

ϑ0

≤ max C3k P ∈P

D3 D4pv pv ! 2hv (1−b+s+ηv ) ,

v∈ϑ0

for suitable constants Dj , and use X X |ν|s RW (ϑ0 , ν) ≤ {ν}

Y

h Y v≤v0

Q

η

2τ ≤ v∈ϑ0 |ν v |

iX Y h

η

Q

2hv (pv +s+`+ 2 τ −b)

pv !

{hx } v≤v0

(4.32)

η

hv 2 τ , so that v∈ϑ0 2

Y P (v,z)∈P

2(hz −hv )

io

(4.33) ,


where the number of pavements P is estimated by 2k , see Appendix A2, in [BGGM]. Then, setting b = 2 + s + ` + η2 τ + µ, with µ > 0, and exploiting the identity X

hv pv =

v
X

hv 0 ,

v10 = r ,

(4.34)

v
one obtains for (4.5) the bound X

X |ν|s RW (ϑ0 , ν) ≤

{ν}

≤ Ck

Y v

ϑ0

pv !

Xh

2−(2+µ)hv0

{hx }

Y v
2−(1+µ)hv

Y

i 2hv −hw ,

(4.35)

vw∈Q

for a suitable constant C. We see that there is at most one factor 2hv −hw per node v, because the resonance paths are totally ordered, so that the factors 2−hv in 2−(1+µ)hv compensate (when necessary) the factors 2hv in 2hv −hv0 (and 2−hv0 ≤ 1): then the sum over the scales can be performed. Q There are k!/ v pv ! graphs with given pv ’s and fixed shape (“Cayley’s formula”, see [HP]), so that the sum over the graph orders weighed by εk can be performed if ε is η small enough; in particular we obtain that h ∈ C (s) (T` ), if f ∈ Cˆ (2+s+ 2 τ +µ) (T` ), with µ > 0. 4.9. Then we can pass to Eq. (2.10) for H, with Val∗ (ϑ) defined in (2.9). In such a case we give the extra prescription not to apply the ultraviolet interpolation procedure to the ˜ equivalently, modify slightly the definition of the set Bv after (4.1): Bv is path C(v1 , v); the the subset of the nodes w among the pv nodes immediately preceding v such that ˜ the branch vw is neither on the resonance paths Q nor on C(v1 , v). Then we obtain again a formula like (4.17), with respect to which there are the following differences. (1) P is the partial pavement such that, besides the conditions (1.1)÷(1.3) after (4.27), ˜ and any verifies the further condition: (1.4) there is no overlapping between C(v1 , v) P (v, z) ∈ P . ˜ fν1 v has to be contracted with a factor in (ν v0 0 )mv0 0 , where v 00 ∈ (2) If v ∈ C(v1 , v), C(v1 , v) ˜ is the node on C(v1 , v) ˜ immediately preceding v. (3) If v ∈ C(v1 , v), ˜ the factors (ν v )mv arise from the pv − 1 branches not contained in C(v1 , v) ˜ and entering v and from the branch on C(v1 , v) ˜ exiting from v and pointing to v0 . Since the bound in Sect. 4.8 is independent on the exact structure of the contractions, η the bound (4.35) can be still obtained, so that also H ∈ C (s) (T` ), if f ∈ Cˆ (2+s+ 2 τ +µ) (T` ), with µ > 0. 
Of course in deriving the above formulae one should take into account also the cut off factors e−κ|ν| appearing in the Fourier coefficients fν v , which may be stricken by differentiations: but their contribution is not worse than the terms that we have treated, as briefly commented in [BGGM], comment following (4.39). Thus the proof of Theorem 1.4 is complete.
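The interpolation mechanism behind the ultraviolet cancellations, in the simplest case of a single out-of-order node $w$, can be checked numerically. The sketch below is an illustration only: `graph_factor` is a toy smooth function standing in for the factor $f_\nu\, \nu^{m}$ (it is not the function of the paper), and the names and the numerical data are hypothetical. It verifies that the difference between the values at $\nu$ and at $\nu + 2\nu_{\lambda_w}$ (the two sign choices in (4.2)) equals the integral from $t = 1$ to $t = 0$ of the $t$-derivative along $\nu(t) = \nu + 2t\nu_{\lambda_w}$, with the derivative computed as $2\nu_{\lambda_w}\cdot\partial/\partial\nu$ as in (4.18).

```python
def graph_factor(nu):
    """Toy smooth stand-in for f_nu (nu)^m; NOT the paper's function."""
    x, y = nu
    return x / (1.0 + x * x + y * y)

def gradient(nu):
    """Exact gradient of graph_factor."""
    x, y = nu
    d = 1.0 + x * x + y * y
    return ((d - 2.0 * x * x) / d ** 2, -2.0 * x * y / d ** 2)

def interpolated_difference(nu, nu_lambda, steps=2000):
    """Midpoint-rule value of the integral from t = 1 to t = 0 of
    d/dt F(nu + 2 t nu_lambda) = 2 nu_lambda . grad F."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        point = (nu[0] + 2 * t * nu_lambda[0], nu[1] + 2 * t * nu_lambda[1])
        g = gradient(point)
        total += (g[0] * 2 * nu_lambda[0] + g[1] * 2 * nu_lambda[1]) / steps
    return -total  # orientation reversed: from t = 1 down to t = 0

nu, nu_lambda = (7.0, -3.0), (1.0, 2.0)
shifted = (nu[0] + 2 * nu_lambda[0], nu[1] + 2 * nu_lambda[1])
direct = graph_factor(nu) - graph_factor(shifted)
via_integral = interpolated_difference(nu, nu_lambda)
```

The gain is that the integrand carries one derivative of the factor at node $\bar v$ together with an explicit factor $2\nu_{\lambda_w}$, which is small with respect to $|\nu_{\bar v}|$ precisely when $w$ is out of order.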


5. Comparison with the One-Dimensional Schrödinger Equation in a Quasi Periodic Potential

5.1. From Theorem 1.4, one could deduce the existence of Bloch waves for the one-dimensional Schrödinger equation with a potential belonging to a certain class of non-analytic quasi periodic functions, and one could be tempted to compare the result with [Pa], see also [PF], where the existence of Bloch waves is proven with the Moser-Nash techniques for quasi periodic potentials having $p > 2(\ell + 1)$ continuous derivatives (if $\ell$ is the dimension of the frequency vector of the quasi periodic potential and $\tau$ is supposed to be $\tau > \ell - 1$), with no other restriction on the potential regularity. However, in order to perform a meaningful comparison between the two results, one has to consider carefully the exact form of the interaction potential.

5.2. The problem studied in [DS,R,Pa] is the Schrödinger equation

$$ \Big[ -\frac{d^2}{dx^2} + \varepsilon V(x) \Big] \psi(x) = E \psi(x) , \tag{5.1} $$

where $V(x)$ is a quasi periodic function of the form

$$ V(x) = \sum_{\nu\in Z^{\ell-1}} e^{i\omega\cdot\nu x}\, V_\nu , \tag{5.2} $$

with $\omega \in R^{\ell-1}$ satisfying a diophantine condition. The problem of finding eigenvalues and eigenfunctions of (5.1) can be easily seen, see for instance [G2], to be equivalent to solving the equations of motion of the classical mechanics system described by the Hamiltonian

$$ H = \frac{p^2}{2} + \omega\cdot B + \frac{q^2}{2} \big[ E - \varepsilon V(\beta) \big] , \tag{5.3} $$

with $(p, q) \in R^2$ and $(B, \beta) \in R^{\ell-1} \times T^{\ell-1}$. In fact the evolution equation for the coordinate $q$ is the eigenvalue equation (5.1). Then it is possible to introduce a canonical transformation $C : (p, q) \to (A_1, \alpha_1)$, [G2], such that the Hamiltonian (5.3) becomes

$$ H = \sqrt{E} A_1 + \omega\cdot B + \varepsilon f(\alpha_1, \beta) , \qquad f(\alpha_1, \beta) = - \frac{A_1}{\sqrt{E}} \sin^2\alpha_1\, V(\beta) , \tag{5.4} $$

which can be reduced to the form (1.3), with $A = (A_1, B) \in R^\ell$, $\alpha = (\alpha_1, \beta) \in T^\ell$, and $\mathbf f(\alpha) = (f(\alpha), 0, \ldots, 0)$. For the proof of such an assertion, we refer to [G2]. The equations of motion for $\beta$ give $\beta(t) = \beta_0 + \omega t$, and the derivatives whose number can grow indefinitely in the expansion described in Sect. 2 are those acting on the $\alpha_1$ variable: however the perturbation is always analytic in $\alpha_1$.

Thus the assumptions on the interaction potential $V$ can be weakened, compared to the ones following from the general result in Theorem 1.4, simply because the one-dimensional Schrödinger equation can be reduced to a classical mechanics problem with Hamiltonian of the form (1.1) in which the interaction term depends analytically on $\alpha_1$, independently of the regularity of the quasi periodic potential.

In this case the existence of the counterterm can be proved without exploiting ultraviolet cancellations, and the infrared cancellations are sufficient to give convergence of the perturbative series, provided the quasi periodic potential is regular enough to guarantee the summability on the Fourier components in the perturbative series: the analysis in Sects. 3, 4 gives $p > \ell + 3\tau$, see Appendix A3 for details. Then, if $\tau > \ell - 1$, one has $p > 4\ell - 3$. With respect to [Pa], the result is weaker for $\ell \ge 3$ but, in some respects, better for $\ell = 2$.

The result in [Pa] has been obtained by using the Moser-Nash techniques for KAM theory, and it is known that the class of differentiability of the perturbations of integrable systems can be raised in the KAM theory above Moser’s result, [P]: then one can conjecture that also for the Schrödinger equation the ideas in [P] could lead to $p > 2\ell$. Our result $p > 4\ell - 3$ can be considered, for $\ell = 2$, a partial improvement of [Pa] in this direction.

We stress that with the techniques described in the present paper, the ultraviolet cancellations do not enter into the analysis to obtain analyticity in the perturbative parameter of the eigenvalue $E$ and of the corresponding eigenfunction $\psi(x)$ in (5.1). It follows that the techniques of [E2] imply, in this case, our results, although the question was not relevant for that paper.

5.3. The situation is essentially identical if one considers the Schrödinger equation

$$ \Big[ -\frac{d^2}{dx^2} + U(x) + \varepsilon V(x) \Big] \psi(x) = E \psi(x) , \tag{5.5} $$

where $U(x)$ is a periodic potential with frequency $\omega_2$ and $V(x)$ a quasi periodic function of the form

$$ V(x) = \sum_{\nu\in Z^{\ell-2}} e^{i\omega\cdot\nu x}\, V_\nu , \tag{5.6} $$

with $\omega \in R^{\ell-2}$, such that $\omega_2$ and $\omega$ satisfy a diophantine condition. The Hamiltonian of the corresponding classical mechanics problem is

$$ H = \frac{p^2}{2} + \omega_2 B_2 + \omega\cdot B + \frac{q^2}{2} \big[ E - U(\beta_2) - \varepsilon V(\beta) \big] , \tag{5.7} $$

with $(p, q) \in R^2$, $(B_2, \beta_2) \in R^1 \times T^1$, and $(B, \beta) \in R^{\ell-2} \times T^{\ell-2}$. If $\varepsilon = 0$, the Hamiltonian is integrable, [G2,C], so that (5.7) becomes

$$ H = \omega_{01} A_1 + \omega_{02} A_2 + \omega\cdot B + \varepsilon A_1 f(\alpha_1, \alpha_2, \beta) , \qquad f(\alpha_1, \alpha_2, \beta) = - G(\alpha_1, \alpha_2)\, V(\beta) , \tag{5.8} $$

where $G(\alpha_1, \alpha_2)$ is a function which depends analytically on $\alpha_1$, [C], Sect. V, VI, independently of the regularity of $U$ and $V$ in (5.5). The fact that the interaction is proportional only to $A_1$ (i.e. independent of the other action variables) implies that the equations of motion for $\alpha_2$ and $\beta$ can be trivially integrated and give $\alpha_2 = \alpha_{20} + \omega_{02} t$ and $\beta_j = \beta_{j0} + \omega_{0j} t$, $2 \le j \le \ell$. Then we can reason as in Sect. 5.2, and the same conclusions hold.
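The comparison of regularity thresholds made in Sect. 5.2 is a matter of simple arithmetic, which the following sketch tabulates. The function names are hypothetical; the two formulas are the ones quoted in the text, namely $p > 4\ell - 3$ (from $p > \ell + 3\tau$ with $\tau > \ell - 1$) and $p > 2(\ell + 1)$ from [Pa].

```python
def threshold_present(ell):
    """Lower bound 4*ell - 3, obtained from p > ell + 3*tau with tau > ell - 1."""
    return 4 * ell - 3

def threshold_pa(ell):
    """Lower bound 2*(ell + 1) quoted from [Pa]."""
    return 2 * (ell + 1)

# One row per dimension ell of the frequency vector
table = [(ell, threshold_present(ell), threshold_pa(ell)) for ell in range(2, 6)]
for ell, ours, pa in table:
    print(ell, ours, pa)
```

The two bounds cross between $\ell = 2$ and $\ell = 3$, which is the statement that the result is better than [Pa] for $\ell = 2$ and weaker for $\ell \ge 3$.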

Appendix A1. Graphs and Graph Rules

We lay down one after the other, on a plane, $k$ pairwise distinct unit segments oriented from one extreme to the other: respectively the “initial point” and the “endpoint” of the oriented segment. The oriented segment will also be called “arrow”, “branch” or “line”. The segments are supposed to be numbered from 1 to $k$.

The rule is that after laying down the first segment, the “root branch”, with the endpoint at the origin and otherwise arbitrarily, the others are laid down one after the other by attaching an endpoint of a new branch to an initial point of an old one and by leaving free the new branch initial point. The set of initial points of the object thus constructed will be called the set of the graph “nodes” or “vertices”. A graph of “order” $k$ is therefore a partially ordered set of $k$ nodes with top point the endpoint of the root branch, also called the “root” (which is not a node); in general there will be several “bottom nodes” (at most $k - 1$). We denote by $\le$ the ordering relation, and say that two nodes $v$, $w$ are “comparable” if $v < w$ or $w < v$.

With each graph node $v$ we associate an “external momentum” or “mode” which is simply an integer component vector $\nu_v \ne 0$; with the root of the graph (which is not regarded as a node) we associate a label $j = 1, \ldots, \ell$. For each node $v$, we denote by $v'$ the node immediately following $v$ and by $\lambda_v \equiv v'v$ the branch connecting $v$ to $v'$ ($v$ will be the initial point and $v'$ the endpoint of $\lambda_v$). If $v_1$ is the node immediately preceding the root $r$ (“highest node”) then we shall write $v_1' = r$, for uniformity of notation (recall that $r$ is not a node). We consider “comparable” two lines $\lambda_v$, $\lambda_w$, if $v$, $w$ are such.

If $p_v$ is the number of branches entering the node $v$, then each of the $p_v$ branches can be thought of as the root branch of a “subgraph” having root at $v$: the subgraph is uniquely determined by $v$ and one of the $p_v$ nodes $w$ immediately preceding $v$. Hence if $w' = v$ it will be denoted $\vartheta_{vw}$. We shall call “equivalent” graphs which can be overlapped by (1) changing the angles between branches emerging from the same node, or (2) permuting the subgraphs entering into a node $v$ in such a way that all the labels match. The number of (non-equivalent numbered) graphs with $k$ branches is bounded by $4^k k!$, [HP].
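The graphs of this appendix are rooted trees with a momentum attached to each node, and they can be modelled by a small data structure. The sketch below is illustrative only: the class and method names are hypothetical, the tree is stored through the map $v \to v'$, and the branch momentum is taken to be $\nu_{\lambda_v} = \sum_{z\le v} \nu_z$, as used in (4.2) and (4.4).

```python
class Graph:
    """Rooted tree: parent[v] is the node v' following v (None for the highest node)."""

    def __init__(self, parent, momenta):
        self.parent = parent
        self.momenta = momenta

    def preceding(self, v):
        """Nodes w immediately preceding v (w' = v); their number is p_v."""
        return [w for w, p in self.parent.items() if p == v]

    def leq(self, z, v):
        """True if z <= v, i.e. v is reached from z moving towards the root."""
        while z is not None:
            if z == v:
                return True
            z = self.parent[z]
        return False

    def comparable(self, v, w):
        """True if v <= w or w <= v in the partial order of the tree."""
        return self.leq(v, w) or self.leq(w, v)

    def branch_momentum(self, v):
        """nu_{lambda_v}: sum of the external momenta nu_z over all z <= v."""
        dim = len(self.momenta[v])
        return tuple(
            sum(self.momenta[z][i] for z in self.parent if self.leq(z, v))
            for i in range(dim))

# Hypothetical graph of order 4 with highest node v1
parent = {"v1": None, "w1": "v1", "w2": "v1", "z": "w1"}
momenta = {"v1": (1, 0), "w1": (0, 2), "w2": (-1, 1), "z": (3, -1)}
g = Graph(parent, momenta)
```

Here `g.preceding("v1")` lists the $p_{v_1} = 2$ nodes immediately preceding $v_1$, while `w1` and `w2` are not comparable because neither lies on the path from the other to the root.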

Appendix A2. Proof of Lemma 3.6

We consider all the graphs we obtain by detaching from each resonance the subgraph with root $v_V^b$, where $v_V^b$ is the node in which the resonance line $\lambda_V$ enters, and then reattaching it to all the nodes $w \in V$. We call this set of contributions a “resonance family”. If one sets $\zeta \equiv \boldsymbol{\omega}_0 \cdot \boldsymbol{\nu}_{\lambda_V} = 0$, no propagator changes inside the resonance, and the only effect of the above operation is that in the factor

$$\left[ \frac{\chi(2^{-n_{\lambda_V}} \boldsymbol{\omega}_0 \cdot \boldsymbol{\nu}_{\lambda_V})}{(\boldsymbol{\omega}_0 \cdot \boldsymbol{\nu}_{\lambda_V})^2} \right] \left( \boldsymbol{\nu}_{v'} \cdot \mathbf{f}_{\boldsymbol{\nu}_{v_V^a}} \right) \left( \boldsymbol{\nu}_{v_V^b} \cdot \mathbf{f}_{\boldsymbol{\nu}_{v_V^1}} \right) \tag{A2.1}$$

appearing in (3.3) the external momentum $\boldsymbol{\nu}_{v_V^b}$ assumes successively the values $\boldsymbol{\nu}_w$, $w \in V$. In this way, by summing over all the trees belonging to a given resonance family, we build a quantity proportional to $\sum_{w \in V} \boldsymbol{\nu}_w = \mathbf{0}$, by Definition 3.3 of real resonance. It is important to note that, by the definition of real resonance, the scale of the lines internal to $V$ cannot change too much, certainly not enough to break the cluster $V$ (i.e. no scale of a line internal to $V$ can become smaller than $n_{\lambda_V}$).
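The cancellation inside a resonance family can be illustrated with a toy computation. The sketch below assumes that, at $\zeta = 0$, the only factor that changes from one member of the family to the next is the scalar product of the mode $\boldsymbol{\nu}_w$ of the attachment node with a fixed Fourier coefficient; the modes and the coefficient are made-up integers, chosen only so that the modes of $V$ sum to zero.

```python
from typing import Sequence, Tuple

Mode = Tuple[int, ...]


def dot(a: Sequence[int], b: Sequence[int]) -> int:
    """Scalar product of two integer vectors."""
    return sum(x * y for x, y in zip(a, b))


def resonance_family_sum(modes_in_V: Sequence[Mode], f_coeff: Sequence[int]) -> int:
    """Sum of (nu_w . f) over all nodes w of V to which the subgraph is reattached."""
    return sum(dot(nu_w, f_coeff) for nu_w in modes_in_V)


# Hypothetical resonance with three nodes whose modes add up to zero.
modes_in_V = [(1, -2), (3, 1), (-4, 1)]
f_coeff = (5, -7)  # stand-in for the fixed Fourier coefficient

assert all(sum(component) == 0 for component in zip(*modes_in_V))
print(resonance_family_sum(modes_in_V, f_coeff))  # 0
```

By linearity the sum equals $(\sum_{w \in V} \boldsymbol{\nu}_w) \cdot \mathbf{f}$, so it vanishes for any coefficient as soon as the modes of $V$ add up to zero.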

Quasi Linear Flows on Tori


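Appendix A3. Regularity of the Potential for the Schrödinger Equation in a Quasi Periodic Potential

The equations of motion of the Hamiltonian system (5.4) for the angle variables are

$$\frac{d\alpha_1}{dt} = k - \varepsilon \frac{1}{\sqrt{E}} (\sin^2 \alpha_1)\, V(\boldsymbol{\omega}_0 t) + N_1 , \qquad \frac{d\boldsymbol{\beta}}{dt} = \boldsymbol{\omega}_0 , \tag{A3.1}$$

where $\sqrt{E} = k + N_1$, which can be discussed as in Sect. 2. We look for a “Bloch wave” with momentum $k$ and energy $E$ assuming that the vector $(k, \boldsymbol{\omega}_0)$ is diophantine (i.e. a quasi periodic solution with rotation vector $(k, \boldsymbol{\omega}_0)$). Following [G2] we regard the $1/\sqrt{E}$ in (A3.1) as a parameter to be fixed later: we can deduce that the solution to (A3.1), and in particular $N_1 = N_1(\varepsilon, E)$ as a function of $\varepsilon$ (and in fact of $\varepsilon/\sqrt{E}$), is analytic in $E$, and therefore the “dispersion relation” equation $\sqrt{E} = k + N_1(\varepsilon, E)$ can be easily solved (see [G2]). A formula in terms of graphs, (2.6), can still be obtained, where $\mathrm{Val}(\vartheta)$, defined in (2.4), becomes

$$\mathrm{Val}(\vartheta) = \prod_{v} \left( -\frac{1}{\sqrt{E}} \, \frac{\nu_{v'} \, s_{\nu_v} \, V_{\boldsymbol{\nu}_v}}{k \nu_{\lambda_v} + \boldsymbol{\omega}_0 \cdot \boldsymbol{\nu}_{\lambda_v}} \right) , \tag{A3.2}$$

with $\sin^2 \alpha_1 = \sum_{\nu = 0, \pm 2} s_\nu e^{i \nu \alpha_1}$.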
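Only the modes $\nu = 0, \pm 2$ occur because $\sin^2 \alpha_1 = \tfrac12 - \tfrac14 (e^{2 i \alpha_1} + e^{-2 i \alpha_1})$. The following minimal sketch checks this identity numerically; the dictionary `S` and the function name are illustrative and not notation from the paper.

```python
import cmath
import math

# Fourier coefficients s_nu of sin^2(alpha): only nu = 0, +2, -2 are non-zero.
S = {0: 0.5, 2: -0.25, -2: -0.25}


def sin2_from_modes(alpha: float) -> complex:
    """Evaluate sum over nu of s_nu * exp(i * nu * alpha)."""
    return sum(s * cmath.exp(1j * nu * alpha) for nu, s in S.items())


for alpha in (0.0, 0.3, 1.0, 2.5):
    value = sin2_from_modes(alpha)
    # The reconstructed value agrees with sin^2(alpha) up to rounding.
    assert abs(value - math.sin(alpha) ** 2) < 1e-12
```

Since the coefficients are bounded and finitely many, the numerators of the graph values cannot spoil summability, which is the point made in the text about the numerators.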

Then one sees that no problem arises from the numerators (as the only external momenta appearing are of the form $\nu_v$, and $\sin^2 \alpha_1$ is a trigonometric polynomial in $\alpha_1$), while the small divisors can be dealt with through Lemma 3.4. This gives a factor $|\boldsymbol{\nu}_v|^{3\tau}$ per node, so that summability on the Fourier labels requires at least $V \in C^{(p)}(\mathbb{T}^{\ell-1})$, $p > 3\tau + \ell - 1$.

The equations of motion for the action variables give

$$\frac{dA_1}{dt} = \varepsilon \frac{A_1}{\sqrt{E}} \, \frac{\partial \sin^2 \alpha_1}{\partial \alpha_1} \, V(\boldsymbol{\omega}_0 t) , \qquad \frac{d\mathbf{B}}{dt} = \varepsilon \frac{A_1}{\sqrt{E}} \, \sin^2 \alpha_1 \left. \frac{\partial V(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} \right|_{\boldsymbol{\beta} = \boldsymbol{\omega}_0 t} , \tag{A3.4}$$

so that we can reason as above, with the only difference that the highest node $v_1$ of the graph has a factor $\boldsymbol{\nu}_{v_1}$ which requires, to guarantee the summability on $\boldsymbol{\nu}_{v_1}$, $V \in C^{(p)}(\mathbb{T}^{\ell-1})$ with $p > 3\tau + \ell$. Then one has to require at least $V \in C^{(p)}(\mathbb{T}^{\ell-1})$, $p > 3\tau + \ell$, in order to have $h_1 \in C^{(1)}(\mathbb{T}^1)$, as it has to be for the Schrödinger equation (5.1) to be meaningful, if one recalls that (1) the wave function $\psi(x)$ solving (5.1) has to be of class $C^{(1)}$ for $V \in C^{(0)}$, and (2) $\psi(x) = q(x)$, where $q$ is the variable related with $\alpha_1$ by the canonical transformation $C$ defined before (5.4).⁵

⁵ Note that if we confine ourselves to the classes of functions introduced in Sect. 1.3, item (3), then we have to require $V \in \hat{C}^{(p)}(\mathbb{T}^{\ell-1})$, $p > 3\tau + 1$, in order to have $\psi \in C^{(1)}$.
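The thresholds on $p$ can be traced to a standard lattice-sum estimate. The sketch below assumes the usual decay $|V_{\boldsymbol{\nu}}| \le C |\boldsymbol{\nu}|^{-p}$ of the Fourier coefficients of a function of class $C^{(p)}$, and that the number of $\boldsymbol{\nu} \in \mathbb{Z}^{\ell-1}$ with $|\boldsymbol{\nu}| = n$ is bounded by $c\, n^{\ell-2}$ (for $\ell \ge 2$); the constants $C$ and $c$ are introduced here for illustration only.

```latex
\sum_{\boldsymbol{\nu} \in \mathbb{Z}^{\ell-1},\ \boldsymbol{\nu} \neq \mathbf{0}}
   |\boldsymbol{\nu}|^{3\tau}\, |V_{\boldsymbol{\nu}}|
\;\le\; C \sum_{n \ge 1} \#\{\boldsymbol{\nu} : |\boldsymbol{\nu}| = n\}\; n^{3\tau - p}
\;\le\; C\, c \sum_{n \ge 1} n^{3\tau - p + \ell - 2}
\;<\; \infty
\qquad \text{if } p > 3\tau + \ell - 1 .
```

Under the same assumptions, the extra factor $\boldsymbol{\nu}_{v_1}$ at the highest node raises the exponent by one, which gives $p > 3\tau + \ell$ for the action variables.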


Appendix A4. Comparison between Moser’s Counterterms Theorem and the Counterterms Conjecture in [G1]

A 4.1. In [M1] a perturbation theory for quasi-periodic solutions of a nonlinear system of ordinary differential equations is developed. Up to a (trivial) coordinate transformation, the system can be written in the form

$$\frac{d\mathbf{x}}{dt} = \boldsymbol{\omega} + \varepsilon \mathbf{f}(\mathbf{x}, \mathbf{y}; \varepsilon) , \qquad \frac{d\mathbf{y}}{dt} = \Omega \mathbf{y} + \varepsilon \mathbf{g}(\mathbf{x}, \mathbf{y}; \varepsilon) , \tag{A4.1}$$

where $\mathbf{x} \equiv (x_1, \ldots, x_n) \in \mathbb{R}^n$, $\mathbf{y} \equiv (y_1, \ldots, y_m) \in \mathbb{R}^m$, $\boldsymbol{\omega} \in \mathbb{R}^n$, $\Omega$ is a constant $m \times m$ matrix with eigenvalues $\Omega_1, \ldots, \Omega_m$, and $\mathbf{f}$ and $\mathbf{g}$ are functions with period $2\pi$ in $x_1, \ldots, x_n$ and analytic in $\mathbf{x}$, $\mathbf{y}$ and $\varepsilon$ (in suitable domains). If the characteristic numbers $\omega_1, \ldots, \omega_n$, $\Omega_1, \ldots, \Omega_m$ verify the “generalized diophantine condition”

$$\left| i \sum_{j=1}^{n} \nu_j \omega_j + \sum_{i=1}^{m} \mu_i \Omega_i \right| \ge \frac{C_0^{-1}}{|\boldsymbol{\nu}|^{\tau} + 1} , \tag{A4.2}$$

with $\boldsymbol{\nu} \equiv (\nu_1, \ldots, \nu_n) \in \mathbb{Z}^n$, $|\boldsymbol{\nu}| = \sum_{j=1}^{n} |\nu_j|$, and $(\mu_1, \ldots, \mu_m) \in \mathbb{Z}^m$, then there exist unique analytic vector valued functions $\boldsymbol{\lambda}(\varepsilon)$ and $\mathbf{m}(\varepsilon)$ and a unique analytic matrix valued function $M(\varepsilon)$ such that the modified system

$$\frac{d\mathbf{x}}{dt} = \boldsymbol{\omega} + \varepsilon \mathbf{f}(\mathbf{x}, \mathbf{y}; \varepsilon) + \boldsymbol{\lambda}(\varepsilon) , \qquad \frac{d\mathbf{y}}{dt} = \Omega \mathbf{y} + \varepsilon \mathbf{g}(\mathbf{x}, \mathbf{y}; \varepsilon) + \mathbf{m}(\varepsilon) + M(\varepsilon) \mathbf{y} , \tag{A4.3}$$

admits a quasi periodic solution with the same characteristic numbers as the unperturbed one ([M1], Theorem 1).

A 4.2. Let us consider the case in which $m = n = \ell$, $\Omega = 0$, and there exists a function $H_0 = \boldsymbol{\omega} \cdot \mathbf{y} + \varepsilon f(\mathbf{x}, \mathbf{y}; \varepsilon)$ such that $\mathbf{f} = \partial_{\mathbf{y}} f$ and $\mathbf{g} = -\partial_{\mathbf{x}} f$. Then the system (A4.1) becomes the system studied in [GM2], Sect. 8. Under the same hypotheses, if moreover $f(\mathbf{x}, \mathbf{y}; \varepsilon) \equiv \varepsilon\, \mathbf{y} \cdot \mathbf{f}(\mathbf{x})$ for some function $\mathbf{f}$, (A4.1) and (A4.3) become the equations of motion of the systems described by the Hamiltonians (1.1) and (1.3), respectively. In fact the linearity in the action variables of the term added to the Hamiltonian $H_0$ in (1.3) leads to a term independent of the action variables in the equations of motion, i.e. $\mathbf{N}(\varepsilon) \equiv \boldsymbol{\lambda}(\varepsilon)$, while the counterterms $\mathbf{m}(\varepsilon)$ and $M(\varepsilon)$ vanish identically as a consequence of the symplectic structure of the equations of motion (as one can argue a posteriori from Theorem 1.4 in Sect. 1).

In the general case in which the function $f(\mathbf{x}, \mathbf{y}; \varepsilon)$ appearing in the Hamiltonian $H_0$ depends arbitrarily (but always analytically) on $\mathbf{y}$, the systems studied in [M1] (under the same hypotheses as above) and in [GM2] are no longer equal to each other, i.e. the modified system (A4.3) is not the system with the Hamiltonian considered in Eq. (1.10) of [GM2], so that Theorem 1.4 in [GM2] cannot be reduced to the results of [M1]: in fact, not only is there no longer a trivial relation between the counterterms $\mathbf{N}(\varepsilon)$ and $\boldsymbol{\lambda}(\varepsilon)$, but the solutions of the equations of motion also differ from each other.
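The generalized diophantine condition (A4.2) of A 4.1 can be probed numerically in the case $\Omega = 0$ considered in A 4.2, where the second sum drops out and the left-hand side reduces to $|\boldsymbol{\omega} \cdot \boldsymbol{\nu}|$. The sketch below is only an illustration under stated assumptions: the frequency vector $(1, (1+\sqrt5)/2)$, the exponent $\tau = 1$ and the cutoff on $|\boldsymbol{\nu}|$ are choices made here, $\boldsymbol{\nu} = \mathbf{0}$ is excluded, and a finite search can suggest a value for $C_0^{-1}$ but cannot prove the condition.

```python
from itertools import product
from math import sqrt


def diophantine_constant(omega, tau, max_norm):
    """Smallest value of |omega . nu| * (|nu|^tau + 1) over 0 < |nu| <= max_norm.

    Here |nu| is the sum of the absolute values of the components of nu.
    """
    n = len(omega)
    best = float("inf")
    for nu in product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(c) for c in nu)
        if norm == 0 or norm > max_norm:
            continue
        divisor = abs(sum(c * w for c, w in zip(nu, omega)))
        best = min(best, divisor * (norm**tau + 1))
    return best


golden = (1 + sqrt(5)) / 2
estimate = diophantine_constant((1.0, golden), tau=1, max_norm=40)
print(estimate)  # a lower bound for C_0^{-1} valid on the tested range only
```

For this frequency vector the smallest values are attained along ratios of consecutive Fibonacci numbers, so the estimate decreases slowly as the cutoff grows while staying bounded away from zero.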


Note however that the result following from Moser’s theorem applied to such a system (i.e. a Hamiltonian system with $\Omega = 0$) can be (trivially) reproduced with our techniques. Also an extension of our techniques to Hamiltonian systems (verifying the anisochrony condition) such that $\Omega \neq 0$ could be envisaged:⁶ an example in this direction is in [Ge], where $\Omega$ has eigenvalues $\Omega_1 = \ldots = \Omega_{\ell-1} = 0$, $\Omega_\ell = g^2$, and the existence of a counterterm $M(\varepsilon)$ analytic in $\varepsilon$ is proven (while $\boldsymbol{\lambda}(\varepsilon) \equiv \mathbf{m}(\varepsilon) \equiv \mathbf{0}$, again because of the symplectic structure of the equations of motion).

Acknowledgement. Partially supported by the European Network on: “Stability and Universality in Classical Mechanics”, # ERBCHRXCT940460. G.Ge. acknowledges support from the EC program TMR and IHES. We are grateful to L. H. Eliasson for kindly pointing out an error in our concluding remarks in Sect. 5.

⁶ The anisochronous case with $\Omega = 0$ is simply the KAM theorem.

References

[BGGM] Bonetto, F., Gallavotti, G., Gentile, G., Mastropietro, V.: Lindstedt series, ultraviolet divergences and Moser’s theorem. Preprint IHES/P/95/102 (1995)
[C] Chierchia, L.: Absolutely continuous spectra of quasi periodic Schrödinger operators. J. Math. Phys. 28, 2891–2898 (1987)
[DS] Dinaburg, E.I., Sinai, Ya.G.: The one dimensional Schrödinger equation with a quasi periodic potential. Funct. Anal. and Appl. 9, 279–289 (1975)
[E1] Eliasson, L.H.: Absolutely convergent series expansions for quasi-periodic motions. Mathematical Physics Electronic Journal 2, (4): 1–33 (1996), http://[email protected]
[E2] Eliasson, L.H.: Hamiltonian systems with linear normal form near an invariant torus. In: Nonlinear Dynamics, Ed. G. Turchetti, Bologna Conference, 30/5 to 3/6 1988, Singapore: World Scientific, 1989
[EV] Ecalle, J., Vallet, B.: Prenormalization, correction, and linearization of resonant vector fields or diffeomorphisms. Preprint 95-32, Université de Paris Sud - Mathématiques, Paris (1995)
[G1] Gallavotti, G.: A criterion of integrability for perturbed harmonic oscillators. “Wick Ordering” of the perturbations in classical mechanics and invariance of the frequency spectrum. Commun. Math. Phys. 87, 365–383 (1982)
[G2] Gallavotti, G.: Classical Mechanics and Renormalization Group. Lecture notes, 1983 Erice summer school, Ed. A. Wightman, G. Velo, N.Y.: Plenum, 1985
[G3] Gallavotti, G.: Quasi-integrable mechanical systems. In: Critical Phenomena, Random Systems, Gauge Theories. Les Houches, Session XLIII (1984), Vol. II, Ed. K. Osterwalder & R. Stora, Amsterdam: North Holland, 1986, pp. 539–624
[G4] Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164, 145–156 (1994)
[GGM] Gallavotti, G., Gentile, G., Mastropietro, V.: Field theory and KAM tori. Mathematical Physics Electronic Journal 1, (5): 1–13 (1995), http://[email protected]
[Ge] Gentile, G.: Whiskered tori with prefixed frequencies and Lyapunov spectrum. Dynam. Stability of Systems 10, 269–308 (1995)
[GM1] Gentile, G., Mastropietro, V.: KAM theorem revisited. Physica D 90, 225–234 (1996)
[GM2] Gentile, G., Mastropietro, V.: Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications. Rev. Math. Phys. 8, 393–444 (1996)
[HP] Harary, F., Palmer, E.: Graphical enumeration. New York: Academic Press, 1973
[M1] Moser, J.: Convergent series expansions for quasi periodic motions. Math. Ann. 169, 136–176 (1967)
[M2] Moser, J.: On the construction of almost periodic solutions for ordinary differential equations. Proceedings of the International Conference of functional analysis and related topics. Tokyo, 60–67 (1969)
[Pa] Parasyuk, I.S.: On instability zones of the Schrödinger equation with a quasi periodic potential. Ukr. Mat. Zh. 30, 70–78 (1985)
[PF] Pastur, L.A., Figotin, A.: Spectra of random and almost-periodic operators. Grundlehren der Mathematischen Wissenschaften 297, Berlin: Springer, 1992
[P] Pöschel, J.: Über invariante Tori in differenzierbaren Hamiltonschen Systemen. Bonn. Math. Schr. 120, 1–120 (1980)
[R] Rüssmann, H.: One dimensional Schrödinger equation. Ann. New York Acad. Sci. 357, 90–107 (1980)
[S] Siegel, C.L.: Iterations of analytic functions. Ann. Math. 43, No. 4, 607–612 (1943)

Communicated by G. Felder