Communications in Mathematical Physics - Volume 208

Commun. Math. Phys. 208, 1 – 23 (1999) Communications in Mathematical Physics © Springer-Verlag 1999 Characters of C...

Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 208, 1 – 23 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Characters of Cycles, Equivariant Characteristic Classes and Fredholm Modules

Alexander Gorokhovsky

Department of Mathematics, Ohio State University, Columbus, OH 43210, USA. E-mail: [email protected]

Received: 13 April 1998 / Accepted: 29 April 1999

Abstract: We derive a simple explicit formula for the character of a cycle in Connes’ (b, B)-bicomplex of cyclic cohomology and apply it to write formulas for the equivariant Chern character and for the characters of finitely summable bounded Fredholm modules.

1. Introduction

The notion of a cycle, introduced by Connes in [4], plays an important role in his development of cyclic cohomology and its applications. Many questions of differential geometry and noncommutative geometry can be reformulated as questions about geometrically defined cycles. Associated with a cycle is its character, which is a characteristic class in cyclic cohomology described by an explicit formula (see [4]).

Some natural constructions, like the transverse fundamental cycle of a foliation [6] or the superconnection in [15], require however the consideration of more general objects, which we call “generalized cycles” (we recall the definition in Sect. 2). The simplest geometric example of a generalized cycle is provided by the algebra of forms with values in the endomorphisms of some vector bundle, together with a connection. More interesting examples arise from vector bundles equivariant with respect to the action of a discrete group or, more generally, from holonomy equivariant vector bundles on foliated manifolds.

The original definition of the character of a cycle does not apply directly to generalized cycles. To overcome this, Connes ([4], cf. also [6]) devised a canonical procedure that associates a cycle with a generalized cycle. This makes it possible to extend the definition of the character to generalized cycles.

In this paper we show that the character of a generalized cycle can be defined by an explicit formula in the (b, B)-bicomplex, resembling the JLO formula for the Chern character [10]. In the geometric examples above this leads to formulas for Bott’s Chern character [2] in cyclic cohomology. As another example we derive a formula for the character of a Fredholm module.


The paper is organized as follows. In Sect. 2 we define the character of a generalized cycle and, more generally, of a generalized chain. Closely related formulas also appear and play an important role in Nest and Tsygan’s work on the algebraic index theorems [12, 13]. We then establish some basic properties of this character and prove that our definition of the character coincides with the original one given by Connes in [4]. In Sect. 3 we construct the cyclic cocycle representing the equivariant Chern character in cyclic cohomology, and discuss the relation of this construction to the multidimensional version of Connes’ construction of the Godbillon-Vey cocycle [5] and to the transverse fundamental class of the foliation. In Sect. 4 we write explicit formulas for the character of a bounded finitely summable Fredholm module, where $F^2 - 1$ is not necessarily 0 (such objects are called pre-Fredholm modules in [4]). The idea is to associate with such a Fredholm module a generalized cycle, by a construction similar to [4]. We thus obtain finitely summable analogues of the formulas from [10] and [9].

2. Characters of Cycles

In this section we start by stating the definitions of generalized chains and cycles, and writing the JLO-type formula for the character. We then show that this definition of the character coincides with the original one from [4]. In what follows we require the algebra A to be unital. This condition will later be removed by adjoining a unit to A.

One defines a generalized chain over an algebra A by specifying the following data:

1. Graded unital algebras $\Omega$ and $\partial\Omega$, a surjective homomorphism $r : \Omega \to \partial\Omega$ of degree 0, and a homomorphism $\rho : A \to \Omega^0$. We require that $\rho$ and $r$ be unital.

2. Graded derivations of degree 1, $\nabla$ on $\Omega$ and $\nabla'$ on $\partial\Omega$, such that $r \circ \nabla = \nabla' \circ r$, and an element $\theta \in \Omega^2$ such that $\nabla^2(\xi) = \theta\xi - \xi\theta$ for all $\xi \in \Omega$. We require that $\nabla(\theta) = 0$.

3. A graded trace $\fint$ on $\Omega^n$ for some n (called the degree of the chain) such that $\fint \nabla(\xi) = 0$ for all $\xi \in \Omega^{n-1}$ with $r(\xi) = 0$.

If one requires $\partial\Omega = 0$ one obtains the definition of a generalized cycle. A generalized cycle for which $\theta = 0$ is called a cycle.

One defines the boundary of the generalized chain to be the generalized cycle $(\partial\Omega, \nabla', \theta', \fint')$ of degree n − 1 over the algebra A, where $\fint'$ is the graded trace defined by the identity

$$\fint' \xi' = \fint \nabla(\xi), \tag{2.1}$$

where $\xi' \in (\partial\Omega)^{n-1}$ and $\xi \in \Omega^{n-1}$ is such that $r(\xi) = \xi'$. The homomorphism $\rho' : A \to \partial\Omega^0$ is given by

$$\rho' = r \circ \rho. \tag{2.2}$$


Notice that for $\xi' \in \partial\Omega$ we have $(\nabla')^2(\xi') = \theta'\xi' - \xi'\theta'$, where $\theta'$ is defined by

$$\theta' = r(\theta). \tag{2.3}$$

With every generalized chain $C^n$ of degree n one can associate by a JLO-type formula a canonical n-cochain $\operatorname{Ch}(C^n)$ in the (b, B)-bicomplex of the algebra A, which we call a character of the generalized chain,

$$\operatorname{Ch}^k(C^n)(a_0, a_1, \dots, a_k) = \frac{(-1)^{\frac{n-k}{2}}}{\left(\frac{n+k}{2}\right)!} \sum_{i_0+i_1+\dots+i_k=\frac{n-k}{2}} \fint \rho(a_0)\,\theta^{i_0}\,\nabla(\rho(a_1))\,\theta^{i_1}\cdots\nabla(\rho(a_k))\,\theta^{i_k}. \tag{2.4}$$

Note that if $C^n$ is a (non-generalized) cycle, $\operatorname{Ch}(C^n)$ coincides with the character of $C^n$ as defined by Connes. For a generalized chain C, let ∂C denote the boundary of C.

Theorem 2.1. Let $C^n$ be a chain, and let $\partial(C^n)$ be its boundary. Then

$$(B + b)\operatorname{Ch}(C^n) = S\operatorname{Ch}(\partial(C^n)). \tag{2.5}$$

Here S is the usual periodicity shift in the cyclic bicomplex.

Proof. By direct computation. □

Remark 2.1. A natural framework for such identities in cyclic cohomology is provided by the theory of operations on cyclic cohomology of Nest and Tsygan, cf. [12, 13].

Corollary 2.2. If $C^n$ is a generalized cycle then $\operatorname{Ch}(C^n)$ is an n-cocycle in the cyclic bicomplex of the algebra A.

Corollary 2.3. For two cobordant generalized cycles $C_1^n$ and $C_2^n$, $[S\operatorname{Ch}(C_1^n)] = [S\operatorname{Ch}(C_2^n)]$ in $HC^{n+2}(A)$.

Formula (2.4) can also be written in a different form. We will use the following notations. First, $\fint$ can be extended to the whole algebra $\Omega$ by setting $\fint \xi = 0$ if $\deg \xi \neq n$. For $\xi \in \Omega$, $e^{\xi}$ is defined as $\sum_{j=0}^{\infty} \frac{\xi^j}{j!}$. Then denote by $\Delta^k$ the k-simplex $\{(t_0, t_1, \dots, t_k) \mid t_0 + t_1 + \dots + t_k = 1,\ t_j \ge 0\}$ with the measure $dt_1\,dt_2 \dots dt_k$. Finally, α is an arbitrary nonzero real parameter. Then

$$\operatorname{Ch}^k(C^n)(a_0, a_1, \dots, a_k) = \alpha^{\frac{k-n}{2}} \int_{\Delta^k} \left( \fint \rho(a_0)\,e^{-\alpha t_0\theta}\,\nabla(\rho(a_1))\,e^{-\alpha t_1\theta}\cdots\nabla(\rho(a_k))\,e^{-\alpha t_k\theta} \right) dt_1\,dt_2 \dots dt_k, \tag{2.6}$$


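where k is of the same parity as n. Indeed,

$$\fint \rho(a_0)\,e^{-\alpha t_0\theta}\,\nabla(\rho(a_1))\,e^{-\alpha t_1\theta}\cdots\nabla(\rho(a_k))\,e^{-\alpha t_k\theta} = (-\alpha)^{\frac{n-k}{2}} \sum_{i_0+i_1+\dots+i_k=\frac{n-k}{2}} \frac{t_0^{i_0} t_1^{i_1}\cdots t_k^{i_k}}{i_0!\,i_1!\cdots i_k!} \fint \rho(a_0)\,\theta^{i_0}\,\nabla(\rho(a_1))\,\theta^{i_1}\cdots\nabla(\rho(a_k))\,\theta^{i_k} \tag{2.7}$$

and our assertion follows from the equality

$$\int_{\Delta^k} t_0^{i_0} t_1^{i_1}\cdots t_k^{i_k}\,dt_1\,dt_2 \dots dt_k = \frac{i_0!\,i_1!\cdots i_k!}{(i_0 + i_1 + \dots + i_k + k)!}.$$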
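As a sanity check on the last equality and on the normalization, the following sketch works out the case k = 1 and then tracks the prefactors. It uses only the classical Beta integral and the degree count $i_0 + \dots + i_k = \frac{n-k}{2}$; it is an illustration added here, not part of the original argument.

```latex
% Case k = 1: the simplex is {t_0 + t_1 = 1}; parametrize by t_1 = t, t_0 = 1 - t.
\[
\int_{\Delta^1} t_0^{i_0} t_1^{i_1}\, dt_1
  = \int_0^1 (1-t)^{i_0}\, t^{i_1}\, dt
  = \frac{i_0!\, i_1!}{(i_0 + i_1 + 1)!}
  \qquad \text{(Beta integral).}
\]
% General k: on the support of the sum in (2.7), i_0 + ... + i_k = (n-k)/2, so
\[
(i_0 + \dots + i_k + k)! = \Bigl(\tfrac{n+k}{2}\Bigr)! .
\]
% Integrating (2.7) over \Delta^k cancels the factorials i_0! ... i_k!;
% multiplying by \alpha^{(k-n)/2} as in (2.6) leaves the coefficient
\[
\alpha^{\frac{k-n}{2}} \cdot \frac{(-\alpha)^{\frac{n-k}{2}}}{\bigl(\frac{n+k}{2}\bigr)!}
  = \frac{(-1)^{\frac{n-k}{2}}}{\bigl(\frac{n+k}{2}\bigr)!} .
\]
```

The resulting coefficient is the one appearing in (2.4), and in particular it does not depend on α.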

Remark 2.2. We worked above only in the context of unital algebras and maps. The case of general algebras and maps can be treated by adjoining a unit. We follow [15]. The definition of the generalized chain in the nonunital case differs from the definition in the unital case only in two aspects: first, we do not require algebras and morphisms to be unital; second, we no longer require that the curvature θ be an element of $\Omega^2$; rather we require it to be a multiplier of the algebra $\Omega$ which satisfies the following: for $\omega \in \Omega^k$, $\theta\omega$ and $\omega\theta$ are in $\Omega^{k+2}$, $\nabla(\theta\omega) = \theta\nabla(\omega)$, $\nabla(\omega\theta) = \nabla(\omega)\theta$, and $\fint \theta\omega = \fint \omega\theta$ if $\omega \in \Omega^{n-2}$. We also need to require the existence of $\theta'$, a multiplier of $\partial\Omega$ such that $r(\theta\omega) = \theta' r(\omega)$ and $r(\omega\theta) = r(\omega)\theta'$, and include it in the defining data of the chain.

With a nonunital generalized chain $C^n = (\Omega, \partial\Omega, r, \nabla, \nabla', \theta, \fint)$ over a (possibly nonunital) algebra A we associate canonically a unital chain $\tilde{C}^n = (\tilde\Omega, \partial\tilde\Omega, \tilde r, \tilde\nabla, \tilde\nabla', \tilde\theta, \tilde\fint)$ over the algebra $\tilde A$, which is A with a unit adjoined. The construction is the following: the algebra $\tilde\Omega$ is obtained from the algebra $\Omega$ by adjoining a unit 1 (of degree 0) and an element $\tilde\theta$ of degree 2 with the relations $\tilde\theta\omega = \theta\omega$ and $\omega\theta = \omega\tilde\theta$ for $\omega \in \Omega$, and similarly for the algebra $\partial\tilde\Omega$. The derivation $\tilde\nabla$ coincides with $\nabla$ on the elements of $\Omega$ and satisfies the equalities $\tilde\nabla(\tilde\theta) = 0$ and $\tilde\nabla(1) = 0$, and $\tilde\nabla'$ is defined similarly. The graded trace $\tilde\fint$ on $\tilde\Omega$ is defined to coincide with $\fint$ on the elements of $\Omega$ and, if n is even, is required to satisfy the relation $\tilde\fint\, \tilde\theta^{\,n/2} = 0$.

Now if $C^n$ is a (nonunital) generalized cycle over A, formula (2.4), applied to $\tilde C^n$, defines a (reduced) cyclic cocycle over the algebra $\tilde A$ and hence a class in the reduced cyclic cohomology $\overline{HC}^n(\tilde A) = HC^n(A)$. Corollary 2.3 implies that this class is invariant under (nonunital) cobordism. Note also that in the unital case the class defined after adjoining the unit agrees with the one defined before. Alternatively, one can work from the beginning with the Loday–Quillen–Tsygan bicomplex, see e.g. [11], where the corresponding formulas can be easily written.


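We will now show the equivalence of the previous construction with Connes’ original construction. With every generalized cycle $C = (\Omega, \nabla, \theta, \fint)$ over an algebra A, Connes shows how to associate canonically a cycle $C_X$. One starts with a graded algebra $\Omega_\theta$, which as a vector space can be identified with the space of 2 by 2 matrices over the algebra $\Omega$, with the grading given by the following:

$$\begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix} \in \Omega_\theta^k \quad \text{if } \omega_{11} \in \Omega^k,\ \omega_{12}, \omega_{21} \in \Omega^{k-1} \text{ and } \omega_{22} \in \Omega^{k-2}.$$

The product of two elements $\omega = \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix}$ and $\omega' = \begin{pmatrix} \omega'_{11} & \omega'_{12} \\ \omega'_{21} & \omega'_{22} \end{pmatrix}$ in $\Omega_\theta$ is given by

$$\omega * \omega' = \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \theta \end{pmatrix} \begin{pmatrix} \omega'_{11} & \omega'_{12} \\ \omega'_{21} & \omega'_{22} \end{pmatrix}. \tag{2.8}$$

The homomorphism $\rho_\theta : A \to \Omega_\theta$ is given by

$$\rho_\theta(a) = \begin{pmatrix} \rho(a) & 0 \\ 0 & 0 \end{pmatrix}. \tag{2.9}$$

On this algebra one can define a graded derivation $\nabla_\theta$ of degree 1 by the formula (here $\omega = \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix}$)

$$\nabla_\theta(\omega) = \begin{pmatrix} \nabla(\omega_{11}) & \nabla(\omega_{12}) \\ -\nabla(\omega_{21}) & -\nabla(\omega_{22}) \end{pmatrix}. \tag{2.10}$$

One checks that

$$\nabla_\theta^2(\omega) = \begin{pmatrix} \theta & 0 \\ 0 & 1 \end{pmatrix} * \omega - \omega * \begin{pmatrix} \theta & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.11}$$

More generally, one can define on this algebra a family of connections $\nabla_\theta^t$, $0 \le t \le 1$, by the equation

$$\nabla_\theta^t(\omega) = \nabla_\theta(\omega) + t\left(X * \omega - (-1)^{\deg\omega}\,\omega * X\right), \tag{2.12}$$

where X is the degree 1 element of $\Omega_\theta$ given by the matrix

$$X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

Lemma 2.4.

$$(\nabla_\theta^t)^2(\omega) = (1 - t^2)\left( \begin{pmatrix} \theta & 0 \\ 0 & 1 \end{pmatrix} * \omega - \omega * \begin{pmatrix} \theta & 0 \\ 0 & 1 \end{pmatrix} \right). \tag{2.13}$$

Proof. Follows from an easy computation. □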
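The computation behind Lemma 2.4 can be sketched as follows. The sketch assumes that $\nabla_\theta$ is a graded derivation for the product $*$ and that $\nabla(1) = 0$; the shorthand $E$ for the matrix in (2.11) and the symbol $\operatorname{ad}$ for the graded commutator with respect to $*$ are notation introduced only for this illustration.

```latex
% Notation for this sketch: E = diag(\theta, 1),
% ad(Y)\omega = Y * \omega - (-1)^{\deg Y \deg\omega} \omega * Y.
\[
E = \begin{pmatrix}\theta & 0\\ 0 & 1\end{pmatrix}, \qquad
\nabla_\theta^2 = \operatorname{ad}(E) \ \text{by (2.11)}, \qquad
\nabla_\theta^t = \nabla_\theta + t\,\operatorname{ad}(X) \ \text{by (2.12)}.
\]
% X has odd degree, so squaring gives
\[
(\nabla_\theta^t)^2 = \nabla_\theta^2 + t\,\operatorname{ad}(\nabla_\theta X) + t^2\,\operatorname{ad}(X * X).
\]
% The entries of X are constants, so (2.10) and \nabla(1) = 0 give \nabla_\theta X = 0.
% The product (2.8) gives
\[
X * X = \begin{pmatrix}0&-1\\1&0\end{pmatrix}
        \begin{pmatrix}1&0\\0&\theta\end{pmatrix}
        \begin{pmatrix}0&-1\\1&0\end{pmatrix}
      = \begin{pmatrix}-\theta&0\\0&-1\end{pmatrix} = -E .
\]
% Hence
\[
(\nabla_\theta^t)^2 = (1 - t^2)\,\operatorname{ad}(E).
\]
```

Since $E$ has even degree, $\operatorname{ad}(E)\omega = E * \omega - \omega * E$, which is the right-hand side of (2.13).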


Hence for t = 1 we obtain a graded derivation $\nabla_\theta^1$ whose square is 0. Finally, the graded trace $\fint_\theta$ is defined by

$$\fint_\theta \omega = \fint \omega_{11} - (-1)^{\deg\omega} \fint \omega_{22}\,\theta. \tag{2.14}$$

It is closed with respect to $\nabla_\theta$, and hence, being a graded trace, it is closed with respect to $\nabla_\theta^t$ for any t.

Corollary 2.5. $C_X = (\Omega_\theta, \nabla_\theta^1, \fint_\theta)$ is a (nongeneralized) cycle.

The cycle $C_X$ is Connes’ canonical cycle, associated with the generalized cycle C. With every (nongeneralized) cycle of degree n Connes associated a cyclic n-cocycle on the algebra A by the following procedure: let the cycle consist of a graded algebra $\Omega$, a degree 1 graded derivation d and a closed trace $\fint$. Then the character of the cycle is the cyclic cocycle τ in the cyclic complex given by the formula

$$\tau(a_0, a_1, \dots, a_n) = \fint \rho(a_0)\,d\rho(a_1) \cdots d\rho(a_n). \tag{2.15}$$

To it corresponds a cocycle in the (b, B)-bicomplex with only one nonzero component of degree n, which equals

$$\frac{1}{n!} \fint \rho(a_0)\,d\rho(a_1) \cdots d\rho(a_n).$$

Theorem 2.6. Let $C^n$ be a generalized cycle of degree n over an algebra A, and let $C_X^n$ be the canonical cycle over A associated with $C^n$ (see above). Then $[\operatorname{Ch}(C^n)] = [\tau(C_X^n)]$ in $HC^n(A)$.

Note that the equality here is in cyclic cohomology, not only in periodic cyclic cohomology. The theorem will follow easily from the above considerations and the following lemma.

Lemma 2.7. Let $C_0 = (\Omega, \nabla_0, \theta_0, \fint)$ be a generalized cycle of degree n over an algebra A, and let η be an element of $\Omega^1$. Consider the generalized cycle $C_1 = (\Omega, \nabla_1, \theta_1, \fint)$, where $\nabla_1 = \nabla_0 + \operatorname{ad}\eta$ and $\theta_1 = \theta_0 + \nabla_0\eta + \eta^2$. Then $[\operatorname{Ch}(C_0)] = [\operatorname{Ch}(C_1)]$.

Proof of Lemma 2.7. First, we can suppose that the cycle is unital; otherwise one can perform the construction explained in Remark 2.2. We start by constructing a cobordism between the cycles $C_0$ and $C_1$. This is analogous to a construction from [15]. The cobordism is provided by the chain $C^c = (\Omega^c, \partial\Omega^c, r^c, \nabla^c, (\nabla^c)', \theta^c, \fint^c)$ with $\partial C^c = -C_0 \sqcup C_1$, defined as follows.


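The graded algebra $\Omega^c$ is defined as $\Omega^*([0,1]) \hat\otimes \Omega$, where $\hat\otimes$ denotes the graded tensor product, and $\Omega^*([0,1])$ is the algebra of differential forms on the segment [0, 1]. The map $\rho^c : A \to \Omega^c$ is given by

$$\rho^c(a) = 1 \hat\otimes \rho(a). \tag{2.16}$$

We denote by t the variable on the segment [0, 1]. The graded derivation $\nabla^c$ is defined by

$$\nabla^c(\alpha \hat\otimes \omega) = d\alpha \hat\otimes \omega + (-1)^{\deg\alpha}\,\alpha \hat\otimes \nabla_0\omega + (-1)^{\deg\alpha}\,t\alpha \hat\otimes [\eta, \omega]. \tag{2.17}$$

Here d is the de Rham differential on [0, 1]. The curvature $\theta^c$ is defined to be

$$1 \hat\otimes \theta_0 + t \hat\otimes \nabla_0\eta + t^2 \hat\otimes \eta^2 + dt \hat\otimes \eta. \tag{2.18}$$

As expected, the algebra $\partial\Omega^c$ is defined to be $\Omega \oplus \Omega$. The restriction map $r^c : \Omega^c \to \Omega \oplus \Omega$ is defined by

$$r^c(\alpha \hat\otimes \omega) = \begin{cases} \alpha(0)\,\omega \oplus \alpha(1)\,\omega & \text{if } \deg\alpha = 0, \\ 0 & \text{otherwise.} \end{cases} \tag{2.19}$$

The connection $\nabla'$ on $\Omega \oplus \Omega$ is given by $\nabla_0 \oplus \nabla_1$. The graded trace $\fint^c$ on $(\Omega^c)^{n+1}$ is given by the formula

$$\fint^c \alpha \hat\otimes \omega = \begin{cases} \int_{[0,1]} \alpha \,\fint \omega & \text{if } \deg\omega = n \text{ and } \deg\alpha = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2.20}$$

It is easy to see that

$$\fint^c \nabla^c(\alpha \hat\otimes \omega) = \begin{cases} (\alpha(1) - \alpha(0)) \fint \omega & \text{if } \deg\omega = n \text{ and } \deg\alpha = 0, \\ 0 & \text{otherwise.} \end{cases} \tag{2.21}$$

Hence the “boundary” trace $(\fint^c)'$ induced on $\Omega \oplus \Omega$ equals $-\fint \oplus \fint$.

Thus we constructed the generalized chain $C^c$, providing the cobordism between $C_0$ and $C_1$. Corollary 2.3 implies that $[S\operatorname{Ch}(C_0)] = [S\operatorname{Ch}(C_1)]$. To obtain the more precise statement of the lemma and finish the proof of the theorem, we need to examine the character $\operatorname{Ch}(C^c)$, since $S\operatorname{Ch}(C_0) - S\operatorname{Ch}(C_1) = (b + B)\operatorname{Ch}(C^c)$. $\operatorname{Ch}(C^c)$ has components $\operatorname{Ch}^k(C^c)$ for k = n + 1, n − 1, …. Its top component $\operatorname{Ch}^{n+1}(C^c)$ is given by the formula

$$\operatorname{Ch}^{n+1}(C^c)(a_0, a_1, \dots, a_{n+1}) = \frac{1}{(n+1)!} \fint^c \rho^c(a_0)\,\nabla^c(\rho^c(a_1)) \cdots \nabla^c(\rho^c(a_{n+1})), \tag{2.22}$$

where $a_i \in A$. But the expression under $\fint^c$ is easily seen to be of the form $\alpha \hat\otimes \omega$, with α of degree 0. Hence the expression (2.22) is identically 0, by the definition (2.20) of $\fint^c$. It follows that $\operatorname{Ch}(C^c)$ is in the image of the map S, and this implies that $[\operatorname{Ch}(C_0)] = [\operatorname{Ch}(C_1)]$. □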
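For orientation, the formula $\theta_1 = \theta_0 + \nabla_0\eta + \eta^2$ in Lemma 2.7 is what makes $\theta_1$ the curvature of $\nabla_1 = \nabla_0 + \operatorname{ad}\eta$. A short check follows, assuming that $\operatorname{ad}\eta$ denotes the graded commutator $[\eta, \omega] = \eta\omega - (-1)^{\deg\omega}\omega\eta$ (this convention is an assumption; the text does not spell it out).

```latex
% Expand the square of \nabla_1 = \nabla_0 + [\eta, \cdot\,]:
\[
\nabla_1^2\omega
  = \nabla_0^2\omega
  + \bigl(\nabla_0[\eta,\omega] + [\eta,\nabla_0\omega]\bigr)
  + [\eta,[\eta,\omega]] .
\]
% Graded Leibniz rule for \nabla_0, with \deg\eta = 1:
\[
\nabla_0[\eta,\omega] + [\eta,\nabla_0\omega] = (\nabla_0\eta)\,\omega - \omega\,(\nabla_0\eta).
\]
% Because \eta is odd, the double commutator collapses:
\[
[\eta,[\eta,\omega]] = \eta^2\omega - \omega\,\eta^2 .
\]
% Together with \nabla_0^2\omega = \theta_0\omega - \omega\theta_0 this gives
\[
\nabla_1^2\omega = \theta_1\omega - \omega\,\theta_1,
\qquad \theta_1 = \theta_0 + \nabla_0\eta + \eta^2 .
\]
```

Replacing η by tη in the same computation gives $\theta_0 + t\nabla_0\eta + t^2\eta^2$, which is the part of the curvature (2.18) that does not involve dt. The remaining condition $\nabla_1(\theta_1) = 0$ is not checked in this sketch.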


Remark 2.3. The above lemma remains true if we relax its conditions to allow η to be a multiplier of degree 1 such that $\fint \eta\omega = (-1)^{(n-1)/2} \fint \omega\eta$ and $r(\eta\omega) = r(\omega\eta) = 0$ if $r(\omega) = 0$. Then $\nabla_0\eta$ is a multiplier, defined by $(\nabla_0\eta)\omega = \nabla_0(\eta\omega) + \eta\nabla_0\omega$. The same proof then goes through if we enlarge the algebra $\Omega$ to the subalgebra of the multiplier algebra of $\Omega$ obtained from $\Omega$ by adjoining 1, $\theta_0$, η, $\nabla_0\eta$, and extend $\fint$ to this algebra by zero (i.e. we put $\fint P = 0$ for any monomial P in $\theta_0$ and η).

Proof of Theorem 2.6. The lemma above applies directly to the cycles

$$C_0 = \left(\Omega_\theta,\ \nabla_\theta,\ \begin{pmatrix} \theta & 0 \\ 0 & 1 \end{pmatrix},\ \fint_\theta\right)$$

and $C_1 = C_X^n$ (with η = X). This shows that $\operatorname{Ch}(C^n) = \operatorname{Ch}(C_X^n)$ in $HC^n(A)$. Since $C_X^n$ is a (nongeneralized) cycle, comparison of the definitions shows that $\operatorname{Ch}(C_X^n) = \tau(C_X^n)$, even on the level of cocycles, and the theorem follows. □

Corollary 2.8. For two generalized cycles $C_1^n = (\Omega_1, \nabla_1, \theta_1, \fint_1)$ and $C_2^m = (\Omega_2, \nabla_2, \theta_2, \fint_2)$ define the product by

$$C_1 \times C_2 = \left(\Omega_1 \hat\otimes \Omega_2,\ \nabla_1 \hat\otimes 1 + 1 \hat\otimes \nabla_2,\ \theta_1 \hat\otimes 1 + 1 \hat\otimes \theta_2,\ \fint_1 \hat\otimes \fint_2\right).$$

Then $\operatorname{Ch}(C_1 \times C_2) = \operatorname{Ch}(C_1) \cup \operatorname{Ch}(C_2)$.

Proof. For non-generalized cycles this follows from Connes’ definition of the cup product. In the general case, the statement follows from the existence of the natural map of cycles (i.e. a homomorphism of the corresponding algebras, preserving all the structure) $(C_1 \times C_2)_X \to (C_1)_X \times (C_2)_X$, which agrees with taking the character. The simplest way to describe this map is by using another description by Connes of his construction. In this description the matrix $\begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix}$, $\omega_{ij} \in \Omega$, is identified with the element $\omega_{11} + \omega_{12}X + X\omega_{21} + X\omega_{22}X$, where X is a formal symbol of degree 1. The multiplication law is formally defined by $\omega X \omega' = 0$, $X^2 = \theta$. This should be understood as a short way of writing identities like $\omega X * X\omega' = \omega\theta\omega'$ (note that X is not an element of the algebra). If we denote by $X_1$, $X_2$, $X_{12}$ the formal elements corresponding to $C_1$, $C_2$, $C_1 \times C_2$ respectively, the homomorphism mentioned above is the unital extension of the identity map $\Omega_1 \hat\otimes \Omega_2 \to \Omega_1 \hat\otimes \Omega_2$ defined (again formally) by $X_{12} \mapsto (X_1 \hat\otimes 1 + 1 \hat\otimes X_2)$. □

3. Equivariant Characteristic Classes

This section concerns vector bundles equivariant with respect to discrete group actions. We show that there is a generalized cycle associated naturally to such a bundle with (not


necessarily invariant) connection. The character of this generalized cycle turns out to be related (see Theorem 3.1) to the equivariant Chern character.

Let V be an orientable smooth manifold of dimension n, E a complex vector bundle over V, and A = End(E) the algebra of endomorphisms with compact support. One can construct a generalized cycle over the algebra A in the following way. The algebra $\Omega = \Omega^*(V, \operatorname{End}(E))$ is the algebra of endomorphism-valued differential forms. Any connection ∇ on the bundle E defines a connection for the generalized cycle, with the curvature $\theta \in \Omega^2(V, \operatorname{End}(E))$, the usual curvature of the connection. On $\Omega^n(V, \operatorname{End}(E))$ one defines a graded trace $\fint$ by the formula $\fint \omega = \int_V \operatorname{tr}\omega$, where on the right-hand side we have the usual matrix trace and the usual integration over a manifold. Note that when V is noncompact, this cycle is nonunital. Formula (2.6) defines a cyclic n-cocycle $\{\operatorname{Ch}^k\}$ on the algebra A, given by the formula

$$\operatorname{Ch}^k(a_0, a_1, \dots, a_k) = \int_{\Delta^k} \left( \int_V \operatorname{tr}\, a_0\, e^{-t_0\theta}\, \nabla(a_1)\, e^{-t_1\theta} \cdots \nabla(a_k)\, e^{-t_k\theta} \right) dt_1\,dt_2 \dots dt_k. \tag{3.1}$$

Hence we recover the formula of Quillen from [16]. (Recall that for noncompact V these expressions should be viewed as defining the reduced cocycle over the algebra A with unit adjoined, with $\operatorname{Ch}^0$ extended by $\operatorname{Ch}^0(1) = 0$.) One can restrict this cocycle to the subalgebra of functions $C^\infty(V) \subset \operatorname{End}(E)$. As a result one obtains an n-cocycle on the algebra $C^\infty(V)$, which we still denote by $\{\operatorname{Ch}^k\}$, given by the formula

$$\operatorname{Ch}^k(a_0, a_1, \dots, a_k) = \frac{1}{k!} \int_V a_0\,da_1 \cdots da_k\ \operatorname{tr} e^{-\theta}. \tag{3.2}$$
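The passage from (3.1) to (3.2) can be spelled out as follows. The sketch assumes that functions act on E as scalar multiples of the identity, so that $\nabla(a_i) = da_i$ is a scalar 1-form; such forms commute with the even-degree form θ.

```latex
% For a_i in C^\infty(V): \nabla(a_i) = da_i, and scalar forms commute with e^{-t\theta}. Hence
\[
\operatorname{tr}\, a_0\, e^{-t_0\theta}\, da_1\, e^{-t_1\theta} \cdots da_k\, e^{-t_k\theta}
  = a_0\, da_1 \cdots da_k\ \operatorname{tr} e^{-(t_0 + t_1 + \dots + t_k)\theta}
  = a_0\, da_1 \cdots da_k\ \operatorname{tr} e^{-\theta},
\]
% using t_0 + ... + t_k = 1 on \Delta^k. The integrand no longer depends on t, and
\[
\int_{\Delta^k} dt_1 \cdots dt_k = \frac{1}{k!},
\]
% which is the factor 1/k! in (3.2).
```

The factor $1/k!$ is the special case $i_0 = \dots = i_k = 0$ of the simplex integral used in Sect. 2.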

To this cocycle corresponds a current on V, defined by the form $\operatorname{tr} e^{-\theta}$. Hence in this case we recover the Chern character of the bundle E. Note that we use the normalization of the Chern character from [1].

Let now an orientable manifold V of dimension n be equipped with an action of a discrete group Γ of orientation preserving transformations, and let E be a Γ-invariant bundle. In this situation, one can again construct a cycle of degree n over the algebra $A = \operatorname{End}(E) \rtimes \Gamma$. Our notations are the following: the algebra A is generated by the elements of the form $aU_g$, $a \in \operatorname{End}(E)$, $g \in \Gamma$, where $U_g$ is a formal symbol. The product is $(a'U_{g'})(aU_g) = a'a^{g'}U_{gg'}$. The superscript here denotes the action of the group. The graded algebra $\Omega$ is defined as $\Omega^*(V, \operatorname{End}(E)) \rtimes \Gamma$. Elements of $\Omega$ clearly act on the forms with values in E, and any connection ∇ in the bundle E defines a connection for the algebra $\Omega$, which we also denote by ∇, by the identity (here $\omega \in \Omega$ and $s \in \Omega^*(V, E)$)

$$\nabla(\omega s) = \nabla(\omega)s + (-1)^{\deg\omega}\,\omega\nabla(s). \tag{3.3}$$

One checks that the above formula indeed defines a degree 1 derivation, which can be described by its action on the elements of the form $\alpha U_g$, where $\alpha \in \Omega^*(V, \operatorname{End}(E))$, $g \in \Gamma$, by the equation

$$\nabla(\alpha U_g) = (\nabla(\alpha) + \alpha \wedge \delta(g))\,U_g, \tag{3.4}$$


where δ is the $\Omega^1(V, \operatorname{End}(E))$-valued group cocycle defined by

$$\delta(g) = \nabla - g \circ \nabla \circ g^{-1}. \tag{3.5}$$

One defines the curvature as the element $\theta U_1$, where 1 is the unit of the group, and θ is the (usual) curvature of ∇. The graded trace $\fint$ on $\Omega^n$ is given by

$$\fint \alpha U_g = \begin{cases} \int_V \operatorname{tr}\alpha & \text{if } g = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.6}$$

One can associate with this cycle a cyclic n-cocycle over the algebra A, by Eq. (2.6). By restricting it to the subalgebra $C_0^\infty(V) \rtimes \Gamma$ one obtains an n-cocycle $\{\chi^k\}$ on this algebra. Its k-th component is given by the formula

$$\chi^k(a_0U_{g_0}, a_1U_{g_1}, \dots, a_kU_{g_k}) = \sum_{1 \le i_1 < i_2 < \dots < i_l \le k} \int_V a_0\, da_1^{\gamma_1}\, da_2^{\gamma_2} \cdots da_{i_1-1}^{\gamma_{i_1-1}}\, a_{i_1}^{\gamma_{i_1}}\, da_{i_1+1}^{\gamma_{i_1+1}} \cdots\ \Theta_{i_1,i_2,\dots,i_l}(\gamma_1, \dots, \gamma_k) \tag{3.7}$$

for $g_0g_1 \cdots g_k = 1$ and 0 otherwise. Here the summation is over all the subsets $\{i_1, \dots, i_l\}$ of $\{1, 2, \dots, k\}$ and the following notations are used: $\gamma_j$ are the group elements defined by $\gamma_j = g_0g_1 \cdots g_{j-1}$, and $\Theta_{i_1,i_2,\dots,i_l}(\gamma_1, \dots, \gamma_k)$ is the form (depending on $g_0, g_1, \dots$) defined by the formula

$$\Theta_{i_1,i_2,\dots,i_l}(\gamma_1, \dots, \gamma_k) = \int_{\Delta^k} \operatorname{tr}\, e^{-t_0\theta^{\gamma_1}} e^{-t_1\theta^{\gamma_2}} \cdots e^{-t_{i_1-1}\theta^{\gamma_{i_1}}}\, \delta(g_{i_1})^{\gamma_{i_1}}\, e^{-t_{i_1}\theta^{\gamma_{i_1+1}}} \cdots e^{-t_{i_2-1}\theta^{\gamma_{i_2}}}\, \delta(g_{i_2})^{\gamma_{i_2}} \cdots e^{-t_k\theta}\ dt_1 \dots dt_k. \tag{3.8}$$

The change of connection does not change the class in cyclic cohomology, as can be seen by constructing a cobordism between the corresponding cycles. This formula is a cyclic cohomological analogue of the formula of Bott [2]. More precisely, the following theorem holds:

Theorem 3.1. Let $\operatorname{Ch}_\Gamma(E) \in H^*(V \times_\Gamma E\Gamma)$ be the equivariant Chern character. Let $\Phi : H^*(V \times_\Gamma E\Gamma) \to HP^*(C_0^\infty(V) \rtimes \Gamma)$ be the canonical imbedding, constructed by Connes, cf. [6]. Then $\Phi(\operatorname{Ch}_\Gamma(E)) = [\chi]$.

Here E pulls back to an equivariant bundle on $V \times E\Gamma$, and then drops down to $V \times_\Gamma E\Gamma$, and the equivariant Chern character $\operatorname{Ch}_\Gamma(E)$ is the Chern character of the resulting bundle. We recall that we use the normalization from [1].

To prove the theorem we need some preliminary constructions and facts. For a Γ-manifold Y we denote by $Y_\Gamma$ the homotopy quotient $Y \times_\Gamma E\Gamma$.


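Suppose we are given Γ-manifolds V and X, with X oriented. We then construct a map $I : HC^j(C_0^\infty(V) \rtimes \Gamma) \to HC^{j+\dim X}(C_0^\infty(V \times X) \rtimes \Gamma)$. The construction is the following: in $HC^{\dim X}(C_0^\infty(X) \rtimes \Gamma)$ there is a class represented by the cocycle

$$\tau(f_0U_{g_0}, f_1U_{g_1}, \dots, f_kU_{g_k}) = \begin{cases} \dfrac{1}{k!} \displaystyle\int_X f_0\, df_1^{g_0} \cdots df_k^{g_0g_1 \cdots g_{k-1}} & \text{if } g_0g_1 \cdots g_k = 1, \\ 0 & \text{otherwise,} \end{cases}$$

where k = dim X. One then constructs the map I from the following diagram:

$$HC^j(C_0^\infty(V) \rtimes \Gamma) \xrightarrow{\ \cup\tau\ } HC^{j+\dim X}\big((C_0^\infty(V) \rtimes \Gamma) \otimes (C_0^\infty(X) \rtimes \Gamma)\big) \xrightarrow{\ \Delta^*\ } HC^{j+\dim X}(C_0^\infty(V \times X) \rtimes \Gamma). \tag{3.9}$$

Here the last arrow is induced by the natural map

$$\Delta : C_0^\infty(V \times X) \rtimes \Gamma = (C_0^\infty(V) \otimes C_0^\infty(X)) \rtimes \Gamma \to (C_0^\infty(V) \rtimes \Gamma) \otimes (C_0^\infty(X) \rtimes \Gamma)$$

defined by

$$\Delta\big((f \otimes f')U_g\big) = fU_g \otimes f'U_g. \tag{3.10}$$

Suppose now that V is also oriented.

Proposition 3.2. The following diagram is commutative:

$$\begin{array}{ccc}
HP^*(C_0^\infty(V \times X) \rtimes \Gamma) & \xleftarrow{\ \Phi\ } & H^*((V \times X)_\Gamma) \\
\Big\uparrow I & & \Big\uparrow \pi^* \\
HP^*(C_0^\infty(V) \rtimes \Gamma) & \xleftarrow{\ \Phi\ } & H^*(V_\Gamma)
\end{array} \tag{3.11}$$

Here $\pi : (V \times X)_\Gamma \to V_\Gamma$ is induced by the (Γ-equivariant) projection $V \times X \times E\Gamma \to V \times E\Gamma$.

Proof. We can consider V × X with the action of Γ × Γ. We start by showing that the following diagram is commutative:

$$\begin{array}{ccc}
HP^*(C_0^\infty(V \times X) \rtimes (\Gamma \times \Gamma)) & \xleftarrow{\ \Phi\ } & H^*((V \times X)_{\Gamma \times \Gamma}) \\
\Big\uparrow \cup\tau & & \Big\uparrow \pi^* \\
HP^*(C_0^\infty(V) \rtimes \Gamma) & \xleftarrow{\ \Phi\ } & H^*(V_\Gamma)
\end{array} \tag{3.12}$$

Here we identify $C_0^\infty(V \times X) \rtimes (\Gamma \times \Gamma)$ with $(C_0^\infty(V) \rtimes \Gamma) \otimes (C_0^\infty(X) \rtimes \Gamma)$ and $(V \times X)_{\Gamma \times \Gamma}$ with $X_\Gamma \times V_\Gamma$. This is verified by direct computation, using the Eilenberg-Zilber theorem and the shuffle map in cyclic cohomology, cf. [11]. Now we note that the commutativity of the following diagram is clear:

$$\begin{array}{ccc}
HP^*(C_0^\infty(V \times X) \rtimes (\Gamma \times \Gamma)) & \xleftarrow{\ \Phi\ } & H^*((V \times X)_{\Gamma \times \Gamma}) \\
\Big\downarrow & & \Big\downarrow \\
HP^*(C_0^\infty(V \times X) \rtimes \Gamma) & \xleftarrow{\ \Phi\ } & H^*((V \times X)_\Gamma)
\end{array} \tag{3.13}$$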


where both vertical arrows are induced by the diagonal maps $\Gamma \to \Gamma \times \Gamma$ and $E\Gamma \to E\Gamma \times E\Gamma$. This ends the proof. □

Proposition 3.3. Let E be an equivariant vector bundle on V with connection ∇. Let $\chi \in HC^n(C_0^\infty(V) \rtimes \Gamma)$ be the character of the associated cycle, and let $\chi' \in HC^{n+k}(C_0^\infty(V \times X) \rtimes \Gamma)$ be the character of the cycle constructed with the bundle $pr_V^*E$ and connection $pr_V^*\nabla$, where $pr_V : X \times V \to V$. Then $I(\chi) = \chi'$. (Here n and k are the dimensions of V and X respectively.)

Proof. Let C denote the corresponding cycle over $C_0^\infty(V) \rtimes \Gamma$, and T the transverse fundamental cycle of X. Then C × T is a cycle over $(C_0^\infty(V) \rtimes \Gamma) \otimes (C_0^\infty(X) \rtimes \Gamma)$, and $\operatorname{Ch}(C \times T) = \operatorname{Ch}(C) \cup \tau$ by Corollary 2.8. If by $pr^*C$ we denote the corresponding cycle over $C_0^\infty(V \times X) \rtimes \Gamma$, we have $\operatorname{Ch}(pr^*C) = \Delta^*(\operatorname{Ch}(C \times T)) = \Delta^*(\operatorname{Ch}(C) \cup \tau) = I(\operatorname{Ch}(C))$. □

Lemma 3.4. Suppose in addition to the conditions of Theorem 3.1 that Γ acts freely and properly on V. Then the statement of the theorem holds.

Proof. Since the group acts freely and properly, one can find a connection on E which is Γ-invariant. For the class of the cocycle χ written with the invariant connection the result follows easily from the definition of the map Φ. □

Proof of Theorem 3.1. Comparison of the construction from [7], [14] with the definition of the map Φ implies that (the class of) χ is in the image of Φ, $[\chi] = \Phi(\xi)$ for some (necessarily unique) $\xi \in H^*(V \times_\Gamma E\Gamma)$. We need to verify that $\xi = \operatorname{Ch}_\Gamma(E)$. We do this by showing that for any oriented manifold W and any continuous map $f : W \to V_\Gamma$, $f^*\xi = f^*\operatorname{Ch}_\Gamma(E)$. Let $\tilde W$ be the principal Γ-bundle obtained by pullback of the bundle $V \times E\Gamma \to V_\Gamma$, so that the following diagram is commutative, and $\tilde f$ is Γ-equivariant:

$$\begin{array}{ccc}
\tilde W & \xrightarrow{\ \tilde f\ } & V \times E\Gamma \\
\Big\downarrow & & \Big\downarrow \\
W & \xrightarrow{\ f\ } & V_\Gamma
\end{array} \tag{3.14}$$

We can write $\tilde f$ as a composition of two Γ-equivariant maps: $\tilde f_1 : \tilde W \to \tilde W \times V \times E\Gamma$, which embeds $\tilde W$ as the graph of $\tilde f$, and the projection $pr : \tilde W \times V \times E\Gamma \to V \times E\Gamma$. Let $\pi : (\tilde W \times V)_\Gamma \to V_\Gamma$ and $f_1 : W \to (\tilde W \times V)_\Gamma$ be the induced maps. We have $f = \pi f_1$.

Construct now the class $\chi' \in HP^n(C_0^\infty(\tilde W \times V) \rtimes \Gamma)$ using the bundle $pr^*E$ with connection $pr^*\nabla$. By Proposition 3.3, $\chi' = I(\chi)$, where $I : HP^*(C_0^\infty(V) \rtimes \Gamma) \to HP^{*+\dim W}(C_0^\infty(\tilde W \times V) \rtimes \Gamma)$. By Proposition 3.2, $\chi' = I(\chi) = I(\Phi(\xi)) = \Phi(\pi^*\xi)$. By Lemma 3.4, since Γ acts on $\tilde W \times V$ freely and properly, $\chi' = \Phi(\operatorname{Ch}(pr^*E))$. But since $\operatorname{Ch}(pr^*(E)) = \pi^*\operatorname{Ch}(E)$, using the injectivity of Φ we conclude that $\pi^*\operatorname{Ch}(E) = \pi^*\xi$.


Hence $f^*\operatorname{Ch}(E) = f_1^*\pi^*\operatorname{Ch}(E) = f_1^*\pi^*\xi = f^*\xi$. □

Remark 3.1 (Relation with Connes’ Godbillon-Vey cocycle). In the paper [5] Connes considers (in particular) the case of the circle $S^1$ acted on by the group of its diffeomorphisms $\operatorname{Diff}(S^1)$. Here we present Connes’ construction in the multidimensional case and indicate some relations with our construction of cyclic cocycles representing equivariant classes.

In the situation of the previous example take the bundle E to be $\bigwedge^n T^*X$. This is a 1-dimensional trivial bundle, naturally equipped with the action of the group $\Gamma = \operatorname{Diff}(X)$. Let φ be a nowhere zero section of this bundle, i.e. a volume form. Define a flat connection ∇ on E by

$$\nabla(f\phi) = df\,\phi, \quad f \in C^\infty(X). \tag{3.15}$$

We can thus define the cycle C over the algebra $C^\infty(X)$. Let now δ(g) be defined as above, and put

$$\mu(g) = \frac{\phi^g}{\phi} \in C^\infty(X). \tag{3.16}$$

Then µ is a cocycle, i.e.

$$\mu(gh) = \mu(h)^g\,\mu(g). \tag{3.17}$$

We also have

$$\delta(g) = d\log\mu(g). \tag{3.18}$$

Indeed, $\delta(g)\phi^g = \nabla\phi^g - (\nabla(\phi))^g = \nabla(\mu(g)\phi) = d\mu(g)\,\phi$ and

$$\delta(g) = \frac{d\mu(g)\,\phi}{\phi^g} = \frac{d\mu(g)}{\mu(g)} = d\log(\mu(g)).$$
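The cocycle identity (3.17) and its consequence for δ can be checked directly from (3.16). The sketch below assumes that the action on sections composes as $(\phi^h)^g = \phi^{gh}$, is multiplicative on quotients, and commutes with d; these conventions are assumptions made for the illustration, chosen so that they match (3.17) as stated.

```latex
% Insert \phi^g in numerator and denominator of \mu(gh):
\[
\mu(gh) = \frac{\phi^{gh}}{\phi}
        = \frac{(\phi^{h})^{g}}{\phi^{g}} \cdot \frac{\phi^{g}}{\phi}
        = \Bigl(\frac{\phi^{h}}{\phi}\Bigr)^{g}\, \mu(g)
        = \mu(h)^{g}\, \mu(g).
\]
% Taking d log of both sides and using (3.18):
\[
\delta(gh) = d\log\mu(h)^{g} + d\log\mu(g) = \delta(h)^{g} + \delta(g).
\]
```

The second line is the additive cocycle condition for δ. It agrees with what one gets from the definition $\delta(g) = \nabla - g \circ \nabla \circ g^{-1}$, since $\nabla - gh\nabla h^{-1}g^{-1} = (\nabla - g\nabla g^{-1}) + g(\nabla - h\nabla h^{-1})g^{-1}$.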

For every t we define a homomorphism $\rho_t : C^\infty(X) \rtimes \Gamma \to \operatorname{End}(E) \rtimes \Gamma$ by

$$\rho_t(aU_g) = a\,(\mu(g))^t\,U_g. \tag{3.19}$$

This is a homomorphism due to the cocycle property of µ; according to [5] it is the Tomita-Takesaki flow associated with the state given by the volume form φ. Consider now the transverse fundamental cycle Φ over the algebra $A = C^\infty(X) \rtimes \Gamma$ defined by the following data: the differential graded algebra $\Omega^*(X) \rtimes \Gamma$ with the differential $d(\omega U_g) = (d\omega)U_g$,


the graded trace $\fint$ on $\Omega^*(X) \rtimes \Gamma$ defined by

$$\fint \omega U_g = \begin{cases} \int_X \omega & \text{if } g = 1, \\ 0 & \text{otherwise,} \end{cases}$$

and the homomorphism $\rho = \rho_0 = \mathrm{id}$ from $A = C^\infty(X) \rtimes \Gamma$ to $C^\infty(X) \rtimes \Gamma$. The flow (3.19) acts on the cycle Φ by replacing $\rho_0$ by $\rho_t$. We call the cycle thus obtained $\Phi_t$. Using the identities

$$d(\rho_t(aU_g)) = \big(da + t\,a\,d\log\mu(g)\big)\,\mu(g)^t\,U_g = \big(da + t\,a\,\delta(g)\big)\,\mu(g)^t\,U_g \tag{3.20}$$

and

$$\mu(g_0)\,\mu(g_1)^{g_0}\,\mu(g_2)^{g_0g_1} \cdots \mu(g_k)^{g_0g_1 \cdots g_{k-1}} = \mu(g_0g_1 \cdots g_k), \tag{3.21}$$

we can explicitly compute $\operatorname{Ch}(\Phi_t)$. This is the cyclic n-cocycle with the only component of degree n. The result is:

$$\operatorname{Ch}(\Phi_t) = \sum_{j=0}^{n} t^j p_j, \tag{3.22}$$

where $p_j$ is the cyclic cocycle given by

$$p_j(a_0U_{g_0}, a_1U_{g_1}, \dots, a_nU_{g_n}) = \frac{1}{n!} \sum_{1 \le i_1 < \dots < i_j \le n} \int_X a_0\, da_1^{\gamma_1}\, da_2^{\gamma_2} \cdots da_{i_1-1}^{\gamma_{i_1-1}}\, a_{i_1}^{\gamma_{i_1}}\, da_{i_1+1}^{\gamma_{i_1+1}} \cdots\ \Theta_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k) \tag{3.23}$$

for $g_0g_1 \cdots g_k = 1$ and 0 otherwise, where we define as before $\gamma_j = g_0g_1 \cdots g_{j-1}$, and the j-form $\Theta_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k)$ is given by

$$\Theta_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k) = \delta(g_{i_1})^{\gamma_{i_1}}\, \delta(g_{i_2})^{\gamma_{i_2}} \cdots \delta(g_{i_j})^{\gamma_{i_j}}. \tag{3.24}$$

In particular, $p_0$ is the transverse fundamental class. Comparing these formulas with the formulas in the previous example we obtain

Proposition 3.5. Let $\Phi_1$ be the image of the transverse fundamental cycle Φ under the action of the Tomita-Takesaki flow for the time 1. Let C be the cycle over $C^\infty(X) \rtimes \Gamma$ associated to the equivariant bundle $\bigwedge^n T^*X$ with the connection from (3.15). Then, on the level of cocycles, $\operatorname{Ch}(\Phi_1) = \operatorname{Ch}(C)$.

We now sketch a construction of a family of chains $\Psi_s$ providing the cobordism between $\Phi_0$ and $\Phi_s$, $s \in \mathbb{R}$. The algebra is $\Omega^* = \Omega^*([0,s]) \hat\otimes \Omega^*(X) \rtimes \Gamma$. The homomorphism from A to $\Omega^0$ maps $aU_g \in \Omega^*(X) \rtimes \Gamma$ to $a\mu(g)^tU_g$, where t is the variable on [0, s]. The connection is given by $1 \hat\otimes \nabla + d \hat\otimes 1$, where d is the de Rham differential,


and the curvature is 0. The restriction map is given by the restriction to the endpoints of the interval, and the graded trace is given by

$$\fint \alpha \hat\otimes (\omega U_g) = (-1)^{\deg\omega} \int_{[0,s]} \alpha \int_X \omega$$

if deg α = 1 and g = 1, and 0 otherwise. This chain provides a cobordism between $\Phi_0$ and $\Phi_s$. Its character is given by the formula

$$\operatorname{Ch}(\Psi_s) = \sum_{j=1}^{n+1} s^j q_j, \tag{3.25}$$

where $q_j$ is the cyclic cochain given by

$$q_j(a_0U_{g_0}, a_1U_{g_1}, \dots, a_nU_{g_n}) = \frac{1}{n!} \sum_{1 \le i_1 < \dots < i_j} \int_X a_0\, da_1^{\gamma_1}\, da_2^{\gamma_2} \cdots da_{i_1-1}^{\gamma_{i_1-1}}\, a_{i_1}^{\gamma_{i_1}}\, da_{i_1+1}^{\gamma_{i_1+1}} \cdots\ \Xi_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k) \tag{3.26}$$

for $g_0g_1 \cdots g_k = 1$ and 0 otherwise, where we define as before $\gamma_j = g_0g_1 \cdots g_{j-1}$, and the (j − 1)-form $\Xi_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k)$ is given by

$$\Xi_{i_1,i_2,\dots,i_j}(\gamma_1, \dots, \gamma_k) = \frac{1}{j} \sum_{l=1}^{j} (-1)^l\, \delta(g_{i_1})^{\gamma_{i_1}}\, \delta(g_{i_2})^{\gamma_{i_2}} \cdots \log\mu(g_{i_l})^{g_{i_l}} \cdots \delta(g_{i_j})^{\gamma_{i_j}}. \tag{3.27}$$

Comparing this formula with (3.22) we obtain:

Proposition 3.6. Let $p_j$, j = 1, …, n be the cochains defined in (3.22), (3.23), and let $q_j$, j = 1, …, n + 1 be from (3.25), (3.26). Then for j = 1, …, n we have

$$B q_j = p_j \quad \text{and} \quad b q_j = 0. \tag{3.28}$$

Also

$$B q_{n+1} = 0 \quad \text{and} \quad b q_{n+1} = 0. \tag{3.29}$$

In particular all $p_j$ define trivial classes in periodic cyclic cohomology, and $q_{n+1}$ is a cyclic cocycle. The cocycle $q_{n+1}$ should represent (up to a constant) the Godbillon-Vey class in cyclic cohomology (i.e. the class defined by $h_1c_1^n$, while $p_j$ and $q_j$ represent the forms $c_1^j$ and $h_1c_1^j$, j = 1, …, n, see [2]).


Remark 3.2 (Transverse fundamental class). The construction of the equivariant characteristic classes works equally well in the case of a foliation. The new ingredient required here is Connes’ construction of the transverse fundamental (generalized) cycle. We will now write a simple formula for the character of this cycle. We start by briefly recalling Connes’ construction from [6], where details can be found.

Let (V, F) be a transversely oriented foliated manifold, F being an integrable subbundle of TV. The graph of the foliation G is a groupoid, whose objects are points of V and whose morphisms are equivalence classes of paths in the leaves, with equivalence given by holonomy. Equipped with a suitable topology it becomes a smooth (possibly non-Hausdorff) manifold. By r and s we denote the range and source maps G → V. By $\Omega_F^{1/2}$ we denote the line bundle on V of the half-densities in the direction of F.

Let $A = C_0^\infty\big(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2})\big)$ be the convolution algebra of G. We define a (nonunital) generalized cycle over the algebra A as follows. The k-th component of the graded algebra $\Omega$ is given by $C_0^\infty\big(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2}) \otimes r^*(\bigwedge^k \tau^*)\big)$. Here τ = TV/F is the normal bundle, and the product $\Omega^k \otimes \Omega^l \to \Omega^{k+l}$ is induced by the convolution and the exterior product.

The definition of the transverse differentiation (connection) requires a choice of a subbundle H ⊂ TV, complementary to F. This choice allows one to identify $C^\infty(V, \bigwedge^* T^*V)$ with $C^\infty(V, \bigwedge^* F^* \otimes \bigwedge^* \tau^*)$. We say that a form $\omega \in C^\infty(V, \bigwedge^* T^*V)$ is of the type (r, s) if it is in $C^\infty(V, \bigwedge^r F^* \otimes \bigwedge^s \tau^*)$ under this identification. For such a form we have

$$d\omega = d_V\omega + d_H\omega + \sigma\omega, \tag{3.30}$$

where $d_V\omega$, $d_H\omega$, $\sigma\omega$ are defined to be the components of dω of the types (r + 1, s), (r, s + 1), (r − 1, s + 2) respectively (our notations are slightly different from those of [6]). Now, writing locally $\rho \in C^\infty(V, \Omega_F^{1/2})$ as $\rho = f|\omega|^{1/2}$, $f \in C^\infty(V)$, $\omega \in C^\infty(V, \bigwedge^{\dim F} F^*)$, we define

$$d_H\rho = (d_Hf)\,|\omega|^{1/2} + f\,|\omega|^{1/2}\,\frac{d_H\omega}{2\omega}. \tag{3.31}$$

Finally, $d_H$ can be extended uniquely as a graded derivation of the graded algebra $C_0^\infty\big(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2}) \otimes r^*(\bigwedge^* \tau^*)\big)$ so that the following identities are satisfied:

$$d_H\big(r^*(\rho_1)\,f\,s^*(\rho_2)\big) = r^*(d_H\rho_1)\,f\,s^*(\rho_2) + r^*(\rho_1)\,d_Hf\,s^*(\rho_2) + r^*(\rho_1)\,f\,s^*(d_H\rho_2) \quad \text{for } \rho_1, \rho_2 \in C^\infty(V, \Omega_F^{1/2}),\ f \in C_0^\infty(G) \tag{3.32}$$

and

$$d_H\big(\phi\,r^*(\omega)\big) = d_H(\phi)\,r^*(\omega) + \phi\,r^*(d_H\omega) \quad \text{for } \phi \in C_0^\infty\big(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2})\big),\ \omega \in C^\infty\big(V, \textstyle\bigwedge^* \tau^*\big). \tag{3.33}$$


Now, for a form $\omega$, $d_H^2\omega = -(d_V\sigma + \sigma d_V)\omega$. The operator $\theta = -(d_V\sigma + \sigma d_V)$ contains only longitudinal Lie derivatives, and hence defines a multiplier (of degree 2) of the algebra $C_0^\infty\bigl(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2}) \otimes r^*(\Lambda^*\tau^*)\bigr)$.

Finally, the graded trace $\fint$ on $C_0^\infty\bigl(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2}) \otimes r^*(\Lambda^q\tau^*)\bigr)$, $q = \operatorname{codim} F$, is given by $\fint\omega = \int_V\omega$.

Lemma 3.7 ([6]). $\bigl(C_0^\infty\bigl(G, s^*(\Omega_F^{1/2}) \otimes r^*(\Omega_F^{1/2}) \otimes r^*(\Lambda^q\tau^*)\bigr), d_H, \theta, \fint\bigr)$ is a generalized cycle of degree $q$ over the algebra $A$.

We can now write an explicit formula for the character of this cycle.

Proposition 3.8. The following formula defines a (reduced) cyclic cocycle $\chi$ in the $(b, B)$-bicomplex of the algebra $A$ (with adjoined unit):
\[
\chi^k(\phi_0, \phi_1, \dots, \phi_k) = \frac{(-1)^{\frac{q-k}{2}}}{\left(\frac{q+k}{2}\right)!}\sum_{i_0+\dots+i_k = \frac{q-k}{2}}\int_V \phi_0\,\theta^{i_0}\,d_H(\phi_1)\dots d_H(\phi_k)\,\theta^{i_k}. \tag{3.34}
\]

Here $k = q, q-2, \dots$, and $\phi_j$, $j \ge 1$, are elements of $A$, while $\phi_0$ is an element of $A$ with unit adjoined. Recall that for $q$ even, to define the cocycle over $A$ with the unit adjoined, we extend $\fint$ by requiring that $\fint\theta^{q/2} = 0$.

The resulting class is independent of the choice of $H$. This follows from the fact that by varying the subbundle $H$ smoothly we obtain a cobordism between the corresponding cycles, satisfying the conditions of Lemma 2.7. Note that the equality here is in cyclic cohomology, not only in periodic cyclic cohomology. The results of Sect. 2 imply that the class of the cocycle $\chi$ is the transverse fundamental class of the foliation, as defined in [6].

4. Fredholm Modules

In this section we write formulas for the character of the generalized cycle associated with a finitely summable bounded Fredholm module (cf. [4]). In other words, we obtain a formula for the character of a Fredholm module. We show that this definition agrees with Connes' definition [4].

Let $(H, F, \gamma)$ be an even finitely summable bounded Fredholm module over the algebra $A$. Here $H$ is a Hilbert space on which the algebra $A$ acts, $\gamma$ is a $\mathbb{Z}_2$-grading on $H$, and $F$ is an odd selfadjoint operator on $H$. We assume that $A$ is represented by even operators in $H$, and since we almost always consider only one representation of $A$, we drop this representation from our notation and do not distinguish between elements of the algebra and the corresponding operators. We suppose that the algebra $A$ is unital. Let $p$ be a number such that $[F, a] \in L^p$ and $(F^2 - 1) \in L^{p/2}$. We remark that for any $p$-summable Fredholm module one can achieve these summability conditions by altering the operator $F$ and keeping all the other data intact.

We associate with the Fredholm module a generalized cycle similarly to [4], where it is done in the case when $F^2 = 1$. Consider a $\mathbb{Z}$-graded algebra $\Omega = \bigoplus_{m=0}^\infty \Omega^m$ generated by the symbols $a \in A$ of degree 0, $[F, a]$, $a \in A$, of degree 1, and the symbol $(F^2 - 1)$ of degree 2, with the relation $[F, ab] = a[F, b] + [F, a]b$. This algebra can be naturally represented on the Hilbert space $H$, and we will not distinguish in our notation between elements of the algebra and the corresponding operators. $\Omega$ is equipped with a natural connection $\nabla$, given by the formula $\nabla(\xi) = [F, \xi]$ (graded commutator) in terms of the representation of $\Omega$, or on generators by the formulas
\[
\nabla(a) = [F, a], \tag{4.1}
\]
\[
\nabla([F, a]) = (F^2 - 1)a - a(F^2 - 1) = [(F^2 - 1), a], \tag{4.2}
\]
\[
\nabla(F^2 - 1) = 0. \tag{4.3}
\]

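Notice that $\nabla^2(\xi) = [(F^2 - 1), \xi]$ for $\xi \in \Omega$. Hence we define the curvature $\theta$ to be $(F^2 - 1)$. Clearly, $\xi \in \Omega^n$ is of trace class if $n \ge p$. Here we need to choose $n$ to be even, $n = 2m$. Hence we can define the graded trace $\fint$ on $\Omega^n$ by $\fint\xi = m!\,\mathrm{Tr}\,\gamma\xi$. The equality $\mathrm{Tr}\,\gamma\nabla(\xi) = 0$ for $\xi \in \Omega^{n-1}$ follows from the relation
\[
\mathrm{Tr}\,\gamma\omega = \frac{1}{2}\mathrm{Tr}\,\gamma F\nabla(\omega) - \mathrm{Tr}\,\gamma(F^2 - 1)\omega
\]
(which holds for $\omega$ of trace class). Indeed, for $\xi \in \Omega^{n-1}$, $\nabla(\xi)$ is of trace class and
\[
\mathrm{Tr}\,\gamma\nabla(\xi) = \frac{1}{2}\mathrm{Tr}\,\gamma F\nabla^2(\xi) - \mathrm{Tr}\,\gamma(F^2 - 1)\nabla(\xi) = \frac{1}{2}\mathrm{Tr}\,\gamma F[(F^2 - 1), \xi] - \mathrm{Tr}\,\gamma(F^2 - 1)[F, \xi] = 0. \tag{4.4}
\]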

Now we can apply the formula (2.4) to obtain a cyclic cocycle $\mathrm{Ch}^{2m}(F)$ in the cyclic bicomplex of the algebra $A$. Its components $\mathrm{Ch}^{2m}_k(F)$, $k = 0, 2, 4, \dots, 2m$, are given by the formula
\[
\mathrm{Ch}_k(F)(a_0, a_1, \dots, a_k) = \frac{m!}{(m + \frac{k}{2})!}\sum_{i_0+i_1+\dots+i_k = m - \frac{k}{2}} \mathrm{Tr}\,\gamma a_0(1 - F^2)^{i_0}[F, a_1](1 - F^2)^{i_1}\dots[F, a_k](1 - F^2)^{i_k}. \tag{4.5}
\]
Note that in the case when $F^2 = 1$ we get the formula from [4], normalized as in [6].

We will now associate a generalized chain with a homotopy between Fredholm modules. If the two Fredholm modules $(H, F_0, \gamma)$ and $(H, F_1, \gamma)$ are connected by a smooth operator homotopy (meaning that there exists a $C^1$ family $F_t$ of operators with $[F_t, a] \in L^p$ and $(F_t^2 - 1) \in L^{p/2}$, $t \in [0, 1]$, with $F_t|_{t=0} = F_0$, $F_t|_{t=1} = F_1$), this generalized chain will provide a cobordism between the cycles corresponding to the modules.

We start by constructing, exactly as before, an algebra $\Omega_t$ generated by the elements $a$, $[F_t, a]$, $(F_t^2 - 1)$, with the connection $\nabla_t$ and the curvature $\theta_t = (F_t^2 - 1)$. For each $t \in [0, 1]$ one constructs a natural representation $\pi_t$ of this algebra on the Hilbert space $H$. Let $\Omega^*([0, 1])$ be the DGA of the differential forms on the interval $[0, 1]$ with the usual differential $d$. We can form a graded tensor product $\Omega^*([0, 1]) \hat\otimes \Omega_t$. Choose an odd number $n = 2m + 1$ so that $n \ge p + 2$; if in addition we suppose that $\frac{dF_t}{dt} \in L^p$, we can choose $n \ge p + 1$. In order to define the connection and the curvature we will have to adjoin to our algebra an element of degree 2, $dt \hat\otimes \frac{dF_t}{dt}$, and an element of degree 3, $dt \hat\otimes \bigl(F_t\frac{dF_t}{dt} + \frac{dF_t}{dt}F_t\bigr)$. The algebra with the adjoined elements will be denoted $\Omega_c$. The homomorphism $\rho_c : A \to \Omega_c$ is given by $\rho_c(a) = 1 \hat\otimes a$. We define the connection $\nabla_c$ as $dt \wedge \frac{d}{dt} + \nabla_t$, i.e. on the generators the definition is the following ($\beta \in \Omega^*([0, 1])$):

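\[
\nabla_c(\beta \hat\otimes a) = d\beta \hat\otimes a + (-1)^{\deg(\beta)}\beta \hat\otimes [F_t, a], \tag{4.6}
\]
\[
\nabla_c(\beta \hat\otimes [F_t, a]) = d\beta \hat\otimes [F_t, a] + (-1)^{\deg(\beta)}\beta \hat\otimes [(F_t^2 - 1), a] + \beta \wedge dt \hat\otimes \Bigl[\frac{dF_t}{dt}, a\Bigr], \tag{4.7}
\]
\[
\nabla_c(\beta \hat\otimes (F_t^2 - 1)) = d\beta \hat\otimes (F_t^2 - 1) + \beta \wedge dt \hat\otimes \Bigl(F_t\frac{dF_t}{dt} + \frac{dF_t}{dt}F_t\Bigr), \tag{4.8}
\]
\[
\nabla_c\Bigl(dt \hat\otimes \frac{dF_t}{dt}\Bigr) = -dt \hat\otimes \Bigl(F_t\frac{dF_t}{dt} + \frac{dF_t}{dt}F_t\Bigr), \tag{4.9}
\]
\[
\nabla_c\Bigl(dt \hat\otimes \Bigl(F_t\frac{dF_t}{dt} + \frac{dF_t}{dt}F_t\Bigr)\Bigr) = dt \hat\otimes \Bigl[(F_t^2 - 1), \frac{dF_t}{dt}\Bigr]. \tag{4.10}
\]
The curvature $\theta_c$ of this connection is defined as
\[
\theta_c = 1 \hat\otimes (F_t^2 - 1) + dt \hat\otimes \frac{dF_t}{dt}, \tag{4.11}
\]
and the identity $(\nabla_c)^2\,\cdot = [\theta_c, \cdot]$ is verified by computation. One then defines the graded trace $\fint_c$ on $(\Omega^*([0, 1]) \hat\otimes \Omega_t)^n$ by the formula
\[
\fint_c \beta \hat\otimes \xi =
\begin{cases}
(-1)^{\deg(\xi)}\,m!\int_{[0,1]} \beta\,\mathrm{Tr}\,\gamma\pi_t(\xi) & \text{if } \beta \in \Omega^1([0, 1]),\\
0 & \text{if } \beta \in \Omega^0([0, 1]).
\end{cases}
\]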

The restriction maps $r_0 : \Omega_c \to \Omega_0$ and $r_1 : \Omega_c \to \Omega_1$ are defined as follows: $r_0(\beta \hat\otimes \xi)$ is 0 if $\beta$ is of degree 1, and $\beta(0)\xi_0$ if $\beta$ is of degree 0, where $\xi_0$ is obtained from $\xi$ by replacing $F_t$ by $F_0$; $r_1$ is defined similarly. One can check that the map $r_1 \oplus r_0$ identifies $\partial\Omega_c$ with $\Omega_1 \oplus \widetilde{\Omega}_0$ and provides the required cobordism. Now we can use Theorem 2.1 to study the properties of $\mathrm{Ch}(F)$ with respect to the operator homotopy.

Theorem 4.1. Suppose $(H, F_0, \gamma)$ and $(H, F_1, \gamma)$ are two finitely summable Fredholm modules over an algebra $A$ which are connected by the smooth operator homotopy $F_t$, and $p$ is a number such that $[F_t, a] \in L^p$ and $(F_t^2 - 1) \in L^{p/2}$ for $0 \le t \le 1$. Choose $m$ such that $2m \ge p + 1$. Then $\mathrm{Ch}^{2m}(F_0) = \mathrm{Ch}^{2m}(F_1)$ in $HC^{2m}(A)$. If moreover $\frac{dF_t}{dt} \in L^p$, one can choose $m$ such that $2m \ge p$.

Proof. Let $\mathrm{Tch}^{2m}_k$ denote the $k$-th component of the character of the chain constructed above, which provides the cobordism between the cycles associated with $(H, F_0, \gamma)$ and $(H, F_1, \gamma)$, $k = 1, 3, \dots, 2m + 1$. It can be defined under the conditions on $m$ specified in the theorem. According to Theorem 2.1,
\[
\mathrm{Ch}^{2m}(F_1) - \mathrm{Ch}^{2m}(F_0) = (b + B)\,\mathrm{Tch}^{2m}.
\]


Now,
\[
\mathrm{Tch}^{2m}_{2m+1}(a_0, a_1, \dots, a_{2m+1}) = \mathrm{const}\fint_c \rho_c(a_0)\nabla_c(\rho_c(a_1))\dots\nabla_c(\rho_c(a_{2m+1})) = 0
\]
(since the term under the $\fint_c$ does not contain $dt$). Hence $\mathrm{Tch}^{2m}$ can be considered as a $(2m - 1)$-chain (it is in the image of $S$), and the result follows. □

Remark 4.1. Suppose we have two Fredholm modules $(H, F_0, \gamma)$ and $(H, F_1, \gamma)$ such that $F_0 - F_1 \in L^p$ and $F_i^2 - 1 \in L^{p/2}$, $i = 0, 1$. Then $\mathrm{Ch}^{2m}(F_0) = \mathrm{Ch}^{2m}(F_1)$, $2m \ge p$. Indeed, we can apply Theorem 4.1 to the linear homotopy $F_t = F_0 + t(F_1 - F_0)$, and need only verify that $F_t^2 - 1 \in L^{p/2}$. But
\[
F_t^2 - 1 = (F_0^2 - 1) + t\bigl(F_0(F_1 - F_0) + (F_1 - F_0)F_0\bigr) + t^2(F_1 - F_0)^2.
\]
The first and the last terms on the right-hand side are always in $L^{p/2}$, and since the left-hand side is in $L^{p/2}$ for $t = 1$, $F_0(F_1 - F_0) + (F_1 - F_0)F_0 \in L^{p/2}$.

Corollary 4.2. Let $e$ be an idempotent in $M_N(A)$, and $(H, F, \gamma)$ be an even Fredholm module over $A$. Construct the Fredholm operator $F_e = e(F \otimes 1)e : H_+ \otimes \mathbb{C}^N \to H_- \otimes \mathbb{C}^N$ (where $H_+$ and $H_-$ are determined by the grading). Then
\[
\mathrm{index}(F_e) = \langle \mathrm{Ch}^*(F), \mathrm{Ch}_*(e) \rangle.
\]
Here $\mathrm{Ch}_*(e)$ is the usual Chern character in cyclic homology.

Proof. By replacing $A$ by $M_N(A)$ we reduce the situation to the case when $e \in A$. Now we apply Connes' construction, which uses the homotopy $F_t = F + t(1 - 2e)[F, e]$ connecting $F$ (obtained when $t = 0$) with the operator $F_1 = eFe + (1 - e)F(1 - e)$, obtained when $t = 1$. Note that $1 - F_t^2 \in L^{p/2}$. Indeed,
\[
F_t^2 - 1 = F^2 - 1 + t^2\bigl((1 - 2e)[F, e]\bigr)^2 + t\,[F, (1 - 2e)[F, e]].
\]
The first two terms are clearly in $L^{p/2}$. As for the third one, it can be rewritten as
\[
-2[F, e][F, e] + (1 - 2e)[F, [F, e]] = -2[F, e]^2 + (1 - 2e)[(F^2 - 1), e] \in L^{p/2}.
\]
The operator $F_1$ commutes with $e$, and homotopy does not change the pairing. Hence it is enough to prove the result in the case when $F$ and $e$ commute. In this case all of the terms in the formula for the pairing that involve commutators are 0, hence the only term with nonzero contribution is
\[
\mathrm{Ch}_0(F)(e) = \mathrm{Tr}\,\gamma e(1 - F^2)^m = \mathrm{Tr}\,\gamma(e - (eFe)^2)^m = \mathrm{index}(F_e)
\]
by the well known formula. □

In [4] Connes provides a canonical construction which associates with every $p$-summable Fredholm module such that $F^2 - 1 \ne 0$ another one for which $F^2 - 1 = 0$, and which defines the same K-homology class. This reduces the definition of the character of a general Fredholm module to the case when $F^2 = 1$. The construction is the following. Given the Fredholm module $(H, F, \gamma)$, one first constructs the Hilbert space $\widetilde{H} = H \oplus H$ with the grading given by $\widetilde{\gamma} = \gamma \oplus (-\gamma)$. An element $a \in A$ acts by $\begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}$. Then one constructs an operator $\widetilde{F}$ such that $\widetilde{F} - F' \in L^p$ and $\widetilde{F}^2 = 1$; here by $F'$ we denote $\begin{pmatrix} F & 0 \\ 0 & -F \end{pmatrix}$. The character of the Fredholm module $(H, F, \gamma)$ is then defined to be the character of $(\widetilde{H}, \widetilde{F}, \widetilde{\gamma})$.
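The identity $\mathrm{Tr}\,\gamma e(1 - F^2)^m = \mathrm{index}(F_e)$ used at the end of the proof of Corollary 4.2 can be checked by hand in a finite-dimensional toy model. The sketch below is only an illustration and not part of the paper: it assumes $H_+ = \mathbb{C}^2$, $H_- = \mathbb{C}^1$, $e = 1$, and an arbitrarily chosen odd symmetric matrix `F`; the helper names are hypothetical. In finite dimensions the index of $F_+ : H_+ \to H_-$ is $\dim H_+ - \dim H_- = 1$, and the graded trace should return this value for every $m$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def graded_trace_power(f, gamma, m):
    """Return Tr(gamma (1 - F^2)^m) for a diagonal grading gamma (list of +1/-1)."""
    n = len(f)
    one = identity(n)
    f2 = matmul(f, f)
    one_minus_f2 = [[one[i][j] - f2[i][j] for j in range(n)] for i in range(n)]
    power = identity(n)
    for _ in range(m):
        power = matmul(power, one_minus_f2)
    return sum(gamma[i] * power[i][i] for i in range(n))


# Toy data (an assumption for illustration): H+ = C^2, H- = C^1.
a, b = Fraction(1, 2), Fraction(1, 3)
F = [[0, 0, a],
     [0, 0, b],
     [a, b, 0]]           # odd (off-diagonal) and symmetric
GAMMA = [1, 1, -1]        # grading: +1 on H+, -1 on H-
INDEX = 2 - 1             # dim H+ - dim H- = index of F+ : H+ -> H-

values = [graded_trace_power(F, GAMMA, m) for m in range(5)]
print(values)
```

The independence of $m$ reflects the fact that the nonzero eigenvalues of $F_-F_+$ and $F_+F_-$ coincide, so only the kernels contribute to the graded trace.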


Theorem 4.3. Let $(H, F, \gamma)$ be an even finitely summable Fredholm module over the algebra $A$, and let $p$ be a real number such that $[F, a] \in L^p$ and $(F^2 - 1) \in L^{p/2}$. Then the class of $\mathrm{Ch}^*(F)$ defined in (4.5) in the periodic cyclic cohomology coincides with the Chern character, as defined by Connes [4].

Proof. First, let us consider the Fredholm module $(\widetilde{H}, F', \widetilde{\gamma})$ over the algebra $\widetilde{A}$, the algebra $A$ with adjoined unit (acting by the identity operator). Then $\mathrm{Ch}^{2m}(F')$ defines a class in the cyclic cohomology of $\widetilde{A}$, where we choose $2m \ge p$. Since $\mathrm{Tr}\,\widetilde{\gamma}(1 - (F')^2)^m = 0$, it defines a class in the reduced cyclic cohomology of $\widetilde{A}$, and hence in the cyclic cohomology of $A$. It coincides with the class defined by the Fredholm module $(H, F, \gamma)$. Theorem 4.1 and Remark 4.1 show that the classes defined by $\mathrm{Ch}(\widetilde{F})$ and $\mathrm{Ch}(F')$ coincide. To finish the proof we note that $\mathrm{Ch}(\widetilde{F})$ coincides with the Chern character as defined in [4]. □

The proof of Theorem 4.1 also provides an explicit transgression formula. We just need to compute explicitly the formula for
\[
\mathrm{Tch}^{2m}_k(a_0, a_1, \dots, a_k) = (-1)^{m - \frac{k-1}{2}}\,\frac{m!}{(m + \frac{k+1}{2})!}\sum_{i_0+i_1+\dots+i_k = m - \frac{k-1}{2}}\fint_c \rho_c(a_0)\theta_c^{i_0}\nabla_c(\rho_c(a_1))\theta_c^{i_1}\dots\nabla_c(\rho_c(a_k))\theta_c^{i_k}. \tag{4.12}
\]
Since
\[
\theta_c^{i_l} = 1 \hat\otimes (F_t^2 - 1)^{i_l} + \sum_{r+q = i_l - 1} dt \hat\otimes (F_t^2 - 1)^r\,\frac{dF_t}{dt}\,(F_t^2 - 1)^q,
\]
and only the terms containing $dt$ contribute to $\fint_c$, one can rewrite this formula as
\[
(-1)^{m - \frac{k-1}{2}}\,\frac{m!}{(m + \frac{k+1}{2})!}\int_0^1 \sum_{i_0+i_1+\dots+i_k = m - \frac{k-1}{2}}\ \sum_{l=0}^{k}\ \sum_{r+q = i_l - 1} (-1)^l\,\mathrm{Tr}\,\gamma a_0(F_t^2 - 1)^{i_0}[F_t, a_1](F_t^2 - 1)^{i_1}\dots[F_t, a_l](F_t^2 - 1)^r\,\frac{dF_t}{dt}\,(F_t^2 - 1)^q\dots[F_t, a_k](F_t^2 - 1)^{i_k}\,dt. \tag{4.13}
\]
Finally we can write the answer as
\[
\mathrm{Tch}^{2m}_k(a_0, a_1, \dots, a_k) = -\frac{m!}{(m + \frac{k+1}{2})!}\int_0^1 \sum_{i_0+\dots+i_k+i_{k+1} = m - \frac{k+1}{2}}\ \sum_{l=0}^{k} (-1)^l\,\mathrm{Tr}\,\gamma a_0(1 - F_t^2)^{i_0}[F_t, a_1](1 - F_t^2)^{i_1}\dots[F_t, a_l](1 - F_t^2)^{i_l}\,\frac{dF_t}{dt}\,(1 - F_t^2)^{i_{l+1}}\dots[F_t, a_k](1 - F_t^2)^{i_{k+1}}\,dt, \tag{4.14}
\]

where $k$ is an odd number between 1 and $2m - 1$.

All the considerations above can be repeated in the case of an odd finitely summable Fredholm module $(H, F)$ over an algebra $A$. Here as before we suppose that $[F, a] \in L^p$, $(F^2 - 1) \in L^{p/2}$. We choose the number $m$ such that $n = 2m + 1 \ge p$. The trace now is given by $\fint\xi = \sqrt{2i}\,\Gamma(n/2 + 1)\,\mathrm{Tr}\,\xi$. The corresponding Chern character $\mathrm{Ch}^{2m+1}(F)$ has components $\mathrm{Ch}^{2m+1}_k$ for $k = 1, 3, \dots, 2m + 1$, given by the formula
\[
\mathrm{Ch}^{2m+1}_k(a_0, a_1, \dots, a_k) = \frac{\Gamma(m + \frac{3}{2})\sqrt{2i}}{(m + \frac{k+1}{2})!}\sum_{i_0+i_1+\dots+i_k = m - \frac{k-1}{2}} \mathrm{Tr}\,a_0(1 - F^2)^{i_0}[F, a_1](1 - F^2)^{i_1}\dots[F, a_k](1 - F^2)^{i_k}. \tag{4.15}
\]

If the two Fredholm modules are connected via the operator homotopy $F_t$, one has the transgression formula
\[
\mathrm{Ch}^{2m+1}(F_1) - \mathrm{Ch}^{2m+1}(F_0) = (b + B)\,\mathrm{Tch}^{2m+1}, \tag{4.16}
\]
where $\mathrm{Tch}^{2m+1}$ is a $2m$ cyclic cochain having components $\mathrm{Tch}^{2m+1}_k$ for $k$ even between 0 and $2m$, given by the formula
\[
\mathrm{Tch}^{2m+1}_k(a_0, a_1, \dots, a_k) = -\frac{\Gamma(m + \frac{3}{2})\sqrt{2i}}{(m + \frac{k}{2} + 1)!}\int_0^1 \sum_{i_0+\dots+i_k+i_{k+1} = m - \frac{k}{2}}\ \sum_{l=0}^{k} (-1)^l\,\mathrm{Tr}\,a_0(1 - F_t^2)^{i_0}[F_t, a_1](1 - F_t^2)^{i_1}\dots[F_t, a_l](1 - F_t^2)^{i_l}\,\frac{dF_t}{dt}\,(1 - F_t^2)^{i_{l+1}}\dots[F_t, a_k](1 - F_t^2)^{i_{k+1}}\,dt. \tag{4.17}
\]

The proof of Theorem 4.3 works in the odd situation as well and shows that $\mathrm{Ch}^*(F)$ coincides with the Chern character as defined by Connes. In particular, this allows one to recover the spectral flow via the pairing with K-theory. More precisely, let $u \in M_N(A)$ be a unitary. Let $\mathrm{sf}(F \otimes 1, (F \otimes 1)^u)$ be the spectral flow of the operators $F \otimes 1$ and $(F \otimes 1)^u = u(F \otimes 1)u^*$ acting on the space $H \otimes \mathbb{C}^N$. The Chern character of the class of $u$ in $K_1(A)$ is the periodic cyclic cycle defined by
\[
\mathrm{Ch}_*(u) = \frac{1}{2\sqrt{2\pi i}}\sum_{l=1}^{\infty}(-1)^l(l - 1)!\,\mathrm{tr}\bigl((u \otimes u^{-1})^l - (u^{-1} \otimes u)^l\bigr). \tag{4.18}
\]
Then we have the following

Corollary 4.4. Let $u \in A$ be a unitary, and $(H, F)$ be an odd Fredholm module over the algebra $A$. Then
\[
\langle \mathrm{Ch}^*(F), \mathrm{Ch}_*(u) \rangle = \mathrm{sf}(F \otimes 1, (F \otimes 1)^u).
\]
Remark 4.2. This is a finitely summable analogue of the result of Getzler [8]. In the finitely summable case an analytic formula for the spectral flow was derived in [3]; the use of Theorem 4.3 gives a proof of Corollary 4.4 without using this formula.

Acknowledgement. I would like to thank my advisor H. Moscovici for introducing me to the area and for constant support. I would also like to thank D. Burghelea and I. Zakharevich for helpful discussions.


References

1. Berline, N., Getzler, E. and Vergne, M.: Heat kernels and Dirac operators. Volume 298 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 1992
2. Bott, R.: On some formulas for the characteristic classes of group-actions. In: Differential topology, foliations and Gelfand–Fuks cohomology (Proc. Sympos., Pontifícia Univ. Católica, Rio de Janeiro, 1976). Volume 652 of Lecture Notes in Math., Berlin: Springer, 1978, pp. 25–61
3. Carey, A. and Phillips, J.: Unbounded Fredholm modules and spectral flow. Canad. J. Math. 50 (4), 673–718 (1998)
4. Connes, A.: Non-commutative differential geometry. Publ. Math. IHES 62, 257–360 (1985)
5. Connes, A.: Cyclic cohomology and the transverse fundamental class of a foliation. In: Geometric methods in operator algebras (Kyoto, 1983). Volume 123 of Pitman Res. Notes Math. Ser. Harlow: Longman Sci. Tech., 1986, pp. 52–144
6. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
7. Feĭgin, B.L. and Tsygan, B.L.: Cyclic homology of algebras with quadratic relations, universal enveloping algebras and group algebras. In: K-theory, arithmetic and geometry (Moscow, 1984–1986), Berlin: Springer, 1987, pp. 210–239
8. Getzler, E.: The odd Chern character in cyclic homology and spectral flow. Topology 32 (3), 489–507 (1993)
9. Getzler, E. and Szenes, A.: On the Chern character of a theta-summable Fredholm module. J. Funct. Anal. 84 (2), 343–357 (1989)
10. Jaffe, A., Lesniewski, A. and Osterwalder, K.: Quantum K-theory. I. The Chern character. Commun. Math. Phys. 118 (1), 1–14 (1988)
11. Loday, J.-L.: Cyclic homology. Volume 301 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 1992
12. Nest, R. and Tsygan, B.: Algebraic index theorem. Commun. Math. Phys. 172 (2), 223–262 (1995)
13. Nest, R. and Tsygan, B.: Algebraic index theorem for families. Adv. Math. 113 (2), 151–205 (1995)
14. Nistor, V.: Group cohomology and the cyclic cohomology of crossed products. Invent. Math. 99 (2), 411–424 (1990)
15. Nistor, V.: Super-connections and non-commutative geometry. In: Cyclic cohomology and noncommutative geometry (Waterloo, ON, 1995). Volume 17 of Fields Inst. Commun. Providence, RI: Amer. Math. Soc., 1997, pp. 115–136
16. Quillen, D.: Algebra cochains and cyclic cohomology. Inst. Hautes Études Sci. Publ. Math. 68, 139–174 (1989)

Communicated by A. Connes

Commun. Math. Phys. 208, 25 – 54 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Exponential Decay of Correlations for Random Lasota–Yorke Maps

Jérôme Buzzi

C.N.R.S., Institut de Mathématiques de Luminy, 163, av. de Luminy, Case 930, 13288 Marseille, France. E-mail: [email protected]

Received: 16 November 1998 / Accepted: 11 May 1999

Abstract: We consider random piecewise smooth, piecewise invertible maps, mainly on the interval but also in higher dimensions. We assume that, on the average and possibly without any stochastic uniformity, the maps: (i) expand distances, (ii) do not have too many pieces, (iii) do not have too large a distortion, and (iv) are strongly mixing. We assume no Markov property. We prove that, as in the classical case of the iteration of a fixed piecewise expanding map of the interval, we have exponential decay of random correlations. Our proof builds on the one given by C. Liverani for deterministic, mixing and piecewise expanding interval maps. We demand very little of the stochastic process giving the maps. In particular, if the maps are β-transformations on $[0, 1[^d$, i.e., $x_{n+1} = B_{n+1}x_n \bmod \mathbb{Z}^d$ with $B_n : \mathbb{R}^d \to \mathbb{R}^d$ affine, then our results apply to all stationary and ergodic processes $B_1, B_2, \dots$ which expand on the average and satisfy the mixing condition above. We remark that our setting does not imply fast decay of integrated correlations.

0. Introduction

A random dynamical system $f$ on a space $X$ is a stationary stochastic process $f_1, f_2, \dots : X \to X$ (i.e., with values in a space of maps), see for instance [14]. We can see this process as given by $f_{n+1} = f \circ T^n$, where $f$ is a measurable map defined on some abstract probability space $(\Omega, \mathcal{A}, P)$ and $T$ an automorphism. We require $(\omega, x) \mapsto f_\omega(x)$ to be measurable on $\Omega \times X$. As usual, this random system is realized by the skew-product:
\[
F : \Omega \times X \longrightarrow \Omega \times X, \qquad (\omega, x) \longmapsto (T\omega, f_\omega(x)).
\]
We write $f_\omega$ for $f(\omega)$ and $f_\omega^n$ for $f_{T^{n-1}\omega} \circ \dots \circ f_\omega$ for $n \ge 0$. For convenience, we shall always tacitly assume $(\Omega, P, T)$ to be ergodic.

We study mainly random compositions of piecewise smooth maps on the interval, but we shall state our result at a sufficient level of generality so that we will see in Appendix B that it applies to interesting classes of multi-dimensional maps, including multi-dimensional β-transformations, i.e.:
\[
f_B : [0, 1[^d \longrightarrow [0, 1[^d, \qquad x \longmapsto B.x \bmod \mathbb{Z}^d,
\]

where $d = 1, 2, \dots$, $B : \mathbb{R}^d \to \mathbb{R}^d$ is an affine map and $y \bmod \mathbb{Z}^d$ is the point in $[0, 1[^d$ the coordinates of which are the fractional parts of the coordinates of $y$.

For now we restrict ourselves to the interval and more precisely to the following class of maps. A piecewise monotonic and non-singular (p.m.n.s.) map is a map $f : [0, 1] \to [0, 1]$ which is:
1. piecewise monotonic, i.e., there is a finite subdivision $a_0 = 0 < a_1 < \dots < a_N = 1$ of the unit interval such that each restriction $f_i$ of $f$ to any subinterval $]a_{i-1}, a_i[$, $i = 1, \dots, N$, is a homeomorphism on its image;
2. non-singular, i.e., there is $|f'| : [0, 1] \to [0, \infty[$ such that $m(f(E)) = \int_E |f'|\,dm$ for any $E \subset\, ]a_{i-1}, a_i[$, $m$ being Lebesgue measure.

We shall use the following notion of variation (see [11]):
\[
\mathrm{var}(g) = \inf_{h = g \bmod m}\ \sup_{\substack{0 = s_0 < s_1 < \dots < s_n = 1 \\ n = 1, 2, \dots}}\ \sum_{k=1}^{n} |h(s_k) - h(s_{k-1})|.
\]

We call the intervals $]a_{i-1}, a_i[$ the intervals of $f$ and define: $N(f) = N$, $\delta(f) = \operatorname{ess\,inf} |f'|$. We write $\delta(\omega)$ for $\delta(f_\omega)$, etc. If $f$ is a random p.m.n.s. interval map we consider the following conditions:

(A0) $\omega \mapsto (\delta, \mathrm{var}(1/|f_\omega'|), N, a_1, \dots, a_N, 0, \dots)$ is measurable on $\Omega$.
(A1) $\lim_{K \to \infty}\int \log\min(\delta(f_\omega), K)\,dP > 0$ (possibly $+\infty$).
(A2) $\log^+\dfrac{N(\omega)}{\delta(\omega)}$ is $P$-integrable.
(A3) $\log^+\mathrm{var}(1/|f_\omega'|)$ is $P$-integrable.
(A4) $f$ is covering: for each non-trivial subinterval $I \subset [0, 1]$, for $P$-almost all $\omega \in \Omega$ (for short, a.a. $\omega$), there exists $n_c = n_c(\omega, I) < \infty$ such that
\[
\operatorname{ess\,inf}_{x \in [0, 1]}\,(L_\omega^n 1_I) > 0 \quad \forall n \ge n_c,
\]
where the essential infimum is with respect to (for short, w.r.t.) $m$, $1_I$ is the characteristic function of $I$ and $L_\omega^n$ is the transfer operator associated to the map $f_\omega^n : [0, 1] \to [0, 1]$:
\[
(L_\omega^n h)(x) = \sum_{y \in f_\omega^{-n}(x)} \frac{h(y)}{|(f_\omega^n)'(y)|}.
\]

Remark 0.1. 1. We assume no uniformity w.r.t. $\omega$.


2. (A4) is the direct random generalization of the covering property considered by C. Liverani [19]. In particular, if $\sup_x |f_\omega'(x)| < \infty$ almost surely, then the condition on $L_\omega^n 1_I$ in (A4) can be replaced by $f_\omega^n(I) = [0, 1]$ modulo a finite set.
3. All non-trivial, non-contracting random β-transformations on $[0, 1[$ satisfy (A0)–(A4): that is, if $f_\omega : x \mapsto \beta(\omega)x \bmod 1$, then one can take any measurable function $\beta : \Omega \to [1, \infty[$ not almost always equal to 1.
4. Small random perturbations of an expanding p.m.n.s. map certainly satisfy (A0)–(A3) (and one can easily give sufficient conditions to ensure (A4)). But our results say nothing about the limit of vanishing perturbations.

In a previous paper [9] we showed that, under conditions (A0)–(A3), the skew-product admits finitely many ergodic a.c.i.m., i.e., invariant probability measures which are absolutely continuous w.r.t. $P \times m$. These measures taken together describe the statistics of orbits starting at $P \times m$-a.e. point.

We now make the non-trivial (see below) additional assumption (A4), "random covering". By the previous result, there is a unique a.c.i.m. $\mu$ for the skew-product and we would like to know its mixing properties. More precisely, in this paper, which is independent of the previous one, we prove that, under the covering assumption (A4), there is random exponential mixing, i.e., exponential decay of the random correlation functions. These are defined after [2] as follows. Write $\mu = \int \mu_\omega\,dP(\omega)$ for the disintegration of $\mu$ over $P$. Given two bounded measurable functions $\varphi, \psi : X \to \mathbb{R}$, the random correlation functions are, for $n \ge 0$:
\[
C_\omega(\varphi, \psi, n) \stackrel{\mathrm{def}}{=} \int_X \varphi \cdot \psi \circ f_\omega^n\,d\mu_\omega - \int_X \varphi\,d\mu_\omega \cdot \int_X \psi\,d\mu_{T^n\omega}.
\]
These are more precisely the forward correlation functions. We shall also consider backward correlation functions: $C_{T^{-n}\omega}(\varphi, \psi, n)$.
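The transfer operator from (A4) can be made concrete for the random β-transformations $x \mapsto \beta(\omega)x \bmod 1$ of Remark 0.1. The following is a minimal sketch under stated assumptions, not a construction from the paper: the function names are hypothetical, the sequence `betas` stands for the values $\beta(\omega), \beta(T\omega), \dots$ along one fixed orbit, and exact rational arithmetic is used so that values can be compared exactly.

```python
from fractions import Fraction
from math import ceil


def preimages(beta, x):
    """All y in [0, 1) with beta*y mod 1 == x, for beta >= 1 and x in [0, 1)."""
    candidates = [(x + k) / beta for k in range(ceil(beta))]
    return [y for y in candidates if y < 1]


def transfer(beta, h):
    """Transfer operator of x -> beta*x mod 1: (L h)(x) = sum over preimages y of h(y)/beta."""
    return lambda x: sum(h(y) for y in preimages(beta, x)) / beta


def random_transfer(betas, h):
    """Apply the operators for beta_1, ..., beta_n in this order, i.e. L_{beta_n} ... L_{beta_1} h."""
    for beta in betas:
        h = transfer(beta, h)
    return h


def one(x):
    """Constant density 1 on [0, 1)."""
    return Fraction(1)


# One step with beta = 3/2: points below 1/2 have two preimages, the others one.
g = transfer(Fraction(3, 2), one)
print(g(Fraction(1, 4)), g(Fraction(3, 4)))

# Two steps: beta = 3/2 followed by beta = 2.
g2 = random_transfer([Fraction(3, 2), 2], one)
print(g2(Fraction(1, 3)))
```

In this particular two-step example the doubling map happens to average the two levels of $L_{3/2}1$ back to the constant 1; for a general sequence of $\beta$'s the iterates are not constant, and the Main Theorem concerns their convergence along orbits of $T$.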

0.1. Main Result. We first check that the maps introduced above define random transfer operators, i.e., families $L = (L_\omega)_{\omega \in \Omega}$ of transfer operators on $X$, with good properties in the sense of Sect. 1 below:

Proposition 0.2. Assume that $f$ is:
• either a random p.m.n.s. map on the interval satisfying (A0)–(A4);
• or a random multi-dimensional β-transformation $f_\omega = f_{B(\omega)}$ which is (i) expanding on the average, i.e., such that $\int \log\inf_{\|v\|=1}\|B.v - B.0\|\,dP > 0$; and (ii) randomly covering, i.e., for each ball $B$, $P$-almost surely, there exists $n_c = n_c(\omega) < \infty$ such that, for all $n \ge n_c$, $f_\omega^n(B) = [0, 1[^d$.
Then $f$ defines a good random transfer operator in the sense of Sect. 1 below.

Remark 0.3. For random $d$-dimensional β-transformations, the strong expansion condition
\[
\int \log\inf_{\|v\|=1}\|B.v - B.0\|\,dP > \log(1 + \sqrt{d})
\]
with the notation above implies (i) and (ii) above.


Now the random transfer operator by itself determines the correlation functions for any a.c.i.m. of the skew-product, according to the formula:
\[
C_\omega(\varphi, \psi, n) = \int_X L_\omega^n\Bigl(\varphi\,\frac{d\mu_\omega}{dm}\Bigr) \cdot \psi\,dm - \int_X \varphi\,d\mu_\omega \cdot \int_X \psi\,d\mu_{T^n\omega}.
\]
This allows us to investigate the decay of these functions for abstract random transfer operators with "good properties".

Notation. $\|\cdot\|_p$ stands below and throughout the paper for the norm of $L^p(X, m)$.

Main Theorem. Let $(\Omega, P, T, L)$ be a good random transfer operator on a space $X$ endowed with a probability $m$ and a notion of variation var. Then:
1. There is a normalized density $h$ on $\Omega \times X$ which is globally invariant in the sense that $L_\omega h_\omega = h_{T\omega}$ for a.a. $\omega \in \Omega$ ($h_\omega : x \mapsto h(\omega, x)$). Moreover, $h$ is unique modulo $m$ and we may assume that $\mathrm{var}(h_\omega) < \infty$ for a.a. $\omega \in \Omega$.
There is a constant $\rho < 1$ with the following properties:
2. The density $h$ can be approximated exponentially fast by iterating the transfer operator: for all $h_* : X \to [0, \infty[$ with bounded variation and $\|h_*\|_1 = 1$, for a.a. $\omega$, for all large $n$ (i.e., $n \ge n_0(\omega)$, a.e. finite, depending on $\mathrm{var}(h_*)$):
\[
\bigl\|L_{T^{-n}\omega}^n h_* - h_\omega\bigr\|_\infty \le \rho^n.
\]
3. Let $\mu \stackrel{\mathrm{def}}{=} h \cdot P \times m$. Then we have (backward and forward) exponential random mixing: there exists a $P$-a.e. finite measurable function $K$ such that, for all $\omega \in \Omega$, for all $\varphi, \psi : X \to \mathbb{R}$ with $\varphi$ of bounded variation and $\psi$ bounded, for all $n \ge 0$:
\[
|C_\omega(\varphi, \psi, n)| \le K(\omega)\|\psi\|_\infty\|\varphi\|_{\mathrm{var}}\,\rho^n, \qquad |C_{T^{-n}\omega}(\varphi, \psi, n)| \le K(\omega)\|\psi\|_\infty\|\varphi\|_{\mathrm{var}}\,\rho^n,
\]
where $\|\varphi\|_{\mathrm{var}} \stackrel{\mathrm{def}}{=} \|\varphi\|_\infty + \mathrm{var}(\varphi)$.

Remark. $\rho$ could be estimated, at least in principle, from the proof and a rather complete knowledge of the random map.

This theorem is essentially a corollary of the following result ($\theta_+$ being the projective metric between positive functions, see Sect. 1). In the setting of the theorem:

Proposition 4.1. Let $\epsilon > 0$ be small enough and $V : \Omega \to \mathbb{R}_+$ be measurable. Then there exist a constant $\rho < 1$ and a measurable function $n_0 : \Omega \to \mathbb{N}$ such that, for a.a. $\omega \in \Omega$, all $n \ge n_0(\omega)$, all $\ell \ge 0$ and all $|p| \le n$:
\[
\theta_+\bigl(L_{T^p\omega}^n h,\ L_{T^{p-\ell}\omega}^{n+\ell} g\bigr) \le \rho^n,
\]
for all $h, g : X \to \mathbb{R}_+$ with $m(h) = m(g) = 1$ and $\mathrm{var}(h) \le V(\omega)e^{\epsilon n}$, $\mathrm{var}(g) \le V(\omega)e^{\epsilon(n+\ell)}$. Moreover $\rho = \rho(\epsilon)$ does not depend on $V$.


0.2. Comments and counter-examples. For simplicity of language we restrict our discussion to the case where the random transfer operator is given by a random dynamical system $f$ with skew-product $F$.

1. Similar results. Backward and/or forward exponential decay of correlations for random maps had been obtained till now only for globally smooth and expanding systems [1,6,13,17], or for a piecewise expanding map of the interval but with strong restrictions [20,21], or for small random perturbations of the deterministic case [3,2,15].

2. Random covering. We proved in [9] that, for random p.m.n.s. interval maps, (A0)–(A3) imply the existence of finitely many ergodic $F$-invariant probability measures absolutely continuous w.r.t. $P \times m$. Our new results supersede these, but only under the additional assumption of random covering (A4), which seems to be a significant restriction even in dimension 1, unlike in the deterministic case. Indeed, we give an example of a random p.m.n.s. interval map $f$ satisfying (A0)–(A3) such that $\mu = P \times m$ is invariant for the skew-product $F$ and $(F, \mu)$ is mixing but, nevertheless, $f$ does not satisfy the covering condition (A4). Thus, the ergodic decomposition of a random p.m.n.s. map satisfying (A0)–(A3) (and thus having a finite number of ergodic a.c.i.m.'s by [9]) does not define subsystems satisfying the covering hypothesis (A4), even up to some period and rescaling. That something like this might happen was pointed out to us by Viviane Baladi.

Question: Can one define (and build) a useful decomposition of an arbitrary random p.m.n.s. map satisfying (A0)–(A3) into "pathwise irreducible" (and, according to the previous example, non-measurable w.r.t. $\omega$!) components?

3. No mixing for the skew-product. Our result clearly implies random mixing (or fiber-mixing [7]), i.e.: $C_\omega(\varphi, \psi, n) \to 0$ as $n \to \infty$, a.e. Recall that we do not assume (measure-theoretic) mixing of the basis but only its ergodicity. Thus we certainly cannot have mixing for the skew-product in the general case. This a fortiori excludes looking for a spectral gap for the global transfer operator associated to the skew-product. This is in contrast to [1], where more is assumed on the process giving the maps (it is i.i.d.) and much stronger properties are shown.

4. No exponential decay for integrated correlations. An intermediate notion between random mixing and global mixing is obtained by looking at the integrated correlation functions:
\[
C_{\mathrm{int}}(\varphi, \psi, n) \stackrel{\mathrm{def}}{=} \int_\Omega C_\omega(\varphi, \psi, n)\,dP
\]
for bounded, measurable $\varphi, \psi : X \to \mathbb{R}$. Let $\bar\varphi(\omega) = \mu_\omega(\varphi)$ and define $\bar\psi$ similarly. Then, writing $C_F$, resp. $C_T$, for the correlation function associated to the skew-product $F$, resp. the basis $T$, we have: $C_F(\varphi, \psi, n) = C_{\mathrm{int}}(\varphi, \psi, n) + C_T(\bar\varphi, \bar\psi, n)$. Observe also that $L^\infty(\Omega \times X) = L^\infty(\Omega) \otimes L^\infty(X)$. Thus we see that, when the basis is mixing,
\[
(F, \mu) \text{ is mixing} \iff \forall\varphi, \psi \in L^\infty(X, m)\quad \lim_{n \to \infty} C_{\mathrm{int}}(\varphi, \psi, n) = 0.
\]
Remark that, by the dominated convergence theorem, random mixing implies that this last condition is satisfied (the basis being mixing or not) in our setting (we could even integrate the absolute values of the random correlation functions). However, these integrated correlations can decrease very slowly under our hypotheses: we give an example of a random p.m.n.s. interval map satisfying (A0)–(A4) for which
\[
\sum_{n \ge 0} C_{\mathrm{int}}(\varphi, \varphi, n) = \infty
\]


with $\varphi : [0, 1] \to \mathbb{R}$ the characteristic function of some interval. Moreover, our example has a.e.-constant fiber measures $\mu_\omega$, hence its integrated correlation functions are equal to the correlation functions of the skew-product. We also remark that the function $K$ in our Main Theorem can be very large. Indeed, not only can it be non-integrable (as the above divergence shows) but even $\log^+ K$ can be non-integrable under (A0)–(A4).

Question: Would exponential mixing of the basis w.r.t. the "relevant" algebra of subsets imply exponential decay of integrated correlation functions?

5. Perspectives. Let us list some natural questions:
(1) These random maps probably have good statistical properties (like the central limit theorem for observables with bounded variation). A general result in this direction (with an application to a similar expanding setting, but without discontinuities) was obtained by Y. Kifer [16].
(2) One also expects stability properties, i.e., continuity properties of $f \mapsto \mu$ w.r.t. suitable topologies, as for stochastic perturbations of a deterministic map [15,3,5].
(3) Finally, under natural conditions one should be able to consider the global transfer operator associated to the skew-product, and get much more precise results (in particular, quasi-compactness of this operator), as was done by V. Baladi [1] in the case of i.i.d. composition of globally smooth mappings.

0.3. Outline of the proof. We build up on the deterministic case and more precisely on the proof given by C. Liverani [19].

Section 1 is devoted to some preliminaries. We first describe an abstract setting which sums up the relevant properties in our two classes of maps. We check these properties for our class of interval maps (the multi-dimensional case being investigated in Appendix B). We continue by some facts about Birkhoff cones and their projective metrics. We finally recall how decay of correlations can be deduced from the contraction by the random transfer operators in this metric.

In Sect. 2, we define “good” $\omega$’s in such a way that we can verify that they make up nearly all of $\Omega$ and that a fixed iterate of the transfer operator, $L^R_\omega$, is uniformly contracting.

In Sect. 3, we deal with the bad $\omega$’s (which can be very bad) by grouping them with just enough good ones. We check that the transfer operator corresponding to each group is not too bad and that these groups make only a small proportion of any typical orbit.

The proof of the Main Theorem is finished in Sect. 4 by concluding that iterates of the random transfer operator are exponentially contracting and that the variation norm of the density $h_{T^n\omega}$ can grow with $n$ only subexponentially.

Appendix A describes the counter-examples announced above. Appendix B presents an application of our abstract results to some higher dimensional random maps.

1. Preliminaries

1.1. Abstract setting. We unify our two concrete settings under the following abstract assumptions.

Decay of Correlations for Random Lasota–Yorke Maps


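1.1.1. Notion of variation on X. $X$ is a space endowed with a probability measure $m$ and a notion of variation $\mathrm{var} : L^1(X,m) \to [0,\infty]$ which satisfies:

(V1) $\mathrm{var}(th) = |t|\,\mathrm{var}(h)$;
(V2) $\mathrm{var}(g+h) \le \mathrm{var}(g) + \mathrm{var}(h)$;
(V3) $\|h\|_\infty \le \|h\|_1 + C_{\mathrm{var}}\,\mathrm{var}(h)$ for some constant $1 \le C_{\mathrm{var}} < \infty$;
(V4) for any $C < \infty$, $\{h : X \to \mathbb{R} : \|h\|_1 + \mathrm{var}(h) \le C\}$ is $L^1(m)$-compact;
(V5) $\mathrm{var}(1_X) < \infty$, $1_X$ being the function equal to 1 on $X$;
(V6) $\{h : X \to \mathbb{R}_+ : \|h\|_1 = 1 \text{ and } \mathrm{var}(h) < \infty\}$ is $L^1(m)$-dense in $\{h : X \to \mathbb{R}_+ : \|h\|_1 = 1\}$.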
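As an illustration only, the following sketch checks (V1)–(V3) on a discretized stand-in for the interval case. The discrete variation, the uniform weights and the value $C_{\mathrm{var}} = 1$ are assumptions of this sketch, not part of the abstract setting; the function names are hypothetical.

```python
def variation(h):
    """Discrete total variation of the samples h[0], ..., h[n-1]."""
    return sum(abs(b - a) for a, b in zip(h, h[1:]))


def l1_norm(h):
    """L1 norm w.r.t. the uniform probability measure on the sample points."""
    return sum(abs(x) for x in h) / len(h)


def sup_norm(h):
    """Supremum norm of the samples."""
    return max(abs(x) for x in h)


def satisfies_v1_v2_v3(g, h, t, c_var=1.0, tol=1e-9):
    """Return the truth values of (V1), (V2), (V3) for the samples g, h and scalar t."""
    th = [t * x for x in h]
    g_plus_h = [x + y for x, y in zip(g, h)]
    v1 = abs(variation(th) - abs(t) * variation(h)) <= tol
    v2 = variation(g_plus_h) <= variation(g) + variation(h) + tol
    v3 = sup_norm(h) <= l1_norm(h) + c_var * variation(h) + tol
    return v1, v2, v3
```

For instance, `satisfies_v1_v2_v3([0, 1, 0, 2], [1, 1, 3, -1], -2.5)` reports that all three properties hold for these samples.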

Here and below all functions are tacitly assumed to be measurable and are understood to be in fact equivalence classes modulo $m$ or $\mathbb{P}\times m$. We write (V) for (V1)–(V6).

1.1.2. Lasota–Yorke inequalities. $L_\omega$ is a random transfer operator on $(X,m)$, i.e.:

(LY0) $L_\omega$ is a positive linear operator on $L^1(X,m)$ and it satisfies $\|L_\omega h\|_1 = \|h\|_1$ for all non-negative $h \in L^1(m)$; moreover, $(\omega,x) \mapsto (L_\omega h_\omega)(x)$ is $\mathbb{P}\times m$-measurable for all $\mathbb{P}\times m$-measurable $h$;
(LY1) $\mathrm{var}(L_\omega h) + \|L_\omega h\|_1 \le K(\omega)(\mathrm{var}(h) + \|h\|_1)$ with $\log K \in L^1(\mathbb{P})$;

and, for some positive integer $N$:

(LY2) $\mathrm{var}(L^N_\omega h) \le \alpha^N(\omega)\,\mathrm{var}(h) + K^N(\omega)\|h\|_1$ with $\log\alpha^N, \log K^N \in L^1(\mathbb{P})$ and:
$$\int \log \alpha^N \, d\mathbb{P} < 0.$$

The superscript $N$ does not indicate exponentiation but is only a reminder that the corresponding quantities pertain to the $N$th iterate. It is convenient to require also:

(LY3) $K^N(\omega) = K(\omega)\cdots K(T^{N-1}\omega) \ge 6^N$.

Of course, (LY0)–(LY2) imply (LY0)–(LY3) if we replace $K(\omega)$ by
$$\tilde K(\omega) \stackrel{\mathrm{def}}{=} \max(K(\omega), K^N(\omega), 6)$$
and $K^N(\omega)$ by $\tilde K(\omega)\cdots\tilde K(T^{N-1}\omega)$. We write (LY) for (LY0)–(LY3).

1.1.3. Random covering. We shall also check the following form of covering:

(RC) for each $a > 0$, for a.a. $\omega\in\Omega$, there exist random numbers $n_c(\omega) < \infty$ and $\alpha_0(\omega), \alpha_1(\omega), \dots > 0$ such that for all $h \in C_a$:
$$\forall n \ge n_c \quad \operatorname{essinf}_x (L^n_\omega h)(x) \ge \alpha_n \|h\|_1$$
(the dependence on $a$ being understood).

Definition 1.1. A random transfer operator is good w.r.t. $(X, m, \mathrm{var}, \alpha^N, K)$ if it has properties (V)+(LY)+(RC) defined just above.


1.2. p.m.n.s. with (A0)–(A4) define good transfer operators. Let $f$ be a random p.m.n.s. interval map satisfying (A0)–(A4). The usual notion of variation for functions on the interval $[0,1]$ modulo Lebesgue measure (recalled in the introduction) satisfies (V).

Claim. $f$ satisfies the random covering property (RC).

Proof. We follow C. Liverani [19]. Consider a partition of $[0,1]$ into subintervals with length at most $1/(2a)$. As $h \in C_a$, one has $\operatorname{essinf}_I h \ge \frac12\|h\|_1$ for at least one of the subintervals: otherwise $\int_I h\,dm \le m(I)\,(\operatorname{essinf}_I h + \mathrm{var}(h|I))$ for all $I$ would imply:
$$\|h\|_1 < \frac12\|h\|_1 \sum_I m(I) + \frac{1}{2a}\sum_I \mathrm{var}(h|I) \le \|h\|_1,$$
a contradiction. Using the covering hypothesis (A4), one finds $n_c = n_c(\omega)$ such that, for all $n \ge n_c$:
$$\operatorname{essinf}_{[0,1]} L^n_\omega h \ge \frac12\|h\|_1 \operatorname{essinf}_{[0,1]} L^n_\omega 1_I \ge \alpha_n\|h\|_1.$$
□

We turn to (LY). (LY0) is clear from (A0) once one sees that $(\omega,x) \mapsto f'(\omega,x)$ is a measurable function mod $\mathbb{P}\times m$, as $f'$ is defined a.e. and $f$ is measurable. To prove (LY1)–(LY2) we use the two following Lasota–Yorke type inequalities as in [9]. The first one is classic and will be used most of the time:

Lemma 1.2 (M. Rychlik [22]). Let $f : [0,1] \to [0,1]$ be a p.m.n.s. interval map with $\mathrm{var}(1/|f'|) < \infty$. Let $h : [0,1] \to \mathbb{R}$ be a function of bounded variation. Then we have:
$$\mathrm{var}(L_f h) \le \frac{3}{\delta(f)}\,\mathrm{var}(h) + \beta(f)\|h\|_1,$$
where $\beta(f) < \infty$ depends measurably on $f$.

The problem with the above estimate is that $\beta(f_\omega)$ (and a fortiori $\beta(f^N_\omega)$) is not known to be integrable. This explains the need for the following inequality:

Lemma 1.3 ([9, Lemma 1.2]). Let $g : [0,1] \to [0,1]$ be a p.m.n.s. interval map. Let $h : [0,1] \to \mathbb{R}$ be a function of bounded variation. Then we have: $\mathrm{var}(L_g h) \le K_0(g)(\mathrm{var}(h) + \|h\|_1)$ with:
$$K_0(g) = 4\left(\frac{N(g)}{\delta(g)} \vee 1\right)\left(\mathrm{var}(1/|g'|) \vee 1\right)\left(\frac{1}{\delta(g)} \vee 1\right).$$

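Proof. We have:
$$\mathrm{var}(L_g h) \le \sum_{i=1}^{N} \mathrm{var}\left(\frac{h}{|g'|}\circ g_i^{-1} \cdot 1_{g(]a_{i-1},a_i[)}\right) \le \sum_{i=1}^{N}\left[\left|\frac{h}{|g'|}(a_{i-1})\right| + \left|\frac{h}{|g'|}(a_i)\right| + \max\frac{1}{|g'|}\,\mathrm{var}_{]a_{i-1},a_i[}(h) + \mathrm{var}_{]a_{i-1},a_i[}\left(\frac{1}{|g'|}\right)\max(|h|)\right],$$
using $\mathrm{var}(fg) \le \mathrm{var}(f)\max(|g|) + \max(|f|)\,\mathrm{var}(g)$. Hence,
$$\begin{aligned}
\mathrm{var}(L_g h) &\le \left(\frac{2N(g)}{\delta(g)} + \mathrm{var}\frac{1}{|g'|}\right)\max(|h|) + \frac{1}{\delta(g)}\,\mathrm{var}(h)\\
&\le \left(\frac{2N(g)}{\delta(g)} + \mathrm{var}\frac{1}{|g'|}\right)(\|h\|_1 + \mathrm{var}(h)) + \frac{1}{\delta(g)}\,\mathrm{var}(h)\\
&\le \left(\frac{2N(g)}{\delta(g)} + \mathrm{var}\frac{1}{|g'|} + \frac{1}{\delta(g)}\right)\mathrm{var}(h) + \left(\frac{2N(g)}{\delta(g)} + \mathrm{var}\frac{1}{|g'|}\right)\|h\|_1.
\end{aligned}$$
□

Claim. $f$ satisfies (LY).

Proof. (LY0) has been checked already. (LY1) follows immediately from Lemma 1.3 and (A0)–(A3) if we define $K(\omega) = K_0(\omega)$ as in the statement of the lemma (in particular, $\log^+ K_0$ is integrable). It remains to check (LY2), as (LY3) will be a consequence of the rest.

By (A1), assuming that $K_* < \infty$ is a large constant, we have:
$$\lambda_* \stackrel{\mathrm{def}}{=} \int \log\min(\delta, K_*)\,d\mathbb{P} > 0.$$
Choose an integer $N > 0$ so large that $\log 3 < \lambda_* N/10$. Set
$$\bar\delta^N(\omega) \stackrel{\mathrm{def}}{=} \prod_{k=0}^{N-1}\min(\delta(T^k\omega), K_*) \quad\text{and}\quad K_0^N(\omega) \stackrel{\mathrm{def}}{=} \prod_{k=0}^{N-1} K_0(T^k\omega).$$
Pick $\epsilon_0 > 0$ so small that for any $E \subset \Omega$ with $\mathbb{P}(E) < \epsilon_0$:
$$\int_E \log\bar\delta^N\,d\mathbb{P} < \frac{\lambda_* N}{10} \quad\text{and}\quad \int_E \log K_0^N\,d\mathbb{P} < \frac{\lambda_* N}{10}$$
(this is possible because of the integrability of both functions).

Let $\beta_*^N < \infty$ be a constant so large that $\Omega_b \stackrel{\mathrm{def}}{=} \{\omega : \beta(f^N_\omega) > \beta_*^N\}$ has $\mathbb{P}$-measure $< \epsilon_0$. Define:
$$\alpha^N(\omega) = \begin{cases} 3/\bar\delta^N(\omega) & \text{if } \omega \notin \Omega_b,\\ K_0^N(\omega) & \text{otherwise,}\end{cases}$$
and: $K(\omega) = \max(K_0(\omega), \alpha^N(\omega), (\beta_*^N)^{1/N}, 6)$ and $K^N(\omega) = K(\omega)\cdots K(T^{N-1}\omega)$.

If $\omega \notin \Omega_b$, then, by definition, $\alpha^N(\omega) = 3/\bar\delta^N(\omega) \ge 3/\delta(f^N_\omega)$ and $K^N(\omega) \ge \beta_*^N \ge \beta(f^N_\omega)$. Hence by Lemma 1.2, the inequality in (LY2) holds in this case. If $\omega \in \Omega_b$, then $\alpha^N(\omega) = K_0^N(\omega)$ and $K^N(\omega) \ge K_0^N(\omega)$. Hence by Lemma 1.3, the inequality in (LY2) holds also in this case.

As $\log\bar\delta^N$ and $\log K_0^N$ are integrable, so is $\log\alpha^N$. This in turn implies the integrability of $\log K^N$. We compute:
$$\int\log\alpha^N\,d\mathbb{P} \le \log 3 - \int\log\bar\delta^N\,d\mathbb{P} + \int_{\Omega_b}\log\bar\delta^N\,d\mathbb{P} + \int_{\Omega_b}\log K_0^N\,d\mathbb{P} \le -\frac{7}{10}\lambda_* N < 0.$$
This completes the proof of (LY2), hence of (LY). □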
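As a small numerical aid (not part of the original argument), the constant $K_0(g)$ of Lemma 1.3 can be evaluated from the number of pieces $N(g)$, the expansion $\delta(g)$ and $\mathrm{var}(1/|g'|)$. The sketch assumes the product form of $K_0$ stated in the lemma; the function names are hypothetical.

```python
def k0(n_pieces, delta, var_inv_deriv):
    """K_0(g) = 4 (N/delta v 1)(var(1/|g'|) v 1)(1/delta v 1), with v denoting max."""
    return (4.0
            * max(n_pieces / delta, 1.0)
            * max(var_inv_deriv, 1.0)
            * max(1.0 / delta, 1.0))


def proof_coefficient(n_pieces, delta, var_inv_deriv):
    """Coefficient of var(h) obtained at the end of the proof of Lemma 1.3."""
    return 2.0 * n_pieces / delta + var_inv_deriv + 1.0 / delta
```

For the doubling map (two pieces, expansion 2, constant derivative) this gives `k0(2, 2.0, 0.0) == 4.0`, which dominates the coefficient `2.5` produced by the proof.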


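1.3. Cones and decay of correlations. Building on the study of the deterministic case by C. Liverani [19], we consider the cone:
$$C_a = \{h \in L^\infty(X,m) : h \ge 0 \ (\mathrm{mod}\ m),\ \mathrm{var}(h) \le a\|h\|_1\}$$
with parameter $a$ to be determined. Remark that we work modulo $m$, whereas C. Liverani works with true functions.

We recall some basic facts (see [10,19,24]). By (V1)–(V4), $C_a$ is a Birkhoff cone for each $a < \infty$. We write $\theta_a$ for the projective metric on $C_a$. We have for $g \in C_{\nu a}$ ($0 < \nu < 1$) normalized by $\|g\|_1 = 1$:
$$\theta_a(g, 1) \le \log\frac{(1+\nu)(1+V)\operatorname{esssup} g}{(1-\nu)(1-V)\operatorname{essinf} g}, \qquad V = \frac{\mathrm{var}(1_X)}{a}, \tag{1.1}$$
with essential supremum and infimum w.r.t. $m$, provided $V < 1$. This follows from $\theta_a(g,h) = \log\frac{\beta}{\alpha}$, where $\alpha$ is maximal so that $g - \alpha h \in C_a$ and $\beta$ is minimal so that $\beta h - g \in C_a$. Indeed:
$$g - \min\left(\frac{1-\nu}{1+V}, \operatorname{essinf} g\right)\cdot 1_X \quad\text{and}\quad \max\left(\frac{1+\nu}{1-V}, \operatorname{esssup} g\right)\cdot 1_X - g$$
are both in $C_a$ as they are nonnegative and, assuming that $m(g) = 1$ and writing $\alpha$, resp. $\beta$, for $\min(\frac{1-\nu}{1+V}, \operatorname{essinf} g)$, resp. $\max(\frac{1+\nu}{1-V}, \operatorname{esssup} g)$:
$$\frac{\mathrm{var}(g - \alpha 1_X)}{m(g - \alpha 1_X)} \le \frac{\mathrm{var}(g) + \alpha\,\mathrm{var}(1_X)}{1-\alpha} \le \frac{\nu + \alpha V}{1-\alpha}\,a \le \frac{\nu(1+V) + (1-\nu)V}{1+V-1+\nu}\,a = a,$$
$$\frac{\mathrm{var}(\beta 1_X - g)}{m(\beta 1_X - g)} \le \frac{\beta\,\mathrm{var}(1_X) + \mathrm{var}(g)}{\beta - 1} \le \frac{\beta V + \nu}{\beta-1}\,a \le \frac{(1-V)\nu + (1+\nu)V}{1+\nu-1+V}\,a = a,$$
using that $\alpha \mapsto (\nu + \alpha V)/(1-\alpha)$, resp. $\beta \mapsto (\beta V + \nu)/(\beta - 1)$, is increasing, resp. decreasing.

The fundamental property of this distance is the following: if $A$ is a linear operator preserving $C_a$, then
$$\theta_a(Af, Ag) \le \tanh(\Delta/4)\,\theta_a(f,g)$$
with $0 \le \Delta \le \infty$ the $\theta_a$-diameter of $A(C_a)$. In particular: 1) $A$ is non-expanding; 2) $A$ is uniformly contracting as soon as $\Delta < \infty$.

Let $C_+$ be the cone of positive functions. As $C_a \subset C_+$, $\theta_a(g,h) \ge \theta_+(g,h)$, where
$$\theta_+(g,h) = \log\left(\operatorname{esssup}\frac{h}{g}\cdot\operatorname{esssup}\frac{g}{h}\right)$$
is the projective metric on the cone $C_+$. We also have, for $g, h \in C_+$ with $\|g\|_1 = \|h\|_1 = 1$:
$$\left\|\frac{g}{h} - 1\right\|_\infty \le \exp\theta_+(g,h) - 1.$$

Lemma 1.4. Let $\varphi, \psi : [0,1] \to \mathbb{R}_+$. For $n \ge 0$:
$$|C_\omega(\varphi,\psi,n)| \le \mu_\omega(\varphi)\,\|\psi\|_\infty\,\|h_{T^n\omega}\|_\infty\left(\exp\theta_+\left(L^n_\omega\left(\frac{1}{\mu_\omega(\varphi)}\varphi h_\omega\right), h_{T^n\omega}\right) - 1\right).$$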


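Proof. As $\mu$ is equivalent to $\mathbb{P}\times m$, $\mu_\omega(\varphi) = 0$ iff $\varphi = 0$ (mod $m$). But if $\varphi = 0$ (mod $m$), there is nothing to prove. Hence we can assume $\mu_\omega(\varphi) > 0$. Setting $\phi = \mu_\omega(\varphi)^{-1}\varphi$, we have:
$$\begin{aligned}
|C_\omega(\varphi,\psi,n)| &= \mu_\omega(\varphi)\,|C_\omega(\phi,\psi,n)| = \mu_\omega(\varphi)\left|\int_{[0,1]}\psi\left(L^n_\omega(\phi h_\omega) - h_{T^n\omega}\right)dm\right|\\
&\le \mu_\omega(\varphi)\,\|\psi\|_\infty\,\|h_{T^n\omega}\|_\infty\left\|\frac{L^n_\omega(\phi h_\omega)}{h_{T^n\omega}} - 1\right\|_\infty\\
&\le \mu_\omega(\varphi)\,\|\psi\|_\infty\,\|h_{T^n\omega}\|_\infty\left(\exp\theta_+(L^n_\omega(\phi h_\omega), h_{T^n\omega}) - 1\right).
\end{aligned}$$
□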
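The contraction property of the projective metric can be illustrated on a finite-dimensional analogue, a sketch under assumptions that are not in the paper: the cone $C_+$ is replaced by positive vectors, the operator by a matrix with positive entries, and the diameter $\Delta$ is computed with the classical formula for positive matrices. All names are hypothetical.

```python
import math


def theta_plus(g, h):
    """Projective metric on positive vectors: log(max(h/g) * max(g/h))."""
    return math.log(max(y / x for x, y in zip(g, h))
                    * max(x / y for x, y in zip(g, h)))


def mat_vec(matrix, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def birkhoff_diameter(matrix):
    """theta_plus-diameter of the image of the positive cone under a positive matrix."""
    rows, cols = range(len(matrix)), range(len(matrix[0]))
    return max(
        math.log(matrix[i][k] * matrix[j][l] / (matrix[j][k] * matrix[i][l]))
        for i in rows for j in rows for k in cols for l in cols
    )


def contraction_holds(matrix, f, g, tol=1e-9):
    """Check theta(Af, Ag) <= tanh(Delta/4) * theta(f, g) for one pair (f, g)."""
    delta = birkhoff_diameter(matrix)
    lhs = theta_plus(mat_vec(matrix, f), mat_vec(matrix, g))
    return lhs <= math.tanh(delta / 4.0) * theta_plus(f, g) + tol
```

Proportional vectors are at distance zero, e.g. `theta_plus([1.0, 2.0], [2.0, 4.0])` vanishes, which reflects the projective nature of the metric.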

Therefore basically one has to prove the exponential decay of:
$$\theta_+(L^n_\omega(\phi h_\omega), h_{T^n\omega}) = \theta_+(L^n_\omega(\phi h_\omega), L^n_\omega h_\omega).$$
This would be immediate if we had: $L_\omega(C_a)$ is a subset of $C_a$ with bounded $\theta_a$-diameter. Of course this does not hold. But, replacing $L_\omega$ by $L^R_\omega$ with $R$ large enough, we are going to make it almost always true and then we shall “only” have to control the bad times.

2. Typical Behavior

Let $L$ be a good random transfer operator on $(X, m, \mathrm{var}, \alpha^N, K)$. Define:
$$\lambda \stackrel{\mathrm{def}}{=} -\frac{1}{N}\int\log\alpha^N\,d\mathbb{P}, \qquad \sigma \stackrel{\mathrm{def}}{=} \frac{1}{N}\int\log K^N\,d\mathbb{P}.$$
By our assumptions $0 < \lambda < \infty$ and $\log 6 \le \sigma < \infty$. In this section, we define and study good $\omega$’s which make nearly all of $\Omega$ and are such that, for some fixed $R$ to be determined, $L^R_\omega$ behaves “typically”, i.e., as prescribed by $\lambda$ and $\sigma$.

2.1. Control of the variation upon iteration.

Lemma 2.1. For each $\epsilon > 0$, there exists an a.e.-finite function $C_0$ on $\Omega$ such that, for a.a. $\omega\in\Omega$, for every $n \ge 0$:
$$\mathrm{var}(L^n_{T^{-n}\omega}h) \le C_0(\omega)e^{-(\lambda-\epsilon)n}\,\mathrm{var}(h) + C_0(\omega)\|h\|_1.$$
This is an abstract version of Proposition 1.4 of [9].

Proof. $T^N$ is not necessarily ergodic but $T$ is, so writing:
$$\sum_{k=0}^{nN-1}\cdot\circ T^k = \sum_{k=0}^{n-1}\sum_{r=0}^{N-1}\cdot\circ T^{kN+r},$$


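we see that there must be an integer $0 \le r = r(\omega) < N$ such that:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\log\alpha^N(T^{-r-kN}\omega) \le -\lambda N. \tag{2.1}$$
Write $n = d + qN + r$ with $q \ge 0$, $0 \le d < N$. For any $h : X \to \mathbb{R}$, (LY1) gives:
$$\mathrm{var}(L^d_{T^{-n}\omega}h) \le K(T^{-n}\omega)K(T^{-n+1}\omega)\cdots K(T^{-n+d-1}\omega)(\mathrm{var}(h) + \|h\|_1) \le C(\omega)e^{\frac{\epsilon}{2}n}(\mathrm{var}(h) + \|h\|_1).$$
Indeed, as $\log K \in L^1(\mathbb{P})$, $\lim_{|k|\to\infty}\frac{1}{|k|}\log K(T^k\omega) = 0$ by the ergodic theorem so that, for each $s > 0$, there is $C(\cdot)$ such that, for all $k \in \mathbb{Z}$:
$$K(T^k\omega) \le C(\omega)e^{s|k|}.$$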

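Notation. Here and below $C(\cdot)$ stands for arbitrary a.e.-finite measurable functions on $\Omega$.

Similarly: $\mathrm{var}(L^r_{T^{-r}\omega}h) \le C(\omega)(\mathrm{var}(h) + \|h\|_1)$.

We turn to the main segment $[-n+d, -r[$. Set $w = T^{-n+d}\omega = T^{-qN-r}\omega$. An induction from inequality (LY2) gives, for all $m \ge 0$:
$$\mathrm{var}(L^{mN}_w h) \le \alpha^N(w)\alpha^N(T^N w)\cdots\alpha^N(T^{(m-1)N}w)\,\mathrm{var}(h) + \sum_{j=0}^{m-1}K^N(T^{jN}w)\,\alpha^N(T^{(j+1)N}w)\alpha^N(T^{(j+2)N}w)\cdots\alpha^N(T^{(m-1)N}w)\cdot\|h\|_1. \tag{2.2}$$
Fortunately, we are interested in the case where $m = q$ and so we have the following estimate for all $l \ge 0$ by Eq. (2.1):
$$\prod_{k=1}^{l}\alpha^N(T^{-r-kN}\omega) \le C(\omega)e^{-(\lambda-\epsilon/4)lN}.$$
The integrability of $\log K^N$ gives $K^N(T^{-r-kN}\omega) \le C(\omega)e^{\frac{\epsilon}{4}kN}$ (recall that $r$ can take only finitely many values). Therefore (2.2) becomes:
$$\begin{aligned}
\mathrm{var}(L^{qN}_w h) &\le C(\omega)e^{-(\lambda-\frac{\epsilon}{4})qN}\,\mathrm{var}(h) + \sum_{j=0}^{q-1}C(\omega)e^{(q-j)\frac{\epsilon}{4}N}\cdot C(\omega)e^{-(\lambda-\frac{\epsilon}{4})(q-j)N}\cdot\|h\|_1\\
&\le C(\omega)e^{-(\lambda-\frac{\epsilon}{4})qN}\,\mathrm{var}(h) + \frac{C(\omega)^2}{1 - e^{-(\lambda-\frac{\epsilon}{2})N}}\|h\|_1\\
&= C(\omega)e^{-(\lambda-\frac{\epsilon}{4})qN}\,\mathrm{var}(h) + C(\omega)\|h\|_1.
\end{aligned}$$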
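The induction behind (2.2) is just the unrolling of the one-step bound $v \mapsto \alpha_j v + K_j\|h\|_1$. The following sketch, with hypothetical names and with the numbers $\alpha_j$, $K_j$ standing for $\alpha^N(T^{jN}w)$, $K^N(T^{jN}w)$, compares the iterated bound with the closed form.

```python
def iterated_bound(var0, norm1, alphas, ks):
    """Apply the one-step bound v -> alpha_j * v + K_j * norm1 for j = 0, ..., m-1."""
    v = var0
    for alpha, k in zip(alphas, ks):
        v = alpha * v + k * norm1
    return v


def unrolled_bound(var0, norm1, alphas, ks):
    """Closed form: prod(alphas) * var0 + sum_j K_j * alpha_{j+1} ... alpha_{m-1} * norm1."""
    total = var0
    for alpha in alphas:
        total *= alpha
    for j, k in enumerate(ks):
        tail = 1.0
        for alpha in alphas[j + 1:]:
            tail *= alpha
        total += k * tail * norm1
    return total
```

Both functions return the same number; the closed form makes visible that each constant $K_j$ is only damped by the contraction factors that come after it.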

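Writing $L^{d+qN}_{T^{-n}\omega} = L^{qN}_{T^{-n+d}\omega}\circ L^d_{T^{-n}\omega}$, we get:
$$\mathrm{var}(L^{d+qN}_{T^{-n}\omega}h) \le C(\omega)^2 e^{-(\lambda-\frac{3\epsilon}{4})qN}\,\mathrm{var}(h) + \left(C(\omega)^2 e^{-(\lambda-\frac{3\epsilon}{4})n} + C(\omega)\right)\|h\|_1 \le C(\omega)e^{-(\lambda-\frac{3\epsilon}{4})qN}\,\mathrm{var}(h) + C(\omega)\|h\|_1.$$
Writing $L^n_{T^{-n}\omega} = L^r_{T^{-r}\omega}\circ L^{d+qN}_{T^{-n}\omega}$, we get:
$$\mathrm{var}(L^n_{T^{-n}\omega}h) \le C(\omega)^2 e^{-(\lambda-\frac{3\epsilon}{4})qN}\,\mathrm{var}(h) + \left(C(\omega)^2 + C(\omega)\right)\|h\|_1 \le C(\omega)e^{-(\lambda-\epsilon)n}\,\mathrm{var}(h) + C(\omega)\|h\|_1,$$
as $qN = n - d - r \ge n(1 - 2N/n)$ and $n$ is large. □

2.2. Definition of a good block. We first give conditions on $\omega\in\Omega$ ensuring that $L^R_\omega$ is a strict contraction of the cone of functions $C_a$ into itself for some appropriate parameter $a = a(\epsilon)$ to be defined.

Definition 2.2. For $\epsilon > 0$, we define $B_*(\epsilon)$ to be the smallest number such that $C_0(\omega) \le B_*$ on a set of $\mathbb{P}$-measure at least $1 - \epsilon/8$ ($C_0$ was defined in Lemma 2.1).

Definition 2.3. For $\epsilon > 0$, we define the cone parameter $a = a(\epsilon)$ to be:
$$a \stackrel{\mathrm{def}}{=} \max(6B_*(\epsilon),\ 2\cdot\mathrm{var}(1_X)).$$
We shall choose $R$ to be a multiple of $N$ and to be so large that $e^{-(\lambda/2)R} < 1/3$.

Definition 2.4. Say that $\omega$ is good w.r.t. the parameters $a, R, B_*, \alpha_*$ if:
$$\mathrm{var}(L^R_\omega h) \le e^{-(\lambda-O(\epsilon))R}\,\mathrm{var}(h) + B_*\|h\|_1 \tag{2.3}$$
$$\frac{1}{R}\sum_{k=0}^{R/N-1}\log K^N(T^{kN}\omega) \in [\sigma-\epsilon, \sigma+\epsilon] \tag{2.4}$$
$$\operatorname{essinf} L^R_\omega h \ge \alpha_*\|h\|_1 \quad \forall h \in C_a \tag{2.5}$$
$O(\epsilon)$ stands for functions of $\epsilon$ such that $\limsup_{\epsilon\to 0}|O(\epsilon)/\epsilon| < \infty$. Write $\Omega_*$ for the set of good $\omega$’s.

The key property of good $\omega$’s is the following:

Lemma 2.5. Let $\epsilon > 0$. Let $B_*(\epsilon)$ and $a(\epsilon)$ be defined as above. Let $R$ be some positive integer and $\alpha_* > 0$. Then there exists a constant $\kappa = \kappa(\epsilon, \alpha_*)$, $0 < \kappa < 1$, such that for all good $\omega\in\Omega$, $L^R_\omega : C_a \to C_a$ is a $\theta_a$-contraction with coefficient $\kappa$. More precisely, $L^R_\omega(C_a) \subset B(1, \Delta)$, a ball around the function 1 of radius $\Delta < \infty$ w.r.t. $\theta_a$.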


Proof. If $\omega$ is good, then for $h \in C_a$, we have:
$$\mathrm{var}(L^R_\omega h) \le \frac13\,\mathrm{var}(h) + B_*\|h\|_1 \le \left(\frac{a}{3} + B_*\right)\|h\|_1 \le \frac{a}{2}\|h\|_1.$$
Hence, $L^R_\omega(h) \in C_{a/2}$. Moreover, $\operatorname{esssup} L^R_\omega h \le \|L^R_\omega h\|_1 + C_{\mathrm{var}}\,\mathrm{var}(L^R_\omega h) \le (1 + C_{\mathrm{var}}a/2)\|h\|_1$ and $\operatorname{essinf} L^R_\omega h \ge \alpha_*\|h\|_1$ by (2.5). Thus, by (1.1):
$$\theta_a(L^R_\omega h, 1) \le \log\left(\frac{3/2}{1/2}\cdot 3\cdot\frac{1 + C_{\mathrm{var}}a/2}{\alpha_*}\right)$$
(where $3 \ge (1+V)/(1-V)$ as $V = \mathrm{var}(1_X)/a \le 1/2$ by the choice of $a$). Hence,
$$\Delta \stackrel{\mathrm{def}}{=} \mathrm{diam}_a(L^R_\omega(C_a)) \le 2\log\left(9(1 + C_{\mathrm{var}}a/2)/\alpha_*\right) < \infty.$$
This implies that $L^R_\omega : C_a \to C_a$ is a contraction with coefficient $0 < \kappa < 1$ satisfying $\kappa \le \tanh(\Delta/4)$. □

2.3. Prevalence of good blocks. To be useful these good blocks must represent almost all blocks and this demand will direct the choice of the remaining free parameters $R$ and $\alpha_*$.

Lemma 2.6. Let $\epsilon > 0$, $B_*(\epsilon)$ and $a(\epsilon)$ be defined as above. One can find $\alpha_*$ and $R$ such that the set $\Omega_*$ of good $\omega$’s has $\mathbb{P}$-measure greater than $1 - \epsilon/4$.

Proof. Recall that the constant $B_* = B_*(\epsilon)$ has been defined in order that after restricting $T^{qN}\omega$ to a subset $\Omega_0$ of $\Omega$ with measure $> 1 - \epsilon/8$, we can replace the function $C_0(T^{qN}\omega)$ in Lemma 2.1 by $B_*$. Hence for all $\omega \in T^{-qN}\Omega_0$, for all $q \ge 0$:
$$\mathrm{var}(L^{qN}_\omega h) \le B_* e^{-(\lambda-\epsilon)qN}\,\mathrm{var}(h) + B_*\|h\|_1.$$
Now:
• (2.3) in the definition of a good block is satisfied for $\omega \in T^{-qN}\Omega_0$ if we set $R = qN$ large enough.
• By the ergodic theorem applied to $T^{-1}$, (2.4) is obviously satisfied for $R$ large enough, perhaps after restricting arbitrarily slightly $\Omega_0$.
• We turn to (2.5): $a > 0$ is fixed so (RC) gives random numbers $n_c$ and $\alpha_0, \alpha_1, \dots$. One has $n_c(\omega) \le n_*$ for $\omega \in \Omega_0'$ with $\mathbb{P}(\Omega_0') > 1 - \epsilon/8$ provided $n_*$ is large enough.

Now pick $q_0 \ge n_*/N$ so large that (2.3) and (2.4) also hold for:
$$R \stackrel{\mathrm{def}}{=} R(\epsilon) = q_0 N.$$
Set also: $\alpha_* = \alpha_R > 0$. Now, we have: $\operatorname{essinf} L^R_\omega h \ge \alpha_*\|h\|_1$. By the above, all $\omega$’s in $\Omega_0' \cap T^{-R}\Omega_0$ are good and this set has measure $> 1 - \epsilon/4$. □

We define $K^R(\omega) \stackrel{\mathrm{def}}{=} \prod_{k=0}^{R/N-1}K^N(T^{kN}\omega)$.
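To make the constants of Lemma 2.5 concrete, here is a small sketch (hypothetical names, arbitrary input values) computing the cone parameter of Definition 2.3, the diameter bound $2\log(9(1+C_{\mathrm{var}}a/2)/\alpha_*)$ and the resulting bound $\tanh(\Delta/4)$ on the contraction coefficient.

```python
import math


def cone_parameter(b_star, var_one):
    """a = max(6 B_*, 2 var(1_X)) as in Definition 2.3."""
    return max(6.0 * b_star, 2.0 * var_one)


def contraction_data(b_star, var_one, c_var, alpha_star):
    """Return (a, diameter bound, bound on kappa) following the proof of Lemma 2.5."""
    a = cone_parameter(b_star, var_one)
    # The step a/3 + B_* <= a/2 of the proof relies on a >= 6 B_*.
    if a / 3.0 + b_star > a / 2.0 + 1e-12:
        raise ValueError("cone parameter too small")
    diameter = 2.0 * math.log(9.0 * (1.0 + c_var * a / 2.0) / alpha_star)
    kappa = math.tanh(diameter / 4.0)
    return a, diameter, kappa
```

With `b_star=1.0`, `var_one=0.0`, `c_var=1.0`, `alpha_star=0.1`, the bound on $\kappa$ equals $359/361$: a small covering constant $\alpha_*$ pushes the contraction coefficient close to 1.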


3. Deviations

In this section, we take care of the “unbounded badness” of bad $\omega$’s by including them into “coating intervals” as described in the following:

Proposition 3.1. Let $L$ be a good random transfer operator and $\epsilon > 0$ be small. Then, for a.a. $\omega$, for all $n \ge n_0(\omega)$ and $|p| \le n$, the orbit starting at $\gamma \stackrel{\mathrm{def}}{=} T^p\omega$ can be split into:
(1) an initial segment $\{\gamma, \dots, T^{j_*-1}\gamma\}$ with $0 \le j_* < R$;
(2) good blocks of the form $\{T^{j_*+kR}\gamma, \dots, T^{j_*+(k+1)R-1}\gamma\}$ with $T^{j_*+(k-1)R}\gamma$ good ($k \in \mathbb{N}$);
(3) coating intervals of the form $\{T^{j_*+kR}\gamma, \dots, T^{j_*+k'R-1}\gamma\}$ such that $L^{(k'-k)R}_{T^{j_*+kR}\gamma}$ preserves the cone $C_a$ ($k, k' \in \mathbb{N}$).

Moreover, the intervals in (3) satisfy:
$$\#\left(\bigcup_{(k,k')}[j_*+kR, j_*+k'R[\ \cap\ [0,n[\right) \le O(\epsilon^{1/2})\,n$$
and $j_*$ is such that:
$$\lim_{|m|\to\infty}\frac{1}{|m|}\#\{k \in \mathbb{Z} : 0 \le kR/m < 1 \text{ and } C_0(T^{j_*+kR}\gamma) > B_*(\epsilon)\} < \epsilon,$$
where $C_0(\cdot)$ is defined by Lemma 2.1 and $B_*(\epsilon)$ just after it.

This proposition is the union of Lemmas 3.4 and 3.5 below.

3.1. Construction of the coating intervals.

Definition 3.2. For $\omega\in\Omega$, $j_*(\omega)$ is the smallest integer $0 \le j_* < R$ such that, for each choice of the sign $\pm$:
(i) $\lim_{m\to\infty}\frac{1}{m}\#\{0 \le k < m : T^{\pm kR+j_*}\omega \in \Omega_*\} > 1 - \epsilon$,
(ii) $\lim_{m\to\infty}\frac{1}{m}\#\{0 \le k < m : C_0(T^{\pm kR+j_*}\omega) \le B_*\} > 1 - \epsilon$.

This is possible a.e. as at least $(3/4)R$ integers $j_*$ in $[0, R[$ satisfy each one of these conditions. Obviously $j_*$ is a measurable function with:
$$j_*(T^{j_*(\omega)}\omega) = 0 \quad\text{and}\quad j_*(T^R\omega) = j_*(\omega).$$
Now, for $\omega\in\Omega$, define the coating length $\ell(\omega)$ to be, if $\omega$ is bad, the smallest integer $n \ge 1$ such that:
$$\frac{1}{n}\sum_{\substack{0\le k<n\\ T^{kR}\omega\notin\Omega_*}}\log K^R(T^{kR}\omega) \le \epsilon^{1/2}\sigma R \tag{3.1}$$
(or infinity if there is no such $n$). If $\omega$ is good, set $\ell(\omega) = 1$.


Given $\omega\in\Omega$, we define the coating intervals along the orbit starting at $\omega$ to be the integer intervals $[a_i, b_i[$, $i \ge 1$, defined by:
$$a_i = \min\{k > b_{i-1} : T^{kR}\omega \text{ is bad}\} \quad\text{and}\quad b_i = a_i + \ell(T^{a_iR}\omega)$$
for $i \ge 1$ (setting $b_0 = -1$). These correspond to the intervals of type (3) in the statement of the proposition.

Remark 3.3. The idea of making the grouping varying with $\omega$ is borrowed from V. Baladi and M. Benedicks. But our grouping is directly given by the “deviations” of our map process. In this way we get interesting estimates which are both precise and valid for all $\omega$’s, not only most of them. This is necessary for us.

Remark that $\ell(\omega) < \infty$ for a.a. $\omega$ with $j_*(\omega) = 0$. Indeed,
$$\frac{1}{n}\sum_{\substack{0\le k<n\\ T^{kR}\omega\notin\Omega_*}}\log K^R(T^{kR}\omega) = \frac{1}{n}\sum_{0\le k<n}\log K^R(T^{kR}\omega) - \frac{1}{n}\sum_{\substack{0\le k<n\\ T^{kR}\omega\in\Omega_*}}\log K^R(T^{kR}\omega)$$
and, for large $n$, the first average is $\le (\sigma+\epsilon)R$ whereas the second one is $\ge (1-\epsilon)\cdot(\sigma-\epsilon)R$ (lower bound on the number of good blocks, using $j_* = 0$, times the lower bound on $K^R$ for a good block). Hence, if $j_*(\omega) = 0$:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{\substack{0\le k<n\\ T^{kR}\omega\notin\Omega_*}}\log K^R(T^{kR}\omega) < (2+\sigma)\epsilon R. \tag{3.2}$$
$\epsilon > 0$ being small, we may assume: $(2+\sigma)\epsilon < \epsilon^{1/2}\sigma$, so that $\ell(\omega)$ is indeed finite for a.a. $\omega$ with $j_*(\omega) = 0$.

Remark also that for bad $\omega\in\Omega$, $\ell(\omega) \ge 2$. Indeed, by definition $K^R(\cdot) \ge 6^R$ and so $\ell(\omega) = 1$ would imply $\log 6 \le \epsilon^{1/2}\sigma$, contradicting the smallness of $\epsilon$. Now, if $2 \le m \le \ell(\omega)$ then $m - 1 \ge m/2$ and:
$$\sum_{\substack{0\le k<m-1\\ T^{kR}\omega\notin\Omega_*}}\log K^R(T^{kR}\omega) > \epsilon^{1/2}\sigma R\cdot(m-1) \ge \frac{\epsilon^{1/2}\sigma}{2}Rm. \tag{3.3}$$
This estimate, which is given by the minimality of $\ell(\omega)$, will imply that the union of the coating intervals is a small proportion of a typical orbit.

3.2. Coating intervals preserve $C_a$.

Lemma 3.4. For all $\omega\in\Omega$ with $\ell(\omega) < \infty$,
$$L^{\ell(\omega)R}_\omega(C_a) \subset C_a.$$
In particular $L^{\ell(\omega)R}_\omega : C_a \to C_a$ is a weak contraction (i.e., it does not increase $\theta_a$-distances).


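Proof. If $\omega$ is good, this is simply Lemma 2.5. Assume that $\omega$ is bad. We have to prove that the variation stays bounded by $a$ after the application of $L^{\ell(\omega)R}_\omega$ to a function $h \in C_a$. We evaluate $\mathrm{var}(L^{\ell R}_\omega h)$ by composing the inequality (2.3) (for good blocks) and the one given by (LY1): $\mathrm{var}(L^R_\omega h) \le K^R(\omega)(\mathrm{var}(h) + \|h\|_1)$ (for the others). We get an inequality similar to (2.2):
$$\mathrm{var}(L^{kR}_\omega h) \le \Gamma(\omega)\cdots\Gamma(T^{(k-1)R}\omega)\,\mathrm{var}(h) + \sum_{j=0}^{k-1}B(T^{jR}\omega)\cdot\Gamma(T^{(j+1)R}\omega)\cdots\Gamma(T^{(k-1)R}\omega)\cdot\|h\|_1,$$
where $\Gamma$ and $B$ are defined by: $\Gamma(\gamma) = e^{-(\lambda-O(\epsilon))R}$ and $B(\gamma) = B_*$ if $\gamma$ is good, $\Gamma(\gamma) = B(\gamma) = K^R(\gamma)$ otherwise. We estimate the products in the above inequality.

First, the left-hand side in (3.1) is $> \epsilon^{1/2}\sigma R$ if $n < \ell \stackrel{\mathrm{def}}{=} \ell(\omega)$. Comparing with the definition of $\ell(\omega)$, we get, for $0 \le j < \ell$:
$$\frac{1}{\ell-j}\sum_{\substack{j\le k<\ell\\ T^{kR}\omega\notin\Omega_*}}\log K^R(T^{kR}\omega) < \epsilon^{1/2}\sigma R. \tag{3.4}$$
Second, as $K^R(\cdot) \ge 6^R$, the proportion of bad blocks is bounded by:
$$\frac{1}{\ell-j}\#\{j \le k < \ell : T^{kR}\omega \notin \Omega_*\} \le \tau \stackrel{\mathrm{def}}{=} \frac{\epsilon^{1/2}\sigma}{\log 6}. \tag{3.5}$$
Hence we have, for $0 \le j \le \ell$:
$$\Gamma(T^{jR}\omega)\cdots\Gamma(T^{(\ell-1)R}\omega) \le e^{-(\lambda-O(\epsilon))R\cdot(1-\tau)(\ell-j)}\cdot e^{\epsilon^{1/2}\sigma R(\ell-j)} \le e^{-(\lambda-O(\epsilon^{1/2}))R(\ell-j)},$$
where we have used (3.5) and (2.3) (for the good blocks) and (3.4) (for the bad blocks). Now, as $K^R(\cdot) \ge 1$, we have for $0 \le j < \ell$:
$$K^R(T^{jR}\omega) \le \prod_{\substack{j\le k<\ell\\ T^{kR}\omega\notin\Omega_*}}K^R(T^{kR}\omega) \le e^{\epsilon^{1/2}\sigma R(\ell-j)},$$
using again (3.4). Thus:
$$\mathrm{var}(L^{\ell R}_\omega h) \le e^{-(\lambda-O(\epsilon^{1/2}))R\ell}\,\mathrm{var}(h) + \sum_{j=0}^{\ell-1}B_*\,e^{\epsilon^{1/2}\sigma R(\ell-j)}\cdot e^{-(\lambda-O(\epsilon^{1/2}))R(\ell-j)}\,\|h\|_1$$
with no random factor $C(\omega)$, thanks to the definition of $\ell = \ell(\omega)$. Finally:
$$e^{-(\lambda-O(\epsilon^{1/2}))R}\,e^{\epsilon^{1/2}\sigma R} \le e^{-(\lambda/2)R} < 1/3.$$
We get:
$$\mathrm{var}(L^{\ell R}_\omega h) \le \frac12\,\mathrm{var}(h) + 2B_*\|h\|_1. \tag{3.6}$$
Therefore $L^{\ell R}_\omega$ maps $C_a$ into itself as $a/2 + 2B_* \le a$ (recall $a \ge 6B_*$). □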
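The construction of Sect. 3.1 is algorithmic, and a finite-data sketch may help to visualize it. The names are hypothetical; `good[k]` stands for "$T^{kR}\omega$ is good", `log_k[k]` for $\log K^R(T^{kR}\omega)$ and `threshold` for $\epsilon^{1/2}\sigma R$. The sketch follows the strict inequality $k > b_{i-1}$ of the text literally and simply stops when a coating length is not reached within the available data.

```python
def coating_length(start, good, log_k, threshold):
    """Smallest n >= 1 such that the sum of log_k over the bad blocks of
    [start, start + n), divided by n, is <= threshold (1 if the start block
    is good, None if no such n exists within the data)."""
    if good[start]:
        return 1
    bad_sum = 0.0
    for n in range(1, len(good) - start + 1):
        k = start + n - 1
        if not good[k]:
            bad_sum += log_k[k]
        if bad_sum / n <= threshold:
            return n
    return None


def coating_intervals(good, log_k, threshold):
    """Integer intervals [a_i, b_i) with a_i the first bad block after b_{i-1}."""
    intervals = []
    b_prev = -1  # b_0 = -1 as in the text
    while True:
        a = next((k for k in range(b_prev + 1, len(good)) if not good[k]), None)
        if a is None:
            break
        length = coating_length(a, good, log_k, threshold)
        if length is None:
            break  # coating length not reached within the finite data
        intervals.append((a, a + length))
        b_prev = a + length
    return intervals
```

On a sample orbit with one very bad block followed by good ones, the bad block is "coated" by just enough good blocks to bring the average below the threshold, which is the mechanism exploited in (3.3) and (3.4).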


3.3. The total coating length is small.

Lemma 3.5. For a.a. $\omega\in\Omega$ with $j_*(\omega) = 0$ and for all $N \ge N_0(\omega)$ and all $|P| \le N$, the coating intervals make up only a small proportion of the orbit of length $NR$ starting at $T^{PR}\omega$:
$$\#\left(\bigcup_{i\ge 1}[a_i, b_i[\ \cap\ [P, P+N[\right) \le O(\epsilon^{1/2})N.$$

Proof. Let $L$ be the sum of the lengths of the coating intervals included in $[P, P+N[$ and $\ell_0$ be the length of intersection of the last coating interval with $[P, P+N[$ (or zero if this last interval is completely included in $[P, P+N[$). Remark that (3.2) may be extended to orbits of the form $[P, P+N[$ with $|P| \le N$. Indeed, write $[P, P+N[\ \subset [-N, 0] \cup [0, 2N[$ and remark that (3.2) also holds for the negative orbit $[-N, 0]$. Comparing with (3.3) we get, for all large $N$:
$$\frac12\epsilon^{1/2}\sigma(L + \ell_0) < 3(2+\sigma)\epsilon N$$
so that:
$$L + \ell_0 < 6\left(\frac{2}{\sigma} + 1\right)\epsilon^{1/2}N.$$
□

4. Proof of the Main Theorem

4.1. Exponential contraction by $L^n_{T^p\omega}$, $|p| \le n$.

Proposition 4.1. Let $\epsilon > 0$ be small enough and $V : \Omega \to \mathbb{R}_+$ be measurable. Then there exist a constant $\rho < 1$ and a measurable function $n_0 : \Omega \to \mathbb{N}$ such that, for a.a. $\omega\in\Omega$, all $n \ge n_0(\omega)$, all $\ell \ge 0$ and all $|p| \le n$:
$$\theta_+(L^n_{T^p\omega}h, L^{n+\ell}_{T^{p-\ell}\omega}g) \le \rho^n$$
for all $h, g : X \to \mathbb{R}_+$ with $m(h) = m(g) = 1$ and $\mathrm{var}(h) \le V(\omega)e^{\epsilon n}$, $\mathrm{var}(g) \le V(\omega)e^{\epsilon(n+\ell)}$. Moreover $\rho = \rho(\epsilon)$ does not depend on $V$.

Proof of the Proposition. Let $\epsilon > 0$ be given. Recall that $C_0 : \Omega \to \mathbb{R}$ was defined in Lemma 2.1 so that:
$$\mathrm{var}(L^n_{T^{-n}\omega}h) \le C_0(\omega)e^{-(\lambda-\epsilon)n}\,\mathrm{var}(h) + C_0(\omega)\|h\|_1$$
for a.a. $\omega\in\Omega$. $\epsilon > 0$ defines $B_*$, $R$, $a$ (see Sect. 2.2). Let $n, \ell \ge 0$ and $|p| \le n$. Set $j_* = j_*(T^p\omega)$ (see Sect. 3.1). Let $d$ be the smallest integer $\ge 0$ satisfying:
(1) $j_* + dR \ge \frac{1}{\lambda-\epsilon}(\epsilon n + \log V(\omega))$;
(2) $C_0(T^{p+j_*+dR}\omega) \le B_*$.

These two conditions fail for at most $O(\epsilon)\frac{|p|+|p+n|}{R} = O(\epsilon)n/R$ integers $d$ in the interval $[-|p|/R, |p+n|/R]$, using the definition of $j_*$ and assuming that $n$ is large. Thus $dR \le O(\epsilon)n$.

Set $p' = p + j_* + dR$ and $h' = L^{p'-p}_{T^p\omega}h$. We have, by Lemma 2.1:
$$\mathrm{var}(h') \le B_*e^{-(\lambda-\epsilon)(p'-p)}\,\mathrm{var}(h) + B_*\|h\|_1 \le B_*\frac{\mathrm{var}(h)}{V(\omega)e^{\epsilon n}} + B_*\|h\|_1 \le 2B_* \le a.$$


Similarly, setting $g' = L^{p'-p+\ell}_{T^{p-\ell}\omega}g$, we have: $\mathrm{var}(g') \le a$ (we may assume that $\epsilon \le \lambda/2$).

Thus, $\theta_+(L^n_{T^p\omega}h, L^{n+\ell}_{T^{p-\ell}\omega}g) = \theta_+(L^{n'}_{T^{p'}\omega}h', L^{n'}_{T^{p'}\omega}g')$ with $h', g' \in C_a$, $j_*(T^{p'}\omega) = 0$ and $n' = n - (p'-p) \ge (1 - O(\epsilon))n$. From now on we forget about the primes.

Write $n = mR + r$ with $0 \le r < R$ and $\gamma \stackrel{\mathrm{def}}{=} T^p\omega$. Let $L$ be the total length of the intersections of the coating intervals of the orbit starting at $\gamma$ with $[0, m+1[$. By Proposition 3.1, $L \le O(\epsilon^{1/2})m$ for all large $m$, i.e., $m \ge m_0(\omega)$. Let $k = 1$ if $\gamma$ (i.e., the first block) is good, $k = \ell(\gamma) + 1$ otherwise. Clearly $k \le L + 1 \le O(\epsilon^{1/2})m$ for all large $m$. In particular, $k < m$. We have, for all large $m$:
$$\begin{aligned}
\theta_+(L^n_\gamma h, L^n_\gamma g) &\le \theta_+(L^{mR}_\gamma h, L^{mR}_\gamma g) && \text{by (i)}\\
&\le \theta_a(L^{mR}_\gamma h, L^{mR}_\gamma g) && \text{by (ii)}\\
&\le \kappa^{m-L-1}\,\theta_a(L^{kR}_\gamma h, L^{kR}_\gamma g) && \text{by (iii)}\\
&\le \kappa^{(1-O(\epsilon^{1/2}))m}\,\Delta && \text{by (iv)}\\
&\le \Delta\rho^n
\end{aligned}$$
with constants $\rho = \kappa^{(1-O(\epsilon^{1/2}))/R} < 1$ and $\Delta < \infty$, using:
(i) $L^n_\gamma = L^r_{T^{p+n-r}\gamma}\circ L^{mR}_\gamma$ and $L^r_{T^{p+n-r}\gamma}$ preserves the cone $C_+$ hence is a weak contraction for the projective metric $\theta_+$;
(ii) $\theta_+ \le \theta_a$ as $C_+ \supset C_a$;
(iii) $L^{\ell(T^{jR}\gamma)R}_{T^{jR}\gamma}$ is a contraction of $C_a$: weak if $T^{jR}\gamma$ is bad (Proposition 3.1), strong, by a factor $\kappa = \kappa(\epsilon)$, if $T^{jR}\gamma$ is good (Lemma 2.5); $LR$ is the total length of the coating intervals;
(iv) $L \le O(\epsilon^{1/2})m$ and $L^{kR}_\gamma(C_a) \subset L^R_{T^{(k-1)R}\gamma}(C_a)$ by Proposition 3.1, but this last set has diameter $\Delta < \infty$ by Lemma 2.5 as $T^{(k-1)R}\gamma$ is good.

This gives the proposition. □

Proof of the Main Theorem, 1. We prove the existence, uniqueness and bounded variation on fibers of the invariant density. Fix $\epsilon > 0$ small enough so that Proposition 4.1 applies. Let $H : X \to \mathbb{R}_+$ be given, some function of bounded variation. Set:
$$h_n(\omega, x) = (L^n_{T^{-n}\omega}H)(x)$$
for $n \ge 0$. For a.a. $\omega$, the sequence $h_n(\omega,\cdot) : X \to \mathbb{R}$, $n \ge 0$, is Cauchy w.r.t. $\theta_+$. Indeed, the previous proposition gives for all large $n$, all $\ell \ge 0$: $\theta_+(h_n, h_{n+\ell}) \le \rho^n$. As $\|h_n(\omega,\cdot)\|_1 = 1$, the sequence $h_n(\omega,\cdot)$ is also Cauchy in $L^\infty(m)$. Hence it converges in this space, for a.a. $\omega$. Note that each $h_n$ is measurable by (LY0), so that the limit $h$ is measurable. As $\|h_n(\omega,\cdot) - h(\omega,\cdot)\|_1 \le 1$ for a.a. $\omega$, the convergence must also take place in $L^1(\mathbb{P}\times m)$. $h$ is globally invariant, i.e.: $L_\omega h_\omega = h_{T\omega}$ for a.a. $\omega$.

Now, Lemma 2.1 shows that $\mathrm{var}(h_n(\omega,\cdot)) \le C(\omega) < \infty$ for all $n \ge 0$. But the set of functions $X \to \mathbb{R}$ with variation bounded by some constant is compact in $L^1(m)$ by (V4). Hence the limit $h_\omega$ must also be of variation $\le C(\omega) < \infty$ for a.a. $\omega$.


We turn to the uniqueness. Assume that $g \in L^1(\mathbb{P}\times m)$ is another globally invariant normalized density. By the ergodicity of $(T, \mathbb{P})$, $\|g_\omega\|_1 = 1$ for a.a. $\omega$. By (V6) one can find for every $t > 0$, $\tilde g : \Omega\times X \to \mathbb{R}_+$ with uniformly bounded variation on fibers and $\|\tilde g_\omega\|_1 = 1$ for all $\omega$ such that $\|g - \tilde g\|_1 \le t$. Now the previous proposition gives:
$$\theta_+(h_\omega, L^n_{T^{-n}\omega}\tilde g_{T^{-n}\omega}) \le \rho^n$$
for large $n$. As above, this shows that $L^n\tilde g$ defined by $L^n\tilde g(\omega, x) \stackrel{\mathrm{def}}{=} L^n_{T^{-n}\omega}\tilde g_{T^{-n}\omega}(x)$ converges in $L^1(\mathbb{P}\times m)$ to $h$. Hence:
$$\|h - g\|_1 \le \limsup_{n\to\infty}\|L^n(\tilde g - g)\|_1 \le \|\tilde g - g\|_1 \le t.$$
Therefore $h = g$ (mod $m$). This proves the uniqueness of the globally invariant density. □

To prove the exponential speed in $L^n_{T^{-n}\omega}H \to h_\omega$, we shall need:

Lemma 4.2. Let $(\Omega, \mathbb{P}, T, L)$ be a good random transfer operator and $h$ be a globally invariant density with $\mathrm{var}(h_\omega) < \infty$ for a.a. $\omega\in\Omega$. For all $s > 0$, for a.a. $\omega$ and all $q \in \mathbb{Z}$:
$$\|h_{T^q\omega}\|_\infty + \mathrm{var}(h_{T^q\omega}) \le C(\omega)e^{s|q|}.$$

Proof of the lemma. Let $s > 0$. Set $\sigma_0 = \int\log K\,d\mathbb{P}$, $K$ given by (LY1). Without loss of generality, we assume that $s/2(\sigma_0+1) < 1$. By the ergodic theorem, there exists $\Omega_2 \subset \Omega$ with measure $> 1 - \frac{s}{4(\sigma_0+1)}$ such that for all $w \in \Omega_2$ and all large $n$ (i.e., $n \ge n_0(s)$) and each choice of the sign $\pm$:
$$\sum_{0\le j<n}\log K(T^{\pm j}w) \le (\sigma_0+1)n.$$
As $\mathrm{var}(h_w) < \infty$ for a.a. $w\in\Omega$, for $A = A(s) < \infty$ large enough, $\mathbb{P}(\{w : \mathrm{var}(h_w) > A\}) < \frac{s}{4(\sigma_0+1)}$. Hence, for a.a. $\omega$ and each choice of $\pm$:
$$\lim_{k\to\pm\infty}\frac{1}{|k|}\#\{0 \le j < |k| : \mathrm{var}(h_{T^{\pm j}\omega}) > A \text{ or } T^{\pm j}\omega \notin \Omega_2\} < \frac{s}{2(\sigma_0+1)} < 1. \tag{4.1}$$
Now, consider the largest integer $0 \le m < (1 - s/2(\sigma_0+1))|q|$ such that $\mathrm{var}(h_{T^{\pm m}\omega}) \le A$ and $T^{\pm m}\omega \in \Omega_2$ ($\pm$ being defined by $q = \pm|q|$). For all large $|q|$, $m$ exists and satisfies $|q| - m < \frac{s}{\sigma_0+1}|q|$ by the estimate (4.1): fewer than $\frac{s}{2(\sigma_0+1)}|q|$ indices in $[0, |q|[$ are excluded, so an admissible index exists in any window of that length. Thus, using (LY1), we get for all large $|q|$:
$$\mathrm{var}(h_{T^q\omega}) + \|h_{T^q\omega}\|_1 \le (\mathrm{var}(h_{T^{\pm m}\omega}) + 1)\,K(T^{\pm m}\omega)\cdots K(T^{\pm(|q|-1)}\omega) \le (A+1)\exp\sum_{j=m}^{|q|-1}\log K(T^{\pm j}\omega) \le (A+1)e^{s|q|},$$
$|q| - m \ge \frac{s}{2(\sigma_0+1)}|q|$ being also large. As $\mathrm{var}(h_w) + \|h_w\|_1$ is finite for a.a. $w\in\Omega$, this bound for large $|q|$ gives a bound for all $q$, at the price of a factor $C(\omega)$. Finally, the use of inequality (V3) concludes the proof:
$$\mathrm{var}(h_w) + \|h_w\|_\infty \le (C_{\mathrm{var}}+1)(\mathrm{var}(h_w) + \|h_w\|_1).$$
□


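Proof of the Main Theorem, 2. We prove the exponential convergence of $L^n_{T^{-n}\omega}H$ to the invariant density $h_\omega$ for any $H : X \to \mathbb{R}_+$ with $m(H) = 1$ and $\mathrm{var}(H) < \infty$. Fix $\epsilon > 0$ small enough so that Proposition 4.1 applies. $\mathrm{var}(h_{T^{-n}\omega}) \le C_*(\omega)e^{\epsilon n}$ by the previous lemma. Proposition 4.1 with $V(\omega) = \max(C_*(\omega), \mathrm{var}(H))$ gives $\rho < 1$ such that:
$$\theta_+(L^n_{T^{-n}\omega}H, h_\omega) = \theta_+(L^n_{T^{-n}\omega}H, L^n_{T^{-n}\omega}h_{T^{-n}\omega}) \le \rho^n$$
for all large $n \ge 0$. But:
$$\left\|L^n_{T^{-n}\omega}H - h_\omega\right\|_\infty \le \|h_\omega\|_\infty\left(\exp\theta_+(L^n_{T^{-n}\omega}H, h_\omega) - 1\right) \le C(\omega)\rho^n \le \tilde\rho^n$$
for any $\rho < \tilde\rho < 1$ and all large $n$. □

Proof of the Main Theorem, 3. We prove the exponential decay of backward ($p = -n$) and forward ($p = 0$) correlations. Fix $\epsilon > 0$ small enough so that Proposition 4.1 applies. We may assume that $\varphi, \psi \ge 0$, none being identically zero, the general case following readily from this one.

Set $\gamma \stackrel{\mathrm{def}}{=} T^p\omega$ and $\phi = \frac{1}{\mu_\gamma(\varphi)}\varphi$. Of course,
$$|C_\gamma(\varphi,\psi,n)| = \mu_\gamma(\varphi)\,|C_\gamma(\phi,\psi,n)|.$$
By Lemma 1.4:
$$|C_\gamma(\phi,\psi,n)| \le \|\psi\|_\infty\,\|h_{T^{p+n}\omega}\|_\infty\left(\exp\theta_+\left(L^n_\gamma(\phi h_\gamma), L^n_\gamma(h_\gamma)\right) - 1\right).$$
Set $s > 0$ so small that, $\rho$ being given by Proposition 4.1, $\rho e^s < 1$. By Lemma 4.2, $\mathrm{var}(h_\gamma) \le V(\omega)e^{s|p|}$ for some a.e.-finite function $V$ and this is small enough w.r.t. Proposition 4.1. On the other hand, we do not control the variation of $\phi$ because of the normalization. To solve this problem, we borrow the following trick from [19]. Set:
$$Q \stackrel{\mathrm{def}}{=} \frac{\mathrm{var}(\phi h_\gamma)}{V(\omega)e^{s|p|}}$$
so that (as a simple computation shows): $\mathrm{var}\left(\frac{\phi + Q\cdot 1}{1+Q}h_\gamma\right) \le 2V(\omega)e^{s|p|}$. Also, using $\|h_\gamma\|_1 = \|\phi\|_1 = 1$:
$$\begin{aligned}
0 \le Q &\le \frac{(\mathrm{var}(h_\gamma) + \|h_\gamma\|_\infty)(\mathrm{var}(\phi) + \|\phi\|_\infty)}{V(\omega)e^{s|p|}}\\
&\le (C_{\mathrm{var}}+1)^2\,\frac{(\mathrm{var}(h_\gamma) + \|h_\gamma\|_1)(1 + \mathrm{var}(\phi))}{V(\omega)e^{s|p|}}\\
&\le (C_{\mathrm{var}}+1)^2\,\frac{V(\omega)e^{s|p|}}{V(\omega)e^{s|p|}}\,(1 + \mathrm{var}(\phi))\\
&\le (C_{\mathrm{var}}+1)^2\cdot(1 + \mathrm{var}(\phi)).
\end{aligned}$$
Now, $\phi \mapsto C_\gamma(\phi,\psi,n)$ is linear. Hence:
$$|C_\gamma(\phi,\psi,n)| \le (1+Q)\left|C_\gamma\left(\frac{\phi + Q\cdot 1}{1+Q},\psi,n\right)\right| + Q\underbrace{|C_\gamma(1,\psi,n)|}_{=0}.$$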


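Apply Lemma 1.4:
$$|C_\gamma(\phi,\psi,n)| \le (1+Q)\,\|\psi\|_\infty\,\|h_{T^{p+n}\omega}\|_\infty\times\left(\exp\theta_+\left(L^n_\gamma\left(\frac{\phi+Q}{Q+1}h_\gamma\right), L^n_\gamma h_\gamma\right) - 1\right).$$
We may now apply Proposition 4.1 and we get, for all large $n$:
$$|C_\gamma(\phi,\psi,n)| \le (1+Q)\,\|\psi\|_\infty\,\|h_{T^{p+n}\omega}\|_\infty\left(\exp\rho^n - 1\right).$$
Using the above estimate on $Q$ and Lemma 4.2, we get, for large $n$:
$$|C_\gamma(\phi,\psi,n)| \le C(\omega)(\mathrm{var}(\phi)+1)\,\|\psi\|_\infty\,C(\omega)e^{sn}\cdot 2\rho^n.$$
To get back to $\varphi$, remark that:
$$\mu_\gamma(\varphi)(\mathrm{var}(\phi)+1) = (\mathrm{var}(\varphi) + \mu_\gamma(\varphi)) \le (\mathrm{var}(\varphi) + \|\varphi\|_\infty) \le (C_{\mathrm{var}}+1)(\mathrm{var}(\varphi) + \|\varphi\|_1).$$
We may thus conclude that for large $n$:
$$|C_\gamma(\varphi,\psi,n)| \le C(\omega)(\mathrm{var}(\varphi) + \|\varphi\|_1)\,\|\psi\|_\infty\,(\rho e^s)^n,$$
and recall that $\rho e^s < 1$. Finally this estimate extends to all $n \ge 0$ by enlarging $C(\omega)$: it is enough to recall that $|C_\gamma(\varphi,\psi,n)| \le 2\|\varphi\|_\infty\|\psi\|_\infty \le 2C_{\mathrm{var}}\cdot(\mathrm{var}(\varphi) + \|\varphi\|_1)\|\psi\|_\infty$. □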

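Appendix A. Counter-Examples

Example 1: Slow decay of integrated correlations. For each $\delta > 0$, there exists a random p.m.n.s. interval map $f$ satisfying (A0)–(A4) such that if $\varphi(x) \stackrel{\mathrm{def}}{=} 2\cdot 1_{[0,1/2]}(x)$, then the integrated correlation function satisfies:
$$C_{\mathrm{int}}(\varphi,\varphi,n) \stackrel{\mathrm{def}}{=} \int C_\omega(\varphi,\varphi,n)\,d\mathbb{P} \sim \mathrm{const}\cdot n^{-\delta} \quad\text{as } n\to\infty$$
($a \sim b$ meaning that $\lim a/b = 1$). In particular, for $0 < \delta \le 1$:
$$\sum_{n\ge 1}C_{\mathrm{int}}(\varphi,\varphi,n) = \infty.$$

Construction. Let $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde{\mathbb{P}}, \tilde T)$ be the Bernoulli shift with symbol set $\{1, 2, \dots\}$ and probability vector $(Z, Z/2^{2+\delta}, \dots, Z/n^{2+\delta}, \dots)$ ($Z$ being the normalization constant). Let $h(\omega) = \omega_0$ (the 0th coordinate of $\omega\in\tilde\Omega$). $\int_{\tilde\Omega}h\,d\tilde{\mathbb{P}} = \sum_{n\ge 1}Z/n^{1+\delta} < \infty$. Let $(\Omega, \mathcal{A}, \mathbb{P}, T)$ be the suspension by $h$ over $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde{\mathbb{P}}, \tilde T)$, i.e.:
$$\Omega = \{(\omega, n) \in \tilde\Omega\times\mathbb{N} : 0 \le n < h(\omega)\},$$
$$T(\omega, n) = (\omega, n+1) \text{ if } n+1 < h(\omega) \quad\text{and}\quad T(\omega, h(\omega)-1) = (\tilde T\omega, 0),$$
$$\mathbb{P}(A) = \left(\int h\,d\tilde{\mathbb{P}}\right)^{-1}\sum_{n\ge 0}\tilde{\mathbb{P}}(A \cap (\tilde\Omega\times\{n\})).$$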


Write $E(\cdot)$ for the integer part and $\{\cdot\}$ for the fractional part. Define $g_0 : [0, 1] \to [0, 1]$ to be the doubling map, i.e.: $g_0(x) = \{2x\}$. Define $g_1 : [0, 1] \to [0, 1]$ by requiring that the restrictions $g_1 : [0, 1/2[ \to [0, 1/2[$ and $g_1 : [1/2, 1[ \to [1/2, 1[$ be two scaled copies of the doubling map, i.e.: $g_1(x) = \frac{1}{2}(E(2x) + \{4x\})$. Define:

$$f_{(\omega,i)} = \begin{cases} g_1 & \text{if } i < h(\omega) - 1 \\ g_0 & \text{if } i = h(\omega) - 1. \end{cases}$$

Clearly the expansion is $\delta(\omega, i) = 2$, the number of pieces is $N(\omega, i) = 2$ or $4$ and $V(\omega, i) = 0$, so that (A0)–(A3) are satisfied in the most uniform way possible. The random covering property (A4) is also satisfied, but with very large waiting time. For instance, the time $n_c(\omega, i)$ one has to wait for the image of $[0, 1/2]$ to cover the whole interval $[0, 1]$ is the first time the orbit reaches the top level of the tower:

$$n_c(\omega, i) = \min\{k \ge 1 : T^{k-1}(\omega, i) \in \{(\omega', j) \in \Omega : j = h(\omega') - 1\}\} = h(\omega) - i$$

($\int n_c\, d\mathbb{P} = \infty$ for $\delta \le 1$, see Example 2). $\mu = \mathbb{P} \times m$ is the unique a.c.i.m. for the skew product $F$. It is ergodic.

Consider now $\varphi(x) = 2 \cdot 1_{[0,1/2]}(x)$. $\varphi$ has bounded variation and $\int_{[0,1]} \varphi\, dm = 1$. We have, for any $\psi \in L^1(m)$, the following conditional expectations:

$$E_m\big(\psi \circ g_1 \,\big|\, \{[0, 1/2], [1/2, 1]\}\big) = E_m\big(\psi \,\big|\, \{[0, 1/2], [1/2, 1]\}\big),$$

$$E_m\big(\psi \circ g_0 \,\big|\, \{[0, 1/2], [1/2, 1]\}\big) = E_m(\psi).$$
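The two conditional-expectation identities for the doubling map $g_0$ and its two-copy variant $g_1$ can be checked numerically. The following is only a sketch: the helper name `half_averages`, the test function $\psi(x) = x$ and the dyadic midpoint grid are illustrative assumptions, not part of the paper.

```python
import math


def g0(x):
    """Doubling map x -> {2x}."""
    return (2.0 * x) % 1.0


def g1(x):
    """Two scaled copies of the doubling map: x -> (E(2x) + {4x}) / 2."""
    return 0.5 * (math.floor(2.0 * x) + (4.0 * x) % 1.0)


def half_averages(psi, g, k=12):
    """Average psi(g(x)) over dyadic midpoints of [0, 1/2) and of [1/2, 1).

    Hypothetical helper: the grid averages stand in for the conditional
    expectation with respect to the partition {[0, 1/2], [1/2, 1]}.
    """
    n = 2 ** k
    pts = [(j + 0.5) / n for j in range(n)]
    left = [psi(g(x)) for x in pts[: n // 2]]
    right = [psi(g(x)) for x in pts[n // 2:]]
    return sum(left) / len(left), sum(right) / len(right)


def psi(x):
    """Illustrative test function; E(psi | left) = 1/4, E(psi | right) = 3/4."""
    return x


if __name__ == "__main__":
    print("g1:", half_averages(psi, g1))  # compare with (1/4, 3/4)
    print("g0:", half_averages(psi, g0))  # compare with (1/2, 1/2)
```

With $g_1$ the averages on each half reproduce those of $\psi$ itself, while with $g_0$ both halves return the global mean of $\psi$, which is the mechanism behind the waiting time $n_c$.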

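Therefore:

$$\int_{[0,1]} \varphi \cdot \varphi \circ f^n_{(\omega,i)}\, dm = 2 \int_{[0,1/2]} \varphi \circ f^n_{(\omega,i)}\, dm = \begin{cases} 2 & \text{if } n < n_c(\omega, i) \\ 1 & \text{otherwise.} \end{cases}$$

Thus,

$$C_{(\omega,i)}(\varphi, \varphi, n) = \begin{cases} 1 & \text{if } n < n_c(\omega, i) \\ 0 & \text{otherwise,} \end{cases}$$

so that:

$$C_{\mathrm{int}}(\varphi, \varphi, n) = \mu\big(\tilde\Omega \times \{n, n+1, \dots\}\big) = \sum_{m > n} \mu\big([m] \times \{n, \dots, m-1\}\big) = \sum_{m > n} (m - n) \frac{Z}{m^{2+\delta}} \sim \frac{\mathrm{const}}{n^{\delta}} \quad \text{as } n \to \infty.$$

This achieves the construction. □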
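The asymptotics of the sum $\sum_{m>n} (m-n)\, m^{-(2+\delta)}$ that closes Example 1 can be sanity-checked numerically. The sketch below drops the normalization constant and compares the sum with the integral $\int_n^\infty (x-n)\, x^{-(2+\delta)}\, dx = n^{-\delta}/(\delta(1+\delta))$. The function names and the cutoff are illustrative assumptions.

```python
def integrated_correlation(n, delta, cutoff=200_000):
    """Approximate S(n) = sum_{m > n} (m - n) / m**(2 + delta).

    The terms up to `cutoff` are summed directly; the remaining tail is
    replaced by the integral of (x - n) * x**-(2 + delta) over [cutoff, oo).
    """
    head = sum((m - n) / m ** (2.0 + delta) for m in range(n + 1, cutoff + 1))
    tail = cutoff ** (-delta) / delta - n * cutoff ** (-1.0 - delta) / (1.0 + delta)
    return head + tail


def predicted(n, delta):
    """Leading-order value n**-delta / (delta * (1 + delta)) from the integral."""
    return n ** (-delta) / (delta * (1.0 + delta))


if __name__ == "__main__":
    for n in (100, 1000):
        ratio = integrated_correlation(n, 0.5) / predicted(n, 0.5)
        print(n, ratio)  # the ratio should be close to 1
```

For $0 < \delta \le 1$ the predicted values $n^{-\delta}$ are not summable, which is the divergence stated in Example 1.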


J. Buzzi

Example 2: Non-integrability of $\log^+ K$. There exists a random p.m.n.s. interval map $f$ satisfying (A0)–(A4) such that:

$$\int \log^+ \sup_{n \ge 0} \frac{|C_\omega(\varphi, \psi, n)|}{\rho^n}\, d\mathbb{P} \ge \mathrm{const} \cdot \int n_c\, d\mathbb{P} = \infty,$$

where $n_c$ is the (random) waiting time for some fixed interval.

Proof. We take the previous example with $\delta \le 1$ and the same $\varphi(x) = 2 \cdot 1_{[0,1/2]}(x)$. We have, for $n < n_c(\omega, i)$: $C_{(\omega,i)}(\varphi, \varphi, n) = 1 \le K(\omega, i) \cdot 2 \cdot 2 \cdot \rho^n$. Thus, $\log K(\omega, i) \ge |\log \rho|\, n_c(\omega, i) + \mathrm{const}$. But:

$$\int n_c\, d\mathbb{P} = \sum_{n \ge 0} \mathbb{P}(n_c > n) = \sum_{n \ge 0} \int C_{(\omega,i)}(\varphi, \varphi, n)\, d\mathbb{P} = \sum_{n \ge 0} C_{\mathrm{int}}(\varphi, \varphi, n) = \infty,$$

because of Example 1. □

Example 3: Mixing without covering. There exists a random p.m.n.s. interval map $f$ satisfying (A0)–(A3), with a unique a.c.i.m. $\mu$ and such that: (1) $\operatorname{supp} \mu_\omega = [0, 1]$ for a.a. $\omega$, (2) $\mu$ is mixing w.r.t. the skew-product $F$, (3) $f$ does not satisfy the covering assumption (A4). More precisely, (3') there exists $\varphi : [0, 1] \to \mathbb{R}$ with bounded variation such that: $C_\omega(\varphi, \varphi, n) = \pm 1$ for all $n \ge 0$. Thus this example is globally mixing but not randomly (i.e., pathwise) mixing.

Construction. Let $(\Omega, \mathcal{A}, \mathbb{P}, T)$ be the $(\frac{1}{2}, \frac{1}{2})$-Bernoulli shift on $\{-1, +1\}^{\mathbb{Z}}$. Let $g_1 : [0, 1] \to [0, 1]$ be, as above, the juxtaposition of two copies of the doubling map, one on $[0, 1/2[$, the other on $[1/2, 1[$. Let $\tau : [0, 1] \to [0, 1]$ be the map which isometrically exchanges $[0, 1/2[$ and $[1/2, 1[$. Define $f$ by setting:

$$f_\omega = \begin{cases} g_1 & \text{if } \omega_0 = +1 \\ \tau \circ g_1 & \text{if } \omega_0 = -1. \end{cases}$$

Obviously $f$ satisfies (A0)–(A3) and $\mu = \mathbb{P} \times m$ is $F$-invariant and satisfies (1), (3) and (3'). It remains to show the mixing (2). Observe that $(\mu, F)$ is isomorphic to the direct product of $(g_0, m)$, the doubling map with Lebesgue measure, together with $(\mathbb{P} \times \nu, S)$, where $\nu$ is the normalized counting measure over $\{-1, +1\}$ and $S : \Omega \times \{-1, +1\} \to \Omega \times \{-1, +1\}$ is defined by $S(\omega, \epsilon) = (T\omega, \omega_0 \epsilon)$. Thus, to see that $(\mu, F)$ is mixing, it is enough to prove that each of these two factors is mixing. This is well-known for the doubling map. For $(\mathbb{P} \times \nu, S)$, one checks easily that $\mathbb{P} \times \nu(S^{-n} A \cap B) \to \mathbb{P} \times \nu(A) \cdot \mathbb{P} \times \nu(B)$ as $n \to \infty$, for $A$, $B$ of the form: a cylinder $\times \{\pm 1\}$. □
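The pathwise behaviour claimed in (3') of Example 3 can be illustrated numerically: the fibre correlation of $\varphi = 2 \cdot 1_{[0,1/2]}$ stays at $\pm 1$, with sign given by the product of the symbols $\omega_0 \cdots \omega_{n-1}$. This is a minimal sketch. It assumes the correlation is $\int \varphi \cdot \varphi \circ f^n_\omega\, dm - (\int \varphi\, dm)^2$ (the normalization used in Example 1), realizes $\tau$ as $x \mapsto x + 1/2 \bmod 1$, and uses the hypothetical names `tau`, `phi` and `correlation`.

```python
import math


def g1(x):
    """Two scaled copies of the doubling map, one on each half of [0, 1)."""
    return 0.5 * (math.floor(2.0 * x) + (4.0 * x) % 1.0)


def tau(x):
    """Isometric exchange of [0, 1/2) and [1/2, 1) (assumed: shift by 1/2)."""
    return (x + 0.5) % 1.0


def phi(x):
    """phi = 2 * indicator of [0, 1/2]."""
    return 2.0 if x <= 0.5 else 0.0


def correlation(signs, k=12):
    """Fibre correlation of phi with itself along the symbols `signs`.

    signs = (omega_0, ..., omega_{n-1}) with entries +1 or -1; the integral
    is approximated by an average over dyadic midpoints (keep n below k).
    """
    n_pts = 2 ** k
    total = 0.0
    for j in range(n_pts):
        x = (j + 0.5) / n_pts
        y = x
        for s in signs:
            y = g1(y)
            if s == -1:
                y = tau(y)
        total += phi(x) * phi(y)
    return total / n_pts - 1.0  # subtract (integral of phi)^2 = 1


if __name__ == "__main__":
    for signs in ([], [1, 1, 1], [-1, 1], [-1, 1, -1, 1, 1]):
        print(signs, correlation(signs))
```

Averaging the sign over $\omega$ gives zero for $n \ge 1$, which is consistent with global mixing coexisting with the failure of pathwise decay.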


Appendix B. Multi-Dimensional Example

The class of multi-dimensional β-transformations is not stable under iteration. Hence we are led to consider more generally the piecewise affine maps defined below. We shall prove that under conditions satisfied by expanding random β-transformations which are covering, random piecewise affine maps define good transfer operators (Proposition B.1 below). We remark that this approach could be extended to piecewise $C^{1+\alpha}$-smooth maps under appropriate conditions on the distortion not only of the maps but also of the hypersurfaces bounding the smooth pieces of the maps.

B.1. Piecewise affine maps. Recall that a polytope in $\mathbb{R}^d$, $d = 1, 2, \dots$, is a finite intersection of half-spaces. A piecewise affine map is a map $f : Y \to X$ determined by $(X, \mathcal{P}, f)$, where $\mathcal{P}$ is a finite collection of pairwise disjoint, bounded and open polytopes of $\mathbb{R}^d$ such that: $Y = \bigcup_{A \in \mathcal{P}} A$ is dense in $X$ and, for each $A \in \mathcal{P}$, $f : A \to f(A) \subset X$ is the restriction of an affine map $f_A : \mathbb{R}^d \to \mathbb{R}^d$. We additionally assume that each $f_A$ is invertible.

The following two quantities will be needed to control the action of our maps on densities. First, the minimal rate of expansion of $f$, defined as:

$$\delta(f) \stackrel{\mathrm{def}}{=} \inf\{\|f'_x(v)\| : x \in Y \text{ and } \|v\| = 1\}.$$

Second, the multiplicity of the boundary of $\mathcal{P}$, defined as follows. For a polytope $A$, let $\mathrm{mult}(\partial A, \epsilon, x)$ be the number of supporting hyperplanes of $A$ meeting $B_\epsilon(x)$, the ball of radius $\epsilon > 0$ centered at $x \in X$. Set:

$$\mathrm{mult}(\partial\mathcal{P}, \epsilon) \stackrel{\mathrm{def}}{=} \sup_{x \in X} \sum_{A \in \mathcal{P},\ \bar{A} \ni x} \mathrm{mult}(\partial A, \epsilon, x), \qquad \mathrm{mult}(\partial\mathcal{P}) \stackrel{\mathrm{def}}{=} \lim_{\epsilon \to 0} \mathrm{mult}(\partial\mathcal{P}, \epsilon).$$

Remark that there exists $\epsilon > 0$ such that $\mathrm{mult}(\partial\mathcal{P}, \epsilon) = \mathrm{mult}(\partial\mathcal{P})$. We denote the supremum of these $\epsilon$ by $\epsilon(\partial\mathcal{P})$. The goal of this appendix is the following:

Proposition B.1. Let $f$ be a random piecewise affine map. Write $(X, \mathcal{P}^n_\omega, f^n_\omega)$ for the obvious piecewise affine map. We assume:

(B0) for each $n = 1, 2, \dots$, $\omega \mapsto (\delta(f^n_\omega), \#\mathcal{P}^n_\omega, \mathrm{mult}(\partial\mathcal{P}^n_\omega), \epsilon(f^n_\omega))$ is measurable.

(B1) $\#\mathcal{P}/\delta$ (i.e., $\#\mathcal{P}_\omega/\delta(\omega)$) is $\log^+$-integrable w.r.t. $\mathbb{P}$.

(B2)
$$\lambda \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} \lim_{K \to \infty} \int \frac{1}{n} \log \min\left(\frac{\delta(f^n_\omega)}{\mathrm{mult}(\mathcal{P}^n_\omega)}, K\right) d\mathbb{P} > 0. \qquad \text{(B.1)}$$


(B3) for any ball $B \subset X$, for a.a. $\omega$, there exists $n_c = n_c(\omega, B) < \infty$ such that, for all $n \ge n_c$, $f^n(B) = X$ modulo Lebesgue measure.

Then $f$ defines a good random transfer operator.

This will prove the claim about β-transformations contained in Proposition 0.2 using the following:

Lemma B.2. Let $f_1, f_2, \dots, f_n$ be multi-dimensional β-transformations. Write $(X, \mathcal{P}^{(n)}, f^{(n)})$ for the piecewise affine map defined by $f^{(n)} = f_n \circ \cdots \circ f_1$. Then: $\mathrm{mult}(\partial\mathcal{P}^{(n)}) \le (n + 1) \cdot d$.

Proof. This is a corollary of the proof of Lemma 1 in [8]. □

Finally the claim in Remark 0.3 follows from:

Lemma B.3 ([8, Lemma 5]). Let $f : Y \to X$ be a β-transformation. If $B \subset [0, 1[^d$ is a ball with radius $r$ then $f(B)$ either is the whole $[0, 1[^d$ or it contains a ball of radius $\frac{\delta(f)}{1 + \sqrt{d}}\, r$.

We prove the proposition by checking the conditions (V), (RC) and (LY) which define “goodness”.

Condition (V). We shall work with the following notion of variation, due to G. Keller [12] and introduced for the study of multi-dimensional maps by M. Blank [4]. Fix a scale $\epsilon_0 > 0$ and define for $h \in L^\infty(\mathbb{R}^d)$:

$$\mathrm{var}_{\epsilon_0}(h) = \sup_{0 < \epsilon \le \epsilon_0} \frac{1}{\epsilon} \int_{\mathbb{R}^d} \underbrace{\operatorname{esssup}_{B_\epsilon(x)}(h) - \operatorname{essinf}_{B_\epsilon(x)}(h)}_{\mathrm{osc}(h, B_\epsilon(x))}\, dm(x) \in [0, \infty].$$

Following B. Saussol [23] we define $\mathrm{var}_{\epsilon_0}(h)$ for $h \in L^\infty(X)$ by setting $h|(\mathbb{R}^d \setminus X) = 0$. We shall fix $\epsilon_0 > 0$ below. We claim that $(X, m, \mathrm{var})$ satisfies the condition (V). The only points not completely obvious are the compactness property (V4) and the bound (V3): $\|h\|_\infty \le \|h\|_1 + C_{\mathrm{var}} \mathrm{var}(h)$. (V4) is proved in [12] and (V3) follows from [23, Prop. 2.2] using $\mathrm{diam}(X) < \infty$.

Condition (RC). By Lemma 5.3 of [23], if $h$ is in $C_a = \{h \in L^\infty(X, m) : h \ge 0 \text{ and } \mathrm{var}_{\epsilon_0}(h) \le a\|h\|_1\}$, then there exists a ball of radius $\epsilon_1 = \epsilon_1(\epsilon_0, a) > 0$ such that $h \ge \frac{1}{2}\|h\|_1$ on this ball. By compactness of $X$ one can find a finite collection of balls $B_1, \dots, B_r$ such that any $\epsilon_1$-ball contains one of the $B_i$’s. It is now enough to set $n_c(\omega, a) = \max_{i=1,\dots,r} n_c(\omega, B_i)$ and to remember that $\sup_x |\mathrm{jac}(f)| < \infty$.

Conditions (LY). First remark that (LY0) is clear. (LY1)–(LY2) require some work. (LY3) will follow from the others. The basic estimate for the action of our maps on this variation is the following statement of B. Saussol (which we have restricted to piecewise affine maps):


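Lemma B.4 (B. Saussol [23, Corollary 4.1]). Let $(X, \mathcal{P}, f)$ be an expanding piecewise affine map. Let $\epsilon_0 > 0$ be small enough w.r.t. $f$. Then there exist $C < \infty$ depending only on the dimension and $D = D(f, \epsilon_0) < \infty$ such that for all $h \in L^\infty(X)$:

$$\mathrm{var}_{\epsilon_0}(\mathcal{L}_f h) \le \left(\delta(f)^{-1} + C\, \frac{\mathrm{mult}(\partial\mathcal{P})}{\delta(f) - 1}\right) \mathrm{var}_{\epsilon_0}(h) + D\|h\|_1.$$

We shall need a little more, as some of our maps will be contracting or require too small an $\epsilon_0$, and we also need to control the coefficient $D$. A slight modification of the proof by B. Saussol [23] does the job, giving:

Corollary B.5. Let $(X, \mathcal{P}, f)$ be an arbitrary piecewise affine map. There exists $C < \infty$, depending only on the dimension, such that for every $\epsilon_0 > 0$, $h \in L^\infty(X)$:

$$\mathrm{var}_{\epsilon_0}(\mathcal{L}_f h) \le C \left(1 + \delta(f)^{-1}\right)^d \frac{\mathrm{mult}(\partial\mathcal{P}, \epsilon_0)}{\delta(f)}\, \mathrm{var}_{\epsilon_0}(h) + \frac{C\, \mathrm{mult}(\partial\mathcal{P}, \epsilon_0)}{\epsilon_0\, \delta(f)}\, \|h\|_1.$$

Proof. Starting from the proof in [23], it is enough to remark that for $s > 0$ (but possibly $s \le 1$):

$$\operatorname{esssup}_{B_s(y)} |h| \le \frac{1}{m(B_{\epsilon_0}(y))} \int_{B_{\epsilon_0}(y)} \big(|h(z)| + \mathrm{osc}(h, B_{\epsilon_0 + s}(z))\big)\, dm(z),$$

$$\int_{\mathbb{R}^d} \mathrm{osc}(h, B_{(1+s)\epsilon_0}(z))\, dm(z) \le \mathrm{const} \cdot (1 + s)^d\, \epsilon_0\, \mathrm{var}_{\epsilon_0}(h). \quad □$$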

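If we remark that $\mathrm{mult}(\partial\mathcal{P}, \epsilon_0) \le \#\mathcal{P}$ and that $1/\delta \le \#\mathcal{P}/\delta$, then (LY1) follows immediately from (B1) and Corollary B.5.

As $\#\mathcal{P}/\delta$ is $\log^+$-integrable, we can find $\nu > 0$ such that, for all $E \subset \Omega$ with $\mathbb{P}(E) \le \nu$, $\int_E \log^+ \#\mathcal{P}/\delta\, d\mathbb{P} < \lambda/(10d)$. Now fix $N$ so large that the following conditions are fulfilled:

(C1) $\mathbb{P}(\delta(f^N_\omega) < 1) \le \nu$.

(C2) There exists $1 < K_* < \infty$ such that:
$$\int \frac{1}{N} \log \min\left(\frac{\delta(f^N_\omega)}{\mathrm{mult}(\mathcal{P}^N_\omega)}, K_*\right) d\mathbb{P} \ge \frac{9}{10}\lambda.$$

(C3) $C 2^{2d} \le e^{\lambda N/10}$, where $C$ is defined in Corollary B.5.

Fix $1 < K_* < \infty$ as in (C2). Fix $\epsilon_0 > 0$ so small that, setting $\Omega_0 = \{\omega \in \Omega : \epsilon(\mathcal{P}^N_\omega) < \epsilon_0\}$, we have: $\mathbb{P}(\Omega_0) \le \min\left(\nu, \frac{\lambda N}{10 \log K_*}\right)$. Define:

$$\alpha^N(\omega) \stackrel{\mathrm{def}}{=} C\left(1 + \delta(f^N_\omega)^{-1}\right)^d \max\left(\frac{\mathrm{mult}(\partial\mathcal{P}^N_\omega, \epsilon_0)}{\delta(f^N_\omega)}, K_*^{-1}\right),$$

$$K^N(\omega) \stackrel{\mathrm{def}}{=} 1 + \frac{C\, \mathrm{mult}(\partial\mathcal{P}^N_\omega, \epsilon_0)}{\epsilon_0\, \delta(f^N_\omega)}.$$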

Corollary B.5 says exactly that the inequality in (LY2) is satisfied with these definitions and we must still check the integrability of log K N and of log α N and the negativity of the last integral.


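• First claim: $\log K^N$ is integrable. It is enough to remark that:

$$\frac{\mathrm{mult}(\partial\mathcal{P}^N_\omega, \epsilon_0)}{\delta(f^N_\omega)} \le \frac{\#\mathcal{P}^N_\omega}{\delta(f^N_\omega)} \le \frac{\#\mathcal{P}_\omega \cdots \#\mathcal{P}_{T^{N-1}\omega}}{\delta(\omega) \cdots \delta(T^{N-1}\omega)} \qquad \text{(B.2)}$$

and that $\log^+ \#\mathcal{P}/\delta$ is integrable by (B1).

• Second claim: $\log \alpha^N$ is integrable. Indeed, we have just seen that $\log K^N$ is integrable, but: $0 < C/K_* \le \alpha^N(\omega) \le (1 + \delta(f^N_\omega)^{-1})^d K^N(\omega)$ and:

$$\log^+(1/\delta(f^N_\omega)) \le \log^+\big(1/\delta(\omega) \cdots \delta(T^{N-1}\omega)\big) \le \log^+(1/\delta(\omega)) + \cdots + \log^+(1/\delta(T^{N-1}\omega)), \qquad \text{(B.3)}$$

which is integrable, again by (B1).

• Third claim: $\log \alpha^N$ has negative integral. We first bound the integral of the log of the first factor. Obviously, $(1 + \delta^{-1}) \le 2$ if $\delta \ge 1$ and $(1 + \delta^{-1}) \le 2\delta^{-1}$ otherwise. Also recall that, setting $\Omega_1 = \{\omega : \delta(f^N_\omega) < 1\}$, we have by (C1): $\mathbb{P}(\Omega_1) \le \nu$. Hence, using the definition of $\nu$, Eq. (B.3) and (C3):

$$\begin{aligned} \int \log C\left(1 + \delta(f^N_\omega)^{-1}\right)^d d\mathbb{P} &\le \log C \cdot 2^d + d \cdot \int_{\Omega_1} \log 2/\delta(f^N_\omega)\, d\mathbb{P} \\ &\le \log C \cdot 2^{2d} + d \sum_{k=0}^{N-1} \int_{T^k \Omega_1} \log^- \delta\, d\mathbb{P} \\ &\le \frac{2}{10}\lambda N. \end{aligned}$$

Thus:

$$\begin{aligned} \int \log \alpha^N\, d\mathbb{P} &\le \int \log C\left(1 + \delta(f^N_\omega)^{-1}\right)^d d\mathbb{P} - \int \log \min\left(\frac{\delta(f^N_\omega)}{\mathrm{mult}(\partial\mathcal{P}^N_\omega)}, K_*\right) d\mathbb{P} \\ &\quad + \int_{\Omega_0} \log \min\left(\frac{\delta(f^N_\omega)}{\mathrm{mult}(\partial\mathcal{P}^N_\omega)}, K_*\right) d\mathbb{P} - \int_{\Omega_0} \log \min\left(\frac{\delta(f^N_\omega)}{\#\mathcal{P}^N_\omega}, K_*\right) d\mathbb{P} \\ &\le \frac{2}{10}\lambda N - \frac{9}{10}\lambda N + \mathbb{P}(\Omega_0) \log K_* - \int_{\Omega_0} \log \min\left(\frac{\delta(f^N_\omega)}{\#\mathcal{P}^N_\omega}, K_*\right) d\mathbb{P} \\ &\le -\frac{6}{10}\lambda N - \int_{\Omega_0} \log \min\left(\frac{\delta(f^N_\omega)}{\#\mathcal{P}^N_\omega}, K_*\right) d\mathbb{P}, \end{aligned}$$

as $\mathbb{P}(\Omega_0) \le \min(\nu, \lambda N/(10 \log K_*))$.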


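Setting $\Omega_0' = \{\omega \in \Omega_0 : \delta(f^N_\omega)/\#\mathcal{P}^N_\omega \le K_*\}$, the last term above can be bounded as follows, using $K_* \ge 1$, (B.2) and the definition of $\nu$:

$$\begin{aligned} -\int_{\Omega_0} \log \min\left(\frac{\delta(f^N_\omega)}{\#\mathcal{P}^N_\omega}, K_*\right) d\mathbb{P} &= \int_{\Omega_0'} \log \frac{\#\mathcal{P}^N_\omega}{\delta(f^N_\omega)}\, d\mathbb{P} - \int_{\Omega_0 \setminus \Omega_0'} \log K_*\, d\mathbb{P} \\ &\le \int_{\Omega_0'} \log \frac{\#\mathcal{P}^N_\omega}{\delta(f^N_\omega)}\, d\mathbb{P} - 0 \\ &\le \int_{\Omega_0'} \log \frac{\#\mathcal{P}_\omega \cdots \#\mathcal{P}_{T^{N-1}\omega}}{\delta(\omega) \cdots \delta(T^{N-1}\omega)}\, d\mathbb{P} \\ &\le \sum_{k=0}^{N-1} \int_{T^k \Omega_0'} \log^+ \frac{\#\mathcal{P}}{\delta}\, d\mathbb{P} \le \frac{1}{10}\lambda N. \end{aligned}$$

Summing up, we get:

$$\int \log \alpha^N\, d\mathbb{P} \le -\frac{\lambda N}{2} < 0.$$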
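The bookkeeping behind the final constant can be made explicit. The display below is only a sketch that collects the four contributions already bounded in the third claim (the first factor, condition (C2), the choice of $\mathbb{P}(\Omega_0)$, and the last term); it introduces no new estimate.

```latex
\int \log \alpha^N \, d\mathbb{P}
  \;\le\;
  \underbrace{\tfrac{2}{10}\lambda N}_{\text{first factor}}
  \;-\; \underbrace{\tfrac{9}{10}\lambda N}_{\text{(C2)}}
  \;+\; \underbrace{\tfrac{1}{10}\lambda N}_{\mathbb{P}(\Omega_0)\log K_*}
  \;+\; \underbrace{\tfrac{1}{10}\lambda N}_{\text{last term}}
  \;=\; -\tfrac{1}{2}\lambda N .
```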

This proves (LY2) and concludes the proof of the proposition. □

Acknowledgements. This paper owes most to stimulating and enlightening discussions with Viviane Baladi. In particular, the idea of randomly grouping the transformations $f_\omega$ (called “coating”) is inspired by V. Baladi and M. Benedicks (see Remark 3.3). This work was partly done during a visit to the Section de Mathématiques de l’Université de Genève with the financial support of the Fonds National de la Recherche Scientifique (Switzerland). I am also indebted to Bernard Schmitt, under whose guidance I learnt about the Birkhoff cone technique. Finally, I thank Véronique Maume for interesting comments which prompted the construction of the example with slow decay of integrated correlations.

References

1. Baladi, V.: Correlation spectrum of quenched and annealed equilibrium states for random expanding maps. Commun. Math. Phys. 186, 671–700 (1997)
2. Baladi, V., Kondah, A., Schmitt, B.: Random correlations for small perturbations of expanding maps. Random Comput. Dynam. 4, 179–204 (1996)
3. Baladi, V., Young, L.-S.: On the spectra of randomly perturbed expanding maps. Commun. Math. Phys. 156, 355–385 (1993); (Erratum, 166, 219–220 (1994))
4. Blank, M.: Discreteness and continuity in dynamical systems. Providence, RI: Am. Math. Soc., 1997
5. Bogenschütz, T.: Stochastic stability of invariant subspaces. Preprint (1998)
6. Bogenschütz, T., Gundlach, V.M.: Ruelle’s transfer operator for random subshifts of finite type. Ergod. Th. & Dynam. Sys. 15, 413–417 (1995)
7. Bogenschütz, T., Kowalski, Z.: A condition for mixing of skew-products. Preprint (1997)
8. Buzzi, J.: Intrinsic Ergodicity of Affine Maps on $[0, 1]^d$. Monat. Math. 124, 97–118 (1997)
9. Buzzi, J.: Absolutely continuous S.R.B. for random Lasota–Yorke maps. Trans. Am. Math. Soc. (to appear)
10. Ferrero, P., Schmitt, B.: Produits aléatoires d’opérateurs matrices de transfert. Probab. Th. related fields 79, 227–248 (1988)
11. Hofbauer, F., Keller, G.: Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, 119–140 (1982)
12. Keller, G.: Generalized bounded variation and applications to piecewise monotonic transformations. Z. Wahr. Verw. Geb. 69, 461–478 (1985)
13. Khanin, K., Kifer, Y.: Thermodynamic formalism for random transformations and statistical mechanics. In: Sinai’s Moscow seminar on dynamical systems, A.M.S. Translation, Series 2 171, Providence, RI: Am. Math. Soc., 1996
14. Kifer, Yu.: Ergodic theory of random transformations. Boston: Birkhäuser, 1986
15. Kifer, Yu.: Random perturbations of dynamical systems. Boston: Birkhäuser, 1988
16. Kifer, Yu.: Limit theorems for random transformations and processes in random environments. Trans. Am. Math. Soc. 350, 1481–1518 (1998)
17. Kondah, A.: Les endomorphismes dilatants de l’intervalle et leurs perturbations aléatoires. Dijon: Thèse de l’Université de Bourgogne, 1991
18. Lasota, A., Yorke, J.A.: On the existence of invariant measures for piecewise monotonic transformations. Trans. Am. Math. Soc. 186, 481–488 (1973)
19. Liverani, C.: Decay of correlations for piecewise expanding interval maps. J. Stat. Phys. 78, 1111–1129 (1995)
20. Morita, T.: Random iteration of one-dimensional transformations. Osaka J. Math. 22, 489–518 (1985)
21. Pelikan, S.: Invariant densities for random maps of the interval. Trans. Am. Math. Soc. 281, 813–825 (1984)
22. Rychlik, M.: Bounded variation and invariant measures. Studia Math. 76, 69–80 (1983)
23. Saussol, B.: Absolutely continuous invariant measures for multi-dimensional expanding maps. Preprint (1998)
24. Viana, M.: Stochastic dynamics of deterministic dynamical systems. Brazilian Math. Colloquium, IMPA (1997)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 55 – 63 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On the Non-Integrability of the Stark–Zeeman Hamiltonian System

S. Ferrer, F. Mondéjar

Departamento de Matemática Aplicada, Universidad de Murcia, 30071 Espinardo, Spain. E-mail: [email protected]; [email protected]

Received: 30 March 1999 / Accepted: 16 May 1999

Abstract: In this paper we present a proof of the non-integrability in the Liouvillian sense of the Stark–Zeeman Hamiltonian. In particular, we generalize the result of Kummer and Saenz about the non-integrability of the pure Zeeman Hamiltonian. The proof we give is an application of the theorem of Morales and Ramis (1998) about non-integrability, based on differential Galois theory.

1. Introduction

One of the fundamental topics in contemporary physics is the problem of confronting classical and quantum mechanics in regimes where the classical motion of non-integrable dynamical systems is chaotic. In particular, much interest has been concentrated on the investigation of Rydberg electrons [1] in external electric and magnetic fields, where the interaction between classical dynamics and quantum mechanics has been studied. The current interest in these systems began with the first observation of the so-called quasi-Landau resonances in the photoabsorption spectra of barium atoms in a magnetic field in 1969 by Garton and Tomkins [2]. In this class of perturbed Coulomb systems, the one most thoroughly investigated is the quadratic Zeeman effect (see [3] and references therein). One of the variants of this problem is obtained by introducing an electric field parallel to the magnetic field: the Stark–Zeeman effect ([6,7]). In the case of the quadratic Zeeman effect there is numerical evidence for the occurrence of chaos [8]. This was taken as a first hint of non-integrability; the same happens in the Stark–Zeeman effect [9]. A rigorous mathematical study of the non-integrability of the system defined by the Zeeman effect is due to Kummer and Saenz [10], using Ziglin’s theorem [11,12]. A common limitation in applying Ziglin’s theorem to prove non-integrability is the restriction to Fuchsian variational equations (their singularities must be regular singular). This is not the case for the known particular solutions of the Stark–Zeeman effect, so we cannot apply Ziglin’s theorem.


Fortunately, Morales and Ramis have obtained new non-integrability results [13–15] avoiding the monodromy group and working directly with the differential Galois group. Basically, these results affirm that, in the integrable case, the identity component of the differential Galois group of the variational equation must be abelian. Moreover, this new theory can be applied in the case of existence of irregular singular points. Using these recent theorems we prove in this paper the non-integrability of the Stark–Zeeman effect by meromorphic functions in a sense to be specified later (Sect. 3).

The paper is organized as follows. Section 2 presents the main theorems of Morales and Ramis that we will apply. Section 3 establishes the main result of this paper. Finally, an Appendix gathers the essential part of Kovacic’s algorithm needed in the analysis of our problem.

2. Terminology and Basic Theorems

In this section we present a short description of the theorems of Morales–Ramis that we will use to prove the results of the paper. Let us consider a $2n$-dimensional complex analytic manifold $M$ and a holomorphic Hamiltonian system $X_H$ on $M$. Let $i(\Gamma)$ be a maximal integral curve of $X_H$ defined by $x = x(t)$, that is not an equilibrium point. Let $\Gamma$ be the connected Riemann surface corresponding to $i(\Gamma)$. We write the variational equation (VE) along $x = x(t)$ and consider the restriction of the variational equation to the normal bundle of $\Gamma$, the normal variational equation (NVE). We interpret the NVE as a holomorphic (resp. meromorphic) linear differential equation over $\Gamma$. In this situation it is proved in [13, Theorem 7]:

Theorem 2.1. Assume that there are $n$ first integrals of $X_H$ which are meromorphic, in involution and independent in a neighborhood $U$ of the curve $i(\Gamma)$ in $M$. Then the identity component of the Galois group of the NVE is an abelian subgroup of the symplectic group.

In some cases, if the problem has a finite set of equilibria that belong to the closure of $i(\Gamma)$ in $M$, we add this finite set of equilibria to the curve. We denote the new closed analytic curve in $M$, which contains $i(\Gamma)$, and, by abuse of notation, its corresponding connected Riemann surface by $\bar\Gamma$, so that $\Gamma \subset \bar\Gamma$. In other cases we add to $\Gamma$ (or $\bar\Gamma$) a finite set of points corresponding to points at infinity, as in our problem. In these cases we suppose that the manifold $M$ is contained in a connected manifold $M'$, such that $M_\infty = M' - M$ is an analytic hypersurface in $M'$, called the hypersurface at infinity, and that the holomorphic symplectic 2-form over $M$ extends to a meromorphic symplectic 2-form over $M'$ (see [13]). Then we obtain a closed analytic curve in $M'$ containing $i(\Gamma)$ and, with the same abuse of notation, its corresponding connected Riemann surface $\Gamma'$, with $\Gamma \subset \Gamma'$. The meromorphic connection over $\bar\Gamma$ then extends to a meromorphic connection over $\Gamma'$. Finally, we compute the differential Galois group $\bar G$ (resp. $G'$) of the NVE relative to the differential field of meromorphic functions over $\bar\Gamma$ (resp. $\Gamma'$). Let us remember that the above differential Galois group is isomorphic to a linear algebraic group over $\mathbb{C}$, and a linear algebraic group is a subgroup of $GL(m, \mathbb{C})$ whose matrix coefficients satisfy polynomial equations over $\mathbb{C}$ (see [18]). In this situation it is proved in [13, Theorem 9]:

Theorem 2.2. Assume that there is a finite set of equilibrium points and points at infinity. Assume that there are $n$ first integrals of $X_H$ which are meromorphic, in involution and independent in a neighborhood $U$ of the curve $\Gamma'$ in $M'$. Then the identity component of the Galois group $G'$ of the NVE over the differential field of meromorphic functions on $\Gamma'$ is an abelian subgroup of the symplectic group.

In general $G \subset \bar G \subset G'$ with strict inclusion. However, when the extended connection of the variational equation over $\bar\Gamma$ (resp. $\Gamma'$) is Fuchsian (i.e. the singular points are regular singular points) we have $G = \bar G$ (resp. $G = G'$). In the applications, as in the problem considered here, the original linear differential equation over a compact Riemann surface is replaced by a linear differential equation with rational coefficients over the Riemann sphere $\mathbb{P}^1$. In general, Morales and Ramis considered the effect of finite coverings, and Theorem 5 of [13] reads

Theorem 2.3. Let $X$ be a connected Riemann surface. Let $(X', f, X)$ be a finite ramified covering of $X$ by a connected Riemann surface $X'$. Let $\nabla$ be a meromorphic connection over $X$. We set $\nabla' = f^*\nabla$. Then, we have a natural injective homomorphism $\mathrm{Gal}(\nabla') \to \mathrm{Gal}(\nabla)$ of differential Galois groups which induces an isomorphism between their Lie algebras.

In terms of differential Galois groups this theorem means that the identity component of the differential Galois group is invariant by the covering.

3. Main Results

3.1. Hamiltonian, particular solutions and Riemann surfaces. We consider the problem of the dynamics of an electron of reduced mass $\mu$ in an atom with an infinitely massive nucleus under the effect of parallel magnetic and electric fields. We choose the $z$ axis as the direction of the fields and express the problem in a frame rotating around the $z$ axis with angular velocity $\omega_L = eB/2c\mu$, called the Larmor precession, where $B$ is the modulus of the magnetic field. Then, the Hamiltonian function takes the form

$$H = \frac{1}{2\mu}(X^2 + Y^2 + Z^2) - \frac{e^2}{\sqrt{x^2 + y^2 + z^2}} + |e| E z + \frac{1}{2}\mu\omega_L^2 (x^2 + y^2), \qquad (1)$$

where $e$ is the charge of the electron and $E$ is the modulus of the electric field (see [4]). Thus, taking convenient units, we will consider the biparametric dynamical system defined by the above Hamiltonian, written as

$$H = \frac{1}{2}(X^2 + Y^2 + Z^2) - \frac{1}{\sqrt{x^2 + y^2 + z^2}} + \alpha z + \frac{\beta}{8}(x^2 + y^2). \qquad (2)$$

When $\beta = 0$ we have the classical Stark effect, which is known to be integrable. To our knowledge, the analysis of the integrability of the differential system defined by (2) has not been done. Only the particular case $\alpha = 0$, which defines the Zeeman effect, has been proved by Kummer and Saenz to be non-integrable. Thus, we will consider the general case. Then, after an appropriate change of time (see [5]), the Hamiltonian function takes the form

$$H = \frac{1}{2}(X^2 + Y^2 + Z^2) - \frac{1}{\sqrt{x^2 + y^2 + z^2}} + F z + \frac{1}{8}(x^2 + y^2), \qquad (3)$$


where $F$ is a non-negative dimensionless parameter. The phase space is the six-dimensional real manifold

$$M = \{(U, V) \in \mathbb{R}^6 : U = (x, y, z),\ V = (X, Y, Z),\ x^2 + y^2 + z^2 > 0\}.$$

In order to apply Morales and Ramis (MR) theory we consider the Hamiltonian (3) as a holomorphic function on the six-dimensional complexification of the manifold,

$$\hat M = \{(U, V) \in \mathbb{C}^6 : U = (x, y, z),\ V = (X, Y, Z),\ x^2 + y^2 + z^2 \ne 0\},$$

equipped with the non-degenerate two-form $d\Theta$, where $\Theta$ is the canonical one-form $\Theta = V \cdot dU$. We regard $\hat M$ as an open subset of the six-dimensional complex connected manifold $\hat M' = \mathbb{P}^1(\mathbb{C})^6$. The holomorphic two-form $d\Theta$ extends uniquely to a meromorphic two-form over $\hat M'$ (see [13]). The Hamiltonian vector field $X_H$ associated to $H$ on $\hat M \subset \hat M'$ is

$$\begin{aligned} \dot x &= X, & \dot X &= -x\left(\frac{1}{r^3} + \frac{1}{4}\right), \\ \dot y &= Y, & \dot Y &= -y\left(\frac{1}{r^3} + \frac{1}{4}\right), \\ \dot z &= Z, & \dot Z &= -\frac{z}{r^3} - F, \end{aligned} \qquad (4)$$

where $r = \sqrt{x^2 + y^2 + z^2}$. This vector field is tangent to the submanifold $x = y = X = Y = 0$. We take $\tilde M = (0 \times 0 \times \mathbb{C} \times 0 \times 0 \times \mathbb{C}) \cap \hat M$ and define the symplectic form by $d\Theta|_{\tilde M} = dz \wedge dZ$. Then, the vector field (4) becomes the Hamiltonian vector field on $\tilde M$ associated to the Hamiltonian meromorphic function

$$\tilde H = \frac{1}{2} Z^2 + F z - \frac{1}{z}. \qquad (5)$$

For the non-equilibrium solutions needed in the MR theorems we use the curve $\varphi = \varphi(t) = (0, 0, \varphi_1(t), 0, 0, \varphi_2(t))$, where $(\varphi_1, \varphi_2)$ is a maximally continued integral curve of (5) in the zero energy level, a value that we have taken to simplify our computations; we denote by $i(\Gamma)$ the image of $\varphi$ in $\hat M'$. The vector field associated with the Hamiltonian $\tilde H$ has two equilibrium points in an energy level different from zero. Hence, there are no equilibrium points in the closure of $i(\Gamma)$. Thus, we take $\Gamma$ to be the abstract Riemann surface defined by $i(\Gamma)$. Because $\varphi_1(t)$ is an elliptic function, $\Gamma$ is a complex torus minus two points (the poles of the elliptic function). We consider now the curve in $\hat M'$ obtained from $i(\Gamma)$ by adding the two points at infinity that correspond to the poles of the parameterization of $i(\Gamma)$ by the elliptic function, and we denote by $\Gamma'$ both this curve and, by abuse of notation, the abstract Riemann surface it defines.

In the following paragraphs we will compute the NVE over the Riemann surfaces obtained above. Those computations will be valid for any value of $F$. Let $(x, y, z, X, Y, Z)$ be local canonical coordinates of $\hat M'$, and let us choose the holomorphic frame $R = \{e_x, e_y, e_z, e_X, e_Y, e_Z\}$ of $TM'$, where $e_x = \frac{\partial}{\partial x}$, etc. Then, the variational equation along $\Gamma'$ is the differential system

$$\frac{d\xi}{dt} = \tilde A(t)\, \xi, \qquad \tilde A(t) = J\, \mathrm{Hess}\, H(\varphi(t)),$$

where $J$ is the standard symplectic matrix. The normal variational equation along $\Gamma'$ is composed of two uncoupled equations

$$\ddot\xi_i - \left(\frac{1}{\varphi_i(t)^3} - \frac{1}{4}\right)\xi_i = 0, \qquad i = 1, 2. \qquad (6)$$

We denote by $G_i$ ($i = 1, 2$) the differential Galois group of each equation of (6) over the field of meromorphic functions over $\Gamma'$, and by $G$ the differential Galois group of the normal variational equation (6). Our objective is to prove that the identity component of the group $G$ is not abelian. Because each element in $G$ is of the form

$$\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix},$$

where $0$ is the $2 \times 2$ null matrix, $A \in G_1$ and $B \in G_2$, the identity component of $G$ is not abelian if the identity component of $G_1$ or $G_2$ is not abelian. Then, in what follows we will consider the normal variational equation

$$\ddot\xi - \left(\frac{1}{\varphi_1(t)^3} - \frac{1}{4}\right)\xi = 0$$

over $\Gamma'$, and we will compute the differential Galois group of this equation over the field of meromorphic functions over $\Gamma'$. We denote this group by $G_3$. First, we carry out the change of variables $t \leftrightarrow z$, $z = \varphi_1(t)$. Then, we obtain $\Gamma' \to \mathbb{P}^1$, and the algebraic expression of the normal variational equation (ANVE) on $\mathbb{P}^1$ reads

$$\ddot\eta - \frac{1 + F z^2}{2z(1 - F z^2)}\, \dot\eta - \frac{4 - z^3}{8z^2(1 - F z^2)}\, \eta = 0. \qquad (7)$$

We observe that the poles $z = 0$ and $z = \infty$ correspond to the two points at infinity of $\Gamma'$, and the poles $z = \pm\frac{1}{\sqrt{F}}$ are ramification points of the finite covering $\Gamma' \to \mathbb{P}^1$.
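As a consistency check, the passage from the normal variational equation to (7) can be reproduced with the chain rule. The sketch below assumes only the zero-energy relation $\tilde H = 0$ from (5) and the normal variational equation as stated above. It writes $\xi(t) = \eta(z)$ with $z = \varphi_1(t)$, and primes denote $d/dz$ (the text writes these derivatives with dots).

```latex
\begin{align*}
\tilde H = 0 \;&\Longrightarrow\;
  Z^2 = \frac{2(1 - F z^2)}{z}, \qquad
  Z\,\frac{dZ}{dz} = \frac{d}{dz}\Big(\tfrac12 Z^2\Big) = -\frac{1 + F z^2}{z^2}, \\
\ddot\xi \;&=\; Z^2\,\eta'' + Z\,\frac{dZ}{dz}\,\eta'
  \;=\; \frac{2(1 - F z^2)}{z}\,\eta'' - \frac{1 + F z^2}{z^2}\,\eta', \\
0 \;&=\; \ddot\xi - \Big(\frac{1}{z^3} - \frac14\Big)\xi
  \;=\; \frac{2(1 - F z^2)}{z}
  \Big[\eta'' - \frac{1 + F z^2}{2z(1 - F z^2)}\,\eta'
       - \frac{4 - z^3}{8 z^2 (1 - F z^2)}\,\eta\Big].
\end{align*}
```

The bracket in the last line is the left-hand side of (7), which also makes visible why $z = \pm 1/\sqrt{F}$, the zeros of $Z^2$, appear as poles of the coefficients.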

3.2. Application of Kovacic’s algorithm. We suppose first $F \ne 0$. Then, by a second change of variables $z \leftrightarrow u$, $u = \sqrt{F} z$ on $\mathbb{P}^1$, we obtain that Eq. (7) becomes

$$\ddot\eta - \frac{1 + u^2}{2u(1 - u^2)}\, \dot\eta - \frac{1 - \delta u^3}{2u^2(1 - u^2)}\, \eta = 0, \qquad (8)$$

where $\delta = \frac{1}{4F^{3/2}}$. Let us denote by $G_B$ the differential Galois group of Eq. (8) over the differential field of meromorphic functions on $\mathbb{P}^1$. By Theorem 2.3 we have that the identity components of $G_3$ and $G_B$ coincide. Then, we will compute $G_B$.


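Transforming the ANVE (8) to its normal invariant form is done by means of the usual change $\chi = \exp\left(\frac{1}{2}\int p\right)\eta$, where $p = -\frac{1 + u^2}{2u(1 - u^2)}$. We obtain

$$\ddot\chi = r\chi \qquad (9)$$

with

$$r = \frac{13/16}{u^2} + \frac{-\frac{5}{16} + \frac{\delta}{4}}{u - 1} + \frac{-3/16}{(u - 1)^2} + \frac{\frac{5}{16} + \frac{\delta}{4}}{u + 1} + \frac{-3/16}{(u + 1)^2}. \qquad (10)$$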

We note that the singular points $u = 0, \pm 1$ are regular and $u = \infty$ is an irregular singular point. The solvability of Eq. (9) is equivalent to the solvability of the ANVE. Then, we will determine the differential Galois group of Eq. (9) over the field of meromorphic functions over $\mathbb{P}^1$. We denote this group by $G_4$ (in general the groups $G_4$ and $G_B$ do not coincide). In order to obtain the group $G_4$, we apply the original Kovacic’s algorithm (see [17]). Because $G_4$ is an algebraic subgroup of $SL(2, \mathbb{C})$ the following proposition gives the possible cases for $G_4$ (see [17] or [18]).

Proposition 3.1. Let $V$ be an algebraic subgroup of $SL(2, \mathbb{C})$. Then, one of the following four cases may happen:
1. $V$ is triangulisable,
2. $V$ is conjugate to a subgroup of
$$D = \left\{ \begin{pmatrix} c & 0 \\ 0 & c^{-1} \end{pmatrix} : c \in \mathbb{C}^* \right\} \cup \left\{ \begin{pmatrix} 0 & c \\ -c^{-1} & 0 \end{pmatrix} : c \in \mathbb{C}^* \right\} \qquad (11)$$
and case (1) does not hold,
3. $V$ is finite and cases (1) and (2) do not hold,
4. $V = SL(2, \mathbb{C})$.
In the last case the identity component $V^\circ$ of $V$ coincides with the whole $V$.

For a general second order linear differential equation $\ddot y = ry$, with $r$ a meromorphic function over $\mathbb{P}^1$, necessary conditions for the above cases to hold are collected in the following proposition (see the first theorem in Sect. 2.1 of [17]).

Proposition 3.2. Necessary conditions for the cases of Proposition 3.1 to hold are:
1. Every pole of $r$ must have even order or else have order 1, and the order of $r$ at $\infty$ must be even or else be greater than 2, in order for case (1) to hold.
2. $r$ must have at least one pole that either has odd order greater than 2 or else has order 2, in order for case (2) to hold.
3. The order of a pole of $r$ cannot exceed 2 and the order of $r$ at $\infty$ must be at least 2. If the partial fraction expansion of $r$ is
$$r = \sum_i \frac{\alpha_i}{(x - c_i)^2} + \sum_j \frac{\beta_j}{x - d_j},$$
then $\sqrt{1 + 4\alpha_i} \in \mathbb{Q}$ for each $i$, $\sum_j \beta_j = 0$, and if $\gamma = \sum_i \alpha_i + \sum_j \beta_j d_j$, then $\sqrt{1 + 4\gamma} \in \mathbb{Q}$. This condition is necessary for case (3) to hold.


Then, applying Proposition 3.2 to Eq. (9), only cases (2) or (4) of Proposition 3.1 are possible. Thus, we only need to carry out the second step of Kovacic’s algorithm (see Appendix A). Working with it in our problem, let $\Upsilon = \{0, -1, 1, \infty\}$ be the set of poles of $r$. Then, for each $c \in \Upsilon$ we compute

$$E_0 = \left\{2 + k\sqrt{1 + 4 \cdot \tfrac{13}{16}} : k = 0, \pm 2\right\} \cap \mathbb{Z} = \{2\},$$

$$E_1 = E_{-1} = \left\{2 + k\sqrt{1 + 4 \cdot \tfrac{-3}{16}} : k = 0, \pm 2\right\} \cap \mathbb{Z} = \{1, 2, 3\},$$

$$E_\infty = \{1\}.$$

Now, for all families $(e_c)_{c \in \Upsilon}$, $e_c \in E_c$, not all members of the family even, the number $d = \frac{1}{2}\left(e_\infty - \sum_{c \in \Upsilon \setminus \{\infty\}} e_c\right)$ is not a non-negative integer. Then, from the second step of Kovacic’s algorithm we deduce that case (2) does not hold. Thus, case (4) holds and $G_4 = G_4^\circ = SL(2, \mathbb{C})$. As a final conclusion the group $G_4$ has a non-abelian identity component, and so the identity component of the group $G$ is not abelian.

It remains to consider the case $F = 0$. Then, Eq. (7) reads

$$\ddot\eta - \frac{1}{2z}\, \dot\eta + \frac{z^3 - 4}{8z^2}\, \eta = 0 \qquad (12)$$

and its normal invariant form is

$$\ddot\chi = r\chi, \quad \text{with } r = \frac{13}{16z^2} - \frac{z}{8}. \qquad (13)$$

We denote by $G_B$ and $G_5$ the differential Galois groups of Eqs. (12) and (13) respectively. Then, as in the previous case, the solvability of $G_B$ is equivalent to the solvability of $G_5$. Thus, we will determine $G_5$. Because the order at infinity of $r$ is $-1$, by Proposition 3.2 only cases (2) and (4) of Proposition 3.1 may happen. Then, as in the previous case we carry out the second step of Kovacic’s algorithm. Let $\Upsilon = \{0, \infty\}$ be the set of poles of $r$. Then, $E_0 = \{2\}$ and $E_\infty = \{-1\}$. Thus, for each choice of $e_c \in E_c$ with $c \in \Upsilon$, we have that $d = \frac{1}{2}\left(e_\infty - \sum_{c \in \Upsilon \setminus \{\infty\}} e_c\right)$ is not a non-negative integer. As a conclusion we have that the identity component is $G_5^\circ = SL(2, \mathbb{C})$. In other words, the identity component of the group $G$ is not abelian.

Summarizing the results obtained for $F \ge 0$, and using Theorem 2.2, we obtain:

Theorem 3.1. Let $U \subset \mathbb{P}^1(\mathbb{C})^6$ be an arbitrary open neighborhood of $\Gamma'$. Then the Stark–Zeeman Hamiltonian does not admit three independent meromorphic integrals in involution defined on $U$.

Then, in terms of the original Hamiltonian vector field on $M$, we have the following result:

Theorem 3.2. The Stark–Zeeman Hamiltonian does not admit three independent globally defined analytic integrals in involution which extend meromorphically to $\mathbb{P}^1(\mathbb{C})^6$.

As a consequence of the above theorems we have the following result:

Theorem 3.3. The Stark–Zeeman Hamiltonian system is not completely integrable by rational functions on $M$.


S. Ferrer, F. Mondéjar

Readers should note the possibility of the existence of three independent analytic first integrals in involution for the Stark–Zeeman system which can be extended meromorphically to M̂ but not meromorphically to P¹(C)⁶; this has already been noted by Morales and Ramis in [14]. Finally, we note that our work includes an alternate proof of the non-integrability of the Zeeman Hamiltonian obtained by Kummer and Saenz [10]. However, their non-integrability result differs from ours: they prove that the Zeeman Hamiltonian system reduced by the S¹ symmetry is not integrable by meromorphic functions defined on the reduced manifold.

Acknowledgements. The authors are very grateful to Prof. Morales for his help in clarifying some theoretical concepts applied in this paper. We also want to thank the anonymous referee for the improvements in the style of the text. This research is partially supported by the project DGICYT, PB95-0795 of the Ministerio de Educación y Cultura of Spain.

A. Second Step of Kovacic’s Algorithm

Let r ∈ C(x) be the rational function that defines the second order linear differential equation y″ = ry. Let ϒ be the set of the poles of r.

Step 1. For each c ∈ ϒ we define Ec as follows:
(a) If c is a pole of r of order 1, then Ec = {4}.
(b) If c is a pole of r of order 2 and if b is the coefficient of 1/(x − c)² in the partial fraction expansion of r, then
$$E_c = \big\{2 + k\sqrt{1 + 4b},\ k = 0, \pm 2\big\} \cap \mathbb{Z}. \tag{14}$$
(c) If c is a pole of r of order v > 2, then Ec = {v}.
(d) If r has order > 2 at ∞, then E∞ = {0, 2, 4}.
(e) If r has order 2 at ∞, and b is the coefficient of x⁻² in the Laurent series expansion of r at ∞, then
$$E_\infty = \big\{2 + k\sqrt{1 + 4b},\ k = 0, \pm 2\big\} \cap \mathbb{Z}. \tag{15}$$
(f) If the order of r at ∞ is v < 2, then E∞ = {v}.

Step 2. We consider all families (ec)c∈ϒ with ec ∈ Ec. Those families all of whose coordinates are even may be discarded. Let
$$d = \frac{1}{2}\Big(e_\infty - \sum_{c\in\Upsilon} e_c\Big).$$
If d is a non-negative integer, the family should be retained, otherwise the family is discarded. If no families remain under consideration, case (2) of Proposition 3.1 cannot hold.

Step 3. For each family retained from Step 2, we form the rational function
$$\theta = \frac{1}{2}\sum_{c\in\Upsilon} \frac{e_c}{x - c}.$$
Next we search for a monic polynomial P of degree d (as defined in (A)) such that
$$P''' + 3\theta P'' + (3\theta^2 + 3\theta' - 4r)P' + (\theta'' + 3\theta\theta' + \theta^3 - 4r\theta - 2r')P = 0.$$
If no such polynomial is found for any family retained from Step 2, then case (2) of Proposition 3.1 cannot hold.

Suppose that such a polynomial is found. Let $\phi = \theta + P'/P$ and let ω be a solution of the equation
$$\omega^2 + \phi\,\omega + \big(\tfrac{1}{2}\phi' + \tfrac{1}{2}\phi^2 - r\big) = 0.$$
Then $\eta = \exp\int\omega$ is a solution of y″ = ry.
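Steps 1(b) and 2 above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, and it assumes (as in Kovacic's original formulation) that the sum in d runs over the finite poles only, with e∞ treated separately. It reproduces the sets E0 = {2} and E1 = E−1 = {1, 2, 3} computed in the text and confirms that no family is retained in either case.

```python
from fractions import Fraction
from itertools import product
from math import isqrt


def rational_sqrt(q):
    """Return sqrt(q) as a Fraction if q is the square of a rational, else None."""
    if q < 0:
        return None
    root_num, root_den = isqrt(q.numerator), isqrt(q.denominator)
    if root_num ** 2 == q.numerator and root_den ** 2 == q.denominator:
        return Fraction(root_num, root_den)
    return None


def exponents_order_two(b):
    """Eq. (14)/(15): {2 + k*sqrt(1 + 4b) : k = 0, +2, -2} intersected with Z."""
    root = rational_sqrt(1 + 4 * Fraction(b))
    candidates = {Fraction(2)}  # k = 0 always contributes the value 2
    if root is not None:
        candidates |= {2 + 2 * root, 2 - 2 * root}
    return {int(e) for e in candidates if e.denominator == 1}


def retained_families(finite_exponents, infinity_exponents):
    """Step 2 (assumption: the sum in d runs over the finite poles only)."""
    poles = sorted(finite_exponents)
    kept = []
    for e_inf in sorted(infinity_exponents):
        for choice in product(*(sorted(finite_exponents[c]) for c in poles)):
            if e_inf % 2 == 0 and all(e % 2 == 0 for e in choice):
                continue  # all coordinates even: the family is discarded
            d = Fraction(e_inf - sum(choice), 2)
            if d >= 0 and d.denominator == 1:
                kept.append((dict(zip(poles, choice)), e_inf, int(d)))
    return kept


E0 = exponents_order_two(Fraction(13, 16))
E1 = exponents_order_two(Fraction(-3, 16))
print(E0, E1)
print(retained_families({0: E0, 1: E1, -1: E1}, {1}))  # poles 0, 1, -1 and E_inf = {1}
print(retained_families({0: {2}}, {-1}))               # the case F = 0
```

An empty list from `retained_families` corresponds to the conclusion that case (2) of Proposition 3.1 cannot hold.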

Non-Integrability of the Stark–Zeeman Hamiltonian System


References

1. Gallagher, T.F.: Rydberg Atoms. Cambridge: Cambridge University Press, 1994
2. Garton, W.R.S. and Tomkins, F.S.: Astrophys. J. 158, 839 (1969)
3. Friedrich, H. and Wintgen, D.: Phys. Rep. 183, 37 (1989)
4. Farrelly, D. et al.: Phys. Rev. A 45, 4738 (1992)
5. Gutzwiller, M.C.: Chaos in Classical and Quantum Mechanics. New York: Springer-Verlag, 1990
6. Braun, P.A.: Sov. Phys. JETP 70, 986–992 (1990)
7. Iken, M., Borondo, F., Benito, R.M., and Uzer, T.: Phys. Rev. A 49, 2734 (1994)
8. Robnik, M.: J. Phys. A, Math. Gen. 14, 3195 (1981)
9. Salas, J.P., Deprit, A., Ferrer, S., Lanchares, V., Palacián, J.: Phys. Lett. A 242, 83–93 (1998)
10. Kummer, M. and Saenz, A.W.: Commun. Math. Phys. 162, 447–465 (1994)
11. Ziglin, S.L.: Functional Anal. Appl. 16, 181–189 (1982)
12. Ziglin, S.L.: Functional Anal. Appl. 17, 6–17 (1983)
13. Morales, J.J. and Ramis, J.P.: Galoisian Obstructions to Integrability of Hamiltonian Systems I. Submitted for publication to J. Diff. Geom., 1998
14. Morales, J.J. and Ramis, J.P.: Galoisian Obstructions to Integrability of Hamiltonian Systems II. Submitted for publication to J. Diff. Geom., 1998
15. Morales, J.J. and Ramis, J.P.: A Note on the Non-Integrability of some Hamiltonian Systems with a Homogeneous Potential. Submitted for publication to J. Diff. Geom., 1998
16. Chern, S.S.: Complex Manifolds without Potential Theory. 2nd ed., New York: Springer-Verlag, 1979
17. Kovacic, J.J.: J. Symbolic Computation 2, 3–43 (1986)
18. Kaplansky, I.: An Introduction to Differential Algebra. Paris: Hermann, 1975

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 65 – 90 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Hyperbolic Billiards on Surfaces of Constant Curvature

Boris Gutkin1, Uzy Smilansky1, Eugene Gutkin2

1 Department of Physics of Complex Systems, The Weizmann Institute of Science, Rehovot 76100, Israel. E-mail: [email protected]
2 Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1113, USA. E-mail: [email protected]

Received: 26 January 1999 / Accepted: 17 May 1999

Abstract: We establish sufficient conditions for the hyperbolicity of the billiard dynamics on surfaces of constant curvature. This extends known results for planar billiards. Using these conditions, we construct large classes of billiard tables with positive Lyapunov exponents on the sphere and on the hyperbolic plane.

1. Introduction

From the point of view of differential dynamics, billiards are the geodesic flows on manifolds with a boundary. Since the early beginnings of the study of classical and quantum chaos, billiards have been used as a paradigm. Billiards are one of the best understood classes of dynamical systems that demonstrate a broad variety of behaviors: from integrable to chaotic. In fact, several key properties of chaotic dynamics were first observed and demonstrated for billiards. Many popular models of statistical mechanics, e.g., the Lorentz gas, the hard sphere (Boltzmann–Sinai) gas, etc., can be reduced to billiards in special domains.

Among chaotic dynamical systems, the billiards with nonvanishing Lyapunov exponents are of special interest. For brevity we will often call them hyperbolic billiards. The Pesin theory of smooth nonuniformly hyperbolic systems [Pe], extended by A. Katok and J.-M. Strelcyn to systems with singularities [KS], implies that hyperbolic billiards have strong mixing properties: an at most countable number of ergodic components, positive entropy, the Bernoulli property, etc.

In the present paper we consider billiards on surfaces of constant curvature. For simplicity of exposition, we restrict the details of our analysis to the simply connected surfaces of constant curvature: the plane, the sphere and the hyperbolic plane. Employing a uniform method, we establish widely applicable conditions, sufficient for positivity of the Lyapunov exponent. The study of billiards on curved surfaces is partially motivated by recent technical advances in semiconductor fabrication techniques. These advances make it possible to manufacture solid state (mesoscopic) devices where electrons are confined to a curved
surface (e.g. sphere) [FLBP]. Many properties of these devices can be theoretically derived, using billiards as simplified models.

The billiard dynamics crucially depends on the curvature of the surface. On the plane, billiard trajectories separate only linearly with time, so that the motion between collisions with the boundary is neutral. Exponential separation of billiard trajectories can occur only if the reflections from the boundary introduce sufficient instability. On the hyperbolic plane, geodesics diverge exponentially, so that the main role of the boundary is to confine the mass point to the billiard table. Thus, the boundary can be neutral (i.e., with zero curvature), and the “stretching and folding” necessary for chaotic dynamics will be provided by the metric. This phenomenon contrasts with the billiard dynamics on the sphere, where any two geodesics intersect twice, at focal points. Thus, the boundary reflections have to compensate for the focusing effect of the sphere, in order to produce chaotic dynamics.

Up to now, the study of billiards on surfaces (and hyperbolic billiard dynamics in particular) has been by and large restricted to the Euclidean plane. See, however, [Ve] for a study of integrable billiards on surfaces of constant nonzero curvature. See also [Ta] for some results on chaotic billiards on the hyperbolic plane, and [Vet1,Vet2,KSS] for some results on hyperbolic billiards on a general Riemannian surface. There are many results in the literature concerning hyperbolic dynamics for planar billiards [Si,Bu1–Bu4,Wo2,Ma,Do].

In the present work we generalize Wojtkowski’s criterion of hyperbolicity [Wo2] to billiards on arbitrary surfaces of constant curvature. We interpret Wojtkowski’s condition [Wo2] in terms of a special class of trajectories, which generalize two-periodic orbits. Let Q be a billiard table on a surface of constant curvature. The billiard map φ : V → V acts on the phase space V, which consists of pairs v = (m, θ).
Here m is the position of the ball on the boundary ∂Q of Q, and θ is the angle between the outgoing velocity and the tangent to ∂Q at m. The billiard map preserves a natural probability measure µ on V. We denote the images of v after n iterations by (m_{n+1}, θ_{n+1}) = φ^n(v). The trajectory φ^n(v) is a generalized two-periodic trajectory (g.t.p.t.) if the following conditions are satisfied:

1. The incidence angle and the curvature of the boundary κ_n at the bouncing points have period 2: θ_{2n} = θ_2, θ_{2n+1} = θ_1, κ_{2n} = κ_2, κ_{2n+1} = κ_1;
2. The geodesic distance between consecutive bouncing points is constant: s = |m_n m_{n+1}| (see Fig. 1a).
3. If θ_i = π/2, the g.t.p.t. is a two-periodic orbit, see Fig. 1b.

Along a g.t.p.t. the linearized map D_v φ is two-periodic, and the stability of a g.t.p.t. is determined by D_v φ². As we will see in Sect. 2, for each surface of constant curvature, the stability type of a g.t.p.t. is completely determined by the triple of parameters (d1, d2, s), where 2d1 (resp. 2d2) is the signed length of the chord generated by the intersection of the line m1 m2 with the osculating circle at m1 (resp. m2) (see Fig. 1a). We shall use the symbol T(d1, d2, s) for the g.t.p.t. with parameters (d1, d2, s).

[Fig. 1. a) A generalized two-periodic trajectory with bouncing points m1, …, m5 and chords 2d1, 2d2; b) a two-periodic orbit.]

We will now discuss g.t.p.t.s for planar billiards in some detail. Here s is the euclidean distance between consecutive bouncing points, and di = ri sin θi, i = 1, 2, where ri are the radii of curvature of the boundary ∂Q at the respective points. If the curvature of the boundary at the bouncing point is zero we take ri = −∞ as the radius of curvature and di = −∞ respectively. By an elementary computation, T(d1, d2, s) is unstable if and only if

$$s \in \begin{cases} [d_1, d_2] \cup [d_1 + d_2, \infty) & \text{if } d_1, d_2 \ge 0, \\ [0, \infty) & \text{if } d_1, d_2 \le 0, \\ [0, d_1 + d_2] \cup [d_1, \infty) & \text{if } d_1 \ge 0,\ d_2 \le 0. \end{cases} \tag{1.1}$$

Moreover, the trajectory is hyperbolic (i.e., strictly unstable) if s is in the interior of the corresponding interval, and the trajectory is parabolic if s is a boundary point (in the limiting case d1 = d2 = −∞ the trajectory is parabolic for any value of s).

We introduce the notions of B-unstable and S-unstable g.t.p.t.s. The g.t.p.t. T(d1, d2, s) is B-unstable if in Eq. (1.1) s belongs to a “big interval”:

$$s \in \begin{cases} [d_1 + d_2, \infty) & \text{if } d_1, d_2 \ge 0, \\ [0, \infty) & \text{if } d_1, d_2 \le 0, \\ [d_1, \infty) & \text{if } d_1 \ge 0,\ d_2 \le 0. \end{cases} \tag{1.2}$$

On the contrary, if s belongs to a “small interval”, then T(d1, d2, s) is S-unstable:

$$s \in \begin{cases} [d_1, d_2] & \text{if } d_1, d_2 \ge 0, \\ [0, d_1 + d_2] & \text{if } d_1 \ge 0,\ d_2 \le 0. \end{cases} \tag{1.3}$$

Note that a small interval shrinks to a point when |d1| = |d2|.

We will outline a simple connection between the present approach and Wojtkowski’s method (for planar billiards). With any point v = (m1, θ1) ∈ V of the phase space we associate a formal g.t.p.t. T(v). Let φ(v) = (m2, θ2). We set d1 = d(v), d2 = d(φ(v)) and s = |m1 m2|. The formal g.t.p.t. T(v) can be realized as an actual g.t.p.t. T(d1, d2, s) in an auxiliary billiard table Qv, constructed from the boundary ∂Q around mi, as shown in Fig. 2.

[Fig. 2. The billiard table Q and the auxiliary table Qv realizing the g.t.p.t. T(v); labels show v, φ(v), θ1, θ2, s, 2d1, 2d2 and the points m1, …, m5.]

Definition 1. Let the notation be as above. A point v ∈ V of the billiard phase space is
(a) B-hyperbolic (or strictly B-unstable) if the g.t.p.t. T(v) is strictly B-unstable;
(b) B-parabolic if T(v) is B-unstable and parabolic (i.e., s belongs to the boundary of the appropriate big interval in Eq. (1.2));
(c) B-unstable if T(v) is B-unstable (i.e., B-parabolic or B-hyperbolic);
(d) eventually strictly B-unstable if for some n ≥ 0 the point φ^n(v) is strictly B-unstable, while φ^i(v) are B-unstable for 0 ≤ i < n.

In our interpretation, Wojtkowski’s hyperbolicity criterion [Wo2] is the condition that µ-almost all points of the billiard phase space are eventually strictly B-unstable. The concept of g.t.p.t.s and the associated structures make sense for billiards on any surface. In the body of the paper we will generalize the notions of the B-unstable and S-unstable g.t.p.t.s to arbitrary surfaces of constant curvature, thus extending Definition 1 to billiards on all of these surfaces. Now we formulate the main result of this work.

Theorem 1 (Main Theorem). Let Q be a billiard table on a surface of constant curvature, and let φ : V → V be the billiard map. Let µ be the canonical invariant measure on V. If µ-almost every point of V is eventually strictly B-unstable, then the billiard in Q is hyperbolic.

Later on in the paper we will derive geometric conditions on the billiard table that ensure that the g.t.p.t.s are B-unstable. With these conditions, which depend on the curvature of the surface, Theorem 1 will become a geometric criterion for hyperbolicity of the billiard dynamics on surfaces of constant curvature. In particular, for planar billiards Theorem 1 yields Wojtkowski’s criterion [Wo2].

Let λ(v) ≥ 0 be the Lyapunov exponent of the billiard in the table Q, which is defined for µ-almost all v ∈ V. Recall that in our terminology the billiard in Q is hyperbolic if λ(v) is positive µ-almost everywhere. We denote by h(Q) the metric entropy (with respect to µ) of the billiard in Q. Following Wojtkowski’s approach [Wo2], we will estimate from below the metric entropy of billiards satisfying the conditions of Theorem 1. Let φv be the map corresponding to the g.t.p.t. T(v), and let

$$\bar{\lambda}(v) = \lim_{n\to\pm\infty} \frac{1}{n} \log \|D\phi_v^n\| \ge 0$$

be its Lyapunov exponent.

Theorem 2. Let Q be a billiard table satisfying the assumptions of the main theorem, and let the notation be as above. Then

$$h(Q) \ge \int_V \bar{\lambda}(v)\, d\mu. \tag{1.4}$$


To explain the mysterious appearance of g.t.p.t.s, which bear the crux of our approach to hyperbolicity in billiard dynamics, we will outline a connection between them and the method of invariant cone fields of Wojtkowski [Wo1,Wo2]. Let σ : V → V be the time-reversal involution: σ(m, θ) = (m, π − θ) and let W = {W(v) : v ∈ V} be an invariant cone field defined in terms of a projective coordinate (each W(v) is an interval in R ∪ ∞). We say that W is symmetric, if W(v) = W(σ(v)) for each v ∈ V. The invariant cone fields defined in [Wo2] are symmetric. It can be shown that the existence of a symmetric invariant cone field in V implies the instability of µ-almost all g.t.p.t.s T(v), v ∈ V. In the proof of Theorem 1 we will show that for our class of billiards the (quasi)converse holds. More precisely, if µ-almost all g.t.p.t.s T(v), v ∈ V, are B-unstable, then V has a symmetric invariant cone field. If, besides, µ-almost all g.t.p.t.s are eventually strictly B-unstable, then such a cone field is eventually strictly invariant and the billiard dynamics is hyperbolic.

The plan of the paper is as follows. In Sect. 2 we provide the necessary preliminaries and study the geometric optics (i.e., the propagation and reflection of infinitesimal light beams) on surfaces of constant curvature. In Sect. 3 we apply these results to obtain explicit analogs of Eqs. (1.1–1.3). We derive linear instability conditions for g.t.p.t.s and show that they distinguish between B-unstable and S-unstable trajectories in a natural way. In Sect. 4, using invariant cone fields à la Wojtkowski, we prove the main theorem. We define our cone fields for billiards on all surfaces of constant curvature. Employing geometric optics, we show that under the assumptions of the main theorem these cone fields are invariant, and eventually strictly invariant. Also in Sect. 4 we prove Theorem 2. In Sect. 5 we derive hyperbolicity criteria for elementary billiard tables (the boundary consists of circular arcs). Then we apply the main theorem and its corollaries to construct several classes of billiard tables with hyperbolic dynamics on the sphere and on the hyperbolic plane. Finally, we formulate general principles for the design of billiard tables satisfying the conditions of Theorem 1. In particular, we obtain the counterparts of Wojtkowski’s geometric inequality [Wo2] for surfaces of constant nonzero curvature. The calculations are involved, and we relegate them to the Appendix.

In a forthcoming publication [Gb] we will apply the methods developed here to investigate the dynamics of billiards in constant magnetic fields on arbitrary surfaces of constant curvature.

The results of Wojtkowski [Wo2] have been strengthened (for planar billiards) in [Bu3,Bu4], and [Do]. It turns out that the criteria of [Bu3,Bu4], and [Do] can be obtained using certain invariant cone fields, which are, in general, not symmetric. This suggests that our hyperbolicity criterion for billiards on surfaces of constant curvature can be considerably strengthened, by employing other invariant cone fields. In particular, we believe that the results of Bunimovich [Bu3,Bu4] and Donnay [Do] can be extended to billiards on surfaces of constant curvature.

2. Geometric Optics and Billiards on Surfaces of Constant Curvature

Let M be a simply connected surface of constant curvature, and let Q be a connected domain in M, with a piecewise smooth boundary ∂Q. For concreteness, we will assume that the curvature is either zero (M = R²), or one (M = S²), or minus one (M = H²). In what follows, ∂Q is endowed with the positive orientation. The billiard in Q is the dynamical system arising from the geodesic motion of a point mass inside Q, with specular reflections at the boundary. The standard cross-section, V ⊂ TQ, of the billiard flow consists of unit tangent vectors, with origin points on ∂Q,

pointing inside Q. The first return associated with this cross-section is the billiard map, φ : V → V. We will use the standard coordinates (l, θ) on V, where l is the arclength parameter on ∂Q and 0 ≤ θ ≤ π is the angle between the vector and ∂Q. We call V the phase space of the billiard map, associated with the billiard table Q. The invariant measure µ = (2|∂Q|)⁻¹ sin θ dl dθ is a probability measure, µ(V) = 1.

We will study the natural action of the differential of φ on the projectivization B of the tangent manifold of V. Abstractly, B consists of straight lines (as opposed to vectors) in the tangent planes to points of V. We will describe this space using the language of geometric optics. An oriented curve γ ⊂ M, of class C², defines a “light beam”, i.e., the family of geodesic rays orthogonal to γ. The geodesics which intersect γ infinitesimally close to a point, m ∈ γ, form an “infinitesimal beam”, which is completely determined by the normal unit vector v ∈ T_m M to γ, and by the geodesic curvature χ of γ at m. We denote the infinitesimal beam by b(v, χ). Our convention for the sign of the curvature is opposite to the one used in [Si,Bu1–Bu4].

Infinitesimal beams yield a geometric representation of the projectivized tangent manifold to the unit tangent bundle of M. In particular, they give us a geometric realization of the space B. We will describe the differential of the billiard map in this realization. Let p : B → V be the natural projection. Since dim V = 2, each fiber p⁻¹(v) ≡ B_v ⊂ B is abstractly isomorphic to the projective line, and we take χ ∈ R ∪ ∞ as the projective coordinate on B_v (this representation of B was discussed for the planar case by, e.g., [Wo2]). In this coordinatization, B_v = {b(v, χ) : χ ∈ R ∪ ∞}.

Let X ⊂ TM be the set of unit tangent vectors with origin points in ∂Q, and let Y = {b(v, χ) : v ∈ X, χ ∈ R ∪ ∞} be the set of corresponding infinitesimal beams. Let ρ_m : T_m M → T_m M be the linear reflection about the tangent line to ∂Q. As m runs through ∂Q, the reflections ρ_m yield a selfmapping ρ : X → X whose differential acts on Y. Let Φ^s denote the geodesic flow of M. Let G(v) be the oriented geodesic defined by a unit tangent vector. For v ∈ V let s(v) be the distance along G(v) between the origin point of v, and the next intersection point of G(v) with ∂Q. Then Φ^{s(v)}(v) ∈ X, and ρ ◦ Φ^{s(v)}(v) ∈ V. Let Φ : V → X be the mapping v ↦ Φ^{s(v)}(v). We will use the same letters, φ, ρ, and Φ, for the (projectivized) differentials of these mappings. Since the billiard map is the composition

$$\phi = \rho \circ \Phi, \tag{2.1}$$

it remains to compute the action of Φ and ρ on infinitesimal beams. Let b(v−, χ−) ∈ Y be an infinitesimal beam, and let m ∈ ∂Q be the origin point of v−. Set ρ · b(v−, χ−) = b(v+, χ+). Let κ be the curvature of ∂Q at m, and let θ be the angle between v− and the positive tangent vector to ∂Q at m. Then v+ = ρ_m(v−), and

$$\chi_+ = \chi_- + \frac{2\kappa}{\sin\theta}. \tag{2.2}$$

This formula is well known when M = R² [Si,Bu1], and extends to all surfaces of constant curvature. Let now b = b(v, χ) be an arbitrary infinitesimal beam, and set b′ = Φ^s · b = b(v′, χ′), where v′ = Φ^s(v). We will express χ′ separately for each surface.

a) Flat case (M = R²). By elementary euclidean geometry, we have

$$\chi' = \frac{\chi}{1 - s\chi} = -s^{-1} + \frac{s^{-2}}{s^{-1} - \chi}. \tag{2.3}$$

[Fig. 3. The curve C(l) and the chord of length 2d from l to l′, shown in four panels: R²; S²; H² with |κ| > 1; H² with |κ| < 1 (chords 2d^A and 2d^B).]

b) Curvature one case (M = S²). By elementary spherical geometry:

$$\chi' = -\cot s + \frac{\sin^{-2} s}{\cot s - \chi}. \tag{2.4}$$

c) Curvature minus one case (M = H²). The considerations depend on whether |χ| is greater or less than one. However, the final expression is the same (we omit the details):

$$\chi' = -\coth s + \frac{\sinh^{-2} s}{\coth s - \chi}. \tag{2.5}$$

Note that in the limit s → 0 Eqs. (2.3–2.5) coincide. For v ∈ V set D(v) = sin θ/κ, so that Eq. (2.2) becomes

$$-\chi_- + \chi_+ = \frac{2}{D(v)}. \tag{2.6}$$

Using classical formulas for surfaces of constant curvature ([Vi], compare also Eq. (2.8) below with [Ta], for a different but related context), we will give a geometric interpretation of the function D(·). Let v ∈ V, and let m = m(l) ∈ ∂Q be the origin point of v. Let C(l) ⊂ M be the osculating circle (hypercycle if M = H² and |κ(l)| < 1) of ∂Q. The geodesic, G(v), corresponding to v intersects C(l) at m and another point, m′ = m(l′). Let d̃(v) be one half of the signed distance between m and m′, along G(v). If |κ(l)| < 1, the hypercycle C(l) consists of two components, see Fig. 3. Then there are two possibilities: the points l and l′ belong to the same component (resp. different components) of C(l), Fig. 3. The former (resp. the latter) case occurs if |D(v)| ≤ 1 (resp. |D(v)| > 1).

Remark. When κ(l) = 0 (D(v) = ∞) and M = R², S² there is ambiguity in the above definition of d̃(v). In this case there are two different values d̃(v) = ±d̃0 (d̃0 = +∞ for M = R² and d̃0 = π/2 for M = S²) satisfying the above definition (if M = H², then d̃0 = 0 and the two values coincide). In what follows we always choose in such a case the negative value −d̃0 as the definition of d̃(v), i.e., we consider the case of a zero curvature boundary as a limiting case of a negative curvature boundary. Thus d̃(v) ∈ [−∞, ∞) if M = R² and d̃(v) ∈ [−π/2, π/2) if M = S².
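The reflection law (2.2) and the propagation formulas (2.3–2.5) can be sketched numerically as follows. This is a minimal illustration only; the function names and the string labels "R2", "S2", "H2" are hypothetical choices, not notation from the paper. The sanity check uses the fact that a beam focusing at signed distance f has curvature 1/f, cot f or coth f on the respective surface, so that after a free flight of length s the focusing distance becomes f − s.

```python
import math


def propagate(surface, chi, s):
    """Curvature chi' of an infinitesimal beam after a free flight of length s."""
    if surface == "R2":  # Eq. (2.3)
        return chi / (1.0 - s * chi)
    if surface == "S2":  # Eq. (2.4)
        cot = math.cos(s) / math.sin(s)
        return -cot + 1.0 / (math.sin(s) ** 2 * (cot - chi))
    if surface == "H2":  # Eq. (2.5)
        coth = math.cosh(s) / math.sinh(s)
        return -coth + 1.0 / (math.sinh(s) ** 2 * (coth - chi))
    raise ValueError(f"unknown surface: {surface}")


def reflect(chi_minus, kappa, theta):
    """Eq. (2.2): curvature of the beam after reflection from the boundary."""
    return chi_minus + 2.0 * kappa / math.sin(theta)


# A beam focusing at distance f = 1.0 should focus at distance 0.7 after s = 0.3.
f, s = 1.0, 0.3
print(propagate("R2", 1.0 / f, s), 1.0 / (f - s))
print(propagate("S2", 1.0 / math.tan(f), s), 1.0 / math.tan(f - s))
print(propagate("H2", 1.0 / math.tanh(f), s), 1.0 / math.tanh(f - s))
```

For very small s the three branches of `propagate` return nearly the same value, in line with the remark that Eqs. (2.3–2.5) coincide in the limit s → 0.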

Set

$$d(v) = \begin{cases} \tilde{d}(v) & \text{if } M = \mathbb{R}^2 \text{ or } M = S^2, \\ \tilde{d}(v) & \text{if } M = H^2 \text{ and } |\kappa(l)| \ge 1, \\ \tilde{d}(v) & \text{if } M = H^2,\ |\kappa(l)| < 1,\ |D(v)| \le 1, \\ \tilde{d}(v) + i\pi/2 & \text{if } M = H^2,\ |\kappa(l)| < 1,\ |D(v)| > 1. \end{cases} \tag{2.7}$$

Then we have

$$D(v) = \begin{cases} d(v) & \text{if } M = \mathbb{R}^2, \\ \tan d(v) & \text{if } M = S^2, \\ \tanh d(v) & \text{if } M = H^2. \end{cases} \tag{2.8}$$

For the case M = H² we will use the following classification of points of the phase space V. We say that v ∈ V is of type A (resp. B) if |D(v)| ≤ 1 (resp. |D(v)| > 1). Let V^A, V^B be the corresponding subsets of V. Then V = V^A ∪ V^B is a partition. We will use the notation:

$$\tilde{d}(v) = \begin{cases} d^A(v) \in [-\infty, \infty] & \text{if } v \in V^A, \\ d^B(v) \in (-\infty, \infty) & \text{if } v \in V^B. \end{cases} \tag{2.9}$$

3. Generalized Two-Periodic Trajectories (g.t.p.t.s)

Consider the billiard dynamics in an arbitrary table on a surface of constant curvature. Eqs. (2.2) and (2.3–2.5) describe the action of the billiard map on infinitesimal beams. Starting with an arbitrary b(v, χ) and iterating the equations, we obtain for χ after an infinite number of reflections a formal continued fraction

$$c \equiv \chi_\infty = a_0 + \cfrac{b_0}{a_1 + \cfrac{b_1}{a_2 + \cdots}} \tag{3.1}$$

whose coefficients are determined by di = d(φ^{i−1} · v), and by the lengths si of consecutive billiard segments, where i = 1, 2, . . . . The idea to associate a continued fraction (3.1) to a billiard orbit was introduced by Y. Sinai in the seminal paper [Si], where he considered billiards in R². Equation (3.1) is a direct extension of Sinai’s idea to an arbitrary surface of constant curvature.

Let Q be a billiard table, and let v ∈ V be an arbitrary point in the phase space of the billiard map. Set v1 = v, v2 = φ(v), di = d(vi), i = 1, 2, and let s = s(v) be the distance between the origin points of v1 and v2, respectively (Fig. 2). Let T(v) = T(d1, d2, s) be the associated g.t.p.t. (see Sect. 1). The g.t.p.t. T(v) can be realized as a trajectory in an artificial billiard table Qv whose exact shape is not important (see Fig. 2). We denote by φv the associated billiard map. Let c(v) be the formal continued fraction Eq. (3.1), corresponding to T(v). Note that c(v) is periodic. Proposition 1 below relates the convergence of c(v) with the stability type of T(v).
Recall that the standard definitions of elliptic, hyperbolic, and parabolic periodic points can be expressed in terms of the appropriate power of the differential of the transformation, i. e., a particular matrix associated with the periodic orbit, see, e. g., [KH]. Hence, these definitions straightforwardly extend to generalized periodic orbits, and we leave the details to the reader. In what follows we will talk about elliptic, parabolic, or hyperbolic g.t.p.t.s. We say that a g.t.p.t. is (exponentially) unstable if it is either hyperbolic or parabolic (resp. hyperbolic).


Proposition 1. Let v ∈ V be arbitrary, and let the notation be as above. The g.t.p.t. T(v) is (exponentially) unstable if and only if the continued fraction c(v) converges (exponentially fast).

We outline a proof of Proposition 1, referring to [Wa] for the standard material on continued fractions. With a periodic continued fraction one associates a fractional linear transformation, or, equivalently, a 2 × 2 matrix, defined up to a scalar factor. For c(v) this matrix essentially coincides with the linear transformation associated with the g.t.p.t. T(v). The claim now follows from the standard facts [Wa] (we leave details to the reader). Note that Proposition 1 (and its proof) straightforwardly extends to generalized periodic trajectories of any period.

Remark. Another approach to the stability of T(v) is to consider the linearization $D\phi_v^2$. Then T(v) is hyperbolic if $|\mathrm{tr}(D\phi_v^2)| > 2$, parabolic if $|\mathrm{tr}(D\phi_v^2)| = 2$, and elliptic if $|\mathrm{tr}(D\phi_v^2)| < 2$.

Lemma 1. Let v ∈ V, and let d1, d2, s be the associated data. Then the coefficients ai, bi, i ≥ 1 of the continued fraction c(v) are given by the following formulas:
a) M = R². We have $a_{2n+1} = -2s^{-1} + 2d_1^{-1}$, $a_{2n} = -2s^{-1} + 2d_2^{-1}$, $b_n = -s^{-2}$;
b) M = S². Then $a_{2n+1} = -2\cot s + 2\cot d_1$, $a_{2n} = -2\cot s + 2\cot d_2$, $b_n = -\sin^{-2} s$;
c) M = H². Here we have $a_{2n+1} = -2\coth s + 2\coth d_1$, $a_{2n} = -2\coth s + 2\coth d_2$, $b_n = -\sinh^{-2} s$.

Proof. The formulas are obtained by direct computations from Eqs. (2.2–2.6). □

Since the g.t.p.t. T(v) and the continued fraction c(v) are essentially determined by the triple (d1, d2, s) corresponding to v, we will use the notation T(d1, d2, s) and c(d1, d2, s) in what follows. The formulas of Lemma 1 make it possible to compute the 2 × 2 matrix associated with c(d1, d2, s). Analyzing this matrix for each of the three surfaces, we obtain simple criteria for the convergence of c(d1, d2, s).

Proposition 2. The continued fraction c(d1, d2, s) converges if and only if the following inequalities are satisfied:
a) If M = R²:
$$(s - d_1)(s - d_2)(s - d_1 - d_2)\,s \ge 0. \tag{3.2}$$
b) If M = S²:
$$\sin(s - d_1)\sin(s - d_2)\sin(s - d_1 - d_2)\sin s \ge 0. \tag{3.3}$$
c) If M = H²:
$$\sinh(s - d_1)\sinh(s - d_2)\sinh(s - d_1 - d_2)\sinh s \ge 0. \tag{3.4}$$
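As a rough numerical illustration of Lemma 1 and Proposition 2 in the planar case, the periodic continued fraction can be truncated and evaluated from the innermost term outwards. The sketch below is an assumption-laden toy, not the authors' computation: the function names are hypothetical, and the evaluation starts at a_1 (the periodic part covered by Lemma 1), ignoring the initial coefficient a_0. For d1 = d2 = 1 and s = 3, inequality (3.2) holds strictly and the truncations settle quickly on a root of y² − ay − b = 0 with a = 4/3, b = −1/9.

```python
import math


def lemma1_coefficients(d1, d2, s):
    """Planar case of Lemma 1: returns (a_odd, a_even, b)."""
    return (-2.0 / s + 2.0 / d1, -2.0 / s + 2.0 / d2, -1.0 / s ** 2)


def truncated_fraction(d1, d2, s, depth):
    """Evaluate a_1 + b/(a_2 + b/(... + b/a_depth)) from the innermost term outwards."""
    a_odd, a_even, b = lemma1_coefficients(d1, d2, s)

    def coefficient(i):
        return a_odd if i % 2 == 1 else a_even

    value = coefficient(depth)
    for i in range(depth - 1, 0, -1):
        value = coefficient(i) + b / value
    return value


def satisfies_3_2(d1, d2, s):
    """Left-hand side of inequality (3.2) is non-negative."""
    return (s - d1) * (s - d2) * (s - d1 - d2) * s >= 0


limit = (2.0 + math.sqrt(3.0)) / 3.0  # root of y^2 - (4/3) y + 1/9 = 0
for depth in (2, 5, 10, 40):
    print(depth, truncated_fraction(1.0, 1.0, 3.0, depth), limit)
print(satisfies_3_2(1.0, 1.0, 3.0), satisfies_3_2(1.0, 1.0, 0.8))
```

When (3.2) fails, as for d1 = d2 = 1 and s = 0.8, successive truncations of this kind keep wandering instead of settling, which is the elliptic case.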

Taking into consideration that s ≥ 0 for R2 and H2 , and that 0 ≤ s ≤ 2π for S2 , we reformulate Proposition 2 in a more explicit form.

a) Let M = R². Then T(d1, d2, s) is unstable if and only if

$$s \in \begin{cases} [d_1, d_2] \cup [d_1 + d_2, \infty) & \text{if } d_1, d_2 \ge 0, \\ [0, \infty) & \text{if } d_1, d_2 \le 0, \\ [0, d_1 + d_2] \cup [d_1, \infty) & \text{if } d_1 \ge 0,\ d_2 \le 0. \end{cases} \tag{3.5}$$

b) Let M = S². Set

$$s \bmod \pi = \begin{cases} s & \text{if } s \le \pi, \\ s - \pi & \text{if } s > \pi. \end{cases}$$

Then T(d1, d2, s) is unstable if and only if

$$s \bmod \pi \in \begin{cases} [d_1 + d_2, \pi] \cup [d_1, d_2] & \text{if } d_1, d_2 \ge 0, \\ [0, d_1 + d_2 + \pi] \cup [\pi - d_1, \pi - d_2] & \text{if } d_1, d_2 \le 0, \\ [d_2, \pi + d_1] \cup [0, d_1 + d_2] & \text{if } d_1 \le 0,\ d_2 \ge 0,\ |d_2| \ge |d_1|, \\ [d_2, \pi + d_1] \cup [\pi + d_2 + d_1, \pi] & \text{if } d_1 \le 0,\ d_2 \ge 0,\ |d_2| \le |d_1|. \end{cases} \tag{3.6}$$

c) Let M = H². We say that T(d1, d2, s) is of type (A − A) if v1 ∈ V^A and v2 ∈ V^A. The other types: (A − B), (B − A), and (B − B) are defined analogously. We formulate the criteria of instability for T(d1, d2, s) “type-by-type”.

Type (A − A):

$$s \in \begin{cases} [d_1^A, d_2^A] \cup [d_1^A + d_2^A, \infty) & \text{if } d_1^A, d_2^A \ge 0, \\ [0, \infty) & \text{if } d_1^A, d_2^A \le 0, \\ [0, d_1^A + d_2^A] \cup [d_1^A, \infty) & \text{if } d_1^A \ge 0,\ d_2^A \le 0. \end{cases} \tag{3.7a}$$

Type (B − B):

$$s \in \begin{cases} [d_1^B + d_2^B, \infty) & \text{if } d_1^B + d_2^B \ge 0, \\ [0, \infty) & \text{if } d_1^B + d_2^B \le 0. \end{cases} \tag{3.7b}$$

Types (A − B) or (B − A):

$$s \in \begin{cases} [d_1^A, \infty) & \text{if } d_1^A \ge 0, \\ [0, \infty) & \text{if } d_1^A \le 0. \end{cases} \tag{3.7c}$$
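The link between the trace criterion of the Remark after Proposition 1 and the inequalities (3.2–3.4) can be checked numerically. The sketch below makes several assumptions that are not spelled out in the text: the actions (2.2) and (2.3–2.5) on χ are written as 2 × 2 Möbius matrices of determinant one, the trace of their period-two product is used in place of the trace of the linearized map, and in the case of H² only real values of d (type A points) are considered. All names are hypothetical. Under these assumptions one finds tr² − 4 = 16 · LHS / (f(d1) f(d2))², where LHS is the left-hand side of (3.2–3.4) and f is the identity, sin or sinh.

```python
import math

# For each surface: (function f in (3.2)-(3.4), map d -> D(v) from Eq. (2.8)).
SURFACES = {
    "R2": (lambda x: x, lambda d: d),
    "S2": (math.sin, math.tan),
    "H2": (math.sinh, math.tanh),  # assumption: real d only (type A points)
}


def propagation_matrix(surface, s):
    """Moebius matrix ((p, q), (r, t)) acting as chi -> (p*chi + q)/(r*chi + t)."""
    if surface == "R2":
        return ((1.0, 0.0), (-s, 1.0))
    if surface == "S2":
        return ((math.cos(s), math.sin(s)), (-math.sin(s), math.cos(s)))
    if surface == "H2":
        return ((math.cosh(s), -math.sinh(s)), (-math.sinh(s), math.cosh(s)))
    raise ValueError(f"unknown surface: {surface}")


def reflection_matrix(surface, d):
    """Eq. (2.6): chi -> chi + 2/D(v), with D(v) given by Eq. (2.8)."""
    return ((1.0, 2.0 / SURFACES[surface][1](d)), (0.0, 1.0))


def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def two_period_trace(surface, d1, d2, s):
    flight = propagation_matrix(surface, s)
    first_half = matmul(reflection_matrix(surface, d2), flight)   # fly, reflect at m2
    second_half = matmul(reflection_matrix(surface, d1), flight)  # fly, reflect at m1
    product = matmul(second_half, first_half)
    return product[0][0] + product[1][1]


def inequality_lhs(surface, d1, d2, s):
    f = SURFACES[surface][0]
    return f(s - d1) * f(s - d2) * f(s - d1 - d2) * f(s)


def stability_type(surface, d1, d2, s, tol=1e-9):
    trace = abs(two_period_trace(surface, d1, d2, s))
    if trace > 2.0 + tol:
        return "hyperbolic"
    if trace < 2.0 - tol:
        return "elliptic"
    return "parabolic"


for name in SURFACES:
    f = SURFACES[name][0]
    trace = two_period_trace(name, 0.4, 0.7, 0.5)
    scaled = 16.0 * inequality_lhs(name, 0.4, 0.7, 0.5) / (f(0.4) * f(0.7)) ** 2
    print(name, trace ** 2 - 4.0, scaled, stability_type(name, 0.4, 0.7, 0.5))
```

In the planar example d1 = 1, d2 = 2 the classifier returns elliptic for s = 0.5, parabolic for s = 3 (the endpoint d1 + d2) and hyperbolic for s = 4, in agreement with (3.5).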

It is worth mentioning that in Eqs. (3.2–3.4) (resp. Eqs. (3.5–3.7)) the hyperbolicity of T(d1, d2, s) corresponds to strict inequalities (resp. inclusions in the interior). The equality case (resp. boundary case) corresponds to the parabolicity of T(d1, d2, s). There are also two special cases when T(d1, d2, s) is parabolic independently of the value of s: M = R², d1 = d2 = −∞ and M = H², |d1| = |d2| = ∞ (which also means that v1, v2 ∈ V^A). We say that the right-hand side in Eqs. (3.5–3.7) is the instability set of T(d1, d2, s). In general, it is a union of two intervals, where one of them degenerates when |d1| = |d2|, while the other is always nontrivial. For want of a better name, we will say that the interval which persists is the “big interval”, and the other one is the “small interval”. This motivates the following terminology: we will say that T(d1, d2, s) is (strictly) B-unstable if s belongs to the (interior of the) big interval of instability. The proposition below makes this terminology explicit.

Hyperbolic Billiards on Surfaces of Constant Curvature

75

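Proposition 3. The g.t.p.t. $T(d_1, d_2, s)$ is B-unstable if (and only if) the triple $(d_1, d_2, s)$ satisfies the following conditions:

a) Let $M = \mathbb{R}^2$. Then

$$ s \in \begin{cases} [d_1 + d_2, \infty) & \text{if } d_1, d_2 \ge 0 \\ [0, \infty) & \text{if } d_1, d_2 \le 0 \\ [d_1, \infty) & \text{if } d_1 \ge 0,\ d_2 \le 0. \end{cases} \tag{3.8} $$

b) Let $M = \mathbb{S}^2$. Then

$$ s \bmod \pi \in \begin{cases} [d_1 + d_2, \pi] & \text{if } d_1, d_2 \ge 0 \\ [0, d_1 + d_2 + \pi] & \text{if } d_1, d_2 \le 0 \\ [d_2, \pi + d_1] & \text{if } d_1 \le 0,\ d_2 \ge 0. \end{cases} \tag{3.9} $$

c) Let $M = \mathbb{H}^2$. Then:

In the case $(A-A)$

$$ s \in \begin{cases} [d_1^A + d_2^A, \infty) & \text{if } d_1^A, d_2^A \ge 0 \\ [0, \infty) & \text{if } d_1^A, d_2^A \le 0 \\ [d_1^A, \infty) & \text{if } d_1^A \ge 0,\ d_2^A \le 0, \end{cases} \tag{3.10a} $$

or $|d_1^A| = |d_2^A| = \infty$ and arbitrary $s$.

In the case $(B-B)$

$$ s \in \begin{cases} [d_1^B + d_2^B, \infty) & \text{if } d_1^B + d_2^B \ge 0 \\ [0, \infty) & \text{if } d_1^B + d_2^B \le 0. \end{cases} \tag{3.10b} $$

In the cases $(A-B)$ or $(B-A)$

$$ s \in \begin{cases} [d_1^A, \infty) & \text{if } d_1^A \ge 0 \\ [0, \infty) & \text{if } d_1^A \le 0. \end{cases} \tag{3.10c} $$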
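The criteria (3.8) and (3.9) of Proposition 3 are piecewise interval tests, so they can be sketched as small predicates. The following is only an illustrative sketch: the function names are hypothetical, and the treatment of the mixed-sign case with the roles of $d_1$ and $d_2$ exchanged assumes that the criterion is symmetric under swapping the two points, which the proposition does not state explicitly.

```python
import math


def b_unstable_plane(d1, d2, s):
    """B-instability test for M = R^2, following Eq. (3.8)."""
    if s < 0:
        raise ValueError("the distance s must be nonnegative")
    if d1 >= 0 and d2 >= 0:
        return s >= d1 + d2
    if d1 <= 0 and d2 <= 0:
        return True  # the big interval is [0, infinity)
    # Mixed signs: Eq. (3.8) lists d1 >= 0 >= d2 with big interval [d1, inf).
    # Using max(d1, d2) for the swapped case is an assumption (symmetry).
    return s >= max(d1, d2)


def b_unstable_sphere(d1, d2, s):
    """B-instability test for M = S^2, following Eq. (3.9)."""
    if s < 0:
        raise ValueError("the distance s must be nonnegative")
    t = math.fmod(s, math.pi)  # s mod pi, in [0, pi)
    if d1 >= 0 and d2 >= 0:
        return d1 + d2 <= t <= math.pi
    if d1 <= 0 and d2 <= 0:
        return 0.0 <= t <= d1 + d2 + math.pi
    # Mixed signs: Eq. (3.9) lists d1 <= 0 <= d2 with interval [d2, pi + d1].
    # The swapped case is again handled by symmetry (assumption).
    return max(d1, d2) <= t <= math.pi + min(d1, d2)
```

For instance, `b_unstable_plane(1.0, 1.0, 2.0)` holds with equality in (3.8), which corresponds to the parabolic boundary case discussed above.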

4. Proofs of Theorem 1 and Theorem 2

Proof of the main theorem (Theorem 1). We will define a cone field on the phase space of the billiard map. A cone in $T_v V$ corresponds to an interval in the projectivization, $B_v$. In Sect. 2 we have explicitly identified each space $B_v$ with the standard projective line $\mathbb{R} \cup \infty$. Therefore, a cone field, $\mathcal{W}$, is determined by a function, $W(\cdot)$, on $V$, where each $W(v) \subset \mathbb{R} \cup \infty$ is an interval in the projective coordinate $\chi$.

We introduce an auxiliary coordinate $f$ on $B_v$, which has a simple geometric meaning. Let $b(v, \chi)$ be an infinitesimal beam, and let $G(v)$ be the corresponding oriented geodesic. Consider the beams $\Phi^t \cdot b(v, \chi)$, obtained by the action of the geodesic flow. Suppose that $M = \mathbb{R}^2$ or $M = \mathbb{S}^2$, or $M = \mathbb{H}^2$ and $|\chi| \ge 1$. Then there is $t \in \mathbb{R} \cup \infty$ such that the beam $\Phi^t \cdot b(v, \chi)$ has infinite curvature. If $M = \mathbb{R}^2$ or $M = \mathbb{H}^2$ ($|\chi| \ge 1$), then $t$ is unique, and we set $f(\chi) = t$. If $M = \mathbb{S}^2$, then $t$ is unique modulo $\pi$, and we let $f(\chi) \in [-\pi/2, \pi/2)$ be the value with the smallest absolute value. We denote by $o(v, \chi) \in M$ the origin point of $\Phi^{f(\chi)} \cdot v$. This is the focusing point of the infinitesimal beam $b(v, \chi)$, see Fig. 4a, b, c. If $M = \mathbb{H}^2$ and $|\chi| < 1$, then the beam $b(v, \chi)$ has no focusing point (Fig. 4d). While the focusing point, $o(v, \chi)$, depends on both $v$ and $\chi$, the signed focusing distance is determined by the curvature of the beam alone, $f = f(\chi)$. The explicit relations between $f$ and $\chi$ depend on $M$.

a) If $M = \mathbb{R}^2$, we have $\chi = 1/f$;
b) If $M = \mathbb{S}^2$, we have $\chi = \cot(f)$;
c) If $M = \mathbb{H}^2$ and $|\chi| \ge 1$, we have $\chi = \coth(f)$.

Fig. 4. Panels: $\mathbb{R}^2$; $\mathbb{H}^2$, $|\chi| \ge 1$; $\mathbb{S}^2$; $\mathbb{H}^2$, $|\chi| < 1$. Labels: $v$, $\chi$, $f(\chi)$, $O(v, \chi)$, $O'$.

We will define the cone field $\mathcal{W}$ using the projective coordinate $\chi$.

a) Let $M = \mathbb{R}^2$. Set

$$ W(v) = \begin{cases} [-\infty, D^{-1}(v)] & \text{if } D(v) \le 0 \\ [D^{-1}(v), +\infty] & \text{if } D(v) > 0. \end{cases} $$

b) Let $M = \mathbb{S}^2$. Set

$$ W(v) = \begin{cases} [-\infty, D^{-1}(v)] & \text{if } D(v) \le 0 \\ [D^{-1}(v), +\infty] & \text{if } D(v) > 0. \end{cases} $$

c) Let $M = \mathbb{H}^2$. We consider two cases.

1) If $v \in V^A$, we set

$$ W(v) = \begin{cases} [-\infty, D^{-1}(v)] & \text{if } D(v) \le 0 \\ [D^{-1}(v), +\infty] & \text{if } D(v) > 0. \end{cases} $$

2) If $v \in V^B$, then

$$ W(v) = [-\infty, D^{-1}(v)]. $$

In terms of the auxiliary coordinate $f$ the cone field $\mathcal{W}$ is given for $M = \mathbb{R}^2$ and $M = \mathbb{S}^2$ by the following intervals:

$$ W(v) = \begin{cases} [d(v), 0] & \text{if } d(v) \le 0 \\ [0, d(v)] & \text{if } d(v) > 0. \end{cases} $$
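The relations between the projective coordinate $\chi$ and the signed focusing distance $f$ stated above ($\chi = 1/f$, $\chi = \cot f$, $\chi = \coth f$) can be sketched as a pair of conversion helpers. The function names and the string labels for the three surfaces are hypothetical; the normalization $f \in [-\pi/2, \pi/2)$ on the sphere follows the convention in the text.

```python
import math


def chi_from_f(f, surface):
    """Projective coordinate chi of a beam with signed focusing distance f."""
    if surface == "R2":
        return 1.0 / f
    if surface == "S2":
        return math.cos(f) / math.sin(f)    # cot f
    if surface == "H2":
        return math.cosh(f) / math.sinh(f)  # coth f, so |chi| > 1
    raise ValueError("surface must be 'R2', 'S2' or 'H2'")


def f_from_chi(chi, surface):
    """Signed focusing distance f of a beam with projective coordinate chi."""
    if surface == "R2":
        return math.inf if chi == 0 else 1.0 / chi
    if surface == "S2":
        f = math.atan2(1.0, chi)            # arccot with values in (0, pi)
        return f - math.pi if f >= math.pi / 2 else f  # move to [-pi/2, pi/2)
    if surface == "H2":
        if abs(chi) < 1:
            raise ValueError("no focusing point when |chi| < 1")
        if abs(chi) == 1:
            return math.copysign(math.inf, chi)
        return math.atanh(1.0 / chi)        # arccoth
    raise ValueError("surface must be 'R2', 'S2' or 'H2'")
```

With these helpers, an interval of the cone field given in the $f$ coordinate can be moved to the $\chi$ coordinate endpoint by endpoint, keeping in mind that the map reverses orientation and that $f = 0$ corresponds to $\chi = \infty$.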

Fig. 5. Labels: $\nu$, $\phi(\nu)$, $\phi^2(\nu)$, $f_1$, $f_2$, $2d_1$, $2d_2$, $Q$.

In what follows, we will use the cone field $\mathcal{W}$ in one form or the other, whichever is more convenient. We recall the classification of points in the phase space of the billiard map. A point $v \in V$ is B-hyperbolic (we will also say strictly B-unstable) if the corresponding g.t.p.t. $T(v)$ is B-unstable and hyperbolic. A point is B-parabolic if $T(v)$ is B-unstable and parabolic. Putting the two definitions together, we will say that $v \in V$ is B-unstable if the corresponding g.t.p.t. $T(v)$ is B-unstable (i.e., either B-parabolic or B-hyperbolic). We will say that $v \in V$ is eventually strictly B-unstable if there exists $n \ge 0$ such that the points $\phi^i(v)$ are B-unstable for $0 \le i < n$ and $\phi^n(v)$ is strictly B-unstable.

Lemma 2. Let $M$ be a surface of constant curvature, let $Q \subset M$ be an arbitrary billiard table, and let $\mathcal{W}$ be the cone field defined above. Let $v \in V$ be such that the g.t.p.t. $T(v)$ is (strictly) B-unstable. Then $\phi(W(v)) \subseteq W(\phi(v))$ (resp. the strict inclusion $\phi(W(v)) \subset W(\phi(v))$ holds).

Proof. Let $(d_1, d_2, s)$ be the triple associated to $v$. We will prove the claim separately for each of the three surfaces.

a) Let $M = \mathbb{R}^2$ (Fig. 5). We rewrite Eq. (2.6) as

$$ \frac{d_2 - f_2}{f_2} = \frac{s - f_1 - d_2}{s - f_1}. \tag{4.1} $$

Since $(d_1, d_2, s)$ satisfies Eq. (3.8), we obtain $(d_2 - f_2)/f_2 \ge 0$. The inequality is strict if $T(v)$ is strictly B-unstable. This implies the claim.

b) Let $M = \mathbb{S}^2$ (Fig. 5). Equation (2.6) and the relation between $\chi$ and $f$ on $\mathbb{S}^2$ imply

$$ \frac{\sin(d_2 - f_2)}{\sin f_2} = \frac{\sin(s - f_1 - d_2)}{\sin(s - f_1)}. \tag{4.2} $$

Since the triple $(d_1, d_2, s)$ satisfies Eq. (3.9), $\sin(d_2 - f_2)/\sin f_2 \ge 0$ (strict inequality if $T(v)$ is strictly B-unstable). Simple considerations, which we leave to the reader, yield the claim.

c) Let $M = \mathbb{H}^2$. From Eqs. (2.5) and (2.6) we have

$$ \chi_2 = \frac{2}{D(v)} - \coth s + \frac{\sinh^{-2} s}{\coth s - \chi_1}. \tag{4.3} $$


Recall that $V = V^A \cup V^B$, a partition of $V$ into the sets of points of type A and type B. Hence, depending on the type of $v_i$, $i = 1, 2$, we have four cases to consider. We will prove the claim case-by-case.

Case $B-B$. From Eq. (4.3) and Eq. (3.10b), we obtain $\chi_2 \le \tanh d_2^B$, which implies the claim.

Case $B-A$. From Eq. (4.3) and Eq. (3.10c), we have $\chi_2 \in [-\infty, \coth d_2^A]$ if $d_2^A \le 0$, and $\chi_2 \in [\coth d_2^A, \infty]$ if $d_2^A > 0$. The claim follows.

Case $A-A$. From Eq. (4.3) and Eq. (3.10a), we obtain $\chi_2 \in [-\infty, \coth d_2^A]$ if $d_2^A \le 0$, and $\chi_2 \in [\coth d_2^A, \infty]$ if $d_2^A > 0$, which implies the claim.

Case $A-B$. From Eq. (4.3) and Eq. (3.10c), we have $\chi_2 \le \tanh d_2^B$, implying the claim.

This proves Lemma 2. □

Now we finish the proof of the main theorem. Since, by assumption, almost every point of the phase space is eventually strictly B-unstable, Lemma 2 implies that the cone field $\mathcal{W}$ is eventually strictly invariant. The claim now follows from a theorem of Wojtkowski [Wo1,Wo2]. □

Proof of Theorem 2. Let $l(v)$ and $r(v)$ be the left and the right endpoints of the interval $W(v)$ defined in terms of the projective coordinate (for the cone fields defined above $l(v)$ and $r(v)$ are either $\infty$ or $D^{-1}(v)$). Let $l_1(v)$ and $r_1(v)$ be the left and the right endpoints of the interval $\phi(W(v))$. Applying Theorem 2 in [Wo2] to the billiards satisfying the assumptions of the main theorem, we obtain

$$ \int_V \lambda_+ \, d\mu \ge \int_V \log \frac{\sqrt{\zeta} + 1}{\sqrt{\zeta} - 1} \, d\mu, \tag{4.4} $$

where

$$ \zeta(v) = \frac{r(\phi(v)) - l_1(v)}{r(\phi(v)) - r_1(v)} \cdot \frac{l(\phi(v)) - r_1(v)}{l(\phi(v)) - l_1(v)}. $$

Let $\phi_v$ be the map associated with the g.t.p.t. $T(v)$. By straightforward calculations

$$ \left( \frac{\sqrt{\zeta} + 1}{\sqrt{\zeta} - 1} \right)^{2} + \left( \frac{\sqrt{\zeta} + 1}{\sqrt{\zeta} - 1} \right)^{-2} = |\mathrm{tr}(D\phi_v^2)|. $$

The claim now follows from the inequality (4.4). □
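The cone invariance asserted in Lemma 2 can be checked numerically in the two cases where the argument reduces to a one-line inequality. The sketch below solves Eq. (4.1) for $f_2$ in the planar case with $d_1, d_2 > 0$, and evaluates Eq. (4.3) in the case $B-B$. All function names are hypothetical. In the hyperbolic case the code assumes $D^{-1}(v) = \tanh d^B(v)$ for points of type B; this is inferred from the bound $\chi_2 \le \tanh d_2^B$ in the proof, since the definition of $D$ lies outside this section.

```python
import math
import random


def next_focus_plane(f1, d2, s):
    """Solve Eq. (4.1), (d2 - f2)/f2 = (s - f1 - d2)/(s - f1), for f2."""
    ratio = (s - f1 - d2) / (s - f1)
    return d2 / (1.0 + ratio)


def next_chi_hyperbolic(chi1, inv_D, s):
    """Right-hand side of Eq. (4.3), with 2/D(v) passed as 2 * inv_D."""
    coth_s = math.cosh(s) / math.sinh(s)
    return 2.0 * inv_D - coth_s + 1.0 / (math.sinh(s) ** 2 * (coth_s - chi1))


def check_plane_invariance(trials=1000, seed=0):
    """Sample B-unstable triples with d1, d2 > 0 and test f2 in [0, d2]."""
    rng = random.Random(seed)
    for _ in range(trials):
        d1, d2 = rng.uniform(0.1, 2.0), rng.uniform(0.1, 2.0)
        s = d1 + d2 + rng.uniform(0.0, 3.0)   # Eq. (3.8), first case
        f1 = rng.uniform(0.0, d1)             # f1 in W(v) = [0, d1]
        f2 = next_focus_plane(f1, d2, s)
        if not (0.0 <= f2 <= d2 + 1e-12):
            return False
    return True


def check_hyperbolic_bb_invariance(trials=1000, seed=0):
    """Sample case B-B triples obeying Eq. (3.10b); test chi2 <= tanh d2."""
    rng = random.Random(seed)
    for _ in range(trials):
        d1, d2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s = max(d1 + d2, 0.0) + rng.uniform(0.05, 3.0)
        chi1 = math.tanh(d1) - rng.uniform(0.0, 5.0)  # chi1 <= tanh d1
        # Assumption: D^{-1}(v) = tanh d^B(v) for type B points.
        chi2 = next_chi_hyperbolic(chi1, math.tanh(d2), s)
        if chi2 > math.tanh(d2) + 1e-9:
            return False
    return True
```

Random sampling of this kind does not replace the proof; it only guards against sign or index slips when the formulas are transcribed.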

5. Applications and Examples There are many classes of planar domains with hyperbolic billiard dynamics [Wo2], [Bu3,4,Ma]; see also [Tab] and the references there. In Subsect. 5.1 we will apply the Main Theorem to obtain convenient sufficient conditions of hyperbolicity for elementary billiard tables on all surfaces of constant curvature. In Subsect. 5.2 we will use these conditions (as well as the Main Theorem directly) to construct several classes of examples of billiard tables with chaotic dynamics on S2 and H2 . In Subsect. 5.3, expanding the ideas of [Wo2] for billiards in R2 , we obtain a simple set of principles for constructing billiard tables with hyperbolic dynamics on arbitrary surfaces of constant curvature.


5.1. Elementary billiard tables: Conditions for hyperbolicity. We shall use the term "elementary billiard tables" to denote billiard tables $Q$ such that $\partial Q$ is a finite union of arcs, $\Gamma_i$, of constant geodesic curvature, $\kappa(\Gamma_i) = \kappa_i$. We will use the notation $\Gamma_i^+$ (resp. $\Gamma_i^-$, resp. $\Gamma_i^0$) to indicate that $\kappa_i > 0$ (resp. $\kappa_i < 0$, resp. $\kappa_i = 0$). Let $C_i$ be the curve of constant curvature containing $\Gamma_i$. Let $D_i \subset M$ be the smallest region such that $C_i = \partial D_i$. The representation $\partial Q = \cup_{i=1}^N \Gamma_i$ is unique, and we call the $\Gamma_i$ the components. We will refer to $\Gamma_i^+$ (resp. $\Gamma_i^-$, resp. $\Gamma_i^0$) as the components of type plus (resp. of type minus, resp. of type zero). Applying the Main Theorem to elementary billiard tables in $\mathbb{R}^2$, we recover a classical result of L. Bunimovich [Bu1].

Corollary 1. Let $Q \subset \mathbb{R}^2$ be an elementary billiard table with at least two boundary components, and assume that not all of them are neutral. If for every $\Gamma_i^+$ we have $D_i \subset Q$, then the billiard in $Q$ is hyperbolic.

The extension of this result to $M = \mathbb{S}^2$ and $M = \mathbb{H}^2$ will be given below. For this purpose we introduce the following terminology: if $R \subset S \subset M$ are regions with piecewise $C^1$ boundaries, we call the inclusion $R \subset S$ proper if $\partial R \cap \mathrm{int}\, S \ne \emptyset$.

Consider now an elementary billiard table $Q \subset \mathbb{S}^2$. For any domain $D \subset \mathbb{S}^2$ we denote by $-D \subset \mathbb{S}^2$ the domain obtained by the reflection of $D$ about the center of the sphere (polar domain).

Condition S1. The table $Q$ satisfies $D_i \subset Q$ for every boundary component $\Gamma_i^+$. Besides, either $-D_i \subset Q$, or $-D_i \subset \mathbb{S}^2 \setminus Q$, and the inclusions are proper.

Condition S2. For every $\Gamma_j^-$ we have $D_j \subset \mathbb{S}^2 \setminus Q$, and the inclusions $-D_j \subset \mathbb{S}^2 \setminus Q$, or $-D_j \subset Q$ are proper.

Corollary 2. Let $Q \subset \mathbb{S}^2$ be an elementary billiard table with at least two boundary components of nonzero type. If $Q$ satisfies Conditions S1 and S2, then the billiard in $Q$ is hyperbolic.

Outline of proof. Straightforward analysis shows that $Q$ satisfies the conditions of the Main Theorem.

Remark. Suppose $Q' = \mathbb{S}^2 \setminus Q$ is connected. If $Q$ satisfies Conditions S1 and S2, then $Q'$ also does, and hence the billiard in $Q'$ is hyperbolic.

Let $Q \subset \mathbb{H}^2$ be an elementary billiard table. We use the notation $\Gamma_i^A$ (resp. $\Gamma_i^B$) if $|\kappa_i| \ge 1$ (resp. $|\kappa_i| < 1$). In combination with the previous conventions, this yields the self-explanatory notation $\Gamma_i^{A+}$, $\Gamma_i^{A-}$, $\Gamma_i^{A0}$, $\Gamma_i^{B+}$, etc. We will call them the components of type A plus, B minus, etc.

Condition H1. For every component $\Gamma_i^{A+}$ of $\partial Q$, we have $D_i \subset Q$.

Condition H2. There are no components of type B+.

Corollary 3. Let $Q \subset \mathbb{H}^2$ be an elementary billiard table with at least two boundary components. If $Q$ satisfies Conditions H1 and H2, then the billiard in $Q$ is hyperbolic.

Outline of proof. The assumptions of Corollary 3 imply those of the Main Theorem.

Remark. The purpose of the assumptions that $\partial Q$ has at least two boundary components, and that the inclusions are proper, is to exclude degenerate situations where each $v \in V$ is B-parabolic. For instance, this is the case if $Q$ is a disc, or an annulus between concentric circles.
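The labelling of boundary components on $\mathbb{H}^2$ by their geodesic curvature, and Condition H2, can be sketched in a few lines. The helper names are hypothetical, and the sketch deliberately leaves out Condition H1, which requires the actual geometry of the regions $D_i$ and of the table $Q$ rather than the curvatures alone.

```python
def component_type(kappa):
    """Label a component of constant geodesic curvature kappa on H^2.

    'A' if |kappa| >= 1, 'B' if |kappa| < 1, followed by the sign of kappa
    ('+', '-' or '0'), following the conventions of Subsect. 5.1.
    """
    size = "A" if abs(kappa) >= 1 else "B"
    if kappa > 0:
        sign = "+"
    elif kappa < 0:
        sign = "-"
    else:
        sign = "0"
    return size + sign


def satisfies_condition_h2(curvatures):
    """Condition H2: the boundary has no components of type B+."""
    return all(component_type(k) != "B+" for k in curvatures)
```

A table with curvatures `[2.0, -0.5, 0.0]` passes this test, while one containing a component with curvature `0.5` does not, so Corollary 3 cannot be applied to the latter.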

Fig. 6. Panels a), b), c). Labels: $Q$; discs $D_1, \dots, D_4$ and polar discs $-D_1, \dots, -D_4$; periodic orbits; parallel circles.

5.2. Elementary hyperbolic billiard tables: Examples. Using Corollaries 2 and 3, we will produce examples of elementary billiard tables with hyperbolic dynamics in $\mathbb{S}^2$ and $\mathbb{H}^2$. Besides, we will give examples of elementary billiard tables that do not satisfy the assumptions of Corollaries 2 and 3, but have hyperbolic dynamics. We will prove the hyperbolicity of these billiards from the Main Theorem.

5.2.1. Examples on the sphere.

Spherical Lorentz gas. One of the first examples of hyperbolic billiards was the flat torus with a round hole, i.e., the Sinai billiard. This dynamical system is the simplest special case of the Lorentz gas, which is still actively investigated. The natural analog of the Lorentz gas on the sphere is the billiard table obtained by removing a finite number of disjoint discs, see Fig. 6a. Removing one disc, or a pair of parallel discs, we obtain an integrable billiard [Ve]. Let $D_i$, $1 \le i \le n$, be the removed discs, so that $Q = \mathbb{S}^2 \setminus \cup D_i$, and $n > 1$. If all intersections $D_i \cap \pm D_j$, $i \ne j$, are empty, then the billiard in $Q$ is hyperbolic, by Corollary 2, see Fig. 6b for $n = 2$. For these billiards the non-intersection condition above is also necessary for hyperbolicity. If it is not satisfied, then $Q$ has stable periodic orbits of period two. They go along the great circle which connects the centers of the two removed discs.

Let now $Q$ be obtained by removing $m$ pairs of parallel discs, $P_i$, $1 \le i \le m$, and $n$ single discs, $D_j$, $1 \le j \le n$, where $m + n > 1$. Consider the configuration $(\cup_{i=1}^m \pm P_i) \cup (\cup_{j=1}^n \pm D_j)$. Suppose that the only nonempty intersections are the trivial ones: $P_i \cap -P_i \ne \emptyset$, see Fig. 6c. Corollary 2 does not apply; however, a direct analysis shows that almost every point of the phase space is eventually strictly B-unstable. By the Main Theorem, these billiard tables are hyperbolic.

Pseudo-stadia. A pseudo-stadium on $\mathbb{S}^2$ is an elementary billiard table $Q$ such that $\partial Q$ has four components: two of them are parallel and of negative type, and the other two are of positive type, see Fig. 7. The two positive components may have the same curvature, Fig. 7a, or different curvatures, Fig. 7b. If $Q$ satisfies the conditions of the Main Theorem (like the pseudo-stadia in Figs. 7a, 7b), then $Q$ is hyperbolic.

Fig. 7. Panels a), b). Labels: $Q$.

Flowers. Figures 8a,b,c are examples of elementary billiard tables that belong to the class of "flowers". Some flowers satisfy the conditions of Corollary 2, and hence are hyperbolic. Note that the dual tables $Q' = \mathbb{S}^2 \setminus Q$ satisfy the conditions of Corollary 2 as well (see Figs. 8a,b,c). Hence, they are also hyperbolic.

Billiard tables with flat components. Let a billiard table $Q \subset \mathbb{S}^2$ (not necessarily elementary) have a flat component, $\Gamma^0 \subset \partial Q$. We apply to $Q$ the method of reflections, widely used to study billiards in polygons [Ge]. In a nutshell, we associate with $Q$ the table $Q_1$, which is the union of $Q$ and its reflection about $\Gamma^0$, see Fig. 9a. The billiard dynamics in $Q$ and $Q_1$ are essentially isomorphic. (We leave it to the reader to extend the argument of [Ge] from $\mathbb{R}^2$ to all surfaces of constant curvature.) Hence, if $Q_1$ satisfies the conditions of the Main Theorem, then the billiard in $Q$ is hyperbolic. Sometimes the method of reflections yields an easy proof of hyperbolicity. Figure 9a illustrates this point: the table $Q$ in Fig. 9a does not satisfy the conditions of Corollary 2, but $Q_1$ does. The preceding discussion implies that $Q$ is hyperbolic.
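For the spherical Lorentz gas described above, the non-intersection condition on the discs $D_i$ and their polar images $-D_j$ reduces to comparing angular distances between centers with sums of radii. The following sketch assumes that each removed disc is given by a unit vector (its center) and an angular radius; this data representation and the function names are illustrative choices, not taken from the paper.

```python
import math
from itertools import combinations


def angular_distance(u, v):
    """Angle between two unit vectors in R^3, in [0, pi]."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def lorentz_gas_condition(discs):
    """Check that D_i and +/-D_j are disjoint for all i != j.

    `discs` is a list of (center, radius) pairs: a unit vector and the
    angular radius of a closed spherical disc. The disc -D_j has center
    -c_j and the same radius, so its angular distance to c_i is pi minus
    the angular distance between c_i and c_j.
    """
    if len(discs) < 2:
        return False  # the text requires n > 1 removed discs
    for (c1, r1), (c2, r2) in combinations(discs, 2):
        a = angular_distance(c1, c2)
        if a <= r1 + r2:            # D_i meets D_j
            return False
        if math.pi - a <= r1 + r2:  # D_i meets -D_j
            return False
    return True
```

Two discs of radius 0.3 centered at the north pole and at a point of the equator pass the test, while a pair of antipodal (parallel) discs fails it, in agreement with the integrable case mentioned in the text.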

Fig. 8. Panels a), b), c). Labels: $Q$, $Q'$.

Let $\partial Q$ have two or more flat components. Then, typically, $Q$ does not satisfy the conditions of the Main Theorem. Let $Q_1$ be the table obtained by "reflecting and unfolding" $Q$ about the flat components any number of times (including infinity). Often, $Q_1$ is not a subset of $\mathbb{S}^2$ because of self-intersections. Then we think of $Q_1$ as a billiard table located in a branched covering of $\mathbb{S}^2$. Unfolding $Q$ infinitely many times, we can always assume that $Q_1$ has no flat components in its boundary. However, typically, the phase space of $Q_1$ will have points $v$ such that in the corresponding triple $(d_1, d_2, s)$ the distance $s$ is near $\pi$. Therefore, $Q_1$ does not satisfy the conditions of the Main Theorem. See, for example, the stadium in Fig. 9b. If $Q_1$ is located strictly inside a hemisphere (possibly with self-intersections), then this problem does not arise. In particular, if $Q_1$ satisfies the conditions of Corollary 2, then the billiard dynamics in $Q$ is hyperbolic. For instance, in Figs. 9c and 9d, $Q_1$ is inside the upper hemisphere, and satisfies the conditions of the Main Theorem. Hence, the "stadia" in Figs. 9c and 9d have hyperbolic billiard dynamics.

Fig. 9. Panels a), b), c), d). Labels: $Q$, $Q_1$.

5.2.2. Examples on the hyperbolic plane.

Analogs of the Sinai billiard. Consider the billiard tables $Q \subset \mathbb{H}^2$ (not necessarily elementary) such that $\partial Q$ has components of nonpositive curvature only (Fig. 10a). Let $v \in V$. If $v \in V^A$, then $d^A(v) \le 0$, and for $v \in V^B$ we also have $d^B(v) \le 0$. By Eq. (3.10), $Q$ satisfies the conditions of the Main Theorem, hence these billiard tables have hyperbolic dynamics.

Polygons. Let $Q$ be a geodesic polygon in $\mathbb{H}^2$, see Fig. 10b. Then $V = V^B$, and $d^B(v) = 0$ for every $v \in V$. By Eq. (3.10b), $Q$ satisfies the assumptions of the Main Theorem. Thus, geodesic polygons in $\mathbb{H}^2$ have hyperbolic dynamics. In fact, polygons are a special case of the Sinai billiards in $\mathbb{H}^2$.

Stadia. Let $Q \subset \mathbb{H}^2$ be an analog of the stadium: $\partial Q$ has four components, two of type zero and two of positive type (Fig. 11). Let $Q$ be any stadium, and let $Q_1$ be the table obtained by unfolding $Q$ about the flat components infinitely many times, see Fig. 11. If $Q_1$ satisfies the conditions of Corollary 3, then, applying the method of reflections [Ge], extended to the hyperbolic plane, we obtain that the billiard in $Q$ is hyperbolic. Figure 11 illustrates this point.

Flowers. This is another class of elementary billiard tables in $\mathbb{H}^2$ (Fig. 12). If $\partial Q$ satisfies Conditions H1 and H2 (see Figs. 12a and 12b), then, by Corollary 3, the billiard in $Q$ is hyperbolic.

Fig. 10. Panels a), b). Labels: $Q$.

Fig. 11. Labels: $Q$, $Q_1$; components of type A.

Fig. 12. Panels a), b). Labels: $Q$; components of type A and type B.

Fig. 13. Labels: $S_a$, $a$.

5.3. Convex scattering for billiards on surfaces of constant curvature. Let $M$ be a surface of constant curvature. In this subsection we consider billiard tables in $M$ with piecewise smooth boundary, $\partial Q = \cup_i \gamma_i$. We will investigate the conditions on the components $\gamma_i$ which ensure that the billiard in $Q$ is hyperbolic.

In [Wo2] Wojtkowski introduced the notion of convex scattering. By definition, a convex arc $\gamma \subset \mathbb{R}^2$ is convex scattering if it can be used as a component of a billiard table for which the cone field defined in [Wo2] is invariant. Using the notion of convex scattering, Wojtkowski introduced three "principles of design of billiards (in $\mathbb{R}^2$) with hyperbolic dynamics", and constructed several examples of such tables. In our notation, $\gamma \subset \mathbb{R}^2$ is convex scattering if for any $v \in V$ such that the origin points of $v$ and $\phi(v)$ belong to $\gamma$, the corresponding g.t.p.t. $T(v)$ is B-unstable. This condition is equivalent (see Eq. (3.8)) to the inequality $d_1 + d_2 \le s$ as it appears in [Wo2]. Let $l$ be the arclength parameter on $\gamma$, and let $r(l)$ be the radius of curvature. As shown in [Wo2], a convex arc $\gamma$ is convex scattering if and only if $r'' \le 0$.

In what follows we generalize the notion of convex scattering to $\mathbb{S}^2$ and $\mathbb{H}^2$. We call a convex curve $\gamma \subset M$ convex scattering if for any $v \in V$ such that the origin points of $v$ and $\phi(v)$ belong to $\gamma$, the corresponding g.t.p.t. $T(v)$ is B-unstable. Using Proposition 3 we will obtain geometric criteria for convex scattering. Then we will extend to $\mathbb{S}^2$ and $\mathbb{H}^2$ Wojtkowski's principles of design of billiards with hyperbolic dynamics.

Convex scattering and hyperbolic billiard tables in $\mathbb{S}^2$. A convex curve $\gamma \subset \mathbb{S}^2$ is convex scattering if for every pair of points $\gamma_0, \gamma_1 \in \gamma$ such that the arc of $\gamma$ between $\gamma_0$ and $\gamma_1$ lies entirely on one side of the geodesic passing through $\gamma_0$ and $\gamma_1$, we have

$$ d_1 + d_2 \le s \le \pi \tag{5.1} $$

(compare with condition (3.9)). For simplicity of exposition, we will restrict our attention to piecewise convex billiard tables. The Main Theorem yields the following principles for the design of billiard tables in $\mathbb{S}^2$ with hyperbolic dynamics:

P1: All components of $\partial Q$ are convex scattering.
P2: Every component of $\partial Q$ is sufficiently far, but not too far, from the other components.
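The planar criterion $r'' \le 0$ recalled in this subsection can be illustrated on the cardioid. The sketch below uses the standard closed-form expressions for the cardioid $\rho = 2a(1 - \cos t)$ in polar coordinates, namely arclength $l = 8a(1 - \cos(t/2))$ and radius of curvature $r = (8a/3)\sin(t/2)$; these formulas are classical facts brought in for illustration and are not taken from the paper. The second derivative is estimated by a central finite difference, and the function names are hypothetical.

```python
import math


def cardioid_radius_of_curvature(l, a=1.0):
    """Radius of curvature of the cardioid as a function of arclength l.

    The arclength is measured from the cusp, 0 < l < 16a. With
    c = cos(t/2) = 1 - l/(8a), the radius of curvature is
    (8a/3) * sin(t/2) = (8a/3) * sqrt(1 - c^2).
    """
    c = 1.0 - l / (8.0 * a)
    return (8.0 * a / 3.0) * math.sqrt(1.0 - c * c)


def second_derivative(g, x, h=1e-3):
    """Central finite-difference estimate of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


def is_strictly_concave(g, points):
    """Check g'' < 0 at the given sample points."""
    return all(second_derivative(g, x) < 0.0 for x in points)
```

For $a = 1$, `is_strictly_concave(cardioid_radius_of_curvature, range(1, 16))` samples the open arc away from the cusp; at the midpoint $l = 8$ the exact value of $r''$ is $-1/24$.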


More precisely, condition P2 means that any two consecutive bouncing points of the billiard ball satisfy Eq. (5.1), even if they belong to different components of the boundary. In particular, the interior angles between consecutive components of $\partial Q$ are greater than $\pi$. Let $\kappa(l)$ be the geodesic curvature of $\gamma$. In Appendix A we will show that the differential inequality $(\kappa^{-1})'' \le 0$ is necessary, but, in general, not sufficient for convex scattering. However, a sufficiently short arc satisfying $(\kappa^{-1})'' < 0$ is convex scattering.

Let $S_a$ be the spherical analog of the cardioid. It is the curve obtained by rolling a circle of radius $a$ on another circle of the same radius, see Fig. 13. For small $a$ the curve $S_a$ is well approximated by the cardioid $R_a$. Since $R_a$ is (strictly) convex scattering [Wo2], the curvature, $\kappa_a$, satisfies the inequality $\lim_{a \to 0} (\kappa_a^{-1})'' < 0$. Since $\tan r \sim a$, $(\tan r)' \sim a^0$ and $(\tan r)'' \sim a^{-1}$ as $a$ goes to zero, condition (A.5) is satisfied for sufficiently small $a$. Thus, there is a critical value, $a_{cr}$, such that for $a < a_{cr}$ the curve $S_a$ is convex scattering, and the billiard in it is hyperbolic. This approach generalizes to any curve on the sphere whose planar counterpart is strictly convex scattering.

Finally, let us mention here that the application of the Main Theorem to the concave billiards on the sphere leads to a hyperbolicity criterion which is closely related to the results of Vetier [Vet1,Vet2] (see also [KSS]). In fact, if a concave billiard on the sphere satisfies the Vetier conditions (conditions 1.2–1.4 in [KSS]), it also satisfies the conditions of the Main Theorem.

Convex scattering and hyperbolic billiard tables in $\mathbb{H}^2$. A convex curve $\gamma \subset \mathbb{H}^2$ is convex scattering if for each $v \in V$ such that the origin points of $v$ and $\phi(v)$ belong to $\gamma$, we have $v, \phi(v) \in V^A$ and

$$ d_1 + d_2 \le s \tag{5.2} $$

(compare with Eq. (3.10)). The differential inequality $(\kappa^{-1})'' \le 0$ is necessary but, in general, not sufficient for Eq. (5.2), see Appendix B. The strict inequality $(\kappa^{-1})'' < 0$ implies that every sufficiently short arc is convex scattering. The Main Theorem yields the following principles for the design of billiard tables in $\mathbb{H}^2$ with hyperbolic dynamics:

P1: All convex components of $\partial Q$ are convex scattering.
P2: Every convex component of $\partial Q$ is sufficiently far from any other component and satisfies $\kappa(l) \ge 1$.

More precisely, Condition P2 means that any two consecutive bouncing points of the billiard ball which belong to different components satisfy Eq. (3.10). This implies the following conditions on the angles between adjacent components of $\partial Q$.

P3: Let $\gamma', \gamma'' \subset \partial Q$ be two adjacent components. If they are both convex, then the angle between them is greater than $\pi$. If one of them is convex and the other is concave, then the angle is greater than or equal to $\pi$.

Remark. Comparing the principles of the design of hyperbolic billiard tables for the three types of surfaces of constant curvature, we see the same pattern. There are, however, important differences. For instance, on $\mathbb{S}^2$, we need to complement the requirement "to be far from each other" for the components of $\partial Q$ by the requirement "to be not too far". The other important difference is that on $\mathbb{S}^2$ and $\mathbb{H}^2$ the differential inequality $(\kappa^{-1})'' \le 0$ is necessary, but not sufficient for convex scattering, see the appendix below.

Acknowledgements. This work was supported partially by the Minerva Center for Nonlinear Physics of Complex Systems.

Fig. 14. Panels a), b). Labels: geodesics $\alpha$, $\beta$; point $A = (x, y)$; origin $O$; points $\gamma(l_0)$, $\gamma(l_1)$; angle $\theta$.

Appendix: Geometry of Convex Scattering on $\mathbb{S}^2$ and $\mathbb{H}^2$

We will investigate when a convex arc on the sphere or the hyperbolic plane is convex scattering. Let $M$ be any surface of constant curvature. Let $\gamma \subset M$ be any smooth curve, and let $\kappa(l)$ be the geodesic curvature of $\gamma$ (as a function of arclength). Let $r(l)$ be the radius of the osculating circle (hypercycle if $M = \mathbb{H}^2$ and $|\kappa(l)| < 1$). Then $\kappa = r^{-1}$ in $\mathbb{R}^2$, and $\kappa = \cot r$ for $\mathbb{S}^2$. On $\mathbb{H}^2$ we will modify the definition of $r(l)$. There are two cases, A and B (compare with Sect. 2), where $|\kappa(l)| > 1$ in case A, and $|\kappa(l)| \le 1$ in case B. We will denote by $r^A$ and $r^B$ respectively the radius of the osculating circle (hypercycle). In case A (resp. B) we have $\kappa = \coth r^A$ (resp. $\kappa = \tanh r^B$). We set $r = r^A$ and $r = r^B + i\pi/2$ respectively. Then $\kappa = \coth r$.

A. The Sphere

Let $\alpha$ and $\beta$ be a pair of orthogonal oriented geodesics on $\mathbb{S}^2$. For $A \in \mathbb{S}^2$ let $x$ and $y$ be the oriented distances from $A$ to $\alpha$ and $\beta$. Then $(x, y)$ is a coordinate system in $\mathbb{S}^2$. Let now $\gamma(l_0)$ and $\gamma(l_1)$ be two points on $\gamma$ such that the arc of $\gamma$ between $\gamma(l_0)$ and $\gamma(l_1)$ lies on one side of the geodesic passing through these points, see Fig. 14a. Let $\alpha$ be that geodesic, and let $\beta$ be such that in the parameterization $\gamma(l) = (x(l), y(l))$, $l_0 < l < l_1$, the coordinate $y$ takes its maximal value when $x = 0$, see Fig. 14a. Let $\theta(l)$ be the angle between $\gamma$ and the geodesic orthogonal to $\beta$ passing through $\gamma(l)$. By elementary geometry:

$$ \frac{dx}{dl} = \cos\theta, \qquad \frac{dy}{dl} = \frac{\sin\theta}{\cos x}, \tag{A.1a} $$

$$ \frac{d\theta}{dl} = \sin\theta \tan x - \cot r. \tag{A.1b} $$

Since $\gamma$ is convex, the inequality $s < \pi$ in Eq. (5.1) is satisfied for any two points of $\gamma$. It remains to consider the inequality $s \ge d_1 + d_2$. Set $\Delta = s - d_1 - d_2$. Then

$$ \Delta = \int \left[ d(\arctan(\tan r \sin\theta)) + dx \right] = \int dy \, \frac{\left[ (\tan r)' + \cos\theta \tan r (\tan x + \sin\theta \tan r) \right] \cos x}{1 + \tan^2 r \sin^2\theta}. $$

Since $y(l_0) = y(l_1) = 0$, we obtain

$$ \Delta = -\int dl \, \left[ (\tan r)'' + F(\theta(l), r(l), x(l)) \right] \frac{y \cos x}{1 + \tan^2 r \sin^2\theta}, \tag{A.2} $$

where we have set for brevity

$$ F(\theta, r, x) = \tan r \sin^2\theta \,(1 - \tan^2 x) - \sin^3\theta \tan x \tan^2 r + \sin\theta \tan x + (\tan r)' \tan r \sin 2\theta - \frac{(\tan r)' + \cos\theta \tan r (\tan x + \sin\theta \tan r)}{1 + \tan^2 r \sin^2\theta} \times \left( (\tan^2 r)' \sin^2\theta + \tan^2 r \sin\theta \sin 2\theta \tan x - \tan r \sin 2\theta \right). $$

Set $L = l_1 - l_0$. From Eq. (A.2) we have

$$ \Delta = -\frac{(\tan r)''}{12 \tan r} L^3 + O(L^4). \tag{A.3} $$

Thus, if the curve $\gamma$ is convex scattering, then the condition $(\tan r(l))'' \le 0$ holds everywhere on $\gamma$. Recall that $\tan r = \kappa^{-1}$. If the strict inequality $(\kappa^{-1}(l_0))'' < 0$ holds, then, by Eq. (A.3), there is $L_{cr}$ such that the arc $\gamma(l)$, $l \in [l_0, l_0 + L_{cr}]$, is convex scattering. Thus, any sufficiently short curve satisfying the condition $(\kappa^{-1})'' < 0$ is convex scattering.

By the choice of the coordinate system we have $|x(l)| \le \max(r)$ for the corresponding quantities on $\gamma(l)$, $l_0 \le l \le l_1$. Then, we can obtain for $F(\theta(l), r(l), x(l))$, $l_0 \le l \le l_1$, the estimate

$$ F < (\tan r)_{\max} \left( 1 + 3 (\tan r)^2_{\max} + 5 |(\tan r)'|_{\max} \right), \tag{A.4} $$

where $(\tan r)_{\max}$, $|(\tan r)'|_{\max}$ are the maxima of the respective quantities on $\gamma$ between the points $\gamma(l_0)$ and $\gamma(l_1)$. Equation (A.2) implies that if the inequality

$$ -(\tan r)'' > (\tan r)_{\max} \left( 1 + 3 (\tan r)^2_{\max} + 5 |(\tan r)'|_{\max} \right) \tag{A.5} $$

holds everywhere, then $\gamma$ is convex scattering.

B. The Hyperbolic Plane

Let $\alpha$ and $\beta$ be a pair of geodesics in $\mathbb{H}^2$, intersecting orthogonally. Just like in part A, we associate with this a coordinate system $(x, y)$ on the hyperbolic plane. For a convex curve, $\gamma$, and two points, $\gamma(l_0)$ and $\gamma(l_1)$ of $\gamma$, we choose the geodesics $\alpha$ and $\beta$ like in part A, see Fig. 14b. Then the curvature $\kappa = \coth r$ of $\gamma$ satisfies

$$ \frac{dx}{dl} = \cos\theta, \qquad \frac{dy}{dl} = \frac{\sin\theta}{\cosh x}, \tag{B.1a} $$

$$ \frac{d\theta}{dl} = -\sin\theta \tanh x - \coth r, \tag{B.1b} $$

where $\theta(l)$ is the angle between the geodesic through the point $A$, orthogonal to $\beta$, and $\gamma$. By straightforward calculations, we obtain

$$ \Delta = s - d_1 - d_2 = \int \left[ d(\mathrm{arctanh}(\tanh r \sin\theta)) + dx \right] = \int dy \, \frac{\left[ (\tanh r)' - \cos\theta \tanh r (\tanh x + \sin\theta \tanh r) \right] \cosh x}{1 - \tanh^2 r \sin^2\theta}. $$

Set

$$ F(\theta, r, x) = -\tanh r \sin^2\theta \,(1 + \tanh^2 x) - \sin^3\theta \tanh x \tanh^2 r - \sin\theta \tanh x - (\tanh r)' \tanh r \sin 2\theta + \frac{(\tanh r)' - \cos\theta \tanh r (\tanh x + \sin\theta \tanh r)}{1 - \tanh^2 r \sin^2\theta} \times \left( (\tanh^2 r)' \sin^2\theta - \tanh^2 r \sin\theta \sin 2\theta \tanh x - \tanh r \sin 2\theta \right). $$

Then, since $y(l_0) = y(l_1) = 0$, we have

$$ \Delta = -\int dl \, \left[ (\tanh r)'' + F(\theta(l), r(l), x(l)) \right] \frac{y \cosh x}{1 - \tanh^2 r \sin^2\theta}. \tag{B.2} $$

Let $L = l_1 - l_0$. By Eq. (B.2), we obtain

$$ \Delta(L) = -\frac{(\tanh r)''}{12 \tanh r} L^3 + O(L^4). \tag{B.3} $$

This leads to the necessary condition for a convex scattering curve on the hyperbolic plane: $(\kappa^{-1})'' \le 0$. Just like in part A, Eq. (B.3) implies that any sufficiently short arc satisfying $(\kappa^{-1})'' < 0$ is convex scattering.

References

[Bu1] Bunimovich, L. A.: Mathem. Sbornik 95, 49–73 (1974)
[Bu2] Bunimovich, L. A.: Commun. Math. Phys. 65, 295–312 (1979)
[Bu3] Bunimovich, L. A.: Chaos 1 (2), 187 (1991)
[Bu4] Bunimovich, L. A.: Lecture Notes in Math. Vol. 1514. Berlin–Heidelberg–New York: Springer-Verlag, 1991, pp. 62–82
[Do] Donnay, V. J.: Commun. Math. Phys. 141, 225–257 (1991)
[FLBP] Foden, C. L., Leadbeater, M. L., Burroughes, J. H., Peper, M.: J. Phys. Condens. Matter 6, L127 (1994)
[Gb] Gutkin, B.: Hyperbolic billiards in magnetic field on surfaces of constant curvature. In preparation
[Ge] Gutkin, E.: J. Stat. Phys. 83, 7–26 (1996)
[KH] Katok, A. and Hasselblatt, B.: Introduction to the modern theory of dynamical systems. Cambridge: Cambridge University Press, 1995
[KS] Katok, A. and Strelcyn, J.-M.: Invariant Manifolds, Entropy and Billiards; Smooth Maps with Singularities. Lecture Notes in Math. Vol. 1222. Berlin–Heidelberg–New York: Springer-Verlag, 1986
[KSS] Kramli, A., Simanyi, N., Szasz, D.: Commun. Math. Phys. 125, 439–457 (1989)
[Ma] Markarian, R.: Commun. Math. Phys. 118, 87–97 (1988)
[Pe] Pesin, Ya. B.: Russ. Math. Surv. 32, 55–114 (1977)
[Si] Sinai, Ya. G.: Russ. Math. Surv. 25, 137–189 (1970)
[Ta] Tasnadi, T.: Hard chaos in magnetic billiards (On the hyperbolic plane). J. Math. Phys. 39, 3783–3804 (1998)
[Tab] Tabachnikov, S.: Billiards. Societe Mathematique de France, 1995
[Ve] Veselov, A. P.: J. Geom. Phys. 7, 81–107 (1990)
[Vet1] Vetier, A.: Sinai billiard in potential field (contraction of stable and unstable fibers). Coll. Math. Soc. J. Bolyai 36, 1079–1146 (1982)
[Vet2] Vetier, A.: Sinai billiard in potential field (absolute continuity). Proc. 3rd Pann. Symp. J. Mogyorody, I. Vincze, W. Wertz (eds.). Budapest: Hungarian Academy of Sciences, 1982, pp. 341–351
[Vi] Vinberg, E. B.: Geometry 2. Encycl. of Math. Sc., Vol. 29. New York: Springer-Verlag, 1993
[Wa] Wall, H. S.: Continued Fractions. New York: D. Van Nostrand, 1948
[Wo1] Wojtkowski, M.: Erg. Theor. Dyn. Sys. 5, 145–161 (1985)
[Wo2] Wojtkowski, M.: Commun. Math. Phys. 105, 391–414 (1986)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 91 – 105 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Rigidity of $C^2$ Infinitely Renormalizable Unimodal Maps

W. de Melo¹, A. A. Pinto²

¹ IMPA, Rio de Janeiro, Brazil. E-mail: [email protected]
² DMA, Faculdade de Ciências, Universidade do Porto, 4000 Porto, Portugal. E-mail: [email protected]

Received: 16 March 1999 / Accepted: 21 May 1999

Abstract: Given $C^2$ infinitely renormalizable unimodal maps $f$ and $g$ with a quadratic critical point and the same bounded combinatorial type, we prove that they are $C^{1+\alpha}$ conjugate along the closure of the corresponding forward orbits of the critical points, for some $\alpha > 0$.

Contents

1. Introduction
2. Shadowing Unimodal Maps
3. Varying Quadratic-Like Maps
4. Proofs of the Main Results

1. Introduction

It was already clear more than 20 years ago, from the work of Coullet-Tresser and Feigenbaum, that the small scale geometric properties of the orbits of some one dimensional dynamical systems were related to the dynamical behavior of a non-linear operator, the renormalization operator, acting on a space of dynamical systems. This conjectural picture was mathematically established for some classes of analytic maps by Sullivan, McMullen and Lyubich. Here we will extend this description to the space of $C^2$ maps and prove a rigidity result for a class of unimodal maps of the interval.

As is well known, a unimodal map is a smooth endomorphism of a compact interval that has a unique critical point which is a turning point. Such a map is renormalizable if there exists an interval neighborhood of the critical point such that the first return map to this interval is again a unimodal map, and the return time is greater than one. The map is infinitely renormalizable if there exist such intervals with arbitrarily high return times. We say that two maps have the same combinatorial type if the map that sends the $i$th iterate of the critical point of the first map into the $i$th iterate of the critical point of the second map, for all $i \ge 0$, is order preserving. Finally, we say that the combinatorial type of an infinitely renormalizable map is bounded if the ratio of any two consecutive return times is uniformly bounded. A unimodal map $f$ is $C^r$ with a quadratic critical point if $f = \phi_f \circ p \circ \psi_f$, where $p(x) = x^2$ and $\phi_f$, $\psi_f$ are $C^r$ diffeomorphisms. Let $c_f$ be the critical point of $f$. In this paper we will prove the following rigidity result.

Theorem 1. Let $f$ and $g$ be $C^2$ unimodal maps with a quadratic critical point which are infinitely renormalizable and have the same bounded combinatorial type. Then there exists a $C^{1+\alpha}$ diffeomorphism $h$ of the real line such that $h(f^i(c_f)) = g^i(h(c_g))$ for every integer $i \ge 0$.

We observe that in Theorem 1 the Hölder exponent $\alpha > 0$ depends only upon the bound of the combinatorial type of the maps $f$ and $g$. Furthermore, as we will see in Sect. 2, the maps $f$ and $g$ are smoothly conjugated to $C^2$ normalized unimodal maps $F = \phi_F \circ p$ and $G = \phi_G \circ p$ with critical value 1, and the Hölder constant for the smooth conjugacy between the normalized maps $F$ and $G$ depends only upon the combinatorial type of $F$ and $G$, and upon the norms $\|\phi_F\|_{C^2}$ and $\|\phi_G\|_{C^2}$.

The conclusion of the above rigidity theorem was first obtained by McMullen in [16] under the extra hypothesis that $f$ and $g$ extend to quadratic-like maps in neighborhoods of the dynamical intervals in the complex plane. Combining this last statement with the complex bounds of Levin and van Strien in [11], we get the existence of a $C^{1+\alpha}$ map $h$ which is a conjugacy along the critical orbits for infinitely renormalizable real analytic maps with the same bounded combinatorial type.
We extend this result to $C^2$ unimodal maps in Theorem 1 by combining many results and ideas of Sullivan in [21] with recent results of McMullen in [15] and [16], and of Lyubich in [13], on the hyperbolicity of the renormalization operator $R$ (see the definition of $R$ in the next section). A main lemma used in the proof of Theorem 1 is the following:

Lemma 2. Let $f$ be a $C^2$ infinitely renormalizable map with bounded combinatorial type. Then there exist positive constants $\eta < 1$, $\mu$ and $C$, and a real quadratic-like map $f_n$ with conformal modulus greater than or equal to $\mu$, and with the same combinatorial type as the $n$th renormalization $R^n f$ of $f$, such that $\|R^n f - f_n\|_{C^0} < C\eta^n$ for every $n \ge 0$.

We observe that in this lemma the positive constants $\eta < 1$ and $\mu$ depend only upon the bound of the combinatorial type of the map $f$. For normalized unimodal maps $f$, the positive constant $C$ depends only upon the bound of the combinatorial type of the map $f$ and upon the norm $\|\phi_f\|_{C^2}$. This lemma generalizes a theorem of Sullivan (transcribed as Theorem 4 in Sect. 2) by adding that the map $f_n$ has the same combinatorial type as the $n$th renormalization $R^n f$ of $f$.

Now, let us describe the proof of Theorem 1, which also shows the relevance of Lemma 2: let $f$ and $g$ be $C^2$ infinitely renormalizable unimodal maps with the same bounded combinatorial type. Take $m$ to be of the order of a large but fixed fraction of $n$, and note that $n - m$ is also a fixed fraction of $n$. By Lemma 2, we obtain a real quadratic-like map $f_m$ exponentially close to $R^m f$, and a real quadratic-like map $g_m$ exponentially close to $R^m g$. Then we use Lemma 6 of Sect. 2.2 to prove that the $(n-m)$th renormalization iterates $R^n f$ of $R^m f$, and $R^n g$ of $R^m g$, stay exponentially close to the $(n-m)$th iterates $R^{n-m} f_m$ of $f_m$ and $R^{n-m} g_m$ of $g_m$, respectively. Again, by Lemma 2, the maps $f_m$ and $g_m$ have conformal modulus universally bounded away from zero, and have the same bounded combinatorial type as $R^m f$ and $R^m g$. Thus, by the main result of McMullen in [16], the $(n-m)$th renormalization iterates $R^{n-m} f_m$ of $f_m$ and $R^{n-m} g_m$ of $g_m$ are exponentially close. Therefore, $R^n f$ is exponentially close to $R^{n-m} f_m$, $R^{n-m} f_m$ is exponentially close to $R^{n-m} g_m$, and $R^{n-m} g_m$ is exponentially close to $R^n g$, and so, by the triangle inequality, the $n$th iterates $R^n f$ of $f$ and $R^n g$ of $g$ converge exponentially fast to each other. Finally, by Theorem 9.4 in the book [18] of de Melo and van Strien, we conclude that $f$ and $g$ are $C^{1+\alpha}$ conjugate along the closure of their critical orbits.

Let us point out the main ideas in the proof of Lemma 2: Sullivan in [21] proves that $R^n f$ is exponentially close to a quadratic-like map $F_n$ which has conformal modulus universally bounded away from zero. The quadratic-like map $F_n$ determines a unique quadratic map $P_{c(F_n)}(z) = 1 - c(F_n) z^2$ which is hybrid conjugated to $F_n$ by a $K$ quasiconformal homeomorphism, where $K$ depends only upon the conformal modulus of $F_n$ (see Theorem 1 of Douady–Hubbard in [6], and Lemma 11 in Sect. 3.3). In [13], Lyubich proves the bounded geometry of the Cantor set consisting of all the parameters of the quadratic family $P_c(z) = 1 - cz^2$ corresponding to infinitely renormalizable maps with combinatorial type bounded by $N$ (see the definition in Sect. 2 and the proof of Lemma 2). In Lemma 8 of Sect. 2.2, we prove that $R^n f$ and $F_n$ have exponentially close renormalization types. Therefore, letting $c_n$ be the parameter corresponding to the quadratic map $P_{c_n}$ with the same combinatorial type as $R^n f$, we have, from the above result of Lyubich, that $c(F_n)$ and $c_n$ are exponentially close. In Lemma 12 of Sect. 3.3, we use holomorphic motions to prove the existence of a real quadratic-like map $f_n$ which is hybrid conjugated to $P_{c_n}$, and has the following essential property: the distance between $F_n$ and $f_n$ is bounded by a constant multiple of the distance between $c(F_n)$ and $c_n$ raised to some positive power. Therefore, the real quadratic-like map $f_n$ has the same combinatorial type as $R^n f$, and $f_n$ is exponentially close to $F_n$. Since the map $F_n$ is exponentially close to $R^n f$, we obtain that the map $f_n$ is also exponentially close to $R^n f$.

The example of Faria and de Melo in [7] for critical circle maps can be adapted to prove the existence of a pair of $C^\infty$ unimodal maps, with the same unbounded combinatorial type, such that the conjugacy $h$ has no $C^{1+\alpha}$ extension to the reals for any $\alpha > 0$.

2. Shadowing Unimodal Maps

A $C^r$ unimodal map $F : I \to I$ is normalized if $I = [-1, 1]$, $F = \phi_F \circ p$, $F(0) = 1$, and $\phi_F : [0, 1] \to I$ is a $C^r$ diffeomorphism. A $C^r$ unimodal map $f = \phi_f \circ p \circ \psi_f$ with quadratic critical point either has trivial dynamics or has an invariant interval where it is $C^r$ conjugated to a $C^r$ normalized unimodal map $F$. Take, for instance, the map

$$\phi_F(x) = \left(\psi_f^{-1} \circ \phi_f(0)\right)^{-1} \cdot \psi_f^{-1} \circ \phi_f\left(\left(\psi_f^{-1} \circ \phi_f(0)\right)^{-2} \cdot x\right).$$

Therefore, from now on we will only consider $C^r$ normalized unimodal maps $f$. The map $f$ is renormalizable if there is a closed interval $J$ centered at the origin, strictly contained in $I$, and $l > 1$ such that the intervals $J, \ldots, f^{l-1}(J)$ are disjoint, $f^l(J)$ is strictly contained in $J$ and $f^l(0) \in \partial J$. If $f$ is renormalizable, we always consider the smallest $l > 1$ and the minimal interval $J_f = J$ with the above properties.
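The renormalizability conditions just stated can be checked numerically for the real quadratic family $P_c(x) = 1 - cx^2$ on $I = [-1, 1]$. The following is a minimal sketch and not part of the original argument: the function names (`quadratic`, `image`, `renormalization_period`), the cutoff `max_period`, and the use of non-strict inequalities for the inclusion of $f^l(J)$ in $J$ are assumptions made for illustration. Since $J$ is centered at the origin and $f^l(0) \in \partial J$, the sketch takes $J = [-|f^l(0)|, |f^l(0)|]$.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def quadratic(c: float, x: float) -> float:
    """Normalized quadratic map P_c(x) = 1 - c*x^2 (critical point 0, critical value 1)."""
    return 1.0 - c * x * x


def image(c: float, iv: Interval) -> Interval:
    """Exact image of a closed interval under P_c, which has its maximum at 0."""
    lo, hi = iv
    ends = (quadratic(c, lo), quadratic(c, hi))
    top = 1.0 if lo <= 0.0 <= hi else max(ends)
    return (min(ends), top)


def disjoint(a: Interval, b: Interval) -> bool:
    """True if the closed intervals a and b do not intersect."""
    return a[1] < b[0] or b[1] < a[0]


def renormalization_period(c: float, max_period: int = 16) -> Optional[int]:
    """Smallest l > 1 (up to max_period) meeting the renormalizability conditions, else None."""
    for l in range(2, max_period + 1):
        x = 0.0
        for _ in range(l):
            x = quadratic(c, x)
        r = abs(x)  # f^l(0) must lie on the boundary of J = [-r, r]
        if not 0.0 < r < 1.0:  # J must be a nondegenerate proper subinterval of I
            continue
        orbit: List[Interval] = [(-r, r)]
        for _ in range(l):
            orbit.append(image(c, orbit[-1]))
        first, ret = orbit[:l], orbit[l]  # J, ..., f^{l-1}(J) and f^l(J)
        pairwise = all(
            disjoint(first[i], first[j])
            for i in range(l)
            for j in range(i + 1, l)
        )
        inside = -r <= ret[0] and ret[1] <= r  # f^l(J) contained in J
        if pairwise and inside:
            return l
    return None
```

With these assumptions, `renormalization_period(1.4)` gives 2 (a period-doubling renormalization), `renormalization_period(1.76)` gives 3, and `renormalization_period(2.0)` gives `None`, because for $c = 2$ the only candidate interval is all of $I$.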


The set of all renormalizable maps is an open set in the $C^0$ topology. The renormalization operator $R$ acts on renormalizable maps $f$ by $Rf = \psi \circ f^l \circ \psi^{-1} : I \to I$, where $\psi : J_f \to I$ is the restriction of a linear map sending $f^l(0)$ into 1. Inductively, the map $f$ is $n$ times renormalizable if $R^{n-1} f$ is renormalizable. If $f$ is $n$ times renormalizable for every $n > 0$, then $f$ is infinitely renormalizable.

Let $f$ be a renormalizable map. We label the intervals $J_f, \ldots, f^{l-1}(J_f)$ of $f$ by $1, \ldots, l$ according to their embedding on the real line, from left to right. The permutation $\sigma_f : \{1, \ldots, l\} \to \{1, \ldots, l\}$ is defined by $\sigma_f(i) = j$ if the interval labeled by $i$ is mapped by $f$ to the interval labeled by $j$. The renormalization type of an $n$ times renormalizable map $f$ is given by the sequence $\sigma_f, \ldots, \sigma_{R^n f}$. An $n$ times renormalizable map $f$ has renormalization type bounded by $N > 1$ if the number of elements of the domain of each permutation $\sigma_{R^m f}$ is less than or equal to $N$ for every $0 \le m \le n$. We have the analogous notions for infinitely renormalizable maps. Note that if any two maps are $n$ times renormalizable and have the same combinatorial type (see the definition in the introduction), then they have the same renormalization type. The converse is also true in the case of infinitely renormalizable maps. An infinitely renormalizable map has combinatorial type bounded by $N > 1$ if the renormalization type is bounded by $N$.

If $f = \phi \circ p$ is $n$ times renormalizable, and $\phi \in C^2$, there is a $C^2$ diffeomorphism $\phi_n$ satisfying $R^n f = \phi_n \circ p$. The nonlinearity $nl(\phi_n)$ of $\phi_n$ is defined by

$$nl(\phi_n) = \sup_{x \in p(I)} \left| \frac{\phi_n''(x)}{\phi_n'(x)} \right|.$$

Let $\mathcal{I}(N, b)$ be the set of all $C^2$ normalized unimodal maps $f = \phi \circ p$ with the following properties: (i) $f$ is infinitely renormalizable; (ii) the combinatorial type of $f$ is bounded by $N$; (iii) $\|\phi\|_{C^2} \le b$.

Theorem 3 (Sullivan [21]). There exist positive constants $B$ and $n_1(b)$ such that, for every $f \in \mathcal{I}(N, b)$, the $n$th renormalization $R^n f = \phi_n \circ p$ of $f$ has the property that $nl(\phi_n) \le B$ for every $n \ge n_1$.

This theorem together with the Arzelà–Ascoli Theorem implies that, for every $0 \le \beta < 2$, and for every $n \ge n_1(b)$, the renormalization iterates $R^n f$ are contained in a compact set of unimodal maps with respect to the $C^\beta$ norm. We will use this fact in the proof of Lemma 5 below.

2.1. Quadratic-like maps. A quadratic-like map $f : V \to W$ is a holomorphic map with the property that $V$ and $W$ are simply connected domains with the closure of $V$ contained in $W$, and $f$ is a degree two branched covering map. We add an extra condition that $f$ has a continuous extension to the boundary of $V$. The conformal modulus of a quadratic-like map $f : V \to W$ is equal to the conformal modulus of the annulus $W \setminus V$. A real quadratic-like map is a quadratic-like map which commutes with complex conjugation. The filled Julia set $K(f)$ of $f$ is the set $\{z : f^n(z) \in V \text{ for all } n \ge 0\}$. Its boundary is the Julia set $J(f)$ of $f$. The sets $J(f)$ and $K(f)$ are connected if the critical point of $f$ is contained in $K(f)$. Let $\mathcal{Q}(\mu)$ be the set of all real quadratic-like maps $f : V \to W$ satisfying the following properties:


(i) the Julia set $J(f)$ of $f$ is connected;
(ii) the conformal modulus of $f$ is greater than or equal to $\mu$, and less than or equal to $2\mu$;
(iii) $f$ is normalized to have the critical point at the origin, and the critical value at one.

By Theorem 5.8 on p. 72 of [15], the set $\mathcal{Q}(\mu)$ is compact in the Carathéodory topology taking the critical point as the base point (see the definition on p. 67 of [15]).

Theorem 4 (Sullivan [21]). There exist positive constants $\gamma(N) < 1$, $C(b, N)$, and $\mu(N)$ with the following property: if $f \in \mathcal{I}(N, b)$, then there exists $f_n \in \mathcal{Q}(\mu)$ such that $\|R^n f - f_n\|_{C^0} \le C\gamma^n$.

In the following sections, we will develop the results that will be used in the last section to prove the generalization of Theorem 4 (as stated in Lemma 2), and to prove Theorem 1.

2.2. Maps with close combinatorics. Let $\mathcal{D}(\sigma)$ be the open set of all $C^0$ renormalizable unimodal maps $f$ with renormalization type $\sigma_f = \sigma$. The open sets $\mathcal{D}(\sigma)$ are pairwise disjoint. Let $\mathcal{E}(\sigma)$ be the complement of $\mathcal{D}(\sigma)$ in the set of all $C^0$ unimodal maps $f$.

Lemma 5. There exist positive constants $n_2(b)$ and $\varepsilon(N)$ with the following property: for every $f \in \mathcal{I}(N, b)$, for every $n \ge n_2$, and for every $g \in \mathcal{E}(\sigma_{R^n f})$, we have $\|R^n f - g\|_{C^0} > \varepsilon$.

Proof. Suppose, by contradiction, that there is a sequence $R^{m_1} f_1, R^{m_2} f_2, \ldots$ with the property that for a chosen $\sigma$ there is a sequence $g_1, g_2, \ldots \in \mathcal{E}(\sigma)$ satisfying $\|R^{m_i} f_i - g_i\|_{C^0} < 1/i$. By Theorem 3, there are $B > 0$ and $n_1(b) \ge 1$ such that the maps $R^{m_i} f_i$ have nonlinearity bounded by $B$ for all $m_i \ge n_1$. By the Arzelà–Ascoli Theorem, there is a subsequence $R^{m_{i_1}} f_{i_1}, R^{m_{i_2}} f_{i_2}, \ldots$ which converges in the $C^0$ topology to a map $g$. Hence, the map $g$ is contained in the boundary of $\mathcal{E}(\sigma)$ and is infinitely renormalizable. However, a map contained in the boundary of $\mathcal{E}(\sigma)$ is not renormalizable, and so we get a contradiction. □

Lemma 6. There exist positive constants $n_3(N, b)$ and $L(N)$ with the following property: for every $f \in \mathcal{I}(N, b)$, for every $C^2$ renormalizable unimodal map $g$, and for every $n > n_3$, we have $\|R^n f - Rg\|_{C^0} \le L \|R^{n-1} f - g\|_{C^0}$.

Proof. In the proof of this lemma we will use the inequality (1) below. Let $f_1, \ldots, f_m$ be maps with $C^1$ norm bounded by some constant $d > 0$, and let $g_1, \ldots, g_m$ be $C^0$ maps. By induction on $m$, and by the Mean Value Theorem, there is $c(m, d) > 0$ such that

$$\|f_1 \circ \cdots \circ f_m - g_1 \circ \cdots \circ g_m\|_{C^0} \le c \max_{i=1,\ldots,m} \{\|f_i - g_i\|_{C^0}\}. \tag{1}$$

Set $n_3 = \max\{n_1, n_2\}$, where $n_1(b)$ is defined as in Theorem 3, and $n_2(b)$ is defined as in Lemma 5. Set $F = R^{n-1} f$ with $n \ge n_3$. We start by considering the simple case (a), where $F$ and $g$ do not have the same renormalization type, and conclude with the complementary case (b). In case (a), by Lemma 5, there is $\varepsilon(N) > 0$ with the property that

$$\|RF - Rg\|_{C^0} \le 2 \le 2\varepsilon^{-1} \|F - g\|_{C^0}.$$


In case (b), there is $1 < m \le N$ such that $RF(x) = a_F F^m(a_F^{-1} x)$ and $Rg(x) = a_g g^m(a_g^{-1} x)$, where $a_F = F^m(0)$ and $a_g = g^m(0)$. By Theorem 3, there is a positive constant $B(N)$ bounding the nonlinearity of $F$. Since the set of all infinitely renormalizable unimodal maps $F$ with nonlinearity bounded by $B$ is a compact set with respect to the $C^0$ topology, and since $a_F$ varies continuously with $F$, there is $S(N) > 0$ with the property that $|a_F| \ge S$. Again, by Theorem 3, and by inequality (1), there is $c_1(N) > 0$ such that

$$\|F^m - g^m\|_{C^0} \le c_1 \|F - g\|_{C^0}. \tag{2}$$

Thus,

$$|a_F - a_g| \le c_1 \|F - g\|_{C^0}. \tag{3}$$

Now, let us consider the cases where (i) $\|F - g\|_{C^0} \ge S/(2c_1)$ and (ii) $\|F - g\|_{C^0} \le S/(2c_1)$. In case (i), we get

$$\|RF - Rg\|_{C^0} \le 2 \le 4c_1 S^{-1} \|F - g\|_{C^0}.$$

In case (ii), using that $|a_F| \ge S$ and (3), we get $|a_g| \ge |a_F| - S/2 \ge S/2$, and thus, by (2), we obtain

$$\left|a_F^{-1} - a_g^{-1}\right| \le \left|a_F^{-1}\right| \left|a_g^{-1}\right| |a_F - a_g| \le 2S^{-2} c_1 \|F - g\|_{C^0}.$$

Hence, again by (2) and (3), there is $c_2(N) > 0$ with the property that

$$\|RF - Rg\|_{C^0} \le \|F^m\|_{C^0} |a_F - a_g| + |a_g| \|F^m\|_{C^1} \left|a_F^{-1} - a_g^{-1}\right| + |a_g| \|F^m - g^m\|_{C^0} \le c_2 \|F - g\|_{C^0}.$$

Therefore, this lemma is satisfied with $L(N) = \max\{2\varepsilon^{-1}, 4c_1 S^{-1}, c_2\}$. □

Lemma 7. For all positive constants $\lambda < 1$ and $C$ there exist positive constants $\alpha(N, \lambda)$ and $n_4(b, N, \lambda, C)$ with the following property: for every $f \in \mathcal{I}(N, b)$, and every $n > n_4$, if $f_n$ is a $C^2$ unimodal map such that $\|R^n f - f_n\|_{C^0} < C\lambda^n$, then $f_n$ is $[\alpha n + 1]$ times renormalizable with $\sigma_{R^m f_n} = \sigma_{R^{n+m} f}$ for every $m = 0, \ldots, [\alpha n]$ (where $[y]$ means the integer part of $y > 0$).

Proof. Let $\varepsilon(N)$ and $n_2(b)$ be as defined in Lemma 5, and let $L(N)$ and $n_3(b)$ be as defined in Lemma 6. Take $\alpha > 0$ such that $L^\alpha \lambda < 1$. Set $n_4 \ge \max\{n_2, n_3\}$ such that $C\lambda^{n_4} < \varepsilon$ and $C\lambda^{n_4} L^{[\alpha n_4]} < \varepsilon$. Then, for every $n > n_4$, the values $C\lambda^n, C\lambda^n L, \ldots, C\lambda^n L^{[\alpha n]}$ are less than $\varepsilon$. By Lemma 5, if $\|R^n f - f_n\|_{C^0} < C\lambda^n < \varepsilon$ with $n > n_4$, then the map $f_n$ is contained in $\mathcal{D}(\sigma_{R^n f})$. Thus, $f_n$ is once renormalizable, and $\sigma_{f_n} = \sigma_{R^n f}$. By induction on $m = 1, \ldots, [\alpha n]$, let us suppose that $f_n$ is $m$ times renormalizable, and $\sigma_{R^i f_n} = \sigma_{R^{n+i} f}$ for every $i = 0, \ldots, m - 1$. By Lemma 6, we get that $\|R^{n+m} f - R^m f_n\|_{C^0} < C L^m \lambda^n < \varepsilon$. Hence, again by Lemma 5, the map $R^m f_n$ is once renormalizable, and $\sigma_{R^m f_n} = \sigma_{R^{n+m} f}$. □
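The role of the condition $L^\alpha \lambda < 1$ in the proof of Lemma 7 can be made explicit with a short computation. This is a sketch under the added assumption $L \ge 1$ (harmless, since the Lipschitz constant of Lemma 6 may always be enlarged); $\varepsilon$ denotes the constant of Lemma 5, and $k$ is an auxiliary index introduced here.

```latex
% Assumptions for this sketch: L >= 1 and 0 <= k <= [\alpha n] <= \alpha n.
C\lambda^{n}L^{k}
  \;\le\; C\lambda^{n}L^{\alpha n}
  \;=\; C\,\bigl(L^{\alpha}\lambda\bigr)^{n}
  \;\longrightarrow\; 0 \quad (n \to \infty),
  \qquad \text{since } L^{\alpha}\lambda < 1 .
```

In particular, any $n_4$ with $C (L^\alpha \lambda)^{n_4} < \varepsilon$ makes all the values $C\lambda^n L^k$, $0 \le k \le [\alpha n]$, smaller than $\varepsilon$ for $n \ge n_4$, which is what the induction on $m$ uses at each step.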


Lemma 8. There exist positive constants $\gamma(N) < 1$, $\alpha(N)$, $\mu(N)$, and $C(b, N)$ with the following property: for every $f \in \mathcal{I}(N, b)$, there exists $f_n \in \mathcal{Q}(\mu)$ such that
(i) $\|R^n f - f_n\|_{C^0} \le C\gamma^n$;
(ii) $f_n$ is $[\alpha n + 1]$ times renormalizable with $\sigma_{R^m f_n} = \sigma_{R^{n+m} f}$ for every $m = 0, \ldots, [\alpha n]$.

Proof. The proof follows from Theorem 4 and Lemma 7. □

3. Varying Quadratic-Like Maps

We start by introducing some classical results on Beltrami differentials and holomorphic motions, all of which we will apply later in this section to vary the combinatorics of quadratic-like maps.

3.1. Beltrami differentials. A homeomorphism $h : U \to V$, where $U$ and $V$ are contained in $\mathbb{C}$ or $\overline{\mathbb{C}}$, is quasiconformal if it has locally integrable distributional derivatives $\partial h$, $\bar\partial h$, and if there is $\varepsilon < 1$ with the property that $|\bar\partial h / \partial h| \le \varepsilon$ almost everywhere. The Beltrami differential $\mu_h$ of $h$ is given by $\mu_h = \bar\partial h / \partial h$. A quasiconformal map $h$ is $K$ quasiconformal if $K \ge (1 + \|\mu_h\|_\infty)/(1 - \|\mu_h\|_\infty)$. We denote by $D_R(c_0)$ the open disk in $\mathbb{C}$ centered at the point $c_0$ and with radius $R > 0$. We also use the notation $D_R = D_R(0)$ for the disk centered at the origin.

The following theorem is a slight extension of Theorem 4.3 on p. 27 of the book [9] by Lehto.

Theorem 9. Let $\psi : \mathbb{C} \to \mathbb{C}$ be a quasiconformal map with the following properties: (i) $\mu_\psi = \bar\partial\psi / \partial\psi$ has support contained in the disk $D_R$; (ii) $\|\mu_\psi\|_\infty < \varepsilon < 1$; (iii) $\lim_{|z| \to \infty} (\psi(z) - z) = 0$. Then there exists $C(\varepsilon, R) > 0$ such that $\|\psi - \mathrm{id}\|_{C^0} \le C \|\mu_\psi\|_\infty$.

Proof. Let us define $\phi_1 = \mu_\psi$, and, by induction on $i \ge 1$, we define $\phi_{i+1} = \mu_\psi H\phi_i$, where $H\phi_i$ is the Hilbert transform of $\phi_i$ given by the Cauchy Principal Value of

$$-\frac{1}{\pi} \iint_{\mathbb{C}} \frac{\phi_i(\xi)}{(\xi - z)^2} \, du \, dv.$$

By Theorem 4.3 on p. 27 of [9], we get $\psi(z) = z + \sum_{i=1}^{\infty} T\phi_i(z)$, where $T\phi_i(z)$ is given by

$$-\frac{1}{\pi} \iint_{\mathbb{C}} \frac{\phi_i(\xi)}{\xi - z} \, du \, dv.$$

By the Calderón–Zygmund inequality (see p. 27 of [9]), for every $p > 1$, the Hilbert operator $H : L^p \to L^p$ is bounded, and its norm $\|H\|_p$ varies continuously with $p$. An elementary integration also shows that $\|H\|_2 = 1$ (see p. 157 of [10]). Therefore, given that $\|\mu_\psi\|_\infty < \varepsilon$, there is $p_0(\varepsilon) > 2$ with the property that

$$\|H\|_{p_0} \|\mu_\psi\|_\infty < \|H\|_{p_0} \, \varepsilon < 1. \tag{4}$$


Since $p_0 > 2$, it follows from Hölder's inequality (see p. 141 of [10]) that there is a positive constant $c_1(p_0, R)$ such that

$$\|T\phi_i\|_{C^0} \le c_1 \|\phi_i\|_{p_0}. \tag{5}$$

By a simple computation, we get

$$\|\phi_i\|_{p_0} \le (\pi R^2)^{1/p_0} \, \|H\|_{p_0}^{i-1} \, \|\mu_\psi\|_\infty^{i}. \tag{6}$$

Thus, by inequalities (4), (5), and (6), there is a positive constant $c_2(\varepsilon, R)$ with the property that

$$\|\psi - \mathrm{id}\|_{C^0} \le \sum_{i=1}^{\infty} \|T\phi_i\|_{C^0} \le \frac{c_1 (\pi R^2)^{1/p_0} \|\mu_\psi\|_\infty}{1 - \|H\|_{p_0} \|\mu_\psi\|_\infty} \le c_2 \|\mu_\psi\|_\infty.$$

□
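The last chain of inequalities in the proof of Theorem 9 is a geometric series. The following worked step is a sketch that only combines (5) and (6) with the bound $\|\mu_\psi\|_\infty < \varepsilon$ from hypothesis (ii); the explicit value of $c_2$ it produces is one admissible choice, not necessarily the one the authors had in mind.

```latex
\sum_{i=1}^{\infty}\|T\phi_i\|_{C^0}
  \;\le\; c_1(\pi R^2)^{1/p_0}\,\|\mu_\psi\|_\infty
          \sum_{i=1}^{\infty}\bigl(\|H\|_{p_0}\|\mu_\psi\|_\infty\bigr)^{i-1}
  \;=\; \frac{c_1(\pi R^2)^{1/p_0}\,\|\mu_\psi\|_\infty}
             {1-\|H\|_{p_0}\|\mu_\psi\|_\infty}
  \;\le\; \underbrace{\frac{c_1(\pi R^2)^{1/p_0}}{1-\|H\|_{p_0}\,\varepsilon}}_{=:\,c_2}
          \,\|\mu_\psi\|_\infty .
```

The series converges because its ratio $\|H\|_{p_0}\|\mu_\psi\|_\infty$ is smaller than 1 by (4), and the same inequality keeps the denominator $1 - \|H\|_{p_0}\varepsilon$ positive.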

3.2. Holomorphic motions. A holomorphic motion of a subset $X$ of the Riemann sphere over a disk $D_R(c_0)$ is a family of maps $\psi_c : X \to X_c$ with the following properties: (i) $\psi_c$ is an injection of $X$ onto a subset $X_c$ of the Riemann sphere; (ii) $\psi_{c_0} = \mathrm{id}$; (iii) for every $z \in X$, $\psi_c(z)$ varies holomorphically with $c \in D_R(c_0)$.

Theorem 10 (Słodkowski [23]). Let $\psi_c : X \to X_c$ be a holomorphic motion over the disk $D_R(c_0)$. Then there is a holomorphic motion $\Psi_c : \mathbb{C} \to \mathbb{C}$ over the disk $D_R(c_0)$ such that (i) $\Psi_c|X = \psi_c$; (ii) $\Psi_c$ is a $K_c$ quasiconformal map with

$$K_c = \frac{R + |c - c_0|}{R - |c - c_0|}.$$

See also Douady's survey [5].

3.3. Varying the combinatorics. Let $\mathcal{M}$ be the set of all quadratic-like maps with connected Julia set. Let $\mathcal{P}$ be the set of all normalized quadratic maps $P_c : \mathbb{C} \to \mathbb{C}$ defined by $P_c(z) = 1 - cz^2$, where $c \in \mathbb{C} \setminus \{0\}$. Two quadratic-like maps $f$ and $g$ are hybrid conjugate if there is a quasiconformal conjugacy $h$ between $f$ and $g$ with the property that $\bar\partial h(z) = 0$ for almost every $z \in K(f)$. By Douady–Hubbard's Theorem 1 on p. 296 of [6], for every $f \in \mathcal{M}$ there exists a unique quadratic map $P_{c(f)}$ which is hybrid conjugated to $f$. The map $\xi : \mathcal{M} \to \mathcal{P}$ defined by $\xi(f) = P_{c(f)}$ is called the straightening.

Observe that a real quadratic map $P_c$ with $c \notin [1, 2]$ has trivial dynamics. Therefore, we will restrict our study to the set $\mathcal{Q}([1, 2], \mu)$ of all $f \in \mathcal{Q}(\mu)$ satisfying $\xi(f) = P_{c(f)}$ for some $c(f) \in [1, 2]$. Let us choose a radius $\Delta$ large enough such that, for every $c \in [1, 2]$, $P_c(z) = 1 - cz^2$ is a quadratic-like map when restricted to $P_c^{-1}(D_\Delta)$.
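To see why a sufficiently large radius works, write $\Delta$ for the radius chosen above and consider the following elementary estimate. It is a sketch under the assumptions $c \in [1, 2]$ and $\Delta \ge 2$, which give one sufficient choice and are not taken from the paper.

```latex
% Assumptions for this sketch: c \in [1,2] and \Delta \ge 2.
z \in P_c^{-1}(D_\Delta)
  \;\Longrightarrow\; c\,|z|^{2} \;\le\; |1 - cz^{2}| + 1 \;<\; \Delta + 1
  \;\Longrightarrow\; |z| \;<\; \sqrt{(\Delta + 1)/c} \;\le\; \sqrt{\Delta + 1} \;<\; \Delta ,
```

where the last inequality holds because $\Delta^2 - \Delta - 1 > 0$ for $\Delta \ge 2$. Hence the closure of $P_c^{-1}(D_\Delta)$ is contained in $D_\Delta$. Since the critical value $P_c(0) = 1$ lies in $D_\Delta$, the preimage is a topological disk on which $P_c$ is a degree two branched covering onto $D_\Delta$, as the definition of a quadratic-like map in Sect. 2.1 requires.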


Lemma 11. There exist positive constants $\Omega(\mu)$ and $K(\mu)$ with the following property: for every $f \in \mathcal{Q}([1, 2], \mu)$ there exists a topological disk $V_f \subset D_\Omega$ such that $f$ restricted to $f^{-1}(V_f)$ is a quadratic-like map. Furthermore, there is a $K$ quasiconformal homeomorphism $\Phi_f : \mathbb{C} \to \mathbb{C}$ such that (i) $\Phi_f | f^{-1}(V_f)$ is a hybrid conjugacy between $f$ and $P_{c(f)}$; (ii) $\Phi_f(V_f) = D_\Delta$; (iii) $\Phi_f$ is holomorphic over $\mathbb{C} \setminus V_f$; (iv) $\Phi_f(\bar z) = \overline{\Phi_f(z)}$.

Proof. The main point in this proof is to combine the hybrid conjugacy between $f$ and $P_{c(f)}$ given by Douady–Hubbard with Sullivan's pull-back argument and with McMullen's rigidity theorem for real quadratic maps. Using Sullivan's pull-back argument and the hybrid conjugacy between $f$ and $P_{c(f)}$, we construct a $K$ quasiconformal homeomorphism $\Phi_f : \mathbb{C} \to \mathbb{C}$ which restricts to a conjugacy between $f$ and $P_{c(f)}$. Moreover, $\Phi_f$ satisfies properties (ii), (iii) and (iv) of this lemma, and the restriction of $\Phi_f$ to the filled-in Julia set of $f$ extends to a quasiconformal map that is a hybrid conjugacy between $f$ and $P_{c(f)}$. By Rickman's glueing lemma (see Lemma 2 in [6]) it follows that $\Phi_f$ also satisfies property (i) of this lemma.

Now, we give the details of the proof: let us consider the set of all quadratic-like maps $f : W_f \to W_f'$ contained in $\mathcal{Q}([1, 2], \mu)$. Using the Koebe Distortion Lemma (see p. 84 of [2]), we can slightly shrink $f^{-n}(W_f')$ for some $n \ge 0$ to obtain an open set $V_f$ with the following properties:
(i) $V_f$ is symmetric with respect to the real axis;
(ii) the restriction of $f$ to $f^{-1}(V_f)$ is a quadratic-like map;
(iii) the annulus $V_f \setminus f^{-1}(V_f)$ has conformal modulus between $\mu/2$ and $2\mu$;
(iv) the boundaries of $V_f \setminus f^{-1}(V_f)$ are analytic $\gamma(\mu)$ quasi-circles for some $\gamma(\mu) > 0$, i.e., they are images of a Euclidean circle by $\gamma(\mu)$ quasiconformal maps defined on $\mathbb{C}$.

Let $\mathcal{Q}'$ be the set of all quadratic-like maps $f : f^{-1}(V_f) \to V_f$ contained in $\mathcal{Q}([1, 2], \mu/2) \cup \mathcal{Q}([1, 2], \mu)$ for which $V_f$ satisfies properties (i), ..., (iv) of the last paragraph. Since for every $f \in \mathcal{Q}'$ the boundaries of $V_f \setminus f^{-1}(V_f)$ are analytic $\gamma(\mu)$ quasi-circles, any convergent sequence $f_n \in \mathcal{Q}'$, with limit $g$, in the Carathéodory topology has the property that the sets $V_{f_n}$ converge to $V_g$ in the Hausdorff topology (see Sect. 4.1 on pp. 75–76 of [16]). Therefore, the set $\mathcal{Q}'$ is closed with respect to the Carathéodory topology, and hence is compact. Furthermore, by compactness of $\mathcal{Q}'$, and using the Koebe Distortion Lemma, there is a Euclidean disk $D_\Omega$ which contains $V_f$ for every $f \in \mathcal{Q}'$.

Now, let us construct $\Phi_f : \mathbb{C} \to \mathbb{C}$ such that the properties (i), ..., (iv) of this lemma are satisfied. Since $V_f$ is symmetric with respect to the real axis, there is a unique Riemann Mapping $\phi : \mathbb{C} \setminus V_f \to \mathbb{C} \setminus D_\Delta$ satisfying $\phi(\bar z) = \overline{\phi(z)}$, and such that $\phi(\mathbb{R}^+) \subset \mathbb{R}^+$. Since the boundaries of $V_f \setminus f^{-1}(V_f)$ are analytic $\gamma(\mu)$ quasi-circles, using the Ahlfors–Beurling Theorem (see Theorem 5.2 on p. 33 of [9]) the map $\phi$ has a $K_1(\mu)$ quasiconformal homeomorphic extension $\phi_1 : \mathbb{C} \to \mathbb{C}$ which is also symmetric: $\phi_1(\bar z) = \overline{\phi_1(z)}$.


Let $\phi_2 : V_f \setminus K(f) \to D_\Delta \setminus K(P_{c(f)})$ be the unique continuous lift of $\phi_1$ satisfying $P_{c(f)} \circ \phi_2(z) = \phi_1 \circ f(z)$, and such that $\phi_2(\mathbb{R}^+) \subset \mathbb{R}^+$. Since $\phi_1$ is a $K_1(\mu)$ quasiconformal homeomorphism, so is $\phi_2$. Using the Ahlfors–Beurling Theorem, we construct a $K_2(\mu)$ quasiconformal homeomorphism $\phi_3 : \mathbb{C} \setminus K(f) \to \mathbb{C} \setminus K(P_{c(f)})$ interpolating $\phi_1$ and $\phi_2$ with the following properties: (i) $\phi_3(z) = \phi_1(z)$ for every $z \in \mathbb{C} \setminus V_f$; (ii) $\phi_3(z) = \phi_2(z)$ for every $z \in f^{-1}(V_f) \setminus K(f)$; (iii) $\phi_3(\bar z) = \overline{\phi_3(z)}$. Then the map $\phi_3$ conjugates $f$ on $\partial f^{-1}(V_f)$ with $P_{c(f)}$ on $\partial P_{c(f)}^{-1}(D_\Delta)$, and is holomorphic over $\mathbb{C} \setminus V_f \supset \mathbb{C} \setminus D_\Omega$.

By Theorem 1 in [6], there is a $K_f'$ quasiconformal hybrid conjugacy $\phi_4 : V_f' \to V_{c(f)}'$ between $f$ and $P_{c(f)}$, where $V_f'$ is a neighbourhood of $K(f)$. Using the Ahlfors–Beurling Theorem, we construct a $K_f''$ quasiconformal homeomorphism $\Phi_0 : \mathbb{C} \to \mathbb{C}$ interpolating $\phi_3$ and $\phi_4$ such that (i) $\Phi_0(z) = \phi_3(z)$ for every $z \in \mathbb{C} \setminus f^{-1}(V_f)$; (ii) $\Phi_0(z) = \phi_4(z)$ for every $z \in K(f)$; (iii) $\Phi_0(\bar z) = \overline{\Phi_0(z)}$. Then the map $\Phi_0$ conjugates $f$ on $K(f) \cup \partial f^{-1}(V_f)$ with $P_{c(f)}$ on

$$K(P_{c(f)}) \cup \partial P_{c(f)}^{-1}(D_\Delta),$$

and satisfies the properties (ii), (iii) and (iv) as stated in this lemma. Furthermore, $\mu_{\Phi_0}(z) = 0$ for every $z \in \mathbb{C} \setminus V_f$, $|\mu_{\Phi_0}(z)| \le (K_2 - 1)/(K_2 + 1)$ for a.e. $z \in V_f \setminus f^{-1}(V_f)$, and $\mu_{\Phi_0}(z) = 0$ for a.e. $z \in K(f) \setminus J(f)$.

For every $n > 0$, let us inductively define the $K_f''$ quasiconformal homeomorphism $\Phi_n : \mathbb{C} \to \mathbb{C}$ as follows: (i) $\Phi_n(z) = \Phi_{n-1}(z)$ for every $z \in (\mathbb{C} \setminus f^{-n}(V_f)) \cup K(f)$; (ii) $P_{c(f)} \circ \Phi_n(z) = \Phi_{n-1} \circ f(z)$ for every $z \in f^{-n}(V_f) \setminus K(f)$. By compactness of the set of all $K_f''$ quasiconformal homeomorphisms on $\mathbb{C}$ fixing three points (0, 1 and $\infty$), there is a subsequence $\Phi_{n_j}$ which converges to a $K_f''$ quasiconformal homeomorphism $\Phi_f$. Then $\Phi_f$ satisfies the properties (ii), (iii) and (iv) as stated in this lemma. The restriction of $\Phi_f$ to the set $f^{-1}(V_f)$ is a quasiconformal conjugacy between $f$ and $P_{c(f)}$. Furthermore, the Beltrami differential $\mu_{\Phi_f}$ has the following properties: (i) $\mu_{\Phi_f}(z) = 0$ for every $z \in \mathbb{C} \setminus V_f$; (ii) $|\mu_{\Phi_f}(z)| \le (K_2 - 1)/(K_2 + 1)$ for a.e. $z \in V_f \setminus K(f)$; (iii) $\mu_{\Phi_f}(z) = 0$ for a.e. $z \in K(f) \setminus J(f)$. Therefore, by Rickman's glueing lemma, $\Phi_f : \mathbb{C} \to \mathbb{C}$ is a $K_2(\mu)$ quasiconformal homeomorphism, and $\Phi_f$ restricted to the set $f^{-1}(V_f)$ is a hybrid conjugacy between $f$ and $P_{c(f)}$. □


The lemma below could be proven using the external fibers and the fact that the holonomy of the hybrid foliation is quasiconformal, as in [13]. However, we will give a more direct proof of it below.

Lemma 12. There exist positive constants $\beta(\mu) \le 1$, $D(\mu)$, and $\mu'(\mu)$ with the following property: for every $c \in [1, 2]$, and for every $f \in \mathcal{Q}([1, 2], \mu)$, there is $f_c \in \mathcal{Q}([1, 2], \mu')$ satisfying $\xi(f_c) = P_c$, and such that

$$\|f - f_c\|_{C^0(I)} \le D |c(f) - c|^\beta. \tag{7}$$

Proof. The main step of this proof consists of constructing the real quadratic-like maps $f_c = \psi_c \circ P_c \circ \psi_c^{-1}$ satisfying $f_{c(f)} = f$, and such that the maps $\omega_c : \mathbb{C} \to \mathbb{C}$ defined by $\omega_c = \psi_c \circ \psi_{c(f)}^{-1}$ form a holomorphic motion and are holomorphic on the complement of a disk centered at the origin. Using Theorem 9 and Theorem 10, we prove that there is a positive constant $L_3$ with the property that $\|\omega_c - \mathrm{id}\|_{C^0} \le L_3 |c - c(f)|$. Finally, we show that this implies the inequality (7) above.

Now, we give the details of the proof: let us choose a small $\varepsilon > 0$, and a small open set $U$ of $\mathbb{C}$ containing the interval $[1, 2]$ such that, for every $c \in U$, the quadratic map $P_c(z) = 1 - cz^2$ has a quadratic-like restriction to $P_c^{-1}(D_\Delta)$, and $P_c^{-1}(D_\Delta) \subset D_{\Delta - \varepsilon}$. Let $\eta : \mathbb{C} \to \mathbb{R}$ be a $C^\infty$ function with the following properties: (i) $\eta(z) = 1$ for every $z \in \mathbb{C} \setminus D_\Delta$; (ii) $\eta(z) = 0$ for every $z \in D_{\Delta - \varepsilon}$; (iii) $\eta(\bar z) = \eta(z)$ for every $z \in \mathbb{C}$.

There is a unique continuous lift $\alpha_c : \mathbb{C} \setminus P_{c_0}^{-1}(D_\Delta) \to \mathbb{C} \setminus P_c^{-1}(D_\Delta)$ of the identity map such that (i) $P_c \circ \alpha_c(z) = P_{c_0}(z)$; (ii) $\alpha_{c_0} = \mathrm{id}$; (iii) $\alpha_c(z)$ varies continuously with $c$. Then the maps $\alpha_c$ are holomorphic injections, and, for every $z \in \mathbb{C} \setminus P_{c_0}^{-1}(D_\Delta)$, $\alpha_c(z)$ varies holomorphically with $c$.

Let $\beta_c : \mathbb{C} \setminus P_{c_0}^{-1}(D_\Delta) \to \mathbb{C} \setminus P_c^{-1}(D_\Delta)$ be the interpolation between the identity map and $\alpha_c$ defined by $\beta_c = \eta \cdot \mathrm{id} + (1 - \eta) \cdot \alpha_c$. We choose $r' > 0$ small enough such that, for every $c_0 \in [1, 2]$, and $c \in D_{r'}(c_0) \subset U$, $\beta_c$ is a diffeomorphism. Then $\beta_c : \mathbb{C} \setminus P_{c_0}^{-1}(D_\Delta) \to \mathbb{C} \setminus P_c^{-1}(D_\Delta)$ is a holomorphic motion over $D_{r'}(c_0)$ with the following properties: (i) the map $\beta_c$ is a conjugacy between $P_{c_0}$ on $\partial P_{c_0}^{-1}(D_\Delta)$ and $P_c$ on $\partial P_c^{-1}(D_\Delta)$; (ii) the restriction of $\beta_c$ to the set $\mathbb{C} \setminus D_\Delta$ is the identity map; (iii) if $c$ is real then $\beta_c(\bar z) = \overline{\beta_c(z)}$.

By Theorem 10, $\beta_c$ extends to a holomorphic motion $\hat\beta_c : \mathbb{C} \to \mathbb{C}$ over $D_{r'}(c_0)$, and, by taking $r = r'/2$, the map $\hat\beta_c$ is 3 quasiconformal for every $c \in D_r(c_0)$.
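The constant 3 comes directly from the dilatation formula of Theorem 10, applied with the radius $R = r'$ of the original parameter disk and with $c$ restricted to the half-radius disk $D_r(c_0)$, $r = r'/2$. The computation below is a short sketch of that step.

```latex
|c - c_0| < r = \tfrac{r'}{2}
  \;\Longrightarrow\;
  K_c \;=\; \frac{r' + |c - c_0|}{r' - |c - c_0|}
      \;<\; \frac{r' + r'/2}{r' - r'/2} \;=\; 3 ,
```

since $t \mapsto (r' + t)/(r' - t)$ is increasing on $[0, r')$. Shrinking the parameter disk by any fixed factor would give a uniform bound in the same way; the factor $1/2$ is simply the choice that yields the constant 3.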
By Lemma 11, there is a $K(\mu)$ quasiconformal homeomorphism $\Phi_f : \mathbb{C} \to \mathbb{C}$, and an open set $V_f = \Phi_f^{-1}(D_\Delta)$ such that (i) $\Phi_f$ restricted to $f^{-1}(V_f)$ is a hybrid conjugacy between $f$ and $P_{c(f)}$; (ii) $\Phi_f$ is holomorphic over $\mathbb{C} \setminus V_f$; and (iii) $\Phi_f(\bar z) = \overline{\Phi_f(z)}$. Let $\Phi_c : \mathbb{C} \to \mathbb{C}$ be defined by $\Phi_c = \hat\beta_c \circ \Phi_f$. Then, for every $c \in D_r(c_0)$, $\Phi_c$ is a $3K$ quasiconformal homeomorphism which conjugates $f$ on $\partial f^{-1}(V_f)$ with $P_c$ on $\partial P_c^{-1}(D_\Delta)$. We define the Beltrami differential $\mu_c$ as follows:


(i) $\mu_c(z) = 0$ if $z \in K(P_c) \cup (\mathbb{C} \setminus D_\Delta)$;
(ii) $(\Phi_c)^* \mu_c(z) = 0$ if $z \in D_\Delta \setminus P_c^{-1}(D_\Delta)$;
(iii) $(P_c^n)^* \mu_c(z) = \mu_c(P_c^n(z))$ if $z \in P_c^{-n}(D_\Delta) \setminus P_c^{-(n+1)}(D_\Delta)$ and $n \ge 1$.

Then (i) the Beltrami differential $\mu_c$ varies holomorphically with $c$; (ii) $\|\mu_c\|_\infty < (3K - 1)/(3K + 1)$ for every $c \in D_r(c(f))$; and (iii) if $c$ is real then $\mu_c(\bar z) = \overline{\mu_c(z)}$ for almost every $z \in \mathbb{C}$. By the Ahlfors–Bers Theorem (see [3]), for every $c \in D_r(c(f))$ there is a normalized $3K$ quasiconformal homeomorphism $\psi_c : \mathbb{C} \to \mathbb{C}$ with $\psi_c(0) = 0$, $\psi_c(1) = 1$, and $\psi_c(\infty) = \infty$ such that $\mu_{\psi_c} = \mu_c$, and $\psi_c(z)$ varies holomorphically with $c$. Thus, the restriction of $\psi_c$ to $\mathbb{C} \setminus D_\Delta$ is a holomorphic map, and if $c$ is real then $\psi_c(\bar z) = \overline{\psi_c(z)}$ for every $z \in \mathbb{C}$. The map $f_c : \psi_c(P_c^{-1}(D_\Delta)) \to \psi_c(D_\Delta)$ defined by $f_c = \psi_c \circ P_c \circ \psi_c^{-1}$ is 1 quasiconformal, and thus a holomorphic map. Furthermore, the map $f_c$ is hybrid conjugated to $P_c$, and so $f_c$ is a quadratic-like map whose straightening $\xi(f_c)$ is $P_c$. Since the conformal modulus of the annulus $\psi_c(D_\Delta) \setminus \psi_c(P_c^{-1}(D_\Delta))$ depends only on $3K(\mu)$, we obtain that there is a positive constant $\mu'(\mu)$ such that the conformal modulus of $f_c$ is greater than or equal to $\mu'(\mu)$. If $c$ is real then $f_c(\bar z) = \overline{f_c(z)}$, which implies that $f_c$ is a real quadratic-like map.

For the parameter $c(f)$, the map $\psi_{c(f)} \circ \Phi_f$ is 1 quasiconformal and fixes three points (0, 1 and $\infty$). Therefore, $\psi_{c(f)} \circ \Phi_f$ is the identity map, and since the map $\psi_{c(f)} \circ \Phi_f$ conjugates $f$ with $f_{c(f)}$, we get $f_{c(f)} = f$.

Now, let us prove that the quadratic-like map $f_c$ satisfies inequality (7). By compactness of the set of all $3K(\mu)$ quasiconformal homeomorphisms $\phi$ on $\mathbb{C}$ fixing three points (0, 1 and $\infty$), there are positive constants $l(s, \mu) \le s \le L(s, \mu)$ for every $s > 0$ with the property that

$$D_l \subset \phi(D_s) \quad \text{and} \quad \mathbb{C} \setminus D_L \subset \phi(\mathbb{C} \setminus D_s). \tag{8}$$

Thus, there is $\Delta'' = L(L(\Delta))$ with the property that $\omega_c = \psi_c \circ \psi_{c(f)}^{-1}$ is holomorphic in $\mathbb{C} \setminus D_{\Delta''}$ for every $c \in D_r(c(f))$, and $c(f) \in [1, 2]$. Let $S_{2\Delta''}$ be the circle centered at the origin and with radius $2\Delta''$. By (8), we obtain that $\omega_c(S_{2\Delta''})$ is at a uniform distance from 0 and $\infty$ for every $c \in D_r(c(f))$, and $c(f) \in [1, 2]$. Hence, by the Cauchy Integral Formula, and since $\omega_c$ is a holomorphic motion over $D_r(c(f))$, the value $a_c = \omega_c'(\infty)$ varies holomorphically with $c$, and there is a constant $L_1(\mu) > 0$ with the property that

$$|a_c - 1| < L_1 |c - c(f)|. \tag{9}$$

Thus, (i) the map $a_c \omega_c$ is holomorphic in $\mathbb{C} \setminus D_{\Delta''}$; (ii) $\|\mu_{a_c \omega_c}\|_\infty$ is less than or equal to $(9K^2 - 1)/(9K^2 + 1)$; and (iii) $\lim_{|z| \to \infty} (a_c \omega_c(z) - z) = 0$. Hence, by Theorem 9, there is a positive constant $L_2(\mu)$ such that, for every $c \in D_r(c(f))$, and for every $c(f) \in [1, 2]$, we get

$$\|a_c \omega_c - \mathrm{id}\|_{C^0} \le L_2 \|\mu_{a_c \omega_c}\|_\infty. \tag{10}$$

Since $a_c \omega_c$ is a holomorphic motion over $D_r(c(f))$, and by Theorem 10, we get

$$\|\mu_{a_c \omega_c}\|_\infty \le \frac{|c - c(f)|}{r}. \tag{11}$$


By inequalities (9), (10), and (11) there is a positive constant $L_3(\mu)$ such that, for every $c(f) \in [1, 2]$, and for every $c \in (c(f) - r, c(f) + r)$, we obtain

\[ \|\omega_c - \mathrm{id}\|_{C^0(I)} < L_3 |c - c(f)|. \tag{12} \]

This implies that

\[ \|\omega_c^{-1} - \mathrm{id}\|_{C^0(I)} < L_3 |c - c(f)|. \tag{13} \]

Since $\omega_c$ is a $9K^2$-quasiconformal homeomorphism, and fixes three points, we obtain from Theorem 4.3 on p. 70 of [10] that there are positive constants $\beta(\mu) \le 1$ and $L_4(\mu)$ with the property that $\|\omega_c\|_{C^\beta(I)} < L_4$. Then by inequalities (12) and (13) there is a positive constant $L_5(\mu)$ such that, for every $c(f) \in [1, 2]$, and for every $c \in (c(f) - r, c(f) + r)$, we have

\begin{align*}
\|f_c - f_{c(f)}\|_{C^0(I)} &\le \|\omega_c - \mathrm{id}\|_{C^0(I)} + \|\omega_c\|_{C^\beta(I)} \|P_c - P_{c(f)}\|_{C^0(I)}^{\beta} \\
&\quad + \|\omega_c\|_{C^\beta} \|P_{c(f)}\|_{C^1(I)}^{\beta} \|\omega_c^{-1} - \mathrm{id}\|_{C^0(I)}^{\beta} \le L_5 |c - c(f)|^{\beta}.
\end{align*}

Finally, by increasing the constant $L_5$ if necessary, we obtain that the last inequality is also satisfied for every $c(f)$ and $c$ contained in $[1, 2]$. □

4. Proofs of the Main Results

4.1. Proof of Lemma 2. Let $f = \phi_f \circ p$ be a $C^2$ infinitely renormalizable map with bounded combinatorial type. Let $N$ be such that the combinatorial type of $f$ is bounded by $N$, and set $b = \|\phi_f\|_{C^2}$. By Lemma 8, there are positive constants $\gamma(N) < 1$, $\alpha(N)$, $\mu(N)$, and $c_1(b, N)$ with the following properties: for every $n \ge 0$, there is an $[\alpha n + 1]$ times renormalizable quadratic-like map $F_n$ with renormalization type $\sigma(n) = \sigma_{R^n f}, \dots, \sigma_{R^{n+[\alpha n]} f}$, with conformal modulus greater than or equal to $\mu$, and satisfying

\[ \|R^n f - F_n\|_{C^0(I)} \le c_1 \gamma^n. \tag{14} \]

By Milnor–Thurston's topological classification (see [14] and Theorem 4.2a on p. 470 of [18]), the set of real values $c$ for which the real quadratic map $P_c(z) = 1 - cz^2$ has renormalization type $\sigma(n)$ is an interval $I_{\sigma(n)}$. Thus, by Sullivan's pull-back argument (see [21] and Theorem 4.2b on p. 471 of [18]), there is a unique $c_n \in I_{\sigma(n)}$ such that $P_{c_n}$ has the same combinatorial type as $R^n(f)$. By Douady–Hubbard's Theorem 1 in [6], there is a unique quadratic map $\xi(F_n) = P_{c(F_n)}$ which is hybrid conjugated to $F_n$. Since $F_n$ has renormalization type $\sigma(n)$, the parameter $c(F_n)$ belongs to $I_{\sigma(n)}$. By Lyubich's Theorem 9.6 on p. 79 of [13], there are positive constants $\lambda(N) < 1$ and $c_2(N)$ such that $|I_{\sigma(n)}| \le c_2 \lambda^n$. Therefore, $|c_n - c(F_n)| \le c_2 \lambda^n$. By Lemma 12, there are positive constants $\beta(\mu) < 1$, $D(\mu)$, and $\mu_0(\mu)$ with the following properties: for every $n \ge 0$, there is a real quadratic-like map $f_n$ with conformal modulus greater than or equal to $\mu_0$, satisfying $\xi(f_n) = P_{c_n}$, and such that

\[ \|f_n - F_n\|_{C^0(I)} \le D |c_n - c(F_n)|^{\beta} \le D c_2^{\beta} \lambda^{\beta n}. \]

Therefore, the map $f_n$ has the same combinatorial type as $R^n(f)$, and, by inequality (14), for $C(b, N) = c_1 + D c_2^{\beta}$ and $\eta(N) = \max\{\gamma, \lambda^{\beta}\}$, we get

\[ \|R^n f - f_n\|_{C^0(I)} \le C \eta^n. \]

□
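The last step of this proof merges two geometric rates into a single one. The following is a minimal numerical sketch of that arithmetic, not part of the original argument: the function name `merged_rate` and all sample constants are hypothetical. It checks that $c_1\gamma^n + D(c_2\lambda^n)^\beta \le C\eta^n$ when $C = c_1 + Dc_2^\beta$ and $\eta = \max\{\gamma, \lambda^\beta\}$.

```python
def merged_rate(c1, gamma, D, c2, lam, beta):
    """Return (C, eta) such that c1*gamma**n + D*(c2*lam**n)**beta <= C*eta**n."""
    C = c1 + D * c2 ** beta
    eta = max(gamma, lam ** beta)
    return C, eta


# Hypothetical sample constants; any 0 < gamma, lam, beta < 1 behave the same way.
c1, gamma, D, c2, lam, beta = 2.0, 0.9, 3.0, 1.5, 0.8, 0.5
C, eta = merged_rate(c1, gamma, D, c2, lam, beta)
for n in range(50):
    lhs = c1 * gamma ** n + D * (c2 * lam ** n) ** beta
    # small tolerance for floating-point rounding
    assert lhs <= C * eta ** n + 1e-12
```

Both terms on the left decay at least as fast as $\eta^n$, which is why taking the maximum of the two rates and the sum of the two constants suffices.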


4.2. Proof of Theorem 1. Let $f = \phi_f \circ p$ and $g = \phi_g \circ p$ be any two $C^2$ infinitely renormalizable unimodal maps with the same bounded combinatorial type. Let $N$ be such that the combinatorial types of $f$ and $g$ are bounded by $N$, and set $b = \max\{\|\phi_f\|_{C^2}, \|\phi_g\|_{C^2}\}$. For every $n \ge 0$, let $m = [\alpha n]$, where $0 < \alpha < 1$ will be fixed later in the proof. By Lemma 2, there are positive constants $\eta(N) < 1$ and $c_1(b, N)$, and there are infinitely renormalizable real quadratic-like maps $F_m$ and $G_m$ with the following property:

\[ \|R^m f - F_m\|_{C^0(I)} \le c_1 \eta^{\alpha n} \quad\text{and}\quad \|R^m g - G_m\|_{C^0(I)} \le c_1 \eta^{\alpha n}. \tag{15} \]

By Lemma 6, there are positive constants $n_3(b)$ and $L(N)$ such that, for every $m > n_3$, we get

\[ \|R^n f - R^{n-m} F_m\|_{C^0(I)} \le L^{n-m} \|R^m f - F_m\|_{C^0(I)} \le c_1 \left(L^{1-\alpha} \eta^{\alpha}\right)^n, \tag{16} \]

and, similarly,

\[ \|R^n g - R^{n-m} G_m\|_{C^0(I)} \le c_1 \left(L^{1-\alpha} \eta^{\alpha}\right)^n. \tag{17} \]

Now, we fix $0 < \alpha(N) < 1$ such that $L^{1-\alpha} \eta^{\alpha} < 1$. Again, by Lemma 2, $F_m$ and $G_m$ have conformal modulus greater than or equal to $\mu(N)$, and the same combinatorial type as $R^m f$ and $R^m g$. Therefore, by McMullen's Theorem 9.22 on p. 172 of [16], there are positive constants $\nu_2(N) < 1$ and $c_2(\mu, N)$ with the property that

\[ \|R^{n-m} F_m - R^{n-m} G_m\|_{C^0(I)} \le c_2 \nu_2^{n-m}. \tag{18} \]
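The choice of $\alpha$ made above can be written down explicitly. As a hedged sketch (the helper names `alpha_threshold` and `contraction_rate` and the sample values are hypothetical, and we assume $L > 1$ and $0 < \eta, \nu_2 < 1$), the condition $L^{1-\alpha}\eta^{\alpha} < 1$ is equivalent to $\alpha > \log L / (\log L - \log \eta)$, and any such $\alpha < 1$ gives a rate $\max\{L^{1-\alpha}\eta^{\alpha}, \nu_2^{1-\alpha}\}$ strictly below 1:

```python
import math


def alpha_threshold(L, eta):
    """Value of alpha above which L**(1 - alpha) * eta**alpha < 1 (assumes L > 1, 0 < eta < 1)."""
    return math.log(L) / (math.log(L) - math.log(eta))


def contraction_rate(L, eta, nu2, alpha):
    """The rate max{L^(1-alpha) * eta^alpha, nu2^(1-alpha)} for a given alpha in (0, 1)."""
    return max(L ** (1 - alpha) * eta ** alpha, nu2 ** (1 - alpha))


# Hypothetical sample constants, for illustration only.
L, eta, nu2 = 2.0, 0.5, 0.9
a0 = alpha_threshold(L, eta)
alpha = (a0 + 1) / 2  # any value strictly between the threshold and 1 works
print(a0, alpha, contraction_rate(L, eta, nu2, alpha))
```

The threshold always lies in $(0, 1)$ under these assumptions, so an admissible $\alpha$ exists; pushing $\alpha$ closer to 1 improves the first term of the maximum but worsens the second.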

By inequalities (16), (17), and (18), there are constants $c_3(b, N) = 2c_1 + c_2$ and $\nu_3(N) = \max\{L^{1-\alpha} \eta^{\alpha}, \nu_2^{1-\alpha}\}$ such that $\|R^n f - R^n g\|_{C^0(I)} \le c_3 \nu_3^n$. By Theorem 9.4 on p. 552 of [18], the exponential convergence implies that there is a $C^{1+\alpha}$ diffeomorphism which conjugates $f$ and $g$ along the closure of the corresponding orbits of the critical points for some $\alpha(N) > 0$. □

The exponential convergence of the renormalization operator in the space of real analytic unimodal maps holds for every combinatorial type. Indeed, if $f$ and $g$ are real analytic infinitely renormalizable maps, by the complex bounds in Theorem A of Levin–van Strien in [11], there exists an integer $N$ such that $R^N(f)$ and $R^N(g)$ have quadratic-like extensions. Then we can use Lyubich's Theorem 1.1 in [12] to conclude the exponential convergence. However, as we pointed out before, this is not sufficient to give the $C^{1+\alpha}$ rigidity. Finally, at the moment, we cannot prove the exponential convergence of the operator for $C^2$ mappings with unbounded combinatorics.

Acknowledgements. Alberto Adrego Pinto would like to thank IMPA, University of Warwick, and IMS at SUNY Stony Brook for their hospitality. We would like to thank Edson de Faria and Mikhail Lyubich for useful discussions. This work has been partially supported by the Pronex Project on Dynamical Systems, Fundação para a Ciência, Praxis XXI from M.C.T., PRODYN from ESF, Calouste Gulbenkian Foundation, and Centro de Matemática Aplicada, da Universidade do Porto, Portugal.


References

1. Ahlfors, L.V.: Lectures on quasiconformal mappings. Princeton, NJ: D. van Nostrand Company, Inc., 1966
2. Ahlfors, L.V.: Conformal invariants, topics in geometric function theory. New York: McGraw-Hill, 1973
3. Ahlfors, L.V. and Bers, L.: Riemann’s mapping theorem for variable metrics. Annals of Math. (2) 72, 385–404 (1960)
4. Coullet, P. and Tresser, C.: Itération d’endomorphismes et groupe de renormalisation. J. Phys. Colloque C 539, C5–25 (1978)
5. Douady, A.: Prolongement des mouvements holomorphes [d’après Słodkowski et autres]. In: Séminaire Bourbaki (Nov. 93), 7–20, Astérisque, v. 227–228, 1995
6. Douady, A. and Hubbard, J.H.: On the dynamics of polynomial-like maps. Ann. Sci. Éc. Norm. Sup. 18, 287–343 (1985)
7. de Faria, E. and de Melo, W.: Rigidity of critical circle maps I. IMS Stony Brook Preprint 1997/16 (1997)
8. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52 (1978)
9. Lehto, O.: Univalent functions and Teichmüller spaces. Graduate Texts in Mathematics 109, Berlin–Heidelberg–New York: Springer-Verlag, 1987
10. Lehto, O. and Virtanen, K.I.: Quasiconformal mappings in the plane. Berlin–Heidelberg–New York: Springer-Verlag, 1973
11. Levin, G. and van Strien, S.: Local connectivity of the Julia set of real polynomials. Annals of Math. 147, 471–541 (1998)
12. Lyubich, M.: Almost every real quadratic map is either regular or stochastic. IMS Stony Brook Preprint 1997/8 (1997)
13. Lyubich, M.: Feigenbaum–Coullet–Tresser Universality and Milnor’s Hairiness conjecture. IHES Preprint 1-31. To be published in Annals of Math.
14. Milnor, J.: The monotonicity theorem for real quadratic maps. Mathematische Arbeitstagung Bonn, 1983
15. McMullen, C.: Complex dynamics and renormalization. Annals of Math. Studies, v. 135, Princeton, NJ: Princeton University Press, 1994
16. McMullen, C.: Renormalization and 3-Manifolds which Fiber over the Circle. Annals of Math. Studies, v. 142, Princeton, NJ: Princeton University Press, 1996
17. de Melo, W.: Rigidity and renormalization in the one dimensional dynamical systems. In: Proceedings of the International Congress of Mathematicians, Berlin 1998, 765–779. Documenta Mathematica 1998
18. de Melo, W. and van Strien, S.: One-Dimensional Dynamics. A Series of Modern Surveys in Mathematics, Berlin–Heidelberg–New York: Springer-Verlag, 1993
19. Pinto, A.A. and Rand, D.: Global phase space universality, smooth conjugacies and renormalization: 2. The $C^{k+\alpha}$ case using rapid convergence of Markov families. Nonlinearity 4, 1–31 (1991)
20. Rand, D.: Global phase space universality, smooth conjugacies and renormalization: 1. The $C^{1+\alpha}$ case. Nonlinearity 1, 181–202 (1988)
21. Sullivan, D.: Bounds, quadratic differentials, and renormalization conjectures. AMS Centennial Publications. In: Volume 2: Mathematics into the Twenty-first Century (1988 Centennial Symposium, August 8–12). Providence, RI: American Mathematical Society, 1991
22. Sullivan, D.: Linking the universalities of Milnor–Thurston, Feigenbaum and Ahlfors-Bers. In: L. R. Goldberg and A. V. Phillips, editors, Topological Methods in Modern Mathematics, Publish or Perish, Inc., 1993, pp. 543–563
23. Słodkowski, Z.: Holomorphic motions and polynomial hulls. Proc. Am. Math. Soc. 111, 347–355 (1991)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 107 – 123 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

A Multiplicative Ergodic Theorem and Nonpositively Curved Spaces

Anders Karlsson*, Gregory A. Margulis**

Department of Mathematics, Yale University, New Haven, CT 06520, USA. E-mail: [email protected]; [email protected]

Received: 27 April 1999 / Accepted: 25 May 1999

Abstract: We study integrable cocycles $u(n, x)$ over an ergodic measure preserving transformation that take values in a semigroup of nonexpanding maps of a nonpositively curved space $Y$, e.g. a Cartan–Hadamard space or a uniformly convex Banach space. It is proved that for any $y \in Y$ and almost all $x$, there exist $A \ge 0$ and a unique geodesic ray $\gamma(t, x)$ in $Y$ starting at $y$ such that

\[ \lim_{n \to \infty} \frac{1}{n} d(\gamma(An, x), u(n, x)y) = 0. \]

In the case where $Y$ is the symmetric space $GL_N(\mathbb{R})/O_N(\mathbb{R})$ and the cocycles take values in $GL_N(\mathbb{R})$, this is equivalent to the multiplicative ergodic theorem of Oseledec. Two applications are also described. The first concerns the determination of Poisson boundaries and the second concerns Hilbert-Schmidt operators.

* Supported in part by the Göran Gustafsson Foundation.
** Supported in part by NSF Grant DMS-9800607.

1. Introduction

Let $(X, \mu)$ be a measure space with $\mu(X) = 1$ and let $L : X \to X$ be a measure preserving transformation. Birkhoff's pointwise ergodic theorem asserts that the ergodic averages of a function $f \in L^1(\mu)$,

\[ \frac{1}{n} \sum_{k=0}^{n-1} f(L^k x), \]

converge for $\mu$-a.e. $x$ to an $L$-invariant function $\bar f \in L^1(\mu)$ when $n \to \infty$. Two important extensions of this theorem are the subadditive ergodic theorem of Kingman [Ki] and the multiplicative ergodic theorem of Oseledec [O]. Both theorems have numerous applications, and since the original proofs were published several alternative proofs of these theorems have appeared.

Let us first recall Kingman's theorem. Let $a : \mathbb{N} \times X \to \mathbb{R} \cup \{-\infty\}$ be a subadditive (measurable) cocycle, that is,

\[ a(n + m, x) \le a(n, L^m x) + a(m, x) \]

for $n, m \ge 1$ and $x \in X$. Assume that

\[ \int_X a^+(1, x)\, d\mu(x) < \infty, \]

where $a^+(1, x) = \max\{0, a(1, x)\}$. Then the subadditive ergodic theorem asserts that there is an $L$-invariant measurable function $\bar a : X \to \mathbb{R} \cup \{-\infty\}$ such that

\[ \lim_{n \to \infty} \frac{1}{n} a(n, x) = \bar a(x) \]

for $\mu$-a.e. $x$. This result generalizes Birkhoff's theorem, because

\[ a(n, x) := \sum_{k=0}^{n-1} f(L^k x) \]

is a subadditive (in fact additive) cocycle.

The multiplicative ergodic theorem of Oseledec is an extension of Birkhoff's theorem to products of matrices. Let $A : X \to GL_N(\mathbb{R})$ be a measurable map and define the (multiplicative) cocycle $A(n, x) = A(L^{n-1} x) \cdots A(x)$. Assume that

\[ \int_X \log^+ \|A(x)\|\, d\mu(x) < \infty \quad\text{and}\quad \int_X \log^+ \|A^{-1}(x)\|\, d\mu(x) < \infty, \]

where $\log^+ a = \max\{0, \log a\}$. Then the theorem of Oseledec asserts that for $\mu$-a.e. $x$ the sequence $A(n, x)$ is Lyapunov regular, which by definition means that there is a filtration of subspaces

\[ \{0\} = V_0^x \subsetneq V_1^x \subsetneq \dots \subsetneq V_{s(x)}^x = \mathbb{R}^N \]

and numbers $\lambda_1(x) < \dots < \lambda_{s(x)}(x)$ such that for any $v \in V_i^x \setminus V_{i-1}^x$,

\[ \lim_{n \to \infty} \frac{1}{n} \log \|A(n, x)v\| = \lambda_i(x) \]

and

\[ \lim_{n \to \infty} \frac{1}{n} \log |\det A(n, x)| = \sum_{i=1}^{s(x)} \lambda_i(x) \left(\dim V_i^x - \dim V_{i-1}^x\right). \]

Let $W_i^x$ be the orthogonal complement of $V_{i-1}^x$ in $V_i^x$ and define a positive definite matrix $\Lambda(x)$ by requiring that $\Lambda(x)w = e^{\lambda_i(x)} w$ for any $w \in W_i^x$, $1 \le i \le s(x)$. The content of this theorem is that, in a certain sense, $A(n, x)$ behaves asymptotically like the iterates $\Lambda(x)^n$. The Lyapunov regularity is also easily seen to be equivalent to the statement that there exists a positive definite symmetric matrix $\Lambda = \Lambda(x)$ such that

\[ \frac{1}{n} \log \|A_n \Lambda^{-n}\| \to 0 \quad\text{and}\quad \frac{1}{n} \log \|\Lambda^n A_n^{-1}\| \to 0, \tag{1.1} \]

where $A_n$ denotes $A(n, x)$. Consider the symmetric space $Y = GL_N(\mathbb{R})/O_N(\mathbb{R})$ and let $y = O_N(\mathbb{R})$. Let $g$ be an element in $GL_N(\mathbb{R})$ and let $\mu_i$ denote the eigenvalues of $(g g^t)^{1/2}$. The distance in $Y$ between $y$ and $gy$ is

\[ d(y, gy) = \left( \sum_{i=1}^{N} (\log \mu_i)^2 \right)^{1/2}. \tag{1.2} \]

Recall also that geodesics starting at $y$ are of the form $\gamma(t) = e^{tH} y$, where $H$ is a symmetric matrix. Let $\Lambda = e^H$ be some positive definite symmetric matrix. From (1.2) it follows that

\[ \frac{1}{n} d(\Lambda^{-n} y, A_n^{-1} y) \to 0 \tag{1.3} \]

is equivalent to (1.1). Hence the Lyapunov regularity of $A(n, x)$ is equivalent to the geometric statement (1.3). For a discussion of this, see [Ka2]. In that paper, Kaimanovich obtained a complete geometric description of sequences $\{y_n\}$ of points in $Y$ for which there are a geodesic ray $\gamma$ and $A \ge 0$ such that the distance from $y_n$ to $\gamma(An)$ grows sublinearly in $n$. This was done by taking advantage of the special structure of symmetric spaces of noncompact type, and using hyperbolic geometry. After that, applying the subadditive ergodic theorem, he could deduce (1.3).

The present paper studies the more general situation where the cocycles take values in a semigroup of semicontractions (e.g. isometries) of a uniformly convex, nonpositively curved in the sense of Busemann, complete metric space $(Y, d)$. For definitions and examples, we refer to Sect. 3. Note that the asymptotics of the iteration of one single semicontraction $\varphi : D \to D \subset Y$ is already nontrivial. For example, the case where $D$ is a convex subset of a Hilbert space was studied by Pazy [P]. See also [Be] for this topic, which goes back to the work of Denjoy and Wolff on the iteration of an analytic map of the unit disk into itself.

In several proofs of Oseledec's theorem, the use of ergodic theory is reduced to the application of a standard theorem, that of Birkhoff or Kingman. In contrast, this reduction seems impossible to do for the proof of the multiplicative ergodic theorem given in this paper. Instead, we establish a different kind of “maximal ergodic inequality”, Lemma 4.1. The arguments in the ergodic theoretic part of this paper are in the same spirit as those commonly used to establish the subadditive ergodic theorem. Note that, in the ergodic case, this theorem is here deduced as Corollary 4.3 of Proposition 4.2.

The paper is organized as follows. The section following this introduction, Sect. 2, provides a concise formulation of the main result. All the terminology used is explained in Sect. 3, which also contains one additional observation, Lemma 3.1. Section 4 proves the needed ergodic lemmas about subadditive cocycles (Proposition 4.2 and Corollary 4.3).


Section 5 gives the proof of the theorem. The final sections, Sects. 6 and 7, describe two applications.

2. Formulation of the Main Result

Let $(Y, d)$ be a uniformly convex, complete metric space satisfying the Busemann nonpositive curvature condition. Examples include CAT(0)-spaces and uniformly convex Banach spaces. Let $S$ be a semigroup of semicontractions $D \to D$, where $D$ is a nonempty subset of $Y$, and fix a point $y \in D$. Furthermore, let $(X, \mu)$ be a measure space with $\mu(X) = 1$ and let $L : X \to X$ be an ergodic and measure preserving transformation. Given a measurable map $w : X \to S$, put

\[ u(n, x) = w(x) w(Lx) \cdots w(L^{n-1} x) \tag{2.1} \]

and denote $u(n, x)y$ by $y_n(x)$. Assume that

\[ \int_X d(y, w(x)y)\, d\mu(x) < \infty; \tag{2.2} \]

then the following “multiplicative ergodic theorem” holds.

Theorem 2.1. For almost every $x$, the following limit exists:

\[ \lim_{n \to \infty} \frac{1}{n} d(y, y_n(x)) = A, \tag{2.3} \]

and if $A > 0$, then for almost every $x$, there exists a unique geodesic ray $\gamma(\cdot, x)$ in $Y$ starting at $y$ such that

\[ \lim_{n \to \infty} \frac{1}{n} d(\gamma(An, x), y_n(x)) = 0. \tag{2.4} \]
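To see what Theorem 2.1 says in the simplest possible setting, here is a small numerical sketch. Everything in it is an illustrative assumption rather than part of the paper: $Y$ is the Euclidean plane, $X$ is the circle $\mathbb{R}/\mathbb{Z}$ with the irrational rotation $L$, and $w(x)$ is the translation by the vector $(1 + \cos 2\pi x, \sin 2\pi x)$. Translations are isometries, the mean translation vector is $(1, 0)$, so one expects $A = 1$ and the ray $\gamma(t) = (t, 0)$.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # irrational rotation number (hypothetical choice)


def shift(x):
    """The transformation L: rotation of the circle R/Z by ALPHA."""
    return (x + ALPHA) % 1.0


def step(x):
    """Translation vector of the isometry w(x) of the Euclidean plane."""
    return (1.0 + math.cos(2 * math.pi * x), math.sin(2 * math.pi * x))


def orbit_point(n, x, y=(0.0, 0.0)):
    """y_n(x) = w(x) w(Lx) ... w(L^{n-1} x) y; translations commute, so we just add."""
    px, py = y
    for _ in range(n):
        vx, vy = step(x)
        px, py = px + vx, py + vy
        x = shift(x)
    return px, py


def check(n, x):
    """Return (d(y, y_n)/n, d(gamma(A n), y_n)/n) for A = 1 and gamma(t) = (t, 0)."""
    px, py = orbit_point(n, x)
    drift = math.hypot(px, py) / n
    error = math.hypot(px - 1.0 * n, py) / n
    return drift, error


print(check(10_000, 0.1))
```

In this commutative example the order of composition in (2.1) is irrelevant and the statement reduces to Birkhoff's theorem for the vector-valued function `step`; the content of the theorem lies in the noncommutative, nonlinear case.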

Remark 2.2. The existence of the limit (2.3) is well known. It is a standard consequence of the subadditive ergodic theorem, here Corollary 4.3. In the case $A > 0$, note that (2.4) implies that $y_n$ converges to $[\gamma]$ in $Y \cup Y(\infty)$, where $Y(\infty)$ denotes the ideal boundary at infinity consisting of asymptote classes of rays.

Remark 2.3. Assume that $S = \Gamma$ is a discrete cocompact group of isometries of a Cartan–Hadamard manifold $Y$. Let $P$ be the (time 1) Markov operator associated to a $\Gamma$-invariant Markov process on $Y$, with finite first moment and absolutely continuous transition probabilities. Take a $P$-stationary initial distribution on $Y$; then it is not difficult to construct a measure preserving system $(X, \mu, L)$ and a map $w : X \to \Gamma$, such that $u(n, x)y$ and the corresponding sample path at time $n$ stay within a finite distance from each other for all $n$. The theorem then yields the result that for almost every sample path there is a geodesic ray such that the distance from the sample path to this geodesic grows sublinearly in $n$. In this context, we refer to Ballmann's paper [Ba1] for comparison.

Remark 2.4. There is also an “invertible” version of Oseledec's theorem, see [O], in which one gets the approximation by the powers of the same matrix at both $+\infty$ and $-\infty$ (the cocycle in question for negative $n$ is $A(n, x) = A(1, L^{n}x)^{-1} \cdots A(1, L^{-1}x)^{-1}$). In view of this result, one might wonder whether the analogous statement for $u(n, x)$ is true in general, that is, is it true that there always exists a bi-infinite geodesic approximating both the backward and the forward orbit $u(n, x)y$ in the sense of Theorem 2.1? In general, however, the answer to this question is no. For example, let $Y$ be the manifold $\mathbb{R} \times \mathbb{R}$ with Riemannian metric $(e^{-y} + C)^2 dx^2 + dy^2$. By some general results of Bishop and O'Neill concerning so-called warped products, the space $Y$ is a Cartan–Hadamard manifold. Consider for $u(n, x)$ the powers of the parabolic isometry $\phi$ defined by $(x, y) \mapsto (x + 1, y)$. Note that in this case the constant $A$ in the theorem will equal $C$. If $C > 0$, then the forward and the backward orbit will converge to two different points on the ideal boundary of $Y$. These two limit points must be fixed by $\phi$. Now assume that they can be connected by a geodesic in $Y$. Then, since the two endpoints are fixed by $\phi$, the displacement of $\phi$ is semidecreasing in both directions along this geodesic, hence it is constant. This is impossible as $\phi$ is parabolic and $Y$ has no parallel bi-infinite geodesics.

3. Geometric Preliminaries

General references for this section are [Ba2] and [J].

3.1. Let $(Y, d)$ be a metric space. A continuous map $\gamma : I \to Y$, where $I$ is an interval, is called a (unit speed minimizing) geodesic, if for any $s, t \in I$, $d(\gamma(s), \gamma(t)) = |s - t|$. A geodesic $\gamma : [0, \omega) \to Y$, such that $\lim_{t \to \omega} \gamma(t)$ does not exist, is called a ray. If $(Y, d)$ is complete, then for any ray, $\omega = \infty$. A point $z$ is called a midpoint of $x$ and $y$ if

\[ d(z, x) = d(z, y) = \frac{1}{2} d(x, y). \]

A metric space $(Y, d)$ is called convex if any two points in $Y$ have a midpoint. If a convex metric space $(Y, d)$ is complete, then any two points can be joined by a geodesic. A metric space $(Y, d)$ is called uniformly convex if $(Y, d)$ is convex and there is a strictly decreasing continuous function $g$ on $[0, 1]$ with $g(0) = 1$, such that for any $x, y, w \in Y$ and midpoint $z$ of $x$ and $y$,

\[ \frac{d(z, w)}{R} \le g\left( \frac{d(x, y)}{2R} \right), \]

where $R := \max\{d(x, w), d(y, w)\}$. See Fig. 1. An immediate consequence of this property is that midpoints are unique, and hence so are geodesics between any two points. Spaces satisfying certain parallelogram inequalities, for example the $L^p$-spaces, $1 < p < \infty$, are uniformly convex; the original reference is [C]. For $L^p$, $p \ge 2$,

\[ g(\varepsilon) = (1 - \varepsilon^p)^{1/p} \]

works in the definition. Further examples are Cartan–Hadamard manifolds (e.g. Euclidean spaces, hyperbolic spaces, and symmetric spaces of noncompact type such as $GL_N(\mathbb{R})/O_N(\mathbb{R})$), or more generally CAT(0)-spaces (e.g. Euclidean buildings and $\mathbb{R}$-trees). For a general CAT(0)-space, $g$ is as above with $p = 2$ and for trees one can also take $p = 1$. A Banach space is CAT(0) if and only if it is a Hilbert space.

Fig. 1. The distance $d(z, w)$ is less than the maximum of $d(x, w)$ and $d(y, w)$

A convex metric space $(Y, d)$ is said to be nonpositively curved in the sense of Busemann if for any $x, y, z \in Y$ and any midpoints $m_{xz}$ of $x$ and $z$, and $m_{yz}$ of $y$ and $z$,

\[ d(m_{xz}, m_{yz}) \le \frac{1}{2} d(x, y). \tag{3.1} \]

Any uniformly convex Banach space, or more generally any strictly convex Banach space, as well as any CAT(0)-space satisfies Busemann's nonpositive curvature condition.
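The two conditions of this subsection can be tested numerically in the most familiar example. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $Y$ to be the Euclidean plane and uses the modulus $g(\varepsilon) = (1 - \varepsilon^p)^{1/p}$ with $p = 2$, and the helper names are hypothetical. In this setting the parallelogram law gives $d(z, w)^2 \le R^2 - d(x, y)^2/4$, which is exactly the uniform convexity inequality, and the Busemann inequality (3.1) holds with equality.

```python
import math


def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def g(eps, p=2):
    """Modulus g(eps) = (1 - eps**p)**(1/p); clamped at 0 to absorb rounding."""
    return max(0.0, 1 - eps ** p) ** (1 / p)


def uniform_convexity_gap(x, y, w):
    """g(d(x,y)/2R) - d(z,w)/R for the midpoint z of x and y (assumes R > 0)."""
    z = midpoint(x, y)
    R = max(dist(x, w), dist(y, w))
    return g(dist(x, y) / (2 * R)) - dist(z, w) / R


def busemann_gap(x, y, z):
    """d(x,y)/2 - d(m_xz, m_yz); nonnegative when (3.1) holds."""
    return dist(x, y) / 2 - dist(midpoint(x, z), midpoint(y, z))


print(uniform_convexity_gap((0, 0), (4, 0), (1, 3)))
print(busemann_gap((0, 0), (4, 0), (1, 3)))
```

Both gaps are nonnegative up to floating-point rounding; in a genuinely curved space such as the hyperbolic plane the Busemann gap would typically be strictly positive.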

3.2. From now on, let $(Y, d)$ be a uniformly convex, Busemann nonpositively curved, complete metric space. It follows from the Busemann condition (3.1) that $t \mapsto d(\gamma_1(t), \gamma_2(t))$ is a convex function for any two geodesics $\gamma_1$ and $\gamma_2$. In particular, for two rays $\gamma_1$ and $\gamma_2$ starting at $y$ the function

\[ t \mapsto \frac{1}{t} d(\gamma_1(t), \gamma_2(t)) \tag{3.2} \]

is semiincreasing. Let $\gamma_i$ be any sequence of rays starting at $y$ and assume that $\{\gamma_i(R)\}_{i=1}^{\infty}$ is a Cauchy sequence for every $R$. By the completeness of $(Y, d)$, we can for each $R$ define $\gamma(R) = \lim \gamma_i(R)$. It is then immediate that $\gamma$ is a ray starting at $y$ and we say that $\gamma_i$ converges to $\gamma$.

Lemma 3.1. Let $x, y, z \in Y$ and assume that

\[ d(y, x) + d(x, z) \le d(y, z) + \delta d(y, x), \tag{3.3} \]

where $\delta \in [0, 1]$. Let $w$ be the point on the geodesic between $y$ and $z$ such that $d(y, w) = d(y, x)$; then $d(w, x) \le f(\delta) d(y, x)$, where $f$ is a function such that $f(s) \to 0$ as $s \to 0$. See Fig. 2.


Fig. 2. (The points $y$, $x$, $w$ and $z$ of Lemma 3.1)

Proof. Let $m$ be the midpoint of $w$ and $x$. Uniform convexity implies that $d(m, z) \le \max\{d(w, z), d(x, z)\}$. Since $d(w, z) = d(y, z) - d(y, x)$ by the definition of $w$ and $d(x, z) \le d(y, z) - d(y, x) + \delta d(y, x)$ by the inequality (3.3), we have that $d(m, z) \le d(y, z) - d(y, x) + \delta d(y, x)$. Hence it follows, by the triangle inequality, that

\[ d(y, m) \ge d(y, x) - \delta d(y, x) = (1 - \delta)R, \tag{3.4} \]

where $R := d(y, x) = \max\{d(y, x), d(y, w)\}$. Uniform convexity now gives us that

\[ \frac{d(m, y)}{R} \le g\left( \frac{d(w, x)}{2R} \right). \]

From the inequality (3.4) and since $g$ is decreasing we get

\[ g^{-1}(1 - \delta) \ge \frac{d(w, x)}{2R}. \]

Recalling that $R = d(y, x)$ and letting $f(\delta) = 2g^{-1}(1 - \delta)$, we have now obtained the desired conclusion. □

3.3. A semicontraction or nonexpanding map is a map $\varphi : D \to D$, where $D$ is a subset of $Y$, such that $d(\varphi(y), \varphi(z)) \le d(y, z)$ for all $y, z \in D$. Any semigroup $S$ of semicontractions is equipped with the Borel $\sigma$-algebra associated to the compact-open topology on $S$.
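The bound of Lemma 3.1 becomes fully explicit once a modulus $g$ is fixed. As a hedged illustration (assuming, beyond what the text states, that $Y$ is the Euclidean plane with $g(\varepsilon) = \sqrt{1 - \varepsilon^2}$, so that $g^{-1}(t) = \sqrt{1 - t^2}$ and $f(\delta) = 2\sqrt{1 - (1 - \delta)^2}$; the function names are hypothetical), the following sketch computes both sides of the conclusion for a given triple of points, using the smallest $\delta$ for which (3.3) holds.

```python
import math


def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def f(delta):
    """f(delta) = 2 * g^{-1}(1 - delta) for the assumed modulus g(eps) = sqrt(1 - eps**2)."""
    return 2 * math.sqrt(max(0.0, 1 - (1 - delta) ** 2))


def lemma_3_1_check(x, y, z):
    """Return (d(w, x), f(delta) * d(y, x)) for the smallest delta satisfying (3.3)."""
    dyx, dxz, dyz = dist(y, x), dist(x, z), dist(y, z)
    delta = max(0.0, (dyx + dxz - dyz) / dyx)  # clamp tiny negative rounding errors
    if delta > 1 or dyx > dyz:
        raise ValueError("hypotheses of the lemma are not met")
    t = dyx / dyz  # w lies on the segment from y to z at distance d(y, x) from y
    w = (y[0] + t * (z[0] - y[0]), y[1] + t * (z[1] - y[1]))
    return dist(w, x), f(delta) * dyx


print(lemma_3_1_check((3.0, 1.0), (0.0, 0.0), (10.0, 0.0)))
```

The closer $x$ is to the segment from $y$ to $z$, the smaller $\delta$ becomes, and the bound forces $x$ to be close to the point $w$ on that segment; this is how the lemma is used in Sect. 5.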


4. Ergodic Theoretic Part

Let $(X, \mu)$ be a measure space with $\mu(X) = 1$ and let $L : X \to X$ be a measure preserving transformation. Furthermore, let $a : \mathbb{N} \times X \to \mathbb{R}$ be a subadditive (measurable) cocycle, that is

\[ a(n + m, x) \le a(n, L^m x) + a(m, x) \tag{4.1} \]

for $n, m \in \mathbb{N}$, $x \in X$ (adopting the convention that $a(0, x) \equiv 0$). We will assume that the following integrability condition is satisfied:

\[ \int_X a^+(1, x)\, d\mu(x) < \infty, \tag{4.2} \]

where $a^+(1, x) = \max\{0, a(1, x)\}$. For each $n$, let

\[ a_n = \int_X a(n, x)\, d\mu(x). \tag{4.3} \]

It follows from (4.1) and (4.2) that $a_n \le n a_1 < \infty$, but it is possible that $a_n = -\infty$. Since $L$ preserves $\mu$, the subadditivity condition (4.1) implies that $a_{n+m} \le a_n + a_m$ for every $n, m \in \mathbb{N}$. It is now an elementary fact, see for example [Kr, p. 36], that the limit

\[ A := \lim_{n \to \infty} \frac{1}{n} a_n \]

exists and $A < \infty$.

Recall also the following observation of F. Riesz, which is proved by a simple induction, see [Bi, p. 27]. Let $c_1, c_2, \dots, c_n$ be a finite sequence of real numbers. Call $c_u$ a leader if at least one of the sums $c_u$, $c_u + c_{u+1}$, ..., $c_u + \dots + c_n$ is negative. Then the sum of the leaders is $\le 0$. (An empty sum is 0.)

Lemma 4.1. Suppose that $A > 0$. Let $E_1$ be the set of $x$ in $X$ with the property that there are infinitely many $n$ such that $a(n, x) - a(n - k, L^k x) \ge 0$ for all $k$, $1 \le k \le n$. Then $\mu(E_1) > 0$.

Proof. For every $i \in \mathbb{N}^+$ let us define a set

\[ \Psi_i = \{x \in X \mid \exists k : 1 \le k \le i \text{ and } a(i, x) - a(i - k, L^k x) < 0\} \]

and a function $b_i(x) = a(i, x) - a(i - 1, Lx)$. It is clear that

\[ a(n, x) - a(n - k, L^k x) = b_n(x) + b_{n-1}(Lx) + \dots + b_{n-k+1}(L^{k-1} x) \tag{4.4} \]

and in particular

\[ a(n, x) = b_n(x) + b_{n-1}(Lx) + \dots + b_1(L^{n-1} x). \tag{4.5} \]
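The observation of F. Riesz recalled above is easy to test by machine. The sketch below is only an illustration (the function names `leaders` and `sum_of_leaders` are hypothetical, and indices are 0-based rather than 1-based as in the text): it finds the leaders of a finite sequence directly from the definition and verifies exhaustively, on short integer sequences, that their sum is never positive.

```python
from itertools import product


def leaders(c):
    """Indices u such that some partial sum c[u] + ... + c[v] with v >= u is negative."""
    result = []
    for u in range(len(c)):
        partial = 0
        for v in range(u, len(c)):
            partial += c[v]
            if partial < 0:
                result.append(u)
                break
    return result


def sum_of_leaders(c):
    """Sum of the leaders of c (an empty sum is 0)."""
    return sum(c[u] for u in leaders(c))


# Exhaustive check on all integer sequences of length <= 5 with entries in {-2, ..., 2}.
for n in range(6):
    for c in product(range(-2, 3), repeat=n):
        assert sum_of_leaders(c) <= 0
```

For instance, in the sequence 1, -2, 3, -1, -1 every term except the 3 is a leader. In the proof, the lemma is applied to the terms $c_u = b_{n-u}(L^u x)$ of the telescoping sum (4.5).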

In view of (4.4), if $L^k x \in \Psi_{n-k}$ then $b_{n-k}(L^k x) + \dots + b_{n-j}(L^j x) < 0$ for some $j$, $k \le j \le n - 1$. From this and F. Riesz's lemma about leaders (with $c_u := b_{n-u}(L^u x)$) we deduce that for every $x \in X$ and $n \in \mathbb{N}^+$,

\[ \sum_{0 \le k \le n-1,\ L^k x \in \Psi_{n-k}} b_{n-k}(L^k x) \le 0. \tag{4.6} \]

Using the $L$-invariance of $\mu$, we get from the inequality (4.6) that

\begin{align*}
\sum_{j=1}^{n} \int_{\Psi_j} b_j(x)\, d\mu(x) &= \sum_{0 \le k \le n-1} \int_{\Psi_{n-k}} b_{n-k}(x)\, d\mu(x) \\
&= \sum_{0 \le k \le n-1} \int_{L^{-k}\Psi_{n-k}} b_{n-k}(L^k x)\, d\mu(x) \\
&= \int_X \sum_{0 \le k \le n-1,\ L^k x \in \Psi_{n-k}} b_{n-k}(L^k x)\, d\mu(x) \le 0. \tag{4.7}
\end{align*}

On the other hand, in view of (4.3), (4.5), and the $L$-invariance of $\mu$,

\[ a_n = \int_X a(n, x)\, d\mu(x) = \sum_{j=1}^{n} \int_X b_j(x)\, d\mu(x). \tag{4.8} \]

Since $\lim a_n/n = A > 0$, there exists a number $N$ such that

\[ a_n > \frac{2A}{3} n \tag{4.9} \]

for all $n > N$. Let $\Psi_n^c$ denote the complement of $\Psi_n$ in $X$. Then in view of (4.7), (4.8), (4.9), and the inequality $b_i(x) = a(i, x) - a(i - 1, Lx) \le a(1, x) \le a^+(1, x)$, we have that

\[ \sum_{j=1}^{n} \int_{\Psi_j^c} a^+(1, x)\, d\mu(x) \ge \sum_{j=1}^{n} \int_{\Psi_j^c} b_j(x)\, d\mu(x) > \frac{2A}{3} n \tag{4.10} \]

for all $n > N$. Let $f_n = \sum_{j=1}^{n} \chi_{\Psi_j^c}$, where $\chi_C$ denotes the characteristic function of a set $C \subset X$. Let $a_1^+ = \int_X a^+(1, x)\, d\mu(x)$ and

\[ B_n = \left\{ x \in X : n \ge f_n(x) > \frac{A}{3a_1^+} n \right\}. \]

Since

\[ B_n^c = \left\{ x \in X : \frac{A}{3a_1^+} n \ge f_n(x) \ge 0 \right\}, \]

we have that

\begin{align*}
\sum_{j=1}^{n} \int_{\Psi_j^c} a^+(1, x)\, d\mu(x) &= \int_X f_n(x) a^+(1, x)\, d\mu(x) \\
&= \int_{B_n} f_n(x) a^+(1, x)\, d\mu(x) + \int_{B_n^c} f_n(x) a^+(1, x)\, d\mu(x) \\
&\le n \int_{B_n} a^+(1, x)\, d\mu(x) + \frac{A}{3a_1^+} n \int_{B_n^c} a^+(1, x)\, d\mu(x) \\
&\le n \int_{B_n} a^+(1, x)\, d\mu(x) + \frac{A}{3} n.
\end{align*}

Combining this inequality and the inequality (4.10) we get that

\[ \int_{B_n} a^+(1, x)\, d\mu(x) > \frac{A}{3} \tag{4.11} \]

for all $n > N$. The condition (4.2) implies the existence of $\delta > 0$ such that

\[ \int_C a^+(1, x)\, d\mu(x) < \frac{A}{3} \]

whenever $\mu(C) < \delta$. Hence it follows from (4.11) that $\mu(B_n) \ge \delta$ for every $n > N$. Let

\[ C_n = \left\{ x \in X : x \in \Psi_i^c \text{ for at least } \frac{A}{3a_1^+} n \text{ positive integers } i \right\}, \]

so $B_n \subset C_n$ and $C_{n+1} \subset C_n$. Therefore, the measure of the set

\[ \bigcap_{n \ge 1} C_n = \{x \in X : x \in \Psi_i^c \text{ for infinitely many } i\} \]

is greater than or equal to $\delta > 0$. Now recalling the definition of $\Psi_i$ we get the desired statement. □

Proposition 4.2. Suppose that $L$ is ergodic and $A > -\infty$. For any $\varepsilon > 0$, let $E_\varepsilon$ be the set of $x$ in $X$ for which there exist an integer $K = K(x)$ and infinitely many $n$ such that

\[ a(n, x) - a(n - k, L^k x) \ge (A - \varepsilon)k \]

for all $k$, $K \le k \le n$. Let $E = \bigcap_{\varepsilon > 0} E_\varepsilon$; then $\mu(E) = 1$.

Proof. For any $\varepsilon > 0$, let $c(n, x) = a(n, x) - (A - \varepsilon)n$. Then $c$ is a subadditive cocycle and by the definition of $A$,

\[ \lim_{n \to \infty} \frac{1}{n} \int_X c(n, x)\, d\mu = A - (A - \varepsilon) = \varepsilon > 0. \]

Note also that

\[ a(n, x) - a(n - k, L^k x) \ge (A - \varepsilon)k \]

is equivalent to

\[ c(n, x) - c(n - k, L^k x) \ge 0. \]

Hence Lemma 4.1 applied to $c$ gives that $\mu(E_\varepsilon) > 0$. By the subadditivity property (4.1),

\[ a(n, L^l x) - a(n - k, L^{k+l} x) \ge a(n + l, x) - a((n + l) - (k + l), L^{k+l} x) - a(l, x). \]

It follows that $L^l E_\varepsilon \subset E_{2\varepsilon}$ for all $l \ge 0$ and by ergodicity we then get that $\mu(E_{2\varepsilon}) = 1$. Since this holds for every $\varepsilon > 0$ and $E_\varepsilon \subset E_{\varepsilon'}$ whenever $\varepsilon < \varepsilon'$, we have that $\mu(E) = 1$. □

Corollary 4.3 (Kingman). Suppose that $L$ is ergodic and $A > -\infty$. Then

\[ \lim_{n \to \infty} \frac{1}{n} a(n, x) = A \]

for almost every $x$.

Proof. Note that, by subadditivity, Proposition 4.2 implies that the set of $x$ such that

\[ \liminf_{k \to \infty} \frac{1}{k} a(k, x) \ge A - \varepsilon \]

for any $\varepsilon > 0$ has full measure. If $a(n, x)$ is an additive cocycle, then the a.e. convergence is immediate, since in this case the above proposition can also be applied to $-a(k, x)$. In the case of a general subadditive cocycle $a(n, x)$, we can therefore subtract the additive cocycle

\[ \sum_{i=0}^{n-1} a(1, L^i x) \]

from $a(n, x)$. This reduces the general case to the case of a nonpositive subadditive cocycle, that is $a(n, x) \le 0$. Fix an $\varepsilon > 0$ and take $M$ such that

\[ \frac{1}{M} \int_X a(M, x)\, d\mu(x) \le A + \varepsilon \tag{4.12} \]

and let

\[ a^M(n, x) = a(nM, x) - \sum_{i=0}^{n-1} a(M, L^{iM} x). \]

This $a^M(n, x)$ is again a nonpositive subadditive cocycle. From the proposition and the inequality (4.12), we have that

\[ 0 \ge \liminf_{n \to \infty} \frac{1}{nM} a^M(n, x) \ge -\varepsilon. \]

From this inequality, the nonpositivity and subadditivity of $a(n, x)$, the $L$-invariance and the convergence for additive cocycles, it follows that

\begin{align*}
\limsup_{n \to \infty} \frac{1}{n} a(n, x) - \liminf_{n \to \infty} \frac{1}{n} a(n, x) &= \limsup_{n \to \infty} \frac{1}{nM} a(nM, x) - \liminf_{n \to \infty} \frac{1}{nM} a(nM, x) \\
&= \limsup_{n \to \infty} \frac{1}{nM} a^M(n, x) - \liminf_{n \to \infty} \frac{1}{nM} a^M(n, x) \\
&\le -\liminf_{n \to \infty} \frac{1}{nM} a^M(n, x) \le \varepsilon.
\end{align*}

Since this holds for any $\varepsilon > 0$, the corollary is established. For more details, consult [Kr, p. 37]. □

5. Proof of the Theorem

5.1. Here we adopt the notations in Sect. 2 and we let $a(n, x) = d(y, y_n(x))$. By the triangle inequality, the equality (2.1), and the semicontraction property,

\begin{align*}
d(y, y_{n+m}(x)) &\le d(y, y_m(x)) + d(y_m(x), y_{n+m}(x)) \\
&= a(m, x) + d(u(m, x)y, u(m, x)u(n, L^m x)y) \\
&\le a(m, x) + a(n, L^m x),
\end{align*}

hence $a$ is a subadditive cocycle. Furthermore, by the assumption (2.2),

\[ \int_X a^+(1, x)\, d\mu(x) = \int_X d(y, w(x)y)\, d\mu(x) < \infty, \]

which means that the basic integrability condition (4.2) of the cocycle $a$ is satisfied. Corollary 4.3 (the subadditive ergodic theorem) then implies that

\[ \lim_{n \to \infty} \frac{1}{n} d(y, y_n(x)) = A \ge 0 \tag{5.1} \]

for almost every $x \in X$.

5.2. Assume now that $A > 0$. Let $E$ be the set defined as in Proposition 4.2 and consider an $x \in E$ such that (5.1) holds. From now on, $x$ will frequently be suppressed in the notation. For each $i > 0$, pick $\varepsilon_i$ so small that $f(\delta_i) \le 2^{-i}$, where $\delta_i := 2\varepsilon_i/(A - \varepsilon_i)$ and $f$ is the function appearing in the geometric lemma (Lemma 3.1). This is possible, since $f(t) \to 0$ as $t \to 0$. Proposition 4.2 and Corollary 4.3 give us that there are for any $i$ an integer $K_i$ and infinitely many $n$ such that

\[ a(n, x) - a(n - k, L^k x) \ge (A - \varepsilon_i)k \tag{5.2} \]

and

\[ (A - \varepsilon_i)k \le a(k, x) \le (A + \varepsilon_i)k \tag{5.3} \]

for all $k$, $K_i \le k \le n$. For each $i$, pick an integer $n_i$ greater than both $n_{i-1}$ and $K_{i+1}$, such that (5.2) and (5.3) hold. By adding the inequality (5.2) to the right inequality in (5.3), we get that for all $k$, $K_i \le k \le n_i$,

\[ a(n_i, x) - a(n_i - k, L^k x) + (A + \varepsilon_i)k \ge (A - \varepsilon_i)k + a(k, x), \]

which simplified becomes

\[ a(k, x) + a(n_i - k, L^k x) \le a(n_i, x) + 2\varepsilon_i k. \]

From this, recalling the definition of $a$, the semicontractivity of $u(k, x)$, and the left inequality in (5.3), we get (note that at this point the order in which the maps $w(L^k x)$ are composed to form $u(n, x)$ is crucial)

\begin{align*}
d(y, y_k) + d(y_k, y_{n_i}) &\le d(y, y_{n_i}) + 2\varepsilon_i k \\
&\le d(y, y_{n_i}) + \frac{2\varepsilon_i}{A - \varepsilon_i} d(y, y_k). \tag{5.4}
\end{align*}

For each $i$, let $\gamma_i$ be a ray from $y$ passing through $y_{n_i}$ and let $r_k = d(y, y_k)$. Applying the geometric lemma (Lemma 3.1) to (5.4), we get that

\[ d(\gamma_i(r_k), y_k) \le f(\delta_i) r_k \tag{5.5} \]

for all $k$, $K_i \le k \le n_i$.

5.3. We now show that $\{\gamma_i(R)\}$ is a Cauchy sequence for every $R > 0$. Fix $R > 0$. Since $K_{i+1} < n_i < n_{i+1}$, the inequality (5.5) implies that

\[ d(\gamma_{i+1}(r_{n_i}), \gamma_i(r_{n_i})) = d(\gamma_{i+1}(r_{n_i}), y_{n_i}) \le f(\delta_{i+1}) r_{n_i}. \]

For $i$ large enough so that $r_{n_i} > R$, the convexity property (3.2) implies that

\[ d(\gamma_{i+1}(R), \gamma_i(R)) \le f(\delta_{i+1})R, \]

which means, using the triangle inequality, that

\[ d(\gamma_{i+m}(R), \gamma_i(R)) \le \sum_{j=1}^{m} f(\delta_{i+j})R \le 2^{-i} R \]

for all $m > 0$. Hence $\{\gamma_i(R)\}$ is a Cauchy sequence and by the completeness of $Y$, $\gamma_i$ converges to some ray $\gamma$, as $i \to \infty$.


5.4. It remains to show that 1 d(γ (Ak), yk ) = 0. k→∞ k lim

For any k there is an i such that Ki ≤ k ≤ ni and by the triangle inequality d(γ (Ak), yk ) ≤ d(γ (Ak), γi (Ak)) + d(γi (Ak), γi (rk )) + d(γi (rk ), yk ) ≤ 2−i Ak + |Ak − rk | + f (δi )rk ≤ 2−i Ak + εi k + f (δi )(A + εi )k ≤ (2−i+1 A + 2εi )k.

It is then clear that 1 lim sup d(γ (Ak), yk ) ≤ 0, k→∞ k which shows (2.4). The uniqueness of γ is immediate from the convexity property (3.2) and so Theorem 2.1 is proved. 6. An Application to Random Walks and Boundary Theory General references for this section are [F] and [Ka3]. Let 0 be a countable group acting by isometries on a uniformly convex, Busemann nonpositively curved, complete metric space (Y, d). Any isometry of Y also acts on the ideal boundary at infinity Y (∞), which consists of asymptote classes of geodesic rays. Let ν be a probability measure on 0 and assume throughout this section that ν has finite first moment, that is X d(y, gy)ν(g) < ∞. g∈0

Let $(X, \mu)$ be the product of $\mathbb{Z}$ copies of $(\Gamma, \nu)$ and let $L$ be the shift transformation. This is a standard construction of an ergodic measure preserving system with $\mu(X) = 1$. Let $w : X \to \Gamma$ be the projection onto the $0$th copy of $\Gamma$, so $w(\{g(i)\}_{i=-\infty}^{\infty}) = g(0)$, and put as usual $u(n, x) = w(x)w(Lx)\cdots w(L^{n-1}x)$. This is sometimes called the right random walk determined by $\nu$. Note that, in probabilistic language, the increments $\{w \circ L^k\}_{k=1}^{\infty}$ are independent, identically distributed random variables. Theorem 2.1 (in the case $A > 0$) now provides a measurable map $\xi : X \to Y(\infty)$, where $\xi(x) = [\gamma(x, \cdot)]$. Since $u(n, Lx) = w(x)^{-1}u(n+1, x)$, the map $\xi$ clearly has the following equivariance property: $\xi(Lx) = w(x)^{-1}\xi(x)$. It follows that $(Y(\infty), \xi_*(\mu))$ is a $\nu$-boundary for $\Gamma$. In the case $A = 0$, we set $\xi_*(X, \mu)$ to be the trivial $\nu$-boundary for $\Gamma$. Recall Kaimanovich's ray approximation criterion for the maximality of a boundary [Ka1, Theorem 3].
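The shift relation behind the equivariance of $\xi$ can be checked mechanically. The sketch below is purely illustrative: it takes $\Gamma = SL(2,\mathbb{Z})$ with a hypothetical measure supported uniformly on four elementary matrices, represents a point $x$ by a finite one-sided window $g(0), g(1), \dots$ of its coordinates, and verifies $u(n, Lx) = w(x)^{-1}u(n+1, x)$ in exact integer arithmetic. All function names are assumptions made for this example.

```python
import numpy as np

# Four elements of SL(2, Z); a hypothetical support for the measure nu.
GENERATORS = [
    np.array([[1, 1], [0, 1]], dtype=np.int64),
    np.array([[1, 0], [1, 1]], dtype=np.int64),
    np.array([[1, -1], [0, 1]], dtype=np.int64),
    np.array([[1, 0], [-1, 1]], dtype=np.int64),
]


def sample_path(length, seed=0):
    """Return the coordinates g(0), ..., g(length-1) of a sample point x."""
    rng = np.random.default_rng(seed)
    return [GENERATORS[i] for i in rng.integers(0, len(GENERATORS), size=length)]


def shift(x):
    """The shift L: drop the 0th coordinate."""
    return x[1:]


def w(x):
    """Projection onto the 0th coordinate."""
    return x[0]


def u(n, x):
    """Right random walk u(n, x) = w(x) w(Lx) ... w(L^{n-1} x)."""
    prod = np.eye(2, dtype=np.int64)
    for k in range(n):
        prod = prod @ x[k]
    return prod


def inverse_sl2(g):
    """Exact inverse of a 2x2 integer matrix of determinant 1."""
    (a, b), (c, d) = g
    return np.array([[d, -b], [-c, a]], dtype=np.int64)


x = sample_path(10)
n = 5
lhs = u(n, shift(x))                    # u(n, Lx)
rhs = inverse_sl2(w(x)) @ u(n + 1, x)   # w(x)^{-1} u(n+1, x)
print(np.array_equal(lhs, rhs))
```

Integer matrices are used so that the comparison is exact and no floating-point tolerance is needed.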

Theorem 6.1 (Kaimanovich). Let $(B, \lambda)$ be a $\nu$-boundary of $\Gamma$ and assume that $\nu$ has finite entropy $H(\nu) = -\sum \nu(g)\log\nu(g)$. Suppose that $\theta : \Gamma \to Z$ is a mapping into a metric space with metric $d$, and $\pi_n : B \to Z$ is a family of measurable mappings, and that there is a constant $C > 0$ such that

$$\operatorname{card}\{g \in \Gamma : d(z, \theta(g)) \le N\} \le e^{CN} \tag{6.1}$$

for all $z \in Z$ and $N \ge 1$. Let $b(x)$ denote the image in $B$ of the sample path $\{u(n, x)\}$. If

$$\lim_{n\to\infty} \frac{1}{n}\, d(\pi_n(b(x)), \theta(u(n, x))) = 0$$

for almost every $x$, then the Poisson boundary of $(\Gamma, \nu)$ is isomorphic to $(B, \lambda)$.

The following statement is now an immediate consequence.

Corollary 6.2. Let $\Gamma$ be a countable group acting on $(Y, d)$ by isometries and let $\nu$ be a probability measure on $\Gamma$ with finite first moment. Fix a point $y \in Y$ and assume that for some $C > 0$,

$$\operatorname{card}\{g \in \Gamma : d(y, gy) \le N\} \le e^{CN} \tag{6.2}$$

for all $N \ge 1$. Then the Poisson boundary of $(\Gamma, \nu)$ is isomorphic to $\xi_*(X, \mu)$.

Proof. Set $\theta(g) = gy$, $Z = Y$, $B = \xi(X)$, $\lambda = \xi_*(\mu)$, and $\pi_n(b(x)) = \gamma(An, x)$ using the notation of Theorem 2.1. Since $\Gamma$ acts by isometries it follows that

$$\operatorname{card}\{g : d(z, gy) \le N\} \le \operatorname{card}\{g : d(y, gy) \le 2N\},$$

which ensures that condition (6.1) is satisfied. From this condition and the finiteness of the moment of $\nu$, it follows that the entropy of $\nu$ is finite, see [Ka3]. □

Remark 6.3. When the group generated by $\operatorname{supp}\nu$ is nonamenable, the Poisson boundary is non-trivial, see [F], and so in particular $A > 0$. It is also known and not hard to show that the condition (6.2) is satisfied if $\Gamma$ is a discrete subgroup of isometries of a locally compact Cartan-Hadamard manifold with sectional curvatures bounded from below.

Remark 6.4. Results on the determination of the Poisson boundary for various groups and measures have been obtained by many authors, see [Ka3]. Ballmann and Ledrappier in [BaLe] identified the Poisson boundary for cocompact lattices in rank 1 manifolds for nondegenerate measures with finite first moment and finite entropy (Kaimanovich was later able to replace the finite first moment with finite logarithmic moment, see [Ka3]). Note that their techniques are quite different from the methods in the present paper. Some of the ideas in [Ba1], [BaLe], and [Ka3] go back to Furstenberg's work.

7. An Application to Hilbert-Schmidt Operators

Let $H$ be a real Hilbert space and let $\mathcal{A}$ be the algebra of Hilbert-Schmidt operators $H \to H$, that is $a \in \mathcal{A}$ if

$$\|a\|_2^2 := \operatorname{tr}(aa^*) = \sum_i \|ae_i\|^2 < \infty$$

for some (hence any) orthonormal basis $\{e_i\}$ of $H$. Recall that $\langle a, b\rangle := \operatorname{tr}(ab^*)$ is an inner product on $\mathcal{A}$ and if $\|\cdot\|$ denotes the usual operator norm then

$$\|\cdot\| \le \|\cdot\|_2. \tag{7.1}$$

It is a standard fact that $(\mathcal{A}, \langle\,,\rangle)$ is a Hilbert space. Note also that the Cauchy-Schwarz inequality $(\operatorname{tr}(ab^*))^2 \le \operatorname{tr}(aa^*)\operatorname{tr}(bb^*)$, with $a = vw$, $b = wv$, where $v = v^*$ and $w = w^*$, yields $\operatorname{tr}(vwvw) \le \operatorname{tr}(v^2w^2)$.

Now let $\mathrm{Sym} = \{a \in \mathcal{A} : a = a^*\}$ and $\mathrm{Pos} = \exp\{\mathrm{Sym}\} \subset I + \mathrm{Sym}$, where $\exp$ is the usual exponential map and $I$ is the identity operator. $\mathrm{Pos}$ is an infinite dimensional Riemannian manifold with the metric

$$\langle v, w\rangle_p := \operatorname{tr}(p^{-1}vp^{-1}w), \qquad p \in \mathrm{Pos},\ v, w \in \mathrm{Sym} \simeq T_p\mathrm{Pos}.$$

Let $d$ be the associated distance function. The arguments in [La, Ch. XII] show that $(\mathrm{Pos}, d)$ is a complete metric space satisfying the semi-parallelogram law and also that the operators $\exp\{\mathcal{A}\}$ act on $\mathrm{Pos}$ by isometries,

$$p \mapsto [\exp(a)]p := \exp(a)\,p\,\exp(a)^*.$$

Hence this is a situation in which Theorem 2.1 applies.

Corollary 7.1. Let $u(n, x)$ be an integrable cocycle taking values in $\exp\{\mathcal{A}\}$. Then for almost every $x$ there is an operator $\Lambda(x) = \exp(v(x))$, $v(x) \in \mathrm{Sym}$, such that

$$\frac{1}{n}\bigl\|\log\bigl([\Lambda^{-n}(x)u(n, x)]I\bigr)\bigr\|_2 = \frac{1}{n}\Bigl(\sum_i (\log\mu_i(n))^2\Bigr)^{1/2} \to 0,$$

where $\mu_i(n)$ are the eigenvalues of $[\Lambda^{-n}(x)u(n, x)]I \in \mathrm{Pos}$.
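A finite-dimensional analogue makes the geometry of $\mathrm{Pos}$ concrete. The sketch below replaces $H$ by $\mathbb{R}^4$ and uses the standard finite-dimensional closed form for the distance of the trace metric, $d(p, q) = \bigl(\sum_i (\log\mu_i)^2\bigr)^{1/2}$ with $\mu_i$ the eigenvalues of $p^{-1}q$. That closed form is an assumption brought in for illustration and is not stated in the text. The code then checks numerically that $p \mapsto gpg^*$ preserves this distance and that $d(I, p)$ is the $\ell^2$ norm of the logarithms of the eigenvalues of $p$, as in Corollary 7.1. Function names are hypothetical.

```python
import numpy as np


def sym_exp(a):
    """exp(a) for a real symmetric matrix a, via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return (eigvecs * np.exp(eigvals)) @ eigvecs.T


def dist(p, q):
    """Distance for <v, w>_p = tr(p^-1 v p^-1 w), using the assumed closed form.

    The eigenvalues of p^{-1} q are computed from the symmetric matrix
    L^{-1} q L^{-T}, where p = L L^T is the Cholesky factorization.
    """
    chol = np.linalg.cholesky(p)
    chol_inv = np.linalg.inv(chol)
    m = chol_inv @ q @ chol_inv.T
    mu = np.linalg.eigvalsh((m + m.T) / 2)  # symmetrize against round-off
    return float(np.sqrt(np.sum(np.log(mu) ** 2)))


def act(g, p):
    """The action [g]p = g p g^* on positive definite matrices."""
    return g @ p @ g.T


rng = np.random.default_rng(1)


def random_sym(n):
    b = rng.normal(size=(n, n))
    return (b + b.T) / 4


n = 4
p = sym_exp(random_sym(n))
q = sym_exp(random_sym(n))
g = sym_exp(random_sym(n))          # an element of exp{Sym}

d_before = dist(p, q)
d_after = dist(act(g, p), act(g, q))
d_from_identity = dist(np.eye(n), p)
log_norm = float(np.sqrt(np.sum(np.log(np.linalg.eigvalsh(p)) ** 2)))

print(abs(d_before - d_after), abs(d_from_identity - log_norm))
```

Both printed differences should be at the level of floating-point round-off.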

The following Lyapunov regularity statement is a consequence of this corollary. Let $\{f_i(x)\}$ be the orthonormal basis of $H$ consisting of eigenvectors of $\Lambda(x)$, so $\Lambda(x)f_i(x) = \exp(\lambda_i(x))f_i(x)$. For $z = \sum_i z_i(x)f_i(x) \in H$, let $\lambda_z(x) = \sup\{\lambda_i(x) : z_i(x) \ne 0\}$. Then

$$\lim_{n\to\infty} \frac{1}{n}\log\|u(n, x)^{-1}z\| = -\lambda_z(x). \tag{7.2}$$

In [R], Ruelle obtained this type of multiplicative ergodic theorem for more general classes of operators. Note, however, that in the case of the Hilbert-Schmidt operators that we consider here, it is not clear that Corollary 7.1, which in infinite dimensions is a stronger statement than (7.2), can be proved by the methods in [R].

Acknowledgements. The authors would like to thank Vadim Kaimanovich for useful comments on an earlier version of this paper.

References

[Ba1] Ballmann, W.: On the Dirichlet problem at infinity for manifolds of nonpositive curvature. Forum Math. 1, 201–213 (1989)
[Ba2] Ballmann, W.: Lectures on Spaces of Nonpositive Curvature. DMV-Seminar, Bd. 25. Basel, Boston, Berlin: Birkhäuser, 1995
[BaLe] Ballmann, W., Ledrappier, F.: The Poisson boundary for rank one manifolds and their cocompact lattices. Forum Math. 6, 301–313 (1994)
[Be] Beardon, A.F.: Iteration of contractions and analytic maps. J. London Math. Soc. 41, 141–150 (1990)
[Bi] Billingsley, P.: Ergodic Theory and Information. New York: Wiley, 1965
[C] Clarkson, J.A.: Uniformly convex spaces. Trans. Am. Math. Soc. 40, 396–414 (1936)
[F] Furstenberg, H.: Boundary theory and stochastic processes on homogeneous spaces. Proc. Symp. Pure Math., Vol. 26, Providence, RI: American Mathematical Society, 1973, pp. 193–229
[J] Jost, J.: Nonpositive Curvature: Geometric and Analytic Aspects. Lectures in Mathematics: ETH Zürich. Basel, Boston, Berlin: Birkhäuser, 1997
[Ka1] Kaimanovich, V.A.: An entropy criterion for maximality of the boundary of random walks on discrete groups. Soviet Math. Dokl. 31, 193–197 (1985)
[Ka2] Kaimanovich, V.A.: Lyapunov exponents, symmetric spaces and multiplicative ergodic theorem for semisimple Lie groups. J. Soviet Math. 47, 2387–2398 (1989)
[Ka3] Kaimanovich, V.A.: The Poisson formula for groups with hyperbolic properties. Prépublication 97–13, Institut de Recherche Mathématique de Rennes, 1997
[Ki] Kingman, J.F.C.: The ergodic theory of subadditive stochastic processes. J. Roy. Statist. Soc. B 30, 499–510 (1968)
[Kr] Krengel, U.: Ergodic Theorems. de Gruyter Stud. in Math., Vol. 6. New York: de Gruyter, 1985
[La] Lang, S.: Fundamentals of Differential Geometry. New York: Springer-Verlag, 1999
[O] Oseledec, V.I.: A multiplicative ergodic theorem. Ljapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–231 (1968)
[P] Pazy, A.: Asymptotic behavior of contractions in Hilbert space. Israel J. Math. 9, 235–240 (1971)
[R] Ruelle, D.: Characteristic exponents and invariant manifolds in Hilbert space. Ann. Math. 155, 243–290 (1982)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 125 – 152 (1999)


$L^p$-Boundedness of Wave Operators for Two Dimensional Schrödinger Operators

Kenji Yajima

Department of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo, 153 Japan

Received: 5 April 1999 / Accepted: 26 May 1999

Dedicated to Professor Daisuke Fujiwara on his sixtieth birthday

Abstract: Let $H = -\Delta + V$ be the two dimensional Schrödinger operator with a real valued potential $V$ which satisfies the decay condition at infinity $|V(x)| \le C\langle x\rangle^{-\delta}$, $x \in \mathbb{R}^2$, for $\delta > 6$. We show that the wave operators $W_\pm u = \text{s-}\lim_{t\to\pm\infty} e^{itH}e^{-itH_0}u$, $H_0 = -\Delta$, are bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$ under the condition that $H$ has no zero bound states or zero resonance, extending the corresponding results for higher dimensions. As $W_\pm$ intertwine $H_0$ and the absolutely continuous part $HP_{ac}$ of $H$: $f(H)P_{ac} = W_\pm f(H_0)W_\pm^*$ for any Borel function $f$ on $\mathbb{R}^1$, this reduces the various $L^p$-mapping properties of $f(H)P_{ac}$ to those of $f(H_0)$, the convolution operator by the Fourier transform of the function $f(\xi^2)$.

1. Introduction

Let $H_0 = -\Delta$ be the free Schrödinger operator on $\mathbb{R}^2$, and $H = H_0 + V$ be its perturbation by a multiplication operator with a real valued function $V$. We assume that $V$ is bounded and satisfies the following decay condition.

Assumption 1.1. The potential $V(x)$ satisfies $|V(x)| \le C\langle x\rangle^{-\delta}$, $x \in \mathbb{R}^2$, for some $\delta > 6$.

It is well known in the spectral and scattering theory for Schrödinger operators ([1, 4–6]) that under (assumptions much weaker than) Assumption 1.1, $H$ is selfadjoint in $L^2(\mathbb{R}^2)$ with the domain $H^2(\mathbb{R}^2)$, the Sobolev space of order 2; the spectrum of $H$ consists of non-positive eigenvalues and the absolutely continuous spectrum $[0, \infty)$; the singular continuous spectrum is absent; and the wave operators defined by the limits $W_\pm u = \text{s-}\lim_{t\to\pm\infty} e^{itH}e^{-itH_0}u$ exist. The wave operators $W_\pm$ are unitary from $L^2(\mathbb{R}^2)$ onto the absolutely continuous spectral subspace $L^2_{ac}(H)$ for $H$ and intertwine $H_0$ and the absolutely continuous part $HP_{ac}$ of $H$: $W_\pm H_0 W_\pm^* = HP_{ac}$, where $P_{ac}$ is the projection

onto $L^2_{ac}(H)$. It follows that $f(H)P_{ac} = W_\pm f(H_0)W_\pm^*$ for any Borel function $f$ on $\mathbb{R}^1$ and various mapping properties of $f(H)P_{ac}$ may be derived from those of $f(H_0)$ if the corresponding properties are established for $W_\pm$ and $W_\pm^*$. Note that $f(H_0)$ is the convolution operator by the Fourier transform of the function $f(\xi^2)$.

When the spatial dimension $m \ge 3$, we have shown in our previous papers ([15, 16]) that the wave operators $W_\pm$ are bounded in $L^p(\mathbb{R}^m)$ for all $1 \le p \le \infty$ under suitable conditions on the smoothness and the decay at infinity of $V(x)$ and the additional spectral condition that $\lambda = 0$ is neither an eigenvalue nor a resonance of $H$. In lower dimensions, however, because of the high singularities at $z = 0$ of the free resolvent $R_0(z) = (H_0 - z)^{-1}$, the methods in [15] and [16] do not apply, at least directly, and it has been an open question whether or not the wave operators are bounded in $L^p$. The purpose of this paper is to give an affirmative answer to this question for the two dimensional case and show that the wave operators are bounded in $L^p$ for any $1 < p < \infty$ under Assumption 1.1 on the potential $V$ and the spectral condition Assumption 1.2 to be stated below. The one dimensional case is treated in the accompanying paper ([3]) by employing more one dimensional ODE techniques.

For stating the main result of the paper, we introduce some notation. For $s \in \mathbb{R}$ and integral $k \ge 0$,

$$H^{k,s}(\mathbb{R}^2) = \Bigl\{f : \sum_{|\alpha|\le k} \|\langle x\rangle^s D^\alpha f\|_2 < \infty\Bigr\}$$

is the weighted Sobolev space, and $L^{2,s}(\mathbb{R}^2) = H^{0,s}(\mathbb{R}^2)$. For Banach spaces $X$ and $Y$, $B(X, Y)$ is the space of bounded operators from $X$ to $Y$, $B(X) = B(X, X)$. We denote the boundary values on the positive reals of the resolvents $R_0(z)$ and $R(z) = (H - z)^{-1}$ by

$$R_0^\pm(\lambda) \equiv \lim_{\varepsilon\to+0} R_0(\lambda \pm i\varepsilon) \quad\text{and}\quad R^\pm(\lambda) \equiv \lim_{\varepsilon\to+0} R(\lambda \pm i\varepsilon).$$

These limits exist in $B(L^{2,\sigma}(\mathbb{R}^2), H^{2,-\sigma}(\mathbb{R}^2))$, $\sigma > 1/2$, and they are locally Hölder continuous with respect to $\lambda \in (0, \infty)$ (cf. [1]). In two dimensions, $R_0^\pm(k^2)$ has logarithmic singularities at $k = 0$ and has the following asymptotic expansion as a $B(L^{2,s}(\mathbb{R}^2), H^{2,-s}(\mathbb{R}^2))$-valued function, $s > 3$:

$$R_0^\pm(k^2) = c_\pm(k)P_0 + G_0 + O(k^2\log k), \tag{1.1}$$

where $c_\pm(k)$ is a scalar function with a logarithmic singularity at $k = 0$, built from the constant $1$, the Euler number $\gamma$ and $\log k$; $P_0$ is the rank one operator defined by $P_0u(x) = \int_{\mathbb{R}^2} u(x)\,dx$; and $G_0u(x) = \frac{-1}{2\pi}\int_{\mathbb{R}^2} (\log|x - y|)u(y)\,dy$ is the minimal Green function of $-\Delta$. The singularities at $k = 0$ of $R^\pm(k^2)$ strongly depend on the spectral property of $H$ at zero energy. If $H$ has a zero eigenvalue or zero resonance, then $R^\pm(k^2)$ has stronger $k^{-2}$ singularities, whereas $R^\pm(k^2)$ remains bounded as $k \to 0$ otherwise (cf. e.g. Murata [7]). When $H$ has a zero energy eigenvalue or resonance, $W_\pm$ cannot be bounded in $L^p$ for all $1 < p < \infty$, as will be shortly explained, and we assume their absence. We write $c_0 = \int V(x)\,dx$ and set $V_0(x) = c_0^{-1}V(x)$, $P = P_0V_0$ and $Q = 1 - P$. We have $P^2 = P$ and $Q^2 = Q$.

Assumption 1.2. We assume that $c_0 \ne 0$ and that $1 + QG_0VQ$ is invertible in $L^{2,-s}(\mathbb{R}^2)$ for some $1 < s < \delta - 1$.

Theorem 1.3. Suppose that Assumption 1.1 and Assumption 1.2 are satisfied. Then, the wave operators $W_\pm$ are bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$: $\|W_\pm u\|_p \le C_p\|u\|_p$, where the constant $C_p > 0$ is independent of $u \in L^2(\mathbb{R}^2) \cap L^p(\mathbb{R}^2)$.
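The projection identities $P^2 = P$ and $Q^2 = Q$ follow directly from the definitions. The short verification below is a sketch that uses only $P_0u = \int u\,dx$, $V_0 = c_0^{-1}V$ and $c_0 = \int V\,dx \ne 0$; constants are identified with constant functions.

```latex
\begin{align*}
Pu &= P_0(V_0u) = \int_{\mathbb{R}^2} V_0(y)u(y)\,dy
      \quad\text{(a constant function of } x\text{)}, \\
\int_{\mathbb{R}^2} V_0(x)\,dx &= c_0^{-1}\int_{\mathbb{R}^2} V(x)\,dx = 1, \\
P^2u &= \int_{\mathbb{R}^2} V_0(x)\,(Pu)\,dx
      = (Pu)\int_{\mathbb{R}^2} V_0(x)\,dx = Pu, \\
Q^2 &= (1-P)^2 = 1 - 2P + P^2 = 1 - P = Q .
\end{align*}
```

In particular $P$ annihilates exactly those $u$ with $\int V_0u\,dx = 0$, which is the condition $Pu = 0$ used in Remark 1.1.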

Some remarks are in order:

Remark 1.1. If Assumption 1.2 is satisfied, then $1 + QG_0VQ$ is invertible in $L^{2,-s}(\mathbb{R}^2)$ for all $1 < s < \delta - 1$ (cf. [7]). Assumption 1.2 is satisfied if and only if there are no non-trivial solutions $u \in H^2_{loc}(\mathbb{R}^2)$ of $-\Delta u + V(x)u = 0$ which satisfy the asymptotic behaviour at infinity

$$\frac{\partial^\alpha}{\partial x^\alpha}\Bigl(u - a - \frac{b_1x_1 + b_2x_2}{|x|^2}\Bigr) = O(|x|^{-1-|\alpha|-\varepsilon}), \qquad |\alpha| \le 1, \tag{1.2}$$

for some $\varepsilon > 0$, where $a$, $b_1$ and $b_2$ are constants. If at least one of the constants $a$, $b_1$ and $b_2$ does not vanish, then $u$ is called a resonant solution or a half bound state and $0$ is the resonance of $H$. If all these constants vanish, then $u$ is an eigenfunction of $H$ and $0$ is an eigenvalue of $H$. Indeed, if $u \in L^{2,-s}$ satisfies $u + QG_0Vu = 0$, then $u = Qu$ and $-\Delta u + Vu = 0$ since $-\Delta Q = -\Delta$. Moreover, $u \in L^{2,-s}(\mathbb{R}^2)$ for any $s > 1$ and letting $|x| \to \infty$ in the integral expression

$$G_0Vu(x) = \frac{-1}{2\pi}\int \log|x - y|\,V(y)u(y)\,dy$$

and using

$$Pu = \int V_0(x)u(x)\,dx = 0,$$

we see that $u$ satisfies (1.2) (cf. [2]). On the other hand, if $u$ satisfies $-\Delta u + V(x)u = 0$ and (1.2), then, by comparing the singularities at $\xi = 0$ of the Fourier transforms $\mathcal{F}(Vu)(\xi)$ and $\xi^2\mathcal{F}u(\xi)$, we have $\mathcal{F}(Vu)(0) = 0$ or $Qu = u$. And, by virtue of (1.2), the limit as $R \to \infty$ of the boundary integral in the right-hand side of

$$\lim_{R\to\infty}\frac{-1}{2\pi}\int_{|y|\le R} (-\Delta u)(y)\log|x - y|\,dy = u(x) + \lim_{R\to\infty}\frac{1}{2\pi}\int_{|y|=R}\Bigl(\frac{\partial u}{\partial n}(y)\log|x - y| - u(y)\frac{\partial\log|x - y|}{\partial n}\Bigr)dy$$

converges to the constant $-a$. It follows that $G_0Vu = -u(x) + a$ and $QG_0VQu + u = 0$, since $Qa = 0$.

Remark 1.2. If $V$ satisfies $|D^\alpha V(x)| \le C_\alpha\langle x\rangle^{-\delta}$ for $|\alpha| \le \ell$ and if Assumption 1.2 is satisfied, then the wave operators $W_\pm$ are bounded in the Sobolev space $W^{k,p}(\mathbb{R}^2)$ for any $1 < p < \infty$ and $k = 0, \dots, \ell$. This may be proved by applying the commutator method used in [15] for the same purpose and we shall not go into details in this direction here.

Remark 1.3. If $z = 0$ is a resonance or an eigenvalue of $H$, $W_\pm$ cannot be bounded in $L^p(\mathbb{R}^2)$ for all $1 < p < \infty$. Indeed Murata [7] has shown that $e^{-itH}P_{ac}$ in this case satisfies

$$\lim_{t\to\infty}\|(\log t)e^{-itH}P_{ac}f - C_0f\|_{L^{2,-s}} = 0, \qquad s > 3, \tag{1.3}$$

where $C_0 \ne 0$ is an explicitly computable finite rank operator. Equation (1.3) clearly contradicts the $L^p$ boundedness of $W_\pm$ because the latter would imply, as $t \to \infty$,

$$\|(\log t)e^{-itH}P_{ac}f\|_{L^{2,-s}} \le \|(\log t)W_+e^{-itH_0}W_+^*f\|_{L^p} \le C_p\|f\|_{p'}(\log t)\,t^{-2(1/2-1/p)} \to 0$$

for sufficiently large $p > 2$ and $p' = p/(p - 1)$, and because $L^{2,-s} \cap L^{p'}$ is dense in $L^{2,-s}$.

In what follows we deal with $W_+$ only. $W_-$ may be treated similarly. We use the following notation and convention. When $\psi$ and $\phi$ are functions, $\psi \otimes \phi$ denotes the integral operator $(\psi \otimes \phi)u(x) = \int \psi(x)\phi(y)u(y)\,dy$. $\langle x\rangle = (1 + x^2)^{1/2}$, and so on. $D_j = -i\partial/\partial x_j$, $j = 1, 2$, and we use the vector notation $D = (D_1, D_2)$, $\langle D\rangle = (1 + D^2)^{1/2}$. $\|u\|_p$ is the $L^p$ norm of $u$, $1 \le p \le \infty$. $\Sigma$ is the unit circle $S^1 \subset \mathbb{R}^2$ and $d\omega$ denotes the standard line element of $\Sigma$. $\mathcal{F}u(\xi) = \hat u(\xi) = \frac{1}{2\pi}\int_{\mathbb{R}^2} e^{-ix\cdot\xi}u(x)\,dx$ is the Fourier transform of $u$. Various constants are denoted by the same letter $C$ if their specific values are not important, and these constants may differ from one place to another. We take and fix throughout this paper the cut-off functions $\chi(t) \in C_0^\infty(\mathbb{R}^1)$ and $\tilde\chi(t) \in C^\infty(\mathbb{R}^1)$, $\chi(t) + \tilde\chi(t) \equiv 1$, such that $\chi(t) = \chi(-t)$, $0 \le \chi(t), \tilde\chi(t) \le 1$, $\chi(t) = 1$ for $|t| \le c$ and $\chi(t) = 0$ for $|t| \ge 2c$, where $0 < c < 1$ is the sufficiently small constant to be specified in Sect. 4. We note that $\chi(H_0)$ is the convolution operator with the Fourier transform of $\chi(\xi^2) \in C_0^\infty(\mathbb{R}^2)$, and $\chi(H_0)$ and $\tilde\chi(H_0)$ are bounded operators in $L^p(\mathbb{R}^2)$ for any $1 \le p \le \infty$. For $f$ and $g$ in suitable function spaces, $\langle f, g\rangle = \int f(x)g(x)\,dx$.

The rest of the paper is devoted to the proof of Theorem 1.3. The basic strategy is similar to the one employed in [15] and [16] for proving the corresponding property in higher dimensions $m \ge 3$: We start from the stationary representation formula ([6]):

$$W_+u = u - \frac{1}{\pi i}\int_0^\infty R^-(k^2)V\{R_0^+(k^2) - R_0^-(k^2)\}ku\,dk \tag{1.4}$$

and expand $W_+$ into the sum of a few Born terms and the remainder

$$W_+ = \sum_{j=0}^{\ell} W_+^{(j)} + W_{\ell+1}$$

by successively replacing $R^-(k^2)$ by $R^-(k^2) = R_0^-(k^2) - R_0^-(k^2)VR^-(k^2)$ in the right of (1.4): $W_+^{(0)} = I$ is the identity operator and for $j = 1, \dots, \ell$,

$$W^{(j)}u = \frac{(-1)^j}{\pi i}\int_0^\infty R_0^-(k^2)V(R_0^-(k^2)V)^{j-1}\{R_0^+(k^2) - R_0^-(k^2)\}ku\,dk, \tag{1.5}$$

$$W_{\ell+1}u = \frac{(-1)^{\ell+1}}{\pi i}\int_0^\infty R_0^-(k^2)V(R_0^-(k^2)V)^{\ell-1}R^-(k^2)V\{R_0^+(k^2) - R_0^-(k^2)\}ku\,dk. \tag{1.6}$$
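The Born expansion is a finite iteration of the resolvent identity. As a sketch, writing $R_0 = R_0^-(k^2)$ and $R = R^-(k^2)$ for brevity (a shorthand introduced only for this display), multiplying $R = R_0 - R_0VR$ by $V$ on the right and iterating $\ell$ times gives:

```latex
\begin{align*}
RV &= R_0V - R_0V\,RV \\
   &= R_0V - (R_0V)^2 + (R_0V)^2\,RV \\
   &\;\;\vdots \\
   &= \sum_{j=1}^{\ell} (-1)^{j-1}(R_0V)^j \;+\; (-1)^{\ell}(R_0V)^{\ell}\,RV .
\end{align*}
% Inserting this into (1.4), whose integral carries the prefactor -1/(\pi i),
% turns the j-th summand into the coefficient (-1)^j/(\pi i) of (1.5)
% and the last term into the coefficient (-1)^{\ell+1}/(\pi i) of (1.6),
% with (R_0V)^j = R_0V (R_0V)^{j-1} and (R_0V)^{\ell} RV = R_0V (R_0V)^{\ell-1} RV.
```

The signs and operator orderings agree term by term with (1.5) and (1.6).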

We prove that the Born terms $W^{(j)}$ are bounded in $L^p(\mathbb{R}^2)$ for all $1 < p < \infty$ by showing that they are superpositions of compositions of essentially one dimensional convolution operators; the remainder term $W_{\ell+1}$ has the integral kernel $K(x, y)$ which satisfies the condition of Schur's lemma

$$\sup_{x\in\mathbb{R}^2}\int_{\mathbb{R}^2}|K(x, y)|\,dy < \infty, \qquad \sup_{y\in\mathbb{R}^2}\int_{\mathbb{R}^2}|K(x, y)|\,dx < \infty$$

and, therefore, $W_{\ell+1}$ is bounded in $L^p(\mathbb{R}^2)$ for all $1 \le p \le \infty$. We explain here the difficulties which we encounter in this approach, in two dimensions in particular, and the ideas for overcoming these difficulties, simultaneously displaying the plan of this paper. We split $W_+$ into the high energy part $W_+\tilde\chi(H_0)$ and the low energy part $W_+\chi(H_0)$ by using the cut-off functions introduced above.

In Sect. 2, we prove that the first two Born terms $W^{(1)}$ and $W^{(2)}\tilde\chi(H_0)$ are bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$. We have shown in [15] that $W^{(1)}$ and $W^{(2)}$ are written in $m$ dimensions as

$$W^{(1)}u(x) = C_1\int_\Sigma d\omega\int_0^\infty K(t + 2x\omega, \omega)u(x + t\omega)\,dt; \tag{1.7}$$

$$W^{(2)}u(x) = C_2\int_{\Sigma^2} d\Omega\int_{[0,\infty)^2} \hat K_2(t_1, t_2 + 2x\omega_2, \omega_1, \omega_2)u(x + t_1\omega_1 + t_2\omega_2)\,dt_1dt_2, \tag{1.8}$$

where $\Sigma$ is the unit sphere in $\mathbb{R}^m$, $d\omega$ is its surface element, $d\Omega = d\omega_1d\omega_2$ and

$$K(t, \omega) = \int_0^\infty \hat V(r\omega)e^{itr/2}r^{m-2}\,dr, \tag{1.9}$$

$$\hat K_2(t_1, t_2, \omega_1, \omega_2) = \int_{[0,\infty)^2} e^{i(t_1s_1 + t_2s_2)/2}\hat V(s_1\omega_1)\hat V(s_2\omega_2 - s_1\omega_1)(s_1s_2)^{m-2}\,ds_1ds_2. \tag{1.10}$$

Hereafter we write $W^{(1)} = W^{(1)}(V)$ when we want to make the dependence on $V$ explicit. When $m \ge 3$, as was shown in [15], $K \in L^1(\mathbb{R} \times \Sigma)$ and $\hat K_2 \in L^1(\mathbb{R}^2 \times \Sigma^2)$ and the classical Minkowski inequality implies that $W^{(1)}$ and $W^{(2)}$ are bounded in $L^p$ for any $1 \le p \le \infty$. If $m \le 2$, this is not the case, as is evident from Eqs. (1.9) and (1.10). In two dimensions, however, we can show that $K_1(t, \omega) = K(t, \omega) - 2i\hat V(0)\tilde\chi(t)/t \in L^1(\mathbb{R} \times \Sigma)$ and $\|K_1\|_{L^1} \le C\|\langle x\rangle^sV\|_2$, $s > 1$. The difficulty is then circumvented by showing that the integral operator which arises when $K$ is replaced by $\tilde\chi(t)/t$ in (1.7) is a superposition $\int_\Sigma F_\omega u(x)\,d\omega$ over $\omega \in \Sigma$ of

$$F_\omega u(x) = \int_0^\infty \frac{\tilde\chi(t + 2x\omega)}{t + 2x\omega}u(x + t\omega)\,dt, \tag{1.11}$$

and that, after rotating the coordinates by $\omega$, $F_\omega u(x)$ is a sum of three operators, two of which are bounded by the one-dimensional Hardy–Littlewood operators on the half

lines $(0, \pm\infty)$ with positive homogeneous kernel $|t + s|^{-1}$, and the third by a Calderón–Zygmund operator, all being bounded in $L^p$ for any $1 < p < \infty$. In this way, we obtain the estimate

$$\|W^{(1)}(V)u\|_p \le C_{ps}\|\langle x\rangle^sV\|_2\|u\|_p, \qquad\text{for any } s > 1. \tag{1.12}$$

The proof of the $L^p$ boundedness of $W^{(2)}\tilde\chi(H_0)$ is a bit more involved. We write $\hat K_2$ as a sum of three functions $K_{21} + K_{22} + K_{23}$; $K_{21} \in L^1(\mathbb{R}^2 \times \Sigma^2)$, $K_{22} = C(\tilde\chi(t_1)/t_1) \times K(t_2, \omega_2)$ with $K(t, \omega)$ being defined by (1.9), and $K_{23} = (\tilde\chi(t_2)/t_2) \times K'(t_1, \omega_1)$ with $K' \in L^1(\mathbb{R}^1 \times \Sigma)$. We show that the operators which are produced by replacing $\hat K_2$ in (1.8) by $K_{2j}$ are bounded in $L^p$ for any $1 < p < \infty$ as follows. The operator arising from $K_{21}$ can be estimated by using the Minkowski inequality as in the higher dimensional cases; if we denote by $M$ the convolution operator with $\tilde\chi(|x|)/|x|^2$, then the operator arising from $K_{22}$ is of the form $W^{(1)}M$ and $M\tilde\chi(H_0)$ is bounded in $L^p$; and the operator arising from $K_{23}$ may be written in the form

$$\int_\Sigma\int_0^\infty K'(t_1, \omega_1)\Bigl(\int_\Sigma (F_{\omega_2}u)(x + t_1\omega_1)\,d\omega_2\Bigr)d\omega_1dt_1,$$

and the estimate for (1.11) mentioned above and the Minkowski inequality imply that this also is bounded in $L^p$.

In Sect. 3, we prove that the high energy part $W_3\tilde\chi(H_0)$ of the remainder $W_3$ is bounded in $L^p$ for any $1 \le p \le \infty$ by showing that its integral kernel $T(x, y)$ is bounded by a constant times $\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}\langle|x| - |y|\rangle^{-2}$. We write $F(k) = R_0^-(k^2)VR^-(k^2)$. Because $R_0^\pm(k^2)$ is the convolution operator with $G_\pm(x, k) = (\pm i/4)H_0^\pm(k|x|)$, where $H_0^\pm(z) = H_0^{(j)}(z)$ is the 0th order Hankel function of the $j$th kind, $\pm$ corresponding to $(-1)^{j+1}$ (cf. [12]), $T(x, y)$ is given as $T(x, y) = T^+(x, y) - T^-(x, y)$:

$$T^\pm(x, y) = -\frac{1}{\pi i}\int_0^\infty \langle F(k)VG_\pm(y - \cdot\,, k), VG_+(x - \cdot\,, k)\rangle\tilde\chi(k^2)k\,dk. \tag{1.13}$$

By virtue of the classical estimate for Hankel functions $H_0^\pm(k|x|) \sim \dfrac{Ce^{\pm ik|x|}}{\sqrt{k|x|}}$ and the decay property of the resolvent at high energy $\|\langle x\rangle^{-\sigma-j}(d/dk)^jF(k)\langle x\rangle^{-\sigma-j}\|_{B(L^2)} \le Ck^{-2}$ for $j = 0, 1, 2$ and $\sigma > 1/2$, the integral (1.13) is absolutely convergent. However, a simple minded estimate by using these facts only would yield $|T^\pm(x, y)| \le C\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}$, which is far from being sufficient to conclude that $W_3\tilde\chi(H_0)$ is bounded in $L^p$ for all $1 < p < \infty$. This difficulty can be resolved by exploiting the old method in [15] and [16]: We write $G_\pm(x - y, k) = e^{\pm ik|x|}G^\pm_{x,k}(y)$ so that

$$T^\pm(x, y) = -\frac{1}{\pi i}\int_0^\infty e^{-i(|x|\mp|y|)k}\langle F(k)VG^\pm_{y,k}, VG^+_{x,k}\rangle k\tilde\chi(k^2)\,dk, \tag{1.14}$$

and apply the integration by parts twice to the $k$-integral in the right. This will yield the estimate $|T^\pm(x, y)| \le C\langle|x| \mp |y|\rangle^{-2}\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}$, hence the desired estimate.

In Sect. 4, we prove that the low energy part $W_+\chi(H_0)$ is also bounded in $L^p$ for any $1 < p < \infty$. Here we write $R^-(k^2)V = R_0^-(k^2)V(1 + R_0^-(k^2)V)^{-1}$ in (1.4) and investigate the low energy behavior of $(1 + R_0^-(k^2)V)^{-1}$ following the argument in [7] and [2]. We find that, for $0 < k < 2c$, $c$ being a sufficiently small constant, $(1 + R_0^-(k^2)V)^{-1}$ can be written as the sum

$$\sum_{j=0}^{4} d_j(k)K_j + N(k):$$

For $0 \le j \le 4$, $K_j$ is an integral operator with the integral kernel $K_j(x, y)$ which satisfies, for some $s > 1$,

$$\int_{\mathbb{R}^2}\|\langle x\rangle^sVK_{jy}\|_2\,dy < \infty, \qquad K_{jy}(x) = K_j(x, x - y); \tag{1.15}$$

$d_j(k)$ satisfies $|(\partial/\partial\xi)^\alpha d_j(|\xi|)| \le C_\alpha|\xi|^{-|\alpha|}$, and the remainder $N(k)$ is an operator valued function which satisfies the estimate $\|(d/dk)^jN(k)\|_{B(L^{2,-s})} \le C_jk^{2-j}|\log k|$, $s > 3$, for $j = 0, 1, 2$. (Actually $d_0(k) = 1$ and $K_j$ for $1 \le j \le 4$ are rank one operators.) The operator which is produced by inserting $R_0^-(k^2)VN(k)\chi(k^2)$ in place of $R^-(k^2)V$ in (1.4) is an integral operator with the kernel $\tilde T^+(x, y) - \tilde T^-(x, y)$, $\tilde T^\pm(x, y)$ being given by the right-hand side of (1.14) with $N(k)\chi(k^2)$ in place of $F(k)V\tilde\chi(k^2)$. The method employed for estimating $T^\pm(x, y)$ applies and yields $|\tilde T^\pm(x, y)| \le C\langle|x| \mp |y|\rangle^{-2}\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}$, and the operator in question is bounded in $L^p$ for any $1 \le p \le \infty$. The operator produced by inserting $R_0^-(k^2)Vd_j(k)K_j$ in place of $R^-(k^2)V$ in (1.4) may be written as

$$\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)VK_jd_j(k)\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)ku\,dk. \tag{1.16}$$

Observing that $d_j(k)\{R_0^+(k^2) - R_0^-(k^2)\} = \{R_0^+(k^2) - R_0^-(k^2)\}d_j(|D|)$ and that the integral operator may be written as

$$\int A(x, y)u(y)\,dy = \int A(x, x - y)u(x - y)\,dy = \int A_y(x)\tau_yu(x)\,dy,$$

viz. the superposition of the composition of the multiplication by $A_y(x) = A(x, x - y)$ and the translation $\tau_y$ by $y$, we rewrite (1.16) in the form

$$\int_{\mathbb{R}^2}\Bigl(\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)VK_{jy}\{R_0^+(k^2) - R_0^-(k^2)\}k\,dk\Bigr)d_j(|D|)\chi(H_0)\tau_yu\,dy. \tag{1.17}$$

The operator in the parenthesis is nothing but $W^{(1)}(VK_{jy})$ and, by virtue of (1.12), the $L^p$-norm of (1.17) may be estimated as follows:

$$\Bigl\|\int_{\mathbb{R}^2} W^{(1)}(VK_{jy})d_j(|D|)\chi(H_0)\tau_yu\,dy\Bigr\|_p \le C\|u\|_p\|d_j(|D|)\chi(H_0)\|_{B(L^p)}\int_{\mathbb{R}^2}\|\langle x\rangle^sVK_{jy}\|_2\,dy.$$

Because the Fourier multipliers $d_j(|D|)\chi(H_0)$ are bounded in $L^p$ by the well known theorem in Fourier analysis and because (1.15) implies that the integral in the right is finite, the operators arising from $d_j(k)K_j$, $j = 0, \dots, 4$, are all bounded in $L^p$ for any $1 < p < \infty$. In what follows we shall substantiate the argument outlined in this section.

2. Preliminaries

Under Assumption 1.1, it is well known that the limiting absorption principle holds for $H_0$ and $H$, and that $R(z)$ and $R_0(z)$, considered as $B(L^{2,\sigma}, H^{2,-\sigma})$-valued functions of $z \in \mathbb{C}^\pm$, $\mathbb{C}^\pm = \{z \in \mathbb{C} : \pm\operatorname{Im}z > 0\}$ being the upper and the lower half complex plane and $\sigma > 1/2$, can be extended to locally Hölder continuous functions on $\mathbb{C}^\pm \cup (0, \infty)$, and the wave operator $W_+$ may be expressed by the stationary representation formula:

$$W_+u = u - \frac{1}{2\pi i}\int_0^\infty R^-(\lambda)V\{R_0^+(\lambda) - R_0^-(\lambda)\}u\,d\lambda, \qquad u \in L^{2,s},\ s > 1/2. \tag{2.1}$$

For estimating the high energy part $W_+\tilde\chi(H_0)$ we decompose $W_+ = I + W^{(1)} + W^{(2)} + W_3$ as in the introduction. Explicitly we have

$$W^{(1)}u = -\frac{1}{2\pi i}\int_0^\infty R_0^-(\lambda)V\{R_0^+(\lambda) - R_0^-(\lambda)\}u\,d\lambda, \tag{2.2}$$

$$W^{(2)}u = \frac{1}{2\pi i}\int_0^\infty R_0^-(\lambda)VR_0^-(\lambda)V\{R_0^+(\lambda) - R_0^-(\lambda)\}u\,d\lambda, \tag{2.3}$$

$$W_3u = \frac{-1}{2\pi i}\int_0^\infty R_0^-(\lambda)VR_0^-(\lambda)VR^-(\lambda)V\{R_0^+(\lambda) - R_0^-(\lambda)\}u\,d\lambda. \tag{2.4}$$
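Formulas (2.1)–(2.4) are the same objects as (1.4)–(1.6), rewritten in the energy variable. The following short computation is a sketch of the substitution $\lambda = k^2$ that connects the two normalizations, shown for the full operator; the Born terms transform in exactly the same way.

```latex
\begin{align*}
\lambda = k^2,\quad d\lambda = 2k\,dk
  \;\;\Longrightarrow\;\;
\frac{1}{2\pi i}\int_0^\infty R^-(\lambda)V\{R_0^+(\lambda)-R_0^-(\lambda)\}u\,d\lambda
  = \frac{1}{\pi i}\int_0^\infty R^-(k^2)V\{R_0^+(k^2)-R_0^-(k^2)\}\,k\,u\,dk .
\end{align*}
% Hence the prefactor (-1)^j/(2\pi i) with measure d\lambda in (2.2)-(2.4)
% corresponds to the prefactor (-1)^j/(\pi i) with measure k\,dk in (1.5)-(1.6),
% for j = 1, 2 and for the remainder with \ell = 2.
```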

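In this section, we show that $W^{(1)}$ and $W^{(2)}\tilde\chi(H_0)$ are bounded operators in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$. We use the polar coordinates $\xi = r\omega$, $r = |\xi| \in (0, \infty)$ and $\omega = \xi/r \in \Sigma$.

Lemma 2.1. The operators $W^{(1)}$ and $W^{(2)}$ may be written in the form

$$W^{(1)}u(x) = \frac{i}{4\pi}\int_\Sigma d\omega\int_0^\infty K(t + 2x\omega, \omega)u(x + t\omega)\,dt; \tag{2.5}$$

$$W^{(2)}u(x) = C\int_{\Sigma^2} d\Omega\int_{[0,\infty)^2} \hat K_2(t_1, t_2 + 2x\omega_2, \omega_1, \omega_2)u(x + t_1\omega_1 + t_2\omega_2)\,dt_1dt_2, \tag{2.6}$$

where $C = (i/4\pi)^2$, $d\Omega = d\omega_1d\omega_2$ and

$$K(t, \omega) = \int_0^\infty \hat V(r\omega)e^{itr/2}\,dr,$$

$$\hat K_2(t_1, t_2, \omega_1, \omega_2) = \int_{[0,\infty)^2} e^{i(t_1s_1 + t_2s_2)/2}\hat V(s_1\omega_1)\hat V(s_2\omega_2 - s_1\omega_1)\,ds_1ds_2.$$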

Proof. We sketch the proof, referring readers to the proofs of Proposition 2.2, Lemma 2.3 and Lemma 2.4 of [15] for the details. By writing $V(x) = (2\pi)^{-1}\int e^{ix\xi}\hat V(\xi)\,d\xi$ we have

$$(\mathcal{F}W^{(1)}u)(\xi) = -\int_0^\infty\Bigl(\frac{1}{2\pi}\int \frac{\hat V(\eta)}{\xi^2 - \lambda + i0}\,\delta((\xi - \eta)^2 - \lambda)\,\hat u(\xi - \eta)\,d\eta\Bigr)d\lambda.$$

Computing the Fourier inverse transform in

$$W^{(1)}u(x) = -\frac{1}{(2\pi)^2}\iint \frac{e^{ix\xi}\hat V(\eta)\hat u(\xi - \eta)}{2\xi\cdot\eta - \eta^2 + i0}\,d\eta\,d\xi$$

we obtain (2.5). For obtaining (2.6), we repeat similar computations. □

We prove the $L^p$-boundedness of $W^{(1)}$ and $W^{(2)}\tilde\chi(H_0)$ by estimating the integral operators on the right of (2.5) and (2.6). We use the following lemmas.

Lemma 2.2. Let $\tilde\chi(t)$ be the cut-off function introduced in the Introduction. Then:
(1) The convolution operator with $\tilde\chi(t)/t$ is bounded in $L^p(\mathbb{R}^1)$ for any $1 < p < \infty$.
(2) Let $M$ be the integral operator defined by

$$Mu(x) = \int_\Sigma d\omega\int_0^\infty \frac{\tilde\chi(t)}{t}u(x + t\omega)\,dt = \int_{\mathbb{R}^2}\frac{\tilde\chi(|x - y|)}{|x - y|^2}u(y)\,dy. \tag{2.7}$$

Then, $M\tilde\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$.

Proof. The Fourier transform of $\tilde\chi(t)/t$ is bounded. Indeed

$$\mathcal{F}\Bigl(\frac{\tilde\chi}{t}\Bigr)(s) = \lim_{N\to\infty}\frac{i}{\sqrt{2\pi}}\int_{-N}^{N}\frac{\sin ts}{t}\tilde\chi(t)\,dt = \lim_{N\to\infty}\frac{i}{\sqrt{2\pi}}\int_{-N}^{N}\frac{\sin t}{t}\tilde\chi(t/s)\,dt$$

is an even function of $s$ and for $s > 0$ we have

$$\Bigl|\int_{|t|\le 2cs}\frac{\sin t}{t}\tilde\chi(t/s)\,dt\Bigr| \le \int_{cs\le|t|\le 2cs}\frac{dt}{|t|} \le 2\log 2$$

and $\int_{\pm 2cs}^{\pm N}\frac{\sin t}{t}\,dt$ is uniformly bounded with respect to $s$ and $N$. Thus the convolution is bounded in $L^2(\mathbb{R}^1)$. It is obvious that $\tilde\chi(t)/t$ satisfies the Hörmander condition: There exists a constant $A$ such that

$$\int_{|t-s|\ge 2\delta}\Bigl|\frac{\tilde\chi(t - s)}{t - s} - \frac{\tilde\chi(t - s')}{t - s'}\Bigr|\,dt \le A, \qquad\text{whenever } |s - s'| \le \delta,\ \delta > 0.$$

Hence the convolution operator with $\tilde\chi(t)/t$ is bounded in $L^p$ for any $1 < p \le 2$ by the well-known theorem (cf. e.g. [13], p. 19). Since the operator is selfadjoint it is bounded for any $1 < p < \infty$. The proof of the second statement is similar. Integration by parts shows that the Fourier transform $G(\xi)$ of $\tilde\chi(|x|)|x|^{-2}$ is bounded for $|\xi| \ge c$ and $\mathcal{F}M\tilde\chi(H_0)u(\xi) = (2\pi)G(\xi)\tilde\chi(\xi^2)\hat u(\xi)$. Hence $M\tilde\chi(H_0)$ is bounded in $L^2(\mathbb{R}^2)$. It is easy to see that $\tilde\chi(|x|)|x|^{-2}$ satisfies the Hörmander condition: There exists a constant $A$ such that

$$\int_{|x-y|\ge 2\delta}\Bigl|\frac{\tilde\chi(|x - y|)}{|x - y|^2} - \frac{\tilde\chi(|x - y'|)}{|x - y'|^2}\Bigr|\,dx \le A, \qquad\text{whenever } |y - y'| \le \delta,\ \delta > 0.$$

Hence $M\tilde\chi(H_0)$ is bounded in $L^p$ for any $1 < p \le 2$. Since $M$ and $\tilde\chi(H_0)$ commute, $M\tilde\chi(H_0)$ is selfadjoint and it is bounded in $L^p$ for any $1 < p < \infty$. □

Lemma 2.3. Let $F_\omega$ for $\omega \in \Sigma$ be defined by

$$F_\omega u(x) \equiv \int_0^\infty \frac{\tilde\chi(t + 2x\omega)}{t + 2x\omega}u(x + t\omega)\,dt. \tag{2.8}$$

Then, there exists a constant $C_p$ independent of $\omega \in \Sigma$ such that $\|F_\omega u\|_p \le C_p\|u\|_p$.

Proof. Take the rotation $R(\omega)$ of $\mathbb{R}^2$ which brings the vector $(1, 0)$ to $\omega \in \Sigma$ and make a measure preserving change of variables $x \to R(\omega)x$. It suffices to show that

$$Fu(x) \equiv \int_0^\infty \frac{\tilde\chi(t + 2x_1)}{t + 2x_1}u(x_1 + t, x_2)\,dt = \int_{x_1}^\infty \frac{\tilde\chi(t + x_1)}{t + x_1}u(t, x_2)\,dt \tag{2.9}$$

is bounded in $L^p$ for any $1 < p < \infty$. When $x_1 > 0$, we clearly have

$$|Fu(x)| \le \int_0^\infty \frac{|u(t, x_2)|}{t + x_1}\,dt. \tag{2.10}$$

When $x_1 < 0$, we write it in the form:

$$Fu(x) = \Bigl(\int_{-\infty}^\infty - \int_{-\infty}^{x_1}\Bigr)\frac{\tilde\chi(t + x_1)}{t + x_1}u(t, x_2)\,dt. \tag{2.11}$$

The second integral on the right of (2.11) is bounded in modulus by

$$\int_{-\infty}^0 \frac{|u(t, x_2)|}{|t + x_1|}\,dt, \qquad x_1 < 0. \tag{2.12}$$

Both (2.10) and (2.12) are one dimensional integral operators with the homogeneous kernel $|t + x_1|^{-1}$ and they are bounded respectively in $L^p(0, \infty)$ and $L^p(-\infty, 0)$ for $1 < p < \infty$ by the Hardy–Littlewood inequality ([10]). The convolution with $\tilde\chi(t)/t$ is bounded in $L^p(\mathbb{R}^1)$ for $1 < p < \infty$ by virtue of Lemma 2.2. Hence, $F$ is bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$. □

Lemma 2.4. Let $2 < q \le \infty$. Then there exists $\alpha_0 > 1$ such that for $1 < \alpha < \alpha_0$,

$$\sup_{\eta\in\mathbb{R}^2}\int_{\mathbb{R}^2}\frac{|\hat V(\xi - \eta)|^\alpha}{|\xi|}\,d\xi \le C_{q\alpha}(\|\hat V\|_2 + \|\hat V\|_q)^\alpha.$$

In particular, for any $\sigma > 0$, there exists $\alpha_0 > 1$ such that for $1 < \alpha < \alpha_0$,

$$\sup_{\eta\in\mathbb{R}^2}\int_{\mathbb{R}^2}\frac{|\hat V(\xi - \eta)|^\alpha}{|\xi|}\,d\xi \le C_{\alpha\sigma}\|\langle x\rangle^\sigma V\|_2^\alpha. \tag{2.13}$$

Proof. For proving the first inequality, decompose the domain of integration into two parts $|\xi| \ge 1$ and $|\xi| \le 1$ and use Hölder's inequality. Note that $|\xi|^{-1}$ is in $L^{2+\varepsilon}$ in the former domain and in $L^{2-\varepsilon}$ in the latter for any $\varepsilon > 0$. Since $\|\hat V\|_q \le C\|V\|_{q'} \le C\|\langle x\rangle^\sigma V\|_2$ by Young's and Hölder's inequalities if $2 < q$ is sufficiently close to $2$, the second estimate (2.13) follows from the first. Here $q'$ is the dual exponent of $q$: $1/q + 1/q' = 1$. □

Hereafter $q'$ always denotes the dual exponent of $q$: $1/q + 1/q' = 1$. Lemma 2.4 is used in the following form.

Lemma 2.5. Let $\sigma > 0$. Then, there exists $q_0 > 2$ such that for any $q > q_0$,

$$\Bigl\|\int_0^\infty e^{itr/2}\hat u(r\omega)\,dr\Bigr\|_{L^1(\Sigma_\omega, L^q(\mathbb{R}_t))} \le C_q\|\langle x\rangle^\sigma u\|_2.$$

Proof. Apply Young's inequality to the one dimensional Fourier transform with respect to the variable $r$ and use Hölder's inequality to the integral with respect to $\omega$. We obtain:

$$\Bigl\|\int_0^\infty e^{itr/2}\hat u(r\omega)\,dr\Bigr\|_{L^1(\Sigma_\omega, L^q(\mathbb{R}_t))} \le C\|\hat u(r\omega)\|_{L^1(\Sigma_\omega, L^{q'}((0,\infty)_r))} \le C\|\hat u\|_{L^{q'}(\Sigma\times(0,\infty))} = C\Bigl(\int_{\mathbb{R}^2}\frac{|\hat u(\xi)|^{q'}}{|\xi|}\,d\xi\Bigr)^{1/q'}.$$

The lemma follows by applying (2.13) to the right-hand side. □

We are now in a position to prove the following proposition.

Proposition 2.1. The operator $W^{(1)}(V)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1 < p < \infty$. Moreover, for any $s > 1$, there exists a constant $C_{ps}$ such that

$$\|W^{(1)}(V)u\|_p \le C_{ps}\|\langle x\rangle^sV\|_2\|u\|_p. \tag{2.14}$$

Proof. Let $\sigma>0$ be such that $1+\sigma<s$. By virtue of Hölder's inequality and Lemma 2.5, we have by choosing $q>2$ large enough that
$$\|K\|_{L^1(\Sigma\times[-2,2])}\le C_q\|K\|_{L^1(\Sigma,L^q(\mathbb{R}^1))}\le C_\sigma\|\langle x\rangle^{\sigma}V\|_2.$$
When $|t|\ge1$, by applying integration by parts, we write $K(t,\omega)$ in the form
$$K(t,\omega)=\frac{2i\hat V(0)}{t}+\frac{2i}{t}\int_0^\infty e^{itr/2}\frac{\partial}{\partial r}\hat V(r\omega)\,dr. \tag{2.15}$$
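The integration by parts behind (2.15) can be written out as follows. This is only a sketch: it assumes that, up to a constant factor, $K(t,\omega)=\int_0^\infty e^{itr/2}\hat V(r\omega)\,dr$ (the form that (2.15) itself suggests) and that $\hat V(r\omega)\to0$ as $r\to\infty$.

```latex
\begin{align*}
\int_0^\infty e^{itr/2}\hat V(r\omega)\,dr
 &=\Bigl[\frac{2}{it}\,e^{itr/2}\hat V(r\omega)\Bigr]_{r=0}^{r=\infty}
   -\frac{2}{it}\int_0^\infty e^{itr/2}\frac{\partial}{\partial r}\hat V(r\omega)\,dr\\
 &=-\frac{2}{it}\hat V(0)
   -\frac{2}{it}\int_0^\infty e^{itr/2}\frac{\partial}{\partial r}\hat V(r\omega)\,dr\\
 &=\frac{2i\hat V(0)}{t}
   +\frac{2i}{t}\int_0^\infty e^{itr/2}\frac{\partial}{\partial r}\hat V(r\omega)\,dr ,
 \qquad\text{since } -\tfrac{1}{i}=i .
\end{align*}
```

The first term decays only like $1/t$, which is why it is split off and handled separately from the integrable remainder.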

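By virtue of (the proof of) Lemma 2.5 again, there exists $q>2$ such that
$$\Bigl\|\int_0^\infty e^{itr/2}\frac{\partial\hat V}{\partial r}(r\omega)\,dr\Bigr\|_{L^1(\Sigma_\omega,L^q(\mathbb{R}_t))}\le C\Bigl(\int_\Sigma\int_0^\infty\Bigl|\frac{\partial\hat V}{\partial r}(r\omega)\Bigr|^{\bar q}dr\,d\omega\Bigr)^{1/\bar q}\le C_1\Bigl(\int_{\mathbb{R}^2}\frac{|\nabla\hat V(\xi)|^{\bar q}}{|\xi|}\,d\xi\Bigr)^{1/\bar q}\le C\|\langle x\rangle^{1+\sigma}V\|_2.$$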


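The second summand in the right of (2.15) is therefore integrable with respect to $(t,\omega)\in\{|t|\ge1\}\times\Sigma$. It follows that $K_1(t,\omega)=K(t,\omega)-2i\hat V(0)\tilde\chi(t)/t\in L^1(\mathbb{R}^1\times\Sigma)$ and
$$\|K_1(t,\omega)\|_{L^1(\mathbb{R}^1\times\Sigma)}\le C\|\langle x\rangle^{s}V\|_2. \tag{2.16}$$
Change the variable $t$ by $t-2x\omega$ and estimate as
$$\Bigl|\int_\Sigma d\omega\int_0^\infty K_1(t+2x\omega,\omega)u(x+t\omega)\,dt\Bigr|\le\int_\Sigma d\omega\int_{-\infty}^{\infty}|K_1(t,\omega)u(x_\omega+t\omega)|\,dt. \tag{2.17}$$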
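The substitution behind (2.17) can be checked in one line. In this sketch $x\omega$ is read as the inner product $(x,\omega)$ and $|\omega|=1$, which is how the reflected point $x_\omega=x-2(x,\omega)\omega$ is used in the text.

```latex
\begin{align*}
x+\bigl(t-2(x,\omega)\bigr)\omega
 &=\bigl(x-2(x,\omega)\omega\bigr)+t\omega = x_\omega+t\omega ,\\
|x_\omega|^2
 &=|x|^2-4(x,\omega)^2+4(x,\omega)^2|\omega|^2=|x|^2 .
\end{align*}
```

The second line shows that $x\mapsto x_\omega$ is a linear isometry (a reflection), so Lebesgue measure is preserved under it.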

Note that the mapping $x\mapsto x_\omega=x-2(x,\omega)\omega$ is measure preserving. We obtain by applying Minkowski's inequality:
$$\Bigl\|\int_\Sigma d\omega\int_0^\infty|K_1(t,\omega)u(x_\omega+t\omega)|\,dt\Bigr\|_{L^p(\mathbb{R}^2_x)}\le C\|\langle x\rangle^{s}V\|_2\|u\|_{L^p(\mathbb{R}^2)} \tag{2.18}$$
for all $1\le p\le\infty$. If we define $F_\omega$ by (2.8), then the operator obtained by inserting $\tilde\chi(t)/t$ in place of $K$ in (2.5) is given by $\int_\Sigma F_\omega u(x)\,d\omega$ and Lemma 2.3 implies
$$\Bigl\|\int_\Sigma F_\omega u(x)\,d\omega\Bigr\|_{L^p(\mathbb{R}^2)}\le C_p\|u\|_{L^p(\mathbb{R}^2)}. \tag{2.19}$$
The proof of the proposition is completed since $|\hat V(0)|\le C_s\|\langle x\rangle^{s}V\|_2$. □

Proposition 2.2. The operator $W^{(2)}\tilde\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$. For any $s>2$ there exists a constant $C_{ps}>0$ such that
$$\|W^{(2)}\tilde\chi(H_0)u\|_p\le C_{ps}\|\langle x\rangle^{s}V\|_2^2\|u\|_p.$$

Proof. The reflection along $\omega_2$: $x\mapsto x_{\omega_2}=x-2(x\omega_2)\omega_2$ is measure preserving and
$$|W^{(2)}u(x)|\le C\int_{\Sigma^2}d\omega_1\,d\omega_2\int_{\mathbb{R}^2}|\hat K_2(t_1,t_2,\omega_1,\omega_2)|\,|u(x_{\omega_2}+t_1\omega_1+t_2\omega_2)|\,dt_1dt_2. \tag{2.20}$$
Hence, if $\hat K_2\in L^1([0,\infty)^2\times\Sigma^2)$, which is the case for $m\ge3$ ([15] and [16]), the Minkowski inequality implies that $W^{(2)}$ is bounded in $L^p(\mathbb{R}^2)$ for any $1\le p\le\infty$ and that
$$\|W^{(2)}u\|_p\le C\|\hat K_2\|_{L^1}\|u\|_p \tag{2.21}$$
with $p$-independent constant $C$. In two dimensions, unfortunately, $\hat K_2$ misses being in $L^1$ by a whisker and an additional argument is necessary. We prove the proposition by a series of lemmas. In what follows $0<\sigma$ denotes an arbitrarily small but fixed number.

Lemma 2.6. The function $\hat K_2$ can be written as a sum $\hat K_2=K_{21}+K_{22}+K_{23}$, in such a way that


1. $\|K_{21}\|_{L^1(\mathbb{R}^2\times\Sigma^2)}\le C\|\langle x\rangle^{2+\sigma}V\|_2^2$;
2. $K_{22}=2i\hat V(0)(\tilde\chi(t_1)/t_1)\times K(t_2,\omega_2)$, where $K(t,\omega)$ is defined by (1.9);
3. $K_{23}=(\tilde\chi(t_2)/t_2)\times K'(t_1,\omega_1)$ with $\|K'\|_{L^1(\mathbb{R}^1\times\Sigma)}\le C\|\langle x\rangle^{1+\sigma}V\|_2^2$.

Proof. We first decompose $\hat K_2$ into three pieces by using the cut-off functions $\chi(t)+\tilde\chi(t)=1$:
$$\hat K_2=\chi(t_1)\chi(t_2)\hat K_2+\tilde\chi(t_1)\hat K_2+\chi(t_1)\tilde\chi(t_2)\hat K_2. \tag{2.22}$$
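The decomposition (2.22) is the partition of unity $\chi+\tilde\chi=1$ applied first in $t_1$ and then, on the $\chi(t_1)$ piece only, in $t_2$. A short check, using nothing beyond $\chi(t)+\tilde\chi(t)=1$:

```latex
\begin{align*}
1 &=\chi(t_1)+\tilde\chi(t_1)
   =\chi(t_1)\bigl(\chi(t_2)+\tilde\chi(t_2)\bigr)+\tilde\chi(t_1)\\
  &=\chi(t_1)\chi(t_2)+\tilde\chi(t_1)+\chi(t_1)\tilde\chi(t_2).
\end{align*}
```

Multiplying by $\hat K_2$ gives the three terms of (2.22), which the proof then treats in this order.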

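By Young's inequality applied to the two dimensional Fourier transform with respect to $(s_1,s_2)$ and by Hölder's inequality for the $(\omega_1,\omega_2)$-integral, we have for any $q>2$,
$$\|\hat K_2\|_{L^1(\Sigma^2,L^q(\mathbb{R}^2))}\le C_q\|\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\|_{L^{\bar q}(\Sigma^2\times[0,\infty)^2)}.$$
If $q>2$ is large enough, Lemma 2.4 implies
$$\|\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\|_{L^{\bar q}(\Sigma^2\times[0,\infty)^2)}=\Bigl(\int_{\mathbb{R}^4}\frac{|\hat V(\xi_1)\hat V(\xi_2-\xi_1)|^{\bar q}}{|\xi_1||\xi_2|}\,d\xi_1d\xi_2\Bigr)^{1/\bar q}\le C_{q\sigma}\|\langle x\rangle^{\sigma}V\|_2^2.$$
It follows by applying Hölder's inequality to the $(t_1,t_2)$-integral that $\|\chi(t_1)\chi(t_2)\hat K_2\|_{L^1(\mathbb{R}^2\times\Sigma^2)}\le C\|\langle x\rangle^{\sigma}V\|_2^2$ and we put $\chi(t_1)\chi(t_2)\hat K_2$ into $K_{21}$. Applying integration by parts with respect to the variable $s_1$, we have
$$\tilde\chi(t_1)\hat K_2=\tilde\chi(t_1)L_1+\tilde\chi(t_1)\chi(t_2)L_2+\tilde\chi(t_1)\tilde\chi(t_2)L_2; \tag{2.23}$$
$$L_1(t_1,t_2,\omega_1,\omega_2)=-\frac{2\hat V(0)}{it_1}\int_0^\infty e^{it_2s_2/2}\hat V(s_2\omega_2)\,ds_2;$$
$$L_2(t_1,t_2,\omega_1,\omega_2)=-\frac{2}{it_1}\int_{[0,\infty)^2}e^{i(t_1s_1+t_2s_2)/2}\frac{\partial}{\partial s_1}\Bigl(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\Bigr)ds_1ds_2. \tag{2.24}$$
Then the first summand $\tilde\chi(t_1)L_1$ in (2.23) may be written in the form
$$2i\hat V(0)\frac{\tilde\chi(t_1)}{t_1}\times K(t_2,\omega_2)$$
and we put it into $K_{22}$. Denote by $L_2'$ the integral in the right of (2.24). By using the argument similar to one that is used for proving $\chi(t_1)\chi(t_2)\hat K_2\in L^1(\mathbb{R}^2\times\Sigma^2)$ we estimate, with $q>2$ large enough,
$$\|L_2'\|_{L^1(\Sigma^2,L^q(\mathbb{R}^2))}\le C\|(\partial/\partial s_1)(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1))\|_{L^{\bar q}(\Sigma^2\times[0,\infty)^2)}=C\Bigl(\int_{\mathbb{R}^4}\frac{|\nabla_{\xi_1}(\hat V(\xi_1)\hat V(\xi_2-\xi_1))|^{\bar q}}{|\xi_1||\xi_2|}\,d\xi_1d\xi_2\Bigr)^{1/\bar q}\le C\|\langle x\rangle^{1+\sigma}V\|_2\|\langle x\rangle^{\sigma}V\|_2.$$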


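It follows that $\|\tilde\chi(t_1)\chi(t_2)L_2\|_{L^1(\mathbb{R}^2\times\Sigma^2)}\le C\|\langle x\rangle^{1+\sigma}V\|_2^2$ and we put $\tilde\chi(t_1)\chi(t_2)L_2$ into $K_{21}$. For studying $\tilde\chi(t_1)\tilde\chi(t_2)L_2$ we further apply integration by parts with respect to the variable $s_2$ in the right of (2.24) and decompose
$$\tilde\chi(t_1)\tilde\chi(t_2)L_2=\tilde\chi(t_1)\tilde\chi(t_2)L_{21}+\tilde\chi(t_1)\tilde\chi(t_2)L_{22},$$
$$L_{21}(t_1,t_2,\omega_1,\omega_2)=\frac{-4}{t_1t_2}\int_{[0,\infty)}e^{it_1s_1/2}\frac{\partial}{\partial s_1}\Bigl(\hat V(s_1\omega_1)\hat V(-s_1\omega_1)\Bigr)ds_1; \tag{2.25}$$
$$L_{22}(t_1,t_2,\omega_1,\omega_2)=\frac{-4}{t_1t_2}\int_{[0,\infty)^2}e^{i(t_1s_1+t_2s_2)/2}\frac{\partial^2}{\partial s_1\partial s_2}\Bigl(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\Bigr)ds_1ds_2.$$
We denote by $M_1(t_1,\omega_1)$ the integral on the right of (2.25). Then, for $2<q<\infty$ large enough, we have by repeating the argument above and by using (2.13) that
$$\|M_1\|_{L^1(\Sigma,L^q(\mathbb{R}))}\le C\|(\partial/\partial s)(\hat V(s\omega)\hat V(-s\omega))\|_{L^1(\Sigma,L^{\bar q}(\mathbb{R}))}=C\Bigl(\int\frac{|\nabla_\xi(\hat V(\xi)\hat V(-\xi))|^{\bar q}}{|\xi|}\,d\xi\Bigr)^{1/\bar q}\le C\|\langle D\rangle_\xi^{\sigma}\nabla_\xi(\hat V(\xi)\hat V(-\xi))\|_2. \tag{2.26}$$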

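By using the Parseval–Plancherel formula, the inequality $\langle x\rangle^{1+\sigma}\le C_\sigma(\langle x-y\rangle^{1+\sigma}+\langle y\rangle^{1+\sigma})$, and the Hausdorff–Young inequality we estimate the right-hand side by a constant times
$$\|\langle x\rangle^{1+\sigma}(V*\tilde V)\|_2\le C\|(\langle x\rangle^{1+\sigma}V)*\tilde V\|_2\le C\|\langle x\rangle^{1+\sigma}V\|_2\|\tilde V\|_1\le C\|\langle x\rangle^{1+\sigma}V\|_2^2, \tag{2.27}$$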

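where we wrote $\tilde V(x)=V(-x)$. It follows that
$$\|(\tilde\chi(t_1)/t_1)M_1(t_1,\omega_1)\|_{L^1(\mathbb{R}^1\times\Sigma)}\le C\|\langle x\rangle^{1+\sigma}V\|_2^2,$$
and we put $\tilde\chi(t_1)\tilde\chi(t_2)L_{21}=(\tilde\chi(t_2)/t_2)\times(-4\tilde\chi(t_1)/t_1)M_1(t_1,\omega_1)$ into $K_{23}$. We have, for large enough $2<q<\infty$,
$$\Bigl\|\int_{[0,\infty)^2}e^{i(t_1s_1+t_2s_2)/2}\frac{\partial^2}{\partial s_1\partial s_2}\Bigl(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\Bigr)ds_1ds_2\Bigr\|_{L^1(\Sigma^2,L^q(\mathbb{R}^2))}\le C\|(\partial^2/\partial s_1\partial s_2)(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1))\|_{L^{\bar q}(\Sigma^2\times[0,\infty)^2)}$$
$$\le C\Bigl(\int_{\mathbb{R}^4}\frac{|\nabla_{\xi_1}\nabla_{\xi_2}(\hat V(\xi_1)\hat V(\xi_2-\xi_1))|^{\bar q}}{|\xi_1||\xi_2|}\,d\xi_1d\xi_2\Bigr)^{1/\bar q}\le C\|\langle x\rangle^{2+\sigma}V\|_2^2.$$
It follows that $\|\tilde\chi(t_1)\tilde\chi(t_2)L_{22}\|\le C\|\langle x\rangle^{2+\sigma}V\|_2^2$ and we put $\tilde\chi(t_1)\tilde\chi(t_2)L_{22}$ into $K_{21}$. The term $\chi(t_1)\tilde\chi(t_2)\hat K_2$ may be studied by interchanging the roles of $t_1$ and $t_2$ in the foregoing argument. We apply integration by parts with respect to the variable $s_2$ to obtain:
$$\chi(t_1)\tilde\chi(t_2)\hat K_2=\chi(t_1)\tilde\chi(t_2)L_{31}+\chi(t_1)\tilde\chi(t_2)L_{32},$$
$$L_{31}(t_1,t_2,\omega_1,\omega_2)=-\frac{2}{it_2}\int_{[0,\infty)}e^{it_1s_1/2}\hat V(s_1\omega_1)\hat V(-s_1\omega_1)\,ds_1;$$
$$L_{32}(t_1,t_2,\omega_1,\omega_2)=\frac{-2}{it_2}\int_{[0,\infty)^2}e^{i(t_1s_1+t_2s_2)/2}\frac{\partial}{\partial s_2}\Bigl(\hat V(s_1\omega_1)\hat V(s_2\omega_2-s_1\omega_1)\Bigr)ds_1ds_2.$$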

The function $\chi(t_1)\tilde\chi(t_2)L_{31}$ has separated variables and the estimates (2.26) and (2.27) show that it is of type $K_{23}$, viz. it is a product of an integrable function of $(\omega_1,t_1)\in\mathbb{R}^1\times\Sigma$ with $L^1(\mathbb{R}^1\times\Sigma)$-norm $\le C\|\langle x\rangle^{\sigma}V\|_2^2$ and $\tilde\chi(t_2)/t_2$. Finally the argument similar to the one used to show $\tilde\chi(t_1)\tilde\chi(t_2)L_{22}\in L^1(\mathbb{R}^2\times\Sigma^2)$ implies
$$\|\chi(t_1)\tilde\chi(t_2)L_{32}\|_{L^1(\mathbb{R}^2\times\Sigma^2)}\le C\|\langle x\rangle^{1+\sigma}V\|_2^2$$
and we put $\chi(t_1)\tilde\chi(t_2)L_{32}$ into $K_{21}$. This completes the proof of the lemma. □

We have already shown by (2.21) that the operator $E_{21}$ produced by inserting $K_{21}\in L^1(\mathbb{R}^2\times\Sigma^2)$ in place of $\hat K_2$ in (2.6) is bounded in $L^p(\mathbb{R}^2)$ for any $1\le p\le\infty$ and the estimate in the first statement of Lemma 2.6 implies $\|E_{21}u\|_p\le C\|\langle x\rangle^{2+\sigma}V\|_2^2\|u\|_p$. The following two lemmas complete the proof of Proposition 2.2.

Lemma 2.7. Let $E_{22}$ be the operator which is produced by replacing $\hat K_2$ in (2.6) by $K_{22}=2i\hat V(0)(\tilde\chi(t_1)/t_1)\times K(t_2,\omega_2)$. Then $E_{22}\tilde\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$ and $\|E_{22}u\|_p\le C_{p\sigma}\|\langle x\rangle^{1+\sigma}V\|_2^2\|u\|_p$.

Proof. The operator $E_{22}$ is, modulo a constant factor, of the form
$$\int_\Sigma d\omega_2\int_0^\infty dt_2\,\hat K(t_2+2x\omega_2,\omega_2)\int_\Sigma d\omega_1\int_0^\infty dt_1\,\frac{\tilde\chi(t_1)}{t_1}u(x+t_1\omega_1+t_2\omega_2)=\int_\Sigma d\omega_2\int_0^\infty\hat K(t_2+2x\omega_2,\omega_2)Mu(x+t_2\omega_2)\,dt_2=CW^{(1)}Mu(x),$$
where $M$ is the operator defined by (2.7). Hence $E_{22}\tilde\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$, by virtue of Proposition 2.1 and Lemma 2.2. We note $|\hat V(0)|\le C\|V\|_1\le\|\langle x\rangle^{1+\sigma}V\|_2$ and conclude the proof of the lemma. (We remark that this is the only place where the low energy cut-off is necessary to prove $W^{(2)}\tilde\chi(H_0)$ is bounded in $L^p$, $1<p<\infty$.) □

Lemma 2.8. The operator $E_{23}$ produced by replacing $\hat K_2$ by $K_{23}=(\tilde\chi(t_2)/t_2)K'(t_1,\omega_1)$ in (2.6) is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$ and $\|E_{23}u\|_p\le C_{p\sigma}\|\langle x\rangle^{1+\sigma}V\|_2^2\|u\|_p$.

Proof. Define the operator $K'$ by
$$K'u(x)=\int_\Sigma\int_0^\infty K'(t_1,\omega_1)u(x+t_1\omega_1)\,d\omega_1dt_1.$$
It is obvious that $\|K'u\|_p\le\|K'\|_{L^1(\Sigma\times\mathbb{R})}\|u\|_p$ for any $1\le p\le\infty$ and the operator $E_{23}$ may be written in the form
$$E_{23}u(x)=C\int_\Sigma(F_{\omega_2}K'u)(x)\,d\omega_2 \tag{2.28}$$
where $F_\omega$ is defined by (2.8). Hence the Minkowski inequality and (2.19) imply
$$\|E_{23}u\|_p\le C_p\|K'\|_{L^1(\Sigma\times\mathbb{R})}\|u\|_p\le C_{p\sigma}\|\langle x\rangle^{1+\sigma}V\|_2^2\|u\|_p,$$
which completes the proof of the lemma. □


3. High Energy Estimate

In this section, we complete the proof of $L^p$-boundedness of the high energy part $W_+\tilde\chi(H_0)$ of $W_+$, $1<p<\infty$. We write $W_+=I+W^{(1)}+W^{(2)}+W_3$ as in the previous sections. We have already shown in Proposition 2.1 and Proposition 2.2 that $W^{(1)}$ and $W^{(2)}\tilde\chi(H_0)$ are bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$. It remains to prove the following proposition.

Proposition 3.1. The operator $W_3\tilde\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1\le p\le\infty$.

Proof. We prove the proposition by showing that the integral kernel $T(x,y)$ of $W_3\tilde\chi(H_0)$ satisfies the condition of Schur's lemma. We write
$$W_3\tilde\chi(H_0)u=\frac{1}{2\pi i}\int_0^\infty R_0^-(\lambda)VR_0^-(\lambda)VR^-(\lambda)V\{R_0^+(\lambda)-R_0^-(\lambda)\}\tilde\chi(\lambda)u\,d\lambda. \tag{3.1}$$
We make the change of the variable $\lambda=k^2$. It is well known that $R_0^\pm(k^2)$ is the convolution operator with
$$G_\pm(x;k)=\frac{\pm i}{4}H_0^\pm(k|x|),$$
where $H_0^\pm(z)$ is the Hankel function:
$$H_0^\pm(z)=\frac{\sqrt2\,e^{\mp i\pi/4}e^{\pm iz}}{\pi}\int_0^\infty e^{-t}t^{-1/2}\Bigl(z\pm\frac{it}{2}\Bigr)^{-1/2}dt. \tag{3.2}$$
We use the following two lemmas.

Lemma 3.1. Define $G^\pm_{k,y}(x)=e^{\mp ik|y|}G_\pm(x-y;k)$ and regard $G^\pm_{k,y}(x)$ as a function of $x$ with parameters $k>0$ and $y\in\mathbb{R}^2$. We have for any $\varepsilon>0$,
$$\Bigl\|\langle x\rangle^{-(j+1+\varepsilon)}\frac{\partial^j}{\partial k^j}G^\pm_{k,y}(x)\Bigr\|_{L^2(\mathbb{R}^2_x)}\le\frac{C_j}{\sqrt{k\langle y\rangle}},\qquad k\ge c>0. \tag{3.3}$$

Proof. Let $\tilde G_\pm(x,k)=e^{\mp ik|x|}G_\pm(x,k)$. Then by differentiating (3.2),
$$\frac{\partial^j}{\partial k^j}\tilde G_\pm(x;k)=\frac{C_j}{|k|^j\sqrt{k|x|}}\int_0^\infty e^{-t}t^{-1/2}\Bigl(\frac{2k|x|}{2k|x|\pm it}\Bigr)^{j+(1/2)}dt,$$
we have
$$\Bigl|\frac{\partial^j}{\partial k^j}\tilde G_\pm(x;k)\Bigr|\le\frac{C_j}{k^j\sqrt{k|x|}},\qquad k>0. \tag{3.4}$$
Since $G^\pm_{k,y}(x;k)=e^{\pm ik(|x-y|-|y|)}\tilde G_\pm(x-y;k)$ and $||x-y|-|y||\le|x|$, (3.4) implies
$$\Bigl|\frac{\partial^j}{\partial k^j}G^\pm_{k,y}(x)\Bigr|\le C_j\sum_{\ell=0}^{j}\frac{\langle x\rangle^{\ell}}{k^{j-\ell}\sqrt{k|x-y|}}\le\frac{C\langle x\rangle^{j}}{\sqrt{k|x-y|}},\qquad k>c>0.$$
Applying the estimate
$$\int_{\mathbb{R}^2}\frac{\langle x\rangle^{-2-2\varepsilon}}{|x-y|}\,dx\le\frac{C}{\langle y\rangle},$$
we obtain the lemma. □


Lemma 3.2. Let $s>7/2$. Then $F(\lambda)=R_0^-(\lambda)VR^-(\lambda)$ satisfies the following estimate for $j=0,1,2$:
$$\Bigl|\frac{\partial^j}{\partial k^j}\langle F(k^2)VG^\pm_{k,y},VG^+_{k,x}\rangle\Bigr|\le\frac{C\|\langle x\rangle^{s}V\|_\infty^3}{k^3\sqrt{\langle x\rangle\langle y\rangle}},\qquad k\ge c>0. \tag{3.5}$$

Proof. It is well known (cf. e.g. [7]) that for $k\ge1$, $\varepsilon>0$ and $j=0,1,2$,
$$\|\langle x\rangle^{-j-(1/2)-\varepsilon}(\partial/\partial k)^jR^\pm(k^2)\langle x\rangle^{-j-(1/2)-\varepsilon}\|_{B(L^2)}\le C_j|k|^{-1}, \tag{3.6}$$
and that the similar estimates hold for the free resolvent $R_0^\pm(k^2)$. We differentiate by $k$ the function $\langle F(k^2)VG^\pm_{k,y},VG^+_{k,x}\rangle$ by using the Leibniz rule and applying the estimates (3.3) and (3.6). The estimate (3.5) follows. □

Completion of the proof of the proposition. If we set
$$T^\pm(x,y)=\frac{1}{\pi i}\int_0^\infty e^{-ik(|x|\mp|y|)}\langle F(k^2)VG^\pm_{k,y},VG^+_{k,x}\rangle k\tilde\chi(k^2)\,dk, \tag{3.7}$$
then, by virtue of (3.1), the integral kernel of $W_3\tilde\chi(H_0)$ is given by $T(x,y)=T^+(x,y)-T^-(x,y)$. It follows from (3.5) that the integral with respect to $k$ in the right of (3.7) converges absolutely. We apply integration by parts with respect to the $k$-variable using the identity
$$\frac{1}{\langle|x|\mp|y|\rangle^2}\Bigl(1-\frac{\partial^2}{\partial k^2}\Bigr)e^{-ik(|x|\mp|y|)}=e^{-ik(|x|\mp|y|)}.$$
The boundary terms do not appear because of the decay at high energy (3.5) and the low energy cut-off. The result is
$$T^\pm(x,y)=\frac{1}{\pi i\langle|x|\mp|y|\rangle^2}\int_0^\infty e^{-ik(|x|\mp|y|)}\Bigl(1-\frac{\partial^2}{\partial k^2}\Bigr)\Bigl(\langle F(k^2)VG^\pm_{k,y},VG^+_{k,x}\rangle k\tilde\chi(k^2)\Bigr)dk,$$
which is estimated by using (3.5). We obtain
$$|T^\pm(x,y)|\le\frac{C}{\langle|x|\mp|y|\rangle^2}\int_0^\infty\frac{k\tilde\chi(k^2)}{k^3\sqrt{\langle x\rangle\langle y\rangle}}\,dk\le\frac{C}{\sqrt{\langle x\rangle\langle y\rangle}\,\langle|x|\mp|y|\rangle^2}.$$
It follows that
$$\sup_{y\in\mathbb{R}^2}\int_{\mathbb{R}^2}|T^\pm(x,y)|\,dx\le C\sup_y\int\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}\langle|x|\mp|y|\rangle^{-2}dx<\infty,$$
$$\sup_{x\in\mathbb{R}^2}\int_{\mathbb{R}^2}|T^\pm(x,y)|\,dy\le C\sup_x\int\langle x\rangle^{-1/2}\langle y\rangle^{-1/2}\langle|x|\mp|y|\rangle^{-2}dy<\infty, \tag{3.8}$$
and Schur's lemma implies Proposition 3.1. □
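The identity used for the integration by parts in $k$ can be verified directly. The sketch below abbreviates $a=|x|\mp|y|$ and assumes the usual convention $\langle a\rangle=(1+a^2)^{1/2}$ for the bracket.

```latex
\begin{align*}
\frac{\partial^2}{\partial k^2}e^{-ika} &=(-ia)^2e^{-ika}=-a^2e^{-ika},\\
\Bigl(1-\frac{\partial^2}{\partial k^2}\Bigr)e^{-ika} &=(1+a^2)e^{-ika}=\langle a\rangle^2e^{-ika}.
\end{align*}
```

Dividing by $\langle a\rangle^2$ gives the identity. Moving $1-\partial^2/\partial k^2$ onto the amplitude by two integrations by parts is what produces the factor $\langle|x|\mp|y|\rangle^{-2}$ in the bound on $T^\pm(x,y)$.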


4. Low Energy Estimate

In this section, we show that the low energy part of $W_+$ is also bounded in $L^p$.

Proposition 4.1. Let the constant $c>0$ for defining the cut-off function $\chi$ be sufficiently small. Then, the operator $W_+\chi(H_0)$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$.

We again start from the stationary representation formula (2.1). Change the variable $\lambda$ to $k^2$ and write $R^-(k^2)V=R_0^-(k^2)V(1+R_0^-(k^2)V)^{-1}$:
$$W_+\chi(H_0)u-\chi(H_0)u=-\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)V(1+R_0^-(k^2)V)^{-1}\{R_0^+(k^2)-R_0^-(k^2)\}\chi(k^2)ku\,dk. \tag{4.1}$$
We need to estimate the operator defined by the integral on the right of (4.1). We begin with examining the asymptotic behavior as $k\to0$ of $(1+R_0^\pm(k^2)V)^{-1}$. In what follows, $s$ will denote an arbitrary constant satisfying $3<s<\delta/2$ unless explicitly stated otherwise. As was stated in (1.1), $R_0^-(k^2)$ has an asymptotic expansion
$$R_0^-(k^2)=c_-(k)P_0+G_0+O(k^2\log k),\qquad k\to+0, \tag{4.2}$$
where $O(k^2\log k)$ stands for a function of $k$ with values in $B(L^{2,s},H^{2,-s})$, such that for $j=0,1,2$
$$\|(d/dk)^jO(k^2\log k)\|_{B(L^{2,s},H^{2,-s})}\le C_jk^{2-j}|\log k|,\qquad 0<k<1/2. \tag{4.3}$$
Hence, writing $c_0=\int V(x)\,dx$, $V_0(x)=c_0^{-1}V(x)$, $P=P_0V_0$ and $c_1(k)=c_0c_-(k)$, we have
$$1+R_0^-(k^2)V=1+c_1(k)P+G_0V+O(k^2\log k),\qquad k\to0, \tag{4.4}$$
where now $O(k^2\log k)$ is a $B(L^{2,-s},H^{2,-s})$-valued function which satisfies the property (4.3) with the obvious change of the norm in the left. The operator $P$ is a one dimensional (non-orthogonal) projection and we set $Q=1-P$. Projections $P$ and $Q$ decompose $L^{2,-s}$ into the algebraic direct sum
$$L^{2,-s}=PL^{2,-s}\dot{+}QL^{2,-s},$$
and the mapping $u\mapsto(Pu,Qu)$ is an isomorphism if $(u_1,u_2)\in PL^{2,-s}\dot{+}QL^{2,-s}$ is equipped with the norm $\|u_1\|_{L^{2,-s}}+\|u_2\|_{L^{2,-s}}$. It is convenient to represent the operators in the matrix form according to this decomposition and $1+R_0^-(k^2)V$ may be written by virtue of (4.4) as
$$\begin{pmatrix}1_P+c_2(k)P & PG_0VQ\\ QG_0VP & 1_Q+QG_0VQ\end{pmatrix}+O(k^2\log k)\equiv\begin{pmatrix}M_{11}&M_{12}\\ M_{21}&M_{22}\end{pmatrix}+O(k^2\log k). \tag{4.5}$$
Here $c_2(k)=c_1(k)+d_1$ is a linear function of $\log k$: $c_2(k)=a+b\log k$ with $b\ne0$ and the meaning of $O(k^2\log k)$ is the same as in (4.4).


The operator $M_{22}=1_Q+QG_0VQ$ is invertible by Assumption 1.2 and $M_{11}$ is clearly invertible for small $k>0$ with $M_{11}^{-1}=c_3(k)P$, $c_3(k)=(1+c_2(k))^{-1}$. Thus, the first summand in (4.5) may be written as
$$\begin{pmatrix}M_{11}&0\\0&M_{22}\end{pmatrix}\left(1+\begin{pmatrix}0&M_{11}^{-1}M_{12}\\ M_{22}^{-1}M_{21}&0\end{pmatrix}\right). \tag{4.6}$$
Note that $c_3(k)=O((\log k)^{-1})$ as $k\to+0$. It follows that
$$1-\begin{pmatrix}0&M_{11}^{-1}M_{12}\\ M_{22}^{-1}M_{21}&0\end{pmatrix}^2=1-c_3(k)\begin{pmatrix}PM_{12}M_{22}^{-1}M_{21}&0\\0&M_{22}^{-1}M_{21}PM_{12}\end{pmatrix}$$
is invertible. Moreover, both $PM_{12}M_{22}^{-1}M_{21}$ and $M_{22}^{-1}M_{21}PM_{12}$ are of rank 1 and we have
$$c_3(k)PM_{12}M_{22}^{-1}M_{21}=c_4(k)(1\otimes V_0),\qquad c_3(k)M_{22}^{-1}M_{21}PM_{12}=c_3(k)(\psi\otimes\phi).$$
Here $c_4(k)=d_2c_3(k)$, $d_2$ being a constant, and the functions $\phi$ and $\psi$ are defined by
$$\phi(x)=(G_0VQ)^*V_{0,fun}(x)\quad\text{and}\quad\psi(x)=[(1_Q+QG_0VQ)^{-1}QG_0V_{fun}](x), \tag{4.7}$$
where $V_{fun}(x)=V(x)$ is considered as a function, not a multiplication operator. Thus the inverse may be computed explicitly to yield
$$\left(1-\begin{pmatrix}0&M_{11}^{-1}M_{12}\\ M_{22}^{-1}M_{21}&0\end{pmatrix}^2\right)^{-1}=\begin{pmatrix}c_5(k)1_P&0\\0&1_Q+c_6(k)\psi\otimes\phi\end{pmatrix},$$
where $c_5(k)$ and $c_6(k)$ are given by
$$c_5(k)=1+\frac{d_2}{c_2(k)+1-d_2},\qquad c_6(k)=\frac{1}{c_2(k)+1-(\phi,\psi)}. \tag{4.8}$$
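The constants in (4.8) come from inverting two rank-one perturbations of the identity. The following sketch records the computation under two assumptions: that $1\otimes V_0$ acts as the identity on $PL^{2,-s}$ (it is the projection $P$), and that $\kappa$ denotes the scalar with $(\psi\otimes\phi)^2=\kappa\,\psi\otimes\phi$, which corresponds to the pairing written $(\phi,\psi)$ in the text.

```latex
\begin{align*}
\bigl(1_P-c_4(k)P\bigr)^{-1}
 &=\frac{1}{1-d_2c_3(k)}\,1_P
  =\frac{1+c_2(k)}{1+c_2(k)-d_2}\,1_P
  =\Bigl(1+\frac{d_2}{c_2(k)+1-d_2}\Bigr)1_P ,\\
\bigl(1_Q-c_3(k)\,\psi\otimes\phi\bigr)^{-1}
 &=1_Q+\frac{c_3(k)}{1-\kappa\,c_3(k)}\,\psi\otimes\phi
  =1_Q+\frac{1}{c_2(k)+1-\kappa}\,\psi\otimes\phi .
\end{align*}
```

Both lines use $c_3(k)=(1+c_2(k))^{-1}$ and $c_4(k)=d_2c_3(k)$; the second can be checked by multiplying out and using $(\psi\otimes\phi)^2=\kappa\,\psi\otimes\phi$. The denominators appearing here are the quantities kept away from zero by the choice of $c$ that follows.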

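Hereafter we choose $0<c<1$ in such a way that the relation
$$\min\{|c_2(k)+1|,\ |c_2(k)+1-d_2|,\ |c_2(k)+1-(\phi,\psi)|\}>1\quad\text{for }0<k<2c \tag{4.9}$$
is satisfied. Since $c_2(k)$ is a linear function of $\log k$, such a choice is of course possible. In this way, we have proven that the following inverse exists:
$$\begin{pmatrix}M_{11}&M_{12}\\ M_{21}&M_{22}\end{pmatrix}^{-1}=\left(1+\begin{pmatrix}0&M_{11}^{-1}M_{12}\\ M_{22}^{-1}M_{21}&0\end{pmatrix}\right)^{-1}\begin{pmatrix}M_{11}^{-1}&0\\0&M_{22}^{-1}\end{pmatrix}$$
$$=\begin{pmatrix}1_P&-M_{11}^{-1}M_{12}\\-M_{22}^{-1}M_{21}&1_Q\end{pmatrix}\begin{pmatrix}c_5(k)P&0\\0&1_Q+c_6(k)(\psi\otimes\phi)\end{pmatrix}\begin{pmatrix}M_{11}^{-1}&0\\0&M_{22}^{-1}\end{pmatrix}$$
$$=\begin{pmatrix}c_5(k)PM_{11}^{-1}&-M_{11}^{-1}M_{12}(Q+c_6(k)\psi\otimes\phi)M_{22}^{-1}\\-c_5(k)M_{22}^{-1}M_{21}PM_{11}^{-1}&(Q+c_6(k)\psi\otimes\phi)M_{22}^{-1}\end{pmatrix}, \tag{4.10}$$
and, if we set
$$N(k)=(1+R_0^-(k^2)V)^{-1}-\begin{pmatrix}M_{11}&M_{12}\\ M_{21}&M_{22}\end{pmatrix}^{-1},$$
we have by virtue of (4.5), for $j=0,1,2$,
$$\|(d/dk)^jN(k)\|_{B(L^{2,-s})}\le C_jk^{2-j}\langle\log k\rangle,\qquad 0<k<2c. \tag{4.11}$$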

We denote the $(ij)$-component of the inverse by $B_{ij}$:
$$\begin{pmatrix}M_{11}&M_{12}\\ M_{21}&M_{22}\end{pmatrix}^{-1}=\begin{pmatrix}B_{11}&B_{12}\\ B_{21}&B_{22}\end{pmatrix}.$$

Lemma 4.1. The operator $A$ defined by
$$Au=-\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)VN(k)\{R_0^+(k^2)-R_0^-(k^2)\}\chi(k^2)ku\,dk,\qquad u\in L^{2,s}(\mathbb{R}^2),$$
may be extended to an operator bounded in $L^p(\mathbb{R}^2)$ for any $1\le p\le\infty$.

Proof. The proof goes virtually in the same way as that of Proposition 3.1 and we use the notation there. Note first that (4.11) implies that the integral converges as a Bochner integral in $L^{2,-s}(\mathbb{R}^2)$. We define $A^\pm$ by
$$A^\pm u=-\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)VN(k)R_0^\pm(k^2)\chi(k^2)ku\,dk,$$
so that $A=A^+-A^-$, and write $A^\pm(x,y)$ for their integral kernels. Then, we have
$$A^\pm(x,y)=\frac{-1}{\pi i}\int_0^\infty e^{-ik(|x|\mp|y|)}\langle N(k)G^\pm_{k,y},VG^+_{k,x}\rangle k\chi(k^2)\,dk.$$
Using the estimate (3.4) and (4.11), we obtain as in (3.5) that
$$\Bigl|\frac{\partial^j}{\partial k^j}\langle k\chi(k^2)N(k)G^\pm_{k,y},VG^+_{k,x}\rangle\Bigr|\le\frac{Ck^{2-j}\langle\log k\rangle}{\sqrt{\langle x\rangle\langle y\rangle}}. \tag{4.12}$$
Thus by integrating by parts twice as in (3.8), we obtain
$$|A^\pm(x,y)|\le\frac{C}{\langle|x|\mp|y|\rangle^2}\int_0^{2c}\frac{\langle\log k\rangle}{\sqrt{\langle x\rangle\langle y\rangle}}\,dk\le\frac{C}{\langle|x|\mp|y|\rangle^2\sqrt{\langle x\rangle\langle y\rangle}}.$$
It follows by Schur's lemma that $A^\pm$ are bounded in $L^p(\mathbb{R}^2)$ for any $1\le p\le\infty$, and so is $A$. □

We write $W_{ij}$, $i,j=1,2$, for the operator which is obtained by inserting the $(ij)$ component $B_{ij}$ of (4.10) in place of $(1+R_0^-(k^2)V)^{-1}$ in the integrand of (4.1):
$$W_{ij}u=-\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)VB_{ij}\{R_0^+(k^2)-R_0^-(k^2)\}\chi(k^2)ku\,dk. \tag{4.13}$$
$B_{22}$ decomposes into the rank one part and the remainder
$$B_{22}=B_{22}^{(1)}+B_{22}^{(2)}\equiv c_6(k)(\psi\otimes\phi)M_{22}^{-1}+QM_{22}^{-1},$$
and we decompose $W_{22}$ accordingly: $W_{22}=W_{22}^{(1)}+W_{22}^{(2)}$. A little computation shows that the operators $VB_{ij}$ may be written as
$$VB_{11}=c_{11}(k)(V\otimes V_0),\qquad VB_{12}=c_{12}(k)(V\otimes\phi_1),$$
$$VB_{21}=c_{21}(k)(V\psi\otimes V_0),\qquad VB_{22}^{(1)}=c_{22}(k)(V\psi\otimes\phi_1), \tag{4.14}$$
where $\psi$ is the function defined by (4.7),
$$\phi_1(x)=(M_{22}^{-1})^*(G_0VQ(1+QG_0VQ)^{-1}Q)^*V(x)$$
and $c_{ij}(k)$ are defined by
$$c_{11}(k)=(c_2(k)+1-d_2)^{-1},\qquad c_{12}(k)=-c_3(k)(1+d_3c_6(k)),$$
$$c_{21}(k)=-c_{11}(k),\qquad c_{22}(k)=c_0^{-1}c_6(k). \tag{4.15}$$
Here $c_0$ and $d_2$ are the same constants as before and $d_3$ is another constant. Thus, we see that operators $W_{11}$, $W_{12}$, $W_{21}$, $W_{22}^{(1)}$ and $W_{22}^{(2)}$ are essentially of the same form, viz. they are all of the form
$$\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)K\{R_0^+(k^2)-R_0^-(k^2)\}c(k)\chi(k^2)ku\,dk,$$
where $K$ is one of the integral operators on the right-hand sides of (4.14) (or $QM_{22}^{-1}$ for $W_{22}^{(2)}$) and $c(k)$ is the corresponding $c_{ij}(k)$ (or $c(k)=1$ for $W_{22}^{(2)}$). The following is well known in the Fourier analysis (cf. e.g. [13], p. 26).

Lemma 4.2. Let $m(\xi)\in C^1(\mathbb{R}^2\setminus\{0\})$ be bounded and satisfy $|\nabla m(\xi)|\le C|\xi|^{-1}$. Then, the Fourier multiplier $m(D)$ is a bounded operator in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$.

By virtue of the choice of the constant $c>0$ for the cut-off function $\chi$, it is easy to see that the functions $c_{ij}(|\xi|)\chi(\xi^2)$ satisfy the condition of Lemma 4.2 and we obtain the following lemma.

Lemma 4.3. For $1\le i,j\le2$, the Fourier multiplier $c_{ij}(|D|)\chi(H_0)$ is a bounded operator in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$.

The following two lemmas, Lemma 4.4 and Lemma 4.5, will be crucial in what follows. Lemma 4.4 has been proven in [16], however, we record its proof here for the convenience of readers. In the proof we use the fact that, for $s>1$, $R_0^\pm(k^2)$, $k>0$ is a locally Hölder continuous family of operators from $L^{2,s}(\mathbb{R}^2)$ to $L^\infty(\mathbb{R}^2)$ which is obvious from the estimate of the convolution kernel (3.4).

Lemma 4.4. Suppose that $K$ is the integral operator with the integral kernel $K(x,y)$ which satisfies
$$\int_{\mathbb{R}^2}\Bigl(\int_{\mathbb{R}^2}\langle x\rangle^{2s}|K(x,x-y)|^2dx\Bigr)^{1/2}dy\equiv|||K|||_s<\infty \tag{4.16}$$
for some $s>1$. Then the operator
$$Z_Ku(x)=-\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)K\{R_0^+(k^2)-R_0^-(k^2)\}ku\,dk,\qquad\hat u\in C_0^\infty(\mathbb{R}^2\setminus\{0\}),$$
can be extended to a bounded operator in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$ and $\|Z_Ku\|_p\le C_{ps}|||K|||_s\|u\|_p$.


Proof. We set $K_y(x)=K(x,x-y)$ and $\tau_yu(x)=u(x-y)$. Then for $u\in L^\infty(\mathbb{R}^2)$, $\|\langle x\rangle^{s}K_y\tau_yu\|_2\le\|\langle x\rangle^{s}K_y\|_2\|u\|_\infty$ and $y\mapsto K_y\tau_yu$ is $L^{2,s}(\mathbb{R}^2)$ Bochner integrable and
$$Ku(x)=\int_{\mathbb{R}^2}K(x,x-y)u(x-y)\,dy=\int_{\mathbb{R}^2}K_y\tau_yu(x)\,dy.$$
If $\hat u\in C_0^\infty(\mathbb{R}^2\setminus\{0\})$, then $\{R_0^+(k^2)-R_0^-(k^2)\}u$ is an $L^\infty(\mathbb{R}^2)$-valued Hölder continuous function of $k$ which is supported by a compact set of $(0,\infty)$. It follows by using Fubini's theorem in the third step that, for $v\in C_0^\infty(\mathbb{R}^2)$,
$$\Bigl\langle\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)K\{R_0^+(k^2)-R_0^-(k^2)\}ku\,dk,\ v\Bigr\rangle$$
$$=\frac{-1}{\pi i}\int_0^\infty\Bigl(\int_{\mathbb{R}^2}\langle R_0^-(k^2)K_y\tau_y\{R_0^+(k^2)-R_0^-(k^2)\}ku,v\rangle\,dy\Bigr)dk$$
$$=\frac{-1}{\pi i}\int_{\mathbb{R}^2}\Bigl(\int_0^\infty\langle R_0^-(k^2)K_y\{R_0^+(k^2)-R_0^-(k^2)\}k\tau_yu,v\rangle\,dk\Bigr)dy.$$
Recalling (2.2), we conclude that
$$\langle Z_Ku,v\rangle=\int_{\mathbb{R}^2}\langle W^{(1)}(K_y)\tau_yu,v\rangle\,dy$$
and Proposition 2.1 implies
$$|\langle Z_Ku,v\rangle|\le C_{ps}\Bigl(\int_{\mathbb{R}^2}\|\langle x\rangle^{s}K_y\|_2\,dy\Bigr)\|u\|_p\|v\|_{\bar p}=C_{ps}|||K|||_s\|u\|_p\|v\|_{\bar p}.$$
This completes the proof of the lemma. □

The second lemma concerns the integral kernel of the operator $M_{22}^{-1}=Q(1+QG_0VQ)^{-1}Q$. The assumption of Lemma 4.5 is much weaker than necessary for our purpose here, however, we state and prove it as it is for later convenience.

Lemma 4.5. Let $s>1$. Suppose that
$$\Bigl(\int_{|x-y|\le1}|\log|x-y||^2|V(y)|^2dy\Bigr)^{1/2}\le C\langle x\rangle^{-3s}$$
and that Assumption 1.2 is satisfied. Then, $(1+QG_0VQ)^{-1}-1$ is a Hilbert–Schmidt operator in $L^{2,-s}(\mathbb{R}^2)$. The integral kernel $K(x,y)$ of $V(1+QG_0VQ)^{-1}-V$ satisfies
$$\int_{\mathbb{R}^2}\Bigl(\int_{\mathbb{R}^2}|\langle x\rangle^{s}K(x,x-y)|^2dx\Bigr)^{1/2}dy<\infty. \tag{4.17}$$

Proof. Splitting the region of integration as $\mathbb{R}^2=\Omega_1\cup\Omega_2$, where $\Omega_1=\{y:|x-y|\le1\}$ and $\Omega_2=\{y:|x-y|>1\}$, and noticing that $\log|x-y|\le2(\log\langle x\rangle+\log\langle y\rangle)$ in $\Omega_2$, we obtain
$$\int_{\mathbb{R}^2}|\log|x-y||^2|V(y)|^2\langle y\rangle^{2s}dy\le C(1+\log\langle x\rangle)^2. \tag{4.18}$$


The integral kernel of $QG_0VQ=G_0V-PG_0V-G_0VP+PG_0VP$ is given by
$$\Delta(x,y)\equiv-\frac{1}{2\pi}(\log|x-y|)V(y)+\frac{1}{2\pi}\Bigl(\int V_0(z)\log|z-y|\,dz\Bigr)V(y)+\frac{1}{2\pi}\Bigl(\int V_0(z)\log|x-z|\,dz\Bigr)V(y)+d_1V(y). \tag{4.19}$$
It follows from (4.18) and (4.19) that $\langle x\rangle^{-s}\langle y\rangle^{s}\Delta(x,y)\in L^2(\mathbb{R}^2_x\times\mathbb{R}^2_y)$, and $QG_0VQ$ is a Hilbert–Schmidt operator in $L^{2,-s}(\mathbb{R}^2)$. Take a finite rank operator $F=\sum_{j=1}^N\xi_j\otimes\eta_j$ such that $\xi_j,\eta_j\in C_0^\infty(\mathbb{R}^2)$ and $\|QG_0VQ-F\|_{HS}\le1/2$, where $\|\cdot\|_{HS}$ is the Hilbert–Schmidt norm in the Hilbert space $L^{2,-s}(\mathbb{R}^2)$. Denote $L=QG_0VQ-F$. Then the series $\sum_{j=0}^\infty(-1)^jL^j$ converges in the Hilbert–Schmidt norm and $(1+L)^{-1}=\sum_{j=0}^\infty(-1)^jL^j$. Hence $\tilde L=(1+L)^{-1}-1$ is of Hilbert–Schmidt. By Assumption 1.2, $1+QG_0VQ$ is invertible. Hence $(1+(1+L)^{-1}F)$ is also invertible and
$$(1+QG_0VQ)^{-1}=(1+(1+L)^{-1}F)^{-1}(1+L)^{-1}.$$
Since $(1+L)^{-1}F$ is of finite rank, $\tilde F=(1+(1+L)^{-1}F)^{-1}-1$ is also of finite rank. It follows that $(1+QG_0VQ)^{-1}-1=\tilde F+\tilde L+\tilde F\tilde L$ is also a Hilbert–Schmidt operator in $L^{2,-s}(\mathbb{R}^2)$.

Denote the integral kernel of $L$ by $L(x,y)=\Delta(x,y)-F(x,y)$, $F(x,y)=\sum_{j=1}^N\xi_j(x)\eta_j(y)$. Then, the integral kernel $\tilde L(x,y)$ of $\tilde L$ is given by $\tilde L(x,y)=L(x,y)+L_1(x,y)$, where
$$L_1(x,y)=\sum_{j=2}^\infty(-1)^j\int\cdots\int L(x,x_{j-1})\cdots L(x_1,y)\,dx_1\cdots dx_{j-1}. \tag{4.20}$$
By using Schwarz' inequality repeatedly we have
$$|L_1(x,y)|\le\Bigl(\int|L(x,z)|^2\langle z\rangle^{2s}dz\Bigr)^{1/2}\Bigl(\int\langle z\rangle^{-2s}|L(z,y)|^2dz\Bigr)^{1/2}\sum_{j=2}^\infty\|L\|_{HS}^{j-2}. \tag{4.21}$$
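Because $\|L\|_{HS}\le1/2$ by the choice of $F$, the series on the right of (4.21) is a convergent geometric series, so $L_1$ is controlled by the two weighted integrals alone. A short sketch of this step:

```latex
\[
\sum_{j=2}^{\infty}\|L\|_{HS}^{\,j-2}
 =\sum_{m=0}^{\infty}\|L\|_{HS}^{\,m}
 =\frac{1}{1-\|L\|_{HS}}\le 2,
\qquad\text{so}\qquad
|L_1(x,y)|\le 2\Bigl(\int|L(x,z)|^2\langle z\rangle^{2s}dz\Bigr)^{1/2}
            \Bigl(\int\langle z\rangle^{-2s}|L(z,y)|^2dz\Bigr)^{1/2}.
\]
```

This is the form in which (4.21) enters the estimate of the $L_1$ contribution later in the proof.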

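By virtue of (4.19) and the fact that $F\in C_0^\infty(\mathbb{R}^4)$, we have
$$\int_{\mathbb{R}^2}|L(x,y)|^2\langle y\rangle^{2s}dy\le C(1+\log\langle x\rangle)^2. \tag{4.22}$$
Exploiting (4.22) and Lemma 4.6, which will be stated after the proof, we obtain that
$$\int\Bigl(\int|\langle x\rangle^{s}V(x)L(x,x-y)|^2dx\Bigr)^{1/2}dy\le C\Bigl(\iint|\langle x\rangle^{2s}\langle y\rangle^{s}V(x)L(x,y)|^2dxdy\Bigr)^{1/2}\le C\Bigl(\int(1+\log\langle x\rangle)^2|\langle x\rangle^{2s}V(x)|^2dx\Bigr)^{1/2}<\infty. \tag{4.23}$$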


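By using Lemma 4.6 again and (4.21), we estimate the corresponding integral for $L_1(x,y)$ as follows:
$$\int\Bigl(\int|\langle x\rangle^{s}V(x)L_1(x,x-y)|^2dx\Bigr)^{1/2}dy\le C\Bigl(\iint|\langle x\rangle^{2s}\langle y\rangle^{s}V(x)L_1(x,y)|^2dxdy\Bigr)^{1/2}$$
$$\le C\Bigl(\iint|\langle x\rangle^{2s}V(x)L(x,z)\langle z\rangle^{s}|^2dxdz\cdot\iint|\langle z\rangle^{-s}L(z,y)\langle y\rangle^{s}|^2dzdy\Bigr)^{1/2}$$
$$\le C\Bigl(\int|(1+\log\langle x\rangle)\langle x\rangle^{2s}V(x)|^2dx\cdot\int|(1+\log\langle z\rangle)\langle z\rangle^{-s}|^2dz\Bigr)^{1/2}<\infty. \tag{4.24}$$
Combining (4.23) and (4.24), we see the same estimate holds for $\tilde L(x,y)$:
$$\int\Bigl(\int|\langle x\rangle^{s}V(x)\tilde L(x,x-y)|^2dx\Bigr)^{1/2}dy<\infty. \tag{4.25}$$
Since $(1+L)^{-1}F=\sum_{j=1}^N((1+L)^{-1}\xi_j)\otimes\eta_j$, $\tilde F$ and $\tilde F\tilde L$ may also be written as linear combinations of $\tilde\xi_i\otimes\eta_j$ and $\tilde\xi_i\otimes\tilde\eta_j$, $1\le i,j\le N$, respectively, where $\tilde\xi_i=(1+L)^{-1}\xi_i=\xi_i+\tilde L\xi_i$ and $\tilde\eta_j=\tilde L^*\eta_j$. Recalling that $(1+L)^{-1}$ is bounded in $L^{2,-s}$ and using the relation (4.22), we estimate
$$|\tilde L\xi_i(x)|=|L(1+L)^{-1}\xi_i(x)|=\Bigl|\int L(x,y)\{(1+L)^{-1}\xi_i(y)\}\,dy\Bigr|\le\Bigl(\int|L(x,y)\langle y\rangle^{s}|^2dy\Bigr)^{1/2}\|\langle x\rangle^{-s}(1+L)^{-1}\xi_i\|\le C(1+\log\langle x\rangle)\|\xi_i\|_{L^{2,-s}}. \tag{4.26}$$
It follows that $|\tilde\xi_i(x)|\le C(1+\log\langle x\rangle)$ and $\|\langle x\rangle^{2s}V(x)\tilde\xi_i(x)\|_2<\infty$. Applying the estimate (4.26) with $\langle x\rangle^{s}f$ replacing $\xi_i$ to the third step, we obtain for any $f\in L^2$ that
$$|(\langle x\rangle^{s}\tilde\eta_j,f)|=|(\langle x\rangle^{s}(\tilde L^*\eta_j),f)|=|(\eta_j,\tilde L(\langle x\rangle^{s}f))|\le C\|\langle x\rangle^{s}f\|_{L^{2,-s}}\int_{\mathbb{R}^2}|\eta_j(x)|(1+\log\langle x\rangle)\,dx=C'\|f\|_2,\qquad f\in L^2(\mathbb{R}^2),$$
hence, $\|\langle x\rangle^{s}\tilde\eta_j\|_2<\infty$ by the duality. Then, by virtue of Lemma 4.6,
$$\int_{\mathbb{R}^2}\Bigl(\int|\langle x\rangle^{s}V(x)\tilde\xi_i(x)\eta_j(x-y)|^2dx\Bigr)^{1/2}dy\le C\|\langle x\rangle^{2s}V\tilde\xi_i\|_2\|\langle x\rangle^{s}\eta_j\|_2<\infty, \tag{4.27}$$
$$\int_{\mathbb{R}^2}\Bigl(\int|\langle x\rangle^{s}V(x)\tilde\xi_i(x)\tilde\eta_j(x-y)|^2dx\Bigr)^{1/2}dy\le C\|\langle x\rangle^{2s}V\tilde\xi_i\|_2\|\langle x\rangle^{s}\tilde\eta_j\|_2<\infty. \tag{4.28}$$
The combination of (4.25), (4.27) and (4.28) implies (4.17) and concludes the proof of the lemma. □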


The following lemma, which has already been employed in the proof of the previous lemma, will be of use several times in what follows.

Lemma 4.6. For $s>1$, we have the following inequality:
$$\int_{\mathbb{R}^2}\Bigl(\int_{\mathbb{R}^2}|\langle x\rangle^{s}K(x,x-y)|^2dx\Bigr)^{1/2}dy\le C_s\Bigl(\int_{\mathbb{R}^2\times\mathbb{R}^2}|\langle x\rangle^{2s}\langle y\rangle^{s}K(x,y)|^2dxdy\Bigr)^{1/2}.$$

Proof. Apply Schwarz' inequality to the $y$-integral in the left and estimate it by
$$\Bigl(\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}|K(x,x-y)\langle x\rangle^{s}\langle y\rangle^{s}|^2dxdy\Bigr)^{1/2}\|\langle y\rangle^{-s}\|_2.$$
Use the inequality $\langle y\rangle^{s}\le C_s\langle x\rangle^{s}\langle x-y\rangle^{s}$ to estimate $\langle y\rangle^{s}$ and change the variables $(x,y)\to(x,x-y)$. The lemma follows immediately. □
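The inequality $\langle y\rangle^{s}\le C_s\langle x\rangle^{s}\langle x-y\rangle^{s}$ used in the proof of Lemma 4.6 is Peetre's inequality. A minimal derivation, assuming the convention $\langle x\rangle=(1+|x|^2)^{1/2}$, gives the explicit constant $C_s=2^{s/2}$:

```latex
\begin{align*}
1+|y|^2 &\le 1+2|x|^2+2|x-y|^2
   && \text{since } |y|^2=|x-(x-y)|^2\le 2|x|^2+2|x-y|^2\\
        &\le 2\,(1+|x|^2)(1+|x-y|^2),\\
\langle y\rangle^{s} &\le 2^{s/2}\,\langle x\rangle^{s}\langle x-y\rangle^{s}
   && (s\ge 0).
\end{align*}
```

The factor $\|\langle y\rangle^{-s}\|_2$ produced by the Schwarz step is finite exactly when $2s>2$ in two dimensions, which is where the hypothesis $s>1$ enters.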

We are in a position to prove that the operators $W_{ij}$ and $W_{22}^{(i)}$ are bounded in $L^p$. We give a proof for $W_{11}$ separately to exhibit the typical argument to be exploited for the other operators in what follows.

Lemma 4.7. The operator $W_{11}$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$. For any $s>1$, we have
$$\|W_{11}u\|_p\le C_{ps}\|\langle x\rangle^{2s}V\|_2^2\|u\|_p,\qquad u\in L^{2,s}\cap L^p. \tag{4.29}$$

Proof. Replacing $VB_{11}$ in (4.13) by (4.14) and (4.15), we have
$$W_{11}u=\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)(V\otimes V_0)\{R_0^+(k^2)-R_0^-(k^2)\}c_{11}(k)\chi(k^2)ku\,dk. \tag{4.30}$$
Observing that
$$c_{11}(k)\chi(k^2)\{R_0^+(k^2)-R_0^-(k^2)\}u=\{R_0^+(k^2)-R_0^-(k^2)\}\chi(H_0)c_{11}(|D|)u$$
and recalling that $\chi(H_0)c_{11}(|D|)$ is bounded in $L^p$ for any $1<p<\infty$ by virtue of Lemma 4.2, we rewrite (4.30) in the form
$$W_{11}u=\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)(V\otimes V_0)\{R_0^+(k^2)-R_0^-(k^2)\}k\,c_{11}(|D|)\chi(H_0)u\,dk.$$
Here $K(x,y)=V(x)V_0(y)$ satisfies the condition (4.16):
$$\int_{\mathbb{R}^2}\Bigl(\int_{\mathbb{R}^2}|\langle x\rangle^{s}V(x)V_0(x-y)|^2dx\Bigr)^{1/2}dy\le C_s\|\langle x\rangle^{2s}V\|_2^2<\infty$$
for any $1<s<\delta/2$. Lemma 4.7 follows by virtue of Lemma 4.4. □

Lemma 4.8. The operators $W_{12}$, $W_{21}$ and $W_{22}^{(1)}$ are bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$.


Proof. The argument in the proof of the previous Lemma 4.7 clearly implies, by virtue of Lemma 4.4 and Lemma 4.3, that it suffices for proving the lemma to show that, if $K(x,y)$ represents any one of the functions $V(x)\phi_1(y)$, $V(x)\psi(x)V_0(y)$ and $V(x)\psi(x)\phi_1(y)$ which appear in the right of (4.14), then
$$\int_{\mathbb{R}^2}\|\langle x\rangle^{s}K_y\|_2\,dy<\infty,\qquad K_y(x)=K(x,x-y) \tag{4.31}$$
for some $s>1$. For proving (4.31), we use the following lemma. We set $R=(1+QG_0VQ)^{-1}-1$.

Lemma 4.9. Let $s>1$ be as in Lemma 4.5. Then:
$$|G_0Vf(x)|+|Rf(x)|\le C(1+\log\langle x\rangle)\|\langle x\rangle^{-s}f\|_2,\qquad f\in L^{2,-s}(\mathbb{R}^2). \tag{4.32}$$

Proof. We use the notation of the proof of Lemma 4.5. Schwarz' inequality and (4.18) imply
$$|G_0Vf(x)|\le\frac{1}{2\pi}\int|\log|x-y||\,|V(y)f(y)|\,dy\le C(1+\log\langle x\rangle)\|\langle x\rangle^{-s}f\|_2.$$
For proving the corresponding estimate for $R$ we write $R=\tilde L+\sum_{i,j=1}^Nc_{ij}\tilde\xi_i\otimes\eta_j+\sum_{i,j=1}^N\tilde c_{ij}\tilde\xi_i\otimes\tilde\eta_j$. The estimate (4.26) with $f$ in place of $\xi_i$ shows that $|\tilde Lf(x)|\le C(1+\log\langle x\rangle)\|\langle x\rangle^{-s}f\|_2$. As was shown in the proof of Lemma 4.5, $\eta_j$ and $\tilde\eta_j$, $1\le j\le N$, belong to $L^{2,s}$ and $|\tilde\xi_i(x)|\le C(1+\log\langle x\rangle)$. Hence $|\tilde\xi_i(x)||(\eta_j,f)|$ and $|\tilde\xi_i(x)||(\tilde\eta_j,f)|$ are both bounded by a constant times $(1+\log\langle x\rangle)\|\langle x\rangle^{-s}f\|_2$ for all $1\le i,j\le N$. We obtain (4.32) by combining these estimates. □

Completion of the proof of Lemma 4.8. We prove (4.31). By virtue of Lemma 4.6, it suffices to show $\langle x\rangle^{2s}V$, $\langle x\rangle^{2s}V\psi$ and $\langle x\rangle^{s}\phi_1\in L^2(\mathbb{R}^2)$ for some $s>1$. By Assumption 1.1, it is obvious that $\langle x\rangle^{2s}V\in L^2(\mathbb{R}^2)$ if $1<s<(\delta-1)/2$. Recall that $\psi(x)=(1+QG_0VQ)^{-1}QG_0V(x)=QG_0V(x)+RG_0V(x)$. Then the application of (4.32) implies that $|\psi(x)|\le C(1+\log\langle x\rangle)$ and $\langle x\rangle^{2s}V\psi\in L^2(\mathbb{R}^2)$ if $1<s<(\delta-1)/2$. By virtue of Lemma 4.5, the operator $Q(1+QG_0VQ)^{-1}Q$ is bounded in $L^{2,-s}$. Hence (4.32) implies
$$|G_0VQ(1+QG_0VQ)^{-1}Q\langle x\rangle^{s}f(x)|\le C(1+\log\langle x\rangle)\|\langle x\rangle^{s}f\|_{L^{2,-s}}=C(1+\log\langle x\rangle)\|f\|_2.$$
It follows for any $f\in L^2$ that
$$|(\langle x\rangle^{s}\phi_1,f)|=|(V,G_0VQ(1+QG_0VQ)^{-1}Q\langle x\rangle^{s}f)|\le C\Bigl(\int|V(x)|(1+\log\langle x\rangle)\,dx\Bigr)\|f\|_2$$
and we have $\langle x\rangle^{s}\phi_1\in L^2(\mathbb{R}^2)$ by the duality. This completes the proof of the lemma. □

The following lemma completes the proof of Proposition 4.1, hence of Theorem 1.3.

Lemma 4.10. The operator $W_{22}^{(2)}$ is bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$.


Proof. We decompose the operator $VQM_{22}^{-1}=VQ(1+QG_0VQ)^{-1}Q$ as follows:
$$VQ(1+QG_0VQ)^{-1}Q=V+V\{(1+QG_0VQ)^{-1}-1\}-V(1+QG_0VQ)^{-1}P-VP(1+QG_0VQ)^{-1}+VP(1+QG_0VQ)^{-1}P\equiv T_0+T_1+T_2+T_3+T_4,$$
and, for $j=0,\dots,4$, define the operator $Z_j$ by the integral
$$Z_ju=\frac{-1}{\pi i}\int_0^\infty R_0^-(k^2)T_j\{R_0^+(k^2)-R_0^-(k^2)\}\chi(k^2)ku\,dk,$$
so that $W_{22}^{(2)}=\sum_{j=0}^4Z_j$. Recalling (2.2), we have $Z_0=W^{(1)}\chi(H_0)$ and Proposition 2.1 implies $Z_0$ is bounded in $L^p(\mathbb{R}^2)$ for $1<p<\infty$. We prove that the operators $Z_1,\dots,Z_4$ are bounded in $L^p(\mathbb{R}^2)$ for any $1<p<\infty$ by the argument of the proof of Lemma 4.7. Thus, it suffices to show that the integral kernel $T_j(x,y)$ of $T_j$ satisfies (4.16) with $T_j$ in place of $K$ for $1\le j\le4$. We have already shown in Lemma 4.5 that $T_1(x,y)$ satisfies (4.16). The operators $T_2$, $T_3$ and $T_4$ are of rank one and their integral kernels are given as follows:

$T_2(x,y)=\kappa_1(x)V_0(y)$, $\kappa_1=V(1+QG_0VQ)^{-1}1\in L^{2,\delta-s}\subset L^{2,2s}$, for all $1<s<2$;
$T_3(x,y)=V(x)\kappa_2(y)$, $\kappa_2=((1+QG_0VQ)^{-1})^*V_0\in L^{2,s}$, for some $1<s<\delta-1$;
$T_4(x,y)=d_3V(x)V(y)$.

It is then obvious, by virtue of Lemma 4.6, that $T_j(x,y)$ for $j=2,3,4$ satisfies (4.16). This completes the proof. □

References

1. Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Ann. Scuola Norm. Sup. Pisa Ser. IV, 2, 151–218 (1975)
2. Bollé, D., Gesztesy, F. and Danneels, C.: Threshold scattering in two dimensions. Ann. Inst. Henri Poincaré 48, 175–204 (1988)
3. Galtbayar, A. and Yajima, K.: Lp-boundedness of wave operators for one dimensional Schrödinger operators. Preprint, The University of Tokyo (1999)
4. Kato, T.: Growth properties of solutions of the reduced wave equation with variable coefficients. Comm. Pure Appl. Math. 12, 403–422 (1959)
5. Kato, T. and Kuroda, S.T.: Theory of simple scattering and eigenfunction expansions. In: Functional analysis and related fields. Berlin–Heidelberg–New York: Springer-Verlag, 1970, pp. 99–131
6. Kuroda, S.T.: Scattering theory for differential operators, I and II. J. Math. Soc. Japan 25, 75–104 and 222–234 (1972)
7. Murata, M.: Asymptotic expansions in time for solutions of Schrödinger-type equations. J. Funct. Analysis 49, 10–56 (1982)
8. Jensen, A.: Results in Lp(Rd) for the Schrödinger equation with a time dependent potential. Math. Ann. 299, 117–125 (1994)
9. Jensen, A. and Nakamura, S.: Mapping properties of functions of Schrödinger operators between Lp spaces and Besov spaces. In: Spectral and scattering theory and applications. Advanced Studies in Pure Math. 22, Tokyo: Kinokuniya, 1994, pp. 187–210
10. Hardy, G., Littlewood, J.E. and Polya, G.: Inequalities. Second ed., Cambridge: Cambridge Univ. Press, 1952
11. Simon, B.: Schrödinger semigroups. Bull. Am. Math. Soc. 7, 447–526 (1982)
12. Shenk, N. and Thoe, D.: Outgoing solution of $(-\Delta+q-k^2)u=f$ in an exterior domain. J. Math. Anal. Appl. 31, 81–116 (1970)
13. Stein, E.M.: Harmonic analysis: Real-variable methods, orthogonality, and oscillatory integrals. Princeton, NJ: Princeton University Press, 1993
14. Weder, R.: The Wk,p-continuity of the Schrödinger wave operators on the line. Preprint, UNAM (1999), to appear in Commun. Math. Phys.
15. Yajima, K.: The Wk,p-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47, 551–581 (1995)
16. Yajima, K.: The Wk,p-continuity of wave operators for Schrödinger operators III. J. Math. Sci. Univ. Tokyo 2, 311–346 (1995)

Communicated by B. Simon

Commun. Math. Phys. 208, 153 – 172 (1999)


© Springer-Verlag 1999

Localization of Surface Spectra

Vojkan Jakšić¹, Stanislav Molchanov²

¹ Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, ON, K1N 6N5, Canada

² Department of Mathematics, University of North Carolina, Charlotte, NC 28223, USA

Received: 3 December 1998 / Accepted: 27 May 1999

Abstract: We study spectral properties of the discrete Laplacian $H$ on the half-space $\mathbb{Z}^{d+1}_+ = \mathbb{Z}^d \times \mathbb{Z}_+$ with random boundary condition $\psi(n, -1) = \lambda V(n) \psi(n, 0)$; the $V(n)$ are independent random variables on a probability space $(\Omega, \mathcal{F}, P)$ and $\lambda$ is the coupling constant. It is known that if the $V(n)$ have densities, then on the interval $[-2(d+1), 2(d+1)]$ ($= \sigma(H_0)$, the spectrum of the Dirichlet Laplacian) the spectrum of $H$ is $P$-a.s. absolutely continuous for all $\lambda$ [JL1]. Here we show that if the random potential $V$ satisfies the assumption of Aizenman–Molchanov [AM], then there are constants $\lambda_d$ and $\Lambda_d$ such that for $|\lambda| < \lambda_d$ and $|\lambda| > \Lambda_d$ the spectrum of $H$ outside $\sigma(H_0)$ is $P$-a.s. pure point with exponentially decaying eigenfunctions.

1. Introduction This paper deals with the spectral theory of the discrete Laplacian on a half-space with a random boundary condition. The history of this problem and its physical aspects are discussed in [JMP,KP]. For some recent rigorous work on the subject we refer the reader to [AM,BS,Gri,JM1,JM2,JMP,JL1,JL2,KP,M1,P]. In this section we introduce the model, review some known results and state our theorems. At the end of the section we will briefly explain the basic ideas of our proofs and discuss some open problems.

1.1. The model. Let $d \ge 1$ be given and let $\mathbb{Z}^{d+1}_+ := \mathbb{Z}^d \times \mathbb{Z}_+$, where $\mathbb{Z}_+ = \{0, 1, \dots\}$. We denote the points in $\mathbb{Z}^{d+1}_+$ by $(n, x)$, for $n \in \mathbb{Z}^d$ and $x \in \mathbb{Z}_+$. Let $H$ be the discrete Laplacian on $l^2(\mathbb{Z}^{d+1}_+)$ with boundary condition $\psi(n, -1) = V(n)\psi(n, 0)$. When $V = 0$ the operator $H$ reduces to the Dirichlet Laplacian, which we denote by $H_0$. The operator $H$ acts as

$$(H\psi)(n, x) = \begin{cases} \sum_{|n - n'|_+ + |x - x'| = 1} \psi(n', x') & \text{if } x > 0, \\ \psi(n, 1) + \sum_{|n - n'|_+ = 1} \psi(n', 0) + V(n)\psi(n, 0) & \text{if } x = 0, \end{cases}$$

where $|n|_+ = \sum_{i=1}^{d} |n_i|$. Note that the operator $H$ can be viewed as the Schrödinger operator

$$H = H_0 + V, \qquad (1.1)$$

where the potential $V$ acts only along the boundary $\partial\mathbb{Z}^{d+1}_+ = \mathbb{Z}^d$, that is, $(V\psi)(n, x) = 0$ if $x > 0$ and $(V\psi)(n, 0) = V(n)\psi(n, 0)$. For many purposes, it is convenient to adopt this point of view and we will do so in the sequel. Since $H_0$ is bounded, the operator $H$ is properly defined as a self-adjoint operator. We recall that the spectrum of $H_0$ is purely absolutely continuous and that $\sigma(H_0) = [-2(d+1), 2(d+1)]$.

We are interested in the spectral results which hold for "almost every" boundary potential $V$. More precisely, let $\Omega$ be the set of all boundary potentials, that is, the functions $V : \mathbb{Z}^d \to \mathbb{R}$. The set $\Omega$ can be identified with

$$\Omega = \mathbb{R}^{\mathbb{Z}^d} = \times_{\mathbb{Z}^d}\, \mathbb{R}.$$

Let $\mathcal{F}$ be the $\sigma$-algebra in $\Omega$ generated by the cylinder sets $\{V : V(n_1) \in B_1, \dots, V(n_k) \in B_k\}$, where $B_1, \dots, B_k$ are Borel subsets of $\mathbb{R}$. For each $n \in \mathbb{Z}^d$ let $\mu_n$ be a probability measure on $\mathbb{R}$, and let $P$ be a measure on $(\Omega, \mathcal{F})$ defined by

$$P := \times_{n \in \mathbb{Z}^d}\, \mu_n.$$

Note that $\mu_n$ is the probability distribution of the random variable $\Omega \ni V \mapsto V(n)$. We say that the random variable $V(n)$ has a density if the measure $\mu_n$ is absolutely continuous with respect to the Lebesgue measure. Obviously, the random variables $\{V(n)\}$ are independent¹, and we say that they are i.i.d. if all the measures $\mu_n$ are equal to $\mu$. Recall that the topological support of $\mu$, $\operatorname{supp}\mu$, is the complement of the largest open set $B$ such that $\mu(B) = 0$.

Let $U_0$ be a given background boundary potential on $\mathbb{Z}^d$. We will always assume that $U_0$ is bounded. In this paper we will study the operators

$$H = H_0 + U_0 + \lambda V, \qquad V \in \Omega. \qquad (1.2)$$
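As a concrete illustration of how the operator (1.2) acts, here is a minimal Python sketch restricted to the case $d = 1$ and to finitely supported $\psi$. The function names (`apply_H`, `make_potential`, `inner`) and the choice of standard Gaussian samples for $V(n)$ are illustrative assumptions, not part of the paper.

```python
import random

def apply_H(psi, potential):
    """Apply H = H0 + U0 + lam*V to a finitely supported psi (case d = 1).

    psi maps points (n, x) of Z x Z_+ to numbers; potential(n) returns the
    total boundary potential U0(n) + lam * V(n) (a hypothetical helper).
    """
    out = {}
    for (n, x), value in psi.items():
        # Nearest neighbours inside the half-space (nothing lives at x = -1).
        neighbours = [(n - 1, x), (n + 1, x), (n, x + 1)]
        if x > 0:
            neighbours.append((n, x - 1))
        for point in neighbours:
            out[point] = out.get(point, 0.0) + value
        if x == 0:
            # The boundary condition acts as a potential on the layer x = 0.
            out[(n, x)] = out.get((n, x), 0.0) + potential(n) * value
    return out

def inner(f, g):
    """Real inner product of two finitely supported functions."""
    return sum(v * g.get(p, 0.0) for p, v in f.items())

def make_potential(lam, u0=0.0, seed=0):
    """Boundary potential u0 + lam * V(n), with V(n) i.i.d. standard Gaussian
    (an assumed distribution); samples are cached so V(n) is consistent."""
    rng = random.Random(seed)
    cache = {}
    def potential(n):
        if n not in cache:
            cache[n] = rng.gauss(0.0, 1.0)
        return u0 + lam * cache[n]
    return potential
```

Applying `apply_H` to the unit vector at $(0, 0)$ returns the value $U_0(0) + \lambda V(0)$ at $(0, 0)$ and the value 1 at the three neighbours $(\pm 1, 0)$ and $(0, 1)$, in agreement with the case $x = 0$ of the formula for $H\psi$.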

Here, $\lambda$ is a real constant which measures the strength of the disorder. As usual in the theory of random Schrödinger operators, we are interested in the spectral properties of $H$ which hold $P$-a.s., that is, for a set of $V$'s of $P$-measure 1. For additional information about random Schrödinger operators we refer the reader to [CL,CFKS]. Let us briefly summarize the known results about the model (1.2).

¹ We remark that the method of Aizenman–Molchanov (and therefore of our paper) easily allows for correlated random variables. For notational simplicity, however, we will deal only with independent random variables.

(1) For any $V$, the wave operators

$$W^{\pm} = \operatorname{s-lim}_{t \to \mp\infty} e^{itH} e^{-itH_0}$$

exist. In particular, $\sigma(H_0) \subset \sigma_{ac}(H)$. Moreover, if the random variables $V(n)$ have densities, then the spectrum of $H$ in $\sigma(H_0)$ is $P$-a.s. purely absolutely continuous. These results are proven in [JL1,JL2]. We emphasize that the first result is deterministic while the second is random: there are examples of potentials $V$ (which even satisfy $\lim_{|n| \to \infty} V(n) = 0$) such that $H_0 + V$ has embedded eigenvalues in $\sigma(H_0)$ [MV].

(2) If $U_0 = \text{const.}$ and the random variables $\{V(n)\}$ are i.i.d. with distribution $\mu$, then (1.2) is an ergodic family of random operators. In particular, it follows from the standard argument that there exists a set $\Sigma$ such that $\sigma(H) = \Sigma$ $P$-a.s. The set $\Sigma$ can be computed (see [JL1]). We set $\lambda = 1$ and absorb $U_0$ in $V$. Let

$$S := \{E + a + a^{-1} : E \in [-2d, 2d],\ a \in \operatorname{supp}\mu \text{ and } |a| \ge 1\}.$$

Then $\Sigma = \sigma(H_0) \cup S$. Note that whenever $\operatorname{supp}\mu \cap (\mathbb{R} \setminus [-1, 1]) \ne \emptyset$, the set $\Sigma$ has parts lying outside $\sigma(H_0)$.

(3) Assume that $d = 1$, $U_0 = \text{const.}$ and that the random variables $\{V(n)\}$ are i.i.d. with distribution $\mu$. Assume that $d\mu = p(x)dx$, that $p \in L^\infty(\mathbb{R})$ and that the topological boundary of $\operatorname{supp}\mu$ is a discrete set. Under these assumptions it was shown in [JM1] that for any $\lambda$ the spectrum of $H$ outside $\sigma(H_0)$ is $P$-a.s. pure point and that the corresponding eigenfunctions decay faster than any polynomial in the $n$-variable, and exponentially fast in the $x$-variable. Unfortunately, the techniques of [JM1] are sensitive to the addition of (even periodic) background potentials $U_0$. If however $\operatorname{supp}\mu$ is an unbounded set and $p \in L^\infty(\mathbb{R})$, then for any bounded background potential $U_0$ and all $\lambda$, the spectrum of $H = H_0 + U_0 + \lambda V$ outside $\sigma(H_0)$ is $P$-a.s. pure point and the corresponding eigenfunctions decay as above. Although this last result was not explicitly stated in [JM1], it is an easy consequence of the results proven in [JM1,JM4].
(4) In [AM] and [Gri] it is shown that for arbitrary dimensions we have localization away from the edges of $\sigma(H_0)$, that is, for all $\lambda$ there exists $\delta(\lambda) > 0$ such that the spectrum of $H$ in the set $\{E : |E| > 2(d+1) + \delta(\lambda)\}$ is $P$-a.s. pure point with exponentially decaying eigenfunctions. Moreover, $\delta(\lambda) \downarrow 0$ as $|\lambda| \uparrow \infty$. Similar results hold for fixed $\lambda$ and large $|E|$. The assumption on the $\mu_n$'s under which this result is proven in [AM] is Hypothesis B(d) below (which should hold for some $0 < s < 1$). In [Gri], the assumption on the $\mu_n$'s is the usual assumption of multiscale analysis.

This work grew from our attempts to extend the results of (3) to $d > 1$ and thus improve the results of [AM] and [Gri]. More precisely, we are seeking conditions on $\lambda$ and the $\mu_n$'s under which the operator $H$ has $P$-a.s. only pure point spectrum outside $\sigma(H_0)$. Such a result and (1) would yield that $P$-a.s.

$$\sigma_{ac}(H) = \sigma(H_0), \qquad \sigma_{pp}(H) = \sigma(H) \setminus \sigma(H_0), \qquad \sigma_{sc}(H) = \emptyset. \qquad (1.3)$$

For $d = 1$, (1.3) follows from (1) and (3) above.

1.2. The results. For $0 < s < 1$ we set

$$k_s(n) := \inf_{\alpha, \beta \in \mathbb{C}} \frac{\int (|x - \alpha|^s / |x - \beta|^s)\, d\mu_n(x)}{\int (1 / |x - \beta|^s)\, d\mu_n(x)}, \qquad K_s(n) := \sup_{\beta \in \mathbb{C}} \frac{\int (|x|^s / |x - \beta|^s)\, d\mu_n(x)}{\int (1 / |x - \beta|^s)\, d\mu_n(x)}, \qquad (1.4)$$

and

$$k_s := \liminf_{n \to \infty} k_s(n), \qquad K_s := \limsup_{n \to \infty} K_s(n). \qquad (1.5)$$

We will use the conventions $0^{-1} = \infty$, $\infty^{-1} = 0$. Certain positive constants $c_d(s)$ will play an important role in this paper. These constants are defined at the end of Sect. 1.4 by Relation (1.22). We mention only that $c_d(s)$ is defined for $s > d/(d+1)$ and that $]d/(d+1), \infty[\, \ni s \mapsto c_d(s)$ is a strictly decreasing $C^\infty$ function with $c_d(1) = 1$. We set

$$\Lambda_d := \inf\left\{\lambda : \lambda > \left[(c_d(s) + 2d)\, k_s^{-1}\right]^{1/s} \text{ for some } s \in\, ]d/(d+1), 1[\right\},$$
$$\lambda_d := \sup\left\{\lambda : \lambda < \left[c_d(s) K_s\right]^{-1/s} \text{ for some } s \in\, ]d/(d+1), 1[\right\}, \qquad (1.6)$$

where we use the convention $\inf \emptyset = \infty$. We make the following hypotheses:

Hypothesis A. For all $n$, the measure $\mu_n$ is absolutely continuous with respect to the Lebesgue measure.

Hypothesis B(d). $k_s > 0$ for some $s \in\, ]d/(d+1), 1[$.

Hypothesis C(d). $K_s < \infty$ for some $s \in\, ]d/(d+1), 1[$.

Hypotheses B(d) and C(d) ensure that $\lambda_d$ and $\Lambda_d$ are finite positive numbers. Note that these hypotheses require that $k_s > 0$ and $K_s < \infty$ for values of $s$ close to 1. In this respect, our results differ from the localization results in [A,AM]. Various conditions under which Hypotheses B(d) and C(d) hold are discussed in [A,AM,Gra,M1]. For example, they hold if the random variables $\{V(n)\}$ are i.i.d. with any of the following distributions: (a) the uniform distribution in some interval, (b) the Gaussian distribution, (c) the Cauchy distribution. Hypotheses B(d) and C(d) also allow for random potentials such that $V$ or $V^{-1}$ vanish at infinity in a suitable probabilistic sense. We will discuss Hypotheses B(d) and C(d) in more detail in Sect. 1.3. Our main result is


Theorem 1.1. Assume that Hypotheses A and B(d) hold. Let $U_0$ be an arbitrary bounded background potential and $H = H_0 + U_0 + \lambda V$, $V \in \Omega$. Then for any $|\lambda| > \Lambda_d$ the operator $H$ has $P$-a.s. only pure point spectrum outside $\sigma(H_0)$ with exponentially decaying eigenfunctions.

As we will explain in Sect. 1.5, it is not likely that Theorem 1.1 holds for arbitrary $\lambda$ if the dimension $d + 1$ is sufficiently high. However, if the background potential is equal to zero, we can deal with the weak coupling regime.

Theorem 1.2. Assume that Hypotheses A and C(d) hold and let $H = H_0 + \lambda V$, $V \in \Omega$. Then for $|\lambda| < \lambda_d$ the operator $H$ has $P$-a.s. only pure point spectrum outside $\sigma(H_0)$ with exponentially decaying eigenfunctions.

Remark 1. If $\lambda_d \|V\| \le 1$ $P$-a.s., then for $|\lambda| \le \lambda_d$, the operator $H$ has $P$-a.s. no spectrum outside $\sigma(H_0)$. Thus, for bounded random variables, the above theorem could be an empty statement. Using densities of the form $\alpha p(x) + (1 - \alpha)\ell^{-1} p(\ell^{-1} x)$, $\alpha \in\, ]0, 1[$, $\ell > 0$, one can construct a large class of i.i.d. bounded random variables for which $\lambda_d \|V\|_\infty > 1$. In this case, for $\|V\|_\infty^{-1} < |\lambda| < \lambda_d$ the operator $H$ has some essential spectrum outside $\sigma(H_0)$, and Theorem 1.2 asserts that this spectrum is $P$-a.s. pure point with exponentially decaying eigenfunctions.

Remark 2. If the random variables $\{V(n)\}$ are i.i.d. and unbounded, then for all $\lambda \ne 0$ the operator $H$ has $P$-a.s. some essential spectrum outside $\sigma(H_0)$. For example, if the random variables $\{V(n)\}$ are i.i.d. with the Gaussian or Cauchy distribution, then for all $\lambda \ne 0$, $\sigma(H) = \mathbb{R}$ $P$-a.s., and the theorem asserts that for $\lambda$ sufficiently small the spectrum of $H$ in $\mathbb{R} \setminus \sigma(H_0)$ is $P$-a.s. pure point with exponentially decaying eigenfunctions.

Remark 3. We will discuss below some non-i.i.d. examples for which Theorems 1.1 and 1.2 hold for all $\lambda \ne 0$.

1.3. Examples. We first consider the case where the $\{V(n)\}$ are i.i.d. random variables with distribution $d\mu = p(x)dx$. In this case the constants in (1.4) are equal respectively to $k_s$ and $K_s$. In this section we will use the shorthand $\langle x \rangle = \sqrt{1 + x^2}$. Hypothesis B(d) holds for all $d$ if $p \in L^\infty(\mathbb{R})$. Moreover, there are explicit constants $c_s$, which depend on $s$ only, such that $k_s \ge c_s \|p\|_\infty^{-s}$ (for the proof see [Gra]). If

$$\int |x|^\gamma p(x)\, dx < \infty, \qquad (1.7)$$

and $p$ is piecewise continuous and strictly monotone for large $|x|$, then $K_s < \infty$ for $s < \min(1, \gamma/2)$ (see [AM]). Thus, if in addition $\gamma > 2d/(d+1)$, C(d) holds. In particular, for the Gaussian distribution, C(d) holds for all $d$. The above criterion fails for the Cauchy distribution even if $d = 1$.


If $p(x) \le C\langle x \rangle^{-1-\alpha}$ for some $\alpha > 0$, then $K_s < \infty$ for $s < \min(1, \alpha/2)$. The proof of this result is elementary and we will skip it. Thus, if in addition $\alpha > 2d/(d+1)$, C(d) holds. In particular, for the Cauchy distribution, C(d) holds for all $d$. We remark that for the Cauchy distribution the integrals in (1.4) can be explicitly evaluated (see [M2]) and one can take $K_s = 1/\cos(s\pi/2)$, irrespective of the parameters of the distribution. A different condition under which $K_s < \infty$ has been discussed in [A], Appendix I. The condition of Aizenman, however, requires that $s < 1/3$, and is not applicable in our case.

An interesting class of non-i.i.d. examples arises as follows. Let $\{a_n\}_{n \in \mathbb{Z}^d}$ be a real sequence with $a_n \ne 0$ and let $\{W(n)\}$ be i.i.d. random variables with distribution $d\mu = p(x)dx$. We denote the constants (1.4) associated to $W$ by $k_{s,w}$ and $K_{s,w}$. Let

$$V(n) := a_n W(n). \qquad (1.8)$$

Then the distribution of $V(n)$ is $d\mu_n(x) = |a_n|^{-1} p(a_n^{-1} x)\, dx$ and $k_s(n) = |a_n|^s k_{s,w}$, $K_s(n) = |a_n|^s K_{s,w}$. In particular, if B(d) holds for $\{W(n)\}$ and $\lim |a_n| = \infty$, then Theorem 1.1 holds for all $\lambda \ne 0$. If C(d) holds for $\{W(n)\}$ and $\lim |a_n| = 0$, then Theorem 1.2 holds for all $\lambda$. To illustrate these results with a concrete example, take $a_n = \langle n \rangle^\beta$ and assume that $\{W(n)\}$ has either the Cauchy or Gaussian distribution. Let $V$ be given by (1.8) and $H = H_0 + V$. Then it follows from Theorems 1.1 and 1.2 that for any $\beta \ne 0$ the operator $H$ has $P$-a.s. only pure point spectrum outside $\sigma(H_0)$ with exponentially decaying eigenfunctions. One can show that in the case of the Cauchy distribution, $\sigma_{ess}(H) = \mathbb{R}$ $P$-a.s. if $\beta \in [-d, d]$, and that $\sigma_{ess}(H) = \sigma(H_0)$ $P$-a.s. if $\beta \notin [-d, d]$. In the case of the Gaussian distribution, $\sigma_{ess}(H) = \mathbb{R}$ $P$-a.s. if $\beta \in [0, d]$, and $\sigma_{ess}(H) = \sigma(H_0)$ $P$-a.s. if $\beta \notin [0, d]$. In all the above cases, the spectrum of $H$ in $\sigma(H_0)$ is purely absolutely continuous $P$-a.s. [JL1]. The spectral properties of the Anderson model with decaying randomness have been discussed recently in [KKO].

1.4. About the proofs. In this section we sketch some of the ideas involved in our proofs. The first idea, which has been used in practically all work on the spectral theory of operators (1.1), is to "integrate" the $x$-variable and reduce the $(d+1)$-dimensional spectral problem to a non-linear $d$-dimensional spectral problem. The details of the argument are given in [JM1] and here we summarize the results which we will need in the sequel. Let $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ be the circle and $\mathbb{T}^d$ the $d$-dimensional torus. We denote the points in $\mathbb{T}^d$ by $\phi = (\phi_1, \dots, \phi_d)$ and by $d\phi$ the usual Lebesgue measure. We set

$$\Phi(\phi) := 2 \sum_{i=1}^{d} \cos\phi_i.$$

For $z \in \mathbb{C} \setminus \sigma(H_0)$, let $\lambda(\phi, z)$ be such that

$$\lambda(\phi, z) + \frac{1}{\lambda(\phi, z)} + \Phi(\phi) = z, \qquad |\lambda(\phi, z)| < 1. \qquad (1.9)$$

We set²

$$\hat{j}(\phi, z) = \lambda(\phi, z) + \Phi(\phi), \qquad j(n, z) = (2\pi)^{-d} \int_{\mathbb{T}^d} \hat{j}(\phi, z)\, e^{-in\phi}\, d\phi. \qquad (1.10)$$

One can show that the function $j(n, z)$ decays exponentially in the variable $n$. Let $h_0(z)$ be the operator on $l^2(\mathbb{Z}^d)$ defined by

$$(h_0(z)\psi)(n) = \sum_{k \in \mathbb{Z}^d} j(n - k, z)\psi(k).$$

We define a one-parameter family of random operators on $l^2(\mathbb{Z}^d)$ by

$$h(z) = h_0(z) + U_0 + \lambda V, \qquad z \in \mathbb{C} \setminus \sigma(H_0),\ V \in \Omega. \qquad (1.11)$$

The key property of these operators is that for all $m, n \in \mathbb{Z}^d$,

$$(\delta_{(m,0)} | (H - z)^{-1} \delta_{(n,0)}) = (\delta_m | (h(z) - z)^{-1} \delta_n) \qquad (1.12)$$

(for the proof see [JM1] or [JL1]). Since the set of vectors $\{\delta_{(n,0)}\}_{n \in \mathbb{Z}^d}$ is cyclic for $H$ (see [JL1]), the spectral properties of $H$ are encoded by the family $h(z)$. In particular, it follows from the Simon–Wolff theorem (see Sect. 2.1 for details) that Theorems 1.1 and 1.2 follow from a suitable estimate on the matrix elements

$$(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n), \qquad E \in \mathbb{R} \setminus \sigma(H_0). \qquad (1.13)$$

In comparison with the usual theory of random Schrödinger operators, the difficulties in estimating the matrix elements (1.13) stem from the fact that $h_0(E)$ is a long-range Laplacian which depends on the energy. To study the resolvent $(h(E) - E - i\varepsilon)^{-1}$ with the standard techniques one needs efficient estimates on the kernel $j(n, E)$ for $E \in \mathbb{R} \setminus \sigma(H_0)$. Let us describe the estimates previously used in the literature and the estimate we will use in this paper. We set

$$t(n, E) := (2\pi)^{-d} \int_{\mathbb{T}^d} \lambda(\phi, E)\, e^{-in\phi}\, d\phi. \qquad (1.14)$$

In the Fourier representation $(h_0(E) - E)^{-1}$ acts as multiplication by $-\lambda(\phi, E)$ and for any $p, q \in \mathbb{Z}^d$,

$$(\delta_p | (h_0(E) - E)^{-1} \delta_q) = -t(p - q, E) = -t(q - p, E) \qquad (1.15)$$

(these relations will be used in Sect. 4). From the definition of $j(n, E)$ it follows that

$$j(n, E) = t(n, E) + \delta_{1 |n|_+}, \qquad (1.16)$$

² There are typographical errors in similar formulas in [JM1, Relation (1.5)] and [JM2], where the factor $(2\pi)^{-d}$ is missing in front of the integral.

where $\delta_{ij}$ stands for the Kronecker symbol. To estimate $t(n, E)$, it is useful to note that (see [JM1] or [JL1])

$$t(n, E) = (\delta_{(0,0)} | (E - H_0)^{-1} \delta_{(n,0)}). \qquad (1.17)$$

From this identity one easily gets the estimate (see e.g. Lemma III.4 in [S])

$$|t(n, E)| \le C_E\, e^{-d_E |n|_+}, \qquad (1.18)$$

where

$$C_E = (|E| - 2(d+1))^{-1}, \qquad d_E = \ln\frac{|E|}{2(d+1)}.$$

A better estimate can be obtained using (1.14) and the analyticity properties of $\lambda(\phi, E)$ (see Prop. 2.2 in [JM1]):

$$|t(n, E)| \le e^{-a(E)|n|_+}, \qquad (1.19)$$

where³

$$a(E) = \ln\gamma_E, \quad \text{and} \quad \gamma_E + \gamma_E^{-1} = (|E| - 2)/d. \qquad (1.20)$$

Either of the estimates (1.18), (1.19) suffices for the arguments in [AM] and [Gri]. However, the estimate (1.18) blows up as $E$ approaches $\pm 2(d+1)$ while (1.19) gives the useless bound $|t(n, E)| \le 1$. Therefore, these estimates are not useful near the edges of $\sigma(H_0)$. In fact one can easily show that a uniform exponential estimate of $t(n, E)$ near $\pm 2(d+1)$ is not possible: otherwise, the function $\lambda(\phi, \pm 2(d+1))$ would be analytic in $\phi$, which is not the case. We will derive a useful bound near the edges of $\sigma(H_0)$ from the following observations:

(i) $t(n, E) = (-1)^{|n|_+}\, t(n, -E)$.

(ii) The function $E \mapsto t(n, E)$ is positive and strictly decreasing on $[2(d+1), \infty[$.

(iii) For some $C$, $|t(n, 2(d+1))| \le C \prod_{i=1}^{d} (1 + |n_i|)^{-\frac{d+1}{d}}$.

From (i)–(iii) it follows that for $s > d/(d+1)$,

$$\sup_{E \notin \sigma(H_0)} \sum_{n \in \mathbb{Z}^d} |t(n, E)|^s \le c_d(s), \qquad \sup_{E \notin \sigma(H_0)} \sum_{n \in \mathbb{Z}^d} |j(n, E)|^s \le c_d(s) + 2d, \qquad (1.21)$$

where

$$c_d(s) := \sum_{n \in \mathbb{Z}^d} |t(n, 2(d+1))|^s. \qquad (1.22)$$

These estimates are sufficient to employ the method of Aizenman–Molchanov. We will prove Theorem 1.1 using the second relation in (1.21) and by following an elegant presentation of Aizenman–Molchanov theory in [S]. In the proof of Theorem 1.2, which deals with the weak coupling regime, we use the first relation in (1.21) and essentially follow the argument of Aizenman [A].

³ There is another unfortunate typographical error in [JM1], where in the second formula in (1.20) the factor $d$ is replaced with $2d$.
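The quantities entering (1.21) and (1.22) can be probed numerically. The following Python sketch is restricted to $d = 1$ and $|E| \ge 4$, and replaces the integral (1.14) by a Riemann sum on a uniform grid; the function names, the grid size and the cutoff in $n$ are illustrative assumptions, so the computed sums only approximate $c_1(s)$ from below.

```python
import math

def lam(phi, E):
    """Root of lam + 1/lam + 2*cos(phi) = E with |lam| <= 1 (d = 1, |E| >= 4)."""
    w = E - 2.0 * math.cos(phi)              # w = lam + 1/lam, so |w| >= 2
    root = math.sqrt(max(w * w - 4.0, 0.0))  # guard against rounding at |w| = 2
    # The small root, written as 2 / (w +- root) to avoid cancellation.
    return 2.0 / (w + root) if w > 0 else 2.0 / (w - root)

def t_values(E, n_max, grid=2048):
    """Riemann-sum approximations of t(n, E), n = 0..n_max, from (1.14)."""
    angles = [2.0 * math.pi * k / grid for k in range(grid)]
    lams = [lam(phi, E) for phi in angles]
    # lam is even in phi, so only the cosine part of exp(-i n phi) contributes.
    return [sum(l * math.cos(n * phi) for l, phi in zip(lams, angles)) / grid
            for n in range(n_max + 1)]

def c_partial(values, s):
    """Truncated version of the sum (1.22), using t(-n, E) = t(n, E)."""
    return abs(values[0]) ** s + 2.0 * sum(abs(v) ** s for v in values[1:])
```

With `values = t_values(4.0, 100)`, the call `c_partial(values, 1.0)` stays slightly below 1, consistent with $c_1(1) = 1$, and `c_partial(values, s)` increases as `s` decreases toward $d/(d+1) = 1/2$. Comparing `t_values(4.0, n_max)` with `t_values(-4.0, n_max)` shows that the absolute values $|t(n, E)|$ and $|t(n, -E)|$ agree, which is the form in which observation (i) enters the estimates.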


1.5. Some remarks. First, we would like to remark that Theorems 1.1 and 1.2 are not simply extensions of the results in [JM1] to higher dimension. Theorem 1.1 allows for a background potential, which is important in physical applications. The above two theorems also establish exponential decay of the eigenfunctions. The method of the proof allows for correlated random variables and can be used to prove dynamical localization outside $\sigma(H_0)$ (see [A,RJLS,GD]). None of these is covered by the method of [JM1]. Moreover, the proofs of Theorems 1.1 and 1.2 follow relatively easily from the Aizenman–Molchanov theory, while the arguments in [JM1] are quite elaborate. On the other hand, if $d = 1$, the techniques of [JM1] yield localization for all $\lambda$ and do not require that random variables are unbounded if $\lambda$ is small. Theorems 1.1 and 1.2 do not yield such a result. This brings us to our second remark. We believe that in many cases Theorem 1.2 holds for all $\lambda$ and $d$. It would be interesting to exhibit at least some classes of distributions for which this result holds. We finish this section with a brief explanation of why we do not expect that Theorem 1.1 will hold for small $\lambda$'s and arbitrary $U_0$. Let $U_0$ be a large constant (it suffices that $|U_0| > 4d + 2$). Then, the spectrum of $H_0 + U_0$ is purely absolutely continuous, and consists of two disjoint components, $\sigma(H_0)$ and $[-2d, 2d] + U_0 + U_0^{-1}$. If physicists' expectations about the Anderson model are correct, one may expect that for $d \ge 3$ and $\lambda$ small, the operator $H$ will have some absolutely continuous spectrum on the second branch $[-2d, 2d] + U_0 + U_0^{-1}$ (note however that since the dimension of our half-space is $d + 1$, $d \ge 3$ corresponds to the unphysical $d + 1 \ge 4$). This absolutely continuous spectrum would have an interesting property: the corresponding generalized eigenfunctions would decay exponentially fast in the $x$-variable and would be extended in the $n$-variable.
Such generalized eigenfunctions describe propagating surface states (surface waves), see [JMP] and [KP] for discussion. It is an interesting question as to whether propagating surface states exist in the random models studied here. Theorems 1.1 and 1.2 yield that in many situations all the propagating surface states with energies $E \notin \sigma(H_0)$ (which exist if the boundary potential is constant or periodic) are exponentially localized by the random fluctuations of the boundary. This is physically the most interesting consequence of our results. Finally, we remark that although it is known that the spectrum of $H$ in $\sigma(H_0)$ is $P$-a.s. purely absolutely continuous, the structure of the generalized eigenfunctions is not known, and in particular it is not known whether surface states with energies in $\sigma(H_0)$ exist.

2. Preliminaries

2.1. Simon–Wolff criterion. As we have already remarked, our proofs of Theorems 1.1 and 1.2 are based on a suitable variant of the Simon–Wolff theorem. In this section we describe this variant and collect some related technical results which will be used in the sequel. In this section $I = \,]a, b[$ is a fixed open interval outside $\sigma(H_0)$. We denote by $m$ the Lebesgue measure on $\mathbb{R}$ (the symbol a.e. without qualification will always mean with respect to Lebesgue measure).

Condition C(1). For all $m \in \mathbb{Z}^d$ and for $P \times m$-a.e. $(V, E) \in \Omega \times I$,

$$\lim_{\varepsilon \downarrow 0} \|(H - E - i\varepsilon)^{-1} \delta_{(m,0)}\| < \infty. \qquad (2.1)$$

Condition C(2). For all $m, n \in \mathbb{Z}^d$ and for $P \times m$-a.e. $(V, E) \in \Omega \times I$,

$$\lim_{\varepsilon \downarrow 0} |(\delta_{(m,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,0)})| \le C_{V,E,m}\, e^{-a(E)|n|_+}, \qquad (2.2)$$

for some $a(E) > 0$.

The existence of the limit (2.1) follows from monotonicity. The existence and finiteness of the limit (2.2) for $P \times m$-a.e. $(V, E)$ follows from Fubini's theorem and the well-known property of Herglotz functions. The estimate (2.2) implies that for all $x \ge 0$,

$$\lim_{\varepsilon \downarrow 0} |(\delta_{(m,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,x)})| \le C_{V,E,m}\, e^{-a(E)|n|_+ - b(E)x}, \qquad (2.3)$$

where $b(E) = \sup_{\phi \in \mathbb{T}^d} |\ln\lambda(\phi, E)|$ ($\lambda(\phi, E)$ is given by (1.9)). See Sect. 2 in [JM1] for details. Consider the following statements:

Statement S(1). The spectrum of $H$ in $I$ is $P$-a.s. pure point.

Statement S(2). The spectrum of $H$ in $I$ is $P$-a.s. pure point with exponentially decaying eigenfunctions.

Theorem 2.1. Assume that Hypothesis A holds. Then C(1) ⇒ S(1) and C(1) + C(2) ⇒ S(2).

This result follows from the Simon–Wolff theorem [SW] and the fact that the set of vectors $\{\delta_{(m,0)}\}_{m \in \mathbb{Z}^d}$ is cyclic for $H$. Our next lemma shows that changing the distributions within a finite box does not affect conditions C(1) and C(2).

Lemma 2.2. Assume that $P_1$ and $P_2$ are measures on $(\Omega, \mathcal{F})$ of the form

$$P_1 = \times_{n \in \mathbb{Z}^d}\, \mu_n^{(1)}, \qquad P_2 = \times_{n \in \mathbb{Z}^d}\, \mu_n^{(2)},$$

that $\mu_n^{(1)} = \mu_n^{(2)}$ for $|n|_+ > l$, and that conditions C(1) and C(2) hold for the measure $P_1$. Then these conditions also hold for the measure $P_2$.

Proof. We will deal with condition C(2). A similar argument applies to condition C(1). Let $B_l = \{n \in \mathbb{Z}^d : |n|_+ \le l\}$, $B^l = \{n \in \mathbb{Z}^d : |n|_+ > l\}$, $\Omega_l = \mathbb{R}^{B_l}$, $\Omega^l = \mathbb{R}^{B^l}$, and for $i = 1, 2$, let

$$P_{i,l} = \times_{n \in B_l}\, \mu_n^{(i)}, \qquad P_i^l = \times_{n \in B^l}\, \mu_n^{(i)}.$$

Obviously, $\Omega = \Omega_l \times \Omega^l$, $P_i = P_{i,l} \times P_i^l$,

and by the assumption,

$$P_1^l = P_2^l. \qquad (2.4)$$

In what follows we view the points in $\Omega$ as the pairs $V = (V_l, V^l)$, $V_l \in \Omega_l$, $V^l \in \Omega^l$.

Since condition C(2) holds for the measure $P_1$, for $P_{1,l} \times P_1^l \times m$-a.e. $(V_l, V^l, E) \in \Omega_l \times \Omega^l \times I$ the estimate

$$\lim_{\varepsilon \downarrow 0} |(\delta_{(m,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,0)})| \le C_{V,E,m}\, e^{-a(E)|n|_+} \qquad (2.5)$$

holds for all $m, n \in \mathbb{Z}^d$. By Fubini's theorem, there exists a set $\tilde{\Omega}^l \subset \Omega^l$ of full $P_1^l$ measure such that, for $V^l \in \tilde{\Omega}^l$, the estimate (2.5) holds for $P_{1,l} \times m$-a.e. $(V_l, E) \in \Omega_l \times I$. Now fix $V^l \in \tilde{\Omega}^l$. By Fubini's theorem there exists a ($V^l$-dependent) set $\tilde{\Omega}_l \subset \Omega_l$ of full $P_{1,l}$ measure such that, for $V_l \in \tilde{\Omega}_l$, the estimate (2.5) holds for a.e. $E \in I$. We now fix $V_l \in \tilde{\Omega}_l$ and set $V = (V_l, V^l)$. Let $W \in \Omega_l$ be arbitrary and $H_W = H + W$. Then

$$(\delta_{(m,0)} | (H_W - E - i\varepsilon)^{-1} \delta_{(n,0)}) = (\delta_{(m,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,0)}) - \lambda \sum_{p \in B_l} W(p)\, (\delta_{(m,0)} | (H_W - E - i\varepsilon)^{-1} \delta_{(p,0)})\, (\delta_{(p,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,0)}). \qquad (2.6)$$

Since for a.e. $E$ the limits

$$\lim_{\varepsilon \downarrow 0} |(\delta_{(m,0)} | (H_W - E - i\varepsilon)^{-1} \delta_{(p,0)})|$$

exist and are finite, we derive from (2.6) that the estimate (2.5) holds for $(V_l + W, V^l)$ and a.e. $E \in I$. Therefore, for $V^l \in \tilde{\Omega}^l$ and all $V_l \in \Omega_l$, the estimate (2.5) holds for a.e. $E \in I$. By Fubini's theorem and (2.4) this estimate then also holds for $P_2 \times m$-a.e. $(V, E) \in \Omega \times I$, and the condition C(2) holds for the measure $P_2$. □

We now introduce a new condition. Recall that the operators $h(E)$ are defined by (1.11).

Condition C(3).

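For all $m$ and for $P \times m$-a.e. $(V, E) \in \Omega \times I$,

$$\lim_{\varepsilon \downarrow 0} \|(h(E) - E - i\varepsilon)^{-1} \delta_m\| < \infty.$$

Lemma 2.3. (i) C(1) ⇔ C(3).
(ii) If C(3) holds then for all $m, n \in \mathbb{Z}^d$ and for $P \times m$-a.e. $(V, E) \in \Omega \times I$,

$$\lim_{\varepsilon \downarrow 0} (\delta_{(m,0)} | (H - E - i\varepsilon)^{-1} \delta_{(n,0)}) = \lim_{\varepsilon \downarrow 0} (\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n). \qquad (2.7)$$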

Proof. Part (i) of this lemma is proven in [JM1, Lemma 2.1]. In fact, a stronger result holds: for all $(V, E) \in \Omega \times I$, the limit (2.1) is finite iff the limit (2.7) is finite. To prove Part (ii) we will use the relation (1.12). The resolvent identity yields that

$$|(\delta_m | (h(E + i\varepsilon) - E - i\varepsilon)^{-1} \delta_n) - (\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)| \le \|h_0(E + i\varepsilon) - h_0(E)\|\, \|(h(E + i\varepsilon) - E - i\varepsilon)^{-1} \delta_m\|\, \|(h(E) - E - i\varepsilon)^{-1} \delta_n\|,$$

and the result follows from the estimate

$$\|h_0(E + i\varepsilon) - h_0(E)\| = \sup_{\phi \in \mathbb{T}^d} |\lambda(\phi, E + i\varepsilon) - \lambda(\phi, E)| = O(\varepsilon).$$

□

Our last condition is

Condition C(4). For all $m, n \in \mathbb{Z}^d$ and for $P \times m$-a.e. $(V, E) \in \Omega \times I$,

$$\lim_{\varepsilon \downarrow 0} |(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)| \le C_{V,E,m}\, e^{-a(E)|n|_+}, \qquad (2.8)$$

for some $a(E) > 0$.

We cannot guarantee a priori the existence of the limits (2.8). However, by Lemma 2.3, if C(3) holds then the limits (2.8) exist and C(3) + C(4) ⇒ C(1) + C(2). Before we state our final criterion under which statement S(2) holds, we need

Lemma 2.4. Let $\{f_n\}_{n \in \mathbb{Z}^d}$ be a sequence of random variables on the probability space $(\Omega, \mathcal{F}, P)$ such that for some $0 < s < 1$ and for all $n$, $\mathbf{E}(|f_n|^s) \le C e^{-d|n|_+}$ ($\mathbf{E}$ stands for the expectation). Then there are finite constants $D_V$ such that

$$|f_n(V)| \le D_V\, e^{-d|n|_+} \qquad P\text{-a.s.}$$

Proof. Let

$$A_n = \{V \in \Omega : |f_n(V)| > e^{-d|n|_+}\}.$$

By Chebyshev's inequality,

$$P(A_n) \le e^{sd|n|_+}\, \mathbf{E}(|f_n|^s) \le C e^{-(1-s)d|n|_+}.$$

Thus, $\sum_n P(A_n) < \infty$, and the statement follows from the Borel–Cantelli lemma. □

Lemma 2.5. Assume that for some $0 < s < 1$, $\varepsilon_0 > 0$ and $a(E) > 0$ the relation

$$\sup_{0 < \varepsilon < \varepsilon_0} \mathbf{E}\left(|(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)|^s\right) \le C_E\, e^{-a(E)|n - m|_+} \qquad (2.9)$$

holds for all $E \in I$ and $m, n \in \mathbb{Z}^d$. Then conditions C(3) and C(4) hold. In particular, statement S(2) holds.

Proof. We first establish C(3). Let $m$ be fixed. Then,

$$\|(h(E) - E - i\varepsilon)^{-1} \delta_m\|^2 = \sum_n |(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)|^2. \qquad (2.10)$$

Since for any $0 < q \le 1$ and any sequence of complex numbers $x_k$ we have

$$\Big|\sum_k x_k\Big|^q \le \sum_k |x_k|^q,$$

(2.10) yields (take $q = s/2$)

$$\mathbf{E}(\|(h(E) - E - i\varepsilon)^{-1} \delta_m\|^s) \le \sum_n \mathbf{E}(|(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)|^s).$$

It follows from (2.9) that for any $E \in I$,

$$\lim_{\varepsilon \downarrow 0} \mathbf{E}(\|(h(E) - E - i\varepsilon)^{-1} \delta_m\|^s) < \infty,$$

and by the Monotone Convergence Theorem, that

$$\mathbf{E}\Big(\lim_{\varepsilon \downarrow 0} \|(h(E) - E - i\varepsilon)^{-1} \delta_m\|^s\Big) < \infty.$$

This estimate and Fubini's theorem yield C(3). Since C(3) holds, by Part (ii) of Lemma 2.3, for $P \times m$-a.e. $(V, E) \in \Omega \times I$ the limits

$$\lim_{\varepsilon \downarrow 0} (\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)$$

exist and are finite. Therefore, by Fatou's Lemma, for a.e. $E \in I$,

$$\mathbf{E}\Big(\lim_{\varepsilon \downarrow 0} |(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)|^s\Big) \le \liminf_{\varepsilon \downarrow 0} \mathbf{E}(|(\delta_m | (h(E) - E - i\varepsilon)^{-1} \delta_n)|^s) \le C_E\, e^{-a(E)|n - m|_+}.$$

Condition C(4) now follows from Lemma 2.4 and Fubini's theorem. □

2.2. The key estimates. In this section we collect some technical results which we will need for our proofs. First, we need a lemma about the Dirichlet Laplacian $H_0$. Recall that $|n|_+ = \sum_{i=1}^{d} |n_i|$.

Lemma 2.6. Let $a(n, k) := (\delta_{(0,0)} | H_0^k \delta_{(n,0)})$, where $k \ge 0$. Then $a(n, k) = 0$ if $k < |n|_+$ or $k - |n|_+$ is odd, and $a(n, k) > 0$ if $k \ge |n|_+$ and $k - |n|_+$ is even.

Proof. An elementary induction. □

We now prove the properties of the sequence $t(n, E)$ described in Sect. 1.4. At the boundary of $\sigma(H_0)$ ($E = \pm 2(d+1)$) we define $\lambda(\phi, \pm 2(d+1))$ by Eq. (1.9) and the condition $|\lambda(\phi, \pm 2(d+1))| \le 1$. The sequences $t(n, \pm 2(d+1))$ are defined by (1.14). It follows easily from (1.14) that for all $n$, $E \mapsto t(n, E)$ is a continuous function on $\mathbb{R} \setminus \operatorname{int} \sigma(H_0)$.
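Lemma 2.6 can be checked directly for small $k$. In the Python sketch below, restricted to $d = 1$, the number $a(n, k)$ is computed by applying $H_0$ repeatedly to the unit vector at $(0, 0)$, which amounts to counting nearest-neighbour walks of length $k$ in the half-plane $x \ge 0$; the function name `walk_counts` is an illustrative choice.

```python
def walk_counts(k):
    """Return {n: a(n, k)} for d = 1, where a(n, k) = (delta_(0,0) | H0^k delta_(n,0)).

    Since H0 is symmetric, a(n, k) is the value of H0^k delta_(0,0) at (n, 0).
    """
    psi = {(0, 0): 1}
    for _ in range(k):
        nxt = {}
        for (n, x), value in psi.items():
            for point in ((n - 1, x), (n + 1, x), (n, x + 1), (n, x - 1)):
                if point[1] >= 0:  # Dirichlet condition: nothing lives at x = -1
                    nxt[point] = nxt.get(point, 0) + value
        psi = nxt
    return {n: value for (n, x), value in psi.items() if x == 0}
```

For example, `walk_counts(2)` gives $a(0, 2) = 3$ and $a(\pm 2, 2) = 1$, while every entry with $k - |n|$ odd is absent, that is, equal to zero.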


Lemma 2.7. For $E \ge 2(d+1)$, $t(n, E) = (-1)^{|n|_+}\, t(n, -E)$.

Proof. For $E > 2(d+1)$ it follows from (1.17) and Lemma 2.6 that

$$t(n, E) = \sum_{p=0}^{\infty} \frac{1}{E^{2p + 1 + |n|_+}}\, (\delta_{(0,0)} | H_0^{2p + |n|_+} \delta_{(n,0)}), \qquad (2.11)$$

and

$$t(n, -E) = \sum_{p=0}^{\infty} \frac{(-1)^{|n|_+}}{E^{2p + 1 + |n|_+}}\, (\delta_{(0,0)} | H_0^{2p + |n|_+} \delta_{(n,0)}). \qquad (2.12)$$

Clearly, these relations yield the statement for $E > 2(d+1)$. By the continuity of $t(n, E)$, the statement also holds for $E = 2(d+1)$. □

Lemma 2.8. The function $E \mapsto t(n, E)$ is positive and strictly decreasing on $[2(d+1), \infty)$.

Proof. It follows from Lemma 2.6 and (2.11) that for $E > 2(d+1)$,

$$t(n, E) > 0, \qquad \frac{d}{dE}\, t(n, E) < 0.$$

These two observations yield the result. □

Lemma 2.9. There exists a constant $C$ such that

$$|t(n, 2(d+1))| \le C \prod_{i=1}^{d} (1 + |n_i|)^{-\frac{d+1}{d}}. \qquad (2.13)$$

Proof. Let $n = (n_1, \dots, n_d)$. For notational simplicity, we assume that $n_i > 0$. Since $E = 2(d+1)$ is fixed, in the sequel we write $\lambda(\phi)$ for $\lambda(\phi, 2(d+1))$, etc. We recall that

$$t(n) = (2\pi)^{-d} \int_{\mathbb{T}^d} \lambda(\phi)\, e^{-in\phi}\, d\phi,$$

where

$$\lambda(\phi) = \frac{1}{2}\left(2(d+1) - \Phi(\phi) - \sqrt{(2(d+1) - \Phi(\phi))^2 - 4}\right).$$

Since $\Phi(\phi) = 2\sum_{i=1}^{d} \cos\phi_i$, we can write $\lambda(\phi)$ as

$$\lambda(\phi) = \Psi_1(\phi)\Psi_2(\phi) + \Psi_3(\phi),$$

where $\Psi_2$ and $\Psi_3$ are $C^\infty$ functions on $\mathbb{T}^d$ and

$$\Psi_1(\phi) = \left(\sum_{i=1}^{d} \sin^2\frac{\phi_i}{2}\right)^{1/2}.$$

Clearly, $\Psi_1$ is $C^\infty$ away from the point $\phi = 0$, and it is a simple exercise to verify that the function

$$\partial_{\phi_1}^{\alpha_1} \cdots \partial_{\phi_d}^{\alpha_d} \Psi_1(\phi), \qquad \alpha_i \ge 0, \quad \sum \alpha_i \le d + 1,$$

is in $L^1(\mathbb{T}^d)$. Integration by parts yields that for all $j$ and some $C > 0$,

$$|t(n)| \le C |n_j|^{-1} \left(\prod_{i=1}^{d} n_i\right)^{-1}.$$

Multiplying these relations we derive (2.13). □

We are now ready to prove the key properties of the sequences $t(n, E)$ and $j(n, E)$. Recall that the constant $c_d(s)$ is defined by (1.22).

Lemma 2.10. If $s \in\, ]d/(d+1), 1]$ and $|E| \ge 2(d+1)$ then

$$\sum_n |t(n, E)|^s \le c_d(s), \qquad \sum_n |j(n, E)|^s \le c_d(s) + 2d. \qquad (2.14)$$

Moreover, $]d/(d+1), \infty[\, \ni s \mapsto c_d(s)$ is a strictly decreasing $C^\infty$ function with $c_d(1) = 1$.

Proof. The first bound in (2.14) follows from Lemmas 2.7 and 2.8. The second bound follows from the first, Relation (1.16), and the inequality $|a + b|^s \le |a|^s + |b|^s$, which holds for $a, b \in \mathbb{R}$ and $0 < s \le 1$. The regularity properties of $c_d(s)$ follow from Lemma 2.9. Finally, since the sequence $t(n, 2(d+1))$ is positive,

$$c_d(1) = \sum_n t(n, 2(d+1)) = \lambda(0, 2(d+1)) = 1.$$

□

Our next set of technical results concerns the Aizenman-Molchanov technique. The next lemma is motivated by [S]. Lemma 2.11. Let r ∈ l 1 (Zd ) be a non-negative Psequence and R the corresponding convolution operator on l ∞ (Zd ). Assume that n r(n) < 1. Let f, g ∈ l ∞ (Zd ) be non-negative functions and suppose that (1 − R)f ≤ g. Then f ≤ (1 − R)−1 g.


Proof. Since for any $\psi \in l^\infty(\mathbb{Z}^d)$,
$$R\psi(n) = \sum_k r(n-k)\psi(k),$$
the operator $R$ is positivity preserving on $l^\infty(\mathbb{Z}^d)$ and has the norm $\sum_n r(n)$. Since
$$(1-R)^{-1} = \sum_{j=0}^{\infty} R^j,$$
the operator $(1-R)^{-1}$ is also positivity preserving on $l^\infty(\mathbb{Z}^d)$. This yields the statement. $\square$

Lemma 2.12. Let $r \in l^1(\mathbb{Z}^d)$ be a non-negative sequence and $R$ the corresponding convolution operator on $l^\infty(\mathbb{Z}^d)$. Assume that $r(n) \le A e^{-a|n|_+}$ for some $a > 0$ and that $\sum_n r(n) < 1$. Then $(1-R)^{-1}$ is the operator of convolution by the non-negative sequence $s(n)$ which satisfies $s(n) \le B e^{-b|n|_+}$ for some $b > 0$.

Proof. Set
$$\hat r(\phi) := \sum_n r(n) e^{in\phi}, \qquad s(n) := (2\pi)^{-d} \int_{\mathbb{T}^d} (1 - \hat r(\phi))^{-1} e^{-in\phi}\, d\phi.$$
Since $\hat r(\phi)$ is an analytic function on $\mathbb{T}^d$ and $1 > \max |\hat r(\phi)|$, the function $(1 - \hat r(\phi))^{-1}$ is also analytic on $\mathbb{T}^d$. Thus, the sequence $s(n)$ decays exponentially and $(1-R)^{-1}$ is the operator of convolution by $s(n)$. Finally, since $(1-R)^{-1}$ is positivity preserving we derive that $s(n)$ is a non-negative sequence. $\square$

The final result we will need is the following well-known rank-one perturbation formula. Let $\tilde V$ and $m \in \mathbb{Z}^d$ be given. Set
$$V = \tilde V + \alpha(\delta_m|\,\cdot\,)\delta_m, \qquad \tilde h(E) = h_0(E) + \tilde V, \qquad h(E) = h_0(E) + V.$$
Then the resolvent identity yields (see e.g. [S])

Lemma 2.13. For any $n$ and $z$,
$$(\delta_n|(h(E) - z)^{-1}\delta_m) = \frac{(\delta_n|(\tilde h(E) - z)^{-1}\delta_m)}{1 + \alpha(\delta_m|(\tilde h(E) - z)^{-1}\delta_m)}.$$
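Remark. The rank-one formula of Lemma 2.13 follows from the resolvent identity in a few lines. The sketch below uses the shorthand $G(z)$ and $\tilde G(z)$, which is introduced here only for illustration, and assumes that $h(E)$ and $\tilde h(E)$ differ exactly by the rank-one term $\alpha(\delta_m|\,\cdot\,)\delta_m$.

```latex
% Shorthand (not in the original): G(z) = (h(E)-z)^{-1}, \tilde G(z) = (\tilde h(E)-z)^{-1}.
% Assumption: h(E) - \tilde h(E) = \alpha(\delta_m|\cdot)\delta_m.
\begin{align*}
G(z) &= \tilde G(z) - \alpha\,\tilde G(z)\,\delta_m\,(\delta_m|G(z)\,\cdot\,), \\
(\delta_n|G(z)\delta_m) &= (\delta_n|\tilde G(z)\delta_m)
   - \alpha\,(\delta_n|\tilde G(z)\delta_m)\,(\delta_m|G(z)\delta_m), \\
(\delta_m|G(z)\delta_m) &= \frac{(\delta_m|\tilde G(z)\delta_m)}{1+\alpha(\delta_m|\tilde G(z)\delta_m)}
   \qquad\text{(take } n = m \text{ in the previous line)}, \\
(\delta_n|G(z)\delta_m) &= \frac{(\delta_n|\tilde G(z)\delta_m)}{1+\alpha(\delta_m|\tilde G(z)\delta_m)} .
\end{align*}
```

The last line is obtained by substituting the diagonal element into the second line.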


3. The Strong Coupling Regime

In this section we prove Theorem 1.1. We fix $s \in\, ]d/(d+1), 1[$ such that $k_s > 0$. Let $\delta \in\, ]0, k_s[$. Since $\liminf k_s(n) = k_s$, there exists an $l$ such that for all $n$ with $|n|_+ > l$,
$$k_s(n) > k_s - \delta =: k_{s,\delta}. \tag{3.1}$$
By changing the distributions $\mu_n$ within the box $|n|_+ \le l$ we may assume that (3.1) holds for all $n$. By Lemma 2.2, such a change does not affect Theorem 2.1. Let $m \in \mathbb{Z}^d$ and $E \notin \sigma(H_0)$ be given. For $\varepsilon > 0$ we set
$$G(n) \equiv G(m, n; E + i\varepsilon) := (\delta_m|(h(E) - E - i\varepsilon)^{-1}\delta_n), \tag{3.2}$$
and write $z = E + i\varepsilon$. The function $G$ satisfies the equation
$$\sum_k j(n-k, E)\,G(k) + (\lambda V(n) + U_0(n) - z)\,G(n) = \delta_{mn}.$$
Then,
$$\mathbb{E}\bigl(|\lambda V(n) + U_0(n) - z|^s\, |G(n)|^s\bigr) \le \delta_{mn} + \sum_k |j(n-k, E)|^s\, \mathbb{E}(|G(k)|^s) \tag{3.3}$$
($\mathbb{E}$ stands for the expectation). It follows from Lemma 2.13 that
$$|G(n)|^s = \frac{|a|^s}{|\lambda V(n) + b|^s},$$
where $a$ and $b$ are functions of $\{V(l)\}_{l \ne n}$. Let $\alpha = U_0(n) - z$. Averaging only over $V(n)$ we get
$$\begin{aligned}
\int |a|^s \frac{|\lambda V + \alpha|^s}{|\lambda V + b|^s}\, d\mu_n(V)
&= |a|^s |\lambda|^s |\lambda|^{-s} \int \frac{|V + \lambda^{-1}\alpha|^s}{|V + \lambda^{-1} b|^s}\, d\mu_n(V) \\
&\ge k_{s,\delta}\, |a|^s |\lambda|^s |\lambda|^{-s} \int \frac{1}{|V + \lambda^{-1} b|^s}\, d\mu_n(V) \\
&= k_{s,\delta}\, |\lambda|^s \int \frac{|a|^s}{|\lambda V + b|^s}\, d\mu_n(V),
\end{aligned}$$
where we used the relations (1.4) and (3.1). Averaging over $\{V(l)\}_{l \ne n}$ we get
$$\mathbb{E}\bigl(|\lambda V(n) + U_0(n) - z|^s\, |G(n)|^s\bigr) \ge k_{s,\delta}\, |\lambda|^s\, \mathbb{E}(|G(n)|^s). \tag{3.4}$$
Let $g(n) := \mathbb{E}(|G(n)|^s)$. Note that $g \in l^\infty(\mathbb{Z}^d)$ ($g(n) \le 1/\varepsilon^s$). Relations (3.3) and (3.4) yield that
$$(1 - k_{s,\delta}^{-1} |\lambda|^{-s} R)\, g \le k_{s,\delta}^{-1} |\lambda|^{-s}\, \delta_m,$$


where $R$ is the operator of convolution by $|j(n, E)|^s$. By the choice of $s$ (recall Lemma 2.10)
$$\sum_n |j(n, E)|^s \le c_d(s) + 2d.$$
If $\lambda$ is such that
$$k_{s,\delta}\, |\lambda|^s > c_d(s) + 2d, \tag{3.5}$$
then it follows from Lemma 2.11 that
$$g \le k_{s,\delta}^{-1} |\lambda|^{-s} (1 - k_{s,\delta}^{-1} |\lambda|^{-s} R)^{-1} \delta_m.$$
Lemma 2.12 and the estimate (1.19) yield that there exist constants $C_E$ and $a(E) > 0$ such that
$$g(n) \le C_E\, e^{-a(E)|n-m|_+}.$$
Therefore, for all $E \notin \sigma(H_0)$,
$$\sup_{\varepsilon > 0} \mathbb{E}\bigl(|G(m, n; E + i\varepsilon)|^s\bigr) \le C_E\, e^{-a(E)|n-m|_+}.$$
Since $\delta$ in (3.1) is arbitrary, Theorem 1.1 follows from Lemma 2.5.

4. The Weak Coupling Regime

In this section we prove Theorem 1.2. We fix $s \in\, ]d/(d+1), 1[$ such that $K_s < \infty$. Let $\delta > 0$. Since $\limsup K_s(n) = K_s$, there exists an $l$ such that for all $n$ with $|n|_+ > l$,
$$K_s(n) < K_s + \delta =: K_{s,\delta}. \tag{4.1}$$
By changing the distributions $\mu_n$ within the box $|n|_+ \le l$ we may assume that (4.1) holds for all $n$. Let $m \in \mathbb{Z}^d$ and $E \notin \sigma(H_0)$ be given. The resolvent identity yields that
$$\begin{aligned}
(\delta_m|(h(E) - E - i\varepsilon)^{-1}\delta_n) &= (\delta_m|(h_0(E) - E)^{-1}\delta_n) \\
&\quad - \sum_k (\lambda V(k) - i\varepsilon)\,(\delta_m|(h(E) - E - i\varepsilon)^{-1}\delta_k)\,(\delta_k|(h_0(E) - E)^{-1}\delta_n).
\end{aligned}$$
Using the relation (1.15) and the shorthand (3.2) we rewrite this identity as
$$G(n) = -t(n-m, E) + \sum_k (\lambda V(k) - i\varepsilon)\, t(n-k, E)\, G(k).$$
Then,
$$\mathbb{E}(|G(n)|^s) \le |t(n-m, E)|^s + \sum_k |t(n-k, E)|^s\, \mathbb{E}\bigl((|\lambda|^s |V(k)|^s + |\varepsilon|^s)\,|G(k)|^s\bigr). \tag{4.2}$$


Averaging first over $V(k)$ and then over $\{V(l)\}_{l \ne k}$, we derive from (1.4), (4.1) and Lemma 2.13 that
$$\mathbb{E}\bigl(|V(k)|^s |G(k)|^s\bigr) \le K_{s,\delta}\, \mathbb{E}(|G(k)|^s). \tag{4.3}$$
Let $g(n) := \mathbb{E}(|G(n)|^s)$, $f(n) := |t(n-m, E)|^s$. Clearly, $g, f \in l^\infty(\mathbb{Z}^d)$ and we derive from (4.2) and (4.3) that
$$(1 - (|\varepsilon|^s + |\lambda|^s K_{s,\delta}) R)\, g \le f,$$
where $R$ is the operator of convolution by $|t(n, E)|^s$. By the choice of $s$ (recall Lemma 2.10)
$$\sum_n |t(n, E)|^s \le c_d(s).$$
We choose $\lambda$ such that $|\lambda|^s K_{s,\delta} < c_d(s)^{-1}$, and $\varepsilon_0 > 0$ such that
$$|\varepsilon_0|^s + |\lambda|^s K_{s,\delta} < c_d(s)^{-1}.$$
In the sequel we assume that $0 < \varepsilon < \varepsilon_0$. Lemma 2.11 yields that
$$g \le (1 - (|\varepsilon|^s + |\lambda|^s K_{s,\delta}) R)^{-1} f,$$
and it follows from Lemma 2.12 and the estimate (1.19) that for some constants $C_E$ and $a(E) > 0$
$$g(n) \le C_E\, e^{-a(E)|n-m|_+}.$$
Therefore, for all $E \notin \sigma(H_0)$,
$$\sup_{0 < \varepsilon < \varepsilon_0} \mathbb{E}\bigl(|G(m, n; E + i\varepsilon)|^s\bigr) \le C_E\, e^{-a(E)|n-m|_+}.$$

Since $\delta$ in (4.1) is arbitrary, Theorem 1.2 follows from Lemma 2.5.

Acknowledgements. We are grateful to Y. Last, L. Pastur and B. Simon for many discussions on the subject of this paper and to W. Burgess, B. Jesup, L. Langsetmo, D. Macdonald and the referee for helpful remarks. The research of the first author was supported in part by NSERC and of the second by NSF. Part of this work was done during the visit of the second author to the University of Ottawa, which was supported by NSERC.


References

[A] Aizenman, M.: Localization at Weak Disorder: Some Elementary Bounds. Rev. Math. Phys. 6, 1163 (1994)
[AM] Aizenman, M., Molchanov, S.: Localization at Large Disorder and at Extreme Energies: An Elementary Derivation. Commun. Math. Phys. 157, 245 (1993)
[BS] Boutet de Monvel, A., Surkova, A.: Localisation des états de surface pour une classe d’opérateurs de Schrödinger discrets à potentiels de surface quasi-périodiques. Helv. Phys. Acta 71, 459 (1998)
[CL] Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
[CFKS] Cycon, H., Froese, R., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg: Springer-Verlag, 1987
[Gra] Graf, G-M.: Anderson Localization and the Space-Time Characteristic of Continuum States. J. Stat. Phys. 75, 337 (1994)
[Gri] Grinshpun, V.: Localization for random potentials supported on a subspace. Lett. Math. Phys. 34, 103 (1995)
[GD] Germinet, F., De Bievre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, 323 (1998)
[RJLS] del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum IV. Hausdorff dimensions, rank-one perturbations and localization. J. Anal. Math. 69, 153 (1996)
[JL1] Jakšić, V., Last, Y.: Corrugated Surfaces and A.C. Spectrum. Submitted
[JL2] Jakšić, V., Last, Y.: Spectral Structure of Anderson Type Hamiltonians. Submitted
[JM1] Jakšić, V., Molchanov, S.: On the Surface Spectrum in Dimension Two. Helv. Phys. Acta 71, 629 (1999)
[JM2] Jakšić, V., Molchanov, S.: On the Spectrum of the Surface Maryland Model. Lett. Math. Phys. 45, 185 (1998)
[JM3] Jakšić, V., Molchanov, S.: Wave Operators for the Surface Maryland Model. Submitted
[JM4] Jakšić, V., Molchanov, S.: Localization for One Dimensional Long Range Random Hamiltonians. Rev. Math. Phys. 11, 103 (1999)
[JMP] Jakšić, V., Molchanov, S., Pastur, L.: On the Propagation Properties of Surface Waves. Wave Propagation in Complex Media, IMA Vol. Math. Appl. 96, 143 (1998)
[KKO] Kirsch, W., Krishna, M., Obermeit, J.: Anderson Model with decaying randomness: Mobility Edge. Preprint
[KP] Khoruzenko, B.A., Pastur, L.: The localization of surface states: An exactly solvable model. Physics Reports 288, 109–126 (1997)
[M1] Molchanov, S.: Lectures on Random Media. In: Lectures on Probability, ed. P. Bernard, Lecture Notes in Mathematics, 1581, Heidelberg: Springer-Verlag, 1994
[M2] Molchanov, S.: Hierarchical random matrices and operators. Application to Anderson model. In: Multidimensional statistical analysis and theory of random matrices, Bowling Green, OH, Utrecht: VSP, 1996, pp. 179–194
[MV] Molchanov, S., Vainberg, B.: Unpublished
[P] Pastur, L.: Surface waves: Propagation and localization. Journées “Équations aux dérivées partielles” (Saint-Jean-de-Monts, 1995), Exp. No. VI, École Polytech. Palaiseau (1995)
[S] Simon, B.: Spectral Analysis of Rank One Perturbations and Applications. CRM Proc. Lecture Notes 8, Providence, RI: AMS, 1995
[SW] Simon, B., Wolff, T.: Singular Continuous Spectrum Under Rank One Perturbations and Localization for Random Hamiltonians. Commun. Pure Appl. Math. 39, 75 (1986)

Communicated by B. Simon

Commun. Math. Phys. 208, 173 – 193 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Spectral Shift Function for Trapping Energies in the Semiclassical Limit

Shu Nakamura

Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo 153-8914, Japan. E-mail: [email protected]

Received: 14 December 1998 / Accepted: 1 June 1999

Abstract: The semiclassical asymptotics of the spectral shift function (SSF) for Schrödinger operators are studied at trapping energies. It is shown that the SSF converges to the sum of a smooth function and a step function, which is essentially the counting function of resonances. In particular, the Weyl asymptotics are proved.

1. Introduction

In this paper we study the semiclassical asymptotics of the spectral shift function (SSF) of a class of Schrödinger operators. In particular, we are concerned with the behavior of the SSF for certain trapping energies. The Schrödinger operator is defined by
$$H = H_0 + V(x), \qquad H_0 = \sum_{j=1}^{n} p_j^2 \quad \text{on } L^2(\mathbb{R}^n), \tag{1.1}$$
where $p_j = -i\hbar\,\partial_{x_j}$ and $\hbar > 0$ is the Planck constant. We always assume $V(x)$ is real-valued and is bounded, hence $H$ is self-adjoint with $D(H) = D(H_0) = H^2(\mathbb{R}^n)$, where $D(A)$ denotes the definition domain of an operator $A$, and $H^s(\Omega)$ denotes the Sobolev space of order $s$ on $\Omega$. We assume $V(x)$ satisfies the following short-range condition:

Assumption A. $V$ is a $C^\infty$-class real-valued function on $\mathbb{R}^n$, and there exists $\mu > n$ such that for any multi-index $\alpha$,
$$|\partial_x^\alpha V(x)| \le C_\alpha \langle x \rangle^{-\mu - |\alpha|}, \qquad x \in \mathbb{R}^n,$$
where $\langle x \rangle = (1 + |x|^2)^{1/2}$.


When $V$ satisfies Assumption A, the SSF $\xi(\lambda)$ is defined as a real-valued function on $\mathbb{R}$ satisfying the Birman–Krein formula:
$$\operatorname{Tr}\,(f(H) - f(H_0)) = \int f'(\lambda)\,\xi(\lambda)\,d\lambda \tag{1.2}$$
for any $f \in C_0^\infty(\mathbb{R})$. $\xi(\lambda)$ is fixed up to a constant by the formula, and we normalize $\xi(\lambda)$ so that $\xi(\lambda) = 0$ for $\lambda < \min \inf \sigma(H) \le 0$. See the review by Birman and Yafaev [1] or the textbook of Yafaev [20] for more information about the SSF.

We fix an energy $E_0 > 0$, and study $\xi(\lambda)$ in the semiclassical limit $\hbar \to 0$ for $\lambda$ in a neighborhood of $E_0$. We set
$$G(E) = \{x \in \mathbb{R}^n \mid V(x) \le E\}, \qquad F(E) = \mathbb{R}^n \setminus G(E)$$
for $E \in \mathbb{R}$. $G(E)$ and $F(E)$ are called the classically accessible region and the classically forbidden region for the energy $E$, respectively. Under our assumption, if $E > 0$ then $G(E)$ contains a neighborhood of infinity, and we may write
$$G(E) = G_i(E) \cup G_e(E),$$
where $G_e(E)$ is the unbounded connected component of $G(E)$ and $G_i(E)$ is the union of the other compact connected components. Let $(x(t; y, \eta))_{t \in \mathbb{R}}$ be the solution to the Newton equation:
$$\ddot x(t; y, \eta) = -2\nabla V(x(t; y, \eta)), \qquad x(0; y, \eta) = y, \quad \dot x(0; y, \eta) = \eta.$$

Assumption B. (i) $G_i(E_0) \ne \emptyset$. (ii) There exists a neighborhood $I$ of $E_0$ such that $\lambda \in I$ is nontrapping in $G_e(\lambda)$ in the sense of Robert–Tamura, i.e., for any $R > 0$ there is $T > 0$ such that if $y \in G_e(\lambda)$, $|y| \le R$, $\eta^2 + V(y) = \lambda$, then $|x(t; y, \eta)| \ge R$ for $|t| \ge T$.

In order to state our main result explicitly, we introduce a couple of Hamiltonians. Let $\Omega_j$ ($j = 1, 2$) be open sets such that
$$G_i(E_0) \subset\subset \Omega_1 \subset\subset \Omega_2 \subset\subset (\mathbb{R}^n \setminus G_e(E_0)),$$
and choose $V^e(x)$ and $V^i(x) \in C^\infty(\mathbb{R}^n)$ so that
$$V^e(x) \begin{cases} = V(x) & \text{if } x \notin \Omega_1, \\ \ge E_0 + \delta & \text{if } x \in \Omega_1, \end{cases}
\qquad
V^i(x) \begin{cases} = V(x) & \text{if } x \in \Omega_2, \\ \ge E_0 + \delta & \text{if } x \notin \Omega_2, \end{cases}$$
with some $\delta > 0$. We assume $V^i$ is bounded (in fact, we may assume $V^i$ is constant outside a compact set). We set
$$H^e = H_0 + V^e, \qquad H^i = H_0 + V^i.$$


It is easy to see that $H^i$ has discrete spectrum in $(-\infty, E_0 + \delta)$, and that $\lambda$ in a small neighborhood of $E_0$ is nontrapping on $\mathbb{R}^n$ with respect to $V^e$. The SSF for nontrapping energy is well-understood, and it is known that it has an asymptotic expansion:
$$\xi_e(\lambda) := \xi(\lambda; H^e, H_0) \sim \sum_{j=0}^{\infty} a_j(\lambda)\, \hbar^{-n+j}, \qquad \hbar \to 0,$$
where $\xi_e(\lambda)$ denotes the SSF for the pair $H^e$ and $H_0$ (see Sect. 2). The coefficients $a_j(\lambda)$ are explicitly computed in terms of $V^e(x)$ (cf. Robert–Tamura [16]). We denote by
$$N(\lambda) = \#\{\text{eigenvalues of } H^i \le \lambda\} = \dim(\text{Range of } E_{H^i}(\lambda))$$
the number of the eigenvalues of $H^i$ not greater than $\lambda$, counting multiplicity. It is well-known that $N(\lambda) \le C\hbar^{-n}$ for each $\lambda \le E_0 + \delta$.

Theorem 1.1. Suppose Assumptions A and B. Then there exist constants $\varepsilon$, $\alpha$ and $\beta > 0$ such that
$$|\xi(\lambda) - \xi_e(\lambda) + N(\lambda)| \le C e^{-\alpha/\hbar}, \qquad \hbar \in (0, 1], \tag{1.3}$$
if $|\lambda - E_0| \le \varepsilon$ and $\operatorname{dist}(\lambda, \sigma(H^i)) > e^{-\beta/\hbar}$.

The constants $\alpha$ and $\beta$ depend on $V$ and the choice of $\Omega_j$ ($j = 1, 2$), and we will give them explicitly in the proof (cf. Proposition 3.3). Theorem 1.1 implies that $\xi(\lambda)$ is decomposed into the smooth part (or classical part) $\xi_e(\lambda)$ and the stepping part (or resonance part) $N(\lambda)$, and $\xi(\lambda)$ jumps in a small neighborhood of each eigenvalue of $H^i$. If we suppose analyticity of the potential $V$, we can study more precise behavior of $\xi(\lambda)$ in a neighborhood of each eigenvalue of $H^i$ (see Gerard–Martinez–Robert [3]). However, we can prove the following generalization of Theorem 1.1, which is applicable to any energy in a neighborhood of $E_0$. This result implies that there is no overshoot around each jump of the SSF.

Theorem 1.2. Suppose Assumptions A and B, and let $\varepsilon$, $\alpha$ and $\beta > 0$ be as in Theorem 1.1. Then there is $C > 0$ such that
$$\xi_e(\lambda) - N(\lambda + e^{-\beta/\hbar}) - C e^{-\alpha/\hbar} \le \xi(\lambda) \le \xi_e(\lambda) - N(\lambda - e^{-\beta/\hbar}) + C e^{-\alpha/\hbar} \tag{1.4}$$
for $\lambda$ in a neighborhood of $E_0$.

The spectral shift function was first introduced by I. M. Lifshits, and then mathematically studied by Krein and Birman as a part of (two-body) scattering theory. We refer to [1] and [20] for the theory and also for the history. We also want to mention recent development by Sobolev [19] and Pushnitski [10–12]. The SSF is related to the scattering by the following formula due to Birman and Krein:
$$\det S(\lambda) = e^{-2\pi i \xi(\lambda)}, \qquad \text{a.e. } \lambda > 0,$$
where $S(\lambda)$ is the scattering matrix. Hence, the SSF is also called the scattering phase. On the other hand, if $\lambda < 0$ it is easy to see
$$\xi(\lambda) = -\#\{\text{eigenvalues of } H < \lambda\},$$


provided $\lambda$ is not an eigenvalue of $H$. In this sense, the SSF may be considered as a generalization of the counting function of the eigenvalues.

The semiclassical and short wave asymptotics of the SSF have been studied by many authors, especially in analogy with the Weyl formula on the number of eigenvalues (see, e.g., [6] and references in [16,15]; see also [12,7,14]). In particular, Robert and Tamura [16] studied the semiclassical limit of the SSF for Schrödinger operators at nontrapping energy, and obtained the full asymptotic expansion of the SSF in $\hbar$. If we apply their result to our model, we have
$$\xi_e(\lambda) = -\frac{\tau_n}{(2\pi\hbar)^n} \int \left( (\lambda - V^e(x))_+^{n/2} - \lambda^{n/2} \right) dx + O(\hbar^{-n+2})$$
for $\lambda$ in a neighborhood of $E_0$, which is one version of the Weyl formula for unbounded domains. Here $\tau_n$ denotes the volume of the unit ball in $\mathbb{R}^n$. A result of Lavine [5] (and its generalization by Robert and Tamura [17]) suggests that the same formula holds for any energy, if it is averaged in energy by a smooth function, i.e., if we consider $\int \varphi(\lambda)\xi(\lambda)\,d\lambda$ with a test function $\varphi \in C_0^\infty(\mathbb{R})$. However, the pointwise estimate is expected to be quite different in general because of the presence of resonances. Gerard, Martinez and Robert studied the SSF (or its derivative) for a trapping energy region for the shape resonance model using the complex resonance theory ([3]). They showed that the SSF jumps in a small neighborhood of each resonance. On the other hand, if the energy is not too close to the resonances, then the scattering amplitude is known to be very close to the one derived from the exterior domain ([8]), and this suggests that $\xi(\lambda)$ should be very close to $\xi_e(\lambda)$ modulo integers.

One purpose of this paper is to explain how one obtains the Weyl formula for the SSF in the trapping energy region, combining the SSF $\xi_e(\lambda)$, which is derived essentially from classical mechanics of the exterior domain, and the jumps generated by the presence of the quasi-stable states (resonances) in the interior region. In fact, if we combine Theorem 1.2 with the above asymptotic formula for $\xi_e(\lambda)$ and the classical Weyl formula for $H^i$, we recover the Weyl formula in a neighborhood of $E_0$.

Corollary 1.3. Suppose Assumptions A and B. Then
$$\xi(\lambda) = -\frac{\tau_n}{(2\pi\hbar)^n} \int \left( (\lambda - V(x))_+^{n/2} - \lambda^{n/2} \right) dx + o(\hbar^{-n})$$
for $\lambda$ in a neighborhood of $E_0$.

This paper is organized as follows: We recall basic definitions and prepare several preliminary results in Sect. 2. We prove Theorem 1.1 in Sect. 3, and key lemmas are proved in Sect. 4. We prove Theorem 1.2 in Sect. 5 by modifying the proof of Theorem 1.1. Throughout this paper, $C$ denotes a ($\hbar$-independent) generic constant, which may change from line to line.

2. Preliminary

2.1. Construction of the SSF by Krein’s method. At first we recall the definition of Krein’s SSF following Birman and Yafaev [1]. Let $A$ and $A_0$ be bounded self-adjoint operators on a Hilbert space $\mathfrak{H}$ such that
$$W = A - A_0 \in I_1,$$


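where $I_s$ denotes the trace ideal of order $s \ge 1$. Then the perturbation determinant is defined by
$$\Delta(z) := \Delta_{A/A_0}(z) = \det(1 + W(A_0 - z)^{-1}), \qquad z \in \rho(A_0).$$
Under the above assumption, it is shown that
$$\xi(\lambda) := \xi(\lambda; A, A_0) = \frac{1}{\pi} \lim_{\varepsilon \downarrow 0} \arg \Delta_{A/A_0}(\lambda + i\varepsilon)$$
exists for a.e. $\lambda \in \mathbb{R}$, where $\arg(\Delta(z))$ is normalized so that $\arg(\Delta(z)) \to 0$ as $\operatorname{Im} z \to \infty$. $\xi(\lambda)$ is called (Krein’s) spectral shift function (SSF), and it satisfies the formula (1.2).

Lemma 2.1. For $z \notin \sigma(A) \cup \sigma(A_0)$,
$$\arg \Delta_{A/A_0}(z) = \frac{1}{2i} \log\det\bigl(1 - 2i(\operatorname{Im} z)(A - \bar z)^{-1} W (A_0 - z)^{-1}\bigr). \tag{2.1}$$
In particular,
$$\xi(\lambda) = \lim_{\gamma \downarrow 0} \frac{1}{2\pi i} \log\det\bigl(1 - 2i\gamma (A - \lambda + i\gamma)^{-1} W (A_0 - \lambda - i\gamma)^{-1}\bigr) \tag{2.2}$$
for a.e. $\lambda \in \mathbb{R}$.

Proof. We note
$$1 + W(A_0 - z)^{-1} = (A - z)(A_0 - z)^{-1} = (1 - W(A - z)^{-1})^{-1}$$
for $z \notin \sigma(A) \cup \sigma(A_0)$. Hence, by direct computations, we have
$$\begin{aligned}
\arg \Delta(z) &= \operatorname{Im} \log\det(1 + W(A_0 - z)^{-1}) \\
&= \frac{1}{2i}\Bigl( \log\det(1 + W(A_0 - z)^{-1}) - \log\det(1 + W(A_0 - \bar z)^{-1}) \Bigr) \\
&= \frac{1}{2i} \log\det\bigl((1 + W(A_0 - z)^{-1})(1 + W(A_0 - \bar z)^{-1})^{-1}\bigr) \\
&= \frac{1}{2i} \log\det\bigl((1 + W(A_0 - z)^{-1})(1 - W(A - \bar z)^{-1})\bigr) \\
&= \frac{1}{2i} \log\det\bigl(1 - (z - \bar z) W (A_0 - z)^{-1}(A - \bar z)^{-1}\bigr) \\
&= \frac{1}{2i} \log\det\bigl(1 - 2i(\operatorname{Im} z)(A - \bar z)^{-1} W (A_0 - z)^{-1}\bigr). \qquad \square
\end{aligned}$$

Lemma 2.2 (Stability of the SSF). Let $J$ be a unitary operator such that for any $f \in C_0^\infty(\mathbb{R})$, $(1 - J) f(A) \in I_1$. Then
$$\xi(\lambda; A, JAJ^{-1}) = 0 \qquad \text{for } \lambda \in \mathbb{R}.$$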


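Proof. Let $f \in C_0^\infty(\mathbb{R})$. Then we have
$$\begin{aligned}
\operatorname{Tr}\,(f(A) - f(JAJ^{-1})) &= \operatorname{Tr}\,(f(A) - J f(A) J^{-1}) \\
&= \operatorname{Tr}\,\bigl((1 - J) f(A) + J f(A) J^{-1}(J - 1)\bigr) \\
&= \operatorname{Tr}\,((1 - J) f(A)) - \operatorname{Tr}\,(J f(A)(1 - J) J^{-1}) \\
&= \operatorname{Tr}\,((1 - J) f(A)) - \operatorname{Tr}\,(f(A)(1 - J)) = 0.
\end{aligned}$$
By the formula (1.2), this implies the assertion. $\square$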
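Remark. The last two equalities in the proof above rely on the cyclicity of the trace, which requires the operators involved to be trace class. The sketch below spells out one way to justify this. The auxiliary cutoff $g$ is introduced here for illustration and does not appear in the original proof.

```latex
% Hypothesis of Lemma 2.2: (1-J) f(A) \in I_1 for every f \in C_0^\infty(\mathbb{R}); J is unitary.
% (a) f(A)(1-J) is trace class. Since (1-J)^* = 1 - J^{-1} = -J^{-1}(1-J),
\[
f(A)(1-J) = \bigl((1-J^{-1})\,\bar f(A)\bigr)^{*}
          = \bigl(-J^{-1}(1-J)\,\bar f(A)\bigr)^{*} \in I_1 .
\]
% (b) Unitary invariance of the trace:
\[
\operatorname{Tr}\bigl(J f(A)(1-J)J^{-1}\bigr) = \operatorname{Tr}\bigl(f(A)(1-J)\bigr).
\]
% (c) Choose g \in C_0^\infty(\mathbb{R}) with g = 1 on supp f, so f(A) = f(A)g(A) = g(A)f(A).
%     Cyclicity (bounded operator times trace class operator) gives
\[
\operatorname{Tr}\bigl(f(A)\,g(A)(1-J)\bigr)
 = \operatorname{Tr}\bigl(g(A)(1-J)\,f(A)\bigr)
 = \operatorname{Tr}\bigl((1-J)f(A)\,g(A)\bigr)
 = \operatorname{Tr}\bigl((1-J)f(A)\bigr).
\]
```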

2.2. The SSF for $H$ and $H_0$. Let $m = [n/2]$ be the smallest integer greater than $n/2 - 1$. Then it is well-known that
$$(H - z)^{-m} - (H_0 - z)^{-m} \in I_1$$
for $z \notin \sigma(H_0) \cup \sigma(H)$. (We will discuss the proof of this briefly in the next subsection.) We let $M \ge -\inf V(x) + 1 \ge -\inf \sigma(H) + 1$, and we set
$$\mu(\lambda) = -M^m(\lambda + M)^{-m} + 1, \qquad \operatorname{Re} \lambda > -M.$$
We set
$$A = \mu(H), \qquad A_0 = \mu(H_0)$$
and consider $\xi(\lambda; A, A_0) = \xi(\lambda; \mu(H), \mu(H_0))$. By the invariance principle for the SSF ([1], (1.7)), we have
$$\xi(\mu(\lambda); \mu(H), \mu(H_0)) = \xi(\lambda; H, H_0), \qquad \lambda \in (-M, \infty).$$
(Note that $\mu(\lambda)$ is monotone increasing on $(-M, \infty)$.) We may consider this as the definition of the SSF for $H$ and $H_0$. It is easy to see
$$\sigma(A) = [0, 1] \cup \{\mu(E) \mid E \in \sigma_d(H)\}, \qquad \sigma(A_0) = \sigma_{ac}(A) = \sigma_{ac}(A_0) = [0, 1].$$
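Remark. The monotonicity of $\mu$ and the location of the spectra of $A$ and $A_0$ can be checked directly from the definition. The short computation below is added as a reader's aid; it uses only the definition of $\mu$ and the fact that $\sigma(H_0) = [0, \infty)$.

```latex
\begin{align*}
\mu(\lambda) &= 1 - M^{m}(\lambda+M)^{-m}, \\
\mu'(\lambda) &= m\,M^{m}(\lambda+M)^{-m-1} > 0 \qquad (\lambda > -M), \\
\mu(0) &= 0, \qquad \lim_{\lambda\to\infty}\mu(\lambda) = 1 .
\end{align*}
% Hence \mu maps \sigma(H_0) = [0,\infty) increasingly onto [0,1), whose closure is [0,1] = \sigma(A_0).
% Discrete eigenvalues E of H are negative and satisfy E \ge \inf\sigma(H) \ge -M + 1 > -M,
% so they are sent to points \mu(E) < 0 outside [0,1].
```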


2.3. Symbol classes. Let
$$g = \langle x \rangle^{-2} dx^2 + \langle \xi \rangle^{-2} d\xi^2, \qquad x, \xi \in \mathbb{R}^n,$$
be a Riemannian metric on $\mathbb{R}^{2n}$. For a given function $m = m(\hbar; x, \xi)$, the symbol class $S(m, g)$ is defined as follows (cf. Hörmander [4]): $a(\hbar; x, \xi) \in S(m, g)$ if and only if $a(\hbar; \cdot, \cdot)$ is a $C^\infty$-class function on $\mathbb{R}^{2n}$ and for any $\alpha$, $\beta$, it satisfies
$$\bigl|\partial_x^\alpha \partial_\xi^\beta a(\hbar; x, \xi)\bigr| \le C_{\alpha\beta}\, m(\hbar; x, \xi)\, \langle x \rangle^{-|\alpha|} \langle \xi \rangle^{-|\beta|}, \qquad x, \xi \in \mathbb{R}^n.$$
The quantization of $a(\hbar; x, \xi)$ is given by
$$a(\hbar; x, \hbar D)\varphi(x) = (2\pi\hbar)^{-n} \iint e^{i(x-y)\xi/\hbar}\, a(\hbar; x, \xi)\, \varphi(y)\, dy\, d\xi$$
for $\varphi \in \mathcal{S}(\mathbb{R}^n)$. By the Calderon–Vaillancourt theorem, $a(\hbar; x, \hbar D)$ is bounded in $L^2(\mathbb{R}^n)$ if $a \in S(1, g)$. We write $B \in OPS(m, g)$ if there is $b \in S(m, g)$ such that $B = b(\hbar; x, \hbar D)$. It is easy to see $H$, $H_0 \in OPS(\langle \xi \rangle^2, g)$. Moreover, the resolvents of $H$ and $H_0$ are elements of $OPS(\langle \xi \rangle^{-2}, g)$, and $A$, $A_0 \in OPS(\langle \xi \rangle^{-2m}, g)$. The assertion is proved by mimicking the construction of the parametrix for elliptic operators. By the asymptotic expansion formula, we also learn
$$A - A_0 \in OPS(\langle \xi \rangle^{-2(m+1)} \langle x \rangle^{-\mu}, g).$$
In particular, the principal symbol of $A - A_0$ is given by
$$M^m\bigl( (\xi^2 + V(x) + M)^{-m} - (\xi^2 + M)^{-m} \bigr) = -M^m \sum_{j=1}^{m} (\xi^2 + V(x) + M)^{-j}\, V\, (\xi^2 + M)^{-(m+1-j)}.$$
Combining this with the next lemma, we observe $A - A_0 \in I_1$.

Lemma 2.3. Let $p \ge 1$ and let $l > n/p$. Then $a(\hbar; x, \hbar D) \in I_p$ for $a \in S(\langle x \rangle^{-l} \langle \xi \rangle^{-l}, g)$. Moreover,
$$\|a(\hbar; x, \hbar D)\|_{I_p} \le C\hbar^{-n/p},$$
where $\|\cdot\|_{I_p}$ denotes the norm of $I_p$.

This lemma is standard, and we omit the proof.
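Remark. The way Lemma 2.3 yields $A - A_0 \in I_1$ can be made explicit. The following is a reader's sketch; the exponent $l$ chosen below is an illustrative choice and is not taken from the original text.

```latex
% From the text: A - A_0 \in OPS(\langle\xi\rangle^{-2(m+1)}\langle x\rangle^{-\mu}, g),
% with \mu > n (Assumption A) and m > n/2 - 1, i.e. 2(m+1) > n.
% Put l := \min\{2(m+1), \mu\} > n. Since \langle x\rangle \ge 1 and \langle\xi\rangle \ge 1,
\[
\langle\xi\rangle^{-2(m+1)}\langle x\rangle^{-\mu}
  \le \langle x\rangle^{-l}\langle\xi\rangle^{-l},
\]
% so the symbol of A - A_0 lies in S(\langle x\rangle^{-l}\langle\xi\rangle^{-l}, g).
% Lemma 2.3 with p = 1 (which needs l > n) then gives
\[
A - A_0 \in I_1, \qquad \|A - A_0\|_{I_1} \le C\hbar^{-n}.
\]
```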


2.4. Agmon metric and tunneling estimates. In this subsection, we consider a Schrödinger operator $H = H_0 + V$ on $L^2(\mathbb{R}^n)$, which does not necessarily satisfy our assumptions in the Introduction. We only suppose that $V(x)$ is continuous and bounded. $F(E)$, $G(E)$ and $\mu(\lambda)$ are defined as before. For each $E \in \mathbb{R}$, the Agmon metric (with respect to $V$) is defined as the pseudo-metric given by
$$ds_E^2 = (V(x) - E)_+\, dx^2, \qquad x \in \mathbb{R}^n,$$
where $(\cdot)_+ = \max(\cdot, 0)$. $ds_E$ vanishes on $G(E)$, and the induced distance (the Agmon distance) is given by
$$d_E(x, y) = \inf\left\{ \int_0^1 (V(\gamma(t)) - E)_+^{1/2}\, |d\gamma(t)| \;\middle|\; \gamma \in PC^1([0, 1]; \mathbb{R}^n),\ \gamma(0) = x,\ \gamma(1) = y \right\}$$
for $x, y \in \mathbb{R}^n$. $d_E(\cdot, \cdot)$ defines a (nondegenerate) distance on $F(E)$. We will use tunneling estimates of the following form, which may be called BCD (Briet-Combes-Duclos)-type resolvent estimates (cf. [2], see also [9]).

Lemma 2.4. Let $\Omega \subset F(E)$ be a compact set, and let $\chi_\Omega$ be the characteristic function. Let $0 < d < d_E(G(E), \Omega)$. Then there is $\varepsilon > 0$ such that if $\operatorname{Re} z \le E + \varepsilon$ and $\|(H - z)^{-1}\| \le e^{d/\hbar}$, then
$$\|\chi_\Omega (H - z)^{-1}\| \le C$$
with some $C > 0$ uniformly in $\hbar \in (0, 1]$.

We also need a variation of the BCD-type resolvent estimate on weighted $L^2$-spaces:

Lemma 2.5. Let $\Omega$ and $d > 0$ be as in Lemma 2.4. Let $\beta > 0$. Then there is $\varepsilon > 0$ such that if $\operatorname{Re} z \le E + \varepsilon$ and $\|(H - z)^{-1}\| \le e^{d/(1+\beta)\hbar}$, then
$$\|\chi_\Omega (H - z)^{-1} \langle x \rangle^\beta\| \le C.$$

Proof. We prove the assertion when $\beta$ is a positive integer. Then the general case follows by complex interpolation. We choose $\varepsilon > 0$ and $W \in C^\infty(\mathbb{R}^n)$ so that $W$ is supported in a small neighborhood of $G(E + \varepsilon)$ and that
$$V(x) + W(x) \ge E + 2\varepsilon, \qquad d_E(\operatorname{supp} W, \Omega) > d.$$


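Then we have
$$\begin{aligned}
\chi_\Omega (H - z)^{-1} \langle x \rangle^\beta
&= \chi_\Omega (H + W - z)^{-1} \langle x \rangle^\beta + \chi_\Omega (H + W - z)^{-1} W (H - z)^{-1} \langle x \rangle^\beta \\
&= \chi_\Omega \langle x \rangle^\beta \bigl( \langle x \rangle^{-\beta}(H + W)\langle x \rangle^\beta - z \bigr)^{-1} \\
&\quad + \chi_\Omega \langle x \rangle^\beta \bigl( \langle x \rangle^{-\beta}(H + W)\langle x \rangle^\beta - z \bigr)^{-1} W\, \langle x \rangle^{-\beta}(H - z)^{-1}\langle x \rangle^\beta.
\end{aligned}$$
It is easy to see that the first term is bounded if $\hbar$ is sufficiently small. By the standard argument, we can show (cf. [9]),
$$\bigl\| \chi_\Omega \langle x \rangle^\beta \bigl( \langle x \rangle^{-\beta}(H + W)\langle x \rangle^\beta - z \bigr)^{-1} W \bigr\| \le C e^{-d/\hbar}.$$
On the other hand, we can prove by elementary commutator computations that
$$\bigl\| \langle x \rangle^{-\beta}(H - z)^{-1}\langle x \rangle^\beta \bigr\| \le C \bigl\| (H - z)^{-1} \bigr\|^{\beta+1}.$$
Combining these, we conclude the assertion. $\square$

The next estimate is well-known, and can be proved using Lemma 2.4.

Lemma 2.6. Let $\Omega$ and $d > 0$ be as in Lemma 2.4. Then there is $\varepsilon > 0$ such that if $\psi(x)$ is an eigenfunction of $H$ with an eigenvalue $\lambda \le E + \varepsilon$ then
$$\|\chi_\Omega \psi\| \le C e^{-d/\hbar} \|\psi\|.$$
Moreover, for any multi-index $\alpha$,
$$\|\chi_\Omega \partial_x^\alpha \psi\| \le C e^{-d/\hbar} \|\psi\|,$$
if $V(x)$ is smooth.

2.5. Scattering theory for $H^e$. By virtue of the nontrapping condition, we have the celebrated semiclassical resolvent estimate of Robert–Tamura:

Proposition 2.7. Suppose Assumptions A and B and let $\nu > 1/2$. Then there is $\varepsilon > 0$ such that
$$\sup_{\gamma > 0} \|\langle x \rangle^{-\nu}(H^e - \lambda \pm i\gamma)^{-1}\langle x \rangle^{-\nu}\| \le C\hbar^{-1}, \qquad \lambda \in [E_0 - \varepsilon, E_0 + \varepsilon].$$
In particular,
$$\|\langle x \rangle^{-\nu}(H^e - \lambda \pm i0)^{-1}\langle x \rangle^{-\nu}\| \le C\hbar^{-1}, \qquad \lambda \in [E_0 - \varepsilon, E_0 + \varepsilon].$$

Then we can construct a generalized eigenfunction expansion for $H^e$ for energy in a neighborhood of $E_0$. For $\xi \in \mathbb{R}^n$, we set
$$\Psi_0(\xi; x) = (2\pi\hbar)^{-n/2} e^{ix\cdot\xi/\hbar}, \qquad x \in \mathbb{R}^n.$$
$\Psi_0(\xi; x)$ is a generalized eigenfunction of $H_0$, and $\Psi_0(\xi; \cdot) \in L^{2,-s}(\mathbb{R}^n)$ for $s > n/2$. Here we denote the weighted $L^2$-space of order $s$ by $L^{2,s}$:
$$L^{2,s}(\mathbb{R}^n) := \bigl\{ \varphi \in L^2_{\mathrm{loc}}(\mathbb{R}^n) \bigm| \langle x \rangle^s \varphi(x) \in L^2(\mathbb{R}^n) \bigr\}.$$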


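Then we set
$$\Psi_e^\pm(\xi; x) = \Psi_0(\xi; x) - (H^e - |\xi|^2 \pm i0)^{-1} V^e \Psi_0(\xi; \cdot), \qquad x \in \mathbb{R}^n,$$
for $\xi$ such that $|\xi|^2 \in [E_0 - \varepsilon, E_0 + \varepsilon]$. It is well-known that $\Psi_e^\pm(\xi; \cdot)$ is a generalized eigenfunction of $H^e$, and we have
$$f(H^e)\varphi = \iint f(|\xi|^2)\, \Psi_e^\pm(\xi; x)\, \overline{\Psi_e^\pm(\xi; y)}\, \varphi(y)\, dy\, d\xi, \qquad \varphi \in L^{2,s}(\mathbb{R}^n),$$
if $f \in C_0^\infty([E_0 - \varepsilon, E_0 + \varepsilon])$ and $s > n/2$. The integration in $\xi$ is taken as a strong integral in $L^{2,-s}(\mathbb{R}^n)$. We note that by the semiclassical resolvent estimate, we have
$$\bigl\| \Psi_e^\pm(\xi; \cdot) \bigr\|_{L^{2,-s}} \le C\hbar^{-n/2-1}.$$
The next lemma is proved in exactly the same way as the Agmon-type estimates for (usual) eigenfunctions.

Lemma 2.8. Let $\Omega \subset\subset F(E_0)$ and let $0 < d < d_{E_0}(G_e(E_0), \Omega)$. Then there is $\varepsilon > 0$ such that
$$\bigl\| \chi_\Omega \Psi_e^\pm(\xi; x) \bigr\| \le C e^{-d/\hbar}, \qquad \text{if } |\xi|^2 \in [E_0 - \varepsilon, E_0 + \varepsilon].$$
Moreover, for any multi-index $\alpha$,
$$\bigl\| \chi_\Omega \partial_x^\alpha \Psi_e^\pm(\xi; x) \bigr\| \le C e^{-d/\hbar}, \qquad \text{if } |\xi|^2 \in [E_0 - \varepsilon, E_0 + \varepsilon],$$
and hence
$$\sup_{x \in \Omega} \bigl| \partial_x^\alpha \Psi_e^\pm(\xi; x) \bigr| \le C e^{-d/\hbar}, \qquad \text{if } |\xi|^2 \in [E_0 - \varepsilon, E_0 + \varepsilon].$$

Lemma 2.9. Let $\Omega \subset\subset F(E_0 + \varepsilon)$ with sufficiently small $\varepsilon > 0$, $l \ge 0$ and let $\nu > 1/2$. Then
$$\bigl\| \chi_\Omega\, p^\alpha (H^e - \lambda \pm i\gamma)^{-1} \langle x \rangle^{-\nu} \langle p \rangle^{-l} \bigr\| \le C$$
for $\lambda \in [E_0 - \varepsilon, E_0 + \varepsilon]$, $\gamma > 0$, if $|\alpha| \le l + 2$.

Proof. We mimic the proof of Lemma 2.5, and let $W$ and $\varepsilon$ be as in that proof. Then we have
$$\begin{aligned}
\chi_\Omega\, p^\alpha (H^e - z)^{-1} \langle x \rangle^{-\nu} \langle p \rangle^{-l}
&= \chi_\Omega\, p^\alpha (H^e + W - z)^{-1} \langle x \rangle^{-\nu} \langle p \rangle^{-l} \\
&\quad + \bigl[ \chi_\Omega\, p^\alpha (H^e + W - z)^{-1} W \langle x \rangle^{\nu} \langle p \rangle^{-l} \bigr] \bigl[ \langle p \rangle^{l} \langle x \rangle^{-\nu} (H^e + M)^{-l/2} \langle x \rangle^{\nu} \bigr] \\
&\quad \times \bigl[ \langle x \rangle^{-\nu} (H^e - z)^{-1} \langle x \rangle^{-\nu} \bigr] \bigl[ \langle x \rangle^{\nu} (H^e + M)^{l/2} \langle x \rangle^{-\nu} \langle p \rangle^{-l} \bigr].
\end{aligned}$$
We can prove
$$\| \chi_\Omega\, p^\alpha (H^e + W - z)^{-1} W \langle x \rangle^{-\nu} \langle p \rangle^{-l} \| \le C e^{-\delta/\hbar}$$
with some $\delta > 0$ for $z$ in a neighborhood of $E_0$. Combining this with the semiclassical resolvent estimate, we have
$$\| \chi_\Omega\, p^\alpha (H^e - \lambda \pm i\gamma)^{-1} W \langle x \rangle^{-\nu} \langle p \rangle^{-l} \| \le C + C e^{-\delta/\hbar} \hbar^{-1} \le C'$$
for $\lambda$ in a neighborhood of $E_0$ and $\gamma > 0$. $\square$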


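3. Proof of the Main Result

We set $V^c(x) \in C^\infty(\mathbb{R}^n)$ so that
$$V^c(x) = \begin{cases} V^e(x) & \text{if } x \in \Omega_1, \\ V^i(x) & \text{if } x \notin \Omega_1. \end{cases}$$
Then $V^c(x) \ge E_0 + \delta$ for $x \in \mathbb{R}^n$, and $V^c(x) = V(x)$ if $x \in \Omega_2 \setminus \Omega_1$. We also set $H^c = H_0 + V^c$ on $L^2(\mathbb{R}^n)$ with $D(H^c) = H^2(\mathbb{R}^n)$ as before. Now we set
$$\mathcal{H} = H \oplus H^c, \qquad \mathcal{H}_0 = H^e \oplus H^i \qquad \text{on } \mathfrak{H} := L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n),$$
and we approximate $\mathcal{H}$ in terms of $\mathcal{H}_0$. Let $j_e(x)$ and $j_i(x)$ be nonnegative smooth functions on $\mathbb{R}^n$ such that
$$j_e(x)^2 + j_i(x)^2 = 1, \quad x \in \mathbb{R}^n, \qquad \operatorname{supp} j_e \subset \mathbb{R}^n \setminus \Omega_1, \qquad \operatorname{supp} j_i \subset \Omega_2.$$
Moreover, we suppose $\operatorname{supp} \nabla j_e$, $\operatorname{supp} \nabla j_i \subset\subset \Omega_2 \setminus \Omega_1$. Then we define a unitary operator $J$ on $\mathfrak{H}$ by
$$J u(x) = \begin{pmatrix} j_e(x) & -j_i(x) \\ j_i(x) & j_e(x) \end{pmatrix} \begin{pmatrix} u_1(x) \\ u_2(x) \end{pmatrix} \qquad \text{for } u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \in \mathfrak{H}.$$
By the above construction, it is easy to see
$$J \begin{pmatrix} V^e & \\ & V^i \end{pmatrix} J^{-1} = \begin{pmatrix} V & \\ & V^c \end{pmatrix}.$$
On the other hand,
$$J \begin{pmatrix} H_0 & \\ & H_0 \end{pmatrix} J^{-1} = \begin{pmatrix} H_0 & \\ & H_0 \end{pmatrix} + J \left[ \begin{pmatrix} H_0 & \\ & H_0 \end{pmatrix}, J^{-1} \right],$$
and by straightforward computations, we learn
$$J \left[ \begin{pmatrix} H_0 & \\ & H_0 \end{pmatrix}, J^{-1} \right] = \begin{pmatrix} j_e[H_0, j_e] + j_i[H_0, j_i] & j_e[H_0, j_i] - j_i[H_0, j_e] \\ j_i[H_0, j_e] - j_e[H_0, j_i] & j_i[H_0, j_i] + j_e[H_0, j_e] \end{pmatrix}.$$
Each entry of the right hand side is of the form $\hbar^2 f_1(x) + \hbar f_2(x) p$ with $\operatorname{supp} f_k \subset \Omega_2 \setminus \Omega_1$. Thus we have shown:

Lemma 3.1.
$$\mathcal{H} = J \mathcal{H}_0 J^{-1} + T, \qquad T = \hbar^2 t_1(x) + \hbar\, t_2(x) p,$$
where $t_j(x)$ are $2 \times 2$-matrix valued smooth functions on $\mathbb{R}^n$ and $\operatorname{supp} t_j \subset\subset \Omega_2 \setminus \Omega_1$ ($j = 1, 2$).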
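Remark. The form $\hbar^2 f_1(x) + \hbar f_2(x) p$ of each matrix entry comes from a one-line commutator computation, written out here as a reader's aid for a generic smooth bounded function $j$ (standing for $j_e$ or $j_i$).

```latex
% p_k = -i\hbar\,\partial_{x_k}, \quad H_0 = \sum_k p_k^2, \quad j smooth with bounded derivatives.
\begin{align*}
[p_k, j] &= -i\hbar\,(\partial_{x_k} j), \\
[p_k^2, j] &= p_k[p_k, j] + [p_k, j]\,p_k
            = -\hbar^2(\partial_{x_k}^2 j) - 2i\hbar\,(\partial_{x_k} j)\,p_k, \\
[H_0, j] &= -\hbar^2(\Delta j) - 2i\hbar\,(\nabla j)\cdot p .
\end{align*}
% Each coefficient contains a derivative of j, so after multiplication by j_e or j_i
% it is supported in supp \nabla j_e \cup supp \nabla j_i, which lies in \Omega_2 \setminus \Omega_1.
```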


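In order to prove Theorem 1.1, we use the chain rule for the SSF, i.e.,
$$\xi(\lambda; A, C) = \xi(\lambda; A, B) + \xi(\lambda; B, C)$$
for operators $A$, $B$ and $C$. For $\lambda \le E_0 + \delta$, we have
$$\begin{aligned}
\xi(\lambda; H, H_0) &= \xi(\lambda; H \oplus H^c, H_0 \oplus H^c) \\
&= \xi(\lambda; H \oplus H^c, H^e \oplus H^i) + \xi(\lambda; H^e \oplus H^i, H_0 \oplus H^c).
\end{aligned}$$
By Definition (1.2), we learn
$$\xi(\lambda; H^e \oplus H^i, H_0 \oplus H^c) = \xi(\lambda; H^e, H_0) + \xi(\lambda; H^i, H^c) = \xi(\lambda; H^e, H_0) - N(\lambda).$$
Here we have used the fact $\sigma(H^c) \subset [E_0 + \delta, \infty)$ and Theorem 3.1 of [1]. On the other hand, we can write
$$\xi(\lambda; H \oplus H^c, H^e \oplus H^i) = \xi(\lambda; \mathcal{H}, \mathcal{H}_0) = \xi(\lambda; \mathcal{H}, J\mathcal{H}_0 J^{-1}) + \xi(\lambda; J\mathcal{H}_0 J^{-1}, \mathcal{H}_0).$$
It is easy to see that $(1 - J) f(\mathcal{H}_0) \in I_1$ for $f \in C_0^\infty(\mathbb{R})$. We can apply the stability of the SSF (Lemma 2.2), and hence
$$\xi(\lambda; J\mathcal{H}_0 J^{-1}, \mathcal{H}_0) = 0, \qquad \lambda \in \mathbb{R}.$$
Combining these, we have the following formula:
$$\xi(\lambda; H, H_0) - \xi(\lambda; H^e, H_0) + N(\lambda) = \xi(\lambda; \mathcal{H}, J\mathcal{H}_0 J^{-1}) \tag{3.1}$$
for $\lambda \le E_0 + \delta$. Thus it remains only to estimate the right-hand side of (3.1). We recall
$$\xi(\lambda; \mathcal{H}, J\mathcal{H}_0 J^{-1}) = \xi(\mu(\lambda); \mu(\mathcal{H}), \mu(J\mathcal{H}_0 J^{-1})),$$
and we set
$$\Xi(z) = (\operatorname{Im} \mu(z))\,(\mu(\mathcal{H}) - \mu(z))^{-1}\, \tilde T\, (\mu(J\mathcal{H}_0 J^{-1}) - \mu(z))^{-1}$$
for $z \notin \sigma(\mathcal{H}) \cup \sigma(\mathcal{H}_0)$, where $\tilde T = \mu(\mathcal{H}) - \mu(J\mathcal{H}_0 J^{-1}) \in I_1$. Then we have
$$\xi(\lambda; \mathcal{H}, J\mathcal{H}_0 J^{-1}) = \lim_{\gamma \to 0} \frac{1}{2i} \log\det(1 - 2i\,\Xi(\lambda + i\gamma))$$
by Lemma 2.1.

Proposition 3.2. There exist $\varepsilon$ and $\delta_1 > 0$ such that
$$\|\Xi(\lambda + i\gamma)\|_{I_1} \le C\gamma \cdot \hbar^{1-n} \tag{3.2}$$
if $\lambda \in [-M + 1/2, E_0 + \varepsilon]$, $\gamma > 0$ and $\operatorname{dist}(\lambda + i\gamma, \sigma(\mathcal{H}_0) \cup \sigma(\mathcal{H})) \ge e^{-\delta_1/\hbar}$.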


We set $D = \Omega_2 \setminus \Omega_1$, so that $T$ is supported in the interior of $D$.

Proposition 3.3. Suppose $\alpha$ and $\beta$ are constants such that
$$0 < \beta < d_{E_0}(G_i(E_0), D), \qquad 0 < \alpha < 2 d_{E_0}(G_e(E_0), D).$$
Then there exists $\varepsilon > 0$ such that if $\lambda \in [E_0 - \varepsilon, E_0 + \varepsilon]$ and $\operatorname{dist}(\lambda, \sigma(H^i)) \ge e^{-\beta/\hbar}$ then
$$\|\Xi(\lambda + i\gamma)\|_{I_1} \le C\gamma + C e^{-\alpha/\hbar}.$$
In particular,
$$\limsup_{\gamma \to 0} \|\Xi(\lambda + i\gamma)\|_{I_1} \le C e^{-\alpha/\hbar}$$
if $\lambda$ satisfies the conditions.

We will prove these propositions in the next section, and we now prove Theorem 1.1 using them.

Proof of Theorem 1.1. In order to prove Theorem 1.1, we need to be careful about the branch of the logarithm function. We first recall
$$|\log\det(1 + A)| \le C\|A\|_{I_1} \qquad \text{if } \|A\|_{I_1} \le c < 1.$$
Let $\alpha$, $\beta$, $\varepsilon$ and $\delta_1$ be as in the above propositions. We set
$$z_1 = -M + \frac{1}{2}, \qquad z_2 = -M + \frac{1}{2} + i e^{-\delta_1/\hbar}, \qquad z_3 = \lambda + i e^{-\delta_1/\hbar}, \qquad z_4 = \lambda,$$
where $\lambda$ is an energy satisfying the conditions of Theorem 1.1. We let $z$ move along the lines $z_1 \to z_2 \to z_3 \to z_4$. By Propositions 3.2 and 3.3, we learn
$$\|\Xi(z)\|_{I_1} \le C e^{-\delta_2/\hbar}$$
along the lines with some $\delta_2 > 0$. Hence $\det(1 - 2i\,\Xi(z))$ stays very close to 1 for such $z$ (if $\hbar$ is sufficiently small). On the other hand, by the definition of $\xi(\lambda)$, we have
$$\xi(z_1) = \frac{1}{2i} \log\det(1 - 2i\,\Xi(z_1)) = 0,$$
and hence $\log\det(1 - 2i\,\Xi(z))$ stays very close to 0 for such $z$ on these lines. (Note that $\Xi(z)$ is a continuous $I_1$-valued function on the upper half plane.) Combining these, we obtain
$$|\xi(\lambda; \mathcal{H}, J\mathcal{H}_0 J^{-1})| \le C \limsup_{\gamma \to 0} \|\Xi(\lambda + i\gamma)\|_{I_1}.$$
Now we apply the second assertion of Proposition 3.3, and by (3.1) we conclude the assertion of Theorem 1.1. $\square$


4. Proof of Propositions 3.2 and 3.3

4.1. Proof of Proposition 3.2.

Lemma 4.1. Let $A = \mathcal{H}$ or $\mathcal{H}_0$. Then there are bounded operators $F_0(z, A)$ and $F_1(z, A)$, which are functions of $z$ and $A$, such that
$$(\mu(A) - \mu(z))^{-1} = F_0(z, A) + (A - z)^{-1} F_1(z, A) \tag{4.1}$$
for $z$ in a neighborhood of $[-M + 1/2, E_0 + 1]$ and $z \notin \sigma(A)$.

Proof. By simple computations, we have
$$\begin{aligned}
\mu(A) - \mu(z) &= -M^m \bigl( (A + M)^{-m} - (z + M)^{-m} \bigr) \\
&= M^m (A + M)^{-m} (z + M)^{-m} (A - z) \sum_{j=0}^{m-1} (A + M)^{j} (z + M)^{m-1-j} \\
&= M^m (A + M)^{-1} (A - z) \sum_{j=0}^{m-1} (A + M)^{-(m-1)+j} (z + M)^{-1-j}.
\end{aligned}$$
The last term:
$$L := \sum_{j=0}^{m-1} (A + M)^{j} (z + M)^{m-1-j}$$
is invertible for $z$ satisfying the conditions since $A + M > 1$, and hence
$$(\mu(A) - \mu(z))^{-1} = M^{-m}(A + M)(A - z)^{-1} L^{-1} = (M + z) M^{-m} L^{-1} + (A - z)^{-1} M^{-m} L^{-1}.$$
By setting $F_0(z, A) = (M + z) M^{-m} L^{-1}$ and $F_1(z, A) = M^{-m} L^{-1}$, we have formula (4.1). $\square$

Lemma 4.2. Let $\tilde T = \mu(\mathcal{H}) - \mu(J\mathcal{H}_0 J^{-1})$. Then
$$\tilde T = M^m \sum_{j=1}^{m} (\mathcal{H} + M)^{-(m+1-j)}\, T\, J (\mathcal{H}_0 + M)^{-j} J^{-1}.$$
The proof is an easy computation, and we omit it.

Lemma 4.3. There are $\delta_1 > 0$ and $C > 0$ such that
$$\bigl\| (\mu(\mathcal{H}) - \mu(z))^{-1}\, \tilde T\, (\mu(J\mathcal{H}_0 J^{-1}) - \mu(z))^{-1} \bigr\|_{I_1} \le C\hbar^{1-n} \tag{4.2}$$
for $z$ satisfying the conditions of Proposition 3.2.


Proof. By Lemmas 4.1 and 4.2, $(\mu(H) - \mu(z))^{-1}\,\tilde T\,(\mu(J H_0 J^{-1}) - \mu(z))^{-1}$ is a sum of the terms of the following form:
\[ B_{jkl} = M^m F_k(z, H)(H - z)^{-k}(H+M)^{-(m+1-j)} \times T J (H_0+M)^{-j}(H_0 - z)^{-l} F_l(z, H_0) J^{-1}, \tag{4.3} \]
where $1 \le j \le m$, and $k, l = 0$ or $1$. We consider $B_{j11}$ only. The other terms are easier to estimate. Let
\[ m_1 = 2(m + 1 - j), \qquad m_2 = 2j. \]
Then $m_1 + m_2 = 2m + 2 > n$ and $m_1, m_2 > 0$. We set $p$ and $q > 1$ so that $p^{-1} + q^{-1} = 1$ and
\[ m_1 p > n, \qquad m_2 q > n. \]
Then we have
\[ \|B_{j11}\|_{I_1} \le M^m \|F_1(z, H)\| \cdot \|\chi^D (H - z)^{-1}(H+M)^{-m_1/2}\|_{I_p} \times \|T (H_0 - z)^{-1}(H_0+M)^{-m_2/2}\|_{I_q} \cdot \|F_1(z, H_0)\| \]
\[ \qquad \le C \|\chi^D (H - z)^{-1}\langle x\rangle^{m_1}\| \cdot \|\langle x\rangle^{-m_1}(H+M)^{-m_1/2}\|_{I_p} \times \|T J (H_0 - z)^{-1}\langle x\rangle^{m_2}\| \cdot \|\langle x\rangle^{-m_2}(H_0+M)^{-m_2/2}\|_{I_q}. \]
By Lemma 2.5, we observe
\[ \|\chi^D (H - z)^{-1}\langle x\rangle^{m_1}\| \le C, \qquad \|T J (H_0 - z)^{-1}\langle x\rangle^{m_2}\| \le C\hbar \]
if $\delta_1$ is sufficiently small. On the other hand, by Lemma 2.3, we have
\[ \|\langle x\rangle^{-m_1}(H+M)^{-m_1/2}\|_{I_p} \le C\hbar^{-n/p}, \qquad \|\langle x\rangle^{-m_2}(H_0+M)^{-m_2/2}\|_{I_q} \le C\hbar^{-n/q}. \]
Combining these, we obtain
\[ \|B_{j11}\|_{I_1} \le C\hbar^{1 - n/p - n/q} = C\hbar^{1-n}. \]
□

Proposition 3.2 follows immediately from Lemma 4.3 since $|\operatorname{Im}\mu(z)| \le C|\operatorname{Im} z|$ if $\operatorname{Re} z > -M + 1/2$ and $|\operatorname{Im} z| \le 1$. □
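The algebra behind the decomposition (4.1) can be sanity-checked numerically in the scalar case. The sketch below is illustrative only: it replaces the operator $A$ by a real number `a`, takes $\mu(x) = -M^m (x+M)^{-m}$ (read off from the first line of the proof of Lemma 4.1, up to an additive constant that cancels in the difference), and uses $F_0 = M^{-m}L^{-1}$, $F_1 = (M+z)M^{-m}L^{-1}$, which is the assignment for which the scalar identity holds. The function names and sample values are hypothetical.

```python
# Scalar sanity check of the decomposition (4.1); the operator A is replaced
# by a real number a. Names and sample values are illustrative assumptions.


def mu(x, M, m):
    """mu(x) = -M^m (x + M)^(-m), as assumed from the proof of Lemma 4.1."""
    return -(M ** m) * (x + M) ** (-m)


def last_factor(a, z, M, m):
    """L = sum_{j=0}^{m-1} (a + M)^(-(m-1)+j) (z + M)^(-1-j)."""
    return sum((a + M) ** (-(m - 1) + j) * (z + M) ** (-1 - j) for j in range(m))


def decomposition(a, z, M, m):
    """Return (F0, F1) such that 1/(mu(a) - mu(z)) = F0 + F1/(a - z)."""
    l_inv = 1 / last_factor(a, z, M, m)
    f0 = M ** (-m) * l_inv
    f1 = (M + z) * M ** (-m) * l_inv
    return f0, f1


def residual(a, z, M, m):
    """Absolute difference between the two sides of (4.1) for scalars."""
    f0, f1 = decomposition(a, z, M, m)
    lhs = 1 / (mu(a, M, m) - mu(z, M, m))
    return abs(lhs - (f0 + f1 / (a - z)))


if __name__ == "__main__":
    # A residual at rounding-error level indicates the identity holds.
    print(residual(a=0.7, z=0.3 + 0.2j, M=3.0, m=4))
```

A scalar check of this kind says nothing about the trace-norm estimates themselves; it only confirms the partial-fraction step used to produce the terms $B_{jkl}$.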


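4.2. Proof of Proposition 3.3. We set $\eta_\varepsilon(\lambda) \in C_0^\infty(\mathbb{R})$ such that $\eta_\varepsilon(\lambda) = 1$ if $\lambda \in [E_0 - \varepsilon/2, E_0 + \varepsilon/2]$, and $\operatorname{supp}\eta_\varepsilon \subset [E_0 - \varepsilon, E_0 + \varepsilon]$, with sufficiently small $\varepsilon > 0$. We also write $\bar\eta_\varepsilon(\lambda) = 1 - \eta_\varepsilon(\lambda)$.

Lemma 4.4. Let $\nu > 1/2$, $l \ge 0$ and let $0 < \beta < d^{E_0}(G_i(E_0), D)$. Then there is $\varepsilon > 0$ such that
\[ \sup_{\gamma > 0}\bigl\|\chi^D p^\alpha (H_0 - \lambda \pm i\gamma)^{-1}\langle x\rangle^{-\nu}\langle p\rangle^{-l}\bigr\| \le C \]
if $\lambda \in [E_0 - \varepsilon/3, E_0 + \varepsilon/3]$, $|\alpha| \le l + 2$ and $\operatorname{dist}(\lambda, \sigma(H^i)) \ge e^{-\beta/\hbar}$.

Proof. It suffices to show
\[ \bigl\|\chi^D p^\alpha (H^e - \lambda \pm i\gamma)^{-1}\langle x\rangle^{-\nu}\langle p\rangle^{-l}\bigr\| \le C, \qquad \gamma > 0, \tag{4.4} \]
and
\[ \bigl\|\chi^D p^\alpha (H^i - \lambda \pm i\gamma)^{-1}\langle p\rangle^{-l}\bigr\| \le C, \qquad \gamma > 0, \tag{4.5} \]
for $\lambda$ satisfying the conditions. Equation (4.4) follows immediately from Lemma 2.9. We write
\[ \chi^D p^\alpha (H^i - \lambda \pm i\gamma)^{-1}\langle p\rangle^{-l} = \chi^D p^\alpha (H^i - \lambda \pm i\gamma)^{-1}\eta_\varepsilon(H^i)\langle p\rangle^{-l} + \chi^D p^\alpha (H^i - \lambda \pm i\gamma)^{-1}\bar\eta_\varepsilon(H^i)\langle p\rangle^{-l} = I + II. \]
Clearly, $II$ is bounded if $\hbar$ is sufficiently small. On the other hand, we have
\[ \chi^D p^\alpha (H^i - \lambda \pm i\gamma)^{-1}\eta_\varepsilon(H^i)\varphi = \sum_{\lambda_j}(\lambda_j - \lambda \pm i\gamma)^{-1}\eta_\varepsilon(\lambda_j)\langle\varphi, \psi_j\rangle\,\chi^D p^\alpha\psi_j \]
for $\varphi \in L^2(\mathbb{R}^n)$, where the sum runs over $\lambda_j \in \sigma(H^i) \cap [E_0 - \varepsilon, E_0 + \varepsilon]$, and $\psi_j$ is a normalized eigenfunction with the eigenvalue $\lambda_j$. Then by Lemma 2.6, we have
\[ \|I\| \le \bigl\|\chi^D p^l (H^i - \lambda \pm i\gamma)^{-1}\eta_\varepsilon(H^i)\varphi\bigr\| \le \sum_{\lambda_j}|\lambda_j - \lambda|^{-1}\|\chi^D p^l\psi_j\| \le \sharp\{\lambda_j\} \times C e^{\beta/\hbar}e^{-d/\hbar} \]
with $0 < \beta < d < d^{E_0}(G_i(E_0), D)$. Since $\sharp\{\lambda_j \in \sigma(H^i) \mid \lambda_j \le E_0 + \varepsilon\} = O(\hbar^{-n})$, this implies $\|I\| \le C$ and hence (4.5). □

Lemma 4.5. Let $\nu$, $\beta$ and $l$ be as in Lemma 4.4. Then
\[ \bigl\|\chi^D p^\alpha (H - \lambda \pm i\gamma)^{-1}\langle x\rangle^{-\nu}\langle p\rangle^{-l}\bigr\| \le C \]
for $\lambda$ satisfying the conditions of Lemma 4.4, where $|\alpha| \le l + 2$.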


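Proof. By the resolvent formula, we have
\[ (H - z)^{-1} = J(H_0 - z)^{-1}J^{-1}\bigl(1 + T J (H_0 - z)^{-1}J^{-1}\bigr)^{-1} \]
\[ \qquad = J(H_0 - z)^{-1}J^{-1}\langle x\rangle^{-\nu}\langle p\rangle^{-l} \times \bigl(1 + \langle p\rangle^{l}\langle x\rangle^{\nu}\, T J (H_0 - z)^{-1}J^{-1}\langle x\rangle^{-\nu}\langle p\rangle^{-l}\bigr)^{-1}\langle p\rangle^{l}\langle x\rangle^{\nu}. \]
Now the lemma follows from Lemma 4.4 and its proof. □

We now estimate $\operatorname{Im} z \cdot B_{jkl}(z)$ as in the proof of Lemma 4.3 (cf. (4.3) for the definition of $B_{jkl}$). As before, we estimate $B_{j11}$ only. The other terms are easier to handle. Let $l$ be an integer such that
\[ \frac{m+1}{2} \le l \le \frac{m}{2} + 1. \]
Note that $2l > n/2$. We set $\chi \in C_0^\infty(\mathbb{R}^n)$ so that $\operatorname{supp}\chi \subset D$ and $\chi T = T$. Then we write
\[ B_{j11}(z) = M^m F_1(z, H)(H - z)^{-1}(H+M)^{-m-1+j}\,\chi\,(H_0+M)^{-l+m+1-j} \times (H_0+M)^{l-m-1+j}\, T J (H_0+M)^{-j}(H_0 - z)^{-1}F_1(z, H_0)J^{-1}, \]
and hence
\[ \|B_{j11}\|_{I_1} \le C\,\bigl\|(H_0+M)^{-l+m+1-j}\chi (H - z)^{-1}(H+M)^{-m-1+j}\bigr\|_{I_2} \times \bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - z)^{-1}(H_0+M)^{-j}\bigr\|_{I_2}. \tag{4.6} \]
We first consider the second component.
\[ \bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - z)^{-1}(H_0+M)^{-j}\bigr\|_{I_2} \le \bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - z)^{-1}\bar\eta_\varepsilon(H_0)(H_0+M)^{-j}\bigr\|_{I_2} + \bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - z)^{-1}\eta_\varepsilon(H_0)(H_0+M)^{-j}\bigr\|_{I_2}. \]
It is easy to see that the first term is bounded, uniformly for $z = \lambda \pm i\gamma$ with $\lambda \in [E_0 - \varepsilon/3, E_0 + \varepsilon/3]$ and $\hbar \in (0, 1]$. The latter term is bounded by
\[ C\,\bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - z)^{-1}\eta_\varepsilon(H_0)\bigr\|_{I_2} \le C\sum_{|\alpha| \le 2(l-m-1+j)_+}\bigl\|p^\alpha T J (H_0 - z)^{-1}\eta_\varepsilon(H_0)\bigr\|_{I_2}. \]
We represent the last expression using the eigenfunction expansion to obtain
\[ \bigl\|p^\alpha T J (H_0 - z)^{-1}\eta_\varepsilon(H_0)\bigr\|_{I_2}^2 = \operatorname{Tr}\bigl[p^\alpha T J (H_0 - z)^{-1}\eta_\varepsilon(H_0)^2 (H_0 - \bar z)^{-1}J^* T^* p^\alpha\bigr] \]
\[ \qquad = \iint |p^\alpha T J \Psi_e(\xi; x)|^2\,\eta_\varepsilon(|\xi|^2)^2\,\bigl||\xi|^2 - z\bigr|^{-2}\,dx\,d\xi + \sum_{\lambda_j}\int |p^\alpha T J \psi_j(x)|^2\,dx\;\eta_\varepsilon(\lambda_j)^2\,|\lambda_j - z|^{-2} = I + II. \]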


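We fix $\lambda \in [E_0 - \varepsilon/3, E_0 + \varepsilon/3]$ satisfying the conditions of Proposition 3.3, and set $z = \lambda + i\gamma$ with $\gamma > 0$. We also set
\[ \rho_\gamma(t) = \pi^{-1}\operatorname{Im} z\,|t - z|^{-2} = \pi^{-1}\gamma\bigl((\lambda - t)^2 + \gamma^2\bigr)^{-1}. \]
Then $\int\rho_\gamma(t)\,dt = 1$ for $\gamma > 0$, and $\rho_\gamma(t) \to \delta(t - \lambda)$ as $\gamma \to 0$. Using these symbols, we have
\[ \pi^{-1}\gamma\cdot I \le \int\|p^\alpha T J \Psi_e(\xi; \cdot)\|^2\,\eta_\varepsilon(|\xi|^2)\,\rho_\gamma(|\xi|^2)\,d\xi \le C e^{-2d/\hbar} \]
by Lemma 2.8 with $d < d^{E_0}(G_e(E_0), D)$. The term $II$ is estimated as in the proof of Lemma 4.4 and it is bounded. Combining these, we have
\[ \gamma^{1/2}\bigl\|(H_0+M)^{l-m-1+j}\, T J (H_0 - \lambda - i\gamma)^{-1}(H_0+M)^{-j}\bigr\|_{I_2} \le C\gamma^{1/2} + C e^{-d/\hbar} \tag{4.7} \]
for $\gamma > 0$. We next consider
\[ \bigl\|(H_0+M)^{-l+m+1-j}\chi (H - z)^{-1}(H+M)^{-m-1+j}\bigr\|_{I_2} \le \bigl\|(H_0+M)^{-l+m+1-j}\chi (H - z)^{-1}\bar\eta_\varepsilon(H)(H+M)^{-m-1+j}\bigr\|_{I_2} + \bigl\|(H_0+M)^{-l+m+1-j}\chi (H - z)^{-1}\eta_\varepsilon(H)(H+M)^{-m-1+j}\bigr\|_{I_2}. \]
The first term in the right hand side is bounded similarly as above, and the second term is bounded by
\[ C\sum_\alpha\|p^\alpha\chi (H - z)^{-1}\eta_\varepsilon(H)\|_{I_2} \le \sum_\alpha\|p^\alpha\chi (H_0 - z)^{-1}\eta_\varepsilon(H)\|_{I_2} + \sum_\alpha\|p^\alpha\chi (H - z)^{-1}T (H_0 - z)^{-1}\eta_\varepsilon(H)\|_{I_2} \]
by the second resolvent formula, where $\alpha$ runs over $|\alpha| \le 2(-l + m + 1 - j)_+$. The first term in the last line can be estimated as in the proof of (4.7), and we have
\[ \gamma^{1/2}\|p^\alpha\chi (H_0 - \lambda - i\gamma)^{-1}\eta_\varepsilon(H)\|_{I_2} \le C\gamma^{1/2} + C e^{-d/\hbar} \]
for each $\alpha$. In order to estimate the last term, we compute
\[ \|p^\alpha\chi (H - z)^{-1}T (H_0 - z)^{-1}\eta_\varepsilon(H)\|_{I_2} \le \bigl\|p^\alpha\chi (H - z)^{-1}\langle x\rangle^{-\nu}(H_0+M)^{-l'}\bigr\| \cdot \bigl\|(H_0+M)^{l'}\langle x\rangle^{\nu}\, T (H_0 - z)^{-1}\eta_\varepsilon(H)\bigr\|_{I_2}. \]
The first component is bounded by Lemma 4.5 with $l' \ge (-l + m + 1 - j)_+$. The second component is estimated as in the proof of (4.7) again, and we have
\[ \gamma^{1/2}\bigl\|(H_0+M)^{l'}\langle x\rangle^{\nu}\, T (H_0 - \lambda - i\gamma)^{-1}\eta_\varepsilon(H)\bigr\|_{I_2} \le C\gamma^{1/2} + C e^{-d/\hbar}. \]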


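Combining these, we obtain
\[ \gamma^{1/2}\bigl\|(H_0+M)^{-l+m+1-j}\chi (H - \lambda - i\gamma)^{-1}(H+M)^{-m-1+j}\bigr\|_{I_2} \le C\gamma^{1/2} + C e^{-d/\hbar}. \tag{4.8} \]
Now using (4.6), (4.7) and (4.8), we conclude
\[ \gamma\,\|B_{j11}(\lambda + i\gamma)\|_{I_1} \le C\gamma + C e^{-2d/\hbar}. \tag{4.9} \]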
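For the reader's convenience, the step from (4.6), (4.7) and (4.8) to (4.9) can be spelled out as follows. This is a sketch in which $C$ denotes a generic constant that may change from one occurrence to the next, and $N_1$, $N_2$ are shorthand (introduced here, not in the paper) for the two $I_2$-norms on the right hand side of (4.6) at $z = \lambda + i\gamma$:
\[ \gamma\,\|B_{j11}(\lambda + i\gamma)\|_{I_1} \le C\,(\gamma^{1/2}N_1)(\gamma^{1/2}N_2) \le C\bigl(\gamma^{1/2} + e^{-d/\hbar}\bigr)^2 \le 2C\bigl(\gamma + e^{-2d/\hbar}\bigr), \]
where the last inequality uses $(a + b)^2 \le 2a^2 + 2b^2$.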

The first assertion of Proposition 3.3 follows immediately from this estimate. If we take the limit $\gamma \to 0$ in (4.9), we then obtain
\[ \limsup_{\gamma\to 0}\,\gamma\,\|B_{j11}(\lambda + i\gamma)\|_{I_1} \le C e^{-2d/\hbar} \]
for $0 < d < d^{E_0}(G_e(E_0), D)$. This implies the last assertion of Proposition 3.3. □

5. Proof of Theorem 1.2

We fix $\lambda \in [E_0 - \varepsilon, E_0 + \varepsilon]$ with sufficiently small $\varepsilon > 0$, and we write
\[ \Lambda = \bigl\{\lambda_j \in \sigma(H^i) \bigm| \lambda - e^{-\beta/\hbar} \le \lambda_j \le \lambda + e^{-\beta/\hbar}\bigr\}. \]
We let $\psi_j$ be the normalized eigenfunction of $H^i$ corresponding to $\lambda_j$, as in the last section. We also write
\[ P\varphi := \sum_{\lambda_j \in \Lambda}\langle\varphi, \psi_j\rangle\psi_j, \qquad \varphi \in L^2(\mathbb{R}^n). \]
We define $H_0^\pm$ and $H_\pm$ on $L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)$ by
\[ H_0^\pm = H^e \oplus (H^i \pm 2e^{-\beta/\hbar}P), \qquad H_\pm = H \pm 2e^{-\beta/\hbar}J(0 \oplus P)J^{-1}. \]
Then it is easy to see $H_\pm - J H_0^\pm J^{-1} = T$, and $H_0^\pm$ satisfies the same properties as $H_0$. In particular, it satisfies Lemmas 2.4 – 2.6. By the definition of $P$, we have
\[ \sigma(H^i \pm 2e^{-\beta/\hbar}P) \cap (\lambda - e^{-\beta/\hbar}, \lambda + e^{-\beta/\hbar}) = \emptyset. \]
Hence, by mimicking the proof of Theorem 1.1, we have
\[ |\xi(\lambda; H_\pm, H_0^\pm)| \le C e^{-\alpha/\hbar} \]
with any $\alpha < 2d^{E_0}(G_e(E_0), D)$. We also have
\[ \xi(\lambda; H_0^\pm, H_0 \oplus H^c) = \xi(\lambda; H^e, H_0) + \xi(\lambda; H^i \pm 2e^{-\beta/\hbar}P, H^c) = \xi_e(\lambda) - N^\pm(\lambda), \]


where $N^\pm(\lambda) = \sharp\{\text{eigenvalues of } H^i \pm 2e^{-\beta/\hbar}P\} = N(\lambda \mp e^{-\beta/\hbar})$ (cf. (3.1)). Combining these, we obtain
\[ \bigl|\xi(\lambda; H_\pm, H_0 \oplus H^c) - \bigl(\xi_e(\lambda) - N(\lambda \mp e^{-\beta/\hbar})\bigr)\bigr| \le C e^{-\alpha/\hbar}. \]
On the other hand, since $\pm(H_\pm - H)$ is nonnegative, we can apply the monotonicity theorem of the SSF (Theorem 6.6 of [1]), and we have
\[ \pm\,\xi(\lambda; H_\pm, H) \ge 0, \quad\text{or equivalently}\quad \pm\,\xi(\lambda; H, H_\pm) \le 0. \]
We now use the chain rule of the SSF again to learn
\[ \xi(\lambda; H, H_0) = \xi(\lambda; H, H_0 \oplus H^c) \le \xi(\lambda; H_+, H_0 \oplus H^c) \le \xi_e(\lambda) - N(\lambda - e^{-\beta/\hbar}) + C e^{-\alpha/\hbar}. \]
Similarly, we have
\[ \xi(\lambda; H, H_0) \ge \xi(\lambda; H_-, H_0 \oplus H^c) \ge \xi_e(\lambda) - N(\lambda + e^{-\beta/\hbar}) - C e^{-\alpha/\hbar}. \]
These complete the proof of Theorem 1.2. □

Acknowledgement. A part of this work was done when the author was visiting the Erwin Schrödinger Institute for Mathematical Physics, Vienna, in June 1998, and he wishes to thank the institute for the kind invitation and the hospitality.

References

1. Birman, M. Sh., Yafaev, D. R.: The spectral shift function. The work of M. G. Krein and its further development. St. Petersburg Math. J. 4, 833–870 (1993)
2. Briet, Ph., Combes, J. M., Duclos, P.: Spectral stability under tunneling. Commun. Math. Phys. 126, 133–156 (1989)
3. Gerard, C., Martinez, A., Robert, D.: Breit-Wigner formulas for the scattering phase and the total scattering cross-section in the semi-classical limit. Commun. Math. Phys. 121, 323–336 (1989)
4. Hörmander, L.: The Analysis of Partial Differential Operators. Vol. 3, Berlin–Heidelberg–New York: Springer Verlag, 1985
5. Lavine, R.: Classical limit of the number of quantum states. In: Quantum Mechanics in Mathematics, Chemistry and Physics. K. E. Gustafson, W. P. Reinhardt eds., New York: Plenum, 1981
6. Majda, A., Ralston, J.: An analogue of Weyl's theorem for unbounded domains. I, II and III. Duke Math. J. 45, 183–196 (1978); 45, 513–536 (1978); 46, 725–731 (1979)
7. Melrose, R.: Weyl asymptotics for the phase in obstacle scattering. Comm. P. D. E. 13, 1431–1439 (1988)
8. Nakamura, S.: Scattering theory for the shape resonance model I. Non-resonant energies; II. Resonance scattering. Ann. Inst. H. Poincaré (Phys. Théo.) 50, 115–131 (1989); 50, 133–142 (1989)
9. Nakamura, S.: Agmon-type exponential decay estimates for pseudodifferential operators. J. Math. Sci. Univ. Tokyo 5, 693–712 (1998)
10. Pushnitski, A. B.: Representation for the spectral shift function for perturbations of a definite sign. Preprint. To appear in St. Petersburg Math. J.
11. Pushnitski, A. B.: Integral estimates for the spectral shift function. Preprint
12. Pushnitski, A. B.: Spectral shift function of the Schrödinger operator in the large coupling constant limit. Preprint
13. Robert, D.: Autour de l'approximation semiclassique. Basel–Boston: Birkhäuser, 1983
14. Robert, D.: On the Weyl formula for obstacles. Partial differential equations and mathematical physics (Copenhagen, 1995; Lund, 1995), Progr. Nonlinear Differential Equations Appl. 21, Boston, MA: Birkhäuser, 1996, pp. 264–285
15. Robert, D.: Semiclassical asymptotics for the spectral shift function. Differential Operators and Spectral Theory, V. Buslaev, M. Solomyak, D. Yafaev eds., Amer. Math. Soc. Transl. (Ser. 2) 189, 187–203 (1999)
16. Robert, D., Tamura, H.: Semi-classical bounds for resolvents of Schrödinger operators and asymptotics for scattering phases. Comm. P. D. E. 9, 1017–1058 (1984)
17. Robert, D., Tamura, H.: Semi-classical asymptotics for local spectral densities and time delay problems in scattering processes. J. Funct. Anal. 80, 124–147 (1988)
18. Robert, D., Tamura, H.: Asymptotic behavior of scattering amplitudes in semi-classical and low energy limits. Ann. Inst. Fourier (Grenoble) 39, 155–192 (1989)
19. Sobolev, A. V.: Effective bounds for the spectral shift function. Ann. Inst. H. Poincaré (Phys. Théo.) 58, 55–83 (1993)
20. Yafaev, D. R.: Mathematical Scattering Theory. Providence, RI: American Math. Soc., 1992

Communicated by B. Simon

Commun. Math. Phys. 208, 195 – 223 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Yangian of the Queer Lie Superalgebra

Maxim Nazarov

Department of Mathematics, University of York, York YO1 5DD, England. E-mail: [email protected]

Received: 18 March 1999 / Accepted: 1 June 1999

Abstract: Consider the complex matrix Lie superalgebra $\mathfrak{gl}_{N|N}$ with the standard generators $E_{ij}$, where $i, j = \pm 1, \ldots, \pm N$. Define an involutory automorphism $\eta$ of $\mathfrak{gl}_{N|N}$ by $\eta(E_{ij}) = E_{-i,-j}$. The twisted polynomial current Lie superalgebra
\[ \mathfrak{g} = \{X(u) \in \mathfrak{gl}_{N|N}[u] : \eta(X(u)) = X(-u)\} \]
has a natural Lie co-superalgebra structure. We quantise the universal enveloping algebra $\mathrm{U}(\mathfrak{g})$ as a co-Poisson Hopf superalgebra. For the quantised algebra we give a description of the centre, and construct the double in the sense of Drinfeld. We also construct a wide class of irreducible representations of the quantised algebra.

1. Introduction

In this article we will work with certain Lie superalgebras [K] over the complex field $\mathbb{C}$. Their universal enveloping algebras are $\mathbb{Z}_2$-graded associative unital algebras, and we will always keep to the following convention. Let $A$ and $B$ be any two associative complex $\mathbb{Z}_2$-graded algebras. Their tensor product $A \otimes B$ will be a $\mathbb{Z}_2$-graded algebra such that for any homogeneous $X, X' \in A$ and $Y, Y' \in B$,
\[ (X \otimes Y)(X' \otimes Y') = XX' \otimes YY' \cdot (-1)^{\deg X' \deg Y}, \qquad \deg(X \otimes Y) = \deg X + \deg Y. \]
Throughout this article we will denote by $\theta$ the isomorphism $A \otimes B \to B \otimes A$ defined by $X \otimes Y \mapsto Y \otimes X \cdot (-1)^{\deg X \deg Y}$. If the algebra $A$ is unital denote by $\iota_p$ its embedding into the tensor product $A^{\otimes n}$ as the $p$-th tensor factor:
\[ \iota_p(X) = 1^{\otimes(p-1)} \otimes X \otimes 1^{\otimes(n-p)}, \qquad 1 \le p \le n. \]
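The sign convention for graded tensor products is easy to get wrong, so a small check may help. The sketch below tracks only the $\mathbb{Z}_2$-degrees of homogeneous elements and verifies, over all sixteen parity combinations, that the flip $\theta$ is compatible with multiplication, that is, $\theta\bigl((X \otimes Y)(X' \otimes Y')\bigr) = \theta(X \otimes Y)\,\theta(X' \otimes Y')$. The function names are hypothetical and the check is an illustration rather than part of the paper.

```python
from itertools import product


def tensor_product_sign(deg_second_of_first, deg_first_of_second):
    """Sign in (X⊗Y)(X'⊗Y') = ± XX'⊗YY': the factor Y is moved past X'."""
    return (-1) ** (deg_first_of_second * deg_second_of_first)


def theta_sign(deg_x, deg_y):
    """Sign in theta(X⊗Y) = ± Y⊗X."""
    return (-1) ** (deg_x * deg_y)


def theta_is_multiplicative():
    """Compare the coefficients of YY'⊗XX' on both sides for all parities."""
    for x, y, x2, y2 in product((0, 1), repeat=4):
        # theta applied to the product (X⊗Y)(X'⊗Y') = sign * XX'⊗YY'
        lhs = tensor_product_sign(y, x2) * theta_sign((x + x2) % 2, (y + y2) % 2)
        # product of theta(X⊗Y) = ± Y⊗X and theta(X'⊗Y') = ± Y'⊗X' in B⊗A
        rhs = theta_sign(x, y) * theta_sign(x2, y2) * tensor_product_sign(x, y2)
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print(theta_is_multiplicative())
```

Only the parities enter the signs, which is why it suffices to loop over degrees rather than over actual algebra elements.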


We will also use various embeddings of the algebra $A^{\otimes m}$ into $A^{\otimes n}$ for any $m \le n$. For any choice of pairwise distinct indices $p_1, \ldots, p_m \in \{1, \ldots, n\}$ and an element $X \in A^{\otimes m}$ of the form $X = X^{(1)} \otimes \ldots \otimes X^{(m)}$ we will denote
\[ X_{p_1 \ldots p_m} = \iota_{p_1}\bigl(X^{(1)}\bigr) \ldots \iota_{p_m}\bigl(X^{(m)}\bigr) \in A^{\otimes n}. \]
Let $\mathfrak{a}$ be an arbitrary finite-dimensional Lie superalgebra. Then consider the polynomial current Lie superalgebra $\mathfrak{a}[u]$. It consists of the polynomial functions of a complex variable $u$ valued in $\mathfrak{a}$. For any two such functions their supercommutator in $\mathfrak{a}[u]$ is determined pointwise. Let $K \in \mathfrak{a}^{\otimes 2}$ be an $\mathfrak{a}$-invariant element: we have the equality $[X_1 + X_2, K] = 0$ in $\mathfrak{a}^{\otimes 2}$ for any $X \in \mathfrak{a}$. Here $\mathfrak{a}$ is regarded as a subspace in the enveloping algebra $\mathrm{U}(\mathfrak{a})$ and the square brackets stand for the supercommutator. Also suppose that $K$ is of $\mathbb{Z}_2$-degree zero and symmetric: $K_{12} = K_{21}$. Then the rational function
\[ r(u, v) = \frac{K}{u - v} \tag{1.1} \]
of two complex variables $u, v$ satisfies the classical Yang–Baxter equation for $\mathfrak{a}[u]$:
\[ [r_{12}(u, v), r_{13}(u, w)] + [r_{12}(u, v), r_{23}(v, w)] + [r_{13}(u, w), r_{23}(v, w)] = 0 \tag{1.2} \]
in $\mathfrak{a}^{\otimes 3}$. This can be verified by a short calculation. Furthermore, the function (1.1) is antisymmetric: $r_{12}(u, v) + r_{21}(v, u) = 0$. Therefore the co-supercommutator $\varphi : \mathfrak{a}[u] \to \mathfrak{a}[u]^{\otimes 2} = \mathfrak{a}^{\otimes 2}[u, v]$ can be defined by
\[ \varphi\bigl(X(u)\bigr) = [X_1(u) + X_2(v), r(u, v)]. \tag{1.3} \]
This definition makes $\mathfrak{a}[u]$ into a Lie bi-superalgebra. In particular, if $\mathfrak{a}$ is a simple Lie algebra and $K$ is the Casimir element, one gets a natural Lie bialgebra structure on $\mathfrak{a}[u]$. It gives rise to a natural co-Poisson structure on the universal enveloping algebra $\mathrm{U}(\mathfrak{a}[u])$, which is a co-commutative Hopf algebra by definition. The more general case of a simple Lie superalgebra $\mathfrak{a}$ was considered in [LS].

Now consider the queer Lie superalgebra $\mathfrak{q}_N$. This is the most interesting super-analogue of the general linear Lie algebra $\mathfrak{gl}_N$, see for instance [S2]. We will realise $\mathfrak{q}_N$ as a subalgebra in the general linear Lie superalgebra $\mathfrak{gl}_{N|N}$. Let the indices $i, j$ run through $\pm 1, \ldots, \pm N$. We will always write $\bar\imath = 0$ if $i > 0$ and $\bar\imath = 1$ if $i < 0$. Consider the $\mathbb{Z}_2$-graded vector space $\mathbb{C}^{N|N}$. Let $e_i \in \mathbb{C}^{N|N}$ be an element of the standard basis. The $\mathbb{Z}_2$-gradation on $\mathbb{C}^{N|N}$ is defined so that $\deg e_i = \bar\imath$. Let $E_{ij} \in \operatorname{End}(\mathbb{C}^{N|N})$ be the standard matrix units. The algebra $\operatorname{End}(\mathbb{C}^{N|N})$ is $\mathbb{Z}_2$-graded so that $\deg E_{ij} = \bar\imath + \bar\jmath$. We will also regard $E_{ij}$ as generators of the complex Lie superalgebra $\mathfrak{gl}_{N|N}$. The queer classical Lie superalgebra $\mathfrak{q}_N$ is the fixed point subalgebra in $\mathfrak{gl}_{N|N}$ with respect to the involutive automorphism
\[ \eta : E_{ij} \mapsto E_{-i,-j}. \tag{1.4} \]
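The description of $\mathfrak{q}_N$ as the fixed points of $\eta$ can be made concrete with explicit matrices. The sketch below is an illustration under stated assumptions: the basis of $\mathbb{C}^{N|N}$ is ordered as $1, \ldots, N, -1, \ldots, -N$, matrices are plain nested lists, and all function names are hypothetical. It builds the elements $E_{ij} + E_{-i,-j}$, which are fixed by $\eta$, and lets one check that there are $2N^2$ of them and that $\eta$ is an involution compatible with matrix multiplication.

```python
def indices(n):
    """Index set 1, ..., N, -1, ..., -N labelling the basis of C^{N|N}."""
    return list(range(1, n + 1)) + [-i for i in range(1, n + 1)]


def unit(n, i, j):
    """Matrix unit E_ij, with rows and columns ordered as in indices(n)."""
    idx = indices(n)
    return [[1 if (r, c) == (i, j) else 0 for c in idx] for r in idx]


def eta(n, x):
    """Linear extension of E_ij -> E_{-i,-j}: eta(X)_{rc} = X_{-r,-c}."""
    idx = indices(n)
    pos = {k: p for p, k in enumerate(idx)}
    return [[x[pos[-r]][pos[-c]] for c in idx] for r in idx]


def add(x, y):
    """Entrywise sum of two square matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def matmul(x, y):
    """Ordinary matrix product of two square matrices of equal size."""
    size = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def fixed_point_basis(n):
    """The elements E_ij + E_{-i,-j} with i > 0, one per eta-orbit."""
    return [add(unit(n, i, j), unit(n, -i, -j))
            for i in range(1, n + 1) for j in indices(n)]


if __name__ == "__main__":
    basis = fixed_point_basis(2)
    print(len(basis), all(eta(2, x) == x for x in basis))
```

Since $\eta$ has no fixed matrix units, the $4N^2$ units split into $2N^2$ orbits of size two, which is why the list above has $2N^2$ entries.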

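The queerness of $\mathfrak{q}_N$ reveals itself in that all the symmetric $\mathfrak{q}_N$-invariants in $\mathfrak{q}_N^{\otimes 2}$ of $\mathbb{Z}_2$-degree zero are trivial: for $\mathfrak{a} = \mathfrak{q}_N$ we always have $K \in \mathbb{C}\cdot E^{\otimes 2}$, where
\[ E = E_{11} + E_{-1,-1} + \ldots + E_{NN} + E_{-N,-N}. \tag{1.5} \]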


Hence in this case the co-supercommutator (1.3) vanishes and there is no natural Lie bi-superalgebra structure on $\mathfrak{q}_N[u]$. However such a structure can be defined, in compensation, on the twisted polynomial current Lie superalgebra
\[ \mathfrak{g} = \bigl\{X(u) \in \mathfrak{gl}_{N|N}[u] : \eta(X(u)) = X(-u)\bigr\}. \tag{1.6} \]
Our definition is based on the following general scheme [A1,A2,FR]. Let $\mathfrak{a}$, $K$ be arbitrary as above and $\omega$ be an automorphism of the Lie superalgebra $\mathfrak{a}$ of finite order $n$. Let $\zeta$ be a primitive $n$th root of unity. Generalising (1.1) put
\[ r(u, v) = \sum_{m \in \mathbb{Z}_n}\frac{(\mathrm{id} \otimes \omega^m)(K)}{u - \zeta^m v}. \tag{1.7} \]

Proposition 1.1. Suppose that $\omega^{\otimes 2}(K) = \zeta K$. Then the function (1.7) is antisymmetric and obeys the classical Yang–Baxter equation (1.2).

Proof. The function $r(u, v)$ determined by (1.1) satisfies Eq. (1.2). Let us apply to the left-hand side of (1.2) with that $r(u, v)$ the operator $\mathrm{id} \otimes \omega^k \otimes \omega^l$ in $\mathfrak{a}^{\otimes 3}$ and substitute $\zeta^k v$, $\zeta^l w$ for $v$, $w$ respectively. Taking then the sum over $k, l \in \mathbb{Z}_n$ and using $\omega^{\otimes 2}(K) = \zeta K$, we will obtain the left-hand side of (1.2) with the function $r(u, v)$ determined by (1.7). For the latter function $r(u, v)$ we also have
\[ r_{21}(v, u) = \sum_{m \in \mathbb{Z}_n}\frac{(\omega^m \otimes \mathrm{id})(K_{21})}{v - \zeta^m u} = \sum_{m \in \mathbb{Z}_n}\frac{(\omega^m \otimes \mathrm{id})(K)}{v - \zeta^m u} = \sum_{m \in \mathbb{Z}_n}\frac{(\mathrm{id} \otimes \omega^{-m})(K)}{\zeta^{-m}v - u} = -r(u, v). \]
□
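As an illustration of Proposition 1.1, consider the simplest twisted case $n = 2$, $\zeta = -1$, so that $\omega$ is an involution with $\omega^{\otimes 2}(K) = -K$. This is a worked special case added for orientation; it uses only the symmetry $K_{21} = K$ and the hypothesis of the proposition. Formula (1.7) then reads
\[ r(u, v) = \frac{K}{u - v} + \frac{(\mathrm{id} \otimes \omega)(K)}{u + v}. \]
Since $(\omega \otimes \mathrm{id})(K) = (\mathrm{id} \otimes \omega)\bigl(\omega^{\otimes 2}(K)\bigr) = -(\mathrm{id} \otimes \omega)(K)$, where $\omega^{-1} = \omega$ was used, antisymmetry can be checked directly:
\[ r_{21}(v, u) = \frac{K}{v - u} + \frac{(\omega \otimes \mathrm{id})(K)}{v + u} = -\frac{K}{u - v} - \frac{(\mathrm{id} \otimes \omega)(K)}{u + v} = -r(u, v). \]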

Note that for a simple Lie algebra $\mathfrak{a}$ always $\omega^{\otimes 2}(K) = K$, and in compliance with [BD] this construction does not give any new solutions of (1.2). Let $\mathfrak{a}$ be the Lie superalgebra $\mathfrak{gl}_{N|N}$ and $\omega$ be the involutive automorphism (1.4). The element
\[ P = \sum_{ij} E_{ij} \otimes E_{ji} \cdot (-1)^{\bar\jmath} \tag{1.8} \]
of $\mathfrak{gl}_{N|N}^{\otimes 2}$ is symmetric and $\mathfrak{gl}_{N|N}$-invariant. Moreover, we have $\eta^{\otimes 2}(P) = -P$. Due to Proposition 1.1 by setting $K = P$ in (1.7) we get an antisymmetric solution of the Yang–Baxter equation (1.2). Therefore (1.3) defines a co-supercommutator $\varphi : \mathfrak{g} \to \mathfrak{g}^{\otimes 2}$. Thus we obtain a Lie bi-superalgebra structure on $\mathfrak{g}$.

For any simple finite-dimensional Lie algebra $\mathfrak{a}$, quantisation of the co-Poisson Hopf algebra $\mathrm{U}(\mathfrak{a}[u])$ was described in [D1]. The quantised Hopf algebra is denoted by $\mathrm{Y}(\mathfrak{a})$ and called the Yangian of the Lie algebra $\mathfrak{a}$. The algebra $\mathrm{Y}(\mathfrak{a})$ contains the universal enveloping algebra $\mathrm{U}(\mathfrak{a})$ as a subalgebra. However, the case $\mathfrak{a} = \mathfrak{sl}_N$ is exceptional since only for $\mathfrak{a} = \mathfrak{sl}_N$ there exists a homomorphism $\mathrm{Y}(\mathfrak{a}) \to \mathrm{U}(\mathfrak{a})$ identical on the subalgebra $\mathrm{U}(\mathfrak{a})$, see [D1, Theorem 9]. There is also a Hopf algebra $\mathrm{Y}(\mathfrak{gl}_N)$, which is a quantisation of the co-Poisson Hopf algebra $\mathrm{U}(\mathfrak{gl}_N[u])$. Again, the algebra $\mathrm{Y}(\mathfrak{gl}_N)$ contains the enveloping algebra $\mathrm{U}(\mathfrak{gl}_N)$ as a subalgebra, and admits a homomorphism $\mathrm{Y}(\mathfrak{gl}_N) \to \mathrm{U}(\mathfrak{gl}_N)$ identical on $\mathrm{U}(\mathfrak{gl}_N)$. Moreover, the algebra $\mathrm{Y}(\mathfrak{gl}_N)$ can be defined


entirely in terms of the classical representation theory [O1]. For further details on the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$ see [MNO] and references therein.

The main aim of this article is to define the Yangian of the Lie superalgebra $\mathfrak{q}_N$. It cannot be defined as a quantisation of the enveloping algebra $\mathrm{U}(\mathfrak{q}_N[u])$, because the latter Hopf superalgebra has no natural co-Poisson structure. Instead of $\mathfrak{q}_N[u]$ we will consider the twisted polynomial current Lie superalgebra $\mathfrak{g}$. In Sect. 2 we define a certain Hopf superalgebra $\mathrm{Y}(\mathfrak{q}_N, h)$ over the ring $\mathbb{C}[[h]]$ of the formal power series in $h$. The quotient $\mathrm{Y}(\mathfrak{q}_N, h)/h\,\mathrm{Y}(\mathfrak{q}_N, h)$ is isomorphic to $\mathrm{U}(\mathfrak{g})$ as a co-Poisson Hopf superalgebra. All specialisations of $\mathrm{Y}(\mathfrak{q}_N, h)$ at $h \in \mathbb{C} \setminus \{0\}$ are isomorphic to each other as Hopf superalgebras. The specialisation at $h = 1$ will be denoted by $\mathrm{Y}(\mathfrak{q}_N)$ and called the Yangian of the Lie superalgebra $\mathfrak{q}_N$. Similarly to the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$, the algebra $\mathrm{Y}(\mathfrak{q}_N)$ contains the enveloping algebra $\mathrm{U}(\mathfrak{q}_N)$ as a subalgebra, and admits a homomorphism $\mathrm{Y}(\mathfrak{q}_N) \to \mathrm{U}(\mathfrak{q}_N)$ identical on $\mathrm{U}(\mathfrak{q}_N)$. In Sect. 3 we describe the centre of the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$. In Sect. 4 we construct the double of this Yangian in the sense of [D3]. In Sect. 5 we study an analogue for $\mathrm{Y}(\mathfrak{q}_N)$ of the Drinfeld functor [D2] for the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$.

2. Definition of the Yangian

In this section we introduce the Yangian of the Lie superalgebra $\mathfrak{q}_N$. This is a complex associative unital $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$ with the countable set of generators $T_{ij}^{(s)}$, where $s = 1, 2, \ldots$ and $i, j = \pm 1, \ldots, \pm N$. The $\mathbb{Z}_2$-gradation on the algebra $\mathrm{Y}(\mathfrak{q}_N)$ is determined by setting $\deg T_{ij}^{(s)} = \bar\imath + \bar\jmath$ for $s \ge 1$. To write down defining relations for these generators we will employ the formal series
\[ T_{ij}(u) = \delta_{ij}\cdot 1 + T_{ij}^{(1)}u^{-1} + T_{ij}^{(2)}u^{-2} + \ldots \tag{2.1} \]
from $\mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$. Then for all possible indices $i, j$ and $k, l$ we have the relations
\[ (u^2 - v^2)\cdot[T_{ij}(u), T_{kl}(v)]\cdot(-1)^{\bar\imath\bar k + \bar\imath\bar l + \bar k\bar l} = (u + v)\cdot\bigl(T_{kj}(u)T_{il}(v) - T_{kj}(v)T_{il}(u)\bigr) - (u - v)\cdot\bigl(T_{-k,j}(u)T_{-i,l}(v) - T_{k,-j}(v)T_{i,-l}(u)\bigr)\cdot(-1)^{\bar k + \bar l} \tag{2.2} \]
in $\mathrm{Y}(\mathfrak{q}_N)((u^{-1}, v^{-1}))$. The square brackets here stand for the supercommutator. Moreover, for all possible indices $i, j$ we impose the relations
\[ T_{ij}(-u) = T_{-i,-j}(u). \tag{2.3} \]
We will also use the following matrix form of the relations (2.2). Regard $E_{ij}$ as elements of the algebra $\operatorname{End}(\mathbb{C}^{N|N})$. Combine all the series (2.1) into the single element
\[ T(u) = \sum_{ij} E_{ij} \otimes T_{ij}(u) \]
of the algebra $\operatorname{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$. For any positive integer $n$ and each $s = 1, \ldots, n$ we denote
\[ T_s(u) = (\iota_s \otimes \mathrm{id})\bigl(T(u)\bigr) \in \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]. \tag{2.4} \]


Regard (1.8) as an element of the algebra $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes 2}$. Consider the element
\[ J = \sum_i E_{i,-i}\cdot(-1)^{\bar\imath} \tag{2.5} \]
of the algebra $\operatorname{End}(\mathbb{C}^{N|N})$; it has $\mathbb{Z}_2$-degree one. Note that the supercommutant of this element in $\operatorname{End}(\mathbb{C}^{N|N})$ coincides with the image of the defining representation $\mathfrak{q}_N \to \operatorname{End}(\mathbb{C}^{N|N})$. Introduce the rational function of two complex variables $u, v$
\[ R(u, v) = 1 - \frac{P}{u - v} + \frac{P J_1 J_2}{u + v} = 1 - \sum_{ij} E_{ij} \otimes E_{ji}\cdot\frac{(-1)^{\bar\jmath}}{u - v} - \sum_{ij} E_{ij} \otimes E_{-j,-i}\cdot\frac{(-1)^{\bar\jmath}}{u + v} \tag{2.6} \]
valued in the algebra $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes 2}$. Then the relations (2.2) can be rewritten as
\[ \bigl(R(u, v) \otimes 1\bigr)\cdot T_1(u)T_2(v) = T_2(v)T_1(u)\cdot\bigl(R(u, v) \otimes 1\bigr). \tag{2.7} \]
Namely, after multiplying each side of (2.7) by $u^2 - v^2$ it becomes a relation in the algebra $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes 2} \otimes \mathrm{Y}(\mathfrak{q}_N)((u^{-1}, v^{-1}))$ equivalent to the collection of all relations (2.2). Also note that the function (2.6) satisfies the quantum Yang–Baxter equation for the algebra $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes 3}(u, v, w)$
\[ R_{12}(u, v)R_{13}(u, w)R_{23}(v, w) = R_{23}(v, w)R_{13}(u, w)R_{12}(u, v). \tag{2.8} \]
Furthermore, consider (1.4) as an automorphism of the algebra $\operatorname{End}(\mathbb{C}^{N|N})$. The collection of all relations (2.3) is equivalent to the single equation
\[ (\eta \otimes \mathrm{id})\bigl(T(u)\bigr) = T(-u). \tag{2.9} \]
Observe that by the definition (2.6) of $R(u, v)$ we also have in $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes 2}(u, v)$,
\[ (\eta \otimes \mathrm{id})\bigl(R(u, v)\bigr) = R(-u, v), \tag{2.10} \]
\[ (\mathrm{id} \otimes \eta)\bigl(R(u, v)\bigr) = R(u, -v). \tag{2.11} \]
We call the function (2.6) the rational R-matrix for the Lie superalgebra $\mathfrak{q}_N$.

For any $i, j$ put $F_{ij} = E_{ij} + E_{-i,-j}$. Then we have the equality $\eta(F_{ij}) = F_{ij}$ in $\operatorname{End}(\mathbb{C}^{N|N})$. We will also regard $F_{ij}$ as generators of the universal enveloping algebra $\mathrm{U}(\mathfrak{q}_N)$. Due to (2.2) there is a homomorphism
\[ \mathrm{Y}(\mathfrak{q}_N) \to \mathrm{U}(\mathfrak{q}_N) : T_{ij}(u) \mapsto \delta_{ij} - F_{ji}\,u^{-1}\cdot(-1)^{\bar\jmath}. \tag{2.12} \]
The relations (2.2),(2.3) imply that there is also a homomorphism
\[ \mathrm{U}(\mathfrak{q}_N) \to \mathrm{Y}(\mathfrak{q}_N) : F_{ji} \mapsto -T_{ij}^{(1)}\cdot(-1)^{\bar\jmath}. \tag{2.13} \]
Note that the composition of the homomorphisms (2.13) and (2.12) is the identical map $\mathrm{U}(\mathfrak{q}_N) \to \mathrm{U}(\mathfrak{q}_N)$. Hence (2.13) is an embedding of $\mathbb{Z}_2$-graded associative unital algebras. The homomorphism (2.12) is identical on the subalgebra $\mathrm{U}(\mathfrak{q}_N)$. It will be called the evaluation homomorphism for $\mathrm{Y}(\mathfrak{q}_N)$ and denoted by $\pi_N$.
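The statement about the composition of (2.13) and (2.12) reduces to a one-line check on generators; the following is a sketch added for illustration. Reading off the coefficient of $u^{-1}$ in (2.12) gives $T_{ij}^{(1)} \mapsto -F_{ji}\cdot(-1)^{\bar\jmath}$, and therefore
\[ F_{ji} \;\mapsto\; -T_{ij}^{(1)}\cdot(-1)^{\bar\jmath} \;\mapsto\; -\bigl(-F_{ji}\cdot(-1)^{\bar\jmath}\bigr)\cdot(-1)^{\bar\jmath} = F_{ji}. \]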


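The element $T(u)$ of the algebra $\operatorname{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$ is invertible; we put
\[ T(u)^{-1} = \sum_{i,j} E_{ij} \otimes \widetilde T_{ij}(u). \]
Then the relations (2.7),(2.9) along with the identity
\[ R(u, v)R(-u, -v) = 1 - \frac{1}{(u - v)^2} - \frac{1}{(u + v)^2} \tag{2.14} \]
imply that the assignment $T_{ij}(u) \mapsto \widetilde T_{ij}(-u)$ determines an automorphism of the algebra $\mathrm{Y}(\mathfrak{q}_N)$. This automorphism is evidently involutive.

We will use two different ascending $\mathbb{Z}$-filtrations on the algebra $\mathrm{Y}(\mathfrak{q}_N)$. They are obtained by assigning to the generator $T_{ij}^{(s)}$ the degree $s$ or $s - 1$ respectively. The corresponding $\mathbb{Z}$-graded algebras will be denoted by $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ and $\mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$. Let $G_{ij}^{(s)} \in \mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$ be the element corresponding to the generator $T_{ij}^{(s)} \in \mathrm{Y}(\mathfrak{q}_N)$. The algebra $\mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$ inherits $\mathbb{Z}_2$-gradation from $\mathrm{Y}(\mathfrak{q}_N)$ such that $\deg G_{ij}^{(s)} = \bar\imath + \bar\jmath$. Take the enveloping algebra $\mathrm{U}(\mathfrak{g})$ of the twisted current Lie superalgebra (1.6). The algebra $\mathrm{U}(\mathfrak{g})$ also has a natural $\mathbb{Z}_2$-gradation: the $\mathbb{Z}_2$-degree of the element
\[ F_{ij}^{(s)} = E_{ij}u^s + E_{-i,-j}(-u)^s \tag{2.15} \]
equals $\bar\imath + \bar\jmath$ for any $s \ge 0$. We have the following easy observation.

Proposition 2.1. The assignment for every $s \ge 0$,
\[ F_{ji}^{(s)} \mapsto -G_{ij}^{(s+1)}\cdot(-1)^{\bar\jmath} \tag{2.16} \]
determines a surjective homomorphism $\mathrm{U}(\mathfrak{g}) \to \mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$ of $\mathbb{Z}_2$-graded algebras.

Proof. The elements (2.15) generate the algebra $\mathrm{U}(\mathfrak{g})$. The defining relations for these generators can be written as
\[ [F_{ji}^{(s)}, F_{lk}^{(r)}] = \delta_{il}F_{jk}^{(s+r)} - \delta_{kj}F_{li}^{(s+r)}\cdot(-1)^{(\bar\imath + \bar\jmath)(\bar l + \bar k)} + \delta_{i,-l}F_{-j,k}^{(s+r)}\cdot(-1)^s - \delta_{-k,j}F_{l,-i}^{(s+r)}\cdot(-1)^{(\bar\imath + \bar\jmath)(\bar l + \bar k) + s} \tag{2.17} \]
for all $r, s \ge 0$ and
\[ F_{-j,-i}^{(s)} = (-1)^s\cdot F_{ji}^{(s)}. \tag{2.18} \]
On the other hand, by (2.2) we obtain the relations in the algebra $\mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$,
\[ (-1)^{\bar\imath\bar k + \bar\imath\bar l + \bar k\bar l}\cdot[G_{ij}^{(s)}, G_{kl}^{(r)}] = \delta_{kj}G_{il}^{(s+r-1)} - \delta_{il}G_{kj}^{(s+r-1)} + \delta_{k,-j}G_{-i,l}^{(s+r-1)}\cdot(-1)^{\bar k + \bar l + s} - \delta_{-i,l}G_{k,-j}^{(s+r-1)}\cdot(-1)^{\bar k + \bar l + s} \]
for $r, s \ge 1$. Due to (2.3),
\[ G_{-i,-j}^{(s)} = (-1)^s\cdot G_{ij}^{(s)}. \]
Comparison of these relations to (2.17) and (2.18) shows that (2.16) determines a homomorphism $\mathrm{U}(\mathfrak{g}) \to \mathrm{gr}\,\text{-}\mathrm{Y}(\mathfrak{q}_N)$. This homomorphism is surjective and preserves $\mathbb{Z}_2$-gradation by definition. □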
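Two elementary facts about the elements (2.15) are used implicitly above and can be checked directly; the computation below is a sketch added for illustration. First, $F_{ij}^{(s)}$ indeed lies in $\mathfrak{g}$, because applying $\eta$ from (1.4) gives
\[ \eta\bigl(E_{ij}u^s + E_{-i,-j}(-u)^s\bigr) = E_{-i,-j}u^s + E_{ij}(-u)^s = E_{ij}(-u)^s + E_{-i,-j}\bigl(-(-u)\bigr)^s, \]
which is the right hand side of (2.15) evaluated at $-u$. Second, the relation (2.18) follows from $(-1)^s(-u)^s = u^s$:
\[ F_{-i,-j}^{(s)} = E_{-i,-j}u^s + E_{ij}(-u)^s = (-1)^s\bigl(E_{ij}u^s + E_{-i,-j}(-u)^s\bigr) = (-1)^s F_{ij}^{(s)}. \]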


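There is a natural Hopf superalgebra structure on $\mathrm{Y}(\mathfrak{q}_N)$. Due to (2.7),(2.9) the comultiplication $\Delta : \mathrm{Y}(\mathfrak{q}_N) \to \mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}(\mathfrak{q}_N)$ can be defined by
\[ T_{ij}(u) \mapsto \sum_k T_{ik}(u) \otimes T_{kj}(u)\cdot(-1)^{(\bar\imath + \bar k)(\bar\jmath + \bar k)}, \tag{2.19} \]
where the tensor product is taken over the subalgebra $\mathbb{C}[[u^{-1}]]$ in $\mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$ and the index $k$ runs through $\pm 1, \ldots, \pm N$. The counit $\varepsilon : \mathrm{Y}(\mathfrak{q}_N) \to \mathbb{C}$ is defined so that $\varepsilon : T_{ij}^{(s)} \mapsto 0$ for every $s \ge 1$. Then the assignment $T_{ij}(u) \mapsto \widetilde T_{ij}(u)$ determines the antipodal map $S : \mathrm{Y}(\mathfrak{q}_N) \to \mathrm{Y}(\mathfrak{q}_N)$. It is an antiautomorphism of the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$. Note that $\mathrm{Y}(\mathfrak{q}_N)$ contains $\mathrm{U}(\mathfrak{q}_N)$ as a Hopf sub-superalgebra: by definitions (2.13) and (2.19) for any $F \in \mathfrak{q}_N$ we have
\[ \Delta(F) = F \otimes 1 + 1 \otimes F, \qquad \varepsilon(F) = 0, \qquad S(F) = -F. \]
The comultiplication (2.19) on the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$ allows us to define for any $n = 1, 2, \ldots$ a representation $\mathrm{Y}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$ depending on $n$ arbitrary complex parameters $z_1, \ldots, z_n$. Indeed, by comparing (2.8),(2.10) to (2.7),(2.9) respectively we obtain that for any $z \in \mathbb{C}$ the assignment
\[ \operatorname{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]] \to \operatorname{End}(\mathbb{C}^{N|N})^{\otimes 2}[[u^{-1}]] : T(u) \mapsto R(u, z) \tag{2.20} \]
determines a representation $\mathrm{Y}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})$. More explicitly, we have
\[ T_{ij}^{(s+1)} \mapsto -\bigl(E_{ji}z^s + E_{-j,-i}(-z)^s\bigr)\cdot(-1)^{\bar\jmath}, \qquad s \ge 0. \tag{2.21} \]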
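The explicit formula (2.21) can be recovered from (2.20) by expanding the R-matrix (2.6) in powers of $u^{-1}$; the following sketch is added for illustration. Using
\[ \frac{1}{u - z} = \sum_{s \ge 0} z^s u^{-s-1}, \qquad \frac{1}{u + z} = \sum_{s \ge 0} (-z)^s u^{-s-1}, \]
the definition (2.6) gives
\[ R(u, z) = 1 - \sum_{s \ge 0} u^{-s-1}\sum_{ij} E_{ij} \otimes \bigl(E_{ji}z^s + E_{-j,-i}(-z)^s\bigr)\cdot(-1)^{\bar\jmath}. \]
Comparing the coefficient of $E_{ij} \otimes u^{-s-1}$ with $T(u) = \sum_{ij} E_{ij} \otimes T_{ij}(u)$ and (2.1) yields (2.21).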

When $z = 0$ this representation of $\mathrm{Y}(\mathfrak{q}_N)$ can also be obtained from the standard representation $\mathrm{U}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})$ by virtue of the evaluation homomorphism (2.12). Now for any $z_1, \ldots, z_n \in \mathbb{C}$ take the tensor product of the representations (2.20) of the algebra $\mathrm{Y}(\mathfrak{q}_N)$ with $z = z_1, \ldots, z_n$. Due to (2.19) the respective homomorphism $\mathrm{Y}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$ is determined by the assignment
\[ \operatorname{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]] \to \operatorname{End}(\mathbb{C}^{N|N})^{\otimes(n+1)}[[u^{-1}]] : T(u) \mapsto R_{12}(u, z_1) \ldots R_{1,n+1}(u, z_n). \tag{2.22} \]

Proposition 2.2. Let the complex parameters $z_1, \ldots, z_n$ and positive integer $n$ vary. Then the kernels of all representations (2.22) of $\mathrm{Y}(\mathfrak{q}_N)$ have zero intersection.

Proof. Take any finite linear combination of the products $T_{i_1 j_1}^{(s_1)} \ldots T_{i_m j_m}^{(s_m)} \in \mathrm{Y}(\mathfrak{q}_N)$ with certain complex coefficients $A_{i_1 j_1 \ldots i_m j_m}^{(s_1 \ldots s_m)}$, where the indices $s_1, \ldots, s_m \ge 1$ and the number $m \ge 0$ may vary. Consider the image of this combination under the representation $\mathrm{Y}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$ determined by (2.22); it depends on $z_1, \ldots, z_n$ polynomially. Take the terms of this polynomial which have the maximal total degree in $z_1, \ldots, z_n$. Let $A$ be the sum of these terms and $d$ be their degree.

Consider the ascending $\mathbb{Z}$-filtration on the algebra $\mathrm{Y}(\mathfrak{q}_N)$, where the generator $T_{ij}^{(s)}$ with $s \ge 1$ has degree $s - 1$. Equip the tensor product $\mathrm{Y}(\mathfrak{q}_N)^{\otimes n}$ with the ascending $\mathbb{Z}$-filtration where the degree is the sum of the degrees on the tensor factors. Then by definition (2.19) under the comultiplication $\mathrm{Y}(\mathfrak{q}_N) \to \mathrm{Y}(\mathfrak{q}_N)^{\otimes n}$ we have
\[ T_{ij}^{(s)} \mapsto \sum_{1 \le r \le n} 1^{\otimes(r-1)} \otimes T_{ij}^{(s)} \otimes 1^{\otimes(n-r)} + \text{lower degree terms}, \qquad s \ge 1. \]


Therefore $A \in \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$ coincides with the image of the sum
\[ \sum_{s_1 + \ldots + s_m = d + m} A_{i_1 j_1 \ldots i_m j_m}^{(s_1 \ldots s_m)}\, F_{j_1 i_1}^{(s_1 - 1)} \ldots F_{j_m i_m}^{(s_m - 1)}\cdot(-1)^{m + \bar\jmath_1 + \ldots + \bar\jmath_m} \in \mathrm{U}(\mathfrak{g}) \]
under the tensor product of the evaluation representations
\[ \mathrm{U}(\mathfrak{g}) \to \operatorname{End}(\mathbb{C}^{N|N}) : F_{ij}^{(s)} \mapsto E_{ij}z^s + E_{-i,-j}(-z)^s, \qquad s \ge 0 \]
at the points $z = z_1, \ldots, z_n \in \mathbb{C}$; see definition (2.15) of the element $F_{ij}^{(s)} \in \mathfrak{g}$, and formula (2.21) for the representation $\mathrm{Y}(\mathfrak{q}_N) \to \operatorname{End}(\mathbb{C}^{N|N})$ corresponding to $z \in \mathbb{C}$. Due to Proposition 2.1 it now suffices to show that when $z_1, \ldots, z_n \in \mathbb{C}$ and the positive integer $n$ vary, the kernels of the tensor products of the evaluation representations of the algebra $\mathrm{U}(\mathfrak{g})$ at $z = z_1, \ldots, z_n \in \mathbb{C}$ have zero intersection. This will also imply that the homomorphism (2.16) is injective.

The algebra $\mathrm{U}(\mathfrak{g})$ is a subalgebra in the universal enveloping algebra of the Lie superalgebra $\mathfrak{gl}_{N|N}[u]$. We will show that the intersection of the kernels of all finite tensor products of evaluation representations $\mathrm{U}(\mathfrak{gl}_{N|N}[u]) \to \operatorname{End}(\mathbb{C}^{N|N})$ is zero. Denote by $\varpi_n$ the supersymmetrisation map in the tensor product $(\mathfrak{gl}_{N|N}[u])^{\otimes n}$ normalised so that $\varpi_n^2 = \varpi_n$. We will identify the vector space $(\mathfrak{gl}_{N|N}[u])^{\otimes n}$ with $\mathfrak{gl}_{N|N}^{\otimes n}[u_1, \ldots, u_n]$, where $u_1, \ldots, u_n$ are independent complex variables. The vector space $\mathfrak{gl}_{N|N}$ is identified with $\operatorname{End}(\mathbb{C}^{N|N})$.

Choose any linear basis $X_1, \ldots, X_{4N^2}$ in $\mathfrak{gl}_{N|N}$ such that $X_1 = E$ as in (1.5). The element $X_1 \in \mathfrak{gl}_{N|N}$ is then identified with the operator $1 \in \operatorname{End}(\mathbb{C}^{N|N})$. Take any finite non-zero linear combination of the elements
\[ (X_{a_1}u^{s_1}) \ldots (X_{a_m}u^{s_m}) \in \mathrm{U}(\mathfrak{gl}_{N|N}[u]), \tag{2.23} \]
where the indices $s_1, \ldots, s_m \ge 0$ and the number $m \ge 0$ may vary. We assume that for every fixed $m$ the elements
\[ \varpi_m\bigl(X_{a_1}u^{s_1} \otimes \ldots \otimes X_{a_m}u^{s_m}\bigr) \in (\mathfrak{gl}_{N|N}[u])^{\otimes m} = \operatorname{End}(\mathbb{C}^{N|N})^{\otimes m}[u_1, \ldots, u_m] \]
are linearly independent. Further, we will suppose that in every product (2.23) the indices $a_1, \ldots, a_p > 1$ for certain $p \le m$, while $a_{p+1} = \ldots = a_m = 1$. We will also suppose that $s_{p+1}, \ldots, s_q > 0$ for some $q \ge p$, while $s_{q+1} = \ldots = s_m = 0$.

For any $n \ge p$ consider the tensor product $\nu$ of the evaluation representations of the algebra $\mathrm{U}(\mathfrak{gl}_{N|N}[u])$ at $u_1, \ldots, u_n \in \mathbb{C}$. Let us denote by $\mathcal{P}$ the subspace in $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$ spanned by the vectors $X_{b_1} \otimes \ldots \otimes X_{b_n}$, where either the number of indices $b_r > 1$ is less than $p$, or $b_r = 1$ for at least one $r \le p$. The image of (2.23) under $\nu$ is a polynomial in $u_1, \ldots, u_n$ valued in $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$, of the form
\[ p!\,\varpi_p\bigl(X_{a_1}u^{s_1} \otimes \ldots \otimes X_{a_p}u^{s_p}\bigr) \otimes 1^{\otimes(n-p)}\cdot\prod_{p < r \le q}\bigl(u_1^{s_r} + \ldots + u_n^{s_r}\bigr)\cdot n^{m-q} \tag{2.24} \]
plus the terms valued in the subspace $\mathcal{P} \subset \operatorname{End}(\mathbb{C}^{N|N})^{\otimes n}$. Here the tensor factor $\varpi_p\bigl(X_{a_1}u^{s_1} \otimes \ldots \otimes X_{a_p}u^{s_p}\bigr)$ is regarded as an element of $\operatorname{End}(\mathbb{C}^{N|N})^{\otimes p}[u_1, \ldots, u_p]$ by identifying this algebra with $(\mathfrak{gl}_{N|N}[u])^{\otimes p}$.

The numbers $p$ for various products (2.23) from our linear combination may differ. Take those products (2.23) where the number $p$ is maximal. For any $n \ge p$ the images


of the remaining products under $\nu$ are polynomials in $u_1, \dots, u_n$ taking values in the subspace $P \subset \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$. But a non-zero linear combination of the polynomials (2.24) with the maximal $p$ cannot vanish identically for all $n \geqslant p$ by the Poincaré–Birkhoff–Witt theorem [MM, Theorem 5.15] for Lie superalgebras. □

In the course of the proof of Proposition 2.2 we established that the homomorphism (2.16) is injective. Together with Proposition 2.1, this yields the following result.

Theorem 2.3. The $\mathbb{Z}_2$-graded algebras $\mathrm{U}(\mathfrak{g})$ and gr -$\mathrm{Y}(\mathfrak{q}_N)$ are isomorphic via (2.16).

Let us now return to the first $\mathbb{Z}$-filtration on the algebra $\mathrm{Y}(\mathfrak{q}_N)$. Let $t_{ij}^{(s)}$ be the element of the algebra $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ corresponding to the generator $T_{ij}^{(s)} \in \mathrm{Y}(\mathfrak{q}_N)$. The algebra $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ inherits the $\mathbb{Z}_2$-gradation from $\mathrm{Y}(\mathfrak{q}_N)$ such that $\deg t_{ij}^{(s)} = \bar\imath + \bar\jmath$.

Corollary 2.4. The algebra $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ is supercommutative with free generators $t_{ij}^{(s)}$ and $t_{i,-j}^{(s)}$, where $s = 1, 2, \dots$ and $i, j = 1, \dots, N$.

Proof. The $\mathbb{Z}$-graded algebra $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ is supercommutative due to the relations (2.2). Moreover, by (2.3) for any $s \geqslant 1$ we have the relation $t_{-i,-j}^{(s)} = (-1)^s\, t_{ij}^{(s)}$. The supercommuting generators $t_{ij}^{(s)}$ with $i > 0$ are free due to Theorem 2.3. □

To finish this section let us show that the Hopf superalgebra $\mathrm{Y}(\mathfrak{q}_N)$ provides a quantisation of the co-Poisson Hopf superalgebra $\mathrm{U}(\mathfrak{g})$ in the sense of [D1]. Let $h$ be a formal parameter. Take the tensor product $\mathbb{C}[[h]] \otimes \mathrm{Y}(\mathfrak{q}_N)$, where $h$ has $\mathbb{Z}_2$-degree zero. Denote by $\mathrm{Y}(\mathfrak{q}_N, h)$ the unital subalgebra in this tensor product generated by all the elements $H_{ij}^{(s)} = T_{ij}^{(s)} h^{s-1}$ with $s \geqslant 1$. Due to Theorem 2.3 an isomorphism of $\mathbb{Z}_2$-graded algebras $\mathrm{Y}(\mathfrak{q}_N, h)/h\,\mathrm{Y}(\mathfrak{q}_N, h) \to \mathrm{U}(\mathfrak{g})$ can be defined by

$$H_{ij}^{(s)} + h\,\mathrm{Y}(\mathfrak{q}_N, h) \mapsto -F_{ji}^{(s-1)} \cdot (-1)^{\bar\jmath}, \tag{2.25}$$

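see (2.15). Let us extend the comultiplication $\Delta$ to $\mathrm{Y}(\mathfrak{q}_N, h)$ by $\mathbb{C}[[h]]$-linearity. Definition (2.19) implies that the assignment (2.25) defines an isomorphism of Hopf superalgebras. Let $\psi : \mathrm{Y}(\mathfrak{q}_N, h) \to \mathrm{U}(\mathfrak{g})$ be the composition of the projection $\mathrm{Y}(\mathfrak{q}_N, h) \to \mathrm{Y}(\mathfrak{q}_N, h)/h\,\mathrm{Y}(\mathfrak{q}_N, h)$ with the isomorphism (2.25). Now let us consider the co-supercommutator $\varphi : \mathfrak{g} \to \mathfrak{g}^{\otimes 2} \subset \mathfrak{gl}_{N|N}[u] \otimes \mathfrak{gl}_{N|N}[v]$ determined by (1.3), where according to (1.7) we put

$$r(u, v) = \sum_{i,j} E_{ij} \otimes E_{ji} \cdot \frac{(-1)^{\bar\jmath}}{u - v} + \sum_{i,j} E_{ij} \otimes E_{-j,-i} \cdot \frac{(-1)^{\bar\jmath}}{u + v}. \tag{2.26}$$

Extend $\varphi$ to the co-Poisson bracket $\mathrm{U}(\mathfrak{g}) \to \mathrm{U}(\mathfrak{g})^{\otimes 2}$. Denote this extension by the same letter $\varphi$. Further, denote by $\Delta^{\circ}$ the composition of the comultiplication $\Delta$ on $\mathrm{Y}(\mathfrak{q}_N, h)$ with the involutive automorphism $\theta$ of the algebra $\mathrm{Y}(\mathfrak{q}_N, h)^{\otimes 2}$, defined in the beginning of Sect. 1. To show that $\mathrm{Y}(\mathfrak{q}_N, h)$ is a quantisation of the co-Poisson Hopf superalgebra $\mathrm{U}(\mathfrak{g})$ it remains to prove the following proposition.

Proposition 2.5. For any element $X \in \mathrm{Y}(\mathfrak{q}_N, h)$ we have the equality

$$(\psi \otimes \psi)\bigl((\Delta(X) - \Delta^{\circ}(X))/h\bigr) = \varphi\bigl(\psi(X)\bigr). \tag{2.27}$$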

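Proof. It suffices to verify the equality (2.27) for the generators $H_{ij}^{(s)}$ of the algebra $\mathrm{Y}(\mathfrak{q}_N, h)$. By definitions (2.25) and (1.3), (2.26) for $s \geqslant 1$ we have in $\mathrm{U}(\mathfrak{g})^{\otimes 2}$,

$$\varphi\bigl(\psi(H_{ij}^{(s)})\bigr) = -\varphi\bigl(F_{ji}^{(s-1)}\bigr) \cdot (-1)^{\bar\jmath} = \sum_{k}\,\sum_{1\leqslant r\leqslant s-1} \Bigl( F_{ki}^{(r-1)} \otimes F_{jk}^{(s-r-1)} \cdot (-1)^{(\bar\imath+\bar k+1)(\bar\jmath+\bar k)} - F_{jk}^{(r-1)} \otimes F_{ki}^{(s-r-1)} \cdot (-1)^{\bar\jmath+\bar k} \Bigr).$$

On the other hand, by definition (2.19) for any $s \geqslant 1$ we have in $\mathrm{Y}(\mathfrak{q}_N, h)^{\otimes 2}$

$$\Delta\bigl(H_{ij}^{(s)}\bigr) = H_{ij}^{(s)} \otimes 1 + 1 \otimes H_{ij}^{(s)} + \sum_{k}\,\sum_{1\leqslant r\leqslant s-1} h \cdot H_{ik}^{(r)} \otimes H_{kj}^{(s-r)} \cdot (-1)^{(\bar\imath+\bar k)(\bar\jmath+\bar k)},$$

$$\Delta^{\circ}\bigl(H_{ij}^{(s)}\bigr) = H_{ij}^{(s)} \otimes 1 + 1 \otimes H_{ij}^{(s)} + \sum_{k}\,\sum_{1\leqslant r\leqslant s-1} h \cdot H_{kj}^{(r)} \otimes H_{ik}^{(s-r)}.$$

Thus using again definition (2.25) we get the equality (2.27) for $X = H_{ij}^{(s)}$. □

3. Centre of the Yangian

In this section we will give a description of the centre of the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$. By definition an element of $\mathrm{Y}(\mathfrak{q}_N)$ is central if it supercommutes with any element of $\mathrm{Y}(\mathfrak{q}_N)$. However, we will see that the centre of $\mathrm{Y}(\mathfrak{q}_N)$ consists of even elements only. We will use some arguments from [MNO, Prop. 2.12]. Let $\tau$ be the antiautomorphism of the $\mathbb{Z}_2$-graded algebra $\mathrm{End}(\mathbb{C}^{N|N})$ defined by the assignment $E_{ij} \mapsto E_{ji} \cdot (-1)^{\bar\imath(\bar\jmath+1)}$ for any $i$ and $j$. Introduce the element of the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}$,

$$Q = \mathrm{id} \otimes \tau\,(P) = \sum_{i,j} E_{ij} \otimes E_{ij} \cdot (-1)^{\bar\imath\bar\jmath}.$$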

Denote

$$\bar T(u) = \tau \otimes \mathrm{id}\,\bigl(\widetilde T(u)\bigr) \in \mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]].$$

The following construction of central elements in $\mathrm{Y}(\mathfrak{q}_N)$ goes back to [N1, Sect. 1].

Proposition 3.1. For a certain element $Z(u) \in \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$ we have the equality

$$(Q \otimes 1) \cdot T_1(u)\,\bar T_2(u) = Q \otimes Z(u) \tag{3.1}$$

in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2} \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$. The coefficients of the series $Z(u)$ are of $\mathbb{Z}_2$-degree zero and belong to the centre of the algebra $\mathrm{Y}(\mathfrak{q}_N)$.

Proof. Introduce the rational function $\bar R(u, v) = \mathrm{id} \otimes \tau\,\bigl(R(u, v)\bigr)$ valued in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}$. One can directly verify the identity $\bar R(u, v)\,\bar R(-u, -v) = 1$. By making use of this identity we derive from (2.7) the relation

$$\bigl(\bar R(-u, -v) \otimes 1\bigr) \cdot T_1(u)\,\bar T_2(v) = \bar T_2(v)\,T_1(u) \cdot \bigl(\bar R(-u, -v) \otimes 1\bigr). \tag{3.2}$$

Let us multiply each side of this relation by $u - v$ and then put $u = v$. We obtain

$$(Q \otimes 1) \cdot T_1(u)\,\bar T_2(u) = \bar T_2(u)\,T_1(u) \cdot (Q \otimes 1).$$

Since the image of the endomorphism $Q \in \mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}$ has dimension one, we get the first statement of Proposition 3.1. Since $Q$ has $\mathbb{Z}_2$-degree zero, the equality (3.1) shows that every coefficient of the series $Z(u)$ has $\mathbb{Z}_2$-degree zero in $\mathrm{Y}(\mathfrak{q}_N)$. Let us now work with the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 3} \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}, v^{-1}]]$. Using relations (2.7), (3.2) and definition (3.1) we get the equalities

$$\begin{aligned} \bigl(Q_{23}\,\bar R_{13}(-u, -v)\,R_{12}(u, v) \otimes 1\bigr) \cdot T_1(u)\,T_2(v)\,\bar T_3(v) &= (Q_{23} \otimes 1) \cdot T_2(v)\,\bar T_3(v)\,T_1(u) \cdot \bigl(\bar R_{13}(-u, -v)\,R_{12}(u, v) \otimes 1\bigr) \\ &= \bigl(Q_{23} \otimes Z(v)\bigr) \cdot T_1(u) \cdot \bigl(\bar R_{13}(-u, -v)\,R_{12}(u, v) \otimes 1\bigr). \end{aligned} \tag{3.3}$$

On the other hand, by (2.14) we have the identity in $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 3}(u, v)$,

$$R_{13}(-u, -v)\,P_{23}\,R_{12}(u, v) = P_{23} \cdot \Bigl(1 - \frac{1}{(u - v)^2} - \frac{1}{(u + v)^2}\Bigr).$$

So

$$Q_{23}\,\bar R_{13}(-u, -v)\,R_{12}(u, v) = Q_{23} \cdot \Bigl(1 - \frac{1}{(u - v)^2} - \frac{1}{(u + v)^2}\Bigr).$$

Due to the latter identity we obtain from (3.3) the equality

$$T_1(u) \cdot \bigl(Q_{23} \otimes Z(v)\bigr) = \bigl(Q_{23} \otimes Z(v)\bigr) \cdot T_1(u).$$

Hence every coefficient of the series $Z(v)$ commutes with any generator $T_{ij}^{(s)}$ of the algebra $\mathrm{Y}(\mathfrak{q}_N)$. □

Let us consider the square $S^2$ of the antipodal map. It is an automorphism of the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$. Here is an alternative definition of the series $Z(u)$.

Proposition 3.2. We have $S^2\bigl(T_{ij}(u)\bigr) = T_{ij}(u) \cdot Z^{-1}(u)$ for all indices $i$ and $j$.

Proof. Definition (3.1) is equivalent to the collection of relations in $\mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$,

$$\sum_i T_{ij}(u)\,\widetilde T_{ki}(u) = Z(u)\,\delta_{jk}. \tag{3.4}$$

On the other hand, by the definition of the antipodal map $S$ we have the relations

$$\sum_i T_{ki}(u)\,S\bigl(T_{ij}(u)\bigr) \cdot (-1)^{(\bar\imath+\bar k)(\bar\imath+\bar\jmath)} = \delta_{jk}. \tag{3.5}$$

By applying the antiautomorphism $S$ to each side of the latter equality we get

$$\sum_i S^2\bigl(T_{ij}(u)\bigr)\,\widetilde T_{ki}(u) = \delta_{jk}.$$

By comparing the last equality with (3.4) we prove Proposition 3.2. □

Corollary 3.3. We have the equalities of formal series in $u^{-1}$,

$$\Delta\bigl(Z(u)\bigr) = Z(u) \otimes Z(u), \quad \varepsilon\bigl(Z(u)\bigr) = 1, \quad S\bigl(Z(u)\bigr) = Z^{-1}(u).$$


Proof. Let $\theta$ be the involutive automorphism of the algebra $\mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}(\mathfrak{q}_N)$ as defined in the beginning of Sect. 1. Since $\Delta \circ S = \theta \circ (S \otimes S) \circ \Delta$ we get

$$\Delta\bigl(\widetilde T_{ij}(u)\bigr) = \sum_k \widetilde T_{kj}(u) \otimes \widetilde T_{ik}(u)$$

from definition (2.19). Now by using (2.19) again we obtain the first equality in Corollary 3.3 from (3.4). The second equality follows directly from (3.4). To obtain the third equality in Corollary 3.3 apply the antiautomorphism $S$ to each side of (3.4) and then use (3.5) along with Proposition 3.2. □

Observe that due to relations (2.3) we have $Z(-u) = Z(u)$. Thus

$$Z(u) = 1 + Z^{(2)} u^{-2} + Z^{(4)} u^{-4} + \dots$$

for certain central elements $Z^{(2)}, Z^{(4)}, \dots \in \mathrm{Y}(\mathfrak{q}_N)$. We have the following theorem.

Theorem 3.4. The elements $Z^{(2)}, Z^{(4)}, \dots$ are free generators of the centre of $\mathrm{Y}(\mathfrak{q}_N)$.

We will present the main steps of the proof as separate propositions. We will make use of the second ascending filtration on the algebra $\mathrm{Y}(\mathfrak{q}_N)$. Take the element $G_{ij}^{(s)}$ of the $\mathbb{Z}$-graded algebra gr -$\mathrm{Y}(\mathfrak{q}_N)$ corresponding to the generator $T_{ij}^{(s)} \in \mathrm{Y}(\mathfrak{q}_N)$. Denote

$$G^{(s)} = \sum_i G_{ii}^{(s)} \cdot (-1)^{\bar\imath}.$$

Note that by the relation (2.3) here $G^{(s)} = 0$ if the number $s$ is even. Theorem 2.3 provides an isomorphism between gr -$\mathrm{Y}(\mathfrak{q}_N)$ and the enveloping algebra $\mathrm{U}(\mathfrak{g})$ of the Lie superalgebra (1.6). In particular, the elements $G^{(1)}, G^{(3)}, \dots \in$ gr -$\mathrm{Y}(\mathfrak{q}_N)$ are algebraically independent.

Proposition 3.5. For any index $s = 2, 4, \dots$ the element of the algebra gr -$\mathrm{Y}(\mathfrak{q}_N)$ corresponding to $Z^{(s)} \in \mathrm{Y}(\mathfrak{q}_N)$ is $(s - 1) \cdot G^{(s-1)}$.

Proof. Amongst other relations the collection (3.2) contains the equality

$$\bigl[\,T_{ij}(u), \widetilde T_{ji}(v)\,\bigr] = \frac{(-1)^{\bar\imath}}{u - v} \sum_k \bigl( T_{kj}(u)\,\widetilde T_{jk}(v) - \widetilde T_{ki}(v)\,T_{ik}(u) \bigr)$$

for any indices $i$ and $j$. The square brackets here stand for the supercommutator. By performing summation in this equality over the index $i$ we get

$$\sum_i T_{ij}(u)\,\widetilde T_{ji}(v) = 1 - \sum_{i,k} \widetilde T_{ki}(v)\,T_{ik}(u) \cdot \frac{(-1)^{\bar\imath}}{u - v}.$$

By setting $u$ equal to $v$ in the latter equality we obtain due to (3.4) that

$$Z(v) = 1 - \sum_{i,k} \widetilde T_{ki}(v)\,\dot T_{ik}(v) \cdot (-1)^{\bar\imath}, \tag{3.6}$$

where

$$\dot T_{ik}(v) = -T_{ik}^{(1)} v^{-2} - 2\,T_{ik}^{(2)} v^{-3} - \dots$$

is the first derivative of the formal series $T_{ik}(v)$ with respect to the parameter $v$. By the definition of the second filtration on $\mathrm{Y}(\mathfrak{q}_N)$ the element of the $\mathbb{Z}$-graded algebra gr -$\mathrm{Y}(\mathfrak{q}_N)$ corresponding to the coefficient at $v^{-s}$ in the expansion of the right-hand side of (3.6) is $(s - 1) \cdot G^{(s-1)}$ whenever $s > 1$. □
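The coefficient bookkeeping behind the derivative $\dot T_{ik}(v)$ can be checked mechanically. The following is only an illustrative sketch: it stores a formal series $\sum_s a_s v^{-s}$ as a plain list of scalar coefficients (the names `derivative` and `coeffs` are hypothetical and do not come from the paper) and confirms that differentiation in $v$ sends the coefficient at $v^{-s}$ to $-(s-1)\,a_{s-1}$, which is where the factor $s - 1$ in Proposition 3.5 comes from.

```python
from fractions import Fraction


def derivative(coeffs):
    """Differentiate sum_s coeffs[s] * v**(-s) with respect to v.

    Returns a list d where d[s] is the coefficient of v**(-s)
    in the derivative of the series.
    """
    d = [Fraction(0)] * (len(coeffs) + 1)
    for s, a in enumerate(coeffs):
        # d/dv of a * v**(-s) equals -s * a * v**(-s - 1)
        d[s + 1] = -s * Fraction(a)
    return d


# Placeholder scalars standing in for 1, T^(1), T^(2), T^(3).
coeffs = [1, 5, 7, 11]
d = derivative(coeffs)

# The coefficient at v**(-s) is -(s - 1) times the original coefficient at v**(-(s - 1)).
for s in range(1, len(d)):
    assert d[s] == -(s - 1) * coeffs[s - 1]
```

With scalars in place of the generators, the loop at the end reproduces the pattern $-T^{(1)} v^{-2} - 2\,T^{(2)} v^{-3} - \dots$ term by term.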


To prove Theorem 3.4 it suffices to show that the elements $G^{(1)}, G^{(3)}, \dots$ generate the centre of gr -$\mathrm{Y}(\mathfrak{q}_N)$. By Theorem 2.3 this means that for the element $E \in \mathfrak{q}_N$ defined by (1.5), the elements $E, Eu^2, Eu^4, \dots \in \mathfrak{g}$ generate the centre of the $\mathbb{Z}_2$-graded algebra $\mathrm{U}(\mathfrak{g})$. To prove the latter statement we will consider the following general situation. Let $\mathfrak{b}$ be an arbitrary finite-dimensional Lie superalgebra. Let $\omega$ be any involutive automorphism of $\mathfrak{b}$. Consider the corresponding twisted polynomial current Lie superalgebra

$$\mathfrak{h} = \{\, X(u) \in \mathfrak{b}[u] : \omega\bigl(X(u)\bigr) = X(-u) \,\}.$$

Proposition 3.6. Suppose that the centre of the Lie superalgebra $\mathfrak{b}$ is trivial. Then the centre of the universal enveloping algebra $\mathrm{U}(\mathfrak{h})$ is also trivial.

Proof. We will prove that the adjoint action of $\mathfrak{h}$ in the supercommutative algebra $\mathrm{S}(\mathfrak{h})$ has only trivial invariant elements. Choose a homogeneous basis $X_1, \dots, X_n$ in $\mathfrak{b}$ and let

$$[X_p, X_q] = \sum_r c_{pq}^{\,r}\, X_r,$$

where $c_{pq}^{\,r} \in \mathbb{C}$ is a structure constant of $\mathfrak{b}$. We put $\bar r = 0$ if the element $X_r \in \mathfrak{b}$ is even and $\bar r = 1$ if this element is odd. Assume that for some $h \leqslant n$ we have $\omega(X_r) = X_r$ when $1 \leqslant r \leqslant h$ and $\omega(X_r) = -X_r$ when $h < r \leqslant n$. The elements $X_r u^s$, where $1 \leqslant r \leqslant h$ when $s = 0, 2, \dots$ and $h < r \leqslant n$ when $s = 1, 3, \dots$, form a basis in the Lie superalgebra $\mathfrak{h}$. Let us order the set of the pairs $(s, r)$ here lexicographically:

$$(0, 1) \prec \dots \prec (0, h) \prec (1, h + 1) \prec \dots \prec (1, n) \prec \dots .$$

A basis in the supercommutative algebra $\mathrm{S}(\mathfrak{h})$ is then formed by all finite ordered products of the elements $(X_r u^s)^d$ over the set of pairs $(s, r)$, where $d = 0, 1, 2, \dots$ when $\bar r = 0$ but $d = 0, 1$ when $\bar r = 1$.

Let us now fix any $\mathfrak{h}$-invariant element $Y \in \mathrm{S}(\mathfrak{h})$. Let $m$ be the maximal integer such that $X_r u^m$ occurs in $Y$ for some index $r$. Suppose that $m$ is even. Then the element $Y$ is a finite sum

$$Y = \sum_{d_1 \dots d_h} Y_{d_1 \dots d_h} \cdot (X_1 u^m)^{d_1} \dots (X_h u^m)^{d_h},$$

where any factor $Y_{d_1 \dots d_h} \in \mathrm{S}(\mathfrak{h})$ depends only on elements $X_r u^s \in \mathfrak{h}$ with $s < m$. This factor is zero if $d_p > 1$ for some index $p \leqslant h$ with $\bar p = 1$. By our assumption

$$\mathrm{ad}(X_q u) \cdot Y = 0; \quad q = h + 1, \dots, n. \tag{3.7}$$

The minimal component of the left-hand side of (3.7) that depends on elements $X_r u^{m+1} \in \mathfrak{h}$ is the sum over $d_1, \dots, d_h$ of the products in $\mathrm{S}(\mathfrak{h})$,

$$\sum_{p \leqslant h} Y_{d_1 \dots d_h}\, (X_1 u^m)^{d_1} \dots (X_p u^m)^{d_p - 1} \dots (X_h u^m)^{d_h} \times \sum_{h < r \leqslant n} d_p\, c_{pq}^{\,r}\, (X_r u^{m+1}) \cdot (-1)^{f_p},$$

where

$$f_p = \sum_{p < s \leqslant h} \bar s\, d_s\, (\bar q + \bar r) = \sum_{p < s \leqslant h} \bar s\, d_s\, \bar p$$

if $c_{pq}^{\,r} \neq 0$. That component must be equal to zero. So for all $q, r = h + 1, \dots, n$,

$$\sum_{d_1 \dots d_h} \sum_{p \leqslant h} Y_{d_1 \dots d_h} \cdot (X_1 u^m)^{d_1} \dots (X_p u^m)^{d_p - 1} \dots (X_h u^m)^{d_h} \cdot d_p\, c_{pq}^{\,r}\, (-1)^{f_p} = 0.$$

Thus for any sequence $d_1, \dots, d_h$ of non-negative integers such that $d_p \leqslant 1$ if $\bar p = 1$,

$$\sum_{p \leqslant h} Y_{d_1 \dots d_p + 1 \dots d_h}\, (d_p + 1)\, c_{pq}^{\,r} \cdot (-1)^{f_p} = 0; \quad q, r = h + 1, \dots, n. \tag{3.8}$$

By our assumption we have along with (3.7) the collection of equalities in $\mathrm{S}(\mathfrak{h})$,

$$\mathrm{ad}(X_q u^2) \cdot Y = 0; \quad q = 1, \dots, h. \tag{3.9}$$

By considering the minimal component of the left-hand side of (3.9) that depends on elements $X_r u^{m+2} \in \mathfrak{h}$ we get along with (3.8) the equalities

$$\sum_{p \leqslant h} Y_{d_1 \dots d_p + 1 \dots d_h}\, (d_p + 1)\, c_{pq}^{\,r} \cdot (-1)^{f_p} = 0; \quad q, r = 1, \dots, h. \tag{3.10}$$

Let us now make use of the assumption that the centre of the Lie superalgebra $\mathfrak{b}$ is trivial. It implies that the system of linear equations on the variables $z_1, \dots, z_h$

$$\Bigl[\, \sum_{p \leqslant h} z_p\, (d_p + 1)\, X_p \cdot (-1)^{f_p},\; X_q \Bigr] = 0; \quad q = 1, \dots, n$$

has only the trivial solution. Rewrite the latter system as

$$\sum_{p \leqslant h} z_p\, (d_p + 1)\, c_{pq}^{\,r} \cdot (-1)^{f_p} = 0; \quad q, r = 1, \dots, n$$

and compare the result with (3.8), (3.10). We obtain that $Y_{d_1 \dots d_p + 1 \dots d_h} = 0$ for any index $p = 1, \dots, h$. The sequence $d_1, \dots, d_h$ here can be chosen arbitrarily. So we get $Y = Y_{0, \dots, 0}$. The case when $m$ is odd can be treated similarly. □

By applying Proposition 3.6 to the quotient Lie superalgebra $\mathfrak{b} = \mathfrak{gl}_{N|N}/\mathbb{C} \cdot E$ we complete the proof of Theorem 3.4. Before closing this section let us consider the images in $\mathrm{U}(\mathfrak{q}_N)$ of the elements $Z^{(2)}, Z^{(4)}, \dots \in \mathrm{Y}(\mathfrak{q}_N)$ with respect to the evaluation homomorphism $\pi_N$. By definition (2.12) we obtain from (3.6) that

$$\pi_N : Z^{(2)} \mapsto -\sum_k F_{kk} = -2E$$

and

$$\pi_N : Z^{(s+2)} \mapsto -\sum_{k_1, \dots, k_{s+1}} F_{k_2 k_1} \dots F_{k_{s+1} k_s}\, F_{k_1 k_{s+1}} \cdot (-1)^{\bar k_1 + \dots + \bar k_s}$$

for each $s = 2, 4, \dots$, where the indices $k, k_1, \dots, k_{s+1}$ run through $\pm 1, \dots, \pm N$. So the elements $\pi_N(Z^{(2)}), \pi_N(Z^{(4)}), \dots$ generate the centre of the algebra $\mathrm{U}(\mathfrak{q}_N)$ by [S1, Theorem 1]. In particular, we have the following corollary to Theorem 3.4.


Corollary 3.7. The image of the centre of $\mathrm{Y}(\mathfrak{q}_N)$ with respect to the evaluation homomorphism $\pi_N$ coincides with the centre of the $\mathbb{Z}_2$-graded algebra $\mathrm{U}(\mathfrak{q}_N)$.

A different construction of a distinguished linear basis in the centre of the $\mathbb{Z}_2$-graded algebra $\mathrm{U}(\mathfrak{q}_N)$ was given in [N3, Sect. 4]. In particular, that construction yields a $\mathfrak{q}_N$-analogue of the classical Capelli identity [C] for the enveloping algebra $\mathrm{U}(\mathfrak{gl}_N)$. The results of the next section underlie that construction, cf. [N4, Sect. 3].

4. Double of the Yangian

The general notion of a quantum double was introduced in [D3, Sect. 13]. Here we consider the quantum double of the Yangian $\mathrm{Y}(\mathfrak{q}_N)$; cf. [S] and [BL, Sect. 3.3]. We employ it to define the universal $R$-matrix for the Hopf superalgebra $\mathrm{Y}(\mathfrak{q}_N)$. Firstly consider a complex associative unital $\mathbb{Z}_2$-graded algebra $\mathrm{Y}^*(\mathfrak{q}_N)$ with the countable set of generators $T_{ij}^{(-s)}$, where $s = 1, 2, \dots$ and $i, j = \pm 1, \dots, \pm N$. The $\mathbb{Z}_2$-gradation on the algebra $\mathrm{Y}^*(\mathfrak{q}_N)$ is determined by setting $\deg T_{ij}^{(-s)} = \bar\imath + \bar\jmath$ for each $s \geqslant 1$. To write down defining relations for these generators we put

$$T_{ij}^*(v) = \delta_{ij} \cdot 1 + T_{ij}^{(-1)} + T_{ij}^{(-2)} v + T_{ij}^{(-3)} v^2 + \dots \in \mathrm{Y}^*(\mathfrak{q}_N)[[v]]. \tag{4.1}$$

Let us now combine all the series (4.1) into the single element

$$T^*(v) = \sum_{i,j} T_{ij}^*(v) \otimes E_{ij} \in \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[[v]].$$

Further, for any positive integer $n$ and each $s = 1, \dots, n$ we will denote

$$T_s^*(v) = \mathrm{id} \otimes \iota_s\bigl(T^*(v)\bigr) \in \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}[[v]]. \tag{4.2}$$

Then the defining relations in $\mathrm{Y}^*(\mathfrak{q}_N)$ can be written as

$$T_1^*(u)\,T_2^*(v) \cdot \bigl(1 \otimes R(u, v)\bigr) = \bigl(1 \otimes R(u, v)\bigr) \cdot T_2^*(v)\,T_1^*(u), \tag{4.3}$$

$$\mathrm{id} \otimes \eta\,\bigl(T^*(v)\bigr) = T^*(-v). \tag{4.4}$$

After multiplying each side of (4.3) by $u^2 - v^2$ it becomes a relation in the algebra $\mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}[[u, v]]$. It is equivalent to the collection of relations in the algebra $\mathrm{Y}^*(\mathfrak{q}_N)[[u, v]]$,

$$\begin{aligned} (u^2 - v^2) \cdot \bigl[\,T_{ij}^*(u), T_{kl}^*(v)\,\bigr] \cdot (-1)^{\bar\imath\bar\jmath + \bar\imath\bar l + \bar\jmath\bar l} ={}& (u + v) \cdot \bigl( T_{il}^*(u)\,T_{kj}^*(v) - T_{il}^*(v)\,T_{kj}^*(u) \bigr) \\ &+ (u - v) \cdot \bigl( T_{i,-l}^*(u)\,T_{k,-j}^*(v) - T_{-i,l}^*(v)\,T_{-k,j}^*(u) \bigr) \cdot (-1)^{\bar\imath + \bar\jmath} \end{aligned} \tag{4.5}$$

for all possible indices $i, j$ and $k, l$. Then (4.4) is equivalent to the collection of relations

$$T_{ij}^*(-v) = T_{-i,-j}^*(v). \tag{4.6}$$


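There is a natural structure of a $\mathbb{Z}_2$-graded bialgebra on $\mathrm{Y}^*(\mathfrak{q}_N)$. Due to (4.3) and (4.4) we can define a comultiplication $\Delta : \mathrm{Y}^*(\mathfrak{q}_N) \to \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N)$ by

$$T_{ij}^*(v) \mapsto \sum_k T_{ik}^*(v) \otimes T_{kj}^*(v) \cdot (-1)^{(\bar\imath + \bar k)(\bar\jmath + \bar k)}, \tag{4.7}$$

similarly to (2.19). But here the tensor product is taken over the subalgebra $\mathbb{C}[[v]]$. The counit $\varepsilon : \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$ is determined so that $\varepsilon : T_{ij}^{(-s)} \mapsto 0$ for $s \geqslant 1$. Note that $\mathrm{Y}^*(\mathfrak{q}_N)$ is a bi-superalgebra but not a Hopf superalgebra. The antipode is defined for a completion $\mathrm{Y}'(\mathfrak{q}_N)$ of $\mathrm{Y}^*(\mathfrak{q}_N)$ such that $T^*(0) \in \mathrm{Y}'(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})$ is invertible. We will construct such a completion later in this section.

There is a canonical bilinear pairing $\langle\ ,\ \rangle : \mathrm{Y}(\mathfrak{q}_N) \times \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$. We shall describe the corresponding linear map $\beta : \mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$. The latter map will be defined following [RTF, Sect. 2] so that for all numbers $m, n = 0, 1, 2, \dots$,

$$\mathrm{End}(\mathbb{C}^{N|N})^{\otimes m} \otimes \mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes(m+n)} :$$

$$T_1(u_1) \dots T_m(u_m) \otimes T_1^*(v_1) \dots T_n^*(v_n) \mapsto \overrightarrow{\prod_{1 \leqslant k \leqslant m}}\ \overrightarrow{\prod_{1 \leqslant l \leqslant n}} R_{k,m+l}(u_k, v_l) \tag{4.8}$$

under the map $\mathrm{id} \otimes \beta \otimes \mathrm{id}$. Here $u_1, \dots, u_m, v_1, \dots, v_n$ are independent variables and the product of the rational functions $R_{k,m+l}(u_k, v_l)$ should be expanded as a formal power series in $u_1^{-1}, \dots, u_m^{-1}, v_1, \dots, v_n$. In particular, when $m = n = 0$ we get the equality $\langle 1, 1 \rangle = 1$. Due to the relations (2.7), (2.9) and (4.3), (4.4) the consistency of this definition follows from (2.10), (2.11) and (2.8). The following lemma describes a basic property of the pairing $\langle\ ,\ \rangle$.

Lemma 4.1. Let $s_1, \dots, s_m$ and $r_1, \dots, r_n$ be any numbers from $\{1, 2, \dots\}$. Then

$$\bigl\langle\, T_{i_1 j_1}^{(s_1)} \dots T_{i_m j_m}^{(s_m)},\ T_{i_{m+1} j_{m+1}}^{(-r_1)} \dots T_{i_{m+n} j_{m+n}}^{(-r_n)} \,\bigr\rangle \neq 0 \ \Rightarrow\ s_1 + \dots + s_m \geqslant r_1 + \dots + r_n$$

for all $m, n = 0, 1, 2, \dots$ and any choice of the indices $i_1, j_1, \dots, i_{m+n}, j_{m+n}$.

Proof. First suppose that $r_1, \dots, r_n \geqslant 2$. Then by our definition the value of the pairing in Lemma 4.1 is up to the factor $\pm 1$ the coefficient at

$$E_{i_1 j_1} \otimes \dots \otimes E_{i_{m+n} j_{m+n}} \cdot u_1^{-s_1} \dots u_m^{-s_m} \cdot v_1^{r_1 - 1} \dots v_n^{r_n - 1} \tag{4.9}$$

in the expansion of the product in $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes(m+n)}[[u_1^{-1}, \dots, u_m^{-1}, v_1, \dots, v_n]]$

$$\overrightarrow{\prod_{1 \leqslant k \leqslant m}}\ \overrightarrow{\prod_{1 \leqslant l \leqslant n}} \Bigl( 1 - \sum_{s \geqslant 1} \frac{v_l^{\,s-1}}{u_k^{\,s}}\, P_{k,m+l}\bigl(1 + (-1)^s J_k J_{m+l}\bigr) \Bigr),$$

where we have used (2.6). If here the coefficient at (4.9) is non-zero then evidently $s_1 + \dots + s_m \geqslant r_1 + \dots + r_n$.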
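The expansion above rests on the two geometric series $1/(u - v) = \sum_{s \geqslant 1} v^{s-1} u^{-s}$ and $1/(u + v) = \sum_{s \geqslant 1} (-1)^{s-1} v^{s-1} u^{-s}$. A minimal sketch with exact rational arithmetic, assuming sample scalar values with $|v| < |u|$ (the function names `partial_sum` and `remainder` are hypothetical), checks the partial sums against the closed forms. It illustrates only the scalar coefficients, not the operators $P_{k,m+l}$ and $J_k$.

```python
from fractions import Fraction


def partial_sum(u, v, terms, sign=1):
    """Sum of (sign*v)**(s-1) / u**s for s = 1..terms.

    sign=+1 approximates 1/(u - v); sign=-1 approximates 1/(u + v).
    """
    u, v = Fraction(u), Fraction(v)
    return sum((sign * v) ** (s - 1) / u ** s for s in range(1, terms + 1))


def remainder(u, v, terms, sign=1):
    """Exact difference between the closed form and the partial sum."""
    u, v = Fraction(u), Fraction(v)
    closed = 1 / (u - sign * v)
    return closed - partial_sum(u, v, terms, sign)


# The remainder after n terms is (sign*v/u)**n / (u - sign*v),
# so it shrinks geometrically when |v| < |u|.
print(remainder(3, 1, 5))       # remainder for 1/(u - v)
print(remainder(3, 1, 5, -1))   # remainder for 1/(u + v)
```

Every term of the series carries $u^{-s}$ together with $v^{s-1}$, which is the scalar pattern the degree count in the proof relies on.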


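Now suppose that some of the numbers $r_1, \dots, r_n$ are equal to 1. Without loss of generality we will assume that $r_1, \dots, r_p \geqslant 2$ and $r_{p+1}, \dots, r_n = 1$ for some $p < n$. Rewrite the product over the indices $k, l$ at the right-hand side of (4.8) as

$$\overrightarrow{\prod_{1 \leqslant l \leqslant p}}\ \overrightarrow{\prod_{1 \leqslant k \leqslant m}} R_{k,m+l}(u_k, v_l) \cdot \overrightarrow{\prod_{p < l \leqslant n}}\ \overrightarrow{\prod_{1 \leqslant k \leqslant m}} R_{k,m+l}(u_k, v_l).$$

Now the value of the pairing in Lemma 4.1 is up to the factor $\pm 1$ the coefficient at (4.9) in the expansion of the product

$$\overrightarrow{\prod_{1 \leqslant l \leqslant p}}\ \overrightarrow{\prod_{1 \leqslant k \leqslant m}} \Bigl( 1 - \sum_{s \geqslant 1} \frac{v_l^{\,s-1}}{u_k^{\,s}}\, P_{k,m+l}\bigl(1 + (-1)^s J_k J_{m+l}\bigr) \Bigr) \times \overrightarrow{\prod_{p < l \leqslant n}} \Bigl(\ \overrightarrow{\prod_{1 \leqslant k \leqslant m}} \Bigl( 1 - \sum_{s \geqslant 1} \frac{v_l^{\,s-1}}{u_k^{\,s}}\, P_{k,m+l}\bigl(1 + (-1)^s J_k J_{m+l}\bigr) \Bigr) - 1 \Bigr). \tag{4.10}$$

If here that coefficient is non-zero then $s_1 + \dots + s_m \geqslant r_1 + \dots + r_p + n - p$. □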

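We will equip the algebra $\mathrm{Y}^*(\mathfrak{q}_N)$ with the descending $\mathbb{Z}$-filtration defined by assigning to the generator $T_{ij}^{(-s)}$ the degree $s$ for any $s \geqslant 1$. The corresponding $\mathbb{Z}$-graded algebra will be denoted by $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$. The formal completion of the algebra $\mathrm{Y}^*(\mathfrak{q}_N)$ with respect to this filtration will be denoted by $\mathrm{Y}'(\mathfrak{q}_N)$. We will extend the comultiplication $\Delta$ on $\mathrm{Y}^*(\mathfrak{q}_N)$ to the algebra $\mathrm{Y}'(\mathfrak{q}_N)$, and still denote this extension by $\Delta$. The image $\Delta\bigl(\mathrm{Y}'(\mathfrak{q}_N)\bigr)$ lies in the formal completion of the algebra $\mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N)$ with respect to the descending $\mathbb{Z}$-filtration, defined by assigning to the element $T_{ij}^{(-r)} \otimes T_{kl}^{(-s)}$ the degree $r + s$. Indeed, with respect to this filtration $\Delta\bigl(T_{ij}^{(-r)}\bigr)$ is a finite sum of elements of degree not less than $r$.

Let $G_{ij}^{(-s)} \in \mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ be the element corresponding to the generator $T_{ij}^{(-s)}$ of the algebra $\mathrm{Y}^*(\mathfrak{q}_N)$. The algebra $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ inherits the $\mathbb{Z}_2$-gradation from $\mathrm{Y}^*(\mathfrak{q}_N)$ such that for any $s \geqslant 1$ we have $\deg G_{ij}^{(-s)} = \bar\imath + \bar\jmath$. By the relations (4.6) we have

$$G_{-i,-j}^{(-s)} = (-1)^{s+1}\, G_{ij}^{(-s)}, \quad s \geqslant 1. \tag{4.11}$$

Furthermore, we can define a bilinear pairing

$$\langle\ ,\ \rangle : \mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N) \times \mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C} \tag{4.12}$$

by making

$$\bigl\langle\, t_{i_1 j_1}^{(s_1)} \dots t_{i_m j_m}^{(s_m)},\ G_{i_{m+1} j_{m+1}}^{(-r_1)} \dots G_{i_{m+n} j_{m+n}}^{(-r_n)} \,\bigr\rangle$$

equal to

$$\bigl\langle\, T_{i_1 j_1}^{(s_1)} \dots T_{i_m j_m}^{(s_m)},\ T_{i_{m+1} j_{m+1}}^{(-r_1)} \dots T_{i_{m+n} j_{m+n}}^{(-r_n)} \,\bigr\rangle$$

if $s_1 + \dots + s_m = r_1 + \dots + r_n$ and equal to zero otherwise. Here $m, n \geqslant 0$ and $s_1, \dots, s_m, r_1, \dots, r_n \geqslant 1$ while the indices $i_1, j_1, \dots, i_{m+n}, j_{m+n}$ are arbitrary. This definition is correct due to Lemma 4.1. Now for each $s = 0, 1, 2, \dots$ denote by $\mathrm{gr}_s \mathrm{Y}(\mathfrak{q}_N)$ and $\mathrm{gr}_s \mathrm{Y}^*(\mathfrak{q}_N)$ the subspaces of degree $s$ in the $\mathbb{Z}$-graded algebras $\mathrm{gr}\,\mathrm{Y}(\mathfrak{q}_N)$ and $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ respectively.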


Lemma 4.2. The restriction of the pairing (4.12) to $\mathrm{gr}_s \mathrm{Y}(\mathfrak{q}_N) \times \mathrm{gr}_s \mathrm{Y}^*(\mathfrak{q}_N)$ is non-degenerate for any $s \geqslant 0$.

Proof. Fix any integers $s_1, \dots, s_m, r_1, \dots, r_n \geqslant 1$ such that $s_1 + \dots + s_m = r_1 + \dots + r_n$. Without loss of generality we will assume that $s_1 \geqslant \dots \geqslant s_m$ and $r_1 \geqslant \dots \geqslant r_n$. Suppose that $r_1, \dots, r_p \geqslant 2$ while $r_{p+1}, \dots, r_n = 1$ for some $p \geqslant 0$. Now we do not exclude the case $p = n$. Let us consider the coefficient at

$$u_1^{-s_1} \dots u_m^{-s_m} \cdot v_1^{r_1 - 1} \dots v_n^{r_n - 1} \tag{4.13}$$

in the expansion of the product (4.10) as a series in $u_1^{-1}, \dots, u_m^{-1}, v_1, \dots, v_n$. By our assumptions this coefficient can be non-zero only if $m = n$ and $s_k = r_k$ for all indices $k = 1, \dots, m$. Suppose that this is the case. For $r = 1, 2, \dots$ denote by $S_r$ the segment of the sequence $1, \dots, m$ consisting of all $k$ such that $s_k = r$. Then the coefficient at (4.13) in the expansion of (4.10) equals

$$(-1)^m \cdot \prod_{r \geqslant 1}\ \sum_{g}\ \prod_{l \in S_r} P_{g(l),m+l}\bigl(1 + (-1)^r J_{g(l)} J_{m+l}\bigr), \tag{4.14}$$

where the index $g$ runs through the set of all permutations of the sequence $S_r$. Note that the factors in each of the above two products commute.

Choose any basis in the space $\mathrm{gr}_s \mathrm{Y}(\mathfrak{q}_N)$ consisting of monomials $t_{i_1 j_1}^{(s_1)} \dots t_{i_m j_m}^{(s_m)}$ such that $s_1 \geqslant \dots \geqslant s_m \geqslant 1$, $s_1 + \dots + s_m = s$ and

$$i_k \in \{1, \dots, N\}, \quad j_k \in \{\pm 1, \dots, \pm N\}$$

for $k = 1, \dots, m$ while the number $m \geqslant 0$ can vary. The above argument using the expression (4.14) shows that for any two elements of this basis

$$t_{i_1 j_1}^{(s_1)} \dots t_{i_m j_m}^{(s_m)} \quad \text{and} \quad t_{i_{m+1} j_{m+1}}^{(r_1)} \dots t_{i_{m+n} j_{m+n}}^{(r_n)}$$

the value

$$\bigl\langle\, t_{i_1 j_1}^{(s_1)} \dots t_{i_m j_m}^{(s_m)},\ G_{j_{m+1} i_{m+1}}^{(-r_1)} \dots G_{j_{m+n} i_{m+n}}^{(-r_n)} \,\bigr\rangle$$

is non-zero only if $m = n$ and for each index $k = 1, \dots, m$ we have the equalities $i_{m+k} = i_k$, $j_{m+k} = j_k$, $r_k = s_k$. In the latter case that value up to the factor $\pm 1$ is the product $a!\,b! \dots$, where $a, b, \dots$ are multiplicities in the sequence of the triples $(i_1, j_1, s_1), \dots, (i_m, j_m, s_m)$. But the products

$$G_{j_1 i_1}^{(-s_1)} \dots G_{j_m i_m}^{(-s_m)} \in \mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$$

corresponding to elements of our basis in $\mathrm{gr}_s \mathrm{Y}(\mathfrak{q}_N)$ span the space $\mathrm{gr}_s \mathrm{Y}^*(\mathfrak{q}_N)$. □

Take the subalgebra $\mathfrak{g}' = u \cdot \mathfrak{g}$ in the Lie superalgebra $\mathfrak{gl}_{N|N}[u]$, see definition (1.6). Consider the corresponding universal enveloping algebra $\mathrm{U}(\mathfrak{g}')$.

Corollary 4.3. The $\mathbb{Z}_2$-graded algebras $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ and $\mathrm{U}(\mathfrak{g}')$ are isomorphic.

Proof. Consider the elements $F_{ij}^{(s)}$ of the universal enveloping algebra of $\mathfrak{gl}_{N|N}[u]$ with $s \geqslant 0$, defined by (2.15). Any relation between these elements follows from (2.17), (2.18). On the other hand, the generators $G_{ij}^{(-s)}$ of the algebra $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ with $s \geqslant 1$ satisfy (4.11). Due to (4.5) they also satisfy the relations

$$\begin{aligned} (-1)^{\bar\imath\bar\jmath + \bar\imath\bar l + \bar\jmath\bar l} \cdot \bigl[\,G_{ij}^{(-s)}, G_{kl}^{(-r)}\,\bigr] ={}& \delta_{kj}\, G_{il}^{(-s-r)} - \delta_{il}\, G_{kj}^{(-s-r)} \\ &+ \delta_{k,-j}\, G_{-i,l}^{(-s-r)} \cdot (-1)^{\bar\imath + \bar\jmath + s} - \delta_{-i,l}\, G_{k,-j}^{(-s-r)} \cdot (-1)^{\bar\imath + \bar\jmath + s} \end{aligned}$$

for all $r, s \geqslant 1$. Therefore one can define a homomorphism of the algebra $\mathrm{U}(\mathfrak{g}')$ onto $\mathrm{gr}\,\mathrm{Y}^*(\mathfrak{q}_N)$ by

$$u F_{ji}^{(s)} \mapsto -G_{ij}^{(-s-1)} \cdot (-1)^{\bar\imath}.$$

But Lemma 4.2 implies that the kernel of this homomorphism is trivial. □

We formulate the main property of the pairing $\langle\ ,\ \rangle$ as the next proposition.

Proposition 4.4. The bilinear map $\langle\ ,\ \rangle : \mathrm{Y}(\mathfrak{q}_N) \times \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$ is a non-degenerate bi-superalgebra pairing.

Proof. Lemma 4.1 and Lemma 4.2 show that the pairing $\langle\ ,\ \rangle$ is non-degenerate. Due to (2.19) and (4.7) the definition (4.8) implies that for any $X, Y \in \mathrm{Y}(\mathfrak{q}_N)$ and $X', Y' \in \mathrm{Y}^*(\mathfrak{q}_N)$ we have

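$$\langle XY, X' \rangle = \langle X \otimes Y, \Delta(X') \rangle \quad \text{and} \quad \langle X, X'Y' \rangle = \langle \Delta(X), X' \otimes Y' \rangle, \tag{4.15}$$

where we employ the convention

$$\langle X \otimes Y, X' \otimes Y' \rangle = \langle X, X' \rangle\, \langle Y, Y' \rangle \cdot (-1)^{\deg X \,\deg Y'}$$

for the homogeneous elements $X$ and $Y'$. Also by definition we have $\langle 1, 1 \rangle = 1$. Moreover, by setting $n = 0$ in definition (4.8) we get for any $s_1, \dots, s_m \geqslant 1$,

$$\bigl\langle\, T_{i_1 j_1}^{(s_1)} \dots T_{i_m j_m}^{(s_m)},\ 1 \,\bigr\rangle = 0, \quad m \geqslant 1.$$

Thus $\langle X, 1 \rangle = \varepsilon(X)$ for the counit $\varepsilon$ on $\mathrm{Y}(\mathfrak{q}_N)$. Furthermore, by setting $m = 0$ in Lemma 4.1 we obtain for any $r_1, \dots, r_n \geqslant 1$,

$$\bigl\langle\, 1,\ T_{i_1 j_1}^{(-r_1)} \dots T_{i_n j_n}^{(-r_n)} \,\bigr\rangle = 0, \quad n \geqslant 1.$$

Therefore $\langle 1, X' \rangle = \varepsilon(X')$ for the counit $\varepsilon$ on $\mathrm{Y}^*(\mathfrak{q}_N)$. □

By Lemma 4.1 the pairing $\mathrm{Y}(\mathfrak{q}_N) \times \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$ extends to $\mathrm{Y}(\mathfrak{q}_N) \times \mathrm{Y}'(\mathfrak{q}_N)$. Let us now choose any linear basis in the vector space $\mathrm{Y}(\mathfrak{q}_N)$. An element of this basis will be denoted by $Y_\sigma$. There is a system of vectors $Y^\sigma \in \mathrm{Y}'(\mathfrak{q}_N)$ dual to this basis. The formal sum of elements from $\mathrm{Y}'(\mathfrak{q}_N) \otimes \mathrm{Y}(\mathfrak{q}_N)$,

$$\mathcal{R} = \sum_\sigma Y^\sigma \otimes Y_\sigma$$

does not depend on the choice of basis in $\mathrm{Y}(\mathfrak{q}_N)$. It is called the universal $R$-matrix for the Yangian $\mathrm{Y}(\mathfrak{q}_N)$. The double of the Yangian is an associative complex unital algebra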


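$\mathrm{DY}(\mathfrak{q}_N)$ which contains $\mathrm{Y}(\mathfrak{q}_N)$ and $\mathrm{Y}^*(\mathfrak{q}_N)$ as subalgebras. Moreover, it is generated by these two subalgebras. We also impose the relations

$$\mathcal{R} \cdot \Delta(X) = \Delta^{\circ}(X) \cdot \mathcal{R}, \quad X \in \mathrm{Y}^*(\mathfrak{q}_N), \tag{4.16}$$

where $\Delta^{\circ}$ is the composition of the comultiplication $\Delta$ on $\mathrm{Y}^*(\mathfrak{q}_N)$ with the involutive automorphism $\theta$ of the algebra $\mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N)$. Either side of the equality (4.16) makes sense as a formal sum of elements from $\mathrm{Y}'(\mathfrak{q}_N) \otimes \mathrm{DY}(\mathfrak{q}_N)$. The equalities (4.15) imply that for the comultiplications $\Delta$ on $\mathrm{Y}(\mathfrak{q}_N)$, $\mathrm{Y}'(\mathfrak{q}_N)$,

$$\Delta \otimes \mathrm{id}\,(\mathcal{R}) = \mathcal{R}_{13}\,\mathcal{R}_{23}, \quad \mathrm{id} \otimes \Delta\,(\mathcal{R}) = \mathcal{R}_{12}\,\mathcal{R}_{13}, \tag{4.17}$$

where

$$\mathcal{R}_{12} = \sum_\sigma Y^\sigma \otimes Y_\sigma \otimes 1, \quad \mathcal{R}_{13} = \sum_\sigma Y^\sigma \otimes 1 \otimes Y_\sigma, \quad \mathcal{R}_{23} = \sum_\sigma 1 \otimes Y^\sigma \otimes Y_\sigma.$$

It follows from (4.17) that $\mathcal{R}^{-1} = \mathrm{id} \otimes S\,(\mathcal{R})$ for the antipodal map $S$ on $\mathrm{Y}(\mathfrak{q}_N)$. Let us now regard the parameter $z$ in the definition (2.20) as a formal parameter. Then we get a representation $\mathrm{Y}(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})[z]$. We will denote it by $\rho_z$. Moreover, by comparing (2.8), (2.11) to (4.3), (4.4) respectively we obtain that

$$\mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[[v]] \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}[[z^{-1}, v]] : \quad T^*(v) \mapsto R(z, v) \tag{4.18}$$

determines a representation $\mathrm{Y}^*(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})[z^{-1}]$. We will denote it by $\rho_z^*$. More explicitly, for each index $s \geqslant 1$ we have

$$\rho_z^* : T_{ij}^{(-s)} \mapsto -\bigl( E_{ji}\, z^{-s} + E_{-j,-i}\, (-z)^{-s} \bigr) \cdot (-1)^{\bar\imath}.$$

Therefore we can now extend $\rho_z^*$ to a representation $\mathrm{Y}'(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})[[z^{-1}]]$.

Proposition 4.5. We have $\rho_z^* \otimes \mathrm{id}\,(\mathcal{R}) = T(z)$ and also $\mathrm{id} \otimes \rho_z\,(\mathcal{R}) = T^*(z)$.

Proof. By the definition of our canonical pairing $\mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N) \to \mathbb{C}$, for any $m \geqslant 0$ the element $T^*(z) \in \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[[z]]$ has the property that

$$\mathrm{End}(\mathbb{C}^{N|N})^{\otimes m} \otimes \mathrm{Y}(\mathfrak{q}_N) \otimes \mathrm{Y}^*(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N}) \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes(m+1)} :$$

$$T_1(u_1) \dots T_m(u_m) \otimes T^*(z) \mapsto R_{1,m+1}(u_1, z) \dots R_{m,m+1}(u_m, z)$$

under the map $\mathrm{id} \otimes \beta \otimes \mathrm{id}$. To get the second equality in Proposition 4.5 it suffices to show that the element $\mathrm{id} \otimes \rho_z\,(\mathcal{R}) \in \mathrm{Y}'(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[z]$ has the same property. Due to the definition of the element $\mathcal{R}$ the latter property amounts to

$$\mathrm{id} \otimes \rho_z : \mathrm{End}(\mathbb{C}^{N|N})^{\otimes m} \otimes \mathrm{Y}(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes(m+1)} : \quad T_1(u_1) \dots T_m(u_m) \mapsto R_{1,m+1}(u_1, z) \dots R_{m,m+1}(u_m, z),$$

which holds by (2.20). The proof of the first equality in Proposition 4.5 is similar. □

Corollary 4.6. Representations $\rho_z$ of $\mathrm{Y}(\mathfrak{q}_N)$ and $\rho_z^*$ of $\mathrm{Y}^*(\mathfrak{q}_N)$ determine a representation of the algebra $\mathrm{DY}(\mathfrak{q}_N)$ in $\mathrm{End}(\mathbb{C}^{N|N})[z, z^{-1}]$.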


Proof. According to (4.16) we have to verify for any $X \in \mathrm{Y}^*(\mathfrak{q}_N)$ the relation

$$\mathrm{id} \otimes \rho_u\,(\mathcal{R}) \cdot \mathrm{id} \otimes \rho_u^*\,\bigl(\Delta(X)\bigr) = \mathrm{id} \otimes \rho_u^*\,\bigl(\Delta^{\circ}(X)\bigr) \cdot \mathrm{id} \otimes \rho_u\,(\mathcal{R})$$

in $\mathrm{Y}'(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[u, u^{-1}]$. It suffices to set here $X = T_{ij}^*(v)$. Due to the definitions (4.7) and (4.18) the collection of the resulting relations for all indices $i, j$ is exactly the defining relation (4.3). □

To write down commutation relations in the algebra $\mathrm{DY}(\mathfrak{q}_N)$ we will use the tensor product $\mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{DY}(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})$. There is a natural embedding of the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes 2}$ into this tensor product: $X \otimes Y \mapsto X \otimes 1 \otimes Y$ for any elements $X, Y \in \mathrm{End}(\mathbb{C}^{N|N})$. Denote by $\widehat R(u, v)$ the image of (2.6) with respect to this embedding. Then we obtain another corollary to Proposition 4.5.

Corollary 4.7. In $\mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{DY}(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[[u^{-1}, v]]$ we have

$$\bigl(T(u) \otimes 1\bigr) \cdot \widehat R(u, v) \cdot \bigl(1 \otimes T^*(v)\bigr) = \bigl(1 \otimes T^*(v)\bigr) \cdot \widehat R(u, v) \cdot \bigl(T(u) \otimes 1\bigr). \tag{4.19}$$

Proof. Put $X = T_{ij}^*(v)$ in (4.16). Apply the homomorphism $\rho_u^* \otimes \mathrm{id}$ to the resulting equality and use definition (4.7). Then we get the equality

$$T(u) \cdot \sum_k \rho_u^*\bigl(T_{ik}^*(v)\bigr) \otimes T_{kj}^*(v) \cdot (-1)^{(\bar\imath + \bar k)(\bar\jmath + \bar k)} = \sum_k \rho_u^*\bigl(T_{kj}^*(v)\bigr) \otimes T_{ik}^*(v) \cdot T(u)$$

in $\mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{DY}(\mathfrak{q}_N)[[u^{-1}, v]]$ by Proposition 4.5. Due to definition (4.18) the collection of the above equalities for all indices $i, j$ is equivalent to (4.19). □

Theorem 4.8. The relation (4.19) implies the defining relations (4.16).

Proof. Let $u_1, u_2, \dots$ be independent formal parameters. For each $n = 1, 2, \dots$ take the tensor product $\nu^*$ of the representations $\rho_z^* : \mathrm{Y}'(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})[[z^{-1}]]$ with $z = u_1, \dots, u_n$. Using our descending $\mathbb{Z}$-filtration on the algebra $\mathrm{Y}^*(\mathfrak{q}_N)$ and Corollary 4.3, we can prove that the kernels of all representations $\nu^*$ have zero intersection. The proof is similar to the proof of Proposition 2.2 and is omitted here. Hence it suffices to derive from the relation (4.19) that for any $X \in \mathrm{Y}^*(\mathfrak{q}_N)$,

$$\nu^* \otimes \mathrm{id}\,\bigl(\mathcal{R} \cdot \Delta(X)\bigr) = \nu^* \otimes \mathrm{id}\,\bigl(\Delta^{\circ}(X) \cdot \mathcal{R}\bigr). \tag{4.20}$$

Let us again use Proposition 4.5 along with definition (4.7). The collection of all equalities (4.20) for $X = T_{ij}^*(v)$ with various indices $i, j$ can be written as the single relation in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes \mathrm{DY}(\mathfrak{q}_N) \otimes \mathrm{End}(\mathbb{C}^{N|N})[[u_1^{-1}, \dots, u_n^{-1}, v]]$,

$$\begin{aligned} &\bigl(T_1(u_1) \dots T_n(u_n) \otimes 1\bigr) \cdot \widehat R_{1,n+1}(u_1, v) \dots \widehat R_{n,n+1}(u_n, v) \cdot \bigl(1 \otimes T^*(v)\bigr) \\ &\quad = \bigl(1 \otimes T^*(v)\bigr) \cdot \widehat R_{1,n+1}(u_1, v) \dots \widehat R_{n,n+1}(u_n, v) \cdot \bigl(T_1(u_1) \dots T_n(u_n) \otimes 1\bigr), \end{aligned} \tag{4.21}$$

where $\widehat R_{1,n+1}(u_1, v), \dots, \widehat R_{n,n+1}(u_n, v)$ are respectively the images of the elements

$$R_{1,n+1}(u_1, v), \dots, R_{n,n+1}(u_n, v) \in \mathrm{End}(\mathbb{C}^{N|N})^{\otimes(n+1)}[[u_1^{-1}, \dots, u_n^{-1}, v]]$$

under the natural embedding of the latter algebra to the former one. But using (4.19) repeatedly, we obtain (4.21). □

Thus we have proved that the relations (4.19) together with the relations (2.7), (2.9) and (4.3), (4.4) are defining relations for the algebra $\mathrm{DY}(\mathfrak{q}_N)$; cf. [KT, Sect. 2].


5. Representations of the Yangian

Here we construct a wide class of irreducible representations of the algebra $\mathrm{Y}(\mathfrak{q}_N)$, by using irreducible representations of a certain less complicated algebra $A_n$, where the index $n = 1, 2, \dots$ may vary. The algebra $A_n$ was introduced in [N2] and called the degenerate affine Sergeev algebra, in honour of the author of [S1, S2]. This is an analogue of the degenerate affine Hecke algebra, which was employed in [D2] to construct irreducible representations of the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$ of the general linear Lie algebra $\mathfrak{gl}_N$. Results presented in this section were reported for the first time in the summer of 1991 at the Wigner Symposium in Goslar, Germany. They were also reported in the autumn of 1992 at the Symposium on Representation Theory in Yamagata, Japan. The non-degenerate affine Sergeev algebra was defined in [JN], cf. [O2].

Consider the crossed product $H_n$ of the symmetric group $S_n$ with the Clifford algebra over the complex field $\mathbb{C}$ on $n$ anticommuting generators. These generators are denoted by $c_1, \dots, c_n$ and are subject to the relations

$$c_p^2 = -1, \quad c_p c_q = -c_q c_p \ \text{ if } p \neq q.$$

The group $S_n$ acts on the Clifford algebra by permutations of these $n$ generators. Let $w_{pq} \in S_n$ be the transposition of two numbers $p \neq q$. There is a representation $H_n \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ determined by the assignments $w_{pq} \mapsto P_{pq}$ and $c_p \mapsto J_p$, see definitions (1.8) and (2.5). The supercommutant of the image of this representation in $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ coincides by [S2, Theorem 3] with the image of the $n$th tensor power of the defining representation $\mathrm{U}(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})$. By definition, the complex algebra $A_n$ is generated by the algebra $H_n$ and the pairwise commuting elements $x_1, \dots, x_n$ with the following relations:

$$\begin{aligned} &x_p\, w_{q,q+1} = w_{q,q+1}\, x_p \ \text{ if } p \neq q, q + 1; \\ &x_p\, w_{p,p+1} = w_{p,p+1}\, x_{p+1} - 1 - c_p c_{p+1}; \\ &x_p\, c_q = c_q\, x_p \ \text{ if } p \neq q, \quad x_p\, c_p = -c_p\, x_p. \end{aligned} \tag{5.1}$$

The algebra $A_n$ is $\mathbb{Z}_2$-graded so that $\deg c_p = 1$ while $\deg x_p = \deg w_{pq} = 0$.

Proposition 5.1. Let $Y$ range over a basis in $H_n$ and let each of $s_1, \dots, s_n$ range over the non-negative integers. Then the products $Y x_1^{s_1} \dots x_n^{s_n}$ form a basis in $A_n$.

Proof. For $m = 0, 1, 2, \dots$ one can define a homomorphism $\gamma_m : A_n \to H_{m+n}$ by

$$\gamma_m : \quad w_{pq} \mapsto w_{m+p,m+q}, \quad c_p \mapsto c_{m+p}, \quad x_p \mapsto \sum_{1 \leqslant r < m+p} (1 + c_{m+p}\, c_r)\, w_{m+p,r}.$$

This can be verified directly by (5.1). Suppose that $m \geqslant s_1 + \dots + s_n$. Choose for every $p = 1, \dots, n$ a subsequence $M_p$ in $1, \dots, m$ of cardinality $s_p$ so that all these subsequences are disjoint. Write the image of $x_1^{s_1} \dots x_n^{s_n}$ under $\gamma_m$ as a linear combination of the elements $c_r \dots c_{r'}\, w \in H_{m+n}$, where $1 \leqslant r < \dots < r' \leqslant m + n$ and $w \in S_{m+n}$. Consider the terms in this linear combination where $w$ has the maximal possible length. Amongst them we find the term

$$\overrightarrow{\prod_{1 \leqslant p \leqslant n}}\ \prod_{r \in M_p} w_{m+p,r}$$


which allows us to restore the exponents $s_1, \ldots, s_n$ and the basis element $Y \in H_n$ from the image $\gamma_m(Y x_1^{s_1} \cdots x_n^{s_n})$ uniquely. By using the relations (5.1) every element of the algebra $A_n$ can be expressed as a finite linear combination of the products $Y x_1^{s_1} \cdots x_n^{s_n}$. Now take any such linear combination and suppose that $m \ge s_1 + \ldots + s_n$ for all its terms. Then the above analysis shows that for all the terms, the images $\gamma_m(Y x_1^{s_1} \cdots x_n^{s_n})$ are linearly independent in $H_{m+n}$. □

Along with pairwise commuting generators $x_1, \ldots, x_n$ we need the non-commuting generators
\[
y_p = x_p - \sum_{1 \le q < p} (1 + c_p c_q)\, w_{pq}; \qquad p = 1, \ldots, n.
\]
Observe that the generators $y_1, \ldots, y_n$ belong to the kernel of the homomorphism $\gamma_0 : A_n \to H_n$ as defined in the proof of Proposition 5.1. By using this observation,
\[
\begin{aligned}
w\, y_p\, w^{-1} &= y_{w(p)}, \quad w \in S_n; \\
y_p\, c_q &= c_q\, y_p \quad \text{if } p \neq q, \qquad y_p\, c_p = -c_p\, y_p.
\end{aligned}
\tag{5.2}
\]
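The statement in the proof of Proposition 5.1 that $\gamma_m$ respects the relations (5.1) can be tested by machine for small $m$ and $n$. The following is only a sketch under our own conventions, not part of the original argument: elements of $H_N$ are stored as dictionaries keyed by a sorted Clifford monomial and a permutation, the product uses $w\, c_p\, w^{-1} = c_{w(p)}$, and all function names (`cliff_append`, `mul`, `gamma_x`, `relations_hold`) are hypothetical.

```python
def cliff_append(mono, x):
    """Right-multiply a sorted Clifford monomial by c_x; return (sign, sorted monomial)."""
    idx = list(mono)
    pos = len(idx)
    while pos > 0 and idx[pos - 1] > x:
        pos -= 1
    sign = -1 if (len(idx) - pos) % 2 else 1   # c_x anticommutes past each larger index
    if pos > 0 and idx[pos - 1] == x:
        idx.pop(pos - 1)                        # c_x * c_x = -1
        sign = -sign
    else:
        idx.insert(pos, x)
    return sign, tuple(idx)

def mul(A, B):
    """Product in the crossed product H_N; elements are dicts {(monomial, perm): coeff}."""
    out = {}
    for (ca, wa), xa in A.items():
        for (cb, wb), xb in B.items():
            sign, mono = 1, ca
            for p in cb:                        # move wa to the right: w c_p = c_{w(p)} w
                s, mono = cliff_append(mono, wa[p - 1])
                sign *= s
            perm = tuple(wa[j - 1] for j in wb)  # composition wa o wb
            out[(mono, perm)] = out.get((mono, perm), 0) + sign * xa * xb
    return {k: v for k, v in out.items() if v}

def add(A, B, scale=1):
    """Return A + scale * B with zero coefficients dropped."""
    out = dict(A)
    for k, v in B.items():
        out[k] = out.get(k, 0) + scale * v
    return {k: v for k, v in out.items() if v}

def ident(N):
    return tuple(range(1, N + 1))

def one(N):
    return {((), ident(N)): 1}

def c(N, p):
    return {((p,), ident(N)): 1}

def w(N, a, b):
    perm = list(ident(N))
    perm[a - 1], perm[b - 1] = b, a
    return {((), tuple(perm)): 1}

def gamma_x(N, k):
    """Image of x_p under gamma_m, written with k = m + p."""
    total = {}
    for r in range(1, k):
        total = add(total, mul(add(one(N), mul(c(N, k), c(N, r))), w(N, k, r)))
    return total

def relations_hold(m, n):
    """Check the images of relations (5.1), and commutativity of the x_p, in H_{m+n}."""
    N = m + n
    X = {p: gamma_x(N, m + p) for p in range(1, n + 1)}
    C = {p: c(N, m + p) for p in range(1, n + 1)}
    ok = True
    for p in range(1, n + 1):
        for q in range(1, n + 1):
            ok &= mul(X[p], X[q]) == mul(X[q], X[p])
            sign = -1 if p == q else 1
            ok &= mul(X[p], C[q]) == add({}, mul(C[q], X[p]), sign)
        for q in range(1, n):
            s = w(N, m + q, m + q + 1)
            if p == q:
                rhs = add(mul(s, X[p + 1]), add(one(N), mul(C[p], C[p + 1])), -1)
                ok &= mul(X[p], s) == rhs
            elif p != q + 1:
                ok &= mul(X[p], s) == mul(s, X[p])
    return ok
```

Calling `relations_hold(m, n)` for a few small pairs, say $m \le 2$ and $n \le 3$, exercises every relation in (5.1) at least once; it is of course no substitute for the direct verification.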

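Relations (5.1) and relations in the first line of (5.2) yield the commutation relations
\[
w_{pq}\, [y_p, y_q] = y_p - y_q + c_p c_q\, (y_p + y_q)
\]
for the generators $y_p, y_q$ with arbitrary indices $p, q = 1, \ldots, n$. Now take the tensor product of the $\mathbb{Z}_2$-graded algebras $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ and $A_n$. Since the elements $x_1, \ldots, x_n \in A_n$ pairwise commute, the assignment
\[
\mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]] \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes (n+1)} \otimes A_n[[u^{-1}]] : \quad
T(u) \mapsto \prod^{\rightarrow}_{1 \le p \le n} \Bigl( 1 - P_{1,p+1} \otimes \frac{1}{u - x_p} + P_{1,p+1} J_1 J_{p+1} \otimes \frac{1}{u + x_p} \Bigr)
\tag{5.3}
\]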

determines a homomorphism $\mathrm{Y}(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes A_n$, see (2.6) and (2.22). As usual, the fractions $1/(u \pm x_p)$ in (5.3) should be expanded as formal power series in $u^{-1}$. The next proposition is a key to our construction, cf. [BGHP, Sect. 2.1].

Proposition 5.2. a) The difference between the product (5.3) and the sum
\[
1 - \sum_{1 \le p \le n} P_{1,p+1} \otimes \frac{1}{u - y_p} + \sum_{1 \le p \le n} P_{1,p+1} J_1 J_{p+1} \otimes \frac{1}{u + y_p}
\tag{5.4}
\]
belongs to the left ideal in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes (n+1)} \otimes A_n[[u^{-1}]]$ generated by all the elements $1 - P_{p+1,q+1} \otimes w_{pq}$ and $1 - J_{p+1} J_{q+1} \otimes c_p c_q$ with $p \neq q$.
b) The sum (5.4) commutes with the elements $P_{p+1,q+1} \otimes w_{pq}$ and $J_{p+1} \otimes c_p$.

Proof. Part (b) immediately follows from the relations (5.2). To prove (a), we will use induction on $n$. When $n = 1$, the equality $x_1 = y_1$ provides the induction base. Suppose


that n > 1 and that Proposition 5.2 is true for n − 1 instead of n. Then the difference between (5.3) and (5.4) equals X X 1 1 P1,p+1 ⊗ + P1,p+1 J1 Jp+1 ⊗ − 1− u − yp u + yp 16p
where we have used the fact that $x_n$ commutes with $y_1, \ldots, y_{n-1}$. In these two lines, the second tensor factors differ by changing $u$ to $-u$. It suffices to show that in the first line, the second tensor factor equals zero in $A_n[[u^{-1}]]$. Multiplying this tensor factor on the left by $u - x_n$, on the right by $u - y_n$, and using the relations (5.2) in $A_n$ we get the sum
\[
(u - x_n) - (u - y_n) + \sum_{1 \le p < n} (1 + c_n c_p)\, w_{pn}.
\]
But this sum equals zero by the definition of the element $y_n \in A_n$. □

Take any representation $\xi : A_n \to \mathrm{End}(U)$, where the complex vector space $U$ is $\mathbb{Z}_2$-graded, but not necessarily finite-dimensional. The algebra $\mathrm{End}(U)$ is then $\mathbb{Z}_2$-graded. We assume that the homomorphism $\xi$ preserves $\mathbb{Z}_2$-gradation. Take the tensor product of vector spaces $(\mathbb{C}^{N|N})^{\otimes n} \otimes U$. We identify the tensor product $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes \mathrm{End}(U)$ with the algebra $\mathrm{End}\bigl((\mathbb{C}^{N|N})^{\otimes n} \otimes U\bigr)$ so that
\[
(A \otimes B) \cdot (a \otimes b) = (Aa) \otimes (Bb) \cdot (-1)^{\deg a \, \deg B}
\]
for any homogeneous $a \in (\mathbb{C}^{N|N})^{\otimes n}$, $b \in U$ and $A \in \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$, $B \in \mathrm{End}(U)$. There is an action of the hyperoctahedral group $S_n \ltimes \mathbb{Z}_2^n$ in $(\mathbb{C}^{N|N})^{\otimes n} \otimes U$. The transposition $w_{pq} \in S_n$ acts as the operator $P_{pq} \otimes \xi(w_{pq})$ while the generator of the $p$th direct factor $\mathbb{Z}_2$ in $\mathbb{Z}_2^n$ acts as the operator $J_p \otimes \xi(c_p) \cdot \sqrt{-1}$. Denote by $\alpha$ this action. Let $V$ be the space of co-invariants with respect to $\alpha$. This is the quotient space of $(\mathbb{C}^{N|N})^{\otimes n} \otimes U$ with respect to the subspace spanned by the images of all the operators $\alpha(g) - 1$ with $g \in S_n \ltimes \mathbb{Z}_2^n$. Each of these operators has $\mathbb{Z}_2$-degree zero, therefore the vector space $V$ inherits $\mathbb{Z}_2$-gradation from $(\mathbb{C}^{N|N})^{\otimes n} \otimes U$.

Let us expand the element (5.4) of $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes (n+1)} \otimes A_n[[u^{-1}]]$ relative to the basis of standard matrix units in the first tensor factor $\mathrm{End}(\mathbb{C}^{N|N})$. Then for $s = 0, 1, 2, \ldots$, the coefficient in (5.4) at $E_{ij}\, u^{-s-1} \in \mathrm{End}(\mathbb{C}^{N|N})[[u^{-1}]]$ is
\[
\sum_{1 \le p \le n} \bigl( \iota_p(E_{ji}) + (-1)^s\, \iota_p(E_{-j,-i}) \bigr) \otimes y_p^s \cdot (-1)^{\bar\jmath + 1} \ \in\ \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes A_n,
\]


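where $\iota_p$ denotes the embedding of $\mathrm{End}(\mathbb{C}^{N|N})$ to $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ as the $p$th tensor factor. All these coefficients commute with the elements $P_{pq} \otimes w_{pq}$ and $J_p \otimes c_p$ in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n} \otimes A_n$ by part (b) of Proposition 5.2. By part (a), one can now define a representation of the $\mathbb{Z}_2$-graded algebra $\mathrm{Y}(\mathfrak{q}_N)$ in $V$ by assigning to the generator $T_{ij}^{(s+1)}$ with $s \ge 0$ the action, induced in $V$ by the operator
\[
\sum_{1 \le p \le n} \bigl( \iota_p(E_{ji}) + (-1)^s\, \iota_p(E_{-j,-i}) \bigr) \otimes \xi(y_p^s) \cdot (-1)^{\bar\jmath + 1}
\tag{5.5}
\]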

in the space $(\mathbb{C}^{N|N})^{\otimes n} \otimes U$. The correspondence $U \mapsto V$ is the $\mathrm{Y}(\mathfrak{q}_N)$-analogue of the Drinfeld functor [D2] for the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$. Denote our correspondence by $F_N$.

For any positive integer $n'$ consider the tensor product $A_n \otimes A_{n'}$ of $\mathbb{Z}_2$-graded algebras. It is isomorphic to the subalgebra in $A_{n+n'}$ generated by transpositions $w_{pq}$, where $1 \le p < q \le n$ or $n+1 \le p < q \le n+n'$, along with all the elements $c_p$ and $x_p$, where $1 \le p \le n+n'$. Take any $\mathbb{Z}_2$-graded representation $\xi' : A_{n'} \to \mathrm{End}(U')$. Consider the representation of the algebra $A_{n+n'}$ induced from the representation of $A_n \otimes A_{n'}$ in $U \otimes U'$. The algebra $A_{n+n'}$ acts in the vector space $A_{n+n'} \otimes U \otimes U'$ via left multiplication at the first tensor factor. We realise the induced representation in the quotient space of $A_{n+n'} \otimes U \otimes U'$ by the following relations: for homogeneous $b \in U$, $b' \in U'$, $X \in A_{n+n'}$, $Y \in A_n$, $Y' \in A_{n'}$,
\[
(XZ) \otimes b \otimes b' = X \otimes \xi(Y)\, b \otimes \xi'(Y')\, b' \cdot (-1)^{\deg b \, \deg Y'},
\tag{5.6}
\]

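where $Z$ stands for the image of $Y \otimes Y' \in A_n \otimes A_{n'}$ in the algebra $A_{n+n'}$. Let us denote by $U \ U'$ this quotient space. Consider the representation of the algebra $\mathrm{Y}(\mathfrak{q}_N)$ in the space $V' = F_N(U')$. Determine a representation of $\mathrm{Y}(\mathfrak{q}_N)$ in $V \otimes V'$ using the comultiplication (2.19).

Proposition 5.3. Representation of the algebra $\mathrm{Y}(\mathfrak{q}_N)$ corresponding to $U \ U'$ is equivalent to the representation of $\mathrm{Y}(\mathfrak{q}_N)$ in $V \otimes V'$.

Proof. By definition, $V \otimes V'$ is a quotient space of $(\mathbb{C}^{N|N})^{\otimes n} \otimes U \otimes (\mathbb{C}^{N|N})^{\otimes n'} \otimes U'$. Identify the latter tensor product with $(\mathbb{C}^{N|N})^{\otimes (n+n')} \otimes U \otimes U'$ via the linear map
\[
a \otimes b \otimes a' \otimes b' \mapsto a \otimes a' \otimes b \otimes b' \cdot (-1)^{\deg a' \, \deg b}.
\]
Let $W$ be the quotient space of $(\mathbb{C}^{N|N})^{\otimes (n+n')} \otimes U \otimes U'$ corresponding to $V \otimes V'$. To determine the action of the algebra $\mathrm{Y}(\mathfrak{q}_N)$ in $W$, we can use the representation $\mathrm{Y}(\mathfrak{q}_N) \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes (n+n')} \otimes \mathrm{End}(U) \otimes \mathrm{End}(U')$, with respect to which the element $T(u) \in \mathrm{End}(\mathbb{C}^{N|N}) \otimes \mathrm{Y}(\mathfrak{q}_N)[[u^{-1}]]$ is represented by the product
\[
\prod^{\rightarrow}_{1 \le p \le n} \Bigl( 1 - P_{1,p+1} \otimes \frac{1}{u - \xi(x_p) \otimes 1} + P_{1,p+1} J_1 J_{p+1} \otimes \frac{1}{u + \xi(x_p) \otimes 1} \Bigr)
\times
\prod^{\rightarrow}_{1 \le q \le n'} \Bigl( 1 - P_{1,n+q+1} \otimes \frac{1}{u - 1 \otimes \xi'(x_q)} + P_{1,n+q+1} J_1 J_{n+q+1} \otimes \frac{1}{u + 1 \otimes \xi'(x_q)} \Bigr)
\]
in the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes (n+n'+1)} \otimes \mathrm{End}(U) \otimes \mathrm{End}(U')$. Here we used (2.19).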


The space of the representation of $\mathrm{Y}(\mathfrak{q}_N)$ corresponding to $U \ U'$ is a quotient space of $(\mathbb{C}^{N|N})^{\otimes (n+n')} \otimes A_{n+n'} \otimes U \otimes U'$. The assignment
\[
a \otimes a' \otimes b \otimes b' \mapsto a \otimes a' \otimes 1 \otimes b \otimes b' \ \in\ (\mathbb{C}^{N|N})^{\otimes (n+n')} \otimes A_{n+n'} \otimes U \otimes U'
\]
induces an isomorphism of $W$ to this quotient. This isomorphism commutes with the action of the algebra $\mathrm{Y}(\mathfrak{q}_N)$, since by (5.6) for any $1 \le p \le n$ and $1 \le q \le n'$, the actions of $x_p$ and $x_{n+q}$ on the class of $1 \otimes b \otimes b'$ in $U \ U'$ yield the classes of $1 \otimes \xi(x_p)\, b \otimes b'$ and $1 \otimes b \otimes \xi'(x_q)\, b'$ respectively. □

To give an example of the correspondence $F_N : U \mapsto V$, consider any principal series representation of the algebra $A_n$. This is the representation induced from a character $\chi$ of the commutative subalgebra in $A_n$ generated by $x_1, \ldots, x_n$. Note that this subalgebra is maximal commutative by [N2, Proposition 3.1]. Take the character $\chi$ such that $\chi(x_1) = z_1, \ldots, \chi(x_n) = z_n$ in $\mathbb{C}$. Due to Proposition 5.1, the space $U_{z_1 \ldots z_n}$ of the corresponding principal series representation of $A_n$ is identified with the algebra $H_n$, which acts on itself via left multiplication. The action of $x_p \in A_n$ is then uniquely determined by the assignment $1 \mapsto z_p$ in the space $H_n$.

Corollary 5.4. The representation of the algebra $\mathrm{Y}(\mathfrak{q}_N)$ corresponding to $U_{z_1 \ldots z_n}$ is equivalent to representation (2.22).

Proof. The representations of the algebra $A_n$ in $U_{z_1 \ldots z_n}$ and $U_{z_1} \ldots U_{z_n}$ are equivalent. Due to Proposition 5.3 it suffices to consider the case $n = 1$. The space of the representation of $\mathrm{Y}(\mathfrak{q}_N)$ corresponding to $U_z$ is the quotient of $\mathbb{C}^{N|N} \otimes H_1$ by the relations
\[
a \otimes c_1 \cdot \sqrt{-1} = (Ja) \otimes 1 \cdot (-1)^{\deg a}
\]
for homogeneous $a \in \mathbb{C}^{N|N}$. Assignment $a \otimes 1 \mapsto a$ induces an isomorphism of this quotient to $\mathbb{C}^{N|N}$. By (5.5), for any $s \ge 0$ the generator $T_{ij}^{(s+1)} \in \mathrm{Y}(\mathfrak{q}_N)$ acts on the vector $a \otimes 1 \in \mathbb{C}^{N|N} \otimes U_z$ as the operator $\bigl( E_{ji} \otimes z^s + E_{-j,-i} \otimes (-z)^s \bigr) \cdot (-1)^{\bar\jmath + 1}$. Comparing this with definition (2.21), we complete the proof. □

Let us use the notion of a $\mathbb{Z}_2$-graded irreducibility.
When the Z2 -graded vector space U is finite-dimensional, the representation ξ in U will be called irreducible if any Z2 graded subspace in U preserved by ξ is either the zero space or U itself. Theorem 5.5. Suppose that the representation of the Z2 -graded algebra An in U is finite-dimensional and irreducible. Then the finite-dimensional representation of the Z2 -graded algebra Y(qN ) in V = FN (U ) is also irreducible. Proof. We extend the arguments from [A, Sect. 4]. The algebra Y(qN ) contains the enveloping algebra U(qN ) as a subalgebra, see (2.13). Representation of this subalgebra in (CN|N )⊗n ⊗U is the tensor product of n copies of the defining representation in CN |N and the trivial representation in U , see (5.3). Let V0 be any non-zero Z2 -graded subspace in V = FN U , preserved by the action of Y(qN ). In particular, V0 is preserved by the action of U(qN ). By [S2, Theorem 3] there is a Z2 -graded subspace U0 ⊂ U preserved by ξ(Hn ), such that V0 ⊂ V corresponds to U0 . Assume that for any non-zero vector b ∈ U0 the image in V0 of the subspace (CN |N )⊗n ⊗ b ⊂ (CN |N )⊗n ⊗ U0 is not zero. Since ξ is irreducible, ξ(An ) · U0 = U . Let us show that the subspace V0 ⊂ V is also Y(qN )-cyclic.


Consider the representation of the algebra $A_n$ induced from the representation of its subalgebra $H_n$ in $U_0$; $\xi$ is a quotient of this induced representation. Thus instead of $\xi$ it suffices to take the induced representation. Realise it in the quotient space $U'$ of $A_n \otimes U_0$ with respect to the relations $(XY) \otimes b = X \otimes \xi(Y)\, b$ for all homogeneous $X \in A_n$, $Y \in H_n$ and $b \in U_0$. Instead of the subspace $U_0 \subset U$ it suffices to take the image of the subspace $1 \otimes U_0 \subset A_n \otimes U_0$ in this quotient. Then we have to prove that the subspace in $V' = F_N(U')$ corresponding to the image in $U'$ of $1 \otimes U_0$ is $\mathrm{Y}(\mathfrak{q}_N)$-cyclic.

There is an ascending $\mathbb{Z}$-filtration on the algebra $A_n$ such that any generator $x_p$ is of degree one while $w_{pq}$ and $c_p$ are of degree zero, see (5.1). The filtration on $A_n$ induces an ascending $\mathbb{Z}$-filtration on the vector space $(\mathbb{C}^{N|N})^{\otimes n} \otimes A_n \otimes U_0$ and on its quotient $V'$. This filtration on $V'$ is compatible with the action of the algebra $\mathrm{Y}(\mathfrak{q}_N)$, when it is $\mathbb{Z}$-filtered so that the degree of the generator $T_{ij}^{(s+1)}$ is $s$. But the corresponding $\mathbb{Z}$-graded algebra is isomorphic to $\mathrm{U}(\mathfrak{g})$, see Theorem 2.3. The corresponding $\mathbb{Z}$-graded action of $\mathrm{U}(\mathfrak{g})$ can be realised in the space of co-invariants under the action of the group $S_n \ltimes \mathbb{Z}_2^n$ in $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0 \otimes \mathbb{C}[x_1, \ldots, x_n]$. Here the action in $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0$ is determined by $\alpha$ while the action in $\mathbb{C}[x_1, \ldots, x_n]$ is standard: any permutation $w \in S_n$ acts as $x_p \mapsto x_{w(p)}$, the generator of the $q$th factor $\mathbb{Z}_2$ in $\mathbb{Z}_2^n$ acts as $x_p \mapsto (-1)^{\delta_{pq}} x_p$. Let $W$ be this space of co-invariants. The $\mathbb{Z}$-graded action of the algebra $\mathrm{U}(\mathfrak{g})$ in $W$ is induced by its action in the space $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0 \otimes \mathbb{C}[x_1, \ldots, x_n]$, where the generator $F_{ij}^{(s)} \in \mathfrak{g}$ acts as the operator
\[
\sum_{1 \le p \le n} \bigl( \iota_p(E_{ij}) + (-1)^s\, \iota_p(E_{-i,-j}) \bigr) \otimes 1 \otimes x_p^s.
\]

We have to prove that the subspace $V_0 \otimes 1 \subset W$ is cyclic under the action of $\mathrm{U}(\mathfrak{g})$. Let $u_1, \ldots, u_n$ be complex variables. Let $\varpi_n$ be the supersymmetrisation map in the space
\[
\mathrm{End}(\mathbb{C}^{N|N})[u]^{\otimes n} = \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}[u_1, \ldots, u_n],
\]
as in the proof of Proposition 2.2. We have normalised this map so that $\varpi_n^2 = \varpi_n$. Consider the homomorphism $\mathrm{U}(\mathfrak{g}) \to \mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}[u_1, \ldots, u_n]$ determined by
\[
F_{ij}^{(s)} \mapsto \sum_{1 \le p \le n} \bigl( \iota_p(E_{ij}) + (-1)^s\, \iota_p(E_{-i,-j}) \bigr)\, u_p^s.
\tag{5.7}
\]
The image of $\mathrm{U}(\mathfrak{g})$ under this homomorphism consists of all polynomials $F(u_1, \ldots, u_n)$ valued in $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ which are $\varpi_n$-invariant and for each $p = 1, \ldots, n$ satisfy
\[
\bigl( \mathrm{id}^{\otimes (p-1)} \otimes \eta \otimes \mathrm{id}^{\otimes (n-p)} \bigr)\, F(u_1, \ldots, u_n) = F(u_1, \ldots, -u_p, \ldots, u_n).
\]
This follows from the Poincaré–Birkhoff–Witt theorem for Lie superalgebras.

Now consider the subspace in $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0 \otimes \mathbb{C}[x_1, \ldots, x_n]$ consisting of all invariants under the action of the group $S_n \ltimes \mathbb{Z}_2^n$. Denote by $W_*$ this subspace. Also consider the subspace $W_0$ in the tensor product $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0$ consisting of all $\alpha$-invariants. We shall prove that the subspace $W_0$ is cyclic under the action of the algebra $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$ in the first tensor factor. The above description of the image of $\mathrm{U}(\mathfrak{g})$ under (5.7) will then imply that the subspace $W_0 \otimes 1 \subset W_*$ is $\mathrm{U}(\mathfrak{g})$-cyclic. But this will yield $\mathrm{U}(\mathfrak{g})$-cyclicity of the subspace $V_0 \otimes 1 \subset W$.

Take any $\alpha$-invariant inner product $\langle\ ,\ \rangle$ on the vector space $(\mathbb{C}^{N|N})^{\otimes n} \otimes U_0$. Now suppose that the subspace $W_0 \subset (\mathbb{C}^{N|N})^{\otimes n} \otimes U_0$ is not $\mathrm{End}(\mathbb{C}^{N|N})^{\otimes n}$-cyclic. Then we


have $\langle (\mathbb{C}^{N|N})^{\otimes n} \otimes b,\ W_0 \rangle = \{0\}$ for some non-zero vector $b \in U_0$. But this contradicts our initial choice of the subspace $U_0 \subset U$. □

A method for constructing the irreducible finite-dimensional representations of the algebra $A_n$ was developed in [N2]. Restriction of any such representation $U$ to the subalgebra $H_n \subset A_n$ is a quotient of the left regular representation of $H_n$. The restriction of the corresponding representation $V$ of $\mathrm{Y}(\mathfrak{q}_N)$ to the subalgebra $\mathrm{U}(\mathfrak{q}_N)$ is then a quotient of the representation of $\mathrm{U}(\mathfrak{q}_N)$ in $(\mathbb{C}^{N|N})^{\otimes n}$. But there are irreducible finite-dimensional representations of the algebra $\mathrm{U}(\mathfrak{q}_N)$ which do not appear as quotients of the representation in $(\mathbb{C}^{N|N})^{\otimes n}$ for any $n$, see [P]. Thus our correspondence $U \mapsto V$ cannot provide all irreducible representations of the algebra $\mathrm{Y}(\mathfrak{q}_N)$. It would be interesting to give a parametrisation of all irreducible finite-dimensional representations of the algebra $\mathrm{Y}(\mathfrak{q}_N)$; cf. [D4, Theorem 2] and [M].

Acknowledgements. I am grateful to I. Cherednik, G. Olshanski and A. Sudbery for helpful discussions. I am also grateful to V. Drinfeld, P. Kulish and E. Sklyanin for their kind interest in this work. Support from the EPSRC by an Advanced Research Fellowship, and from the EC under the TMR grant FMRX-CT97-0100, is gratefully acknowledged.

References

[A] Arakawa, T.: Drinfeld functor and finite-dimensional representations of the Yangian. Commun. Math. Phys. 205, 1–18 (1999)
[A1] Avan, J.: Graded Lie algebras in the Yang–Baxter equation. Phys. Lett. B 245, 491–496 (1990)
[A2] Avan, J.: Current algebra realization of R-matrices associated to Z2-graded Lie algebras. Phys. Lett. B 252, 230–236 (1990)
[BD] Belavin, A. and Drinfeld, V.: The classical Yang–Baxter equation for simple Lie algebras. Funct. Anal. Appl. 17, 220–221 (1983)
[BL] Bernard, D. and LeClair, A.: The quantum double in integrable quantum field theories. Nucl. Phys. B 399, 709–748 (1993)
[BGHP] Bernard, D., Gaudin, M., Haldane, F. and Pasquier, V.: Yang–Baxter equation in long-range interacting systems. J. Phys. A 26, 5219–5236 (1993)
[C] Capelli, A.: Sur les opérations dans la théorie des formes algébriques. Math. Ann. 37, 1–37 (1890)
[D1] Drinfeld, V.: Hopf algebras and the quantum Yang–Baxter equation. Soviet Math. Dokl. 32, 254–258 (1985)
[D2] Drinfeld, V.: Degenerate affine Hecke algebras and Yangians. Funct. Anal. Appl. 20, 56–58 (1986)
[D3] Drinfeld, V.: Quantum groups. In: International Congress of Mathematicians 1986, Amer. Math. Soc., Providence, RI: 1987, pp. 798–820
[D4] Drinfeld, V.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
[FR] Faddeev, L. and Reshetikhin, N.: Hamiltonian structures for integrable field theory models. Theoret. Math. Phys. 56, 847–862 (1983)
[JN] Jones, A. and Nazarov, M.: Affine Sergeev algebra and q-analogues of the Young symmetrizers for projective representations of the symmetric group. J. London Math. Soc. 78, 481–512 (1999)
[K] Kac, V.: Lie superalgebras. Adv. in Math. 26, 8–96 (1977)
[KT] Khoroshkin, S. and Tolstoy, V.: Yangian double. Lett. Math. Phys. 36, 373–402 (1996)
[LS] Leites, D. and Serganova, V.: Solutions of the classical Yang–Baxter equation for simple Lie superalgebras. Theoret. Math. Phys. 58, 16–24 (1984)
[M] Molev, A.: Finite-dimensional irreducible representations of twisted Yangians. J. Math. Phys. 39, 5559–5600 (1998)
[MM] Milnor, J. and Moore, J.: On the structure of Hopf algebras. Ann. of Math. 81, 211–264 (1965)
[MNO] Molev, A., Nazarov, M. and Olshanski, G.: Yangians and classical Lie algebras. Russian Math. Surveys 51, 205–282 (1996)
[N1] Nazarov, M.: Quantum Berezinian and the classical Capelli identity. Lett. Math. Phys. 21, 123–131 (1991)
[N2] Nazarov, M.: Young’s symmetrizers for projective representations of the symmetric group. Adv. in Math. 127, 190–257 (1997)
[N3] Nazarov, M.: Capelli identities for Lie superalgebras. Ann. Scient. Ec. Norm. Sup. 30, 847–872 (1997)
[N4] Nazarov, M.: Yangians and Capelli identities. Amer. Math. Soc. Transl. 181, 139–163 (1998)
[O1] Olshanski, G.: Representations of infinite-dimensional classical groups, limits of enveloping algebras, and Yangians. Adv. in Soviet Math. 2, 1–66 (1991)
[O2] Olshanski, G.: Quantized universal enveloping superalgebra of type Q and a super-extension of the Hecke algebra. Lett. Math. Phys. 24, 93–102 (1992)
[P] Penkov, I.: Characters of typical irreducible finite-dimensional q(n)-modules. Funct. Anal. Appl. 20, 30–37 (1986)
[RTF] Reshetikhin, N., Takhtajan, L. and Faddeev, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
[S] Smirnov, F.: Dynamical symmetries of massive integrable models. Internat. J. Modern. Phys. A7 Suppl. 1B, 813–858 (1992)
[S1] Sergeev, A.: The centre of enveloping algebra for Lie superalgebra Q(n, C). Lett. Math. Phys. 7, 177–179 (1983)
[S2] Sergeev, A.: The tensor algebra of the identity representation as a module over the Lie superalgebras GL(n, m) and Q(n). Math. Sbornik 51, 419–427 (1985)

Communicated by T. Miwa

Commun. Math. Phys. 208, 225 – 243 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Integrable Cases of the Einstein Equations

Andrew Dancer⋆, McKenzie Y. Wang

Department of Mathematics & Statistics, McMaster University, 1280 Main St. West, Hamilton, Ontario, L8S 4K1, Canada. E-mail: [email protected]

Received: 7 April 1999 / Accepted: 1 June 1999

Abstract: We produce explicit solutions for some cases of the cohomogeneity one Einstein equations by finding generalised first integrals of the Hamiltonian form of these equations. The resulting manifolds have dimension 10, 11 and 27.

⋆ Present address: Jesus College, Oxford University, Oxford, OX1 3DW, UK. E-mail: [email protected]

0. Introduction

In this paper we study some new examples of Hamiltonian systems with a 4-dimensional phase space which have the property that subject to an additional constraint they become integrable. More precisely, there is a function $F$ on phase space, functionally independent of the Hamiltonian, whose Poisson bracket with the Hamiltonian vanishes on the submanifold defined by the constraint. The Poisson bracket need not, however, vanish on all of phase space. Using the existence of such a generalised first integral, we are able to solve by quadratures the induced Hamiltonian flow on the constraint manifold.

Our new examples differ from the Goryachev–Chaplygin top in mechanics (see [6, pp. 232–233]). Recall that the latter is a 6-dimensional Poisson system on the dual of the Euclidean Lie algebra $E(3)$ with two trivial constants of motion, one of which is the angular momentum about its axis of symmetry. There exists a third function on phase space whose Poisson bracket with the Hamiltonian is a certain function times the angular momentum. Hence the Poisson bracket vanishes if the angular momentum is set to be zero. The analogy, however, ends here, for setting the values of the two trivial constants of motion corresponds to choosing a coadjoint orbit of $E(3)$, and the induced Hamiltonian system on an orbit corresponding to zero angular momentum is integrable in the classical sense. In our examples, integrability holds only on the zero set of the Hamiltonian (which is 3-dimensional).

The new examples in this paper arise from our study (see [3]) of the cohomogeneity one Riemannian Einstein equation $\mathrm{Ric}(g) = \Lambda g$, in arbitrary dimensions, from a


Hamiltonian viewpoint. The cohomogeneity one assumption means that a Lie group acts isometrically on the Einstein manifold with generic orbit of real codimension one. In that situation the Einstein equation reduces to a system of ordinary differential equations where the independent variable is a coordinate transverse to the group orbits. In [3] we described how this system of ODEs can be written as the Hamiltonian flow on the zero level set of a suitable Hamiltonian $H$. We are therefore dealing with a Hamiltonian system with a constraint. This, as well as most of our other results in [3] carry over with minor modifications to the Lorentzian case with spacelike orbits.

If the irreducible summands in the isotropy representation of the principal orbit are distinct, our Hamiltonian has a particularly simple form. To describe it, let $d_1, \ldots, d_r$ be the real dimensions of the summands, and let $d$ denote the vector $(d_1, \ldots, d_r)$. The dimension of the principal orbit is then $n = d_1 + \ldots + d_r$. If we fix a homogeneous background metric on the principal orbit, then any other homogeneous metric on the principal orbit is obtained by scaling the background metric on the $i$th irreducible summand by $f_i^2$. If we let $q = (q_1, \ldots, q_r)$, where $e^{q_i} = f_i^2$, then $e^{\frac{1}{2} d \cdot q}$ is the volume of the orbit relative to the background. The conjugate momenta are denoted by the vector $p = (p_1, \ldots, p_r)$. Our Hamiltonian is now given by
\[
H = e^{-\frac{1}{2} d \cdot q}\, J + e^{\frac{1}{2} d \cdot q}\, \bigl( (n-1)\Lambda - S \bigr),
\tag{0.1}
\]

where $J$ is the non-degenerate quadratic form (with signature $r - 2$)
\[
J = \frac{1}{n-1} \Bigl( \sum_{j=1}^{r} p_j \Bigr)^2 - \sum_{j=1}^{r} \frac{p_j^2}{d_j},
\]
and $\Lambda$ is the Einstein constant. Furthermore, $S$ denotes the scalar curvature of the orbit and we can write
\[
S = \sum_{w \in W} A_w\, e^{w \cdot q},
\]
where $A_w$ are nonzero constants and $W$ is a finite set of vectors in $\mathbb{R}^r$ completely determined by the principal orbit type. They can be explicitly computed using the formula for the scalar curvature in [9]. It follows from this formula that the vectors in $W$ are of three possible types:

Type I: one entry is $-1$, the others are zero,
Type II: one entry is $1$, two are $-1$, the others are zero,
Type III: one entry is $-2$, one is $1$, the others are zero.

In [3], we studied the existence of constants of motion for the above Hamiltonian. As the dynamics we are concerned with take place on the set $H = 0$, we considered solutions $F, \phi$ of the equation
\[
\{F, H\} = \phi H,
\tag{0.2}
\]

where $\{\ ,\ \}$ is the associated Poisson bracket. In this situation we called $F$ a generalised first integral. We showed that for a large class of orbit types there are no nontrivial solutions to (0.2) which may be written as a finite sum $\sum_b F_b\, e^{b \cdot q}$ with $F_b$ a polynomial in $p_i$ and $b \in \mathbb{R}^r$.


Our methods did not apply in the case when there are exactly two summands in the isotropy representation, mainly because the quadratic form $J$ then splits into two linear factors. This case is particularly tantalising because of the examples of Bérard Bergery [1] (see also Page and Pope [8]), which generalise the four-dimensional metric of Page [7]. In these examples, an open dense subset of the Einstein manifold is foliated by circle bundles over a Kähler–Einstein base. Moreover, the metric on the total space of each bundle is a Riemannian submersion over the base. There are two functions involved, one giving a scale factor on the base, the other in the fibers. If the base is a generalised flag manifold $G/H$, then the circle bundle is also a homogeneous space $G/K$ and the Einstein metric is of cohomogeneity one. If moreover the base is an irreducible hermitian symmetric space other than the hyperquadric, then the Bérard Bergery system is equivalent to the full cohomogeneity one Einstein equations with orbit type $G/K$.

In all cases, the Bérard Bergery equations are a Hamiltonian system with a 4-dimensional phase space, and, as shall be seen in Sect. 1, there is a nontrivial generalised first integral for any sign of the Einstein constant. In both [1] and [8], the authors solved the equations explicitly (but not using the Hamiltonian formalism) and discussed global properties of the solutions.

In this paper, we shall use the Hamiltonian formalism in [3] to study some new examples where the full cohomogeneity one Einstein system is explicitly integrable in the Ricci-flat case. The principal orbits are the product of two isotropy irreducible spaces, and the dimensions of the resulting Ricci-flat manifolds are ten or eleven (see Theorem (2.1)). These examples lie in the family considered by Böhm [2], where existence of complete examples was shown by getting estimates on the behaviour of the trajectories. We are able to get explicit solutions by exploiting the existence of a conserved quantity of the constrained Hamiltonian system. The same system also yields examples of inhomogeneous complete Ricci-flat metrics if we replace the non-collapsing isotropy irreducible factor by inhomogeneous Einstein manifolds.

Finally, we use similar ideas to find an integrable system of Einstein equations in dimension twenty-seven. The manifolds in this case are foliated by hypersurfaces which are the total spaces of a certain family of $T^8$-bundles over a product of nine copies of the complex projective line. See Theorem (5.1) for a precise statement.

As in [3], the results of this paper can easily be modified to give explicitly integrable examples in the Lorentz case where the principal orbits are space-like. (See Remark (3.1).)

1. A Hamiltonian Formulation of the Bérard Bergery Examples

In this section we recall some results of our earlier paper [3] concerning generalised first integrals for the Hamiltonian $H$. If
\[
F = \sum_b F_b\, e^{b \cdot q}
\]
solves Eq. (0.2) then, putting $\psi = \phi - \frac{1}{2} \bigl( \sum_i d_i \frac{\partial F}{\partial p_i} \bigr)$, and
\[
\psi = \sum_b \psi_b\, e^{b \cdot q},
\]
we have, when $\Lambda = 0$, the recursion relations
\[
(b \cdot \nabla J)\, F_b - J \psi_b = - \sum_w A_w \bigl( \psi_{b-d-w} + (w + d) \cdot \nabla F_{b-d-w} \bigr).
\tag{1.1}
\]

We may define the level of a vector in $\mathbb{R}^r$ to be the sum of its coordinates. Then, if $c$ is a vector with the lowest level among all the $b$ with $F_b$ and $\psi_b$ nonzero, then we have $(c \cdot \nabla J)\, F_c = J \psi_c$. We seek solutions where $F_b$ and $\psi_b$ are polynomial in $p_1, \ldots, p_r$. As discussed in [3], the polynomial $J$ is irreducible for $r > 2$. Moreover, $c \cdot \nabla J$ is nonconstant if $c \neq 0$. So if $c \neq 0$ then we must have $F_c = J Q_c$ and $\psi_c = (c \cdot \nabla J)\, Q_c$ for some polynomial $Q_c$. In [3] this was used to show that after subtracting an element of the ideal generated by the Hamiltonian, one can assume that the lowest level $c$ is zero.

In this paper, we are taking $r = 2$, so the situation is more complicated. Now $J$ has the following factorisation:
\[
n(n-1)\, J = \bigl( (d_2 S - 1)\, p_1 + (-d_1 S - 1)\, p_2 \bigr) \bigl( (-d_2 S - 1)\, p_1 + (d_1 S - 1)\, p_2 \bigr),
\tag{1.2}
\]
where
\[
S = \sqrt{\frac{n-1}{d_1 d_2}}.
\]
So, in contrast with the case of three or more summands, we can start our recursion at a level $c$ such that $J$ has a factorisation
\[
J = (c \cdot \nabla J)\, \theta
\tag{1.3}
\]
and put $F_c = \theta \psi_c$. It can be checked that $c \cdot \nabla J$ divides $J$ if and only if $c$ is a null vector for $J$. So $c$ may lie on either of two lines through the origin.

Let us discuss the Bérard Bergery system in this framework. Denoting by $G/K$ the principal orbit, we have an $\mathrm{Ad}(K)$-invariant decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}_1 \oplus \mathfrak{p}_2$, where $\mathfrak{p}_1$ is a trivial one-dimensional representation. The set $W$ consists of vectors
\[
w = (0, -1), \qquad v = (1, -2).
\tag{1.4}
\]
The quadratic form $J$ factors as $(c \cdot \nabla J)\, \theta$, where
\[
\theta = -\tfrac{1}{2}\, p_1 \quad \text{and} \quad c = (-2,\ 2 - n).
\]


One can now verify that the ODE system has a generalized first integral of the form
\[
F_c\, e^{c \cdot q} + F_{c+v+d}\, e^{(c+v+d) \cdot q},
\]
where the nonzero coefficients $F_b, \psi_b$ are given by
\[
F_c = \frac{p_1^2}{n-1}, \qquad F_{c+v+d} = A_v, \qquad \text{and} \qquad \psi_c = -\frac{2 p_1}{n-1}.
\]
In terms of the recursion (1.1), the crucial point is that with this choice of $F_c$, there is a solution of the first level of the recursion with both $F_{c+v+d}$ and $F_{c+w+d}$ constant (in fact one of them is zero), and both $\psi_{c+v+d}$ and $\psi_{c+w+d}$ equal to zero. So we get a solution to the full recursion by setting all $F_b, \psi_b$ at higher levels equal to zero. The following lemma makes this more precise. The proof is just a short calculation.

Lemma 1.1. Suppose $W$ has only two elements, $w$ and $v$ (not necessarily given by (1.4)). Let $c, \theta$ be such that the factorisation (1.3) holds. Suppose further that we have functions $F_c, \Gamma_v, \Gamma_w$ such that
\[
F_c = J \Gamma_w + \rho\, \theta^s
\tag{1.5}
\]
and
\[
F_c = J \Gamma_v + \tau\, \theta^{s'},
\tag{1.6}
\]
where
\[
(w + d) \cdot \nabla \rho = (v + d) \cdot \nabla \tau = 0,
\tag{1.7}
\]
and
\[
s = -\bigl( (w + d) \cdot \nabla \theta \bigr)^{-1}, \qquad s' = -\bigl( (v + d) \cdot \nabla \theta \bigr)^{-1}
\tag{1.8}
\]
are positive integers. Then we can solve the recursion up to the first level by putting
\[
F_{c+u+d} = -A_u \Gamma_u, \qquad \psi_{c+u+d} = A_u\, (u + d) \cdot \nabla \Gamma_u, \qquad u \in \{v, w\}.
\]
If moreover $\Gamma_v, \Gamma_w$ are constant, then we can solve the whole recursion by putting $F_b = \psi_b = 0$ for all other $b$. Hence
\[
F = F_c\, e^{c \cdot q} - A_v \Gamma_v\, e^{(c+v+d) \cdot q} - A_w \Gamma_w\, e^{(c+w+d) \cdot q}
\tag{1.9}
\]
is a generalised first integral.

In the Bérard Bergery case, the above relations hold with
\[
\Gamma_w = 0, \quad \Gamma_v = -1; \qquad \rho = \frac{4}{n-1}, \quad \tau = -\frac{2}{n-1} \bigl( (3 - n)\, p_1 + 2 p_2 \bigr); \qquad s = 2, \quad s' = 1.
\]
In particular, we see that the function $\phi$ in (0.2) is really nonzero. Motivated by the above lemma, we shall try to construct new examples of generalised first integrals by satisfying (1.5–1.8) for constant $\Gamma_w, \Gamma_v$.
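The hypotheses of Lemma 1.1 in the Bérard Bergery case reduce to identities between linear and quadratic forms in $p_1, p_2$, so they can be checked in exact rational arithmetic. The sketch below is ours and is not taken from [3]: it assumes $d = (1, n-1)$, the values $\rho = 4/(n-1)$ and $\tau = -\frac{2}{n-1}((3-n)p_1 + 2p_2)$, stores a linear form $a p_1 + b p_2$ as the pair $(a, b)$ and a quadratic form $A p_1^2 + B p_1 p_2 + C p_2^2$ as the triple $(A, B, C)$, and uses hypothetical helper names throughout.

```python
from fractions import Fraction

def lin_mul(a, b):
    """Product of two linear forms, returned as (p1^2, p1*p2, p2^2) coefficients."""
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[1] * b[1])

def j_form(d1, d2):
    """J = (p1 + p2)^2/(n - 1) - p1^2/d1 - p2^2/d2 with n = d1 + d2."""
    k = Fraction(1, d1 + d2 - 1)
    return (k - Fraction(1, d1), 2 * k, k - Fraction(1, d2))

def dir_deriv(c, quad):
    """Linear form c . grad(Q) for a quadratic form Q = (A, B, C)."""
    A, B, C = quad
    return (2 * A * c[0] + B * c[1], B * c[0] + 2 * C * c[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def berard_bergery_data_ok(n):
    """Check (1.3) and (1.5)-(1.8) for d = (1, n - 1), Gamma_w = 0, Gamma_v = -1."""
    d = (1, n - 1)
    k = Fraction(1, n - 1)
    J = j_form(*d)
    w, v = (0, -1), (1, -2)
    c = (-2, 2 - n)
    theta = (Fraction(-1, 2), Fraction(0))
    rho = 4 * k                                  # assumed value of rho
    tau = (-2 * k * (3 - n), -4 * k)             # assumed value of tau
    wd = (w[0] + d[0], w[1] + d[1])
    vd = (v[0] + d[0], v[1] + d[1])
    Fc = tuple(rho * t for t in lin_mul(theta, theta))   # (1.5) with Gamma_w = 0, s = 2
    tau_theta_minus_J = tuple(t - j for t, j in zip(lin_mul(tau, theta), J))
    return all([
        lin_mul(dir_deriv(c, J), theta) == J,    # factorisation (1.3)
        Fc == (k, 0, 0),                         # F_c = p1^2/(n - 1)
        Fc == tau_theta_minus_J,                 # (1.6) with Gamma_v = -1, s' = 1
        dot(vd, tau) == 0,                       # (1.7); rho is constant
        -1 / dot(wd, theta) == 2,                # s = 2 in (1.8)
        -1 / dot(vd, theta) == 1,                # s' = 1 in (1.8)
    ])
```

For instance `berard_bergery_data_ok(7)` runs all six checks for a principal orbit of dimension seven.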


2. Generalised First Integrals

As indicated at the end of [3], we can use the above analysis of the Bérard Bergery system to construct some new integrable examples. We shall now discuss this in detail. For these examples $G = G_1 \times G_2$ and the principal orbit type is a product $(G_1/K_1) \times (G_2/K_2)$ of isotropy irreducible spaces. Hence the only vectors in $W$ are
\[
v = (-1, 0) \quad \text{and} \quad w = (0, -1).
\tag{2.1}
\]
(If one of the factors is a circle then $W$ contains only one vector. This case was analysed by Bérard Bergery [1] so we exclude it in what follows.) Suppose that
\[
F_c = J \Gamma_w + \rho\, \theta^s = J \Gamma_v + \tau\, \theta^{s'},
\]
where $s, s'$ are positive integers given by (1.8), and $\rho, \tau$ are polynomials in $p_1, p_2$ such that $(w + d) \cdot \nabla \rho = (v + d) \cdot \nabla \tau = 0$. Without loss of generality, we can take $s' \le s$. Then we can write
\[
J\, (\Gamma_w - \Gamma_v) = \theta^{s'} \bigl( \tau - \rho\, \theta^{s - s'} \bigr).
\tag{2.2}
\]

If $\Gamma_w$, $\Gamma_v$ are constant, it follows that $s' = 1$, and hence, from (1.8), we have

$$\frac{(v+d)\cdot\nabla\theta}{(w+d)\cdot\nabla\theta} \in \mathbb{Z}_+.$$

(Note that this condition is unaffected by scaling $\theta$.) Referring to (1.2) for the possibilities for $\theta$ one finds that we need

$$1 - n - d_2 S = \nu\,(1 - n + d_1 S) \tag{2.3}$$

for some positive integer $\nu$. After some rearrangements, one finds that (2.3) implies

$$n = 1 + d_1 + \frac{(2\nu-1)d_1 + 1}{d_1(\nu-1)^2 - 1}.$$

The integrality of $n$ and $d_1$ imposes strong constraints. The only possibilities are

$$\nu = 2 \;:\; (d_1, d_2) = (2, 8),\ (3, 6),\ (5, 5). \tag{2.4}$$
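The integrality constraint can be explored by brute force. The following is a minimal sketch, not part of the original argument: it assumes $n = d_1 + d_2$ and $d_1, d_2 \ge 2$ (the circle case being excluded), and the search bounds `nu_max` and `d1_max` are arbitrary illustrative choices, so it only confirms the list (2.4) inside a finite window.

```python
from fractions import Fraction

def admissible_pairs(nu_max=20, d1_max=200):
    """List (nu, d1, d2) for which n = 1 + d1 + ((2nu-1)d1 + 1)/(d1(nu-1)^2 - 1)
    is an integer, under the assumptions n = d1 + d2 and d1, d2 >= 2."""
    found = []
    for nu in range(1, nu_max + 1):
        for d1 in range(2, d1_max + 1):
            denom = d1 * (nu - 1) ** 2 - 1  # never zero for d1 >= 2
            n = 1 + d1 + Fraction((2 * nu - 1) * d1 + 1, denom)
            d2 = n - d1
            if d2.denominator == 1 and d2 >= 2:
                found.append((nu, d1, int(d2)))
    return found

if __name__ == "__main__":
    for nu, d1, d2 in admissible_pairs():
        print(f"nu = {nu}: (d1, d2) = ({d1}, {d2})")
```

Within this window only $\nu = 2$ survives, in agreement with (2.4).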

For future reference we record the relation

$$d_2 = \frac{4d_1}{d_1 - 1}, \tag{2.5}$$

which follows from (2.3) with $\nu = 2$. Conversely, let us suppose that $(d_1, d_2)$ is one of the pairs listed in (2.4). Then we have $(v+d)\cdot\nabla\theta/(w+d)\cdot\nabla\theta = 2$, so by rescaling $\theta$ we can assume that (1.8) holds. Now $c$ is uniquely determined by (1.3), since $J$ is nondegenerate. We can choose a linear form $L$ annihilated by $(v+d)\cdot\nabla$, and the equation

$$\rho\,\theta - \chi L = -c\cdot\nabla J$$

can be solved for constants $\rho$, $\chi$ as the determinant of this linear system is always nonzero. So, putting $\tau = \chi L$, we have

$$\rho\,\theta^2 = \tau\,\theta - J.$$

We can therefore write down a function $F_c$ satisfying (1.5–1.8) for constant $\Gamma_w$, $\Gamma_v$ and hence produce a generalised first integral of the form (1.9). As an example we consider the case $d_1 = d_2 = 5$. Here we take $c = (-2, -4)$ and

$$\theta = \frac{1}{6}(p_1 - 2p_2),$$

and the factorisation $J = (c\cdot\nabla J)\,\theta$ is

$$\frac{1}{9}(p_1 + p_2)^2 - \frac{p_1^2}{5} - \frac{p_2^2}{5} = \frac{1}{6}(p_1 - 2p_2)\cdot\Big(-\frac{12}{45}(2p_1 - p_2)\Big).$$

Then, if we take

$$\rho = -\frac{4}{5}, \qquad \tau = -\frac{2}{3}p_1 + \frac{8}{15}p_2,$$

we have $\rho\,\theta^2 = \tau\,\theta - J$, so we can satisfy (1.5–1.6) with $\Gamma_w = 0$, $\Gamma_v = -1$. Moreover $(v+d)\cdot\nabla\tau = (w+d)\cdot\nabla\rho = 0$. Note also that $(w+d)\cdot\nabla\theta = -\frac{1}{2}$ and $(v+d)\cdot\nabla\theta = -1$ as required. Lemma 1.1 gives us the generalised first integral

$$F = -\frac{1}{45}(p_1 - 2p_2)^2\,e^{-2q_1 - 4q_2} + A_v\,e^{2q_1 + q_2}.$$
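The identities used in the example $d_1 = d_2 = 5$ can be checked in exact rational arithmetic. The sketch below is only an illustration: it stores linear and quadratic forms in $p_1, p_2$ as coefficient tuples, takes $J = (p_1+p_2)^2/(n-1) - p_1^2/d_1 - p_2^2/d_2$ with $n = d_1 + d_2$ (read off from the displayed factorisation), and the helper names are hypothetical.

```python
from fractions import Fraction as Fr

def lin_mul(a, b):
    """Product of linear forms a.p and b.p, as coefficients of (p1^2, p1*p2, p2^2)."""
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[1] * b[1])

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def check_example():
    d1 = d2 = 5
    n = d1 + d2  # assumption: n = d1 + d2
    # J = (p1 + p2)^2/(n-1) - p1^2/d1 - p2^2/d2 as (p1^2, p1*p2, p2^2) coefficients
    J = (Fr(1, n - 1) - Fr(1, d1), Fr(2, n - 1), Fr(1, n - 1) - Fr(1, d2))
    theta = (Fr(1, 6), Fr(-1, 3))
    tau = (Fr(-2, 3), Fr(8, 15))
    rho = Fr(-4, 5)
    c = (-2, -4)
    v_plus_d = (d1 - 1, d2)  # v = (-1, 0)
    w_plus_d = (d1, d2 - 1)  # w = (0, -1)

    # grad J = (2a p1 + b p2, b p1 + 2k p2), so c . grad J is a linear form
    a, b, k = J
    c_grad_J = (2 * a * c[0] + b * c[1], b * c[0] + 2 * k * c[1])

    lhs = tuple(rho * t for t in lin_mul(theta, theta))
    rhs = tuple(x - y for x, y in zip(lin_mul(tau, theta), J))
    return {
        "J = (c.grad J) theta": lin_mul(c_grad_J, theta) == J,
        "rho theta^2 = tau theta - J": lhs == rhs,
        "(v+d).grad tau = 0": dot(v_plus_d, tau) == 0,
        "(w+d).grad theta = -1/2": dot(w_plus_d, theta) == Fr(-1, 2),
        "(v+d).grad theta = -1": dot(v_plus_d, theta) == -1,
    }

if __name__ == "__main__":
    for name, ok in check_example().items():
        print(name, ok)
```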

The other cases are dealt with similarly. In all cases $s = 2$ and $s' = 1$. We summarise the results of this section as follows.

Theorem 2.1. Let $G_1/K_1$ and $G_2/K_2$ be isotropy irreducible homogeneous spaces whose dimensions are (2, 8), (3, 6) or (5, 5). Then the cohomogeneity one Einstein equations with principal orbit type $(G_1\times G_2)/(K_1\times K_2)$ and zero Einstein constant admit a non-trivial generalised first integral.

3. The Equations

Having found the generalised first integrals, we will next show that this leads to an explicit solution of the Einstein equations by quadratures. It is convenient at this point to go back to the form of the Einstein equations given in [5]. We write the metric as $dt^2 + g_t$, where $g_t$ denotes the metric on the principal orbits, considered as an endomorphism with respect to a fixed background metric. We denote by $r_t$ the Ricci curvature of $g_t$, considered as an endomorphism with respect to $g_t$. As we are assuming that all the summands in the isotropy representation of the principal orbit are distinct, the endomorphism $g_t$ is diagonal with entries $f_i^2$. These are related to the Darboux coordinates $p_i, q_i$ by $f_i^2 = e^{q_i}$. We use a prime to denote differentiation with respect to $t$. The canonical equations $q_i' = \partial H/\partial p_i$ enable us to express $p_i$ in terms of $f_i$. If the isotropy representation consists of two distinct irreducible summands of dimension $d_1$ and $d_2$, we obtain

$$\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} = e^{\frac{1}{2}d\cdot q}\begin{pmatrix} d_1(d_1-1) & d_1 d_2 \\ d_1 d_2 & d_2(d_2-1) \end{pmatrix}\begin{pmatrix} f_1'/f_1 \\ f_2'/f_2 \end{pmatrix}.$$


The $2\times 2$ matrix is just the inverse of the matrix defining the quadratic form $J$. The Einstein equations with Einstein constant $\Lambda$ are equivalent to

$$L' + (\operatorname{tr} L)L - r + \Lambda\,\mathrm{Id} = 0, \qquad \operatorname{tr}(L') + \operatorname{tr}(L^2) + \Lambda = 0,$$

where $L$ is the second fundamental form defined by $g' = 2gL$. Recall that there is a third equation for the components $\mathrm{Ric}(X, \partial/\partial t)$, where $X$ is tangent to the principal orbits, but because $g_t$ remains diagonal for all $t$, a lemma in ([1], 3.18–3.19) implies that this third equation automatically holds. Let us now specialise to the situation when the principal orbit is the product of two isotropy irreducible spaces $G_1/K_1$ and $G_2/K_2$ of dimension $d_1$ and $d_2$ respectively. We view the principal orbit, as in Sect. 2, as the homogeneous space $(G_1\times G_2)/(K_1\times K_2)$. The weight vectors for the scalar curvature of the principal orbit are now given by (2.1). In the Ricci-flat case, the above Einstein equations become

$$\frac{f_1''}{f_1} + (d_1-1)\Big(\frac{f_1'}{f_1}\Big)^2 + d_2\frac{f_1' f_2'}{f_1 f_2} - \frac{A_v}{d_1 f_1^2} = 0, \tag{3.1}$$

$$\frac{f_2''}{f_2} + (d_2-1)\Big(\frac{f_2'}{f_2}\Big)^2 + d_1\frac{f_1' f_2'}{f_1 f_2} - \frac{A_w}{d_2 f_2^2} = 0, \tag{3.2}$$

$$d_1\frac{f_1''}{f_1} + d_2\frac{f_2''}{f_2} = 0. \tag{3.3}$$

The constraint $H = 0$ can be written as

$$d_1(d_1-1)\Big(\frac{f_1'}{f_1}\Big)^2 + 2d_1 d_2\frac{f_1' f_2'}{f_1 f_2} + d_2(d_2-1)\Big(\frac{f_2'}{f_2}\Big)^2 - A_v f_1^{2v_1} f_2^{2v_2} - A_w f_1^{2w_1} f_2^{2w_2} = 0, \tag{3.4}$$

where $v_i$ denotes the $i$th coordinate of $v$, et cetera. Let us now assume that $(d_1, d_2)$ are as in Theorem 2.1. The conservation law $F = -C$ from the new first integral may be written as

$$\Big(\lambda_1\frac{f_1'}{f_1} + \lambda_2\frac{f_2'}{f_2}\Big)^2 = \frac{-1}{\rho f_1^{-2v_1} f_2^{-2v_2}}\Big(A_v + \frac{C}{f_1^{e_1} f_2^{e_2}}\Big), \tag{3.5}$$

where $e = 2(c + d + v)$, and the numbers $\lambda_i$ are given by $\lambda = J^{-1}\nabla\theta$.


It is easy to check from (2.3) that $\nabla\theta$ is null for the quadratic form defined by $J^{-1}$; equivalently, $\lambda$ is orthogonal to $\nabla\theta$ in the Euclidean inner product. Also, observe that taking $c\cdot\nabla$ of Eq. (1.3), and recalling that $c$ must be a null vector for $J$, shows that

$$c\cdot\nabla\theta = 1. \tag{3.6}$$

Now, the relations (1.8) can be used to determine $\theta = \theta_1 p_1 + \theta_2 p_2$. As $v$, $w$ are given by (2.1), we find

$$\begin{pmatrix}\theta_1 \\ \theta_2\end{pmatrix} = \frac{1}{2(1-n)}\begin{pmatrix} 2 - d_2 \\ 1 + d_1 \end{pmatrix}.$$

It follows that

$$\begin{pmatrix}\lambda_1 \\ \lambda_2\end{pmatrix} = \begin{pmatrix} -d_1 \\ -\frac{1}{2}d_2 \end{pmatrix},$$

so we can rewrite (3.5) as

$$d_1^2\Big(\frac{f_1'}{f_1}\Big)^2 + d_1 d_2\frac{f_1' f_2'}{f_1 f_2} + \frac{1}{4}d_2^2\Big(\frac{f_2'}{f_2}\Big)^2 = -\frac{1}{\rho f_1^{-2v_1} f_2^{-2v_2}}\Big(A_v + \frac{C}{f_1^{e_1} f_2^{e_2}}\Big). \tag{3.7}$$

Knowledge of $\theta$, together with the factorisation (1.3), enables us to determine $c$. We have

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \frac{n-1}{(d_1+1)(2-d_2)}\begin{pmatrix} d_1 - 1 \\ 2(d_2 - 1) \end{pmatrix}.$$

In fact, we can simplify this further. Using (2.5) we see that the factor $(n-1)/(d_1+1)(2-d_2)$ is $-1/2$, so we have

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}(1 - d_1) \\ 1 - d_2 \end{pmatrix}.$$

Note that

$$e = 2(c + d + v) = \begin{pmatrix} d_1 - 1 \\ 2 \end{pmatrix}.$$

Now, (1.8) and (3.6) show that $e = 2(c + d + v)$ is orthogonal to $\nabla\theta$. But as observed earlier, $\lambda$ is orthogonal to $\nabla\theta$. So $\lambda$ is proportional to $e$. The upshot is that we can rewrite (3.5) as

$$e_1\frac{f_1'}{f_1} + e_2\frac{f_2'}{f_2} = \frac{a}{f_1}\sqrt{A_v + \frac{C}{f_1^{e_1} f_2^{e_2}}}, \tag{3.8}$$

where $a$ is a constant. Explicitly calculating the factor of proportionality between $\lambda$ and $e$, we find that $a = \pm\sqrt{-\rho}$. We shall take the positive sign in the following calculations since taking the negative sign amounts to changing the coordinate $t$ to $-t$. This gives us our first key equation. To derive the second, we use the Hamiltonian constraint (3.4). First, we need an explicit expression for $\rho$. We have, from (1.5–1.7), the equation

$$\rho\,\theta^2 = -J + \tau\,\theta.$$


Applying the directional derivative operator $(v+d)\cdot\nabla$ twice to this equation yields $\rho = -J(v+d, v+d)$. With our choice of $v$, this becomes

$$\rho = (1 - d_1)/d_1. \tag{3.9}$$
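The closed forms for $\theta$, $\lambda$, $c$, $e$ and $\rho$ quoted above can be cross-checked for the three admissible pairs. The following sketch assumes $n = d_1 + d_2$ and the quadratic form $J(x) = (x_1+x_2)^2/(n-1) - x_1^2/d_1 - x_2^2/d_2$; function names are illustrative only.

```python
from fractions import Fraction as Fr

def J(x, d1, d2):
    """Quadratic form J evaluated on a vector x (assumed normalisation, n = d1 + d2)."""
    n = d1 + d2
    return (x[0] + x[1]) ** 2 / Fr(n - 1) - x[0] ** 2 / Fr(d1) - x[1] ** 2 / Fr(d2)

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def check_first_family(d1, d2):
    n = d1 + d2
    v, w, d = (-1, 0), (0, -1), (d1, d2)
    theta = (Fr(2 - d2, 2 * (1 - n)), Fr(1 + d1, 2 * (1 - n)))
    # lambda = J^{-1} grad(theta), with J^{-1} the 2x2 matrix of Sect. 3
    lam = (d1 * (d1 - 1) * theta[0] + d1 * d2 * theta[1],
           d1 * d2 * theta[0] + d2 * (d2 - 1) * theta[1])
    c = (Fr(1 - d1, 2), Fr(1 - d2))
    e = tuple(2 * (c[i] + d[i] + v[i]) for i in range(2))
    vd = (v[0] + d1, v[1] + d2)
    wd = (w[0] + d1, w[1] + d2)
    rho = -J(vd, d1, d2)
    return all([
        dot(vd, theta) == -1,            # s' = 1 in (1.8)
        dot(wd, theta) == Fr(-1, 2),     # s = 2 in (1.8)
        dot(c, theta) == 1,              # (3.6)
        J(c, d1, d2) == 0,               # c is null for J
        lam == (-d1, Fr(-d2, 2)),
        e == (d1 - 1, 2),
        rho == Fr(1 - d1, d1),           # (3.9)
        all(l * rho == x for l, x in zip(lam, e)),  # lambda is proportional to e
    ])

if __name__ == "__main__":
    for pair in [(2, 8), (3, 6), (5, 5)]:
        print(pair, check_first_family(*pair))
```

The last check shows that the factor of proportionality is $\lambda = e/\rho$, which is where $a^2 = -\rho$ comes from.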

Now we can derive our second equation. Multiplying (3.7) by $(d_1 - 1)$, (3.4) by $d_1$ and subtracting, we eliminate the terms in $A_v$ and $(f_1'/f_1)^2$. We obtain

$$-d_1 d_2(1 + d_1)\frac{f_1' f_2'}{f_1 f_2} + \Big(d_1 d_2 - \frac{3}{4}d_1 d_2^2 - \frac{1}{4}d_2^2\Big)\Big(\frac{f_2'}{f_2}\Big)^2 - \frac{d_1 C}{f_1^{(e_1+2)} f_2^{e_2}} + \frac{d_1 A_w}{f_2^2} = 0. \tag{3.10}$$

Using (2.5) we see that

$$\begin{pmatrix} -d_1 d_2(1 + d_1) \\ d_1 d_2 - \frac{3}{4}d_1 d_2^2 - \frac{1}{4}d_2^2 \end{pmatrix}$$

is proportional to $\lambda$ and hence to $e$. So we can rewrite (3.10) as

$$\frac{f_2'}{f_2}\Big(e_1\frac{f_1'}{f_1} + e_2\frac{f_2'}{f_2}\Big) = \frac{\kappa_1}{f_2^2} + \frac{\kappa_2}{f_1^2\, f_1^{e_1} f_2^{e_2}}, \tag{3.11}$$

where the constants $\kappa_1$, $\kappa_2$ may be calculated to be

$$\kappa_1 = \frac{A_w(d_1 - 1)^2}{4d_1(d_1 + 1)}, \qquad \kappa_2 = -\frac{C(d_1 - 1)^2}{4d_1(d_1 + 1)}.$$

We can use Eqs. (3.8) and (3.11) to perform the integration of the Einstein system by quadratures. The form of these equations suggests introducing new dependent variables $\xi = f_1^{e_1} f_2^{e_2}$ and $\beta = f_2^2$. Then (3.8) gives

$$\frac{\xi'}{\xi} = \frac{a}{f_1}\sqrt{A_v + \frac{C}{\xi}}, \tag{3.12}$$

while (3.11) yields

$$\frac{\beta'\xi'}{\beta\xi} = \frac{2\kappa_1}{\beta} + \frac{2\kappa_2}{f_1^2\,\xi}. \tag{3.13}$$

We introduce a new independent variable $r$ defined by

$$\frac{dr}{dt} = \frac{1}{f_1}.$$

Then Eq. (3.12) is in $\xi$ only:

$$\frac{d\xi}{dr} = a\sqrt{A_v\xi^2 + C\xi},$$


which, if $C \neq 0$, integrates to

$$\xi = \frac{C}{2A_v}\Big(\cosh\big(a\sqrt{A_v}\,r + \gamma\big) - 1\Big),$$

where $\gamma$ is a constant. Observe that

$$\Big(\frac{d\xi}{dr}\Big)^{-1} = \frac{2\sqrt{A_v}}{aC}\operatorname{cosech}\big(a\sqrt{A_v}\,r + \gamma\big)$$

and

$$e^{\int 1/\frac{d\xi}{dr}\,dr} = \tanh\Big(\frac{a}{2}\sqrt{A_v}\,r + \frac{\gamma}{2}\Big)^{2/a^2 C},$$

which tends to 1 as $r$ tends to $\infty$. Equation (3.13) becomes

$$\frac{d\beta}{dr}\frac{d\xi}{dr} = 2\kappa_1 f_1^2\,\xi + 2\kappa_2\,\beta. \tag{3.14}$$

As $e_2 = 2$, we see that $f_1^2 = (\xi/\beta)^{2/e_1}$ so (3.14) can be rewritten as

$$\frac{d\beta}{dr} - \frac{2\kappa_2}{d\xi/dr}\,\beta = \Big(\frac{2\kappa_1\,\xi^{(e_1+2)/e_1}}{d\xi/dr}\Big)\beta^{-2/e_1}.$$

This is a Bernoulli equation with known coefficients, so can be linearised and integrated by quadratures. Explicitly, putting $u = \beta^{(e_1+2)/e_1}$, we have

$$u = \kappa_3\, e^{\int \kappa_4/\frac{d\xi}{dr}\,dr}\int\Big(\frac{\xi^{(e_1+2)/e_1}}{d\xi/dr}\Big)e^{-\int \kappa_4/\frac{d\xi}{dr}\,dr}\,dr, \tag{3.15}$$

where

$$\kappa_3 = \frac{2\kappa_1(e_1 + 2)}{e_1}, \qquad \kappa_4 = \frac{2\kappa_2(e_1 + 2)}{e_1}.$$

Note that the exponent $2\kappa_4/a^2 C$ occurring in the integral is actually equal to $-1$, as can be seen from our formulae for $\kappa_4$, $\kappa_2$, $a$ and $\rho$. So, putting $R = a\sqrt{A_v}\,r + \gamma$, we can write $u$ as a constant multiple of the function

$$\coth\Big(\frac{R}{2}\Big)\int\frac{\tanh(\frac{R}{2})\,(\cosh(R) - 1)^{(e_1+2)/e_1}}{\sinh(R)}\,dR. \tag{3.16}$$

If $d_1 = 2$ or 3 (so $e_1 = 1$ or 2) this can be converted into the integral of a rational function by using $\tanh(\frac{R}{2})$ as the new variable. If $d_1 = 5$, it becomes the integral of an algebraic function.

Note that if $C = 0$, then $\xi = \gamma e^{a\sqrt{A_v}\,r}$. Again, $\beta$ satisfies a Bernoulli equation with known coefficients.
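Two of the claims above lend themselves to a quick sanity check: that the $\cosh$ expression solves $d\xi/dr = a\sqrt{A_v\xi^2 + C\xi}$, and that the exponent $2\kappa_4/a^2C$ equals $-1$. The sketch below uses a central finite difference for the first and exact fractions for the second; the sample parameter values in the usage lines are arbitrary and chosen only for illustration, and the check is restricted to $a\sqrt{A_v}\,r + \gamma > 0$.

```python
import math
from fractions import Fraction

def xi(r, a, Av, C, gamma=0.0):
    """Closed-form solution xi(r) = C/(2 Av) (cosh(a sqrt(Av) r + gamma) - 1)."""
    return C / (2 * Av) * (math.cosh(a * math.sqrt(Av) * r + gamma) - 1)

def ode_residual(r, a, Av, C, gamma=0.0, h=1e-5):
    """Finite-difference estimate of dxi/dr - a sqrt(Av xi^2 + C xi)."""
    dxi = (xi(r + h, a, Av, C, gamma) - xi(r - h, a, Av, C, gamma)) / (2 * h)
    x = xi(r, a, Av, C, gamma)
    return dxi - a * math.sqrt(Av * x * x + C * x)

def exponent(d1):
    """2 kappa_4 / (a^2 C) with C set to 1 (C cancels), e_1 = d_1 - 1, a^2 = -rho."""
    kappa2 = -Fraction((d1 - 1) ** 2, 4 * d1 * (d1 + 1))
    e1 = d1 - 1
    kappa4 = 2 * kappa2 * Fraction(e1 + 2, e1)
    a_squared = Fraction(d1 - 1, d1)
    return 2 * kappa4 / a_squared

if __name__ == "__main__":
    a, Av, C = math.sqrt(4 / 5), 20.0, 1.0  # illustrative values for d1 = 5
    for r in (0.1, 0.5, 1.0):
        print(r, ode_residual(r, a, Av, C))
    print([exponent(d1) for d1 in (2, 3, 5)])
```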


Remark 3.1. We have analysed the full cohomogeneity one Einstein equation when the principal orbit is a product of two isotropy irreducible spaces of dimensions (2, 8), (5, 5), (3, 6). In fact, our results can produce Einstein metrics in more general situations. Consider the metric $g = dt^2 + g_t$, where $g_t$ is the product metric $f_1^2(t)h_1 \oplus f_2^2(t)h_2$, and $h_1$, $h_2$ are Einstein metrics on spaces $M_1$ and $M_2$ with positive Einstein constant. We do not now assume that $M_i$ are isotropy irreducible, or even homogeneous. The Ricci-flat Einstein equations for $g$ are then Eqs. (3.1–3.3), where $A_w$, $A_v$ are positive constants. If the dimensions of $M_1$, $M_2$ are as in our three cases, all our analysis is still valid. If the Einstein constants of $M_i$ are not both positive, then the equations are still integrable, but, for example, when $A_v$ is negative, the solution to (3.12) will now involve trigonometric or rational functions. In order to treat the case of a Lorentz metric $g = -dt^2 + g_t$ with space-like orbits, it is only necessary to observe that the Ricci-flat Einstein equations are simply (3.1–3.3) with a negative sign before $A_v$ and $A_w$. In other words, by allowing these constants to assume both signs, our integrability discussion already includes the Lorentz case.

4. Completeness

In this section we shall use our explicit description of the metric in Sect. 3 to analyse when it is complete. It is convenient to use the coordinate

$$R = a\sqrt{A_v}\,r + \gamma \tag{4.1}$$

introduced above. If we take the constant of integration $C$ to be positive, it is clear that $\xi$, $\frac{d\xi}{dr}$ and $e^{\int 1/\frac{d\xi}{dr}\,dr}$ are finite and positive on $0 < R < \infty$. This is also true, therefore, for $u$, and hence for $\beta$ and for

$$f_1 = \Big(\frac{\xi}{\beta}\Big)^{1/e_1} \tag{4.2}$$

and

$$f_2 = \Big(\frac{\xi}{f_1^{e_1}}\Big)^{1/e_2}, \tag{4.3}$$

so our metric is well-defined on the range $0 < R < \infty$. Moreover, as $R \to \infty$ we have (suppressing multiplicative constants)

$$u \sim e^{\frac{2R}{e_1}}, \qquad \beta \sim e^{\frac{2R}{e_1+2}}$$

and hence

$$f_2 \sim e^{\frac{R}{e_1+2}}.$$

Using (4.2), we see that

$$f_1 \sim e^{\frac{R}{e_1+2}}$$


also. Referring back to the definition of the coordinate $r$, we see that as $r$, and hence $R$, tends to $\infty$, the arclength coordinate $t$ tends to $\infty$ also. So there are no completeness problems at infinity. More precisely, both $f_1$ and $f_2$ behave asymptotically like constant multiples of $t$, so the metric is asymptotically a cone over an Einstein product metric on the principal orbit. In particular, the volume growth is maximal. We must now consider what happens as $R$ tends to zero from above. We shall choose the constant of integration $\gamma$ in (4.1) to be zero, so that $R = 0$ is equivalent to $r = 0$. Also, we choose the constant of integration in our integral formula (3.16) for $u$ to be zero (this ensures that $u$ vanishes at $R = 0$). From our formula for $u$, we calculate that

$$\lim_{R\to 0}\frac{1}{u}\frac{du}{dr}\frac{d\xi}{dr} = a^2 C\,\frac{e_1 + e_2}{e_1}. \tag{4.4}$$

Now, (3.14) yields that

$$2\kappa_1 f_1^2\,\frac{\xi}{\beta} = -2\kappa_2 + \frac{e_1}{e_1 + e_2}\,\frac{1}{u}\frac{du}{dr}\frac{d\xi}{dr},$$

and using (4.2), (4.4) and our formulae for $\kappa_1$, $\kappa_2$, $a$ we find that

$$\lim_{R\to 0} f_1^{e_1+e_2} = \frac{C}{A_w}\Big(1 + \frac{2(d_1 + 1)}{d_1 - 1}\Big). \tag{4.5}$$

So $f_1$ approaches a positive constant as $R$ tends to zero. We see from (3.16) that as $R$ tends to zero, $u$ is to leading order a nonzero multiple of $R^{2 + \frac{4}{e_1}}$. It follows that $\beta = f_2^2$ behaves like a constant multiple of $R^2$; in particular, $f_2$ vanishes at $R = 0$. (Alternatively, we could deduce this from (4.3), (4.5) and the vanishing of $\xi$ at $R = 0$.) We also need to check the derivatives with respect to $t$ of $f_1$ and $f_2$ at $R = 0$. Observe first that

$$\frac{1}{\xi}\frac{d\xi}{dr} = \frac{2}{r} + O(r)$$

and, from (4.4),

$$\lim_{R\to 0}\frac{r}{u}\frac{du}{dr} = \lim_{R\to 0}\frac{1}{u}\frac{du}{dr}\frac{d\xi}{dr}\frac{2}{a^2 C} = 2\,\frac{e_1 + e_2}{e_1}.$$

Hence

$$\frac{1}{\xi}\frac{d\xi}{dr} - \frac{1}{\beta}\frac{d\beta}{dr} = \frac{1}{\xi}\frac{d\xi}{dr} - \frac{e_1}{(e_1 + e_2)u}\frac{du}{dr}$$

is an odd function with no negative powers of $r$ in its Laurent expansion, hence vanishes at $R = 0$. It follows using (4.2) that $df_1/dr$, and hence $f_1'$, vanish at $R = 0$ as required. Eliminating $e_1 f_1'/f_1 + e_2 f_2'/f_2$ from (3.8) and (3.11), and using our formulae for $\kappa_1$, $\kappa_2$, $a$, we obtain

$$\frac{f_2'}{f_1}\sqrt{A_v f_2^2 + \frac{C}{f_1^{e_1}}} = \sqrt{\frac{d_1 - 1}{d_1}}\;\frac{d_1 - 1}{4(d_1 + 1)}\Big(A_w - \frac{C}{f_1^{e_1+e_2}}\Big).$$


We need $f_2'$ to equal 1 at $R = 0$. Substituting into the above equation, this is equivalent to

$$\frac{\sqrt{C}}{(f_1(0))^{(e_1+e_2)/2}} = \sqrt{\frac{d_1 - 1}{d_1}}\;\frac{d_1 - 1}{4(d_1 + 1)}\Big(A_w - \frac{C}{(f_1(0))^{e_1+e_2}}\Big).$$

Using (4.5), we find after some manipulation that this in turn is just equivalent to

$$A_w = \frac{4d_1(3d_1 + 1)}{(d_1 - 1)^2}.$$

Using the relation (2.5) satisfied by our dimensions $d_1$ and $d_2$, we see that this is equivalent to

$$A_w = d_2(d_2 - 1). \tag{4.6}$$
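The manipulation leading to (4.6) can be spot-checked numerically. In the sketch below, `Aw_required` evaluates $4d_1(3d_1+1)/(d_1-1)^2$ exactly, and `slope_condition_gap` substitutes (4.5) into the $f_2'(0) = 1$ condition as written above; both names, and the normalisation $C = 1$, are illustrative assumptions.

```python
import math
from fractions import Fraction as Fr

PAIRS = [(2, 8), (3, 6), (5, 5)]

def Aw_required(d1):
    """Value of A_w forced by the condition f_2'(0) = 1."""
    return Fr(4 * d1 * (3 * d1 + 1), (d1 - 1) ** 2)

def slope_condition_gap(d1, C=1.0):
    """Left side minus right side of the f_2'(0) = 1 condition, using (4.5)."""
    Aw = float(Aw_required(d1))
    X = (C / Aw) * (1 + 2 * (d1 + 1) / (d1 - 1))  # f_1(0)^(e_1 + e_2) from (4.5)
    lhs = math.sqrt(C) / math.sqrt(X)
    rhs = math.sqrt((d1 - 1) / d1) * (d1 - 1) / (4 * (d1 + 1)) * (Aw - C / X)
    return lhs - rhs

if __name__ == "__main__":
    for d1, d2 in PAIRS:
        print((d1, d2), Aw_required(d1), d2 * (d2 - 1), slope_condition_gap(d1))
```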

Now as $f_2$ vanishes at $R = 0$ we need the collapsing space $G_2/K_2$ to be a sphere. Equation (4.6) is now just the (true!) statement that the round sphere of unit radius and dimension $d_2$ has scalar curvature $d_2(d_2 - 1)$. So $f_2$ has the correct derivative at $R = 0$. To sum up, if $G_2/K_2$ is a sphere we can complete the metric by adding in a special orbit $G_1/K_1 = G/(K_1\times G_2)$ at $R = 0$. These are among the complete examples of Böhm [2]. In the special cases that we have been considering, we have an explicit formula for the metric because of the existence of generalised first integrals.

5. A Second Family

We can also apply Lemma (1.1) to find first integrals for the Hamiltonian (0.1), in certain cases, when the two vectors in $W$ are

$$v = (0, -1), \qquad w = (1, -2). \tag{5.1}$$

If $d_1 = 1$ we are in a situation analysed by Bérard-Bergery [1]. We shall therefore assume that $d_1 > 1$ from now on. With this assumption, the only way that $((v+d)\cdot\nabla\theta)/((w+d)\cdot\nabla\theta)$ or its reciprocal can equal 2 is if

$$1 + d_1 S - n = 2(1 + d_2 S + 2d_1 S - n),$$

which simplifies to

$$n = d_1 + \frac{9d_1}{d_1 - 4}.$$

(Note that the relation $9d_1 = d_2(d_1 - 4)$ which follows from the above is the analogue of (2.5).) It follows that the possible values for the integers $d_1$ and $d_2$ are

$$(d_1, d_2) = (5, 45),\ (6, 27),\ (7, 21),\ (8, 18),\ (10, 15),\ (13, 13),\ (16, 12),\ (22, 11),\ (40, 10).$$

As before, for these values of $d_i$ we can construct a generalised first integral of the form (1.9) with $F_c = \rho\,\theta^2 = -J + \tau\,\theta$ and (1.5–1.8) holding.
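The list of admissible dimensions follows from the divisibility condition $9d_1 = d_2(d_1 - 4)$, which is easy to enumerate. A minimal sketch, assuming $d_1 > 4$ so that $d_2$ is positive (the search bound `d1_max` is an arbitrary illustrative choice; since $d_2 = 9 + 36/(d_1 - 4)$, no solutions exist beyond $d_1 = 40$):

```python
def second_family_pairs(d1_max=1000):
    """All (d1, d2) with 9*d1 = d2*(d1 - 4) and 4 < d1 <= d1_max."""
    pairs = []
    for d1 in range(5, d1_max + 1):
        if (9 * d1) % (d1 - 4) == 0:
            pairs.append((d1, 9 * d1 // (d1 - 4)))
    return pairs

def even_base_pairs(pairs):
    """Pairs whose second entry is even, as needed for a Kaehler base of real dimension d2."""
    return [(d1, d2) for d1, d2 in pairs if d2 % 2 == 0]

if __name__ == "__main__":
    pairs = second_family_pairs()
    print(pairs)
    print(even_base_pairs(pairs))
```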


We need to see when these Hamiltonian flows can actually be interpreted as Einstein equations. To do this we examine the formula for the Ricci tensor of a Riemannian submersion to determine how we might obtain a scalar curvature function of the form we have been considering. Because $A_v$ and $A_w$ are constants and the coefficient corresponding to the vector $(-1, 0)$ in the scalar curvature formula is 0, we are led to consider metrics $g = dt^2 + g_t$, where $g_t$ is a metric on the principal space $P$ of a torus fibration

$$T^{d_1} \longrightarrow P \stackrel{\pi}{\longrightarrow} M,$$

where $M$ is an Einstein manifold. Furthermore, the connection of the Riemannian submersion must be Yang–Mills since we want the cross terms of the Ricci tensor to vanish. In other words, the curvature form of the connection on the torus bundle must be harmonic with respect to the base metric. Lastly, certain quadratic expressions of the curvature form must be proportional to the metrics on the fibres and base.

A special situation in which these quadratic expressions have the desired form was studied in [10]. There, the base $M$ was taken to be a product $M_1\times\dots\times M_m$ of Kähler–Einstein manifolds of real dimension $n_i$. Since the real dimension of $M$ is $d_2$ which is now even, our pair $(d_1, d_2)$ must be (8, 18), (16, 12) or (40, 10). Let us write $c_1(M_i) = y_i\alpha_i$, where $\alpha_i \in H^2(M_i, \mathbb{Z})$ is indivisible. Also, we shall write the Kähler–Einstein metrics on the factors $M_i$ as $x_i g_i$, where $x_i$ are positive constants to be chosen and the Einstein constant of $g_i$ is $y_i$. Denote by $\omega_j$ the Kähler form of $g_j$ on $M_j$, and let $X_1, \dots, X_{d_1}$ be a basis for the Lie algebra of $T^{d_1}$. We choose the metrics $g_t$ on $P$ to be obtained by Riemannian submersion, with the dependence on $t$ being encoded in two functions $f_1$, $f_2$ of $t$. We take the metric on the fiber to be $f_1(t)^2\zeta$, where $\zeta$ is some fixed left-invariant metric with $\zeta(X_k, X_l)$ denoted by $\zeta_{kl}$. The metric on the base is chosen to be $f_2(t)^2$ times the product of the Kähler–Einstein metrics $x_i g_i$. Finally, we choose the connexion on $P$ to have curvature form $\Omega = \pi^*\eta$, where

$$\eta = \sum_{k=1}^{d_1}\eta_k X_k,$$

and

$$\eta_k = \sum_{j=1}^{m} b_{kj}\,\omega_j \;:\; b_{kj} \in \mathbb{Z}.$$

This ensures that $\eta$ is harmonic with respect to any of the base metrics we are considering. Specifically, we would like to be able to choose $b_{kj}$ and $\zeta$ so that the Ricci tensor of $g_t$ has the form

$$\begin{pmatrix} \dfrac{\mu f_1^2}{d_1 f_2^4}\, I_{d_1} & 0 \\[2mm] 0 & \Big(\dfrac{\lambda}{f_2^2} - \dfrac{2\mu f_1^2}{d_2 f_2^4}\Big) I_{d_2} \end{pmatrix}$$

with respect to the decomposition of $TP$ into vertical and horizontal spaces. Here $\mu$, $\lambda$ are nonzero constants where the latter is the Einstein constant of the metric on the base.


Then the equations for $dt^2 + g_t$ to be Ricci-flat will be the Hamiltonian flow for (0.1) with $r = 2$ and $W$ given by (5.1). Referring to the calculations of [10] (which use the O’Neill formulae), we see that to get the desired Ricci tensor in the vertical and horizontal directions, we need:

$$\sum_{i=1}^{m}\frac{n_i b_{ki} b_{li}}{4x_i^2} = \frac{\mu\,\zeta^{kl}}{d_1} \;:\; 1 \le k, l \le d_1, \tag{5.2}$$

$$\frac{y_i}{x_i} = \lambda \;:\; 1 \le i \le m, \tag{5.3}$$

$$\sum_{k,l=1}^{d_1}\frac{\zeta_{kl}\, b_{ki} b_{li}}{x_i^2} = 4\,\frac{\mu}{d_2} \;:\; 1 \le i \le m. \tag{5.4}$$

These imply the equations

$$\sum_{i=1}^{m}\frac{n_i b_{ki} b_{li}}{y_i^2} = E\,\zeta^{kl} \;:\; 1 \le k, l \le d_1, \tag{5.5}$$

$$\sum_{k,l=1}^{d_1}\frac{\zeta_{kl}\, b_{ki} b_{li}}{y_i^2} = E' \;:\; 1 \le i \le m, \tag{5.6}$$

where $E$, $E'$ are nonzero constants. Note that multiplying (5.5) by $\zeta_{kl}$ and summing over $k, l$, and multiplying (5.6) by $n_i$ and summing over $i$, tells us that $d_1 E = d_2 E'$. So a solution to (5.5)–(5.6) will give a solution to (5.2)–(5.4). Let $B$ be the $d_1\times m$ matrix $(b_{ki})$, and let $N$ be the $m\times m$ diagonal matrix with entries $n_i/y_i^2$. Then the equations become

$$BNB^T = E\,\zeta^{-1}, \tag{5.7}$$

$$(B^T\zeta B)_{ii} = y_i^2\,\frac{E d_1}{d_2} \;:\; 1 \le i \le m. \tag{5.8}$$

Observe that (5.2) means that the nonsingular matrix $\zeta^{-1}$ is the sum of $m$ rank one matrices, so we must have $m \ge d_1$. The only remaining possibilities, then, are to take $d_1 = 8$, $m = 8$, $M = (\mathbb{CP}^1)^7\times Y$, where $Y$ is a Kähler–Einstein surface, or $d_1 = 8$, $m = 9$, $M = (\mathbb{CP}^1)^9$. In the first case, $B$ is square and nonsingular, so (5.7) tells us that $\zeta = E(B^T)^{-1}N^{-1}B^{-1}$. Substituting into (5.8) gives


$$(N^{-1})_{ii} = \frac{4y_i^2}{9},$$

so $n_i = 9/4$, which is a contradiction. In the second case, letting $Q = 2E\zeta^{-1}$ the equations become

$$BB^T = Q, \tag{5.9}$$

$$(B^T Q^{-1} B)_{ii} = 8/9 \;:\; 1 \le i \le 9. \tag{5.10}$$
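Equations (5.9) and (5.10) can be tested on a concrete integer matrix. The sketch below builds a sample $8\times 9$ matrix $B$ of maximal rank whose row sums vanish (rows $\epsilon_k - \epsilon_{k+1}$, an illustrative choice, not one taken from the text), sets $Q = BB^T$, and computes the diagonal of $B^T Q^{-1} B$ exactly with a small Gauss–Jordan solver written for this purpose.

```python
from fractions import Fraction

def solve(Q, B):
    """Return X with Q X = B by Gauss-Jordan elimination over the rationals.
    Q is assumed square and nonsingular; B has the same number of rows."""
    n = len(Q)
    M = [[Fraction(x) for x in Q[i]] + [Fraction(x) for x in B[i]] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def projection_diagonal(B):
    """Diagonal entries of B^T Q^{-1} B with Q = B B^T."""
    rows, cols = len(B), len(B[0])
    Q = [[sum(B[k][i] * B[l][i] for i in range(cols)) for l in range(rows)]
         for k in range(rows)]
    X = solve(Q, B)  # X = Q^{-1} B
    return [sum(B[k][i] * X[k][i] for k in range(rows)) for i in range(cols)]

# Sample matrix: row k is e_k - e_{k+1}, so every row sum is zero and the rank is 8.
SAMPLE_B = [[1 if j == k else -1 if j == k + 1 else 0 for j in range(9)]
            for k in range(8)]

if __name__ == "__main__":
    print(projection_diagonal(SAMPLE_B))
```

The reason every entry comes out equal is that $B^T Q^{-1} B$ is the orthogonal projection onto the row space of $B$, which for zero row sums is the complement of the vector $(1, \dots, 1)$.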

Lemma 5.1. If $B$ is a maximal rank matrix whose row sums are all zero, then defining $Q$ by (5.9) gives a solution to (5.10).

Proof. We use two bases of Euclidean $\mathbb{R}^9$. Let $\epsilon_i$ ($1 \le i \le 9$) denote the standard basis of $\mathbb{R}^9$, and put $\epsilon = \epsilon_1 + \dots + \epsilon_9$. Also, if we let $B_k$ denote the $k$th row of $B$, then since $B$ has maximal rank, it follows that $B_1, \dots, B_8, \epsilon$ is also a basis of $\mathbb{R}^9$. Note that the condition on the row sums becomes $\langle B_k, \epsilon\rangle = 0$ for $k = 1, \dots, 8$. In this new basis, the relation $\langle\epsilon_i, \epsilon_i\rangle = 1$ becomes

$$1 = \sum_{k,l}\langle\epsilon_i, B_k\rangle\langle\epsilon_i, B_l\rangle Q^{kl} + \frac{1}{9}\langle\epsilon_i, \epsilon\rangle^2,$$

or equivalently,

$$\sum_{k,l}\langle\epsilon_i, B_k\rangle\langle\epsilon_i, B_l\rangle Q^{kl} = \frac{8}{9},$$

which is Eq. (5.10). $\square$

We therefore obtain the following theorem.

Theorem 5.1. Let $B$, $Q$ be as in Lemma 5.1 and $P$ be the principal $T^8$-bundle over $(\mathbb{CP}^1)^9$ whose Euler class is given by $B$. Define a $t$-dependent metric $g_t$ on $P$ as in the preceding discussion. Then, the Einstein equations with zero Einstein constant for $g = dt^2 + g_t$ correspond to the Hamiltonian flow on the zero level set of the Hamiltonian (0.1) with $r = 2$, $d_1 = 8$, $d_2 = 18$ and $W$ given by (5.1). Furthermore, this Hamiltonian flow has a nontrivial generalised first integral of the form (1.9).

We note that the principal bundles in the above theorem are actually homogeneous spaces of the form $[SU(2)]^9/K$, where $K$ is determined by the matrix $B$. Its identity component is a circle subgroup of $U(1)^9 \subset SU(2)^9$ and $K/K_0$ is a finite abelian group. Explicit integration of the Einstein equations proceeds in a similar fashion to that in Sect. 3. The relation $F = -C$ may be written as

$$\Big(\lambda_1\frac{f_1'}{f_1} + \lambda_2\frac{f_2'}{f_2}\Big)^2 + \frac{1}{\rho f_2^2}\Big(A_v + \frac{C}{f_1^{e_1} f_2^{e_2}}\Big) = 0, \tag{5.11}$$

where $e = 2(c + d + v)$ as before. Again, we can calculate $\theta$, $c$, $\lambda$, $\rho$; we obtain

$$\begin{pmatrix}\theta_1 \\ \theta_2\end{pmatrix} = \frac{1}{2(1-n)}\begin{pmatrix} 3 - d_2 \\ 2 + d_1 \end{pmatrix},$$


$$c = \begin{pmatrix} \frac{1}{3}(1 - d_1) \\ \frac{1}{2}(1 - d_2) \end{pmatrix}, \qquad \lambda = \begin{pmatrix} -\frac{3d_1}{2} \\ -d_2 \end{pmatrix}, \qquad \rho = \frac{1 - d_2}{d_2}.$$

It follows that we have

$$e_1 = \frac{2}{3}(2d_1 + 1) \quad\text{and}\quad e_2 = \frac{4d_2}{9d_1}(2d_1 + 1) = d_2 - 1.$$

As before, (5.11) can be rewritten as

$$e_1\frac{f_1'}{f_1} + e_2\frac{f_2'}{f_2} = \frac{a}{f_2}\sqrt{A_v + \frac{C}{f_1^{e_1} f_2^{e_2}}}, \tag{5.12}$$

where $a = \pm\sqrt{-\rho}$. Once again, we can eliminate the $A_v$ and $f_1'/f_1$ terms simultaneously from the equations $F = -C$ and $H = 0$ to obtain:

$$\frac{f_1'}{f_1}\Big(e_1\frac{f_1'}{f_1} + e_2\frac{f_2'}{f_2}\Big) = \frac{\kappa f_1^2}{f_2^4} + \frac{\mu}{f_1^{e_1} f_2^{e_2+2}}, \tag{5.13}$$

where $\kappa$, $\mu$ are constants. Putting $\xi = f_1^{e_1} f_2^{e_2}$ and $\beta = f_1^2$, and defining a coordinate $r$ by

$$\frac{dr}{dt} = \frac{1}{f_2},$$

we obtain

$$\frac{d\xi}{dr} = a\sqrt{A_v\xi^2 + C\xi} \tag{5.14}$$

and

$$\frac{d\beta}{dr}\frac{d\xi}{dr} = 2\mu\beta + \frac{2\kappa\beta^2\xi}{f_2^2}. \tag{5.15}$$

Equation (5.14) is integrated as before. Equation (5.15) then becomes the Bernoulli equation

$$\frac{d\beta}{dr} - \frac{2\mu}{d\xi/dr}\,\beta = \Big(\frac{2\kappa\,\xi^{\frac{e_2-2}{e_2}}}{d\xi/dr}\Big)\beta^{2 + \frac{e_1}{e_2}},$$

which can then be linearised and integrated by quadratures. The corresponding metrics are defined for $r$ in an interval of the form $[r_0, \infty)$ and there are no completeness problems at infinity. The function $f_1$ is asymptotically constant while $f_2$ is asymptotically of the form $\exp(R/e_2)$, where $R$ is given by (4.1). The volume therefore grows like $t^{19}$, which is not maximal. This behaviour at infinity is similar to that of the Euclidean Taub-NUT metric (see, for example, [4]). However, our metrics do not have a smooth completion beyond $r_0$ (as a cohomogeneity one metric).
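The exponents of the second family and the stated volume growth can be tabulated for the admissible pairs. The sketch below checks that the two expressions for $e_2$ agree and that $e = 2(c + d + v)$; the volume-growth function rests on the heuristic stated above ($f_1$ asymptotically constant and $f_2$ growing linearly in $t$, so that $\int f_1^{d_1} f_2^{d_2}\,dt \sim t^{d_2+1}$), which is an interpretation and not a formula taken from the text.

```python
from fractions import Fraction as Fr

SECOND_FAMILY = [(5, 45), (6, 27), (7, 21), (8, 18), (10, 15),
                 (13, 13), (16, 12), (22, 11), (40, 10)]

def second_family_exponents(d1, d2):
    """Return (e1, e2, rho) from the closed forms quoted for the second family."""
    e1 = Fr(2 * (2 * d1 + 1), 3)
    e2 = Fr(4 * d2 * (2 * d1 + 1), 9 * d1)
    rho = Fr(1 - d2, d2)
    return e1, e2, rho

def e_from_c(d1, d2):
    """e = 2(c + d + v) with v = (0, -1) and c = ((1 - d1)/3, (1 - d2)/2)."""
    c = (Fr(1 - d1, 3), Fr(1 - d2, 2))
    return (2 * (c[0] + d1), 2 * (c[1] + d2 - 1))

def volume_growth_exponent(d2):
    """Heuristic exponent: f1 ~ const and f2 ~ t give volume ~ t^(d2 + 1)."""
    return d2 + 1

if __name__ == "__main__":
    for d1, d2 in SECOND_FAMILY:
        print((d1, d2), second_family_exponents(d1, d2))
    print(volume_growth_exponent(18))
```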


6. Final Remarks

One could try to obtain further examples of generalised first integrals by trying to solve (1.5)–(1.8) for some functions $\Gamma_v$, $\Gamma_w$ which are nonconstant. However, in order to be able to still set all other $F_b$ equal to 0, at $F_{c+(d+v)+(d+w)}$ we need to satisfy the additional condition

$$(v - w)\cdot\nabla(\Gamma_v - \Gamma_w) = 0. \tag{6.1}$$

If this condition holds, we again obtain a first integral of the form (1.9). However, Eq. (6.1) implies that $\theta$ cannot divide $\Gamma = \Gamma_v - \Gamma_w$, and so (1.5)–(1.6) force $\min(s, s')$ to equal one, and hence $s/s'$ or $s'/s$ to be a positive integer. Similarly, $\theta$ cannot divide $(v+d)\cdot\nabla\Gamma$. Using this fact on the equation got by applying the operator $((v+d)\cdot\nabla)((v-w)\cdot\nabla)$ to (1.5) minus (1.6), we find that $s/s'$ (respectively $s'/s$) must be less than four. If the set $W$ of weight vectors for the scalar curvature is as in Sect. 2, the only way that $s/s'$ or $s'/s$ can be a positive integer is if it equals 2. This is just the situation analysed in Sect. 2. If $W$ is as in Sect. 5, there are values for $(d_1, d_2)$ such that $s/s'$ is 3. However, one can check that it proves impossible in these cases to solve Eqs. (1.5)–(1.7) for $\rho$ and $\tau$. Therefore, we have found all the generalised polynomial first integrals of the form (1.9) that can be obtained from Lemma (1.1). Lastly, one may also try to find generalised polynomial first integrals using Lemma (1.1) in the case of the nonzero Einstein constant $\Lambda$. Similar computations show that in this case we only get the Bérard Bergery examples.

Acknowledgements. The authors acknowledge the support of NSERC under grants OPG0184235 and OPG0009421 respectively.

References

[1] Bérard Bergery, L.: Sur des nouvelles variétés riemanniennes d’Einstein. Nancy: Publications de l’Institut Elie Cartan, 1982
[2] Böhm, C.: Complete non-compact cohomogeneity one Einstein metrics. Bull. Math Soc. Fr. 127, 135–177 (1999)
[3] Dancer, A. and Wang, M.: The cohomogeneity one Einstein equations from the Hamiltonian viewpoint. Preprint 1998
[4] Eguchi, T. and Hanson, A.J.: Asymptotically flat self-dual solutions to Euclidean gravity. Phys. Lett. 74B (3), 249–251 (1978)
[5] Eschenburg, J.H. and Wang, M.: The initial value problem for cohomogeneity one Einstein metrics. J. of Geometric Analysis (to appear)
[6] Guillemin, V. and Sternberg, S.: Symplectic Techniques in Physics. Cambridge: Cambridge University Press, 1984
[7] Page, D.: A compact rotating gravitational instanton. Phys. Lett. 79B, 235–238 (1979)
[8] Page, D. and Pope, C.: Inhomogeneous Einstein metrics on complex line bundles. Classical and Quantum Gravity 4, 213–225 (1987)
[9] Wang, M. and Ziller, W.: Existence and non-existence of homogeneous Einstein metrics. Invent. Math. 84, 177–194 (1986)
[10] Wang, M. and Ziller, W.: Einstein metrics on principal torus bundles. J. Diff. Geom. 31, 215–248 (1990)

Communicated by A. Connes

Commun. Math. Phys. 208, 245 – 265 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Environment-Induced Superselection Rules in Markovian Regime

R. Olkiewicz*

Physics Faculty and BiBoS, University of Bielefeld, 33615 Bielefeld, Germany. E-mail: [email protected]

Received: 1 February 1999 / Accepted: 2 June 1999

Abstract: A generalization of the Jacobs–deLeeuw–Glicksberg splitting for a quantum dynamical semigroup is introduced. This generalization, called the isometric-sweeping decomposition, is used to derive effective superselection rules in the quantum system.

* Permanent address: Institute of Theoretical Physics, University of Wrocław, 50-204 Wrocław, Poland. E-mail: [email protected]

1. Introduction

One of the fundamental principles of quantum mechanics is the superposition principle. However, it is evident that some superpositions of quantum pure states do not take place in the real world. Well known examples of such a phenomenon encompass the absence of superpositions of states with different electric charge or with integer and half-integer spin. This fact led to the introduction, in 1952, of superselection rules [39], which axiomatically exclude certain superpositions from being observable: “We shall say that a superselection rule operates between subspaces if there are neither spontaneous transitions between their state vectors, nor there are measurable quantities with finite matrix elements between their state sectors”. An alternative equivalent description is that there exists a self-adjoint operator with the property that relative phase of complex numbers $\alpha$ and $\beta$ in the superposition $\alpha\psi_1 + \beta\psi_2$ of two eigenvectors belonging to any pair of its distinct eigenvalues is unobservable. For a review of this subject see the recent paper by Wightman [40]. It was a useful postulate, but the question about an explanation of its appearance arose. One of the efforts to answer it was connected with the algebraic approach of Haag and Kastler [18], in which the theory of superselection rules can be reduced to the study of states on the quasi-local $C^*$-algebra. On the other hand, it was shown in a wide range of examples that the interaction with an environment leads to the loss of the coherence of the reduced density matrices, and thus to the appearance of classical observables in the quantum system. This is, in fact, the basic idea of the


program of environment-induced superselection rules of Zurek. In [41] he showed that when a quantum system is open, interacting with an environment, superselection rules do not need to be postulated. They arise naturally as a result of the decoherence process, which effectively destroys superpositions between macroscopically different states with respect to a local observer, so that the system appears to be in one or the other of those states. By the term “destroys superposition” we understand that the off-diagonal elements of the superposition are unavailable with respect to a specific set of observations. Let us point out that two years earlier Araki [2] presented a model, in which a continuous superselection rule plays an important role for the reduction of wave functions. The idea was further elaborated in [23,24,31]. It is worth noting that there is a difference between the effective superselection rules arising through the interaction with an environment and the meaning attributed to the term “superselection rules”. Superselection rules are said to operate between subspaces of a Hilbert space if the phase factors between vectors belonging to two distinct subspaces are unobservable. In the case of effective superselection rules, phase coherence between vectors from some preferred set of pure states is being continuously destroyed by the interaction. However, as was shown in a recent paper by Giulini, Kiefer and Zeh [17], also in the case of electric charge, classical properties of the quantum system can emerge through the interaction of a charged particle with electromagnetic fields. In order to study decoherence, the analysis of the evolution of the reduced density matrix obtained by tracing out the environment variables is the most convenient strategy. 
If the interaction is such that the reduced density matrix becomes approximately diagonal in a particular basis selected by the interaction (in the simplest case), then it is said that an environment-induced superselection structure has occurred. Generally, the procedure of tracing out environment variables, being the composition of a unitary automorphism with a conditional expectation, leads to a complicated integro-differential equation for the reduced statistical operator. However, for a large class of interesting physical phenomena we can derive, using certain limiting procedures, an approximate Markovian master equation for the reduced density matrix [1]. More recently, the derivation of the master equation for the reduced density matrix of a system coupled linearly to an ohmic, subohmic and supraohmic environment at arbitrary temperature, was obtained in [20]. Another way by which master equations can arise is the completely positive coupling of quantum and classical systems (see [4,5] for discrete classical systems and [29,30] for continuous ones). Such a coupling also results in the appearance of the master equation for the evolution of the quantum system. The loss of quantum coherence in the Markovian regime was established in a number of open systems [36,37] giving clear evidence of environment-induced superselection rules. However, the question is left open as to whether it is a fundamental property of a dynamical semigroup to induce a superselection structure in a quantum system (possibly a trivial one, as the concept of dynamical semigroup encompasses also a unitary evolution). In 1996 Kupsch concluded that “a more rigorous mathematical investigation of the superselection structure of dynamical semigroups has still to be made” [27]. In the present paper we try to provide such an analysis. The paper is organized as follows. In Sect. 
2 we consider the existence of an invariant density for a Markov semigroup acting on the space of trace class operators (Thm. 1 and 2). Moreover, following the classical case, the concept of sweeping semigroup is introduced. In Sect. 3 we provide a generalization of the Jacobs–deLeeuw–Glicksberg splitting [26], called the isometric-sweeping decomposition (Thm. 8 and 11). The structure of the isometric part and the action of the semigroup on this part is described


(Corollary 18 and Thm. 19). It allows us to formulate some sufficient conditions for the validity of the noncommutative Foguel alternative (Prop. 22), and to obtain the characterization of the peripheral point spectrum in the irreducible case (Prop. 23). Finally, in Sect. 4, the isometric-sweeping decomposition is used to discuss the appearance of effective superselection rules in the quantum system (Thm. 24).

2. The Existence of Invariant Densities

One of the basic problems in the analysis of the asymptotic behavior of a quantum open system, which is closely related to a possible decomposition of its dynamical semigroup, is the question of the approach to equilibrium, when the system, irrespective of its initial state, evolves into one specific state. A convenient tool in the study of the approach to equilibrium is the existence of a faithful density matrix [13, 14, 38] or, at least, the existence of a family of invariant states whose recurrent subspace projection asymptotically approaches the identity operator [15]. For example, the existence of a faithful family of $T_t$-invariant states implies that $\{T_t\}$ is relatively compact in the weak operator topology [16], and hence the validity of the Jacobs–deLeeuw–Glicksberg splitting. Therefore, in this section we study conditions for the existence of an invariant density for a Markov semigroup.

First we introduce some notation. Let $H$ be a separable Hilbert space. By $\mathrm{Tr}(H)$, $HS(H)$, $K(H)$ and $B(H)$ we denote respectively the spaces of trace class, Hilbert–Schmidt, compact and bounded linear operators on $H$. They are Banach spaces with norms $\|\cdot\|_1$, $\|\cdot\|_2$ and $\|\cdot\|_\infty$ respectively (the last one being used for both $K(H)$ and $B(H)$). The hermitian and positive operators will be denoted by $\mathrm{Tr}(H)_{SA}$, $\mathrm{Tr}(H)_+$ and so on. Since $\mathrm{Tr}(H) = K(H)^*$ and $B(H) = \mathrm{Tr}(H)^*$, both the weak and the weak$^*$ topology are defined on $\mathrm{Tr}(H)$. The set of density matrices is given by $D = \{\varphi \in \mathrm{Tr}(H)_+ : \mathrm{Tr}\,\varphi = 1\}$. The set of all projectors is denoted by $P(H)$, and $P_f(H)$ is the subset of all finite-dimensional projectors. An operator $T : \mathrm{Tr}(H) \to \mathrm{Tr}(H)$ is said to be Markov if it is a contractive, positive and trace preserving operator on $\mathrm{Tr}(H)$. Finally, we call a map $t \mapsto T_t$, $t \ge 0$, a Markov semigroup if $T_t$ is a strongly continuous semigroup of Markov operators on $\mathrm{Tr}(H)$. We start with the following definition: $\varphi \in D$ is called invariant for $T_t$ if $T_t\varphi = \varphi$ for all $t \ge 0$.
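As a finite-dimensional illustration of the definitions above (not taken from the paper), the following Python sketch takes $H = \mathbb{C}^2$ and an amplitude-damping map written as $T(\varphi) = \sum_i V_i \varphi V_i^*$. The damping strength `GAMMA`, the matrices in `KRAUS` and all helper names are assumptions made only for this example. It checks that the chosen density is invariant and that the orbit $T^n\varphi$ of another density keeps trace one.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def apply_map(kraus, phi):
    """T(phi) = sum_i V_i phi V_i^* for a list of matrices V_i."""
    n = len(phi)
    out = [[0j] * n for _ in range(n)]
    for v in kraus:
        term = matmul(matmul(v, phi), dagger(v))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

GAMMA = 0.3  # hypothetical damping strength, chosen only for illustration
KRAUS = [
    [[1.0, 0.0], [0.0, math.sqrt(1.0 - GAMMA)]],
    [[0.0, math.sqrt(GAMMA)], [0.0, 0.0]],
]

phi_inv = [[1.0, 0.0], [0.0, 0.0]]   # candidate invariant density
image = apply_map(KRAUS, phi_inv)    # T(phi_inv), expected to equal phi_inv

phi = [[0.5, 0.0], [0.0, 0.5]]       # another initial density
for _ in range(200):
    phi = apply_map(KRAUS, phi)      # orbit T^n phi

print(abs(trace(phi)), abs(phi[0][0]), abs(phi[1][1]))
```

In finite dimensions an invariant density always exists, so this sketch only makes the definitions concrete; the existence question studied in this section is genuinely infinite-dimensional.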
Observe that if a discrete subsemigroup $T_n = T_1^n$, $n = 0, 1, \dots$, of a Markov semigroup $T_t$ has an invariant density, then $T_t$ also has an invariant density. To see this, assume that $T\varphi = \varphi$ for some $\varphi \in D$, where $T = T_1$. Then $\bar\varphi = \int_0^1 T_t(\varphi)\,dt$ is an invariant density for $T_t$. Therefore we can consider only the discrete case.

Theorem 1. For a Markov operator $T$ the following conditions are equivalent:
a) $\exists \varphi \in D$ such that $\{T^n\varphi\}$ is relatively weakly compact.
b) $\exists \varphi \in D\ \exists \varphi_0 \in \mathrm{Tr}(H)_+$ such that $\limsup_{n\to\infty} \|(T^n\varphi - \varphi_0)_+\|_1 < 1$.
c) $\exists \varphi \in D\ \exists e \in P_f(H)$ such that $\limsup_{n\to\infty} T^n\varphi(e^\perp) < 1$.

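d) $\exists \varphi \in D\ \exists e \in P_f(H)\ \exists\,\mathrm{Lim}$, a Banach limit on $l^\infty$, such that $\mathrm{Lim}(T^n\varphi(e^\perp)) < 1$.
e) $\exists \varphi \in D$ such that $T\varphi = \varphi$.

Proof. We show that a) ⇒ b) ⇒ c) ⇒ d) ⇒ e) ⇒ a).

a) ⇒ b) By Thm. 5.4, Chap. III in [34], there exists $\omega \in \mathrm{Tr}(H)_+$ such that $\forall \varepsilon > 0\ \exists \delta(\varepsilon)$ with the following property: ($x \in B(H)$, $\|x\|_\infty \le 1$ and $\omega(x^*x + xx^*) < \delta$) $\Rightarrow$ $\forall n\ |T^n\varphi(x)| < \varepsilon$. Let us take $\varepsilon = \frac{1}{2}$ and denote the corresponding $\delta$ by $\delta_0$. Put $\varphi_0 = \frac{2}{\delta_0}\,\omega$. Since $T^n\varphi - \varphi_0 \in \mathrm{Tr}(H)_{SA}$, we have $(T^n\varphi - \varphi_0)_+ = (T^n\varphi - \varphi_0) f_n$, where $f_n$ is the support projection of $(T^n\varphi - \varphi_0)_+$. Hence
$$\|(T^n\varphi - \varphi_0)_+\|_1 = T^n\varphi(f_n) - \varphi_0(f_n) = T^n\varphi(f_n) - \frac{2}{\delta_0}\,\omega(f_n).$$
If $\omega(f_n) < \frac{\delta_0}{2}$, then $\forall k\ T^k\varphi(f_n) < \frac{1}{2}$ and so $\|(T^n\varphi - \varphi_0)_+\|_1 < \frac{1}{2}$. If $\omega(f_n) \ge \frac{\delta_0}{2}$, then $\|(T^n\varphi - \varphi_0)_+\|_1 = 0$ because $T^n\varphi(f_n) \le 1$ for all $n$. Therefore
$$\limsup_{n\to\infty} \|(T^n\varphi - \varphi_0)_+\|_1 \le \frac{1}{2}.$$

b) ⇒ c) Since $\varphi_0 \in \mathrm{Tr}(H)_+$, for any $\varepsilon > 0$ there exists $e \in P_f(H)$ such that $\varphi_0(e^\perp) < \varepsilon$, $e^\perp = 1 - e$. Because
$$\limsup_{n\to\infty} \|(T^n\varphi - \varphi_0)_+\|_1 < 1,$$
there exist $n_0$ and $\lambda$ such that $\mathrm{Tr}\,(T^n\varphi - \varphi_0)_+ \le \lambda < 1$ for all $n \ge n_0$. Let us take $\varepsilon = \frac{1}{2}(1 - \lambda)$. Because $(T^n\varphi - \varphi_0)(e^\perp) \le (T^n\varphi - \varphi_0)_+(e^\perp) \le \lambda$, we get
$$T^n\varphi(e^\perp) \le \lambda + \varphi_0(e^\perp) < \frac{1}{2}(1 + \lambda).$$

c) ⇒ d) This is clear because for any Lim we have $\mathrm{Lim}(a_n) \le \limsup_{n\to\infty} a_n$.

d) ⇒ e) Although the proof of this implication is similar to that given by Socała in the commutative case of the space $L^1(X, \mu)$ [33], we present it for the reader's convenience. Let us define $\omega(x) = \mathrm{Lim}(T^n\varphi(x))$ for $x \in B(H)$. Clearly $\omega$ is a state on $B(H)$. Let $\omega = \omega_n + \omega_s$ be the unique decomposition of $\omega$ into a normal and a singular part. It is evident that both $\omega_n$ and $\omega_s$ are positive. Moreover, $\omega_s(e) = 0$ for all $e \in P_f(H)$, and $\omega_s$ does not majorize nontrivial normal functionals, i.e. if $\varphi \in \mathrm{Tr}(H)_+$ and $\varphi \le \omega_s$, then $\varphi = 0$. Furthermore, $\omega_n$ is the biggest normal functional that is majorized by $\omega$, i.e. if $\varphi \in \mathrm{Tr}(H)_+$ and $\varphi \le \omega$, then $\varphi \le \omega_n$. To see this it suffices to notice that $\varphi(e) \le \omega_n(e)$ for every $e \in P_f(H)$. Next we show that $\omega_n$ is a fixed point of $T$. Since $\mathrm{Lim}(a_1, a_2, \dots) = \mathrm{Lim}(a_2, a_3, \dots)$, for any $x \in B(H)_+$ we have
$$\omega_n \circ T^*(x) = \omega_n(T^*x) \le \omega(T^*x) = \omega(x),$$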

where $T^*$ denotes the adjoint operator. Hence $\omega_n \circ T^* \le \omega$ and so $\omega_n \circ T^* \le \omega_n$. Since $T^*1 = 1$, for any $e \in P(H)$ we have $\omega_n(T^*e) + \omega_n(T^*e^\perp) = \omega_n(e) + \omega_n(e^\perp)$, which implies $\omega_n(T^*e) = \omega_n(e)$. Therefore $\omega_n \circ T^* = \omega_n$. To finish, we show that $\omega_n \ne 0$. Suppose the contrary is true. Then $\omega = \omega_s$. By assumption there exists $e \in P_f(H)$ such that $\mathrm{Lim}(T^n\varphi(e^\perp)) < 1$. Because $\omega(e) = 0$, we get $\omega(e^\perp) = 1 = \mathrm{Lim}(T^n\varphi(e^\perp))$, a contradiction. Hence $\omega_n / \|\omega_n\|_1$ is an invariant density.

e) ⇒ a) This is clear. □

Theorem 2. For a Markov operator $T$ the following conditions are equivalent:
a) $\forall \varphi \in D$, $0$ is a weak$^*$ limit point of $\{T^n\varphi\}$.
b) $\forall \varphi \in D\ \forall e \in P_f(H)$: $\liminf_{n\to\infty} T^n\varphi(e) = 0$.
c) $\exists \lambda < 1\ \forall \varphi \in D\ \forall e \in P_f(H)$: $\liminf_{n\to\infty} T^n\varphi(e) \le \lambda$.
d) $T$ does not have an invariant density.
e) $\forall \varphi \in D\ \forall e \in P_f(H)$: $\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi(e) = 0$.

Proof. Again we show that a) ⇒ b) ⇒ c) ⇒ d) ⇒ e) ⇒ a).

a) ⇒ b) Because $K(H)$ is separable, the weak$^*$ topology on the unit ball in $\mathrm{Tr}(H)$ is metrizable. Thus for any $x \in K(H)_+$ there exists a subsequence $\{n_k\}$ such that $\lim_{k\to\infty} T^{n_k}\varphi(x) = 0$. Since $P_f(H) \subset K(H)_+$, condition b) follows.

b) ⇒ c) This is obvious.

c) ⇒ d) Assume the contrary. Then there is $\varphi_0 \in D$ such that $T\varphi_0 = \varphi_0$. Choosing $e \in P_f(H)$ such that $\varphi_0(e) > \lambda$, we get a contradiction.

d) ⇒ e) Suppose e) is not true. Then there exist $\varphi_0 \in D$ and $e \in P_f(H)$ such that
$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi_0(e) = \delta > 0.$$
In the same way as in the classical case of the space $L^1(X, \mu)$ [25], we construct a Banach limit Lim such that
$$\mathrm{Lim}(T^n\varphi_0(e^\perp)) = \liminf_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi_0(e^\perp) = 1 - \delta.$$
However, by point d) in Thm. 1, this leads to the existence of an invariant density for $T$, which contradicts the assumption.

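e) ⇒ a) First we consider $x \in K(H)_+$. For any $\varepsilon > 0$ there exist $e \in P_f(H)$ and a constant $C > 0$ such that $0 \le xe = ex \le Ce$ and $\|e^\perp x\|_\infty \le \varepsilon$. Thus
$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi(x) = \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} \left[T^k\varphi(ex) + T^k\varphi(e^\perp x)\right] \le C \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi(e) + \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} \|T^k\varphi\|_1 \|e^\perp x\|_\infty \le \varepsilon.$$
Because $\varepsilon$ was arbitrary we obtain that
$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} T^k\varphi(x) = 0.$$
However, $T^k\varphi(x) \ge 0$ and hence $\liminf_{n\to\infty} T^n\varphi(x) = 0$. For arbitrary $x \in K(H)$ we make the following estimate: $|T^n\varphi(x)| = |T^n\varphi(x_1^+ - x_1^- + i x_2^+ - i x_2^-)| \le T^n\varphi(x_1^+ + x_1^- + x_2^+ + x_2^-)$, which ends the proof. □

Following the classical case we call a Markov operator $T$ sweeping if $\forall \varphi \in D\ \forall e \in P_f(H)$
$$\lim_{n\to\infty} T^n\varphi(e) = 0.$$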
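The sweeping condition can be made concrete with the bilateral shift, which the paper uses later as an example. The following Python sketch is only an illustration under stated assumptions: vectors in $\ell^2(\mathbb{Z})$ with finite support are stored as dictionaries, $T(\varphi) = U\varphi U^*$ with $U$ the shift, $\varphi = |v\rangle\langle v|$, and $e$ is the projector onto the span of the basis vectors with index $|k| \le$ `radius`. The vector `v`, the radius and the function names are hypothetical choices. For such $\varphi$ and $e$ one has $T^n\varphi(e) = \|e\,U^n v\|^2$.

```python
def shift(vec, n=1):
    """Apply the bilateral shift U^n to a finitely supported vector in l^2(Z)."""
    return {k + n: a for k, a in vec.items()}

def window_weight(vec, radius):
    """<v, e v> for e the projector onto span{delta_k : |k| <= radius}."""
    return sum(abs(a) ** 2 for k, a in vec.items() if abs(k) <= radius)

v = {0: 0.6, 1: 0.8}   # hypothetical unit vector supported on {0, 1}

# T^n phi(e) for phi = |v><v| and n = 0, ..., 7
weights = [window_weight(shift(v, n), radius=3) for n in range(8)]
print(weights)
```

Once the support of $U^n v$ has left the window, the weight on $e$ is exactly zero, which is the behaviour that the definition of sweeping demands in the limit for all densities and all finite-dimensional projectors.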

Then we have the following.

Proposition 3. For the following conditions:
a) $T$ is sweeping,
b) $T$ does not have an invariant density,
it is evident that a) ⇒ b).

3. The Decomposition of T

Our next objective is to obtain a decomposition of the Banach space $\mathrm{Tr}(H)$ into an isometric and a sweeping part for $T$. The relation of such a decomposition to the Jacobs–deLeeuw–Glicksberg splitting will be discussed. We will also try to find a condition on $T$ which establishes the equivalence between points a) and b) in Prop. 3 and thus shows the validity of the Foguel alternative ($T$ has an invariant density or is sweeping) in the noncommutative case. We consider here only discrete time semigroups, since the generalization to the continuous case is straightforward. In order to achieve our goals we assume that:
(i) $T$ is contractive in both the trace and operator norms,
(ii) $T$ is 2-positive.

The last requirement means that $T \otimes \mathrm{id} : \mathrm{Tr}(H) \otimes M_{2\times 2} \to \mathrm{Tr}(H) \otimes M_{2\times 2}$ maps positive operators (acting on $H \oplus H$) into positive operators. Notice that we do not require that $T$ preserves Tr. However, since $T$ is contractive in the trace norm, $\mathrm{Tr}\,T\varphi \le \mathrm{Tr}\,\varphi$ for all $\varphi \in \mathrm{Tr}(H)_+$. The class of operators introduced above appears naturally in quantum dynamical systems. Assume there is a completely positive and normal operator $T^*$ acting on $B(H)$. Then $T^*(A) = \sum_i V_i^* A V_i$, where $\sum_i V_i^* V_i$ is strongly convergent. Assume moreover that $T^*(1) \le 1$, where $1$ denotes the identity operator on $H$, and that Tr is subinvariant for $T^*$. Then the preadjoint of $T^*$, say $T : \mathrm{Tr}(H) \to \mathrm{Tr}(H)$, $T(\varphi) = \sum_i V_i \varphi V_i^*$, satisfies (i) and (ii). Indeed, $T^*(1) \le 1$ implies that $\sum_i V_i^* V_i \le 1$, and $\mathrm{Tr}\,T^*\varphi \le \mathrm{Tr}\,\varphi$ gives us $\sum_i V_i V_i^* \le 1$. Therefore $T$ is contractive in the trace norm and completely positive. It is also a contraction in the operator norm. If $\sum_i V_i^* V_i = 1$, then the operator $T^*$ is called a dynamical map (or a dynamical semigroup in the case of continuous time); when, in addition, $\sum_i V_i V_i^* = 1$, then $T^*$ is said to be a doubly stochastic dynamical map. We start with the following.

Lemma 4. Suppose $T$ satisfies (i) and (ii). Let $T^*$ be its adjoint. Then $T$ and $T^*$ (together with their restrictions and extensions)
$$T : K(H) \to K(H), \quad T^* : B(H) \to B(H), \quad T, T^* : HS(H) \to HS(H), \quad T, T^* : \mathrm{Tr}(H) \to \mathrm{Tr}(H),$$
are contractions in all the above spaces. Moreover both $T$ and $T^*$ are strongly positive, $T^*$ is normal and $T$ extends uniquely to a normal and contractive operator in $B(H)$, which we denote also by $T$. We recall that a hermitian operator $B$ acting on a $C^*$-algebra $A$ is strongly positive if $B(x^*x) \ge B(x)^*B(x)$ for all $x \in A$.

Proof. The first line is clear since $T$ is contractive in the operator norm and $\mathrm{Tr}(H)$ is dense in $K(H)$ in the operator norm. Because $T$ is 2-positive and contractive in $K(H)$, and $K(H)$ has an approximate identity, $T$ is strongly positive on $K(H)$ [11].
Therefore, for any $\varphi \in \mathrm{Tr}(H)$ we have $\|T\varphi\|_2^2 = \mathrm{Tr}\,(T\varphi)^* T\varphi \le \mathrm{Tr}\,T(\varphi^*\varphi) \le \|\varphi\|_2^2$, and so $T$ extends to a contraction in $HS(H)$. Next we consider $T^*$. Because $(T \otimes \mathrm{id}_{2\times 2})^* = T^* \otimes \mathrm{id}_{2\times 2}$ is positive, $T^*$ is 2-positive. Since $\mathrm{Tr}\,T\varphi \le \mathrm{Tr}\,\varphi$ for any $\varphi \in \mathrm{Tr}(H)_+$, we have $T^*(1) \le 1$. Hence $T^*$ is also strongly positive. Assume now $\varphi \in \mathrm{Tr}(H)_+$. Then
$$\mathrm{Tr}\,T^*\varphi = \lim_{n\to\infty} \mathrm{Tr}\,(T^*\varphi) e_n,$$
where $\{e_n\}$ is an increasing sequence of commuting finite-dimensional projectors such that $e_n \to 1$. But $\mathrm{Tr}\,(T^*\varphi) e_n = \mathrm{Tr}\,\varphi\, T(e_n) \le \mathrm{Tr}\,\varphi$, since $T(e_n) \le 1$. Hence $T^*$ reduces to a bounded operator on $\mathrm{Tr}(H)$. Moreover $T^*|_{\mathrm{Tr}(H)}$, being strictly positive, extends to a contraction on $HS(H)$. Finally, we define $T^{**} = (T^*|_{\mathrm{Tr}(H)})^*$. Because $T^{**}1 \le 1$ and $T^{**}$ is strongly positive, it is also contractive. This implies that $T^*|_{\mathrm{Tr}(H)}$ is a contraction too. To finish, observe that $T^{**}$ is normal and coincides with $T$ on compact operators. Thus, it is the unique normal extension of $T$ onto $B(H)$. □
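The two operator inequalities $\sum_i V_i^* V_i \le 1$ and $\sum_i V_i V_i^* \le 1$ from the discussion preceding Lemma 4 are easy to check for small matrices. The Python sketch below is an illustration only: the two families `dephasing` and `damping`, the parameters `P` and `G`, and all helper names are assumptions introduced here, not objects from the paper. It computes both sums for each family.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def mat_sum(mats):
    """Entrywise sum of a non-empty list of equally sized matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]

def kraus_sums(kraus):
    """Return (sum_i V_i^* V_i, sum_i V_i V_i^*)."""
    left = mat_sum([matmul(dagger(v), v) for v in kraus])
    right = mat_sum([matmul(v, dagger(v)) for v in kraus])
    return left, right

def is_identity(m, tol=1e-12):
    """True if m equals the identity matrix up to the tolerance."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

P = 0.25  # hypothetical dephasing probability
dephasing = [
    [[math.sqrt(1 - P), 0.0], [0.0, math.sqrt(1 - P)]],
    [[math.sqrt(P), 0.0], [0.0, -math.sqrt(P)]],
]

G = 0.3   # hypothetical damping strength
damping = [
    [[1.0, 0.0], [0.0, math.sqrt(1 - G)]],
    [[0.0, math.sqrt(G)], [0.0, 0.0]],
]

deph_left, deph_right = kraus_sums(dephasing)
damp_left, damp_right = kraus_sums(damping)
print(is_identity(deph_left), is_identity(deph_right))
print(is_identity(damp_left), is_identity(damp_right))
```

In this sketch the dephasing family gives the identity for both sums, so it is doubly stochastic in the sense above. The damping family satisfies $\sum_i V_i^* V_i = 1$, but $\sum_i V_i V_i^*$ has an eigenvalue larger than one, so the corresponding $T$ is not contractive in the operator norm and falls outside assumption (i).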

By the above lemma $T$ is a contraction in the Hilbert space $HS(H)$. It is well known that $HS(H)$ can be decomposed into a unitary subspace $K$ of $T$,
$$K = \{x \in HS(H) : \|T^n x\|_2 = \|T^{*n} x\|_2 = \|x\|_2 \ \ \forall n \in \mathbb{N}\},$$
and its orthogonal complement $K^\perp$. $K$ is a closed linear subspace and both $K$ and $K^\perp$ are $T$ and $T^*$ invariant. Moreover $T^*Tx = TT^*x = x$ for all $x \in K$. In addition, for $x \in K^\perp$ we have
$$w\text{-}\lim_{n\to\infty} T^n x = w\text{-}\lim_{n\to\infty} T^{*n} x = 0.$$
In the following proposition we collect some properties of the subspace $K$:

Proposition 5. Suppose (i), (ii) hold. Then:
a) $x \in K \Rightarrow x^* \in K$,
b) $x = x^* \in K \Rightarrow |x|, x^+, x^- \in K$,
c) $x, y \in K \Rightarrow x \cdot y \in K$,
d) if $x = x^* \in K$, then $x = \sum_i \lambda_i e_i$, $\lambda_i \ne 0$, and the projectors $e_i \in K$ for all $i$,
e) $x \in K \Rightarrow |x| \in K$.

Proof. a) It is clear because $x \in K$ iff $T^{*n}T^n x = T^n T^{*n} x = x$ for all $n \in \mathbb{N}$.

b) Since, by assumption, $x = x^*$, we have $-|x| \le x \le |x|$ and hence $-T^n|x| \le T^n x \le T^n|x|$. Since $\|\cdot\|_2$ is an absolutely monotone norm, $\|x\|_2 = \|T^n x\|_2 \le \|T^n|x|\|_2 \le \||x|\|_2 = \|x\|_2$, which implies $\|T^n|x|\|_2 = \||x|\|_2$. Because the same equality holds for $T^*$, $|x| \in K$.

c) First we show that if $x \in K$, then also $x^*x \in K$. Since $T$ and $T^*$ are strongly positive, $T^{*n}T^n(x^*x) \ge (T^{*n}T^n x)^* T^{*n}T^n x = x^*x$ and
$$\|x^*x\|_2 \le \|T^{*n}T^n(x^*x)\|_2 \le \|T^n(x^*x)\|_2 \le \|x^*x\|_2.$$
It implies that $\|T^n(x^*x)\|_2 = \|x^*x\|_2$ and, by a similar argument, $\|T^{*n}(x^*x)\|_2 = \|x^*x\|_2$. To show that $xy \in K$ it suffices to consider only hermitian $x$ and $y$. But then $2xy = (x+y)^2 - x^2 - y^2 + i[(x - iy)^*(x - iy) - x^2 - y^2]$, which implies $xy \in K$.

d) First we rearrange the spectral decomposition of $x$ in such a way that $|\lambda_1| > |\lambda_2| > \dots$. Then
$$\left(\frac{x}{\lambda_1}\right)^n = e_1 + \sum_{i \ge 2} \left(\frac{\lambda_i}{\lambda_1}\right)^n e_i$$
belongs to $K$. But $|\lambda_i / \lambda_1|$ are strictly decreasing and less than 1, so $\sum_{i \ge 2} (\lambda_i/\lambda_1)^n e_i \to 0$ in $HS(H)$ when $n \to \infty$. Therefore $e_1 \in K$ and, by induction, $e_i \in K$ for all $i$.

e) By point c), $|x|^2 \in K \cap \mathrm{Tr}(H)_+$ and using d) we obtain that also $|x| \in K$. □

By this proposition we see that $K$ is generated by projectors, necessarily finite-dimensional. The collection of them we denote by $P(K)$. The next proposition collects some properties of the operators $T$ and $T^*$ when restricted to the set $P(K)$.
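The defining condition of the unitary subspace $K$ can be explored numerically on $2 \times 2$ matrices. The sketch below is a hedged illustration, not a construction from the paper: it assumes the dephasing map $T(x) = (1-p)\,x + p\,ZxZ$ with $Z = \mathrm{diag}(1,-1)$, which is self-adjoint for the Hilbert–Schmidt inner product, so checking $\|T^n x\|_2$ alone suffices here. The parameter `P`, the test matrices and the function names are hypothetical.

```python
import math

def dephase(x, p):
    """T(x) = (1 - p) x + p Z x Z with Z = diag(1, -1), for a 2x2 matrix x."""
    zxz = [[x[0][0], -x[0][1]], [-x[1][0], x[1][1]]]
    return [[(1 - p) * x[i][j] + p * zxz[i][j] for j in range(2)]
            for i in range(2)]

def hs_norm(x):
    """Hilbert-Schmidt norm of a 2x2 matrix."""
    return math.sqrt(sum(abs(x[i][j]) ** 2 for i in range(2) for j in range(2)))

def norms_along_orbit(x, p, steps):
    """List of ||T^n x||_2 for n = 0, ..., steps."""
    out = []
    for _ in range(steps + 1):
        out.append(hs_norm(x))
        x = dephase(x, p)
    return out

P = 0.25                                  # hypothetical dephasing probability
diagonal = [[2.0, 0.0], [0.0, -1.0]]      # candidate element of K
coherence = [[0.0, 1.0], [1.0, 0.0]]      # purely off-diagonal element

diag_norms = norms_along_orbit(diagonal, P, 5)
coh_norms = norms_along_orbit(coherence, P, 5)
print(diag_norms)
print(coh_norms)
```

For this particular map the norm of the diagonal matrix is constant along the orbit, while the off-diagonal matrix shrinks by the factor $1 - 2p$ at each step. This is consistent with $K$ being the commutative algebra of diagonal matrices, generated by the two diagonal projectors, as Prop. 5 leads one to expect.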

Proposition 6. Assume that (i) and (ii) hold. Then:
a) $e \in P(K) \Rightarrow T(e), T^*(e) \in P(K)$ and $\dim T(e) = \dim T^*(e) = \dim e$,
b) $T$ and $T^*$ are bijective on $P(K)$,
c) if $e, f \in P(K)$ and $e \perp f$, then also $T(e) \perp T(f)$, $T^*(e) \perp T^*(f)$,
d) if $e, f \in P(K)$, then $e \wedge f, e \vee f \in K$, too.

Proof. a) For $\varphi \in \mathrm{Tr}(H)_+$, $\mathrm{Tr}\,T\varphi \le \mathrm{Tr}\,\varphi$ and $\mathrm{Tr}\,T^*\varphi \le \mathrm{Tr}\,\varphi$, so it suffices to consider the operator $T$ only. Since $T$ is contractive in the operator norm, $T(e) = \sum_i a_i P_i$, $a_i \in (0, 1]$, where $P_i$ are one-dimensional orthogonal projectors not necessarily belonging to $K$. Hence $\mathrm{Tr}\,T(e) = \sum_i a_i \le \dim e$. However,
$$\sum_i a_i^2 = \|T(e)\|_2^2 = \|e\|_2^2 = \dim e.$$
So $\sum_i a_i \le \sum_i a_i^2$, which implies $a_i = 1$ for all $i$.

b) It is evident that $T$ is one-to-one. Because $TT^*e = e$, any $e \in P(K)$ belongs to $\mathrm{im}\,T|_{P(K)}$.

c) Since $e \perp f$, we have $\|e\|_2^2 + \|f\|_2^2 = \|e + f\|_2^2 = \|T(e) + T(f)\|_2^2 = \mathrm{Tr}\,(T(e) + T(f))^2 = \|e\|_2^2 + \|f\|_2^2 + 2\,\mathrm{Tr}\,T(e)T(f)$. Therefore $T(e)T(f) = 0$.

d) Let $\mathrm{rp}(x)$ denote the range projector of $x \in HS(H)$. Because $\mathrm{rp}(x) = \mathrm{rp}|x^*|$ and, by Prop. 5 e), $|x^*| \in K$, also $\mathrm{rp}(x) \in K$ for any finite-dimensional $x \in K$. But $e \vee f = f + \mathrm{rp}(e - fe)$ and $e \wedge f = e - \mathrm{rp}(e - ef)$, so both $e \vee f$ and $e \wedge f$ belong to $K$. □

Proposition 7. Suppose (i) and (ii) hold. Then:
a) $x \in K \Rightarrow |T(x)| = T(|x|)$,
b) $x, y \in K \Rightarrow T(xy) = T(x)T(y)$.

Proof. a) First observe that $T(|x|^2) = (T|x|)^2$ since, by Prop. 6 c), $T$ maps orthogonal projectors to orthogonal ones. Therefore $|Tx|^2 = (Tx)^*Tx \le T(x^*x) = (T|x|)^2$. It implies that for any $v \in H$, $\langle v, |Tx|^2 v\rangle \le \langle v, (T|x|)^2 v\rangle$. However, $\|Tx\|_2 = \|x\|_2 = \||x|\|_2 = \|T|x|\|_2$, so
$$\sum_i \langle v_i, |Tx|^2 v_i\rangle = \sum_i \langle v_i, (T|x|)^2 v_i\rangle,$$
where $\{v_i\}$ is an orthonormal basis in $H$. Thus $\langle v, |Tx|^2 v\rangle = \langle v, (T|x|)^2 v\rangle$ for any $v \in H$, and so $|Tx|^2 = (T|x|)^2$.

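b) Above we showed that $(Tx)^*Tx = T(x^*x)$ for all $x \in K$. Because $T$ is strongly positive on $K(H)$, for any $\varphi \in \mathrm{Tr}(H)_+$ the sesquilinear form $b_\varphi : K(H) \times K(H) \to \mathbb{C}$ given by $b_\varphi(x, y) = \varphi[T(x^*y) - (Tx)^*Ty]$ is positive. Hence $b_\varphi(x, x) = 0$ implies $b_\varphi(x, y) = 0$ for all $y \in K(H)$. Because for $x \in K$, $b_\varphi(x, y) = 0$ for all $\varphi \in \mathrm{Tr}(H)_+$, we get $T(xy) = T(x)T(y)$ for all $x \in K$ and $y \in K(H)$. □

Now, assuming (i) and (ii), we formulate the decomposition theorem for $T$.

Theorem 8. In the Banach space $\mathrm{Tr}(H)$ there exist two closed $T$-invariant subspaces $\mathrm{Tr}(H)_{iso}$ and $\mathrm{Tr}(H)_s$ such that:
a) $\mathrm{Tr}(H)_{iso}$ and $\mathrm{Tr}(H)_s$ are $*$-invariant,
b) $\mathrm{Tr}(H)_{iso} \perp \mathrm{Tr}(H)_s$ in the following sense: $\forall \varphi \in \mathrm{Tr}(H)_{iso}\ \forall \psi \in \mathrm{Tr}(H)_s$ there is $\mathrm{Tr}\,\varphi\psi = 0$,
c) $\mathrm{Tr}(H)_{iso}$ is generated by projectors,
d) $\mathrm{Tr}(H) = \mathrm{Tr}(H)_{iso} \oplus \mathrm{Tr}(H)_s$, $T = T_1 \oplus T_2$,
e) $T_1$ is an invertible isometry while $w^*\text{-}\lim_{n\to\infty} T_2^n\psi = 0$ for any $\psi \in \mathrm{Tr}(H)_s$. Hence $T_2$ is sweeping.

Proof. Define $\mathrm{Tr}(H)_{iso} = K \cap \mathrm{Tr}(H)$ and $\mathrm{Tr}(H)_s = K^\perp \cap \mathrm{Tr}(H)$. Then a), b) and c) follow.

d) From the very definition we have that $\mathrm{Tr}(H)_{iso} \cap \mathrm{Tr}(H)_s = 0$. Suppose $\varphi \in \mathrm{Tr}(H)_+$. Then $\varphi = \varphi_1 + \varphi_2$, where $\varphi_1 \in K$ and $\varphi_2 \in K^\perp$. Clearly both $\varphi_1$ and $\varphi_2$ are hermitian. Hence, assuming that $\varphi_1 \ne 0$, $\varphi_1 = \sum_i a_i e_i$, $a_i \ne 0$ and $e_i \in K$ for all $i$. Since $\varphi_2 \in K^\perp$, for every index $i$, $\mathrm{Tr}\,e_i\varphi_2 = \mathrm{Tr}\,e_i\varphi - a_i \mathrm{Tr}\,e_i = 0$. Thus $a_i = \frac{\mathrm{Tr}\,e_i\varphi}{\mathrm{Tr}\,e_i}$ and so $a_i > 0$. It means that $\varphi_1 \ge 0$ and
$$\|\varphi_1\|_1 = \mathrm{Tr}\,\varphi_1 = \sum_i a_i \mathrm{Tr}\,e_i = \sum_i \mathrm{Tr}\,e_i\varphi \le \mathrm{Tr}\,\varphi = \|\varphi\|_1.$$
Therefore $\varphi_1 \in K \cap \mathrm{Tr}(H)_+$ and $\varphi_2 \in K^\perp \cap \mathrm{Tr}(H)_{SA}$. Because the positive cone $\mathrm{Tr}(H)_+$ is generating, the first assertion follows. Finally, since $T$ is contractive in the trace norm, $T = T_1 \oplus T_2$, where $T_1$ ($T_2$) is the restriction of $T$ to $\mathrm{Tr}(H)_{iso}$ ($\mathrm{Tr}(H)_s$) respectively.

e) Since $TT^* = T^*T = \mathrm{id}$ on $K$, $T_1$ is invertible. By Prop. 7 a), we have that $T_1|\varphi| = |T_1\varphi|$ for all $\varphi \in \mathrm{Tr}(H)_{iso}$. By Prop. 6 a), $\mathrm{Tr}\,T_1|\varphi| = \mathrm{Tr}\,|\varphi|$, hence $\|T_1\varphi\|_1 = \|\varphi\|_1$. Since for any $\psi \in \mathrm{Tr}(H)_s$ and $x \in HS(H)$, $\lim_{n\to\infty} \mathrm{Tr}\,xT_2^n\psi = 0$, and $HS(H)$ is dense in $K(H)$ in the operator norm, also $w^*\text{-}\lim_{n\to\infty} T_2^n\psi = 0$. □

In this way we obtained a new decomposition, say the isometric-sweeping one, for the operator $T$, $\mathrm{Tr}(H) = \mathrm{Tr}(H)_{iso} \oplus \mathrm{Tr}(H)_s$. Let us now recall the definition of the reversible part of $T$:
$$\mathrm{Tr}(H)_r = \mathrm{Lin}\{\varphi \in \mathrm{Tr}(H) : T\varphi = e^{i\alpha}\varphi \text{ for some } \alpha \in \mathbb{R}\}.$$
An advantage of the space $\mathrm{Tr}(H)_{iso}$ over $\mathrm{Tr}(H)_r$ follows from the fact that it can be nonzero while $\mathrm{Tr}(H)_r = 0$. For example, if $U$ is a bilateral shift with multiplicity one and $T(\varphi) = U\varphi U^*$, then $\mathrm{Tr}(H)_{iso} = \mathrm{Tr}(H)$ while $\mathrm{Tr}(H)_r = 0$. In general, the following holds.

Proposition 9. $\mathrm{Tr}(H)_r \subset \mathrm{Tr}(H)_{iso}$.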

Proof. Suppose $T\varphi = e^{i\alpha}\varphi$, $\varphi \in \mathrm{Tr}(H)$. Then
$$\|e^{-i\alpha}\varphi - T^*\varphi\|_2^2 = \|\varphi\|_2^2 - \langle e^{-i\alpha}\varphi, T^*\varphi\rangle_{HS} - \langle T^*\varphi, e^{-i\alpha}\varphi\rangle_{HS} + \|T^*\varphi\|_2^2 \le 0,$$
and so $T^*\varphi = e^{-i\alpha}\varphi$. Therefore $\varphi \in K \cap \mathrm{Tr}(H)$. □

It implies in particular that if there is a $T$-invariant density $\varphi$, then $\varphi$ and, by Prop. 5 d), its spectral projectors corresponding to strictly positive eigenvalues belong to $\mathrm{Tr}(H)_{iso}$. When the set $\{T^n\}$ is relatively compact in $B(\mathrm{Tr}(H))$ in the weak operator (wo) topology, then the Jacobs–deLeeuw–Glicksberg splitting holds: $\mathrm{Tr}(H) = \mathrm{Tr}(H)_r \oplus \mathrm{Tr}(H)_0$, where $\mathrm{Tr}(H)_0$ is given by
$$\mathrm{Tr}(H)_0 = \{\varphi \in \mathrm{Tr}(H) : 0 \text{ is a weak limit point of } \{T^n\varphi\}\}.$$
Both $\mathrm{Tr}(H)_r$ and $\mathrm{Tr}(H)_0$ are $T$-invariant closed subspaces. In our setting more can be said.

Proposition 10. Suppose $\{T^n\}$ is wo-relatively compact. Then:
a) $\mathrm{Tr}(H)_r$ and $\mathrm{Tr}(H)_0$ are $*$-invariant.
b) $\mathrm{Tr}(H)_r \perp \mathrm{Tr}(H)_0$ in the sense of Thm. 8 b).
c) Suppose $\psi \in \mathrm{Tr}(H)$. If $\psi \perp \mathrm{Tr}(H)_r$, then $\psi \in \mathrm{Tr}(H)_0$.

Proof. a) Let $\mathcal{T}$ denote the weak operator closure of the set $\{T^n\}$. Clearly $\mathcal{T}$ is an abelian, wo-compact semi-topological semigroup. Let $Q_1$ be the unit in the kernel of $\mathcal{T}$. Then $\mathrm{Tr}(H)_r = Q_1\mathrm{Tr}(H)$ and $\mathrm{Tr}(H)_0 = (I - Q_1)\mathrm{Tr}(H)$, see [26] for the definitions and more details. Because $Q_1$ is a wo-limit point of $\{T^n\}$ and $T^n(\varphi^*) = (T^n\varphi)^*$, the same property holds for $Q_1$.

b) Since $T$ and $T^*$ are contractions in $HS(H)$, both $\{T^n\}$ and $\{T^{*n}\}$ are wo-relatively compact in $B(HS(H))$. Let $\mathcal{T}_2$ ($\mathcal{T}_2^*$) denote the wo-closure of $\{T^n\}$ ($\{T^{*n}\}$) in $B(HS(H))$ and $Q_2$ ($\tilde{Q}_2$) be the unit in the kernel of $\mathcal{T}_2$ ($\mathcal{T}_2^*$) respectively. Since $(\mathcal{T}_2)^* = \mathcal{T}_2^*$, we have $\tilde{Q}_2 = Q_2^*$. However, the reversible parts of $T$ and $T^*$ in $HS(H)$ coincide, hence $\mathrm{im}\,Q_2 = \mathrm{im}\,\tilde{Q}_2 = \mathrm{im}\,Q_2^*$. Because $Q_2^2 = Q_2$ and $Q_2^{*2} = Q_2^*$, we get $Q_2^*Q_2 = Q_2$ and $Q_2Q_2^* = Q_2^*$. Therefore $(Q_2 - Q_2^*)^2 = 0$, which implies that $Q_2 = Q_2^*$. Thus, for any $x \in HS(H)_r$ and $y \in HS(H)_0$ we have $\langle x, y\rangle_{HS} = \langle Q_2 x, (I - Q_2)y\rangle_{HS} = 0$. However, $\mathrm{Tr}(H)_r \subset HS(H)_r$ and $\mathrm{Tr}(H)_0 \subset HS(H)_0$, hence the assertion follows.

c) $\psi = \psi_1 + \psi_2$, where $\psi_1 \in \mathrm{Tr}(H)_r$ and $\psi_2 \in \mathrm{Tr}(H)_0$. By the assumption, $\mathrm{Tr}\,\psi\,\mathrm{Tr}(H)_r = 0$. Because $\psi_1^* \in \mathrm{Tr}(H)_r$ and, by point b), $\mathrm{Tr}\,\psi_2\psi_1^* = 0$, we get $\mathrm{Tr}\,\psi_1\psi_1^* = 0$. Hence $\psi_1 = 0$, which ends the proof. □

If $\{T^n\}$ belongs to the class of strongly operator (so) relatively compact semigroups, then the following theorem holds.

Theorem 11. If $\{T^n\}$ is so-relatively compact and $T$ satisfies conditions (i) and (ii), then the isometric-sweeping decomposition coincides with the Jacobs–deLeeuw–Glicksberg splitting, i.e. $\mathrm{Tr}(H)_r = \mathrm{Tr}(H)_{iso}$ and $\mathrm{Tr}(H)_0 = \mathrm{Tr}(H)_s$.

Proof. Suppose $\{T^n\}$ is so-relatively compact. Then, by Lemma 4.2 in [28], for any $\varphi \in \mathrm{Tr}(H)_0$, $\lim_{n\to\infty} \|T^n\varphi\|_1 = 0$. If $T$ satisfies (i) and (ii), then, by Prop. 9, $\mathrm{Tr}(H)_r \subset \mathrm{Tr}(H)_{iso}$. Thus
$$\psi \in \mathrm{Tr}(H)_s \Rightarrow \mathrm{Tr}\,\psi\,\mathrm{Tr}(H)_{iso} = 0 \Rightarrow \mathrm{Tr}\,\psi\,\mathrm{Tr}(H)_r = 0 \Rightarrow \psi \in \mathrm{Tr}(H)_0.$$
The last implication follows from Prop. 10 c). Hence $\mathrm{Tr}(H)_s \subset \mathrm{Tr}(H)_0$. Suppose now that $\mathrm{Tr}(H)_s \ne \mathrm{Tr}(H)_0$. We take $\psi \in \mathrm{Tr}(H)_0$ such that $\psi \notin \mathrm{Tr}(H)_s$. Let $\psi = \psi_1 + \psi_2$ be its isometric-sweeping decomposition, i.e. $\psi_1 \in \mathrm{Tr}(H)_{iso}$ and $\psi_2 \in \mathrm{Tr}(H)_s$ with $\psi_1 \ne 0$. Then $\psi_1 = \psi - \psi_2 \in \mathrm{Tr}(H)_0$ and so $\lim_{n\to\infty} \|T^n\psi_1\|_1 = 0$. On the other hand, by Thm. 8 e), $\|T^n\psi_1\|_1 = \|\psi_1\|_1 > 0$, a contradiction. Therefore $\mathrm{Tr}(H)_s = \mathrm{Tr}(H)_0$. Because, by Prop. 9, $\mathrm{Tr}(H)_r \subset \mathrm{Tr}(H)_{iso}$, the equality $\mathrm{Tr}(H)_r = \mathrm{Tr}(H)_{iso}$ holds. □

We saw that for $T(\varphi) = U\varphi U^*$, where $U$ is a bilateral shift, $\mathrm{Tr}(H)_{iso} = \mathrm{Tr}(H)$ and $\mathrm{Tr}(H)_s = 0$. Clearly, $\{T^n\}$ is not wo-relatively compact in this case. Here we present another example of a non-wo-relatively compact semigroup for which $\mathrm{Tr}(H)_{iso} = 0$ and $\mathrm{Tr}(H)_s = \mathrm{Tr}(H)$. Thus, a non-trivial isometric-sweeping decomposition for non-wo-relatively compact semigroups is also available.

Example 1 ([6, 7]). Let us consider a homogeneous configuration space $Q = G/K$, where $G$ is a Lie group and $K$ is a closed subgroup. Assume that both $G$ and $K$ are unimodular. Let $(\pi, H)$ be a unitary strongly continuous representation of $G$, such that for every $k \in K$, $\pi(k)v_0 = e^{ia(k)}v_0$ for some unit vector $v_0 \in H$. It follows that for each $q \in Q$ there is a one-dimensional projector $P_q = |\pi(g)v_0\rangle\langle\pi(g)v_0|$, where $[g] = q$. We also assume that the system of generalized coherent states $\{P_q\}$ is square integrable and normalized, i.e. $\int_Q d\alpha(q)P_q = 1$ in the strong sense, where $d\alpha$ is a unique $G$-invariant and $\sigma$-finite measure on $Q$. The quantum algebra $A_q = \{P_q, q \in Q\}''$ equals $B(H)$.
When the quantum system interacts with the classical environment, then the following master equation for the reduced density matrix appears [7]:
$$\dot\rho_t = -i[H, \rho_t] + \lambda \int_Q d\alpha(q)\,P_q\rho_t P_q - \lambda\rho_t.$$
Clearly, it generates a dynamical semigroup $T_t$ on $\mathrm{Tr}(H)$, which satisfies conditions (i) and (ii). Moreover $\{T_t\}$ is not wo-relatively compact. From the spectral properties of the generator of $T_t$ on $\mathrm{Tr}(H)$ and its extension on $K(H)$, it follows that $T_t$ is stable on $HS(H)$, i.e. $\lim_{n\to\infty} \|T^n x\|_2 = 0$ for every $x \in HS(H)$. Hence the subspace $K = 0$ and so $\mathrm{Tr}(H)_{iso} = 0$.

In order to obtain a more precise description of the isometric part of $T$ we use the fact that both $T$ and $T^*$ have normal and contractive (in the operator norm) extensions to $B(H)$. Because the case $K = 0$ is trivial we assume that $K \ne 0$. We start with the following definition: a projector $e \in P(K)$ is called minimal if in $P(K)$ there is no nontrivial subprojector of $e$. We denote the set of such projectors by $P^{\min}(K)$. By Prop. 6 d) $P^{\min}(K)$ generates $K$. Moreover we can find a strictly increasing sequence, possibly finite, of natural numbers $\{n_k\}$ such that $P^{\min}(K) = \bigcup_{k \ge 1} P_k^{\min}(K)$, where $P_k^{\min}(K)$ consists of those projectors $e \in P^{\min}(K)$ such that $\dim e = n_k$. It is evident that $T$ and $T^*$ are bijective maps on $P_k^{\min}(K)$ for each $k$. Next we define a von Neumann algebra $M$ as the closure in the strong operator topology of the space $A = \mathrm{Lin}\{e : e \in P^{\min}(K)\}$.

To see that $M$ is indeed a von Neumann algebra, it suffices to check that $A$ is a $*$-algebra. Suppose $e, f \in P^{\min}(K)$. Then both the hermitian and anti-hermitian part of $ef$ belong to $A$, so $ef \in A$ too. Moreover, all finite-dimensional projectors from $M$ belong to $P(K)$ (for if a net $x_\alpha \in A$ tends to a finite-dimensional projector $P \in M$ in the $\sigma$-strong topology, then $T^{*n}T^n(P) = T^nT^{*n}(P) = P$ for all $n \in \mathbb{N}$, and thus $P \in K$). Hence any $e \in P^{\min}(K)$ is also minimal in $M$.

Proposition 12. $P_k^{\min}(K) \perp P_l^{\min}(K)$ if $k \ne l$.

Proof. Let $e \in P_k^{\min}(K)$ and $f \in P_l^{\min}(K)$. Suppose that $ef \ne 0$. Then $e$ and $f$ have non-zero equivalent subprojectors. But this is impossible since $\dim e \ne \dim f$ and both $e$ and $f$ are minimal. □

Using the above results we decompose $M$ as follows. Let $A_k = \mathrm{Lin}\{e : e \in P_k^{\min}(K)\}$. It is also an algebra. To check this, let $e, f \in P_k^{\min}(K)$ and suppose that $ef \notin A_k$. Then there exists $e_1 \in P_l^{\min}(K)$, $l \ne k$, such that $fe_1 \ne 0$ or $ee_1 \ne 0$, a contradiction. Therefore $A_k$ is a $*$-algebra. Let $M_k$ be its closure in the strong operator topology. Then we have:

Proposition 13. $M = \oplus_k M_k$ and $M_k$ is of type I for all $k$.

Proof. Let $E$ and $E_k$ denote the unit in $M$ and $M_k$ respectively, that is $E$ ($E_k$) is a projector in $M$ ($M_k$) such that $EA = AE = A$ for all $A \in M$ ($M_k$). It is clear that $E_k = \vee\{e : e \in P_k^{\min}(K)\}$ and $E = \vee E_k$. Moreover, by Prop. 12, $E_kE_l = \delta_{kl}E_k$. Hence $E = \sum_k E_k$. Clearly, each $E_k \in Z(M)$, the center of $M$. Observe also that $M_k = ME_k$. Next, we show that for any non-zero projector $f \in Z(M_k)$ there exists $e \in P_k^{\min}(K)$ such that $ef = e$. Because $ef = fe$, $ef$ is a subprojector of $e$. However, $e$ is minimal, hence $ef = 0$ or $ef = e$. Suppose that $fe = 0$ for all $e \in P_k^{\min}(K)$. Then also $fM_k = 0$, a contradiction. So, by definition, $M_k$ is of type I. □

$M_k$ can be further decomposed as follows:

Proposition 14. $M_k = \oplus_n M_{kn}$ and $M_{kn}$ is a type I factor for all $n$.

Proof. Let $e \in P_k^{\min}(K)$ and let $C_k(e)$ denote its central cover in $M_k$. If $C_k(e) = E_k$, then $e$ is faithful. Since it is also minimal, $M_k$ is a type I factor, by Corollary 10 in [35]. In general, let $\{e_n\}$ be a maximal family of projectors in $P_k^{\min}(K)$ such that $C_k(e_n)C_k(e_m) = \delta_{nm}C_k(e_n)$ and $\sum_n C_k(e_n) = E_k$. It is clear that $\{e_n\}$ is countable and $e_ne_m = \delta_{nm}e_n$. Let $M_{kn} = M_kC_k(e_n)$. Then $M_k = \oplus_n M_{kn}$. Since $e_n \in M_{kn}$ is minimal and faithful, $M_{kn}$ is a type I factor. □

Corollary 15. $Z(M_k) = \sum_n \mathbb{C} \cdot C_k(e_n)$.
Now we describe the structure of the restriction of operator T to M. First we prove a lemma. Lemma 16. Mk is T and T ∗ invariant, T is an automorphism of Mk and there is a permutation σ of the set {en } such that T (Ck (en )) = Ck (σ (en )). Moreover T is an isomorphism from Mkn onto Mkσ (n) , where Mkσ (n) = Mk Ck (σ (en )).

258

R. Olkiewicz

Proof. Because T , T ∗ : Pkmin (K) → Pkmin (K) and both are normal, so they also map Mk → Mk . Clearly, T ∗ = T −1 and so T is an automorphism by Prop.7 b). As an automorphism T maps central projectors into central ones. Hence for any en there is exactly one em such that T (Ck (en )) = Ck (em ). We call it σ (en ). Clearly, the map σ is bijective on set {en }. Suppose now An ∈ Mkn . Hence An = ACk (en ) for some A ∈ Mk and so T (An ) ∈ Mkσ (n) . It is evident that T : Mkn → Mkσ (n) is onto. u t Theorem 17. T |Mk = TU ◦ Tσ , where TU (A1 , A2 , . . . ) = (U1∗ A1 U1 , U2∗ A2 U2 , . . . ), An , Un ∈ Mkn and Tσ (A1 , . . . ) = (u(σ −1 (1)1)∗ Aσ −1 (1) u(σ −1 (1)1), . . . ). where u(nσ (n)) is a partial isometry for all n. Proof. To simplify the notation we write cn = Ck (en ) and cσ (n) = Ck (σ (en )). Both Mkn and Mkσ (n) are homogeneous with the same degree of homogeneity. Hence cn and cσ (n) are equivalent, although in a bigger von Neumann algebra B(Ek H) = Ek B(H)Ek . Let u(nσ (n)) be a partial isometry in B(Ek H) such that u(nσ (n))∗ u(nσ (n)) = cσ (n) and u(nσ (n))u(nσ (n))∗ = cn . Hence u(nσ (n)) is an isometry from cσ (n) H onto cn H. Using the decomposition Ek H = ⊕n cn H we define a unitary operator V on Ek H by setting its coefficients Vnm : cm H → cn H, Vnm = δmσ (n) u(nσ (n)). Direct computations show that V ∗ V = V V ∗ = Ek . Let us define Tσ (A) = V ∗ AV and TU (A) = T (V AV ∗ ) for A ∈ Mk . Then [Tσ (A)]nm =

X X (V ∗ )nr (A)rs Vsm = (Vrn )∗ δrs Ar Vsm r,s

=

X

r,s

δnσ (r) δmσ (r) u(rσ (r))∗ Ar u(rσ (r))

r

= δnm u(σ −1 (n)n)∗ Aσ −1 (n) u(σ −1 (n)n). Thus Tσ (A) ∈ Mk , [Tσ (A)]11 = u(σ −1 (1)1)∗ Aσ −1 (1) u(σ −1 (1)1) and so on. By similar calculations we obtain that V cn V ∗ = cσ −1 (n) , which implies TU (cn ) = cn for all n. Therefore TU leaves the center Z(Mk ) pointwise invariant. By Corollary 2, Part III, Chap. 3 in [9], there exists a unitary operator U ∈ Mk such that TU (A) = U ∗ AU . Using Prop. 14 we see that U = ⊕n Un , where Un is unitary in Mkn . Thus TU (A1 , ...) = t (U1∗ A1 U1 , ...). Moreover, TU ◦ Tσ = T on Mk . u Using the above theorem we obtain the following result for space Tr(H)iso . Corollary 18. Tr(H)iso = ⊕k Tr(H)k , Tr(H)k · Tr(H)l = 0 if k 6 = l. T1 preserves each Tr(H)k and there exists a normal partial isometry U (k) such that T1 (φ) = U (k)∗ φU (k) for any φ ∈ Tr(H)k .

Proof. Define $\mathrm{Tr}(H)_k = \overline{A_k}$, the closure being taken in the $\|\cdot\|_1$-norm, with $A_k$ as introduced after Prop. 12. Then the first part follows. Let $U(k) = V(\oplus U_n)$, where $V$ and $U_n$ are as in Thm. 17. Then, for $\varphi \in \mathrm{Tr}(H)_k$,
$$U(k)^*\varphi U(k) = T_U(T_\sigma\varphi) = T\varphi = T_1\varphi.$$
Moreover, $U(k)^*U(k) = U(k)U(k)^* = E_k$. □

A simple example of an isometric-sweeping decomposition is given by putting $T(\varphi) = A\varphi A^*$, where $A$ is a contraction on a Hilbert space $H$ and $\varphi \in \mathrm{Tr}(H)$. Clearly assumptions (i) and (ii) hold in this case. Because for any contraction $A$ there is a decomposition $H = H_1 \oplus H_2$ such that both $H_1$ and $H_2$ are $A$-invariant and $A$ is unitary on $H_1$ and completely non-unitary on $H_2$, we have
$$\mathrm{Tr}(H)_{iso} = P\,\mathrm{Tr}(H)P, \qquad \mathrm{Tr}(H)_s = P^\perp\mathrm{Tr}(H)P + P\,\mathrm{Tr}(H)P^\perp + P^\perp\mathrm{Tr}(H)P^\perp,$$
where $P$ is the projector onto $H_1$ and $P^\perp$ is its orthogonal complement.

In the continuous case of the semigroup $T_t$ we obtain a more precise description. Because $T_t$ preserves the center $Z(M_k)$ pointwise, $V = E_k$ and we obtain the following result.

Theorem 19. $\mathrm{Tr}(H)_{iso} = \oplus_{k,n} \mathrm{Tr}(H)_{kn}$, $\mathrm{Tr}(H)_{kn} \cdot \mathrm{Tr}(H)_{lm} = 0$ if $(kn) \ne (lm)$, and $T_1(t)$ preserves each $\mathrm{Tr}(H)_{kn}$. For any $(kn)$ there exists a Banach space isomorphism $\alpha : \mathrm{Tr}(H)_{kn} \to \mathrm{Tr}(\tilde{H}_{kn})$, where $\tilde{H}_{kn}$ is some Hilbert space. Moreover, $\alpha \circ T_1(t) \circ \alpha^{-1}$ is given by $U_t^* \cdot U_t$, where $U_t$ is a one-parameter strongly continuous group of unitary operators on $\tilde{H}_{kn}$.

Proof. Define $\mathrm{Tr}(H)_{kn} = C_k(e_n)\mathrm{Tr}(H)_k$. By the remark above $T_1(t) : \mathrm{Tr}(H)_{kn} \to \mathrm{Tr}(H)_{kn}$. Since $M_{kn}$ is a type I factor, it is spatially isomorphic to a von Neumann matrix algebra $M_N(\mathbb{C})$, where $N = N(k, n)$ is the degree of homogeneity of $M_{kn}$. $M_N(\mathbb{C})$ acts on a Hilbert space $\tilde{H}_{kn}$, which is the direct sum of $N$ copies of the range of $e_n$, where $e_n$ is a minimal and faithful projector in $M_{kn}$ (see Prop. 14). Hence $M_N(\mathbb{C}) = B(\tilde{H}_{kn})$. It is clear that the above isomorphism, say $\alpha$, maps trace class operators onto trace class operators. Moreover, since $\alpha \circ T(t) \circ \alpha^{-1}$ is an automorphism on $B(\tilde{H}_{kn})$, it is inner. Finally, let us define $T_{-t} = T_t^*$.
Because, for $\varphi \in \mathrm{Tr}(H)_{iso}$, $\|T_t^*\varphi - \varphi\|_1 = \|T_t^*\varphi - T_t^*T_t\varphi\|_1 \le \|\varphi - T_t\varphi\|_1$, $T_t^*$ is strongly continuous on $\mathrm{Tr}(H)_{iso}$. Therefore both (the extension of) $T_t$ and $T_t^*$ are weakly$^*$ continuous on $M$ and so $\alpha \circ T_1(t) \circ \alpha^{-1}$ is a weakly$^*$ continuous group of $*$-automorphisms on $B(\tilde{H}_{kn})$. Thus $U_t$ is a strongly continuous group of unitary operators on $\tilde{H}_{kn}$. □

Let $\mathrm{Fix}(T_t^*) = \{A \in B(H) : T_t^*(A) = A\ \ \forall t \ge 0\}$.

Corollary 20. If $\mathrm{Fix}(T_t^*) = \mathbb{C} \cdot 1$, then the sum in Thm. 19 consists of only one element, i.e. $\alpha : \mathrm{Tr}(H)_{iso} \to \mathrm{Tr}(\tilde{H})$ and $\alpha \circ T_1(t) \circ \alpha^{-1} = U_t^* \cdot U_t$.

Proof. Because $T_t^*(E_k) = E_k$, we have $M = M_1$ and $E_1 = 1$. It is clear that $M_1$ is a factor. □
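The contraction example $T(\varphi) = A\varphi A^*$ discussed before Theorem 19 can be checked by hand in two dimensions. The following Python sketch is an illustration under explicit assumptions: $A = \mathrm{diag}(i, 1/2)$ is a hypothetical contraction that is unitary on $H_1 = \mathbb{C}e_0$ and a strict contraction on $H_2 = \mathbb{C}e_1$, so $P$ is the projector onto the first basis vector. The starting density `phi0` and all function names are chosen only for this example.

```python
def conjugate_by_diagonal(a_diag, phi):
    """T(phi) = A phi A^* for the diagonal matrix A = diag(a_diag)."""
    n = len(a_diag)
    return [[a_diag[j] * phi[j][k] * complex(a_diag[k]).conjugate()
             for k in range(n)] for j in range(n)]

def split_iso_sweeping(phi):
    """Return (P phi P, phi - P phi P), P = projector onto the first basis vector."""
    n = len(phi)
    iso = [[phi[j][k] if j == 0 and k == 0 else 0j for k in range(n)]
           for j in range(n)]
    sweep = [[phi[j][k] - iso[j][k] for k in range(n)] for j in range(n)]
    return iso, sweep

def max_abs(m):
    """Largest modulus among the entries of a matrix."""
    return max(abs(entry) for row in m for entry in row)

A_DIAG = [1j, 0.5]                       # hypothetical contraction A = diag(i, 1/2)
phi0 = [[0.5, 0.25], [0.25, 0.5]]        # hypothetical initial density

phi = phi0
for _ in range(40):
    phi = conjugate_by_diagonal(A_DIAG, phi)   # T^n phi after the loop, n = 40

iso0, sweep0 = split_iso_sweeping(phi0)
iso_n, sweep_n = split_iso_sweeping(phi)
print(max_abs(iso0), max_abs(iso_n), max_abs(sweep_n))
```

In this sketch the component $P\varphi P$ keeps its size under iteration, while the three remaining blocks decay geometrically, in line with the formulas for $\mathrm{Tr}(H)_{iso}$ and $\mathrm{Tr}(H)_s$ given in the example. Note that this $T$ is not trace preserving, which is allowed under assumptions (i) and (ii).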

Corollary 21. Suppose again that Fix(Tt∗ ) = C · 1. If there exists a one-dimensional projector e ∈ K, then Tr(H)iso = Tr(H). Proof. Because M = M1 and dime = 1 so M is generated by one-dimensional projectors and E = 1. Since it is a factor there exists a sequence {en } of minimal and P equivalent, hence one-dimensional, projectors, such that n en = 1. Let f ∈ M0 , f en is one-dimensional so either f en = 0, or being a projector. Because en f = f en andP f en = en . Therefore f en ∈ M and also f ( n en ) = f ∈ M. Because M∩M0 = C·1 so f = 1. It implies that M0 = C · 1. Hence M = B(H) and thus Tr(H)iso = Tr(H). t u Now we discuss the Foguel alternative for the operator T . We want to pose the question when T is sweeping if we know that it has no invariant density. Let us notice that, by Thm.1, {T n } cannot be wo-relatively compact in this case. Because T = T1 ⊕ T2 and T2 is sweeping, see Thm. 8, so it suffices to consider operator T1 only. By the linearity and continuity the problem can be further reduce to the question when lim n→∞ Trf T1n e = 0 for all e, f ∈ P min (K). Let us recall that, by Corollary 18, Tr(H)iso ⊂ Tr(EH), E is the unit in M, and T1 (φ) = U ∗ φU , where U is a unitary operator on EH. Therefore the problem is essentially the same as in the classical case. Proposition 22. Assume that T has no invariant density. If one of the following conditions holds, then T is sweeping. a) The continuous singular spectrum of U is empty: σsing (U ) = ∅ or, more generally, n-times convolution, for some n ∈ N, of the continuous singular part of the spectral measure of U is absolutely continuous.P n b) ∀e ∈ P min (K)∃A ∈ B(H)+ such that ∞ n=0 T1 (e) ≤ A. c) K is commutative, i.e. xy = yx for all x, y ∈ K. Proof. a) First we show that the point spectrum σp (U ) is empty. Suppose on the contrary that eiα ∈ σp (U ) and so U v = eiα v for some non-zero v ∈ EH. Let Pv be the onedimensional projector onto Cv. 
Then $T_1(P_v) = P_v$ and so $P_v$ is an invariant density for $T$, the contradiction. By assumption $\sigma_{sing}(U) = \emptyset$, hence $\sigma(U) = \sigma_{ac}(U)$ [32]. It means that a complex measure $\langle v_1, E(d\lambda)v_2\rangle$, where $E(d\lambda)$ is the spectral measure of $U$ and $v_1, v_2 \in E\mathcal{H}$, is absolutely continuous with respect to the Lebesgue measure on the circle $S^1$. Since $e, f$ are finite-dimensional, $e = \sum_{i=1}^{k} P_i$, $f = \sum_{j=1}^{l} Q_j$, where $P_i = |v_i\rangle\langle v_i|$, $Q_j = |w_j\rangle\langle w_j|$ are one-dimensional projectors on $E\mathcal{H}$. Hence, it suffices to check the behavior of

$$\mathrm{Tr}\, P_i T_1^n Q_j = |\langle v_i, U^n w_j\rangle|^2 = \Big| \int_{S^1} e^{in\lambda} \langle v_i, E(d\lambda) w_j\rangle \Big|^2 .$$

However, this is the Fourier transform of an $L^1$-function on $S^1$, so it tends to zero when $n \to \infty$. If $\mu(d\lambda) = \langle v_i, E(d\lambda)v_i\rangle$ is singular but $\mu * \mu * \cdots * \mu$ is absolutely continuous, then its Fourier transform also tends to zero when $n \to \infty$.

b) It is clear because $\sum_n T_1^n(e) \le A$ implies that

$$\mathrm{Tr}\, f \sum_{n=0}^{\infty} T_1^n(e) = \sum_{n=0}^{\infty} \mathrm{Tr}\, f\, T_1^n(e) \le \mathrm{Tr}\, f A,$$

and so $\lim_{n\to\infty} \mathrm{Tr}\, f\, T_1^n(e) = 0$.


c) It reduces to the case b). First, notice that $ef = 0$ for all $e \ne f \in P_{min}(K)$. Now suppose there exists $n_0$ such that $T_1^{n_0}(e) = e$. But this implies the existence of an invariant density, which contradicts the assumption. Therefore $T_1^n(e) \ne e$ and so $T_1^n(e) \cdot e = 0$ for all $n \ge 1$. Furthermore, $T_1^n(e) \cdot T_1^m(e) = 0$ if $n \ne m$, which implies that $\sum_n T_1^n(e) \le 1$. □

Remark. Let us point out that condition a) is sufficient but not necessary. For example, in [10] a class of more general continuous singular measures with their Fourier transforms vanishing at infinity was presented.

Finally, we discuss the structure of the peripheral point spectrum of the operator $T$ which, in particular, contains the information about the existence of an invariant density. Assuming that $T^*$ is a normal, identity preserving and strongly positive operator on a von Neumann algebra $N$, Groh showed that $\sigma_p(T) \cap S^1$ of its preadjoint $T$ is a subgroup of $S^1$ if $T$ is irreducible and $\sigma_p(T) \cap S^1 \ne \emptyset$, Sect. D-III in [16]. $T$ is irreducible if there are no non-trivial closed and $T$-invariant hereditary cones in $N_*^+$. With our assumptions (i) and (ii), when we control the behavior of $T$ not only with respect to the trace norm but also with respect to the operator norm, the irreducibility has a great impact on the spectrum $\sigma_p(T) \cap S^1$.

Proposition 23. If $T$ is irreducible, then $\sigma_p(T) \cap S^1 = \emptyset$.

Proof. Assume on the contrary that $e^{i\alpha} \in \sigma_p(T)$. Then $T(\varphi) = e^{i\alpha}\varphi$ for some $\varphi \in \mathrm{Tr}(\mathcal{H})$. Clearly, $\varphi \in \mathrm{Tr}(\mathcal{H})_r$ and, by Prop. 9, $\varphi \in \mathrm{Tr}(\mathcal{H})_{iso}$. Hence $T(\varphi) = T_1(\varphi) = U^*\varphi U$ and so $|\varphi|^2 \in \mathrm{Fix}(T)$. Let $|\varphi|^2 = \sum_i a_i e_i$, $a_1 > a_2 > \ldots > 0$, be the spectral decomposition of $|\varphi|^2$. By Prop. 5 d), $e_i \in K$ so

$$T(|\varphi|^2) = \sum_i a_i T(e_i) = \sum_i a_i e_i$$

and $T(e_i)$ are mutually orthogonal projectors by Prop. 6 a) and c). However the uniqueness of the spectral measure implies that $T(e_i) = e_i$ for all $i$. Let $C_+ = \{\varphi \in \mathrm{Tr}(\mathcal{H})^+ ;\ \varphi \le C e_1 \text{ for some } C > 0\}$. Then $C_+$ is a non-trivial hereditary cone, which is $T$-invariant, which contradicts the assumption. □

Remark. If $\mathrm{Fix}(T^*) = \mathbb{C}\cdot 1$, then also $\sigma_p(T) \cap S^1 = \emptyset$.

Example 2. We use Prop. 23 to generalize slightly a result of Davies [8], who proved the non-existence of invariant densities for irreducible quantum stochastic processes. Suppose $(X, \mu)$ is a $\sigma$-finite measure space, $\mathcal{H}$ an infinite dimensional but separable Hilbert space and $x \to V(x) \in B(\mathcal{H})$ a weakly measurable map on $X$. Suppose further that $\Lambda = \int V(x)^* V(x)\mu(dx)$, the integral defined in the strong topology on $B(\mathcal{H})$. If $H$ is a self-adjoint operator on $\mathcal{H}$, then

$$L(\rho) = -i[H, \rho] + \int V(x)\rho V(x)^* \mu(dx) - \frac{1}{2}\{\Lambda, \rho\}$$


is a generator of a Markov semigroup $T_t = e^{tL}$ on $\mathrm{Tr}(\mathcal{H})$. It was shown in [8] Thm. 19 (see also [12]) that if $T_t$ is irreducible and $V(x)$ are normal a.e., then $T_t$ has no invariant density. Assume less, namely that

$$\int V(x)V(x)^* \mu(dx) \le \int V(x)^* V(x)\mu(dx).$$

Then $T_t$ satisfies (i) and (ii) and the irreducibility of $T_t$ implies that $\sigma_p(T_t) \cap S^1 = \emptyset$.

4. Effective Superselection Rules

In this section we use the isometric-sweeping decomposition to discuss the appearance of dynamically induced superselection rules. We assume that $T_t$ is a strongly continuous semigroup of contractive and positive operators on $\mathrm{Tr}(\mathcal{H})$. Suppose $\hat{P}$ is a linear, bounded and positive operator on $\mathrm{Tr}(\mathcal{H})$ such that $\hat{P}^2 = \hat{P}$ and $\mathrm{Tr}\,\hat{P}\varphi \le \mathrm{Tr}\,\varphi$ for all $\varphi \in \mathrm{Tr}(\mathcal{H})^+$. We call such an operator a projection operator. Then the space $\mathrm{Tr}(\mathcal{H})$ splits into two linearly independent and closed subspaces $\hat{P}\,\mathrm{Tr}(\mathcal{H})$ and $(\mathrm{id} - \hat{P})\mathrm{Tr}(\mathcal{H})$.

Definition. We say that the semigroup $T_t$ induces a weak superselection structure on $\mathrm{Tr}(\mathcal{H})$ if
a) there exists a projection operator $\hat{P}$ such that

$$T_t : \mathrm{im}\,\hat{P} \to \mathrm{im}\,\hat{P}, \qquad T_t|_{\mathrm{im}\,\hat{P}} = U_t \cdot U_t^*, \tag{1}$$

where $U_t$ is a strongly continuous group of unitary operators,
b)

$$\lim_{t\to\infty} |\mathrm{Tr}\, A T_t\varphi - \mathrm{Tr}\, A\hat{P}(T_t\varphi)| = 0 \tag{2}$$

holds for all $\varphi \in \mathrm{Tr}(\mathcal{H})$ and any $A$ from some $*$-algebra $\mathcal{A}$, which is strongly dense in $B(\mathcal{H})$.

$T_t$ induces a strong superselection structure if a) holds together with
b')

$$\lim_{t\to\infty} \|T_t\varphi - \hat{P}(T_t\varphi)\|_1 = 0 \quad \forall \varphi \in \mathrm{Tr}(\mathcal{H}). \tag{3}$$

A weak (strong) superselection structure is said to be non-trivial if $\hat{P} \ne \mathrm{id}$. The condition described by formula (1) corresponds to the fact that the process of decoherence does not affect statistical states from some preferred set of all density matrices. In particular, it means that pure states from $\hat{P}\,\mathrm{Tr}(\mathcal{H})$, if they exist, evolve into pure states. The state reduction corresponding to a weak superselection structure was presented in the $C^*$-algebra framework by Hepp [19]. In the case of the Coleman–Hepp model, where the $*$-algebra $\mathcal{A}$ consists of all local operators and the projection operator $\hat{P}$ is given by the von Neumann reduction postulate, Eq. (2) was derived (see also formula (7.57) in [27]). It is worth noting that Eq. (2) was criticized by Bell [3] because the limit is not uniform for all observables $A \in \mathcal{A}$ with $\|A\|_\infty \le 1$, and so it is approached after an arbitrarily long time if one chooses an appropriate $A$. It is believed that the environment-induced superselection rules emerge uniformly and in a sufficiently short period of time. For example, in the model considered by Kupsch [27], the strong superselection structure was derived together with an algebraic decay to zero in the limit (3).


Theorem 24. Suppose $T_t$ satisfies (i) and (ii) from Sect. 3 for all $t \ge 0$. Then $T_t$ induces a weak superselection structure. If moreover, $T_t$ is so-relatively compact, then it induces a strong superselection structure.

Proof. By Thm. 8, $\mathrm{Tr}(\mathcal{H}) = \mathrm{Tr}(\mathcal{H})_{iso} \oplus \mathrm{Tr}(\mathcal{H})_s$. Let $\hat{P}$ be defined by $\hat{P}(\varphi) = \varphi_1$, where $\varphi_1 = \varphi - \varphi_2 \in \mathrm{Tr}(\mathcal{H})_{iso}$, $\varphi_2 \in \mathrm{Tr}(\mathcal{H})_s$. Clearly, it is a projection operator. Let $\mathcal{A}$ be the algebra of compact operators. Then $|\mathrm{Tr}\, A T_t\varphi - \mathrm{Tr}\, A\hat{P}(T_t\varphi)| = |\mathrm{Tr}\, A T_t\varphi_2| \to 0$ for any $A \in K(\mathcal{H})$ and $\varphi \in \mathrm{Tr}(\mathcal{H})$. Moreover, by Thm. 19, the restriction of $T_t$ onto $\mathrm{Tr}(\mathcal{H})_{iso}$ is a unitary evolution given by $U_t \cdot U_t^*$, where $U_t U_t^* = U_t^* U_t = E$. Hence, the extension of $U_t$ to a unitary operator on $\mathcal{H}$ in such a way that $U_t v = U_t^* v = v$ for any $v \in E^\perp\mathcal{H}$, proves the first part of the theorem. If $T_t$ is so-relatively compact, then Eq. (3) follows from Thm. 11. □

Next we describe the notion of non-trivial superselection structure in the algebraic framework. From the proof of Thm. 19 we know that the von Neumann algebra $M$ is dual, as a Banach space, to $\mathrm{Tr}(\mathcal{H})_{iso}$. Hence it consists of the relevant (bounded) observables. Since it may happen that $M$ does not contain the identity operator 1, we take $N = M + \mathbb{C}\cdot 1$ as the effective quantum algebra of observables. Its commutant $N'$ is said to be generated by the superselection rules.

Proposition 25. If $\hat{P} \ne \mathrm{id}$, then $N' \ne \mathbb{C}\cdot 1$.

Proof. Assume on the contrary that $N' = \mathbb{C}\cdot 1$. Then $N = B(\mathcal{H})$. But $M$ is a maximal ideal in $N$, so $M = EB(\mathcal{H})E$, where $E$ is the unit in $M$. Therefore $EB(\mathcal{H})E + \mathbb{C}\cdot 1 = B(\mathcal{H})$, which implies that $E = 1$ and $M = B(\mathcal{H})$. Hence $\mathrm{Tr}(\mathcal{H})_{iso} = \mathrm{Tr}(\mathcal{H})$, the contradiction. □

Definition. Suppose $\hat{P} \ne \mathrm{id}$. The induced superselection structure is said to satisfy the Hypothesis of Commuting Superselection Rules if $Z(N) = N'$, where $Z(N)$ denotes the center of $N$. This case, by the Jauch theorem [22], is equivalent to the fact that $N$ contains a complete commuting set of observables.
Such a structure arises naturally when the generator of a dynamical semigroup is given by the von Neumann projection postulate

$$L(\rho) = \sum_n P_n \rho P_n - \rho,$$

where $\{P_n\}$ is a sequence of pairwise orthogonal, but not necessarily one-dimensional, projectors, which sum up to the identity operator. Then

$$T_t(\rho) = e^{-t}\rho + (1 - e^{-t})\hat{P}(\rho),$$

where $\hat{P}(\rho) = \sum_n P_n \rho P_n$. Clearly, it satisfies conditions (i) and (ii). In this case $M = N = \sum_n P_n B(\mathcal{H}) P_n$ and so $N' = Z(N) = \sum_n \mathbb{C}P_n$. The same kind of dynamically induced superselection rules for a finite quantum system interacting with the measuring apparatus was discussed by Araki (Example 1 in [2]). It is worth noting that a completely different situation when $N' = B(\mathcal{H})$ and $Z(N) = \mathbb{C}\cdot 1$ can also happen, as the following example shows.
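Before turning to that example, the projection-postulate semigroup above can be made concrete in a small finite-dimensional sketch. The three-dimensional space, the two projectors and the initial state below are hypothetical choices made only for illustration; the code simply evaluates $T_t(\rho) = e^{-t}\rho + (1 - e^{-t})\hat{P}(\rho)$ entry by entry.

```python
import math

# Hypothetical setting: a 3-dimensional space with two orthogonal projectors,
# P_1 = diag(1, 1, 0) (two-dimensional range) and P_2 = diag(0, 0, 1).
BLOCK = [0, 0, 1]  # BLOCK[i] = index of the projector whose range contains basis vector i


def project(rho):
    """P^(rho) = sum_n P_n rho P_n: keep entries inside a block, zero the others."""
    n = len(rho)
    return [[rho[i][j] if BLOCK[i] == BLOCK[j] else 0j for j in range(n)]
            for i in range(n)]


def evolve(rho, t):
    """T_t(rho) = e^{-t} rho + (1 - e^{-t}) P^(rho)."""
    w = math.exp(-t)
    p = project(rho)
    n = len(rho)
    return [[w * rho[i][j] + (1.0 - w) * p[i][j] for j in range(n)]
            for i in range(n)]


def trace(rho):
    return sum(rho[i][i] for i in range(len(rho)))


def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


# Pure state (|0> + |1> + |2>)/sqrt(3): every matrix entry equals 1/3.
rho0 = [[1 / 3 + 0j] * 3 for _ in range(3)]
```

In this sketch the coherence between the two blocks (for instance the entry `[0][2]`) is damped by the factor $e^{-t}$, while the coherence inside the range of $P_1$ (the entry `[0][1]`) is left untouched, which is the finite-dimensional picture behind $M = \sum_n P_n B(\mathcal{H}) P_n$.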


Example 3. Let us consider the pure spin 1/2 system. The algebra $\mathcal{A}$ generated by observables consists of $2 \times 2$ complex matrices. Suppose we want to determine the state of an individual quantum system through the interaction with a classical apparatus [21]. The measuring apparatus should, after the contact with the quantum system, present a ray in the Hilbert space $\mathbb{C}^2$, of course with some uncertainty. Therefore, we describe such a device by a two-dimensional complex projective space $\mathbb{CP}^1$. Using a general scheme of the completely positive coupling between quantum and continuous classical systems [30], the following equation for the reduced density matrix $\rho \in M_{2\times 2}$ can be derived

$$\dot{\rho}_t = -i[H, \rho_t] + \frac{1}{6}\big((\mathrm{Tr}\,\rho_t) I - 2\rho_t\big),$$

where $I$ denotes the $2 \times 2$ identity matrix. Clearly, in this case $\mathrm{Tr}(\mathcal{H})_{iso} = \mathrm{Tr}(\mathcal{H})_r = \mathbb{C}I$. Hence $M = \mathbb{C}I$, too, and so $M' = M_{2\times 2}$. It means that, after a long time, any pure state becomes the completely mixed state. In this case the projection operator $\hat{P}$ equals $\hat{P}(\varphi) = \frac{1}{2}(\mathrm{Tr}(\varphi))I$, $\varphi \in M_{2\times 2}$.

The above discussion shows that in order to obtain a particular class of superselection rules some additional assumptions on the semigroup $T_t$, or, equivalently, on the kind of the interaction with an environment, have to be imposed.

Acknowledgements. I would like to thank the A. von Humboldt Foundation for the financial support.

References

1. Alicki, R., Lendi, K.: Quantum dynamical semigroups and applications. Lect. Notes Phys. 286, Berlin–Heidelberg–New York: Springer Verlag, 1987
2. Araki, H.: A remark on Machida-Namiki theory of measurements. Prog. Theor. Phys. 64, 719–730 (1980)
3. Bell, J.S.: On wave packet reduction in the Coleman–Hepp model. Helv. Phys. Acta 48, 93–98 (1975)
4. Blanchard, Ph., Jadczyk, A.: On the interaction between classical and quantum systems. Phys. Lett. A 175, 157–164 (1993)
5. Blanchard, Ph., Jadczyk, A.: Strongly coupled quantum and classical systems and Zeno’s effect. Phys. Lett. A 183, 272–276 (1993)
6. Blanchard, Ph., Olkiewicz, R.: Interacting quantum and classical continuous systems I. The piecewise deterministic dynamics. J. Stat. Phys. 94, 913–931 (1999)
7. Blanchard, Ph., Olkiewicz, R.: Interacting quantum and classical continuous systems II. Asymptotic behavior of the quantum subsystem. J. Stat. Phys. 94, 933–953 (1999)
8. Davies, E.B.: Quantum stochastic processes II. Commun. Math. Phys. 19, 83–105 (1970)
9. Dixmier, J.: Von Neumann algebras. Amsterdam: North-Holland Publishing Company, 1981
10. Erdos, P.: On a family of symmetric Bernoulli convolutions. Am. J. Math. 61, 974–976 (1939)
11. Evans, D.E.: Positive linear maps on operator algebras. Commun. Math. Phys. 48, 15–22 (1976)
12. Evans, D.E.: Irreducible quantum dynamical semigroups. Commun. Math. Phys. 54, 293–297 (1977)
13. Frigerio, A.: Quantum dynamical semigroups and approach to equilibrium. Lett. Math. Phys. 2, 79–87 (1977)
14. Frigerio, A.: Stationary states of quantum dynamical semigroups. Commun. Math. Phys. 63, 269–276 (1978)
15. Frigerio, A., Verri, M.: Long-time asymptotic properties of dynamical semigroups on W∗-algebras. Math. Z. 180, 275–286 (1982)
16. Groh, U.: Positive semigroups on C∗- and W∗-algebras. In: Nagel, R. (ed.) One-parameter semigroups of positive operators. LNM Vol. 1184, Berlin: Springer-Verlag, 1986
17. Giulini, D., Kiefer, C., Zeh, H.D.: Symmetries, superselection rules, and decoherence. Phys. Lett. A 199, 291–298 (1995)
18. Haag, R., Kastler, D.: An algebraic approach to Quantum Field Theory. J. Math. Phys. 5, 848–861 (1964)
19. Hepp, K.: Quantum theory of measurement and macroscopic observables. Helv. Phys. Acta 45, 237–248 (1972)
20. Hu, B.L., Paz, J.P., Zhang, Y.: Quantum Brownian motion in a general environment: Exact master equation with nonlocal dissipation and colored noise. Phys. Rev. D 45, 2843–2861 (1992)


21. Jadczyk, A.: Topics in quantum dynamics. In: Coquereaux, R. et al. (eds.) Infinite dimensional geometry, noncommutative geometry, operator algebras and fundamental interactions. Singapore: World Scientific, 1995
22. Jauch, J.: System of observables in Quantum Mechanics. Helv. Phys. Acta 33, 711–726 (1960)
23. Joos, E.: Decoherence through interaction with the environment. In: Giulini, D. et al. (eds.) Decoherence and the appearance of a classical world in quantum theory. Berlin: Springer, 1996
24. Joos, E., Zeh, H.D.: The emergence of classical properties through interaction with the environment. Z. Phys. B 59, 223–243 (1985)
25. Komorowski, T., Tyrcha, J.: Asymptotic properties of some Markov operators. Bull. Acad. Polon. Sci. Math. 37, 221–228 (1989)
26. Krengel, U.: Ergodic theorems. Berlin: Walter de Gruyter, 1985
27. Kupsch, J.: Open quantum systems. In: Giulini, D. et al. (eds.) Decoherence and the appearance of a classical world in quantum theory. Berlin: Springer, 1996
28. Nagel, R.: Spectral and asymptotic properties of strongly continuous semigroups. In: Goldstein, G.R., Goldstein, J.A. (eds.) Semigroups of linear and nonlinear operations and applications. Dordrecht: Kluwer Academic Publishers, 1993
29. Olkiewicz, R.: Some mathematical problems related to classical-quantum interactions. Rev. Math. Phys. 9, 719–747 (1997)
30. Olkiewicz, R.: Dynamical semigroups for interacting quantum and classical systems. J. Math. Phys. 40, 1300–1316 (1999)
31. Paz, J.P., Zurek, W.H.: Environment-induced decoherence, classicality, and consistency of quantum histories. Phys. Rev. D 48, 2728–2738 (1993)
32. Reed, M., Simon, B.: Methods of modern mathematical physics. vol. I. New York: Academic Press, 1981
33. Socała, J.: On the existence of invariant densities for Markov operators. Ann. Polon. Math. 48, 51–56 (1988)
34. Takesaki, M.: Theory of operator algebras. New York: Springer, 1979
35. Topping, D.M.: Lectures on von Neumann algebras. London: Van Nostrand, 1971
36. Twamley, J.: Phase-space decoherence: a comparison between consistent histories and environment-induced superselection. Phys. Rev. D 48, 5730–5745 (1993)
37. Unruh, W.G., Zurek, W.H.: Reduction of a wave packet in Quantum Brownian motion. Phys. Rev. D 40, 1071–1094 (1989)
38. Watanabe, S.: Ergodic theorems for W∗-dynamical semigroups. Hokkaido Math. J. 8, 176–190 (1979)
39. Wick, G.C., Wightman, A.S., Wigner, E.P.: The intrinsic parity of elementary particles. Phys. Rev. 88, 101–105 (1952)
40. Wightman, A.S.: Superselection rules; old and new. Il Nuovo Cimento B 110, 751–769 (1995)
41. Zurek, W.H.: Environment-induced superselection rules. Phys. Rev. D 26, 1862–1880 (1982)

Communicated by H. Araki

Commun. Math. Phys. 208, 267 – 273 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

A Geometric Criterion for Positive Topological Entropy II: Homoclinic Tangencies

Ale Jan Homburg (1,*), Howard Weiss (2,**)

1 IPST, University of Maryland, College Park, MD 20740, USA
2 The Pennsylvania State University, Department of Mathematics, University Park, PA 16802, USA. E-mail: [email protected]

Received: 2 March 1999 / Accepted: 14 May 1999

Abstract: In a series of important papers [GS1,GS2] Gavrilov and Shilnikov established a topological conjugacy between a surface diffeomorphism having a dissipative hyperbolic periodic point with certain types of quadratic homoclinic tangencies and the full shift on two symbols, thus exhibiting horseshoes near a tangential homoclinic point. In this note, which should be viewed as an addendum to [BW], we extend this result by showing that such a diffeomorphism with a one-sided isolated homoclinic tangency having any order contact, possibly with infinite order contact, possesses a horseshoe near the homoclinic point.

1. Introduction

Homoclinic tangencies and their bifurcations play a fundamental role in Dynamical Systems [PT]. For instance, Palis has conjectured [P] that the set of (non-uniformly) hyperbolic surface diffeomorphisms together with the diffeomorphisms exhibiting homoclinic tangencies is dense in the space of all surface diffeomorphisms. Systems exhibiting homoclinic tangencies can exhibit more complicated and more subtle quasi-local behavior than systems possessing transverse homoclinic points or homoclinic points with topological crossings. For instance, Gavrilov and Shilnikov [GS1,GS2] showed that horseshoes (locally maximal hyperbolic sets) may exist near the homoclinic tangency. Newhouse [N1,N2] (see also [GST]) showed that these homoclinic tangencies typically generate secondary tangencies which persist under small perturbations, and that the Gavrilov and Shilnikov horseshoes may co-exist with infinitely many sinks in a neighborhood of the homoclinic orbit.

* Address after December 31, 1999: Department of Mathematics, Utrecht University, Budapestlaan 6, 3584 CD Utrecht, The Netherlands
** The second author was partially supported by a National Science Foundation grant #DMS-9704913. The manuscript was written while both authors were visiting the IPST, University of Maryland and the authors wish to thank IPST for their gracious hospitality.


In a series of important papers [GS1,GS2] Gavrilov and Shilnikov established a topological conjugacy (on a closed invariant set) between a surface diffeomorphism having a dissipative hyperbolic periodic point with certain types of quadratic homoclinic tangencies (see Fig. 1 (iii) and (iv)) and the full shift on two symbols, thus exhibiting horseshoes near a tangential homoclinic point. In this note, which should be viewed as an addendum to [BW], we extend Gavrilov and Shilnikov’s result by showing that such a diffeomorphism with a one-sided homoclinic tangency having any order contact (possibly infinite order contact) possesses a horseshoe near the homoclinic point. Such a map has positive topological entropy and possesses infinitely many hyperbolic periodic points near the homoclinic tangency. Gonchenko claims that a proof of this result for finite order tangencies appeared in his unpublished (Russian) thesis in 1984. The result for finite order tangencies was announced by Gonchenko and Shilnikov [GS] in 1986, but we are unable to find any proof in the literature.

In [BW] the authors consider a surface diffeomorphism with a hyperbolic periodic point such that components of the stable and unstable manifolds have a topological crossing, possibly with infinite order contact. They prove that some power of the diffeomorphism has the full shift on two symbols as a topological factor. It follows from a remarkable theorem of Katok [K1,K2] that the map possesses a horseshoe. This result extends the well known theorem of Smale [S], where one assumes that the intersection is transversal and one obtains a topological conjugacy on a closed invariant set between some power of the map and the full shift on two symbols. The conjugacy immediately implies that the map possesses a horseshoe.

The idea of the construction in [BW] is to find a horseshoe-like picture in the dynamics and to code the dynamics as if coding on a horseshoe.
One then proves that this coding map is a factor map and thus the diffeomorphism has the full shift on two symbols as a topological factor. Then Katok’s theorem implies the existence of horseshoes arbitrarily close to a homoclinic tangency. Using this technique, we avoid having to verify difficult uniform contracting/expanding cone estimates to prove directly the existence of a hyperbolic set.

2. Homoclinic Tangencies

Let $M$ denote a smooth ($C^2$) surface and $f : M \to M$ a smooth ($C^2$) surface diffeomorphism. Let $p$ be a hyperbolic periodic point (which, by considering some iterate of the map, we will assume is a fixed point) and assume that $|\lambda\mu| < 1$, where $|\lambda| > 1$ and $|\mu| < 1$ are the two eigenvalues of the differential $Df_p$. We will call such a periodic point dissipative.¹ Also suppose that the map $f$ possesses an isolated one-sided tangential homoclinic point $q$ (with order of tangency $2 \le 2l \le \infty$). See Fig. 1. The one-sided tangency hypothesis rules out the coincidence of the stable and unstable manifolds of $p$. We remark that our proof goes through with only minor modifications for some pathological cases. The ideas apply when $f$ possesses an interval of tangential homoclinic points or $f$ possesses a convergent sequence of one-sided tangential homoclinic points.

¹ The case where $|\lambda\mu| > 1$ can be reduced to the dissipative case by considering the inverse $f^{-1}$.

Let $U$ be a small neighborhood of the orbit of $q$ consisting of finitely many balls (including one ball containing $p$). In this context small means that the sum of the diameters of the balls is sufficiently small. An important problem is to describe the set of points $U_f$ whose orbits are entirely contained in $U$.

Figure 1 illustrates four different types of homoclinic tangencies. It is not difficult to show that for cases (i) and (ii), the set $U_f$ contains only the orbit of $q$ and the fixed point $p$ [AH,GS1,GS2]. However, for cases (iii) and (iv), the dynamics in $U_f$ is much more complicated and we will show that in these cases, $U_f$ contains horseshoes. We note that a precise description of $U_f$ is quite difficult to provide since $U_f$ may also contain non-hyperbolic orbits and infinitely many sinks.

Fig. 1. Four types of one-sided homoclinic tangencies: (i) $c<0$, $d<0$; (ii) $c>0$, $d<0$; (iii) $c<0$, $d>0$; (iv) $c>0$, $d>0$

We quickly recall the main technical result in [BW, Theorem 2.4] and the related definitions. Let $N \subseteq M$ be homeomorphic to $[-1, 1] \times [-1, 1]$. In the following we shall identify $N$ with $[-1, 1] \times [-1, 1]$ and suppress the homeomorphism. Let $R = [-1, 1] \times [-\rho, \rho]$, where $\rho \in (0, 1)$. A set $S \subset R$ will be called a horizontal strip if
(1) $S$ is closed and path connected,
(2) $S$ contains a curve joining the left edge $\{-1\} \times [-\rho, \rho]$ and the right edge $\{1\} \times [-\rho, \rho]$ of $R$,
(3) $\partial S$ is a Jordan curve which is the union of a finite number of arcs all of whose endpoints lie on the left edge or the right edge of $R$.

It is easily seen that $\partial S$ contains exactly two curves joining the left edge $\{-1\} \times [-\rho, \rho]$ and the right edge $\{1\} \times [-\rho, \rho]$ of $R$, and $S$ lies in the region of $R$ bounded by these curves (see pictures in [BW]). We shall call the curve on which the second coordinate is larger $c_{upper}$ and the other curve $c_{lower}$.

Definition. Let $n$ be a positive integer and $S$ a horizontal strip. We shall say that $f^n$ stretches $S$ across $R$ if $f^n S \subset \mathrm{Int}\, N$, $f^n(\partial S \cap \mathrm{Int}\, R) \subset N \setminus R$, and $f^n$ maps $c_{upper}$ and $c_{lower}$ into opposite components of $N \setminus R$.


Theorem BW 2.4. Suppose $N$ contains two disjoint closed horizontal strips $S_0$ and $S_1$ that are stretched across $R$ by $f^{n_0}$ for some $n_0 \ge 1$. Then $f^{n_0}$ has the full two shift as a topological factor.

It is convenient to work in $C^{1+\alpha}$ linearizing coordinates. Let $f : M \to M$ be a surface diffeomorphism with a hyperbolic fixed point $p$ and homoclinic tangency $q$ having $2l$-order contact. This means that components of the stable and unstable manifolds of $p$, $W^s(p)$ and $W^u(p)$, are tangent at $q$ and the tangency has order $2l$. By choosing a suitable basis for the tangent space $T_pM$ at $p$, we may think of $df(p)$ as a linear map $L : \mathbb{R}^2 \to \mathbb{R}^2$ which preserves the splitting $\mathbb{R}^2 = \mathbb{R} \oplus \mathbb{R}$, contracts the first $\mathbb{R}$ by a factor of $\mu$ and expands the second $\mathbb{R}$ by a factor of $\lambda$. By the Hartman–Grobman theorem, there is a neighborhood $U$ of $p$ and a homeomorphism $h$ of $U$ into $\mathbb{R}^2$ with $h(p) = (0, 0)$ such that if $x \in U$ and $f(x) \in U$, then $h(f(x)) = L(h(x))$. One can choose $h$ arbitrarily close to the identity by choosing the neighborhood $U$ sufficiently small. The type of homoclinic tangency, i.e., case (i), (ii), (iii) or (iv), is unchanged under these orientation preserving homeomorphisms. It also follows from a theorem of Belitskii [B] that the homeomorphism $h$ may be chosen to be a $C^{1+\alpha}$ diffeomorphism for some $0 < \alpha < 1$, and we will use this $C^{1+\alpha}$ linearization in our proof. It was previously shown by Hartman [H] that the homeomorphism $h$ may be chosen to be a $C^1$ diffeomorphism.

We may assume that $U$ and $h$ have been chosen so that $h$ is $C^{1+\alpha}$ and $h(U) = D(1) \times D(1)$, where $D(r)$ is the closed disc of radius $r$ about the origin in $\mathbb{R}$. We may also assume that $D(1) \times \{0\}$ and $\{0\} \times D(1)$ lie in $W^s(p) \cap U$ and $W^u(p) \cap U$ respectively. We also assume that the point of homoclinic tangency $q$ lies in $U$. In order to simplify our notation, we shall henceforth identify $U$ with $D(1) \times D(1)$ and suppress the homeomorphism $h$. Distances in $U$ will be measured with respect to the product of the Euclidean metrics on $D(1)$.

We may assume that a point of homoclinic tangency has coordinates $(q, 0)$ and lies on $W^s(p) \cap U$, and that some preimage has coordinates $(0, r)$, where $(0, r) = f^{-n_0}(q, 0)$ lies on $W^u(p) \cap U$. We choose small neighborhoods $V \subset U$ of $(0, r)$ and $W \subset U$ of $(q, 0)$ and we wish to study iterates $f^{n+n_0}$ of the map $f$ restricted to $W$ by decomposing the map $f^{n+n_0} : W \to U$ into the linear action of $f^n = L^n : W \to V$ defined by $L^n(x, y) = (\mu^n x, \lambda^n y)$ and a global mapping $f^{n_0} = G : V \to W$ defined by

$$G(x, y) = \big(q + ax + b(y - r) + O(x^{1+\alpha}) + O(|y - r|^{1+\alpha}),\ cx + g(y - r) + O(x^{1+\alpha})\big),$$

where $a, b, c \in \mathbb{R}$. If the homoclinic tangency has order of contact $2 \le 2l < \infty$, the function $g(y - r) = d(y - r)^{2l} + O(x^{2l(1+\alpha)}) + O(|y - r|^{2l(1+\alpha)})$, $d > 0$, and if the tangency has infinite order contact the function $g(y - r)$ is infinitely flat at $y = r$ (derivatives of all orders vanish) and has constant sign on $V \cap W^u(p)$ except at $y = r$. Clearly $G(0, r) = (q, 0)$. We note that case (iii) corresponds to $c < 0$ and case (iv) corresponds to $c > 0$.

Consider the family of small rectangles $R_n$ near $(q, 0)$ with vertices $(q \pm \varepsilon, r/\lambda^n \pm \delta/\lambda^n)$, where $\varepsilon > 0$ is sufficiently small, $\delta$ is a positive number depending on $\varepsilon$ to be chosen later, and $n$ is sufficiently large to ensure that $L^n(R_n) \subset V$. Let us study the image $f^{n_0+n}(R_n) = (G \circ L^n)(R_n)$ (see Fig. 2). We make the following three observations which will imply that for cases (iii) and (iv), for $n$ sufficiently large, the image $f^{n_0+n}(R_n)$ intersects $R_n$ in a horseshoe-like picture (see Fig. 2). Clearly the two shaded regions in Fig. 2 are blowups of the actual (much smaller) regions around $q$ and $r$.


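Fig. 2. Creation of the horseshoe (labels in the figure: $R_n$ with strips $S_0$, $S_1$ inside $W$ around $q$, between $q - \varepsilon$ and $q + \varepsilon$; $L^n(R_n)$ inside $V$ around $r$, between $r - \delta$ and $r + \delta$; the images $G L^n(S_0)$ and $G L^n(S_1)$, where $G L^n = f^{n_0+n}$)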

(1) The abscissas of points in $f^{n_0+n}(R_n)$ are contained in $q \pm a\mu^n(q \pm \varepsilon) + b(\pm\delta) + O(|\mu^n, \delta|^{1+\alpha})$. Taking $\delta$ small enough, it follows that for $n$ sufficiently large the abscissas of points in $f^{n_0+n}(R_n)$ are contained in $q \pm \varepsilon$.
(2) The ordinates of points in $f^{n_0+n}(R_n \cap \{y = r/\lambda^n\})$ are contained in $c(q \pm \varepsilon)\mu^n + O(\mu^{(1+\alpha)n})$. Our assumptions imply that $c(q \pm \varepsilon)\mu^n + O(\mu^{(1+\alpha)n}) \ll r/\lambda^n - \delta/\lambda^n$ for $n$ sufficiently large. It follows that the points in $f^{n_0+n}(R_n \cap \{y = r/\lambda^n\})$ lie far below $R_n$.
(3) The ordinate of points in $f^{n_0+n}(R_n \cap \{y = r/\lambda^n \pm \delta/\lambda^n\})$ is at least $c(q \pm \varepsilon)\mu^n + g(\pm\delta) + O(\mu^{(1+\alpha)n})$. We can choose $n$ sufficiently large to ensure that $r/\lambda^n + \delta/\lambda^n < c(q \pm \varepsilon)\mu^n + g(\pm\delta) + O(\mu^{(1+\alpha)n})$.

Given $\eta > 0$, define the rectangle $N$ to have vertices $(q \pm \varepsilon, r\lambda^{-n} \pm \delta\lambda^{-n} \pm \eta)$. Using these three observations, it is easy to find two disjoint closed horizontal strips $S_0 = S_0(n)$ and $S_1 = S_1(n)$ contained in $R_n$ with $S_0(n)$ lying in the top half of $R_n$ and $S_1(n)$ lying in the bottom half of $R_n$, such that the images under $f^{n_0+n}$ of these two strips are stretched across $R_n$ (see Fig. 2). We have thus proven the following proposition.

Proposition 1. Let $f : M \to M$ be a surface diffeomorphism, $p$ a hyperbolic periodic point and $q$ an isolated point of one-sided homoclinic tangency of the type illustrated in Fig. 1 (iii) or (iv). For $\varepsilon > 0$ small there exists $\delta > 0$ so that for $n$ sufficiently large, $R_n$ contains two disjoint horizontal strips $S_0$ and $S_1$ that are stretched across $R_n$ by $f^{n_0+n}$.

Applying Theorem BW 2.4, we obtain the following theorem.
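Before the theorem is stated, the three observations behind Proposition 1 can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: it drops all higher-order terms of $G$, takes a quadratic tangency ($l = 1$), and uses hypothetical parameter values (none of them come from the paper) with $c > 0$, $d > 0$, i.e. case (iv).

```python
# Illustrative parameters (assumptions, not taken from the paper).
LAM, MU = 2.0, 0.25                 # eigenvalues with |LAM * MU| < 1 (dissipative)
Q, R = 0.5, 0.5                     # tangency at (Q, 0), preimage at (0, R)
A, B, C, D = 0.1, 0.1, 1.0, 1.0     # C > 0 and D > 0: case (iv)
EPS, DELTA = 0.1, 0.2               # half-width of R_n and scaled half-height


def return_map(x, y, n):
    """G(L^n(x, y)) for the truncated model with a quadratic tangency."""
    lx, ly = MU ** n * x, LAM ** n * y        # linear part L^n
    return (Q + A * lx + B * (ly - R),        # first component of G
            C * lx + D * (ly - R) ** 2)       # second component of G


def check(n, samples=101):
    """Test observations (1)-(3) on a grid of abscissas in [Q - EPS, Q + EPS]."""
    xs = [Q - EPS + 2 * EPS * k / (samples - 1) for k in range(samples)]
    y_mid = R / LAM ** n
    y_lo = (R - DELTA) / LAM ** n
    y_hi = (R + DELTA) / LAM ** n
    edges = [return_map(x, y, n) for x in xs for y in (y_lo, y_hi)]
    middle = [return_map(x, y_mid, n) for x in xs]
    inside = all(abs(px - Q) < EPS for px, _ in edges + middle)  # observation (1)
    below = all(py < y_lo for _, py in middle)                   # observation (2)
    above = all(py > y_hi for _, py in edges)                    # observation (3)
    return inside, below, above
```

With these hypothetical values, `check(6)` reports all three conditions as satisfied, whereas for `n = 1` the images of the top and bottom edges do not yet clear the top of $R_n$; this mirrors the requirement that $n$ be sufficiently large.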


Theorem 1. Let $f : M \to M$ be a surface diffeomorphism, $p$ a hyperbolic periodic point and $q$ an isolated point of one-sided homoclinic tangency of the type illustrated in Fig. 1 (iii) or (iv). For $\varepsilon > 0$ small there exists $\delta > 0$ so that for $n$ sufficiently large, there is a closed invariant set contained in $R_n$ on which $f^{n_0+n}$ has the full two shift as a topological factor.

Since the topological entropy of a map is no less than the topological entropy of any of its topological factors, Theorem 1 immediately implies that the topological entropy of $f^{n_0+n}$ restricted to $R_n$ is positive (more precisely, at least $\log 2$). Applying Katok’s theorem [K1,K2] on the existence of horseshoes which carry most of the entropy for a surface diffeomorphism, we obtain the existence of horseshoes near the homoclinic tangency.

Corollary 1. Let $f : M \to M$ be a surface diffeomorphism, $p$ a hyperbolic periodic point and $q$ an isolated point of one-sided homoclinic tangency of the type illustrated in Fig. 1 (iii) or (iv). For $\varepsilon > 0$ small there exists $\delta > 0$, so that for $n$ sufficiently large, there are horseshoes contained in $R_n$. Furthermore, for any $\eta > 0$, the map $f^{n_0+n}$ restricted to $R_n$ possesses a horseshoe which carries topological entropy at least $\log 2 - \eta$.

An essential component in our proof, which works only in dimension two, is the existence of a local $C^{1+\alpha}$ linearization around a hyperbolic fixed point. In arbitrary dimensions one only knows the existence of a local $C^\alpha$ linearization and there are examples which illustrate that a local $C^1$ linearization need not exist. Another essential component in our proof is Katok’s theorem, which again only holds in dimension two. We end by posing a challenge.

Open Problem. Can Theorem 1 and Corollary 1 be extended to dimensions greater than two?

References

[AH] Afraimovich, V. and Hsu, S.: Lectures on Chaotic Dynamical Systems. Manuscript, 1998
[B] Belitskii, G.R.: Functional Equations and the Conjugacy of Local Diffeomorphisms of a Finite Smoothness Class. Soviet Math. Dokl. 13, 56–59 (1972)
[BW] Burns, K. and Weiss, H.: A Geometric Criterion for Positive Topological Entropy. Commun. Math. Phys. 192, 95–118 (1995)
[GS1] Gavrilov, N. and Silnikov, L.: On 3-Dimensional Dynamical Systems Close to Systems With a Structurally Unstable Homoclinic Curve, I. Math. USSR Sb. 88, 4, 467–485 (1972)
[GS2] Gavrilov, N. and Silnikov, L.: On 3-Dimensional Dynamical Systems Close to Systems With a Structurally Unstable Homoclinic Curve, II. Math. USSR Sb. 90, 1, 139–156 (1973)
[GS] Gavrilov, N. and Silnikov, L.: Dynamical systems with Structurally Unstable Homoclinic Curves. Soviet Math. Dokl. 33, 234–238 (1986)
[GST] Gonchenko, S., Shilnikov, L., Turaev, D.: Dynamical Phenomena in Systems with Structurally Unstable Poincaré Homoclinic Orbit. Chaos 6, 1, 15–31 (1996)
[H] Hartman, P.: On Local Homeomorphisms of Euclidean Spaces. Bol. Soc. Mat. Mexicana 5, 2, 220–241 (1960)
[K1] Katok, A.: Lyapunov Exponents, Entropy and Periodic Orbits for Diffeomorphisms. Publ. Math. Inst. Hautes Études Sci. 51, 137–173 (1980)
[K2] Katok, A.: Nonuniform Hyperbolicity and Structure of Smooth Dynamical Systems. Proc. International Congress of Mathematicians Warszawa 1983 2, pp. 1245–1254
[N1] Newhouse, S.: Diffeomorphisms With Infinitely Many Sinks. Topology 13, 9–18 (1974)
[N2] Newhouse, S.: The Abundance Of Wild Hyperbolic Sets And Nonsmooth Stable Sets For Diffeomorphisms. Publ. Math. Inst. Hautes Études Sci. 50, 101–151 (1979)
[P] Palis, J.: Homoclinic Bifurcations, Sensitive-Chaotic Dynamics And Strange Attractors. In: Dynamical Systems and Related Topics (Nagoya, 1990), Adv. Ser. Dyn. Syst. 9, River Edge, NJ: World Sci. Publishing, 1991, pp. 466–472

Geometric Criterion for Positive Topological Entropy

[PT] [S]

273

Palis, J. and Takens, F.: Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations. Cambridge: CUP Cambridge Studies in Advanced Mathematics, 35, 1993 Smale, S.: Diffeomorphisms with many periodic points. In: Differential and Combinatorical Topology, (edited by S.S. Cairnes), Princeton, NJ: Princeton University Press, 1965, pp. 63–80

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 275 – 281 (1999)


© Springer-Verlag 1999

On the Virial Theorem in Quantum Mechanics

V. Georgescu¹, C. Gérard²

¹ CNRS, Département de Mathématiques, Université de Cergy-Pontoise, 2 avenue Adolphe Chauvin, 95302 Cergy-Pontoise Cedex, France

² Centre de Mathématiques, UMR 7640 CNRS, Ecole Polytechnique, 91128 Palaiseau Cedex, France

Received: 7 January 1999 / Accepted: 2 June 1999

Abstract: We review the various assumptions under which abstract versions of the quantum mechanical virial theorem have been proved. We point out a relationship between the virial theorem for a pair of operators $H$, $A$ and the regularity properties of the map $\mathbb{R} \ni s \mapsto e^{isA}(z - H)^{-1}e^{-isA}$. We give an example showing that the statement of the virial theorem in [CFKS] is incorrect.

The Virial Theorem in Quantum Mechanics

The virial relation is the statement that if $H$, $A$ are two selfadjoint operators on a Hilbert space $\mathcal{H}$, the expectation value of the commutator $[H, iA]$ vanishes on eigenvectors of $H$:

$$1_{\{\lambda\}}(H)\,[H, iA]\,1_{\{\lambda\}}(H) = 0. \tag{1}$$
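The formal computation behind (1) can be sketched as follows. This is only a heuristic, under the assumption (not guaranteed in general, and exactly the point at issue in what follows) that the eigenvector $u$ lies in $D(H) \cap D(A)$; the inner product is taken conjugate-linear in its first argument, which is a convention chosen here for illustration.

```latex
% Heuristic sketch: assumes Hu = \lambda u with \lambda real and u \in D(H) \cap D(A).
\begin{align*}
(u, [H, iA]u) &:= i\big[(Hu, Au) - (Au, Hu)\big] \\
              &= i\big[\lambda (u, Au) - \lambda (Au, u)\big] \\
              &= i\lambda\big[(u, Au) - (u, Au)\big] = 0 ,
\end{align*}
% where (Au, u) = (u, Au) because A is selfadjoint and u \in D(A).
```

The whole difficulty is that the hypothesis $u \in D(A)$ fails in general, so the second line is not justified without further assumptions on the pair $H$, $A$.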

The virial relation is a very important part of Mourre’s positive commutator method. In fact, combined with a positive commutator estimate, one can use the virial relation to obtain the local finiteness of point spectrum (or even the absence of point spectrum). Moreover, for Hamiltonians having a multiparticle structure, it is an essential tool to prove the positive commutator estimate itself (see e.g. [Mo,PSS,FH]). If $H$, $A$ are both unbounded operators, some care has to be taken with the definition of the commutator $[H, iA]$, which a priori is only defined as a quadratic form on $D(H) \cap D(A)$. A rather weak assumption under which (1) can be formulated without ambiguity is the following one: There exists a subspace $S \subset D(H) \cap D(A)$, dense in $D(H^n)$ for some $n \in \mathbb{N}$, such that

$$|(Hu, Au) - (Au, Hu)| \le C(\|H^n u\|^2 + \|u\|^2), \quad u \in S. \tag{2}$$

The quadratic form $[H, iA]$ then extends uniquely from $S$ to $D(H^n)$, which means that the left-hand side of (1) has an unambiguous meaning.


The obstacle to a direct proof of (1) is of course that an eigenvector of $H$ need not be in $D(A)$. Actually the counterexample that we will construct below shows that the virial relation does not hold under assumption (2). To overcome this, additional assumptions on $H$ and $A$ are needed. To our knowledge, three different types of assumptions have been used in the literature to prove the virial theorem in an abstract setting.

• In [Mo, Prop. II.4], (1) is proved under the following assumptions:

(M)
i) $D(H) \cap D(A)$ is dense in $D(H)$,
ii) $e^{isA}$ preserves $D(H)$ and for each $u \in D(H)$, $\sup_{|s| \le 1} \|H e^{isA} u\| < \infty$,
iii) the quadratic form $[H, iA]$ on $D(H) \cap D(A)$ is bounded below, closeable, and it extends as a bounded operator from $D(H)$ to $\mathcal{H}$.

In fact the condition “$e^{isA}$ preserves $D(H)$” implies i) and the second part of ii), see [ABG, Prop. 3.2.5]. Moreover, it was noticed in [PSS] that Mourre’s proof works without change under a condition weaker than iii). So the assumptions which are really needed for the validity of Mourre’s proof are:

(M′)
i) $e^{isA}$ preserves $D(H)$,
ii) $|(Hu, Au) - (Au, Hu)| \le C(\|Hu\|^2 + \|u\|^2)$, $u \in D(H) \cap D(A)$.

• In [ABG, Prop. 7.2.10], (1) is proved if $H$ is of class $C^1(A)$, i.e., if

(ABG) $\exists z \in \mathbb{C} \setminus \sigma(H)$ such that $\mathbb{R} \ni s \mapsto e^{isA} R_z e^{-isA}$ is $C^1$ for the strong topology of $B(\mathcal{H})$.

We have used the notation $R_z = (z - H)^{-1}$. Two equivalent characterizations of the $C^1(A)$ property in terms of commutators are:

(ABG′) $\exists z \in \mathbb{C} \setminus \sigma(H)$ such that $|(Au, R_z u) - (R_z^* u, Au)| \le C\|u\|^2$, $u \in D(A)$,

and:

(ABG″)
i) $\exists z \in \mathbb{C} \setminus \sigma(H)$ such that $R_z D(A) \subset D(A)$, $R_z^* D(A) \subset D(A)$,
ii) $|(Hu, Au) - (Au, Hu)| \le C(\|Hu\|^2 + \|u\|^2)$, $u \in D(H) \cap D(A)$.

• Finally in [CFKS, Theorem 4.6], (1) is proved under the following assumptions:

(CFKS)
i) $D(H) \cap D(A)$ is dense in $D(H)$,
ii) $|(Hu, Au) - (Au, Hu)| \le C(\|Hu\|^2 + \|u\|^2)$, $u \in D(H) \cap D(A)$,
iii) $\exists H_0$, selfadjoint, such that $D(H) = D(H_0)$, $[H_0, iA]$ extends as a bounded operator from $D(H_0)$ to $\mathcal{H}$, and $D(A) \cap D(H_0 A)$ is a core for $H_0$.

Since $D(H_0 A) = \{u \in D(A) \mid Au \in D(H_0)\} \subset D(A)$, one can suspect that there is a misprint in the last condition and that it should be replaced by the stronger version: $D(H_0) \cap D(H_0 A)$ is a core for $H_0$. Anyway, such a change does not invalidate the discussion below. It is easy to verify that (M) implies that $e^{isA} R_z e^{-isA}$ is in $B(\mathcal{H}, D(H))$ and that $\mathbb{R} \ni s \mapsto e^{isA} R_z e^{-isA}$ is $C^1$ for the strong topology of $B(\mathcal{H}, D(H))$,


and hence (M) implies (ABG). The relation between (M′) and (ABG) is even more straightforward: if $e^{isA}$ preserves $D(H)$ then (M′) is equivalent to (ABG) (see Theorem 6.3.4 in [ABG]). If $H \in C^1(A)$ then $(Au, R_z u) - (R_z^* u, Au)$ is the quadratic form of a bounded operator $[A, R_z]_0 \in B(\mathcal{H})$ (cf. (ABG′)). From (ABG″) it then follows that $D(H) \cap D(A)$ is a core of $H$ and that the quadratic form $(Hu, Au) - (Au, Hu)$ is continuous for the topology of $D(H)$, hence it extends uniquely to a continuous quadratic form $[H, A]_0$ on $D(H)$. Identifying $D(H) \subset \mathcal{H} \subset D(H)^*$ in the usual way, $[H, A]_0$ becomes a continuous operator $D(H) \to D(H)^*$ and then one has (see [ABG, Theorem 6.2.10])

$$[A, R_z]_0 = R_z [H, A]_0 R_z. \tag{3}$$

We shall prove in an appendix that $D(H)$ is preserved by $e^{isA}$ if $[H, A]_0 D(H) \subset \mathcal{H}$. In other terms, if (ABG) holds and $[H, A]_0 D(H) \subset \mathcal{H}$, then (M) is satisfied. That (ABG) is more general than (M′) can be seen from the following example: consider in $L^2(\mathbb{R})$ the operator $H$ of multiplication by a real rational function (which may have poles, e.g. take $H(x) = 1/x$) and let $A = -i\,d/dx$; then clearly $H \in C^1(A)$ but $e^{isA}$ and $(A + i\lambda)^{-1}$ do not leave the domain of $H$ invariant.

In conditions (M) and (ABG), assumptions are made either on the action of $e^{isA}$ on $D(H)$ or on the action of $(z - H)^{-1}$ on $D(A)$. No comparable assumptions are made in condition (CFKS). However, reading the proof (in particular the proof of [CFKS, Lemma 4.5]) one can see that the assumption that $(z - H_0)^{-1}$ preserves $D(A)$ is implicitly used to justify the identity (3) (with $H$ replaced by $H_0$). We give below an example showing that the virial relation does not hold if one only assumes (CFKS) (or a slightly stronger version of it). In particular, we show that the relation $(A + i\lambda)^{-1} D(H) \subset D(H)$, which plays a crucial role in the argument from [CFKS], is not true under their conditions.

Finally let us mention that in concrete situations (e.g. $\mathcal{H}$ is an $L^2$ space and $H$, $A$ are differential operators), the use of cutoff and regularization arguments can be an alternative to the abstract approach relying on (M) or (ABG) (see for example [W,K]).

Results

Let us introduce the following definition concerning multicommutators: we set $\mathrm{ad}^0_A H = H$. For $k \ge 0$, if $\mathrm{ad}^k_A H$ is a bounded operator from $D(H)$ to $\mathcal{H}$ and the quadratic form $[\mathrm{ad}^k_A H, A]$ on $D(H) \cap D(A)$ extends as a bounded operator from $D(H)$ into $\mathcal{H}$, we denote it by $\mathrm{ad}^{k+1}_A H$.

Theorem 1. There exists a pair $H$, $A$ of selfadjoint operators on a Hilbert space $\mathcal{H}$ such that:
i) $H$, $A$ satisfy (CFKS),
ii) the multicommutators $\mathrm{ad}^k_A H$ extend as bounded operators from $D(H)$ to $\mathcal{H}$ for all $k \in \mathbb{N}$,
iii) the pair $H$, $A$ satisfies a Mourre estimate away from 0: for each compact interval $I$ in $\mathbb{R} \setminus \{0\}$ there exist $c > 0$, $K$ compact such that

$$1_I(H)\,[H, iA]\,1_I(H) \ge c\,1_I(H) + K,$$

iv) the virial relation does not hold for $H$, $A$: there exists $\lambda \in \sigma_{pp}(H)$ such that $1_{\{\lambda\}}(H)\,[H, iA]\,1_{\{\lambda\}}(H) \ne 0$.

Theorem 1 is a consequence of Theorem 2 below, which establishes a link between the virial relation and the $C^1(A)$ property. Let $H_0$ be a positive selfadjoint operator on a Hilbert space $\mathcal{H}$. For $\phi \in \mathcal{H}$ we consider the rank one perturbation of $H_0$, $H_\phi := H_0 - |\phi\rangle\langle\phi|$, which is selfadjoint with $D(H_\phi) = D(H_0)$. Note that $\lambda < 0$ is an eigenvalue of $H_\phi$ if and only if $(\phi, (H_0 - \lambda)^{-1}\phi) = 1$, and $\mathrm{Ker}(H_\phi - \lambda)$ is generated by $(H_0 - \lambda)^{-1}\phi$. Let $A$ be another selfadjoint operator on $\mathcal{H}$ such that

$D(H_0) \cap D(A)$ is dense in $D(H_0)$,
the quadratic form $[H_0, A]$ on $D(H_0) \cap D(A)$ is bounded for the topology of $D(H_0)$. (4)

Theorem 2. Assume that $H_0$ is positive and $H_0$, $A$ satisfy (4). Assume that the virial relation holds for $H_\phi$, $A$ for each $\phi$ in a core $S$ of $A$. Then $H_0$ is of class $C^1(A)$.

Proof. Let $\phi \in S$, $\lambda < 0$, $u = (H_0 - \lambda)^{-1}\phi$, $\alpha^2 = (\phi, u)^{-1}$, so that $\lambda$ is an eigenvalue of $H_{\alpha\phi}$. Since $\alpha\phi \in S$ and by hypothesis the virial relation holds for $H_{\alpha\phi}$, $A$, we have:

$$0 = (u, [H_0, A]_0 u) + \alpha^2 (u, A\phi)(\phi, u) - \alpha^2 (u, \phi)(A\phi, u)$$
$$\;\; = ((H_0 - \lambda)^{-1}\phi, [H_0, A]_0 (H_0 - \lambda)^{-1}\phi) + ((H_0 - \lambda)^{-1}\phi, A\phi) - (A\phi, (H_0 - \lambda)^{-1}\phi).$$

Using (4), this implies that

$$|((H_0 - \lambda)^{-1}\phi, A\phi) - (A\phi, (H_0 - \lambda)^{-1}\phi)| \le C\|\phi\|^2, \quad \forall \phi \in S.$$

Since $S$ is dense in $D(A)$, this implies (ABG′) and hence that $H_0$ is of class $C^1(A)$. □

If we assume the following condition, which is stronger than (4):

$D(H_0) \cap D(A)$ is dense in $D(H_0)$,
$[H_0, A]$ extends to a bounded operator $[H_0, A]_0 : D(H_0) \to \mathcal{H}$,
$D(H_0) \cap D(H_0 A)$ is dense in $D(H_0)$, (5)

then for $\phi \in D(A)$ we have:

$$[H_\phi, A] = [H_0, A] - [\,|\phi\rangle\langle\phi|, A\,] = [H_0, A]_0 + |A\phi\rangle\langle\phi| - |\phi\rangle\langle A\phi|,$$

and hence the pair $H_\phi$, $A$ then satisfies (CFKS). Note that if in addition to (5) we assume that the multicommutators $\mathrm{ad}^k_A H_0$ are bounded operators on $D(H_0)$, then for $\phi \in D(A^\infty) = \cap_{p \in \mathbb{N}} D(A^p)$ the multicommutators $\mathrm{ad}^k_A H_\phi$ have the same property.


By Theorem 2, to construct the pair $H$, $A$ in Theorem 1, it suffices to find a pair $H_0$, $A$ satisfying (5) such that $H_0$ is not of class $C^1(A)$. Let $\mathcal{H} = L^2(\mathbb{R}, dx)$, $q$ the operator of multiplication by $x$ in $\mathcal{H}$ and $p$ the self-adjoint operator in $\mathcal{H}$ associated to $-i\,d/dx$. We will consider the operators

$$H_0 = e^{\omega q}, \quad A = e^{\omega p} - p, \tag{6}$$

which are selfadjoint operators on their natural domains given by the functional calculus. We note that $D(A) = D(p) \cap D(e^{\omega p})$. Noting also that $D(e^{\omega p}) \subset D(e^{\alpha p})$ if $0 < \alpha < \omega$ and using Fatou’s lemma, we see that the domain of $e^{\omega p}$ can be described as follows: a function $f \in L^2(\mathbb{R})$ belongs to $D(e^{\omega p})$ if and only if $f$ has an analytic extension to the strip $\{x + iy \mid -\omega < y < 0\}$ and $\|f(\cdot + iy)\|_{L^2} \le \mathrm{const}$. Then $\lim_{y \to -\omega} f(x + iy) \equiv f(x - i\omega)$ exists in $L^2$ and one has $(e^{\omega p} f)(x) = f(x - i\omega)$. The operators $e^{\omega p}$, $e^{\omega q}$ were considered by Fuglede in [Fu] in order to show that the Heisenberg form of the canonical commutation relations is not equivalent to the Weyl form. From the Weyl form of the canonical commutation relations, $e^{i\alpha p} e^{i\beta q} = e^{i\alpha\beta} e^{i\beta q} e^{i\alpha p}$, it follows, by formally taking $\alpha = \beta = -i\omega$ with $\omega = (2\pi)^{1/2}$, that $e^{\omega p} e^{\omega q} = e^{\omega q} e^{\omega p}$. This commutation property will certainly hold on a large domain (we give below the details of the proof) although the operators $e^{\omega p}$ and $e^{\omega q}$ do not commute, which is the reason why $H_0$ is not of class $C^1(A)$.
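The formal substitution just described can be written out in one line. This is only the heuristic manipulation mentioned in the text (the Weyl relation is stated for real $\alpha$, $\beta$, so inserting imaginary values is a formal step, not a proof):

```latex
% Formal substitution \alpha = \beta = -i\omega in the Weyl relation
% e^{i\alpha p} e^{i\beta q} = e^{i\alpha\beta} e^{i\beta q} e^{i\alpha p}.
\begin{align*}
e^{\omega p} e^{\omega q}
  &= e^{i(-i\omega)^2}\, e^{\omega q} e^{\omega p}
   = e^{-i\omega^2}\, e^{\omega q} e^{\omega p} \\
  &= e^{-2\pi i}\, e^{\omega q} e^{\omega p}
   = e^{\omega q} e^{\omega p}
   \qquad \text{since } \omega^2 = 2\pi .
\end{align*}
```

The phase is trivial exactly because $\omega^2$ is a multiple of $2\pi$; for a generic $\omega$ the same manipulation leaves a nontrivial factor $e^{-i\omega^2}$.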

Lemma 1. Let $H_0$, $A$ be the pair defined in (6) for $\omega = (2\pi)^{1/2}$. Then
i) $H_0$, $A$ satisfy (5),
ii) the multicommutators $\mathrm{ad}^k_A H_0$ are bounded operators from $D(H_0)$ into $\mathcal{H}$ for all $k \in \mathbb{N}$,
iii) on $D(H_0) \cap D(A)$ we have $[H_0, iA] = \omega H_0$,
iv) $H_0$ is not of class $C^1(A)$.

Proof of Theorem 1. Applying Lemma 1 and Theorem 2 for $S = D(A^\infty)$, we see that there exists $\phi \in D(A^\infty)$ such that for $H = H_\phi$ properties i), ii) and iv) of Theorem 1 are satisfied. Property iii) follows from Lemma 1 iii) and the fact that $H - H_0$, $[H, A] - [H_0, A]$ are compact operators. □

Proof of Lemma 1. Let us consider the sequence of operators $e^{-q^2/n}$. Clearly $e^{-q^2/n}$ tends strongly to 1 in the spaces $\mathcal{H}$ and $D(e^{\omega q})$. Let us verify that the same is true in $D(e^{\omega p})$. In fact, using the Fourier transformation, we see that $e^{\omega p} e^{-q^2/n} = e^{-(q - i\omega)^2/n} e^{\omega p}$; in particular $e^{-q^2/n}$ preserves $D(e^{\omega p})$. This easily implies that $e^{-q^2/n}$ tends strongly to 1 in $D(e^{\omega p})$. Similarly we have $p\,e^{-q^2/n} = e^{-q^2/n} p + 2i\,e^{-q^2/n} q/n$, which shows that $e^{-q^2/n}$ tends strongly to 1 in $D(p)$ and hence in $D(e^{\omega p} - p)$. After conjugation by Fourier transformation, we see that the same results hold for the operator $e^{-p^2/n}$. Let now

$$T_n = e^{-q^2/n}\, e^{-p^2/n}.$$

We deduce from the above observations that

$$\operatorname*{s-lim}_{n \to +\infty} T_n = 1 \quad \text{in the spaces } D(H_0),\ D(A),\ D(H_0) \cap D(A), \tag{7}$$


where $D(H_0) \cap D(A)$ is equipped with the intersection topology. Since $T_n$ maps $\mathcal{H}$ into $D(H_0) \cap D(H_0 A)$, we see that the first and third conditions of (5) are satisfied. Let us now check the second condition of (5). We claim that

$$[H_0, iA] = \omega H_0 \quad \text{on } D(H_0) \cap D(A). \tag{8}$$

In fact let $u \in D(H_0) \cap D(A)$, and $u_n = T_n u$. By (7) it suffices to check that $(Au_n, H_0 u_n) - (H_0 u_n, Au_n) = i\omega (u_n, H_0 u_n)$ for each $n$. Since $Au_n \in D(H_0)$ and $H_0 u_n \in D(A)$, we have $(Au_n, H_0 u_n) - (H_0 u_n, Au_n) = (u_n, AH_0 u_n - H_0 A u_n)$. But $u_n$ is an entire function, decreasing faster than any exponential on each line $\mathrm{Im}\, z = \mathrm{Cst}$. Hence we have

$$AH_0 u_n(x) = e^{\omega(x - i\omega)} u_n(x - i\omega) + i\frac{d}{dx}\big(e^{\omega x} u_n(x)\big)$$
$$\;\; = e^{\omega x}\Big(u_n(x - i\omega) + i\frac{d}{dx} u_n(x)\Big) + i\omega e^{\omega x} u_n(x)$$
$$\;\; = H_0 A u_n(x) + i\omega H_0 u_n(x),$$

since $\omega^2 = 2\pi$. This proves (8) and hence the second condition of (5). Moreover it follows from (8) that the multicommutators $\mathrm{ad}^k_A H_0$ are bounded on $D(H_0)$.

Let us now prove that $H_0$ is not of class $C^1(A)$. Assume the contrary. Then $(H_0 + 1)^{-1}$ would send $D(A)$ into itself. The function $u(x) = e^{-x^2}$ belongs to $D(A)$ and $(H_0 + 1)^{-1} u$ equals $(e^{\omega x} + 1)^{-1} e^{-x^2}$. This function has a pole at $z = -i\omega/2$ and hence is not in $D(A)$. This gives a contradiction and hence $H_0$ is not of class $C^1(A)$. □
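The location of the pole used in the last step can be checked directly. The short computation below is a sketch of that check, using only $\omega^2 = 2\pi$ and the description of $D(e^{\omega p})$ as functions analytic in the strip $-\omega < \mathrm{Im}\, z < 0$:

```latex
% Zeros of e^{\omega z} + 1, using \pi = \omega^2 / 2.
\begin{align*}
e^{\omega z} + 1 = 0
  &\iff \omega z = i(2k+1)\pi, \quad k \in \mathbb{Z} \\
  &\iff z = i(2k+1)\frac{\pi}{\omega} = i(2k+1)\frac{\omega}{2} .
\end{align*}
% For k = -1 this gives z = -i\omega/2, which lies in the strip
% -\omega < \mathrm{Im}\, z < 0. Since e^{-z^2} never vanishes,
% (e^{\omega z} + 1)^{-1} e^{-z^2} has a genuine pole there.
```

So the function admits no analytic extension to the strip, hence it is not in $D(e^{\omega p})$ and therefore not in $D(A) = D(p) \cap D(e^{\omega p})$.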

Appendix

The following result is of some independent interest.

Lemma 2. Let $A$, $H$ be selfadjoint operators in a Hilbert space $\mathcal{H}$ such that $H \in C^1(A)$ and $[A, H]_0 D(H) \subset \mathcal{H}$. Then $e^{isA} D(H) \subset D(H)$ for all real $s$.

Proof. For any bounded operator $S$ of class $C^1(A)$ the commutator $[S, A]$ extends to a bounded operator in $\mathcal{H}$ denoted $[S, A]_0$, and one has

$$S e^{itA} = e^{itA} S + \int_0^t e^{i(t-s)A}\, [S, iA]_0\, e^{isA}\, ds.$$

So if $t > 0$, $u \in \mathcal{H}$:

$$\|S e^{itA} u\| \le \|Su\| + \int_0^t \|[S, A]_0 e^{isA} u\|\, ds.$$

We shall take

$$S = H_\varepsilon = H(1 + i\varepsilon H)^{-1} = -i/\varepsilon + (i/\varepsilon) R^\varepsilon,$$

where $R^\varepsilon = (1 + i\varepsilon H)^{-1}$. We set $T = [A, H]_0 (H + i)^{-1} \in B(\mathcal{H})$ and we use [ABG, Theorem 6.2.10]; then

$$[A, H_\varepsilon]_0 = R^\varepsilon T (H + i) R^\varepsilon = R^\varepsilon T H_\varepsilon + i R^\varepsilon T R^\varepsilon.$$
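The two algebraic identities used for $H_\varepsilon$ follow from the resolvent-type relation $1 - R^\varepsilon = i\varepsilon H R^\varepsilon$. A brief sketch of the verification, under the same notation as above:

```latex
% Both lines use (1 + i\varepsilon H) R^\varepsilon = 1,
% i.e. i\varepsilon H R^\varepsilon = 1 - R^\varepsilon.
\begin{align*}
H_\varepsilon = H R^\varepsilon
  &= \frac{1}{i\varepsilon}\,(1 - R^\varepsilon)
   = -\frac{i}{\varepsilon} + \frac{i}{\varepsilon} R^\varepsilon , \\
(H + i) R^\varepsilon
  &= H R^\varepsilon + i R^\varepsilon
   = H_\varepsilon + i R^\varepsilon .
\end{align*}
```

The second line is what turns $R^\varepsilon T (H + i) R^\varepsilon$ into $R^\varepsilon T H_\varepsilon + i R^\varepsilon T R^\varepsilon$.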

Since $\|R^\varepsilon\| \le 1$ we obtain

$$\|H_\varepsilon e^{itA} u\| \le \|H_\varepsilon u\| + t\|T\|\|u\| + \|T\| \int_0^t \|H_\varepsilon e^{isA} u\|\, ds.$$

From the Gronwall lemma it follows that for each $t_0 > 0$ there is a constant $C$ such that $\|H_\varepsilon e^{itA} u\| \le C(\|H_\varepsilon u\| + \|u\|)$ for all $\varepsilon > 0$, $0 \le t \le t_0$, $u \in \mathcal{H}$. Now it suffices to apply Fatou’s lemma. □

As a final remark we shall prove a version of the virial theorem. Let $A$, $H$ be selfadjoint operators on a Hilbert space $\mathcal{H}$ such that $e^{isA} D(|H|^\sigma) \subset D(|H|^\sigma)$ for some real number $\sigma \ge 1/2$ and all $s$ (then the domain of $|H|^\tau$ will also be invariant if $0 \le \tau \le \sigma$). Set $\mathcal{K} = D(|H|^\sigma)$ and identify $\mathcal{K} \subset \mathcal{H} \subset \mathcal{K}^*$. Then the group induced by $e^{isA}$ in $\mathcal{K}$ is strongly continuous, hence the space $D(A; \mathcal{K}) = \{u \in \mathcal{K} \cap D(A) \mid Au \in \mathcal{K}\}$ is dense in $\mathcal{K}$. So the sesquilinear form $(Au, Hu) - (Hu, Au)$ is well defined on the dense linear subspace $D(A; \mathcal{K})$ of $\mathcal{K}$ (one needs this restricted subspace only if $\sigma < 1$; e.g. if $\sigma = 1/2$ then one does not have anything better than $H\mathcal{K} \subset \mathcal{K}^*$). Assume, moreover, that the preceding sesquilinear form is continuous for the topology of $\mathcal{K}$ and denote by $[A, H]_0$ the operator in $B(\mathcal{K}, \mathcal{K}^*)$ associated to it. If we set $A_\varepsilon = (e^{i\varepsilon A} - 1)(i\varepsilon)^{-1}$, then it is easily seen that

$$[H, iA_\varepsilon] = \frac{1}{\varepsilon} \int_0^\varepsilon e^{i(\varepsilon - s)A}\, [H, iA]_0\, e^{isA}\, ds$$

holds in the strong operator topology of $B(\mathcal{K}, \mathcal{K}^*)$. In particular we see that $[H, iA_\varepsilon]$ converges strongly in $B(\mathcal{K}, \mathcal{K}^*)$ to $[H, iA]_0$. This clearly implies the virial theorem, because the eigenvectors of $H$ belong to $\mathcal{K}$.

References

[ABG] Amrein, W., Boutet de Monvel, A., Georgescu, V.: C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians. Basel–Boston–Berlin: Birkhäuser, 1996
[CFKS] Cycon, H.L., Froese, R., Kirsch, W., Simon, B.: Schrödinger Operators with applications to Quantum Mechanics and Global Geometry. Berlin–Heidelberg–New York: Springer, 1987
[FH] Froese, R., Herbst, I.: A new proof of the Mourre estimate. Duke Math. J. 49, 1075–1085 (1982)
[Fu] Fuglede, B.: On the relation PQ − QP = −i1. Math. Scand. 20, 79–88 (1967)
[K] Kalf, H.: The quantum mechanical virial theorem and the absence of positive energy bound states of Schrödinger operators. Israel J. Math. 20, 57–69 (1975)
[Mo] Mourre, E.: Absence of singular continuous spectrum for certain selfadjoint operators. Commun. Math. Phys. 78, 519–567 (1981)
[PSS] Perry, P., Sigal, I.M., Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. of Math. 114, 519–567 (1981)
[W] Weidmann, J.: The virial theorem and its application to the spectral theory of Schrödinger operators. Bull. Am. Math. Soc. 77, 452–456 (1967)

Communicated by B. Simon

Commun. Math. Phys. 208, 283 – 308 (1999)


© Springer-Verlag 1999

Proof of the Symmetry of the Off-Diagonal Heat-Kernel and Hadamard’s Expansion Coefficients in General $C^\infty$ Riemannian Manifolds

Valter Moretti

Department of Mathematics, Trento University, and Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Trento, 38050 Povo (TN), Italy. E-mail: [email protected]

Received: 16 February 1998 / Accepted: 2 June 1999

Abstract: We consider the problem of the symmetry of the off-diagonal heat-kernel coefficients as well as the coefficients corresponding to the short-distance-divergent part of the Hadamard expansion in general smooth (analytic or not) manifolds. The requirement of such a symmetry played a central rôle in the theory of the point-splitting one-loop renormalization of the stress tensor in either Riemannian or Lorentzian manifolds. Actually, the symmetry of these coefficients has been assumed as a hypothesis in several papers concerning these issues without an explicit proof. The difficulty of a direct proof is related to the fact that the considered off-diagonal heat-kernel expansion, also in the Riemannian case, in principle, may not be a proper asymptotic expansion. On the other hand, direct computations of the off-diagonal heat-kernel coefficients are impossibly difficult in nontrivial cases and thus no case is known in the literature where the symmetry does not hold. By approximating $C^\infty$ metrics with analytic metrics in common (totally normal) geodesically convex neighborhoods, it is rigorously proven that in general $C^\infty$ Riemannian manifolds, any point admits a geodesically convex neighborhood where the off-diagonal heat-kernel coefficients, as well as the relevant Hadamard expansion coefficients, are symmetric functions of the two arguments.

Introduction

After earlier works (e.g. see [Wa78]), the symmetry of the coefficients which appear in the short-distance-divergent part of the Hadamard expansion of the two-point functions of a quantum state in curved spacetime has been tacitly assumed to hold in the mathematical-physics literature. This symmetry plays a central rôle in the renormalization procedure of the one-loop stress tensor in curved spacetime, in both Lorentzian and Euclidean Quantum Field Theory. In fact, it is directly related to the conservation of the stress tensor and the appearance of the conformal anomaly [Wa78].
On the other hand, the symmetry of Hadamard coefficients is related to that of the heat-kernel coefficients [BD82,Fu91,Mo98a,Mo98b]. Despite the relevance of this assumption, to the author’s knowledge, up to now no rigorous proof of these symmetries exists in the literature¹. In this paper, we shall see that the problem of the symmetry of the heat-kernel/Hadamard coefficients is not as trivial as it seems at first sight. That is related to the fact that, in principle, the heat-kernel expansion might not be asymptotic in the rigorous sense, even in the Riemannian case, when it is performed off-diagonal. We shall prove that the heat-kernel coefficients, in the Riemannian case, are actually symmetric in a geodesically convex neighborhood of any point of a $C^\infty$ manifold. As a result we shall also see that the requirement of analyticity of the manifold assumed in earlier work [Wa78,BO86] can be completely dropped (as argued in [FSW78]).

1. Generalities, the Problem of the Symmetry of the Heat-Kernel and Hadamard’s Coefficients

1.1. Notations, general hypotheses and preliminaries. Within this work, $M$ denotes a (Hausdorff, paracompact, connected, orientable) $D$-dimensional $C^\infty$ manifold endowed with a non-singular metric $g$. $M$ can be a manifold with smooth boundary $\partial M$ and we shall consider $g \equiv g_{ab}$ either Lorentzian or Riemannian. We shall deal with differential operators of the form

$$A_0 = -\Delta + V : C_0^\infty(M) \to L^2(M, d\mu_g), \tag{1}$$

if $M$ is Riemannian, and

$$A_0 = -\Delta + V : D(M) \to C^\infty(M), \tag{2}$$

if $M$ is Lorentzian. $D(M)$ is any domain of smooth functions like $C_0^\infty(M)$ or $C^\infty(M)$. $\Delta := \nabla_a \nabla^a$ denotes the Laplace-Beltrami operator and $\nabla$ means the covariant derivative associated to the metric connection. $d\mu_g$ denotes the Borel measure induced by the metric, and $V$ is a real function of $C^\infty(M)$. The requirements above are the general hypotheses which we shall refer to throughout this paper. When $M$ is Riemannian, we sometimes suppose also that $A_0$ is positive, namely, that it is bounded below by some constant $C \ge 0$ (for sufficient conditions for this requirement see [Da89]).

Let us give some definitions used throughout this work and recall some known and useful relevant results. In this paper, a manifold with boundary $(M, \partial M)$ is defined by giving a pair $(N_M, f_M)$, where $N_M$ is a manifold and $f_M : N_M \to \mathbb{R}$ denotes a differentiable function. The set $M$ is defined by $M := \{p \in N_M \mid f_M(p) \ge 0\}$ and the boundary $\partial M$ is defined by $\partial M := \{p \in N_M \mid f_M(p) = 0\}$. Throughout the paper an analytic function is a real-valued function which admits a (multivariable) Taylor expansion in a neighborhood of any point of its domain. Moreover, “smooth” means $C^\infty$ whenever we do not specify further.

In the Riemannian case, $A_0$ is symmetric and admits self-adjoint extensions [Mo98a]. In particular, following the spirit of [Wa78] and [Mo98a,Mo98b], at least in the case $\partial M = \emptyset$ and $A_0 \ge 0$ we shall deal with the Friedrichs extension [RS80], which will be denoted by $A$ throughout this paper. We recall that $A$ has the same lower bound as $A_0$.

¹ For instance, in [Wa78], such a symmetry was (indirectly) argued to hold for the analytic case. In [FSW78], the symmetry was argued to hold for the $C^\infty$ case. Nevertheless, these papers did not report the corresponding proof. The literature concerning the point-splitting procedure subsequent to [Wa78], such as [BO86], assumes that symmetry implicitly.


Moreover $A_0$ is essentially self-adjoint and thus $A$ is its unique self-adjoint extension whenever either $M$ is compact [Mo98a] or $V \equiv 0$ [Da89]. Concerning derivative operators, we shall employ the following notations, in a fixed local coordinate system,

$$D_x^\alpha := \left.\frac{\partial^{|\alpha|}}{\partial (x^1)^{\alpha_1} \cdots \partial (x^D)^{\alpha_D}}\right|_x, \tag{3}$$

where the multi-index $\alpha$ is defined by $\alpha := (\alpha_1, \cdots, \alpha_D)$, any $\alpha_i \ge 0$ being a natural number ($i = 1, \cdots, D$) and $|\alpha| := \alpha_1 + \cdots + \alpha_D$. Whenever $I \subset N$ is a closed subset of the manifold $N$, $f \in C^k(I; \mathbb{R}^n)$ indicates an $\mathbb{R}^n$-valued function defined on $I$ which admits a $C^k$ extension on some open set $I' \subset N$ such that $I \subset I'$. Finally, in a fixed coordinate system, $\nabla f$ indicates the function which maps any point $q$ to the Jacobian matrix evaluated at $q$ of the function $f : p \mapsto f(p)$.

In any manifold $M$ endowed with a (not necessarily metric) affine connection, the notion of normal neighborhood centered on a point $p$, $N_p$, indicates any open neighborhood of the point $p \in M$ of the form $N_p = \exp_p(B)$, $B \subset T_p(M)$ being an open starshaped neighborhood of the origin such that $\exp_p$ defines a diffeomorphism therein. Then, the components of the vectors $v \in T_p(M)$ contained in $B$ define normal coordinates on $M$ centered in $p$ via the function $v \mapsto \exp_p v$. Notice that any $q \in N_p$ can be connected with $p$ by only one geodesic segment completely contained in $N_p$. It minimizes the length of the class of curves connecting these two points when the connection is metric, the metric is Riemannian and $B$ is a geodesic ball. A totally normal neighborhood of a point $p \in M$ is a neighborhood² of $p$, $V_p \subset M$, such that, for any $q \in V_p$, there is a normal neighborhood centered on $q$ containing $V_p$. Therefore, if $q$ and $q'$ belong to the same totally normal neighborhood, there is only one geodesic segment connecting these two points completely contained in any sufficiently large normal neighborhood centered on each of the points (but this segment is not necessarily contained in $V_p$). Notice that a coordinate system which covers any totally normal neighborhood does exist in any case: it is that defined in a sufficiently large normal neighborhood of one of its points.

Finally, a geodesically convex neighborhood of a point $p \in M$ is a totally normal neighborhood of $p$, $U_p$, such that, for any pair $q, q' \in U_p$, there is only one geodesic segment which is completely contained in $U_p$ and connects $q$ with $q'$. Statements and proofs of existence of normal, totally normal and convex neighborhoods of any point of any geodesically complete manifold can be found in [KN63] for affine connections and [dC92,BEE96] for the Riemannian and Lorentzian case respectively. If a (complete) Riemannian manifold has an injectivity radius $r > 0$ [dC92], then each pair of points $p$, $q$ with $d(p, q) < r$ is contained in a totally normal neighborhood. If $M$ admits a boundary, all the definitions above and results concerning normal, totally normal and convex neighborhoods of points away from the boundary hold true.

1.2. Heat-kernel coefficients, Hadamard parametrix and the problem of the symmetry in the arguments. In this part we discuss informally some features of heat-kernel coefficients in both the Lorentzian and the Riemannian case. In our general hypotheses on the manifold $M$ endowed with the metric $g$, fixing any open totally normal (or geodesically convex) neighborhood $N$, the so-called world function is defined, for $(x, y) \in N \times N$, as the real-valued map

$$(x, y) \mapsto \sigma(x, y) := \frac{1}{2}\, g(x)\big(\exp_x^{-1}(y), \exp_x^{-1}(y)\big) \quad \Big(= \frac{1}{2}\, g(y)\big(\exp_y^{-1}(x), \exp_y^{-1}(x)\big)\Big). \tag{4}$$

² In this work, a neighborhood of a point is any set which includes an open set containing the point.
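As a simple illustration of definition (4), one may look at flat space. The example below is an assumption made here for illustration ($M = \mathbb{R}^D$ with the Euclidean metric, which is not a case singled out in the text); there the exponential map is a translation and the symmetry of $\sigma$ is manifest:

```latex
% Illustrative assumption: M = R^D with the Euclidean metric g = \delta.
\begin{align*}
\exp_x(v) &= x + v, \qquad \exp_x^{-1}(y) = y - x, \\
\sigma(x, y) &= \tfrac{1}{2}\, \delta_{ab}\, (y - x)^a (y - x)^b
              = \tfrac{1}{2}\, |x - y|^2 = \sigma(y, x) .
\end{align*}
```

In this case $\sigma$ is half the squared distance, in agreement with the relation $\sigma = d^2/2$ recalled for Riemannian metrics.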

$\sigma(x, y)$ does not depend on the particular open totally normal neighborhood chosen to contain $x$ and $y$. As is well-known [KN63,BEE96,dC92], $\sigma$ is always smooth in $(x, y)$ and furthermore analytic in $x$ and $y$ (separately in general) whenever the metric is analytic. This is because, in open totally normal neighborhoods, the function $(x, y) \mapsto \exp_x^{-1}(y)$ is always $C^\infty$ in $(x, y)$, or analytic in $x$ and $y$ [KN63], if the metric is so. Moreover, whenever the metric is Riemannian and the manifold has an injectivity radius $r > 0$ (this holds for compact manifolds in particular), $\sigma$ can be defined on its natural domain $D_r := \{(p, q) \in M \times M \mid d(p, q) < r\}$, $d$ being the Riemannian distance on the manifold. Indeed, in the considered situation, $\sigma$ belongs to $C^\infty(D_r)$. This is because of Sobolev’s Lemma [Ru97], since, in the Riemannian case, the function $(x, y) \mapsto \sigma(x, y) = d^2(x, y)/2$ is continuous everywhere on $M \times M$ and is smooth in each variable separately on $D_r$.

We pause to summarize the main features of the “small $t$ expansion” of the heat kernel $K(t; x, y)$ of the positive operator $A_0$ in compact Riemannian manifolds, supposing that our general hypotheses are fulfilled [Ch84,Gi84,Sh87,Da89,Ca90,Fu91,Ta96,Mo98a]. The heat kernel is the integral kernel of the semigroup of positive self-adjoint operators $e^{-tA}$, $t \in\, ]0, +\infty[$, which is a solution in $C^\infty((0, +\infty) \times M \times M)$ of the “heat equation” with respect to the operator $A_0$,

$$\frac{\partial}{\partial t} K(t, x, y) + A_{0x} K(t, x, y) = 0, \tag{5}$$

with the initial condition in $C^\infty(M)$,

$$\lim_{t \to 0^+} K(t, x, y) = \delta(x, y). \tag{6}$$

Fixing a sufficiently small open geodesically convex neighborhood of the manifold³ $N$, the “heat-kernel expansion at $t \to 0^+$” is the decomposition of the heat kernel

$$K(t; x, y) = \frac{e^{-\sigma(x,y)/2t}}{(4\pi t)^{D/2}} \sum_{j=0}^{N} a_j(x, y)\, t^j + \frac{e^{-\eta\sigma(x,y)/2t}}{(4\pi t)^{D/2}}\, t^N O_{\eta,N}(t; x, y), \tag{7}$$

which holds for $x, y \in N$. In (7), $\eta$ is a real number which is arbitrarily fixed in $]0, 1[$ and the function $O_{\eta,N}$ satisfies

$$|O_{\eta,N}(t; x, y)| < C_{\eta,N}\, |t|, \tag{8}$$

uniformly in $(x, y)$. Above, $C_{\eta,N} \ge 0$ does not depend on $t$. Finally, the coefficients $a_j(x, y)$ are smooth functions defined in $N \times N$ by recurrence relations we shall examine shortly (see [Mo98a] and the appendix of [Mo98b] for details). Similar expansions have been studied extensively in physics and mathematics and have been generalized considering Laplace-like operators acting on smooth sections of vector bundles on Riemannian/Lorentzian manifolds (see [Av98] and references therein).

³ Actually, with small changes, a very similar decomposition holds in the set $D_r$ given above [Mo98a].

For $x \ne y$, the expansion above is not a proper asymptotic expansion because the remainder

$$R_N(t; x, y) := \frac{e^{-\eta\sigma(x,y)/2t}}{(4\pi t)^{D/2}}\, t^N O_{\eta,N}(t; x, y), \tag{9}$$

in principle, may be less infinitesimal than previous terms in the expansion in spite of vanishing faster than any positive power of $t$ as $t \to 0^+$. Taking the limit as $\eta \to 1^-$ on the right-hand side of (9), one gets

$$R_N(t; x, y) := \frac{e^{-\sigma(x,y)/2t}}{(4\pi t)^{D/2}}\, t^N O_N(t; x, y). \tag{10}$$

However, there is no guarantee that $O_N(t; x, y)$ vanishes or is bounded as $t \to 0^+$. Therefore, as said above, the remainder of the formally “asymptotic” expansion of $K(t; x, y)$ could be less infinitesimal than the previous terms in expansion (7). The lack of general information on the precise behaviour of the remainder of the considered expansion around $t = 0$ does not allow one to get important theorems such as the uniqueness of the coefficients $a_j(x, y)$. It is worthwhile stressing that, by the symmetry of $K(t; x, y)$ in the Riemannian case and the general symmetry of $\sigma(x, y)$, the symmetry of the coefficients $a_j(x, y)$ would follow from the uniqueness theorem trivially. Actually, at least to the author’s knowledge, there is no proof of the general off-diagonal asymptoticness of the heat-kernel expansion in the mathematical literature⁴. Conversely, there appear formulae concerning upper bounds of the heat kernel which contain some arbitrary parameter like $\eta$ above [Da89]. On the other hand, in practice, computations concerning off-diagonal heat-kernel coefficients in nontrivial cases are impossibly complicated and therefore, no counterexample is known concerning their symmetry. It is worthwhile remarking that symmetrized expansions for $K(t; x, y)$ can be obtained following different approaches such as the “Weyl calculus” [Ta96]. However, the coefficients obtained by this route satisfy different equations from the heat-kernel recurrence relations and, in general, cannot be identified with the standard heat-kernel coefficients used in physics. Obviously, in the case $x = y$, when both exponentials disappear, the heat-kernel expansion (7) is a proper asymptotic expansion.
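To fix ideas about the shape of expansion (7), it may help to look at the one case where everything is explicit. The example below is an illustrative assumption, not a case treated in the text: flat $\mathbb{R}^D$ with the Euclidean metric and $V \equiv 0$, where the heat kernel of $-\Delta$ is known in closed form.

```latex
% Illustrative assumption: M = R^D, Euclidean metric, V = 0,
% so that \sigma(x,y) = |x-y|^2 / 2.
\begin{align*}
K(t; x, y) &= \frac{e^{-|x - y|^2/4t}}{(4\pi t)^{D/2}}
            = \frac{e^{-\sigma(x, y)/2t}}{(4\pi t)^{D/2}}, \\
a_0(x, y) &= 1, \qquad a_j(x, y) = 0 \ \ (j \ge 1), \qquad
O_{\eta,N} \equiv 0 .
\end{align*}
```

Here the expansion terminates, the remainder vanishes identically, and the coefficients are trivially symmetric in $(x, y)$; the difficulty discussed above only arises for nontrivial metrics or potentials, where no such closed form is available.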
Whenever $M$ is Riemannian, complete and noncompact, the heat kernel exists as a smooth function (see [Da89,Wa79] for the pure Laplacian case) and expansion (7) remains true, in general, provided the injectivity radius of the manifold is strictly positive (this can be assured by imposing bounds on the sectional curvature of the manifold) and supposing that some bound conditions on the Ricci curvature tensor are satisfied [Ch84]. In the presence of boundaries of the Riemannian manifold $M$, $A$ being some self-adjoint extension of $A_0$ determined by fixing some boundary conditions on $\partial M$, the expansion above has to be changed just by adding a further term $h(t; x, y)$ (dependent on the boundary conditions) in the sum above. However, for $x \ne y$, the literature on this case is not very extensive, except for the analysis of the pure Laplacian case with Dirichlet boundary conditions. In this case [Ch84] $h(t; x, y)$ can be bounded by a constant times $t^{D/2} e^{-\sigma(y, \partial M)/4t}$ (or $x$ in place of $y$) and thus vanishes exponentially as $t \to 0^+$ whenever at least one of the arguments does not belong to the boundary. In the case $x = y$, $h(t; x, x)$ can also be expanded in an asymptotic series of terms.

⁴ Unfortunately, the important textbook [Ch84] reports a result concerning this point which does not seem to follow from the corresponding proof (see Appendix of [Mo98b]).


These terms carry powers of the form $t^{j+1/2-D/2}$ instead of $t^{j-D/2}$ ($j$ natural) [El95, EORBZ94] and maintain the exponential factor cited above. Hence, in the case $x = y$ away from the boundary, these added terms vanish faster than any power $t^M$ ($M \in \mathbb{N}$) as $t \to 0^+$ (see [Ch84] for the pure Laplacian case). In the Lorentzian case, the picture changes dramatically. Generally, $A_0$ is not bounded below and this drawback in general persists in its self-adjoint extensions (see footnote 5). This introduces several pathologies in dealing with the heat equation and the associated semigroup of exponentials which, as a consequence, contains unbounded operators. However, an analogous expansion should arise considering a “heat kernel” $H(s, x, y)$, solution of a “Schrödinger” equation [BD82,Fu91] formally related to the group of imaginary exponentials (which are bounded operators) of the operator $A_0$,

\[ -i \frac{\partial}{\partial s} H(s, x, y) + A_{0x} H(s, x, y) = 0, \tag{11} \]

with initial condition (holding on locally-integrable smooth test functions)

\[ \lim_{s \to 0} H(s, x, y) = \delta(x, y). \tag{12} \]

(See [Fu91,Ca90,Av98] for details.) In the Lorentzian case, one expects that some local “asymptotic” expansion of the form (see footnote 6)

\[ H(s, x, y) \sim \frac{e^{i\sigma(x,y)/2s}}{(4\pi i s)^{D/2}} \sum_{j=0}^{\infty} a_j(x, y)\,(is)^j \tag{13} \]

should hold. If the manifold has a boundary, further terms appear which depend on the boundary. Actually, the situation is much more complicated [Fu91,EF97] and we shall not address this issue here. We only notice that, if the Lorentzian manifold is locally static and $V$ is invariant under the associated group of isometries, it should be possible to get information on the Lorentzian heat-kernel coefficients by a Wick rotation into a Riemannian manifold. In this case, the analytic dependence of the heat-kernel coefficients on the time associated with the time-like Killing vector should be a consequence of the staticity of the metric. This should allow us to perform the analytic continuation to Euclidean time.
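As a purely formal cross-check (a sketch, not part of the original argument), the substitution $t = is$ relates a Riemannian expansion of the form displayed inside (24) to (13), and the heat equation to (11). The sketch assumes that the heat equation is written as $(\partial_t + A_{0x})K = 0$, a convention consistent with the recurrence relations used in this paper.

```latex
% Formal substitution t = i s, assuming the heat equation reads (\partial_t + A_{0x}) K = 0.
\begin{align*}
\frac{\partial}{\partial t} &= \frac{1}{i}\,\frac{\partial}{\partial s} = -i\,\frac{\partial}{\partial s}
  \quad\Longrightarrow\quad
  -i\,\frac{\partial H}{\partial s} + A_{0x} H = 0 , \\
e^{-\sigma/2t} &= e^{-\sigma/(2is)} = e^{\,i\sigma/2s} , \qquad
(4\pi t)^{D/2} = (4\pi i s)^{D/2} , \qquad
t^{j} = (is)^{j} .
\end{align*}
```

This is only a bookkeeping of symbols; it says nothing about the convergence or the asymptotic nature of the resulting series.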

Footnote 5: They exist because $A_0$, thought of as an operator on $L^2(M, d\mu_g)$, is symmetric (e.g., on $C_0^\infty(M)$) and it commutes with the antiunitary operator given by complex conjugation [RS80].

Footnote 6: We stress that, if the manifold is Lorentzian, $\sigma(x, y)$ can also be negative.

1.3. Determination and smoothness of heat-kernel and Hadamard coefficients. Let us consider a manifold $M$ satisfying our general hypotheses. In a local coordinate system $x^1, x^2, \cdots, x^D$, defined in any open convex neighborhood or, more generally, in any open totally normal neighborhood $T$, one can define the van Vleck-Morette determinant $\Delta_{VVM}$. This is a bi-scalar which is given in the coordinates above by [Ca90] ($g := \det g_{ab}$)

\[ \Delta_{VVM}(x, y) := (-1)^D\, \frac{g(x)}{|g(x)|}\, [g(x) g(y)]^{-1/2} \det\left( \frac{\partial^2 \sigma(x, y)}{\partial x^a \partial y^b} \right) > 0, \tag{14} \]

$x, y \in T$. On $T$, it satisfies (all derivatives are computed in the variable $x$ in the considered coordinate system)

\[ \nabla^a \sigma(x, y)\, \nabla_a \ln \Delta_{VVM}(x, y) = -D + \nabla_c \nabla^c \sigma(x, y). \tag{15} \]
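A quick flat-space sanity check of (14) and (15) may help fix conventions. The sketch below assumes $M = \mathbb{R}^D$ with the Euclidean metric in Cartesian coordinates, so that $\sigma(x,y) = \tfrac12 |x-y|^2$ and $g = 1$; these choices are illustrative assumptions and are not part of the original text.

```latex
% Flat Euclidean space: \sigma(x,y) = |x-y|^2/2, g_{ab} = \delta_{ab}, g = 1.
\begin{align*}
\frac{\partial^2 \sigma}{\partial x^a \partial y^b} &= -\delta_{ab}
  \quad\Longrightarrow\quad
  \det\!\left( \frac{\partial^2 \sigma}{\partial x^a \partial y^b} \right) = (-1)^D , \\
\Delta_{VVM}(x,y) &= (-1)^D \cdot 1 \cdot 1 \cdot (-1)^D = 1 , \\
\nabla_c \nabla^c \sigma &= D
  \quad\Longrightarrow\quad
  \nabla^a \sigma\, \nabla_a \ln \Delta_{VVM} = 0 = -D + \nabla_c \nabla^c \sigma .
\end{align*}
```

In this case the determinant is identically one, and both sides of (15) vanish.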

Notice that $\Delta_{VVM}(x, y)$ is strictly positive on $T$ (since it is a bi-scalar and $\Delta_{VVM}(x, y) = |g(x)|^{-1/2} > 0$ in normal coordinates around $y$). It is a $C^\infty$ function of $(x, y)$ whenever the manifold and the metric are $C^\infty$, and it is also analytic in $x$ and $y$ (separately, in general) whenever they are analytic. Obviously, this follows from the fact that $(x, y) \mapsto \sigma(x, y)$ is $(x, y)$-$C^\infty$ or, respectively, analytic in $x$ and $y$ in the considered domain. If the manifold is Riemannian and the injectivity radius is strictly positive, the van Vleck-Morette determinant can be defined on the set $D_r$ as a smooth function. In either the Riemannian or Lorentzian case, the coefficients $a_j$ are bi-scalars defined in any fixed open geodesically convex neighborhood $N$ containing $x$ and $y$, away from the boundary of the manifold if it exists, or equivalently, in the set $D_r$, provided the manifold is Riemannian with strictly positive injectivity radius. In the considered domain, the functions $(x, y) \mapsto a_j(x, y)$ can be heuristically determined by well-known equations with suitable regularity conditions. These equations are obtained perturbatively by inserting the considered Lorentzian or Riemannian expansions, formally computed up to $N = \infty$ (omitting the remainder), into (11) and (5), imposing that the coefficient of each power of $t$ vanishes separately and taking (15) into account. Following this route, in any normal coordinate system defined in a normal neighborhood of $y$, one finds

\[ \frac{d}{d\lambda} \left[ a_0(x(\lambda), y)\, \Delta_{VVM}^{-1/2}(x(\lambda), y) \right] = 0, \tag{16} \]

\[ -\lambda^j\, \Delta_{VVM}^{-1/2}(x(\lambda), y)\, A_{0x(\lambda)}\, a_j(x(\lambda), y) = \frac{d}{d\lambda} \left[ \lambda^{j+1} a_{j+1}(x(\lambda), y)\, \Delta_{VVM}^{-1/2}(x(\lambda), y) \right], \tag{17} \]

where $\lambda \mapsto x(\lambda)$ is the unique geodesic segment from $y \equiv x(0)$ to $x \equiv x(1)$ completely contained in the normal neighborhood. The regularity conditions are the following. The solutions have to be $(x, y)$-smooth everywhere in the considered domain; in particular they have to be bounded for $x \to y$. Moreover,

\[ a_0(x, y) \to 1 \tag{18} \]

must hold for $x \to y$, which assures the validity of (12) and (6) since also $\Delta_{VVM}^{-1/2}(x, y) \to 1$. The reason for using open geodesically convex neighborhoods should be clear. Indeed, in order to perform the derivatives contained in the differential operator $A_0$ on the left-hand side of (17), with $x$ and $y$ fixed in $N$, there must exist an open neighborhood $O_z$ of any point $z$ which belongs to the geodesic which connects $y$ with $x$, such that a geodesic which connects $y$ with any point in $O_z$ lies completely in $N$. Moreover the dependence of the considered geodesics on the endpoints has to be smooth. This is true provided $N$ is open and geodesically convex, the smoothness being a consequence of the total normality of the neighborhood. (Working in $D_r$ whenever possible, similar properties hold true and the definitions are well-posed.) The following definition gives the unique solutions in $N$ of the recurrence equations (16) and (17) ($j \geq 1$) satisfying the requirements given above, either for Riemannian or Lorentzian manifolds.


Definition 1.1. Under our general hypotheses on $M$ and $A_0$, where the former can admit a boundary and can be either Riemannian or Lorentzian, in any fixed open geodesically convex neighborhood $N$ not intersecting $\partial M$, the heat-kernel coefficients are the real-valued functions defined on $N \times N$, labeled by $j \in \mathbb{N}$,

\[ a_0(x, y) = \Delta_{VVM}^{1/2}(x, y), \tag{19} \]

\[ a_{j+1}(x, y) = -\Delta_{VVM}^{1/2}(x, y) \int_0^1 \lambda^j \left[ \Delta_{VVM}^{-1/2} A_{0x(\lambda)} a_j \right](x(\lambda), y)\, d\lambda, \tag{20} \]

$\lambda \mapsto x(\lambda)$ being the unique geodesic segment from $y \equiv x(0)$ to $x \equiv x(1)$ contained completely in $N$. (It is possible to give an analogous and equivalent definition on the set $D_r$ in a Riemannian manifold with strictly positive injectivity radius. In any case, it is obvious that, for fixed $x$ and $y$, $a_j(x, y)$ defined above does not depend on the chosen open geodesically convex neighborhood containing $x$ and $y$.) In the case of a compact Riemannian manifold, the heat-kernel coefficients defined above are just those which appear in (7) [Ch84,Mo98a,Mo98b]. Moreover, these coefficients do not depend on the particular self-adjoint extension of $A_0$. (Conversely, in the presence of a boundary, the further coefficients cited previously do depend on the self-adjoint extension.) $a_0(x, y)$ enjoys the same properties of positivity and smoothness/analyticity as $\Delta_{VVM}(x, y)$. Moreover, assuming the smoothness/analyticity of the function $V$ which appears in the operator $A_0$ and working in local coordinates defined in a geodesically convex neighborhood containing $x$ and $y$, one can generalize this result to all the coefficients $a_j$. Indeed, taking account of the smooth/analytic dependence of the geodesics (and their derivatives) on the parameter and on the initial and final conditions [dC92,KN63], and finally considering (20), one can check that the coefficients $a_j(x, y)$ are $(x, y)$-smooth or, respectively, analytic in $x$ and $y$ in the considered domain (away from the boundary). Concerning the proof of analyticity, a shortcut is to continue the functions on the right-hand side of (20) to complex values of the arguments $x$ and $y$. For example, since the integrand functions are analytic in $x$ for $y$ fixed, one can continue these to complex values in the variable $x$ for any fixed real $y$. Therefore one can prove the Cauchy–Riemann conditions for the complex components of $x$ on the left-hand side of (20) by passing the derivatives under the integral sign.
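To see the recurrence (19)-(20) at work, here is a minimal worked sketch under illustrative assumptions that are not in the original: $M = \mathbb{R}^D$ with the flat Euclidean metric, so that $\Delta_{VVM} \equiv 1$, and $A_0 = -\nabla_a \nabla^a + V_0$ with a constant potential $V_0$ (a hypothetical symbol introduced only for this example).

```latex
% Assumptions: flat R^D (\Delta_{VVM} = 1) and A_0 = -\nabla_a \nabla^a + V_0 with V_0 constant.
% By induction each a_j is constant, so the Laplacian term drops out and A_0 a_j = V_0 a_j.
\begin{align*}
a_0(x,y) &= 1 , \\
a_{j+1}(x,y) &= -\int_0^1 \lambda^{j}\, V_0\, a_j \, d\lambda = -\frac{V_0}{j+1}\, a_j
  \quad\Longrightarrow\quad
  a_j(x,y) = \frac{(-V_0)^j}{j!} , \\
\sum_{j=0}^{\infty} a_j(x,y)\, t^j &= e^{-t V_0} .
\end{align*}
```

In this special case the coefficients are constants, hence trivially symmetric, and the resummed series reproduces the factor $e^{-tV_0}$ by which the free heat kernel is multiplied when a constant is added to the Laplacian.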
Also in the presence of boundaries, one can formally use the coefficients $a_j$ above (those which do not depend on the boundary conditions) to build up a part of a formal series for Green's functions $G(x, y)$,

\[ A_{0x}\, G(x, y) = \delta(x, y). \tag{21} \]

Indeed, the Green functions above can be locally approximated by formal series which define, whenever they converge to proper solutions, “Hadamard local fundamental solutions”. Following the procedures outlined in [Mo98a,Mo98b], it can be simply shown that, both in the Riemannian and in the Lorentzian case, these series can be represented, up to the indicated order, by (the summation appears for $D \geq 4$ only)

\[ \begin{aligned} H(x, y) = {} & \sum_{j=0}^{D/2-2} (D/2 - j - 2)! \left( \frac{2}{\sigma} \right)^{D/2-j-1} \frac{a_j(x, y|A)}{(4\pi)^{D/2}} \\ & - \frac{2 a_{D/2-1}(x, y|A) - a_{D/2}(x, y|A)\, \sigma}{2 (4\pi)^{D/2}} \ln \frac{\sigma}{2} \\ & + \left( \sum_{k=1}^{N} c_k\, a_{D/2+k}(x, y|A)\, \sigma^{k+1} \right) \ln \frac{\sigma}{2} \end{aligned} \tag{22} \]

if $D$ is even, and (the summation appears for $D \geq 5$ only)

\[ \begin{aligned} H(x, y) = {} & \sum_{j=0}^{(D-5)/2} \frac{(D - 2j - 4)!!\, \sqrt{\pi}}{2^{(D-3)/2-j}} \left( \frac{2}{\sigma} \right)^{D/2-j-1} \frac{a_j(x, y|A)}{(4\pi)^{D/2}} \\ & + \frac{a_{(D-3)/2}(x, y|A)}{(4\pi)^{D/2}} \sqrt{\frac{2\pi}{\sigma}} - \frac{a_{(D-1)/2}(x, y|A)}{(4\pi)^{D/2}} \sqrt{2\pi\sigma} \end{aligned} \tag{23} \]

if $D$ is odd. The coefficients $c_k$ are real constants (determined by the recurrence procedure which defines the Hadamard local solutions), $N$ is a natural number fixed arbitrarily, and the sum in the last line of (22) can be dropped (i.e. take $N = 0$) as far as the stress-tensor renormalization is concerned. Equations (17) and (19) assure that the coefficients of the formal series for the considered Green's function, truncated at the indicated orders, satisfy, up to higher orders in powers of $\sigma$, the corresponding recurrence differential equations given in Chapter 5 of [Ga64], both in the Riemannian and in the Lorentzian case, together with the corresponding regularity/initial conditions. A further smooth part of the Hadamard expansion, usually indicated by $w(x, y)$ and expanded in positive powers of $\sigma$ and $\sigma^{1/2}$ respectively, has been omitted in (22) and (23). For $D$ even, this part depends on the arbitrary choice of the first (i.e., $\sigma^0$) coefficient ($w_0$) of its expansion. Anyway, in practice, the symmetry required in the point-splitting technique concerns only the coefficients which appear above. In the Lorentzian case one has to specify the prescription to compute the logarithms and the fractional powers of $\sigma$ in the case $\sigma < 0$; as is well known in Quantum Field Theory, this produces different types of (Hadamard expansions of) two-point functions with the same coefficients (Wightman functions and Feynman propagator). The expansions above define “parametrices” of the Green's functions at the considered order of approximation. However, nothing assures that the corresponding untruncated series converge and, most importantly, define smooth functions. This convergence does hold locally uniformly in the analytic case [Ga64]; in this case the series define proper functions, which are true local solutions of the corresponding differential equation: these are the Hadamard fundamental local solutions.
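As an illustrative cross-check of (23), not taken from the original text, consider $D = 3$ flat Euclidean space with $A_0 = -\nabla_a \nabla^a + m^2$, where $m > 0$ is a hypothetical constant mass. The sketch assumes the constant-potential coefficients $a_0 = 1$, $a_1 = -m^2$ that follow from (19) and (20) in flat space, and uses $\sigma = r^2/2$ with $r = |x - y|$.

```latex
% Assumptions: D = 3, flat space, A_0 = -\nabla_a \nabla^a + m^2, hence a_0 = 1 and a_1 = -m^2.
% For D = 3 the sum in (23) is absent and only the last two terms survive.
\begin{align*}
H(x,y) &= \frac{a_0}{(4\pi)^{3/2}} \sqrt{\frac{2\pi}{\sigma}}
        - \frac{a_1}{(4\pi)^{3/2}} \sqrt{2\pi\sigma}
        = \frac{1}{4\pi r} + \frac{m^2 r}{8\pi} , \\
G(x,y) &= \frac{e^{-mr}}{4\pi r}
        = \frac{1}{4\pi r} - \frac{m}{4\pi} + \frac{m^2 r}{8\pi} + O(r^2) .
\end{align*}
```

The terms proportional to $1/r$ and $r$ agree with the exact Green function $G$ of $-\nabla_a \nabla^a + m^2$ in three dimensions. The constant $-m/4\pi$ is smooth in $(x, y)$ and is not captured by the truncated parametrix; it belongs to the omitted contribution $w(x, y)$.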
In general non-analytic manifolds, one has a convergence in the sense of Borel only [Fu91,Fr75]. However, the convergence issue should not be so important in practice, since, within the practical point-splitting procedure, one has to take into account only a finite number of terms of these expansions and thus one can use the parametrices instead of the sum of the series. Nevertheless the requirement of the convergence of the series as well as the smoothness of the sum have been used within the proof of the conservation of the stress tensor [Wa78,Br84,BO86].


Subsequently, these requirements were partially dropped in [FSW78], where a “distributional” convergence of the Hadamard series was used, but no explicit improved proof of the results given in [Wa78] (and related papers) has been supplied. As we said previously, another very important point, used to prove the conservation of the renormalized stress tensor in the cited literature, is the symmetry of the divergent part, as $\sigma \to 0$, of the Hadamard local fundamental solutions [Wa78], up to the order of expansion considered in (22) and (23) (actually most of the known literature treats the case $D = 4$ only, but the same procedures can be generalized to different dimensions in a direct way). In [FSW78], it was argued that a proof of this property holds true also for the case of a $C^\infty$ (not analytic) manifold. Unfortunately, such a proof was not reported there and, at least to the author's knowledge, such a general proof (as well as a proof of the symmetry of the heat-kernel coefficients) does not exist in the literature. Notice that the symmetry of the heat-kernel coefficients assures the symmetry of the parametrices (22) and (23). For this reason the symmetry of the heat-kernel coefficients is important in the point-splitting technique.

2. A Proof of the Symmetry of Heat Kernel Coefficients in the Riemannian Case

2.1. Two theorems. Our proof is quite technical and involves several steps. We proceed as follows. First, one shows that the thesis holds true in the case of a real analytic manifold by using known local properties of the expansion of the heat kernel. This is the content of the first theorem we shall prove. Afterwards, one proves that, in some sense, any $C^\infty$ manifold can be approximated by analytic manifolds. This point is quite complicated because the approximation has to hold in a common geodesically convex neighborhood. This is necessary in order to give sense to a common definition of heat-kernel coefficients. Finally, one proves that the heat-kernel coefficients, defined in the common geodesically convex neighborhood, are “sequentially continuous” in the class of metrics used. Then, and this is the content of the second theorem we shall present, the symmetry for the case of a $C^\infty$ manifold follows from the “continuity” of the heat-kernel coefficients with respect to the metrics and from the symmetry in the analytic case. It is worth stressing that (local and global) approximation theorems in real analytic manifolds are well known in the literature (see [TO98] for a recent review). However, these theorems concern functions rather than metrics, and the problem of the existence of common geodesically convex neighborhoods is not treated explicitly. For this reason we prefer giving independent proofs (see the Appendix).

Lemma 2.1. Let us assume our general hypotheses on $A_0 \geq 0$ and $M$, which is explicitly supposed to be Riemannian and compact. In a coordinate system defined in an open sufficiently small (geodesically convex) neighborhood $N_z$ of any point $z \in M$, for any pair of points $x, y \in N_z$, and any natural $N$ such that $N > D/2 + 2|\alpha'| + 2|\beta'|$, $\alpha', \beta'$ being arbitrarily fixed multi-indices, one has

\[ D_x^{\alpha'} D_y^{\beta'} K(t; x, y) = D_x^{\alpha'} D_y^{\beta'} \left\{ \frac{e^{-\sigma(x,y)/2t}}{(4\pi t)^{D/2}} \sum_{j=0}^{N} a_j(x, y)\, t^j \right\} + \frac{e^{-\eta\sigma(x,y)/2t}}{(4\pi t)^{D/2}}\, t^{N-|\alpha'|-|\beta'|}\, O_{\eta,N}^{(\alpha',\beta')}(t; x, y). \tag{24} \]

Above, $\eta \in\, ]0, 1[$ can be fixed arbitrarily and the corresponding function $O_{\eta,N}^{(\alpha',\beta')}$ is continuous in $(t, x, y) \in [0, +\infty[\, \times N_z \times N_z$ and $(x, y)$-uniformly bounded by $B_{\eta,N}|t|$ in a positive neighborhood of $t = 0$, $B_{\eta,N} > 0$ being a constant.

Proof. See Lemma 3.1 of [Mo98b]. □

By the lemma above we are able to prove our first result.

Theorem 2.1. Under our general hypotheses on $M$, which is supposed to be a Riemannian manifold (also with boundary in general), and $A_0$, the following properties hold true for the heat-kernel coefficients in (19) and (20):

(a) For any fixed point $z \in M$ (away from $\partial M$), there is a sufficiently small open geodesically convex neighborhood $N_z$ of $z$ (which does not intersect $\partial M$) such that, in any local coordinate system defined therein, for any $j \in \mathbb{N}$, any pair of derivative operators $D_x^\alpha$, $D_y^\beta$ and any point $y \in N_z$,

\[ D_x^\alpha D_y^\beta a_j(x, y)|_{x=y} = D_x^\alpha D_y^\beta a_j(y, x)|_{x=y}. \tag{25} \]

(b) For any choice of the multi-indices $\alpha$, $\beta$ and $j, N \in \mathbb{N}$, the functions

\[ \Lambda_j(x, y) := a_j(x, y) - a_j(y, x), \tag{26} \]

computed in any local coordinate system in the set $N_z$ defined in (a), satisfy

\[ \sigma^{-N}(x, y) \left[ D_x^\alpha D_y^\beta \Lambda_j(x, y) \right] \to 0 \tag{27} \]

as $x \to y$.

(c) If $g_{ab}$ and $V$ are (real) analytic functions of some local coordinate frame defined in an open connected set $O$ (away from the boundary) where, for all $x, y \in O$, the coefficient $a_j$ ($j$ fixed in $\mathbb{N}$) is defined, then

\[ a_j(x, y) = a_j(y, x) \tag{28} \]

for any pair $(x, y) \in O \times O$.

Proof. Notice that the thesis is trivially proven for $a_0(x, y) = \Delta_{VVM}^{1/2}(x, y)$, since the right-hand side is symmetric in $x$ and $y$. So we can pass directly to the case $j > 0$, proving (a). Let us first consider the case of a compact Riemannian manifold and $A_0 \geq 0$. Then, since $A_0$ is positive, we can employ standard theorems on the heat kernel; in particular we can use Lemma 2.1 above. Therefore, let us fix a coordinate system where Lemma 2.1 holds true in an open geodesically convex neighborhood of the point $z$. For any pair of multi-indices $\alpha, \beta$,

\[ D_x^\alpha D_y^\beta \sum_{j=0}^{N} \Lambda_j(x, y)\, t^j = (4\pi t)^{D/2}\, D_x^\alpha D_y^\beta \left\{ e^{+\sigma(x,y)/2t} \cdot \left[ \frac{e^{-\sigma(x,y)/2t}}{(4\pi t)^{D/2}} \sum_{j=0}^{N} \Lambda_j(x, y)\, t^j \right] \right\}. \]


Taking into account that $(x, y) \mapsto K(t; x, y) - K(t; y, x) \equiv 0$ in these hypotheses [Ch84] (see also Theorem 1.1 in [Mo98a]), employing Leibniz's rule in evaluating the derivatives above and making use of (24), we get, for the antisymmetric combinations $\Lambda_j(x, y) = a_j(x, y) - a_j(y, x)$ of (26),

\[ D_x^\alpha D_y^\beta \sum_{j=0}^{N} \Lambda_j(x, y)\, t^j = t^{N-2|\alpha|-2|\beta|}\, e^{\delta\sigma(x,y)/2t}\, U_{\delta,N}^{(\alpha,\beta)}(t; x, y), \tag{29} \]

where $\delta = 1 - \eta \in\, ]0, 1[$ ($\eta$ is the same parameter which appears in (24)) and $U_{\delta,N}^{(\alpha,\beta)}(t; x, y)$ is built up using linear combinations of the antisymmetrized remainders which appear in (24), $O_{\eta,N}^{(\alpha',\beta')}(t; x, y) - O_{\eta,N}^{(\alpha',\beta')}(t; y, x)$, with coefficients given by positive powers of $t$ and derivatives of the function $\sigma$. Due to the similar property of the functions $O_{\eta,N}^{(\alpha',\beta')}$, $(t; x, y) \mapsto U_{\delta,N}^{(\alpha,\beta)}$ is $(x, y)$-uniformly bounded by some constant $C_{\delta,N} > 0$ in a right-neighborhood of $t = 0$, provided $N$ has been chosen sufficiently large. Then, taking the limit for $x \to y$ we have

\[ \sum_{j=0}^{N} D_x^\alpha D_y^\beta \Lambda_j(x, y)|_{x=y}\, t^j = t^{N-2|\alpha|-2|\beta|}\, U_{\delta,N}^{(\alpha,\beta)}(t; y, y), \tag{30} \]

and thus, with a trivial redefinition of $U$ obtained by decomposing

\[ \sum_{j=0}^{N} = \sum_{j=0}^{N-2|\alpha|-2|\beta|-1} + \sum_{j=N-2|\alpha|-2|\beta|}^{N}, \]

one gets

\[ \sum_{j=0}^{N-2|\alpha|-2|\beta|-1} D_x^\alpha D_y^\beta \Lambda_j(x, y)|_{x=y}\, t^j = t^{N-2|\alpha|-2|\beta|}\, V_{\delta,N}^{(\alpha,\beta)}(t; y, y), \tag{31} \]

where $V_{\delta,N}^{(\alpha,\beta)}(t; y, y)$ is bounded in a positive neighborhood of $t = 0$. In the limit $t \to 0^+$, this is possible only when all the coefficients of the polynomial on the left-hand side vanish separately: if the lowest-order nonvanishing coefficient multiplied $t^m$, with $m < N - 2|\alpha| - 2|\beta|$, then dividing both sides by $t^m$ and letting $t \to 0^+$ would give a contradiction. This also implies that the covariant derivatives of any order of the functions $\Lambda_j$, evaluated on the diagonal, vanish. Obviously this does not depend on the particular coordinate frame used around $y$. Therefore, changing coordinates and passing from covariant derivatives to ordinary derivatives in a different coordinate frame, we get that, once again, the derivatives of any order of the functions $\Lambda_j$, evaluated on the diagonal, vanish. Thus (a) has been proven under the hypotheses of a compact manifold (without boundary) and $A_0 \geq 0$. Given a general manifold $M$ and any inner point $y$, dropping the requirement $A_0 \geq 0$, we can consider a neighborhood $O$ of $y$ and build up a new manifold $M'$ which contains a neighborhood $O'$ isometric to $O$. $M'$ can be chosen compact (without boundary) provided $M$ is complete. On $M'$, we can define an operator $A'_0$ (depending on a smooth potential $V'$) which coincides with $A_0$ in the neighborhood $O' \equiv O$. In general, even if $A_0$ is positive, $A'_0$ may be non-positive. However, since $V'$ is bounded below by some real $v$, the operator $A'_0 + |v|I$ is positive on $M'$. We can consider the heat-kernel coefficients $b_j(x, y)$ of the expansion (7) for the operator $A'_0 + |v|I$. For these coefficients item (a) of the thesis holds true. An algebraic computation based on the fact that, formally, if $S(t; x, y)$ satisfies the heat equation with respect to $A_0$ then


$S(t; x, y) \exp(-ct)$ satisfies the same equation with respect to $A_0 + cI$ ($c \in \mathbb{R}$), proves that the coefficients $a_j(x, y)$ of (19) and (20) corresponding to $A'_0 \equiv A_0$ are related to those above by the relations

\[ a_j(x, y) = \sum_{k=0}^{j} \frac{(-1)^k |v|^k\, b_{j-k}(x, y)}{k!}, \tag{32} \]

\[ b_j(x, y) = \sum_{k=0}^{j} \frac{|v|^k\, a_{j-k}(x, y)}{k!}. \tag{33} \]
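A short consistency check, offered as a sketch rather than as part of the original proof, shows that (32) and (33) are mutually inverse, so that either relation may be used to transfer properties between the two families of coefficients. Only the binomial theorem is used; the summation index $m = k + l$ is introduced here for convenience.

```latex
% Substitute (33) into the right-hand side of (32) and collect terms with m = k + l.
\begin{align*}
\sum_{k=0}^{j} \frac{(-1)^k |v|^k}{k!}\, b_{j-k}(x,y)
 &= \sum_{k=0}^{j} \sum_{l=0}^{j-k} \frac{(-1)^k |v|^{k+l}}{k!\, l!}\, a_{j-k-l}(x,y) \\
 &= \sum_{m=0}^{j} \frac{|v|^m}{m!}\, a_{j-m}(x,y) \sum_{k=0}^{m} \binom{m}{k} (-1)^k \\
 &= \sum_{m=0}^{j} \frac{|v|^m}{m!}\, a_{j-m}(x,y)\, (1-1)^m = a_j(x,y) ,
\end{align*}
% with the convention (1-1)^0 = 1, so that only the m = 0 term survives.
```

Because the coefficients $(\pm|v|)^k/k!$ are constants, any property that is linear in the coefficients and holds for every $b_j$, such as symmetry under $x \leftrightarrow y$ or the vanishing on the diagonal of all derivatives of the antisymmetric part, carries over to every $a_j$.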

The coefficients on the left-hand side of (32) satisfy (19) and (20) with respect to $A_0 \equiv A'_0$ in $O \equiv O'$ once the coefficients on the right-hand side do so with respect to $A'_0 + |v|I$. Thus, item (a) is trivially proven for the coefficients $a_j(x, y)$ in the general case. Notice that, in the same way, items (b) and (c) also hold true in the general case provided they are valid in the particular case of a compact manifold without boundary. Item (b) is proven by expanding, in the variable $x$, any function $\Lambda_j(x, y)$ of (26) and all of its derivatives, via Taylor's formula, around the point $y$ in a normal Riemannian coordinate system centered at $y \equiv 0$. For instance, considering $\Lambda_j$, one has, for any $N \in \mathbb{N}$,

\[ \Lambda_j(x, y) = \sum_{0 \leq |\alpha| \leq 2N+1} \frac{(x^1)^{\alpha_1} \cdots (x^D)^{\alpha_D}}{\alpha_1! \cdots \alpha_D!} \left. \frac{\partial^{|\alpha|} \Lambda_j(x, 0)}{\partial^{\alpha_1} x^1 \cdots \partial^{\alpha_D} x^D} \right|_{x=0} + |x|^{2N+1}\, O_{2N+1}(x), \tag{34} \]

where $|x|^2/2 = \sigma(x, y)$, and $O_{2N+1}(x)$ is a smooth function which vanishes as $x \to 0 \equiv y$ and thus is bounded around $y \equiv 0$. Using the result of item (a) (changing coordinates in general), one gets thesis (b). The same procedure can be employed for derivatives of $\Lambda_j(x, y)$. Let us consider item (c). In this case, the Taylor expansion above can be carried out, in the considered coordinates, up to $N = \infty$. Thus, taking into account (a), for $x$ belonging to a neighborhood of any fixed point $y \in O$, one has

\[ \Lambda_j(x, y) = 0. \tag{35} \]
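Looking back at item (b), the step from (34) to (27) can be spelled out as follows. This is a sketch under the assumptions already made in the text, namely normal coordinates centered at $y \equiv 0$, so that $\sigma(x, y) = |x|^2/2$, together with the vanishing at $x = y$ of all derivatives of the antisymmetric part guaranteed by item (a). The symbol $\Lambda_j(x, y)$ is used here as shorthand for the difference $a_j(x, y) - a_j(y, x)$ defined in (26).

```latex
% All Taylor coefficients in (34) vanish by item (a); only the remainder survives.
\begin{align*}
\Lambda_j(x,y) &= |x|^{2N+1}\, O_{2N+1}(x) , \\
\sigma^{-N}(x,y)\, \Lambda_j(x,y)
  &= \left( \frac{|x|^2}{2} \right)^{-N} |x|^{2N+1}\, O_{2N+1}(x)
   = 2^{N}\, |x|\, O_{2N+1}(x) \;\longrightarrow\; 0
  \quad (x \to y \equiv 0) .
\end{align*}
```

The same estimate applies to each derivative $D_x^\alpha D_y^\beta \Lambda_j$, since its own Taylor coefficients at $x = y$ are again derivatives of $\Lambda_j$ on the diagonal.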

Since, for $y$ fixed in $O$, $\Lambda_j(x, y)$ is analytic in $x \in O$, which is an open and connected set, and vanishes in an open neighborhood contained in $O$ (dependent on $y$), it has to vanish everywhere on $O$. Therefore $(x, y) \mapsto \Lambda_j(x, y)$ vanishes in $O \times O$. □

The results obtained above concerning the heat-kernel coefficients can be generalized directly to the relevant coefficients of the expansions of the Hadamard local solution of the operator $A_0$ by taking into account (22) and (23) above. Actually, the result contained in item (b) should be sufficient for all applications of (Euclidean) point-splitting procedures known in the literature, despite the complete symmetry of the Hadamard coefficients originally required (see footnote 7). However, we aim to get a more general result.

Footnote 7: I am grateful to R. M. Wald for this remark.

Proposition 2.1. Let $M$ be a real $C^\infty$ manifold with a non-singular metric $g$, satisfying our general hypotheses.


(a) Let $\Omega'$ be any open set in $M$ (such that $\Omega' \cap \partial M = \emptyset$) endowed with a coordinate frame $x^1, \cdots, x^D$. For any connected relatively-compact open set $\Omega$ such that $\bar\Omega \subset \Omega'$, there is a sequence of real metrics $\{g_n\}$ with the same signature as $g$, defined in a neighborhood of $\bar\Omega$, such that each $g_{n\,ab}$ is an analytic function of the given coordinates and the sequence $\{g_n\}$ converges uniformly in $\bar\Omega$ to the metric $g$. Similarly, for any fixed multi-index $\alpha$, the sequence of derivatives with respect to the coordinates $x^1, \cdots, x^D$, $\{D^\alpha g_n\}$, converges uniformly in $\bar\Omega$ to $D^\alpha g$.

(b) For any choice of the set $\Omega'$, the coordinates $x^1, \cdots, x^D$, the set $\Omega$ and the sequence $\{g_n\}$ given above, and for any $z \in \Omega$, there is a natural $N_0$ and a family of open neighborhoods of $z$, $\{N_{zi}\}$, $i \in \mathbb{R}$, such that $\{N_{zi}\}$ is a local base of the topology of $M$, $N_{zi} \subset \bar N_{zi'} \subset \Omega$ for any pair $i, i'$ such that $i' > i$ and, moreover, for any $i \in \mathbb{R}$, both $N_{zi}$ and $\bar N_{zi}$ are common geodesically convex neighborhoods of $z$ for all the metrics $g$ and $g_n$ when $n > N_0$.

(c) For any choice of the set $\Omega'$, the coordinates $x^1, \cdots, x^D$, the set $\Omega$, the sequence $\{g_n\}$, $z \in \Omega$ and the class $\{N_{zi}\}$, $i \in \mathbb{R}$ arbitrary, the functions $(x, y) \mapsto \sigma_n(x, y)$ are well-defined and smooth in any neighborhood of $\bar N_{zi} \times \bar N_{zi}$, and the sequence of these functions as well as the sequences of their derivatives of any order converge uniformly in $\bar N_{zi} \times \bar N_{zi}$ to $\sigma(x, y)$ and the corresponding derivatives.

(d) For any choice of the set $\Omega'$, the coordinates $x^1, \cdots, x^D$, the set $\Omega$, the sequence $\{g_n\}$, $z \in \Omega$ and the class $\{N_{zi}\}$, for any $i \in \mathbb{R}$, if $(\lambda, x, y) \mapsto \gamma_n(\lambda, x, y)$, $\lambda \in [0, 1]$, indicates the unique geodesic segment starting from the point $y \in \bar N_{zi}$ and terminating at the point $x \in \bar N_{zi}$, corresponding to the $n$th metric and contained in $\bar N_{zi}$, then $\{\gamma_n(\lambda, x, y)\}$ and the sequences of their $\lambda, x, y$-derivatives of any order converge uniformly in $[0, 1] \times \bar N_{zi} \times \bar N_{zi}$ to $\gamma(\lambda, x, y)$ and the corresponding derivatives, $\gamma(\lambda, x, y)$ being the geodesic of the initial metric $g$.

Proof. See the Appendix. □

We need another technical lemma to get the final theorem.

Lemma 2.2. Let $\{g_{k,n}\}$ be a class of continuous functions, $k = 1, 2, \cdots, l$ and $n \in \mathbb{N} \cup \{\infty\}$,

\[ g_{k,n} : K_k \to M_k, \tag{36} \]

where $M_k$ and $K_k \subset M_k$ are, respectively, metric spaces and compact sets. Let $\{f_n\}$ be a class of continuous functions, $n \in \mathbb{N} \cup \{\infty\}$,

\[ f_n : \Omega_1 \times \Omega_2 \times \cdots \times \Omega_l \to N, \tag{37} \]

where $N$ is a metric space, the sets $\Omega_k \subset M_k$, $k = 1, 2, \cdots, l$, are open and $g_{k,\infty}(K_k) \subset \Omega_k$. Suppose that, for any fixed $k$ and for $n \to +\infty$, $g_{k,n} \to g_{k,\infty}$ uniformly in $K_k$ and $f_n \to f_\infty$ uniformly in $\Omega_1 \times \cdots \times \Omega_l$. Then there is a natural $N_0$ such that, for $n > N_0$, the left-hand side below is well-defined and, for $n \to +\infty$,

\[ f_n(g_{1,n}(x_1), g_{2,n}(x_2), \cdots, g_{l,n}(x_l)) \to f_\infty(g_{1,\infty}(x_1), g_{2,\infty}(x_2), \cdots, g_{l,\infty}(x_l)) \tag{38} \]

uniformly in $K_1 \times K_2 \times \cdots \times K_l$.

Proof. It is quite straightforward. Take into account that a continuous function $h$ defined on a compact set $H$ of a metric space with values in a metric space is uniformly continuous in $H$ and $h(H)$ is also a compact set. $N_0$ is defined by determining the compact sets $C_1, \cdots, C_l$ such that $g_{k,n}(K_k) \subset C_k \subset \Omega_k$, for $n = N_0 + 1, N_0 + 2, \cdots, \infty$. □
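Since the proof of Lemma 2.2 is only outlined, the key estimate can be sketched as follows. Here $d$ denotes the distance in the metric space $N$, $\mathbf{g}_n(x)$ abbreviates $(g_{1,n}(x_1), \ldots, g_{l,n}(x_l))$, $u$ ranges over the common domain of the functions $f_n$, and $C := C_1 \times \cdots \times C_l$ is the product of the compact sets mentioned in the proof. These abbreviations are introduced here for illustration only.

```latex
% For n > N_0 the point g_n(x) lies in the compact set C, on which f_\infty is uniformly continuous.
\begin{align*}
d\bigl( f_n(\mathbf{g}_n(x)),\, f_\infty(\mathbf{g}_\infty(x)) \bigr)
 &\le d\bigl( f_n(\mathbf{g}_n(x)),\, f_\infty(\mathbf{g}_n(x)) \bigr)
    + d\bigl( f_\infty(\mathbf{g}_n(x)),\, f_\infty(\mathbf{g}_\infty(x)) \bigr) \\
 &\le \sup_{u}\, d\bigl( f_n(u),\, f_\infty(u) \bigr)
    + \sup_{x \in K_1 \times \cdots \times K_l}
      d\bigl( f_\infty(\mathbf{g}_n(x)),\, f_\infty(\mathbf{g}_\infty(x)) \bigr) .
\end{align*}
```

The first term tends to zero by the uniform convergence $f_n \to f_\infty$. The second tends to zero by the uniform continuity of $f_\infty$ on the compact set $C$, combined with the uniform convergence $g_{k,n} \to g_{k,\infty}$ on each $K_k$.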


We are now able to state and prove the most important theorem concerning the symmetry of heat-kernel coefficients in Riemannian manifolds.

Theorem 2.2. Let $M$ be a $C^\infty$ Riemannian manifold (with boundary in general) and $A_0$ an operator, both satisfying our general hypotheses. For any point $z \in M$ (away from $\partial M$) there is a geodesically convex neighborhood $N_z$ of $z$ (which does not intersect $\partial M$) such that, for any pair of points $x, y \in N_z$,

\[ a_j(x, y) = a_j(y, x) \tag{39} \]

for $j = 0, 1, 2, \cdots$, where the heat-kernel coefficients are those given in (19) and (20).

Proof. Fix any $z \in M$ away from the boundary, let $\Omega'$ be an open set endowed with a coordinate frame $x^1, \cdots, x^D$ ($\Omega' \cap \partial M = \emptyset$) and let $\Omega$ be a connected relatively-compact open neighborhood of $z$ such that $\bar\Omega \subset \Omega'$. We can use the thesis of Proposition 2.1 with the same notations employed there. In particular, fix a common open geodesically convex neighborhood $N_z := N_{z i_0}$ of $z$ and its closure, given in Proposition 2.1, and a sequence of analytic metrics $\{g_n\}$ defined in a neighborhood of the compact set $\bar\Omega$, also given in Proposition 2.1. Since $N_z$ and $\bar N_z$ are geodesically convex and, by Proposition 2.1, there is another similar open neighborhood $N_{zi}$ such that $\bar N_z \subset N_{zi}$, both coefficients $a_j(x, y)$ and $a_j(y, x)$ are well-defined and smooth in $\bar N_z \times \bar N_z$. For the moment, let us suppose also that $V$ is an analytic function of the considered coordinates. Let us fix $x, y \in N_z$, and consider the functionals of the metrics defined in $\bar N_z$,

\[ a_{jxy}[g_n] := (-1)^j\, \Delta^{-1/2}(x, y|g_n)\, a_j(x, y|g_n), \tag{40} \]

\[ a_{jyx}[g_n] := (-1)^j\, \Delta^{-1/2}(y, x|g_n)\, a_j(y, x|g_n), \tag{41} \]

$n = 0, 1, \cdots, \infty$, with $g_\infty := g$ and $\Delta := \Delta_{VVM}$. If $g_n$ is fixed, the coefficients above are smooth functions on $\bar N_z \times \bar N_z$ which are also analytic in $x$ and $y$. Obviously, the symmetry of these functionals in $x, y$ would imply that of the heat-kernel coefficients, since the VVM determinant is symmetric. Since $\bar N_z$ is totally normal, it is possible to make explicit each $a_{jxy}$ and $a_{jyx}$ in terms of a sequence of integrals computed along the unique geodesic between $x$ and $y$, which belongs completely to a normal neighborhood centered on $x$ (as well as $y$) including the whole set $\bar N_z$. Moreover, since $\bar N_z$ is geodesically convex with respect to all the metrics, we can do it for all the metrics $g_n$, the corresponding geodesics depending on the particular metric one is considering. Let us indicate the considered geodesic starting from $y$ and reaching $y' \in \bar N_z$, computed with respect to the metric $g_n$, by $\lambda \mapsto \gamma(\lambda, y', y|g_n)$ (with $\lambda \in [0, 1]$). Employing (19) and (20), one finds, with $A_{0x}[g] = -\nabla_g^a \nabla_{g\,a} + V$,

\[ \begin{aligned} a_{0xy}[g] &\equiv 1, \\ a_{1xy}[g] &= \int_0^1 d\lambda\, [\Delta^{-1/2} A_0]_{\gamma(\lambda,x,y)}[g]\, \Delta^{1/2}(\gamma(\lambda, x, y|g), y|g), \\ a_{2xy}[g] &= \int_0^1 d\lambda \int_0^1 d\lambda'\, \lambda'\, [\Delta^{-1/2} A_0]_{\gamma(\lambda',x,y)}[g]\, \Delta^{1/2}(\gamma(\lambda', x, y|g), y|g) \\ &\quad \times [\Delta^{-1/2} A_0]_{\gamma(\lambda,\gamma(\lambda',x,y),y)}[g]\, \Delta^{1/2}(\gamma(\lambda, \gamma(\lambda', x, y|g), y|g), y|g), \\ &\;\;\vdots \\ a_{jxy}[g] &= \int_0^1 d\lambda \int_0^1 d\lambda_1\, \lambda_1^{1} \cdots \int_0^1 d\lambda_{j-1}\, \lambda_{j-1}^{j-1}\, A_{jxy}(\lambda, \lambda_1, \cdots, \lambda_{j-1}|g), \end{aligned} \tag{42} \]


where (omitting the explicit dependence on the chosen metric for the sake of simplicity)

\[ \begin{aligned} A_{jxy}(\lambda, \lambda_1, \cdots, \lambda_{j-1}) := {} & [\Delta^{-1/2} A_0]_{\gamma(\lambda_{j-1},x,y)}\, \Delta^{1/2}(\gamma(\lambda_{j-1}, x, y), y) \\ & \times [\Delta^{-1/2} A_0]_{\gamma(\lambda_{j-2},\gamma(\lambda_{j-1},x,y),y)}\, \Delta^{1/2}(\gamma(\lambda_{j-2}, \gamma(\lambda_{j-1}, x, y), y), y) \\ & \times \cdots\, [\Delta^{-1/2} A_0]_{\gamma(\lambda,\gamma(\lambda_1,\cdots\gamma(\lambda_{j-1},x,y),\cdots y),y)} \cdots \\ & \quad \Delta^{1/2}(\gamma(\lambda, \gamma(\lambda_1, \cdots \gamma(\lambda_{j-1}, x, y), \cdots y), y), y). \end{aligned} \tag{43} \]

Notice that, fixing any metric, $\Delta(x, y)$ and its derivatives are smooth functions of the derivatives of the function $\sigma(x, y)$ in the set $\bar N_z \times \bar N_z$; therefore, by item (c) of Proposition 2.1 and Lemma 2.2, on the compact set $\bar N_z \times \bar N_z$ one gets the uniform convergence, with all of the derivatives, of the sequence of functions $\Delta^{\pm 1/2}(x, y|g_n)$ to the function $\Delta^{\pm 1/2}(x, y|g)$. Moreover, from item (d) of Proposition 2.1, taking into account that all the functions appearing in the integration above are computed on the geodesics connecting $y$ with $x$ which, independently of $n$, belong completely to the compact set $\bar N_z$, and using Lemma 2.2 recursively, one gets: (1) for $j = 0, 1, 2, \cdots$,

\[ A_{jxy}(\lambda, \lambda_1, \cdots, \lambda_{j-1}|g_n) \to A_{jxy}(\lambda, \lambda_1, \cdots, \lambda_{j-1}|g) \tag{44} \]

as $n \to +\infty$. This holds uniformly in $\lambda, \lambda_1, \cdots, \lambda_{j-1} \in [0, 1]$, and therefore, (2) for any $j \in \mathbb{N}$, there is a constant $C_j$ such that

\[ |A_{jxy}(\lambda, \lambda_1, \cdots, \lambda_{j-1}|g_n)| < C_j \quad \text{for } n = 1, 2, \cdots, \tag{45} \]

uniformly in $(\lambda, \lambda_1, \cdots, \lambda_{j-1}) \in [0, 1]^j$. Lebesgue's dominated convergence theorem assures that, for $n \to +\infty$,

\[ a_{jxy}[g_n] \to a_{jxy}[g]. \tag{46} \]
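Because the convergence in (44) is uniform, the limit (46) can also be checked by a direct estimate. The following sketch assumes, as in (42), that the $k$-th inner integral carries the weight $\lambda_k^{k}$ and that $j \geq 1$; the shorthand $\|\cdot\|_\infty$ for the supremum over $[0, 1]^j$ is introduced here for convenience.

```latex
% Uses \int_0^1 \lambda_k^{k}\, d\lambda_k = 1/(k+1) for k = 1, ..., j-1.
\begin{align*}
\bigl| a_{jxy}[g_n] - a_{jxy}[g] \bigr|
 &\le \int_0^1 d\lambda \int_0^1 d\lambda_1\, \lambda_1 \cdots \int_0^1 d\lambda_{j-1}\, \lambda_{j-1}^{\,j-1}\,
      \bigl| A_{jxy}(\cdot\,|g_n) - A_{jxy}(\cdot\,|g) \bigr| \\
 &\le \bigl\| A_{jxy}(\cdot\,|g_n) - A_{jxy}(\cdot\,|g) \bigr\|_\infty \prod_{k=1}^{j-1} \frac{1}{k+1}
  = \frac{1}{j!}\, \bigl\| A_{jxy}(\cdot\,|g_n) - A_{jxy}(\cdot\,|g) \bigr\|_\infty
  \;\longrightarrow\; 0 .
\end{align*}
```

The estimate is only as good as the uniform convergence (44) it relies on; the dominated convergence argument of the text reaches the same conclusion from the bound (45).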

The same result can be obtained considering the coefficients $a_{jyx}[g_n]$ and $a_{jyx}[g]$. This allows one to conclude the proof by noticing that

\[ a_{jxy}[g_n] - a_{jyx}[g_n] \to a_{jxy}[g] - a_{jyx}[g] \tag{47} \]

for $n \to +\infty$. The left-hand side above vanishes because the metrics $g_n$ are analytic in the open connected set $N_z \times N_z$ and thus item (c) of Theorem 2.1 holds true. If $V = V(x)$ is not analytic, one can find a sequence of positive analytic functions of the considered coordinates in $\bar N_z$, $\{V_n\}$, such that this sequence converges uniformly to $V$ with all of its derivatives. This sequence can be obtained by considering the convolutions of $V$ with the flat-space heat kernel, similarly to what we have done in building up the sequence of metrics $g_n$ for proving Proposition 2.1 (see the Appendix). Defining $A_{0x}[g_n] := -\nabla_{g_n}^a \nabla_{g_n a} + V_n$, and using the same arguments as above, one can prove (44) and (45) once again and therefore gets the thesis. □

We have a straightforward corollary based on the formulae (22) and (23).

Off-Diagonal Heat-Kernel and Hadamard’s Expansion Coefficients


Corollary of Theorem 2.2. Let $M$ be a $D$-dimensional $C^\infty$ Riemannian manifold and $A'$ an operator both satisfying our general hypotheses. For any point $z \in M$ (away from $\partial M$ if $\partial M$ is not empty) there is a geodesically convex neighborhood $N_z$ of $z$ (which does not intersect $\partial M$ if $\partial M$ exists) such that, for any pair of points $x, y \in N_z$, up to the orders indicated in the summations below, the coefficients $u_j$, $v_j$ of the Hadamard parametrix

\[ H(x, y) = \sum_{j=0}^{D/2-2} \left( \frac{2}{\sigma(x, y)} \right)^{D/2-j-1} u_j(x, y) + \sum_{j=0}^{N} \sigma^j(x, y)\, v_j(x, y) \ln(\sigma(x, y)/2) \tag{48} \]

for $D$ even (the first summation appears for $D \geq 4$ only), and

\[ H(x, y) = \sum_{j=0}^{(D-5)/2} \left( \frac{2}{\sigma(x, y)} \right)^{D/2-j-1} u_j(x, y) + v_0(x, y) \sqrt{\frac{2\pi}{\sigma(x, y)}} + v_1(x, y) \sqrt{2\pi \sigma(x, y)} \tag{49} \]

for $D$ odd (the summation appears for $D \geq 5$ only), satisfy

\[ u_j(x, y) = u_j(y, x), \tag{50} \]
\[ v_j(x, y) = v_j(y, x). \tag{51} \]

2.2. Conclusions and outlooks. We have proven the symmetry of the heat-kernel / Hadamard coefficients in the general Riemannian case. The Lorentzian case remains an open issue. However, we expect that one can pass to the Lorentzian case from the Riemannian one by some analytic continuation, if the manifold and the coefficient $V$ are analytic. This should assure the symmetry of the considered coefficients in the analytic Lorentzian case. From that, the symmetry in the $C^\infty$ Lorentzian case is straightforward, since the proof of Theorem 2.2 needs the validity of the symmetry in the analytic case only, not depending on the signature of the metric. Indeed, Proposition 2.1, which is the kernel of the proof above, holds true for any signature of the metric of the manifold (and some parts of it can be generalized for non-metric affine connections).

Acknowledgement. I am particularly indebted to A. Cassa for his constant assistance in solving mathematical problems related to this work and for his numerous and always illuminating technical suggestions. It is a pleasure to thank I. G. Avramidi, E. Ballico, S. Delladio, F. Serra Cassano, A. Tognoli and R. M. Wald for helpful discussions and G. Esposito, B. S. Kay and D. Klemm for valuable suggestions. This work has been financially supported by a Postdoctoral Research Fellowship of the Department of Mathematics of the University of Trento.


V. Moretti

Appendix: Proof of Proposition 2.1

Several simple lemmata are necessary. We do not report the proofs of those lemmata for the sake of brevity. These are based essentially on the Banach fixed-point theorem, the theorem of the inverse function and further simple considerations of elementary real analysis$^8$.

Lemma A.1. Let $f$ be a function of $C^k([t_0 - \Delta, t_0 + \Delta] \times \bar B_R(y_0); \mathbb{R}^m)$, where $k \in \{\infty, \omega\}$, $t_0$, $\Delta > 0$ and $R > 0$ are real numbers and $B_R(y_0)$ indicates the open ball of $\mathbb{R}^m$ centered in $y_0$ with radius $R$. Let us consider the differential equation

\[ \frac{dY}{dt} = f(t, Y), \qquad Y \in C^1([t_0 - \delta, t_0 + \delta]; \mathbb{R}^m) \ \text{ for some } \delta > 0,\ \delta \leq \Delta, \tag{52} \]

with initial condition

\[ Y(t_0) = \bar y_0, \qquad \bar y_0 \in \bar B_r(y_0),\ r \text{ fixed arbitrarily such that } 0 < r < R. \tag{53} \]

(a) A solution of Eq. (52) with initial condition (53) exists and is unique in any set $[t_0 - \delta, t_0 + \delta]$, provided that

\[ 0 < \delta < \mathrm{Min}\left\{ \Delta, \Delta'_r, \Delta'' \right\}, \tag{54} \]

where $\Delta'_r = [(R - r)/2] / \mathrm{Sup}\{ \|f(t, y)\| : t \in [t_0 - \Delta, t_0 + \Delta],\ y \in \bar B_R(y_0) \}$ and $\Delta'' = 1 / \mathrm{Sup}\{ \sqrt{m\, \mathrm{Tr}\, \nabla f(t, y)^T \nabla f(t, y)} : t \in [t_0 - \Delta, t_0 + \Delta],\ y \in \bar B_R(y_0) \}$.
(b) It satisfies $Y(t, \bar y_0) \in \bar B_R(y_0)$ for any $t \in [t_0 - \delta, t_0 + \delta]$ and $\bar y_0 \in \bar B_r(y_0)$.
(c) Moreover, the solution $(t, \bar y_0) \mapsto Y(t, \bar y_0)$ belongs to $C^\infty([t_0 - \delta, t_0 + \delta] \times \bar B_r(y_0); \mathbb{R}^m)$ and, in the case $k = \omega$, it is also analytic in the variable $t \in [t_0 - \delta, t_0 + \delta]$ and in the variable $\bar y_0 \in \bar B_r(y_0)$ (separately in general).

Lemma A.2. Let $\{f_n\}$ be a sequence of functions of $C^\infty([t_0 - \Delta, t_0 + \Delta] \times \bar B_R(y_0); \mathbb{R}^m)$, where the used notations are the same as those used in Lemma A.1. Let us suppose also that, for any $p = 0, 1, 2, \cdots$ and for any multi-index $\alpha$,

\[ D^\alpha_y \frac{\partial^p f_n}{\partial t^p} \to D^\alpha_y \frac{\partial^p f_\infty}{\partial t^p} \quad \text{uniformly on } [t_0 - \Delta, t_0 + \Delta] \times \bar B_R(y_0), \tag{55} \]

$f_\infty$ being another function of $C^\infty([t_0 - \Delta, t_0 + \Delta] \times \bar B_R(y_0); \mathbb{R}^m)$. Let us indicate the solutions of Eq. (52), with $f_n$ in place of $f$ and initial condition (53), by $(t, \bar y_0) \mapsto Y_n(t, \bar y_0)$ $(n = 0, 1, 2, \cdots, \infty)$. Then, for any $\delta > 0$ satisfying (54) above with $f_\infty$ in place of $f$, and any $r > 0$ with $r < R$:
(a) There is a natural $N$ such that for $n > N$, $(t, \bar y_0) \mapsto Y_n(t, \bar y_0)$ is defined in $[t_0 - \delta, t_0 + \delta] \times \bar B_r(y_0)$;
(b) for any $p = 0, 1, 2, \cdots$,

\[ \frac{\partial^p Y_n}{\partial t^p} \to \frac{\partial^p Y_\infty}{\partial t^p} \quad \text{uniformly in } [t_0 - \delta, t_0 + \delta] \times \bar B_r(y_0). \tag{56} \]
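The continuous dependence asserted by Lemma A.2 can be illustrated numerically. The following Python sketch is only an illustration under assumptions that are not part of the paper: the scalar right-hand sides `f_inf` and `f_n = f_inf + sin(y)/n`, the half-width `delta = 0.5`, the sample initial values and the classical Runge-Kutta integrator are all hypothetical choices made here. Since $f_n - f_\infty$ and all of its derivatives are bounded by $1/n$, hypothesis (55) holds for this family, and the observed sup-distance between $Y_n$ and $Y_\infty$ should shrink as $n$ grows.

```python
import math

def rk4_solve(f, t0, y0, t_end, steps=200):
    """Integrate dY/dt = f(t, Y) from t0 to t_end with classical RK4; return the sampled values."""
    h = (t_end - t0) / steps  # h is negative when integrating backwards in time
    t, y = t0, y0
    values = [y]
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        values.append(y)
    return values

def f_inf(t, y):
    # Hypothetical limit right-hand side (an assumption of this sketch).
    return -y + math.cos(t)

def make_f_n(n):
    # f_n - f_inf = sin(y)/n tends to zero uniformly together with all derivatives.
    return lambda t, y: f_inf(t, y) + math.sin(y) / n

def sup_distance(n, delta=0.5, initial_values=(-0.5, 0.0, 0.5)):
    """Largest |Y_n - Y_inf| over the sampled times in [-delta, delta] and initial values."""
    worst = 0.0
    for y0 in initial_values:
        for t_end in (-delta, delta):
            reference = rk4_solve(f_inf, 0.0, y0, t_end)
            perturbed = rk4_solve(make_f_n(n), 0.0, y0, t_end)
            worst = max(worst, max(abs(a - b) for a, b in zip(reference, perturbed)))
    return worst

distances = [sup_distance(n) for n in (1, 10, 100, 1000)]
print(distances)
```

The printed distances should decrease with $n$, in line with (56) for $p = 0$; the sketch does not examine the time derivatives or the dependence on the initial datum treated in the lemmata.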

8 A complete proof of the lemmata contained in this appendix can be found within the first version of the preprint gr-qc/9902034 (http://xxx.lanl.gov/abs/gr-qc/9902034v1).


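Lemma A.3. With the same hypotheses of Lemma A.2 one gets also that, for any $p = 0, 1, 2, \cdots$ and for any multi-index $\alpha$,

\[ D^\alpha_{\bar y_0} \frac{\partial^p Y_n}{\partial t^p} \to D^\alpha_{\bar y_0} \frac{\partial^p Y_\infty}{\partial t^p} \quad \text{uniformly in } [t_0 - \delta, t_0 + \delta] \times \bar B_r(y_0). \tag{57} \]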

Lemma A.4. Let $\{f_n\}$ be a sequence of $\mathbb{R}^m$-valued functions defined in an open set $N \subset \mathbb{R}^m$ such that:
(i) $f_n \in C^k(N; \mathbb{R}^m)$ for $n = 0, 1, \cdots$, where $k$ is fixed in $\{\infty, \omega\}$.
(ii) $f_n \to f_\infty \in C^k(N; \mathbb{R}^m)$ uniformly in the set $N$ with all of their derivatives of any order.
(iii) There is a point $x_0 \in N$ such that $\det(\nabla f_\infty|_{x=x_0}) \neq 0$.
Then, there exist two open neighborhoods of $x_0$ and $f_\infty(x_0)$ respectively, $U_{x_0}$ and $V_{f_\infty(x_0)}$, a real $K > 0$ and a natural $N_0$ such that, for $n > N_0$ including $n = \infty$:
(a) All functions $f_n|_{U_{x_0}} : U_{x_0} \to f_n(U_{x_0})$ define diffeomorphisms, in particular, any $f_n(U_{x_0})$ is an open set. Moreover, $|\det(\nabla f_n|_{U_{x_0}})| > K$.
(b)

\[ V_{f_\infty(x_0)} \subset \bigcap_{n > N_0} f_n(U_{x_0}), \tag{58} \]

where the intersection includes $n = \infty$.
(c) In the set $V_{f_\infty(x_0)}$ and for $n \to \infty$, $f_n^{-1} \to f_\infty^{-1}$ uniformly with all of their derivatives of any order. Moreover, $0 < |\det \nabla(f_n^{-1}|_{V_{f_\infty(x_0)}})| < \frac{1}{K}$.

Lemma A.5. Let $K$ be a connected compact set of $\mathbb{R}^m$ and $G, G_n : K \to M(D, \mathbb{R})$ continuous functions such that $G(x) = G(x)^T$ and $G_n(x) = G_n(x)^T$ for any $x \in K$, $n = 0, 1, 2, \cdots$; $M(D, \mathbb{R})$ denoting the algebra of real $D \times D$ matrices. Let us suppose that $G(x)$ is not singular for any $x \in K$ and, component by component, $G_n(x) \to G(x)$ uniformly in $x$, for $n \to \infty$. Then, there is a natural $N_0$ such that, for $n > N_0$ and not depending on $x \in K$, all the matrices $G_n(x)$ are non-singular and $\mathrm{sign}(G_n(x)) = \mathrm{sign}(G(x))$, $\mathrm{sign}(A)$ denoting the signature of the real symmetric matrix $A$.

Proof of Proposition 2.1. Let us proceed with the proof of item (a). Let $\Omega'$ be an open neighborhood of the point $z$ in the manifold $M$, such that $\Omega' \cap \partial M = \emptyset$. Suppose also that $\vec x \equiv (x^1, \cdots, x^D)$ are coordinates defined in $\Omega'$. Then, let $\Omega$ be a connected relatively-compact open neighborhood of $z$ such that $\bar\Omega \subset \Omega'$. Let $g$ be the metric on $M$ which can be either Lorentzian or Riemannian. Finally, let us define the pure Euclidean-Laplacian heat kernel in $\mathbb{R}^D$,

\[ E(t, \vec x, \vec y) := \frac{e^{-\|\vec x - \vec y\|^2 / 4t}}{(4\pi t)^{D/2}}, \tag{59} \]

where $\vec x, \vec y \in \mathbb{R}^D$ and $t \in\, ]0, +\infty[$. From now on, we shall identify the various subsets of $\Omega'$ with the corresponding subsets of $\mathbb{R}^D$ through the given coordinate system. Since the topology on $\Omega'$ is that of $\mathbb{R}^D$, one can find another connected relatively-compact open set $\Omega''$ such that $\bar\Omega \subset \Omega''$ and $\bar\Omega'' \subset \Omega'$. Let us consider the class of covariant second-order tensorial fields defined in the given coordinate system on $\Omega'$,

\[ g_{nab}(\vec x) := \int_{\mathbb{R}^D} d^D\vec y\; E(1/n, \vec x, \vec y)\, g_{ab}(\vec y)\, \eta(\vec y), \tag{60} \]


where $d^D\vec y$ is the natural Lebesgue measure on $\mathbb{R}^D$ and $\vec x \mapsto \eta(\vec x)$ is a nonnegative $C^\infty$ function which takes the value 1 in $\bar\Omega$ and vanishes outside of $\bar\Omega''$. From the well-known properties of the Euclidean heat-kernel [Ch84] we have that, since $\vec y \mapsto g_{ab}(\vec y)\eta(\vec y)$ in (60) is uniformly continuous in its domain (as it is continuous in a compact set), $g_{nab}(\vec x) \to g_{ab}(\vec x)\eta(\vec x)$ uniformly in $\mathbb{R}^D$, as $n \to \infty$. In particular, this holds in $\bar\Omega$, where $\eta(\vec x) = 1$. Therefore, for any point $\vec x \in \bar\Omega$, we have a sequence of symmetric matrices $G_n(\vec x) \equiv [g_{nab}(\vec x)]$ which converges to the nonsingular symmetric matrix $G(\vec x) \equiv [g_{ab}(\vec x)]$ uniformly in $\vec x \in \bar\Omega$. By Lemma A5, for $n > N_0$, the matrices $G_n(\vec x)$ define metrics in the tangent space at $\vec x$ with the same signature of $G(\vec x)$.

The uniform convergence in $\bar\Omega$ given above holds also for the derivatives of any order of the components $g_{nab}(\vec x)$ and $g_{ab}(\vec x)$. Indeed, from (60), one has, passing the derivatives under the sign of integration (see the extended discussion below) and then, using integration by parts,

\[
\begin{aligned}
D^\alpha_{\vec x}\, g_{nab}(\vec x) &= \int_{\mathbb{R}^D} d^D\vec y\; D^\alpha_{\vec x} E(1/n, \vec x, \vec y)\, g_{ab}(\vec y)\, \eta(\vec y) \\
&= \int_{\mathbb{R}^D} d^D\vec y\; (-1)^{|\alpha|} \left( D^\alpha_{\vec y} E(1/n, \vec x, \vec y) \right) g_{ab}(\vec y)\, \eta(\vec y) \\
&= \int_{\mathbb{R}^D} d^D\vec y\; E(1/n, \vec x, \vec y)\, D^\alpha_{\vec y}(g_{ab}(\vec y))\, \eta(\vec y) + G^{(\alpha)}(\vec x).
\end{aligned}
\tag{61}
\]

The function $G^{(\alpha)}$ above is a sum of terms containing derivatives of order $> 0$ of the function $\eta$; omitting overall constants, these terms are of the form ($|\gamma| > 0$)

\[ \int_{\mathbb{R}^D} d^D\vec y\; E(1/n, \vec x, \vec y)\, D^\beta_{\vec y}(g_{ab}(\vec y))\, D^\gamma_{\vec y}\, \eta(\vec y). \tag{62} \]

Taking $\vec x \in \bar\Omega$ and $n \to \infty$ these terms converge to the functions $D^\beta_{\vec x}(g_{ab}(\vec x))\, D^\gamma_{\vec x}\, \eta(\vec x)$ which vanish in $\bar\Omega$ since $\eta$ is constant in $\bar\Omega$ and $|\gamma| > 0$. Therefore, dropping the term $G^{(\alpha)}$, one has from (61), for $\vec x \in \bar\Omega$ and $n \to \infty$,

\[ D^\alpha_{\vec x}\, g_{nab}(\vec x) \to D^\alpha_{\vec x}\, g_{ab}(\vec x), \tag{63} \]

uniformly in $\vec x$.

To conclude the proof of item (a), let us prove that, fixing the indices $a, b$, the functions $\vec x \mapsto g_{nab}(\vec x)$ are analytic functions of the coordinates $\vec x$ on the whole space $\mathbb{R}^D$. From (60) and the definition of the function $\eta$, we have

\[ g_{nab}(\vec x) = \int_{\bar\Omega''} d^D\vec y\; E(1/n, \vec x, \vec y)\, g_{ab}(\vec y)\, \eta(\vec y). \tag{64} \]

Fixing the natural $n$ and $\vec y \in \bar\Omega''$, it is possible to continue the variable $\vec x$ of the heat-kernel to complex values. It is obvious that, fixing $n = 1, 2, \cdots$, the obtained function $(\vec\zeta, \vec y) \mapsto E(1/n, \vec\zeta, \vec y)$ belongs to $C^\infty(\mathbb{C}^D \times \mathbb{R}^D)$ and, for any fixed $\vec y$, it is holomorphic in the variable $\vec\zeta \in \mathbb{C}^D$. Obviously, the derivatives of any order in $\mathrm{Re}\,\vec\zeta$ and $\mathrm{Im}\,\vec\zeta$ of the integrand of (60) are bounded in any compact set of the form $\bar O_{\vec\zeta_0} \times \bar\Omega''$, $O_{\vec\zeta_0}$ being a relatively compact open neighborhood of $\vec\zeta_0 \in \mathbb{C}^D$. Therefore, Lebesgue's dominated convergence theorem implies that the left hand side of (64) continued to complex values of $\vec x = \vec\zeta$ is smooth and one can pass the derivatives in (any component of) $\mathrm{Re}\,\vec\zeta$ and $\mathrm{Im}\,\vec\zeta$ under


the sign of integration. In this way one can check the validity of the Cauchy–Riemann conditions for the $\vec y$-integrated function of $\vec\zeta$ by the validity of the same conditions for the integrand function of $(\vec\zeta, \vec y)$. Therefore $\vec\zeta \mapsto g_{nab}(\vec\zeta)$ is a complex-analytic function. Taking $\mathrm{Im}\,\vec\zeta = 0$, one gets the real-analyticity of the left-hand side of (64).

Let us go on to prove item (b). Our strategy is the following: We shall define the exponential maps of each metric $g_n$ around the point $z \equiv \vec 0$ and thus, by using these exponential maps and by shrinking the found neighborhoods, we shall define normal neighborhoods, totally normal neighborhoods and geodesically convex neighborhoods of the metrics $g_n$. Finally, we shall extract a class of (totally normal) geodesically convex neighborhoods which are common to all metrics.

Let us fix $z \in \Omega$, define a normal coordinate system $\vec y$ with respect to the metric $g_\infty := g$ and centered in $z \equiv \vec 0$. This coordinate system is defined in a normal neighborhood centered in $z$, $N_z = B_\eta(\vec 0)$, $\eta > 0$, the open ball above being defined in the normal coordinates with respect to the standard $\mathbb{R}^D$ metric (obviously, this defines open sets with respect to the topology of the manifold). Employing Lemma 2.2, one sees that, in this system of coordinates one still has $g_{nab}(\vec y) \to g_{\infty ab}(\vec y)$ uniformly in $\vec y$, with all the derivatives. However, in general, the components of the various metrics are not analytic functions of the coordinates, but this is not important at this step. In the considered coordinates, the first order geodesic equations read, for any metric $g_n$ (including $n = \infty$),

\[ \frac{dy_n^a}{dt}(t, \vec y_0, \vec v_0) = v_n^a(t, \vec y_0, \vec v_0), \tag{65} \]
\[ \frac{dv_n^a}{dt}(t, \vec y_0, \vec v_0) = -\Gamma^a_{nbc}(\vec y_n)\, v_n^b(t, \vec y_0, \vec v_0)\, v_n^c(t, \vec y_0, \vec v_0). \tag{66} \]

(The sum over the repeated indices is understood.) Above, $\vec y_0$ and $\vec v_0$ are, respectively, the initial position and the initial velocity evaluated at $t = 0$. The latter, in components, defines a vector in $T_{\vec y}(M)$. From Lemma A1, we know that the solution is unique provided that $(t, (\vec y_0, \vec v_0)) \in [-\delta, \delta] \times \bar B_R((\vec 0, \vec 0))$ for some $\delta > 0$ and $R > 0$. We can find a real $r > 0$ such that $\bar B_r(\vec 0) \times \bar B_r(\vec 0) \subset \bar B_R((\vec 0, \vec 0))$ (in any case, $\bar B_r(\vec 0) \subset \Omega$ must hold). Obviously, the existence and uniqueness of the solution holds true also replacing $\bar B_R((\vec 0, \vec 0))$ by $\bar B_r(\vec 0) \times \bar B_r(\vec 0)$. Let us indicate the geodesics given above in coordinates by $\gamma_n$. From Lemma A2 and Lemma A3, we know that, in the considered common domain, employing coordinates $\vec y$ and for $n$ larger than some $N_0$, $\gamma_n(t, \vec y_0, \vec v_0) \to \gamma_\infty(t, \vec y_0, \vec v_0)$ with all the $t, \vec y_0, \vec v_0$ derivatives, uniformly in all these variables, where $(t, \vec y_0, \vec v_0) \mapsto \gamma_\infty(t, \vec y_0, \vec v_0)$ is the geodesic associated to the target metric $g_\infty = g$. For any fixed real $\alpha > 0$, (65) and (66) entail the identity

\[ \gamma_n(\alpha t, \vec y_0, \vec v_0/\alpha) = \gamma_n(t, \vec y_0, \vec v_0), \tag{67} \]

for $n = 0, 1, \cdots, \infty$. This means that, if $2 > \delta > 0$, maintaining all properties concerning the uniform convergence and passing to the new variable $\lambda = (2/\delta)t$, we can work with geodesics defined in the interval $\lambda \in [-2, 2]$ provided $r$ is replaced by $r' = (\delta/2)r < r$. Since there is no ambiguity we can use the name $r$ instead of $r'$. Therefore, from now on, we suppose $\lambda \in [-2, 2]$. This allows one to define the well-known exponential maps for $(\vec y, \vec v) \in \bar B_r(\vec 0) \times \bar B_r(\vec 0)$,

\[ (\vec y, \vec v) \mapsto \exp_{n\vec y}(\vec v) := \gamma_n(1, \vec y, \vec v). \tag{68} \]
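The rescaling identity (67) and the definition (68) can be checked on a concrete example. The sketch below is a hedged illustration, not the construction used in the proof: it assumes the round unit 2-sphere in coordinates $(\theta, \phi)$, whose nonvanishing connection coefficients are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, and it integrates the first-order system (65)–(66) with a classical Runge-Kutta scheme. The names `geodesic` and `exp_map`, the step count and the initial data are hypothetical choices made for this example.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of (65)-(66) on the unit sphere; state = (theta, phi, v_theta, v_phi)."""
    theta, phi, v_theta, v_phi = state
    # dv^a/dt = -Gamma^a_{bc} v^b v^c with the sphere's connection coefficients.
    a_theta = math.sin(theta) * math.cos(theta) * v_phi ** 2
    a_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * v_theta * v_phi
    return (v_theta, v_phi, a_theta, a_phi)

def geodesic(t, y0, v0, steps=400):
    """Approximate gamma(t, y0, v0) with RK4 steps; returns the position (theta, phi)."""
    h = t / steps
    state = (y0[0], y0[1], v0[0], v0[1])
    for _ in range(steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = geodesic_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state[:2]

def exp_map(y, v):
    """Exponential map in the sense of (68): exp_y(v) = gamma(1, y, v)."""
    return geodesic(1.0, y, v)

y0 = (math.pi / 2.0, 0.0)  # a point on the equator (assumed sample point)
v0 = (0.3, 0.4)            # initial velocity components (v_theta, v_phi)
alpha, t = 2.5, 0.8

lhs = geodesic(alpha * t, y0, (v0[0] / alpha, v0[1] / alpha))  # gamma(alpha t, y0, v0/alpha)
rhs = geodesic(t, y0, v0)                                      # gamma(t, y0, v0)
scaling_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# Along the equator the geodesic is phi = phi_0 + v_phi * t, so exp_y((0, 0.5)) is near (pi/2, 0.5).
equator_point = exp_map(y0, (0.0, 0.5))
print(scaling_error, equator_point)
```

The two sides of (67) should agree up to rounding, because the system (65)–(66) is quadratic in the velocity and is therefore unchanged when $t$ is multiplied by $\alpha$ while the velocity is divided by $\alpha$.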


Once the exponential maps are defined in the common neighborhood above, we can proceed to study the totally normal neighborhoods. To this end, let us consider the functions, defined in our coordinate system and in the induced base in the tangent space,

\[ F_n : \bar B_r(\vec 0) \times \bar B_r(\vec 0) \to M \times M : (\vec y, \vec v) \mapsto (\vec y, \exp_{n\vec y}(\vec v)). \tag{69} \]

Notice that, in the considered domain, for $n \to \infty$, the sequence of $F_n(\vec y, \vec v)$, like the sequences of their derivatives of any order in $(\vec y, \vec v)$, converge to $F_\infty(\vec y, \vec v)$ and corresponding derivatives of it, uniformly in these variables. Passing from the geodesic equations (65), (66) to the corresponding equations for the $\vec y, \vec v$-derivatives of the solutions, one can straightforwardly prove that $\det(\nabla F_\infty|_{(\vec y, \vec v) = (\vec 0, \vec 0)}) = 1$. (Obviously, this property holds true for any $n$ and any point $\vec y$ in the considered domain.) Therefore, using Lemma A4 in the set $B_r(\vec 0) \times B_r(\vec 0)$, one gets that there is a common neighborhood of $(\vec 0, \vec 0)$, $U_{(\vec 0, \vec 0)} \subset B_r(\vec 0) \times B_r(\vec 0)$, where all functions $F_n$, for $n > N_0'$, define diffeomorphisms. Moreover there is an open neighborhood of $F_\infty((\vec 0, \vec 0)) = (\vec 0, \vec 0)$, $V_{(\vec 0, \vec 0)}$, such that

\[ V_{(\vec 0, \vec 0)} \subset \bigcap_{n > N_0'} F_n(U_{(\vec 0, \vec 0)}). \tag{70} \]

Without loss of generality, we can take $U_{(\vec 0, \vec 0)}$ of the form $U_{\vec 0} \times B_\rho(\vec 0)$, $0 < \rho < r$, $U_{\vec 0}$ being an open neighborhood of $\vec 0$. Similarly, we can take $V_{(\vec 0, \vec 0)}$ of the form $V_{\vec 0} \times V_{\vec 0}$. In any open neighborhood of $(\vec 0, \vec 0)$, $V_{(\vec 0, \vec 0)}$ satisfying (70), the inverse of the functions $F_n$ converges uniformly with all the derivatives to the inverse of $F_\infty$, and $F_n^{-1}$ and $F_\infty^{-1}$ are diffeomorphisms. Therefore as $n \to \infty$, $\exp^{-1}_{n\vec y}(\vec y\,') \to \exp^{-1}_{\infty\vec y}(\vec y\,')$ uniformly in $(\vec y, \vec y\,') \in V_{\vec 0} \times V_{\vec 0}$. This enables us to prove that $V_{\vec 0}$ is a totally normal neighborhood of $z \equiv \vec 0$ for all the considered metrics. Take $\vec y \in V_{\vec 0}$. From the definition of $F_n$ and (70), one has

\[ \{\vec y\} \times V_{\vec 0} \subset F_n(\{\vec y\} \times B_\rho(\vec 0)), \]

and therefore, $\vec v \mapsto \exp_{n\vec y}(\vec v)$ is a diffeomorphism in $B_\rho(\vec 0)$ and for any $n > N_0'$ including $n = \infty$,

\[ V_{\vec 0} \subset \exp_{n\vec y}(B_\rho(\vec 0)) \quad \text{for any } \vec y \in V_{\vec 0}. \tag{71} \]

This means that $V_{\vec 0}$ is a totally normal neighborhood of $z \equiv \vec 0$ which is common for all metrics provided $n > N_0'$.

In this last step, we prove that it is possible to choose $V_{\vec 0}$ such that $V_{\vec 0}$ and $\bar V_{\vec 0}$ are common geodesically convex neighborhoods of $z \equiv \vec 0$ for all the metrics whenever $n > N_0 \geq N_0'$. Actually, we shall find a class of neighborhoods $V_{\vec 0}$ defining a local base of the topology. Essentially we shall use the theory developed in part 8 of Chapter III of [KN63]. The set $V_{\vec 0}$ can be chosen as a ball $B_\delta(\vec 0)$. Our thesis can be proven using the following two results:


1) There is an integer $N_0'' > N_0'$ and a real $\rho' > 0$, such that for $n > N_0''$ (including $n = \infty$), the $D \times D$-matrix-valued functions, given in components by

\[ A_{nab}(\vec y) := \left( \delta_{ab} - \sum_c \Gamma^c_{nab}(\vec y)\, y^c \right) \tag{72} \]

are positive definite for $\vec y \in B_{\rho'}(\vec 0)$. Above, the connection coefficients $\Gamma^c_{nab}$ are those corresponding to the $n$th metric represented in coordinates $\vec y$.
2) For a fixed open ball $B_{\rho'}(\vec 0)$ with $\rho' > 0$ sufficiently small, it is possible to choose a natural $N_0 > N_0''$, two reals $\rho > 0$ and $\delta' > 0$ such that, for $n > N_0$ (including $n = \infty$), using coordinates $\vec y$ in the domain as well as in the co-domain,

\[ \exp_{n\vec y}(B_\rho(\vec 0)) \subset B_{\rho'}(\vec 0) \quad \text{for any } \vec y \in B_{\delta'}(\vec 0). \tag{73} \]

Before we prove 1) and 2), we show that 1) and 2) entail that for any real $\delta$ such that $0 < \delta < \delta'$, one has $N_z := V_{\vec 0} := B_\delta(\vec 0)$ and its closure is geodesically convex with respect to all metrics. Item (b) is therefore completely proved by posing, in the considered coordinates, $N_{zi} := B_{\delta_i}(\vec 0)$ with $\delta_i = \delta'(1 + \tanh i)/2$. Notice that the open neighborhoods of $z$ given above define a local base of the topology because one can define a Riemannian metric in a neighborhood of $z$ such that the balls $B_\delta(\vec 0)$ are metric balls. Moreover, it is well known [KN63] that the metric topology induced from any metric on a manifold coincides with the topology of the manifold.

To this end, in our coordinate frame, let us indicate a geodesic of the $n$th metric by $\lambda \mapsto \vec y_n(\lambda)$ and consider the function $\lambda \mapsto T_n(\lambda) = \sum_a (y_n^a(\lambda))^2$. Suppose that such a geodesic is tangent to the boundary of a ball $\partial B_\epsilon(\vec 0)$ ($\epsilon > 0$) for $\lambda = 0$. From the geodesic equations, one gets

\[ \frac{d^2 T_n}{d\lambda^2}\Big|_{\lambda=0} = 2 \sum_{a,b} \left[ \delta_{ab} - \sum_c \Gamma^c_{nab}(\vec y_n(0))\, y_n^c(0) \right] \frac{dy_n^a}{d\lambda}\Big|_{\lambda=0} \frac{dy_n^b}{d\lambda}\Big|_{\lambda=0}. \]

Therefore 1) assures that, if $\epsilon < \rho'$, there is a neighborhood of the tangent point where the geodesics lie outside $B_\epsilon(\vec 0)$ for $n > N_0''$ (including $n = \infty$).

Now we use 2) to conclude the proof. Let us choose $\rho$ in (71) and $\delta' > 0$, such that (73) is satisfied, $\rho'$ being that considered in 1). We want to show that $V_{\vec 0} = B_\delta(\vec 0)$ and $\bar V_{\vec 0}$ are geodesically convex for any metrics $g_n$, $n > N_0$ (including $n = \infty$) if $0 < \delta < \delta'$. Let $\vec y_1$ and $\vec y_2$ be any couple of points of $B_\delta(\vec 0)$ (or $\bar B_\delta(\vec 0)$). Consider the $n$-geodesic $\lambda \mapsto \vec y_n(\lambda)$, $\lambda \in [0, 1]$, joining these points. We want to show that it lies in $B_\delta(\vec 0)$ ($\bar B_\delta(\vec 0)$ respectively). Suppose that this is not true. Then, there is at least one point which does not belong to $B_\delta(\vec 0)$ ($\bar B_\delta(\vec 0)$ respectively). In our hypotheses, the maximum of the function $T_n$ is attained for a value of the parameter $\lambda_{ne} \in\, ]0, 1[$ since the extreme points of the geodesics belong to $B_\delta(\vec 0)$ ($\bar B_\delta(\vec 0)$ respectively) and thus $T_n(0), T_n(1) < T_n(\lambda_{ne})$. Therefore, posing $\vec y_{ne} := \vec y_n(\lambda_{ne})$ and $\rho_{ne} := T_n(\lambda_{ne})$, since $\lambda_{ne}$ is an internal point of $[0, 1]$, $dT_n/d\lambda|_{\lambda=\lambda_{ne}} = 0$ must hold and thus the geodesic is tangent to $\partial B_{\rho_{ne}}(\vec 0)$ at $\vec y_{ne}$, where $T_n$ reaches its maximum. Notice that, because of 2), all the geodesics lie in $B_{\rho'}(\vec 0)$. Therefore, due to 1), there is a neighborhood of $\vec y_{ne} = \vec y_n(\lambda_{ne})$ where the geodesics lie outside $B_{\rho_{ne}}(\vec 0)$. This is not possible. This means that, not depending on $n > N_0$, $B_\delta(\vec 0)$ and $\bar B_\delta(\vec 0)$ are geodesically convex. (Actually, the maximum of $T_n$ is


attained at the extreme points where the geodesic is not tangent to the corresponding sphere, and thus there is no absurdum.)

To conclude the proof of item (b), let us prove 1) and 2) above. The proof of 1) is quite simple. As we know, one has that, for a sufficiently small ball centered in $\vec 0$, the metrics $g_n$, represented in coordinates $\vec y$, converge uniformly to the metric $g_\infty$ with all the derivatives. Therefore, in a sufficiently small ball centered in $\vec 0$, $\Gamma^c_{nab}(\vec y)$ must converge uniformly in $\vec y$ to $\Gamma^c_{\infty ab}(\vec y)$ for $n \to \infty$, and thus, from (72), $\|A_n - A_\infty\|_\infty \to 0$. Then notice that $\Gamma^c_{\infty ab}(\vec 0) = 0$, because the coordinates $\vec y$ are normal coordinates of the metric $g_\infty$ centered on $\vec 0$. Therefore, there is a sufficiently small ball centered in $\vec 0$, where $A_\infty(\vec y)$ is positive definite uniformly in $\vec y$, i.e., there exists an $a > 0$ such that $(u, A_\infty(\vec y)u) > a$ uniformly in $\vec y$ and $u$ with $\|u\| = 1$. In fact, since $\|u\| = 1$,

\[ |(u, (A_\infty(\vec y) - A_\infty(\vec 0))u)| \leq \|A_\infty(\vec y) - A_\infty(\vec 0)\|, \tag{74} \]

and $\|A_\infty(\vec y) - A_\infty(\vec 0)\| \to 0$ as $\vec y \to \vec 0$. Since the bound above is uniform in $u$, one has that, for any $\epsilon > 0$, there is a neighborhood of $\vec y = \vec 0$ where,

\[ (u, A_\infty(\vec y)u) > (u, A_\infty(\vec 0)u) - \epsilon = 1 - \epsilon, \tag{75} \]

uniformly in $u$. Taking $\epsilon > 0$ such that $1 - \epsilon = a > 0$, $a$ is the requested positive lower bound of $(u, A_\infty(\vec y)u)$. Now, we can repeat the same procedure considering the norm $\|\ \|_\infty$ and different values of $n$ in the found neighborhood. One has

\[ |(u, (A_n(\vec y) - A_\infty(\vec y))u)| \leq \|A_\infty - A_n\|_\infty. \tag{76} \]

Since $\|A_n - A_\infty\|_\infty \to 0$ as $n \to \infty$, we get that, for any $\epsilon > 0$, there is a $N_0''$ such that, for $n > N_0''$,

\[ (u, A_n(\vec y)u) > (u, A_\infty(\vec y)u) - \epsilon > a - \epsilon, \tag{77} \]

uniformly in $u$ and $\vec y$. Taking $0 < \epsilon < a$, one has proven the thesis.

Let us prove 2). The case $n = \infty$ is trivial since the function $(\vec y, \vec v) \mapsto \exp_{\infty\vec y}(\vec v)$ is continuous and $\exp_{\infty\vec 0}(\vec 0) = \vec 0$. We also know that, for sufficiently small $\rho, \delta' > 0$, in $B_{\delta'}(\vec 0) \times B_\rho(\vec 0)$, the sequence of functions $(\vec y, \vec v) \mapsto \exp_{n\vec y}(\vec v)$ converges to the function $(\vec y, \vec v) \mapsto \exp_{\infty\vec y}(\vec v)$ uniformly in $(\vec y, \vec v)$ as $n \to \infty$. In $B_{\delta'}(\vec 0) \times B_\rho(\vec 0)$, one has

\[ \|\exp_{n\vec y}(\vec v) - \exp_{\infty\vec 0}(\vec 0)\| \leq \|\exp_{n\vec y}(\vec v) - \exp_{\infty\vec y}(\vec v)\| + \|\exp_{\infty\vec y}(\vec v) - \exp_{\infty\vec 0}(\vec 0)\|. \]

Moreover, fixing $\rho' > 0$, one can take $\epsilon > 0$ such that $0 < \rho' - \epsilon < \rho'/2$, and find a pair $\rho, \delta' > 0$ such that for $(\vec y, \vec v) \in B_{\delta'}(\vec 0) \times B_\rho(\vec 0)$,

\[ \|\exp_{\infty\vec 0}(\vec 0) - \exp_{\infty\vec y}(\vec v)\| < \rho' - \epsilon, \]

and a natural $N_0$ such that, on the same ball and for $n > N_0$,

\[ \|\exp_{n\vec y}(\vec v) - \exp_{\infty\vec y}(\vec v)\| < \rho' - \epsilon. \]

Therefore, in $B_{\delta'}(\vec 0) \times B_\rho(\vec 0)$, one has

\[ \|\exp_{n\vec y}(\vec v) - \exp_{\infty\vec 0}(\vec 0)\| \leq (\rho' - \epsilon) + (\rho' - \epsilon) < \rho'. \]

This is the thesis.


Items (c) and (d) are trivially proven by noticing that, as a consequence of the analogous property of the diffeomorphisms $F_n$ defined above, in the normal coordinates $\vec y$ and thus in any other coordinate system around $z \equiv \vec 0$ which covers any set $N_{zi}$, the diffeomorphisms $(\vec y, \vec y\,') \mapsto \exp^{-1}_{n\vec y}(\vec y\,')$ converge to the function $(\vec y, \vec y\,') \mapsto \exp^{-1}_{\infty\vec y}(\vec y\,')$ uniformly with all the derivatives. Similarly, the geodesics $(\lambda, \vec y, \vec v) \mapsto \gamma_n(\lambda, \vec y, \vec v)$ converge uniformly in all arguments jointly to the geodesic $(\lambda, \vec y, \vec v) \mapsto \gamma_\infty(\lambda, \vec y, \vec v)$ with all the derivatives for $(\vec y, \vec v) \in B_r(\vec 0) \times B_r(\vec 0)$. Employing our procedure to define the neighborhoods $N_{zi}$ given above, it is possible to shrink them, maintaining all the relevant properties, in such a way that $\bar N_{zi} \subset B_r(\vec 0)$, $i \in \mathbb{R}$. By consequence, as $n \to \infty$, the sequence of functions

\[ (\lambda, \vec y, \vec y\,') \mapsto \gamma_n(\lambda, y, y') = \vec y_n(\lambda, y, \exp^{-1}_{n\vec y}(\vec y\,')) \tag{78} \]

defined on any set $[0, 1] \times \bar N_{zi} \times \bar N_{zi}$, and the sequence of functions

\[ \sigma_n(y, y') = g_n(\vec y)\left( \exp^{-1}_{n\vec y}(\vec y\,'), \exp^{-1}_{n\vec y}(\vec y\,') \right), \tag{79} \]

defined on any set $\bar N_{zi} \times \bar N_{zi}$, converge uniformly in all the variables jointly, to the corresponding functions computed with respect to the metric $g_\infty = g$. Making a recurrent use of Lemma 2.2, this result can be proven also concerning the derivatives of any order in all variables and in any coordinate system. $\Box$

References

[Av98] Avramidi, I.G.: Covariant techniques for computation of the heat kernel. hep-th/9704166, Rev. Math. Phys., in press
[BO86] Brown, M.R. and Ottewill, A.C.: Phys. Rev. D 34, 1776 (1986)
[BEE96] Beem, J.K., Ehrlich, P.E., Easley, K.L.: Global Lorentzian Geometry. New York: Marcel Dekker, Inc., 1996
[Br84] Brown, M.R.: J. Math. Phys. 25, 136 (1984)
[BD82] Birrell, N.D. and Davies, P.C.W.: Quantum Fields in Curved Space. Cambridge: Cambridge University Press, 1982
[Ca90] Camporesi, R.: Phys. Rep. 196, 1 (1990)
[Ch84] Chavel, I.: Eigenvalues in Riemannian Geometry. Orlando, FL: Academic Press, Inc., 1984
[Da89] Davies, E.B.: Heat Kernel and Spectral Theory. Cambridge: Cambridge University Press, 1989
[dC92] do Carmo, M.P.: Riemannian Geometry. Boston: Birkhäuser, 1992
[El95] Elizalde, E.: Ten Physical Applications of Spectral Zeta Functions. Berlin: Springer-Verlag, 1995
[EORBZ94] Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A. and Zerbini, S.: Zeta Regularization Techniques with Applications. Singapore: World Scientific, 1994
[EF97] Estrada, R. and Fulling, S.A.: Distributional Asymptotic Expansions of Spectral Functions and the associated Green Kernels. funct-an/9710003
[Fr75] Friedlander, F.G.: The wave equation on a curved space-time. Cambridge: Cambridge University Press, 1975
[Fu91] Fulling, S.A.: Aspects of Quantum Field Theory in Curved Space-Time. Cambridge: Cambridge University Press, 1991
[FR87] Fulling, S.A. and Ruijsenaars, S.N.M.: Phys. Rep. 152, 135 (1987)
[FSW78] Fulling, S.A., Sweeny, M., Wald, R.M.: Commun. Math. Phys. 63, 257 (1978)
[Ga64] Garabedian, P.R.: Partial Differential Equations. New York: John Wiley and Sons, Inc., 1964
[Gi84] Gilkey, P.G.: Invariance theory, the heat equation and the Atiyah–Singer index theorem. Math. lecture series 11. Boston, MA: Publish or Perish Inc., 1984
[KN63] Kobayashi, S. and Nomizu, K.: Foundations of Differential Geometry. Vol. 1. New York: Interscience Publishers, 1963
[Mo98a] Moretti, V.: Commun. Math. Phys. 201, 327 (1999)


[Mo98b] Moretti, V.: One-loop stress-tensor renormalization in curved background: The relation between ζ-function and point-splitting approaches, and an improved point-splitting procedure. UTM 540, gr-qc/9809006, J. Math. Phys., to appear
[Ru97] Rudin, W.: Functional Analysis. New Delhi: TATA McGraw-Hill, 1997
[RS80] Reed, M. and Simon, B.: Functional Analysis. London: Academic Press, 1980
[Sh87] Shubin, M.A.: Pseudodifferential Operators and Spectral Theory. Berlin: Springer-Verlag, 1987
[Ta96] Taylor, M.E.: Partial Differential Equations. Vol. II. New York: Springer, 1996
[TO98] Tognoli, A.: Approximation Theorems in real analytic and algebraic geometry. In: Lectures in Real Geometry. Ed. F. Broglia, Berlin–New York: Walter de Gruyter & Co., 1998
[Wa78] Wald, R.M.: Phys. Rev. D 17, 1477 (1978)
[Wa79] Wald, R.M.: Commun. Math. Phys. 70, 226 (1979)
[Wa94] Wald, R.M.: Quantum Field Theory and Black Hole Thermodynamics in Curved Spacetime. Chicago: The University of Chicago Press, 1994

Communicated by H. Araki

Commun. Math. Phys. 208, 309 – 330 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Boundary Layers for the Navier–Stokes Equations of Compressible Fluids

Hermano Frid$^{1,*}$, Vladimir Shelukhin$^{1,2,**}$

1 Instituto de Matemática, Universidade Federal do Rio de Janeiro, Caixa Postal 68530, CEP 21945-970, RJ, Brazil. E-mail: [email protected]; [email protected]
2 Lavrentyev Institute of Hydrodynamics, Novosibirsk, 630090, Russia. E-mail: [email protected]

* Research was partially supported by CNPq, proc. 352871/96-2.
** Research was supported by CNPq, proc. 301783/96-9.

Received: 11 December 1998 / Accepted: 9 June 1999

Abstract: The global unique solvability is proved for the Navier–Stokes equations of compressible fluids for the one-dimensional spiral flows between two circular cylinders. The zero shear viscosity limit $\mu \to 0$ is justified. The value $O(\mu^\alpha)$, $0 < \alpha < 1/2$, is established for the boundary layer thickness.

1. Introduction

The calculation of laminar damping at a rigid boundary goes back to Stokes (1851), who calculated the flow over an oscillatory plane. Based on the Stokes' solution, Blasius (1908) found the laminar boundary layer thickness $\delta$, called the Stokes length. Particularly, for the flow around a rotating circular cylinder the Stokes–Blasius theory states that $\delta \sim O(\mu/(\rho\omega))^{1/2}$, provided $a(\rho\omega/\mu)^{1/2} \gg 1$, where $a$ is the characteristic length, $\omega$ is the angular velocity, $\rho$ is the density, and $\mu$ is the shear viscosity. For more details we refer to [5] and [10].

In the present paper we discuss a mathematical basis for the laminar boundary layer theory. Particularly, we show that there exists a boundary layer thickness function $\delta(\mu)$ such that

\[ \delta(\mu) \to 0 \quad \text{and} \quad \frac{\delta(\mu)}{\mu^{1/2}} \to \infty \quad \text{as } \mu \downarrow 0. \tag{1.1} \]

To this end, we first prove the global unique solvability for the Navier–Stokes equations describing shear one-dimensional flows of a compressible, isentropic fluid between two coaxial circular cylinders. Next, we prove a convergence of solutions as the shear viscosity $\mu$ goes to zero and the dilatational viscosity $\lambda$ is kept fixed and positive. Thus, our results are valid for the fluids with the constitutive law $P = -pI + \lambda\, \mathrm{div}\, \mathbf{v} + 2\mu D$, where the viscosities obey the Duhem inequalities $\mu \geq 0$ and $3\lambda + 2\mu \geq 0$ [18]. Here,


$P$ is the stress tensor, $D$ is the rate of strain tensor, $p$ is the pressure, and $\mathbf{v}$ is the velocity vector. Then, we obtain an estimate for the boundary layer thickness matching property (1.1) as $\mu \downarrow 0$ and $\lambda = \mathrm{const}$. To clarify the estimation technique, we first apply it for incompressible fluids obeying the constitutive law $P = -pI + 2\mu D$.

The Navier–Stokes equations for a compressible isentropic fluid express the conservation of mass and the balance of momentum for flows with the cylindric symmetry as follows [17]:

\[ \rho_t + (\rho u)_x + \frac{\rho u}{x} = 0, \tag{1.2} \]
\[ \rho\left( u_t + u u_x - \frac{v^2}{x} \right) + p_x - (\lambda + 2\mu)\left( u_x + \frac{u}{x} \right)_x = 0, \qquad p = R\rho^\gamma, \tag{1.3} \]
\[ \rho\left( v_t + u v_x + \frac{uv}{x} \right) - \mu\left( v_x + \frac{v}{x} \right)_x = 0, \tag{1.4} \]
\[ \rho(w_t + u w_x) - \mu\left( w_{xx} + \frac{w_x}{x} \right) = 0. \tag{1.5} \]

Here, $u$ is the component of the velocity vector $\mathbf{v}$ along the radial variable $x$, $x \in \Omega = \{0 < r_1 < x < r_2\}$, $v$ is the angular component of $\mathbf{v}$, $w$ is the axial component of $\mathbf{v}$, $\rho$ is the density. The constants $\lambda$, $\mu$, $R$, and $\gamma$ are assumed positive, with $\gamma \geq 1$. In the domain $Q =\, ]0, T[\, \times \Omega$, we consider the initial boundary value problem given by (1.2)–(1.5) and

\[ u = 0, \quad v = v_i(t), \quad w = w_i(t) \quad \text{for } x = r_i,\ i \in \{1, 2\}, \tag{1.6} \]
\[ (\mathbf{v}, \rho)|_{t=0} = (\mathbf{v}_0(x), \rho_0(x)) \quad \text{for } x \in \Omega. \tag{1.7} \]

The boundary conditions (1.6) imply that the fluid sticks at the bounding cylinders which move in such a way that the axis of symmetry is fixed.

Our first result is about the global existence and uniqueness of solutions to problem (1.2)–(1.7). To formulate it, we require that the initial and boundary data satisfy the smoothness conditions

\[ \|\mathbf{v}_0, \rho_0\|^2_{W^{1,2}(\Omega)} < \infty, \quad \inf \rho_0 > 0, \quad \|v_i, w_i\|^2_{C^1([0,T])} < \infty, \tag{1.8} \]

and the compatibility conditions

\[ u_0 = v_0 - v_i(0) = w_0 - w_i(0) = 0 \quad \text{at } x = r_i,\ i \in \{1, 2\}. \tag{1.9} \]

From here on, we use the notations $\|f, g, \cdots\|^2 = \|f\|^2 + \|g\|^2 + \cdots$ for functions $f, g, \cdots$ belonging to the same functional space equipped with a norm $\|\cdot\|$.

Theorem 1.1. Under the assumptions (1.8) and (1.9), there exists a unique solution of problem (1.2)–(1.7) such that

\[ \mathbf{v} \in L^\infty(0, T; W^{1,2}(\Omega)) \cap L^2(0, T; W^{2,2}(\Omega)), \quad \mathbf{v}_t \in L^2(Q), \]
\[ \rho \in L^\infty(0, T; W^{1,2}(\Omega)), \quad \rho_t \in L^\infty(0, T; L^2(\Omega)), \quad \inf_Q \rho > 0. \]


Earlier, existence theorems for solutions with axial symmetry were obtained only for radial flows with $v = w = 0$ [21]. The foundations of the theory of one-dimensional solutions with plane symmetry can be found in the book of S. N. Antontsev, A. V. Kazhikhov, and V. N. Monakhov [1]. A generalization of this theory is given in [25] for one-dimensional equations with a three-dimensional velocity vector. As for existence theorems for the full three-dimensional Navier–Stokes equations, we refer to the results of P.-L. Lions [19], D. Hoff [11], and A. V. Kazhikhov and V. A. Weigant [13]. Since the proof of Theorem 1.1 follows the same lines as in the cited works, we give only a sketch of it here.

To formulate our second result, on the zero shear viscosity limit, we rewrite system (1.2)–(1.5) as follows:

$\rho_t + (\rho u)_r = 0$, (1.10)

$(\rho u)_t + (\rho u^2)_r - \frac{\rho v^2}{x} + p_x - \nu u_{rx} = 0$, $\nu = \lambda + 2\mu$, (1.11)

$(\rho v)_t + (\rho u v)_r + \frac{\rho u v}{x} - \mu v_{rx} = 0$, (1.12)

$(\rho w)_t + (\rho u w)_r - \mu w_{xr} = 0$. (1.13)

Here, we employ the notation $f_r = f_x + \frac{f}{x}$. Observe that the operator $f \mapsto f_r$ has the properties:

$(gh)_r = g_r h + g h_x$, $g_{rx} = g_{xr} - \frac{g}{x^2}$,

$g'(\beta)((\rho\beta)_t + (\rho u \beta)_r) = (\rho g(\beta))_t + (\rho u g(\beta))_r = 0$,

$\int_\Omega \varphi \psi_r\, x\,dx = -\int_\Omega \psi \varphi_x\, x\,dx$, $\int_\Omega x \xi_r\,dx = r_2 \xi(r_2) - r_1 \xi(r_1)$,

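for any functions $\beta \in C^1(Q)$, $g \in C^1(\mathbb{R})$, $\varphi \in \mathcal{D}(\Omega)$, $\psi \in \mathcal{D}(\Omega)$, and $\xi \in \mathcal{D}(\mathbb{R})$.

Theorem 1.2. Assume that $v_0$ and $w_0$ are uniformly bounded in $L^\infty(\Omega)$ with respect to $\mu$, and $v_0, w_0 \to \bar v_0, \bar w_0$ in $L^1(\Omega)$ as $\mu \to 0$, with $\bar v_0, \bar w_0 \in L^\infty(\Omega)$. Then there exist functions $\bar\rho$ and $\bar{\mathbf{v}} = (\bar u, \bar v, \bar w)$ such that

$\bar u \in L^\infty(0,T; W_0^{1,2}(\Omega)) \cap L^2(0,T; W^{2,2}(\Omega))$, $\bar u_t \in L^2(Q)$, $0 < \bar\rho \in L^\infty(Q)$; $\bar\rho_x, \bar\rho_t \in L^\infty(0,T; L^2(\Omega))$; $\bar u, \bar v, \bar w \in L^\infty(Q)$,

and the solutions $(\rho, \mathbf{v})$ of problem (1.2)–(1.7) converge to $(\bar\rho, \bar{\mathbf{v}})$ as follows:

$\rho, u \to \bar\rho, \bar u$ in $C^\alpha(Q)$; $v, w \to \bar v, \bar w$ in $L^q(Q)$, $\mu \downarrow 0$, (1.14)

strongly for some $\alpha \in\, ]0, 1[$ and for any $q \in [1, \infty[$. In addition,

$u_t, u_x, u_{xx} \rightharpoonup \bar u_t, \bar u_x, \bar u_{xx}$ (1.15)

weakly in $L^2(Q)$ and

$\rho_t, \rho_x \rightharpoonup \bar\rho_t, \bar\rho_x$ (1.16)

$*$-weakly in $L^\infty(0,T; L^2(\Omega))$.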


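The limit functions solve the limit equations (1.10)–(1.13) in the following sense:

$\bar\rho_t + (\bar\rho \bar u)_r = 0$, (1.17)

$(\bar\rho \bar u)_t + (\bar\rho \bar u^2)_r - \frac{\bar\rho \bar v^2}{x} + p(\bar\rho)_x - \lambda \bar u_{rx} = 0$, (1.18)

$\int_Q \bar\rho \cdot \bar v\,(\varphi_t + \bar u \varphi_x - \frac{\bar u}{x}\varphi)\,x\,dt\,dx + \int_\Omega \rho_0 v_0 \varphi(0,x)\,x\,dx = 0$, (1.19)

$\int_Q \bar\rho \cdot \bar w\,(\varphi_t + \bar u \varphi_x)\,x\,dt\,dx + \int_\Omega \rho_0 w_0 \varphi(0,x)\,x\,dx = 0$, (1.20)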

for any $\varphi \in \mathcal{D}(]-\infty, T[ \times \Omega)$.

The proof of this theorem is based upon estimates uniform in $\mu$ and a framework suitable for transport equations, which allows one to improve weak convergence to strong convergence by analyzing the equations deduced for $\Phi(z)$ and $\Phi(\bar z)$, where $z$ is either of the two velocity components $v$ or $w$ and $\Phi : \mathbb{R} \to \mathbb{R}$ is a convex function. Here and in what follows we use the bar symbol to denote a weak limit as $\mu \downarrow 0$. This idea of improving weak convergence by taking advantage of the form of transport equations first appeared in the work of Kazhikhov (see [1], Chapter 3); DiPerna and Lions [6] then formulated it clearly in terms of the notion of renormalization. Further generalizations and applications were given by Lions [19,20], Hoff [11], Kazhikhov and Weigant [13], and Kazhikhov and Shelukhin [12,26].

Our third and main result is about the boundary layer effect. We call a function $\delta(\mu)$ the boundary layer thickness (BL-thickness) for problem (1.2)–(1.7) with vanishing $\mu$ if $\delta(\mu) \downarrow 0$ as $\mu \downarrow 0$, and

$\liminf_{\mu \to 0} \|\rho_\mu - \bar\rho, \mathbf{v}_\mu - \bar{\mathbf{v}}\|_{L^\infty(Q)} > 0$, (1.21)

$\lim_{\mu \to 0} \|\rho_\mu - \bar\rho, \mathbf{v}_\mu - \bar{\mathbf{v}}\|_{L^\infty(]0,T[ \times \Omega_{\delta(\mu)})} = 0$, (1.22)

where $\Omega_\delta = \{x : r_1 + \delta < x < r_2 - \delta\}$ and $\rho_\mu, \mathbf{v}_\mu$ is the corresponding solution of problem (1.2)–(1.7). Clearly, this definition does not determine the BL-thickness uniquely, since any function $\delta_*(\mu)$ satisfying the inequality $\delta_*(\mu) \ge \delta(\mu)$ for small $\mu$ is also a BL-thickness. To make the proof of the existence of a BL-thickness simpler, we restrict ourselves to the following initial data:

$\rho|_{t=0} = \rho_0 = \mathrm{const} > 0$, $\mathbf{v}|_{t=0} = 0$.

(1.23)

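Theorem 1.3. Let the assumptions of Theorem 1.1 and assumption (1.23) be satisfied. Then $\bar\rho = \rho_0$, $\bar{\mathbf{v}} = 0$, and any function $\delta(\mu)$ satisfying the conditions $\delta(\mu) \to 0$ and $\delta(\mu)/\mu^{1/2} \to \infty$ as $\mu \to 0$ is a BL-thickness, i.e.

$\liminf_{\mu \to 0} \|\rho_\mu - \rho_0, \mathbf{v}_\mu\|_{L^\infty(0,T; C(\bar\Omega))} > 0$,

$\lim_{\mu \to 0} \|\rho_\mu - \rho_0, \mathbf{v}_\mu\|_{L^\infty(0,T; C([r_1 + \delta(\mu),\, r_2 - \delta(\mu)]))} = 0$.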
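The role of the condition $\delta(\mu)/\mu^{1/2} \to \infty$ can be illustrated on a much simpler model than the system studied here. The sketch below is only an illustration under stated assumptions: it uses the heat equation $v_t = \mu v_{xx}$ on the half-line $x > 0$ with zero initial data and boundary value 1, whose solution is the classical profile $\mathrm{erfc}(x / (2\sqrt{\mu t}))$. The function names are hypothetical and the model is not the paper's problem (1.2)–(1.7).

```python
import math


def model_layer_profile(x, t, mu):
    """Model problem (an assumption, not the paper's system):
    v_t = mu * v_xx for x > 0, v(0, x) = 0, v(t, 0) = 1."""
    return math.erfc(x / (2.0 * math.sqrt(mu * t)))


def sup_outside_layer(delta, t, mu):
    """Supremum of the profile over x >= delta.

    The profile is decreasing in x, so the supremum is attained at x = delta.
    """
    return model_layer_profile(delta, t, mu)


def thickness_table(exponent, t=1.0, mus=(1e-2, 1e-4, 1e-6, 1e-8)):
    """Pairs (mu, sup over x >= delta(mu)) for the candidate delta(mu) = mu**exponent."""
    return [(mu, sup_outside_layer(mu ** exponent, t, mu)) for mu in mus]


if __name__ == "__main__":
    # exponent < 1/2: delta / sqrt(mu) -> infinity, the deviation vanishes.
    # exponent = 1/2: the deviation stays at a fixed positive level.
    # exponent > 1/2: the deviation approaches the boundary value 1.
    for exponent in (0.25, 0.5, 0.75):
        print(exponent, thickness_table(exponent))
```

In this model the deviation from the limit state $v \equiv 0$ stays equal to 1 at the wall for every $\mu$, which mirrors the positive lower limit in the definition, while outside a strip of width $\mu^{1/4}$ it tends to zero.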


The analogous result for incompressible flows is also valid; it is formulated below.

The question of boundary layers for the Navier–Stokes equations has been addressed in a number of works for the case of incompressible fluids. Serrin [24] studied the zero viscosity limit problem for the radial flow between two straight converging walls. Fife [7] obtained a boundary layer estimate for the stationary equations in a half space. Temam and Wang [27] derived explicit formulas for the boundary layers occurring in linearized channel flows. Caflisch and Sammartino [3,4] justified the Prandtl boundary layer theory in the class of analytic solutions. The Prandtl equations proper, which according to Prandtl govern flows in the boundary layer, were studied by Oleinik [22] and Khusnutdinova [14]. The boundary layer problem also arises in the theory of hyperbolic systems when parabolic equations with small viscosity are applied as perturbations. Serre and Gisclon [8,23] developed a method to detect the boundary layer effect for a viscous perturbation of some class of quasi-linear hyperbolic systems in one space dimension. Grenier and Guès [9] generalized their approach to the multi-dimensional case.

2. Estimates Independent of µ

For later use we denote by $(f, g)_\Omega$, $\|f\|$, $\|f\|_{q,\Omega}$, $\|f\|_{q,Q}$, and $\|f\|_{q,r}$ the scalar product in $L^2(\Omega)$ and the norms in $L^2(\Omega)$, $L^q(\Omega)$, $L^q(Q)$, and $L^r(0,T; L^q(\Omega))$ respectively. We denote by $c$ different positive constants depending on $T$ and the norms in (1.8) and independent of $\mu$.

It follows from (1.10) that the integral $\int_\Omega x\rho\,dx$ does not depend on time. This is our first a priori estimate, since we look for solutions such that $\rho > 0$. By the maximum principle applied to Eq. (1.5), $\|w\|_{\infty,Q} \le c$. This is our second estimate.

Lemma 2.1. There is a constant $c$ such that

$\|\rho|\mathbf{v}|^2, G(\rho)\|_{1,\infty} + \|\nu u_x^2, \mu v_x^2, \mu w_x^2, \nu u^2, \mu v^2\|_{1,Q} \le c$,

where $G(\rho)$ is the nonnegative function defined as follows: $G = R\rho^\gamma/(\gamma - 1)$ if $\gamma > 1$ and $G = R(1 - \rho + \rho \ln \rho)$ if $\gamma = 1$.

Proof.
System (1.10)–(1.13) is endowed with the equation for the energy function e = v2 /2, u2 v2 (xρe)t + (xρue)x + xupx + x(νu2x + µvx2 + µwx2 + ν 2 + µ 2 ) x x u2 v2 (2.1) + µvvx + µ + µwwx ) + νu2 + µv 2 )x = 0. x x On the other hand, it follows from Eqs. (1.10), (1.12), and (1.13) respectively that −(x(νuux + ν

(px , xu) =

d (x, G) , dt

||µvvx |ri = µ(v2 − v1 )vi −

µvi2 || 1 + µvi ( , v) ri x


d +vi (1, dt

Zri

Zri ρvdy) − vi (ρ, uv) + (vi ,

x

x

||µwwx |ri = µ(w2 − w1 )wi − +wi

d (1, dt

µwi2 || ri

ρuv dy) , y

1 1 + µwi ( , w) − µ||wi ( 2 , w) x x

Zri ρwdy) − wi (ρ, uw) + ||wi (ρ, x

uw ) . x

Thus, to prove the lemma, one should integrate Eq. (2.1). □

Lemma 2.2. There is a constant $c$ such that $\|\rho, \rho^{-1}\|^2_{\infty,Q} \le c$.

Proof. We write Eq. (1.11) in the form

$(\rho u)_t + (\rho u^2 + p - \nu u_r + \sigma)_x = 0$, $\sigma = \int_{r_1}^{x} \frac{\rho}{y}(u^2 - v^2)\,dy$.

Hence, the function

$\varphi(t,x) = \int_0^t (\nu u_r - \rho u^2 - p - \sigma)\,d\tau + \int_{r_1}^{x} \rho_0 u_0\,dy$

satisfies the equalities

$\varphi_x = \rho u$, $\varphi_t = \nu u_r - \rho u^2 - p - \sigma$. (2.2)

Observe that $\|x\varphi_x\|^2_{1,\Omega} \le \|x\rho\|_{1,\Omega} \|x\rho u^2\|_{1,\Omega}$ and

$(x, \varphi)_\Omega = -\int_0^t (x, \rho u^2 + p + \sigma)_\Omega\,d\tau + (x, \int_{r_1}^{x} \rho_0 u_0\,dy)_\Omega$.

Thus, $\|\varphi\|_{\infty,Q} \le c$. Given a function $F_1(\varphi)$, we compute the material derivative $D_t(\rho F_1) \equiv (\frac{\partial}{\partial t} + u\frac{\partial}{\partial x})\rho F_1$. Using (2.2), we have

$D_t(\rho F_1) = -\rho F_1 u_r + \rho F_1'(\nu u_r - p - \sigma)$.

The choice $F_1(\varphi) = \exp(\varphi/\nu)$ results in $D_t(\rho F_1) \le \rho F_1 |\sigma/\nu| \le c\rho F_1$. Thus, the first estimate of the lemma is proved. Next, we compute the material derivative of the function $F_2(\varphi)\rho^{-1}$, where $F_2 = \exp(-\varphi/\nu)$. We have

$D_t(\frac{F_2}{\rho}) = \frac{F_2}{\rho} A$, $A = \frac{p + \sigma}{\nu}$.

Since $\int_0^t \max_x A\,d\tau \le c$, the second estimate of the lemma follows. □

As a consequence, we have the estimate $\|u\|_{2,\infty} \le c$.


Lemma 2.3. There is a constant $c$ such that $\|v\|_{\infty,Q} \le c$.

Proof. Let us denote

$v_* = \max\{\sup |v_0|, \sup_{i,[0,T]} |v_i(t)|\}$, $a(t) = \sup_x |\frac{u(t,x)}{x}|$, $z = v \exp(-\int_0^t a\,d\tau)$.

Given a function $F(r)$, we set $g(r) = \int_{-v_*}^{r} s F''(s)\,ds$ and derive from (1.4) that the function $F(z(t,x))$ satisfies the equation

$(x\rho F(z))_t + (x\rho u F(z))_x + zF'(z)(xa\rho + \rho u + \frac{\mu}{x}) + \mu x F''(z) z_x^2 + \mu\frac{\partial}{\partial x}(F(z) + g(z) - x(z_x + \frac{z}{x})F'(z)) = 0$. (2.3)

Let us choose

$F(r) = \sqrt{(\mathrm{dist}(r, I))^2 + \delta^2} - \delta$, $I = \{r : |r| \le v_*\}$.

With this choice, $rF'(r) \ge 0$, $F''(r) \ge 0$ for all $r \in \mathbb{R}$, and the functions $F(z)$, $g(z)$, and $F'(z)$ vanish at $x = r_i$ and at $t = 0$. Now, we integrate Eq. (2.3) and send $\delta$ to zero to conclude that $\|z\|_{\infty,Q} \le v_*$. Since $\int_0^T a(t)\,dt \le c$, the lemma is proved. □

Lemma 2.4. There is a constant $c$ such that $\|\rho_x\|_{2,\infty} \le c$.

Proof. We set $\beta = u - \nu(1/\rho)_x$ and find, by (1.2) and (1.3), that

$(\rho\beta^2)_t + (\rho u \beta^2)_x = \frac{\rho u}{x}\beta^2 - 2\beta(p_x + \frac{\rho(u^2 - v^2)}{x})$.

Hence,

$\frac{d}{dt}(\rho, \beta^2)_\Omega + \frac{2}{\nu}(\rho^2 p'(\rho), \beta^2)_\Omega = (\frac{\rho u}{x}, \beta^2)_\Omega + \frac{2}{\nu}(\rho^2 p'(\rho), \beta u)_\Omega - 2(\beta\rho, \frac{u^2 - v^2}{x})_\Omega$.

Since $\|u\|^2_{\infty,2} \le c\|u_x\|^2_{2,Q} \le c$ and $\|u\|^4_{4,Q} \le \|u\|^2_{2,\infty}\|u\|^2_{\infty,2} \le c$, the assertion of the lemma follows from the last equality. □

Lemma 2.5. There is a constant $c$ such that $\|u_{xx}, u_t\|^2_{2,Q} + \|u_x\|^2_{2,\infty} \le c$ and $\|u\|_{\infty,Q} \le c$.

Proof. The equality

$\rho u_t^2 + \frac{\nu^2}{\rho}u_{xx}^2 - 2\nu u_t u_{xx} = \rho(-uu_x + \frac{v^2}{x} - \frac{p_x}{\rho} + \frac{\nu}{\rho}(\frac{u}{x})_x)^2$,

which holds due to (1.3), gives

$\nu\frac{d}{dt}\|u_x\|^2 + \|\sqrt{\rho}\,u_t, \frac{\nu}{\sqrt{\rho}}u_{xx}\|^2 \le c\|uu_x, v^2, \rho_x, u_x, u\|^2$.

Since $\|uu_x\|^2 \le \|u\|^2_{\infty,\Omega}\|u_x\|^2$ and $\|u\|_{\infty,2} \le c$, the first estimate of the lemma is proved. Now, the second one is a corollary. □

As another corollary, we derive from Eq. (1.2) that $\|\rho_t\|_{2,\infty} \le c$.


3. Zero Shear Viscosity Limit

Here, we prove Theorem 1.2, assuming that Theorem 1.1 holds. For simplicity, in what follows we agree that $s_\mu \to s$ means that there is a sequence $\mu_n \downarrow 0$ such that $s_{\mu_n} \to s$. It is implicit in this section that the functions $\rho$ and $\mathbf{v}$ depend on $\mu$, and we use the notation $s_\mu = (\rho, \mathbf{v})$ for the vector solution. It follows from the above estimates that $s_\mu$ converges to some $\bar s \equiv (\bar\rho, \bar{\mathbf{v}})$ weakly in $L^2(Q)$. Let us show that this convergence can be improved.

The fact that $\rho \to \bar\rho$ in $L^q(Q)$, $1 \le q < \infty$, follows from the uniform boundedness of $\rho$ in $W^{1,2}(Q) \cap L^\infty(Q)$ and the Sobolev imbedding theorem. The uniform estimates $\|\rho_t, \rho_x\|_{2,\infty} \le c$ immediately give that $\rho \in L^\infty(0,T; C^{1/2}(\bar\Omega))$. Further, for $x \le (r_1 + r_2)/2$ and $\varepsilon > 0$ small enough, we have

$|\rho(t_1,x) - \rho(t_2,x)| \le \frac{1}{\varepsilon}\int_x^{x+\varepsilon} |\rho(t_1,y) - \rho(t_2,y)|\,dy + c\sqrt{\varepsilon} \le \frac{c}{\sqrt{\varepsilon}}|t_1 - t_2| + c\sqrt{\varepsilon}$,

and so, choosing $\varepsilon = O(|t_1 - t_2|)$, we get $|\rho(t_1,x) - \rho(t_2,x)| \le c|t_1 - t_2|^{1/2}$ for some $c > 0$ independent of $\mu$. Similarly, we obtain the same inequality for $x > (r_1 + r_2)/2$. Hence, $\rho$ is uniformly bounded in $C^{1/2}(\bar Q)$. Thus, $\rho \to \bar\rho$ in $C^\alpha(\bar Q)$ for any $\alpha < 1/2$, and $\bar\rho \in C^{1/2}(\bar Q) \cap L^\infty(0,T; W^{1,2}(\Omega))$ and $\bar\rho_t \in L^\infty(0,T; L^2(\Omega))$.

Let us consider the sequence $\rho u$, $\mu \downarrow 0$. By the above estimates, $\|(\rho u)_x\|_{2,Q} \le c$, and one may derive from (1.11) that the sequence $(\rho u)_t$, $\mu \downarrow 0$, is bounded in $L^2(0,T; W^{-1,2}(\Omega))$. Thus, by the Aubin–Lions theorem, the sequence $\rho u$, $\mu \downarrow 0$, converges in $L^2(Q)$. Now, the inequality

$|u^\mu - u^\nu| \le \frac{|\rho^\mu u^\mu - \rho^\nu u^\nu|}{\rho^\mu} + |\frac{1}{\rho^\mu} - \frac{1}{\rho^\nu}||\rho^\nu u^\nu|$

implies that $u \to \bar u$ in $L^2(Q)$. Since $\|u\|_{\infty,Q} \le c$, we have, by an interpolation argument, that $u \to \bar u$ in $L^q(Q)$ for any $q \in [1, \infty[$. Clearly, $\bar u \in L^\infty(Q) \cap L^2(0,T; W^{2,2}(\Omega)) \cap L^\infty(0,T; W_0^{1,2}(\Omega))$ and $\bar u_t \in L^2(Q)$. By the arguments above, there exists $\alpha \in\, ]0, 1[$ such that $u \to \bar u$ in $C^\alpha(\bar Q)$ and $\bar u \in C^\alpha(\bar Q)$.

Let us consider the sequence $w$, $\mu \downarrow 0$. We start from the observation that the functions $\bar\rho$, $\bar u$, and $\bar w$ satisfy Eqs. (1.17) and (1.20). Due to the regularity derived above for the solution vector $\bar s$, one can, by a continuity argument, replace the set of test functions $\mathcal{D}(]-\infty, T[ \times \Omega)$ in equality (1.20) by $W^{1,2}(Q)_\Pi$, where the latter denotes the closure of $\mathcal{D}(]-\infty, T[ \times \Omega)$ in $W^{1,2}(Q)$. Further, the test set $W^{1,2}(Q)_\Pi$ can be extended to $W^{1,2}(Q)_T$, the closure of $\mathcal{D}(]-\infty, T[ \times \mathbb{R})$ in $W^{1,2}(Q)$. Indeed, given a function $\psi \in W^{1,2}(Q)_T$, we see that $\psi\xi_\delta/\delta \in W^{1,2}(Q)_\Pi$, where $\xi_\delta(x) = \min\{\delta, \mathrm{dist}(x, \partial\Omega)\}$. Now, to justify the extension, one needs only to prove that

$\lim_{\delta \to 0} \frac{1}{\delta}\int_Q x\,\bar\rho \cdot \bar u \cdot \bar w\,\psi(\xi_\delta)_x\,dx\,dt = 0$.

But this equality holds since $\bar u \in L^\infty(0,T; W_0^{1,2}(\Omega))$.


Next, we pass to the Lagrangian coordinates $(t, y)$ by the formulas

$y(t,x) = r_1 + \int_{r_1}^{x} \rho(t,s)\,s\,ds$, $y_x = x\rho$, $y_t = -x\rho \cdot u$. (3.1)
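The change of variables (3.1) can be sketched numerically at a fixed time. The snippet below is a minimal illustration, assuming the density is sampled on a grid in $[r_1, r_2]$; the function name and the trapezoid rule are choices made here for illustration and are not taken from the paper.

```python
def lagrangian_coordinate(xs, rho):
    """Cumulative trapezoid approximation of y(x) = r1 + int_{r1}^{x} rho(s) * s ds.

    xs  : increasing grid points with xs[0] = r1
    rho : density samples at the grid points (at one fixed time)
    """
    ys = [xs[0]]
    for k in range(1, len(xs)):
        h = xs[k] - xs[k - 1]
        # trapezoid rule applied to the integrand s * rho(s)
        ys.append(ys[-1] + 0.5 * h * (rho[k] * xs[k] + rho[k - 1] * xs[k - 1]))
    return ys


if __name__ == "__main__":
    r1, r2, n = 1.0, 2.0, 100
    xs = [r1 + (r2 - r1) * k / n for k in range(n + 1)]
    # A constant density normalized so that int x * rho dx = r2 - r1.
    rho = [2.0 / (r1 + r2)] * (n + 1)
    ys = lagrangian_coordinate(xs, rho)
    print(ys[0], ys[-1])
```

With the normalization $\int_\Omega x\rho\,dx = r_2 - r_1$ used in the example, the map sends $[r_1, r_2]$ onto itself, and since $\rho > 0$ it is increasing, with slope $y_x = x\rho$.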

Without loss of generality, we may assume that $\int_\Omega x\rho_0\,dx = r_2 - r_1$. With this change of variables, the functions $s(t, \cdot)$ and $s(t, y)$ are related again to the domains $\Omega$ and $Q$ respectively. Now, in the new coordinates, Eq. (1.20) for $\bar w(t,x)$ with the test set $W^{1,2}(Q)_T$ reads

$\int_Q \bar w(t,y)\Psi_t(t,y)\,dt\,dy + \int_\Omega w_0(y)\Psi(0,y)\,dy = 0$. (3.2)

This integral law holds for all $\Psi \in C^1(Q)_T$, the closure of $\mathcal{D}(]-\infty, T[ \times \mathbb{R})$ in $C^1(\bar Q)$. Indeed, given $\Psi \in C^1(Q)_T$, the function $\psi(t,x) = \Psi(t, y(t,x))$ belongs to $W^{1,2}(Q)_T$ in view of (3.1). Equation (3.2) implies that $\bar w(t,y) = w_0(y)$ a.e. on $]0, T[$. Hence $\bar w^2(t,y) = w_0^2(y)$ a.e. on $]0, T[$, and

$\int_Q \bar w^2(t,y)\Psi_t(t,y)\,dt\,dy + \int_\Omega w_0^2(y)\Psi(0,y)\,dy = 0$ (3.3)

for all $\Psi(t,y) \in C^1(Q)_T$. By a continuity argument, (3.3) holds for all $\Psi(t,y)$ such that

$\Psi, \Psi_t \in L^1(Q)$, $\Psi|_{t=T} = 0$, $\Psi \ge 0$. (3.4)

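Let us introduce the functions

$b = \int_{r_1}^{x} \xi\rho(t,\xi)\,d\xi \Big/ \int_{r_1}^{r_2} \xi\rho(t,\xi)\,d\xi$, $\alpha = w - (bw_2 + (1 - b)w_1)$.

One can verify that $\alpha$ vanishes at $x = r_i$ and solves the equation

$\rho(\alpha_t + u\alpha_x) - \mu\alpha_{xx} - \mu\frac{\alpha_x}{x} = -\rho(1 - b)w_{1t} - \rho b w_{2t} + \frac{\mu(w_2 - w_1)}{r_2 - r_1}(2\rho + x\rho_x)$.

Given a function $\psi(t,x) \in C^1(Q)_T$, we multiply this equation by $2x\alpha\psi$, integrate, and send $\mu$ to zero. As a result we obtain

$J_1(\psi) \equiv \int_Q x\,\bar\rho \cdot \overline{\alpha^2}(\psi_t + \bar u\psi_x)\,dt\,dx + \int_\Omega x\rho_0\alpha_0^2\psi(0,x)\,dx - 2\int_Q x\,\bar\alpha\psi\bar\rho[(1 - \bar b)w_{1t} + \bar b w_{2t}]\,dt\,dx = 2\langle\overline{\mu x\alpha_x^2}, \psi\rangle$. (3.5)

Here, $\overline{\mu x\alpha_x^2}$ is a nonnegative Radon measure on $Q$, a weak limit of $\mu x\alpha_x^2$, $\mu \downarrow 0$, in the space of signed Radon measures on $Q$. It is a simple consequence of (3.5) that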


J1 (ψ) ≥ 0, ∀ψ(t, x) ∈ W 1,2 (Q)T+ ,

(3.6)

the subindex “+” denoting nonnegativity. By switching to the Lagrangian coordinates, (3.6) reads Z Z J2 (9) ≡

α 2 (t, y)9t (t, y)dtdy +

(3.7)

Q

2 − r2 − r1

α02 (y)9(0, y)dy

Z

α(t, y)9(t, y)(yw2t + (1 − y)w1t )dtdy ≥ 0 Q

C 1 (Q)T+ .

This is true because any function 9(t, y) from C 1 (Q)T+ is for all 9(t, y) ∈ transformed by (3.1) into the function ψ(t, x) ≡ 9(t, y(t, x)) belonging to W 1,2 (Q)T+ . By continuity, the set (3.4) fits (3.7) as a test set as well. Using the equalities ¯ 2 + (1 − b)w ¯ 1 ), ¯ y) = y, w(t, b(t, ¯ y) = w0 (y), α¯ = w¯ − (bw α 2 (t, y) = w 2 (t, y) − 2α(t, y)w(t, y) + α 2 (t, y), one can compute that

Z

0 ≤ J2 (9) =

Z w2 (t, y)9t (t, y)dtdy +

w02 (y)9(0, y)dy

(3.8)

Q

for all 9(t, y) satisfying (3.4). Comparing (3.3) and (3.8) on the test set (3.4), we find that w2 (t, y) ≤ w2 (t, y) a.e. in Q. On the other hand, by convexity argument, w2 (t, y) ≥ w 2 (t, y) a.e. in Q. Hence, w 2 (t, x) = w2 (t, x) a.e. in Q. This implies that w → w in L2 (Q). Since the sequence w, µ ↓ 0, is bounded in L∞ (Q) the last convergence holds also in Lq (Q) for any q ∈ [1, ∞[. As a consequence, we derive from (3.5) that 0 = hµxαx2 , ψi = hµxwx2 , ψi, ∀ψ(t, x) ∈ C 1 (Q)T . We treat the sequence v, µ ↓ 0, in the same manner. As above, Eq. (1.19) holds with the test set W 1,2 (Q)T . The switching to the Lagrangian coordinates transforms (1.19) into Z Z v(t, y) · u(t, y) 9(t, y) dt dy + v0 (y)9(0, y) dy = 0 (3.9) v(t, y)9t (t, y) − x(t, y)

Q

for all 9 ∈ C 1 (Q)T . Using the estimates of Sect. 2, one can prove that the set (3.4) may be taken as a test set in (3.9). Rt Given η(t, y) ∈ C 1 (Q)T , we choose 9 = ηeU , where U = 0 u(s, y)/x(s, y) ds. This choice is possible since u ∈ L2 (0, T ; W01,2 ()). Denoting V = v(t, y)eU , we see that Z Z V ηt dt dy + v0 (y)η(0, y) dy = 0 Q


for all η ∈ C 1 (Q)T . Hence, V (t, y) = v0 (y) a.e. on ]0, T [ and we arrive at the representation formula v(t, y) = v0 (y)e−U (t,y) . Clearly, V 2 (t, y) = v02 (y) a.e. on ]0, T [, i.e. Z Z 2v 2 (t, y) · u(t, y) v 2 (t, y)9t (t, y) − 9(t, y) dt dy + v02 (y)9(0, y) dy = 0 x(t, y)

Q

(3.10) for all 9 ∈ C 1 (Q)T . Again, we can extend the test set C 1 (Q)T to (3.4). Now, we introduce the function β = v − (bv2 + (1 − b)v1 ), multiply Eq. (1.4) by 2xβψ, where ψ(t, x) ∈C 1 (Q)T , and send µ to zero. As a result, we have Z Z Z xρ · β 2 (ψt + uψx )dtdx + xρ0 β02 ψ(0, x)dx − 2 ρ¯ uβ ¯ 2 ψ dtdx (3.11)

Q

Z −2 Q

Q

¯ 1 + bv ¯ 1t + bv ¯ 2t + u¯ ((1 − b)v ¯ 2 )]dtdx = 2hµxβx2 , ψi ≥ 0 ¯ ρ[(1 x βψ ¯ − b)v x

for all ψ ∈C 1 (Q)T+ . Here, µxβx2 is a nonnegative Radon measure. Switching to the Lagrangian coordinates gives Z Z Z u¯ 2 2 2 β 9t dtdy + β0 (y)9(0, y)dy − β 9dtdy x

Q

Z −2

Q

¯ 2t + (1 − b)v ¯ 1t + β9{bv

Q

u¯ ¯ 1 + bv ¯ 2 )}dtdy ≥ 0. ((1 − b)v x

By the arguments above, the set (3.4) can be chosen as a test set for this inequality. Due to the formulas b¯ =

y , v¯ = v0 (y) exp (− r2 − r1

Zt 0

u(s, ¯ y) ¯ 1 + bv ¯ 2 ), ds), β¯ = v¯ − ((1 − b)v x(s, y)

¯ 1 + bv ¯ 2 ) + ((1 − b)v ¯ 1 + bv ¯ 2 )2 , β 2 = v 2 − 2v((1 − b)v the last inequality reduces to Z Z Z u(t, ¯ y) 2 2 2 v (t, y)9t (t, y)dtdy + v0 (y)9(0, y)dy − 2 v 9dtdy ≥ 0 x(t, y) Q

(3.12)

Q

for all 9(t, y) satisfying (3.4). The comparison of (3.10) and (3.12) gives v 2 (t, y) = v 2 (t, y) a.e. in Q. Hence, v(t, x) → v(t, x) in L2 (Q) as µ ↓ 0. Clearly, this convergence is also valid in Lq (Q) for any q ∈ [1, ∞[. It follows from (3.11) that hµxvx2 , ψi = 0 for all ψ ∈C 1 (Q)T . As a consequence of the above strong convergence we have that equation (1.18) is satisfied. The weak convergences (1.15) and (1.16) of the derivatives follow from the estimates in Sect. 2.


Thus, Theorem 1.2 is proved for some sequence $\mu_n \downarrow 0$. Let us show that this theorem holds for any sequence $\mu_n \downarrow 0$. To this end, it suffices to prove that the limit problem (1.17)–(1.20) has a unique solution. Given a solution $(\rho, \mathbf{v})$ of problem (1.17)–(1.20), we introduce the Lagrangian variables $(t, y)$ by formulas (3.1). In the new coordinates, the functions $\rho(t,y)$, $\mathbf{v}(t,y)$, and $x(t,y)$ solve the following initial boundary value problem in the domain $Q$:

$(\frac{u}{x})_t + \frac{u^2 - v^2}{x^2} = \lambda(x\rho u_y + \frac{u}{x})_y - p(\rho)_y$, $(x\rho)_t + x^2\rho^2 u_y = 0$, (3.13)

$\int_Q (v\Psi_t - \frac{vu}{x}\Psi)\,dt\,dy + \int_\Omega v_0(y)\Psi(0,y)\,dy = 0$, (3.14)

$\int_Q w\Psi_t\,dt\,dy + \int_\Omega w_0(y)\Psi(0,y)\,dy = 0$, (3.15)

$x_t = u$, $\rho x x_y = 1$, $u|_{\partial\Omega} = 0$, $u|_{t=0} = u_0(y)$, $\rho|_{t=0} = \rho_0(y)$, (3.16)

where $\Psi(t,y)$ is any function from $C^1(Q)_T$. It follows from (3.14) and (3.15) that

$w(t,y) = w_0(y)$, $v(t,y) = v_0(y)\exp(-\int_0^t \frac{u(s,y)}{x(s,y)}\,ds)$.

Hence, we need to show that the functions $\rho$, $u$, $v$, and $x$ are uniquely defined by problem (3.13)–(3.16). Setting $U = u/x$, $V = v/x$, $R = x_y$, one can rewrite Eqs. (3.13) and (3.14) as follows:

$U_t + U^2 - V^2 = \lambda(\frac{x}{R}U_y + 2U)_y - p(\rho)_y$, $V_t + 2VU = 0$, $R_t = (xU)_y$, $x_t = xU$, $x\rho R = 1$.

Given two solutions $s_i \equiv (U_i, V_i, R_i, x_i)$, $i \in \{1, 2\}$, we introduce the differences $s = s_1 - s_2$, $p = p(\rho_1) - p(\rho_2)$ and find that

$U_t + U(U_1 + U_2) - V(V_1 + V_2) = \lambda(\frac{x_1}{R_1}U_y)_y + 2\lambda U_y - p_y - \lambda(\frac{x_1 R}{R_1 R_2}U_{2y})_y + \lambda(\frac{x}{R_2}U_{2y})_y$,

$V_t + 2VU_1 + 2V_2U = 0$, $R_t = (x_1U + xU_2)_y$, $x_t = xU_1 + x_2U$.

Multiplying these equations by $U$, $V$, $R$, and $x$ respectively, we arrive at the inequalities

$\|V, R, x, p\|^2 \le c\int_0^t \|U_y\|^2\,dt$,

$\frac{1}{2}\frac{d}{dt}\|U\|^2 + c_1\|U_y\|^2 \le \frac{c_1}{2}\|U_y\|^2 + c_2\|U, V, p\|^2 + c_2\|R, x\|^2\|U_{2yy}\|^2$.

Hence, the function $z(t) = \|U\|^2 + \int_0^t \|U_y\|^2\,dt$ vanishes at $t = 0$ and solves the inequality $z' \le c(t)z$, with $c(t) \in L^1(0,T)$. Thus, $z = 0$ and Theorem 1.2 is proved.
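The last step is an instance of the Grönwall inequality: if $z' \le c(t)z$ with $c \ge 0$ integrable, then $z(t) \le z(0)\exp(\int_0^t c\,d\tau)$, so $z(0) = 0$ forces $z \equiv 0$. A minimal discrete sketch follows; the explicit Euler scheme, the sample coefficient $c(t) = 1/(2\sqrt{t})$ (integrable but unbounded), and all function names are assumptions made for illustration only.

```python
import math


def gronwall_bound(z0, c_values, h):
    """Discrete Groenwall bound z0 * exp(h * sum(c_k)) for nonnegative c_k."""
    return z0 * math.exp(h * sum(c_values))


def euler_extremal(z0, c_values, h):
    """Explicit Euler steps for the extremal case z' = c(t) * z."""
    z = z0
    for c in c_values:
        z += h * c * z  # z_{k+1} = z_k * (1 + h * c_k) <= z_k * exp(h * c_k)
    return z


if __name__ == "__main__":
    h, n = 1e-3, 1000
    # Midpoint samples of c(t) = 1 / (2 * sqrt(t)) on ]0, 1[.
    c_vals = [1.0 / (2.0 * math.sqrt((k + 0.5) * h)) for k in range(n)]
    print(euler_extremal(1.0, c_vals, h), gronwall_bound(1.0, c_vals, h))
    print(euler_extremal(0.0, c_vals, h))  # zero initial value stays zero
```

The bound depends on $c$ only through its integral, which is why integrability of $c(t)$, rather than boundedness, is all that the uniqueness argument needs.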


4. Boundary Layer in Incompressible Flows

The Navier–Stokes equations for incompressible flows with cylindrical symmetry reduce to $u = 0$, $\rho = \mathrm{const}\,(= 1)$ and

$Lv \equiv v_t - \mu(v_x + \frac{v}{x})_x = 0$, $w_t - \mu(w_{xx} + \frac{w_x}{x}) = 0$. (4.1)

We give a boundary layer analysis only for the first equation in (4.1), since the second one can be studied similarly. We use the method of doubling of variables, which goes back to Kružkov [15] and which was developed further in the recent paper of Bouchut and Perthame [2]. The initial boundary value conditions for the function $v$ defined in the domain $Q$ are

$v(0,x) = v_0(x) \in W^{2,1}(\Omega)$, $v(t, r_i) = v_i(t) \in C^1([0,T])$, $v_i(0) = v_0(r_i)$. (4.2)

These conditions provide the unique solvability [16] of Eq. (4.1)$_1$ in the class

$v \in L^2(0,T; W^{2,2}(\Omega)) \cap L^\infty(0,T; W^{1,2}(\Omega))$, $v_t \in L^2(Q)$.

(4.3)

Lemma 4.1. There are positive constants $c_1$ and $c_2$ independent of $\mu$ such that solutions $v$ of problem (4.1)$_1$, (4.2) satisfy the following estimates:

$\mu^{1/2}\int_0^T |v_x(t, r_i)|\,dt \le c_1$,

$\sup_{0 \le t \le T}\|v_x(t)\|_{1,\Omega} \le \|v_{0x}\|_{1,\Omega} + c_2\sum_{i=1}^{2}\|v_i\|_{W^{1,1}(0,T)}$,

$\sup_{0 \le t \le T}\|v - v_0\|_{1,\Omega} \le c_1\mu^{1/2}$,

where $c_1$ depends on the norms $\|v_0\|_{W^{1,2}(\Omega)}$ and $\|v_i\|_{C^1([0,T])}$, and $c_2$ depends on $r_2 - r_1$.

Proof. We start from the energy identity

$\mu\frac{d}{dt}\|v_x\|^2 + \|\mu v_{xx}, v_t\|^2 = \|\mu(\frac{v}{x})_x\|^2 + 2\mu v_t v_x|_{r_1}^{r_2} \equiv J_1 + J_2$. (4.4)

Due to the imbedding-type inequality

$|v_x|^2 \le |\frac{v(r_2) - v(r_1)}{r_2 - r_1}|^2 + 2\|v_x\|\|v_{xx}\|$

and due to the maximum principle estimate

$\|v\|_{\infty,Q} \le \max\{\sup_{i,t}|v_i(t)|, \sup_x |v_0(x)|\}$,

we have, by the Young inequality,

$J_2 \le \mu c + \mu\|v_x\|^{1/2}\|v_{xx}\|^{1/2} \le \mu c + \frac{1}{2}\|\mu v_{xx}\|^2 + \mu\|v_x\|^2 + \mu^{1/2}c$.

By the Grönwall inequality, we conclude from (4.4) that

$\mu\sup_{0 \le t \le T}\|v_x\|^2 + \|\mu v_{xx}\|^2_{2,Q} \le \mu^{1/2}c$. (4.5)


Now, the first estimate of the lemma follows from (4.5). Next, we pass to the proof of the second estimate. Let us introduce the notations

$\alpha = v_2(t)\frac{x - r_1}{r_2 - r_1} + v_1(t)\frac{r_2 - x}{r_2 - r_1}$, $u = v - \alpha$, $z = u_x$, $g = -L\alpha$.

Given a smooth function ϕ : R → R, one can derive from (4.1)1 the following identity Z

ϕ(z)|t0 dx

Zt Z +µ

00

ϕ (z)((zx + 0

Zt Z

Zt

0

ϕ (z)gx dxdτ −

= 0

z 2 z2 ) − 2 )dxdτ 2x 4x

0

gϕ (z)|rr21 dτ.

0

Now, to obtain the second estimate of the lemma, one should choose $\varphi_\varepsilon(z) = \sqrt{z^2 + \varepsilon^2}$ and send $\varepsilon$ to zero using the property $0 \le \varphi_\varepsilon''(z)z^2 \le \varepsilon$. To obtain the third estimate, we start from the observation that the function $z = v - v_0$ satisfies the identity

ϕε (z)|t0 dx

Zt Z +µ

00

Zt

ϕε (z)zx2 dxdτ 0

Zt Z 0

0

zx ϕε (z)|rr21 dτ

0 0

ϕε (z)(

+µ

=µ

zx z − 2 )dxdτ. x x

Now, it suffices to send $\varepsilon$ to zero and apply the first two estimates. The lemma is proved. □

Theorem 4.1. Let a function $v$ be a solution of problem (4.1)$_1$, (4.2) in the class (4.3). Then

$\lim_{\mu \to 0}\|v_\mu - v_0\|_{C([0,T] \times [r_1 + \delta(\mu),\, r_2 - \delta(\mu)])} = 0$

for any function $\delta(\mu)$ such that $\mu^{1/2}/\delta(\mu) \to 0$ as $\mu \downarrow 0$.

Proof. Let us introduce the function $\xi_\delta(x)$ such that $\xi_\delta(x) = x - r_1$ if $r_1 \le x \le r_1 + \delta$, $\xi_\delta(x) = \delta$ if $r_1 + \delta \le x \le r_2 - \delta$, and $\xi_\delta(x) = r_2 - x$ if $r_2 - \delta \le x \le r_2$. Given a smooth convex function $\eta : \mathbb{R} \to \mathbb{R}$, one can easily verify that the function $z = v_x$ solves the inequality

$\eta(z)_t - \frac{\mu}{x}\eta(z)_x + 2\mu\eta'(z)a - \mu\eta(z)_{xx} \le 0$ in $]0, \infty[ \times \Omega$, $a = \frac{z}{x^2} - \frac{v}{x^3}$.
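Two auxiliary functions carry much of the argument in this section: the piecewise-linear cutoff $\xi_\delta$ and the smooth approximation $\varphi_\varepsilon(z) = \sqrt{z^2 + \varepsilon^2}$ of $|z|$ used in the proof of Lemma 4.1. The following sketch evaluates both and checks the property $0 \le \varphi_\varepsilon''(z)z^2 \le \varepsilon$ numerically. The function names are hypothetical, and the cutoff is written as a minimum, which agrees with the piecewise definition when $\delta \le (r_2 - r_1)/2$.

```python
import math


def xi_delta(x, r1, r2, delta):
    """Cutoff equal to x - r1 near r1, delta in the middle, and r2 - x near r2.

    Assumes r1 <= x <= r2 and delta <= (r2 - r1) / 2.
    """
    return min(delta, x - r1, r2 - x)


def phi_eps(z, eps):
    """Smooth convex approximation of |z|."""
    return math.sqrt(z * z + eps * eps)


def phi_eps_second(z, eps):
    """Second derivative of phi_eps: eps^2 / (z^2 + eps^2)^(3/2)."""
    return eps * eps / (z * z + eps * eps) ** 1.5


if __name__ == "__main__":
    eps = 0.1
    for z in (-5.0, -0.1, 0.0, 0.1, 5.0):
        # phi_eps(z) stays within eps of |z|, and phi_eps''(z) * z^2 stays in [0, eps]
        print(z, phi_eps(z, eps) - abs(z), phi_eps_second(z, eps) * z * z)
```

The bound on $\varphi_\varepsilon''(z)z^2$ is what makes the second-order terms disappear when $\varepsilon$ is sent to zero, leaving estimates for $|z|$ itself.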

Passing to the entropies η(z) = |z − k|, k ∈ R, one arrives at the inequality Z∞ Z

Z∞ Z |z − k|ϕt ξδ dxdt + µ

− 0

|z − k|{(ξδ ϕ)x /x + ϕxx ξδ − 2ϕx ξδx } dxdt (4.6) 0


Z∞ Z −2µ


Z∞ Z sgn(z − k)aϕξδ dxdt + µ

0

(|z − k|ϕ)x ξδx dxdt ≤ 0 0

for any nonnegative ϕ(t, x, y) ∈ D(]0, ∞[×R × R) with any fixed y ∈ R. Due to the choice of ξδ , the last integral in (4.6) is greater than 2 Z X

∞

−µ

|z − k|ϕ dt|x=ri .

1 0

Now, we take k = z0 (y) in (4.6) and integrate with respect to y ∈ R, agreeing that / . By choosing ϕ(t, x, y) = 8(t)ζ (x −y), with 0 ≤ 8 ∈ D(]0, ∞[), z0 (y) = 0 for y ∈ 0 ≤ ζ ∈ D(R), we obtain ZZZ − |z(t, x) − z0 (y)|ξδ (x)8t (t)ζ (x − y) ZZZ ∂ ∂2 ζ (x − y) − 2 ζ (x − y) +µ |z(t, x) − z0 (y)|ξδ (x) ∂x ∂x ZZZ ∂ +µ |z(t, x) − z0 (y)|ξδx (x){ζ (x − y)/x − 2 ζ (x − y)} (4.7) ∂x ZZZ − 2µ sgn(z(t, x) − z0 (y))a(t, x)8(t)ξδ (x)ζ (x − y) −µ

2 ZZ X

|z(t, x) − z0 (y)|8(t)ζ (x − y)|x=ri ≡

1

5 X

jk (ε) ≤ 0.

1

us put 8(t) as a regularization of the function 10R0, we set Yε (t) = RLet t 0 0 0 0 0 0 −∞ Yε (s)ds, with Yε (t) = Y (t/ε)/ε, Y ∈ D(]0, 1[), Y ≥ 0, and Y = 1. We take 8(t) = Yε (t) − Yε (t − τ ). R Next, given 1 > 0, we choose ζ (x) = ψ1 (x) = ψ(x/1)/1, 0 ≤ ψ(x) ∈ D(]0, 1[), ψ = 1. By sending ε to zero, inequality (4.7) turns into (4.7)ε=0 ; to obtain the latter, one should substitute the first integral in (4.7) by Z Z |z(t, x) − z0 (y)|ξδ (x)ψ1 (x − y)|t=τ j1 (0) = t=0 dydx R

and substitute 8(t) by 10
Z Z R

|z(t, x) − z0 (y)| ≥ |z(t, x) − z0 (x)| − |z0 (x) − z0 (y)|,

(4.8)

k ∂ |z0 (x) − z0 (y)| k ψ1 (x − y) dydx ≤ 21−k+1 kψ (k) kL1 kz0,x k1, , ∂x

(4.9)

Z k ∂ −k ∂x k ψ1 (x − y) dy ≤ c1 ,

(4.10)


Z

ZT |z|dx ≤ c, µ

|z(t, ri )|dt ≤ cµ1/2 .

(4.11)

0

Due to (4.8) and (4.9), we have Zτ Z

0

j1 (0) ≥ α (τ ) − cδ1, α(τ ) =

|z(t, x) − z0 (x)|ξδ (x)dx. 0

By (4.8), (4.9), and by (4.10), we get j2 (0) ≥ −µ(1−1 + 1−2 )cα(τ ) − µδ(1 + 1−1 )c. Again by the observations (4.8)–(4.11), j3 (0) ≥ −µ(1 +

1 )c − µ(1 + 1)c, j4 (0) ≥ −µδc, j5 (0) ≥ −µ1/2 c − µc. 1

Combining the above estimates for $j_i(0)$ and choosing $\Delta = \mu^{1/2}$, we derive from (4.7)$_{\varepsilon=0}$, by the Grönwall inequality, the following estimate

$\delta\int_{r_1+\delta}^{r_2-\delta} |v_x(\tau,x) - v_{0x}(x)|\,dx \le c\mu^{1/2}$. (4.12)

Taking into account the third estimate of Lemma 4.1, we conclude the proof of the theorem. □

Remark 4.1. We observe that

$\liminf_{\mu \to 0}\|v - v_0\|_{C(\bar Q)} > 0$

if $v_i(t) \ne 0$ for $t > 0$. Besides, the function $\bar v = v_0$ is the unique solution of the limit problem $v_t = 0$, $v(0,x) = v_0(x)$. Hence, Theorem 4.1 justifies the existence of a laminar boundary layer of thickness $O(\mu^\alpha)$ for the Navier–Stokes equations of incompressible fluids for any $0 < \alpha < 1/2$, in agreement with the boundary layer theory [5].

5. Boundary Layer in Compressible Flows

By uniqueness, the limit problem (1.17)–(1.20) has only the trivial solution $\bar\rho = \rho_0 = \mathrm{const}$, $\bar{\mathbf{v}} = 0$, provided $\bar\rho_0 = \rho_0$ and $\bar{\mathbf{v}}_0 = 0$. We are to prove Theorem 1.3 by obtaining the estimates

$\delta\int_{r_1+\delta}^{r_2-\delta} |v_x|\,dx \le c\mu^{1/2}$, $\delta\int_{r_1+\delta}^{r_2-\delta} |w_x|\,dx \le c\mu^{1/2}$ (5.1)

for the solution of problem (1.2)–(1.7) satisfying condition (1.23). The derivation of these inequalities is based upon the next crucial claim.


Lemma 5.1. There is a positive constant $c$ independent of $\mu$ such that

$U \equiv \nu u - \frac{c(x - r_1)}{x} - \int_{r_1}^{x} p\,dy \le 0$, $V \equiv \nu u + c(r_2 - x) + \int_x^{r_2} p\,dy \ge 0$. (5.2)

Proof. Clearly, inequalities (5.2) hold at x = ri and at t = 0. It follows from (1.3) and (1.4) that U and V satisfy the equations ρ(Ut + uUx ) − νUxx − ν ρ(Vt + uVx ) − νVxx − ν

Ux ν2 νcr1 cρr1 + u(ρp + 2 + 2 ) + 3 + A = 0, x x x x

ν2 ν(c + p) Vx νρv 2 + u(ρp + cρ + 2 ) − − + B = 0, x x x x

where νp νρv 2 − +ρ A=− x x Due to (1.2), the integral

Rx ri

Zx

Zx pt dy, B = ρ

r1

pt dy. r2

pt dy admits the representation

Zx

0

Zx

pt dy = −uρp (ρ) + ri

0

00

ρu(p (ρ)ρx − ri

p (ρ) )dy. x

Hence, $A$ and $B$ are bounded uniformly in $\mu$ and there is a positive constant $c$ such that $\nu c r_1 x^{-3} + A > 0$ and $-\nu c x^{-1} + B < 0$ uniformly in $\mu$. Now, the assertion of the lemma follows by the maximum principle. □

Lemma 5.2. The estimates

$\mu^{1/2}\|v_x, w_x\|^2_{2,\infty} + \mu^{3/2}\|v_{xx}, w_{xx}\|^2_{2,Q} \le c$, $\mu^{1/2}\int_0^T \|v_x, w_x\|_{\infty,\Omega}\,dt \le c$,

and $\|v, w\|_{1,\infty} \le c\mu^{1/2}$ are valid uniformly in $\mu$.

Proof. We discuss the function $v$ only, since $w$ can be treated similarly. Let us divide Eq. (1.4) by $\rho$, multiply by $v_{xx}$, and integrate. The result reads

$\frac{\mu}{2}\frac{d}{dt}\|v_x\|^2 + \|\frac{\mu v_{xx}}{\sqrt{\rho}}\|^2 = \mu(uv_x, v_{xx})_\Omega + (\frac{\mu uv}{x}, v_{xx})_\Omega - \mu^2((\frac{v}{x})_x, \frac{v_{xx}}{\rho})_\Omega + \mu v_t v_x|_{r_1}^{r_2}$. (5.3)

Denoting the right-hand side of this equality by $\sum_{j=1}^{4} J_j$, we observe that

$J_1 = -\frac{\mu}{2}(u_x, v_x^2)_\Omega \le \frac{\mu}{2}\|u_{xx}\|\|v_x\|^2$, $J_2 \le \frac{1}{5}\|\frac{\mu}{\sqrt{\rho}}v_{xx}\|^2 + c\|u_x\|^2\|v\|^2$,


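$J_3 \le \frac{1}{5}\|\frac{\mu}{\sqrt{\rho}}v_{xx}\|^2 + \mu^2 c\|v, v_x\|^2$, $J_4 \le \frac{1}{5}\|\frac{\mu}{\sqrt{\rho}}v_{xx}\|^2 + \mu\|v_x\|^2 + \mu^{1/2}c$.

To get the last estimate, we used Lemma 2.2 and inequality (4.5). Let us write one more energy equality,

$\frac{d}{dt}\int_\Omega \frac{x\rho v^2}{2}\,dx + \mu\int_\Omega (xv_x^2 + \frac{v^2}{x})\,dx = -(\rho u, v^2)_\Omega + \mu xvv_x|_{r_1}^{r_2} \equiv J_5$. (5.4)

Due to Lemmas 2.2 and 2.5, we have

$J_5 \le \frac{1}{5}\|\frac{\mu}{\sqrt{\rho}}v_{xx}\|^2 + \|v\|^2 + \mu\|v_x\|^2 + \mu^{1/2}c$.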

By summing up equalities (5.3), (5.4), and taking into account Lemma 2.5 and the estimates for Ji , we arrive, by the Grönwall inequality, at the first estimate of the lemma. Now, the second one is a consequence of inequality (4.5). The derivation of the third estimate is based upon the inequality Zt Z

Z xρ|v|dx ≤ −µ

α(t) ≡

0

1 |v|( + ρu)dxdt + µ x

Zt

xvx sgnv|rr21 dτ.

(5.5)

0 0

To get (5.5), it suffices to multiply Eq. (1.4) by the function ϕε (v), defined in Sect. 4, integrate, and send ε to zero. Rt Clearly, the first integral in (5.5) is less than c 0 α, and the second one is less than Rt cµ1/2 , by the second estimate of Lemma 5.2. Hence, α ≤ c 0 α+ cµ1/2 , and the last estimate of the lemma is proved as well. u t First, we prove inequality (5.1) for w. To this end we denote z = wx , divide Eq. (1.5) 0 by ρ, differentiate with respect to x, multiply by ξδ (x)ϕε (z), and integrate. (The functions ξδ and ϕε are defined in Sect. 4.) As a result we obtain Zt (ξδ , ϕε ) = − 0

Zt µ 0

µ µ (ξδ ϕε , zx2 + ( − u)zzx ) dτ + ρ ρx 00

0

ξ 0 ( δ , zx ϕε ) dτ − µ ρ

Zt 0

Zt

0

0

(uξδ , zϕε ) dτ −

(5.6)

0

0

X ξ 0 ( δ , zϕε ) dτ ≡ Jj . xρ 11 8

The integral J8 admits the representation J8 = −J12 + J13 , Zt Z J12 = 0

z(µ−xρu) 2 00 µ (zx + ) ξδ ϕε dτ dx, J13 = ρ 2xµ

Zt Z 0

00

z2 ξδ ϕε µ − xρu 2 ( ) dτ dx. 4µρ x

Since 0 ≤ z2 ϕε 00 (z) ≤ ε, we have that limε→0 J8 ≤ 0. The integral J9 reads Z t rZ1 +δ Z t Zr2 0 uzϕε (z)dτ dx − uzϕε 0 (z)dτ dx. J9 = 0

r1

0 r2 −δ


Hence, by Lemma 5.1,

Zt Z


ξδ zϕ 0 ε (z)dτ dx.

J9 ≤ c 0

As for the integral J10 , we have Zt J10 = µ 0

Zt 0 ϕε r1 ξδ ϕε r2 − |r1 +δ dτ − µ ( 2 , ϕε ρx ) dτ, | ρ r2 +δ ρ ρ 0

Rt

while, clearly, J11 ≤ µc 0 kzk1, dτ . By sending ε to zero in (5.6), we arrive at the inequality for the function Y (t) = (ξδ , |z|) , Zt

Zt

|z|(τ, r1 ) + |z|(τ, r2 ) + (|z|, |ρx |) + kzk1, dτ.

Y (τ )dτ + µc

Y ≤c 0

0

Since ρx is bounded in L∞ (0, T ; L2 ()) uniformly in µ, we conclude, by Lemma 5.2, that Zt Y ≤ c Y dτ + cµ1/2 . 0

Thus, inequality (5.1) for w follows by the Grönwall inequality. Let us derive inequality (5.1) for v. Again, we denote z = vx , divide Eq. (1.4) by ρ, 0 differentiate with respect to x, multiply by ξδ (x)ϕε (z), and integrate. The result is Zt Z (ξδ , ϕε ) + 0

Zt (

− 0

Zt (

+ 0

µ µ ξδ ϕε ( zx2 + ( − u)zzx )dτ dx = ρ ρx 00

µzx 0 0 , ξδ ϕε ) dτ − ρ

uv 0 0 , ξ ϕ ) dτ + x δ ε

Zt ( 0

Zt ( 0

µz 0 0 , ξ ϕ ) dτ + xρ δ ε

µvzx 00 , ξδ ϕε ) dτ + 2 x ρ

Zt

Zt ( 0

0

0

(uz, ξδ ϕε ) dτ 0

uvzx 00 , ξδ ϕε ) dτ (5.7) x

X µv 0 0 ϕ ) dτ ≡ Jj . , ξ x2ρ δ ε 20

( 0

Zt

14

The integrals J14 , J15 , and J16 coincide with J9 , J10 and J11 respectively. The sum J17 + J18 reads Zt 0

∂ uv 0 ( (ξδ ϕε ), ) dτ = − ∂x x

Zt 0

zu 0 ( , ξδ ϕε ) dτ + x

Zt ( 0

u ux 0 − , vξδ ϕε ) dτ. 2 x x

Hence, by Lemma 5.2, lim (J17 + J18 ) ≤ cY (t) + ckux , uxx k1,Q kvk1, ≤ cY (t) + cµ1/2 ,

ε→0


Rt where Y (t) stands for 0 (ξδ , |z|) . Integrating by parts, we find that Zt Z J19 = 0

0

ξδ ρx µvϕε 2ξδ 0 ( + − ξδ )dτ dx − x2ρ x ρ

Zt ( 0

µz 0 , ξδ ϕε ) dτ ≤ cµ. x2ρ

Finally, it is clear that $J_{20} \le c\mu$. Now, treating equality (5.7) in the same manner as (5.6), we arrive at estimate (5.1) for $v$. Estimates (5.1), together with the estimates of Lemma 5.2, prove Theorem 1.3.

6. Existence and Uniqueness

First, we discuss the problem of local solvability using the Faedo–Galerkin method. Let

$X_n = \mathrm{span}\{\sin\frac{j\pi(x - r_1)}{r_2 - r_1};\ j = 1, \ldots, n\}$

be an $n$-dimensional space, with the corresponding orthogonal projection $P_n : L^2(\Omega) \to X_n$. We look for functions $u_n(t), \alpha_n(t), \beta_n(t) \in X_n$ and $\rho_n$ satisfying

$(x\rho_n)_t + (x\rho_n u_n)_x = 0$, $\rho_n(0,x) = \rho_0(x)$, (6.1)

$P_n(xM_{j,n}(t)) = 0$, $u_n(0) = P_n(u_0)$, $\alpha_n(0) = P_n(\alpha_0)$, $\beta_n(0) = P_n(\beta_0)$, (6.2)

where $M_{j,n}(t)$ are the left-hand sides of Eqs. (1.3)–(1.5) respectively, with the functions $u$, $v$, $w$, and $\rho$ replaced by $u_n$, $v_n$, $w_n$, and $\rho_n$. Here,

$v_n = \alpha_n + b_n v_2 + (1 - b_n)v_1$, $w_n = \beta_n + b_n w_2 + (1 - b_n)w_1$, $b_n = \int_{r_1}^{x} x\rho_n\,dx/(r_2 - r_1)$,

$v_0 = \alpha_0 + b_0 v_2(0) + (1 - b_0)v_1(0)$, $w_0 = \beta_0 + b_0 w_2(0) + (1 - b_0)w_1(0)$, $b_0 = b_n(0,x)$.

By means of standard fixed point arguments (see [25] for details), problem (6.1), (6.2) is solvable on some time interval $[0, T_n]$, $T_n \le T$, with $u_n, \alpha_n, \beta_n \in C^1([0, T_n], X_n)$; $\rho_n, \rho_{nx}, \rho_{nt} \in L^\infty(0, T_n; L^2(\Omega))$. Let us obtain estimates independent of $n$. For simplicity, we omit the subindex “n”. Treating Eq. (6.1) as a transport equation for $\rho$, we arrive at the inequalities

ρ

±1

(t, x) ≤

sup ρ0±1 exp (± x

Zt Z (|uxx | + 0

|ux | ) dsdx). r1

Another consequence of (6.1) is Z Z u u d xρx2 dx = − 3xρx2 (ux − ) + xρρx (uxx + ( )x ) dx. dt x x

(6.3)

(6.4)


It follows from (6.3) and (6.4) that there is a constant c independent of n and such that

$$\rho + 1/\rho + \|\rho_x\|^2 \le c e^{y}, \tag{6.5}$$

where

$$y(t) = \int_\Omega \Big(\frac{x}{2}\rho(u_x^2 + \alpha_x^2 + \beta_x^2) + \frac{x}{2}(\nu u_x^2 + \mu\alpha_x^2 + \mu\beta_x^2)\Big)\,dx + \int_0^t\!\!\int_\Omega \big(x\rho(u_t^2 + \alpha_t^2 + \beta_t^2) + x(\nu u_{xx}^2 + \mu\alpha_{xx}^2 + \mu\beta_{xx}^2)\big)\,ds\,dx.$$

Computing $y'$ with the help of (6.2) and using (6.5), we obtain the inequality $y' \le c\exp(cy)$, where c does not depend on n. Now, in a straightforward manner, one can conclude that all the approximations are defined on the same time-interval $[0, T_*]$ and at least one subsequence of the approximations converges on it to a local solution given by Theorem 1.1.

To prove the global existence claimed in Theorem 1.1, it suffices to derive global estimates, with µ being fixed and positive. The functions ρ and u are already estimated in Sect. 2. As for the functions v and w, they can be estimated in the same way as u. We may then infer the existence part of Theorem 1.1.

We pass to the proof of uniqueness. Assuming that there exist two solutions $s_1$ and $s_2$, we denote $s = s_1 - s_2$ and introduce the function $y(t) =$

$$\frac{1}{2}\big\|\sqrt{x}\,\rho,\ \sqrt{x\rho_1}\,u,\ \sqrt{x\rho_1}\,\alpha,\ \sqrt{x\rho_1}\,\beta\big\|^2 + \int_0^t \big\|\sqrt{\nu x}\,u_x,\ \sqrt{\mu x}\,v_x,\ \sqrt{\mu x}\,w_x\big\|^2\,ds.$$

A lengthy but straightforward computation (see [25]) yields the inequality $y' \le \varepsilon A(t)y + B_\varepsilon(t)y$, which is valid for any $\varepsilon \in\, ]0, 1[$ and for some positive functions $A \in L^\infty(0, T)$, $B_\varepsilon \in L^1(0, T)$ dependent on the norms of the solutions $s_i$. Thus, Theorem 1.1 is proved.

References

1. Antontsev, S.N., Kazhikhov, A.V., Monakhov, V.N.: Boundary Value Problems in Mechanics of Nonhomogeneous Fluids. New York: Elsevier Science Publishers B.V., 1990
2. Bouchut, F., Perthame, B.: Kružkov's estimates for scalar conservation laws revisited. Trans. Am. Math. Soc. 350, 2847–2870 (1998)
3. Caflisch, R.E., Sammartino, M.: Zero viscosity limit for analytic solutions of the Navier–Stokes equation on a half-space I: Existence for Euler and Prandtl equations. Commun. Math. Phys. 192, 433–461 (1998)
4. Caflisch, R.E., Sammartino, M.: Zero viscosity limit for analytic solutions of the Navier–Stokes equation on a half-space II: Construction of the Navier–Stokes solution. Commun. Math. Phys. 192, 463–491 (1998)
5. Schlichting, H.: Boundary Layer Theory. 7th Edition. London–New York: McGraw-Hill Company, 1979
6. DiPerna, R.J., Lions, P.L.: Ordinary differential equations, transport theory and Sobolev spaces. Invent. Math. 98, 511–547 (1989)
7. Fife, P.C.: Considerations regarding the mathematical basis for Prandtl's boundary layer theory. Arch. Rat. Mech. Anal. 28, 184–216 (1968)
8. Gisclon, M., Serre, D.: Étude des conditions aux limites pour un système strictement hyperbolique via l'approximation parabolique. C. R. Acad. Sci. Paris Sér. I Math. 319, 377–382 (1994)


9. Grenier, E., Guès, O.: Boundary layers for viscous perturbations of noncharacteristic quasilinear problems. J. Diff. Eqs. 143, 110–146 (1998)
10. Henderson, D.M., Miles, J.W.: Surface-wave damping in a circular cylinder with a fixed contact line. J. Fluid Mech. 275, 285–299 (1994)
11. Hoff, D.: Global solutions of the Navier–Stokes equations for multi-dimensional compressible flow with discontinuous initial data. J. Diff. Eqs. 120, 215–254 (1995)
12. Kazhikhov, A.V., Shelukhin, V.V.: The verification compactness method. Novosibirsk: Actual Problems in Modern Math. 2, 51–60 (1996)
13. Kazhikhov, A.V., Weigant, V.A.: On global existence of the two-dimensional Navier–Stokes equations of viscous compressible fluid. Sib. Mat. Zhurn. 36, 1283–1316 (1995) (in Russian)
14. Khusnutdinova, N.V.: Heat boundary layer near a plate. Dokl. Acad. Nauk SSSR. 285, 605–608 (1985)
15. Kružkov, S.N.: First order quasilinear equations in several independent variables. Math. USSR Sb. 10, 217–243 (1970)
16. Ladyzenskaya, O.A., Solonnikov, V.A., Uraltseva, N.N.: Linear and quasilinear equations of parabolic type. Trans. Math. Monographs, Vol. 23, Providence, RI: Amer. Math. Soc., 1968
17. Landau, L.D., Lifshitz, E.M.: Fluid Mechanics. 2nd Edition. Oxford: Pergamon Press Ltd., 1987
18. Lions, P.L.: Mathematical Topics in Fluid Mechanics. Vol. 1, Incompressible Models. Oxford: Clarendon Press, 1996
19. Lions, P.L.: Existence globale de solutions pour les équations de Navier–Stokes compressibles isentropiques. C. R. Acad. Sci. Paris. 316, 1335–1340 (1993)
20. Lions, P.L.: Compacité des solutions des équations de Navier–Stokes compressibles isentropiques. C. R. Acad. Sci. Paris. Sér. I. 317, 115–120 (1993)
21. Nikolaev, V.B.: On solvability of mixed problems for the one-dimensional viscous gas equations of the axisymmetrical motion. Novosibirsk: Din. Sploshnoi Sredy. 44, 83–92 (1980) (in Russian)
22. Oleinik, O.A.: The Prandtl system of equations in boundary layer theory. Dokl. Akad. Nauk SSSR. 150, Soviet Math. 4, 583–586 (1963)
23. Serre, D.: Systèmes de lois de conservation I, II. Paris: Diderot Editor. Art et Sciences, 1996
24. Serrin, J.: On the mathematical basis for Prandtl's boundary layer theory: An example. Arch. Rational Mech. Analysis. 28, 217–225 (1968)
25. Shelukhin, V.V.: A shear flow problem for the compressible Navier–Stokes equations. Int. J. Non-Linear Mech. 33, 247–257 (1998)
26. Shelukhin, V.V.: The limit of zero shear viscosity for compressible fluids. Arch. Rat. Mech. Anal. 143, 357–374 (1998)
27. Temam, R., Wang, Xiaoming: Asymptotic analysis for the linearized Navier–Stokes equations in a channel. Differential and Integral Eqs. 8, 1591–1618 (1995)

Communicated by A. Jaffe

Commun. Math. Phys. 208, 331 – 353 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Degeneracy of the b-Boundary in General Relativity

Fredrik Ståhl

Department of Mathematics, University of Umeå, 90187 Umeå, Sweden. E-mail: [email protected]

Received: 11 June 1999 / Accepted: 30 June 1999

Abstract: The b-boundary construction by B. Schmidt is a general way of providing a boundary to a manifold with connection [12]. It has been shown to have undesirable topological properties, however. C. J. S. Clarke gave a result showing that for space-times, non-Hausdorffness is to be expected in general [3], but the argument contains some errors. We show that under somewhat different conditions on the curvature, the b-boundary will be non-Hausdorff, and illustrate the degeneracy by applying the conditions to some well known exact solutions of general relativity.

1. Introduction A serious limitation in our understanding of singularities in general relativity is the fact that singularities by definition are not parts of the space-time manifold. So in order to study the structure of singularities we would like to have some procedure for attaching an abstract boundary set containing the singular points to a space-time. At the very least the extended space-time should have a suitable topology making it possible to make statements like “close to the singularity” mathematically precise. One of the candidates is the b-boundary construction by B. Schmidt which works for any manifold with connection [12], and in the Lorentzian case it can be shown to be well-defined and locally complete [13]. However, for the FLRW and Schwarzschild space-times, the boundary is not Hausdorff separated from interior points [1,7]. This is a serious drawback since all points in space-time are then “close” to a given boundary point, making all statements about neighbourhoods of the singularity useless. The b-boundary structure is closely related to the singular holonomy group [2]. The methods used by Bosshard [1] and Johnson [7] are heavily dependent on the specific geometry of the FLRW and Schwarzschild space-times, based on the study of the boundary of two-dimensional sections. Clarke used a more general approach to find sufficient conditions for the topology to be non-Hausdorff [3]. The condition involves the asymptotic


behaviour of the Riemann tensor and its inverse and derivative in a parallel propagated frame along a curve ending at the boundary point. The argument in [3] contains some errors however. We show that under somewhat different conditions on the Riemann tensor and its inverse and derivative, the boundary fibres of the frame bundle are degenerate. We also confirm that the conditions hold in the FLRW (with expansion factor $t^c$ with $c \in (0, 1)$, which is a bit more general than in [3]), Kasner, Schwarzschild, Reissner-Nordström and Tolman-Bondi space-times. Our reasoning will depend a lot on the work by Clarke [3], the most essential difference being that we choose to work with small circles instead of squares in Sect. 3 and that we use a stronger restriction initially on the derivative of the Riemann tensor.

The outline of the paper is as follows. In Sect. 2, we introduce some notation and definitions. In Sect. 3 we approximate Lorentz transformations resulting from parallel propagation along small circles in terms of the Riemann tensor, and in Sect. 4 we use these results to find a curve generating a given Lorentz transformation by parallel propagation. Section 5 is concerned with singular holonomy and gives the connection to the b-boundary, and we illustrate the implications for some well known space-times in Sect. 6. We also discuss some other contributions to the singular holonomy group in Sect. 7.

2. Preliminaries

Throughout this paper, (M, g) is a space-time, i.e. a smooth 4-dimensional connected orientable and Hausdorff manifold M with a smooth metric g of signature (−+++). The construction of the b-boundary may be carried out in different bundles over M (see Refs. [12,4,6] for some background). Here we choose to work in the bundle of pseudo-orthonormal frames OM, consisting of all pseudo-orthonormal frames at all points of M. OM is a principal fibre bundle with the Lorentz group $\mathcal{L}$ as its structure group.
We write the right action of an element $L \in \mathcal{L}$ as $R_L : E \mapsto EL$ for $E \in OM$. We will, a bit sloppily, restrict attention to one of the connected components of OM and the component of identity in $\mathcal{L}$ using the same notation. From the fibre bundle structure of OM, we have a canonical 1-form θ which is $\mathbb{R}^4$-valued, and from the metric on M we construct the connection form ω, which takes values in the Lie algebra $\mathfrak{l}$ of $\mathcal{L}$ [8]. The Schmidt metric on OM is the Riemannian metric G given by

$$G(X, Y) := \langle\theta(X), \theta(Y)\rangle_{\mathbb{R}^4} + \langle\omega(X), \omega(Y)\rangle_{\mathfrak{l}}, \tag{1}$$

where $\langle\cdot,\cdot\rangle_{\mathbb{R}^4}$ and $\langle\cdot,\cdot\rangle_{\mathfrak{l}}$ are Euclidean inner products with respect to fixed bases in $\mathbb{R}^4$ and $\mathfrak{l}$, respectively [12,4]. If κ is a curve in the bundle of pseudo-orthonormal frames OM, we denote the b-length of κ by l(κ). By a slight abuse of notation we will also write $l(\gamma, E_0)$ for the b-length, or generalised affine parameter length, of a curve γ(t) in M, with respect to a given frame $E_0$ at some point on γ. The definition is

$$l(\gamma, E_0) := \int \Big(\sum_{i=1}^{n} (V_i)^2\Big)^{1/2}\,dt, \tag{2}$$

where $V_i$ are the components of the tangent vector of γ with respect to the frame E obtained by parallel propagation of $E_0$ along γ. The notation is motivated by the fact that $l(\gamma, E_0)$ is the same as the b-length of the horizontal lift of γ through $E_0$ in OM.
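The frame dependence of definition (2) can be illustrated in the simplest possible setting. The following is only a sketch under stated assumptions: flat two-dimensional Minkowski space (so parallel propagation is trivial and the integrand of (2) is constant), a straight curve with constant tangent, and a frame obtained from a reference frame by a boost of rapidity `eta`. The function names `boost` and `b_length_flat` are hypothetical and do not come from the paper.

```python
import math

def boost(eta):
    """2x2 Lorentz boost of rapidity eta acting on (t, x) components."""
    c, s = math.cosh(eta), math.sinh(eta)
    return ((c, s), (s, c))

def b_length_flat(tangent, eta, duration=1.0):
    """b-length of a straight line in flat 2D Minkowski space.

    `tangent` holds the constant tangent components in a reference frame E;
    the length is measured in the frame E*boost(eta), whose components are
    boost(-eta) applied to `tangent`. In flat space the frame stays parallel,
    so the integrand of (2) is constant along the curve.
    """
    inv = boost(-eta)
    comps = (inv[0][0] * tangent[0] + inv[0][1] * tangent[1],
             inv[1][0] * tangent[0] + inv[1][1] * tangent[1])
    return duration * math.hypot(comps[0], comps[1])

if __name__ == "__main__":
    for eta in (0.0, 1.0, 2.0):
        print(eta, b_length_flat((1.0, 0.0), eta), b_length_flat((1.0, 1.0), eta))
```

In this toy setting the b-length of a null segment scales like $e^{-\eta}$, so it can be made arbitrarily short by boosting the frame, while in general the length changes by at most the Euclidean operator norm $e^{|\eta|}$ of the boost.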


We also write d(E, F) for the b-metric distance between two points E and F in OM, and $B_r(E)$ for the open ball in OM with centre at E and radius r. The Schmidt metric was used by Schmidt [12] to construct a boundary, the b-boundary, of the base manifold M, providing endpoints for all b-incomplete inextendible curves. Basically the procedure is as follows.

1. Construct the Cauchy completion $\overline{OM}$ of OM and extend the group action to $\overline{OM}$.
2. Let $\overline{M}$ be the set of orbits of $\mathcal{L}$ in $\overline{OM}$, and define a projection $\pi : \overline{OM} \to \overline{M}$ taking a point in $\overline{OM}$ to the orbit through the point.
3. $\overline{M}$ is then a topological space with the topology inherited from $\overline{OM}$ via π, and we may identify π(OM) with M.
4. Define the b-boundary as $\partial M = \overline{M} \setminus M$.

The topological space $\overline{OM}$ is no longer a fibre bundle since the action of $\mathcal{L}$ might be non-free on a boundary "fibre" (orbit). We quantify the boundary fibre degeneracy by defining the singular holonomy group as

$$\Phi^s_{OM}(E) := \{L \in \mathcal{L};\ EL = E\}, \tag{3}$$

for $E \in \pi^{-1}(p)$ with $p \in \partial M$ [2]. It follows that the boundary fibre $\pi^{-1}(p)$ is homeomorphic to $\mathcal{L}/\Phi^s_{OM}(E)$. We say that the boundary fibre is degenerate if the singular holonomy group is nontrivial, and totally degenerate if the singular holonomy group is the whole Lorentz group $\mathcal{L}$. The importance of total degeneracy is illustrated by the following result from [3].

Proposition 1. If $p \in \partial M$ with $\pi^{-1}(p)$ totally degenerate, then every neighbourhood of p in $\overline{M}$ contains all null geodesics in M ending at p. In particular, $\overline{M}$ is not Hausdorff.

In what follows we will need various norms, given a fixed frame $E \in OM$. We use bold symbols for the array of frame components of a tensor in the frame E. For tangent vectors X, we define the norm $|\mathbf{X}|$ to be the Euclidean norm of the frame component array $\mathbf{X}$, and similarly for cotangent vectors. In the Lie group and Lie algebra, we use the Euclidean norm with respect to a fixed basis, and for general tensors T we use the mapping norm, e.g.

$$\|\mathbf{T}\| := \sup_{|\mathbf{X}| = |\mathbf{Y}| = 1} |T_{ij} X^i Y^j| \tag{4}$$

for a covariant 2-tensor T.

3. Parallel Propagation and the Riemann Tensor

In this section we calculate a first approximation to the Lorentz transformation generated by parallel propagation around a small circle. First we construct a disc with suitable properties. Let $f : D_l \to M$, where

$$D_l := \{(x, y) \in \mathbb{R}^2;\ x^2 + y^2 \le l^2\}, \tag{5}$$

and put

$$X(x, y) := f_*\frac{\partial}{\partial x}\Big|_{f(x,y)}, \tag{6}$$

$$Y(x, y) := f_*\frac{\partial}{\partial y}\Big|_{f(x,y)}. \tag{7}$$


Let (r, θ) be polar coordinates on $D_l$, i.e. x = r cos θ and y = r sin θ, and put

$$V(x, y) := f_*\frac{\partial}{\partial r}\Big|_{f(x,y)}, \tag{8}$$

$$Z(x, y) := f_*\frac{\partial}{\partial \theta}\Big|_{f(x,y)}. \tag{9}$$

Then

$$V = \cos\theta\, X + \sin\theta\, Y, \tag{10}$$

$$Z = -r\sin\theta\, X + r\cos\theta\, Y. \tag{11}$$

Pick a pseudo-orthonormal frame E(0, 0) at p := f(0, 0), and define E(x, y), where x = r cos θ and y = r sin θ, by parallel propagating E(0, 0) along the radial curves

$$\rho_\theta : s \mapsto f(s\cos\theta, s\sin\theta) \tag{12}$$

for each θ ∈ [0, 2π). Similarly, let F(x, y) be defined by parallel propagating E(r, 0) along the circular curves

$$o_r : s \mapsto f(r\cos s, r\sin s) \tag{13}$$

for each r ∈ [0, l]. Let L(x, y) be the Lorentz transformation taking E(x, y) to F(x, y), i.e. F = EL. From now on, bold symbols denote component arrays with respect to the frame E.

Lemma 1. The Lorentz transformation L is given by

$$L = \exp\Big(-\int_0^\theta\!\!\int_0^r \mathbf{R}(\mathbf{V}, \mathbf{Z})\,dr\,d\theta\Big), \tag{14}$$

where exp is the exponential map $\mathfrak{l} \to \mathcal{L}$.

Proof. Since F is parallel along each $o_r$,

$$\nabla_Z F = (\nabla_Z E)L + E\nabla_Z L = 0. \tag{15}$$

We may view L on each $o_r$ as a curve in the Lorentz group $\mathcal{L}$ parameterised by θ. Then

$$E\dot{L}L^{-1} = -\nabla_Z E, \tag{16}$$

where the dot denotes differentiation with respect to θ. Now let λ be the curve in the Lie algebra corresponding to L by right translation, i.e. λ corresponds to the right-invariant vector field equal to $\dot{L}$ at L by $\dot{L} = \lambda L$. (It might seem more natural to choose left translation, but then we would have to solve for $L^{-1}$ instead.) Thus

$$E\lambda = -\nabla_Z E. \tag{17}$$

Differentiating with respect to V and using that $\nabla_V E = 0$ and

$$\nabla_V \nabla_Z E = R(V, Z)E, \tag{18}$$


we get

$$\frac{\partial\lambda}{\partial r} = -\mathbf{R}(\mathbf{V}, \mathbf{Z}) \tag{19}$$

in the frame E. Integrating and solving $\dot{L} = \lambda L$ gives

$$L = \exp\Big(-\int_0^\theta\!\!\int_0^r \mathbf{R}(\mathbf{V}, \mathbf{Z})\,dr\,d\theta\Big). \tag{20}$$

□

Corollary 1. The Lorentz transformation Λ generated by parallel propagation in the counterclockwise direction around the boundary of $f(D_l)$ is given by

$$\Lambda = \exp\Big(-\int_0^{2\pi}\!\!\int_0^l \mathbf{R}(\mathbf{V}, \mathbf{Z})\,dr\,d\theta\Big) = \exp\Big(-\iint_{D_l} \mathbf{R}(\mathbf{X}, \mathbf{Y})\,d\sigma\Big), \tag{21}$$

where dσ is the area element of $D_l$ with respect to the metric $dx^2 + dy^2$.

Proof. The first expression follows immediately by letting r → l and θ → 2π in Lemma 1. Using (10) and (11) and the symmetries of the Riemann tensor we get

$$R(V, Z) = rR(X, Y), \tag{22}$$

and hence the second formula. □

Let γ be the loop at p obtained by following the radial curve $\rho_0$, the boundary $o_l$ of the disc in the counterclockwise direction, and back again along $\rho_0^{-1}$. Then parallel propagation along γ generates Λ since E is parallel along $\rho_0$. Suppose that f is chosen such that the radial curves $\rho_\theta$ are geodesics and $|\mathbf{X}| = |\mathbf{Y}| = 1$ at p, i.e. f is basically the exponential map $T_pM \to M$, restricted to $\{xX_p + yY_p \in T_pM;\ x^2 + y^2 \le l^2\}$. We can then approximate Λ by an expression involving the value of R only at p. The essential thing here is the length estimate.

Note 1. Whilst f is smooth by construction, it need not be an embedding or even 1–1. In such a case, E is not a frame field on $D_l$, but the construction still works.

Lemma 2. Suppose that $\nabla_V V = 0$, $|\mathbf{X}| = |\mathbf{Y}| = 1$ at p, and l > 0 is sufficiently small for there to be an α < 1 such that

$$l^2 \|\mathbf{R}\|_{D_l} \le 10^{-3}\alpha, \tag{I}$$

$$l^3 \|\nabla\mathbf{R}\|_{D_l} \le 10^{-6}\alpha^2, \tag{II}$$

where $\|\cdot\|_{D_l} := \sup_{f(D_l)} \|\cdot\|$. Then

$$\big\|\Lambda - \delta + \pi l^2 \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\big\| < 10^{-5}\alpha^2, \tag{23}$$

where δ is the identity element of $\mathcal{L}$, and the b-length of γ is less than 9l.
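The structure of (23), holonomy around a small geodesic circle being approximately "identity minus area times curvature", has a familiar Riemannian analogue that is easy to check numerically. The sketch below is an illustration only and is not the Lorentzian setting of the lemma: it assumes the unit 2-sphere (Gaussian curvature 1), transports a tangent vector once around the circle of geodesic radius `l` about the north pole using the parallel transport equations in the orthonormal frame $(e_\theta, e_\varphi)$, and integrates them with a classical Runge-Kutta step. The function name `latitude_holonomy` and the step count are assumptions made for this example.

```python
import math

def latitude_holonomy(l, steps=2000):
    """Rotation angle of a tangent vector after parallel transport once
    around the circle of geodesic radius l about the north pole of the
    unit sphere. In the orthonormal frame (e_theta, e_phi) the transport
    equations along the circle are
        dV_theta/dphi =  cos(l) * V_phi,
        dV_phi/dphi   = -cos(l) * V_theta.
    """
    c = math.cos(l)

    def rhs(v):
        return (c * v[1], -c * v[0])

    h = 2.0 * math.pi / steps
    v = (1.0, 0.0)
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    # angle of the transported vector relative to its starting direction
    return math.atan2(v[1], v[0])

if __name__ == "__main__":
    l = 0.1
    print(latitude_holonomy(l), 2 * math.pi * (1 - math.cos(l)), math.pi * l ** 2)
```

For small `l` the computed angle agrees with the enclosed area $2\pi(1 - \cos l)$ and differs from the leading term $\pi l^2$ only at order $l^4$, which mirrors the role of the $\pi l^2 \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p$ term in (23).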


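Proof. Note that $\nabla_V V = 0$ implies that $|\mathbf{V}| = 1$ on the whole disc. First we need estimates for $|\mathbf{Z}|$ and $|\nabla_V \mathbf{Z}|$. Since [V, Z] = 0,

$$\nabla_V^2 Z = \nabla_V \nabla_Z V = R(V, Z)V, \tag{24}$$

so

$$\nabla_V \mathbf{Z} = \nabla_V \mathbf{Z}|_p + \mathbf{R}(\mathbf{V}, \mathbf{Z})\mathbf{V}|_{r=\xi_1}\, r \tag{25}$$

and

$$\mathbf{Z} = \mathbf{Z}_p + \nabla_V \mathbf{Z}|_p\, r + \frac{1}{2}\mathbf{R}(\mathbf{V}, \mathbf{Z})\mathbf{V}|_{r=\xi_2}\, r^2 \tag{26}$$

for some $\xi_1, \xi_2 \in [0, r]$. But from (11),

$$\mathbf{Z}_p = 0 \quad\text{and}\quad \nabla_V \mathbf{Z}|_p = -\sin\theta\, \mathbf{X}_p + \cos\theta\, \mathbf{Y}_p, \tag{27}$$

so

$$|\mathbf{Z}| \le r + \frac{1}{2}\|\mathbf{R}\|_{D_l} |\mathbf{Z}|_{r=\xi_2}\, r^2 \le r + \frac{\alpha}{2000} |\mathbf{Z}|_{r=\xi_2}, \tag{28}$$

and since α < 1,

$$|\mathbf{Z}| < \frac{2000}{1999}\, r, \tag{29}$$

and

$$|\nabla_V \mathbf{Z}| < 1 + \frac{2}{1999}\alpha < \frac{2001}{1999}. \tag{30}$$

Put

$$\lambda := -\int_0^{2\pi}\!\!\int_0^l \mathbf{R}(\mathbf{V}, \mathbf{Z})\,dr\,d\theta. \tag{31}$$

Then

$$\|\lambda\| \le 2\pi \|\mathbf{R}\|_{D_l} \int_0^l |\mathbf{Z}|\,dr < \frac{\alpha}{300}, \tag{32}$$

so

$$\|\Lambda - \delta - \lambda\| \le \sum_{k=2}^{\infty} \frac{\|\lambda\|^k}{k!} < \|\lambda\|^2 \sum_{k=0}^{\infty} \frac{\|\lambda\|^k}{2^k} < \frac{\alpha^2}{80000}. \tag{33}$$

Next we replace the integral in λ with an expression involving only the value of the Riemann tensor at the origin. The mean value theorem gives

$$\mathbf{R}(\mathbf{V}, \mathbf{Z}) = \mathbf{R}(\mathbf{V}, \mathbf{Z})|_p + \nabla_V\big(\mathbf{R}(\mathbf{V}, \mathbf{Z})\big)|_{r=\xi_3}\, r \tag{34}$$

for some $\xi_3 \in [0, r]$. Since $\mathbf{Z}_p = 0$ and $\nabla_V V = 0$,

$$\mathbf{R}(\mathbf{V}, \mathbf{Z}) = (\nabla_V \mathbf{R})(\mathbf{V}, \mathbf{Z})|_{r=\xi_3}\, r + \mathbf{R}(\mathbf{V}, \nabla_V \mathbf{Z})|_{r=\xi_3}\, r. \tag{35}$$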


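Applying the mean value theorem again to the first factor in the last term and using that $\mathbf{R}(\mathbf{V}, \nabla_V \mathbf{Z})|_p = \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p$ and $\nabla_V^2 Z = R(V, Z)V$ gives

$$\mathbf{R}(\mathbf{V}, \mathbf{Z}) = (\nabla_V \mathbf{R})(\mathbf{V}, \mathbf{Z})|_{r=\xi_3}\, r + \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\, r + (\nabla_V \mathbf{R})(\mathbf{V}, \nabla_V \mathbf{Z})|_{r=\xi_4}\, \xi_3 r + \mathbf{R}\big(\mathbf{V}, \mathbf{R}(\mathbf{V}, \mathbf{Z})\mathbf{V}\big)|_{r=\xi_4}\, \xi_3 r \tag{36}$$

for some $\xi_4 \in [0, \xi_3]$. Thus from (29) and (30),

$$\big\|\mathbf{R}(\mathbf{V}, \mathbf{Z}) - \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\, r\big\| < \frac{4001}{1999}\|\nabla\mathbf{R}\|_{D_l}\, r^2 + \frac{2000}{1999}\|\mathbf{R}\|^2_{D_l}\, r^3. \tag{37}$$

Integrating and using conditions (I) and (II) along with α < 1 we get

$$\big\|\lambda + \pi l^2 \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\big\| < 10^{-6}\alpha^2. \tag{38}$$

Adding (33) and (38) and applying Corollary 1 gives

$$\big\|\Lambda - \delta + \pi l^2 \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\big\| < 10^{-5}\alpha^2, \tag{39}$$

and we have established the first part of the lemma. The b-length of γ is given by

$$l(\gamma, E) = l(\rho_0, E) + l(o_l, E) + l(\rho_0, E\Lambda). \tag{40}$$

Now $\rho_0$ is a geodesic with $|\dot{\rho}_0| = |\mathbf{V}| = 1$, so the first and third terms are

$$l(\rho_0, E) = l \tag{41}$$

and

$$l(\rho_0, E\Lambda) \le l\|\Lambda\| \le l\exp(\|\lambda\|) < 1.1\, l \tag{42}$$

by (32). The second term is

$$l(o_l, E) = \int_0^{2\pi} |L^{-1}\mathbf{Z}|_{r=l}\,d\theta \le \int_0^{2\pi} |\mathbf{Z}|\,\|L\|_{r=l}\,d\theta, \tag{43}$$

since the norm of a Lorentz transformation equals the norm of its inverse. But L is given by Lemma 1, and applying (29), condition (I) and α < 1 we get

$$\|L\|_{r=l} \le \exp\Big(\frac{1000}{1999}\,\theta\, l^2 \|\mathbf{R}\|_{D_l}\Big) < \exp\Big(\frac{\theta}{1999}\Big), \tag{44}$$

so using (29) again gives

$$l(o_l, E) < \frac{2000}{1999}\, l \int_0^{2\pi} \exp\Big(\frac{\theta}{1999}\Big)\,d\theta < 6.3\, l. \tag{45}$$

Adding (41), (42) and (45) together we get the desired bound on $l(\gamma, E)$. □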
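Several of the numerical constants in the proof can be checked by direct arithmetic. The following is a small sanity check, not part of the original argument: it evaluates the worst cases (α = 1 and $l^2\|\mathbf{R}\|_{D_l} = 10^{-3}\alpha$) of the bounds in (32), (33), (42) and (45), and the resulting total for the b-length. The function name `lemma2_constants` and the dictionary keys are hypothetical.

```python
import math

def lemma2_constants():
    """Worst-case values of some constants in the proof of Lemma 2."""
    z_factor = 2000.0 / 1999.0                 # |Z| < z_factor * r, from (29)
    # (32): 2*pi*||R|| * int_0^l |Z| dr  with  l^2 ||R|| <= 1e-3 * alpha
    lam = 2.0 * math.pi * 1e-3 * z_factor / 2.0    # coefficient of alpha
    x = 1.0 / 300.0                            # bound on ||lambda|| for alpha < 1
    # (33): ||lambda||^2 * sum_k (||lambda||/2)^k, coefficient of alpha^2
    exp_tail = x * x / (1.0 - x / 2.0)
    radial_back = math.exp(x)                  # (42): exp(||lambda||)
    # (45): (2000/1999) * int_0^{2 pi} exp(theta/1999) d theta
    circle = z_factor * 1999.0 * math.expm1(2.0 * math.pi / 1999.0)
    total = 1.0 + radial_back + circle         # sum of (41), (42), (45), in units of l
    return {"lam": lam, "exp_tail": exp_tail,
            "radial_back": radial_back, "circle": circle, "total": total}

if __name__ == "__main__":
    for name, value in lemma2_constants().items():
        print(name, value)
```

Under these assumptions the values stay below α/300, α²/80000, 1.1, 6.3 and 9 respectively, in agreement with the stated bounds.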


Note 2. In [3], parallel propagation around a small square starting at one of the corners is investigated. The central result is Lemma 2.2.1, where the conditions l 2 kRk < α/28 and $l\|\nabla\mathbf{R}\| < \|\mathbf{R}\|/20$ are used to establish

$$\big\|\Lambda - \delta - l^2 \mathbf{R}(\mathbf{X}, \mathbf{Y})|_p\big\| < 6\alpha^2. \tag{46}$$

An explicit calculation in FLRW space-time shows that this is impossible without using a stronger condition on $\|\nabla\mathbf{R}\|$. Apart from typographical errors, the main problem seems to be in the argument at the top of p. 24 of [3]. It is possible to obtain an estimate of order $\alpha^2$ for parallel propagation around a circle with the starting point at the centre with a bound of order α on $\|\nabla\mathbf{R}\|$, by modifying the argument in our Lemma 2. The idea is to use a second order expansion of $\mathbf{R}(\mathbf{V}, \mathbf{Z})$ and then a symmetry argument to get rid of the first ∇R term. However, the penalty for the weaker condition on ∇R is that a condition on $\|\nabla^2\mathbf{R}\|$ of order $\alpha^2$ has to be imposed. For our purpose, condition (II) is sufficient.

4. Generating Lorentz Transformations

Using the approximation from Lemma 2, we can construct a loop generating a given Lorentz transformation exactly, provided that the transformation is sufficiently close to the identity. The idea is to generate a sequence of approximate transformations by parallel propagation along the boundaries of a sequence of appropriately constructed circles, applying Lemma 2 at each stage. First we construct the approximate curves to be used as building blocks for the final curve.

Note 3. To ensure the existence of the disks used to generate the curves, we need to avoid the situation where one of the radial curves cannot be continued because it runs into a singularity. If we restrict attention to a subset U of OM with compact closure, this can only happen if U contains a trapped inextendible incomplete curve [13]. This is avoided if we assume that the closure of U in $\overline{OM}$ is compact and contained in OM.

Lemma 3. Let $\lambda \in \mathfrak{l}$, $E \in OM$ and p = π(E) be given and suppose that there is a bivector W such that $\mathbf{R}_p(\mathbf{W}) = \lambda$, where $\mathbf{R}_p$ is the Riemann tensor in the frame E at p. Put

$$U := \{F;\ d(E, F) < 22\|\mathbf{W}\|^{1/2}\} \tag{47}$$

and $\|\cdot\|_U := \sup_U \|\cdot\|$, and let L := exp λ. Also, assume that $\|\mathbf{W}\|$ is sufficiently small for the closure of U in $\overline{OM}$ to be compact and contained in OM. If

$$\|\mathbf{W}\| < (\pi/4000)\, \|\mathbf{R}\|_U^{-1}, \tag{I}$$

$$\|\mathbf{W}\| < (\pi/40000)\, \|\nabla\mathbf{R}\|_U^{-2/3}, \tag{II}$$

then there is a horizontal curve γ in U starting at E which generates a Lorentz transformation Λ with

$$\|L - \Lambda\| < 10^{-3}\alpha^2, \tag{48}$$

where

$$\alpha < \max\Big\{\frac{4000}{\pi}\|\mathbf{W}\|\,\|\mathbf{R}\|_U,\ \Big(\frac{40000}{\pi}\|\mathbf{W}\|\Big)^{3/4} \|\nabla\mathbf{R}\|_U^{1/2}\Big\}, \tag{49}$$

and the b-length of γ is less than $22\|\mathbf{W}\|^{1/2}$.


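Proof. We start by decomposing W as

$$W = A\cos\theta + {*A}\sin\theta, \tag{50}$$

where A and ∗A are dual independent simple bivectors. Inverting this relation and using that for any bivector B,

$$\|{*\mathbf{B}}\| \le 2\sqrt{3}\,\|\mathbf{B}\|, \tag{51}$$

we get

$$\|\mathbf{A}\|,\ \|{*\mathbf{A}}\| < 4\|\mathbf{W}\|. \tag{52}$$

Define a disc by $f : D_{l_1} \to \pi(U)$, such that $|\mathbf{X}| = |\mathbf{Y}| = 1$, $\langle\mathbf{X}, \mathbf{Y}\rangle_{\mathbb{R}^4} = 0$ and $\pi l_1^2\, X \wedge Y = -A\cos\theta$ at E, as in §3. Then (52) gives

$$l_1^2 < \frac{4}{\pi}\|\mathbf{W}\|, \tag{53}$$

so $f(D_{l_1}) \subset \pi(U)$. Put

$$\alpha := \max\big\{10^3\, l_1^2 \|\mathbf{R}\|_U,\ 10^3\, l_1^{3/2} \|\nabla\mathbf{R}\|_U^{1/2}\big\}. \tag{54}$$

Then conditions (I) and (II) give α < 1, so Lemma 2 applies. From Lemma 2 we have a loop $\gamma_1$ at p and a Lorentz transformation $\Lambda_1$ generated by parallel propagation around $\gamma_1$. Replacing A cos θ and $l_1$ with ∗A sin θ and $l_2$ and repeating the above procedure we get another loop $\gamma_2$ at p which generates a Lorentz transformation $\Lambda_2$. Put

$$Z_1 := \Lambda_1 - \delta - \mathbf{R}_p(\mathbf{A}\cos\theta) \tag{55}$$

and

$$Z_2 := \Lambda_2 - \delta - \mathbf{R}_p({*\mathbf{A}}\sin\theta). \tag{56}$$

From Lemma 2 we know that

$$\|Z_1\|,\ \|Z_2\| < 10^{-5}\alpha^2. \tag{57}$$

Let $\Lambda = \Lambda_1\Lambda_2$. Then Λ is generated by parallel propagation around the concatenation γ of $\gamma_1$ and $\gamma_2$, and we may write

$$\Lambda - L = Z_1\big(Z_2 + \delta + \mathbf{R}_p({*\mathbf{A}}\sin\theta)\big) + \big(\delta + \mathbf{R}_p(\mathbf{A}\cos\theta)\big)Z_2 + \mathbf{R}_p(\mathbf{A}\cos\theta)\,\mathbf{R}_p({*\mathbf{A}}\sin\theta) - \sum_{k=2}^{\infty} \frac{\lambda^k}{k!}. \tag{58}$$

Using first (57) and (52) and then condition (I) we get

$$\|\Lambda_1\| = \|Z_1 + \delta + \mathbf{R}_p(\mathbf{A}\cos\theta)\| < 10^{-5}\alpha^2 + 1 + 4\|\mathbf{R}_p\|\,\|\mathbf{W}\| < 1.01, \tag{59}$$

and similarly

$$\|\Lambda_2\| = \|Z_2 + \delta + \mathbf{R}_p({*\mathbf{A}}\sin\theta)\| < 1.01 \tag{60}$$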


and

$$\|\delta + \mathbf{R}_p(\mathbf{A}\cos\theta)\| < 1.01. \tag{61}$$

Inserting (60) and (61) into (58) and using that

$$\|\lambda\| \le \|\mathbf{R}_p\|\,\|\mathbf{W}\| < \frac{\pi}{4000} \tag{62}$$

and

$$\|\mathbf{W}\| < \pi(l_1^2 + l_2^2) \tag{63}$$

and condition (I) gives

$$\|\Lambda - L\| < 2.02 \cdot 10^{-5}\alpha^2 + 16\|\mathbf{R}_p\|^2 \|\mathbf{W}\|^2 + \frac{\|\lambda\|^2}{2(1 - \|\lambda\|)} < 10^{-3}\alpha^2. \tag{64}$$

The length of γ is the sum of the lengths of $\gamma_1$ and $\gamma_2$. From Lemma 2 and (53) we find that

$$l(\gamma_1, E) < 9 l_1 < 11\|\mathbf{W}\|^{1/2}. \tag{65}$$

The same holds for $\gamma_2$ except that we have to correct for the starting frame being $E\Lambda_1$ instead of E. From (59),

$$l(\gamma_2, E\Lambda_1) < 9 l_2 \|\Lambda_1\| < 11\|\mathbf{W}\|^{1/2}, \tag{66}$$

and thus

$$l(\gamma, E) < 22\|\mathbf{W}\|^{1/2}. \tag{67}$$

□

The Riemann tensor in a given frame can be viewed as a map from the space of bivectors to the Lie algebra $\mathfrak{l}$. We use the norm

$$\|\mathbf{W}\| := 2 \sup_{|\mathbf{X}| = |\mathbf{Y}| = 1} |W^{ij} X_i Y_j| \tag{68}$$

for the bivectors, so that the mapping norm

$$\|\mathbf{R}\| := \sup_{\|\mathbf{W}\| = 1} \|\mathbf{R}(\mathbf{W})\| \tag{69}$$

agrees with the previously defined tensor norm (4). We now concentrate on the case when the Riemann tensor in the frame E is invertible (the frame is of course not essential here since invertibility in one frame is equivalent to invertibility in any frame). Note that if the Riemann tensor is invertible at a point $F \in OM$, the image of the space of bivectors is the whole Lorentz group $\mathcal{L}$, so by the standard holonomy theory the infinitesimal holonomy group is the whole of $\mathcal{L}$. Thus the length estimate is the important result here. The idea is to piece the curves from Lemma 3 together to generate a sufficiently small Lorentz transformation exactly.


Lemma 4. Let $\lambda \in \mathfrak{l}$, $E \in OM$ and p := π(E) be given and suppose that $\mathbf{R}_p$, the Riemann tensor in the frame E at p, is invertible. Let $\mathbf{R}_p^{-1}$ be the inverse and put

$$U := \{F;\ d(E, F) < 24\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda\|^{1/2}\} \tag{70}$$

and $\|\cdot\|_U := \sup_U \|\cdot\|$. If the closure of U in $\overline{OM}$ is compact and contained in OM and

$$\|\lambda\| < 10^{-6}\, \|\mathbf{R}_p^{-1}\|^{-2}\, \|\mathbf{R}\|_U^{-2}, \tag{I}$$

$$\|\lambda\| < 10^{-12}\, \|\mathbf{R}_p^{-1}\|^{-3}\, \|\nabla\mathbf{R}\|_U^{-2}, \tag{II}$$

then there is a horizontal curve in U starting from E and ending at E exp λ, piecewise smooth except possibly at the endpoint, of b-length less than

$$24\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda\|^{1/2}. \tag{71}$$

Proof. Let L := exp λ. To construct the first square, put $\mathbf{W} := \mathbf{R}_p^{-1}(\lambda)$. Since

$$\|\mathbf{R}_p^{-1}\|\,\|\mathbf{R}\|_U \ge \|\mathbf{R}_p^{-1}\|\,\|\mathbf{R}_p\| \ge 1, \tag{72}$$

condition (I) gives

$$\|\lambda\| < 10^{-6}. \tag{73}$$

Applying condition (I) to the first factor of $\|\lambda\|^2$ and (73) to the second factor and then taking the square root gives that condition (I) of Lemma 3 is fulfilled. Similarly, applying condition (II) to the first factor of $\|\lambda\|^3$, (73) to the other two factors and taking the third root gives that condition (II) of Lemma 3 is fulfilled. Thus Lemma 3 applies and we have a loop $\gamma_1$ which generates a first approximation $L_1$ to L. Also,

$$\alpha^2 < \|\lambda\| \max\Big\{\Big(\frac{4000}{\pi}\Big)^2 \|\lambda\|\,\|\mathbf{R}_p^{-1}\|^2 \|\mathbf{R}\|_U^2,\ \Big(\frac{40000}{\pi}\Big)^{3/2} \big(\|\lambda\|\,\|\mathbf{R}_p^{-1}\|^3 \|\nabla\mathbf{R}\|_U^2\big)^{1/2}\Big\}, \tag{74}$$

so from conditions (I) and (II),

$$\alpha^2 < 2\|\lambda\|, \tag{75}$$

and Lemma 3 gives

$$\|L - L_1\| < \frac{1}{500}\|\lambda\|. \tag{76}$$

Next we repeat the construction for the Lorentz transformation $L_1^{-1}L$. We first have to check that the conditions are satisfied. But from (59) and (60),

$$\|L_1\| \le \|\Lambda_1\|\,\|\Lambda_2\| < 1.1, \tag{77}$$

and from (76) and the fact that the norm of a Lorentz transformation equals the norm of its inverse,

$$\|L_1^{-1}L - \delta\| < \|L_1\|\,\|L - L_1\| < \frac{1}{450}\|\lambda\|. \tag{78}$$


It follows that we can write $L_1^{-1}L = \exp\lambda_2$ with

$$\|\lambda_2\| < \frac{450}{449}\|L_1^{-1}L - \delta\| < \frac{1}{449}\|\lambda\|. \tag{79}$$

Thus $\lambda_2$ satisfies the conditions as long as the generating curve stays in U. Repeating the above process we get a series of loops $\gamma_k$ corresponding to a sequence $\lambda_k$ of Lie algebra elements, generating Lorentz transformations $L_k$. The products $\hat{L}_k = L_1 L_2 \cdots L_k$ are generated by parallel propagation along the concatenation of the curves $\gamma_1, \gamma_2, \dots, \gamma_k$, and

$$\|\hat{L}_k - L\| \le \|\hat{L}_{k-1}\|\,\|L_k - \hat{L}_{k-1}^{-1}L\| < 1.1^{k-1}\,\frac{1}{500}\|\lambda_k\| \tag{80}$$

from (76) and repeated application of (77). But (79) gives

$$\|\lambda_k\| < \Big(\frac{1}{449}\Big)^{k-1}\|\lambda\|, \tag{81}$$

so $\hat{L}_k \to L$ as k → ∞. It remains to show that the resulting curve is contained in U. From Lemma 3,

$$l(\gamma_1, E) < 22\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda\|^{1/2}. \tag{82}$$

For $\gamma_k$, we have to take into account that the starting point is $E\hat{L}_{k-1}$ instead of E, so

$$l(\gamma_k, E\hat{L}_{k-1}) < 22\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda_k\|^{1/2}\|\hat{L}_{k-1}\| < 22\|\mathbf{R}_p^{-1}\|^{1/2}\Big(\frac{1}{449}\Big)^{(k-1)/2}\|\lambda\|^{1/2}\, 1.1^{k-1} \tag{83}$$

from (77) and (81). Summing over k we get the desired bound on the length, and it is evident that the generating curve stays in U. □

Note 4. The main difference between our Lemma 4 and Lemma 2.2.2 of [3] is that condition (I) involves the second power of $\mathbf{R}_p^{-1}$ and $\mathbf{R}$ instead of the first. This is needed to establish (74) which is essential for the construction of the sequence of circles to work. The corresponding equation at the bottom of p. 26 in [3] is incorrect since there a bound on $\Gamma^2\|\lambda\|$ is needed, but the given conditions only provide a bound on $\Gamma\|\lambda\|$.

It is now a simple matter to generate arbitrary transformations by splitting them in a finite number of factors, sufficiently small for Lemma 4 to apply, and joining together the resulting curves. Note that we do not need to go through the approximation scheme in Lemma 4 more than once as is done in [3], since once we have a curve generating the first factor, we can translate it along the fibres to get curves generating the other factors.

Theorem 1. Let $E \in OM$ with p := π(E) and put

$$U := \{F \in OM;\ d(E, F) < \delta\} \tag{84}$$

for some δ > 0, small enough for the closure of U in $\overline{OM}$ to be compact and contained in OM. Let L := exp λ be a Lorentz transformation and suppose that R is invertible on U. Then there is a horizontal curve γ in $\pi^{-1} \circ \pi(U)$ which generates L with

$$l(\gamma, E) < 24\|L\|\,\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda\|^{1/2} n^{1/2}, \tag{85}$$

where

$$n := \Big\lceil \|\lambda\| \max\big\{10^6 \|\mathbf{R}_p^{-1}\|^2 \|\mathbf{R}\|_U^2,\ 10^{12} \|\mathbf{R}_p^{-1}\|^3 \|\nabla\mathbf{R}\|_U^2,\ 24^2 \|\mathbf{R}_p^{-1}\|/\delta^2\big\} \Big\rceil. \tag{86}$$


Proof. We start by generating the Lorentz transformation $L_1 := \exp(\lambda/n)$, where $n \in \mathbb{N}$ is chosen sufficiently large for Lemma 4 to hold on a subset of U, which gives (86). By Lemma 4 there exists a horizontal curve $\gamma_1$ in U from E to $E_1 := EL_1$. Let $L_k := (L_1)^k$ and $E_k := EL_k$ for k = 2, 3, . . . , n. Then $\gamma_k = \gamma_1 L_{k-1}$ is a horizontal curve from $E_{k-1}$ to $E_k$ since the action of the Lorentz group preserves horizontal curves. Let γ be the combined curve obtained by joining the curves $\gamma_k$ in sequence. Then γ generates $L_n = L$ and since

$$l(\gamma_k) \le \|L_k\|\, l(\gamma_1) \le \|L\|\, l(\gamma_1) \tag{87}$$

and

$$l(\gamma_1) < 24\|\mathbf{R}_p^{-1}\|^{1/2}\|\lambda/n\|^{1/2}, \tag{88}$$

the result follows. □

5. The Singular Holonomy Group

We can now relate the structure of the singular holonomy group with the asymptotic behaviour of the Riemann tensor. First we need the following characterisation from [3].

Proposition 2. Suppose that γ : (0, 1] → OM is a horizontal curve with γ(0) = E and p = π(E) ∈ ∂M. Then $L \in \Phi^s_{OM}(E)$ if and only if there is a sequence $t_i$ with $t_i \to 0$ and loops $\kappa_i : [0, 1] \to M$ such that

$$\kappa_i(0) = \kappa_i(1) = \pi \circ \gamma(t_i), \tag{I}$$

$$L_i \to L, \tag{II}$$

$$l(\kappa_i, \gamma(t_i)) \to 0, \tag{III}$$

where $L_i$ are the Lorentz transformations obtained by parallel propagating $\gamma(t_i)$ around $\kappa_i$ for each i.

We may use Proposition 2 to give an alternative definition of the singular holonomy group [2]. Let $\varphi_a(F)$ be the group of Lorentz transformations generated by parallel transport around loops κ at π(F) with l(κ, F) ≤ a. Then if γ : (0, 1] → OM is a horizontal curve starting at $\gamma(0) = E \in \pi^{-1}(p)$ with p ∈ ∂M,

$$\Phi^s_{OM}(E) := \bigcap_{a \in \mathbb{R}^+}\ \bigcup_{t \in (0,1]} \varphi_a(\gamma(t)). \tag{89}$$

A nontrivial $\Phi^s_{OM}$ may have several causes. For example, the bounded part of the curvature may contribute as well as the unbounded part [2], and non-trivial topologies can generate discrete subgroups (see §7 below). In the following section we concentrate on using Lemma 4 to show how divergence of the Riemann tensor can cause total degeneracy.
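The bookkeeping in Theorem 1 can be made concrete with a short sketch. It is illustrative only: the function names `split_count` and `length_bound` and all argument names are hypothetical, the inputs are the scalar norms appearing in (85) and (86), and a lower limit of one factor is imposed as an extra assumption so that the result is always a valid count.

```python
import math

def split_count(lam, rinv, r_u, dr_u, delta):
    """Number n of factors exp(lambda/n) as in (86).

    lam   : norm of the Lie algebra element lambda
    rinv  : norm of the inverse Riemann tensor at p
    r_u   : sup norm of R on U
    dr_u  : sup norm of nabla R on U
    delta : radius of the ball U
    """
    bound = max(1e6 * rinv ** 2 * r_u ** 2,
                1e12 * rinv ** 3 * dr_u ** 2,
                24.0 ** 2 * rinv / delta ** 2)
    return max(1, math.ceil(lam * bound))

def length_bound(l_norm, lam, rinv, n):
    """Right-hand side of (85): n pieces, each bounded via (87) and (88)."""
    return 24.0 * l_norm * math.sqrt(rinv) * math.sqrt(lam) * math.sqrt(n)

if __name__ == "__main__":
    n = split_count(lam=3.3e-5, rinv=1.0, r_u=1.0, dr_u=1e-4, delta=0.5)
    print(n, length_bound(1.0, 3.3e-5, 1.0, n))
```

With this choice of n each factor λ/n meets the smallness conditions of Lemma 4 up to the difference between strict and non-strict inequality, and the three entries of the maximum correspond to condition (I), condition (II) and the requirement that the ball in (70) fits inside U.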


6. Total Degeneracy

Combining Proposition 2 with Theorem 1 we get the following sufficient conditions for total degeneracy of a boundary fibre. In the rest of this section we will see that the conditions are indeed fulfilled in many interesting cases relevant to general relativity.

Theorem 2. Suppose that γ : (0, 1] → OM is a horizontal curve with γ(0) = E and p = π(E) ∈ ∂M, and that there are sequences $t_i \to 0$ and $\rho_i \to 0$ such that R is invertible on the balls $U_i := B_{\rho_i}(\gamma(t_i))$. If the closure of each $U_i$ in OM is compact and contained in OM and $\|R_i^{-1}\|^3 \|R\|^2_{U_i}$, $\|R_i^{-1}\|^2 \|\nabla R\|_{U_i}$ and $\|R_i^{-1}\|/\rho_i$ tend to 0 as $t_i \to 0$, then $\Phi^s_{OM}(E) = \mathcal{L}$.

Note that invertibility of the Riemann tensor means that it is injective, i.e. there are no 2-planes on which R vanishes, and surjective, i.e. there is no subspace of the Lie algebra unaffected by curvature. If R is invertible,

$$\|R^{-1}\| = \sup_{\lambda} \frac{\|R^{-1}(\lambda)\|}{\|\lambda\|} = \sup_{W} \frac{\|W\|}{\|R(W)\|} = \Big( \inf_{\|W\|=1} \|R(W)\| \Big)^{-1}, \tag{90}$$

so $\|R^{-1}\| \to 0$ if and only if $\|R(W)\|$ diverges for all bivectors W. In other words, $\|R^{-1}\| \to 0$ if and only if, for all index pairs k and l, there are two indices i and j such that the frame component $R^{ij}{}_{kl}$ diverges. This could happen if all sectional curvatures diverge, for example.

We are now able to show that the boundary fibres are totally degenerate in many cases. We will employ the following procedure. Let γ : I → M be a curve with an endpoint p ∈ ∂M, and let E be a pseudo-orthonormal frame field on (a subset of) M. Using Cartan's equations we find the rotation coefficients and the Riemann tensor components in the frame E. We may then write down and solve the parallel propagation equations for a frame F along γ. The tricky part is finding a sequence of parameter values $t_i$ along with suitable $\rho_i$-balls $U_i$ and bounds on $\|R\|_{U_i}$ and $\|\nabla R\|_{U_i}$. To this end, we need to explore the connection between the b-distance and Lorentz transformations.

Lemma 5. Let p ∈ M and $V \subseteq B_\rho(p, E_p) \subset OM$, and suppose that $E_p$ can be extended to a frame field E on V. Put $\|\Gamma\|_{\pi(V)} := \sup_{\pi(V)} \|\Gamma\|$, where Γ is the array of the rotation coefficients in the frame E, and $K := \max\{\|\Gamma\|_{\pi(V)}, 1\}$. If ρ ≤ 1/4K then all frames in V can be expressed as EL with $\|L\| < 2$.

Proof. Let κ : [0, ρ] → V be a curve in V with κ(0) = (p, E), parameterised by b-length s. Let $\dot\kappa$ be the tangent vector of κ, and let $\mathbf{V}$ be the tangent vector of π ∘ κ with components V in the fixed frame E. Also, let the frame F of κ be given by F = EL. We want to show that $\|L\| < 2$. From [8], the fundamental 1-form θ at κ(s) is given by $F^{-1} \circ \pi_*$, where F is regarded as a map $\mathbb{R}^4 \to T_{\pi\circ\kappa}M$, so

$$\theta(\dot\kappa) = L^{-1} V. \tag{91}$$

Next, the connection form ω is given by

$$\phi\,\omega(\dot\kappa) = \operatorname{ver} \dot\kappa, \tag{92}$$

where ϕ is the canonical isomorphism from l to the vertical subspace of $T_{\kappa(s)}OM$, and $\operatorname{ver}\dot\kappa$ denotes the vertical component of $\dot\kappa$ [8]. By definition, if a ∈ l and A(t) is any curve in L with A(0) = δ and $\frac{d}{dt}\big|_{t=0} A = a$, then

$$\phi(a) := \frac{d}{dt}\Big|_{t=0} R_{A(t)} F = F a \tag{93}$$

at F. The vertical component of $\dot\kappa$ is given by

$$\nabla_{\mathbf{V}} F = (\nabla_{\mathbf{V}} E) L + E \dot L = F L^{-1} (\Gamma_V L + \dot L), \tag{94}$$

where $\Gamma_V L$ is the matrix with components $\Gamma^i{}_{kl} V^k L^l{}_j$ and $\Gamma^i{}_{kl}$ are the rotation coefficients of the frame E. Combining (92), (93) and (94) gives

$$\omega(\dot\kappa) = L^{-1} (\Gamma_V L + \dot L). \tag{95}$$

Since κ is parameterised by b-length,

$$|\theta(\dot\kappa)|^2 + \|\omega(\dot\kappa)\|^2 = 1, \tag{96}$$

so from (91),

$$|V| \le \|L\|\, |\theta(\dot\kappa)| \le \|L\| \tag{97}$$

and from (95),

$$\frac{d}{ds}\|L\| \le \|L\|\, \|\omega(\dot\kappa)\| + \|\Gamma\|\, |V|\, \|L\| \le K \|L\|^2 + \|L\|. \tag{98}$$

Put $u := K\|L\|$. Then

$$\frac{\dot u}{u^2 + u} \le 1, \tag{99}$$

and integration gives

$$u \le \frac{K}{(K + 1) e^{-s} - K}, \tag{100}$$

since $\|L\| = 1$ at s = 0. Thus

$$\|L\| \le \big( (K + 1) e^{-s} - K \big)^{-1}, \tag{101}$$

and the result follows from s ≤ 1/4K and K ≥ 1. □

Note 5. Equation (98) corresponds to the differential equation on p. 42 of [3], except that there the last term is incorrectly given as 1 instead of $\|L\|$.
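The last step of the proof of Lemma 5 can be checked numerically. The sketch below is not part of the original argument: it evaluates the right-hand side of (101) at the largest admissible s = 1/4K and, as a cross-check, integrates the extremal case of (99), u' = u² + u with u(0) = K, by a classical Runge-Kutta scheme. The function names are illustrative.

```python
import math


def lorentz_norm_bound(K: float, s: float) -> float:
    """Right-hand side of (101): ((K + 1) e^{-s} - K)^{-1}."""
    denom = (K + 1.0) * math.exp(-s) - K
    if denom <= 0.0:
        raise ValueError("bound (101) is only finite while (K + 1) e^{-s} > K")
    return 1.0 / denom


def integrate_extremal(K: float, s_end: float, steps: int = 2000) -> float:
    """RK4 solution of u' = u^2 + u, u(0) = K (equality in (99)); returns u / K."""
    def f(u: float) -> float:
        return u * u + u

    u = K
    h = s_end / steps
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u / K


if __name__ == "__main__":
    for K in (1.0, 2.0, 10.0, 1000.0):
        s = 1.0 / (4.0 * K)  # largest b-length allowed by rho <= 1/4K
        print(K, lorentz_norm_bound(K, s), integrate_extremal(K, s))
```

For every K ≥ 1 the printed bound stays below 2, which is the statement of the lemma. It decreases towards 4/3 as K grows, so the constant 2 is not sharp for large K.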


6.1. FLRW space-times. Let (M, g) be a Robertson-Walker space-time, with M = (0, τ) × Σ and g given by the line element

$$ds^2 = -dt^2 + a(t)^2\, d\sigma^2 \tag{102}$$

such that $(\Sigma, d\sigma^2)$ is a homogeneous space (see e.g. [6,10,3]). The scale function a(t) is determined from the chosen matter model via the field equations. For a Friedman big bang model, a(t) → 0 as t → 0, corresponding to a curvature singularity at t = 0. Let γ be a curve in M with constant projection x ∈ Σ, parameterised by t. Then γ starts at the singularity at t = 0. Choose the pseudo-orthonormal frame field E on (a subset of) M as

$$E_0 := \frac{\partial}{\partial t} \quad\text{and}\quad E_\alpha := a(t)^{-1} \tilde E_\alpha, \tag{103}$$

where $\tilde E$ is an orthonormal frame field on the Riemannian manifold $(\Sigma, d\sigma^2)$. Note that $\tilde E$ may be defined only on a neighbourhood of x if $(\Sigma, d\sigma^2)$ does not admit a global parallelisation. Here greek indices α, β, ... refer to spatial components and have values in {1, 2, 3}. Write θ for the cotangent frame field dual to E, i.e. θ is the fundamental 1-form restricted to the section of OM defined by E. From Cartan's equations, the nonvanishing connection and curvature form components are

$$\omega^0{}_\alpha = \omega^\alpha{}_0 = \dot a a^{-1} \theta^\alpha, \qquad \omega^\alpha{}_\beta = -\omega^\beta{}_\alpha = a^{-1} \tilde\Gamma^\alpha{}_{\mu\beta}\, \theta^\mu, \tag{104}$$

and

$$\Omega^0{}_\alpha = \Omega^\alpha{}_0 = \ddot a a^{-1}\, \theta^0 \wedge \theta^\alpha, \qquad \Omega^\alpha{}_\beta = -\Omega^\beta{}_\alpha = \big( a^{-2} \tilde R^\alpha{}_{\beta\mu\nu} + \dot a^2 a^{-2} \delta^\alpha{}_\mu \delta_{\beta\nu} \big)\, \theta^\mu \wedge \theta^\nu, \tag{105}$$

where a dot denotes differentiation with respect to t and $\tilde\Gamma^\alpha{}_{\delta\beta}$ and $\tilde R^\alpha{}_{\beta\mu\nu}$ are the rotation coefficients and the Riemann tensor components, respectively, of $(\Sigma, d\sigma^2)$ in the frame $\tilde E$. Solving the parallel propagation equations we find that E is parallel along γ.

To study the asymptotic behaviour we consider the case $a(t) = t^c$ for a constant c ∈ (0, 1). Then there are positive constants $N_1$ and $N_2$ such that

$$\|R\| < N_1 \max\{ t^{-2},\; t^{-2c} \|\tilde R\| \} \tag{106}$$

and

$$\|\nabla R\| < N_2 \max\{ t^{-3},\; t^{-2c-1} \|\tilde R\|,\; t^{-3c} \|\nabla \tilde R\| \} \tag{107}$$

in the frame E. Moreover, R is invertible on γ and

$$\|R^{-1}\| < N_3 t^2 \tag{108}$$

for some positive constant $N_3$, so $\|R^{-1}\| \to 0$ as t → 0. Pick a sequence $t_i \to 0$ and let $\rho_i := t_i/8$ and $U_i := B_{\rho_i}(\gamma(t_i))$. Let S be a neighbourhood of x in Σ such that $\|\tilde\Gamma\|$, $\|\tilde R\|$ and $\|\nabla \tilde R\|$ are bounded on S. Put $V_i := U_i \cap \mathcal{K}_i$, where $\mathcal{K}_i := \pi^{-1}([t_i/2, 3t_i/2] \times S)$. Then for small enough $t_i$,

$$1 < \|\Gamma\|_{V_i} \le 2c\, t_i^{-1} = K_i, \tag{109}$$

and since c < 1, $\rho_i < 1/4K_i$. Thus Lemma 5 gives $\|L\| < 2$ on $V_i$. If κ is a curve in $V_i$ with $\kappa(0) = \gamma(t_i)$ and $l(\kappa) \le \rho_i$, the t-coordinate satisfies

$$|t - t_i| = \Big| \int_0^s \big( L\theta(\dot\kappa) \big)^0\, ds \Big| \le \|L\|\, l(\kappa) < \frac{t_i}{4} \tag{110}$$

on κ. Let $\tilde\kappa$ be the projection of π ∘ κ to Σ. Since $\tilde E$ is an orthonormal frame, the metric length of $\tilde\kappa$ in $(\Sigma, d\sigma^2)$ can be estimated by

$$l_\sigma(\tilde\kappa) \le \int_0^s a^{-1} \|L\|\, |\theta(\dot\kappa)|\, ds < 2^{c-2}\, t_i^{1-c}, \tag{111}$$

which tends to 0 as $t_i \to 0$. But then $U_i$ must be contained in $\mathcal{K}_i$ for small enough $t_i$, so the t-coordinate must be greater than $t_i/2$ on the whole of $U_i$. Thus $\|R_i^{-1}\|^3 \|R\|^2_{U_i}$, $\|R_i^{-1}\|^2 \|\nabla R\|_{U_i}$ and $\|R_i^{-1}\|/\rho_i$ all tend to 0 as $t_i \to 0$, so by Theorem 2 the fibre over γ(0) is totally degenerate.

Note that in [3], a similar result is given for 2/3 < c < 1. The reason for the restriction on c is that Clarke uses a bound on $\|R^{-1}\|$ of order $t^{2c}$, while $\|R^{-1}\|$ is actually of order $t^2$ for small enough t.
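The final step of §6.1 is bookkeeping of powers of t, which can be sketched as follows. The code is an illustration, not part of the original argument: it assumes the spatial quantities are bounded on S, reads the power-law rates off (106)-(108) and ρ_i = t_i/8, and forms the exponents of the three quantities in Theorem 2. All function names are hypothetical.

```python
from fractions import Fraction


def dominant(*exponents):
    """As t -> 0, the term with the smallest exponent dominates a max{...} of powers of t."""
    return min(exponents)


def flrw_rates(c):
    """Exponents of t in the bounds (106)-(108) for a(t) = t^c, 0 < c < 1."""
    c = Fraction(c)
    R = dominant(Fraction(-2), -2 * c)                   # ||R||      < N1 t^R
    grad_R = dominant(Fraction(-3), -2 * c - 1, -3 * c)  # ||grad R|| < N2 t^grad_R
    inv_R = Fraction(2)                                  # ||R^-1||   < N3 t^2
    rho = Fraction(1)                                    # rho_i = t_i / 8
    return inv_R, R, grad_R, rho


def theorem2_exponents(inv_R, R, grad_R, rho):
    """Exponents of t in ||R^-1||^3 ||R||^2, ||R^-1||^2 ||grad R|| and ||R^-1|| / rho."""
    return (3 * inv_R + 2 * R, 2 * inv_R + grad_R, inv_R - rho)


if __name__ == "__main__":
    for c in (Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)):
        exps = theorem2_exponents(*flrw_rates(c))
        # All three exponents are positive, so all three quantities tend to 0.
        print(c, exps, all(e > 0 for e in exps))
```

For every c in (0, 1) the exponents come out as (2, 1, 1), independently of c, which is consistent with the remark that no restriction such as 2/3 < c < 1 is needed once the bound on the inverse is of order t².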

6.2. Kasner space-times. To illustrate that the fibre degeneracy is not an artefact of isotropy we repeat the calculations for the anisotropic Kasner space-times (see e.g. [10]). Let M := I × Σ with metric g given by

$$ds^2 = -dt^2 + t^{2p_x} dx^2 + t^{2p_y} dy^2 + t^{2p_z} dz^2, \tag{112}$$

where (x, y, z) are coordinates on Σ and the constants $p_x$, $p_y$ and $p_z$ satisfy

$$p_x + p_y + p_z = 1 \quad\text{and}\quad p_x^2 + p_y^2 + p_z^2 = 1. \tag{113}$$

We exclude the special case when $p_x = p_y = 0$, $p_z = 1$ (including permutations of x, y and z) which corresponds to one half of Minkowski space. For all other parameter values, there is a curvature singularity at t = 0. Let γ be a curve with constant x, y and z, starting at the singularity and parameterised by t. Choosing a pseudo-orthonormal frame field E as

$$E_0 := \frac{\partial}{\partial t}, \quad E_1 := t^{-p_x} \frac{\partial}{\partial x}, \quad E_2 := t^{-p_y} \frac{\partial}{\partial y} \quad\text{and}\quad E_3 := t^{-p_z} \frac{\partial}{\partial z} \tag{114}$$

we find again that E is parallel propagated along γ, that R is invertible, and that $\|R\| < N_1 t^{-2}$, $\|\nabla R\| < N_2 t^{-3}$ and $\|R^{-1}\| < N_3 t^2$ for some constants $N_1$, $N_2$ and $N_3$. Put $p := \max\{|p_x|, |p_y|, |p_z|\}$. Then $\|\Gamma\| = p t^{-1}$, and an argument similar to that in §6.1 gives that the boundary fibre is totally degenerate.
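Concrete exponents satisfying (113) are easy to generate. The sketch below uses a one-parameter family that is a standard way to solve the two Kasner constraints; it is not taken from the text above, and the function names are illustrative. Exact rational arithmetic makes the check of (113) an equality rather than a floating point comparison.

```python
from fractions import Fraction


def kasner_exponents(u):
    """One-parameter family of solutions (p_x, p_y, p_z) of the constraints (113)."""
    u = Fraction(u)
    d = 1 + u + u * u  # never zero for real u
    return (-u / d, (1 + u) / d, u * (1 + u) / d)


def satisfies_kasner(p):
    """Check both constraints in (113) exactly."""
    return sum(p) == 1 and sum(x * x for x in p) == 1


def is_excluded_flat_case(p):
    """The excluded case: two exponents vanish and the third equals 1."""
    return sorted(p) == [0, 0, 1]


if __name__ == "__main__":
    for u in (0, 1, 2, Fraction(1, 2)):
        p = kasner_exponents(u)
        p_max = max(abs(x) for x in p)  # the constant p in ||Gamma|| = p / t
        print(u, p, satisfies_kasner(p), is_excluded_flat_case(p), p_max)
```

The value u = 0 reproduces the excluded flat case (0, 1, 0). For u = 2 one gets (-2/7, 3/7, 6/7), a genuinely anisotropic solution with p = 6/7.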


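6.3. Schwarzschild space-time. Let (M, g) be given by

$$ds^2 = b(r)^{-2} dt^2 - b(r)^2 dr^2 + r^2 (d\vartheta^2 + \sin^2\vartheta\, d\phi^2) \tag{115}$$

with t ∈ R, r ∈ (0, 2m), ϑ ∈ [0, π], φ ∈ [0, 2π), and

$$b(r) := \Big( \frac{2m}{r} - 1 \Big)^{-1/2} \tag{116}$$

(see e.g. [6,10]). Choose E as

$$E_0 := b^{-1} \frac{\partial}{\partial r}, \quad E_1 := b \frac{\partial}{\partial t}, \quad E_2 := r^{-1} \frac{\partial}{\partial \vartheta} \quad\text{and}\quad E_3 := (r \sin\vartheta)^{-1} \frac{\partial}{\partial \phi} \tag{117}$$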

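and let the corresponding cotangent frame be θ. The connection form is

$$\omega^0{}_1 = \omega^1{}_0 = -m b r^{-2}\, \theta^1, \qquad \omega^0{}_2 = \omega^2{}_0 = b^{-1} r^{-1}\, \theta^2,$$
$$\omega^0{}_3 = \omega^3{}_0 = b^{-1} r^{-1}\, \theta^3, \qquad \omega^2{}_3 = -\omega^3{}_2 = -r^{-1} \cot\vartheta\, \theta^3, \tag{118}$$

and the curvature form is

$$\Omega^0{}_1 = \Omega^1{}_0 = 2m r^{-3}\, \theta^0 \wedge \theta^1, \qquad \Omega^0{}_2 = \Omega^2{}_0 = -m r^{-3}\, \theta^0 \wedge \theta^2,$$
$$\Omega^0{}_3 = \Omega^3{}_0 = -m r^{-3}\, \theta^0 \wedge \theta^3, \qquad \Omega^1{}_2 = -\Omega^2{}_1 = -m r^{-3}\, \theta^1 \wedge \theta^2,$$
$$\Omega^1{}_3 = -\Omega^3{}_1 = -m r^{-3}\, \theta^1 \wedge \theta^3, \qquad \Omega^2{}_3 = -\Omega^3{}_2 = 2m r^{-3}\, \theta^2 \wedge \theta^3. \tag{119}$$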

Thus there are positive constants $N_1$ and $N_2$ such that $\|R\| < N_1 r^{-3}$ and $\|\nabla R\| < N_2 r^{-9/2}$ in the frame E. Let γ be a radial curve parameterised by r with ϑ = π/2, φ = 0 and $t = t_0$. Then E is parallel on γ, R is invertible, and $\|R^{-1}\| < r^3/m$ along γ. If ϑ is bounded away from 0 and π,

$$\|\Gamma\| \le \sqrt{2m}\; r^{-3/2} \tag{120}$$

for small r. Choosing a sequence $r_i \to 0$ and

$$\rho_i := \frac{r_i^{3/2}}{16 \sqrt{m}}, \tag{121}$$

an argument similar to that in §6.1 gives that $\|L\| < 2$ and $r > r_i/2$ on each $U_i := B_{\rho_i}(\gamma(r_i))$ for small enough $r_i$. Thus the conditions of Theorem 2 are fulfilled, so the boundary fibre is totally degenerate.

6.4. Reissner-Nordström space-time. Let (M, g) be given by

$$ds^2 = -b(r)^{-2} dt^2 + b(r)^2 dr^2 + r^2 (d\vartheta^2 + \sin^2\vartheta\, d\phi^2) \tag{122}$$

with t ∈ R, $r \in (0, r_-)$, ϑ ∈ [0, π] and φ ∈ [0, 2π), and

$$b(r) := \Big( 1 - \frac{2m}{r} + \frac{e^2}{r^2} \Big)^{-1/2} \tag{123}$$

(see e.g. [6,10]). Degeneracy of the boundary fibre follows directly by generalising the argument in §6.3, with $\rho_i := r_i^2/32|e|$, $\|R\| < N_1 r^{-4}$, $\|\nabla R\| < N_2 r^{-6}$ and $\|R^{-1}\| < N_3 r^4$. Note that the timelike nature of the singularity does not affect the argument.
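Two details of the Schwarzschild case in §6.3 can be checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the author's computation: it verifies that the frame (117) is pseudo-orthonormal in the metric (115), and that the radius (121) coincides with 1/4K from Lemma 5 when K is taken to be the bound (120) evaluated at r = r_i/2, the smallest radius reached in U_i. The helper names are hypothetical.

```python
import math


def b(r: float, m: float) -> float:
    """b(r) from (116); only defined inside the horizon, 0 < r < 2m."""
    return (2.0 * m / r - 1.0) ** -0.5


def frame_norms(r: float, m: float):
    """g(E0, E0) and g(E1, E1) for the frame (117) in the metric (115)."""
    g_tt = b(r, m) ** -2      # coefficient of dt^2 in (115)
    g_rr = -b(r, m) ** 2      # coefficient of dr^2 in (115)
    e0_r = 1.0 / b(r, m)      # E0 = b^{-1} d/dr
    e1_t = b(r, m)            # E1 = b d/dt
    return g_rr * e0_r ** 2, g_tt * e1_t ** 2


def rho(r_i: float, m: float) -> float:
    """The ball radius (121)."""
    return r_i ** 1.5 / (16.0 * math.sqrt(m))


def lemma5_radius(r_i: float, m: float) -> float:
    """1/(4K) with K = sqrt(2m) r^{-3/2} from (120), evaluated at r = r_i / 2."""
    K = math.sqrt(2.0 * m) * (r_i / 2.0) ** -1.5
    return 1.0 / (4.0 * K)


if __name__ == "__main__":
    m = 1.0
    for r_i in (0.5, 0.1, 0.01):
        print(r_i, frame_norms(r_i, m), rho(r_i, m), lemma5_radius(r_i, m))
```

The frame norms come out as -1 and +1, so E_0 (along ∂/∂r) is the timelike leg inside the horizon, and the two radii agree up to rounding.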


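6.5. Tolman-Bondi space-time. The metric for the spherically symmetric Tolman-Bondi space-time (M, g) is given by

$$ds^2 = -dt^2 + e^{2\omega} dr^2 + R^2 (d\vartheta^2 + \sin^2\vartheta\, d\phi^2), \tag{124}$$

where ω := ω(t, r) and R := R(t, r) > 0 [11]. If the energy momentum tensor is taken to be of dust form,

$$T := \epsilon(t, r)\, \frac{\partial}{\partial t} \otimes \frac{\partial}{\partial t}, \tag{125}$$

the equations for ω and R are

$$\tfrac{1}{2} \dot R^2 - \frac{m}{R} = \tfrac{1}{2} (W^2 - 1), \tag{126}$$

$$R' = W e^{\omega}, \tag{127}$$

$$\epsilon = \frac{r^2 \rho}{R^2 R'}, \tag{128}$$

where W := W(r), ρ(r) := ε(0, r), dots and primes denote partial derivatives with respect to t and r respectively, and

$$m(r) := 4\pi \int_0^r \rho r^2\, dr. \tag{129}$$

Here r is rescaled such that r := R(0, r) and ω, R and ε are assumed to be smooth functions of t and r. We require that ε(t, r) ≥ 0 and ε(t, 0) > 0 for physical reasons. Put

$$E(r) := \tfrac{1}{2} \big( W^2(r) - 1 \big) \tag{130}$$

and let

$$a(r) := \frac{3 m(r)}{4\pi r^3} \tag{131}$$

and

$$p(r) := -\frac{E(r) R(0, r)}{m(r)}. \tag{132}$$

It can be shown that both a and p extend to smooth even functions of r on R, with a(r) > 0 and p(r) ≤ 1. Choose a pseudo-orthonormal frame E with cotangent frame θ according to

$$E_0 := \frac{\partial}{\partial t}, \quad E_1 := \frac{W}{R'} \frac{\partial}{\partial r}, \quad E_2 := R^{-1} \frac{\partial}{\partial \vartheta}, \quad\text{and}\quad E_3 := (R \sin\vartheta)^{-1} \frac{\partial}{\partial \phi}. \tag{133}$$

Then the connection form is

$$\omega^0{}_1 = \omega^1{}_0 = \frac{\dot R'}{R'}\, \theta^1, \qquad \omega^0{}_2 = \omega^2{}_0 = \frac{\dot R}{R}\, \theta^2,$$
$$\omega^0{}_3 = \omega^3{}_0 = \frac{\dot R}{R}\, \theta^3, \qquad \omega^1{}_2 = -\omega^2{}_1 = -\frac{W}{R}\, \theta^2,$$
$$\omega^1{}_3 = -\omega^3{}_1 = -\frac{W}{R}\, \theta^3, \qquad \omega^2{}_3 = -\omega^3{}_2 = -\frac{1}{R} \cot\vartheta\, \theta^3, \tag{134}$$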


and the curvature form is

$$\Omega^0{}_1 = \Omega^1{}_0 = 2m R^{-3}\, \theta^0 \wedge \theta^1, \qquad \Omega^0{}_2 = \Omega^2{}_0 = -m R^{-3}\, \theta^0 \wedge \theta^2,$$
$$\Omega^0{}_3 = \Omega^3{}_0 = -m R^{-3}\, \theta^0 \wedge \theta^3, \qquad \Omega^1{}_2 = -\Omega^2{}_1 = \Big( \frac{m'}{R' R^2} - \frac{m}{R^3} \Big)\, \theta^1 \wedge \theta^2,$$
$$\Omega^1{}_3 = -\Omega^3{}_1 = \Big( \frac{m'}{R' R^2} - \frac{m}{R^3} \Big)\, \theta^1 \wedge \theta^3, \qquad \Omega^2{}_3 = -\Omega^3{}_2 = 2m R^{-3}\, \theta^2 \wedge \theta^3. \tag{135}$$

Integrating (126), we get the following implicit expression for R:

$$\Big( \frac{R}{r} \Big)^{3/2} F(pR/r) = F(p) - \frac{t}{t_0} \Big( \frac{a}{a_0} \Big)^{1/2} F(p_0), \tag{136}$$

where $a_0 := a(0) = \rho(0) > 0$, $p_0 := p(0) \le 1$, $t_0 := (3/8\pi a_0)^{1/2} F(p_0)$, and F : (−∞, 1) → (0, π/2) is a positive, bounded, smooth, strictly increasing and strictly convex function. If E(r) < 0, (136) is singular on a hypersurface $\{t = t_b(r)\}$, where pR = r, with $t_b(r) \le 0$. For $t < t_b(r)$ an equation similar to (136) holds, and we will concentrate on the region where t > 0. We refer to [11] for the details.

There are several types of singularities in the Tolman-Bondi space-time. There is a coordinate singularity at r = 0, a central singularity at $(t, r) = (t_0, 0)$, and a final singularity at r > 0, R = 0. For some parameter values, there are also shell crossing singularities where R′ = 0 (see §7.2 below).

First we study the final singularity. Let γ be a curve with constant r, ϑ and φ, and parameterise γ by $\tau := t_s - t$, where $t_s$ is the value of t at which R = 0 in (136),

$$t_s := \Big( \frac{a_0}{a} \Big)^{1/2} \frac{F(p)}{F(p_0)}\, t_0. \tag{137}$$

Then γ starts at the final singularity at τ = 0 and E is parallel along γ. All functions not depending on t are bounded, so from (136) and (126), there are constants $N_1$, $N_2$ and $N_3$ such that

$$\|R\| < N_1 \tau^{-2}, \tag{138}$$
$$\|\nabla R\| < N_2 \tau^{-3}, \tag{139}$$
$$\|R^{-1}\| < N_3 \tau^{2}. \tag{140}$$

By an argument as in §6.1, fibres over the final singularity are degenerate.

Next we turn our attention to the central singularity at $(t_0, 0)$. Let γ be a radial curve with $t = t_0$ and constant ϑ and φ, starting at $(t, r) = (t_0, 0)$ and parameterised by r. Also, let F = EL be parallel along γ. Solving the parallel propagation equation we find that L is a Lorentz boost in the $(E_0, E_1)$-plane with hyperbolic angle

$$\varphi := -\int \frac{\dot R'}{W}\, dr. \tag{141}$$

Let

$$C_0 := \tfrac{1}{2} p''(0) F'(p_0) - \frac{1}{4 a_0} a''(0) F(p_0). \tag{142}$$

We assume that $C_0 \ne 0$, the case of interest being $C_0 > 0$ since then the singularity is naked [11]. If we restrict attention to the neighbourhood of γ where

$$|t - t_0| < \frac{C_0 t_0}{3 F(p_0)}\, r^2, \tag{143}$$

then it is possible to use (136), (126) and the fact that a and p extend to R to estimate all components of Γ, R and ∇R. We find that there are positive constants $N_1$, $N_2$ and $N_3$ such that

$$\|R\| < N_1 r^{-4}, \tag{144}$$
$$\|\nabla R\| < N_2 r^{-19/3}, \tag{145}$$
$$\|R^{-1}\| < N_3 r^{4}. \tag{146}$$

Also, ϕ is bounded as r → 0. Again, an argument similar to the one in §6.1, with $\rho_i$ proportional to $r_i^{7/3}$, gives that the fibre is totally degenerate also for this naked singularity. Note that $\|\nabla R\|$ has a stronger divergence than $\|R\|^{3/2}$.

7. Partial Degeneracy

In general it can be very hard to show that a boundary fibre is degenerate, since different subgroups of the singular holonomy group may be generated by various things, e.g. unbounded curvature, regular curvature, quasi-regular singularities, and contributions from other boundary points due to non-Hausdorff behaviour of the b-boundary [2]. Note that even if the Riemann tensor is non-invertible and/or if only some components diverge, in some cases Lemma 3 may be used to establish partial degeneracy at least.

7.1. Quasi-regular singularities. To illustrate how degeneracy can be caused by topological anomalies we consider quasi-regular singularities obtained by suitable identifications in (the universal covering space of) Minkowski space-time (M, g) with

$$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2. \tag{147}$$

Given an isometry ϕ, we may identify points ϕ(p) with p in (the universal covering space of) a subset of (M, g) [5].

As a first example, let $(\hat M, \hat g)$ be the universal covering space of Minkowski space with the timelike 2-plane {x = y = 0} removed and let ϕ be the rotation in the (x, y)-plane by an angle φ ≠ 2π. Then the space-time obtained by identifying points with their images under ϕ has a conelike singularity at {x = y = 0}. Since $(\hat M, \hat g)$ is flat, the infinitesimal, local and restricted holonomy groups are all trivial, so the only contribution to the singular holonomy group comes from curves not homotopic to 0. It clearly suffices to study curves with $x^2 + y^2 = r$ as r → 0, and a simple argument then gives that $\Phi^s_{OM}$ is a discrete group generated by φ modulo 2π.


Secondly, let ϕ be a boost in the (t, x)-plane with hyperbolic angle φ and consider the subset {z > −t} of (M, g). Identifying points under ϕ we get the Misner space-time with quasi-regular singularities similar to the ones in the Taub-NUT space-time [6,9]. As for the conelike example above, it is straightforward to show that $\Phi^s_{OM}$ is generated by ϕ. More complicated singularities can be constructed by variations of this procedure [5].

7.2. Shell crossing singularities. We return to the Tolman-Bondi space-time from §6.5 to study the shell crossing singularities where R′ = 0. Only some components of the curvature diverge, so all we can hope for is to establish partial degeneracy in some directions. Unfortunately, it turns out that while $\|R\|$ is of order $(R')^{-1}$, $\|\nabla R\|$ is of order $(R')^{-3}$, which prohibits us from using Lemma 3 in this case. Also, higher order derivatives of the Riemann tensor have even stronger divergence. Since the infinitesimal holonomy group is generated by the Riemann tensor and its derivatives, whose norms all diverge, it seems probable that the singular holonomy group is nontrivial. Proving that is impossible with our technique however, since we have no way to control the contributions from higher order terms.

8. Discussion

We have shown that in many cases, the b-boundary has totally degenerate fibres, leading to undesired topological effects. The argument relies on the divergence of the derivative of the Riemann tensor being sufficiently weak, so that the essential contribution to the singular holonomy group comes from R(X, Y). As we saw in Sect. 7.2, this fails in some cases. Since the infinitesimal holonomy group is generated by expressions of the form $\nabla_{V_1 \ldots V_n} \Omega(X, Y)$, it might be possible to use higher order derivatives of the Riemann tensor to generate elements in the singular holonomy group. One would then have to go further in the expansion in the proof of Lemma 2, and the conditions would get much more complicated.

In Sect. 7.1, we gave a simple example of how a quasi-regular singularity can give rise to degenerate fibres. It is very easy to construct examples of quasi-regular singularities with discrete singular holonomy groups, but it is unknown if nondiscrete groups can arise in this way.

The most apparent unsolved problem involving the b-boundary is the structure of the boundary itself. In the FLRW case the boundary has been shown to be a single point [3]. But for the Schwarzschild space-time, the results are not as conclusive. Both Bosshard [1] and Johnson [7] have established partial degeneracy of boundary fibres, but it is unknown if the boundary is just a point or something else (Johnson conjectures that it is a line).

References

1. Bosshard, B.: On the b-Boundary of the Closed Friedman-Model. Commun. Math. Phys. 46, 263–268 (1976)
2. Clarke, C. J. S.: The Singular Holonomy Group. Commun. Math. Phys. 58, 291–297 (1978)
3. Clarke, C. J. S.: The Analysis of Space-Time Singularities. Cambridge: Cambridge University Press, 1993
4. Dodson, C. T. J.: Space-Time Edge Geometry. Int. J. Theor. Phys. 17, no. 6, 389–504 (1978)
5. Ellis, G. F. R., Schmidt, B. G.: Singular Space-Times. Gen. Relativ. Gravitation 8, no. 11, 915–953 (1977)
6. Hawking, S. W., Ellis, G. F. R.: The Large Scale Structure of Space-time. Cambridge: Cambridge Univ. Press, 1973
7. Johnson, R.: The Bundle Boundary in Some Special Cases. J. Math. Phys. 18, 898–902 (1977)
8. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry, vol. I. New York: John Wiley & Sons, 1963
9. Misner, C. W.: Taub-NUT Space as a Counterexample to Almost Anything. In: Ehlers, J. (ed.), Relativity Theory and Astrophysics I: Relativity and Cosmology. Lectures in Applied Mathematics 8. Providence, RI: Am. Math. Soc., 1967, pp. 160–169
10. Misner, C. W., Thorne, K. S., Wheeler, J. A.: Gravitation. New York: W. H. Freeman and Company, 1973
11. Newman, R. P. A. C.: Strengths of Naked Singularities in Tolman-Bondi Spacetimes. Classical Quantum Gravity 3, 527–539 (1986)
12. Schmidt, B. G.: A New Definition of Singular Points in General Relativity. Gen. Relativ. Gravitation 1, no. 3, 269–280 (1971)
13. Schmidt, B. G.: The Local b-Completeness of Space-Times. Commun. Math. Phys. 29, 49–54 (1973)

Communicated by H. Nicolai

Commun. Math. Phys. 208, 355 – 379 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On Hypergeometric Functions Connected with Quantum Cohomology of Flag Spaces

Vadim Schechtman*

Max-Planck-Institut für Mathematik, Gottfried-Claren-Strasse 26, 53225 Bonn, Germany

* Present address: I.H.E.S., 35, route de Chartres, 91440 Bures-sur-Yvette, France.

Received: 20 December 1997 / Accepted: 2 July 1999

Abstract: The Givental recursion relations for hypergeometric series associated with equivariant quantum cohomology are computed for flag varieties G/B. A simple formula for these functions is obtained in the case G = SL(3).

Introduction

In Givental's work on the Gromov–Witten invariants for projective complete intersections, [G1], the principal role is played by certain formal power series connected with the quantum cohomology of a manifold. One has a manifold X with a natural torus action, with a finite number of fixed points, and one has a power series $Z_w$ associated with each fixed point $x_w$. The coefficients of these series are certain integrals over the spaces of stable maps of genus 0 curves with two marked points to X. These series form a fundamental system of solutions of a certain lisse D-module on a power of the punctured disk. The small quantum cohomology of X coincides with the algebra of functions on its characteristic variety. The series $Z_w$ are uniquely determined by certain recursion relations relating $Z_w$ with all $Z_{w'}$ if $x_w$ is connected with $x_{w'}$ by a fixed line.

The present work consists of three parts. The first part contains nothing new. Here we present Givental's computations from [G1], in the simplest case of a projective space, in more detail than in [G1].

In the second part, we write down the above recursion relations for the flag spaces X = G/B (G being a simple algebraic group), see Theorem II.3.8, which is the main result of this paper. Here X is equipped with the natural action of the maximal torus of G.

In the work [G2], Givental gave another beautiful set of relations which also determines the above mentioned series completely. Namely, these are Toda lattice differential equations (more precisely, we need the equivariant version of the results of [G2]). This set of relations has a completely different nature, and it is a highly non-trivial fact that both sets of relations determine the same series. In the third part, we check this by a direct computation for G = SL(3). It turns out that in this case the series $Z_w$ admit a nice explicit expression, see III.1.2, III.2.2. (A posteriori it is not surprising, since in this case X admits the Plücker embedding into $P^1 \times P^1$, and one can use another computation by Givental, dealing with the toric complete intersections.)

Part I. Projective Spaces

1. Equivariant Cohomology of $P^n$

1.1. Let X denote the n-dimensional projective space $P^n = \{(z_0 : \ldots : z_n)\}$, the space of lines $L \subset V = \mathbb{C}^{n+1}$. The torus $T = (\mathbb{C}^*)^{n+1}$ acts on X by the rule $(\alpha_0, \ldots, \alpha_n) \cdot (z_0 : \ldots : z_n) = (\alpha_0 z_0 : \ldots : \alpha_n z_n)$; this action has n + 1 fixed points $x_i = (0 : \ldots 0 : 1 : 0 \ldots 0)$ (one on the i-th place), i = 0, ..., n. Let $\mathcal{L}$ denote the line bundle over X whose fiber over L ⊂ V is L; $\mathcal{L}$ has an obvious T-equivariant structure. Let t denote the Lie algebra of T, $t^*_{\mathbb{Z}} := \mathrm{Hom}(T, \mathbb{C}^*) \subset t^*$ the lattice of characters. For $\lambda \in t^*_{\mathbb{Z}}$, let $L_\lambda$ be the T-equivariant line bundle over the point, equal to $\mathbb{C}$, with T acting by means of the character λ. Assigning to λ the first Chern class $c_1(L_\lambda) \in H^2_T(pt)$, we identify the graded ring $A := H^*_T(pt)$ with $\mathbb{C}[t^*_{\mathbb{Z}}] = \mathbb{C}[\lambda_0, \ldots, \lambda_n]$, $\lambda_i$ being the projection on the i-th factor.

The graded A-algebra $R := H^*_T(X)$ is identified with $\mathbb{C}[p, \lambda_0, \ldots, \lambda_n] / \big( \prod_i (p - \lambda_i) \big)$, where $p := c_1(\mathcal{L}) \in H^2_T(X)$. It is computed using Bott's fixed point theorem, [AB]. Since our cohomologies will be even anyway, it is convenient to use the grading of the rings A, R, etc. by assigning to p and $\lambda_i$ degree 1.

1.2. The Euler classes of the tangent spaces at the fixed points $x_i$ are equal to

$$e_i := e(T_{X; x_i}) = \prod_{b \ne i} (\lambda_i - \lambda_b) \in H^{2n}_T(pt) = A^n. \tag{1.1}$$

Let us consider the restriction map $i_b^* : H^*_T(X) \to H^*_T(x_b)$. We have $i_b^*(p) = i_b^*(c_1(\mathcal{L})) = c_1(\mathcal{L}_{x_b})$. The fiber $\mathcal{L}_{x_b}$ is the line $\mathbb{C} \cdot (0, \ldots, 1, \ldots 0) \subset \mathbb{C}^{n+1}$ (1 on the b-th place); the Lie algebra t acts on this line by means of the character $\lambda_b$. Therefore, $i_b^*(p) = \lambda_b$, whence

$$i_b^*(f(p)) = f(\lambda_b). \tag{1.2}$$

Let A′ be the ring obtained from A by inverting all elements $e_i$, i.e. by inverting all the differences $\lambda_a - \lambda_b$ (a ≠ b). Bott's theorem says that the restriction map

$$i^* : H^*_T(X) \to H^*_T(X^T) = \bigoplus_{b=0}^{n} A \cdot 1_{x_b}$$

becomes an isomorphism after the base change A → A′. We denote $R' = R \otimes_A A'$. Let us introduce the elements

$$\phi_i(p) := \prod_{b \ne i} (p - \lambda_b) \in R^n. \tag{1.3}$$

Obviously,

$$\phi_i(\lambda_j) = e_i \delta_{ij}. \tag{1.4}$$

It follows from Bott's theorem that the set

$$\{ \phi_i(p) / e_i^{1/2};\; i = 0, \ldots, n \} \tag{1.5}$$

is the basis of orthonormal idempotents of the algebra R′ (to be precise, we should adjoin the square roots of $e_i$ to R′).

One can express this in a slightly different way. We have the integration map

$$\int_X : R \to A$$

of degree −2n, given by

$$\int_X f(p) = \sum_i \operatorname{res}_{p = \lambda_i} \frac{f(p)\, dp}{\prod_b (p - \lambda_b)} = \sum_i \frac{f(\lambda_i)}{e_i}. \tag{1.6}$$

We have the Poincaré pairing $\langle \cdot, \cdot \rangle : R \times R \to A$,

$$\langle f, g \rangle = \int_X f g.$$

We have

$$\langle f, \phi_i \rangle = f(\lambda_i), \tag{1.7}$$

hence

$$\langle \phi_i, \phi_j \rangle = e_i \delta_{ij}. \tag{1.8}$$

For each f ∈ R,

$$f = \sum_i \frac{f(\lambda_i)}{e_i}\, \phi_i. \tag{1.9}$$

Note that the rhs a priori lies in R′ but in fact it belongs to R ⊂ R′, since the lhs does.
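Formulas (1.4) and (1.6)-(1.9) lend themselves to a quick sanity check with exact rational arithmetic. The sketch below specialises the weights λ_i to rational numbers, which is an assumption made only for illustration (in the text they are formal variables), and represents a class f(p) by a Python function of p. All names are hypothetical.

```python
from fractions import Fraction
from math import prod


def euler_class(i, lam):
    """e_i = prod_{b != i} (lam_i - lam_b), cf. (1.1)."""
    return prod(lam[i] - lam[b] for b in range(len(lam)) if b != i)


def phi(i, lam):
    """phi_i as a function of p, cf. (1.3)."""
    return lambda p: prod(p - lam[b] for b in range(len(lam)) if b != i)


def integrate(f, lam):
    """Fixed point form of (1.6): sum over i of f(lam_i) / e_i."""
    return sum(Fraction(f(lam[i])) / euler_class(i, lam) for i in range(len(lam)))


def pairing(f, g, lam):
    """Poincare pairing <f, g> = integral of f * g."""
    return integrate(lambda p: f(p) * g(p), lam)


def expand(f, lam):
    """Right-hand side of (1.9), returned as a function of p."""
    return lambda p: sum(
        Fraction(f(lam[i])) / euler_class(i, lam) * phi(i, lam)(p)
        for i in range(len(lam))
    )


if __name__ == "__main__":
    lam = [Fraction(1), Fraction(2), Fraction(5)]  # n = 2, sample weights
    for i in range(3):
        print([pairing(phi(i, lam), phi(j, lam), lam) for j in range(3)])
    print([integrate(lambda p, k=k: p ** k, lam) for k in range(4)])
```

With these sample weights the pairing matrix is diagonal with entries e_i = 4, -3, 12, in agreement with (1.8). The integrals of 1, p, p² and p³ are 0, 0, 1 and λ_0 + λ_1 + λ_2 = 8: classes of degree below n integrate to zero and pⁿ integrates to 1, as expected on P².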


2. Partition Function

2.1. Let $X_d$ (d ≥ 0) denote the stack of stable maps $\{f : (C; y_1, y_2) \to X\}$ of genus 0 curves with two marked points, such that $f_*([C]) = d \cdot \beta$, where $\beta \in H_2(X)$ is the generator dual to p. It is a Deligne–Mumford stack for d ≥ 1. Let $L_1$ be the line bundle over $X_d$ whose fiber at a point (f, ...) is the tangent space $T_{C; y_1}$; denote $c(d) = c_1(L_1) \in H^2_T(X_d)$. We want to calculate the following formal power series:

$$Z(q, p) = Z(q, p, \lambda, h) = 1 + \sum_{d \ge 1} q^d\, e_{1*} \Big( \frac{1}{h + c(d)} \Big). \tag{2.1}$$

Here

$$e_1 : X_d \to X$$

is the evaluation map sending (f, ...) to $f(y_1)$. This is the same as to compute the series

$$Z_i(q) = \langle Z(q), \phi_i \rangle = \sum_d q^d \int_X \phi_i\, e_{1*} \Big( \frac{1}{h + c(d)} \Big) = \sum_d q^d \int_{X_d} \frac{e_1^* \phi_i}{h + c(d)} \tag{2.2}$$

(i = 0, ..., n). First, let us formulate the answer.

2.2. Define the series

$$S(q, p) = \sum_{d \ge 0} \frac{1}{\prod_{b=0}^{n} \prod_{m=1}^{d} (p - \lambda_b + mh)} \cdot q^d \tag{2.3}$$

and

$$S_i(q) := S(q, \lambda_i) = \sum_{d \ge 0} \frac{1}{d! \prod_{b \ne i} \prod_{m=1}^{d} (\lambda_i - \lambda_b + mh)} \cdot \frac{q^d}{h^d}. \tag{2.4}$$

Theorem 2.1. (a) Z(q, p) = S(q, p). (b) For all i = 0, ..., n, $Z_i(q) = S_i(q)$.

Of course, (a) and (b) are equivalent, due to the remarks of the previous section. The theorem will be proven in Sect. 5, after preliminaries in Sects. 3, 4.

2.3. Dimension count. The dimension of $X_d$ is equal to

$$\dim(X_d) = nd + n + d - 1. \tag{2.5}$$

Indeed, one sees easily that the dimension of the space of maps $P^1 \to P^n$ of degree d is equal to (d + 1)(n + 1) − 1. To get the dimension of $X_d$, we have to subtract from this number 3 (reparametrizations of $P^1$) and add 2 (marked points). The theorem says that for each d ≥ 0,

$$e_{1*} \Big( \frac{1}{h + c(d)} \Big) = \frac{1}{\prod_{b=0}^{n} \prod_{m=1}^{d} (p - \lambda_b + mh)}. \tag{2.6}$$


Let us assign to the variable h the degree 1. The map $e_{1*}$ decreases the degree by $\dim(X_d) - \dim(X) = nd + d - 1$. Therefore, the degree of the lhs of (2.6) is equal to −1 − (nd + d − 1) = −(n + 1)d, which equals the degree of the rhs. Note that this is true for d = 0 as well (the map $e_{1*}$ has a positive degree 1 in this case!).

3. Recursion Relation

In this section we will study the series S(q, p) and $S_i(q)$.

3.1. As a first remark, note that the normalized series

$$S_i^{norm} := \frac{\langle S, \phi_i \rangle}{\langle \phi_i, \phi_i \rangle} \tag{3.1}$$

may be written as

$$S_i^{norm}(q) = \sum_{d \ge 0} \frac{1}{\prod_{b \ne i} \prod_{m=0}^{d} (\lambda_i - \lambda_b + mh)} \cdot \frac{q^d}{d!\, h^d} = \sum_{d \ge 0} \frac{1}{\prod_{b=0}^{n} \prod_{m=0;\, (b,m) \ne (i,0)}^{d} (\lambda_i - \lambda_b + mh)} \cdot q^d. \tag{3.2}$$

3.2. Define the series

$$s_i(q) := S_i(qh) = \sum_{d \ge 0} b_i(d)\, q^d, \tag{3.3}$$

where

$$b_i(d) = b_i(d, \lambda, h) = \frac{1}{d! \prod_{j \ne i} \prod_{m=1}^{d} (\lambda_i - \lambda_j + mh)}. \tag{3.4}$$

The next theorem is the main result about our series.

Theorem 3.1. For each i = 0, ..., n, we have

$$s_i(q, h) = 1 + \sum_{k > 0} \sum_{j \ne i} \frac{c_i^j(k) \cdot q^k}{\lambda_i - \lambda_j + kh} \cdot s_j \Big( q; \frac{\lambda_j - \lambda_i}{k} \Big), \tag{3.5}$$

where

$$c_i^j(k) = \frac{1}{k! \prod_{b \ne i} \prod_{m=1;\, (b,m) \ne (j,k)}^{k} \Big( \lambda_i - \lambda_b + \dfrac{m(\lambda_j - \lambda_i)}{k} \Big)}. \tag{3.6}$$

The relations (3.5), (3.6) uniquely determine the series $s_i$.

The theorem is a variant of a simple fractions decomposition. We use the following elementary fact.


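Lemma 3.2. Let f(h) be a non-constant polynomial with distinct roots $\alpha_1, \ldots, \alpha_N$. Then

$$\frac{1}{f(h)} = \sum_{k=1}^{N} \frac{1}{(h - \alpha_k) f'(\alpha_k)}. \tag{3.7}$$

We have

$$f'(\alpha_k) = \frac{f(h)}{h - \alpha_k} \bigg|_{h = \alpha_k}. \tag{3.8}$$

Indeed, the difference of the rhs and lhs of (3.7) does not have singularities on $h \in P^1$, hence it is a constant; but the value of both sides at ∞ is equal to 0, hence they are equal. The formula (3.8) is evident. □

Now let us apply this to the coefficients $b_i(d)$, (3.4); we get

$$b_i(d) = \frac{1}{d!} \sum_{j \ne i} \sum_{k=1}^{d} \frac{1}{\lambda_i - \lambda_j + kh} \cdot \frac{1}{\prod_{b \ne i} \prod_{m=1;\, (b,m) \ne (j,k)}^{d} \Big( \lambda_i - \lambda_b + m \cdot \dfrac{\lambda_j - \lambda_i}{k} \Big)}. \tag{3.9}$$

Let us split the product in the denominator into two parts:

$$\prod_{m=1}^{d} = \prod_{m=1}^{k} \cdot \prod_{m=k+1}^{d}.$$

The second product is equal to

$$\prod_{m=k+1}^{d} := \prod_{b \ne i} \prod_{m=k+1}^{d} \Big( \lambda_i - \lambda_b + m\, \frac{\lambda_j - \lambda_i}{k} \Big) = \prod_{b \ne i} \prod_{m'=1}^{d-k} \Big( \lambda_i - \lambda_b + \lambda_j - \lambda_i + m' \cdot \frac{\lambda_j - \lambda_i}{k} \Big) = \prod_{b \ne i} \prod_{m'=1}^{d-k} \Big( \lambda_j - \lambda_b + m' \cdot \frac{\lambda_j - \lambda_i}{k} \Big).$$

Hence

$$\frac{1}{\prod_{m=k+1}^{d}} = \frac{d!}{k!}\, b_j \Big( d - k, \frac{\lambda_j - \lambda_i}{k} \Big). \tag{3.10}$$

Therefore,

$$b_i(d, h) = \sum_{j \ne i} \sum_{k=1}^{d} \frac{1}{\lambda_i - \lambda_j + kh} \cdot \frac{b_j \big( d - k, \frac{\lambda_j - \lambda_i}{k} \big)}{k! \prod_{b \ne i} \prod_{m=1;\, (b,m) \ne (j,k)}^{k} \Big( \lambda_i - \lambda_b + m \cdot \dfrac{\lambda_j - \lambda_i}{k} \Big)} = \sum_{j \ne i} \sum_{k=1}^{d} \frac{1}{\lambda_i - \lambda_j + kh} \cdot c_i^j(k) \cdot b_j \Big( d - k, \frac{\lambda_j - \lambda_i}{k} \Big). \tag{3.11}$$

Obviously, (3.11) is equivalent to (3.5). This proves (3.5). The uniqueness is obvious since $s_i(0) = 1$, and the recursion relations determine $s_i(q)$ modulo $q^{k+1}$ once we know $s_j(q)$ modulo $q^k$ for all j ≠ i. The theorem is proved. □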
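The identity (3.11) can be tested with exact rational arithmetic. The sketch below specialises λ and h to sample rational values, chosen (as an assumption of the example) so that no denominator vanishes for d ≤ 3, and compares b_i(d, h) from (3.4) with the double sum in (3.11), with c_i^j(k) as in (3.6). The function names are illustrative.

```python
from fractions import Fraction
from math import factorial, prod


def b_coeff(i, d, lam, h):
    """b_i(d, lam, h) from (3.4)."""
    denom = factorial(d) * prod(
        lam[i] - lam[j] + m * h
        for j in range(len(lam)) if j != i
        for m in range(1, d + 1)
    )
    return Fraction(1) / denom


def c_coeff(i, j, k, lam):
    """c_i^j(k) from (3.6); the factor (b, m) = (j, k) is omitted."""
    step = (lam[j] - lam[i]) / k
    denom = factorial(k) * prod(
        lam[i] - lam[b] + m * step
        for b in range(len(lam)) if b != i
        for m in range(1, k + 1)
        if (b, m) != (j, k)
    )
    return Fraction(1) / denom


def recursion_rhs(i, d, lam, h):
    """Right-hand side of (3.11) for d >= 1."""
    total = Fraction(0)
    for j in range(len(lam)):
        if j == i:
            continue
        for k in range(1, d + 1):
            step = (lam[j] - lam[i]) / k  # the value substituted for h in b_j
            total += (
                c_coeff(i, j, k, lam)
                / (lam[i] - lam[j] + k * h)
                * b_coeff(j, d - k, lam, step)
            )
    return total


if __name__ == "__main__":
    lam = [Fraction(2, 3), Fraction(7, 5), Fraction(-11, 13)]  # n = 2, sample weights
    h = Fraction(3, 17)
    for i in range(3):
        for d in range(1, 4):
            print(i, d, b_coeff(i, d, lam, h) == recursion_rhs(i, d, lam, h))
```

Because the arithmetic is exact, agreement here is equality of rational numbers, not agreement up to rounding. The case d = 0 is excluded because it corresponds to the constant term 1 in (3.5).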


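3.3. In order to get a better feeling for what is going on, let us consider some examples. First, the case n = 0 is almost trivial, but gives a nice answer. We have in this case $A = \mathbb{C}[\lambda]$ ($\lambda := \lambda_0$); $R = \mathbb{C}[p, \lambda]/(p - \lambda) = A$;

$$S(q) = S_0(q) = e^{q/h}. \tag{3.12}$$

There are no recursion relations.

3.4. The case n = 1. We have $R = \mathbb{C}[p, \lambda_0, \lambda_1] / ((p - \lambda_0)(p - \lambda_1))$; $\phi_0 = p - \lambda_1$, $\phi_1 = p - \lambda_0$; $e_0 = \lambda_0 - \lambda_1$, $e_1 = \lambda_1 - \lambda_0$. It is convenient to introduce the "root" $\alpha := \lambda_0 - \lambda_1$. We have

$$s_0(q) = s_0(q; \alpha) = \sum_{d \ge 0} \frac{q^d}{d! \prod_{m=1}^{d} (\alpha + mh)}; \tag{3.13}$$

$$s_1(q) = s_1(q; \alpha) = \sum_{d \ge 0} \frac{q^d}{d! \prod_{m=1}^{d} (-\alpha + mh)} = s_0(q; -\alpha). \tag{3.13′}$$

The recursion relations look as follows:

$$s_0(q; \alpha, h) = 1 + \sum_{k > 0} \frac{c(k; \alpha)\, q^k}{\alpha + kh} \cdot s_1 \Big( q; \alpha; -\frac{\alpha}{k} \Big) \tag{3.14}$$

and

$$s_1(q; \alpha, h) = 1 + \sum_{k > 0} \frac{c(k; -\alpha)\, q^k}{-\alpha + kh} \cdot s_0 \Big( q; \alpha; \frac{\alpha}{k} \Big), \tag{3.14′}$$

where

$$c(k; \alpha) = \frac{1}{(k!)^2} \cdot \frac{k^k}{\alpha^{k-1}}. \tag{3.15}$$

The first few values of c(k; α):

$$c(1; \alpha) = 1; \qquad c(2; \alpha) = \frac{1}{\alpha}; \qquad c(3; \alpha) = \frac{3}{4\alpha^2}. \tag{3.16}$$

The relation (3.14′) is obtained from (3.14) by switching α to −α. Now let us make a little computation: start building up the series $s_0$, $s_1$ using (3.14), (3.14′). We have

$$s_0(h) = 1 + \frac{q}{\alpha + h}\, s_1(-\alpha) + \frac{q^2}{\alpha + 2h} \cdot \frac{1}{\alpha} \cdot s_1 \Big( -\frac{\alpha}{2} \Big) + \ldots,$$

$$s_1(h) = 1 + \frac{q}{-\alpha + h}\, s_0(\alpha) + \ldots.$$

Thus,

$$s_0 = 1 + \frac{q}{\alpha + h} + \ldots; \qquad s_1 = 1 + \frac{q}{-\alpha + h} + \ldots,$$

hence

$$s_0 = 1 + \frac{q}{\alpha + h} + \Big( \frac{1}{\alpha + 2h} \cdot \frac{1}{\alpha} + \frac{1}{\alpha + h} \cdot \frac{1}{-2\alpha} \Big) q^2 + \ldots = 1 + \frac{q}{\alpha + h} + \frac{q^2}{2(\alpha + h)(\alpha + 2h)} + \ldots, \tag{3.17}$$

which is the correct answer, up to this order.

3.5. As the last example, assume that n is arbitrary and let us check (3.3), (3.4) up to the first order(sic!) using (3.5), (3.6). We have

$$c_i^j(1) = \frac{1}{\prod_{b \ne i, j} (\lambda_j - \lambda_b)}. \tag{3.18}$$

Therefore,

$$s_i(h) = 1 + \sum_{j \ne i} \frac{q}{\lambda_i - \lambda_j + h} \cdot \frac{1}{\prod_{b \ne i, j} (\lambda_j - \lambda_b)} + \ldots = 1 + \frac{q}{\prod_{j \ne i} (\lambda_i - \lambda_j + h)} + \ldots, \tag{3.19}$$

where we have used the formula

$$\frac{1}{\prod_{j \ne i} (\lambda_i - \lambda_j + h)} = \sum_{j \ne i} \frac{1}{\prod_{b \ne i, j} (\lambda_j - \lambda_b)} \cdot \frac{1}{\lambda_i - \lambda_j + h}. \tag{3.20}$$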
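The small computations of 3.4 and 3.5 can be replayed exactly. The sketch below is an illustration with sample rational values for α, h and λ (assumptions of the example). It checks the values (3.16) of the closed form (3.15), the q² coefficient (3.17) assembled from the k = 1 and k = 2 terms of (3.14), and the partial fraction identity (3.20). The names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def c_closed(k, alpha):
    """c(k; alpha) = k^k / ((k!)^2 alpha^(k-1)), cf. (3.15)."""
    return Fraction(k ** k, factorial(k) ** 2) / alpha ** (k - 1)


def s0_coeff(d, alpha, h):
    """Coefficient of q^d in s_0, cf. (3.13)."""
    return Fraction(1) / (factorial(d) * prod(alpha + m * h for m in range(1, d + 1)))


def s0_q2_from_recursion(alpha, h):
    """q^2 coefficient of s_0 built from (3.14), as in the bracket of (3.17)."""
    # k = 1 term: q/(alpha + h) times the q coefficient of s_1 at h' = -alpha,
    # which is 1/(-alpha + h') = 1/(-2 alpha).
    from_k1 = Fraction(1) / (alpha + h) / (-2 * alpha)
    # k = 2 term: c(2; alpha) q^2 / (alpha + 2h) times the constant term 1 of s_1.
    from_k2 = c_closed(2, alpha) / (alpha + 2 * h)
    return from_k1 + from_k2


def lhs_320(i, lam, h):
    """Left-hand side of (3.20)."""
    return Fraction(1) / prod(lam[i] - lam[j] + h for j in range(len(lam)) if j != i)


def rhs_320(i, lam, h):
    """Right-hand side of (3.20)."""
    n1 = len(lam)
    return sum(
        Fraction(1)
        / prod(lam[j] - lam[b] for b in range(n1) if b not in (i, j))
        / (lam[i] - lam[j] + h)
        for j in range(n1) if j != i
    )


if __name__ == "__main__":
    alpha, h = Fraction(5, 7), Fraction(2, 3)
    print([c_closed(k, alpha) for k in (1, 2, 3)])
    print(s0_q2_from_recursion(alpha, h) == s0_coeff(2, alpha, h))
    lam = [Fraction(1), Fraction(2), Fraction(5), Fraction(-3)]
    print(all(lhs_320(i, lam, h) == rhs_320(i, lam, h) for i in range(4)))
```

The first line of output should reproduce (3.16) at the chosen α, and the two comparisons confirm (3.17) and (3.20) for these sample values.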

4. First Reduction

4.1. We start by proving Theorem 2.3. Let us define the series $z_i$ by

$$ z_i(q) = Z_i(qh) \tag{4.1} $$

(cf. (3.3)). According to Theorem 3.1, in order to prove Theorem 2.1, it suffices to show that $z_i(q)$ satisfy the relations (3.5).

4.2. Let us define the coefficients $B_i$ by

$$ S_i(q) = \sum_{d} B_i(d)\, q^d \tag{4.2} $$

(cf. (2.4)). Thus, we have

$$ B_i(d) = \frac{b_i(d)}{h^d} = \frac{1}{\prod_{b=0}^{n} \prod_{m=1}^{d} (\lambda_i - \lambda_b + mh)}. \tag{4.3} $$

As we have already noted, (3.5) is equivalent to the identities

$$ b_i(d, h) = \sum_{j \ne i} \sum_{k=1}^{d} \frac{c_i^j(k)}{\lambda_i - \lambda_j + kh} \cdot b_j\Bigl(d - k, \frac{\lambda_j - \lambda_i}{k}\Bigr) \tag{4.4} $$

or

$$ b_i(d, h) = \sum_{j \ne i} \sum_{k=1}^{d} \frac{c_i^j(k)}{\lambda_i - \lambda_j + kh} \cdot \Bigl(\frac{\lambda_j - \lambda_i}{k}\Bigr)^{d-k} \cdot B_j\Bigl(d - k, \frac{\lambda_j - \lambda_i}{k}\Bigr). \tag{4.5} $$

If we assign to $\lambda_i$ and h the degree 1, then we have

$$ \deg b_i(d) = -dn; \quad \deg B_i(d) = -dn - d; \quad \deg c_i^j(k) = -kn + 1, \tag{4.6} $$

and the identities (4.4), (4.5) are homogeneous, cf. 2.4.

4.3. Let us denote

$$ U_i(d) := \int_{X_d} \frac{e_1^* \phi_i}{h + c(d)}. \tag{4.7} $$

Thus,

$$ Z_i(q) = \sum_{d \ge 0} U_i(d)\, q^d. \tag{4.8} $$

We have

$$ U_i(d) = \frac{1}{h} \sum_{a \ge 0} \frac{(-1)^a}{h^a} \int_{X_d} e_1^* \phi_i \cdot c(d)^a. $$

We have $U_i(0) = 1$. Assume now that d ≥ 1. The degree of the integral $\int_{X_d} e_1^* \phi_i \cdot c(d)^a$ is equal to n + a − (nd + n + d − 1) = −nd + a − d + 1, which is less than zero for a < d; hence this integral is zero for these a. Therefore,

$$ U_i(d) = \frac{1}{h} \sum_{a \ge d} \frac{(-1)^a}{h^a} \int_{X_d} e_1^* \phi_i \cdot c(d)^a = \frac{1}{h^d} \int_{X_d} \frac{e_1^* \phi_i \cdot (-c(d))^d}{h + c(d)}. $$

Let us denote

$$ u_i(d) := \int_{X_d} \frac{e_1^* \phi_i \cdot (-c(d))^d}{h + c(d)} \tag{4.9} $$

(for d = 0 we set $u_i(0) := 1$). Thus,

$$ z_i(q) = \sum_{d \ge 0} u_i(d)\, q^d; \quad u_i(d) = U_i(d)\, h^d. \tag{4.10} $$

We have to prove that $u_i(d) = b_i(d)$. Therefore, Theorem 2.3 is equivalent to

Theorem 4.1. The integrals $u_i(d)$ satisfy the relations

$$ u_i(d, h) = \sum_{j \ne i} \sum_{k=1}^{d} \frac{c_i^j(k)}{\lambda_i - \lambda_j + kh} \cdot u_j\Bigl(d - k, \frac{\lambda_j - \lambda_i}{k}\Bigr). \tag{4.10} $$

This is what we are going to prove in the next section, using Bott’s localization theorem.


5. Fixed Point Formula

5.1. We will compute the integrals $u_i(d)$ (see (4.9)) by means of Bott's fixed point formula. It says that

$$ u_i(d) = \sum_{P} u_i(P), \tag{5.1} $$

the summation running over all connected components $P \subset X_d^T$. Here $u_i(P)$ denotes the integral

$$ u_i(P) := \int_{P} \left. \left( \frac{e_1^* \phi_i \cdot (-c(d))^d}{h + c(d)} \right) \right|_{P} \cdot \frac{1}{e(N_{P/X_d})}. \tag{5.2} $$

Here N denotes the normal bundle, e the Euler (top Chern) class.

What do the connected components look like (cf. [K])? Let $l_{ij} \subset X$ denote the straight line connecting the points $x_i$ and $x_j$. These are the curves in X stable under the action of T. A point in $X_d^T$ is a stable map

$$ f : (C; y_1, y_2) \longrightarrow X \tag{5.3} $$

such that $f(y_i) \in X^T = \{x_0, \dots, x_n\}$ and each irreducible component $C_1 \subset C$ is mapped either to one of the points $x_i$ (in this case we call $C_1$ vertical), or to one of the lines $l_{ij}$ (in this case we call $C_1$ horizontal). The map

$$ f|_{C_1} : C_1 = \mathbb{P}^1 \longrightarrow l_{ij} \tag{5.4} $$

is a finite covering ramified at points $x_i$ and $x_j$. The sum of the degrees of these coverings over all horizontal $C_1$ should be equal to d. The connected component P to which the point (5.3) belongs is specified by the combinatorial data: which irreducible components of C are vertical or horizontal, and the degrees of the coverings (5.4) for horizontal components.

5.2. Let us consider the integral $u_i(P)$ (5.2). Let f as in (5.3) be a point in P. We have

$$ e_1^* \phi_i |_{P} = \phi_i(\lambda_j) \tag{5.5} $$

if $f(y_1) = x_j$. Therefore, $u_i(P)$ may be nonzero only if $f(y_1) = x_i$. We will suppose this is the case from now on. Let $C_1 \subset C$ be the irreducible component containing $y_1$.

Claim. If $u_i(P)$ is nonzero then $C_1$ is horizontal.

Indeed, suppose that $C_1$ is vertical. Let us call a special point a marked point or a point of intersection of two irreducible components. The connected component P has the form $M_{s+1} \times ?$, where $M_{s+1}$ is the Deligne–Mumford moduli stack of genus 0 curves with s + 1 marked points, this whole component mapping to $x_i$. A generic curve in $M_{s+1}$ contains the marked point $y_1$, maybe the marked point $y_2$, and s or s − 1 points of intersection with horizontal curves, depending on whether it contains $y_2$ or not; s special points altogether. We have $\dim(M_{s+1}) = s - 2$. Since the total degree of f is d, the number of horizontal components does not exceed d; therefore, s − 1 ≤ d, hence $\dim(M_{s+1}) < d$. Consequently,

$$ c(d)^d |_{P} = 0, \tag{5.6} $$

i.e. $u_i(P) = 0$. The claim is proven. □


5.3. Now let us consider the component P containing a stable curve

$$ (f : C = C_1 \cup C_2 \longrightarrow X) \in P \subset X_d^T, \tag{5.7} $$

where $C_1$ is the irreducible component containing the marked point $y_1$, which is mapped with multiplicity k onto the line $l_{ij}$, and $C_2$ is all the rest. The map

$$ f_1 := f|_{C_1} : C_1 = \mathbb{P}^1 \longrightarrow l_{ij} \tag{5.8} $$

is the k-fold covering ramified only over two points $x_i$ and $x_j$, where 1 ≤ k ≤ d. The map

$$ f_2 := f|_{C_2} : C_2 \longrightarrow X \tag{5.9} $$

belongs to a connected component $P_2$ of the fixed point space $X_{d-k}^T$.

We want to compute the integral (5.2). Let us compute the terms under the integral. As we have already noted,

$$ e_1^* \phi_i |_{P} = \phi_i(\lambda_i) = e_i = \prod_{b \ne i} (\lambda_i - \lambda_b). \tag{5.10} $$

We have

$$ c(d) |_{P} = \frac{\lambda_i - \lambda_j}{k}. \tag{5.11} $$

5.4. Normal bundle. The most laborious job is to compute the Euler class of the normal bundle over P. We use the Kontsevich formula (cf. [K], 3.3.1): the class of $N_{P/X_d}$ in the Grothendieck group of T-equivariant bundles over P is equal to

$$ [N_{P/X_d}] = [H^0(C_1; f_1^* T_X)_{0 \text{ at } y}] - [H^0(C_1; T_{C_1})_{0 \text{ at } y}] + [T_{y;C_1} \otimes T_{y;C_2}] + [T_{y_1;C_1}] + [N_{P_2/X_{d-k}}] \tag{5.12} $$

(we use the notations describing a bundle by the fiber at a point f). We have

$$ [H^0(C_1; f_1^* T_X)_{0 \text{ at } y}] = [H^0(C_1; f_1^* T_X)] - [(f_1^* T_X)_y] \tag{5.13} $$

and

$$ [H^0(C_1; T_{C_1})_{0 \text{ at } y}] = [H^0(C_1; T_{C_1})] - [T_{y;C_1}]. \tag{5.14} $$

Therefore,

$$ [N_{P/X_d}] = \bigl([H^0(C_1; f_1^* T_X)] - [(f_1^* T_X)_y]\bigr) + \bigl(-[H^0(C_1; T_{C_1})] + [T_{y_1;C_1}] + [T_{y;C_1}]\bigr) + [T_{y;C_1} \otimes T_{y;C_2}] + [N_{P_2/X_{d-k}}]. \tag{5.15} $$

We have

$$ e([T_{y_1;C_1}]) = \frac{\lambda_i - \lambda_j}{k}; \quad e([T_{y;C_1}]) = \frac{\lambda_j - \lambda_i}{k} \tag{5.16} $$

and

$$ e([H^0(C_1; T_{C_1})]) = \frac{\lambda_j - \lambda_i}{k} \cdot [0] \cdot \frac{\lambda_i - \lambda_j}{k}, \tag{5.17} $$

so that the second bracket in (5.15) gives simply −[0]. All zeros in this game will cancel out in the final expression for $e(N_{P/X_d})$ (see (5.22) below)!


Lemma 5.1. We have

$$ e([H^0(C_1; f_1^* T_X)]) = \prod_{b=0}^{n} \prod_{m=0}^{k} \Bigl( \frac{k - m}{k} \lambda_i + \frac{m}{k} \lambda_j - \lambda_b \Bigr) \Big/ [0]. \tag{5.18} $$

Note that in the product there are two factors equal to [0]: they correspond to the values (m, b) = (0, i) or (k, j). One of these zeros is cancelled.

Proof. We have the exact sequence of vector bundles over $X = \mathbb{P}^n$:

$$ 0 \longrightarrow O_X \longrightarrow V^* \otimes O_X(1) \longrightarrow T_X \longrightarrow 0. \tag{5.19} $$

We have $H^0(C_1; f_1^*(V^* \otimes O_X(1))) = H^0(C_1; f_1^* O_X(1)) \otimes V^*$. The Lie algebra t acts on $H^0(C_1; f_1^* O_X(1))$ by the characters

$$ \frac{k - m}{k} \lambda_i + \frac{m}{k} \lambda_j; \quad m = 0, \dots, k, $$

and on $V^*$ by the characters $-\lambda_b$, b = 0, ..., n. This implies the lemma. □

The remaining zero in (5.18) will cancel out with the zero from (5.17).

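5.5. We have

$$ e([(f_1^* T_X)_y]) = e([T_{x_j;X}]) = e_j = \prod_{b \ne j} (\lambda_j - \lambda_b). \tag{5.20} $$

Finally, we have

$$ e([T_{y;C_1}] \otimes [T_{y;C_2}]) = \frac{\lambda_j - \lambda_i}{k} + c(d - k). \tag{5.21} $$

Combining everything together, we have proven

Lemma 5.2. We have

$$ e(N_{P/X_d}) = \frac{\prod_{b=0}^{n} \prod_{m=0;\ (b,m) \ne (i,0),(j,k)}^{k} \bigl( \frac{k-m}{k} \lambda_i + \frac{m}{k} \lambda_j - \lambda_b \bigr)}{e_j} \cdot \Bigl( \frac{\lambda_j - \lambda_i}{k} + c(d - k) \Bigr) \cdot e(N_{P_2;X_{d-k}}). \tag{5.22} $$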


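5.6. Now we can plug our computations in (5.2). We get

$$ u_i(P) = \frac{1}{k} \cdot \frac{e_i \cdot \bigl( \frac{\lambda_j - \lambda_i}{k} \bigr)^d}{h + \frac{\lambda_i - \lambda_j}{k}} \cdot \frac{e_j}{\prod_{b=0}^{n} \prod_{m=0;\ (b,m) \ne (i,0),(j,k)}^{k} \bigl( \frac{k-m}{k} \lambda_i + \frac{m}{k} \lambda_j - \lambda_b \bigr)} \cdot \int_{P_2} \frac{1}{c(d - k) + \frac{\lambda_j - \lambda_i}{k}} \cdot \frac{1}{e(N_{P_2/X_{d-k}})}. \tag{5.23} $$

The overall factor 1/k is due to the group of automorphisms of the covering $f_1$, having order k. In the big product, the terms with m = 0 give the contribution $e_i$, and the terms with b = i give the contribution

$$ \prod_{m=1}^{k} \Bigl( -\frac{m}{k} \lambda_i + \frac{m}{k} \lambda_j \Bigr) = k! \Bigl( \frac{\lambda_j - \lambda_i}{k} \Bigr)^k. \tag{5.24} $$

The integral, taken together with the factor $e_j$, is by definition $U_j(P_2)$ evaluated at $h = (\lambda_j - \lambda_i)/k$ (remember that $P_2$ is the connected component of the smaller space $X_{d-k}$!). Thus, we get

$$ u_i(P) = \frac{1}{kh + \lambda_i - \lambda_j} \cdot \frac{1}{k! \prod_{b \ne i} \prod_{m=1;\ (b,m) \ne (j,k)}^{k} \bigl( \lambda_i - \lambda_b + \frac{m(\lambda_j - \lambda_i)}{k} \bigr)} \cdot \Bigl( \frac{\lambda_j - \lambda_i}{k} \Bigr)^{d-k} \cdot U_j(P_2) = \frac{c_i^j(k)}{kh + \lambda_i - \lambda_j} \cdot \Bigl( \frac{\lambda_j - \lambda_i}{k} \Bigr)^{d-k} \cdot U_j(P_2). \tag{5.25} $$

This implies formula (4.10) (cf. 4.5). Theorem 4.1, and hence Theorem 2.1, are proven. □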
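The identities (4.4) can also be tested numerically, independently of the geometric proof. The Python sketch below is an illustration under stated assumptions: it takes $b_i(d, h)$ from (4.3) (so that $b_i(d, h) = 1/(d! \prod_{b \ne i} \prod_{m=1}^{d} (\lambda_i - \lambda_b + mh))$), reads the constants $c_i^j(k)$ off the first line of (5.25), and uses hypothetical sample values for the λ's and h with n = 2. The function names are ours.

```python
from fractions import Fraction
from math import factorial


def b_coeff(lam, i, d, h):
    """b_i(d, h) from (4.3); the b = i factors contribute d! h^d."""
    prod = Fraction(1)
    for bb in range(len(lam)):
        if bb == i:
            continue
        for m in range(1, d + 1):
            prod *= lam[i] - lam[bb] + m * h
    return 1 / (factorial(d) * prod)


def c_coeff(lam, i, j, k):
    """c_i^j(k) as read off from (5.25) (assumed normalization)."""
    step = (lam[j] - lam[i]) / k
    prod = Fraction(1)
    for bb in range(len(lam)):
        if bb == i:
            continue
        for m in range(1, k + 1):
            if (bb, m) != (j, k):  # skip the single vanishing factor
                prod *= lam[i] - lam[bb] + m * step
    return 1 / (factorial(k) * prod)


def recursion_rhs(lam, i, d, h):
    """Right-hand side of (4.4) for d >= 1."""
    total = Fraction(0)
    for j in range(len(lam)):
        if j == i:
            continue
        for k in range(1, d + 1):
            h_new = (lam[j] - lam[i]) / k
            total += (c_coeff(lam, i, j, k) / (lam[i] - lam[j] + k * h)
                      * b_coeff(lam, j, d - k, h_new))
    return total


# Sample data (assumed): n = 2, generic rational values.
lam = [Fraction(0), Fraction(1), Fraction(10, 3)]
h = Fraction(2, 11)
for i in range(3):
    for d in range(1, 4):
        assert recursion_rhs(lam, i, d, h) == b_coeff(lam, i, d, h)
```

The sample values are chosen so that no denominator vanishes; for special values of the λ's the individual terms can have poles even though the sum is regular.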

Part II. Flag Spaces

1. Equivariant Cohomology of Flag Spaces

1.1. Let $V = \mathbb{C}^{n+1}$. Let X be the variety of complete flags of linear subspaces $0 \subset V_1 \subset \dots \subset V_n \subset V$, $\dim(V_i) = i$. We set

$$ D := \dim(X) = n + (n - 1) + \dots + 1 = \frac{n(n+1)}{2}. \tag{1.1} $$

The torus $T = (\mathbb{C}^*)^{n+1}$ acts on V. Namely, if $v_0, \dots, v_n$ is the standard basis of V, we put

$$ \alpha \circ \sum z_i v_i = \sum \alpha_i z_i v_i, \quad \alpha = (\alpha_i) \in T. $$

By functoriality, T acts on X. We denote by t the Lie algebra of T, and denote by $\lambda_i \in t^*$ the character which is the projection on the i-th component.


1.2. Fixed points. Let $F = (V_1 \subset \dots \subset V_n)$ be a flag fixed under the action of T. Since $T V_1 = V_1$, there exists $i_0$ such that $V_1 = \mathbb{C} \cdot e_{i_0}$. Taking the quotient, we get the flag $F' = (V_2/V_1 \subset \dots \subset V_n/V_1)$ in the space $V' = V/V_1$, fixed under the action of the torus $T' = T/(\mathbb{C}^*)_{i_0}$. By induction on n, we conclude that there exists a unique permutation $(i_0, \dots, i_n)$ of the set {0, ..., n} such that $V_k$ is spanned by $e_{i_0}, \dots, e_{i_{k-1}}$ (k = 1, ..., n).

We shall denote the group of all bijections $w : \{0, \dots, n\} \xrightarrow{\sim} \{0, \dots, n\}$ by W, and we shall identify such a bijection with the permutation (w(0), ..., w(n)). The previous discussion identifies the set of fixed points $X^T$ with W: to an element $w \in W$ corresponds the flag

$$ x_w = (\mathbb{C} \cdot e_{w(0)} \subset \dots \subset \oplus_{i=0}^{k} \mathbb{C} \cdot e_{w(i)} \subset \dots \subset V) \in X^T. \tag{1.2} $$

Tangent spaces. As a t-module, the tangent space $T_w := T_{X;x_w}$ at the point $x_w$ has the set of characters $\{\lambda_{i_p} - \lambda_{i_q},\ 0 \le p < q \le n\}$. Hence, its Euler class is given by

$$ e_w := e(T_w) = \prod_{p < q} (\lambda_{i_p} - \lambda_{i_q}). \tag{1.3} $$

Fixed lines. Given a permutation $w = (i_0, \dots, i_n)$ and an integer p (1 ≤ p ≤ n), let $s_p w$ be the permutation with $i_p$ and $i_{p-1}$ transposed, the other entries remaining in place. If $x_w = (V_1 \subset \dots \subset V_n)$, let $\ell_{w, s_p w}$ be the projective line inside X consisting of all flags of the form $V_1 \subset \dots \subset V_{p-1} \subset V' \subset V_{p+1} \subset \dots \subset V$, where all the subspaces $V_i$ are fixed, and $V'$ is varying. This line is fixed under the action of T, and contains exactly two fixed points: $x_w$ and $x_{s_p w}$.

However, these are not all fixed lines. In fact, the fixed lines passing through one fixed point correspond to all positive roots (and we have just described the lines corresponding to the simple roots), cf. [BGG]. Consider the case w = id (identity permutation); so $x_{id}$ is the standard flag, with $V_i$ spanned by $\{e_0, \dots, e_{i-1}\}$. Given p < q, let $s_{pq} \in W$ be the permutation of p and q. For a matrix $A = (a_{ij}) \in GL(2)$, set

$$ e_{p;A} = a_{11} e_p + a_{21} e_q; \quad e_{q;A} = a_{12} e_p + a_{22} e_q. \tag{1.4} $$

Let $x_A$ be the flag with the spaces $V_{i;A}$ spanned by $e_{0;A}, \dots, e_{i-1;A}$, where $e_{i;A} = e_i$ for $i \ne p, q$. When A runs through GL(2), the flags $x_A$ form the projective line $\ell_{e; s_{pq}} = GL(2)/B$, stable under the action of T, and passing through the fixed points $x_e$ and $x_{s_{pq}}$. In a similar manner, one defines the T-stable line $\ell_{w; s_{pq} w}$ passing through $x_w$ and $x_{s_{pq} w}$.

1.3. Cohomology. Let $L_i$ be the line bundle over X whose fiber over a flag $V_1 \subset \dots \subset V_n$ is equal to $V_{i+1}/V_i$; let $u_i \in H^2(X)$ be its first Chern class. The cohomology algebra of X is equal to

$$ H^*(X) = \mathbb{C}[u_0, \dots, u_n] / (\sigma_0(u), \dots, \sigma_n(u)). \tag{1.5} $$


Here $\sigma_i(u)$ are the elementary symmetric functions defined by the rule

$$ \prod_{i=0}^{n} (Z + u_i) = Z^{n+1} + \sigma_0(u) Z^n + \dots + \sigma_n(u). \tag{1.6} $$

The T-equivariant cohomology $R = H_T^*(X)$ is an $A = H_T^*(pt) = \mathbb{C}[\lambda_0, \dots, \lambda_n]$-algebra isomorphic to

$$ H_T^*(X) = \mathbb{C}[u_0, \dots, u_n; \lambda_0, \dots, \lambda_n] / (\sigma_0(u) - \sigma_0(\lambda), \dots, \sigma_n(u) - \sigma_n(\lambda)). \tag{1.7} $$

For each $w = (i_0, \dots, i_n) \in W$, define an element of $H_T^{2D}(X)$ by

$$ \phi_w = \prod_{p < q} (u_{i_p} - \lambda_{i_q}). \tag{1.8} $$

Let $i_w$ denote the embedding $\{x_w\} \hookrightarrow X$.

Lemma 1.1. We have

$$ i_w^* \phi_{w'} = e_w \cdot \delta_{ww'}. \tag{1.9} $$

Proof. It follows from definitions that $i_w^* u_p = \lambda_{w(p)}$. On the other hand,

$$ \prod_{p < q} (\lambda_{w(p)} - \lambda_q) = 0 \tag{1.10} $$

for $w \ne e$. This implies the lemma. □

Set

$$ A' = A[e_w^{-1}]_{w \in W} = A[(\lambda_p - \lambda_q)^{-1}]_{p < q}. \tag{1.11} $$

Corollary 1.2. The elements $\{\phi_w\}_{w \in W}$ form an A′-basis of the algebra R′. The multiplication in R′ is recovered from the rule

$$ \phi_w \phi_{w'} = e_w \delta_{ww'}. \tag{1.12} $$

Proof. This is a corollary of Bott's fixed point theorem, cf. I.1.2. □

2. Partition Function

2.1. It is convenient to switch to an arbitrary flag space X = G/B, G being the simple simply connected algebraic group associated to a finite root system R. The manifold X is acted upon by T, the maximal torus of G. We identify $H_2(X; \mathbb{Z})$ with the coroot lattice

$$ H_2(X; \mathbb{Z}) = Q = \oplus_{i \in I}\, \mathbb{Z} \cdot \alpha_i^\vee, \tag{2.1} $$

where $\alpha_i^\vee$ are the simple coroots; it contains the submonoid of positive coroots

$$ Q^+ = \oplus_{I}\, \mathbb{N} \cdot \alpha_i^\vee \subset Q. \tag{2.2} $$


The cohomology group $H^2(X; \mathbb{Z})$ is the dual weight lattice

$$ H^2(X; \mathbb{Z}) = P = \mathrm{Char}(T) = \oplus_{I}\, \mathbb{Z} \cdot \omega_i, \tag{2.3} $$

where $\omega_i$ are the fundamental weights; it contains the submonoid of dominant weights

$$ P^+ = \oplus_{I}\, \mathbb{N} \cdot \omega_i \subset P. \tag{2.4} $$

The root system R lies inside P; we denote by $R^+ \subset R$ the subset of positive roots and by $\{\alpha_i\}_{i \in I}$ the set of simple roots. To each $\lambda \in P = \mathrm{Char}(T)$ corresponds the line bundle $L_\lambda$ over X, with $c_1(L_\lambda) = \lambda$. In the Grothendieck group of T-equivariant bundles, the class of the tangent bundle of X is equal to the sum

$$ [T_X] = \oplus_{\alpha \in R^+} [L_\alpha]. \tag{2.5} $$

In particular,

$$ \dim(X) = \mathrm{card}(R^+). \tag{2.6} $$

Let W be the Weyl group of R. The torus T acts on X with the finite set of fixed points $\{x_w\}_{w \in W}$, each $x_w$ lying inside the corresponding Schubert cell (cf. [BGG]). The fixed lines pass through the pairs $x_w$, $x_{s_\alpha w}$, where $s_\alpha$ is the reflection corresponding to a positive root α. We identify the cohomology ring $A = H_T^*(pt)$ with $\mathbb{C}[P] = \mathbb{C}[\alpha_i]_I$. The Euler classes of the tangent spaces are given by

$$ e_w = e(T_{x_w;X}) = w(e_{id}); \quad e_{id} = \prod_{\alpha \in R^+} \alpha. \tag{2.7} $$

Set $A' = A[\alpha^{-1}]_{\alpha \in R}$. Bott's Theorem gives the A′-base $\{\phi_w\}_{w \in W}$ in $H_T^*(X)_{A'}$, such that

$$ i_w^* \phi_{w'} = \delta_{ww'} e_w. \tag{2.8} $$

2.2. To each $\beta \in Q^+$ corresponds the space $X_\beta$ of stable maps

$$ f : (C; y_1, y_2) \longrightarrow X; \quad f_*([C]) = \beta; \quad g(C) = 0 \tag{2.9} $$

of curves of genus 0 with two marked points, to X. Set $|\beta| = \sum \nu_i$ if $\beta = \sum \nu_i \alpha_i^\vee$.

Lemma 2.1. $\dim(X_\beta) = 2|\beta| + \dim(X) - 1$.

Proof. Let us choose a point (2.9) with $C = \mathbb{P}^1$. We have

$$ \dim(X_\beta) = \dim(T_{f;X_\beta}) = \dim(H^0(C; f^* T_X)) - \dim(H^0(C; T_C)) + \dim(T_{y_1;C}) + \dim(T_{y_2;C}). \tag{2.10} $$

Note that $H^1(C; f^* T_X) = 0$. Therefore,

$$ \dim(H^0(C; f^* T_X)) = \chi(C; f^* T_X) = \sum_{\alpha \in R^+} \chi(C; f^* L_\alpha) = \sum_{\alpha \in R^+} (\langle \beta, \alpha \rangle + 1) = \langle \beta, 2\rho \rangle + \mathrm{card}(R^+) = 2|\beta| + \dim(X), \tag{2.11} $$

since $\langle \alpha_i^\vee, \rho \rangle = 1$ for each i. Here ρ denotes the half-sum of the positive roots. Obviously, $\dim(H^0(C; T_C)) = 3$. Plugging this into (2.10), we get the lemma. □


In (2.11), we have used the equalities

$$ \chi(C; f^* L_\alpha) = \langle \beta, \alpha \rangle + 1. \tag{2.12} $$

This number may be negative if the root system is not simply laced.

2.3. Denote

$$ c(\beta) = c_1(T_1) \in H_T^2(X_\beta), \tag{2.13} $$

where $T_1$ is the line bundle over $X_\beta$ whose fiber over a point (2.9) is equal to $T_{y_1;C}$. Let $e_1 : X_\beta \longrightarrow X$ be the "evaluation at $y_1$" map. Our aim is to calculate the formal power series

$$ Z(q) = \sum_{\beta \in Q^+} e_{1*} \Bigl( \frac{1}{h + c(\beta)} \Bigr) \cdot q^\beta \tag{2.14} $$

in variables $q = (q_i)_{i \in I}$, with coefficients in the ring $H_T^*(X)(h)$. Here $q^\beta = \prod q_i^{\nu_i}$ for $\beta = \sum \nu_i \alpha_i^\vee$. This amounts to calculating |W| power series

$$ Z_w(q) = \langle \phi_w, Z \rangle = \sum_{\beta \in Q^+} \Bigl( \int_{X_\beta} \frac{e_1^* \phi_w}{h + c(\beta)} \Bigr) \cdot q^\beta. \tag{2.15} $$

Example 2.2. For the root system $A_1$, the series Z(q) is given by the expression I, (2.3), with n = 1. The Weyl group has order 2; there are two series $Z_w(q)$:

$$ Z_\pm(q) = \sum_{d=0}^{\infty} \frac{q^d}{d!\, h^d \prod_{m=1}^{d} (\pm\alpha + mh)} = \sum_{d=0}^{\infty} \frac{q^d}{d^!_{0;h}\, d^!_{\pm\alpha;h}}. \tag{2.16} $$

Here we have used the notation

$$ d^!_{\alpha;h} = \prod_{m=1}^{d} (\alpha + mh). \tag{2.17} $$

The series $Z_+$ (resp. $Z_-$) corresponds to the trivial (resp. non-trivial) element of the Weyl group, cf. I, (2.4) and I.3.6.

3. Fixed Point Computation

3.1. Let us return to the series (2.15). Denote

$$ I_w(\beta) = \int_{X_\beta} \frac{e_1^* \phi_w}{h + c(\beta)}. $$

Set

$$ J_w(\beta) = \int_{X_\beta} \frac{e_1^* \phi_w \cdot (-c(\beta))^{|\beta|}}{h + c(\beta)}. $$

The next lemma is proved in the same manner as I, (4.10).

Lemma 3.1. We have

$$ I_w(\beta) = \frac{1}{h^{|\beta|}} J_w(\beta). $$

(3.1)

(3.2)


3.2. Now, we want to compute the integral $J_w(\beta)$ using the fixed point formula. We have

$$ J_w(\beta) = \sum_{P} J_w(P), \tag{3.3} $$

where

$$ J_w(P) = \int_{P} \left. \frac{e_1^* \phi_w \cdot (-c(\beta))^{|\beta|}}{h + c(\beta)} \right|_{P} \cdot \frac{1}{e(N_{P/X_\beta})}. \tag{3.4} $$

Here P denotes a component of the fixed point space $X_\beta^T$. Let us compute $J_w(P)$. The picture of a connected component is the same as in I, §5. Thus, let $f : (C; y_1, y_2) \longrightarrow X$ be a point in $P \subset X_\beta$. The integral $J_w(P)$ may be non-zero only if $f(y_1) = x_w$, so we will assume this. Let $C = C_1 \cup C_2$, where $C_1$ is the connected component containing $y_1$ and $C_2$ is all the rest. As in I.5.3, we prove that $C_1$ is horizontal, i.e. it covers with some multiplicity k > 0 a fixed line $\ell_{w; s_\alpha w}$, for some $\alpha \in R^+$. To simplify the notations, assume that w = id. We have

$$ e_1^* \phi_{id} |_{P} = e_{id} = e(T_{x_{id};X}) = \prod_{\gamma \in R^+} \gamma \tag{3.5} $$

and

$$ c(\beta) |_{P} = \frac{\alpha}{k}. $$

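Lemma 3.2. We have

$$ e(N_{P/X_\beta}) = (-1)^k \frac{(k!)^2}{k} \cdot \Bigl( \frac{\alpha}{k} \Bigr)^{2k-1} \cdot \prod_{\gamma \in R^+;\ \gamma \ne \alpha} \frac{\prod_{m \ge 1} (\gamma - \frac{m}{k} \alpha)}{\prod_{m \ge 1} (s_\alpha \gamma - \frac{m}{k} \alpha)} \cdot \frac{e_{id}}{e_{s_\alpha}} \cdot \Bigl( -\frac{\alpha}{k} + c(\beta - k\alpha^\vee) \Bigr). \tag{3.6} $$

Proof. We have (cf. I.5.5)

$$ [N_{P/X_\beta}] = [H^0(C_1; f_1^* T_X)] - [(f_1^* T_X)_y] - [H^0(C_1; T_{C_1})] + [T_{y_1;C_1}] + [T_{y;C_1}] + [T_{y;C_1} \otimes T_{y;C_2}] + [N_{P_2/X_{\beta - k\alpha^\vee}}]. \tag{3.7} $$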

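Note that $P_2$ (containing the map $f|_{C_2}$) is the connected component of $X_{\beta - k\alpha^\vee}$. We have

$$ e([(f_1^* T_X)_y]) = e([T_{x_{s_\alpha};X}]) = e_{s_\alpha}; \tag{3.8} $$

next,

$$ e([T_{y_1;C_1}]) = \frac{\alpha}{k}; \quad e([T_{y;C_1}]) = -\frac{\alpha}{k}, \tag{3.9} $$

and

$$ e([H^0(C_1; T_{C_1})]) = \frac{\alpha}{k} \cdot [0] \cdot \Bigl( -\frac{\alpha}{k} \Bigr). \tag{3.10} $$

We have

$$ e([T_{y;C_1} \otimes T_{y;C_2}]) = -\frac{\alpha}{k} + c(\beta - k\alpha^\vee). \tag{3.11} $$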

Lemma 3.3. We have

$$ e([H^0(C_1; f_1^* T_X)]) = \prod_{\gamma \in R^+} \frac{\prod_{m \ge 0} (\gamma - \frac{m}{k} \alpha)}{\prod_{m \ge 1} (s_\alpha \gamma - \frac{m}{k} \alpha)}. \tag{3.12} $$

Proof. As we have already noted, $[T_X] = \sum_{\gamma \in R^+} [L_\gamma]$ in the Grothendieck group. Also, $H^1(C_1; f_1^* T_X) = 0$ (convexity of X), and our lemma is the corollary of the following one.

Lemma 3.4. We have

$$ e([\chi(C_1; f_1^* L_\gamma)]) = \frac{\prod_{m \ge 0} (\gamma - \frac{m}{k} \alpha)}{\prod_{m \ge 1} (s_\alpha \gamma - \frac{m}{k} \alpha)}. \tag{3.13} $$

All but a finite number of terms in the fraction cancel out, and we are left with a finite product in the numerator (resp. denominator), if $\langle \gamma, \alpha^\vee \rangle$ is positive (resp. negative). □

Now, Lemma 3.2 follows from Lemma 3.3. Note that the product (3.12) contains one factor equal to zero: in the numerator, corresponding to γ = α, m = k. This zero cancels out with the zero in (3.10), and we are left with the well defined non-zero product. This completes the proof of Lemma 3.2. □

Substituting this result into (3.4), we get

Lemma 3.5. We have

$$ J_w(P; h) = \frac{C_w(\alpha; k)}{kh + w(\alpha)} \cdot J_{s_\alpha w}\Bigl( P_2; -\frac{w(\alpha)}{k} \Bigr), \tag{3.14} $$

where

$$ C_w(\alpha; k) = w(C_{id}(\alpha; k)) \tag{3.15} $$

and

$$ C_{id}(\alpha; k) = (-1)^{k(|\alpha|+1)} \frac{k^{k(2-|\alpha|)}}{(k!)^2}\, \alpha^{k|\alpha| - 2k + 1} \cdot \prod_{\gamma \in R^+;\ \gamma \ne \alpha} \frac{\prod_{m \ge 1} (s_\alpha \gamma - \frac{m}{k} \alpha)}{\prod_{m \ge 1} (\gamma - \frac{m}{k} \alpha)}. \tag{3.16} $$

Here we use the notation $|\alpha| = \sum a_i$ for $\alpha = \sum a_i \alpha_i$.

Let us introduce the series

$$ z_w(q) = \sum_{\beta} J_w(\beta)\, q^\beta = Z_w(hq). \tag{3.17} $$

Theorem 3.6. We have

$$ z_w(q; h) = 1 + \sum_{\alpha \in R^+;\ k > 0} \frac{C_w(\alpha; k)\, q^{k\alpha^\vee}}{kh + w(\alpha)} \cdot z_{s_\alpha w}\Bigl( q; -\frac{w(\alpha)}{k} \Bigr), \tag{3.18} $$

where $C_w(\alpha; k)$ are given by the formulas (3.15), (3.16).

This is an immediate corollary of Lemma 3.5. □


Let us look more attentively at the expression (3.16). Recall

Lemma 3.7 ([B], Ch. VI, §1, 1.6, Prop. 17, Cor. 2). Let

$$ w = s_1 \cdot \ldots \cdot s_q \tag{3.19} $$

be a reduced decomposition of an element of the Weyl group, where $s_i$ is the reflection corresponding to a simple root $\alpha_i$. Then the roots $\theta_i = s_q s_{q-1} \cdot \ldots \cdot s_{i+1}(\alpha_i)$ (i = 1, ..., q) are all positive, distinct, and

$$ R^+ \cap w^{-1}(-R^+) = \{\theta_1, \dots, \theta_q\}. \tag{3.20} $$

Obviously, if in the product in (3.16) both γ and $s_\alpha \gamma$ are positive, then the two factors corresponding to them cancel out. The previous lemma says that we must keep only $l(s_\alpha) - 1$ terms in the product (we have already taken care of the term γ = α).

Corollary 3.8. If the root α is simple then

$$ C_{id}(\alpha; k) = \frac{k^k}{(k!)^2}\, \alpha^{-k+1}. \tag{3.21} $$

In fact, for a simple α, the big product disappears altogether. The expression (3.21) coincides with I, (3.15).

Part III. Computations for SL(3)

1. Formula

1.1. In this part, X will denote the flag manifold G/B, with G = SL(3). According to [G2], the quantum cohomology of X is given by the Fourier transform of the following D-module on the three-torus. Let $A = (a_{ij})$ be the 3 × 3-matrix with $a_{ii} = u_{i-1}$ (i = 1, 2, 3); $a_{i,i-1} = v_{i-1}$ (i = 2, 3); $a_{i,i+1} = -1$ (i = 1, 2); $a_{13} = a_{31} = 0$. Consider the characteristic polynomial

$$ \det(\lambda + A) = \lambda^3 + P_1 \lambda^2 + P_2 \lambda + P_3 = \lambda^3 + (u_0 + u_1 + u_2) \lambda^2 + (u_0 u_1 + u_0 u_2 + u_1 u_2 + v_1 + v_2) \lambda + u_0 u_1 u_2 + u_0 v_2 + u_2 v_1. \tag{1.1} $$

Thus, the polynomials $P_i = P_i(u; v)$ are the deformed symmetric functions. Consider the three-dimensional torus T, with multiplicative coordinates $q_0, q_1, q_2$. We define the differential operators $D_i$ on T, where $D_i$ is obtained from $P_i$ by the substitution $u_j = q_j \partial_{q_j}$, $v_j = q_j / q_{j-1}$. We are interested in the solutions of the system

$$ D_1 \phi = D_2 \phi = D_3 \phi = 0, \tag{1.2} $$

$\phi = \phi(q_0, q_1, q_2)$. First of all, since $D_1 \phi = 0$, the function φ depends in fact only on the quotients $v_1 = q_1/q_0$, $v_2 = q_2/q_1$. It is useful to write up the expressions of the operators $q_i \partial_{q_i}$ acting on such functions, in coordinates $v_1, v_2$:

$$ q_0 \partial_{q_0} = -v_1 \partial_1; \quad q_1 \partial_{q_1} = v_1 \partial_1 - v_2 \partial_2; \quad q_2 \partial_{q_2} = v_2 \partial_2 \tag{1.3} $$

(for brevity we set $\partial_i = \partial_{v_i}$). In these coordinates, the remaining operators look as follows:

$$ D_2 = -(v_1 \partial_1)^2 + (v_1 \partial_1)(v_2 \partial_2) - (v_2 \partial_2)^2 + v_1 + v_2, \tag{1.4} $$

$$ D_3 = -(v_1 \partial_1)^2\, v_2 \partial_2 + v_1 \partial_1\, (v_2 \partial_2)^2 - v_2 (v_1 \partial_1) + v_1 (v_2 \partial_2). \tag{1.5} $$

Theorem 1.1. There exists a unique, up to a multiplicative constant, solution $\phi(v_1, v_2)$ of the system

$$ D_2 \phi = D_3 \phi = 0 \tag{1.6} $$

in the ring of formal power series $\mathbb{Q}[[v_1, v_2]]$. If we normalize φ by the condition φ(0) = 1, it will be given by the formula

$$ \phi(v_1, v_2) = \sum_{i,j \ge 0} \frac{(i + j)!}{(i!)^3 (j!)^3}\, v_1^i v_2^j. \tag{1.7} $$

Here $D_2$, $D_3$ are given by (1.4), (1.5).

Proof. Let us denote the power series (1.7) by $\sum_{i,j \ge 0} a_{ij} v_1^i v_2^j$, $a_{00} = 1$. Equation $D_2 \phi = 0$ is equivalent to the recursion

$$ (i^2 - ij + j^2)\, a_{ij} = a_{i-1,j} + a_{i,j-1} \tag{1.8} $$

(we imply that $a_{ij} = 0$ if either i or j is negative). Equation $D_3 \phi = 0$ is equivalent to

$$ ij(i - j)\, a_{ij} = -i\, a_{i,j-1} + j\, a_{i-1,j}. \tag{1.9} $$

Both formulas are checked immediately for $a_{ij}$ given by (1.7). Already (1.8) defines all $a_{ij}$ uniquely from $a_{00}$. □

Remarks 1.2. (a) We have $a_{ij} = a_{ji}$.

(b) It follows from (1.8) that

$$ a_{i0} = \frac{1}{(i!)^2}. \tag{1.10} $$

The function

$$ \phi(v, 0) = \sum \frac{v^i}{(i!)^2} \tag{1.11} $$

coincides with the hypergeometric function associated with G/B for G = SL(2).

(c) The formulas (1.8) and (1.9) imply the identity

$$ i^3 a_{i,j-1} - j^3 a_{i-1,j} = 0. \tag{1.12} $$

This, together with (1.10), gives immediately (1.7).
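The three identities used in the proof and in Remarks 1.2 are easy to confirm by machine. The sketch below is an illustration only; the function name `a` and the range of indices are our choices. It checks (1.8), (1.9) and (1.12) for the coefficients of (1.7) in exact arithmetic.

```python
from fractions import Fraction
from math import factorial


def a(i, j):
    """a_ij = (i+j)! / ((i!)^3 (j!)^3), with a_ij = 0 for negative indices."""
    if i < 0 or j < 0:
        return Fraction(0)
    return Fraction(factorial(i + j), factorial(i) ** 3 * factorial(j) ** 3)


for i in range(6):
    for j in range(6):
        if (i, j) == (0, 0):
            continue
        # Recursion (1.8), equivalent to D2 phi = 0.
        assert (i * i - i * j + j * j) * a(i, j) == a(i - 1, j) + a(i, j - 1)
        # Recursion (1.9), equivalent to D3 phi = 0.
        assert i * j * (i - j) * a(i, j) == -i * a(i, j - 1) + j * a(i - 1, j)
        # Identity (1.12).
        assert i**3 * a(i, j - 1) == j**3 * a(i - 1, j)
```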


1.2. Another formula. V. Batyrev communicated to me another formula for the solution of (1.6): $\tilde\phi = \sum_{i,j} b_{ij} v_1^i v_2^j$, where

$$ b_{ij} = \frac{1}{(i!)^2 (j!)^2} \sum_{r} C_i^r C_j^r, \tag{1.13} $$

where

$$ C_b^a = \frac{b!}{a!\,(b - a)!} \tag{1.14} $$

for integers a, b such that 0 ≤ a ≤ b, and 0 if a < 0 or a > b.

Claim. The function $\tilde\phi$ is equal to φ.

Indeed, our claim is equivalent to the identity

$$ \sum_{r} C_i^r C_j^r = C_{i+j}^i. \tag{1.15} $$
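As a quick sanity check of the claim, the following Python sketch (an illustration; the function names are ours) compares Batyrev's coefficients (1.13) with the coefficients of (1.7) for small indices, using `math.comb` for the binomial coefficients (1.14).

```python
from fractions import Fraction
from math import comb, factorial


def b_batyrev(i, j):
    """b_ij from (1.13): sum_r C(i, r) C(j, r) / ((i!)^2 (j!)^2)."""
    s = sum(comb(i, r) * comb(j, r) for r in range(min(i, j) + 1))
    return Fraction(s, factorial(i) ** 2 * factorial(j) ** 2)


def a_series(i, j):
    """a_ij from (1.7): (i+j)! / ((i!)^3 (j!)^3)."""
    return Fraction(factorial(i + j), factorial(i) ** 3 * factorial(j) ** 3)


for i in range(8):
    for j in range(8):
        # Identity (1.15): sum_r C(i, r) C(j, r) = C(i + j, i).
        assert sum(comb(i, r) * comb(j, r) for r in range(min(i, j) + 1)) == comb(i + j, i)
        assert b_batyrev(i, j) == a_series(i, j)
```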

To prove (1.15), one remarks that to choose i elements from a set which is a disjoint union of two sets of cardinalities j and i is the same as to choose some r elements from the first set, and i − r from the second one.

2. Equivariant Version

Below, we will deform Eqs. (1.6) and the solution (1.7). In terms of quantum cohomology, this deformation corresponds to passing to the equivariant cohomology, with respect to a natural action of a three-torus on X.

2.1. Introduce differential operators

$$ D_{i;h;\lambda} = P_i(h q_0 \partial_{q_0}, h q_1 \partial_{q_1}, h q_2 \partial_{q_2};\ q_1/q_0, q_2/q_1) - \sigma_i(\lambda_0, \lambda_1, \lambda_2). \tag{2.1} $$

Here $P_i(u_0, u_1, u_2; v_1, v_2)$ are the polynomials (1.1); $\sigma_i(\lambda) = P_i(\lambda; 0)$ are the elementary symmetric functions. We are interested in the solutions of the system

$$ D_{1;h;\lambda} \psi = D_{2;h;\lambda} \psi = D_{3;h;\lambda} \psi = 0 \tag{2.2} $$

of the form

$$ \psi(q_0, q_1, q_2) = q_0^{\lambda_0/h} q_1^{\lambda_1/h} q_2^{\lambda_2/h}\, \phi(q_0, q_1, q_2); \quad \phi(q) = \sum_{i,j,k} b_{ijk}\, q_0^i q_1^j q_2^k. \tag{2.3} $$

The equation $D_{1;h;\lambda} \phi = 0$ implies i + j + k = 0; thus, for a solution ψ, the factor φ(q) would depend only on $v_1 = q_1/q_0$ and $v_2 = q_2/q_1$. To formulate the answer, we introduce the notations

$$ \alpha_i = \lambda_i - \lambda_{i-1}; \quad p^!_\alpha = (h + \alpha)(2h + \alpha) \cdot \ldots \cdot (ph + \alpha). \tag{2.4} $$


Theorem 2.1. There exists a unique, up to a multiplicative constant, solution of the system (2.2) having the form (2.3), with $\phi \in \mathbb{Q}[[v_1, v_2]]$, $v_1 = q_1/q_0$, $v_2 = q_2/q_1$. If we normalize it by the condition φ(0) = 1, it will have the form

$$ \phi(v_1, v_2) = \sum_{i,j \ge 0} \frac{(i + j)^!_{\alpha_1 + \alpha_2}}{i!\, j!\, i^!_{\alpha_1}\, j^!_{\alpha_2}\, i^!_{\alpha_1 + \alpha_2}\, j^!_{\alpha_1 + \alpha_2}} \cdot \frac{v_1^i v_2^j}{h^{i+j}}. \tag{2.5} $$

Proof. Let us denote by $a_{ij}$ the coefficient at $v_1^i v_2^j$ of the unknown φ. The equation $D_{2;h;\lambda} \phi = 0$ is equivalent to

$$ (i^2 h^2 - ij h^2 + j^2 h^2 + ih\alpha_1 + jh\alpha_2)\, a_{ij} = a_{i-1,j} + a_{i,j-1} \tag{2.6} $$

(cf. (1.8)). The equation $D_{3;h;\lambda} \phi = 0$ is equivalent to

$$ [(ih - \lambda_0)(ih - jh + \lambda_1)(jh + \lambda_2) + \lambda_0 \lambda_1 \lambda_2]\, a_{ij} = (-ih + \lambda_0)\, a_{i,j-1} + (jh + \lambda_2)\, a_{i-1,j} \tag{2.7} $$

(cf. (1.9)). These identities are checked directly. The uniqueness follows from (2.6). □

Remarks 2.2. We have $a_{ij}(\alpha_1, \alpha_2) = a_{ji}(\alpha_2, \alpha_1)$. Equation (2.6) implies

$$ a_{i0} = \frac{1}{h^i\, i!\, i^!_{\alpha_1}} \tag{2.7} $$

(cf. (1.10)). On the other hand, (2.6) and (2.7) together imply the identity

$$ ih(ih + \alpha_1)(ih + \alpha_1 + \alpha_2)\, a_{i,j-1} - jh(jh + \alpha_2)(jh + \alpha_1 + \alpha_2)\, a_{i-1,j} = 0 \tag{2.8} $$

(cf. (1.12)). This and (2.7) give the expression (2.5).
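The recursion (2.6) and the identity (2.8) can be confirmed for the coefficients of (2.5) by a short exact computation. The sketch below is an illustration; the helper names `shifted` and `a` are ours, and the values of α1, α2 and h are arbitrary sample data.

```python
from fractions import Fraction
from math import factorial


def shifted(p, alpha, h):
    """The shifted factorial (h + alpha)(2h + alpha)...(ph + alpha) of (2.4)."""
    prod = Fraction(1)
    for m in range(1, p + 1):
        prod *= m * h + alpha
    return prod


def a(i, j, a1, a2, h):
    """Coefficient of v1^i v2^j in (2.5); zero for negative indices."""
    if i < 0 or j < 0:
        return Fraction(0)
    s = a1 + a2
    den = (factorial(i) * factorial(j) * shifted(i, a1, h) * shifted(j, a2, h)
           * shifted(i, s, h) * shifted(j, s, h) * h ** (i + j))
    return shifted(i + j, s, h) / den


# Sample values (assumed); all positive, so no denominator vanishes.
a1, a2, h = Fraction(3, 7), Fraction(5, 11), Fraction(2, 13)
s = a1 + a2
for i in range(5):
    for j in range(5):
        if (i, j) == (0, 0):
            continue
        lhs = (i * i * h * h - i * j * h * h + j * j * h * h
               + i * h * a1 + j * h * a2) * a(i, j, a1, a2, h)
        assert lhs == a(i - 1, j, a1, a2, h) + a(i, j - 1, a1, a2, h)  # (2.6)
        assert (i * h * (i * h + a1) * (i * h + s) * a(i, j - 1, a1, a2, h)
                == j * h * (j * h + a2) * (j * h + s) * a(i - 1, j, a1, a2, h))  # (2.8)
```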

3. Comparison with the Fixed Point Method

3.1. Let us see what Theorem II.3.6 gives for G = SL(3), i.e., for the root system $A_2$. The simple roots are $\alpha_1, \alpha_2$; the positive ones are $\alpha_1$, $\alpha_2$, $\alpha := \alpha_1 + \alpha_2$. The Weyl group $W = S_3$ has two Coxeter generators $s_1, s_2$ corresponding to the simple roots, and $s_\alpha = s_1 s_2 s_1$. For example, $s_1$ takes $\alpha_1$ to $-\alpha_1$ and $\alpha_2$ to α, etc. We have six series $z_w$ ($w \in W$):

$$ z_w = z_w(q_1, q_2; \alpha_1, \alpha_2; h) = \sum_{i \ge 0,\, j \ge 0} a_{ij;w}\, q_1^i q_2^j, \tag{3.1} $$

where

$$ a_{ij;w} = w(a_{ij}); \quad a_{ij} = a_{ij}(\alpha_1, \alpha_2; h) := a_{ij;id}. \tag{3.2} $$

The element w acts only on the arguments $\alpha_1, \alpha_2$. It is easy to see that the recursion relation of Theorem II.3.6 takes the following form.


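Theorem 3.1. We have

$$ z_{id}(q; h) = 1 + \sum_{k > 0} \frac{q_1^k}{kh + \alpha_1} \cdot \frac{1}{(k!)^2} \cdot \frac{k^k}{\alpha_1^{k-1}} \cdot z_{s_1}(q; -\alpha_1/k) + \sum_{k > 0} \frac{q_2^k}{kh + \alpha_2} \cdot \frac{1}{(k!)^2} \cdot \frac{k^k}{\alpha_2^{k-1}} \cdot z_{s_2}(q; -\alpha_2/k) - \sum_{k > 0} \frac{q_1^k q_2^k}{kh + \alpha_1 + \alpha_2} \cdot \frac{k^{2(k-1)}}{(k!)^2} \cdot \frac{\alpha_1 + \alpha_2}{\alpha_1 \alpha_2} \cdot \frac{1}{\prod_{m=1}^{k-1} (m\alpha_1 - (k - m)\alpha_2)^2} \cdot z_{s_\alpha}(q; -(\alpha_1 + \alpha_2)/k). \tag{3.3} $$

Now, the main result of this section is

Theorem 3.2. The recursion relation (3.3) is satisfied, with

$$ a_{ij} = \frac{(i + j)^!_{\alpha_1 + \alpha_2}}{i!\, j!\, i^!_{\alpha_1}\, j^!_{\alpha_2}\, i^!_{\alpha_1 + \alpha_2}\, j^!_{\alpha_1 + \alpha_2}}. \tag{3.4} $$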

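Proof. Assume that i ≤ j. Let us write $a_{ij}$ in the form

$$ a_{ij}(\alpha_1, \alpha_2; h) = \frac{[(i + j)h + \alpha_1 + \alpha_2] \cdot \ldots \cdot [(1 + j)h + \alpha_1 + \alpha_2]}{(ih + \alpha_1 + \alpha_2) \cdot \ldots \cdot (h + \alpha_1 + \alpha_2)} \cdot \frac{1}{i!\,(h + \alpha_1) \cdot \ldots \cdot (ih + \alpha_1) \cdot j!\,(h + \alpha_2) \cdot \ldots \cdot (jh + \alpha_2)}. \tag{3.5} $$

The denominator (as a function of h) has the distinct roots:

$$ h = -\frac{\alpha_1}{k}\ (k = 1, \dots, i); \quad h = -\frac{\alpha_1 + \alpha_2}{k}\ (k = 1, \dots, i); \quad h = -\frac{\alpha_2}{k}\ (k = 1, \dots, j). \tag{3.6} $$

Accordingly, we have the simple fraction decomposition

$$ a_{ij}(\alpha_1, \alpha_2; h) = \sum_{k=1}^{i} \frac{{}^{1}b^{k}_{ij}(\alpha_1, \alpha_2)}{kh + \alpha_1} + \sum_{k=1}^{j} \frac{{}^{2}b^{k}_{ij}(\alpha_1, \alpha_2)}{kh + \alpha_2} + \sum_{k=1}^{i} \frac{{}^{3}b^{k}_{ij}(\alpha_1, \alpha_2)}{kh + \alpha_1 + \alpha_2}. \tag{3.7} $$

The theorem is equivalent to

Lemma 3.3. We have

(a) For 1 ≤ k ≤ i,

$$ {}^{1}b^{k}_{ij}(\alpha_1, \alpha_2) = \frac{1}{(k!)^2} \cdot \frac{k^k}{\alpha_1^{k-1}} \cdot a_{i-k,j}(-\alpha_1, \alpha_1 + \alpha_2; -\alpha_1/k); \tag{3.8} $$

(b) For 1 ≤ k ≤ j,

$$ {}^{2}b^{k}_{ij}(\alpha_1, \alpha_2) = \frac{1}{(k!)^2} \cdot \frac{k^k}{\alpha_2^{k-1}} \cdot a_{i,j-k}(\alpha_1 + \alpha_2, -\alpha_2; -\alpha_2/k); \tag{3.9} $$

(c) For 1 ≤ k ≤ i,

$$ {}^{3}b^{k}_{ij} = -\frac{k^{2(k-1)}}{(k!)^2} \cdot \frac{\alpha_1 + \alpha_2}{\alpha_1 \alpha_2} \cdot \frac{1}{\prod_{m=1}^{k-1} [m\alpha_1 - (k - m)\alpha_2]^2} \cdot a_{i-k,j-k}(-\alpha_2, -\alpha_1; -(\alpha_1 + \alpha_2)/k). \tag{3.10} $$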
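The decomposition (3.7) with the coefficients of Lemma 3.3 can be tested numerically. The Python sketch below is an illustration under stated assumptions: it takes $a_{ij}$ from (3.4), reads the three families of coefficients as in (3.8) to (3.10) (with the product over m in (c) squared), and uses arbitrary generic sample values for α1, α2 and h. The helper names are ours. It compares the coefficient of $q_1^i q_2^j$ on both sides of (3.3).

```python
from fractions import Fraction
from math import factorial


def shifted(p, alpha, h):
    """(h + alpha)(2h + alpha)...(ph + alpha)."""
    prod = Fraction(1)
    for m in range(1, p + 1):
        prod *= m * h + alpha
    return prod


def a(i, j, a1, a2, h):
    """a_ij(alpha1, alpha2; h) from (3.4)."""
    s = a1 + a2
    den = (factorial(i) * factorial(j) * shifted(i, a1, h) * shifted(j, a2, h)
           * shifted(i, s, h) * shifted(j, s, h))
    return shifted(i + j, s, h) / den


def recursion_rhs(i, j, a1, a2, h):
    """Coefficient of q1^i q2^j on the right of (3.3), for (i, j) != (0, 0)."""
    s = a1 + a2
    total = Fraction(0)
    for k in range(1, i + 1):  # family (a): reflection s_1
        coef = Fraction(k**k, factorial(k) ** 2) / a1 ** (k - 1)
        total += coef / (k * h + a1) * a(i - k, j, -a1, s, -a1 / k)
    for k in range(1, j + 1):  # family (b): reflection s_2
        coef = Fraction(k**k, factorial(k) ** 2) / a2 ** (k - 1)
        total += coef / (k * h + a2) * a(i, j - k, s, -a2, -a2 / k)
    for k in range(1, min(i, j) + 1):  # family (c): reflection s_alpha
        prod = Fraction(1)
        for m in range(1, k):
            prod *= (m * a1 - (k - m) * a2) ** 2
        coef = Fraction(k ** (2 * (k - 1)), factorial(k) ** 2) * s / (a1 * a2 * prod)
        total -= coef / (k * h + s) * a(i - k, j - k, -a2, -a1, -s / k)
    return total


# Sample values (assumed), generic enough that no denominator vanishes.
a1, a2, h = Fraction(3, 7), Fraction(5, 11), Fraction(2, 13)
for i in range(4):
    for j in range(4):
        if (i, j) != (0, 0):
            assert recursion_rhs(i, j, a1, a2, h) == a(i, j, a1, a2, h)
```

A check of this kind covers only small i, j and particular parameter values; it is a consistency test of the formulas, not a substitute for the computation referred to in the proof.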

This lemma is established by a direct computation. The theorem is proved. □

Corollary 3.4. The series $Z_{id}(q_1, q_2) = z_{id}(q_1/h, q_2/h)$ coincides with the series $\phi(q_1, q_2)$ from (2.5).

Acknowledgements. This paper arose from the author's attempt to make an exercise on the beautiful fixed point technique of Givental and Kontsevich. I am grateful to Yu. I. Manin for many stimulating discussions, to A. Goncharov for the stimulating interest on the first stage of this work, and to V. Batyrev who communicated to me Formula III.1.4. This work was done in the highly stimulating atmosphere of Max-Planck-Institut für Mathematik, and I am very grateful to MPI for the hospitality.

References

[AB] Atiyah, M.F., Bott, R.: The moment map and equivariant cohomology. Topology 23, 1–28 (1984)
[BGG] Bernstein, I.N., Gelfand, I.M., Gelfand, S.I.: Schubert cells and cohomology of the spaces G/P. Usp. Mat. Nauk 28 (3), 3–26 (1973) (Russian) [Russ. Math. Surv. 28 (3), 1–26 (1973); = I.M. Gelfand, Coll. papers, Vol. II, 570–595]
[B] Bourbaki, N.: Groupes et algèbres de Lie. Chapitres 4, 5 et 6. Paris: Hermann, 1968
[G1] Givental, A.: Equivariant Gromov–Witten invariants. IMRN 13, 613–663 (1996), alg-geom/9603021
[G2] Givental, A.: Stationary phase integrals, quantum Toda lattices, flag manifolds and the Mirror conjecture. In: Topics in Singularity Theory, V.I. Arnold's 60th Anniversary Collection, A. Khovanskiĭ, A. Varchenko, V. Vassiliev (eds.), AMS Translations Ser. 2, 180, 103–117 (1997); alg-geom/9612001
[K] Kontsevich, M.: Enumeration of rational curves via torus actions. In: The moduli space of curves, R. Dijkgraaf, C. Faber, G. van der Geer (eds.), Progress in Math. 129, Basel–Boston: Birkhäuser, 1995, pp. 335–368, hep-th/9405035

Communicated by G. Felder

Commun. Math. Phys. 208, 381 – 411 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Bounds on Scattering Poles in One Dimension

Michael Hitrik*

Department of Mathematics, University of Lund, Box 118, 221 Lund, Sweden

Received: 27 January 1999 / Accepted: 7 July 1999

Abstract: For the class of super-exponentially decaying potentials on the real line sharp upper bounds on the counting function of the poles in discs are derived and the density of the poles in strips is estimated. In the case of nonnegative potentials, explicit estimates for the width of a pole-free strip are obtained.

1. Introduction

The purpose of this paper is to establish some estimates on the scattering poles for the class of exponentially and super-exponentially decaying potentials on the real line. In particular, we derive sharp upper bounds on the counting function for the poles of super-exponentially decaying potentials and estimate the density of the poles in strips. In the case of nonnegative potentials we give explicit estimates for the size of a pole-free strip. These results are obtained using the representation of the scattering matrix given by Melin [14]. Notice that this representation has also been used by Zworski [17] in his study of the distribution of scattering poles in the case of compactly supported potentials. We also present an alternative approach to the study of the location of the poles, obtained by rewriting the Schrödinger equation as a system of Riccati equations.

The existence of pole-free regions and estimates of their size is one of the main problems in the theory of resonances. This problem has been investigated extensively in the semi-classical setting and in the framework of the Lax-Phillips theory. Also, the problem of estimating the number of scattering poles in various subdomains of the complex plane has been much studied in recent years. The survey paper [18] contains an overview of this work as well as an extensive bibliography. The results of this paper are concerned with the class of (super-)exponentially decaying potentials, and when deriving bounds on the scattering poles, we shall always indicate explicitly their dependence on the potential.

* Present address: Centre de Mathématiques, CMAT Ecole Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]


The organization of the paper is as follows. In the beginning of Sect. 2 we recall the intertwining operators and the representation of the scattering matrix, as given in [14]. We also discuss some continuity properties of the scattering matrix as a function of the potential. The basic estimates for the study of scattering poles are then derived in Theorem 2.4 and Theorem 2.5, and the density of the poles for super-exponentially decaying potentials in discs is estimated. In the beginning of Sect. 3 we estimate the size of a pole-free strip for compactly supported nonnegative potentials, and afterwards this estimate is generalized to super-exponentially decaying potentials. Next we address the problem of estimating the density of poles in arbitrary strips. The obtained higher value of the density in this case compared with the case of discs reflects bounds on the location of the poles. Such bounds are finally derived using the Riccati equation approach.

2. The Scattering Matrix and Global Upper Bounds

2.1. The scattering data. We begin by recalling some important results of scattering theory on the line. Our basic reference here is the paper [14]. Consider the Schrödinger equation

$$H_p u = -u'' + pu = k^2 u, \quad k \in \mathbb{R}, \tag{2.1}$$

where $p$ is a real-valued measurable potential such that

$$\int_{-\infty}^{\infty} (1 + |x|)\,|p(x)|\,dx < +\infty. \tag{2.2}$$

There exist two functions $f(x,k)$ and $g(x,k)$, such that $f$ and $g$ solve (2.1), and

$$f(x,k) = e^{ixk} + o(1), \quad x \to +\infty, \qquad g(x,k) = e^{-ixk} + o(1), \quad x \to -\infty.$$

We shall say that $f$ and $g$ are the Jost functions. For $k \in \mathbb{R} \setminus \{0\}$, $f(x,k)$ and $\overline{f(x,k)} = f(x,-k)$ are solutions of the same equation (2.1), but with different boundary conditions at $+\infty$, so they are linearly independent. Therefore, we can write

$$ik\,g(x,k) = a(k) f(x,-k) + b(k) f(x,k), \tag{2.3}$$

where $a(k)$ and $b(k)$ are uniquely determined. One finds that $\overline{a(k)} = a(-k)$, $\overline{b(k)} = b(-k)$, and that

$$k^2 + |b(k)|^2 = |a(k)|^2. \tag{2.4}$$

A combination of (2.3) with its complex conjugate then shows that

$$ik\,f(x,k) = a(k) g(x,-k) + b(-k) g(x,k). \tag{2.5}$$

We shall now introduce the elements of the scattering matrix of $p$ when $0 \neq k \in \mathbb{R}$. When doing it we notice that, since $a(k) \neq 0$ by (2.4), it follows that the functions $f$ and $g$ form a basis of solutions of (2.1). Moreover, they extend to analytic functions of $k$ in the upper half-plane, continuous in the closure of that set. Their complex conjugates

Bounds on Scattering Poles


$\overline{f(x,k)} = f(x,-k)$ and $\overline{g(x,k)} = g(x,-k)$ have natural analytic extensions to the lower half-plane instead. Equations (2.3) and (2.5) may now be rewritten in the form

$$\begin{aligned} f(x,-k) &= \frac{ik}{a(k)}\, g(x,k) - \frac{b(k)}{a(k)}\, f(x,k), \\ g(x,-k) &= -\frac{b(-k)}{a(k)}\, g(x,k) + \frac{ik}{a(k)}\, f(x,k), \end{aligned} \tag{2.6}$$

where we have expressed the solutions of (2.1) with analytic extensions to $\mathbb{C}_-$ as linear combinations of those with analytic extensions to $\mathbb{C}_+$. Since $f(x,k)$ was normalized by boundary conditions at $+\infty$, we shall call $r(k) = b(k)/a(k)$ the right reflection coefficient. For similar reasons, $b(-k)/a(k)$ is called the left reflection coefficient, and the function $t(k) = ik/a(k)$ is the transmission coefficient. We notice that the matrix appearing in (2.6) is unitary. This is the scattering matrix, after the off-diagonal elements have been multiplied by $-1$.

The assertions about analyticity in $k \in \mathbb{C}_+$ of the functions $f$ and $g$ are consequences of their integral representations in terms of the intertwining operators between $H_p$ and $H_0$, which we proceed to discuss following [14]. Associated to $p$, there are two operators $A_+ = I + R_+$ and $A_- = I + R_-$, with $H_p A_\pm = A_\pm H_0$, such that $\pm(y - x) \ge 0$ in the support of $A_\pm$. (Here and in what follows we identify operators with their distribution kernels.) The functions $R_\pm$ are continuous up to the boundary in the sets $\pm(y - x) > 0$, and

$$\| R_\pm(x,\cdot) \|_{L^1} = \int |R_\pm(x,y)|\,dy < \infty \tag{2.7}$$

for any $x$. Moreover, $\| R_\pm(x,\cdot) \|_{L^1} \to 0$ as $x \to \pm\infty$. It follows then that $f(x,-k)$ (resp. $g(x,k)$) is the Fourier transform of $A_+(x,y)$ (resp. $A_-(x,y)$) with respect to the second variable, so that

$$f(x,k) = e^{ixk} + \int_x^{\infty} R_+(x,y)\, e^{iyk}\,dy,$$

and

$$g(x,k) = e^{-ixk} + \int_{-\infty}^{x} R_-(x,y)\, e^{-iyk}\,dy, \tag{2.8}$$

where $k \in \mathbb{R}$. Apart from functions that are continuous on the whole of $\mathbb{R}^2$, we have $R_\pm(x,y) \equiv R_{\pm,0}(x,y)$, where

$$R_{+,0}(x,y) = \frac{1}{2}\,\theta_+(y - x) \int_{(x+y)/2}^{\infty} p(t)\,dt$$

and

$$R_{-,0}(x,y) = \frac{1}{2}\,\theta_+(x - y) \int_{-\infty}^{(x+y)/2} p(t)\,dt, \tag{2.9}$$

with $\theta_+(t) = 1$ when $t \ge 0$ and $0$ otherwise.


These are the leading terms in $R_\pm$, and one has that the $R_\pm$ satisfy the equations

$$R_\pm = R_{\pm,0} + L_{p,\pm} R_\pm, \tag{2.10}$$

where $L_p = L_{p,-}$ is given by

$$L_p T(x,y) = \iint E(x - x', y - y')\, p(x')\, T(x',y')\,dx'\,dy', \qquad E(x,y) = \begin{cases} 1/2 & \text{when } x > 0,\ |y| < |x|, \\ 0 & \text{otherwise.} \end{cases}$$

There is a similar expression for $L_{p,+}$. In what follows we shall write $R_{-,0}(x,y) = R_0(x,y)$, $R_-(x,y) = R(x,y)$, when no confusion seems possible. In order to describe the growth properties of $R(x,y)$, and, in particular, to sharpen (2.7), we introduce the nondecreasing functions

$$u(x) = \int_{-\infty}^{x} |p(t)|\,dt, \qquad v(x) = \int_{-\infty}^{x} u(t)\,dt.$$

The solution $R$ of (2.10) is obtained by inverting the operator $I - L_p$;

$$R = \sum_{k=0}^{\infty} L_p^k R_0. \tag{2.11}$$

The following estimate:

$$\left| L_p^k R_0(x,y) \right| \le \frac{1}{2}\, u\!\left(\frac{x+y}{2}\right) \frac{\left(v(x) - v((x+y)/2)\right)^k}{k!}, \quad x \ge y, \tag{2.12}$$

is true, see [13]. Therefore,

$$|R(x,y)| \le \frac{1}{2}\, u\!\left(\frac{x+y}{2}\right) \exp\!\left(v(x) - v\!\left(\frac{x+y}{2}\right)\right), \quad x \ge y, \tag{2.13}$$

and we have

$$\| R(x,\cdot) \|_{L^1} = \int_{-\infty}^{x} |R(x,y)|\,dy \le e^{v(x)} - 1.$$
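The last inequality can be checked numerically. Substituting $t = (x+y)/2$ in (2.13) turns the integral of the majorant into $\int_{-\infty}^{x} u(t)\, e^{v(x) - v(t)}\,dt$, which equals $e^{v(x)} - 1$ because $v' = u$. The following Python sketch confirms this for a hypothetical box potential $p = 1$ on $[0,1]$; the choice of potential, the helper names and the midpoint quadrature are illustrative assumptions and do not come from [13] or [14].

```python
import math

# Hypothetical test potential (an assumption for illustration): p(t) = 1 on [0, 1].
def u(x):
    """u(x) = integral of |p| over (-inf, x] for the box potential."""
    return min(max(x, 0.0), 1.0)

def v(x):
    """v(x) = integral of u over (-inf, x] for the box potential."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x * x
    return 0.5 + (x - 1.0)

def majorant_l1(x, n=20000):
    """Midpoint rule for the y-integral of the right-hand side of (2.13).

    The integrand vanishes when (x + y)/2 < 0, so it suffices to integrate
    over -x <= y <= x.
    """
    if x <= 0.0:
        return 0.0
    h = 2.0 * x / n
    total = 0.0
    for i in range(n):
        y = -x + (i + 0.5) * h
        t = 0.5 * (x + y)
        total += 0.5 * u(t) * math.exp(v(x) - v(t))
    return total * h

def closed_form(x):
    """The bound e^{v(x)} - 1."""
    return math.exp(v(x)) - 1.0

if __name__ == "__main__":
    for x in (0.3, 0.8, 2.5):
        print(x, majorant_l1(x), closed_form(x))
```

For this potential the two columns printed by the script should agree up to the quadrature error.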

Another important result ([14, Lemma 4.2]) is that

$$p(x) R(x,y) \in L^1(\mathbb{R}^2). \tag{2.14}$$

Notice also that if $x \ge a$ in the support of $p$, then it is immediate from (2.13) that

$$2a - x \le y \le x \quad \text{in the support of } R(x,y). \tag{2.15}$$

We introduce now the following representations for the functions $a$ and $b$, given by Melin [14]: There exist temperate real-valued distributions $X$ and $Y$ such that

$$a(k) = \hat{X}(k) \quad \text{and} \quad b(k) = \hat{Y}(k), \tag{2.16}$$


where $X$ and $Y$ are given by the following explicit formulas:

$$X(y) = \delta'(y) - \frac{1}{2} \left( \int_{-\infty}^{+\infty} p(z)\,dz \right) \delta(y) - \frac{1}{2} \int_{-\infty}^{+\infty} p(z) R(z, z+y)\,dz, \tag{2.17}$$

$$Y(y) = \frac{1}{4}\, p\!\left(\frac{y}{2}\right) + \frac{1}{2} \int_{-\infty}^{+\infty} p(z) R(z, y - z)\,dz. \tag{2.18}$$

Here the Fourier transform is normalized as in [8]. We remark that the expressions for $X$ and $Y$ in [14] were given in terms of the kernel $R_+(x,y)$, but using the identities $R_+(x,y,\check{p}) = R(-x,-y,p)$, $X_{\check{p}} = X_p$ and $Y_{\check{p}} = \check{Y}_p$, which appear in [14], formula (5.14), it is easy to see that the representations (2.17) and (2.18) are valid. For future reference we rewrite now (2.3) in the form

$$ik\,g(x,k) = \hat{X}(k) f(x,-k) + \hat{Y}(k) f(x,k). \tag{2.19}$$

A combination of (2.15) with (2.17) and (2.18) shows that

$$\operatorname{chsupp}(Y) \subset \operatorname{chsupp}(p(\cdot/2)) \tag{2.20}$$

and

$$\operatorname{chsupp}(X) \subset [-2d, 0], \tag{2.21}$$

if $d$ is the diameter of the support of $p$. Furthermore,

$$X(y) - \delta'(y) + \frac{1}{2} \left( \int_{-\infty}^{+\infty} p(z)\,dz \right) \delta(y) \in L^1 \cap L^\infty, \tag{2.22}$$

and

$$Y(y) - \frac{1}{4}\, p\!\left(\frac{y}{2}\right) \in L^1 \cap L^\infty. \tag{2.23}$$

It follows that $\hat{X}$ extends to an analytic function in $\operatorname{Im} k > 0$, continuous up to the boundary. We also know that $\hat{X}(k)$ has finitely many zeros in $\operatorname{Im} k > 0$, all of them simple and situated on the imaginary axis. Furthermore, $i\beta$ is a zero precisely when $-\beta^2$ is an eigenvalue of $H_p$.

2.2. Dependence on the potential. We shall now give some symmetry properties of the distributions $X$ and $Y$ with respect to the one-parameter groups $\delta_\lambda p(x) = \lambda^2 p(\lambda x)$, $\lambda > 0$, and $\tau_h p(x) = p(x + h)$, $h \in \mathbb{R}$. It follows from (2.9) and (2.11) that $R_{\delta_\lambda p}(x,y) = \lambda R_p(\lambda x, \lambda y)$. Therefore, the mappings $p \to X_p$ and $p \to Y_p$ commute with the action of the dilatation group. In other words,

$$X_{\delta_\lambda p} = \delta_\lambda X_p, \qquad Y_{\delta_\lambda p} = \delta_\lambda Y_p. \tag{2.24}$$


For the translation group we have instead,

$$X_{\tau_h p} = X_p, \qquad Y_{\tau_h p} = \tau_{2h} Y_p. \tag{2.25}$$

Later we shall also need some continuity properties of the mappings $p \to X_p$ and $p \to Y_p$, which we proceed to discuss. When doing this, we let $q \ge 0$ be a measurable function such that

$$\int_{-\infty}^{\infty} (1 + |x|)\,|q(x)|\,dx < \infty,$$

and set

$$B_q = \{ p \in L^1(\mathbb{R});\ |p| \le q \}.$$

Definition 2.1. We say that a mapping $T : B_q \to W$, where $W$ is a Banach space, is weakly sequentially continuous, if for any sequence $p_j$ in $B_q$ converging to $p \in B_q$ weakly in the space of measures on $\mathbb{R}$ it is true that $T(p_j)$ converges to $T(p)$ in $W$.

Theorem 2.2. The mapping $p \to X_p - \delta'$ is weakly sequentially continuous from $B_q$ to the Banach space of bounded measures on $\mathbb{R}$.

Proof. Assume that $p_j \in B_q$ converges to $p \in B_q$ weakly in the space of measures. It follows then that $p_j$ is also convergent in the weak topology of $L^1(\mathbb{R})$, i.e.

$$\langle u, p_j \rangle = \int u(x) p_j(x)\,dx \to \langle u, p \rangle = \int u(x) p(x)\,dx,$$

for every $u \in L^\infty(\mathbb{R})$. In fact, we may find a sequence $u_\nu \in C_0$ tending to $u$ almost everywhere and boundedly. Since

$$\sup_j \left| \langle p_j - p, u_\nu - u \rangle \right| \le 2 \langle q, |u_\nu - u| \rangle \to 0 \quad \text{as } \nu \to \infty,$$

the assertion follows from the fact that $\langle p_j, u_\nu \rangle \to \langle p, u_\nu \rangle$ for every $\nu$. It follows in particular that

$$P_j(x) = \int_{-\infty}^{x} p_j(y)\,dy$$

converges pointwise to $P(x) = \int_{-\infty}^{x} p(y)\,dy$. Since it is equicontinuous and since $|p_j| \le q$, the convergence must be uniform on $\mathbb{R}$. Let us consider now $R_\infty = R_p$ and $R_j = R_{p_j}$. Let $R_{j,\nu}$ be the contribution to $R_j$ which is homogeneous of degree $\nu \ge 1$ in $p_j$, see (2.11). Allowing $j$ to be $\infty$ and writing $p = p_\infty$, we have then

$$R_{j,\nu+1}(x,y) = \iint E(x - x', y - y')\, p_j(x')\, R_{j,\nu}(x',y')\,dx'\,dy'. \tag{2.26}$$

Let $R_{q,\nu}$ be the corresponding expression with $q$. Then $|R_{j,\nu}| \le R_{q,\nu}$.


We shall prove now that

$$R_j \to R_p \quad \text{in } L^\infty_{\mathrm{loc}}(\mathbb{R}^2) \tag{2.27}$$

as $j \to \infty$. Since it follows by (2.12) that $\sum_{\nu=1}^{\infty} R_{q,\nu}$ is convergent in $L^\infty_{\mathrm{loc}}(\mathbb{R}^2)$, it suffices to prove that $R_{j,\nu}$ is convergent in $L^\infty_{\mathrm{loc}}$ for every $\nu$. In view of (2.9) we have already seen that this is true when $\nu = 1$, so let us assume that $\nu > 1$ and that the statement has already been proved for lower values of $\nu$. We notice that the functions in (2.26) are equicontinuous, and therefore it is sufficient to show the pointwise convergence. Since $x' \le x$ in the support of the integrand in (2.26), and since

$$\iint_{x' \le N} \left| p_j(x') R_{j,\nu}(x',y') \right| dx'\,dy' \le \iint_{x' \le N} q(x') R_q(x',y')\,dx'\,dy',$$

where the right-hand side tends to zero as $N \to -\infty$, we may replace the integration in (2.26) by an integration over a compact set in $x'$ when proving our assertion. The support conditions on $E$ give us then also a bound on $y'$ when $(x,y)$ is kept fixed. It is sufficient therefore to prove the pointwise convergence of (2.26) when the integration is performed over a compact set. Since we know already that $R_{j,\nu}$ converges in $L^\infty_{\mathrm{loc}}$ and $\int p_j(x) u(x)\,dx$ is convergent when $u \in L^\infty$, our assertion follows.

Consider now the expression (2.17), which we write as

$$X_{p_j}(y) = \delta'(y) - \frac{1}{2} \left( \int_{-\infty}^{+\infty} p_j(z)\,dz \right) \delta(y) - \frac{1}{2} f_{p_j}(y).$$

The coefficient in front of $\delta(y)$ converges to the corresponding one for $p$, and we want to show that $f_{p_j} \to f_p$ in $L^1$. Since

$$\left| \int p_j(z) R_j(z, z+y)\,dz \right| \le \int q(z) R_q(z, z+y)\,dz = W(y),$$

where $W \in L^1$, it suffices to show that $f_j(y) = f_{p_j}(y)$ converges to $f_p$ in $L^1_{\mathrm{loc}}$. Set

$$W_n(y) = \int_{-n}^{n} q(z) R_q(z, z+y)\,dz.$$

Then $|W_n| \le W$ and $W_n \to W$ pointwise. It follows by Lebesgue's theorem that $W_n \to W$ in $L^1$ and that the norm in $L^1(\mathbb{R})$ of

$$y \to \int_{|z| > n} p_j(z) R_j(z, z+y)\,dz$$

converges to zero uniformly in $j$ as $n \to \infty$. It suffices therefore to prove the convergence in $L^1_{\mathrm{loc}}(\mathbb{R})$ of

$$y \to \int_{-n}^{n} p_j(z) R_j(z, z+y)\,dz$$

to the corresponding integral when $R_j$ has been replaced by $R$, and $p_j$ by $p$. The proof is now complete, since we know that $R_j \to R$ in $L^\infty_{\mathrm{loc}}$ and $\langle p_j, u \rangle \to \langle p, u \rangle$ when $u \in L^\infty$. □

In the same way one can prove the following result.

Theorem 2.3. The mapping $p \to Y_p(y) - p(y/2)/4$ is weakly sequentially continuous from $B_q$ to $L^1(\mathbb{R})$.


2.3. Proof of the main estimate. We have already observed the important result (2.14), valid for all potentials such that $(1 + |x|)p(x) \in L^1$. The main estimate for the study of scattering poles is contained in the following theorem. Before stating it, we introduce the notation

$$f_p(y) = \int_{-\infty}^{\infty} p(z) R_p(z, z+y)\,dz, \tag{2.28}$$

so that

$$X_p(y) = \delta'(y) - \frac{1}{2} \left( \int_{-\infty}^{\infty} p(z)\,dz \right) \delta(y) - \frac{1}{2} f_p(y).$$

Theorem 2.4. Let $(1 + |x|)e^{2a|x|} p(x) \in L^1$ for some $a > 0$, and set $q(x) = e^{2a|x|} p(x)$. Then

$$\int_{-\infty}^{\infty} e^{a|y|} \left| f_p(y) \right| dy \le 2 e^{2\|p\|}\, \|q\| \left( \|q\|_{L^1} + \|q\|\, \|p\|_{L^1} \right), \tag{2.29}$$

where

$$\|p\| = \int_{-\infty}^{\infty} |x|\,|p(x)|\,dx.$$

Proof. Since $|R_p| \le R_{|p|}$, we may assume that $p \ge 0$. First we shall prove that

$$L_p^k R_{0,p}(x,y) \le e^{a(x+y)} L_p^k R_{0,q}(x,y), \tag{2.30}$$

for all $k \ge 0$. We have

$$R_{0,p}(x,y) = \frac{1}{2} \int_{-\infty}^{(x+y)/2} e^{-2a|t|} q(t)\,dt \le e^{a(x+y)} R_{0,q}(x,y), \quad y \le x,$$

since $|t| \ge -(x+y)/2$. Therefore we assume that $k \ge 1$, and (2.30) has been proved for lower values of $k$. We have

$$\begin{aligned} L_p^k R_{0,p}(x,y) &= \iint E(x - x', y - y')\, p(x')\, L_p^{k-1} R_{0,p}(x',y')\,dx'\,dy' \\ &\le \iint E(x - x', y - y')\, e^{a(x'+y')} p(x')\, L_p^{k-1} R_{0,q}(x',y')\,dx'\,dy', \end{aligned}$$

and we only have to notice that

$$x' + y' \le x + y$$

when $(x',y')$ is in the support of the integrand. This gives (2.30), and after a summation over $k$, we get $R_p(x,y) \le e^{a(x+y)} R_{p,q}(x,y)$, where

$$R_{p,q} = \sum_{k=0}^{\infty} L_p^k R_{0,q}.$$


We have that $y \le 0$ in the support of $R_p(x, x+y)$ and it follows that

$$e^{a|y|} \int_{-\infty}^{\infty} p(z) R_p(z, z+y)\,dz \le \int_{-\infty}^{\infty} q(z) R_{p,q}(z, z+y)\,dz,$$

and then

$$\| e^{a|\cdot|} f_p \|_{L^1} \le \int_{-\infty}^{\infty} q(x)\, \| R_{p,q}(x,\cdot) \|_{L^1}\,dx. \tag{2.31}$$

Set $r(x) = \| R_{p,q}(x,\cdot) \|_{L^1}$. This function is bounded on any interval bounded to the right. We now come to estimate $r(x)$. Since $R_{p,q} = L_p R_{p,q} + R_{0,q}$, it follows that

$$r(x) = \int_{-\infty}^{x} (x - x') p(x') r(x')\,dx' + \int_{-\infty}^{x} (x - x') q(x')\,dx'.$$

We shall estimate $r(x)$ first for negative $x$. Set $q^-(x) = q(x)$ when $x < 0$ and $q^-(x) = 0$ for $x \ge 0$. Set also $s(x) = |x|\,p(x)$. If $x \le 0$ we have

$$r(x) \le \int_{-\infty}^{x} s(x') r(x')\,dx' + \|q^-\|.$$

This inequality may be written

$$\varphi'(x) \le s(x)\varphi(x) + s(x)\|q^-\|,$$

where $\varphi(x) = \int_{-\infty}^{x} s(y) r(y)\,dy$. If $S(x) = \int_{-\infty}^{x} s(y)\,dy$ it follows that

$$\left( e^{-S} \varphi \right)' \le -\|q^-\|\, \frac{d}{dx} e^{-S}.$$

Hence $e^{-S(x)} \varphi(x) \le (1 - e^{-S(x)}) \|q^-\|$, and it follows that

$$r(x) \le \varphi(x) + \|q^-\| \le e^{\|p^-\|} \|q^-\|, \quad x \le 0. \tag{2.32}$$

Next we assume that $x \ge 0$. Let us set

$$A = e^{\|p^-\|} \|q^-\|\, \|p^-\|_{L^1} + \|q\|_{L^1},$$

and

$$B = e^{\|p^-\|} \|p^-\|\, \|q^-\| + \|q^-\|.$$

Then it is easily seen that

$$r(x) \le x \int_0^x p(x') r(x')\,dx' + Ax + B.$$

It follows if $\psi(x) = \int_0^x p(y) r(y)\,dy$ that

$$\psi'(x) \le s(x)\psi(x) + A s(x) + B p(x).$$

If $h(x) = \int_0^x s(y)\,dy$ this may be written

$$\left( e^{-h} \psi \right)' \le -A\, \frac{d}{dx} e^{-h} + B e^{-h} p(x).$$


Hence

$$e^{-h} \psi \le A(1 - e^{-h}) + B \|p^+\|_{L^1}$$

and

$$\psi(x) \le A\left( e^{h(x)} - 1 \right) + e^{h(x)} B \|p^+\|_{L^1}.$$

Here $p^+(x) = p(x)$ for $x > 0$ and $0$ otherwise. It follows when $x \ge 0$ that

$$r(x) \le x\psi(x) + Ax + B \le A x e^{h} + B x e^{h} \|p^+\|_{L^1} + B \le x e^{\|p^+\|} \left( A + B \|p^+\|_{L^1} \right) + B.$$

We have

$$\begin{aligned} A + B \|p^+\|_{L^1} &\le e^{\|p^-\|} \|q^-\|\, \|p\|_{L^1} + \|q\|_{L^1} + e^{\|p^-\|} \|p^-\|\, \|q^-\|\, \|p\|_{L^1} + \|q^-\|\, \|p^+\|_{L^1} \\ &= e^{\|p^-\|} \|p\|_{L^1} \|q^-\| \left( 1 + \|p^-\| \right) + \|q\|_{L^1} + \|q^-\|\, \|p^+\|_{L^1} \\ &\le \|p\|_{L^1} e^{2\|p^-\|} \|q^-\| + \|q\|_{L^1} + \|q^-\|\, \|p\|_{L^1} \\ &\le 2 e^{2\|p^-\|} \|p\|_{L^1} \|q^-\| + \|q\|_{L^1}, \end{aligned}$$

and

$$B \le e^{\|p^-\|} \|q^-\| \left( 1 + \|p^-\| \right) \le e^{2\|p\|} \|q^-\|.$$

Hence it follows that for $x \ge 0$,

$$r(x) \le x e^{2\|p\|} \left( 2 \|p\|_{L^1} \|q^-\| + \|q\|_{L^1} \right) + e^{2\|p\|} \|q^-\|.$$

Combining it with (2.32) we obtain

$$\begin{aligned} \int_{-\infty}^{\infty} q(x) r(x)\,dx &\le e^{\|p^-\|} \|q^-\|\, \|q^-\|_{L^1} + e^{2\|p\|} \|q^-\|\, \|q^+\|_{L^1} + e^{2\|p\|} \left( 2 \|p\|_{L^1} \|q^-\|\, \|q^+\| + \|q\|_{L^1} \|q^+\| \right) \\ &\le 2 e^{2\|p\|} \|q\| \left( \|q\|_{L^1} + \|p\|_{L^1} \|q\| \right). \end{aligned}$$

In view of (2.31) this completes the proof. □

Remark. We notice that the estimate (2.29) is invariant under scaling. Indeed, when $\delta_\lambda p(x) = \lambda^2 p(\lambda x)$, $\lambda > 0$, it follows from (2.24) that $f_{\delta_\lambda p}(x) = \lambda^2 f_p(\lambda x)$ and then

$$\int_{-\infty}^{\infty} e^{a|y|} f_p(y)\,dy = \frac{1}{\lambda} \int_{-\infty}^{\infty} e^{a\lambda|y|} f_{\delta_\lambda p}(y)\,dy.$$

The assertion follows since when $p$ is replaced by $\delta_\lambda p$ and $a$ by $\lambda a$, then the right hand side of (2.29) is multiplied by $\lambda$.

Remark. It follows from the proof of Theorem 2.4, or, alternatively, directly from (2.13) that when $x \le 0$ in the support of $p$, the estimate (2.29) improves to the following scaling invariant bound:

$$\int_{-\infty}^{0} e^{a|y|} \left| f_p(y) \right| dy \le e^{\|p\|}\, \|p\|_{L^1}\, \|q\|. \tag{2.33}$$
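The scaling behaviour of the right-hand side of (2.29) described in the first remark can be illustrated numerically. The Python sketch below approximates $\|p\|$, $\|p\|_{L^1}$, $\|q\|$ and $\|q\|_{L^1}$ by a midpoint rule and compares the right-hand side for $(p, a)$ with that for $(\delta_\lambda p, \lambda a)$. The function names, the sample potential and the quadrature are hypothetical choices made only for this illustration.

```python
import math

def norms(p, a, x0, x1, n=4000):
    """Midpoint-rule approximations of ||p||, ||p||_{L1}, ||q||, ||q||_{L1}
    for a potential p supported in [x0, x1], where q(x) = exp(2a|x|) p(x)."""
    h = (x1 - x0) / n
    wp = l1p = wq = l1q = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        ap = abs(p(x))
        aq = math.exp(2.0 * a * abs(x)) * ap
        wp += abs(x) * ap * h    # weighted norm ||p||
        l1p += ap * h            # ||p||_{L1}
        wq += abs(x) * aq * h    # weighted norm ||q||
        l1q += aq * h            # ||q||_{L1}
    return wp, l1p, wq, l1q

def rhs_229(p, a, x0, x1):
    """Right-hand side of (2.29): 2 e^{2||p||} ||q|| (||q||_{L1} + ||q|| ||p||_{L1})."""
    wp, l1p, wq, l1q = norms(p, a, x0, x1)
    return 2.0 * math.exp(2.0 * wp) * wq * (l1q + wq * l1p)

def dilate(p, lam):
    """The dilated potential delta_lambda p(x) = lambda^2 p(lambda x)."""
    return lambda x: lam * lam * p(lam * x)

if __name__ == "__main__":
    # Hypothetical sample potential supported in [-1, 2].
    p = lambda x: math.cos(x) ** 2 if -1.0 <= x <= 2.0 else 0.0
    a, lam = 0.3, 2.5
    base = rhs_229(p, a, -1.0, 2.0)
    scaled = rhs_229(dilate(p, lam), lam * a, -1.0 / lam, 2.0 / lam)
    # The ratio should reproduce lam, in line with the remark above.
    print(scaled / base, lam)
```

The same experiment with other compactly supported sample potentials is a quick way to convince oneself that $\|p\|$ and $\|q\|$ are dilation invariant while the $L^1$ norms scale linearly in $\lambda$.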


The estimate (2.29) will be particularly useful when estimating the Fourier transform of $f_p$ at low frequencies. On the other hand, for high frequencies the following bound on $f_p$ is available. This bound will be needed later.

Theorem 2.5. Assume that $(1 + |x|)e^{2a|x|} p(x) \in L^1$ for some $a > 0$. Then

$$\int_{-\infty}^{\infty} e^{a|y|} \left| f_p(y) \right| dy \le \frac{1}{a} \exp\left( \|p\|_{L^1}/a \right) \iint e^{2a|x-y|}\, |p(x) p(y)|\,dx\,dy. \tag{2.34}$$

Proof. Computing the Fourier transform in (2.17), we find that

$$\hat{X}(k) = ik - \frac{1}{2} \int_{-\infty}^{\infty} p(x) e^{ixk} g(x,k)\,dx, \tag{2.35}$$

where $g(x,k)$ is the left Jost function, introduced in (2.8). Put $m(x,k) = e^{ixk} g(x,k)$. Then $m(x,k)$ satisfies

$$m(x,k) = 1 + \int_{-\infty}^{x} D_k(x - t) p(t) m(t,k)\,dt, \tag{2.36}$$

with

$$D_k(y) = \frac{1}{2ik} \left( e^{2iky} - 1 \right),$$

see [3]. The integral representation (2.35) can also be found in [3]. Equation (2.36) is solved by iteration,

$$m(x,k) = 1 + \sum_{n=1}^{\infty} g_n(x,k),$$

where

$$g_n(x,k) = \int_{x_n \le x_{n-1} \le \dots \le x} D_k(x - x_1) \cdots D_k(x_{n-1} - x_n)\, p(x_1) \cdots p(x_n)\,dx_1 \cdots dx_n.$$

When $\operatorname{Im} k \ge -a$, we estimate $|D_k(y)|$ by $e^{2ay}/|k|$, $y \ge 0$, and this gives

$$\begin{aligned} |g_n(x,k)| &\le \frac{e^{2ax}}{|k|^n} \int_{x_n \le x_{n-1} \le \dots \le x} |p(x_1)| \cdots |p(x_{n-1})|\, e^{-2a x_n} |p(x_n)|\,dx_1 \cdots dx_n \\ &\le \frac{e^{2ax}}{|k|^n}\, \frac{(u(x))^{n-1}}{(n-1)!} \int_{-\infty}^{x} e^{-2at} |p(t)|\,dt, \end{aligned}$$

where

$$u(x) = \int_{-\infty}^{x} |p(t)|\,dt.$$

We get

$$|m(x,k) - 1| \le \frac{e^{2ax}}{|k|} \int_{-\infty}^{x} e^{-2at} |p(t)|\,dt\, \exp\left( \|p\|_{L^1}/|k| \right). \tag{2.37}$$

Now

$$\hat{X}(k) = ik - \frac{1}{2} \int_{-\infty}^{\infty} p(x)\,dx - \frac{1}{2} \hat{f}_p(k), \tag{2.38}$$


and using (2.35) and (2.37) we get the following bound on $\hat{f}_p$,

$$\left| \hat{f}_p(k) \right| \le \frac{1}{|k|} \exp\left( \|p\|_{L^1}/|k| \right) \iint e^{2a|x-y|}\, |p(x)|\,|p(y)|\,dx\,dy, \quad \operatorname{Im} k \ge -a. \tag{2.39}$$

In particular when $p \ge 0$ and $k = -ia$ is purely imaginary, then $f_p \ge 0$ and it follows that

$$\int_{-\infty}^{0} e^{a|y|} f_p(y)\,dy = \hat{f}_p(-ia) \le \frac{1}{a} \exp\left( \|p\|_{L^1}/a \right) \iint e^{2a|x-y|}\, |p(x)|\,|p(y)|\,dx\,dy.$$

Since $|f_p| \le f_{|p|}$, the proposition follows. □

Remark. Using (2.24) and (2.25) we see that the estimate (2.34) is both scaling and translation invariant.

2.4. Density of the poles. We shall now introduce the relevant class of potentials. We say that a potential $p$ is super-exponentially decaying if $e^{2a|x|} p(x) \in L^1(\mathbb{R})$ for any $a > 0$. It follows then from Theorem 2.4 that when $p$ is super-exponentially decaying, $\hat{X}_p$ extends to an entire analytic function, and it is easily seen that $\hat{Y}_p$ enjoys the same property. The relation (2.4) extends to $\mathbb{C}$ as

$$k^2 + \hat{Y}(k)\hat{Y}(-k) = \hat{X}(k)\hat{X}(-k), \tag{2.40}$$

since $\overline{\hat{X}(k)} = \hat{X}(-k)$, $\overline{\hat{Y}(k)} = \hat{Y}(-k)$, $k \in \mathbb{R}$. The zeros of $\hat{X}(k)$ in $\operatorname{Im} k < 0$ will be called scattering poles or resonances. These are the poles of the transmission coefficient $t(k) = ik/\hat{X}(k)$ in $\mathbb{C}_-$. It follows from (2.40) that the scattering poles coincide with the poles of the reflection coefficient

$$r(k) = \frac{\hat{Y}(k)}{\hat{X}(k)}$$

in $\mathbb{C}_-$. When $p$ decays at some fixed exponential rate, the continuation of $\hat{X}$ can be made to a strip $S$ around the real axis. In this case, the scattering poles are the zeros of $\hat{X}$ in $S_- = \{ k \in S;\ \operatorname{Im} k < 0 \}$.

Theorems 2.4 and 2.5 have a direct application to the problem of estimating the density of resonances. We introduce the counting function $N(r)$ as the number of scattering poles in the disc $|k| \le r$, counted with their multiplicities. In the case of compactly supported potentials it was proved by Zworski [17] that

$$N(r) = \frac{2d}{\pi}\, r + o(r), \quad r \to \infty, \tag{2.41}$$

where $d$ is the diameter of the support of the potential. For some special class of super-exponentially decaying potentials and using different methods, Froese [5] established that

$$N(r) = C r^{\rho} + o(r^{\rho}),$$


where $\rho$ is the order of growth of the Fourier transform of the potential. Here we shall give an upper bound on $N(r)$ for a general super-exponentially decaying potential. Similar bounds in any odd dimension have been obtained by Froese [6]. We shall nevertheless give a proof, as it is short and serves as a preparation for the more general results to follow. The bound on $N(r)$ will be given in terms of the function

$$\varphi_p(r) = \log \iint e^{2r|x-y|}\, |p(x) p(y)|\,dx\,dy, \quad r \ge 0. \tag{2.42}$$

We notice that this is a strictly increasing convex function, with linear growth at infinity if and only if $p$ is compactly supported.

Theorem 2.6. Let $p$ be a super-exponentially decaying potential. Then

$$\int_{r/2}^{r} \frac{N(t)}{t}\,dt \le C + \varphi_p(r), \quad r \ge 1, \tag{2.43}$$

for some $C$ depending on $p$.

Proof. First, we shall estimate the growth of $\hat{X}_p$ in the lower half-plane. By (2.39) we have

$$\left| \hat{f}_p(k) \right| \le \frac{1}{|k|} \exp\left( \|p\|_{L^1}/|k| \right) \iint e^{2|\beta(x-y)|}\, |p(x) p(y)|\,dx\,dy,$$

when $k = \alpha + i\beta$. Then

$$\left| \hat{f}_p(k) \right| \le C(p)\, e^{\varphi_p(|\beta|)}, \quad |k| = r \ge 1.$$

Here $C(p)$ denotes different constants, depending on the potential. Since $\hat{X}_p(k) = ik - (1/2)\left( \int p \right) - (1/2)\hat{f}_p(k)$, we have that a similar bound holds for $\hat{X}_p$,

$$\left| \hat{X}_p(k) \right| \le C(p) \exp\left( \varphi_p(r) \right), \quad |k| \le r. \tag{2.44}$$

Assume now that $\hat{X}_p(0) \neq 0$. Then Jensen's formula, see [16],

$$\int_0^r \frac{N(t)}{t}\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \left| \hat{X}_p(re^{i\theta}) \right| d\theta - \log \left| \hat{X}_p(0) \right|$$

together with (2.44) implies (2.43) at once. When $\hat{X}_p(0) = 0$, we use the fact that the zero is of order one, since $|\hat{X}_p(k)|^2 \ge k^2$, $k \in \mathbb{R}$. Then we can apply the preceding argument to $\hat{X}_p(k)/k$, the conclusion being the same. □

Remark. Since when $a > 1$ we have

$$N(r) = (\log a)^{-1} N(r) \int_r^{ar} \frac{dt}{t} \le (\log a)^{-1} \int_r^{ar} \frac{N(t)}{t}\,dt,$$

we also get a bound on $N(r)$.
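To make the growth of the function $\varphi_p$ in (2.42) concrete, here is a small Python sketch for a hypothetical box potential $p = c$ on an interval of length $d$ (our own choice of example, not one taken from the text). For such $p$ the double integral reduces to $2c^2 \int_0^d (d - s) e^{2rs}\,ds = c^2 (e^{2rd} - 1 - 2rd)/(2r^2)$ for $r > 0$, so that $\varphi_p(r) = 2dr + O(\log r)$, in agreement with the linear growth for compactly supported potentials noted after (2.42). The function names and default parameters are assumptions made for illustration.

```python
import math

def phi_box(r, c=1.0, d=1.0):
    """phi_p(r) from (2.42) for the box potential p = c on an interval of length d."""
    if r == 0.0:
        return math.log((c * d) ** 2)
    # expm1 avoids part of the cancellation in e^{2rd} - 1 - 2rd for small r*d
    double_integral = c * c * (math.expm1(2.0 * r * d) - 2.0 * r * d) / (2.0 * r * r)
    return math.log(double_integral)

def phi_box_numeric(r, c=1.0, d=1.0, n=20000):
    """Midpoint-rule check based on the reduction to 2 c^2 int_0^d (d - s) e^{2rs} ds."""
    h = d / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (d - s) * math.exp(2.0 * r * s)
    return math.log(2.0 * c * c * total * h)

if __name__ == "__main__":
    for r in (0.5, 2.0, 10.0):
        print(r, phi_box(r), phi_box_numeric(r))
    # Difference quotient at large r; it approaches 2d as r grows.
    print((phi_box(60.0) - phi_box(50.0)) / 10.0)
```

The difference quotient printed at the end stays slightly below $2d$ because of the logarithmic correction $-\log(2r^2)$.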


3. Scattering Poles Near the Real Axis

3.1. Pole-free regions for compactly supported potentials. As a preparation for later considerations, and also, since some of the results are only valid in this case, we shall work here with compactly supported integrable potentials. Let $[a,b]$ be the smallest interval containing the support of the potential $p$. Then it follows from (2.20) and (2.21) that $\operatorname{supp} X \subseteq [-2(b-a), 0]$, $\operatorname{supp} Y \subseteq [2a, 2b]$, and therefore $\hat{X}$ and $\hat{Y}$ are entire functions. Moreover, it was proved by Zworski [17] that $[-2(b-a), 0]$ is the smallest interval containing the support of $X$. The functions $f(x,k)$ and $g(x,k)$ are also entire analytic functions of $k$. We have

$$f(x,k) = e^{ixk} \ \text{for } x > b, \quad \text{and} \quad g(x,k) = e^{-ixk} \ \text{for } x < a.$$

Recall that the scattering poles, or resonances of $p$, are defined as the points $k \in \mathbb{C}_-$ for which $\hat{X}(k) = 0$. From (2.19) it follows that the poles can be characterized in the following way: $k \in \mathbb{C}_-$ is a scattering pole if and only if there exists a function $\varphi(x)$ such that

$$-\varphi''(x) + p(x)\varphi(x) = k^2 \varphi(x), \tag{3.1}$$

and

$$\varphi(x) = \begin{cases} A e^{ixk} & \text{when } x > b, \\ e^{-ixk} & \text{when } x < a, \end{cases}$$

for some number $A$. The solution $\varphi(x) = g(x,k)$ grows exponentially at $\pm\infty$. Let us also notice that the poles are symmetric with respect to reflection in the imaginary axis. Indeed, we have that

$$\overline{\hat{X}(k)} = \hat{X}(-\bar{k}), \quad k \in \mathbb{C}, \tag{3.2}$$

since

$$\overline{\hat{X}(k)} = \hat{X}(-k), \quad k \in \mathbb{R}.$$

Some lower bounds on the imaginary part of the poles outside the imaginary axis, depending on the real part, were obtained in the works [7] and [4]. Here we shall be concerned with the existence of a pole-free strip below the real axis.

Theorem 3.1. Let $p \in L^1(\mathbb{R})$ be supported by an interval of length $d > 0$. Define the function

$$h(p) = \frac{1}{4d}\, e^{-2d\|p\|_{L^1}}.$$

Then we have

1. The set $S = \{ k;\ -h(p) < \operatorname{Im} k < 0,\ \operatorname{Re} k \neq 0 \}$ contains no scattering poles of $p$.

2. If $e\,d\,\|p\|_{L^1} < 1$ then the interval $-i\left( 0, -(1/2d) \log(d\|p\|_{L^1}) \right)$ contains at most one pole of $p$.

3. Assume that $p \ge 0$. Then the strip $S = \{ k;\ -h(p) < \operatorname{Im} k < 0 \}$ contains at most one pole of $p$. Moreover, set

$$g(p) = \min\left( h(p), \frac{\|p\|_{L^1}}{2} \right).$$

Then the strip $\Sigma = \{ k \in \mathbb{C},\ -g(p) < \operatorname{Im} k < 0 \}$ is a pole-free region.

Remark. It follows from Theorem 3.1 that, as a sequence of nonnegative potentials tends to infinity, the scattering poles can approach the real axis at most exponentially fast.

The following proposition gives the first part of the theorem.

Proposition 3.2. Let $p \in L^1(\mathbb{R})$ be supported by an interval of length $d > 0$. Assume that $k$ is a scattering pole of $p$ with $\operatorname{Re} k \neq 0$. Then,

$$d\,|\operatorname{Im} k| \ge \frac{1}{4}\, e^{-2d\|p\|_{L^1}}. \tag{3.3}$$

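Proof. By (2.24) and (2.25) we know that the scattering poles are invariant under translations of the potential, and $k$ is a pole of $p$ if and only if $\lambda k$ is a pole of $\delta_\lambda p(x) = \lambda^2 p(\lambda x)$. Therefore we may assume that the support of $p$ is contained in the interval $[0,1]$. The function $\varphi(x,k)$ satisfies

$$-\varphi''(x,k) + p(x)\varphi(x,k) = k^2 \varphi(x,k), \tag{3.4}$$

and

$$\varphi'(0) = -ik\varphi(0), \qquad \varphi'(1) = ik\varphi(1). \tag{3.5}$$

Multiplying (3.4) by $\overline{\varphi(x,k)}$ and integrating by parts, we get

$$\begin{aligned} k^2 \int_0^1 |\varphi(x,k)|^2\,dx &= \int_0^1 \left( -\varphi''(x,k)\overline{\varphi(x,k)} + p(x)\,|\varphi(x,k)|^2 \right) dx \\ &= -ik\left( |\varphi(0)|^2 + |\varphi(1)|^2 \right) + \int_0^1 \left( |\varphi'(x,k)|^2 + p(x)\,|\varphi(x,k)|^2 \right) dx. \end{aligned}$$

Therefore, taking imaginary parts,

$$-\operatorname{Re} k \left( |\varphi(0)|^2 + |\varphi(1)|^2 \right) = 2 \operatorname{Re} k\, \operatorname{Im} k \int_0^1 |\varphi(x,k)|^2\,dx,$$

and, as $\operatorname{Re} k \neq 0$, we get

$$|\operatorname{Im} k| = \frac{|\varphi(0)|^2 + |\varphi(1)|^2}{2 \int_0^1 |\varphi(x,k)|^2\,dx}. \tag{3.6}$$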


We shall now estimate $\varphi(x,k) = g(x,k)$ when $x \in [0,1]$. We may write

$$g(x,k) = e^{-ixk} + \int_{-\infty}^{x} R(x,y)\, e^{-iyk}\,dy. \tag{3.7}$$

It follows from (2.15) that $R(x,y) \neq 0$ only if $-x \le y \le x$. Therefore, writing $\beta = \operatorname{Im} k$, we get

$$\begin{aligned} |g(x,k)| &\le e^{-x|\beta|} + \int_{-x}^{x} |R(x,y)|\, e^{-y|\beta|}\,dy \\ &\le 1 + e^{x|\beta|} \int_{-x}^{x} |R(x,y)|\,dy \le e^{|\beta|} \left( 1 + \int_{-x}^{x} |R(x,y)|\,dy \right). \end{aligned}$$

Using the estimate (2.13), we get

$$\int_{-x}^{x} |R(x,y)|\,dy \le e^{v(x)} \int_0^x u(t) e^{-v(t)}\,dt = e^{v(x) - v(0)} - 1 = e^{v(x)} - 1,$$

since $v(0) = 0$. Therefore, $|g(x,k)| \le e^{|\beta| + v(x)}$, $0 < x < 1$. Since $v$ is increasing and $v(1) = \int_0^1 (1 - t)\,|p(t)|\,dt \le \|p\|_{L^1}$, we get

$$\int_0^1 |g(x,k)|^2\,dx \le e^{2|\beta| + 2\|p\|_{L^1}}. \tag{3.8}$$

As $|g(0,k)|^2 = 1$, it follows from (3.6) and (3.8) that

$$|\beta| \ge \frac{1}{2 e^{2|\beta| + 2\|p\|_{L^1}}}.$$

Hence if $s = 2|\beta|$ and $t = 2\|p\|_{L^1}$, we have $s e^s \ge e^{-t}$. If $\sigma = e^{-t}/2$ we have

$$\sigma e^{\sigma} = \frac{1}{2}\, e^{-t} \exp\left( e^{-t}/2 \right) \le \frac{e^{1/2}}{2}\, e^{-t} \le e^{-t} \le s e^s.$$

Hence $\sigma \le s$, i.e.

$$|\operatorname{Im} k| \ge \frac{1}{4}\, e^{-2\|p\|_{L^1}}.$$

This completes the proof. □

The second part of the theorem is given in the following proposition.

Proposition 3.3. Let $p \in L^1$ be supported by an interval of length $d$ and assume that $e\,d\,\|p\|_{L^1} < 1$. Then the interval $-i\left( 0, -(1/2d) \log(d\|p\|_{L^1}) \right)$ contains at most one pole of $p$.


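Proof. When proving the proposition we may assume that $\operatorname{supp}(p) \subset [-1, 0]$ in view of (2.24) and (2.25). Consider

$$\varphi(\lambda) = \hat{X}_p(-i\lambda) = \lambda - \frac{1}{2} \int p(y)\,dy - \frac{1}{2} \int_{-\infty}^{0} f_p(y) e^{\lambda|y|}\,dy, \quad \lambda \ge 0,$$

where

$$f_p(y) = \int_{-\infty}^{\infty} p(z) R_p(z, z+y)\,dz.$$

Since $y \in [-2, 0]$ in $\operatorname{supp}(f_p)$, using (2.33) we obtain

$$\begin{aligned} \int_{-\infty}^{0} |y| \left| f_p(y) \right| e^{\lambda|y|}\,dy &\le 2 \int_{-\infty}^{0} \left| f_p(y) \right| e^{\lambda|y|}\,dy \\ &\le 2 e^{\|p\|}\, \|p\|_{L^1} \int_{-\infty}^{\infty} e^{2\lambda|x|}\, |x|\,|p(x)|\,dx \le 2 e^{\|p\|_{L^1} + 2\lambda}\, \|p\|_{L^1}^2. \end{aligned}$$

Therefore if $e\|p\|_{L^1} < 1$ we have

$$\int_{-\infty}^{0} |y| \left| f_p(y) \right| e^{\lambda|y|}\,dy \le 2 e^{2\lambda} \|p\|_{L^1},$$

and then

$$\varphi'(\lambda) = 1 - \frac{1}{2} \int_{-\infty}^{0} |y|\, e^{\lambda|y|} f_p(y)\,dy \ge 1 - \frac{1}{2} \int_{-\infty}^{0} |y|\, e^{\lambda|y|} \left| f_p(y) \right| dy \ge 1 - e^{2\lambda} \|p\|_{L^1} > 0,$$

if $\lambda \in \left( 0, -(1/2) \log(\|p\|_{L^1}) \right)$. Therefore $\varphi(\lambda)$ has at most one zero in this interval, and the proof is complete. □

It remains to prove the third assertion in the theorem. In the case of nonnegative potentials additional information on the purely imaginary poles is available.

Proposition 3.4. Let $p \in L^1$ be super-exponentially decaying and nonnegative. Then $H_p$ can have at most two poles on the imaginary axis. If $k$ is such a pole, then

$$|k| > \eta_0(p) \equiv (1/2)\|p\|_{L^1} + (1/2) \int_{-\infty}^{\infty} p(x)\, \| R(x,\cdot) \|_{L^1}\,dx. \tag{3.9}$$

Define the functions

$$\eta_1(p) = \sup\left\{ \lambda > 0;\ \int_{-\infty}^{\infty} |y|\, f_p(y) e^{\lambda|y|}\,dy < 2 \right\} \in [-\infty, \infty)$$

and

$$\eta(p) = \max\left( \eta_0(p), \eta_1(p) \right),$$

where $f_p(y)$ is defined in (2.28). Then there can be at most one pole of the form $-i\lambda$, where $\lambda \in (0, \eta(p))$. Finally, if $p$ is such that

$$\int_{-\infty}^{\infty} |y|\, f_p(y)\,dy \ge 2, \tag{3.10}$$

then $H_p$ has no purely imaginary poles.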


Proof. We may assume that $p$ is not identically zero. Since

$$f_p(y) = \int_{-\infty}^{\infty} p(z) R_p(z, z+y)\,dz \ge 0,$$

it follows that the function

$$\varphi(\lambda) = \hat{X}_p(-i\lambda) = \lambda - \frac{1}{2} \int p(y)\,dy - \frac{1}{2} \int_{-\infty}^{0} f_p(y) e^{\lambda|y|}\,dy, \quad \lambda \ge 0, \tag{3.11}$$

is concave for $\lambda \ge 0$, and $\varphi(0) < 0$. Therefore, there can be at most two zeros on $(0, \infty)$, and if $\varphi(\lambda) = 0$, then

$$2\lambda > \|p\|_{L^1} + \int_{-\infty}^{\infty} f_p(y)\,dy = \|p\|_{L^1} + \int_{-\infty}^{\infty} p(x)\, \| R(x,\cdot) \|_{L^1}\,dx,$$

which is (3.9). Also, if $\varphi'(0) \le 0$, then $\varphi(\lambda) < 0$ for $\lambda > 0$, and this gives (3.10). Finally, we shall estimate the length of an interval, containing at most one purely imaginary pole of $H_p$. Since $\varphi'(\lambda) > 0$ on $(0, \eta_1(p))$, it follows that $\varphi(\lambda)$ has at most one zero in this interval. Combining this with (3.9) completes the proof. □

Remark. Let $H_{\alpha p} = -D^2 + \alpha p$, where $p \ge 0$ is some fixed super-exponentially decaying potential, and $\alpha > 0$ is the coupling constant. Using the arguments of Proposition 3.4, we may draw the following conclusions concerning the behaviour of the purely imaginary poles as functions of $\alpha$. When $\alpha = 0$, then $\hat{X}(0) = 0$. When $\alpha > 0$ is sufficiently small, the function $\varphi(\lambda) = \varphi_\alpha(\lambda)$ has exactly two zeros $\lambda_1(\alpha)$ and $\lambda_2(\alpha)$, with $\lambda_1(\alpha)$ close to $0$, and $\lambda_2(\alpha)$ close to $+\infty$. As $\alpha$ grows, the distance between the two poles decreases. For a certain value of the coupling constant $\alpha_0$, the poles meet, the function $\varphi_{\alpha_0}$ having a zero of multiplicity two. Increasing the coupling constant further results in splitting the double root, and the poles leave the imaginary axis, so that (3.2) is respected.

We shall now complete the proof of Theorem 3.1. It follows from Propositions 3.2 and 3.4 that when $p \ge 0$, then the strip $\Sigma = \{ k \in \mathbb{C},\ -g(p) < \operatorname{Im} k < 0 \}$ is a pole-free region. We have to consider therefore $S = \{ k \in \mathbb{C},\ -h(p) < \operatorname{Im} k < 0 \}$. From Proposition 3.2 we know that $S$ contains no poles off the imaginary axis, and we only have to prove that the interval $(0, h(p))$ contains at most one $\lambda$ such that $-i\lambda$ is a resonance. Let $\eta_0(p)$ and $\eta_1(p)$ be defined as in Proposition 3.4, and consider

$$\eta(p) = \max\left( \eta_0(p), \eta_1(p) \right).$$

We know that the interval $-i(0, \eta(p))$ contains at most one pole, and it suffices therefore to prove that $\eta(p) \ge h(p)$. When doing this, we may again assume that $\operatorname{supp}(p) \subset [-1, 0]$, see (2.24) and (2.25). It follows then as in Proposition 3.3 that

$$\int_{-\infty}^{0} |y|\, f_p(y) e^{\lambda|y|}\,dy \le 2 e^{2\lambda + \|p\|_{L^1}}\, \|p\|_{L^1}^2.$$


We write $t = \|p\|_{L^1}$, so that $\eta_0(p) \ge (1/2)t$ and $\eta_1(p) \ge \lambda$ if $e^{2\lambda + t} t^2 = 1$. Then the inequality $\eta(p) \ge h(p)$ holds if

$$\frac{1}{2e^{2t}} \le \max\left( t, \log\left( \frac{e^{-t}}{t^2} \right) \right).$$

It is enough to prove this when $1/(2\exp(2t)) > t$. But then we have

$$\log\left( \frac{e^{-t}}{t^2} \right) > \log\left( 4 e^{-t} e^{4t} \right) > 1 > \frac{1}{2e^{2t}}.$$

The proof of Theorem 3.1 is now complete.

3.2. A pole-free strip for exponentially decaying potentials. In the beginning of this section it will be assumed that the potential $p$ is such that $(1 + |x|)e^{2a|x|} p(x) \in L^1$ for some $a > 0$. It follows from Theorem 2.4 that $\hat{X}_p$ is analytic in $\operatorname{Im} k > -a$ and continuous up to the boundary of this set. We shall study the location of the scattering poles near the real axis. In particular, we shall be interested in estimates that are uniform in $p$. When $q \ge 0$ is such that $(1 + |x|)e^{2a|x|} q(x) \in L^1$, set

$$B_q = \{ p \in L^1(\mathbb{R});\ |p| \le q \}.$$

If $\mu$ is a measure, such that $|\mu| \le q$, where $q \in L^1_{\mathrm{loc}}$, then $\mu$ is absolutely continuous. The following result is then immediate from the first part of the proof of Theorem 2.2.

Proposition 3.5. Let $p_j$ be a sequence in $B_q$. Then there is a subsequence $p_{j_k}$ and some $p \in B_q$ such that $p_{j_k} \to p$ weakly in $L^1(\mathbb{R})$.

We have the following

Theorem 3.6. Let $K$ be a compact set in $\operatorname{Im} k > -a$ and $F \subset L^1(\mathbb{R})$ be sequentially closed in the weak topology of measures. Assume that there is an integer $n$ such that $\hat{X}_p$ has at most $n$ zeros in $K$ when $p \in B_q \cap F$. Then there is an open neighbourhood $\Omega$ of $K$ such that $\hat{X}_p$ has at most $n$ zeros in $\Omega$ when $p \in B_q \cap F$. (All zeros are counted with multiplicities.)

Proof. We write $K = \cap_{j=1}^{\infty} \Omega_j$, where the $\Omega_j$ form a decreasing sequence of small open neighbourhoods of $K$. Assume that the statement is false. Then we may for every $j$ find $p_j \in B_q \cap F$ such that $\hat{X}_j = \hat{X}_{p_j}$ has at least $n + 1$ zeros in $\Omega_j$. Passing to a subsequence, we may assume that $p_j \to p$ weakly in $L^1$, where $p \in B_q \cap F$. Then we know by Theorem 2.2 that $\hat{X}_j$ converges uniformly to $\hat{X}_p$ in the upper half-plane, and since the sequence $\hat{X}_j$ is bounded in the space of functions, analytic in the set $\operatorname{Im} k > -a$, it follows by a normal families argument that $\hat{X}_j \to \hat{X}_p$ locally uniformly in $\operatorname{Im} k > -a$. Take now a relatively compact open neighbourhood $V$ of $K$ such that $\hat{X}_p \neq 0$ on $\partial V$, and all zeros of $\hat{X}_p$ in $V$ are contained in $K$. An application of the argument principle to $V$ gives that $\hat{X}_p$ must have as many zeros as $\hat{X}_j$ in $V$ when $j$ is large and we get a contradiction. □


As an application of Theorem 3.6 we get

Proposition 3.7. There exists an open complex neighbourhood of the origin $V$, such that $V$ contains at most one pole of any $p \in B_q$.

Proof. Since $B_q$ is sequentially closed in the weak topology of measures, the result follows from the fact that when $p \in B_q$, then $\hat{X}_p$ vanishes at most to the first order at the origin, since

$$\left| \hat{X}_p(k) \right|^2 \ge k^2, \quad k \in \mathbb{R}, \tag{3.12}$$

in view of (2.4) and (2.16). □

Using (3.12) together with Proposition 3.7 it is not difficult to see that there exists a strip of the form $-\lambda(q) < \operatorname{Im} k < 0$ which contains at most one pole of any $p \in B_q$. Due to the symmetry of the poles, such a pole is then situated on the imaginary axis. Now if $p \in B_q$ is nonnegative and $k$ is a purely imaginary pole of $p$, we have that $|k| \ge \|p\|_{L^1}/2$ in view of Proposition 3.4. It follows therefore that if $0 \le p \in B_q$ is such that $\|p\|_{L^1}$ is sufficiently small, then the strip $\{ k;\ -\|p\|_{L^1}/2 < \operatorname{Im} k < 0 \}$ contains no poles of $p$. We shall now prove a more precise result.

Theorem 3.8. Assume that $0 \le p$ is super-exponentially decaying and that

$$\inf_x \int_{-\infty}^{\infty} e^{\|p\|_{L^1} |x-y|/2}\, |x - y|\, p(y)\,dy \le \frac{1}{10}. \tag{3.13}$$

Then $H_p$ has no resonances in the strip $S = \{ k;\ -\|p\|_{L^1}/4 \le \operatorname{Im} k < 0 \}$.

Proof. Since the position of the resonances is not changed when $p$ is replaced by a translate of $p$ we may assume that

$$\int_{-\infty}^{\infty} e^{\|p\|_{L^1} |y|/2}\, |y|\, p(y)\,dy \le \frac{1}{10}.$$

Also, since the conditions and conclusions of the theorem are the same for $p(x)$ and $\delta_\lambda p(x) = \lambda^2 p(\lambda x)$, we may assume that $\|p\|_{L^1} = 1$. We notice that

$$\begin{aligned} \operatorname{Re} \hat{X}_p(\beta - i\lambda) &= \lambda - \frac{1}{2} - \frac{1}{2} \int_{-\infty}^{0} f_p(y) e^{\lambda|y|} \cos(\beta|y|)\,dy \\ &\le \lambda - \frac{1}{2} + \frac{1}{2} \int_{-\infty}^{0} f_p(y) e^{\lambda|y|}\,dy \le -\frac{1}{4} + \frac{1}{2} \int_{-\infty}^{0} f_p(y) e^{\lambda|y|}\,dy, \end{aligned}$$

when $\beta \in \mathbb{R}$ and $\lambda \le 1/4$. It suffices to prove therefore that

$$\int_{-\infty}^{0} f_p(y) e^{|y|/4}\,dy < \frac{1}{2}.$$

Set $q(y) = e^{|y|/2} p(y)$. An application of Theorem 2.4 gives

$$\int_{-\infty}^{0} f_p(y) e^{|y|/4}\,dy \le 2 e^{2\|p\|}\, \|q\| \left( \|q\|_{L^1} + \|q\|\, \|p\|_{L^1} \right).$$

(3.14)


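By our assumptions we have $\|p\|_{L^1} = 1$ and $\|p\| \le \|q\| \le 1/10$. Hence
$$\int_{-\infty}^{0} e^{|y|/4}\,|f_p(y)|\,dy \le 2e^{2/10}\left(\frac{1}{10}\|q\|_{L^1} + \frac{1}{10}\right).$$
We have
$$\|q\|_{L^1} \le \|q\| + \int_{|y|\le 1} e^{|y|/2}p(y)\,dy \le \frac{1}{10} + e^{1/2}\|p\|_{L^1} = \frac{1}{10} + e^{1/2}.$$
We have thus proved that
$$\int_{-\infty}^{0} e^{|y|/4}\,|f_p(y)|\,dy \le \frac{2}{10}\,e^{2/10}\left(\frac{2}{10} + e^{1/2}\right) \le \frac{2}{10}\left(1 + \frac{2}{10} + \left(\frac{2}{10}\right)^{2}\right)\left(\frac{2}{10} + \frac{18}{10}\right) = \frac{4}{10}\,(1.24) = 0.496 < \frac12.$$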
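The arithmetic in the last display can be checked numerically. The following is only a sanity check of the two elementary bounds $e^{2/10} \le 1 + 2/10 + (2/10)^2$ and $e^{1/2} \le 18/10$ used there; the variable names are illustrative and not part of the paper.

```python
import math

# Quantities from the last step of the proof, normalised so that ||p||_{L^1} = 1.
# The two elementary bounds used in the final display:
exp_fifth_bound = 1 + 2 / 10 + (2 / 10) ** 2   # claimed upper bound for e^{2/10}
exp_half_bound = 18 / 10                        # claimed upper bound for e^{1/2}

# Sharp value of (2/10) e^{2/10} (2/10 + e^{1/2}) and its relaxed version.
sharp_value = (2 / 10) * math.exp(2 / 10) * (2 / 10 + math.exp(1 / 2))
relaxed_value = (2 / 10) * exp_fifth_bound * (2 / 10 + exp_half_bound)

print(sharp_value, relaxed_value)
```

Both printed numbers stay below 1/2, which is all the argument needs.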

This completes the proof. □

Remarks. 1. It follows from Theorem 3.1 that a similar result holds for compactly supported nonnegative potentials, the condition corresponding to (3.13) in this case being $d\,\|p\|_{L^1} \le t_0/2$, where $d$ is the length of the support of the potential and $t_0 e^{t_0} = 1$.

2. Using arguments similar to those in the proof of Theorem 3.8, together with the estimate (3.12), it is straightforward to estimate the width of a pole-free strip in the case when $p$ is not small. This leads, however, to more complicated expressions, and therefore we shall avoid stating them explicitly.

3.3. Density of resonances in strips. We shall now turn to the problem of estimating the number of scattering poles in arbitrary strips. We notice that any super-exponentially decaying potential $p$ has only finitely many poles in $\operatorname{Im} k \ge -a$ for any $a > 0$. In fact, the function $\hat X_p$ is entire analytic and $\hat X_p(k)/(ik) \to 1$ when $|k| \to \infty$, $\operatorname{Im} k + a \ge 0$. We let $N_p(a)$ denote the number of scattering poles in the set $\operatorname{Im} k \ge -a$. Our goal is to find upper bounds on $N_p(a)$. Our starting point is the following general result. The proof that we shall give has been communicated to the author by Professor Lars Hörmander. The original proof of the author was different and gave a slightly weaker result. The author is grateful to Professor Hörmander for this contribution and for referring to [11], where more general results have been given.

Proposition 3.9. Let $h$ be a function analytic in a neighbourhood of the set $\operatorname{Im} k \ge 0$. Assume that $|h(k)| \le 1$ along $\mathbb{R}$ and
$$h(k) = 1 + \frac{\gamma}{ik} + O\!\left(\frac{1}{|k|^{1+\delta}}\right), \quad |k| \to \infty,\ \operatorname{Im} k \ge 0, \tag{3.15}$$
for some $\delta > 0$. Then $\gamma \ge 0$, and if $k_j$ are the zeros of $h$ in the upper half-plane, repeated according to their multiplicity, we have
$$\gamma = 2\sum \operatorname{Im} k_j - \frac{1}{\pi}\int_{-\infty}^{\infty} \log|h(t)|\,dt \ge 2\sum \operatorname{Im} k_j. \tag{3.16}$$


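Proof. It follows from the assumptions and the maximum principle that $|h(k)| \le 1$ when $\operatorname{Im} k \ge 0$, and since
$$|h(k)| = 1 + \operatorname{Re}\frac{\gamma}{ik} + o\!\left(|k|^{-1}\right),$$
we must have $\operatorname{Re}(\gamma/ik) \le 0$ when $\operatorname{Im} k > 0$, thus $\gamma \ge 0$. The Riesz representation formula for functions that are subharmonic and $\le 0$ in the upper half-plane (see [10]) gives
$$\log|h(k)| = a\operatorname{Im} k + \sum \log\left|\frac{k - k_j}{k - \bar k_j}\right| + \frac{\operatorname{Im} k}{\pi}\int_{-\infty}^{\infty} \frac{\log|h(t)|}{|t - k|^2}\,dt, \quad \operatorname{Im} k > 0, \tag{3.17}$$
where $k_j$ are the finitely many zeros of $h$ in $\operatorname{Im} k > 0$. The left-hand side is $\operatorname{Re}(\gamma/ik) + o(|k|^{-1})$ at infinity, and we have
$$\log\left|\frac{k - k_j}{k - \bar k_j}\right| = \log\left|1 - \frac{2i\operatorname{Im} k_j}{k - \bar k_j}\right| = \operatorname{Re}\frac{2\operatorname{Im} k_j}{ik} + O\!\left(|k|^{-2}\right).$$
Since $\log|h(t)|$ is locally integrable and $\log|h(t)| = O(|t|^{-1-\delta})$ at infinity, we have that $\log|h(t)| \in L^1(\mathbb{R})$. When $\alpha \le \arg k \le \pi - \alpha$, $\alpha > 0$, we have that $|k|^2 \le C_\alpha |t - k|^2$ for $t \in \mathbb{R}$, and therefore by Lebesgue's theorem we obtain
$$|k|^2 \int_{-\infty}^{\infty} \frac{\log|h(t)|}{|t - k|^2}\,dt \to \int_{-\infty}^{\infty} \log|h(t)|\,dt, \quad k \to \infty,\ \alpha \le \arg k \le \pi - \alpha.$$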

We have that $\operatorname{Im} k/|k|^2 = -\operatorname{Im}(1/k) = -\operatorname{Re}(1/ik)$, and then it follows from (3.17) that the constant $a = 0$ and (3.16) is true. □

Corollary 3.10. If $h$ is analytic with $|h| \le 1$ in the open upper half-plane and (3.15) holds there, then
$$2\sum \operatorname{Im} k_j \le \gamma.$$
Thus the number of zeros with $\operatorname{Im} k_j \ge a/2$ does not exceed $\gamma/a$.

Proof. It suffices to apply the proposition to $h(k + i\varepsilon)$ for $\varepsilon > 0$ and let $\varepsilon \to 0$. □

We are now ready to state

Theorem 3.11. For any super-exponentially decaying $p$ we have
$$N_p(a/2) \le C(p)\,(1 + \eta_a(p)), \quad a \ge 1,$$
where
$$\eta_a(p) = \iint e^{2a|x-y|}\,|p(x)p(y)|\,dx\,dy,$$
and $C(p)$ is some constant depending on $\|p\|_{L^1}$, but not on $a$.


Proof. We shall pass to a new function $F(k)$, having the same zeros as $\hat X_p$ and with the property that $|F(k)| \le 1$ on $\operatorname{Im} k = -a$. This will make it possible to apply Proposition 3.9 to the function $F(k - ia)$. When constructing the function $F$ we write
$$\hat X_p(k) = ik - \frac12\int_{-\infty}^{\infty} p(x)\,dx - \frac12 \hat f_p(k). \tag{3.18}$$
By (2.39) we have
$$|\hat f_p(k)| \le \frac{1}{|k|}\exp\!\left(\|p\|_{L^1}/|k|\right)\eta_a(p), \quad \operatorname{Im} k \ge -a.$$
Using this, we shall now estimate $\hat X_p(k)$ on the line $\operatorname{Im} k = -a$. Writing $k = \xi - ia$ and using (3.18), we obtain
$$|\hat X_p(k)|^2 \le \xi^2 + c^2 + e^{\|p\|_{L^1}/|k|}\,\eta_a(p)\left(1 + \frac{|c|}{|k|}\right) + e^{2\|p\|_{L^1}/|k|}\,\frac{\eta_a^2(p)}{|k|^2},$$
where $c = a - (1/2)\int p(x)\,dx$. Therefore, since $a \ge 1$,
$$|\hat X_p(k)|^2 \le \xi^2 + a^2 + C(p)\,a + C(p)\,\eta_a(p)\,(1 + \eta_a(p)) \le |k|^2 + C(p)\,a\,(1 + \eta_a(p))^2,$$
where here and in what follows we let $C(p)$ denote different constants $\ge 1$, depending only on $\|p\|_{L^1}$. It follows that if we take $\mu$ equal to
$$\mu = C(p)\,a\,(1 + \eta_a(p)), \tag{3.19}$$
then
$$\left|\frac{\hat X_p(k)}{ik - \mu}\right| \le 1, \quad \operatorname{Im} k = -a,$$
since we may assume that $\mu - 2a \ge 1 + \eta_a(p)$. Therefore the function
$$F(k) = \frac{\hat X_p(k)}{ik - \mu}$$
has the same zeros as $\hat X_p$ in the set $\operatorname{Im} k + a \ge 0$, and satisfies $|F(k)| \le 1$ on $\operatorname{Im} k = -a$. Moreover, since by (2.39),
$$\frac{\hat X_p(k)}{ik} = 1 - \frac{1}{2ik}\int_{-\infty}^{\infty} p(x)\,dx + O\!\left(\frac{1}{k^2}\right),$$
it follows that
$$F(k) = 1 + \frac{\gamma}{ik} + O\!\left(\frac{1}{k^2}\right), \quad |k| \to \infty,\ \operatorname{Im} k \ge -a, \tag{3.20}$$
where
$$\gamma = \mu - \frac12\int_{-\infty}^{\infty} p(x)\,dx, \tag{3.21}$$
and we may assume that $\gamma > 0$. An application of Proposition 3.9 to the function $F(k - ia)$ shows that $N_p(a/2) \le \gamma/a$, and this completes the proof. □


Remark. It is known that the scattering poles of an integrable compactly supported potential lie below a logarithmic curve, i.e. if $k$ is a pole, then
$$|\operatorname{Im} k| \ge a + b\log|k|, \tag{3.22}$$
with $a \in \mathbb{R}$ and $b > 0$; see Theorem 3.14 and also [12], where this result is proved in a more general setting in the three-dimensional case. Comparing Theorem 2.6 and Theorem 3.11 we see, in particular, that the latter reflects the logarithmic bound (3.22).

Remark. We notice that after obvious modifications, the results above are also valid when the potential $p$ decays at some fixed exponential rate.

We shall finish this section by making some remarks concerning the question of existence of resonances of exponentially decaying potentials. It is well known that in the one-dimensional case, any compactly supported potential has infinitely many resonances (see [17] for the precise results), and in [15] this is established for smooth potentials in any odd dimension. However, the situation is completely different for potentials decaying at some fixed exponential rate. This is already seen from the existence of the reflectionless potentials, all the resonances in this case being square roots of the eigenvalues. We shall now give an example of an exponentially decaying potential without bound states, which has only finitely many resonances in the set where these are naturally defined. We start with the right reflection coefficient
$$r(k) = \frac{2}{(k + i)(k + 2i)}, \tag{3.23}$$

and try to find the corresponding potential $p$, which has no bound states. Then we must have that $x \le 0$ in the support of the inverse Fourier transform of $r$, and it follows from the Gelfand-Levitan equation for the right scattering data that $x \le 0$ in $\operatorname{supp}(p)$, see [14]. In order to find the potential on the negative half-axis, we first compute the left reflection coefficient $\rho(k)$. We have the well-known formulas, see [13],
$$t(k) = \exp\left(\frac{1}{2\pi i}\int_{-\infty}^{+\infty} \frac{\log\left(1 - |r(\lambda)|^2\right)}{\lambda - k}\,d\lambda\right), \quad \operatorname{Im} k > 0, \tag{3.24}$$
and
$$\rho(k) = -\frac{r(-k)\,t(k)}{t(-k)}. \tag{3.25}$$
Using (3.23) and (3.24) we can then calculate the transmission coefficient. We only state the result and refer to [2] for a detailed discussion of the inverse scattering problem for rational reflection coefficients. We have
$$t(k) = \frac{k(k + \alpha i)}{(k + i)(k + 2i)}, \quad \alpha = \sqrt{5},$$
and using (3.25),
$$\rho(k) = \frac{-2(k + \alpha i)}{(k + i)(k + 2i)(k - \alpha i)}. \tag{3.26}$$
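The formulas for $t$ and $\rho$ can be sanity-checked numerically. The sketch below evaluates (3.23), the stated $t(k)$, and (3.25) at a few sample points, and compares against (3.26) and against the unitarity relation $|t|^2 + |r|^2 = 1$ on the real axis (which is what (3.24) encodes there). Function names are illustrative and not taken from the paper.

```python
import math

ALPHA = math.sqrt(5.0)

def r(k):
    """Right reflection coefficient (3.23)."""
    return 2.0 / ((k + 1j) * (k + 2j))

def t(k):
    """Transmission coefficient as stated in the text."""
    return k * (k + ALPHA * 1j) / ((k + 1j) * (k + 2j))

def rho_from_325(k):
    """Left reflection coefficient computed from the relation (3.25)."""
    return -r(-k) * t(k) / t(-k)

def rho_326(k):
    """Closed form (3.26)."""
    return -2.0 * (k + ALPHA * 1j) / ((k + 1j) * (k + 2j) * (k - ALPHA * 1j))

def unitarity_defect(k):
    """| |t|^2 + |r|^2 - 1 |, expected to vanish for real k."""
    return abs(abs(t(k)) ** 2 + abs(r(k)) ** 2 - 1.0)

for k in (0.3, 1.0, -2.5, 7.0):
    print(k, unitarity_defect(k), abs(rho_from_325(k) - rho_326(k)))
```

The unitarity check works because $|t|^2 + |r|^2 = (k^4 + 5k^2 + 4)/((k^2+1)(k^2+4)) = 1$ for real $k$, which is exactly why $\alpha = \sqrt5$ is forced.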


To determine the potential for $x < 0$, we use the Gelfand-Levitan equation for the left scattering data,
$$R(x, y) + Q(x + y) + \int_{-\infty}^{x} R(x, z)\,Q(z + y)\,dz = 0 \quad \text{when } x > y, \tag{3.27}$$
where
$$Q(x) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} \rho(k)\,e^{-ixk}\,dk.$$
The residue calculus then gives that
$$Q(x) = \frac{-4\alpha}{(\alpha + 1)(\alpha + 2)}\,e^{\alpha x}, \quad x < 0,$$
and therefore, solving (3.27), we find that the potential is given by
$$p(x) = \frac{16\alpha^2\mu\,e^{2\alpha x}}{(\mu - 2e^{2\alpha x})^2}, \quad x < 0, \tag{3.28}$$

where $\mu = (\alpha + 1)(\alpha + 2)$. The transmission coefficient $t(k)$ admits a meromorphic continuation to the set $\operatorname{Im} k > -\alpha$, and the poles there are the resonances of $p$. From (3.26) we see that there are only two resonances, both situated on the imaginary axis.

The example above admits a direct generalization which we shall finally describe. When doing this we start with a function $R(x, y)$ in the form
$$R(x, y) = \left(u(x)\,e^{\alpha(y - x)}\,\theta_-(x) + \left(f(x - y) + g(x + y)\right)\theta_+(x)\right)\theta_+(x - y). \tag{3.29}$$
Here $\theta_-(x) = \theta_+(-x)$ and $\alpha$ is a positive number. The functions $f$, $g$ and $u$ are to be chosen so that $I + R$ will be the intertwining operator $A_-$ corresponding to a potential $p$, supported by $\mathbb{R}_-$. We then must have that
$$\left(\partial_x^2 - \partial_y^2\right) R(x, y) = p(x)\,R(x, y) + p(x)\,\delta(x - y). \tag{3.30}$$
If we require that
$$f(-y) + g(y) = u(0)\,e^{\alpha y}, \quad y \le 0, \tag{3.31}$$
then
$$\left(\partial_x + \partial_y\right) R(x, y) = \left(u'(x)\,e^{\alpha(y - x)}\,\theta_-(x) + 2g'(x + y)\,\theta_+(x)\right)\theta_+(x - y),$$
and hence if
$$2g'(y) = u'(0)\,e^{\alpha y}, \quad y \le 0, \tag{3.32}$$
we obtain
$$\left(\partial_x^2 - \partial_y^2\right) R(x, y) = 2\left(u'(x)\,\theta_-(x) + 2g'(2x)\,\theta_+(x)\right)\delta(x - y) + \left(u''(x) - 2\alpha u'(x)\right) e^{\alpha(y - x)}\,\theta_-(x)\,\theta_+(x - y).$$


If we choose $u$ such that
$$u''(x) - 2\alpha u'(x) = 2u(x)\,u'(x), \quad x < 0, \tag{3.33}$$
and if finally
$$g'(x) = 0, \quad x > 0, \tag{3.34}$$
then it follows that $R$ satisfies (3.30) with $p(x) = 2u'(x)\,\theta_-(x)$. Assuming that $u$ is not identically zero, solving (3.33) we find that
$$u(x) = \frac{2\alpha\,e^{2\alpha x}}{2\alpha C - e^{2\alpha x}}, \quad x < 0,$$
for some $C$ with $2\alpha C > 1$, and we now have to choose $f$ and $g$ so that the conditions (3.31), (3.32) and (3.34) are satisfied. Now (3.32) together with (3.34) gives that
$$g(y) = \frac{u'(0)}{2\alpha}\,e^{\alpha y}\,\theta_-(y) + \frac{u'(0)}{2\alpha}\,\theta_+(y),$$
since $g$ must be continuous. Then by (3.31) we get
$$f(y) = u(0)\,e^{-\alpha y} - \frac{u'(0)}{2\alpha}\,e^{-\alpha y}, \quad y > 0,$$
and we have constructed $R$ such that $I + R$ is the intertwining operator $A_-$. Using (3.29) it is now easy to compute $X_p$. Since for $x < 0$ we have that $R(x, y) = u(x)\,e^{\alpha(y - x)}\,\theta_+(x - y)$, and $x \le 0$ in the support of $p$, we get
$$\int p(z)\,R(z, z + y)\,dz = \left(\int_{-\infty}^{0} p(z)\,u(z)\,dz\right) e^{\alpha y}\,\theta_-(y) = u^2(0)\,e^{\alpha y}\,\theta_-(y),$$
since $p(x) = 2u'(x)$. Then
$$X_p(y) = \delta'(y) - u(0)\,\delta(y) - \frac12\,\theta_-(y)\,u^2(0)\,e^{\alpha y}. \tag{3.35}$$

The conclusion that $\hat X_p$ has only finitely many zeros is now immediate. In particular, in the special case when $R$ comes from the potential given by (3.28), computing the Fourier transform in (3.35) we recover the expression (3.26).
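As an illustration of this last statement, the sketch below takes the Fourier transform of (3.35) with the convention $\hat X_p(k) = \int X_p(y)\,e^{-iky}\,dy$, so that $\hat X_p(k) = ik - u(0) - \tfrac12 u^2(0)/(\alpha - ik)$ for $\operatorname{Im} k > -\alpha$. It assumes that the potential (3.28) corresponds to the choice $2\alpha C = \mu/2$ in the formula for $u$; this identification is made here for the check and is not stated explicitly in the text. The function names are illustrative.

```python
import math

ALPHA = math.sqrt(5.0)
MU = (ALPHA + 1.0) * (ALPHA + 2.0)
C = MU / (4.0 * ALPHA)   # assumption: 2*ALPHA*C = MU/2 reproduces (3.28)

def u(x):
    """Solution of (3.33) for x <= 0."""
    e = math.exp(2.0 * ALPHA * x)
    return 2.0 * ALPHA * e / (2.0 * ALPHA * C - e)

def p_from_u(x, h=1e-5):
    """p = 2u', with u' approximated by a central difference."""
    return 2.0 * (u(x + h) - u(x - h)) / (2.0 * h)

def p_328(x):
    """Closed form (3.28)."""
    e = math.exp(2.0 * ALPHA * x)
    return 16.0 * ALPHA ** 2 * MU * e / (MU - 2.0 * e) ** 2

def X_hat(k):
    """Fourier transform of (3.35), valid for Im k > -ALPHA."""
    u0 = u(0.0)
    return 1j * k - u0 - 0.5 * u0 ** 2 / (ALPHA - 1j * k)

def t(k):
    """Transmission coefficient of the example."""
    return k * (k + ALPHA * 1j) / ((k + 1j) * (k + 2j))

print(abs(X_hat(-1j)), abs(X_hat(-2j)))          # both should be ~0
print(abs(p_from_u(-0.5) - p_328(-0.5)))         # should be small
```

With these choices $u(0) = 3 - \sqrt5$ and $\hat X_p(k) = ik/t(k)$, so the zeros of $\hat X_p$ in $\operatorname{Im} k > -\alpha$ are exactly $k = -i$ and $k = -2i$, the two resonances read off from (3.26).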

3.4. A coupled system of Riccati equations. The purpose of this section is to present an alternative approach to the study of the location of resonances. It is more direct than before and does not depend on the study of the scattering matrix. Instead we shall work with a system of Riccati equations. Notice that a related approach has been used in [1] when studying stability of the shape resonances. Working with Riccati equations will allow us to recover the results concerning the purely imaginary poles, and also, we shall derive bounds giving improved estimates on the imaginary part of the poles with sufficiently large real part. A further study of the system of the Riccati equations (3.37) below could perhaps lead to more precise estimates.


We assume first that $p \in L^1$ is compactly supported and $\operatorname{supp}(p) \subset [a, b]$. For $k \in \mathbb{C}_-$ we let $u(x, k)$ be the solution to the problem
$$H_p u = k^2 u, \quad u(x, k) = e^{-ixk}, \quad x < a.$$
We want to investigate when $u(x, k) = c\,e^{ikx}$ for $x > b$. Consider the function
$$\varphi(x, k) = \frac{u'_x(x, k)}{u(x, k)}, \tag{3.36}$$
which solves the Riccati equation $\varphi' = p - k^2 - \varphi^2$ in the set where $u \neq 0$. In what follows we write $k = \alpha - i\beta$, where $\beta > 0$, and $\alpha$ will be kept fixed. If instead of $\varphi$ we consider $\psi = \varphi + ik$, then
$$\psi' = p + 2ik\psi - \psi^2, \quad \psi(x) = 0, \quad x < a.$$
For reasons of symmetry we may assume that $\alpha \ge 0$, and we write $\psi = f + ig$. This gives us a coupled system of ODE:
$$\begin{cases} f' = p + 2\beta f - 2\alpha g - f^2 + g^2, \\ g' = 2\alpha f + 2\beta g - 2fg, \end{cases} \tag{3.37}$$
and $f(x) = g(x) = 0$ for $x < a$. We know that $\alpha - i\beta$ is a resonance precisely when $f(b, \beta) = 2\beta$ and $g(b, \beta) = 2\alpha$. First, we shall examine the situation when $\alpha = 0$ and $p \ge 0$. In this case the description of the resonances is given by Proposition 3.4, but it is instructive to recover these results by studying (3.37). Then $g = 0$ and we have the equation
$$f' = p + 2\beta f - f^2. \tag{3.38}$$
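The system (3.37) is easy to explore numerically. The following is a minimal sketch, not the method of the paper: it integrates (3.37) with a hand-written classical Runge-Kutta step, starting from $f(a) = g(a) = 0$, and returns the mismatch in the resonance condition $f(b) = 2\beta$, $g(b) = 2\alpha$. It does not detect blow-up of the solution, so it is only meaningful when the solution exists on $[a, b]$. The constant test potential and the closed form `exact_f_constant` (obtained here by linearising (3.38) via $f = w'/w$) are illustrative assumptions.

```python
import math

def riccati_rhs(x, f, g, alpha, beta, p):
    """Right-hand side of the coupled system (3.37)."""
    px = p(x)
    df = px + 2.0 * beta * f - 2.0 * alpha * g - f * f + g * g
    dg = 2.0 * alpha * f + 2.0 * beta * g - 2.0 * f * g
    return df, dg

def integrate_riccati(p, a, b, alpha, beta, steps=4000):
    """Classical RK4 for (3.37) on [a, b] with f(a) = g(a) = 0."""
    h = (b - a) / steps
    f, g = 0.0, 0.0
    for i in range(steps):
        x = a + i * h
        k1f, k1g = riccati_rhs(x, f, g, alpha, beta, p)
        k2f, k2g = riccati_rhs(x + h / 2, f + h / 2 * k1f, g + h / 2 * k1g, alpha, beta, p)
        k3f, k3g = riccati_rhs(x + h / 2, f + h / 2 * k2f, g + h / 2 * k2g, alpha, beta, p)
        k4f, k4g = riccati_rhs(x + h, f + h * k3f, g + h * k3g, alpha, beta, p)
        f += h / 6 * (k1f + 2 * k2f + 2 * k3f + k4f)
        g += h / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
    return f, g

def resonance_mismatch(p, a, b, alpha, beta):
    """(f(b) - 2*beta, g(b) - 2*alpha); both vanish at a resonance alpha - i*beta."""
    f, g = integrate_riccati(p, a, b, alpha, beta)
    return f - 2.0 * beta, g - 2.0 * alpha

def exact_f_constant(V, beta, length):
    """Closed-form solution of (3.38) at x = a + length for constant p = V >= 0."""
    s = math.sqrt(beta * beta + V)
    T = math.tanh(s * length)
    return V * T / (s - beta * T)

f_b, g_b = integrate_riccati(lambda x: 1.0, 0.0, 1.0, 0.0, 0.5)
print(f_b, exact_f_constant(1.0, 0.5, 1.0), g_b)
```

Scanning `resonance_mismatch` over $\beta$ (with $\alpha = 0$) gives a simple way to look for purely imaginary resonances of a given nonnegative potential.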

Since $f(x) = 0$ for $x \le a$ and $p(x) \ge 0$, it is true that $f(x) \ge 0$ where it is defined. Then
$$f(x) \le \int_a^x e^{2\beta(x - y)}\,p(y)\,dy,$$
and it follows that $f$ exists on the whole interval $[a, b]$. Notice also that Proposition 3.4 gives
$$\|p\|_{L^1} > 2\beta \ \Rightarrow\ -i\beta \text{ is not a scattering pole.}$$
Next we shall study the derivative $f'_\beta$ of $f$ with respect to $\beta$. We have
$$f''_{x\beta} = 2\beta f'_\beta + 2f - 2f f'_\beta \tag{3.39}$$


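and $f'_\beta(a, \beta) = 0$. Since $f \ge 0$, it follows from this equation that $f'_\beta(x, \beta) \ge 0$. Then
$$f''_{x\beta} \le 2\beta f'_\beta + 2f,$$
and it implies that
$$f'_\beta(x, \beta) \le 2e^{2x\beta}\int_a^x e^{-2y\beta} f(y, \beta)\,dy \le 2e^{2b\beta}\int_a^b e^{-2y\beta} f(y, \beta)\,dy, \quad x \le b.$$
Now, in view of (3.38),
$$\int_a^b e^{-2y\beta} f(y, \beta)\,dy = \int_a^b (b - y)\,\partial_y\!\left(e^{-2y\beta} f(y, \beta)\right) dy = \int_a^b (b - y)\,p(y)\,e^{-2\beta y}\,dy - \int_a^b (b - y)\,f^2(y)\,e^{-2y\beta}\,dy \le \int_a^b (b - y)\,p(y)\,e^{-2\beta y}\,dy. \tag{3.40}$$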

Define the function
$$\varphi(p) = \sup\left\{\beta \ge 0;\ e^{2b\beta}\int_a^b (b - y)\,p(y)\,e^{-2\beta y}\,dy < 1\right\}.$$
Then $\varphi(p) \ge \beta_0$, where $e^{2\beta_0(b - a)}(b - a)\,\|p\|_{L^1} = 1$. Since $f'_\beta(b, \beta) < 2$ on $(0, \beta_0)$, it follows that the equation $f(b, \beta) = 2\beta$ has at most one solution on this interval. Taking into account (3.39), we summarize the discussion above in the following result, which is just a restatement of Proposition 3.3, combined with (3.39).

Proposition 3.12. Let $0 \le p \in L^1$ be supported by an interval of length $d > 0$. Put
$$\eta(p) = \max\left(\frac12\,\|p\|_{L^1},\ \frac{1}{2d}\log\frac{1}{d\,\|p\|_{L^1}}\right).$$
Then there can be at most one resonance of the form $-i\beta$, when $\beta \in (0, \eta(p))$.

In the case of poles off the imaginary axis and in the case when the potential has variable sign, the situation becomes more subtle, and it is no longer clear that the coupled system of equations (3.37) has a global solution. To circumvent this difficulty, we view the function $\varphi$, defined in (3.36), as taking values in the complex projective line $\mathbb{CP}^1$. If we use $u$ and $u'$ as a system of homogeneous coordinates, then the (nonautonomous) vector field $X$, generating the global flow, is given by
$$X(u : u') = p - k^2 - \left(\frac{u'}{u}\right)^2$$
on an open set where $u \neq 0$, and
$$X(u : u') = 1 - (p - k^2)\left(\frac{u}{u'}\right)^2,$$
where $u' \neq 0$. It is now convenient to introduce new homogeneous coordinates $ku - iu'$ and $ku + iu'$, so that the solution curve starts at the point $(0 : 1)$. We may then formulate the condition that $k$ is a pole by saying that the solution curve passes through the point $(1 : 0)$ at time $x = b$. A straightforward computation, using, for example, transition


functions for the tangent bundle of $\mathbb{CP}^1$, gives the expression for the vector field $X$ on an open set, where $ku + iu' \neq 0$. We get
$$X = \frac{p}{2ik}\left(1 + \frac{ku - iu'}{ku + iu'}\right)^2 + 2ik\,\frac{ku - iu'}{ku + iu'}.$$
Therefore, the function
$$f(x, k) = \frac{ku - iu'}{ku + iu'}$$
vanishes for $x < a$, and solves the differential equation
$$f'_x = \frac{p}{2ik}\,(1 + f)^2 + 2ikf.$$
Our aim now is to estimate the lifespan of the solution. In particular, if $f$ exists on the entire interval $(-\infty, b]$, then $k$ is not a resonance. To that end, we shall derive a differential inequality for $|f|$. A computation shows that
$$\frac{d}{dx}|f|^2 = 4\beta|f|^2 + \frac{2p\beta|f|^2}{|k|^2} + \operatorname{Re}\left(\frac{p\bar f}{ik}\right) + p\,|f|^2\operatorname{Re}\left(\frac{f}{ik}\right),$$
and therefore, for $|f(x)| \neq 0$, we have
$$\frac{d}{dx}|f| \le 2\beta|f| + \frac{|p|\,|f|}{|k|} + \frac{|p|\,|f|^2}{2|k|} + \frac{|p|}{2|k|}.$$
If $g$ solves
$$g'(x) = 2\beta g(x) + \frac{|p(x)|\,g(x)}{|k|} + \frac{|p(x)|\,g^2(x)}{2|k|} + \frac{|p(x)|}{2|k|}$$

and $g$ vanishes for $x = a$, then, by comparison, we have that $0 \le |f| \le g$, where $g$ is defined. To estimate the lifespan of $g$, we apply the following lemma.

Lemma 3.13. Consider a nonlinear differential equation
$$\begin{cases} h'(t) = a(t)\,h^2(t) + b(t), \\ h(0) = 0, \end{cases}$$
where $a$ and $b$ are nonnegative locally integrable functions. If
$$\int_0^T a(t)\,dt \int_0^T b(t)\,dt < 1,$$
then the solution $h(t)$ exists on $[0, T]$.

Proof. This follows from Lemma 1.3.3 in [9]. □
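A constant-coefficient example shows how much room the criterion of Lemma 3.13 leaves. For constant $a, b > 0$ (an illustrative special case, not treated in the text) the solution is $h(t) = \sqrt{b/a}\,\tan(\sqrt{ab}\,t)$, which blows up at $t = \pi/(2\sqrt{ab})$, whereas the condition $(aT)(bT) < 1$ allows any $T < 1/\sqrt{ab}$. The sketch below compares the two horizons and integrates the equation up to just below the lemma's horizon; all function names are illustrative.

```python
import math

def lemma_horizon(a, b):
    """Supremum of T with (a*T)*(b*T) < 1 for constant a, b > 0."""
    return 1.0 / math.sqrt(a * b)

def blowup_time(a, b):
    """Blow-up time of h' = a h^2 + b, h(0) = 0, for constant a, b > 0."""
    return math.pi / (2.0 * math.sqrt(a * b))

def exact_h(t, a, b):
    """Closed-form solution for constant coefficients, valid for t < blowup_time."""
    return math.sqrt(b / a) * math.tan(math.sqrt(a * b) * t)

def rk4_h(a, b, T, steps=2000):
    """Classical RK4 for h' = a h^2 + b on [0, T], h(0) = 0."""
    rhs = lambda h: a * h * h + b
    dt = T / steps
    h = 0.0
    for _ in range(steps):
        k1 = rhs(h)
        k2 = rhs(h + dt / 2 * k1)
        k3 = rhs(h + dt / 2 * k2)
        k4 = rhs(h + dt * k3)
        h += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return h

a, b = 2.0, 3.0
T = 0.99 * lemma_horizon(a, b)
print(lemma_horizon(a, b), blowup_time(a, b), rk4_h(a, b, T), exact_h(T, a, b))
```

In this special case the lemma's horizon is smaller than the true blow-up time by the factor $2/\pi$, so the criterion is sufficient but not sharp.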


To apply the lemma, we just write
$$g(x) = h(x)\exp\left(2\beta x + (1/|k|)\,P(x)\right),$$
where
$$P(x) = \int_{-\infty}^{x} |p(y)|\,dy,$$
so that $h$ solves
$$h'(x) = \frac{|p(x)|}{2|k|}\exp\left(2\beta x + (1/|k|)\,P(x)\right) h^2(x) + \frac{|p(x)|}{2|k|}\exp\left(-2\beta x - (1/|k|)\,P(x)\right). \tag{3.41}$$

Using that the product of the integrals of the coefficients in (3.41) is less than or equal to
$$\frac{1}{4|k|^2}\exp\left(\|p\|_{L^1}/|k|\right)\iint e^{2\beta|x-y|}\,|p(x)\,p(y)|\,dx\,dy,$$
we arrive at the following theorem.

Theorem 3.14. Let $p$ be super-exponentially decaying. Then if $k = \alpha - i\beta$, $\beta > 0$, is a scattering pole of $p$, we have
$$\frac{1}{4|k|^2}\exp\left(\|p\|_{L^1}/|k|\right)\iint e^{2\beta|x-y|}\,|p(x)\,p(y)|\,dx\,dy \ge 1. \tag{3.42}$$

Proof. We have already observed that if $p$ is compactly supported, the assertion is a direct application of Lemma 3.13 to (3.41). In the general case, we choose a sequence $p_j \in L^1$ of compactly supported functions such that $|p_j| \le |p|$ and $p_j \to p$ almost everywhere. Then it follows as in Theorem 3.6 that $\hat X_{p_j} \to \hat X_p$ locally uniformly. An application of Hurwitz's theorem shows then that $k$ is a pole of $p$ if and only if $k = \lim_{j \to \infty} k_j$, where $k_j$ is a pole of $p_j$. Applying (3.42) to each $p_j$ and letting $j \to \infty$ gives the theorem. □

We can remark here that when $p$ is compactly supported, Theorem 3.14 gives a direct proof of the logarithmic bound for the imaginary part of the poles in this case.

Remark. We finally notice that a result similar to Theorem 3.14 can be obtained if one uses the characterization of the resonances as poles of the meromorphic continuation of the weighted resolvent of the Schrödinger operator. In fact, it is essentially well known (see, for example, [5]) and follows from the resolvent equation combined with the analytic Fredholm theory, that, for a super-exponentially decaying potential, the weighted resolvent $R_p(k) = p^{1/2} R(k)\,|p|^{1/2}$ admits a meromorphic continuation to the lower half-plane. Here $p^{1/2} = \operatorname{sign}(p)\,|p|^{1/2}$ and
$$R(k) = (H_p - k^2)^{-1}, \quad \operatorname{Im} k > 0,\ k^2 \notin \sigma(H_p).$$
Moreover, the poles of the continuation are precisely the points $k$ such that the weighted free resolvent


$$R_{0,p}(k) = p^{1/2} R_0(k)\,|p|^{1/2} \tag{3.43}$$
has $-1$ as an eigenvalue. Since
$$R_0(k)(x, y) = i\,\frac{e^{ik|x-y|}}{2k},$$
it follows that (3.43) is an analytic family of Hilbert–Schmidt operators for $k \neq 0$. Therefore if $k$ is such that the Hilbert–Schmidt norm of $R_{0,p}(k)$ is less than one, then $k$ is not a pole. This leads to an estimate similar to (3.42).

Acknowledgements. I am deeply grateful to Professor Anders Melin for his invaluable advice and encouragement during the preparation of this paper. I am also grateful to Professor Lars Hörmander for communicating the new and improved proof of Proposition 3.9.

References

1. Ashbaugh, M., Sundberg, C.: An improved stability result for resonances. Trans. Am. Math. Soc. 281, 347–360 (1984)
2. Calogero, F., Degasperis, F.: Spectral transform and solitons. Amsterdam: North-Holland, 1982
3. Deift, P., Trubowitz, E.: Inverse scattering on the line. Comm. Pure Appl. Math. 32, 121–251 (1979)
4. Fernandez, C., Lavine, R.: Lower bounds for resonance widths in potential and obstacle scattering. Commun. Math. Phys. 128, 263–284 (1990)
5. Froese, R.: Asymptotic distribution of resonances in one dimension. J. Diff. Eq. 137, 251–272 (1997)
6. Froese, R.: Upper bounds for the resonance counting function in odd dimensions. Can. J. Math. 50 (3), 538–546 (1998)
7. Harrell II, E.M.: General lower bounds for resonances in one dimension. Commun. Math. Phys. 86, 221–225 (1982)
8. Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Berlin-New York: Springer-Verlag, 1983
9. Hörmander, L.: Lectures on nonlinear hyperbolic differential equations. Berlin: Springer-Verlag, 1997
10. Hörmander, L.: Notions of Convexity. Boston: Birkhäuser, 1994
11. Hörmander, L., Sigurdsson, R.: Growth properties of plurisubharmonic functions related to Fourier–Laplace transforms. Preprint, Department of Mathematics, Lund University, 1993
12. Lax, P.D., Phillips, R.S.: A logarithmic bound on the location of the poles of the scattering matrix. Arch. Rat. Mech. 40, 268–280 (1971)
13. Marchenko, V.A.: Sturm–Liouville operators and applications. Basel: Birkhäuser Verlag, 1986
14. Melin, A.: Operator methods for inverse scattering on the real line. Comm. P.D.E. 10, 677–786 (1985)
15. Sá Barreto, A., Zworski, M.: Existence of resonances in potential scattering. Comm. Pure Appl. Math. 173, 1271–1280 (1996)
16. Titchmarsh, E.C.: The theory of functions. Oxford: Oxford University Press, 1968
17. Zworski, M.: Distribution of poles for scattering on the real line. J. Funct. Anal. 73, 277–296 (1987)
18. Zworski, M.: Counting scattering poles. In: Ikawa, M. (ed.) Spectral and Scattering Theory. New York: Marcel Dekker, 1994, pp. 301–331

Communicated by B. Simon

Commun. Math. Phys. 208, 413 – 428 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

A Stress Tensor for Anti-de Sitter Gravity

Vijay Balasubramanian1,2, Per Kraus3

1 Jefferson Laboratory of Physics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
2 Institute for Theoretical Physics, University of California, Santa Barbara, CA 93106, USA
3 Enrico Fermi Institute, University of Chicago, Chicago, IL 60637, USA. E-mail: [email protected]

Received: 20 April 1999 / Accepted: 8 July 1999

Abstract: We propose a procedure for computing the boundary stress tensor associated with a gravitating system in asymptotically anti-de Sitter space. Our definition is free of ambiguities encountered by previous attempts, and correctly reproduces the masses and angular momenta of various spacetimes. Via the AdS/CFT correspondence, our classical result is interpretable as the expectation value of the stress tensor in a quantum conformal field theory. We demonstrate that the conformal anomalies in two and four dimensions are recovered. The two dimensional stress tensor transforms with a Schwarzian derivative and the expected central charge. We also find a nonzero ground state energy for global AdS$_5$, and show that it exactly matches the Casimir energy of the dual $\mathcal{N} = 4$ super Yang–Mills theory on $S^3 \times \mathbb{R}$.

1. Introduction

In a generally covariant theory it is unnatural to assign a local energy-momentum density to the gravitational field. For instance, candidate expressions depending only on the metric and its first derivatives will always vanish at a given point in locally flat coordinates. Instead, we can consider a so-called "quasilocal stress tensor", defined locally on the boundary of a given spacetime region. Consider the gravitational action thought of as a functional of the metric $\gamma_{\mu\nu}$ which is induced on the boundary by its embedding into the bulk spacetime. The quasilocal stress tensor associated with a spacetime region has been defined by Brown and York to be [1]:
$$T^{\mu\nu} = \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{grav}}{\delta\gamma_{\mu\nu}}. \tag{1}$$
The resulting stress tensor typically diverges as the boundary is taken to infinity. However, one is always free to add a boundary term to the action without disturbing the bulk equations of motion. To obtain a finite stress tensor, Brown and York propose a subtraction


derived by embedding a boundary with the same intrinsic metric $\gamma_{\mu\nu}$ in some reference spacetime, such as flat space. This prescription suffers from an important drawback: it is not possible to embed a boundary with an arbitrary intrinsic metric in the reference spacetime. Therefore, the Brown–York procedure is generally not well defined.

For asymptotically anti-de Sitter (AdS) spacetimes, there is an attractive resolution to this difficulty. A duality has been proposed which equates the gravitational action of the bulk, viewed as a functional of boundary data, with the quantum effective action of a conformal field theory (CFT) defined on the AdS boundary [2–4]. According to this correspondence, (1) can be interpreted as giving the expectation value of the stress tensor in the CFT:1
$$\langle T^{\mu\nu} \rangle = \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{eff}}{\delta\gamma_{\mu\nu}}. \tag{2}$$

The divergences which appear as the boundary is moved to infinity are then simply the standard ultraviolet divergences of quantum field theory, and may be removed by adding local counterterms to the action. These subtractions depend only on the intrinsic geometry of the boundary and are defined once and for all, in contrast to the ambiguous prescription involving embedding the boundary in a reference spacetime. This interpretation of divergences was first discussed in [4], and has been applied to various computations in, e.g., [8–11].

Inspired by the proposed correspondence, we develop a new procedure for defining the stress tensor of asymptotically locally anti-de Sitter spacetimes. We renormalize the stress-energy of gravity by adding a finite series in boundary curvature invariants to the action. The required terms are fixed essentially uniquely by requiring finiteness of the stress tensor. We then show that we correctly reproduce the masses and angular momenta of various asymptotically AdS spacetimes. See, e.g., [12–17] for previous studies of energy in AdS.

According to (2), our definition should also exhibit the properties of a stress tensor in a quantum CFT. The boundary stress tensor of AdS$_3$ is expected to transform under diffeomorphisms as a tensor plus a Schwarzian derivative. We verify this transformation rule, and so derive the existence of a Virasoro algebra with central charge $c = 3\ell/2G$, in agreement with the result of Brown and Henneaux [18]. We also demonstrate that the stress tensor acquires the correct trace anomaly $T^\mu_\mu = -\frac{c}{24\pi} R$.

The candidate dual to AdS$_5$ gravity is four-dimensional $\mathcal{N} = 4$ super Yang–Mills theory. Our procedure for computing the spacetime stress tensor (1) reproduces the expected trace anomaly of the gauge theory. An interesting – and at first surprising – feature of our stress tensor is that it is generally non-vanishing even when the bulk geometry is exactly AdS. In particular, global AdS$_5$, with an $S^3 \times \mathbb{R}$ boundary, has a positive mass. In contrast, the reference spacetime approach, by construction, gives pure AdS a vanishing mass. Our result is beautifully explained via the proposed duality with a boundary CFT. The dual super Yang–Mills theory on a sphere has a Casimir energy that precisely matches our computed spacetime mass. We conclude by discussing prospects for defining an analogous quasilocal stress tensor in asymptotically flat spacetimes.

1 See [5,6,9,7] for some interesting examples.


2. Defining the Stress Tensor

Brown and York's definition of the quasilocal stress tensor is motivated by Hamilton–Jacobi theory [1]. The energy of a point particle is the variation of the action with respect to time: $E = -\partial S/\partial t$. In gravity, lengths are measured by the metric, so time is naturally replaced by the boundary metric $\gamma_{\mu\nu}$, yielding a full stress tensor $T^{\mu\nu}$:
$$T^{\mu\nu} = \frac{2}{\sqrt{-\gamma}}\frac{\delta S}{\delta\gamma_{\mu\nu}}. \tag{3}$$
Here $S = S_{grav}(\gamma_{\mu\nu})$ is the gravitational action viewed as a functional of $\gamma_{\mu\nu}$. Of course, this is also the standard formula for the stress tensor of a field theory with action $S$ defined on a surface with metric $\gamma_{\mu\nu}$. The gravitational action with cosmological constant $\Lambda = -d(d-1)/2\ell^2$ is:
$$S = \frac{1}{16\pi G}\int_{\mathcal{M}} d^{d+1}x\,\sqrt{g}\left(R + \frac{d(d-1)}{\ell^2}\right) - \frac{1}{8\pi G}\int_{\partial\mathcal{M}} d^{d}x\,\sqrt{-\gamma}\,\Theta + \frac{1}{8\pi G}\,S_{ct}(\gamma_{\mu\nu}). \tag{4}$$
The second term is required for a well defined variational principle (see, e.g., [19]), and $S_{ct}$ is the counterterm action that we will add in order to obtain a finite stress tensor. $\Theta$ is the trace of the extrinsic curvature of the boundary, and is defined below.

Consider foliating the $d + 1$ dimensional spacetime $\mathcal{M}$ by a series of $d$ dimensional timelike surfaces homeomorphic to the boundary $\partial\mathcal{M}$. We let $x^\mu$ be coordinates spanning a given timelike surface, and let $r$ be the remaining coordinate. It is convenient to write the spacetime metric in an ADM-like decomposition [19]:
$$ds^2 = N^2 dr^2 + \gamma_{\mu\nu}(dx^\mu + N^\mu dr)(dx^\nu + N^\nu dr). \tag{5}$$
Here $\gamma_{\mu\nu}$ is a function of all the coordinates, including $r$. We will refer to the surface at fixed $r$ as the boundary $\partial\mathcal{M}_r$ to the interior region $\mathcal{M}_r$. The metric on $\partial\mathcal{M}_r$ is $\gamma_{\mu\nu}$ evaluated at the boundary value of $r$. In AdS, the boundary metric acquires an infinite Weyl factor as we take $r$ to infinity. So we will more properly think of the AdS boundary as a conformal class of boundaries (see, e.g., [4]).

To compute the quasilocal stress tensor for the region $\mathcal{M}_r$ we need to know the variation of the gravitational action with respect to the boundary metric $\gamma_{\mu\nu}$.2 In general, varying the action produces a bulk term proportional to the equations of motion plus a boundary term. Since we will always consider solutions to the equations of motion, only the boundary term contributes:
$$\delta S = \int_{\partial\mathcal{M}_r} d^{d}x\,\pi^{\mu\nu}\,\delta\gamma_{\mu\nu} + \frac{1}{8\pi G}\int_{\partial\mathcal{M}_r} d^{d}x\,\frac{\delta S_{ct}}{\delta\gamma_{\mu\nu}}\,\delta\gamma_{\mu\nu}, \tag{6}$$
where $\pi^{\mu\nu}$ is the momentum conjugate to $\gamma_{\mu\nu}$ evaluated at the boundary:
$$\pi^{\mu\nu} = \frac{1}{16\pi G}\sqrt{-\gamma}\,(\Theta^{\mu\nu} - \Theta\gamma^{\mu\nu}). \tag{7}$$

2 See [1] for a detailed development of the formalism.


Here the extrinsic curvature is
$$\Theta^{\mu\nu} = -\frac12\,(\nabla^\mu \hat n^\nu + \nabla^\nu \hat n^\mu), \tag{8}$$
where $\hat n^\nu$ is the outward pointing normal vector to the boundary $\partial\mathcal{M}_r$. The quasilocal stress tensor is thus
$$T^{\mu\nu} = \frac{1}{8\pi G}\left[\Theta^{\mu\nu} - \Theta\gamma^{\mu\nu} + \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma_{\mu\nu}}\right]. \tag{9}$$
$S_{ct}$ must be chosen to cancel divergences that arise as $\partial\mathcal{M}_r$ tends to the AdS boundary $\partial\mathcal{M}$. In this limit we expect to reproduce standard computations of the mass of asymptotically AdS spacetimes [12,15,13,16,17]. Brown and York propose to embed $\partial\mathcal{M}_r$ in a pure AdS background and to let $S_{ct}$ be the action of the resulting spacetime region. A similar reference spacetime approach is taken by the authors of [15–17]. However, as noted by all these authors, it is not always possible to find such an embedding, and so the prescription is not generally well-defined. A reference spacetime is also implicitly present in the treatment of Abbott and Deser [12], which constructs a Noether current for fluctuations around pure AdS. Finally, Ashtekar and Magnon [13] exploit the conformal structure of asymptotically AdS spaces to directly compute finite conserved charges. It would be interesting to understand the relation of our work to their approach.

We propose an alternative procedure: take $S_{ct}$ to be a local functional of the intrinsic geometry of the boundary, chosen to cancel the $\partial\mathcal{M}_r \to \partial\mathcal{M}$ divergences in (9). Here we set $S_{ct} = \int_{\partial\mathcal{M}_r} L_{ct}$, and state our results for AdS$_3$, AdS$_4$, and AdS$_5$:
$$\begin{aligned}
\text{AdS}_3:\quad & L_{ct} = -\frac{1}{\ell}\sqrt{-\gamma} && \Rightarrow\quad T^{\mu\nu} = \frac{1}{8\pi G}\left[\Theta^{\mu\nu} - \Theta\gamma^{\mu\nu} - \frac{1}{\ell}\gamma^{\mu\nu}\right], \\
\text{AdS}_4:\quad & L_{ct} = -\frac{2}{\ell}\sqrt{-\gamma}\left(1 - \frac{\ell^2}{4}R\right) && \Rightarrow\quad T^{\mu\nu} = \frac{1}{8\pi G}\left[\Theta^{\mu\nu} - \Theta\gamma^{\mu\nu} - \frac{2}{\ell}\gamma^{\mu\nu} - \ell\,G^{\mu\nu}\right], \\
\text{AdS}_5:\quad & L_{ct} = -\frac{3}{\ell}\sqrt{-\gamma}\left(1 - \frac{\ell^2}{12}R\right) && \Rightarrow\quad T^{\mu\nu} = \frac{1}{8\pi G}\left[\Theta^{\mu\nu} - \Theta\gamma^{\mu\nu} - \frac{3}{\ell}\gamma^{\mu\nu} - \frac{\ell}{2}G^{\mu\nu}\right].
\end{aligned} \tag{10}$$
All tensors above refer to the boundary metric $\gamma_{\mu\nu}$, and $G_{\mu\nu} = R_{\mu\nu} - \frac12 R\gamma_{\mu\nu}$ is the Einstein tensor of $\gamma_{\mu\nu}$. As we will see, the terms appearing in $S_{ct}$ are fixed essentially uniquely by requiring cancellation of divergences. The number of counterterms required grows with the dimension of AdS space. In general, we are also free to add terms of higher mass dimension to the counterterm action for AdS$_{d+1}$. But when $d$ is odd, dimensional analysis shows that these terms make no contribution to $T^{\mu\nu}$ as the boundary is taken to infinity. For $d$ even there is one potential ambiguity which we will explain and exorcise in later sections. The addition of $S_{ct}$ does not affect the bulk equations of motion or the Gibbons–Hawking


black hole entropy calculations because the new terms are intrinsic invariants of the boundary. After adding the counterterms (11), the stress tensor (9) has a well defined limit as ∂Mr → ∂M. (More precisely, dimensional analysis determines the scaling of the stress tensor with the diverging Weyl factor of the boundary metric. However, observables like mass and angular momentum will be r independent.) To assign a mass to an asymptotically AdS geometry, choose a spacelike surface 6 in ∂M with metric σab , and write the boundary metric in ADM form: γµν dx µ dx ν = −N62 dt 2 + σab (dx a + N6a dt)(dx b + N6b dt).

(11)

Then let $u^\mu$ be the timelike unit normal to $\Sigma$. $u^\mu$ defines the local flow of time in $\partial M$. If $\xi^\mu$ is a Killing vector generating an isometry of the boundary geometry, there should be an associated conserved charge. Following Brown and York [1], this charge is:

$$Q_\xi = \int_\Sigma d^{d-1}x\,\sqrt{\sigma}\,(u^\mu T_{\mu\nu}\xi^\nu). \tag{12}$$

The conserved charge associated with time translation is then the mass of spacetime. Alternatively, we can define a proper energy density

$$\epsilon = u^\mu u^\nu T_{\mu\nu}. \tag{13}$$

To convert to mass, multiply by the lapse $N_\Sigma$ appearing in (11) and integrate:

$$M = \int_\Sigma d^{d-1}x\,\sqrt{\sigma}\,N_\Sigma\,\epsilon. \tag{14}$$

This definition of mass coincides with the conserved quantity in (12) when the timelike Killing vector is $\xi^\mu = N_\Sigma u^\mu$. Similarly, we can define a momentum

$$P_a = \int_\Sigma d^{d-1}x\,\sqrt{\sigma}\,j_a, \tag{15}$$

where

$$j_a = \sigma_{ab}u_\mu T^{b\mu}. \tag{16}$$

When a is an angular direction, $P_a$ is the corresponding angular momentum. Although we have only written the gravitational action in (4), our formulae are equally valid in the presence of matter. In particular, (14) and (15) give the total mass and momentum of the entire matter plus gravity system.

3. AdS3

We begin with the relatively simple case of AdS3. We will show that our prescription correctly computes the mass and angular momentum of BTZ black holes, and reproduces the transformation law and conformal anomaly of the stress tensor in the dual CFT. The Poincaré patch of AdS3 can be written as:³

$$ds^2 = \frac{\ell^2}{r^2}dr^2 + \frac{r^2}{\ell^2}(-dt^2 + dx^2). \tag{17}$$

³ See, e.g., [21] for the embedding of the Poincaré patch in global AdS3.


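A boundary at fixed r is conformal to $R^{1,1}$: $-\gamma_{tt} = \gamma_{xx} = r^2/\ell^2$. The normal vector to surfaces of constant r is

$$\hat n^\mu = \frac{r}{\ell}\,\delta^{\mu,r}. \tag{18}$$

Applying (9) we find

$$\begin{aligned}
8\pi G\,T_{tt} &= -\frac{r^2}{\ell^3} + \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{tt}},\\
8\pi G\,T_{xx} &= \frac{r^2}{\ell^3} + \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{xx}},\\
8\pi G\,T_{tx} &= \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{tx}}.
\end{aligned} \tag{19}$$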

Neglecting $S_{ct}$, one would obtain divergent results for physical observables such as the mass

$$M = \int dx\,\sqrt{g_{xx}}\,N_\Sigma\,u^t u^t\,T_{tt} = \int dx\,T_{tt} \sim r^2 \to \infty. \tag{20}$$

So $T_{tt}$ must be independent of r for large r in order for the spacetime to have a finite mass density. $S_{ct}$ is defined essentially uniquely by the requirement that it be a local, covariant function of the intrinsic geometry of the boundary. It is readily shown that the only such term that can cancel the divergence in (20) is $S_{ct} = (-1/\ell)\int\sqrt{-\gamma}$. This then yields $T_{\mu\nu} = 0$, which is clearly free of divergences.

In general, we could have added further higher dimensional counterterms such as $R$ and $R^2$. Dimensional analysis shows that terms higher than $R$ vanish too rapidly at infinity to contribute to the stress tensor. The potential contribution from the metric variation of $R$ is $G_{\mu\nu}$, the Einstein tensor, which vanishes identically in two dimensions. So the minimal counterterm in (10) completely defines the AdS3 stress tensor.

Since the stress tensor is now fully specified, it must reproduce the mass and angular momentum of a known solution. To check this, we study spacetimes of the form:

$$ds^2 = \frac{\ell^2}{r^2}dr^2 + \frac{r^2}{\ell^2}(-dt^2 + dx^2) + \delta g_{MN}\,dx^M dx^N. \tag{21}$$

Working to first order in $\delta g_{MN}$, we find

$$\begin{aligned}
8\pi G\,T_{tt} &= \frac{r^4}{2\ell^5}\delta g_{rr} + \frac{\delta g_{xx}}{\ell} - \frac{r}{2\ell}\partial_r\delta g_{xx},\\
8\pi G\,T_{xx} &= \frac{\delta g_{tt}}{\ell} - \frac{r}{2\ell}\partial_r\delta g_{tt} - \frac{r^4}{2\ell^5}\delta g_{rr},\\
8\pi G\,T_{tx} &= \frac{1}{\ell}\delta g_{tx} - \frac{r}{2\ell}\partial_r\delta g_{tx}.
\end{aligned} \tag{22}$$

The mass and momentum are:

$$\begin{aligned}
M &= \frac{1}{8\pi G}\int dx\left[\frac{r^4}{2\ell^5}\delta g_{rr} + \frac{\delta g_{xx}}{\ell} - \frac{r}{2\ell}\partial_r\delta g_{xx}\right],\\
P_x &= -\frac{1}{8\pi G}\int dx\left[\frac{1}{\ell}\delta g_{tx} - \frac{r}{2\ell}\partial_r\delta g_{tx}\right].
\end{aligned} \tag{23}$$


We can apply these formulae to the spinning BTZ solution [20,21]:

$$ds^2 = -N^2 dt^2 + \rho^2(d\phi + N^\phi dt)^2 + \frac{r^2}{N^2\rho^2}dr^2 \tag{24}$$

with

$$N^2 = \frac{r^2(r^2 - r_+^2)}{\ell^2\rho^2},\qquad N^\phi = -\frac{4GJ}{\rho^2},\qquad \rho^2 = r^2 + 4GM\ell^2 - \frac{1}{2}r_+^2,\qquad r_+^2 = 8G\ell\sqrt{M^2\ell^2 - J^2}, \tag{25}$$

where $\phi$ has period $2\pi$. Expanding the metric for large r we find

$$\delta g_{rr} = \frac{8GM\ell^4}{r^4},\qquad \delta g_{tt} = 8GM,\qquad \delta g_{t\phi} = -4GJ. \tag{26}$$

Inserting these into (23) with $x \to \ell\phi$ and $\int dx \to \ell\int_0^{2\pi}d\phi$ gives the correct relations $M = M$ and $P_\phi = J$ in agreement with conventional techniques. When $M = -1/8G$ and $J = 0$, the BTZ metric reproduces global AdS3, while the $M = 0$, $J = 0$ black hole looks like Poincaré AdS3 (17) with an identification of the boundary. It may seem surprising that global AdS3 apparently differs in mass from the Poincaré patch. The difference arises because the time directions of these coordinates do not agree, giving rise to different definitions of energy.

3.1. Conformal Symmetry of AdS3. Brown and Henneaux [18] have shown that gravity in asymptotically AdS3 spacetime is a conformal field theory with central charge $c = 3\ell/2G$. Both as a check of our approach, and because our covariant method will offer an alternative to the Hamiltonian formalism adopted in [18] and the Chern–Simons methods of [22], we would like to reproduce this result.⁴ In light of the AdS/CFT correspondence, we can think of the conformal symmetry group as arising from a 1 + 1 dimensional non-gravitational quantum field theory living (loosely speaking) on the boundary of AdS3. On a plane with metric $ds^2 = -dx^+ dx^-$, diffeomorphisms of the form

$$x^+ \to x^+ - \xi^+(x^+),\qquad x^- \to x^- - \xi^-(x^-) \tag{27}$$

transform the stress tensor as:

$$\begin{aligned}
T_{++} &\to T_{++} + (2\partial_+\xi^+\,T_{++} + \xi^+\partial_+T_{++}) - \frac{c}{24\pi}\partial_+^3\xi^+,\\
T_{--} &\to T_{--} + (2\partial_-\xi^-\,T_{--} + \xi^-\partial_-T_{--}) - \frac{c}{24\pi}\partial_-^3\xi^-.
\end{aligned} \tag{28}$$

The terms in parentheses are just the classical tensor transformation rules, while the last term is a quantum effect. Let us briefly recall the origin of the latter. Although (27) is classically a symmetry of the CFT, it is quantum mechanically anomalous since we must specify a renormalization scale $\mu$. To obtain a symmetry under (27), $\mu$ must also be rescaled to have the same measured value in the new coordinates as in the original coordinates. Equivalently, the metric should be Weyl rescaled to preserve the form $ds^2 = -dx^+ dx^-$. Such a rescaling of lengths acts non-trivially in the quantum theory and produces the extra terms in (28).

⁴ Related work has been done by Hyun et al. [9].

We will focus on obtaining the final terms in (28) by starting from AdS3 in the form

$$ds^2 = \frac{\ell^2}{r^2}dr^2 - r^2 dx^+ dx^-, \tag{29}$$

for which $T_{\mu\nu} = 0$. We think of the dual CFT as living on the surface $ds^2 = -r^2 dx^+ dx^-$ with r eventually taken to infinity. Now consider the diffeomorphism (27). As above, this is not a symmetry since it introduces a Weyl factor into the boundary metric. To obtain a symmetry one must leave the asymptotic form of the metric invariant, and the precise conditions for doing so have been given by Brown and Henneaux [18]:

$$\begin{aligned}
g_{+-} &= -\frac{r^2}{2} + O(1), & g_{++} &= O(1), & g_{--} &= O(1),\\
g_{rr} &= \frac{\ell^2}{r^2} + O\!\left(\frac{1}{r^4}\right), & g_{+r} &= O\!\left(\frac{1}{r^3}\right), & g_{-r} &= O\!\left(\frac{1}{r^3}\right).
\end{aligned} \tag{30}$$

The diffeomorphisms which respect these conditions are:

$$\begin{aligned}
x^+ &\to x^+ - \xi^+ - \frac{\ell^2}{2r^2}\partial_-^2\xi^-,\\
x^- &\to x^- - \xi^- - \frac{\ell^2}{2r^2}\partial_+^2\xi^+,\\
r &\to r + \frac{r}{2}(\partial_+\xi^+ + \partial_-\xi^-).
\end{aligned} \tag{31}$$

For large r, the corrections to the $x^\pm$ transformations are subleading, and we recover (27). The metric then transforms as

$$ds^2 \to \frac{\ell^2}{r^2}dr^2 - r^2 dx^+ dx^- - \frac{\ell^2}{2}(\partial_+^3\xi^+)(dx^+)^2 - \frac{\ell^2}{2}(\partial_-^3\xi^-)(dx^-)^2. \tag{32}$$

Since the asymptotic metric retains its form, this transformation is a symmetry. Using (32) we compute the stress tensor to be

$$T_{++} = -\frac{\ell}{16\pi G}\partial_+^3\xi^+,\qquad T_{--} = -\frac{\ell}{16\pi G}\partial_-^3\xi^-. \tag{33}$$

This agrees with (28) if

$$c = \frac{3\ell}{2G}. \tag{34}$$

Thus we have verified the result of Brown and Henneaux [18]. In the CFT the full transformation law arose from doing a renormalization group rescaling of $\mu$, while on the gravity side it arose from a diffeomorphism which rescaled the radial position of the boundary. This fits very nicely with the general feature of the AdS/CFT correspondence that scale size in the CFT is dual to the radial position in AdS. According to [23], r specifies an effective UV cutoff in the CFT; by rescaling r before taking it to infinity we are changing the way in which the cutoff is removed, and this is just the definition of a renormalization group transformation. We restricted attention to the diffeomorphism (31) because we were interested in symmetries which preserved the form of the boundary metric. More general diffeomorphisms may be studied, but these will modify the form of the CFT and so are not symmetries.
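The step from (32) to (33) and (34) can be spelled out. What follows is a sketch rather than the authors' own computation: it assumes the AdS3 stress tensor in the lowered-index form $8\pi G\,T_{\mu\nu} = \Theta_{\mu\nu} - \Theta\gamma_{\mu\nu} - \gamma_{\mu\nu}/\ell$ with $\Theta_{\mu\nu} = -(r/2\ell)\partial_r\gamma_{\mu\nu}$, and it introduces the shorthand $\delta g_{++} = -(\ell^2/2)\,\partial_+^3\xi^+$ (a label used only here) for the r-independent perturbation read off from (32).

$$\begin{aligned}
\Theta_{++} &= -\frac{r}{2\ell}\,\partial_r\,\delta g_{++} = 0, \qquad \Theta = -\frac{2}{\ell} + O(\delta g^2),\\
8\pi G\,T_{++} &= \Theta_{++} - \Theta\,\delta g_{++} - \frac{1}{\ell}\,\delta g_{++} = \frac{1}{\ell}\,\delta g_{++} = -\frac{\ell}{2}\,\partial_+^3\xi^+,\\
T_{++} &= -\frac{\ell}{16\pi G}\,\partial_+^3\xi^+ \;=\; -\frac{c}{24\pi}\,\partial_+^3\xi^+
\quad\Longrightarrow\quad c = \frac{24\pi\,\ell}{16\pi G} = \frac{3\ell}{2G}.
\end{aligned}$$

The second equality in the last line uses (28) with the background value $T_{++} = 0$; the $T_{--}$ component follows in the same way with $+ \leftrightarrow -$.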


3.2. Conformal Anomaly for AdS3. The stress tensor of a 1 + 1 dimensional CFT has a trace anomaly

$$T^\mu_\mu = -\frac{c}{24\pi}R. \tag{35}$$

We will now verify that our quasilocal stress tensor has a trace of precisely this form. The mechanism for obtaining a conformal anomaly from the AdS/CFT correspondence was outlined by Witten [4] and studied in detail by Henningson and Skenderis [8]. Our approach is somewhat different from that of [8]. Taking the trace of the AdS3 stress tensor appearing in (10) we find

$$T^\mu_\mu = -\frac{1}{8\pi G}\left(\Theta + \frac{2}{\ell}\right). \tag{36}$$

Equation (36) gives the trace in terms of the extrinsic curvature; to compare with (35) we need to express the result in terms of the intrinsic curvature of the boundary. Since (36) is manifestly covariant, we may compute the right-hand side in any convenient coordinate system. We write

$$ds^2 = \frac{\ell^2}{r^2}dr^2 + \gamma_{\mu\nu}dx^\mu dx^\nu. \tag{37}$$

The extrinsic curvature in these coordinates is

$$\Theta_{\mu\nu} = -\frac{r}{2\ell}\partial_r\gamma_{\mu\nu}. \tag{38}$$

So in this coordinate system (36) becomes

$$T^\mu_\mu = -\frac{1}{8\pi G}\left(-\frac{r}{2\ell}\gamma^{\mu\nu}\partial_r\gamma_{\mu\nu} + \frac{2}{\ell}\right). \tag{39}$$

To complete the calculation we need $\gamma_{\mu\nu}$ as a power series in 1/r. Einstein’s equations show [24] that only even powers appear and that the leading term goes as $r^2$. So we write

$$\gamma_{\mu\nu} = r^2\gamma^{(0)}_{\mu\nu} + \gamma^{(2)}_{\mu\nu} + \cdots. \tag{40}$$

There are additional higher powers of 1/r as well as logarithmic terms [24], but these will not be needed. We now have

$$T^\mu_\mu = -\frac{1}{8\pi G}\,\frac{1}{\ell r^2}\,\mathrm{Tr}\left[(\gamma^{(0)})^{-1}\gamma^{(2)}\right] + \cdots. \tag{41}$$

Solving Einstein’s equations perturbatively gives [8]

$$\mathrm{Tr}\left[(\gamma^{(0)})^{-1}\gamma^{(2)}\right] = \frac{\ell^2 r^2}{2}R, \tag{42}$$

where $R$ is the curvature of the metric $\gamma_{\mu\nu}$. Finally, inserting this into (41) and taking r to infinity we obtain

$$T^\mu_\mu = -\frac{\ell}{16\pi G}R, \tag{43}$$

which agrees with (35) when $c = 3\ell/2G$.
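For readers who want the intermediate algebra between (39) and (43), the following sketch may help. It assumes only the expansion (40) and the two-dimensionality of the boundary (so that $\mathrm{Tr}\,\mathbb{1} = 2$); the shorthand $A \equiv (\gamma^{(0)})^{-1}\gamma^{(2)}$ is introduced here for convenience and does not appear in the original.

$$\begin{aligned}
\gamma^{-1} &= r^{-2}\left(\mathbb{1} - r^{-2}A + \cdots\right)(\gamma^{(0)})^{-1}, \qquad \partial_r\gamma = 2r\,\gamma^{(0)} + \cdots,\\
\gamma^{\mu\nu}\partial_r\gamma_{\mu\nu} &= \mathrm{Tr}\left(\gamma^{-1}\partial_r\gamma\right) = \frac{4}{r} - \frac{2}{r^3}\,\mathrm{Tr}\,A + \cdots,\\
-\frac{r}{2\ell}\,\gamma^{\mu\nu}\partial_r\gamma_{\mu\nu} + \frac{2}{\ell} &= \frac{1}{\ell r^2}\,\mathrm{Tr}\,A + \cdots,\\
T^\mu_\mu &= -\frac{1}{8\pi G}\,\frac{1}{\ell r^2}\cdot\frac{\ell^2 r^2}{2}\,R = -\frac{\ell}{16\pi G}\,R .
\end{aligned}$$

The leading $2/\ell$ pieces cancel, which is the counterterm doing its job; matching the last line to (35) gives $c/24\pi = \ell/16\pi G$, that is, $c = 3\ell/2G$.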


4. AdS4

The only difference between the AdS4 and AdS3 stress tensor derivations is the need for an extra term in $S_{ct}$ to cancel divergences. Again, start with AdS4 in Poincaré form:

$$ds^2 = \frac{\ell^2}{r^2}dr^2 + \frac{r^2}{\ell^2}(-dt^2 + dx_i dx_i),\qquad i = 1, 2. \tag{44}$$

Following Sect. 3, we compute the mass of the spacetime and demand that it be finite:

$$M = \int d^2x\,\sqrt{g_{xx}}\,N_\Sigma\,u^t u^t\,T_{tt} = \int d^2x\,\frac{r}{\ell}\,T_{tt}. \tag{45}$$

A finite mass density requires $T_{tt} \sim r^{-1}$ for large r. Evaluating the stress tensor for the metric (44), we find

$$\begin{aligned}
8\pi G\,T_{tt} &= -2\frac{r^2}{\ell^3} + \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{tt}},\\
8\pi G\,T_{x_i x_j} &= 2\frac{r^2}{\ell^3}\delta_{ij} + \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{x_i x_j}},\\
8\pi G\,T_{t x_i} &= \frac{2}{\sqrt{-\gamma}}\frac{\delta S_{ct}}{\delta\gamma^{t x_i}}.
\end{aligned} \tag{46}$$

The divergences are cancelled by choosing $S_{ct} = -\frac{2}{\ell}\int\sqrt{-\gamma}$; in particular we find that $T_{\mu\nu} = 0$. Now consider AdS4 in global coordinates:

$$ds^2 = -\left(1 + \frac{r^2}{\ell^2}\right)dt^2 + \frac{dr^2}{1 + \frac{r^2}{\ell^2}} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{47}$$

It is easy to show that the mass is still given by (45) in the limit $r \to \infty$, after replacing $d^2x$ by $\sin\theta\,d\theta\,d\phi$. We find that the counterterm introduced above correctly removes the $r^2$ divergence in $T_{\mu\nu}$, but there remains an $r^0$ behaviour (leading to a divergent mass) which can be cancelled by adding $\frac{\ell}{2}\int\sqrt{-\gamma}\,R$ to $S_{ct}$. Altogether, this gives the counterterm action written in (10). We are free to add higher dimensional objects like $R^2$ to $S_{ct}$, but they vanish too quickly at the AdS4 boundary to contribute to the stress tensor. In total, the stress tensor for the metric (47) is:

$$\begin{aligned}
8\pi G\,T_{tt} &= \frac{\ell}{4r^2} + \cdots,\\
8\pi G\,T_{\theta\theta} &= \frac{\ell^3}{4r^2} + \cdots,\\
8\pi G\,T_{\phi\phi} &= \frac{\ell^3}{4r^2}\sin^2\theta + \cdots.
\end{aligned} \tag{48}$$

We test our definition on the AdS4-Schwarzschild solution:

$$ds^2 = -\left[\frac{r^2}{\ell^2} + 1 - \frac{r_0}{r}\right]dt^2 + \left[\frac{r^2}{\ell^2} + 1 - \frac{r_0}{r}\right]^{-1}dr^2 + r^2 d\Omega_2^2. \tag{49}$$

We find

$$8\pi G\,T_{tt} = \frac{r_0}{\ell r} + \cdots, \tag{50}$$

leading to a mass

$$M = \frac{r_0}{2G}. \tag{51}$$

This agrees with the standard definition of the AdS4 black hole mass.

4.1. Conformal Anomaly for AdS4. Direct computation shows that the stress tensor for AdS4 is traceless. There is also a general argument that the trace vanishes for any even dimensional AdS, which we give instead. The stress tensor for AdS$_{d+1}$ has length dimension $-d$. Since for large r the Weyl factor multiplying the boundary metric is proportional to $r^2$, it must be the case that

$$T^\mu_\mu \sim \frac{1}{r^d}. \tag{52}$$

Working in coordinates like (37), the trace has the structure

$$T^\mu_\mu \sim r\,\gamma^{\mu\nu}\partial_r\gamma_{\mu\nu} + (\text{curvature invariants of } \gamma_{\mu\nu}). \tag{53}$$

Now, $\gamma_{\mu\nu}$ has an expansion in even powers of r [24]:

$$\gamma_{\mu\nu} = r^2\sum_{n=0}^{\infty}\frac{\gamma^{(2n)}_{\mu\nu}}{r^{2n}}. \tag{54}$$

Using this in (53), and the fact that scalar curvature invariants always involve even powers of the metric, we find that only even powers of r can appear in the trace. Comparing with (52) shows that the trace of the stress tensor must vanish for odd d. This result is expected from the AdS/CFT correspondence, since even dimensional AdS bulk theories are dual to odd dimensional CFTs, which have a vanishing trace anomaly.

5. AdS5

The AdS5 counterterms are derived in parallel with AdS4, so we can be brief. The expression for the spacetime mass is now:

$$M = \int d^3x\,\sqrt{g_{xx}}\,N_\Sigma\,u^t u^t\,T_{tt} = \int d^3x\,\frac{r^2}{\ell^2}\,T_{tt}. \tag{55}$$

A finite mass density therefore requires $T_{tt} \sim r^{-2}$ for large r. Upon evaluating the stress tensor in Poincaré and global coordinates and imposing finiteness, we arrive at the counterterms written in (10). By dimensional analysis, the only possible higher dimensional terms in $S_{ct}$ that could make a finite contribution to the stress tensor are the squares of the Riemann tensor, the Ricci tensor and the Ricci scalar of the boundary metric. We will discuss these potential ambiguities in Sect. 5.1.
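The Poincaré-patch half of this finiteness argument is easy to check with exact arithmetic. The sketch below is an illustration, not the authors' code: it assumes the flat-boundary metric $\gamma_{\mu\nu} = (r^2/\ell^2)\eta_{\mu\nu}$, the extrinsic curvature $\Theta_{\mu\nu} = -(r/2\ell)\partial_r\gamma_{\mu\nu}$ of (38), and the leading counterterm coefficients $1/\ell$, $2/\ell$, $3/\ell$ of the AdS3, AdS4, AdS5 stress tensors, written uniformly as $(d-1)/\ell$. The function name and the `with_counterterm` flag are hypothetical. The curvature terms drop out because the boundary is flat.

```python
from fractions import Fraction


def poincare_stress_tensor(d, r, ell, with_counterterm=True):
    """Diagonal entries of 8*pi*G*T_{mu nu} on the Poincare patch of AdS_{d+1}.

    Assumes gamma_{mu nu} = (r^2/ell^2) * eta_{mu nu} with a flat boundary,
    so the Einstein-tensor terms in the counterterm vanish.
    """
    eta = [-1] + [1] * (d - 1)                        # boundary Minkowski signature
    gamma = [e * r**2 / ell**2 for e in eta]          # gamma_{mu nu}
    d_gamma = [e * 2 * r / ell**2 for e in eta]       # d(gamma_{mu nu})/dr
    theta = [-(r / (2 * ell)) * dg for dg in d_gamma]  # Theta_{mu nu}
    trace = sum(t / g for t, g in zip(theta, gamma))   # Theta = gamma^{mu nu} Theta_{mu nu}
    # Leading counterterm coefficient (d-1)/ell: 1/ell, 2/ell, 3/ell for d = 2, 3, 4.
    coeff = Fraction(d - 1) / ell if with_counterterm else Fraction(0)
    return [t - trace * g - coeff * g for t, g in zip(theta, gamma)]


for d in (2, 3, 4):
    bare = poincare_stress_tensor(d, Fraction(10), Fraction(1), with_counterterm=False)
    full = poincare_stress_tensor(d, Fraction(10), Fraction(1))
    print(d, bare[0], full[0])
```

Without the counterterm the tt component grows like $-(d-1)r^2/\ell^3$, matching the first lines of (19) and (46); with it every component vanishes identically on this background.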


We now check our definition against the known mass of particular solutions. Consider the metric

$$ds^2 = \frac{r^2}{\ell^2}\left[-\left(1 - \frac{r_0^4}{r^4}\right)dt^2 + (dx_i)^2\right] + \left(1 - \frac{r_0^4}{r^4}\right)^{-1}\frac{\ell^2}{r^2}dr^2 \tag{56}$$

that arises in the near-horizon limit of the D3-brane (see, e.g., [17]). The stress tensor is

$$\begin{aligned}
8\pi G\,T_{tt} &= \frac{3r_0^4}{2\ell^3 r^2} + \cdots,\\
8\pi G\,T_{x_i x_i} &= \frac{r_0^4}{2\ell^3 r^2} + \cdots.
\end{aligned} \tag{57}$$

Using (55) gives

$$M = \frac{3r_0^4}{16\pi G\ell^5}\int d^3x. \tag{58}$$

This agrees with the standard formula for the mass density of this solution [17]. Next, consider the AdS-Schwarzschild black hole solution,

$$ds^2 = -\left[\frac{r^2}{\ell^2} + 1 - \left(\frac{r_0}{r}\right)^2\right]dt^2 + \frac{dr^2}{\frac{r^2}{\ell^2} + 1 - \left(\frac{r_0}{r}\right)^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2 + \cos^2\theta\,d\psi^2). \tag{59}$$

Note that $r_0 = 0$ gives the global AdS5 metric. We find

$$\begin{aligned}
8\pi G\,T_{tt} &= \frac{3\ell}{8r^2} + \frac{3r_0^2}{2\ell r^2} + \cdots,\\
8\pi G\,T_{\theta\theta} &= \frac{\ell^3}{8r^2} + \frac{\ell r_0^2}{2r^2} + \cdots,\\
8\pi G\,T_{\phi\phi} &= \left(\frac{\ell^3}{8r^2} + \frac{\ell r_0^2}{2r^2}\right)\sin^2\theta + \cdots,\\
8\pi G\,T_{\psi\psi} &= \left(\frac{\ell^3}{8r^2} + \frac{\ell r_0^2}{2r^2}\right)\cos^2\theta + \cdots.
\end{aligned} \tag{60}$$

The mass is

$$M = \frac{3\pi\ell^2}{32G} + \frac{3\pi r_0^2}{8G}. \tag{61}$$

The standard mass of this solution is $3\pi r_0^2/8G$ [17], which is the second term of our result (61). We have the additional constant $3\pi\ell^2/32G$ which is then the mass of pure global AdS5 when $r_0 = 0$. It seems unusual from the gravitational point of view to have a mass for a solution that is a natural vacuum, but we will show that this is precisely correct from the perspective of the AdS/CFT correspondence.
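The passage from the stress tensor (60) to the mass (61) can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it uses the mass definition (14) with $\epsilon = T_{tt}/N_\Sigma^2$, reads the lapse and the spatial measure $\sqrt{\sigma} = r^3\sin\theta\cos\theta$ off the metric (59), takes $\theta \in [0, \pi/2]$ and $\phi, \psi \in [0, 2\pi]$, and keeps only the leading large-r term of $T_{tt}$. All function names are hypothetical.

```python
import math


def unit_s3_volume(n=2000):
    """Midpoint-rule integral of sin(th)*cos(th) dth dphi dpsi over the unit 3-sphere."""
    h = (math.pi / 2) / n
    theta_part = h * sum(
        math.sin((i + 0.5) * h) * math.cos((i + 0.5) * h) for i in range(n)
    )
    return theta_part * (2 * math.pi) ** 2


def mass_ads5(r, ell, r0, G):
    """M = integral of sqrt(sigma) * N * eps at radius r, with eps = T_tt / N^2."""
    lapse = math.sqrt(r**2 / ell**2 + 1 - (r0 / r) ** 2)
    # Leading large-r term of T_tt, taken from the first line of (60).
    T_tt = (3 * ell / (8 * r**2) + 3 * r0**2 / (2 * ell * r**2)) / (8 * math.pi * G)
    eps = T_tt / lapse**2
    return unit_s3_volume() * r**3 * lapse * eps


def mass_formula_61(ell, r0, G):
    """Closed form quoted in (61)."""
    return 3 * math.pi * ell**2 / (32 * G) + 3 * math.pi * r0**2 / (8 * G)


print(mass_ads5(1e3, 1.0, 2.0, 1.0), mass_formula_61(1.0, 2.0, 1.0))
```

At large r the two numbers agree up to corrections of relative size $\ell^2/r^2$ plus the quadrature error, which is the sense in which the mass is r independent.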


Casimir Energy.⁵ String theory on AdS5 × S⁵ is expected to be dual to four dimensional N = 4, SU(N) super Yang–Mills [2]. We use the conversion formula to gauge theory variables:

$$\frac{\ell^3}{G} = \frac{2N^2}{\pi}. \tag{62}$$

Then, setting $r_0 = 0$, the mass of global AdS5 is:

$$M = \frac{3N^2}{16\ell}. \tag{63}$$

The Yang–Mills dual of AdS5 is defined on the global AdS5 boundary with topology $S^3 \times R$. A quantum field theory on such a manifold can have a nonvanishing vacuum energy, the Casimir effect. In the free field limit, the Casimir energy on $S^3 \times R$ is:⁶

$$E_{casimir} = \frac{1}{960r}(4n_0 + 17n_{1/2} + 88n_1), \tag{64}$$

where $n_0$ is the number of real scalars, $n_{1/2}$ is the number of Weyl fermions, $n_1$ is the number of gauge bosons, and r is the radius of $S^3$. For SU(N), N = 4 super Yang–Mills $n_0 = 6(N^2 - 1)$, $n_{1/2} = 4(N^2 - 1)$ and $n_1 = N^2 - 1$ giving:

$$E_{casimir} = \frac{3(N^2 - 1)}{16r}. \tag{65}$$

To compare with (63), remember that M is measured with respect to coordinate time while the Casimir energy is defined with respect to proper boundary time. Converting to coordinate time by multiplying by $\sqrt{-g_{tt}} = r/\ell$ gives the Casimir “mass”:

$$M_{casimir} = \frac{3(N^2 - 1)}{16\ell}. \tag{66}$$

⁵ We thank Gary Horowitz for pointing out the relevance of the CFT Casimir energy to our result, and for discussing his related work with Hirosi Ooguri.

⁶ Noting that $S^3 \times R$ is the Einstein static universe, we can adopt the results of [25].

In the large N limit we find precise agreement with the gravitational mass (63) of global AdS5. In related work, Horowitz and Myers [17] compared the mass of an analytically continued non-extremal D3-brane solution to the corresponding free-field Casimir energy in the gauge theory, and found agreement up to an overall factor of 3/4. They argued that the mathematical origin of the discrepancy was the same as for a 3/4 factor relating the gravitational entropy of the system to a free field entropy computation in the dual CFT [26]. In both cases, the gravitational result is valid at strong gauge coupling and, apparently, the extrapolation from the free limit of the gauge theory involves a factor of 3/4. In our case, however, the coefficients match precisely. In general, gravity calculations may not be extrapolated to the weakly coupled gauge theory, because large string theoretic corrections can deform the bulk geometry in this regime. This is the origin of the 3/4 factor discussed above. In our case, pure AdS5 is protected from stringy corrections because all tensors which might modify Einstein’s equation actually vanish when evaluated in this background [27]. This is why the Casimir energy in the weakly coupled, large N Yang–Mills exactly matches the gravitational mass of spacetime.
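The arithmetic behind (63) to (66) is short enough to verify exactly. The snippet below is a sketch with hypothetical function names; it assumes the field content quoted in the text (per adjoint generator: 6 real scalars, 4 Weyl fermions, 1 gauge boson) and the conversion (62), and it tracks only rational coefficients, since the factors of $\pi$ cancel.

```python
from fractions import Fraction


def casimir_coefficient(n0, n_half, n1):
    """Coefficient of 1/r in the free-field Casimir energy (64)."""
    return Fraction(4 * n0 + 17 * n_half + 88 * n1, 960)


def gravity_mass_coefficient():
    """Coefficient of N^2/ell in the mass of global AdS5.

    M = 3*pi*ell^2/(32 G) and ell^3/G = 2 N^2/pi give
    M = (3/32) * 2 * N^2/ell, with the factors of pi cancelling.
    """
    return Fraction(3, 32) * 2


# Per generator of SU(N): 6 scalars, 4 Weyl fermions, 1 gauge boson.
per_generator = casimir_coefficient(6, 4, 1)
print(per_generator, gravity_mass_coefficient())
```

Both coefficients come out equal, so the only difference between (63) and (66) is the replacement of $N^2$ by $N^2 - 1$, which is invisible at large N.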


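5.1. Conformal Anomaly for AdS5. The AdS5 conformal anomaly computation is a more laborious version of the AdS3 result in Sect. 3.2. The trace of the AdS5 stress tensor in (10) is

$$T^\mu_\mu = -\frac{1}{8\pi G}\left(3\Theta + \frac{12}{\ell} - \frac{\ell R}{2}\right). \tag{67}$$

Again, write the bulk metric in the form (37) so that (38) gives the extrinsic curvature, yielding

$$T^\mu_\mu = -\frac{1}{8\pi G}\left(-\frac{3r}{2\ell}\gamma^{\mu\nu}\partial_r\gamma_{\mu\nu} + \frac{12}{\ell} - \frac{\ell}{2}R(\gamma_{\mu\nu})\right). \tag{68}$$

To identify the anomaly we must compute $\gamma_{\mu\nu}$ to order $r^{-2}$:

$$\gamma_{\mu\nu} = r^2\gamma^{(0)}_{\mu\nu} + \gamma^{(2)}_{\mu\nu} + r^{-2}\gamma^{(4)}_{\mu\nu} + \cdots. \tag{69}$$

The coefficients are found to be [8]

$$\begin{aligned}
\gamma^{(2)}_{\mu\nu} &= \frac{\ell^2}{2}\left(R^{(0)}_{\mu\nu} - \frac{1}{6}R^{(0)}\gamma^{(0)}_{\mu\nu}\right),\\
\mathrm{Tr}\left[(\gamma^{(0)})^{-1}\gamma^{(4)}\right] &= \frac{1}{4}\mathrm{Tr}\left[\left((\gamma^{(0)})^{-1}\gamma^{(2)}\right)^2\right].
\end{aligned} \tag{70}$$

We also need the expansion of $R(\gamma_{\mu\nu})$:

$$R(\gamma_{\mu\nu}) = \frac{1}{r^2}R^{(0)} + \left.\frac{\delta R}{\delta\gamma_{\mu\nu}}\right|_{r^2\gamma^{(0)}}\gamma^{(2)}_{\mu\nu} = \frac{1}{r^2}R^{(0)} - \frac{\ell^2}{2r^4}\left(R^{\mu\nu}_{(0)}R^{(0)}_{\mu\nu} - \frac{1}{6}R_{(0)}^2\right). \tag{71}$$

Inserting these results into (68) and doing some algebra, one finds

$$T^\mu_\mu = -\frac{\ell^3}{8\pi G}\left(-\frac{1}{8}R^{\mu\nu}R_{\mu\nu} + \frac{1}{24}R^2\right). \tag{72}$$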

This result for the trace agrees with the work of Henningson and Skenderis [8]. These authors also show that upon using (62), precise agreement is obtained with the conformal anomaly of N = 4 super Yang–Mills.

An Ambiguity. The minimal AdS5 counterterm action in (10) can be augmented by the addition of terms quadratic in the Riemann tensor, Ricci tensor and Ricci scalar of the boundary metric.⁷ A convenient basis for this ambiguity is provided by:

$$\Delta S_{ct} = \ell^3\int_{\partial M_r} d^4x\,\sqrt{-\gamma}\left[aE + bC_{\mu\nu\rho\sigma}C^{\mu\nu\rho\sigma} + cR^2\right]. \tag{73}$$

The first term is the Euler invariant $E = R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} - 4R_{\mu\nu}R^{\mu\nu} + R^2$ and vanishes under variation, so we can omit it without loss of generality. $C^{\mu\nu\rho\sigma}$ is the Weyl tensor. Varying $\Delta S_{ct}$ with respect to the boundary metric produces an ambiguity in the stress tensor:

$$\Delta T_{\mu\nu} = \frac{\ell^3}{16\pi G}\left(bH^b_{\mu\nu} + cH^c_{\mu\nu}\right). \tag{74}$$

The tensors $H^b$ and $H^c$ are computed in [25]; their trace gives a contribution to the anomaly

$$\Delta T^\mu_\mu \propto \Box R. \tag{75}$$

⁷ Higher dimensional invariants give a vanishing contribution to the stress tensor at the AdS boundary.

For general boundary metrics there is therefore a two parameter set of possible stress tensors, whose anomalies have varying coefficients for $\Box R$. Exactly the same ambiguity is present in the definition of the renormalized stress tensor of the dual field theory on the curved boundary [25]. Our gravitational result can only be matched to field theory computations after the ambiguous parameters are matched. For conformally flat boundaries the tensor $H^b_{\mu\nu}$ vanishes leaving a one parameter ambiguity, which is fully specified by the coefficient of $\Box R$ in the anomaly. So we learn from (72) that gravitational energies computed with the minimal counterterm action $\Delta S_{ct} = 0$ should be compared with a field theory regularization which produces a vanishing $\Box R$ anomaly coefficient. Precisely this was done in the above comparison of Casimir energies for global AdS5. The boundary $S^3 \times R$ is conformally flat, and we have checked that the field theory computation that produces (64) yields no $\Box R$ term in the anomaly. This explains the agreement between the gravity and field theory results, despite the apparent ambiguity in choosing $\Delta S_{ct}$.

6. Discussion

We have formulated a stress tensor which gives a well-defined meaning to the notions of energy and momentum in AdS. Through the AdS/CFT correspondence, we have also found results for the expectation value of the stress tensor in the dual CFT. Our proposal exhibits the desired features of a stress tensor, both from the gravitational and CFT points of view. The procedure we have followed for defining the stress tensor is a particular example of the ideas developed in [28]. There it was shown how to associate the asymptotic behavior of each bulk field with the expectation value of a CFT operator. The relation studied here between the gravitational field and the stress tensor is an example of this correspondence.

It would be desirable to formulate an analogous stress tensor in asymptotically flat spacetimes. It is not immediately clear how to define counterterms, since there is no longer a dimensionful parameter like $\ell$ allowing one to form a dimensionless counterterm action. On the other hand, flat spacetime is recovered from AdS by taking $\ell \to \infty$, so we might expect that applying this limit to our formulae would yield the appropriate stress tensor. However, the situation is complicated by the fact that we must keep r finite while applying the limit, taking $r \to \infty$ afterwards. The stress tensor at finite r should be interpreted in a CFT with an ultraviolet cutoff [23]. This implies that the limits $\ell \to \infty$, $r \to \infty$ can be understood in renormalization group terms [29].

Acknowledgements. V.B. is supported by the Harvard Society of Fellows and NSF grants NSF-PHY-9802709 and NSF-PHY-9407194. P.K. is supported by NSF Grant No. PHY-9600697. We have had helpful discussions with Emil Martinec, Joe Polchinski, Jennie Traschen and, particularly, Gary Horowitz and Don Marolf.

References

1. Brown, J.D., York, J.W.: Quasilocal energy and conserved charges derived from the gravitational action. Phys. Rev. D47, 1407 (1993)
2. Maldacena, J.: The large N limit of superconformal field theories and supergravity. hep-th/9711200; Adv. Theor. Math. Phys. 2, 231 (1998)
3. Gubser, S.S., Klebanov, I.R., Polyakov, A.M.: Gauge theory correlators from noncritical string theory. hep-th/9802109; Phys. Lett. B428, 105 (1998)
4. Witten, E.: Anti-de Sitter space and holography. hep-th/9802150; Adv. Theor. Math. Phys. 2, 253 (1998)
5. Navarro-Salas, J., Navarro, P.: A note on Einstein gravity on AdS3 and boundary conformal field theory. hep-th/9807019; Phys. Lett. B439, 262 (1998)
6. Martinec, E.J.: Conformal field theory, geometry and entropy. hep-th/9809021
7. Horowitz, G.T., Itzhaki, N.: Black holes, shock waves, and causality in the AdS/CFT correspondence. hep-th/9901012; JHEP 02, 010 (1999)
8. Henningson, M., Skenderis, K.: The holographic Weyl anomaly. hep-th/9806087; JHEP 9807, 023 (1998)
9. Hyun, S., Kim, W.T., Lee, J.: Statistical entropy and AdS/CFT correspondence in BTZ black holes. hep-th/9811005; Phys. Rev. D59, 084020 (1999)
10. Chalmers, G., Schalm, K.: Holographic normal ordering and multiparticle states in the AdS/CFT correspondence. hep-th/9901144
11. Nojiri, S., Odintsov, S.: Conformal anomaly for dilaton coupled theories from AdS/CFT correspondence. hep-th/9810008; Phys. Lett. B444, 92 (1998)
12. Abbott, L.F., Deser, S.: Stability of gravity with a cosmological constant. Nucl. Phys. B195, 76 (1982)
13. Ashtekar, A., Magnon, A.: Asymptotically anti-de Sitter spacetimes. Class. Quant. Grav. 1, L39 (1984)
14. Henneaux, M., Teitelboim, C.: Asymptotically anti-de Sitter spaces. Commun. Math. Phys. 98, 391 (1985)
15. Brown, J.D., Creighton, J., Mann, R.B.: Temperature, energy and heat capacity of asymptotically anti-de Sitter black holes. hep-th/9405007; Phys. Rev. D50, 6394 (1994)
16. Horowitz, G.T., Hawking, S.W.: The gravitational Hamiltonian, action, entropy and surface terms. gr-qc/9501014; Class. Quant. Grav. 13, 1487 (1996)
17. Horowitz, G.T., Myers, R.C.: The AdS/CFT correspondence and a new positive energy conjecture for general relativity. hep-th/9808079; Phys. Rev. D59, 026005 (1999)
18. Brown, J.D., Henneaux, M.: Central charges in the canonical realization of asymptotic symmetries: An example from three-dimensional gravity. Commun. Math. Phys. 104, 207 (1986)
19. Wald, R.M.: General relativity. Chicago, IL: University of Chicago Press, 1984
20. Bañados, M., Teitelboim, C., Zanelli, J.: The black hole in three-dimensional space-time. hep-th/9204099; Phys. Rev. Lett. 69, 1849 (1992)
21. Bañados, M., Henneaux, M., Teitelboim, C., Zanelli, J.: Geometry of the 2 + 1 black hole. gr-qc/9302012; Phys. Rev. D48, 1506 (1993)
22. Bañados, M.: Global charges in Chern–Simons field theory and the (2 + 1) black hole. hep-th/9405171; Phys. Rev. D52, 5816 (1996)
23. Susskind, L., Witten, E.: The holographic bound in anti-de Sitter space. hep-th/9805114
24. Fefferman, C., Graham, C.R.: Conformal Invariants. In: Elie Cartan et les Mathématiques d’aujourd’hui, Astérisque, 1985, p. 95
25. Birrell, N.D., Davies, P.C.W.: Quantum fields in curved space. Cambridge: Cambridge University Press, 1982
26. Gubser, S.S., Klebanov, I.R., Peet, A.W.: Entropy and temperature of black 3-branes. hep-th/9602135; Phys. Rev. D54, 3915 (1996)
27. Kallosh, R., Rajaraman, A.: Vacua of M theory and string theory. hep-th/9805041; Phys. Rev. D58, 125003 (1998)
28. Balasubramanian, V., Kraus, P., Lawrence, A., Trivedi, S.P.: Holographic probes of anti-de Sitter spacetimes. hep-th/9808017; Phys. Rev. D59, 104021 (1999)
29. Balasubramanian, V., Kraus, P.: Space-time and the holographic renormalization group. hep-th/9903190; to appear in Phys. Rev. Lett.

Communicated by H. Nicolai

Commun. Math. Phys. 208, 429 – 487 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On α-Induction, Chiral Generators and Modular Invariants for Subfactors

Jens Böckenhauer1, David E. Evans1, Yasuyuki Kawahigashi2

1 School of Mathematics, University of Wales, Cardiff, PO Box 926, Senghennydd Road, Cardiff CF2 4YH, Wales, UK. E-mail: [email protected]; [email protected]
2 Department of Mathematical Sciences, University of Tokyo, Komaba, Tokyo, 153-8914, Japan. E-mail: [email protected]

Received: 13 April 1999 / Accepted: 13 July 1999

Abstract: We consider a type III subfactor N ⊂ M of finite index with a finite system of braided N-N morphisms which includes the irreducible constituents of the dual canonical endomorphism. We apply α-induction and, developing further some ideas of Ocneanu, we define chiral generators for the double triangle algebra. Using a new concept of intertwining braiding fusion relations, we show that the chiral generators can be naturally identified with the α-induced sectors. A matrix Z is defined and shown to commute with the S- and T-matrices arising from the braiding. If the braiding is nondegenerate, then Z is a “modular invariant mass matrix” in the usual sense of conformal field theory. We show that in that case the fusion rule algebra of the dual system of M-M morphisms is generated by the images of both kinds of α-induction, and that the structural information about its irreducible representations is encoded in the mass matrix Z. Our analysis sheds further light on the connection between (the classifications of) modular invariants and subfactors, and we will construct and analyze modular invariants from SU(n)k loop group subfactors in a forthcoming publication, including the treatment of all SU(2)k modular invariants.

Contents

1. Introduction
2. Preliminaries
   2.1 Morphisms and sectors
   2.2 Braided endomorphisms
3. Graphical Intertwiner Calculus
   3.1 Basic graphical intertwiner calculus
   3.2 Frobenius reciprocity and rotations
   3.3 α-Induction for braided subfactors
4. Double Triangle Algebras for Subfactors
5. α-Induction, Chiral Generators and Modular Invariants
   5.1 Relating α-induction to chiral generators
   5.2 Modular invariants for braided subfactors
   5.3 Generating property of α-induction
6. Representations of the M-M Fusion Rule Algebra
   6.1 Irreducible representations of the M-M fusion rules
   6.2 The left action on M-N sectors
7. Conclusions and Outlook

1. Introduction

It is a surprising fact that a series of at first sight unrelated phenomena in mathematics and physics are governed by the scheme of A-D-E Dynkin diagrams, such as simple Lie algebras, finite subgroups of SL(2; C), simple singularities of complex surfaces, quivers of finite type, modular invariant partition functions of SU(2) WZW models and subfactors of Jones index less than four. Though a good understanding of the interrelations has not yet been achieved, this coincidence indicates that there are deep connections between these different fields which even seem to go beyond the A-D-E governed cases, e.g. finite subgroups of SL(n; C), modular invariants of SU(n) WZW models, or (certain) SU(n)k subfactors of larger index. This paper is addressed to the relation between the (classifications of) modular invariants in conformal field theory and subfactors in operator algebras.

In rational (chiral) conformal field theory one deals with a chiral algebra which possesses a certain finite spectrum of representations (or superselection sectors) $\pi_\lambda$ acting on a Hilbert space $H_\lambda$. Its characters $\chi_\lambda(\tau) = \mathrm{tr}_{H_\lambda}(e^{2\pi i\tau(L_0 - c/24)})$, $\mathrm{Im}(\tau) > 0$, $L_0$ being the conformal Hamiltonian and c the central charge, transform unitarily under “reparametrization of the torus”, i.e. there are matrices S and T such that

$$\chi_\lambda(-1/\tau) = \sum_\mu S_{\lambda,\mu}\,\chi_\mu(\tau),\qquad \chi_\lambda(\tau + 1) = \sum_\mu T_{\lambda,\mu}\,\chi_\mu(\tau),$$

which are the generators of a unitary representation of the (double cover of the) modular group SL(2; Z) in which T is diagonal.¹ In order to classify conformal field theories, in particular extensions in a certain sense of a given theory, one searches for modular invariant partition functions $Z(\tau) = Z(-1/\tau) = Z(\tau + 1)$ of the form

$$Z(\tau) = \sum_{\lambda,\mu} Z_{\lambda,\mu}\,\chi_\lambda(\tau)\chi_\mu(\tau)^*,$$

where

$$Z_{\lambda,\mu} = 0, 1, 2, \ldots,\qquad Z_{0,0} = 1. \tag{1}$$

Here the label “0” refers to the “vacuum” representation, and the condition $Z_{0,0} = 1$ reflects the physical concept of uniqueness of the vacuum state. The matrix Z arising this way is called a modular invariant mass matrix. Mathematically speaking, the problem can be rephrased like this: Find all the matrices Z in the commutant of the unitary representation of SL(2; Z) defined by S and T subject to the conditions in Eq. (1). In this paper we study this mathematical problem in the subfactor context. We start with a von Neumann algebra, more precisely a factor N endowed with a system of braided endomorphisms. Such a braiding defines matrices S and T which provide a unitary representation of SL(2; Z) if it is non-degenerate. We then study embeddings N ⊂ M in larger factors M which are in a certain sense compatible with the braided system of endomorphisms. We show that such an embedding N ⊂ M determines a modular invariant mass matrix in exactly the sense specified above.

¹ More precisely, for current algebras the characters depend also on other variables than τ, corresponding to Cartan subalgebra generators which are omitted here for simplicity. But these variables are responsible that one is in general dealing with the whole group SL(2; Z) rather than PSL(2; Z).

Longo and Rehren have studied nets of subfactors and defined a useful formula to extend a localized transportable endomorphism of the smaller to the larger observable algebra, realizing a suggestion in [43]. Xu [47,48] has worked on essentially the same construction applied to subfactors arising from conformal inclusions with the loop group construction of A. Wassermann [45]. Two of us systematically analyzed the Longo–Rehren extension for nets of subfactors on $S^1$ [2,4]. As sectors, a reciprocity between extension and restriction of localized transportable endomorphisms was established, analogous to the induction-restriction machinery of group representations, and therefore the extension was called α-induction in order to avoid confusion with the different sector induction. It was also noticed in [2] that the extended endomorphisms leave local algebras invariant and hence α-induction can also be considered as a map which takes certain endomorphisms of a local subfactor to endomorphisms of the embedding factor. This theory was applied to nets arising from conformal field theory models in [3,4], and it was shown that for all type I modular invariants of SU(2) respectively SU(3) there are associated nets of subfactors and in turn α-induction gives rise to fusion graphs.
In fact it was shown that these graphs are the A-D-E Dynkin diagrams respectively their generalizations of [7,8], and this is no accident: The homomorphism property of α-induction relates the spectrum of the fusion graphs to the non-zero diagonal entries of the modular invariant mass matrix. A few months after the work of Longo–Rehren, Ocneanu presented his theory of “quantum symmetries” of Coxeter graphs and gave lectures [39] one year later. He introduced a notion of a “double triangle algebra” and defined elements $p_j^\pm$ which we refer to as “chiral generators” as they were not specifically named there. Ocneanu’s analysis has much in common with work of Xu [47] and two of us [3,4] about subfactors of type $E_6$, $E_8$ and $D_{\mathrm{even}}$. The reason for this is that the same structures are studied from different viewpoints, as we will outline in this paper. We start with a fairly general setting which admits both constructions, α-induction as well as Ocneanu’s double triangle algebras and chiral generators. Namely, we consider a type III subfactor N ⊂ M of finite index with a finite system of N-N morphisms which includes the irreducible constituents of the dual canonical endomorphism. (A “system of morphisms” means essentially that, as sectors, the morphisms form a closed algebra under the sector “fusion” product, see Definition 2.1 below.) Therefore the subfactor is in particular forced to have finite depth. The inclusion structure associates to the N-N system automatically N-M, M-N and M-M systems. The typical situation is that the system of M-M morphisms is the “unknown part” of the theory. As an easy reformulation of Ocneanu’s idea from his work on Goodman–de la Harpe–Jones subfactors associated with Dynkin diagrams one can define the double triangle algebra for such a setting, and it provides a powerful tool to gain information about the “unknown part” from the “known part” of the theory.
Namely, the double triangle algebra is a direct sum of intertwiner spaces equipped with two different product structures, and its center Zh with respect to the “horizontal product” turns out to be isomorphic to the (in general non-commutative) fusion rule algebra of the M-M system when endowed with the “vertical product”. This


kind of duality is the subfactor analogue to the group algebra with its pointwise and convolution products. Under the assumption that the N -N system is braided there is automatically the notion of α-induction, which extends N -N to (possibly reducible) M-M morphisms. (This notion does not even depend on the finite depth condition.) The braiding provides powerful tools to analyze the structure of the center Zh at the same time, and the analysis is most conveniently carried out with a graphical intertwiner calculus which will be explained in detail in this paper. Besides the standard “braiding fusion symmetries” for wire diagrams representing intertwiners of the braided N-N morphisms, we show that the theory of α-induction gives rise naturally to an extended symmetry which we call “intertwining braiding fusion relations”. This reduces all graphical manipulations representing the relations between intertwiners to easily visible purely topological moves, and it allows us to work without the “sliding moves along walls” involving “quantum 6j -symbols for subfactors” which are the main technical tool in [39]. With a braiding on the N-N system we can define chiral generators pλ± in the center Zh , and our notion essentially coincides with Ocneanu’s definition of elements pj± given graphically in his A-D-E setup. We show that the decomposition of the pλ± ’s into minimal central projections in Zh corresponds exactly to the sector decomposition of the α-induced sectors [αλ± ], and therefore they can be naturally identified. As shown by Rehren [40], a system of braided endomorphisms gives rise to S- and T-matrices which provide a unitary representation of the modular group SL(2; Z) whenever the braiding is non-degenerate. (Relations between modular S- and T-matrices and braiding data are also discussed in [35,14,13].) 
In terms of α-induction we define a matrix Z with entries $Z_{\lambda,\mu} = \langle \alpha^+_\lambda, \alpha^-_\mu \rangle$ for N-N morphisms λ, µ, where the brackets denote the dimension of the intertwiner space $\mathrm{Hom}(\alpha^+_\lambda, \alpha^-_\mu)$. As it corresponds to the “vacuum” in physical applications, we use the label “0” for the identity morphism $\mathrm{id}_N$, and hence our matrix Z satisfies the conditions in Eq. (1), where now $Z_{0,0} = 1$ is just the factor property of M. We show that Z commutes with S and T and therefore Z is a “modular invariant mass matrix” in the sense of conformal field theory if the braiding is non-degenerate. In fact, the non-degenerate case is the most interesting one, as in the $SU(n)_k$ examples in conformal field theory. We apply an argument of Ocneanu to our situation to show that in that case, due to the identification with chiral generators, both kinds of α-induction together generate the whole M-M fusion rule algebra. Moreover, the essential information about its representation theory (or equivalently, about the decomposition of the center $\mathcal{Z}_h$ with the vertical product into simple matrix algebras) is then encoded in the mass matrix Z: We show that the irreducible representations of the M-M fusion rule algebra are labelled by pairs λ, µ with $Z_{\lambda,\mu} \neq 0$, and that their dimensions are given exactly by the number $Z_{\lambda,\mu}$. Consequently, the M-M fusion rules are then commutative if and only if all $Z_{\lambda,\mu} \in \{0, 1\}$. An analogous result has been claimed by Ocneanu for his A-D-E setting related to the modular invariant mass matrices of the SU(2) WZW models of [6,23]. He has his own geometric construction of modular invariants sketched in the lectures but not included in the lecture notes [39]. Our construction is different and based on the results of [4], and it shows that the structural results do not depend on the very special properties of Dynkin diagrams and hold in a far more general context. We also analyze the representation of the M-M fusion rule algebra arising from its left action on M-N sectors.

As corollaries of our analysis we find that the number of N-M (or M-N) morphisms is given by the trace tr(Z), whereas the number of M-M morphisms is given by $\mathrm{tr}(Z\,{}^{t}\!Z)$.
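To make the counting statements above concrete, here is a minimal numerical sketch for one well-known example, the $D_4$ modular invariant $|\chi_0 + \chi_4|^2 + 2|\chi_2|^2$ of SU(2) at level 4. The explicit S- and T-matrices used (the standard SU(2)$_k$ formulas) and all function names are assumptions brought in for illustration; they are not taken from this paper. The script only checks that Z commutes with S and T and evaluates the two traces; their interpretation as numbers of morphisms is the claim made in the text.

```python
import cmath
import math

def su2_S(k):
    """Standard SU(2)_k modular S-matrix (assumed formula); labels 0..k."""
    n = k + 2
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def su2_T_diagonal(k):
    """Diagonal of the standard T-matrix, exp(2*pi*i*(h_a - c/24)) (assumed formula)."""
    n = k + 2
    c = 3.0 * k / n
    return [cmath.exp(2j * math.pi * (a * (a + 2) / (4.0 * n) - c / 24.0))
            for a in range(k + 1)]

def matmul(X, Y):
    """Plain matrix product of two square lists of lists."""
    size = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(size)) for j in range(size)]
            for i in range(size)]

def max_diff(X, Y):
    """Largest entrywise deviation between two matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(len(X)) for j in range(len(X)))

def d4_mass_matrix():
    """Mass matrix of |chi_0 + chi_4|^2 + 2|chi_2|^2 at level k = 4."""
    Z = [[0] * 5 for _ in range(5)]
    for a in (0, 4):
        for b in (0, 4):
            Z[a][b] = 1
    Z[2][2] = 2
    return Z

Z = d4_mass_matrix()
S = su2_S(4)
t = su2_T_diagonal(4)
T = [[t[a] if a == b else 0 for b in range(5)] for a in range(5)]

zs_defect = max_diff(matmul(Z, S), matmul(S, Z))   # size of [Z, S]
zt_defect = max_diff(matmul(Z, T), matmul(T, Z))   # size of [Z, T]
trace_Z = sum(Z[a][a] for a in range(5))
trace_ZZt = sum(Z[a][b] ** 2 for a in range(5) for b in range(5))

print(zs_defect < 1e-12, zt_defect < 1e-12, trace_Z, trace_ZZt)
```

In this example tr(Z) = 4 and tr(Z ᵗZ) = 8, and the entry $Z_{2,2} = 2$ exceeds 1, so by the criterion stated above the corresponding M-M fusion rules would not be commutative.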


In a forthcoming publication we will further analyze and apply our construction to subfactors constructed by means of the level k positive energy representations of the SU(n) loop group theory. For these examples, the braiding is always non-degenerate and, moreover, the S- and T-matrices are the modular matrices performing the character transformations of the corresponding SU(n)k WZW theory. Therefore the construction of braided subfactors2 for these models yields non-diagonal modular invariants Z. E.g. for SU(2)k one can construct the subfactors in terms of local loop groups which recover the A-D-E modular invariants of [6,23]. In our setting also the “type II” or “non-blockdiagonal” invariants can be treated by dropping the chiral locality condition. (The chiral locality condition, expressing local commutativity of the extended chiral theory in the formulation of nets of subfactors [33], implies “ασ -reciprocity” [2] which in turn forces the modular invariant to be of type I. Detailed explanation and non-local examples will be provided in [5].) Thus this paper extends the known results on conformal inclusions [47,48,3,4] and simple current extensions [3,4] of SU(n)k , and it generally relates (the classification of) modular invariants to (non-degenerately) braided subfactors. Furthermore our results prove two conjectures by two of us [4, Conj. 7.1 & 7.2]. This paper is organized as follows. In Sect. 2 we review some basic facts about morphisms, intertwiners, sectors and braidings, and we reformulate Rehren’s result about S- and T-matrices arising from superselection sectors in our context of braided factors. In Sect. 3 we establish the graphical methods for the intertwiner calculus we use in this paper. The abstract mathematical structure underlying the basic graphical calculus (Subsect. 3.1) is “strict monoidal C ∗ -categories” [9]. Graphical methods for calculations involving fusion and braiding have been used in various publications, see e.g. 
[34,28, 46,15,14,24,22]. However, for our purposes it turns out to be extremely important to handle normalization factors with special care, and to the best of our knowledge, a comprehensive exposition which applies to our framework has not been published somewhere. So we work out a “rotation covariant” intertwiner calculus here, based on a formulation of Frobenius reciprocity by Izumi [19]. We then define α-induction for braided subfactors and use it to extend our graphical calculus conveniently. In Sect. 4 we present the double triangle algebra and analyze its properties. In Sect. 5 we present our version of Ocneanu’s graphical notion of chiral generators, and we show that it can be naturally identified with the α-induced sectors. We then define the “mass matrix” Z and show that it commutes with the S- and T-matrices of the N -N system. Assuming now that the braiding is non-degenerate, we show that the M-M fusion rule algebra is generated by the images of the two kinds (+ and −) of α-induction. In Sect. 6 we decompose Zh with the vertical product into simple matrix algebras which is equivalent to the determination of all the irreducible representations of the M-M fusion rule algebra, and we show that their dimensions are given by the entries of the modular invariant mass matrix. Then we analyze the representation arising from the left action on M-N sectors. In Sect. 7 we finally conclude this paper with general remarks and comments and an outlook to the applications to subfactors arising from conformal field theory which will be treated in [5].

2 We remark that our short-hand notion of a “braided subfactor” meaning a subfactor for which Assumptions 4.1 and 5.1 below hold is different from the notion used in [31].


2. Preliminaries

2.1. Morphisms and sectors. For our purposes it turns out to be convenient to make use of the formulation of sectors between different factors. We follow here (up to minor notational changes) Izumi’s presentation [19,20] based on Longo’s sector theory [30]. Let A, B be infinite factors. We denote by Mor(A, B) the set of unital ∗-homomorphisms from A to B. We also denote End(A) = Mor(A, A), the set of unital ∗-endomorphisms. For ρ ∈ Mor(A, B) we define the statistical dimension $d_\rho = [B : \rho(A)]^{1/2}$, where [B : ρ(A)] is the minimal index [21,29]. A morphism ρ ∈ Mor(A, B) is called irreducible if the subfactor ρ(A) ⊂ B is irreducible, i.e. if the relative commutant $\rho(A)' \cap B$ consists only of scalar multiples of the identity in B. Two morphisms $\rho, \rho' \in \mathrm{Mor}(A, B)$ are called equivalent if there exists a unitary u ∈ B such that $\rho'(a) = u\rho(a)u^*$ for all a ∈ A. We denote by Sect(A, B) the quotient of Mor(A, B) by unitary equivalence, and we call its elements B-A sectors. Similar to the case A = B, Sect(A, B) has the natural operations, sums and products: For $\rho_1, \rho_2 \in \mathrm{Mor}(A, B)$ choose generators $t_1, t_2 \in B$ of a Cuntz algebra $\mathcal{O}_2$, i.e. such that $t_i^* t_j = \delta_{i,j} 1$ and $t_1 t_1^* + t_2 t_2^* = 1$. Define ρ ∈ Mor(A, B) by putting $\rho(a) = t_1 \rho_1(a) t_1^* + t_2 \rho_2(a) t_2^*$ for all a ∈ A, and then the sum of sectors is defined as $[\rho_1] \oplus [\rho_2] = [\rho]$. The product of sectors comes from the composition of endomorphisms, $[\rho_1][\rho_2] = [\rho_1 \circ \rho_2]$. We often omit the composition symbol “◦”, so $[\rho_1][\rho_2] = [\rho_1\rho_2]$. The statistical dimension is an invariant for sectors (i.e. equivalent morphisms have equal dimension) and is additive and multiplicative with respect to these operations. Moreover, for [ρ] ∈ Sect(A, B) there is a unique conjugate sector $\overline{[\rho]} \in \mathrm{Sect}(B, A)$ such that, if [ρ] is irreducible, $\overline{[\rho]}$ is irreducible as well and $\overline{[\rho]} \times [\rho]$ contains the identity sector $[\mathrm{id}_A]$ and $[\rho] \times \overline{[\rho]}$ contains $[\mathrm{id}_B]$ precisely once. We choose a representative endomorphism of $\overline{[\rho]}$ and denote it naturally by $\bar\rho$, thus $[\bar\rho] = \overline{[\rho]}$. For conjugates we have $d_{\bar\rho} = d_\rho$. As for bimodules one may decorate B-A sectors [ρ] with suffixes, ${}_B[\rho]_A$, and then we can multiply ${}_B[\rho]_A \times {}_A[\sigma]_B$ but not, for instance, ${}_B[\rho]_A$ with itself. For ρ, τ ∈ Mor(A, B) we denote

$$\mathrm{Hom}(\rho, \tau) = \{ t \in B : t\,\rho(a) = \tau(a)\,t, \ a \in A \}$$

and $\langle \rho, \tau \rangle = \dim \mathrm{Hom}(\rho, \tau)$. If $[\rho] = [\rho_1] \oplus [\rho_2]$ then $\langle \rho, \tau \rangle = \langle \rho_1, \tau \rangle + \langle \rho_2, \tau \rangle$. Note that if ρ is irreducible then for $t, t' \in \mathrm{Hom}(\rho, \tau)$ it follows that $t^* t'$ is a scalar and then putting

$$t^* t' = \langle t, t' \rangle 1_B \tag{2}$$

defines an inner product on Hom(ρ, τ). One often calls Hom(ρ, τ) a “Hilbert space of isometries” in this case.

If ρ ∈ Mor(A, B) with $d_\rho < \infty$ then $\bar\rho \in \mathrm{Mor}(B, A)$ is a conjugate if there are isometries $r_\rho \in \mathrm{Hom}(\mathrm{id}_A, \bar\rho\rho)$ and $\bar r_\rho \in \mathrm{Hom}(\mathrm{id}_B, \rho\bar\rho)$ such that

$$\rho(r_\rho)^* \bar r_\rho = d_\rho^{-1} 1_B \quad\text{and}\quad \bar\rho(\bar r_\rho)^* r_\rho = d_\rho^{-1} 1_A,$$

and in the case that ρ is irreducible such isometries $r_\rho$ and $\bar r_\rho$ are unique up to a common phase. If C is another factor and σ ∈ Mor(C, A) and τ ∈ Mor(C, B) are morphisms with finite statistical dimensions $d_\sigma, d_\tau < \infty$, and conjugate morphisms $\bar\sigma \in \mathrm{Mor}(A, C)$, $\bar\tau \in \mathrm{Mor}(B, C)$, respectively, then the “left and right Frobenius reciprocity maps”,

$$\mathcal{L}_\rho : \mathrm{Hom}(\tau, \rho\sigma) \to \mathrm{Hom}(\sigma, \bar\rho\tau), \qquad t \mapsto \sqrt{\frac{d_\rho d_\sigma}{d_\tau}}\; \bar\rho(t)^* r_\rho,$$

$$\mathcal{R}_\rho : \mathrm{Hom}(\bar\sigma, \bar\tau\rho) \to \mathrm{Hom}(\bar\tau, \bar\sigma\bar\rho), \qquad s \mapsto \sqrt{\frac{d_\rho d_\tau}{d_\sigma}}\; s^* \bar\tau(\bar r_\rho),$$

are anti-linear (vector space) isomorphisms with inverses

$$\mathcal{L}_\rho^{-1} : \mathrm{Hom}(\sigma, \bar\rho\tau) \to \mathrm{Hom}(\tau, \rho\sigma), \qquad x \mapsto \sqrt{\frac{d_\rho d_\tau}{d_\sigma}}\; \rho(x)^* \bar r_\rho,$$

$$\mathcal{R}_\rho^{-1} : \mathrm{Hom}(\bar\tau, \bar\sigma\bar\rho) \to \mathrm{Hom}(\bar\sigma, \bar\tau\rho), \qquad y \mapsto \sqrt{\frac{d_\rho d_\sigma}{d_\tau}}\; y^* \bar\sigma(r_\rho),$$

respectively [19]. (See also [14, Sect. 5] and [13, App. A] for such formulae arising from superselection sectors.) Hence we have in particular Frobenius reciprocity [19,32],

$$\langle \tau, \rho\sigma \rangle = \langle \bar\rho\tau, \sigma \rangle = \langle \bar\rho, \sigma\bar\tau \rangle = \langle \bar\sigma\bar\rho, \bar\tau \rangle = \langle \bar\sigma, \bar\tau\rho \rangle = \langle \tau\bar\sigma, \rho \rangle.$$

If τ and σ are irreducible then the Frobenius reciprocity maps are even (anti-linearly) isometric: With the inner products as in Eq. (2) on the above intertwiner spaces we have $\langle t, t' \rangle = \langle \mathcal{L}_\rho(t'), \mathcal{L}_\rho(t) \rangle$ for $t, t' \in \mathrm{Hom}(\tau, \rho\sigma)$ and similarly $\langle s, s' \rangle = \langle \mathcal{R}_\rho(s'), \mathcal{R}_\rho(s) \rangle$ for $s, s' \in \mathrm{Hom}(\bar\sigma, \bar\tau\rho)$. The map $\phi_\rho : B \to A$ defined by

$$\phi_\rho(b) = r_\rho^*\, \bar\rho(b)\, r_\rho, \qquad b \in B,$$

is completely positive, normal, unital $\phi_\rho(1_B) = 1_A$ and satisfies

$$\phi_\rho(\rho(a_1)\, b\, \rho(a_2)) = a_1\, \phi_\rho(b)\, a_2, \qquad a_1, a_2 \in A, \ b \in B.$$

The map is called the (unique) standard left inverse. The minimal conditional expectation for the subfactor ρ(A) ⊂ B is given by $E_\rho = \rho \circ \phi_\rho$. Let now ρ, σ, τ as above be irreducible with standard left inverses $\phi_\rho, \phi_\sigma, \phi_\tau$, respectively, and let t ∈ Hom(τ, ρσ) be non-zero. Then $\phi_\rho(tt^*) \in \mathrm{Hom}(\sigma, \sigma)$ is a positive scalar and $\tilde E_\tau : B \to \tau(C)$ given by

$$\rho \circ \phi_\rho(tt^*)\, \tilde E_\tau(b) = \tau \circ \phi_\sigma \circ \phi_\rho(t b t^*)$$

for all b ∈ B is a conditional expectation for the subfactor τ(C) ⊂ B. Since conditional expectations for irreducible subfactors are unique we conclude that

$$\phi_\tau(b)\, E_\rho(tt^*) = \phi_\sigma \circ \phi_\rho(t b t^*), \qquad b \in B,$$

holds for any t ∈ Hom(τ, ρσ). Moreover, $t^* t'$ is a scalar for any $t, t' \in \mathrm{Hom}(\tau, \rho\sigma)$, $t^* t' = \langle t, t' \rangle 1_B$, and so is $\mathcal{L}_\rho(t)^* \mathcal{L}_\rho(t')$, in fact

$$\langle t, t' \rangle 1_A = \langle \mathcal{L}_\rho(t'), \mathcal{L}_\rho(t) \rangle 1_A \equiv \mathcal{L}_\rho(t')^* \mathcal{L}_\rho(t) = \frac{d_\rho d_\sigma}{d_\tau}\, r_\rho^*\, \bar\rho(t' t^*)\, r_\rho,$$

and this is

$$\phi_\rho(t' t^*) = \frac{d_\tau}{d_\rho d_\sigma}\, \langle t, t' \rangle 1_A. \tag{3}$$


Now let N ⊂ M be an infinite subfactor of finite index. Let γ ∈ End(M) be a canonical endomorphism from M into N and $\theta = \gamma|_N \in \mathrm{End}(N)$. By ι ∈ Mor(N, M) we denote the injection map, ι(n) = n ∈ M, n ∈ N. Then $d_\iota = [M : N]^{1/2}$, and a conjugate $\bar\iota \in \mathrm{Mor}(M, N)$ is given by $\bar\iota(m) = \gamma(m) \in N$, m ∈ M. (These formulae could in fact be used to define the canonical and dual canonical endomorphism.) Note that $\gamma = \iota\bar\iota$ and $\theta = \bar\iota\iota$, and there are isometries $w \equiv r_\iota \in \mathrm{Hom}(\mathrm{id}_N, \theta)$ and $v \equiv \bar r_\iota \in \mathrm{Hom}(\mathrm{id}_M, \gamma)$ such that $w^* v = \gamma(v^*) w = [M : N]^{-1/2} 1$. Moreover, we have the pointwise equality M = Nv, and for each m ∈ M the decomposition m = nv yields a unique element n ∈ N. Explicitly, $n = [M : N]^{1/2}\, w^* \gamma(m)$.

Now let us consider a single factor A and its sectors. For a set of irreducible sectors which is closed under conjugation and irreducible decomposition of products (a “sector basis” in the notation of [2–4] in the case that the set is finite) it is often useful to choose one representative endomorphism for each sector.

Definition 2.1. We call a subset Δ ⊂ End(A) a system of endomorphisms if it satisfies the following properties.
1. Each λ ∈ Δ is irreducible and has finite statistical dimension.
2. Different elements in Δ are inequivalent, i.e. different as sectors.
3. $\mathrm{id}_A \in \Delta$.
4. For any λ ∈ Δ, we have a morphism $\bar\lambda \in \Delta$ such that $[\bar\lambda]$ is the conjugate sector of [λ].
5. Δ is closed under composition and subsequent irreducible decomposition, i.e. for any λ, µ ∈ Δ we have non-negative integers $N_{\lambda,\mu}^{\nu}$ with $[\lambda][\mu] = \sum_{\nu \in \Delta} N_{\lambda,\mu}^{\nu} [\nu]$ as sectors.

Note that we do not assume finiteness of Δ in this definition. The numbers $N_{\lambda,\mu}^{\nu} = \langle \lambda\mu, \nu \rangle$ are called fusion coefficients. Frobenius reciprocity now reads $N_{\lambda,\mu}^{\nu} = N_{\bar\lambda,\nu}^{\mu} = N_{\nu,\bar\mu}^{\lambda}$, and associativity of the sector product yields

$$\sum_{\mu \in \Delta} N_{\lambda,\mu}^{\nu} N_{\rho,\sigma}^{\mu} = \sum_{\tau \in \Delta} N_{\lambda,\rho}^{\tau} N_{\tau,\sigma}^{\nu}.$$

The additivity and multiplicativity of the statistical dimension with respect to sector sums and products implies $\sum_{\nu \in \Delta} N_{\lambda,\mu}^{\nu} d_\nu = d_\lambda d_\mu$, λ, µ ∈ Δ. Defining matrices $N_\mu$ with entries $(N_\mu)_{\lambda,\nu} = N_{\lambda,\mu}^{\nu}$ gives $N_{\bar\mu}$ as the transpose of $N_\mu$ and defines the “regular representation” of the sector products, $N_\lambda N_\mu = \sum_{\nu \in \Delta} N_{\lambda,\mu}^{\nu} N_\nu$, and the statistical dimension can be regarded as a one-dimensional representation or as a simultaneous eigenvector of all matrices $N_\mu$ with eigenvalues $d_\mu$ (λ, µ, ν ∈ Δ).

2.2. Braided endomorphisms. Let A again be an infinite factor and Δ a system of endomorphisms of A. In general the sector products are not commutative. If the sectors commute, then a “systematic choice of unitary intertwiners” in each space Hom(λµ, µλ), λ, µ ∈ Δ, is called a braiding (which need not exist in general). To be more precise, we give the following:

Definition 2.2. We say that a system Δ of endomorphisms is braided if for any pair λ, µ ∈ Δ there is a unitary operator ε(λ, µ) ∈ Hom(λµ, µλ) subject to initial conditions

$$\varepsilon(\mathrm{id}_A, \mu) = \varepsilon(\lambda, \mathrm{id}_A) = 1, \tag{4}$$


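and whenever t ∈ Hom(λ, µν) we have the braiding fusion equations (BFE’s)

$$\begin{aligned}
\rho(t)\, \varepsilon(\lambda, \rho) &= \varepsilon(\mu, \rho)\, \mu(\varepsilon(\nu, \rho))\, t, \\
t\, \varepsilon(\rho, \lambda) &= \mu(\varepsilon(\rho, \nu))\, \varepsilon(\rho, \mu)\, \rho(t), \\
\rho(t)^*\, \varepsilon(\mu, \rho)\, \mu(\varepsilon(\nu, \rho)) &= \varepsilon(\lambda, \rho)\, t^*, \\
t^*\, \mu(\varepsilon(\rho, \nu))\, \varepsilon(\rho, \mu) &= \varepsilon(\rho, \lambda)\, \rho(t)^*,
\end{aligned} \tag{5}$$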

for any λ, µ, ν ∈ Δ. The unitaries ε(λ, µ) are called braiding operators (or statistics operators). Note that a braiding $\varepsilon \equiv \varepsilon^+$ always comes along with another “opposite” braiding $\varepsilon^-$, namely operators $\varepsilon^-(\lambda, \mu) = (\varepsilon^+(\mu, \lambda))^*$, $\varepsilon^+(\mu, \lambda) \equiv \varepsilon(\mu, \lambda)$, satisfy the same relations. The unitaries $\varepsilon^+(\lambda, \mu)$ and $\varepsilon^-(\lambda, \mu)$ are different in general but may coincide for some λ, µ. Later we will also use the following notion of non-degeneracy of a braiding (cf. [40]).

Definition 2.3. We say that a braiding ε on a system of endomorphisms Δ is non-degenerate, if the following condition is satisfied: If some morphism λ ∈ Δ satisfies $\varepsilon^+(\lambda, \mu) = \varepsilon^-(\lambda, \mu)$ for all morphisms µ ∈ Δ, then we have $\lambda = \mathrm{id}_A$.

We may also extend a given braiding from Δ in a well defined manner to all equivalent and sum endomorphisms as follows. We denote by Σ(Δ) the set of all endomorphisms λ, ρ ∈ End(A) given as $\lambda(a) = \sum_{i=1}^{n} t_i \lambda_i(a) t_i^*$ and $\rho(a) = \sum_{j=1}^{m} s_j \rho_j(a) s_j^*$ for all a ∈ A, where $t_i \in A$, i = 1, 2, . . . , n, and $s_j \in A$, j = 1, 2, . . . , m, are Cuntz algebra generators, i.e. $t_i^* t_k = \delta_{i,k} 1$ and $\sum_{i=1}^{n} t_i t_i^* = 1$, and similarly $s_j^* s_l = \delta_{j,l} 1$ and $\sum_{j=1}^{m} s_j s_j^* = 1$, and $\lambda_i, \rho_j \in \Delta$. (Here n, m ≥ 1.) For λ, ρ as above we put

$$\varepsilon(\lambda, \rho) = \sum_{i=1}^{n} \sum_{j=1}^{m} s_j\, \rho_j(t_i)\, \varepsilon(\lambda_i, \rho_j)\, \lambda_i(s_j^*)\, t_i^*, \tag{6}$$

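and one can check that this definition is independent of the ambiguities in the choice of isometries $t_i \in \mathrm{Hom}(\lambda_i, \lambda)$ and $s_j \in \mathrm{Hom}(\rho_j, \rho)$. Note that in the case n = m = 1 this reads

$$\varepsilon(\mathrm{Ad}(u) \circ \lambda, \mathrm{Ad}(q) \circ \rho) = q\, \rho(u)\, \varepsilon(\lambda, \rho)\, \lambda(q^*)\, u^* \tag{7}$$

with some unitaries u, q ∈ A. Then for any sum endomorphisms λ, µ, ρ ∈ Σ(Δ) the BFE’s (5) hold as well or, alternatively, we have the naturality equations

$$\rho(t)\, \varepsilon(\lambda, \rho) = \varepsilon(\mu, \rho)\, t, \qquad t\, \varepsilon(\rho, \lambda) = \varepsilon(\rho, \mu)\, \rho(t) \tag{8}$$

whenever t ∈ Hom(λ, µ). Using decompositions of products λµ, λ, µ ∈ Σ(Δ) one can then easily show by use of the BFE’s that

$$\varepsilon(\lambda\mu, \rho) = \varepsilon(\lambda, \rho)\, \lambda(\varepsilon(\mu, \rho)), \qquad \varepsilon(\lambda, \mu\rho) = \mu(\varepsilon(\lambda, \rho))\, \varepsilon(\lambda, \mu). \tag{9}$$

By plugging this in Eq. (8) we find that BFE’s hold for endomorphisms in Σ(Δ) as well and Eq. (8) yields for ε(λ, µ) ∈ Hom(λµ, µλ) the braid relation (or “Yang–Baxter equation”)

$$\rho(\varepsilon(\lambda, \mu))\, \varepsilon(\lambda, \rho)\, \lambda(\varepsilon(\mu, \rho)) = \varepsilon(\mu, \rho)\, \mu(\varepsilon(\lambda, \rho))\, \varepsilon(\lambda, \mu). \tag{10}$$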


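Now let Δ be a braided system of endomorphisms and let $\rho, \bar\rho \in \Delta$ be conjugate morphisms. Denote by $r \equiv r_\rho \in \mathrm{Hom}(\mathrm{id}_A, \bar\rho\rho)$ and $\bar r \equiv \bar r_\rho \in \mathrm{Hom}(\mathrm{id}_A, \rho\bar\rho)$ isometries such that

$$\rho(r)^* \bar r = \bar\rho(\bar r)^* r = d_\rho^{-1} 1,$$

which are then unique up to a common phase.³ Note that $\varepsilon(\bar\rho, \rho)^* \bar r \in \mathrm{Hom}(\mathrm{id}_A, \bar\rho\rho)$ is an isometry and hence $\varepsilon(\bar\rho, \rho)^* \bar r = \omega_\rho r$ for some phase $\omega_\rho \in \mathbb{T}$ which is called the statistics phase and is obviously independent of the common phase of r and $\bar r$. In fact $\omega_\rho$ is even independent of the choice of ρ and $\bar\rho$ within their sectors: If $\rho' = \mathrm{Ad}\, u \circ \rho$ and $\bar\rho' = \mathrm{Ad}\, \bar u \circ \bar\rho$ for some unitaries $u, \bar u \in A$, then it is easy to see that also isometries $r' = \bar u\, \bar\rho(u)\, r \in \mathrm{Hom}(\mathrm{id}_A, \bar\rho'\rho')$ and $\bar r' = u\, \rho(\bar u)\, \bar r \in \mathrm{Hom}(\mathrm{id}_A, \rho'\bar\rho')$ fulfill $\rho'(r')^* \bar r' = \bar\rho'(\bar r')^* r' = d_\rho^{-1} 1$. Now the braiding operator transforms as $\varepsilon(\bar\rho', \rho') = u\, \rho(\bar u)\, \varepsilon(\bar\rho, \rho)\, \bar\rho(u)^* \bar u^*$ and hence

$$\varepsilon(\bar\rho', \rho')^* \bar r' = \bar u\, \bar\rho(u)\, \varepsilon(\bar\rho, \rho)^* \bar r = \omega_\rho r'.$$

The statistics phase can also be obtained by

$$\phi_\rho(\varepsilon(\rho, \rho)) = r^*\, \bar\rho(\varepsilon(\rho, \rho))\, r = \omega_\rho d_\rho^{-1} 1.$$

(The number $\omega_\rho d_\rho^{-1}$ is usually called the statistics parameter.) This is obtained from the initial condition and the BFE:

$$\rho(r) = \rho(r)\, \varepsilon(\mathrm{id}_A, \rho) = \varepsilon(\bar\rho, \rho)\, \bar\rho(\varepsilon(\rho, \rho))\, r,$$

but since $r^* \varepsilon(\bar\rho, \rho)^* = \omega_\rho \bar r^*$ we obtain

$$r^*\, \bar\rho(\varepsilon(\rho, \rho))\, r = r^*\, \varepsilon(\bar\rho, \rho)^* \rho(r) = \omega_\rho\, \bar r^* \rho(r) = \omega_\rho d_\rho^{-1} 1.$$

Moreover we have $\omega_\rho = \omega_{\bar\rho}$. This can be seen as follows. We have

$$r = r\, \varepsilon(\rho, \mathrm{id}_A) = \bar\rho(\varepsilon(\rho, \rho))\, \varepsilon(\rho, \bar\rho)\, \rho(r),$$

hence $r^* \bar\rho(\varepsilon(\rho, \rho)) = \rho(r)^* \varepsilon(\rho, \bar\rho)^*$, thus

$$\omega_\rho d_\rho^{-1} 1 = \rho(r)^*\, \varepsilon(\rho, \bar\rho)^*\, r = \omega_{\bar\rho}\, \rho(r)^* \bar r = \omega_{\bar\rho} d_\rho^{-1},$$

since $\varepsilon(\rho, \bar\rho)^* r = \omega_{\bar\rho} \bar r$ by definition. Therefore we have $\omega_\rho r^* = \bar r^* \varepsilon(\rho, \bar\rho)^*$. Another application of the BFE yields $\varepsilon(\rho, \rho)\, \rho(\bar r) = \rho(\varepsilon(\rho, \bar\rho))^* \bar r$, hence we have

$$\rho(\bar r)^*\, \varepsilon(\rho, \rho)\, \rho(\bar r) = \rho(\bar r)^*\, \rho(\varepsilon(\rho, \bar\rho))^*\, \bar r = \omega_\rho\, \rho(r)^* \bar r = \omega_\rho d_\rho^{-1} 1.$$

3 If ρ is not self-conjugate then we may choose $r_{\bar\rho} = \bar r_\rho$ and $\bar r_{\bar\rho} = r_\rho$. However, if ρ is self-conjugate, $\rho = \bar\rho$, we do not have $r_\rho = \bar r_\rho$ in general. This is only true for so-called “real” sectors, and for “pseudo-real” sectors we have $r_\rho = -\bar r_\rho$.

Now let λ, µ, ν ∈ Δ. Let $r \equiv r_\lambda \in \mathrm{Hom}(\mathrm{id}_A, \bar\lambda\lambda)$ and $\bar r \equiv \bar r_\lambda \in \mathrm{Hom}(\mathrm{id}_A, \lambda\bar\lambda)$ be isometries such that $\lambda(r)^* \bar r = \bar\lambda(\bar r)^* r = d_\lambda^{-1} 1$. Let $t, t' \in \mathrm{Hom}(\lambda, \mu\nu)$. Recall that $\phi_\mu(t' t^*) = d_\lambda d_\mu^{-1} d_\nu^{-1}\, t^* t' \in \mathrm{Hom}(\lambda, \lambda)$ is a scalar. We can now compute

$$\begin{aligned}
\omega_\lambda d_\mu^{-1} d_\nu^{-1}\, t^* t' &= \omega_\lambda d_\lambda^{-1}\, \phi_\nu \circ \phi_\mu(t' t^*) \\
&= \phi_\nu \circ \phi_\mu\big(t'\, \lambda(\bar r)^*\, \varepsilon(\lambda, \lambda)\, \lambda(\bar r)\, t^*\big) \\
&= \bar r^*\, \phi_\nu \circ \phi_\mu\big(t'\, \varepsilon(\lambda, \lambda)\, t^*\big)\, \bar r \\
&= \bar r^*\, \phi_\nu \circ \phi_\mu\big(\varepsilon(\lambda, \mu\nu)\, \lambda(t')\, t^*\big)\, \bar r \\
&= \bar r^*\, \phi_\nu \circ \phi_\mu\big(\varepsilon(\lambda, \mu\nu)\, t^*\big)\, t'\, \bar r \\
&= \bar r^* t^*\, \phi_\nu \circ \phi_\mu\big(\varepsilon(\mu\nu, \mu\nu)\big)\, t'\, \bar r \\
&= \bar r^* t^*\, \phi_\nu \circ \phi_\mu\big(\mu(\varepsilon(\mu, \nu))\, \mu^2(\varepsilon(\nu, \nu))\, \varepsilon(\mu, \mu)\, \mu(\varepsilon(\nu, \mu))\big)\, t'\, \bar r \\
&= \omega_\mu d_\mu^{-1}\, \bar r^* t^*\, \phi_\nu\big(\varepsilon(\mu, \nu)\, \mu(\varepsilon(\nu, \nu))\, \varepsilon(\nu, \mu)\big)\, t'\, \bar r \\
&= \omega_\mu d_\mu^{-1}\, \bar r^* t^*\, \phi_\nu\big(\nu(\varepsilon(\nu, \mu))\, \varepsilon(\nu, \nu)\, \nu(\varepsilon(\mu, \nu))\big)\, t'\, \bar r \\
&= \omega_\mu \omega_\nu d_\mu^{-1} d_\nu^{-1}\, \bar r^* t^*\, \varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t'\, \bar r \\
&= \omega_\mu \omega_\nu d_\mu^{-1} d_\nu^{-1}\, t^*\, \varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t',
\end{aligned}$$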

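where we finally could omit the $\bar r$’s since $t^*\, \varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t' \in \mathrm{Hom}(\lambda, \lambda)$ is a scalar. As $\varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t' \in \mathrm{Hom}(\lambda, \mu\nu)$ we find $\omega_\lambda \langle t, t' \rangle = \omega_\mu \omega_\nu \langle t, \varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t' \rangle$ for any $t, t' \in \mathrm{Hom}(\lambda, \mu\nu)$, and therefore we arrive at the important relation

$$\varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\, t = \frac{\omega_\lambda}{\omega_\mu \omega_\nu}\, t \qquad \text{for all } t \in \mathrm{Hom}(\lambda, \mu\nu). \tag{11}$$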
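Decomposing [µν] in all irreducible sectors [λ] and choosing for each λ ∈ Δ some orthonormal bases of intertwiners $t_{\lambda;i} \in \mathrm{Hom}(\lambda, \mu\nu)$, $i = 1, 2, \ldots, N_{\mu,\nu}^{\lambda}$, where $N_{\mu,\nu}^{\lambda} = \langle \lambda, \mu\nu \rangle$ as usual, we have $\sum_{\lambda \in \Delta} \sum_i t_{\lambda;i} t_{\lambda;i}^* = 1$, and therefore we find by Eqs. (3) and (11),

$$\phi_\mu\big(\varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu)\big)^* = \phi_\mu\Big(\varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu) \sum_{\lambda \in \Delta} \sum_i t_{\lambda;i} t_{\lambda;i}^*\Big)^* = \sum_{\lambda \in \Delta} \frac{\omega_\mu \omega_\nu}{\omega_\lambda}\, \frac{d_\lambda}{d_\mu d_\nu}\, N_{\mu,\nu}^{\lambda}\, 1.$$

One then defines a matrix Y in terms of these numbers [40] (see also [14,13]):

$$Y_{\mu,\nu} = \sum_{\lambda \in \Delta} \frac{\omega_\mu \omega_\nu}{\omega_\lambda}\, N_{\mu,\nu}^{\lambda}\, d_\lambda, \qquad \mu, \nu \in \Delta, \tag{12}$$

i.e. $d_\mu d_\nu\, \phi_\mu(\varepsilon(\nu, \mu)\, \varepsilon(\mu, \nu))^* = Y_{\mu,\nu} 1$. Then one has $Y_{\lambda,\mu} = Y_{\mu,\lambda} = Y_{\bar\lambda,\mu}^* = Y_{\bar\lambda,\bar\mu}$. The first equality is obvious from Eq. (12), so we only need to show $Y_{\lambda,\mu} = (Y_{\bar\lambda,\mu})^*$. In fact, applying the BFE again yields $\bar\lambda(\varepsilon(\lambda, \mu))\, r_\lambda = \varepsilon(\bar\lambda, \mu)^* \mu(r_\lambda)$ and $r_\lambda^*\, \bar\lambda(\varepsilon(\mu, \lambda)) = \mu(r_\lambda)^* \varepsilon(\mu, \bar\lambda)^*$. Hence

$$\begin{aligned}
Y_{\lambda,\mu} 1 = \phi_\mu(Y_{\lambda,\mu}) &= d_\lambda d_\mu \big(r_\mu^*\, \bar\mu\big(r_\lambda^*\, \bar\lambda(\varepsilon(\mu, \lambda)\, \varepsilon(\lambda, \mu))\, r_\lambda\big)\, r_\mu\big)^* \\
&= d_\lambda d_\mu \big(r_\lambda^* r_\mu^*\, \bar\mu\big(\varepsilon(\mu, \bar\lambda)^* \varepsilon(\bar\lambda, \mu)^*\big)\, r_\mu r_\lambda\big)^* = \big(r_\lambda^*\, Y_{\bar\lambda,\mu}\, r_\lambda\big)^* = (Y_{\bar\lambda,\mu})^* 1.
\end{aligned}$$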


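Moreover, we have

$$Y_{\nu,\rho}\, Y_{\mu,\rho} = d_\rho \sum_{\lambda} N_{\mu,\nu}^{\lambda}\, Y_{\rho,\lambda},$$

since

$$\begin{aligned}
Y_{\nu,\rho}\, Y_{\mu,\rho}\, 1 &= d_\rho^2 d_\mu d_\nu\, \phi_\nu\big(\varepsilon(\rho, \nu)\, \phi_\mu(\varepsilon(\rho, \mu)\, \varepsilon(\mu, \rho))\, \varepsilon(\nu, \rho)\big)^* \\
&= d_\rho^2 d_\mu d_\nu\, \phi_\nu \circ \phi_\mu\big(\mu(\varepsilon(\rho, \nu))\, \varepsilon(\rho, \mu)\, \varepsilon(\mu, \rho)\, \mu(\varepsilon(\nu, \rho))\big)^* \\
&= d_\rho^2 d_\mu d_\nu \sum_\lambda \sum_i \phi_\nu \circ \phi_\mu\big(\varepsilon(\rho, \mu\nu)\, \rho(t_{\lambda;i} t_{\lambda;i}^*)\, \varepsilon(\mu\nu, \rho)\big)^* \\
&= d_\rho^2 d_\mu d_\nu \sum_\lambda \sum_i \phi_\nu \circ \phi_\mu\big(t_{\lambda;i}\, \varepsilon(\rho, \lambda)\, \varepsilon(\lambda, \rho)\, t_{\lambda;i}^*\big)^* \\
&= d_\rho^2 d_\mu d_\nu \sum_\lambda \sum_i \phi_\mu\big(t_{\lambda;i} t_{\lambda;i}^*\big)\, \phi_\lambda\big(\varepsilon(\rho, \lambda)\, \varepsilon(\lambda, \rho)\big)^* = d_\rho \sum_\lambda N_{\mu,\nu}^{\lambda}\, Y_{\rho,\lambda}\, 1.
\end{aligned}$$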

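From now on we assume that the system Δ is finite. We define the complex number

$$z_\Delta = \sum_{\lambda \in \Delta} d_\lambda^2\, \omega_\lambda,$$

and if $z_\Delta \neq 0$ we put $c = 4 \arg(z_\Delta)/\pi$. Note that the c is here only defined mod 8 and we may make a choice. Let C be the conjugation matrix with entries $C_{\lambda,\mu} = \delta_{\lambda,\bar\mu}$. Clearly, $C = C^* = C^{-1}$. We then have the following

Proposition 2.4. Let Δ be a finite system of endomorphisms with $z_\Delta \neq 0$. Then S- and T-matrices defined by

$$S_{\lambda,\mu} = |z_\Delta|^{-1}\, Y_{\lambda,\mu}, \qquad T_{\lambda,\mu} = e^{-\pi i c/12}\, \omega_\lambda\, \delta_{\lambda,\mu}, \qquad \lambda, \mu \in \Delta,$$

obey the partial Verlinde modular algebra TSTST = S, CTC = T, CSC = S and $T^* T = 1$.

To prove the proposition, we simply compute

$$\begin{aligned}
\sum_\mu \omega_\lambda\, Y_{\lambda,\mu}\, \omega_\mu\, Y_{\mu,\nu}\, \omega_\nu &= \omega_\lambda \omega_\nu \sum_\mu \omega_\mu\, Y_{\lambda,\bar\mu}^*\, Y_{\nu,\bar\mu}^* \\
&= \omega_\lambda \omega_\nu \sum_{\mu,\sigma} \omega_\mu d_\mu\, N_{\lambda,\nu}^{\sigma}\, Y_{\bar\mu,\sigma}^* \\
&= \omega_\lambda \omega_\nu \sum_{\mu,\rho,\sigma} \omega_\mu d_\mu\, N_{\lambda,\nu}^{\sigma}\, \frac{\omega_\rho}{\omega_\mu \omega_\sigma}\, N_{\bar\mu,\sigma}^{\rho}\, d_\rho \\
&= \omega_\lambda \omega_\nu \sum_{\rho,\sigma} d_\rho^2\, d_\sigma\, N_{\lambda,\nu}^{\sigma}\, \frac{\omega_\rho}{\omega_\sigma} \\
&= Y_{\lambda,\nu} \sum_\rho d_\rho^2\, \omega_\rho = Y_{\lambda,\nu}\, z_\Delta,
\end{aligned}$$

hence $TSTST = e^{-\pi i c/4}\, |z_\Delta|^{-1}\, S\, z_\Delta = S$. The remaining relations CTC = T, CSC = S and $T^* T = 1$ are obvious.

We define weight vectors $y^\lambda$ with components $y^\lambda_\mu = Y_{\lambda,\mu}$ and statistics characters $\chi_\lambda : \Delta \to \mathbb{C}$ with evaluations $\chi_\lambda(\mu) = d_\lambda^{-1}\, Y_{\lambda,\mu}$, λ, µ ∈ Δ. We have seen that the weight vectors $y^\lambda$ are simultaneous eigenvectors of the fusion matrices $N_\mu$ with eigenvalues $\chi_\lambda(\mu)$, $N_\mu y^\lambda = \chi_\lambda(\mu)\, y^\lambda$. Hence we obtain by computing inner products,

$$\chi_\mu(\rho)\, \langle y^\lambda, y^\mu \rangle = \langle y^\lambda, N_\rho y^\mu \rangle = \langle N_{\bar\rho}\, y^\lambda, y^\mu \rangle = \chi_\lambda(\bar\rho)^* \langle y^\lambda, y^\mu \rangle = \chi_\lambda(\rho)\, \langle y^\lambda, y^\mu \rangle.$$

Therefore the eigenvectors are either orthogonal, $\langle y^\lambda, y^\mu \rangle = 0$, or parallel, $d_\mu y^\lambda = d_\lambda y^\mu$ since then the characters are equal, $\chi_\lambda = \chi_\mu$. It is obvious that if some λ ∈ Δ is degenerate, i.e. has trivial monodromy with all other µ ∈ Δ, then $y^\lambda$ is parallel to the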


vector $y^0$. (Here and later we use the label “0” for the identity $\mathrm{id}_A \in \Delta$.) Note that we have $y^0_\mu = d_\mu$, and then $Y_{\lambda,\mu} = d_\lambda d_\mu$. Conversely, if $y^\lambda$ is parallel to $y^0$ we have seen that then necessarily $Y_{\lambda,\mu} = d_\lambda d_\mu$, hence

$$Y_{\lambda,\mu} = \sum_{\rho \in \Delta} \frac{\omega_\lambda \omega_\mu}{\omega_\rho}\, N_{\lambda,\mu}^{\rho}\, d_\rho = d_\lambda d_\mu = \sum_{\rho \in \Delta} N_{\lambda,\mu}^{\rho}\, d_\rho, \qquad \mu \in \Delta,$$

and this is clearly only possible if all the eigenvalues $\omega_\lambda \omega_\mu \omega_\rho^{-1}$ of the monodromy are trivial, i.e. if λ is degenerate. We conclude that a braiding on Δ is non-degenerate if and only if $\langle y^\lambda, y^0 \rangle = \delta_{\lambda,0}\, w$, where $w = \sum_{\lambda \in \Delta} d_\lambda^2$ is the global index. We now arrive at Rehren’s result [40].

Theorem 2.5. The following conditions are equivalent for a finite braided system of endomorphisms Δ:
1. The braiding on Δ is non-degenerate.
2. We have $w = |z_\Delta|^2$ and the matrices S and T obey the full Verlinde modular algebra

$$S^* S = T^* T = 1, \qquad (ST)^3 = S^2 = C, \qquad CTC = T,$$

moreover S diagonalizes the fusion rules (Verlinde formula):

$$N_{\lambda,\mu}^{\nu} = \sum_{\rho \in \Delta} \frac{S_{\lambda,\rho}\, S_{\mu,\rho}\, S_{\nu,\rho}^*}{S_{0,\rho}}.$$
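As a numerical illustration of Eq. (12), Proposition 2.4 and Theorem 2.5, the following sketch builds Y, $z_\Delta$, S and T from fusion coefficients, statistical dimensions and statistics phases, and then checks the relations of the theorem up to rounding. The input data are assumptions brought in for illustration and are not taken from this paper: labels 0, …, k with the SU(2) level-k truncated Clebsch–Gordan rule, quantum dimensions $d_a = \sin(\pi(a+1)/(k+2)) / \sin(\pi/(k+2))$ and phases $\omega_a = e^{2\pi i a(a+2)/(4(k+2))}$. All labels are self-conjugate here, so C is the identity matrix. Function names are hypothetical.

```python
import cmath
import math

def su2_input(k):
    """Assumed SU(2)_k data: dimensions d, statistics phases omega, fusion N[a][b][c]."""
    n = k + 1
    q = math.pi / (k + 2)
    d = [math.sin((a + 1) * q) / math.sin(q) for a in range(n)]
    omega = [cmath.exp(2j * math.pi * a * (a + 2) / (4.0 * (k + 2))) for a in range(n)]
    N = [[[0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2):
                N[a][b][c] = 1
    return d, omega, N

def modular_data(d, omega, N):
    """Y from Eq. (12), then z, w, S and T as in Proposition 2.4."""
    n = len(d)
    Y = [[sum(omega[m] * omega[v] / omega[l] * N[m][v][l] * d[l] for l in range(n))
          for v in range(n)] for m in range(n)]
    z = sum(d[l] ** 2 * omega[l] for l in range(n))
    w = sum(x * x for x in d)
    c = 4.0 * cmath.phase(z) / math.pi          # one choice of c mod 8
    S = [[Y[m][v] / abs(z) for v in range(n)] for m in range(n)]
    T = [[cmath.exp(-1j * math.pi * c / 12.0) * omega[m] if m == v else 0.0
          for v in range(n)] for m in range(n)]
    return S, T, z, w

def matmul(X, Y):
    size = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(size)) for j in range(size)]
            for i in range(size)]

def defect_from_identity(X):
    """Largest entrywise deviation of X from the identity matrix."""
    return max(abs(X[i][j] - (1.0 if i == j else 0.0))
               for i in range(len(X)) for j in range(len(X)))

k = 3
d, omega, N = su2_input(k)
S, T, z, w = modular_data(d, omega, N)
n = k + 1

S_dagger = [[S[j][i].conjugate() for j in range(n)] for i in range(n)]
ST = matmul(S, T)
index_defect = abs(w - abs(z) ** 2)                            # w = |z|^2
unitarity_defect = defect_from_identity(matmul(S_dagger, S))   # S*S = 1
st_cubed_defect = defect_from_identity(matmul(ST, matmul(ST, ST)))  # (ST)^3 = C = 1
verlinde_defect = max(
    abs(sum(S[a][r] * S[b][r] * S[c][r].conjugate() / S[0][r] for r in range(n)) - N[a][b][c])
    for a in range(n) for b in range(n) for c in range(n))

print(index_defect, unitarity_defect, st_cubed_defect, verlinde_defect)
```

All four printed numbers should be zero up to floating-point rounding. Changing `k` repeats the check at another level under the same assumptions.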

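Note that the implication 2. ⇒ 1. is trivial since invertibility of S implies that there is no vector $y^\lambda$ (λ ≠ 0) parallel to $y^0$. So let us assume that the braiding is non-degenerate: $\langle y^\lambda, y^0 \rangle = \delta_{\lambda,0}\, w$ for all λ ∈ Δ. Then we can first check

$$\begin{aligned}
w = \sum_\mu \langle y^0, y^\mu \rangle\, d_\mu\, \omega_\mu^{-1} &= \sum_{\mu,\nu} d_\nu\, Y_{\mu,\nu}\, d_\mu\, \omega_\mu^{-1} = \sum_{\mu,\nu,\lambda} d_\nu\, \frac{\omega_\mu \omega_\nu}{\omega_\lambda}\, N_{\mu,\nu}^{\lambda}\, d_\lambda\, d_\mu\, \omega_\mu^{-1} \\
&= \sum_{\mu,\nu,\lambda} d_\lambda d_\nu\, \frac{\omega_\nu}{\omega_\lambda}\, N_{\bar\nu,\lambda}^{\mu}\, d_\mu = \sum_{\lambda,\nu} d_\lambda^2\, \omega_\lambda^{-1}\, d_\nu^2\, \omega_\nu,
\end{aligned}$$

thus $w = \big|\sum_{\lambda \in \Delta} d_\lambda^2\, \omega_\lambda\big|^2 \equiv |z_\Delta|^2$. Next we compute

$$\langle y^\lambda, y^\mu \rangle = \sum_\rho Y_{\lambda,\rho}^*\, Y_{\mu,\rho} = \sum_{\rho,\nu} N_{\bar\lambda,\mu}^{\nu}\, Y_{\rho,\nu}\, d_\rho = \sum_\nu N_{\bar\lambda,\mu}^{\nu}\, \langle y^0, y^\nu \rangle = N_{\bar\lambda,\mu}^{0}\, w = \delta_{\lambda,\mu}\, w,$$

hence $S^* S = 1$. Similarly we observe that

$$\sum_\rho Y_{\lambda,\rho}\, Y_{\mu,\rho} = \sum_\rho Y_{\bar\lambda,\rho}^*\, Y_{\mu,\rho} = \delta_{\bar\lambda,\mu}\, w,$$

giving $S^2 = C$ which obviously commutes with T. Finally we check

$$\sum_\rho \frac{S_{\lambda,\rho}\, S_{\mu,\rho}\, S_{\nu,\rho}^*}{S_{0,\rho}} = w^{-1} \sum_\rho \frac{Y_{\lambda,\rho}\, Y_{\mu,\rho}\, Y_{\nu,\rho}^*}{d_\rho} = w^{-1} \sum_{\rho,\sigma} N_{\lambda,\mu}^{\sigma}\, Y_{\rho,\sigma}\, Y_{\nu,\rho}^* = \sum_\sigma N_{\lambda,\mu}^{\sigma}\, \delta_{\nu,\sigma} = N_{\lambda,\mu}^{\nu},$$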

proving the Verlinde identity.

Corollary 2.6. If the braiding on Δ is non-degenerate, then the matrix S and the diagonal matrix T are the images $S = U(\mathcal{S})$ and $T = U(\mathcal{T})$ of canonical generators

$$\mathcal{S} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \mathcal{T} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

in a unitary representation U of the modular group⁴ SL(2; Z) with dimension |Δ|, the cardinality of Δ.

3. Graphical Intertwiner Calculus

3.1. Basic graphical intertwiner calculus. We now introduce our conventions to represent and manipulate intertwiners graphically. We consider a braided system of endomorphisms Δ ⊂ End(A) with A a type III factor. Essentially we represent intertwiners by “wire diagrams” where the (oriented) wires represent endomorphisms λ ∈ Δ. This works as follows. For an intertwiner $x \in \mathrm{Hom}(\lambda_1 \lambda_2 \cdots \lambda_n, \mu_1 \mu_2 \cdots \mu_m)$ we draw a (dashed) box with n (downward) incoming wires labelled by $\lambda_1, \ldots, \lambda_n$ and m (downward) outgoing wires $\mu_1, \ldots, \mu_m$ as in Fig. 1, $\lambda_i, \mu_j \in \Delta$.

Fig. 1. An intertwiner x

Therefore

the diagrammatic representation of x does not only specify it as an operator, it even specifies the intertwiner space it is considered to belong to. (Note that the same operator can belong to different intertwiner spaces, as e.g. the identity operator belongs to any Hom(λ, λ) with λ varying.) If a morphism ρ ∈ Δ is applied to x, then $\rho(x) \in \mathrm{Hom}(\rho\lambda_1\lambda_2\cdots\lambda_n, \rho\mu_1\mu_2\cdots\mu_m)$ is represented graphically by adding a straight wire on the left as in Fig. 2.

[Fig. 2. The intertwiner ρ(x)]

Reflecting the fact that x can also be considered as an intertwiner in
$\mathrm{Hom}(\lambda_1\lambda_2\cdots\lambda_n\rho, \mu_1\mu_2\cdots\mu_m\rho)$ we can always add (or remove) a straight wire on the right as in Fig. 3 without changing the intertwiner as an operator. We say that intertwiners $x \in \mathrm{Hom}(\lambda_1\lambda_2\cdots\lambda_n, \mu_1\mu_2\cdots\mu_m)$ and $y \in \mathrm{Hom}(\nu_1\nu_2\cdots\nu_k, \rho_1\rho_2\cdots\rho_l)$, $\rho_j \in \Delta$, are diagrammatically composable if m = k and $\mu_i = \nu_i$ for all i = 1, 2, ..., m. Then the composed intertwiner $yx \in \mathrm{Hom}(\lambda_1\lambda_2\cdots\lambda_n, \rho_1\rho_2\cdots\rho_l)$ is represented graphically by putting the wire diagram for x on top of that for y as in Fig. 4. We also call this graphical procedure composition of wire diagrams. Sometimes diagrammatic composability may be achieved by adding or removing straight wires on the right.

[Fig. 3. The intertwiner x]

[Fig. 4. Product yx of diagrammatically composable intertwiners x and y]

4 In the literature the name “modular group” is often reserved for $PSL(2;\mathbb{Z}) = SL(2;\mathbb{Z})/\mathbb{Z}_2$ rather than $SL(2;\mathbb{Z})$. Clearly, we obtain a representation of PSL(2; Z) whenever the charge conjugation is trivial, C = 1.

On α-Induction, Chiral Generators and Modular Invariants for Subfactors

Now let
also $x' \in \mathrm{Hom}(\lambda'_1\lambda'_2\cdots\lambda'_{n'}, \mu'_1\mu'_2\cdots\mu'_{m'})$ with $\lambda'_i, \mu'_j \in \Delta$. The intertwining property of x yields the identity
$$\mu_1\mu_2\cdots\mu_m\rho_1\rho_2\cdots\rho_l(x')\,x = x\,\lambda_1\lambda_2\cdots\lambda_n\rho_1\rho_2\cdots\rho_l(x'),$$
and this is diagrammatically given in Fig. 5. Thus we have some freedom in translating intertwiner boxes vertically without actually changing the represented intertwiner.

[Fig. 5. Vertical translation of intertwiners x and x']
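The vertical translation freedom above can be illustrated in a finite-dimensional toy model. The following is only a sketch under stated assumptions: matrices stand in for intertwiners and the Kronecker product stands in for placing wires side by side, which is an analogy and not the type III factor setting of the text. The helper names `kron`, `matmul`, `identity` and `slide` are hypothetical.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def slide(x, x_prime):
    """Return both orderings of 'x on the left wires, x_prime on the right wires'.

    first:  apply x (with straight wires on the right), then x_prime
    second: apply x_prime (with straight wires on the left), then x
    """
    p, q = len(x), len(x[0])              # x maps a q-dim space to a p-dim space
    r, s = len(x_prime), len(x_prime[0])  # x_prime maps s-dim to r-dim
    first = matmul(kron(identity(p), x_prime), kron(x, identity(s)))
    second = matmul(kron(x, identity(r)), kron(identity(q), x_prime))
    return first, second


# Arbitrary sample data (not taken from the text).
x = [[1, 2, 0], [0, 1, 3]]
x_prime = [[2, 1], [1, 1]]
first, second = slide(x, x_prime)
print(first == second)
```

In this toy model both orderings coincide with `kron(x, x_prime)`, which mirrors the statement that boxes may be moved past each other vertically without changing the represented operator.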

The intertwiners we consider are (sums over) compositions of elementary intertwiners arising from the unitary braiding operators ε(λ, µ) ∈ Hom(λµ, µλ) and isometries t ∈ Hom(λ, µν). The wire diagrams and boxes we are dealing with are therefore compositions of “elementary boxes” representing the elementary intertwiners. We now have to introduce some normalization convention. First, the identity intertwiner $1 \equiv 1_A$ is naturally given by the “trivial box” with only straight wires of arbitrary labels. The next elementary intertwiner is $\rho_1\rho_2\cdots\rho_n(\varepsilon(\lambda,\mu))$, for which we draw a box as in Fig. 6, where the arbitrary labels $\nu_1,\dots,\nu_m$ are irrelevant and may be omitted. Similarly, the box of Fig. 7 represents the elementary intertwiner $d_\mu^{1/4}d_\nu^{1/4}d_\lambda^{-1/4}\,\rho_1\rho_2\cdots\rho_n(t)$, where t ∈ Hom(λ, µν) is an isometry. We label the trivalent vertex in the box by t since Hom(λ, µν) may be more than one-dimensional and so we have to specify the intertwiner. (Note that there would still be an ambiguity of a phase for the choice of an isometry even if Hom(λ, µν) is only one-dimensional.)

[Fig. 6. $\rho_1\rho_2\cdots\rho_n(\varepsilon(\lambda,\mu))$]

[Fig. 7. $\sqrt[4]{d_\mu d_\nu/d_\lambda}\;\rho_1\rho_2\cdots\rho_n(t)$, where t ∈ Hom(λ, µν) is an isometry]

[Fig. 8. $\rho_1\rho_2\cdots\rho_n(\varepsilon(\lambda,\mu))^* = \rho_1\rho_2\cdots\rho_n(\varepsilon^-(\mu,\lambda))$]

[Fig. 9. $\sqrt[4]{d_\mu d_\nu/d_\lambda}\;\rho_1\rho_2\cdots\rho_n(t)^*$, where t ∈ Hom(λ, µν) is an isometry]

Finally, the elementary intertwiners $\varepsilon(\lambda,\mu)^* = \varepsilon^-(\mu,\lambda)$ and $d_\mu^{1/4}d_\nu^{1/4}d_\lambda^{-1/4}\,\rho_1\rho_2\cdots\rho_n(t)^*$ are represented by Figs. 8
and 9, i.e. they are obtained from the original boxes in Figs. 6 and 7 by vertical reflection and inversion of all the arrows. Note that $\varepsilon \equiv \varepsilon^+$ represents overcrossing and $\varepsilon^-$ undercrossing of wires. We will consider intertwiners which are products of diagrammatically composable elementary intertwiners. In terms of wire diagrams we are correspondingly dealing with compositions of elementary boxes of Figs. 6–9 so that the wires with the same labels (and orientations) can and will be glued together in parallel and then we finally forget about the boundaries of the (dashed) boxes. Therefore, if a wire diagram represents some intertwiner x then $x^*$ is represented by the diagram obtained by vertical reflection and reversing all the arrows. Note that our resulting wire diagrams are then composed only from straight lines, over- and undercrossings (in X-shape) and trivalent vertices (in Y-shape or inverted Y-shape).

So far, we have considered only wires with downward orientation. We now introduce also the reversed orientation in terms of conjugation as follows: Reversing the orientation of an arrow on a wire changes its label λ to $\bar\lambda$. Also we will usually omit drawing a wire labelled by $\mathrm{id} \equiv \mathrm{id}_A$. For each λ ∈ Δ we fix (the common phase of) isometries $r_\lambda \in \mathrm{Hom}(\mathrm{id}, \bar\lambda\lambda)$ and $\bar r_\lambda \in \mathrm{Hom}(\mathrm{id}, \lambda\bar\lambda)$ such that $\lambda(r_\lambda)^*\bar r_\lambda = \bar\lambda(\bar r_\lambda)^* r_\lambda = d_\lambda^{-1}1$, and in turn for $\sqrt{d_\lambda}\,r_\lambda$ we draw one of the equivalent diagrams in Fig. 10. So the normalized isometries and their adjoints appear in wire diagrams as “caps” and “cups”, respectively. The point is that with our normalization convention, the relation $\lambda(r_\lambda)^*\bar r_\lambda = d_\lambda^{-1}1$ (and its adjoint) gives a topological invariance for intertwiners represented by wire diagrams, displayed in Fig. 11. Note that then the wire diagrams in Fig. 12 represent the scalar $d_\lambda$ (i.e. the intertwiner $d_\lambda 1 \in \mathrm{Hom}(\mathrm{id},\mathrm{id})$).

[Fig. 10. Wire diagrams for $\sqrt{d_\lambda}\,r_\lambda$]

[Fig. 11. A topological invariance for intertwiners represented by wire diagrams]

[Fig. 12. Wire diagrams for the statistical dimension $d_\lambda$]

[Fig. 13. Unitarity of braiding operators as a vertical Reidemeister move of type II]

Also note the “vertical Reidemeister
move of type II” in Fig. 13 is just the unitarity condition $\varepsilon(\lambda,\mu)^*\varepsilon(\lambda,\mu) = 1 = \varepsilon(\mu,\lambda)\varepsilon(\mu,\lambda)^*$. The BFE’s yield another topological invariance, see Fig. 14 for the first equation and Fig. 15 for the second equation.

[Fig. 14. The first braiding fusion equation]

[Fig. 15. The second braiding fusion equation]

[Fig. 16. The braid relation as a vertical Reidemeister move of type III]

The third and fourth equations are obtained
similarly by use of the co-isometry $t^*$; we leave it as an exercise to the reader to draw the corresponding wire diagrams. Up to conjugation they can also be obtained by changing over- to undercrossings in Figs. 14 and 15. Finally, the braid relation, Eq. (10), represents graphically a vertical Reidemeister move of type III, presented in Fig. 16. The topological invariance gives us the freedom to write down the intertwiner algebraically from a given wire diagram: We can deform the wire diagram by finite sequences of the above moves and then split it into elementary wire diagrams; in whatever way we decompose the wire diagrams into horizontal slices of elementary intertwiners, we always obtain the same intertwiner due to our topological invariance identities.

[Fig. 17. Statistics phase $\omega_\lambda$ as a “twist”]

Next we recall that we can write the statistics phase $\omega_\lambda$ as the intertwiner $d_\lambda\,r_\lambda^*\,\bar\lambda(\varepsilon(\lambda,\lambda))\,r_\lambda$. Therefore we obtain for $\omega_\lambda$ the wire diagram on the left-hand side of Fig. 17. The diagram
on the right-hand side expresses that $\omega_\lambda$ can also be obtained as $d_\lambda\,\bar\lambda(r_\lambda)^*\,\varepsilon(\bar\lambda,\bar\lambda)\,\bar\lambda(r_\lambda)$. Note that we obtain the complex conjugate $\omega_\lambda^*$ by exchanging over- and undercrossings. Similarly, we recall that we can write a matrix element $Y_{\lambda,\mu} = Y_{\mu,\lambda}$ of Rehren’s Y-matrix as $d_\lambda d_\mu\,\phi_\mu(\varepsilon(\lambda,\mu)\varepsilon(\mu,\lambda))^* = d_\lambda d_\mu\,r_\mu^*\,\bar\mu(\varepsilon^-(\lambda,\mu)\varepsilon^-(\mu,\lambda))\,r_\mu$. Dividing by $d_\lambda$ we obtain $\chi_\lambda(\mu)$, the statistics character $\chi_\lambda$ evaluated on µ, represented graphically by the wire diagram in Fig. 18.

[Fig. 18. Rehren’s statistics character $\chi_\lambda$ evaluated on µ: $\chi_\lambda(\mu)$]

We have drawn the circle µ symmetrically relative to
the straight wire λ because it does not make a difference whether we put the “caps” and “cups” for the isometry $r_\mu$ and its conjugate $r_\mu^*$ on the left or on the right due to the braiding fusion relations. As it is a scalar, we can write $Y_{\lambda,\mu} = \bar r_\mu^*\,Y_{\lambda,\mu}\,\bar r_\mu$ and therefore its expression $d_\lambda d_\mu\,\bar r_\mu^*\,r_\lambda^*\,\bar\lambda(\varepsilon^-(\mu,\lambda)\varepsilon^-(\lambda,\mu))\,r_\lambda\,\bar r_\mu$ yields exactly the “Hopf link” as the wire diagram for the matrix element $Y_{\lambda,\mu}$, given by the left-hand side of Fig. 19. The equality to the right-hand side is just the relation $Y_{\lambda,\mu} = Y^*_{\lambda,\bar\mu}$ together with the prescription of representing conjugates. Recall that if Δ is finite then the Y-matrix differs from the S-matrix just by an overall normalization factor $\sqrt{w}$, where w is the global index.

Often we consider intertwiners which are sums over intertwiners represented by the same wire diagram but the sum runs over one or more of the labels. Then we simply write the sum symbol in front of the diagram; we may similarly insert scalar factors. Now recall that for finite Δ the non-degeneracy of the braiding is encoded in the orthogonality relation $\langle y^0, y^\lambda\rangle = \delta_{\lambda,0}\,w$. In terms of the statistics characters this reads $\sum_\mu d_\mu\,\chi_\lambda(\mu) = d_\lambda^{-1}\delta_{\lambda,0}\,w = \delta_{\lambda,0}\,w$. Graphically this can be represented as in Fig. 20. This kind of (graphical) relation has also been used more recently in [44,38,25] and was called a “killing ring” in [38].

Wire diagrams can also be used for intertwiners of morphisms between different factors. Let A, B, C be infinite factors, ρ ∈ Mor(A, B), σ ∈ Mor(C, B), τ ∈ Mor(A, C) irreducible morphisms and t ∈ Hom(ρ, στ) an isometry. Then Fig. 21 represents the intertwiner $d_\sigma^{1/4}d_\tau^{1/4}d_\rho^{-1/4}\,t$. Similarly we can draw a picture using a co-isometry. Along the lines of the previous paragraphs, we can similarly build up larger wire diagrams out of trivalent vertices involving different factors. We do not need the triangles with corners labelled by factors as we can also label the regions between the wires.
So far we do not have a meaningful way to cross wires with differently labelled regions left and right, but all the arguments listed above which do not involve braidings can be used for intertwiners of morphisms between different factors exactly as above. Moreover, the diagrams may also involve wires where left and right regions are labelled by the same factor, i.e. these wires correspond to endomorphisms of some factor which may well form a braided system, and then one may have crossings for those wires.
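The orthogonality (“killing ring”) relation and the modular relations of Theorem 2.5 can be checked numerically on concrete modular data. The sketch below is an illustration under stated assumptions: it takes the standard SU(2) level-k S- and T-matrices as input (this data, the T-matrix phase convention with the central charge, and the function names `su2_modular_data`, `killing_ring`, `verlinde` and `modular_defect` are assumptions made for the example, not constructions from this section), forms $Y = \sqrt{w}\,S$, and evaluates $\sum_\mu d_\mu\chi_\lambda(\mu)$, the Verlinde sum, and $(ST)^3 - S^2$.

```python
import cmath
import math


def su2_modular_data(k):
    """Return (S, T_diag) for SU(2) at level k with labels 0..k (assumed input data)."""
    n = k + 2
    labels = range(k + 1)
    S = [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in labels] for a in labels]
    c = 3.0 * k / n  # central charge entering the T-matrix phase
    T_diag = [cmath.exp(2j * math.pi * (a * (a + 2) / (4.0 * n) - c / 24.0))
              for a in labels]
    return S, T_diag


def matmul(A, B):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def killing_ring(S):
    """Return (w, sums) where sums[lam] = sum over mu of d_mu * chi_lam(mu)."""
    size = len(S)
    d = [S[a][0] / S[0][0] for a in range(size)]   # statistical dimensions
    w = sum(x * x for x in d)                      # global index
    Y = [[math.sqrt(w) * S[a][b] for b in range(size)] for a in range(size)]
    sums = [sum(d[m] * Y[l][m] / d[l] for m in range(size)) for l in range(size)]
    return w, sums


def verlinde(S):
    """N[l][m][v] = sum over r of S[l][r] S[m][r] conj(S[v][r]) / S[0][r]."""
    size = len(S)
    return [[[sum(S[l][r] * S[m][r] * S[v][r].conjugate() / S[0][r]
                  for r in range(size))
              for v in range(size)] for m in range(size)] for l in range(size)]


def modular_defect(S, T_diag):
    """Largest absolute entry of (ST)^3 - S^2."""
    size = len(S)
    ST = [[S[i][j] * T_diag[j] for j in range(size)] for i in range(size)]
    ST3 = matmul(matmul(ST, ST), ST)
    S2 = matmul(S, S)
    return max(abs(ST3[i][j] - S2[i][j]) for i in range(size) for j in range(size))


S, T_diag = su2_modular_data(2)
w, sums = killing_ring(S)
print(w, sums)
print(modular_defect(S, T_diag))
```

For this input the sums vanish for every label except the trivial one, up to floating-point rounding, which is the numerical counterpart of Fig. 20; the Verlinde sums come out as non-negative integers up to rounding.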

[Fig. 19. Matrix element $Y_{\lambda,\mu}$ of Rehren’s Y-matrix as a “Hopf link”]

[Fig. 20. Orthogonality relation for a non-degenerate braiding (“killing ring”)]

[Fig. 21. The intertwiner $\sqrt[4]{d_\sigma d_\tau/d_\rho}\;t$ as a triangle]

3.2. Frobenius reciprocity and rotations. Let A, B, C be infinite factors, ρ ∈ Mor(A, B), τ ∈ Mor(C, B), σ ∈ Mor(C, A) morphisms with finite statistical dimensions $d_\rho, d_\tau, d_\sigma < \infty$, respectively, and let t ∈ Hom(τ, ρσ). Then
$$L_\rho(t) = \sqrt{\frac{d_\rho d_\sigma}{d_\tau}}\;\bar\rho(t)^*\,r_\rho \in \mathrm{Hom}(\sigma, \bar\rho\tau)$$
and
$$R_\sigma(t) = \sqrt{\frac{d_\rho d_\sigma}{d_\tau}}\;t^*\,\rho(\bar r_\sigma) \in \mathrm{Hom}(\rho, \tau\bar\sigma)$$
are the images under left and right Frobenius maps. Displaying the intertwiners $d_\rho^{1/2}\,r_\rho^*\,\bar\rho(t)$ and $d_\sigma^{1/2}\,\rho(\bar r_\sigma)^*\,t$ graphically yields the identities in Figs. 22 and 23, respectively.

[Fig. 22. Left Frobenius reciprocity for an intertwiner t ∈ Hom(τ, ρσ); the vertex on the right-hand side is labelled $\sqrt{d_\tau/d_\sigma}\;L_\rho(t)^*$]

[Fig. 23. Right Frobenius reciprocity for an intertwiner t ∈ Hom(τ, ρσ); the vertex on the right-hand side is labelled $\sqrt{d_\tau/d_\rho}\;R_\sigma(t)^*$]

These morphisms need not be irreducible. Taking them as products, we may replace any of them by bundles of wires. We call the linear isomorphisms $t \mapsto d_\rho^{1/2}\,r_\rho^*\,\bar\rho(t)$ and $t \mapsto d_\sigma^{1/2}\,\rho(\bar r_\sigma)^*\,t$ the left and right Frobenius rotations. Now let us assume that t is isometric and labels a trivalent vertex of wires corresponding to irreducible morphisms ρ, τ, σ. With the above “transformation law” we then have the identity of Fig. 24, where the first equality is just a definition which gives us some prescription of “tightening” wires at trivalent vertices. In fact, the label $L_\rho(t)^*$ of the
trivalent vertex makes sense since it is a co-isometry: Due to irreducibility of τ and σ, the map $t \mapsto L_\rho(t)^*$ is isometric. Similarly, we get Fig. 25 (using irreducibility of τ and ρ). Hence the prefactor in Figs. 22 and 23 is just such that it transforms isometries with natural normalization prefactors into co-isometries with natural normalization prefactors and, by taking adjoints, the other way round, which gives the graphical identities given in Fig. 26.

[Fig. 24. Left Frobenius reciprocity for a trivalent vertex labelled by an isometry]

[Fig. 25. Right Frobenius reciprocity for a trivalent vertex labelled by an isometry]

[Fig. 26. Frobenius reciprocity for a trivalent vertex labelled by a co-isometry]

We may now use the replacement prescription three times, beginning with a trivalent vertex labelled by an isometry t ∈ Hom(τ, ρσ) and proceeding in a clockwise direction. Then we end up with a co-isometry $\Theta(t)^* \in \mathrm{Hom}(\bar\sigma\bar\rho, \bar\tau)$ in the corner where
we originally had the label t. In fact,
$$\Theta(t) = R_\rho(L_\tau(R_\sigma(t))) = \sqrt{d_\rho d_\sigma d_\tau}\;r_\tau^*\,\bar\tau\bigl(t^*\rho(\bar r_\sigma)\bar r_\rho\bigr).$$
Similarly we can go in the counter-clockwise direction and then we obtain $\tilde\Theta(t)^* \in \mathrm{Hom}(\bar\sigma\bar\rho, \bar\tau)$, where
$$\tilde\Theta(t) = L_\sigma(R_\tau(L_\rho(t))) = \sqrt{d_\rho d_\sigma d_\tau}\;\bar\sigma\bar\rho(\bar r_\tau^*\,t^*)\,\bar\sigma(r_\rho)\,r_\sigma,$$
and in order to establish a well-defined rotation procedure we have to show that $\Theta(t) = \tilde\Theta(t)$. Now
$$\Theta(t)^*\tilde\Theta(t) = \sqrt{d_\rho d_\sigma d_\tau}\;\bar\tau(\bar r_\tau^*\,t^*)\,\Theta(t)^*\,\bar\sigma(r_\rho)\,r_\sigma$$
$$= d_\rho d_\sigma d_\tau\;\bar\tau(\bar r_\tau^*\,t^*)\,\bar\tau\bigl(\bar r_\rho^*\,\rho(\bar r_\sigma^*)\,t\bigr)\,\bar\tau\tau\bigl(\bar\sigma(r_\rho)\,r_\sigma\bigr)\,r_\tau$$
$$= d_\rho d_\sigma d_\tau\;\bar\tau\bigl(\bar r_\tau^*\,t^*\,\bar r_\rho^*\,\rho(\bar r_\sigma^*)\,\rho\sigma(\bar\sigma(r_\rho)\,r_\sigma)\,t\bigr)\,r_\tau$$
$$= d_\rho d_\tau\;\bar\tau\bigl(\bar r_\tau^*\,t^*\,\bar r_\rho^*\,\rho(r_\rho)\,t\bigr)\,r_\tau = d_\tau\,\bar\tau(\bar r_\tau^*)\,r_\tau = 1,$$
hence $(\Theta(t) - \tilde\Theta(t))^*(\Theta(t) - \tilde\Theta(t)) = 0$, i.e. $\Theta(t) = \tilde\Theta(t)$. Thus a trivalent vertex labelled with an isometry t ∈ Hom(τ, ρσ) can equivalently be labelled with a co-isometry $\Theta(t)^* \in \mathrm{Hom}(\bar\sigma\bar\rho, \bar\tau)$. So here we have established some “rotation invariance” of trivalent vertices (in standard inverted Y-shape or Y-shape) with a replacement prescription for the rotated labelling (co-)isometries.

Next we turn to the rotation of crossings when we have a braiding. Assume we have a braided system of endomorphisms Δ ∋ λ, µ, ν of some factor A. From the BFE we obtain $r_\lambda = \bar\lambda(\varepsilon^\mp(\mu,\lambda))\,\varepsilon^\mp(\mu,\bar\lambda)\,\mu(r_\lambda)$. Applying λ and multiplying by $d_\lambda\,\varepsilon^\pm(\lambda,\mu)\,\bar r_\lambda^*$ from the left yields
$$\varepsilon^\pm(\lambda,\mu) = d_\lambda\,\bar r_\lambda^*\,\lambda(\varepsilon^\mp(\mu,\bar\lambda))\,\lambda\mu(r_\lambda). \qquad (13)$$
The BFE yields similarly $\lambda(\bar r_\mu) = \varepsilon^\mp(\mu,\lambda)\,\mu(\varepsilon^\mp(\bar\mu,\lambda))\,\bar r_\mu$, and by multiplying with $d_\mu\,\mu\lambda(r_\mu^*)\,\varepsilon^\pm(\lambda,\mu)$ from the left we obtain
$$\varepsilon^\pm(\lambda,\mu) = d_\mu\,\mu\lambda(r_\mu^*)\,\mu(\varepsilon^\mp(\bar\mu,\lambda))\,\bar r_\mu,$$
and therefore we have the graphical identity given in Fig. 27, here displayed only for overcrossings. Then this procedure can even be iterated so that we obtain arbitrarily twisted crossings. Note that for the rotation of crossings we do not need any relabelling prescription as this is encoded in the BFE’s.

[Fig. 27. Rotation of crossings]

We now turn to the discussion of “abstract pictures” which admit different intertwiner interpretations according to Frobenius rotations. Let $A_1, A_2, \dots, A_\ell$ be factors equipped with sets $\Delta_{i,j} \subset \mathrm{Mor}(A_i, A_j)$, i, j = 1, 2, ..., ℓ, of irreducible, pairwise inequivalent
morphisms with finite index such that $\bigsqcup_{i,j}\Delta_{i,j}$ is closed under conjugation and irreducible decomposition of products (whenever composable) as sectors, and in particular each $\Delta_{i,i}$ is a system of endomorphisms. Some of the systems $\Delta_{i,i}$ may be braided. We now consider “labelled knotted graphs” of the following form. On a finite connected and simply connected region in the plane we have a finite number of wires (i.e. images of piecewise $C^\infty$ maps from the unit interval into the region). Within the region there is a finite number of trivalent vertices (i.e. common endpoints of three wires) and crossings of two wires, and for the latter there is a notion of over- and undercrossing (i.e. for each crossing there is one wire “on top of the other”). If wires are not closed (i.e. if their two endpoints do not coincide) then they are only allowed to have trivalent vertices or distinguished points on the boundary of the region as their endpoints. The wires meet each other only at the trivalent vertices and crossings, and they are directed and labelled by the morphisms in $\bigsqcup_{i,j}\Delta_{i,j}$ subject to the following rules. Crossings are only possible for wires with labelling morphisms in some $\Delta_{i,i}$ with braiding. Furthermore it must be possible to associate the factors $A_i$ to the free regions between the wires such that any wire labelled by some $\rho \in \Delta_{i,j}$ has the “source” factor $A_i$ on its left and the “range” factor $A_j$ on its right relative to the orientation (composition compatibility). We identify graphs which are transformed into each other by inversion of the orientation of a wire and simultaneous replacement of its label, say $\rho \in \Delta_{i,j}$, by the representative conjugate morphism $\bar\rho \in \Delta_{j,i}$. Finally, the trivalent vertices are labelled either by isometric or co-isometric intertwiners which are associated locally to one corner region of the trivalent vertex as follows. If $\tau \in \Delta_{i,j}$, $\rho \in \Delta_{k,j}$, $\sigma \in \Delta_{i,k}$ label the three wires of a trivalent vertex, τ is entering and, following counter-clockwise, ρ and σ are outgoing (as e.g. the trivalent vertex in Figs. 24 and 25, possibly up to isotopy and rotation), then in the local corner region opposite to τ the label must either be an isometry t ∈ Hom(τ, ρσ) or a co-isometry $s^* \in \mathrm{Hom}(\bar\sigma\bar\rho, \bar\tau)$. If the wires at a trivalent vertex have orientation different from this, the rule can be derived from the previous case by reversing orientations and simultaneous relabelling by conjugate morphisms.

Now let G be such a labelled knotted graph as above. To interpret G as an intertwiner, we may put it in some “Frobenius annulus” as shown in Fig. 28 for an example.$^5$ A Frobenius annulus has labelled wires inside such that each of them meets an open end of a wire of G at one endpoint (labelled by $\rho_1, \dots, \rho_{12}$ in our example), matching the label and orientation of this wire, and this way all the open ends of the wires of G are either connected to the top or bottom of the outside square boundary of the annulus. No crossings or trivalent vertices are allowed in the annulus, but it may contain cups or caps.

[Fig. 28. A Frobenius annulus surrounding G]

5 Our notion of a Frobenius annulus is inspired by the annular invariance used in Jones’ definition of a “general planar algebra” [22].

Gluing the wires together and forgetting about the boundary of G and the annulus, we will read the result as a wire diagram and therefore the annulus corresponds to a
“Frobenius choice”, deciding whether we will get a certain intertwiner or its image by certain Frobenius rotations, cf. Figs. 22 and 23 (and their adjoints). Reading vertically downwards, we may now have the problem that on a finite number of horizontal levels a finite number of singular points of crossings, trivalent vertices, cups and caps are exactly on the same level (or “height”) so that we cannot time slice the diagram into stripes containing only one elementary intertwiner. Also some wires may have pieces going exactly horizontally. We now allow small vertical translations such that these crossings and trivalent vertices are put on slightly different levels and all wires obtain piecewise slopes, without letting wires touch or producing new crossings, but we may possibly produce some new cups or caps. In the latter case we can always arrange it so that even each new cup or cap appears on a distinct level. The trivalent vertices and crossings may not be in “standard form”, i.e. in Y- or inverted Y-shape, respectively X-shape. In an “ε-neighborhood” of a trivalent vertex, we now bend the wires so that the angles are arranged in standard form. Similarly we modify the crossings to bend them into an X-shape. Using for labels at trivalent vertices our replacement prescription by Frobenius reciprocity, we can obtain isometries as labels for trivalent vertices in inverted Y-shape, located on the bottom corner region, and co-isometries as labels for trivalent vertices in Y-shape, located on the top corner region. Again, these topological moves are allowed to produce at most new cups or caps, all on different levels so that the resulting diagram can be time sliced into stripes of elementary diagrams. Clearly, this procedure of deforming a labelled knotted graph in a Frobenius annulus into a regular wire diagram is highly ambiguous.
However, the ambiguities in the above procedures are irrelevant: The ambiguities arising from the production of slopes of wires and different levels of certain elementary intertwiners are irrelevant due to the topological invariance of Fig. 11 and the freedom of translating intertwiners vertically as shown in Fig. 5, and the ambiguities arising from rotations of the elementary intertwiners are irrelevant due to the rotation invariance of trivalent vertices and crossings, as we have established in Figs. 24–27. Now let G1 and G2 be two labelled knotted graphs as above which are defined on the same (connected, simply connected) region in the plane and have the same entering and outgoing wires at the same points with the same orientation, i.e. they have coinciding open ends so that they fit in the same Frobenius annuli. When embedded in some Frobenius annulus it may now happen that the corresponding intertwiners are the same, even if G1 and G2 are different. Because of the isomorphism property of Frobenius rotations

it is clear that then G1 and G2 yield the same intertwiner through embedding in any Frobenius annulus. We can write down sufficient conditions for such equality in terms of some “regular isotopy”: For given G1 and G2 as above choose a Frobenius annulus and regularize the pictures into two wire diagrams W1 and W2, respectively. We call G1 and G2 regularly isotopic if W1 can be transformed into W2 by the following list of moves:
1. Reversing orientation of some wires with simultaneous relabelling by conjugate morphisms,
2. any horizontal translations of elementary intertwiners which may change slopes of wires but which do not let the wires meet or involve cups or caps,
3. vertical translations of elementary intertwiners as in Fig. 5,
4. topological moves as in Fig. 11,
5. rotations of trivalent vertices and their labels as in Figs. 24–26,
6. and for wires corresponding to a braided system $\Delta_{i,i}$ we additionally admit
(a) vertical Reidemeister moves of type II as in Fig. 13,
(b) moving crossings over and under trivalent vertices, cups and caps according to the BFE’s (cf. Figs. 14 and 15 for the first two relations),
(c) vertical Reidemeister moves of type III for crossings (cf. Fig. 16 for overcrossings),
(d) rotations of crossings (cf. Fig. 27 for overcrossings).
Thus the ambiguity in the regularization procedure means in particular that from one graph we can only obtain wire diagrams that can be transformed into each other by these moves. It is easy to see that regular isotopy is an equivalence relation for knotted labelled graphs. Moreover, for closed labelled knotted graphs (i.e. without open ends) which are then embedded in a trivial annulus, the local rotation invariance of the elementary intertwiners ends up in a total rotation invariance: We can rotate the picture freely, the rotated graph is always regularly isotopic to the original one and we will always end up with the same scalar (times $1_{A_i}$, where $A_i$ is the factor associated to the outside region).$^6$

[Fig. 29. Two intertwiners of the same scalar value]

Let us finally consider an intertwiner x ∈ Hom(ρ, ρ) with ρ ∈ Mor(A, B) irreducible. Then clearly x is a scalar: $x = \xi 1_B$, ξ ∈ ℂ. Hence we have the identity $d_\rho\,\xi 1_B \equiv d_\rho\,x = d_\rho\,\bar r_\rho^*\,x\,\bar r_\rho$, and this is graphically the left-hand side in Fig. 29. On the
other hand, application of the left inverse yields $d_\rho\,\phi_\rho(x) = d_\rho\,r_\rho^*\,\bar\rho(x)\,r_\rho = d_\rho\,\xi 1_A$, which is a different intertwiner of the same scalar value, and it is represented graphically by the right-hand side in Fig. 29. Thus the left and right-hand side in Fig. 29 represent the same scalar. If we consider closed wire diagrams and are only interested in the scalars they represent, then we therefore have a “regular isotopy on the 2-sphere”.

6 For a single kind of wire corresponding to a braided system, this invariance is similar to the complex number-valued regular isotopy invariant of knotted graphs obtained in [36].

3.3. α-Induction for braided subfactors. We now consider α-induction of [2–4] in the setting of braided subfactors. Here we work with a type III subfactor N ⊂ M, equipped with a braided system Δ ⊂ End(N) in the sense of Definition 2.1 such that for the injection map ι : N → M, the sector $[\bar\iota\iota]$ decomposes into a finite sum of sectors of morphisms in Δ. (Here $\bar\iota$ denotes any choice of a representative morphism for the conjugate sector of [ι].) Note that since elements in Δ have by definition finite statistical dimension, it follows that the injection map has finite statistical dimension and thus the subfactor N ⊂ M has finite index. But also note that at this point we assumed neither the finite depth condition on N ⊂ M (we did not assume finiteness of Δ) nor non-degeneracy of the braiding. As usual, we denote the canonical endomorphism by $\gamma = \iota\bar\iota \in \mathrm{End}(M)$, the dual canonical endomorphism by $\theta = \bar\iota\iota \in \mathrm{End}(N)$ and “canonical” isometries by v ∈ M and w ∈ N; more precisely, we have $v \in \mathrm{Hom}(\mathrm{id}_M, \gamma)$ and $w \in \mathrm{Hom}(\mathrm{id}_N, \theta)$ such that $w^*v = \gamma(v^*)w = [M:N]^{-1/2}\,1$. Recall that we have pointwise equality M = Nv. With a braiding ε on Δ and its extension to Σ(Δ) as in Subsect. 2.2 we can define the α-induced $\alpha_\lambda^\pm$ for λ ∈ Σ(Δ) exactly as in [33,2], namely we define
$$\alpha_\lambda^\pm = \bar\iota^{\,-1}\circ\mathrm{Ad}(\varepsilon^\pm(\lambda,\theta))\circ\lambda\circ\bar\iota.$$
Then $\alpha_\lambda^+$ and $\alpha_\lambda^-$ are morphisms in Mor(M, M) with the properties $\alpha_\lambda^\pm\circ\iota = \iota\circ\lambda$, $\alpha_\lambda^\pm(v) = \varepsilon^\pm(\lambda,\theta)^*v$, $\alpha_{\lambda\mu}^\pm = \alpha_\lambda^\pm\alpha_\mu^\pm$ if also µ ∈ Σ(Δ), and clearly $\alpha_{\mathrm{id}_N}^\pm = \mathrm{id}_M$. Note that the first property yields immediately $d_{\alpha_\lambda^\pm} = d_\lambda$ by the multiplicativity of the minimal index [31].

We also obtain easily that $\overline{\alpha_\lambda^\pm} = \alpha_{\bar\lambda}^\pm$ since we obtain $r_\lambda = \varepsilon^\pm(\theta,\bar\lambda\lambda)\,\theta(r_\lambda)$ and similarly $\bar r_\lambda = \varepsilon^\pm(\theta,\lambda\bar\lambda)\,\theta(\bar r_\lambda)$ easily from Eq. (8). Multiplying both relations by v from the right yields $r_\lambda v = \alpha_{\bar\lambda}^\pm\alpha_\lambda^\pm(v)\,r_\lambda$ and $\bar r_\lambda v = \alpha_\lambda^\pm\alpha_{\bar\lambda}^\pm(v)\,\bar r_\lambda$, hence $r_\lambda \in \mathrm{Hom}(\mathrm{id}_M, \alpha_{\bar\lambda}^\pm\alpha_\lambda^\pm)$, $\bar r_\lambda \in \mathrm{Hom}(\mathrm{id}_M, \alpha_\lambda^\pm\alpha_{\bar\lambda}^\pm)$ as M = Nv, thus we can put $R_{\alpha_\lambda^\pm} = \iota(r_\lambda)$, $\bar R_{\alpha_\lambda^\pm} = \iota(\bar r_\lambda)$ as R-isometries for the α-induced morphisms, i.e. $\overline{\alpha_\lambda^\pm} = \alpha_{\bar\lambda}^\pm$. Note also that the definition of $\alpha_\lambda^\pm$ does not depend on the choice of the representative morphism $\bar\iota$ for the conjugate sector of [ι] due to the transformation properties of the braiding operators, Eq. (7).

Though the local net structure for N(I) ⊂ M(I) is assumed in [33,2], we need only an assumption of a braiding for the definition of $\alpha_\lambda^\pm$. We, however, have to be careful, because we do not assume the chiral locality condition $\varepsilon(\theta,\theta)\gamma(v) = \gamma(v)$ in this paper. (The name “chiral locality” is motivated from the treatment of extensions of chiral observables in conformal field theory in the setting of nets of subfactors [33], where the extended net is shown to satisfy local commutativity if and only if the condition $\varepsilon(\theta,\theta)\gamma(v) = \gamma(v)$ is met [33, Thm. 4.9].) Some theorems in [2–4] do depend on the chiral locality condition and are not true in this more general setting of α-induction. Namely, with $\varepsilon(\theta,\theta)\gamma(v) = \gamma(v)$ it was easily derived [2, Lemma 3.5] by using the BFE that then $\mathrm{Hom}(\alpha_\lambda^\pm, \alpha_\mu^\pm) = \mathrm{Hom}(\iota\lambda, \iota\mu)$ for λ, µ ∈ Σ(Δ). As a surprising corollary (cf. [2, Cor. 3.6]) one found by putting $\lambda = \mu = \mathrm{id}_N$ that ι, thus the subfactor N ⊂ M, was irreducible, which had not been assumed. Another corollary was then the “main formula” [2, Thm. 3.9], giving $\langle\alpha_\lambda^\pm, \alpha_\mu^\pm\rangle = \langle\iota\lambda, \iota\mu\rangle = \langle\theta\lambda, \mu\rangle$ by Frobenius reciprocity. (Moreover, in the framework of nets of subfactors N ⊂ M, where the braidings arise from the transportability of localized endomorphisms, a certain reciprocity formula $\langle\alpha_\lambda^\pm, \beta\rangle = \langle\lambda, \sigma_\beta\rangle$, called “ασ-reciprocity”, between localized transportable endomorphisms λ and β of the smaller respectively the larger net was established; here σ-restriction is essentially $\sigma_\beta = \bar\iota\beta\iota$.) Without chiral locality, these results are in general not true: The subfactor N ⊂ M is neither forced to be irreducible, nor does the main

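formula hold; however, we always have the inequality $\langle\alpha_\lambda^\pm, \alpha_\mu^\pm\rangle \le \langle\theta\lambda, \mu\rangle$, since only the “≥” part of the proof of [2, Thm. 3.9] uses chiral locality. It is a simple application of the braiding fusion equation and does not involve chiral locality that for λ, µ, ρ ∈ Σ(Δ) we have the (equivalent) relations [2, Lemma 3.25]
$$\alpha_\rho^\mp(Q)\,\varepsilon^\pm(\lambda,\rho) = \varepsilon^\pm(\mu,\rho)\,Q, \qquad Q\,\varepsilon^\pm(\rho,\lambda) = \varepsilon^\pm(\rho,\mu)\,\alpha_\rho^\pm(Q) \qquad (14)$$
whenever Q ∈ Hom(ιλ, ιµ). Let a ∈ Mor(M, N) be such that [a] is a subsector of $[\mu\bar\iota]$ for some µ ∈ Σ(Δ). Hence aι ∈ Σ(Δ). Similarly, let $\bar b \in \mathrm{Mor}(N, M)$ be such that $[\bar b]$ is a subsector of $[\iota\bar\nu]$ for some $\bar\nu \in \Sigma(\Delta)$. If $T \in \mathrm{Hom}(\bar b, \iota\bar\nu)$ is an isometry we put
$$E^\pm(\lambda,\bar b) = T^*\,\varepsilon^\pm(\lambda,\bar\nu)\,\alpha_\lambda^\pm(T), \qquad E^\pm(\bar b,\lambda) = \bigl(E^\mp(\lambda,\bar b)\bigr)^*.$$
Note that the definition is independent of the choice of T and $\bar\nu$ in the following sense: If also $S \in \mathrm{Hom}(\bar b, \iota\bar\tau)$ is an isometry for some $\bar\tau \in \Sigma(\Delta)$ then $ST^* \in \mathrm{Hom}(\iota\bar\nu, \iota\bar\tau)$ and therefore
$$E^\pm(\lambda,\bar b) = S^*ST^*\,\varepsilon^\pm(\lambda,\bar\nu)\,\alpha_\lambda^\pm(T) = S^*\,\varepsilon^\pm(\lambda,\bar\tau)\,\alpha_\lambda^\pm(ST^*T) = S^*\,\varepsilon^\pm(\lambda,\bar\tau)\,\alpha_\lambda^\pm(S).$$
Similarly one easily checks that $E^\pm(\lambda,\bar b)$ is unitary.

Proposition 3.1. Let λ ∈ Σ(Δ), let a ∈ Mor(M, N) be such that [a] is a subsector of $[\mu\bar\iota]$ for some µ ∈ Σ(Δ) and let $\bar b \in \mathrm{Mor}(N, M)$ be such that $[\bar b]$ is a subsector of $[\iota\bar\nu]$ for some $\bar\nu \in \Sigma(\Delta)$. Then we have
$$\varepsilon^\pm(\lambda, a\iota) \in \mathrm{Hom}(\lambda a, a\alpha_\lambda^\pm), \qquad E^\pm(\lambda,\bar b) \in \mathrm{Hom}(\alpha_\lambda^\pm\bar b, \bar b\lambda). \qquad (15)$$

Proof. The first relation in Eq. (15) is trivial on N, so we only need to show it for v since M = Nv. Note that a(v) ∈ Hom(aι, aιθ), therefore Eq. (5) yields
$$a(v)\,\varepsilon^\pm(\lambda, a\iota) = a\iota(\varepsilon^\pm(\lambda,\theta))\,\varepsilon^\pm(\lambda, a\iota)\,\lambda(a(v)),$$
hence
$$a\circ\alpha_\lambda^\pm(v) = a\iota(\varepsilon^\pm(\lambda,\theta)^*)\,a(v) = \varepsilon^\pm(\lambda, a\iota)\,\lambda(a(v))\,\varepsilon^\pm(\lambda, a\iota)^* = \mathrm{Ad}\,\varepsilon^\pm(\lambda, a\iota)\circ\lambda\circ a(v).$$
For the second relation we use the fact that $TT^* \in \mathrm{Hom}(\iota\bar\nu, \iota\bar\nu)$ for $T \in \mathrm{Hom}(\bar b, \iota\bar\nu)$:
$$E^\pm(\lambda,\bar b)\,\alpha_\lambda^\pm\bar b(n) = T^*\,\varepsilon^\pm(\lambda,\bar\nu)\,\alpha_\lambda^\pm(TT^*\bar\nu(n)T) = T^*\,\varepsilon^\pm(\lambda,\bar\nu)\,\lambda\bar\nu(n)\,\alpha_\lambda^\pm(T)$$
$$= T^*\,\bar\nu\lambda(n)\,\varepsilon^\pm(\lambda,\bar\nu)\,\alpha_\lambda^\pm(T) = \bar b\lambda(n)\,E^\pm(\lambda,\bar b)$$
for all n ∈ N. □

Due to Prop. 3.1 we can now draw the pictures in Fig. 30 for the operators $\varepsilon^\pm(\lambda, a\iota)$ and $E^\pm(\lambda,\bar b)$. The pictures for their conjugates $\varepsilon^\mp(a\iota,\lambda)$ and $E^\mp(\bar b,\lambda)$ are as usual obtained by horizontal reflection and inversion of arrows of the pictures in Fig. 30.

Lemma 3.2. Let $\bar a, \bar b \in \mathrm{Mor}(N, M)$ be such that $[\bar a]$ and $[\bar b]$ are subsectors of $[\iota\bar\mu]$ and $[\iota\bar\nu]$ for some $\bar\mu, \bar\nu \in \Sigma(\Delta)$, respectively. Whenever $Y \in \mathrm{Hom}(\bar a, \bar b)$ we have
$$\alpha_\rho^\mp(Y)\,E^\pm(\bar a,\rho) = E^\pm(\bar b,\rho)\,Y, \qquad Y\,E^\pm(\rho,\bar a) = E^\pm(\rho,\bar b)\,\alpha_\rho^\pm(Y).$$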

Fig. 30. Wire diagrams for $\varepsilon^+(\lambda,a\iota)$, $\varepsilon^-(\lambda,a\iota)$, $E^+(\lambda,\bar b)$, $E^-(\lambda,\bar b)$, respectively

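Proof. Let $S\in\mathrm{Hom}(\bar a,\iota\bar\mu)$ and $T\in\mathrm{Hom}(\bar b,\iota\bar\nu)$ be isometries. Then $E^\pm(\bar a,\rho) = \alpha^\mp_\rho(S)^*\,\varepsilon^\pm(\bar\mu,\rho)\,S$ and $E^\pm(\rho,\bar b) = T^*\,\varepsilon^\pm(\rho,\bar\nu)\,\alpha^\pm_\rho(T)$. Now $TYS^*\in\mathrm{Hom}(\iota\bar\mu,\iota\bar\nu)$. Inserting this in Eq. (14) yields the statement. □

In order to establish a symmetry for “moving crossings over trivalent vertices” we can now state the following

Proposition 3.3. Let $\lambda,\rho\in\Sigma(\Delta)$, let $a,b\in\mathrm{Mor}(M,N)$ be such that $[a]$ and $[b]$ are subsectors of $[\mu\bar\iota]$ and $[\nu\bar\iota]$ for some $\mu,\nu\in\Sigma(\Delta)$ and let $\bar a,\bar b\in\mathrm{Mor}(N,M)$ be conjugates, respectively. Whenever $t\in\mathrm{Hom}(\lambda,a\bar b)$, $x\in\mathrm{Hom}(a,\lambda b)$ and $Y\in\mathrm{Hom}(\bar a,\bar b\lambda)$, we have the intertwining braiding fusion equations (IBFE’s):

$$\rho(t)\,\varepsilon^\pm(\lambda,\rho) = \varepsilon^\pm(a\iota,\rho)\,a(E^\pm(\bar b,\rho))\,t, \tag{16}$$
$$t\,\varepsilon^\pm(\rho,\lambda) = a(E^\pm(\rho,\bar b))\,\varepsilon^\pm(\rho,a\iota)\,\rho(t), \tag{17}$$
$$\rho(x)\,\varepsilon^\pm(a\iota,\rho) = \varepsilon^\pm(\lambda,\rho)\,\lambda(\varepsilon^\pm(b\iota,\rho))\,x, \tag{18}$$
$$x\,\varepsilon^\pm(\rho,a\iota) = \lambda(\varepsilon^\pm(\rho,b\iota))\,\varepsilon^\pm(\rho,\lambda)\,\rho(x), \tag{19}$$
$$\alpha^\mp_\rho(Y)\,E^\pm(\bar a,\rho) = E^\pm(\bar b,\rho)\,\bar b(\varepsilon^\pm(\lambda,\rho))\,Y, \tag{20}$$
$$Y\,E^\pm(\rho,\bar a) = \bar b(\varepsilon^\pm(\rho,\lambda))\,E^\pm(\rho,\bar b)\,\alpha^\pm_\rho(Y). \tag{21}$$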

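Proof. Since $[\bar b]$ must be a subsector of $[\iota\bar\nu]$ for $\bar\nu\in\Sigma(\Delta)$ a conjugate of $\nu$, there is an isometry $T\in\mathrm{Hom}(\bar b,\iota\bar\nu)$. Note that then $a(T)\in\mathrm{Hom}(a\bar b,a\iota\bar\nu)$. Hence by naturality and Proposition 3.1 we compute

$$\varepsilon^\pm(\rho,a\bar b) = a(T^*)\,\varepsilon^\pm(\rho,a\iota\bar\nu)\,\rho a(T) = a(T^*)\,a(\varepsilon^\pm(\rho,\bar\nu))\,\varepsilon^\pm(\rho,a\iota)\,\rho a(T) = a(T^*)\,a(\varepsilon^\pm(\rho,\bar\nu))\,a\alpha^\pm_\rho(T)\,\varepsilon^\pm(\rho,a\iota) = a(E^\pm(\rho,\bar b))\,\varepsilon^\pm(\rho,a\iota),$$

and hence also $\varepsilon^\pm(a\bar b,\rho) = \varepsilon^\pm(a\iota,\rho)\,a(E^\pm(\bar b,\rho))$. We also obtain

$$\varepsilon^\pm(\lambda b\iota,\rho) = \varepsilon^\pm(\lambda,\rho)\,\lambda(\varepsilon^\pm(b\iota,\rho)) \quad\text{and}\quad \varepsilon^\pm(\rho,\lambda b\iota) = \lambda(\varepsilon^\pm(\rho,b\iota))\,\varepsilon^\pm(\rho,\lambda)$$

by Eq. (9). Note that $x\in\mathrm{Hom}(a\iota,\lambda b\iota)$ by restriction. Equations (16)–(19) follow now by naturality, Eq. (8). Next, we note that $T\in\mathrm{Hom}(\bar b\lambda,\iota\bar\nu\lambda)$, and hence $E^\pm(\rho,\bar b\lambda) = T^*\,\varepsilon^\pm(\rho,\bar\nu\lambda)\,\alpha^\pm_\rho(T)$. Therefore

$$E^\pm(\rho,\bar b\lambda) = T^*\,\bar\nu(\varepsilon^\pm(\rho,\lambda))\,\varepsilon^\pm(\rho,\bar\nu)\,\alpha^\pm_\rho(T) = \bar b(\varepsilon^\pm(\rho,\lambda))\,T^*\,\varepsilon^\pm(\rho,\bar\nu)\,\alpha^\pm_\rho(T) = \bar b(\varepsilon^\pm(\rho,\lambda))\,E^\pm(\rho,\bar b),$$

and hence also $E^\pm(\bar b\lambda,\rho) = E^\pm(\bar b,\rho)\,\bar b(\varepsilon^\pm(\lambda,\rho))$. Now Eqs. (20) and (21) follow from Lemma 3.2. □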
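Two routine verifications from this section can be written out explicitly. The following LaTeX sketch checks the unitarity of $E^\pm(\lambda,\bar b)$ using Eq. (14) with $Q = TT^*$, and spells out how Eq. (20) is obtained from Lemma 3.2. It uses only the notation introduced above (with $E$ and $\varepsilon$ as shorthands introduced here) and is meant as a reading aid, not as part of the original argument.

```latex
% Shorthands (introduced here): E = E^\pm(\lambda,\bar b), \varepsilon = \varepsilon^\pm(\lambda,\bar\nu),
% with T \in \mathrm{Hom}(\bar b,\iota\bar\nu) an isometry, so T^*T = 1_M.
% Eq. (14) with Q = TT^* gives  TT^*\varepsilon = \varepsilon\,\alpha^\pm_\lambda(TT^*).
\begin{align*}
E E^* &= T^*\,\varepsilon\,\alpha^\pm_\lambda(TT^*)\,\varepsilon^*\,T
       = T^*\,TT^*\,\varepsilon\varepsilon^*\,T = 1_M, \\
E^* E &= \alpha^\pm_\lambda(T)^*\,\varepsilon^*\,TT^*\,\varepsilon\,\alpha^\pm_\lambda(T)
       = \alpha^\pm_\lambda(T^*)\,\varepsilon^*\varepsilon\,\alpha^\pm_\lambda(TT^*T) = 1_M .
\end{align*}
% Eq. (20): apply Lemma 3.2 to Y \in \mathrm{Hom}(\bar a,\bar b\lambda), with \bar b\lambda in the
% role of \bar b, and then insert E^\pm(\bar b\lambda,\rho) = E^\pm(\bar b,\rho)\,\bar b(\varepsilon^\pm(\lambda,\rho)).
\begin{align*}
\alpha^\mp_\rho(Y)\,E^\pm(\bar a,\rho)
  = E^\pm(\bar b\lambda,\rho)\,Y
  = E^\pm(\bar b,\rho)\,\bar b(\varepsilon^\pm(\lambda,\rho))\,Y .
\end{align*}
```

Eq. (21) follows in the same way from the second relation of Lemma 3.2 together with $E^\pm(\rho,\bar b\lambda) = \bar b(\varepsilon^\pm(\rho,\lambda))\,E^\pm(\rho,\bar b)$.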

Fig. 31. The first intertwining braiding fusion equation (overcrossings)

Fig. 32. The sixth intertwining braiding fusion equation (overcrossings)

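These IBFE’s can be nicely visualized in diagrams. We display Eq. (16) in Fig. 31 and Eq. (21) in Fig. 32, both for overcrossings. We leave the remaining diagrams as a straightforward exercise to the reader. Note that the IBFE’s give us the freedom to move wires with label $\rho$ and $\alpha^\pm_\rho$ freely over trivalent vertices which involve one N-N wire and two N-M wires. Unitarity of operators $E^\pm(\lambda,\bar b)$ yields a “vertical Reidemeister move of type II” similar to Fig. 13.

We can now also easily elaborate the rotation behavior of mixed crossings displayed in Fig. 30 (and consequently their conjugates). Crucial for this is the fact that $R_{\alpha^\pm_\lambda} = \iota(r_\lambda) \equiv r_\lambda$ and $\bar R_{\alpha^\pm_\lambda} = \iota(\bar r_\lambda) \equiv \bar r_\lambda$ can be used as R-isometries for the α-induced morphisms, as $R_{\alpha^\pm_\lambda}\in\mathrm{Hom}(\mathrm{id}_M,\alpha^\pm_{\bar\lambda}\alpha^\pm_\lambda)$ and $\bar R_{\alpha^\pm_\lambda}\in\mathrm{Hom}(\mathrm{id}_M,\alpha^\pm_\lambda\alpha^\pm_{\bar\lambda})$ satisfy $\alpha^\pm_\lambda(R_{\alpha^\pm_\lambda})^*\,\bar R_{\alpha^\pm_\lambda} = d_\lambda^{-1}1_M$ and $\alpha^\pm_{\bar\lambda}(\bar R_{\alpha^\pm_\lambda})^*\,R_{\alpha^\pm_\lambda} = d_\lambda^{-1}1_M$ and $d_{\alpha^\pm_\lambda} = d_\lambda$. First we notice that we have

$$\varepsilon^\pm(\lambda,a\iota) = d_\lambda\,\bar r_\lambda^*\,\lambda(\varepsilon^\mp(a\iota,\bar\lambda))\,\lambda a(r_\lambda)$$

by Eq. (13). Now let $R_a\in\mathrm{Hom}(\mathrm{id}_M,\bar a a)$ and $\bar r_a\in\mathrm{Hom}(\mathrm{id}_N,a\bar a)$ be isometries such that $a(R_a)^*\bar r_a = d_a^{-1}1_N$ and $\bar a(\bar r_a)^*R_a = d_a^{-1}1_M$, and otherwise we keep the notations as in Prop. 3.3. From Eq. (17) we obtain $a(E^\mp(\bar a,\lambda))\,\bar r_a = \varepsilon^\pm(\lambda,a\iota)\,\lambda(\bar r_a)$. Hence we have

$$\varepsilon^\pm(\lambda,a\iota) = d_a\,\varepsilon^\pm(\lambda,a\iota)\,\lambda a(R_a)^*\,\lambda(\bar r_a) = d_a\,a\alpha^\pm_\lambda(R_a)^*\,\varepsilon^\pm(\lambda,a\iota)\,\lambda(\bar r_a) = d_a\,a\alpha^\pm_\lambda(R_a)^*\,a(E^\mp(\bar a,\lambda))\,\bar r_a.$$

Next we compute, using again Eq. (13),

$$E^\pm(\lambda,\bar b) = T^*\,\varepsilon^\pm(\lambda,\bar\nu)\,\alpha^\pm_\lambda(T) = d_\lambda\,T^*\,\bar r_\lambda^*\,\lambda(\varepsilon^\mp(\bar\nu,\bar\lambda))\,\lambda\bar\nu(r_\lambda)\,\alpha^\pm_\lambda(T) = d_\lambda\,\bar r_\lambda^*\,\alpha^\pm_\lambda\big(\alpha^\pm_{\bar\lambda}(T)^*\,\varepsilon^\mp(\bar\nu,\bar\lambda)\,T\big)\,\alpha^\pm_\lambda\bar b(r_\lambda) = d_\lambda\,\bar r_\lambda^*\,\alpha^\pm_\lambda(E^\mp(\bar b,\bar\lambda))\,\alpha^\pm_\lambda\bar b(r_\lambda).$$

Finally, as Eq. (17) yields $\bar r_a^*\,a(E^\pm(\lambda,\bar a)) = \lambda(\bar r_a)^*\,\varepsilon^\mp(a\iota,\lambda)$, we obtain

$$E^\pm(\lambda,\bar a) = d_a\,\bar a(\bar r_a)^*\,\bar a a(E^\pm(\lambda,\bar a))\,R_a = d_a\,\bar a\lambda(\bar r_a)^*\,\bar a(\varepsilon^\mp(a\iota,\lambda))\,R_a.$$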

Drawing for $R_{\alpha^\pm_\lambda} = \iota(r_\lambda)$ and $\bar R_{\alpha^\pm_\lambda} = \iota(\bar r_\lambda)$ caps of the wires $\alpha^\pm_\lambda$, these relations yield graphically the analogues of Fig. 27. We conclude that we can include the crossings of Fig. 30 consistently in our “rotation covariant” graphical framework.

4. Double Triangle Algebras for Subfactors

We now formulate Ocneanu’s construction [39] for a subfactor with finite index and finite depth, rather than for bi-unitary connections and bimodules arising from Goodman-de la Harpe-Jones subfactors associated to A-D-E Dynkin diagrams, in order to apply it in a more general context. From now on we work with $N\subset M$ satisfying the following

Assumption 4.1. Let $N\subset M$ be a type III subfactor with finite index. We assume that we have a system of endomorphisms ${}_N\mathcal{X}_N\subset\mathrm{Mor}(N,N)\equiv\mathrm{End}(N)$ in the sense of Definition 2.1 such that for the injection map $\iota: N\to M$, the sector $[\theta] = [\bar\iota\iota]$ decomposes into a sum of sectors of morphisms in ${}_N\mathcal{X}_N$. We choose sets of morphisms ${}_N\mathcal{X}_M\subset\mathrm{Mor}(M,N)$, ${}_M\mathcal{X}_N\subset\mathrm{Mor}(N,M)$ and ${}_M\mathcal{X}_M\subset\mathrm{Mor}(M,M)\equiv\mathrm{End}(M)$ consisting of representative endomorphisms of irreducible subsectors of sectors of the form $[\lambda\bar\iota]$, $[\iota\lambda]$ and $[\iota\lambda\bar\iota]$, $\lambda\in{}_N\mathcal{X}_N$, respectively. (We may and do choose $\mathrm{id}_M$ in ${}_M\mathcal{X}_M$ as the endomorphism representing the trivial sector.) We also assume that ${}_N\mathcal{X}_N$ is finite.

Consequently, the set $\mathcal{X} = {}_N\mathcal{X}_N\sqcup{}_N\mathcal{X}_M\sqcup{}_M\mathcal{X}_N\sqcup{}_M\mathcal{X}_M$ is finite. Note that Assumption 4.1 implies that representative morphisms for all irreducible sectors appearing in decompositions of powers $[\gamma^k]$ ($[\theta^k]$) of Longo’s (dual) canonical endomorphism are contained in ${}_M\mathcal{X}_M$ (${}_N\mathcal{X}_N$). In other words, the set $\mathcal{X}$ contains at least the morphisms corresponding to the (equivalence classes of) bimodules arising from this subfactor through the Jones tower, and therefore we may call an $\mathcal{X}$ which does not contain any other morphisms a minimal choice. We conclude that finiteness of ${}_N\mathcal{X}_N$ in Assumption 4.1 automatically implies that the subfactor $N\subset M$ has finite depth.
We used sectors instead of bimodules in view of our “identification” of chiral generators with α-induced sectors below. Therefore we need a sector approach in order to define α-induction since its definition involves $\bar\iota^{-1}$, and hence we work with factors of type III. (We do not need hyperfiniteness of $M$ for our purposes.) We now use the graphical calculus presented in Sect. 3. In the graphical method of [37] (and [11, Chapter 12]), factors, bimodules (morphisms), and intertwiners are represented with trivalent vertices, edges, and triangles, respectively, and this is where the name “double triangle algebra” comes from. However, here (as in [38,39]) these three kinds of objects are represented by regions, wires, and trivalent vertices, respectively, though the labels for regions are omitted for notational simplicity.

For $\mathcal{X}$ in Assumption 4.1, we define the double triangle algebra ⟠ with two multiplications $*_h$ and $*_v$ as follows. As a linear space, we set

$$⟠ = \bigoplus_{a,b,c,d\in{}_N\mathcal{X}_M}\mathrm{Hom}(a\bar b,c\bar d).$$

This is a finite dimensional complex linear space. An element in ⟠ is presented graphically as in Fig. 33 under the interpretation in Sect. 3 with the convention of reading the diagram from the top to the bottom. (A general element in ⟠ is a linear combination of this type of element.)

Fig. 33. An element in ⟠

We can interpret the same diagram with the convention of reading the diagram from the left to the right or, equivalently, keeping the top-to-bottom convention but putting the diagram in a suitable Frobenius annulus. Then the resulting intertwiner is in

$$\bigoplus_{a,b,c,d\in{}_N\mathcal{X}_M}\mathrm{Hom}(\bar c a,\bar d b).$$

The isomorphism of these two spaces is given by application of two Frobenius rotations, and we can use this isomorphism to identify ⟠ with the space of intertwiners read from left to right. By our convention of the normalization in Sect. 3, the diagram of Fig. 33 represents an element $d_a^{1/4}d_b^{1/4}d_c^{1/4}d_d^{1/4}d_\lambda^{-1/2}\,ts^*$ in the block $\mathrm{Hom}(a\bar b,c\bar d)$, where $s\in\mathrm{Hom}(\lambda,a\bar b)$ and $t\in\mathrm{Hom}(\lambda,c\bar d)$ are isometries and $\lambda\in{}_N\mathcal{X}_N$. Similarly we may use elements in the left-to-right picture which are graphically represented as in Fig. 34 with isometries $S\in\mathrm{Hom}(\beta,\bar c a)$, $T\in\mathrm{Hom}(\beta,\bar d b)$ and $\beta\in{}_M\mathcal{X}_M$.

Fig. 34. An element in the left-to-right picture of ⟠

Note that elements of the form in Fig. 33, or equivalently of the form in Fig. 34, span ⟠ linearly. Our graphical convention is as follows. We use thin, thick, and very thick wires for N-N morphisms, N-M morphisms, and M-M morphisms, respectively, analogous to the convention [39]. We call them N-N wires, and so on. We label N-N morphisms with Greek letters $\lambda,\mu,\nu,\dots$, N-M morphisms with Roman letters $a,b,c,d,\dots$, and M-M morphisms with Greek letters $\beta,\beta',\beta'',\dots$. We orient N-N or M-M wires but we put no orientations on N-M wires since it is clear from the context whether we mean an N-M morphism $a$ or an M-N morphism $\bar a$. We simply put a label $a$ for an unoriented thick wire for both. Note that, whichever of the two pictures we consider, the same intertwiner (as an operator) may appear in different blocks of the double triangle algebra, e.g. the identity $\mathrm{id}_N$ is an element in any $\mathrm{Hom}(a\bar b,a\bar b)$, $a,b\in{}_N\mathcal{X}_M$. The graphical notation is particularly useful in order to avoid this kind of confusion because diagrams as in Figs. 33 and 34 always specify also the associated block in addition to the intertwiner as an operator.

Fig. 35. The horizontal product $*_h$ on ⟠ (the glued diagram carries the prefactor $\delta_{b,a'}\delta_{d,c'}$)

The horizontal product $*_h$ on ⟠ is defined as in Fig. 35. The meaning of the right-hand side is as follows. The product is by definition zero if the labels of the open ends of the wires facing each other do not match. If they match, we glue the wires of the two diagrams together as in Fig. 35 and interpret it as an intertwiner. It belongs to the block of the double triangle algebra which is specified by the four remaining open ends of the new diagram. This is a horizontal version of the composition of intertwiners described in Sect. 3. We also can represent this horizontal product in terms of elements in Fig. 34. This is described in Fig. 36, because the convention of Sect. 3 means that this product is just the composition of the intertwiners in the left-to-right picture, and this composition is realized by taking the inner product of the two intertwiners in the right-hand side in Fig. 36.

Fig. 36. The horizontal product presented in another way

We similarly define the vertical product $*_v$ on ⟠ by composing two diagrams vertically, but with extra coefficients as in Fig. 37. The meaning of the right-hand side is as before. Note that the definitions of horizontal and vertical products are not completely symmetric due to the extra coefficients we chose. This choice is somewhat arbitrary but it just turns out to be useful for our purposes. Namely, with this definition of the products, the minimal central projections of (⟠, $*_h$) have simple and useful composition rules with respect to the vertical product $*_v$, see Theorem 4.4 below.

Fig. 37. The vertical product $*_v$ in ⟠

We clearly also have a ∗-structure for the horizontal product obtained by vertical reflection of the diagram, adjoining labels for trivalent vertices and reversing orientations of wires. Analogously, a ∗-structure for the vertical product comes from horizontal reflection. The basic idea is that the 90-degree rotation is something like a “Fourier transform” which transforms the two products into each other, similar to the situation of the group algebra of a finite or compact group.

For each $\beta,\lambda,a,b$ we choose orthonormal bases of isometries $T^{\beta;i}_{\bar b,a}\in\mathrm{Hom}(\beta,\bar b a)$, $i = 1,2,\dots,N^\beta_{\bar b,a}$, and $t^{\lambda;j}_{a,\bar b}\in\mathrm{Hom}(\lambda,a\bar b)$, $j = 1,2,\dots,N^\lambda_{a,\bar b}$, so that

$$\sum_{\beta\in{}_M\mathcal{X}_M}\sum_{i=1}^{N^\beta_{\bar b,a}} T^{\beta;i}_{\bar b,a}\,(T^{\beta;i}_{\bar b,a})^* = 1_M \quad\text{and}\quad \sum_{\lambda\in{}_N\mathcal{X}_N}\sum_{j=1}^{N^\lambda_{a,\bar b}} t^{\lambda;j}_{a,\bar b}\,(t^{\lambda;j}_{a,\bar b})^* = 1_N \tag{22}$$

for all $a,b\in{}_N\mathcal{X}_M$. Then it is easy to see that the elements in Fig. 38 form bases of ⟠ which constitute complete systems of matrix units for (⟠, $*_h$) respectively (⟠, $*_v$). Thus for each of the two multiplications the double triangle algebra is a direct sum of full matrix algebras. The two different bases are transformed into each other by a unitary transformation with coefficients given by the 6j-symbols for subfactors of [37] (see [11, Chapter 12] for the basic properties of “quantum 6j-symbols”), but this will not be exploited here.

Fig. 38. Matrix units $e^{d,b,j}_{\beta;c,a,i}$ for (⟠, $*_h$) and $f^{a,b,i}_{\lambda;c,d,j}$ for (⟠, $*_v$)

Definition 4.2. For each $\beta\in{}_M\mathcal{X}_M$ we define an element $e_\beta = \sum_{a,b,i} e^{b,a,i}_{\beta;b,a,i}$ in ⟠. Graphically, this element is given by the left-hand side in Fig. 39. We use the convention shown on the right-hand side in Fig. 39 to represent this element.

Fig. 39. The minimal central projection $e_\beta$

Due to the summation over $i = 1,2,\dots,N^\beta_{\bar b,a}$, the definition is independent of the choice of the intertwiner bases as different orthonormal bases are related by a unitary matrix. We will use such a graphical convention whenever we have a sum over internal “fusion channels” of two corresponding trivalent vertices together with prefactors which renormalize the trivalent vertices to isometries. Note that we obtain a prefactor, as displayed in Fig. 40 for an example, when we turn around the small arcs at trivalent vertices. Here the dotted parts mean that there might be expansions as given in the following lemma, or later even braiding operators, in between; it is just important that the small arcs at corresponding trivalent vertices denote the same summation over internal fusion channels.

Fig. 40. Turning around small arcs yields a prefactor (here $d_\lambda/d_b$)

Lemma 4.3. The identity of Fig. 41 holds. Analogous identities hold if a, b, β are replaced by wires of other type (in a compatible way).

Fig. 41. The identity with expansion using $\beta$

Proof. With the normalization convention as in Fig. 39, this is just the expansion of the identity in Eq. (22), and this certainly holds as well using similar expansions with other intertwiner bases. □

Note that the identity in Fig. 41 may, for example, also appear rotated by 90 degrees as we can put the left- and right-hand sides in some Frobenius annulus as described in Subsect. 3.2. As we have already indicated, the horizontal product is essentially the composition of intertwiners in the left-to-right picture of ⟠. The main point of the double triangle algebra is the following. Suppose we have complete information on the fusion rules of N-N, N-M, M-N morphisms in $\mathcal{X}$ and their 6j-symbols. We can define the algebra ⟠ in terms of the matrix elements $f^{a,b,i}_{\lambda;c,d,j}$ and determine their composition with respect to the horizontal product without any information on the M-M morphisms. Then we can find M-M sectors and determine their fusion rules by the following theorem which generalizes a result for Goodman–de la Harpe–Jones subfactors in [39] in a straightforward manner.

Theorem 4.4. For any $\beta\in{}_M\mathcal{X}_M$ the element $e_\beta$ in ⟠ of Definition 4.2 is a minimal central projection with respect to the horizontal product, and all minimal central projections arise in this way in a bijective correspondence. Furthermore, we have⁷

$$e_\beta *_v e_{\beta'} = \sum_{\beta''\in{}_M\mathcal{X}_M}\frac{d_\beta d_{\beta'}}{d_{\beta''}}\,N^{\beta''}_{\beta,\beta'}\,e_{\beta''} \tag{23}$$

for all $\beta,\beta'\in{}_M\mathcal{X}_M$. In particular, the center $Z_h$ of ⟠ with respect to the horizontal product is closed under the vertical product.

Proof. That each $e_\beta$ is a minimal central projection and that all minimal central projections arise in this way is obvious from the description of the matrix units. The vertical product $e_\beta *_v e_{\beta'}$ is given graphically by the left-hand side of Fig. 42. We can use the expansion of Lemma 4.3 for the two parallel wires $\beta$ and $\beta'$ in the middle. Now note that the horizontal unit is given by $1_h = \sum_\beta e_\beta$. Therefore, by multiplying $1_h$ from the left and from the right, we obtain the diagram on the right-hand side of Fig. 42. Reading the diagram from left to right, we observe that intertwiners in $\mathrm{Hom}(\beta''',\beta'')$ and $\mathrm{Hom}(\beta'',\beta'''')$ are involved here. Hence we first obtain a factor $\delta_{\beta''',\beta''}\delta_{\beta'',\beta''''}$. Next, we can use the trick of Fig. 40 to turn around the small arcs at the trivalent vertices involving $a$, $b$, $\beta'$. This yields a factor $d_{\beta'}/d_b$. This way we see that the diagram on the right-hand side of Fig. 42 represents the same element of ⟠ as the diagram in Fig. 43.

Fig. 42. The vertical product $e_\beta *_v e_{\beta'}$

Fig. 43. The vertical product $e_\beta *_v e_{\beta'}$

Now let us look at the part of this picture inside the dotted box. Reading it from the left, this part can be read for fixed $a$ and $c$ as $\sum_{i,k} T_i\,T^{\beta'';k}_{\beta,\beta'}(T^{\beta'';k}_{\beta,\beta'})^*\,T_i^*$, and the sum over $i$ runs over a full orthonormal basis of isometries $T_i$ in the Hilbert space $\mathrm{Hom}(\beta,\bar c a\bar\beta')$ since we have the summation over $b$. Next we look at the part inside the dotted box of the diagram in Fig. 44. Here, since we introduced the sum over $\beta'''$, the part can be similarly read for fixed $a$ and $c$ as $\sum_{j,k} S_j\,T^{\beta'';k}_{\beta,\beta'}(T^{\beta'';k}_{\beta,\beta'})^*\,S_j^*$, where the sum over $j$ runs over another orthonormal basis of isometries $S_j$ in the Hilbert space $\mathrm{Hom}(\beta,\bar c a\bar\beta')$. Since such bases $\{T_i\}$ and $\{S_j\}$ are related by a unitary matrix transformation (this is essentially “unitarity of 6j-symbols”), we conclude that the diagrams in Figs. 43 and 44 represent the same element in ⟠.

Fig. 44. The vertical product $e_\beta *_v e_{\beta'}$

We now see that we first obtain a factor $\delta_{\beta'',\beta'''}$. Next we can turn around the small arcs at the outer two trivalent vertices involving $\beta$, $\beta'$ and $\beta''' = \beta''$ so that we obtain a factor $d_\beta/d_{\beta''}$. Then, by “stretching” the diagram a bit, we can read the diagram for fixed $a$, $c$, $\beta''$ as

$$\sum_{i,j,m=1}^{N^{\beta''}_{\bar c,a}}\ \sum_{k,l=1}^{N^{\beta''}_{\beta,\beta'}} \frac{d_\beta d_{\beta'}}{d_{\beta''}}\; T^{\beta'';i}_{\bar c,a}(T^{\beta'';i}_{\bar c,a})^*\,T^{\beta'';j}_{\bar c,a}(T^{\beta'';l}_{\beta,\beta'})^*\,T^{\beta'';k}_{\beta,\beta'}(T^{\beta'';k}_{\beta,\beta'})^*\,T^{\beta'';l}_{\beta,\beta'}(T^{\beta'';j}_{\bar c,a})^*\,T^{\beta'';m}_{\bar c,a}(T^{\beta'';m}_{\bar c,a})^* = \frac{d_\beta d_{\beta'}}{d_{\beta''}}\,N^{\beta''}_{\beta,\beta'}\sum_{i=1}^{N^{\beta''}_{\bar c,a}} T^{\beta'';i}_{\bar c,a}(T^{\beta'';i}_{\bar c,a})^*.$$

Now proceeding with the summations over $a$, $c$, $\beta''$ yields the statement. □

⁷ Note that the fusion coefficients with dimension prefactors as in Eq. (23) coincide with the structure constants used for C-algebras [1].

Now consider the vector space with basis elements $[\beta]$, $\beta\in{}_M\mathcal{X}_M$, which we can endow with a product through $[\beta][\beta'] = \sum_{\beta''} N^{\beta''}_{\beta,\beta'}[\beta'']$. We call the algebra defined this way the M-M fusion rule algebra. Similarly we define the N-N fusion rule algebra using morphisms in ${}_N\mathcal{X}_N$.

Definition 4.5. We define a linear map $\Phi$ from the M-M fusion rule algebra to $Z_h$ by linear extension of $\Phi([\beta]) = e_\beta/d_\beta$.

Theorem 4.4 now says that this map $\Phi$ is an isomorphism from the M-M fusion rule algebra onto $(Z_h,*_v)$. Note that $(Z_h,*_v)$ is a non-unital subalgebra of (⟠, $*_v$). The unit $1_v$ of (⟠, $*_v$) is given by $1_v = \sum_\lambda f_\lambda$, where $f_\lambda = \sum_{a,b,j} f^{a,b,j}_{\lambda;a,b,j}$, whereas the unit of $(Z_h,*_v)$ is given by $e_0$.
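The multiplicativity of $\Phi$ can be checked mechanically on a small example. The following Python sketch is only an illustration under stated assumptions: it takes Ising-type fusion rules (labels 0, 1, 2 with dimensions $1, \sqrt2, 1$) as a hypothetical stand-in for the system ${}_M\mathcal{X}_M$, represents elements of $Z_h$ by their coefficients in the basis $e_\beta$, and multiplies them with the structure constants of Eq. (23). The function names are made up for this sketch.

```python
import itertools
import math

# Toy data (an assumption for illustration only): Ising-type fusion rules,
# labels 0 (trivial), 1 (sigma), 2 (psi), standing in for the system M_X_M.
LABELS = (0, 1, 2)
DIM = {0: 1.0, 1: math.sqrt(2.0), 2: 1.0}


def fusion(b1, b2):
    """Return {b3: N^{b3}_{b1,b2}} for the toy fusion rules."""
    if b1 == 0:
        return {b2: 1}
    if b2 == 0:
        return {b1: 1}
    if b1 == 1 and b2 == 1:
        return {0: 1, 2: 1}   # sigma x sigma = 1 + psi
    if b1 == 2 and b2 == 2:
        return {0: 1}         # psi x psi = 1
    return {1: 1}             # sigma x psi = psi x sigma = sigma


def vertical_product(x, y):
    """Product on Z_h in the basis e_beta, with the structure constants of Eq. (23)."""
    out = {b: 0.0 for b in LABELS}
    for b1, b2 in itertools.product(LABELS, repeat=2):
        for b3, n in fusion(b1, b2).items():
            out[b3] += x.get(b1, 0.0) * y.get(b2, 0.0) * n * DIM[b1] * DIM[b2] / DIM[b3]
    return out


def phi(b):
    """Phi([beta]) = e_beta / d_beta, as a coefficient dictionary."""
    return {b: 1.0 / DIM[b]}


def phi_of_product(b1, b2):
    """Phi([b1][b2]) = sum over b3 of N^{b3}_{b1,b2} e_{b3} / d_{b3}."""
    out = {b: 0.0 for b in LABELS}
    for b3, n in fusion(b1, b2).items():
        out[b3] += n / DIM[b3]
    return out


def is_homomorphism(tol=1e-12):
    """Check Phi([b1]) *_v Phi([b2]) == Phi([b1][b2]) for all pairs of labels."""
    for b1, b2 in itertools.product(LABELS, repeat=2):
        lhs = vertical_product(phi(b1), phi(b2))
        rhs = phi_of_product(b1, b2)
        if any(abs(lhs[b] - rhs[b]) > tol for b in LABELS):
            return False
    return True
```

In this basis the check reduces to the cancellation $d_\beta d_{\beta'}/d_{\beta''}\cdot 1/(d_\beta d_{\beta'}) = 1/d_{\beta''}$, which is why the dimension prefactors in Eq. (23) are exactly what makes $e_\beta/d_\beta$, rather than $e_\beta$, multiplicative. The same code also shows that $e_0$ acts as the unit for the vertical product on $Z_h$.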

Definition 4.6. We define two linear functionals $\varphi_h$ and $\tau_v$ on ⟠ corresponding to the two product structures $*_h$ and $*_v$ by linear extension of

$$\varphi_h(e^{d,b,j}_{\beta;c,a,i}) = \delta_{a,b}\delta_{c,d}\delta_{i,j}\,d_a d_c d_\beta/w^2, \qquad \tau_v(f^{a,b,i}_{\lambda;c,d,j}) = \delta_{a,c}\delta_{b,d}\delta_{i,j}\,d_\lambda. \tag{24}$$

Applied to an element in Fig. 34 (Fig. 33) the functional $\varphi_h$ ($\tau_v$) can be characterized graphically as in Fig. 45 (Fig. 46). Therefore these functionals correspond to closing the open ends of a diagram with prefactors as in the middle part of Figs. 45 and 46.

Fig. 45. The horizontal functional $\varphi_h$

Fig. 46. The vertical functional $\tau_v$

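Recall that the global index of ${}_N\mathcal{X}_N$ is given by $w = \sum_{\lambda\in{}_N\mathcal{X}_N} d_\lambda^2$. Note that we have sector decompositions $[a\iota] = \sum_\lambda\langle\lambda,a\iota\rangle[\lambda]$ and hence $d_a d_\iota = \sum_\lambda\langle\lambda,a\iota\rangle d_\lambda$ for any $a\in{}_N\mathcal{X}_M$. Using Frobenius reciprocity $\langle\lambda,a\iota\rangle = \langle\lambda\bar\iota,a\rangle$ we obtain similarly $d_\lambda d_\iota = \sum_a\langle\lambda,a\iota\rangle d_a$. Hence $w = \sum_\lambda d_\lambda^2 = \sum_{\lambda,a}\langle\lambda,a\iota\rangle d_a d_\lambda/d_\iota = \sum_a d_a^2$. Similarly we obtain $w = \sum_\beta d_\beta^2$ (cf. [37]).

Lemma 4.7. We have $\varphi_h(e_\beta) = d_\beta^2/w$. In particular, the functional $\varphi_h$ is a faithful state on (⟠, $*_h$). The functional $\tau_v$ is a (un-normalized) faithful trace on (⟠, $*_v$).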

Proof. By Definition 4.6 and Fig. 39, we compute

$$\varphi_h(e_\beta) = \sum_{a,b\in{}_N\mathcal{X}_M} N^\beta_{\bar a,b}\,d_a d_b d_\beta\,w^{-2} = \sum_{a\in{}_N\mathcal{X}_M}\Big(\sum_{b\in{}_N\mathcal{X}_M} N^b_{a,\beta}\,d_b\Big)\,d_\beta d_a\,w^{-2} = d_\beta^2\,w^{-1}.$$

Since the horizontal unit $1_h$ is given by $1_h = \sum_\beta e_\beta$ we find that $\varphi_h(1_h) = 1$. As $\varphi_h$ sends off-diagonal matrix units to zero and the diagonal ones to strictly positive numbers, this proves that $\varphi_h$ is a faithful state. Obviously also $\tau_v$ sends off-diagonal matrix units (with respect to $*_v$) to zero and the diagonal ones to strictly positive numbers, and hence it is a strictly positive functional but it is not normalized. The trace property $\tau_v(xy) = \tau_v(yx)$ is clear from the definition of $\tau_v$ using matrix units for $x$ and $y$. □
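The counting in this proof can be reproduced numerically. The sketch below is a toy check under a loud assumption: it takes all three systems ${}_N\mathcal{X}_N$, ${}_N\mathcal{X}_M$, ${}_M\mathcal{X}_M$ to carry the same Ising-type fusion rules with self-conjugate labels (as one would have for a trivial inclusion), so that $N^\beta_{\bar a,b}$ is just an ordinary fusion coefficient. The helper names are hypothetical and not taken from the paper.

```python
import itertools
import math

# Toy assumption: one Ising-type fusion ring plays the role of all systems;
# labels 0 (trivial), 1 (sigma), 2 (psi) are self-conjugate.
LABELS = (0, 1, 2)
DIM = {0: 1.0, 1: math.sqrt(2.0), 2: 1.0}


def fusion_coefficient(x, y, z):
    """N^z_{x,y} for the toy fusion rules."""
    if x == 0:
        return int(y == z)
    if y == 0:
        return int(x == z)
    if x == 1 and y == 1:
        return int(z in (0, 2))   # sigma x sigma = 1 + psi
    if x == 2 and y == 2:
        return int(z == 0)        # psi x psi = 1
    return int(z == 1)            # sigma x psi = psi x sigma = sigma


def global_index():
    """w = sum of squared dimensions."""
    return sum(DIM[x] ** 2 for x in LABELS)


def phi_h_of_e(beta):
    """Sum over a, b of N^beta_{a,b} d_a d_b d_beta / w^2, as in the proof of Lemma 4.7."""
    w = global_index()
    total = 0.0
    for a, b in itertools.product(LABELS, repeat=2):
        total += fusion_coefficient(a, b, beta) * DIM[a] * DIM[b] * DIM[beta]
    return total / w ** 2
```

Comparing `phi_h_of_e(beta)` with `DIM[beta] ** 2 / global_index()` for each label, and summing over all labels, reproduces the two statements $\varphi_h(e_\beta) = d_\beta^2/w$ and $\varphi_h(1_h) = 1$ in this toy setting.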


For $\tau_v$ we could have gained analogous properties as for $\varphi_h$ by replacing the scalar $d_\lambda$ in Eq. (24) by $d_a d_b d_\lambda/w^2$ (and by multiplying the scalars in Fig. 46 also by $d_a d_b/w^2$). However, we chose a different normalization on each matrix unit in order to turn $\tau_v$ into a trace on (⟠, $*_v$). Later we want to study the center $(Z_h,*_v)$ which is, as we have seen, a subalgebra of (⟠, $*_v$). Therefore $\tau_v$ provides a faithful trace on $(Z_h,*_v)$ but it has in general different weightings on its simple summands. To construct from $\tau_v$ a trace which sends one-dimensional projections to one will in particular be possible in the case that ${}_N\mathcal{X}_N$ is non-degenerately braided, see Subsect. 6.1 below.

This is also the case in the following most basic example of the double triangle algebra. Let $N$ be a type III factor and $G$ a finite group acting freely on $N$. Consider the subfactor $N\subset N\rtimes G = M$. Then (with the minimal choice for $\mathcal{X}$) the double triangle algebra for this subfactor is just the group algebra of $G$. That is, the double triangle algebra is spanned by the group elements linearly. The horizontal product is given by the group multiplication. By Theorem 4.4 we conclude that the minimal central projections in ⟠, and thus the irreducible M-M sectors, are labelled by the irreducible representations of $G$. (Of course, this identification of the M-M sectors is well-known for that example.) The functional $\tau_v$ gives the standard trace on the group algebra, and the vertical product corresponds to the ordinary tensor product of group representations.

5. α-Induction, Chiral Generators and Modular Invariants

5.1. Relating α-induction to chiral generators. We will now define chiral generators for braided subfactors and prove that the concepts of α-induction and chiral generators are essentially the same. For the rest of this paper we deal with the following

Assumption 5.1. In addition to Assumption 4.1 we now assume that the system ${}_N\mathcal{X}_N$ is braided.
With the braiding we have now the notion of α-induction in the sense of Subsect. 3.3. From now on we are also dealing with crossings of N-N wires and mixed crossings introduced in Subsect. 3.3. We now present chiral generators as our version of a definition Ocneanu originally introduced for systems of bimodules arising from A-D-E Dynkin diagrams in [39]. The construction of the chiral generator is similar to the “Ocneanu projection” in the tube algebra [38] (see also [12]) and also related to Izumi’s analysis [20] of the tube algebra in terms of sectors for the Longo–Rehren inclusion [33].

Definition 5.2. For any $\lambda\in{}_N\mathcal{X}_N$, we define an element $p^+_\lambda$ in ⟠ by the diagram on the left-hand side of Fig. 47 and call it a chiral generator. Similarly, we also define $p^-_\lambda$ by exchanging over- and undercrossings.

Fig. 47. A chiral generator $p^+_\lambda$

Note that we do not assume the non-degeneracy of the braiding for the definition of $p^+_\lambda$. We obtain the diagram in the middle from the one on the left-hand side in Fig. 47 by applying two IBFE’s. This way we obtain two twists in the semi-circular thin wires which correspond to the label $\lambda$ but they give complex conjugate phases so that their effects cancel out. The diagram on the right-hand side is obtained by Lemma 4.3 and application of the IBFE, and this shows that our definition coincides with Ocneanu’s notion given in his setting. Since $\alpha^\pm_\lambda\iota = \iota\lambda$ we find that each irreducible subsector $[\beta]$ of $[\alpha^\pm_\lambda]$ is the equivalence class of some $\beta\in{}_M\mathcal{X}_M$ if $\lambda\in{}_N\mathcal{X}_N$. Therefore we have the sector decomposition $[\alpha^\pm_\lambda] = \sum_{\beta\in{}_M\mathcal{X}_M}\langle\beta,\alpha^\pm_\lambda\rangle[\beta]$, and we can consider $[\alpha^\pm_\lambda]$ as an element of the M-M fusion algebra. The relation between the sector decomposition of $[\alpha^\pm_\lambda]$ and the chiral generator is clarified by the following result.

Theorem 5.3. For any $\lambda\in{}_N\mathcal{X}_N$, we have $d_\lambda^{-1}p^\pm_\lambda = \sum_{\beta\in{}_M\mathcal{X}_M} d_\beta^{-1}\langle\beta,\alpha^\pm_\lambda\rangle e_\beta$, and consequently $p^\pm_\lambda = d_\lambda\Phi([\alpha^\pm_\lambda])$. In particular, $p^\pm_\lambda$ is in the center $Z_h$.

Proof. We only show the statement for the +-sign; the other case is analogous. First we fix $a,b\in{}_N\mathcal{X}_M$ and $\lambda\in{}_N\mathcal{X}_N$. For each $\beta\in{}_M\mathcal{X}_M$ we choose orthonormal bases of isometries $T^{\beta;i}_{\bar b a}\in\mathrm{Hom}(\beta,\bar b a)$, $i = 1,2,\dots,N^\beta_{\bar b,a}$, so that $\sum_{\beta,i} T^{\beta;i}_{\bar b a}(T^{\beta;i}_{\bar b a})^* = 1_M$. Using Frobenius reciprocity, we obtain an orthonormal basis of isometries

$$L_b^{-1}(T^{\beta;i}_{\bar b a}) = d_a^{1/2}d_b^{1/2}d_\beta^{-1/2}\,b(T^{\beta;i}_{\bar b a})^*\,\bar r_b \in\mathrm{Hom}(a,b\beta).$$

Here we chose an isometry $\bar r_b\in\mathrm{Hom}(\mathrm{id}_N,b\bar b)$ such that there is an isometry $R_b\in\mathrm{Hom}(\mathrm{id}_M,\bar b b)$ subject to relations $b(R_b)^*\bar r_b = d_b^{-1}1_N$ and $\bar b(\bar r_b)^*R_b = d_b^{-1}1_M$, as usual. Choosing also orthonormal bases of isometries $V_{\beta;\ell}\in\mathrm{Hom}(\beta,\alpha^+_\lambda)$, $\ell = 1,2,\dots,\langle\beta,\alpha^+_\lambda\rangle$, for each $\beta\in{}_M\mathcal{X}_M$ (so that $\sum_{\beta,\ell}V_{\beta;\ell}V_{\beta;\ell}^* = 1_M$) we find that $\{b(V_{\beta;\ell})L_b^{-1}(T^{\beta;i}_{\bar b a})\}_{\beta,i,\ell}$ gives an orthonormal basis of isometries of $\mathrm{Hom}(a,b\alpha^+_\lambda)$. Finally, using Proposition 3.1, we find that putting

$$s_{\beta;\ell,i} = \varepsilon^+(\lambda,b\iota)^*\,b(V_{\beta;\ell})\,L_b^{-1}(T^{\beta;i}_{\bar b a}) = \sqrt{\frac{d_a d_b}{d_\beta}}\;\varepsilon^+(\lambda,b\iota)^*\,b\big(V_{\beta;\ell}(T^{\beta;i}_{\bar b a})^*\big)\,\bar r_b$$

defines an orthonormal basis of isometries $\{s_{\beta;\ell,i}\}_{\beta,i,\ell}$ of $\mathrm{Hom}(a,\lambda b)$. Then we have for any $\ell = 1,2,\dots,\langle\beta,\alpha^+_\lambda\rangle$ by the elementary relations for the intertwiners $R_b$, $\bar r_b$ the following identity:

$$T^{\beta;i}_{\bar b a}(T^{\beta;i}_{\bar b a})^* = d_b^2\;\bar b(\bar r_b)^*\,\bar b b\big(T^{\beta;i}_{\bar b a}V_{\beta;\ell}^*\big)\,R_b R_b^*\,\bar b b\big(V_{\beta;\ell}(T^{\beta;i}_{\bar b a})^*\big)\,\bar b(\bar r_b) = \frac{d_\beta d_b}{d_a}\;\bar b\big(s_{\beta;\ell,i}^*\,\varepsilon^+(\lambda,b\iota)^*\big)\,R_b R_b^*\,\bar b\big(\varepsilon^+(\lambda,b\iota)\,s_{\beta;\ell,i}\big).$$

The second line yields graphically exactly the diagram in Fig. 48 where we read the diagram from the left to the right in order to interpret it as an intertwiner in the left-to-right picture of ⟠.

Fig. 48. Diagram for $T^{\beta;i}_{\bar b a}(T^{\beta;i}_{\bar b a})^*$

Now let us take on both sides first the summation over $i = 1,2,\dots,N^\beta_{\bar b,a}$. Then the left-hand side gives exactly the $\mathrm{Hom}(\bar b a,\bar b a)$ part of $e_\beta$ as defined in Definition 4.2. Next we divide by $d_\beta$ and we proceed with the summation over $\ell = 1,2,\dots,\langle\beta,\alpha^+_\lambda\rangle$ and $\beta\in{}_M\mathcal{X}_M$. On the left-hand side we obtain the $\mathrm{Hom}(\bar b a,\bar b a)$ part of $\sum_\beta d_\beta^{-1}\langle\beta,\alpha^+_\lambda\rangle e_\beta$ this way, and this is exactly the $\mathrm{Hom}(\bar b a,\bar b a)$ part of $\Phi([\alpha^+_\lambda])$. On the right-hand side we now have a summation over the full basis $\{s_{\beta;\ell,i}\}_{\beta,i,\ell}$ of $\mathrm{Hom}(a,\lambda b)$. Therefore we can use the graphical convention of Fig. 39 to put a small semi-circle around the wire labelled by $\lambda$ at the two trivalent vertices. This gives us a factor $\sqrt{d_a d_b}/d_\lambda$ so that only a factor $d_\lambda^{-1}$ remains from the original prefactor in Fig. 48. Thus, by repeating the above procedure for all $a,b\in{}_N\mathcal{X}_M$ and making finally the summation over $a,b\in{}_N\mathcal{X}_M$, we obtain on the left the full $\Phi([\alpha^+_\lambda])$ whereas the right-hand side gives graphically the diagram in Fig. 49. The diagram on the left-hand side in Fig. 47 is obtained from Fig. 49, up to the factor $d_\lambda$, by a topological move. □

Fig. 49. The image $\Phi([\alpha^+_\lambda]) = \sum_\beta d_\beta^{-1}\langle\beta,\alpha^+_\lambda\rangle e_\beta$

Note that it was not clear from the definition that the chiral generators are in the center $Z_h$, but Theorem 5.3 proves this centrality as it states that $p^\pm_\lambda$ is a linear combination of $e_\beta$’s. Also note that if $\alpha^\pm_\lambda$ is irreducible then $p^\pm_\lambda$ is a (horizontal) projection; however, if $\alpha^\pm_\lambda$ is not irreducible, then $p^\pm_\lambda$ is a sum over projections with weight coefficients arising from the nature of the isomorphism $\Phi$ in Definition 4.5. Two of us [4, Subsect. 3.3] established a relative braiding between the two kinds of α-induction, which holds in a fairly general context. (It depends neither on chiral locality nor even on finite depth.) Theorem 5.3 now shows that Ocneanu’s relative braiding [39] is a special case of the analysis in [4, Subsect. 3.3]. From Theorem 5.3 and the homomorphism property of α-induction [2, Lemma 3.10], we obtain immediately the following

Corollary 5.4. The chiral generators $p^\pm_\lambda$ are in $Z_h$. For $\lambda,\mu\in{}_N\mathcal{X}_N$, we have

$$p^\pm_\lambda *_v p^\pm_\mu = \sum_{\nu\in{}_N\mathcal{X}_N}\frac{d_\lambda d_\mu}{d_\nu}\,N^\nu_{\lambda,\mu}\,p^\pm_\nu.$$
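For orientation, the short computation behind this corollary can be written out as follows. The LaTeX sketch below combines only Theorem 5.3, the isomorphism property of $\Phi$ from Theorem 4.4, and the homomorphism property $[\alpha^\pm_\lambda][\alpha^\pm_\mu] = \sum_\nu N^\nu_{\lambda,\mu}[\alpha^\pm_\nu]$ of α-induction cited above; it is a reading aid rather than part of the original text.

```latex
\begin{align*}
p^\pm_\lambda *_v p^\pm_\mu
  &= d_\lambda d_\mu\,\Phi([\alpha^\pm_\lambda]) *_v \Phi([\alpha^\pm_\mu])
     && \text{(Theorem 5.3)} \\
  &= d_\lambda d_\mu\,\Phi\big([\alpha^\pm_\lambda][\alpha^\pm_\mu]\big)
     && \text{($\Phi$ is multiplicative)} \\
  &= d_\lambda d_\mu \sum_{\nu} N^{\nu}_{\lambda,\mu}\,\Phi([\alpha^\pm_\nu])
     && \text{(homomorphism property of $\alpha$-induction)} \\
  &= \sum_{\nu} \frac{d_\lambda d_\mu}{d_\nu}\,N^{\nu}_{\lambda,\mu}\,p^\pm_\nu
     && \text{(Theorem 5.3 again, $\Phi([\alpha^\pm_\nu]) = p^\pm_\nu/d_\nu$).}
\end{align*}
```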

Note that this corollary shows that the M-M fusion rule algebra contains two representations of the N-N fusion rule algebra.

5.2. Modular invariants for braided subfactors. We will now show that a notion of “modular invariant” arises naturally for a braided subfactor. We first note that under Assumption 5.1, we have matrices $Y = (Y_{\lambda,\mu})$ and $T = (T_{\lambda,\mu})$ for the system $\Delta = {}_N\mathcal{X}_N$ as in Subsect. 2.2. We recall that in the case that the braiding is non-degenerate, the matrix $S = w^{-1/2}Y$ is unitary and the matrices $S$ and (the diagonal) $T$ obey the Verlinde modular algebra by Theorem 2.5. Motivated by the results of [4] we now construct a certain matrix $Z$ commuting with $Y$ and $T$ such that it is a “modular invariant mass matrix” in the usual sense of conformal field theory whenever the braiding is non-degenerate.

Definition 5.5. For a system $\mathcal{X}$ satisfying Assumption 5.1, we define a matrix $Z$ with entries $Z_{\lambda,\mu} = \langle\alpha^+_\lambda,\alpha^-_\mu\rangle$, $\lambda,\mu\in{}_N\mathcal{X}_N$.

As $Z_{\lambda,\mu}$ is by definition a dimension and since $\alpha^\pm_{\mathrm{id}_N} = \mathrm{id}_M$ is irreducible by virtue of the factor property of $M$, the matrix elements obviously satisfy the conditions in Eq. (1) for $\lambda,\mu\in{}_N\mathcal{X}_N$, where the label “0” refers as usual to the identity morphism $\mathrm{id}_N\in{}_N\mathcal{X}_N$. We relate the definition of $Z$ to the chiral generators by the following

Theorem 5.6. We have the identity

$$ Z_{\lambda,\mu} = \frac{w}{d_\lambda d_\mu}\, \varphi_h\bigl(p_\lambda^+ *_h p_\mu^-\bigr), \qquad \lambda,\mu \in {}_N\mathcal{X}_N . \tag{25} $$

Therefore the number Zλ,µ is graphically represented as in Fig. 50.

Fig. 50. Graphical representation of Zλ,µ (a sum over b, c with prefactor db dc/(w dλ dµ) of a diagram built from the thick wires b, c and the wires αλ+, αµ−)


J. Böckenhauer, D. E. Evans, Y. Kawahigashi

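Proof. From Theorem 5.3 we obtain

$$ \sum_{\beta \in {}_M\mathcal{X}_M} \frac{1}{d_\beta}\, \langle \alpha_\lambda^+ , \beta \rangle\, e_\beta = \frac{1}{d_\lambda}\, p_\lambda^+ . $$

Hence

$$ \sum_{\beta \in {}_M\mathcal{X}_M} \frac{1}{d_\beta^2}\, \langle \alpha_\lambda^+ , \beta \rangle \langle \alpha_\mu^- , \beta \rangle\, e_\beta = \frac{1}{d_\lambda d_\mu}\, p_\lambda^+ *_h p_\mu^- . $$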

Application of the horizontal state ϕh of Definition 4.6 and multiplication by w yields Eq. (25) since [αλ+] and [αµ−] decompose into sectors [β] with β ∈ M XM, and by Lemma 4.7. Now the right-hand side of Eq. (25) is given graphically by the diagram on the left in Fig. 51, and we can slide around the trivalent vertices to obtain the diagram on the right-hand side. Without changing the scalar value we can now open the outer wire labelled by b and close it on the other side, as in Fig. 29. This way we obtain the picture in Fig. 50 up to a 90 degree rotation, but a rotation is irrelevant for the scalar values. □

Fig. 51. The scalar w dλ^{−1} dµ^{−1} ϕh(pλ+ ∗h pµ−)

We remark that we can apply Lemma 4.3 to replace the two horizontal wires labelled by b by a summation over a thin wire ν, and this way we obtain an equivalent diagram from Fig. 50 for the matrix elements Zλ,µ, which only consists of thin (N-N) wires λ, µ, ν and thick (N-M) wires b, c but which does not involve very thick (M-M) wires labelled by α-induced morphisms αλ+, αµ−.

Theorem 5.7. The matrix Z of Definition 5.5 commutes with the matrices Y and T of the system N XN.

Proof. Using the diagram for the matrix elements Yν,λ in Fig. 19, the sum Σ_λ Yν,λ Zλ,µ can be represented by the diagram on the left-hand side of Fig. 52. Using Lemma 4.3 and also the trick to turn around the small arcs given in Fig. 40, we obtain the right-hand side of Fig. 52. We can now slide around the lower trivalent vertex of the wire ν to obtain the left-hand side of Fig. 53. Next, we can use Lemma 4.3 to replace the two parallel horizontal wires with labels a and b by a summation over a thin wire ρ. Similarly, but the other way round, we can then use Lemma 4.3 to replace the summation over the wire with label λ by two straight horizontal wires with labels b and c. This way we obtain the right-hand side of Fig. 53. Now it should be clear how to proceed: We slide around the upper trivalent vertex of the wire µ counter-clockwise. Then we see that the result gives us the diagram for Σ_ρ Zν,ρ Yρ,µ, rotated by 90 degrees. This proves YZ = ZY.

Fig. 52. Commutation of Y and Z

Fig. 53. Commutation of Y and Z

Next we show commutativity of Z with T. We have to show ωλ Zλ,µ = Zλ,µ ωµ. Using the graphical expression for the statistics phase ωλ on the left-hand side of Fig. 17, we can represent ωλ Zλ,µ by the left-hand side of Fig. 54. We now start to rotate the upper oval consisting of the thick wires b and c in a clockwise direction. This way we obtain the right-hand side of Fig. 54. It should now be clear that, if we continue rotating to a full rotation by 360 degrees, then we remove the twist from the wire λ whereas we obtain a twist in the wire µ which is of the type displayed on the right-hand side of Fig. 17, thus representing ωµ. Hence TZ = ZT. □

Fig. 54. Commutation of T and Z

The following is now immediate by Thm. 2.5, which states that in the non-degenerate case the matrices S = w^{−1/2} Y and T provide a unitary representation of the modular group SL(2; Z).


Corollary 5.8. If the braiding on N XN is non-degenerate, then the matrix Z defined in Definition 5.5 is a modular invariant mass matrix.

In conformal field theory the SL(2; Z) action arises from a “reparametrization of the torus”, and in the parameter space S corresponds to a 90 degree rotation and T to twisting the torus. Note that this action is nicely reflected in the proof of Thm. 5.7.

5.3. Generating property of α-induction. We now show that both kinds of α-induction generate the whole M-M fusion rule algebra (or the sector algebra in our terminology of [2–4]) in the case that the N-N system is non-degenerately braided. That is, from now on we work with the following

Assumption 5.9. In addition to Assumption 5.1, we now assume that the braiding on N XN is non-degenerate in the sense of Definition 2.3.

With Assumption 5.9 we can now use the “killing ring”, the orthogonality relation of Fig. 20, and this turns out to be a powerful tool in the graphical framework. The following theorem states in particular that any minimal central projection eβ of the double triangle algebra with the horizontal product ∗h appears in the linear decomposition of some pλ+ ∗v pµ−. Such a generating property of pj±’s has also been noticed by Ocneanu in the setting of the lectures [39]. We can apply his idea of the proof (which is not included in the notes [39]) to our situation without essential change.

Theorem 5.10. Under Assumption 5.9, we have Σ_{λ,µ∈N XN} pλ+ ∗v pµ− = w 1h in the double triangle algebra, and consequently

$$ \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} d_\lambda d_\mu\, [\alpha_\lambda^+][\alpha_\mu^-] = w \sum_{\beta \in {}_M\mathcal{X}_M} d_\beta\, [\beta] \tag{26} $$

in the M-M fusion rule algebra. In particular, for any β ∈ M XM the sector [β] is a subsector of [αλ+][αµ−] for some λ, µ ∈ N XN.

Proof. The sum Σ_{λ,µ} pλ+ ∗v pµ− is given graphically by the left-hand side of Fig. 55. By using Lemma 4.3 for the two parallel vertical wires c on the bottom and the IBFE moves we obtain the right-hand side of Fig. 55. For the summation over the thin wire λ we can use Lemma 4.3 again to obtain the left-hand side of Fig. 56. Now we can slide around the right trivalent vertex of the wire µ, and this yields the right-hand side of Fig. 56. Next we can use the trick of Fig. 40 to turn around the small arcs from the wire µ to the wire b. This yields a factor dµ/db. Then we can proceed with the summation over b, using Lemma 4.3 once more, and this gives us the left-hand side of Fig. 57. Now we observe that the summation over µ provides a killing ring, and hence we obtain a factor wδν,0. The normalization convention for the small arcs yields another factor 1/dc, and hence we get exactly the right-hand side of Fig. 57. The circular wire c cancels the factor 1/dc, and thus we are left exactly with the global index w times a summation over two straight horizontal wires, and the latter is exactly the horizontal unit 1h = Σ_β eβ. The rest is application of the isomorphism Φ. □

Fig. 55. The sum Σ_{λ,µ} pλ+ ∗v pµ−

Fig. 56. The sum Σ_{λ,µ} pλ+ ∗v pµ−

Fig. 57. The sum Σ_{λ,µ} pλ+ ∗v pµ−

We remark that the non-degeneracy of the braiding played an essential role in the proof. In fact there are counter-examples showing that the generating property does not hold in general if the braiding is degenerate (e.g. the finite group case discussed in Sect. 4.2 of [2] serves as such an example).

6. Representations of the M-M Fusion Rule Algebra

6.1. Irreducible representations of the M-M fusion rules. We next study in detail the algebra (Zh, ∗v) or, equivalently, the M-M fusion rule algebra in the case that the N-N system is non-degenerately braided. Note that Assumption 5.1 implies in particular that the N-N fusion rule algebra is Abelian. However, the M-M fusion rules are in general non-commutative, and therefore so is the center (Zh, ∗v). We are now going to decompose (Zh, ∗v) into simple matrix algebras. Note that such a decomposition of (Zh, ∗v) is equivalent to the determination of the irreducible representations of the M-M fusion rule algebra.


Fig. 58. The vector $\Omega^{\lambda,\mu}_{b,c,t,s} \in \mathcal{H}_{\lambda,\mu}$

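We need some preparation. As in the graphical setting for the double triangle algebra, we can consider the diagram in Fig. 58 as a vector $\Omega^{\lambda,\mu}_{b,c,t,s} \in \mathcal{H}_{\lambda,\mu}$, where $\mathcal{H}_{\lambda,\mu}$ is the vector space $\mathcal{H}_{\lambda,\mu} = \bigoplus_{a \in {}_N\mathcal{X}_M} \mathrm{Hom}(\lambda\bar\mu, a\bar a)$, $\lambda, \mu \in {}_N\mathcal{X}_N$. Here $b, c \in {}_N\mathcal{X}_M$, and $t \in \mathrm{Hom}(\lambda, b\bar c)$ and $s \in \mathrm{Hom}(\bar\mu, c\bar b)$ are isometries labelling the two trivalent vertices in Fig. 58. It is important to notice that we do not allow coefficients depending on $a$: The same isometries $t$, $s$ are used in each block $\mathrm{Hom}(\lambda\bar\mu, a\bar a)$ of $\mathcal{H}_{\lambda,\mu}$. We next define the subspace $H_{\lambda,\mu} \subset \mathcal{H}_{\lambda,\mu}$ spanned by such vectors:

$$ H_{\lambda,\mu} = \mathrm{span}\{ \Omega^{\lambda,\mu}_{b,c,t,s} \mid b, c \in {}_N\mathcal{X}_M,\ t \in \mathrm{Hom}(\lambda, b\bar c),\ s \in \mathrm{Hom}(\bar\mu, c\bar b) \}. $$

Take two such vectors $\Omega^{\lambda,\mu}_{b,c,t,s}$ and $\Omega^{\lambda,\mu}_{b',c',t',s'}$. We define an element $|\Omega^{\lambda,\mu}_{b',c',t',s'}\rangle\langle\Omega^{\lambda,\mu}_{b,c,t,s}|$ of the double triangle algebra by the diagram in Fig. 59. (This notation will be justified by Lemma 6.1 below.)

Fig. 59. The element $|\Omega^{\lambda,\mu}_{b',c',t',s'}\rangle\langle\Omega^{\lambda,\mu}_{b,c,t,s}|$ of the double triangle algebra

We now choose orthonormal bases of isometries $t^{\lambda;i}_{b,\bar c} \in \mathrm{Hom}(\lambda, b\bar c)$, $i = 1, 2, \ldots, N^{\lambda}_{b,\bar c}$, for each $\lambda, b, c$ and put $\Omega^{\lambda,\mu}_{\xi} = \Omega^{\lambda,\mu}_{b,c,t^{\lambda;i}_{b,\bar c},t^{\bar\mu;j}_{c,\bar b}}$ with some multi-index $\xi = (b, c, i, j)$. Varying $\xi$, we obtain a generating set of $H_{\lambda,\mu}$ which will, however, in general not be a basis as the vectors $\Omega^{\lambda,\mu}_{\xi}$ may be linearly dependent in $\mathcal{H}_{\lambda,\mu}$. Let $\Phi^{\lambda,\mu}_j \in H_{\lambda,\mu}$, $j = 1, 2$, be any two vectors. We can expand them as $\Phi^{\lambda,\mu}_j = \sum_\xi c_j^\xi\, \Omega^{\lambda,\mu}_\xi$ with $c_j^\xi \in \mathbb{C}$, but note that this expansion is not unique. We now define an element $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2|$ of the double triangle algebra by

$$ |\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2| = \sum_{\xi,\xi'} c_1^{\xi} (c_2^{\xi'})^* \, |\Omega^{\lambda,\mu}_{\xi}\rangle\langle\Omega^{\lambda,\mu}_{\xi'}| , \tag{27} $$

and a scalar $\langle\Phi^{\lambda,\mu}_2, \Phi^{\lambda,\mu}_1\rangle \in \mathbb{C}$,

$$ \langle\Phi^{\lambda,\mu}_2, \Phi^{\lambda,\mu}_1\rangle = \frac{1}{d_\lambda d_\mu}\, \tau_v\bigl(|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2|\bigr). \tag{28} $$

Lemma 6.1. Equation (27) extends to a sesqui-linear map $H_{\lambda,\mu} \times H_{\lambda,\mu} \to Z_h$ which is positive definite: If $|\Phi^{\lambda,\mu}\rangle\langle\Phi^{\lambda,\mu}| = 0$ for some $\Phi^{\lambda,\mu} \in H_{\lambda,\mu}$ then $\Phi^{\lambda,\mu} = 0$. Consequently, Eq. (28) defines a scalar product turning $H_{\lambda,\mu}$ into a Hilbert space.

Proof. As in particular $\Phi^{\lambda,\mu}_j \in \mathcal{H}_{\lambda,\mu}$, we can write $\Phi^{\lambda,\mu}_j = \bigoplus_a (\Phi^{\lambda,\mu}_j)_a$ with $(\Phi^{\lambda,\mu}_j)_a \in \mathrm{Hom}(\lambda\bar\mu, a\bar a)$ according to the direct sum structure of $\mathcal{H}_{\lambda,\mu}$, $j = 1, 2$. Assume $\Phi^{\lambda,\mu}_1 = 0$. Then clearly $(\Phi^{\lambda,\mu}_1)_a = 0$ for all $a$. Now the $\mathrm{Hom}(a\bar a, a'\bar a')$ part of $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2|$ is given by $(\Phi^{\lambda,\mu}_1)_{a'} (\Phi^{\lambda,\mu}_2)_a^*$, hence $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2| = 0$. A similar argument applies to $\Phi^{\lambda,\mu}_2$, and hence the element $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2|$ is independent of the linear expansions of the $\Phi^{\lambda,\mu}_j$’s. Therefore Eq. (27) defines a sesqui-linear map from $H_{\lambda,\mu} \times H_{\lambda,\mu}$ into the double triangle algebra. Now assume $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_1| = 0$. Then in particular $(\Phi^{\lambda,\mu}_1)_a (\Phi^{\lambda,\mu}_1)_a^* = 0$ for all $a \in {}_N\mathcal{X}_M$, and hence $\Phi^{\lambda,\mu}_1 = 0$, proving strict positivity. That the sesqui-linear form $\langle\cdot,\cdot\rangle$ on $H_{\lambda,\mu}$ is non-degenerate follows now from positive definiteness of $\tau_v$. It remains to show that $|\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2| \in Z_h$. But this is clear since any element of the form in Fig. 33 can be “pulled through” the diagram in Fig. 59 by using the IBFE’s. □

Lemma 6.2. We have the identity in Fig. 60 for intertwiners in $\mathrm{Hom}(\lambda'\bar\mu', \lambda\bar\mu)$, $\lambda, \mu, \lambda', \mu' \in {}_N\mathcal{X}_N$.

Fig. 60. An identity in $\mathrm{Hom}(\lambda'\bar\mu', \lambda\bar\mu)$ (left-hand side: a sum over $a$ with weights $d_a$; right-hand side: $\delta_{\lambda,\lambda'}\delta_{\mu,\mu'} \langle\Omega^{\lambda,\mu}_{b,c,t,s}, \Omega^{\lambda,\mu}_{b',c',t',s'}\rangle$ times a diagram of the wires $\lambda$ and $\mu$)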

Proof. Using Lemma 4.3 we can replace the left-hand side of Fig. 60 by the left-hand side of Fig. 61. Next we can slide one of the trivalent vertices of the wire $\nu$ around the wire $a$. Using the identity of Fig. 40, we obtain a factor $d_\nu/d_a$, and we can now proceed with the summation over $a$, again using Lemma 4.3. Using also Lemma 4.3 for the parallel wires $c$, $c'$ as well as $b$ and $b'$, we obtain the right-hand side of Fig. 61. Using now Lemma 4.3 once again for the wires $\rho$, $\tau$, we can pull the wire $\nu$ over the middle expansion. The summation over $\nu$ yields a killing ring which disconnects the picture into two halves, one is an intertwiner in $\mathrm{Hom}(\lambda', \lambda)$ and the other in $\mathrm{Hom}(\bar\mu', \bar\mu)$. Hence we obtain a factor $\delta_{\lambda,\lambda'}\delta_{\mu,\mu'}$, and we conclude that the left-hand side in Fig. 60 represents a scalar intertwiner $\delta_{\lambda,\lambda'}\delta_{\mu,\mu'}\, \zeta\, 1_N \in \mathrm{Hom}(\lambda\bar\mu, \lambda\bar\mu)$, $\zeta \in \mathbb{C}$. To compute that scalar, we can start again on the left-hand side of Fig. 60, now putting $\lambda' = \lambda$ and $\mu' = \mu$. The diagram on the left-hand side of Fig. 62 clearly represents an intertwiner of the same scalar value $\zeta$. We can now use the move of Fig. 29 which does not change the scalar value: We open the wire $a$ on the left and close it on the right. The resulting diagram is regularly isotopic to the diagram on the right-hand side of Fig. 62. Thus we are left with exactly the diagram for $d_\lambda^{-1} d_\mu^{-1} \tau_v(|\Omega^{\lambda,\mu}_{b',c',t',s'}\rangle\langle\Omega^{\lambda,\mu}_{b,c,t,s}|)$. This proves the lemma. □

Fig. 61. The identity in $\mathrm{Hom}(\lambda'\bar\mu', \lambda\bar\mu)$

Fig. 62. Computation of the scalar $\zeta$

The following is now immediate by the definition of the vertical product.

Corollary 6.3. Let $\Phi^{\lambda,\mu}_j \in H_{\lambda,\mu}$ and $\Psi^{\lambda',\mu'}_j \in H_{\lambda',\mu'}$, $j = 1, 2$. Then we have

$$ |\Phi^{\lambda,\mu}_1\rangle\langle\Phi^{\lambda,\mu}_2| *_v |\Psi^{\lambda',\mu'}_1\rangle\langle\Psi^{\lambda',\mu'}_2| = \delta_{\lambda,\lambda'}\delta_{\mu,\mu'}\, \langle\Phi^{\lambda,\mu}_2, \Psi^{\lambda,\mu}_1\rangle\, |\Phi^{\lambda,\mu}_1\rangle\langle\Psi^{\lambda,\mu}_2| \tag{29} $$

in the double triangle algebra.

Whenever $H_{\lambda,\mu} \neq \{0\}$ we can choose an orthonormal basis $\{E^{\lambda,\mu}_i\}_{i=1}^{\dim H_{\lambda,\mu}}$. Then Lemma 6.1 and Corollary 6.3 tell us that $\{ |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| \}_{\lambda,\mu,i,j}$ forms a set of non-zero matrix units in $(Z_h, *_v)$. However, we do not know yet whether this is a complete set.

Lemma 6.4. Let $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in \mathcal{H}_{\lambda,\mu}$ denote the vector which is given graphically by the diagram in Fig. 63, where $\lambda, \mu \in {}_N\mathcal{X}_N$, $b, c \in {}_N\mathcal{X}_M$, and $t \in \mathrm{Hom}(\lambda, b\bar c)$, $s \in \mathrm{Hom}(\bar\mu, c\bar b)$ are isometries. Then in fact $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in H_{\lambda,\mu}$.

Fig. 63. The vector $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in \mathcal{H}_{\lambda,\mu}$

Proof. Using Lemma 4.3 and also the trick of Fig. 40, we can draw the diagram on the left-hand side in Fig. 64 for $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s}$. Now let us look at the part of this picture above the dotted line. In a suitable Frobenius annulus, this part can be read for fixed $\nu$ and $a$ as $\sum_i \lambda\bar\mu(t_i)\, \varepsilon^-(\nu, \lambda\bar\mu)\, t_i^*$, and the sum runs over a full orthonormal basis of isometries $t_i$ in the Hilbert space $\mathrm{Hom}(\nu, b\bar\beta\bar a)$ since we have the summation over $a'$. Next we look at the part above the dotted line on the right-hand side of Fig. 64. This can be similarly read for fixed $\nu$ and $a$ as $\sum_j \lambda\bar\mu(s_j)\, \varepsilon^-(\nu, \lambda\bar\mu)\, s_j^*$, where the sum runs over another full orthonormal basis of isometries $s_j \in \mathrm{Hom}(\nu, b\bar\beta\bar a)$. Since such bases $\{t_i\}$ and $\{s_j\}$ are related by a unitary matrix transformation (this is again just “unitarity of 6j-symbols”), the left and right-hand side represent the same vector in $\mathcal{H}_{\lambda,\mu}$. Then, using again Lemma 4.3 and also the trick of Fig. 40, we conclude that the vector $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s}$ can be represented by the diagram on the left-hand side of Fig. 65. Now let us look at the part of the diagram inside the dotted box. In a suitable Frobenius annulus, this can be interpreted as an intertwiner in $\mathrm{Hom}(\lambda\bar\mu, a'\bar a')$. But any element in this space can be written as a linear combination of elements constructed from basis isometries $t^{\lambda;i}_{a',\bar c'}$, $t^{\bar\mu;j}_{c',\bar a'}$, as indicated in the dotted box on the right-hand side of Fig. 65. The coefficients in its linear expansion depend only on $c'$, $i$, $j$ for fixed $a'$, $\beta$, $b$, $c$, $t$, $s$, but certainly not on $a$. This shows that $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s}$ is a linear combination of $\Omega^{\lambda,\mu}_\xi$’s, thus $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in H_{\lambda,\mu}$. □

Fig. 64. The vector $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in \mathcal{H}_{\lambda,\mu}$

Fig. 65. The vector $\pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s} \in H_{\lambda,\mu}$

The map $\Omega^{\lambda,\mu}_{b,c,t,s} \mapsto \pi_{\lambda,\mu}(e_\beta)\Omega^{\lambda,\mu}_{b,c,t,s}$ clearly defines a linear map $\pi_{\lambda,\mu}(e_\beta) : H_{\lambda,\mu} \to \mathcal{H}_{\lambda,\mu}$ since it is just a linear intertwiner multiplication on each $\mathrm{Hom}(\lambda\bar\mu, a\bar a)$ block. From Lemma 6.4 we now learn that $\pi_{\lambda,\mu}(e_\beta)$ is in fact a linear operator on $H_{\lambda,\mu}$. With the definition of the vertical product we now immediately obtain the following

Corollary 6.5. With orthonormal bases $\{E^{\lambda,\mu}_i\}_{i=1}^{\dim H_{\lambda,\mu}}$ of each $H_{\lambda,\mu}$ we have

$$ |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| *_v e_\beta *_v |E^{\lambda',\mu'}_k\rangle\langle E^{\lambda',\mu'}_l| = \delta_{\lambda,\lambda'}\delta_{\mu,\mu'}\, \langle E^{\lambda,\mu}_j, \pi_{\lambda,\mu}(e_\beta) E^{\lambda,\mu}_k\rangle\, |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_l| . \tag{30} $$

Since $Z_h$ is spanned by the $e_\beta$’s, we obtain a map $\pi_{\lambda,\mu} : Z_h \to B(H_{\lambda,\mu})$ by linear extension, and we obtain similarly the following

Corollary 6.6. The map $\pi_{\lambda,\mu} : Z_h \to B(H_{\lambda,\mu})$ is a representation of $(Z_h, *_v)$.

We now tackle the problem of completeness of the system of matrix units.

Definition 6.7. For $\lambda, \mu \in {}_N\mathcal{X}_N$ we define the vertical projector $q_{\lambda,\mu}$ in the double triangle algebra by

$$ q_{\lambda,\mu} = \frac{\sqrt{d_\lambda d_\mu}}{w^2} \sum_{\xi} |\Omega^{\lambda,\mu}_\xi\rangle\langle\Omega^{\lambda,\mu}_\xi| . \tag{31} $$

Fig. 66. A vertical projector $q_{\lambda,\mu}$

This is given graphically in Fig. 66. (Clearly, we can use Lemma 4.3 twice to obtain an equivalent picture which does not involve pieces of very thick wires corresponding to $\alpha_\lambda^+$ and $\alpha_\mu^-$.) We are now ready to prove the main result of this section.

Theorem 6.8. Under Assumption 5.9, the vertical projector $q_{\lambda,\mu}$ is either zero or a minimal central projection in $(Z_h, *_v)$. We have mutual orthogonality $q_{\lambda,\mu} *_v q_{\lambda',\mu'} = \delta_{\lambda,\lambda'}\delta_{\mu,\mu'}\, q_{\lambda,\mu}$ and the vertical projectors sum up to the multiplicative identity of $(Z_h, *_v)$: $\sum_{\lambda,\mu \in {}_N\mathcal{X}_N} q_{\lambda,\mu} = e_0$. Moreover, $q_{\lambda,\mu} = 0$ whenever $Z_{\lambda,\mu} = 0$ and otherwise the simple summand $q_{\lambda,\mu} *_v Z_h$ is a full $Z_{\lambda,\mu} \times Z_{\lambda,\mu}$ matrix algebra, where $Z_{\lambda,\mu}$ is the $(\lambda, \mu)$-entry of the modular invariant mass matrix of Definition 5.5.

Fig. 67. The sum $\sum_{\lambda,\mu} q_{\lambda,\mu}$

Fig. 68. The sum $\sum_{\lambda,\mu} q_{\lambda,\mu}$

Proof. It follows from Corollary 6.3 that $q_{\lambda,\mu} *_v q_{\lambda',\mu'} = 0$ unless $\lambda = \lambda'$ and $\mu = \mu'$. We now show that $\sum_{\lambda,\mu} q_{\lambda,\mu} = e_0$. (We denote $e_0 \equiv e_{\mathrm{id}_M}$.) The sum is given graphically by the left-hand side in Fig. 67. A twofold application of Lemma 4.3 yields the right-hand side in Fig. 67. Applying Lemma 4.3 twice again, we obtain the left-hand side of Fig. 68. We can now slide the upper trivalent vertex of the wire $\mu$ around to obtain the right-hand side of Fig. 68. Next we can use the trick of Fig. 40 to turn around the small arcs at the trivalent vertices of the wire $\mu$, yielding a factor $d_\mu/d_c$. This gives the right- and left-hand side of Fig. 68. Since we have a summation over $c$, we can again use Lemma 4.3, and this gives us the left-hand side of Fig. 69. As we have a prefactor $d_\mu$, the summation over $\mu$ provides a killing ring, and only $\tau = \mathrm{id}_N$ survives it: We obtain a factor $w\delta_{\tau,0}$. Now our picture starts to collapse. The factor $\delta_{\tau,0}$ yields, with the normalization convention as in Fig. 39, a factor $d_\nu^{-1}\delta_{\nu,\rho}$. Since our picture is now disconnected into two parts which represent intertwiners in $\mathrm{Hom}(a, d)$, they are scalars and we obtain a factor $\delta_{a,d}$. This gives us the right-hand side of Fig. 69.

Fig. 69. The sum $\sum_{\lambda,\mu} q_{\lambda,\mu}$

Therefore we are now left with a sum over scalars times two straight vertical wires labelled by $a$, representing a scalar intertwiner in $\mathrm{Hom}(a\bar a, a\bar a)$. The scalar value of each connected part of the picture is $\delta_{i,j}\sqrt{d_\nu d_b/d_a}$, therefore we can compute the prefactor as

$$ \frac{1}{w d_a} \sum_{b,\nu} \sum_{i,j=1}^{N^{\nu}_{a,\bar b}} \left( \delta_{i,j} \sqrt{\frac{d_\nu d_b}{d_a}} \right)^{2} = \frac{1}{w d_a^2} \sum_{b,\nu} d_b\, N^{\nu}_{a,\bar b}\, d_\nu = \frac{1}{w d_a} \sum_{b} d_b^2 = \frac{1}{d_a} . $$

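Thus we are left with a sum over two vertical straight wires with label $a$ and prefactor $d_a^{-1}$. This is $e_0$.

Next, we can expand each vector $\Omega^{\lambda,\mu}_\xi \in H_{\lambda,\mu}$ in an orthonormal basis as

$$ \Omega^{\lambda,\mu}_\xi = \sum_{i=1}^{\dim H_{\lambda,\mu}} \langle E^{\lambda,\mu}_i, \Omega^{\lambda,\mu}_\xi\rangle\, E^{\lambda,\mu}_i . $$

Inserting this in Eq. (31) yields

$$ q_{\lambda,\mu} = \frac{\sqrt{d_\lambda d_\mu}}{w^2} \sum_{\xi} \sum_{i,j=1}^{\dim H_{\lambda,\mu}} \langle E^{\lambda,\mu}_i, \Omega^{\lambda,\mu}_\xi\rangle \langle\Omega^{\lambda,\mu}_\xi, E^{\lambda,\mu}_j\rangle\, |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| . $$

Now using $\sum_{\lambda,\mu} q_{\lambda,\mu} = e_0$ and Corollary 6.3 we compute

$$ \delta_{i,j}\, |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| = \sum_{\lambda',\mu'} |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_i| *_v q_{\lambda',\mu'} *_v |E^{\lambda,\mu}_j\rangle\langle E^{\lambda,\mu}_j| = \frac{\sqrt{d_\lambda d_\mu}}{w^2} \sum_{\xi} \langle E^{\lambda,\mu}_i, \Omega^{\lambda,\mu}_\xi\rangle \langle\Omega^{\lambda,\mu}_\xi, E^{\lambda,\mu}_j\rangle\, |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| , $$

hence

$$ q_{\lambda,\mu} = \sum_{i=1}^{\dim H_{\lambda,\mu}} |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_i| . $$

Thus $q_{\lambda,\mu}$ is a projection and we also have $e_0 = \sum_{\lambda,\mu} \sum_{i=1}^{\dim H_{\lambda,\mu}} |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_i|$. Hence for any $\beta \in {}_M\mathcal{X}_M$ we find

$$ e_\beta = e_0 *_v e_\beta *_v e_0 = \sum_{\lambda,\mu} \sum_{i,j=1}^{\dim H_{\lambda,\mu}} \langle E^{\lambda,\mu}_i, \pi_{\lambda,\mu}(e_\beta) E^{\lambda,\mu}_j\rangle\, |E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j| $$

by Corollary 6.5. Thus each $e_\beta$ can be expanded in our matrix units, and since $Z_h$ is spanned by the $e_\beta$’s we conclude that $\{|E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_j|\}_{\lambda,\mu,i,j}$ is a complete system of matrix units. It follows that the non-zero vertical projectors are minimal central projections in $(Z_h, *_v)$, and that the simple summand $q_{\lambda,\mu} *_v Z_h$ is a full $\dim H_{\lambda,\mu} \times \dim H_{\lambda,\mu}$ matrix algebra. It remains to show $\dim H_{\lambda,\mu} = Z_{\lambda,\mu}$. The dimension of $H_{\lambda,\mu}$ can be counted as

$$ \dim H_{\lambda,\mu} = \sum_{i=1}^{\dim H_{\lambda,\mu}} \langle E^{\lambda,\mu}_i, E^{\lambda,\mu}_i\rangle = \sum_{i=1}^{\dim H_{\lambda,\mu}} \frac{1}{d_\lambda d_\mu}\, \tau_v\bigl(|E^{\lambda,\mu}_i\rangle\langle E^{\lambda,\mu}_i|\bigr) = \frac{1}{d_\lambda d_\mu}\, \tau_v(q_{\lambda,\mu}) . $$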

Fig. 70. The number $d_\lambda^{-1} d_\mu^{-1} \tau_v(q_{\lambda,\mu})$

Now $d_\lambda^{-1} d_\mu^{-1} \tau_v(q_{\lambda,\mu})$ is given graphically in Fig. 70. By the IBFE’s we can pull out the circle with label $a$ which gives us another factor $d_a$. We can therefore proceed with the summation over $a$, and this yields a factor $w$, the global index, and then we are left exactly with the picture in Fig. 50. □

Note that we learn from the proof that putting $\mathrm{Tr}_v(z) = \sum_{\lambda,\mu} d_\lambda^{-1} d_\mu^{-1} \tau_v(q_{\lambda,\mu} *_v z)$ for $z \in Z_h$ gives a matrix trace $\mathrm{Tr}_v$ on $(Z_h, *_v)$ which sends the minimal projections to one. Next we have learnt that for all $\lambda, \mu$ with $Z_{\lambda,\mu} \neq 0$, the $\pi_{\lambda,\mu}$’s are the irreducible representations of $(Z_h, *_v)$ and hence the $\pi_{\lambda,\mu} \circ \Phi$’s are the irreducible representations of the M-M fusion rule algebra.

Corollary 6.9. Under Assumption 5.9, the M-M fusion rule algebra is commutative if and only if $Z_{\lambda,\mu} \in \{0, 1\}$ for all $\lambda, \mu \in {}_N\mathcal{X}_N$.

Corollary 6.10. Under Assumption 5.9, the total number of morphisms in ${}_M\mathcal{X}_M$ is equal to $\mathrm{tr}(Z\, {}^tZ) = \sum_{\lambda,\mu \in {}_N\mathcal{X}_N} Z_{\lambda,\mu}^2$.

6.2. The left action on M-N sectors. The decomposition of $(Z_h, *_v)$ into simple matrix algebras is equivalent to the irreducible decomposition of the “regular representation” (up to multiplicities given as the dimensions) of the M-M fusion rule algebra, i.e. the representation obtained by its action on itself as a vector space. There is another representation of the M-M fusion rule algebra, namely the one obtained by its (left) action on the M-N sectors. This is what we study in the following.

We define the vector space $K$ by $K = \bigoplus_{a \in {}_N\mathcal{X}_M} \mathrm{Hom}(\mathrm{id}_N, a\bar a)$. Note that each block consists just of scalar multiples of the isometries $\bar r_a$ but we need the explicit form of $K$. We define basis vectors $v_{\bar a} \in K$ corresponding to $d_a^{-1/2} \bar r_a$ in each block $\mathrm{Hom}(\mathrm{id}_N, a\bar a)$. We can display each $v_{\bar a}$ graphically by a thick wire “cap” with label $a \in {}_N\mathcal{X}_M$ together with a prefactor $1/d_a$. We furnish $K$ with a Hilbert space structure by putting $\langle v_{\bar a}, v_{\bar b}\rangle = \delta_{a,b}$.

For each $a \in {}_N\mathcal{X}_M$ we define a vector $\varrho(e_\beta) v_{\bar a}$ by putting

$$ \varrho(e_\beta) v_{\bar a} = d_\beta \sum_{b} N^{\bar b}_{\beta,\bar a}\, v_{\bar b} . \tag{32} $$

We can display the right-hand side graphically as in Fig. 71. The left and right-hand side in Fig. 71 are the same because both sides are scalar multiples of the isometry $\bar r_a$ in each block $\mathrm{Hom}(\mathrm{id}_N, a\bar a)$. The map $\varrho(e_\beta) : v_{\bar a} \mapsto \varrho(e_\beta) v_{\bar a}$ clearly defines a linear operator on $K$ for each $\beta \in {}_M\mathcal{X}_M$, and we can extend the map $e_\beta \mapsto \varrho(e_\beta)$ linearly to $Z_h$. Graphically, this action of $Z_h$ is quite similar to the vertical product. (Note that there also appears a factor $d_a$ cancelling the $d_a^{-1}$ in the definition of $v_{\bar a}$ when gluing the picture for $v_{\bar a}$ on top of that for $e_\beta$.)

Fig. 71. The element $\varrho(e_\beta) v_{\bar a} \in K$

We observe that the map $\varrho : e_\beta \mapsto \varrho(e_\beta)$ extends linearly to a representation of $(Z_h, *_v)$ as we can compute for $\beta, \beta' \in {}_M\mathcal{X}_M$ as follows:

$$ \begin{aligned} \varrho(e_\beta)\bigl(\varrho(e_{\beta'}) v_{\bar a}\bigr) &= \varrho(e_\beta)\, d_{\beta'} \sum_{b} N^{\bar b}_{\beta',\bar a}\, v_{\bar b} = d_\beta d_{\beta'} \sum_{b,c} N^{\bar c}_{\beta,\bar b}\, N^{\bar b}_{\beta',\bar a}\, v_{\bar c} \\ &= d_\beta d_{\beta'} \sum_{\beta'',c} N^{\beta''}_{\beta,\beta'}\, N^{\bar c}_{\beta'',\bar a}\, v_{\bar c} = d_\beta d_{\beta'} \sum_{\beta''} d_{\beta''}^{-1}\, N^{\beta''}_{\beta,\beta'}\, \varrho(e_{\beta''}) v_{\bar a} = \varrho(e_\beta *_v e_{\beta'}) v_{\bar a} , \end{aligned} $$

where we used associativity of the sector product in the third equality. Consequently, $\varrho(q_{\lambda,\mu})$ is a projection onto a subspace, and $\varrho|_{\varrho(q_{\lambda,\mu})K}$ is a subrepresentation.

Lemma 6.11. We have $K = \bigoplus_{\lambda \in {}_N\mathcal{X}_N} K_\lambda$, where $K_\lambda = \varrho(q_{\lambda,\lambda}) K$.

Proof. The vector $\varrho(q_{\lambda,\mu}) v_{\bar a} \in K$ is given graphically by the left-hand side of Fig. 72. Now note that the upper part of the diagram represents an intertwiner in $\mathrm{Hom}(\mathrm{id}_N, \lambda\bar\mu)$. Therefore it vanishes unless $\lambda = \mu$ and then it must be a scalar multiple of $\bar r_\lambda$. Hence we can insert a term $\bar r_\lambda \bar r_\lambda^*$ which corresponds graphically to the disconnection of the wires as on the right-hand side in Fig. 72 and multiplication by $d_\lambda^{-1}$. Then the factor $d_b d_c/d_\lambda$ disappears because of the normalization convention for trivalent vertices with small arcs, and we are left exactly with the right-hand side of Fig. 72. It follows in particular that $\varrho(q_{\lambda,\mu}) K = 0$ unless $\lambda = \mu$. The claim follows now since the vertical projectors sum up to $e_0$ and $\varrho(e_0)$ is the identity on $K$. □

Fig. 72. The vector $\varrho(q_{\lambda,\mu}) v_{\bar a} \in K$

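We are now ready to prove the following

Theorem 6.12. The representation $\varrho$ of $(Z_h, *_v)$ on $K$ obtained by Eq. (32) is unitarily equivalent to the direct sum over the irreducible representations $\pi_{\lambda,\lambda}$:

$$ \varrho \simeq \bigoplus_{\lambda \in {}_N\mathcal{X}_N} \pi_{\lambda,\lambda} . \tag{33} $$

Consequently, the representation $\varrho \circ \Phi$ of the M-M fusion rule algebra which is obtained by the action on the M-N sectors arising from ${}_M\mathcal{X}_N$ decomposes into irreducibles as $\varrho \circ \Phi \simeq \bigoplus_\lambda \pi_{\lambda,\lambda} \circ \Phi$.

Proof. For $b, c \in {}_N\mathcal{X}_M$ and isometries $t \in \mathrm{Hom}(\lambda, b\bar c)$ and $s \in \mathrm{Hom}(\bar\lambda, c\bar b)$ we define a vector $k^{\lambda}_{b,c,t,s} \in K$ by the diagram in Fig. 73. Using again intertwiner bases, we also put $k^{\lambda}_{\xi} = k^{\lambda}_{b,c,t^{\lambda;i}_{b,\bar c},t^{\bar\lambda;j}_{c,\bar b}}$ with some multi-index $\xi = (b, c, i, j)$. It follows from the right-hand side in Fig. 72 that $K_\lambda \subset \mathrm{span}\{k^\lambda_\xi \mid \xi = (b, c, i, j)\}$. Conversely, we obtain by Lemma 6.2 that $\varrho(q_{\mu,\mu}) k^\lambda_\xi = 0$ unless $\lambda = \mu$, hence $K_\lambda = \mathrm{span}\{k^\lambda_\xi \mid \xi = (b, c, i, j)\}$. With $\lambda = \mu$, closing the wires on the bottom and on the top on both sides of Fig. 60 yields $\langle k^\lambda_\xi, k^\lambda_{\xi'}\rangle = d_\lambda \langle\Omega^{\lambda,\lambda}_\xi, \Omega^{\lambda,\lambda}_{\xi'}\rangle$. Hence linear extension of $\Omega^{\lambda,\lambda}_\xi \mapsto d_\lambda^{-1/2} k^\lambda_\xi$ defines a unitary operator $U_\lambda : H_{\lambda,\lambda} \to K_\lambda$. Note that $U_\lambda$ means multiplication by $\bar r_\lambda$ from the right in each block $\mathrm{Hom}(\lambda\bar\lambda, a\bar a)$ and this corresponds graphically to closing the open ends of the wires $\lambda$ in Fig. 58 and multiplying by $d_\lambda^{-1/2}$. Therefore we find

$$ U_\lambda \bigl[ \pi_{\lambda,\lambda}(e_\beta)\, \Omega^{\lambda,\lambda}_\xi \bigr] = d_\lambda^{-1/2}\, \varrho_\lambda(e_\beta)\, k^\lambda_\xi = \varrho_\lambda(e_\beta) \bigl[ U_\lambda\, \Omega^{\lambda,\lambda}_\xi \bigr] , $$

where $\varrho_\lambda = \varrho|_{K_\lambda}$. Thus $\varrho_\lambda \simeq \pi_{\lambda,\lambda}$. □

Fig. 73. The vector $k^{\lambda}_{b,c,t,s} \in K$

Since the dimension of $K$ is the cardinality of ${}_N\mathcal{X}_M$ we immediately obtain the following

Corollary 6.13. Under Assumption 5.9, the total number of morphisms in ${}_N\mathcal{X}_M$ (or, equivalently, in ${}_M\mathcal{X}_N$) is equal to $\mathrm{tr}(Z) = \sum_{\lambda \in {}_N\mathcal{X}_N} Z_{\lambda,\lambda}$.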
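The counting statements of Corollaries 6.9, 6.10 and 6.13 can be tried out numerically on SU(2) examples. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it uses the standard Kac-Peterson S- and T-matrices of SU(2) at level k and the well-known D4 (level 4) and E6 (level 10) mass matrices from the A-D-E classification, which are taken from the general literature rather than derived here. The helper names `su2_modular_data`, `mass_matrix` and `summary` are hypothetical.

```python
import numpy as np


def su2_modular_data(k):
    """Standard S and T matrices of SU(2) at level k (assumed input data).

    Rows and columns are labelled 0..k (twice the spin).
    """
    n = k + 2
    lab = np.arange(k + 1)
    S = np.sqrt(2.0 / n) * np.sin(np.pi * np.outer(lab + 1, lab + 1) / n)
    h = lab * (lab + 2) / (4.0 * n)      # conformal dimensions
    c = 3.0 * k / n                      # central charge
    T = np.diag(np.exp(2j * np.pi * (h - c / 24.0)))
    return S, T


def mass_matrix(k, blocks):
    """Build Z from blocks (mult, labels), i.e. mult * |sum of characters|^2."""
    Z = np.zeros((k + 1, k + 1), dtype=int)
    for mult, labels in blocks:
        for a in labels:
            for b in labels:
                Z[a, b] += mult
    return Z


def summary(k, blocks):
    """Check modular invariance and evaluate the traces of Z."""
    S, T = su2_modular_data(k)
    Z = mass_matrix(k, blocks)
    return {
        "commutes_with_S": bool(np.allclose(Z @ S, S @ Z)),
        "commutes_with_T": bool(np.allclose(Z @ T, T @ Z)),
        "tr_Z": int(np.trace(Z)),            # count of Corollary 6.13
        "tr_ZtZ": int(np.trace(Z @ Z.T)),    # count of Corollary 6.10
        "commutative": bool(Z.max() <= 1),   # criterion of Corollary 6.9
    }


# D4 at level 4: |chi_0 + chi_4|^2 + 2 |chi_2|^2
D4 = summary(4, [(1, [0, 4]), (2, [2])])
# E6 at level 10: |chi_0 + chi_6|^2 + |chi_3 + chi_7|^2 + |chi_4 + chi_10|^2
E6 = summary(10, [(1, [0, 6]), (1, [3, 7]), (1, [4, 10])])
print(D4)
print(E6)
```

For these inputs the traces are tr(Z) = 4 and tr(Z tZ) = 8 for D4, and tr(Z) = 6 and tr(Z tZ) = 12 for E6. The entry Z = 2 in the D4 matrix is what makes the criterion of Corollary 6.9 fail there.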


7. Conclusions and Outlook We have analyzed braided type III subfactors and shown that in the non-degenerate case the system of M-M system is entirely generated by α-induction, including in particular the subsectors of Longo’s canonical endomorphism γ . We established that in that case the essential structural information about the M-M fusion rules is encoded in the modular invariant mass matrix Z. Our setting applies in particular to SU(n) loop group subfactors π 0 (LI SU(n))00 ⊂ π 0 (LI G)00 of conformal inclusions SU(n)k ⊂ G1 and π0 (LI SU(n))00 ⊂ π0 (LI SU(n))00 oσ Zm which were analyzed by α-induction in [3,4]. Here π 0 denotes the level 1 vacuum representation of the loop group LG, π0 the level k representation of LSU(n), I ⊂ S 1 is an interval, and σ is a “simple current”. The braiding here arises from the localized transportable endomorphisms of the net of local algebras A(I ) = π0 (LI SU(n))00 . Since it follows from Wassermann’s work [45] that these endomorphisms obey the SU(n)k fusion rules and from the conformal spin-statistics theorem [18] that the statistics phases are given by ωλ = e2πihλ with hλ denoting the SU(n)k conformal dimensions, it follows that the S- and T-matrices from the braiding coincide with the well-known S- and T-matrices which transform the conformal characters. Therefore Theorem 5.10 shows in particular that Condition 4 in Proposition 5.1 in [4] holds in the setting of conformal inclusions, and in turn it proves Conjecture 7.1 in [4]. It also follows that in the setting of Proposition 5.1 in [4], the sum of eβ for “marked vertices” [β] (the M-M sectors arising from the positive energy representations of P the ambient theory) correspond to the projections appearing in the decomposition of λ,µ pλ+ ∗h pµ− , the “ambichiral projector” in Ocneanu’s language. Similarly, the results of this paper also prove Conjecture 7.2 in [4]. 
Theorem 5.10 shows in particular that there are no counter-examples for conformal inclusions where the M-M sectors arising from the conformal inclusion subfactor are not generated by the mixed α-induction (cf. [48]). Xu made some computation in [47] (see also [3]) to find an example with non-commutative fusion rules of (M-M) sectors generated by the image of only one “positive” induction for subfactors arising from conformal inclusions. By Corollary 6.9, it is at least very easy to find examples of a non-commutative entire M-M fusion rule algebra. The D4 case mentioned in [4, Subsect. 6.1] is one such example. In fact, the whole D2n series arising from simple current extension of SU (2)4n−4 also give examples of non-commutative M-M fusion rule algebras. Such non-commutativity for Deven has been also pointed out in the setting of [39] (though not in the context of conformal inclusions or simple current extensions). We will present the details and more analysis about SU(n)k loop group subfactors, including the treatment of all SU(2) modular invariants, in a forthcoming publication [5]. Our treatment can now also incorporate the type II invariants which were not considered in [3,4], because we dropped the chiral locality condition which automatically forces the mass matrix Z to be type I, i.e. block-diagonal. Let us remark that we could also have defined Zλ,µ with exchanged ±-signs in Def. 5.5, and this would correspond to replacing Z by the transposed mass matrix tZ. It is not hard to see that all our calculations go through with tZ as well. That means α-induction for a (non-degenerately) braided subfactor determines actually two modular invariant mass matrices Z and tZ, and it is not clear to us at present whether they can in fact be different in our general setting. (We have Z = tZ for all SU(2) and SU(3) modular invariants). A notion of subequivalent paragroups was introduced in [27]. 
Since N XN and M XM are equivalent systems of endomorphisms by definition, α-induction produces an example of a subequivalent paragroup. That is, for λ ∈ N XN , the subfactors αλ± (M) ⊂ M


are subequivalent to λ(N) ⊂ N. Various examples in [27] arise from this construction. Indeed, the most fundamental example in [27] comes from the Goodman–de la Harpe–Jones subfactor [17, Sect. 4.5] with index $3 + \sqrt{3}$. In our current setting, this example comes from the conformal inclusion $SU(2)_{10} \subset SO(5)_1$ and shows that the two paragroups with principal graph $E_6$ are subequivalent to the paragroup with principal graph $A_{11}$. As a corollary of a rigidity theorem presented by Ocneanu in Madras in January 1997, there are only finitely many paragroups with global index below a given upper bound. This implies that for a given paragroup we have only finitely many subequivalent paragroups since their global indices are less than or equal to the global index of the given paragroup. In the context of modular invariants, a simple argument of Gannon [16] shows $\sum_{\lambda,\mu} Z_{\lambda,\mu} \le 1/S_{0,0}^2$, which in turn implies that there are only finitely many modular invariant mass matrices Z for a given unitary representation of SL(2; Z), where the S-matrix satisfies the standard relations $S_{0,\lambda} \ge S_{0,0} > 0$. As for a non-degenerately braided system of morphisms this bound coincides with the global index, $w = 1/S_{0,0}^2$, and in view of the relations between modular invariants and subfactors elaborated in this paper, it is natural to expect that these two finiteness arguments are not completely unrelated. We consider a good understanding of the connections between these two arguments to be highly desirable. Let us finally remark that in a recent paper of Rehren [42] the embeddings of left and right chiral observables in a 2D conformal field theory are studied. Such embeddings give rise to subfactors and in turn to coupling matrices which are invariant mass matrices if the Fourier transform matrix of the chiral fusion rules is modular.
As these subfactors are quite different from ours which appear in a framework considering chiral observables only, the relation between the two approaches also calls for a coherent understanding. Acknowledgement. Part of this work was done during visits of the third author to the University of Wales Swansea and the University of Wales Cardiff, a visit of the second author to the University of Tokyo, visits of all the three to Università di Roma “Tor Vergata” and visits of the first two authors to the Australian National University, Canberra. We thank R. Longo, L. Zsido, J. E. Roberts, D. W. Robinson and these institutions for their hospitality. We would like to thank S. Goto for showing us a preliminary manuscript of [39], M. Izumi for explaining [20], T. Kohno, H. Murakami, and T. Ohtsuki for helpful explanations on topological invariants, and J. E. Roberts for his comments.Y.K. thanks A. Ocneanu for various conversations on [39] at the Fields Institute in 1995. We acknowledge the financial support of the Australian National University, CNR (Italy), EPSRC (U.K.), the EU TMR Network in Non-Commutative Geometry, Grant-in-Aid for Scientific Research, Ministry of Education (Japan), the Kanagawa Academy of Science and Technology Research Grants, the Università di Roma “Tor Vergata”, and the University of Wales.

References 1. Bannai, E., Ito, T.: Algebraic combinatorics I: Association schemes. New York: Benjamin/Cummings, 1984 2. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. I. Commun. Math. Phys. 197, 361–386 (1998) 3. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. II. Commun. Math. Phys. 200, 57–103 (1999) 4. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. III. Commun. Math. Phys. 205, 183–228 (1999) 5. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Preprint math.OA/9907149, to appear in Commun. Math. Phys. (1) 6. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E classification of minimal and A1 conformal invariant theories. Commun. Math. Phys. 113, 1–26 (1987) 7. Di Francesco, P., Zuber, J.-B.: SU (N ) lattice integrable models associated with graphs. Nucl. Phys. B338, 602–646 (1990)


8. Di Francesco, P., Zuber, J.-B.: SU (N ) lattice integrable models and modular invariance. In: Recent Developments in Conformal Field Theories, Trieste 1989, Singapore: World Scientific 1990, pp. 179–215 9. Doplicher, S., Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989) 10. Evans, D.E., Kawahigashi, Y.: Orbifold subfactors from Hecke algebras. Commun. Math. Phys. 165, 445–484 (1994) 11. Evans, D.E., Kawahigashi, Y.: Quantum symmetries on operator algebras, Oxford: Oxford University Press, 1998 12. Evans, D.E., Kawahigashi, Y.: Orbifold subfactors from Hecke algebras II – Quantum doubles and braiding. Commun. Math. Phys. 196, 331–361 (1998) 13. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. II. Rev. Math. Phys. Special issue, 113–157 (1992) 14. Fröhlich, J., Gabbiani, F.: Braid statistics in local quantum theory. Rev. Math. Phys. 2, 251–353 (1990) 15. Fröhlich, J., King, C.: Two-dimensional conformal field theory and three-dimensional topology. Int. J. Mod. Phys A4, 5321–5399 (1989) 16. Gannon, T.: WZW commutants, lattices and level–one partition functions. Nucl. Phys. B396, 708–736 (1993) 17. Goodman, F., de la Harpe, P., Jones, V.F.R.: Coxeter graphs and towers of algebras. MSRI publications 14, Berlin: Springer 1989 18. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996) 19. Izumi, M.: Subalgebras of infinite C ∗ -algebras with finite Watatani indices II: Cuntz–Krieger algebras. Duke Math. J. 91, 409–461 (1998) 20. Izumi, M.: The structure of sectors associated with the Longo–Rehren inclusions. Kyoto Univ. Preprint No. 99-14 21. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983) 22. Jones, V.F.R.: Planar algebras. Preprint mth. QA/9909027 23. Kato, A.: Classification of modular invariant partition functions in two dimensions. Modern Phys. Lett A 2, 585–600 (1987) 24. 
Kauffman, L.: Knots and Physics. Singapore: World Scientific, 1991 25. Kauffman, L., Lins, S.L.: Temperley-Lieb recoupling theory and invariants of 3-manifolds. Princeton, NJ: Princeton University Press, 1994 26. Kawahigashi, Y.: On flatness of Ocneanu’s connections on the Dynkin diagrams and classification of subfactors. J. Funct. Anal. 127, 63–107 (1995) 27. Kawahigashi, Y.: Quantum Galois correspondence for subfactors. To appear in J. Funct. Anal. 28. Kirillov, A.N., Reshetikhin, N.Yu.: Representations of the algebra Uq (sl2 ), q-orthogonal polynomials and invariants for links. In: Kaˇc, V.G. (ed.): Infinite dimensional Lie algebras and groups. Advanced Series in Mathematical Physics, Vol. 7 1988, pp. 285–339, 29. Kosaki, H.: Extension of Jones theory on index to arbitrary factors. J. Funct. Anal. 66, 123–140 (1986) 30. Longo, R.: Index of subfactors and statistics of quantum fields II. Commun. Math. Phys. 130, 285–309 (1990) 31. Longo, R.: Minimal index of braided subfactors. J. Funct. Anal. 109, 98–112 (1991) 32. Longo, R.: A Duality for Hopf algebras and for subfactors I. Commun. Math. Phys. 159, 133–150 (1994) 33. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 34. Moore, G., Seiberg, N.: Polynomial equations for rational conformal field theories. Phys. Lett. B212, 451–460 (1988) 35. Moore, G., Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177– 254 (1989) 36. Murakami, J., Ohtsuki, T.: Topological quantum field theory for the universal quantum invariant. Commun. Math. Phys. 188, 501–520 (1997) 37. Ocneanu, A.: An invariant coupling between 3-manifolds and subfactors, with connections to topological and conformal quantum field theory. Preprint 1991 38. Ocneanu, A.: Chirality for operator algebras . (Notes recorded by Y. Kawahigashi) In: Subfactors (ed. H. Araki, et al.), Singapore: World Scientific, 1994, pp. 39–63 39. 
Ocneanu, A.: Paths on Coxeter diagrams: From Platonic solids and singularities to minimal models and subfactors. (Notes recorded by S. Goto), In preparation. 40. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The algebraic theory of superselection sectors, Palermo 1989, (ed. D. Kastler), Singapore: World Scientific, 1990, pp. 333–355 41. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132, 461–483 (1990) 42. Rehren, K.-H.: Chiral observables and modular invariants. Preprint hep-th/9903262 43. Roberts, J.E.: Local cohomology and superselection structure. Commun. Math. Phys. 51, 107–119 (1976) 44. Turaev, V.G., Wenzl, H.: Quantum invariants of 3-manifolds associated with classical simple Lie algebras. Internat. J. Math. 4, 323–358 (1993)


45. Wassermann, A.: Operator algebras and conformal field theory III: Fusion of positive energy representations of SU (N) using bounded operators. Invent. Math. 133, 467–538 (1998) 46. Witten, E.: Gauge theories and integrable lattice models. Nucl. Phys. B322, 629–697 (1989) 47. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998) 48. Xu, F.: Applications of braided endomorphisms from conformal inclusions. Internat. Math. Research Notices, (1998) pp. 5–23, and the erratum to Theorem 3.4 (1) on p. 437 of the same volume Communicated by H. Araki

Commun. Math. Phys. 208, 489 – 506 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Hodge Integrals and Degenerate Contributions

R. Pandharipande

Department of Mathematics, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]

Received: 6 April 1999 / Accepted: 14 July 1999

Abstract: Hodge integral techniques are used to compute the degree 1 degenerate contributions of curves of arbitrary genus in the Gromov–Witten theory of 3-folds. In the Calabi–Yau case, the contributions are compared to related M-theoretic calculations. In the Fano case, the contributions suggest new integrality conditions.

0. Introduction

0.1. Let X be a nonsingular, projective, 3-dimensional complex algebraic variety. Let $\overline{M}_{g_D,n}(X,\beta)$ be the moduli space of stable maps from genus $g_D$ curves to X representing the homology class $\beta \in H_2(X,\mathbb{Z})$. The Gromov–Witten invariants of X are defined via tautological integrals over these moduli spaces of maps (against their virtual fundamental classes):

$$N^{g_D}_\beta(\gamma_1,\ldots,\gamma_n) = \int_{[\overline{M}_{g_D,n}(X,\beta)]^{vir}} \prod_{i=1}^{n} \mathrm{ev}_i^*(\gamma_i),$$

where $\mathrm{ev}_i$ is the i-th evaluation map and $\gamma_i \in H^*(X,\mathbb{Z})$. As the moduli spaces are Deligne–Mumford stacks, the Gromov–Witten invariants take values in $\mathbb{Q}$. Let $T_X$ and $K_X$ be the tangent bundle and the canonical class of X. For a 3-fold, the dimension formula shows the virtual dimensions do not depend upon the genus:

$$\dim^{vir}(\overline{M}_{g_D}(X,\beta)) = 3g_D - 3 + \chi(T_X) = -K_X\cdot\beta.$$

If we restrict attention to a fixed curve class $\beta \in H_2(X,\mathbb{Z})$, there are two basic possibilities: $-K_X\cdot\beta = 0$ or $-K_X\cdot\beta > 0$ (the negative case is of no interest here since then the Gromov–Witten invariants vanish). We will always take $\beta \neq 0$.

490

R. Pandharipande

0.2. Case −KX · β = 0. If X is Calabi–Yau, this case holds for all classes β. Let d be a positive integer. Let C ⊂ X be a nonsingular genus g < gD curve of class β/d. The moduli space M gD (X, β) contains a substack of maps with genus gD domains which factor through a d-fold cover of C. Under suitable conditions, this substack of maps covering C is a connected component of M gD (X, β). In the latter case, the contribution of C to the genus gD , class β Gromov–Witten invariant of X is well-defined. It is these degenerate contributions that are studied here. Degenerate contributions play a central role in identifying the integer quantities in the Gromov–Witten theory of X. These integrality properties remain a very mysterious part of the subject. In algebraic geometry, degenerate contributions are related to Hodge integrals over the moduli space of curves M g,n [FP]. In string theory, recent progress in the study of these contributions has been made by a link to M-theory [GV1,GV2] (see also [MM]). While the mathematical results presented here overlap with the M- theoretic results of [GV2], the precise connection between the two approaches is still not completely understood. The differences are discussed below in Sect. 0.3. Let C ⊂ X be a nonsingular genus g curve representing the class [C] ∈ H2 (X, Z). For the degenerate analysis, we assume the normal bundle to C in X is general. Consider the moduli space of maps M g+h (X, d[C]). If g = 0 or 1, this moduli space will have a connected component equal to M g+h (C, d[C]). The contribution Cg (h, d) of C to the genus g + h Gromov–Witten is thus well-defined for g = 0, 1 and all values h ≥ 0, d > 0. The above component claim relies on rigidity arguments which possibly fail for multiple covers of genus g ≥ 2 curves. However, in the degree 1 case, M g+h (X, [C]) has a component equal to M g+h (C, [C]) for all g and h. Hence, Cg (h, 1) is always well-defined. 
At present, because of the possibility of deformations in X away from C, the correct definition of $C_g(h,d)$ in general is not known to the author. The contributions in case g = 0 have recently been calculated in algebraic geometry [FP] and string theory [GV1, MM]:

$$\sum_{h=0}^{\infty} C_0(h,1)\,t^{2h} = \left(\frac{\sin(t/2)}{t/2}\right)^{-2}, \tag{1}$$

$$C_0(h,d) = d^{2h-3}\,C_0(h,1), \tag{2}$$

where C ⊂ X is a nonsingular, rigid rational curve. The contribution $C_0(0,d) = 1/d^3$ is the Aspinwall–Morrison formula which had been proven previously by several different methods [AM, M, V]. A nonsingular curve C ⊂ X is rigid if $H^0(C,N) = 0$, where N is the normal bundle of C in X. For rational C, rigidity is equivalent to the bundle splitting $N = O(-1)\oplus O(-1)$. Define C ⊂ X to be super-rigid if, for all non-constant maps of nonsingular curves $\mu: C' \to C$, $H^0(C',\mu^*(N)) = 0$. It is clear rigidity and super-rigidity are equivalent in the rational case, but differ for higher genus. Super-rigidity is a generic condition on the normal bundle for elliptic curves in X. Kley has informed the author his existence result for rigid elliptic curves on


complete intersection Calabi–Yau 3-folds also shows the existence of super-rigid elliptic curves [K]. The contributions $C_1(0,d)$ are easily computed for super-rigid elliptic curves C. The component of the moduli space $\overline{M}_1(X,d[C])$ corresponding to maps with image C is nonsingular of dimension 0 (and equal to $\overline{M}_1(C,d[C])$). The points of $\overline{M}_1(C,d[C])$ correspond naturally to the set of subgroups of $\mathbb{Z}\oplus\mathbb{Z}$ of index d. Hence, after accounting for automorphisms,

$$C_1(0,d) = \frac{\sigma(d)}{d} = \sum_{i|d}\frac{1}{i}$$

(see, for example, [S]). All other contributions of an elliptic curve C vanish by the following result.

Theorem 1. Let C ⊂ X be a super-rigid elliptic curve. Then, $C_1(h,d) = 0$ for all h > 0, d > 0.

This vanishing was conjectured by Gopakumar–Vafa in [GV1] and is derived in M-theory in [GV2]. The proof given here uses basic constructions related to the virtual fundamental class. The degree 1 contributions $C_g(h,1)$ take a very simple form.

Theorem 2. Let g ≥ 0.

$$\sum_{h=0}^{\infty} C_g(h,1)\,t^{2h} = \left(\frac{\sin(t/2)}{t/2}\right)^{2g-2}.$$

Theorem 2 is derived in Sect. 2 by expressing the contributions $C_g(h,1)$ as Hodge integrals over the moduli space of curves. The required integrals are then computed via geometric constructions, relations, and series manipulations. Theorem 2 is the main result of this paper. The right side of Theorem 2 was encountered before in the following related result of [FP]:

$$1 + \sum_{h\ge 1}\sum_{i=0}^{h} t^{2h}k^{i}\int_{\overline{M}_{h,1}} \psi_1^{2h-2+i}\lambda_{h-i} = \left(\frac{\sin(t/2)}{t/2}\right)^{-k-1}. \tag{3}$$

Theorem 2 gives an interpretation of these Hodge integrals in the Gromov–Witten theory of Calabi–Yau 3-folds.

0.3. M-theory predictions. The method of [GV1, GV2] is to consider limits of type IIA string theory which may be conjecturally analyzed in M-theory. A remarkable proposal is made in [GV2] for the form of the Gromov–Witten potential $\tilde{F}$ of a Calabi–Yau 3-fold X. Let

$$\tilde{F}(t,q) = \sum_{g\ge 0} t^{2g-2}\tilde{F}_g(q), \qquad \tilde{F}_g(q) = \sum_{0\neq\beta\in H_2(X,\mathbb{Z})} N^g_\beta\, q^\beta,$$


where $N^g_\beta$ is the genus g Gromov–Witten invariant of X in curve class β. The potential $\tilde{F}$ differs from the full potential by the constant map (β = 0) contribution; the constant contributions have been calculated in [FP, GV1, MM]. For each curve class $\beta \in H_2(X,\mathbb{Z})$ and genus g, there is an integer $n^g_\beta$ counting BPS states in the associated M-theory. The formula of [GV2] is:

$$\tilde{F}(t,q) = \sum_{g,\beta} n^g_\beta\, t^{2g-2}\sum_{d>0}\frac{1}{d}\left(\frac{\sin(dt/2)}{t/2}\right)^{2g-2} q^{d\beta}. \tag{4}$$

If $C^M_g(h,d)$ denotes the contribution of a single BPS state in genus g and class β to the Gromov–Witten invariant in genus g + h and class dβ, then formula (4) is equivalent to the equations:

$$\sum_{h=0}^{\infty} C^M_g(h,1)\,t^{2h} = \left(\frac{\sin(t/2)}{t/2}\right)^{2g-2},$$

$$C^M_g(h,d) = d^{2g+2h-3}\,C^M_g(h,1).$$

The first of these agrees with Theorem 2, so $C^M_g(h,1) = C_g(h,1)$. The second is a generalization of (2) to g ≥ 0. It is therefore reasonable to interpret the states $n^0_\beta$ as counting embedded (virtual) curves of genus 0 and degree β (even for the Calabi–Yau quintic these numbers $n^0_\beta$ are at best virtual because of the existence of Vainsencher's nodal rational curves). However, when specialized to genus 1, the second equation yields $C^M_1(0,d) = 1/d$ in contrast to $C_1(0,d) = \sum_{i|d} 1/i$. The (virtual) count of embedded genus 1 curves should be derived from $\tilde{F}_1$ via the multiple cover corrections $C_0(1,d)$ and $C_1(0,d)$ (as previously pursued in [BCOV]). Gromov–Witten theory would predict the resulting number to be virtually enumerative, and thus integral (this heuristic argument for integrality is not a proof). Let L be an irreducible curve class in X. Let $E^1_{dL}$ be the virtual genus 1 number in curve class dL obtained from the Gromov–Witten corrections. The M-theoretic perspective predicts a different correction of $\tilde{F}_1$ to yield integers via formula (4). The relationship is

$$n^1_{dL} = \sum_{i|d} E^1_{iL}.$$

While the numbers differ, the integrality predictions coincide. Klemm has checked the genus 1 integrality predictions hold in low degrees for several Calabi–Yau 3 folds [Kl]. No proofs of any of these integrality constraints are known to the author. To find higher genus evidence for formula (4), a direct computation of the potential F˜ in the local Calabi–Yau case (P2 with canonical bundle) for low genera and degrees has been pursued by Klemm and Zaslow [KlZ]. The Gromov–Witten invariants (in all genera) may be computed in this case by the virtual localization formula of [GP] and the holomorphic anomaly equation [BCOV]. The integrality predicted by (4) is a nontrivial constraint which is verified in all calculations. At this point, it is not clear how to define or compute the general contributions Cg (h, d). One may hope a complete understanding of Cg (h, d) will lead to an integrality property of the Gromov–Witten potential of X related to (4).
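The genus 1 comparison above can be checked numerically. The following is a minimal sketch, not taken from the paper: it counts index d subgroups of Z ⊕ Z by enumerating Hermite normal forms, recovers $C_1(0,d) = \sigma(d)/d$, and then tests that the two correction schemes produce the same genus 1 totals when $n^1_{dL} = \sum_{i|d} E^1_{iL}$. It assumes that the genus 0 terms are corrected identically in both schemes and so drop out of the comparison. All function names and the sample values of E are hypothetical.

```python
from fractions import Fraction


def divisors(n):
    """All positive divisors of n."""
    return [i for i in range(1, n + 1) if n % i == 0]


def count_index_d_subgroups(d):
    """Count index-d subgroups of Z + Z.

    Each such subgroup has a unique basis (a, b), (0, c) with a * c = d
    and 0 <= b < c (Hermite normal form); we enumerate these triples.
    """
    return sum(1 for a in divisors(d) for _b in range(d // a))


def c1_gw(d):
    """C_1(0, d): number of index-d subgroups, each weighted by 1/d."""
    return Fraction(count_index_d_subgroups(d), d)


def c1_m(d):
    """C_1^M(0, d) = d^(2g+2h-3) at g = 1, h = 0, that is 1/d."""
    return Fraction(1, d)


def genus1_total(weight, counts, d):
    """Sum over k | d of weight(k) * counts[d // k]."""
    return sum(weight(k) * counts[d // k] for k in divisors(d))


# Made-up integer values standing in for the numbers E^1_{dL}.
E = {d: d * d + 1 for d in range(1, 13)}
# The relationship stated in the text: n^1_{dL} = sum over i | d of E^1_{iL}.
n = {d: sum(E[i] for i in divisors(d)) for d in E}
```

With these definitions, `genus1_total(c1_gw, E, d)` and `genus1_total(c1_m, n, d)` agree for every d in the sample range, which is the Dirichlet convolution identity $\sigma(k)/k = \sum_{j|k} 1/j$ in disguise.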


0.4. Case $-K_X\cdot\beta > 0$. In this case, the moduli spaces $\overline{M}_{g_D}(X,\beta)$ have positive virtual dimensions. The Gromov–Witten invariants $N^{g_D}_\beta(\gamma)$ of X then depend upon a vector of cohomology classes $\gamma = (\gamma_1,\ldots,\gamma_k)$, $\gamma_i \in H^*(X,\mathbb{Z})$. Let $Y_i \subset X$ be general topological cycles dual to the classes $\gamma_i$. If we wish to identify integers in this Gromov–Witten theory, degenerate contributions again play a role. Let us assume we are in an ideal situation with respect to the moduli spaces of maps to X. Let $M^{Bir}_g(X,\beta)$ denote the moduli space of birational maps from smooth genus g domain curves. We assume first:

(i) The spaces $M^{Bir}_g(X,\beta)$ are generically reduced and of the expected dimension for all $g \le g_D$.

There is then an enumerative integer $E^{g_D}_\beta(\gamma)$ defined to equal the number of genus $g_D$ maps of class β with smooth domains meeting all the cycles $Y_i$. However, $E^{g_D}_\beta(\gamma)$ will not equal $N^{g_D}_\beta(\gamma)$. The difference arises from the following observation. For each $g < g_D$, there are $E^g_\beta(\gamma)$ maps with smooth genus g domains of class β satisfying the required incidence conditions. The Gromov–Witten invariant $N^{g_D}_\beta(\gamma)$ receives a degenerate contribution from each of these lower genus solutions (via reducible genus $g_D$ maps factoring through covers of the lower genus curves). As the genus g solution represents the class β, the covers must be of degree 1. These degenerate contributions are therefore analogous to $C_g(g_D - g, 1)$. Dimension counts show maps multiple onto their image and maps with reducible images are not expected to contribute to $N^{g_D}_\beta(\gamma)$. This is the second ideal assumption:

(ii) Maps in $\overline{M}_{g_D}(X,\beta)$ multiple onto their image or with reducible image do not satisfy incidence conditions to the cycles $Y_i$.

Let C ⊂ X be a nonsingular, genus g curve of class β satisfying incidence conditions to the cycles $Y_i$. Assume further C is infinitesimally rigid with respect to these incidence conditions. The contribution $C_g(h,X,\beta)$ of C to the Gromov–Witten invariant $N^{g+h}_\beta(\gamma)$ is then well-defined: it is found in Sect. 3 to be an integral over the moduli space $\overline{M}_{g+h}(C,[C])$. This contribution is easily seen to be independent of γ. The final ideal assumption is:

(iii) For all $g < g_D$, the solution maps counted by $E^g_\beta(\gamma)$ are nonsingular embeddings infinitesimally rigid with respect to the incidence conditions.

The ideal relation between Gromov–Witten theory and the enumerative invariants is:

$$N^{g_D}_\beta(\gamma) = \sum_{g=0}^{g_D} C_g(g_D - g, X, \beta)\,E^g_\beta(\gamma). \tag{5}$$

The validity of the relation (5) for $N^{g_D}_\beta(\gamma)$ requires assumptions (i), (ii), and (iii) for $g_D$, β, and γ. The easiest 3-fold to consider is $X = \mathbf{P}^3$. As the divisor $-K_{\mathbf{P}^3}$ is ample, $-K_{\mathbf{P}^3}\cdot\beta > 0$ for all nonzero curve classes. The moduli spaces of maps to $\mathbf{P}^3$ are easily seen to be ideal in the above sense for the genera $g_D = 0, 1, 2$, all degrees d > 0, and all γ. The


rigidity statements follow as usual from Bertini arguments (see [FuP]). Therefore, the ideal relation (5) holds in these genera. The equation

$$N^0_d = E^0_d \tag{6}$$

is well known for $\mathbf{P}^3$ (we drop γ in these equations). In joint work with Getzler and Graber, we had computed

$$C_0(1,\mathbf{P}^3,d) = \frac{1-2d}{12}, \qquad N^1_d = \frac{1-2d}{12}\,E^0_d + E^1_d. \tag{7}$$

Equation (7) was used in Getzler's study [Ge] of the genus 1 enumerative geometry of $\mathbf{P}^3$. Using Xiong's calculations of low degree genus 2 Gromov–Witten invariants of $\mathbf{P}^3$ as data, Jinzenji and Xiong conjectured the contribution equation:

$$N^2_d = \frac{3 - 11d + 10d^2}{720}\,E^0_d - \frac{4d}{24}\,E^1_d + E^2_d. \tag{8}$$

These equations led Jinzenji and Xiong to recently conjecture a general formula [J] analogous to Theorem 2:

$$\sum_{h=0}^{\infty} C_g(h,X,\beta)\,t^{2h} = \left(\frac{\sin(t/2)}{t/2}\right)^{2g-2-K_X\cdot\beta}. \tag{9}$$

The contribution $C_g(h,X,\beta)$ is calculated here by the method used in the proof of Theorem 2.

Theorem 3. The degenerate contributions $C_g(h,X,\beta)$ are determined by formula (9).

Theorem 3 and relation (5) prove formulas (6), (7), (8) for g = 0, 1, 2 and all degrees d > 0 in $\mathbf{P}^3$. For higher genera, it is known the space of curves in $\mathbf{P}^3$ may be of excess dimension. For example, the moduli space $M_3(\mathbf{P}^3,4)$ has a 17 dimensional component, but is expected to be 16 dimensional. The definition of enumerative invariants is therefore not clear from a space curve point of view. However, the invariants $E^g_\beta(\gamma)$ may still be defined by Theorem 3 from the Gromov–Witten invariants and Eq. (5). Perhaps an integrality property holds for $E^g_\beta(\gamma)$ in some general context. Algebraic 3-folds are special in Gromov–Witten theory since the (virtual) dimensions of the moduli spaces of stable maps do not depend upon the genus. A similar uniform treatment of degenerate contributions in higher dimensions will require new ideas. Graber has carried out a related degenerate analysis in the genus 0 Gromov–Witten theory of the Hilbert scheme of 2 points of $\mathbf{P}^2$ [Gr].
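The way formula (9) reproduces the coefficients in (7) and (8) can be verified by a direct series expansion. The sketch below is an illustration under stated assumptions, not the paper's method: it expands $(\sin(t/2)/(t/2))^{n}$ with exact rational arithmetic and reads off the coefficient of $t^{2h}$, using $-K_{\mathbf{P}^3}\cdot\beta = 4d$ for a degree d class. The function names and the parameter `k_beta` (standing for $-K_X\cdot\beta$) are hypothetical.

```python
from fractions import Fraction
from math import factorial


def sinc_half_series(order):
    """Coefficients of t^(2h), h = 0..order, in sin(t/2)/(t/2)."""
    # sin(x)/x = sum_h (-1)^h x^(2h) / (2h+1)!  with x = t/2.
    return [Fraction((-1) ** h, factorial(2 * h + 1) * 4 ** h)
            for h in range(order + 1)]


def series_mul(a, b, order):
    """Product of two truncated series in the variable t^2."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out


def series_inverse(a, order):
    """Multiplicative inverse of a truncated series with a[0] == 1."""
    inv = [Fraction(1)] + [Fraction(0)] * order
    for m in range(1, order + 1):
        inv[m] = -sum(a[k] * inv[m - k] for k in range(1, m + 1))
    return inv


def series_power(a, n, order):
    """Integer power (possibly negative) of a series with a[0] == 1."""
    base = a if n >= 0 else series_inverse(a, order)
    result = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(abs(n)):
        result = series_mul(result, base, order)
    return result


def contribution(g, h, k_beta):
    """Coefficient of t^(2h) on the right side of formula (9)."""
    series = sinc_half_series(h)
    return series_power(series, 2 * g - 2 + k_beta, h)[h]
```

For example, `contribution(0, 1, 4 * d)` returns $(1-2d)/12$ and `contribution(0, 2, 4 * d)` returns $(3-11d+10d^2)/720$ for each degree d, while `contribution(1, 1, 4 * d)` gives the genus 1 coefficient $-4d/24$ in (8). Setting `k_beta = 0` recovers the Calabi–Yau series of Theorem 2.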


0.5. Moduli of curves. The Hodge integral approach taken here has an application to the geometry of the moduli space of nonsingular curves $M_g$ (g ≥ 2). The tautological ring $R^*(M_g)$ is the subring of $A^*(M_g)$ generated by the κ classes (see [Mu]). The intersection calculus of $R^*(M_g)$ has a very rich structure. A detailed study by Faber of $R^*(M_g)$ for low genera has led to very precise conjectures of this ring structure [F1]. In particular, Faber has conjectured $R^*(M_g)$ is a Gorenstein ring with socle in degree g − 2. In [GeP], the (conjectural) intersection pairing of $R^*(M_g)$ is directly linked to Gromov–Witten theory via (conjectural) Virasoro constraints on the descendent potential of $\mathbf{P}^2$. The computation here of the degenerate contributions $C_g(h,1)$ leads to a formula in $R^*(M_g)$ conjectured previously by Faber from evidence for g ≤ 15.

Theorem 4. For g ≥ 2, the relation

$$\sum_{i=0}^{g-2} (-1)^i \lambda_i\,\kappa_{g-2-i} = \frac{2^{g-1}}{g!}\,\kappa_{g-2}$$

holds in $R^*(M_g)$.

1. Theorem 1

1.1. Super-rigidity. Let C ⊂ X be a nonsingular elliptic curve in a Calabi–Yau 3-fold. The normal bundle N is of rank 2 with trivial determinant. If C is rigid, a straightforward argument shows N contains a non-trivial degree 0 line sub-bundle L:

$$0 \to L \to N \to L^{-1} \to 0.$$

Conversely, such a filtration implies the rigidity of C. The curve C is super-rigid if and only if L is not a torsion element of the Picard group of C. While super-rigidity is a stronger condition on N than rigidity, it is an open condition. Super-rigidity is required for the equality of moduli spaces proven in Proposition 1. Note super-rigidity implies $H^0(C',\mu^*(N)) = 0$ for every non-constant stable map $\mu: C' \to C$. The moduli spaces $\overline{M}_{1+h}(C,d[C])$ and $\overline{M}_{1+h}(X,d[C])$ are Deligne–Mumford stacks with possibly nonreduced structures.

Proposition 1. Let C ⊂ X be a nonsingular, super-rigid elliptic curve. The space of maps $\overline{M}_{1+h}(C,d[C])$ is a union of connected components of $\overline{M}_{1+h}(X,d[C])$ for all h ≥ 0, d > 0.

Proof. There is a natural map:

$$\iota: \overline{M}_{1+h}(C,d[C]) \to \overline{M}_{1+h}(X,d[C]).$$

By the super-rigidity of C, the locus of $\overline{M}_{1+h}(X,d[C])$ corresponding to maps with support in C is a union of connected components of $\overline{M}_{1+h}(X,d[C])$. We will prove ι is an isomorphism onto these connected components. It suffices to prove a lifting property for families of stable maps over Artinian local rings A. Let ξ ∈ Spec(A) be the geometric point corresponding to the maximal ideal $\mathfrak{m} \subset A$. Let

$$\pi: F \to \mathrm{Spec}(A), \qquad \mu: F \to X$$


be a family of stable maps satisfying

$$\mu_\xi: F_\xi \to C \subset X. \tag{10}$$

We will prove µ factors through C. This lifting implies the desired isomorphism property of ι. Let $\mathcal{I}$ be the ideal sheaf of C in X. We must prove the natural map $\phi: \mu^*(\mathcal{I}) \to \mathcal{O}_F$ is zero. Certainly φ has image in $\mathfrak{m}\mathcal{O}_F$ by the assumption (10) on the geometric fiber ξ. Hence, φ induces a natural map on F:

$$\mu^*(\mathcal{I}/\mathcal{I}^2) \to \mathfrak{m}\mathcal{O}_F/\mathfrak{m}^2\mathcal{O}_F = (\mathfrak{m}/\mathfrak{m}^2)\otimes_{\mathbb{C}} \mathcal{O}_{F_\xi}. \tag{11}$$

The restriction of $\mu^*(\mathcal{I}/\mathcal{I}^2)$ to $F_\xi$ is simply $\mu_\xi^*(N^*)$. By the super-rigidity of C, the map (11) is zero. We conclude φ factors through $\mathfrak{m}^2\mathcal{O}_F$. The above argument may be used to prove the following implication: if φ factors through $\mathfrak{m}^k\mathcal{O}_F$, then φ factors through $\mathfrak{m}^{k+1}\mathcal{O}_F$. Since A is Artinian, $\mathfrak{m}$ is nilpotent. Hence, φ vanishes. □

There are two perfect obstruction theories on $\overline{M}_{1+h}(C,d[C])$ obtained from considering the moduli problem of maps to C and X respectively (see [B, BF, LT]). Let

$$\pi: F \to \overline{M}_{1+h}(C,d[C]), \qquad \mu: F \to C$$

be the universal family and universal map respectively. By super-rigidity $\pi_*\mu^*(N) = 0$ and $R^1\pi_*\mu^*(N)$ is a rank 2h bundle. The two obstruction theories differ exactly by the bundle $R^1\pi_*\mu^*(N)$. From the definition of the virtual class, we conclude:

$$C_1(h,d) = \int_{[\overline{M}_{1+h}(C,d[C])]^{vir}} c_{2h}(R^1\pi_*\mu^*(N)). \tag{12}$$

1.2. Vanishing results. Let E be any bundle on C. Consider the complex $R\pi_*\mu^*(E)$ in the derived category of coherent sheaves on $\overline{M}_{1+h}(C,d[C])$. Let L be a π-relatively ample polarization on F. We may find an exact sequence of bundles on F:

$$0 \to K \to \oplus L^{-k} \to \mu^*(E) \to 0$$

for some positive integer k [H]. As both $\pi_*K$ and $\pi_*L^{-k}$ vanish, we find a two term bundle resolution of $R\pi_*\mu^*(E)$:

$$[R^1\pi_*K \to R^1\pi_*\oplus L^{-k}] \cong R\pi_*\mu^*(E).$$

The Chern classes of $R\pi_*\mu^*(E)$ are defined by $c(R^1\pi_*K)/c(R^1\pi_*\oplus L^{-k})$. This definition is independent of two term resolutions in the derived category. As $\pi_*\mu^*(N) = 0$ and $R^1\pi_*\mu^*(N)$ is a rank 2h bundle, we see (12) may now be rewritten as:

$$C_1(h,d) = \int_{[\overline{M}_{1+h}(C,d[C])]^{vir}} [c^{-1}(R\pi_*\mu^*(N))]_{2h}.$$


It is easy to find flat families of bundles on C connecting N and the trivial rank 2 bundle $I = \mathcal{O}_C \oplus \mathcal{O}_C$. For example, if P is a sufficiently ample line bundle, both N ⊗ P and I ⊗ P will have nowhere vanishing sections:

$$0 \to \mathcal{O}_C \to N\otimes P \to P^2 \to 0,$$

$$0 \to \mathcal{O}_C \to I\otimes P \to P^2 \to 0.$$

Hence N and I are connected in the family of extensions of P by $P^{-1}$. The integral

$$\int_{[\overline{M}_{1+h}(C,d[C])]^{vir}} [c^{-1}(R\pi_*\mu^*(E))]_{2h}$$

is clearly constant as E varies in this family (for example, the two term resolutions of $R\pi_*\mu^*(E)$ may be chosen compatibly over the family). We conclude

$$C_1(h,d) = \int_{[\overline{M}_{1+h}(C,d[C])]^{vir}} [c^{-1}(R\pi_*\mu^*(I))]_{2h}.$$

Now assume h > 0. Let

$$\gamma: \overline{M}_{1+h}(C,d[C]) \to \overline{M}_{1+h}$$

be the natural map to the moduli space of curves. Let $\mathbb{E}$ denote the Hodge bundle on $\overline{M}_{1+h}$: the fiber of $\mathbb{E}$ over the moduli point $[F] \in \overline{M}_{1+h}$ is $H^0(F,\omega_F)$ (see [Mu]). Since

$$\pi_*\mu^*(I) = \mathcal{O}_M \oplus \mathcal{O}_M, \qquad R^1\pi_*\mu^*(I) = \gamma^*(\mathbb{E}^* \oplus \mathbb{E}^*),$$

we see $[c^{-1}(R\pi_*\mu^*(I))]_{2h}$ is a cohomology class pulled-back via γ from $\overline{M}_{1+h}$. Hence, to complete the proof of Theorem 1, it suffices to show the following vanishing.

Proposition 2. Let h > 0. Then, $\gamma_*([\overline{M}_{1+h}(C,d[C])]^{vir}) = 0$.

Proof. Fix a base point p ∈ C for the course of the proof. We will consider the moduli space of 1-pointed maps $\overline{M}_{1+h,1}(C,d[C])$. Let

$$\mathrm{ev}_1^{-1}(p) = \overline{M}_{1+h,p}(C,d[C]) \subset \overline{M}_{1+h,1}(C,d[C])$$

denote the subspace of maps for which the marking has image p. There is a canonical isomorphism obtained by the group law on C:

$$\overline{M}_{1+h,1}(C,d[C]) \cong C \times \overline{M}_{1+h,p}(C,d[C]).$$

Let $\rho: \overline{M}_{1+h,1}(C,d[C]) \to \overline{M}_{1+h,p}(C,d[C])$ denote the canonical projection. The perfect obstruction theory on $\overline{M}_{1+h,1}(C,d[C])$ may be obtained from a canonical distinguished triangle involving the cotangent complex of the Artin stack of prestable curves and the perfect obstruction theory relative to this Artin stack (see [B, BF, GrP]). These objects are naturally equivariant with respect to the natural group law on C (see the constructions of [B, BF]). Hence, the virtual class of $\overline{M}_{1+h,1}(C,d[C])$ is a pull-back of an algebraic cycle class on $\overline{M}_{1+h,p}(C,d[C])$. As the map $\gamma_1: \overline{M}_{1+h,1}(C,d[C]) \to \overline{M}_{1+h,1}$


factors through $\overline{M}_{1+h,p}(C,d[C])$, we obtain

$$\gamma_{1*}([\overline{M}_{1+h,1}(C,d[C])]^{vir}) = 0. \tag{13}$$

Consider now the commutative diagram obtained from the 1-pointed moduli spaces:

$$\begin{array}{ccc}
\overline{M}_{1+h,1}(C,d[C]) & \xrightarrow{\ \gamma_1\ } & \overline{M}_{1+h,1} \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\
\overline{M}_{1+h}(C,d[C]) & \xrightarrow{\ \gamma\ } & \overline{M}_{1+h}.
\end{array} \tag{14}$$

While (14) is not a fiber square, it is easy to see the following equality holds:

$$\gamma_{1*}\pi^* = \pi^*\gamma_*. \tag{15}$$

From the axiom of contracting a point [BM], we see

$$\pi^*([\overline{M}_{1+h}(C,d[C])]^{vir}) = [\overline{M}_{1+h,1}(C,d[C])]^{vir}.$$

Then, Eqs. (13) and (15) imply:

$$\pi^*\gamma_*([\overline{M}_{1+h}(C,d[C])]^{vir}) = 0. \tag{16}$$

For any class $\xi \in A^*(\overline{M}_{1+h})$,

$$\pi_*(\psi_1\cdot\pi^*(\xi)) = 2h\cdot\xi,$$

where $\psi_1$ is the Chern class of the cotangent line on $\overline{M}_{1+h,1}$. Hence, the pull-back $\pi^*: A^*(\overline{M}_{1+h}) \to A^*(\overline{M}_{1+h,1})$ is injective. The proposition now follows from (16). □

2. Theorem 2

2.1. Rigidity. Let C ⊂ X be a rigid, nonsingular genus g curve with normal bundle N. The contribution $C_g(0,1)$ is certainly 1, so we may assume h is a positive integer. The proof of Proposition 1 also establishes the following facts. First, the moduli space $\overline{M}_{g+h}(C,[C])$ is a component (easily seen to be connected) of $\overline{M}_{g+h}(X,[C])$. Second, the contribution $C_g(h,1)$ is determined by:

$$C_g(h,1) = \int_{[\overline{M}_{g+h}(C,[C])]^{vir}} c_{2h}(R_{g,h}). \tag{17}$$

Here, $R_{g,h}$ denotes the rank 2h bundle $R^1\pi_*\mu^*(N)$. Note the virtual dimension of $\overline{M}_{g+h}(C,[C])$ is also 2h. The arguments of Sect. 1 are valid because a rigid curve is super-rigid in degree 1.


2.2. Irreducible components of $\overline{M}_{g+h}(C, [C])$. Let $C$ be a nonsingular genus $g$ curve. Let $h$ be a positive integer. We first analyze the moduli space of degree 1 maps $\overline{M}_{g+h}(C, [C])$. Let $P(h)$ denote the set of partitions of $h$. There is a natural set-theoretic function:
\[ \nu : \overline{M}_{g+h}(C, [C]) \to P(h) \]
defined by the following method. Let $\mu : F \to C$ correspond to a point $[\mu] \in \overline{M}_{g+h}(C, [C])$. The domain $F$ must contain a unique irreducible component $F_C$ mapped isomorphically to $C$ by $\mu$. The arithmetic genera of the connected components $\{F_i\}$ of $F \setminus F_C$ form a partition of $h$. Let $\nu([\mu])$ equal this partition. The irreducible components of $\overline{M}_{g+h}(C, [C])$ are in bijective correspondence with $P(h)$ by the value of $\nu$ on a general element.

Let $\tau = (h_1 \ge \dots \ge h_l)$ be a partition of $h$ of length $l$. Consider the Fulton–MacPherson configuration space $C[l]$ of $l$ marked points in $C$: $C[l]$ is a natural compactification of the space of $l$ distinct points of $C$ [FuM]. If $C$ has no automorphisms, $C[l]$ is simply the fiber of $\overline{M}_{g,l} \to \overline{M}_g$ over the moduli point $[C]$. Define the nonsingular Deligne–Mumford stack $I_\tau$ by:
\[ I_\tau = C[l] \times \prod_{i=1}^{l} \overline{M}_{h_i,1}. \tag{18} \]

There is a natural family,
\[ \pi : F \to I_\tau, \]
of prestable curves over $I_\tau$ obtained by attaching a 1-pointed genus $h_i$ curve to the $i$th marking of the universal $l$-pointed curve over $C[l]$. Moreover, there is a canonical projection $\mu : F \to C$. The induced morphism:
\[ \gamma_\tau : I_\tau \to \overline{M}_\tau \subset \overline{M}_{g+h}(C, [C]) \]
is finite and surjective onto the irreducible component $\overline{M}_\tau$ corresponding to the partition $\tau$. Let $\partial\overline{M}_{h_i,1}$ denote the boundary of the moduli space: the locus of curves with at least one node. Similarly, let $\partial C[l] \subset C[l]$ denote the locus of nodal curves ($\partial C[l]$ may also be viewed as the locus lying over the diagonals of the product $C^l$). Let $\partial I_\tau$ denote the union of the pull-backs of the boundaries of the factors (18) via the $l + 1$ projections. Let
\[ \partial\gamma_\tau : \partial I_\tau \to \overline{M}_{g+h}(C, [C]) \]
denote the natural map. The main geometric result used in the proof of Theorem 2 is the following vanishing.

Proposition 3. For all partitions $\tau$ of $h$, $c_{2h}(\partial\gamma_\tau^*(R_{g,h})) = 0$.

Proof. By definition, $\partial I_\tau$ is the union of the pull-backs of the boundary divisors of the $l + 1$ product factors of (18). We show $c_{2h}(\partial\gamma_\tau^*(R_{g,h}))$ restricts to 0 on each of these pull-backs. Let $pr_j$ denote the projection of $I_\tau$ onto the $(j + 1)$st factor of (18) for $0 \le j \le l$. There are $l$ natural evaluation maps $ev_i : C[l] \to C$ obtained from the $l$ markings. Define $ev_i : I_\tau \to C$ by the composition

\[ I_\tau \xrightarrow{\ pr_0\ } C[l] \xrightarrow{\ ev_i\ } C \]
for $1 \le i \le l$. The bundle $\gamma_\tau^*(R_{g,h})$ is easily analysed via the natural normalization sequence of the family $F$. We find a decomposition:
\[ \gamma_\tau^*(R_{g,h}) = \bigoplus_{i=1}^{l} \mathbb{E}_i^* \otimes ev_i^*(N), \tag{19} \]

where $\mathbb{E}_i$ is the Hodge bundle on $\overline{M}_{h_i,1}$. We denote the pull-back of these Hodge bundles to $I_\tau$ by the same symbols. An important relation among the Chern classes of the Hodge bundle has been established by Mumford in [Mu]. Mumford's relation is: $c(\mathbb{E}_i) \cdot c(\mathbb{E}_i^*) = 1$ in $A^*(\overline{M}_{h_i,1})$. From (19), we deduce:
\[ c_{2h}(\gamma_\tau^*(R_{g,h})) = \prod_{i=1}^{l} c_{2h_i}(\mathbb{E}_i^* \otimes ev_i^*(N)). \]
Algebra and Mumford's relation then yield:
\[ c_{2h}(\gamma_\tau^*(R_{g,h})) = \prod_{i=1}^{l} \lambda_{h_i}\lambda_{h_i-1}\, c_1(ev_i^*(N^*)). \tag{20} \]

Here, $\lambda_k$ denotes the $k$th Chern class of the Hodge bundle. First, consider a boundary divisor $\Delta \subset \overline{M}_{h_j,1}$. The pull-back of $\Delta$ to $I_\tau$ is simply:
\[ pr_j^{-1}(\Delta) = C[l] \times \Delta \times \prod_{i \ne j} \overline{M}_{h_i,1}. \]
The restriction of the factor $\lambda_{h_j}\lambda_{h_j-1}$ of (20) to $\Delta$ has been proven by Faber to vanish [F1] (the reducible divisors of $\overline{M}_{h_1,1}$ have non-trivial genus splittings). Hence, the restriction of $c_{2h}(\gamma_\tau^*(R_{g,h}))$ to $pr_j^{-1}(\Delta)$ vanishes. Second, consider a boundary divisor $\Delta$ of $C[l]$. The divisor $\Delta$ corresponds to a locus in which a subset $S \subset [l]$ (of at least 2 elements) of the marked points coincide over $C$. The evaluation maps $\{ev_i\}_{i \in S}$ coincide when restricted to $pr_0^{-1}(\Delta)$. Therefore, since $c_1(N^*)^2 = 0$, the restriction of $c_{2h}(\gamma_\tau^*(R_{g,h}))$ to $pr_0^{-1}(\Delta)$ vanishes. □

2.3. Hodge integrals. Let $\partial\overline{M}_\tau = \gamma_\tau(\partial I_\tau)$, and let $M_\tau = \overline{M}_\tau \setminus \partial\overline{M}_\tau$. $M_\tau$ is open in $\overline{M}_{g+h}(C, [C])$ and corresponds to the moduli space of degree 1 maps which consist of nonsingular curves of genus $h_i$ attached to distinct points of $C$. A deformation theory argument shows $M_\tau$ is a nonsingular moduli stack of dimension $\sum_{i=1}^{l}(3h_i - 1)$. More precisely, for $[\mu : F \to C] \in M_\tau$, there is a canonical exact sequence:
\[ 0 \to \mathrm{Aut}[F] \to H^0(F, \mu^*(T_C)) \to \mathrm{Def}[\mu] \xrightarrow{\ \iota\ } \mathrm{Def}[F] \to H^1(F, \mu^*(T_C)) \to \mathrm{Obs}[\mu] \to 0, \]


where $\mathrm{Aut}[F]$ is the infinitesimal automorphism space of $F$ and $\mathrm{Def}[F]$, $\mathrm{Def}[\mu]$ are the infinitesimal deformation spaces of $F$, $\mu$ respectively. It is easy to prove the cokernel of $\iota$ is equal to a vector space $V$ with filtration
\[ 0 \to \mathrm{Def}[C] \to V \to \bigoplus_{i=1}^{l} (T_{p_i} \otimes T_{p_i'}) \to 0. \]
Here, the component $F_i \subset F$ of genus $h_i$ is attached to $C$ at the points $p_i \in F_i$ and $p_i' \in C$. The cokernel computation amounts to showing the map $\mu$ has no infinitesimal deformations which smooth any of the $l$ nodes of $F$. We then see $\mathrm{Def}[\mu]$ is of constant dimension $\sum_{i=1}^{l}(3h_i - 1)$. Moreover, the obstruction space is a bundle over $M_\tau$ with fiber
\[ \frac{H^1(F, \mu^*(T_C))}{\mathrm{Im}(V)} = \bigoplus_{i=1}^{l} \frac{H^1(F_i, \mathcal{O}_{F_i} \otimes T_{p_i'})}{T_{p_i} \otimes T_{p_i'}} \tag{21} \]

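over $[\mu]$. The essential point here is the deformation theory of maps in $M_\tau$ is very simple. Let $\mathrm{Aut}_\tau$ denote the stabilizer of the permutation $S_l$-action on the $l$-tuple $\tau$. The map $\gamma_\tau : I_\tau \to \overline{M}_\tau$ is $\mathrm{Aut}_\tau$-invariant. Moreover, the quotient map induces a proper, bijective morphism
\[ \tilde\gamma_\tau : I_\tau/\mathrm{Aut}_\tau \to \overline{M}_\tau. \tag{22} \]
Let $I_\tau^0 = I_\tau \setminus \partial I_\tau$. Certainly, $\tilde\gamma_\tau$ induces an isomorphism $I_\tau^0/\mathrm{Aut}_\tau \cong M_\tau$. The restriction of the virtual class $\xi^{vir} = [\overline{M}_{g+h}(C, [C])]^{vir}$ to the disjoint open union $\bigcup_{\tau \in P(h)} M_\tau$ is:
\[ \bigoplus_{\tau \in P(h)} \xi_\tau^{vir}, \]
where $\xi_\tau^{vir} \in A_{2h}(M_\tau)$. The pull-back of $\xi_\tau^{vir}$ to $I_\tau^0$ is identified from the obstruction theory (21) to be:
\[ \gamma_\tau^*(\xi_\tau^{vir}) = \prod_{i=1}^{l} c_{h_i-1}\!\left( \frac{c(\mathbb{E}_i^* \otimes ev_i^*(T_C))}{1 - \psi_1 + c_1(ev_i^*(T_C))} \right). \tag{23} \]
Since $M_\tau$ is nonsingular, the restriction of the virtual class is the Euler class of the obstruction bundle. The virtual class $\xi^{vir}$ may be (non-canonically) expressed as a sum:
\[ \bigoplus_{\tau \in P(h)} \overline{\xi}_\tau^{\,vir}, \]
where $\overline{\xi}_\tau^{\,vir} \in A_{2h}(\overline{M}_\tau)$. Using the proper bijection (22), we see:
\[ C_g(h, 1) = \sum_{\tau \in P(h)} \int_{I_\tau/\mathrm{Aut}_\tau} \overline{\xi}_\tau^{\,vir} \cap c_{2h}(\tilde\gamma_\tau^*(R_{g,h})). \tag{24} \]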

By the vanishing of Proposition 3, Eq. (24) remains valid if $\overline{\xi}_\tau^{\,vir}$ is replaced with any cycle class which restricts to $\xi_\tau^{vir}$ on $M_\tau$. This observation together with (23) yields the equality:
\[ C_g(h, 1) = \sum_{\tau \in P(h)} \frac{1}{|\mathrm{Aut}_\tau|} \int_{I_\tau} c_{2h}(\gamma_\tau^*(R_{g,h})) \cdot \prod_{i=1}^{l} c_{h_i-1}\!\left( \frac{c(\mathbb{E}_i^* \otimes ev_i^*(T_C))}{1 - \psi_1 + c_1(ev_i^*(T_C))} \right). \tag{25} \]

Equation (20) together with basic algebraic manipulations then prove the main integral formula:
\[ C_g(h, 1) = \sum_{\tau \in P(h)} \frac{(2 - 2g)^l}{|\mathrm{Aut}_\tau|} \prod_{i=1}^{l} \int_{\overline{M}_{h_i,1}} \lambda_{h_i}\lambda_{h_i-1} \Big( \sum_{j=0}^{h_i-1} (-1)^j \lambda_j \psi_1^{h_i-1-j} \Big). \tag{26} \]
The only aspect of $N$ which affects the integral
\[ C_g(h, 1) = \int_{[\overline{M}_{g+h}(C,[C])]^{vir}} c_{2h}(R_{g,h}) \]
is $\int_C c_1(N^*)$. This Chern class enters (26) via Eq. (20), yielding the factor
\[ \Big( \int_C c_1(N^*) \Big)^l = (2 - 2g)^l. \]

Theorems 2–4 will directly follow from formula (26). For $q \ge 1$, define
\[ \alpha_q = \int_{\overline{M}_{q,1}} \lambda_q\lambda_{q-1} \Big( \sum_{j=0}^{q-1} (-1)^j \lambda_j \psi_1^{q-1-j} \Big). \]
Define the generating series:
\[ Q(t) = \sum_{q \ge 1} \alpha_q t^{2q}. \]
An immediate consequence of formula (26) is the equation:
\[ \sum_{h \ge 0} C_g(h, 1)\, t^{2h} = \exp((2 - 2g)Q(t)) = \exp(2Q(t))^{1-g} = \Big( \sum_{h \ge 0} C_0(h, 1)\, t^{2h} \Big)^{1-g} = \Big( \frac{\sin(t/2)}{t/2} \Big)^{2g-2}. \]
The last equality follows from the previous computations of $C_0(h, 1)$ in [FP]. The proof of Theorem 2 is complete.
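The passage from the partition sum (26) to the exponential of $Q(t)$ can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: it takes the closed form $(\sin(t/2)/(t/2))^{2g-2}$ as given, recovers the coefficients $\alpha_q$ from $Q(t) = -\log(\sin(t/2)/(t/2))$, and evaluates the sum over partitions with $|\mathrm{Aut}_\tau|$ computed as the product of the factorials of the multiplicities of the parts of $\tau$. All function names (`sinc_series`, `log_series`, `contribution`, and so on) are hypothetical helpers, and series are truncated in the variable $u = t^2$.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def sinc_series(order):
    """Coefficients of sin(t/2)/(t/2) as a power series in u = t^2, up to u^order."""
    return [Fraction((-1) ** n, 4 ** n * factorial(2 * n + 1)) for n in range(order + 1)]


def log_series(s):
    """Coefficients of log(s) for a power series s with constant term 1."""
    out = [Fraction(0)] * len(s)
    for n in range(1, len(s)):
        acc = n * s[n] - sum(k * out[k] * s[n - k] for k in range(1, n))
        out[n] = acc / n
    return out


def mul(a, b):
    """Truncated product of two series of equal length."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]


def partitions(h, largest=None):
    """Yield the partitions of h as non-increasing tuples."""
    if largest is None:
        largest = h
    if h == 0:
        yield ()
        return
    for first in range(min(h, largest), 0, -1):
        for rest in partitions(h - first, first):
            yield (first,) + rest


def contribution(g, h, alpha):
    """Partition sum of (26): sum over tau of (2-2g)^l / |Aut_tau| * prod alpha_{h_i}."""
    total = Fraction(0)
    for tau in partitions(h):
        aut = 1
        for multiplicity in Counter(tau).values():
            aut *= factorial(multiplicity)
        term = Fraction((2 - 2 * g) ** len(tau), aut)
        for part in tau:
            term *= alpha[part]
        total += term
    return total


ORDER = 6  # truncation order in u = t^2 (an arbitrary choice for this sketch)
S = sinc_series(ORDER)
alpha = [-c for c in log_series(S)]  # Q(t) = -log(sin(t/2)/(t/2))

# For g = 2 the generating series should equal (sin(t/2)/(t/2))^2.
lhs = [contribution(2, h, alpha) for h in range(ORDER + 1)]
rhs = mul(S, S)
print(lhs == rhs)
```

For $g = 1$ the factor $(2 - 2g)^l$ kills every nonempty partition, so only the constant term survives, in agreement with the exponent $2g - 2 = 0$.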


3. Theorem 3

We follow here the notation of Sect. 0.4. Let $C$ be a nonsingular genus $g$ curve in a 3-fold $X$ representing the homology class $\beta$. We now assume $-K_X \cdot \beta > 0$, so the moduli space $\overline{M}_g(X, \beta)$ is of positive expected dimension. Let $\gamma = (\gamma_1, \dots, \gamma_n)$ be a vector of cohomology classes defining a Gromov–Witten invariant $N^g_\beta(\gamma)$. For each $i$, let $Y_i \subset X$ be a topological cycle dual to $\gamma_i$. Let $p_i \in C \cap Y_i$. We let $(C)$ denote the identity map $\pi : C \to C \subset X$ defining a point in the moduli space of stable maps. The contribution of $C$ to $N^{g+h}_\beta(\gamma)$ via covers will require two general position hypotheses analogous to rigidity in the Calabi–Yau case:

(i) $(C, p_1, \dots, p_n)$ is a nonsingular point of $\overline{M}_{g,n}(X, \beta)$ lying on a component of expected dimension $-K_X \cdot \beta + n$.

(ii) The topological intersection of the cycles $ev_i^{-1}(Y_i)$ in $\overline{M}_{g,n}(X, \beta)$ is transverse at $(C, p_1, \dots, p_n)$.

Under these hypotheses, the degenerate contribution of $C$ may be expressed directly as an integral over $\overline{M}_{g+h}(C, [C])$. Let $W \subset \overline{M}_g(X, \beta)$ be the open, nonsingular, expected dimensional subset of the moduli space of maps. Let $U \subset W$ be the open subset corresponding to embeddings of nonsingular genus $g$ curves in $X$. As such embeddings have no nontrivial automorphisms, $U$ is a nonsingular variety (not just a Deligne–Mumford stack). Moreover, by assumption (i), $U$ is nonempty of dimension $-K_X \cdot \beta$ and contains $(C)$. After discarding a finite number of subvarieties of $U$, we may assume $(C)$ is the only point of $U$ meeting all the cycles $Y_i$. Note the moduli space $U$ is also naturally an open set of a component of the Hilbert scheme of curves in $X$. Let
\[ \eta : \mathcal{C} \to U \]
denote the universal family of curves over $U$. Let $\overline{M}_{g+h}(\eta, \beta)$ denote the $\eta$-relative moduli space of maps representing the fundamental class of the fibers of $\eta$. There is a natural morphism of Deligne–Mumford stacks:
\[ \iota : \overline{M}_{g+h}(\eta, \beta) \to \overline{M}_{g+h}(X, \beta) \tag{27} \]
obtained by composition. There are several tautological morphisms (over $U$):
\[ \pi : F \to \overline{M}_{g+h}(\eta, \beta), \qquad \mu : F \to \mathcal{C}, \qquad \tau : \overline{M}_{g+h}(\eta, \beta) \to U. \]
Let $N$ denote the universal normal bundle on $\mathcal{C}$. $N$ is the family of normal bundles of the fibers of $\eta$ in $X$. As $U$ is nonsingular of expected dimension, $\eta_*(N)$ is isomorphic to the tangent bundle of $U$ and $R^1\eta_*(N) = 0$. A deformation theoretic check over Artinian rings shows $\iota$ is an open immersion. We see the stack $\overline{M}_{g+h}(\eta, \beta)$ has two natural fundamental classes. The first is $[\overline{M}_{g+h}(\eta, \beta)]^{vir}$ obtained from the structure of an $\eta$-relative moduli space of maps. Second, the open inclusion $\iota$ endows $\overline{M}_{g+h}(\eta, \beta)$ with the perfect obstruction theory on


$\overline{M}_{g+h}(X, \beta)$. A direct comparison of these two obstruction theories on $\overline{M}_{g+h}(\eta, \beta)$ shows they differ exactly by the bundle $R_{g,h} = R^1\pi_*\mu^*(N)$:
\[ \iota^*([\overline{M}_{g+h}(X, \beta)]^{vir}) = [\overline{M}_{g+h}(\eta, \beta)]^{vir} \cap c_{2h}(R_{g,h}). \tag{28} \]
Relations (27) and (28) are valid when considered in the context of $n$-pointed stable maps (this may be deduced from the above unpointed relations together with the natural properties of these virtual structures under the morphisms forgetting the markings [BM]). By relation (28) and the definition of the Gromov–Witten invariants, the contribution of $(C, p_1, \dots, p_n)$ to $N^{g+h}_\beta(\gamma)$ is equal to the intersection product:
\[ [\overline{M}_{g+h,n}(\eta, \beta)]^{vir} \cap c_{2h}(R_{g,h}) \cap \prod_{i=1}^{n} ev_i^{-1}(Y_i), \tag{29} \]
with value in the zeroth homology of the compact space $\bigcap_{i=1}^{n} ev_i^{-1}(Y_i)$. By assumption (ii) and the pull-back properties of the virtual class, intersection (29) is (numerically) equal to:
\[ [\overline{M}_{g+h}(\eta, \beta)]^{vir} \cap c_{2h}(R_{g,h}) \cap \tau^{-1}(C). \tag{30} \]
The latter class (30) is an integral over the virtual class of the fiber $\tau^{-1}(C) = \overline{M}_{g+h}(C, [C])$. We find:
\[ C_g(h, X, \beta) = \int_{[\overline{M}_{g+h}(C,[C])]^{vir}} c_{2h}(R_{g,h}). \tag{31} \]
This integral is identical to (17) except for the different normal bundles $N$ occurring in the definition of $R_{g,h}$. The method in Sect. 2 to compute (17) also yields a computation of (31). As remarked after Eq. (26), the bundle $N$ affects the integral (31) through $\int_C c_1(N^*)$:
\[ C_g(h, X, \beta) = \sum_{\tau \in P(h)} \frac{\big( \int_C c_1(N^*) \big)^l}{|\mathrm{Aut}_\tau|} \cdot \prod_{i=1}^{l} \int_{\overline{M}_{h_i,1}} \lambda_{h_i}\lambda_{h_i-1} \Big( \sum_{j=0}^{h_i-1} (-1)^j \lambda_j \psi_1^{h_i-1-j} \Big). \tag{32} \]
Since $\int_C c_1(N^*) = 2 - 2g + K_X \cdot \beta$, Theorem 3 follows via the series analysis of Sect. 2.

4. Theorem 4

Let $\pi : \overline{M}_{q,1} \to \overline{M}_q$ be the universal curve (for $q \ge 2$). The class $\psi_1$ is the Chern class of the cotangent line bundle on $\overline{M}_{q,1}$. The kappa classes are defined by $\kappa_j = \pi_*(\psi_1^{j+1})$. Define
\[ \beta_{q-2} = \pi_*\Big( \sum_{j=0}^{q-1} (-1)^j \lambda_j \psi_1^{q-1-j} \Big) = \sum_{j=0}^{q-2} (-1)^j \lambda_j \kappa_{q-2-j}. \]
In the notation of Sect. 2.3, we see:
\[ Q(t) = t^2/24 + \sum_{q \ge 2} t^{2q} \int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \beta_{q-2}. \]
The results of Sect. 2.3 applied in case $g = 0$ prove:
\[ \exp(2Q(t)) = \Big( \frac{t/2}{\sin(t/2)} \Big)^2. \]
After taking the logarithm, we find:
\[ Q(t) = \log\Big( \frac{t/2}{\sin(t/2)} \Big). \tag{33} \]
The right series in (33) may be expanded as
\[ \log\Big( \frac{t/2}{\sin(t/2)} \Big) = \sum_{q \ge 1} \frac{|B_{2q}|}{(2q)(2q)!}\, t^{2q} \]
by Lemma 3 of [FP]. Faber has computed
\[ \int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \kappa_{q-2} = \frac{1}{2^{2q-1}(2q-1)!!} \frac{|B_{2q}|}{2q} \]
from Witten's conjectures/Kontsevich's theorem [F2]. It is known $R^{q-2}(M_q)$ is exactly 1 dimensional ([F2, L]). Since $\lambda_q\lambda_{q-1}$ vanishes when restricted to $\partial\overline{M}_q$, we find
\[ \beta_{q-2} = \frac{\int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \beta_{q-2}}{\int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \kappa_{q-2}} \cdot \kappa_{q-2}. \]
Theorem 4 now follows from the computation:
\[ \frac{\int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \beta_{q-2}}{\int_{\overline{M}_q} \lambda_q\lambda_{q-1} \cap \kappa_{q-2}} = \frac{2^{q-1}}{q!}. \]

Acknowledgement. The author thanks P. Belorousski, C. Faber, E. Getzler, T. Graber, M. Jinzenji, S. Katz, A. Klemm, H. Kley, C. Vafa, and E. Zaslow for comments and correspondence related to degenerate contributions. In particular, this paper was inspired by questions of C. Vafa. The author was partially supported by National Science Foundation grant DMS-9801574.

References

[AM] Aspinwall, P. and Morrison, D.: Topological field theory and rational curves. Comm. Math. Phys. 151, 245–262 (1993)
[B] Behrend, K.: Gromov–Witten invariants in algebraic geometry. Invent. Math. 127, 601–617 (1997)
[BF] Behrend, K. and Fantechi, B.: The intrinsic normal cone. Invent. Math. 128, 45–88 (1997)
[BM] Behrend, K. and Manin, Yu.: Stacks of stable maps and Gromov–Witten invariants. Duke J. Math. 85, no. 1, 1–60 (1996)
[BCOV] Bershadsky, M., Cecotti, S., Ooguri, H. and Vafa, C.: Holomorphic anomalies in topological field theories. (with an appendix by S. Katz), Nucl. Phys. B405, 279–304 (1993)
[F1] Faber, C.: A conjectural description of the tautological ring of the moduli space of curves. Preprint 1996, available from http://www.math.okstate.edu/preprint/1997.html
[F2] Faber, C.: A non-vanishing result for the tautological ring of Mg. Preprint 1998
[FP] Faber, C. and Pandharipande, R.: Hodge integrals and Gromov–Witten theory. math.AG/9810173, to appear in Invent. Math.
[FuM] Fulton, W. and MacPherson, R.: A compactification of configuration spaces. Ann. of Math. (2) 139, no. 1, 183–225 (1994)
[FuP] Fulton, W. and Pandharipande, R.: Notes on stable maps and quantum cohomology. In: Proceedings of Symposia in Pure Mathematics: Algebraic Geometry Santa Cruz 1995, J. Kollár, R. Lazarsfeld, D. Morrison, eds., Volume 62, Part 2, pp. 45–96
[Ge] Getzler, E.: Intersection theory on M1,4 and elliptic Gromov–Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997)
[GeP] Getzler, E. and Pandharipande, R.: Virasoro constraints and the Chern classes of the Hodge bundle. Nucl. Phys. B530, 701–714 (1998)
[Gr] Graber, T.: Enumerative geometry of hyperelliptic plane curves. Preprint 1998, math.AG/9808084
[GrP] Graber, T. and Pandharipande, R.: Localization of virtual classes. Invent. Math. 135, 487–518 (1999)
[GV1] Gopakumar, R. and Vafa, C.: M-theory and topological strings I. Preprint 1998, hep-th/9809187
[GV2] Gopakumar, R. and Vafa, C.: M-theory and topological strings II. hep-th/9812127
[H] Hartshorne, R.: Algebraic geometry. New York: Springer-Verlag, 1977
[J] Jinzenji, M.: Private communication. March 1999
[Kl] Klemm, A.: Private communication. February 1999
[KlZ] Klemm, A. and Zaslow, E.: Local mirror symmetry at higher genus. hep-th/9906046
[K] Kley, H.: Rigid curves in complete intersection Calabi–Yau threefolds. Comp. Math. to appear
[LT] Li, J. and Tian, G.: Virtual moduli cycles and Gromov–Witten invariants of algebraic varieties. J. AMS 11, no. 1, 119–174 (1998)
[L] Looijenga, E.: On the tautological ring of Mg. Invent. Math. 121, 411–419 (1995)
[Ma] Manin, Yu.: Generating functions in algebraic geometry and sums over trees. In: The moduli space of curves, R. Dijkgraaf, C. Faber, and G. van der Geer, eds., Basel–Boston: Birkhäuser, 1995, pp. 401–417
[MM] Mariño, M. and Moore, G.: Counting higher genus curves in a Calabi–Yau manifold. Preprint 1998, hep-th/9808131
[Mu] Mumford, D.: Towards an enumerative geometry of the moduli space of curves. In: Arithmetic and Geometry, M. Artin and J. Tate, eds., Part II, Basel–Boston: Birkhäuser, 1983, pp. 271–328
[S] Serre, J.-P.: A course in arithmetic. New York: Springer-Verlag, 1973
[V] Voisin, C.: A mathematical proof of a formula of Aspinwall and Morrison. Comp. Math. 104, no. 2, 135–151 (1996)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 208, 507 – 520 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

The $W_{k,p}$-Continuity of the Schrödinger Wave Operators on the Line

Ricardo Weder$^{1,2,\star}$

1 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Apartado Postal 20-726, México D.F. 01000, E-mail: [email protected]

2 Instituto de Física Rosario, CONICET, Argentina

Received: 21 April 1999 / Accepted: 15 July 1999

Abstract: We prove that the wave operators for the Schrödinger equation on the line are continuous on the Sobolev spaces $W_{k,p}$, $1 < p < \infty$. Moreover, if the potential is exceptional and $a := \lim_{x \to -\infty} f_1(x, 0) = 1$, where $f_1(x, 0)$ is a Jost solution at zero energy, the wave operators are continuous on $W_{k,1}$ and on $W_{k,\infty}$.

$\star$ Fellow, Sistema Nacional de Investigadores.

1. Introduction

In this paper we consider the one-dimensional Schrödinger equation
\[ i\frac{\partial}{\partial t} u(t, x) = \Big( -\frac{\partial^2}{\partial x^2} + V(x) \Big) u(t, x), \qquad u(0, x) = \phi(x), \tag{1.1} \]
with $t, x \in \mathbb{R}$. Below we always assume that $V \in L^1_1$, where for any $\gamma \in \mathbb{R}$ we denote by $L^1_\gamma$ the space of all complex-valued measurable functions, $\phi$, defined on $\mathbb{R}$ such that
\[ \|\phi\|_{L^1_\gamma} := \int |\phi(x)|(1 + |x|)^\gamma\, dx < \infty. \tag{1.2} \]
$L^1_\gamma$ is a Banach space with the norm (1.2). Under this condition (see [4] and [26]) the differential expression $\tau := -\frac{d^2}{dx^2} + V(x)$ is essentially self-adjoint on the domain
\[ D(\tau) := \Big\{ \phi \in L^2_C : \phi \text{ and } \frac{d}{dx}\phi \text{ are absolutely continuous and } \tau\phi \in L^2 \Big\}, \tag{1.3} \]
where we denote by $L^2_C$ the set of all functions in $L^2$ that have compact support. We designate by $H$ the unique self-adjoint realization of $\tau$. It is known (see [4] for example)


that $H$ has a finite number of negative eigenvalues, that it has no positive or zero eigenvalues, that it has no singular-continuous spectrum and that the absolutely-continuous spectrum of $H$ is $[0, \infty)$. Let $H_0$ denote the unique self-adjoint realization of $-\frac{d^2}{dx^2}$, with domain the Sobolev space $W_{2,2}$. See Sect. 2 for the definition of the spaces that we use. The wave operators are given by:
\[ W_\pm := \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0}. \tag{1.4} \]
It is proven in [19] that the limits in (1.4) exist in the strong topology in $L^2$ and that $\operatorname{Range} W_\pm = \mathcal{H}_c(H)$, where $\mathcal{H}_c(H)$ denotes the subspace of continuity of $H$. The adjoints of the $W_\pm$ are given by
\[ W_\pm^* = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH_0} e^{-itH} P_c, \tag{1.5} \]
where $P_c$ denotes the orthogonal projector onto the subspace of continuity of $H$. The problem that we address in this paper is the continuity of the $W_\pm$ and of the $W_\pm^*$ on the Sobolev spaces $W_{k,p}$. This question has been studied in a series of papers by Yajima [28–32] in the case of Schrödinger operators on $\mathbb{R}^n$, $n \ge 3$. He proved that the $W_\pm$ and the $W_\pm^*$ have bounded extensions to operators on $W_{k,p}$, $k = 0, 1, \dots$, $1 \le p \le \infty$, under appropriate conditions on the regularity and on the decay of $V$ and assuming that zero is neither an eigenvalue nor a resonance (half-bound state) for $H$. We prove a related result in the case $n = 1$. As we show below, in the one-dimensional case $W_\pm$ and $W_\pm^*$ extend to bounded operators on $W_{k,p}$, $1 < p < \infty$, but in the general case they do not extend to bounded operators on $W_{k,1}$ and on $W_{k,\infty}$. The reason why we have a more singular behaviour in the one-dimensional case is that the low-energy parts of $W_\pm$ and of $W_\pm^*$ contain terms with the Hilbert transform (see the end of the Introduction). If the potential is exceptional and $a := \lim_{x \to -\infty} f_1(x, 0) = 1$, the terms that contain the Hilbert transform are not present and $W_\pm$, $W_\pm^*$ extend to bounded operators on $W_{k,1}$ and on $W_{k,\infty}$.

Before we state our main theorem we introduce some standard notation. For $u$, $v$ any pair of solutions to the stationary Schrödinger equation
\[ -\frac{d^2}{dx^2} u + V u = k^2 u, \qquad k \in \mathbb{C}, \tag{1.6} \]
let $[u, v]$ denote the Wronskian of $u$ and $v$:
\[ [u, v] := \Big( \frac{d}{dx} u \Big) v - u\, \frac{d}{dx} v. \tag{1.7} \]
Let $f_j(x, k)$, $j = 1, 2$, $\operatorname{Im} k \ge 0$, be the Jost solutions to (1.6) (see [5,6,4,3], and Sect. 2 below). A potential $V$ is said to be generic if $[f_1(x, 0), f_2(x, 0)] \ne 0$ and $V$ is said to be exceptional if $[f_1(x, 0), f_2(x, 0)] = 0$. If $V$ is exceptional there is a bounded solution (a half-bound state, or a zero energy resonance) to (1.6) with $k = 0$. Note that the trivial potential, $V = 0$, is exceptional. For these definitions and related issues see [15]. For $l = 0, 1, \dots$, we denote $V^{(l)} := \frac{d^l}{dx^l} V(x)$. Note that $V^{(0)} = V$. Our main result is the following theorem.


Theorem 1.1. Suppose that $V \in L^1_\gamma$, where in the generic case $\gamma > 3/2$ and in the exceptional case $\gamma > 5/2$, and that for some $k = 1, 2, \dots$, $V^{(l)} \in L^1$, for $l = 0, 1, 2, \dots, k - 1$. Then $W_\pm$ and $W_\pm^*$, originally defined on $W_{k,p} \cap L^2$, $1 \le p \le \infty$, have extensions to bounded operators on $W_{k,p}$, $1 < p < \infty$. Moreover, there are constants $C_p$, $1 < p < \infty$, such that:
\[ \|W_\pm f\|_{k,p} \le C_p \|f\|_{k,p}; \qquad \|W_\pm^* f\|_{k,p} \le C_p \|f\|_{k,p}, \qquad f \in W_{k,p} \cap L^2,\ 1 < p < \infty. \tag{1.8} \]
Furthermore, if $V$ is exceptional and $a := \lim_{x \to -\infty} f_1(x, 0) = 1$, $W_\pm$ and $W_\pm^*$ have extensions to bounded operators on $W_{k,1}$ and to bounded operators on $W_{k,\infty}$, and there are constants $C_1$ and $C_\infty$ such that (1.8) holds for $p = 1$ and $p = \infty$.

Our proof of Theorem 1.1 is quite different from the proof of Yajima [28–31]. The main difference between the one-dimensional case and $n \ge 3$ is the low-energy behaviour. This can be seen by looking at the behaviour of the resolvent $(H_0 - z)^{-1}$. In the case $n = 1$ the resolvent has a singularity of the type $1/\sqrt{z}$ as $z \to 0$, whereas for $n \ge 3$ the resolvent is regular as $z \to 0$. We base our analysis of the low-energy behaviour of the spectral measure of $H$ on the generalized Fourier maps that are constructed from the scattering solutions $\Psi_\pm(x, k)$. The crucial issue here is that for $n = 1$ the scattering solutions are given in terms of the Jost solutions as follows:
\[ \Psi_+(x, k) := \frac{1}{\sqrt{2\pi}} \begin{cases} T(k) f_1(x, k), & k \ge 0, \\ T(-k) f_2(x, -k), & k < 0, \end{cases} \tag{1.9} \]
and $\Psi_-(x, k) := \Psi_+(x, -k)$, where $T(k)$ is the transmission coefficient. The $f_j$ are solutions to Volterra integral equations. They are obtained by iteration as uniformly convergent series (see [5,6,4,3] and Sect. 2). This fact and ordinary differential equation methods allow us (see [25]) to analyse the low-energy behaviour of the spectral measure of $H$ in a detailed way. We prove Theorem 1.1 in Sect. 2 using these results.

An important property of the wave operators is that they intertwine between the continuous part of $H$ and $H_0$, i.e., $H P_c = W_\pm H_0 W_\pm^*$ and $H_0 = W_\pm^* H P_c W_\pm$. It follows that for any Borel function $f$:
\[ f(H) P_c = W_\pm f(H_0) W_\pm^*; \qquad f(H_0) = W_\pm^* f(H) P_c W_\pm, \tag{1.10} \]
where $f(H)$ and $f(H_0)$ are defined by functional calculus. For $X$, $Y$, spaces as above, we denote by $B(X, Y)$ the space of all bounded operators from $X$ into $Y$ and by $B(X)$ the space of all bounded operators on $X$. Theorem 1.1 and Eqs. (1.10) imply that $f(H_0)$ and $f(H) P_c$ have equivalent operator norms from $W_{l,p}$ into $W_{\acute{l},\acute{p}}$, $0 \le l, \acute{l} \le k$, $1 < p, \acute{p} < \infty$. More precisely,

Corollary 1.2. Suppose that the assumptions of Theorem 1.1 are satisfied. Then for any $0 \le l, \acute{l} \le k$, $1 < p, \acute{p} < \infty$, there is a constant $C$ such that for all Borel functions $f$
\[ \frac{1}{C} \|f(H_0)\|_{B(W_{l,p}, W_{\acute{l},\acute{p}})} \le \|f(H) P_c\|_{B(W_{l,p}, W_{\acute{l},\acute{p}})} \le C \|f(H_0)\|_{B(W_{l,p}, W_{\acute{l},\acute{p}})}. \tag{1.11} \]
Moreover, if $V$ is exceptional and $a = 1$ we can take $1 \le p, \acute{p} \le \infty$.


For related results see [10] and [11]. The crucial point of Corollary 1.2 is that $C$ is independent of $f$. This allows one to extend to the case $V \ne 0$ the space-time estimates that are known in the case $V = 0$ for the Schrödinger equation as well as for other evolution equations like the Klein–Gordon equation and the wave equation. Examples are $L^p - L^{\acute{p}}$ and Strichartz's estimates and also Fourier multiplier theorems for the generalized Fourier transforms $F_\pm$. For a discussion of these applications see [28–32]. For the use of these space-time estimates in nonlinear analysis see [22,14,12,20,16,23,2,7,9,8,24], and [25]. In particular, Corollary 1.2 gives us a new proof, in the case $1 < p \le 2$, of the $L^p - L^{\acute{p}}$ estimate for the Schrödinger equation (1.1) that we obtained in [25].

We prove in Proposition 2.5 in Sect. 2 that given $\chi \in C^\infty$ with $\chi(x) = 0$, $x \le 0$, and $\chi(x) = 1$, $x \ge 1$, and given $\Psi \in C_0^\infty$ with $\Psi(k) = 1$, $|k| \le \sqrt{k_1}$, for some $|k_1| > 0$, we can decompose $W_\pm$ as follows:
\[ W_\pm = W_{\pm,r} \pm i\chi(x) f_1(x, 0)\, \frac{H\,\mathrm{C3}}{2}\, [T(0) - 1 + R_1(0)P] \pm i(1 - \chi(x)) f_2(x, 0)\, \frac{H\,\mathrm{C3}}{2}\, [1 - T(0) - R_2(0)P], \tag{1.12} \]
where (the regular parts) $W_{\pm,r}$, $W_{\pm,r}^*$ are bounded operators on $W_{k,p}$, $1 \le p \le \infty$, and $\mathrm{C3} := F^{-1}\Psi(k^2)F$, with $F$ the Fourier transform. Moreover, $R_j(k)$, $j = 1, 2$, are the reflection coefficients (see (2.7)), and $(Pf)(x) := f(-x)$. For any $f$ in the Schwartz space, $S$, we denote by $Hf$ the Hilbert transform of $f$:
\[ (Hf)(x) := \frac{1}{\pi}\, \mathrm{P.V.}\! \int \frac{f(x - y)}{y}\, dy, \tag{1.13} \]
where P.V. denotes the principal value of the integral. As is well known (see [21,18]) the Hilbert transform extends to a bounded operator in $B(W_{k,p})$, $k = 0, 1, \dots$, $1 < p < \infty$, in $B(W_{k,1}, L^1_{k,w})$ and in $B(W_{k,\infty}, BMO_k)$, $k = 0, 1, \dots$, where $L^1_{k,w}$ denotes the space of functions that together with all their derivatives of order up to $k$ are in $L^1$-weak, and $BMO_k$ designates the space of all functions that together with all their derivatives of order up to $k$ are in the space, BMO, of functions of bounded mean oscillation. It follows from (1.12) that $W_\pm$ extend to bounded operators from $W_{k,1}$ into $L^1_{k,w}$ and that $W_\pm^*$ extend to bounded operators in $B(W_{k,1}, L^1_{k,w})$ and in $B(W_{k,\infty}, BMO_k)$.

2. The Proofs

We first introduce some standard notation. By $W_{k,p}$, $k = 0, 1, \dots$, $1 \le p \le \infty$, we denote the Sobolev space [1] of all functions in $L^p$ whose derivatives of order up to $k$ are functions in $L^p$. By $\|\cdot\|_{k,p}$ we denote the norm in $W_{k,p}$. Clearly, $W_{0,p} = L^p$. By $L^1_{loc}$ we denote the space of all functions that are integrable over every compact set in $\mathbb{R}$. By $L^2_\alpha$, $\alpha \ge 0$, we denote the potential spaces [21]. By $L^1_w$ we denote the weak $L^1$ space [17] of all complex-valued, measurable functions on $\mathbb{R}$ such that $m(x \in \mathbb{R} : |f(x)| > t) \le C/t$, $t > 0$, where $m(\cdot)$ denotes the Lebesgue measure. For $f \in L^1_w$ let us denote
\[ \|f\|_{1,w} := \sup_{t > 0}\, t\, m(x \in \mathbb{R} : |f(x)| > t). \tag{2.1} \]
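The quantity (2.1) is easy to explore numerically. The following sketch is an illustration only, not part of the paper's argument: it approximates (2.1) from samples on a uniform grid, using the hypothetical helper `weak_l1_quasinorm`, for the hypothetical test functions $f(x) = x$ and $g(x) = 1 - x$ on $[0, 1]$ (extended by zero). For these one finds by hand $\sup_t t(1 - t) = 1/4$ for each of $f$ and $g$, while $f + g = 1$ on $[0, 1]$ gives the value 1, so the triangle inequality can fail for (2.1).

```python
def weak_l1_quasinorm(values, dx):
    """Approximate sup_{t>0} t * m({|f| > t}) from samples on a uniform grid.

    As t increases towards the j-th largest sample, at least j grid cells
    satisfy |f| > t, so the supremum is approximated by the maximum of
    (j-th largest sample) * j * dx over j.
    """
    ordered = sorted((abs(v) for v in values), reverse=True)
    return max(v * (j + 1) * dx for j, v in enumerate(ordered))


n = 10_000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]  # midpoints of the grid cells in [0, 1]

f = [x for x in xs]              # f(x) = x
g = [1.0 - x for x in xs]        # g(x) = 1 - x
s = [a + b for a, b in zip(f, g)]  # (f + g)(x) = 1

norm_f = weak_l1_quasinorm(f, dx)
norm_g = weak_l1_quasinorm(g, dx)
norm_sum = weak_l1_quasinorm(s, dx)

print(norm_f, norm_g, norm_sum)
```

The grid size and the choice of test functions are arbitrary; any pair of functions whose level sets do not nest would serve equally well.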


Note that $\|\cdot\|_{1,w}$ is not a norm [17]. By $L^1_{k,w}$, $k = 0, 1, \dots$, we denote the space of all $f \in L^1_w$ such that $\frac{d^l}{dx^l} f(x) \in L^1_w$, $0 \le l \le k$. For $f \in L^1_{k,w}$ we denote
\[ \|f\|_{1,k,w} := \sum_{l=0}^{k} \Big\| \frac{d^l}{dx^l} f(x) \Big\|_{1,w}. \tag{2.2} \]
The space of functions of bounded mean oscillation is designated by BMO [18]. By $BMO_k$, $k = 0, 1, \dots$, we denote the space of all functions $f \in BMO$ such that $\frac{d^l}{dx^l} f(x) \in BMO$, $0 \le l \le k$, with norm
\[ \|f\|_{BMO,k} := \sum_{l=0}^{k} \Big\| \frac{d^l}{dx^l} f(x) \Big\|_{BMO}. \tag{2.3} \]

For $k$ any complex number we denote by
\[ D_k(x) := \int_0^x e^{2iky}\, dy = \begin{cases} \dfrac{1}{2ik}\big( e^{2ikx} - 1 \big), & k \ne 0, \\ x, & k = 0. \end{cases} \tag{2.5} \]
It is proven in [4] that for each fixed $x \in \mathbb{R}$ the $m_j(x, k)$ are analytic in $k$ for $\operatorname{Im} k > 0$ and continuous for $\operatorname{Im} k \ge 0$ and that
\[ |m_1(x, k) - 1| \le C\, \frac{1 + \max(-x, 0)}{1 + |k|}; \qquad |m_2(x, k) - 1| \le C\, \frac{1 + \max(x, 0)}{1 + |k|}. \tag{2.6} \]
Moreover [4], the Jost solutions are independent solutions to (1.6) for $k \ne 0$ and there are unique functions $T(k)$ and $R_j(k)$, $j = 1, 2$, such that
\[ f_2(x, k) = \frac{R_1(k)}{T(k)} f_1(x, k) + \frac{1}{T(k)} f_1(x, -k); \qquad f_1(x, k) = \frac{R_2(k)}{T(k)} f_2(x, k) + \frac{1}{T(k)} f_2(x, -k), \tag{2.7} \]
for $k \in \mathbb{R} \setminus 0$. The function $T(k) f_1(x, k)$ describes scattering from left to right of a plane wave $e^{ikx}$ and $T(k) f_2(x, k)$ describes scattering from right to left of a plane wave $e^{-ikx}$. The function $T(k)$ is the transmission coefficient, $R_2(k)$ is the reflection coefficient from


left to right and $R_1(k)$ is the reflection coefficient from right to left. The relations (2.7) are expressed in terms of the $m_j(x, k)$ as follows:
\[ T(k) m_2(x, k) = R_1(k) e^{2ikx} m_1(x, k) + m_1(x, -k); \qquad T(k) m_1(x, k) = R_2(k) e^{-2ikx} m_2(x, k) + m_2(x, -k). \tag{2.8} \]
Moreover, $T(k) \ne 0$ for $k \ne 0$, and $T(k)$ is continuous for $\operatorname{Im} k \ge 0$, $k \ne i\beta_l$, $\beta_l > 0$, $l = 1, 2, \dots, N$. The numbers $-\beta_l^2$, $l = 1, 2, \dots, N$, are the simple eigenvalues of $H$. The $R_j(k)$ are continuous for $k \in \mathbb{R}$. Moreover (see [4] or [25]),
\[ |\dot{R}_j(k)| \le \frac{C}{|k|}; \qquad |\dot{T}(k)| \le \frac{C}{|k|}, \tag{2.9} \]
where we denote $\dot{f}(k) := \frac{d}{dk} f(k)$. Furthermore,
\[ T(k) = 1 + O\Big( \frac{1}{|k|} \Big); \qquad R_j(k) = O\Big( \frac{1}{|k|} \Big), \qquad |k| \to \infty. \tag{2.10} \]
The behaviour as $k \to 0$ is as follows:

(a) In the generic case
\[ T(k) = \alpha k + o(k),\ \alpha \ne 0,\ k \to 0,\ \operatorname{Im} k \ge 0; \qquad R_1(0) = R_2(0) = -1. \tag{2.11} \]

(b) In the exceptional case
\[ T(k) = \frac{2a}{1 + a^2} + o(1),\ k \to 0,\ \operatorname{Im} k \ge 0; \qquad R_1(k) = \frac{1 - a^2}{1 + a^2} + o(1); \quad R_2(k) = \frac{a^2 - 1}{1 + a^2} + o(1),\ k \to 0,\ k \in \mathbb{R}, \tag{2.12} \]
where $a := \lim_{x \to -\infty} f_1(x, 0) \ne 0$. For the results above about $T(k)$ and $R_j(k)$ see [4,15] and [13]. In particular for the continuity of $T(k)$ and of $R_j(k)$ as $k \to 0$ in the exceptional case for $V \in L^1_1$ see [13]. In [25] we have proven the following theorem.
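The zero-energy limits in (2.12) can be sanity-checked with exact arithmetic. The sketch below is an illustration under assumptions that are not stated in this section: $a$ is taken real and nonzero, and the identity $T(0)^2 + R_j(0)^2 = 1$ used as a check is the usual flux-conservation relation for a real potential. The function name `exceptional_limits` is hypothetical. For $a = 1$ the output is $T(0) = 1$, $R_1(0) = R_2(0) = 0$, so both brackets $[T(0) - 1 + R_1(0)P]$ and $[1 - T(0) - R_2(0)P]$ in (1.12) vanish, which matches the statement that the Hilbert-transform terms are absent in that case.

```python
from fractions import Fraction


def exceptional_limits(a):
    """Return (T(0), R_1(0), R_2(0)) from (2.12) for a real, nonzero a."""
    a = Fraction(a)
    denom = 1 + a * a
    transmission = 2 * a / denom
    reflection_1 = (1 - a * a) / denom
    reflection_2 = (a * a - 1) / denom
    return transmission, reflection_1, reflection_2


for a in (1, 2, Fraction(-1, 3)):
    T0, R1, R2 = exceptional_limits(a)
    # Flux conservation (assumed here for real potentials): T^2 + R^2 = 1.
    print(a, T0, R1, R2, T0 * T0 + R1 * R1)
```

In the generic case (2.11) gives instead $T(0) = 0$ and $R_1(0) = R_2(0) = -1$, so neither bracket vanishes.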

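Theorem 2.1. Assume that $V \in L^1_\gamma$.

(a) If $V$ is generic and $1 \le \gamma \le 2$, then
\[ |\dot{T}(k)| \le C(1 + |k|)^{-1},\ \operatorname{Im} k \ge 0; \qquad R_j(k_1) - R_j(k_2) = \begin{cases} o\big( |k_1 - k_2|^{\gamma - 1} \big), & 1 \le \gamma < 2, \\ O(|k_1 - k_2|), & \gamma = 2, \end{cases} \tag{2.13} \]
as $k_1 - k_2 \to 0$.

(b) If $V$ is exceptional and $2 \le \gamma \le 3$, then
\[ |\dot{T}(k)| \le C\, \frac{|k|^{\gamma - 3}}{(1 + |k|)^{\gamma - 2}}; \tag{2.14} \]
\[ T(k) - T(0) = O(|k|),\ R_j(k) - R_j(0) = O(|k|),\ k \to 0; \qquad R_j(k_1) - R_j(k_2) = O\big( |k_1 - k_2|^{\gamma - 2} \big),\ k_1 - k_2 \to 0. \tag{2.15} \]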


The results on the spectral theorem for $H$ that we state below follow from the Weyl–Kodaira–Titchmarsh theory. See [4] for example. For a version of the Weyl–Kodaira–Titchmarsh theory adapted to our situation see Appendix 1 of [27] and also the proof of Theorem 6.1 on p. 78 of [27]. For every $\phi \in \mathcal{H}_c(H)$ the following limits exist in the strong topology in $L^2$:
\[ \hat\phi_\pm(k) := \operatorname*{s-lim}_{N \to \infty} \int_{-N}^{N} \Psi_\pm(x, k) \phi(x)\, dx. \tag{2.16} \]
Moreover, the operators
\[ (F_\pm \phi)(k) := \hat\phi_\pm(k) \tag{2.17} \]
are unitary from $\mathcal{H}_c(H)$ onto $L^2$. Furthermore, the adjoint operators are given by
\[ (F_\pm^* \phi)(x) = \operatorname*{s-lim}_{N \to \infty} \int_{-N}^{N} \Psi_\pm(x, k) \phi(k)\, dk. \tag{2.18} \]
The operators $F_\pm^* F_\pm$ are equal to the orthogonal projector onto $\mathcal{H}_c(H)$, and
\[ H P_c = F_\pm^* k^2 F_\pm. \tag{2.19} \]
Finally, by the stationary formulas for the wave operators (see Eq. 12.7.5 of [19])
\[ W_\pm = F_\pm^* F, \tag{2.20} \]
where $F$ denotes the Fourier transform as a unitary operator in $L^2$:
\[ (F\phi)(k) := \operatorname*{s-lim}_{N \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-N}^{N} e^{-ikx} \phi(x)\, dx, \tag{2.21} \]
for every $\phi \in L^2$. We use below the notation: $\hat\phi(k) := (F\phi)(k)$.

Formula (2.20) is the starting point of our proof of Theorem 1.1. We first prepare some results that we need. Since $m_1(x, k) - 1$ belongs to the Hardy class it can be written as follows (see [4]):
\[ m_1(x, k) = 1 + \int_0^\infty B_1(x, y) e^{2iky}\, dy, \tag{2.22} \]
where for each fixed $x \in \mathbb{R}$, $B_1(x, \cdot) \in L^2(0, \infty)$. Moreover (see Lemma 3 of [4]),
\[ |B_1(x, y)| \le e^{\gamma_1(x)} \eta_1(x + y),\ x \in \mathbb{R},\ y > 0; \qquad \gamma_1(x) := \int_x^\infty (t - x)|V(t)|\, dt; \quad \eta_1(x) := \int_x^\infty |V(t)|\, dt. \tag{2.23} \]
Similarly,
\[ m_2(x, k) = 1 + \int_{-\infty}^{0} B_2(x, y) e^{-2iky}\, dy, \tag{2.24} \]


where for each fixed x ∈ R, B2 (x, ·) ∈ L2 (−∞, 0) and, |B2 (x, y)| ≤ eγ2 (x) η2 (x + y), x ∈ R, y < 0; Z x Z x (x − t)|V (t)| dt; η2 (x) := |V (t)| dt. γ2 (x) := −∞

(2.25)

−∞

We denote:

$$g_k(x) := \sum_{l=0}^{k-1} \bigl|V^{(l)}(x)\bigr|; \qquad h_{j,k}(x) := \eta_j(x) + g_k(x), \quad j = 1, 2, \tag{2.26}$$

$$Q_{1,k}(x,y) := 2^{k-1}\int_0^y g_k(x+y-z)\,dz; \qquad Q_{2,k}(x,y) := 2^{k-1}\int_y^0 g_k(x+y-z)\,dz. \tag{2.27}$$

Lemma 2.2. Suppose that $V \in L^1_1$ and that for some $k \ge 1$, $g_k \in L^1_{loc}$. Then for $l = 1, 2, \dots, k$,

$$\left|\frac{d^l}{dx^l} B_j(x,y)\right| \le h_{j,k}(x+y)\, e^{Q_{j,k}(x,y)}. \tag{2.28}$$

Proof. We give the proof for $B_1(x,y)$. The case of $B_2(x,y)$ follows similarly. It is proven in Lemma 3 of [4] that

$$B_1(x,y) = \sum_{n=0}^{\infty} K_n(x,y), \tag{2.29}$$

where the series converges absolutely and

$$K_0(x,y) := \int_{x+y}^{\infty} V(t)\,dt; \qquad K_{n+1}(x,y) := \int_0^y dz \int_{x+y-z}^{\infty} dt\, V(t)\, K_n(t,z), \quad n = 0, 1, \dots. \tag{2.30}$$

We claim that for $1 \le l \le k$:

$$\bigl|K_n^{(l)}(x,y)\bigr| \le \frac{2^{n(k-1)}}{n!}\, h_{1,k}(x+y) \left(\int_0^y dz\, g_k(x+y-z)\right)^{n}. \tag{2.31}$$

It follows from (2.30) that this is true for $n = 0$. Assuming that (2.31) holds for $n$ we prove it for $n + 1$:

$$K_{n+1}^{(l)}(x,y) = -\int_0^y dz\, \bigl(V(x+y-z)\,K_n(x+y-z,z)\bigr)^{(l-1)}. \tag{2.32}$$

Then

$$\bigl|K_{n+1}^{(l)}\bigr| \le 2^{k-1}\int_0^y dz\, g_k(x+y-z)\, 2^{n(k-1)}\, h_{1,k}(x+y)\, \frac{1}{n!} \left(\int_0^z dv\, g_k(x+y-v)\right)^{n} = 2^{(n+1)(k-1)}\, h_{1,k}(x+y)\, \frac{1}{(n+1)!} \left(\int_0^y g_k(x+y-z)\,dz\right)^{n+1}. \tag{2.33}$$

This proves (2.31). Equation (2.28) follows from (2.29) and (2.31). □
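The last equality in (2.33), and the passage from (2.31) to (2.28), both rest on elementary calculus. The following sketch spells them out. The auxiliary function $G$ is notation introduced here for illustration only and is not part of the original argument, and term-by-term differentiation of the series (2.29) is taken for granted.

```latex
% Assumption: G is auxiliary notation, for fixed x and y; g_k is locally integrable,
% so G is absolutely continuous and G'(z) = g_k(x+y-z) for almost every z.
\[
G(z) := \int_0^z g_k(x+y-v)\,dv, \qquad G'(z) = g_k(x+y-z), \qquad G(0) = 0 .
\]
% Iterated-integral identity used in the last step of (2.33):
\[
\int_0^y g_k(x+y-z)\,\frac{G(z)^n}{n!}\,dz
  = \int_0^y \frac{d}{dz}\,\frac{G(z)^{n+1}}{(n+1)!}\,dz
  = \frac{G(y)^{n+1}}{(n+1)!} .
\]
% Summing (2.31) over n and using Q_{1,k}(x,y) = 2^{k-1} G(y) from (2.27):
\[
\sum_{n=0}^{\infty} \bigl|K_n^{(l)}(x,y)\bigr|
  \le h_{1,k}(x+y) \sum_{n=0}^{\infty} \frac{\bigl(2^{k-1} G(y)\bigr)^n}{n!}
  = h_{1,k}(x+y)\, e^{Q_{1,k}(x,y)} .
\]
```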


We separate the proof of Theorem 1.1 into a high-energy estimate and a low-energy estimate. For this purpose let $\Phi$ be a function in $C^\infty$ such that $\Phi(k) = 0$ for $\sqrt{|k|} \le k_1$ and $\Phi(k) = 1$ for $\sqrt{|k|} \ge k_2$, for some $0 < k_1 < k_2$.

Lemma 2.3 (The High-Energy Estimate). Suppose that $V \in L^1_1$, and that for some $k \ge 1$, $V^{(l)} \in L^1$, $0 \le l \le k-1$. Then $W_\pm \Phi(H_0)$ and $\Phi(H_0) W_\pm^*$, originally defined on $W_{k,p} \cap L^2$, have extensions to bounded operators on $W_{k,p}$, $1 \le p \le \infty$, and for some constants $C_p$:

$$\|W_\pm \Phi(H_0) f\|_{W_{k,p}} \le C_p \|f\|_{W_{k,p}}; \qquad \|\Phi(H_0) W_\pm^* f\|_{W_{k,p}} \le C_p \|f\|_{W_{k,p}}, \quad f \in W_{k,p} \cap L^2. \tag{2.34}$$

Proof. We give the proof for $W_+$ and for $W_+^*$. The cases of $W_-$ and of $W_-^*$ follow similarly. Let $\chi \in C^\infty$ satisfy: $\chi = 0$ for $x \le 0$ and $\chi = 1$ for $x \ge 1$. We denote

$$\Phi_+(k^2) := \begin{cases} \Phi(k^2), & k \ge 0, \\ 0, & k < 0. \end{cases} \tag{2.35}$$

It follows from (2.8) and (2.20) that for any $f$ in the Schwartz space $S$,

$$\chi(x)\, W_\pm \Phi(H_0) f = \sum_{j=1}^{5} A_j f, \tag{2.36}$$

where

$$A_1 := \chi(x) C_1; \qquad C_1 := F^{-1} \Phi_+(k^2)(T(k) - 1) F; \qquad A_2 := C_2 C_1, \tag{2.37}$$

$$C_2 f := \chi(x) \int \frac{e^{ikx}}{\sqrt{2\pi}}\,(m_1(x,k) - 1)(Ff)(k)\,dk; \qquad A_3 := C_2 (I + C_3); \qquad C_3 := F^{-1}(\Phi(k^2) - 1) F, \tag{2.38}$$

$$A_4 := \chi(x)(I + C_3); \qquad A_5 := C_2 C_4 P; \qquad C_4 := F^{-1} \Phi_+(k^2) R_1(k) F; \qquad (Pf)(x) := f(-x); \qquad A_6 := \chi(x)\, C_4 P. \tag{2.39}$$

By (2.9) and (2.10) we have that $\Phi_+(k^2)(T(k) - 1) \in W_{1,2}$. Hence $F^{-1} \Phi_+(k^2)(T(k) - 1) \in L^1$, and it follows that $C_1 \in B(W_{l,p})$, $l = 0, 1, \dots$, $1 \le p \le \infty$. We prove in a similar way that $C_3, C_4 \in B(W_{l,p})$, $l = 0, 1, \dots$, $1 \le p \le \infty$. Equation (2.22) implies that

$$C_2 f = \frac{\chi(x)}{2} \int_{-\infty}^{0} B_1(x, -y/2)\, f(x-y)\,dy. \tag{2.40}$$
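Formula (2.40) follows from (2.22) by Fourier inversion and a change of variable. A sketch of the computation is given below for $f$ in the Schwartz space. The integration variable $s$ is introduced here only for illustration, and the exchange of the two integrals is taken for granted.

```latex
% Insert (2.22) into the definition of C_2 in (2.38) and exchange the integrals:
\[
C_2 f(x)
  = \chi(x) \int_0^\infty B_1(x,s)
      \left[ \frac{1}{\sqrt{2\pi}} \int e^{ik(x+2s)}\,(Ff)(k)\,dk \right] ds
  = \chi(x) \int_0^\infty B_1(x,s)\, f(x+2s)\,ds .
\]
% Substitute s = -y/2, ds = -dy/2; s in (0, \infty) corresponds to y in (-\infty, 0):
\[
C_2 f(x) = \frac{\chi(x)}{2} \int_{-\infty}^{0} B_1(x,-y/2)\, f(x-y)\,dy .
\]
```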


Note that $\frac{d^l}{dx^l} C_2 f$ is a finite sum of terms of the type

$$\frac{\chi^{(l_1)}(x)}{2} \int_{-\infty}^{0} B_1^{(l_2)}(x, -y/2)\, f^{(l_3)}(x-y)\,dy = \int_x^\infty c(x,y)\, f^{(l_3)}(y)\,dy, \tag{2.41}$$

where $l_1 + l_2 + l_3 = l$, and

$$c(x,y) := \frac{\chi^{(l_1)}(x)}{2}\, B_1^{(l_2)}(x, (y-x)/2). \tag{2.42}$$

It is a consequence of (2.28) that $c$ satisfies the standard criterion for $L^p$ boundedness:

$$\sup_{y \in \mathbb{R}} \int |c(x,y)|\,dx < \infty; \qquad \sup_{x \in \mathbb{R}} \int |c(x,y)|\,dy < \infty. \tag{2.43}$$
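Condition (2.43) is the hypothesis of the classical Schur test. As a reminder of why it yields boundedness for every $p$, here is the standard estimate. The constants $M_1$, $M_2$ and the operator name $C$ are labels chosen here and are not notation from the paper.

```latex
% Assumption: M_1, M_2 and C are illustrative labels for the two suprema in (2.43)
% and for the integral operator with kernel c.
\[
M_1 := \sup_{y} \int |c(x,y)|\,dx, \qquad
M_2 := \sup_{x} \int |c(x,y)|\,dy, \qquad
(Cf)(x) := \int c(x,y)\, f(y)\,dy .
\]
% Endpoint p = 1 (Fubini) and endpoint p = \infty (pointwise bound):
\[
\|Cf\|_{L^1} \le M_1 \|f\|_{L^1}, \qquad
\|Cf\|_{L^\infty} \le M_2 \|f\|_{L^\infty} .
\]
% Riesz--Thorin interpolation between the two endpoints, 1 \le p \le \infty:
\[
\|Cf\|_{L^p} \le M_1^{1/p}\, M_2^{1-1/p}\, \|f\|_{L^p} .
\]
```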

Then $C_2$ is bounded on $W_{l,p}$ for $0 \le l \le k$, $1 \le p \le \infty$. By (2.36), $\chi(x) W_+ \Phi(H_0)$ is bounded on $W_{l,p}$, $0 \le l \le k$, $1 \le p \le \infty$. We prove in a similar way, using (2.8), that $(1 - \chi(x)) W_+ \Phi(H_0)$ is bounded on $W_{l,p}$, $0 \le l \le k$, $1 \le p \le \infty$. This proves the lemma for $W_+ \Phi(H_0)$. We prove that $\Phi(H_0) W_+^* \chi(x)$ is bounded on $W_{l,p}$, $0 \le l \le k$, $1 \le p \le \infty$, observing that for $f \in S$ (see (2.36))

$$\Phi(H_0)\, W_+^*\, \chi(x) f = \sum_{j=1}^{5} A_j^* f, \tag{2.44}$$

and arguing as above. We establish in an analogous way that $\Phi(H_0) W_+^* (1 - \chi(x))$ is bounded on $W_{l,p}$, $0 \le l \le k$, $1 \le p \le \infty$. □

Let $\Psi$ be any function in $C_0^\infty$ such that $\Psi(k) = 1$, $\sqrt{|k|} \le k_1$, for some $k_1 > 0$.

Lemma 2.4 (Low-Energy Estimate). Suppose that $V \in L^1_\gamma$, where in the generic case $\gamma > 3/2$ and in the exceptional case $\gamma > 5/2$, and that for some $k = 1, 2, \dots$, $V^{(l)} \in L^1$, $l = 0, 1, \dots, k-1$. Then $W_\pm \Psi(H_0)$ and $\Psi(H_0) W_\pm^*$, originally defined on $W_{k,p} \cap L^2$, $1 \le p \le \infty$, have extensions to bounded operators on $W_{k,p}$, $1 < p < \infty$. Moreover, for some constants $C_p$, $1 < p < \infty$,

$$\|W_\pm \Psi(H_0) f\|_{k,p} \le C_p \|f\|_{k,p}; \qquad \|\Psi(H_0) W_\pm^* f\|_{k,p} \le C_p \|f\|_{k,p}, \quad f \in W_{k,p} \cap L^2,\ 1 < p < \infty. \tag{2.45}$$

If furthermore $V$ is exceptional and $a := \lim_{x \to -\infty} f_1(x,0) = 1$, then $W_\pm \Psi(H_0)$ and $\Psi(H_0) W_\pm^*$ have extensions to bounded operators on $W_{k,1}$ and on $W_{k,\infty}$, and there are constants $C_1$, $C_\infty$ such that (2.45) holds for $p = 1$ and $p = \infty$.

Proof. We give the proof in the case of $W_+$. The case of $W_-$ follows in a similar way. Let $\chi$ be as in the proof of Lemma 2.3. For $f \in S$ we decompose $\chi W_+ \Psi(H_0)$ as follows (see (2.8) and (2.20)):

$$\chi(x)\, W_+ \Psi(H_0) f = W_{+,1} f + \frac{i\chi(x)}{2}\, \mathcal{H} C_3\, [T(0) - 1 + R_1(0) P]\, f, \tag{2.46}$$


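where

$$W_{+,1} f = \sum_{j=1}^{7} A_j f, \tag{2.47}$$

$$A_1 := \chi(x) C_1; \qquad C_1 := F^{-1} T_+(k) F; \qquad T_+(k) := \begin{cases} \Psi(k^2)(T(k) - T(0)), & k \ge 0, \\ 0, & k < 0, \end{cases} \tag{2.48}$$

$$A_2 := \chi(x)\, \frac{T(0) + 1}{2}\, C_3; \qquad C_3 := F^{-1} \Psi(k^2) F; \qquad A_3 := \chi(x)\, C_4 P; \qquad C_4 := F^{-1} R_+(k) F, \tag{2.49}$$

$$R_+(k) := \begin{cases} (R_1(k) - R_1(0))\,\Psi(k^2), & k \ge 0, \\ 0, & k < 0; \end{cases} \qquad A_4 := \chi(x)\, \frac{R_1(0)}{2}\, C_3 P, \tag{2.50}$$

$$A_5 f(x) := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 + \operatorname{sign}(k))}{2}\, T(k)\, \Psi(k^2)\,(m_1(x,k) - 1)\, \hat{f}(k), \tag{2.51}$$

$$A_6 f := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 + \operatorname{sign}(k))}{2}\, R_1(k)\, \Psi(k^2)\,(m_1(x,k) - 1)\,(F(Pf))(k), \tag{2.52}$$

$$A_7 f(x) := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 - \operatorname{sign}(k))}{2}\, \Psi(k^2)\,(m_1(x,k) - 1)\, \hat{f}(k), \tag{2.53}$$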

and where $(Pf)(x) := f(-x)$. It follows from Theorem 2.1 and from Proposition 4 on p. 139 of [21] that $T_+ \in L^2_\alpha$, for some $\alpha > 1/2$. Then $F^{-1} T_+ \in L^1$, and we have that $C_1 \in B(W_{k,p})$, $k = 0, 1, \dots$, $1 \le p \le \infty$. We prove in the same way that $C_4$ is bounded in $W_{k,p}$, $k = 0, 1, \dots$, $1 \le p \le \infty$. Note that

$$A_5 = C_2\, \frac{(1 + i\mathcal{H})}{2}\, C_5; \qquad C_5 := F^{-1} T(k) \Psi(k^2) F, \tag{2.54}$$

with $C_2$ as in (2.40). It follows from Theorem 2.1, as above, that $C_5 \in B(W_{k,p})$, $k = 0, 1, \dots$, $1 \le p \le \infty$. Then $A_5 \in B(W_{k,p})$, $1 < p < \infty$. We prove in a similar way that $A_6, A_7 \in B(W_{k,p})$, $1 < p < \infty$. We now complete the proof of the lemma as in the proof of Lemma 2.3, using the properties of the Hilbert transform quoted at the beginning of this section. Observe that in the exceptional case with $a = 1$, $T(0) = 1$ and $R_j(0) = 0$. Then, in this case there are no singular terms that contain the Hilbert transform in the second term on the right-hand side of (2.46). Moreover, in this case

$$A_5 + A_7 = C_2 C_1 + C_2 C_3 \in B(W_{k,p}); \qquad A_6 = C_2 C_4 P \in B(W_{k,p}), \quad 1 \le p \le \infty. \tag{2.55}$$
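The factorization (2.54) and the vanishing of the singular term when $a = 1$ can both be read off from the definitions. The sketch below assumes that the Hilbert transform $\mathcal{H}$ is normalized so that $F^{-1} \operatorname{sign}(k) F = i\mathcal{H}$, which is the normalization that (2.51) and (2.54) jointly require. The definition of $\mathcal{H}$ itself is given earlier in the section and is not restated here, and the function $g$ is an auxiliary label.

```latex
% Assumption: F^{-1} sign(k) F = i\mathcal{H}; g is an auxiliary function.
% By (2.38), C_2 g = \chi(x) \int \frac{e^{ikx}}{\sqrt{2\pi}} (m_1(x,k) - 1)\, \hat g(k)\, dk.
% In (2.51) the role of \hat g is played by
\[
\hat g(k) = \frac{1 + \operatorname{sign}(k)}{2}\, T(k)\, \Psi(k^2)\, \hat f(k)
          = \Bigl( F\, \frac{1 + i\mathcal{H}}{2}\, C_5 f \Bigr)(k),
\qquad C_5 = F^{-1} T(k) \Psi(k^2) F ,
\]
\[
\text{so that} \quad A_5 = C_2\, \frac{1 + i\mathcal{H}}{2}\, C_5 .
\]
% Exceptional case with a = 1: T(0) = 1 and R_1(0) = 0, so the bracket in (2.46) vanishes:
\[
T(0) - 1 + R_1(0) P = 0 .
\]
```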


Proposition 2.5. Suppose that $V \in L^1_\gamma$, $\gamma > 5/2$, and that for some $k = 0, 1, \dots$, $V^{(l)} \in L^1_1$, $l = 0, 1, \dots, k-1$. Then, given $\chi \in C^\infty$ with $\chi(x) = 0$, $x \le 0$, and $\chi(x) = 1$, $x \ge 1$, and given $\Psi \in C_0^\infty$ with $\Psi(k) = 1$, $\sqrt{|k|} \le k_1$, for some $|k_1| > 0$, we can decompose $W_\pm$ as follows:

$$W_\pm = W_{\pm,r} \pm i\chi(x)\, f_1(x,0)\, \frac{\mathcal{H} C_3}{2}\, [T(0) - 1 + R_1(0) P] \pm i(1 - \chi(x))\, f_2(x,0)\, \frac{\mathcal{H} C_3}{2}\, [1 - T(0) - R_2(0) P], \tag{2.56}$$

where $W_{\pm,r}$, $W_{\pm,r}^*$ are bounded operators on $W_{k,p}$, $1 \le p \le \infty$.

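Proof. We give the proof in the case of $W_+$. The case of $W_-$ follows in a similar way. For $f \in S$ we decompose $\chi W_+ \Psi(H_0)$ as follows (see (2.8) and (2.20)):

$$\chi(x)\, W_+ \Psi(H_0) f = W_{+,1} f + \frac{i\chi(x)}{2}\, m_1(x,0)\, \mathcal{H} C_3\, [T(0) - 1 + R_1(0) P]\, f, \tag{2.57}$$

where

$$W_{+,1} f = \sum_{j=1}^{7} A_j f, \tag{2.58}$$

$$A_1 := \chi(x)\, m_1(x,0)\, C_1; \qquad A_2 := \chi(x)\, m_1(x,0)\, \frac{T(0) + 1}{2}\, C_3, \tag{2.59}$$

$$A_3 := \chi(x)\, m_1(x,0)\, C_4 P; \qquad A_4 := \chi(x)\, m_1(x,0)\, \frac{R_1(0)}{2}\, C_3 P, \tag{2.60}$$

where $C_1$, $C_3$, $C_4$, $C_5$ are as in Lemma 2.4, and

$$A_5 f(x) := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 + \operatorname{sign}(k))}{2}\, T(k)\, \Psi(k^2)\,(m_1(x,k) - m_1(x,0))\, \hat{f}(k), \tag{2.61}$$

$$A_6 f := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 + \operatorname{sign}(k))}{2}\, R_1(k)\, \Psi(k^2)\,(m_1(x,k) - m_1(x,0))\,(F(Pf))(k), \tag{2.62}$$

$$A_7 f(x) := \chi(x) \int dk\, \frac{e^{ikx}}{\sqrt{2\pi}}\, \frac{(1 - \operatorname{sign}(k))}{2}\, \Psi(k^2)\,(m_1(x,k) - m_1(x,0))\, \hat{f}(k). \tag{2.63}$$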

We prove as in Lemma 2.4 that $A_j \in B(W_{k,p})$, $1 \le p \le \infty$, $1 \le j \le 4$. For this purpose note that by (2.22) and (2.28), $m_1^{(l)}(x,0) \in L^\infty(\mathbb{R}^+)$, $0 \le l \le k$. It follows from (2.22) that

$$A_5 f(x) = C_6 C_5 f(x); \qquad C_5 := F^{-1}\, T(k)\, \Xi(k^2)\, F; \qquad C_6 f(x) := \int c(x,y)\, f(y)\,dy, \tag{2.64}$$

with $\Xi \in C_0^\infty$, $\Xi(k^2) = 1$ for $k \in \operatorname{support} \Psi(k^2)$, and

$$c(x,y) := \chi(x) \int_0^\infty B_1(x,z)\, g_z(y)\,dz; \qquad g_z(y) := \frac{1}{2\pi} \int dk\, \Psi(k^2)\, \frac{(1 + \operatorname{sign}(k))}{2}\, [e^{i2kz} - 1]\, e^{-iky}. \tag{2.65}$$

But,

$$\|g_z(\cdot)\|_{L^1} \le C \left\| \Psi(k^2)\, \frac{(1 + \operatorname{sign}(k))}{2}\, [e^{2ikz} - 1] \right\|_{1,2} \le C|z|. \tag{2.66}$$

Then, by (2.23), $c(x,y)$ satisfies condition (2.43) and it follows that $A_5 \in B(L^p)$, $1 \le p \le \infty$. We prove in a similar way that $A_5 \in B(W_{k,p})$, $1 \le p \le \infty$. By an analogous argument we establish that $A_6, A_7 \in B(W_{k,p})$, $1 \le p \le \infty$. This proves that $W_{+,1} \in B(W_{k,p})$, $1 \le p \le \infty$. The proof of the proposition is now completed by decomposing $(1 - \chi(x)) W_+ \Psi(H_0)$ in a similar way and arguing as in the proof of Lemma 2.3. Furthermore, by Lemma 2.3, $(I - \Psi(H_0)) W_\pm$ contributes to $W_{\pm,r}$. □

Proof of Theorem 1.1. The theorem follows from Lemmas 2.3 and 2.4.

Added in proof. After this paper was completed I learned that a result on the continuity of the one-dimensional wave operators in $L^p$, $1 < p < \infty$, was obtained in: Galtbayar, A., Yajima, K., $L^p$ boundedness of wave operators for one-dimensional Schrödinger operators, preprint 1999. Galtbayar and Yajima proved their result under conditions on the potential that are more restrictive than ours. They require that $V^{(1)} \in L^1_2$ and that $V \in L^1_\gamma$, where in the generic case $\gamma = 3$ and in the exceptional case $\gamma = 4$. I thank K. Yajima for informing me about his result.

References

1. Adams, R.A.: Sobolev Spaces. New York: Academic Press, 1975
2. Beals, M., Strauss, W.: $L^p$ estimates for the wave equation with a potential. Commun. Part. Diff. Equations 18, 1365–1397 (1993)
3. Chadan, K., Sabatier, P.C.: Inverse Problems in Quantum Scattering Theory. Second Edition. Berlin: Springer-Verlag, 1989
4. Deift, P., Trubowitz, E.: Inverse scattering on the line. Commun. Pure Appl. Math. XXXII, 121–251 (1979)
5. Faddeev, L.D.: Properties of the S matrix of the one-dimensional Schrödinger equation. Trudy Math. Inst. Steklov 73, 314–333 (1964) [English translation American Mathematical Society Translation Series 2 65, 139–166 (1964)]
6. Faddeev, L.D.: Inverse problems of quantum scattering theory, II. Itogi Nauki i Tekhniki Sovremennye Problemy Matematiki 3, 93–180 (1974) [English translation J. Soviet Math. 5, 334–396 (1976)]
7. Ginibre, J., Velo, G.: Generalized Strichartz inequalities for the wave equation. J. Funct. Analysis 133, 50–68 (1995)
8. Ginibre, J.: Introduction aux Équations de Schrödinger non Linéaires. Paris: Onze Édition, 1998
9. Hörmander, L.: Lectures on Nonlinear Hyperbolic Differential Equations. Mathématiques & Applications 26, Berlin: Springer-Verlag, 1997
10. Jensen, A., Nakamura, G.: Mapping properties of functions of Schrödinger operators between $L^p$-spaces and Besov spaces. In: Yajima, K. (ed.), Spectral and Scattering Theory and Applications, Tokyo: Adv. Stud. Pure Appl. Math. 23, Math. Soc. Japan, 1994, pp. 187–209
11. Jensen, A., Nakamura, G.: $L^p$-mapping properties of functions of Schrödinger operators and their applications to scattering theory. J. Math. Soc. Japan 47, 253–273 (1995)
12. Kato, T.: Nonlinear Schrödinger equations. In: Holden, H., Jensen, A. (eds.), Schrödinger Operators, Lecture Notes in Physics 345, Berlin: Springer-Verlag, 1989, pp. 218–263
13. Klaus, M.: Low-energy behaviour of the scattering matrix for the Schrödinger equation on the line. Inverse Problems 4, 505–512 (1988)
14. Marshall, B., Strauss, W., Wainger, S.: $L^p$–$L^q$ estimates for the Klein–Gordon equation. J. Math. Pures et Appl. 59, 417–440 (1980)
15. Newton, R.G.: Low energy scattering for medium range potentials. J. Math. Phys. 27, 2720–2730 (1986)
16. Racke, R.: Lectures in Nonlinear Evolution Equations. Initial Value Problems. Aspects of Mathematics E 19, Braunschweig–Wiesbaden: F. Vieweg & Son, 1992
17. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
18. Sadosky, C.: Interpolation of Operators and Singular Integrals. New York: Marcel Dekker, 1979
19. Schechter, M.: Operator Methods in Quantum Mechanics. New York: North Holland, 1981
20. Strauss, W.A.: Nonlinear Wave Equations. CBMS–RCSM 73, Providence, RI: American Mathematical Society, 1989
21. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton Univ. Press, 1970
22. Strichartz, R.S.: Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations. Duke Math. J. 44, 705–714 (1977)
23. Ta-Tsien, Li, Yunmei, Chen: Global Solutions for Nonlinear Evolution Equations. Harlow: Longman Scientific & Technical, 1992
24. Weder, R.: Inverse scattering for the nonlinear Schrödinger equation. Commun. Part. Diff. Equations 22, 2089–2103 (1997)
25. Weder, R.: $L^p$–$L^{p'}$ estimates for the Schrödinger equation on the line and inverse scattering for the nonlinear Schrödinger equation with a potential. Preprint 1998. To appear in J. Funct. Analysis
26. Weidmann, J.: Spectral Theory of Ordinary Differential Operators. Lecture Notes in Mathematics 1258, Berlin: Springer-Verlag, 1987
27. Wilcox, C.H.: Sound Propagation in Stratified Fluids. Applied Mathematical Sciences 50, Berlin–Heidelberg–New York: Springer-Verlag, 1984
28. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. Proc. Japan Acad. 69, Ser. A, 94–98 (1993)
29. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47, 551–581 (1995)
30. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. II. Positive potentials in even dimensions $m \ge 4$. In: Ikawa, M. (ed.), Spectral and Scattering Theory (Sanda 1992), Lecture Notes in Pure and Applied Mathematics 161, New York: Dekker, 1994, pp. 287–300
31. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. III. Even-dimensional cases $m \ge 4$. J. Math. Sci. Univ. Tokyo 2, 311–346 (1995)
32. Yajima, K.: $L^p$-continuity of wave operators for Schrödinger operators and its applications. In: Proceedings of the Korea–Japan Partial Differential Equations Conference, Lecture Notes Ser. 39, Seoul: Seoul Nat. Univ., 1997, 13 pp.

Communicated by B. Simon

Commun. Math. Phys. 208, 521 – 540 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Differential Graded Cohomology and Lie Algebras of Holomorphic Vector Fields Friedrich Wagemann Institut Girard Desargues – UPRES-A 5028 du CNRS, Université Claude Bernard Lyon-I, 43, bd du 11. Novembre 1918, 69622 Villeurbanne Cedex, France. E-mail: [email protected] Received: 25 February 1999 / Accepted: 20 July 1999

Abstract: This article continues work of B. L. Feigin [5] and N. Kawazumi [15] on the Gelfand-Fuks cohomology of the Lie algebra of holomorphic vector fields on a complex manifold. As this is not always an interesting Lie algebra (for example, it is 0 for a compact Riemann surface of genus greater than 1), one looks for other objects having locally the same cohomology. The answer is a cosimplicial Lie algebra and a differential graded Lie algebra (well known in Kodaira–Spencer deformation theory). We calculate the corresponding cohomologies and the result is very similar to the result of A. Haefliger [12], R. Bott and G. Segal [2] in the case of C ∞ vector fields. Applications are in conformal field theory (for Riemann surfaces), deformation theory and foliation theory.

Introduction The continuous cohomology of Lie algebras of C ∞ -vector fields [2,7] has proven to be a subject of great geometrical interest: One of its most famous applications is the construction of the Virasoro algebra as the universal central extension of the Lie algebra of vector fields on the circle. So there is the natural problem of calculating the continuous cohomology of the Lie algebra of holomorphic vector fields on a complex manifold, see [5] and [15] for Riemann surfaces. This work is a solution of this problem for arbitrary complex manifolds up to the calculation of the cohomology of spaces of sections of complex bundles on the manifold – this is very close to the result for C ∞ -vector fields. We also show the relation between the cohomology of the holomorphic vector fields and the differential graded cohomology of some differential graded Lie algebra. The method is the one of R. Bott and G. Segal [2] – used also by N. Kawazumi [15], and for the relation with the differential graded cohomology, based on the article of B. L. Feigin [5].


One interest is in compact complex manifolds: here, the Lie algebra of holomorphic vector fields seems to be too small to be interesting. For compact Riemann surfaces of genus g it is of dimension 3 for g = 0, 1 for g = 1 and 0 for g ≥ 2. However, treating the holomorphic vector fields as a sheaf rather than taking brutally global sections proves to reveal a richer cohomology theory, as first remarked by B. L. Feigin [5]. We study the relation of the sheaf Hol of Lie algebras of holomorphic vector fields to the sheaf g of vector valued differential forms of type (0, q), where the values are in the holomorphic vector fields. It is called the sheaf of Kodaira–Spencer algebras and it constitutes a sheaf of differential graded Lie algebras which is a fine sheaf resolution of Hol. We will calculate differential graded (co)-homology for the Kodaira–Spencer algebra (i.e. the space of global sections of g), also with coefficients.

Another important idea of this article is the following: Let h be a sheaf of differential graded Lie algebras. There is a sheaf of differential graded coalgebras $C_{dg,*}(h)$ with a corresponding sheaf of differential graded Lie algebra homology $H_{*,dg}(h)$. This is the sheafified Quillen functor, see [18] and [14]. In the same way, there is a sheaf of differential graded algebras $C^*_{dg}(h)$ corresponding to the sheaf of differential graded cohomology $H^*_{dg}(h)$ of h. Now assume that h is not necessarily fine, but that there is a morphism φ to a fine sheaf g of differential graded Lie algebras which is a cohomology equivalence (i.e. $H^*_{dg}(\Gamma(U, g)) = H^*_{dg}(\Gamma(U, h))$) on each contractible open set U. In this case, hypercohomology (for the differential sheaf $C^*_{dg}(h)$) and cosimplicial cohomology (i.e. the cohomology of the realization of the simplicial complex obtained from applying the functor $C^*_{dg}$ to the Čech resolution of h) coincide under suitable finiteness conditions for g and h.
This is true because φ induces an isomorphism on the cohomology sheaves of the sheaves $C^*_{dg}(g)$ and $C^*_{dg}(h)$, inducing an isomorphism in hypercohomology. As g is fine, hypercohomology is just the cohomology of the complex of global sections of $C^*_{dg}(g)$. On the simplicial side, we have a morphism of simplicial cochain complexes induced by φ which is a cohomology equivalence on the realizations, see [2], Lemma 5.9. By a standard argument using partitions of unity for the fine sheaf g, see [2, §8], the realization of the simplicial cochain complex gives the cohomology of the complex of global sections of $C^*_{dg}(g)$.

We will apply this scheme of reasoning to the sheaf of holomorphic vector fields h = Hol on a complex manifold, and its fine resolution given by the sheaf g of $d\bar{z}$-forms with values in holomorphic vector fields, the sheaf of Kodaira–Spencer algebras. Applications of these calculations are in conformal field theory, cf. [5], in deformation theory, cf. [14], and in the theory of foliations. This work originated in the attempt to understand Feigin's article [5], so the text relies heavily on [5].

The content of the paper reads as follows: The first part is devoted to cohomology calculations: Sect. 1 is concerned with the definition of differential graded cohomology (also with coefficients), hypercohomology, the spectral sequences that go with them as tools for calculations, and the introduction of the sheaves Hol and g; Sect. 2 studies the cohomology of Hol(U) on a Stein open set U, linking it with the differential graded cohomology of $\Gamma(U, g)$; at the end of Sect. 2, we treat the cosimplicial version which gives an equivalent point of view according to the idea explained in the introduction; Sect. 3 gives the calculation of the cohomology in Sect. 2 in terms of the cohomology of some spaces of sections of some bundle on the manifold – the result is very close to Bott,


Haefliger and Segal's result [2,12]. The second part is concerned with the applications of these calculations: Sect. 4 just mentions the existing link to conformal field theory, see [5]; Sect. 5 treats the applications in deformation theory, following [14]; Sect. 6 shows a glimpse of possible applications in the theory of characteristic classes of foliations.

Notations. As a general rule, $g$, $h$ will denote Lie algebras and gothic letters $\mathfrak{g}$, $\mathfrak{h}$ will denote sheaves of Lie algebras. For differential graded Lie algebras, the differential will be displayed in the notation: $(g, d)$ is a differential graded Lie algebra and $(\mathfrak{g}, d)$ is a sheaf of differential graded Lie algebras. After the preliminaries, the letter $\mathfrak{g}$ will be reserved for the Kodaira–Spencer algebra, viewed as a sheaf of differential graded Lie algebras.

1. Preliminaries

1.1. Differential graded (co-)homology.

1.1.1. Let g be an infinite dimensional topological Lie algebra. Its (co-)homology is calculated by associating to g a differential graded coalgebra $C_*(g) = (\Lambda^*(g), d)$ and a differential graded algebra $C^*(g) = (\mathrm{Hom}(\Lambda^*(g), \mathbb{C}), d)$, the homological and the cohomological Chevalley–Eilenberg complex, and then taking their (co-)homology. In order to keep notations clear, we will suppress structures we don't need in the notation, as for example the algebra and coalgebra structure and the grading here. As we deal with tensor products of infinite dimensional topological vector spaces, we will always take them to be completed. It is worth taking only the continuous duals instead of the algebraic duals $\mathrm{Hom}(\Lambda^*(g), \mathbb{C})$ in the definition of cohomology, denoted then $C^*_{cont}(g)$, in order to improve calculability and avoid pathologies.

1.1.2. Let $(g = \bigoplus_{i=0}^{n} g^i, \bar\partial)$ be a (cohomological) differential graded Lie algebra, dgla for short. There are as before two functors, noted here $C_{*,dg}$ and $C^*_{dg}$, associating to $(g, \bar\partial)$ a differential graded coalgebra $C_{*,dg}(g)$ and a differential graded algebra $C^*_{dg}(g)$. We will assume continuous duals in $C^*_{dg}(g)$ without displaying it in the notation. $C_{*,dg}(g)$ and $C^*_{dg}(g)$ extend the functors in 1.1.1: for a trivial dgla $(g, \bar\partial) = (g, 0)$ with only its 0th space in the grading non-zero, we have $C_{*,dg}(g) = C_*(g)$ and $C^*_{dg}(g) = C^*_{cont}(g)$. $C_{*,dg}$ is called the Quillen functor, see [18], and explicitly constructed in [14] §2.2. The cohomology version was, to the knowledge of the author, first used by Haefliger [12], see also [19] for useful remarks.

1.1.3. Explicitly

$$C_{k,dg}(g) := \bigoplus_{k=p+q} C^{p}_{dg}(g)^{q} := \bigoplus_{k=q+p} S^{-p}(g[1])^{q},$$

as graded vector spaces. “dg” stands for “differential graded”. Here $S^{p}(g[1])^{q}$ is the graded symmetric algebra $S^*$ on the shifted by 1 graded vector space $g[1]$, i.e. $g[1]^q := g^{q+1}$.
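To see concretely why the shift by 1 recovers the exterior algebra of 1.1.1, consider the special case of a dgla concentrated in degree 0. The following is the standard décalage identification, stated as a sketch. The symbol $V$ is a placeholder for the underlying vector space and sign conventions are only indicated.

```latex
% Assumption: V = g is concentrated in degree 0, so V[1] is concentrated in degree -1 (odd).
% For odd elements the graded-commutativity rule  ab = (-1)^{|a||b|} ba  reads  ab = -ba, hence
\[
S^{n}(V[1]) \;\cong\; \bigl(\Lambda^{n} V\bigr)[n],
\]
% i.e. the n-th graded symmetric power of V[1] is the n-th exterior power of V,
% placed in internal degree -n.
% With \bar\partial = 0 only the Chevalley--Eilenberg differential survives, consistent with
\[
C_{*,dg}(g) = C_*(g) = (\Lambda^*(g), d) .
\]
```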


Note that for $g^0 \neq 0$, we have in $g[1]$ a component of degree −1. $S^{-p}(g[1])^{q}$ is bigraded by the tensor degree −p and the internal degree q which is induced by the grading of $g[1]$. The differential on $C_{*,dg}(g)$ is the direct sum of the graded homological Chevalley–Eilenberg differential in the tensor direction (with degree reversed in order to still have a cohomological differential) and the differential induced on $S^*(g[1])^*$ by $\bar\partial$, noted $\bar\partial$. Note that the differential graded homology of g, denoted by $H_{*,dg}(g)$, is calculated by a cohomological complex, but involving the homological Chevalley–Eilenberg differential.

1.1.4. $C_{*,dg}(g)$ is the total direct sum complex associated to the bicomplex $\{S^{-p}(g[1])^{q}\}_{p,q}$. So, there is a spectral sequence associated to the filtration by the columns, taking first cohomology in one column, i.e. cohomology with respect to $\bar\partial$. Note that $H^*_{\bar\partial}$ is a functor from dgla's to graded Lie algebras. Let us identify the $E_2$ term as well as where the sequence converges:

Lemma 1. Suppose that the complex $(g, \bar\partial)$ is a topological complex of Fréchet nuclear spaces. There is a spectral sequence with

$$E_2^{p,q} = H^{p}_{gla}\bigl(H^{q}_{\bar\partial}(g)\bigr)$$

converging to $H_{p+q,dg}(g)$, i.e. the differential graded homology of $(g, \bar\partial)$. Here, $H^*_{gla}$ denotes the cohomology of graded Lie algebras.

Remark. Other names for the morphisms involved in a topological cochain complex are strict morphisms or homomorphisms, see [3, Ch. III, §2, no. 8]. These are not necessarily “morphismes forts” or split in the sense of [11] or [21].

Proof. For $E_2$, the only thing which is not clear is $C_{*,dg}(H^*_{\bar\partial}(g)) = H^*_{\bar\partial}(C_{*,dg}(g))$. This would follow directly from Prop. 2.1 in [18] if we were not taking completed tensor products. This proposition holds also in the completed tensor product version, when the spaces involved are Fréchet nuclear spaces or their strong duals, and the complexes are topological: in this case, there is a Künneth formula, cf. [15, p. 673]. Then we can conclude as in [18], but we won't have a topological isomorphism, which is irrelevant for us. The convergence is a more difficult problem because the shifting $g[1]$ in $C_{*,dg}(g)$ creates an internal degree −1 for a dgla $(g = \bigoplus_{i=0}^{n} g^i, \bar\partial)$ and so the spectral sequence is not contained in the third quadrant. Actually, it is contained in the fourth quadrant. By the classical convergence theorem (cf. [24, p. 13]) the spectral sequence associated to the filtration by the columns converges to the total direct sum complex. This is by definition our differential graded homology. □


1.1.5. It is clear how to incorporate coefficients in a differential graded module $(M = \bigoplus_{i=0}^{k} M^i, \tilde\partial)$: such a module M is given as the direct sum of its components $M^i$ and carries a differential $\tilde\partial$ and an action of a dgla $(g, \bar\partial)$ such that for $x \in g$ and $m \in M$, we have

$$\tilde\partial(x.m) = \bar\partial(x).m + (-1)^{\deg(x)}\, x.\tilde\partial(m).$$

Now, take the graded tensor product $C_{*,dg}(g) \otimes M$ or the graded Hom-functor $\mathrm{Hom}(C_{*,dg}(g), M)$ with the action incorporated in the Chevalley–Eilenberg differential and the differential $\tilde\partial$ glued together with the differential $\bar\partial$ on $C_{*,dg}(g)$, i.e. in the homological case

$$\partial_{tot}(x \otimes m) = (-1)^{\deg x}\,(1 \otimes \tilde\partial)(x \otimes m) + (\bar\partial \otimes 1)(x \otimes m),$$

and in the cohomological case

$$\partial_{tot} f = \tilde\partial \circ f + (-1)^{\deg f}\, f \circ \bar\partial.$$

We suppose further that M is a topological Fréchet nuclear module, and take completed tensor products. Note that, as before, the functor $H^*_{\partial_{tot}}$ transforms differential graded objects into graded objects. There is analogously a spectral sequence and its corresponding lemma in this case:

Lemma 2. There is a spectral sequence with

$$E_2^{p,q} = H^{p}_{gla}\bigl(H^{q}_{\partial_{tot}}(g \otimes M)\bigr)$$

converging to $H_{p+q,dg}(g, M)$, i.e. the differential graded homology of $(g, \bar\partial)$ with coefficients in the differential graded module $(M, \tilde\partial)$. Here, $H^*_{gla}$ denotes the cohomology of graded Lie algebras with coefficients.

1.1.6. Now we want to calculate differential graded cohomology instead of homology, so let me specify a setting where this is possible. Let $(g, \bar\partial)$ be a topological dgla such that g is a Fréchet nuclear space. This permits us to calculate cohomology by calculating homology on the continuous dual:

Lemma 3. We have

$$C^*_{dg}(g) \cong \bigl(C_{*,dg}(g^*)\bigr),$$

where we treat all objects as graded vector spaces and $g^*$ is the continuous dual of g as a topological vector space.

Proof. This follows directly from the following proposition, see for example [22, Prop. 50.7, p. 524]:

Proposition 1. The continuous dual of a completed tensor product of two nuclear Fréchet spaces is the completed tensor product of the continuous duals of the two spaces. □

So, there is a spectral sequence for the differential graded cohomology in case $(g, \bar\partial)$ is also a topological complex, namely, the one from Lemma 1.


1.1.7. In the same setting as in 1.1.6, suppose that $(g, \bar\partial)$ is a resolution of a Lie algebra h which is a topological complex. As J.-P. Serre showed in [20], the $\bar\partial$ resolutions on compact Kähler or Stein manifolds are always topological cochain complexes. For topological cochain complexes of Fréchet nuclear spaces (or their duals), it is known that the strong dual complex is topological and has the dual cohomology spaces, cf. [15, p. 673]. This suits very well our approach to cohomology by the homology on the duals. In conclusion, the resolution $(g, \bar\partial)$ of h induces an exact sequence for the strong duals, and by the remark in 1.1.4, the spectral sequence in cohomology collapses at the second term. So, we have:

Lemma 4. Let $(g, \bar\partial)$ be a dgla as in 1.1.5 such that

$$H^*_{\bar\partial}(g) = \begin{cases} h & \text{for } * = 0, \\ 0 & \text{for } * = 1, 2, \dots. \end{cases}$$

Then

$$H^*_{dg}(g) = H^*_{cont}(h).$$

Let me remark that the spectral sequence here is converging in the sense of complete convergence, cf. [24, p. 139], and to the total direct product complex.

1.1.8. All these notions extend to sheaves of Lie algebras and sheaves of dgla's: Let X be a complex manifold of complex dimension n. Denote by $O_X$ the coherent sheaf of holomorphic functions on X and by $E_X$ the sheaf of $C^\infty$ functions on X. Let g be a sheaf of $O_X$-modules which are Lie algebras. Note that the bracket is not a morphism of $O_X$-modules. In some contexts, the action of the elements of the Lie algebra on $f \in O_X$ should be specified: this leads to the concept of twisted Lie algebras. This is for example the case when considering tensor products over $O_X$. In our context, everything is $\mathbb{C}$-linear, so we need not specify this action. In the same way, let $(g, \bar\partial)$ be a sheaf of dgla's which are $E_X$-modules. We denote by $\Gamma(g)$, $\Gamma(X, g)$ or $g(X)$ the dgla of global sections of the sheaf g. By the previous sections, we can associate to g resp. to $(g, \bar\partial)$ sheaves of differential graded coalgebras $C_*(g)$, $C_{*,dg}(g)$, $H_*(g)$ and $H_{*,dg}(g)$, where the last two carry the trivial differential. In the same way, we have sheaves of differential graded algebras $C^*_{cont}(g)$, $C^*_{dg}(g)$, $H^*_{cont}(g)$ and $H^*_{dg}(g)$. Furthermore, we have differential graded coalgebras $C_*(\Gamma(g))$, $C_{*,dg}(\Gamma(g))$, $H_*(\Gamma(g))$ and $H_{*,dg}(\Gamma(g))$, and the corresponding algebras.

1.2. Examples.

1.2.1. The prescription $U \mapsto \mathrm{Hol}(U)$, where U is an open set of X and Hol(U) is the Lie algebra of holomorphic vector fields on U, is a sheaf of Lie algebras, denoted by Hol. It is a coherent sheaf. It is in some respect the opposite of a fine sheaf: its restriction maps are injective. We have a sheaf of differential graded algebras $C^*_{cont}(\mathrm{Hol})$ associated to Hol. To be explicit, it is the sheaf $\mathrm{Hom}(\Lambda^*(\mathrm{Hol}), \mathbb{C})$ of morphisms of sheaves between $\Lambda^*(\mathrm{Hol})$ and the constant sheaf $\mathbb{C}$. Its underlying presheaf is $U \mapsto \mathrm{Hom}_{cont}(\Lambda^*(\mathrm{Hol})|_U, \mathbb{C}|_U)$.


Here, $\mathrm{Hom}_{cont}(F, G)$ is the functor of continuous sheaf morphisms between two sheaves of topological spaces F and G, i.e. of morphisms of presheaves $\phi_U = \{\phi_V : F(V) \to G(V)\}_{V \subset U}$ such that every $\phi_V$ is continuous. In particular, it is a differential sheaf, and one subject of this article will be to calculate its hypercohomology: Taking a sheaf resolution of every graded component of $C^*_{cont}(\mathrm{Hol})$, we get in a standard way (cf. [9, 4.5, p. 176]) a resolution of $C^*_{cont}(\mathrm{Hol})$. This gives a bicomplex; the cohomology of the total complex associated to it is by definition the hypercohomology of $C^*_{cont}(\mathrm{Hol})$, denoted by $\mathbb{H}(X, C^*_{cont}(\mathrm{Hol}))$. As $C^*_{cont}(\mathrm{Hol})$ is bounded below, we have two converging spectral sequences (associated to the two canonical filtrations for the bicomplex) for hypercohomology. We need in 2.1.4 the first one, the one given by the filtration by the columns. Its $E_2$ term is

$$E_2^{p,q} = H^{p}\bigl(X, H^{q}_{cont}(\mathrm{Hol})\bigr).$$

Here, $H^p(X, F)$ is the sheaf cohomology of the sheaf F. The second one is given by the filtration by the rows. Its $E_2$ term is

$$E_2^{p,q} = H^{q}_{d}\bigl(H^{p}(X, C^*_{cont}(\mathrm{Hol}))\bigr).$$

Here, d is the differential which $C^*_{cont}(\mathrm{Hol})$ induces in the resolutions of every component.

1.2.2. Let E be a holomorphic vector bundle over the complex manifold X. Denote by O(E) the sheaf of (germs of) holomorphic sections of E. Denote by $\Omega^{k,l}$ the sheaf of (germs of $C^\infty$ sections of) differential forms of type (k, l) on X. The tensor product $\Omega^{0,*} \otimes O(E)$ is a sheaf on X, the sheaf of (germs of $C^\infty$ sections of) differential forms with values in E (of type (0, ∗)). Let me denote by g this sheaf for E = TX, the complex tangent bundle of X. Note that O(TX) is simply Hol. g is a sheaf of dgla's: It is a vector space, graded by the degree of the differential form. The bracket on every open set is the restriction to the (0, ∗)-type forms with values in TX of the Frölicher–Nijenhuis bracket on $\Gamma(\Omega^{*,*} \otimes \mathrm{Vect})$, where Vect is the sheaf of all vector fields on the real manifold underlying X. This bracket is explained for example in [17], see also Sect. 2.2 of the present article. To give a short indication, it is the bracket of endomorphisms by viewing vector valued differential forms as derivations of the graded algebra of differential forms. The differential is just $\bar\partial$. It is easy to see that $\bar\partial$ acts as a graded derivation on the Frölicher–Nijenhuis bracket. We denote by $(g, \bar\partial)$ this sheaf of dgla's.

g is a fine sheaf because it is a sheaf of $C^\infty$ sections. So its sheaf of dg algebras $\mathrm{Hom}_{cont}(C_{*,dg}(g), \mathbb{C})$, which is as before a sheaf of morphisms of sheaves, is in fact isomorphic to the sheaf of morphisms between the spaces of sections: given a morphism of sheaves $\phi_U : C_{*,dg}(g)|_U \to \mathbb{C}|_U$, i.e. a compatible family $\phi_U = \{\phi_V : C_{*,dg}(g)(V) \to \mathbb{C}(V)\}_{V \subset U}$, we can construct a morphism of the spaces of global sections on U by partitions of unity. It is well known that the hypercohomology of a fine differential sheaf is just the cohomology of its complex of global sections, see for example [9, Thm. 4.6.1 p. 178]. This implies

$$\mathbb{H}\bigl(X, C^*_{dg}(g)\bigr) = H^*_{dg}(\Gamma(g)).$$

Another goal of this article is to calculate the differential graded cohomology of the Kodaira–Spencer algebra $\Gamma(g)$.

F. Wagemann

1.2.3. One remark is in order: actually, we should indicate in the notation $H^*_{dg}$ the way in which we proceeded to take total complexes associated to double complexes and cohomology with respect to differentials. The ambiguity involved stems from the fact that we are considering here hypercohomology of a bicomplex of sheaves, so the underlying homological problem is a tri-complex. For example, it would not be the same to apply the global section functor term by term to the bicomplex (for example on a compact Riemann surface), and take its differential graded cohomology afterwards.

Let us thus denote by ${}^1H^*_{dg}$ the differential graded cohomology obtained by the hypercohomology of the complex of sheaves given by the total complex of the bicomplex of sheaves.

Let us also denote by ${}^2H^*_{dg}$ the differential graded cohomology obtained by the cohomology of the total complex of the bicomplex of global sections of the double complex of sheaves.

1.2.4. Let me remark that the sheaf Hol is a sheaf of topological Fréchet nuclear Lie algebras because of the canonical Fréchet topology on the space of sections of a coherent sheaf, see for example [10, Ch. V, §6]. In the same way, $\mathfrak{g}$ is a sheaf of topological Fréchet nuclear dgla's, see for example [13]. $\mathfrak{g}$ as a space of $C^\infty$ functions carries the $C^\infty$ topology, and the canonical topology on a space of holomorphic functions is the same as the one induced from the $C^\infty$ topology on the $C^\infty$ functions.

2. The Cohomological Link Between Hol and $\mathfrak{g}$

There is a strong relationship between Hol and $\mathfrak{g}$ based on the fact that $\mathfrak{g}$ is a fine sheaf resolution of Hol. We will first show this for trivial coefficients, and then construct the right category of modules such that the relationship still holds for cohomology with coefficients.

2.1. Trivial coefficients.

2.1.1. Recall some $\bar\partial$ resolutions:

Lemma 5. There is an exact sequence of sheaves
\[
0 \to \mathrm{Hol} \to (\Omega^{0,0} \otimes \mathrm{Hol}) \xrightarrow{\bar\partial} (\Omega^{0,1} \otimes \mathrm{Hol}) \xrightarrow{\bar\partial} \dots \xrightarrow{\bar\partial} (\Omega^{0,n} \otimes \mathrm{Hol}) \to 0.
\]

Proof. Actually, we have an exact sequence of sheaves
\[
0 \to \Omega^{p}_{hol} \otimes_{\mathcal{O}} \mathcal{O}(E) \to \Omega^{p,0} \otimes_{\mathcal{O}} \mathcal{O}(E) \xrightarrow{\bar\partial \otimes 1} \Omega^{p,1} \otimes_{\mathcal{O}} \mathcal{O}(E) \xrightarrow{\bar\partial \otimes 1} \dots \xrightarrow{\bar\partial \otimes 1} \Omega^{p,n} \otimes_{\mathcal{O}} \mathcal{O}(E) \to 0
\]
with the sheaf of holomorphic differential forms on $X$, $\Omega^{p}_{hol}$, for every holomorphic fiber bundle $E$ on $X$, see for example [25]. Taking $p = 0$ and $E = TX$, we have our sequence. □

Corollary 1. The sheaf $(\mathfrak{g}, \bar\partial)$ is a resolution of Hol by fine sheaves.
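As an illustration of Lemma 5 in the simplest case, the following sketch writes out the resolution for $n = 1$ in a local holomorphic coordinate. The coordinate $z$ and the coefficient function $f$ are notation introduced here for illustration only and do not appear in the source.

```latex
% Sketch for n = 1 (a Riemann surface), in a local holomorphic coordinate z.
% f is a hypothetical smooth coefficient function, introduced for illustration.
\[
0 \to \mathrm{Hol} \to \Omega^{0,0} \otimes \mathrm{Hol}
  \xrightarrow{\ \bar\partial\ } \Omega^{0,1} \otimes \mathrm{Hol} \to 0,
\qquad
\bar\partial\Big( f\,\frac{\partial}{\partial z} \Big)
  = \frac{\partial f}{\partial \bar z}\, d\bar z \otimes \frac{\partial}{\partial z}.
\]
% Kernel of \bar\partial: fields f \partial_z with \partial f / \partial\bar z = 0,
% i.e. exactly the holomorphic vector fields.
```

Exactness on the right, at the level of germs, is the $\bar\partial$-Poincaré lemma in one variable: every smooth germ $h\, d\bar z$ can be written as $\frac{\partial f}{\partial \bar z}\, d\bar z$ for some smooth germ $f$.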

Cohomology of Lie Algebra of Holomorphic Vector Fields

2.1.2. Let us look for the open sets $U$ where the corollary holds not only for the sheaves, but for the spaces of global sections on $U$.

Definition 1. An open set $U \subset X$ of a complex manifold $X$ is called a Stein open set, if we have the following vanishing condition on coherent sheaf cohomology:
\[
H^*(U, \mathcal{F}) = 0 \quad \forall\, * = 1, 2, 3, \dots,
\]
and for all coherent sheaves $\mathcal{F}$ on $X$.

Lemma 6. For every Stein open set $U \subset X$, $(\Gamma(U, \mathfrak{g}), \bar\partial)$ is a resolution of $\mathrm{Hol}(U)$.

Proof. This follows from standard sheaf cohomology theory: as $(\mathfrak{g}, \bar\partial)$ is a fine sheaf resolution of Hol, $H^*(U, \mathrm{Hol})$ is the cohomology of $(\Gamma(U, \mathfrak{g}), \bar\partial)$. By definition of $U$, this cohomology is 0 except perhaps in degree 0. In degree 0, it is $\mathrm{Hol}(U)$. □

Applying now Lemma 4, we immediately get:

Corollary 2. For every Stein open set $U$, we have an isomorphism
\[
H^*_{cont}(\mathrm{Hol}(U)) = {}^1H^*_{dg}(\Gamma(U, \mathfrak{g})).
\]

2.1.3. We can state this result in a completely formal setting: Recall that $W_1$ is the Lie algebra of formal vector fields in 1 variable. Consider $G := W_1[[\bar z, t]] / (t^2)$. Then, $G$ is a differential graded Lie algebra with the bracket
\[
[X \bar z^{k} t^{n}, Y \bar z^{l} t^{m}] = [X, Y]\, \bar z^{k+l} t^{n+m},
\]
where $X, Y \in W_1$; it is the usual bracket on a tensor product of a Lie algebra with an associative algebra. The grading is given by the polynomial degree in $t$. The differential is just the operator $\bar\partial$ defined by
\[
\bar\partial\Big( \sum_i f_i(z, \bar z)\, t^{i}\, \frac{\partial}{\partial z} \Big) = \sum_i \frac{\partial}{\partial \bar z} f_i(z, \bar z)\, t^{i+1}\, \frac{\partial}{\partial z}.
\]
In particular, the elements of $G$ without $\bar z$ are the kernel of $\bar\partial$; these are the formal holomorphic vector fields. So the theorem can be stated in the 1-dimensional formal case as

Theorem 1.
\[
{}^1H^*_{dg}(G) \cong H^*_{cont}(W_1) \cong H^*_{sing}(S^3).
\]

Of course, there exists also the n-dimensional version, but it is too cumbersome to write down.
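To make the definitions of 2.1.3 concrete, here is a small worked check on two sample elements of $G$. The elements $a$ and $b$ are hypothetical choices made for illustration, and the sign convention $[f\partial_z, g\partial_z] = (f g' - g f')\partial_z$ for the bracket on $W_1$ is an assumption of this sketch rather than something fixed in the text.

```latex
% Hypothetical sample elements of G = W_1[[\bar z, t]]/(t^2), chosen for illustration.
\[
a = \bar z\, z^{2}\, \frac{\partial}{\partial z}, \qquad
b = z\, \frac{\partial}{\partial z}.
\]
% The differential differentiates in \bar z and raises the t-degree by one;
% b contains no \bar z, so it lies in the kernel (a formal holomorphic field).
\[
\bar\partial a = z^{2}\, t\, \frac{\partial}{\partial z}, \qquad
\bar\partial b = 0, \qquad
\bar\partial(\bar\partial a) = 0 .
\]
% Bracket, using the assumed convention [f \partial_z, g \partial_z] = (f g' - g f') \partial_z:
\[
[a, b] = \bar z\, (z^{2} \cdot 1 - z \cdot 2z)\, \frac{\partial}{\partial z}
       = -\bar z\, z^{2}\, \frac{\partial}{\partial z}.
\]
% Graded derivation property, here with a of degree 0:
\[
\bar\partial [a, b] = -z^{2}\, t\, \frac{\partial}{\partial z}
  = [\bar\partial a, b] + [a, \bar\partial b].
\]
```

In this instance $\bar\partial^2 = 0$ holds for two independent reasons: $z^2$ does not depend on $\bar z$, and $t^2 = 0$ in $G$.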

2.1.4. We can pursue 2.1.2 a little bit further applying hypercohomology:

Theorem 2. For every complex manifold $X$, there is an isomorphism
\[
\mathbb{H}^*(X, C^*_{cont}(\mathrm{Hol})) = {}^1H^*_{dg}(\Gamma(X, \mathfrak{g})).
\]

Proof. The preceding corollary gives the isomorphism on the filtered family of Stein neighbourhoods of a point $x \in X$. Passing to the inductive limit, we get an isomorphism of the cohomology sheaves,
\[
H^*_{cont}(\mathrm{Hol}) = {}^1H^*_{dg}(\mathfrak{g}).
\]
Recall now the hypercohomology spectral sequence from 1.2.1. The inclusion sheaf morphism $\mathrm{Hol} \to \mathfrak{g}$ gives a morphism of differential sheaves inducing a morphism of spectral sequences. This morphism is an isomorphism on the terms $E_2$, so by the standard comparison theorem for spectral sequences, we have an isomorphism of the limit terms. It remains to recall the result of 1.2.2 stating
\[
\mathbb{H}^*(X, C^*_{dg}(\mathfrak{g})) = {}^1H^*_{dg}(\Gamma(X, \mathfrak{g})).
\]
□

2.1.5. Let us remark that there is an analogous situation for the Hochschild cohomology of the algebra of holomorphic functions $\mathcal{O}_X(X)$: we have a fine $\bar\partial$-resolution of the sheaf $\mathcal{O}_X$ by the sheaves of differential forms of type $(0,k)$, $\Omega^{0,k}$. On a Stein open set $U$, we have an isomorphism between the Hochschild cohomology of $\mathcal{O}_X(U)$ and the differential graded Hochschild cohomology of the differential graded algebra $(\oplus_{k=0}^{n} \Omega^{0,k}(U), \wedge, \bar\partial)$. As before, we can pass to the cohomology sheaves and then to hypercohomology. We can even have the cosimplicial cohomology, see Sect. 2.3.

2.2. The coefficient case.

Note that $\mathrm{Hol}(U)$ is a $\mathrm{Hol}(U)$-module and $(\Gamma(U, \mathfrak{g}), \bar\partial)$ is a differential graded $\Gamma(U, \mathfrak{g})$-module by the adjoint action for an open set $U$. In particular, $(\Gamma(U, \mathfrak{g}), \bar\partial)$ as a differential graded module $(M, \tilde\partial)$ verifies
\[
\tilde\partial(x.m) = \bar\partial(x).m + (-1)^{\deg(x)}\, x.\tilde\partial(m),
\]
just by the fact that $\tilde\partial = \bar\partial$ acts as a graded derivation on the Frölicher–Nijenhuis bracket. We can write the Frölicher–Nijenhuis bracket in our case locally as
\[
\begin{aligned}
[\phi \otimes X, \psi \otimes Y] &= \phi \wedge \psi \otimes [X, Y] + \big( i_Y \partial\phi \wedge \psi \otimes X - (-1)^{kl}\, i_X \partial\psi \wedge \phi \otimes Y \big) \\
&= \phi \wedge \psi \otimes [X, Y] + \phi \wedge L_X \psi \otimes Y - L_Y \phi \wedge \psi \otimes X
\end{aligned}
\]
for $\phi \in \Omega^{0,k}(U)$, $\psi \in \Omega^{0,l}(U)$ and $X, Y \in \mathrm{Hol}(U)$. We will look for differential graded $\Gamma(U, \mathfrak{g})$-modules giving a theorem as in 2.1 but with coefficients.

2.2.1. Let $E$ be a holomorphic vector bundle on $X$. As before, let $\mathcal{O}(E)$ be the sheaf of holomorphic sections of $E$. It has a resolution by fine sheaves, given explicitly in the proof in Sect. 2.1.1. Denote by $\mathcal{E}(E)^{0,*}$ the direct sum over all sheaves in this resolution.

2.2.2. As before, $\Gamma(U, \mathcal{E}(E)^{0,*})$ is a differential graded vector space $(M, \tilde\partial)$ which is a resolution of the vector space $\Gamma(U, \mathcal{O}(E))$ for any Stein open set $U$.

2.2.3. Let us now suppose that the Lie algebra $\mathrm{Hol}(U)$ acts on $\Gamma(U, \mathcal{O}(E))$ by differential operators; we can speak of a local action. We can define a differential graded action locally by setting
\[
(\phi \otimes X).(\psi \otimes v) = \phi \wedge \psi \otimes X.v + \phi \wedge L_X \psi \otimes v,
\]
where $v \in \mathcal{O}(E)$. Note that we dropped the term which is not realizable without an action of $v$ on the forms and an inclusion of the holomorphic vector fields into $\mathcal{O}(E)$. It is obvious that it is in fact a global action. So we have constructed a differential graded $\Gamma(U, \mathfrak{g})$-module naturally induced by the action of $\mathrm{Hol}(U)$ on $\Gamma(U, \mathcal{O}(E))$. It is easy to extend this correspondence to maps between modules, so we have constructed a category of differential graded modules corresponding to the category of local $\mathrm{Hol}(U)$-modules.

2.2.4. We have a functor from local differential graded modules to the category of local $\mathrm{Hol}(U)$-modules simply by taking the cohomology with respect to $\tilde\partial$. So we get an equivalence of categories between the category of local $\mathrm{Hol}(U)$-modules and a subcategory of the category of differential graded modules.

2.2.5. Call now the induced module of a local $\mathrm{Hol}(U)$-module either the module constructed in 2.2.3 or, if the $\mathrm{Hol}(U)$-module is $\mathrm{Hol}(U)$ itself with the adjoint action, $\Gamma(U, \mathfrak{g})$ with its adjoint action. Unfortunately, we have to make this distinction because of the difference in the formulae for the action in 2.2.3 and the adjoint action.

2.2.6. We can now formulate the analogous theorem in the coefficient case:

Theorem 3. On a Stein open set $U$, a local $\mathrm{Hol}(U)$-module $N(U)$ induces a differential graded module $(M(U), \tilde\partial)$ which is its resolution. So we have:
\[
{}^1H^*_{dg}(\mathfrak{g}(U), (M(U), \tilde\partial)) \cong H^*_{cont}(\mathrm{Hol}(U), N(U)).
\]

Proof. Following 2.2.3, the first statement is clear. By the spectral sequence calculating differential graded cohomology with coefficients, see the lemma in 1.1.5, the $E_2$ term is the Lie algebra cohomology of $\mathrm{Hol}(U)$ with coefficients $N(U)$ and the sequence collapses. □

2.2.7. Note that Kawazumi calculated the cohomology of $\mathrm{Hol}(X)$ with coefficients in n-densities for an open Riemann surface $X$. Taking into account this result ([15, Eq. (9.7) p. 701]), we have completely solved the problem of the differential graded cohomology of $\mathfrak{g}(X)$ with coefficients in (differential graded) tensor densities for open Riemann surfaces.
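The action defined in 2.2.3 can be checked on its lowest-degree instance. The sketch below specializes the formula to $\phi = 1$ and $\psi = h$, a smooth function regarded as a form of type $(0,0)$; the symbol $h$ is introduced here for illustration only.

```latex
% Degree-0 instance of the action in 2.2.3: \phi = 1, \psi = h,
% where h is a hypothetical smooth function (a form of type (0,0)).
\[
(1 \otimes X).(h \otimes v) = h \otimes X.v + L_X h \otimes v
                            = h \otimes X.v + X(h) \otimes v .
\]
% For h = 1 the Lie derivative term vanishes and the given action of Hol(U) is recovered:
\[
(1 \otimes X).(1 \otimes v) = 1 \otimes X.v .
\]
```

So in degree 0 the induced action is the given action of $\mathrm{Hol}(U)$ extended to smooth coefficients by the Leibniz rule, which is consistent with the claim that the induced module restricts to the original local module.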

2.3. The cosimplicial version.

2.3.1. Let us think of the tangent sheaf Hol as a sheaf of Lie algebras constituting an object in the derived category $D^b(X)$ of the category of bounded complexes of sheaves on the complex manifold $X$. The objects Hol and $\mathfrak{g}$ are isomorphic in $D^b(X)$. The Lie algebra structure on Hol corresponds to the fact that there is a cohomological resolution which is a sheaf of differential graded Lie algebras. According to [14], for any sheaf of Lie algebras $\mathfrak{h}$ there is another sheaf of differential graded Lie algebras constituting a resolution of $\mathfrak{h}$. It is the sheaf of cosimplicial Lie algebras given by taking $\mathfrak{h}$ on the Čech complex associated to a covering $\mathcal{U}$ by Stein open sets, suitably normalised by the Thom–Sullivan functor, see [14].

2.3.2. There is also a notion of cohomology for a cosimplicial Lie algebra: the cohomology of the cosimplicial Lie algebra $\check{C}(\mathcal{U}, \mathrm{Hol})$ for some covering $\mathcal{U}$ by Stein open sets is the cohomology of the realization of the simplicial cochain complex obtained from applying the continuous Chevalley–Eilenberg complex as a functor $C^*_{cont}$ to the cosimplicial Lie algebra. We denote cosimplicial cohomology by $H^*_{cos}$.

2.3.3. As explained in the introduction, the general idea is that this cannot give anything new. To show this, one constructs a morphism of simplicial cochain complexes
\[
\tilde f : C^*_{dg}(\mathfrak{g}(N_*)) \to C^*_{cont}(\mathrm{Hol}(N_*))
\]
induced by the inclusion $f : \mathrm{Hol}(N_{M,q}) \hookrightarrow \mathfrak{g}(N_{M,q})$ simply by applying the functor $C^*_{dg}$ to the inclusion. $N_*$ denotes the thickened nerve of the covering $\mathcal{U}$, i.e. the simplicial complex manifold associated to the covering $\mathcal{U}$. By Lemma 5.9 in [2], the morphism $\tilde f$ induces a cohomology equivalence between the realizations of the two simplicial cochain complexes (the conditions of the lemma are fulfilled because of the isomorphism of the cohomologies on a Stein open set of the covering and the Künneth theorem). As in Prop. 6.2 in [2], using partitions of unity, one shows that the cohomology of the realization of the simplicial cochain complex on the left hand side gives the differential graded cohomology of $\Gamma(X, \mathfrak{g})$.

2.3.4. This gives the following

Theorem 4. On a complex manifold $X$ of dimension $n$, we have
\[
{}^1H^*_{dg}(\Gamma(X, \mathfrak{g})) \cong H^*_{cos}(\check{C}(\mathcal{U}, \mathrm{Hol}))
\]
for any covering $\mathcal{U}$ of $X$ by Stein open sets.

2.3.5. Observe that we proceeded in the same order taking cohomology with respect to differentials in the spirit of Remark 1.2.3.

3. Calculating the Cohomology

3.1.1. I. M. Gelfand and D. B. Fuks calculated the cohomology of the Lie algebra of formal vector fields in $n$ variables $W_n$ (in our setting always with complex coefficients). They showed an isomorphism of the Hochschild–Serre spectral sequence for the subalgebra $gl(n)$ with the Leray spectral sequence of the restriction to the $2n$-skeleton of the universal $U(n)$ principal bundle. Let us denote by $\pi : V(\infty, n) \to G(\infty, n)$ the universal principal $U(n)$-bundle and by $X(n)$ an open neighbourhood (because the inverse image of the union of the cells is not a manifold) of the inverse image under $\pi$ of the $2n$-skeleton of the Grassmannian $G(\infty, n)$. Their theorem reads

Theorem 5 (Gelfand–Fuks, cf. [7]). There is a manifold $X(n)$ such that
\[
H^*_{cont}(W_n) \cong H^*_{sing}(X(n)).
\]

R. Bott and G. Segal showed that for $\mathbb{R}^n$, or more generally a starshaped open set $U$ of an $n$-dimensional manifold $M$, the Lie algebra of $C^\infty$-vector fields $\mathrm{Vect}(U)$ has the same cohomology as $W_n$. The same is true for the Lie algebra of holomorphic vector fields on a disk of radius $R$ in $\mathbb{C}^n$: the map sending a holomorphic field to its Taylor series is continuous (E. Borel's lemma, see [22, p. 190]), open (trivial!), injective (trivial!) and of dense image (the series of convergence radius $R$ are dense in the formal series). So they have the same continuous cohomology, cf. [23].

3.1.2. N. Kawazumi calculated what seemed to be the only interesting Gelfand–Fuks cohomology related to Lie algebras of holomorphic vector fields on Riemann surfaces, i.e. the cohomology on open Riemann surfaces:

Theorem 6 (Kawazumi, [15]). Let $X$ be an open Riemann surface. Then
\[
H^*_{cont}(\mathrm{Hol}(X)) = H^*(\mathrm{Map}(X, S^3)).
\]

He used the method of Bott–Segal [2] to prove this result, i.e. he constructed a global fundamental map from the cochain complex of the Lie algebra to the complex of differential forms with values in $C^*_{cont}(\mathrm{Hol}(\mathbb{C}))$. This map, denoted by $\hat f_\sigma$, is constructed with the help of a global non-vanishing vector field $\partial$ existing on open Riemann surfaces:
\[
\hat f_\sigma : C^*_{cont}(\mathrm{Hol}(U_\sigma)) \to \Omega^*(U^\sigma; C^*_{cont}(\mathrm{Hol}(\mathbb{C}))),
\qquad
c \mapsto (\partial^{-1}) \otimes (f_{\sigma,p})^* i_\partial c + (f_{\sigma,p})^*(c).
\]
Here, for a subset $\sigma = \{\alpha_0, \dots, \alpha_q\}$ of the index set of a covering, $U^\sigma = \bigcup_i U_{\alpha_i}$ and $U_\sigma = \bigcap_i U_{\alpha_i}$. $(f_{\sigma,p})^*$ is the map induced from a complex immersion of the open set into $\mathbb{C}$ and $i$ is the insertion operator. It is rather straightforward to generalize this map to the $n$-dimensional case: $\hat f_\sigma$ relies on a vector valued differential form $\omega$ which is complicated in the case of Bott and Segal, but here it is just $\omega = \partial^{-1} \otimes \partial$, the identity on $\mathrm{Hol}(X)$. In the $n$-dimensional case, we take $\omega = \sum_{i=1}^{n} \partial_i^{-1} \otimes \partial_i$. These $\partial_i$, which trivialize the tangent bundle, can be chosen such that they are the images of $\frac{\partial}{\partial z_i}$ for a specially chosen parametrization sending a contractible open set into $\mathbb{C}$, cf. Lemma 6.4 of [15].

3.1.3. In general, there is no such vector field $\partial$, so there one should adapt the fundamental map of Bott–Segal to this holomorphic setting. For this, it is enough to notice that $X(n)$ is homotopically equivalent to a complex manifold carrying a $Gl(n, \mathbb{C})$-action. For example, $X(1)$ is $S^3$ which is homotopically equivalent to $\mathbb{C}^2 \setminus \{0\}$. So, replacing from the real case the principal $U(n)$-bundle (associated to the tangent bundle) by the principal $Gl(n, \mathbb{C})$-bundle (associated to the complex tangent bundle), one has a family of immersions $P$, cf. [2, §4 and p. 295], which is parametrized by a complex manifold ($Gl(n, \mathbb{C})$) and consists of complex immersions. This implies that the fundamental map, constructed from this family as in [2, §4], goes from (cochains on) holomorphic fields to (holomorphic differential forms with values in cochains on) holomorphic fields.

3.1.4. Secondly, Kawazumi uses the fact that the open Riemann surface is a Stein manifold to pass from the cosimplicial cohomology to the cohomology of the Lie algebra of global holomorphic fields. His method works perfectly for $n$-dimensional Stein manifolds. So there are two immediate corollaries:

Corollary 3. Let $X$ be an $n$-dimensional complex Stein manifold with trivial tangent bundle. Then we have
\[
H^*_{cont}(\mathrm{Hol}(X)) \cong H^*_{sing}(\mathrm{Map}(X, X(n))).
\]

If one drops the "Stein" hypothesis, it is perhaps not possible to globalize the result, but one can stay with the cosimplicial cohomology:

Corollary 4. Let $X$ be an $n$-dimensional complex manifold with trivial tangent bundle and $\mathcal{U}$ a covering of $X$ by Stein open sets. Then we have
\[
H^*_{cos}(\check{C}(\mathcal{U}, \mathrm{Hol})) \cong H^*_{sing}(\mathrm{Map}(X, X(n))).
\]

From 3.1.3 it follows on the other hand:

Theorem 7. Let $X$ be an $n$-dimensional complex manifold. Then we have:
\[
H^*_{cos}(\check{C}(\mathcal{U}, \mathrm{Hol})) \cong H^*_{sing}(\Gamma(E_n)).
\]
Here, $E_n$ is the bundle with typical fiber homotopically equivalent to $X(n)$ associated to the principal $Gl(n, \mathbb{C})$-bundle on $X$ (gotten from the complex tangent bundle of $X$).

3.1.5. For $\Gamma(\Sigma, \mathfrak{g})$ in the case of a compact Riemann surface $\Sigma$, we have Feigin's theorem (note that many theorems in this article could be named "Feigin's theorem"):

Theorem 8.
\[
{}^1H^*_{dg}(\Gamma(\Sigma, \mathfrak{g})) \cong H^*(\mathrm{Map}(\Sigma, S^3)).
\]

Proof. In our setting, this theorem follows from the above considerations because the $(\mathbb{C}^2 \setminus \{0\})$-bundle (or the $S^3$-bundle) is trivial: the given $S^1$-representation in $SO(4)$ may be lifted to $Spin(4)$ and this representation is used to view the bundle as associated to a principal $Spin(4)$-bundle which is trivial because of the existence of a section by obstruction theory combined with dimension arguments. □

Let us remark that one can calculate $H^*(\mathrm{Map}(\Sigma, S^3))$ by standard methods, and the result is given in Feigin's article. In particular, $H^1(\mathrm{Map}(\Sigma, S^3))$ is 1-dimensional, and fixing a generator means fixing the central charge $c$ of a Virasoro type cocycle, cf. [5].

4. Applications in Conformal Field Theory

4.1.1. Feigin's article [5] treats the applications in conformal field theory. We will summarize them briefly, see [5] and [1] for more information. As complex manifolds $X$, we take here compact Riemann surfaces $\Sigma$ of genus $g \geq 2$. As we deal with homology in this section, we replace the sheaf of holomorphic vector fields Hol by the sheaf of algebraic vector fields Lie. In view of the stated difficulties in globalizing these vector fields, we take the cosimplicial version, cf. §2.3.

4.1.2. Let $p \in \Sigma$ be a point. Following Feigin, let us choose the covering of $\Sigma$ by a formal disk $U_2$ around $p$ (in order to be able to take algebraic fields on it) and the Zariski open set $U_1 = \Sigma \setminus \{p\}$. This means that $\mathrm{Lie}(U_2)$ is the Lie algebra of formal jets of vector fields at $p$, completed by the ideal defined by $p$. A similar remark applies to $\mathrm{Lie}(U_1 \cap U_2)$. So, $\mathrm{Lie}(U_1)$, $\mathrm{Lie}(U_2)$ and $\mathrm{Lie}(U_1 \cap U_2)$ form a cosimplicial Lie algebra.

4.1.3. As the choice of a generator for ${}^1H^1(KS(\Sigma))$ fixes the central charge $c$ of a Virasoro type cocycle, cf. 3.1.5, it fixes a cosimplicial Lie algebra associated to $\mathrm{Lie}(U_1)$, $\mathrm{Lie}(U_2) \oplus c\mathbb{C}$ and $\mathrm{Vir}(U_1 \cap U_2)$ in the same way as before. We still have inclusions of $\mathrm{Lie}(U_1)$ and $\mathrm{Lie}(U_2) \oplus c\mathbb{C}$ into $\mathrm{Vir}(U_1 \cap U_2)$, because the cocycle is 0 on these subspaces by the residue theorem. The cosimplicial Lie algebra is denoted by Lie0($\Sigma$). It has a representation (in the sense of representation of a diagram, cf. [8]) noted c, where we associate to $\mathrm{Lie}(U_2) \oplus c\mathbb{C}$, $\mathrm{Vir}(U_1 \cap U_2)$ and $\mathrm{Lie}(U_1)$ respectively 1c (a 1-dimensional space, $\mathrm{Lie}(U_2)$ acting trivially, $c\mathbb{C}$ acting by multiplication by $c$), its induced module (a Verma module noted $M_c(p)$) and its restriction to $\mathrm{Lie}(U_1)$.

4.1.4. There is a similar cosimplicial Lie algebra Lie4($\Sigma$) associated to the covering by all Zariski open sets of $\Sigma$. Such a set is given by a finite number of points $\{p_1, \dots, p_n\}$. Lie4($\Sigma$) has a similar representation: doing the above construction yields a representation space for every $\Sigma \setminus \{p_1\}$. For Lie algebras associated to sets with more than 1 point, we take the tensor product representation of the Verma modules. Actually, all these modules are linked by induction arrows. This gives a representation of Lie4($\Sigma$) still noted. One should view Lie0($\Sigma$) and its representation as a simple model for Lie4($\Sigma$) and the above representation.

4.1.5. Feigin calculates the (cosimplicial) homology of Lie0($\Sigma$) and Lie4($\Sigma$) with values in the above representations. The result is (for simplicity only for Lie0($\Sigma$))

Theorem 9. $H_i$(Lie0($\Sigma$), c) $= M_c(p) / \mathrm{Lie}(U_1) M_c(p)$ if $i = 0$, and $0$ otherwise.

The point is that the space of coinvariants on the right-hand side which defines the so-called modular functor is usually associated to locally defined objects, as for example the local Virasoro algebra $\mathrm{Vir}(U_1 \cap U_2)$. Feigin obtains here a homological description in terms of globally defined objects. A second point is that the space of coinvariants is in fact the continuous dual of the completion of the local ring of the moduli space (of compact Riemann surfaces of genus $g \geq 2$) at the point $\Sigma$, provided that $\Sigma$ is a smooth point. This gives an important link between Lie algebra homology and the geometry of the moduli space, cf. §5.

4.1.6. The modular functor for what is called a minimal field theory relies on a special choice of the central charge $c$, dictated from Virasoro representation theory, see for example [4]. Furthermore, instead of Verma modules one deals with their irreducible quotients. Feigin shows that the above setting can be adapted to this situation. The modular functor associates to $\Sigma$ a finite dimensional vector space; this fact relies in our context on the theorem, cf. [6, Lemma 4.1.1 p. 16], stating that coinvariants in a representation with 0 singular support are finite dimensional.

5. Applications in Deformation Theory

5.1. Deformations of complex manifolds.

In this section, we give links from the cohomology calculations in the first part to the deformation theory of complex manifolds, still relying strongly on the ideas of [5] and here also [14]. It will concern particularly the differential graded homology of $\Gamma(X, \mathfrak{g})$ for a complex manifold $X$. Most of this section is more generally true for smooth proper schemes, see [14].

5.1.1. The most basic idea in this context is the following, taken from [14]: "The completion of a local ring of a moduli space at a given point X is isomorphic to the dual of the 0th homology group of the Lie algebra of infinitesimal automorphisms of X." Let me underline once more that this links Lie algebra homology and the geometry of the moduli space in a formal neighbourhood of a point.

5.1.2. As Feigin remarked, we have for Riemann surfaces an incarnation of this principle:

Theorem 10. Let $\Sigma$ be a compact Riemann surface of genus $g \geq 2$. Then
\[
{}^2H_{0,dg}(\Gamma(\Sigma, \mathfrak{g})) = S^*(T_\Sigma \mathcal{M}(g, 0)),
\]
and the other homology spaces are 0.

Remark. Note that we have here ${}^2H_{0,dg}$; this reminds one of the way we defined the differential graded homology of a sheaf of differential graded Lie algebras, see 1.2.3.

Proof. It is the result from the Kodaira–Spencer deformation theory for Riemann surfaces $\Sigma$ that we have $H^1(\Sigma, \mathrm{Hol}) = T_\Sigma \mathcal{M}(g, 0)$. Also, $H^0(\Sigma, \mathrm{Hol}) = 0$. So the theorem follows directly from the lemma in 1.1.6, because the graded Lie algebra homology of an abelian Lie algebra in degree 1 is just the symmetric algebra on it. □

Taking continuous duals in the theorem, we get the principle stated in 5.1.1, viewing $S^*(T_\Sigma \mathcal{M}(g, 0))^*$ as the completion of the local ring, which is possible if the point $\Sigma$ is smooth in $\mathcal{M}(g, 0)$.
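To give a sense of the size of the spaces in Theorem 10, the following sketch records a dimension count. It relies on the standard Riemann–Roch value $\dim H^1(\Sigma, \mathrm{Hol}) = 3g - 3$ for $g \geq 2$, which is not stated in the text above; the abbreviation $N$ and the coordinates $t_1, \dots, t_N$ are notation introduced here for illustration.

```latex
% Assumption (standard, via Riemann-Roch): for genus g >= 2,
\[
N := \dim T_\Sigma \mathcal{M}(g,0) = \dim H^{1}(\Sigma, \mathrm{Hol}) = 3g - 3 .
\]
% Dimension of the k-th symmetric power of an N-dimensional space:
\[
\dim S^{k}\big(T_\Sigma \mathcal{M}(g,0)\big) = \binom{N + k - 1}{k}.
\]
% Example g = 2, so N = 3: the dimensions for k = 0, 1, 2, 3 are 1, 3, 6, 10.
% The continuous dual is then a ring of formal power series in N variables,
% with t_1, ..., t_N hypothetical coordinates on the tangent space:
\[
S^{*}\big(T_\Sigma \mathcal{M}(g,0)\big)^{*} \cong \mathbb{C}[[t_1, \dots, t_N]] .
\]
```

This matches the expected shape of the completed local ring at a smooth point of a moduli space of dimension $3g - 3$.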

5.1.3. The theorem of 5.1.2 is still true for higher dimensional complex manifolds $X$ as long as
\[
H^1(X, \mathrm{Hol}) = T_\Sigma \mathcal{M}(g, 0), \tag{1}
\]
and zero otherwise. So there are two problems, well known in deformation theory following Kodaira and Spencer: the problem whether the number of moduli is well-defined and the problem whether Eq. (1) holds. For compact complex manifolds $M$ this is answered by a theorem of Kodaira, see [16, p. 306, Thm. 6.4]: a sufficient condition for the affirmative answer to the two questions is that $H^0(M, \mathrm{Hol}) = H^2(M, \mathrm{Hol}) = 0$. So in the case of compact complex surfaces, we can conclude right away that the theorem in 5.1.2 is still true. See [16] for examples of such complex surfaces.

5.2. Deformations of Lie algebras.

5.2.1. It is well known that the Lie algebra cohomology with values in the adjoint representation $H^*(L, L)$ of a Lie algebra $L$ answers questions about the deformations of $L$ as an algebraic object. For example, $H^2(L, L)$ can be interpreted as the space of equivalence classes of infinitesimal deformations of $L$, see [7, p. 35]. So there arise natural questions of this type for the Lie algebra of holomorphic vector fields $\mathrm{Hol}(U)$ on a Stein manifold $U$ and in the differential graded setting for the differential graded Lie algebra $\Gamma(U, \mathfrak{g})$.

5.2.2. The formal case is well known:

Theorem 11.
\[
H^*_{cont}(W_n, W_n) = 0.
\]

This gives right away (as before by considering $\mathrm{Hol}(D)$ for a disk $D \subset \mathbb{C}^n$ as a dense subalgebra of $W_n$ and by the principle that a dense subalgebra has the same continuous cohomology)

Corollary 5.
\[
H^*_{cont}(\mathrm{Hol}(D), \mathrm{Hol}(D)) = 0.
\]

So this implies the rigidity of the Lie algebra of holomorphic vector fields for disks. Observe that these disks are also rigid as manifolds, i.e. $H^1(D, \mathrm{Hol}) = 0$.

5.2.3. Now by the theorem in 2.2.6, we also have differential graded rigidity of $\Gamma(D, \mathfrak{g})$:

Corollary 6.
\[
{}^1H^*_{dg}(\Gamma(D, \mathfrak{g}), \Gamma(D, \mathfrak{g})) = 0.
\]

5.2.4. On the other hand, for a compact Riemann surface $\Sigma$ of genus $g \geq 2$, we have by the lemma in 1.1.5 and by the exact sequence which is implicit in the proof of the theorem in 5.1.2 (here, we have the dg-cohomology procedure as in 5.1.2!)

Theorem 12.
\[
{}^2H^*_{dg}(\Gamma(\Sigma, \mathfrak{g}), \Gamma(\Sigma, \mathfrak{g})) = S^*(T_\Sigma \mathcal{M}(g, 0))^* \otimes T_\Sigma \mathcal{M}(g, 0).
\]
Here, $S^*(T_\Sigma \mathcal{M}(g, 0))^*$ is the continuous dual of the nuclear Fréchet space given by the polynomials on $T_\Sigma \mathcal{M}(g, 0)$. So, it is the space of formal power series on $T_\Sigma \mathcal{M}(g, 0)^*$.

5.2.5. Note that the space on the right hand side can be given a bracket such that it is isomorphic to the Lie algebra of formal vector fields on $T_\Sigma \mathcal{M}(g, 0)$. This could be interpreted as the relation between cohomology with adjoint coefficients of $\mathfrak{g}$, i.e. differential graded deformations of global sections of $\mathfrak{g}$, and deformations of the underlying manifold. It fits into Feigin's philosophy that the choice of the coefficients in the Lie algebra cohomology determines the geometric object on the moduli space in a formal neighbourhood of a point: trivial coefficients correspond to the structure sheaf, adjoint coefficients correspond to vector fields, adjoint coefficients in the universal enveloping algebra correspond to differential operators.

6. Applications in Foliation Theory

This section is inspired by the famous link between the cohomology of Lie algebras and characteristic classes of foliations, see for example [7] for an introduction. We won't go into all details and we won't try to develop this theory in all its strength in our case; alas, we will only consider the easiest case, i.e. the case of characteristic classes of $\mathfrak{g}$-structures. In fact, we will define a class of "$\mathfrak{g}$"-structures such that the cohomology calculations from the first part yield characteristic classes for these structures. We won't pretend that this construction gives rise to interesting new characteristic classes; in fact, in the absence of an explicit description of the cohomology classes, we have no explicit description of the characteristic classes.

6.1.1. A $\mathfrak{g}$-structure on a manifold $X$ is a $\mathfrak{g}$-valued $C^\infty$-differential 1-form $\omega$ satisfying the Maurer–Cartan equation:
\[
-[\omega(\xi_1), \omega(\xi_2)] = d\omega(\xi_1, \xi_2).
\]
For a continuous cochain $c \in C^q_{cont}(\mathfrak{g})$, there is a characteristic class of the $\mathfrak{g}$-structure defined by $\omega$ simply given by the differential form
\[
c(\underbrace{\omega, \dots, \omega}_{q\text{ times}}).
\]

6.1.2. Define for a covering $\mathcal{U}$ by open sets a "Hol-$\mathcal{U}$-structure" or, for short, Hol-structure as follows: Let $X$ be a complex manifold and $\mathcal{U} = \{U_i\}_{i \in I}$ a covering of $X$ by open sets such that $I$ is a countable directed index set. Consider the sheaves Hol and Vect of holomorphic resp. $C^\infty$ vector fields on $X$. For an inclusion of open sets $U \subset V$, we have restriction maps $\phi_{VU} : \mathrm{Hol}(V) \to \mathrm{Hol}(U)$ and $\psi_{VU} : \mathrm{Vect}(V) \to \mathrm{Vect}(U)$. A Hol-structure is now a $\mathrm{Hol}(U_i)$-valued differential 1-form $\omega_{U_i}$ for every open set $U_i$ of $\mathcal{U}$ such that it verifies the Maurer–Cartan equation and furthermore for an inclusion $U \subset V$ we have
\[
\phi_{VU}(\omega_V(\xi)) = \omega_U(\psi_{VU}(\xi))
\]
for all $\xi \in \mathrm{Vect}(V)$. If $X$ is part of the covering and $\mathrm{Hol}(X) = 0$, then the Hol-structure is 0, so let us restrict it to coverings not including $X$.

6.1.3. To have a link with better known structures in foliation theory, let us restrict ourselves to coverings by contractible open sets (such that intersections are contractible). Let $X$ be of complex dimension $n$. By the obvious base change, we can think of $W_{2n}$ as being generated by $\frac{\partial}{\partial z_i}$ and $\frac{\partial}{\partial \bar z_i}$, $i = 1, \dots, n$. Denote by $W_{2n}|_{hol}$ the Lie subalgebra of $W_{2n}$ generated by the $\frac{\partial}{\partial z_i}$ for $i = 1, \dots, n$. Given a Hol-structure associated to such a covering, denoted by $\mathcal{U}$, we have

Lemma 7. The data $\{\omega_U\}_{U \in \mathcal{U}}$ is equivalent to a $W_{2n}|_{hol}$-valued differential form $\omega$.

So, for these coverings, Hol-structures are special cases of $W_n$-structures, and their importance is clear, see for example [7, Ch. 3.1.3, B 3°, p. 231].

6.1.4. To such a structure (for which obviously only the transverse structure of the foliation is relevant), we assign now characteristic classes by considering not $H^*_{cont}(\mathrm{Hol}(X))$, which could be too small, but $\mathbb{H}^*(X, C^*_{cont}(\mathrm{Hol}))$, or better $H^*(|C^*_{cont}(\check{C}(\mathcal{U}, \mathrm{Hol}))|)$, which coincide by Sect. 2. The Hol-structure is defined such that by inserting $p$-times $\omega_{U_{i_0} \cap \dots \cap U_{i_q}}$ into each c ∈ C^p_cont( ∏ i0 <...

References

1. Beilinson, A., Feigin, B.L., Mazur, B.: Introduction to algebraic field theory on curves. Preprint
2. Bott, R., Segal, G.: The cohomology of the vector fields on a manifold. Topology 16, 285–298 (1977)
3. Bourbaki, N.: Topologie générale, Eléments de Mathématique, Première Partie, Livre III. Paris: Hermann, 1960
4. Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal Field Theory. Graduate Texts in Contemporary Physics. Berlin–Heidelberg–New York: Springer, 1996
5. Feigin, B.L.: Conformal field theory and Cohomologies of the Lie algebra of holomorphic vector fields on a complex curve. Proc. ICM, Kyoto, Japan, 71–85 (1990)
6. Feigin, B.L., Malikov, F.: Modular Functor and Representation Theory of $\hat{sl}_2$ at a Rational Level. Cont. Math. 202, 357–406 (1997)
7. Fuks, D.B.: Cohomology of Infinite Dimensional Lie algebras. New York and London: Consultants Bureau, 1986
8. Gerstenhaber, M., Schack, S.D.: Algebraic Cohomology and Deformation Theory. In: Deformation Theory of Algebras and Structures and Applications, NATO Adv. Sci. Inst. Ser. C 247, Dordrecht: Kluwer, 1988, pp. 11–264
9. Godement, R.: Théorie des faisceaux. Paris: Hermann, 1964
10. Grauert, H., Remmert, R.: Theory of Stein Spaces. Springer Grundlehren 236, Berlin–Heidelberg–New York: Springer, 1979
11. Guichardet, A.: Cohomologie des groupes topologiques et des algèbres de Lie. Paris: Ed. Cedic, 1980
12. Haefliger, A.: Sur la cohomologie de l'algèbre de Lie des champs de vecteurs. Ann. Sci. ENS, 4ème série, t. 9, 503–532 (1976)
13. Hamilton, R.S.: The Nash–Moser Inverse Function Theorem. Bull. AMS 7, 65–222 (1982)
14. Hinich, V., Schechtman, V.: Deformation Theory and Lie algebra Homology. alg-geom/9405013 v2, to appear in Alg. Coll.
15. Kawazumi, N.: On the complex analytic Gel'fand-Fuks cohomology of open Riemann surfaces. Ann. Inst. Fourier, Grenoble 43, 3, 655–712 (1993)
16. Kodaira, K.: Complex Manifolds and Deformation of Complex Structures. Springer Grundlehren 283, Berlin–Heidelberg–New York: Springer, 1986
17. Kolář, I., Michor, P., Slovák, J.: Natural Operations in Differential Geometry. Berlin–Heidelberg–New York: Springer, 1993
18. Quillen, D.: Rational Homotopy Theory. App. B, Ann. Math. (2) 90, 205–295 (1969)
19. Schlessinger, M., Stasheff, J.D.: Deformation Theory and Rational Homotopy Type. Preprint
20. Serre, J.P.: Un théorème de dualité. Comment. Math. Helv. 29, 9–26 (1955)
21. Taylor, J.L.: Homology and Cohomology for Topological Algebras. Adv. Math. 9, 137–182 (1972)
22. Treves, F.: Topological Vector Spaces; Distributions and Kernels. New York–London: Academic Press, 1967
23. Wagemann, F.: Some remarks on the cohomology of Krichever–Novikov algebras. Lett. Math. Phys. 47, No. 2, 173–177 (1999)
24. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge Studies in Advanced Mathematics 38, Cambridge: Cambridge University Press, 1994
25. Wells, R.O. (Jr.): Differential Geometry on Complex Manifolds. New York: Prentice-Hall, 1973

Communicated by H. Araki

Commun. Math. Phys. 208, 541 – 574 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Inner Estimate of Singularities to Solutions for Elastic Wave Propagation Problems in Stratified Media

Senjo Shimizu*

Faculty of Engineering, Shizuoka University, Hamamatsu 432-8561, Japan. E-mail: [email protected]

Received: 24 August 1998 / Accepted: 30 March 1999

Abstract: We consider elastic mixed or initial-interface value problems in stratified media. We give an inner estimate of the location of singularities of the reflected and refracted Riemann functions by making use of a localization method. For an incident wave, lateral waves as well as reflected and refracted waves appear.

* This work was supported in part by Grants-in-Aid for Encouragement of Young Scientists (grant A09740098) from the Ministry of Education, Science and Culture of Japan.

1. Introduction

We consider elastic wave propagation problems in plane-stratified media $\mathbb{R}^3$ with the planar interface $x_3 = 0$. This problem is formulated as an elastic mixed or initial-interface value problem in stratified media. An elastic equation has two wave speeds: that of the pressure or primary wave (P wave for short) and that of the shear or secondary wave (S wave). The P wave is a longitudinal wave and the S wave is a transversal wave. In general the speed of the P wave is greater than that of the S wave. In the plane-stratified media problem, the lower half-space $\mathbb{R}^3_-$, called Medium I, carries $P_1$ and $S_1$ waves, and the upper half-space $\mathbb{R}^3_+$, called Medium II, carries $P_2$ and $S_2$ waves. The speed of the $P_1$ (resp. $P_2$) wave is greater than that of the $S_1$ (resp. $S_2$) wave, so there are six possible orderings of the speeds of the $P_1$, $P_2$, $S_1$, and $S_2$ waves. Here we assume the order $P_2$, $S_2$, $P_1$, $S_1$ from fastest to slowest, since it is the most complex case. We put a unit impulse (Dirac's delta) in the lower half-space Medium I. Then the $P_1$ incident wave, which is faster than the $S_1$ incident wave, hits the interface and causes $P_1$ and $S_1$ reflected waves in Medium I and $P_2$ and $S_2$ refracted waves in Medium II, as in Fig. 1. Moreover, after a certain period of time, lateral waves (also called glancing waves or totally reflected (or refracted) waves) arise. In Fig. 2, dotted arrows show the $P_2$-$P_1$ and $P_2$-$S_1$ lateral waves in Medium I and the $P_2$-$S_2$ lateral wave in Medium II for the $P_1$ incident wave.

[Fig. 1. Reflected waves and refracted waves. Axes: $x''' = (x_1, x_2)$ and $x_3$; rays labeled $P_1$, $P_1$, $S_1$, $P_2$, $S_2$.]

[Fig. 2. Lateral waves. Axes: $x''' = (x_1, x_2)$ and $x_3$; rays labeled $P_1$, $P_1$, $S_1$, $S_2$, $P_2 - S_2$, $P_2 - P_1$, $P_2 - S_1$.]

The $P_2$-$P_1$ lateral wave is a wave that originally should have been a $P_2$ refracted wave but tends toward total reflection, then becomes a source and causes a $P_1$ reflected wave. We have eleven kinds of lateral waves in all. This is characteristic of our elastic wave propagation problems in stratified media. For the half-space problem with two speeds, there exists only one kind of lateral wave. For a plane-stratified media problem where each medium has one speed, there also exists only one kind of lateral wave. Thus this elastic wave propagation problem in plane-stratified media has many lateral waves. In this paper we prove the above physical picture mathematically by expressing it as an inner estimate of singularities.

The main technical tool of our analysis is a localization method. M. F. Atiyah, R. Bott, L. Gårding [A-B-G] and L. Hörmander [H1] developed a localization method to get an inner estimate of the location of singularities of initial value problems for hyperbolic equations with constant coefficients. M. Matsumura [M2] gave an inner estimate of the location of singularities of half-space mixed problems for hyperbolic equations with constant coefficients by using the localization method for hyperbolic polynomials. S. Wakabayashi [W1,W2] and M. Tsuji [T] studied half-space mixed


problems by localizing the Lopatinski determinant and reflection coefficients as well as hyperbolic polynomials. Then M. Matsumura [M3,M4] studied the plane-stratified media problem in which each medium has one speed by localizing the Lopatinski determinant and reflection and refraction coefficients as well as a hyperbolic polynomial with constant coefficients. In this paper we study the plane-stratified media problem in which each medium has two speeds by localizing the Lopatinski determinant and reflection and refraction coefficients as well as a hyperbolic polynomial with constant coefficients.

The rest of this paper is organized as follows. Elastic wave propagation problems in plane-stratified media are formulated in Sect. 2. In Sect. 3, we solve the mixed problem and give explicit expressions for the reflected and refracted Riemann functions. The Main Theorem is stated in Sect. 4, where we also interpret it as a physical situation. Section 5 is devoted to the proof of the Main Theorem.

We also consider an outer estimate of the location of singularities. The inner estimate and an outer estimate together give an exact estimate of the location of singularities. Moreover, if we put a unit impulse (Dirac's delta) on the interface $x_3 = 0$, then the Stoneley wave, which is a kind of surface wave, appears. These results will be given in forthcoming papers.

2. Formulation of Problems

We consider elastic wave propagation problems in the following plane-stratified media $\mathbb{R}^3$ with the planar interface $x_3 = 0$:
\[
(\lambda(x_3), \mu(x_3), \rho(x_3)) =
\begin{cases}
(\lambda_1, \mu_1, \rho_1) & \text{for } x_3 < 0,\\
(\lambda_2, \mu_2, \rho_2) & \text{for } x_3 > 0.
\end{cases}
\]
Here the constants $\lambda_1$, $\lambda_2$, $\mu_1$, $\mu_2$ are called the Lamé constants and the constants $\rho_1$, $\rho_2$ are densities. We denote the lower half-space $\mathbb{R}^3_-$ by Medium I and the upper half-space $\mathbb{R}^3_+$ by Medium II, as in Fig. 3. We assume that
\[
\lambda_i + \mu_i > 0, \quad \mu_i > 0, \quad \rho_i > 0, \quad i = 1, 2. \tag{2.1}
\]

Assumption (2.1) is natural in practical situations. From the roots of the characteristic equations of $P^{I}(D)$ and $P^{II}(D)$, which are defined below as $3 \times 3$ matrix valued hyperbolic partial differential operators in Medium I and Medium II, respectively, we obtain two speeds corresponding to the P wave and the S wave in each medium. $c_{p_1}$ denotes the speed of the P wave in Medium I and $c_{s_1}$ denotes the speed of the S wave in Medium I.

[Fig. 3. Plane-stratified media. Medium II ($x_3 > 0$): $\lambda_2$, $\mu_2$, $\rho_2$; Medium I ($x_3 < 0$): $\lambda_1$, $\mu_1$, $\rho_1$; axes $x''' = (x_1, x_2)$ and $x_3$, origin $0$.]


$c_{p_2}$ and $c_{s_2}$ denote the speeds of the P and S waves in Medium II, respectively. They are given by
\[
c_{p_i}^2 = \frac{\lambda_i + 2\mu_i}{\rho_i}, \qquad c_{s_i}^2 = \frac{\mu_i}{\rho_i}, \qquad i = 1, 2. \tag{2.2}
\]
By assumption (2.1), the speed of the P wave is greater than that of the S wave in each medium. On account of this, there are six cases of the order relation of the speeds $\{c_{p_1}, c_{s_1}, c_{p_2}, c_{s_2}\}$ (cf. [S, Sect. 3]). Here we assume that
\[
c_{s_1} < c_{p_1} \le c_{s_2} < c_{p_2}. \tag{2.3}
\]
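As a small numerical illustration of (2.1)–(2.3), the following Python sketch computes the two speeds of a medium from its Lamé constants and density and checks the ordering (2.3). The function names and the sample constants are hypothetical, chosen only so that the speeds come out as $c_{s_1} = 1$, $c_{p_1} = 2$, $c_{s_2} = 3$, $c_{p_2} = 4$; they are not taken from the paper.

```python
import math

def wave_speeds(lam, mu, rho):
    """Return (c_p, c_s) from (2.2): c_p^2 = (lam + 2 mu) / rho, c_s^2 = mu / rho."""
    # Assumption (2.1): lam + mu > 0, mu > 0, rho > 0.
    if not (lam + mu > 0 and mu > 0 and rho > 0):
        raise ValueError("assumption (2.1) is violated")
    return math.sqrt((lam + 2.0 * mu) / rho), math.sqrt(mu / rho)

def satisfies_ordering(medium1, medium2):
    """Check the ordering (2.3): c_s1 < c_p1 <= c_s2 < c_p2."""
    cp1, cs1 = wave_speeds(*medium1)
    cp2, cs2 = wave_speeds(*medium2)
    return cs1 < cp1 <= cs2 < cp2

# Hypothetical (lam, mu, rho) triples, for illustration only.
medium1 = (2.0, 1.0, 1.0)    # gives c_p1 = 2, c_s1 = 1
medium2 = (-2.0, 9.0, 1.0)   # gives c_p2 = 4, c_s2 = 3 (lam + mu = 7 > 0)
```

Swapping the two media in this sketch gives one of the other five orderings, for which `satisfies_ordering` returns `False`.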

We choose (2.3) because, when a unit impulse (Dirac's delta) is put in Medium I, it is the case in which the largest number of lateral waves appears. The other cases can be treated in a similar manner.

Let $x = (x_0, x_1, x_2, x_3) = (x', x_3) = (x_0, x'') = (x_0, x''', x_3)$ in $\mathbb{R}^4$. The variable $x_0$ plays the role of time, and $x'' = (x_1, x_2, x_3)$ plays that of space. $\xi$ is a real dual variable of $x$ and is written $(\xi_0, \xi_1, \xi_2, \xi_3) = (\xi', \xi_3) = (\xi_0, \xi'') = (\xi_0, \xi''', \xi_3)$ in $\mathbb{R}^4_\xi$. We use the differential symbol $D_j = i^{-1}\partial/\partial x_j$ ($j = 0, 1, 2, 3$), where $i = \sqrt{-1}$. We denote by $\mathbb{R}^n_-$ the half-space $\{x = (x_1, \dots, x_n) \in \mathbb{R}^n \mid x_n < 0\}$ and by $\mathbb{R}^n_+$ the half-space $\{x = (x_1, \dots, x_n) \in \mathbb{R}^n \mid x_n > 0\}$, and also use the notation $|x| = \sqrt{x_1^2 + \cdots + x_n^2}$.

Let $u(x) = {}^t(u_1(x), u_2(x), u_3(x)) \in \mathbb{R}^3$ be the displacement vector at time $x_0$ and position $x''$. The propagation problem of elastic waves in the stratified media is formulated as a mixed or initial-interface value problem:
\[
\begin{cases}
P^{I}(D)u(x) = f(x), & x_0 > 0,\ x'' = (x_1, x_2, x_3) \in \mathbb{R}^3_-,\\
P^{II}(D)u(x) = f(x), & x_0 > 0,\ x'' = (x_1, x_2, x_3) \in \mathbb{R}^3_+,\\
u(x)|_{x_3 = -0} = u(x)|_{x_3 = +0}, & x_0 > 0,\ x''' \in \mathbb{R}^2,\\
B^{I}(D)u(x)|_{x_3 = -0} = B^{II}(D)u(x)|_{x_3 = +0}, & x_0 > 0,\ x''' \in \mathbb{R}^2,\\
D_0^k u(x)|_{x_0 = 0} = g_k(x''), & k = 0, 1,\ x'' \in \mathbb{R}^3.
\end{cases} \tag{2.4}
\]
Here
\[
P^{I}(D)u = D_0^2 E u - \frac{\lambda_1 + \mu_1}{\rho_1}\nabla_{x''}(\nabla_{x''}\cdot u) - \frac{\mu_1}{\rho_1}\Delta_{x''}u \tag{2.5}
\]
is a $3 \times 3$ matrix valued second order hyperbolic differential operator with constant coefficients, where $E$ is the $3 \times 3$ identity matrix, and
\[
(B^{I}(D)u)_k = i\lambda_1(\nabla_{x''}\cdot u)\delta_{k3} + 2\mu_1\varepsilon_{k3}(u), \quad k = 1, 2, 3, \tag{2.6}
\]
are the $k$-th components of the symmetric stress tensor $B^{I}(D)u$, where $\varepsilon_{k3}(u) = \frac{i}{2}(D_3 u_k + D_k u_3)$, $k = 1, 2, 3$, are strain tensors. $P^{II}(D)u$ and $B^{II}(D)u$ are defined by replacing $\lambda_1$, $\mu_1$, $\rho_1$ by $\lambda_2$, $\mu_2$, $\rho_2$, respectively.

If we put a unit impulse (Dirac's delta) $\delta(x - y)$ with $y_3 < 0$, that is, in Medium I, then the Riemann function of this elastic mixed problem is given by
\[
G(x, y) =
\begin{cases}
E^{I}(x - y) - F^{I}(x, y) & \text{for } x_3 < 0,\\
F^{II}(x, y) & \text{for } x_3 > 0,
\end{cases} \tag{2.7}
\]


where $E^{I}(x)$ is the fundamental solution in Medium I describing an incident wave and is defined by
\[
E^{I}(x) = (2\pi)^{-4}\int_{\mathbb{R}^4_\xi} e^{ix\cdot(\xi + i\eta)} P^{I}(\xi + i\eta)^{-1}\, d\xi, \quad \eta \in -\gamma_0\vartheta - \Gamma(\det P^{I}, \vartheta), \tag{2.8}
\]
where $\gamma_0$ is a positive real number, $\vartheta$ and $\Gamma(\det P^{I}, \vartheta)$ are defined in Definition 2.3 below, and $P^{I}(\xi + i\eta)^{-1}$ is the inverse of the $3 \times 3$ matrix $P^{I}(\xi + i\eta)$. $F^{I}(x, y)$ and $F^{II}(x, y)$ describe reflected and refracted waves, and are called the reflected and refracted Riemann functions, respectively.

In this paper we give an inner estimate of the location of singularities of the reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$ by making use of a localization method. The inner estimate of the location of singularities of the fundamental solution $E^{I}(x - y)$ is obtained in the same way, and more easily. Here singularities are expressed by singular supports.

Definition 2.1 (cf. [H2]). If $X \subset \mathbb{R}^n$ and $u \in \mathcal{D}'(X)$, then the singular support of $u$, denoted $\operatorname{sing\,supp} u$, is the set of points in $X$ having no open neighborhood to which the restriction of $u$ is a $C^\infty$ function.

We define a localization of polynomials according to Atiyah-Bott-Gårding (cf. [A-B-G]):

Definition 2.2. Let $P(\xi)$ be a polynomial of degree $m \ge 0$ and develop $\nu^m P(\nu^{-1}\xi + \eta)$ in increasing powers of $\nu$:
\[
\nu^m P(\nu^{-1}\xi + \eta) = \nu^p P_\xi(\eta) + O(\nu^{p+1}) \quad \text{as } \nu \to 0, \tag{2.9}
\]
where $P_\xi(\eta)$ is the first coefficient that does not vanish identically in $\eta$. The polynomial $P_\xi(\eta)$ is the localization of $P$ at $\xi$, and the number $p$ is the multiplicity of $\xi$ relative to $P$.

Moreover we introduce the following:

Definition 2.3. $\Gamma = \Gamma(P, \vartheta)$ is the component of $\mathbb{R}^n_\eta \setminus \{\eta \in \mathbb{R}^n_\eta,\ P(\eta) = 0\}$ which contains $\vartheta = (1, 0, \dots, 0) \in \mathbb{R}^n$. Moreover $\Gamma' = \Gamma'(P, \vartheta) = \{x \in \mathbb{R}^n \mid x\cdot\eta \ge 0 \text{ for any } \eta \in \Gamma\}$ is the dual cone of $\Gamma$ and is called the propagation cone.

3. The Reflected and Refracted Riemann Functions

In this section, we solve the mixed problem (2.4) and obtain explicit expressions of the reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$, respectively. We use a modified version of the standard method for the half-space problem (e.g. [M1, M2]), which uses the compensating kernel. Taking the partial Fourier–Laplace transform with respect to $x'$ for the mixed problem, we obtain an interface value problem for ordinary differential equations with parameters. Then taking the partial inverse Fourier–Laplace transform of the solution, we obtain explicit expressions of the reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$.

Note that if we put $\xi + i\eta = \zeta$, then
\[
P^{I}(\zeta', D_3) = U(\zeta''')C
\begin{pmatrix}
P_1^{I}(\zeta', D_3) & 0_{2\times 1}\\
0_{1\times 2} & P_2^{I}(\zeta', D_3)
\end{pmatrix}
(U(\zeta''')C)^{-1},
\]


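where $P_1^{I}(\zeta', D_3)$ and $P_2^{I}(\zeta', D_3)$ are $2 \times 2$ and $1 \times 1$ ordinary differential operators with parameters, respectively, defined by
\[
P_1^{I}(\zeta', D_3) =
\begin{pmatrix}
\zeta_0^2 - \{c_{s_1}^2 D_3^2 + c_{p_1}^2(\zeta_1^2 + \zeta_2^2)\} & -(c_{p_1}^2 - c_{s_1}^2)\sqrt{\zeta_1^2 + \zeta_2^2}\, D_3\\
-(c_{p_1}^2 - c_{s_1}^2)\sqrt{\zeta_1^2 + \zeta_2^2}\, D_3 & \zeta_0^2 - \{c_{p_1}^2 D_3^2 + c_{s_1}^2(\zeta_1^2 + \zeta_2^2)\}
\end{pmatrix},
\]
\[
P_2^{I}(\zeta', D_3) = \zeta_0^2 - c_{s_1}^2\{D_3^2 + (\zeta_1^2 + \zeta_2^2)\},
\]
and
\[
U(\zeta''') = \frac{1}{\sqrt{\zeta_1^2 + \zeta_2^2}}
\begin{pmatrix}
\zeta_1 & -\zeta_2 & 0\\
\zeta_2 & \zeta_1 & 0\\
0 & 0 & \sqrt{\zeta_1^2 + \zeta_2^2}
\end{pmatrix},
\qquad
C = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{pmatrix}.
\]
Moreover
\[
B^{I}(\zeta', D_3) = U(\zeta''')C
\begin{pmatrix}
B_1^{I}(\zeta', D_3) & 0_{2\times 1}\\
0_{1\times 2} & B_2^{I}(\zeta', D_3)
\end{pmatrix}
(U(\zeta''')C)^{-1},
\]
\[
B_1^{I}(\zeta', D_3) = i\rho_1
\begin{pmatrix}
c_{s_1}^2 D_3 & c_{s_1}^2\sqrt{\zeta_1^2 + \zeta_2^2}\\
(c_{p_1}^2 - 2c_{s_1}^2)\sqrt{\zeta_1^2 + \zeta_2^2} & c_{p_1}^2 D_3
\end{pmatrix},
\qquad
B_2^{I}(\zeta', D_3) = i\rho_1 c_{s_1}^2 D_3.
\]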

$P^{II}(\zeta', D_3)$ and $B^{II}(\zeta', D_3)$ are decomposed into $P_1^{II}(\zeta', D_3)$ and $P_2^{II}(\zeta', D_3)$, and $B_1^{II}(\zeta', D_3)$ and $B_2^{II}(\zeta', D_3)$, respectively, defined by replacing $c_{p_1}$, $c_{s_1}$, $\rho_1$ by $c_{p_2}$, $c_{s_2}$, $\rho_2$, respectively.

As shown in Sect. 2, if we put the unit impulse (Dirac's delta) $\delta(x - y)$ with $y_3 < 0$, that is, in Medium I, then the Riemann function $G(x, y)$ of this elastic mixed problem is given by (2.7). The reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$ are the solutions of the following interface value problem:
\[
\begin{cases}
P^{I}(D_x)F^{I}(x, y) = 0, \quad \operatorname{supp} F^{I}(x, y) \subset \{\mathbb{R}^4_- \times \mathbb{R}^4_- \mid x_0 \ge y_0\},\\
P^{II}(D_x)F^{II}(x, y) = 0, \quad \operatorname{supp} F^{II}(x, y) \subset \{\mathbb{R}^4_+ \times \mathbb{R}^4_- \mid x_0 \ge y_0\},\\
E^{I}(x - y)|_{x_3 = -0} - F^{I}(x, y)|_{x_3 = -0} = F^{II}(x, y)|_{x_3 = +0},\\
\qquad x' \in \mathbb{R}^3,\ y \in \mathbb{R}^4_-,\ x_0 \ge y_0,\\
B^{I}(D_x)E^{I}(x - y)|_{x_3 = -0} - B^{I}(D_x)F^{I}(x, y)|_{x_3 = -0} = B^{II}(D_x)F^{II}(x, y)|_{x_3 = +0},\\
\qquad x' \in \mathbb{R}^3,\ y \in \mathbb{R}^4_-,\ x_0 \ge y_0.
\end{cases} \tag{3.1}
\]
Taking the partial Fourier–Laplace transform with respect to $x'$ for the interface value problem (3.1), we obtain the following interface value problem for ordinary differential equations with parameters, where $\zeta''' = \xi''' + i\eta'''$:
\[
\begin{cases}
U(\zeta''')C\,\bigl(P_1^{I}(\xi' + i\eta', D_3) \oplus P_2^{I}(\xi' + i\eta', D_3)\bigr)(U(\zeta''')C)^{-1}\hat{F}^{I}(\xi' + i\eta', x_3, y) = 0,\\[4pt]
U(\zeta''')C\,\bigl(P_1^{II}(\xi' + i\eta', D_3) \oplus P_2^{II}(\xi' + i\eta', D_3)\bigr)(U(\zeta''')C)^{-1}\hat{F}^{II}(\xi' + i\eta', x_3, y) = 0,\\[4pt]
\hat{F}^{I}(\xi' + i\eta', 0, y) + \hat{F}^{II}(\xi' + i\eta', 0, y)\\
\qquad = (2\pi)^{-1}\displaystyle\int_{-\infty}^{\infty} e^{-iy\cdot(\xi + i\eta)}\, U(\zeta''')C\,\bigl(P_1^{I}(\xi + i\eta) \oplus P_2^{I}(\xi + i\eta)\bigr)^{-1}(U(\zeta''')C)^{-1}\, d\xi_3,\\[4pt]
B^{I}(\xi' + i\eta', D_3)\hat{F}^{I}(\xi' + i\eta', x_3, y)|_{x_3 = 0} + B^{II}(\xi' + i\eta', D_3)\hat{F}^{II}(\xi' + i\eta', x_3, y)|_{x_3 = 0}\\
\qquad = (2\pi)^{-1}\displaystyle\int_{-\infty}^{\infty} e^{-iy\cdot(\xi + i\eta)}\, U(\zeta''')C\,\bigl(B_1^{I}(\xi + i\eta) \oplus B_2^{I}(\xi + i\eta)\bigr)\\
\qquad\qquad \times \bigl(P_1^{I}(\xi + i\eta) \oplus P_2^{I}(\xi + i\eta)\bigr)^{-1}(U(\zeta''')C)^{-1}\, d\xi_3,
\end{cases}
\]
for $\xi \in \mathbb{R}^4_\xi$, $\eta \in -\gamma_0\vartheta - \Gamma(\det P^{I}, \vartheta)$.

Then taking the partial inverse Fourier–Laplace transform for the solution, we obtain the following expressions of the reflected and refracted Riemann functions F I (x, y) and F I I (x, y):

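With $\zeta = \xi + i\eta$, $\zeta' = \xi' + i\eta'$ and $\zeta''' = \xi''' + i\eta'''$,
\[
\begin{aligned}
F^{I}(x, y) = {}& (2\pi)^{-4}\int_{\mathbb{R}^3} e^{i(x' - y')\cdot\zeta'}\int_{\mathbb{R}} e^{-iy_3\zeta_3}\, U(\zeta''')C\\
&\times
\begin{pmatrix}
\begin{gathered}
\dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\{P_1^{I}(\zeta)^{-1}\}_1 & \cdot & \cdot & \cdot\\
\{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_1 & \cdot & \cdot & \cdot
\end{vmatrix}
\begin{pmatrix} |\zeta'''|\\ -\tau_{p_1}^{+}(\zeta') \end{pmatrix}
e^{-i\tau_{p_1}^{+}(\zeta')x_3}\\
{}+ \dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \{P_1^{I}(\zeta)^{-1}\}_1 & \cdot & \cdot\\
\cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_1 & \cdot & \cdot
\end{vmatrix}
\begin{pmatrix} \tau_{s_1}^{+}(\zeta')\\ |\zeta'''| \end{pmatrix}
e^{-i\tau_{s_1}^{+}(\zeta')x_3}
\end{gathered}
&
\begin{gathered}
\dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\{P_1^{I}(\zeta)^{-1}\}_2 & \cdot & \cdot & \cdot\\
\{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_2 & \cdot & \cdot & \cdot
\end{vmatrix}
\begin{pmatrix} |\zeta'''|\\ -\tau_{p_1}^{+}(\zeta') \end{pmatrix}
e^{-i\tau_{p_1}^{+}(\zeta')x_3}\\
{}+ \dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \{P_1^{I}(\zeta)^{-1}\}_2 & \cdot & \cdot\\
\cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_2 & \cdot & \cdot
\end{vmatrix}
\begin{pmatrix} \tau_{s_1}^{+}(\zeta')\\ |\zeta'''| \end{pmatrix}
e^{-i\tau_{s_1}^{+}(\zeta')x_3}
\end{gathered}
& 0_{2\times 1}\\[6pt]
0 & 0 &
\dfrac{1}{R_2(\zeta')}
\begin{vmatrix}
P_2^{I}(\zeta)^{-1} & \cdot\\
B_2^{I}(\zeta)P_2^{I}(\zeta)^{-1} & \cdot
\end{vmatrix}
e^{-i\tau_{s_1}^{+}(\zeta')x_3}
\end{pmatrix}\\
&\times (U(\zeta''')C)^{-1}\, d\xi_3\, d\xi' \qquad \text{for } x_3 < 0,
\end{aligned} \tag{3.2}
\]
\[
\begin{aligned}
F^{II}(x, y) = {}& (2\pi)^{-4}\int_{\mathbb{R}^3} e^{i(x' - y')\cdot\zeta'}\int_{\mathbb{R}} e^{-iy_3\zeta_3}\, U(\zeta''')C\\
&\times
\begin{pmatrix}
\begin{gathered}
\dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \cdot & \{P_1^{I}(\zeta)^{-1}\}_1 & \cdot\\
\cdot & \cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_1 & \cdot
\end{vmatrix}
\begin{pmatrix} |\zeta'''|\\ \tau_{p_2}^{+}(\zeta') \end{pmatrix}
e^{i\tau_{p_2}^{+}(\zeta')x_3}\\
{}+ \dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \cdot & \cdot & \{P_1^{I}(\zeta)^{-1}\}_1\\
\cdot & \cdot & \cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_1
\end{vmatrix}
\begin{pmatrix} -\tau_{s_2}^{+}(\zeta')\\ |\zeta'''| \end{pmatrix}
e^{i\tau_{s_2}^{+}(\zeta')x_3}
\end{gathered}
&
\begin{gathered}
\dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \cdot & \{P_1^{I}(\zeta)^{-1}\}_2 & \cdot\\
\cdot & \cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_2 & \cdot
\end{vmatrix}
\begin{pmatrix} |\zeta'''|\\ \tau_{p_2}^{+}(\zeta') \end{pmatrix}
e^{i\tau_{p_2}^{+}(\zeta')x_3}\\
{}+ \dfrac{1}{R_1(\zeta')}
\begin{vmatrix}
\cdot & \cdot & \cdot & \{P_1^{I}(\zeta)^{-1}\}_2\\
\cdot & \cdot & \cdot & \{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_2
\end{vmatrix}
\begin{pmatrix} -\tau_{s_2}^{+}(\zeta')\\ |\zeta'''| \end{pmatrix}
e^{i\tau_{s_2}^{+}(\zeta')x_3}
\end{gathered}
& 0_{2\times 1}\\[6pt]
0 & 0 &
\dfrac{1}{R_2(\zeta')}
\begin{vmatrix}
\cdot & P_2^{I}(\zeta)^{-1}\\
\cdot & B_2^{I}(\zeta)P_2^{I}(\zeta)^{-1}
\end{vmatrix}
e^{i\tau_{s_2}^{+}(\zeta')x_3}
\end{pmatrix}\\
&\times (U(\zeta''')C)^{-1}\, d\xi_3\, d\xi' \qquad \text{for } x_3 > 0.
\end{aligned} \tag{3.3}
\]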

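Here $\cdot$ denotes the corresponding component of the Lopatinski matrices $\mathcal{R}_1(\zeta')$ ($\zeta' = \xi' + i\eta'$) and $\mathcal{R}_2(\zeta')$ given below, and $\{P_1^{I}(\zeta)^{-1}\}_1$ and $\{P_1^{I}(\zeta)^{-1}\}_2$ are the first and second columns, respectively, of the inverse matrix of $P_1^{I}(\zeta)$, given by
\[
\begin{aligned}
P_1^{I}(\zeta)^{-1} &= \bigl(\{P_1^{I}(\zeta)^{-1}\}_1,\ \{P_1^{I}(\zeta)^{-1}\}_2\bigr) = \frac{\operatorname{cof}P_1^{I}(\zeta)}{\det P_1^{I}(\zeta)}\\
&= \frac{1}{(\zeta_0^2 - c_{p_1}^2|\zeta''|^2)(\zeta_0^2 - c_{s_1}^2|\zeta''|^2)}
\begin{pmatrix}
\zeta_0^2 - \{c_{p_1}^2\zeta_3^2 + c_{s_1}^2(\zeta_1^2 + \zeta_2^2)\} & (c_{p_1}^2 - c_{s_1}^2)\sqrt{\zeta_1^2 + \zeta_2^2}\,\zeta_3\\
(c_{p_1}^2 - c_{s_1}^2)\sqrt{\zeta_1^2 + \zeta_2^2}\,\zeta_3 & \zeta_0^2 - \{c_{s_1}^2\zeta_3^2 + c_{p_1}^2(\zeta_1^2 + \zeta_2^2)\}
\end{pmatrix},
\end{aligned} \tag{3.4}
\]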

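$\{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_1$ and $\{B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}\}_2$ are the first and second columns of $B_1^{I}(\zeta)P_1^{I}(\zeta)^{-1}$, respectively. $R_1(\zeta')$ and $R_2(\zeta')$ are the Lopatinski determinants of the systems $\{P_1^{I}(\zeta', D_3), P_1^{II}(\zeta', D_3), B_1^{I}(\zeta', D_3), B_1^{II}(\zeta', D_3)\}$ and $\{P_2^{I}(\zeta', D_3), P_2^{II}(\zeta', D_3), B_2^{I}(\zeta', D_3), B_2^{II}(\zeta', D_3)\}$, respectively, given by
\[
R_1(\zeta') = \det\mathcal{R}_1(\zeta'), \tag{3.5}
\]
\[
\mathcal{R}_1(\zeta') =
\begin{pmatrix}
|\zeta'''| & \tau_{s_1}^{+}(\zeta') & |\zeta'''| & -\tau_{s_2}^{+}(\zeta')\\
-\tau_{p_1}^{+}(\zeta') & |\zeta'''| & \tau_{p_2}^{+}(\zeta') & |\zeta'''|\\
-2\rho_1 c_{s_1}^2\tau_{p_1}^{+}(\zeta')|\zeta'''| & -\rho_1 c_{s_1}^2(\tau_{s_1}^{+}(\zeta')^2 - |\zeta'''|^2) & 2\rho_2 c_{s_2}^2\tau_{p_2}^{+}(\zeta')|\zeta'''| & -\rho_2 c_{s_2}^2(\tau_{s_2}^{+}(\zeta')^2 - |\zeta'''|^2)\\
\rho_1 c_{s_1}^2(\tau_{s_1}^{+}(\zeta')^2 - |\zeta'''|^2) & -2\rho_1 c_{s_1}^2\tau_{s_1}^{+}(\zeta')|\zeta'''| & \rho_2 c_{s_2}^2(\tau_{s_2}^{+}(\zeta')^2 - |\zeta'''|^2) & 2\rho_2 c_{s_2}^2\tau_{s_2}^{+}(\zeta')|\zeta'''|
\end{pmatrix}, \tag{3.6}
\]
\[
R_2(\zeta') = \det\mathcal{R}_2(\zeta'), \tag{3.7}
\]
\[
\mathcal{R}_2(\zeta') =
\begin{pmatrix}
1 & 1\\
-\rho_1 c_{s_1}^2\tau_{s_1}^{+}(\zeta') & \rho_2 c_{s_2}^2\tau_{s_2}^{+}(\zeta')
\end{pmatrix}. \tag{3.8}
\]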

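4. Results

4.1. Main Theorem. First we state the result for the fundamental solution $E^{I}(x)$. This proposition is a version of the theorem proved by Atiyah–Bott–Gårding [A-B-G, Theorem 4.10], adapted to the present context.

Proposition 4.1. For $\xi^0 \in \mathbb{R}^4_\xi \setminus \{0\}$ satisfying $(\det P_j^{I})(\xi^0) = 0$ ($j \in \{p_1, s_1\}$), that is,
\[
(\det P_{p_1}^{I})(\xi^0) = (\xi_0^0)^2 - c_{p_1}^2|\xi^{0\prime\prime}|^2 = 0, \quad \text{or} \quad (\det P_{s_1}^{I})(\xi^0) = (\xi_0^0)^2 - c_{s_1}^2|\xi^{0\prime\prime}|^2 = 0,
\]
we have
\[
\lim_{\nu\to\infty} \nu e^{-i\nu(x - y)\cdot\xi^0} E^{I}(x - y) = E^{I}_{j\xi^0}(x - y), \quad j \in \{p_1, s_1\},
\]
in the distribution sense with respect to $(x, y) \in \mathbb{R}^4 \times \mathbb{R}^4_-$, where
\[
E^{I}_{j\xi^0}(x - y) = (2\pi)^{-4}\int_{\mathbb{R}^4_\xi} e^{i(x - y)\cdot(\xi + i\eta)}\,\frac{(\operatorname{cof}P^{I})_{\xi^0}(\xi + i\eta)}{(\det P^{I})_{\xi^0}(\xi + i\eta)}\, d\xi,
\]
$\eta \in -\gamma_0\vartheta - \Gamma(\det P_j^{I}, \vartheta)$, $j \in \{p_1, s_1\}$, with a positive real number $\gamma_0$. Moreover we have
\[
\bigcup_{\xi^0 \ne 0}\operatorname{supp}E^{I}_{p_1\xi^0}(x - y) \cup \operatorname{supp}E^{I}_{s_1\xi^0}(x - y) \subset \operatorname{sing\,supp}E^{I}(x - y),
\]
and
\[
\operatorname{supp}E^{I}_{j\xi^0}(x - y) = \Gamma'_{j\xi^0} \overset{\mathrm{def}}{=} \bigl\{(x, y) \in \mathbb{R}^4 \times \mathbb{R}^4_- : (x - y)\cdot\eta \ge 0 \ \text{for any } \eta \in \Gamma_{j\xi^0}\bigr\},
\]
where $\Gamma_{j\xi^0} = \Gamma\bigl((\det P_j^{I})_{\xi^0}(\eta), \vartheta\bigr)$, $\vartheta = (1, 0, 0, 0)$, $j \in \{p_1, s_1\}$.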


In general, $\operatorname{supp}E^{I}_{j\xi^0}(x - y) \subset \Gamma'_{j\xi^0}$ ($j \in \{p_1, s_1\}$); more precisely, $\operatorname{ch}[\operatorname{supp}E^{I}_{j\xi^0}(x - y)] = \Gamma'_{j\xi^0}$, where ch denotes the convex hull. Since $(\det P_j^{I})_{\xi^0}(\eta)$ is homogeneous of degree 2, we obtain $\operatorname{supp}E^{I}_{j\xi^0}(x - y) = \Gamma'_{j\xi^0}$. The precise proof is the same as the proof of (4.5) in Sect. 5 below.

Secondly we state the Main Theorem. It says that the singular supports of the reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$ are estimated from the inside by the supports of the localizations $F^{I}_{\xi^0}(x, y)$ and $F^{II}_{\xi^0}(x, y)$ of $F^{I}(x, y)$ and $F^{II}(x, y)$ at $\xi^0$, respectively.

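Main Theorem. For $\xi^0 \in \mathbb{R}^4_\xi \setminus \{0\}$ satisfying $(\det P_j^{I})(\xi^0) = 0$ ($j \in \{p_1, s_1\}$), that is,
\[
(\det P_{p_1}^{I})(\xi^0) = (\xi_0^0)^2 - c_{p_1}^2|\xi^{0\prime\prime}|^2 = 0, \quad \text{or} \quad (\det P_{s_1}^{I})(\xi^0) = (\xi_0^0)^2 - c_{s_1}^2|\xi^{0\prime\prime}|^2 = 0,
\]
we have the following:

(1) For the reflected Riemann function $F^{I}(x, y)$, we have
\[
\lim_{\nu\to\infty}\nu e^{-i\nu\{(x' - y')\cdot\xi^{0\prime} + x_3\tau_k^{-}(\xi^{0\prime}) - y_3\xi_3^0\}}F^{I}(x, y) = F^{I}_{j\xi^0k}(x, y), \quad j \in \{p_1, s_1\},\ k \in \{p_1, s_1\}, \tag{4.1}
\]
and if $\xi^{0\prime}$ are zeros of $\tau_m^{+}(\zeta')$, that is, $\xi^{0\prime}$ satisfy $|\xi^{0\prime\prime\prime}| = \xi_0^0/c_m$ ($m \in \{p_1, p_2, s_2\}$), then we have
\[
\begin{aligned}
&\lim_{\nu\to\infty}\Bigl\{\nu^{3/2}e^{-i\nu\{(x' - y')\cdot\xi^{0\prime} + x_3\tau_k^{-}(\xi^{0\prime}) - y_3\xi_3^0\}}F^{I}(x, y) - \nu^{1/2}F^{I}_{j\xi^0k}(x, y)\Bigr\}\\
&\qquad = F^{I}_{j\xi^0km}(x, y), \quad j \in \{p_1, s_1\},\ k \in \{p_1, s_1\},\ m \in \{p_1, p_2, s_2\},
\end{aligned} \tag{4.2}
\]
in the distribution sense with respect to $(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_-$. Moreover we have
\[
\bigcup_{\xi^0 \ne 0}\operatorname{supp}F^{I}_{j\xi^0k}(x, y) \cup \operatorname{supp}F^{I}_{j\xi^0km}(x, y) \subset \operatorname{sing\,supp}F^{I}(x, y), \tag{4.3}
\]
and
\[
\begin{aligned}
\operatorname{supp}F^{I}_{j\xi^0k}(x, y) = (\Gamma'_{j\xi^0})^{I}_{k} \overset{\mathrm{def}}{=} \bigl\{&(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- :\\
&\{(x' - y') + x_3\operatorname{grad}_{\xi'}\tau_k^{-}(\xi^{0\prime})\}\cdot\eta' - y_3\eta_3 \ge 0 \ \text{for any } \eta \in \Gamma_{j\xi^0}\bigr\},\\
&j \in \{p_1, s_1\},\ k \in \{p_1, s_1\},
\end{aligned} \tag{4.4}
\]
for $\xi^0$ satisfying $F^{I}_{j\xi^0k}(x, y) \ne 0$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$), and
\[
\begin{aligned}
\operatorname{supp}F^{I}_{j\xi^0km}(x, y) = (\Gamma'_{j\xi^0m})^{I}_{k} \overset{\mathrm{def}}{=} \bigl\{&(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- :\\
&\{(x' - y') + x_3\operatorname{grad}_{\xi'}\tau_k^{-}(\xi^{0\prime})\}\cdot\eta' - y_3\eta_3 \ge 0 \ \text{for any } \eta \in \Gamma_{j\xi^0m}\bigr\},\\
&j \in \{p_1, s_1\},\ k \in \{p_1, s_1\},\ m \in \{p_1, p_2, s_2\},
\end{aligned} \tag{4.5}
\]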


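for $\xi^0$ satisfying $F^{I}_{j\xi^0km}(x, y) \ne 0$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$).

(2) For the refracted Riemann function $F^{II}(x, y)$, we have
\[
\lim_{\nu\to\infty}\nu e^{-i\nu\{(x' - y')\cdot\xi^{0\prime} + x_3\tau_k^{+}(\xi^{0\prime}) - y_3\xi_3^0\}}F^{II}(x, y) = F^{II}_{j\xi^0k}(x, y), \quad j \in \{p_1, s_1\},\ k \in \{p_2, s_2\}, \tag{4.6}
\]
and if $\xi^{0\prime}$ are zeros of $\tau_m^{+}(\zeta')$ ($m \in \{p_2\}$), then we have
\[
\begin{aligned}
&\lim_{\nu\to\infty}\Bigl\{\nu^{3/2}e^{-i\nu\{(x' - y')\cdot\xi^{0\prime} + x_3\tau_k^{+}(\xi^{0\prime}) - y_3\xi_3^0\}}F^{II}(x, y) - \nu^{1/2}F^{II}_{j\xi^0k}(x, y)\Bigr\}\\
&\qquad = F^{II}_{j\xi^0km}(x, y), \quad j \in \{p_1, s_1\},\ k \in \{p_2, s_2\},\ m \in \{p_2\},
\end{aligned} \tag{4.7}
\]
in the distribution sense with respect to $(x, y) \in \mathbb{R}^4_+ \times \mathbb{R}^4_-$. Moreover we have
\[
\bigcup_{\xi^0 \ne 0}\operatorname{supp}F^{II}_{j\xi^0k}(x, y) \cup \operatorname{supp}F^{II}_{j\xi^0km}(x, y) \subset \operatorname{sing\,supp}F^{II}(x, y), \tag{4.8}
\]
and
\[
\begin{aligned}
\operatorname{supp}F^{II}_{j\xi^0k}(x, y) = (\Gamma'_{j\xi^0})^{II}_{k} \overset{\mathrm{def}}{=} \bigl\{&(x, y) \in \mathbb{R}^4_+ \times \mathbb{R}^4_- :\\
&\{(x' - y') + x_3\operatorname{grad}_{\xi'}\tau_k^{+}(\xi^{0\prime})\}\cdot\eta' - y_3\eta_3 \ge 0 \ \text{for any } \eta \in \Gamma_{j\xi^0}\bigr\},\\
&j \in \{p_1, s_1\},\ k \in \{p_2, s_2\},
\end{aligned} \tag{4.9}
\]
for $\xi^0$ satisfying $F^{II}_{j\xi^0k}(x, y) \ne 0$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$), and
\[
\begin{aligned}
\operatorname{supp}F^{II}_{j\xi^0km}(x, y) = (\Gamma'_{j\xi^0m})^{II}_{k} \overset{\mathrm{def}}{=} \bigl\{&(x, y) \in \mathbb{R}^4_+ \times \mathbb{R}^4_- :\\
&\{(x' - y') + x_3\operatorname{grad}_{\xi'}\tau_k^{+}(\xi^{0\prime})\}\cdot\eta' - y_3\eta_3 \ge 0 \ \text{for any } \eta \in \Gamma_{j\xi^0m}\bigr\},\\
&j \in \{p_1, s_1\},\ k \in \{p_2, s_2\},\ m \in \{p_2\},
\end{aligned} \tag{4.10}
\]
for $\xi^0$ satisfying $F^{II}_{j\xi^0km}(x, y) \ne 0$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$, $m \in \{p_2\}$).

Here
\[
\Gamma_{j\xi^0} = \Gamma\bigl((\det P_j^{I})_{\xi^0}(\eta), \vartheta\bigr), \quad \vartheta = (1, 0, 0, 0),\ j \in \{p_1, s_1\},
\]
\[
\Gamma_{j\xi^0m} = \Gamma\bigl((\det P_j^{I})_{\xi^0}(\eta), \vartheta\bigr) \cap \Bigl\{\Gamma\Bigl(\frac{\xi_0^0}{c_m^2}\eta_0 - \xi_1^0\eta_1 - \xi_2^0\eta_2,\ \vartheta'\Bigr) \times \mathbb{R}_{\eta_3}\Bigr\}, \quad \vartheta' = (1, 0, 0),\ j \in \{p_1, s_1\},\ m \in \{p_1, p_2, s_2\}, \tag{4.11}
\]
\[
\tau_{p_1}^{\pm}(\xi') = \operatorname{sgn}(\mp\xi_0)\sqrt{\frac{\xi_0^2}{c_{p_1}^2} - |\xi'''|^2} \quad \text{if } \frac{\xi_0^2}{c_{p_1}^2} - |\xi'''|^2 \ge 0, \tag{4.12}
\]
and $\tau_{p_1}^{\pm}(\xi')$ is taken as the branch of $\sqrt{\xi_0^2/c_{p_1}^2 - |\xi'''|^2}$ such that $\pm\operatorname{Im}\tau_{p_1}^{\pm}(\xi') > 0$ if $\xi_0^2/c_{p_1}^2 - |\xi'''|^2 < 0$. $\tau_{s_1}^{\pm}(\xi')$, $\tau_{p_2}^{\pm}(\xi')$, and $\tau_{s_2}^{\pm}(\xi')$ are defined in the same way as $\tau_{p_1}^{\pm}(\xi')$, with $c_{p_1}$ replaced by $c_{s_1}$, $c_{p_2}$, and $c_{s_2}$, respectively.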
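The branch convention in (4.12) can be made concrete with a short Python sketch for real $\xi'$. The function name `tau` and its argument layout are assumptions made here for illustration: `xi_h` stands for $|\xi'''|$, `c` for any one of the four speeds, and `sign` is `+1` for $\tau^{+}$ and `-1` for $\tau^{-}$.

```python
import math

def tau(xi0, xi_h, c, sign):
    """Evaluate tau^{+} (sign=+1) or tau^{-} (sign=-1) for real xi' = (xi0, xi1, xi2).

    xi_h is |xi'''| = sqrt(xi1^2 + xi2^2); c is the wave speed entering (4.12).
    """
    d = xi0 ** 2 / c ** 2 - xi_h ** 2
    if d >= 0:
        # Real branch: tau^{±} = sgn(∓ xi0) * sqrt(d).
        sgn_xi0 = (xi0 > 0) - (xi0 < 0)
        return complex(-sign * sgn_xi0 * math.sqrt(d), 0.0)
    # Complex branch: chosen so that ± Im tau^{±} > 0.
    return complex(0.0, sign * math.sqrt(-d))
```

With $\xi_0 = 1$ the value $\tau^{+}$ is real and non-positive for $|\xi'''| \le 1/c$, vanishes exactly at $|\xi'''| = 1/c$ (the case singled out in (4.2) and (4.7)), and is purely imaginary with positive imaginary part beyond it.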


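Remark 1. The $(\Gamma'_{j\xi^0})^{I}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$) represent the $k$ reflected wave for the $j$ incident wave. The $(\Gamma'_{j\xi^0m})^{I}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$) represent the $m$ lateral wave of the $k$ reflected wave for the $j$ incident wave. The $(\Gamma'_{j\xi^0})^{II}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$) represent the $k$ refracted wave for the $j$ incident wave. The $(\Gamma'_{j\xi^0m})^{II}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$, $m \in \{p_2\}$) represent the $m$ lateral wave of the $k$ refracted wave for the $j$ incident wave.

Remark 2. The $\tau_{p_1}^{\pm}(\xi')$, $\tau_{s_1}^{\pm}(\xi')$, $\tau_{p_2}^{\pm}(\xi')$, and $\tau_{s_2}^{\pm}(\xi')$ arise from
\[
\begin{aligned}
\det P^{I}(\xi) &= \det P_1^{I}(\xi) \times \det P_2^{I}(\xi)\\
&= \{(\xi_0^2 - c_{p_1}^2|\xi''|^2)(\xi_0^2 - c_{s_1}^2|\xi''|^2)\} \times (\xi_0^2 - c_{s_1}^2|\xi''|^2)\\
&= \{\det P_{p_1}^{I}(\xi) \times \det P_{s_1}^{I}(\xi)\} \times \det P_{s_1}^{I}(\xi)\\
&= \{c_{p_1}^2 c_{s_1}^2(\xi_3 - \tau_{p_1}^{+}(\xi'))(\xi_3 - \tau_{p_1}^{-}(\xi'))(\xi_3 - \tau_{s_1}^{+}(\xi'))(\xi_3 - \tau_{s_1}^{-}(\xi'))\}\\
&\qquad \times \{-c_{s_1}^2(\xi_3 - \tau_{s_1}^{+}(\xi'))(\xi_3 - \tau_{s_1}^{-}(\xi'))\},
\end{aligned} \tag{4.13}
\]
and from the corresponding factorization of $\det P^{II}(\xi)$, obtained by replacing $p_1$, $s_1$ by $p_2$, $s_2$, respectively.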

Remark 3. If $(\det P_j^{I})(\xi^0) \ne 0$ ($j \in \{p_1, s_1\}$), then $(\det P_j^{I})_{\xi^0}(\eta) = (\det P_j^{I})(\xi^0)$ is constant. So $\Gamma_{j\xi^0} = \Gamma_{j\xi^0m} = \mathbb{R}^4$ and thus $(\Gamma'_{j\xi^0})^{I}_{k} = (\Gamma'_{j\xi^0m})^{I}_{k} = \{0\} \subset \mathbb{R}^4_- \times \mathbb{R}^4_-$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$) and $(\Gamma'_{j\xi^0})^{II}_{k} = (\Gamma'_{j\xi^0m})^{II}_{k} = \{0\} \subset \mathbb{R}^4_+ \times \mathbb{R}^4_-$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$, $m \in \{p_2\}$).

Remark 4. By the assumption (2.3), there are, for example, no real $\xi$ that are both roots of $\zeta_0^2 - c_{p_1}^2|\zeta''|^2 = 0$ and zeros of $\tau_{s_1}^{+}(\xi')$. The sets of $\xi^0$ that cause singularities are given in (4.14)–(4.34) below.

Remark 5. In (4.4), $\xi^0$ satisfying $F^{I}_{j\xi^0k}(x, y) \ne 0$ is equivalent to $(Q_1(\xi^0), Q_2(\xi^0)) \ne 0$ in (5.13) below, or is equivalent to $Q_1(\xi^0) \ne 0$ in (5.15) below. In (4.5), $\xi^0$ satisfying $F^{I}_{j\xi^0km}(x, y) \ne 0$ is equivalent to $T_1(\xi^0)R_1(\xi^{0\prime}) - Q_1(\xi^0)S(\xi^{0\prime}) \ne 0$ or $T_2(\xi^0)R_1(\xi^{0\prime}) - Q_2(\xi^0)S(\xi^{0\prime}) \ne 0$ in (5.16) below.

4.2. Interpretation of Main Theorem. By using the Main Theorem and Proposition 4.1, we find an inner estimate of the location of singularities of the reflected and refracted Riemann functions $F^{I}(x, y)$ and $F^{II}(x, y)$, and of the fundamental solution $E^{I}(x - y)$. This gives an interpretation of the Main Theorem and Proposition 4.1 in terms of the physical situation described in Sect. 1.

In the expressions (3.2) and (3.3), the parts between $U(\xi''' + i\eta''')C$ and $(U(\xi''' + i\eta''')C)^{-1}$ are decomposed into $2 \times 2$ and $1 \times 1$ matrix valued Riemann functions: $F^{I}_{2\times 2}(x, y)$ and $F^{I}_{1\times 1}(x, y)$ for $F^{I}(x, y)$, $F^{II}_{2\times 2}(x, y)$ and $F^{II}_{1\times 1}(x, y)$ for $F^{II}(x, y)$, and $E^{I}_{2\times 2}(x - y)$ and $E^{I}_{1\times 1}(x - y)$ for $E^{I}(x - y)$. The displacement vector of $F^{\iota}_{2\times 2}(x, y)$ ($\iota \in \{I, II\}$) lies in the $x_1x_3$-plane and that of $F^{\iota}_{1\times 1}(x, y)$ ($\iota \in \{I, II\}$) lies along the $x_2$ axis, where we regard $y$ as a parameter. Thus we can treat $F^{\iota}_{2\times 2}(x, y)$ and $F^{\iota}_{1\times 1}(x, y)$ ($\iota \in \{I, II\}$) independently.

First we consider $E^{I}_{2\times 2}(x - y)$. By Proposition 4.1, we have the following 2 sets of $\xi^0$ that cause singularities and that are roots of $\det P_1^{I}(\xi^0) = 0$:
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{s_1}}, \tag{4.14}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_1}}. \tag{4.15}
\]

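We obtain
\[
\bigcup_{\xi^0 \ne 0}\operatorname{supp}E^{I}_{j\xi^0}(x - y) = \bigcup_{\xi^0 \ne 0}\Gamma'_{j\xi^0} = \Bigl\{(x, y) \in \mathbb{R}^4 \times \mathbb{R}^4_- : (x_0 - y_0)^2 = \frac{1}{c_j^2}\{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2\}\Bigr\}, \quad j \in \{p_1, s_1\}.
\]
Equations (4.14) and (4.15) correspond to the $S_1$ and $P_1$ incident waves, respectively.

Secondly we consider $F^{I}_{2\times 2}(x, y)$ and $F^{II}_{2\times 2}(x, y)$. For $F^{I}_{2\times 2}(x, y)$, we have the following 4 sets of $\xi^0$ that cause singularities corresponding to $\operatorname{supp}F^{I}_{j\xi^0k}(x, y)$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$) in (4.1), (4.3), and (4.4), that is, $\xi^0$ are roots of $\det P_1^{I}(\xi^0) = 0$ and are not zeros of $\tau_m^{+}(\xi^{0\prime})$ ($m \in \{p_1, p_2, s_2\}$):
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{s_1}}, \tag{4.16}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_1}}, \tag{4.17}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_1}}, \tag{4.18}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_1}}. \tag{4.19}
\]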

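Equations (4.16) and (4.17) (resp. (4.18) and (4.19)) correspond to the $S_1$ and $P_1$ reflected waves for the $S_1$ (resp. $P_1$) incident wave, respectively.

We have the following 9 sets of $\xi^0$ that cause singularities corresponding to $\operatorname{supp}F^{I}_{j\xi^0km}(x, y)$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$) in (4.2), (4.3), and (4.5), that is, $\xi^0$ are roots of $\det P_1^{I}(\xi^0) = 0$ and are zeros of $\tau_m^{+}(\xi^{0\prime})$ ($m \in \{p_1, s_2, p_2\}$):
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_1}}, \tag{4.20}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{s_2}}, \tag{4.21}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{s_2}}, \tag{4.22}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}, \tag{4.23}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}, \tag{4.24}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{s_2}}, \tag{4.25}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{s_2}}, \tag{4.26}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}, \tag{4.27}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_1}^{-}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}. \tag{4.28}
\]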

Equation (4.20) corresponds to the $P_1 - S_1$ lateral or glancing wave for the $S_1$ incident wave. The $P_1 - S_1$ lateral wave is a wave that is originally the $P_1$ reflected wave, tends to total reflection, and then becomes a source and causes an $S_1$ reflected wave (see Fig. 2, which shows the $P_2 - S_1$, $P_2 - P_1$, and $P_2 - S_2$ lateral waves). Equations (4.21) and (4.22) (resp. (4.23) and (4.24)) correspond to the $S_2 - S_1$ and $S_2 - P_1$ (resp. $P_2 - S_1$ and $P_2 - P_1$) lateral waves for the $S_1$ incident wave, respectively. Equations (4.25) and (4.26) (resp. (4.27) and (4.28)) correspond to the $S_2 - S_1$ and $S_2 - P_1$ (resp. $P_2 - S_1$ and $P_2 - P_1$) lateral waves for the $P_1$ incident wave, respectively.

For $F^{II}_{2\times 2}(x, y)$, we have the following 4 sets of $\xi^0$ that cause singularities corresponding to $\operatorname{supp}F^{II}_{j\xi^0k}(x, y)$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$) in (4.6), (4.8), and (4.9), that is, $\xi^0$ are roots of $\det P_1^{I}(\xi^0) = 0$ and are not zeros of $\tau_{p_2}^{+}(\xi^{0\prime})$:
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{s_2}}, \tag{4.29}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_2}}, \tag{4.30}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{s_2}}, \tag{4.31}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{p_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| < \frac{1}{c_{p_2}}. \tag{4.32}
\]

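Equations (4.29) and (4.30) (resp. (4.31) and (4.32)) correspond to the $S_2$ and $P_2$ refracted waves for the $S_1$ (resp. $P_1$) incident wave, respectively.

We have the following 2 sets of $\xi^0$ that cause singularities corresponding to $\operatorname{supp}F^{II}_{j\xi^0km}(x, y)$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$, $m \in \{p_2\}$) in (4.7), (4.8), and (4.10), that is, $\xi^0$ are roots of $\det P_1^{I}(\xi^0) = 0$ and are zeros of $\tau_{p_2}^{+}(\xi^{0\prime})$:
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{s_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}, \tag{4.33}
\]
\[
\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime})) \ \text{for } \tau_{s_2}^{+}(\xi^{0\prime}) \quad \text{with } |\xi^{0\prime\prime\prime}| = \frac{1}{c_{p_2}}. \tag{4.34}
\]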

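Equation (4.33) (resp. (4.34)) corresponds to the $P_2 - S_2$ lateral wave for the $S_1$ (resp. $P_1$) incident wave.

Remark. It is sufficient to consider only the case $\xi_0^0 = 1$, since $(\Gamma'_{j\xi^0})^{I}_{k} = (\Gamma'_{j(t\xi^0)})^{I}_{k}$, $(\Gamma'_{j\xi^0m})^{I}_{k} = (\Gamma'_{j(t\xi^0)m})^{I}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$), and $(\Gamma'_{j\xi^0})^{II}_{k} = (\Gamma'_{j(t\xi^0)})^{II}_{k}$, $(\Gamma'_{j\xi^0m})^{II}_{k} = (\Gamma'_{j(t\xi^0)m})^{II}_{k}$ ($j \in \{p_1, s_1\}$, $k \in \{p_2, s_2\}$, $m \in \{p_2\}$) for $t \in \mathbb{R} \setminus \{0\}$.

We calculate the singularity caused by the point (4.19) as a reflected wave. The localization of $\det P_{p_1}^{I}(\eta)$ at the point $\xi^0 = (1, \xi_1^0, \xi_2^0, \tau_{p_1}^{+}(\xi^{0\prime}))$ is given by
\[
(\det P_{p_1}^{I})_{\xi^0}(\eta) = \eta_0 - c_{p_1}^2\bigl(\xi_1^0\eta_1 + \xi_2^0\eta_2 + \tau_{p_1}^{+}(\xi^{0\prime})\eta_3\bigr),
\]
so $\Gamma_{p_1\xi^0}$ is given by
\[
\Gamma_{p_1\xi^0} = \bigl\{\eta \in \mathbb{R}^4 : \eta_0 - c_{p_1}^2\bigl(\xi_1^0\eta_1 + \xi_2^0\eta_2 + \tau_{p_1}^{+}(\xi^{0\prime})\eta_3\bigr) > 0\bigr\}.
\]
The set $(\Gamma'_{p_1\xi^0})^{I}_{p_1}$ is calculated as follows:
\[
\begin{aligned}
(\Gamma'_{p_1\xi^0})^{I}_{p_1}
&= \Bigl\{(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- : \Bigl\{(x' - y') + \frac{x_3}{\tau_{p_1}^{-}(\xi^{0\prime})}\Bigl(\frac{1}{c_{p_1}^2}, -\xi_1^0, -\xi_2^0\Bigr)\Bigr\}\cdot\eta' - y_3\eta_3 \ge 0 \ \text{for any } \eta \in \Gamma_{p_1\xi^0}\Bigr\}\\
&= \Bigl\{(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- : \Bigl((x_0 - y_0) + \frac{x_3}{\tau_{p_1}^{-}(\xi^{0\prime})c_{p_1}^2},\ (x_1 - y_1) - \frac{\xi_1^0x_3}{\tau_{p_1}^{-}(\xi^{0\prime})},\ (x_2 - y_2) - \frac{\xi_2^0x_3}{\tau_{p_1}^{-}(\xi^{0\prime})},\ -y_3\Bigr)\\
&\qquad\qquad = u\bigl(1, -c_{p_1}^2\xi_1^0, -c_{p_1}^2\xi_2^0, -c_{p_1}^2\tau_{p_1}^{+}(\xi^{0\prime})\bigr),\ u \ge 0\Bigr\}\\
&= \Bigl\{(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- : (x_0 - y_0) + \frac{x_3}{\tau_{p_1}^{-}(\xi^{0\prime})c_{p_1}^2} \ge 0,\\
&\qquad\qquad x_1 - y_1 = -c_{p_1}^2\xi_1^0(x_0 - y_0),\quad x_2 - y_2 = -c_{p_1}^2\xi_2^0(x_0 - y_0),\\
&\qquad\qquad x_3 + y_3 = c_{p_1}^2\tau_{p_1}^{+}(\xi^{0\prime})(x_0 - y_0)\Bigr\}.
\end{aligned} \tag{4.35}
\]
From (4.35) and $\tau_{p_1}^{+}(\xi^{0\prime})^2 + (\xi_1^0)^2 + (\xi_2^0)^2 = \dfrac{1}{c_{p_1}^2}$, we have
\[
\bigcup_{|\xi^{0\prime\prime\prime}| < 1/c_{p_1}}(\Gamma'_{p_1\xi^0})^{I}_{p_1} = \Bigl\{(x, y) \in \mathbb{R}^4_- \times \mathbb{R}^4_- : x_0 > 0,\ (x_0 - y_0)^2 = \frac{1}{c_{p_1}^2}\{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 + y_3)^2\}\Bigr\}.
\]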
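The passage from (4.35) to the union formula can be checked numerically. The Python sketch below builds a pair $(x, y)$ from the three equalities in (4.35) with $y' = 0$ and $\xi_0^0 = 1$, and then evaluates the residual of the cone equation $(x_0 - y_0)^2 = c_{p_1}^{-2}\{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 + y_3)^2\}$. All function names and the sample values are hypothetical and serve only as an illustration.

```python
import math

def tau_p1_plus(xi1, xi2, cp1):
    """tau_{p1}^{+} at xi^0 = (1, xi1, xi2, .) for |xi'''| < 1/c_p1, i.e. sgn(-1) * sqrt(...)."""
    d = 1.0 / cp1 ** 2 - xi1 ** 2 - xi2 ** 2
    if d <= 0:
        raise ValueError("this sketch needs |xi'''| < 1/c_p1")
    return -math.sqrt(d)

def reflected_point(xi1, xi2, t, x3, cp1):
    """Return (x, y) satisfying the equalities in (4.35), with y' = 0 and x0 - y0 = t."""
    tau = tau_p1_plus(xi1, xi2, cp1)
    x = (t, -cp1 ** 2 * xi1 * t, -cp1 ** 2 * xi2 * t, x3)
    y = (0.0, 0.0, 0.0, cp1 ** 2 * tau * t - x3)   # from x3 + y3 = c_p1^2 tau^+ t
    return x, y

def inequality_holds(x, y, xi1, xi2, cp1):
    """Check (x0 - y0) + x3 / (tau_{p1}^{-} c_p1^2) >= 0, using tau^- = -tau^+."""
    tau_minus = -tau_p1_plus(xi1, xi2, cp1)
    return (x[0] - y[0]) + x[3] / (tau_minus * cp1 ** 2) >= 0

def cone_residual(x, y, cp1):
    """Left side minus right side of the reflected-wave cone equation."""
    rhs = ((x[1] - y[1]) ** 2 + (x[2] - y[2]) ** 2 + (x[3] + y[3]) ** 2) / cp1 ** 2
    return (x[0] - y[0]) ** 2 - rhs
```

For instance, with $c_{p_1} = 2$, $(\xi_1^0, \xi_2^0) = (0.1, 0.2)$, $x_0 - y_0 = 1$ and $x_3 = -0.5$, both $x_3$ and $y_3$ are negative, the inequality in (4.35) holds, and the residual vanishes up to rounding. The term $x_3 + y_3$ reflects the mirror image of the source across the interface.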

Moreover the singularity caused by the point (4.22) as a lateral wave is calculated as 4 4 (0s1 ξ 0 s2 )Ip1 = (x, y) ∈ R− × R− :

[ 000

|ξ 0 |= cs1

2

cs cs q 2 x3 + q 2 y3 ≥ 0, 2 2 cp1 cs2 − cp1 cs1 cs22 − cs21 q q  2 cs22 − cp2 1 cs22 − cs21 2 2   x3 + y3 . (x1 − y1 ) + (x2 − y2 ) = cs2 (x0 − y0 ) + cp1 cs1 (x0 − y0 ) +


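If we cut at $x_2 = 0$ and $y' = 0$, then the sectional face is given by
$$\bigcup_{|\xi^{0\prime\prime\prime}| = 1/c_{s_2}} (\Gamma_{s_1\xi^0 s_2})^{I}_{p_1} \cap \{x_2 = 0\} \cap \{y' = 0\} = \Bigl\{(x_0, x_1, x_3, y_3) \in \mathbb{R}^3_- \times \mathbb{R}_- : x_0 + \frac{c_{s_2}}{c_{p_1}\sqrt{c_{s_2}^2 - c_{p_1}^2}}\,x_3 + \frac{c_{s_2}}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}}\,y_3 \ge 0,$$
$$x_1 = \pm\Bigl\{c_{s_2}x_0 + \frac{\sqrt{c_{s_2}^2 - c_{p_1}^2}}{c_{p_1}}\,x_3 + \frac{\sqrt{c_{s_2}^2 - c_{s_1}^2}}{c_{s_1}}\,y_3\Bigr\}\Bigr\}.$$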

The singularities caused by other points are calculated similarly. Thus we obtain the following illustration of the inner estimate of the location of singularities of the reflected and refracted Riemann functions $F^{I}_{2\times 2}(x,y)$ and $F^{II}_{2\times 2}(x,y)$ with the passage of time as in Fig. 4. Figure 4 illustrates the sectional face with fixed $x_0$ and intersection of $x_2 = 0$ of the figure of time and space when $y' = 0$ and $y_3$ is fixed. Dotted lines show incident waves caused by $E^{I}_{2\times 2}(x-y)$, solid curved lines in the lower (resp. upper) half-space show reflected (resp. refracted) waves caused by $F^{I}_{2\times 2}(x,y)$ (resp. $F^{II}_{2\times 2}(x,y)$), and solid straight lines in the lower (resp. upper) half-space show lateral waves caused by $F^{I}_{2\times 2}(x,y)$ (resp. $F^{II}_{2\times 2}(x,y)$).

Remark 1. Under the assumption (2.3), we obtain the following two order relations:
$$\frac{1}{c_{s_1}} < \frac{c_{p_2}}{c_{s_1}\sqrt{c_{p_2}^2 - c_{s_1}^2}} < \frac{c_{s_2}}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}} < \frac{c_{p_1}}{c_{s_1}\sqrt{c_{p_1}^2 - c_{s_1}^2}},$$
$$\frac{1}{c_{p_1}} < \frac{c_{p_2}}{c_{p_1}\sqrt{c_{p_2}^2 - c_{p_1}^2}} < \frac{c_{s_2}}{c_{p_1}\sqrt{c_{s_2}^2 - c_{p_1}^2}}.$$
In order to visualize, we put $c_{s_1} = 1$, $c_{p_1} = 2$, $c_{s_2} = 3$, $c_{p_2} = 4$, and $y_3 = -1$. Under this condition, we obtain the order relation
$$\frac{c_{p_2}}{c_{p_1}\sqrt{c_{p_2}^2 - c_{p_1}^2}} < \frac{c_{s_2}}{c_{p_1}\sqrt{c_{s_2}^2 - c_{p_1}^2}} < \frac{c_{p_2}}{c_{s_1}\sqrt{c_{p_2}^2 - c_{s_1}^2}} < \frac{c_{s_2}}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}} < \frac{c_{p_1}}{c_{s_1}\sqrt{c_{p_1}^2 - c_{s_1}^2}}.$$
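The order relation of Remark 1 for the sample speeds can be reproduced with a few lines of arithmetic. This is a minimal sketch: the helper `threshold` and the dictionary keys are hypothetical names introduced here, and the quantity it evaluates is $c\,(-y_3)/(c'\sqrt{c^2 - c'^2})$ with $c > c'$, the form in which the time thresholds appear in Fig. 4.

```python
import math


def threshold(c_fast, c_slow, depth=1.0):
    """c_fast * depth / (c_slow * sqrt(c_fast^2 - c_slow^2)), with depth = -y3 > 0."""
    return c_fast * depth / (c_slow * math.sqrt(c_fast**2 - c_slow**2))


# Sample speeds from Remark 1 and y3 = -1.
c_s1, c_p1, c_s2, c_p2 = 1.0, 2.0, 3.0, 4.0
depth = 1.0

times = {
    "1/c_p1": depth / c_p1,
    "c_p2|c_p1": threshold(c_p2, c_p1, depth),
    "c_s2|c_p1": threshold(c_s2, c_p1, depth),
    "1/c_s1": depth / c_s1,
    "c_p2|c_s1": threshold(c_p2, c_s1, depth),
    "c_s2|c_s1": threshold(c_s2, c_s1, depth),
    "c_p1|c_s1": threshold(c_p1, c_s1, depth),
}

# Sort the labels by increasing threshold time.
ordered = sorted(times, key=times.get)
for name in ordered:
    print(f"{name:10s} {times[name]:.4f}")
```

Since $c \mapsto c/\sqrt{c^2 - c'^2}$ is decreasing for $c > c'$, the faster speed always gives the earlier threshold within each group, which is what the sorted output shows.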

Remark 2. If we consider the half-space problem, then the singularities corresponding to reflected waves caused by the points (4.16)–(4.19) and only the singularity corresponding to a lateral wave caused by the point (4.20) appear. Other singularities do not appear. Remark 3. If we consider the stratified media problem of the usual wave operator, then the singularity corresponding to a reflected (resp. refracted) wave caused by the point (4.16) (resp. (4.29)) and only the singularity corresponding to a lateral wave caused by the point (4.21) appear. Other singularities do not appear. Remark 4. If cp1 = cs2 in assumption (2.3), then the singularities corresponding to lateral waves caused by the points (4.25) and (4.26) do not appear.


Remark 5. Reflected or refracted waves come in contact with lateral waves since discriminants of simultaneous equations of quadratic curves and straight lines are equal to 0.

[Figure 4: seven panels in the $(x_1, x_3)$-plane, with the $P_1$ and $S_1$ incident waves and the wavefronts labelled $P_1$, $S_1$, $P_2$, $S_2$, for the successive time ranges
$0 < x_0 \le -\dfrac{y_3}{c_{p_1}}$;
$-\dfrac{y_3}{c_{p_1}} < x_0 \le \dfrac{c_{p_2}(-y_3)}{c_{p_1}\sqrt{c_{p_2}^2 - c_{p_1}^2}}$;
$\dfrac{c_{p_2}(-y_3)}{c_{p_1}\sqrt{c_{p_2}^2 - c_{p_1}^2}} < x_0 \le \dfrac{c_{s_2}(-y_3)}{c_{p_1}\sqrt{c_{s_2}^2 - c_{p_1}^2}}$;
$\dfrac{c_{s_2}(-y_3)}{c_{p_1}\sqrt{c_{s_2}^2 - c_{p_1}^2}} < x_0 \le \dfrac{c_{p_2}(-y_3)}{c_{s_1}\sqrt{c_{p_2}^2 - c_{s_1}^2}}$;
$\dfrac{c_{p_2}(-y_3)}{c_{s_1}\sqrt{c_{p_2}^2 - c_{s_1}^2}} < x_0 \le \dfrac{c_{s_2}(-y_3)}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}}$;
$\dfrac{c_{s_2}(-y_3)}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}} < x_0 \le \dfrac{c_{p_1}(-y_3)}{c_{s_1}\sqrt{c_{p_1}^2 - c_{s_1}^2}}$;
$\dfrac{c_{p_1}(-y_3)}{c_{s_1}\sqrt{c_{p_1}^2 - c_{s_1}^2}} < x_0$.]

Fig. 4. Inner estimate of the location of singularities of $F^{I}_{2\times 2}(x,y)$, $F^{II}_{2\times 2}(x,y)$, and $E^{I}_{2\times 2}(x-y)$

Secondly we consider $F^{I}_{1\times 1}(x,y)$, $F^{II}_{1\times 1}(x,y)$, and $E^{I}_{1\times 1}(x-y)$. These are treated the same as $p_1 = p_2 = 0$ in the case of $F^{I}_{2\times 2}(x,y)$ and $F^{II}_{2\times 2}(x,y)$. Thus we obtain the following illustration of the inner estimate of the location of singularities of the reflected and refracted Riemann functions $F^{I}_{1\times 1}(x,y)$ and $F^{II}_{1\times 1}(x,y)$ with passage of time as in Fig. 5. The dotted line shows the incident wave caused by $E^{I}_{1\times 1}(x-y)$, solid curved lines in the lower (resp. upper) half-space show reflected (resp. refracted) waves caused by $F^{I}_{1\times 1}(x,y)$ (resp. $F^{II}_{1\times 1}(x,y)$), and solid straight lines in the lower half-space show lateral waves caused by $F^{I}_{1\times 1}(x,y)$.

5. Proof of Main Theorem

In this section, we give a proof of the Main Theorem. We prove it for the reflected Riemann function $F^{I}(x,y)$. A similar proof is given for the refracted Riemann function $F^{II}(x,y)$. The first part of the theorem is derived by the localization method. First we prove Eq. (4.1). We consider the case when $j = s_1$, $k = p_1$, that is, consider the $P_1$ reflected wave for the $S_1$ incident wave, and consider the point $\xi^0$ satisfying (4.17). We calculate

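[Figure 5: three panels in the $(x_1, x_3)$-plane, with the $S_1$ incident wave and the wavefronts labelled $S_1$, $S_2$, for the successive time ranges
$0 < x_0 \le -\dfrac{y_3}{c_{s_1}}$;
$-\dfrac{y_3}{c_{s_1}} < x_0 \le \dfrac{c_{s_2}(-y_3)}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}}$;
$\dfrac{c_{s_2}(-y_3)}{c_{s_1}\sqrt{c_{s_2}^2 - c_{s_1}^2}} < x_0$.]

Fig. 5. Inner estimate of the location of singularities of $F^{I}_{1\times 1}(x,y)$, $F^{II}_{1\times 1}(x,y)$, and $E^{I}_{1\times 1}(x-y)$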

0

0

00

−

00

νe−iν{(x −y )·ξ +x3 τp1 (ξ )−y3 ξ3 } F I (x, y) Z Z 0 0 0 00 0 0 ei(x −y )·(−νξ +ξ +iη ) e−iy3 (−νξ3 +ξ3 +iη3 ) U(ξ 000 + iη000 )C × = (2π)−4 R3

0

R




{P1I (ζ )−1 }1    I  {B (ζ )P I (ζ )−1 }1 1  1  

· · · ·

· · · ·

 ν −1 R1 (ξ 0 +iη0 )       · {P1I (ζ )−1 }1    ·   · {B I (ζ )P I (ζ )−1 }1 1 1   ·  + ν −1 R1 (ξ 0 +iη0 )  


· · · ·

· · · ·

! + 0 0 − 00 |ξ 000 + iη000 | e−i{τp1 (ξ +iη )+ντp1 (ξ )}x3 + 0 0 −τp1 (ξ + iη )

· · · ! · τs+1 (ξ 0 + iη0 ) −i{τs+ (ξ 0 +iη0 )+ντp− (ξ 00 )}x3 1 1 e |ξ 000 + iη000 | 0

{P1I (ζ )−1 }2 I {B (ζ )P I (ζ )−1 }2 1 1

· · · ·

· · · ·

· · · ·

! 0 |ξ 000 + iη000 | −i{τp+ (ξ 0 +iη0 )+ντp− (ξ 0 )}x3 1 1 e −τp+1 (ξ 0 + iη0 )

ν −1 R1 (ξ 0 +iη0 )

+

· · · ·

· · · · {B1I (ζ )P1I (ζ )−1 }2 · · ! · · τs+1 (ξ 0 + iη0 ) −i{τs+ (ξ 0 +iη0 )+ντp− (ξ 00 )}x3 1 1 e ν −1 R1 (ξ 0 +iη0 ) |ξ 000 + iη000 | {P1I (ζ )−1 }2

0

P2I (ζ )−1 I B (ζ )P I (ζ )−1 2 2 ν −1 R2 (ξ 0 +iη0 )



0 0

 

  ·  · −i{τs+ (ξ 0 +iη0 )+ντp− (ξ 00 )}x3 

e

1

−1 dξ3 dξ 0 , U(ξ 000 + iη000 )C

1

where ζ = ξ + iη. Making the change of variable −νξ 0 + ξ = κ, then we have = (2π)−4

Z R3

0

0

0

0

ei(x −y )·(κ +iη )

Z R

000

e−iy3 (κ3 +iη3 ) U(νξ 0 + κ 000 + iη000 )C ×

(5.1)




I {B (νξ 0 1

                         +     

{P1I (νξ 0 + κ + iη)−1 }1

· · I 0 −1 + κ + iη)P1 (νξ + κ + iη) }1 · · 0

· · · ·

· · · ·

! 000 |νξ 0 + κ 000 + iη000 | 0 −τp+1 (νξ 0 + κ 0 + iη0 )

ν −1 R1 (νξ 0 +κ 0 +iη0 ) +

×e−i{τp1 (νξ · · · ·

0 0 +κ 0 +iη0 )−ντ + (ξ 0 0 )}x 3 p1

· · · · {B1I (νξ 0 + κ + iη)P1I (νξ 0 + κ + iη)−1 }1 · · · · {P1I (νξ 0 + κ + iη)−1 }1

0

ν −1 R1 (νξ 0 +κ 0 +iη0 ) +

×e−i{τs1 (νξ 0

and so on

! 000 |νξ 0 + κ 000 + iη000 | 0 −τp+1 (νξ 0 + κ 0 + iη0 )

0 0 +κ 0 +iη0 )−ντ + (ξ 0 0 )}x 3 p1

−1 000 dκ3 dκ 0 . U(νξ 0 + κ 000 + iη000 )C

(5.2)

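Note that
$$\tau^+_k(\nu\xi^{0\prime} + \kappa' + i\eta') = \nu\tau^+_k(\xi^{0\prime}) + \operatorname{grad}\tau^+_k(\xi^{0\prime})\cdot(\kappa' + i\eta') + O(\nu^{-1}), \quad k \in \{p_1, s_1\}, \qquad (5.3)$$
and
$$U(\nu\xi^{0\prime\prime\prime} + \kappa''' + i\eta''') = \frac{1}{\sqrt{(\nu\xi^0_1 + \kappa_1 + i\eta_1)^2 + (\nu\xi^0_2 + \kappa_2 + i\eta_2)^2}}
\begin{pmatrix}
\nu\xi^0_1 + \kappa_1 + i\eta_1 & -\nu\xi^0_2 - \kappa_2 - i\eta_2 & 0 \\
\nu\xi^0_2 + \kappa_2 + i\eta_2 & \nu\xi^0_1 + \kappa_1 + i\eta_1 & 0 \\
0 & 0 & \sqrt{(\nu\xi^0_1 + \kappa_1 + i\eta_1)^2 + (\nu\xi^0_2 + \kappa_2 + i\eta_2)^2}
\end{pmatrix}$$
$$\to \frac{1}{\sqrt{{\xi^0_1}^2 + {\xi^0_2}^2}}
\begin{pmatrix}
\xi^0_1 & -\xi^0_2 & 0 \\
\xi^0_2 & \xi^0_1 & 0 \\
0 & 0 & \sqrt{{\xi^0_1}^2 + {\xi^0_2}^2}
\end{pmatrix}
= U(\xi^{0\prime\prime\prime}) \quad \text{as } \nu \to \infty \text{ for } \xi^{0\prime\prime\prime} \ne 0, \qquad (5.4)$$
$$U(\nu\xi^{0\prime\prime\prime} + \kappa''' + i\eta''') \to U(\kappa''' + i\eta''') \quad \text{as } \nu \to \infty \text{ for } \xi^{0\prime\prime\prime} = 0, \qquad (5.5)$$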
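The expansion (5.3) can be illustrated numerically. The sketch below assumes the form $\tau(\xi') = \sqrt{\xi_0^2/c^2 - \xi_1^2 - \xi_2^2}$ (suggested by the relation $\tau^+_{p_1}(\xi^{0\prime})^2 + {\xi^0_1}^2 + {\xi^0_2}^2 = 1/c_{p_1}^2$ used for (4.35)), restricts to real shifts ($\eta = 0$), and uses arbitrary sample values; all names are illustrative.

```python
import math


def tau(xi, c):
    """Assumed form tau(xi') = sqrt(xi0^2/c^2 - xi1^2 - xi2^2), real branch."""
    xi0, xi1, xi2 = xi
    return math.sqrt(xi0**2 / c**2 - xi1**2 - xi2**2)


def grad_tau(xi, c):
    """Gradient of tau with respect to (xi0, xi1, xi2)."""
    xi0, xi1, xi2 = xi
    t = tau(xi, c)
    return (xi0 / (c**2 * t), -xi1 / t, -xi2 / t)


def remainder(nu, xi0p, kappa, c):
    """tau(nu*xi0' + kappa') - nu*tau(xi0') - grad tau(xi0') . kappa'."""
    shifted = tuple(nu * a + k for a, k in zip(xi0p, kappa))
    linear = sum(g * k for g, k in zip(grad_tau(xi0p, c), kappa))
    return tau(shifted, c) - nu * tau(xi0p, c) - linear


c = 2.0                      # sample speed
xi0p = (1.0, 0.3, 0.2)       # sample point with xi0 = 1 and tau real
kappa = (0.5, -0.4, 0.7)     # sample real shift

for nu in (10.0, 100.0, 1000.0):
    r = remainder(nu, xi0p, kappa, c)
    print(f"nu = {nu:7.1f}   remainder = {r: .6e}   nu * remainder = {nu * r: .4f}")
```

The product `nu * remainder` settles near a constant, consistent with the $O(\nu^{-1})$ term in (5.3); it relies on $\tau$ being homogeneous of degree one, so that $\nu\tau(\xi^{0\prime})$ is the exact leading term.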


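$$\nu^{-1}|\nu\xi^{0\prime\prime\prime} + \kappa''' + i\eta'''| = \nu^{-1}\sqrt{(\nu\xi^0_1 + \kappa_1 + i\eta_1)^2 + (\nu\xi^0_2 + \kappa_2 + i\eta_2)^2} = \sqrt{{\xi^0_1}^2 + {\xi^0_2}^2}\,\sqrt{1 + O(\nu^{-1})}$$
$$\to \sqrt{{\xi^0_1}^2 + {\xi^0_2}^2} = |\xi^{0\prime\prime\prime}| \quad \text{as } \nu \to \infty \text{ for } \xi^{0\prime\prime\prime} \ne 0, \qquad (5.6)$$
$$|\nu\xi^{0\prime\prime\prime} + \kappa''' + i\eta'''| \to |\kappa''' + i\eta'''| \quad \text{as } \nu \to \infty \text{ for } \xi^{0\prime\prime\prime} = 0. \qquad (5.7)$$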

For a $3 \times 3$ matrix valued function $\phi(x,y) \in C_0^\infty(\mathbb{R}^4_- \times \mathbb{R}^4_-)$, we have

0 0 0 00 − 00 νe−iν{(x −y )·ξ +x3 τp1 (ξ )−y3 ξ3 } F I (x, y), φ(x, y)

x,y

= (2π)−2 U(ξ 000 + iη000 )C × 

{P1I (ζ )−1 }1     {B I (ζ )P I (ζ )−1 } 1  1 1  

· · ·

· · · · · ·

· · ·   −1 R (ξ 0 +iη0 ) ν 1       · {P1I (ζ )−1 }1 ·    · ·    · {B1I (ζ )P1I (ζ )−1 }1 ·   · ·  + ν −1 R1 (ξ 0 +iη0 )     and so on e

0

iνx3 τp+ (ξ 0 ) 1

|ξ 000 + iη000 |

! +

0

0

+

0

0

e−iτp1 (ξ +iη )x3

−τp+1 (ξ 0 + iη0 )

·

·

·

· τs+ (ξ 0 + iη0 ) 1 |ξ 000

+ iη000 |

! e−iτs1 (ξ +iη )x3

0 −1 , U(ξ 000 + iη000 )C

0 00 ˜ − ξ 0 − iη0 , x3 , −νξ 0 + ξ 0 + iη0 , −νξ30 + ξ3 + iη3 ) φ(νξ

making the change of variable −νξ 0 + ξ = κ, then we have

ξ x3

,


000 = (2π)−2 U(νξ 0 + κ 000 + iη000 )C × 

I {B (νξ 0 1

{P1I (νξ 0 + κ + iη)−1 }1

· · I 0 −1 + κ + iη)P1 (νξ + κ + iη) }1 · ·

· · · ·

· · · ·

     ! 000  |νξ 0 + κ 000 + iη000 |   0 0 ν −1 R1 (νξ 0 +κ 0 +iη0 )  −τp+1 (νξ 0 + κ 0 + iη0 )  0 + 0 0 0   ×e−iτp1 (νξ +κ +iη )x3     {P1I (νξ 0 + κ + iη)−1 }1 · ·  ·   · · ·    · {B1I (νξ 0 + κ + iη)P1I (νξ 0 + κ + iη)−1 }1 · · !  0000 + κ 000 + iη000 |  · · · |νξ + 0 0  ν −1 R1 (νξ 0 +κ 0 +iη0 ) −τp+1 (νξ 0 + κ 0 + iη0 )   0 + 0 0 0  ×e−iτs1 (νξ +κ +iη )x3    0

−1 000 , U(νξ 0 + κ 000 + iη000 )C + 00 0 ˜ − iη0 , x3 , κ 0 + iη0 , κ3 + iκ3 ) . eiνx3 τp1 (ξ ) φ(−κ and so on

(5.8)

κ x3

Here $\langle\ \rangle$ denotes a sum of each component, and
$$\tilde\phi(\zeta', x_3, z) = \mathcal{FL}_{(x',y)}[\phi(x,y)](\zeta', z),$$
where $\mathcal{FL}$ denotes the Fourier–Laplace transformation. If $\xi^{0\prime\prime\prime} \ne 0$, then for the term including $e^{-i\tau^+_{p_1}(\nu\xi^{0\prime} + \kappa' + i\eta')x_3}$, we have by using (5.3),

**

−→(2π)−2

000

1

000

U(ξ 0 )C

I ) (κ (det P1s 0 1 ξ

!

+ iη)

! 0 |ξ | igradτp− (ξ 0 )x3 ·(κ 0 +iη0 ) 1 e and so on 0 R1 (ξ 0 0 ) −τp+1 (ξ 0 ) −1 000 0 ˜ , φ(−κ − iη0 , x3 , κ 0 + iη0 , κ3 + iη3 ) , U(ξ 0 )C Q1 (ξ 0 )

0 000

(5.9)

κ x3

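where
$$(\det P^{I}_{1s_1})_{\xi^0}(\kappa + i\eta) = 2\bigl({\xi^0_0}^2 - c_{p_1}^2|\xi^{0\prime\prime}|^2\bigr)\bigl\{\xi^0_0(\kappa_0 + i\eta_0) - c_{s_1}^2\bigl(\xi^0_1(\kappa_1 + i\eta_1) + \xi^0_2(\kappa_2 + i\eta_2) + \xi^0_3(\kappa_3 + i\eta_3)\bigr)\bigr\},$$
and $Q_1(\xi^0)$ is defined by (5.14) below. Here we note that $\tau^+_{p_1}(\xi^{0\prime}) = -\tau^-_{p_1}(\xi^{0\prime})$ since $\tau^+_{p_1}(\xi^{0\prime})$ is real for $\xi^0$ satisfying (4.17). For the term including $e^{-i\tau^+_{s_1}(\nu\xi^{0\prime} + \kappa' + i\eta')x_3}$, the right-hand side of (5.8) could be equal to
$$(2\pi)^{-2}\Bigl\langle\Bigl\langle e^{-i\nu\{\tau^+_{s_1}(\xi^{0\prime}) - \tau^+_{p_1}(\xi^{0\prime})\}x_3}\, f(\kappa + i\eta, \nu),\ \tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\Bigr\rangle_{\kappa}\Bigr\rangle_{x_3} \qquad (5.10)$$
by using
$$\tau^+_{s_1}(\nu\xi^{0\prime} + \kappa' + i\eta') - \nu\tau^+_{p_1}(\xi^{0\prime}) = \nu\{\tau^+_{s_1}(\xi^{0\prime}) - \tau^+_{p_1}(\xi^{0\prime})\} + \operatorname{grad}\tau^+_{s_1}(\xi^{0\prime})\cdot(\kappa' + i\eta') + O(\nu^{-1}).$$
We put
$$\langle\ \rangle_{\kappa} = e^{-i\nu(\tau^+_{s_1}(\xi^{0\prime}) - \tau^+_{p_1}(\xi^{0\prime}))x_3}\, g(x_3, \nu).$$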

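The function $g(x_3, \nu)$ belongs to $C_0^\infty$ with respect to $x_3$, and $\operatorname{supp}_{x_3} g(x_3, \nu)$ is contained in a compact set independent of $\nu$. Moreover, $\bigl|\frac{\partial}{\partial x_3} g(x_3, \nu)\bigr| \le C$ for some constant $C$ independent of $\nu$. Thus $\langle\ \rangle_{x_3}$ in (5.10) is
$$\langle\ \rangle_{x_3} = \int e^{-i\nu(\tau^+_{s_1}(\xi^{0\prime}) - \tau^+_{p_1}(\xi^{0\prime}))x_3}\, g(x_3, \nu)\,dx_3 \to 0 \quad \text{as } \nu \to \infty \qquad (5.11)$$
because of integration by parts or the Riemann–Lebesgue theorem. By (5.9) and (5.11), we obtain
$$\lim_{\nu\to\infty}\Bigl\langle \nu e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_{p_1}(\xi^{0\prime}) - y_3\xi^0_3\}} F^{I}(x,y),\ \phi(x,y)\Bigr\rangle_{x,y} = \bigl\langle F^{I}_{s_1\xi^0 p_1}(x,y),\ \phi(x,y)\bigr\rangle_{x,y} \qquad (5.12)$$
for $\phi(x,y) \in C_0^\infty(\mathbb{R}^4_- \times \mathbb{R}^4_-)$.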
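The decay used in (5.11) can be seen on a toy example. The following is a minimal sketch, not the integral of the paper: it replaces $g(x_3, \nu)$ by a fixed smooth bump supported in $(-1, 1)$ and the phase factor $\tau^+_{s_1}(\xi^{0\prime}) - \tau^+_{p_1}(\xi^{0\prime})$ by an assumed nonzero constant `a`; the function names are illustrative.

```python
import cmath
import math


def bump(x):
    """Smooth function supported in (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def oscillatory_integral(nu, a=1.0, n=20000):
    """Trapezoidal approximation of the integral of exp(-i*nu*a*x) * bump(x) over [-1, 1]."""
    h = 2.0 / n
    total = 0.0 + 0.0j
    for j in range(1, n):  # the endpoints contribute nothing because bump(+-1) = 0
        x = -1.0 + j * h
        total += cmath.exp(-1j * nu * a * x) * bump(x)
    return total * h


for nu in (0.0, 40.0, 400.0):
    print(f"nu = {nu:6.1f}   |integral| = {abs(oscillatory_integral(nu)):.3e}")
```

Two integrations by parts bound the integral by a constant times $\nu^{-2}$ here, because the bump and its derivatives vanish at the endpoints. In the text only a uniform bound on the first derivative of $g$ is stated, which gives the rate $\nu^{-1}$ and is already enough for the limit.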

Here FsI ξ 0 p (x, y) 1 1 Z Z 0 0 − 00 0 0 000 ei(x −y +gradτp1 (ξ )x3 )·(κ +iη ) e−iy3 (κ3 +iη3 ) U(ξ 0 )C = (2π)−4 R3 R ! !   0 000 | 0 000 | 0 0 0 |ξ |ξ Q1 (ξ ) Q2 (ξ ) 1  R1 (ξ 0 0 )  0 0 00) + 0 + 0 R (ξ 1 × −τp1 (ξ ) −τp1 (ξ ) 0  (det PsI1 )ξ 0 (κ + iη) 0 0 0 −1 000 dκ3 dκ 0 , (5.13) U(ξ 0 )C


where Q1 (ξ 0 ) and Q2 (ξ 0 ) are given by {P I (ξ 0 )−1 }1 · · · 1 · · · Q1 (ξ 0 ) = det P1I (ξ 0 ) × I 0 I 0 −1 {B1 (ξ )P1 (ξ ) }1 · · · · · · 2 2 000 · · · ξ00 − (cs21 |ξ 0 |2 + cp2 1 ξ30 ) 000 · · · (cp2 1 − cs21 )|ξ 0 |ξ30 , = 2 2 000 · · · iρ1 cs21 ξ30 {ξ00 − (2cs21 − cp2 1 )|ξ 0 |2 − cp2 1 ξ30 } iρ1 |ξ 0 000 |{(cp2 − 2cs2 )ξ00 2 − cs2 (cp2 − 2cs2 )|ξ 0 000 |2 + cp2 cs2 ξ30 2 } · · · 1 1 1 1 1 1 1 · {P1I (ξ 0 )−1 }1 · · · · · (5.14) Q2 (ξ 0 ) = det P1I (ξ 0 ) × . · {B1I (ξ 0 )P1I (ξ 0 )−1 }1 · · · · · 000

Here · means the same component of the Lopatinski matrix (3.6). If ξ 0 = 0, then using (5.5) and (5.7), the right-hand side of (5.8) is equal to ** 1 −2 U(κ 000 + iη000 )C (2π) I (det P1s1 )ξ 0 (κ + iη) ! ! Q1 (ξ 0 ) ν1 |κ 000 + iη000 | i gradτp− (ξ 0 0 )x3 ·(κ 0 +iη0 ) 1 e and so on 0 −τp+1 (ξ 0 ) R1 (ξ 0 0 ) E E −1 0 ˜ , φ(−κ − iη0 , x3 , κ 0 + iη0 , κ3 + iη3 ) . U(κ 000 + iη000 )C κ x3

Since we could put ξ 0 satisfying (4.17) to (1, 0, 0, − c1s ), we obtain (5.12) with FsI ξ 0 p (x, y) 1 1

−4

= (2π)

1

Z R3

e

i(x 0 −y 0 +( cp1 1

,0,0)x3 )·(κ 0 +iη0 )

Z

e−iy3 (κ3 +iη3 )

R

 0 00 1 Q1  0 0 0 0 ×  dκ3 dκ .  I ) (κ + iη) 00) (det P1s R (ξ 0 1 1 1 ξ cp 0 0 

(ξ 0 )

(5.15)

1

Thus we prove Eq. (4.1). Secondly we prove Eq. (4.2). We consider the case when $j = s_1$, $k = p_1$ and $m = s_2$, that is, consider the $S_2$ lateral wave of the $P_1$ reflected wave for the $S_1$ incident wave, and consider the point $\xi^0$ satisfying (4.22). We calculate

0

ν 2 e−iν{(x −y )·ξ Z 1 = (2π)−4 ν 2 3

0 0 +x τ − (ξ 0 0 )+y ξ 0 } 3 p1 3 3

R3

F I (x, y) − ν 2 FsI ξ 0 p (x, y) 1 1 Z 000 i(x 0 −y 0 )·(κ 0 +iη0 ) −iy3 (κ3 +iη3 ) e e U(νξ 0 + κ 000 + iη000 )C × 1

R




I {B (νξ 0 1


{P1I (νξ 0 + κ + iη)−1 }1

· · I 0 −1 + κ + iη)P1 (νξ + κ + iη) }1 · ·

· · · ·

· · · ·

     ! 000  |νξ 0 + κ 000 + iη000 |   0 0 ν −1 R1 (νξ 0 +κ 0 +iη0 )  −τp+1 (νξ 0 + κ 0 + iη0 )  0 0 + 0 0 0 + 0   ×e−i{τp1 (νξ +κ +iη )−ντp1 (ξ )}x3     {P1I (νξ 0 + κ + iη)−1 }1 · ·  ·   · · ·    · {B1I (νξ 0 + κ + iη)P1I (νξ 0 + κ + iη)−1 }1 · · !  0000 + κ 000 + iη000 |  · · · |νξ + 0 0  ν −1 R1 (νξ 0 +κ 0 +iη0 ) −τp+1 (νξ 0 + κ 0 + iη0 )   + 00 0 0 + 00  ×e−i{τs1 (νξ +κ +iη )−ντp1 (ξ )}x3    0

−1 000 dκ3 dκ 0 and so on U(νξ 0 + κ 000 + iη000 )C Z Z 0 0 − 00 0 0 1 000 − (2π)−4 ν 2 ei(x −y +gradτp1 (ξ )x3 )·(κ +iη ) e−iy3 (κ3 +iη3 ) U(ξ 0 )C R3 R ! !   000 000 0 |ξ 0 | |ξ 0 | Q1 (ξ 0 ) Q2 (ξ 0 ) 1  R1 (ξ 0 0 )  0 0 00) + 0 + 0 R (ξ 1 × −τp1 (ξ ) −τp1 (ξ ) 0 I ) (κ + iη)  (det P1s 0 1 ξ 0 0 0 −1 000 dκ3 dκ 0 . × U(ξ 0 )C We have for τs1 , τp1 , τp2 , 0

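$$\nu^{-1}\tau^+_l(\nu\xi^{0\prime} + \kappa' + i\eta') \to \tau^+_l(\xi^{0\prime}) \quad \text{as } \nu \to \infty \text{ for } l \in \{s_1, p_1, p_2\},$$
and for $\tau_{s_2}$
$$\nu^{-1/2}\tau^+_{s_2}(\nu\xi^{0\prime} + \kappa' + i\eta') \to \sqrt{2\Bigl\{\frac{\xi^0_0}{c_{s_2}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')\Bigr\}} \quad \text{as } \nu \to \infty,$$
where $\sqrt{\cdot}$ satisfies $\operatorname{Im}\sqrt{\cdot} > 0$. We have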

R1 (νξ 0 + κ 0 + iη0 ) v (  ) u  ξ00 1u 0 000 6 0 −2 t 0 000 000 2 2 (κ0 + iη0 ) − ξ · (κ + iη ) × = ν R1 (ξ ) + ν  cs2


 0 000 0 |ξ 0 | τp+2 (ξ 0 ) −τp+1 (ξ 0 )   −2ρ1 c2 τ + (ξ 0 0 )|ξ 0 000 | −ρ1 c2 (τ + (ξ 0 0 )2 − |ξ 0 000 |2 ) 2ρ2 c2 τ + (ξ 0 0 )|ξ 0 000 | s1 p1 s1 s1 s2 p2  2 + 00 2 000 2 0 ρ1 cs (τs (ξ ) − |ξ 0 000 |2 ) −2ρ1 cs2 τs+ (ξ 0 0 )|ξ 0 000 | −ρ2 cs2 |ξ | 1 1 1 1 000 0 τs+1 (ξ 0 ) |ξ 0 | 0 000 |ξ 0 | −τp+1 (ξ 0 ) + ρ2 cs22 −2ρ1 c2 τ + (ξ 0 0 )|ξ 0 000 | −ρ1 c2 (τ + (ξ 0 0 )2 − |ξ 0 000 |2 ) s1 p1 s1 s1  000 0 |ξ |  0 τp+2 (ξ 0 )  + O(ν −1 ) 0 000 2ρ2 cs22 τp+2 (ξ 0 )|ξ 0 | v (   ) u   0 u ξ 1 0 0 = ν 6 R1 (ξ 0 ) + ν − 2 t2 20 (κ0 + iη0 ) − ξ 0 000 · (κ 000 + iη000 ) S(ξ 0 ) + O(ν −1 ) .   cs2 Similarly · · · {P1I (νξ 0 + κ + iη)−1 }1 ν4 · · · = I I ) (κ + iη) {B1 (νξ 0 + κ + iη)P1I (νξ 0 + κ + iη)−1 }1 · · · (det P1s 0 1 ξ ··· v (   ) u   0 ξ 1u × Q1 (ξ 0 ) + ν − 2 t2 20 (κ0 + iη0 ) − ξ 0 000 · (κ 000 + iη000 ) T1 (ξ 0 ) + O(ν −1 ) ,   cs2 · · · {P1I (νξ 0 + κ + iη)−1 }1 · ν4 · · = I ) (κ + iη) · {B1I (νξ 0 + κ + iη)P1I (νξ 0 + κ + iη)−1 }1 · · (det P1s 0 1 ξ · ·· v (   ) u   0 ξ 1u × Q2 (ξ 0 ) + ν − 2 t2 20 (κ0 + iη0 ) − ξ 0 000 · (κ 000 + iη000 ) T2 (ξ 0 ) + O(ν −1 ) .   cs2 In a similar manner as the proof of Eq. (4.1), we obtain 0 0 0 00 − 00 3 1 ν 2 e−iν{(x −y )·ξ +x3 τp1 (ξ )+y3 ξ3 } F I (x, y) − ν 2 FsI ξ 0 p (x, y), φ(x, y) 1

=< Fs1 ξ 0 p1 s2 (x, y), φ(x, y) >x,y

1

x,y

4 4 for φ(x, y) ∈ C0∞ (R− × R− ),

where FsI ξ 0 p 1

1 s2

(x, y) Z

= (2π)−4

R3

0

0

−

ei(x −y +gradτp1 (ξ

0 0 )x )·(κ 0 +iη0 ) 3

Z R

000

e−iy3 (κ3 +iη3 ) U(ξ 0 )C


v ( ) u u ξ00 1 000 t × 2 2 (κ0 + iη0 ) − ξ 0 · (κ 000 + iη000 ) I ) (κ + iη) cs2 (det P1s 0 1 ξ !  000 0 0 |ξ 0 | T1 (ξ 0 )R1 (ξ 0 )−Q1 (ξ 0 )S(ξ 0 ) 0  0 R1 (ξ 0 )2 × −τp+1 (ξ 0 ) 0 !  000 0 0 0 |ξ 0 | T2 (ξ 0 )R1 (ξ 0 )−Q2 (ξ 0 )S(ξ 0 )  0 0 −1 R1 (ξ 0 )2 −τp+1 (ξ 0 ) 0  U(ξ 0 000 )C dκ3 dκ 0 .   0 0

(5.16)

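Here we remark that
$$\bigl(T_1(\xi^0)R_1(\xi^{0\prime}) - Q_1(\xi^0)S(\xi^{0\prime}),\ T_2(\xi^0)R_1(\xi^{0\prime}) - Q_2(\xi^0)S(\xi^{0\prime})\bigr) \not\equiv (0, 0)$$
since there is at least one point at which it does not vanish. If we localize at the point $\xi^0$ satisfying (4.20), that is, $j = s_1$, $k = s_1$, $m = p_1$, then for the proof of (4.2) we remark the following. For $\phi(x,y) \in C_0^\infty(\mathbb{R}^4_- \times \mathbb{R}^4_-)$, we could put
$$\Bigl\langle \nu^{3/2} e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_{s_1}(\xi^{0\prime}) - y_3\xi^0_3\}} F^{I}(x,y) - \nu^{1/2} F^{I}_{s_1\xi^0 s_1}(x,y),\ \phi(x,y)\Bigr\rangle_{x,y}$$
$$= \Bigl\langle \nu^{1/2} e^{-i\{\tau^+_{p_1}(\nu\xi^{0\prime} + \kappa' + i\eta') - \nu\tau^+_{s_1}(\xi^{0\prime})\}x_3} f_1(\kappa + i\eta, \nu),\ \tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\Bigr\rangle_{\kappa, x_3}$$
$$+ \Bigl\langle \nu^{1/2} e^{-i\{\tau^+_{s_1}(\nu\xi^{0\prime} + \kappa' + i\eta') - \nu\tau^+_{s_1}(\xi^{0\prime})\}x_3} f_2(\kappa + i\eta, \nu),\ \tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\Bigr\rangle_{\kappa, x_3}.$$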

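The first term of the right-hand side is equal to
$$\int \nu^{1/2} \exp\Bigl[i\Bigl\{\nu\tau^+_{s_1}(\xi^{0\prime}) - \nu^{1/2}\sqrt{2\Bigl\{\frac{\xi^0_0}{c_{p_1}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')\Bigr\}}\Bigr\}x_3\Bigr]$$
$$\times\ \bar f_1(\kappa + i\eta, x_3, \nu)\,\tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\,d\kappa\,dx_3 \equiv I(\nu),$$
where
$$\tilde\phi(\zeta', x_3, z) = \mathcal{FL}_{(x',y)}[\phi(x,y)](\zeta', z), \qquad \bar f_1(\kappa + i\eta, x_3, \nu) = e^{iO(\nu^0)x_3} f_1(\kappa + i\eta, \nu).$$
If we put
$${}^tL = -\frac{\sqrt{2\Bigl\{\dfrac{\xi^0_0}{c_{p_1}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')\Bigr\}}}{\nu^{1/2}\,x_3\,\dfrac{\xi^0_0}{c_{p_1}^2}}\ \frac{1}{i}\frac{\partial}{\partial\kappa_0},$$
then we obtain
$$I(\nu) = \int \exp\Bigl[i\Bigl\{\nu\tau^+_{s_1}(\xi^{0\prime}) - \nu^{1/2}\sqrt{2\Bigl\{\frac{\xi^0_0}{c_{p_1}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')\Bigr\}}\Bigr\}x_3\Bigr]$$
$$\times\ \nu^{1/2} L^2\bigl[\bar f_1(\kappa + i\eta, x_3, \nu)\,\tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\bigr]\,d\kappa\,dx_3 \to 0 \quad \text{as } \nu \to \infty,$$
since
$$\Bigl|\frac{\partial^j}{\partial\kappa_0^j}\bigl[\bar f_1(\kappa + i\eta, x_3, \nu)\,\tilde\phi(-\kappa' - i\eta', x_3, \kappa' + i\eta', \kappa_3 + i\eta_3)\bigr]\Bigr| \le C(\eta)(|\kappa| + |\eta|)^{-6} \quad \text{if } \nu \ge 1,\ j \le 2,$$
and $-C_1 \le x_3 \le -C_2 < 0$ on $\operatorname{supp}\tilde\phi$ for some constants $C_1$, $C_2$.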

The case $\xi^{0\prime\prime\prime} = 0$ is calculated similarly. Thus we prove Eq. (4.2).

Next we prove the inclusion relation (4.3). Let $V$ ($\subset \mathbb{R}^4_- \times \mathbb{R}^4_-$) be the complement of $\operatorname{sing\,supp} F^{I}(x,y)$. For the points $\xi^0$ that are not zeros of the expression under the inner radical sign of $\tau^+_m(\zeta')$ ($m \in \{p_1, p_2, s_2\}$), by the Riemann–Lebesgue theorem
$$\int \nu e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_k(\xi^{0\prime}) + y_3\xi^0_3\}} F^{I}(x,y)\,g(x,y)\,dx\,dy \to 0 \quad \text{as } \nu \to \infty$$
for $g(x,y) \in C_0^\infty(V)$. On the other hand, by the localization method
$$\nu e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_k(\xi^{0\prime}) + y_3\xi^0_3\}} F^{I}(x,y) \to F^{I}_{j\xi^0 k}(x,y) \quad \text{as } \nu \to \infty,\ j \in \{p_1, s_1\},\ k \in \{p_1, s_1\},$$
so we have
$$V \cap \operatorname{supp} F^{I}_{j\xi^0 k}(x,y) = \emptyset, \quad j \in \{p_1, s_1\},\ k \in \{p_1, s_1\}. \qquad (5.17)$$
For the points $\xi^0$ that are zeros of the expression under the inner radical sign of $\tau^+_m(\zeta')$ ($m \in \{p_1, p_2, s_2\}$), we have
$$\int \Bigl\{\nu^{3/2} e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_k(\xi^{0\prime}) + y_3\xi^0_3\}} F^{I}(x,y) - \nu^{1/2} F^{I}_{j\xi^0 k}(x,y)\Bigr\}\,g(x,y)\,dx\,dy \to 0$$
as $\nu \to \infty$, $j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$ for $g(x,y) \in C_0^\infty(V)$ by (5.17). On the other hand, by the localization method
$$\nu^{3/2} e^{-i\nu\{(x'-y')\cdot\xi^{0\prime} + x_3\tau^-_k(\xi^{0\prime}) + y_3\xi^0_3\}} F^{I}(x,y) - \nu^{1/2} F^{I}_{j\xi^0 k}(x,y) \to F^{I}_{j\xi^0 k m}(x,y)$$
as $\nu \to \infty$, $j \in \{p_1, s_1\}$, $k \in \{p_1, s_1\}$, $m \in \{p_1, p_2, s_2\}$, so we have
$$V \cap \operatorname{supp} F^{I}_{j\xi^0 k m}(x,y) = \emptyset, \quad j \in \{p_1, s_1\},\ k \in \{p_1, s_1\},\ m \in \{p_1, p_2, s_2\}.$$
Thus we obtain the inclusion relation (4.3).


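Finally we prove the formulas (4.4) and (4.5). If $(Q_1(\xi^0), Q_2(\xi^0)) \ne (0, 0)$ in (5.13) or if $Q_1(\xi^0) \ne 0$ in (5.15), then we could put
$$F^{I}_{s_1\xi^0 p_1}(x,y) = \mathrm{Const.}\,(2\pi)^{-4}\int_{\mathbb{R}^4} \frac{e^{i\{(x' - y' - \operatorname{grad}\tau^+_{p_1}(\xi^{0\prime})x_3)\cdot(\kappa' + i\eta') - y_3(\kappa_3 + i\eta_3)\}}}{(\kappa_0 + i\eta_0) - \dfrac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot(\kappa'' + i\eta'')}\,d\kappa,$$
and would like to obtain $\operatorname{supp} F^{I}_{s_1\xi^0 p_1}(x,y)$. If we put
$$G_1(x) = (2\pi)^{-4}\int_{\mathbb{R}^4} \frac{e^{ix\cdot(\kappa + i\eta)}}{(\kappa_0 + i\eta_0) - \dfrac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot(\kappa'' + i\eta'')}\,d\kappa,$$
then
$$F^{I}_{s_1\xi^0 p_1}(x,y) = G_1\bigl(x' - y' - \operatorname{grad}\tau^+_{p_1}(\xi^{0\prime})x_3,\ -y_3\bigr).$$
So it is sufficient that we consider $\operatorname{supp} G_1$. From the Paley–Wiener–Schwartz theorem, we are led to
$$\operatorname{ch}[\operatorname{supp} G_1] = \{x \in \mathbb{R}^4 \mid x\cdot\eta \ge 0 \text{ for } \forall\eta \in \Gamma_{s_1\xi^0}\},$$
where ch denotes a convex hull, and
$$\Gamma_{s_1\xi^0} = \Bigl\{\eta \in \mathbb{R}^4_\eta \ \Big|\ \eta_0 - \frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot\eta'' > 0\Bigr\}. \qquad (5.18)$$
By (5.18), we have
$$\operatorname{ch}[\operatorname{supp} G_1] = \Bigl\{x \in \mathbb{R}^4 \ \Big|\ x = \lambda\Bigl(1, -\frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\Bigr),\ \lambda \ge 0\Bigr\}$$
and it is a half-line. So we obtain
$$\operatorname{supp} G_1 = \operatorname{ch}[\operatorname{supp} G_1] = \Bigl\{x \in \mathbb{R}^4 \ \Big|\ x = \lambda\Bigl(1, -\frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\Bigr),\ \lambda \ge 0\Bigr\},$$
since $G_1$ is a homogeneous distribution. Thus we prove the formula (4.4).

Next we prove the formula (4.5). If
$$\bigl(T_1(\xi^0)R_1(\xi^{0\prime}) - Q_1(\xi^0)S(\xi^{0\prime}),\ T_2(\xi^0)R_1(\xi^{0\prime}) - Q_2(\xi^0)S(\xi^{0\prime})\bigr) \ne (0, 0)$$
in (5.16), then we could put
$$F^{I}_{s_1\xi^0 p_1 s_2}(x,y) = \mathrm{Const.}\,(2\pi)^{-4}\int_{\mathbb{R}^4} \frac{e^{i\{(x' - y' - \operatorname{grad}\tau^+_{p_1}(\xi^{0\prime})x_3)\cdot(\kappa' + i\eta') - y_3(\kappa_3 + i\eta_3)\}}}{(\kappa_0 + i\eta_0) - \dfrac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot(\kappa'' + i\eta'')} \times \sqrt{\frac{\xi^0_0}{c_{s_2}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')}\ d\kappa,$$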


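and would like to obtain $\operatorname{supp} F^{I}_{s_1\xi^0 p_1 s_2}(x,y)$. If we put
$$G_2(x) = (2\pi)^{-4}\int_{\mathbb{R}^4} \frac{e^{ix\cdot(\kappa + i\eta)}}{(\kappa_0 + i\eta_0) - \dfrac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot(\kappa'' + i\eta'')} \times \sqrt{\frac{\xi^0_0}{c_{s_2}^2}(\kappa_0 + i\eta_0) - \xi^{0\prime\prime\prime}\cdot(\kappa''' + i\eta''')}\ d\kappa,$$
then
$$F^{I}_{s_1\xi^0 p_1 s_2}(x,y) = G_2\bigl(x' - y' - \operatorname{grad}\tau^+_{p_1}(\xi^{0\prime})x_3,\ -y_3\bigr).$$
So it is sufficient that we consider $\operatorname{supp} G_2$. From the Paley–Wiener–Schwartz theorem, we are led to
$$\operatorname{ch}[\operatorname{supp} G_2] = \{x \in \mathbb{R}^4 \mid x\cdot\eta \ge 0 \text{ for } \forall\eta \in \Gamma_{s_1\xi^0 s_2}\},$$
where ch denotes a convex hull, and
$$\Gamma_{s_1\xi^0 s_2} = \Bigl\{\eta \in \mathbb{R}^4_\eta \ \Big|\ \eta_0 - \frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\cdot\eta'' > 0,\ \eta_0 - \frac{c_{s_2}^2}{\xi^0_0}\,\xi^{0\prime\prime\prime}\cdot\eta''' > 0\Bigr\}. \qquad (5.19)$$
By (5.19), we have
$$\operatorname{ch}[\operatorname{supp} G_2] = \Bigl\{x \in \mathbb{R}^4 \ \Big|\ x = k_1\Bigl(1, -\frac{c_{s_2}^2}{\xi^0_0}\,\xi^{0\prime\prime\prime}, 0\Bigr) + k_2\Bigl(1, -\frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\Bigr),\ k_1, k_2 \ge 0\Bigr\}.$$
We would like to verify $\operatorname{ch}[\operatorname{supp} G_2] = \operatorname{supp} G_2$. We take the change of coordinates such as
$$p = A\kappa, \qquad A = \begin{pmatrix}
1 & -\dfrac{c_{s_2}^2}{\xi^0_0}\xi^0_1 & -\dfrac{c_{s_2}^2}{\xi^0_0}\xi^0_2 & 0 \\
1 & -\dfrac{c_{s_1}^2}{\xi^0_0}\xi^0_1 & -\dfrac{c_{s_1}^2}{\xi^0_0}\xi^0_2 & -\dfrac{c_{s_1}^2}{\xi^0_0}\xi^0_3 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},$$
where we note that $A$ is a regular (nonsingular) matrix by assumption (2.3). Then $G_2(x)$ is given by
$$G_2(x) = (2\pi)^{-4}\frac{1}{|\det A|}\int_{\mathbb{R}^4} e^{i\,{}^tA^{-1}x\cdot(p - iA\vartheta)}\ \frac{1}{c_{s_2}\xi^0_0}\ \frac{\sqrt{\xi^0_0(p_0 - i)}}{p_1 - i}\,dp,$$
where $\vartheta = {}^t(1, 0, 0, 0)$ and $\sqrt{\xi^0_0(p_0 - i)}$ is taken to be the branch such that $\operatorname{Im}\sqrt{\xi^0_0(p_0 - i)} > 0$.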


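By $A\vartheta = {}^t(1, 1, 0, 0)$ and the Cauchy integral theorem, we obtain
$$G_2(x) = (2\pi)^{-4}\frac{1}{|\det A|}\int_{\mathbb{R}^4} e^{i\,{}^tA^{-1}x\cdot(p - i\,{}^t(1,1,0,0))}\ \frac{1}{c_{s_2}\xi^0_0}\ \frac{\sqrt{\xi^0_0(p_0 - i)}}{p_1 - i}\,dp$$
$$= (2\pi)^{-4}\frac{1}{|\det A|}\int_{\mathbb{R}^4} e^{i\,{}^tA^{-1}x\cdot p}\ \frac{1}{c_{s_2}\xi^0_0}\ \frac{\sqrt{\xi^0_0(p_0 - i0)}}{p_1 - i0}\,dp. \qquad (5.20)$$
By the Fourier transform formula (cf. [H2, Example 7.1.17]), we deduce
$$\mathcal{F}^{-1}\Bigl[\frac{1}{\xi - i0}\Bigr](x) = iH(x), \qquad (5.21)$$
$$\mathcal{F}^{-1}\bigl[(\xi - i0)^{1/2}\bigr](x) = -\frac{e^{-\frac{i}{4}\pi}}{2\sqrt{\pi}}\,x_+^{-3/2}, \qquad (5.22)$$
where $H(x)$ denotes the Heaviside function and
$$x_+^a = \begin{cases} x^a & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}$$
for $a \in \mathbb{C}$. By (5.21) and (5.22), the right-hand side of (5.20) is equal to
$$\begin{cases}
(2\pi)^{-4} \times \dfrac{-1}{|\det A|\,c_{s_2}\sqrt{\xi^0_0}}\ \dfrac{e^{-\frac{i}{4}\pi}}{2\sqrt{\pi}}\,(z_0)_+^{-3/2} \otimes iH(z_1) \otimes \delta(z_2, z_3) & \text{for } \xi^0_0 > 0, \\[2ex]
(2\pi)^{-4} \times \dfrac{-1}{|\det A|\,c_{s_2}\sqrt{|\xi^0_0|}}\ \dfrac{e^{\frac{i}{4}\pi}}{2\sqrt{\pi}}\,(z_0)_+^{-3/2} \otimes iH(z_1) \otimes \delta(z_2, z_3) & \text{for } \xi^0_0 < 0
\end{cases}$$
$$= \frac{-i\,e^{-i\,\operatorname{sgn}\xi^0_0\,\frac{\pi}{4}}}{(2\pi)^4 \times 2\sqrt{\pi}\,|\det A|\,c_{s_2}\sqrt{|\xi^0_0|}}\ (z_0)_+^{-3/2} \otimes H(z_1) \otimes \delta(z_2, z_3),$$
where $z = {}^tA^{-1}x$. Thus we get
$$\operatorname{supp} G_2 = \{x \in \mathbb{R}^4 \mid z_0 \ge 0,\ z_1 \ge 0,\ z_2 = z_3 = 0\} = \bigl\{x \in \mathbb{R}^4 \mid x = {}^tA\,{}^t(k_1, k_2, 0, 0),\ k_1, k_2 \ge 0\bigr\}$$
$$= \Bigl\{x \in \mathbb{R}^4 \ \Big|\ x = k_1\Bigl(1, -\frac{c_{s_2}^2}{\xi^0_0}\,\xi^{0\prime\prime\prime}, 0\Bigr) + k_2\Bigl(1, -\frac{c_{s_1}^2}{\xi^0_0}\,\xi^{0\prime\prime}\Bigr),\ k_1, k_2 \ge 0\Bigr\} = \operatorname{ch}[\operatorname{supp} G_2],$$
thereby we prove formula (4.5). This completes the proof of the Main Theorem.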


Acknowledgement. The author would like to express her gratitude to Professor Seiichiro Wakabayashi for his thoughtful comments on an earlier manuscript and many invaluable suggestions, especially on formulas (4.4), (4.5), (4.9), and (4.10) in the Main Theorem.

References

[A-B-G] Atiyah, M.F., Bott, R., Gårding, L.: Lacunas for hyperbolic differential operators with constant coefficients I. Acta Math. 124, 109–189 (1970)
[H1] Hörmander, L.: On the singularities of solutions of partial differential equations. Commun. Pure Appl. Math. 23, 329–358 (1970)
[H2] Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Berlin–Heidelberg–New York–Tokyo: Springer-Verlag, 1983
[M1] Matsumura, M.: Comportement asymptotique de solutions de certains problèmes mixtes pour des systèmes hyperboliques symétriques à coefficients constants. Publ. RIMS Kyoto Univ. 5, 301–360 (1970)
[M2] Matsumura, M.: Localization theorem in hyperbolic mixed problems. Proc. Japan. Acad. 47, 115–119 (1971)
[M3] Matsumura, M.: On the singularities of the Riemann functions of mixed problems for the wave equation in plane-stratified media I. Proc. Japan. Acad. 52, 289–292 (1976)
[M4] Matsumura, M.: On the singularities of the Riemann functions of mixed problems for the wave equation in plane-stratified media II. Proc. Japan. Acad. 52, 293–295 (1976)
[S] Shimizu, S.: Eigenfunction expansions for elastic wave propagation problems in stratified media $\mathbb{R}^3$. Tsukuba J. Math. 18, 283–350 (1994)
[T] Tsuji, M.: Propagation of the singularities for hyperbolic equations with constant coefficients. Japan J. Math. 2, 361–410 (1976)
[W1] Wakabayashi, S.: Singularities of the Riemann functions of hyperbolic mixed problems in a quarter-space. Proc. Japan. Acad. 50, 821–825 (1974)
[W2] Wakabayashi, S.: Singularities of the Riemann functions of hyperbolic mixed problems in a quarter-space. Publ. RIMS Kyoto Univ. 11, 417–440 (1976)

Communicated by H. Araki

Commun. Math. Phys. 208, 575 – 604 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Low Temperature Phase Diagrams of Fermionic Lattice Systems C. Borgs1,?,?? , R. Kotecký2,??? 1 Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, 04109 Leipzig, Germany.

E-mail: [email protected]

2 Center for Theoretical Study, Charles University, Prague, Jilská 1, 110 00 Praha 1, Czech Republic

and Theoretical Physics, Charles University, V Holešovičkách 2, 180 00 Praha 8, Czech Republic. E-mail: [email protected] Received: 23 December 1996 / Accepted: 7 April 1999

Abstract: We consider fermionic lattice systems with Hamiltonian H = H (0) + λHQ , where H (0) is diagonal in the occupation number basis, while HQ is a suitable “quantum perturbation”. We assume that H (0) is a finite range Hamiltonian with finitely many ground states and a suitable Peierls condition for excitations, while HQ is a finite range or exponentially decaying Hamiltonian that can be written as a sum of even monomials in the fermionic creation and annihilation operators. Mapping the d dimensional quantum system onto a classical contour system on a d + 1 dimensional lattice, we use standard Pirogov–Sinai theory to show that the low temperature phase diagram of the quantum system is a small perturbation of the zero temperature phase diagram of the classical system, provided λ is sufficiently small. Particular attention is paid to the sign problems arising from the fermionic nature of the quantum particles. As a simple application of our methods, we consider the Hubbard model with an additional nearest neighbor repulsion. For this model, we rigorously establish the existence of a paramagnetic phase with commensurate staggered charge order for the narrow band case at sufficiently low temperatures.

1. Introduction In recent years, the Hubbard model has become one of the most important models in the theory of strongly correlated electron systems. Since its invention by Hubbard and others [1–3], it has been used to describe, among others, antiferromagnetism [4], ferromagnetism [5], paramagnetism [6], the metal-insulator transition [7–9], and, more recently, high-Tc superconductivity [10,11]. ? Partly supported by the Commission of the European Union under contract CHRX-CT93-0411.

?? Present address: Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA
??? Partly supported by the grants GAČR 202/96/0731 and GAUK 96/272.


As already pointed out by Hubbard in his original paper [1], the standard Hubbard model is a very crude approximation to the actual behaviour of electrons in these systems. Many terms, some of which may drastically change the phase diagram, have been neglected. The largest and most important of these terms is the nearest neighbor Coulomb repulsion. The modification of the Hubbard model which contains this term is usually referred to as the extended Hubbard model. Most relevant for physical applications is the so-called narrow band case of this model, characterized by a hopping constant $t$ that is small with respect to the Coulomb interaction. In this paper we rigorously establish the existence of a low temperature phase with staggered charge order in the narrow band extended Hubbard model in $d \ge 2$ dimensions. This phase is characterized by an electron density which, rather than being constant, varies from one sublattice to the next of a bipartite lattice $\Lambda$. While the existence of such a phase has been predicted by many authors (see e.g. [11–15]), the only previous rigorous results consider the atomic limit $t = 0$ [16,17]. In order to obtain our results for the narrow band extended Hubbard model, we combine the methods of reference [17] with our recent extension of Pirogov–Sinai theory to quantum spin systems [18] to obtain a convergent expansion about the atomic limit. Actually, this expansion will be derived for a more general class of strongly interacting fermionic lattice systems, see Sect. 3 below. The extended Hubbard model is defined by the Hamiltonian
$$H_\Lambda = -t\sum_{\langle x,y\rangle}\bigl(c^\dagger_{x,\uparrow}c_{y,\uparrow} + c^\dagger_{x,\downarrow}c_{y,\downarrow} + c^\dagger_{y,\uparrow}c_{x,\uparrow} + c^\dagger_{y,\downarrow}c_{x,\downarrow}\bigr) + U\sum_{x}\hat n_{x,\uparrow}\hat n_{x,\downarrow} + W\sum_{\langle x,y\rangle}\hat n_x\hat n_y - \Bigl(\mu + zW + \frac{U}{2}\Bigr)\sum_{x}\hat n_x, \qquad (1.1)$$

where the second and fourth sum run over the points $x$ of a bipartite lattice $\Lambda$ with constant coordination number $z$, while the first and third sum run over the set $B(\Lambda)$ of all nearest neighbor pairs $\langle x, y\rangle$ in $\Lambda$. The symbols $c^\dagger_{x,\sigma}$ and $c_{x,\sigma}$ denote the creation and annihilation operators of the electron with up and down spin, $\sigma = \uparrow, \downarrow$, while $\hat n_{x,\sigma} := c^\dagger_{x,\sigma}c_{x,\sigma}$ and $\hat n_x := \hat n_{x,\uparrow} + \hat n_{x,\downarrow}$ are the corresponding number operators. As usual, the electron creation and annihilation operators satisfy canonical anticommutation relations. The first term of the Hamiltonian (1.1) stands for the isotropic nearest neighbour hopping of electrons, the second one is the familiar on-site Hubbard interaction, the third term represents the isotropic nearest neighbour interaction, and the last one the contribution of the particle reservoir characterized by the chemical potential $\mu$. We have introduced the shift $zW + \frac{U}{2}$ in order to move the hole-particle symmetry point (the half-filled band) to the value$^1$ $\mu = 0$. Originally, the second and the third terms were supposed to simulate the effect of the Coulomb repulsion between the electrons, hence only positive $U$ and $W$ were considered. Later on, in various applications of the model, the parameters $t$, $U$ and $W$ represented the effective interaction constants that take into account also other interactions (for instance with phonons). Therefore $U$ and $W$ could take negative values as well. In this paper $U$ will be allowed to change its sign while $W$ always stays positive.

1 For more general bipartite lattices where the coordination number $z$ varies from sublattice to sublattice, one would need a shift which is different for the two sublattices. Even though our methods do not require any symmetry between the two sublattices and would allow us to analyse this asymmetric model as well, we do not consider it here in order to simplify our notation.

[Figure 1: phase diagram in the plane with axes $U/4d$ and $\mu/2d$, divided into the regions $H_0$, $H_1$, $H_2$, $S_{\{0,1\}}$, $S_{\{0,2\}}$, $S_{\{1,2\}}$.]

Fig. 1. Ground state phase diagram of the t = 0 model

Before stating our main result for the narrow band model at low temperatures, we recall the ground state diagram of the atomic limit model ($t = 0$). In order to simplify the notation, we restrict ourselves to the simple hypercubic lattice $\mathbb{Z}^d$, although our results should hold for other bipartite lattices as well. Observing that the potential term in (1.1) can be written as a sum over pair potentials,
$$H_\Lambda = -t\sum_{\langle x,y\rangle \in B(\Lambda)}\bigl(c^\dagger_{x,\uparrow}c_{y,\uparrow} + c^\dagger_{x,\downarrow}c_{y,\downarrow} + c^\dagger_{y,\uparrow}c_{x,\uparrow} + c^\dagger_{y,\downarrow}c_{x,\downarrow}\bigr) + \sum_{\langle x,y\rangle \in B(\Lambda)} v(\hat n_x, \hat n_y), \qquad (1.2)$$
with
$$v(\hat n_x, \hat n_y) = W(\hat n_x - 1)(\hat n_y - 1) + \frac{U}{4d}\bigl[(\hat n_x - 1)^2 + (\hat n_y - 1)^2\bigr] - \frac{\mu}{2d}\bigl(\hat n_x + \hat n_y - 2\bigr), \qquad (1.3)$$

the ground states of the t = 0 model are easily determined. The corresponding ground state diagram is shown in Fig. 1. One finds three regions Ha , a = 0, 1, 2 with homogeneous particle density hnˆ x i = a, and three regions S{a,b} , {a, b} = {0, 1}, {0, 2}, {1, 2} with a commensurate charge density wave: hnˆ x i = ρ + (−1)x 1, where ρ = a+b 2 and b−a 1=± 2 . In this paper we will prove that, for all > 0, and for all (U, µ) in the subregion () S{0,2} = (U, µ) ∈ R2 U < 2d(W − ),

|µ| < 2d min{W − , W − − U/4d}

(1.4)


C. Borgs, R.Kotecký

of the region $S_{\{0,2\}} = S^{(0)}_{\{0,2\}}$, the staggered charge order persists for sufficiently low temperatures and sufficiently small $t$. We also establish that the corresponding phase is paramagnetic;$^2$ see Sect. 2 for the precise statements of our results.

The more general class of fermionic systems we consider is described by a Hamiltonian $H = H^{(0)} + \lambda H_Q$, where $H^{(0)}$ is diagonal in the occupation number basis, while $H_Q$ is a suitable "quantum perturbation". We assume that $H^{(0)}$ is a finite range Hamiltonian with finitely many ground states and a suitable Peierls condition for excitations, while $H_Q$ is a finite range or exponentially decaying Hamiltonian that can be written as a sum of even monomials in the fermionic creation and annihilation operators. For these models, we derive a convergent cluster expansion about the "classical theory" with $\lambda = 0$, following closely the methods used in [18]: In a first step, we use the Duhamel-Phillips (or Schwinger-Dyson) expansion to derive a path integral representation of the model. In the next step, we block the configurations contributing to the path integral onto lattice configurations on a suitable space-time lattice $\Lambda \times \{1, 2, \dots, M\}$. Applying Pirogov–Sinai theory [20,21] in the form developed in [22] to the resulting classical contour system, we obtain our main results. Namely, we determine the stable phases in dependence on the external parameters (construction of the phase diagram), and show that the corresponding infinite volume states are periodic, pure states with exponential clustering for truncated expectation values. We also control the thermodynamic limit for periodic boundary conditions and prove that it is a convex combination of the stable states with equal weight for each of them. Finally, we discuss conservation laws for the quantum system in a general setup.
Under the condition that the full Hamiltonian commutes with an operator $Q_\Lambda$, we show that the density $\rho_Q = \lim_{\Lambda \to \mathbb{Z}^d} \frac{1}{|\Lambda|} \langle Q_\Lambda \rangle$ in the ground state $\langle \cdot \rangle$ of the quantum system exactly coincides with the density of the corresponding classical ground state, see Sect. 3.6 for the precise statement.

Within their parallel approach to quantum Pirogov–Sinai theory, Datta, Fröhlich and Fernández have obtained similar results in [19]. In contrast to our methods, which are based on renormalization group ideas and the reduction to a contour model on a space-time lattice with a subsequent application of standard Pirogov–Sinai theory, they study directly the contour model emerging from the functional integral, extending Pirogov–Sinai theory to contour models with continuous time. After the original submission of the present paper, several other articles on fermionic quantum systems have appeared, in particular [25–28].

The organization of this paper is as follows. In the next section, we state our main result concerning the extended Hubbard model, Theorem 2.1. In Sect. 3, we define the general model and state our results in this case. Sect. 4 is devoted to the derivation of the contour representation of the model, paying particular attention to the factorization properties of the signs coming from the permutation of fermions. In Sect. 5 we prove exponential decay of contour activities and use these bounds, together with standard cluster expansion methods, to prove the results of Sect. 3. Theorem 2.1 is proved in Sect. 6.
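The atomic-limit ground state diagram recalled in the introduction can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that for $t = 0$ on a bipartite lattice it suffices to minimize the pair potential (1.3) over the two sublattice occupations, and the function names are chosen here for illustration only.

```python
from itertools import combinations_with_replacement


def pair_potential(nx, ny, U, W, mu, d):
    """Pair potential v(n_x, n_y) of Eq. (1.3) for occupation numbers 0, 1, 2."""
    px, py = nx - 1, ny - 1
    return W * px * py + U / (4 * d) * (px**2 + py**2) - mu / (2 * d) * (px + py)


def atomic_limit_ground_state(U, W, mu, d):
    """Return sublattice occupations (a, b), a <= b, that minimize v(a, b).

    Assumption: for t = 0 every bond joins the two sublattices, so minimizing
    the energy per bond minimizes the total energy.
    """
    candidates = combinations_with_replacement((0, 1, 2), 2)
    return min(candidates, key=lambda ab: pair_potential(ab[0], ab[1], U, W, mu, d))


def in_staggered_region(U, W, mu, d, eps=0.0):
    """Membership test for the region S^(eps)_{0,2} of Eq. (1.4)."""
    return U < 2 * d * (W - eps) and abs(mu) < 2 * d * min(W - eps, W - eps - U / (4 * d))
```

For instance, with $d = 2$ and $W = 1$ the point $(U, \mu) = (0, 0)$ lies in the staggered region and the minimizer is the pair $(0, 2)$, while a large positive $U$ favors the homogeneous half-filled state $(1, 1)$.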

2 It would be very interesting to establish the existence of a phase with ferro- or antiferromagnetic order in this model. Unfortunately, this is a very difficult task, due to the Goldstone boson which is expected as a consequence of spontaneous symmetry breaking of the corresponding continuous symmetry. Note that this problem does not arise in the asymmetric t − J model studied in [19], where the symmetry to be broken is a discrete symmetry.


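2. Statement of Results for the Extended Hubbard Model

For any even $L$ we consider a finite box $\Lambda = \Lambda(L) = \{-L/2, \dots, L/2 - 1\}^d$ with $L^d$ points, the fermionic creation and annihilation operators $c^\dagger_{x,\sigma}$ and $c_{x,\sigma}$ ($x \in \Lambda$, $\sigma = \uparrow, \downarrow$), the corresponding Fock-space $\mathcal{H}_\Lambda$, the algebra $\mathcal{A}_\Lambda$ that is generated by even monomials in the creation and annihilation operators and the algebra of local observables, $\mathcal{A} = \cup \mathcal{A}_\Lambda$, where the union runs over all finite sets $\Lambda \subset \mathbb{Z}^d$. Choosing periodic boundary conditions, we define the partition function at the inverse temperature $\beta = 1/kT$ as

$$Z^\beta_{\mathrm{per},\Lambda} = \mathrm{Tr}_{\mathcal{H}_\Lambda}\, e^{-\beta H_\Lambda}, \tag{2.1}$$

and the expectation value of an observable $\Psi \in \mathcal{A}_\Lambda$,

$$\langle \Psi \rangle^\beta_{\mathrm{per},\Lambda} = \frac{1}{Z^\beta_{\mathrm{per},\Lambda}} \mathrm{Tr}_{\mathcal{H}_\Lambda}\, \Psi\, e^{-\beta H_\Lambda}. \tag{2.2}$$

Assuming for a moment that the corresponding limit exists for all local observables $\Psi \in \mathcal{A}$, we define the infinite volume Gibbs state

$$\langle \Psi \rangle^\beta_{\mathrm{per}} = \lim_{\Lambda \to \mathbb{Z}^d} \langle \Psi \rangle^\beta_{\mathrm{per},\Lambda}, \tag{2.3}$$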

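where the limit is taken along cubic boxes $\Lambda(L)$ of even side length $L$.$^3$ Next, we define, for an arbitrary periodic state $\langle \cdot \rangle$ on $\mathcal{A}$, the density,

$$\rho = \lim_{\Lambda \to \infty} |\Lambda|^{-1} \sum_{x \in \Lambda} \langle n_x \rangle, \tag{2.4}$$

and the staggered density,

$$\Delta = \lim_{\Lambda \nearrow \infty} |\Lambda|^{-1} \sum_{x \in \Lambda} (-1)^x \langle n_x \rangle. \tag{2.5}$$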
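As an illustration of the definitions (2.4) and (2.5), the following sketch evaluates their finite-volume versions on the box $\Lambda(L)$ for a given occupation profile. It is only a sanity check under stated assumptions: the sign $(-1)^x$ is read as the parity of $x_1 + \dots + x_d$, and the helper names are hypothetical.

```python
from itertools import product


def box(L, d):
    """Sites of the box {-L/2, ..., L/2 - 1}^d for even L."""
    return product(range(-L // 2, L // 2), repeat=d)


def densities(n, L, d):
    """Finite-volume versions of Eqs. (2.4) and (2.5) for an occupation profile n(x)."""
    volume = L**d
    rho = sum(n(x) for x in box(L, d)) / volume
    # Assumption: (-1)^x is read as the parity of x_1 + ... + x_d.
    delta = sum((-1) ** (sum(x) % 2) * n(x) for x in box(L, d)) / volume
    return rho, delta


def staggered_profile(rho, delta):
    """Charge density wave n(x) = rho + (-1)^x * delta."""
    return lambda x: rho + (-1) ** (sum(x) % 2) * delta
```

Because a box of even side length contains equally many even and odd sites, the profile $\rho + (-1)^x \Delta$ returns exactly the pair $(\rho, \Delta)$ it was built from.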

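Introducing, in addition to the number operators

$$\hat n_{x,\sigma} = c^\dagger_{x,\sigma} c_{x,\sigma}, \qquad \hat n_x = \hat n_{x,\uparrow} + \hat n_{x,\downarrow}, \tag{2.6}$$

also the spin operators

$$S^3_x = \tfrac{1}{2}\left( \hat n_{x,\uparrow} - \hat n_{x,\downarrow} \right), \qquad S^+_x = c^\dagger_{x,\uparrow} c_{x,\downarrow} \quad \text{and} \quad S^-_x = c^\dagger_{x,\downarrow} c_{x,\uparrow}, \tag{2.7}$$

our main theorem is:

Theorem 2.1. For $d \geq 2$ there are constants $C_1 = C_1(d) < \infty$ and $C_2 = C_2(d) > 0$ such that, for $0 < \epsilon < W$, $\beta > C_1$, $|t| < C_2$ and all $(U, \mu) \in S^{(\epsilon)}_{\{0,2\}}$:

3 The existence of the limit (2.3) in the relevant region (1.4) is part of our results.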


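i) The thermodynamic limit (2.3) exists for all local observables $\Psi \in \mathcal{A}$. It is a convex combination,

$$\langle \cdot \rangle^\beta_{\mathrm{per}} = \tfrac{1}{2} \langle \cdot \rangle^\beta_{\mathrm{even}} + \tfrac{1}{2} \langle \cdot \rangle^\beta_{\mathrm{odd}}, \tag{2.8}$$

of two pure states $\langle \cdot \rangle^\beta_{\mathrm{even}}$ and $\langle \cdot \rangle^\beta_{\mathrm{odd}}$ with charge density waves

$$\langle \hat n_x \rangle^\beta_{\mathrm{even}} = \rho + (-1)^x \Delta, \quad x \in \mathbb{Z}^d, \qquad \langle \hat n_x \rangle^\beta_{\mathrm{odd}} = \rho - (-1)^x \Delta, \quad x \in \mathbb{Z}^d, \tag{2.9}$$

where $\Delta > 0$. Here $\Delta = \Delta_{\mathrm{even}} = -\Delta_{\mathrm{odd}}$ and $\rho = \rho_{\mathrm{even}} = \rho_{\mathrm{odd}}$ are given by (2.4) and (2.5).

ii) For all $x \in \mathbb{Z}^d$, and $m$ = even or odd, $\langle \mathbf{S}_x \rangle^\beta_m = 0$.

iii) Let $t_x(\cdot)$ be the translation by $x \in \mathbb{Z}^d$, and let $\Psi, \Phi \in \mathcal{A}$ be arbitrary local observables. Then, for $m$ = even or odd, and all $x \in \mathbb{Z}^d$,

$$\left| \langle \Psi\, t_x(\Phi) \rangle^\beta_m - \langle \Psi \rangle^\beta_m \langle t_x(\Phi) \rangle^\beta_m \right| \leq C(\Psi, \Phi)\, e^{-|x|/\xi}. \tag{2.10}$$

Here $C(\Psi, \Phi) < \infty$ and $\xi < \infty$ are constants.

iv) At zero temperature, the compressibility $\partial\rho/\partial\mu$ vanishes for all $(U, \mu) \in S^{(\epsilon)}_{\{0,2\}}$.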

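Remarks. i) By Statement iii), $\langle \cdot \rangle^\beta_{\mathrm{even}}$ and $\langle \cdot \rangle^\beta_{\mathrm{odd}}$ are pure phases. ii) Statement ii) implies the absence of magnetic ordering in the phases $\langle \cdot \rangle^\beta_{\mathrm{even}}$ and $\langle \cdot \rangle^\beta_{\mathrm{odd}}$. Our methods can actually be extended to include non-zero magnetic fields, giving paramagnetism in the usual sense.

3. General Setting and Results

In this section, we state our results for a general class of fermionic models on $\mathbb{Z}^d$. We consider a finite index set $\Sigma = \{1, 2, \dots, |\Sigma|\}$ labelling internal degrees of freedom, finite subsets $\Lambda \subset \mathbb{Z}^d$, fermionic creation and annihilation operators $c^\dagger_{x,\sigma}$ and $c_{x,\sigma}$ labelled by indices $\mathbf{x} = (x, \sigma) \in \boldsymbol{\Lambda} = \Lambda \times \Sigma$, the corresponding Fock-space $\mathcal{H}_\Lambda$, the algebra $\mathcal{A}_\Lambda$ that is generated by even monomials in the creation and annihilation operators, and the algebra of local observables, $\mathcal{A} = \cup \mathcal{A}_\Lambda$, where the union runs over all finite sets $\Lambda \subset \mathbb{Z}^d$. In order to define an occupation number basis in $\mathcal{H}_\Lambda$, we introduce an arbitrary total order on $\mathbb{Z}^d \times \Sigma$. We then define, for a classical configuration $n : \mathbb{Z}^d \times \Sigma \to \{0, 1\}$: $(x, \sigma) \mapsto n_{x,\sigma}$, the vector $|n\rangle_\Lambda$ as

$$|n\rangle_\Lambda = P \prod_{\mathbf{x} \in \boldsymbol{\Lambda}} (c^\dagger_{\mathbf{x}})^{n_{\mathbf{x}}}\, |0\rangle_\Lambda, \tag{3.1}$$

where $|0\rangle_\Lambda$ is the Fock vacuum in $\mathcal{H}_\Lambda$, and $P$ denotes ordering with respect to the order on $\mathbb{Z}^d \times \Sigma$. Finally, we define the projection operator onto the classical state $n$ in a finite set $U \subset \mathbb{Z}^d$ as

$$P_U(n) = \prod_{\mathbf{x} \in U \times \Sigma} P_{\mathbf{x}}(n), \tag{3.2a}$$

where

$$P_{\mathbf{x}}(n) = n_{\mathbf{x}} (c^\dagger_{\mathbf{x}} c_{\mathbf{x}}) + (1 - n_{\mathbf{x}})(1 - c^\dagger_{\mathbf{x}} c_{\mathbf{x}}) \tag{3.2b}$$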


and $U$ is a finite subset of $\mathbb{Z}^d$. Note that $P_U(n)$ is a local observable in $\mathcal{A}_\Lambda$ provided $U \subset \Lambda$. We assume that the Hamiltonian $H$ of the model is a sum of two terms,

$$H = H^{(0)} + \lambda H_Q, \tag{3.3}$$

where the "classical part" $H^{(0)}$ is diagonal in the occupation number basis and the "quantum part" $H_Q$ is a sum of even monomials in the creation and annihilation operators. In order to prove the results of this paper, we will need several additional assumptions on the classical and the quantum part of the Hamiltonian. We start with the assumptions on the classical part.

3.1. Assumptions on the classical model. Since $H^{(0)}$ is diagonal in the occupation number basis, it defines a classical lattice gas with $|\Sigma|$ different species, occupation numbers $n_{x,\sigma}$ in $\{0, 1\}$, configurations $n : \mathbb{Z}^d \times \Sigma \to \{0, 1\}$, $(x, \sigma) \mapsto n_{x,\sigma}$, and a suitable Hamilton function $H^{(0)}(n)$. We assume that this Hamilton function is given in terms of finite range, translation invariant interactions, depending on a vector parameter $\mu \in \mathcal{U}$, where $\mathcal{U}$ is an open subset of $\mathbb{R}^\nu$. Due to these assumptions, $H^{(0)}(n)$ can be written in the form

$$H^{(0)}(n) = \sum_x \Phi_x(n), \tag{3.4}$$

where $\Phi_x(n) \in \mathbb{R}$ depends on $n$ only via the occupation numbers $n_{y,\sigma}$ for which $\mathrm{dist}(x, y) \leq R_0$, where $R_0$ is a finite number. In our notation we suppress the dependence of $H^{(0)}$ and $\Phi_x$ on $\mu$. As usual, a configuration $g$ which minimizes the Hamiltonian (3.4) is called a ground state configuration. For the purpose of this paper, we will assume that the number of periodic ground states of the Hamiltonian (3.4) is finite. More precisely, we will assume that there is a finite number of periodic configurations $g^{(1)}, \dots, g^{(r)}$, with (specific) energies

$$e_m = e_m(\mu) = \lim_{\Lambda \to \mathbb{Z}^d} \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \Phi_x(g^{(m)}), \tag{3.5}$$

such that for each $\mu \in \mathcal{U}$, the set of periodic ground states $G(\mu)$ is a subset of $\{g^{(1)}, \dots, g^{(r)}\}$. Obviously, $G(\mu)$ is given by those configurations $g^{(m)}$ for which $e_m(\mu)$ is equal to the "ground state energy"

$$e_0 = e_0(\mu) = \min_m e_m(\mu). \tag{3.6}$$

Note that we may assume, without loss of generality, that $\Phi_x(g^{(m)})$ is independent of the point $x$ for all ground state configurations $g^{(m)}$, because this condition can always be achieved by averaging $\Phi_x(n)$ in (3.4) over the minimal common period $L_0$ of $g^{(1)}, \dots, g^{(r)}$.

Our goal will be to prove that the low temperature phase diagram of the quantum model is a small perturbation of the classical ground state diagram provided that the quantum perturbation is sufficiently small and the classical part of the Hamiltonian satisfies the standard Pirogov–Sinai theory assumptions (in particular, finite degeneracy of ground states). Note that this excludes the regions $S_{\{1,2\}}$, $H_1$, and $S_{\{0,1\}}$ in Fig. 1.


In order to formulate and prove the above statement, we need some assumptions on the structure of the ground state diagram. Here we assume that for some value of $\mu_0 \in \mathcal{U}$ all states in $\{g^{(1)}, \dots, g^{(r)}\}$ are ground states,

$$e_m(\mu_0) = e_0(\mu_0) \quad \text{for all} \quad m = 1, \dots, r, \tag{3.7}$$

that $e_m(\mu)$ are $C^1$ functions in $\mathcal{U}$, and that the matrix of derivatives

$$E = \left( \frac{\partial e_m(\mu)}{\partial \mu_i} \right) \tag{3.8}$$

has rank $r - 1$ for all $\mu \in \mathcal{U}$, with uniform bounds on the inverse of the corresponding submatrices. We remark that this condition implies that the zero temperature phase diagram has the usual structure of a $\nu - (r - 1)$ dimensional coexistence surface $S_0$ where all states $g^{(m)}$ are ground states, $r$ different $\nu - (r - 1) - 1$ dimensional surfaces $S_n$ ending in $S_0$ where all states but the state $g^{(m)}$ are ground states, . . .

Next, we formulate a suitable Peierls condition. Recalling that $\Phi_x(n)$ does not depend on $n_{y,\sigma}$ if $\mathrm{dist}(x, y) > R_0$, we define $U(x)$ as the minimal set of points $y$ such that $\Phi_x(n)$ depends on $n_{y,\sigma}$.$^4$ We then introduce, for a given configuration $n$, the notion of excited sites $x \in \mathbb{Z}^d$. We say that a site $x$ is in the state $g^{(m)}$ if the configuration $n$ coincides with the configuration $g^{(m)}$ on $U(x)$; a site is excited if it is not in any of the states $g^{(1)}, \dots, g^{(r)}$. Given this notation, the Peierls assumption used in this paper is that there exists a constant $\gamma_{\mathrm{cl}} > 0$, independent of $\mu$, such that

$$\Phi_x(n) \geq e_0(\mu) + \gamma_{\mathrm{cl}} \quad \text{for all excited sites } x \text{ of all configurations } n. \tag{3.9}$$

Finally, we assume that the derivatives of $\Phi_x$ are uniformly bounded in $\mathcal{U}$. More explicitly, we assume that there is a constant $C_0 < \infty$, such that

$$\left| \frac{\partial}{\partial \mu_i} \Phi_x(n) \right| \leq C_0 \tag{3.10}$$

for all $i = 1, \dots, \nu$, $\mu \in \mathcal{U}$, $x \in \mathbb{Z}^d$, and all configurations $n$.

Remarks. i) Given the assumptions stated in this subsection, standard Pirogov–Sinai theory implies that the low temperature phase diagram of the classical model has the same topological structure as the corresponding zero temperature phase diagram (see above). ii) Let $\hat n_{\mathbf{x}}$ be the number operator $c^\dagger_{\mathbf{x}} c_{\mathbf{x}}$. Recalling that all these operators commute with each other, we define

$$H^{(0)}_x = \Phi_x(\hat n). \tag{3.11}$$

With this definition, $H^{(0)}$ is the formal sum

$$H^{(0)} = \sum_x H^{(0)}_x. \tag{3.12}$$

4 If $H$ is given as a sum of the form $\sum_M \phi_M$, where $\phi_M$ depends only on $n_{y,\sigma}$ with $y \in M$, then $U(x)$ is the union over all $M$ such that $x \in M$.


3.2. Assumptions on the quantum perturbation. We assume that $H_Q$ is given in the form

$$H_Q = \sum_A t_A h_A, \tag{3.13}$$

where the sum runs over sequences $A = (\tilde a_1, \dots, \tilde a_{2k})$ of labels $\tilde a_i = (a_i, \alpha_i, \epsilon_i) \in \mathbb{Z}^d \times \Sigma \times \{-1, 1\}$, $t_A \in \mathbb{C}$ is a suitable hopping parameter, and

$$h_A = c(\tilde a_{2k})\, c(\tilde a_{2k-1}) \cdots c(\tilde a_1), \tag{3.14a}$$

with

$$c((a, \alpha, \epsilon)) = \begin{cases} c^\dagger_{a,\alpha} & \text{if } \epsilon = +1, \\ c_{a,\alpha} & \text{if } \epsilon = -1. \end{cases} \tag{3.14b}$$

It will be convenient to assume that the creation and annihilation operators in $H_Q$ have been ordered in such a way that for each sequence $A = (\tilde a_1, \dots, \tilde a_{2k})$ contributing to (3.13) there exists $\ell \in \{0, 1, \dots, 2k\}$ such that i) $\epsilon_i = -1$ for $1 \leq i \leq \ell$ and $\epsilon_i = +1$ for $i > \ell$, and ii) with respect to the given order on $\mathbb{Z}^d \times \Sigma$ one has $(a_1, \alpha_1) < (a_2, \alpha_2) < \cdots < (a_\ell, \alpha_\ell)$, and $(a_{\ell+1}, \alpha_{\ell+1}) > (a_{\ell+2}, \alpha_{\ell+2}) > \cdots > (a_{2k}, \alpha_{2k})$. In the sequel, we call such a sequence a standard sequence and write $\mathcal{A}_0$ for the set of all standard sequences.

Given the above representation of $H_Q$ (we sometimes call it the standard form for $H_Q$), our assumptions on $H_Q$ are now formulated in terms of the coefficients $t_A$. First, in order to assure that the quantum perturbation is selfadjoint, we assume that

$$\bar t_A = t_{A^\star}, \tag{3.15}$$

where the bar denotes complex conjugation, and $A^\star$ is the sequence

$$A^\star = (\tilde a^\star_{2k}, \dots, \tilde a^\star_1), \tag{3.16a}$$

with

$$(a, \alpha, \epsilon)^\star = (a, \alpha, -\epsilon). \tag{3.16b}$$

Note that $A^\star$ is a standard sequence if and only if $A$ is a standard sequence.

Next, we assume that the hopping parameters $t_A$ are translation invariant, and that $t_A$ and its derivatives decay sufficiently fast in the support of the sequence $A$, defined as the minimal connected set containing $A$. To state this more precisely, for $A = (\tilde a_1, \dots, \tilde a_{2k})$, we consider connected sets of bonds $B$ that connect all points in $\{a_1, \dots, a_{2k}\}$. Restricting ourselves to those of minimal size, we define $B_0$ as the first in some arbitrary (but fixed) lexicographic order, and define the support, $\mathrm{supp}\, A$, of $A$ as the union of all points which are connected by this minimal set $B_0$. Note that by definition $\mathrm{supp}\, A$ depends only on the set $\{a_1, \dots, a_{2k}\}$. As a consequence $\mathrm{supp}\, A = \mathrm{supp}\, A^\star$. Introducing, for each $\gamma \geq 0$, the Sobolev norm

$$\|t\|_\gamma = \sum_{A : x \in \mathrm{supp}\, A} \left( |t_A| + \sum_{i=1}^{\nu} \left| \frac{\partial}{\partial \mu_i} t_A \right| \right) e^{\gamma |\mathrm{supp}\, A|}, \tag{3.17}$$

our decay assumption for the quantum perturbation is the assumption that

$$\|t\|_{\gamma_Q} < \infty \quad \text{for a sufficiently large constant } \gamma_Q. \tag{3.18}$$
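The ordering conditions i) and ii) and the conjugation $A \mapsto A^\star$ of (3.16) are easy to check mechanically. The sketch below is illustrative only: it assumes that labels are stored as triples `(a, alpha, eps)` with sites `a` given as tuples of integers, and that the total order on $\mathbb{Z}^d \times \Sigma$ is Python's lexicographic order on the pairs `(a, alpha)`; both choices are assumptions made here, not part of the paper.

```python
def star(seq):
    """A* of Eq. (3.16): reverse the sequence and flip every epsilon."""
    return tuple((a, alpha, -eps) for (a, alpha, eps) in reversed(seq))


def is_standard(seq):
    """Check conditions i) and ii) for a sequence of labels (a, alpha, eps).

    Assumption: the total order on Z^d x Sigma is Python's lexicographic
    order on the pairs (a, alpha), with sites a given as tuples of ints.
    """
    if len(seq) % 2 != 0:
        return False  # sequences in (3.13) have even length 2k
    eps = [e for (_, _, e) in seq]
    ell = eps.count(-1)
    # i) all annihilation labels (eps = -1) come first.
    if any(e != -1 for e in eps[:ell]) or any(e != +1 for e in eps[ell:]):
        return False
    keys = [(a, alpha) for (a, alpha, _) in seq]
    head, tail = keys[:ell], keys[ell:]
    # ii) increasing labels among annihilators, decreasing among creators.
    increasing = all(u < v for u, v in zip(head, head[1:]))
    decreasing = all(u > v for u, v in zip(tail, tail[1:]))
    return increasing and decreasing
```

Reversing a standard sequence turns its decreasing block of creation labels into an increasing block of annihilation labels and vice versa, which is why `is_standard(star(A))` agrees with `is_standard(A)`.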


Remarks. i) For a finite range quantum perturbation, this assumption is obviously fulfilled for any $\gamma_Q < \infty$. ii) If the quantum perturbation is of infinite range, we need that $|t_A|$ and $|\partial t_A/\partial \mu_i|$ decay exponentially fast in the size of the support of $A$. Assuming exponential decay with a sufficiently large decay constant $\gamma$, and observing that the number of connected sets $B$ of size $s$ that contain a given point $x \in \mathbb{Z}^d$ is bounded by $(2d)^{2s}$, while the number of standard sequences $A$ with $\mathrm{supp}\, A = B$ is bounded by $2^{2|B||\Sigma|}$, the condition (3.18) can be satisfied provided $\gamma > \gamma_Q + 2 \log(2d) + 2|\Sigma| \log 2$.

3.3. Finite volume states for the quantum system. In order to discuss the phase diagram of the quantum lattice system, we will consider suitable finite volume states $\langle \cdot \rangle^\beta_{q,\Lambda}$ which are analogues of the classical states with boundary condition $q$, with $q = 1, \dots, r$. Given a finite set $\Lambda \subset \mathbb{Z}^d$, we define $\hat n^{(q,\Lambda)}_{\mathbf{x}}$ as the number $g^{(q)}_{\mathbf{x}}$ if $x \in \Lambda^c = \mathbb{Z}^d \setminus \Lambda$, and as the operator $c^\dagger_{\mathbf{x}} c_{\mathbf{x}}$ if $x \in \Lambda$. With this definition, the operators

$$H^{(0)}_{q,x} = \Phi_x(\hat n^{(q,\Lambda)}), \tag{3.19}$$

$$H^{(0)}_{q,\Lambda} = \sum_{x \in \Lambda} H^{(0)}_{q,x} \tag{3.20}$$

and

$$H_{q,\Lambda} = H^{(0)}_{q,\Lambda} + \lambda \sum_{A : \mathrm{supp}\, A \subset \Lambda} t_A h_A \tag{3.21}$$

are selfadjoint operators in $\mathcal{H}_\Lambda$ provided $\lambda \in \mathbb{R}$ (recall that the sets $\mathrm{supp}\, A$ have been chosen in such a way that $\mathrm{supp}\, A = \mathrm{supp}\, A^\star$). Given the Hamiltonian with boundary conditions $q$, we introduce the quantum state $\langle \cdot \rangle^\beta_{q,\Lambda}$ by

$$\langle \Psi \rangle^\beta_{q,\Lambda} = \frac{1}{Z^\beta_{q,\Lambda}} \mathrm{Tr}_{\mathcal{H}_\Lambda} \left( \Psi\, e^{-\beta H_{q,\Lambda}} \right), \tag{3.22}$$

where

$$Z^\beta_{q,\Lambda} = \mathrm{Tr}_{\mathcal{H}_\Lambda}\, e^{-\beta H_{q,\Lambda}}. \tag{3.23}$$

We close this section with the definition of the support and norm of a local observable $\Psi$. Recalling that, by definition, any local observable $\Psi$ is a finite sum of the form

$$\Psi = \sum_A \lambda^\Psi_A h_A, \tag{3.24}$$

where the $h_A$ are even monomials in creation and annihilation operators (cf. (3.13)), we say that $\Psi$ is given in its standard form if all sequences contributing to (3.24) are standard sequences. Let now $\Psi$ be a local observable, and let (3.24) be its standard form. Then the support of $\Psi$ is defined as

$$\mathrm{supp}\, \Psi = \bigcup_{A : \lambda^\Psi_A \neq 0} \mathrm{supp}\, A, \tag{3.25}$$

and its norm as

$$\|\Psi\| = \sum_{A \in \mathcal{A}_0} \left| \lambda^\Psi_A \right|. \tag{3.26}$$


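3.4. Statement of results for non-zero temperatures. In order to state our results in the form of a theorem we introduce, for each $x$ in $\mathbb{Z}^d$ and any local observable $\Psi \in \mathcal{A}$, the corresponding translate $t_x(\Psi)$. Defining finally $\Lambda(L)$ as the box

$$\Lambda(L) = \left\{ x \in \mathbb{Z}^d \;\middle|\; -\frac{L}{2} \leq x_i < \frac{L}{2} \text{ for all } i = 1, \dots, d \right\}, \tag{3.27}$$

our main results are stated in the following two theorems.

Theorem 3.1. Let $d \geq 2$ and let $H = H^{(0)} + \lambda H_Q$ be a Hamiltonian satisfying the assumptions of Sects. 3.1 and 3.2. Then there are constants $0 < \gamma_0 = \gamma_0(d, |\Sigma|) < \infty$ and $\alpha = \alpha(d, |\Sigma|) > 0$ such that for all $\gamma \geq \gamma_0$, all finite $\beta \geq \beta_0 = \gamma/\gamma_{\mathrm{cl}}$ and all $\lambda \in \mathbb{C}$ with

$$|\lambda| \leq \lambda_0 := \frac{1}{2e\, \beta_0(\gamma)\, \|t\|_\gamma}, \tag{3.28}$$

there are functions $f_q(\mu, \beta)$, $q = 1, \dots, r$, continuously differentiable in $\mu$, such that the following statements hold true whenever

$$a_q(\mu, \beta, \lambda) := \mathrm{Re}\, f_q(\mu, \beta) - \min_m \mathrm{Re}\, f_m(\mu, \beta) = 0. \tag{3.29}$$

i) The infinite volume free energy corresponding to $Z^\beta_{q,\Lambda(L)}$ exists and is equal to $f_q$:

$$f_q = -\frac{1}{\beta} \lim_{L \to \infty} \frac{1}{|\Lambda(L)|} \log Z^\beta_{q,\Lambda(L)}. \tag{3.30}$$

ii) The infinite volume limit

$$\langle \Psi \rangle^\beta_q = \lim_{L \to \infty} \langle \Psi \rangle^\beta_{q,\Lambda(L)} \tag{3.31}$$

exists for all local observables $\Psi$ and has the same period as the corresponding classical ground state $g^{(q)}$.

iii) For all local observables $\Psi$ and $\Phi$, there exists a constant $C_{\Psi,\Phi} < \infty$, such that

$$\left| \langle \Psi\, t_x(\Phi) \rangle^\beta_q - \langle \Psi \rangle^\beta_q \langle t_x(\Phi) \rangle^\beta_q \right| \leq C_{\Psi,\Phi}\, e^{-\alpha\gamma|x|}. \tag{3.32}$$

iv) The projection operators $P^{(m)}_x = P_{U(x)}(g^{(m)})$ onto the "classical states" $g^{(m)}_{U(x)}$ obey the bounds

$$\left| \langle P^{(m)}_x \rangle^\beta_q - \delta_{m,q} \right| < C e^{-\gamma}, \tag{3.33}$$

where $C < \infty$ is a constant that depends only on $d$ and $|\Sigma|$.

v) With $C < \infty$ as above, and $C_0$ as in (3.10), one has

$$\left| f_q(\mu, \beta) - e_q(\mu) \right| \leq C e^{-\gamma} \tag{3.34a}$$

and

$$\left| \frac{d}{d\mu_i} \left( f_q(\mu, \beta) - e_q(\mu) \right) \right| \leq C C_0\, e^{-\gamma}. \tag{3.34b}$$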


Remarks. i) Following the usual terminology of Pirogov–Sinai theory, we call a phase with $a_q = 0$ stable. By (3.34b) and our assumptions on the derivative matrix (3.8), the matrix

$$F = \left( \frac{\partial\, \mathrm{Re}\, f_m(\mu, \beta)}{\partial \mu_i} \right) \tag{3.34}$$

has rank $r - 1$, and the inverse of the corresponding submatrix is uniformly bounded in $\mathcal{U}$, provided $\gamma$ is sufficiently large. By the inverse function theorem, statement v) of the theorem therefore implies that the phase diagram of the quantum system has the same structure as the zero temperature phase diagram of the classical system, with a $\nu - (r - 1)$ dimensional coexistence surface $\tilde S_0$, where all states are stable, $r$ different $\nu - (r - 1) - 1$ dimensional surfaces $\tilde S_n$ ending in $\tilde S_0$, where all states but the state $m$ are stable, . . . .

ii) Choosing $\beta$ sufficiently large and $\lambda$ sufficiently small, the bounds (3.33) can be made arbitrarily sharp. In this sense, the quantum states $\langle \cdot \rangle_q$ are small perturbations of the corresponding classical state whenever $q$ is stable.

iii) While Theorem 3.1 is stated (and proven) for general complex $\lambda$, the physical situation corresponds, of course, to real values of $\lambda$, as required by the self-adjointness of the Hamiltonian $H$. The "meta-stable free energies" $f_q$ are real in this case$^5$, making the real part in (3.29) and (3.34) superfluous.

iv) As stated, Theorem 3.1 is only valid for $\beta < \infty$. Some care is needed when stating the corresponding results for zero temperature, since the thermodynamic limit and the limit of zero temperature, in general, do not commute. Theorem 3.1 does hold for zero temperature, if $f_q(\mu, \beta)$ is replaced by

$$f_q(\mu) = \lim_{\beta \to \infty} f_q(\mu, \beta) \tag{3.35}$$

and the equalities (3.30) and (3.31) are replaced by

$$f_q(\mu) = -\lim_{L \to \infty} \lim_{\beta \to \infty} \frac{1}{\beta |\Lambda(L)|} \log Z^\beta_{q,\Lambda(L)} \tag{3.30$'$}$$

and

$$\langle \Psi \rangle_q = \lim_{L \to \infty} \lim_{\beta \to \infty} \langle \Psi \rangle^\beta_{q,\Lambda(L)}. \tag{3.31$'$}$$

For a statement concerning the possibility to interchange the order of limits see Sect. 3.5 below.

In order to state the next theorem, we define states with periodic boundary conditions on $\Lambda(L)$. To this end, we consider the torus $\Lambda_{\mathrm{per}}(L) = (\mathbb{Z}/L\mathbb{Z})^d$ and the corresponding Hamiltonian

$$H_{\mathrm{per},\Lambda(L)} = \sum_{x \in \Lambda_{\mathrm{per}}(L)} H^{(0)}_x + \lambda \sum_{A : \mathrm{supp}\, A \subset \Lambda_{\mathrm{per}}(L)} t_A h_A, \tag{3.36}$$

where the second sum goes over sequences $A$ whose support $\mathrm{supp}\, A$ does not wind around the torus $\Lambda_{\mathrm{per}}(L)$. With these definitions, we then introduce the quantum state with periodic boundary conditions as

$$\langle \cdot \rangle^\beta_{\mathrm{per},\Lambda(L)} = \frac{1}{Z^\beta_{\mathrm{per},\Lambda(L)}} \mathrm{Tr}_{\mathcal{H}_{\Lambda(L)}} \left( \cdot\; e^{-\beta H_{\mathrm{per},\Lambda(L)}} \right), \tag{3.37}$$

5 Given our constructions in Sect. 4 and 5, the proof of this fact is identical to the corresponding proof in [18].

where

$$Z^\beta_{\mathrm{per},\Lambda(L)} = \mathrm{Tr}_{\mathcal{H}_{\Lambda(L)}}\, e^{-\beta H_{\mathrm{per},\Lambda(L)}}. \tag{3.38}$$

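Theorem 3.2. Let $H^{(0)}$, $H_Q$, $\beta$ and $\lambda$ be as in Theorem 3.1, and let $L_0$ be the smallest common period of the ground states $g^{(1)}, \dots, g^{(r)}$. Assume in addition that $\lambda$ is real. Then the infinite volume state with periodic boundary conditions,

$$\langle \Psi \rangle^\beta_{\mathrm{per}} = \lim_{n \to \infty} \langle \Psi \rangle^\beta_{\mathrm{per},\Lambda(nL_0)}, \tag{3.39}$$

exists for all local observables $\Psi$ and is a convex combination (with equal weights) of the stable states,

$$\langle \Psi \rangle^\beta_{\mathrm{per}} = \frac{1}{|Q|} \sum_{q \in Q} \langle \Psi \rangle^\beta_q, \tag{3.40}$$

with

$$Q = Q(\mu, \beta, \lambda) = \{ q \in \{1, \dots, r\} \mid a_q(\mu, \beta, \lambda) = 0 \}. \tag{3.41}$$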
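The bookkeeping behind (3.29), (3.40) and (3.41) can be sketched in a few lines. This is only an illustration under stated assumptions: the meta-stable free energies are supplied as a dictionary `f` (a hypothetical input, not something computed here), and the condition $a_q = 0$ is tested up to a numerical tolerance.

```python
def stable_phases(f, tol=1e-12):
    """Return Q = {q : a_q = 0} from meta-stable free energies f[q], cf. (3.29) and (3.41).

    Assumption: f maps phase labels to real or complex numbers, and the
    equality a_q = 0 is tested up to the numerical tolerance tol.
    """
    re = {q: complex(v).real for q, v in f.items()}
    f_min = min(re.values())
    return {q for q, v in re.items() if v - f_min <= tol}


def periodic_expectation(expectations, f, tol=1e-12):
    """Equal-weight average of the stable-phase expectations, as in Eq. (3.40)."""
    Q = stable_phases(f, tol)
    return sum(expectations[q] for q in Q) / len(Q)
```

With `f = {1: -1.0, 2: -1.0, 3: -0.5}` the phases 1 and 2 are stable, and the periodic expectation is the plain average of their two expectation values; the unstable phase 3 does not contribute.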

Remark. The statement of the theorem remains true if the sequence of volumes in (3.39) goes over volumes $\Lambda(L)$ with $L = nL_0(Q)$, where $L_0(Q)$ is the smallest common period of all stable ground states $g^{(q)}$, $q \in Q = Q(\mu, \beta, \lambda)$.

3.5. Quantum states at zero temperature. As discussed in Remark iv) above, some care is needed when considering zero temperature states since the zero temperature limit $\beta \to \infty$ and the thermodynamic limit $\Lambda \to \mathbb{Z}^d$, in general, do not commute. In order to discuss this further, let us consider the modified partition function

$$Z^{\beta,\mathrm{np}}_{q,\Lambda} = \langle g^{(q)}_\Lambda |\, e^{-\beta H_{q,\Lambda}}\, | g^{(q)}_\Lambda \rangle, \tag{3.42}$$

where np indicates non-periodic boundary conditions. Namely, represented as a contour partition function on a suitable space-time lattice, see Sect. 4, the partition function $Z^{\beta,\mathrm{np}}_{q,\Lambda}$ is characterized by the boundary conditions $g^{(q)}_\Lambda$ at times $0$ and $\beta$, instead of the periodic b.c. in time corresponding to $Z^\beta_{q,\Lambda}$. As a consequence, $Z^\beta_{q,\Lambda}$ might contain contours wrapped around the lattice in time direction, while $Z^{\beta,\mathrm{np}}_{q,\Lambda}$ does not. Since these contours may force a state that is stable at zero temperature to be unstable at finite $\beta$, the cluster expansion for $\log Z^\beta_{q,\Lambda}$ might be divergent for arbitrarily large $\beta$, even though the phase $q$ becomes stable as $\beta \to \infty$.

This phenomenon does not occur for $Z^{\beta,\mathrm{np}}_{q,\Lambda}$, for which there are no dangerous contours wrapped around the lattice in time direction. Therefore, the partition function $Z^{\beta,\mathrm{np}}_{q,\Lambda}$ can be analysed by the convergent expansion provided $\beta \geq \beta_0$ and $q$ is stable for $\beta = \infty$. The same will be true for the modified expectation values

$$\langle \Psi \rangle^{\beta,\mathrm{np}}_{q,\Lambda} = \langle \Omega^\beta_{q,\Lambda} |\, \Psi\, | \Omega^\beta_{q,\Lambda} \rangle, \tag{3.43}$$

where

$$| \Omega^\beta_{q,\Lambda} \rangle = \frac{1}{\sqrt{Z^{\beta,\mathrm{np}}_{q,\Lambda}}}\, e^{-\frac{\beta}{2} H_{q,\Lambda}}\, | g^{(q)}_\Lambda \rangle. \tag{3.44}$$

As a consequence, we obtain the following lemma.


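Lemma 3.3. Let $d \geq 2$ and let $H^{(0)}$, $H_Q$, $\alpha$, $\gamma$, $\beta_0$, $\beta$ and $\lambda$ be as in Theorem 3.1. Let $q$ be a phase with

$$\lim_{\beta \to \infty} a_q(\mu, \beta, \lambda) = 0, \tag{3.45}$$

and let $\langle \cdot \rangle_q$ and $f_q(\mu)$ be as defined in (3.31$'$) and (3.30$'$). Then

$$\left| \frac{1}{\beta |\Lambda|} \log Z^{\beta,\mathrm{np}}_{q,\Lambda} + f_q(\mu) \right| \leq O\!\left( \frac{1}{\beta} + \frac{1}{\beta_0} \frac{|\partial\Lambda|}{|\Lambda|} \right), \tag{3.46}$$

and

$$\left| \langle \Psi \rangle^{\beta,\mathrm{np}}_{q,\Lambda} - \langle \Psi \rangle_q \right| \leq C_\Psi\, e^{-\alpha\gamma \min\{\beta/\beta_0,\ \mathrm{dist}(\mathrm{supp}\, \Psi,\, \partial\Lambda)\}}, \tag{3.47}$$

where $C_\Psi < \infty$ depends on $d$, $|\Sigma|$, the norm $\|\Psi\|$ of $\Psi$, and the size $|\mathrm{supp}\, \Psi|$ of the support of $\Psi$.

Remarks. i) Lemma 3.3 implies, in particular, that the limits $\beta \to \infty$ and $\Lambda \to \mathbb{Z}^d$ commute for the modified partition function and expectation values (3.42) and (3.43). ii) The statement (and the above consequence) of Lemma 3.3 remains true for the unmodified partition function and expectation values, $Z^\beta_{q,\Lambda}$ and $\langle \Psi \rangle^\beta_{q,\Lambda}$, if the phase $q$ is stable for all $\tilde\beta$ in $[\beta, \infty]$, i.e. if $a_q(\mu, \tilde\beta, \lambda) = 0$ for all $\tilde\beta \in [\beta, \infty]$. In fact, the error term $O(\frac{1}{\beta})$ in (3.46) gets replaced by an error term $O(e^{-\beta/\beta_0})$ in this case.

3.6. Low temperature states and global symmetries. In this section we consider the case in which the Hamiltonian $H_{q,\Lambda}$ commutes with some operator $Q_\Lambda$, which is extensive in the sense that

$$Q_\Lambda = \sum_{x \in \Lambda} Q_{x,\Lambda}, \tag{3.48}$$

where $Q_{x,\Lambda}$ are local observables in $\mathcal{A}_\Lambda$ for which $|\mathrm{supp}\, Q_{x,\Lambda}|$ and $\|Q_{x,\Lambda}\|$ are uniformly bounded in both $x$ and $\Lambda$. A typical example would be the operator of total particle number

$$N_\Lambda = \sum_{x \in \Lambda} n_x, \tag{3.49}$$

or the operator of the total number of particles of a given spin $\sigma$,

$$N_{\Lambda,\sigma} = \sum_{x \in \Lambda} n_{x,\sigma}. \tag{3.50}$$

In addition to the assumption that $Q_\Lambda$ is a symmetry of the quantum system,

$$[H_{q,\Lambda}, Q_\Lambda] = 0, \tag{3.51}$$

we will assume that $| g^{(q)}_\Lambda \rangle$ is an eigenstate of $Q_\Lambda$,

$$Q_\Lambda\, | g^{(q)}_\Lambda \rangle = \rho^{(q)}_\Lambda\, |\Lambda|\; | g^{(q)}_\Lambda \rangle, \tag{3.52}$$

and that the classical density $\rho^{(q)}_\Lambda$ has a limit as $\Lambda \to \infty$,

$$\rho^{(q)}_{\mathrm{class}} = \lim_{\Lambda \to \mathbb{Z}^d} \rho^{(q)}_\Lambda. \tag{3.53}$$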


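In the above examples, $\rho^{(q)}_{\mathrm{class}}$ is the average density or the average density of particles with spin $\sigma$, respectively, in the classical state $| g^{(q)}_\Lambda \rangle$. The following theorem states that the "quantum density"

$$\rho^{(q)}_{\mathrm{quant}}(\beta) = \lim_{\Lambda \to \mathbb{Z}^d} \frac{1}{|\Lambda|} \langle Q_\Lambda \rangle^\beta_q \tag{3.54}$$

approaches the classical density $\rho^{(q)}_{\mathrm{class}}$ as $\beta \to \infty$.

Theorem 3.4. Let $d \geq 2$, let $H^{(0)}$, $H_Q$, $\beta_0$, and $\lambda_0$ be as in Theorem 3.1, and let $|\lambda| \leq \lambda_0$. Assume that $Q_\Lambda$ is an operator that is extensive in the sense described above, and that satisfies (3.51) through (3.53) for some $q$. Then there exist constants $C < \infty$ and $c > 0$ such that

i) If $q$ is stable at $\beta = \infty$, i.e. if $\lim_{\beta \to \infty} a_q(\mu, \beta, \lambda) = 0$, then

$$\rho^{(q)}_{\mathrm{quant}} \equiv \lim_{\Lambda \to \mathbb{Z}^d} \frac{1}{|\Lambda|} \langle Q_\Lambda \rangle_q = \rho^{(q)}_{\mathrm{class}}, \tag{3.55}$$

where $\langle \cdot \rangle_q$ is the zero temperature state defined in (3.31$'$).

ii) If $\tilde\beta_0 \geq \beta_0$ and if $q$ is stable for all $\beta \geq \tilde\beta_0$, then

$$\left| \rho^{(q)}_{\mathrm{quant}}(\beta) - \rho^{(q)}_{\mathrm{class}} \right| \leq C e^{-\beta c}, \tag{3.56}$$

provided $\beta \geq \tilde\beta_0$.

Remark. For many models, the classical density $\rho^{(q)}_{\mathrm{class}}$ is constant in some range of parameters $\mu$. For these models, Theorem 3.4 implies that the compressibilities

$$\chi^{(i)} = \frac{\partial}{\partial \mu_i}\, \rho^{(q)}_{\mathrm{quant}}(\beta) \tag{3.57}$$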

vanish at zero temperature. An example of such a model is the extended Hubbard model in the staggered phase considered in Sect. 2.

4. Contour Representation

We consider a fixed finite volume $\Lambda = \Lambda(L) = \{x \in \mathbb{Z}^d \mid |x_i| \leq L \text{ for all } i = 1, \dots, d\}$, and a fixed value $q \in \{1, \dots, r\}$ for the boundary condition; further, we are not explicitly specifying this in our notation. Fixing an integer $M$ to be determined later, and setting $\tilde\beta = \beta/M$, we introduce the transfer matrices

$$T^{(0)} = e^{-\tilde\beta H^{(0)}_{q,\Lambda}} \tag{4.1}$$

and

$$T = e^{-\tilde\beta H_{q,\Lambda}}, \tag{4.2}$$

and rewrite the partition function $Z_{q,\Lambda}$ as

$$Z_{q,\Lambda} = \mathrm{Tr}_{\mathcal{H}_\Lambda}\, T^M. \tag{4.3}$$


4.1. Duhamel series and path integral representation. In a first step, we expand the transfer matrix $T$ around the matrix $T^{(0)}$ using the Duhamel (or Dyson) series for the operator $T$ (for a reference on the Duhamel series, see e.g. [24]). Introducing the family $\mathcal{A}_0$ of all sequences $A$ contributing to (3.21), and, for each multiindex $m : \mathcal{A}_0 \to \{0, 1, \dots\}$, the notation

$$|m| = \sum_{A \in \mathcal{A}_0} m_A, \qquad (-\lambda t)^m = \prod_{A \in \mathcal{A}_0} (-\lambda t_A)^{m_A}, \qquad m! = \prod_{A \in \mathcal{A}_0} m_A!$$

and

$$\int d\tau^m = \prod_{A \in \mathcal{A}_0 : m_A \neq 0} \int_0^{\tilde\beta} d\tau_A^1 \cdots \int_0^{\tilde\beta} d\tau_A^{m_A},$$

the Duhamel series for the operator $T$ can be written in the form

$$T = \sum_m \frac{(-\lambda t)^m}{m!} \int d\tau^m\, T(\tau, m). \tag{4.4}$$

Here the sum goes over multiindices $m : \mathcal{A}_0 \to \{0, 1, \dots\}$, $\tau = \{\tau_A^1, \dots, \tau_A^{m_A},\ A \in \mathcal{A}_0\}$, and the operator $T(\tau, m)$ is obtained from $T^{(0)}$ by "inserting" the operator $h_A$ at the times $\tau_A^1, \dots, \tau_A^{m_A}$, $A \in \mathcal{A}_0$. Formally, it can be defined as follows. For a given $m$ and $\tau$, let $\mathcal{A} = \{A_1, \dots, A_k\}$ be the set of all $A \in \mathcal{A}_0$ with $m_A \neq 0$, $m_i = m_{A_i}$, and $h_i = h_{A_i}$. Let

$$(s_1, \dots, s_{|m|}) = \pi(\tau_{A_1}^1, \dots, \tau_{A_1}^{m_1}, \dots, \tau_{A_k}^1, \dots, \tau_{A_k}^{m_k}) \tag{4.5}$$

be a permutation of the times $\tau$ such that $s_1 \leq s_2 \leq \cdots \leq s_{|m|}$, and set

$$(\tilde h_1, \dots, \tilde h_{|m|}) = \pi(h_1, \dots, h_1, \dots, h_k, \dots, h_k), \tag{4.6}$$

where on the right-hand side each $h_i$ appears exactly $m_i$ times. Then $T(\tau, m)$ is defined by

$$T(\tau, m) = e^{-(\tilde\beta - s_{|m|}) H^{(0)}_{q,\Lambda}}\, \tilde h_{|m|}\, e^{-(s_{|m|} - s_{|m|-1}) H^{(0)}_{q,\Lambda}}\, \tilde h_{|m|-1} \cdots e^{-(s_2 - s_1) H^{(0)}_{q,\Lambda}}\, \tilde h_1\, e^{-s_1 H^{(0)}_{q,\Lambda}}. \tag{4.7}$$

For later reference, we also define the time ordered monomials

$$R(\tau, m) = \tilde h_{|m|}\, \tilde h_{|m|-1} \cdots \tilde h_1. \tag{4.8}$$

(Notice that, formally, $R(\tau, m) \equiv T_{H^{(0)} \equiv 0}(\tau, m)$.)
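The lowest order of the Duhamel series (4.4) can be checked numerically on a toy example. The sketch below is not taken from the paper: it assumes a diagonal $H^{(0)}$ with given energies and a single perturbation matrix $V$ in place of $\sum_A t_A h_A$, and it compares $e^{-\beta(H^{(0)} + \lambda V)}$ with $T^{(0)} - \lambda \int_0^\beta e^{-(\beta - s)H^{(0)}} V e^{-sH^{(0)}} ds$, whose difference should be of order $\lambda^2$. The helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def expm(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def duhamel_first_order(energies, v, lam, beta):
    """T^(0) - lam * int_0^beta e^{-(beta-s)H0} V e^{-s H0} ds for diagonal H0."""
    n = len(energies)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ei, ej = energies[i], energies[j]
            if abs(ei - ej) < 1e-14:
                integral = beta * math.exp(-beta * ei)
            else:
                # Closed form of int_0^beta e^{-(beta-s) ei} e^{-s ej} ds.
                integral = (math.exp(-beta * ej) - math.exp(-beta * ei)) / (ei - ej)
            diagonal = math.exp(-beta * ei) if i == j else 0.0
            out[i][j] = diagonal - lam * v[i][j] * integral
    return out
```

For a two-level system with energies $0$ and $1$, an off-diagonal $V$ and $\lambda = 10^{-3}$, the first-order expression agrees with the exact exponential up to an error of order $\lambda^2$, while the first-order correction itself is of order $\lambda$.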


Inserting the expansion (4.4) into (4.3), and using the occupation number basis (3.1) to express the trace as a sum of expectation values, we get

$$Z_{q,\Lambda} = \sum_n \sum_{m_1} \cdots \sum_{m_M} \prod_{k=1}^{M} \left[ \frac{(-\lambda t)^{m_k}}{m_k!} \int d\tau_k^{m_k} \right] \langle n |\, T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, | n \rangle, \tag{4.9}$$

where $m_k$, $k = 1, \dots, M$, are multiindices $m_k : \mathcal{A}_0 \to \{0, 1, \dots\} : A \mapsto m_{k,A}$ and $\tau_k = \{\tau_{k,A}^1, \dots, \tau_{k,A}^{m_{k,A}},\ A \in \mathcal{A}_0\}$ are the corresponding integration variables.

Each term on the right-hand side of (4.9) can be interpreted, in a standard manner, in terms of a classical path $n(\cdot) : [0, \beta] \to \{0, 1\}^{\boldsymbol{\Lambda}}$ determined uniquely by the vector $|n\rangle$ and sequences $(\tau_1, m_1), \dots, (\tau_M, m_M)$. To get the assignment $\tau \mapsto n(\tau)$ we start with the observation that an operator $h_A$ applied to a vector of the form (3.1) yields either zero or again a vector of the form (3.1). Combined with the fact that $H^{(0)}_{q,\Lambda}$ is diagonal in the basis (3.1),

$$H^{(0)}_{q,\Lambda}\, | n \rangle = \sum_{x \in \Lambda} \Phi_x(n)\, | n \rangle, \tag{4.10}$$

we infer that $T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, |n\rangle$ and $R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, |n\rangle$ are parallel vectors of $\mathcal{H}_\Lambda$ and that $\langle n |\, T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, | n \rangle$ is non zero if and only if $\langle n |\, R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, | n \rangle$ does not vanish. The classical path $n(\cdot)$ is now obtained in the standard way. Starting from $n(0) = n$, $n(\tau)$ is piecewise constant, with a jump whenever

$$\tau = (k - 1)\tilde\beta + \tau_{k,A}^i \tag{4.11}$$

for some $k \in \{1, \dots, M\}$, $A \in \mathcal{A}_0$, and $i \in \{1, \dots, m_{k,A}\}$. At these times, $n(\cdot)$ jumps from $n(\tau)$ to $n(\tau + 0)$ defined by

$$| n(\tau + 0) \rangle := h_{A(\tau)}\, | n(\tau) \rangle, \tag{4.12}$$

with $A(\tau)$ implicitly defined by (4.11). Note that $n(\tau + 0)$ is not defined if the right-hand side of (4.12) is zero. It is easy to see, however, that the corresponding terms do not contribute to the right-hand side of (4.9), since the matrix elements $\langle n |\, R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, | n \rangle$ and $\langle n |\, T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, | n \rangle$ vanish in this case. In a similar way, paths with $n(\beta) \neq n(0) \equiv n$ do not contribute to (4.9). Note also that there may be several values for $A$, $k$ and $i$ which fulfill (4.11). Since such "events" have measure zero in the integration on the right-hand side of (4.9), we may assume, without loss of generality, that this does not happen.

Given the above construction and the definition (4.7) of the matrix $T(\tau, m)$, one immediately gets the following explicit formula for the vector $T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, |n\rangle$ in terms of $R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, |n\rangle$. Namely,

$$T(\tau_M, m_M) \cdots T(\tau_1, m_1)\, | n \rangle = \exp\left\{ -\sum_{x \in \Lambda} \int_0^\beta \Phi_x(n(\tau))\, d\tau \right\} R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, | n \rangle. \tag{4.13}$$

Inserting the equality (4.13) into (4.9), and introducing the symbol S(n, {τ k , mk }) for the “sign” (4.14) S(n, {τ k , mk }) = hn|R(τ M , mM ) · · · R(τ 1 , m1 ) |ni,


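we obtain the representation

$$Z_{q,\Lambda} = \sum_n \sum_{\{m_k\}} \int \prod_{k=1}^{M} \frac{(-\lambda t)^{m_k}}{m_k!}\, d\tau_k^{m_k}\, \exp\Bigl\{ - \sum_{x \in \Lambda} \int_0^\beta \Phi_x(n(\tau))\, d\tau \Bigr\} \times S(n, \{\tau_k, m_k\}), \tag{4.15}$$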

where the second sum stands for the $M$ sums over $m_1, \dots, m_M$.

Remarks. i) Note that for $x$ near the boundary, the value of $\Phi_x(n(\tau))$ depends on the configuration outside $\Lambda$, which we assumed to be the ground state configuration $g^{(q)}$ by assuming boundary conditions $q$. We suppress this dependence in our notation.

ii) As discussed above, configurations $\{n, \{\tau_k, m_k\}\}$ only contribute to the partition function $Z_{q,\Lambda}$ if they correspond to a classical configuration $n(\cdot)$ with $n(0) = n(\beta)$. To make this condition more explicit, it is convenient to consider time ordered monomials $M_x(\tau, m)$ which are obtained from $R(\tau, m)$ by leaving out all creation and annihilation operators $c_y^\dagger$ and $c_y$ with $y \neq x$. A configuration $\{n, \{\tau_k, m_k\}\}$ then contributes to the partition function $Z_{q,\Lambda}$ if and only if, for each $x$, the monomials $M_x(\tau_M, m_M) \cdots M_x(\tau_1, m_1)$ are of the form $c_x^\dagger c_x c_x^\dagger \cdots c_x$ if $n_x = 1$, and of the form $c_x c_x^\dagger c_x \cdots c_x^\dagger$ if $n_x = 0$.

4.2. Ground state cells, excited cells and contours. In order to define contours, we introduce a suitable space time lattice, the notion of an elementary cell, and the definition of ground state cells and excited cells. We define the lattices

$$\mathbb{L} = \mathbb{Z}^d \times \tilde\beta\{0, \dots, M\}_{\rm per} \tag{4.16a}$$

and

$$\mathbb{L}_\Lambda = \Lambda \times \tilde\beta\{0, \dots, M\}_{\rm per}, \tag{4.16b}$$

where the index “per” stands for the identification of $(x, 0)$ and $(x, M\tilde\beta) = (x, \beta)$, and the continuum tori

$$\mathbb{T} = \mathbb{R}^d \times [0, \beta]_{\rm per} \tag{4.17a}$$

and

$$\mathbb{T}_\Lambda = \{y \in \mathbb{R}^d \mid \operatorname{dist}(y, \Lambda) \le \tfrac{1}{2}\} \times [0, \beta]_{\rm per}, \tag{4.17b}$$

again with periodic boundary conditions in the “time direction”. An elementary cell $C(x, k)$, labeled by an index $(x, k) \in \mathbb{Z}^d \times \{1, \dots, M\}$ (we identify $0$ and $M$), is now defined as the set

$$C(x, k) = \{y \in \mathbb{R}^d \mid \operatorname{dist}(y, x) \le \tfrac{1}{2}\} \times \tilde\beta[k - 1, k]. \tag{4.18}$$

Given a “configuration” $\omega = \{n, \{\tau_k, m_k\}\}$ contributing to the right-hand side of (4.15), we distinguish between elementary cells $C(x, k)$ with constant occupation numbers $n_{x,\sigma}(\tau)$, and those which are “visited” by a hopping term $h_A$. We define an elementary cell $C(x, k) \subset \mathbb{T}_\Lambda$ to be a quantum cell if $x \in \operatorname{supp} m_k$, where $\operatorname{supp} m_k := \bigcup_{A : m_{k,A} \neq 0} \operatorname{supp} A$, and to be a classical cell if $x \notin \operatorname{supp} m_k$. Note that with this definition, the occupation number $n_{x,\sigma}(\tau)$ is constant inside classical cells, so that $n_{x,\sigma}(\tau) = n_{x,\sigma}(k\tilde\beta) =: n_\sigma(C(x, k))$ if $C(x, k)$ is a classical cell and $(k - 1)\tilde\beta \le \tau \le k\tilde\beta$.


We say that a cell $C(x, k)$ is in the ground state $m$ if all cells $C(y, k)$ with $y \in U(x)$ are classical cells, and $n_\sigma(C(y, k)) = g^{(m)}_{y,\sigma}$. A cell which is not in a ground state is called an excited cell, and the set of excited cells corresponding to the configuration $\omega$ is denoted by $D = D(\omega)$.

At this point, the definition of contours is standard. One defines a (labeled) contour $Y$ as a pair $(\operatorname{supp} Y, \alpha)$, where $\operatorname{supp} Y \subset \mathbb{T}$ is a finite, connected union of elementary cells, while $\alpha$ is an assignment of labels $\alpha(F)$ to faces of $\partial \operatorname{supp} Y$ which is constant on the boundary of all connected components of $\mathbb{T} \setminus \operatorname{supp} Y$. The contours $Y_1, \dots, Y_n$ corresponding to a configuration $\omega = \{n, \{\tau_k, m_k\}\}$ are then defined by taking the connected components of the set $D$ of excited cells in $\mathbb{T}_\Lambda$ for their supports $\operatorname{supp} Y_1, \dots, \operatorname{supp} Y_n$ and by taking the labels $m$ of the ground states for the elementary cells $C$ that touch the face $F$, see above, for the corresponding labels $\alpha_i(F)$. The ground state regions $V_m$, $m = 1, \dots, r$, corresponding to $\omega$, on the other hand, are defined as the union of all elementary cells that are in the ground state $m$.

Note that for each configuration $\omega = \{n, \{\tau_k, m_k\}\}$ contributing to (4.15), the set of contours corresponding to $\omega$ is a set of mutually compatible contours with matching labels and external boundary condition $q$. Here, as usual, two contours $Y$ and $Y'$ are called compatible whenever $\operatorname{supp} Y \cap \operatorname{supp} Y' = \emptyset$; a set $\{Y_1, \dots, Y_n\}$ of pairwise compatible contours is called a set with matching labels if the labels $\alpha(F)$ of the contours $Y_1, \dots, Y_n$ are constant on the boundary of each component of $\mathbb{T} \setminus (\operatorname{supp} Y_1 \cup \dots \cup \operatorname{supp} Y_n)$; and a set of mutually compatible contours with matching labels is said to have external boundary condition $q$ if these labels take the value $q$ on the boundary of the infinite component of $\mathbb{T} \setminus (\operatorname{supp} Y_1 \cup \dots \cup \operatorname{supp} Y_n)$.

Note also that, by our definition of ground state cells, the function $\Phi_x(n(\tau))$ in the exponent in (4.15) is constant and equal to $e_m$ for all $(x, \tau)$ in the ground state region $V_m$. As a consequence, the contribution of the ground state region $V_m$ to the exponent in (4.15) is $-\tilde\beta |V_m| e_m$, where $|V_m|$ is the number of elementary cells in $V_m$.

In a final step, we now sum (and integrate) over all configurations leading to the same set of contours $\{Y_1, \dots, Y_n\}$. Extracting further the factor $e^{-\tilde\beta \sum_m e_m |V_m|}$ for the classical energy of the ground state regions, $\bigcup_m V_m = \mathbb{T}_\Lambda \setminus (\operatorname{supp} Y_1 \cup \dots \cup \operatorname{supp} Y_n)$, and denoting the numerical value of the sum over the remaining factors by $\rho(Y_1, \dots, Y_n)$, we obtain the contour representation

$$Z_{q,\Lambda} = \sum_{\{Y_1, \dots, Y_n\}} e^{-\sum_m \tilde\beta |V_m| e_m}\, \rho(Y_1, \dots, Y_n), \tag{4.19}$$

where the sum goes over all sets of mutually compatible contours with matching labels, external boundary condition $q$, and support $\operatorname{supp} Y_i \subset \mathbb{T}_\Lambda$. Note that the external boundary condition $q$ refers to the set $\{Y_1, \dots, Y_n\}$, not to the individual contours $Y \in \{Y_1, \dots, Y_n\}$. Our goal, now, is to show that it is possible to define contour activities $\rho(Y)$ so that

$$\rho(Y_1, \dots, Y_n) = \prod_{i=1}^{n} \rho(Y_i), \tag{4.20}$$

and hence

$$Z_{q,\Lambda} = \sum_{\{Y_1, \dots, Y_n\}} e^{-\sum_m \tilde\beta |V_m| e_m} \prod_{i=1}^{n} \rho(Y_i). \tag{4.21}$$


Given this representation, the partition function can then be analysed using a slightly modified version [18] of standard Pirogov–Sinai theory, provided ρ(Y ) is decaying sufficiently fast in the size of Y (which will be easy to show, see Sect. 5).
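The construction of contour supports as connected components of the set of excited cells can be sketched in a few lines. This is only an illustration under stated assumptions: a cell $C(x, k)$ is represented by a pair `(x, k)` with `x` a tuple of integer site coordinates and `k` in `1..M`; two cells are treated as connected when they share a face, which is an assumption of the sketch (the text only asks for a connected union of cells); and the function name `contour_supports` is hypothetical.

```python
from collections import deque


def contour_supports(excited, M):
    """Split a set of excited cells into connected components.

    excited: set of (x, k), x a tuple of ints (site), k in 1..M (time slice).
    Two cells count as neighbours if they differ by one unit in exactly one
    spatial coordinate (same k), or by one time slice (same x), with slices
    1 and M adjacent (periodic time). Face-sharing connectivity is an
    assumption made for this sketch.
    """
    def neighbours(cell):
        x, k = cell
        for i in range(len(x)):
            for step in (-1, 1):
                y = x[:i] + (x[i] + step,) + x[i + 1:]
                yield (y, k)
        yield (x, k % M + 1)        # next slice, M wraps to 1
        yield (x, (k - 2) % M + 1)  # previous slice, 1 wraps to M

    remaining = set(excited)
    components = []
    while remaining:
        start = remaining.pop()
        component, queue = {start}, deque([start])
        while queue:
            for nb in neighbours(queue.popleft()):
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components
```

For example, with `M = 4` the cells `((0,), 1)`, `((0,), 4)` and `((1,), 4)` end up in one component, because slices 1 and 4 are adjacent on the time torus, while an isolated cell such as `((5,), 2)` forms a component of its own.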

4.3. Factorization of the contour activities. In this subsection we prove the factorization (4.20). Let us first introduce the notation $\omega(V)$ for a configuration living on a set $V \subset \mathbb{T}_\Lambda$; namely, such a configuration is given by $\omega(V) = \{n(V), \{\tau_k(V), m_k(V)\}\}$ with $n(V) = \{n_x,\ C(x, 1) \subset V\}$, $m_k(V) = \{m_{k,A};\ \bigcup_{x \in \operatorname{supp} A} C(x, k) \subset V\}$, $\tau_k(V) = \{\tau^i_{k,A};\ \bigcup_{x \in \operatorname{supp} A} C(x, k) \subset V\}$. Inspecting now the mapping $\omega \mapsto \{Y_1, \dots, Y_n\}$ assigning a set of mutually compatible contours to a configuration $\omega = \{n, \{\tau_k, m_k\}\}$ contributing to $Z_{q,\Lambda}$ (see Remark ii) after (4.15) for an explicit condition), we define the indicator function $\chi_{Y_1, \dots, Y_n}(\omega)$ to be 1 if $Y_1, \dots, Y_n$ are the contours corresponding to $\omega$ and to be 0 otherwise. Note that this definition implicitly gives $\chi_{Y_1, \dots, Y_n}(\omega) = 0$ if $\omega$ does not contribute to $Z_{q,\Lambda}$, since such a configuration does not correspond to a classical path $n(\tau)$ and hence not to any assignment of contours. The indicator function $\chi_{Y_1, \dots, Y_n}(\omega)$ can now be decomposed into a product

$$\chi_{Y_1, \dots, Y_n}(\omega) = \prod_m \chi_m(\omega(V_m)) \prod_{i=1}^{n} \chi_{Y_i}(\omega(\operatorname{supp} Y_i)). \tag{4.22}$$

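Here $\omega(V_m)$ and $\omega(\operatorname{supp} Y_i)$ are the corresponding restrictions of the configuration $\omega$. The function $\chi_m(\omega(V_m))$ indicates that $m_k(V_m) = 0$ for all $k$ and $n_x = g^{(m)}_x$ for all $x$ such that $C(x, 1) \subset V_m$. Given a contour $Y$ and extending the configuration $\omega(\operatorname{supp} Y)$ by putting $m_k(\mathbb{T}_\Lambda \setminus \operatorname{supp} Y) = 0$ and fixing $n_x = g^{(m)}_x$ for every cell $C(x, 1)$ with $C(x, 1) \cap \operatorname{supp} Y = \emptyset$ contained in the component of $\mathbb{T}_\Lambda \setminus \operatorname{supp} Y$ whose boundary is labeled by $\alpha = m$, the function $\chi_Y(\omega(\operatorname{supp} Y))$ indicates that $Y$ is the only contour of this extension of $\omega(\operatorname{supp} Y)$. Note that the conditions according to Remark ii) after (4.15) are fulfilled for $\omega$ if and only if they are fulfilled for the extension of $\omega(\operatorname{supp} Y)$, for all contours $Y$ corresponding to $\omega$, a condition that is, in turn, again implicit in $\chi_Y(\omega(\operatorname{supp} Y))$.

Next, we introduce the classical energy $\tilde\beta E(\omega(\operatorname{supp} Y))$ of a contour $Y$:

$$\tilde\beta E(\omega(\operatorname{supp} Y)) = \sum_{k=1}^{M} \sum_{x : C(x,k) \subset \operatorname{supp} Y} \int_{(k-1)\tilde\beta}^{k\tilde\beta} \Phi_x(n_Y(\tau))\, d\tau, \tag{4.23}$$

where $n_Y(\cdot)$ is the classical path obtained from the above extension of $\omega(\operatorname{supp} Y)$ to $\mathbb{T}_\Lambda$. With these notations,

$$\rho(Y_1, \dots, Y_n) = \sum_n \sum_{\{m_k\}} \int \prod_{k=1}^{M} \frac{(-\lambda t)^{m_k}}{m_k!}\, d\tau_k^{m_k}\, \chi_{Y_1, \dots, Y_n}(\omega) \times S(\omega) \prod_{i=1}^{n} e^{-\tilde\beta E(\omega(\operatorname{supp} Y_i))}, \tag{4.24}$$

where $\omega = \{n, \{\tau_k, m_k\}\}$, and $S(\omega) = S(n, \{\tau_k, m_k\})$ is given by (4.14).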


Thus to prove the factorization (4.20), it remains to show the factorization for the sign $S(\omega)$. Our task is to introduce signs $S(\omega(\operatorname{supp} Y)) \in \{-1, 1\}$ so that, for a configuration $\omega$ with contours $\{Y_1, \dots, Y_n\}$, one has

$$S(\omega) = \prod_{i=1}^{n} S(\omega(\operatorname{supp} Y_i)). \tag{4.25}$$

We need some notation. As usual, the interior $\operatorname{Int} Y$ of a contour $Y = (\operatorname{supp} Y, \alpha)$ is defined as the union of all finite components $C$ of $\mathbb{T} \setminus \operatorname{supp} Y$ (finite in the sense that $C$ is a finite union of unit cells), while the exterior $\operatorname{Ext} Y$ is defined as the infinite component of $\mathbb{T} \setminus \operatorname{supp} Y$. One says that $Y$ is a contour with external boundary condition $q$, or shorter: a $q$-contour, if $\alpha(F) = q$ for all faces $F$ in the boundary of $\operatorname{Ext} Y$, and one defines $\operatorname{Int}_m Y$ as the union of all components $C$ of $\operatorname{Int} Y$ such that $\alpha|_{\partial C} = m$. Finally, $V(Y)$ is defined as $\operatorname{supp} Y \cup \operatorname{Int} Y$.

We now proceed by determining the signs of contours one by one, starting from the innermost ones, “erasing” them simultaneously from the configuration $\omega$. Let thus $Y_i$ be a contour with external boundary condition $q_i$, such that there is no contour $Y_j$, $j \neq i$, with $\operatorname{supp} Y_j \subset \operatorname{Int} Y_i$. Consider the configuration $\tilde\omega$ obtained by extending the configuration $\omega(\mathbb{T}_\Lambda \setminus V(Y_i))$ by taking $\tilde m_k(V(Y_i)) = 0$ and $\tilde n_x = g^{(q_i)}_x$ for all $x$ such that $C(x, 1) \subset V(Y_i)$. We will now introduce the sign $S(\omega(\operatorname{supp} Y_i))$ (independently of the configuration $\omega(\mathbb{T}_\Lambda \setminus V(Y_i))$) in such a way that

$$S(\omega) = S(\omega(\operatorname{supp} Y_i))\, S(\tilde\omega) \tag{4.26}$$

with $S(\tilde\omega)$ defined from the configuration $\tilde\omega$ by (4.14). Iterating the erasure procedure and formula (4.26), we get a final configuration with no contours and sign $+1$, thus establishing the equality (4.25).

To determine the sign $S(\omega(\operatorname{supp} Y_i))$, we begin by considering for each $x \in \Lambda$ the intersection $I(x)$ of the line $\{x\} \times [0, \beta]_{\rm per}$ with $V(Y_i)$, $I(x) = (\{x\} \times [0, \beta]_{\rm per}) \cap V(Y_i)$. If nonempty, the set $I(x)$ is either a union of disjoint intervals $I(x) = \bigcup_l [k_l^-, k_l^+]$ or $I(x) = [0, \beta]_{\rm per}$. In the former case ($I(x) \neq [0, \beta]_{\rm per}$), we use the fact that all boundary cells of $V(Y_i)$ are classical cells with the same ground state $g^{(m)}$ and thus the path $n(\tau)$ (corresponding to $\omega$ for which $\chi_Y(\omega(\operatorname{supp} Y_i)) \neq 0$) necessarily attains the values $n_x(\tau = k_l^-) = n_x(\tau = k_l^+) = g^{(m)}_x$. Assuming for a moment that the interval $(k_l^-, k_l^+)$ does not contain the time $\tau = 0$, let us consider the product

$$\tilde h_a \cdots \tilde h_b \tag{4.27}$$

consisting of those terms $h_A$ in the product $\prod_k R(\tau_k, m_k)$ for which the times $\tau^i_{k,A}$ fall into the interval $(k_l^-, k_l^+)$. If the corresponding term is to be nonvanishing (i.e. if $\chi_Y(\omega(\operatorname{supp} Y_i)) \neq 0$), there must be in (4.27) the same number of creation and annihilation operators $c_x^+$ and $c_x$. Commuting them through all remaining terms until they mutually annihilate, we produce a sign $s_{x,l}(\omega(\operatorname{supp} Y_i))$. Notice that this sign does not depend on the configuration $\omega(\mathbb{T}_\Lambda \setminus \operatorname{supp} Y_i)$, since if (4.27) contains a term $h_A$ corresponding to any other contour, then necessarily $x \notin \operatorname{supp} A$ and, since $h_A$ is a product of an even number of creation and annihilation operators, the operator $c_x^+$ (resp. $c_x$) commutes with such $h_A$, producing no additional sign. If the interval $(k_l^-, k_l^+)$ contains the time $\tau = 0$, we consider separately the product of the form (4.27) for the interval $(k_l^-, 0)$, and that for the interval $(0, k_l^+)$. We then commute all creation and annihilation operators $c_x^+$ and $c_x$ that correspond to times in $(k_l^-, 0)$ with the remaining operators in the product (4.27) until they hit time zero, and similarly for those in $(0, k_l^+)$. After annihilating all pairs, we will be left with monomials $R_+$ and $R_-$ in the operators $c_x^+$ and $c_x$, such that

$$R_+\, |n\rangle\langle n|\, R_- = |\tilde n^{(x)}\rangle\langle \tilde n^{(x)}|, \tag{4.28}$$

where $\tilde n^{(x)}$ is obtained from $n$ by substituting $\tilde n_x = g^{(m)}_x$ for $n_x$. Combining the steps described so far, we get a sign $s_x(\omega(\operatorname{supp} Y_i)) = \prod_l s_{x,l}(\omega(\operatorname{supp} Y_i))$ and the new state $\tilde n^{(x)}$ at $\tau = 0$, with $\tilde n_x = g^{(m)}_x$, as required by our definition of $\tilde\omega$.

If $I(x) = [0, \beta]_{\rm per}$, then the values $n_x(\tau = 0) = n_x(\tau = \beta) = n_x$ and we can reason in a similar fashion as in the first case above. Then all operators $c_x^+$ and $c_x$ are annihilated after the commutations are performed, yielding the sign $s_x(\omega(\operatorname{supp} Y_i))$, without any change in the state $n$ at time $\tau = 0$. Since all operators $c_x^+$ and $c_x$ corresponding to the concerned $x$ have been cancelled, the value of $S(\tilde\omega)$ does not depend on the state $n_x$ and we may replace it, without any additional change in sign, by $\tilde n_x = g^{(m)}_x$. Iterating the above procedure for all $x$ (chosen in a fixed (say, lexicographic) order) such that $I(x)$ is nonempty, we pass to the configuration $\tilde\omega$ and produce the sign $S(\omega(\operatorname{supp} Y_i)) = \prod_x s_x(\omega(\operatorname{supp} Y_i))$.
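The signs discussed above come from anticommuting fermionic operators past one another. The following minimal sketch evaluates a diagonal matrix element of a product of creation and annihilation operators in the occupation number basis and shows that it takes values in $\{-1, 0, 1\}$. It rests on assumptions that are not part of the text: sites are labelled $0, 1, \dots$ in a fixed order, spin is ignored, and the standard sign convention is used in which an operator at site $x$ picks up $(-1)$ to the number of occupied sites preceding $x$. The names `apply_operator` and `diagonal_element` are hypothetical.

```python
def apply_operator(state, sign, op):
    """Apply one fermionic operator to sign * |state>.

    state: tuple of 0/1 occupation numbers; op: (kind, x) with kind 'c'
    (creation) or 'a' (annihilation) at site x. Returns (new_state, new_sign),
    or None if the result is the zero vector.
    """
    kind, x = op
    if (kind == 'c' and state[x] == 1) or (kind == 'a' and state[x] == 0):
        return None
    # Sign from moving the operator past all occupied sites before x.
    parity = sum(state[:x]) % 2
    new_state = state[:x] + (1 - state[x],) + state[x + 1:]
    return new_state, (-sign if parity else sign)


def diagonal_element(n, ops):
    """Return <n| ops[0] ops[1] ... ops[-1] |n>, a value in {-1, 0, 1}."""
    state, sign = tuple(n), 1
    for op in reversed(ops):  # the right-most operator acts first
        result = apply_operator(state, sign, op)
        if result is None:
            return 0
        state, sign = result
    return sign if state == tuple(n) else 0
```

With two particles on three sites, `n = (1, 1, 0)`, the closed sequence of hops $1 \to 2$, $0 \to 1$, $2 \to 0$, written as `[('c', 0), ('a', 2), ('c', 1), ('a', 0), ('c', 2), ('a', 1)]`, exchanges the two particles and evaluates to $-1$, whereas a single particle hopping back and forth, `diagonal_element((1, 0), [('c', 0), ('a', 1), ('c', 1), ('a', 0)])`, gives $+1$.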

5. Exponential Decay of Contour Activities, Proof of Theorems 3.1–3.4

5.1. Bound on the contour activities $\rho(Y)$. Given the contour representation (4.21), the proof of Theorems 3.1 and 3.2 is an easy exercise in Pirogov–Sinai theory, once a suitable bound on the weights $\rho(Y)$ is established. This is done in this subsection.

Proposition 5.1. Let $\lambda \in \mathbb{R}$, $\tilde\beta > 0$, and $\gamma_Q \ge 1$ be such that

$$(e - 1)\, \tilde\beta\, |\lambda|\, \|t\|_{\gamma_Q} \le 1. \tag{5.1}$$

Then

$$|\rho(Y)| \le e^{-(\tilde\beta e_0 + \tilde\gamma)|\operatorname{supp} Y|}, \tag{5.2}$$

where

$$\tilde\gamma = \min\{\tilde\beta \gamma_{\rm cl},\ \gamma_Q - 1\} - (1 + |\Sigma|) \log 2. \tag{5.3}$$

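Proof. Since $|S(\omega)| = 1$, we get from (4.24) (for $n = 1$) the bound

$$|\rho(Y)| \le e^{-\tilde\beta e_0 |\operatorname{supp} Y|}\, 2^{|\Sigma| |\operatorname{supp} Y|} \times \sum_{X \subset \operatorname{supp} Y}\ \sum_{\substack{\{m_k\} \\ \bigcup_k \operatorname{supp} m_k = X}}\ \prod_{k=1}^{M} \frac{(|\lambda| |t|)^{m_k}}{m_k!}\, \tilde\beta^{|m_k|}\, e^{-\tilde\beta \gamma_{\rm cl} |\operatorname{supp} Y \setminus X|}. \tag{5.4}$$

The sum over $X$ is over all unions $X$ of unit cells in $\operatorname{supp} Y$ (corresponding to the quantum cells on the right-hand side of (4.24)). The factor $2^{|\Sigma| |\operatorname{supp} Y|}$ comes from the sum over occupation numbers $n$, observing that, for a $q$-contour $Y$, the occupation numbers are fixed, $n_{x,\sigma} = g^{(q)}_{x,\sigma}$, whenever $C(x, 1) \cap \operatorname{supp} Y \neq \emptyset$, and the last factor in (5.4) comes from the fact that all cells in $\operatorname{supp} Y$ that are not quantum cells must be classically excited. In a similar manner as in [18], we use the bound

$$\sum_{m_{k,A} = 1}^{\infty} \frac{(|\lambda| \tilde\beta |t_A|)^{m_{k,A}}}{m_{k,A}!} \le (e - 1)\, |\lambda| \tilde\beta |t_A|, \tag{5.5}$$

valid whenever $|\lambda| \tilde\beta |t_A| \le 1$, to get

$$|\rho(Y)| \le e^{-\tilde\beta e_0 |\operatorname{supp} Y|}\, 2^{|\Sigma| |\operatorname{supp} Y|} \times \sum_{X \subset \operatorname{supp} Y} e^{-\tilde\beta \gamma_{\rm cl} |\operatorname{supp} Y \setminus X|} \prod_k \sum_{B_k} \prod_{A \in B_k} (e - 1)\, |\lambda| \tilde\beta |t_A|. \tag{5.6}$$

The sum $\sum_{B_k}$ is over all finite collections $B_k \subset \mathcal{A}_0$ such that $\bigcup_{A \in B_k} A = X_k$, where, for a fixed $k \in \{1, \dots, M\}$, the set $X_k$ is the union of all unit cells $C(x, k)$ contained in $X$. Using now (5.1) we get the bound

$$\begin{aligned}
\sum_{\substack{B_k = \{A_1, \dots, A_\ell\} \\ \bigcup A_i = X_k}}\ \prod_{A_i \in B_k} (e - 1)\, |\lambda| \tilde\beta |t_{A_i}|
&\le e^{-\gamma_Q |X_k|} \sum_{\ell = 1}^{\infty} \frac{1}{\ell!} \prod_{i=1}^{\ell} \sum_{\substack{A_i \in \mathcal{A}_0 \\ A_i \cap X_k \neq \emptyset}} (e - 1)\, |\lambda| \tilde\beta |t_{A_i}|\, e^{\gamma_Q |\operatorname{supp} A_i|} \\
&\le e^{-\gamma_Q |X_k|} \sum_{\ell = 1}^{\infty} \frac{1}{\ell!} \prod_{i=1}^{\ell} \sum_{x \in X_k} \sum_{\substack{A_i \in \mathcal{A}_0 \\ A_i \ni x}} (e - 1)\, |\lambda| \tilde\beta |t_{A_i}|\, e^{\gamma_Q |\operatorname{supp} A_i|} \\
&\le e^{-\gamma_Q |X_k|} \sum_{\ell = 1}^{\infty} \frac{1}{\ell!}\, |X_k|^{\ell} \le e^{-(\gamma_Q - 1)|X_k|}.
\end{aligned} \tag{5.7}$$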

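Since

$$\sum_{X \subset \operatorname{supp} Y} e^{-(\gamma_Q - 1)|X|}\, e^{-\tilde\beta \gamma_{\rm cl} |\operatorname{supp} Y \setminus X|} \le e^{-\min(\tilde\beta \gamma_{\rm cl},\ \gamma_Q - 1)|\operatorname{supp} Y|}\, 2^{|\operatorname{supp} Y|}, \tag{5.8}$$

we finally get (5.2) with $\tilde\gamma$ given by (5.3). □

5.2. Bound on the derivatives $\partial\rho(Y)/\partial\mu_i$.

Proposition 5.2. Let $\lambda \in \mathbb{R}$, $\tilde\beta > 0$, and $\gamma_Q \ge 1$ be such that (5.1) is satisfied. Then

$$\Bigl| \frac{\partial}{\partial\mu_i} \rho(Y) \Bigr| \le \Bigl( \tilde\beta C_0 + \frac{e}{e - 1} \Bigr) |\operatorname{supp} Y|\, e^{-(\tilde\beta e_0 + \tilde\gamma)|Y|}. \tag{5.9}$$

Here $C_0$ is the constant from (3.10) and $\tilde\gamma$ is the constant defined in (5.3).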


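Proof. We start again from the expression (4.24) for $n = 1$ and bound

$$\Bigl| \frac{\partial}{\partial\mu_i} e^{-\tilde\beta E(\omega(\operatorname{supp} Y))} \Bigr| \le \tilde\beta\, \Bigl| \frac{\partial}{\partial\mu_i} E(\omega(\operatorname{supp} Y)) \Bigr|\, e^{-\tilde\beta E(\omega(\operatorname{supp} Y))} \le \tilde\beta C_0 |\operatorname{supp} Y|\, e^{-\tilde\beta E(\omega(\operatorname{supp} Y))} \tag{5.10}$$

with the help of (4.23) and (3.10), as well as

$$\Bigl| \frac{\partial}{\partial\mu_i} \prod_{k=1}^{M} \frac{(\lambda t)^{m_k}}{m_k!} \Bigr| \le \prod_{k=1}^{M} \frac{(|\lambda| |t|)^{m_k}}{m_k!} \sum_{\bar k, \bar A} m_{\bar k, \bar A}\, \Bigl| \frac{\partial}{\partial\mu_i} \log t_{\bar A} \Bigr|. \tag{5.11}$$

Using then (5.5) and

$$\sum_{m_{\bar k, \bar A} = 1}^{\infty} m_{\bar k, \bar A}\, \frac{(|\lambda| \tilde\beta |t_{\bar A}|)^{m_{\bar k, \bar A}}}{m_{\bar k, \bar A}!}\, \Bigl| \frac{\partial t_{\bar A}}{\partial\mu_i} \Bigr|\, \frac{1}{|t_{\bar A}|} = |\lambda| \tilde\beta\, \Bigl| \frac{\partial t_{\bar A}}{\partial\mu_i} \Bigr| \sum_{m_{\bar k, \bar A} = 0}^{\infty} \frac{(|\lambda| \tilde\beta |t_{\bar A}|)^{m_{\bar k, \bar A}}}{m_{\bar k, \bar A}!} \le |\lambda| \tilde\beta e\, \Bigl| \frac{\partial t_{\bar A}}{\partial\mu_i} \Bigr|, \tag{5.12}$$

we get

$$\begin{aligned}
\Bigl| \frac{\partial}{\partial\mu_i} \rho(Y) \Bigr| \le{}& e^{-\tilde\beta e_0 |\operatorname{supp} Y|}\, 2^{|\Sigma| |\operatorname{supp} Y|} \sum_{X \subset \operatorname{supp} Y} e^{-\tilde\beta \gamma_{\rm cl} |\operatorname{supp} Y \setminus X|} \\
&\times \Bigl\{ \tilde\beta C_0 |\operatorname{supp} Y| \prod_k \sum_{B_k} \prod_{A \in B_k} (e - 1)\, |\lambda| \tilde\beta |t_A| \\
&\quad + \sum_{\substack{\bar k, \bar A \\ \bar A \cap X_{\bar k} \neq \emptyset}} |\lambda| \tilde\beta e\, \Bigl| \frac{\partial t_{\bar A}}{\partial\mu_i} \Bigr| \sum_{\substack{B_{\bar k} \\ \bar A \notin B_{\bar k}}} \prod_{A \in B_{\bar k}} (e - 1)\, |\lambda| \tilde\beta |t_A| \prod_{k \neq \bar k} \sum_{B_k} \prod_{A \in B_k} (e - 1)\, |\lambda| \tilde\beta |t_A| \Bigr\}
\end{aligned} \tag{5.13}$$

with the sum on the last line running through all $\bar A$ and $B_{\bar k}$ such that the union of $\bar A$ with all $A$ in $B_{\bar k}$ is $X_{\bar k}$. Hence

$$\Bigl| \frac{\partial}{\partial\mu_i} \rho(Y) \Bigr| \le e^{-\tilde\beta e_0 |\operatorname{supp} Y|}\, 2^{|\Sigma| |\operatorname{supp} Y|} \times \sum_{X \subset \operatorname{supp} Y} e^{-\tilde\beta \gamma_{\rm cl} |\operatorname{supp} Y \setminus X|} \Bigl\{ \tilde\beta C_0 |\operatorname{supp} Y| + |X| \frac{e}{e - 1} \Bigr\}\, e^{-(\gamma_Q - 1)|X|}. \tag{5.14}$$

The rest of the proof then follows the same argument as above in the proof of Proposition 5.1. □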
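The elementary inequalities behind (5.5), (5.8) and (5.12) can be checked numerically. The sketch below is only a sanity check under stated assumptions, not part of the proof: the scalar `x` stands for $|\lambda| \tilde\beta |t_A|$, the numbers `a` and `b` stand for $\gamma_Q - 1$ and $\tilde\beta \gamma_{\rm cl}$, `size` stands for $|\operatorname{supp} Y|$, and the function names are hypothetical. It uses $\sum_{m \ge 1} x^m / m! = e^x - 1 \le (e - 1) x$ for $0 \le x \le 1$ (convexity of $e^x$), $\sum_{m \ge 1} m\, x^m / m! = x e^x \le e x$ for $0 \le x \le 1$, and the fact that the sum over subsets in (5.8) equals $(e^{-a} + e^{-b})^{\rm size}$.

```python
import math


def tail_exp_sum(x, terms=60):
    """Sum_{m>=1} x**m / m!, truncated after `terms` terms (left side of (5.5))."""
    return sum(x**m / math.factorial(m) for m in range(1, terms + 1))


def weighted_tail_sum(x, terms=60):
    """Sum_{m>=1} m * x**m / m!, which equals x * exp(x) (compare (5.12))."""
    return sum(m * x**m / math.factorial(m) for m in range(1, terms + 1))


def subset_sum(size, a, b):
    """Sum over subsets X of a set of `size` cells of exp(-a|X|) * exp(-b|complement of X|)."""
    return sum(math.comb(size, j) * math.exp(-a * j - b * (size - j))
               for j in range(size + 1))


def check_bounds(xs=(0.1, 0.5, 1.0), sizes=(1, 5, 12), a=2.0, b=3.5, tol=1e-12):
    """Return True if the three elementary bounds hold on the sample values."""
    ok = all(tail_exp_sum(x) <= (math.e - 1) * x + tol for x in xs)
    ok = ok and all(weighted_tail_sum(x) <= math.e * x + tol for x in xs)
    ok = ok and all(
        subset_sum(s, a, b) <= math.exp(-min(a, b) * s) * 2**s * (1 + tol)
        for s in sizes
    )
    return ok
```

Calling `check_bounds()` with the default sample values returns `True`; for `x > 1` the first bound fails, which is why (5.5) is stated only under the condition $|\lambda| \tilde\beta |t_A| \le 1$.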


5.3. Proof of Theorems 3.1 and 3.2. Given the representation (4.21) and the bounds of Propositions 5.1 and 5.2, the proof of Theorem 3.1 i) and v) is essentially identical to the proof of Theorem 2.1 i) and v) in [18]. Actually, it is an almost standard application of Pirogov–Sinai theory, with two modifications: the fact that the contour weights $\rho(Y)$ are in general not positive, and the fact that (4.21) describes a contour model in a finite slab, see [18] for the details. The constants can be chosen as follows. Taking any sufficiently large $\gamma$, we put $\beta_0 = \gamma / \gamma_{\rm cl}$ and assume that $\lambda$ fulfills (3.28). Taking now $\gamma_Q = \gamma$ and $\tilde\beta \in [\beta_0, 2\beta_0)$, the condition (5.1) is satisfied and we can infer that the bounds (5.2) and (5.9) are fulfilled with $\tilde\gamma = \gamma - 1 - (1 + |\Sigma|) \log 2$. Finally, whenever $\beta \ge \beta_0$, we choose $M \in \mathbb{N}$ so that $\tilde\beta = \beta / M \in [\beta_0, 2\beta_0)$.

In order to prove the remaining parts of Theorem 3.1, we need a representation of the form (4.21) for expectation values of local observables. By linearity and the fact that a local observable is a finite sum of even monomials in the creation and annihilation operators, we may restrict ourselves to local observables that are of the form

$$\Psi = h_{A(\Psi)}, \qquad \Phi = h_{A(\Phi)}. \tag{5.15}$$

Rewriting the expectation value of a local observable $\Psi$ as

$$\langle \Psi \rangle_{q,\Lambda} = \frac{\operatorname{Tr}_{\mathcal{H}_\Lambda}(\Psi e^{-\beta H_{q,\Lambda}})}{\operatorname{Tr}_{\mathcal{H}_\Lambda}(e^{-\beta H_{q,\Lambda}})} = \frac{\operatorname{Tr}_{\mathcal{H}_\Lambda}(\Psi T^M)}{\operatorname{Tr}_{\mathcal{H}_\Lambda}(T^M)} = \frac{Z^{\Psi}_{q,\Lambda}}{Z_{q,\Lambda}}, \tag{5.16}$$

we now derive a contour representation for the modified partition function $Z^{\Psi}_{q,\Lambda}$. Retracing the steps leading to representation (4.15), we get the expression

$$Z^{\Psi}_{q,\Lambda} = \sum_n \sum_{\{m_k\}} \int \prod_{k=1}^{M} \frac{(-\lambda t)^{m_k}}{m_k!}\, d\tau_k^{m_k}\, \exp\Bigl\{ - \sum_{x \in \Lambda} \int_0^\beta \Phi_x(n(\tau))\, d\tau \Bigr\} \times S(n, \{\tau_k, m_k\}; \Psi), \tag{5.17}$$

where

$$S(n, \{\tau_k, m_k\}; \Psi) = \langle n|\, R(\tau_M, m_M) \cdots R(\tau_1, m_1)\, \Psi\, |n\rangle. \tag{5.18}$$

In order to define the contours corresponding to a configuration $\omega = \{n, \{\tau_k, m_k\}\}$ we then introduce, in addition to the set of excited cells $D(\omega)$, the $(d+1)$-dimensional support of $\Psi$ as

$$D(\Psi) := \bigcup_{x \in \operatorname{supp} \Psi} C(x, 1), \tag{5.19}$$

where we localized the observable $\Psi$, by definition, in the first time slice. Considering all cells in $D(\omega) \cup D(\Psi)$ as excited, we first take the union of all connected components of $D(\omega)$ that are connected to $D(\Psi)$, and then define the set $\operatorname{supp} Y_\Psi$ as the union of this set with $D(\Psi)$. The contours corresponding to the configuration $\omega$ are defined by taking the set $\operatorname{supp} Y_\Psi$, and the remaining components of $D(\omega)$, denoted by $\operatorname{supp} Y_1, \dots, \operatorname{supp} Y_n$, as their supports. Since the cells in $D(\Psi)$ have to be considered as excited as well, the definition of the ground state regions $V_m$ changes slightly: by definition, they now do not contain the cells in $D(\Psi)$. With these definitions, we get the contour representation

$$Z^{\Psi}_{q,\Lambda} = \sum_{Y_\Psi} \sum_{n=0}^{\infty} \sum_{\{Y_1, \dots, Y_n\}} e^{-\sum_m \tilde\beta |V_m| e_m}\, \rho(Y_\Psi, Y_1, \dots, Y_n), \tag{5.20}$$


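with

$$\rho(Y_\Psi, Y_1, \dots, Y_n) = \sum_n \sum_{\{m_k\}} \int \prod_{k=1}^{M} \frac{(-\lambda t)^{m_k}}{m_k!}\, d\tau_k^{m_k}\, \chi_{Y_\Psi, Y_1, \dots, Y_n}(\omega) \times S(\omega; \Psi)\, e^{-\tilde\beta E(\omega(\operatorname{supp} Y_\Psi))} \prod_{i=1}^{n} e^{-\tilde\beta E(\omega(\operatorname{supp} Y_i))}, \tag{5.21}$$

where $\omega = \{n, \{\tau_k, m_k\}\}$, and $S(\omega; \Psi) = S(n, \{\tau_k, m_k\}; \Psi)$ is given by (5.18). Since the observable $\Psi$ is of the form (5.15), the factorization proof now goes through without modifications, leading to the representation

$$Z^{\Psi}_{q,\Lambda} = \sum_{Y_\Psi} \sum_{n=0}^{\infty} \sum_{\{Y_1, \dots, Y_n\}} e^{-\sum_m \tilde\beta |V_m| e_m}\, \rho(Y_\Psi) \prod_{i=1}^{n} \rho(Y_i), \tag{5.22}$$

with $\rho(Y_i)$ defined as before, and $\rho(Y_\Psi)$ defined by

$$\rho(Y_\Psi) = \sum_n \sum_{\{m_k\}} \int \prod_{k=1}^{M} \frac{(-\lambda t)^{m_k}}{m_k!}\, d\tau_k^{m_k}\, \chi_{Y_\Psi}(\omega)\, S(\omega; \Psi)\, e^{-\tilde\beta E(\omega(\operatorname{supp} Y_\Psi))}. \tag{5.23}$$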

Given the contour representation (5.22), we can now proceed as in [18] to complete the proof of Theorem 3.1. In the same way, Theorem 3.2 follows from the corresponding representation for the modified partition function $Z^{\Psi}_{{\rm per},\Lambda} = \langle \Psi \rangle_{{\rm per},\Lambda}\, Z_{{\rm per},\Lambda}$.

5.4. Proof of Lemma 3.3. Given the results of Sects. 4 and 5, the proof of Lemma 3.3 is almost a textbook exercise. We therefore only indicate the main steps, and leave the details to the reader.

Starting with the partition function $Z^{\beta,{\rm np}}_{q,\Lambda}$, we note that it has a representation of the form (4.15), with the only difference that the sum over $n$ is replaced by the single term $n = g^{(q)}_\Lambda$. Represented as the partition function of a contour model, $Z^{\beta,{\rm np}}_{q,\Lambda}$ is then given as a sum over sets of contours in a volume $V(\Lambda) \subset \mathbb{R}^{d+1}$,

$$V(\Lambda) = \{y \in \mathbb{R}^d \mid \operatorname{dist}(y, \Lambda) \le \tfrac{1}{2}\} \times [0, \beta], \tag{5.24}$$

with boundary condition $q$ on $\partial V(\Lambda)$.

The partition function $Z^{\beta,{\rm np}}_{q,\Lambda}$ can be analysed by standard Pirogov–Sinai theory as developed in [20–23]. We follow [22,23], with a slight variant in the definition of truncated contour models. Namely, for a contour $Y$ with support $\operatorname{supp} Y \subset \mathbb{R}^{d+1}$, we define $\delta(Y)$ as the diameter of the projection of $\operatorname{supp} Y$ on $\mathbb{R}^d$, and then proceed by induction on $\delta(Y)$, see [18], Eqs. (5.8)–(5.10). Denoting the corresponding truncated partition functions by $\bar Z^{\rm np}_q(V(\Lambda))$, we define the truncated free energies

$$\bar f_q(\mu) = - \lim_{V \to \mathbb{R}^{d+1}} \frac{\log \bar Z^{\rm np}_q(V)}{|V|}, \tag{5.25}$$

where $|V|$ denotes the euclidean volume of $V$ (note that $|V|$ is nothing but the number of elementary cells $C(x, k)$ in $V$ multiplied by $\tilde\beta$).


As usual, the untruncated partition functions $Z^{\beta,{\rm np}}_{q,\Lambda}$ and the corresponding truncated partition functions $\bar Z^{\rm np}_q(V(\Lambda))$ are identical whenever $a_q(\mu) = 0$, where $a_q(\mu)$ is defined as

$$a_q(\mu) = \bar f_q(\mu) - \min_m \bar f_m(\mu). \tag{5.26}$$

For $a_q(\mu) = 0$, the partition function $Z^{\beta,{\rm np}}_{q,\Lambda}$ can therefore be analyzed by a convergent cluster expansion, giving a representation for

$$\log Z^{\beta,{\rm np}}_{q,\Lambda} + \bar f_q(\mu)\, |V(\Lambda)| = \log Z^{\beta,{\rm np}}_{q,\Lambda} + \bar f_q(\mu)\, \beta |\Lambda|$$

in terms of clusters connected to the boundary $\partial V(\Lambda)$. Defining $\partial_i V(\Lambda)$ as the union over all faces in $\partial V(\Lambda)$ that are orthogonal to the direction $i$, and recalling that an elementary cell $C(x, k)$ has extension $\tilde\beta$ in the “time direction”, we therefore get the bound

$$\log Z^{\beta,{\rm np}}_{q,\Lambda} = - \bar f_q(\mu)\, \beta |\Lambda| + O\Bigl( |\partial_0 V(\Lambda)| + \frac{1}{\tilde\beta} \sum_{i=1}^{d} |\partial_i V(\Lambda)| \Bigr) = - \bar f_q(\mu)\, \beta |\Lambda| + O\Bigl( |\Lambda| + \frac{\beta}{\tilde\beta} |\partial\Lambda| \Bigr). \tag{5.27}$$

In order to complete the proof of (3.46), we need a relation between the truncated free energies $\bar f_q(\mu)$ introduced above and the truncated free energies $f_q(\mu, \beta)$ of the model on the torus $\mathbb{T}$. To this end, we note that the truncated activity of a contour $Y$ with $\operatorname{supp} Y \subset \mathbb{T}$ is the same for both truncated models, as long as the support of $Y$ does not wind around the torus $\mathbb{T}$. The cluster expansions for $\tilde\beta \bar f_q(\mu)$ and $\tilde\beta f_q(\mu, \beta)$ therefore only differ by terms involving clusters winding around $\mathbb{T}$ in the time direction. As a consequence,

$$\tilde\beta f_q(\mu, \beta) = \tilde\beta \bar f_q(\mu) + O(e^{-\alpha\gamma(\beta/\tilde\beta)}), \tag{5.28}$$

where $\gamma$ and $\alpha > 0$ are the constants from Theorem 3.1. From (5.28) we get $f_q(\mu) \equiv \lim_{\beta \to \infty} f_q(\mu, \beta) = \bar f_q(\mu)$ and as a consequence

$$a_q(\mu) = \lim_{\beta \to \infty} a_q(\mu, \beta, \lambda). \tag{5.29}$$

Observing finally that $\tilde\beta \in [\beta_0, 2\beta_0)$, see the proof of Theorem 3.1 above, the bound (3.46) follows from (5.27).

As for the proof of (3.47), we note that the above methods also give a convergent cluster expansion for $\langle \Psi \rangle^{\beta,{\rm np}}_{q,\Lambda}$ if $a_q(\mu) = 0$. Comparing this cluster expansion to the corresponding cluster expansion in the thermodynamic limit $V(\Lambda) \to \mathbb{R}^{d+1}$, we get

$$\langle \Psi \rangle^{\beta,{\rm np}}_{q,\Lambda} = \lim_{\substack{\Lambda \to \mathbb{Z}^d \\ \beta \to \infty}} \langle \Psi \rangle^{\beta,{\rm np}}_{q,\Lambda} + O(e^{-\alpha\gamma \min\{\operatorname{dist}(\operatorname{supp} \Psi, \partial\Lambda),\ \beta/2\tilde\beta\}}), \tag{5.30}$$

provided $a_q(\mu) = 0$. In order to complete the proof, we need to control the limit in (3.31′), showing that it is identical to the limit in the right-hand side of (5.30). To this end, we note that the condition $a_q(\mu) = 0$ implies that

$$\tilde\beta\, a_q(\mu, \beta, \lambda) \le O(e^{-\alpha\gamma(\beta/\tilde\beta)}). \tag{5.31}$$


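Standard Pirogov–Sinai theory, here in the form derived in [18], on the other hand, gives that $Z_{q,\Lambda}$ and $\langle \Psi \rangle^{\beta}_{q,\Lambda}$ can be analysed by a convergent cluster expansion if

$$\tilde\beta\, a_q(\mu, \beta, \lambda)\, \operatorname{diam}(\Lambda) \le O(1). \tag{5.32}$$

The limits in (3.30′) and (3.31′) can therefore be analysed by a convergent expansion. Comparing the resulting expansion for $\langle \Psi \rangle_q$ to that for $\langle \Psi \rangle^{\beta,{\rm np}}_{q,\Lambda}$, we obtain the desired bound (3.47).

5.5. Proof of Theorem 3.4. Let $\beta \ge \beta_0$, where $\beta_0$ is the constant from Theorem 3.1. Using (3.51), (3.52), Lemma 3.3, and the fact that the norm and support of $Q_{x,\Lambda}$ are uniformly bounded in $\Lambda$, we get

$$\Bigl| \frac{1}{|\Lambda|} \langle Q_\Lambda \rangle_q - \rho^{(q)}_\Lambda \Bigr| = \frac{1}{|\Lambda|} \Bigl| \langle Q_\Lambda \rangle_q - \langle Q_\Lambda \rangle^{\beta,{\rm np}}_{q,\Lambda} \Bigr| = \frac{1}{|\Lambda|} \Bigl| \sum_{x \in \Lambda} \bigl( \langle Q_{x,\Lambda} \rangle_q - \langle Q_{x,\Lambda} \rangle^{\beta,{\rm np}}_{q,\Lambda} \bigr) \Bigr| \le \frac{C}{|\Lambda|} \sum_{x \in \Lambda} \exp\{-\alpha\gamma \min\{\operatorname{dist}(\operatorname{supp} Q_{x,\Lambda}, \partial\Lambda),\ \beta/\beta_0\}\}. \tag{5.33}$$

Taking the limit $\beta \to \infty$, this gives

$$\Bigl| \frac{1}{|\Lambda|} \langle Q_\Lambda \rangle_q - \rho^{(q)}_\Lambda \Bigr| \le \frac{C}{|\Lambda|} \sum_{x \in \Lambda} \exp\{-\alpha\gamma \operatorname{dist}(\operatorname{supp} Q_{x,\Lambda}, \partial\Lambda)\} \le O\Bigl( \frac{|\partial\Lambda|}{|\Lambda|} \Bigr), \tag{5.34}$$

which in turn implies the bound (3.55).

In order to prove (3.56), we have to bound the difference of $\langle Q_{x,\Lambda} \rangle_q$ and $\langle Q_{x,\Lambda} \rangle^{\beta}_q$. Since $q$ is stable for all $\beta \ge \tilde\beta_0$, both $\langle Q_{x,\Lambda} \rangle_q$ and $\langle Q_{x,\Lambda} \rangle^{\beta}_q$ can be analysed by a convergent cluster expansion. Comparing these expansions, one obtains a representation for $\langle Q_{x,\Lambda} \rangle_q - \langle Q_{x,\Lambda} \rangle^{\beta}_q$ that only involves clusters which either wind around the torus $\mathbb{T}$ in the time direction or are contained in the infinite volume $\mathbb{R}^{d+1}$ and “do not fit” into the torus $\mathbb{T}$. In either case, one gets only a contribution of the order $O(e^{-(\beta/\beta_0)\alpha\gamma})$, yielding the bound

$$\bigl| \langle Q_{x,\Lambda} \rangle_q - \langle Q_{x,\Lambda} \rangle^{\beta}_q \bigr| \le C e^{-(\beta/\beta_0)\alpha\gamma}, \tag{5.35}$$

which in turn implies the bound (3.56).

6. Application to the Extended Hubbard Model: Proof of Theorem 2.1

The claims i) and iii) are a straightforward corollary of Theorem 3.1. For the choice of () constants we notice that γcl ≥ c(d) everywhere in S{0,2} , where c(d) is a strictly positive constant. As a consequence, β0 ∼ 1 and λ0 ∼ . The bound |t| < C2 corresponds to (3.28) (with t replacing λ and 2de8γ replacing ||t||γ ). The long range order expressed in (2.9) follows from the bound (3.33) and the staggered order of the ground states of $H^{(0)}$ (see [17] for a detailed discussion of the classical states of $H^{(0)}$). Using Theorem 3.4 for the (quantum) density $\rho(\beta)$ defined in (2.4) and noticing that the density of the classical ground state $\rho_{\rm class}$ is actually constant throughout the region $S^{(0)}_{\{0,2\}}$, $\rho_{\rm class} = 1$, we get the claim iv).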


To prove ii), we first show that $\langle S^3_x \rangle^{\beta}_m = 0$. Taking into account Theorem 3.1 ii), it is enough to show that for every $\Lambda(L)$ with even $L$, one has

$$\operatorname{Tr}_{\mathcal{H}_{\Lambda(L)}}\bigl( S^3_x\, e^{-\beta H_{m,\Lambda(L)}} \bigr) = 0, \tag{6.1}$$

where $H_{m,\Lambda}$ is defined as in (3.21) with $g^{(m)}$ being the corresponding $\{0, 2\}$ staggered ground state configuration. Using (2.7) and expressing the trace in terms of the basis $|n\rangle$ of occupation numbers $n : \Lambda \times \Sigma \to \{0, 1\}$, we will show that

$$\sum_n \langle n|\, (\hat n_{x,\uparrow} - \hat n_{x,\downarrow})\, e^{-\beta H_{m,\Lambda(L)}}\, |n\rangle = 0. \tag{6.2}$$

Indeed, taking into account that

$$\sum_n \langle n|\, (\hat n_{x,\uparrow} - \hat n_{x,\downarrow})\, e^{-\beta H_{m,\Lambda(L)}}\, |n\rangle = \sum_n \langle n|\, e^{-\beta H_{m,\Lambda(L)}}\, |n\rangle\, (n_{x,\uparrow} - n_{x,\downarrow}) \tag{6.3}$$

and that the matrix element $\langle n|\, e^{-\beta H_{m,\Lambda(L)}}\, |n\rangle$ is symmetric under the overall spin flip $n \to \tilde n$, $\tilde n_{x,\uparrow} = n_{x,\downarrow}$ and $\tilde n_{x,\downarrow} = n_{x,\uparrow}$, we get (6.1).

The Hamiltonian above is invariant under rotations and has actually an identical expression in terms of the creation and annihilation operators of the electron with up and down spin with respect to, say, the 1-axis. To get $\langle S^1_x \rangle^{\beta}_m = 0$ and $\langle S^2_x \rangle^{\beta}_m = 0$, it is therefore enough to repeat the above argument in the corresponding occupation number bases.

References

1. Hubbard, J.: Electron Correlations in Narrow Energy Bands. Proc. Roy. Soc. London A 276, 238–257 (1963)
2. Gutzwiller, M.C.: The Effect of Correlation on the Ferromagnetism of Transition metals. Phys. Rev. Lett. 10, 159–162 (1963)
3. Kanamori, J.: Electron Correlation and Ferromagnetism in Transition Metals. Prog. Theor. Phys. 30, 275–289 (1963)
4. Nagaoka, Y.: Ferromagnetism in a Narrow, Almost Half-Filled s Band. Phys. Rev. 147, 392–405 (1960)
5. Anderson, P.W.: Theory of Magnetic Exchange Interactions: Exchange in Insulators and Semiconductors. Solid State Phys. 14, 99–214 (1966)
6. Cyrot, M.: The Hubbard Hamiltonian. Physica (Amsterdam) 91B, 141–150 (1977)
7. Hubbard, J.: Electron correlations in narrow energy bands III. An improved solution. Proc. Roy. Soc. London A 281, 401–419 (1964)
8. Mott, N.F.: Metal-Insulator Transition. Rev. Mod. Phys. 40, 677–683 (1968)
9. Brinkmann, W.F., Rice, T.M.: Application of Gutzwiller’s Variational Method to Metal-Insulator Transition. Phys. Rev. B 2, 4302–4304 (1970)
10. Anderson, P.W.: The Resonating Valence Bond State in La2CuO4 and Superconductivity. Science 235, 1196–1198 (1987)
11. Ruckenstein, A.E., Hirschfeld, P.J. and Appel, J.: Mean-field theory of high-Tc superconductivity: The supercharge mechanism. Phys. Rev. B 36, 857–860 (1987)
12. Bari, R.A.: Effects of Short-Range Interactions on Electron-Charge Ordering and Lattice Distortions in the Localized State. Phys. Rev. B 3, 2662–2670 (1971)
13. Wolff, U.: Saddle point mean field calculation in the Hubbard model. Nucl. Phys. B225 [FS9], 391–408 (1983)
14. Micnas, R., Robaszkiewicz, S. and Chiao, K.A.: Multicritical Behaviour of the Extended Hubbard Model in the Zero-Bandwidth Limit. Phys. Rev. B 29, 2784–2789 (1984)
15. van Dongen, P.G.: Thermodynamics of the Extended Hubbard Model in High Dimensions. Phys. Rev. Lett. 67, 757–760 (1991)
16. Jędrzejewski, J.: Phase Diagrams of Extended Hubbard Models in the Atomic Limit. Physica A 205, 702–717 (1994)


17. Borgs, C., Jędrzejewski, J. and Kotecký, R.: The Staggered Charge-Order Phase of the Low-Temperature Extended Hubbard Model in the Atomic Limit. J. Phys. A 29, 733–747 (1996)
18. Borgs, C., Kotecký, R. and Ueltschi, D.: Low Temperature Phase Diagrams for Quantum Perturbations of Classical Spin Systems. Commun. Math. Phys. 181, 409–446 (1996)
19. Datta, N., Fernández, R. and Fröhlich, J.: Low-Temperature Phase Diagrams of Quantum Lattice Systems. I. Stability for Quantum Perturbations of Classical Systems with Finitely-Many Ground States. J. Stat. Phys. 84, 455–534 (1996)
20. Pirogov, S. and Sinai, Ya.G.: Phase diagrams of classical lattice systems. Theor. Math. Phys. 25, 1185–1192 (1975); 26, 39–49 (1976)
21. Zahradník, M.: An Alternate Version of Pirogov–Sinai Theory. Commun. Math. Phys. 93, 559–581 (1984)
22. Borgs, C. and Imbrie, J.: A Unified Approach to Phase Diagrams in Field Theory and Statistical Mechanics. Commun. Math. Phys. 123, 305–328 (1989)
23. Borgs, C. and Kotecký, R.: A Rigorous Theory of Finite-Size Scaling at First-Order Phase Transitions. J. Stat. Phys. 61, 79–119 (1990)
24. Seiler, E. and Simon, B.: Nelson’s symmetry and all that in Yukawa and (φ^4)_3 theories. Ann. Phys. 97, 470–518 (1990)
25. Datta, N., Fernández, R., Fröhlich, J. and Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. II. Convergent perturbation expansions and stability in systems with infinite degeneracy. Helv. Phys. Acta 69, 752–820 (1996)
26. Fröhlich, J. and Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. III. Examples. Helv. Phys. Acta 69, 821–849 (1996)
27. Kotecký, R. and Ueltschi, D.: Effective interactions due to quantum fluctuations. Commun. Math. Phys. 206, 289–335 (1999)
28. Datta, N., Messager, A. and Nachtergaele, B.: Rigidity of interfaces in the Falicov–Kimball model. Preprint, mp-arc 98-267

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 605 – 622 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Absolutely Continuous Invariant Measures for Piecewise Real-Analytic Expanding Maps on the Plane Masato Tsujii Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan. E-mail: [email protected] Received: 5 June 1998 / Accepted: 11 May 1999

Abstract: We prove the existence of absolutely continuous invariant measures for piecewise real-analytic expanding maps on bounded regions in the plane.

1. Introduction

Expanding properties of dynamical systems give rise to chaotic behavior of the orbits. On the other hand, they often lead to good ergodic properties such as the existence of absolutely continuous invariant measures. One typical example is the folklore theorem [12], which shows the existence of a smooth ergodic measure for every expanding $C^2$ self-map of a closed manifold. Hence one point of interest in the study of chaotic dynamical systems is the relation between expanding properties and the ergodic properties they produce.

Lasota and Yorke showed, in their famous work [11], the existence of absolutely continuous invariant measures for piecewise $C^2$ expanding maps on intervals. They made use of the Perron–Frobenius operator and functions of bounded variation, and their idea has been used extensively in the study of one-dimensional dynamical systems.

This paper concerns the generalization of their result to higher dimensions. Though it is natural to expect similar results, it has turned out that things are not simple in higher dimensions. In fact, at present, we do not know whether piecewise $C^2$ expanding maps on bounded regions in higher-dimensional Euclidean space always have absolutely continuous invariant measures. The main difficulty in higher dimensions is that the partition of the domain into the regions where an iteration of the map is smooth can be very complicated.

Gerhard Keller treated piecewise $C^2$ expanding maps on bounded regions in the plane in his thesis [7,8] and gave some criteria for the existence of absolutely continuous invariant measures. The most effective result we have so far is that of Góra and Boyarski [4], which gives a lower bound for the minimum expansion rate that assures the existence of absolutely continuous invariant measures. Their result is valid in arbitrary dimension.


But their lower bound depends on the minimal angle on the boundaries of the regions in the partition associated to the map. See [1] for a modification of their result.

In this article, we consider the problem for piecewise real-analytic maps on bounded regions in the plane. (We will give the definition of piecewise real-analytic maps in the next section.) The real-analytic property somewhat relaxes the difficulty mentioned above. In fact, we can prove the following theorem as the main result of this paper.

Theorem 1. An absolutely continuous invariant finite measure exists for every piecewise real-analytic expanding map on a bounded region in the plane.

This result improves a theorem of Keller in his thesis [7,8], which gives the same conclusion under the additional assumption that the map is piecewise conformal. Actually we will prove the so-called Lasota–Yorke type inequality for some iterations of piecewise real-analytic expanding maps. It is known that many other properties of the maps can be derived from that kind of inequality. We will mention some of them in the appendix.

The author learned from Gerhard Keller that Jérôme Buzzi (C.N.R.S., Institut de Mathématiques de Luminy) obtained a similar result [2] while the author was preparing the manuscript of this paper.

2. Piecewise Real-Analytic Map

We call a map $c : [a, b] \to \mathbb{R}^2$ a real-analytic curve if it is the restriction of a real-analytic map defined on a neighborhood of $[a, b]$ and satisfies $c'(t) \neq 0$ for $t \in [a, b]$. In what follows, we will assume

$$\|c'(t)\| \equiv 1 \quad \text{for } t \in [a, b] \tag{1}$$

for real-analytic curves, by a real-analytic change of the variable $t$. Also we will denote the image of a real-analytic curve $c : [a, b] \to \mathbb{R}^2$ by the same symbol $c$, by abuse of notation. A continuous map $c : [a, b] \to \mathbb{R}^2$ is called a piecewise real-analytic curve if there is a sequence $a = \xi_0 < \xi_1 < \xi_2 < \cdots < \xi_n = b$ such that the restrictions $c|_{[\xi_i, \xi_{i+1}]}$, $0 \le i < n$, are real-analytic curves.

Let $D$ be a region in the plane $\mathbb{R}^2$ whose boundary consists of finitely many simple closed piecewise real-analytic curves. We consider a finite (quasi-)partition $\xi = \{D_i\}_{i=1}^k$ of the domain $D$ such that
• $D_i \subset D$ is a region whose boundary is a finite union of simple closed piecewise real-analytic curves,
• $D_i \cap D_j = \emptyset$ if $i \neq j$, and
• $\bigcup_{i=1}^k \overline{D_i} = \overline{D}$, where $\overline{D}$ and $\overline{D_i}$ denote the closures of $D$ and $D_i$ respectively.

We call such a partition a real-analytic partition of $D$. We denote $E = \bigcup_{i=1}^k \partial D_i = \overline{D} - \bigcup_{i=1}^k D_i$. For a real-analytic partition $\xi = \{D_i\}_{i=1}^k$, we can choose a finite set of real-analytic curves $\{\gamma_i : [a_i, b_i] \to E \subset \mathbb{R}^2\}_{i=1}^m$ in $E$ with the following properties:
• each $\gamma_i$ is a simple curve, that is, has no multiple point,
• the boundary of each region $D_j$, $1 \le j \le k$, is a union of the images of some $\gamma_i$'s, and


• the images of the curves $\gamma_i$, $1 \le i \le m$, except their end points are mutually disjoint, that is, $\gamma_i(t) \neq \gamma_j(s)$ if $i \neq j$, $t \in (a_i, b_i)$ and $s \in (a_j, b_j)$.

We call these curves the dividing curves of the partition $\xi$. Note that there are only finitely many points that belong to more than two dividing curves.

A map $f : D \to D$ is called a piecewise real-analytic map on $D$ if there is a real-analytic partition $\xi = \{D_i\}_{i=1}^k$ of $D$ as above such that each restriction $f|_{D_i}$ of $f$ to $D_i$, $1 \le i \le k$, can be extended to a neighborhood of $\overline{D_i}$ as a real-analytic map. We will denote by $f_{D_i}$ the real-analytic extension of $f|_{D_i}$ to a neighborhood of $\overline{D_i}$. For a tangent vector $v$ at $x \in D - E$, we define its expansion rate $\rho(v, f)$ by

$$\rho(v, f) = \frac{\|Df(v)\|}{\|v\|}.$$

The minimum expansion rate $\rho(f)$ of the map $f$ is the infimum of the expansion rate over all non-zero vectors at all points in $D - E$. If $\rho(f) > 1$ for a piecewise real-analytic map $f$, the map $f$ is called a piecewise real-analytic expanding map. We will fix a piecewise real-analytic expanding map $f$, the partition $\xi = \{D_i\}_{i=1}^k$ and the dividing curves $\{\gamma_i : [a_i, b_i] \to \mathbb{R}^2\}_{i=1}^m$ throughout this paper.

An important consequence of the above definitions is the fact that the iterations $f^n$ of a piecewise real-analytic map $f$ are also piecewise real-analytic maps. We will use this fact repeatedly in the proof of Theorem 1.

Remark 2. Iterations of piecewise $C^r$ maps with $r \le \infty$ are not necessarily piecewise $C^r$ maps since the partition associated to them may have infinitely many connected components. We refer to [14] for examples of piecewise $C^r$ expanding maps with $r < \infty$ that have singular ergodic properties.

3. Germs of Real-Analytic Curves

Let $p$ be a point in the plane $\mathbb{R}^2$, and let $c_i : [0, \epsilon_i] \to \mathbb{R}^2$, $i = 1, 2$, be two real-analytic curves satisfying $c_i(0) = p$ and (1). We say that these two curves give the same germ at $p$ if $c_1(t) = c_2(t)$ for $0 \le t < \min\{\epsilon_1, \epsilon_2\}$. This is an equivalence relation between real-analytic curves $c$ satisfying $c(0) = p$ and (1). The equivalence classes are called germs of real-analytic curves at $p$.

We say that two open subsets $U_1$ and $U_2$ of the plane $\mathbb{R}^2$ give the same germ at $p$ if there exists $\delta > 0$ such that $U_1 \cap B(p, \delta) = U_2 \cap B(p, \delta)$, where $B(p, \delta) = \{x \in \mathbb{R}^2 ; \|x - p\| < \delta\}$. This is also an equivalence relation. We call the equivalence classes germs of open subsets at $p$.

Let $\beta_1$ and $\beta_2$ be distinct germs of real-analytic curves at $p$, and let $b_i : [0, \epsilon_i] \to \mathbb{R}^2$, $i = 1, 2$, be simple real-analytic curves that represent $\beta_i$ respectively. If $\delta > 0$ is sufficiently small, the open set $B(p, \delta) \setminus (b_1 \cup b_2)$ consists of two connected components.
The germ of an open subset represented by the connected component of $B(p, \delta) \setminus (b_1 \cup b_2)$ that is located in the counterclockwise direction from the curve $b_1$ is called the region between $\beta_1$ and $\beta_2$. From this definition, the region $U$ between $\beta_1$ and $\beta_2$ and that between $\beta_2$ and $\beta_1$ are complementary. The germs of real-analytic curves $\beta_1$ and $\beta_2$ are called the boundary curves of the region $U$.

Let $\mathrm{Angle}_1(\beta_1, \beta_2) \in [0, 2\pi]$ be the angle that is formed by the region between $\beta_1$ and $\beta_2$ at $p$. So we have $\mathrm{Angle}_1(\beta_1, \beta_2) = 2\pi - \mathrm{Angle}_1(\beta_2, \beta_1)$. If $\mathrm{Angle}_1(\beta_1, \beta_2) \neq 0$ we define $\mathrm{Ord}(\beta_1, \beta_2) = 1$. On the other hand, if $\mathrm{Angle}_1(\beta_1, \beta_2) = 0$, we define $\mathrm{Ord}(\beta_1, \beta_2)$ as the contact order of the two germs of real-analytic curves $\beta_1$ and $\beta_2$ at $p$, that is,

$$\mathrm{Ord}(\beta_1, \beta_2) = \lim_{t \to +0} \frac{\log \min\{\|b_1(t) - b_2(s)\| \mid s \in [0, \epsilon_2)\}}{\log t} = \lim_{t \to +0} \frac{\log \min\{\|b_2(t) - b_1(s)\| \mid s \in [0, \epsilon_1)\}}{\log t}.$$

When $\mathrm{Ord}(\beta_1, \beta_2) = d > 1$, we define $\mathrm{Angle}_d(\beta_1, \beta_2)$ by

$$\mathrm{Angle}_d(\beta_1, \beta_2) = \lim_{t \to +0} \frac{\min\{\|b_1(t) - b_2(s)\| \mid s \in [0, \epsilon_2)\}}{t^d} = \lim_{t \to +0} \frac{\min\{\|b_2(t) - b_1(s)\| \mid s \in [0, \epsilon_1)\}}{t^d}.$$

For the region $U$ between $\beta_1$ and $\beta_2$, we define $\mathrm{Ord}(U) = \mathrm{Ord}(\beta_1, \beta_2)$ and $\mathrm{Angle}_d(U) = \mathrm{Angle}_d(\beta_1, \beta_2)$. We will need the following elementary lemmas.

Lemma 3. Let $H : W \to \mathbb{R}^2$ be a real-analytic map defined on a neighborhood $W$ of a point $p \in \mathbb{R}^2$. Assume that $\|DH(p)w\| / \|w\| \ge 1$ for all tangent vectors $w \neq 0$ at $p$. Let $\beta_1$ and $\beta_2$ be germs of real-analytic curves at $p$, and let $v_i$, $i = 1, 2$, be their unit tangent vectors at the point $p$ respectively. Let $U$ be the region between $\beta_1$ and $\beta_2$.

(a) If $0 < \mathrm{Angle}_1(\beta_1, \beta_2) < 2\pi$, we have

$$|\det DH(p)| = \frac{\sin(\mathrm{Angle}_1(H(U)))}{\sin(\mathrm{Angle}_1(U))} \cdot \rho(v_1, H)\,\rho(v_2, H). \tag{2}$$

(b) If $\mathrm{Ord}(U) = d > 1$, we have

$$|\det DH(p)| = \frac{\mathrm{Angle}_d(H(U))}{\mathrm{Angle}_d(U)} \cdot \rho(v_1, H)^{d+1}. \tag{3}$$
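Formula (2) can be checked numerically in the simplest situation. The following is a minimal sketch, assuming that $H$ is a linear map (so that $DH(p)$ is the matrix itself) and that the region $U$ is the sector between two unit vectors whose angle lies strictly between $0$ and $\pi$. The function name `lemma3a_sides` and the sample matrix are hypothetical and serve only as an illustration.

```python
import math

def lemma3a_sides(H, a1, a2):
    """Return both sides of formula (2) for a linear map H.

    H is a 2x2 matrix ((h11, h12), (h21, h22)); v1 and v2 are the unit
    vectors at angles a1 < a2 with 0 < a2 - a1 < pi, so the region U
    between them has Angle_1(U) = a2 - a1.
    """
    (h11, h12), (h21, h22) = H
    v1 = (math.cos(a1), math.sin(a1))
    v2 = (math.cos(a2), math.sin(a2))

    def apply(v):
        # Image of the tangent vector v under DH(p) = H.
        return (h11 * v[0] + h12 * v[1], h21 * v[0] + h22 * v[1])

    w1, w2 = apply(v1), apply(v2)
    # v1 and v2 are unit vectors, so the expansion rate is the image norm.
    rho1 = math.hypot(w1[0], w1[1])
    rho2 = math.hypot(w2[0], w2[1])

    angle_u = a2 - a1
    # Angle between the image vectors, computed from the inner product.
    cos_img = (w1[0] * w2[0] + w1[1] * w2[1]) / (rho1 * rho2)
    angle_hu = math.acos(max(-1.0, min(1.0, cos_img)))

    lhs = abs(h11 * h22 - h12 * h21)
    rhs = math.sin(angle_hu) / math.sin(angle_u) * rho1 * rho2
    return lhs, rhs

# Sample expanding matrix (smallest singular value is larger than 1).
lhs, rhs = lemma3a_sides(((2.0, 1.0), (0.0, 3.0)), 0.2, 1.1)
```

For this sample matrix the two returned values agree up to rounding error, which is the parallelogram-area statement behind (2).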

Remark 4. In claim (b), $v_1 = v_2$ and $\rho(v_1, H) = \rho(v_2, H)$.

Proof. In the case $\mathrm{Ord}(U) = 1$, formula (2) says nothing but the fact that the absolute value of the determinant of $DH(p)$ is the ratio between the area of the image under $DH(p)$ of the infinitesimal parallelogram at $p$ spanned by the vectors $v_1$ and $v_2$ and the area of that parallelogram itself. Let us consider the case $\mathrm{Ord}(U) = d > 1$. Since the two curves $b_i$, $i = 1, 2$, representing the germs $\beta_i$ are almost parallel in small neighborhoods of the point $p$, the minimum $\min\{\|b_1(t) - b_2(s)\| \mid s \in [0, \epsilon_2)\}$ is attained when $b_1(t) - b_2(s)$ is almost orthogonal to the vector $v_1 = v_2$, provided $t$ is small. Hence we can see, by an elementary geometric argument,

$$\lim_{t \to +0} \frac{\min\{\|H \circ b_1(t) - H \circ b_2(s)\| \mid s \in [0, \epsilon_2)\}}{\min\{\|b_1(t) - b_2(s)\| \mid s \in [0, \epsilon_2)\}} = \langle DH(v_1^{\perp}), DH(v_1)^{\perp} \rangle = \frac{|\det DH(p)|}{\rho(v_1, H)},$$

where $v_1^{\perp}$ and $DH(v_1)^{\perp}$ are the unit vectors that are orthogonal to the vectors $v_1$ and $DH(v_1)$ respectively and $\langle \cdot, \cdot \rangle$ is the inner product in the ordinary sense. Taking into account the change of the variable $t$ that makes the map $H \circ b_1(t)$ satisfy condition (1), we obtain claim (b). □

Under the assumption of Lemma 3, we define $\rho(U, H)$ as the maximum of the expansion rate $\rho(v, H) = \|DH(v)\| / \|v\|$ over all tangent vectors $v$ at $p$ that lie between $v_1$ and $v_2$ or, in other words, that are contained in the closure of the angle formed by $U$ at $p$.

Lemma 5. Under the same assumption as in the last lemma, we have

$$\frac{\mathrm{Angle}_{\mathrm{Ord}(U)}(H(U))}{\mathrm{Angle}_{\mathrm{Ord}(U)}(U)} \le 2\pi \cdot \frac{|\det DH(p)|}{\rho(U, H)\,\rho_0}, \tag{4}$$

where $\rho_0 \ge 1$ is the minimum of the expansion rate $\rho(w, H)$ over all tangent vectors $w \neq 0$ at $p$.

Proof. If $\mathrm{Ord}(U) > 1$, (4) is obvious from the last lemma. Let us consider the case $\mathrm{Ord}(U) = 1$. We first prove, for $K = 2\pi$,

$$\frac{\mathrm{Angle}_1(H(U))}{\mathrm{Angle}_1(U)} \le K \cdot \frac{|\det DH(p)|}{\rho(v_1, H)\,\rho_0}. \tag{5}$$

If $\mathrm{Angle}_1(U) > \pi/2$, we get (5) for $K = 4$ because the left-hand side is smaller than $2\pi/(\pi/2) = 4$, while the right-hand side without the factor $K$ is not smaller than 1. If $\mathrm{Angle}_1(U) \le \pi/2$ and $\mathrm{Angle}_1(H(U)) \le \pi/2$, we get (5) for $K = \pi/2$ from Lemma 3 because $2/\pi \le \sin(x)/x \le 1$ for $0 < x \le \pi/2$. Finally, we consider the case when $\mathrm{Angle}_1(U) \le \pi/2$ and $\mathrm{Angle}_1(H(U)) > \pi/2$. In this case, we can take a unit tangent vector $v$ between $v_1$ and $v_2$ such that $DH(p)v$ is orthogonal to $DH(p)v_1$. Let $\gamma$ be a germ of a real-analytic curve passing through $U$ that is tangent to the vector $v$ at $p$, and let $V$ be the region between $\beta_1$ and $\gamma$. Then we can apply the above argument to $V$ and get (5) with $U$ replaced by $V$ for $K = \pi/2$. Since

$$\frac{\mathrm{Angle}_1(H(U))}{\mathrm{Angle}_1(U)} \le \frac{2\pi}{\mathrm{Angle}_1(V)} = 4\,\frac{\mathrm{Angle}_1(H(V))}{\mathrm{Angle}_1(V)},$$

we get (5) for $K = 2\pi$. Therefore we have (5) for $K = 2\pi$ in any case. Remark that we can replace $v_1$ by $v_2$ in (5) by symmetry.

Now let us take a vector $v$ at $p$ that lies between $v_1$ and $v_2$ and satisfies $\rho(v, H) = \rho(U, H)$. If $v = v_1$ or $v = v_2$, the conclusion of the lemma is nothing but (5) or the same inequality with $v_1$ replaced by $v_2$. Otherwise, we consider a germ $\gamma$ of a real-analytic curve that is tangent to $v$ at $p$. The germ $\gamma$ divides $U$ into two regions $U_1$ and $U_2$. Applying (5) to these regions, we get

$$\frac{\mathrm{Angle}_1(H(U))}{\mathrm{Angle}_1(U)} \le \max\left\{ \frac{\mathrm{Angle}_1(H(U_1))}{\mathrm{Angle}_1(U_1)}, \frac{\mathrm{Angle}_1(H(U_2))}{\mathrm{Angle}_1(U_2)} \right\} \le 2\pi \cdot \frac{|\det DH(p)|}{\rho(v, H)\,\rho_0}.$$

The lemma is proved. □
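The case $\mathrm{Angle}_1(U) \le \pi/2$ and $\mathrm{Angle}_1(H(U)) \le \pi/2$ in the proof of Lemma 5 can be spelled out as follows. This is only a sketch of the omitted step, writing $x = \mathrm{Angle}_1(U)$ and $y = \mathrm{Angle}_1(H(U))$ (shorthand introduced here for brevity, not used in the original), and using formula (2), the bound $2/\pi \le \sin(x)/x \le 1$ and $\rho(v_2, H) \ge \rho_0$.

```latex
\[
\frac{y}{x}
  = \frac{y}{\sin y}\cdot\frac{\sin y}{\sin x}\cdot\frac{\sin x}{x}
  \le \frac{\pi}{2}\cdot\frac{\sin y}{\sin x}
  = \frac{\pi}{2}\cdot\frac{|\det DH(p)|}{\rho(v_1,H)\,\rho(v_2,H)}
  \le \frac{\pi}{2}\cdot\frac{|\det DH(p)|}{\rho(v_1,H)\,\rho_0},
  \qquad 0 < x,\ y \le \frac{\pi}{2}.
\]
```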


4. Weighted Multiplicity

In this section we introduce what we call the weighted multiplicity, which counts the multiplicity of the intersection of the dividing curves $\{\gamma_i\}_{i=1}^m$ with an appropriate weight. Let $p$ be a point in $E = \bigcup_{i=1}^k \partial D_i$. Let $\gamma_i : [a_i, b_i] \to \mathbb{R}^2$ be a dividing curve. If the curve $\gamma_i$ passes through the point $p$, it gives germs of real-analytic curves at $p$ in the following manner: If $\gamma_i(t) = p$ for $t \in (a_i, b_i)$, the curve $\gamma_i$ gives two germs of real-analytic curves at $p$ represented by the curves $s \mapsto \gamma_i(t + s)$ and $s \mapsto \gamma_i(t - s)$. If $\gamma_i(a_i) = p$ (resp. $\gamma_i(b_i) = p$), the curve $\gamma_i$ gives one germ represented by the curve $s \mapsto \gamma_i(a_i + s)$ (resp. $s \mapsto \gamma_i(b_i - s)$).

Let $\mathcal{B}(p) = \{\beta_i(p)\}_{i=1}^{m(p)}$ be the collection of the distinct germs of real-analytic curves given in such a way by all dividing curves. Remark that $m(p) = 2$ for all points $p \in E$ except for finitely many points. These germs of real-analytic curves are called the germs of curves at $p$ given by the dividing curves. We always assume that the germs of curves $\beta_i(p)$, $i = 1, 2, \cdots, m(p)$, are arranged in counterclockwise order around the point $p$. Let $U_i(p)$, $1 \le i < m(p)$, be the region between $\beta_i(p)$ and $\beta_{i+1}(p)$, and let $U_{m(p)}(p)$ be that between $\beta_{m(p)}(p)$ and $\beta_1(p)$. We denote the set of these regions by $\mathcal{U}(p) = \{U_i(p)\}_{i=1}^{m(p)}$. For $U \in \mathcal{U}(p)$, let $f_U$ be the germ of a real-analytic map at $p$ that is obtained as the real-analytic extension of the restriction of $f$ to a representative of $U$.

We define the weight $W(U_i(p))$ of the region $U_i(p) \in \mathcal{U}(p)$ by

$$W(U_i(p)) = \frac{\|Df_{U_i(p)}(p)v_1\| / \|v_1\| + \|Df_{U_i(p)}(p)v_2\| / \|v_2\|}{|\det Df_{U_i(p)}(p)|}$$

for $1 \le i \le m(p)$, where $v_1$ and $v_2$ are the tangent vectors of the boundary curves of $U_i(p)$ at $p$. The weighted multiplicity $M(p, f)$ at a point $p \in E$ is defined by

$$M(p, f) = \sum_{i=1}^{m(p)} W(U_i(p)).$$

The weighted multiplicity $M(f)$ of a piecewise real-analytic expanding map $f$ is the supremum of $M(p, f)$ over all points $p \in E$. Remark again that $M(p, f) \le 4\rho(f)^{-1}$ for all $p \in E$ except for finitely many points. The weighted multiplicity $M(f)$ is the quantity that we are most concerned with in the argument below.

5. Functions of Bounded Variation

We use the theory of functions of bounded variation in higher-dimensional space, which is developed in the book [3]. We recall some definitions and properties of functions of bounded variation from [3].

Let $U$ be an open subset of the plane $\mathbb{R}^2$. Let $C^r(U, \mathbb{R}^2)$ be the set of bounded vector-valued $C^r$ functions $g = (g_1, g_2) : U \to \mathbb{R}^2$ and let $C_0^r(U, \mathbb{R}^2)$ be the subset of $C^r(U, \mathbb{R}^2)$ that consists of functions with compact support. Similarly, let $\Omega^r(U)$ be the set of 1-forms $\Psi = \Psi_1\,dx + \Psi_2\,dy$ of class $C^r$ on $U$ and let $\Omega_0^r(U)$ be the subset of $\Omega^r(U)$ that consists of 1-forms with compact support. We denote the $d$-dimensional Hausdorff measure by $\mu_d$. We define the variation $\mathrm{Var}(\varphi, U)$ of a function $\varphi \in L^1(U)$ as the supremum of

$$\int_U \varphi(z)\,\mathrm{Div}\,g(z)\,d\mu_2(z) \tag{6}$$


over all $g = (g_1, g_2) \in C_0^1(U, \mathbb{R}^2)$ satisfying $\|g(z)\| \le 1$ for $z \in U$, where

$$\mathrm{Div}\,g(x, y) = \frac{\partial}{\partial x} g_1(x, y) + \frac{\partial}{\partial y} g_2(x, y).$$

A function $\varphi \in L^1(U)$ is said to be of bounded variation if $\mathrm{Var}(\varphi, U) < \infty$. We denote by $\mathrm{BV}(U)$ the set of functions $\varphi \in L^1(U)$ of bounded variation. Sometimes it is convenient to write the variation $\mathrm{Var}(\varphi, U)$ as

$$\mathrm{Var}(\varphi, U) = \sup\left\{ \int_U \varphi\,d\Psi \; ; \; \Psi \in \Omega_0^1(U) \text{ and } \|\Psi(z)\| \le 1 \text{ for } z \in U \right\}, \tag{7}$$

where $\|\Psi(z)\| = \sqrt{\Psi_1^2(z) + \Psi_2^2(z)}$ for $\Psi = \Psi_1\,dx + \Psi_2\,dy$. We can obtain this formula from the correspondence between $g = (g_1, g_2) \in C_0^1(U, \mathbb{R}^2)$ and $\Psi = -g_2\,dx + g_1\,dy \in \Omega_0^1(U)$.

Remark that, for $\varphi \in \mathrm{BV}(U)$, the functional

$$\Phi_\varphi : g \in C_0^1(U, \mathbb{R}^2) \mapsto \int_U \varphi\,\mathrm{Div}\,g\,d\mu_2 \in \mathbb{R}$$

satisfies

$$|\Phi_\varphi(g_1) - \Phi_\varphi(g_2)| \le \sup_{z \in U} \|g_1(z) - g_2(z)\| \cdot \mathrm{Var}(\varphi, U)$$

for any two functions $g_1, g_2 \in C_0^1(U, \mathbb{R}^2)$. Hence $\Phi_\varphi$ can be extended uniquely as a continuous linear functional on $C_0^0(U, \mathbb{R}^2)$. We can consider this extension as a vector-valued Radon measure on $U$ with total variation $\mathrm{Var}(\varphi, U)$. (See [10, Ch. 6] for example.) We denote this vector-valued Radon measure by $D\varphi$. Let $\int_U \langle g, D\varphi \rangle$ be the integral of a vector-valued function $g \in C^0(U, \mathbb{R}^2)$ with respect to the vector-valued measure $D\varphi$. Let $|D\varphi|$ be the measure that is obtained as the total variation of $D\varphi$. Obviously we have $\int_U \langle g, D\varphi \rangle \le \int_U \|g\| \cdot |D\varphi|$ for $g \in C^0(U, \mathbb{R}^2)$, where $\|g\|$ denotes the function $\|g\|(x) = \|g(x)\|$ on $U$.

The bounded variation norm of a function $\varphi \in \mathrm{BV}(U)$ is defined as

$$\|\varphi\|_{BV(U)} = \mathrm{Var}(\varphi, U) + \int_U |\varphi|\,d\mu_2.$$

This norm makes $\mathrm{BV}(U)$ a Banach space. See [3, Remark 1.12]. We make use of the following fact when we prove the existence of an absolutely continuous invariant measure in Sect. 6.

Proposition 6. Let $U \subset \mathbb{R}^2$ be a bounded open set with $C^1$ boundary. Then sets of functions in $\mathrm{BV}(U)$ that are uniformly bounded in the bounded variation norm $\|\cdot\|_{BV(U)}$ are relatively compact in $L^1(U)$.

Another important property of functions of bounded variation is that they give traces on the boundary. Let $U \subset \mathbb{R}^2$ be a bounded region whose boundary is a finite union of real-analytic simple closed curves. We denote by $L^1(\partial U)$ the set of functions on $\partial U$ that are integrable with respect to the one-dimensional Hausdorff measure $\mu_1$. We put $B(x, r) = \{y \in \mathbb{R}^2 \mid \|x - y\| < r\}$. Then we have


Proposition 7. For $\varphi \in \mathrm{BV}(U)$, there is a unique function $\varphi^- \in L^1(\partial U)$ such that

$$\lim_{r \to 0} \mu_2(B(x, r))^{-1} \int_{B(x, r) \cap U} |\varphi(z) - \varphi^-(x)|\,d\mu_2(z) = 0$$

for $\mu_1$-almost all $x \in \partial U$. Moreover,

(a) for $\zeta \in C_0^1(\mathbb{R}^2, \mathbb{R}^2)$, we have

$$\int_{\partial U} \varphi^-(z) \langle \zeta(z), \nu(z) \rangle\,d\mu_1(z) = \int_U \varphi(z)\,\mathrm{Div}\,\zeta(z)\,d\mu_2(z) + \int_U \langle \zeta, D\varphi \rangle,$$

where $\nu(z)$ is the unit outer normal vector for the boundary $\partial U$ at $z$,

(b) if we define $\varphi(z) = 0$ for $z \notin U$, we have

$$\mathrm{Var}(\varphi, \mathbb{R}^2) = \mathrm{Var}(\varphi, U) + \int_{\partial U} |\varphi^-(x)|\,d\mu_1(x).$$

The function $\varphi^- \in L^1(\partial U)$ in the above proposition is called the trace of $\varphi$ on the boundary $\partial U$. We refer to Theorem 1.19 of [3] for Proposition 6, and Theorem 2.10 and Remark 2.14 of [3] for Proposition 7.

Remark 8. In Theorem 2.10 and Remark 2.14 of [3], the boundary of the region $U$ is assumed to be Lipschitz. Hence Proposition 7 is not a direct consequence of that theorem when there are cusps on the boundary of $U$. But, with a slight modification of the proof, we can derive Proposition 7, because the proposition is essentially a local one.

6. An Existence Theorem for Absolutely Continuous Invariant Measures

Let $f : D \to D$ be a piecewise real-analytic expanding map and let $\xi = \{D_i\}_{i=1}^k$ be the partition of the domain $D$ associated to it as in Sect. 2. In this section we prove

Theorem 9. If $f$ satisfies
(a) $M(f) + \rho(f)^{-1} < 1$, and
(b) the continuous extension of each restriction $f|_{D_i}$, $1 \le i \le k$, to the closure $\overline{D_i}$ is injective,
then there exists an absolutely continuous invariant finite measure for $f$.

Theorem 9 above is a modification of the result of Góra and Boyarski in [4], and the essential part of the proof below is a repetition of the argument in [4]. We define the Perron–Frobenius operator $P_f : L^1(D) \to L^1(D)$ by

$$P_f(\varphi)(x) = \sum_{f(y) = x} \frac{\varphi(y)}{|\det Df(y)|},$$

where the sum is taken over all $y \in \bigcup_{i=1}^k D_i$ such that $f(y) = x$. Remark that if there exists a non-negative valued function $h \neq 0$ in $L^1(D)$ such that $P_f(h) = h$, the measure $h \cdot \mu_2$ is an absolutely continuous invariant finite measure for $f$. From the definition of the Perron–Frobenius operator, we have

$$\int_D P_f g\,d\mu_2 = \int_D g\,d\mu_2 \tag{8}$$


for a non-negative valued function $g \in L^1(D)$. In what follows, we consider each element $\varphi \in L^1(D)$ as an element of $L^1(\mathbb{R}^2)$ by defining $\varphi(x) = 0$ for $x \notin D$. The following is the key to the proof of Theorem 9.

Proposition 10. For any $\epsilon > 0$, there exists a constant $C > 0$ such that

$$\mathrm{Var}(P_f \varphi, \mathbb{R}^2) \le \sum_{j=1}^k \mathrm{Var}(P_f(\varphi \cdot \chi_{D_j}), \mathbb{R}^2) \le (M(f) + \rho^{-1}(f) + \epsilon)\,\mathrm{Var}(\varphi, \mathbb{R}^2) + C \|\varphi\|_{L^1} \tag{9}$$

for $\varphi \in \mathrm{BV}(D)$.

This kind of inequality appeared in the original work of Lasota and Yorke and can be seen in the papers of Keller [7,8] and Góra and Boyarski [4].

First, we prove that Proposition 10 implies Theorem 9. Take a small number $\epsilon > 0$ such that $M(f) + \rho(f)^{-1} + \epsilon < 1$. Then take the constant $C$ in Proposition 10 for that $\epsilon$. Let $\varphi \in \mathrm{BV}(D)$ be a non-negative valued function such that $\int_D \varphi\,d\mu_2 = 1$. From (9) and (8), we obtain

$$\mathrm{Var}(P_f^n \varphi, \mathbb{R}^2) \le (1 - M(f) - \rho(f)^{-1} - \epsilon)^{-1} C + \mathrm{Var}(\varphi, \mathbb{R}^2)$$

by induction. Hence the functions $\zeta_n = n^{-1} \sum_{i=0}^{n-1} P_f^i \varphi$, $n = 1, 2, \ldots$, are contained in $\mathrm{BV}(D)$ and uniformly bounded in the bounded variation norm $\|\cdot\|_{BV(\mathbb{R}^2)}$ on $\mathbb{R}^2$. Applying Proposition 6 to a bounded open subset $U \supset D$ with $C^1$ boundary, we can find a subsequence $\zeta_{n_k}$ that converges to a function $\varphi_\infty \in L^1(D)$ in the $L^1$ norm. Obviously $\varphi_\infty$ is non-negative valued. We have $\int_D \varphi_\infty\,d\mu_2 = 1$ from (8). Moreover, $\varphi_\infty$ is a fixed point of the Perron–Frobenius operator $P_f$ because

$$\|P_f \varphi_\infty - \varphi_\infty\|_{L^1} = \lim_{k \to \infty} \left\| \frac{1}{n_k} \sum_{i=0}^{n_k - 1} P_f^{i+1} \varphi - \frac{1}{n_k} \sum_{i=0}^{n_k - 1} P_f^i \varphi \right\|_{L^1} \le \lim_{k \to \infty} \frac{1}{n_k} \|P_f^{n_k} \varphi - \varphi\|_{L^1} \le \lim_{k \to \infty} \frac{2}{n_k} = 0.$$

Therefore $\varphi_\infty \cdot \mu_2$ is an absolutely continuous invariant finite measure for $f$. Theorem 9 is proved.

Now let us go into the proof of Proposition 10. We first study the situation in which the image of a dividing curve $\gamma_i : [a_i, b_i] \to \mathbb{R}^2$ is contained in the boundary of some region $D_j \in \xi$. From Proposition 7, a function $\varphi \in \mathrm{BV}(D)$ viewed as a function on $D_j$ gives the trace $\varphi_j^-$ on the curve $\gamma_i$. We consider one side of the tubular neighborhood of the curve $\gamma_i$,

$$\Gamma_{ij} : [a_i, b_i] \times [0, \delta] \to \mathbb{R}^2, \quad (t, s) \mapsto \gamma_i(t) + s \cdot \nu(t),$$

where $\nu(t)$ is the unit inner normal vector for the boundary $\partial D_j$ at $\gamma_i(t)$ and $\delta > 0$ is a small constant that we will specify in the argument below. We first take $\delta > 0$ so small that $\Gamma_{ij}$ is a diffeomorphism. We will denote the image of $\Gamma_{ij}$ by the same symbol $\Gamma_{ij}$.
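The induction behind the uniform bound on $\mathrm{Var}(P_f^n \varphi, \mathbb{R}^2)$ in the proof of Theorem 9 can be spelled out as follows. This is a sketch, writing $\alpha = M(f) + \rho(f)^{-1} + \epsilon < 1$ (the symbol $\alpha$ is introduced here only for brevity and does not appear in the original) and using that (8) gives $\|P_f^i \varphi\|_{L^1} = 1$ for a non-negative $\varphi$ with integral 1.

```latex
\begin{align*}
\mathrm{Var}(P_f^{n}\varphi,\mathbb{R}^2)
  &\le \alpha\,\mathrm{Var}(P_f^{n-1}\varphi,\mathbb{R}^2) + C\,\|P_f^{n-1}\varphi\|_{L^1}
   = \alpha\,\mathrm{Var}(P_f^{n-1}\varphi,\mathbb{R}^2) + C \\
  &\le \alpha^{n}\,\mathrm{Var}(\varphi,\mathbb{R}^2) + C\sum_{i=0}^{n-1}\alpha^{i}
   \le \mathrm{Var}(\varphi,\mathbb{R}^2) + \frac{C}{1-\alpha}.
\end{align*}
```

The first line is inequality (9) applied to $P_f^{n-1} \varphi$, and the second line iterates it and sums the geometric series.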


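Let $V_i(x)$ be the unit tangent vector of the curve $\gamma_i$ at $x \in \gamma_i$. We define a real-analytic function $h_{ij} : \gamma_i \to \mathbb{R}$ by

$$h_{ij}(x) = \frac{\rho(V_i(x), f_{D_j})}{|\det Df_{D_j}(x)|}.$$

Let $\pi : [a_i, b_i] \times [0, \delta] \to [a_i, b_i]$ be the projection. We define a function $\tilde{h}_{ij} = (0, h_{ij} \circ \Gamma_{ij} \circ \pi) : [a_i, b_i] \times [0, \delta] \to \mathbb{R}^2$. Put $\|\tilde{h}_{ij}\|(x) = \|\tilde{h}_{ij}(x)\|$. Then we have

Lemma 11. If $\varphi \in \mathrm{BV}(D)$, the composition $\varphi \circ \Gamma_{ij}$ viewed as a function on the open rectangle $(a_i, b_i) \times (0, \delta)$ is of bounded variation. We have

$$\int_{\gamma_i} h_{ij} \cdot \varphi_j^-\,d\mu_1 \le \frac{1}{\delta} \int_{[a_i, b_i] \times [0, \delta]} \|\tilde{h}_{ij}\| \cdot |\varphi \circ \Gamma_{ij}|\,d\mu_2 + \int_{(a_i, b_i) \times (0, \delta)} \|\tilde{h}_{ij}\| \cdot |D(\varphi \circ \Gamma_{ij})|.$$

Proof. We can get the first claim easily from formula (7). For $y \in (0, \delta)$, let us consider the rectangle $R_y = [a_i, b_i] \times [0, y]$. The function $\varphi \circ \Gamma_{ij}$, viewed as a function on the interior of $R_y$, gives the trace $G_y^-$ on the boundary $\partial R_y$ of the rectangle. Remark that the restriction of $G_y^-$ to the edge $[a_i, b_i] \times \{0\}$ does not depend on $y$ and equals the function $\varphi_j^- \circ \Gamma_{ij}$ on $[a_i, b_i] \times \{0\}$ by the definition of the trace. Obviously, we have

$$\int_{[a_i, b_i] \times \{0\}} G_y^- \cdot h_{ij} \circ \Gamma_{ij} \circ \pi\,d\mu_1 = \int_{\gamma_i} h_{ij} \cdot \varphi_j^-\,d\mu_1.$$

Let us define a function $B : (a_i, b_i) \times (0, \delta) \to \mathbb{R}$ by $B(x, y) = G_y^-(x, y)$. Then, from Lebesgue's theorem [15, Theorem 1.3.8], $B(x, y) = \varphi \circ \Gamma_{ij}(x, y)$ for almost every $(x, y) \in (a_i, b_i) \times (0, \delta)$. Applying Proposition 7 to $\varphi \circ \Gamma_{ij}$ and $\tilde{h}_{ij}$ on $R_y$, we obtain

$$\left| \int_{[a_i, b_i] \times \{y\}} B \cdot h_{ij} \circ \Gamma_{ij} \circ \pi\,d\mu_1 - \int_{\gamma_i} h_{ij} \cdot \varphi_j^-\,d\mu_1 \right| \le \int_{\mathrm{int}\,R_y} \|\tilde{h}_{ij}\| \cdot |D(\varphi \circ \Gamma_{ij})| \tag{10}$$

because $\mathrm{Div}\,\tilde{h}_{ij} \equiv 0$. Since $\|\tilde{h}_{ij}\| = h_{ij} \circ \Gamma_{ij} \circ \pi$, integrating (10) over $y \in (0, \delta)$ we get

$$\delta \int_{\gamma_i} h_{ij} \cdot \varphi_j^-\,d\mu_1 - \int_{[a_i, b_i] \times [0, \delta]} \|\tilde{h}_{ij}\| \cdot |\varphi \circ \Gamma_{ij}|\,d\mu_2 \le \delta \int_{(a_i, b_i) \times (0, \delta)} \|\tilde{h}_{ij}\| \cdot |D(\varphi \circ \Gamma_{ij})|$$

from Fubini's theorem. This implies the lemma. □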


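Let us take a small number $\eta > 0$ such that $(1 + \eta)^2 M(f) < M(f) + \epsilon$. Remark that, if $\delta > 0$ is sufficiently small, the map $\Gamma_{ij}$ is almost an isometry on $[a_i, b_i] \times [0, \delta]$. Hence we can take $\delta > 0$ so small that

$$(1 + \eta)^{-1} \|v\| < \|D\Gamma_{ij}(v)\| \le (1 + \eta) \|v\| \tag{11}$$

for all tangent vectors $v$ at all points in $[a_i, b_i] \times [0, \delta]$. Let us define a function $\hat{h}_{ij} : \mathbb{R}^2 \to \mathbb{R}$ by

$$\hat{h}_{ij}(x) = \begin{cases} h_{ij} \circ \Gamma_{ij} \circ \pi \circ \Gamma_{ij}^{-1}(x) & \text{for } x \in \Gamma_{ij}; \\ 0 & \text{for } x \notin \Gamma_{ij}. \end{cases}$$

Since $\hat{h}_{ij} \circ \Gamma_{ij} = \|\tilde{h}_{ij}\|$ on $[a_i, b_i] \times [0, \delta]$, we obtain, from (11),

$$\int_{(a_i, b_i) \times (0, \delta)} \|\tilde{h}_{ij}\| \cdot |D(\varphi \circ \Gamma_{ij})| \le (1 + \eta) \int_{\mathrm{int}\,\Gamma_{ij}} \hat{h}_{ij} |D\varphi|.$$

Therefore we get, from the last lemma,

Proposition 12. We have

$$\int_{\gamma_i} h_{ij} \cdot \varphi_j^-\,d\mu_1 \le (1 + \eta)^2 \left( \frac{1}{\delta} \int_{\Gamma_{ij}} \hat{h}_{ij} |\varphi|\,d\mu_2 + \int_{\mathbb{R}^2} \hat{h}_{ij} |D\varphi| \right).$$

Next we prove the following proposition.

Proposition 13. Let $U$ be one of the regions in the partition $\xi = \{D_j\}_{j=1}^k$. Let $V(x)$ be the unit tangent vector for the boundary $\partial U$ at $x \in \partial U$. Let $\varphi \in \mathrm{BV}(U)$ be non-negative valued. Let $\varphi^-$ be the trace of $\varphi$ on the boundary $\partial U$. Then we have

$$\mathrm{Var}(P_f \varphi, \mathbb{R}^2) \le \rho(f)^{-1}\,\mathrm{Var}(\varphi, U) + C(f, U) \|\varphi\|_{L^1} + \int_{\partial U} \frac{\rho(V(x), f_U)}{|\det Df_U(x)|}\,\varphi^-(x)\,d\mu_1(x),$$

where $C(f, U)$ is a constant depending only on the restriction $f|_U$ of $f$ (defined in (12) below).

Proof. We have

$$\int_{\mathbb{R}^2} P_f \varphi\,d\Psi = \int_{\mathbb{R}^2} \frac{\varphi}{|\det Df|}\,d f^*\Psi = \int_{\mathbb{R}^2} \varphi\,d\!\left( \frac{f^*\Psi}{|\det Df|} \right) - \int_{\mathbb{R}^2} \varphi\,d\!\left( \frac{1}{|\det Df|} \right) \wedge f^*\Psi$$

for $\Psi \in \Omega_0^1(f(U))$. Hence we get, from formula (7),

$$\mathrm{Var}(P_f \varphi, f(U)) \le \rho(f)^{-1}\,\mathrm{Var}(\varphi, U) + C(f, U) \|\varphi\|_{L^1}$$

if we put $\sigma(f, U) = \sup\{\|Df(v)\| / \|v\| \mid 0 \neq v \in T_x\mathbb{R}^2,\ x \in U\}$ and

$$C(f, U) = \sigma(f, U) \sup\{\|D((\det Df)^{-1})(x)\| \mid x \in U\}. \tag{12}$$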


On the other hand, we have

$$\int_{\partial f(U)} (P_f \varphi)^-\,d\mu_1 = \int_{\partial U} \frac{\rho(V(x), f_U)}{|\det Df_U(x)|}\,\varphi^-(x)\,d\mu_1(x).$$

From these and Proposition 7(b), we obtain the conclusion. □

Now we complete the proof of Proposition 10. From Proposition 13,

$$\mathrm{Var}(P_f \varphi, \mathbb{R}^2) \le \sum_{j=1}^k \mathrm{Var}(P_f(\varphi \cdot \chi_{D_j}), \mathbb{R}^2) \le \sum_{j=1}^k \left( \frac{\mathrm{Var}(\varphi, D_j)}{\rho(f)} + C(f, D_j) \|\varphi\|_{L^1} + \sum_{i \sim j} \int_{\gamma_i} h_{ij} \varphi_j^-\,d\mu_1 \right),$$

where $\sum_{i \sim j}$ is the sum over $i$ satisfying $\gamma_i \subset \partial D_j$. We have

$$\sum_{j=1}^k \mathrm{Var}(\varphi, D_j) \le \mathrm{Var}(\varphi, \mathbb{R}^2).$$

Hence, in order to prove Proposition 10, we show

$$\sum_{j=1}^k \sum_{i \sim j} \int_{\gamma_i} h_{ij} \varphi_j^-\,d\mu_1 \le (M(f) + \epsilon)\,\mathrm{Var}(\varphi, \mathbb{R}^2) + K \|\varphi\|_{L^1}$$

for some constant $K > 0$. But, from Proposition 12, it is sufficient to show

$$(1 + \eta)^2 \sum_{j=1}^k \sum_{i \sim j} \int_{\mathbb{R}^2} \hat{h}_{ij} |D\varphi| \le (M(f) + \epsilon)\,\mathrm{Var}(\varphi, \mathbb{R}^2)$$

or, more simply,

$$(1 + \eta)^2 \sum_{j=1}^k \sum_{i \sim j} \hat{h}_{ij}(x) \le M(f) + \epsilon \quad \text{for } x \in D. \tag{13}$$

Notice that $\hat{h}_{ij}(x) = h_{ij}(x)$ on the dividing curves $\gamma_i$. From the definition of the weighted multiplicity $M(p, f)$, we have

$$\sum_{j=1}^k \sum_{i \sim j} \hat{h}_{ij}(x) \le M(x, f)$$

for $x \in E$, if $\delta$ is sufficiently small. Let $F$ be the set of points that are contained in more than two dividing curves. From the choice of $\eta$, we can take a small open neighborhood $W$ of the finite set $F$ in $D$ such that the left-hand side of (13) is not larger than $M(f) + \epsilon$ on $W$. If $\delta$ is sufficiently small, the intersections of two distinct subsets in $\{\Gamma_{ij} \mid \gamma_i \subset \partial D_j\}$ are contained in $W$. By continuity, we easily see that the left-hand side of (13) is smaller than $M(f) + \epsilon$ for $x \in D - W$ if $\delta$ is sufficiently small. Therefore (13) holds for sufficiently small $\delta$. Proposition 10 is proved. □


7. Estimates of the Weighted Multiplicity for the Iterations

In this last section, we complete the proof of Theorem 1 by considering the iterations of a piecewise real-analytic expanding map $f$. First, remark that a map has an absolutely continuous finite invariant measure if and only if an iteration of it does. Since $f^n$ is a piecewise real-analytic map with minimum expansion rate $\rho(f^n) \ge \rho(f)^n$, we can assume $\rho(f) > 64\pi$ without loss of generality. We prove the following theorem.

Theorem 14. $M(f^n) \to 0$ as $n \to \infty$.

Subdividing the partition $\xi$ by real-analytic curves artificially, we can assume that $f$ satisfies condition (b) in Theorem 9. It follows that all iterations of $f$ also satisfy that condition. Thus, from Theorem 14 above, the iterations $f^n$ of $f$ satisfy assumptions (a) and (b) of Theorem 9 if $n$ is sufficiently large. Therefore we can get Theorem 1 from Theorem 14.

Let us prepare some notation in order to prove Theorem 14. Let $\xi_n = \{D_i^{(n)}\}_{i=1}^{k(n)}$ be the real-analytic partition associated to the piecewise real-analytic map $f^n$. Let $E(n) = \overline{D} - \bigcup_{i=1}^{k(n)} D_i^{(n)} = \bigcup_{i=1}^{k(n)} \partial D_i^{(n)}$. For $p \in E(n)$, let $\{\beta_i(p, n)\}_{i=1}^{m(p,n)}$ be the germs of real-analytic curves at $p \in E(n)$ given by the dividing curves of the partition $\xi_n$. We assume that these germs of curves are arranged in counterclockwise order around $p$ as before. Let $U_i(p, n)$ be the region between $\beta_i(p, n)$ and $\beta_{i+1}(p, n)$ for $1 \le i \le m(p, n)$ (with $\beta_{m(p,n)+1}(p, n)$ understood as $\beta_1(p, n)$, as in Sect. 4). Let us denote $\mathcal{U}(p, n) = \{U_i(p, n)\}_{i=1}^{m(p,n)}$. For $U \in \mathcal{U}(p, n)$, let $f_U^n$ be the germ of the real-analytic map at $p$ obtained as a real-analytic continuation of the restriction of $f^n$ to a representative of $U$.

Let $\Delta$ be the maximum of $\mathrm{Ord}(U_i(p, 1))$ over all $1 \le i \le m(p, 1)$ and all $p \in E(1)$. Let $\theta$ and $\Theta$ be the minimum and maximum of $\mathrm{Angle}_{\mathrm{Ord}(U_i(p,1))}(f(U_i(p, 1)))$ over all $1 \le i \le m(p, 1)$ and all $p \in E(1)$ respectively. Let $\mu$ be the maximum of $m(p, 1)$ over all $p \in E(1)$. Thus $\Delta$, $\theta$, $\Theta$ and $\mu$ depend only on the single map $f$.

Let us consider a point $p \in E(n)$. Each $V \in \mathcal{U}(p, n + 1)$ is contained in some $U \in \mathcal{U}(p, n)$ as a germ. Remark that, if $V \subset U$ and $V \neq U$, the image $f_U^n(p)$ is contained in $E(1)$ and a dividing curve of the partition $\xi = \xi_1$ passing through $f_U^n(p)$ divides $f^n(U)$ into more than two regions. We say that $V \in \mathcal{U}(p, n + 1)$ is a kid of $U \in \mathcal{U}(p, n)$ if $V \subset U$. If $V$ is a kid of $U$ and if, in addition, $V$ and $U$ have at least one germ of a real-analytic curve as a boundary curve in common, we say that $V$ is a daughter of $U$. In particular, if $V = U$, then $V$ is a daughter of $U$. Obviously, each $U \in \mathcal{U}(p, n)$ has at most two daughters. If $\mathrm{Ord}(V) > \mathrm{Ord}(U)$, we say that $V$ is a small kid of $U$.

The reason why we distinguish daughters is the following. Let $V \in \mathcal{U}(p, n + 1)$ be a kid of $U \in \mathcal{U}(p, n)$ and assume that $U \neq V$. If $V$ is not a daughter, $f^n(V)$ should coincide with an element of $\mathcal{U}(f_U^n(p), 1)$. So $\mathrm{Angle}_{\mathrm{Ord}(V)}(f^{n+1}(V)) \le \Theta$. On the other hand, if $V$ is a daughter and it is small, we cannot expect such an estimate on $\mathrm{Angle}_{\mathrm{Ord}(V)}(f^{n+1}(V))$. For the same reason, we make the following definition. An element $U$ of $\mathcal{U}(p, n)$ is called special if $\mathrm{Ord}(U) > 1$ or if there is a chain $U_i \in \mathcal{U}(p, n - \ell + i)$, $i = 0, 1, 2, \cdots, \ell$, of regions with length $\ell + 1 \ge 2$ such that
• $U_\ell = U$,


• U1 is a small kid and daughter of U0 , and • Ui+1 is a daughter of Ui for 1 ≤ i < `. In order to estimate M(f n ), we introduce what we call the modified weight W(U ) of U ∈ U(p, n) in the following manner. We fix a small number 0 < η < 1 that will be specified later in the condition (16). We define the level `(U ) of U ∈ U(p, n), p ∈ E(n), by 2 min{Ord(U ), 1 + 1} if U is not special; `(U ) = 2 min{Ord(U ), 1 + 1} − 1 if U is special. Remark that we always have `(V ) ≥ `(U ) if V is a kid of U . If U ∈ U(p, n) is special, we define η`(U ) ρ(U, fUn ) . W(U ) = | det DfUn (p)| (For the definition of ρ(U, fUn ), see Sect. 3.) On the other hand, if U is not special, we define η`(U ) ρ(U, fUn ) AngleOrd(U ) (f n (U )) + 1 , W(U ) = | det DfUn (p)| θ where [·] is Gauss’ symbol. We put X W(U, f n ) and M(f n ) = sup M(p, f n ). M(p, f n ) = U ∈U (p,n)

p∈E(n)

Clearly, we have $M(f^n) \le 2\eta^{-2\Delta-2}\,\mathcal{M}(f^n)$. In order to prove Theorem 14, it is enough to show the following proposition.

Proposition 15. If we take $\eta > 0$ sufficiently small, we have
\[
\sum_{V:\ \text{a kid of } U} W(V) \le W(U)/2 \tag{14}
\]

for all $U \in \mathcal{U}(p, n)$, $p \in E(n)$, $n \ge 1$. In fact, if this is true, we have $\mathcal{M}(p, f^{n+1}) \le (1/2)\mathcal{M}(p, f^n) \le (1/2)\mathcal{M}(f^n)$ for $p \in E(n)$. On the other hand, we have $\mathcal{M}(p, f^{n+1}) \le \rho(f)^{-n}\mathcal{M}(f^n(p), f) \le (1/2)^n \mathcal{M}(f)$ for $p \in E(n+1) - E(n)$. These show $\mathcal{M}(f^n) \le (1/2)^n \mathcal{M}(f)$ inductively. Therefore $M(f^n) \le (1/2)^{n-1}\eta^{-2\Delta-2}\mathcal{M}(f) \to 0$ as $n \to \infty$.

Proof. We prove Proposition 15. Let us consider a region $U \in \mathcal{U}(p, n)$ and its kids. We assume that $\mathrm{Ord}\,U \le 1$ until the end of this proof, where we treat the case $\mathrm{Ord}\,U > 1$. We classify the kids $V$ of $U$ into the following four classes:
1. $V$ is a daughter of $U$, and $V$ is a small kid of $U$,
2. $V$ is not a daughter of $U$, and $V$ is a small kid of $U$,
3. $V$ is a daughter of $U$, and $V$ is not a small kid of $U$, and
4. $V$ is not a daughter of $U$, and $V$ is not a small kid of $U$.


We estimate the sums of modified weights over the kids $V$ in each class. First we consider the kids in class 1. Kids $V$ in this class are special. Hence we have
\[
W(V) \le \frac{\rho(f^n(V), f_{f^n(U)})}{|\det Df_{f^n(U)}(f_U^n(p))|}\, W(U) \le \rho(f)^{-1} W(U).
\]

Since the number of kids in this class is at most 2, the sum of $W(V)$ over the kids $V$ in this class is not larger than $2\rho(f)^{-1} W(U) < (1/8) W(U)$.

We consider class 2. Note that $\ell(V) \ge \ell(U) + 1$ in this case. The number of the kids in this class is not larger than $\mu$. For each kid $V$ in this class, we have $\mathrm{Angle}_{\mathrm{Ord}(V)}(f^{n+1}(V)) \le 2$. Thus it holds that
\[
W(V) \le \eta(2/\theta + 1)\,\frac{\rho(f^n(V), f_{f^n(V)})}{|\det Df_{f^n(V)}(f_V^n(p))|}\, W(U) \le \eta(2/\theta + 1)\, W(U).
\]

Hence the sum of $W(V)$ over the kids $V$ in this class is not larger than $\mu\eta(2/\theta + 1) W(U)$.

We consider class 3. Note that $U$ is special if and only if $V$ is. In the case $U$ is special, we easily see that
\[
W(V) \le \frac{\rho(f^n(V), f)}{|\det Df_U(p)|}\, W(U) \le \rho(f)^{-1} W(U).
\]

And the sum is not larger than $2\rho(f)^{-1} W(U) < (1/8) W(U)$. In case $U$ is not special, we have
\begin{align*}
W(V) &\le \frac{[\mathrm{Angle}_{\mathrm{Ord}(V)}(f^{n+1}(V))/\theta] + 1}{[\mathrm{Angle}_{\mathrm{Ord}(U)}(f^n(U))/\theta] + 1}\cdot
\frac{\rho(f^n(V), f_{f^n(V)})}{|\det Df_{f^n(V)}(f_V^n(p))|}\, W(U) \\
&\le \left( 1 + \frac{\mathrm{Angle}_{\mathrm{Ord}(V)}(f^{n+1}(V))}{\mathrm{Angle}_{\mathrm{Ord}(V)}(f^n(V))} \right)
\frac{\rho(f^n(V), f_{f^n(V)})}{|\det Df_{f^n(V)}(f_V^n(p))|}\, W(U).
\end{align*}
Here we used the fact $\mathrm{Angle}_{\mathrm{Ord}(U)}(f^n(U)) \ge \mathrm{Angle}_{\mathrm{Ord}(V)}(f^n(V))$ and the inequality
\[
\frac{[y + 1]}{[x + 1]} \le \frac{y + 1}{\max\{x, 1\}} \le \frac{y}{x} + 1 \quad \text{for } x > 0 \text{ and } y > 0.
\]
Using Lemma 5 in case $\dfrac{\mathrm{Angle}(f^{n+1}(V))}{\mathrm{Angle}(f^n(V))} > 1$, we get
\[
W(V) \le 4\pi\rho(f)^{-1} W(U).
\]
Since the number of kids in this class is at most 2, the sum is not larger than $8\pi\rho(f)^{-1} W(U) < (1/8) W(U)$.

We consider class 4. In this case, let $V_i$, $i = 1, 2, \dots, \ell$, be the kids of this class and let $d = \mathrm{Ord}(U)$. Note that $V_i$, $i = 1, 2, \dots, \ell$, are not special. In case $U$ is special, we have $\ell(V_i) \ge \ell(U) + 1$ for all $i$. Hence, in this case, we can see that the sum $\sum_{i=1}^{\ell} W(V_i)$ is not larger than $\mu\eta(2/\theta + 1) W(U)$ by just the same argument as above for class 2. Let us consider the case that $U$ is not special. Obviously we have
\[
\sum_{i=1}^{\ell} \mathrm{Angle}_d(f^n(V_i)) \le \mathrm{Angle}_d(f^n(U)). \tag{15}
\]

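Since $\mathrm{Angle}_d(f^{n+1}(V_i)) \ge \theta$ for $i = 1, 2, \dots, \ell$, we have $[\mathrm{Angle}_d(f^{n+1}(V_i))/\theta + 1] \le 2\,\mathrm{Angle}_d(f^{n+1}(V_i))/\theta$. By using Lemma 5, we obtain
\begin{align*}
\frac{W(V_i)}{W(U)} &\le \frac{[\mathrm{Angle}_d(f^{n+1}(V_i))/\theta + 1]}{[\mathrm{Angle}_d(f^n(U))/\theta + 1]}\cdot
\frac{\rho(f^n(V_i), f)}{|\det Df_{f^n(V_i)}(f_{V_i}^n(p))|} \\
&\le \frac{[\mathrm{Angle}_d(f^{n+1}(V_i))/\theta + 1]}{[\mathrm{Angle}_d(f^n(U))/\theta + 1]}\cdot
\frac{2\pi\,\mathrm{Angle}_d(f^n(V_i))}{\rho(f)\,\mathrm{Angle}_d(f^{n+1}(V_i))} \\
&\le \frac{4\pi\,\mathrm{Angle}_d(f^n(V_i))}{\rho(f)\,\mathrm{Angle}_d(f^n(U))}.
\end{align*}
Thus we have, from (15),
\[
\sum_{i=1}^{\ell} W(V_i) \le 4\pi\rho(f)^{-1} W(U)
\]
in this case. Summing up all the above arguments for the four classes, we obtain (14) for the case $\mathrm{Ord}(U) \le 1$, if we take $\eta > 0$ so small that
\[
\mu\eta(2/\theta + 1) < 1/8 \tag{16}
\]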

because the sums of modified weights over each of the four classes are smaller than $(1/8) W(U)$.

Finally, let us consider the case $\mathrm{Ord}(U) > 1$. In this case every kid of $U$ should be a daughter. So the number of the kids is at most 2. Since $U$ and its kid $V$ are special, we have
\[
W(V) \le \frac{\rho(f^n(V), f)}{|\det Df_{f^n(V)}(f_V^n(p))|}\, W(U) \le \rho(f)^{-1} W(U).
\]
Therefore the left-hand side of (14) is smaller than $2\rho(f)^{-1} W(U) < (1/2) W(U)$. This completes the proof of Proposition 15. □

Appendix: Other Ergodic Properties of Piecewise Real-Analytic Expanding Maps

Proposition 10 and Theorem 9 imply that, if $n$ is sufficiently large, the iteration $f^n$ satisfies the inequality (9) with $f$ replaced by $f^n$ and the coefficient (M(f^n) + ρ^{-1}(f^n) + ) is smaller than 1. As is pointed out in the papers of Keller [7] and Góra & Boyarski [4], once we get such an inequality, we can derive many properties of the Perron–Frobenius operator $P_f$ and those of the absolutely continuous invariant measures for $f$. In fact, we can apply an ergodic theorem of Ionescu-Tulcea and Marinescu [6] and show that, if we put $BV(U, \mathbb{C}) = \{\xi + \sqrt{-1}\,\eta \mid \xi, \eta \in BV(U)\}$ and $\|\xi + \sqrt{-1}\,\eta\|_{BV(U,\mathbb{C})} = \|\xi\|_{BV(U)} + \|\eta\|_{BV(U)}$,


• the operator $P_f : BV(U) \to BV(U)$ has finitely many eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_r$ of modulus 1, and the corresponding eigenspaces $E_i = \{\varphi \in BV(U, \mathbb{C}) \mid P_f \varphi = \lambda_i \varphi\}$ are of finite dimension,
• the natural extension of $P_f$ to $BV(U, \mathbb{C})$ is written as
\[
P_f = \sum_{i=1}^{r} \lambda_i \Psi_i + Q,
\]
where $\Psi_i$ are projections to $E_i$ and
  – $\limsup_{n\to\infty} \sqrt[n]{\|Q^n\|_{BV(U,\mathbb{C})}} < 1$,
  – $\Psi_i \circ \Psi_j = 0$ for $i \neq j$ and $\Psi_i \circ Q = Q \circ \Psi_i = 0$ for $1 \le i \le r$.

Also one of the eigenvalues $\lambda_i$ must be 1. See [7] for details. Let us assume $\lambda_1 = 1$. Since $P_f$ does not increase the $L^1$-norm of functions and since $BV(U)$ is dense in $L^1(U)$, we can see, by an approximation argument,
• For any $\varphi \in L^1(U)$, the sequence $\frac{1}{n}\sum_{k=0}^{n-1} P_f^k(\varphi)$ converges in $L^1$ to an element in the eigenspace $E_1$ for $\lambda_1 = 1$.

In particular, the density function of each absolutely continuous invariant measure is contained in the finite dimensional space $E_1$. So we have
• there exist only finitely many absolutely continuous ergodic measures $\mu_k$, $1 \le k \le q$, and all other absolutely continuous invariant measures are convex combinations of them.

Furthermore, using the argument in [5,13], we can derive the following ergodic properties of the measures $\mu_k$.
• For each $\mu_k$, there exist a positive integer $p$ and Borel measurable mutually disjoint subsets $C_i$, $1 \le i \le p$, such that $\mu_k(C_i) = 1/p$, $f(C_i) \subset C_{i+1}$ for $1 \le i \le p - 1$, $f(C_p) \subset C_1$ and $f^p|_{C_i}$ are exact.

We refer to [4, Sect. 3], [9] and the references given there for further results.

References
1. Adl-Zarabi, K.: Absolutely continuous invariant measure for piecewise expanding $C^2$ transformations in $\mathbb{R}^n$ on domains with cusps on the boundaries. Ergod. Th. & Dynam. Sys. 16, 1–18 (1996)
2. Buzzi, J.: A.C.I.M.’S for arbitrary expanding piecewise R-analytic mappings of the plane. Ergod. Th. & Dynam. Sys. (to appear)
3. Giusti, E.: Minimal Surfaces and Functions of Bounded Variation. Monographs in Mathematics, Vol. 80, Boston: Birkhäuser, 1984
4. Góra, P., Boyarski, A.: Absolutely continuous invariant measures for piecewise expanding transformations in $\mathbb{R}^N$. Israel J. Math. 67, 272–276 (1989)
5. Hofbauer, F., Keller, G.: Ergodic properties of invariant measures for piecewise monotonic transformations. Math. Z. 180, 119–140 (1982)
6. Ionescu-Tulcea, C., Marinescu, G.: Théorie ergodique pour des classes d’opérations non complètement continues. Ann. Math. (2) 52, 140–147 (1950)
7. Keller, G.: Propriété ergodique des endomorphismes dilatants, $C^2$ par morceaux, des régions bornées du plan. Thesis, Université de Rennes, 1979
8. Keller, G.: Ergodicité et mesures invariantes pour les transformations dilatantes par morceaux d’une région bornée du plan. C. R. Acad. Sci. Paris 289, Série A, 625–627 (1979)
9. Keller, G.: Generalized bounded variation and applications to piecewise monotonic transformations. Z. Wahr. verw. Geb. 69, 461–478 (1985)
10. Lang, S.: Real and Functional Analysis. Graduate Texts in Math. 142, Berlin–Heidelberg–New York: Springer, 1993
11. Lasota, A., Yorke, J.: On the existence of invariant measure for piecewise monotonic transformations. Trans. A.M.S. 186, 481–488 (1973)
12. Rényi, A.: Representation of real numbers and their ergodic properties. Acta Math. Akad. Sc. Hungar. 8, 477–493 (1957)
13. Rychlik, M.: Bounded variation and invariant measures. Studia Math. 76, 69–80 (1983)
14. Tsujii, M.: Piecewise expanding maps on the plane with singular ergodic properties. Ergod. Th. & Dynam. Sys. (to appear)
15. Ziemer, W.: Weakly Differentiable Functions. Graduate Texts in Math. 120, Berlin–Heidelberg–New York: Springer, 1989

Communicated by Ya. G. Sinai

Commun. Math. Phys. 208, 623 – 661 (2000)


© Springer-Verlag 2000

Microlocal Analysis and Interacting Quantum Field Theories: Renormalization on Physical Backgrounds

Romeo Brunetti, Klaus Fredenhagen

Institut für Theoretische Physik, Universität Hamburg, 149 Luruper Chaussee, 22761 Hamburg, Germany

Received: 31 March 1999 / Accepted: 10 June 1999

Dedicated to the memory of Professor Roberto Stroffolini

Abstract: We present a perturbative construction of interacting quantum field theories on smooth globally hyperbolic (curved) space-times. We develop a purely local version of the Stückelberg–Bogoliubov–Epstein–Glaser method of renormalization by using techniques from microlocal analysis. Relying on recent results of Radzikowski, Köhler and the authors about a formulation of a local spectrum condition in terms of wave front sets of correlation functions of quantum fields on curved space-times, we construct time-ordered operator-valued products of Wick polynomials of free fields. They serve as building blocks for a local (perturbative) definition of interacting fields. Renormalization in this framework amounts to extensions of expectation values of time-ordered products to all points of space-time. The extensions are classified according to a microlocal generalization of Steinmann scaling degree corresponding to the degree of divergence in other renormalization schemes. As a result, we prove that the usual perturbative classification of interacting quantum field theories holds also on curved space-times. Finite renormalizations are deferred to a subsequent paper. As byproducts, we describe a perturbative construction of local algebras of observables, present a new definition of Wick polynomials as operator-valued distributions on a natural domain, and we find a general method for the extension of distributions which were defined on the complement of some surface.

Contents
1. Introduction
2. General Theory of Quantized Fields and Microlocal Analysis
   2.1 Wave front sets and Hadamard states for free fields
   2.2 A new construction of Wick polynomials
3. On a Local Formulation of Perturbation Theory
   3.1 Formulation of the local S-matrix
   3.2 Defining properties
4. Inductive Construction up to the Small Diagonal
5. Steinmann Scaling Degree and the Extension of Distributions
   5.1 The scaling degree
   5.2 Extensions of distributions to a point
6. Surfaces of Uniform Singularity and the Microlocal Scaling Degree
   6.1 Scaling degrees at submanifolds
   6.2 Invariance and properties for the scaling degrees
   6.3 Transversal scaling degree
   6.4 Extension of distributions to surfaces
7. Extension to the Diagonal and Renormalization
8. On the Definition of the Net of Local Algebras of Observables
9. Summary and Outlook

1. Introduction The quest of the existence of a non-trivial quantum field theory in four space-time dimensions is still without any conclusive result. Nonetheless, physicists are working daily, with success, on concrete models which describe very efficiently physics at wide energy scales. This description is based on expansions of physical quantities like amplitudes of a scattering process in terms of power series of “physical” parameters, as coupling constants, masses, charges. The higher order terms of these power series are usually ill defined, in a naive approach, but physicists have soon learned how to make sense out of them through the procedure now known as renormalization [51]. The rigorous extension of this procedure to curved Lorentzian space-times will be the main topic of this paper. The question whether the power series approximate the corresponding quantities in a full quantum field theory goes beyond the scope of this paper and will not be touched. The renormalization procedure on Minkowski space-time led to impressive results in the case of quantum electrodynamics [38,62], where observable quantities were calculated and agree with high precision with the experimental values [42]. Based on this example, a general method of renormalization of interacting fields was found and successfully applied to the standard models of elementary particles. There is another approach to quantum field theory (“axiomatic quantum field theory”) which assumes the existence of a class of models satisfying certain first principles. Under this assumption several structural properties could be derived which therefore hold for every model in this class. To name some, CPT and Spin-Statistics Theorems are among the main successes of this line of thought, and nowadays the application of these methods to specific kind of theories, like conformally covariant theories on low dimensional spaces, is expected to give new insights. 
Nevertheless, all these schemes (see for instance [12] for a recent survey), either analytic [55] or algebraic [29], by now seem to have missed the challenge for the concreteness needed by, say, particle physicists. A notable exception is the rigorous formulation of perturbation theory [34,20,53, 64] which may be considered as interpolating between the world of phenomenological physics and the mathematical schemes mentioned above. This point of view has been pioneered, in particular, by K. Hepp [34], who gave solid foundations to perturbation theory for quantum field theories on Minkowski space-time. This philosophy proved to be correct for instance in constructive quantum field theory [26] where rigorous renormalization ideas were used as fundamental inputs.


One of the aims of the authors is to put forward a formulation of perturbation theory which satisfies the needs of axiomatic field theory, much in the sense of [53], and is at the same time applicable to phenomenology. In distinction to earlier approaches we give a purely local formulation which is meaningful also on curved space-times. Our principal result is:

Main Theorem. All polynomially interacting (scalar) quantum field theories on smooth curved globally hyperbolic space-times of dimension four follow the same perturbative classification as on Minkowski space-time.

Before starting the description of our claim we continue the description of the interplay between perturbation theory and rigorous methods. One of the most puzzling things in physics is that all the attempts to include gravity in the renormalization program failed. More recent proposals look for theories of a different kind, like string theory [28] or its generalizations, which are hoped to describe all known forces, or the Ashtekar program [1]. Because of the large difference between the Planck scale ($\approx 10^{-33}$ cm), where “quantum gravity” effects are expected to become important, and the scale relevant for the Standard Model [31] ($\approx 10^{-17}$ cm), a reasonable approximation should be to consider gravity as a classical background field and therefore investigate quantum field theory in curved space-time. This ansatz already led to interesting results, the most famous being the Hawking radiation of black-holes [32]. A look through the literature (see, e.g., [60,3,25]), however, shows that predominantly free field theories were treated on curved physical backgrounds. In fact, most of the papers on interacting quantum field theories on curved space-times deal with the (locally) Euclidean case and discuss renormalization only for particular Feynman diagrams. We are aware of only one attempt at a complete proof of renormalizability, that given by Bunch [13] for the case of the $\lambda\varphi^4$-model.
His attempt was however confined to the rather special case of real analytic space-times which allow an analytic continuation to the (locally) Euclidean situation. It is interesting to note that the main technical tool of his paper is a kind of local Fourier transformation and that some of his mathematical claims can be justified in the framework of Hadamard “parametrices”, both of which belong to the powerful techniques of microlocal analysis that we use in this paper. The situation is then uncertain for general smooth space-times with the Lorentzian signature for the metric. Here, to our knowledge, more or less nothing has been done. Why is the problem of renormalization so difficult on curved space-times? Again the precise perspective gained from the rigorous approach is helpful. The main problem is the absence of translation invariance which in the rigorous schemes plays a decisive rôle. In general, no global (time-like) Killing vector field exists (no energy-momentum operator), so there is no canonical notion of a vacuum state, which is a central object in most formulations of quantum field theory; the spectrum condition (positivity of the energy-momentum operator) can not be formulated. There is no general connection between quantum field theories on Riemannian and Lorentzian space-times corresponding to the Osterwalder-Schrader Theorem [47], and the meaning of path integrals for quantum field theories on curved space-times is unclear. As a result, the rigorous frameworks described before cannot simply be generalized, and the more formal description based on Euclidean ideas and path integrals does not help much. On the other hand, physically motivated by the Einstein Equivalence Principle [59], a quick look at the possible ultraviolet (short distance) divergences indicates that they are of the same nature as on Minkowski space, so no real obstruction for renormalization on


curved space-times is visible. Despite the interest in its own right, renormalization on curved space-time might also trigger a conceptual revisitation of renormalization theory on Minkowski space in the light of the principle of locality [18]. To develop perturbation theory in a form which is suitable to extensions to curved Lorentzian space-times, we mainly rely on a construction given by Epstein and Glaser [20] at the beginning of the seventies and on some improvements suggested later by Stora [54]. This construction makes precise older ideas of Stückelberg [56] and Bogoliubov [5]. In spite of its elegance it was widely ignored (compare, e.g., its neglect in several books on quantum field theory, with the exception of [38]). Recently, it was further developed and applied to gauge field theories by Scharf and his collaborators [50] 1 (after earlier work by [4]). We offer some intuitive explanations of the ideas behind the approach of Stückelberg– Bogoliubov–Epstein–Glaser (we refrain from using an acronym for this and simply call it the Epstein and Glaser approach). For simplicity we discuss the case of flat Minkowskian space-time. The basic idea is that, in the asymptotic past and future, the interacting quantum fields approach, in a sense to be specified, free fields, i.e. fields satisfying linear hyperbolic equations of motion. For free fields there exists a precise construction which can be used for a perturbative description of interacting fields. Now, in a translationally invariant theory, the interacting fields approach the asymptotic free fields only in a rather weak sense (LSZ-asymptotic condition [34]). Moreover, the Haag Theorem [29] forbids the construction of interacting fields in the vacuum Hilbert space of the time-0 free fields. In the Epstein and Glaser scheme, these problems are, in a first step, circumvented by choosing interactions which take place only in a bounded region of space-time. 
Then the scattering operator can be defined in the interaction picture as the time evolution operator from the past, before the interaction was switched on, to the future, after the interaction was switched off. A localized interaction here is thought to be a smooth function of time, $t$, with compact support with values in the local operators associated with the free field. In the simplest case it is $H_{\mathrm{int}}(t) = \varphi(f_t)$, where $\varphi$ is a free field, i.e. an operator-valued distribution on a Hilbert space, and where $f_t(x) = \delta(x^0 - t) f(x)$ for some test function $f$. The S-matrix is then an operator-valued functional $S(f)$ on the test function space. The functional equation for the evolution operator implies a factorization property for the S-matrix if the support of the interaction (as a function of time) consists of disjoint intervals. In the case above with the interaction being a linear function of the free field we even find the factorization,
\[
S(f_1 + f_2) = S(f_1) S(f_2), \tag{1}
\]

whenever there exists some $t \in \mathbb{R}$ such that $\mathrm{supp}(f_1) \subset \{x \mid x^0 \ge t\}$ and $\mathrm{supp}(f_2) \subset \{x \mid x^0 \le t\}$, and where $f_1$ and $f_2$ need not be smooth at the hypersurface $x^0 = t$. This stronger factorization property is not expected to hold for more singular interactions. Instead we require the following consequence of (1),
\[
S(f_1 + f_2 + f_3) = S(f_1 + f_2)\, S(f_2)^{-1} S(f_2 + f_3), \tag{2}
\]
to hold for test functions $f_1, f_2, f_3$, whenever the supports of $f_1$ and $f_3$ can be separated by a Cauchy surface such that $\mathrm{supp}(f_1)$ lies in the future and $\mathrm{supp}(f_3)$ in the past of this surface. Together with the normalization condition $S(0) = 1$ (identity operator on Hilbert space) it implies the first mentioned weaker factorization condition in the case $f_2 = 0$. The functional equation (2) has an interesting property; if $S$ is a functional solving it we get other solutions $S_f$ by defining $S_f(g) = S(f)^{-1} S(f + g)$ (the relative S-matrices) where $f$ is an arbitrary test function. In particular we get commutativity in case $\mathrm{supp}(g_1)$ and $\mathrm{supp}(g_2)$ are space-like separated,
\[
S_f(g_1 + g_2) = S_f(g_2) S_f(g_1) = S_f(g_1) S_f(g_2). \tag{3}
\]

1 Note that the term “finite” in Scharf’s book refers to the fact that in the Epstein and Glaser approach (as in the similar BPHZ method) no regularization is necessary. It does not mean that the indeterminacy connected with the divergence of naive perturbation theory disappears.
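The consequences of (2) quoted in the text can be checked in a few lines. The following is a sketch only: it uses (2), the normalization $S(0) = 1$ and the definition of $S_f$, and it assumes that $\mathrm{supp}(g_1)$ lies in the future and $\mathrm{supp}(g_3)$ in the past of some Cauchy surface. The labels $g_1, g_2, g_3$ are ours, chosen to avoid a clash with the fixed function $f$.

```latex
% Sketch: consequences of the causal factorization (2).
% Assumption: supp(g_1) lies in the future and supp(g_3) in the past
% of some Cauchy surface; f and g_2 are arbitrary test functions.
\begin{align*}
% (a) f_2 = 0 in (2), together with S(0) = 1:
S(g_1 + g_3) &= S(g_1)\, S(0)^{-1} S(g_3) = S(g_1)\, S(g_3). \\
% (b) S_f(g) = S(f)^{-1} S(f+g) solves (2): apply (2) with f_2 = f + g_2.
S_f(g_1 + g_2 + g_3)
  &= S(f)^{-1} S\bigl(g_1 + (f + g_2) + g_3\bigr) \\
  &= S(f)^{-1} S(f + g_1 + g_2)\, S(f + g_2)^{-1} S(f + g_2 + g_3) \\
  &= S_f(g_1 + g_2)\, S_f(g_2)^{-1} S_f(g_2 + g_3). \\
% (c) S_f(0) = S(f)^{-1} S(f) = 1, so (b) with g_2 = 0 gives
S_f(g_1 + g_3) &= S_f(g_1)\, S_f(g_3).
\end{align*}
```

If two supports are space-like separated, one expects (this is the geometric input we assume here) that a Cauchy surface can be chosen with either support in its future, so step (c) applies in both orders and gives the commutativity stated in (3).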

Thus the relative S-matrices satisfy the locality condition required for local observables. They serve as generating functionals for the interacting fields. Unfortunately, a construction of S(f ) in four dimensions is known only in the case of interaction Hamiltonians which are linear or quadratic in the free field, but in two dimensions Wrezinski [63] proved that, at least in the particular case of factorizable f , f (t, x) = g(t)h(x), such a construction is possible for ϕ 4 . One therefore mainly relies on the “infinitesimal” description of the local S-matrix S(g) by studying its formal power series [6] expansion in terms of the “coupling constant” g. The connection with the usual formulation may be done via the adiabatic limit, i.e. the limit for S(g) when g → 1 all over space-time is the S-matrix, or in cases where the limit for S(g) does not exist due to infrared divergences, the limit for the vacuum expectation values of Sg (f ), g → 1, is the generating functional for the time-ordered correlation functions. The description given so far emphasizes the fact that the Epstein and Glaser method is local in spirit, and it might be a favorite candidate for developing a renormalization theory on curved space-times. A closer inspection, however, shows that also in this method translation invariance plays an important rôle, both conceptually and technically, and it will require a lot of work to replace it by other structures. A similar problem was studied by Dosch and Müller [16]. These authors developed the Epstein and Glaser method on Minkowski space for quantum electrodynamics with external time independent electromagnetic fields. Their use of the Hadamard parametrices for the Dirac operator is already much in the spirit of a local formulation of perturbation theory; by the assumption of time independence of the external fields, however, time translation invariance still plays a crucial rôle in their approach. 
As a matter of fact, it will turn out that microlocal analysis [35] is ideally suited to carry through the program where in particular the concept of the wave front set proves to be extremely useful. We note, en passant, that other researchers [15,36,37] had previously used these tools in quantum field theory and that more recently Verch [58] has developed a generalization of the concept of wave front sets which can be applied in algebraic quantum field theory.

This paper is an extended version of a previous one [9] where we sketched the main ideas. Here we give all the necessary mathematical details. The paper is organized as follows: After this introduction, Sect. 2 provides some useful grounding concepts and fixes the notations. Moreover, we present a new construction of Wick polynomials which may be of independent interest. In Sect. 3 we state the first principles by which we build up the perturbative method on smooth curved globally hyperbolic space-times. The most important change w.r.t. the Epstein and Glaser method is a characterization of the singularity structure of the time-ordered numerical distributions replacing translation covariance. In the course of this part we show a local version of the so called “Theorem 0” of Epstein and Glaser which provides the necessary mathematical properties of the building blocks of the construction. In Sect. 4 we start the inductive procedure which aims at constructing the time-ordered functions up to


the small diagonal of the product manifold M n , where all “dangerous” singularities are located. Sections 5 and 6 have a more mathematical flavour; we introduce the concept of scaling degree at a point, following essentially Steinmann [53], and its generalizations in terms of microlocal analysis. The main aim of this section is the description of the extension to all space of distributions defined on the complement of a submanifold. These tools are needed for the classification and implementation of renormalizability. The next, Sect. 7, contains the end of the inductive procedure by which we prove the theories with polynomial interactions to follow the same perturbative classification as on Minkowski space-time. We emphasize that the method of defining the local S-matrix joins perturbation theory with the more abstract algebraic formulation of quantum field theory. In fact, we are able to define a unique family (net, precosheaf, etc.) of ∗-algebras of observables on globally hyperbolic space-times via the idea of the local relative S-matrices. Sect. 8 describes this construction which seems to be widely unknown, in spite of the fact that it may already be found, in a preliminary form, in [52]. This section partly justifies the rather abstract starting point of Sect. 2. An outlook, Sect. 9, concludes the paper. Finally, we stress that the procedure works for general field theories but for simplicity we stick to the notationally easiest case of a single scalar (massive) field theory with self interactions without derivatives. 2. General Theory of Quantized Fields and Microlocal Analysis In order to fix our notations we recall some basic geometrical concepts. Further details may be found in some books on general relativity and Lorentzian geometry (see, for instance, [59] and [2]). 
We shall work on a space-time $(M, g)$, where by this we mean that $M$ is a connected, Hausdorff, boundaryless topological space of pure dimension $d \ge 2$ which (i) is paracompact, (ii) is equipped with a smooth structure, (iii) is endowed with a Lorentzian metric $g_{ab}$, i.e. a smooth 2-cotensor of signature $(1, d - 1)$, i.e. $(+, -, \cdots, -)$ and (iv) is oriented and time oriented. Given the metric we have a canonically associated derivative, namely the Levi–Civita derivative denoted by $\nabla$, and an associated curvature tensor $R^a{}_{bcd}$, with $R$ the scalar curvature. The notion of totally geodesic submanifold, i.e., one for which all tangential geodesics stay on the submanifold, is used in Sect. 6.

Some words on notations. Sometimes we write a zero section of a vector bundle $B$ as $\{0\}$; at other times, to make precise that it belongs to that bundle, we shall write $Z(B)$. However, in order to avoid any abuse, we use the notation $\dot{B}$ to denote the bundle deprived of its zero section, i.e., $B \setminus Z(B)$. We shall also use the notation $M^n$ whenever we treat the $n$th order cartesian product of a manifold $M$, and by $\Delta_K$, where $K \subseteq \{1, \dots, n\}$, we mean the (smooth, closed) submanifold of $M^n$ for which any of its points $(p_1, \dots, p_n)$ are such that $p_{k_1} = p_{k_2}$ for any pair $k_1 \neq k_2$ in $K$.

The causality principle plays a crucial rôle in our construction. Therefore we restrict our space-times to be globally hyperbolic. This means that $M$ is homeomorphic to $\mathbb{R} \times \Sigma$, where $\Sigma$ is a $(d - 1)$-dimensional topological submanifold of $M$ and for each $t \in \mathbb{R}$, $\{t\} \times \Sigma$ is a (spacelike) Cauchy surface. A Cauchy surface is a subset of $M$ which every inextendible non-spacelike curve intersects exactly once. Given a subset $S$ of $M$ we define the causal future/past sets $J^{\pm}(S)$ as the subsets of $M$ which consist of all points $p \in M$ for which there exists some point $s \in S$ connected to $p$ by a non-spacelike future/past directed curve. If $M$ is globally hyperbolic, the set $J^{+}(p) \cap J^{-}(q)$ is compact for any pair $p, q \in M$.
Finally, if p ∈ M then the induced metric on the tangent


space $T_p M$ and cotangent space $T_p^* M$ are Minkowskian, and we define the future/past light-cones $V_{\pm}$ over these spaces (based on $p$) in the usual way. Quantum field theories on more general spaces pose consistency problems (see, e.g., Hawking’s “Chronology Protection Conjecture” [33] or the divergence of the energy momentum tensor at the Cauchy horizon observed in [40]). We remark, however, that since our constructions will be purely local, one can as well consider a globally hyperbolic submanifold of any Lorentzian space-time. In many concrete cases, exact solutions of the Einstein equation, like Minkowski, de Sitter, Schwarzschild, are real analytic. In these cases some of our results might be sharpened by working with the analytic version of microlocal analysis [44]. In this respect, we should mention some recent results of Bros, Epstein and Moschella for a Gårding–Wightman-like description of quantum field theories on de Sitter space-time [7] where analytic function techniques play a major rôle.

2.1. Wave front sets and Hadamard states for free fields. For the (massive) free field $\varphi$ satisfying the (generalized) Klein–Gordon equation of motion,
\[
(\Box_g + m^2 - \kappa R)\varphi = 0, \tag{4}
\]

where $\Box_g$ is the d’Alembertian (or Laplace–Beltrami) operator w.r.t. the Lorentzian metric g, m ≥ 0 and κ ∈ R, one may associate an algebra of observables defined in the following way: Let $E_{\rm ret}$ resp. $E_{\rm adv}$ be the retarded resp. advanced Green functions of the Klein–Gordon operator, which are uniquely defined on globally hyperbolic space-times, and let $E = E_{\rm ret} - E_{\rm adv}$. Then we consider the unital ∗-algebra A which is generated by the symbols ϕ(f), f ∈ D(M) (space of complex-valued smooth and compactly supported functions), with the following relations:

1. The map $f \mapsto \varphi(f)$ is linear,
2. $\varphi(f)^* = \varphi(\bar f)$,
3. $[\varphi(f), \varphi(g)] = iE(f \otimes g)\mathbf{1}$, ∀f, g ∈ D(M),
4. $\varphi((\Box_g + m^2 - \kappa R)f) = 0$, ∀f ∈ D(M),

where the symbol [ϕ(f), ϕ(g)] stands for ϕ(f)ϕ(g) − ϕ(g)ϕ(f) and $\bar f$ denotes the complex conjugate of f. A state is, by definition, a linear functional ω on A (the expectation value) which is positive (i.e. $\omega(a^*a) \geq 0$) and normalized (ω(1) = 1). It is uniquely determined by a sequence of multilinear functionals $\omega_n$, n = 0, 1, . . . (the n-point functions) on the test function space D(M),

$$\omega_n(f_1, \dots, f_n) = \omega(\varphi(f_1) \cdots \varphi(f_n)). \tag{5}$$

We only consider states whose n-point functions are distributions and restrict furthermore our attention to the states called quasi-free, namely, those states whose only non-trivial n-point functions have n even and are generated in terms of the 2-point functions (see, e.g. [29]). Among them a distinguished class is formed by the so-called Hadamard states (see, e.g. [14,41]). They are thought to be the appropriate analogue of the concept of the vacuum which has no direct counterpart on generic space times. In fact, they are quasifree states whose 2-point functions have a prescribed short-distance behaviour which is partially motivated by the fact that it allows the definition of the expectation value of the energy-momentum tensor (see, e.g. [60]). As first observed by Radzikowski [49] the 2-point functions of Hadamard states can be characterized in terms of their wave front set.
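To make the phrase "generated in terms of the 2-point functions" concrete, we recall the standard pairing formula for quasi-free states. It is not written out in the text above and is quoted here only as an illustration; the sum runs over all ways of grouping the 2n arguments into ordered pairs, with the order of the arguments preserved inside each factor.

```latex
\omega_{2n+1} = 0, \qquad
\omega_{2n}(f_1,\dots,f_{2n})
   = \sum_{\text{pairings}} \prod_{k=1}^{n} \omega_2(f_{i_k}, f_{j_k}),
   \qquad i_k < j_k ,
\\[4pt]
\omega_4(f_1,f_2,f_3,f_4)
   = \omega_2(f_1,f_2)\,\omega_2(f_3,f_4)
   + \omega_2(f_1,f_3)\,\omega_2(f_2,f_4)
   + \omega_2(f_1,f_4)\,\omega_2(f_2,f_3).
```

In particular the 4-point function is a sum of three terms, one for each pairing of four arguments.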


To discuss this characterization we need to enter into the realm of microlocal analysis. We give some motivations for the basic notions of wave front sets and present those basic results which are used throughout the paper. We leave to the reader the task of looking further into the large existing literature [35]. Physicists might, for concreteness, start from the well-written short exposition of Junker in [39], where they can find definitions and results about pseudodifferential operators, which we hold as known.

We shall denote by $\mathcal E(\mathbb R^n)$ the space of complex-valued smooth functions and by $\mathcal E'(\mathbb R^n)$ its dual space, i.e., the space of compactly supported distributions. It is a standard result in distribution theory that $u \in \mathcal E'(\mathbb R^n)$ is a smooth function iff its Fourier transform $\hat u$ decays rapidly in Fourier dual space $\mathbb R^n$, i.e. for any integer N there exists a constant $C_N$ such that $|\hat u(k)| \leq C_N (1 + |k|)^{-N}$ for all $k \in \dot{\mathbb R}^n$, where $\dot{\mathbb R}^n \doteq \mathbb R^n \setminus \{0\}$. In case u is not smooth the Fourier transform may still rapidly decay in certain directions. We may describe this set of directions by an open cone in $\dot{\mathbb R}^n$ and define $\Sigma(u)$ as its complement in $\dot{\mathbb R}^n$. It is easy to see that $\Sigma(\phi u) \subset \Sigma(u)$ when $u \in \mathcal E'(\mathbb R^n)$ and $\phi \in \mathcal D(\mathbb R^n)$. This property suggests a strategy for the general case in which u is not of compact support. So, considering $u \in \mathcal D'(\mathbb R^n)$ and a point $x \in \mathbb R^n$ in the support of u, supp(u) ⊂ O, O open subset of $\mathbb R^n$, we first localize u via multiplication with some $\phi \in \mathcal D(\mathbb R^n)$ such that $\phi(x) \neq 0$ and then consider the Fourier transform of $\phi u$, now a distribution of compact support. We then define the set $\Sigma_x(u) \doteq \bigcap_\phi \Sigma(\phi u)$, where the intersection is taken w.r.t. all smooth functions of compact support φ such that $\phi(x) \neq 0$. This may be called the set of singular directions of u over x. It is empty whenever $x \notin \mathrm{singsupp}(u)$. Hence, finally, we define the (smooth) wave front set for $u \in \mathcal D'(\mathbb R^n)$ as

$$\mathrm{WF}(u) \doteq \{(x, k) \in \mathbb R^n \times \dot{\mathbb R}^n \mid k \in \Sigma_x(u)\}.$$

This set is readily seen to be closed and conic, where the last means that if $k \in \Sigma_x(u)$ then so does λk for all λ > 0. It is now crucial that the notion of the wave front set can be lifted to any smooth manifold M, where it is invariantly defined as a subset of the cotangent bundle $\dot T^*M$. This covariance under coordinate transformations is what gives the definition its real technical power. Among the results which will be important for us we mention that derivatives do not enlarge the wave front set of a distribution, i.e. WF(∂u) ⊆ WF(u), and the following criterion, called the Hörmander criterion for multiplication of distributions:

M1. Product. Picking two distributions $u_1, u_2 \in \mathcal D'(M)$, the pointwise product $u_1 u_2$ exists as a bona-fide distribution whenever $\mathrm{WF}(u_1) + \mathrm{WF}(u_2)$ does not intersect the zero section $Z(T^*M)$, i.e., if for all covectors $k_i \in \mathrm{WF}(u_i)$, i = 1, 2, based over the same point one finds that $k_1 + k_2 \neq 0$. Moreover, if $\mathrm{WF}(u_i) \subset \Gamma_i$, i = 1, 2, then $\mathrm{WF}(u_1 u_2) \subset \Gamma_1 \cup \Gamma_2 \cup (\Gamma_1 + \Gamma_2)$.

We shall also refer frequently to a certain continuity property in microlocal analysis which in the body of the paper is sometimes called “Hörmander (pseudo) topology”. It has to do with the notion of convergent sequences which also respect wave front set properties:

M2. Continuity. Let $\mathcal D'_\Gamma(M) \doteq \{v \in \mathcal D'(M) \mid \mathrm{WF}(v) \subset \Gamma\}$, where Γ is a closed conic set in $\dot T^*M$. A sequence $\{u_i\}_{i \in \mathbb N} \subset \mathcal D'_\Gamma(M)$ converges to $u \in \mathcal D'_\Gamma(M)$ in the sense of the Hörmander (pseudo) topology whenever the following two properties hold true: (a) $u_i \to u$ weakly∗ (i.e. in $\mathcal D'(M)$), (b) for any properly supported pseudodifferential operator A such that $\mu\,\mathrm{supp}(A) \cap \Gamma = \emptyset$, we have that $Au_i \to Au$ in the sense of $\mathcal E(M)$. ($\mu\,\mathrm{supp}(A)$ is the projection onto the second component of the wave front set of the Schwartz kernel of A.)
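As an illustration of the definition of WF(u) and of criterion M1, the following two textbook examples on $\mathbb R$ may be helpful. They are not part of the original argument, and the signs depend on the Fourier convention, which we assume here to be $\hat u(k) = \int u(x)\,e^{-ikx}\,dx$.

```latex
% Example 1: the delta distribution at the origin.
\widehat{\phi\delta}(k) = \phi(0) \quad \text{for all } k
\;\Longrightarrow\;
\Sigma_0(\delta) = \dot{\mathbb R}, \qquad
\mathrm{WF}(\delta) = \{(0,k) \mid k \neq 0\}.

% Example 2: the boundary value u_+ = (x + i0)^{-1}.
\frac{1}{x + i\epsilon} = -i \int_0^\infty e^{ikx}\, e^{-\epsilon k}\, dk
\;\Longrightarrow\;
\hat u_+(k) = -2\pi i\, \theta(k), \qquad
\mathrm{WF}(u_+) = \{(0,k) \mid k > 0\}.

% Criterion M1 applied to squares:
% for \delta the covectors k and -k over x = 0 add up to zero, so M1 does not apply to \delta^2;
% for u_+ one always has k_1 + k_2 > 0, so u_+^2 exists and
\mathrm{WF}(u_+^2) \subset \{(0,k) \mid k > 0\}.
```

The second example shows the mechanism used repeatedly below: a wave front set confined to a one-sided (convex) cone can never produce a vanishing sum of covectors.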


A last property is connected with the sequential continuity, in the sense of M2, of the operation of restriction of a distribution to a submanifold:

M3. Trace. Let N ⊂ M denote a submanifold, and let $u \in \mathcal D'(M)$. Then u can be restricted to the submanifold N whenever WF(u) does not intersect the conormal bundle $N^*N$ of N. Moreover, if WF(u) ⊂ Γ, with Γ a closed conic set such that $\Gamma \cap N^*N = \emptyset$, then the operator of restriction (trace) γ can be lifted as a sequentially continuous operator, in the sense of M2, from $\mathcal D'_\Gamma(M)$ to $\mathcal D'(N)$.

For later purposes, it is convenient to have a coordinate dependent formulation of M2(b) using Fourier transforms. Namely, let $x_0 \in M$ and let V be an open conical neighbourhood of $\Gamma_{x_0}$, where the last denotes the set of covectors in Γ associated to the point $x_0$. Choose a chart (ϕ, U) at $x_0$ such that $\Gamma_x \subset V$ for all x ∈ U. Let $\chi \in \mathcal D(U)$ with $\chi(x_0) \neq 0$. Then the Fourier transform of χu, $u \in \mathcal D'_\Gamma(M)$, is strongly decreasing in the complement of V, and

$$\sup_{k \notin V} \big|(\widehat{\chi u_i} - \widehat{\chi u})(k)\big| (1 + |k|)^N \to 0, \tag{6}$$

for all N ∈ N if $u_i \to u$ in $\mathcal D'_\Gamma(M)$. If, conversely, the above convergence holds true for all choices of $x_0$, V, (ϕ, U) and χ, we obtain M2(b).

After this digression into microlocal analysis we briefly describe Radzikowski’s characterization of Hadamard states [49]. The idea is to use wave front sets for a formulation of a spectral condition. The antisymmetric part of the 2-point function is the commutator function E. Its wave front set is

$$\mathrm{WF}(E) = \{(x, k; x', -k') \in \dot T^*M^2 \mid (x, k) \sim (x', k')\}. \tag{7}$$

Here the equivalence relation ∼ means that there exists a null geodesic from x to x′ such that k is coparallel to the tangent vector of the geodesic and k′ is its parallel transport from x to x′. For coinciding points, the relation is defined as consisting of the degenerate (i.e., only one point) geodesic at x = x′ which has covector k still along the boundary of the light-cone and k′ ≡ k. We remark, for a later purpose, that since only light-like covectors are present, one can restrict E and, whenever local coordinates are chosen, its derivative w.r.t. time $\dot E$ to any spacelike Cauchy hypersurface.

As a result of [49,43], the 2-point function of a Hadamard state has a wave front set which is just the positive frequency part of WF(E),

$$\mathrm{WF}(\omega_2) = \{(x, k; x', -k') \in \mathrm{WF}(E) \mid k \in V_+\}. \tag{8}$$

Since (8) restricts the singular support of ω2 (x1 , x2 ) to points x1 and x2 which are null related, ω2 is smooth for all other points. The smoothness for space-like related points is known to be true for quantum field theories on Minkowski space satisfying the spectrum condition by the Bargmann–Hall–Wightman Theorem [55]. For time-like related points, however, a similar general prediction on the smoothness does not exist. Another deep result from Radzikowski [49] shows that the Duistermaat–Hörmander [17] distinguished parametrices for the Klein–Gordon equation are nothing else than the (Stückelberg–)Feynman–anti-Feynman propagators (up to C ∞ ) for quasi-free Hadamard states. We recall that the time-ordered 2-point function EF arising from ω2 is given by iEF (x1 , x2 ) = ω2 (x1 , x2 ) + Eret (x1 , x2 ).


Its wave front set [49] is $\mathrm{WF}(E_F) = O \cup D$, where the off-diagonal piece is given by

$$O = \{(x, k; x', -k') \in \dot T^*M^2 \mid (x, k) \sim (x', k'),\ x \neq x',\ k \in V_\pm \text{ if } x \in J^\pm(x')\},$$

and the diagonal one by

$$D = \{(x, k; x, -k) \in \dot T^*M^2 \mid x \in M,\ k \in \dot T_x^*M\}.$$

Now, one can see why in naive perturbation theory we may find divergences. Indeed, the perturbative expansion in terms of Feynman graphs in position space leads to pointwise products of Feynman propagators. But these products do not satisfy the Hörmander criterion for multiplication of distributions, since covectors based on the diagonal piece D can add up to zero.

2.2. A new construction of Wick polynomials. In a previous paper [8] we constructed Wick polynomials as operator-valued distributions. We considered a fixed Hadamard state ω and the induced GNS representation $(\mathcal H_\omega, \pi_\omega, \Omega_\omega)$ for the ∗-algebra A and found the Wick polynomials as operator-valued distributions on the dense cyclic domain generated by $\Omega_\omega$. We recall that by a GNS triple $(\mathcal H_\omega, \pi_\omega, \Omega_\omega)$ we mean a complex Hilbert space $\mathcal H_\omega$, a representation $\pi_\omega$ of A by unbounded operators on $\mathcal H_\omega$, and a cyclic vector $\Omega_\omega \in \mathcal H_\omega$ representing the state ω, for which one has the connection equation $\omega(A) = (\Omega_\omega, \pi_\omega(A)\Omega_\omega)$, ∀A ∈ A.

The dependence of the construction of Wick polynomials on the choice of the Hadamard state led to two problems. The first one is due to the convention that the expectation value of a Wick polynomial vanishes in the chosen Hadamard state. Other choices lead to a finite redefinition, a problem well known from the definition of the expectation value of the energy momentum tensor [61]. Since we shall not discuss finite renormalizations in this paper, we do not treat this problem at the moment. The other problem is of a more technical nature: the smeared Wick polynomials are unbounded operators. We know from the work of Verch [57] that, locally, i.e. in bounded regions of space-time, different Hadamard states lead to equivalent representations. But this theorem does not guarantee that the domains of definition for different choices of the cyclic vector coincide. We therefore give here a new definition which depends only on the representation but not on the cyclic vector. Its restriction to the cyclic subspaces coincides with the previous definition.

It is well known that the operators ϕ(f) (now representatives under $\pi_\omega$ of the abstract elements of the ∗-algebra A in Subsect. 2.1) for a real valued test function are essentially self-adjoint on the cyclic domain generated by $\Omega_\omega$, and that the Weyl operators $W(f) = \exp(i\varphi(f)^{**})$ (where now the ∗-operation denotes the Hilbert space adjoint) satisfy the Weyl relation $W(f)W(g) = \exp(-\tfrac{i}{2}E(f \otimes g))\,W(f + g)$. The expectation value in the given Hadamard state is

$$\omega(W(f)) = \exp\big(-\tfrac{1}{2}\,\omega_2(f, f)\big). \tag{9}$$
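The Weyl relation quoted above can be checked formally from relation 3 of Subsect. 2.1. The following computation is only a heuristic sketch: it treats ϕ(f) and ϕ(g) as if the Baker–Campbell–Hausdorff formula applied directly, whereas a rigorous statement requires the closures $\varphi(f)^{**}$.

```latex
% BCH for a central commutator [A,B]:
e^{A} e^{B} = e^{A + B + \frac{1}{2}[A,B]} .

% Take A = i\varphi(f), B = i\varphi(g) and use [\varphi(f),\varphi(g)] = iE(f\otimes g)\mathbf{1}:
[A,B] = -[\varphi(f),\varphi(g)] = -iE(f\otimes g)\,\mathbf{1},

% hence
W(f)\,W(g) = e^{i\varphi(f)}\, e^{i\varphi(g)}
           = \exp\!\big(-\tfrac{i}{2}E(f\otimes g)\big)\; e^{i\varphi(f+g)}
           = \exp\!\big(-\tfrac{i}{2}E(f\otimes g)\big)\; W(f+g).
```

The linearity of $f \mapsto \varphi(f)$ is used in the last step.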


Let $:W(f): \,\doteq \exp(\tfrac{1}{2}\omega_2(f, f))\,W(f)$, and define for $\Psi \in \mathcal H_\omega$ the vector-valued function $\Psi(f) = \,:W(f):\Psi$.

Definition 2.1. We say that Ψ(f) is infinitely often differentiable at f = 0 if there exists for every integer n ≥ 0 a symmetrical vector-valued distribution $\delta^n\Psi/\delta f^n$ on $\mathcal D(M^n)$, and continuous seminorms $p_n$ on the test function space $\mathcal D(M)$ with $p_{n+1} \geq p_n$, such that (a) if $p_0(h) = 0$ then Ψ(h) = Ψ(0), (b) if h → 0 with $p_n(h) \neq 0$, then

$$\Big\| \Psi(h) - \sum_{l=0}^{n} \frac{1}{l!}\, \frac{\delta^l \Psi}{\delta f^l}(h^{\otimes l}) \Big\|\; p_n(h)^{-n} \longrightarrow 0,$$

where ‖ · ‖ stands for the Hilbert space norm in $\mathcal H_\omega$.

The kernel of the functional derivative can be written,

$$\frac{\delta^n \Psi}{\delta f(x_1) \cdots \delta f(x_n)} = i^n :\varphi(x_1) \cdots \varphi(x_n): \Psi. \tag{10}$$

The right-hand side of Eq. (10) defines what is called a Wick monomial. We want to find those vectors on which the Wick monomials can be restricted to partial diagonals. In view of the criterion M3 for the restriction of distributions, we define as the microlocal domain of smoothness the following set:

$$D = \Big\{ \Psi \in \mathcal H_\omega \ \Big|\ \Psi(f) \text{ is infinitely often differentiable at } f = 0, \text{ and for every } n \in \mathbb N \text{ the wave front set of } \frac{\delta^n \Psi}{\delta f^n} \text{ is contained in the set } \{(x_1, k_1; \dots; x_n, k_n) \in \dot T^*M^n \mid k_i \in V_-,\ i = 1, \dots, n\} \Big\}. \tag{11}$$

The vector-valued distributions (10) with Ψ ∈ D can be restricted to all partial diagonals, and give all possible Wick polynomials. Moreover, according to M1, they may also be multiplied by distributions whose wave front sets do not contain elements $(x_1, k_1; \dots; x_n, k_n) \in \dot T^*M^n$ with $k_i \in V_+$, i = 1, . . . , n. The domain D is invariant under application of Weyl operators and smeared Wick polynomials. We state a crucial property:

Lemma 2.2. Let $\Phi \in \mathcal H_\omega$ induce some quasi-free Hadamard state ω′. Then Φ ∈ D.

Proof. The main point rests on the validity of the Leibniz rule. Indeed, we can write,

$$\Phi(f) = \exp\big(\tfrac{1}{2}(\omega_2(f, f) - \omega_2'(f, f))\big)\, \exp\big(\tfrac{1}{2}\omega_2'(f, f) + i\varphi(f)^{**}\big)\Phi,$$

and differentiate w.r.t. f. The general nth order derivative gives

$$\frac{\delta^n \Phi}{\delta f^n}(h^{\otimes n}) = \sum_{\substack{I \subseteq \{1, \dots, n\} \\ |I| \text{ even}}} \chi(h, h)^{|I|/2}\; \frac{\delta^{|I^c|} \Phi'}{\delta f^{|I^c|}}\big(h^{\otimes |I^c|}\big), \tag{12}$$


where $\Phi'(f) = \exp(\tfrac{1}{2}\omega_2'(f, f) + i\varphi(f)^{**})\Phi$, and $\chi = \omega_2 - \omega_2'$ is a smooth function on $M^2$ and a solution of the Klein–Gordon equation in both entries. Now, Φ′(h) satisfies the estimate in Definition 2.1 with the seminorms $p_n'(h) = \sqrt{\omega_2'(h, h)}$ for all n, and the numerical prefactor with the seminorms

$$p_n(h) = \sqrt{(\omega_2 + \omega_2')(h, h)},$$

hence for the whole expression we may also use the seminorms $p_n$. Actually, as was shown by Verch in [57], there exist two positive constants A and B such that $A\,\omega_2(f, g) \leq \omega_2'(f, g) \leq B\,\omega_2(f, g)$, f, g ∈ D(M), hence all these seminorms are equivalent. We conclude that Φ(f) is infinitely differentiable at f = 0. The wave front sets of the functional derivatives of Φ(f) and Φ′(f) coincide, since χ is smooth. Using the formula

$$\Big\| \frac{\delta^n \Phi'}{\delta f^n}(h^{\otimes n}) \Big\|^2 = n!\; \omega_2'(h, h)^n, \tag{13}$$

and the information on the wave front set of Hadamard states, we find

$$\mathrm{WF}\Big(\frac{\delta^n \Phi}{\delta f^n}\Big) = \{(x_1, k_1; \dots; x_n, k_n) \in \dot T^*M^n \mid k_i \in \partial V_-,\ i = 1, \dots, n\}, \tag{14}$$

so Φ ∈ D. □

We use the so-called (generalized) Wick expansion formula, which is the basic combinatorial formula for perturbation theory. Let us denote by L any Wick polynomial on $D \subset \mathcal H_\omega$, and we define the “derivative” of a Wick polynomial with respect to ϕ to be ∂L/∂ϕ. We can characterize it by the following result:

Lemma 2.3. There is a unique Wick polynomial ∂L/∂ϕ which satisfies the equation,

$$[L(x), \varphi(y)] = \frac{\partial L}{\partial \varphi}(x)\, iE(x, y), \tag{15}$$

in the sense of operator-valued distributions on D.

Proof. By linearity it is sufficient to prove it for any Wick monomial, e.g. $:\varphi^n(x):$. It is obvious that $n :\varphi^{n-1}(x):$ satisfies (15), hence we only need to prove that if A is any Wick polynomial for which A(x)E(x, y) = 0, then A ≡ 0. But this follows from the fact that for every x ∈ M we can find some test function f such that the (smooth) solution E(x, f) of the Klein–Gordon equation does not vanish at x. □

Now, let us consider, for any Wick polynomial L, the fields $L^{(j)} = \partial^j L/\partial \varphi^j$, j ∈ N, which, by induction, are uniquely defined according to the previous lemma.
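As a small worked instance of Lemma 2.3 (added here for illustration, using only relation 3 of Subsect. 2.1 and a point-splitting definition of $:\varphi^2:$, which we assume), take $L = \,:\varphi^2:$.

```latex
% Point-split form: :\varphi^2(x): = \lim_{x' \to x} \big(\varphi(x)\varphi(x') - \omega_2(x,x')\mathbf{1}\big).
[\varphi(x)\varphi(x'), \varphi(y)]
   = \varphi(x)\, iE(x',y) + iE(x,y)\, \varphi(x')
\;\xrightarrow{\;x' \to x\;}\;
[\,:\varphi^2(x):\,, \varphi(y)] = 2\varphi(x)\, iE(x,y).

% Comparing with Eq. (15):
L^{(1)} = 2\varphi, \qquad L^{(2)} = 2\cdot\mathbf{1}, \qquad L^{(j)} = 0 \quad (j \geq 3).

% More generally, for L = :\varphi^n: one finds
L^{(j)} = \frac{n!}{(n-j)!}\, :\varphi^{\,n-j}: \qquad (0 \leq j \leq n).
```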


Theorem 2.4 (Generalized Wick expansion Theorem). Let $L_k$, k = 1, . . . , n, be Wick polynomials. The following relation holds:

$$L_1(x_1) \cdots L_n(x_n) = \sum_{j_1, \dots, j_n} \big(\Omega_\omega,\ L_1^{(j_1)}(x_1) \cdots L_n^{(j_n)}(x_n)\,\Omega_\omega\big) \times \frac{:\varphi^{j_1}(x_1) \cdots \varphi^{j_n}(x_n):}{j_1! \cdots j_n!}, \tag{16}$$

where the summations over the $j_k$’s go from 0 to the order of the corresponding $L_k$. For a proof see [34] or just use the previous notion of differentiability and apply induction. Note that the products in the theorem above exist because the wave front sets of their expectation values satisfy the Hörmander criterion M1 due to the convexity of the forward light cone.

The wave front sets for the Wightman distributions of Wick polynomials may be larger than those of the corresponding distributions for the free field ϕ. Consider as an example the 2-point Wightman function for the Wick monomial $:\varphi^2(x):$, i.e. $(\Omega_\omega, :\varphi^2(x_1)::\varphi^2(x_2):\Omega_\omega)$. According to the theorem, this is equal to $2\,\omega_2(x_1, x_2)^2$. This product exists according to the Hörmander criterion, and its wave front set is contained in $(\mathrm{WF}(\omega_2) + \mathrm{WF}(\omega_2)) \cup \mathrm{WF}(\omega_2) \doteq z_2$. The set $z_2$ will be instrumental for some results below. Now, $\mathrm{WF}(\omega_2) + \mathrm{WF}(\omega_2)$ contains directions which lie inside the light-cone, as is clear by adding up two covectors $k_1 + k_2$ for points on the diagonal. One thus sees how already the smallest possible non-linearity may give rise to additional singular directions w.r.t. those already present in the wave front sets of the Wightman functions for the original field ϕ. Another important remark is that $z_2$ is an involutive closed cone, i.e. a closed cone which is stable under sums, and, as a straightforward consequence, $(\omega_2(x, y))^n$ still has $\mathrm{WF}(\omega_2^n) \subset z_2$.

The general structure for multi-point expectation values of Wick polynomials can be found as follows. For a more compact notation some definitions from graph theory are used: Let $G_n$ denote the set of all finite nonoriented graphs with vertices V = {1, . . . , n} and let $E^G$ denote the set of edges of a given graph G. Moreover, for any vertex i ∈ V we denote by $E_i^G$ the subset of edges which belong to the vertex i, possibly empty, and by $|E_i^G|$ their number, and similarly by $E_{ij}^G$ the subset of edges connecting points i and j, with the obvious relation $|E_i^G| = \sum_j |E_{ij}^G|$. For any edge $e \in E^G$ connecting points i and j we use the “source and range” notation, i.e. i = s(e) and j = r(e), whenever i < j.

It is sufficient, by linearity, to restrict ourselves to the treatment of products of Wick monomials. Indeed, let us denote by $\omega_n^{m_1, \dots, m_n}$ the expectation value, w.r.t. the GNS-vector $\Omega_\omega$ for a quasi-free Hadamard state ω, of the product of Wick monomials $:\varphi^{m_1}(x_1): \cdots :\varphi^{m_n}(x_n):$, and define as $\hat G_n(m_1, \dots, m_n)$ the set of all graphs G for which all vertices j with $m_j$ edges are saturated, i.e. $|E_j^G| = m_j$. Moreover, following [8], we call a triple (x, γ, k) an immersion of any graph $G \in G_n$ into the manifold M whenever, (a) x : V → M is a map from all vertices i of G to points $x_i$ of M; (b) γ maps edges $e \in E^G$ to null geodesics $\gamma_e$ connecting points $x_{s(e)}$ and $x_{r(e)}$; (c) k maps edges $e \in E^G$ to future directed covector fields $k_{\gamma(e)} \equiv k_e$ which are coparallel to the tangent vector $\dot\gamma_e$


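of the null geodesic. Hence, generalizing the set $z_2$ above by

$$z_n \doteq \Big\{ (x_1, k_1; \dots; x_n, k_n) \in \dot T^*M^n \ \Big|\ \exists\, G \in \hat G_n(m_1, \dots, m_n) \text{ and an immersion } (x, \gamma, k) \text{ of } G \text{ such that } k_i = \sum_{\substack{e \in E^G \\ s(e) = i}} k_e(x_i) - \sum_{\substack{e \in E^G \\ r(e) = i}} k_e(x_i) \Big\}, \tag{17}$$

we have:

Proposition 2.5. The wave front set of $\omega_n^{m_1, \dots, m_n}$ is geometrically bounded as

$$\mathrm{WF}(\omega_n^{m_1, \dots, m_n}) \subset z_n. \tag{18}$$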

Proof. Let us define for notational purposes $\hat G_n \doteq \hat G_n(m_1, \dots, m_n)$. As follows from Theorem 2.4 the expectation value is

$$\omega_n^{m_1, \dots, m_n}(x_1, \dots, x_n) = \sum_{G \in \hat G_n} \omega_n^G(x_1, \dots, x_n) \doteq \sum_{G \in \hat G_n} \prod_{e \in G} \omega_2(x_{s(e)}, x_{r(e)}). \tag{19}$$

Considering one graph G in this sum we see that, in explicit form,

$$\omega_n^G(x_1, \dots, x_n) \doteq \prod_{e \in G} \omega_2(x_{s(e)}, x_{r(e)}) = \omega_2(x_1, x_2)^{|E^G_{1,2}|}\, \omega_2(x_1, x_3)^{|E^G_{1,3}|} \cdots \omega_2(x_{n-1}, x_n)^{|E^G_{n-1,n}|}.$$

In the last equality the 2-point distributions should be understood as distributions on the product manifold $M^n$. Hence their wave front set is given, when i < j, by

$$\mathrm{WF}\big(\omega_2^{|E^G_{i,j}|}\big) \subset \{(x_1, 0; \dots; x_i, k_{i,j}; \dots; x_j, -k_{i,j}; \dots; x_n, 0) \mid (x_i, k_{i,j}; x_j, -k_{i,j}) \in z_2\}.$$

It is straightforward to see from the last expression that the claim of the proposition is correct. Indeed a general covector $k_i$ will have the following expression,

$$k_i = -k_{1,i} - \cdots - k_{i-1,i} + k_{i,i+1} + \cdots + k_{i,n}, \tag{20}$$

where some of the $k_{l,m}$ may be zero. Now, as follows from Eq. (8), to any edge e which consists of a pair of joined vertices i, j in the graph G there exist points $x_i$, $x_j$ on the manifold and a null geodesic $\gamma_e$ connecting them, together with a future directed covector field $k_e$ which is coparallel to the tangent vector of the geodesic and is such that, in agreement with Eq. (20),

$$k_i = \sum_{\substack{e \in E^G \\ s(e) = i}} k_e(x_i) - \sum_{\substack{e \in E^G \\ r(e) = i}} k_e(x_i).$$

Since this applies equally well to all graphs, and since the wave front set of sums of distributions is bounded by the union of the wave front sets, we get the thesis. □
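The sum over saturated graphs in Eq. (19) can be enumerated mechanically. The following Python sketch is an illustration only: the function names `contractions` and `graph_weights` are our own, vertices are numbered from 0, and the code counts how many complete contractions of field "legs" produce each graph, a multiplicity that Eq. (19) does not display explicitly. For $m = (2, 2)$ it reproduces the factor 2 in $2\,\omega_2(x_1, x_2)^2$.

```python
from collections import Counter


def contractions(m):
    """Yield every complete contraction of the legs of :phi^m[0]: ... :phi^m[n-1]:.

    Vertex i carries m[i] legs. A contraction pairs up all legs and never joins
    two legs of the same vertex (Wick ordering removes self-contractions).
    Each contraction is a list of vertex pairs (i, j) with i < j.
    """
    # legs[a] is the vertex that leg number a belongs to (sorted by vertex).
    legs = [i for i, mi in enumerate(m) for _ in range(mi)]

    def pair_up(remaining):
        if not remaining:
            yield []
            return
        first, rest = remaining[0], remaining[1:]
        for pos, partner in enumerate(rest):
            if legs[partner] == legs[first]:
                continue  # no edge from a vertex to itself
            others = rest[:pos] + rest[pos + 1:]
            for tail in pair_up(others):
                yield [(legs[first], legs[partner])] + tail

    yield from pair_up(list(range(len(legs))))


def graph_weights(m):
    """Count contractions per graph; a graph is its edge multiplicities |E_ij|."""
    weights = Counter()
    for pairing in contractions(m):
        graph = tuple(sorted(Counter(pairing).items()))
        weights[graph] += 1
    return weights


if __name__ == "__main__":
    # One graph (a double edge between the two vertices), reached in 2 ways.
    print(dict(graph_weights((2, 2))))
```

Each key of the returned counter lists the pairs (i, j) together with the number of edges joining them, so one key corresponds to one term $\omega_n^G$ in Eq. (19).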


3. On a Local Formulation of Perturbation Theory

We have recalled in the introduction the main ideas of the Epstein and Glaser formulation of perturbation theory. Here we give the details of our generalization. We start from the Gell–Mann and Low formula for the S-matrix for quantum theories on four dimensional Minkowski space-time M; this means adopting the following formal expression,

$$S_\lambda = T\Big(e^{\,i\lambda \int_M L_{\rm int}(x)\, d^4x}\Big),$$

where T denotes the notion of time ordering, $L_{\rm int}$, the interaction Lagrangian, is some local field and λ is the strength of the interaction. Developing in Taylor series w.r.t. λ gives

$$S_\lambda = \sum_{k=0}^{\infty} \frac{(i\lambda)^k}{k!} \int_M \cdots \int_M T\big(L_{\rm int}(x_1) L_{\rm int}(x_2) \cdots L_{\rm int}(x_k)\big) \prod_{i=1}^{k} d^4x_i.$$

Hence, the perturbative solution to scattering theory is reduced to quadratures once one finds the general solution of the time ordering operation inside the integral. For noncoinciding points the solution is given by the following expression,

$$T(L_{\rm int}(x_1) \cdots L_{\rm int}(x_n)) = \sum_{\pi \in P_{1, \dots, n}} \theta(x_{\pi(1)} - x_{\pi(2)}) \cdots \theta(x_{\pi(n-1)} - x_{\pi(n)})\, L_{\rm int}(x_{\pi(1)}) \cdots L_{\rm int}(x_{\pi(n)}), \tag{21}$$

where $P_{1, \dots, n}$ is the set of all permutations of the index set {1, . . . , n} and θ is the Heaviside step function,

$$\theta(x) = \begin{cases} 1, & \text{if } x^0 > 0, \\ 0, & \text{otherwise}, \end{cases}$$

where $x^0$ denotes the time component of the points in M. As is well known, this expression leads to the description of scattering processes by Feynman graphs [19]. Due to local commutativity of the Lagrangian the singularities of the Heaviside step function at coinciding times are harmless, as long as all points are different. Unfortunately, this is no longer true for coincident points, since $L_{\rm int}$ is an operator-valued distribution which cannot, in general, be multiplied by a discontinuous function; if one tries to define the products by convolutions in momentum space, this leads, in a naive approach, to the occurrence of ultraviolet divergences. Several procedures have been found to cope with these singularities. But typically they are nonlocal and are therefore not immediately generalizable to the case of Lorentzian curved backgrounds. The situation is better in Euclidean field theory (see e.g. [45] for a generalization of dimensional renormalization to the curved case).

Let us consider, as possible interaction Lagrangians L, Wick polynomials of the free field ϕ. From Eqs. (21) and (16) we find


$$T(L_1(x_1) \cdots L_n(x_n)) = \sum_{j_1, \dots, j_n} \big(\Omega_\omega,\ T(L_1^{(j_1)}(x_1) \cdots L_n^{(j_n)}(x_n))\,\Omega_\omega\big) \times \frac{:\varphi^{j_1}(x_1) \cdots \varphi^{j_n}(x_n):}{j_1! \cdots j_n!}, \tag{22}$$

where, however, the expectation value on the right-hand side is a priori not defined all over $M^n$, but only over $M^n \setminus \Sigma$, where Σ is the union of all diagonals in $M^n$. So, the main problem is to give a mathematical meaning to this formula at all points.

3.1. Formulation of the local S-matrix. The general starting idea, due to Bogoliubov, is to consider the usual Gell–Mann and Low formulation of the S-matrix but supplemented by the hypotheses necessary for the implementation of the causality principle. In one stroke one finds also the solution to the problem of the correct treatment of the operator-valued distributions. Choosing as the interaction Lagrangian $L_{\rm int}(x) = L(x)\eta(x)$, a Wick polynomial L multiplied by a space-time function of compact support η ∈ D(M) (considered as a generalized “coupling constant”), we define the local S-matrix $S_\lambda(\eta)$ as a formal power series (see, for instance, [6]) in the coupling strength λ,

$$S_\lambda(\eta) \doteq \mathbf{1} + \sum_{n=1}^{\infty} \frac{(i\lambda)^n}{n!} \int_{M^n} T(L_{\rm int}(x_1) \cdots L_{\rm int}(x_n))\, d\mu_1 \cdots d\mu_n, \tag{23}$$

where dµ is the natural invariant volume measure on the globally hyperbolic space-time (M, g), and 1 is the Hilbert space identity operator. One can enlarge the definition, e.g. by using a more general Lagrangian,

$$\lambda L_{\rm int} = \sum_{k=1}^{l} \eta_k L_k, \tag{24}$$

with Wick polynomials $L_k$ and associating to each a different “coupling constant” $\eta_k \in \mathcal D(M)$, where the additional “Lagrangians” $L_k$ are defined as terms like currents, external fields, etc., including in particular all derivatives of the basic interaction Lagrangian. Using this extended Lagrangian we may replace the time-ordered operator in Eq. (23) by

$$T(L_{\rm int}(x_1) \cdots L_{\rm int}(x_n)) = \sum_{k_1, \dots, k_n} T(L_{k_1}(x_1) \cdots L_{k_n}(x_n))\, \eta_{k_1}(x_1) \cdots \eta_{k_n}(x_n),$$

where the summation over the k’s goes from 1 to l, the number of terms in the extended Lagrangian.

We remark that, eventually, the test function(s) η should be sent to a fixed value over all space-time. This procedure, known as the adiabatic limit, amounts to treating the infrared nature of the theory. Some studies of this limit in the case of Minkowski space-time have been performed by Epstein and Glaser themselves [21]. It is not clear how to generalize their study to curved spaces. It is therefore gratifying that all local properties of the theory are already obtained via the construction of the local S-matrices, and this point of view might also be useful in cases (like non-abelian gauge theories) where due to infrared problems the S-matrix in the adiabatic limit does not properly exist.
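The permutation sum of Eq. (21), which enters the integrands of the local S-matrix, can be illustrated by a toy computation. In the Python sketch below the operators are replaced by plain labels and only the time components are kept; the names `theta` and `time_ordered` are ours and purely illustrative. For pairwise distinct times exactly one permutation survives, namely the one with decreasing times from left to right, while for coinciding times the formula selects nothing, which is a toy reflection of the fact that Eq. (21) is only meaningful away from coinciding points.

```python
from itertools import permutations


def theta(t):
    """Heaviside step function of a time difference: 1 if t > 0, else 0."""
    return 1 if t > 0 else 0


def time_ordered(factors):
    """Evaluate the permutation sum of Eq. (21) on labelled factors.

    `factors` is a list of (time, label) pairs standing in for L_int(x_i).
    The operator product is modelled as the tuple of labels, left to right.
    Returns the orderings whose product of theta factors equals 1.
    """
    surviving = []
    for perm in permutations(factors):
        weight = 1
        # theta(x_pi(1) - x_pi(2)) ... theta(x_pi(n-1) - x_pi(n))
        for (t_left, _), (t_right, _) in zip(perm, perm[1:]):
            weight *= theta(t_left - t_right)
        if weight:
            surviving.append(tuple(label for _, label in perm))
    return surviving


if __name__ == "__main__":
    print(time_ordered([(0.5, "a"), (2.0, "b"), (-1.0, "c")]))
    print(time_ordered([(1, "a"), (1, "b")]))
```

The first call keeps the single ordering with the latest factor on the left; the second call returns an empty list.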


3.2. Defining properties. Our main goal is the inductive construction of the time-ordered products of Wick polynomials, $T(L_1(x_1) \cdots L_n(x_n))$. Following Epstein and Glaser we require the following properties:

P1. Well-posedness. The symbols $T(L_1(x_1) \cdots L_n(x_n))$ are well defined operator-valued distributions on the GNS-Hilbert space $\mathcal H_\omega$, i.e. (multilinear, strongly continuous) maps $\mathcal D(M^n) \to \mathrm{End}(D)$, where $D \subset \mathcal H_\omega$ is the dense subspace (microlocal domain of smoothness) defined in (11).

P2. Symmetry. Any time-ordered product $T(L_1(x_1) \cdots L_n(x_n))$ is symmetric under permutations of indices, i.e. the action of the permutation group $P_{\{1, \dots, n\}}$ of the index set {1, . . . , n} gives

$$T(L_{\pi(1)}(x_{\pi(1)}) \cdots L_{\pi(n)}(x_{\pi(n)})) \equiv T(L_1(x_1) \cdots L_n(x_n)),$$

for any $\pi \in P_{\{1, \dots, n\}}$, in the sense of distributions.

This symmetry property corresponds to the fact that the time-ordered products are functional derivatives of the local S-matrix. More crucial is the following causality property, which follows from Eq. (2):

P3. Causality. Consider any set of points $(x_1, \dots, x_n) \in M^n$ and any full partition of the set {1, . . . , n} into two non-empty subsets I and $I^c$ such that no point $x_i$ with i ∈ I is in the past of the points $x_j$ with $j \in I^c$, i.e. $x_i \notin J^-(x_j)$ for any i ∈ I and $j \in I^c$. Then the time-ordered distributions are required to satisfy the following factorization property:

$$T(L_1(x_1) \cdots L_n(x_n)) = T\Big(\prod_{i \in I} L_i(x_i)\Big)\; T\Big(\prod_{j \in I^c} L_j(x_j)\Big).$$

In the Epstein and Glaser scheme on Minkowski space-time one requires, in addition, translation covariance of the time-ordered products. If the free field is among the possible terms in the Lagrangian one can show that the time-ordered products are sums of pointwise products of Wick polynomials with translation invariant numerical distributions. (Such products exist due to Theorem 0 of Epstein and Glaser, an easy proof of which follows from our microlocal characterization of the domain of definition of Wick products.) Moreover, the condition,

$$[T(L_1(x_1) \cdots L_n(x_n)), \varphi(y)] = \sum_{k=1}^{n} T\Big(L_1(x_1) \cdots \frac{\partial L_k}{\partial \varphi}(x_k) \cdots L_n(x_n)\Big)\, iE(x_k, y), \tag{25}$$

fixes the coefficients to be vacuum expectation values of time-ordered products of those Wick polynomials which are of lower order w.r.t. the chosen interacting Lagrangian (from now on, we shall call them sub-Wick polynomials), hence the problem is reduced to a problem for numerical distributions. Unfortunately, in the case of a curved space-time, we have not yet determined the class of fields which are relatively local to the scalar free field, i.e., what is known in the literature as Borchers’ class [22]. One also needs a replacement for the condition of translation covariance. Our idea is to impose a condition on the time-ordered distributions which in a sense employs both ideas of invariance and spectrality crucial in the Minkowskian case. Since, as emphasized in the previous section, spectrality for us means wave front set properties, we now look for a condition which fixes the properties of the singularities of the time-ordered distributions. We use the graph theoretic definitions of Sect. 2.

P4. Spectrality. For the expectation value $t_n \in \mathcal D'(M^n)$, n ≥ 2, of any time-ordered product it holds that $\mathrm{WF}(t_n) \subset \Gamma_n^{\rm to}$, where

$$\Gamma_n^{\rm to} = \Big\{ (x_1, k_1; \dots; x_n, k_n) \in \dot T^*M^n \ \Big|\ \exists \text{ a graph } G \in G_n \text{ and an immersion } (x, \gamma, k) \text{ of } G \text{ in which } k_e \text{ is future directed whenever } x_{s(e)} \notin J^-(x_{r(e)}) \text{ and such that } k_i = \sum_{m:\, s(m) = i} k_m(x_i) - \sum_{n:\, r(n) = i} k_n(x_i) \Big\}.$$

This may be motivated by the fact that, for non-coinciding points, $t_n$ can be expressed in terms of the usual Feynman graphs, and for the set of coinciding points we have an infinitesimal remnant of translation invariance, since all covectors at coinciding points sum up to zero. We can now formulate a microlocal version of Theorem 0 of Epstein and Glaser.

Theorem 3.1 (Microlocal Theorem 0). If P4 holds for $t_n$ then

$$t_n(x_1, \dots, x_n) :\varphi^{l_1}(x_1) \cdots \varphi^{l_n}(x_n):$$

is a well defined operator-valued distribution, for any n and any choice of indices $l_1, \dots, l_n$, on the dense invariant domain D in the Hilbert space $\mathcal H_\omega$.

Proof. Let Ψ ∈ D. The vector-valued distribution $:\varphi^{l_1}(x_1) \cdots \varphi^{l_n}(x_n): \Psi$ is a restriction of $\delta^l \Psi/\delta f^l$, $l = \sum_{i=1}^{n} l_i$, to a partial diagonal, with wave front set contained in $\bigcup_{x \in M^n} \{x\} \times V_-^{\times n}$. Since $\Gamma_n^{\rm to}$ does not contain elements of the form $(x_1, k_1, \dots, x_n, k_n)$ with $k_i \in V_+$, i = 1, . . . , n, the product is a well defined vector-valued distribution, and after smearing with some test function one obtains again a vector in D. □

In particular, formula (22) makes sense everywhere, provided the expectation values of all time-ordered products of sub-Wick polynomials satisfy P4. Moreover, every expansion into a sum of products of Wick polynomials by numerical distributions which satisfies (25) is of this form:

P5. Causal Wick Expansion.

$$T(L_1(x_1) \cdots L_n(x_n)) = \sum_{j_1, \dots, j_n} \big(\Omega_\omega,\ T(L_1^{(j_1)}(x_1) \cdots L_n^{(j_n)}(x_n))\,\Omega_\omega\big) \times \frac{:\varphi^{j_1}(x_1) \cdots \varphi^{j_n}(x_n):}{j_1! \cdots j_n!}. \tag{26}$$


4. Inductive Construction up to the Small Diagonal

The properties defined in the previous section allow us to set up an inductive procedure in the spirit of Epstein and Glaser. We rely on a variation of their construction proposed by Stora [54]. We start with a linear space W of Wick polynomials which contains all respective sub-Wick polynomials and want to define the time-ordered products $T(L_1(x_1) \cdots L_n(x_n))$ as a family of operator-valued distributions which are multilinear in the entries $L_i \in W$ and satisfy the properties P1–P5. We start the induction by setting T(1) = 1 and T(L) = L and assume that the time-ordered products for 1 < l ≤ n − 1 factors have been constructed and satisfy all the defining properties. In a first step, we aim at constructing time-ordered products of n factors on $M^n \setminus \Delta_n$, where $\Delta_n$ is the small diagonal submanifold of $M^n$, i.e. the set of points $(x_1, \dots, x_n)$ with the property $x_1 = x_2 = \cdots = x_n$.

We use the space-time notion of causality in order to define a certain partition of unity for $M^n \setminus \Delta_n$: Let us denote by J the family of all non-empty proper subsets I of the index set {1, . . . , n} and define, accordingly, the sets $C_I = \{(x_1, \dots, x_n) \in M^n \mid x_i \notin J^-(x_j),\ i \in I,\ j \in I^c\}$ for any I ∈ J. Note that the defining relation for the $C_I$’s is related to causality on M and not on $M^n$. It is fairly easy to show that

Lemma 4.1. Let M be a globally hyperbolic space-time, then it holds

$$\bigcup_{I \in J} C_I = M^n \setminus \Delta_n.$$

Proof. The inclusion $\bigcup_I C_I\subset M^n\setminus\Delta_n$ is obvious. The opposite inclusion is proved as follows. Consider any set of points $(x_1,\dots,x_n)$ such that $x_i\neq x_j$ for some $i\neq j$; then the points $x_i$ and $x_j$ can be separated by a Cauchy surface $\Sigma$, as follows from the global hyperbolicity assumption. One may choose it so as to contain none of the points $x_k$, $k=1,\dots,n$. Hence, defining $I=\{k \mid x_k\in J^+(\Sigma)\}$ and noting that $I\in\mathcal{J}$, we find $(x_1,\dots,x_n)\in C_I$. □

We use the short-hand notations
\[
T^I(x_I) = T\Bigl(\prod_{i\in I} L_i(x_i)\Bigr), \qquad x_I=(x_i,\ i\in I). \tag{27}
\]
The first step now is to set on any $C_I$,
\[
T_I(x) \doteq T^I(x_I)\,T^{I^c}(x_{I^c}), \tag{28}
\]

as an operator-valued distribution, since according to the induction hypothesis and the fact that $I$ is proper this is a well defined operation on $\mathcal{D}(C_I)$. We now glue together all operators previously defined on different elements of the cover. For this we need to prove a sheaf consistency condition. Indeed, different $C_I$'s overlap, but due to the causality hypothesis P3 and the causal Wick expansion P5 valid for the lower order terms, the following property holds:

Proposition 4.2. For any choice of $I_1,I_2\in\mathcal{J}$ such that $C_{I_1}\cap C_{I_2}\neq\emptyset$ we have
\[
T_{I_1}\big|_{C_{I_1}\cap C_{I_2}} = T_{I_2}\big|_{C_{I_1}\cap C_{I_2}},
\]
in the sense of operator-valued distributions over $M^n\setminus\Delta_n$.


Proof. Let $I_1,I_2\in\mathcal{J}$ and $x=(x_1,\dots,x_n)\in C_{I_1}\cap C_{I_2}$. Using the causality property P3, which is by assumption valid for time-ordered products of less than $n$ factors, we find
\[
\begin{aligned}
T^{I_1}(x_{I_1}) &= T^{I_1\cap I_2}(x_{I_1\cap I_2})\, T^{I_1\cap I_2^c}(x_{I_1\cap I_2^c}),\\
T^{I_1^c}(x_{I_1^c}) &= T^{I_1^c\cap I_2}(x_{I_1^c\cap I_2})\, T^{I_1^c\cap I_2^c}(x_{I_1^c\cap I_2^c}),
\end{aligned} \tag{29}
\]
and similarly for $T^{I_2}$ and $T^{I_2^c}$. Now the terms $T^{I_1\cap I_2^c}$ and $T^{I_1^c\cap I_2}$ commute. Namely, they are based on mutually space-like points, thus using the Wick expansion for these terms this follows from local commutativity for the Wick polynomials of the free scalar field $\varphi$. Hence from definition (28) we get on $C_{I_1}\cap C_{I_2}$,
\[
\begin{aligned}
T_{I_1} &= T^{I_1\cap I_2}\, T^{I_1^c\cap I_2}\, T^{I_1\cap I_2^c}\, T^{I_1^c\cap I_2^c}, && \text{(Eq. (28) + P3)},\\
&= T^{I_2}\, T^{I_2^c}, && \text{(Eq. (29))},\\
&= T_{I_2}, && \text{(Eq. (28))}.
\end{aligned}
\]
□
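As an illustration of Lemma 4.1 and Proposition 4.2, the lowest non-trivial order $n=2$ can be written out explicitly. The following is only a sketch under the assumptions of the text ($M$ globally hyperbolic, $T(L)=L$ at the start of the induction); the labels $\{1\}$ and $\{2\}$ for the two elements of $\mathcal{J}$ are introduced here for the example.

```latex
% n = 2: the index family is J = { {1}, {2} }.
\begin{align*}
C_{\{1\}} &= \{(x_1,x_2)\in M^2 \mid x_1\notin J^-(x_2)\}, &
C_{\{2\}} &= \{(x_1,x_2)\in M^2 \mid x_2\notin J^-(x_1)\},\\
T_{\{1\}}(x) &= L_1(x_1)\,L_2(x_2), &
T_{\{2\}}(x) &= L_2(x_2)\,L_1(x_1).
\end{align*}
% A point lies in neither set only if x_1 \in J^-(x_2) and x_2 \in J^-(x_1),
% which forces x_1 = x_2; hence C_{\{1\}} \cup C_{\{2\}} = M^2 \setminus \Delta_2.
% On the overlap the two points are space-like separated, so
% [L_1(x_1), L_2(x_2)] = 0 there and T_{\{1\}} = T_{\{2\}}.
```

On the overlap the two orderings agree by local commutativity, which is exactly the consistency needed before the pieces are glued with a partition of unity.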

Let now $\{f_I\}_{I\in\mathcal{J}}$ be a locally finite smooth partition of unity of $M^n\setminus\Delta_n$ subordinate to $\{C_I\}_{I\in\mathcal{J}}$. We formally define, following (27) and (28),
\[
{}^0T(L_1(x_1)\cdots L_n(x_n)) \doteq \sum_{I\in\mathcal{J}} f_I\, T_I . \tag{30}
\]

Hence, we get our first crucial result, namely,

Theorem 4.3. The symbols ${}^0T$ are well defined operator-valued distributions over $M^n\setminus\Delta_n$ which satisfy the defining properties on $D\subset\mathcal{H}_\omega$.

Proof. We first prove that the definition does not depend on the choice of the partition of unity. Indeed, let $\{f'_I\}_{I\in\mathcal{J}}$ be another such partition. Consider $x\in M^n\setminus\Delta_n$, and let $K=\{I\in\mathcal{J} \mid x\in C_I\}$. Then there exists a neighbourhood $V$ of $x$ such that $V\subset\bigcap_{I\in K}C_I$, and $\mathrm{supp}(f_I)$ and $\mathrm{supp}(f'_I)$ do not meet $V$ for all $I\notin K$. In this case
\[
\sum_{I\in\mathcal{J}} (f_I-f'_I)\,T_I\big|_V = \sum_{I\in K} (f_I-f'_I)\,T_I\big|_V .
\]
However, on $V$, $T_I$ is independent of the choice of $I\in K$. Since $\sum_{I\in K}f_I=\sum_{I\in K}f'_I=1$ on $V$, we arrive at the conclusion. Furthermore, an inspection of the formula readily gives that the operator ${}^0T$ is defined on the domain $D$, the microlocal domain of definition of the Wick monomials, because of induction. Hence property P1.

As far as the symmetry property P2 is concerned we just observe that the permuted distribution ${}^0T^\pi(x_1,\dots,x_n)={}^0T(L_{\pi(1)}(x_{\pi(1)})\cdots L_{\pi(n)}(x_{\pi(n)}))$ has the expansion
\[
{}^0T^\pi = \sum_{I\in\mathcal{J}} f_I^\pi\, T_I^\pi = \sum_{I\in\mathcal{J}} f_{\pi(I)}^\pi\, T_{\pi(I)}^\pi ,
\]
where we used the fact that the set $\mathcal{J}$ is invariant under permutations, but $T^\pi_{\pi(I)}=T_I$ and $\{f^\pi_{\pi(I)}\}_{I\in\mathcal{J}}$ is a partition of unity subordinate to $\{C_I\}_{I\in\mathcal{J}}$, so symmetry follows from the result of the previous paragraph about the independence of ${}^0T$ of the choice of the partition of unity.


Causality P3 follows from an argument similar to the one used for the independence from the partition of unity. Indeed, take any point $x\in M^n\setminus\Delta_n$; as before, $x\in V\subset\bigcap_{I\in K}C_I$. From (30),
\[
{}^0T(x) = \sum_{I\in K} f_I(x)\,T_I(x).
\]
Since $T_I|_V$ does not depend anymore on $I\in K$ and $\sum_{I\in K}f_I=1$ over $V$, we have ${}^0T(x)\equiv T_I(x)$, which from (28) satisfies causality by definition.

Now, we want to show that property P4 holds on $M^n\setminus\Delta_n$. It is sufficient to check that this property is satisfied for each $T_I$ on $C_I$. We apply the Wick Theorem to the components of the product in definition (28). It can be easily checked that the distributions $t_I$ in the Wick expansion of $T_I$ are sums of terms of the following form
\[
f_I(x)\, t^I(x_I)\, t^{I^c}(x_{I^c}) \cdot \prod_{(i,j)\in I\times I^c} \omega_2(x_i,x_j)^{a_{i,j}}, \tag{31}
\]

with $a_{i,j}\in\mathbb{N}_0$, and where $t^I$, $t^{I^c}$ are expectation values of lower order time-ordered products. The wave front set of (31) is contained in the convex combination of the wave front sets of its factors. Hence it is given in terms of immersions of graphs with vertex sets $I$, $I^c$, resp., and of $a_{i,j}$ graphs with vertex sets $\{i,j\}$. All these immersions satisfy the condition in P4, the first two by assumption, and the last ones because of the definition of $C_I$ and the properties of the wave front set of a Hadamard state. The union of these graphs is a graph with vertices $\{1,\dots,n\}$, and any convex combination of the components is given by an admissible immersion of this graph. Finally, property P5 follows from expression (28) by a straightforward application of the (generalized) Wick Theorem. □

5. Steinmann Scaling Degree and the Extension of Distributions

We now want to extend ${}^0T(L_1(x_1)\cdots L_n(x_n))$ to the whole $M^n$. As discussed before the problem can be reduced to the extension of the numerical time-ordered distributions
\[
{}^0t(x_1,\dots,x_n) \doteq \bigl(\omega,\ {}^0T(L_1(x_1)\cdots L_n(x_n))\,\omega\bigr).
\]
The extension can be performed in two steps. First ${}^0t$ is extended by continuity to the subspace of test-functions which vanish on $\Delta_n$ up to a certain order, and then it is arbitrarily defined on a complementary subspace. It is this last step which corresponds to the method of counterterms in the classical procedure of perturbative renormalization. The extension of ${}^0t$ by continuity requires some topology on test-function space. The seminorms used by Epstein and Glaser in their paper are quite complicated, and their generalization to curved space-times appears to be rather involved. We found it preferable therefore to apply a different method already introduced by Steinmann [53], namely the concept of scaling degree at a point of a distribution (see also [16]).
Its generalization to curved space-time is very similar to the concept of the scaling limit as introduced by Haag, Narnhofer and Stein [30] and further developed by Fredenhagen and Haag [24]. A similar technique is used in [50]. On Minkowski space, by translation invariance, the distribution is in terms of relative coordinates everywhere defined up to the origin, and there the concept of the scaling degree at a point leads to a rather smooth and economic method of renormalization, see


e.g. [48], where the relation to differential renormalization is elaborated. On a curved space-time one needs the corresponding notion for a scaling degree with respect to the submanifold $\Delta_n$, and one also needs some uniformity of the singularity along $\Delta_n$ as well as control of the wave front sets during the extension process. Our strategy will be that at first we introduce this improvement in the case of $\mathbb{R}^n$, then we discuss the case of manifolds. There we try to set up a procedure which allows us to restrict the discussion to the pointwise case.

5.1. The scaling degree. For simplicity, we work at first on $\mathbb{R}^d$. Hence, consider a distribution $t\in\mathcal{D}'(\mathbb{R}^d)$. Let the action of the positive reals (dilations) be defined via the map
\[
\Lambda:\ \mathbb{R}_+\times\mathcal{D}(\mathbb{R}^d)\longrightarrow\mathcal{D}(\mathbb{R}^d),\qquad (\lambda,\phi)\longmapsto \phi^\lambda \doteq \lambda^{-d}\phi(\lambda^{-1}\,\cdot\,),
\]
and obtain, by pull-back, the map over distributions $t\in\mathcal{D}'(\mathbb{R}^d)$ as
\[
(\Lambda^*t)(\phi) \doteq t_\lambda(\phi) \doteq t(\phi^\lambda),
\]
where this operation in case $t\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ is given by the explicit formula
\[
t_\lambda(\phi) = \int t(\lambda x)\,\phi(x)\,d^dx,\qquad \forall\phi\in\mathcal{D}(\mathbb{R}^d). \tag{32}
\]
The map $\Lambda$ is clearly continuous w.r.t. the topology of $\mathcal{D}(\mathbb{R}^d)$ and we shall sometimes use the previous formula (32), by the usual abuse of notation, also in the general case. We say that $t$ has scaling degree $\mathrm{sd}(t)=\omega$ w.r.t. the origin in $\mathbb{R}^d$, if $\omega$ is the infimum of all $\omega'\in\mathbb{R}$ for which
\[
\lim_{\lambda\downarrow 0}\lambda^{\omega'}t_\lambda = 0 \tag{33}
\]

holds in the sense of $\mathcal{D}'(\mathbb{R}^d)$. It should be clear from the definition that every distribution $t\in\mathcal{D}'(\mathbb{R}^d)$ has a scaling degree $\omega\in[-\infty,+\infty[$. If the distribution is not defined at the point we want to check, then the scaling degree might also be equal to $+\infty$. We give some examples.

Examples.
1. Trivial example. Every $\phi\in\mathcal{E}(\mathbb{R}^d)$ has $\mathrm{sd}(\phi)\le 0$.
2. Dirac measure. Let $\mu\in\mathcal{E}'(\mathbb{R}^d)$ with $\mu(\phi)=\phi(0)$, $\phi\in\mathcal{E}(\mathbb{R}^d)$; then $\mathrm{sd}(\mu)=d$.
3. Feynman propagator. In the case of a free massive scalar field which is covariant under translation on Minkowski space-time, the Feynman propagator can be written as
\[
E_F(x) = (2\pi)^{-d}\int \frac{e^{ip\cdot x}}{p^2-m^2+i\epsilon}\,d^dp,
\]
from which it is readily seen that $\mathrm{sd}(E_F)=d-2$.
4. Homogeneous distributions. If $t\in\mathcal{D}'(\mathbb{R}^d)$ is homogeneous of order $\alpha$ at the origin, i.e. $t_\lambda=\lambda^\alpha t$, then $\mathrm{sd}(t)=-\alpha$.


5. Infinite degree. The smooth function $x\mapsto\exp(1/x)$, $x\in\mathbb{R}_+$, is not defined at the origin and its scaling degree w.r.t. the origin is clearly infinite.

As inferred from the 4th example, the scaling degree may be seen as a generalization of the notion of the degree of homogeneity. Actually, our extension method is similar to the extension to all space of a homogeneous distribution as discussed in Hörmander's book [35], which on the other hand is also quite similar to the Epstein and Glaser procedure of distribution splitting [20]. The fact that homogeneous extensions do not always exist is the mathematical origin of the logarithmic corrections to scaling found in renormalization. Here, a discussion about space-time symmetries and their implementation after renormalization is absent. It will be presented in [11].

Lemma 5.1. The scaling degree obeys the following properties:
(a) Let $t\in\mathcal{D}'(\mathbb{R}^d)$ have $\mathrm{sd}(t)=\omega$ at $0$, then
  1. Let $\alpha\in\mathbb{N}^d$ be any multiindex, then $\mathrm{sd}(\partial^\alpha t)\le\omega+|\alpha|$.
  2. Let $\alpha\in\mathbb{N}^d$ be any multiindex, then $\mathrm{sd}(x^\alpha t)\le\omega-|\alpha|$.
  3. Let $f\in\mathcal{E}(\mathbb{R}^d)$, then $\mathrm{sd}(ft)\le\mathrm{sd}(t)$.
(b) For $t_i\in\mathcal{D}'(\mathbb{R}^{d_i})$, $i=1,2$, we have $\mathrm{sd}(t_1\otimes t_2)=\mathrm{sd}(t_1)+\mathrm{sd}(t_2)$.

Proof. The first two cases in (a) as well as (b) are straightforward. The third case in (a) follows from the fact that, by the Banach-Steinhaus principle, a convergent sequence of distributions is uniformly bounded. Hence, for every $\omega'>\mathrm{sd}(t)$ and every compact set $K\subset\mathbb{R}^d$ there is some polynomial $P$ such that
\[
|\lambda^{\omega'}t_\lambda(\phi)| \le \sup_{x\in K}|P(\partial)\phi(x)| \equiv \|\phi\|_{\infty,P}. \tag{34}
\]
Hence, for $f\in\mathcal{E}(\mathbb{R}^d)$, with $f_\lambda(x)=f(\lambda x)$, we have
\[
|(ft)_\lambda(\phi)| = |t_\lambda(f_\lambda\phi)| \le \lambda^{-\omega'}\|f_\lambda\phi\|_{\infty,P}. \tag{35}
\]
The statement follows now from the boundedness of the sequence $\|f_\lambda\phi\|_{\infty,P}$ as $\lambda\to 0$. □
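Examples 2 and 4 and part (a)1 of Lemma 5.1 can be checked directly from the definition (33). The short computation below is a sketch added for illustration; it uses only $\phi^\lambda(x)=\lambda^{-d}\phi(\lambda^{-1}x)$ and the usual convention $\langle\partial^\alpha\delta,\phi\rangle=(-1)^{|\alpha|}\partial^\alpha\phi(0)$.

```latex
\begin{align*}
\delta_\lambda(\phi) &= \delta(\phi^\lambda) = \lambda^{-d}\,\phi(0)
  &&\Longrightarrow\quad \lambda^{\omega'}\delta_\lambda \to 0
     \iff \omega' > d, \quad \mathrm{sd}(\delta) = d,\\
(\partial^\alpha\delta)_\lambda(\phi)
  &= (-1)^{|\alpha|}\,\partial^\alpha\phi^\lambda(0)
   = (-1)^{|\alpha|}\,\lambda^{-d-|\alpha|}\,\partial^\alpha\phi(0)
  &&\Longrightarrow\quad \mathrm{sd}(\partial^\alpha\delta) = d + |\alpha|,\\
t_\lambda &= \lambda^{\alpha}\, t \quad (t \neq 0)
  &&\Longrightarrow\quad \lambda^{\omega'}t_\lambda = \lambda^{\omega'+\alpha}\,t \to 0
     \iff \omega' > -\alpha, \quad \mathrm{sd}(t) = -\alpha.
\end{align*}
```

In the second line the bound of Lemma 5.1 (a)1 is attained, which shows that it cannot be improved in general.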

5.2. Extensions of distributions to a point. We now want to show how to extend a distribution $t\in\mathcal{D}'(\mathbb{R}^d\setminus\{0\})$ to all space by using the concept of the scaling degree. The scaling degree can easily be defined for such distributions by restricting the test functions appropriately. Equivalently, we can also, for each $\chi\in\mathcal{E}(\mathbb{R}^d)$ with $0\notin\mathrm{supp}(\chi)$, look at the behaviour of the sequences $\chi t_\lambda$, now considered as sequences in $\mathcal{D}'(\mathbb{R}^d)$. There are three possible cases: when the scaling degree is $+\infty$, no extension to a distribution on $\mathbb{R}^d$ exists; when the scaling degree $\omega$ is finite but $\omega\ge d$, a finite dimensional set of extensions exists; otherwise $\omega<d$. We first study the third case.

Theorem 5.2. Let $t_0\in\mathcal{D}'(\mathbb{R}^d\setminus\{0\})$ have scaling degree $\omega<d$ w.r.t. the origin. There exists a unique $t\in\mathcal{D}'(\mathbb{R}^d)$ with scaling degree $\omega$ such that $t(\phi)=t_0(\phi)$, $\phi\in\mathcal{D}(\mathbb{R}^d\setminus\{0\})$.


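Proof. The uniqueness is easy. Indeed, the difference between two possible extensions would be a distribution with support at $\{0\}$. By a well known structural theorem of distribution theory this last is given by $P(\partial)\delta$, where $P$ is a polynomial of degree $\deg(P)$ and $\delta$ is the Dirac measure at the origin. But this distribution has scaling degree equal to $d+\deg(P)$, hence a contradiction.

Let us now consider a smooth function of compact support $\vartheta$ such that $\vartheta=1$ in a neighbourhood of the origin. Set $\vartheta_\lambda(x)\doteq\vartheta(\lambda x)$, $\lambda\in\mathbb{R}$, and
\[
t^{(n)} \doteq (1-\vartheta_{2^n})\,t_0,\qquad n\in\mathbb{N},
\]
where now $t^{(n)}$ is a sequence of distributions defined on the whole $\mathbb{R}^d$. We wish to show that the sequence converges in the weak$^*$ topology of $\mathcal{D}'(\mathbb{R}^d)$. Because of the sequential completeness of $\mathcal{D}'(\mathbb{R}^d)$ it is sufficient to prove that it is a Cauchy sequence. Let $\phi\in\mathcal{D}(\mathbb{R}^d)$ and look at
\[
(t^{(n+1)}-t^{(n)})(\phi) = (\phi t_0)(\vartheta_{2^n}-\vartheta_{2^{n+1}}) = 2^{-nd}\,(\phi t_0)_{2^{-n}}(\vartheta-\vartheta_2). \tag{36}
\]
According to Lemma 5.1, (a) 3, this sequence is majorized, for every $\omega'\in\,]\omega,d[$, by $\mathrm{const.}\,2^{n(\omega'-d)}$, hence it is summable as required. The limit
\[
t(\phi) \doteq \lim_{n\to\infty} t^{(n)}(\phi),\qquad \forall\phi\in\mathcal{D}(\mathbb{R}^d),
\]
then defines an extension of $t_0$. It is obvious that the scaling degree of $t$ is not smaller than $\omega$. It remains to prove that it is not bigger than $\omega$. Pick $\phi\in\mathcal{D}(\mathbb{R}^d)$ and consider the following expression:
\[
t_\lambda(\phi) = \lim_{n\to\infty}\lambda^{-d}\,t_0\bigl((1-\vartheta_{2^n})\,\phi_{\lambda^{-1}}\bigr).
\]
Let $R,\epsilon>0$ be such that $\mathrm{supp}(\phi)\subset\{x,\ |x|<R\}$ and $\vartheta(x)=1$ for $|x|<\epsilon$. Then $(1-\vartheta(2^nx))\,\phi(\lambda^{-1}x)=0$ whenever $2^{-n}\epsilon>\lambda R$. Let us choose $n_\lambda\in\mathbb{N}$ such that $2^{-n_\lambda}\epsilon>\lambda R>2^{-(n_\lambda+1)}\epsilon$. We have
\[
\begin{aligned}
t_\lambda(\phi) &= \sum_{n=n_\lambda}^{\infty}\lambda^{-d}\,t_0\bigl((\vartheta_{2^n}-\vartheta_{2^{n+1}})\,\phi_{\lambda^{-1}}\bigr)\\
&= \sum_{n=n_\lambda}^{\infty}(2^n\lambda)^{-d}\,(t_0)_{2^{-n}}\bigl((\vartheta-\vartheta_2)\,\phi_{2^{-n}\lambda^{-1}}\bigr).
\end{aligned} \tag{37}
\]
The set $\{(\vartheta-\vartheta_2)\phi_\mu,\ \mu<\mathrm{const.}\}$ is bounded in $\mathcal{D}(\mathbb{R}^d\setminus\{0\})$. Hence for every $\omega'>\omega$ we find a constant $c>0$ such that
\[
\bigl|(t_0)_{2^{-n}}\bigl((\vartheta-\vartheta_2)\,\phi_{2^{-n}\lambda^{-1}}\bigr)\bigr| \le c\,2^{n\omega'},\qquad n\ge n_\lambda.
\]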


Inserting this estimate back into Eq. (37) we have
\[
|t_\lambda(\phi)| < c\,\lambda^{-d}\sum_{n=n_\lambda}^{\infty}2^{-n(d-\omega')}
= c\,\lambda^{-d}\,\frac{2^{-n_\lambda(d-\omega')}}{1-2^{-(d-\omega')}}
\le \frac{c}{1-2^{-(d-\omega')}}\,\lambda^{-d}\Bigl(\frac{2R}{\epsilon}\Bigr)^{d-\omega'}\lambda^{d-\omega'}
\le c'\lambda^{-\omega'}
\]
for some constant $c'>0$. This proves the assertion. □

We now deal with the extension procedure in case a distribution has a finite scaling degree $\omega\ge d$. This extension procedure corresponds to renormalization in other schemes. To adhere more to the standard notation we introduce the degree of singularity $\rho\doteq\omega-d$. This is the analog of the degree of divergence of a Feynman diagram. Let $\mathcal{D}_\rho(\mathbb{R}^d)$ be the set of all smooth functions of compact support which vanish of order $\rho$ at the origin, and let $W$ be a projection from $\mathcal{D}(\mathbb{R}^d)$ onto $\mathcal{D}_\rho(\mathbb{R}^d)$. Since the orthogonal complement of $\mathcal{D}_\rho(\mathbb{R}^d)$ consists of the derivatives of the $\delta$-function up to order $\rho$, $W$ is of the form
\[
W\phi = \phi - \sum_{|\alpha|\le\rho} w_\alpha\,\partial^\alpha\phi(0), \tag{38}
\]

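with $w_\alpha$ being smooth functions of compact support such that $\partial^\alpha w_\beta(0)=\delta^\alpha_\beta$.

Theorem 5.3. Let $t_0\in\mathcal{D}'(\mathbb{R}^d\setminus\{0\})$ have a finite scaling degree $\omega\ge d$. Then there exist extensions $t\in\mathcal{D}'(\mathbb{R}^d)$ of $t_0$ with the same scaling degree, and, given $W$, they are uniquely determined by their values on the test functions $w_\alpha$.

Proof. Any $\phi\in\mathcal{D}(\mathbb{R}^d)$ can be uniquely decomposed as $\phi=\phi_1+\phi_2$, where $\phi_1=\sum_{|\alpha|\le\rho}w_\alpha\,\partial^\alpha\phi(0)$ and $\phi_2\in\mathcal{D}_\rho(\mathbb{R}^d)$. $\phi_2$ has the form
\[
\phi_2(x) = \sum_{|\alpha|=[\rho]+1} x^\alpha\,\psi_\alpha(x),
\]
with $\psi_\alpha\in\mathcal{D}(\mathbb{R}^d)$. We set
\[
\langle t,\phi\rangle = \sum_{|\alpha|=[\rho]+1}\langle x^\alpha t_0,\psi_\alpha\rangle + \langle t,\phi_1\rangle. \tag{39}
\]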

Since, by Lemma 5.1, $x^\alpha t_0$ has scaling degree equal to $\rho-[\rho]-1+d$, which is strictly smaller than $d$, this term has a unique extension by Theorem 5.2. We now prove that $t$ has the same scaling degree as $t_0$. By (38) and (39) we may write
\[
t_\lambda(\phi) = (t_0\circ W)_\lambda(\phi) + \sum_{|\alpha|\le\rho} t(w_\alpha)\,\partial^\alpha\phi(0)\,\lambda^{-d-|\alpha|}.
\]
The second term clearly has scaling degree less than or equal to $\rho+d=\omega$. The first term can be written in the form
\[
(t_0)_\lambda(W\phi) + (t_0)_\lambda\bigl((W\phi_{\lambda^{-1}})_\lambda - W\phi\bigr).
\]


By assumption, the first term has scaling degree $\omega$. To analyze the second term we write
\[
\bigl((W\phi_{\lambda^{-1}})_\lambda - W\phi\bigr)(x) = \sum_{|\alpha|\le\rho}\bigl(w_\alpha(x)-\lambda^{-|\alpha|}w_\alpha(\lambda x)\bigr)\,\partial^\alpha\phi(0).
\]
(Note that $(w_\alpha(\cdot)-\lambda^{-|\alpha|}w_\alpha(\lambda\,\cdot))\in\mathcal{D}_\rho(\mathbb{R}^d)$.) Using the identity
\[
\begin{aligned}
w_\alpha(x)-\lambda^{-|\alpha|}w_\alpha(\lambda x) &= \int_\lambda^1 \frac{d}{d\mu}\bigl(\mu^{-|\alpha|}w_\alpha(\mu x)\bigr)\,d\mu\\
&= \int_\lambda^1 \bigl(\mu x_k(\partial_k w_\alpha)(\mu x) - |\alpha|\,w_\alpha(\mu x)\bigr)\,\mu^{-|\alpha|-1}\,d\mu,
\end{aligned} \tag{40}
\]
we get, after a moment of reflection for the exchange of the order between integration and duality, that
\[
(t_0)_\lambda\bigl((W\phi_{\lambda^{-1}})_\lambda - W\phi\bigr) = \sum_{|\alpha|\le\rho}\partial^\alpha\phi(0)\int_\lambda^1 \mu^{-d-|\alpha|-1}\,(t_0)_{\lambda\mu^{-1}}\bigl((x_k\partial_k-|\alpha|)w_\alpha\bigr)\,d\mu.
\]
The integrand can be estimated according to Lemma 5.1. Indeed, for any $\omega'>\omega$ we have
\[
\bigl|(t_0)_{\lambda\mu^{-1}}\bigl((x_k\partial_k-|\alpha|)w_\alpha\bigr)\bigr| \le \mathrm{const.}\,(\lambda^{-1}\mu)^{\omega'},
\]
and therefore
\[
\Bigl|\int_\lambda^1 \mu^{-d-|\alpha|-1}\,(t_0)_{\lambda\mu^{-1}}\bigl((x_k\partial_k-|\alpha|)w_\alpha\bigr)\,d\mu\Bigr| \le \mathrm{const.}\,\lambda^{-\omega'}\,\frac{1-\lambda^{\omega'-d-|\alpha|}}{\omega'-d-|\alpha|},
\]
which proves the assertion. □

The expert reader can now proceed from this point to study the renormalizability of any theory which admits space-time translation covariance. The ambiguity of the extension is given by terms localized over the origin. The coefficients of these terms can be fixed by additional requirements, as customary in perturbative quantum field theory. We refer the reader to [11] for more details. During this process, one needs estimates on the scaling degrees of the arising distributions, corresponding to the power counting rules. In addition to Lemma 5.1, estimates on scaling degrees of products of distributions (provided they exist) are required. These can be obtained by explicit calculations (see e.g. the analogous estimates in [20]). Much more elegant is a general method which exploits a microlocal version of the scaling degree. This technique is actually necessary if one wants to generalize the methods above to generic manifolds. We shall describe it in the next section.
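A one-dimensional example may help to see Theorem 5.3 and the logarithmic corrections mentioned after the Examples at work. The choice $d=1$, $t_0(x)=1/|x|$ is made here purely for illustration and is not taken from the text. In this case $\omega=1=d$ and $\rho=0$, so a single function $w\in\mathcal{D}(\mathbb{R})$ with $w(0)=1$ enters $W\phi=\phi-w\,\phi(0)$, and $c\doteq t(w)$ is the one free parameter.

```latex
\begin{align*}
\langle t,\phi\rangle
  &= \int_{\mathbb{R}} \frac{\phi(x)-w(x)\,\phi(0)}{|x|}\,dx \;+\; c\,\phi(0),\\
t_\lambda(\phi) = t(\phi^\lambda)
  &= \lambda^{-1}\Bigl[\int_{\mathbb{R}} \frac{\phi(y)-w(\lambda y)\,\phi(0)}{|y|}\,dy
     + c\,\phi(0)\Bigr]
     \qquad (x=\lambda y)\\
  &= \lambda^{-1}\Bigl[\langle t,\phi\rangle
     + \phi(0)\int_{\mathbb{R}} \frac{w(y)-w(\lambda y)}{|y|}\,dy\Bigr]
   = \lambda^{-1}\bigl[\langle t,\phi\rangle + 2\ln(\lambda)\,\phi(0)\bigr].
\end{align*}
% The last step uses, on each half line,
%   \int_\delta^\infty \frac{w(y)-w(\lambda y)}{y}\,dy
%     = -\int_{\lambda\delta}^{\delta}\frac{w(u)}{u}\,du
%     \to w(0)\ln\lambda \quad (\delta\downarrow 0).
```

Thus $\lambda\,t_\lambda=t+2\ln(\lambda)\,\delta$: the extension is not homogeneous, but $\lambda^{\omega'}t_\lambda\to 0$ for every $\omega'>1$, so $\mathrm{sd}(t)=1=\mathrm{sd}(t_0)$ as the theorem asserts, and changing $c$ only adds a multiple of $\delta$.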


6. Surfaces of Uniform Singularity and the Microlocal Scaling Degree

The generalization of the previous procedure to the case of submanifolds is what we really need in the treatment of perturbation theory on curved spaces. Indeed, the description given in Sect. 4 led to the notion of a scaling degree w.r.t. the small diagonal $\Delta_n$ of the topological product $M^n$. Here we classify the behaviour of distributions near some surface by a microlocal version of the scaling degree. We introduce two different notions. The first one, the (microlocal) scaling degree at a surface, involves only the surface under consideration; the second one, the transversal scaling degree, involves a fibration of the surface by transversal surfaces. The first notion behaves very nicely under tensor products and restrictions, whereas the second one admits an easy generalization of the extension procedure. As a matter of fact, the notions can be shown to be equivalent.

6.1. Scaling degrees at submanifolds. Let $M$ be a smooth paracompact manifold of dimension $d$ and $t$ be a distribution in $\mathcal{D}'(M)$. Let $N\subset M$ be a submanifold such that the wave front set of $t$ is orthogonal to the tangent bundle $TN$ of $N$, i.e. for $(x,k)\in\mathrm{WF}(t)$ with $k\in T_x^*M$, $x\in N$,
\[
\langle k,\xi\rangle = 0,\qquad \forall\xi\in T_xN. \tag{41}
\]

Under these circumstances, $t$ can be restricted to a sufficiently small submanifold $C\subset M$ which intersects $N$ in a single point $x_0$, such that the intersection of their tangent spaces at $x_0$ is trivial and their sum spans the whole tangent space (the submanifolds $C$ and $N$ are transversal, see e.g. [27], symbolically $C\pitchfork N$). This is due to the fact that $\mathrm{WF}(t)$ does not intersect the conormal bundle $N^*C=\{(x,k)\in T^*M \mid \langle k,\xi\rangle=0,\ \forall\xi\in T_xC\}$ of $C$. Namely, for $k\in T^*_{x_0}M$, $(x_0,k)\in\mathrm{WF}(t)$ we have $\langle k,\xi\rangle=0$ for $\xi\in T_{x_0}N$, hence $\langle k,\xi\rangle\neq 0$ for some $\xi\in T_{x_0}C$, thus $(x_0,k)\notin N^*C$. But $\mathrm{WF}(t)\cap N^*C$ is a closed conical set in $\dot T^*_CM$, hence its complement is an open conical neighbourhood of $\dot T^*_{x_0}M$; in particular it contains a set $\dot T^*_{U_0}M$, where $U_0$ is an open neighbourhood of $x_0$ in $C$. By choosing $C=U_0$ we arrive at the conclusion. So we proved,

Lemma 6.1. Let $t\in\mathcal{D}'(M)$ be a distribution on a smooth manifold $M$ and let $N$ be a submanifold such that $\mathrm{WF}(t)\perp TN$. Then $t$ can be restricted to every sufficiently small submanifold $C$ such that $N\pitchfork C$.

The singularity of $t|_C$ at $x_0$ may be classified by a covariant extension of the notion of the scaling degree, or better by a slight extension which uses microlocal analysis. For economy of presentation we first look at the concept of scaling degree at some surface $N$ which reduces for each transversal surface $C$ to the scaling degree at the intersection point. This last will just be a pointwise reduction of the general case we proceed to discuss right now. Let $U$ be a star-shaped neighbourhood of the zero section $Z(T_NM)$ and consider a map $\alpha: U\to\alpha(U)\subset N\times M$ which is a diffeomorphism onto its range and such that the following properties hold true:
(i) $\alpha(x,0)=(x,x)$, $x\in N$;
(ii) $\alpha(TN\cap U)\subset N\times N$;
(iii) $\alpha(x,\xi)\in\{x\}\times M$, $x\in N$, $\xi\in T_xM$;
(iv) $d_\xi\alpha(x,\cdot)|_{\xi=0}=\mathrm{id}_{T_xM}$.


A concrete example of such a map $\alpha$ can be defined, whenever we consider the manifold $M$ endowed with a (semi-)Riemannian metric, in terms of the exponential map, namely, $\alpha(x,\xi)\doteq(x,\exp_x\xi)$, provided the submanifold $N$ is totally geodesic, as will be the case in our applications. In the general case, we shall call the set of all such maps $Z$. Let $\alpha\in Z$ and set $t^\alpha\doteq(1\otimes t)\circ\alpha$ as an element of $\mathcal{D}'(U)$, and $t^\alpha_\lambda(x,\xi)\doteq t^\alpha(x,\lambda\xi)$, $0<\lambda\le 1$. Here, $\langle 1\otimes t,\phi\otimes\psi\rangle=\int\phi\cdot\langle t,\psi\rangle$ for test-densities $\phi\in\mathcal{D}_1(N)$ and $\psi\in\mathcal{D}_1(M)$. Since $U$ is star-shaped, $\lambda^{-1}U\supset U$ for $0<\lambda\le 1$, hence $t^\alpha_\lambda$ can be considered as a distribution on $\mathcal{D}_1(U)$. As a preliminary step we have the following

Proposition 6.2. For any $t\in\mathcal{D}'(M)$ which satisfies the hypothesis of Lemma 6.1, there exists a closed conic set $\Gamma\subset\dot T^*U$ such that
(i) $\Gamma\perp T(TN\cap U)$;
(ii) $\mathrm{WF}(t^\alpha_\lambda)\subset\Gamma$.

Proof. Since $\alpha$ maps $TN\cap U$ into $N\times N$, its derivative $\alpha_*: TU\to T(N\times M)$ maps $T(TN\cap U)$ into $T(N\times N)$. But $\mathrm{WF}(t)\perp TN$ implies $\mathrm{WF}(1\otimes t)\perp T(N\times N)$, hence $\mathrm{WF}(t^\alpha)=\alpha^*\mathrm{WF}((1\otimes t)|_{\alpha(U)})\perp\alpha_*^{-1}(T_{\alpha(U)}(N\times N))=T(TN\cap U)$. Now,
\[
\mathrm{WF}(t^\alpha_\lambda) = \{(x,\xi;k)\in T^*_{(x,\xi)}(U) \mid (x,\lambda\xi;k)\in\mathrm{WF}(t^\alpha)\}.
\]

Here, we identified the cotangent spaces at the points $(x,\xi)$ and $(x,\lambda\xi)$ by the isomorphism induced by the diffeomorphism $U\to\lambda U$, $(x,\xi')\to(x,\lambda\xi')$, $\xi'\in T_xM$. Now, let $\xi\in T_xN$ and $\eta\in T_{(x,\xi)}N$. We may identify $\eta$ with a vector in $T_{(x,\lambda\xi)}N$ and observe that it is orthogonal to $\mathrm{WF}(t^\alpha)$ and hence also to $\mathrm{WF}(t^\alpha_\lambda)$. We now set $\Gamma=\overline{\bigcup_{0<\lambda\le 1}\mathrm{WF}(t^\alpha_\lambda)}$, where the closure is performed within $\dot T^*U$. It remains to prove (i). Let $(x,\xi_n;k_n)\in\mathrm{WF}(t^\alpha_{\lambda_n})$ be a convergent sequence in $T^*U$ with limit $(x,\xi;k)$, $k\neq 0$ and $\xi\in T_xN$. There is a corresponding sequence $(x,\lambda_n\xi_n;k_n)\in\mathrm{WF}(t^\alpha)$. Let $\lambda\in[0,1]$ be a limit point of the bounded sequence $\{\lambda_n\}_{n\in\mathbb{N}}$. Then, there is a subsequence converging to $(x,\lambda\xi;k)\in\mathrm{WF}(t^\alpha)$. But $\lambda\xi\in T_xN$, hence we have $k\perp T_{(x,\lambda\xi)}(TN)$. If we again identify the tangent spaces at $(x,\lambda\xi)$ and $(x,\xi)$ we obtain the desired result. □

Choosing first any map $\alpha\in Z$ we are ready for the following:

Definition 6.3. A distribution $t\in\mathcal{D}'(M)$ has the microlocal scaling degree $\omega$ at a submanifold $N$ w.r.t. a closed conical set $\Gamma_0\subseteq\dot N^*N$, symbolically $\omega=\mu\mathrm{sd}^{\Gamma_0}_N(t,\alpha)$, if
(i) there exists a closed conic set $\Gamma\subset\dot T^*(T_NM)$ with the properties stated in Proposition 6.2, with the first one replaced by $\Gamma|_{TN}\subset\alpha^*(Z(T^*N)\times\Gamma_0)$,
(ii) $\omega$ is the infimum of all those $\omega'$ for which
\[
\lim_{\lambda\downarrow 0}\lambda^{\omega'}t^\alpha_\lambda = 0, \tag{42}
\]
in the sense of the Hörmander topology on $\mathcal{D}'_\Gamma(T_NM)$.


Now, depending on the position of $\Gamma_0$ one can give different refined versions of the microlocal scaling degree. Indeed, when the inclusion in $\dot N^*N$ is proper we speak of the strict microlocal scaling degree. When $\Gamma_0\equiv\dot N^*N$, we call it simply the scaling degree at the submanifold, symbolically $\mathrm{sd}_N(t)$. Moreover, when $\Gamma_0=\emptyset$ we speak of the smooth scaling degree. The definition seems to depend on the choice of the map $\alpha\in Z$. In our concrete case we could make use of the metric to choose a canonical diffeomorphism $\alpha$ in terms of the exponential map, but for reasons which will become clear in the following it is helpful to prove its independence. Before coming to that point we show an example for the computation of the scaling degree which is relevant for the physical discussion, namely, the generalization of Example 3 in Subsect. 5.1, the Feynman propagator $E_F$, to curved space-time. The Feynman propagator is considered as a distribution on $M\times M$, and we are interested in the microlocal scaling degree at the diagonal $\Delta_2\subset M\times M$. Indeed, the wave front set of $E_F$ is orthogonal to the tangent bundle of the diagonal. We choose $\alpha: T_{\Delta_2}M^2\to\Delta_2\times M^2\simeq M\times M^2$ as $\alpha(x,\xi_1,\xi_2)=(x,\exp_x\xi_1,\exp_x\xi_2)$ and obtain, as on Minkowski space, $\mathrm{sd}_{\Delta_2}(E_F)=d-2$. A similar result holds for the 2-point Wightman distribution $\omega_2$. So,

Lemma 6.4. The microlocal scaling degree of the 2-point Wightman distribution $\omega_2$ at the diagonal $\Delta_2$ w.r.t. $\Gamma_0=\{(x,k;x,-k) \mid x\in M,\ k\in\partial V_+,\ k\neq 0\}$ is given by $\mu\mathrm{sd}^{\Gamma_0}_{\Delta_2}(\omega_2)=d-2$, where $d$ is the dimension of the space-time.

6.2. Invariance and properties for the scaling degrees. Let us then choose two maps $\alpha_1,\alpha_2\in Z$ and state the following:

Proposition 6.5. Let $t\in\mathcal{D}'(M)$. Let $\omega_i=\mu\mathrm{sd}^{\Gamma_0}_N(t,\alpha_i)$, $i=1,2$, be the microlocal scaling degrees w.r.t. $N$ and $\Gamma_0$ resp. for the two arbitrarily chosen maps. Then $\omega_1=\omega_2$.

Proof. It is simple to check that
\[
t^{\alpha_2}_\lambda(\phi) = t^{\alpha_1}_\lambda(\phi\circ\beta_\lambda^{-1}),\qquad \forall\phi\in\mathcal{D}_1(T_NM), \tag{43}
\]

where $\beta_\lambda(x,\xi)=\lambda^{-1}\beta(x,\lambda\xi)$ and $\beta=\alpha_1^{-1}\circ\alpha_2$. Now, assume $\omega_1$ is the scaling degree for $t$ w.r.t. $N$ and $\Gamma_0$ associated with $\alpha_1$. We should prove that Eq. (42) for $\alpha_2$ converges in the sense of $\mathcal{D}'(T_NM)$ as well, at the same rate as $\lambda\downarrow 0$. The convergence in the sense of distributions is simple. Indeed, if $\mathrm{supp}(\phi)\subset K$, $K$ a compact subset, then there exists a $\lambda_0$ such that $\beta_\lambda(\mathrm{supp}(\phi))\subset K$ for all $\lambda\le\lambda_0$. Hence it suffices by the Banach-Steinhaus principle to prove that the family $\{\phi\circ\beta_\lambda^{-1} \mid 0<\lambda\le\lambda_0\}$ is bounded, uniformly in $\lambda$, in $\mathcal{D}_1(K)$ w.r.t. the family of continuous seminorms which gives the appropriate Fréchet topology. This check proceeds easily from the chain rule and the verifiable fact that the only contribution comes from the 0th and 1st order derivatives w.r.t. $\beta_\lambda^{-1}$. In the limit they are the only terms which survive, giving resp. the identity map on the bundle and the derivative of the identity map. Hence, we get the same rate of convergence as far as plain distribution convergence is concerned. A little bit trickier is the convergence in the sense of seminorms for M2 (b). Starting again from Eq. (43), via the multiplication of a smooth test function of compact support $\psi$ such that $\psi\equiv 1$ on a small neighbourhood of $\mathrm{supp}(\phi)$, we have that $\psi t$ is of compact support and then, by a partition of unity with functions with support on charts, that we


are working on $\mathbb{R}^\delta\times\mathbb{R}^d$, where $\delta$ is the dimension of the submanifold $N$. Now, let us multiply the test function $\phi$ with the term $\exp(i\langle k,\cdot\rangle)$, and use the inverse Fourier transform to get
\[
\widehat{\phi\,t^{\alpha_2}_\lambda}(k) = \beta_\lambda^*\,t^{\alpha_1}_\lambda\bigl(\phi\exp(i\langle k,\cdot\rangle)\bigr) = \int \widehat{\psi\,t^{\alpha_1}_\lambda}(p)\,I_\phi(p,k;\beta_\lambda)\,d^{d+\delta}p,
\]
where
\[
I_\phi(p,k;\beta_\lambda) = \int e^{-i(\langle\beta_\lambda(\xi),p\rangle-\langle\xi,k\rangle)}\,\phi(\xi)\,d^{d+\delta}\xi,
\]
where in all these expressions the coordinates $\xi$ are the local coordinates of $T_NM$ and $k$ and $p$ are their dual coordinates. We use the idea of the proof of the stationary phase theorem, see for instance Theorem 7.7.1 in Hörmander's book [35]. Because of $\beta_\lambda\to\mathrm{id}$ for $\lambda\to 0$ the oscillatory integral $I_\phi$ falls off rapidly outside of any conical neighbourhood of the diagonal $p=k$ in $\mathbb{R}^{d+\delta}\times\mathbb{R}^{d+\delta}$, uniformly for $\lambda$ sufficiently small, i.e. for every $\epsilon>0$ there exists a $\lambda_0>0$ such that for every $N\in\mathbb{N}$,
\[
\sup_{0<\lambda<\lambda_0}\ \sup_{|p-k|>\epsilon|k|}(1+|p|+|k|)^N\,|I_\phi(p,k,\beta_\lambda)| < \infty. \tag{44}
\]
Now let $\Gamma\subset\mathbb{R}^{d+\delta}$ be a closed cone such that
\[
\sup_{p\in C}(1+|p|)^N\,\bigl|\widehat{\psi\,t^{\alpha_1}_\lambda}(p)\bigr|\,\lambda^\omega \to 0,\qquad \lambda\to 0, \tag{45}
\]
for all closed cones $C$ with $C\cap\Gamma=\emptyset$ and all $N\in\mathbb{N}$. We now want to show that the same property holds for $\widehat{\phi\,t^{\alpha_2}_\lambda}$. So let $C$ be a closed cone such that the closed cone
\[
C' = \{p\in\mathbb{R}^{d+\delta},\ |p-k|\le\epsilon|k| \text{ for some } k\in C\} \tag{46}
\]
does not intersect $\Gamma$. Then we split the region of integration over $p$ into the parts $|p-k|\le\epsilon|k|$ and the rest. In the first region we can estimate $k$ by $p$ and use the fast decay of $(1+|p|)^N|\widehat{\psi\,t^{\alpha_1}_\lambda}(p)|$ within $C'$ and the polynomial boundedness of $I_\phi$; in the second region the polynomial boundedness of $(1+|p|)^N|\widehat{\psi\,t^{\alpha_1}_\lambda}(p)|$ and the fast decay of $I_\phi$ (44). This proves the desired estimate for $\widehat{\phi\,t^{\alpha_2}_\lambda}$. □

The microlocal scaling degree $\mu\mathrm{sd}$ has similar properties as the scaling degree, as described in Lemma 5.1. In addition, one finds the following two properties:

Lemma 6.6. Let $t_1,t_2\in\mathcal{D}'(M)$ with $\mu\mathrm{sd}$ $\omega_1$ resp. $\omega_2$ at $N\subset M$, w.r.t. $\Gamma_0^1$ resp. $\Gamma_0^2$ and such that $Z(N^*N)\notin(\Gamma_0^1+\Gamma_0^2)$. Then the pointwise product $t_1t_2$ exists in a small neighbourhood of $N$ and has the microlocal scaling degree $\omega\le\omega_1+\omega_2$ at $N$ w.r.t. $\Gamma_0=\Gamma_0^1\cup\Gamma_0^2\cup(\Gamma_0^1+\Gamma_0^2)$.


Proof. By assumption, the wave front sets of $t^\alpha_{1,\lambda}$ and $t^\alpha_{2,\lambda}$, some $\alpha\in Z$, on a sufficiently small neighbourhood of $N$ satisfy the condition $(\mathrm{WF}(t^\alpha_{1,\lambda})+\mathrm{WF}(t^\alpha_{2,\lambda}))\cap Z(T^*(T_NM))=\emptyset$, hence their product exists there by M1. Because of the sequential continuity of the products in the Hörmander topology M2, the microlocal scaling degree is given by the sum w.r.t. the stated conic region as follows from M1 and does not depend on the choice of the map $\alpha$. □
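As a sketch of how Lemma 6.6 is used, consider the square of the 2-point Wightman distribution $\omega_2$ at the diagonal $\Delta_2$, with $\Gamma_0$ as in Lemma 6.4. The computation below assumes that $\partial V_+$ denotes the boundary of the forward light cone, so that its non-zero elements are future-directed lightlike covectors; the symbol $\overline{V}_+$ for the closed forward cone is introduced here for the example.

```latex
\begin{align*}
\Gamma_0 + \Gamma_0
  &= \{(x,\,k+k';\;x,\,-(k+k')) \mid k,k'\in\partial V_+\setminus\{0\}\},\\
k + k' &\neq 0 \quad\text{and}\quad k+k'\in\overline{V}_+
  \qquad\text{(sum of two non-zero future-directed lightlike covectors)},\\
\Longrightarrow\quad
  &\text{the zero section does not meet } \Gamma_0+\Gamma_0,\\
\mu\mathrm{sd}_{\Delta_2}(\omega_2^{\,2})
  &\le 2(d-2)
  \quad\text{w.r.t.}\quad
  \Gamma_0\cup(\Gamma_0+\Gamma_0)
  \subset \{(x,k;\,x,-k) \mid k\in\overline{V}_+\setminus\{0\}\}.
\end{align*}
```

So the hypothesis of the lemma holds and the usual power counting bound for the square follows from Lemma 6.4 without any explicit computation.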

The following nice property follows from the sequential continuity of the restriction operator to submanifolds M3:

Lemma 6.7. Let $N_1$ be a submanifold of $N$, and let $t\in\mathcal{D}'(M)$ have the microlocal scaling degree $\omega$ at $N$ w.r.t. $\Gamma_0$. Then the microlocal scaling degree of $t$ at $N_1$ w.r.t. the restriction $\Gamma_1$ of $\Gamma_0$ to $N_1$ is less than or equal to $\omega$, $\mu\mathrm{sd}^{\Gamma_1}_{N_1}(t)\le\mu\mathrm{sd}^{\Gamma_0}_N(t)$.

A last word is devoted to a pointlike trivialization of the above procedure. This case can be derived straightforwardly by considering $N\equiv\{p\}$, where $p\in M$ is a generic point and thought of as (a rather singular case of) a submanifold. The translation to this simpler case is done via the following correspondence between geometrical and analytical quantities:
\[
\begin{array}{rcl}
U\in T_NM & \longrightarrow & U_p\in T_pM,\\
\alpha: U\to N\times M & \longrightarrow & \alpha: U_p\to M,\ \text{(i) and (iv) valid},\\
t^\alpha=(1\otimes t)\circ\alpha & \longrightarrow & t^\alpha=t\circ\alpha,\\
\Gamma\subset\dot T^*U,\ \Gamma\perp T(TN\cap U) & \longrightarrow & \Gamma\subset\dot T^*U_p,\ \Gamma\perp TU_p,\\
\Gamma_0\subset\dot N^*N & \longrightarrow & \Gamma_p\subset\dot T^*_pM.
\end{array}
\]

6.3. Transversal scaling degree. Instead of blowing up distributions on $M$ to distributions on $N\times M$ in the definition of the scaling degree, one could also use a fixed fibration of a neighbourhood of $N$ in $M$ by transversal surfaces. For this purpose we decompose $T_NM$ into complementary subbundles, $T_NM=TN+C$. The map $\alpha_C:=\pi_2\circ\alpha|_{C\cap V}$, with the projection $\pi_2: N\times M\to M$ onto the second factor and with $V$ being a sufficiently small neighbourhood of the zero section $Z(T_NM)$, is then a diffeomorphism onto some neighbourhood of $N$. The images of the fibers of $C$ are transversal surfaces. The transversally (w.r.t. $\alpha$ and $C$) scaled distribution is then defined by
\[
t_{\lambda,\perp}(x,\eta) = t\circ\alpha_C(x,\lambda\eta),\qquad (x,\eta)\in C. \tag{47}
\]
The transversal microlocal scaling degree may then be defined as the infimum of all $\omega\in\mathbb{R}$ such that the sequence $\lambda^\omega t_{\lambda,\perp}$ converges to zero within $\mathcal{D}'_{\Gamma_C}(C)$, with a closed conical set $\Gamma_C\subset\dot T^*C$ with $\Gamma_C|_{Z(C)}=\alpha_C^*(\Gamma_0)$. Fortunately, it turns out that this new concept of a scaling degree at a surface coincides with the old one. Thus, in particular, the transversal scaling degree does not depend on the choice of the fibration.


Proposition 6.8. The transversal microlocal scaling degree defined above coincides with the microlocal scaling degree defined in (42).

Proof. We may restrict ourselves to a sufficiently small neighbourhood of a point at the surface $N$. In suitable coordinates, $M$ is a subset of $\mathbb{R}^\delta\times\mathbb{R}^{d-\delta}$, where the first factor corresponds to $N$ and the second factor to the transversal surfaces. $T_NM$ is a subset of $\mathbb{R}^\delta\times\mathbb{R}^\delta\times\mathbb{R}^{d-\delta}$ with the first factor corresponding to $N$, the second to the tangent spaces of $N$ and the third one to the fibers of the transversal bundle $C$. The map $\alpha$ may be chosen as
\[
\alpha(x,\xi,\eta) = (x,\,x+\xi,\,\eta). \tag{48}
\]
Then $\alpha_C$ becomes the identity. The distribution $t$ may be replaced by a distribution with compact support, and the factor 1 in the blow up of $t$ may be replaced by a test function $\chi\in\mathcal{D}(N)$. For the Fourier transforms we then obtain
\[
\widehat{t_\lambda}(p,q,k) = \lambda^{-d}\,\hat\chi(p-\lambda^{-1}q)\,\hat t(\lambda^{-1}q,\lambda^{-1}k), \tag{49}
\]
and, since the scaling in (47) acts only on the $d-\delta$ transversal variables,
\[
\widehat{t_{\lambda,\perp}}(p,k) = \lambda^{-(d-\delta)}\,\hat t(p,\lambda^{-1}k). \tag{50}
\]

Furthermore, using a corresponding trivialization of the respective cotangent bundles, we may identify $\Gamma_0 = \Gamma_C$ with $\{0\} \times K$, where $K$ is a closed cone in $\mathbb{R}^{d-\delta}$, considered as the transversal part of the cotangent space, and $\Gamma$ with $\{0\} \times \{0\} \times K$. The convergence of $t_\lambda$ may be discussed in terms of seminorms of the form $\int_V (1+|p|)^N |\hat{t}_\lambda|$ with, respectively, conical neighbourhoods $V$ of $\Gamma$ and some $N \in (-\mathbb{N})$, and closed conical sets $V$ in the complement of $\Gamma$ and all $N \in \mathbb{N}$. Since $\hat{\chi}$ is strongly decreasing, these seminorms of $t_\lambda$ can be estimated in terms of the corresponding seminorms of $t_{\lambda,\perp}$. □

6.4. Extension of distributions to surfaces. We now want to apply these concepts to the extension problem of distributions $t \in \mathcal{D}'(M \setminus N)$. The wavefront set of the extension shall be orthogonal to $TN$, hence a necessary condition is that this holds true for the closure of $\mathrm{WF}(t)$ within $T^*M$. We extend the notion of the microlocal scaling degree to such distributions in an analogous way as in the extension problem to a single point. Namely, for an arbitrary function $\chi \in \mathcal{E}(M)$ with $\mathrm{supp}\,\chi \cap N = \emptyset$, $(1 \otimes \chi) \circ \alpha \cdot t_\lambda^{\alpha}$ can be considered as a distribution on $U \subset T_N M$. The microlocal scaling degree of $t$ at $N$ is then defined in terms of all sequences so obtained. We choose a fibration of a neighbourhood of $N$ by transversal surfaces. It is easy to see that if $t$ has a scaling degree $\omega$ at $N$, then its restriction to a transversal surface has a scaling degree at the point of intersection with $N$ which is less than or equal to $\omega$. We therefore obtain the corresponding extension theorem.

Theorem 6.9. Let $N$ be a submanifold of the manifold $M$, and let $t_0 \in \mathcal{D}'(M \setminus N)$.
(i) If $\mathrm{sd}_N(t_0) < \mathrm{codim}(N)$, there exists a unique distribution $t \in \mathcal{D}'(M)$ extending $t_0$ with $\mathrm{sd}_N(t) = \mathrm{sd}_N(t_0)$.
(ii) If $\mathrm{codim}(N) \le \mathrm{sd}_N(t_0) < \infty$, there exist extensions $t \in \mathcal{D}'(M)$ with $\mathrm{sd}_N(t) = \mathrm{sd}_N(t_0)$. They are uniquely characterized by their values on some closed subspace of $\mathcal{D}(M)$ which is complementary to the space of all test functions which vanish on $N$ up to order $\mathrm{sd}_N(t_0) - \mathrm{codim}(N)$.


Proof. According to Theorems 5.2 and 5.3 there exist extensions of the restrictions of $t_0$ to every transversal surface with the same scaling degree. We have to show that they are restrictions of a unique distribution $t$ on $M$ with the same microlocal scaling degree. We first fix some normal fibration as described above and consider $t_0$ as a distribution on $U \setminus \{0\}$ with a neighbourhood $U$ of the zero section of $C$. We perform the construction of $t$ at all fibers, by choosing a smooth function $\vartheta \in \mathcal{E}(C)$ which is equal to 1 in a neighbourhood of the zero section and whose restrictions to every fiber have compact support. Moreover, we choose smooth functions $w_\beta \in \mathcal{E}(U)$, also with compact support on each fiber, which satisfy the condition $\partial_\xi^{\gamma} w_\beta(x,0) = \delta_\beta^{\gamma}$, where $\xi$ denotes the variable in the fiber over $x \in N$. We may take $w_\beta = w\,\xi^\beta/\beta!$ with some function $w$ which is identical to 1 in a neighbourhood of the zero section. We set $\rho = \mathrm{sd}_N(t_0) - \mathrm{codim}(N)$, $\vartheta_{\lambda^{-1}}(x,\xi) = \vartheta(x, \lambda^{-1}\xi)$ and

$$ W\phi(x,\xi) = \phi(x,\xi) - \sum_{|\beta| \le \rho} w_\beta(x,\xi)\, \partial_\xi^{\beta}\phi(x,0). $$

Let $\Gamma_C = \mathrm{WF}(t_0) \cup N^*N$. We already know that the sequence $t_n = t_0(1 - \vartheta_{2^n}) \circ W$ converges on every fiber, and it is easy to see that it converges weakly in $\mathcal{D}'(U)$. We now want to show that the wave front set of $t$ is perpendicular to $TN$. For this purpose we show that the above sequence converges even in $\mathcal{D}'_{\Gamma_C}(U)$, i.e. that for every pseudodifferential operator $A$ whose wave front set does not intersect $\Gamma_C$ (hence $At_n$ is smooth) the sequence $At_n$ converges in the sense of smooth functions. For a pseudodifferential operator with smooth kernel the argument is essentially the same as for the weak convergence, hence we may restrict ourselves to pseudodifferential operators whose kernels have support in a sufficiently small neighbourhood of the diagonal of $U \times U$ (only here singularities may occur).

According to the discussion of condition M2(b), we may equivalently look at the Fourier transform of $\chi t_0(\vartheta_{2^m} - \vartheta_{2^{m+1}})$, where $\chi$ is a test function with sufficiently small support which does not vanish at some point $x_0 \in N$. Introducing suitable local coordinates in a neighbourhood of $x_0$, we find

$$ \big[\chi t_0(\vartheta_{2^m} - \vartheta_{2^{m+1}})\big]^{\wedge}(p,k) = 2^{-m(d-\delta)} \big[(\chi t_0)_{2^{-m},\perp}(\vartheta - \vartheta_2)\big]^{\wedge}(p, 2^{-m}k), $$

where the symbol $[\,\cdot\,]^{\wedge}$ means Fourier transform, the $p$'s are the dual coordinates w.r.t. points $x \in N$, the $k$'s are the dual coordinates w.r.t. the points $\xi \in C_x$ of the fibers $C_x$ of $C$, and $\delta$ is the dimension of $N$. By the assumption on the scaling degree of $t_0$ at $N$ we know that for all $\varepsilon > 0$, $N \in \mathbb{N}$ and $\omega > \mathrm{sd}_N(t_0)$ there exists some $c > 0$ such that

$$ \big|\big[(\chi t_0)_{2^{-m},\perp}(\vartheta - \vartheta_2)\big]^{\wedge}(p,k)\big| \le c\, 2^{m\omega} (1 + |p| + |k|)^{-N}, \tag{51} $$

for all $(p,k)$ with $|p| > \varepsilon|k|$ (i.e. outside of a certain conical neighbourhood of the normal bundle $\{(p,k),\ p = 0\}$). Therefore, the sequence

$$ \sup_{|p| > \varepsilon|k|} \big|\big[\chi t_0(\vartheta_{2^m} - \vartheta_{2^{m+1}})\big]^{\wedge}(p,k)\big|\, (1 + |p| + |k|)^{N} \tag{52} $$

is summable for $\omega < (d - \delta)$. If $\mathrm{sd}_N(t_0) < (d - \delta)$ such an $\omega$ exists, and we conclude that the sequence $t_0(1 - \vartheta_{2^m})$ converges in $\mathcal{D}'_{\Gamma_C}(M)$. In case of $\mathrm{sd}_N(t_0) \ge (d - \delta)$ we have to apply the $W$-operation defined above, which, for $\chi$ with sufficiently small support, reduces to a subtraction of the Taylor series up to order $[\rho]$. We obtain from the integral formula for the remainder in the Taylor expansion

$$ \big[\chi t_0(\vartheta_{2^m} - \vartheta_{2^{m+1}}) \circ W\big]^{\wedge}(p,k) = \sum_{|\beta| = [\rho]+1} \frac{k^{\beta}}{\beta!} \int_0^1 \big[\chi t_0(\vartheta_{2^m} - \vartheta_{2^{m+1}})\,\xi^{\beta}\big]^{\wedge}(p, \mu k)\, (1-\mu)^{[\rho]}\, ([\rho]+1)\, d\mu. $$


Since $\chi t_0 \xi^{\beta}$ has scaling degree $\mathrm{sd}_N(t_0) - [\rho] - 1 < (d - \delta)$, we may use the same estimate as before and find that the sequence $t_0(1 - \vartheta_{2^m}) \circ W$ converges in $\mathcal{D}'_{\Gamma_C}(M)$. It remains to prove the stability of the scaling degree under the extension procedure. The argument is a straightforward combination of the techniques in the corresponding proofs in Sect. 5 and the arguments above and is therefore omitted. □

7. Extension to the Diagonal and Renormalization

We come to the main point, namely to prove that the inductive analysis of Sect. 4 closes when supplemented by the information about the scaling degrees and gives well defined operator-valued distributions $T_n$ all over the space $M^n$. This process is what is usually called “renormalization”. Toward this goal we must show that the scaling degree for the distributions of the $n$th order of the induction can be estimated in terms of those of lower orders. As explained before it is sufficient to do it for the numerical distributions

$$ {}^0t(x_1, \ldots, x_n) = \omega\big({}^0T(x_1, \ldots, x_n)\big), \tag{53} $$

where $\omega$ is our reference Hadamard state. According to (31), ${}^0t$ is a finite sum of terms

$$ f_I(x)\, t^{I}(x_I)\, t^{I^c}(x_{I^c}) \prod_{(i,j) \in I \times I^c} \omega_2(x_i, x_j)^{a_{ij}}, \tag{54} $$

with nonnegative integers $a_{ij}$ and a smooth partition of unity of $M^n \setminus \Delta_n$, $f_I \in \mathcal{E}(M^n \setminus \Delta_n)$ with $\mathrm{supp}\, f_I \subset C_I$. $M^n$ inherits a natural metric from $M$, and all partial diagonals $\Delta_I$ are totally geodesic submanifolds. The map $\alpha : TM^n \to M^n \times M^n$ may therefore be defined by $\alpha(x,\xi) = (x, \exp_x \xi)$. Then all restrictions of $\alpha$ to partial subdiagonals, $\alpha_I = \alpha|_{T_{\Delta_I}M^n}$, satisfy the conditions before Proposition 6.4. We can choose the functions $f_I$ with smooth scaling degree 0 at the small diagonal. We then may consider all factors in (54) as distributions on $M^n$. According to Lemma 6.7, their microlocal scaling degrees with respect to $\Delta_n$ are bounded from above by their microlocal scaling degrees with respect to the respective partial diagonals. Moreover the convex combinations of the respective conical subsets of the conormal bundle of the small diagonal do not meet the zero section. Hence, by Lemma 6.6, the scaling degree of the distribution in (54) is bounded by

$$ \omega = \mathrm{sd}_{\Delta_I}(t^{I}) + \mathrm{sd}_{\Delta_{I^c}}(t^{I^c}) + \sum_{ij} a_{ij}(d-2). \tag{55} $$

From this we get the formula

$$ \mathrm{sd}_{\Delta_n}\Big(\omega\Big(T\Big(:\prod_{i=1}^{n} \varphi^{l_i}(x_i):\Big)\Big)\Big) \le \sum_{i=1}^{n} l_i\, \frac{d-2}{2}. \tag{56} $$
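As a rough illustration of how the bound (56) feeds into Theorem 6.9, the following sketch evaluates the right-hand side of (56) and subtracts the codimension of the small diagonal. The function names are hypothetical, and the value $d(n-1)$ for the codimension of $\Delta_n$ in $M^n$ is an assumption made here (the usual dimension count, not a formula quoted from the text). The result is only an upper bound on $\rho = \mathrm{sd} - \mathrm{codim}$.

```python
from fractions import Fraction


def scaling_degree_bound(powers, d):
    """Right-hand side of (56): sum_i l_i * (d - 2) / 2."""
    return sum(powers) * Fraction(d - 2, 2)


def divergence_degree_bound(powers, d):
    """Upper bound on rho = sd - codim at the small diagonal of M^n.

    Assumption: codim(Delta_n) = d * (n - 1), with n = len(powers).
    """
    n = len(powers)
    return scaling_degree_bound(powers, d) - d * (n - 1)


if __name__ == "__main__":
    # Monomials phi^l in d = 4 dimensions, orders n = 2..5.
    for l in (3, 4, 6):
        bounds = [divergence_degree_bound([l] * n, 4) for n in range(2, 6)]
        print(l, [str(b) for b in bounds])
```

In $d = 4$ the bound stays constant in $n$ for $l = 4$, decreases with $n$ for $l = 3$ and grows with $n$ for $l = 6$, which is the familiar Minkowski-space power counting pattern.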

We thus obtain Theorem 7.1 (Main Theorem). All polynomially interacting quantum field theories based on the scalar field on d ≥ 2 dimensional globally hyperbolic space-times follow the same short-distance perturbative classification as on their respective Minkowskian cases.


We recall that for simplicity we have considered only pure monomials as interacting terms, i.e. without derivatives, multiple interacting fields and so forth. The general case however can be derived straightforwardly from our construction and we leave the task to the reader. We close this section by specifying the choices which have to be made in the abstract geometrical setting of Sect. 6. First we choose the normal fibration of $M^n$ in a neighbourhood of $\Delta_n$ which was used in the proof of Theorem 6.9 for the construction of the extension. Let $N\Delta_n$ be the orthogonal complement of $T\Delta_n$ in $T_{\Delta_n}M^n$ w.r.t. the metric in $M^n$,

$$ N\Delta_n = \Big\{(x, \xi_1, \ldots, \xi_n) \in T_{\Delta_n}M^n,\ \sum_i \xi_i = 0\Big\}. \tag{57} $$

Then $\pi_2 \circ \alpha|_{N\Delta_n}$ describes the desired fibration,

$$ \pi_2 \circ \alpha(x, \xi_1, \ldots, \xi_n) = (\exp_x \xi_1, \ldots, \exp_x \xi_n). \tag{58} $$

$x$ may be considered as the center of mass of the points $x_i = \exp_x \xi_i$, and the tangent vectors $\xi_i$ with the constraint $\sum_i \xi_i = 0$ play the rôle of relative coordinates. We further choose a smooth function $w$ on $TM$ which is equal to 1 on a neighbourhood of the zero section and has compact support on each fiber and use the function

$$ w_n(x, \xi_1, \ldots, \xi_n) = \prod_i w(x, \xi_i), \tag{59} $$

in the extension to the diagonal $\Delta_n$. By these conventions we get a reference definition of time-ordered products for all Wick products which involves only the chosen Hadamard state and the function $w$. The algebra of interacting fields can then be defined by choosing the coefficients in the Lagrangian.

8. On the Definition of the Net of Local Algebras of Observables

In the preceding section we finished the construction of time-ordered products of Wick polynomials of the free fields. We now want to show that this already gives the full net of local algebras of observables (within perturbation theory). An “adiabatic limit”, whatever this might mean on a curved space-time, is not required. Actually, these observations are not completely new. In [52] it was already observed that the local S-matrices of the Stückelberg–Bogoliubov–Epstein–Glaser approach give a local net of observables. Let $\mathcal{W}$ be the set of Wick polynomials with coefficients from $\mathcal{D}(M)$. So every $A \in \mathcal{W}$ is an operator valued distribution with compact support which is relatively local to the free field. Our starting relation is the causal factorization (2),

$$ S(A + B + C) = S(A + B)\, S(B)^{-1}\, S(B + C), \tag{60} $$

for $A, B, C \in \mathcal{W}$, whenever $\mathrm{supp}(A)$ is later than $\mathrm{supp}(C)$. Now let $L$ be our interaction Lagrangian. Then $gL \in \mathcal{W}$ for $g \in \mathcal{D}(M)$. We may define observables with respect to the interaction $gL$ by Bogoliubov's formula

$$ S_{gL}(A) = S(gL)^{-1}\, S(gL + A). \tag{61} $$

We now show that the interacting observables depend only locally on the interaction. More precisely, we have the following


Proposition 8.1. Let $\mathcal{O}$ be a causally closed region, and let the test functions $g$ and $g'$ coincide on some neighbourhood of $\mathcal{O}$. Then there exists a unitary $V$ such that for all $A \in \mathcal{W}$ with $\mathrm{supp}(A) \subset \mathcal{O}$,

$$ V S_{gL}(A) V^{-1} = S_{g'L}(A). \tag{62} $$

Proof. We may split $g' - g = a + b$, where $\mathrm{supp}\,a$ does not intersect the past of $\mathcal{O}$ and $\mathrm{supp}\,b$ does not intersect the future. Then $\mathrm{supp}(aL)$ is later than $\mathrm{supp}(A)$, hence from (60) we find

$$ S(g'L + A) = S(g'L)\, S((g+b)L)^{-1}\, S((g+b)L + A), \tag{63} $$

thus $S_{g'L}(A) = S_{(g+b)L}(A)$. Moreover, $\mathrm{supp}(A)$ is later than $\mathrm{supp}(bL)$, hence

$$ S((g+b)L + A) = S(gL + A)\, S(gL)^{-1}\, S((g+b)L). \tag{64} $$
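Spelled out, the two factorizations combine as follows. This is only a rearrangement of (61), (63) and (64), using $S(gL + A) = S(gL)\, S_{gL}(A)$, added here as a reading aid:

```latex
\begin{align*}
S_{g'L}(A) &= S(g'L)^{-1} S(g'L + A)
            = S((g+b)L)^{-1} S((g+b)L + A)
            && \text{by (63)} \\
           &= S((g+b)L)^{-1} S(gL + A)\, S(gL)^{-1} S((g+b)L)
            && \text{by (64)} \\
           &= \big[ S((g+b)L)^{-1} S(gL) \big]\, S_{gL}(A)\,
              \big[ S(gL)^{-1} S((g+b)L) \big]
            && \text{by (61)} \\
           &= V\, S_{gL}(A)\, V^{-1},
            \qquad V = \big[ S(gL)^{-1} S(gL + bL) \big]^{-1}.
\end{align*}
```

By (61) with $A$ replaced by $bL$, the bracket $S(gL)^{-1} S(gL + bL)$ is $S_{gL}(bL)$.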

Hence we obtain (62) with $V = S_{gL}(bL)^{-1}$. □

We conclude that the algebra $\mathcal{A}_{gL}(\mathcal{O})$ which is generated by $S_{gL}(A)$, $\mathrm{supp}(A) \subset \mathcal{O}$, with $g \equiv 1$ in a neighbourhood of $\mathcal{O}$, is up to unitary equivalence uniquely fixed by $L$. We may formalize the construction in the following way. Let $\Theta(\mathcal{O})$ for a causally closed compact region $\mathcal{O}$ be the set of test functions which equal unity on a neighbourhood of $\mathcal{O}$. Consider for $g, g' \in \Theta(\mathcal{O})$ the set $\mathcal{V}_{gg'}(\mathcal{O})$ of unitaries $V$ which satisfy the intertwining relation

$$ V S_{gL}(A) = S_{g'L}(A) V, \qquad \mathrm{supp}(A) \subset \mathcal{O}. \tag{65} $$

The algebra of observables $\mathcal{A}_L(\mathcal{O})$ can now be defined as the algebra of covariantly constant sections of the bundle

$$ \bigcup_{g \in \Theta(\mathcal{O})} \{g\} \times \mathcal{A}_{gL}(\mathcal{O}). \tag{66} $$

Here, a section $A = (A_g)_{g \in \Theta(\mathcal{O})}$ is called covariantly constant if

$$ V A_g = A_{g'} V, \qquad \forall V \in \mathcal{V}_{gg'}(\mathcal{O}). \tag{67} $$

$\mathcal{A}_L(\mathcal{O})$ contains for example the elements $S_L(A) = (S_{gL}(A))_{g \in \Theta(\mathcal{O})}$. To complete the construction of the net of local algebras of observables we have to fix the imbeddings $i_{\mathcal{O}_2 \mathcal{O}_1} : \mathcal{A}_L(\mathcal{O}_1) \to \mathcal{A}_L(\mathcal{O}_2)$ for $\mathcal{O}_1 \subset \mathcal{O}_2$. But this structure is inherited from the fibers and may be defined by the restriction of the section from $\Theta(\mathcal{O}_1)$ to $\Theta(\mathcal{O}_2)$.

9. Summary and Outlook

We have proven that renormalization on curved backgrounds can be done in close analogy to renormalization on Minkowski space, and that the removal of singularities follows the well known power counting rules. This result was expected since the ultraviolet behaviour on smooth manifolds should be essentially identical to that on Minkowski space. But we had to overcome two major obstacles: On a generic space-time there is no reason to expect a decent infrared behaviour, hence we had to use a method which decouples completely the short distance from the long distance problem; the other source of the problem was the absence of translation invariance which, on the technical side


makes obsolete the usual momentum space methods, and on the side of physics, forbids basing the construction on a distinguished vacuum state and on the notion of particles. We solved these problems by basing the construction on the local S-matrices of Stückelberg and Bogoliubov, by invoking the general ideas of algebraic quantum field theory and by replacing translation invariance by smoothness properties, making extensive use of techniques and concepts from microlocal analysis. Besides the solution of the problem to which this paper is addressed, we solved several other problems which might be of independent interest. First we have found a new construction of Wick polynomials on a domain which depends only on the representation associated to a fixed Hadamard state but not on the state itself. Since according to Verch [57], all representations induced by Hadamard states are locally quasiequivalent, this amounts to an algebraic construction of Wick polynomials. Second, we gave a perturbative construction of algebraic quantum field theory. In particular, we proved that the theory (in the algebraic sense), is completely fixed if it is known locally. Actually this holds independently of perturbation theory and might be a hint for a construction (in the sense of constructive quantum field theory) of asymptotically free theories. There the construction in small volumes seems to be possible [46] but the infrared problem poses, at present, insurmountable difficulties. The message of this paper is that the construction of the algebra of observables is nevertheless possible once the algebra of observables for small space-time regions have been constructed. The long distance behaviour of such a theory would still require an extra investigation, but it would be the behaviour of an existing theory, quite similar to the computation of spectra of Hamiltonians which have been shown to be self-adjoint operators. 
On the technical side we had to study the extension problem for distributions which are defined on the complement of some submanifold. This seems to be a natural mathematical question, and in a simple case it was treated in [23] by similar methods. What seems to be new is our concept of the (microlocal) scaling degree at a surface which combines the condition of smoothness along a surface with a classification of the singularity in a transversal direction. The main open point in this paper is the fixing of the finite renormalizations. One expects that they can be chosen in terms of local functions of the metric, but a precise formulation meets a lot of problems. A similar problem was studied (and partially solved) in the definition of the expectation value of a renormalized energy momentum tensor of free fields by R. Wald [61]. We hope to return to this problem in a future publication [10]. Acknowledgements. We are particularly grateful to Raymond Stora who long ago suggested to us the relevance of the Epstein and Glaser procedure for renormalization on curved backgrounds. In an early stage of this work, some results, in particular on the wave front sets of time-ordered functions, have been obtained in collaboration with Martin Köhler which is gratefully acknowledged. The first named author was partially supported by a grant of the Training and Mobility of Researchers (TMR) programme of the European Community.

References

1. Ashtekar, A.: Mathematical problems of non-perturbative quantum general relativity. In: Zinn-Justin et al. (eds.) Les Houches Summer School on Gravitation and Quantization. Amsterdam: North-Holland, 1994
2. Beem, J. K., Ehrlich, P. E. and Easley, K. L.: Global Lorentzian Geometry. New York: Marcel Dekker Inc., 1996
3. Birrell, N. D., and Davies, P. C. W.: Quantum Fields in Curved Space. Cambridge: Cambridge University Press, 1982
4. Blanchard, P., and Sénéor, R.: Green’s functions for theories with massless particles (in perturbation theory). Ann. Inst. Henri Poincaré 23, 147 (1975)
5. Bogoliubov, N. N., and Shirkov, D. V.: Introduction to the Theory of Quantized Fields. 3rd edition, New York: John Wiley and Sons, 1976
6. Bourbaki, N.: Algèbre. Chap. VIII. Paris: Hermann, 1970
7. Bros, J., Epstein, H., and Moschella, U.: Analyticity properties and thermal effects for general quantum field theory on de Sitter space-time. Commun. Math. Phys. 186, 535 (1998)
8. Brunetti, R., Fredenhagen, K., and Köhler, M.: The microlocal spectrum condition and the Wick’s polynomials of free fields. Commun. Math. Phys. 180, 633 (1996)
9. Brunetti, R., and Fredenhagen, K.: Microlocal analysis and interacting quantum field theory: Renormalizability of $\varphi^4$. In: Doplicher, S., Longo, R., Roberts, J. E., and Zsido, L. (eds.) Operator Algebras and Quantum Field Theory. Proceedings, Roma 1996, Cambridge, MA: International Press, 1997
10. Brunetti, R., and Fredenhagen, K.: Work in progress
11. Brunetti, R., and Fredenhagen, K.: On the connection between interacting quantum field theory and microlocal analysis. Forthcoming review paper
12. Buchholz, D.: Current trends in axiomatic quantum field theory. hep-th/9811233
13. Bunch, T. S.: BPHZ Renormalization of $\lambda\varphi^4$ field theory in curved space-times. Ann. of Phys. 131, 118 (1981)
14. De Witt, B. S., and Brehme, R. W.: Radiation damping in a gravitational field. Ann. of Phys. 9, 220 (1965)
15. Dimock, J.: Scalar quantum field in an external gravitational field. J. Math. Phys. 20, 2549 (1979)
16. Dosch, H. G., and Müller, V. F.: Renormalization of quantum electrodynamics in an arbitrary strong time independent external field. Fort. der Physik 23, 661 (1975)
17. Duistermaat, J. J., and Hörmander, L.: Fourier integral operators II. Acta Math. 128, 183 (1973)
18. Dütsch, M., and Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: The example of QED. Commun. Math. Phys. 203, 71 (1999)
19. Dyson, F.: Collected works. Providence RI: American Mathematical Society and International Press, 1996
20. Epstein, H., and Glaser, V.: The role of locality in perturbation theory. Ann. Inst. Henri Poincaré-Section A, vol. XIX, n.3, 211 (1973)
21. Epstein, H., and Glaser, V.: Adiabatic limit in perturbation theory. In: Velo, G., and Wightman, A. S. (eds.) Renormalization Theory. Proceedings, Dordrecht-Holland: D. Reidel Publishing Co., 1976
22. Epstein, H.: On the Borchers class of a free field. Nuovo Cimento 27, 886 (1966)
23. Estrada, R.: Regularization of distributions. Internat. J. Math. & Math. Sci. 21, 625 (1998)
24. Fredenhagen, K., and Haag, R.: Generally covariant quantum field theory. Commun. Math. Phys. 108, 91 (1987)
25. Fulling, S.: Aspects of Quantum Field Theory in Curved Space-Time. Cambridge: Cambridge University Press, 1989
26. Glimm, J., and Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York–Berlin–Heidelberg: Springer-Verlag, 1981
27. Guillemin, V., and Pollack, A.: Differential Topology. Englewood-Cliffs, N.J.: Prentice-Hall, Inc., 1974
28. Green, M. B., Schwarz, J. H., and Witten, E.: Superstring Theory. Vol. 1 and 2. Cambridge: Cambridge University Press, 1987
29. Haag, R.: Local Quantum Physics: Fields, particles and algebras. 2nd edition, Berlin: Springer-Verlag, 1996
30. Haag, R., Narnhofer, H., and Stein, U.: On quantum field theory in gravitational background. Commun. Math. Phys. 94, 219 (1984)
31. Halzen, F., and Martin, A. D.: Quarks and Leptons: An Introductory Course in Modern Particle Physics. New York: John Wiley and Sons, 1984
32. Hawking, S.: Particle creation by black holes. Commun. Math. Phys. 43, 199 (1975)
33. Hawking, S.: The Chronology protection conjecture. Phys. Rev. D 46, 603 (1992)
34. Hepp, K.: Théorie de la Renormalisation. Lect. Notes in Phys. 2. Berlin–Heidelberg: Springer-Verlag, 1969
35. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. I–IV. Berlin: Springer-Verlag, 1983–1986
36. Iagolnitzer, D.: Scattering in Quantum Field Theories: The Axiomatic and Constructive Approaches. Princeton NJ: Princeton University Press, 1993
37. Iagolnitzer, D.: Microlocal analysis and phase space decomposition. Lett. Math. Phys. 21, 323 (1991)
38. Itzykson, C., and Zuber, J. B.: Quantum Field Theory. New York: McGraw-Hill, 1980
39. Junker, W.: Hadamard states, adiabatic vacua and the construction of physical states for scalar quantum fields on curved spacetimes. Rev. Math. Phys. 8, 1091 (1996)
40. Kay, B. S., Radzikowski, M., and Wald, R. M.: Quantum field theories on spacetimes with a compactly generated Cauchy horizon. Commun. Math. Phys. 183, 533 (1997)
41. Kay, B. S., and Wald, R. M.: Theorems on the uniqueness and thermal properties of stationary, non singular, quasifree states on spacetimes with a bifurcate Killing horizon. Phys. Rep. 207, 49 (1991)
42. Kinoshita, T. (ed.): Quantum Electrodynamics. Singapore: World Scientific, 1990
43. Köhler, M.: Ph.D. thesis. University of Hamburg, 1994
44. Liess, O.: Conical refractions and higher microlocalization. Lect. Notes in Math. 1555. Berlin: Springer-Verlag, 1993
45. Lüscher, M.: Dimensional regularization in the presence of large background fields. Ann. of Phys. 142, 359 (1982)
46. Magnen, J., Rivasseau, V., and Sénéor, R.: Construction of YM-4 with an infrared cutoff. Commun. Math. Phys. 155, 325 (1993)
47. Osterwalder, K. and Schrader, R.: Axioms for Euclidean Green’s functions: I, II. Commun. Math. Phys. 31, 81 (1973); ibidem 42, 281 (1975)
48. Prange, D.: Causal perturbation theory and differential renormalization. hep-th/9710225
49. Radzikowski, M.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529 (1996)
50. Scharf, G.: Finite Quantum Electrodynamics: The Causal Approach. 2nd edition, Berlin: Springer-Verlag, 1995
51. Schwinger, J. (ed.): Selected Papers on Quantum Electrodynamics. New York: Dover, 1960
52. Il’in, V. A., and Slavnov, D. A.: Observable algebras in the S-matrix approach. Theor. Math. Phys. 36, 32 (1978)
53. Steinmann, O.: Perturbation Expansions in Axiomatic Field Theory. Lect. Notes in Phys. 11. Berlin: Springer-Verlag, 1971
54. Stora, R.: Differential algebras in Lagrangean field theory. ETH Lectures, January-February 1993. Manuscript
55. Streater, R. F., and Wightman, A. S.: PCT, Spin & Statistics and all that. New York: W.A. Benjamin, Inc., 1964
56. Stückelberg, E. C. G., and Peterman, A.: La normalisation des constants dans la theorie des quanta. Helv. Phys. Acta 26, 499 (1953); and earlier references therein
57. Verch, R.: Local definiteness, primarity and quasiequivalence of quasifree Hadamard quantum states in curved spacetime. Commun. Math. Phys. 160, 507 (1994)
58. Verch, R.: Wavefront sets in algebraic quantum field theory. math-ph/9807022
59. Wald, R. M.: General Relativity. Chicago: The University of Chicago Press, 1984
60. Wald, R. M.: Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago: The University of Chicago Press, 1994
61. Wald, R. M.: The back reaction effect in particle creation in curved spacetime. Commun. Math. Phys. 54, 1 (1977)
62. Weinberg, S.: The Quantum Theory of Fields. Vol. I–II, Cambridge: Cambridge University Press, 1995–1996
63. Wreszinski, W. F.: Note on the construction of the Bogolyubov scattering operator in the $(:\varphi^4:)_2$ theory. Theor. Math. Phys. 11, 331 (1972)
64. Zimmermann, W.: Convergence of Bogoliubov method of renormalization in momentum space. Commun. Math. Phys. 15, 208 (1969)

Communicated by A. Jaffe

Commun. Math. Phys. 208, 663 – 669 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Remarks on Positive Mass Theorem

Liqun Zhang*, Xiao Zhang**

Institute of Mathematics, Chinese Academy of Sciences, Beijing 100080, P. R. China. E-mail: [email protected]; [email protected]

Received: 4 May 1999 / Accepted: 28 June 1999

Abstract: We establish a positive mass theorem for time-symmetric initial data sets by relaxing the nonnegative scalar curvature condition to the case where the first Neumann eigenvalue of conformal Laplacian operators is eventually nonnegative.

1. Introduction

In general relativity, an initial data set is characterized by $(M, g_{ij}, p_{ij})$, where $M$ is a 3-dimensional manifold, $g_{ij}$ is a metric tensor of $M$ and $p_{ij}$ is a symmetric 2-tensor on $M$. (In fact, $p_{ij}$ can be treated as a nonsymmetric 2-tensor, see [Z].) For this data, local mass density and local momentum density are defined by

$$ \mu = \frac{1}{2}\big(R + (p^i{}_i)^2 - p_{ij}p^{ij}\big), \qquad J_i = \nabla_j p^{j}{}_{i} - \nabla_i p^{j}{}_{j}, $$

respectively, where $R$ is the scalar curvature of $M$. The initial data set satisfies the dominant energy condition if

$$ \mu \ge (J_i J^i)^{1/2}. \tag{1.1} $$

* The research is partially supported by the Chinese NSF.
** The research is partially supported by the Chinese NSF and mathematical physics program of CAS.

The positive mass theorem, which was proved first by R. Schoen and S.T. Yau [SY1, SY2, SY3], then by E. Witten [W, PT], states that for any asymptotically flat initial data set $(M, g_{ij}, p_{ij})$ which satisfies the dominant energy condition, the total energy is not less than the total linear momentum on each asymptotically flat end of $M$. Moreover, if the total energy is zero for some end, then $M$ is topologically $\mathbb{R}^3$ and can be isometrically embedded into 4-dimensional Minkowski space $\mathbb{R}^{3,1}$ as a spacelike hypersurface so that $g_{ij}$ is the induced metric from $\mathbb{R}^{3,1}$ and $p_{ij}$ is the second fundamental form. In order to understand how gravity is involved, Professor Yau suggested relaxing the dominant energy condition (1.1) to the condition that

$$ -\Delta + \frac{1}{4}(\mu - |J|) \tag{1.2} $$

is a nonnegative operator on $L^2(M)$ in the context of the positive mass theorem [Y]. In fact, (1.2) can be understood as an energy condition contributed both from matter and from gravity. In this note, we define the eventual nonnegativity of Dirichlet and Neumann eigenvalues of the Schrödinger operator $L = -\Delta + q$ on $H^1(M)$. ($L$ always denotes such a Schrödinger operator throughout the paper.) In the time-symmetric case, i.e., $p_{ij} = 0$, we can show that the positive mass theorem holds if the Neumann eigenvalue of operator (1.2) is eventually nonnegative on $M$. Note that the following example, found by J.P. Wang and the second author, shows that the positive mass theorem does not hold if only the Dirichlet eigenvalue of operator (1.2) is nonnegative: On $\mathbb{R}^3$, let $g_{ij} = u^4\delta_{ij}$, $p_{ij} = 0$, where

$$ u = 1 - \frac{1}{2\sqrt{1+r^2}}. $$

Since the conformal Laplacian $-\Delta + \frac{R}{8}$ of the metric $g$ has a positive solution $u^{-1}$, a theorem of Fischer–Colbrie and Schoen [FCS] says its Dirichlet eigenvalue is nonnegative. But this metric has total mass $-1$.

2. First Eigenvalues

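Let $M$ be a complete, noncompact manifold, $\Omega \subset M$ a bounded domain. Let $\lambda_L(\Omega)$ be the first Dirichlet eigenvalue of $L$ on $\Omega$, i.e.,

$$ \lambda_L(\Omega) = \inf_{f \in H_0^1(\Omega)} \frac{\int_\Omega |\nabla f|^2 + q f^2}{\int_\Omega f^2}, $$

and $\eta_L(\Omega)$ be the first Neumann eigenvalue of $L$ on $\Omega$, i.e.,

$$ \eta_L(\Omega) = \inf_{f \in H^1(\Omega)} \frac{\int_\Omega |\nabla f|^2 + q f^2}{\int_\Omega f^2}. $$

Remark 2.1. $\eta_L(\Omega)$ is, in general, nonzero for this Schrödinger operator; also, $\eta_L(\Omega) \le \lambda_L(\Omega)$. The domain monotonicity property holds for Dirichlet eigenvalues of $L$, i.e., if $\Omega_1 \subset \Omega_2$, then $\lambda_L(\Omega_1) \ge \lambda_L(\Omega_2)$. But this property doesn’t hold for Neumann eigenvalues.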
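Remark 2.1 can be checked numerically in a one-dimensional toy model. The sketch below is an illustration added here, not part of the original argument: it discretizes $L = -d^2/dx^2 + q$ on an interval $(0, \ell)$ by cell-centred finite differences and finds the lowest eigenvalue by Sturm bisection. The function names, the grid size and the test potentials are all assumptions.

```python
import math


def schrodinger_tridiagonal(q, length, n, boundary):
    """Cell-centred finite-difference matrix of -d^2/dx^2 + q on (0, length).

    Returns (diag, off) of a symmetric tridiagonal matrix; assumes n >= 2.
    boundary is "dirichlet" (f = 0 at both ends) or "neumann" (f' = 0).
    """
    h = length / n
    end = 3.0 if boundary == "dirichlet" else 1.0
    diag = []
    for i in range(n):
        x = (i + 0.5) * h
        stencil = end if i in (0, n - 1) else 2.0
        diag.append(stencil / h**2 + q(x))
    off = [-1.0 / h**2] * (n - 1)
    return diag, off


def smallest_eigenvalue(diag, off, iterations=100):
    """Smallest eigenvalue of a symmetric tridiagonal matrix (Sturm bisection)."""
    n = len(diag)

    def count_below(x):
        # Number of eigenvalues below x = number of negative pivots of A - x.
        count, pivot = 0, 1.0
        for i in range(n):
            coupling = off[i - 1] ** 2 / pivot if i > 0 else 0.0
            pivot = diag[i] - x - coupling
            if pivot == 0.0:
                pivot = 1e-30  # nudge off an exact zero pivot
            if pivot < 0.0:
                count += 1
        return count

    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [abs(off[i - 1]) if i > 0 else 0.0 for i in range(n)]
    for i in range(n - 1):
        radius[i] += abs(off[i])
    lo = min(a - r for a, r in zip(diag, radius)) - 1.0
    hi = max(a + r for a, r in zip(diag, radius)) + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def first_dirichlet(q, length, n=200):
    return smallest_eigenvalue(*schrodinger_tridiagonal(q, length, n, "dirichlet"))


def first_neumann(q, length, n=200):
    return smallest_eigenvalue(*schrodinger_tridiagonal(q, length, n, "neumann"))


def step_potential(x):
    """Toy potential: -1 on (0, 1), +100 beyond x = 1."""
    return -1.0 if x < 1.0 else 100.0


if __name__ == "__main__":
    print("Dirichlet:", first_dirichlet(step_potential, 1.0), first_dirichlet(step_potential, 2.0))
    print("Neumann:  ", first_neumann(step_potential, 1.0), first_neumann(step_potential, 2.0))
```

With the step potential, enlarging the interval from $(0,1)$ to $(0,2)$ lowers the Dirichlet value but raises the Neumann value, so the monotonicity of Remark 2.1 holds only in the Dirichlet case of this toy model.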


The domain monotonicity property allows us to define $\lambda_L(M) \ge 0$ in the following way:

Definition 2.2. We say $\lambda_L(M) \ge 0$ if for any $f \in H_0^1(M)$,

$$ \int_M |\nabla f|^2 + q f^2 \ge 0. $$

The definition of $\lambda_L(M) \ge 0$ implies that for any bounded domain $\Omega \subset M$, $\lambda_L(\Omega) > 0$. Now we introduce the definition of $\eta_L(M) \ge 0$.

Definition 2.3. We say $\eta_L(M) \ge 0$ if for any exhaustion $\{\Omega_n\}$ of $M$ there exists a large integer $N$ such that for any $f \in H^1(\Omega_n)$ and $n > N$,

$$ \int_{\Omega_n} |\nabla f|^2 + q f^2 \ge 0. $$

The exhaustion of $M$ means that $\{\Omega_n\}$ is a sequence of bounded domains of $M$, $\Omega_n \subset \Omega_{n+1}$ and $M = \bigcup_n \Omega_n$.

Remark 2.4. $\eta_L(M) \ge 0$ implies $\lambda_L(M) \ge 0$, but the converse is not true.

Remark 2.5. We can easily find some examples in which $q$ changes sign on $M$ while $\eta_L(M) \ge 0$. A sufficient condition for $\eta_L(M) \ge 0$ is that $\eta_L(M_0) \ge 0$ and $q \ge 0$ on $M \setminus M_0$. In particular, this is true when $q \ge 0$ on $M$.

3. Positive Solutions

Now we assume $M$ is a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$, i.e., there is a compact set $K \subset M$ such that $M \setminus K$ is the disjoint union of a finite number of subsets $M_1, \cdots, M_k$, called the “ends” of $M$, each diffeomorphic to the complement of a contractible compact set in $\mathbb{R}^3$. Under the diffeomorphism the metric of $M_l \subset M$ is of the form $g_{ij} = \delta_{ij} + a_{ij}$ in the standard coordinates $\{x^i\}$ on $\mathbb{R}^3$, where

$$ a_{ij} = O(r^{-\tau}), \qquad \partial_k a_{ij} = O(r^{-\tau-1}), \qquad \partial_l \partial_k a_{ij} = O(r^{-\tau-2}). $$

For convenience of the following discussion, we assume that $M$ has only one end. We assume also that the scalar curvature of $M$ satisfies

$$ \int_M |R| < \infty. $$

With these conditions, we can define the total energy in the end of $M$ as a limit over the spheres $S_r$ of radius $r$ in $M \setminus K \subset \mathbb{R}^3$,

$$ E = \frac{1}{16\pi} \lim_{r \to \infty} \int_{S_r} (\partial_j g_{ij} - \partial_i g_{jj}) * dx^i. $$

Note $E \equiv 0$ if $\tau > 1$.


Let $\gamma = \tau$ for $1 > \tau > \frac{1}{2}$ and $\gamma = 1 - \varepsilon$ for $\tau = 1$, where $\varepsilon > 0$ is chosen such that $1 > \gamma > \frac{1}{2}$. If $q \in C^{0,\alpha}_{-\tau-2}(M)$, then $L$ gives the map for the following weighted Hölder spaces

$$ L : C^{2,\alpha}_{-\gamma}(M) \longrightarrow C^{0,\alpha}_{-\gamma-2}(M). $$

Lemma 3.1. If $\eta_L(M) \ge 0$, then the above $L$ is an isomorphism.

Proof. We only need to show that the kernel of $L$ is trivial since, by Theorem 9.2 (d), [LP], the injectivity implies $L$ is an isomorphism. By the result of Fischer–Colbrie and Schoen [FCS], if $\lambda_L(M) \ge 0$, there exists a positive function $v$ on $M$ such that $Lv = 0$ on $M$. We may assume $v(p) = 1$ at some point $p \in M$. Now it is easy to see that there exist positive constants $C_1$, $C_2$, $C_3$ and $R_0 > 0$ such that for $r > R_0$ and $\bar{p} \in B_{2r}(p) \setminus B_r(p)$ the first nonzero Neumann eigenvalue of the Laplacian operator satisfies

$$ C_2^{-1} r^{-2} \le \eta_{-\Delta}(B_r(\bar{p})), $$

see Corollary 5.1 [L]. In fact, by the asymptotically flat assumption on $M$, we know that its Ricci curvature on $B_r(\bar{p})$ is bounded by $C/r^2$ for some constant $C > 0$. And the volume growth satisfies

$$ \mathrm{Vol}(B_{2r}(\bar{p})) \le C_3\, \mathrm{Vol}(B_r(\bar{p})). $$

We also have the Sobolev inequality for any $\phi \in C_0^\infty(B_r(\bar{p}))$,

$$ \Big(\int_{B_r(\bar{p})} \phi^6\Big)^{\frac{1}{3}} \le C_1 \int_{B_r(\bar{p})} |\nabla \phi|^2. $$

Since $q = O(r^{-\tau-2})$, [L, Theorem 11.1] says that there is $C > 0$ depending only on $C_1$, $C_2$, $C_3$ such that

$$ \sup_{B_r(\bar{p})} v \le C \inf_{B_r(\bar{p})} v. $$

Therefore there exists a constant $C > 0$ such that the following Harnack inequality is true,

$$ \sup_{B_{2r}(p) \setminus B_r(p)} v \le C \inf_{B_{2r}(p) \setminus B_r(p)} v. $$

In fact, the set $B_{2r}(p) \setminus B_r(p)$ can be covered by a finite number of balls $B_r(p_j)$ with $p_j \in B_{2r}(p) \setminus B_r(p)$ for $j = 1, \cdots, J$, where $J$ is independent of $r$. Now we claim that

$$ \inf_{x \in M} v > 0. $$

If not, by Harnack’s inequality, we deduce that for some $\{x_j\}$, $\lim_{x_j \to \infty} v = 0$. Put

$$ \Omega_n = \Big\{x \,\Big|\, v(x) \ge \frac{1}{n}\Big\}. $$


We may assume $x_j \in \partial B_{r_j}(p)$; then by the Harnack inequality we have
$$\lim_{r_j \to \infty,\ x \in S_{r_j}(p)} v(x) = 0.$$
Then $\Omega_n$ is contained in $B_{r_{j_n}}(p)$ for some $j_n$. So it is easy to see that $\{\Omega_n\}$ is an exhaustion of $M$. We may assume $\int_{\partial\Omega_n} \frac{\partial v}{\partial\nu} < 0$, since $\frac{\partial v}{\partial\nu} \le 0$, $v > 0$ and $\lim_{x_j \to \infty} v = 0$. Then
$$\int_{\Omega_n} |\nabla v|^2 + q v^2 = \frac{1}{n} \int_{\partial\Omega_n} \frac{\partial v}{\partial\nu} < 0.$$
This contradicts our assumption $\eta_L(M) \ge 0$.

Now suppose $Lh = 0$ for some $h \in C^{2,\alpha}_{-\delta}$. Put $w = \frac{h}{v}$. Then $w \to 0$ as $r \to \infty$. Since $w$ satisfies
$$\Delta w + 2(\nabla \log v)(\nabla w) = 0,$$
the maximum principle implies $w \equiv 0$. Thus $h \equiv 0$. □

Lemma 3.2. If $\eta_L(M) \ge 0$, then there is a unique positive solution of $Lu = 0$ such that $\lim_{r\to\infty} u = 1$, and such $u$ can be written as $u = 1 + O(r^{-\gamma})$ as $r \to \infty$. Furthermore, for any bounded positive solution of $Lu = 0$, $\lim_{r\to\infty} u$ exists.

Proof. Since $q \in C^{0,\alpha}_{-\gamma-2}(M)$, there exists a unique $h \in C^{2,\alpha}_{-\gamma}(M)$ such that
$$Lh = -q.$$
Then $u = 1 + h$ is the desired solution. Suppose $\bar{u}$ is any positive solution of $L\bar{u} = 0$. We may assume, up to a constant, that $\lim_{r\to\infty} \bar{u} = 1$. Let $\Omega = \{x \in M : \bar{u} - u \ge 0\}$. Again we put $w = \frac{\bar{u} - u}{v}$; then $w$ satisfies
$$\Delta w + 2(\nabla \log v)(\nabla w) = 0$$
on $\Omega$ and $w = 0$ on $\partial\Omega$. Then either $\bar{u} \equiv u$ on $\Omega$, and hence $\bar{u} \equiv u$ on $M$, or $\Omega$ is empty. In the first case we have $\lim_{r\to\infty} \bar{u} = 1$. In the second case we have $u - \bar{u} > 0$ on $M$ and $\lim_{r\to\infty} (\bar{u} - u) = 0$; by an argument similar to the one in the proof of Lemma 3.1, this is impossible under our assumption $\eta_L(M) \ge 0$. □


4. Main Result

Now we prove the following positive mass theorem.

Theorem 4.1. Let $M$ be a 3-dimensional asymptotically flat manifold of order $1 \ge \tau > \frac{1}{2}$ which has only one end. If the first Neumann eigenvalue of the conformal Laplacian operator $-\Delta + \frac{R}{8}$ is nonnegative, then $E \ge 0$. If $E = 0$, then $M$ is $\mathbb{R}^3$ with a strongly conformally flat metric $g_{ij} = (1 + o(r^{-1}))^4 \delta_{ij}$.

Proof. By Lemma 3.2, there exists a positive $u = 1 + O(r^{-\gamma})$ which satisfies
$$\Delta u - \frac{R}{8} u = 0. \qquad (4.1)$$
Thus
$$\int_{B_r} \Big(|\nabla u|^2 + \frac{R}{8} u^2\Big) * 1 = \int_{S_r} \big(\partial_i u + O(r^{-2\gamma-1})\big) * dx^i. \qquad (4.2)$$
Define a new metric $\tilde{g} = u^4 g$. Then $\tilde{g}$ is asymptotically flat of order $\gamma > \frac{1}{2}$ and scalar flat. By the previous positive mass theorem [SY1], we have
$$E(\tilde{g}) \ge 0,$$
and zero if and only if $M$ is $\mathbb{R}^3$ and $\tilde{g}_{ij} = \delta_{ij}$. Now
$$\int_{S_r} (\partial_j \tilde{g}_{ij} - \partial_i \tilde{g}_{jj}) * dx^i = \int_{S_r} \big(-2\partial_i u^4 + u^4 (\partial_j g_{ij} - \partial_i g_{jj}) + O(r^{-2\gamma-1})\big) * dx^i$$
$$= -8 \int_{B_r} \Big(|\nabla u|^2 + \frac{R}{8} u^2\Big) * 1 + \int_{S_r} \big((\partial_j g_{ij} - \partial_i g_{jj}) + O(r^{-2\gamma-1})\big) * dx^i.$$
By the assumption that $\eta_L(M) \ge 0$, there exists $R_0 > 0$ such that the left side of (4.2) is nonnegative for $r > R_0$. Since it also has a finite limit as $r \to \infty$, we have
$$E(\tilde{g}) = -\frac{1}{2\pi} \int_M \Big(|\nabla u|^2 + \frac{R}{8} u^2\Big) + E(g). \qquad (4.3)$$
If $E(g) = 0$, then (4.2) gives $\int_M (|\nabla u|^2 + \frac{R}{8} u^2) = 0$. So $E(\tilde{g}) = 0$. Thus $M$ is $\mathbb{R}^3$, and $\tilde{g}_{ij} = \delta_{ij}$. Therefore $g_{ij} = u^{-4} \delta_{ij}$. On the other hand,
$$\lim_{r\to\infty} \int_{S_r} \partial_\nu u = \int_M |\nabla u|^2 + \frac{R}{8} u^2 = 0,$$
where $\nu$ is the outer unit normal of the sphere $S_r$. This actually implies that $u = 1 + o(r^{-1})$. Therefore the proof of the theorem is complete. □


Remark 4.2. From the above theorem, we know that the positive mass theorem can hold for asymptotically flat manifolds whose scalar curvature is negative somewhere, which can be measured by the first Neumann eigenvalue of the conformal Laplacian operator. For example, let $\Omega$ be a compact subset of $M$, and let $g_{ij}$ be an asymptotically flat metric such that $\eta_{-\Delta+\frac{R}{8}}(\Omega) \ge 0$ and the scalar curvature satisfies $R \ge 0$ on $M \setminus \Omega$. Then $\eta_{-\Delta+\frac{R}{8}}(M) \ge 0$, hence the positive mass theorem holds for this metric.

Remark 4.3. Although we only treat 3-dimensional manifolds in this note, Theorem 4.1 can be extended to manifolds of dimension less than or equal to 7, or to spin manifolds of any dimension.

Remark 4.4. Now let us consider the case of black holes. Suppose $M$ has an inner boundary $\Sigma$ which is a minimal sphere. Then there is a positive solution of (4.1) which satisfies $\frac{\partial u}{\partial n} = 0$ on $\Sigma$, where $n$ is the inward normal vector of $\Sigma$. With respect to the new metric $u^4 g$, the mean curvature of $\Sigma$ is
$$\tilde{H} = u^{-2} H + 4u^{-3} \frac{\partial u}{\partial n} = 0.$$
Hence the conformal change of the metric $\tilde{g} = u^4 g$ preserves the minimal sphere. Since the positive mass theorem holds for the metric $\tilde{g}$, it holds for the metric $g$ also.

Acknowledgements. The second author would like to thank Professors F. H. Lin, L. F. Tam, J. P. Wang and S. T. Yau for their useful conversations. We would like to thank the referee for pointing out an error in the first version of the paper.

References

[B] Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 36, 661–693 (1986)
[D] Delanoe, P.: Generalized stereographic projections with prescribed scalar curvature. Contemporary mathematics. 127, 17–25 (1992)
[FCS] Fischer-Colbrie, D., Schoen, R.: The structure of complete stable minimal surface in 3-manifolds of nonnegative scalar curvature. Comm. Pure Appl. Math. 33, 199–211 (1980)
[GHHP] Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
[H] Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
[LP] Lee, J., Parker, T.: The Yamabe problem. Bull. Am. Math. Soc. 17, 31–81 (1987)
[L] Li, P.: Lecture notes on geometric analysis. Lecture Notes Series No. 6, Research Institute of Mathematics and Global Analysis Research Center, Seoul National University, Seoul, 1993, (http://www.math.uci.edu/ pli)
[PT] Parker, T., Taubes, C.: On Witten’s proof of the positive energy theorem. Commun. Math. Phys. 84, 223–238 (1982)
[S] Schoen, R.: Variational theory for the total scalar curvature functional for Riemannian metric and related topics. Lecture Notes in Math. 1365, Springer-Verlag, 1987, pp. 120–154
[SY1] Schoen, R., Yau, S.T.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65, 45–76 (1979)
[SY2] Schoen, R., Yau, S.T.: The energy and the linear momentum of spacetimes in general relativity. Commun. Math. Phys. 79, 47–51 (1981)
[SY3] Schoen, R., Yau, S.T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)
[W] Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
[Y] Yau, S.T.: Applications of geometric ideas in general relativity: Black holes and conserved quantity. Preprint
[Z] Zhang, X.: Angular momentum and positive mass theorem. Commun. Math. Phys. 206, 137–155 (1999)

Communicated by H. Nicolai

Commun. Math. Phys. 208, 671 – 687 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

On Tree Form-Factors in (Supersymmetric) Yang–Mills Theory

K. G. Selivanov

ITEP, B. Cheremushkinskaya 25, Moscow, 117259, Russia. E-mail: [email protected]

Received: 29 October 1998 / Accepted: 16 July 1999

Abstract: Perturbiner, that is, the solution of field equations which is a generating function for tree form-factors in N = 3 (N = 4) supersymmetric Yang–Mills theory, is studied in the framework of the twistor formulation of the N = 3 superfield equations. In the case when all one-particle asymptotic states belong to the same type of N = 3 supermultiplets (without any restriction on kinematics), the solution is described very explicitly. It happens to be a natural supersymmetrization of the self-dual perturbiner in non-supersymmetric Yang–Mills theory, designed to describe the Parke–Taylor amplitudes. In the general case, we reduce the problem to a neatly formulated algebraic geometry problem (see Eqs. (70), (71), (72)) and propose an iterative algorithm for solving it; however, we have not been able to find a closed-form solution. A solution of this problem would, of course, produce a description of all tree form-factors in non-supersymmetric Yang–Mills theory as well. In this context, the N = 3 superfield formalism may be considered as a convenient way to describe a solution of the non-supersymmetric Yang–Mills theory, very much in the spirit of works by E. Witten [1] and by J. Isenberg, P. B. Yasskin and P. S. Green [2].

1. Introduction

It is well known that multi-particle amplitudes, even in the tree approximation, are generically out of reach of the available field theoretical methods, though there have been many efforts and achievements in this direction (see [3]–[29] and other references to [3]). In principle, the problem can be considered as a purely technical one, since the n-particle tree amplitude is, according to text-book rules, represented as a sum of a number of rational functions of momenta of the particles (a term of the sum is a contribution of a Feynman diagram, and the amplitude is a sum of contributions of a number of diagrams).
However, the number of terms grows enormously with the number of particles, so that the total expression becomes untreatable when the number of particles becomes bigger than, say, 9, to say nothing about arbitrary n. On the other hand, sometimes a final expression for the amplitude is essentially simpler than the intermediate ones. There are known cases when cancellations among contributions of different Feynman diagrams are just wonderful [13]–[17], [23]. These cases are, essentially, scalar field amplitudes with most of the external particles at threshold [15]–[17], [23] and the so-called like-helicity amplitudes in Yang–Mills theory (that is, amplitudes with most of the external gluons in the same helicity state) [13,14]. If one is optimistic concerning possible cancellations in more general cases, say, in the case of Yang–Mills amplitudes with arbitrary helicities, one should look for a way to avoid those intermediate steps. A possible idea is to use the classical field equations, since the tree amplitudes can, of course, be obtained from a classical solution of the field equations. This approach has been discussed in the classical text-books [30,31], and it has recently been resurrected in the literature. The threshold amplitudes in scalar field theories were obtained from spatially uniform classical solutions; thus, the field equations reduced to ordinary differential equations [15]–[17], [23]. The like-helicity Yang–Mills amplitudes were related to solutions of the self-duality equations in [24,25]. From our subjective point of view, one of the most interesting by-products of the above developments was the idea of the perturbiner, or ptb-solution, [26]–[29]. To define the perturbiner, one first fixes a solution of the linearized field equations (which are assumed to describe asymptotic one-particle states) of the type
$$\phi^{(1)} = \sum_{J=1}^{L} a_J \epsilon_J t_J e^{k_J x} = \sum_{J}^{L} \epsilon_J \hat{E}_J, \qquad (1)$$
where $x$ stands for a space-time coordinate, $k_J$ stands for a momentum of the $J$th particle, $\epsilon_J$ stands for a polarization of the $J$th particle, $t_J$ stands for a “polarization” of the $J$th particle in the internal space (e.g. $t_J$ is a generator of the color group), $a_J$ is a symbol of the annihilation/creation operator, and
$$\hat{E}_J = t_J E_J, \qquad E_J = a_J e^{k_J x}. \qquad (2)$$

The perturbiner is a complex solution of the field equations which is a formal power series in the “harmonics” $E_J$, $J = 1, \dots, L$, Eq. (2), the first order term of the series being just the solution of Eq. (1). Notice that the $x$-dependence of the perturbiner comes only via monomials in $E_J$, $J = 1, \dots, L$, on which differential operators entering the field equations act as algebraic operators, and existence/uniqueness¹ of the perturbiner normally takes place provided the set of momenta, $k_J$, $J = 1, \dots, L$, is non-resonant, that is, provided none of the linear combinations of $k_J$, $J = 1, \dots, L$, with positive integer coefficients including more than one momentum gets to the mass shell. The physical meaning of the perturbiner is that its expansion in powers of the symbols $a_J$, $J = 1, \dots, L$, generates tree form-factors in the theory,
$$\phi^{ptb}(x, \{k\}, \{a\}) = \sum_{l=1}^{L} \sum_{\{J\}} a_{J_1} \dots a_{J_l} \langle k_{J_1}, \dots, k_{J_l} | \phi(x) | 0 \rangle_{tree}. \qquad (3)$$
Notice that all the external mass-shell particles are arbitrarily considered as the out-ones. In principle, those with negative frequency should be considered as out-states while those with positive frequency should be considered as in-states, but at tree level the analytical continuation from negative to positive frequency is trivial, so we do not distinguish them. The form-factors $\langle k_{J_1}, \dots, k_{J_l} | \phi(x) | 0 \rangle_{tree}$ in Eq. (3) are in the coordinate representation. The monomials in the harmonics $E_J$, $J = 1, \dots, L$, will produce the momentum conservation δ-functions after transformation to the momentum space. It is very convenient to add to the above definition of the perturbiner the requirement of nilpotency of the symbols $a_J$, $J = 1, \dots, L$, that is²
$$a_J^2 = 0. \qquad (4)$$
In terms of the form-factors the nilpotency means that form-factors with identical asymptotic states will not appear in the expansion of the perturbiner in powers of $a_J$, $J = 1, \dots, L$ (see Eq. (3)). Clearly, it does not assume any loss of generality if the perturbiner is known for an arbitrary number $L$ of the asymptotic one-particle states, since the form-factors with identical asymptotic states can be obtained from those with all asymptotic states different. It is clear that under the nilpotency assumption the perturbiner is, in fact, not a power series but just a polynomial in the (nilpotent) harmonics $E_J$, $J = 1, \dots, L$, Eq. (2). Actually, for massless particles, the nilpotency assumption is necessary for the non-resonantness condition, because any multiple of a light-like momentum is again a light-like momentum, and the nilpotency makes these multiples irrelevant. Everywhere below we shall assume the nilpotency condition Eq. (4).

We note that textbooks (see, e.g., [30,31]) offer another definition of the solution of field equations generating tree amplitudes in the theory (they use the so-called Feynman asymptotic condition to define it). Our definition above proved to be very convenient; in particular, it happened to be very conveniently compatible with the twistor description of solutions for the gauge self-duality equation [26], for the gravitational self-duality equations [27] and for the gauge-gravitational self-duality equation [28]. The traditional finite-action and reality conditions are substituted in the case of the perturbiner by the condition of analyticity in the harmonics $E_J$, $J = 1, \dots, L$, Eqs. (1), (2). It is worth explaining that the perturbiner obeys the self-duality equations instead of the full equations, such as the Yang–Mills equations or Einstein equations, when all polarizations entering Eq. (1) describe the same helicity state. The self-dual perturbiner generates only like-helicity form-factors. In this way the so-called Parke–Taylor amplitudes [13,14] are very unusually described in terms of meromorphic functions on an auxiliary $CP^1$ space.

In this paper we describe the perturbiner in N = 3 (N = 4) supersymmetric Yang–Mills theory. N = 3 supersymmetric Yang–Mills is equivalent to N = 4 super Yang–Mills, but the N = 3 superfield formalism is more naturally combined with twistors [1], which is why we follow the N = 3 notation. The N = 4 super Yang–Mills multiplet consists of the following particles:

    1 × 1      (positive helicity gluon)
    4 × 1/2    (positive helicity gluinos)
    6 × 0      (scalars)
    4 × −1/2   (negative helicity gluinos)
    1 × −1     (negative helicity gluon)        (5)

¹ In gauge theories the uniqueness is, of course, modulo gauge transformations.
² This condition has nothing to do with the statistics of the particles considered. Say, for bosons the symbols still commute, while for fermions they anticommute.


This multiplet decomposes into two N = 3 multiplets as follows:

    1 × 1
    3 × 1/2
    3 × 0
    1 × −1/2        (6)

    1 × 1/2
    3 × 0
    3 × −1/2
    1 × −1          (7)
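The counting in tables (5), (6) and (7) can be checked with a short Python sketch. It assumes the standard rule that each of the $N$ supercharges lowers the helicity by 1/2, so that the multiplicity at the $k$th level is a binomial coefficient; the function name `multiplet` is hypothetical and serves only this illustration.

```python
from fractions import Fraction
from math import comb


def multiplet(n_susy, top_helicity):
    """Map helicity -> number of states in a massless multiplet.

    Assumes the k-th level below the top helicity has comb(n_susy, k) states.
    """
    return {top_helicity - Fraction(k, 2): comb(n_susy, k) for k in range(n_susy + 1)}


n4 = multiplet(4, Fraction(1))        # table (5)
n3_a = multiplet(3, Fraction(1))      # table (6)
n3_b = multiplet(3, Fraction(1, 2))   # table (7)

# Add the two N = 3 multiplets helicity by helicity.
combined = {}
for table in (n3_a, n3_b):
    for helicity, count in table.items():
        combined[helicity] = combined.get(helicity, 0) + count

if __name__ == "__main__":
    for helicity in sorted(n4, reverse=True):
        print(helicity, n4[helicity], combined[helicity])
```

The agreement of `combined` with `n4` is Pascal's rule for binomial coefficients, which is one way to see why the 16 states of table (5) split into the two 8-state multiplets of tables (6) and (7).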

It turns out that if one includes only states from one type of the N = 3 multiplets, say, the one from table (6) (in arbitrary kinematics), that is, if all plane waves in Eq. (1) belong to the same type of the N = 3 multiplets (with arbitrary momenta $k_J$, $J = 1, \dots, L$), the solution is obtained from the (non-supersymmetric) self-dual perturbiner [26] by substituting the harmonics $E$, Eq. (2), with their supersymmetric extensions $S$, Eq. (51). Such a solution will be called the chiral N = 3 perturbiner. If one includes both types of the N = 3 multiplets the problem becomes much more complicated.³ In this case we describe a twistor reformulation of the problem and show how it can be solved iteratively, but we have not been able to find a closed-form solution of the problem. Nevertheless, the problem is reduced to a neatly formulated algebraic geometry problem, Eqs. (70), (71), (72), and we feel that the complete solution might be somewhere nearby and we, perhaps, just do not know the appropriate mathematics to describe it.

The rest of the paper is organized as follows. In Sect. 2 we, to make this paper self-contained, recall the construction of the non-supersymmetric self-dual perturbiner [26].⁴ A key point is a sort of Riemann–Hilbert problem, Eq. (31), which is solved upon introducing the so-called “color ordering” (see Eq. (32) and explanations about it). Interestingly, the same solution Eq. (33) of the same problem Eq. (31) was shown [29] to generate tree form-factors in sin(h)-Gordon theory. In Sect. 3 we recall the N = 3 super Yang–Mills equations and construct the plane wave solution of the linearized field equations. In Sect. 4 we describe the chiral N = 3 perturbiner. In Sect. 5 the generic (non-chiral) perturbiner is considered.

³ This is not surprising because all tree form-factors in non-supersymmetric Yang–Mills are contained among the N = 4 form-factors (when all external particles are gluons, the fermions and the charged scalars can appear only in loops). Actually, all the N = 3 machinery can be viewed as a convenient way to describe solutions of ordinary Yang–Mills equations, very much in the spirit of [1] (see, also, [2]).
⁴ A solution, similar to our self-dual perturbiner, has been discussed in [32].

2. Non-Supersymmetric Self-Dual Perturbiner [26]

We adopt the spinor notation, so that the connection-form has two indices, $A_{\alpha\dot\alpha}$, $\alpha = 1, 2$, $\dot\alpha = \dot{1}, \dot{2}$, and so have the space-time derivative, $\partial_{\alpha\dot\alpha} = \frac{\partial}{\partial x^{\alpha\dot\alpha}}$, and the connection itself, $\nabla_{\alpha\dot\alpha} = \partial_{\alpha\dot\alpha} + A_{\alpha\dot\alpha}$. The curvature form,
$$F_{\alpha\dot\alpha\beta\dot\beta} = [\nabla_{\alpha\dot\alpha}, \nabla_{\beta\dot\beta}], \qquad (8)$$
has four indices and, being antisymmetric with respect to the transposition $\alpha\dot\alpha \leftrightarrow \beta\dot\beta$, decomposes as follows:
$$F_{\alpha\dot\alpha\beta\dot\beta} = \varepsilon_{\alpha\beta} F_{\dot\alpha\dot\beta} + \varepsilon_{\dot\alpha\dot\beta} F_{\alpha\beta}, \qquad (9)$$
where $\varepsilon_{\alpha\beta}$, $\varepsilon_{\dot\alpha\dot\beta}$ are the standard antisymmetric tensors and the fields $F_{\dot\alpha\dot\beta}$, $F_{\alpha\beta}$ are symmetric with respect to transposition of indices. The first term in the r.h.s. of Eq. (9) is the self-dual part of the curvature, the second is the antiself-dual part (the metric in this notation is $g_{\alpha\dot\alpha\beta\dot\beta} = \varepsilon_{\alpha\beta} \varepsilon_{\dot\alpha\dot\beta}$). Then, the self-duality equation is
$$F_{\alpha\beta} = 0. \qquad (10)$$

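The (anti)self-duality equation is very well known to be sufficient for the Yang–Mills equation to be satisfied. Construction of the perturbiner (see the Introduction) starts with picking a solution of the linearized (“free”) field equation. The linearized version of the self-duality equation reads
$$\partial_{\alpha\dot\alpha} A_{\beta\dot\beta} - \partial_{\beta\dot\beta} A_{\alpha\dot\alpha} \big|_{\text{symmetrized in } \alpha,\beta} = 0. \qquad (11)$$
For a plane wave solution (see Eq. (1)),
$$A_{\alpha\dot\alpha} = \epsilon_{\alpha\dot\alpha}\, t\, e^{k_{\beta\dot\beta} x^{\beta\dot\beta}}, \qquad (12)$$
Eq. (11) gives
$$k_{\alpha\dot\alpha} \epsilon_{\beta\dot\beta} - k_{\beta\dot\beta} \epsilon_{\alpha\dot\alpha} \big|_{\text{symmetrized in } \alpha,\beta} = 0, \qquad (13)$$
which, in turn, gives
$$k_{\alpha\dot\alpha} = æ_\alpha \bar{æ}_{\dot\alpha}, \qquad \epsilon^{(+)}_{\alpha\dot\alpha} = \frac{q_\alpha \bar{æ}_{\dot\alpha}}{(æ, q)}. \qquad (14)$$
Equations (14) mean that $k_{\alpha\dot\alpha}$ and $\epsilon_{\alpha\dot\alpha}$ are both light-like (recall that the metric is $g_{\alpha\dot\alpha\beta\dot\beta} = \varepsilon_{\alpha\beta} \varepsilon_{\dot\alpha\dot\beta}$); moreover, the dotted spinor in the decomposition of $k_{\alpha\dot\alpha}$ and $\epsilon_{\alpha\dot\alpha}$ is the same. $q_\alpha$ is an arbitrary (reference) spinor defined up to (linearized) gauge equivalence,
$$q_\alpha \sim q_\alpha + \mathrm{const} \cdot æ_\alpha. \qquad (15)$$
The factor $(æ, q)$ is introduced for normalization. The brackets $(p, q)$ here and below are defined as
$$(p, q) = p^\alpha q_\alpha = \varepsilon_{\alpha\beta}\, p^\alpha q^\beta \qquad (16)$$
(indices are raised and lowered with the ε-symbols). With this normalization, the antiself-dual plane wave would have a polarization
$$\epsilon^{(-)}_{\alpha\dot\alpha} = \frac{æ_\alpha \bar{q}_{\dot\alpha}}{(\bar{æ}, \bar{q})} \qquad (17)$$
and
$$\epsilon^{(+)}_{\alpha\dot\alpha}\, \epsilon^{(-)\alpha\dot\alpha} = -1. \qquad (18)$$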
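The statements around Eqs. (14)–(18) can be verified numerically. The sketch below is only an illustration: the spinor components are arbitrary complex numbers chosen for the example, the helper names are hypothetical, and the bracket is implemented as the antisymmetric product of lower-index components (the overall sign convention for raising indices drops out of the quantities tested here).

```python
def bracket(p, q):
    """Antisymmetric spinor product built with the epsilon symbol."""
    return p[0] * q[1] - p[1] * q[0]


def outer(a, b, scale=1.0):
    """Bispinor scale * a_alpha * b_alphadot as a 2x2 nested list."""
    return [[scale * a[i] * b[j] for j in range(2)] for i in range(2)]


def contract(A, B):
    """A_{alpha alphadot} B^{alpha alphadot} with the metric eps_{alpha beta} eps_{alphadot betadot}."""
    eps = [[0, 1], [-1, 0]]
    return sum(eps[i][k] * eps[j][l] * A[i][j] * B[k][l]
               for i in range(2) for j in range(2)
               for k in range(2) for l in range(2))


# Arbitrary illustrative spinors (assumed values, not from the paper).
kappa, kappa_bar = (1 + 2j, 0.5 - 1j), (0.3 + 0.7j, -1.1 + 0.2j)
q, q_bar = (0.4 - 0.3j, 1.5 + 0.9j), (-0.8 + 0.1j, 0.6 + 1.2j)

k = outer(kappa, kappa_bar)                                            # momentum, Eq. (14)
eps_plus = outer(q, kappa_bar, 1.0 / bracket(kappa, q))                # Eq. (14)
eps_minus = outer(kappa, q_bar, 1.0 / bracket(kappa_bar, q_bar))       # Eq. (17)

if __name__ == "__main__":
    print("k.k       =", contract(k, k))
    print("k.eps(+)  =", contract(k, eps_plus))
    print("eps+.eps- =", contract(eps_plus, eps_minus))
```

Because each bispinor factorizes, every contraction reduces to a product of two brackets, which is why the momentum is light-like, the polarization is transverse to it, and the normalization of Eq. (18) comes out independent of the reference spinors.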

The polarizations Eqs. (14) and (17) can be seen to define positive and negative helicity states, correspondingly. Notice that the spinors $æ_\alpha$, $\bar{æ}_{\dot\alpha}$ entering $k_{\alpha\dot\alpha}$, Eq. (14), can be considered as independent since we are anyway looking for a complex solution. At the end, in the computation of probabilities, the reality condition
$$\bar{æ}_{\dot\alpha} = \sqrt{-1}\, æ^*_\alpha \qquad (19)$$
should be imposed. So, the appropriate solution of the linearized field equations reads
$$A^{(1)}_{\alpha\dot\alpha} = \sum_{J=1}^{L} \frac{q^J_\alpha \bar{æ}^J_{\dot\alpha}}{(æ^J, q^J)} \hat{E}_J, \qquad (20)$$
where, as in Eq. (2),
$$\hat{E}_J = t_J E_J, \qquad E_J = a_J e^{k^J x}, \qquad (21)$$

where $x^{\alpha\dot\alpha}$ stands for a space-time coordinate, $k^J_{\alpha\dot\alpha}$ stands for a momentum of the $J$th particle, $t_J$ stands for a generator of the color group, and $a_J$ is a symbol of the annihilation/creation operator of the $J$th particle (obeying the nilpotency condition, Eq. (4)).

We are now going to use the twistor construction [33] to describe solutions of the (nonlinear) self-duality equation (10). Introduce a couple of complex numbers, $\rho^\alpha$, $\alpha = 1, 2$. $\rho^\alpha$ will also be referred to below as the auxiliary spinor. $\rho^\alpha$, $\alpha = 1, 2$, can be considered as homogeneous coordinates on a $CP^1$ space. Contracting the undotted indices of the curvature form $F_{\alpha\dot\alpha\beta\dot\beta}$, Eq. (8), with ρ’s one automatically picks out the antiself-dual part of it (see Eq. (9)) (because the self-dual part is antisymmetric in the undotted indices). Hence, the self-duality equation is equivalent to a sort of zero-curvature condition
$$[\nabla_{\dot\alpha}, \nabla_{\dot\beta}] = 0 \quad \text{at any } \rho^\alpha,\ \alpha = 1, 2, \qquad (22)$$
where $\nabla_{\dot\alpha} = \rho^\alpha \nabla_{\alpha\dot\alpha}$. Thus, if one introduces
$$A_{\dot\alpha} = \rho^\alpha A_{\alpha\dot\alpha}, \qquad (23)$$
any self-dual connection form can be (locally) represented as
$$A_{\dot\alpha} = g^{-1} \partial_{\dot\alpha} g, \qquad (24)$$
where $g$ is a group valued function of $\rho$ and $x$ and $\partial_{\dot\alpha} = \rho^\alpha \partial_{\alpha\dot\alpha}$. All the non-triviality of the self-duality equation is now encoded in the condition that $g$ must depend on $\rho$ in such a way that $A_{\dot\alpha}$ is a polynomial of degree 1 in $\rho$, as in Eq. (23). If $g$ is ρ-independent, it is a pure gauge transformation, as is seen from Eq. (24). The above condition on the ρ-dependence of $g$ is equivalent to the condition that $g$ is a homogeneous meromorphic function of $\rho$ of degree 0 such that $A_{\dot\alpha}$ from Eq. (24) is a homogeneous holomorphic function of $\rho$ of degree 1 (a homogeneous holomorphic function of $\rho$ of degree 1 is necessarily just linear in $\rho$, as in Eq. (23)). Notice that a nontrivial (not a pure gauge) $g$ necessarily has singularities in $\rho$, since if it is regular homogeneous of degree 0, then it is just ρ-independent, that is, a pure gauge.

All the above is about the ρ-dependence of $g$. In the case of the perturbiner, the group-valued function $g$, like the connection-form $A_{\alpha\dot\alpha}$ (see the definition of the perturbiner in the Introduction), must be polynomial in the harmonics $E_J$, $J = 1, \dots, L$, Eq. (21). The first order terms of this polynomial,
$$g^{(1)}_{ptb}(\rho, \{E\}) = \sum_{J=1}^{L} g_J(\rho) \hat{E}_J, \qquad (25)$$
are fixed by those in $A_{\alpha\dot\alpha}$, Eq. (20). Expanding Eq. (24) up to first order in the harmonic $E_J$ and using Eq. (20) one obtains
$$g_J(\rho) = \frac{(\rho, q^J)}{(\rho, æ^J)(æ^J, q^J)}. \qquad (26)$$

Equations (25), (26) define the first order terms in the expansion of $g$ in powers of the harmonics $E_J$, $J = 1, \dots, L$. Thus, in terms of $g$, our problem is as follows. We must find the polynomial $g_{ptb}$,
$$g_{ptb}(\rho, \{E\}) = 1 + g^{(1)}_{ptb}(\rho, \{E\}) + \text{higher order terms in powers of } E\text{'s}, \qquad (27)$$
which is a rational function of $\rho^\alpha$ of degree 0, such that $A_{\dot\alpha}$ from Eq. (24) is regular. Notice that $g^{(1)}_{ptb}$ has first order poles at $\rho^\alpha = æ^{\alpha J}$, $J = 1, \dots, L$ (due to the factors $(\rho, æ^J) = \varepsilon_{\alpha\beta} \rho^\alpha æ^{\beta J}$ in the denominators, see Eq. (26)). We recall that $æ^J_\alpha$ is a spinor which appears in the decomposition of the corresponding four-momentum $k_{\alpha\dot\alpha}$, Eq. (14). Importantly, the singular part of the complete $g_{ptb}$ (not only of $g^{(1)}_{ptb}$) at $\rho^\alpha = æ^{\alpha J}$ is necessarily proportional to the harmonic $E_J$ (that is because $g_{ptb}$ at $E_J = 0$ does not contain any information about the $J$th particle; form-factors including the $J$th particle will not be generated by the perturbiner at $E_J = 0$). Then, taking into account the nilpotency, Eq. (4), $E_J^2 = 0$, one can show that $g_{ptb}$ may have only a simple pole at $\rho^\alpha = æ^{\alpha J}$ for $A_{\dot\alpha}$ from Eq. (24) to be regular at this point. Moreover, the residue of $g_{ptb}$ at this point must obey the condition
$$\partial_{\dot\alpha} \big( \mathrm{res}|_{\rho=æ^J}\, g \cdot g^{-1} \big) \big|_{\rho=æ^J} = 0, \qquad (28)$$
and, according to the rules of the game (see the definition of the perturbiner in the Introduction), Eq. (28) must be solved in the form of a polynomial in the harmonics $E_J$, $J = 1, \dots, L$. Clearly, the unique (up to a color Lie algebra valued constant) solution of Eq. (28) reads
$$\mathrm{res}|_{\rho=æ^J}\, g \cdot g^{-1} \big|_{\rho=æ^J} = \widehat{\mathrm{const}}_J \cdot E_J \qquad (29)$$
(recall that $\partial_{\dot\alpha}|_{\rho=æ^J} = æ^{\alpha J} \partial_{\alpha\dot\alpha}$). The $\widehat{\mathrm{const}}$ entering Eq. (29) is constant in the sense that it is $E$-independent. It can be found by putting all the harmonics $E$ but $E_J$ to 0. Then, using Eqs. (25), (26), one finds
$$\mathrm{res}|_{\rho=æ^J}\, g_J \cdot t_J = \widehat{\mathrm{const}}_J \qquad (30)$$
with the “one-particle” $g_J$ from Eq. (26).


Equations (29), (30) are seen to be equivalent to the condition that
$$(1 - g_J \hat{E}_J)\, g_{ptb} \ \text{ is regular at } \rho^\alpha = æ^{\alpha J}, \quad J = 1, \dots, L \qquad (31)$$
(with $g_J$ from Eq. (26)). The condition expressed by Eq. (31) defines $g_{ptb}$ uniquely modulo gauge transformations (that is, modulo multiplication by a ρ-independent matrix from the right). To conveniently represent the solution of Eq. (31) let us assume for a moment that the generators $t_J$, $J = 1, \dots, L$, defining color states of the gluons (see Eq. (20)) belong to a free associative algebra (that is, there is no relation between them but the associativity relation $(t_{J_1} t_{J_2}) t_{J_3} = t_{J_1} (t_{J_2} t_{J_3})$). This means that all monomials of the type $\hat{E}_{J_1} \hat{E}_{J_2} \dots \hat{E}_{J_L}$ ($\hat{E}_J$ as in Eq. (20)) are linearly independent. $g_{ptb}$ is then uniquely represented as
$$g_{ptb}(\rho, \{E\}) = 1 + \sum_{J} g_J(\rho) \hat{E}_J + \sum_{J_1, J_2} g_{J_1, J_2}(\rho) \hat{E}_{J_1} \hat{E}_{J_2} + \dots. \qquad (32)$$
Equation (31) is then easily solved for the coefficients $g_{J_1, J_2, \dots, J_L}$,
$$g_{J_1, J_2, \dots, J_L}(\rho) = \frac{(\rho, q^{J_1})}{(\rho, æ^{J_1})} \cdot \frac{(æ^{J_1}, q^{J_2})(æ^{J_2}, q^{J_3}) \dots (æ^{J_{L-1}}, q^{J_L})}{(æ^{J_1}, æ^{J_2})(æ^{J_2}, æ^{J_3}) \dots (æ^{J_{L-1}}, æ^{J_L})} \cdot \frac{1}{(æ^{J_1}, q^{J_1})(æ^{J_2}, q^{J_2}) \dots (æ^{J_L}, q^{J_L})}. \qquad (33)$$
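Equation (33) and the regularity condition (31) can be probed numerically at second order. In the sketch below, which is only an illustration, the coefficient of $\hat{E}_1 \hat{E}_2$ in $(1 - g_1 \hat{E}_1) g_{ptb}$ is $g_{1,2} - g_1 g_2$, and the code checks that it stays finite as $\rho$ approaches $æ^1$ although $g_1$ itself has a pole there. The spinor values and helper names are assumptions made for the example; real components are used for simplicity.

```python
def bracket(p, q):
    """Antisymmetric spinor product built with the epsilon symbol."""
    return p[0] * q[1] - p[1] * q[0]


def g_coeff(rho, kappas, qs):
    """Coefficient g_{J1...JL}(rho) of Eq. (33) for ordered lists of spinors.

    For a single particle this reduces to g_J(rho) of Eq. (26).
    """
    value = bracket(rho, qs[0]) / bracket(rho, kappas[0])
    for i in range(len(kappas) - 1):
        value *= bracket(kappas[i], qs[i + 1]) / bracket(kappas[i], kappas[i + 1])
    for kappa, q in zip(kappas, qs):
        value /= bracket(kappa, q)
    return value


# Illustrative spinors (assumed values, not from the paper).
kappa1, kappa2 = (1.0, 0.0), (0.0, 1.0)
q1, q2 = (1.0, 1.0), (1.0, -1.0)


def second_order_defect(delta):
    """g_{12} - g_1 g_2 at rho = (1, delta), which tends to kappa1 as delta -> 0."""
    rho = (1.0, delta)
    g1 = g_coeff(rho, [kappa1], [q1])
    g2 = g_coeff(rho, [kappa2], [q2])
    g12 = g_coeff(rho, [kappa1, kappa2], [q1, q2])
    return g12 - g1 * g2


if __name__ == "__main__":
    for delta in (1e-2, 1e-4, 1e-6):
        print(delta, g_coeff((1.0, delta), [kappa1], [q1]), second_order_defect(delta))
```

For these particular spinors the combination equals $1 - \delta$, so the poles of $g_{1,2}$ and $g_1 g_2$ cancel exactly. The same function also shows that every coefficient vanishes at $\rho = q$ when all reference spinors are chosen equal to $q$, since $(q, q) = 0$.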

Equations (32), (33) are a solution of the problem Eq. (31). Of course, it remains a solution if one introduces back the relations between the color group generators $t_J$, $J = 1, \dots, L$. Any other solution is obtained from Eqs. (32), (33) by multiplying it by a ρ-independent matrix from the right.⁵

The connection-form, $A^{ptb}_{\alpha\dot\alpha}$, is obtained from $g^{ptb}$, Eqs. (32), (33), via Eqs. (23), (24). One can do it by a straightforward computation. One can also simplify the computation by noticing that by construction $A^{ptb}_{\dot\alpha} = g^{-1}_{ptb} \partial_{\dot\alpha} g_{ptb}$ is linear in $\rho^\alpha$, $\alpha = 1, 2$. Hence $A^{ptb}_{\alpha\dot\alpha}$ can be found as a ρ-derivative of $A^{ptb}_{\dot\alpha}$ taken at any value of $\rho$. Choosing all $q$’s in Eq. (33) equal to each other and equal to a spinor $q$ (recall that the $q$’s are defined up to the gauge freedom Eq. (15)) we find $A^{ptb}_{\alpha\dot\alpha}$ as a ρ-derivative of $A^{ptb}_{\dot\alpha}$ at $\rho^\alpha = q^\alpha$. Since $g^{ptb}|_{\rho^\alpha = q^\alpha} = 1$ (see Eq. (33)), the computation becomes really easy, and one finds
$$A^{ptb}_{\alpha\dot\alpha} = \sum_{J=1}^{L} A^J_{\alpha\dot\alpha} \hat{E}_J + \sum_{J_1 J_2} A^{J_1 J_2}_{\alpha\dot\alpha} \hat{E}_{J_1} \hat{E}_{J_2} + \dots,$$
$$A^{J_1 \dots J_M}_{\alpha\dot\alpha} = - \frac{q_\alpha q^\beta k^{J_1 \dots J_M}_{\beta\dot\alpha}}{(æ^{J_1}, q)(æ^{J_M}, q)} \cdot \frac{1}{(æ^{J_1}, æ^{J_2})(æ^{J_2}, æ^{J_3}) \dots (æ^{J_{M-1}}, æ^{J_M})}, \qquad (34)$$
where $k^{J_1 J_2 \dots J_L}_{\alpha\dot\alpha} = k^{J_1}_{\alpha\dot\alpha} + k^{J_2}_{\alpha\dot\alpha} + \dots + k^{J_L}_{\alpha\dot\alpha}$. One can see that the above choice of $q$’s corresponds to the Lorentz gauge. Thus $A^{ptb}$ from Eqs. (34) is a generating function (in the sense of Eq. (3)) for tree form-factors, or currents introduced in [14], in the Lorentz gauge. Using $g^{ptb}$, Eqs. (33), one easily obtains the prominent Parke–Taylor amplitudes [13]. This is done in [26] and we shall not repeat it here.

⁵ There is a minor subtlety at this point. When one specifies $t_J$, $J = 1, \dots, L$, in Eqs. (32), (33) to be matrices belonging to a gauge Lie algebra, $g_{ptb}$ defined by Eqs. (32), (33) will not necessarily belong to the corresponding gauge group, only to GL(∗) instead. It will however be gauge equivalent (over GL(∗)) to a matrix from the gauge group.

3. N = 3 Supersymmetric Perturbiner; Preliminaries

N = 3 and N = 4 Yang–Mills theories are equivalent, but the N = 3 formalism is more naturally combined with the twistors [1], which is why we adopt the N = 3 notation. The N = 3 super-space is parametrized by commuting coordinates $x^{\alpha\dot\alpha}$ and by the anticommuting ones $\theta^{\alpha j}$, $\bar\theta^{\dot\alpha}_j$. Here $\alpha = 1, 2$; $\dot\alpha = \dot{1}, \dot{2}$ are the Lorentz indices, which can be lowered and raised with the antisymmetric ε-tensors, and $j = 1, 2, 3$ is an isotopic index. Supercharges act in the super-space as
$$Q_{\alpha j} = \frac{\partial}{\partial\theta^{\alpha j}} - \frac{1}{2} \bar\theta^{\dot\alpha}_j \partial_{\alpha\dot\alpha}, \qquad \bar{Q}^j_{\dot\alpha} = \frac{\partial}{\partial\bar\theta^{\dot\alpha}_j} - \frac{1}{2} \theta^{\alpha j} \partial_{\alpha\dot\alpha}. \qquad (35)$$

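Introduce super-covariant derivatives,
$$D_{\alpha j} = \frac{\partial}{\partial\theta^{\alpha j}} + \frac{1}{2} \bar\theta^{\dot\alpha}_j \partial_{\alpha\dot\alpha}, \qquad \bar{D}^j_{\dot\alpha} = \frac{\partial}{\partial\bar\theta^{\dot\alpha}_j} + \frac{1}{2} \theta^{\alpha j} \partial_{\alpha\dot\alpha}. \qquad (36)$$
Introduce also super-connections
$$\nabla_{\alpha j} = D_{\alpha j} + A_{\alpha j}, \qquad \bar\nabla^j_{\dot\alpha} = \bar{D}^j_{\dot\alpha} + \bar{A}^j_{\dot\alpha}, \qquad (37)$$
where $A_{\alpha j}$ and $\bar{A}^j_{\dot\alpha}$ are superfields. The N = 3 supersymmetric Yang–Mills equations can be represented as [1], [34]–[36]
$$\nabla_{\alpha j} \nabla_{\beta l} + \nabla_{\beta l} \nabla_{\alpha j} \big|_{\text{symmetrized in } \alpha,\beta} = 0,$$
$$\bar\nabla^j_{\dot\alpha} \bar\nabla^l_{\dot\beta} + \bar\nabla^l_{\dot\beta} \bar\nabla^j_{\dot\alpha} \big|_{\text{symmetrized in } \dot\alpha,\dot\beta} = 0,$$
$$\nabla_{\alpha j} \bar\nabla^l_{\dot\beta} + \bar\nabla^l_{\dot\beta} \nabla_{\alpha j} = \delta^l_j \nabla_{\alpha\dot\beta} \qquad (38)$$
(only the traceless part of the last equation is, in fact, an equation on $A_{\alpha j}$ and $\bar{A}^j_{\dot\alpha}$; the rest is the definition of the connection $\nabla_{\alpha\dot\alpha} = \partial_{\alpha\dot\alpha} + A_{\alpha\dot\alpha}$). The linearization of Eqs. (38) reads
$$D_{\alpha j} A_{\beta l} + D_{\beta l} A_{\alpha j} \big|_{\text{symmetrized in } \alpha,\beta} = 0,$$
$$\bar{D}^j_{\dot\alpha} \bar{A}^l_{\dot\beta} + \bar{D}^l_{\dot\beta} \bar{A}^j_{\dot\alpha} \big|_{\text{symmetrized in } \dot\alpha,\dot\beta} = 0,$$
$$D_{\alpha j} \bar{A}^l_{\dot\beta} + \bar{D}^l_{\dot\beta} A_{\alpha j} = \delta^l_j A_{\alpha\dot\beta}. \qquad (39)$$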


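As discussed in the Introduction, the N = 4 multiplet splits into two N = 3 multiplets, see tables (5), (6), (7). We first describe plane waves corresponding to the highest helicity states in each multiplet, that is, the positive helicity gluon and the $+\frac{1}{2}$ singlet gluino. These plane waves are
$$A_{\alpha j} = \frac{q_\alpha (\bar\theta_j, \bar{æ})\, t}{(æ, q)}\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha}}, \qquad \bar{A}^j_{\dot\alpha} = 0 \qquad (40)$$
(positive helicity gluon) and
$$A_{\alpha j} = 0, \qquad \bar{A}^j_{\dot\alpha} = \frac{\bar{q}_{\dot\alpha}\, \frac{1}{2} \varepsilon^{jlm} (\bar\theta_l, \bar{æ})(\bar\theta_m, \bar{æ})\, t}{(\bar{æ}, \bar{q})}\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha}} \qquad (41)$$
($+\frac{1}{2}$ singlet gluino). In these equations $y^{\alpha\dot\alpha} = x^{\alpha\dot\alpha} + \frac{1}{2} \theta^{\alpha j} \bar\theta^{\dot\alpha}_j$ is a chiral coordinate, defined so that
$$\bar{D}^l_{\dot\beta}\, y^{\alpha\dot\alpha} = 0, \qquad (42)$$
the bracket $(a, b)$, as before, stands for the contraction of $a$ and $b$ with the ε-tensor, Eq. (16), $\varepsilon^{jlm}$ is the totally antisymmetric tensor in the isotopic space, $k^{\alpha\dot\alpha} = æ^\alpha \bar{æ}^{\dot\alpha}$ is a (light-like) four-momentum, $t$ is a gauge Lie algebra generator, and $q_\alpha$ and $\bar{q}_{\dot\alpha}$ are the reference spinors, defined modulo the gauge equivalence
$$q_\alpha \sim q_\alpha + \mathrm{const} \cdot æ_\alpha, \qquad \bar{q}_{\dot\alpha} \sim \bar{q}_{\dot\alpha} + \mathrm{const} \cdot \bar{æ}_{\dot\alpha}. \qquad (43)$$
One can check by a straightforward computation that the plane waves, Eqs. (40), (41), satisfy the linearized field equations Eqs. (39). One can also check that
$$Q_{\alpha j} A_{\beta l} = 0, \qquad Q_{\alpha j} \bar{A}^l_{\dot\beta} = 0, \qquad (44)$$
and
$$(\bar{Q}^j, \bar{æ}) A_{\beta l} = 0, \qquad (\bar{Q}^j, \bar{æ}) \bar{A}^l_{\dot\beta} = 0 \qquad (45)$$
on both states Eqs. (40) and Eqs. (41). Equations (44) mean that these states are the highest states in the multiplets, while Eqs. (45) mean that half of the supercharges act trivially on the whole multiplets (since these states are massless).

Acting with the $\bar{Q}$ charges on the states Eqs. (40) and Eqs. (41) one can obtain plane waves corresponding to all states in the tables Eqs. (6) and Eqs. (7). We shall, however, do it a bit differently. Instead of plane waves of the type $e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha}}$ we shall use plane waves of the type
$$e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha} + (\theta^j, æ) \chi_j}, \qquad (46)$$
which are eigenstates of the supercharges $Q_{\alpha j}$,
$$Q_{\beta l}\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha} + (\theta^j, æ) \chi_j} = æ_\beta \chi_l\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha} + (\theta^j, æ) \chi_j}, \qquad (47)$$
where $\chi_l$, $l = 1, 2, 3$, are Grassmann variables; they are super-partners of the momentum $k_{\alpha\dot\alpha}$, so they will be referred to as the momentino. Now the multiplets Eqs. (6) and Eqs. (7) can be organized as
$$A_{\alpha j} = \frac{q_\alpha \big((\bar\theta_j, \bar{æ}) + \chi_j\big)\, t}{(æ, q)}\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha} + (\theta^j, æ) \chi_j}, \qquad \bar{A}^j_{\dot\alpha} = 0 \qquad (48)$$
(positive helicity gluon) and
$$A_{\alpha j} = 0, \qquad \bar{A}^j_{\dot\alpha} = \frac{\bar{q}_{\dot\alpha}\, \frac{1}{2} \varepsilon^{jlm} \big((\bar\theta_l, \bar{æ}) + \chi_l\big)\big((\bar\theta_m, \bar{æ}) + \chi_m\big)\, t}{(\bar{æ}, \bar{q})}\, e^{k_{\alpha\dot\alpha} y^{\alpha\dot\alpha} + (\theta^j, æ) \chi_j}. \qquad (49)$$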

Various members of the multiplets Eqs. (6) and Eqs. (7) arise as coefficients in the Taylor expansion of Eqs. (48) and Eqs. (49) in powers of χj . For example, the plane wave solution corresponding to the negative helicity gluon state arises as a coefficient at 3!1 εj lm χj χl χm in the expansion of Eq. (49). 4. Chiral N = 3 Supersymmetric Perturbiner In this section we construct the perturbiner which generates only form-factors with all asymptotic states belonging to the same type of the N = 3 super-multiplets, say, the one from table (6) (no kinematical restrictions are assumed, that is, the set of momenta is arbitrary). According to the definition of the perturbiner (see the Introduction), one first picks up a solution of the linearized field equations (39) in the form of a superposition of plane waves of the type of Eq. (48), (1)

Aαj =

J J L q J ((θ¯ , æ X j ¯ ) + χj )t α J =1

(æJ , q J )

j SbJ , A¯ α˙ = 0,

(50)

where SbJ = tJ SJ , SJ = aJ ekαα˙ y

α α˙ +(θ j ,æJ )χ J j

,

(51)

$a_J$ is a commuting nilpotent (see Eq. (4)) symbol of the annihilation/creation operator of the $J$th particle, $\chi^{J}_j$ stands for the momentino of the $J$th particle (see Eq. (47)), and the other notations are as in Eqs. (40), (41). Then one looks for a solution of the field equations (38) which is polynomial in the super-harmonics $S_J$, $J=1,\dots,L$, Eq. (51), and whose first order term is as in Eq. (50). To describe this solution, introduce again an auxiliary $CP^1$ space with homogeneous coordinates $\rho^\alpha$, $\alpha=1,2$. Introduce, also, $D_j$, $\nabla_j$ and $A_j$ as

$$D_j=\rho^\alpha D_{\alpha j},\qquad \nabla_j=\rho^\alpha\nabla_{\alpha j},\qquad A_j=\rho^\alpha A_{\alpha j}. \tag{52}$$

Then, as in the self-dual case (Sect. 2), the first equation of Eqs. (38) is represented as a zero-curvature condition by contracting its Lorentz indices with $\rho$'s. Hence, the first equation is (locally) solved as

$$A_j=g^{-1}D_j\,g, \tag{53}$$


where the superfield $g$ on the r.h.s. is (similarly to Eq. (24)) a group-valued rational homogeneous function of $\rho^\alpha$, $\alpha=1,2$, of degree 0, such that the superfield $A_j$ is a regular homogeneous function of $\rho^\alpha$, $\alpha=1,2$, of degree 1. The rest of the equations of Eqs. (38) are solved provided

$$\bar D^{j}_{\dot\alpha}\,g=0. \tag{54}$$
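That the pure-gauge form of Eq. (53) solves the zero-curvature condition can be verified in two lines. The following is only a sketch under assumptions that this section does not spell out: the convention $\nabla_j=D_j+A_j$, a Grassmann-even $g$, and flat derivatives obeying $D_jD_l+D_lD_j=0$ after contraction with $\rho$; $\phi$ is an arbitrary test superfield introduced here for illustration.

```latex
% Assumptions (illustrative, not taken from the text):
%   \nabla_j = D_j + A_j,  g Grassmann-even,  D_j D_l + D_l D_j = 0.
\begin{align*}
\nabla_j\phi
  &= D_j\phi + (g^{-1}D_j g)\,\phi
   = g^{-1}D_j(g\,\phi), \\
(\nabla_j\nabla_l+\nabla_l\nabla_j)\,\phi
  &= g^{-1}\,(D_jD_l+D_lD_j)\,(g\,\phi) = 0 .
\end{align*}
```

Under these assumptions the covariant derivatives are the flat ones conjugated by $g$, so they inherit the anticommutation relation of the $D_j$.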

According to the rules of the game, $g^{sptb}$ is sought as a polynomial in the super-harmonics $S_J$, $J=1,\dots,L$, Eq. (51), the first order term of the polynomial being defined by the one in $A_{\alpha j}$, Eq. (50), via Eq. (53). All steps in constructing such $g^{sptb}$ are parallel to the ones in Sect. 2. Moreover, amusingly, the resulting $g^{sptb}$ is given by the same formulae as $g^{ptb}$, Eqs. (32), (33), with the substitution

$$\hat E\to\hat S. \tag{55}$$

(Equation (54) is satisfied because $\bar D^{j}_{\dot\alpha}\hat S=0$.) Clearly, $g^{sptb}$ has the same singularities in the auxiliary $CP^1$ space as the nonsupersymmetric self-dual perturbiner $g^{ptb}$, Eqs. (32), (33), namely, it has simple poles at $\rho^\alpha=æ^{\alpha}_J$, $J=1,\dots,L$, where $æ^{\alpha}_J$ is the spinor appearing in the decomposition of the momentum $k^{J}_{\alpha\dot\alpha}$ of the $J$th particle (see Eq. (14)).

Finally, the generating functions for form-factors of the superfields $A_{\alpha j}$, $\bar A^{j}_{\dot\alpha}$ are obtained from Eqs. (53) (the computation is parallel to the one in Sect. 2),

$$A^{sptb}_{\alpha j}=\sum_{J=1}^{L}A^{J}_{\alpha j}\hat E_J+\sum_{J_1,J_2}A^{J_1J_2}_{\alpha j}\hat E_{J_1J_2}+\dots;\qquad \bar A^{sptb\,j}_{\dot\alpha}=0,$$

$$A^{J_1\dots J_M}_{\alpha j}=-\frac{q_\alpha q^\beta\bigl([æ_\beta\chi_j]^{J_1\dots J_M}+\bar\theta^{\dot\alpha}_j k^{J_1\dots J_M}_{\beta\dot\alpha}\bigr)}{(æ^{J_1},q)(æ^{J_M},q)}\cdot\frac{1}{(æ^{J_1},æ^{J_2})(æ^{J_2},æ^{J_3})\dots(æ^{J_{M-1}},æ^{J_M})}, \tag{56}$$

where $[æ_\beta\chi_j]^{J_1\dots J_M}=æ^{J_1}_\beta\chi^{J_1}_j+\dots+æ^{J_M}_\beta\chi^{J_M}_j$.

5. Generic (Nonchiral) N = 3 Perturbiner

The nonchiral N = 3 supersymmetric perturbiner is a generating function for all tree form-factors in the N = 3 (N = 4) supersymmetric Yang–Mills theory. According to the rules of the game, the solution of the linearized field equations must now include both types of harmonics, Eqs. (48), (49),

$$A^{(1)}_{\alpha j}=\sum_{J=1}^{L}\frac{q^{J}_\alpha\bigl((\bar\theta_j,\bar{æ}^{J})+\chi^{J}_j\bigr)\,t}{(æ^{J},q^{J})}\,\hat S_J,$$

$$\bar A^{(1)\,j}_{\dot\alpha}=\sum_{J=1}^{L}b^{J}\,\frac{\bar q^{J}_{\dot\alpha}\,\tfrac12\,\varepsilon^{jlm}\bigl((\bar\theta_l,\bar{æ}^{J})+\chi^{J}_l\bigr)\bigl((\bar\theta_m,\bar{æ}^{J})+\chi^{J}_m\bigr)\,t}{(\bar{æ}^{J},\bar q^{J})}\,\hat S_J, \tag{57}$$

where $b^{J}$ is an anticommuting nilpotent symbol; all other notations are the same as in Eq. (50).


The nonchiral perturbiner is a solution of the field equations, Eqs. (38), polynomial in the harmonics $S_J$, $J=1,\dots,L$, Eq. (51), the first order term of the polynomial being just $A^{(1)}$ as in Eq. (57). We again use a twistor construction [1] to describe solutions of Eqs. (38). To this end, introduce two couples of complex numbers, $\rho^\alpha$, $\alpha=1,2$, and $\bar\rho^{\dot\alpha}$, $\dot\alpha=\dot1,\dot2$, which can be viewed as homogeneous coordinates on the $CP^1\times CP^1$ space. Contracting all Lorentz indices in Eqs. (38) with these $\rho$'s and $\bar\rho$'s, one again obtains a sort of zero-curvature condition

$$\nabla_j\nabla_l+\nabla_l\nabla_j=0,\qquad \bar\nabla^{j}\bar\nabla^{l}+\bar\nabla^{l}\bar\nabla^{j}=0,\qquad \nabla_j\bar\nabla^{l}+\bar\nabla^{l}\nabla_j=\delta^{l}_{j}\,\nabla, \tag{58}$$

where

$$\nabla_j=\rho^\alpha\nabla_{\alpha j},\qquad \bar\nabla^{j}=\bar\rho^{\dot\alpha}\bar\nabla^{j}_{\dot\alpha},\qquad \nabla=\rho^\alpha\bar\rho^{\dot\alpha}\nabla_{\alpha\dot\alpha}. \tag{59}$$

Clearly, Eqs. (38) and Eqs. (58) are equivalent provided Eqs. (58) are solved identically in $\rho$, $\bar\rho$. Equations (58) are locally solved as

$$A_j=g^{-1}D_j\,g,\qquad \bar A^{j}=g^{-1}\bar D^{j}g, \tag{60}$$

where

$$A_j=\rho^\alpha A_{\alpha j},\qquad D_j=\rho^\alpha D_{\alpha j},\qquad \bar A^{j}=\bar\rho^{\dot\alpha}\bar A^{j}_{\dot\alpha},\qquad \bar D^{j}=\bar\rho^{\dot\alpha}\bar D^{j}_{\dot\alpha}, \tag{61}$$

and the $\rho$, $\bar\rho$ dependence of $g$ must be such that $A_j$, $\bar A^{j}$ are just linear in $\rho$, $\bar\rho$, as in Eqs. (61). More definitely, the superfield $g$ is a meromorphic homogeneous function of $\rho$, $\bar\rho$ of degree (0, 0) such that the superfield $A_j$, Eqs. (61), is a holomorphic homogeneous function of $\rho$, $\bar\rho$ of degree (1, 0) and the superfield $\bar A^{j}$, Eqs. (61), is a holomorphic homogeneous function of $\rho$, $\bar\rho$ of degree (0, 1). Again, in the case of the perturbiner, $g_{ncsptb}$ must be polynomial in the super-harmonics $S_J$, $J=1,\dots,L$, Eq. (51). First order terms of this polynomial,

$$g^{(1)}_{ncsptb}(\rho,\bar\rho,\{S\})=\sum_{J=1}^{L}g^{ncsptb}_{J}(\rho,\bar\rho)\,\hat S_J, \tag{62}$$

are fixed by those in the $A$'s, Eqs. (57), via Eqs. (60) (analogously to Eqs. (25), (26)). Expanding Eq. (60) up to first order in the harmonic $S_J$ and using Eqs. (57), (62) one obtains

$$g^{ncsptb}_{J}(\rho,\bar\rho)=\frac{(\rho,q^{J})}{(\rho,æ^{J})(æ^{J},q^{J})}+b^{J}\,\frac{(\bar\rho,\bar q^{J})}{(\bar\rho,\bar{æ}^{J})(\bar{æ}^{J},\bar q^{J})}\cdot\frac{1}{3!}\,\varepsilon^{jlm}\bigl((\bar\theta_j,\bar{æ}^{J})+\chi^{J}_j\bigr)\bigl((\bar\theta_l,\bar{æ}^{J})+\chi^{J}_l\bigr)\bigl((\bar\theta_m,\bar{æ}^{J})+\chi^{J}_m\bigr). \tag{63}$$


Thus, according to Eqs. (62), (63), the first order terms in $g_{ncsptb}$ have simple poles at the surfaces

$$\rho^\alpha=æ^{J\alpha},\qquad J=1,\dots,L \tag{64}$$

and at the surfaces

$$\bar\rho^{\dot\alpha}=\bar{æ}^{J\dot\alpha},\qquad J=1,\dots,L \tag{65}$$

in the $CP^1\times CP^1$ space parametrized by $\rho$, $\bar\rho$. An analysis shows that for $A$, $\bar A$ from Eqs. (60) to be regular, the higher order terms in the polynomial $g_{ncsptb}$ may have only simple poles at the surfaces, Eqs. (64), (65), and also at the surfaces $C_M$ in $CP^1\times CP^1$,

$$C_M:\ \{\rho^\alpha\bar\rho^{\dot\alpha}k^{M}_{\alpha\dot\alpha}=0\}, \tag{66}$$

where $k^{M}_{\alpha\dot\alpha}=\sum_{J\in M}k^{J}_{\alpha\dot\alpha}$, and $M$ is a subset of the set $J=1,\dots,L$. That is, the total momentum of any subset of the asymptotic states included in the perturbiner defines a surface in $CP^1\times CP^1$ at which $g_{ncsptb}$ has a simple pole. Notice that due to the non-resonantness condition (see the Introduction) the surfaces $C_M$, Eq. (66), with $M$ including more than one element never reduce to the ones of the type of Eqs. (64), (65). Furthermore, the regularity of $A$, $\bar A$ from Eqs. (60) dictates a condition on the residues of $g_{ncsptb}$ at the surfaces $C_M$, namely

$$D_j\bigl(\mathrm{res}|_{C_M}\,g_{ncsptb}\cdot g^{-1}_{ncsptb}\bigr)\big|_{C_M}=0,\qquad \bar D^{j}\bigl(\mathrm{res}|_{C_M}\,g_{ncsptb}\cdot g^{-1}_{ncsptb}\bigr)\big|_{C_M}=0, \tag{67}$$

where $D_j$, $\bar D^{j}$ are as in Eqs. (61) and the notation $|_{C_M}$ in Eqs. (67) indicates restriction to the surface $C_M$, Eq. (66). It is again convenient to use the expansion of $g_{ncsptb}$ in the color ordered monomials (see Eq. (32) and explanations about it),

$$g_{ncsptb}(\rho,\bar\rho,\{S\})=1+\sum_{J}g^{ncsptb}_{J}(\rho,\bar\rho)\,\hat S_J+\sum_{J_1,J_2}g^{ncsptb}_{J_1,J_2}(\rho,\bar\rho)\,\hat S_{J_1}\hat S_{J_2}+\dots. \tag{68}$$

The singularity structure of $g_{ncsptb}$, Eqs. (66), (67), dictates the following form of the coefficients $g^{ncsptb}_{J_1,J_2,\dots,J_M}(\rho,\bar\rho)$,

$$g^{ncsptb}_{J_1,J_2,\dots,J_M}(\rho,\bar\rho)=\frac{P^{J_1J_2\dots J_M}(\rho,\bar\rho,\bar\theta)}{(\rho^\alpha\bar\rho^{\dot\alpha}k^{J_1}_{\alpha\dot\alpha})(\rho^\alpha\bar\rho^{\dot\alpha}k^{J_1J_2}_{\alpha\dot\alpha})\dots(\rho^\alpha\bar\rho^{\dot\alpha}k^{J_1J_2\dots J_M}_{\alpha\dot\alpha})}, \tag{69}$$

where, we recall, $k^{J_1J_2\dots J_M}_{\alpha\dot\alpha}=k^{J_1}_{\alpha\dot\alpha}+k^{J_2}_{\alpha\dot\alpha}+\dots+k^{J_M}_{\alpha\dot\alpha}$, and $P^{J_1J_2\dots J_M}(\rho,\bar\rho,\bar\theta)$ is a polynomial in $\rho$, $\bar\rho$ of degree $(M,M)$ (so that $g^{ncsptb}_{J_1,J_2,\dots,J_M}(\rho,\bar\rho)$ is a rational function of $\rho$, $\bar\rho$ of degree (0, 0)). We have explicitly indicated only the $\rho$, $\bar\rho$ and $\bar\theta$ dependence of these polynomials. They, of course, depend on the quantum numbers of the $J_1$th, $J_2$th, ..., $J_M$th particles, such as momenta, momentinos, etc.

Now, the problem of constructing $g_{ncsptb}$, and thus the problem of computing all tree form-factors in (supersymmetric) Yang–Mills theory, will be formulated as a problem of constructing a set of polynomials $P^{J_1J_2\dots J_M}(\rho,\bar\rho,\bar\theta)$, $M=1,\dots,L$. From


Eqs. (67), (68) and (69), crucially using the nilpotency, $S^2=0$, and the linear independence of the various color ordered products of $S$'s, one obtains

$$\rho^\alpha\bigl([æ_\alpha\chi_j]^{J_1\dots J_M}+\bar\theta^{\dot\alpha}_j k^{J_1J_2\dots J_M}_{\alpha\dot\alpha}\bigr)\,P^{J_1J_2\dots J_M}\big|_{C_{J_1J_2\dots J_M}}=0,\qquad \Bigl(\bar\rho^{\dot\alpha}\frac{\partial}{\partial\bar\theta^{\dot\alpha}_j}\Bigr)P^{J_1J_2\dots J_M}\big|_{C_{J_1J_2\dots J_M}}=0, \tag{70}$$

$$P^{J_1J_2\dots J_M}\big|_{C_{J_1J_2\dots J_I}}=P^{J_1J_2\dots J_I}\big|_{C_{J_1J_2\dots J_I}}\;P^{J_{I+1}\dots J_M}\big|_{C_{J_1J_2\dots J_I}},\qquad I<M, \tag{71}$$

and from Eqs. (63) one sees that

$$P^{J}=\frac{(\rho,q^{J})(\bar\rho,\bar{æ}^{J})}{(æ^{J},q^{J})}+b^{J}\,\frac{(\bar\rho,\bar q^{J})(\rho,æ^{J})}{(\bar{æ}^{J},\bar q^{J})}\cdot\frac{1}{3!}\,\varepsilon^{jlm}\bigl((\bar\theta_j,\bar{æ}^{J})+\chi^{J}_j\bigr)\bigl((\bar\theta_l,\bar{æ}^{J})+\chi^{J}_l\bigr)\bigl((\bar\theta_m,\bar{æ}^{J})+\chi^{J}_m\bigr). \tag{72}$$

We recall that $P^{J_1J_2\dots J_M}(\rho,\bar\rho,\bar\theta)$, $M=1,\dots,L$, are homogeneous polynomials on the $CP^1\times CP^1$ space, parametrized by $\rho^\alpha$, $\alpha=1,2$, $\bar\rho^{\dot\alpha}$, $\dot\alpha=\dot1,\dot2$, of degree $(M,M)$. $k^{J_1J_2\dots J_L}_{\alpha\dot\alpha}=k^{J_1}_{\alpha\dot\alpha}+k^{J_2}_{\alpha\dot\alpha}+\dots+k^{J_L}_{\alpha\dot\alpha}$ ($k^{J}_{\alpha\dot\alpha}=æ^{J}_\alpha\bar{æ}^{J}_{\dot\alpha}$ is the momentum of the $J$th particle), $[æ_\beta\chi_j]^{J_1\dots J_M}=æ^{J_1}_\beta\chi^{J_1}_j+æ^{J_2}_\beta\chi^{J_2}_j+\dots+æ^{J_M}_\beta\chi^{J_M}_j$. The notation $P|_C$ means that a polynomial $P$ is restricted to a surface $C$. $C_{J_1J_2\dots J_M}$ are surfaces in $CP^1\times CP^1$ defined by Eqs. (66). Due to the non-resonantness condition (that a sum of the type $k^{J_1}_{\alpha\dot\alpha}+k^{J_2}_{\alpha\dot\alpha}+\dots+k^{J_L}_{\alpha\dot\alpha}$ is never light-like when there is more than one term) only the surfaces $C_J$ are reducible, as in Eqs. (64), (65). Notice that the surfaces $C_{J_1J_2\dots J_I}$ and $C_{J_{I+1}\dots J_L}$, $I<L$, intersect at two points in $CP^1\times CP^1$, defined by the system

$$\rho^\alpha\bar\rho^{\dot\alpha}k^{J_1J_2\dots J_I}_{\alpha\dot\alpha}=0,\qquad \rho^\alpha\bar\rho^{\dot\alpha}k^{J_{I+1}\dots J_L}_{\alpha\dot\alpha}=0, \tag{73}$$

and at every one of these intersection points three surfaces meet: $C_{J_1J_2\dots J_I}$, $C_{J_{I+1}\dots J_L}$ and $C_{J_1J_2\dots J_L}$. The bracket of the type $(a,b)$ has been defined in Eq. (16). $\theta^{\alpha j}$, $\bar\theta^{\dot\alpha}_j$, $\alpha=1,2$; $\dot\alpha=\dot1,\dot2$, $j=1,2,3$, are Grassmann variables. $\chi^{J}_j$, the momentino of the $J$th particle, has been introduced in Eqs. (46), (47). $q_\alpha$ and $\bar q_{\dot\alpha}$ are the reference spinors, which are defined modulo the gauge equivalence $q_\alpha\sim q_\alpha+\mathrm{const}\cdot æ_\alpha$, $\bar q_{\dot\alpha}\sim\bar q_{\dot\alpha}+\mathrm{const}\cdot\bar{æ}_{\dot\alpha}$ (one may use this freedom in solving the problem, Eqs. (70), (71), (72)).

We shall now explain how this problem, Eqs. (70), (71), (72), can be solved iteratively. Namely, we explain how to find the polynomial $P^{J_1J_2\dots J_L}$, provided all polynomials $P^{J_1J_2\dots J_I}$, $I<L$, are known (in this sense Eq. (72) is the first step of the iteration). Consider, first, the polynomials restricted to “their own” surfaces, that is, introduce

$$\tilde P^{J_1J_2\dots J_L}=P^{J_1J_2\dots J_L}\big|_{C_{J_1J_2\dots J_L}}. \tag{74}$$

$\tilde P^{J_1J_2\dots J_L}$ is a homogeneous polynomial on $C_{J_1J_2\dots J_L}$ of degree $2L$, and hence it has $2L+1$ degrees of freedom.
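The counting of degrees of freedom used here and in the matching argument around Eqs. (75), (76) can be laid out explicitly. The sketch below rests on one assumption that is not stated in the text: for a non-light-like total momentum $k=k^{J_1\dots J_L}$ the constraint of Eq. (66) can be solved as $\bar\rho^{\dot\alpha}\propto\varepsilon^{\dot\alpha\dot\beta}\rho^\alpha k_{\alpha\dot\beta}$, so that $C_{J_1\dots J_L}$ is a copy of $CP^1$ with homogeneous coordinate $\rho$ and $\bar\rho$ is linear in $\rho$.

```latex
% Assumed parametrization of C (illustrative): \bar\rho linear in \rho, valid for \det k \neq 0.
\begin{align*}
\deg_{(\rho,\bar\rho)}P^{J_1\dots J_L}=(L,L)
  &\;\Longrightarrow\;
  \deg_{\rho}\tilde P^{J_1\dots J_L}=L+L=2L, \\
\#\{\text{coefficients of a binary form of degree }2L\}
  &= 2L+1, \\
\deg_{\rho}R^{J_1\dots J_L}\big|_{C}=2L-3
  &\;\Longrightarrow\;
  \#\{\text{coefficients}\}=2L-2, \\
\#\{\text{intersection points, Eq. (73)}\}
  &= 2\cdot\#\{I:\,1\le I\le L-1\}=2(L-1).
\end{align*}
```

The last two lines agree, which is the match between unknowns and conditions that the iterative procedure relies on.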


Equations (71) on the restricted polynomials reduce to

$$\tilde P^{J_1J_2\dots J_L}\big|_{Z_i(C_{J_1J_2\dots J_I}\cap C_{J_{I+1}\dots J_L})}=\tilde P^{J_1J_2\dots J_I}\big|_{Z_i(C_{J_1J_2\dots J_I}\cap C_{J_{I+1}\dots J_L})}\;\tilde P^{J_{I+1}\dots J_L}\big|_{Z_i(C_{J_1J_2\dots J_I}\cap C_{J_{I+1}\dots J_L})},\qquad I<L, \tag{75}$$

where $Z_i(C_{J_1J_2\dots J_I}\cap C_{J_{I+1}\dots J_L})$, $i=1,2$, stand for the intersection points of the surfaces $C_{J_1J_2\dots J_I}$ and $C_{J_{I+1}\dots J_L}$, Eq. (73). Thus Eq. (75) defines the polynomial $\tilde P^{J_1J_2\dots J_L}$ at these $2(L-1)$ intersection points. Equations (70) tell us that the polynomial $\tilde P^{J_1J_2\dots J_L}$ must be of the form

$$\tilde P^{J_1J_2\dots J_L}(\rho,\bar\rho,\bar\theta)=\frac{1}{3!}\,\varepsilon^{jlm}\,\rho^\alpha\bigl([æ_\alpha\chi_j]^{J_1\dots J_L}+\bar\theta^{\dot\alpha}_j k^{J_1J_2\dots J_L}_{\alpha\dot\alpha}\bigr)\,\rho^\alpha\bigl([æ_\alpha\chi_l]^{J_1\dots J_L}+\bar\theta^{\dot\alpha}_l k^{J_1J_2\dots J_L}_{\alpha\dot\alpha}\bigr)\,\rho^\alpha\bigl([æ_\alpha\chi_m]^{J_1\dots J_L}+\bar\theta^{\dot\alpha}_m k^{J_1J_2\dots J_L}_{\alpha\dot\alpha}\bigr)\,R^{J_1J_2\dots J_L}(\rho,\bar\rho)\big|_{C_{J_1J_2\dots J_L}}, \tag{76}$$

where $R^{J_1J_2\dots J_L}(\rho,\bar\rho)|_{C_{J_1J_2\dots J_L}}$ is a polynomial on the surface $C_{J_1J_2\dots J_L}$ of degree $2L-3$. Thus, the number of degrees of freedom in the polynomial $R^{J_1J_2\dots J_L}(\rho,\bar\rho)|_{C_{J_1J_2\dots J_L}}$, and so in the polynomial $\tilde P^{J_1J_2\dots J_L}$, is $2L-2$, that is, precisely the number of points at which the polynomial is defined according to Eqs. (75)! (Equations (76) and (75) are compatible due to the nilpotency of $\theta^{\alpha j}$, $\bar\theta^{\dot\alpha}_j$ and $\chi^{J}_j$.)

Once the restriction of the polynomial $P^{J_1J_2\dots J_L}$ to the surface $C_{J_1J_2\dots J_L}$ is found and the restrictions of the polynomial $P^{J_1J_2\dots J_L}$ to the surfaces $C_{J_1J_2\dots J_M}$, $M<L$, are known due to Eqs. (71), the polynomial $P^{J_1J_2\dots J_L}$ is fixed modulo a polynomial which is zero at the surfaces $C_{J_1J_2\dots J_M}$, $M=1,\dots,L$, that is, precisely modulo a polynomial in the denominator of $g^{ncsptb}_{J_1,J_2,\dots,J_L}$, Eq. (69), that is, modulo the gauge freedom.

The described iterative procedure can, in principle, be used as an alternative to the usual perturbation theory, and it might even be the more economical alternative, but we shall not try here to prove its efficiency. Instead we would like to express our hope that Eqs. (70), (71), (72) can be, in some sense, solved completely. By the way, Eqs. (71), (72) allow such a complete solution up to a freedom which is to be fixed by Eqs. (70) (or Eqs. (76)), namely

$$P^{J_1J_2\dots J_L}=\det\begin{pmatrix}
P^{J_1} & Q^{J_1J_2} & Q^{J_1J_2J_3} & \dots & Q^{J_1\dots J_L}\\
\rho^\alpha\bar\rho^{\dot\alpha}k^{J_1}_{\alpha\dot\alpha} & P^{J_2} & Q^{J_2J_3} & \dots & Q^{J_2\dots J_L}\\
0 & \rho^\alpha\bar\rho^{\dot\alpha}k^{J_1J_2}_{\alpha\dot\alpha} & P^{J_3} & \dots & Q^{J_3\dots J_L}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \dots & P^{J_L}
\end{pmatrix}.$$

Each entry of the $L\times L$ matrix in the above equation is a homogeneous polynomial on the $CP^1\times CP^1$ space of degree (1, 1). The $Q$'s in the upper triangular part of the matrix represent the freedom which is to be fixed by Eqs. (70) (or Eqs. (76)). Unfortunately, we have not been able to implement these equations to fix this freedom.

Acknowledgements. I benefitted a lot from discussions with A. Rosly to whom I am very much obliged. This work was supported by INTAS grant 97-0103.


References

1. Witten, E.: Phys. Lett. B 77, 394 (1978)
2. Isenberg, J., Yasskin, P.B., Green, P.S.: Phys. Lett. B 78, 462–464 (1978)
3. Mangano, M.L., Parke, S.: Phys. Rep. 200, 301 (1991)
4. Caravaglios, F., Mangano, M.L., Moretti, M., Pittau, R.: CERN-TH-98-249, hep-ph/9807570
5. Draggiotis, P., Kleiss, R., Papadopoulos, C.: CERN-TH-98-207, hep-ph/9807207
6. Chalmers, G., Siegel, W.: ITP-SB-98-05, hep-ph/9801220; ITP-SB-97-45, hep-ph/9708251
7. Kosower, D.: Phys. Rev. D 57, 5410–5416 (1998), hep-ph/9710213
8. Del Duca, V.: Phys. Rev. D 52, 1527–1534 (1995), hep-ph/9503340
9. Bern, Z., Dixon, L., Dunbar, D., Kosower, D.: Nucl. Phys. B 435, 59–101 (1995), hep-ph/9409265
10. Dunbar, D., Norridge, P.: Nucl. Phys. B 433, 181–208 (1995), hep-th/9408014
11. Kim, Chanju, Nair, V.P.: Phys. Rev. D 55, 3851–3858 (1997), hep-th/9608156
12. Kosower, D., Lee, Bum-Hoon, Nair, V.P.: Phys. Lett. B 201, 85 (1988)
13. Parke, S., Taylor, T.: Phys. Rev. Lett. 56, 2459 (1986)
14. Berends, F., Giele, W.: Nucl. Phys. B 306, 759 (1988)
15. Voloshin, M.B.: Phys. Rev. D 47, 357 (1993); 47, 2573 (1993); 47, 1712 (1993)
16. Smith, B.: Phys. Rev. D 47, 3518 (1993); 49, 1081 (1994)
17. Argyres, E., Kleiss, R., Papadopulos, C.: Phys. Lett. B 302, 70 (1993); Phys. Lett. B 319, 544 (1993)
18. Brown, L.S.: Phys. Rev. D 46, 4125 (1992)
19. Cornwall, J.: Phys. Lett. B 243, 271 (1990)
20. Goldberg, H.: Phys. Lett. B 246, 445 (1990)
21. Aoyama, H., Goldberg, H.: Phys. Lett. B 188, 506 (1987)
22. Bezrukov, F., et al.: Mod. Phys. Lett. A 10, 2135–2141 (1996); hep-ph/9512342
23. Libanov, M.V., Rubakov, V.A., Troitsky, S.V.: Phys. Lett. B 318, 134 (1993)
24. Bardeen, W.: Prog. Theor. Phys. Suppl. 123, 1–8 (1996)
25. Selivanov, K.G.: Preprint ITEP-21-96, hep-ph/9604206
26. Rosly, A.A., Selivanov, K.G.: Phys. Lett. B 399, 135–140 (1997); hep-th/9611101
27. Rosly, A.A., Selivanov, K.G.: Preprint ITEP-TH-56-97, hep-th/9710196
28. Selivanov, K.G.: Phys. Lett. B 420, 274 (1998), hep-th/9710197
29. Rosly, A.A., Selivanov, K.G.: Phys. Lett. B 426, 334–338 (1998), hep-th/9801044
30. Slavnov, A.A., Faddeev, L.D.: Introduction to the Theory of Quantum Gauge Fields. Moscow: Nauka, 1978
31. Itzykson, C., Zuber, J.B.: Quantum Field Theory. New York: McGraw-Hill, 1980
32. Korepin, V., Oota, T.: J. Phys. A 29, L625–L628 (1996), hep-th/9608064
33. Ward, R.S.: Phys. Lett. A 61, 81 (1977)
34. Sohnius, M.: Nucl. Phys. B 133, 275 (1978)
35. Harnad, J., Hurtubise, J., Legare, M., Shnider, S.: Nucl. Phys. B 256, 609 (1985)
36. Harnad, J., Shnider, S.: Commun. Math. Phys. 106, 183 (1986)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 208, 689 – 712 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Chiral Observables and Modular Invariants

K.-H. Rehren

Institut für Theoretische Physik, Universität Göttingen, Bunsenstrasse 9, 37073 Göttingen, Germany. E-mail: [email protected]

Received: 19 April 1999 / Accepted: 20 July 1999

Abstract: Various definitions of chiral observables in a given Möbius covariant two-dimensional (2D) theory are shown to be equivalent. Their representation theory in the vacuum Hilbert space of the 2D theory is studied. It shares the general characteristics of modular invariant partition functions, although SL(2, Z) transformation properties are not assumed. First steps towards a classification are made.

1. Introduction

The program of classification of modular invariant partition functions in 2D conformal quantum field theory (see below for more details) has seen steady progress since the original A-D-E classification for SU(2) theories [3]. Apart from explicit classifications for new models [8], classification theorems have been established for the general case [22,1]. Yet, the feeling persists that the full depth of the problem has not yet been sounded.

It is the intention of the present note to show that general classification theorems of a very similar nature can be derived in a setting which does not refer to modular transformations of Gibbs states at all. Our statements are on the decomposition (described by a “coupling matrix”) of the vacuum representation of a conformal 2D quantum field theory upon restriction to its chiral observables. They can be considered with a different perspective as statements on the possible 2D extensions of given left and right chiral algebras. Our mathematical tool is the structure theory of subfactors applied to the inclusion of local algebras of chiral observables into local algebras of 2D observables.

Note that a modular invariant partition function is also described by a coupling matrix which is usually also interpreted as a chiral decomposition of a 2D vacuum representation. But the classification method based on arithmetic properties of the representation matrices S and T of the SL(2, Z) generators is entirely different and does not rely on this interpretation. In fact, there seem to be exotic (accidental?) modular invariants which do not derive from a 2D theory (cf. [1, Part III]).


In contrast to the modular invariants program, we make only rather general structural assumptions on the theory under consideration. We put the emphasis on the local structure [12], rather than the accidental Lie algebra structure of chiral observables. Thus we avoid the somewhat artificial restriction to chiral algebras which are related to affine Lie algebras because these are the only ones for which Gibbs functionals $\mathrm{Tr}_\pi\,e^{-\beta L_0}$ (“characters”) are known [13]. Likewise, the problem that for most W-algebras it is not clear on which suitable set of “zero mode quantum numbers” for chiral Gibbs functionals the modular group should act, does not pose itself in our approach. Furthermore, we do not assume that the left and right chiral observables are isomorphic, nor that they have isomorphic fusion of their superselection sectors. Instead, we shall derive that the maximal (see below) chiral observables automatically possess sectors with identical fusion rules.

To be sure, it is not our intention to depreciate the modular point of view at all. On the contrary, the SL(2, Z) symmetry between high and low temperature Gibbs states is one of the most fascinating features of chiral models which calls for a sound physical understanding. Indeed, there are general arguments, with reasonable assumptions, in favour of a modular transformation law which generalizes the one for affine Lie algebras [13] as conjectured in [32]. E.g., Cardy [4] argues with transfer matrix methods and invariance under global resummations in lattice models before the continuum limit is taken, and Nahm [23] exploits the operator product algebra of the Schwinger functions to show that Gibbs states transform into Gibbs states. None of these, however, provides a completely satisfactory explanation in terms of the real time local quantum field theory.
On the other hand, the modular group SL(2, Z) plays a fundamental role even without any Gibbs functionals to act on by a modular transformation of the temperature. Namely, the general theory of superselection sectors collects monodromy data of braid group statistics in numerical matrices $S^{\rm stat}$ and $T^{\rm stat}$, and as a “maximality” feature of braid group statistics, these matrices represent the modular group [27,6,21]. In models where both concepts are defined, one has $S=S^{\rm stat}$ and $T=T^{\rm stat}$. E.g., the Kac–Peterson modular matrices [13] for affine Lie algebras can be computed from the statistics of the representations with positive energy of associated local current algebras. Furthermore, the matrix entries of $S^{\rm stat}$ were found in [5, part II] to describe the spectrum of the central observables naturally associated with the nontrivial topology of the space $S^1$. These discoveries are general structure theorems from local quantum field theory and never refer to Gibbs functionals (and hardly to conformal invariance). They show also, however, that a degeneracy ($S^{\rm stat}$ being not invertible) can, and in higher dimensions must, occur which obstructs the existence of an SL(2, Z) representation. (Algebraic conditions for non-degeneracy are given in [15].) Thus, even if SL(2, Z) does not act on chiral characters, it is likely to be around, with various caveats as in the discussion above, as a consequence of fundamentals of local quantum field theory, and an interpretation in terms of Gibbs functionals would be highly desirable. This issue will not be addressed here.

In the classification program for modular invariant 2D partition functions, it is assumed that certain chiral observables $A_L\simeq A_R$ are a priori given along with a collection of representations (sectors) described by their chiral characters (Gibbs functionals for the conformal Hamiltonian $L_0$ and suitable other quantum numbers such as Cartan charges for current algebras).
These characters transform linearly under the group SL(2, Z) which is essentially generated by the imaginary unit shift ($T$) and the inversion ($S$) of the inverse temperature parameter $\beta/2\pi$. One then looks for bilinear combinations of chiral characters with positive integer coefficients $Z_{l,r}$ (the coupling matrix) which are


invariant under the simultaneous SL(2, Z) transformations for both chiral factors (that is, $Z$ commutes with $S$ and $T$). The resulting modular invariant partition functions are considered as Gibbs functionals for two-dimensional energy and momentum operators in a representation of a 2D conformally invariant quantum field theory. The latter contains the chiral observables along with additional local 2D fields which are nonlocal in each light-cone coordinate separately. In this interpretation, the entries of the coupling matrix $Z$ clearly are the multiplicities of the sectors of the chiral algebras within the representation space of the 2D theory. E.g., one usually imposes the constraint $Z_{0,0}=1$ on the coupling matrix which ensures this representation to contain a unique vacuum vector.

One of the most important general classification statements [22] asserts that every solution can be turned into a permutation matrix induced by an “automorphism of the fusion rules” with respect to some “suitably extended algebra of chiral observables” $A_L^{\rm ext}\simeq A_R^{\rm ext}$. Furthermore, it was found [1] that the non-vanishing diagonal entries of the coupling matrix $Z$ (with respect to the initially given chiral observables) can be characterized in terms of structure data which refer to the chiral extension $A\subset A^{\rm ext}$ only. In the case of SU(2), these two statements yield the A-D-E classification of [3].

In this article, we pursue a somewhat opposite program. We assume a local 2D conformally invariant quantum field theory, denoted by $B$, to be given in its vacuum representation $\pi^0$ on a Hilbert space $H$. Within this theory we identify chiral observables, denoted by $A_L^{\rm max}$ and $A_R^{\rm max}$, and show that these are the respective relative commutants of any initially given chiral observables $A_R$ and $A_L$ within the same 2D theory (Corollary 2.7).
We then study the superselection sectors of the maximal chiral observables which are contained in $H$, that is, the branching of the irreducible representation $\pi^0$ upon restriction to the subalgebra $A_L^{\rm max}\otimes A_R^{\rm max}$. We show that the coupling matrix for the chiral observables $A^{\rm max}$ is described by an isomorphism between the left and right chiral fusion rules (Corollary 3.5), which as a side result implies that $A^{\rm max}$ coincide with $A^{\rm ext}$ in the modular classification statement (by virtue of Lemma 3.4).

We just use the laws controlling local extensions of local algebras, as established in [17]. The crucial point is the fact that the same coupling matrix which describes the vacuum branching (or the 2D partition function), at the same time describes a distinguished DHR representation of the chiral observables, and an endomorphism of a von Neumann algebra of the form $A_L\otimes A_R$ canonically associated with a subfactor $A_L\otimes A_R\subset B$. The constraints on the coupling matrix arise by the latter endomorphism both being canonical and respecting the tensor product (these notions are explained in Sect. 3).

Unlike locality of the chiral observables, locality of the 2D net is only implicitly exploited and does not yet enter our (outline of the) classification itself. It is well known that left and right chiral sectors (charged fields) cannot be freely composed to yield local 2D fields [28,25,22], and a general algebraic condition in terms of a statistics operator was given in [17]. The incorporation of this condition into our present scheme is still outstanding.

As far as these constraints are concerned, very similar arguments also apply to “coset models” in which a tensor product of two commuting subtheories is embedded within a given chiral theory. Therefore, the same constraints on the coupling matrix also arise for the branching of the vacuum sector of the ambient theory upon restriction to the pair of subtheories.

The paper is organized as follows. Section 2 sets the physical stage with emphasis on the equivalence of various possible definitions of the chiral observables. In Sect. 3 the decomposition of the 2D vacuum representation upon restriction to the chiral observables is analyzed in the light of the general theory described in [17]. The central result (Theorem 3.6) is a generalization of the “automorphism of the fusion rules” theorem [22]. Section 4 discusses the (first) implications for the classification problem.

The central argument in Sect. 3 is in fact a theorem on the sector decomposition of the canonical endomorphism of a von Neumann subfactor. This theorem, and the associated notion of a normal canonical tensor product subfactor, is of mathematical interest in its own right [26] and constitutes the common link between various problems in quantum field theory, such as chiral observables in 2D, and coset models [35] and Jones–Wassermann subfactors [34,15] in chiral conformal quantum field theory. Its mathematical essence seems to be most appropriately formulated in terms of C* tensor categories. It furthermore reveals a connection to asymptotic subfactors [24] and quantum doubles [15]. This observation may support the expected role of quantum double symmetry in 2D conformal quantum field theory and coset models.

2. Chiral Observables

We start with the discussion of various alternatives to define chiral observables within a conformally invariant 2D theory. The reader mainly interested in modular invariants is invited to skip this section, and take its results referred to in Sect. 3 for granted.

We adopt the algebraic approach to quantum field theory in which the local algebras are considered rather than the local (Wightman) fields which possibly generate them. The underlying picture [11] is that the net of algebras, i.e., the complete collection of inclusion and intersection relations between algebras associated with smaller and larger space-time regions, is sufficient in principle to reconstruct the full physical content of the theory. Specifications of the model, therefore, have to be formulated as properties of the net of local algebras.
A two-dimensional local conformal quantum field theory is defined on a covering manifold $\tilde M$ of Minkowski space-time $M=\mathbb R^{(1,1)}$. This manifold is obtained as follows [19,2]. One first considers Minkowski space-time as the Cartesian product $\mathbb R\times\mathbb R$ of its two chiral light-cone directions. On each light-cone, the Möbius group PSL(2, R) acts by the rational transformations $x\mapsto\frac{ax+b}{cx+d}$, thus enforcing the compactification of $\mathbb R$ to $S^1$ by addition of the point $\infty=-\infty$. In the quantum field theory, the chiral Möbius groups are only projectively represented, leading to a covering of $S^1$ (in which $\mathbb R$ will be henceforth identified with the interval $(0,2\pi)$). The covering manifold $\tilde M$ is the Cartesian product of the coverings of the two chiral $S^1$, quotiented by the identification $(x_L,x_R)=(x_L+2\pi,x_R-2\pi)$. Each subset $(a,a+2\pi)\times(b,b+2\pi)$ represents one copy of Minkowski space-time $M$ within $\tilde M$.

The covering manifold $\tilde M$ possesses a global causal structure such that the causal complement of a double cone $O=(a,b)\times(c,d)$¹ is the double cone $O'=(b,a+2\pi)\times(d-2\pi,c)\equiv(b-2\pi,a)\times(d,c+2\pi)$, and $(O')'=O$.

¹ It is always understood that $0<b-a<2\pi$ and $0<d-c<2\pi$.

We may assume that the 2D theory $B$ is given by the isotonous net of local von Neumann algebras $B(O)$ associated with double cones in Minkowski space-time $O=I\times J\subset M$, where $I\subset\mathbb R$ and $J\subset\mathbb R$ are open intervals on the respective chiral light-cones. We assume that $B$ is irreducibly represented on a vacuum Hilbert space $H$, and transforms covariantly under a strongly continuous positive-energy representation $U$ of the 2D conformal group. The latter is the Cartesian product of left and right chiral covering groups $\tilde G_L$, $\tilde G_R$ (with covering projection $p:\tilde g\mapsto g$), where $G$ = PSL(2, R) is the Möbius group. Both chiral Möbius groups $G$ contain a subgroup U(1) with positive generators $L_0$, the chiral “conformal Hamiltonians”. The corresponding chiral “rotations by $2\pi$” will be denoted for simplicity by $U_L(2\pi)$ and $U_R(2\pi)$. In a local theory, $U_L(2\pi)=U_R(2\pi)$, that is, the diagonal of the kernel of the covering projection $p$ is represented trivially [19].

Conformal covariance means

$$B(g_LI\times g_RJ)=\mathrm{Ad}_{U(\tilde g_L,\tilde g_R)}\,B(I\times J)$$

whenever the elements $\tilde g\in\tilde G$ are represented by paths $g_t\in G$ connecting $g$ with the identity which map the respective chiral intervals pointwise into intervals. If, on the other hand, the image of an interval under $g_t$ contains $\infty$, then the above transformation law is considered as the definition of the algebra on the left hand side, where now $g_LI$ and $g_RJ$ are intervals on the covering of the compactified light-cones $S^1=\mathbb R\cup\{\infty\}$. If we denote by $I+2\pi$ and $J+2\pi$ the images under chiral rotations by $2\pi$, then it follows that

$$B((I+2\pi)\times(J-2\pi))=B(I\times J),$$

that is, the theory $B$ is indeed defined over the conformal covering space $\tilde M$.

Locality of $B$ on Minkowski space implies that the local algebras also commute whenever the associated double cones in the covering manifold are spacelike separated, i.e., $B(O')\subset B(O)'$. In theories generated by Wightman fields, one even has

Essential duality (duality on the covering space $\tilde M$): $B(O)'=B(O')$.

The same also holds in parity invariant conformal nets [2]. We shall assume essential duality throughout. Note that any pair $O$ and $O'$ are a left and a right wedge, or likewise the other way round, in a suitable copy of Minkowski space-time in $\tilde M$, or can be mapped by Möbius transformations into these wedges in any reference copy of Minkowski space-time. Hence, essential duality is equivalent to wedge duality in Minkowski space.
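The relation $(O')'=O$ for the causal complement on the covering manifold, stated earlier in this section, can be checked directly from the formula for $O'$ and the identification $(x_L,x_R)=(x_L+2\pi,x_R-2\pi)$. The following is merely a worked check of those two stated formulas; the symbol $\sim$ for the identification is introduced here for convenience.

```latex
% O = (a,b) x (c,d), with 0 < b-a < 2\pi and 0 < d-c < 2\pi.
\begin{align*}
O'    &= (b,\,a+2\pi)\times(d-2\pi,\,c), \\
(O')' &= (a+2\pi,\,b+2\pi)\times(c-2\pi,\,d-2\pi) \\
      &\equiv (a,\,b)\times(c,\,d) = O
      \qquad\text{using } (x_L,x_R)\sim(x_L+2\pi,\,x_R-2\pi).
\end{align*}
```

The intervals of $O'$ have lengths $2\pi-(b-a)$ and $2\pi-(d-c)$, both again in $(0,2\pi)$, so the complement rule can be applied a second time. Together with essential duality this is consistent with $B(O)''=B(O')'=B((O')')=B(O)$.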
We reserve the term Haag duality, according to its original usage [11], for the stronger property of duality on Minkowski space $M$ (see below), which is not an automatic feature. It will not be assumed in this paper. We proceed to define chiral observables.

Definition 2.1. The (maximal) left chiral observables are
\[ A_L^{\max}(I) := B(I \times J) \cap U(\tilde G_R)'. \]
The (maximal) right chiral observables $A_R^{\max}(J)$ are defined analogously.

First we note that this definition does not depend on the interval $J$, since any two open intervals are connected by a Möbius transformation in $\{e\} \times \tilde G$, which acts trivially on $A_L^{\max}(I)$ by definition. Second, the left chiral observables commute with $U_L(2\pi) = U_R(2\pi)$. Consequently, the chiral observables are defined over the compactified light-cone $S^1$ without covering, and are covariant under the proper Möbius group $G = PSL(2,\mathbb{R})$. The operators $U_L(2\pi) = U_R(2\pi)$ are multiples of unity in every irreducible subrepresentation of the chiral observables contained in $H$.

The chiral net of von Neumann algebras $I \mapsto A_L^{\max}(I)$ satisfies chiral locality (commutativity for disjoint intervals), since for given disjoint $I_1$ and $I_2$ it is always possible to find intervals $J_1$ and $J_2$ such that $O_i = I_i \times J_i$ are space-like to each other, and 2D space-like locality of the net $B$ applies. Left chiral observables and right chiral observables commute with each other irrespective of their localization, since for any $I$ and $J$ there are $\hat J$ and $\hat I$ such that $I \times \hat J$ and $\hat I \times J$ are space-like, and again space-like commutativity of $B$ applies.

Clearly, the net $A_L^{\max}$ is Möbius covariant under the representation $U_L \equiv U|_{\tilde G_L}$. By the Reeh–Schlieder theorem, the projections $E_L$ onto the subspaces $\overline{A_L(I)\Omega}$ for any covariant net $A_L$ do not depend on the interval $I$ (here $\Omega \in H$ denotes the vacuum vector). By standard arguments, involving the Tomita–Takesaki modular theory [31] and exploiting the geometric action of the modular group associated with conformal double cone algebras [2], one has

Lemma 2.2. The projection $E_L$ implements a faithful normal conditional expectation $\varepsilon_L : B(I \times J) \to A_L(I)$, that is, for $b \in B(I \times J)$ there is a unique $a =: \varepsilon_L(b) \in A_L(I)$ such that $E_L b E_L = a E_L$. The expectation $\varepsilon_L$ preserves the vacuum state, and the vacuum representation of the net $A_L$ is given by $\pi_0^L(A_L(I)) = E_L B(I \times J) E_L$. The corresponding statements hold for $A_R$.

Furthermore, for any Möbius covariant chiral net, the local algebras, unless trivial, are known to be type III von Neumann factors, and one has [7,2]

Essential duality (duality on $S^1$): $\pi_0(A(I))' = \pi_0(A(I'))$

valid in the vacuum representation $\pi_0$ of $A$. Hence the chiral observables automatically satisfy essential duality.

Lemma 2.3. The subspace $\overline{A_L^{\max}(I)\Omega}$ coincides with the subspace of $U_R$-invariant vectors in $H$, that is, $E_L^0 = E_L^{\max}$, where $E_L^0$ denotes the projection onto the $U_R$-invariant subspace. The corresponding statement holds for $A_R$.

Proof. I owe the following argument to D. Buchholz. We only have to show that every $U_R$-invariant vector can be approximated in $A_L^{\max}(I)\Omega$. Since $B(O)\Omega$ is dense in $H$, $E_L^0 B(O)\Omega$ is dense in $E_L^0 H$. Consider any vector $\Psi = E_L^0 b\,\Omega$ with $b \in B(O)$. Then $\Psi = U_R(g)\Psi = E_L^0 \alpha_g(b)\Omega$ for all $g \in \tilde G_R$, and $\Psi = E_L^0 b_T \Omega$, where
\[ b_T = \frac{1}{2T} \int_{-T}^{T} dt\, \alpha_{g_t}(b) \]
is an average over the one-parameter group of right chiral dilatations $g_t$ which leave the interval $J$ fixed. Since $\|b_T\| \le \|b\|$, the family $b_T$ has a weak limit point $a$ in the von Neumann algebra $B(O)$ as $T \to \infty$. We are going to show that $a$ is invariant under $\tilde G_R$, hence commutes with $U_R$ and thus belongs to $A_L^{\max}(I)$. It follows that $\Psi = E_L^0 b_T \Omega = E_L^0 a\,\Omega = a\,\Omega$ is in $A_L^{\max}(I)\Omega$, and hence the latter space is dense in $E_L^0 H$.
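As an aside, the reason why an average of the form $b_T$ becomes asymptotically invariant under the group it averages over can be seen in a scalar toy analog. The sketch below is only an illustration under stated assumptions: a bounded real function `bounded` (a hypothetical choice) stands in for $t \mapsto \alpha_{g_t}(b)$, the integral is replaced by a sum on a uniform grid, and the shift is a whole number of grid steps. In that setting the shifted and unshifted averages differ only by the boundary terms, which gives a bound of the form $(|s|/T)\,\sup|f|$.

```python
import math


def shifted_average(f, n_half, step, shift_steps=0):
    """Discrete analog of (1/2T) * integral of f over (-T + s, T + s),
    with T = n_half * step and s = shift_steps * step."""
    ks = range(-n_half + shift_steps, n_half + shift_steps)
    return sum(f(k * step) for k in ks) / (2 * n_half)


def bounded(t):
    """Hypothetical bounded test function with sup |f| <= 1.5."""
    return math.cos(t) + 0.5 * math.sin(3.0 * t)


SUP_F = 1.5
STEP = 0.1
SHIFT_STEPS = 7  # s = 0.7

# Difference between shifted and unshifted averages for growing T.
differences = {
    n_half: abs(
        shifted_average(bounded, n_half, STEP, SHIFT_STEPS)
        - shifted_average(bounded, n_half, STEP)
    )
    for n_half in (10, 100, 1000)
}
```

Each entry of `differences` is bounded by `(SHIFT_STEPS / n_half) * SUP_F`, the discrete counterpart of $|s|\,\|b\|/T$, because only `SHIFT_STEPS` terms at each end of the summation window fail to cancel.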


In order to show the $\tilde G_R$-invariance of $a$, we first note that
\[ \|\alpha_{g_s}(b_T) - b_T\| = \frac{1}{2T} \left\| \left( \int_{-T+s}^{-T} + \int_{T}^{T+s} \right) dt\, \alpha_{g_t}(b) \right\| \le \frac{|s|}{T}\, \|b\|, \]
which vanishes as $T \to \infty$. Hence $a$ is dilatation invariant, and
\[ \|\alpha_g(a) - a\| = \|\alpha_g \alpha_{g_t}(a) - \alpha_{g_t}(a)\| = \|\alpha_{g_{-t} g g_t}(a) - a\| \]
for all $g \in \tilde G_R$ and all $t$. For $g$ a translation resp. a special conformal transformation (relative to the dilatations $g_t$), $g_{-t} g g_t$ tends to the identity as $t \to -\infty$ resp. $t \to +\infty$. For $a$ sufficiently regular to have norm-continuity of $\alpha_g$ (which is the case if $b$ above was regular; such operators still generate a dense subspace of $H$) it follows that $\|\alpha_g(a) - a\| = 0$, as asserted. □

We want to study the equivalence of Definition 2.1 with several alternative reasonable definitions. For this purpose, we first compile some useful notions for 2D and for chiral nets.

Generating property. The net $B$ is said to have the generating property if
\[ U(\tilde G_L) \subset B(I \times J) \vee B(I' \times J) \]
for any $J$, and equivalently (taking commutants and using essential duality of $B$) if
\[ B(I \times J) \cap B(I' \times J) \subset U(\tilde G_L)' \]
for any $J$. (Here $I'$ is either of the two intervals $I^+ = (b, a+2\pi)$ or $I^- = (b-2\pi, a)$ if $I = (a,b)$. By the second formula, the algebra on its left-hand side does not depend on this choice, since a suitable left Möbius transformation which maps $I$ onto $I^+$ and $I^-$ onto $I$ leaves the intersection of the two algebras invariant.)

Haag duality (duality on Minkowski space $M$). The net $B$ fulfills Haag duality if
\[ B(O)' = B(O^c) \equiv B(O^-) \vee B(O^+), \]
where $O^c$ is the disconnected causal complement of $O$ in Minkowski space with connected components $O^-$, $O^+$.

Strong additivity. The net $B$ fulfills strong additivity if
\[ B(O_1) \vee B(O_2) = B(O) \]
for $O_1$ and $O_2$ the two connected components of the causal complement of an interior point in a double cone $O$.

Chiral additivity. The net $B$ fulfills chiral additivity if
\[ B(I_1 \times J) \vee B(I_2 \times J) = B(I \times J) \]
whenever $I_1$, $I_2$ arise from $I$ by removal of an interior point; and likewise for the two light-cone directions interchanged.


Generating property. A left chiral net $A_L$ of subalgebras of $A_L^{\max}(I)$ has the generating property if
\[ U(\tilde G_L) \subset A_L(I) \vee A_L(I'); \]
the analogous definition holds for right chiral nets $A_R$ of subalgebras of $A_R^{\max}(J)$.

Haag duality (duality on $\mathbb{R}$). A chiral net $A$ fulfills Haag duality if
\[ \pi_0(A(I))' = \pi_0(A(I^c)) \]
holds in the vacuum representation. Here $I^c$ denotes the (disconnected) open complement of an interval $I$ in $\mathbb{R}$.

Strong additivity. A chiral net $A$ fulfills strong additivity if
\[ A(I_1) \vee A(I_2) = A(I) \]
whenever $I$ is an interval in $S^1$ divided into two subintervals $I_1$, $I_2$ by removal of an interior point.

It is obvious that if any net $A_L$ of subalgebras of $A_L^{\max}$ has the generating property, then $A_L^{\max}$ has the generating property and $B$ also has the generating property. In fact, in view of the previous definition of chiral observables, the generating property for $A^{\max}$ is actually a feature of the 2D net $B$. In the cyclic subspace of the chiral observables (their vacuum representation), the assumption is always true by essential duality and factoriality. But the generating property for $A$ is required to hold on the full vacuum Hilbert space $H$ of $B$. It certainly holds if $B$ possesses a conserved stress-energy tensor whose chiral components then are among the chiral observables. It also holds, e.g., in the theory generated by the derivatives of a massless conserved vector current which has nontrivial chiral observables (the derivatives of a U(1) current) but no stress-energy tensor. Namely, in this model, $B(I \times J) = A_L^{\max}(I) \otimes A_R^{\max}(J)$, and $H = H_L \otimes H_R$. Thus $H$ contains only the vacuum representation of the chiral observables. Therefore, we believe that the assumption of the generating property for chiral observables does not exclude any models of serious interest.

The following assertions hardly need to be proven.

Lemma 2.4. (i) Haag duality is equivalent to strong additivity, both for 2D and chiral conformal nets.
(ii) Strong additivity of a 2D net implies chiral additivity.

Proof. (i) By essential duality, Haag duality is equivalent to $B(O^c) = B(O')$ (in the 2D case). This in turn is strong additivity since, in the covering space $\tilde M$, the two connected components of $O^c$ touch each other in a point ("space-like infinity"), and thus constitute the causal complement of that point in $O'$. The same argument applies in the chiral case, as the two connected components of $I^c$ touch each other in $S^1$ at infinity.
(ii) Let $J_1$ and $J_2$ arise by removal of an arbitrary interior point from $J$, such that $O_1 = I_1 \times J_2$ and $O_2 = I_2 \times J_1$ are the components of the causal complement of an interior point in the double cone $O = I \times J$. Then $B(O_1) \vee B(O_2) \subset B(I_1 \times J) \vee B(I_2 \times J) \subset B(O)$, and strong additivity implies equality. □

In order to compare alternative definitions of chiral observables, we consider the following two chains of inclusions, which are obvious just by isotony and essential duality:


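[Figure: the copy $M = (0, 2\pi) \times (0, 2\pi)$ of Minkowski space-time, divided into the regions labelled $i,j$ (for $I_i \times J_j$) by the left chiral intervals $I_1$, $I_2$, $I_3$ with dividing points $a$, $b$ and the right chiral intervals $J_1$, $J_2$ with dividing point $c$.]

Fig. 2.1. Space-time regions in Lemma 2.5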

Lemma 2.5. With notations as explained below, one has
\[ A_L^{\max}(I_2) \subset \bigcap_J B(I_2 \times J) \subset B_{2,\hat 1} \cap B_{2,\hat 2} \subset B_{2,1} \cap B_{2,2} \]
\[ \left\{ \begin{array}{l} \subset B_{2+3,1} \cap B_{2,2} \equiv B_{1,2}' \cap B_{2,2} \subset A_R(J_2)' \cap B_{2,2}, \\ \subset B_{2+3,1} \cap B_{1+2,2} \equiv B_{1,2}' \cap B_{1+2,2} \equiv B_{2+3,1} \cap B_{3,1}'. \end{array} \right. \]

Here we have picked three left chiral intervals $I_1 = (0,a)$, $I_2 = (a,b)$, $I_3 = (b, 2\pi)$ and two right chiral intervals $J_1 = (0,c)$, $J_2 = (c, 2\pi)$ as indicated by the figure, and employ short-hand notations $B_{i,j} = B(I_i \times J_j)$, $B_{i,\hat j} = B(I_i \times \hat J_j)$ with $\hat J_j \subset J_j$. The labels $1+2$ resp. $2+3$ stand for the intervals $(0,b)$ resp. $(a, 2\pi)$. Of course, the choice of the values $0 < a < b < 2\pi$ and $0 < c < 2\pi$ is completely immaterial since the ensuing partition of one copy of Minkowski space $M$ within the covering space $\tilde M$ can be transferred to any other partition of any other copy by left and right Möbius transformations. $A_R$ in the second line in Lemma 2.5 is any covariant net of subalgebras of $A_R^{\max}(J)$.

The consideration of subalgebras of the maximal chiral observables is motivated by our intention to compare with the context of modular invariant partition functions. There one usually starts with some a priori given chiral observables $A_R$ and $A_L$ such as current algebras while the maximal ones might turn out as some "W-algebra" extension thereof. Indeed, we shall later find a condition (Corollary 3.5) when the given chiral observables and the maximal ones coincide.

Of particular interest are the expressions $\bigcap_J B(I_2 \times J)$, $B_{1,2}' \cap B_{1+2,2}$, and $A_R(J_2)' \cap B_{2,2}$ figuring in Lemma 2.5. The first one is possibly nontrivial even in massive 2D theories [30], where it provides a "holographic" satellite theory (with a conformal symmetry emerging automatically [33]); it has been used as a definition of observables on a horizon in curved space-time [10] in the absence of space-time symmetries. The second one is, up to a Möbius transformation, the relative commutant of a wedge algebra $B(W + a)$ within another wedge algebra $B(W)$, where $a$ is a shift in a light-like direction. The third one is the relative commutant of the opposite chiral observables within a double cone. Each of these would be a sensible definition of chiral observables. In fact, under suitable conditions, the inclusions above turn into equalities and the various definitions coalesce. (Note that any nontrivial inclusion in the first and second line would require the respective larger algebra not to commute with $U(\tilde G_R)$.)


Proposition 2.6 (referring to the chains of inclusions as in Lemma 2.5).
(i) If $B$ has the generating property, then all inclusions in the first line are equalities.
(ii) If $A_R$ has the generating property, then all inclusions in the first and second line are equalities.
(iii) If all inclusions in the first and second line are equalities, then $B$ has the generating property.
(iv) If $B$ satisfies Haag duality, then $B_{2,\hat 1} \cap B_{2,\hat 2} = B_{1,2}' \cap B_{1+2,2}$, where $\hat J_1 = (c', c)$, $\hat J_2 = (c, c'')$, $0 < c' < c < c'' < 2\pi$; in particular, if in addition the inclusions in the first line are equalities, then all inclusions in the first and third line are equalities.
(v) If all inclusions in the first line are equalities and $B$ satisfies chiral additivity, then $A_L^{\max}$ satisfies Haag duality. The corresponding statement holds for $A_R^{\max}$.

Versions of assertions (iv) and (v) are also contained in [10].

Proof. (i) The generating property of $B$ implies that $B(I \times J) \cap B(I \times J')$ commutes with $U(\tilde G_R)$ and thus is contained in $A_L^{\max}(I)$.
(ii) $B_{2,2}$ commutes with $A_R(J_1)$, and hence $A_R(J_2)' \cap B_{2,2}$ is contained in $[A_R(J_2)' \cap A_R(J_1)'] \cap B_{2,2}$, which by the generating property of $A_R$ is contained in $U(\tilde G_R)' \cap B_{2,2} = A_L^{\max}(I_2)$.
(iii) We have $U(\tilde G_R) \subset A_L^{\max}(I_2)' = B_{1,2} \vee B_{3,1}$ by definition and by assumption. The claim follows by isotony with $I_1 \times J_2 \subset (b - 2\pi, a) \times J_2$ and $I_3 \times J_1 = (I_3 - 2\pi) \times (J_1 + 2\pi) \subset (b - 2\pi, a) \times J_2'$.
(iv) We have $B_{1,2}' \cap B_{1+2,2} = B_{1,2}' \cap B_{3,1}'$. By isotony, this is contained in $B_{1,2}' \cap B_{3,\check 1}'$, which equals $B_{2,\hat 1}$ by Haag duality, where $\check J_1 = (0, c')$. The same algebra is similarly contained in $B_{1,\check 2}' \cap B_{3,1}' = B_{2,\hat 2}$, where $\check J_2 = (c'', 2\pi)$. This gives both assertions, by inspection of the chain of inclusions, Lemma 2.5.
(v) Chiral additivity for $B$ is, by passing to the commutants, equivalent to
\[ B((a,b) \times J) = B((0,b) \times J) \cap B((a, 2\pi) \times J) \]
for $0 < a < b < 2\pi$ and any interval $J$. Taking suitable intersections over $J$ to obtain $A_L^{\max}(I)$ by using equality in the first line of the chain of inclusions, yields
\[ A_L^{\max}((a,b)) = A_L^{\max}((0,b)) \cap A_L^{\max}((a, 2\pi)). \]

Since the vacuum representation is faithful on $\mathbb{R} \cong (0, 2\pi)$, the same holds in the vacuum representation $\pi_0$ (i.e., after multiplication with $E_L^{\max}$). Passing to the commutants in the vacuum representation, and using essential duality for $A_L^{\max}$, one gets strong additivity in the vacuum and hence in any representation. □

We must admit a little lapse in the proof of (v). Namely, the vacuum representation of a chiral net $A$ is known to be faithful on the quasilocal C* algebra on $\mathbb{R} \cong (0, 2\pi)$ which does not contain the von Neumann algebras $A((0,b))$ and $A((a, 2\pi))$. Yet, we are confident that the above conclusion from faithfulness is correct for the intersections. We have tested its validity in the (prototypical) model with a chiral U(1) current $j$ and associated charge $Q = \int j(x)\,dx$. The operator $\exp itQ$ which is trivially represented in the vacuum representation can be weakly approximated by Weyl operators $\exp itj(f_R)$ as $R \to \infty$, where $f_R(x) = f(x/R)$ and $f$ is a testfunction with $f(x) = 1$ for $|x| < 1$ and $f(x) = 0$ for $|x| > 2$, say. Splitting $f_R$ into two pieces $f_R^\pm$ with supports in $(-2R, a)$ and in $(-a, 2R)$ respectively, yields two Weyl operators $\exp itj(f_R^-)$ and $\exp itj(-f_R^+)$ localized in overlapping left and right halfspaces whose weak limits as $R \to \infty$ should coincide in the vacuum representation, and differ in a charged representation by a factor $\exp itQ$. Nontriviality of these weak limits would invalidate our conclusion in the proof of (v). A calculation, however, shows that, due to scale invariance, the cutoff within the fixed interval $(-a, a)$ in comparison to the increase in $R$ behaves like a cutoff in a scaled interval $(-a/R, a/R)$, and produces an ultraviolet singularity which causes the weak limits of interest to be zero. Since this ultraviolet behaviour is a "universal" effect of scale invariance, we believe that the same mechanism protects the validity of our conclusion also in general models. In any case, (v) will not be needed for the purposes of this paper.

As we consider the assumption of the generating property for the chiral observables as no serious restriction, we reformulate the statements with this assumption as a default.

Corollary 2.7. (i) Assume the generating property for some nets $A_L(I)$ and $A_R(J)$ of subalgebras of $B(I \times J)$ which are invariant under the respective opposite Möbius group. Then
\[ A_L^{\max}(I) = \bigcap_J B(I \times J) = A_R(J)' \cap B(I \times J), \]
and similarly for $A_R^{\max}$. In particular, the left and right maximal chiral observables are each other's mutual relative commutants in $B$.
(ii) If the net $B$ is Haag dual, then $A_L^{\max}$ and $A_R^{\max}$ are Haag dual, and
\[ A_L^{\max}(I_1) = B(I_2 \times J)' \cap B(I \times J), \]
where $I_1$, $I_2$ arise from the interval $I$ by removal of an interior point, and $J$ is an arbitrary interval. The corresponding statement holds for $A_R^{\max}$. (Again, the assertion of Haag duality for the chiral observables has to be taken with a little caution.)

We conclude this section with a study of the joint position of the subalgebras of left and right chiral observables within $B(O)$. We have

Proposition 2.8. In the vacuum representation of $B$, the left and right chiral observables are in a tensor product position, i.e.,
\[ A_L^{\max}(I) \vee A_R^{\max}(J) \simeq A_L^{\max}(I) \otimes A_R^{\max}(J). \]

Proof. The statement follows, by Tomita–Takesaki modular theory [31], from the existence of the conditional expectations $\varepsilon_L$ and $\varepsilon_R$, cf. Lemma 2.2. We want to give a less abstract argument. Since left and right chiral observables mutually commute, it is sufficient to consider products $a_L a_R$, where $a_L \in A_L^{\max}(I)$ and $a_R \in A_R^{\max}(J)$. Since the vacuum state $\omega$ is conformally invariant, and since the chiral observables transform under the respective chiral Möbius groups only, we have
\[ \omega(a_L a_R) = \omega(\alpha_{g_L \times g_R}(a_L a_R)) = \omega(\alpha_{g_L}(a_L)\, \alpha_{g_R}(a_R)). \]
For suitable elements $g_L$ and $g_R$, the localization of the transformed observables tends to space-like infinite separation, hence the cluster property of the vacuum state applies and entails $\omega(a_L a_R) = \omega(a_L)\,\omega(a_R)$. The factorization of the (normal) vacuum state implies the tensor product factorization of the corresponding algebras. □


3. Representation Theory

A subtheory $A$ of a given theory $B$ is described by a net of subalgebras (subfactors) $A(O) \subset B(O)$. Conversely, $B$ may be considered as a (local) extension of a given theory $A$. In the present paper, $A$ is a net of left and right chiral observables² $O \mapsto A(O) = A_L(I) \otimes A_R(J)$, contained in a two-dimensional net $O \mapsto B(O)$.

A general analysis of the representation theory in this situation was initiated in [17]. As a prerequisite it was required that, in generalization of an unbroken global gauge symmetry, there is a consistent family of (normal, faithful) conditional expectations $\varepsilon_O : B(O) \to A(O)$ which commute with space-time symmetries and preserve the vacuum state. In our situation at hand, these expectations are provided by Takesaki's theorem [31], thanks to the fact that Tomita's modular group for conformal double cone algebras is a subgroup of $\tilde G_L \times \tilde G_R$ and consequently preserves any Möbius covariant subtheory of the form $A_L \otimes A_R$. As in Sect. 2, they are coherently implemented by the projection $E_{LR}$ onto the closure of the subspace $A_L(I) A_R(J)\Omega$ generated from the vacuum vector $\Omega$ (not depending on $I \times J$), which commutes with Möbius transformations and preserves the vacuum state.

Actually, for the analysis in [17] nets have to be directed. We must therefore pass to the 2D and chiral theories on Minkowski space $M$ and the light-cone axes $\mathbb{R}$, respectively. As is common practice, we denote the quasilocal C* algebra generated by a directed net of von Neumann algebras (say $A(O)$) by the same symbol (say $A$) as the net itself. We also denote the vacuum representations of $A$ and of $B$ by $\pi_0$ and $\pi^0$, respectively.

In the algebraic approach to quantum field theory, positive energy representations are conveniently described in terms of DHR endomorphisms [11], provided Haag duality holds. But the restriction of $\pi^0$ to the subtheory $A$ is always given by a DHR endomorphism $\rho$ of $A$,
\[ \pi^0|_A \simeq \pi_0 \circ \rho, \]
even without assuming Haag duality [17]. Moreover, $\rho$ is of the "canonical" form $\rho = \bar\iota \circ \iota$. Here $\iota : A \to B$ is the embedding homomorphism and $\bar\iota : B \to A$ is a conjugate homomorphism to $\iota$ in the sense [18] that there exist isometric intertwiners $w \in A$, $w : \mathrm{id}_A \to \bar\iota \circ \iota \equiv \rho$ and $v \in B$, $v : \mathrm{id}_B \to \iota \circ \bar\iota \equiv \gamma$ with $w^* v = w^* \gamma(v) = \lambda^{-1/2} \cdot 1$. The number $\lambda \ge 1$ is the (statistical) dimension of $\rho$ and coincides with the index of the local subfactor $A(O) \subset B(O)$ which is independent of $O$. (We assume this index, and hence the dimensions of $\rho$ and all its subsectors, to be finite throughout.)

The construction given in [17] starts off from a canonical endomorphism [16] $\gamma_O$ of the local von Neumann algebra $B(O)$ for any fixed double cone $O$ into its subfactor $A(O)$. $\gamma_O$ extends to a canonical endomorphism $\gamma$ of the quasilocal algebra $B$ into $A$ in such a way that on any $B(\hat O)$, $\hat O \supset O$, it yields a canonical endomorphism of $B(\hat O)$ into $A(\hat O)$, and consequently the restriction of $\rho = \gamma|_A$ to $A(\hat O)$ is the corresponding dual canonical endomorphism. It was shown that $\rho$ is a DHR endomorphism localized in the fixed double cone $O$, and that $w \in A(O)$ and $v \in B(O)$ are local operators.

In the present case, $A$ being a tensor product $A_L \otimes A_R$ of C* algebras, any irreducible representation is also a C* tensor product. As pointed out by R. Longo, there is a theoretical possibility (in case the chiral representations are not "type I", cf. [15]) that the C* tensor products are not spatial. In a large class of models, including current algebras, this possibility can be ruled out (Lemma 12 in [15]), however, and it can presumably never arise when the statistical dimension is finite. Thus we may assume that the corresponding subspaces of $H$ are also tensor products.

² Henceforth, the notation $O = I \times J$ will be understood.


Let therefore the irreducible decomposition of the restricted vacuum representation into chiral sectors be given by
\[ \pi^0|_{A_L \otimes A_R} \simeq \bigoplus_{l,r} Z_{l,r}\, \pi_l^L \otimes \pi_r^R \]
with a (possibly rectangular) matrix of nonnegative integers $Z_{l,r}$, where $l$, $r$ run over the irreducible superselection sectors of the left and right chiral observables contained in $H$. Equivalently, the corresponding DHR endomorphism $\rho$ decomposes as
\[ \rho \simeq \bigoplus_{l,r} Z_{l,r}\, \rho_l^L \otimes \rho_r^R \]
with irreducible chiral DHR endomorphisms $\rho_l^L$ and $\rho_r^R$, and with the same matrix $Z$. We call $Z$ the coupling matrix, and we reserve the labels $l = 0$ and $r = 0$ for the respective vacuum sectors, $\rho_0^L \simeq \mathrm{id}_{A_L} \equiv \mathrm{id}_L$ and $\rho_0^R \simeq \mathrm{id}_{A_R} \equiv \mathrm{id}_R$.

Making contact with modular invariants, it should be clear that the coupling matrix also enters the decomposition of the vacuum partition function of a 2D local theory
\[ \mathrm{Tr}_{\pi^0}\, e^{-\beta(L_0^L + L_0^R)} = \sum_{l,r} Z_{l,r}\, \mathrm{Tr}_{\pi_l^L}\, e^{-\beta L_0^L}\; \mathrm{Tr}_{\pi_r^R}\, e^{-\beta L_0^R} \]
into chiral characters $\chi_\pi = \mathrm{Tr}_\pi\, e^{-\beta L_0}$ of the representations of the chiral observables.

A similar algebraic situation with a tensor product of two nets of observables embedded into another net also arises in coset models [35] in chiral quantum field theory. These models are given by a net of chiral observables $B(I)$ and a proper subnet $A(I)$ (e.g., the current algebras associated with a compact Lie group $G$ and a subgroup $H$). The coset theory is defined as the net of relative commutants $C(I) := A(I)' \cap B(I)$. Unless the pair of groups gives rise to a conformal inclusion (in which case $C(I)$ is trivial), the net $C$ possesses a stress-energy tensor of its own which commutes with the stress-energy tensor of $A$. An argument similar to that in Proposition 2.8, making use of the two commuting Möbius groups for $A$ and $C$, yields the tensor product position of $A$ and $C$ within $B$. Again, the branching of the vacuum sector of $B$ is described by a coupling matrix, and our results below can be easily adapted to coset models.

We are going to study the branching of the vacuum representation $\pi^0|_A$ in terms of the endomorphism $\rho$. It turns out to be convenient to do this in a framework of endomorphisms of von Neumann algebras. For this purpose we use the fact that $\rho$ as a DHR endomorphism of the quasilocal algebra $A$ has the same decomposition into irreducibles as its restriction $\rho_O = \rho|_{A(O)}$ as a (dual canonical) endomorphism of a local von Neumann algebra. This statement is standard if one assumes Haag duality and strong additivity. But it has also been established without these assumptions in the chiral case, making use of conformal symmetry and essential duality instead, provided the statistical dimension is finite [9]. The latter argument carries over without difficulty to the 2D case. We just state this result without repeating its proof.

Lemma 3.1. Let $A$ be a local net on $M$ or $\mathbb{R}$. Assume either that $A$ is the restriction of a conformal net on $\tilde M$ resp. $S^1$, or that $A$ satisfies Haag duality and strong additivity. Let $\sigma$, $\tau$ be two DHR endomorphisms (in the conformal case: with finite statistical dimension), localized in some double cone or interval $O$, and $\sigma_O$, $\tau_O$ their restrictions to $A(O)$. Then the intertwiner spaces $(\sigma, \tau)$ and $(\sigma_O, \tau_O)$ coincide. In particular, $\sigma$ and $\sigma_O$ have the same decomposition into irreducibles.
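The role of the coupling matrix in the vacuum partition function discussed above can be illustrated with a toy computation. The sketch below is only an illustration under stated assumptions: each chiral sector is represented by a short, hypothetical list of ($L_0$ eigenvalue, multiplicity) pairs (a truncated spectrum that does not come from any specific model), and the function names are made up for this example. It compares the bilinear expression $\sum_{l,r} Z_{l,r}\chi_l\chi_r$ with a level-by-level trace over the direct sum of $Z_{l,r}$ copies of the tensor products of sectors.

```python
import math
from itertools import product


def character(levels, beta):
    """chi(beta) = sum of multiplicity * exp(-beta * energy) over the levels."""
    return sum(mult * math.exp(-beta * energy) for energy, mult in levels)


def partition_function(Z, left, right, beta):
    """Bilinear form sum_{l,r} Z[l][r] * chi_l(beta) * chi_r(beta)."""
    return sum(
        Z[l][r] * character(left[l], beta) * character(right[r], beta)
        for l in range(len(left))
        for r in range(len(right))
    )


def direct_trace(Z, left, right, beta):
    """Trace of exp(-beta (L0_L + L0_R)) over the direct sum of Z[l][r]
    copies of H_l (x) H_r, evaluated level by level."""
    total = 0.0
    for l, r in product(range(len(left)), range(len(right))):
        for (e_l, m_l), (e_r, m_r) in product(left[l], right[r]):
            total += Z[l][r] * m_l * m_r * math.exp(-beta * (e_l + e_r))
    return total


# Hypothetical truncated spectra: sector 0 is the vacuum sector.
LEFT = [[(0.0, 1), (2.0, 1), (3.0, 2)], [(0.5, 1), (1.5, 1)]]
RIGHT = [[(0.0, 1), (2.0, 1), (3.0, 2)], [(0.5, 1), (1.5, 1)]]
Z_DIAGONAL = [[1, 0], [0, 1]]  # assumed coupling matrix for the illustration

z_bilinear = partition_function(Z_DIAGONAL, LEFT, RIGHT, beta=1.0)
z_direct = direct_trace(Z_DIAGONAL, LEFT, RIGHT, beta=1.0)
```

The two numbers agree because the trace over a tensor product factorizes; the coupling matrix only records how often each pair of sectors occurs.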


Since our nets $B$ and $A_L$, $A_R$ are conformal, the lemma applies to all DHR endomorphisms with finite dimension. It follows that the decomposition
\[ \rho_O \simeq \bigoplus_{l,r} Z_{l,r}\, \rho_l^L \otimes \rho_r^R \]
of the dual canonical endomorphism for the local subfactor $A_L(I) \otimes A_R(J) \subset B(O)$ is again described by the same coupling matrix $Z$, where now $\rho_l^L$ and $\rho_r^R$ are local restrictions of chiral DHR endomorphisms.

The crucial additional information here is that $\rho$ and hence the dual canonical endomorphism $\rho_O$ respects the tensor product structure $A(O) = A_L(I) \otimes A_R(J)$ in the sense that its irreducible components are equivalent to tensor products of irreducible endomorphisms of the factor algebras. We call a von Neumann subfactor $A \otimes C \subset B$ with this property a canonical tensor product subfactor (CTPS)³ with associated coupling matrix $Z$.

The subfactors $A_L(I) \otimes A_R(J) \subset B(O)$, or $A(I) \otimes C(I) \subset B(I)$ for coset models, are examples of CTPS's. Other examples in conformal quantum field theory are Jones–Wassermann subfactors arising from partitions of $S^1$ into four intervals [34,15].

Since we assume the index to be finite, only finitely many sectors can contribute which all must have finite dimension, hence the coupling matrix is a finite matrix. Since we have assumed the defining representation of $B$ to contain a unique vacuum vector, it follows that its restriction to the chiral observables contains the joint vacuum representation exactly once, hence $Z_{0,0} = 1$. This implies that the multiplicity of $\mathrm{id}_L \otimes \mathrm{id}_R$ in $\rho$ is one, hence the embedding $A_L \otimes A_R \subset B$ is irreducible (both for the local von Neumann algebras and for the quasilocal C* algebras). We summarize the discussion so far:

Proposition 3.2. The local subfactors $A_L(I) \otimes A_R(J) \subset B(O)$ are irreducible canonical tensor product subfactors. The irreducible sector decomposition of their dual canonical endomorphisms is described by the same finite coupling matrix $Z$ as the decomposition of the restricted vacuum representation $\pi^0|_{A_L \otimes A_R}$ of $B$.

We are going to study the constraints on $Z$ being the coupling matrix of a canonical TPS. These constraints are then read back as constraints on the representation $\pi^0|_{A_L \otimes A_R}$ or on the 2D partition function. In the sequel when we write $A_L \otimes A_R \subset B$, we have in mind the local subfactor $A_L(I) \otimes A_R(J) \subset B(O)$, or with suitable modifications $A(I) \otimes C(I) \subset B(I)$ in coset models. But we are actually going to establish general statements on coupling matrices of CTPS's without reference to quantum field theory.

We shall several times need "Frobenius reciprocity", cf. [18], which we recall in

Lemma 3.3. Let $A$, $B$, $C$ be unital C* or von Neumann algebras and $\alpha : A \to B$, $\beta : B \to C$, $\gamma : A \to C$ unital homomorphisms. Denote by $\langle \gamma, \alpha\beta \rangle$ the dimension of the linear space of intertwiners $t \in C$, $t : \gamma \to \alpha\beta$. Then
\[ \langle \bar\alpha\gamma, \beta \rangle = \langle \gamma, \alpha\beta \rangle = \langle \gamma\bar\beta, \alpha \rangle \]
provided the conjugate homomorphisms $\bar\alpha : B \to A$ or $\bar\beta : C \to B$ exist.

³ An elementary example of a subfactor $A \otimes C \subset B$ which is not canonical in this sense was suggested to me by H.J. Borchers: take $C = A$, and $B$ the crossed product of $A \otimes A$ by the flip automorphism. Then the dual canonical endomorphism is the direct sum of the identity and the flip. The latter does not respect the tensor product.


Here, as before, conjugates are defined in terms of a pair of intertwiners [18], say $w : \mathrm{id}_A \to \bar\alpha\alpha$, $v : \mathrm{id}_B \to \alpha\bar\alpha$, which satisfy the relations $\alpha(w)^* v = 1_B$, $\bar\alpha(v)^* w = 1_A$.

For $X \subset B$ the relative commutant $X' \cap B$ is commonly denoted by $X^c$. We have

Lemma 3.4. Let $A_L \otimes A_R \subset B$ be an irreducible CTPS with finite index, and $Z_{l,r}$ its coupling matrix. Then, $Z_{0,r} = \delta_{0,r}$ if and only if $1 \otimes A_R = (A_L \otimes 1)^c$. The corresponding statement holds exchanging $A_L$ and $A_R$.

Proof. We have the following equation:
\[ \begin{array}{rl} Z_{0,r} &= \langle \bar\iota\iota, \mathrm{id}_L \otimes \rho_r^R \rangle = \langle \iota, \iota \circ (\mathrm{id}_L \otimes \rho_r^R) \rangle \\ &= \dim\{\psi \in B : \psi\,(a_L \otimes a_R) = (a_L \otimes \rho_r^R(a_R))\,\psi \;\; \forall\, a_L \in A_L,\ a_R \in A_R\} \\ &= \dim\{\psi \in (A_L \otimes 1)^c : \psi\,(1 \otimes a_R) = (1 \otimes \rho_r^R(a_R))\,\psi \;\; \forall\, a_R \in A_R\} \\ &= \langle \iota_1, \iota_1 \circ (\mathrm{id}_L \otimes \rho_r^R) \rangle = \langle \bar\iota_1 \iota_1, \mathrm{id}_L \otimes \rho_r^R \rangle. \end{array} \]
Here, $\iota$ and $\iota_1$ are the inclusion maps of $A_L \otimes A_R$ into $B$, and of $1 \otimes A_R$ into $(A_L \otimes 1)^c$, respectively. The equality between the second and the third line follows by putting in turn $a_R = 1$ and $a_L = 1$. Thus, the existence of a nontrivial sector $\rho_r^R$ such that $Z_{0,r} \ne 0$ is equivalent to a nontrivial subsector of $\bar\iota_1 \iota_1$, and the claim follows by noting that $Z_{0,0} = 1$ by irreducibility. □

The lemma allows us to characterize the maximal chiral observables by a normality property of the local subfactors, see Corollary 3.5 below. We recall that an inclusion $A \subset B$ is called normal if $(A^c)^c = A$. In general, $A^{cc} \supset A$. It follows that $(A^{cc})^c \subset A^c$ and $(A^c)^{cc} \supset A^c$, hence $A^{ccc} = A^c$, which is obviously equivalent to the statement that a relative commutant is always normal. We call (with a slight abuse of terminology) a tensor product subfactor $A \otimes C \subset B$ normal if $A \otimes 1$ and $1 \otimes C$ are each other's relative commutants in $B$. A normal TPS is automatically irreducible, as $(A \otimes C)' \cap B = (A \otimes 1)' \cap (1 \otimes C)' \cap B = (A \otimes 1)' \cap (A \otimes 1) = (A' \cap A) \otimes 1$ and $A$ is a factor.

Hence, the local subfactors of chiral observables within 2D conformal quantum field theories, $A_L^{\max}(I) \otimes A_R^{\max}(J) \subset B(O)$, are examples of normal and canonical TPS's. Also coset models give rise to local subfactors which are normal CTPS's. Namely, one obtains normality by extending (if necessary) $A(I)$ by the relative commutant of $C(I)$.

Normality of the local subfactors is characteristic for the maximal chiral observables, and a criterion in terms of the coupling matrix is given in

Corollary 3.5. The following are equivalent.
(i) $A_L = A_L^{\max}$ and $A_R = A_R^{\max}$.
(ii) The local subfactors $A_L(I) \otimes A_R(J) \subset B(O)$ are normal CTPS's.
(iii) The coupling matrix satisfies $Z_{0,r} = \delta_{0,r}$ and $Z_{l,0} = \delta_{l,0}$.
(iv) The coupling matrix describes an isomorphism of the left and right chiral fusion rules (in the sense of Theorem 3.6 below).

Proof. (i) and (ii) are equivalent by Corollary 2.7. (ii) and (iii) are equivalent by Lemma 3.4. (iii) and (iv) are equivalent by the following theorem. □

(The equivalence (i) ⇔ (iii) could have been argued already from Lemma 2.3.)
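Condition (iii) of Corollary 3.5 is a finite check on the entries of the coupling matrix, so it lends itself to a small illustration. The following sketch assumes that a coupling matrix is given as a list of rows of nonnegative integers with index 0 labelling the vacuum sector on both sides; the function names and the example matrices are hypothetical and are not taken from any particular model.

```python
def is_normal(Z):
    """Condition (iii): Z[0][r] == delta_{0,r} and Z[l][0] == delta_{l,0}."""
    first_row_ok = all(Z[0][r] == (1 if r == 0 else 0) for r in range(len(Z[0])))
    first_col_ok = all(Z[l][0] == (1 if l == 0 else 0) for l in range(len(Z)))
    return first_row_ok and first_col_ok


def is_permutation_matrix(Z):
    """True if Z is square with exactly one entry 1 in each row and column."""
    n = len(Z)
    square = all(len(row) == n for row in Z)
    entries_ok = all(x in (0, 1) for row in Z for x in row)
    rows_ok = all(sum(row) == 1 for row in Z)
    cols_ok = square and all(sum(Z[l][r] for l in range(n)) == 1 for r in range(n))
    return square and entries_ok and rows_ok and cols_ok


# Hypothetical example matrices (index 0 = vacuum sector).
Z_DIAGONAL = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Z_SWAP = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
Z_NOT_NORMAL = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
```

In this illustration `Z_DIAGONAL` and `Z_SWAP` pass both checks, while `Z_NOT_NORMAL` has the right vacuum coupled to a nontrivial left sector and fails `is_normal`. Note that, as a statement about arbitrary integer matrices, `is_normal` does not imply `is_permutation_matrix`; that implication relies on the subfactor structure and is the content of the theorem that follows.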


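Theorem 3.6. Let $A_L \otimes A_R \subset B$ be a CTPS with finite index, and $Z_{l,r}$ its coupling matrix, that is

$$\rho = \bar\iota \circ \iota \simeq \bigoplus_{l,r} Z_{l,r}\, \rho_l^L \otimes \rho_r^R,$$

where $\iota : A_L \otimes A_R \to B$ denotes the inclusion map and $\bar\iota$ its conjugate. If the coupling matrix satisfies $Z_{0,r} = \delta_{0,r}$ and $Z_{l,0} = \delta_{l,0}$ (that is, the CTPS is normal and irreducible), then

(1) $Z$ is a permutation matrix. It induces a bijection $\hat{\cdot}$ with inverse $\check{\cdot}$ between the systems of sectors $\{\rho_l^L\}$ and $\{\rho_r^R\}$ contributing to the decomposition of $\rho$ such that $Z_{l,r} = \delta_{\hat l, r} = \delta_{l, \check r}$.

(2) Both systems of sectors $\{\rho_l^L\}$ and $\{\rho_r^R\}$ are closed under conjugation and under decomposition of products (fusion). They satisfy the same fusion rules

$$\rho_l^L \rho_k^L \simeq \bigoplus_m N_{lk}^m\, \rho_m^L \quad\text{and}\quad \rho_r^R \rho_s^R \simeq \bigoplus_t \hat N_{rs}^t\, \rho_t^R$$

with $N_{lk}^m = \hat N_{\hat l \hat k}^{\hat m}$. In particular, the bijection $\hat{\cdot}$ respects conjugation, and the dimensions of the corresponding sectors coincide:

$$d(\rho_{\hat l}^R) = d(\rho_l^L).$$

(3) The homomorphisms $\iota_l^L := \iota \circ (\rho_l^L \otimes \mathrm{id}_R) : A_L \otimes A_R \to B$ are irreducible and mutually inequivalent. The same holds for $\iota_r^R := \iota \circ (\mathrm{id}_L \otimes \rho_r^R)$, and $\iota_r^R \simeq \iota^L_{\bar{\check r}}$. Moreover,

$$\iota \circ (\rho_l^L \otimes \rho_r^R) \simeq \bigoplus_k N^k_{\bar{\check r}\, l}\, \iota_k^L \simeq \bigoplus_s \hat N^s_{\bar{\hat l}\, r}\, \iota_s^R.$$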

Proof. The proof adopts and extends methods taken from [20]. Let the index sets $\{l\}$ and $\{r\}$ label the irreducible sectors $\rho_l^L$ of $A_L$ and $\rho_r^R$ of $A_R$, respectively, obtained by closure under reduction of products of those sectors which contribute to $\rho$. If among these there are any “new” sectors not already contributing to $\rho$, we extend the coupling matrix $Z$ by zero columns and rows, but we are eventually going to show that there are no such new sectors. Only finitely many columns and rows of $Z$ are non-zero. Since $\rho = \bar\iota \circ \iota$ is self-conjugate, along with $\rho_l^L \otimes \rho_r^R$ also its conjugate must contribute with the same multiplicity, and hence $Z_{l,r} = Z_{\bar l, \bar r}$. In particular, both systems $\{\rho_l^L\}$ and $\{\rho_r^R\}$ are closed under conjugation. Let the homomorphisms $\iota_l^L : A_L \otimes A_R \to B$ be as in (3). We compute

$$\langle \iota_l^L, \iota_{l'}^L \rangle = \langle \iota \circ (\rho_l^L \otimes \mathrm{id}_R),\, \iota \circ (\rho_{l'}^L \otimes \mathrm{id}_R) \rangle = \langle \rho_l^L \otimes \mathrm{id}_R,\, \bar\iota \circ \iota \circ (\rho_{l'}^L \otimes \mathrm{id}_R) \rangle$$
$$= \sum_{k,s} Z_{k,s}\, \langle \rho_l^L \otimes \mathrm{id}_R,\, \rho_k^L \rho_{l'}^L \otimes \rho_s^R \rangle = \sum_{k,s} Z_{k,s}\, \langle \rho_l^L, \rho_k^L \rho_{l'}^L \rangle \langle \mathrm{id}_R, \rho_s^R \rangle.$$

To this sum contributes only $s = 0$, $\rho_s^R = \mathrm{id}_R$, since $\langle \mathrm{id}_R, \rho_s^R \rangle = \delta_{s,0}$, and by the assumed properties of $Z$ also $k = 0$, $\rho_k^L = \mathrm{id}_L$, is the only contribution. Hence $\langle \iota_l^L, \iota_{l'}^L \rangle = \langle \rho_l^L, \rho_{l'}^L \rangle = \delta_{l,l'}$. Thus the homomorphisms $\iota_l^L$ are all irreducible and mutually inequivalent. The symmetric argument applies to $\iota_r^R$. Next we compute

$$\langle \iota_l^L, \iota_{\bar r}^R \rangle = \langle \rho_l^L \otimes \rho_r^R,\, \bar\iota \circ \iota \rangle = Z_{l,r}.$$

As we have seen that both sets of homomorphisms $\{\iota_l^L\}$ and $\{\iota_r^R\}$ consist of mutually inequivalent irreducibles, each $\iota_l^L$ can be equivalent to at most one $\iota_{\bar r}^R$. Hence for fixed index $l$, at most one entry $Z_{l,r}$ can be different from zero and must be one. It follows also that no $\iota_l^L$ associated with a “new” sector $\rho_l^L$ can be equivalent to any of the $\iota_r^R$, old or new, and vice versa. For the “old” sectors, we write $r = \hat l$ and $l = \check r$ iff $Z_{l,r} = 1$, that is, iff $\iota_l^L \simeq \iota_{\bar r}^R$. That this assignment between old sectors is bijective follows from transitivity of equivalence of sectors. Since we have already seen that $Z$ is conjugation invariant, this assignment respects conjugation, that is $\bar\rho^R_{\hat l} = \rho^R_{\bar{\hat l}} = \rho^R_{\hat{\bar l}}$, etc.

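Next, we consider homomorphisms $\iota_{l,r} := \iota \circ (\rho_l^L \otimes \rho_r^R) : A_L \otimes A_R \to B$ and compute

$$\iota_{l,r} = \iota_r^R \circ (\rho_l^L \otimes \mathrm{id}_R) \simeq \iota^L_{\bar{\check r}} \circ (\rho_l^L \otimes \mathrm{id}_R) = \iota \circ (\bar\rho^L_{\check r}\, \rho_l^L \otimes \mathrm{id}_R) \simeq \bigoplus_k N^k_{\bar{\check r}\, l}\, \iota_k^L.$$

The symmetric argument produces also the decomposition

$$\iota_{l,r} = \iota_l^L \circ (\mathrm{id}_L \otimes \rho_r^R) \simeq \iota \circ (\mathrm{id}_L \otimes \bar\rho^R_{\hat l}\, \rho_r^R) \simeq \bigoplus_s \hat N^s_{\bar{\hat l}\, r}\, \iota_s^R.$$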

In the first of these two decomposition formulae of the same object, no “new” label $k$ can appear, since we have seen that such a term $\iota_k^L$ is not equivalent to any term $\iota_s^R$ in the second decomposition formula, and vice versa. This shows that the sets of sectors contributing to the coupling matrix are already closed under reduction of products. Furthermore, comparison of the two decomposition formulae shows equality of the multiplicities $N^k_{\bar{\check r}\, l}$ and $\hat N^s_{\bar{\hat l}\, r} \equiv \hat N^{\bar s}_{\bar r\, \hat l}$ if $\bar s = \hat k$. Hence the bijection $\hat{\cdot}$ between the sectors induces an isomorphism of the fusion rules. Since finally the fusion rules of a finite system determine the dimensions uniquely, also the equality of the dimensions follows. □

We have thus reproduced a result found previously in the classification program for modular invariant partition functions with heavy use of SL(2, Z) machinery [22], reducing every modular invariant to an “automorphism of the fusion rules” for suitably extended chiral observables. Our analysis is, however, much stronger since its assumptions are much weaker. Furthermore, it implies that the “suitably extended” chiral observables are indeed the maximal chiral observables defined in 2.1, and coincide with the relative commutants of the initially given chiral observables (Corollary 2.7 (i)). Second, if possibly the maximal left and right chiral observables are not isomorphic, then the result still implies an isomorphism of the respective fusion rules.

The corresponding statement is even more interesting in the case of coset models where typically $A \subset B$ is a theory with well-known fusion rules, while the coset theory $C = A^c$ is in general a W-algebra whose superselection structure is a priori unknown. The theorem establishes that the fusion rules of this W-algebra are isomorphic to those of a local extension of the given theory $A$, namely the relative commutant $A^{cc}$ of $C$, which is in turn controllable in terms of the representations of $A$ itself. For coset models based on current algebras, our result seems to be the algebraic backbone of the modular reasoning as in [29].

Finally, we emphasize that the sectors in Theorem 3.6 were never referred to as being restrictions of DHR sectors. Neither was it required that their fusion be abelian. The theorem is thus of a quite more general nature than its specific application to conformal quantum field theory as treated in this paper.

4. Towards Classification

Modular invariant partition functions associated with affine Lie algebras ($A_L \simeq A \simeq A_R$), as far as they have been classified, exhibit a classification scheme which refers to certain graphs and their exponents (eigenvalues of the square of the adjacency matrix) [3,8]. An essential statement is on the non-vanishing diagonal entries of the coupling matrix $Z$. A rather general formulation can be found in [1, Part II]. It entails that $Z_{\lambda,\lambda} \neq 0$⁴ if and only if the DHR sector $\lambda$ of $A$ belongs to a set of “exponents” associated with the chiral extensions $A \subset A^{\mathrm{ext}}$. The set of exponents is a subset of the sectors of $A$.
By modular invariance, the sectors of $A$ label at the same time also the irreducible representations of their own fusion algebra, the modular matrix $S$ playing the role of a “generalized Fourier transformation” between the fusion algebra itself and its dual. On the other hand, modular invariance of the partition function implies that the coupling matrix coincides with its Fourier transform (up to a conjugation). Hence, the above statement on the sector $\lambda$ being an exponent can as well be interpreted as a statement on the irreducible representation $\lambda$ of the fusion algebra and on non-vanishing entries of the Fourier transformed coupling matrix. In the following, we set out to formulate a generalization of this version of the statement to the more general situation we discussed in this paper (without parity symmetry between left and right chiral algebras, and without assumption of modular invariance).

4 In affine models the DHR sectors of the initially given chiral observables are given in terms of weights λ of semisimple Lie algebras. Throughout this section, we adopt the labels λ for DHR sectors in order to make the present generalizations more transparent.

Let $A_L \otimes A_R \subset A_L^{\max} \otimes A_R^{\max} \subset B$ denote some initially given chiral observables embedded into a two-dimensional local theory $B$ (satisfying the assumptions of Sect. 2) along with their maximal chiral extensions obtained by passing to the relative commutants in $B$. Let $W_L$ and $W_R$ denote the fusion algebras of all irreducible DHR sectors $\lambda_L$, $\lambda_R$ of the initially given chiral observables (or fusion subalgebras containing all sectors which contribute to the coupling matrix $Z$). Let $W_L^{\max}$ and $W_R^{\max}$ denote the fusion algebras of the irreducible sectors $\tau_L$, $\tau_R$ of the extended (= maximal) chiral observables which contribute to the coupling matrix (i.e., which are contained in the vacuum representation of $B$). According to Theorem 3.6 and Corollary 3.5, the fusion algebras $W_L^{\max}$ and $W_R^{\max}$ are isomorphic under the bijection $\hat{\cdot}$. We use this bijection to identify $W_L^{\max}$ with $W_R^{\max}$, so the coupling matrix with respect to $A^{\max}$ becomes the unit matrix 1. To be on safe grounds, we assume that $W_L$ and $W_R$ contain only finitely many sectors, and that these have finite dimensions. This implies the same for $W^{\max}$, and ensures that all extensions have finite index.

Restriction and extension prescriptions between DHR sectors of a theory $B$ and a subtheory $A$ were given in [17], and further analyzed in [1]. We are going to apply this theory to the chiral extensions $A_L^{\max}$ of $A_L$, and $A_R^{\max}$ of $A_R$. The restriction is just the restriction of representations and coincides with the “canonical” prescription in terms of the inclusion homomorphism $\iota$ and its conjugate, given by $\tau \mapsto \sigma_\tau = \bar\iota \circ \tau \circ \iota$. It was named σ-restriction in [1]. In the present situation, σ-restriction maps $W^{\max}$ into $W$.⁵ In contrast, the extension prescription $\lambda \mapsto \alpha_\lambda$ [17] differs from the canonical induction $\lambda \mapsto \iota \circ \lambda \circ \bar\iota$; it was named α-induction in [1] for distinction. In particular, unlike canonical induction, α-induction respects sector composition, and the trivial sector of the subtheory extends to the trivial sector of the extended theory. Furthermore, α-extensions of DHR sectors of the subtheory in general are not DHR but only half-space localized (solitonic) sectors, due to a monodromy obstruction [17]. Let $V_L$ and $V_R$ denote the, possibly non-abelian, fusion algebras of all sectors (labelled $\beta$) generated by reduction of products of α-extended DHR sectors from $W_L$ and $W_R$. In [1], a reciprocity formula for α-induction and σ-restriction was found: $\langle \alpha_\lambda, \tau \rangle = \langle \lambda, \sigma_\tau \rangle$ provided $\lambda$ and $\tau$ are DHR sectors of the respective theories. It entails that $\alpha_\lambda$ and $\iota \circ \lambda \circ \bar\iota$, while otherwise different, contain the same DHR subsectors. It also entails that, in the present setting, the fusion algebras $V$ contain the abelian subalgebras $W^{\max}$.

5 Here and in the sequel, we often suppress the subscripts L and R when both chiralities are understood.

Let $B_L$ and $B_R$ denote the rectangular “branching matrices”, describing chiral σ-restriction, with non-negative integer entries $\langle \lambda, \sigma_\tau \rangle$ which connect the irreducible DHR sectors $\tau \in W^{\max}$ with $\lambda \in W$. Then the (in general rectangular, $\dim W_L \times \dim W_R$) coupling matrix with respect to the initially given chiral observables is $Z = B_L B_R^t$, that is, $Z_{\lambda_L,\lambda_R} \neq 0$ if and only if the sectors $\lambda_L$ and $\lambda_R$ arise by restriction from a pair of sectors of the maximal chiral observables which are identified by the bijection $\hat{\cdot}$ of Theorem 3.6. This is just the “block form” of the coupling matrix expected by restricting first $\pi_B^0$ to the maximal chiral observables, and subsequently restricting the sectors so obtained to the initially given chiral observables.

Each fusion algebra has a “regular representation” defined by representing a sector by its matrix of fusion multiplicities with the other sectors. $W$ and $W^{\max}$ being abelian, all their irreducible representations are one-dimensional and contribute with multiplicity one to the regular representations. The values of the generators of the fusion algebra in the irreducible representations provide “character tables” which are non-degenerate square matrices. We denote the one-dimensional representations of $W$ by $\phi \in \hat W$, and their character tables by $X$.
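The block form $Z = B_L B_R^t$ and the notion of a character table can be made concrete on a small toy model. The following is only a sketch under stated assumptions: the branching matrix `B` (five sectors of $A$, three sectors of $A^{\max}$ with $\mathbb{Z}_3$ fusion) is hypothetical illustration data and is not taken from the text, and the names `coupling_matrix`, `is_character` and `inner` are ours.

```python
import cmath

# Hypothetical branching data (an assumption of this sketch, not from the text):
# five sectors lam = 0..4 of A, three sectors tau = 0..2 of A^max with Z_3 fusion.
# B[lam][tau] = <lam, sigma_tau>.
B = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]

def coupling_matrix(b_left, b_right):
    """Return Z = B_L B_R^t (W_L^max and W_R^max are assumed already identified)."""
    return [[sum(x * y for x, y in zip(row_l, row_r)) for row_r in b_right]
            for row_l in b_left]

Z = coupling_matrix(B, B)

# Regular representation of the Z_3 fusion algebra: N[t][a][b] = N_{t a}^b.
N = [[[1 if (t + a) % 3 == b else 0 for b in range(3)] for a in range(3)]
     for t in range(3)]

# Character table X[j][t] = phi_j(tau_t), built from cube roots of unity.
omega = cmath.exp(2j * cmath.pi / 3)
X = [[omega ** (j * t) for t in range(3)] for j in range(3)]

def is_character(phi):
    """Check the representation condition phi(t) phi(a) = sum_b N_{t a}^b phi(b)."""
    return all(
        abs(phi[t] * phi[a] - sum(N[t][a][b] * phi[b] for b in range(3))) < 1e-12
        for t in range(3) for a in range(3)
    )

def inner(u, v):
    """Hermitian inner product of two rows of the character table."""
    return sum(x * y.conjugate() for x, y in zip(u, v))
```

In this toy model the non-zero entries of `Z` sit exactly where two sectors of $A$ restrict from the same sector of $A^{\max}$, and distinct rows of `X` are orthogonal, which is the property of one-dimensional representations used later in the text.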

The character table defines a “generalized Fourier transform” between any abelian fusion algebra and its representations. The Fourier transformed coupling matrix is thus defined as

$$\hat Z = (X_L B_L)(X_R B_R)^t.$$

Its matrix entries are the values of the restriction of the vacuum sector of the 2D theory $B$, as a DHR sector of $A_L \otimes A_R$, in the irreducible representations $\phi_L \otimes \phi_R$ of the tensor product $W_L \otimes W_R$ of the chiral DHR fusion algebras. A priori, the entries need not be integers.

Let $\bar\phi \in \hat W$ denote the conjugate representation of $\phi$. Since the adjoint in the fusion algebra is given by sector conjugation, we have $\bar\phi(\lambda) = \phi(\bar\lambda) = \overline{\phi(\lambda)}$. This means $\hat C X = X C = \bar X$, where $C$ and $\hat C$ are the conjugation matrices for the DHR sectors of the initially given chiral observables $A$ and for the representations of their fusion algebras $W$, respectively. Furthermore, restriction respects sector conjugation, hence $B C^{\max} = C B$, where $C^{\max}$ is the conjugation matrix for the sectors $\tau \in W^{\max}$ of the maximal chiral observables $A^{\max}$. Thus, since the branching matrices are real, we arrive at

$$\hat Z \hat C = (X_L B_L)(X_R B_R)^+ \quad\text{or equivalently}\quad \hat Z = (X_L B_L)\, C\, (X_R B_R)^+.$$

It follows that for a matrix entry of $\hat Z \hat C$ to be non-zero, the corresponding complex row vectors of $X_L B_L$ and $X_R B_R$ must not be orthogonal, and a fortiori must be non-zero. If both the chiral branching and the chiral fusion algebras are isomorphic, e.g., if the theory $B$ is parity symmetric, then a diagonal matrix entry of $\hat Z \hat C$ vanishes if and only if the corresponding row vector of $XB$ vanishes.

A modular (transformation) matrix $S$, if it exists, establishes a natural identification between the generators of a fusion algebra and its representations, and $X = S$. Since $S^2 = C = \hat C$, modular S-invariance $S Z S^* = Z$ is the statement that the coupling matrix $Z = \hat Z \hat C$ equals its own Fourier transform up to a conjugation.
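As a numerical illustration of the Fourier transformed coupling matrix, the sketch below takes $X = S$ with the standard modular matrix of su(2) at level 4, $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$, and a hypothetical branching matrix `B` into three extended sectors. Both the choice of model and the branching data are assumptions made for illustration and do not come from the text. Since this $S$ is real and all sectors are self-conjugate, the adjoint reduces to the transpose and the conjugation matrices are trivial.

```python
import math

K = 4          # level; sectors are labelled a = 0..K (assumption: su(2) at level K)
n = K + 1

# Standard su(2)_K modular S matrix: real and symmetric, used here as X = S.
S = [[math.sqrt(2.0 / (K + 2)) * math.sin(math.pi * (a + 1) * (b + 1) / (K + 2))
      for b in range(n)] for a in range(n)]

# Hypothetical branching matrix B[lam][tau] = <lam, sigma_tau> (illustration only).
B = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(p):
    return [list(col) for col in zip(*p)]

Z = matmul(B, transpose(B))            # coupling matrix Z = B B^t
XB = matmul(S, B)                      # rows: representations, columns: tau
Z_hat = matmul(XB, transpose(XB))      # Fourier transformed coupling matrix

# Representations lying in some sigma-support: rows of XB that are not zero.
support = [a for a in range(n) if any(abs(x) > 1e-12 for x in XB[a])]

# Diagonal entries of Z_hat that do not vanish.
nonzero_diagonal = [a for a in range(n) if abs(Z_hat[a][a]) > 1e-12]
```

In this example `Z_hat` agrees with `Z` up to rounding, so the coupling matrix equals its own Fourier transform, and the non-vanishing diagonal entries are exactly those whose row of `XB` is non-zero.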
This remark implies that Proposition 4.1 below reduces to the above-mentioned statement on “exponents” in [1] in the case with modular invariance.

We have first to adapt definitions made in [1, Part II] to our more general setting. We introduce certain subsets of $\hat W$ which reflect the structure of the chiral extensions.

For a given irreducible DHR sector $\tau \in W^{\max}$, we define the σ-supports $\mathrm{Supp}_L(\tau)$ and $\mathrm{Supp}_R(\tau)$ as the subsets of those irreducible representations of $W_L$ and $W_R$ which do not vanish on the respective restrictions $\sigma_\tau$ of $\tau$ to the initially given left and right chiral observables, that is, those rows of $XB$ which have non-zero entry in column $\tau$. The notion “support” is motivated by considering the abelian fusion algebra $W$ as an algebra of functions on the set $\hat W$ of its one-dimensional representations. Thus $\mathrm{Supp}(\tau) \subset \hat W$ is indeed the support of the function $\sigma_\tau \in W$. (The σ-supports were called Eig(τ) in [1].)

As shown in [1], α-induction of sectors induces a homomorphism of fusion algebras $W \to V$. Composing this homomorphism with the regular representation of $V$ yields another representation, $\pi_\alpha$, of $W$. We define the α-spectra $\mathrm{Spec}_L$ and $\mathrm{Spec}_R$ as the subsets of those irreducible representations of $W_L$ and $W_R$ which are contained in the α-induced representations $\pi_\alpha^L$ and $\pi_\alpha^R$. (The α-spectra were called Exp in [1] and are the “exponents” mentioned above.) Now, by virtue of α-σ-reciprocity [1], we are going to derive

Proposition 4.1. (i) A matrix entry of $\hat Z \hat C$ vanishes unless for some sector $\tau \in W^{\max}$, both matrix indices belong to the respective left and right σ-supports Supp(τ). It also vanishes unless both matrix indices belong to the left and right α-spectra Spec.

(ii) If (fusion and branching of) the left and right chiral theories are isomorphic, then a diagonal matrix entry of $\hat Z \hat C$ is non-zero if and only if the corresponding representation of $W$ belongs to the union $\bigcup_\tau \mathrm{Supp}(\tau)$.

In fact, there are many interesting cases when $\bigcup_\tau \mathrm{Supp}(\tau) = \mathrm{Spec}$ (some of them being given below), so the last statement can be phrased in terms of the α-spectrum Spec. The proposition is the desired generalization of the classification statement [3,8,1] for modular invariant partition functions. (The second statement seems not to be sensible with differing left and right chiral fusion and branching matrices, since the product of two different row vectors can clearly vanish without these vectors being zero.)

The proposition makes assertions about the coupling matrix for the initially given chiral observables $A_L \otimes A_R$ embedded into the 2D theory $B$, in terms of the chiral extensions $A \subset A^{\max}$ to which α-induction and σ-restriction pertain. Thus the 2D problem is reduced to a chiral problem. An open issue remains, however, a model-independent classification of possible α-spectra, and hence of 2D chiral extensions. The available classifications for affine Lie and Virasoro algebras (“diagonal or automorphism, orbifold, exceptional” [3,8,1]) refer to the chiral extensions being in turn trivial, fixpoints under an abelian group, or conformal embeddings, and are expected to be too coarse in the general case.

Proof of the Proposition. (i) The first statement is obvious since by the representation $\hat Z \hat C = (X_L B_L)(X_R B_R)^+$, every matrix entry is the inner product of row vectors whose components are the values of the functions $\sigma_\tau$, $\tau \in W^{\max}$, evaluated on the respective left and right one-dimensional representations. The inner product vanishes whenever these representations do not belong to the respective σ-supports. The second statement is a consequence of the first in view of Lemma 4.2 below.
(ii) For isomorphic left and right chiral fusion and branching, $X_L B_L = X_R B_R$, diagonal matrix entries of $\hat Z \hat C$ are norm squares of row vectors of $XB$ which vanish if and only if all their entries vanish, hence if and only if the corresponding representation of $W$ does not belong to any of the σ-supports Supp(τ), $\tau \in W^{\max}$. □

We have used

Lemma 4.2. $\bigcup_\tau \mathrm{Supp}(\tau) \subset \mathrm{Spec}$.

Proof. The one-dimensional representations $\phi$ of an abelian fusion algebra with generators $\lambda$, considered as vectors with entries $\phi(\lambda)$, are pairwise orthogonal [14]. This property enables us to decide whether a representation $\phi$ is contained in the α-induced representation $\pi_\alpha(\lambda)$ with matrix entries $\langle \alpha_\lambda \beta_1, \beta_2 \rangle$, by contracting the matrix-valued vector $(\pi_\alpha(\lambda))_\lambda$ with the vector $(\phi(\lambda))_\lambda$. Thus $\phi$ belongs to the α-spectrum Spec if and only if the resulting matrix

$$(\phi \cdot \pi_\alpha)_{\beta_1 \beta_2} \equiv \sum_\lambda \phi(\lambda)\, \pi_\alpha(\lambda)_{\beta_1 \beta_2} = \sum_\lambda \phi(\lambda)\, \langle \alpha_\lambda \beta_1, \beta_2 \rangle$$

is non-zero. But for $\beta_1 = \mathrm{id}_{A^{\max}}$, and $\beta_2 = \tau$ an irreducible sector from $W^{\max} \subset V$, the matrix entry of the α-induced representation equals $\langle \lambda, \sigma_\tau \rangle$ by α-σ-reciprocity, and the contracted matrix entry equals $\phi(\sigma_\tau)$. Hence, if $\phi$ belongs to any of the σ-supports Supp(τ), then $\phi$ belongs to the α-spectrum Spec. □

We list here two “extremal”, but by no means exhaustive, conditions to ensure equality in Lemma 4.2, that is, $\bigcup_\tau \mathrm{Supp}(\tau) = \mathrm{Spec}$:

Lemma 4.3. If α-induction is surjective (considered as a linear map from $W$ into $V$), then $\mathrm{Supp}(\mathrm{id}_{A^{\max}}) = \bigcup_\tau \mathrm{Supp}(\tau) = \mathrm{Spec}$. If σ-restriction is surjective (considered as a linear map from $W^{\max}$ into $W$), then $\bigcup_\tau \mathrm{Supp}(\tau) = \mathrm{Spec} = \hat W$ exhaust all representations of $W$.

The case of surjective α-induction was also paid special attention in [1]. Indeed, there are many other cases when $\bigcup_\tau \mathrm{Supp}(\tau) = \mathrm{Spec}$, but we have no satisfactory characterization yet.

Proof. We want to compute the σ-support $\mathrm{Supp}(\mathrm{id}_{A^{\max}})$. For this purpose, we multiply $\phi(\sigma_{\mathrm{id}_{A^{\max}}})$ with $\phi(\mu)$, $\mu \in W$. Using in turn α-σ-reciprocity, the representation condition for $\phi$, Frobenius reciprocity, the homomorphism property of α-induction, and associativity of fusion, we arrive at

$$\phi(\sigma_{\mathrm{id}_{A^{\max}}})\,\phi(\mu) = \sum_\lambda \phi(\lambda)\,\phi(\mu)\, \langle \alpha_\lambda, \mathrm{id}_{A^{\max}} \rangle = \sum_{\kappa\lambda} N^{\lambda}_{\kappa \bar\mu}\, \phi(\kappa)\, \langle \alpha_\lambda, \mathrm{id}_{A^{\max}} \rangle$$
$$= \sum_\kappa \phi(\kappa)\, \langle \alpha_\kappa \alpha_{\bar\mu}, \mathrm{id}_{A^{\max}} \rangle = \sum_{\kappa,\beta} \phi(\kappa)\, \langle \alpha_{\bar\mu}, \beta \rangle \langle \alpha_\kappa \beta, \mathrm{id}_{A^{\max}} \rangle.$$

Here the sum over $\beta$ extends over all sectors of $V$. The last sum must vanish for every $\mu$, since the left-hand side does, if $\phi(\sigma_{\mathrm{id}_{A^{\max}}}) = 0$, i.e., if $\phi \notin \mathrm{Supp}(\mathrm{id}_{A^{\max}})$. Now, if α-induction is surjective, then every sector $\beta$ arises as a linear combination of sectors $\alpha_\mu$, and consequently

$$\sum_\kappa \phi(\kappa)\, \langle \alpha_\kappa \beta, \mathrm{id}_{A^{\max}} \rangle = \sum_\lambda \phi(\lambda)\, \langle \alpha_\lambda, \beta \rangle$$

must vanish for all $\beta$. These are sufficiently many matrix entries to ensure the vanishing of the full matrix (since $\langle \alpha \beta_1, \beta_2 \rangle = \sum_\beta \langle \alpha, \beta \rangle \langle \beta \beta_1, \beta_2 \rangle$), and hence the absence of $\phi$ from the α-spectrum Spec. Hence $\mathrm{Spec} \subset \mathrm{Supp}(\mathrm{id}_{A^{\max}})$, implying the first claim. On the other hand, if σ-restriction is surjective, then $\phi(\sigma_\tau) = 0$ for all $\tau \in W^{\max}$ implies $\phi(\lambda) = 0$ for all $\lambda \in W$, hence $\phi = 0$. Thus the union of the σ-supports exhausts all representations of $W$, implying the second claim. □

We have thus established some first constraints on the coupling matrix in terms of representations of fusion algebras. Further constraints are expected to derive from locality which was only partially exploited in the form of α-σ-reciprocity in Proposition 4.1, and in the commutativity of left and right chiral observables in Theorem 3.6. Notably the condition for locality of the 2D theory in terms of the local subfactor data and the statistics which was given in [17] remains to be transcribed into a condition on the coupling matrix.

As mentioned in the introduction, chiral locality produces matrices $S^{\mathrm{stat}}$ and $T^{\mathrm{stat}}$ which represent SL(2, Z) [27,6], except for a possible degeneracy of the braiding. A first implication of the locality condition for the 2D theory is that $T_L^{\mathrm{stat}} Z = Z T_R^{\mathrm{stat}}$, in accordance with local 2D conformal fields having integer spin $h_L - h_R$. The companion relation $S_L^{\mathrm{stat}} Z = Z S_R^{\mathrm{stat}}$, that is, modular invariance of the coupling matrix with respect to the representation of SL(2, Z) given by the statistics, cannot be established for general 2D nets $B$, however. The surprise is that, as shown here, one can go much of the way towards classification without knowing these formulae, and that one can do so whether the involved sectors have a degenerate braiding or not.
(Müger’s proof [21] that the degeneracy can always be removed by an algebraic extension of the chiral observables does not help here, since this extension is in general not possible within the given 2D observables.)
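The relation between $T_L Z = Z T_R$ and integer spin can be checked in exact arithmetic on a toy example. The sketch below assumes a diagonal $T$ with entries $\exp(2\pi i (h_a - c/24))$ and the standard su(2) level-$k$ conformal weights $h_a = a(a+2)/(4(k+2))$; the coupling matrix `Z` is hypothetical illustration data, and none of these choices is taken from the text. With equal central charges on both sides, $T Z = Z T$ holds exactly when $h_a - h_b$ is an integer wherever $Z_{ab} \neq 0$.

```python
from fractions import Fraction

K = 4  # level (assumption: su(2) at level K, sectors labelled a = 0..K)

def weight(a):
    """Conformal weight h_a = a(a+2) / (4(K+2)) as an exact fraction."""
    return Fraction(a * (a + 2), 4 * (K + 2))

# Hypothetical coupling matrix (illustration only).
Z = [
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]

def spins(z):
    """Map each coupled pair (a, b) with z[a][b] != 0 to the spin h_a - h_b."""
    return {(a, b): weight(a) - weight(b)
            for a in range(len(z)) for b in range(len(z[a])) if z[a][b] != 0}

def commutes_with_t(z):
    """T Z = Z T for diagonal T iff every coupled pair has integer spin."""
    return all(s.denominator == 1 for s in spins(z).values())

# A matrix that also couples sectors 0 and 2 (spin -1/3) violates the relation.
Z_bad = [row[:] for row in Z]
Z_bad[0][2] = 1
```

Working with `Fraction` avoids comparing complex exponentials numerically: the phase $\exp(2\pi i (h_a - h_b))$ equals one precisely when the fraction has denominator one.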

5. Conclusions

We have shown that in a 2D conformally invariant quantum field theory with sufficiently many chiral observables to generate the chiral Möbius groups, there are maximal algebras of chiral observables which are, locally, the relative commutants of each other, as well as of any a priori given chiral observables sharing the generating property (cf. Sect. 2). The representation theory of the chiral observables is governed by a “canonical tensor product subfactor” (CTPS) $A_L \otimes A_R \subset B$ given by the respective chiral and 2D local algebras. We have therefore investigated the general structure of CTPS’s and have found a characterization of the two tensor factors being each other’s relative commutants (“normality”) in terms of a coupling matrix. The coupling matrix in this case provides an isomorphism between the respective fusion rules for the involved sectors of the two tensor factors. This abstract result, applied to the quantum field theoretical situation at hand, generalizes a statement on certain “extended” chiral observables in the classification program for 2D modular invariant partition functions, and shows that the latter coincide with the maximal chiral observables.

Exploiting general properties of α-induction and σ-restriction between the superselection sectors of the maximal and the a priori given non-maximal chiral observables, constraints on the coupling matrix (with respect to the non-maximal chiral observables) are derived which are the direct counterparts of similar constraints in the modular classification program. Yet, modular invariance has not been assumed throughout the analysis. This supports our conviction that modular invariants are just one aspect of a deeper and more general mathematical structure (presumably related to “asymptotic subfactors” and “quantum doubles”). A classification in terms of graphs still remains to be established in the general situation.
Possibly, additional constraints originating from locality will play a role here.

Acknowledgement. I am indebted to J. Böckenhauer for many helpful and critical comments on an earlier version of this paper, as well as to A. Recknagel for discussions on modular invariance. I also thank D. Buchholz, B. Schroer and H.-W. Wiesbrock for useful suggestions concerning Sect. 2, and R. Longo for pointing out a difficulty in Sect. 3.

References

1. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. I, Commun. Math. Phys. 197, 361–386 (1998), II, ibid. 200, 57–103 (1999), and III, ibid. 205, 183–229 (1999)
2. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
3. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories. Commun. Math. Phys. 113, 1–26 (1987)
4. Cardy, J.L.: Operator content of two-dimensional conformally invariant theories. Nucl. Phys. B270, 186–204 (1986)
5. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with permutation group statistics and exchange algebras. I, Commun. Math. Phys. 125, 201–226 (1989), and II, Rev. Math. Phys. Special Issue, 113–157 (1992)
6. Fröhlich, J., Gabbiani, F.: Braid statistics in local quantum theory. Rev. Math. Phys. 2, 251–353 (1991)
7. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
8. Gannon, T.: The classification of affine SU(3) modular invariants. Commun. Math. Phys. 161, 233–264 (1994)
9. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–36 (1996)
10. Guido, D., Longo, R., Roberts, J.E., Verch, R.: Charged sectors, spin and statistics in quantum field theory on curved spacetimes. Preprint math-ph/9906019 (1999)
11. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer, 1992
12. Kac, V.G.: The idea of locality. In: “Group 21”, Proceedings Goslar 1996, H.-D. Doebner et al eds., Singapore: World Scientific, 1997, pp. 16–32
13. Kac, V.G., Peterson, D.H.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
14. Kawai, T.: On the structure of fusion algebras. Phys. Lett. B217, 247–251 (1989)
15. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Preprint math.OA/9903104 (1999)
16. Longo, R.: Index of subfactors and statistics of quantum fields. II, Commun. Math. Phys. 130, 285–309 (1990)
17. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
18. Longo, R., Roberts, J.E.: A theory of dimension. K-Theory 11, 103–159 (1997)
19. Lüscher, M., Mack, G.: Global conformal invariance in quantum field theory. Commun. Math. Phys. 41, 203–234 (1975)
20. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Int. J. Math. 8, 249–265 (1997)
21. Müger, M.: On charged fields with group symmetry and degeneracies of Verlinde’s matrix S. hep-th/9705018, to appear in Ann. Inst. H. Poincaré
22. Moore, G., Seiberg, N.: Naturality in conformal field theory. Nucl. Phys. B313, 16–40 (1989)
23. Nahm, W.: A proof of modular invariance. Int. J. Mod. Phys. A6, 2837–2845 (1991)
24. Ocneanu, A.: Quantum symmetry, differential geometry, and classification of subfactors. Univ. Tokyo Seminary Notes 45 (1991) (notes recorded by Y. Kawahigashi)
25. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132, 461–483 (1990)
26. Rehren, K.-H.: In preparation
27. Rehren, K.-H.: Braid group statistics and their superselection rules. In: “The algebraic theory of superselection sectors”, Proceedings Palermo 1989, D. Kastler ed., Singapore: World Scientific, 1990, pp. 333–355
28. Rehren, K.-H., Schroer, B.: Einstein causality and Artin braids. Nucl. Phys. B312, 715–750 (1989)
29. Schellekens, A.N., Yankielowicz, S.: Simple currents, modular invariants, and fixed points. Int. J. Mod. Phys. A5, 2903–2952 (1990)
30. Schroer, B., Wiesbrock, H.-W.: Looking behind the thermal horizon: Hidden symmetries in chiral models. Preprint hep-th/9901031 (1999)
31. Takesaki, M.: Conditional expectations in von Neumann algebras. J. Funct. Anal. 9, 306–321 (1972)
32. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theories. Nucl. Phys. B300, 360–376 (1988)
33. Wiesbrock, H.-W.: Conformal quantum field theory and half-sided modular inclusions of von Neumann algebras. Commun. Math. Phys. 158, 537–544 (1993)
34. Xu, F.: Jones–Wassermann subfactors for disconnected intervals. Preprint q-alg/9704003 (1997)
35. Xu, F.: Algebraic coset conformal field theories. Preprint math.OA/9810035 (1998)

Communicated by H. Araki

Commun. Math. Phys. 208, 713 – 760 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Eigenfunctions and Eigenvalues for a Scalar Riemann–Hilbert Problem Associated to Inverse Scattering

Dmitry E. Pelinovsky, Catherine Sulem

Department of Mathematics, University of Toronto, Toronto, Ontario, M5S 3G3, Canada

Received: 17 December 1998 / Accepted: 21 July 1999

Abstract: A complete set of eigenfunctions is introduced within the Riemann–Hilbert formalism for spectral problems associated to some solvable nonlinear evolution equations. In particular, we consider the time-independent and time-dependent Schrödinger problems which are related to the KdV and KPI equations possessing solitons and lumps, respectively. Non-standard scalar products, orthogonality and completeness relations are derived for these problems. The complete set of eigenfunctions is used for perturbation theory and bifurcation analysis of eigenvalues supported by the potentials under perturbations. We classify two different types of bifurcations of new eigenvalues and analyze their characteristic features. One type corresponds to thresholdless generation of solitons in the KdV equation, while the other predicts a threshold for generation of lumps in the KPI equation.

1. Introduction

1.1. Motivations. Some nonlinear evolution equations have attracted intense studies in past years for their universal appearance in the mathematical description of wave processes in dispersive systems and their remarkable analytical properties. In particular, they are related to linear scattering problems in such a way that the nonlinear analysis of wave systems is possible through the Fourier-type analysis of the direct and inverse scattering transform of their linear counterparts [1]. The spectral data in inverse scattering consist typically of the continuous spectrum eigenfunctions and a discrete number of bound states. The bound states correspond to localized steady-state disturbances such as solitons, lumps, dromions and instantons.

Among many universal properties in inverse scattering, Ablowitz, Kaup, Newell and Segur noticed in their pioneering paper [1] that the set of eigenfunctions for the continuous and discrete spectrum for the AKNS spectral problem is complete, i.e. an arbitrary vector function with appropriate boundary conditions at infinity can be decomposed through this set of eigenfunctions. This property generalizes the Fourier decomposition [2] and

is well-known in spectral theory of linear self-adjoint operators [3,4]. The completeness relation was proved in Ref. [1] by means of the Gelfand–Levitan–Marchenko (GLM) integral equations which appear in the formalism of the inverse scattering transform. In the AKNS spectral problem, the isolated eigenvalues appear as poles of transmission coefficients and correspond to exponentially localized bound states associated to solitons in nonlinear evolution equations. Further development in inverse scattering led to the construction of new linear scattering problems associated to the nonlinear evolution equation in one and two dimensions (see review in [5,6]). In the latter problems, Fokas and Ablowitz [7–9] showed that the isolated eigenvalues appear in homogeneous integral Fredholm equations and the corresponding bound states have algebraic decay at infinity. These bound states are associated to lumps or algebraic solitons in nonlinear evolution equations. The most general formulation of the inverse scattering transform relies on the Riemann-Hilbert (RH) boundary value problem or its generalization, the ∂¯ problem. This setting requires new methods for constructing and studying complete sets of eigenfunctions. A particular spectral system associated to the Benjamin-Ono (BO) equation was studied recently by Kaup, Lakoba, and Matsuno [10–12]. Their results serve as a pivot for our approach to integrable problems associated to the RH formalism. Studies of complete sets of eigenfunctions have many different prospects. First, they provide a basis for the spectral decomposition associated to the given linear problem. Second, they enable us to develop a perturbation theory and study variations of spectral data and eigenfunctions induced by perturbations of the potential. Third, bifurcations of eigenvalues can be analyzed through the expansions over a complete basis, while a standard perturbative analysis usually misses the possibility of such bifurcations. 
We recently obtained [13,14] that, for the spectral problem associated to the BO equation, this bifurcation may happen from the edge of the continuous spectrum when the potential of the scattering problem satisfies a condition of non-genericity. Finally, the orthogonality and completeness relations are used in Hamiltonian formalism of nonlinear evolution equations and construction of Poisson brackets and canonical variables [15]. In this paper we construct a complete set of eigenfunctions associated to the scalar RH formalism. Although our analysis is based on two canonical and physically important problems (Sect. 1.2), it can also be formulated in an abstract form (Sect. 1.3). The main analysis concentrates on the time-independent Schrödinger problem which is associated to solitons of the Korteweg–de Vries (KdV) equation and the time-dependent Schrödinger problem which is associated to lumps of the Kadomtsev–Petviashvili (KPI) equation. We derive non-standard scalar products and orthogonality relations and prove the completeness formula by means of the RH formalism. We then develop a regular perturbation theory from the integral representation of the linear eigenvalue problem and calculate variational derivatives of spectral data in the absence of bifurcations of new eigenvalues. When the integral representation becomes singular, we find the conditions for a new eigenvalue to emerge from the continuous spectrum. These bifurcations are classified into two general types. The type I bifurcation occurs when the marginal eigenfunction at the edge of the continuous spectrum becomes bounded (nonsecular) in space and belongs to the spectrum in contrast to a generic secular eigenfunction which is excluded from the spectrum. The multisoliton solutions are examples of nongeneric potentials and the type I bifurcation occurs under a certain thresholdless perturbation of multisoliton potentials. 
The type II bifurcation occurs when a new bound state is embedded into the continuous spectrum at the bifurcation point and splits apart from the continuous spectrum or disappears upon

Eigenfunctions and Eigenvalues for a Scalar Riemann–Hilbert Problem


a perturbation. This type is not supported by multisoliton potentials. A new eigenvalue appears above a certain threshold on the amplitude of a perturbation to the multisoliton potential. The type I bifurcation is illustrated in Sect. 2 for the time-independent Schrödinger equation. Although our results recover the standard inverse scattering formalism associated to this equation (see Appendix A.2 of Ref. [2]), we introduce and study a new non-standard basis of eigenfunctions within the RH formalism. The type II bifurcation is illustrated in Sect. 3 for the time-dependent Schrödinger equation. To our knowledge, this is the first time that a complete set of eigenfunctions associated to this equation has been found. The methods and results derived for these two basic problems can be generalized to other examples in inverse scattering, which include differential-difference linear systems associated to the Intermediate Long-Wave (ILW) equation and the BO equation as well as vector eigenvalue problems such as the AKNS spectral system in one and two dimensions. A brief review of these spectral problems is given in Sect. 4.

1.2. Linear eigenvalue problems. The inverse scattering theory has been developed for several prototypical examples which include the Kadomtsev–Petviashvili equation referred to as the KPI equation,
\[ (u_t + 6uu_x + u_{xxx})_x = 3u_{yy}. \tag{1.1} \]
It is associated with the time-dependent Schrödinger equation,
\[ i\varphi_y + \varphi_{xx} + u\varphi = 0, \tag{1.2} \]

where u = u(x, y, t) satisfies Eq. (1.1). Inverse scattering for the KPI equation was initiated by Manakov [16] and developed by Fokas and Ablowitz [8] by means of a (nonlocal) RH boundary value problem. In particular, the authors of [8] defined proper eigenfunctions M± and N± of the time-dependent Schrödinger equation (1.2) and incorporated the lump solutions in the inverse scattering scheme. Rigorous results on the solvability of direct and inverse scattering transforms were reported by Beals and Coifman [17], Zhou [18], and Fokas and Sung [19]. More complete results on existence and classification of multiple bound states in the discrete spectrum of the time-dependent Schrödinger equation were recently found by Ablowitz and Villaroel [20–22]. A complete version of the spectral transform for the KPI equation was derived by Boiti et al. [23–25] by means of a formal resolvent approach based on some orthogonality relations for the eigenfunctions of Eq. (1.2). However, their approach does not provide a complete basis of eigenfunctions for the perturbation theory and bifurcation analysis of weakly localized potentials such as multilump potentials. This problem was discussed by Kaup [26], who pointed out that the eigenfunctions of (1.2) are unbounded and incomplete in the Hilbert space if the potential u(x, y) is not absolutely integrable. Recently the inverse scattering transform theory was applied to solve rigorously the initial-value problem for the KPI equation (1.1) with and without the zero mass constraint [27–29]. Uniqueness and existence of the solution were also proved by Fokas and Sung [30,31] under the assumption that the initial data is a “small” function in the Schwartz space. The latter assumption was used to exclude generation of lumps (two-dimensional solitons) in the KPI equation (1.1) by localized initial data. The problem of lump generation in the KPI equation remains open in spite of its applications in water wave theory [6].
Kuznetsov and Turitsyn [32] showed that a single KPI lump is stable against small perturbations. Recent numerical simulations of the


KPI equation (1.1) by He [33] showed that a localized initial condition may lead to the formation of KPI lumps if the amplitude of the initial pulse exceeds a certain threshold value. In the case of y-independent solutions, the KPI equation reduces to the KdV equation,
\[ u_t + 6uu_x + u_{xxx} = 0, \tag{1.3} \]
and the linear system (1.2) to the time-independent Schrödinger equation,
\[ \varphi_{xx} + (u + k^2)\varphi = 0, \tag{1.4} \]

where k is a spectral parameter and u = u(x, t) satisfies Eq. (1.3). The standard complete set of eigenfunctions for this problem is described in Ref. [2]. The spectral properties of the eigenfunctions at the edge of the continuous spectrum were also studied in relation to bifurcation of new eigenvalues in the problem (1.4) [34]. The appearance of a single eigenvalue supported by a small potential for the problem (1.4) was analyzed by direct methods in [35–37]. It was found that the linear problem (1.4) exhibits a single small eigenvalue for infinitely small potentials under the constraint that the area integral of the potential is positive. The same conclusion was also formulated for a perturbation of a single soliton potential [38]. In this paper, we present a systematic method based on the RH problem to derive the spectral decomposition associated to the linear eigenvalue problems in inverse scattering. For the sake of clarity, it is first presented in the context of the time-independent Schrödinger equation (1.4) and then extended to the time-dependent Schrödinger equation (1.2) which is more difficult. In both cases, we use the completeness properties to study bifurcation of new eigenvalues. We recover and generalize some of the results discussed above. In particular, we show that for the spectral problem (1.4), an eigenvalue and its associated bound state exist for an arbitrary small potential u = u(x), while the spectral problem (1.2) does not have eigenvalues and bound states for small potentials u = u(x, y). This feature illustrates the different types of bifurcation of eigenvalues for Eqs. (1.2) and (1.4) (types I and II).
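The statement that problem (1.4) supports a bound state for an arbitrarily small potential of positive area can be illustrated numerically. The following is a minimal sketch, not part of the paper's method: it assumes a finite-difference discretization on a large box with Dirichlet walls, a Gaussian test potential `eps * exp(-x**2)`, and the hypothetical helper name `lowest_eigenvalue`. Writing Eq. (1.4) as −ϕ'' − uϕ = Eϕ with E = k², a bound state k = iκ corresponds to E = −κ² < 0, and to leading order in a small potential one expects κ ≈ ½∫u dx.

```python
import numpy as np

def lowest_eigenvalue(u, half_width=80.0, n=1601):
    """Lowest eigenvalue E of -phi'' - u(x) phi = E phi on [-half_width, half_width].

    Assumption: second-order central differences with Dirichlet walls.
    Eq. (1.4) corresponds to E = k^2, so a bound state k = i*kappa has E = -kappa^2.
    """
    x = np.linspace(-half_width, half_width, n)
    h = x[1] - x[0]
    main = 2.0 / h**2 - u(x)              # diagonal of the discretized operator
    off = -np.ones(n - 1) / h**2          # nearest-neighbour coupling
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[0]       # eigenvalues are returned in ascending order

eps = 0.1                                  # small amplitude (illustrative choice)
area = eps * np.sqrt(np.pi)                # integral of eps * exp(-x^2)

E_pos = lowest_eigenvalue(lambda x: eps * np.exp(-x**2))    # positive area
E_neg = lowest_eigenvalue(lambda x: -eps * np.exp(-x**2))   # negative area

kappa = np.sqrt(-E_pos)
print("kappa (numerical):", kappa)
print("area / 2         :", area / 2)
print("lowest E, negative-area potential:", E_neg)
```

With the positive-area potential the lowest eigenvalue is negative and κ is close to half the area; with the sign of the potential reversed no negative eigenvalue appears. The box must be several decay lengths 1/κ wide for the comparison to be meaningful.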

1.3. RH formalism and eigenfunctions. A Riemann–Hilbert boundary value problem in a complex plane ($z \in \mathbb{C}$) consists in reconstructing meromorphic functions $\mu_\pm(z)$ outside of a contour $\Gamma \subset \mathbb{C}$ according to a given jump at the contour,
\[ \mu_+(z) - \mu_-(z) = T\mu_-(z), \tag{1.5} \]
where $T$ is an operator and the functions $\mu_\pm(z)$ satisfy the boundary conditions
\[ \lim_{|z|\to\infty} \mu_\pm(z) = 1 \]
in the corresponding domains of $\mathbb{C}$. In inverse scattering [5,6], the RH problem appears typically if the scattering problem has a single spectral parameter (say $k$) and the continuous spectrum is located at real values of $k$, i.e. $z = k$ and $\Gamma$ is the real axis of $k$. This problem relates two Jost functions $M_\pm(x,k)$ which are generally $(n \times n)$ matrices and depend on $m$ variables $x_1, x_2, \ldots, x_m$. In what follows, we restrict ourselves to scalar RH problems ($n = 1$) in one dimension ($x_1 = x$)


or two dimensions ($x_1 = x$ and $x_2 = y$). The Jost functions $M_\pm(x,k)$ are introduced as particular solutions of Fredholm's integral equations in Green's function representation,
\[ M_\pm(x,k) = 1 + \int_{-\infty}^{\infty} G_\pm(x - x', k)\,u(x')\,M_\pm(x',k)\,dx'. \tag{1.6} \]
Here $u(x)$ is a real-valued potential, $G_+(x,k)$ and $G_-(x,k)$ are Green's functions which are supposed to be analytic in $\mathrm{Im}(k) \ge 0$ and $\mathrm{Im}(k) \le 0$ respectively, and satisfy $\lim_{|k|\to\infty} G_\pm(x,k) = 0$. Taking the derivative $\partial/\partial\bar{k}$ in Eq. (1.6), where $\bar{k}$ is the complex conjugate of $k$, we find that the eigenfunctions $M_\pm(x,k)$ are analytic functions of $k$ in the domains of analyticity of $G_\pm(x,k)$ if there are no homogeneous solutions of Fredholm's integral equations (1.6). On the other hand, if Fredholm's integral equations (1.6) do possess homogeneous solutions at a number of isolated points $k$ of the complex domain, then the eigenfunctions $M_\pm(x,k)$ are meromorphic functions of $k$. We refer to the bound states in the former case as solitons and in the latter case as lumps. At real $k$, the limiting values of the eigenfunctions $M_\pm(x,k)$ are related by the following scattering problems,
\[ \frac{M_+(x,k)}{a_+(k)} - M_-(x,k) = \rho_-(k)N_-(x,k), \tag{1.7} \]
\[ M_+(x,k) - \frac{M_-(x,k)}{a_-(k)} = \rho_+(k)N_+(x,k). \tag{1.8} \]

Here $a_\pm(k)$ are the inverse transmission coefficients. The coefficients $\rho_\pm(k)$ represent scattering data and the eigenfunctions $N_\pm(x,k)$ are linearly independent solutions of the spectral system with non-constant boundary conditions at infinity,
\[ N_\pm(x,k) = e^{i\beta(x,k)} + \int_{-\infty}^{\infty} G_\pm(x - x', k)\,u(x')\,N_\pm(x',k)\,dx'. \tag{1.9} \]
The coefficients $a_\pm(k)$ are identically equal to unity for problems associated to lumps and are not constant for problems associated to solitons. In the latter problems, the coefficients $a_\pm(k)$ have the same analyticity properties as the eigenfunctions $M_\pm(x,k)$ subject to the following boundary conditions,
\[ \lim_{|k|\to\infty} M_\pm(x,k) = \lim_{|k|\to\infty} a_\pm(k) = 1. \tag{1.10} \]

Combining all these facts, the scattering problem (1.7) (or, equivalently, Eq. (1.8)) defines an RH boundary-value problem, if the eigenfunctions $N_\pm(x,k)$ can be expressed through $M_\pm(x,k)$ by additional symmetry formulas,
\[ N_\pm(x,k) = F M_\pm(x,k)\, e^{i\beta(x,k)}, \tag{1.11} \]

where F is an operator. Bound states are to be added to the problem (1.7) and (1.11) as pole contributions in the meromorphic functions [a± (k)]−1 M± (x, k). Then, a closed solution of the RH problem can be found (see Appendix A1 in [6]), from which the potential is recovered. In a simplified version, the inverse scattering scheme is a sequence of transformations from the given potential u = u(x, 0) to the set of eigenfunctions S(0) for the associated linear problem, then to the spectral data R(0) with simple evolution in time R = R(t),


then back to the set of eigenfunctions $S = S(t)$ via self-consistent integral equations and finally back to the potential $u = u(x,t)$. This sequence of transformations generalizes the Fourier transform, which is based on orthogonality and completeness relations of the trigonometric functions. Similarly, a closure of the general scheme at $t = 0$ implies the existence of a complete basis of eigenfunctions for the direct and inverse spectral transforms. However, the orthogonality and completeness relations for the eigenfunctions used in inverse scattering are not usually considered because their derivation may be laborious. Moreover, it is not always clear how to choose a proper basis for these transformations. For example, it is natural to use the eigenfunctions $M_\pm(x,k)$ for a characterization of the inverse scattering problem, whereas these functions do not form a complete basis. Our main idea is that each linear problem associated to a nonlinear evolution equation provides a natural set of orthogonal and complete eigenfunctions which forms a basis of the inverse scattering transform. The complete set of eigenfunctions consists of the eigenfunctions $N_\pm(x,k)$ and associated bound states and characterizes all other data of the spectral transform, including the associated eigenfunctions $M_\pm(x,k)$, the spectral data $a_\pm(k)$ and $\rho_\pm(k)$, and the potential $u(x)$. We prove this statement in Sects. 2 and 3 for the particular scattering problems (1.2) and (1.4).

2. Time-Independent Schrödinger Equation

The local RH problem (1.7) appears for the spectral problem (1.4) after the transformation $\varphi = m e^{-ikx}$, where the function $m = m(x,k)$ satisfies the problem
\[ m_{xx} - 2ikm_x + u(x)m = 0. \tag{2.1} \]

We suppose that the function $u(x)$ is real, smooth and belongs to $L^p$ for any $p \ge 1$. These requirements are satisfied for multisoliton potentials of the KdV equation (3.8) since such potentials have an exponential decay at infinity. The dependence of the potential and the eigenfunctions on evolution time $t$ will be omitted henceforth. The standard complete set of eigenfunctions is described in Appendix A.2 of Ref. [2]. Here we view the problem by means of the RH formalism and introduce a new non-standard complete set of eigenfunctions.

2.1. Spectrum and scattering data. Two fundamental solutions $M_\pm(x,k)$ of Eq. (2.1) can be extended analytically for $\mathrm{Im}(k) \ge 0$ and $\mathrm{Im}(k) \le 0$ according to the integral representation (1.6). The corresponding Green's functions have the form [6]:
\[ G_\pm(x,k) = \pm\frac{1}{2ik}\left(1 - e^{2ikx}\right)\Theta(\pm x), \tag{2.2} \]
where $\Theta(x) = 1$ if $x > 0$ and $\Theta(x) = 0$ if $x < 0$. The other two fundamental solutions $N_\pm(x,k)$ can be found from Eqs. (1.9) with $\beta(x,k) = 2kx$. The eigenfunctions $M_\pm(x,k)$ and $N_\pm(x,k)$ satisfy the following boundary conditions in the limit $x \to \mp\infty$,
\[ M_\pm(x,k) \to 1, \qquad N_\pm(x,k) \to e^{2ikx}. \tag{2.3} \]

Taking the limits $x \to \pm\infty$ in the Green's function representation (1.6) and using Eqs. (2.2) and (2.3), we find the scattering relations (1.7) and (1.8) with the spectral data $\rho_\pm = b_\mp(k)/a_\mp(k)$. The coefficients $a_\pm(k)$ and $b_\pm(k)$ can be expressed through $M_\pm(x,k)$ as
\[ a_\pm(k) = 1 \pm \frac{1}{2ik}\int_{-\infty}^{\infty} u(x)M_\pm(x,k)\,dx, \tag{2.4} \]
\[ b_\pm(k) = -\frac{1}{2ik}\int_{-\infty}^{\infty} u(x)M_\pm(x,k)e^{-2ikx}\,dx. \tag{2.5} \]
The scattering coefficients satisfy the constraints [6]
\[ a_-(k) = a_+^*(k), \qquad b_-(k) = b_+(k), \tag{2.6} \]
\[ a_\pm(k) = a_\pm^*(-k), \qquad b_\pm(k) = b_\pm^*(-k), \tag{2.7} \]
and
\[ |a_+(k)|^2 = 1 + |b_+(k)|^2. \tag{2.8} \]
Using these relations, we deduce from Eqs. (1.7) and (1.8) the boundary conditions for the eigenfunctions $M_\pm(x,k)$ and $N_\pm(x,k)$ in the limits $x \to \pm\infty$,
\[ M_\pm(x,k) \to a_\pm(k) \pm b_\pm(k)e^{2ikx}, \tag{2.9} \]
\[ N_\pm(x,k) \to a_\mp(k)e^{2ikx} \pm b_\pm^*(k). \tag{2.10} \]
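The relations (2.3), (2.8) and (2.9) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it integrates Eq. (2.1) with a hand-written classical Runge–Kutta scheme from a large negative x, where M₊ = 1 by (2.3), and reads off a₊ and b₊ from the asymptotic form (2.9) at a large positive x. The function name `scattering_coefficients`, the truncation length and the step count are hypothetical choices. The one-soliton potential u = 2 sech²x is used as a test case; for it a direct calculation (not quoted from the text) gives a₊(k) = (k − i)/(k + i) and b₊ ≡ 0.

```python
import numpy as np

def scattering_coefficients(u, k, x_max=20.0, steps=8000):
    """Return (a_plus, b_plus) for real k != 0.

    Integrates m'' - 2ik m' + u m = 0 (Eq. (2.1)) from -x_max with m = 1, m' = 0
    (the M_+ normalization of Eq. (2.3)) using classical RK4, then matches
    M_+ = a_+ + b_+ exp(2ikx) at x = +x_max (Eq. (2.9)).
    """
    h = 2.0 * x_max / steps

    def rhs(x, y):
        m, p = y                                   # p stands for dm/dx
        return np.array([p, 2j * k * p - u(x) * m])

    x = -x_max
    y = np.array([1.0 + 0j, 0.0 + 0j])
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h / 2 * k1)
        k3 = rhs(x + h / 2, y + h / 2 * k2)
        k4 = rhs(x + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    m, p = y
    b = p * np.exp(-2j * k * x) / (2j * k)         # from m' = 2ik b exp(2ikx)
    a = m - p / (2j * k)                           # from m  = a + b exp(2ikx)
    return a, b

soliton = lambda x: 2.0 / np.cosh(x) ** 2          # one-soliton potential
gaussian = lambda x: np.exp(-x ** 2)               # a generic test potential

a_s, b_s = scattering_coefficients(soliton, 1.0)
a_g, b_g = scattering_coefficients(gaussian, 0.7)

print("soliton : a_+ =", a_s, " |b_+| =", abs(b_s))
print("gaussian: |a_+|^2 - |b_+|^2 =", abs(a_g) ** 2 - abs(b_g) ** 2)
```

For the Gaussian the combination |a₊|² − |b₊|² should reproduce the constraint (2.8), while for the soliton the reflection-type coefficient b₊ should vanish to numerical accuracy.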

When $k \to \infty \pm i0$, the eigenfunctions $M_\pm(x,k)$ have the asymptotic representation
\[ M_\pm(x,k) = 1 + \frac{1}{2ik}\int_{\mp\infty}^{x} u(x')\,dx' + O(k^{-2}). \tag{2.11} \]
This formula follows from Eqs. (1.6) and (2.2). The scattering relation (1.7) defines the (local) RH boundary-value problem for $M_\pm(x,k)$. The closure relations (1.11) follow from the symmetry of the Green's functions, $G_\pm(x,k) = G_\pm^*(x,-k) = G_\pm^*(x,k)e^{2ikx}$, and have the form
\[ N_\pm(x,k) = N_\pm^*(x,-k) = M_\pm^*(x,k)e^{2ikx}. \tag{2.12} \]

Bound states for Eq. (2.1) exist for eigenvalues given by the zeros of $a_+(k)$ in the upper half-plane of $k$ and the zeros of $a_-(k)$ in the lower half-plane. Zeros of $a_\pm(k)$ are simple [6] and located symmetrically on the imaginary axis of $k$ due to the constraints imposed on $a_\pm(k)$. These bound states correspond to exponentially localized solitons of the KdV equation (1.3). The two RH problems (1.7) and (1.8) supplemented by the boundary conditions (1.10) and the closure relation (2.12) can be solved in the form
\[ M_\pm(x,k) = 1 + \sum_{j=1}^{n} \frac{c_j^\mp \Phi_j^\mp(x)}{k - k_j^\mp} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k')N_\pm(x,k')\,dk'}{k' - (k \pm i0)}, \tag{2.13} \]
or, equivalently,
\[ \frac{M_\pm(x,k)}{a_\pm(k)} = 1 + \sum_{j=1}^{n} \frac{c_j^\pm \Phi_j^\pm(x)}{k - k_j^\pm} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\mp(k')N_\mp(x,k')\,dk'}{k' - (k \pm i0)}, \tag{2.14} \]
where $\Phi_j^\pm(x)$ are the bound states, the eigenvalues $k_j^\pm$ satisfy the constraints $k_j^\pm = \pm i\kappa_j$ due to the symmetry ($\kappa_j > 0$), $n$ is the number of bound states, and $c_j^\pm$ are renormalization constants. The limiting relations for the eigenfunctions $M_\pm(x,k)$ approaching bound states are
\[ \lim_{k \to k_j^\pm} M_\pm(x,k) = \gamma_j^\pm \Phi_j^\pm(x), \tag{2.15} \]
where $\gamma_j^\pm = c_j^\pm (a_j')^\pm$ are real coefficients. Using the symmetry (2.6) and (2.7), we write the coefficients as $c_j^\pm = \pm iC_j^\pm$ and
\[ (a_j')^\pm = \left.\frac{da_\pm(k)}{dk}\right|_{k = k_j^\pm} = \pm i a_j', \]

where $C_j^\pm$ and $a_j'$ are real. The bound states $\Phi_j^\pm(x)$ are real functions satisfying the inhomogeneous integral equations,
\[ \Phi_j^\pm(x) = (\gamma_j^\pm)^{-1} + \int_{-\infty}^{\infty} G_\pm(x - x', k_j^\pm)\,u(x')\,\Phi_j^\pm(x')\,dx', \tag{2.16} \]
with the boundary conditions
\[ \Phi_j^\pm(x) \to \begin{cases} (\gamma_j^\pm)^{-1} & \text{as } x \to \mp\infty, \\ O(e^{\mp 2\kappa_j x}) & \text{as } x \to \pm\infty. \end{cases} \tag{2.17} \]
We notice that the bound states $\Phi_j^\pm(x)$ are not localized in the limit $x \to \mp\infty$. Using the boundary conditions for $G_\pm(x,k)$ we find the following integral representation,
\[ \kappa_j = \frac{\gamma_j^\pm}{2}\int_{-\infty}^{\infty} u(x)\Phi_j^\pm(x)\,dx, \tag{2.18} \]
which is also a condition for $a_\pm(k)$ to have a zero at $k = k_j^\pm = \pm i\kappa_j$ (see Eq. (2.4)). In addition, comparing the boundary values (2.3) and (2.17), we normalize the bound states according to the limiting relations,
\[ \lim_{k \to k_j^\mp} N_\pm(x,k) = \Phi_j^\mp(x), \tag{2.19} \]
or, equivalently, according to the boundary conditions $\Phi_j^\pm(x) \to e^{\mp 2\kappa_j x}$ as $x \to \pm\infty$. This renormalization leads by virtue of Eqs. (2.12) to the relations
\[ \Phi_j^\mp(x) = \gamma_j^\pm \Phi_j^\pm(x) e^{\pm 2\kappa_j x}. \tag{2.20} \]
It follows from Eqs. (2.20) that the coefficients $C_j^\pm$ and $\gamma_j^\pm$ satisfy the constraints
\[ C_j^+ C_j^- (a_j')^2 = 1, \qquad \gamma_j^+ \gamma_j^- = 1. \tag{2.21} \]
The set of coefficients $\{a_\pm(k), b_\pm(k)\}$ represents the spectral data for the continuous spectrum of the linear problem (2.1) while the set $\{k_j^\pm, \gamma_j^\pm\}_{j=1}^{n}$ corresponds to the data for the discrete spectrum. The separation of the discrete and continuous spectra follows from the analysis of the asymptotic behavior of the spectral data in the limit $k \to 0$.


Definition 2.1. The potential $u(x)$ is called a generic potential of type I if the limiting point $k = 0$ is excluded from the continuous spectrum, i.e. the limiting eigenfunctions $M_\pm(x,0)$ are not bounded in $x$ as $|x| \to \infty$ and the spectral coefficients $a_\pm(k)$ are not bounded in $k$ as $k \to 0$, so that $\lim_{k\to 0}[a_\pm(k)]^{-1}M_\pm(x,k) = 0$. Otherwise, the potential is called a nongeneric potential of type I.

The asymptotic behavior of the scattering data as $k \to 0$ follows from Eqs. (2.4) and (2.5),
\[ a_\pm(k) \to \pm\frac{m_{-1}}{2ik} + O(1), \qquad b_\pm(k) \to -\frac{m_{-1}}{2ik} + O(1), \tag{2.22} \]
where
\[ m_{-1} = \int_{-\infty}^{\infty} u(x)M_+(x,0)\,dx = \int_{-\infty}^{\infty} u(x)M_-(x,0)\,dx, \tag{2.23} \]
and the eigenfunctions $M_\pm(x,0)$ are real and satisfy the integral equations
\[ M_\pm(x,0) = 1 - \int_{\mp\infty}^{x} (x - x')\,u(x')\,M_\pm(x',0)\,dx'. \tag{2.24} \]
These eigenfunctions have a secular growth in $x$ at infinity according to the boundary conditions
\[ M_\pm(x,0) \to \begin{cases} 1 & \text{as } x \to \mp\infty, \\ 1 \pm m_0^\pm \mp m_{-1}x & \text{as } x \to \pm\infty, \end{cases} \tag{2.25} \]
where
\[ m_0^\pm = \int_{-\infty}^{\infty} x\,u(x)\,M_\pm(x,0)\,dx. \tag{2.26} \]
Thus, if $m_{-1} \neq 0$, the limiting point $k = 0$ is excluded from the continuous spectrum and the potential $u(x)$ is a generic potential of type I. On the other hand, if $m_{-1} = 0$, the secularities of the spectral data as $k \to 0$ disappear and the limiting eigenfunctions $M_\pm(x,0)$ become bounded and related as
\[ M_-(x,0) = (1 - m_0^-)M_+(x,0), \qquad M_+(x,0) = (1 + m_0^+)M_-(x,0). \tag{2.27} \]
In this case, the potential $u(x)$ is a nongeneric potential of type I and the limiting point $k \to 0$ belongs to the continuous spectrum as
\[ a_\pm(k) \to a_0 + O(k), \qquad b_\pm(k) \to b_0 + O(k), \tag{2.28} \]
where the real coefficients $a_0$ and $b_0$ are expressed through $m_0^+$,
\[ a_0 = 1 + \frac{(m_0^+)^2}{2(1 + m_0^+)}, \qquad b_0 = m_0^+ - \frac{(m_0^+)^2}{2(1 + m_0^+)}, \tag{2.29} \]
or, equivalently, through $m_0^-$ according to the relation
\[ m_0^- = \frac{m_0^+}{1 + m_0^+}. \]
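The nongeneric case can be made concrete with the one-soliton potential u = 2 sech²x. The sketch below is an illustration only; the closed forms M₊(x, 0) = −tanh x and M₋(x, 0) = tanh x are assumptions worked out for this example (they solve Eq. (2.1) at k = 0 and satisfy the normalization (2.3)), not formulas quoted from the text. The script evaluates the moments (2.23) and (2.26) by the trapezoid rule and compares them with the relations (2.27) to (2.29).

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 60001)
h = x[1] - x[0]
u = 2.0 / np.cosh(x) ** 2          # one-soliton potential
M_plus = -np.tanh(x)               # assumed M_+(x, 0): tends to 1 as x -> -infinity
M_minus = np.tanh(x)               # assumed M_-(x, 0): tends to 1 as x -> +infinity

def integrate(f):
    """Trapezoid rule on the uniform grid defined above."""
    return h * (np.sum(f) - 0.5 * (f[0] + f[-1]))

# Residual of m'' + u m = 0 (Eq. (2.1) at k = 0) by central differences.
second_diff = (M_plus[2:] - 2.0 * M_plus[1:-1] + M_plus[:-2]) / h ** 2
residual = np.max(np.abs(second_diff + (u * M_plus)[1:-1]))

m_minus1 = integrate(u * M_plus)         # Eq. (2.23)
m0_plus = integrate(x * u * M_plus)      # Eq. (2.26), upper sign
m0_minus = integrate(x * u * M_minus)    # Eq. (2.26), lower sign

a0 = 1.0 + m0_plus ** 2 / (2.0 * (1.0 + m0_plus))       # Eq. (2.29)
b0 = m0_plus - m0_plus ** 2 / (2.0 * (1.0 + m0_plus))   # Eq. (2.29)

print("ODE residual:", residual)
print("m_{-1} =", m_minus1)
print("m_0^+ =", m0_plus, " m_0^- =", m0_minus)
print("a_0 =", a0, " b_0 =", b0)
```

Here m₋₁ vanishes, so the potential is nongeneric in the sense of Definition 2.1, and b₀ = 0 agrees with the reflectionless character of multisoliton potentials noted in the text.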


Multisoliton potentials are particular examples of nongeneric potentials of type I since they display a non-secular behavior of $b_\pm(k)$, i.e. $b_\pm(k) \equiv 0$ [2]. The asymptotic expressions (2.22) for spectral data were analyzed in Ref. [34], where the number $n$ of bound states was related to a finite value of $\arg a_\pm(0)$,
\[ \arg a_\pm(0) = \mp \begin{cases} \pi\left(n - \tfrac{1}{2}\right) & \text{if } m_{-1} \neq 0, \\ \pi n & \text{if } m_{-1} = 0. \end{cases} \tag{2.30} \]
Thus, the constraint $m_{-1} = 0$ changes the spectral data and may result in a change of the number of bound states (i.e. in bifurcation of a new eigenvalue of the linear system). This is the type I bifurcation analyzed in Sect. 2.4.

2.2. Scalar products, orthogonality and completeness relations. According to Eqs. (2.13) and (2.14), the eigenfunctions $M_\pm(x,k)$ can be characterized by either of the two sets $S^\pm = \left[N_\pm(x,k), \{\Phi_j^\mp(x)\}_{j=1}^{n}\right]$. Furthermore, the spectral data $\{a_\pm(k), b_\pm(k)\}$ and $\{k_j^\pm, \gamma_j^\pm\}_{j=1}^{n}$ can be expressed through the functions of the sets $S^\pm$ according to formulas (2.4), (2.5), (2.12), (2.17), and (2.18). Also the potential $u(x)$ is related to the sets $S^\pm$ by
\[ \int_{\mp\infty}^{x} u(x)\,dx = -\frac{1}{\pi}\int_{-\infty}^{\infty} \rho_\pm(k)N_\pm(x,k)\,dk \pm 2\sum_{j=1}^{n} C_j^\mp \Phi_j^\mp(x). \tag{2.31} \]
This formula results from Eqs. (2.11) and (2.13) in the limit $k \to \infty$. Thus, the scheme for closure of the spectral transform holds for the sets $S^\pm$. We now prove the following main result.

Proposition 2.2. An arbitrary scalar function $f(x)$ with the boundary conditions
\[ \lim_{x \to \pm\infty} f(x) = f_\pm, \]
where $f_\pm$ are constants, can be decomposed through the orthogonal and complete set of eigenfunctions $S^+$ if $f_- = 0$ or through its dual set $S^-$ if $f_+ = 0$.

The proof of this proposition is based on two lemmas.

Lemma 2.3. The eigenfunctions $N_\pm(x,k)$ and $\{\Phi_j^\mp(x)\}_{j=1}^{n}$ introduced in Sect. 2.1 satisfy the orthogonality relations
\[ \langle N_\mp(k') | N_\pm(k) \rangle = 2\pi i k\, a_\mp(k)\,\delta(k - k'), \tag{2.32} \]
\[ \langle \Phi_j^\pm | N_\pm(k) \rangle = \langle N_\pm(k) | \Phi_j^\pm \rangle = 0, \tag{2.33} \]
\[ \langle \Phi_l^\pm | \Phi_j^\mp \rangle = \mp\kappa_j a_j' \delta_{jl}, \tag{2.34} \]
where the scalar product is defined by
\[ \langle g(k') | h(k) \rangle = \int_{-\infty}^{\infty} g^*(x,k')\,\partial_x h(x,k)\,dx. \tag{2.35} \]

(2.35)


Proof. First, we derive the Wronskian relation for two solutions $h(k)$ and $g(k')$ of Eq. (2.1) with a real potential $u(x)$,
\[ \frac{d}{dx}\left[ g^*(k')h_x(k) - g_x^*(k')h(k) - 2ik'g^*(k')h(k) \right] = 2i(k - k')\,g^*(k')h_x(k). \tag{2.36} \]
Then, we integrate Eq. (2.36) for $h(k) = N_\pm(x,k)$ and $g^*(k') = N_\mp^*(x,k')$ over $x$ and use the boundary conditions (2.10) and the formula for generalized functions,
\[ \lim_{L \to \pm\infty} e^{ikL} = \pm\pi i k\,\delta(k). \tag{2.37} \]
As a result, we find Eq. (2.32). The zero scalar products in Eqs. (2.33) and (2.34) follow also from Eq. (2.36) for different bound states. In order to find the nonzero scalar products (2.34), we integrate Eq. (2.36) for $h(k) = M_\pm(x,k)$ and $g^*(k') = \Phi_j^\pm(x)$ over $x$ and use the boundary conditions (2.3) and (2.17). As a result, we find the integral relation
\[ 2i(k - k_j^\mp)\int_{-\infty}^{\infty} \Phi_j^\pm(x)\,\partial_x M_\pm(x,k)\,dx = 2\kappa_j (\gamma_j^\pm)^{-1}. \]
This equation reduces to Eq. (2.34) after computing the integral on the left-hand side with the help of Eq. (2.13) and the zero scalar products (2.33) and (2.34). □

The proof of the orthogonality relations uses only the direct analysis of the spectral problem (2.1). The next lemma formulates the completeness relation. It will be proved by using equations of the inverse scattering transform.

Lemma 2.4. The eigenfunctions $N_\pm(x,k)$ and $\{\Phi_j^\mp(x)\}_{j=1}^{n}$ satisfy the completeness relations
\[ \pm\Theta[\pm(x - y)] = \int_{-\infty}^{\infty} \frac{N_\mp^*(y,k)N_\pm(x,k)\,dk}{2\pi i (k \mp i0)\,a_\mp(k)} \mp \sum_{j=1}^{n} \frac{\Phi_j^\pm(y)\Phi_j^\mp(x)}{\kappa_j a_j'}. \tag{2.38} \]

Proof. First, we close Eq. (2.13) with the help of Eqs. (2.12) and (2.15). As a result, we find a system of integral and algebraic relations for the eigenfunctions $N_\pm(x,k)$ and $\Phi_j^\mp(x)$,
\[ N_\pm(x,k) = e^{2ikx}\left[ 1 \pm \sum_{j=1}^{n} \frac{iC_j^\mp \Phi_j^\mp(x)}{k \mp i\kappa_j} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k')N_\pm(x,k')\,dk'}{k' + k \mp i0} \right], \tag{2.39} \]
\[ \Phi_j^\mp(x) = e^{\pm 2\kappa_j x}\left[ 1 - \sum_{l=1}^{n} \frac{C_l^\mp \Phi_l^\mp(x)}{\kappa_j + \kappa_l} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k)N_\pm(x,k)\,dk}{k \mp i\kappa_j} \right]. \tag{2.40} \]
We express $N_\mp^*(y,k)$ by using Eqs. (1.7) and (2.12),
\[ N_\mp^*(y,k) = a_\mp(k)N_\pm^*(y,k) \mp b_\mp(k)N_\pm(y,k)e^{-2iky}. \tag{2.41} \]


The product $N_\pm^*(y,k)N_\pm(x,k)$ can be found from Eqs. (2.39) and (2.40) using the pole decomposition,
\[ N_\pm^*(y,k)N_\pm(x,k) = e^{2ik(x-y)}\left[ 1 \pm \sum_{j=1}^{n} \frac{\Phi_j^\pm(y)\Phi_j^\mp(x)}{i(k \mp i\kappa_j)a_j'} \pm \sum_{j=1}^{n} \frac{\Phi_j^\mp(y)\Phi_j^\pm(x)}{i(k \pm i\kappa_j)a_j'} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k')N_\pm(y,k')N_\pm(x,k')e^{-2ik'y}\,dk'}{k' + k \mp i0} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k')N_\pm(y,k')N_\pm(x,k')e^{-2ik'x}\,dk'}{k' - k \mp i0} \right]. \]
Then, the following integral can be evaluated using the residue theorem,
\[ \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{N_\pm^*(y,k)N_\pm(x,k)\,dk}{k \mp i0} = \pm\Theta[\pm(x-y)]\left[ 1 + \sum_{j=1}^{n} \frac{\Phi_j^\pm(y)\Phi_j^\mp(x)}{\kappa_j a_j'} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k)N_\pm(y,k)N_\pm(x,k)e^{-2iky}\,dk}{k \mp i0} \right] \pm \Theta[\mp(x-y)]\left[ \sum_{j=1}^{n} \frac{\Phi_j^\pm(y)\Phi_j^\mp(x)}{\kappa_j a_j'} + \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{\rho_\pm(k)N_\pm(y,k)N_\pm(x,k)e^{-2iky}\,dk}{k \mp i0} \right]. \tag{2.42} \]
Substituting Eqs. (2.41) and (2.42) into the integral on the right-hand side of Eq. (2.38), we derive the completeness relation (2.38). □

Proof of Proposition 2.2. Using Lemmas 2.3 and 2.4, we decompose the function $f(x)$ into two equivalent integral representations,
\[ f(x) = f_\mp + \int_{-\infty}^{\infty} \alpha_\pm(k)N_\pm(x,k)\,dk + \sum_{j=1}^{n} \alpha_j^\mp \Phi_j^\mp(x), \tag{2.43} \]
where $\alpha_\pm(k)$ and $\alpha_j^\mp$ are coefficients of the expansion and $f_\pm$ are constants defined by boundary conditions for $f(x)$. The coefficients of the expansion can be expressed through the function $f(x)$ by means of Eqs. (2.32)–(2.34),
\[ \alpha_\pm(k) = \frac{\langle N_\mp(k) | f \rangle}{2\pi i (k \mp i0)\,a_\mp(k)}, \qquad \alpha_j^\mp = \mp\frac{\langle \Phi_j^\pm | f \rangle}{\kappa_j a_j'}. \]

Then, Eq. (2.43) reduces to an identity by means of Eq. (2.38). □

The completeness relations (2.38) and scalar products (2.35) for the new complete set of eigenfunctions differ from the standard relations for Jost eigenfunctions of the time-independent Schrödinger problem (see Appendix A.2 in Ref. [2]). This is due to the derivative term $\partial_x$ appearing in the problem (2.1) in front of the spectral parameter $k$. Since only the derivatives of $f(x)$ determine the coefficients in Eq. (2.43), the function $f(x)$ need not be localized at infinity. Another related feature is that we have to pass by the singular point $k = 0$ in the completeness relations (2.38) into


the corresponding complex extensions of k, where the functions a± (k) are analytic. We notice that the spectral problem (2.1) is not self-adjoint in contrast with the original problem (1.4). Furthermore, the scalar product (2.35) is not a proper inner product since it is not sign-definite [4]. However, the problem (2.1) inherits some features of the self-adjoint problem. In particular, the orthogonal sets S + and S − are self-dual, i.e. complex-conjugate eigenfunctions of S − are adjoint to eigenfunctions of S + and vice versa. It is important to point out that the relation (2.31) for the inverse scattering transform is a particular application of Eq. (2.43). The coefficients ρ± (k) and Cj∓ play the role of Fourier coefficients. Indeed, using the orthogonality relations (2.32) and (2.34), one can express these coefficients through the potential u(x) according to Eqs. (2.5), (2.12), and (2.18). Thus, the sets S ± represent the only basis for closure of direct and inverse scattering transforms. Since two alternative (self-dual) orthogonal and complete sets of eigenfunctions have been constructed, we can now study perturbations of the potential and the associated transformation of the spectrum of Eq. (2.1).

2.3. Perturbation theory for spectral data. The spectral data can be evaluated explicitly only in some special cases such as multisoliton potentials. Therefore, perturbation theory for the scattering data under a perturbation of the potential is an effective tool to study characteristic features of a given scattering problem. Furthermore, the dynamics of solitons in nearly integrable systems can be investigated with the help of the same perturbation theory (see reviews in Refs. [39,40]). The results of the perturbation theory for the time-independent Schrödinger equation are now well-known and have been used many times. Here we reproduce these results within the self-consistent scheme given in Sect. 2.2. Suppose that the potential can be decomposed as $u^\epsilon = u(x) + \epsilon\Delta u(x)$, where $\epsilon \ll 1$ and the complete sets of eigenfunctions $S^\pm = \left[N_\pm(x,k), \{\Phi_j^\mp(x)\}_{j=1}^{n}\right]$ are associated to the potential $u(x)$. Here we evaluate variations of the spectral data due to the perturbation $\Delta u(x)$.

2.3.1. Variations of data of discrete spectrum. Suppose that $\Phi_j^{\mp\epsilon}(x)$ solves Eq. (2.1) for $u^\epsilon = u(x) + \epsilon\Delta u(x)$ with the eigenvalue $k = k_j^{\mp\epsilon} = \mp i\kappa_j^\epsilon$. We expand $\Phi_j^{\mp\epsilon}(x)$ through the sets $S^\pm$ according to Eq. (2.43) rewritten as
\[ \Phi_j^{\mp\epsilon}(x) = \int_{-\infty}^{\infty} \frac{\alpha_\pm(k)N_\pm(x,k)\,dk}{4\pi(k \mp i0)\,a_\mp(k)(k \pm i\kappa_j^\epsilon)} + \sum_{l=1}^{n} \frac{\alpha_l^\mp \Phi_l^\mp(x)}{2\kappa_l a_l'(\kappa_l - \kappa_j^\epsilon)}. \tag{2.44} \]
The eigenvalue problem (2.1) reduces with the help of Eqs. (2.32)–(2.34) and (2.44) to an equivalent set of homogeneous integral equations for the coefficients $\alpha_\pm(k)$ and $\alpha_l^\mp$,
\[ \alpha_\pm(k) = \epsilon\left[ \int_{-\infty}^{\infty} \frac{K_\pm(k,k')\alpha_\pm(k')\,dk'}{4\pi(k' \mp i0)\,a_\mp(k')(k' \pm i\kappa_j^\epsilon)} + \sum_{l=1}^{n} \frac{K_{\mp l}(k)\alpha_l^\mp}{2\kappa_l a_l'(\kappa_l - \kappa_j^\epsilon)} \right], \tag{2.45} \]
\[ \alpha_l^\mp = \epsilon\left[ \int_{-\infty}^{\infty} \frac{K_{\pm l}^*(k)\alpha_\pm(k)\,dk}{4\pi(k \mp i0)\,a_\mp(k)(k \pm i\kappa_j^\epsilon)} + \sum_{m=1}^{n} \frac{K_{\mp lm}\alpha_m^\mp}{2\kappa_m a_m'(\kappa_m - \kappa_j^\epsilon)} \right], \tag{2.46} \]


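where the integral elements are
\[ K_\pm(k,k') = \int_{-\infty}^{\infty} \Delta u(x)N_\mp^*(x,k)N_\pm(x,k')\,dx, \qquad K_{\pm j}(k) = \int_{-\infty}^{\infty} \Delta u(x)N_\pm^*(x,k)\Phi_j^\pm(x)\,dx, \]
and
\[ K_{\pm jl} = \int_{-\infty}^{\infty} \Delta u(x)\Phi_j^\mp(x)\Phi_l^\pm(x)\,dx. \]
We look for solutions of Eqs. (2.45) and (2.46) in the asymptotic limit $\epsilon \to 0$. The results are summarized in the following proposition.

Proposition 2.5. Variational derivatives of data $\{\kappa_j, \gamma_j^\pm\}_{j=1}^{n}$ of the discrete spectrum of Eq. (2.1) with respect to the potential $u(x)$ are given by
\[ \frac{\delta\kappa_j}{\delta u(x)} = -\frac{\Phi_j^-(x)\Phi_j^+(x)}{2\kappa_j a_j'}, \tag{2.47} \]
\[ \frac{\delta \ln\gamma_j^\pm}{\delta u(x)} = \mp\frac{x\,\Phi_j^-(x)\Phi_j^+(x)}{\kappa_j a_j'} + \frac{1}{2\kappa_j}\left[ \gamma_j^\mp \Phi_j^\mp(x)\mu_j^\pm(x) - \gamma_j^\pm \Phi_j^\pm(x)\mu_j^\mp(x) \right], \tag{2.48} \]
where the real functions $\mu_j^\pm(x)$ are introduced as the limits
\[ \lim_{k \to k_j^\pm}\left[ \frac{M_\pm(x,k)}{a_\pm(k)} - \frac{c_j^\pm \Phi_j^\pm(x)}{k - k_j^\pm} \right] = \mu_j^\pm(x). \tag{2.49} \]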
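Formula (2.47) can be compared with a direct numerical eigenvalue computation. The sketch below is a hedged illustration, not the derivation used in the paper. It assumes the one-soliton potential u = 2 sech²x with κ = 1, the closed forms Φ± = (1 ∓ tanh x)/2 and a' = −1/2 (worked out for this example from a₊(k) = (k − i)/(k + i), not quoted from the text), a Gaussian perturbation Δu, and a finite-difference discretization of −ϕ'' − uϕ = −κ²ϕ. The helper name `kappa_ground` is hypothetical.

```python
import numpy as np

def kappa_ground(u_vals, h):
    """kappa for the lowest bound state of -phi'' - u phi = -kappa^2 phi.

    Assumption: central differences on a uniform grid with Dirichlet walls.
    """
    n = u_vals.size
    off = -np.ones(n - 1) / h ** 2
    H = np.diag(2.0 / h ** 2 - u_vals) + np.diag(off, 1) + np.diag(off, -1)
    return np.sqrt(-np.linalg.eigvalsh(H)[0])

x = np.linspace(-15.0, 15.0, 1001)
h = x[1] - x[0]
u = 2.0 / np.cosh(x) ** 2            # one-soliton potential, kappa = 1
du = np.exp(-x ** 2)                 # perturbation profile Delta u (illustrative)
eps = 1e-3                           # perturbation amplitude

# Assumed closed forms for the soliton (derived for this example only).
phi_plus = 0.5 * (1.0 - np.tanh(x))
phi_minus = 0.5 * (1.0 + np.tanh(x))
a_prime = -0.5
kappa = 1.0

# First-order shift predicted by Eq. (2.47): integral of (delta kappa / delta u) * Delta u.
predicted = np.sum(-phi_minus * phi_plus / (2.0 * kappa * a_prime) * du) * h

# Finite-difference estimate of the same derivative from the discretized operator.
numerical = (kappa_ground(u + eps * du, h) - kappa_ground(u, h)) / eps

print("predicted d(kappa)/d(eps):", predicted)
print("numerical d(kappa)/d(eps):", numerical)
```

The two numbers should agree up to discretization error and terms of order ε, which is consistent with the sign and normalization of (2.47) for this example.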

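Proof. It follows from the self-consistency condition for Eq. (2.46) at $l = j$ that $\kappa_j^\epsilon$ can be expanded into the asymptotic series $\kappa_j^\epsilon = \kappa_j + \epsilon\Delta\kappa_j + \epsilon^2\Delta_2\kappa_j + O(\epsilon^3)$, where
\[ \Delta\kappa_j = -\frac{K_{\mp jj}}{2\kappa_j a_j'}. \tag{2.50} \]
This formula is equivalent to Eq. (2.47). Using Eq. (2.50), we construct an asymptotic solution of Eqs. (2.45) and (2.46) to the first order of the perturbation theory,
\[ \alpha_\pm(k) = \epsilon K_{\mp j}(k) + O(\epsilon^2), \qquad \alpha_l^\mp = \epsilon K_{\mp lj} + O(\epsilon^2). \]
As a result, we find a perturbation to the bound state, $\Phi_j^{\mp\epsilon}(x) = \Phi_j^\mp(x) + \epsilon\Delta\Phi_j^\mp(x) + O(\epsilon^2)$, in the form
\[ \Delta\Phi_j^\mp(x) = \Delta\alpha_j^\mp \Phi_j^\mp(x) + \sum_{l \neq j} \frac{K_{\mp lj}\Phi_l^\mp(x)}{2\kappa_l a_l'(\kappa_l - \kappa_j)} + \int_{-\infty}^{\infty} \frac{K_{\mp j}(k)N_\pm(x,k)\,dk}{4\pi(k \mp i0)\,a_\mp(k)(k \pm i\kappa_j)}, \tag{2.51} \]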


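where $\Delta\alpha_j^\mp$ is defined through the corrections to $\alpha_j^\mp$ and $\kappa_j$. We use the boundary conditions (2.3) and (2.17) as $x \to \mp\infty$ and evaluate the contribution from a double pole at $k = \mp i\kappa_j$ in Eq. (2.51). Then, the term $O(e^{\pm 2\kappa_j x})$ should be removed from the asymptotic representation for $\Delta\Phi_j^\mp(x)$ as $x \to \mp\infty$ by specifying the correction $\Delta\alpha_j^\mp$ in the form
\[ \Delta\alpha_j^\mp = \frac{\Delta\kappa_j}{\kappa_j} \mp \frac{1}{\kappa_j a_j'}\int_{-\infty}^{\infty} x\,\Delta u(x)\Phi_j^\mp(x)\Phi_j^\pm(x)\,dx - \frac{\gamma_j^\pm}{2\kappa_j}\int_{-\infty}^{\infty} \Delta u(x)\Phi_j^\pm(x)\mu_j^\mp(x)\,dx, \tag{2.52} \]
where $\mu_j^\pm(x)$ is defined by Eq. (2.49). On the other hand, we assume an expansion $\gamma_j^{\pm\epsilon} = \gamma_j^\pm + \epsilon\Delta\gamma_j^\pm + O(\epsilon^2)$ and find the correction $\Delta\gamma_j^\pm$,
\[ \Delta\gamma_j^\pm = \gamma_j^\pm\Delta\alpha_j^\mp + \frac{K_{\mp j}(0)}{2\kappa_j} \pm \int_{-\infty}^{\infty} \frac{K_{\mp j}(k)\rho_\mp^*(k)\,dk}{4\pi(k \mp i0)(k \pm i\kappa_j)} + \sum_{l \neq j} \frac{K_{\pm jl}\gamma_l^\pm}{2\kappa_l a_l'(\kappa_l - \kappa_j)}. \tag{2.53} \]
This formula follows from Eq. (2.51) in the limit $x \to \pm\infty$ with the help of Eqs. (2.10) and (2.17). In order to simplify this formula, we rewrite Eqs. (2.14) and (2.49) in the form
\[ \mu_j^\pm(x) - M_\mp(x,0) = \frac{C_j^\pm \Phi_j^\pm(x)}{\kappa_j} - \kappa_j\sum_{l \neq j} \frac{C_l^\pm \Phi_l^\pm(x)}{\kappa_l(\kappa_l - \kappa_j)} \pm \kappa_j\int_{-\infty}^{\infty} \frac{\rho_\mp^*(k)N_\mp^*(x,k)\,dk}{2\pi(k \mp i0)(k \pm i\kappa_j)}, \tag{2.54} \]
where we have used Eqs. (2.7) and (2.12). Substitution of Eq. (2.54) into Eq. (2.53) gives
\[ \Delta\gamma_j^\pm = \gamma_j^\pm\left[ \Delta\alpha_j^\mp - \frac{\Delta\kappa_j}{\kappa_j} \right] + \frac{1}{2\kappa_j}\int_{-\infty}^{\infty} \Delta u(x)\Phi_j^\mp(x)\mu_j^\pm(x)\,dx. \]
This expression reduces to Eq. (2.48) with the help of Eq. (2.52). □

Formulas (2.47) and (2.48) for the variations of data of discrete spectrum coincide with those derived from the standard perturbation theory of Eq. (1.4) (see Ref. [39]). Here we have derived these formulas by using the non-standard complete sets $S^\pm$ of Eq. (2.1). In addition, the solution of the first order of the perturbation theory enables us to evaluate from Eq. (2.46) at $l = j$ the next-order correction $\Delta_2\kappa_j$,
\[ \Delta_2\kappa_j = -\int_{-\infty}^{\infty} \frac{K_{\pm j}^*(k)K_{\mp j}(k)\,dk}{8\pi\kappa_j a_j'(k \mp i0)\,a_\mp(k)(k \pm i\kappa_j)} - \sum_{l \neq j} \frac{K_{\mp jl}K_{\mp lj}}{4\kappa_j\kappa_l a_j' a_l'(\kappa_l - \kappa_j)}. \tag{2.55} \]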


2.3.2. Variations of data of continuous spectrum. The eigenfunctions of the continuous spectrum can also be decomposed through the sets $S^{\pm}$ as in Sect. 2.3.1. Suppose that $N_{\pm}^{\epsilon}(x,k)$ solves Eq. (2.1) for $u^{\epsilon} = u(x) + \epsilon\,\Delta u(x)$. We expand them to the first order of the perturbation theory, $N_{\pm}^{\epsilon}(x,k) = N_{\pm}(x,k) + \epsilon\,\Delta N_{\pm}(x,k) + O(\epsilon^2)$, and find the correction $\Delta N_{\pm}(x,k)$ in the form
\[
\Delta N_{\pm}(x,k) = \int_{-\infty}^{\infty}\frac{K_{\pm}(k',k)\,N_{\pm}(x,k')\,dk'}{4\pi(k' \mp i0)\,a_{\mp}(k')\,(k' - (k \pm i0))} \mp \sum_{j=1}^{n}\frac{K_{\pm j}^{*}(k)\,\Phi_j^{\mp}(x)}{2i\kappa_j a_j^0\,(k \pm i\kappa_j)}. \tag{2.56}
\]

Proposition 2.6. Variational derivatives of the data $\{a_{\pm}(k), b_{\pm}(k)\}$ of the continuous spectrum of Eq. (2.1) with respect to the potential $u(x)$ are given by
\[
\frac{\delta a_{\pm}(k)}{\delta u(x)} = \pm\frac{N_{\mp}^{*}(x,k)\,N_{\pm}(x,k)}{2ik}, \tag{2.57}
\]
\[
\frac{\delta b_{\pm}(k)}{\delta u(x)} = -\frac{N_{\pm}^{*}(x,k)\,M_{\mp}(x,k)}{2ik}. \tag{2.58}
\]

Proof. We expand the scattering data in the form
\[
a_{\pm}^{\epsilon}(k) = a_{\pm}(k) + \epsilon\,\Delta a_{\pm}(k) + O(\epsilon^2), \qquad b_{\pm}^{\epsilon}(k) = b_{\pm}(k) + \epsilon\,\Delta b_{\pm}(k) + O(\epsilon^2),
\]
and analyze Eq. (2.56) in the limits $x \to \pm\infty$ by comparing with the boundary conditions (2.10) and (2.17). We find explicit solutions
\[
\Delta a_{\pm}(k) = \pm\frac{K_{\pm}(k,k)}{2ik} \tag{2.59}
\]
and
\[
\Delta b_{\pm}(k) = -\frac{K_{\mp}(k,0)}{2ik} + \int_{-\infty}^{\infty}\frac{K_{\mp}(k,k')\,\rho_{\mp}(k')\,dk'}{4\pi(k' \pm i0)(k' - (k \mp i0))} - \sum_{j=1}^{n}\frac{K_{\pm j}(k)\,C_j^{\pm}}{2i\kappa_j(k \mp i\kappa_j)}. \tag{2.60}
\]
Then, Eq. (2.57) follows directly from Eq. (2.59). Furthermore, using Eq. (2.13), we derive the relation
\[
M_{\pm}(x,k) - M_{\pm}(x,0) = \frac{k}{2\pi i}\int_{-\infty}^{\infty}\frac{\rho_{\pm}(k')\,N_{\pm}(x,k')\,dk'}{(k' \mp i0)(k' - (k \pm i0))} + k\sum_{j=1}^{n}\frac{C_j^{\mp}\,\Phi_j^{\mp}(x)}{\kappa_j(k \pm i\kappa_j)}.
\]
Substituting this formula into Eq. (2.60), we recover Eq. (2.58). □

Formulas (2.57) and (2.58) for the variations of data of continuous spectrum coincide with those obtained from the standard perturbation theory of Eq. (1.4) (see Ref. [39]). We notice that in the limit $k \to 0$, the variations $\Delta a_{\pm}(k)$ and $\Delta b_{\pm}(k)$ are divergent, i.e.
\[
\Delta a_{\pm}(k) \to \pm\frac{K_{\pm}(0,0)}{2ik}, \qquad \Delta b_{\pm}(k) \to -\frac{K_{\mp}(0,0)}{2ik}. \tag{2.61}
\]


Comparing with Eq. (2.22) we identify the expansion of the parameter $m_{-1}^{\epsilon} = m_{-1} + \epsilon\,\Delta m_{-1} + O(\epsilon^2)$, where
\[
\Delta m_{-1} = K_{\pm}(0,0) = \int_{-\infty}^{\infty}\Delta u(x)\,M_{-}(x,0)\,M_{+}(x,0)\,dx. \tag{2.62}
\]

2.3.3. Example: A single soliton potential. We solve Eq. (2.40) for $\rho_{\pm}(k) = 0$ and $n = 1$ in the form
\[
\Phi_1^{\mp}(x) = \frac{e^{\pm 2\kappa_1 x}}{1 + e^{\pm 2\kappa_1(x - x_0)}}, \tag{2.63}
\]
where we have used the parametrization $C_1^{\mp} = 2\kappa_1 e^{\mp 2\kappa_1 x_0}$. Then, the soliton of the KdV equation is
\[
u_s(x) = 2\kappa_1^2\,\mathrm{sech}^2[\kappa_1(x - x_0)]. \tag{2.64}
\]
Using the following data for the single soliton potential,
\[
a_{+}(k) = a_{-}^{*}(k) = \frac{k - i\kappa_1}{k + i\kappa_1}, \qquad a_1^0 = -\frac{1}{2\kappa_1}, \qquad \gamma_1^{\pm} = e^{\pm 2\kappa_1 x_0},
\]
we evaluate the corrections of the first order of the perturbation theory from Eqs. (2.47), (2.48), (2.57), and (2.58),
\[
\Delta\kappa_1 = \frac{1}{8\kappa_1^2}\int_{-\infty}^{\infty}\Delta u(x)\,u_s(x)\,dx, \tag{2.65}
\]
\[
\frac{\Delta\gamma_1^{\pm}}{\gamma_1^{\pm}} = \pm\frac{1}{4\kappa_1^2}\int_{-\infty}^{\infty}x\,\Delta u(x)\,u_s(x)\,dx \pm \frac{1}{2\kappa_1}\int_{-\infty}^{\infty}\Delta u(x)\tanh[\kappa_1(x - x_0)]\,dx, \tag{2.66}
\]
\[
\Delta a_{\pm}(k) = \pm\frac{1}{2ik(k \mp i\kappa_1)^2}\int_{-\infty}^{\infty}\Delta u(x)\left(k^2 + \kappa_1^2\tanh^2[\kappa_1(x - x_0)]\right)dx, \tag{2.67}
\]
\[
\Delta b_{\pm}(k) = -\frac{1}{2ik(k^2 + \kappa_1^2)}\int_{-\infty}^{\infty}\Delta u(x)\left(k - i\kappa_1\tanh[\kappa_1(x - x_0)]\right)^2 e^{-2ikx}\,dx. \tag{2.68}
\]
The results for a single KdV soliton can be found in Refs. [39,40]. We notice that the integral in Eq. (2.65) identifies with $\Delta P_s$, the correction to the momentum $P_s = \frac{1}{2}\int_{-\infty}^{\infty}u_s^2\,dx$ of the KdV soliton. Computing $P_s$ from Eq. (2.64) as $P_s = \frac{8}{3}\kappa_1^3$, we conclude from Eq. (2.65) that the correction $\Delta P_s$ defines completely the renormalization of the parameter $\kappa_1$ of the KdV soliton. The momentum of the continuous spectrum is therefore affected only at the order $O(\epsilon^2)$. This result is related to the stability of a KdV soliton against small perturbations (see Ref. [41] for other examples).
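The two single-soliton identities used above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the reading $\Phi_1^{\mp}(x) = e^{\pm 2\kappa_1 x}/(1 + e^{\pm 2\kappa_1(x-x_0)})$ of Eq. (2.63), uses illustrative parameter values, and the helper names (`midpoint`, `phi`, `u_s`) are hypothetical. It verifies that $\Phi_1^{-}\Phi_1^{+} = u_s/(8\kappa_1^2)$, which is what turns Eq. (2.47) into Eq. (2.65), and that $P_s = 8\kappa_1^3/3$.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

kappa1, x0 = 0.7, 0.3   # illustrative values, not taken from the paper

def u_s(x):
    """Soliton profile of Eq. (2.64)."""
    return 2.0 * kappa1**2 / math.cosh(kappa1 * (x - x0))**2

def phi(x, s):
    """Bound state of Eq. (2.63); s = +1 selects Phi_1^-, s = -1 selects Phi_1^+."""
    return math.exp(2 * s * kappa1 * x) / (1.0 + math.exp(2 * s * kappa1 * (x - x0)))

# Largest deviation of Phi^- Phi^+ from u_s / (8 kappa^2) on a sample grid.
max_dev = max(abs(phi(x, +1) * phi(x, -1) - u_s(x) / (8 * kappa1**2))
              for x in [0.1 * i - 5.0 for i in range(101)])

# Momentum P_s = (1/2) * integral of u_s^2, compared with 8 kappa^3 / 3.
P_s = 0.5 * midpoint(lambda x: u_s(x)**2, x0 - 40.0, x0 + 40.0)

print(max_dev, P_s, 8 * kappa1**3 / 3)
```

Since $dP_s/d\kappa_1 = 8\kappa_1^2$, the first identity is also what makes $\Delta P_s = 8\kappa_1^2\,\Delta\kappa_1$ consistent with Eq. (2.65).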


2.4. Type I bifurcation of new eigenvalues. The number of bound states may change if the potential $u(x)$ is a nongeneric potential of type I, i.e. the criterion $m_{-1} = 0$ is met (see Definition 2.1). Using the asymptotic formulas (2.61) and (2.62), we find a necessary condition for this bifurcation. Suppose that the nongeneric potential $u(x)$ has $n$ bound states. Then, the coefficient $a_{\pm}^{\epsilon}(k)$ for the perturbed potential $u^{\epsilon} = u(x) + \epsilon\,\Delta u(x)$ has the following behavior as $k \to 0$,
\[
a_{\pm}^{\epsilon}(k) = a_0 \pm \frac{\epsilon\,\Delta m_{-1}}{2ik} + O(\epsilon, k),
\]
where $a_0$ is given by Eq. (2.29) and $\Delta m_{-1}$ is defined by Eq. (2.62). Then, we take into account Eq. (2.30) for the potential $u(x)$ and derive the extension of this formula for the perturbed potential $u^{\epsilon}(x)$,
\[
\arg a_{\pm}^{\epsilon}(0) = \mp\pi\left(n + \frac{\sigma}{2}\right), \tag{2.69}
\]
where $\sigma = \mathrm{sign}(\Delta m_{-1}/a_0)$. Comparing Eqs. (2.30) and (2.69) we conclude that a new $(n+1)$th eigenvalue detaches from the edge of the continuous spectrum if $\sigma = +1$.

Here we derive asymptotic expansions for the data of discrete spectrum corresponding to the new bound state. Also we discuss applications of these results to some physical problems such as soliton generation in the KdV equation and bifurcation of oscillatory modes in nonlinear Klein–Gordon equations.
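The phase jump in Eq. (2.69) can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions: it takes $\arg a_{\pm}(0) = \mp\pi n$ for the unperturbed nongeneric potential, so that $a_0 = (-1)^n$ in the examples, and it keeps only the leading small-$k$ form $a_0 \pm \epsilon\Delta m_{-1}/(2ik)$. The function names are hypothetical.

```python
import cmath
import math

def a_small_k(k, a0, eps_dm, branch=+1):
    """Leading small-k form a0 +/- eps*dm/(2ik); branch=+1 stands for a_+, branch=-1 for a_-."""
    return a0 + branch * eps_dm / (2j * k)

def predicted_arg(n, sigma, branch=+1):
    """Right-hand side of Eq. (2.69)."""
    return -branch * math.pi * (n + sigma / 2)

def same_angle(t1, t2, tol=1e-6):
    """Compare two angles modulo 2*pi."""
    return abs(cmath.exp(1j * t1) - cmath.exp(1j * t2)) < tol

k = 1e-9  # small positive k approaching the edge of the continuous spectrum
cases = [
    (0, 1.0, 0.05),    # zero background (n = 0), sigma = +1
    (1, -1.0, -0.05),  # one bound state (a0 = -1), sigma = +1
    (0, 1.0, -0.05),   # zero background, sigma = -1
]
results = []
for n, a0, eps_dm in cases:
    sigma = math.copysign(1.0, eps_dm / a0)
    phase = cmath.phase(a_small_k(k, a0, eps_dm))
    results.append(same_angle(phase, predicted_arg(n, sigma)))
print(results)
```

The extra half-turn of the phase for $\sigma = +1$ is what signals the additional zero of $a_{\pm}^{\epsilon}(k)$, i.e. the new eigenvalue.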

2.4.1. Asymptotic expressions for a new eigenvalue and bound state. Suppose that the potential $u(x)$ is nongeneric and the perturbation $\Delta u(x)$ supports the type I bifurcation. A new bound state $\Phi_{n+1}^{\mp\epsilon}(x)$ can be decomposed through the complete sets $S^{\pm}$ according to the same integral representation (2.44) but with the eigenvalue $k = \mp i\kappa_{n+1}^{\epsilon}$ such that $\lim_{\epsilon \to 0}\kappa_{n+1}^{\epsilon} = 0$. A self-consistent solution to the homogeneous integral equation (2.45) may appear only if the kernel of the integral transform becomes singular as $k' \to 0$ for $\kappa_{n+1}^{\epsilon} \to 0$. Indeed, this is the case for a nongeneric potential, when the coefficients $a_{\pm}(k)$ satisfy the asymptotic representation (2.28). Solving Eqs. (2.45) and (2.46) in the limit $\epsilon \to 0$ we derive the following result.

Proposition 2.7. Under the conditions that the potential $u(x)$ satisfies $m_{-1} = 0$ and the perturbation $\Delta u(x)$ satisfies $a_0^{-1}\Delta m_{-1} > 0$, the potential $u^{\epsilon} = u(x) + \epsilon\,\Delta u(x)$ supports a bound state in a neighbourhood of $k = 0$ for $\epsilon > 0$. The spectral data $(\kappa_{n+1}^{\epsilon}, \gamma_{n+1}^{\pm\epsilon})$ for the new bound state are defined by
\[
\kappa_{n+1}^{\epsilon} = \epsilon\,\Delta\kappa + \epsilon^2\Delta_2\kappa + O(\epsilon^3), \qquad \gamma_{n+1}^{\pm\epsilon} = \Delta\gamma^{\pm} + O(\epsilon),
\]
where
\[
\Delta\kappa = \frac{\Delta m_{-1}}{2a_0} > 0, \tag{2.70}
\]
\[
\Delta_2\kappa = \frac{1}{2a_0}\left[\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{a_{-}^{-1}(k)\,K_{+}(0,k)\,K_{+}(k,0) - a_0^{-1}(\Delta m_{-1})^2}{4\pi k^2}\,dk + \sum_{j=1}^{n}\frac{K_{-j}(0)\,K_{+j}(0)}{2\kappa_j^2 a_j^0}\right], \tag{2.71}
\]


and
\[
\Delta\gamma^{\pm} = 1 \pm m_0^{\pm}, \tag{2.72}
\]
where $m_0^{\pm}$ is defined by Eq. (2.26).

Proof. Evaluating the singular contribution from the pole $k' = 0$ in Eq. (2.45), we find the leading order term in the form
\[
\alpha_{\pm}^{\epsilon}(k) \to \frac{\epsilon\,K_{\pm}(k,0)\,\alpha_{\pm}^{\epsilon}(0)}{4\pi a_0}\,Q_{\pm}, \tag{2.73}
\]
where
\[
Q_{\pm} = \int_{-\infty}^{\infty}\frac{dk'}{(k' \mp i0)(k' \pm i\epsilon\Delta\kappa)} = \frac{2\pi}{\epsilon\,\Delta\kappa}
\]
if $\Delta\kappa > 0$, and $Q_{\pm} = 0$ if $\Delta\kappa < 0$. Therefore, the new eigenvalue exists under the condition $\Delta\kappa > 0$ (assuming $\epsilon > 0$). Writing Eq. (2.73) at $k = 0$ gives the asymptotic expression (2.70). The new bound state $\Phi_{n+1}^{\mp\epsilon}(x)$ is defined by Eqs. (2.44) and (2.73). Using the boundary condition (2.3) for $N_{\pm}(x,k)$, we take the limit $x \to \mp\infty$ for $\Phi_{n+1}^{\mp\epsilon}(x)$ and find
\[
\Phi_{n+1}^{\mp\epsilon}(x) \to \frac{\alpha_{\pm}^{\epsilon}(0)}{K_{\pm}(0,0)}\,e^{\mp 2\epsilon\Delta\kappa\,x}.
\]
The boundary condition (2.17) is met if $\alpha_{\pm}^{\epsilon}(0) = K_{\pm}(0,0)$. Then, Eqs. (2.44) and (2.73) reduce to an asymptotic expression for the new bound state,
\[
\Phi_{n+1}^{\mp\epsilon}(x) = \epsilon\int_{-\infty}^{\infty}\frac{K_{\pm}(k,0)\,N_{\pm}(x,k)\,dk}{4\pi(k \mp i0)\,a_{\mp}(k)\,(k \pm i\epsilon\Delta\kappa)} + O(\epsilon), \tag{2.74}
\]
where the integral term is of order $O(1)$. At the intermediate scale for finite $x$, we find from Eq. (2.74) that $\Phi_{n+1}^{\mp\epsilon}(x) = N_{\pm}(x,0) + O(\epsilon)$. Therefore, the bound state approaches a delocalized limiting eigenfunction of the continuous spectrum for finite $x$. Then, using Eqs. (2.10) and (2.17), we take the limit $x \to \pm\infty$ in Eq. (2.74) and find $\Delta\gamma^{\pm}$ in the form $\Delta\gamma^{\pm} = a_0 \pm b_0$. This expression reduces to Eq. (2.72) with the help of Eq. (2.29). Finally, Eq. (2.71) follows from Eq. (2.45) at $k = 0$ by substituting the results of the first order of the perturbation theory. □

We notice that the asymptotic approximation for $\Delta\kappa$ can be equivalently written from Eqs. (2.27), (2.62), and (2.70) as
\[
\Delta\kappa = \frac{1}{2}\left(1 - \frac{b_0}{a_0}\right)\int_{-\infty}^{\infty}dx\,\Delta u(x)\,M_{+}^{2}(x,0) > 0. \tag{2.75}
\]
We have thus obtained that, for the type I bifurcation, a new eigenvalue is located in a neighbourhood of the edge of the continuous spectrum (here $k = 0$) and a new (localized) bound state arises from a delocalized critical eigenfunction that exists in a nongeneric case.


2.4.2. Example: A new eigenvalue supported by a small potential. Suppose that the initial potential is small, i.e. $u(x) = 0$ and $u^{\epsilon}(x) = \epsilon\,\Delta u(x)$. Then, the spectrum of the unperturbed problem is obvious: $n = 0$, $a_{\pm}(k) = 1$, $b_{\pm}(k) = 0$, $M_{\pm}(x,k) = 1$, $N_{\pm}(x,k) = e^{2ikx}$. Since $m_{-1} = 0$, the zero background belongs to the class of nongeneric potentials of type I. Therefore, the type I bifurcation is possible, i.e. an infinitesimal initial disturbance can support a single eigenvalue in the problem (2.1). The criterion for this bifurcation follows from Eq. (2.75) as
\[
\Delta\kappa = \frac{1}{2}\Delta M = \frac{1}{2}\int_{-\infty}^{\infty}\Delta u(x)\,dx > 0, \tag{2.76}
\]
where $\Delta M$ is the area integral which is the mass invariant for the KdV equation (1.3). This result is well known as the Peierls problem in quantum mechanics [36]. In application to the KdV equation, we illustrate this phenomenon in Fig. 1, where numerical simulations of Eq. (1.3) are presented. Figure 1(a) shows the evolution of the initial data $u(x,0) = 2a\,\mathrm{sech}^2 x$ with $a = 0.5$. This corresponds to a disturbance of $u^{\epsilon}$ with $\Delta M > 0$. We observe that the initial pulse evolves into a soliton propagating to the right and a radiative wave packet propagating to the left. The soliton has the mass $M_{\rm sol} = 2\Delta M$, while the radiation has the mass $M_{\rm rad} = -\Delta M$. On the other hand, the same initial pulse but with $a = -0.4$, which corresponds to the case $\Delta M < 0$, transforms solely into a linear radiative wave packet as seen in Fig. 1(b). No soliton is generated in this case.

In the critical case $\Delta M = 0$ (e.g. for antisymmetric pulses $u(-x) = -u(x)$), the type I bifurcation may still take place if $\Delta_2\kappa > 0$. Inspecting the expression (2.71), we transform it to the form
\[
\Delta_2\kappa = -\frac{1}{4}\iint_{-\infty}^{\infty}dx\,dy\,\Delta u(x)\,\Delta u(y)\,|x - y|,
\]
or, equivalently,
\[
\Delta_2\kappa = -\frac{1}{2}\int_{-\infty}^{\infty}dx\int_{-\infty}^{x}\Delta u(y)\,dy\int_{x}^{\infty}\Delta u(y)\,dy. \tag{2.77}
\]
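The equivalence of the two forms of $\Delta_2\kappa$, and its sign in the critical case, can be checked on a concrete zero-mass pulse. The sketch below is illustrative only: the test pulse $\Delta u(x) = x e^{-x^2}$ and the grid parameters are assumptions, not taken from the paper. For this pulse $\int_{-\infty}^{x}\Delta u\,dy = -e^{-x^2}/2$, so Eq. (2.77) gives $\Delta_2\kappa = \frac{1}{8}\sqrt{\pi/2}$ in closed form.

```python
import math

L, n = 6.0, 600                      # half-width of the grid and number of cells (assumed)
h = 2 * L / n
xs = [-L + (i + 0.5) * h for i in range(n)]
du = [x * math.exp(-x * x) for x in xs]   # odd pulse, so the mass integral vanishes

# Double-integral form: -(1/4) * sum of du(x) du(y) |x - y|.
double_form = -0.25 * h * h * sum(
    du[i] * du[j] * abs(xs[i] - xs[j]) for i in range(n) for j in range(n)
)

# Single-integral form (2.77): F(x) = integral up to x, G(x) = integral from x.
total = h * sum(du)
F, running = [], 0.0
for v in du:
    F.append(running + 0.5 * h * v)  # cumulative integral evaluated at the cell midpoint
    running += h * v
single_form = -0.5 * h * sum(f * (total - f) for f in F)

exact = math.sqrt(math.pi / 2) / 8
print(double_form, single_form, exact)
```

When $\Delta M = 0$ the two partial integrals are negatives of each other, so the integrand of Eq. (2.77) is $-F^2 \le 0$ and the correction is positive for any nonzero pulse.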

It is clear from Eq. (2.77) that $\Delta_2\kappa > 0$ if $\Delta\kappa = 0$. Therefore, the soliton generation always occurs even for a critical initial disturbance with $\Delta M = 0$ (see also Ref. [37] for the same conclusion). Moreover, for small negative $\Delta M$ the soliton generation still occurs if $\kappa^{\epsilon} = \epsilon\,\Delta\kappa + \epsilon^2\Delta_2\kappa + O(\epsilon^3) > 0$. Preliminary results on soliton generation in the critical case $\Delta M = 0$ were reported by Karpman (see Chap. 21 in [35]). Using physical motivations and analysis of quasi-linear self-similar solutions, he found that the quasi-linear solutions of the KdV equation (1.3) exist for $\Delta M = 0$ and
\[
p_1 = \int_{-\infty}^{\infty}x\,u(x,0)\,dx < p_{\rm cr},
\]
where $p_{\rm cr} \approx 7$. As a result, he concluded that no soliton can be generated by a small initial perturbation with $p_1 < p_{\rm cr}$. This conclusion, together with early numerical simulations (see Fig. 21.1 in Ref. [35]), is not confirmed by the analysis developed here.
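The expansion $\kappa^{\epsilon} = \epsilon\,\Delta\kappa + \epsilon^2\Delta_2\kappa$ can be compared with an exactly solvable case. The sketch below rests on two assumptions that are not stated in this section: that the bound states of problem (2.1) coincide with those of the standard Schrödinger problem $\psi'' + u\psi = \kappa^2\psi$ (consistent with $u_s = 2\kappa_1^2\,\mathrm{sech}^2$ supporting the eigenvalue $\kappa_1$), and the textbook Pöschl–Teller result that $u = s(s+1)\,\mathrm{sech}^2 x$ has its ground state at $\kappa = s$. For the pulse $2a\,\mathrm{sech}^2 x$, with $\epsilon$ absorbed into $a$, Eq. (2.76) gives $2a$ and Eq. (2.77) gives $-4a^2$.

```python
import math

def kappa_exact(a):
    """Ground-state kappa for u = 2a sech^2(x): kappa = s with s(s+1) = 2a, or None if no bound state."""
    disc = 1.0 + 8.0 * a
    if disc < 0:
        return None
    s = (-1.0 + math.sqrt(disc)) / 2.0
    return s if s > 0 else None

def kappa_second_order(a):
    """First two terms of the expansion: mass/2 = 2a from (2.76), and -4a^2 from (2.77)."""
    return 2.0 * a - 4.0 * a * a

for a in (0.01, 0.05, 0.5, -0.4):
    print(a, kappa_exact(a), kappa_second_order(a))
```

Under these assumptions the two-term expansion reproduces the Taylor series of $(-1 + \sqrt{1+8a})/2$, the pulse with $a = 0.5$ supports one eigenvalue, and the pulse with $a = -0.4$ supports none, in line with the two panels of the figure.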

Fig. 2.1. Time evolution of the solution of the KdV equation (1.3) for the initial condition $u(x,0) = 2a\,\mathrm{sech}^2(x)$, shown as $u$ versus $x$ and time. (a) Formation of a new soliton for $a = 0.5$. (b) Transformation of an initial pulse to a linear wave packet for $a = -0.4$.

2.4.3. Example: A new eigenvalue supported by a perturbed single soliton potential. Multisoliton potentials also belong to the class of nongeneric potentials of type I for the problem (2.1). Therefore, perturbation of multisoliton potentials may generate a new eigenvalue and a bound state provided the condition (2.75) is met. In particular, a perturbation to a single soliton generates a new bound state if
\[
\Delta\kappa = \frac{1}{2}\int_{-\infty}^{\infty}\Delta u(x)\tanh^2[\kappa(x - x_0)]\,dx > 0, \tag{2.78}
\]

where x0 is defined in Eq. (2.63). This bifurcation was analyzed in Ref. [38] for the problem of soliton production from a shelf emitted by a moving soliton. The account of a secondary soliton allowed one to satisfy the mass conservation in the KdV equation perturbed by an external (dissipative) term.


Recently, the same bifurcation was analyzed for the problem of existence of internal (oscillation) modes of kinks in nonlinear Klein–Gordon equations [42]. The criterion (2.78) was compared with numerical data for the oscillation mode in the spectrum of a double sine–Gordon equation.

3. Time-Dependent Schrödinger Problem

The (nonlocal) RH formalism for the linear equation (1.2) can be developed after the transformation $\varphi = m\,e^{-ikx - ik^2 y}$, where $m = m(x,y,k)$ satisfies the problem
\[
i m_y + m_{xx} - 2ik\,m_x + u(x,y)\,m = 0. \tag{3.1}
\]
We assume that the function $u(x,y)$ is real, smooth and belongs to $L^p$ for any $p \geq 2$. Also we assume the boundary condition for $u(x,y)$ in the form $u(x,y) \sim O(R^{-2})$ as $R = \sqrt{x^2 + y^2} \to \infty$, which includes the class of multilump potentials. Note that the function $u(x,y)$ for the multilump potentials is not in $L^1$. A solution $u = u(x,y,t)$ of the KPI equation (1.1) satisfies the constraint for $t > 0$ [27,28],
\[
\int_{-\infty}^{\infty}u(x,y,t)\,dx = 0. \tag{3.2}
\]

If the initial data $u = u(x,y,0)$ does not satisfy this constraint, the instant transformation of a solution occurs in an initial time layer so that the solution has a jump discontinuity at $t \to 0^{\pm}$ [28–30]. Since the potential $u(x,y)$ of the linear system (3.1) corresponds to any solution of the KPI equation (1.1) including the initial data $u = u(x,y,0)$, we do not impose the constraint (3.2) in our analysis and omit again the dependence on time. However, we assume the convergence of the following integral,
\[
\left|\int_{-\infty}^{\infty}dy\int_{-\infty}^{\infty}dx\,u(x,y)\right| < \infty. \tag{3.3}
\]
Under this assumption, the integrals involving the eigenfunctions of Eq. (3.1), the spectral data and the potential $u(x,y)$ are bounded in the scheme developed below (see e.g. Eq. (3.65)).

3.1. Spectrum and scattering data. Here we construct the continuous and discrete spectrum for Eq. (3.1) according to previous approaches [8,23] and also derive additional relations between the spectral data.

3.1.1. Green’s functions. The Green’s functions $G_{\pm}(x,y,k)$ associated to the problem (3.1) have the form [6],
\[
G_{\pm}(x,y,k) = \frac{i}{2\pi}\int_{-\infty}^{\infty}e^{i(\xi x + 2\xi k y - \xi^2 y)}\left[\Theta(y)\Theta(\pm\xi) - \Theta(-y)\Theta(\mp\xi)\right]d\xi. \tag{3.4}
\]
The Green’s functions $G_{+}(x,y,k)$ and $G_{-}(x,y,k)$ are analytic in the domains ${\rm Im}(k) \geq 0$ and ${\rm Im}(k) \leq 0$ respectively and have a jump at ${\rm Im}(k) = 0$,
\[
G_{+}(x,y,k) - G_{-}(x,y,k) = \frac{i}{2\pi}\int_{-\infty}^{\infty}\mathrm{sign}(\xi)\,e^{i(\xi x + 2\xi k y - \xi^2 y)}\,d\xi. \tag{3.5}
\]


In addition, the Green’s functions have two symmetry properties:
\[
\frac{\partial G_{\pm}(x,y,k)}{\partial k} = i(x + 2ky)\,G_{\pm}(x,y,k) \pm \frac{i}{2\pi} \tag{3.6}
\]
and
\[
G_{\pm}(x,y,k) = G_{\mp}^{*}(-x,-y,k). \tag{3.7}
\]
It follows from Eq. (3.4), or, equivalently, from Eq. (3.6) that the Green’s functions are localized in the limit $R = \sqrt{x^2 + y^2} \to \infty$,
\[
G_{\pm}(x,y,k) \to \mp\frac{1}{2\pi(x + 2(k \pm i0)y)} + O(R^{-2}), \tag{3.8}
\]
subject to $x + 2ky \neq 0$. This expression is exact for $y = 0$. Furthermore, the Green’s functions are weakly localized along the singular line $x + 2ky = 0$, where $G_{\pm}(x,y,k) \to O(R^{-1/2})$ as $R \to \infty$. The boundary value (3.8) implies the following asymptotic expansion in the limit $k \to \infty \pm i0$,
\[
G_{\pm}(x,y,k) \to \mp\frac{1}{4\pi k\,(y \mp i0\,\mathrm{sign}(x))} + O(k^{-2}). \tag{3.9}
\]
Using the relation
\[
\frac{1}{z \mp i0} = \pm\pi i\,\delta(z) + \mathrm{p.v.}\frac{1}{z}, \tag{3.10}
\]
we express Eq. (3.9) in the form
\[
G_{\pm}(x,y,k) \to \frac{1}{4ik}\,\mathrm{sign}(x)\,\delta(y) \mp \frac{1}{4\pi k y} + O(k^{-2}). \tag{3.11}
\]

This result agrees with the analysis of Ref. [28].

Remark. The order of integration becomes important for computing spectral data for the problem (3.1) when the potential $u(x,y)$ is not absolutely integrable. Moreover, the result of integration of the Green’s functions (3.4) depends on the order in the double integrals,
\[
\int_{-\infty}^{\infty}dy\int_{-\infty}^{\infty}dx\,G_{\pm}(x,y,k) = 0,
\]
while
\[
\int_{-\infty}^{\infty}dx\int_{-\infty}^{\infty}dy\,G_{\pm}(x,y,k) = -\frac{1}{4(k \pm i0)^2}.
\]
According to this result, we define all data for the former order of integration and use the following notation,
\[
\iint_{R}dy\,dx = \mathrm{p.v.}\int_{-\infty}^{\infty}dy\int_{-\infty}^{\infty}dx. \tag{3.12}
\]


3.1.2. Continuous spectrum. The eigenfunctions $M_{\pm}(x,y,k)$ and $N_{\pm}(x,y,k,l)$ of Eq. (3.1) satisfy Fredholm’s integral equations,
\[
M_{\pm}(x,y,k) = 1 + \iint_{R}dy'\,dx'\,G_{\pm}(x - x', y - y', k)\,u(x',y')\,M_{\pm}(x',y',k) \tag{3.13}
\]
and
\[
N_{\pm}(x,y,k,l) = e^{i\beta(x,y,k,l)} + \iint_{R}dy'\,dx'\,G_{\pm}(x - x', y - y', k)\,u(x',y')\,N_{\pm}(x',y',k,l), \tag{3.14}
\]
where $\beta(x,y,k,l) = (k - l)x + (k^2 - l^2)y$. The additional implicit parameter $l$ appears for the eigenfunctions $N_{\pm}(x,y,k,l)$ according to the most general Fourier solution of Eq. (3.1) for the case $u(x,y) = 0$. Applying the boundary conditions (3.8) to Eqs. (3.13) and (3.14), we find that the eigenfunctions $M_{\pm}(x,y,k)$ and $N_{\pm}(x,y,k,l)$ are not secular in $x$ and $y$ for any $k$ and $l$. As $R = \sqrt{x^2 + y^2} \to \infty$ and $x + 2ky \neq 0$, they approach the boundary conditions
\[
M_{\pm}(x,y,k) \to 1 + O(R^{-1}), \qquad N_{\pm}(x,y,k,l) \to e^{i\beta(x,y,k,l)} + O(R^{-1}). \tag{3.15}
\]
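That $e^{i\beta(x,y,k,l)}$ is the general Fourier solution of Eq. (3.1) at $u = 0$ can be confirmed directly. The following sketch is illustrative, with hypothetical function names and arbitrary sample values of $k$ and $l$: it evaluates the algebraic symbol obtained by substituting the plane wave, and then checks the differential equation itself by central finite differences.

```python
import cmath

def beta(x, y, k, l):
    """Phase beta(x, y, k, l) = (k - l) x + (k^2 - l^2) y."""
    return (k - l) * x + (k * k - l * l) * y

def symbol(k, l):
    """Result of applying i d/dy + d^2/dx^2 - 2ik d/dx to exp(i beta), divided by exp(i beta)."""
    bx, by = k - l, k * k - l * l
    return -by - bx * bx + 2 * k * bx

def residual(x, y, k, l, h=1e-4):
    """Finite-difference residual of Eq. (3.1) with u = 0 for m = exp(i beta)."""
    m = lambda X, Y: cmath.exp(1j * beta(X, Y, k, l))
    m_y = (m(x, y + h) - m(x, y - h)) / (2 * h)
    m_x = (m(x + h, y) - m(x - h, y)) / (2 * h)
    m_xx = (m(x + h, y) - 2 * m(x, y) + m(x - h, y)) / (h * h)
    return 1j * m_y + m_xx - 2j * k * m_x

k, l = 0.8, -0.3   # arbitrary real sample values
print(symbol(k, l), abs(residual(0.4, -0.7, k, l)))
```

The symbol vanishes identically because $-(k^2 - l^2) - (k - l)^2 + 2k(k - l) = 0$, which is why two spectral parameters are available for $N_{\pm}$.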

The asymptotic representation of the eigenfunctions $M_{\pm}(x,y,k)$ in the limit $k \to \infty \pm i0$ follows from Eqs. (3.11) and (3.13),
\[
M_{\pm}(x,y,k) = 1 + \frac{1}{4ik}\left(\int_{-\infty}^{x} - \int_{x}^{\infty}\right)dx'\,u(x',y) \pm \frac{1}{4\pi k}\int_{-\infty}^{\infty}\frac{dy'}{y' - y}\int_{-\infty}^{\infty}dx'\,u(x',y') + O(k^{-2}). \tag{3.16}
\]
Using Eq. (3.5), we find the RH boundary value problem for the eigenfunctions $M_{\pm}(x,y,k)$ at ${\rm Im}(k) = 0$,
\[
M_{+}(x,y,k) - M_{-}(x,y,k) = -\left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right)dl\,r_{\mp}(k,l)\,N_{\pm}(x,y,k,l), \tag{3.17}
\]
where $r_{\pm} = r_{\pm}(k,l)$ is the spectral transform [8,23],
\[
r_{\pm}(k,l) = \frac{1}{2\pi i}\iint_{R}dy\,dx\,u(x,y)\,M_{\pm}(x,y,k)\,e^{-i\beta(x,y,k,l)}. \tag{3.18}
\]
The RH problem (3.17) is equivalent to the nonlocal form of Eq. (1.7) with $a_{\pm}(k) \equiv 1$. The closure relations (1.11) between the eigenfunctions $N_{\pm}(x,y,k,l)$ and $M_{\pm}(x,y,k)$ follow from Eq. (3.6),
\[
\frac{\partial N_{\pm}(x,y,k,l)}{\partial k} = i(x + 2ky)\,N_{\pm}(x,y,k,l) \pm F_{\pm}(k,l)\,M_{\pm}(x,y,k), \tag{3.19}
\]
where
\[
F_{\pm}(k,l) = -\frac{1}{2\pi i}\iint_{R}dy\,dx\,u(x,y)\,N_{\pm}(x,y,k,l). \tag{3.20}
\]


This relation is to be complemented by the boundary conditions following from the uniqueness of solutions of Eqs. (3.13) and (3.14),
\[
N_{\pm}(x,y,k,k) = M_{\pm}(x,y,k). \tag{3.21}
\]
Using these relations, we integrate Eq. (3.19) and obtain
\[
N_{\pm}(x,y,k,l) = M_{\pm}(x,y,l)\,e^{i\beta(x,y,k,l)} \pm \int_{l}^{k}F_{\pm}(p,l)\,M_{\pm}(x,y,p)\,e^{i\beta(x,y,k,p)}\,dp. \tag{3.22}
\]
In addition to the spectral data $r_{\pm}(k,l)$ and $F_{\pm}(k,l)$, we consider also the spectral data $T_{\pm}(k,l,p)$ which appear in the relationship between the eigenfunctions $N_{\pm}(x,y,k,l)$ following from Eq. (3.14),
\[
N_{+}(x,y,k,l) - N_{-}(x,y,k,l) = -\left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right)dp\,T_{\mp}(k,l,p)\,N_{\pm}(x,y,k,p), \tag{3.23}
\]
where
\[
T_{\pm}(k,l,p) = \frac{1}{2\pi i}\iint_{R}dy\,dx\,u(x,y)\,N_{\pm}(x,y,k,l)\,e^{-i\beta(x,y,k,p)}. \tag{3.24}
\]

We point out that the relation (3.23) is not a RH boundary value problem since the eigenfunctions $N_{\pm}(x,y,k,l)$ have no meromorphic continuation in a complex domain of $k$. Still, the relation (3.23) is formally valid for real $k$. Furthermore, the integrals (3.18), (3.20), and (3.24) for the spectral data are not absolutely convergent and, therefore, the order of integration specified by Eq. (3.12) cannot be interchanged. On the other hand, the integrals in Eqs. (3.13) and (3.14) converge absolutely and the order of integration can be interchanged in these integrals and also in further integration with respect to $x$, $y$ and $k$. The spectral data $r_{\pm}(k,l)$ define the continuous spectrum of the problem (3.1) and satisfy the integral relations [23],
\[
r_{\pm}(k,l) + r_{\pm}^{*}(l,k) \mp \left(\int_{-\infty}^{l} - \int_{k}^{\infty}\right)dp\,r_{\pm}(k,p)\,r_{\pm}^{*}(l,p) = 0,
\]
\[
r_{\pm}(k,l) + r_{\mp}^{*}(l,k) \pm \int_{l}^{k}dp\,r_{\pm}(k,p)\,r_{\mp}^{*}(l,p) = 0.
\]
These equations were used in Ref. [23] to factorize the nonlocal RH boundary-value problem (3.17) and eliminate the set of eigenfunctions $N_{\pm}(x,y,k,l)$ from the problem. We intend to solve here a different problem: we express all eigenfunctions and scattering data in terms of the sets involving the eigenfunctions $N_{\pm}(x,y,k,l)$. In this respect, the following result completes the construction of the continuous spectrum for the problem (3.1).

Proposition 3.1. The spectral data $r_{\pm}(k,l)$, $F_{\pm}(k,l)$ and $T_{\pm}(k,l,p)$ defined by Eqs. (3.18), (3.20) and (3.24) are related algebraically by
\[
r_{\pm}(k,l) = F_{\mp}^{*}(k,l), \tag{3.25}
\]
\[
T_{\pm}(k,l,p) = -T_{\mp}^{*}(k,p,l). \tag{3.26}
\]


Proof. We multiply Eq. (3.14) by $u(x,y)\,M_{\mp}^{*}(x,y,k)$ and integrate over $y$ and $x$. Using the symmetry relation (3.7) and the integral equations (3.13) for $M_{\mp}^{*}(x,y,k)$ we find a simple formula,
\[
0 = -2\pi i\,r_{\mp}^{*}(k,l) + 2\pi i\,F_{\pm}(k,l),
\]
where we have used definitions (3.18) and (3.20). This formula is nothing but Eq. (3.25). The proof of Eq. (3.26) can be done by the same method starting with Eq. (3.14) and multiplying it by $u(x,y)\,N_{\mp}^{*}(x,y,k,l)$. □

3.1.3. Discrete spectrum. Bound states for Eq. (3.1) exist as homogeneous solutions of Fredholm’s integral equations (3.13) for isolated complex values of $k$ (eigenvalues). The eigenvalues are located symmetrically in upper and lower half-planes [8]. The bound states correspond to algebraically decaying lumps of the KPI equation (1.1). It was proved [20,21] that the bound states may appear as multiple poles in the complex plane of $k$. Here we restrict ourselves to the case when the bound states are not multiple. The RH problem (3.17) coupled by the boundary conditions (1.10) and the closure relations (3.22) can be solved in the form
\[
M_{\pm}(x,y,k) = 1 + \sum_{j=1}^{n}\left[\frac{c_j^{+}\,\Phi_j^{+}(x,y)}{k - k_j^{+}} + \frac{c_j^{-}\,\Phi_j^{-}(x,y)}{k - k_j^{-}}\right] - \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{dk'}{k' - (k \pm i0)}\left(\int_{-\infty}^{k'} - \int_{k'}^{\infty}\right)dl\,r_{-\sigma}(k',l)\,N_{+\sigma}(x,y,k',l), \tag{3.27}
\]

where $\sigma = +1$ or $\sigma = -1$, $\Phi_j^{\pm}(x,y)$ are the bound states, $n$ is the number of bound states, and $c_j^{\pm}$ are renormalization constants. The bound states $\Phi_j^{\pm}(x,y)$ are complex functions satisfying the homogeneous integral equations,
\[
\Phi_j^{\pm}(x,y) = \iint_{R}dy'\,dx'\,G_{\pm}(x - x', y - y', k_j^{\pm})\,u(x',y')\,\Phi_j^{\pm}(x',y'). \tag{3.28}
\]
It follows from Eq. (3.8) that they can be renormalized according to the boundary conditions as $R = \sqrt{x^2 + y^2} \to \infty$,
\[
\Phi_j^{\pm}(x,y) \to \frac{1}{x + 2k_j^{\pm}y} + O(R^{-2}), \tag{3.29}
\]
subject to the normalization constraints,
\[
Q_{\pm} = \mp\frac{1}{2\pi}\iint_{R}dy\,dx\,u(x,y)\,\Phi_j^{\pm}(x,y) = 1. \tag{3.30}
\]
Multiple bound states also occur for the KPI equation when the quantities $Q_{\pm}$ vanish. In this case, the expression (3.27) should be modified by multiple pole contributions [20,21]. We consider only potentials $u(x,y)$ for which the renormalization (3.30) holds.


The limiting relations for the eigenfunctions $M_{\pm}(x,y,k)$ approaching bound states can be derived from Eq. (3.1) in the form
\[
\lim_{k \to k_j^{\pm}}\left[M_{\pm}(x,y,k) - \frac{c_j^{\pm}\,\Phi_j^{\pm}(x,y)}{k - k_j^{\pm}}\right] = \mu_j^{\pm}(x,y) = i c_j^{\pm}\left(x + 2k_j^{\pm}y + \gamma_j^{\pm}\right)\Phi_j^{\pm}(x,y). \tag{3.31}
\]
Taking the limit $R = \sqrt{x^2 + y^2} \to \infty$ in Eq. (3.31) with the help of Eqs. (3.15) and (3.29) we find that $c_j^{\pm} = -i$. We notice that this constraint does not hold for the problem (2.1) in one dimension, where $c_j^{\pm}$ is related to $\gamma_j^{\pm}$. The data $\{k_j^{\pm}, \gamma_j^{\pm}\}_{j=1}^{n}$ define the discrete spectrum of the problem (3.1) subject to the symmetry constraints,
\[
(k_j^{-})^{*} = k_j^{+}, \qquad (\gamma_j^{-})^{*} = \gamma_j^{+}. \tag{3.32}
\]
The first symmetry constraint can be proved by means of the relation
\[
R_{\pm}(k) \equiv r_{\pm}(k,k) = -r_{\mp}^{*}(k,k), \tag{3.33}
\]
which follows from Eqs. (3.21) and (3.25). The coefficients $R_{+}(k)$ and $R_{-}(k)$ are meromorphic functions in ${\rm Im}(k) \geq 0$ and ${\rm Im}(k) \leq 0$ respectively (see Eqs. (3.18) and (3.27)). Therefore, the symmetry constraint (3.33) implies that the locations of the poles of $R_{+}(k)$ and $R_{-}^{*}(k)$ coincide, i.e. the first relation in Eq. (3.32). The second symmetry constraint in Eq. (3.32) follows from Eqs. (3.70) and (3.71) below. Notice that the coefficients $R_{\pm}(k)$ play now the same role as the coefficients $a_{\pm}^{-1}(k)$ in the problem (2.1) despite the fact that $a_{\pm}(k) = 1$ for the RH problem (3.17).

3.1.4. Embedded eigenvalues. The continuous spectrum in the problem (3.1) has no edge points which separate it from the discrete spectrum. Recall that the problem (2.1) has the edge point at $k = 0$. Indeed, the spectral data $r_{\pm}(k,l)$ are not singular for real $k$ and $l$ in the general case (see Eqs. (3.18) and (3.27)), and the eigenfunctions $M_{\pm}(x,y,k)$ are not growing in $x$ and $y$ as $R = \sqrt{x^2 + y^2} \to \infty$ (see Eq. (3.15)). Still there are special (nongeneric) potentials $u(x,y)$ for which the spectral data become singular at a certain point $k = k_0$ on the real axis.

Definition 3.2. The potential $u(x,y)$ is called nongeneric of type II if there is at least one eigenvalue embedded into the continuous spectrum, i.e. the homogeneous Fredholm’s equations (3.28) exhibit bounded solutions at real $k = k_0$. Otherwise, the potential is called generic of type II.

If the eigenvalue $k = k_0$ is embedded into the continuous spectrum, the eigenfunctions $M_{\pm}(x,y,k)$, $N_{\pm}(x,y,k,l)$, and the spectral data $r_{\pm}(k,l)$ have a resonant pole at $k = k_0$. This pole is produced by the integral part in the solution of the RH problem (3.27). We introduce the singular behavior of $M_{\pm}(x,y,k)$ according to a limiting relation as $k \to k_0$,
\[
M_{\pm}(x,y,k) \to \frac{-i\,\Phi_0^{\pm}(x,y)}{k - (k_0 \mp i0)}, \tag{3.34}
\]


where $\Phi_0^{\pm}(x,y)$ are solutions of Eq. (3.28) for $k = k_0$. It follows from Eq. (3.18) that
\[
r_{\pm}(k,l) \to \frac{r_0^{\pm}(l)}{k - (k_0 \mp i0)} \quad \text{as } k \to k_0, \tag{3.35}
\]
where
\[
r_0^{\pm}(l) = -\frac{1}{2\pi}\iint_{R}dy\,dx\,u(x,y)\,\Phi_0^{\pm}(x,y)\,e^{-i\beta(x,y,k_0,l)}.
\]
We normalize the bound state $\Phi_0^{\pm}(x,y)$ according to the same constraint (3.30) so that $r_0^{\pm}(k_0) = \pm 1$. However, the boundary conditions (3.29) are no longer valid due to the singular line at $x + 2k_0 y = 0$ and the bound states $\Phi_0^{\pm}(x,y)$ are weakly localized as $R = \sqrt{x^2 + y^2} \to \infty$,
\[
\Phi_0^{\pm}(x,y) \to O(R^{-1}) \quad \text{as } x + 2k_0 y \neq 0 \tag{3.36}
\]
and
\[
\Phi_0^{\pm}(x,y) \to O(R^{-1/2}) \quad \text{as } x + 2k_0 y = 0. \tag{3.37}
\]
According to Eqs. (3.22) and (3.25), the eigenfunctions $N_{\pm}(x,y,k,l)$ are not singular as $l \to k_0$ and have the dominant behavior as $k \to k_0$,
\[
N_{\pm}(x,y,k,l) \to \pm\frac{i\,r_0^{\mp *}(l)\,\Phi_0^{\pm}(x,y)}{k - (k_0 \mp i0)}. \tag{3.38}
\]
Using Eqs. (3.27), (3.34), (3.35) and (3.38), we find that the asymptotic expressions are self-consistent provided the following constraints are satisfied,
\[
\Phi_0^{+}(x,y) = -\Phi_0^{-}(x,y) \equiv \Phi_0(x,y) \tag{3.39}
\]
and
\[
\int_{-\infty}^{\infty}dl\,\mathrm{sign}(k_0 - l)\,|r_0^{\pm}(l)|^2 = 0. \tag{3.40}
\]
These constraints can be derived by evaluating the residue contributions at $k = k_0$ in Eq. (3.27) with the help of the formal expansion
\[
\mathrm{sign}(k - l) = \mathrm{sign}(k_0 - l) + 2(k - k_0)\,\delta(k_0 - l) + O(k - k_0)^2. \tag{3.41}
\]

These eigenstates $\Phi_0^{\pm}(x,y)$ are called half-bound states since they are weakly localized as $R \to \infty$ and their spectral data consist only of the embedded eigenvalue $k_0$. Embedded eigenvalues and half-bound states are structurally unstable under a perturbation of the potential according to the theory of quantum resonances [43,44]. Therefore, we expect that the perturbation leads either to disappearance of the embedded eigenvalues at $k = k_0$ or to their emergence into the complex domain as true eigenvalues. This is the type II bifurcation analyzed in Sect. 3.4.


3.2. Spectral decompositions. Here we study the spectral decomposition based on the eigenfunctions of the problem (3.1) for a potential $u(x,y)$. Our analysis is not affected by the presence of embedded eigenvalues. The only assumption required for the potential $u(x,y)$ is that it does not support multiple poles in the expansion (3.27). Non-standard orthogonality and completeness relations for the eigenfunctions of Eq. (3.1) are obtained in Sect. 3.2.1. Additional integral relations for the data $\gamma_j^{\pm}$ of the discrete spectrum are derived in Sect. 3.2.2.

3.2.1. Scalar products, orthogonality and completeness relations. The eigenfunctions $M_{\pm}(x,y,k)$ are characterized through the sets of eigenfunctions $S^{\pm} = [N_{\pm}(x,y,k,l), \{\Phi_j^{+}(x,y), \Phi_j^{-}(x,y)\}_{j=1}^{n}]$ by means of Eq. (3.27). The spectral data $r_{\pm}(k,l)$ and $\{k_j^{\pm}, \gamma_j^{\pm}\}_{j=1}^{n}$ are defined by the sets $S^{\pm}$ through Eqs. (3.20), (3.25), (3.30) and (3.31) (see also the additional Eq. (3.70) below). The potential $u(x,y)$ is related to the sets $S^{\pm}$ as follows [6],
\[
\frac{1}{2}\left(\int_{-\infty}^{x} - \int_{x}^{\infty}\right)u(x',y)\,dx' = \frac{1}{\pi}\int_{-\infty}^{\infty}dk\left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right)dl\,r_{\mp}(k,l)\,N_{\pm}(x,y,k,l) + 2\sum_{j=1}^{n}\left[\Phi_j^{+}(x,y) + \Phi_j^{-}(x,y)\right]. \tag{3.42}
\]

This formula results from Eqs. (3.16) and (3.27) in the limit $k \to \infty$. Thus, the scheme for closure of the integral transform holds for the sets $S^{\pm}$ and we state the following main result.

Proposition 3.3. An arbitrary scalar function $f(x,y)$ with the boundary conditions $\lim_{x \to \pm\infty}f(x,y) = f_{\pm}(y)$ can be decomposed through any of the orthogonal and complete sets of eigenfunctions $S^{\pm}$ if $f_{+}(y) + f_{-}(y) = 0$.

The proof of this proposition is based on two lemmas.

Lemma 3.4. The eigenfunctions $N_{\pm}(x,y,k,l)$ and $\{\Phi_j^{+}(x,y), \Phi_j^{-}(x,y)\}_{j=1}^{n}$ introduced in Sects. 3.1.2 and 3.1.3 satisfy the orthogonality relations,
\[
\langle N_{\pm}(k',l')|N_{\pm}(k,l)\rangle = 2\pi^2 i\,\mathrm{sign}(k - l)\,\delta(k - k')\,\delta(l - l'), \tag{3.43}
\]
\[
\langle\Phi_j^{\pm}|N_{\pm}(k,l)\rangle = \langle N_{\pm}(k,l)|\Phi_j^{\pm}\rangle = \langle\Phi_j^{\mp}|N_{\pm}(k,l)\rangle = \langle N_{\pm}(k,l)|\Phi_j^{\mp}\rangle = 0, \tag{3.44}
\]
\[
\langle\Phi_l^{\pm}|\Phi_j^{\pm}\rangle = 0, \qquad \langle\Phi_l^{\mp}|\Phi_j^{\pm}\rangle = \pm\pi\,\delta_{jl}, \tag{3.45}
\]
where the scalar product is given by
\[
\langle g(k',l')|h(k,l)\rangle = \iint_{R}dy\,dx\,g^{*}(x,y,k',l')\,\partial_x h(x,y,k,l). \tag{3.46}
\]


Proof. First, we derive a balance equation for two solutions h(k, l) and g(k 0 , l 0 ) of Eq. (3.1) with a real potential u(x, y), i

∂ ∂ g ∗ (k 0 , l 0 )h(k, l) + g ∗ (k 0 , l 0 )hx (k, l) − gx∗ (k 0 , l 0 )h(k, l) ∂y ∂x (3.47) −2ik 0 g ∗ (k 0 , l 0 )h(k, l) = 2i(k − k 0 )g ∗ (k 0 , l 0 )hx (k, l).

We integrate this equation for h = N± (x, y, k, l) and g ∗ = N±∗ (x, y, k 0 , l 0 ) over x and then over y. Using Eqs. (2.37) and (3.15), we derive the relation, 1 − lim lim hN± (k 0 , l 0 )|N± (k, l)i = 2(k−k 0 ) y→∞ y→−∞ Z ∞ dxN±∗ (x, y, k 0 , l 0 )N± (x, y, k, l) (3.48) · −∞

+ 2π 2 i

(k−k 0 −l)2 −l 02 δ(k−k 0 −l + l 0 )δ(k 2 −l 2 −k 02 + l 02 ). k−k 0

We substitute the integral representation (3.14) to evaluate the first term in Eq. (3.48) and integrate the Green’s functions according to Eq. (3.4). Then, the relation (3.48) reduces to the formula, hN± (k 0 , l 0 )|N± (k, l)i = 4π 2 i(k − l)δ(k − l − k 0 + l 0 )δ(k 2 − l 2 − k 02 + l 02 ) ∓ π 2 iδ(k − k 0 )R± ,

(3.49)

where R± = T± (k, l, l

0

) + T±∗ (k, l 0 , l) ∓

Z

k

−∞

Z −

k

∞

dpT± (k, l, p)T±∗ (k, l 0 , p)

and T± (k, l, p) is given by Eq. (3.24). In the derivation of Eq. (3.49) we have supposed that k 6 = l and k 0 6 = l 0 , i.e. the eigenfunction N± (x, y, k, l) is not degenerate [cf. Eq. (3.21)]. Under these conditions, zeros of both δ-functions in Eq. (3.49) occur only for k = k 0 and l = l 0 . Therefore, we simplify Eq. (3.49) by using the following formulas, αδ(αx) = sign(α)δ(x), 2δ(x + y)δ(x − y) = δ(x)δ(y).

(3.50)

Then, Eq. (3.49) reduces to Eq. (3.43) provided $R_\pm = 0$. The latter identity follows from the relation (3.26) and the explicit expressions (3.23) and (3.24). The zero scalar products (3.44) and (3.45) can also be found from Eq. (3.47) for bound states. In order to find the nonzero inner products in Eq. (3.45), we integrate Eq. (3.47) for $h = M_\pm(x,y,k)$ and $g^* = \Phi_j^{\mp *}(x,y)$ over $x$ and then over $y$ and use the boundary conditions (3.15) and (3.29). As a result, we derive the integral relation,

$$ 2i(k-k_j^\pm)\iint_R dy\,dx\,\Phi_j^{\mp *}(x,y)\,\partial_x M_\pm(x,y,k) = \pm 2\pi. \tag{3.51} $$

This relation reduces to Eq. (3.45) after substitution of Eq. (3.27) for $M_\pm(x,y,k)$ and use of the zero scalar products (3.44) and (3.45). □
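The two identities (3.50) used in the proof above can be illustrated with a regularised delta function. The sketch below is not part of the original derivation: it assumes a Gaussian nascent delta function of width `eps` (the function names are illustrative), for which both identities hold exactly at finite width, with the width rescaled on the right-hand side.

```python
import math


def gauss(s, eps):
    """Normalised Gaussian of width eps, used here as a nascent delta function."""
    return math.exp(-s * s / (2.0 * eps * eps)) / (math.sqrt(2.0 * math.pi) * eps)


def lhs_scaling(alpha, x, eps):
    # alpha * delta_eps(alpha * x)
    return alpha * gauss(alpha * x, eps)


def rhs_scaling(alpha, x, eps):
    # sign(alpha) * delta_{eps/|alpha|}(x): same identity with a rescaled width
    return math.copysign(1.0, alpha) * gauss(x, eps / abs(alpha))


def lhs_product(x, y, eps):
    # 2 * delta_eps(x + y) * delta_eps(x - y)
    return 2.0 * gauss(x + y, eps) * gauss(x - y, eps)


def rhs_product(x, y, eps):
    # delta_w(x) * delta_w(y) with w = eps / sqrt(2)
    w = eps / math.sqrt(2.0)
    return gauss(x, w) * gauss(y, w)
```

Both sides agree pointwise for any finite `eps`, so the identities (3.50) follow in the limit of vanishing width.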


We notice that Boiti et al. [23] used different scalar products for the orthogonality relations,

$$ \frac{1}{2\pi}\int_{-\infty}^{\infty} dx\, M_\mp^*(x,y,l)\,M_\pm(x,y,k)\,e^{i\beta(x,y,l,k)} = \delta(l-k). \tag{3.52} $$

These products generalize the results of the time-independent Schrödinger equation and are independent of $y$ for solutions of Eq. (3.1). However, an arbitrary scalar function in two dimensions cannot be decomposed through the eigenfunctions $M_\pm(x,y,k)\,e^{-i(kx+k^2y)}$, which depend on one spectral parameter. Note that, in the time-dependent problem, the eigenfunctions $N_\pm(x,y,k,l)$ and $N_\pm(x,y,k',l')$ are orthogonal while $N_\mp(x,y,k,l)$ and $N_\pm(x,y,k',l')$ are not (the time-independent problem has the opposite property, see Eq. (2.34)).

Lemma 3.5. The eigenfunctions $N_\pm(x,y,k,l)$ and $\{\Phi_j^+(x,y), \Phi_j^-(x,y)\}_{j=1}^{n}$ satisfy the completeness relation,

$$ \frac{1}{2}\,\mathrm{sign}(x-x')\,\delta(y-y') = \frac{1}{2\pi^2 i}\iint_D dk\,dl\, N_\pm^*(x',y',k,l)\,N_\pm(x,y,k,l) + \frac{1}{\pi}\sum_{j=1}^{n}\left[\Phi_j^{-*}(x',y')\,\Phi_j^{+}(x,y) - \Phi_j^{+*}(x',y')\,\Phi_j^{-}(x,y)\right], \tag{3.53} $$

where we have used the notation,

$$ \iint_D dk\,dl \equiv \int_{-\infty}^{\infty} dk \left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right) dl. $$

Proof. We start by transforming Eq. (3.19) to the form,

$$ \frac{\partial}{\partial k}\left[N_\pm^*(x',y',k,l)\,N_\pm(x,y,k,l)\right] = i\left[x-x'+2k(y-y')\right]N_\pm^*(x',y',k,l)\,N_\pm(x,y,k,l) \pm \left[M_\pm(x,y,k)\,r_\mp(k,l)\,N_\pm^*(x',y',k,l) + M_\pm^*(x',y',k)\,r_\mp^*(k,l)\,N_\pm(x,y,k,l)\right], $$

where we have used Eq. (3.25). Multiplying this equation by $\mathrm{sign}(k-l)$ and integrating over $l$, we derive the expression,

$$ \frac{\partial W(k)}{\partial k} = i\left[x-x'+2k(y-y')\right]W(k) + M_-^*(x',y',k)\,M_+(x,y,k) + M_+^*(x',y',k)\,M_-(x,y,k), \tag{3.54} $$

where

$$ W(k) = \left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right) dl\, N_\pm^*(x',y',k,l)\,N_\pm(x,y,k,l). $$

The functions $M_-^*(x',y',k)M_+(x,y,k)$ and $M_+^*(x',y',k)M_-(x,y,k)$ are meromorphic in $\mathrm{Im}(k) \geq 0$ and $\mathrm{Im}(k) \leq 0$ respectively. We apply the Plemelj formula (see Appendix A1 in [6]) to reconstruct their sum from a given jump at $\mathrm{Im}(k) = 0$. Evaluating the pole contribution according to Eqs. (3.31) and (3.32), we derive the following representation,

$$ \frac{1}{2}\left[M_-^*(x',y',k)\,M_+(x,y,k) + M_+^*(x',y',k)\,M_-(x,y,k)\right] = R(k) + \frac{1}{2}\left[\Delta^{+}(k) - \Delta^{-}(k)\right], \tag{3.55} $$

where

$$ R(k) = 1 + \sum_{j=1}^{n}\Phi_j^{+*}(x',y')\,\Phi_j^{-}(x,y)\left[\frac{i\left(x-x'+2k_j^-(y-y')\right)}{k-k_j^-} + \frac{1}{(k-k_j^-)^2}\right] + \sum_{j=1}^{n}\Phi_j^{-*}(x',y')\,\Phi_j^{+}(x,y)\left[\frac{i\left(x-x'+2k_j^+(y-y')\right)}{k-k_j^+} + \frac{1}{(k-k_j^+)^2}\right] \tag{3.56} $$

and

$$ \Delta^{\pm}(k) = \pm\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{dk'}{k'-(k\pm i0)}\left[M_-^*(x',y',k')\,M_+(x,y,k') - M_+^*(x',y',k')\,M_-(x,y,k')\right]. \tag{3.57} $$

The functions $\Delta^{+}(k)$ and $\Delta^{-}(k)$ represent the boundary values at real $k$ of functions analytic in the upper and lower half-plane of $k$, respectively, subject to the boundary conditions in the limit $k \to \infty \pm i0$,

$$ \Delta^{\pm}(k) \to \pm\frac{\Delta^{\pm}_{\infty}}{k} + O(k^{-2}), \tag{3.58} $$

where $\Delta^{\pm}_{\infty}$ are given by

$$ \Delta^{\pm}_{\infty} = -\frac{1}{2\pi i}\int_{-\infty}^{\infty} dk\left[M_-^*(x',y',k)\,M_+(x,y,k) - M_+^*(x',y',k)\,M_-(x,y,k)\right] \pm \frac{y-y'}{4\pi}\int_{-\infty}^{\infty}\frac{dy''}{(y''-y)(y''-y')}\int_{-\infty}^{\infty} dx''\, u(x'',y''). \tag{3.59} $$

Solving Eq. (3.54) as a differential equation in $k$, we derive explicitly,

$$ \int_{-\infty}^{\infty} W(k)\,dk = \int_{-\infty}^{\infty} W^{+}(k)\,dk + \int_{-\infty}^{\infty} W^{-}(k)\,dk + \iint_D dk\,dl\, R(l)\,e^{i\beta(x-x',y-y',k,l)}, \tag{3.60} $$

where the functions $W^{\pm}(k)$ solve the differential equations,

$$ \frac{\partial W^{\pm}(k)}{\partial k} = i\left[x-x'+2k(y-y')\right]W^{\pm}(k) \pm \Delta^{\pm}(k). \tag{3.61} $$


In order to evaluate the last integral in Eq. (3.60) from Eq. (3.56), we transform the variables,

$$ k = \frac{1}{2}(p+\kappa), \qquad l = \frac{1}{2}(p-\kappa), $$

and integrate first over $p$ and then over $\kappa$ with the use of the residue theorem. Then, Eq. (3.60) reproduces exactly the completeness relation (3.53) subject to the following constraint,

$$ \int_{-\infty}^{\infty} W^{+}(k)\,dk + \int_{-\infty}^{\infty} W^{-}(k)\,dk = 0. \tag{3.62} $$

Now we show that this constraint is satisfied for the functions $W^{\pm}(k)$ defined by Eqs. (3.61). Since the right-hand sides $\Delta^{\pm}(k)$ are analytic functions in the upper/lower half-planes of $k$, the functions $W^{\pm}(k)$ solving Eq. (3.61) can be analytically continued in the corresponding domains of $k$ subject to the constraint,

$$ (y-y')\int_{-\infty}^{\infty} W^{\pm}(k)\,dk = 0. \tag{3.63} $$

In the case $y \neq y'$, the boundary conditions for $W^{\pm}(k)$ follow from Eqs. (3.58) and (3.61) as $W^{\pm}(k) \sim O(k^{-2})$. Therefore, the constraints (3.62) and (3.63) are satisfied. In the case $y = y'$, the constraint (3.63) is still met and the functions $W^{\pm}(k)$ have the boundary conditions as $k \to \infty \pm i0$,

$$ W^{\pm}(k) \to \frac{i\,\Delta^{\pm}_{\infty}}{k\,(x-x')} + O(k^{-2}). $$

As a result, we find explicitly that

$$ \int_{-\infty}^{\infty} W^{\pm}(k)\,dk = \pm\frac{\pi\,\Delta^{\pm}_{\infty}}{x-x'}. $$

However, it follows from Eq. (3.59) that $\Delta^{+}_{\infty} = \Delta^{-}_{\infty}$ when $y = y'$ and, therefore, the constraint (3.62) is satisfied. □
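The last step of the proof, for $y = y'$, rests on the symmetric integral of a function that is analytic in a half-plane and decays like $c/k$. A minimal numerical sketch follows, assuming the simplest model functions with this behaviour, $c/(k+i)$ and $c/(k-i)$ with $c = i\Delta_\infty/(x-x')$; these model functions, the helper names and the cut-off `radius` are illustrative choices, not the actual $W^{\pm}(k)$ of the lemma.

```python
import math


def symmetric_integral(f, radius, n):
    """Midpoint rule for the integral of f over the symmetric interval [-radius, radius]."""
    h = 2.0 * radius / n
    total = 0.0 + 0.0j
    for m in range(n):
        k = -radius + (m + 0.5) * h
        total += f(k)
    return total * h


def tail_integrals(delta_inf, dx, radius=1.0e3, n=200_000):
    """Integrals of two model functions with the large-k tail i*delta_inf/(k*dx)."""
    c = 1j * delta_inf / dx

    def w_plus(k):
        # pole at k = -i, so the function is analytic in the upper half-plane
        return c / (k + 1j)

    def w_minus(k):
        # pole at k = +i, so the function is analytic in the lower half-plane
        return c / (k - 1j)

    return (symmetric_integral(w_plus, radius, n),
            symmetric_integral(w_minus, radius, n))
```

For these model functions the two integrals approach $+\pi\Delta_\infty/(x-x')$ and $-\pi\Delta_\infty/(x-x')$ as the cut-off grows, so their sum vanishes when the same $\Delta_\infty$ enters both, which is the mechanism behind the constraint (3.62).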

Proof of Proposition 3.3. We decompose a scalar function $f(x,y)$ in the form,

$$ f(x,y) = \frac{1}{2}\left(f_+(y) + f_-(y)\right) + \iint_D dk\,dl\, \alpha_\pm(k,l)\,N_\pm(x,y,k,l) + \sum_{j=1}^{n}\left[\alpha_j^{+}\Phi_j^{+}(x,y) + \alpha_j^{-}\Phi_j^{-}(x,y)\right]. \tag{3.64} $$

The coefficients of the expansion can be expressed through the derivative $f_x(x,y)$ according to Eqs. (3.43)–(3.45),

$$ \alpha_\pm(k,l) = \frac{\langle N_\pm(k,l)|f\rangle}{2\pi^2 i}, \qquad \alpha_j^{\pm} = \pm\frac{\langle \Phi_j^{\mp}|f\rangle}{\pi}. $$

Then, Eq. (3.64) reduces to an identity by means of Eq. (3.53). □

We conclude that the relation (3.42) for the inverse scattering transform is a particular application of Eq. (3.64). The coefficients $r_\pm(l,k)$ play the role of Fourier coefficients and they can be found from Eqs. (3.42) and (3.43) in the form (3.20) and (3.25). The coefficients $c_j^{\pm}$ for the discrete spectrum are all fixed, $c_j^{\pm} = -i$, due to the renormalization conditions (3.30). These conditions are consistent with Eqs. (3.42) and (3.45). Notice that the formula (3.42) gives a nontrivial limit at $x \to \pm\infty$ even in the case when the constraint (3.2) does not hold. Indeed, integrating Eq. (3.42) over $y$ and then taking the limit $x \to \infty$, we derive the explicit expression,

$$ \iint_R dy\,dx\, u(x,y) = 2\pi i\,R_\pm(0) - \iint_D dk\,dl\,\frac{|r_\pm(k,l)|^2}{k \mp i0} + \pi i\sum_{j=1}^{n}\frac{k_j^{-}-k_j^{+}}{k_j^{+}k_j^{-}}, \tag{3.65} $$

where we have used the relations following from Eqs. (3.4), (3.14) and (3.28),

$$ \left[\lim_{x\to\infty} - \lim_{x\to-\infty}\right]\int_{-\infty}^{\infty} dy\, N_\pm(x,y,k,l) = 2\pi^2 i\,\mathrm{sign}(k-l)\,\delta(k)\,\delta(l) - \frac{\pi\,r_\mp^*(k,l)}{k \pm i0} \tag{3.66} $$

and

$$ \left[\lim_{x\to\infty} - \lim_{x\to-\infty}\right]\int_{-\infty}^{\infty} dy\, \Phi_j^{\pm}(x,y) = \pm\frac{\pi i}{k_j^{\pm}}. \tag{3.67} $$

The relation (3.65) can also be derived from Eq. (3.27). We notice that the spectral decomposition gives an explicit value for the mass integral (3.65) provided the order of integration is specified according to Eq. (3.12).

3.2.2. Characterization of the data of the discrete spectrum. Here we use the orthogonality relations (3.43)–(3.45) and find an integral representation for the parameters $\gamma_j^{\pm}$ of the bound states. First, it follows from Eqs. (3.27) and (3.31) that the functions $\Phi_j^{\pm}(x,y)$ satisfy the system of algebraic equations,

$$ (x + 2k_j^{\pm}y + \gamma_j^{\pm})\,\Phi_j^{\pm}(x,y) = 1 - i\,{\sum_{l=1}^{n}}'\left[\frac{\Phi_l^{+}(x,y)}{k_j^{\pm}-k_l^{+}} + \frac{\Phi_l^{-}(x,y)}{k_j^{\pm}-k_l^{-}}\right] - \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{dk}{k-k_j^{\pm}}\left(\int_{-\infty}^{k} - \int_{k}^{\infty}\right) dl\, r_{-\sigma}(k,l)\,N_{+\sigma}(x,y,k,l), \tag{3.68} $$

where $\sigma = +1$ or $\sigma = -1$ and $\sum'$ stands for the sum without the singular term at $k_l^{\pm} = k_j^{\pm}$. Equation (3.68) can be viewed as a spectral decomposition of the functions $\mu_j^{\pm}(x,y)$ defined by Eq. (3.31) through the complete sets $S^{\pm}$. It follows from Eq. (3.68) and Eqs. (3.43)–(3.45) that

$$ \iint_R dy\,dx\, \Phi_j^{\mp *}(x,y)\left[(x + 2k_j^{\pm}y + \gamma_j^{\pm})\,\Phi_j^{\pm}(x,y)\right]_x = 0. \tag{3.69} $$

As a result, the spectral data $\gamma_j^{\pm}$ can be expressed from Eqs. (3.45) and (3.69) as

$$ \gamma_j^{\pm} = \mp\frac{1}{\pi}\iint_R dy\,dx\, \Phi_j^{\mp *}(x,y)\,(x + 2k_j^{\pm}y)\,\Phi_{jx}^{\pm}(x,y), \tag{3.70} $$

subject to the constraint,

$$ \int_{-\infty}^{\infty} dx\, \Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y) = 0. \tag{3.71} $$

This constraint can be derived by integrating Eq. (3.47) for $h = \Phi_j^{\pm}(x,y)$ and $g^* = \Phi_j^{\mp *}(x,y)$ over $x$ subject to the zero boundary conditions as $y \to \infty$ (see Eq. (3.29)). We notice that Eqs. (3.70) and (3.71) imply the symmetry constraint $\gamma_j^{-} = (\gamma_j^{+})^*$, i.e. the second relation in Eq. (3.32). We apply the orthogonal and complete sets of eigenfunctions $S^{\pm}$ to study perturbations of the potential and variation of the spectral data for Eq. (3.1).
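As a concrete illustration of Eqs. (3.68) and (3.71), one can take the single-lump eigenfunctions quoted later in Eqs. (3.96)–(3.97) (the case $n = 1$ with no continuous-spectrum contribution away from $l = k$) and check both relations numerically. The sketch below is only a consistency check under those assumptions; the function names and the substitution $x = \tan(s)/(2\kappa_1)$ used for the quadrature are illustrative choices.

```python
import math


def lump_eigenfunctions(x, y, kappa, p=0.0, x0=0.0, y0=0.0):
    """Phi_1^+ and Phi_1^- as quoted in Eqs. (3.96)-(3.97)."""
    X = x - x0 + 2.0 * p * (y - y0)
    Y = y - y0
    a = 2.0 * kappa * X
    b = 4.0 * kappa**2 * Y
    D = a * a + b * b + 1.0
    phi_plus = 2.0 * kappa * (a - 1j * b - 1.0) / D
    phi_minus = 2.0 * kappa * (a + 1j * b + 1.0) / D
    return phi_plus, phi_minus


def residual_368(x, y, kappa, p=0.0, x0=0.0, y0=0.0):
    """Residuals of Eq. (3.68) for n = 1 when the integral term is dropped."""
    k_plus = p + 1j * kappa
    k_minus = p - 1j * kappa
    gamma_plus = -x0 - 2.0 * k_plus * y0
    gamma_minus = -x0 - 2.0 * k_minus * y0
    phi_plus, phi_minus = lump_eigenfunctions(x, y, kappa, p, x0, y0)
    res_plus = ((x + 2.0 * k_plus * y + gamma_plus) * phi_plus
                - (1.0 - 1j * phi_minus / (k_plus - k_minus)))
    res_minus = ((x + 2.0 * k_minus * y + gamma_minus) * phi_minus
                 - (1.0 - 1j * phi_plus / (k_minus - k_plus)))
    return res_plus, res_minus


def constraint_371(y, kappa, n=4000):
    """Midpoint estimate of the x-integral in Eq. (3.71), upper sign, p = x0 = y0 = 0."""
    h = math.pi / n
    total = 0j
    for m in range(n):
        s = -0.5 * math.pi + (m + 0.5) * h
        x = math.tan(s) / (2.0 * kappa)
        phi_plus, phi_minus = lump_eigenfunctions(x, y, kappa)
        jacobian = 1.0 / (2.0 * kappa * math.cos(s) ** 2)  # dx/ds
        total += phi_minus.conjugate() * phi_plus * jacobian
    return total * h
```

Both residuals of Eq. (3.68) vanish to rounding accuracy at arbitrary points, and the integral in Eq. (3.71) vanishes for each fixed $y$.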

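3.3. Perturbation theory. Suppose that the potential can be decomposed as $u^{\epsilon} = u(x,y) + \epsilon\,\Delta u(x,y)$, where $\epsilon \ll 1$. We assume that the potential $u(x,y)$ supports the complete sets of eigenfunctions $S^{\pm} = \left[N_\pm(x,y,k,l), \{\Phi_j^{+}(x,y), \Phi_j^{-}(x,y)\}_{j=1}^{n}\right]$. Also we assume that the perturbation $\Delta u(x,y) \sim O(1)$ as $\epsilon \to 0$. Here we evaluate variations of the spectral data due to the perturbation $\epsilon\,\Delta u(x,y)$.

3.3.1. Variations of data of discrete spectrum. Suppose that $\Phi_j^{\pm\epsilon}(x,y)$ solves Eq. (3.1) for $u^{\epsilon} = u(x,y) + \epsilon\,\Delta u(x,y)$ with the eigenvalue $k = k_j^{\pm\epsilon}$. We expand $\Phi_j^{\pm\epsilon}(x,y)$ through the sets $S^{\pm}$ according to Eq. (3.64) rewritten as

$$ \Phi_j^{\pm\epsilon}(x,y) = \iint_D dk\,dl\,\frac{\alpha_\pm(k,l)\,N_\pm(x,y,k,l)}{4\pi^2(k-k_j^{\pm\epsilon})} \pm \frac{1}{2\pi i}\sum_{l=1}^{n}\left[\frac{\alpha_l^{\pm}\,\Phi_l^{\pm}(x,y)}{k_j^{\pm\epsilon}-k_l^{\pm}} - \frac{\tilde\alpha_l^{\mp}\,\Phi_l^{\mp}(x,y)}{k_j^{\pm\epsilon}-k_l^{\mp}}\right]. \tag{3.72} $$

The eigenvalue problem (3.1) reduces with the help of Eqs. (3.43)–(3.45) and (3.72) to a set of homogeneous integral equations,

$$ \alpha_\pm(k,l) = \epsilon\left[\iint_D dk'\,dl'\,\frac{K_\pm(k,k',l,l')\,\alpha_\pm(k',l')}{4\pi^2(k'-k_j^{\pm\epsilon})} \pm \frac{1}{2\pi i}\sum_{l=1}^{n}\left(\frac{K_{\pm l}(k,l)\,\alpha_l^{\pm}}{k_j^{\pm\epsilon}-k_l^{\pm}} - \frac{\tilde K_{\mp l}(k,l)\,\tilde\alpha_l^{\mp}}{k_j^{\pm\epsilon}-k_l^{\mp}}\right)\right], \tag{3.73} $$

$$ \alpha_l^{\pm} = \epsilon\left[\iint_D dk\,dl\,\frac{\tilde K_{\mp l}^*(k,l)\,\alpha_\pm(k,l)}{4\pi^2(k-k_j^{\pm\epsilon})} \pm \frac{1}{2\pi i}\sum_{m=1}^{n}\left(\frac{K_{\pm lm}\,\alpha_m^{\pm}}{k_j^{\pm\epsilon}-k_m^{\pm}} - \frac{\tilde K_{\mp lm}\,\tilde\alpha_m^{\mp}}{k_j^{\pm\epsilon}-k_m^{\mp}}\right)\right], \tag{3.74} $$

$$ \tilde\alpha_l^{\mp} = \epsilon\left[\iint_D dk\,dl\,\frac{K_{\pm l}^*(k,l)\,\alpha_\pm(k,l)}{4\pi^2(k-k_j^{\pm\epsilon})} \pm \frac{1}{2\pi i}\sum_{m=1}^{n}\left(\frac{\tilde K_{\pm lm}\,\alpha_m^{\pm}}{k_j^{\pm\epsilon}-k_m^{\pm}} - \frac{K_{\mp lm}\,\tilde\alpha_m^{\mp}}{k_j^{\pm\epsilon}-k_m^{\mp}}\right)\right], \tag{3.75} $$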


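where the integral elements are

$$ K_\pm(k,k',l,l') = \iint_R dy\,dx\,\Delta u(x,y)\,N_\pm^*(x,y,k,l)\,N_\pm(x,y,k',l'), $$

$$ K_{\pm j}(k,l) = \iint_R dy\,dx\,\Delta u(x,y)\,N_\pm^*(x,y,k,l)\,\Phi_j^{\pm}(x,y), \qquad \tilde K_{\pm j}(k,l) = \iint_R dy\,dx\,\Delta u(x,y)\,N_\mp^*(x,y,k,l)\,\Phi_j^{\pm}(x,y), $$

$$ K_{\pm jl} = \iint_R dy\,dx\,\Delta u(x,y)\,\Phi_j^{\mp *}(x,y)\,\Phi_l^{\pm}(x,y), \qquad \tilde K_{\pm jl} = \iint_R dy\,dx\,\Delta u(x,y)\,\Phi_j^{\pm *}(x,y)\,\Phi_l^{\pm}(x,y). $$

The results of the asymptotic analysis of Eqs. (3.73)–(3.75) in the limit $\epsilon \to 0$ are summarized in the following proposition.

Proposition 3.6. Variational derivatives of the data $\{k_j^{\pm}, \gamma_j^{\pm}\}_{j=1}^{n}$ of the discrete spectrum of Eq. (3.1) with respect to the potential $u(x,y)$ are defined by

$$ \frac{\delta k_j^{\pm}}{\delta u(x,y)} = \pm\frac{\Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y)}{2\pi i}, \tag{3.76} $$

$$ \frac{\delta\gamma_j^{\pm}}{\delta u(x,y)} = \mp\frac{y\,\Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y)}{\pi i} \pm \frac{1}{2\pi}{\sum_{l=1}^{n}}'\left[\frac{\Phi_j^{\mp *}\Phi_l^{\pm} - \Phi_j^{\pm}\Phi_l^{\mp *}}{(k_j^{\pm}-k_l^{\pm})^2} + \frac{\Phi_j^{\mp *}\Phi_l^{\mp} - \Phi_j^{\pm}\Phi_l^{\pm *}}{(k_j^{\pm}-k_l^{\mp})^2}\right] \pm \iint_D dk\,dl\,\frac{r_\pm(k,l)\,N_\mp(x,y,k,l)\,\Phi_j^{\mp *}(x,y) - r_\mp^*(k,l)\,N_\pm^*(x,y,k,l)\,\Phi_j^{\pm}(x,y)}{4\pi^2(k-k_j^{\pm})^2}, \tag{3.77} $$

where $\sum'_l$ stands for the sum excluding the singular term at $k_l^{\pm} = k_j^{\pm}$.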
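Formula (3.76) can be sanity-checked on the single-lump potential of Sect. 3.3.3, where $k_1^{\pm} = p_1 \pm i\kappa_1$. If the perturbation is taken to be the change of the lump under a small change of $\kappa_1$, i.e. $\Delta u = \partial u_s/\partial\kappa_1$, then the first-order shift of $k_1^{+}$ should be $dk_1^{+}/d\kappa_1 = i$. The sketch below evaluates $\Delta k_1^{+} = (2\pi i)^{-1}\iint \Delta u\,\Phi_1^{-*}\Phi_1^{+}$ numerically under these assumptions ($p_1 = x_0 = y_0 = 0$, eigenfunctions as quoted in Eqs. (3.96)–(3.97), a midpoint rule after a tangent substitution); the names and discretisation parameters are illustrative.

```python
import math


def lump_fields(X, Y, kappa):
    """u_s = d w_s / dX from Eq. (3.98) and the eigenfunctions of Eqs. (3.96)-(3.97)."""
    a = 2.0 * kappa * X
    b = 4.0 * kappa**2 * Y
    D = a * a + b * b + 1.0
    u = 16.0 * kappa**2 * (b * b + 1.0 - a * a) / D**2
    phi_plus = 2.0 * kappa * (a - 1j * b - 1.0) / D
    phi_minus = 2.0 * kappa * (a + 1j * b + 1.0) / D
    return u, phi_plus, phi_minus


def eigenvalue_shift(kappa, n=200, dkappa=1.0e-4):
    """First-order shift of k_1^+ for Delta u = d u_s / d kappa, following Eq. (3.76)."""
    h = math.pi / n
    total = 0j
    for i in range(n):
        s = -0.5 * math.pi + (i + 0.5) * h
        X = math.tan(s) / (2.0 * kappa)
        for j in range(n):
            t = -0.5 * math.pi + (j + 0.5) * h
            Y = math.tan(t) / (4.0 * kappa**2)
            # central difference in kappa at fixed (X, Y)
            u_up, _, _ = lump_fields(X, Y, kappa + dkappa)
            u_dn, _, _ = lump_fields(X, Y, kappa - dkappa)
            delta_u = (u_up - u_dn) / (2.0 * dkappa)
            _, phi_plus, phi_minus = lump_fields(X, Y, kappa)
            # Jacobian of (s, t) -> (X, Y)
            jacobian = 1.0 / (8.0 * kappa**3 * math.cos(s) ** 2 * math.cos(t) ** 2)
            total += delta_u * phi_minus.conjugate() * phi_plus * jacobian
    k_11 = total * h * h          # matrix element of Delta u between Phi_1^- and Phi_1^+
    return k_11 / (2j * math.pi)  # upper sign of Eq. (3.76)
```

Within the quadrature error the result is purely imaginary and close to $i$, consistent with $k_1^{+} = p_1 + i\kappa_1$.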

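Proof. The self-consistency condition of Eq. (3.74) at $l = j$ defines the expansion of the eigenvalue $k_j^{\pm\epsilon}$ as $k_j^{\pm\epsilon} = k_j^{\pm} + \epsilon\,\Delta k_j^{\pm} + O(\epsilon^2)$, where

$$ \Delta k_j^{\pm} = \pm\frac{K_{\pm jj}}{2\pi i}. \tag{3.78} $$

This formula is equivalent to Eq. (3.76). We notice that the symmetry $k_j^{-} = (k_j^{+})^*$ is preserved in the perturbation theory for real $\Delta u(x,y)$. The set of integral equations is solved at the leading order as

$$ \alpha_\pm(k,l) = \epsilon K_{\pm j}(k,l) + O(\epsilon^2), \qquad \alpha_l^{\pm} = \epsilon K_{\pm lj} + O(\epsilon^2), \qquad \tilde\alpha_l^{\mp} = \epsilon\tilde K_{\pm lj} + O(\epsilon^2). $$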


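This solution defines a perturbation to the bound state, $\Phi_j^{\pm\epsilon}(x,y) = \Phi_j^{\pm}(x,y) + \epsilon\,\Delta\Phi_j^{\pm}(x,y) + O(\epsilon^2)$, in the form,

$$ \Delta\Phi_j^{\pm}(x,y) = \Delta\alpha_j^{\pm}\,\Phi_j^{\pm}(x,y) \pm \frac{1}{2\pi i}{\sum_{l=1}^{n}}'\left[\frac{K_{\pm lj}\,\Phi_l^{\pm}(x,y)}{k_j^{\pm}-k_l^{\pm}} - \frac{\tilde K_{\pm lj}\,\Phi_l^{\mp}(x,y)}{k_j^{\pm}-k_l^{\mp}}\right] + \iint_D dk\,dl\,\frac{K_{\pm j}(k,l)\,N_\pm(x,y,k,l)}{4\pi^2(k-k_j^{\pm})}, \tag{3.79} $$

where the coefficient $\Delta\alpha_j^{\pm}$ is expressed through the corrections of $\alpha_j^{\pm}$ and $k_j^{\pm}$. This coefficient should be specified by normalizing the bound state $\Phi_j^{\pm\epsilon}(x,y)$ as $R = \sqrt{x^2+y^2} \to \infty$ according to Eq. (3.29) or, equivalently, Eq. (3.30). The latter constraint can be expanded to the first order in $\epsilon$ as

$$ \mp\frac{1}{2\pi}\iint_R dy\,dx\left[u(x,y)\,\Delta\Phi_j^{\pm}(x,y) + \Delta u(x,y)\,\Phi_j^{\pm}(x,y)\right] = 0. \tag{3.80} $$

We prove that this integral equation defines the correction $\Delta\alpha_j^{\pm}$ in the form,

$$ \Delta\alpha_j^{\pm} = -\frac{\Delta k_j^{\pm}}{k_j^{\pm}} \pm \frac{k_j^{\pm}}{2\pi i}{\sum_{l=1}^{n}}'\left[\frac{K_{\pm lj}}{k_l^{\pm}(k_l^{\pm}-k_j^{\pm})} + \frac{\tilde K_{\pm lj}}{k_l^{\mp}(k_l^{\mp}-k_j^{\pm})}\right] \pm \frac{1}{2\pi}K_{\pm j}(0,0) \pm \frac{k_j^{\pm}}{4\pi^2 i}\iint_D dk\,dl\,\frac{K_{\pm j}(k,l)\,r_\mp^*(k,l)}{(k\pm i0)(k-k_j^{\pm})}. \tag{3.81} $$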

We evaluate the first integral in Eq. (3.80) by substituting Eq. (3.79) and using Eqs. (3.20), (3.25), and (3.30),

$$ \mp\frac{1}{2\pi}\iint_R dy\,dx\, u(x,y)\,\Delta\Phi_j^{\pm}(x,y) = \Delta\alpha_j^{\pm} \pm \frac{1}{2\pi i}{\sum_{l=1}^{n}}'\left[\frac{K_{\pm lj}}{k_j^{\pm}-k_l^{\pm}} + \frac{\tilde K_{\pm lj}}{k_j^{\pm}-k_l^{\mp}}\right] \mp \iint_D dk\,dl\,\frac{K_{\pm j}(k,l)\,r_\mp^*(k,l)}{4\pi^2 i\,(k-k_j^{\pm})}. \tag{3.82} $$

Then, we evaluate the second integral in Eq. (3.80) by using the spectral decomposition,

$$ \frac{1}{2}\left(\int_{-\infty}^{x_0} - \int_{x_0}^{\infty}\right) dx\,\Delta u(x,y)\,\Phi_j^{\pm}(x,y) = \frac{1}{2\pi^2 i}\iint_D dk\,dl\, K_{\pm j}(k,l)\,N_\pm(x_0,y,k,l) \pm \frac{1}{\pi}\sum_{l=1}^{n}\left[K_{\pm lj}\,\Phi_l^{\pm}(x_0,y) - \tilde K_{\pm lj}\,\Phi_l^{\mp}(x_0,y)\right]. \tag{3.83} $$

Integrating this expression first in $y$ and then taking the limit $x_0 \to \infty$, we find the second integral in Eq. (3.80) with the use of Eqs. (3.66) and (3.67),


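$$ \mp\frac{1}{2\pi}\iint_R dy\,dx\,\Delta u(x,y)\,\Phi_j^{\pm}(x,y) = \pm\frac{1}{2\pi i}\sum_{l=1}^{n}\left[\frac{K_{\pm lj}}{k_l^{\pm}} + \frac{\tilde K_{\pm lj}}{k_l^{\mp}}\right] \mp \frac{1}{2\pi}K_{\pm j}(0,0) \pm \iint_D dk\,dl\,\frac{K_{\pm j}(k,l)\,r_\mp^*(k,l)}{4\pi^2 i\,(k\pm i0)}. \tag{3.84} $$

Formulas (3.82) and (3.84) reduce Eq. (3.80) to the form (3.81). Furthermore, we simplify Eq. (3.81) as follows,

$$ \Delta\alpha_j^{\pm} = \pm\frac{1}{2\pi}\iint_R dy\,dx\,\Delta u(x,y)\,(x + 2k_j^{\pm}y + \gamma_j^{\pm})\,\Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y). \tag{3.85} $$

This transformation is based on the relation following from Eqs. (3.27) and (3.68),

$$ \mu_j^{\pm}(x,y) - M_\mp(x,y,0) = -\frac{i\,\Phi_j^{\pm}(x,y)}{k_j^{\pm}} - i k_j^{\pm}{\sum_{l=1}^{n}}'\left[\frac{\Phi_l^{+}(x,y)}{k_l^{+}(k_j^{\pm}-k_l^{+})} + \frac{\Phi_l^{-}(x,y)}{k_l^{-}(k_j^{\pm}-k_l^{-})}\right] - \frac{k_j^{\pm}}{2\pi i}\iint_D dk\,dl\,\frac{r_\pm(k,l)\,N_\mp(x,y,k,l)}{(k\pm i0)(k-k_j^{\pm})}, $$

where $\mu_j^{\pm}(x,y)$ is defined by Eq. (3.31). In order to prove Eq. (3.77) we assume the asymptotic expansion, $\gamma_j^{\pm\epsilon} = \gamma_j^{\pm} + \epsilon\,\Delta\gamma_j^{\pm} + O(\epsilon^2)$, and express the correction $\Delta\gamma_j^{\pm}$ from Eq. (3.69) in the form,

$$ \Delta\gamma_j^{\pm} = \mp\frac{1}{\pi}\iint_R dy\,dx\,\Delta\Phi_j^{\mp *}(x,y)\left[(x + 2k_j^{\pm}y + \gamma_j^{\pm})\,\Phi_j^{\pm}(x,y)\right]_x \mp \frac{1}{\pi}\iint_R dy\,dx\,\Phi_j^{\mp *}(x,y)\left[(x + 2k_j^{\pm}y + \gamma_j^{\pm})\,\Delta\Phi_j^{\pm}(x,y)\right]_x \mp \frac{2\Delta k_j^{\pm}}{\pi}\iint_R dy\,dx\, y\,\Phi_j^{\mp *}(x,y)\,\Phi_{jx}^{\pm}(x,y). \tag{3.86} $$

The first two terms can be evaluated by means of direct substitution of Eq. (3.79) and use of Eq. (3.68) and the orthogonality relations (3.43)–(3.45). The result is given by the expression,

$$ \Delta\gamma_j^{\pm} = \pm\frac{1}{2\pi}{\sum_{l=1}^{n}}'\left[\frac{K_{\mp lj}^* - K_{\pm lj}}{(k_j^{\pm}-k_l^{\pm})^2} + \frac{\tilde K_{\mp lj}^* - \tilde K_{\pm lj}}{(k_j^{\pm}-k_l^{\mp})^2}\right] \pm \iint_D dk\,dl\,\frac{r_\pm(k,l)\,K_{\mp j}^*(k,l) - r_\mp^*(k,l)\,K_{\pm j}(k,l)}{4\pi^2(k-k_j^{\pm})^2} \mp \frac{2\Delta k_j^{\pm}}{\pi}\iint_R dy\,dx\, y\,\Phi_j^{\mp *}(x,y)\,\Phi_{jx}^{\pm}(x,y) \mp \frac{1}{\pi}\iint_R dy\,dx\,\Phi_j^{\mp *}(x,y)\,\Delta\Phi_j^{\pm}(x,y). \tag{3.87} $$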


In order to evaluate the last two terms in this expression, we use the equation for $\Delta\Phi_j^{\pm}$,

$$ i\,\Delta\Phi_{jy}^{\pm} + \Delta\Phi_{jxx}^{\pm} - 2ik_j^{\pm}\,\Delta\Phi_{jx}^{\pm} + u\,\Delta\Phi_j^{\pm} = 2i\,\Delta k_j^{\pm}\,\Phi_{jx}^{\pm} - \Delta u\,\Phi_j^{\pm} \tag{3.88} $$

with the boundary condition as $R = \sqrt{x^2+y^2} \to \infty$,

$$ \Delta\Phi_j^{\pm} \to -\frac{2\Delta k_j^{\pm}\,y}{(x + 2k_j^{\pm}y)^2} + O(R^{-2}). \tag{3.89} $$

The balance equation following from Eqs. (3.1) and (3.88) can be integrated to the form,

$$ \frac{\partial}{\partial y}\int_{-\infty}^{\infty} dx\,\Phi_j^{\mp *}(x,y)\,\Delta\Phi_j^{\pm}(x,y) = 2\Delta k_j^{\pm}\int_{-\infty}^{\infty} dx\,\Phi_j^{\mp *}(x,y)\,\Phi_{jx}^{\pm}(x,y) + i\int_{-\infty}^{\infty} dx\,\Delta u(x,y)\,\Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y). \tag{3.90} $$

Multiplying this equation by $y$ and integrating over $y$ with the boundary conditions (3.29) and (3.89), we find

$$ \mp\frac{2\Delta k_j^{\pm}}{\pi}\iint_R dy\,dx\, y\,\Phi_j^{\mp *}(x,y)\,\Phi_{jx}^{\pm}(x,y) \mp \frac{1}{\pi}\iint_R dy\,dx\,\Phi_j^{\mp *}(x,y)\,\Delta\Phi_j^{\pm}(x,y) = \mp\frac{1}{\pi i}\iint_R dy\,dx\, y\,\Delta u(x,y)\,\Phi_j^{\mp *}(x,y)\,\Phi_j^{\pm}(x,y). \tag{3.91} $$

Formulas (3.87) and (3.91) reduce to Eq. (3.77). We notice that the symmetry constraint $(\gamma_j^{-})^* = \gamma_j^{+}$ is preserved by the real potential $\Delta u(x,y)$. □

The results formulated in Proposition 3.6 constitute the basis for the analysis of dynamics of the KPI lumps under small perturbations, e.g. under distortions of their shapes.

3.3.2. Variations of data of continuous spectrum. Suppose that $N_\pm^{\epsilon}(x,y,k,l)$ solves Eq. (3.1) for $u^{\epsilon} = u(x,y) + \epsilon\,\Delta u(x,y)$. We expand it to the first order, $N_\pm^{\epsilon}(x,y,k,l) = N_\pm(x,y,k,l) + \epsilon\,\Delta N_\pm(x,y,k,l) + O(\epsilon^2)$, and find the correction $\Delta N_\pm(x,y,k,l)$ in the form,

$$ \Delta N_\pm(x,y,k,l) = \iint_D dk'\,dl'\,\frac{K_\pm(k',k,l',l)\,N_\pm(x,y,k',l')}{4\pi^2\left(k'-(k\pm i0)\right)} \pm \frac{1}{2\pi i}\sum_{l=1}^{n}\left[\frac{\tilde K_{\mp l}^*(k,l)\,\Phi_l^{\pm}(x,y)}{k-k_l^{\pm}} - \frac{K_{\pm l}^*(k,l)\,\Phi_l^{\mp}(x,y)}{k-k_l^{\mp}}\right]. \tag{3.92} $$

The main result of this subsection is formulated in the following proposition.

Proposition 3.7. Variational derivatives of the data $r_\pm(k,l)$ of the continuous spectrum of Eq. (3.1) with respect to the potential $u(x,y)$ are given by

$$ \frac{\delta r_\pm(k,l)}{\delta u(x,y)} = \frac{N_\mp^*(x,y,k,l)\,M_\pm(x,y,k)}{2\pi i}. \tag{3.93} $$


Proof. The derivation follows the proof of Proposition 3.6. First, we expand the scattering data as $r_\pm^{\epsilon}(k,l) = r_\pm(k,l) + \epsilon\,\Delta r_\pm(k,l) + O(\epsilon^2)$ and use Eqs. (3.20) and (3.25) to express $\Delta r_\pm(k,l)$ as

$$ \Delta r_\pm(k,l) = \frac{1}{2\pi i}\iint_R dy\,dx\left[u(x,y)\,\Delta N_\mp^*(x,y,k,l) + \Delta u(x,y)\,N_\mp^*(x,y,k,l)\right]. \tag{3.94} $$

The first integral can be evaluated explicitly through the substitution of Eq. (3.92) and use of Eqs. (3.20), (3.25), and (3.30). The second integral can be found by integrating the spectral decomposition,

$$ \frac{1}{2}\left(\int_{-\infty}^{x_0} - \int_{x_0}^{\infty}\right) dx\,\Delta u(x,y)\,N_\mp^*(x,y,k,l) = \frac{1}{2\pi^2 i}\iint_D dk'\,dl'\, K_\mp^*(k',k,l',l)\,N_\mp^*(x_0,y,k',l') \pm \frac{1}{\pi}\sum_{l=1}^{n}\left[\tilde K_{\pm l}(k,l)\,\Phi_l^{\mp *}(x_0,y) - K_{\mp l}(k,l)\,\Phi_l^{\pm *}(x_0,y)\right], $$

over $y$ in the limit $x_0 \to \infty$. As a result, we deduce

$$ \Delta r_\pm(k,l) = \frac{1}{2\pi i}\left(K_\mp^*(0,k,0,l) + \iint_D dk'\,dl'\,\frac{k\,K_\mp^*(k',k,l',l)\,r_\pm(k',l')}{4\pi^2(k'\pm i0)\left(k'-(k\pm i0)\right)} + \frac{k}{2\pi}\sum_{l=1}^{n}\left[\frac{\tilde K_{\pm l}(k,l)}{k_l^{\pm}(k_l^{\pm}-k)} + \frac{K_{\mp l}(k,l)}{k_l^{\mp}(k_l^{\mp}-k)}\right]\right). \tag{3.95} $$

Using Eq. (3.27) for $M_\pm(x,y,k) - M_\mp(x,y,0)$, we conclude that Eq. (3.95) reduces to Eq. (3.93). □

3.3.3. Example: A single lump potential. We solve Eq. (3.68) for $r_\pm(k,l) = 0$ at $l \neq k$ and $n = 1$ in the form,

$$ \Phi_1^{+}(x,y) = 2\kappa_1\,\frac{2\kappa_1 X - 4i\kappa_1^2 Y - 1}{4\kappa_1^2 X^2 + 16\kappa_1^4 Y^2 + 1}, \tag{3.96} $$

$$ \Phi_1^{-}(x,y) = 2\kappa_1\,\frac{2\kappa_1 X + 4i\kappa_1^2 Y + 1}{4\kappa_1^2 X^2 + 16\kappa_1^4 Y^2 + 1}, \tag{3.97} $$

where we have used the parametrization, $k_1^{\pm} = p_1 \pm i\kappa_1$, $\gamma_1^{\pm} = -x_0 - 2k_1^{\pm}y_0$, $X = x - x_0 + 2p_1(y - y_0)$, $Y = y - y_0$. Then, the lump of the KPI equation (1.1) is

$$ u_s(x,y) = w_{sX}(X,Y), \qquad w_s(X,Y) = \frac{16\kappa_1^2 X}{4\kappa_1^2 X^2 + 16\kappa_1^4 Y^2 + 1}, \tag{3.98} $$


which satisfies the constraint (3.2). We use the following relation,

$$ \Phi_1^{\mp *}(x,y)\,\Phi_1^{\pm}(x,y) = -\frac{1}{4}u_s(X,Y) \pm \frac{i}{8\kappa_1}w_{sY}(X,Y), $$

and evaluate explicitly the first-order perturbation corrections (3.76) and (3.77),

$$ \Delta k_1^{\pm} = \frac{1}{16\pi\kappa_1}\left(\Delta P_{sy} - 2p_1\,\Delta P_{sx}\right) \pm \frac{i}{8\pi}\,\Delta P_{sx}, \tag{3.99} $$

$$ \Delta\gamma_1^{\pm} = \frac{4\kappa_1}{\pi}\iint_R dy\,dx\,\frac{X\left(16\kappa_1^4\,yY - 1\right)}{\left(4\kappa_1^2 X^2 + 16\kappa_1^4 Y^2 + 1\right)^2}\,\Delta u(x,y) \pm \frac{1}{4\pi i}\iint_R dy\,dx\, y\,u_s(x,y)\,\Delta u(x,y), \tag{3.100} $$

where $\Delta P_{sx}$ and $\Delta P_{sy}$ are corrections to the $x$- and $y$-projections of the momentum for the KPI equation (1.1),

$$ P_{sx} = \frac{1}{2}\iint_R dy\,dx\, w_{sx}^2, \qquad P_{sy} = \frac{1}{2}\iint_R dy\,dx\, w_{sx}w_{sy}. $$

The perturbation of the data of the continuous spectrum $\Delta r_\pm(k,l)$ can be found from Eq. (3.93) by using the explicit relation (see Eqs. (3.22) and (3.27)),

$$ N_\pm(x,y,k,l) = \left[1 - \frac{4i\kappa_1^2\left(2(l-p_1)X + 4\kappa_1^2 Y - i\right)}{\left((l-p_1)^2+\kappa_1^2\right)\left(4\kappa_1^2 X^2 + 16\kappa_1^4 Y^2 + 1\right)}\right]e^{i\beta(x,y,k,l)}. \tag{3.101} $$

We notice that

$$ R_\pm(k) = r_\pm(k,k) = -\frac{4\pi\kappa_1}{(k-p_1)^2+\kappa_1^2}, $$

i.e. $R(k) \neq 0$. On the other hand, we confirm from Eqs. (3.20), (3.25), and (3.101) that $r_\pm(k,l) = 0$ for any $l \neq k$. Since the projections of the momentum at the KPI lump (3.98) are $P_{sx} = 8\pi\kappa_1$ and $P_{sy} = 16\pi\kappa_1 p_1$, we check from Eq. (3.99) that the first-order corrections $\Delta P_{sx}$ and $\Delta P_{sy}$ define completely the renormalization of the parameters $\kappa_1$ and $p_1$ of the KPI lump (3.98) and affect the excitation of the momentum of the continuous spectrum only at the order $O(\epsilon^2)$. This result confirms the stability of the single KPI lump against small perturbations [32].

3.4. Type II bifurcation of new eigenvalues. The results of Sects. 3.2 and 3.3 remain valid even if the potential $u(x,y)$ is a nongeneric potential of type II, i.e. it supports an embedded eigenvalue at $k = k_0$. Indeed, the half-bound states $\Phi_0^{\pm}(x,y)$ appear as pole contributions of the continuous spectrum and their presence does not affect the complete sets of eigenfunctions $S^{\pm} = \left[N_\pm(x,y,k,l), \{\Phi_j^{+}(x,y), \Phi_j^{-}(x,y)\}_{j=1}^{n}\right]$. However, the eigenfunctions $M_\pm(x,y,k)$ and $N_\pm(x,y,k,l)$ are singular at $k = k_0$ according to Eqs. (3.34) and (3.38). As a result, the variation of the scattering data $r_\pm(k,l)$ defined by


Eq. (3.93) becomes divergent as $k \to k_0$ if the nongeneric potential $u(x,y)$ is perturbed by a correction $\epsilon\,\Delta u(x,y)$,

$$ \Delta r_\pm(k,l) \to \pm\frac{r_0^{\pm}(l)\,K_0^{\pm}}{2\pi i\left(k-(k_0\mp i0)\right)^2}, \tag{3.102} $$

where $K_0^{\pm} = -K_0$ and

$$ K_0 = \iint_R dy\,dx\,\Delta u(x,y)\,|\Phi_0(x,y)|^2. \tag{3.103} $$

Combining Eqs. (3.35) and (3.102), we find that the perturbation $\epsilon\,\Delta u(x,y)$ shifts the pole at $k = k_0$ into the complex domain,

$$ k_0^{\pm} = k_0 \pm \frac{i\epsilon K_0}{2\pi}. \tag{3.104} $$

This shift crosses the real axis if $\mathrm{sign}(\epsilon K_0) > 0$. In this case, the eigenfunctions $M_+(x,y,k)$ and $M_-(x,y,k)$ acquire a new pole in the upper and lower half-plane of $k$, respectively (cf. Eqs. (3.27) and (3.34)). We prove below that the bifurcation of the embedded eigenvalue into the complex plane occurs under the condition $\mathrm{sign}(\epsilon K_0) > 0$ and Eq. (3.104) gives the leading order of the new eigenvalue. In the opposite case, i.e. when $\mathrm{sign}(\epsilon K_0) < 0$, the analyticity properties of $M_\pm(x,y,k)$ in the corresponding domains of $k$ are not affected and we expect that the embedded eigenvalue just disappears. Here we derive asymptotic expansions for the new eigenvalue and associated bound state. The results are applied to the problem of generation of a KPI lump by a localized initial pulse.

3.4.1. Asymptotic expressions for a new eigenvalue and bound state. Suppose that the type II bifurcation occurs under the perturbation $\epsilon\,\Delta u(x,y)$. A new bound state $\Phi_{n+1}^{\pm}(x,y)$ can be decomposed through the complete sets $S^{\pm}$ according to Eq. (3.72) with the eigenvalue $k = k_{n+1}^{\pm}$ such that $\lim_{\epsilon\to 0} k_{n+1}^{\pm} = k_0$. If the potential $u(x,y)$ is nongeneric, the homogeneous integral equation (3.73) has a singular kernel at $k' \to k_0$ if $k_{n+1}^{\pm} \to k_0$. Solving this equation asymptotically in the limit $\epsilon \to 0$, we derive the following result.

Proposition 3.8. Under the conditions that the potential $u(x,y)$ exhibits an embedded eigenvalue at $k = k_0$ and the perturbation $\Delta u(x,y)$ satisfies the criterion $K_0 > 0$ (see Eq. (3.103)), the potential $u^{\epsilon} = u(x,y) + \epsilon\,\Delta u(x,y)$ supports a bound state in a neighbourhood of $k = k_0$ for $\epsilon > 0$. The eigenvalue $k_{n+1}^{\pm}$ for the new bound state $\Phi_{n+1}^{\pm}(x,y)$ is defined by $k_{n+1}^{\pm} = k_0 \pm i\epsilon\,\Delta k + O(\epsilon^2)$, where

$$ \Delta k = \frac{1}{2\pi}K_0 > 0. \tag{3.105} $$

Proof. We consider an asymptotic solution of Eq. (3.73) at $k, l \to k_0$ and $\epsilon \to 0$. Using Eq. (3.38), we rescale the variables in the problem,

$$ K_\pm(k,k',l,l') = \frac{r_0^{\mp}(l)\,r_0^{\mp *}(l')\,P_\pm(k,k',l,l')}{\left(k-(k_0\pm i0)\right)\left(k'-(k_0\mp i0)\right)}, \qquad \alpha_\pm(k,l) = \frac{\epsilon\,r_0^{\mp}(l)\,A_\pm(k,l)}{k-(k_0\pm i0)}. \tag{3.106} $$

Then, we evaluate the singular contribution from the pole $k' = k_0$ in the integral of Eq. (3.73) and find the leading order term in the form,

$$ A_\pm(k,l) \to \frac{\epsilon\,P_\pm(k,k_0,l,k_0)\,A_\pm(k_0,k_0)\,Q_\pm}{4\pi^2}, \tag{3.107} $$

where

$$ Q_\pm = \iint_D \frac{dk'\,dl'\,|r_0^{\mp}(l')|^2}{(k'-k_0)^2\,(k'-k_0\mp i\epsilon\,\Delta k)}. $$

Using the formal expansion (3.41) and the constraint (3.40), we evaluate $Q_\pm$ by means of the residue theorem as

$$ Q_\pm = \int_{-\infty}^{\infty}\frac{2\,dk}{(k-k_0)(k-k_0\mp i\epsilon\,\Delta k)} = \frac{2\pi}{\epsilon\,\Delta k}\,\mathrm{sign}(\epsilon\,\Delta k). \tag{3.108} $$

Writing Eq. (3.107) at $k = l = k_0$ and assuming $\epsilon > 0$, we find the simple result,

$$ |\Delta k| = \frac{1}{2\pi}P_\pm(k_0,k_0,k_0,k_0) = \frac{1}{2\pi}K_0. $$

The latter equation is self-consistent only for $K_0 > 0$, when the bifurcation occurs and the new eigenvalue has the asymptotic approximation (3.105). The new bound state $\Phi_{n+1}^{\pm}(x,y)$ is defined by Eqs. (3.72), (3.106), and (3.107). Using the same approach, we simplify the expression for $\Phi_{n+1}^{\pm}(x,y)$ for finite $R = \sqrt{x^2+y^2}$,

$$ \Phi_{n+1}^{\pm}(x,y) = \pm\frac{i\epsilon\,A_\pm(k_0,k_0)\,Q_\pm}{4\pi^2}\,\Phi_0^{\pm}(x,y) + O(\epsilon). \tag{3.109} $$

Using Eq. (3.108), we satisfy the normalization condition (3.30) by specifying $A_\pm(k_0,k_0) = \mp iK_0$. Then, Eqs. (3.72), (3.106), and (3.107) reduce to the asymptotic expression for the new bound state,

$$ \Phi_{n+1}^{\pm}(x,y) = \mp\frac{i\epsilon}{4\pi^2}\iint_D dk\,dl\,\frac{r_0^{\mp}(l)\,P_\pm(k,k_0,l,k_0)\,N_\pm(x,y,k,l)}{\left(k-(k_0\pm i0)\right)(k-k_0\mp i\epsilon\,\Delta k)} + O(\epsilon), $$

where the integral term is of order $O(1)$. □

We have thus found that, for the type II bifurcation, a new eigenvalue appears transversely to the real axis in the neighbourhood of the embedded eigenvalue and a new bound state arises from a localized eigenfunction corresponding to the half-bound state.

3.4.2. Example: Generation of a single KPI lump. The multilump potentials of the linear problem (1.2) do not belong to the nongeneric potentials of type II since $r_\pm(k,l)$ are not singular for real $k$. Indeed, for such potentials, $r_\pm(k,l) = 0$ at $l \neq k$ and $N_\pm(x,y,k,l) = M_\pm(x,y,l)\,e^{i\beta(x,y,k,l)}$, where

$$ M_\pm(x,y,l) = 1 - i\sum_{j=1}^{n}\left[\frac{\Phi_j^{+}(x,y)}{l-k_j^{+}} + \frac{\Phi_j^{-}(x,y)}{l-k_j^{-}}\right]. $$

It is clear from this expression that the embedded eigenvalues at real $k$ are not supported by the multilump potentials. In the particular case $n = 0$, we conclude that the zero background $u(x,y) = 0$ does not exhibit embedded eigenvalues and, therefore, small initial data do not generate new eigenvalues. This implies that there must be a threshold for the amplitude of the initial localized pulse to generate a new eigenvalue in the problem (1.2) and an associated lump in the KPI equation (1.1). This result is valid if the initial data $\Delta u(x,y) \sim O(1)$ as $\epsilon \to 0$. Note that the existence of a threshold follows also from the rigorous paper by Fokas and Sung [19] where a small-norm assumption was used to eliminate lumps from the spectral problem (1.2). In order to illustrate this result, we reproduce in Fig. 3.1 the numerical simulations of the KPI equation (1.1) performed by M. He [33]. The initial condition was chosen in the form of the KPI lump (3.98) with $p_1 = 0$, $\kappa_1 = 1/2$ and an arbitrary amplitude $a$,

$$ u(x,y,0) = 4a\,\frac{1 + y^2 - x^2}{(1 + x^2 + y^2)^2}. \tag{3.110} $$

If $a = 1$, it coincides with the KPI lump. If the amplitude $a$ is greater than or close to the amplitude of the KPI lump, the initial pulse transforms into a steady-state solitary wave. Fig. 3.1(a–d) shows successive snapshots at various times for the evolution of the initial data (3.110) with $a = 1.5$. It is clearly seen that the initial pulse evolves into a KPI lump. On the other hand, if the amplitude $a$ is small enough, the initial pulse broadens and decays into linear dispersive waves. Fig. 3.2(a–d) shows the decay of initial data (3.110) with $a = 0.5$. Since the multilump potentials do not support embedded eigenvalues, a small perturbation does not generate new bound states in the spectral problem (1.2). Therefore, similarly to the case $n = 0$, there is a threshold for the amplitude of a perturbation to the multilump potential to generate a new eigenvalue and an associated KPI lump.

Fig. 3.1. Formation of a new KPI lump for the initial condition (3.110) and a = 1.5 at times t = 5 (a), t = 10 (b), t = 15 (c), and t = 20 (d) ((a)–(d) in the same coordinates; panels show u in the (x, y) plane, x from 0.0 to 110.0, y from 0.0 to 80.0)

Fig. 3.2. Transformation of an initial pulse to a linear wave packet for the initial condition (3.110) and a = 0.5 at times t = 2 (a), t = 4 (b), t = 6 (c), and t = 8 (d) ((a)–(d) in the same coordinates; panels show u in the (x, y) plane, x from 0.0 to 110.0, y from 0.0 to 80.0)
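The statement that the initial condition (3.110) with $a = 1$ coincides with the lump (3.98) at $p_1 = 0$, $\kappa_1 = 1/2$, and the quoted value $P_{sx} = 8\pi\kappa_1$, can be checked directly. The following sketch is only a numerical consistency check and does not reproduce the simulations of [33]; the function names, the finite-difference step and the tangent substitution used for the quadrature are illustrative assumptions.

```python
import math


def lump_w(X, Y, kappa):
    """w_s(X, Y) of Eq. (3.98)."""
    return 16.0 * kappa**2 * X / (4.0 * kappa**2 * X**2 + 16.0 * kappa**4 * Y**2 + 1.0)


def lump_u(X, Y, kappa, step=1.0e-5):
    """u_s = d w_s / dX, approximated by a central difference."""
    return (lump_w(X + step, Y, kappa) - lump_w(X - step, Y, kappa)) / (2.0 * step)


def initial_pulse(x, y, a):
    """Initial condition (3.110) with amplitude a."""
    return 4.0 * a * (1.0 + y * y - x * x) / (1.0 + x * x + y * y) ** 2


def momentum_x(a, n=200):
    """(1/2) * double integral of u^2 for the pulse (3.110), midpoint rule in x = tan s, y = tan t."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        s = -0.5 * math.pi + (i + 0.5) * h
        for j in range(n):
            t = -0.5 * math.pi + (j + 0.5) * h
            u = initial_pulse(math.tan(s), math.tan(t), a)
            # Jacobian of the substitution: dx dy = ds dt / (cos^2 s * cos^2 t)
            total += u * u / (math.cos(s) ** 2 * math.cos(t) ** 2)
    return 0.5 * total * h * h
```

For $a = 1$ the pulse agrees pointwise with $u_s$ at $\kappa_1 = 1/2$, and the momentum integral is close to $4\pi = 8\pi\kappa_1$; for other amplitudes it scales as $a^2$.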

4. Discussion

We have presented a complete analysis of the spectral decomposition for the time-independent and time-dependent Schrödinger equations within the RH formalism of inverse scattering. The spectral problems (1.2) and (1.4) are formulated for self-adjoint operators where the spectral decomposition, inner products and completeness relations follow from the spectral theory in Hilbert spaces [4] subject to the assumption that $u(x,y) \in L^1$. Since the multilump potentials violate this assumption, the discrete spectrum of the time-dependent Schrödinger equation (1.2) does not fit into this theory and the corresponding eigenfunctions diverge exponentially,

$$ \varphi = \Phi_j^{\pm}(x,y)\,e^{-ik_j^{\pm}x - i(k_j^{\pm})^2 y}, $$

where $\Phi_j^{\pm}(x,y)$ are defined by Eq. (3.28). These nonlocalized “bound states” account for resonant poles of the operator resolvent [4]. Spectral decomposition and completeness relations were not derived in this context. In the framework of the RH formalism of the inverse scattering, we have transformed the self-adjoint spectral problems (1.2) and (1.4) to the non-self-adjoint form (2.1) and (3.1), where new non-standard scalar products were introduced and orthogonality and completeness relations were proved by means of direct computations. Although no rigorous result is available for general non-self-adjoint linear operators, we conjecture that linear problems associated to nonlinear evolution equations within the formalism of inverse scattering always possess a complete basis for the spectral decomposition. We mention now some results concerning other linear spectral problems considered in the RH formalism of inverse scattering [6].

The ILW equation. This integro-differential equation is related to the scalar (local) RH boundary value problem (1.7) and (1.8). The associated linear problem generalizes Eq. (1.4) and has a standard complete basis of eigenfunctions. The discrete spectrum of this problem is associated to solitons of the ILW equation [45].

The BO equation. This equation is related to the scalar (nonlocal) RH problem (1.7) and (1.8) for $a_\pm(k) = 1$. The discrete spectrum is associated to lumps (algebraic solitons) of the BO equation [7]. The spectral decomposition for the associated linear problem was recently analyzed [10].

Equations of the AKNS scheme. These equations are associated to the AKNS spectral problem and include the NLS equation and the modified KdV equation as particular cases [1]. The AKNS spectral problem can be formulated through the vector (local) RH boundary value problem and the discrete spectrum corresponds to solitons of the nonlinear evolution equations [6]. The standard spectral decomposition was proved in Ref. [1].

The DSI system. This system is related to the AKNS spectral problem in two dimensions. The vector (nonlocal) RH boundary value problem can be formulated and has a discrete spectrum associated to dromions of the DSI system [46,47].

In this paper, the spectral decomposition was used to solve a particular problem associated to nonlinear evolution equations: whether or not a small initial disturbance supports propagation of a soliton. Equivalently, this problem concerns the existence of a single eigenvalue for the discrete spectrum of the associated linear problem with a small potential. Extending the results of this paper, we conjecture that spectral problems with nongeneric potentials of type I may possess a single eigenvalue for a small potential while spectral problems with nongeneric potentials of type II have no eigenvalues for small potentials. We present below a table which summarizes the results on soliton generation for the problems solvable by means of inverse scattering.

nonlinear equation    bound states    type of bifurcation    reference
KdV equation          solitons        type I                 [35]–[37]
ILW equation          solitons        type I                 [14]
BO equation           lumps           type I                 [13]
KPI equation          lumps           type II                this paper
AKNS equations        solitons        type II                [48]
DSI system            dromions        ?                      [49]

Finally, there are also linear problems which possess localized bound states and are related to the $\bar\partial$ formalism of inverse scattering rather than to the RH formalism. An example is provided by the DSII system [9]. The eigenfunctions of the continuous spectrum for these linear problems have no simple analytical properties in $k$ and the spectral decomposition and bifurcations of eigenvalues remain open for further studies.

Acknowledgements. The numerical simulations shown in Figs. 3.1 and 3.2 were performed by M. He [33]. We thank him for letting us include his results in this paper. We benefited from stimulating discussions with M. J. Ablowitz, A. Brudny, F. Calogero, A. Fokas, D. Kaup, O. Kiselev, B. Malomed, I. M. Sigal, A. Soffer, and Yu. A. Stepanyants. D.P. acknowledges support from a NATO fellowship provided by NSERC and C.S. acknowledges support from NSERC Operating grant OGP0046179.

References

1. Ablowitz, M.J., Kaup, D.J., Newell, A.C. and Segur, H.: The inverse scattering transform – Fourier analysis for nonlinear problems. Stud. Appl. Math. 53, 249–315 (1974)
2. Calogero, F. and Degasperis, A.: Spectral Transform and Solitons. Amsterdam: North-Holland Publishing Company, 1982
3. Newton, R.: Scattering theory of waves and particles. New York: McGraw-Hill, 1966
4. Hislop, P.D. and Sigal, I.M.: Introduction to spectral theory with application to Schrödinger operators. Series in Appl. Math. Sciences, Vol. 113. Berlin–Heidelberg–New York: Springer-Verlag, 1996
5. Zakharov, V.E. and Manakov, S.V.: The construction of multidimensional nonlinear integrable systems and their solutions. Func. Anal. Appl. 19, 89–101 (1985)
6. Ablowitz, M.J. and Clarkson, P.A.: Solitons, nonlinear evolution equations and inverse scattering. Cambridge: Cambridge University Press, 1991
7. Fokas, A.S. and Ablowitz, M.J.: The inverse scattering transform for the Benjamin–Ono equation – a pivot to multidimensional problems. Stud. Appl. Math. 68, 1–10 (1983)
8. Fokas, A.S. and Ablowitz, M.J.: On the inverse scattering of the time-dependent Schrödinger equation and the associated Kadomtsev–Petviashvili (1) equation. Stud. Appl. Math. 69, 211–228 (1983)
9. Fokas, A.S. and Ablowitz, M.J.: On the inverse scattering transform of multidimensional nonlinear equations related to first-order systems in the plane. J. Math. Phys. 25, 2494–2505 (1984)
10. Kaup, D.J. and Matsuno, Y.: The inverse scattering transform for the Benjamin–Ono equation. Stud. Appl. Math. 101, 73–98 (1998)
11. Kaup, D.J., Lakoba, T.I. and Matsuno, Y.: Complete integrability of the Benjamin–Ono equation by means of action-angle variables. Phys. Lett. A 238, 123–133 (1998)
12. Kaup, D.J., Lakoba, T.I. and Matsuno, Y.: Perturbation theory for the Benjamin–Ono equation. Inverse Problems 13, 215–240 (1999)
13. Pelinovsky, D.E. and Sulem, C.: Bifurcations of new eigenvalues for the Benjamin–Ono equation. J. Math. Phys. 39, 6552–6572 (1998)
14. Pelinovsky, D.E. and Sulem, C.: Asymptotic approximations for a new eigenvalue in linear problems without a threshold. Theor. Mat. Phys. (1999)
15. Faddeev, L.D. and Takhtajan, L.A.: Hamiltonian Methods in the Theory of Solitons. Berlin–Heidelberg–New York: Springer-Verlag, 1987
16. Manakov, S.V.: The inverse scattering transform for the time-dependent Schrödinger equation and Kadomtsev–Petviashvili equation. Physica D 3, 420–427 (1981)
17. Beals, R. and Coifman, R.R.: Linear spectral problems, nonlinear equations and the ∂̄-method. Inverse Problems 5, 87–130 (1989)
18. Zhou, X.: Inverse scattering transform for the time dependent Schrödinger equation with applications to the KPI equation. Commun. Math. Phys. 128, 551–564 (1990)
19. Fokas, A.S. and Sung, L.-Y.: On the solvability of the N-wave, Davey–Stewartson and Kadomtsev–Petviashvili equations. Inverse Problems 8, 673–708 (1992)
20. Ablowitz, M.J. and Villarroel, J.: Solutions to the time-dependent Schrödinger and the Kadomtsev–Petviashvili equations. Phys. Rev. Lett. 78, 570–573 (1997)
21. Ablowitz, M.J. and Villarroel, J.: On the discrete spectrum of the nonstationary Schrödinger equation and multipole lumps of the Kadomtsev–Petviashvili I equation. Commun. Math. Phys. 207, 1–42 (1999)
22. Ablowitz, M.J., Chakravarty, S., Trubatch, A.D. and Villarroel, J.: Novel potentials of the nonstationary Schrödinger equation and solutions of the Kadomtsev–Petviashvili I equation. Preprint, APPM # 389 (1999)
23. Boiti, M., Leon, J.J.-P. and Pempinelli, F.: Spectral transform and orthogonality relations for the Kadomtsev–Petviashvili equation. Phys. Lett. A 141, 96–100 (1989)
24. Boiti, M., Pempinelli, F., Pogrebkov, A.K. and Polivanov, M.C.: Resolvent approach for the non-stationary Schrödinger equation. Inverse Problems 8, 331–364 (1992)
25. Boiti, M., Pempinelli, F., Pogrebkov, A.K. and Polivanov, M.C.: Resolvent approach for two-dimensional scattering problems. Application to the non-stationary Schrödinger problem and the KPI equation. Teor. Mat. Fiz 93, 181–210 (1992)
26. Kaup, D.J.: The time-dependent Schrödinger equation in multidimensional integrable evolution equations. Contemp. Math. 160, 173–190 (1994)
27. Ablowitz, M.J. and Villarroel, J.: On the Kadomtsev–Petviashvili equation and associated constraints. Stud. Appl. Math. 85, 195–213 (1991)
28. Boiti, M., Pempinelli, F. and Pogrebkov, A.K.: Solutions of the KPI equation with smooth initial data. Inverse Problems 10, 505–519 (1994)
29. Ablowitz, M.J. and Wang, X.P.: Initial time layers and Kadomtsev–Petviashvili-type equations. Stud. Appl. Math. 98, 121–137 (1997)
30. Fokas, A.S. and Sung, L.-Y.: The Cauchy problem for the Kadomtsev–Petviashvili I equation without the zero mass constraint. Math. Proc. Camb. Phil. Soc. 125, 113–138 (1999)
31. Sung, L.-Y.: Square integrability and uniqueness of the solutions of the Kadomtsev–Petviashvili I equation. Math. Phys. Analysis and Geometry, to appear (1999)
32. Kuznetsov, E.A. and Turitsyn, S.K.: Two- and three-dimensional solitons in weakly dispersive media. Sov. Phys. JETP 55, 844–847 (1982)
33. He, M.: Ph.D. thesis. Monash University, 1997
34. Mallick, B.B. and Kundu, A.: Levinson-type theorem for the Korteweg–de Vries system and its consequences. Phys. Lett. A 135, 113–116 (1989)
35. Karpman, V.I.: Nonlinear waves in dispersive media. New York: Pergamon Press, 1975
36. Landau, L.D. and Lifshitz, E.M.: Quantum Mechanics. New York: Pergamon, 1985
37. Simon, B.: The bound state of weakly coupled Schrödinger operator in one and two dimensions. Ann. Phys. 97, 279–288 (1976)
38. Wright, J.: Soliton production and solutions to perturbed Korteweg–de Vries equations. Phys. Rev. A 21, 335–339 (1980)
39. Karpman, V.I.: Soliton evolution in the presence of perturbation. Phys. Scripta 20, 462–478 (1979)
40. Kivshar, Yu.S. and Malomed, B.A.: Dynamics of solitons in nearly integrable systems. Rev. Mod. Phys. 61, 763–915 (1989)
41. Pelinovsky, D.E. and Grimshaw, R.H.J.: Structural transformation of eigenvalues for a perturbed algebraic soliton potential. Phys. Lett. A 229, 165–172 (1997)
42. Kivshar, Yu.S., Pelinovsky, D.E., Cretegny, T. and Peyrard, M.: Internal modes of solitary waves. Phys. Rev. Lett. 80, 5031–5035 (1998)
43. Soffer, A. and Weinstein, M.I.: Time dependent resonance theory. Geometric and Functional Analysis. (1999)
44. Merkli, M. and Sigal, I.M.: A time-dependent theory of quantum resonances. Commun. Math. Phys. 201, 549–576 (1999)
45. Kodama, Y., Ablowitz, M.J. and Satsuma, J.: Direct and inverse scattering problems of the nonlinear intermediate long wave equation. J. Math. Phys. 23, 564–576 (1982)
46. Fokas, A.S. and Santini, P.M.: Dromions and a boundary value problem for the Davey–Stewartson 1 equation. Physica D 44, 99–130 (1990)
47. Boiti, M., Leon, J.J.-P. and Pempinelli, F.: A new spectral transform for the Davey–Stewartson I equation. Phys. Lett. A 141, 101–107 (1989)
48. Desaix, M., Anderson, D., Lisak, M. and Quiroga-Teixeiro, M.L.: Variationally obtained approximate eigenvalues of the Zakharov–Shabat scattering problem for real potentials. Phys. Lett. A 212, 332–338 (1996)
49. Nishinari, K., Yajima, T. and Nakao, T.: Time evolution of Gaussian-type initial conditions associated with the Davey–Stewartson equations. J. Phys. A: Math. Gen. 29, 4237–4245 (1996)

Communicated by T. Miwa

Commun. Math. Phys. 208, 761 – 770 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Second Eigenvalue of Schrödinger Operators and Mean Curvature

Ahmad El Soufi and Saïd Ilias

Laboratoire de Mathématiques et Physique Théorique, Université de Tours, Parc de Grandmont, 37200 Tours, France. E-mail: [email protected]; [email protected]

Received: 18 June 1999 / Accepted: 6 July 1999

Abstract: Let $M$ be a compact immersed submanifold of the Euclidean space, the hyperbolic space or the standard sphere. For any continuous potential $q$ on $M$, we give a sharp upper bound for the second eigenvalue of the operator $-\Delta + q$ in terms of the total mean curvature of $M$ and the mean value of $q$. Moreover, we analyze the case where this bound is achieved. As a consequence of this result we obtain an alternative proof for the Alikakos–Fusco conjecture concerning the stability of the interface in the Allen–Cahn reaction-diffusion model.

1. Introduction

Let $N^n(c)$ represent the Euclidean space $\mathbb{R}^n$ for $c = 0$, the hyperbolic space $\mathbb{H}^n$ for $c = -1$ and the unit Euclidean sphere $S^n$ for $c = 1$. Given a compact immersed submanifold $M^m$ of dimension $m \ge 2$ of $N^n(c)$, the operator $L = -\Delta - |II|^2 - mc$, where $\Delta$ is the Laplace–Beltrami operator and $|II|$ is the length of the second fundamental form of $M$, arises naturally in the study of stability in certain physical and geometrical problems. For instance, when $M$ is a minimal hypersurface of $S^n$ (i.e. an extremum of the volume functional) or a constant mean curvature hypersurface of $N^n(c)$ (i.e. a constrained extremum of the volume functional), then $L$ is the Jacobi operator of $M$, which represents the second variation of the volume at $M$. Its spectral behaviour is then directly related to the instability of such hypersurfaces.

On the other hand, it is known that the interfacial surface separating two phases in the Allen–Cahn reaction-diffusion model moves according to a "motion by mean curvature". Observing that the operator $L$ is the linearization of the mean curvature, Alikakos and Fusco showed that the negative eigenvalues of $L$ together with their associated eigenfunctions correspond to instabilities of this interface. They conjectured that the sphere in $\mathbb{R}^3$ is the


unique compact orientable surface of $\mathbb{R}^3$ on which the operator $L$ has only one negative eigenvalue. This conjecture means that the sphere is the least unstable surface under motion by mean curvature. Many works ([2, 6, 12, ...]) were devoted to this conjecture, which has recently been proved by Harrell and Loss [7].

The aim of this paper is to put the study of this kind of problem in the more general context of the estimation of the eigenvalues of a Schrödinger operator $-\Delta + q$ in terms of the extrinsic geometry of the submanifold $M$. Concerning the first eigenvalue of $-\Delta + q$, it is easy to see that it is bounded by the mean value $\bar q$ of the potential $q$ on $M$,

$$\lambda_1(-\Delta + q) \le \bar q := \frac{1}{V(M)} \int_M q\, v_g .$$

Theorem 2.1 below deals with the second eigenvalue of $-\Delta + q$ and states that

$$\lambda_2(-\Delta + q) \le \frac{m}{V(M)} \int_M \left( |H|^2 + c \right) v_g + \bar q, \qquad (1)$$

where $H = \frac{1}{m}\,\mathrm{trace}_g\, II$ is the mean curvature of the submanifold $M^m$ of $N^n(c)$. For the particular potential $q = -|II|^2 - mc$ the previous estimates yield

$$\lambda_1(L) \le -\frac{1}{V(M)} \int_M \left( |II|^2 + mc \right) v_g$$

and

$$\lambda_2(L) \le -\frac{1}{V(M)} \int_M \left( |II|^2 - m|H|^2 \right) v_g \le 0.$$

The integrand $|II|^2 - m|H|^2$ is in fact equal to the square of the length of the traceless part of the second fundamental form (also called total umbilicity tensor). So $\lambda_2(L)$ is nonpositive and vanishes only for geodesic spheres in $N^n(c)$. Consequently, if $M$ is not a geodesic $m$-sphere of $N^n(c)$, then the operator $L$ has at least two negative eigenvalues (Corollary 2.1). This consequence of inequality (1) extends the result of Harrell and Loss to the case where the ambient space is hyperbolic or spherical and where the codimension of $M$ is greater than 1. Moreover, we do not need any orientability assumption.

In the particular case of the Laplace–Beltrami operator (i.e. $q = 0$), the inequality (1) is nothing but the so-called "Reilly inequality" established by Reilly [14] for submanifolds of $\mathbb{R}^n$ and by us [5] for submanifolds of $\mathbb{H}^n$, $S^n$ and some other ambient spaces. In this same paper [5], we pointed out that this inequality can be extended to Schrödinger operators. However, considering the importance of the latter, we will develop in the proof of Theorem 2.1 the specific arguments which make this extension possible.

The analysis of the case of equality in the inequality (1) gives rise to a curious phenomenon. Indeed, when the dimension of the submanifold $M$ is greater than two, equality in (1) holds only if $q$ is constant and $M$ is an immersed minimal submanifold of a geodesic sphere of $N^n(c)$ (see Theorem 2.1). On the other hand, when $M$ is a 2-dimensional geodesic sphere of $N^n(c)$, there exists an infinite family of non-constant potentials for which the equality holds in (1). In Theorem 2.2 we determine all these potentials and show that their set can be parametrized by $[0, 1) \times \mathbb{R} \times S^2$. We do not know if there exist other surfaces such that the equality in (1) occurs for non-constant potentials. Nevertheless, in Theorem 2.3 we consider three examples of well-known surfaces (the Clifford torus, the equilateral torus and the Veronese surface)


and show that for each of them the equality in (1) holds if and only if the potential $q$ is constant.

As a final remark, note that the hypothesis "$m \ge 2$" on the dimension of $M$ in Theorem 2.1 is necessary. Indeed, Papanicolaou [12] gave an example of a nonconstant potential on the unit circle of $\mathbb{R}^2$ for which the inequality (1) fails. Moreover, for curves in the hyperbolic plane $\mathbb{H}^2$, the inequality (1) does not hold even if $q$ is constant (cf. [9]). However, according to [7], for the particular Schrödinger operator $L$ the second eigenvalue $\lambda_2(L)$ is still nonpositive for simple closed curves in $\mathbb{R}^3$ and vanishes only for the circle.

2. Statement of the Results

Let $M^m$ be a compact connected differentiable manifold of dimension $m \ge 2$ and let $N^n(c)$ represent the Euclidean space $\mathbb{R}^n$ for $c = 0$, the Euclidean unit sphere $S^n$ for $c = 1$ and the hyperbolic space $\mathbb{H}^n$ for $c = -1$. The standard Riemannian metric of $N^n(c)$ will be denoted by $h_c$. Let $\phi : M^m \to N^n(c)$ be an immersion. We denote by $g = \phi^* h_c$ the metric induced on $M$ by the metric $h_c$ of $N^n(c)$, and by $\Delta$ the corresponding Laplace–Beltrami operator. The second fundamental form of $\phi$ (or of the immersed submanifold $\phi(M) \subset N^n(c)$) will be denoted by $II$. Its mean curvature vector field is $H = \frac{1}{m}\,\mathrm{trace}_g\, II$, whose length is $|H| = (h_c(H, H))^{1/2}$.

For any continuous function $q$ on $M^m$ the spectrum of $-\Delta + q$ consists of an increasing and unbounded sequence of eigenvalues

$$\lambda_1(-\Delta + q) < \lambda_2(-\Delta + q) \le \lambda_3(-\Delta + q) \le \cdots .$$

The first eigenvalue $\lambda_1(-\Delta + q)$ is known to be simple and to satisfy (by virtue of the min-max principle)

$$\lambda_1(-\Delta + q) \le \frac{1}{V(M)} \int_M q\, v_g ,$$

where $V(M)$ and $v_g$ are respectively the Riemannian volume and the Riemannian volume element of $(M, g)$. Moreover, this inequality is strict unless $q$ is constant. Concerning the second eigenvalue we have the following

Theorem 2.1. For any continuous potential $q$ on $M^m$,

$$\lambda_2(-\Delta + q) \le \frac{1}{V(M)} \int_M \left\{ m \left( |H|^2 + c \right) + q \right\} v_g . \qquad (2)$$

If $m \ge 3$, then equality in (2) holds if and only if $q$ is constant and $\phi(M)$ is an immersed minimal submanifold of a geodesic sphere of radius $r_c$ of $N^n(c)$, with $r_0 = \left( m / \lambda_2(-\Delta) \right)^{1/2}$, $r_1 = \arcsin r_0$ and $r_{-1} = \sinh^{-1} r_0$.

In the case where $M$ is 2-dimensional the situation is quite different regarding the equality case in inequality (2). Indeed, let $M$ be the unit 2-sphere $S^2$ and let $\phi$ be a standard embedding of $S^2$ as a geodesic sphere of radius 1 in $\mathbb{R}^3$, of radius $\pi/2$ in $S^3$ or of radius $\sinh^{-1}(1)$ in $\mathbb{H}^3$ (the radii here are chosen such that the metric $g = \phi^* h_c$


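coincides with the standard metric of $S^2$). For such $\phi$ we have $|H|^2 = 1$ in $\mathbb{R}^3$, $|H|^2 = 0$ in $S^3$ and $|H|^2 = 2$ in $\mathbb{H}^3$. Therefore

$$\frac{2}{V(S^2)} \int_{S^2} \left( |H|^2 + c \right) v_g = 2 = \lambda_2(-\Delta).$$

Theorem 2.2. Let $q$ be a continuous function on $S^2$. The equality

$$\lambda_2(-\Delta + q) = 2 + \frac{1}{V(S^2)} \int_{S^2} q\, v_g$$

holds if and only if the potential $q$ has the form

$$q(x) = -\frac{2 b_1}{\left( 1 + \sqrt{1 - b_1}\, \langle x, a \rangle \right)^2} + b_2 ,$$

where $b_1 \in [0, 1)$, $b_2 \in \mathbb{R}$, $a \in S^2$ and $\langle\, ,\, \rangle$ is the Euclidean inner product.

Let us now consider the following three surfaces:

1) The Clifford torus: $M = S^1\!\left(\tfrac{\sqrt 2}{2}\right) \times S^1\!\left(\tfrac{\sqrt 2}{2}\right)$ naturally embedded in $\mathbb{R}^4$.

2) The equilateral torus: $M = \mathbb{R}^2 / \Gamma$ with $\Gamma = \mathbb{Z}(1, 0) \oplus \mathbb{Z}\left(\tfrac12, \tfrac{\sqrt 3}{2}\right)$, embedded in $\mathbb{R}^6$ by

$$\phi(x, y) = \frac{1}{\sqrt 3} \left( \exp\!\left(4 i \pi y / \sqrt 3\right),\ \exp\!\left(2 i \pi (x - y/\sqrt 3)\right),\ \exp\!\left(2 i \pi (x + y/\sqrt 3)\right) \right).$$

3) The Veronese surface: $M = \mathbb{R}P^2 = S^2 / \{\pm 1\}$ embedded in $\mathbb{R}^6$ by

$$\phi(x, y, z) = \left( x^2,\ y^2,\ z^2,\ \sqrt 2\, xy,\ \sqrt 2\, xz,\ \sqrt 2\, yz \right).$$

For each of these three surfaces we have $|H|^2 = 1$ and $\lambda_2(-\Delta) = 2$ (in fact $\phi(M)$ is a minimal submanifold of the unit sphere of $\mathbb{R}^4$ or $\mathbb{R}^6$ and the Euclidean components of $\phi$ are second eigenfunctions of the Laplacian associated to the metric $g = \phi^* h_0$). Hence

$$\frac{2}{V(M)} \int_M |H|^2\, v_g = 2 = \lambda_2(-\Delta).$$

Theorem 2.3. Let $M$ be one of the three surfaces above and $\phi$ its corresponding embedding in $\mathbb{R}^n$. The equality

$$\lambda_2(-\Delta + q) = 2 + \frac{1}{V(M)} \int_M q\, v_g$$

holds if and only if the potential $q$ is constant.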
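The values of $|H|^2$ quoted before Theorem 2.2 can be checked by hand. The sketch below assumes the standard fact (not stated in the text) that a geodesic sphere of radius $r$ in $N^3(c)$ is totally umbilical with principal curvature $1/r$, $\cot r$ or $\coth r$ for $c = 0, 1, -1$ respectively, so that $|H|^2$ is the square of that value.

```latex
% Check of m(|H|^2 + c) = 2 for the three model embeddings of S^2 (m = 2).
\begin{align*}
c = 0:&\quad r = 1, & |H|^2 &= 1/r^2 = 1, & 2\,(|H|^2 + c) &= 2,\\
c = 1:&\quad r = \pi/2, & |H|^2 &= \cot^2 r = 0, & 2\,(|H|^2 + c) &= 2,\\
c = -1:&\quad r = \sinh^{-1}(1), & |H|^2 &= \coth^2 r
      = \frac{1 + \sinh^2 r}{\sinh^2 r} = 2, & 2\,(|H|^2 + c) &= 2.
\end{align*}
```

Under the same assumption, the intrinsic radius of the geodesic sphere is $r$, $\sin r$ or $\sinh r$, which equals 1 in all three cases; this is consistent with the induced metric being the standard metric of $S^2$.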


Let us now come back to the inequality (2) in order to apply it to the particular potential $q = -|II|^2 - mc$, where

$$|II|^2(x) = \sum_{i, j \le m} h_c\left( II(e_i, e_j), II(e_i, e_j) \right),$$

$(e_i)_{1 \le i \le m}$ being an orthonormal local frame at $x$. In the particular codimension one case we have

$$|II|^2(x) = \sum_{i \le m} k_i^2 ,$$

where $k_1, \ldots, k_m$ are the principal curvatures at $x$. The significance of this potential comes from the fact that, in codimension one, the operator $L = -\Delta - |II|^2 - mc$ is the linearization of the mean curvature (see [13]). This operator also appears as the Jacobi operator of the volume functional at a minimal or at a constant mean curvature hypersurface (see [3] and [15]). Moreover, Alikakos and Fusco showed how the negative eigenvalues of $L$ and their associated eigenfunctions are related to the instabilities of the interface in the Allen–Cahn reaction-diffusion model (see [1] and [2]). They conjectured that for any compact surface in $\mathbb{R}^3$ different from a sphere, the operator $L$ has at least two negative eigenvalues, i.e. $\lambda_2(L) < 0$ (for a Euclidean sphere we clearly have $\lambda_2(L) = 0$). Now, Theorem 2.1 applied to the operator $L = -\Delta - |II|^2 - mc$ gives

$$\lambda_2(L) \le \frac{1}{V(M)} \int_M \left( m|H|^2 - |II|^2 \right) v_g .$$

But $|II|^2 - m|H|^2 = |II - H g|^2$ is nonnegative and vanishes identically if and only if $\phi(M)$ is a totally umbilical compact submanifold of $N^n(c)$, i.e. a geodesic sphere. Hence

Corollary 2.1. For any immersion $\phi$ from $M^m$ to $N^n(c)$ we have

$$\lambda_2\left( -\Delta - |II|^2 - mc \right) \le 0,$$

where equality holds if and only if $\phi(M)$ is a geodesic $m$-sphere of $N^n(c)$.

The particular case of this corollary corresponding to an oriented codimension 1 immersed submanifold of $\mathbb{R}^n = N^n(0)$ has been obtained by Harrell and Loss [7].

3. Proof of the Results

Proof of Theorem 2.1. It is known that the first eigenspace of $-\Delta + q$ is generated by a positive eigenfunction $u$. In order to apply the min-max principle to the components of the immersion $\phi$, we need the following orthogonalization step (see [8] and [10] for previous similar results).

First step. There exists a regular conformal map $\Gamma : N^n(c) \to S^n \subset \mathbb{R}^{n+1}$ such that for all $i \le n + 1$, the immersion $\Psi = \Gamma \circ \phi = (\Psi_1, \ldots, \Psi_{n+1})$ satisfies

$$\int_M \Psi_i\, u\, v_g = 0.$$


Proof. Let $\Pi : N^n(c) \to S^n$ be a regular conformal map from $N^n(c)$ to $S^n$ (e.g. for $c = 1$, $\Pi$ is the identity map and for $c = 0$ or $-1$, $\Pi$ is the inverse of a stereographic projection). For any unit vector $a \in S^n \subset \mathbb{R}^{n+1}$, we denote by $(\gamma_t^a)$ the flow of the vector field on $S^n$ given by $\bar a(x) = a - \langle x, a \rangle x$. The $\gamma_t^a$'s are conformal diffeomorphisms of $S^n$ (in fact, $\gamma_t^a(x) = \Pi_a^{-1}\left( e^t\, \Pi_a(x) \right)$, where $\Pi_a$ is the stereographic projection with respect to the pole $a$). We claim that there exists a $\gamma_t^a$ such that $\Gamma = \gamma_t^a \circ \Pi$ satisfies the desired property. Indeed, otherwise, the map $F : [0, +\infty) \times S^n \to S^n \subset \mathbb{R}^{n+1}$ defined by

$$F(t, a) = \frac{\int_M u(x)\, \gamma_t^a \circ \Pi \circ \phi(x)\, v_g}{\left\| \int_M u(x)\, \gamma_t^a \circ \Pi \circ \phi(x)\, v_g \right\|}$$

gives a homotopy on $S^n$ between the identity map and a constant map (indeed, for any $a \in S^n$, $F(+\infty, a) = a$ and $F(0, a) = \int_M u(x)\, \Pi \circ \phi(x)\, v_g \big/ \left\| \int_M u(x)\, \Pi \circ \phi(x)\, v_g \right\|$). □

Second step. Let $\Psi = \Gamma \circ \phi$ be as in the first step. Then

$$\lambda_2(-\Delta + q) \le \frac{1}{V(M)} \int_M |\nabla \Psi|^2\, v_g + \frac{1}{V(M)} \int_M q\, v_g , \qquad (3)$$

where $|\nabla \Psi|^2 = \sum_{i \le n+1} |\nabla \Psi_i|^2$ (here and in the sequel $\nabla f$ denotes the gradient of $f$ with respect to the metric $g$). Moreover, equality in (3) holds if and only if the functions $\Psi_1, \ldots, \Psi_{n+1}$ are second eigenfunctions of $-\Delta + q$.

Proof. The min-max principle gives, for any $i \le n + 1$,

$$\lambda_2(-\Delta + q) \int_M \Psi_i^2\, v_g \le \int_M |\nabla \Psi_i|^2\, v_g + \int_M q\, \Psi_i^2\, v_g . \qquad (4)$$

Summing up and using the fact that $\sum_{i=1}^{n+1} \Psi_i^2 = 1$ we obtain the inequality (3). Now, if the equality holds in (3), then the equality in (4) holds for each $\Psi_i$. Thus, for any $i \le n + 1$, $(-\Delta + q)\Psi_i = \lambda_2(-\Delta + q)\Psi_i$. □

Third step. For any regular conformal map $\Gamma : N^n(c) \to S^n$ we have:

$$\int_M |\nabla(\Gamma \circ \phi)|^2\, v_g \le m \int_M \left( |H|^2 + c \right) v_g . \qquad (5)$$

Moreover, for $m \ge 3$ equality implies that $|\nabla(\Gamma \circ \phi)|^2$ is constant on $M$.

Proof. Let $\overline{II}$ and $\bar H$ be respectively the second fundamental form and the mean curvature of $\Gamma(\phi(M))$ viewed as an immersed submanifold of $S^n$. From the conformal invariance of the traceless part of the second fundamental form we can deduce the following formula (cf. [5]):

$$|II|^2 - m|H|^2 = \frac{|\nabla(\Gamma \circ \phi)|^2}{m} \left( |\overline{II}|^2 - m|\bar H|^2 \right).$$

The Gauss equation gives

$$|II|^2 - m|H|^2 = m(m - 1)|H|^2 + m(m - 1)c - \mathrm{scal}_g ,$$

where $\mathrm{scal}_g$ is the scalar curvature of $(M, g)$. Likewise

$$|\overline{II}|^2 - m|\bar H|^2 = m(m - 1)|\bar H|^2 + m(m - 1) - \mathrm{scal}_{\bar g} ,$$


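where $\bar g = (\Gamma \circ \phi)^* h_1$ is the metric induced on $M$ by the standard metric of $S^n$. Hence

$$|\nabla(\Gamma \circ \phi)|^2 = m \left( |H|^2 + c \right) - |\nabla(\Gamma \circ \phi)|^2\, |\bar H|^2 - \frac{1}{m - 1} \left( \mathrm{scal}_g - \frac{|\nabla(\Gamma \circ \phi)|^2}{m}\, \mathrm{scal}_{\bar g} \right). \qquad (6)$$

As $\Gamma \circ \phi$ is a conformal map from $(M, g)$ to $(S^n, h_1)$ we have $\bar g = \frac{|\nabla(\Gamma \circ \phi)|^2}{m}\, g$. Therefore, using the conformal change formula for the scalar curvature we obtain

$$\mathrm{scal}_g - \frac{|\nabla(\Gamma \circ \phi)|^2}{m}\, \mathrm{scal}_{\bar g} = (m - 1) \left[ (m - 2) \left| \nabla \ln |\nabla(\Gamma \circ \phi)| \right|^2 + \Delta \ln |\nabla(\Gamma \circ \phi)|^2 \right].$$

Substitution into (6) yields after integration

$$\int_M |\nabla(\Gamma \circ \phi)|^2\, v_g = m \int_M \left( |H|^2 + c \right) v_g - \int_M |\nabla(\Gamma \circ \phi)|^2\, |\bar H|^2\, v_g - (m - 2) \int_M \left| \nabla \ln |\nabla(\Gamma \circ \phi)| \right|^2 v_g .$$

This proves the claim of the third step. □

End of the proof of Theorem 2.1. Inequality (2) follows immediately from inequalities (3) and (5). If $q$ is constant, equality in (2) becomes equivalent to the equality

$$\lambda_2(-\Delta) = \frac{m}{V(M)} \int_M \left( |H|^2 + c \right) v_g .$$

But in [5] we proved that this equality occurs if and only if $\phi(M)$ is a minimal immersed submanifold of a geodesic sphere of radius $r_c$ of $N^n(c)$ (with $r_0 = (m / \lambda_2(-\Delta))^{1/2}$, $r_1 = \arcsin r_0$ and $r_{-1} = \sinh^{-1} r_0$). Hence, it remains to prove that, if $m \ge 3$, then the equality in (2) implies the constancy of $q$. Now, suppose the equality in (2) holds. Then, using the second and third steps, $|\nabla \Psi|^2$ must be constant and $\Psi_1, \ldots, \Psi_{n+1}$ must be second eigenfunctions of $-\Delta + q$. As $\sum_{i=1}^{n+1} \Psi_i^2 = 1$ we obtain

$$q = \left( -\tfrac{1}{2} \Delta + q \right) \left( \sum_i \Psi_i^2 \right) = \sum_i \left[ (-\Delta + q)(\Psi_i)\, \Psi_i - |\nabla \Psi_i|^2 \right] = \sum_i \left[ \lambda_2(-\Delta + q)\, \Psi_i^2 - |\nabla \Psi_i|^2 \right] = \lambda_2(-\Delta + q) - |\nabla \Psi|^2 . \qquad (7)$$

Since $|\nabla \Psi|^2$ is constant, so is $q$. □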
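The first equality in the chain (7) rests on the product rule for the Laplacian. As a worked step, using only the sign convention of the text (in which $-\Delta$ is a nonnegative operator), the computation for a single component $\Psi_i$ can be sketched as follows.

```latex
% Product rule for one component, then add q \Psi_i^2 to both sides:
\begin{align*}
\Delta(\Psi_i^2) &= 2\,\Psi_i\,\Delta\Psi_i + 2\,|\nabla\Psi_i|^2,\\
\Bigl(-\tfrac12\Delta + q\Bigr)\Psi_i^2
  &= \bigl((-\Delta + q)\Psi_i\bigr)\,\Psi_i - |\nabla\Psi_i|^2 .
\end{align*}
% Summing over i and using \sum_i \Psi_i^2 = 1, the left-hand side sums to q,
% and the eigenvalue equation (-\Delta + q)\Psi_i = \lambda_2 \Psi_i gives:
\[
q = \sum_i \bigl( \lambda_2(-\Delta+q)\,\Psi_i^2 - |\nabla\Psi_i|^2 \bigr)
  = \lambda_2(-\Delta+q) - |\nabla\Psi|^2 .
\]
```

The only inputs are the constraint $\sum_i \Psi_i^2 = 1$ and the fact that each $\Psi_i$ is a second eigenfunction; the constancy of $q$ then follows from the constancy of $|\nabla\Psi|^2$.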

Proof of Theorem 2.2. Recall that if $\gamma$ is a conformal diffeomorphism of the standard 2-sphere $(S^2, g)$, then there exist an isometry $\rho$, a unit vector $a \in S^2$ and a nonnegative real number $t$ such that $\gamma = \rho \circ \gamma_t^a$, where $\gamma_t^a$ is the flow defined in the proof of Theorem 2.1 (see [4]). The energy density of $\gamma$ is given by (see [5]):

$$|\nabla \gamma|^2(x) = |\nabla \gamma_t^a|^2(x) = \frac{2}{\left( \cosh t + \sinh t\, \langle x, a \rangle \right)^2}$$

and thus


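$$\gamma^* g = \frac{|\nabla \gamma|^2}{2}\, g = \frac{1}{\left( \cosh t + \sinh t\, \langle x, a \rangle \right)^2}\, g = \frac{b_1}{\left( 1 + \sqrt{1 - b_1}\, \langle x, a \rangle \right)^2}\, g ,$$

where $b_1 = (\cosh t)^{-2}$. The point of Theorem 2.2 is then to show that the potentials for which the equality in (2) holds are, up to an additive constant, exactly the functions $-|\nabla \gamma|^2$, where $|\nabla \gamma|^2$ is the energy density of a conformal diffeomorphism $\gamma$ of $S^2$.

Now let $q = -|\nabla \gamma|^2 + b_2$, where $\gamma$ is a conformal diffeomorphism of $S^2$ and $b_2 \in \mathbb{R}$. Let $f$ be a degree one spherical harmonic (i.e. $-\Delta f = 2f$). The function $f \circ \gamma$ is then a second eigenfunction of $-\Delta_{\gamma^* g}$ (i.e. $-\Delta_{\gamma^* g}(f \circ \gamma) = 2\, f \circ \gamma$). It follows that

$$-\Delta(f \circ \gamma) = -\frac{|\nabla \gamma|^2}{2}\, \Delta_{\gamma^* g}(f \circ \gamma) = |\nabla \gamma|^2\, (f \circ \gamma).$$

Therefore, zero is an eigenvalue of $-\Delta - |\nabla \gamma|^2$. We claim that zero is in fact the second eigenvalue of $-\Delta - |\nabla \gamma|^2$. Indeed, the quadratic forms $Q_1$ and $Q_2$ associated respectively to the operators $-\Delta - |\nabla \gamma|^2$ and $-\Delta_{\gamma^* g} - 2$ coincide:

$$Q_2(w) = \int_{S^2} w \left( -\Delta_{\gamma^* g} - 2 \right) w\, v_{\gamma^* g} = \int_{S^2} w \left( -\frac{2}{|\nabla \gamma|^2}\, \Delta_g - 2 \right) w\, \frac{|\nabla \gamma|^2}{2}\, v_g = \int_{S^2} w \left( -\Delta_g - |\nabla \gamma|^2 \right) w\, v_g = Q_1(w).$$

Consequently, the operators $-\Delta - |\nabla \gamma|^2$ and $-\Delta_{\gamma^* g} - 2$ have the same number of negative eigenvalues. But $\lambda_1(-\Delta_{\gamma^* g} - 2) = -2$ and $\lambda_2(-\Delta_{\gamma^* g} - 2) = 0$. It follows that $-\Delta - |\nabla \gamma|^2$ has exactly one negative eigenvalue and thus $\lambda_2(-\Delta - |\nabla \gamma|^2) = 0$. In conclusion we have

$$\lambda_2(-\Delta + q) = \lambda_2(-\Delta - |\nabla \gamma|^2 + b_2) = b_2$$

and

$$2 + \frac{1}{V(S^2)} \int_{S^2} q\, v_g = 2 - \frac{1}{V(S^2)} \int_{S^2} |\nabla \gamma|^2\, v_g + b_2 = 2 - \frac{2}{V(S^2)} \int_{S^2} v_{\gamma^* g} + b_2 = b_2 .$$

Reciprocally, let $q$ be such that

$$\lambda_2(-\Delta + q) = 2 + \frac{1}{V(S^2)} \int_{S^2} q\, v_g .$$

This implies that, using the same notation as in the second step of the proof of Theorem 2.1, for any $i \le n + 1$, $-\Delta \Psi_i + q \Psi_i = \lambda_2(-\Delta + q) \Psi_i$. As in Eq. (7) above, we deduce that $\lambda_2(-\Delta + q) - q = |\nabla \Psi|^2$.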


Hence, the function $q_0 = \lambda_2(-\Delta + q) - q$ is positive and $\bar g = \frac{q_0}{2}\, g$ is a Riemannian metric on $S^2$. We clearly have $\lambda_2(-\Delta - q_0) = 0$.

If $f$ is a second eigenfunction of $-\Delta - q_0$ (i.e. $(-\Delta - q_0) f = 0$) then

$$-\Delta_{\bar g} f = -\frac{2}{q_0}\, \Delta_g f = 2 f .$$

Therefore, zero is an eigenvalue of $-\Delta_{\bar g} - 2$. Using the same arguments as before we deduce that zero is in fact the second eigenvalue of $-\Delta_{\bar g} - 2$. Thus $\lambda_2(-\Delta_{\bar g}) = 2 = \lambda_2(-\Delta_g)$. Moreover, the metrics $g$ and $\bar g$ have the same Riemannian area:

$$\int_{S^2} v_{\bar g} = \frac{1}{2} \int_{S^2} q_0\, v_g = \frac{1}{2} \left( \lambda_2(-\Delta + q)\, V(S^2) - \int_{S^2} q\, v_g \right) = V(S^2).$$

From the well-known Hersch theorem [8], the metrics $g$ and $\bar g$ must be isometric. That is, there exists a diffeomorphism $\gamma$ of $S^2$ satisfying $\gamma^* g = \bar g = \frac{q_0}{2}\, g$. Consequently, $\gamma$ is conformal and $\gamma^* g = \frac{|\nabla \gamma|^2}{2}\, g$. In conclusion, we have $q_0 = |\nabla \gamma|^2$ and thus $q$ is equal, up to a constant, to $-|\nabla \gamma|^2$. □
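Two computations are used repeatedly in the proof above, and both can be sketched explicitly. The first is the conformal covariance of the Laplacian in dimension two, written here for a generic conformal factor $\rho > 0$ (the symbol $\rho$ is notation introduced for this sketch only; in the proof it stands for $|\nabla\gamma|^2/2$ or $q_0/2$). The second is the change of parameter $b_1 = (\cosh t)^{-2}$ in the energy density.

```latex
% Dimension 2, \bar g = \rho\, g with \rho > 0:
\[
\Delta_{\bar g} = \rho^{-1}\,\Delta_g, \qquad v_{\bar g} = \rho\, v_g,
\qquad\text{hence}\qquad
\int_{S^2} w\,(-\Delta_{\bar g} - 2)\,w\; v_{\bar g}
  = \int_{S^2} w\,(-\Delta_g - 2\rho)\,w\; v_g .
\]
% Parametrization of the energy density, with \tanh t = \sqrt{1 - b_1} for t \ge 0:
\[
\frac{2}{\bigl(\cosh t + \sinh t\,\langle x,a\rangle\bigr)^2}
  = \frac{2\cosh^{-2} t}{\bigl(1 + \tanh t\,\langle x,a\rangle\bigr)^2}
  = \frac{2\,b_1}{\bigl(1 + \sqrt{1-b_1}\,\langle x,a\rangle\bigr)^2}.
\]
```

With $\rho = |\nabla\gamma|^2/2$ the first identity is exactly the equality $Q_2(w) = Q_1(w)$, and with $\rho = q_0/2$ it gives the passage from $-\Delta - q_0$ to $-\Delta_{\bar g} - 2$.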

Proof of Theorem 2.3. Let $q$ be such that

$$\lambda_2(-\Delta + q) = 2 + \frac{1}{V(M)} \int_M q\, v_g .$$

The same arguments as in the proof of Theorem 2.2 show that the function $q_0 = \lambda_2(-\Delta + q) - q$ is positive and that the metrics $g$ and $\bar g = \frac{q_0}{2}\, g$ have the same volume and satisfy $\lambda_2(-\Delta_{\bar g}) = \lambda_2(-\Delta_g)$. A result obtained independently by Montiel and Ros [11] and us [4] shows that for the surfaces under consideration we must have $\bar g = g$. Thus $q$ is constant. □

References

1. Alikakos, N.D., Fusco, G.: The spectrum of the Cahn–Hilliard operator for generic interface in higher space dimensions. Indiana U. Math. J. 4, 637–674 (1993)
2. Alikakos, N.D., Fusco, G., Stefanopoulos, V.: Critical spectrum and stability of interfaces for a class of reaction-diffusion equations. J. Diff. Equat. 126, 106–167 (1996)
3. Barbosa, L., Do Carmo, M., Eschenburg, J.: Stability of hypersurfaces of constant mean curvature. Math. Z. 197, 123–138 (1988)
4. El Soufi, A., Ilias, S.: Immersions minimales, première valeur propre du laplacien et volume conforme. Math. Ann. 275, 257–267 (1986)
5. El Soufi, A., Ilias, S.: Une inégalité du type "Reilly" pour les sous-variétés de l'espace hyperbolique. Comment. Math. Helvetici 67, 167–181 (1992)
6. Harrell II, E.M.: On the second eigenvalue of the Laplace operator penalized by curvature. J. Diff. Geom. Appl. 6, 397–400 (1996)
7. Harrell II, E.M., Loss, M.: On the Laplace operator penalized by mean curvature. Commun. Math. Phys. 195, 643–650 (1998)
8. Hersch, J.: Quatre propriétés isopérimétriques de membranes sphériques homogènes. C.R. Acad. Sci. Paris, Sér. A–B 270, 1645–1648 (1970)
9. Langer, J., Singer, T.A.: The total squared curvature of closed curves. J. Diff. Geom. 20, 1–22 (1984)
10. Li, P., Yau, S.T.: A new conformal invariant and its application etc. Invent. Math. 69, 269–291 (1982)
11. Montiel, S., Ros, A.: Minimal immersions of surfaces by the first eigenfunctions and conformal area. Invent. Math. 83, 153–166 (1986)
12. Papanicolaou, V.G.: The second periodic eigenvalue and the Alikakos–Fusco conjecture. J. Diff. Equat. 130, 321–332 (1996)
13. Reilly, R.C.: Variational properties of functions of the mean curvatures for hypersurfaces in space forms. J. Diff. Geom. 8, 465–477 (1973)
14. Reilly, R.C.: On the first eigenvalue of the Laplacian for compact submanifolds of Euclidean space. Comment. Math. Helvetici 52, 525–533 (1977)
15. Simons, J.: Minimal varieties in Riemannian manifolds. Ann. Math. (2) 88, 62–105 (1968)

Communicated by B. Simon

Commun. Math. Phys. 208, 771 – 785 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Poincaré–Lelong Approach to Universality and Scaling of Correlations Between Zeros

Pavel Bleher^{1,*}, Bernard Shiffman^{2,**}, Steve Zelditch^{2,***}

1 Department of Mathematical Sciences, IUPUI, Indianapolis, IN 46202, USA. E-mail: [email protected]
2 Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail: [email protected]; [email protected]

Received: 17 March 1999 / Accepted: 5 August 1999

* Research partially supported by NSF grant #DMS-9623214.
** Research partially supported by NSF grant #DMS-9800479.
*** Research partially supported by NSF grant #DMS-9703775.

Abstract: This note is concerned with the scaling limit as $N \to \infty$ of $n$-point correlations between zeros of random holomorphic polynomials of degree $N$ in $m$ variables. More generally we study correlations between zeros of holomorphic sections of powers $L^N$ of any positive holomorphic line bundle $L$ over a compact Kähler manifold. Distances are rescaled so that the average density of zeros is independent of $N$. Our main result is that the scaling limits of the correlation functions and, more generally, of the "correlation forms" are universal, i.e. independent of the bundle $L$, manifold $M$ or point on $M$.

Introduction

This note is a companion to our article [BSZ], in which we study the correlations between the zeros of a random holomorphic section $s \in H^0(M, L^N)$ of a power $L^N$ of a positive line bundle $L \to M$ over a compact $m$-dimensional complex manifold $M$. Since the hypersurface volume of the zeros of a section of $L^N$ in a ball $U$ around a given point $z^0$ is $\sim N\, \mathrm{Vol}(U)$, we rescale $U \to \sqrt{N}\, U$ to get a density of zeros independent of $N$. After expanding $U$ this way, all manifolds and line bundles appear asymptotically alike, and it is natural to ask if the local statistics of zeros are universal, i.e. independent of $L$, $M$, $\omega$ and $z^0$.

To define our statistics, we first provide $H^0(M, L^N)$ with a natural Gaussian measure (see Sects. 1.1–1.2). The local statistics are measured by the scaled $n$-point zero correlation forms $\mathbf{K}_n^N\!\left( \frac{z^1}{\sqrt N}, \ldots, \frac{z^n}{\sqrt N} \right)$, $z^j \in \sqrt{N}\, U$ (see Sect. 1.3). They are smooth off the diagonal, and their norms define scaled zero correlation measures $\widetilde{K}_n^N\!\left( \frac{z^1}{\sqrt N}, \ldots, \frac{z^n}{\sqrt N} \right)$. (The scaled correlation forms extend to all of $(\sqrt{N}\, U)^n$ as currents of order 0.) The correlation measures $\widetilde{K}_n^N$ give the expected value of the product of the volumes of the zero set in $n$ domains in $M$; see Eq. (12).

In [BSZ], we used geometric probability methods and a (universal) scaled Szegö kernel to prove that there exist universal limits as $N \to \infty$ of these correlation measures and more generally of the correlations between simultaneous zeros of $k \le m$ sections. Here we use a complex analytic approach based on the Poincaré–Lelong formula for the currents of integration over the zero set of a section, together with the scaled Szegö kernel from [BSZ], to prove universality for the correlation forms. This approach, although limited to the hypersurface case, allows for a result on the level of forms and a somewhat simpler proof. Our universality theorem is as follows:

Main Theorem. There is a universal current $\mathbf{K}_n^\infty \in \mathcal{D}'^{(m-1)n, (m-1)n}(\mathbb{C}^{mn})$ such that the following holds: suppose that $(L, h)$ is a positive Hermitian line bundle on an $m$-dimensional compact complex manifold $M$, and let $\mathbf{K}_n^N$ be the $n$-point zero correlation current on $M^n$. Suppose $z^0 \in M$ and choose local holomorphic coordinates in $M$ about $z^0$ such that $\Theta_h|_{z^0} = \partial \bar\partial |z|^2$. Then

$$\mathbf{K}_n^N\!\left( \frac{z^1}{\sqrt N}, \ldots, \frac{z^n}{\sqrt N} \right) = \mathbf{K}_n^\infty(z^1, \ldots, z^n) + O\!\left( \frac{1}{\sqrt N} \right).$$

Furthermore, $\mathbf{K}_n^\infty$ is a smooth form on the off-diagonal domain $G_n^m$ consisting of $n$-tuples of distinct points $z^1, \ldots, z^n$ in $\mathbb{C}^m$. The error term has $k$-th order derivatives $\le \frac{C_{A,k}}{\sqrt N}$ on each compact subset $A \subset G_n^m$, $\forall k \ge 0$.

This result is new in all dimensions, and the proof does not simplify in any essential way in the case m = 1 of Riemann surfaces. Our method leads to integral formulae for the universal limit forms, although the details rapidly become complicated as the number n of points increases. For the case n = 2 of the pair correlation function, we carry out the calculation in complete detail in dimension m = 1 and also use the method to obtain an explicit formula for the scaling limit pair correlation measures in all dimensions (Theorems 4.1 and 4.2). In particular, our formula gives the scaling limit pair correlations for SU(m + 1)-polynomials (which are sections of powers of the hyperplane bundle over complex projective space $\mathbb{CP}^m$). The universal formula in dimension m = 1 agrees, as it must, with that of Hannay [Ha] in the case of random SU(2)-polynomials. (The universality of this formula was not proved previously.) Similar formulas for correlations of zeros of real polynomials were given in [BD].

Before we get started on the proof, the following heuristic remark on correlation measures and forms may be helpful. Although the definition of the correlation measures is formulated in terms of expected volumes of the zero set (or in the one-dimensional case, expected numbers of zeros) in domains of M, it can also be given a probabilistic interpretation: the probability that the zero divisor of a random section s simultaneously intersects balls of radius $\varepsilon$ around $z^1, \dots, z^n$, respectively, is $\approx c\,\varepsilon^{2n} K_n^N(z^1, \dots, z^n)$ (where $K_n^N$ is the correlation function given by $K_n^N\,dV_M = \widetilde K_n^N$). The correlation form $K_n^N$ gives a more refined probability that the zero divisor of s has tangent hyperplanes close to n fixed complex hyperplanes in $T_M$.

A final remark on the term “universality”: in this paper it means independence from the details of the geometric setting, i.e. from the complex manifold, line bundle, metric, connection, etc. In random matrix theory (cf. [D]) it has a somewhat different meaning.

Poincaré–Lelong Approach to Correlations Between Zeros


There, the “setting” consists of a fixed class of N × N matrices; what varies is the probability measure $d\nu_N$ on the space of matrices. Universality then means that the correlation functions in the scaling limit are the same for a broad class of measures $d\nu_N$. This notion of universality makes sense as well for random polynomials and sections. We could study the statistics of zeros relative to more general measures on $H^0(M, L^N)$ than Gaussian ones, e.g. measures of the form

$$d\nu_N(s) = e^{-\int_M \left(|\nabla s(z)|^2 + a|s(z)|^4 + b|s(z)|^2\right) dV(z)}\,ds.$$

Such measures are biased against sections with strong oscillations, and one could ask how that affects the statistics of zeros. However, we do not consider universality in the measure aspect in this paper.

1. Notation

We summarize here the notation from complex analysis that we will need in the proof. This notation is the same as in [SZ] and [BSZ], except that different normalizations for the metric and volume form are used in [SZ].

1.1. Complex geometry. We denote by $(L, h) \to M$ a holomorphic line bundle with smooth Hermitian metric h whose curvature form

$$\Theta_h = -\partial\bar\partial \log \|e_L\|_h^2 \tag{1}$$

is a positive (1,1)-form. Here, $e_L$ is a local non-vanishing holomorphic section of L over an open set $U \subset M$, and $\|e_L\|_h = h(e_L, e_L)^{1/2}$ is the h-norm of $e_L$. As in [BSZ], we give M the Hermitian metric corresponding to the Kähler form $\omega = \frac{\sqrt{-1}}{2}\Theta_h$ and the induced Riemannian volume form

$$dV_M = \frac{1}{m!}\,\omega^m. \tag{2}$$

We denote by $H^0(M, L^N)$ the space of holomorphic sections of $L^N = L \otimes \cdots \otimes L$. The metric h induces Hermitian metrics $h_N$ on $L^N$ given by $\|s^{\otimes N}\|_{h_N} = \|s\|_h^N$. We give $H^0(M, L^N)$ the Hermitian inner product

$$\langle s_1, s_2 \rangle = \int_M h_N(s_1, s_2)\,dV_M \qquad (s_1, s_2 \in H^0(M, L^N)), \tag{3}$$

and we write $|s| = \langle s, s\rangle^{1/2}$.

For a holomorphic section $s \in H^0(M, L^N)$, we let $Z_s$ denote the current¹ of integration over the zero divisor of s:

$$(Z_s, \varphi) = \int_{Z_s} \varphi, \qquad \varphi \in \mathcal{D}^{m-1,m-1}(M).$$

The Poincaré–Lelong formula (see e.g., [GH]) expresses the integration current of a holomorphic section $s = g\,e_L^{\otimes N}$ in the form:

$$Z_s = \frac{i}{\pi}\,\partial\bar\partial \log|g| = \frac{i}{\pi}\,\partial\bar\partial \log\|s\|_{h_N} + \frac{N}{\pi}\,\omega. \tag{4}$$

¹ Here, $\mathcal{D}^{p,q}(\Omega)$ denotes the space of compactly supported (p, q) forms on a complex manifold $\Omega$. A current is an element of the dual space $\mathcal{D}^{p,q}(\Omega)' = \mathcal{D}'^{\,m-p,m-q}(\Omega)$.

We also denote by $|Z_s|$ the Riemannian (2m − 2)-volume along the regular points of $Z_s$, regarded as a measure on M:

$$(|Z_s|, \varphi) = \int_{Z_s^{\mathrm{reg}}} \varphi\,d\mathrm{Vol}_{2m-2} = \frac{1}{(m-1)!}\int_{Z_s^{\mathrm{reg}}} \varphi\,\omega^{m-1}; \tag{5}$$

i.e., $|Z_s|$ is the “total variation measure” of the current of integration over $Z_s$:

$$|Z_s| = Z_s \wedge \frac{1}{(m-1)!}\,\omega^{m-1}. \tag{6}$$

1.2. Random sections and Gaussian measures. We now give $H^0(M, L^N)$ the complex Gaussian probability measure

$$d\mu(s) = \frac{1}{\pi^{d_N}}\,e^{-|c|^2}\,dc, \qquad s = \sum_{j=1}^{d_N} c_j S_j^N, \tag{7}$$

where $\{S_j^N : 1 \le j \le d_N\}$ is an orthonormal basis for $H^0(M, L^N)$ and dc is $2d_N$-dimensional Lebesgue measure. This Gaussian is characterized by the property that the $2d_N$ real variables
(8)

in the sense that for any test form $\varphi_1(z^1) \otimes \cdots \otimes \varphi_n(z^n) \in \mathcal{D}^{m-1,m-1}(M) \otimes \cdots \otimes \mathcal{D}^{m-1,m-1}(M)$,

$$\big(K_n^N(z^1, \dots, z^n),\ \varphi_1(z^1) \otimes \cdots \otimes \varphi_n(z^n)\big) = E\big[(Z_s, \varphi_1)(Z_s, \varphi_2) \cdots (Z_s, \varphi_n)\big]. \tag{9}$$

In a similar way we define the n-point correlation measures $\widetilde K_n^N$ as the “total variation measures” of the n-point correlation currents:

$$\widetilde K_n^N(z^1, \dots, z^n) = K_n^N(z^1, \dots, z^n) \wedge \frac{1}{(m-1)!}\,\omega_{z^1}^{m-1} \wedge \cdots \wedge \frac{1}{(m-1)!}\,\omega_{z^n}^{m-1}. \tag{10}$$

By (6) and (10), we have

$$\big(\widetilde K_n^N(z^1, \dots, z^n),\ \varphi_1(z^1) \cdots \varphi_n(z^n)\big) = E\big[(|Z_s|, \varphi_1)(|Z_s|, \varphi_2) \cdots (|Z_s|, \varphi_n)\big], \tag{11}$$

where $\varphi_j \in C^0(M)$. Equivalently,

$$\widetilde K_n^N(U_1 \times \cdots \times U_n) = E\Big[\prod_{j=1}^n \mathrm{Vol}(Z_s \cap U_j)\Big], \qquad U_1, \dots, U_n \subset M. \tag{12}$$

Remark. In the case of pair correlation on a Riemann surface (n = 2, dim M = 1), the correlation measures take the form

$$K_2^N(z, w) = [\Delta] \wedge (K_1^N(z) \otimes 1) + \kappa^N(z, w)\,\omega_z \otimes \omega_w \qquad (N \gg 0),$$

where $[\Delta]$ denotes the current of integration along the diagonal $\Delta = \{(z, z)\} \subset M \times M$, and $\kappa^N \in C^\infty(M \times M)$.

1.4. Szegö kernels. As in [Ze,SZ,BSZ] and elsewhere, we analyze the $N \to \infty$ limit by lifting it to a principal $S^1$ bundle $\pi : X \to M$. Let us recall how this goes. We denote by $L^*$ the dual line bundle to L, and define X as the circle bundle $X = \{\lambda \in L^* : \|\lambda\|_{h^*} = 1\}$, where $h^*$ is the norm on $L^*$ dual to h. We can view X as the boundary of the disc bundle $D = \{\lambda \in L^* : \rho(\lambda) > 0\}$, where $\rho(\lambda) = 1 - \|\lambda\|_{h^*}^2$. The disc bundle D is strictly pseudoconvex in $L^*$, since $\Theta_h$ is positive, and hence X inherits the structure of a strictly pseudoconvex CR manifold. Associated to X is the contact form $\alpha = -i\,\partial\rho|_X = i\,\bar\partial\rho|_X$. We also give X the volume form

$$dV_X = \frac{1}{m!}\,\alpha \wedge (d\alpha)^m = \alpha \wedge \pi^* dV_M. \tag{13}$$

The setting for our analysis of the Szegö kernel is the Hardy space $H^2(X) \subset L^2(X)$ of square integrable CR functions on X, where we use the inner product

$$\langle F_1, F_2 \rangle = \frac{1}{2\pi}\int_X F_1 \overline{F_2}\,dV_X, \qquad F_1, F_2 \in L^2(X). \tag{14}$$

We let $r_\theta x = e^{i\theta}x$ ($x \in X$) denote the $S^1$ action on X. The action $r_\theta$ commutes with the Cauchy-Riemann operator $\bar\partial_b$; hence $H^2(X) = \bigoplus_{N=0}^\infty H_N^2(X)$, where $H_N^2(X) = \{F \in H^2(X) : F(r_\theta x) = e^{iN\theta}F(x)\}$. A section $s_N$ of $L^N$ determines an equivariant function $\hat s_N$ on X:

$$\hat s_N(z, \lambda) = \big(\lambda^{\otimes N}, s_N(z)\big), \qquad (z, \lambda) \in X; \tag{15}$$

then $\hat s_N(r_\theta x) = e^{iN\theta}\hat s_N(x)$. The map $s \mapsto \hat s$ is a unitary equivalence between $H^0(M, L^{\otimes N})$ and $H_N^2(X)$. We let $\Pi_N : L^2(X) \to H_N^2(X)$ denote the orthogonal projection. The Szegö kernel $\Pi_N(x, y)$ is defined by

$$\Pi_N F(x) = \int_X \Pi_N(x, y)F(y)\,dV_X(y), \qquad F \in L^2(X). \tag{16}$$

It can be given as

$$\Pi_N(x, y) = \sum_{j=1}^{d_N} \hat S_j^N(x)\,\overline{\hat S_j^N(y)}, \tag{17}$$

where $S_1^N, \dots, S_{d_N}^N$ form an orthonormal basis of $H^0(M, L^N)$.

2. Scaling

In order that we may study the local nature of the random variable $Z_s$, we fix a point $z^0 \in M$ and choose a holomorphic coordinate chart $\Psi : \Omega, 0 \to U, z^0$ ($\Omega \subset \mathbb{C}^m$, $U \subset M$) such that

$$\Psi^*\omega_{z^0} = \frac{i}{2}\sum_{j=1}^m dz_j \wedge d\bar z_j\Big|_0. \tag{18}$$

For example, if L is the hyperplane section bundle O(1) over $\mathbb{CP}^m$ with the Fubini-Study metric $h_{FS}$, and $z^0 = (1 : 0 : \cdots : 0)$, then the coordinate chart $\Psi : \mathbb{C}^m \to U = \{w \in \mathbb{CP}^m : w_0 \ne 0\}$, $\Psi(z) = (1 : z_1 : \cdots : z_m)$ (i.e., $z_j = w_j/w_0$) satisfies (18).

To simplify notation, we identify U with $\Omega$. For a current $T \in \mathcal{D}'^{\,p,q}(\Omega)$, we write

$$T\Big(\frac{z}{\sqrt N}\Big) = (\tau_{\sqrt N})_* T \in \mathcal{D}'^{\,p,q}(\sqrt{N}\,\Omega), \qquad (\tau_\lambda(z) = \lambda z).$$

(In particular, if $T = \sum T_{jk}(z)\,dz_j \wedge d\bar z_k$, then $T(\frac{z}{\sqrt N}) = \frac{1}{N}\sum T_{jk}(\frac{z}{\sqrt N})\,dz_j \wedge d\bar z_k$.)

We define the rescaled zero current of $s \in H^0(M, L^N)$ by

$$\hat Z_s^N(z) := Z_s\Big(\frac{z}{\sqrt N}\Big). \tag{19}$$

The scaled n-point correlation currents are then defined by:

$$E\big[\hat Z_s^N(z^1) \otimes \hat Z_s^N(z^2) \otimes \cdots \otimes \hat Z_s^N(z^n)\big] = K_n^N\Big(\frac{z^1}{\sqrt N}, \dots, \frac{z^n}{\sqrt N}\Big) \in \mathcal{D}'^{\,n,n}(M^n). \tag{20}$$

Following the approach of [SZ], we fix an orthonormal basis $\{S_j^N\}$ of $H^0(M, L^N)$ and write $S_j^N = f_j^N e_L^{\otimes N}$ over U. Any section in $H^0(M, L^N)$ may then be written as $s = \sum_{j=1}^{d_N} c_j f_j^N e_L^{\otimes N}$. To simplify the notation we let $f^N = (f_1^N, \dots, f_{d_N}^N) : U \to \mathbb{C}^{d_N}$ and we put

$$\sum_{j=1}^{d_N} c_j f_j^N = c \cdot f^N.$$

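Hence

$$Z_s = \frac{\sqrt{-1}}{\pi}\,\partial\bar\partial \log|c \cdot f^N|, \qquad \hat Z_s^N = \frac{\sqrt{-1}}{\pi}\,\partial_z\bar\partial_z \log\Big|c \cdot f^N\Big(\frac{z}{\sqrt N}\Big)\Big|, \tag{21}$$

and therefore

$$\hat Z_s(z^1) \otimes \cdots \otimes \hat Z_s(z^n) = \Big(\frac{i}{\pi}\Big)^n \partial_{z^1}\bar\partial_{z^1} \cdots \partial_{z^n}\bar\partial_{z^n}\Big[\log\Big|c \cdot f^N\Big(\frac{z^1}{\sqrt N}\Big)\Big| \cdots \log\Big|c \cdot f^N\Big(\frac{z^n}{\sqrt N}\Big)\Big|\Big]. \tag{22}$$

We then can write the rescaled correlation currents in the form

$$K_n^N\Big(\frac{z^1}{\sqrt N}, \dots, \frac{z^n}{\sqrt N}\Big) = E\big[\hat Z_s(z^1) \otimes \cdots \otimes \hat Z_s(z^n)\big] = \Big(\frac{i}{\pi}\Big)^n \partial_{z^1}\bar\partial_{z^1} \cdots \partial_{z^n}\bar\partial_{z^n} \int_{\mathbb{C}^{d_N}} \log\Big|c \cdot f^N\Big(\frac{z^1}{\sqrt N}\Big)\Big| \cdots \log\Big|c \cdot f^N\Big(\frac{z^n}{\sqrt N}\Big)\Big|\,\frac{e^{-|c|^2}}{\pi^{d_N}}\,dc. \tag{23}$$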

2.1. Scaling limit of the Szegö kernel. The asymptotics of the Szegö kernel along the diagonal were given by [Ti] and [Ze]:

$$\frac{\pi^m}{N^m}\,\Pi_N(x, x) = 1 + O(N^{-1}). \tag{24}$$

For our proof of the Main Theorem, we need the following lemma from [BSZ], which gives the “near-diagonal” asymptotics of the Szegö kernel.

Lemma 2.1. Let $z^0 \in M$ and choose local coordinates $\{z_j\}$ in a neighborhood of $z^0$ so that $z^0 = 0$ and $\Theta_h(z^0) = \sum dz_j \wedge d\bar z_j$. Then

$$\frac{\pi^m}{N^m}\,\Pi_N\Big(\frac{z}{\sqrt N}, \frac{\theta}{N};\ \frac{w}{\sqrt N}, \frac{\varphi}{N}\Big) = e^{i(\theta - \varphi) + i\,\Im(z \cdot \bar w) - \frac12|z - w|^2} + O(N^{-1/2}).$$

Here, $(z, \theta)$ denotes the point $e^{i\theta}\|e_L(z)\|_h\,e_L^*(z) \in X$, and similarly for $(w, \varphi)$. In (24) and Lemma 2.1, the expression $O(N^\alpha)$ means a term with $k$th order derivatives $\le C_k N^\alpha$, for all $k \ge 0$.

Lemma 2.1 says that the Szegö kernel has a universal scaling limit. In fact, its scaling limit is the first Szegö kernel of the reduced Heisenberg group; see [BSZ].

3. Universality All the ideas of the proof of the Main Theorem occur in the simplest case n = 2. So first we prove universality in that case and then extend the proof to general n.

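Thus, our first object is to prove universality of the large N limit of the rescaled pair correlation current (from (23) with n = 2)

$$K_2^N\Big(\frac{z}{\sqrt N}, \frac{w}{\sqrt N}\Big) = E\big[\hat Z_s^N(z) \otimes \hat Z_s^N(w)\big] = \frac{-1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w \int_{\mathbb{C}^{d_N}} \log\Big|c \cdot f^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot f^N\Big(\frac{w}{\sqrt N}\Big)\Big|\,\frac{e^{-|c|^2}}{\pi^{d_N}}\,dc. \tag{25}$$

As in [SZ], we write $f^N = |f^N|u^N$ and expand the integrand in (25):

$$\begin{aligned}
\log\Big|c \cdot f^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot f^N\Big(\frac{w}{\sqrt N}\Big)\Big| ={}& \log\Big|f^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|f^N\Big(\frac{w}{\sqrt N}\Big)\Big| \\
&+ \log\Big|f^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{w}{\sqrt N}\Big)\Big| \\
&+ \log\Big|f^N\Big(\frac{w}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{z}{\sqrt N}\Big)\Big| \\
&+ \log\Big|c \cdot u^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{w}{\sqrt N}\Big)\Big|.
\end{aligned} \tag{26}$$

Let us denote the terms resulting from this expansion by $E_1, E_2, E_3, E_4$, respectively. In particular,

$$E_1 = \frac{-1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w\Big[\log\Big|f^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|f^N\Big(\frac{w}{\sqrt N}\Big)\Big|\Big]. \tag{27}$$

By (15), $\hat S_j^N(z, \theta) = e^{iN\theta}\|e_L(z)\|_h^N f_j^N(z)$, where $(z, \theta)$ are the coordinates in X given in Sect. 2.1. By (17),

$$\Pi_N(z, w) = \|e_L(z)\|_h^N\,\|e_L(w)\|_h^N\,\langle f^N(z), f^N(w)\rangle, \tag{28}$$

where we write $\Pi_N(z, w) = \Pi_N(z, 0; w, 0)$. Since $\Pi_N(z, z)^{1/2} = \|e_L(z)\|_h^N\,|f^N(z)|$, each factor in (27) has the form $\frac12\log\Pi_N(\frac{z}{\sqrt N}, \frac{z}{\sqrt N}) - N\log\|e_L(\frac{z}{\sqrt N})\|_h$. By (24),

$$\partial_z\bar\partial_z\partial_w\bar\partial_w \log\Pi_N\Big(\frac{z}{\sqrt N}, \frac{z}{\sqrt N}\Big) \to 0 \quad \text{as } N \to \infty.$$

On the other hand

$$-iN\,\partial_z\bar\partial_z \log\Big\|e_L\Big(\frac{z}{\sqrt N}\Big)\Big\|_h = \omega\Big(\frac{z}{\sqrt N}\Big).$$

Hence the first term converges to the normalized Euclidean (double) Kähler form:

$$E_1 = \frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2 + O\Big(\frac1N\Big). \tag{29}$$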

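The middle two terms vanish since the integrals in $E_2$ and $E_3$ are independent of w and z respectively (see [SZ, §3.2]). The “interesting term” is therefore

$$E_4 = \frac{-1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w \int_{\mathbb{C}^{d_N}} \log\Big|c \cdot u^N\Big(\frac{z}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{w}{\sqrt N}\Big)\Big|\,\frac{e^{-|c|^2}}{\pi^{d_N}}\,dc. \tag{30}$$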

To evaluate $E_4$, we consider the integral

$$G_2^N(x^1, x^2) := \int_{\mathbb{C}^{d_N}} \log|c \cdot x^1|\,\log|c \cdot x^2|\,\frac{e^{-|c|^2}}{\pi^{d_N}}\,dc \qquad (x^1, x^2 \in \mathbb{C}^{d_N}) \tag{31}$$

with $x^1 = u^N(\frac{z}{\sqrt N})$, $x^2 = u^N(\frac{w}{\sqrt N})$. To simplify it, we construct a Hermitian orthonormal basis $\{e_1, \dots, e_{d_N}\}$ for $\mathbb{C}^{d_N}$ such that $x^1 = e_1$ and

$$x^2 = \xi_1 e_1 + \xi_2 e_2, \qquad \xi_1 = \langle x^2, x^1\rangle, \qquad \xi_2 = \sqrt{1 - |\xi_1|^2}. \tag{32}$$

This is possible because we can always multiply $e_2$ by a phase $e^{i\theta}$ so that $\xi_2$ is positive real. We then make a unitary change of variables to express the integral in the $\{e_j\}$ coordinates. Since the Gaussian is $U(d_N)$-invariant, (31) simplifies to

$$G_2^N(x^1, x^2) = G_2(\xi_1, \xi_2) = \frac{1}{\pi^2}\int_{\mathbb{C}^2} e^{-(|c_1|^2 + |c_2|^2)}\,\log|c_1|\,\log|c_1\xi_1 + c_2\xi_2|\,dc_1\,dc_2 \tag{33}$$

(where we used the fact that the Gaussian integral in each $c_j$, $j \ge 3$, equals 1 by construction). By performing a rotation of the $c_1$ variable, we may replace $\xi_1$ with $|\xi_1|$ and replace $G_2(\xi_1, \xi_2)$ with

$$G(\cos\theta) := G_2(\cos\theta, \sin\theta), \tag{34}$$

where $\cos\theta = |\xi_1| = |\langle x^1, x^2\rangle|$, $0 \le \theta \le \pi/2$. Hence (30) becomes

$$E_4 = \frac{-1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w\,G(\cos\theta_N), \qquad \cos\theta_N = \Big|\Big\langle u^N\Big(\frac{z}{\sqrt N}\Big), u^N\Big(\frac{w}{\sqrt N}\Big)\Big\rangle\Big|. \tag{35}$$

By the universal scaling formula for the Szegö kernel (Lemma 2.1) and (28), we have

$$\cos\theta_N = \frac{\big|\Pi_N\big(\frac{z}{\sqrt N}, \frac{w}{\sqrt N}\big)\big|}{\Pi_N\big(\frac{z}{\sqrt N}, \frac{z}{\sqrt N}\big)^{1/2}\,\Pi_N\big(\frac{w}{\sqrt N}, \frac{w}{\sqrt N}\big)^{1/2}} = e^{-\frac12|z - w|^2} + O(N^{-\frac12}). \tag{36}$$

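Thus we get the universal formula:

$$K_2^\infty(z, w) = \frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2 + \frac{-1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w\,G\big(e^{-\frac12|z - w|^2}\big). \tag{37}$$

This completes the proof for the pair correlation case n = 2. (Notice that the formula has the same form in all dimensions.)

The proof for general n is similar. We again write $f^N = |f^N|u^N$ and expand the integrand in (23):

$$\begin{aligned}
&\log\Big|c \cdot f^N\Big(\frac{z^1}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot f^N\Big(\frac{z^2}{\sqrt N}\Big)\Big| \cdots \log\Big|c \cdot f^N\Big(\frac{z^n}{\sqrt N}\Big)\Big| \\
&\quad= \log\Big|f^N\Big(\frac{z^1}{\sqrt N}\Big)\Big|\,\log\Big|f^N\Big(\frac{z^2}{\sqrt N}\Big)\Big| \cdots \log\Big|f^N\Big(\frac{z^n}{\sqrt N}\Big)\Big| \\
&\qquad+ \log\Big|f^N\Big(\frac{z^1}{\sqrt N}\Big)\Big|\,\log\Big|f^N\Big(\frac{z^2}{\sqrt N}\Big)\Big| \cdots \log\Big|f^N\Big(\frac{z^{n-1}}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{z^n}{\sqrt N}\Big)\Big| + \cdots \\
&\qquad+ \log\Big|c \cdot u^N\Big(\frac{z^1}{\sqrt N}\Big)\Big|\,\log\Big|c \cdot u^N\Big(\frac{z^2}{\sqrt N}\Big)\Big| \cdots \log\Big|c \cdot u^N\Big(\frac{z^n}{\sqrt N}\Big)\Big|.
\end{aligned}$$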

We denote the terms resulting from this expansion by $E_1, \dots, E_{2^n}$, respectively. As before, the first term converges to the normalized Euclidean “n-fold” Kähler form:

$$E_1 = \frac{i}{2\pi}\,\partial\bar\partial|z^1|^2 \wedge \cdots \wedge \frac{i}{2\pi}\,\partial\bar\partial|z^n|^2 + O\Big(\frac1N\Big).$$

The $E_{2^n}$ term is obtained from the function

$$G_n^N(x^1, x^2, \dots, x^n) := \int_{\mathbb{C}^{d_N}} \log|c \cdot x^1|\,\log|c \cdot x^2| \cdots \log|c \cdot x^n|\,\frac{e^{-|c|^2}}{\pi^{d_N}}\,dc, \qquad x^1, x^2, \dots, x^n \in \mathbb{C}^{d_N}. \tag{38}$$

Precisely, we substitute

$$x^j = u^N\Big(\frac{z^j}{\sqrt N}\Big) \tag{39}$$

in (38) and apply the operator $\big(\frac{i}{\pi}\big)^n \partial_{z^1}\bar\partial_{z^1} \cdots \partial_{z^n}\bar\partial_{z^n}$. As above, we define a special Hermitian orthonormal basis $\{e_1, \dots, e_n\}$ for the n-dimensional complex subspace spanned by $\{x^1, \dots, x^n\}$. We put:

$$\begin{aligned}
x^1 &= e_1, \\
x^2 &= \xi_{21}e_1 + \xi_{22}e_2, & \xi_{22} &= \sqrt{1 - |\xi_{21}|^2}, \\
&\ \ \vdots \\
x^n &= \xi_{n1}e_1 + \cdots + \xi_{nn}e_n, & \xi_{nn} &= \sqrt{1 - \textstyle\sum_{j \le n-1}|\xi_{nj}|^2}.
\end{aligned}$$

Such a basis exists because we can always multiply $e_j$ by a phase $e^{i\theta}$ so that the last component $\xi_{jj}$ is positive real. We complete $\{e_j\}$ to a basis of $\mathbb{C}^{d_N}$, and we now let $c_j$ denote coordinates relative to this basis. As above, we rewrite the Gaussian integral in these coordinates. After integrating out the variables $\{c_{n+1}, \dots, c_{d_N}\}$, (38) simplifies to the n-dimensional complex Gaussian integral

$$G_n^N(x^1, \dots, x^n) = G_n(\xi_{21}, \xi_{22}, \dots, \xi_{nn}) = \frac{1}{\pi^n}\int_{\mathbb{C}^n} e^{-|c|^2}\,\log|c_1|\,\log|c_1\xi_{21} + c_2\xi_{22}| \cdots \log|c_1\xi_{n1} + \cdots + c_n\xi_{nn}|\,dc. \tag{40}$$

Note that the variables $\xi_{jk}$ depend on N; we write $\xi_{jk} = \xi_{jk}^N$ when we need to indicate this dependence. To prove universality, we observe that the $\xi_{jk}$ are universal algebraic functions of the inner products $\langle x^a, x^b\rangle$. Indeed,

$$\xi_{j1}\bar\xi_{k1} + \cdots + \xi_{jk}\bar\xi_{kk} = \langle x^j, x^k\rangle, \qquad 1 \le k \le j \le n, \tag{41}$$

where we set $\xi_{11} = 1$. These algebraic functions are obtained by induction (lexicographically) using (41). (The triangular matrix $(\xi_{jk})$ is just the inverse of the matrix describing the Gram-Schmidt process.) By (39), it follows that the $\xi_{jk}^N$ are universal algebraic functions of the variables

$$\Big\langle u^N\Big(\frac{z^j}{\sqrt N}\Big), u^N\Big(\frac{z^k}{\sqrt N}\Big)\Big\rangle = \frac{\Pi_N\big(\frac{z^j}{\sqrt N}, \frac{z^k}{\sqrt N}\big)}{\Pi_N\big(\frac{z^j}{\sqrt N}, \frac{z^j}{\sqrt N}\big)^{1/2}\,\Pi_N\big(\frac{z^k}{\sqrt N}, \frac{z^k}{\sqrt N}\big)^{1/2}} = e^{i\,\Im(z^j \cdot \bar z^k) - \frac12|z^j - z^k|^2} + O\Big(\frac{1}{\sqrt N}\Big).$$
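Relation (41) says that the Gram matrix $(\langle x^j, x^k\rangle)$ factors as $\xi\xi^\dagger$ with $\xi$ lower triangular and positive diagonal, so the inductive solution described above is a Cholesky factorization. The sketch below illustrates this in the limit $N \to \infty$ for points in $\mathbb{C}$ (m = 1), using the limiting inner products just displayed. It is only an illustration under those assumptions; the names `limit_gram` and `cholesky_xi` are hypothetical.

```python
import cmath
import math


def limit_gram(points):
    """Limit Gram matrix with entries exp(i Im(z_j conj z_k) - |z_j - z_k|^2 / 2)."""
    return [[cmath.exp(1j * (zj * zk.conjugate()).imag - 0.5 * abs(zj - zk) ** 2)
             for zk in points] for zj in points]


def cholesky_xi(gram):
    """Solve (41) inductively: return lower-triangular xi with gram = xi xi^dagger."""
    n = len(gram)
    xi = [[0j] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1):
            # Subtract the already known terms xi_{ji} conj(xi_{ki}), i < k.
            s = gram[j][k] - sum(xi[j][i] * xi[k][i].conjugate() for i in range(k))
            if j == k:
                xi[j][j] = complex(math.sqrt(s.real))  # positive real diagonal
            else:
                xi[j][k] = s / xi[k][k]                # xi_{kk} is real and positive
    return xi


if __name__ == "__main__":
    pts = [0j, 1 + 0.5j, -0.3 + 1.2j]   # distinct points, so the Gram matrix is invertible
    xi = cholesky_xi(limit_gram(pts))
    for row in xi:
        print(["%.4f%+.4fi" % (v.real, v.imag) for v in row])
```

For n = 2 the output reproduces the pair case: $|\xi_{21}| = e^{-\frac12|z^1 - z^2|^2} = \cos\theta$ and $\xi_{22} = \sin\theta$.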

We note here that

$$|x^1 \wedge \cdots \wedge x^n|^2 = \det\big(\langle x^j, x^k\rangle\big) \to \det\Big(e^{i\,\Im(z^j \cdot \bar z^k) - \frac12|z^j - z^k|^2}\Big) = e^{-\sum_j |z^j|^2}\,\det\big(e^{z^j \cdot \bar z^k}\big). \tag{42}$$

When the $z^j$ are distinct (i.e., $(z^1, \dots, z^n) \in G_n^m$), the limit determinant in (42) is nonzero (see [BSZ]) and thus $\xi_{jk}^N = \xi_{jk}^\infty + O(\frac{1}{\sqrt N})$, where the $\xi_{jk}^\infty$ are universal real-analytic functions of $z \in G_n^m$. We conclude that the $E_{2^n}$ term converges to a universal current:

$$E_{2^n} = \Big(\frac{i}{\pi}\Big)^n \partial_{z^1}\bar\partial_{z^1} \cdots \partial_{z^n}\bar\partial_{z^n}\,G_n(\xi_{21}^\infty, \dots, \xi_{nn}^\infty) + O\Big(\frac{1}{\sqrt N}\Big).$$

Consider now a general term $E_a$. Suppose without loss of generality that $E_a$ comes from

$$\log\Big|c \cdot u^N\Big(\frac{z^1}{\sqrt N}\Big)\Big| \cdots \log\Big|c \cdot u^N\Big(\frac{z^k}{\sqrt N}\Big)\Big|\,\log\Big|f^N\Big(\frac{z^{k+1}}{\sqrt N}\Big)\Big| \cdots \log\Big|f^N\Big(\frac{z^n}{\sqrt N}\Big)\Big|.$$

As above we obtain

$$E_a = \Big(\frac{i}{\pi}\Big)^k \partial_{z^1}\bar\partial_{z^1} \cdots \partial_{z^k}\bar\partial_{z^k}\,G_k(\xi_{21}^\infty, \dots, \xi_{kk}^\infty) \wedge \frac{i}{2\pi}\,\partial\bar\partial|z^{k+1}|^2 \wedge \cdots \wedge \frac{i}{2\pi}\,\partial\bar\partial|z^n|^2 + O\Big(\frac{1}{\sqrt N}\Big).$$

Hence this term also approaches a universal current. (As in the pair correlation case, terms with only one $u^N$ vanish.) □

4. Explicit Formulae

We now calculate explicitly the limit pair correlation measures $\widetilde K_2^\infty(z, w)$.

4.1. Preliminaries. The first step is to compute $\Delta G(e^{-\frac12 r^2})$, where $\Delta$ is the Euclidean Laplacian on $\mathbb{C}^m$ and $r = |\zeta|$ ($\zeta \in \mathbb{C}^m$). To begin this computation, we write $c_j = r_j e^{i\varphi_j}$ and then rewrite (33)–(34) as

$$G(\cos\theta) = \frac{2}{\pi}\int_0^\infty\!\!\int_0^\infty\!\!\int_0^{2\pi} r_1 r_2\,e^{-(r_1^2 + r_2^2)}\,\log r_1\,\log|r_1\cos\theta + r_2 e^{i\varphi}\sin\theta|\,d\varphi\,dr_1\,dr_2. \tag{43}$$

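We now evaluate the inner integral by Jensen’s formula, which gives

$$\int_0^{2\pi} \log|r_1\cos\theta + r_2\sin\theta\,e^{i\varphi}|\,d\varphi = \begin{cases} 2\pi\log(r_1\cos\theta) & \text{for } r_2\sin\theta \le r_1\cos\theta, \\ 2\pi\log(r_2\sin\theta) & \text{for } r_2\sin\theta \ge r_1\cos\theta. \end{cases} \tag{44}$$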
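As a sanity check on (44), the circle integral can be approximated numerically and compared with $2\pi\log\max(a, b)$, where $a = r_1\cos\theta$ and $b = r_2\sin\theta$. The sketch below uses a plain midpoint rule; the function names and sample values are illustrative and not taken from the paper, and the case a = b (where the integrand has a logarithmic singularity) is deliberately avoided.

```python
import cmath
import math


def circle_log_integral(a, b, n=4096):
    """Midpoint-rule approximation of the integral over [0, 2*pi] of log|a + b e^{i phi}|."""
    h = 2 * math.pi / n
    return h * sum(math.log(abs(a + b * cmath.exp(1j * (k + 0.5) * h)))
                   for k in range(n))


def jensen_value(a, b):
    """Right-hand side of (44) for positive a, b: 2*pi*log(max(a, b))."""
    return 2 * math.pi * math.log(max(a, b))


if __name__ == "__main__":
    for a, b in [(1.0, 0.5), (0.4, 1.3), (2.0, 0.1)]:
        print(a, b, circle_log_integral(a, b), jensen_value(a, b))
```

Because the integrand is smooth and periodic when a and b differ, the midpoint rule converges very quickly here, so a few thousand nodes are more than enough.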

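Hence

$$G(\cos\theta) = 4\int_0^\infty\!\!\int_0^\infty r_1 r_2\,e^{-(r_1^2 + r_2^2)}\,\log r_1\,\log\max(r_1\cos\theta, r_2\sin\theta)\,dr_1\,dr_2. \tag{45}$$

Now change variables again with $r_1 = \rho\cos\varphi$, $r_2 = \rho\sin\varphi$ to get

$$G(\cos\theta) = 4\int_0^\infty\!\!\int_0^{\pi/2} \rho^3 e^{-\rho^2}\,\log(\rho\cos\varphi)\,\log\max(\rho\cos\varphi\cos\theta, \rho\sin\varphi\sin\theta)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho. \tag{46}$$

Since

$$\log\max(\rho\cos\varphi\cos\theta, \rho\sin\varphi\sin\theta) = \log(\rho\cos\varphi\cos\theta) + \log^+(\tan\varphi\tan\theta),$$

we can write $G = G_1 + G_2$, where

$$G_1(\cos\theta) = 4\int_0^\infty\!\!\int_0^{\pi/2} \rho^3 e^{-\rho^2}\,\log(\rho\cos\varphi)\,\log(\rho\cos\varphi\cos\theta)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho, \tag{47}$$

$$G_2(\cos\theta) = 4\int_0^\infty\!\!\int_{\pi/2-\theta}^{\pi/2} \rho^3 e^{-\rho^2}\,\log(\rho\cos\varphi)\,\log(\tan\varphi\tan\theta)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho. \tag{48}$$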

From (47), $G_1(\cos\theta) = C_1 + C_2\log\cos\theta$ and thus

$$G_1(e^{-\frac12 r^2}) = C_1 - \frac12 C_2 r^2,$$

so that

$$\Delta G_1(e^{-\frac12 r^2}) = \Big(\frac{d^2}{dr^2} + \frac{2m-1}{r}\frac{d}{dr}\Big)\Big(C_1 - \frac12 C_2 r^2\Big) = -2mC_2. \tag{49}$$

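We now evaluate $\Delta G_2(e^{-\frac12 r^2})$. Since the integrand in (48) vanishes when $\varphi = \pi/2 - \theta$, we have

$$\frac{d}{dr}G_2(\cos\theta) = 4\Big(\frac{d}{dr}\log\tan\theta\Big)\int_0^\infty\!\!\int_{\pi/2-\theta}^{\pi/2} \rho^3 e^{-\rho^2}\,\log(\rho\cos\varphi)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho.$$

Substituting $\tan^2\theta = e^{r^2} - 1$, we have

$$\frac{d}{dr}\log\tan\theta = \frac{r}{1 - e^{-r^2}}.$$

Thus

$$\frac{d}{dr}G_2(e^{-\frac12 r^2}) = \frac{4r}{1 - e^{-r^2}}\,(I_1 + I_2),$$

where

$$I_1 = \int_0^\infty\!\!\int_{\pi/2-\theta}^{\pi/2} \rho^3 e^{-\rho^2}(\log\rho)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho = C\sin^2\theta = C(1 - e^{-r^2}),$$

$$I_2 = \int_0^\infty\!\!\int_{\pi/2-\theta}^{\pi/2} \rho^3 e^{-\rho^2}(\log\cos\varphi)\,\cos\varphi\sin\varphi\,d\varphi\,d\rho.$$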

We compute

$$I_2 = \frac12\int_{\pi/2-\theta}^{\pi/2} (\log\cos\varphi)\,\cos\varphi\sin\varphi\,d\varphi = \frac12\int_0^{\sin\theta} t\log t\,dt = \frac18\big(\sin^2\theta\,\log\sin^2\theta - \sin^2\theta\big) = \frac18(1 - e^{-r^2})\big[\log(1 - e^{-r^2}) - 1\big].$$

Thus

$$\frac{d}{dr}G_2(e^{-\frac12 r^2}) = \frac{r}{2}\log(1 - e^{-r^2}) + C'r. \tag{50}$$

Hence by (49) and (50),

$$\Delta G(e^{-\frac12 r^2}) = -2mC_2 + \Big(\frac{d}{dr} + \frac{2m-1}{r}\Big)\Big[\frac{r}{2}\log(1 - e^{-r^2}) + C'r\Big] = m\log(1 - e^{-r^2}) + \frac{r^2}{e^{r^2} - 1} + C''. \tag{51}$$

4.2. Pair correlation in dimension 1. In dimension one, the pair correlation form is the same as the pair correlation measure. We first give our universal formula in the one-dimensional case. Our formula agrees with that of Hannay [Ha] for SU(2) polynomials.

Theorem 4.1. Suppose dim M = 1. Then

$$K_2^N\Big(\frac{z}{\sqrt N}, \frac{w}{\sqrt N}\Big) \to K_2^\infty(z, w) = \Big[\pi\delta_0(z - w) + H\big(\tfrac12|z - w|^2\big)\Big]\,\frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2,$$

where

$$H(t) = \frac{(\sinh^2 t + t^2)\cosh t - 2t\sinh t}{\sinh^3 t} = t - \frac{2}{9}t^3 + \frac{2}{45}t^5 + O(t^7).$$
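The closed form and the Taylor expansion of H in Theorem 4.1 can be compared numerically. The following is a small sketch; the names `hannay_H` and `hannay_series` are illustrative, and the comparison is only meaningful for small t, where the omitted $O(t^7)$ term is negligible.

```python
import math


def hannay_H(t):
    """H(t) = ((sinh^2 t + t^2) cosh t - 2 t sinh t) / sinh^3 t, for t > 0."""
    s = math.sinh(t)
    return ((s * s + t * t) * math.cosh(t) - 2 * t * s) / s ** 3


def hannay_series(t):
    """Truncated expansion t - (2/9) t^3 + (2/45) t^5 from Theorem 4.1."""
    return t - 2 * t ** 3 / 9 + 2 * t ** 5 / 45


if __name__ == "__main__":
    for t in (0.05, 0.1, 0.2):
        print(t, hannay_H(t), hannay_series(t))
    # H vanishes linearly at t = 0 (zeros repel) and tends to 1 for large t.
    print(hannay_H(20.0))
```

The linear vanishing at t = 0 and the limit 1 at large separation reflect short-range repulsion between zeros and the decay of correlations at long range.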

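Proof. Making the change of variables $\zeta = z - w$, we have by (37),

$$\begin{aligned}
E\big(\hat Z_s^N(z) \otimes \hat Z_s^N(w)\big) &\to \frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2 - \frac{1}{\pi^2}\,\partial_z\bar\partial_z\partial_w\bar\partial_w\,G(e^{-\frac12|z - w|^2}) \\
&= \Big[1 + 4\,\frac{\partial^2}{\partial z\,\partial\bar z}\frac{\partial^2}{\partial w\,\partial\bar w}\,G(e^{-\frac12|z - w|^2})\Big]\frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2 \\
&= \Big[1 + 4\Big(\frac{\partial^2}{\partial\zeta\,\partial\bar\zeta}\Big)^2 G(e^{-\frac12|\zeta|^2})\Big]\frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2 \\
&= \Big[1 + \frac14\,\Delta^2 G(e^{-\frac12 r^2})\Big]\frac{i}{2\pi}\,\partial\bar\partial|z|^2 \wedge \frac{i}{2\pi}\,\partial\bar\partial|w|^2.
\end{aligned}$$

By (51) with m = 1, we have

$$\Delta^2 G(e^{-\frac12 r^2}) = \Big(\frac{d^2}{dr^2} + \frac1r\frac{d}{dr}\Big)\Big[\log(1 - e^{-r^2}) + \frac{r^2}{e^{r^2} - 1}\Big] = 4\pi\delta_0 + \frac{8(e^{r^2} - 1)^2 - 16r^2 e^{r^2}(e^{r^2} - 1) + 4r^4 e^{r^2}(e^{r^2} + 1)}{(e^{r^2} - 1)^3}.$$

Finally,

$$1 + \frac14\,\Delta^2 G(e^{-\frac12 r^2}) = \pi\delta_0 + \frac{(e^{r^2} + 1)(e^{r^2} - 1)^2 - 4r^2 e^{r^2}(e^{r^2} - 1) + r^4 e^{r^2}(e^{r^2} + 1)}{(e^{r^2} - 1)^3} = \pi\delta_0 + \frac{(\sinh^2\frac12 r^2 + \frac14 r^4)\cosh\frac12 r^2 - r^2\sinh\frac12 r^2}{\sinh^3\frac12 r^2}.$$

□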

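4.3. Pair correlation in higher dimensions. The limit pair correlation measure is given by

$$\widetilde K_2^\infty(z, w) = \lim_{N \to \infty} N^{2(m-1)}\,\widetilde K_2^N\Big(\frac{z}{\sqrt N}, \frac{w}{\sqrt N}\Big) = K_2^\infty(z, w) \wedge \frac{1}{(m-1)!}\Big(\frac{i}{2}\,\partial\bar\partial|z|^2\Big)^{m-1} \wedge \frac{1}{(m-1)!}\Big(\frac{i}{2}\,\partial\bar\partial|w|^2\Big)^{m-1}.$$

(The scaling $N^{2(m-1)}$ comes from the fact that $N\omega(\frac{z}{\sqrt N}) = N(\tau_{\sqrt N})_*\omega \to \frac{i}{2}\,\partial\bar\partial|z|^2$.)

We now compute $\widetilde K_2^\infty$ for the case of a manifold of general dimension m > 1. It is convenient to express this measure in terms of the expected density of zeros

$$\widetilde K_1^\infty(z) = \lim_{N \to \infty} N^{m-1}\,\widetilde K_1^N\Big(\frac{z}{\sqrt N}\Big) = \frac{m}{\pi}\,dV_{\mathbb{C}^m} = \frac{1}{\pi(m-1)!}\Big(\frac{i}{2}\,\partial\bar\partial|z|^2\Big)^m. \tag{52}$$

We have the following explicit universal formula for the limit pair correlation measure. In particular, it gives the scaling limit pair correlation for the zeros of SU(m + 1)-polynomials.

Theorem 4.2. Suppose dim M = m > 1. Then

$$\widetilde K_2^\infty(z, w) = \Big[\gamma_m\big(\tfrac12|z - w|^2\big)\Big]\,\widetilde K_1^\infty(z) \wedge \widetilde K_1^\infty(w),$$

where

$$\begin{aligned}
\gamma_m(t) &= \frac{\big[\frac12(m^2 + m)\sinh^2 t + t^2\big]\cosh t - (m + 1)\,t\sinh t}{m^2\sinh^3 t} + \frac{m-1}{2m} \\
&= \frac{m-1}{2m}\,t^{-1} + \frac{m-1}{2m} + \frac{(m+2)(m+1)}{6m^2}\,t - \frac{(m+4)(m+3)}{90m^2}\,t^3 + \frac{(m+6)(m+5)}{945m^2}\,t^5 + O(t^7).
\end{aligned}$$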
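The closed form and the expansion of $\gamma_m$ in Theorem 4.2 can be checked against each other in the same way as H, and one can confirm that m = 1 recovers Hannay's function. This is again only a numerical sketch with illustrative names (`gamma_m`, `gamma_m_series`, `hannay_H`), valid for small t > 0.

```python
import math


def gamma_m(m, t):
    """Closed form of gamma_m(t) from Theorem 4.2, for t > 0."""
    s, c = math.sinh(t), math.cosh(t)
    numerator = (0.5 * (m * m + m) * s * s + t * t) * c - (m + 1) * t * s
    return numerator / (m * m * s ** 3) + (m - 1) / (2 * m)


def gamma_m_series(m, t):
    """Truncated Laurent expansion of gamma_m(t) through order t^5."""
    return ((m - 1) / (2 * m) / t
            + (m - 1) / (2 * m)
            + (m + 2) * (m + 1) * t / (6 * m * m)
            - (m + 4) * (m + 3) * t ** 3 / (90 * m * m)
            + (m + 6) * (m + 5) * t ** 5 / (945 * m * m))


def hannay_H(t):
    """Hannay's function H(t) from Theorem 4.1, used for the m = 1 comparison."""
    s = math.sinh(t)
    return ((s * s + t * t) * math.cosh(t) - 2 * t * s) / s ** 3


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, gamma_m(m, 0.1), gamma_m_series(m, 0.1))
    print(gamma_m(1, 0.8), hannay_H(0.8))
```

For m > 1 the $t^{-1}$ term dominates at small t, so $\gamma_m$ blows up near the diagonal, whereas for m = 1 it vanishes there.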

Proof. By (37) and (51), again writing $\zeta = z - w$ (except this time $\zeta \in \mathbb{C}^m$),

$$\begin{aligned}
\widetilde K_2^\infty(z, w) &= \Big[1 + \frac{4}{m^2}\sum_{j,k=1}^m \frac{\partial^2}{\partial z_j\,\partial\bar z_j}\frac{\partial^2}{\partial w_k\,\partial\bar w_k}\,G(e^{-\frac12|z - w|^2})\Big]\widetilde K_1^\infty(z) \wedge \widetilde K_1^\infty(w) \\
&= \Big[1 + \frac{1}{4m^2}\,\Delta_\zeta^2\,G(e^{-\frac12|\zeta|^2})\Big]\widetilde K_1^\infty(z) \wedge \widetilde K_1^\infty(w) \\
&= \Big[1 + \frac{1}{4m^2}\Big(\frac{d^2}{dr^2} + \frac{2m-1}{r}\frac{d}{dr}\Big)\Big(m\log(1 - e^{-r^2}) + \frac{r^2}{e^{r^2} - 1}\Big)\Big]\widetilde K_1^\infty(z) \wedge \widetilde K_1^\infty(w).
\end{aligned} \tag{53}$$

Computing the Laplacian in (53) leads to the stated formula. □

Note that if we substitute m = 1 in the expression for $\gamma_m(t)$, we obtain Hannay’s function H(t). However for the case m > 1, the limit measure is absolutely continuous on $\mathbb{C}^m \times \mathbb{C}^m$, whereas in the one-dimensional case, there is a self-correlation delta measure.

Acknowledgement. The first draft of this paper was completed while the third author was visiting the Erwin Schrödinger Institute in July 1998. He wishes to thank that institution for its hospitality and financial support.

References

[BD] Bleher, P. and Di, X.: Correlations between zeros of a random polynomial. J. Stat. Phys. 88, 269–305 (1997)
[BSZ] Bleher, P., Shiffman, B. and Zelditch, S.: Universality and scaling of correlations between zeros on complex manifolds. Invent. Math., to appear
[BBL] Bogomolny, E., Bohigas, O. and Leboeuf, P.: Quantum chaotic dynamics and random polynomials. J. Stat. Phys. 85, 639–679 (1996)
[D] Deift, P.: Orthogonal polynomials and random matrices: A Riemann–Hilbert approach. Courant Lecture Notes in Mathematics, 3, New York: New York University, Courant Institute of Mathematical Sciences, 1999
[GH] Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley-Interscience, 1978
[Ha] Hannay, J.H.: Chaotic analytic zero points: Exact statistics for those of a random spin state. J. Phys. A: Math. Gen. 29, 101–105 (1996)
[SZ] Shiffman, B. and Zelditch, S.: Distribution of zeros of random and quantum chaotic sections of positive line bundles. Commun. Math. Phys. 200, 661–683 (1999)
[Ti] Tian, G.: On a set of polarized Kähler metrics on algebraic manifolds. J. Diff. Geometry 32, 99–130 (1990)
[Ze] Zelditch, S.: Szegö kernels and a theorem of Tian. Int. Math. Res. Notices 6, 317–331 (1998)

Communicated by P. Sarnak

Commun. Math. Phys. 208, 787 – 798 (2000)

Communications in

Mathematical Physics

© Springer-Verlag 2000

Monopoles and Solitons in Fuzzy Physics

S. Baez¹, A. P. Balachandran¹, S. Vaidya², B. Ydri¹

¹ Physics Department, Syracuse University, Syracuse, NY 13244-1130, USA
² Tata Institute of Fundamental Research, Colaba, Mumbai, 400 005, India

Received: 13 April 1999 / Accepted: 7 August 1999

Abstract: Monopoles and solitons have important topological aspects like quantized fluxes, winding numbers and curved target spaces. Naive discretizations which substitute a lattice of points for the underlying manifolds are incapable of retaining these features in a precise way. We study these problems of discrete physics and matrix models and discuss mathematically coherent discretizations of monopoles and solitons using fuzzy physics and noncommutative geometry. A fuzzy σ-model action for the two-sphere fulfilling a fuzzy Belavin–Polyakov bound is also put forth.

A fuzzy space ([9–14]) is obtained by quantizing a manifold, treating it as a phase space. An example is the fuzzy two-sphere $S_F^2$. It is described by operators $x_i$ subject to the relations $\sum_i x_i^2 = 1$ and $[x_i, x_j] = (i/\sqrt{l(l+1)})\,\epsilon_{ijk}x_k$. Thus $L_i = \sqrt{l(l+1)}\,x_i$ are (2l + 1)-dimensional angular momentum operators, while the canonical classical two-sphere $S^2$ is recovered for $l \to \infty$. Planck’s work shows that quantization creates a short distance cut-off; therefore quantum field theories (QFT’s) on fuzzy spaces are ultraviolet finite. If the classical manifold is compact, it gets described by a finite-dimensional matrix model, the total number of states being finite too. Noncommutative geometry ([1,9–14]) has an orderly prescription for formulating QFT’s on fuzzy spaces, so that these spaces indeed show us an original approach to discrete physics.

In this note, we focus attention on $S_F^2$ and discuss certain of its remarkable aspects, entirely absent in naive discretizations. Quantum physics on $S_F^2$ is a mere matrix model; all the same, it can coherently describe twisted topologies like those of monopoles and solitons. Traditional attempts in this direction based on naive discrete physics have at best been awkward, having ignored the necessary mathematical structures (projective modules and cyclic cohomology).

Not all our results are new; our construction of monopoles is a reformulation of the earlier important work of Grosse et al. [4].

1. Classical Monopoles and σ-Models

There is an algebraic formulation of monopoles and solitons suitable for adaptation to fuzzy spaces. We first outline it using the case of $S^2$ ([13]).

Let A be the commutative algebra of smooth functions on $S^2$. Vector bundles on $S^2$ can be described by projectors P. P is a matrix with coefficients in A ($P_{ij} \in A$), and fulfills $P^2 = P$ and $P^\dagger = P$. If the points of $S^2$ are described by unit vectors $n \in \mathbb{R}^3$, the projector for unit monopole charge is $P^{(1)} = (1 + \tau\cdot\hat n)/2$, where $\tau_i$ are Pauli matrices and $\hat n_i$ are coordinate functions, $\hat n_i(n) = n_i$. If $A^{2^N} = A \otimes \mathbb{C}^{2^N}$ consists of $2^N$-component vectors $\xi = (\xi_1, \xi_2, \dots, \xi_{2^N})$, $\xi_i \in A$, then the sections of vector bundles for monopole charge 1 are $P^{(1)}A^2$. For monopole charge ±N (N > 0), the corresponding projectors are $P^{(\pm N)} = \prod_{i=1}^{N}(1 \pm \tau^{(i)}\cdot\hat n)/2$, where $\tau^{(i)}$ are commuting sets of Pauli matrices. They give sections of vector bundles $P^{(\pm N)}A^{2^N}$ with $\tau^{(i)}\cdot\hat n\,P^{(\pm N)}\xi = \pm P^{(\pm N)}\xi$, $\tau^{(i)}$ acting on the $i$th $\mathbb{C}^2$ factor. For the trivial bundle, we can use $P^{0} = (1 + \tau_3)/2$ (or $(1 - \tau_3)/2$), or more simply just the identity.

The self-same projectors also describe nonlinear σ-models. To see this, consider the projector $P^{(0)}$ and its orbit $\langle hP^{(0)}h^{-1} : h \in SU(2)\rangle = S^2$. If now we substitute for h a field g on $S^2$ with values in SU(2), each $gP^{(0)}g^{-1}$ describes a map $S^2 \to S^2$. It is a σ-model field on $S^2$ with target space $S^2$ and zero winding number. (Winding number is zero as g can be deformed to a constant map.) For winding number 1, it is appropriate to consider the orbit of $P^{(1)}$ under g. For fixed n, as g(n) is varied, $g(n)\,\frac{1 + \sigma\cdot\hat n(n)}{2}\,g(n)^{-1}$ is still $S^2$, so that as n is varied, we get a map $S^2 \to S^2$. (More correctly we get the section of an $S^2$ bundle over $S^2$.) For winding number ±N, we can consider the orbit of $P^{(\pm N)}$ under conjugation by $g^{\tilde\otimes N}$’s, where $g^{\tilde\otimes N}(n) = g(n) \otimes g(n) \otimes \cdots \otimes g(n)$ (N factors). Here the $i$th g(n) acts only on $\tau^{(i)}$.

2. Winding Numbers for the Classical Sphere

What about formulas for invariants like Chern character and winding number? The ideal way is to follow Connes ([9,10,12,14,11,13]) and introduce the Dirac and chirality operators

$$D = \epsilon_{ijk}\,\sigma_i\,\hat n_j\,J_k, \qquad \Gamma = \sigma\cdot\hat n, \tag{1}$$

where $\sigma_i$ are Pauli matrices, $J = -i(r \times \nabla) + \frac{\sigma}{2}$ is the total angular momentum and $\hat n = r/|r|$. The important points to keep in mind here are the following: i) $\Gamma$ commutes with elements of A and anti-commutes with D. ii) $\Gamma^2 = 1$ and $\Gamma^\dagger = \Gamma$.

The Chern numbers (or the quantized fluxes) for monopoles then are

$$\pm N = \frac{1}{4\pi}\int d(\cos\theta)\,d\phi\ \mathrm{Tr}\,\Gamma P^{(\pm N)}\,[D, P^{(\pm N)}]\,[D, P^{(\pm N)}](n). \tag{2}$$

They do not change if $P^{(\pm N)}$ are conjugated by $g^{\tilde\otimes N}$ and are also the soliton winding numbers.

3. Fuzzy Monopoles

The algebra A generated by $x_i$ is the full matrix algebra of (2l + 1)×(2l + 1) matrices. Fuzzy monopoles are described by projectors $p^{(\pm N)}$ ($p^{(\pm N)}_{ij} \in A$), which as $l \to \infty$ approach $P^{(\pm N)}$. We can find them as follows: For N = 1, we can try $(1 + \tau\cdot x)/2$, but that is not an idempotent as the x’s do not commute. We can fix that though: since $(\tau\cdot L)^2 = l(l+1) - \tau\cdot L$, $\gamma_\tau = \frac{1}{l + 1/2}(\tau\cdot L + 1/2)$ squares to 1, as first remarked by Watamuras ([6,7]). Hence $p^{(1)} = (1 + \gamma_\tau)/2$.

There is a simple interpretation of $p^{(1)}$. We can combine L and τ/2 into the SU(2) generator $K^{(1)} = L + \tau/2$ with spectrum k(k + 1) (with k = l ± 1/2) for $(K^{(1)})^2 \equiv K^{(1)}\cdot K^{(1)}$. The projector to the space with the maximum k, namely

$$\frac{(K^{(1)})^2 - (l - 1/2)(l + 1/2)}{(l + 1/2)(l + 3/2) - (l - 1/2)(l + 1/2)},$$

is just $p^{(1)}$. This last remark shows the way to fuzzify $P^{(N)}$. We substitute

$$K^{(N)} = L + \sum_{i=1}^{N} \tau^{(i)}/2$$

for $K^{(1)}$ and consider the subspace where $(K^{(N)})^2 \equiv K^{(N)}\cdot K^{(N)}$ has the maximum eigenvalue $k_{\max}(k_{\max} + 1)$, $k_{\max} = l + N/2$. On this space $(L + \tau^{(i)}/2)^2$ has the maximum value (l + 1/2)(l + 3/2) and $\tau^{(i)}\cdot L$ is hence l. Since $\tau^{(i)}\cdot x$ approaches $\tau^{(i)}\cdot\hat n$ on this subspace as $l \to \infty$, $p^{(N)}$ is just its projector:

$$p^{(N)} \equiv p^{(+N)} = \frac{\prod_{k \ne k_{\max}}\big[(K^{(N)})^2 - k(k+1)\big]}{\prod_{k \ne k_{\max}}\big[k_{\max}(k_{\max} + 1) - k(k+1)\big]}. \tag{3}$$

$p^{(-N)}$ comes similarly from the least value $k_{\min} = l - N/2$ of k. [We assume that 2l ≥ N.]

We remark that the limits as $l \to \infty$ of $p^{(\pm N)}$ are exactly $P^{(\pm N)}$, and are not, say, $P^{(\pm N)}$ times another projector. That is because if $\tau^{(i)}\cdot L$ are all l, then $\tau^{(i)}\cdot\tau^{(j)} = 1$ for all i ≠ j and hence $k_{\max} = l + \frac{N}{2}$. A proof goes as follows. Vectors with $L^2 = l(l+1)\mathbf{1}$ can be represented as symmetric tensor products of 2l spinors, with components $T_{a_1 \dots a_{2l}}$. The vectors with $\tau^{(i)}\cdot L = l\mathbf{1}$ as well have components $T_{a_1 \dots a_{2l}, b_1 \dots b_N}$ with symmetry under exchange of any $a_i$ with $a_j$ or $b_k$. So they are symmetric under all exchanges of $b_i$ and $b_j$ and have $(\frac{\tau^{(i)}}{2} + \frac{\tau^{(j)}}{2})^2 = 2$, $\tau^{(i)}\cdot\tau^{(j)} = 1$.

Having obtained $p^{(\pm N)}$, we can also write down the analogues of $P^{(\pm N)}A^{2^N}$: they are the “projective modules” $p^{(\pm N)}A^{2^N}$, $A^{2^N} = \langle(a_1, a_2, \dots, a_{2^N}) : a_i \in A\rangle$, and are the noncommutative substitutes for sections of vector bundles.

If $(a_1, a_2, \dots, a_{2^N})$ is regarded as a column, then the column dimension of $p^{(\pm N)}A^{2^N}$ is $L + 1 \equiv 2(l \pm \frac{N}{2}) + 1$ and its row dimension is $M + 1 \equiv 2l + 1$. Their difference is ±N. This means that $p^{(\pm N)}A^{2^N}$ can be identified with $\hat H_{L,M}$ of ref. [4], where of course L − M = ±N. In particular angular momentum acts on $p^{(\pm N)}A^{2^N}$ via $K^{(N)}$ on the left (they commute with $p^{(\pm N)}$) and −L on the right, while there are similar actions of angular momentum on $\hat H_{L,M}$ [see ref. [4]].
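The N = 1 statements of this section can be verified directly with explicit spin-l matrices: $\gamma_\tau$ squares to 1, and $(1 + \gamma_\tau)/2$ coincides with the projector onto k = l + 1/2 built from $(K^{(1)})^2$. The sketch below uses only the Python standard library; the helper names (`spin_matrices`, `fuzzy_monopole_projector`, and so on) and the ordering $\mathbb{C}^2 \otimes \mathbb{C}^{2l+1}$ of the tensor factors are illustrative choices, not taken from the paper.

```python
import math

PAULI = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]


def identity(n):
    return [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    n = len(b)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(len(b[0]))]
            for r in range(len(a))]


def lincomb(alpha, a, beta, b):
    """Return alpha * a + beta * b entrywise."""
    return [[alpha * x + beta * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def spin_matrices(two_l):
    """Angular momentum matrices (Lx, Ly, Lz) for spin l = two_l / 2."""
    l = two_l / 2
    d = two_l + 1
    ms = [l - k for k in range(d)]                      # m = l, l-1, ..., -l
    lz = [[complex(ms[r]) if r == c else 0j for c in range(d)] for r in range(d)]
    lp = [[0j] * d for _ in range(d)]                   # raising operator L+
    for c in range(1, d):
        m = ms[c]
        lp[c - 1][c] = complex(math.sqrt(l * (l + 1) - m * (m + 1)))
    lm = [[lp[c][r].conjugate() for c in range(d)] for r in range(d)]
    lx = lincomb(0.5, lp, 0.5, lm)
    ly = lincomb(1 / complex(0, 2), lp, -1 / complex(0, 2), lm)
    return [lx, ly, lz]


def fuzzy_monopole_projector(two_l):
    """Return (gamma_tau, p1, p1_from_K) acting on C^2 (x) C^(2l+1)."""
    l = two_l / 2
    d = two_l + 1
    one = identity(2 * d)
    tau_dot_l = [[0j] * (2 * d) for _ in range(2 * d)]
    k_squared = [[0j] * (2 * d) for _ in range(2 * d)]
    for tau, l_i in zip(PAULI, spin_matrices(two_l)):
        tau_dot_l = lincomb(1, tau_dot_l, 1, kron(tau, l_i))
        k_i = lincomb(1, kron(identity(2), l_i), 0.5, kron(tau, identity(d)))
        k_squared = lincomb(1, k_squared, 1, matmul(k_i, k_i))
    gamma = lincomb(1 / (l + 0.5), tau_dot_l, 0.5 / (l + 0.5), one)
    p1 = lincomb(0.5, one, 0.5, gamma)
    lo, hi = (l - 0.5) * (l + 0.5), (l + 0.5) * (l + 1.5)
    p1_from_k = lincomb(1 / (hi - lo), k_squared, -lo / (hi - lo), one)
    return gamma, p1, p1_from_k


if __name__ == "__main__":
    for two_l in (1, 2, 3):
        gamma, p1, p1_from_k = fuzzy_monopole_projector(two_l)
        n = 2 * (two_l + 1)
        print(two_l,
              max_abs_diff(matmul(gamma, gamma), identity(n)),
              max_abs_diff(p1, p1_from_k),
              sum(p1[r][r] for r in range(n)).real)
```

The last printed number is the trace of $p^{(1)}$, which equals 2l + 2, the dimension of the k = l + 1/2 representation.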


S. Baez, A. P. Balachandran, S. Vaidya, B. Ydri

4. Fuzzy σ-Models

In the fuzzy versions of σ-models on $S^2$ with target $S^2$, $g$ becomes a 2×2 unitary matrix $u$ with $u_{ij} \in A$. Therefore $u \in U(2(2l+1))$. (We can impose $\det u = 1$; that makes no difference.) An appropriate generalization $u^{\tilde{\otimes} N}$ of $g^{\tilde{\otimes} N}$ can be constructed as follows. If $C$ and $D$ are 2×2 matrices with entries $C_{ij}, D_{ij} \in A$, we can define $Ca$ and $aD$ for $a \in A$ by $(Ca)_{ij} = C_{ij} a$ and $(aD)_{ij} = a D_{ij}$. Let $C \otimes_A D$ denote the tensor product of $C$ and $D$ over $A$, where by definition $Ca \otimes_A D = C \otimes_A aD$. This definition can be extended to more factors. For example, $C \otimes_A D \otimes_A E$ has the properties $Ca \otimes_A D \otimes_A E = C \otimes_A aD \otimes_A E$, $C \otimes_A Da \otimes_A E = C \otimes_A D \otimes_A aE$. Then:

$$u^{\tilde{\otimes} N} = u \otimes_A u \otimes_A \cdots \otimes_A u \quad (N \text{ factors}). \tag{4}$$

We can understand this construction in familiar terms by writing $u = 1_{2 \times 2}\, a_0 + \tau_j a_j = \tau_\mu a_\mu$ ($a_\mu \in A$), where $\tau_0 = 1_{2 \times 2}$. [Greek subscripts run from 0 to 3, Roman ones from 1 to 3.] Unitarity requires that

$$\tau_\mu \tau_\nu\, a_\mu^* a_\nu = 1, \qquad a_\mu^* \equiv a_\mu^\dagger. \tag{5}$$

In this notation, $u \otimes_A u = \tau_\mu \otimes \tau_\nu\, a_\mu a_\nu$, with $\otimes$ ($\equiv \otimes_{\mathbb{C}}$) denoting the Kronecker product. It is also $\tau_\mu^{(1)} a_\mu \tau_\nu^{(2)} a_\nu$ in an evident notation. Proceeding in this way, we find

$$u \otimes_A u \otimes_A \cdots \otimes_A u = \tau^{(1)}_{\mu_1} a_{\mu_1}\, \tau^{(2)}_{\mu_2} a_{\mu_2} \cdots \tau^{(N)}_{\mu_N} a_{\mu_N}. \tag{6}$$

It is unitary in view of (5).

The significant point here is that $u^{\tilde{\otimes} N}$ is a matrix with coefficients in $A$ and not $A \otimes A \otimes \cdots \otimes A$ as is the case for $u \otimes u \otimes \cdots \otimes u$. We remark that $g^{\tilde{\otimes} N}$ can also be written as $g \otimes_{\mathcal{A}} g \otimes_{\mathcal{A}} \cdots \otimes_{\mathcal{A}} g$ ($N$ factors). It is then a function only of $n$ and has the meaning stated earlier.

The orbits of $p^{(\pm N)}$ under conjugation by $u^{\tilde{\otimes} N}$ are fuzzy matrix versions of σ-model fields with winding numbers $\pm N$. [Here we take $p^{(0)}$ to be $(1 + \tau_3)/2$, say, and its $u^{\tilde{\otimes} N}$ to be $u$ itself. Henceforth our attention will be focused on $N \neq 0$.]

5. "Winding Numbers" for the Fuzzy Sphere

The fuzzy Dirac operator $D$ and chirality operator $\gamma$ are important for writing formulae for the invariants of projectors. There are proposals for $D$ and $\gamma$ in [2–7]; we briefly describe those in [6,7]. There is a left and a right action ("left" and "right" "regular representations" $A^L$ and $A^R$) of $A$ on $A$: $b^L a = ba$ and $b^R a = ab$ ($b, a \in A$, $b^{L,R} \in A^{L,R}$), with corresponding angular momentum operators $L_i^L$ and $L_i^R$ and fuzzy coordinates $x_i^L$ and $x_i^R$. $D$ and $\gamma$ are

$$D = \epsilon_{ijk}\, \sigma_i\, x_j^L L_k^R, \qquad \gamma = -\frac{\sigma \cdot L^R - 1/2}{l + 1/2}. \tag{7}$$

Monopoles and Solitons in Fuzzy Physics


Identifying $A^L$ as the representation of the fuzzy version of $\mathcal{A}$, we have as before,

$$\gamma\, b^L = b^L \gamma, \qquad \gamma D + D \gamma = 0, \qquad \gamma^2 = 1, \qquad \gamma^\dagger = \gamma. \tag{8}$$

The carrier space of $A^{L,R}$, $D$ and $\gamma$ is $A^2$. When $p^{(\pm N)}$ are also included, it gets expanded to $A^{2^{N+1}}$ as $\tau^{(i)}$ commute with $\sigma$. Note that $p^{(\pm N)}$ commute with $\gamma$, as the $x$'s they contain are now being identified with $x^L$'s.

We now construct a certain generalization of (2) for the fuzzy sphere. It looks like (2), or rather the following expression:

$$\pm N = -\mathrm{Tr}_\omega \left( \frac{1}{|D|^2}\, \Gamma\, P^{(\pm N)} [D, P^{(\pm N)}] [D, P^{(\pm N)}] \right), \qquad |D| = \text{positive square root of } D^\dagger D, \tag{9}$$

where $F = D/|D|$ [9–14]. It is equivalent to (2). It involves a Dixmier trace $\mathrm{Tr}_\omega$ and furthermore the inverse of $|D|$. But the massless Dirac operator on $A^{2^{N+1}}$ has zero modes and $|D|$ has no inverse.

An easy proof is as follows. We can write the elements of $A^{2^{N+1}}$ as rectangular matrices with entries $\xi_{\lambda j} \in A$ ($\lambda = 1, 2, \ldots, 2^N$; $j = 1, 2$), where $\lambda$ carries the action of the $\tau^{(i)}$'s and $j$ carries the action of $\sigma$. The dimensions of the subspaces $U_\pm$ of $A^{2^{N+1}}$ with $\gamma = \pm 1$ are $[2(l \pm 1/2) + 1][(2l+1) 2^N]$. The first factor is the row dimension of $U_\pm$ and is deduced from the fact that $(-L^R + \sigma/2)^2$ has the definite values $(l+1/2)(l+3/2)$ and $(l-1/2)(l+1/2)$ on $U_\pm$. The second factor is the column dimension of $U_\pm$. $D$ anticommutes with $\gamma$. So if $D^{(+)}$ is the restriction of $D$ to the domain $U_+$, $D^{(+)} = D|_{U_+} : U_+ \to U_-$, its index is $\dim U_+ - \dim U_- = 2[(2l+1) 2^N]$. This is the minimum number of zero modes of $D$ in $U_+$. Calculations [6,7] show this to be the exact number of zero modes, $D$ having no zero mode in $U_-$. In any case, $D^{(+)}$ and so $D$ have no inverse.

So we work instead with the massive Dirac operator $D_m = D + m\gamma$ ($m \neq 0$) with the strictly positive square $D_m^2 = D^2 + m^2$ and form the operator

$$f_m = \frac{D_m}{|D_m|}, \tag{10}$$

where

$$|D_m| = \text{positive square root of } D_m^\dagger D_m, \qquad f_m^\dagger = f_m, \qquad f_m^2 = 1. \tag{11}$$

Consider $\frac{1-\gamma}{2}\, p^{(N)} f_m\, p^{(N)}\, \frac{1+\gamma}{2}$, where we pick $p^{(N)}$ and not $p^{(-N)}$ for specificity. It anticommutes with $\gamma$. Let $\hat{V}_\pm = p^{(N)} U_\pm$. It then follows that the index of the operator

$$\hat{f}_m^{(+)} = \frac{1-\gamma}{2}\, p^{(N)} f_m\, p^{(N)}\, \frac{1+\gamma}{2} \tag{12}$$

(restricted to $\hat{V}_+$; such restrictions are hereafter to be understood) is the dimension of $\hat{V}_+$ minus that of $\hat{V}_-$, which is $2[2l + 1 + N]$. The index of its adjoint


$$\hat{f}_m^{(+)\dagger} \equiv \hat{f}_m^{(-)} = \frac{1+\gamma}{2}\, p^{(N)} f_m\, p^{(N)}\, \frac{1-\gamma}{2} \tag{13}$$

is $-2[2l + 1 + N]$.

We may try to associate the index of $\hat{f}_m^{(+)}$, say, with the winding number $N$. But that will not be correct: this index is not zero for $N = 0$. The source of this unpleasant feature is also a set of unwanted zero modes. Their presence can be established by looking at $\hat{f}_m^{(\pm)}$ more closely. $\hat{f}_m^{(\pm)}$ and $\gamma$ commute with the "total angular momentum" $J = L^L - L^R + \sum_i \tau^{(i)}/2 + \sigma/2$ while $\gamma$ anticommutes with $\hat{f}_m^{(\pm)}$. So if an irreducible representation (IRR) of $J$ with $J^2 = j(j+1) 1$ occurs an odd number of times in $\hat{V}_+ + \hat{V}_-$, $\hat{f}_m^{(+)} + \hat{f}_m^{(-)}$ must vanish on at least one of the $(2j+1)$-dimensional eigenspaces. The remaining $(2j+1)$-dimensional eigenspaces can pair up so as to correspond to eigenvalues $\pm\lambda \neq 0$ and get interchanged by $\gamma$.

There are two such $j$, both in $\hat{V}_+$. They label IRR's with multiplicity 1 and are its maximum and minimum, $j^{(N)} = 2l + \frac{N+1}{2}$ and $\frac{N-1}{2}$. We can see that their eigenspaces have $\gamma = +1$ as follows: the angular momentum value of $L^L + \sum_i \tau^{(i)}/2$ in $p^{(N)} A^{2^{N+1}}$ is $l + N/2$, so that the angular momentum value of $-L^R + \sigma/2$ must be $l + 1/2$ to attain the $j$-values $j^{(N)}$ and $\frac{N-1}{2}$. A further point is that since $(2 j^{(N)} + 1) + [2(\frac{N-1}{2}) + 1]$ is the index of $\hat{f}_m^{(+)}$ found earlier, we can conclude that there are no other obligatory zero modes. Indeed every other $j$ labels IRR's of multiplicity 2, one with $\gamma = +1$ and the other with $\gamma = -1$.

The zero modes for $j^{(N)}$ are unphysical as discussed by the Watamuras [6,7]: there are no similar modes in the continuum. If we can project them out, the index will shrink to $2\,\frac{N-1}{2} + 1 = N$, just what we want. So let $\pi^{(j^{(N)})}$ be the projection operator for $j^{(N)}$, constructed in the same fashion as $p^{(N)}$. It commutes with $p^{(N)}$ since $p^{(N)}$ commutes with $J$. In fact, $p^{(N)} \pi^{(j^{(N)})} = \pi^{(j^{(N)})}$ since if $j$ is maximum, then so is $k$. We thus find that

$$\Pi^{(N)} = p^{(N)} [1 - \pi^{(j^{(N)})}] = p^{(N)} - \pi^{(j^{(N)})} \tag{14}$$

is a projector. It commutes with $\gamma$ too. Let $V_\pm = \Pi^{(N)} U_\pm$,

$$f_m^{(\pm)} = \frac{1 \mp \gamma}{2}\, \Pi^{(N)} f_m\, \Pi^{(N)}\, \frac{1 \pm \gamma}{2}, \tag{15}$$

where $f_m^{(+)\dagger} = f_m^{(-)}$. Then $f_m^{(+)}$ (restricted to $V_+$) has the index $N$ we want.

The eigenvalues of $f_m^{(-)} f_m^{(+)}$ and $f_m^{(+)} f_m^{(-)}$ are the same (not counting degeneracy) and for nonzero eigenvalues, the dimensions of the corresponding eigenspaces are also identical. (We omit the elementary proofs.) Therefore,

$$\mathrm{Tr}\, \frac{1+\gamma}{2}\, \Pi^{(N)} [1 - f_m^{(-)} f_m^{(+)}] - \mathrm{Tr}\, \frac{1-\gamma}{2}\, \Pi^{(N)} [1 - f_m^{(+)} f_m^{(-)}] = \mathrm{Tr}\, \frac{1+\gamma}{2}\, \Pi^{(N)} - \mathrm{Tr}\, \frac{1-\gamma}{2}\, \Pi^{(N)} = N = \text{Index of } f_m^{(+)}. \tag{16}$$
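The zero-mode counting behind (12)–(16) is pure angular momentum bookkeeping, and a short script can reproduce it. The sketch below assumes only what is stated above: on $p^{(\pm N)} A^{2^{N+1}}$ the operator $K^{(N)}$ has spin $l \pm N/2$, and $-L^R + \sigma/2$ has spin $l + 1/2$ for $\gamma = +1$ and $l - 1/2$ for $\gamma = -1$. The function names and the use of doubled spins $2j$ (to stay in integers) are our own choices.

```python
def coupled_spins(two_a, two_b):
    """Doubled spins 2j in the Clebsch-Gordan series of spin a times spin b."""
    return list(range(abs(two_a - two_b), two_a + two_b + 1, 2))


def zero_mode_count(two_l, n, sign=1):
    """Angular momentum bookkeeping on p^(sign*N) A^(2^(N+1)).

    two_l = 2l >= 1, n = N with 0 <= N <= 2l, sign = +1 or -1.
    Returns (dim V-hat_+ minus dim V-hat_-,
             doubled spins 2j occurring only with gamma = +1,
             doubled spins 2j occurring only with gamma = -1,
             index left after the top-j multiplet is projected out).
    """
    two_k = two_l + sign * n                  # K^(N) has spin k = l +/- N/2
    plus = coupled_spins(two_k, two_l + 1)    # gamma = +1: spin l + 1/2
    minus = coupled_spins(two_k, two_l - 1)   # gamma = -1: spin l - 1/2
    dim_plus = sum(t + 1 for t in plus)       # each multiplet has 2j + 1 states
    dim_minus = sum(t + 1 for t in minus)
    hat_index = dim_plus - dim_minus
    only_plus = sorted(set(plus) - set(minus))
    only_minus = sorted(set(minus) - set(plus))
    top = max(plus)                           # top total angular momentum, gamma = +1
    return hat_index, only_plus, only_minus, hat_index - (top + 1)


print(zero_mode_count(two_l=6, n=2))   # -> (18, [1, 15], [], 2)
```

For $l = 3$, $N = 2$ the unpaired multiplets are $j = 1/2 = (N-1)/2$ and $j = 15/2 = 2l + (N+1)/2$, both with $\gamma = +1$, and removing the top one leaves the index $N = 2$. With `sign=-1` the same bookkeeping gives $-N$.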


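We want to be able to write (16) as a cyclic cocycle coming from a Fredholm module [9–14]. The latter for us is based on a representation $\Sigma$ of $A^L \otimes A^R$ on a Hilbert space, and operators $F$ and $\epsilon$ with the following properties:

$$\text{(i)} \quad F^\dagger = F, \quad F^2 = 1. \tag{17}$$

$$\text{(ii)} \quad \epsilon^\dagger = \epsilon, \quad \epsilon^2 = 1, \quad \epsilon\, \Sigma(\alpha) = \Sigma(\alpha)\, \epsilon, \quad \epsilon F = -F \epsilon, \tag{18}$$

where $\alpha \in A^L \otimes A^R$. [This gives an even Fredholm module; there need be no $\epsilon$ in an odd one.] We choose for $\Sigma$ the representation

$$\Sigma : \alpha \to \Sigma(\alpha) = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix} \tag{19}$$

on $A^{2^{N+1}} \oplus A^{2^{N+1}}$ and set

$$F = \begin{pmatrix} 0 & f_m \\ f_m & 0 \end{pmatrix}, \qquad \epsilon = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{20}$$

Introduce the projector

$$P^{(N)} = \begin{pmatrix} \frac{1+\gamma}{2}\, \Pi^{(N)} & 0 \\ 0 & \frac{1-\gamma}{2}\, \Pi^{(N)} \end{pmatrix}. \tag{21}$$

Then

$$(P^{(N)} F P^{(N)})^2 = \begin{pmatrix} f_m^{(+)\dagger} f_m^{(+)} & 0 \\ 0 & f_m^{(-)\dagger} f_m^{(-)} \end{pmatrix}. \tag{22}$$

Therefore,

$$\text{Index of } f_m^{(+)} = \mathrm{Tr}\, \epsilon\, [P^{(N)} - (P^{(N)} F P^{(N)})^2]. \tag{23}$$

But since [9] $P^{(N)} - (P^{(N)} F P^{(N)})^2 = -P^{(N)} [F, P^{(N)}]^2 P^{(N)}$,

$$\text{Index of } f_m^{(+)} = N = -\mathrm{Tr}\, \epsilon\, P^{(N)} [F, P^{(N)}] [F, P^{(N)}]. \tag{24}$$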
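The algebra leading from (23) to (24) uses only $F^2 = 1$, $P^2 = P$ and the fact that the grading diag(1, −1) of (20) commutes with $P$ and anticommutes with $F$, so it can be checked on any finite-dimensional toy model. The sketch below is such a toy and not the fuzzy-sphere module itself: the 4×4 involution `f` and the diagonal projectors `p_plus`, `p_minus` are hypothetical stand-ins for $f_m$ and for the two diagonal blocks of (21), and all helper names are ours.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    return [[-x for x in row] for row in a]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def blocks(a, b, c, d):
    """Assemble the block matrix [[a, b], [c, d]] from equal-size square blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def toy_module():
    """Hypothetical stand-ins: f plays f_m, p_plus / p_minus the blocks of (21)."""
    zero, one = diag([0, 0, 0, 0]), diag([1, 1, 1, 1])
    f = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # f = f^T, f^2 = 1
    p_plus, p_minus = diag([1, 1, 1, 0]), diag([0, 1, 0, 0])      # ranks 3 and 1
    F = blocks(zero, f, f, zero)
    eps = blocks(one, zero, zero, neg(one))
    P = blocks(p_plus, zero, zero, p_minus)
    return F, eps, P


def index_formulas(F, eps, P):
    """Return (identity holds?, trace as in (23), trace as in (24))."""
    pfp = matmul(P, matmul(F, P))
    lhs = matsub(P, matmul(pfp, pfp))                  # P - (P F P)^2
    comm = matsub(matmul(F, P), matmul(P, F))          # [F, P]
    comm2 = matmul(comm, comm)
    rhs = neg(matmul(P, matmul(comm2, P)))             # -P [F, P]^2 P
    via_23 = trace(matmul(eps, lhs))
    via_24 = -trace(matmul(eps, matmul(P, comm2)))
    return lhs == rhs, via_23, via_24


print(index_formulas(*toy_module()))   # -> (True, 2, 2)
```

In this toy both traces equal rank(`p_plus`) − rank(`p_minus`) = 2, which is the index of any linear map between spaces of those dimensions; the same mechanism produces $N$ in (24).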

This is the formulation of (16) we aimed at and is the analogue of (2). It is worth remarking that we can replace $\epsilon$ by $\begin{pmatrix} \gamma & 0 \\ 0 & \gamma \end{pmatrix}$ here since $\epsilon\, P^{(N)} = \begin{pmatrix} \gamma & 0 \\ 0 & \gamma \end{pmatrix} P^{(N)}$.

In $p^{(-N)} A^{2^{N+1}}$ as well, the unwanted zero modes correspond to the top value $j^{(-N)} = 2l - \frac{N-1}{2}$ of "total angular momentum". Once they are suppressed, the remaining obligatory zero modes are readily shown to have the $j$-value $\frac{N-1}{2}$, multiplicity $N$ and $\gamma = -1$. Let $\pi^{(j^{(-N)})}$ be the projector for the top angular momentum. Then it can be projected out by replacing $p^{(-N)}$ by

$$\Pi^{(-N)} = p^{(-N)} [1 - \pi^{(j^{(-N)})}], \tag{25}$$

but now $p^{(-N)} \pi^{(j^{(-N)})} \neq \pi^{(j^{(-N)})}$. Substituting $\Pi^{(-N)}$ for $\Pi^{(N)}$ in (21), we define $P^{(-N)}$ and then by using (24) can associate $-N$ too with an index.


There is the topic of fuzzy σ-fields yet to be discussed in this section. We first note that in $u$ defined earlier, $a_\mu$ is to be identified with $a_\mu^L$. Let us extend $u^{\tilde{\otimes} N}$ and $g^{\tilde{\otimes} N}$ from $A^{2^N}$ to $A^{2^{N+1}} = A^{2^N} \otimes \mathbb{C}^2$ and $\mathcal{A}^{2^{N+1}} = \mathcal{A}^{2^N} \otimes \mathbb{C}^2$ so that they act as identity on the last $\mathbb{C}^2$'s. We also extend them further to $A^{2^{N+1}} \oplus A^{2^{N+1}} \equiv A^{2^{N+1}} \otimes \mathbb{C}^2$ and $\mathcal{A}^{2^{N+1}} \oplus \mathcal{A}^{2^{N+1}} \equiv \mathcal{A}^{2^{N+1}} \otimes \mathbb{C}^2$ so that they act as identity on these last $\mathbb{C}^2$'s. Define also $Q(u) = u^{\tilde{\otimes} N}\, Q\, (u^{\tilde{\otimes} N})^{-1}$ for an operator $Q$ ($= Q(1)$) on $A^{2^{N+1}} \oplus A^{2^{N+1}}$. The right hand side of (24) is invariant under the substitution $P^{(N)} \to P^{(N)}(u)$ without changing $F$. So $P^{(N)}(u)$ is a candidate for a fuzzy winding number $N$ σ-field in the present context, whereas previously it was $p^{(N)}(u)$.

But we must justify this candidacy by looking at the continuum limit. In that limit, $u^{\tilde{\otimes} N} \to g^{\tilde{\otimes} N}$, $\pi^{(j^{(N)})} \to \pi_\infty^{(j^{(N)})}$ say, and $\Pi^{(N)} \to \Pi_\infty^{(N)} = P^{(N)} - \pi_\infty^{(j^{(N)})}$. The stability group of $P^{(N)}$ under conjugation by $g^{\tilde{\otimes} N}$ is as before $U(1)$ at each $n$. Now $\pi^{(j^{(N)})}$ projects out states where any one of $(L^L + \tau^{(i)}/2)^2$, $(L^L + \sigma/2)^2$, $(L^L - L^R)^2$ has the maximum value. (Then any other pair of angular momenta also adds up to maximum value as we saw in Sect. 3.) So

$$\tau^{(i)} \cdot x^L\, \pi^{(j^{(N)})} = \sigma \cdot x^L\, \pi^{(j^{(N)})} = \pi^{(j^{(N)})}, \qquad x^L \cdot x^R\, \pi^{(j^{(N)})} = -\pi^{(j^{(N)})}.$$

The last condition is just a rule telling us that as $l \to \infty$, $x^R$ becomes $-\hat{n}$ on vectors projected by $\pi^{(j^{(N)})}$. It will not show up in the continuum projector. This establishes that $\pi_\infty^{(j^{(N)})}$ can be identified with

$$P^{(N+1)} = \left( \prod_{i=1}^{N} \frac{1 + \tau^{(i)} \cdot \hat{n}}{2} \right) \frac{1 + \sigma \cdot \hat{n}}{2},$$

while of course $\gamma \to \Gamma$. So $P^{(N+1)}$ and $\frac{1 \pm \Gamma}{2}\, \Pi_\infty^{(N)}$ have $U(1)$ stability groups, $\Pi_\infty^{(N)}(g)$ is a σ-field on $S^2$ and $P^{(N)}(u)$ is a good choice for the fuzzy σ-field.

For winding number $-N$, we propose $P^{(-N)}(u)$ as the fuzzy σ-field. We can check its validity also by going to the continuum limit. As $l \to \infty$, $p^{(-N)} \to P^{(-N)} = \prod_{i=1}^{N} \frac{1 - \tau^{(i)} \cdot \hat{n}}{2}$ and has the $U(1)$ stability group at each $n$. Next consider the product $p^{(-N)} \pi^{(j^{(-N)})}$. The presence of $p^{(-N)}$ allows us to assume that $(K^{(N)})^2 = k_{min}(k_{min} + 1)$, $k_{min} = l - N/2$. Also we can substitute for $\pi^{(j^{(-N)})}$ the projector coupling $K^{(N)}$ with $-L^R + \sigma/2$ to give maximum angular momentum. This projector for $l \to \infty$ gives the projector $\frac{1 + \sigma \cdot \hat{n}}{2}$, the $N$ giving no contribution. (We also get the condition $x^R \to -\hat{n}$.) So $p^{(-N)} \pi^{(j^{(-N)})}$ as $l \to \infty$ can be identified with $P^{(-N)}\, \frac{1 + \sigma \cdot \hat{n}}{2}$ and that too has the $U(1)$ stability group at each $n$. This shows that $P^{(-N)}(u)$ is a good fuzzy σ-field for winding number $-N$.

6. Dynamics and Continuum Limit for Fuzzy σ-Models

The simplest action for the O(3) nonlinear σ-model on $S^2$ is

$$S = \frac{\beta}{2} \int_{S^2} \frac{d\cos\theta\, d\phi}{4\pi}\, (L_i \Phi_a)(n)\, (L_i \Phi_a)(n), \qquad \sum_{a=1}^{3} \Phi_a(n)^2 = 1, \quad \beta > 0, \tag{26}$$

where $-iL_i$ are the angular momentum operators on $S^2$. It fulfills the important bound [16]

$$S \geq \beta N, \tag{27}$$


where $N$ ($\geq 0$) or $-N$ is as usual the winding number of the map $\Phi : S^2 \to S^2$:

$$\text{Winding number of } \Phi = \frac{1}{2} \int_{S^2} \frac{d\cos\theta\, d\phi}{4\pi}\, \epsilon_{ijk}\, n_i\, \epsilon_{abc}\, \Phi_a (L_j \Phi_b)(L_k \Phi_c). \tag{28}$$

This bound is obtained by integrating the inequality

$$(L_i \Phi_a \pm \epsilon_{ijk}\, n_j\, \epsilon_{abc}\, \Phi_b L_k \Phi_c)^2 \geq 0 \tag{29}$$

and is saturated if and only if

$$L_i \Phi_a \pm \epsilon_{ijk}\, n_j\, \epsilon_{abc}\, \Phi_b L_k \Phi_c = 0 \tag{30}$$

for one choice of sign. The solutions of (30) can be thought of as two dimensional instantons [16].

We now propose a fuzzy σ-action using these properties of $S$ as our guide. Consider the inequality

$$\left( [F, P(u)]\, \frac{1 \pm \epsilon}{2}\, P(u) \right)^\dagger \left( [F, P(u)]\, \frac{1 \pm \epsilon}{2}\, P(u) \right) \geq 0, \tag{31}$$

where $P(u)$ can be $P^{(N)}(u)$ or $P^{(-N)}(u)$ and $Q \geq 0$ here means that $Q$ is a nonnegative operator. This is the analogue of (29). Taking the trace, we get the analogue of (27),

$$s_F \equiv \mathrm{Tr}\, P(u) [F, P(u)] [F, P(u)] \geq N. \tag{32}$$

The bound is saturated if and only if

$$[F, P(u)]\, \frac{1 \pm \epsilon}{2}\, P(u) = 0 \tag{33}$$

for one choice of sign, just like in (30). All this suggests the novel fuzzy σ-action

$$S_F = \beta_F\, s_F. \tag{34}$$
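For orientation, the continuum bound (27) that serves as the guide here follows from (29) in a few lines. The derivation below is a sketch in our own shorthand $A_{ia}$, $B_{ia}$; it assumes $n_i n_i = 1$, $\Phi_a \Phi_a = 1$ (so that $\Phi_c L_k \Phi_c = 0$) and that the $L_i$ are tangential to the sphere, $n_i L_i = 0$, as holds for the usual angular momentum operators.

```latex
% Shorthand (ours): A_{ia} = L_i\Phi_a, \quad
%                   B_{ia} = \epsilon_{ijk}\, n_j\, \epsilon_{abc}\, \Phi_b L_k \Phi_c .
\begin{align*}
B_{ia}B_{ia}
  &= (\delta_{jj'}\delta_{kk'} - \delta_{jk'}\delta_{kj'})\, n_j n_{j'}\,
     (\delta_{bb'}\delta_{cc'} - \delta_{bc'}\delta_{cb'})\, \Phi_b \Phi_{b'}\,
     (L_k\Phi_c)(L_{k'}\Phi_{c'})
   = A_{kc}A_{kc}, \\
2 A_{ia}B_{ia}
  &= 2\, \epsilon_{ijk}\, n_i\, \epsilon_{abc}\, \Phi_a (L_j\Phi_b)(L_k\Phi_c), \\
0 \le (A_{ia} \pm B_{ia})^2
  &= 2 A_{ia}A_{ia} \pm 2\, \epsilon_{ijk}\, n_i\, \epsilon_{abc}\, \Phi_a (L_j\Phi_b)(L_k\Phi_c).
\end{align*}
% Integrate with d\cos\theta\, d\phi / 4\pi and use (26) and (28):
\[
\frac{4}{\beta}\, S \pm 4\, (\text{winding number of } \Phi) \ge 0
\quad\Longrightarrow\quad
S \ge \beta\, |\text{winding number of } \Phi| = \beta N .
\]
```

Equality requires the integrand in (29) to vanish pointwise for one sign, which is condition (30).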

Qualitative remarks about the approach to continuum of $S_F$ will now be made. The first is that $\beta_F$ and $m$ must be scaled as $l \to \infty$. As regards the scaling of $\beta_F$, we conjecture that (33) has no solution for finite $l$ and that (32) is a strict inequality. Choose

$$\Lambda(l) = \frac{1}{N} \times (\text{Minimum of } s_F) \tag{35}$$

so that $s_F / \Lambda(l)$ is $N$ at minimum. Then we suggest that we should set

$$\beta_F = \frac{\beta}{\Lambda(l)}. \tag{36}$$

It is our conjecture too that $\Lambda(l)$ diverges as $l \to \infty$ in such a way that (up to factors)

$$S_F \to S_\infty = \beta\, \mathrm{Tr}_\omega\, P_\infty(g) [F, P_\infty(g)] [F, P_\infty(g)], \qquad F = \begin{pmatrix} 0 & D/|D| \\ D/|D| & 0 \end{pmatrix}, \qquad g = \lim_{l \to \infty} u, \qquad P_\infty(g) = \lim_{l \to \infty} P(u), \tag{37}$$


where we have let $m$ become zero as $D$ has no zero mode. An alternative form of $S_\infty$ is

$$S_\infty = \beta \int \frac{d\cos\theta\, d\phi}{4\pi}\, \mathrm{Tr}\, P_\infty(g) \left[ \begin{pmatrix} 0 & D \\ D & 0 \end{pmatrix}, P_\infty(g) \right] \left[ \begin{pmatrix} 0 & D \\ D & 0 \end{pmatrix}, P_\infty(g) \right], \tag{38}$$

where the trace $\mathrm{Tr}$ is only over the internal indices.

We now argue that $P(u)$ itself must be corrected by cutting off all high angular momenta (and not just the top one) while passing to continuum. Thus it was mentioned before that state vectors with top "total" angular momentum $j^{(\pm N)}$ are unphysical. Their characteristic feature is their divergence as $l \to \infty$. That means that once normalized these vectors become weakly zero in the continuum limit. In fact any sequence of vectors with a linearly divergent $j$ as $l \to \infty$ is unphysical. Such $j$ contribute eigenvalues to the Dirac operator which are nonexistent in the continuum, as one can verify using the results of [6,7] for $N = 0$: the spectrum of the fuzzy $D$ then is $\pm (j + \frac{1}{2}) [1 + (1 - (j + \frac{1}{2})^2) / (4l(l+1))]^{1/2}$ while that of the continuum Dirac operator is $\pm (j + \frac{1}{2})$, $j$ being total angular momentum. The corresponding eigenvectors too, if normalized, are weakly zero in the $l \to \infty$ limit. It seems necessary therefore to eliminate them in a suitable sense during the passage to the limit.

One way to do so may be to use a double limit which we now describe. Let $\pi^{(J)}$ be the projection operator for all states with $j \geq J$. Let us define

$$P^{(\pm N)(J)} = \begin{pmatrix} \frac{1+\gamma}{2}\, p^{(\pm N)} (1 - \pi^{(J)}) & 0 \\ 0 & \frac{1-\gamma}{2}\, p^{(\pm N)} (1 - \pi^{(J)}) \end{pmatrix}, \qquad P^{(\pm N)(J)}(u) = u^{\tilde{\otimes} N}\, P^{(\pm N)(J)}\, [u^{\tilde{\otimes} N}]^{-1}. \tag{39}$$
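The $N = 0$ spectrum quoted above makes the need for a cut-off concrete. The sketch below evaluates the squared fuzzy eigenvalue $(j + \frac{1}{2})^2 [1 + (1 - (j + \frac{1}{2})^2)/(4l(l+1))]$ exactly and compares it with the continuum value $(j + \frac{1}{2})^2$. The function names are ours, and the spins are passed as the integers $2j$ and $2l$ so that exact rational arithmetic can be used.

```python
from fractions import Fraction


def fuzzy_eigenvalue_squared(two_j, two_l):
    """Square of the fuzzy Dirac eigenvalue for N = 0, from the formula in the text."""
    n = Fraction(two_j + 1, 2)          # n = j + 1/2
    l = Fraction(two_l, 2)
    return n ** 2 * (1 + (1 - n ** 2) / (4 * l * (l + 1)))


def continuum_eigenvalue_squared(two_j):
    """Square of the continuum eigenvalue +/-(j + 1/2)."""
    return Fraction(two_j + 1, 2) ** 2


def spectrum_ratio(two_j, two_l):
    """Fuzzy over continuum squared eigenvalue at the same j."""
    return fuzzy_eigenvalue_squared(two_j, two_l) / continuum_eigenvalue_squared(two_j)


two_l = 200                                                # l = 100
low = spectrum_ratio(3, two_l)                             # fixed j = 3/2
mid = spectrum_ratio(two_l + 1, two_l)                     # j + 1/2 = l + 1, linear in l
top = fuzzy_eigenvalue_squared(2 * two_l + 1, two_l)       # top value j = 2l + 1/2
```

At fixed $j$ the ratio `low` tends to 1 as $l$ grows, for $j$ growing linearly with $l$ the ratio `mid` stays away from 1 (it tends to 3/4 along $j + \frac{1}{2} = l + 1$), and `top` vanishes identically, which is the zero mode at the top angular momentum discussed in Sect. 5.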

We then consider the fuzzy σ-model with $P^{(\pm N)(J)}(u)$ replacing $P^{(\pm N)}(u) \equiv P^{(\pm N)(j^{(N)})}(u)$ and thereby cutting off angular momenta $\geq J$. That would not affect index theory arguments so long as $J > \frac{N-1}{2}$, as the important zero modes will then be left intact. We are thus led to the cut-off action

$$S_F^{(J)} = \frac{\beta}{\Lambda^{(J)}(l)}\, s_F^{(J)}, \qquad s_F^{(J)} = \mathrm{Tr}\, P^{(\pm N)(J)}(u) [F, P^{(\pm N)(J)}(u)] [F, P^{(\pm N)(J)}(u)], \qquad \Lambda^{(J)}(l) = \frac{\text{Minimum of } s_F^{(J)}}{N}, \tag{40}$$

and the following suggestion: a good way to define the continuum partition function is to let $l$ and $J \to \infty$ in that order in the partition function of $S_F^{(J)}$. Thus we propose the continuum partition function

$$Z = \lim_{J \to \infty} \lim_{l \to \infty} \int d\mu\, \exp(-S_F^{(J)}), \tag{41}$$

$d\mu$ denoting the functional measure. The inner limit recovers the continuum where the contributions of vectors with divergent $J$ should not matter; for this reason this method may eliminate the influence of unwanted modes from $Z$. Perhaps an equivalent limiting procedure would be to let $l, J \to \infty$ with $J/l \to 0$.


Taking the limit $l \to \infty$ with fixed $J$ is compatible with the continuum description of the σ-field. In that limit, $p^{(\pm N)}$ becomes $P^{(\pm N)}$. Next consider the vectors projected by $p^{(\pm N)} [1 - \pi^{(J)}]$. The effect of the last factor on the projected vectors is as follows. For $\gamma = 1$ say, we must combine the angular momentum value $l \pm N/2$ of $K^{(N)}$ with the value $l + 1/2$ of $-L^R + \sigma/2$ to produce an allowed value $j < J$ of any such vector. So $[K^{(N)} + (-L^R + \sigma/2)]^2 = j(j+1)$, $(K^{(N)})^2 = (l \pm N/2)(l \pm N/2 + 1)$ and $(-L^R + \sigma/2)^2 = (l + \frac{1}{2})(l + \frac{3}{2})$. Letting $l \to \infty$, we find that $x^L \cdot x^R \to 1$ due to the factor $[1 - \pi^{(J)}]$, where we have used the fact that $\tau^{(i)}/l$ and $\sigma/l \to 0$ as $l \to \infty$. But this is just a rule instructing us to set $x^R = \hat{n}$ for large $l$ for these vectors, and will not show up in the continuum projector. The $\gamma = -1$ case is no different in the continuum limit. Thus for $l \to \infty$, $p^{(\pm N)}(u) [1 - \pi^{(J)}(u)]$ can be interpreted as $P^{(\pm N)}(g) = g^{\tilde{\otimes} N}\, P^{(\pm N)}\, [g^{\tilde{\otimes} N}]^{-1}$, the continuum σ-fields. Let

$$P_\infty^{(\pm N)(J)}(g) = \lim_{l \to \infty} P^{(\pm N)(J)}(u) = \begin{pmatrix} \frac{1+\Gamma}{2}\, P^{(\pm N)}(g) & 0 \\ 0 & \frac{1-\Gamma}{2}\, P^{(\pm N)}(g) \end{pmatrix}. \tag{42}$$

Then the naive $l \to \infty$, $m \to 0$ limit of $S_F^{(J)}$ is expected to be (up to factors)

$$S_\infty = \beta\, \mathrm{Tr}_\omega\, P_\infty^{(\pm N)(J)}(g) [F, P_\infty^{(\pm N)(J)}(g)] [F, P_\infty^{(\pm N)(J)}(g)] \tag{43}$$

which can be simplified to

$$S_\infty = \beta \int \frac{d\cos\theta\, d\phi}{4\pi}\, \mathrm{Tr}\, P^{(\pm N)}(g) [D, P^{(\pm N)}(g)] [D, P^{(\pm N)}(g)]. \tag{44}$$

It seems to correspond to (26).

7. Remarks

– The Dirac operators in (1) and (7) differ from those of [6,7] by unitary transformations generated by $\Gamma$ and $\gamma$.
– Nonabelian monopoles such as the elementary $U(2)$ monopoles and their fuzzy versions can very likely be accommodated in our approach using different projectors.
– In the same manner, there seems to be no big barrier to studying the case of Grassmannians $G_{n,k}(\mathbb{C}) = U(n+k)/[U(n) \times U(k)]$ as target spaces in σ-models as they are orbits of rank $n$ projectors under $U(n+k)$. We can also imagine treating other target spaces by considering orbits under subgroups of $U(n+k)$, the previous choice of $\{ g^{\tilde{\otimes} N}(n) \} \subset U(2^N)$ being an example.
– We have managed to generalize the approach here to fuzzy manifolds like fuzzy $\mathbb{C}P^2$ ([17]) or more generally to fuzzy versions of orbits of simple Lie groups in the adjoint representation. We will report on this work elsewhere.
– Further discussion of fuzzy quantum physics of monopoles and solitons is needed to better reveal the implications of fuzzy quantum physics in its topological aspects.
– The fuzzy Dirac operator $D'$ used in [2–5] gives a much better approximation to the spectrum of the continuum Dirac operator. Its eigenvalue for the top angular momentum state vector is largest in modulus, recedes to infinity with $l$ and is not a zero mode as in the case of $D$. The contribution of this vector therefore tends to be suppressed in functional integrals. In [18], a chirality operator $\gamma'$ for $D'$ (with the correct continuum limit) has been constructed after projecting out this vector. The contents of the present paper can be easily recast using $D'$ and $\gamma'$. Ref. [19] discusses θ-states and chiral anomalies in gauge theories of fuzzy physics using these new operators.

Acknowledgements. During this work we were fortunate to receive help and advice from colleagues and friends like T. R. Govindarajan, Giorgio Immirzi, C. Klimcik, Gianni Landi, Fedele Lizzi, Xavier Martin, Denjoe O'Connor, Paulo Teotonio-Sobrinho and J. C. Varilly. Fedele was especially helpful with detailed comments and pointed out an error in an earlier version and also the connection of our work to that in ref. [4]. We thank them. A.P.B. also warmly thanks Fedele Lizzi and Beppe Marmo for their exceptionally friendly hospitality and for arranging support by INFN at Dipartimento di Scienze Fisiche, Università di Napoli, while this work was being completed. This work was supported in part by the DOE under contract number DE-FG02-85ER40231.

References

1. Madore, J.: An Introduction to Noncommutative Differential Geometry and its Applications. Cambridge: Cambridge University Press, 1995
2. Grosse, H. and Presnajder, P.: Lett. Math. Phys. 33, 171–182 (1995)
3. Grosse, H., Klimcik, C. and Presnajder, P.: N = 2 Superalgebra and Noncommutative Geometry. In: Les Houches Summer School on Theoretical Physics, 1995, hep-th/9603071
4. Grosse, H., Klimcik, C. and Presnajder, P.: Commun. Math. Phys. 178, 507–526 (1996), hep-th/9510083
5. Grosse, H., Klimcik, C. and Presnajder, P.: Commun. Math. Phys. 180, 429–438 (1996), hep-th/9602115. See citations in [2–5] for further references
6. Carow-Watamura, U. and Watamura, S.: Commun. Math. Phys. 183, 365–382 (1997), hep-th/9605003
7. Carow-Watamura, U. and Watamura, S.: hep-th/9801195
8. See however Fröhlich, J., Grandjean, O. and Recknagel, A.: math-phys/9807006 and references therein for fuzzy versions of group manifolds, also of odd dimension
9. Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
10. Landi, G.: An Introduction to Noncommutative Spaces and Their Geometries. Berlin: Springer-Verlag, 1997, hep-th/9701078
11. Coquereaux, R.: J. Geom. Phys. 6, 425–490 (1989), CPT preprint CPT-88/P-2147
12. Varilly, J.C. and Gracia-Bondia, J.M.: J. Geom. Phys. 12, 223–301 (1993)
13. Mignaco, J.A., Sigaud, C., da Silva, A.R. and Vanhecke, F.J.: Rev. Math. Phys. 9, 689–718 (1997), hep-th/9611058
14. Varilly, J.C.: An Introduction to Noncommutative Geometry. physics/9709045
15. Montvay, I. and Münster, G.: Quantum Fields on a Lattice. Cambridge: Cambridge University Press, 1994
16. Belavin, A.A. and Polyakov, A.M.: JETP Lett. 22, 245–247 (1975); Balachandran, A.P., Marmo, G., Skagerstam, B.S. and Stern, A.: Classical Topology and Quantum States. Singapore: World Scientific, 1991, p. 112
17. Fuzzy CP2 has already been investigated by H. Grosse and A. Strohmaier, hep-th/9902138
18. Balachandran, A.P., Ydri, B. and Govindarajan, T.R.: In preparation
19. Balachandran, A.P. and Vaidya, S.: hep-th/9910129

Communicated by A. Connes
