Communications in Mathematical Physics - Volume 238


Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 238, 1–33 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0853-1

Communications in

Mathematical Physics

Hitchin–Kobayashi Correspondence, Quivers, and Vortices

Luis Álvarez–Cónsul¹, Oscar García–Prada²

¹ Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
² Departamento de Matemáticas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Received: 10 December 2001 / Accepted: 10 November 2002 Published online: 28 May 2003 – © Springer-Verlag 2003

Current address (L. Álvarez–Cónsul): Mathematical Sciences, University of Bath, Bath, BA2 7AY, UK.
Current address (O. García–Prada): Instituto de Matemáticas y Física Fundamental, CSIC, Serrano 113 bis, 28006 Madrid, Spain.

Abstract: A twisted quiver bundle is a set of holomorphic vector bundles over a complex manifold, labelled by the vertices of a quiver, linked by a set of morphisms twisted by a fixed collection of holomorphic vector bundles, labelled by the arrows. When the manifold is Kähler, quiver bundles admit natural gauge-theoretic equations, which unify many known equations for bundles with extra structure. In this paper we prove a Hitchin–Kobayashi correspondence for twisted quiver bundles over a compact Kähler manifold, relating the existence of solutions to the gauge equations to a stability criterion, and consider its application to a number of situations related to Higgs bundles and dimensional reductions of the Hermitian–Einstein equations.

Introduction

A quiver $Q$ consists of a set $Q_0$ of vertices $v, v', \ldots$, and a set $Q_1$ of arrows $a : v \to v'$ connecting the vertices. Given a quiver and a compact Kähler manifold $X$, a quiver bundle is defined by assigning a holomorphic vector bundle $E_v$ to a finite number of vertices and a homomorphism $\phi_a : E_v \to E_{v'}$ to a finite number of arrows. A quiver sheaf is defined by replacing the term “holomorphic vector bundle” by “coherent sheaf” in this definition. If we fix a collection of holomorphic vector bundles $M_a$ parametrized by the set of arrows, and the morphisms are $\phi_a : E_v \otimes M_a \to E_{v'}$, twisted by the corresponding bundles, we have a twisted quiver bundle or a twisted quiver sheaf.

In this paper we define natural gauge-theoretic equations, which we call quiver vortex equations, for a collection of hermitian metrics on the bundles associated to the vertices of a twisted quiver bundle (for this, we need to fix hermitian metrics on the twisting vector bundles). To solve these equations, we introduce a stability criterion for twisted quiver sheaves, and prove a Hitchin–Kobayashi correspondence, relating the existence of (unique) hermitian metrics satisfying the quiver vortex equations to the stability of the quiver bundle. The equations and the stability criterion depend on some real numbers, the stability parameters (cf. Remarks 2.1 for the exact number of parameters). It is relevant to point out that our results cannot be derived from the general Hitchin–Kobayashi correspondence scheme developed by Banfield [Ba] and further generalized by Mundet [M]. This is due not only to the presence of twisting vector bundles, but also to the deformation of the Hermitian–Einstein terms in the equations. This deformation is naturally explained by the symplectic interpretation of the equations, and accounts for extra parameters in the stability condition for the twisted quiver bundle.

This correspondence provides a unifying framework to study a number of problems that have been considered previously. The simplest situation occurs when the quiver has a single vertex and no arrows, in which case a quiver bundle is just a holomorphic bundle $E$, and the gauge equation is the Hermitian–Einstein equation. A theorem of Donaldson, Uhlenbeck and Yau [D1, D2, UY] establishes that a (unique) solution to the Hermitian–Einstein equation exists if and only if $E$ is polystable. The bundle $E$ is called stable (in the sense of Mumford–Takemoto) if $\mu(F) < \mu(E)$ for each proper coherent subsheaf $F \subset E$, where the slope $\mu(F)$ is the degree divided by the rank; a finite direct sum of stable bundles with the same slope is called polystable. A correspondence of this type is usually known as a Hitchin–Kobayashi correspondence.
A Hitchin–Kobayashi correspondence, where some extra structure is added to the bundle $E$, appears in the theory of Higgs bundles, consisting of pairs $(E, \Phi)$ formed by a holomorphic vector bundle $E$ and a morphism $\Phi : E \to E \otimes \Omega$, where $\Omega$ is the sheaf of holomorphic differentials (sometimes the condition $\Phi \wedge \Phi = 0$ is added as part of the definition). Higgs bundles were first studied by Hitchin [H] (when $X$ is a compact Riemann surface), and Simpson [S] (when $X$ is higher dimensional), who introduced a natural gauge equation for them, and proved a Hitchin–Kobayashi correspondence. Higgs bundles are twisted quiver bundles, for a quiver formed by one vertex and one arrow whose head and tail coincide, and the twisting bundle is the holomorphic tangent bundle (i.e. the dual to $\Omega$).

Another class of quiver bundles are holomorphic triples $(E_1, E_2, \Phi)$, consisting of two holomorphic bundles $E_1$ and $E_2$, and a morphism $\Phi : E_2 \to E_1$. The quiver has two vertices, say 1 and 2, and one arrow $a : 2 \to 1$ (the twisting sheaf is $\mathcal{O}_X$). The corresponding equations are called the coupled vortex equations [G2, BG]. When $E_2 = \mathcal{O}_X$, holomorphic triples are holomorphic pairs $(E, \Phi)$, where $E$ is a bundle and $\Phi \in H^0(X, E)$ (cf. [B]).

There are other examples of quiver vortex equations that come out naturally from the study of the moduli of solutions to the Higgs bundle equation. Combining a theorem of Donaldson and Corlette [D3, C] with the Hitchin–Kobayashi correspondence for Higgs bundles [H, S], one has that the set of isomorphism classes of semisimple complex representations of the fundamental group of $X$ in $\mathrm{GL}(r, \mathbb{C})$ is in bijection with the moduli space of polystable Higgs bundles of rank $r$ with vanishing Chern classes. When $X$ is a compact Riemann surface, this generalizes a theorem of Narasimhan and Seshadri [NS], which provides an interpretation of the unitary representations of the fundamental group as degree zero polystable vector bundles, up to isomorphism.
Now, if $X$ is a compact Riemann surface of genus $g \geq 2$, the Morse methods introduced by Hitchin [H] reduce the study of the topology of the moduli space $\mathcal{M}$ of Higgs bundles to the study of the topology of the moduli space of complex variations of the Hodge structure (the critical points of the Morse function in this case). These are twisted quiver bundles, called twisted holomorphic chains, for a quiver whose vertex set is the set $\mathbb{Z}$ of integers, and whose arrows are $a_i : i \to i+1$, for each $i \in \mathbb{Z}$; the twisting bundle associated to each arrow is the holomorphic tangent bundle. The twisted holomorphic chains that appear in these critical submanifolds are polystable for particular values of the stability parameters. Using Morse theory, Hitchin [H] computed the Poincaré polynomial of $\mathcal{M}$ for the rank 2 case. Gothen [Go] obtained similar results for rank 3: the critical submanifolds are moduli spaces of stable twisted holomorphic chains formed by a line bundle and a rank 2 bundle (i.e. twisted holomorphic triples), and by three line bundles. To use these methods for higher rank, one needs to study moduli spaces of other twisted holomorphic chains. A possible strategy is to proceed as in [Th], studying the moduli space of twisted holomorphic chains in the whole parameter space.

Another interesting type of quiver bundle arises in the study of semisimple representations of the fundamental group of $X$ in $\mathrm{U}(p, q)$, the unitary group for a hermitian inner product of indefinite signature. Here, the quiver has two vertices, say 1 and 2, and two arrows, $a : 1 \to 2$ and $b : 2 \to 1$, and the twisting bundle associated to each arrow is the holomorphic tangent bundle. These are studied in [BGG1, BGG2].

Another context in which quiver bundles appear naturally is in the study of dimensional reductions of the Hermitian–Einstein equation over the product of a Kähler manifold $X$ and a flag manifold. In this case, the parabolic subgroup defining the flag manifold entirely determines the structure of the quiver [AG1, AG2]. The dimensional reduction for this kind of manifold has provided insight into the general theory of quiver bundles, and was actually the first method used to prove a Hitchin–Kobayashi correspondence for holomorphic triples [G2, BG], holomorphic chains [AG1], and quiver bundles for more general quivers with relations [AG2]. In these examples, the quiver bundles are not twisted; however, there are other examples for which a generalization of the method of dimensional reduction has produced twisted holomorphic triples [BGK1, BGK2].
An important feature of the stability of quiver sheaves is that it generally depends on several real parameters. When X is an algebraic variety, the ranks and degrees appearing in the numerical condition defining the stability criterion are integral, and the parameter space is partitioned into chambers. Strictly semistable quiver sheaves can occur when the parameters are on a wall separating the chambers, and the stability condition only depends on the chamber in which the parameters are. In the case of holomorphic triples [BG], there is a chamber (actually an interval in R) where the stability of the triple is related to the stability of the bundles. This can be used to obtain existence theorems for stable triples when the parameters are in this chamber, while the methods of [Th] can be used to prove existence results for other chambers (see [BGG2] for recent work in the case of triples). The geography of the resulting convex polytope for other quivers is an interesting issue to which we wish to return in a future paper. To approach this problem, one should study the homological algebra of quiver bundles. This has been developed by Gothen and King in a paper [GK] that appeared after we submitted this paper. When the manifold X is a point, a quiver bundle is just a quiver module (over C; cf. e.g. [ARS]). For arbitrary X, a quiver bundle can be regarded as a family of quiver modules (the fibres of the quiver bundle), parametrized by X. One can thus transfer to our setting many constructions of the theory of quiver modules. In the last part of the paper we introduce a more algebraic point of view by considering the path algebra bundle of the twisted quiver and looking at twisted quiver bundles as locally free modules over this bundle of algebras. This point of view is inspired by a similar construction for quiver modules [ARS], and suggests a generalization to other algebras that appear naturally in other problems. 
This is something to which we plan to come back in the future.

The Hitchin–Kobayashi correspondence for quiver bundles combines in one theory two different versions, in some sense, of the theorem of Kempf and Ness [KN] identifying the symplectic quotient of a projective variety by a compact Lie group action with the geometric invariant theory quotient. The first one is the classical Hitchin–Kobayashi correspondence for vector bundles, and the second one occurs when the manifold $X$ is a point, in which case the equations and the stability condition reduce to the moment map equations and the stability condition for quiver modules introduced by King [K]. As we prove in Theorem 4.1, there is in fact a very tight relation between the quiver vortex equations and the moment map equations for quiver modules: when the twisting sheaves are $\mathcal{O}_X$ and the bundles have vanishing Chern classes, the existence of solutions to the quiver vortex equations is equivalent to the existence of flat metrics on the bundles which fibrewise satisfy the moment map equations for quiver modules.

1. Twisted Quiver Bundles

In this section we define the basic objects that we shall study: twisted quiver bundles and twisted quiver sheaves. They are representations of quivers in the categories of holomorphic vector bundles and coherent sheaves, respectively, twisted by some fixed holomorphic vector bundles, as explained in §1.2. Thus, many results about quiver modules, i.e. quiver representations in the category of vector spaces, can be transferred to our setting. A good reference for quivers and their linear representations is [ARS].

1.1. Quivers. A quiver, or directed graph, is a pair of sets $Q = (Q_0, Q_1)$ together with two maps $h, t : Q_1 \to Q_0$. The elements of $Q_0$ (resp. $Q_1$) are called the vertices (resp. arrows) of the quiver. For each arrow $a \in Q_1$, the vertex $ta$ (resp. $ha$) is called the tail (resp. head) of the arrow $a$. The arrow $a$ is sometimes represented by $a : v \to v'$ when $v = ta$ and $v' = ha$.

1.2. Twisted quiver sheaves and bundles. Throughout this paper, $X$ is a connected compact Kähler manifold, $Q$ is a quiver, and $M$ is a collection of finite rank locally free sheaves $M_a$ on $X$, for each arrow $a \in Q_1$. By a sheaf on $X$, we shall mean an analytic sheaf of $\mathcal{O}_X$-modules. Our basic objects are given by the following:

Definition 1.1. An $M$-twisted $Q$-sheaf on $X$ is a pair $R = (E, \phi)$, where $E$ is a collection of coherent sheaves $E_v$ on $X$, for each $v \in Q_0$, and $\phi$ is a collection of morphisms $\phi_a : E_{ta} \otimes M_a \to E_{ha}$, for each $a \in Q_1$, such that $E_v = 0$ for all but finitely many $v \in Q_0$, and $\phi_a = 0$ for all but finitely many $a \in Q_1$.

Remark 1.1. Given a quiver $Q = (Q_0, Q_1)$, as defined in §1.1, the sets $Q_0$ and $Q_1$ can be infinite, but for each $M$-twisted $Q$-sheaf $R = (E, \phi)$, the subset $Q'_0 \subset Q_0$ of vertices $v$ such that $E_v \neq 0$, and the subset $Q'_1 \subset Q_1$ of arrows $a$ such that $\phi_a \neq 0$, are both finite. Thus, to any $M$-twisted $Q$-sheaf $R = (E, \phi)$, we can associate the subquiver $Q' = (Q'_0, Q'_1)$ of $Q$, and $R$ can be seen as an $M'$-twisted $Q'$-sheaf, where $Q'_0$, $Q'_1$ are finite sets, and $M' \subset M$ is the collection of sheaves $M_a$ with $a \in Q'_1$.

As usual, we identify a holomorphic vector bundle $E$ with the locally free sheaf of sections of $E$. Accordingly, a holomorphic $M$-twisted $Q$-bundle is an $M$-twisted $Q$-sheaf $R = (E, \phi)$ such that the sheaf $E_v$ is a holomorphic vector bundle, for each $v \in Q_0$. For the sake of brevity, in the following the terms “$Q$-sheaf” or “$Q$-bundle” are to be understood as “$M$-twisted $Q$-sheaf” or “$M$-twisted $Q$-bundle”, respectively, often suppressing the adjective “$M$-twisted”.
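As an illustration of Definition 1.1, the two examples mentioned in the Introduction (Higgs bundles and holomorphic triples) can be written out explicitly as twisted quiver bundles. The following is only a sketch: the vertex label $v$, the symbol $T_X$ for the holomorphic tangent bundle, and the symbols $\Omega$ and $\Phi$ for the sheaf of holomorphic differentials and the Higgs field are notational assumptions made here for illustration.

```latex
% Illustration only (assumed notation: T_X, \Omega, \Phi, vertex label v).
% (a) Higgs bundles: one vertex, one arrow whose head and tail coincide,
%     twisted by the holomorphic tangent bundle T_X = \Omega^{*}.
\[
  Q_0=\{v\},\qquad Q_1=\{a\},\qquad ta=ha=v,\qquad M_a=T_X=\Omega^{*},
\]
\[
  \phi_a : E_v\otimes T_X \longrightarrow E_v
  \quad\Longleftrightarrow\quad
  \Phi : E_v \longrightarrow E_v\otimes\Omega ,
\]
% using the natural isomorphism Hom(E \otimes T_X, E) = Hom(E, E \otimes \Omega).

% (b) Holomorphic triples: two vertices, one arrow 2 -> 1, trivial twisting sheaf.
\[
  Q_0=\{1,2\},\qquad Q_1=\{a\},\qquad ta=2,\quad ha=1,\qquad M_a=\mathcal{O}_X,
\]
\[
  \phi_a : E_2\otimes\mathcal{O}_X \cong E_2 \longrightarrow E_1 .
\]
```

In both cases the finiteness conditions of Definition 1.1 hold trivially, since the quiver itself is finite.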


A morphism $f : R \to R'$ between two $Q$-sheaves $R = (E, \phi)$, $R' = (E', \phi')$ is given by a collection of morphisms $f_v : E_v \to E'_v$, for each $v \in Q_0$, such that $\phi'_a \circ (f_v \otimes \mathrm{id}_{M_a}) = f_{v'} \circ \phi_a$, for each arrow $a : v \to v'$ in $Q$. If $f : R \to R'$ and $g : R' \to R''$ are two morphisms between $Q$-sheaves $R = (E, \phi)$, $R' = (E', \phi')$, $R'' = (E'', \phi'')$, then the composition $g \circ f$ is defined as the collection of composed morphisms $g_v \circ f_v : E_v \to E''_v$, for each $v \in Q_0$. We have thus defined the category of $M$-twisted $Q$-sheaves on $X$, which is abelian. Important concepts in relation to stability and semistability (defined in §2.3) are the notions of $Q$-subsheaves and quotient $Q$-sheaves, as well as indecomposable and simple $Q$-sheaves. They are defined as for any abelian category. In particular, an $M$-twisted $Q$-subsheaf of $R = (E, \phi)$ is another $M$-twisted $Q$-sheaf $R' = (E', \phi')$ such that $E'_v \subset E_v$, for each $v \in Q_0$, $\phi_a(E'_{ta} \otimes M_a) \subset E'_{ha}$, for each $a \in Q_1$, and $\phi'_a : E'_{ta} \otimes M_a \to E'_{ha}$ is the restriction of $\phi_a$ to $E'_{ta} \otimes M_a$, for each $a \in Q_1$.

2. Gauge Equations and Stability

2.1. Gauge equations. Throughout this paper, given a smooth bundle $E$ on $X$, $\Omega^k(E)$ (resp. $\Omega^{i,j}(E)$) is the space of smooth $E$-valued complex $k$-forms (resp. $(i,j)$-forms) on $X$, $\omega$ is a fixed Kähler form on $X$, and $\Lambda : \Omega^{i,j}(E) \to \Omega^{i-1,j-1}(E)$ is contraction with $\omega$ (we use the same notation as e.g. in [D1]). The gauge equations will also depend on a collection $q$ of hermitian metrics $q_a$ on $M_a$, for each $a \in Q_1$, which we fix once and for all. Let $R = (E, \phi)$ be a holomorphic $M$-twisted $Q$-bundle on $X$. A hermitian metric on $R$ is a collection $H$ of hermitian metrics $H_v$ on $E_v$, for each $v \in Q_0$ with $E_v \neq 0$. To define the gauge equations on $R$, we note that $\phi_a : E_{ta} \otimes M_a \to E_{ha}$ has a smooth adjoint morphism $\phi_a^{*H_a} : E_{ha} \to E_{ta} \otimes M_a$ with respect to the hermitian metrics $H_{ta} \otimes q_a$ on $E_{ta} \otimes M_a$, and $H_{ha}$ on $E_{ha}$, for each $a \in Q_1$, so it makes sense to consider the composition $\phi_a \circ \phi_a^{*H_a} : E_{ha} \to E_{ta} \otimes M_a \to E_{ha}$.
Moreover, $\phi_a$ and $\phi_a^{*H_a}$ can be seen as morphisms $\phi_a : E_{ta} \to E_{ha} \otimes M_a^*$ and $\phi_a^{*H_a} : E_{ha} \otimes M_a^* \to E_{ta}$, so $\phi_a^{*H_a} \circ \phi_a : E_{ta} \to E_{ta}$ makes sense too.

Definition 2.1. Let $\sigma$ and $\tau$ be collections of real numbers $\sigma_v$, $\tau_v$, with $\sigma_v$ positive, for each $v \in Q_0$. A hermitian metric $H$ satisfies the $M$-twisted quiver $(\sigma, \tau)$-vortex equations if

\[
\sigma_v \sqrt{-1}\,\Lambda F_{H_v} + \sum_{a \in h^{-1}(v)} \phi_a \circ \phi_a^{*H_a} - \sum_{a \in t^{-1}(v)} \phi_a^{*H_a} \circ \phi_a = \tau_v\, \mathrm{id}_{E_v}, \tag{1}
\]

for each $v \in Q_0$ such that $E_v \neq 0$, where $F_{H_v}$ is the curvature of the Chern connection $A_{H_v}$ associated to the metric $H_v$ on the holomorphic vector bundle $E_v$, for each $v \in Q_0$ with $E_v \neq 0$.

2.2. Moment map interpretation. The twisted quiver vortex equations appear as a symplectic reduction condition, as we explain now. Let $E$ be a collection of smooth vector bundles $E_v$, for each $v \in Q_0$, with $E_v = 0$ for all but finitely many $v \in Q_0$. By removing the vertices $v \in Q_0$ with $E_v = 0$ and all but finitely many arrows $a \in Q_1$, we obtain a finite subquiver, which we still call $Q = (Q_0, Q_1)$, such that $E_v \neq 0$ for each $v \in Q_0$ (see Remark 1.1). Let $H_v$ be a hermitian metric on $E_v$, for each $v \in Q_0$. Let $\mathcal{A}_v$ and $\mathcal{G}_v$ be the corresponding spaces of unitary connections and their unitary gauge groups,


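and let $\mathcal{A}_v^{1,1} \subset \mathcal{A}_v$ be the space of unitary connections $A_v$ with $(\bar\partial_{A_v})^2 = 0$, for each $v \in Q_0$. The group

\[
\mathcal{G} = \prod_{v \in Q_0} \mathcal{G}_v
\]

acts on the space $\mathcal{A}$ of unitary connections, and on the representation space $\Omega^0$, defined by

\[
\mathcal{A} = \prod_{v \in Q_0} \mathcal{A}_v, \qquad \Omega^0 = \Omega^0(\mathcal{R}(Q, E)), \quad \text{with } \mathcal{R}(Q, E) = \bigoplus_{a \in Q_1} \mathrm{Hom}(E_{ta} \otimes M_a, E_{ha}), \tag{2}
\]

where $\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha})$ is the vector bundle of homomorphisms $E_{ta} \otimes M_a \to E_{ha}$. An element $g \in \mathcal{G}$ is a collection of group elements $g_v \in \mathcal{G}_v$, for each $v \in Q_0$, and an element $A \in \mathcal{A}$ (resp. $\phi \in \Omega^0$) is a collection of unitary connections $A_v \in \mathcal{A}_v$ (resp. smooth morphisms $\phi_a : E_{ta} \otimes M_a \to E_{ha}$), for each $v \in Q_0$ (resp. $a \in Q_1$). The $\mathcal{G}$-actions on $\mathcal{A}$ and $\Omega^0$ are $\mathcal{G} \times \mathcal{A} \to \mathcal{A}$, $(g, A) \mapsto A' = g \cdot A$, with $d_{A'_v} = g_v \circ d_{A_v} \circ g_v^{-1}$, for each $v \in Q_0$; and $\mathcal{G} \times \Omega^0 \to \Omega^0$, $(g, \phi) \mapsto \phi' = g \cdot \phi$, with $\phi'_a = g_{ha} \circ \phi_a \circ (g_{ta}^{-1} \otimes \mathrm{id}_{M_a})$, for each $a \in Q_1$, respectively. The induced $\mathcal{G}$-action on the product $\mathcal{A} \times \Omega^0$ leaves invariant the subset $\mathcal{N}$ of pairs $(A, \phi)$ such that $A_v \in \mathcal{A}_v^{1,1}$, for each $v \in Q_0$, and $\phi_a : E_{ta} \otimes M_a \to E_{ha}$ is holomorphic with respect to $\bar\partial_{A_{ta}}$ and $\bar\partial_{A_{ha}}$, for each $a \in Q_1$. Let $\omega_v$ be the $\mathcal{G}_v$-invariant symplectic form on $\mathcal{A}_v$, for each $v \in Q_0$, as given in [AB] for a compact Riemann surface, or e.g. in [DK, Prop. 6.5.8] for any compact Kähler manifold, that is,

\[
\omega_v(\xi_v, \eta_v) = \int_X \Lambda\, \mathrm{tr}(\xi_v \wedge \eta_v), \quad \text{for } \xi_v, \eta_v \in \Omega^1(\mathrm{ad}(E_v)),
\]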

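where $\mathrm{ad}(E_v)$ is the vector bundle of $H_v$-antiselfadjoint endomorphisms of $E_v$. The corresponding moment map $\mu_v : \mathcal{A}_v \to (\mathrm{Lie}\,\mathcal{G}_v)^*$ is given by $\mu_v(A_v) = \Lambda F_{A_v}$ (we use implicitly the inclusion of $\mathrm{Lie}\,\mathcal{G}_v$ in its dual space by means of the metric $H_v$ on $E_v$). The symplectic form $\omega_{\mathcal{R}}$ on $\Omega^0$ associated to the $L^2$-metric induced by the hermitian metrics on the spaces $\Omega^0(\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha}))$ is $\mathcal{G}$-invariant, and has associated moment map $\mu_{\mathcal{R}} : \Omega^0 \to (\mathrm{Lie}\,\mathcal{G})^*$ given by $\mu_{\mathcal{R}} = \sum_{v \in Q_0} \mu_{\mathcal{R},v}$, with $\mu_{\mathcal{R},v} : \Omega^0 \to \mathrm{Lie}\,\mathcal{G}_v \subset \mathrm{Lie}\,\mathcal{G} \subset (\mathrm{Lie}\,\mathcal{G})^*$ given by

\[
\sqrt{-1}\, \mu_{\mathcal{R},v}(\phi) = \sum_{a \in h^{-1}(v)} \phi_a \circ \phi_a^{*H_a} - \sum_{a \in t^{-1}(v)} \phi_a^{*H_a} \circ \phi_a, \quad \text{for } \phi \in \Omega^0, \tag{3}
\]

(this follows as in [K, §6], which considers the action of a unitary group on a representation space of quiver modules). Given a collection $\sigma$ of real numbers $\sigma_v > 0$, for each $v \in Q_0$, $\sum_{v \in Q_0} \sigma_v \omega_v + \omega_{\mathcal{R}}$ is obviously a $\mathcal{G}$-invariant symplectic form on $\mathcal{A} \times \Omega^0$. A moment map for this symplectic form is $\mu_\sigma = \sum_{v \in Q_0} \sigma_v \mu_v + \mu_{\mathcal{R}}$, where we are omitting pull-backs to $\mathcal{A} \times \Omega^0$ in the notation. Any collection $\tau$ of real numbers $\tau_v$, for each $v \in Q_0$, defines an element $\sqrt{-1}\, \tau \cdot \mathrm{id} = \sqrt{-1} \sum_{v \in Q_0} \tau_v\, \mathrm{id}_{E_v}$ in the center of $\mathrm{Lie}\,\mathcal{G}$. The points of the symplectic reduction $\mu_\sigma^{-1}(-\sqrt{-1} \cdot \tau)/\mathcal{G}$ are precisely the orbits of pairs $(A, \phi)$ such that the hermitian metric $H$ satisfies the $M$-twisted quiver $(\sigma, \tau)$-vortex equations on the corresponding holomorphic quiver bundle $R = (E, \phi)$. Thus, Definition 2.1 picks up the points of $\mu_\sigma^{-1}(-\sqrt{-1}\, \tau)$ in the Kähler submanifold (outside its singularities) $\mathcal{N}$. For convenience in the Hitchin–Kobayashi correspondence, it is formulated in terms of hermitian metrics.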
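To see what the equations (1) look like in the simplest situations discussed in the Introduction, one can specialize them as follows. This is a sketch for orientation only: the vertex labels $1, 2$ for the triple quiver follow the Introduction, the twisting sheaves are taken to be trivial in cases (a) and (c), and $\Lambda$ denotes contraction with the Kähler form as in §2.1.

```latex
% Illustrative special cases of the quiver (sigma,tau)-vortex equations (1).

% (a) X is a point: all curvature terms vanish and (1) reduces to the
%     moment map equations for quiver modules (cf. King [K]).
\[
  \sum_{a\in h^{-1}(v)} \phi_a\circ\phi_a^{*H_a}
  \;-\; \sum_{a\in t^{-1}(v)} \phi_a^{*H_a}\circ\phi_a
  \;=\; \tau_v\,\mathrm{id}_{E_v},
  \qquad v\in Q_0 .
\]

% (b) One vertex, no arrows: the sums are empty and (1) is the
%     Hermitian--Einstein equation with constant tau/sigma.
\[
  \sqrt{-1}\,\Lambda F_{H} \;=\; \frac{\tau}{\sigma}\,\mathrm{id}_{E} .
\]

% (c) Holomorphic triples: one arrow a : 2 -> 1, so h^{-1}(1) = {a},
%     t^{-1}(1) is empty, h^{-1}(2) is empty, t^{-1}(2) = {a}.
\[
  \sigma_1\sqrt{-1}\,\Lambda F_{H_1} + \phi_a\circ\phi_a^{*H_a} = \tau_1\,\mathrm{id}_{E_1},
  \qquad
  \sigma_2\sqrt{-1}\,\Lambda F_{H_2} - \phi_a^{*H_a}\circ\phi_a = \tau_2\,\mathrm{id}_{E_2} .
\]
```

Case (c) has the shape of the coupled vortex equations of [G2, BG], with the additional parameters $\sigma_1, \sigma_2$ multiplying the curvature terms.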


2.3. Stability. To define stability, we need some preliminaries and notation. Let $n$ be the complex dimension of $X$. Given a torsion-free coherent sheaf $E$ on $X$, the double dual sheaf $\det(E)^{**}$ is a holomorphic line bundle, and we define the first Chern class $c_1(E)$ of $E$ as the first Chern class of $\det(E)^{**}$. The degree of $E$ is the real number

\[
\deg(E) = \frac{2\pi}{\mathrm{Vol}(X)}\, \frac{1}{(n-1)!}\, \big\langle c_1(E) \smile [\omega^{n-1}], [X] \big\rangle,
\]

where $\mathrm{Vol}(X)$ is the volume of $X$, $[\omega^{n-1}]$ is the cohomology class of $\omega^{n-1}$, and $[X]$ is the fundamental class of $X$. Note that the degree depends on the cohomology class of $\omega$. Given a holomorphic vector bundle $E$ on $X$, by Chern–Weil theory, its degree equals

\[
\deg(E) = \frac{1}{\mathrm{Vol}(X)} \int_X \mathrm{tr}(\sqrt{-1}\, \Lambda F_H),
\]

where $F_H$ is the curvature of the Chern connection associated to a hermitian metric $H$ on $E$. Let $Q$ be a quiver, and $\sigma$, $\tau$ be collections of real numbers $\sigma_v$, $\tau_v$, with $\sigma_v > 0$, for each $v \in Q_0$; $\sigma$ and $\tau$ are called the stability parameters. Let $R = (E, \phi)$ be a $Q$-sheaf on $X$.

Definition 2.2. The $(\sigma, \tau)$-degree and $(\sigma, \tau)$-slope of $R$ are

\[
\deg_{\sigma,\tau}(R) = \sum_{v \in Q_0} \big( \sigma_v \deg(E_v) - \tau_v\, \mathrm{rk}(E_v) \big), \qquad \mu_{\sigma,\tau}(R) = \frac{\deg_{\sigma,\tau}(R)}{\sum_{v \in Q_0} \sigma_v\, \mathrm{rk}(E_v)},
\]

respectively. The $Q$-sheaf $R$ is called $(\sigma, \tau)$-(semi)stable if for all proper $Q$-subsheaves $R'$ of $R$, $\mu_{\sigma,\tau}(R') < (\leq)\ \mu_{\sigma,\tau}(R)$. A $(\sigma, \tau)$-polystable $Q$-sheaf is a finite direct sum of $(\sigma, \tau)$-stable $Q$-sheaves, all of them with the same $(\sigma, \tau)$-slope. As for coherent sheaves, one can prove that any $(\sigma, \tau)$-stable $Q$-sheaf is simple, i.e. its only endomorphisms are the multiples of the identity.

Remarks 2.1.
(i) If a holomorphic $Q$-bundle $R$ admits a hermitian metric satisfying the $(\sigma, \tau)$-vortex equations, then taking traces in (1), summing for $v \in Q_0$, and integrating over $X$, we see that the parameters $\sigma, \tau$ are constrained by $\deg_{\sigma,\tau}(R) = 0$.
(ii) If we transform the parameters $\sigma, \tau$, multiplying by a global constant $c > 0$, obtaining $\sigma' = c\sigma$, $\tau' = c\tau$, then $\mu_{\sigma',\tau'}(R) = \mu_{\sigma,\tau}(R)$. Furthermore, if we transform the parameters $\tau$ by $\tau'_v = \tau_v + d\sigma_v$ for some $d \in \mathbb{R}$, and let $\sigma' = \sigma$, then $\mu_{\sigma',\tau'}(R) = \mu_{\sigma,\tau}(R) - d$. Since the stability condition does not change under these two kinds of transformations, the “effective” number of stability parameters of a quiver sheaf $R = (E, \phi)$ is $2N(R) - 2$, where $N(R)$ is the (finite) number of vertices $v \in Q_0$ with $E_v \neq 0$. From the point of view of the vortex equations (1), the first type of transformations, $\sigma' = c\sigma$, $\tau' = c\tau$, corresponds to a redefinition of the sections $\phi' = c^{1/2} \phi$ (note that the stability condition is invariant under this transformation), while the second type corresponds to the constraint $\deg_{\sigma,\tau}(R) = 0$ in (i).
(iii) As usual with stability criteria, in Definition 2.2, to check $(\sigma, \tau)$-stability of a $Q$-sheaf $R$, it suffices to consider $Q$-subsheaves $R' \subset R$ such that $E'_v \subset E_v$ is saturated, i.e. such that the quotient $E_v / E'_v$ is torsion-free, for each $v \in Q_0$.
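The computation behind Remarks 2.1(i) can be spelled out as follows. This is a sketch under the conventions of §2.1 and §2.3 (the finite quiver of §2.2, integration against the Kähler volume form); the only ingredient not stated explicitly in the text is the elementary identity $\mathrm{tr}(\phi_a \circ \phi_a^{*H_a}) = \mathrm{tr}(\phi_a^{*H_a} \circ \phi_a)$.

```latex
% Sketch of Remarks 2.1(i): taking traces in (1), summing over v, integrating.
% Step 1: trace of (1) at a vertex v.
\[
  \sigma_v\,\mathrm{tr}\big(\sqrt{-1}\,\Lambda F_{H_v}\big)
  + \sum_{a\in h^{-1}(v)} \mathrm{tr}\big(\phi_a\circ\phi_a^{*H_a}\big)
  - \sum_{a\in t^{-1}(v)} \mathrm{tr}\big(\phi_a^{*H_a}\circ\phi_a\big)
  = \tau_v\,\mathrm{rk}(E_v).
\]
% Step 2: sum over v. Each arrow a occurs exactly once in the first sum
% (at v = ha) and exactly once in the second (at v = ta), and the two
% traces agree, so the phi-terms cancel.
\[
  \sum_{v\in Q_0}\sigma_v\,\mathrm{tr}\big(\sqrt{-1}\,\Lambda F_{H_v}\big)
  = \sum_{v\in Q_0}\tau_v\,\mathrm{rk}(E_v).
\]
% Step 3: integrate over X and use the Chern--Weil formula for deg(E_v).
\[
  \mathrm{Vol}(X)\sum_{v\in Q_0}\sigma_v\deg(E_v)
  = \mathrm{Vol}(X)\sum_{v\in Q_0}\tau_v\,\mathrm{rk}(E_v)
  \quad\Longrightarrow\quad
  \deg_{\sigma,\tau}(R)=0 .
\]
% Remarks 2.1(ii), for comparison: with tau'_v = tau_v + d sigma_v,
\[
  \deg_{\sigma,\tau'}(R) = \deg_{\sigma,\tau}(R) - d\sum_{v\in Q_0}\sigma_v\,\mathrm{rk}(E_v)
  \quad\Longrightarrow\quad
  \mu_{\sigma,\tau'}(R) = \mu_{\sigma,\tau}(R) - d .
\]
```

The last line shows why the shift $\tau'_v = \tau_v + d\sigma_v$ can always be used to normalize $\deg_{\sigma,\tau}(R) = 0$ without affecting stability, since the same shift $d$ is subtracted from the slope of every $Q$-subsheaf.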


3. Hitchin–Kobayashi Correspondence

In this section we prove a Hitchin–Kobayashi correspondence between the twisted quiver vortex equations and the stability condition for holomorphic twisted quiver bundles:

Theorem 3.1. Let $\sigma$ and $\tau$ be collections of real numbers $\sigma_v$ and $\tau_v$, respectively, with $\sigma_v > 0$, for each $v \in Q_0$. Let $R = (E, \phi)$ be a holomorphic $M$-twisted $Q$-bundle such that $\deg_{\sigma,\tau}(R) = 0$. Then $R$ is $(\sigma, \tau)$-polystable if and only if it admits a hermitian metric $H$ satisfying the quiver $(\sigma, \tau)$-vortex equations (1). This hermitian metric $H$ is unique up to an automorphism of the $Q$-bundle, i.e. up to multiplication by a constant $\lambda_j > 0$ for each $(\sigma, \tau)$-stable summand $R_j$ of $R = R_1 \oplus \cdots \oplus R_l$.

Remark 3.1. This theorem generalizes previous theorems, mainly the Donaldson–Uhlenbeck–Yau theorem [D1, D2, UY], the Hitchin–Kobayashi correspondence for Higgs bundles [H, S], holomorphic triples and chains [AG1, BG], twisted holomorphic triples [BGK2], etc. It should be mentioned that Theorem 3.1 does not follow from the general theorems proved in [Ba, M] for the following two reasons. First, the symplectic form $\sum_{v \in Q_0} \sigma_v \omega_v + \omega_{\mathcal{R}}$ on $\mathcal{A} \times \Omega^0$ (cf. §2.2) has been deformed by the parameters $\sigma$ whenever $\sigma_v \neq \sigma_{v'}$ for some $v, v' \in Q_0$; as a matter of fact, the vortex equations (1) depend on new parameters even for holomorphic triples or chains [AG1, BG], hence generalizing their Hitchin–Kobayashi correspondences (in the case of a holomorphic pair $(E, \phi)$, consisting of a holomorphic vector bundle $E$ and a holomorphic section $\phi \in H^0(X, E)$, as considered in [B], which can be understood as a holomorphic triple $\phi : \mathcal{O}_X \to E$, the new parameter can actually be absorbed in $\phi$, so no new parameters are really present). Second, the twisting bundles $M_a$, for $a \in Q_1$, are not considered in [Ba, M]. Our method of proof combines the moment map techniques developed in [B, D2, S, UY] for bundles with a proof of a similar correspondence for quiver modules in [K, §6].

3.1. Preliminaries and general notation. Throughout Sect. 3, $R = (E, \phi)$ is a fixed holomorphic ($M$-twisted) $Q$-bundle with $\deg_{\sigma,\tau}(R) = 0$. To prove Theorem 3.1, we can assume that $Q = (Q_0, Q_1)$ is a finite quiver, with $E_v \neq 0$, for $v \in Q_0$, and $\phi_a \neq 0$, for $a \in Q_1$ (if this is not the case, we remove the vertices $v$ with $E_v = 0$, and the arrows $a$ with $\phi_a = 0$, see Remark 1.1). The technical details of the proof are greatly simplified by introducing the following notation. Unless otherwise stated, $v, v', \ldots$ (resp. $a, a', \ldots$) stand for elements of $Q_0$ (resp. $Q_1$), while sums, direct sums and products in $v, v', \ldots$ (resp. $a, a', \ldots$) are over elements of $Q_0$ (resp. $Q_1$). Thus, the condition $\deg_{\sigma,\tau}(R) = 0$ is equivalent to $\sum_v \sigma_v \deg(E_v) = \sum_v \tau_v\, \mathrm{rk}(E_v)$. Let

\[
E = \oplus_v E_v; \tag{4}
\]

a vector $u$ in the fibre $E_x$ over $x \in X$ is a collection of vectors $u_v$ in the fibre $E_{v,x}$ over $x$, for each $v \in Q_0$. Let $\bar\partial_{E_v} : \Omega^0(E_v) \to \Omega^{0,1}(E_v)$ be the $\bar\partial$-operator of the holomorphic vector bundle $E_v$, and let

\[
\bar\partial_E = \oplus_v \bar\partial_{E_v} \tag{5}
\]

be the induced $\bar\partial$-operator on $E$. A hermitian metric $H_v$ on $E_v$ defines a unique Chern connection $A_{H_v}$ compatible with the holomorphic structure $\bar\partial_{E_v}$; the corresponding covariant


derivative is $d_{H_v} = \partial_{H_v} + \bar\partial_{E_v}$, where $\partial_{H_v} : \Omega^0(E_v) \to \Omega^{1,0}(E_v)$ is its $(1,0)$-part. Thus, given $u \in \Omega^{i,j}(E)$, $\bar\partial_E(u) \in \Omega^{i,j+1}(E) = \oplus_v \Omega^{i,j+1}(E_v)$ is the collection of $E_v$-valued $(i, j+1)$-forms $(\bar\partial_E(u))_v = \bar\partial_{E_v}(u_v)$, for each $v \in Q_0$.

3.1.1. Metrics and associated bundles. Let $\mathrm{Met}_v$ be the space of hermitian metrics on $E_v$. A hermitian metric $(\cdot, \cdot)_{H_v}$ on $E_v$ is determined by a smooth morphism $H_v : E_v \to E_v^*$, by $(u_v, u'_v)_{H_v} = H_v(u_v)(u'_v)$, with $u_v, u'_v$ in the same fibre of $E_v$. The right action of the complex gauge group $\mathcal{G}_v^c$ on $\mathrm{Met}_v$ is given, by means of this correspondence, by $\mathrm{Met}_v \times \mathcal{G}_v^c \to \mathrm{Met}_v$, $(H_v, g_v) \mapsto H_v \circ g_v$. Let $S_v(H_v)$ be the space of $H_v$-selfadjoint smooth endomorphisms of $E_v$, for each $H_v \in \mathrm{Met}_v$. We choose a fixed hermitian metric $K_v \in \mathrm{Met}_v$ such that the hermitian metric $\det(K_v)$ induced by $K_v$ on the determinant bundle $\det(E_v)$ satisfies $\sqrt{-1}\, \Lambda F_{\det(K_v)} = \deg(E_v)$, for each $v \in Q_0$ (such a hermitian metric $K_v$ exists by Hodge theory). Any other metric on $E_v$ is given by $H_v = K_v e^{s_v}$ for some $s_v \in S_v$, or equivalently, by $(u_v, u'_v)_{H_v} = (e^{s_v} u_v, u'_v)_{K_v}$, where $S_v = S_v(K_v)$.

Let $\mathrm{Met}$ be the space of hermitian metrics on $E$ such that the direct sum $E = \oplus_v E_v$ is orthogonal. A metric $H \in \mathrm{Met}$ is given by a collection of metrics $H_v \in \mathrm{Met}_v$, by $(u, u')_H = \sum_v (u_v, u'_v)_{H_v}$. Let $S(H) = \oplus_v S_v(H_v)$, for each $H \in \mathrm{Met}$, and $S = S(K) = \oplus_v S_v$. A vector $s \in S(H)$ is given by a collection of vectors $s_v \in S_v(H_v)$, for each $v \in Q_0$, while a metric $H \in \mathrm{Met}$ is given by $H = K e^s$ for some $s \in S$, i.e. $H_v = K_v e^{s_v}$. The (fibrewise) norm on $E_v$ (resp. $E$) corresponding to $H_v$ (resp. $H$) is given by $|u_v|_{H_v} = (u_v, u_v)_{H_v}^{1/2}$ (resp. $|u|_H = (u, u)_H^{1/2}$). The corresponding $L^2$-metric and $L^2$-norm on the space of sections of $E_v$ (resp. $E$) are defined by

\[
(u_v, u'_v)_{L^2, H_v} = \int_X (u_v, u'_v)_{H_v}, \qquad \|u_v\|_{L^2, H_v} = (u_v, u_v)_{L^2, H_v}^{1/2}, \qquad \text{for } u_v, u'_v \in \Omega^0(E_v),
\]

(resp. $(u, u')_{L^2, H} = \sum_v (u_v, u'_v)_{L^2, H_v}$, $\|u\|_{L^2, H} = (u, u)_{L^2, H}^{1/2}$). The $L^p$-norm on the space of sections of $E$, given by

\[
\|u\|_{L^p, H} = \left( \int_X |u|_H^p \right)^{1/p} \qquad \text{for } u \in \Omega^0(E),
\]

will also be useful. These metrics and norms induce canonical metrics on the associated bundles, which will be denoted with the same symbols. For instance, $H_v \in \mathrm{Met}_v$ (resp. $H \in \mathrm{Met}$) induces an $L^p$-norm $\|\cdot\|_{L^p, H_v}$ on $S_v(H_v)$ (resp. $\|\cdot\|_{L^p, H}$ on $S(H)$). To simplify the notation, we set $(u_v, u'_v) = (u_v, u'_v)_{K_v}$, $|u_v| = |u_v|_{K_v}$, $(u, u') = (u, u')_K$, $|u| = |u|_K$; and $(u_v, u'_v)_{L^2} = (u_v, u'_v)_{L^2, K_v}$, $\|u_v\|_{L^2} = \|u_v\|_{L^2, K_v}$, $(u, u')_{L^2} = (u, u')_{L^2, K}$, $\|u\|_{L^p} = \|u\|_{L^p, K}$.

The morphisms $\phi_a : E_{ta} \otimes M_a \to E_{ha}$ induce a section $\phi = \oplus_a \phi_a$ of the representation bundle, defined as the smooth vector bundle over $X$

\[
\mathcal{R} = \bigoplus_a \mathrm{Hom}(E_{ta} \otimes M_a, E_{ha}).
\]

A metric $H \in \mathrm{Met}$ induces another metric $H_a$ on each term $\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha})$ of $\mathcal{R}$, by $(\phi_a, \phi'_a)_{H_a} = \mathrm{tr}(\phi_a \circ \phi_a'^{*H_a})$ for $\phi_a, \phi'_a$ in the same fibre of $\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha})$, where $\phi_a'^{*H_a} : E_{ha} \to E_{ta} \otimes M_a$ is defined as in §2.1. Thus, $H$ defines a hermitian metric on $\mathcal{R}$, which we shall also denote $H$, by $(\phi, \phi')_H = \sum_a (\phi_a, \phi'_a)_{H_a}$, where $\phi, \phi'$ are in


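a fibre of $\mathcal{R}$. The corresponding fibrewise norm $|\cdot|_H$ is given by $|\phi|_H = (\phi, \phi)_H^{1/2}$. By integrating the hermitian metric over $X$, $(\cdot, \cdot)_{H_a}$ and $(\cdot, \cdot)_H$ induce $L^2$-inner products $(\cdot, \cdot)_{L^2, H_a}$ and $(\cdot, \cdot)_{L^2, H}$ on $\Omega^0(\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha}))$ and $\Omega^0 = \Omega^0(\mathcal{R})$ respectively, given by $(\phi_a, \phi'_a)_{L^2, H_a} = \int_X (\phi_a, \phi'_a)_{H_a}$, for $\phi_a, \phi'_a \in \Omega^0(\mathrm{Hom}(E_{ta} \otimes M_a, E_{ha}))$, and $(\phi, \phi')_{L^2, H} = \sum_a (\phi_a, \phi'_a)_{L^2, H_a}$, for $\phi, \phi' \in \Omega^0$, with associated $L^2$-norms $\|\cdot\|_{L^2, H_a}$, $\|\cdot\|_{L^2, H}$ given by $\|\phi_a\|_{L^2, H_a} = (\phi_a, \phi_a)_{L^2, H_a}^{1/2}$ and $\|\phi\|_{L^2, H} = (\phi, \phi)_{L^2, H}^{1/2}$. We set $(\phi, \phi') = (\phi, \phi')_K$, $|\phi| = |\phi|_K$, for each $\phi, \phi'$ in the same fibre of $\mathcal{R}$; and $(\phi, \phi')_{L^2} = (\phi, \phi')_{L^2, K}$, $\|\phi\|_{L^2} = \|\phi\|_{L^2, K}$, for each $\phi, \phi'$ smooth sections of $\mathcal{R}$.

3.1.2. The vortex equations. Composition of two endomorphisms $s, s' \in S$ is defined by $(s \circ s')_v = s_v \circ s'_v$ for $v \in Q_0$. The identity endomorphism $\mathrm{id}$ of $E$ is given by $\mathrm{id}_v = \mathrm{id}_{E_v}$. Given a vector bundle $F$ on $X$, we define the endomorphisms $\sigma, \tau : F \otimes S^c \to F \otimes S^c$, where $S^c = \oplus_v \mathrm{End}(E_v)$, by fibrewise multiplication, i.e. $(\sigma \cdot (f \otimes s))_v = f \otimes \sigma_v s_v$ and $(\tau \cdot (f \otimes s))_v = f \otimes \tau_v s_v$, for $f \in F$ and $s \in S^c$ in the fibres over the same point $x \in X$. For instance, if $s \in S$, then $(\sigma \cdot \bar\partial_E(s))_v = \sigma_v \bar\partial_{E_v}(s_v)$. Given $H \in \mathrm{Met}$ and sections $\phi, \phi'$ of $\mathcal{R}$, we define the endomorphisms $\phi \circ \phi'^{*H}$, $\phi'^{*H} \circ \phi$, $[\phi, \phi'^{*H}] \in \Omega^0(S^c)$, using §2.1, by

\[
(\phi \circ \phi'^{*H})_v = \sum_{a \in h^{-1}(v)} \phi_a \circ \phi_a'^{*H_a}, \qquad (\phi'^{*H} \circ \phi)_v = \sum_{a \in t^{-1}(v)} \phi_a'^{*H_a} \circ \phi_a,
\]
\[
[\phi, \phi'^{*H}] = \phi \circ \phi'^{*H} - \phi'^{*H} \circ \phi.
\]

Note that $[\phi, \phi^{*H}] \in S(H)$. The quiver vortex equations (1) can now be written in a compact form

\[
\sigma \cdot \sqrt{-1}\, \Lambda F_H + [\phi, \phi^{*H}] = \tau \cdot \mathrm{id}, \tag{6}
\]

for $H \in \mathrm{Met}$. Given $s \in S$ and $\phi \in \Omega^0 = \Omega^0(\mathcal{R})$, $s \circ \phi$, $\phi \circ s$, $[s, \phi]$, $[\phi, s] \in \Omega^0$ are defined by $(s \circ \phi)_a = s_{ha} \circ \phi_a$, $(\phi \circ s)_a = \phi_a \circ (s_{ta} \otimes \mathrm{id}_{M_a})$, $[s, \phi] = s \circ \phi - \phi \circ s$, $[\phi, s] = \phi \circ s - s \circ \phi$.

3.1.3. The trace and trace free parts of the vortex equations. The trace map is defined by $\mathrm{tr} : \mathrm{End}(E) \to \mathbb{C}$, $s \mapsto \mathrm{tr}(s) = \sum_v \mathrm{tr}(s_v)$. Let $S^0(H)$ be the space of “$\sigma$-trace free” $H$-selfadjoint endomorphisms $s \in S(H)$, i.e. such that $\mathrm{tr}(\sigma \cdot s) = 0$, or more explicitly, $\sum_v \sigma_v\, \mathrm{tr}(s_v) = 0$, for each $H \in \mathrm{Met}$; let $S^0 = S^0(K) \subset S$. Let $\mathrm{Met}^0$ be the space of metrics $H = K e^s$ with $s \in S^0$. The metrics $H \in \mathrm{Met}^0$ satisfy the trace part of Eq. (6), i.e.

\[
\mathrm{tr}(\sigma \cdot \sqrt{-1}\, \Lambda F_H) = \mathrm{tr}(\tau \cdot \mathrm{id}). \tag{7}
\]

To prove this, let $H = K e^s \in \mathrm{Met}$ with $s \in S$. Then $\det(H_v) = \det(K_v) e^{\mathrm{tr}\, s_v}$, so $\mathrm{tr}\, F_{H_v} = F_{\det(H_v)} = F_{\det(K_v)} + \bar\partial \partial\, \mathrm{tr}\, s_v = \mathrm{tr}\, F_{K_v} + \bar\partial \partial\, \mathrm{tr}\, s_v$ (since the operators induced by $\bar\partial_{\det(E_v)}$ and $\partial_{\det(K_v)}$ on the trivial bundle of endomorphisms of $\det(E_v)$ are $\bar\partial$ and $\partial$, resp.). Adding for all $v$, $\mathrm{tr}(\sigma \cdot \sqrt{-1}\, \Lambda F_H) = \mathrm{tr}(\sigma \cdot \sqrt{-1}\, \Lambda F_K) + \sqrt{-1}\, \Lambda \bar\partial \partial\, \mathrm{tr}(\sigma \cdot s)$,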


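where $\operatorname{tr}(\sqrt{-1}\Lambda F_{K_v}) = \deg(E_v)$ by construction (cf. §3.1.1), so $\operatorname{tr}(\sigma\cdot\sqrt{-1}\Lambda F_K) = \sum_v\sigma_v\deg(E_v) = \sum_v\tau_v\operatorname{rk}(E_v) = \operatorname{tr}(\tau\cdot\operatorname{id})$. Thus,
\[ \operatorname{tr}(\sigma\cdot\sqrt{-1}\Lambda F_H - \tau\cdot\operatorname{id}) = \sqrt{-1}\Lambda\bar\partial\partial\operatorname{tr}(\sigma\cdot s), \tag{8} \]
which is zero if $s\in S^0$. This proves (7). Therefore, a metric $H = Ke^s\in\operatorname{Met}^0$ satisfies the quiver (σ, τ)-vortex equations (6) if and only if it satisfies the "σ-trace free" part, i.e.
\[ p^0_H\left(\sigma\cdot\sqrt{-1}\Lambda F_H + [\phi,\phi^{*H}] - \tau\cdot\operatorname{id}\right) = 0, \]
where $p^0_H : S(H)\to S^0(H)$ is the H-orthogonal projection onto $S^0(H)$.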

3.1.4. Sobolev spaces. Following [UY, S, B], given a smooth vector bundle E, and any integers $k, p\ge 0$, $L^p_k\Omega^{i,j}(E)$ is the Sobolev space of sections of class $L^p_k$, i.e. E-valued (i, j)-forms whose derivatives of order $\le k$ have finite $L^p$-norm. Throughout the proof of Theorem 3.1, we fix an even integer $p > \dim_{\mathbb{R}}(X) = 2n$. Note that there is a compact embedding of $L^p_2\Omega^{i,j}(E)$ into the space of continuous E-valued (i, j)-forms on X, for $p > 2n$. This embedding will be used in §3.1.6. Particularly important are the collection $L^p_2S = \oplus_v L^p_2S_v$ of Sobolev spaces $L^p_2S_v$ of $K_v$-selfadjoint endomorphisms of $E_v$ of class $L^p_2$; the collection $\operatorname{Met}^p_2 \cong \prod_v\operatorname{Met}^p_{2,v}$ of Sobolev metrics, with
\[ \operatorname{Met}^p_{2,v} = \{K_ve^{s_v} \mid s_v\in L^p_2S_v\}, \quad \text{for each } v\in Q_0; \]
the subspace $L^p_2S^0\subset L^p_2S$ of sections $s\in L^p_2S$ such that $\operatorname{tr}(\sigma\cdot s) = 0$ almost everywhere in X; and
\[ \operatorname{Met}^{p,0}_2 = \{Ke^s \mid s\in L^p_2S^0\}\subset\operatorname{Met}^p_2. \]
Given $H = Ke^s\in\operatorname{Met}^p_2$, with $s\in L^p_2S$, we define the H-adjoint of φ, generalizing the case where $s_v$ is smooth, i.e. $\phi^{*H} = e^{-s}\circ\phi^{*K}\circ e^s$. Similar generalizations apply to the other constructions in §§3.1.2, 3.1.3, to define $L^p_2S_v(H_v)$ and $L^p_2S(H) = \oplus_vL^p_2S_v(H_v)$, as well as the subspace $L^p_2S^0(H)\subset L^p_2S(H)$, for each $H\in\operatorname{Met}^p_2$. If $H_v = K_ve^{s_v}\in\operatorname{Met}^p_{2,v}$ with $s_v\in L^p_2S_v$, we define the connection $A_{H_v}$, with $L^p_1$ coefficients, and its curvature $F_{H_v}\in L^p\Omega^{1,1}(\operatorname{End}(E_v))$, with $L^p$ coefficients, generalizing the case where $s_v$ is smooth:
\[ d_{H_v} := d_{K_v} + e^{-s_v}\partial_{K_v}(e^{s_v}), \qquad F_{H_v} = F_{K_v} + \bar\partial_{E_v}(e^{-s_v}\partial_{K_v}(e^{s_v})), \tag{9} \]
(where $d_{H_v}$ is the covariant derivative associated to the connection $A_{H_v}$).

3.1.5. The degree of a saturated subsheaf. A saturated coherent subsheaf $\mathcal{F}'$ of a holomorphic vector bundle $\mathcal{F}$ on X (i.e., a coherent subsheaf with $\mathcal{F}/\mathcal{F}'$ torsion-free), is reflexive, hence a vector subbundle outside of codimension 2. Given a hermitian metric H on $\mathcal{F}$, the H-orthogonal projection $\pi'$ from $\mathcal{F}$ onto $\mathcal{F}'$, defined outside codimension 2, is an $L^2_1$-section of the bundle of endomorphisms of $\mathcal{F}$, so $\beta' = \bar\partial_{\mathcal{F}}(\pi')$ is of class $L^2$, where $\bar\partial_{\mathcal{F}}$ is the $\bar\partial$-operator of $\mathcal{F}$. The degree of $\mathcal{F}'$ is
\[ \deg(\mathcal{F}') = \frac{1}{\operatorname{Vol}(X)}\left(\int_X\operatorname{tr}(\pi'\sqrt{-1}\Lambda F_H) - \|\beta'\|^2_{L^2,H}\right), \]
(cf. [UY, S, B]).


3.1.6. Some constructions involving hermitian matrices. The following definitions slightly generalize [S, §4]. Let $\varphi : \mathbb{R}\to\mathbb{R}$ and $\Phi : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ be smooth functions. Given $s\in S$, we define $\varphi(s)\in S$ and linear maps $\Phi(s) : S\to S$ and $\Phi(s) : \Omega^0(R)\to\Omega^0(R)$ (we denote the last two maps with the same symbol since there is no possible confusion between them). Actually, we define maps of fibre bundles $\Phi : S\to S(\operatorname{End}E)$ and $\Phi : S\to S(\operatorname{End}R)$, for certain spaces $S(\operatorname{End}E)$ and $S(\operatorname{End}R)$, which we first define. Let $S(\operatorname{End}E) = \oplus_vS(\operatorname{End}E_v)$, where $S(\operatorname{End}E_v)$ is the space of smooth sections of the bundle $\operatorname{End}(\operatorname{End}E_v)$ which are selfadjoint w.r.t. the metric induced by $K_v$. Let End R be the endomorphism bundle of the vector bundle R; $S(\operatorname{End}R)$ is the space of smooth sections of End R which are selfadjoint w.r.t. the metric induced by $K_v$ and $q_a$.

We define $\varphi(s_v)\in S_v$ for $s_v\in S_v$ and a linear map $\Phi : S_v\to S(\operatorname{End}E_v)$ as follows. Let $s_v\in S_v$. If $x\in X$, let $(u_{v,i})$ be an orthonormal basis of $E_{v,x}$ (w.r.t. $K_v$), with dual basis $(u^{v,i})$, such that $s_v = \sum_i\lambda_{v,i}u_{v,i}\otimes u^{v,i}$. Furthermore, let $(m^{a,k})$ be the dual of an orthonormal basis of $M_{a,x}$ (w.r.t. $q_a$). The value of $\varphi(s_v)\in S_v$ at the point $x\in X$ is defined as in [S, §4], by
\[ \varphi(s_v)(x) := \sum_i\varphi(\lambda_{v,i})\,u_{v,i}\otimes u^{v,i}. \tag{10} \]
We define $\varphi(s)\in S$, for $s\in S$, by $\varphi(s)_v := \varphi(s_v)$. Given $f_v\in S_v$ with $f_v(x) = \sum_{i,j}f_{v,ij}\,u_{v,i}\otimes u^{v,j}$, the value of $\Phi(s_v)f_v\in S_v$ at the point $x\in X$ is
\[ \Phi(s_v)f_v(x) := \sum_{i,j}\Phi(\lambda_{v,i},\lambda_{v,j})\,f_{v,ij}\,u_{v,i}\otimes u^{v,j}, \tag{11} \]
and we define $\Phi : S\to S(\operatorname{End}E)$ and $\Phi : S\to S(\operatorname{End}R)$ as follows. Let $s\in S$. First, if $f\in S$, $(\Phi(s)f)_v := \Phi(s_v)f_v$. Second, given a section φ of R such that the value of $\phi_a : E_{ta}\otimes M_a\to E_{ha}$ at $x\in X$ is $\phi_a(x) = \sum_{i,j,k}\phi_{a,ijk}(x)\,u_{ha,j}\otimes u^{ta,i}\otimes m^{a,k}$ for each $a\in Q_1$, the value of $\Phi(s)\phi\in\Omega^0(R)$ at $x\in X$ is
\[ (\Phi(s)\phi(x))_a := \sum_{i,j,k}\Phi(\lambda_{ha,j},\lambda_{ta,i})\,\phi_{a,ijk}(x)\,u_{ha,j}\otimes u^{ta,i}\otimes m^{a,k}, \quad \text{for each } a\in Q_1. \tag{12} \]
Note that if Φ is given by $\Phi(x,y) = \varphi_1(x)\varphi_2(y)$ for certain functions $\varphi_1,\varphi_2 : \mathbb{R}\to\mathbb{R}$, then $(\Phi(s)\phi)_a = \varphi_1(s_{ha})\circ\phi_a\circ(\varphi_2(s_{ta})\otimes\operatorname{id}_{M_a})$, that is,
\[ \Phi(s)\phi = \varphi_1(s)\circ\phi\circ\varphi_2(s). \tag{13} \]
Finally, given a smooth function $\varphi : \mathbb{R}\to\mathbb{R}$, we define $d\varphi : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ as in [S, §4]:
\[ d\varphi(x,y) = \frac{\varphi(y)-\varphi(x)}{y-x} \ \text{ if } x\ne y, \quad\text{and}\quad d\varphi(x,y) = \varphi'(x) \ \text{ if } x = y. \]
Thus,
\[ \bar\partial_E(\varphi(s)) = d\varphi(s)(\bar\partial_E(s)) \quad \text{for } s\in S. \tag{14} \]
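As a quick sanity check of (14), not taken from the paper, one may take $\varphi(x) = x^2$, for which $d\varphi(x,y) = x+y$. The sketch below fixes a point of X and a $K_v$-orthonormal eigenbasis of $s_v$ there, and writes $(\bar\partial s_v)_{ij}$ for the (0,1)-form coefficients of $\bar\partial_{E_v}s_v$ in that basis; this coefficient notation is an assumption introduced only for the illustration. The Leibniz rule then reproduces rule (11) applied to $d\varphi$:

```latex
\begin{align*}
\bar\partial_{E_v}(s_v^2)
  &= (\bar\partial_{E_v}s_v)\circ s_v + s_v\circ(\bar\partial_{E_v}s_v) \\
  &= \sum_{i,j} (\lambda_{v,j}+\lambda_{v,i})\,(\bar\partial s_v)_{ij}\;
     u_{v,i}\otimes u^{v,j} \\
  &= \sum_{i,j} d\varphi(\lambda_{v,i},\lambda_{v,j})\,(\bar\partial s_v)_{ij}\;
     u_{v,i}\otimes u^{v,j}
   \;=\; d\varphi(s_v)(\bar\partial_{E_v}s_v).
\end{align*}
```

For the exponential, which is the case used for metrics $Ke^s$, the same recipe gives $d\varphi(x,y) = (e^y-e^x)/(y-x)$ off the diagonal and $d\varphi(x,x) = e^x$.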

The following lemma will be especially important in the proof of Lemma 3.8. Given a p number b, L2k,b S ⊂ Lk S is the closed subset of sections s ∈ L2k S such that |s| ≤ b a.e. in X; L20,b S(End R) is similarly defined.


Lemma 3.1. (i) $\varphi : S\to S$ extends to a continuous map $\varphi : L^2_{0,b}S\to L^2_{0,b'}S$ for some $b'$.
(ii) $\varphi : S\to S$ extends to a map $\varphi : L^2_{1,b}S\to L^q_{1,b'}S$ for some $b'$, for $q\le 2$, which is continuous for $q<2$. Formula (14) holds in this context.
(iii) $\Phi : S\to S(\operatorname{End}E)$ extends to a map $\Phi : L^2_{0,b}S\to\operatorname{Hom}(L^2\Omega^0(\operatorname{End}E), L^q\Omega^0(\operatorname{End}E))$ for $q\le 2$, which is continuous in the norm operator topology for $q<2$.
(iv) $\Phi : S\to S(\operatorname{End}R)$ extends to a continuous map $\Phi : L^2_{0,b}S\to L^2_{0,b'}S(\operatorname{End}R)$ for some $b'$.
(v) The previous maps extend to smooth maps $\varphi : L^p_2S\to L^p_2S$, $\Phi : L^p_2S\to L^p_2S(\operatorname{End}E)$ and $\Phi : L^p_2S\to L^p_2S(\operatorname{End}R)$ between Banach spaces of Sobolev sections. Formulas (10)–(14) hold everywhere in X.

Proof. This follows as in [B, S]. For (v), $p>2n$, so there is a compact embedding $L^p_2\subset C^0$.

3.2. Existence of special metric implies polystability. Let H be a hermitian metric on R satisfying the quiver (σ, τ)-vortex equations. To prove that R is (σ, τ)-polystable, we can assume that it is indecomposable; then we have to prove that it is actually (σ, τ)-stable. Let $R' = (E',\phi')\subset R$ be a proper Q-subsheaf. We can assume that $E'_v\subset E_v$ is saturated for each $v\in Q_0$ (cf. Remark 2.1(iii)). Let $\pi'_v$ be the $H_v$-orthogonal projection from $E_v$ onto $E'_v$, defined outside codimension 2, $\pi''_v = \operatorname{id}-\pi'_v$, and $\beta'_v = \bar\partial_E(\pi'_v)$. The collections of sections $\pi'_v$, $\pi''_v$, $\beta'_v$ define elements $\pi'$, $\pi''$ of class $L^2_1$ and $\beta'\in L^2\Omega^{0,1}(\operatorname{End}E)$, respectively (cf. §3.1.5). Taking the $L^2$-product with $\pi'$ in (6),
\[ (\sigma\cdot\sqrt{-1}\Lambda F_H,\pi')_{L^2,H} + ([\phi,\phi^{*H}],\pi')_{L^2,H} = (\tau\cdot\operatorname{id},\pi')_{L^2,H}. \]
We now evaluate the three terms of this equation. The first term in the left hand side is
\[ (\sigma\cdot\sqrt{-1}\Lambda F_H,\pi')_{L^2,H} = \sum_v\sigma_v(\sqrt{-1}\Lambda F_{H_v},\pi'_v)_{L^2,H_v} = \operatorname{Vol}(X)\sum_v\sigma_v\deg(E'_v) + \sum_v\sigma_v\|\beta'_v\|^2_{L^2,H_v} \]
(cf. §3.1.5). Let $\phi' = \pi'\circ\phi\circ\pi'$, $\phi'' = \pi''\circ\phi\circ\pi''$, $\phi^\perp = \pi'\circ\phi\circ\pi''$. Then $\phi = \phi'\circ\pi' + \phi^\perp\circ\pi'' + \phi''\circ\pi''$ outside of codimension 2, for $R'\subset R$. Thus, $[\pi',\phi] = \phi^\perp\circ\pi''$, and the second term is
\[ ([\phi,\phi^{*H}],\pi')_{L^2,H} = (\phi,[\pi',\phi])_{L^2,H} = (\phi,\phi^\perp)_{L^2,H} = \|\phi^\perp\|^2_{L^2,H}. \]
Finally, the right-hand side is
\[ (\tau\cdot\operatorname{id},\pi')_{L^2,H} = \int_X\sum_v\tau_v\operatorname{tr}(\pi'_v) = \operatorname{Vol}(X)\sum_v\tau_v\operatorname{rk}(E'_v), \]
(since $\operatorname{tr}(\pi'_v) = \operatorname{rk}(E'_v)$ outside of codimension 2). Therefore
\[ \operatorname{Vol}(X)\deg_{\sigma,\tau}(R') = -\sum_{v\in Q_0}\sigma_v\|\beta'_v\|^2_{L^2,H_v} - \sum_{a\in Q_1}\|\phi^\perp_a\|^2_{L^2,H_a}. \]
The indecomposability of R implies that either $\beta'_v\ne 0$ for some $v\in Q_0$ or $\phi^\perp_a\ne 0$ for some $a\in Q_1$; thus, $\deg_{\sigma,\tau}(R')<0$, so $\mu_{\sigma,\tau}(R')<0 = \mu_{\sigma,\tau}(R)$, hence R is (σ, τ)-stable.
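The last step of §3.2 can be spelled out as follows. This is only a sketch: it assumes the convention $\deg_{\sigma,\tau}(R') = \sum_v(\sigma_v\deg(E'_v) - \tau_v\operatorname{rk}(E'_v))$, which is the one consistent with the way $\deg_{\sigma,\tau}$ is expanded in the proof of Proposition 3.4, and it simply substitutes the three evaluated terms into the $L^2$-product of (6) with the projection onto the subsheaf.

```latex
\begin{align*}
&\operatorname{Vol}(X)\sum_{v}\sigma_v\deg(E'_v)
 + \sum_{v}\sigma_v\|\beta'_v\|^2_{L^2,H_v}
 + \|\phi^{\perp}\|^2_{L^2,H}
 = \operatorname{Vol}(X)\sum_{v}\tau_v\operatorname{rk}(E'_v) \\[4pt]
&\Longrightarrow\quad
 \operatorname{Vol}(X)\sum_{v}\bigl(\sigma_v\deg(E'_v)-\tau_v\operatorname{rk}(E'_v)\bigr)
 = -\sum_{v}\sigma_v\|\beta'_v\|^2_{L^2,H_v}
   - \sum_{a}\|\phi^{\perp}_a\|^2_{L^2,H_a}\;\le\;0 .
\end{align*}
```

Since every $\sigma_v$ is positive, the right-hand side vanishes only if all $\beta'_v$ and all $\phi^\perp_a$ vanish, which is exactly the case excluded by indecomposability.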


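3.3. The modified Donaldson lagrangian. To define the modified Donaldson lagrangian, we first recall the definition of the Donaldson lagrangian (cf. [S, §5]). Let $\Psi : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ be given by
\[ \Psi(x,y) = \frac{e^{y-x}-(y-x)-1}{(y-x)^2}. \tag{15} \]
The Donaldson lagrangian $M_{D,v} = M_D(K_v,\cdot) : \operatorname{Met}^p_{2,v}\to\mathbb{R}$ is given by
\[ M_{D,v}(H_v) = (\sqrt{-1}\Lambda F_{K_v},s_v)_{L^2} + (\Psi(s_v)(\bar\partial_{E_v}s_v),\bar\partial_{E_v}s_v)_{L^2}, \]
for $H_v = K_ve^{s_v}\in\operatorname{Met}^p_{2,v}$, $s_v\in L^p_2S_v$. The Donaldson lagrangian $M_{D,v} = M_D(K_v,\cdot)$ is additive in the sense that
\[ M_{D,v}(K_v,H_v) + M_{D,v}(H_v,J_v) = M_{D,v}(K_v,J_v), \quad \text{for } H_v,J_v\in\operatorname{Met}^p_{2,v}. \tag{16} \]
Another important property is that the Lie derivative of $M_{D,v}$ at $H_v\in\operatorname{Met}^p_{2,v}$, in the direction of $s_v\in L^p_2S_v(H_v)$, is given by the moment map (cf. §2.2), i.e.
\[ \frac{d}{d\varepsilon}M_{D,v}(H_ve^{\varepsilon s_v})\Big|_{\varepsilon=0} = (\sqrt{-1}\Lambda F_{H_v},s_v)_{L^2,H_v}, \quad \text{with } H_v\in\operatorname{Met}^p_{2,v},\ s_v\in L^p_2S_v(H_v). \tag{17} \]
Higher order Lie derivatives can be easily evaluated. Thus, from (9),
\[ \frac{d}{d\varepsilon}F_{H_ve^{\varepsilon s_v}} = \bar\partial_{E_v}\partial_{H_ve^{\varepsilon s_v}}s_v, \quad \text{for each } H_v\in\operatorname{Met}^p_{2,v} \text{ and } s_v\in L^p_2S_v(H_v), \tag{18} \]
so the second order Lie derivative is
\[ \frac{d^2}{d\varepsilon^2}M_{D,v}(H_ve^{\varepsilon s_v})\Big|_{\varepsilon=0} = (\sqrt{-1}\Lambda\bar\partial_{E_v}\partial_{H_v}s_v,s_v)_{L^2,H_v} = \|\bar\partial_{E_v}s_v\|^2_{L^2,H_v} \tag{19} \]
(the second equality is obtained by integrating $\operatorname{tr}(s_v\sqrt{-1}\Lambda\bar\partial_{E_v}\partial_{H_v}s_v) = \sqrt{-1}\Lambda\bar\partial\operatorname{tr}(s_v\partial_{H_v}s_v) + |\bar\partial_{E_v}s_v|^2_{H_v}$ over X, where $|\bar\partial_{E_v}s_v|^2_{H_v} = -\sqrt{-1}\Lambda\operatorname{tr}(\bar\partial_{E_v}s_v\wedge\partial_{H_v}s_v)$ by the Kähler identities, and $\int_X\Lambda\bar\partial\operatorname{tr}(s_v\partial_{H_v}s_v) = \int_X\bar\partial\operatorname{tr}(s_v\partial_{H_v}(s_v))\wedge\omega^{n-1}/(n-1)! = 0$ by Stokes' theorem; cf. e.g. [S, Lemma 3.1(b) and the proof of Proposition 5.1]).

Definition 3.1. The modified Donaldson lagrangian $M_{\sigma,\tau} = M_{\sigma,\tau}(K,\cdot) : \operatorname{Met}^p_2\to\mathbb{R}$ is
\[ M_{\sigma,\tau}(H) = \sum_v\sigma_vM_{D,v}(H_v) + \|\phi\|^2_{L^2,H} - \|\phi\|^2_{L^2,K} - (s,\tau\cdot\operatorname{id})_{L^2}, \]
for $H = Ke^s\in\operatorname{Met}^p_2$, $s\in L^p_2S$. Using the constructions of §3.1.6, the modified Donaldson lagrangian can be expressed in terms of the functions $\Psi,\psi : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$, with Ψ given by (15) and ψ defined by
\[ \psi(x,y) = e^{x-y}. \tag{20} \]
In the following, we use the notation $(\cdot,\cdot)_{L^2} = (\cdot,\cdot)_{L^2,K}$, $\|\cdot\|_{L^2} = \|\cdot\|_{L^2,K}$, as defined in §3.1.1.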
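A few elementary properties of the function in (15) are used implicitly later on, so it may help to record them. The following is a sketch rather than part of the original argument: the auxiliary function g, the difference t = y − x, and the variable u = l(x − y) are notation introduced here only for the illustration.

```latex
\begin{align*}
\Psi(x,y) &= g(y-x), \qquad
 g(t) = \frac{e^{t}-t-1}{t^{2}} = \sum_{k\ge 0}\frac{t^{k}}{(k+2)!},
 \qquad g(0)=\tfrac12, \\
g(t) &> 0 \ \text{ for all } t\in\mathbb{R}
 \quad(\text{since } e^{t} > 1+t \text{ for } t\neq 0), \\
l\,\Psi(lx,ly) &= \frac{1}{x-y}\Bigl(1-\frac{1-e^{-u}}{u}\Bigr),
 \qquad u = l\,(x-y) > 0, \\
l\,\Psi(lx,ly) &\nearrow \frac{1}{x-y}
 \quad\text{as } l\to\infty, \text{ for fixed } x>y .
\end{align*}
```

In particular Ψ extends smoothly across the diagonal, so $\Psi(s_v)$ is well defined by the recipe of §3.1.6 even when $s_v$ has repeated eigenvalues, and the last line is the kind of comparison invoked in the proof of Lemma 3.8.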

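Lemma 3.2. If $H = Ke^s\in\operatorname{Met}^p_2$, with $s\in L^p_2S$, then
\[ M_{\sigma,\tau}(H) = (\sigma\cdot\sqrt{-1}\Lambda F_K,s)_{L^2} + (\sigma\cdot\Psi(s)(\bar\partial_Es),\bar\partial_Es)_{L^2} + (\psi(s)\phi,\phi)_{L^2} - \|\phi\|^2_{L^2} - (\tau\cdot\operatorname{id},s)_{L^2}. \]
Proof. The first two terms follow from the definitions of $M_{D,v}$ and $M_{\sigma,\tau}$. To obtain the third term, we note that $\phi_a^{*H_a} = (e^{-s_{ta}}\otimes\operatorname{id}_{M_a})\circ\phi_a^{*K_a}\circ e^{s_{ha}}$ and $(\psi(s)\phi)_a = e^{s_{ha}}\circ\phi_a\circ(e^{-s_{ta}}\otimes\operatorname{id}_{M_a})$ (cf. (13)), so
\[ |\phi_a|^2_{H_a} = \operatorname{tr}(\phi_a\circ\phi_a^{*H_a}) = \operatorname{tr}(e^{s_{ha}}\circ\phi_a\circ(e^{-s_{ta}}\otimes\operatorname{id}_{M_a})\circ\phi_a^{*K_a}) = \operatorname{tr}((\psi(s)\phi)_a\circ\phi_a^{*K_a}) = ((\psi(s)\phi)_a,\phi_a)_{K_a}. \]
The last two terms follow directly from the definition of $M_{\sigma,\tau}$.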
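To see what Lemma 3.2 says in the simplest situation, consider the following illustrative special case, which is an assumption made here and not an example from the paper: the quiver has two vertices t, h and a single arrow $a : t\to h$, both $E_t$ and $E_h$ are line bundles, and $M_a$ is trivial with the trivial metric. Then $s = (s_t,s_h)$ is a pair of real functions, each $s_v$ has a single eigenvalue, so $\Psi(s_v)$ acts as multiplication by $\Psi(\lambda,\lambda) = 1/2$ and $\psi(s)$ acts on $\phi_a$ as multiplication by $e^{s_h-s_t}$:

```latex
\begin{align*}
M_{\sigma,\tau}(Ke^{s})
 &= \sum_{v\in\{t,h\}}\Bigl[\,
      \sigma_v\,(\sqrt{-1}\Lambda F_{K_v},\,s_v)_{L^2}
      + \frac{\sigma_v}{2}\,\|\bar\partial s_v\|^{2}_{L^2}
      - \tau_v\int_X s_v \Bigr] \\
 &\quad + \int_X \bigl(e^{\,s_h-s_t}-1\bigr)\,|\phi_a|^{2}_{K} .
\end{align*}
```

The first bracket is a quadratic-plus-linear expression in each $s_v$, and the coupling between the two vertices enters only through the exponential weight on $|\phi_a|^2_K$.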

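3.4. Minima of $M_{\sigma,\tau}$, the main estimate, and the vortex equations. Let $m_{\sigma,\tau} : \operatorname{Met}^p_2\to L^p\Omega^0(\operatorname{End}E)$ be defined by
\[ m_{\sigma,\tau}(H) = \sigma\cdot\sqrt{-1}\Lambda F_H + [\phi,\phi^{*H}] - \tau\cdot\operatorname{id}, \quad \text{for } H = Ke^s\in\operatorname{Met}^p_2,\ s\in L^p_2S. \tag{21} \]
Thus, $m_{\sigma,\tau}(H)\in L^pS(H)$ for each $H\in\operatorname{Met}^p_2$, and actually $m_{\sigma,\tau}(H)\in L^pS^0(H)$ if $H\in\operatorname{Met}^{p,0}_2$, by (7). Let $B > \|m_{\sigma,\tau}(K)\|^p_{L^p}$ be a positive real number. We are interested in the minima of $M_{\sigma,\tau}$ in the closed subset of $\operatorname{Met}^{p,0}_2$ defined by
\[ \operatorname{Met}^{p,0}_{2,B} := \{H\in\operatorname{Met}^{p,0}_2 \mid \|m_{\sigma,\tau}(H)\|^p_{L^p,H}\le B\} \]
(the restriction to this subset will be necessary to apply Lemma 3.4 below).

Proposition 3.1. If R is simple, i.e. its only endomorphisms are multiples of the identity, and $H\in\operatorname{Met}^{p,0}_{2,B}$ minimises $M_{\sigma,\tau}$ on $\operatorname{Met}^{p,0}_{2,B}$, then $m_{\sigma,\tau}(H) = 0$.

The minima are thus the solutions of the vortex equations. To prove this, we need a lemma about the first and second order Lie derivatives of $M_{\sigma,\tau}$. Given $H\in\operatorname{Met}^p_2$, $L_H : L^p_2S(H)\to L^pS(H)$ is defined by
\[ L_H(s) = \frac{d}{d\varepsilon}m_{\sigma,\tau}(He^{\varepsilon s})\Big|_{\varepsilon=0}, \quad \text{for each } s\in L^p_2S(H). \tag{22} \]
Since $\phi^{*H_\varepsilon} = e^{-\varepsilon s}\phi^{*H}e^{\varepsilon s}$, with $H_\varepsilon = He^{\varepsilon s}$, we have
\[ \frac{d}{d\varepsilon}\phi^{*H_\varepsilon}\Big|_{\varepsilon=0} = [s,\phi]^{*H}, \tag{23} \]
so $\frac{d}{d\varepsilon}[\phi,\phi^{*H_\varepsilon}]\big|_{\varepsilon=0} = [\phi,[s,\phi]^{*H}]$. Together with (18), this implies that
\[ L_H(s) = \sigma\cdot\sqrt{-1}\Lambda\bar\partial_E\partial_Hs + [\phi,[s,\phi]^{*H}]. \tag{24} \]
Lemma 3.3. (i) $M_{\sigma,\tau}(K,H) + M_{\sigma,\tau}(H,J) = M_{\sigma,\tau}(K,J)$, for $H,J\in\operatorname{Met}^p_2$;
(ii) $\frac{d}{d\varepsilon}M_{\sigma,\tau}(He^{\varepsilon s})\big|_{\varepsilon=0} = (m_{\sigma,\tau}(H),s)_{L^2,H}$, for each $H\in\operatorname{Met}^p_2$ and $s\in L^p_2S(H)$;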


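(iii) $\frac{d^2}{d\varepsilon^2}M_{\sigma,\tau}(He^{\varepsilon s})\big|_{\varepsilon=0} = (L_H(s),s)_{L^2,H} = \sum_v\sigma_v\|\bar\partial_{E_v}s_v\|^2_{L^2,H_v} + \|[s,\phi]\|^2_{L^2,H}$, for each $H\in\operatorname{Met}^p_2$ and $s\in L^p_2S(H)$.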

Proof. Part (i) follows immediately from (16) and (Kes )es = Kes+s. To prove(ii) and (iii), let Hε = H eεs , for ε ∈ R. From (23) we get ddε |φ|2Hε ε=0 = tr φ ddε φ ∗Hε ε=0 = tr(φ[s, φ]∗H ) = ([φ, φ ∗H ], s)H , which together with (17), proves (ii) (the last term in (21) is trivially obtained). The first equality in (iii) follows from (ii), the Hε -selfadjointness of s (since s ∗Hε = e−εs s ∗H eεs = e−εs seεs = s), and (22): d2 d (mσ,τ (Hε ), s)L2 ,Hε ε=0 Mσ,τ (Hε )ε=0 = 2 dε dε

d mσ,τ (Hε )ε=0 s = tr(LH (s)s), = tr dε X X which equals (LH (s), s)L2 ,H . To prove the second equality in (iii), we first notice that if φ is a smooth section of R, then (s, φ ◦ φ ∗H )H = (s ◦ φ, φ )H and (s, φ ∗H ◦ φ )H = (φ ◦ s, φ )H , so (s, [φ , φ ∗H ])H = ([s, φ], φ )H . The second equality in (iii) is now obtained using (24), (18) and taking φ = [s, φ] in the previous formula. p,0

Proof of Proposition 3.1. We start by proving that if R is simple and $H\in\operatorname{Met}^{p,0}_2$, then the restriction of $L_H$ to $L^p_2S^0(H)$, which we also denote by $L_H : L^p_2S^0(H)\to L^pS^0(H)$, is surjective. To do this, we only have to show that $L_H$ is a Fredholm operator of index zero and that it has no kernel. First, for each vertex v, $k_v : L^p_2S_v(H_v)\to L^pS_v(H_v)$, defined by $k_v = \sqrt{-1}\Lambda\bar\partial_{E_v}\partial_{H_v} - \sqrt{-1}\Lambda\bar\partial_{E_v}\partial_{K_v}$, is obviously a compact operator (cf. §3.1.4), and by the Kähler identities, $\sqrt{-1}\Lambda\bar\partial_{E_v}\partial_{K_v}$ acting on $L^p_2S$ is the (1, 0)-laplacian $\Delta'_{K_v} = \partial_{K_v}\partial^*_{K_v} + \partial^*_{K_v}\partial_{K_v}$, which is elliptic and selfadjoint, hence Fredholm, and has index zero. Now, $L_H$ equals $\sum_v\sigma_v\sqrt{-1}\Lambda\bar\partial_E\partial_{H_v}$, up to a compact operator, so it is also a Fredholm operator of index zero. To prove that it has no kernel, we notice that if $s\in L^p_2S^0(H)$ satisfies $L_H(s) = 0$, then $(s,L_H(s))_{L^2,H} = 0$, so Lemma 3.3(iii) implies $\bar\partial_{E_v}s_v = 0$ and $[s,\phi] = 0$; i.e. s is actually an endomorphism of R, so $s_v = c\operatorname{id}_{E_v}$, for a certain constant c. Since $\operatorname{tr}(\sigma\cdot s) = 0$, the constant is $c = 0$, so $s_v = 0$.

Let H minimise $M_{\sigma,\tau}$ in $\operatorname{Met}^{p,0}_{2,B}$. To prove that $m_{\sigma,\tau}(H) = 0$, we assume the contrary. Since $L_H : L^p_2S^0(H)\to L^pS^0(H)$ is surjective, and $m_{\sigma,\tau}(H)\in L^pS^0(H)$ is not zero, there exists a non-zero $s\in L^p_2S^0(H)$ with $L_H(s) = -m_{\sigma,\tau}(H)$. We shall consider the values of $M_{\sigma,\tau}$ along the path $H_\varepsilon = He^{\varepsilon s}\in\operatorname{Met}^{p,0}_2$ for small $|\varepsilon|$. First,
\[ \frac{d}{d\varepsilon}|m_{\sigma,\tau}(H_\varepsilon)|^2_{H_\varepsilon}\Big|_{\varepsilon=0} = \frac{d}{d\varepsilon}\operatorname{tr}(m_{\sigma,\tau}(H_\varepsilon)^2)\Big|_{\varepsilon=0} = 2(m_{\sigma,\tau}(H),L_H(s))_H = -2|m_{\sigma,\tau}(H)|^2_H \]
(cf. (22)), and since p is even,
\[ \frac{d}{d\varepsilon}\|m_{\sigma,\tau}(H_\varepsilon)\|^p_{L^p,H_\varepsilon}\Big|_{\varepsilon=0} = \frac{p}{2}\int_X|m_{\sigma,\tau}(H)|^{p-2}_H\,\frac{d}{d\varepsilon}|m_{\sigma,\tau}(H_\varepsilon)|^2_{H_\varepsilon}\Big|_{\varepsilon=0} = -p\,\|m_{\sigma,\tau}(H)\|^p_{L^p,H} < 0, \]
so the path $H_\varepsilon$ is in $\operatorname{Met}^{p,0}_{2,B}$ for small $|\varepsilon|$. Thus, $\frac{d}{d\varepsilon}M_{\sigma,\tau}(H_\varepsilon)\big|_{\varepsilon=0} = 0$, as H minimises $M_{\sigma,\tau}$ in $\operatorname{Met}^{p,0}_{2,B}$. Now, Lemma 3.3(ii) applied to $s\in L^p_2S(H)$ gives
\[ \frac{d}{d\varepsilon}M_{\sigma,\tau}(H_\varepsilon)\Big|_{\varepsilon=0} = (m_{\sigma,\tau}(H),s)_{L^2,H} = -(L_H(s),s)_{L^2,H}. \]
As in the first paragraph of this proof, if R is simple and $s\in L^p_2S^0(H)$ satisfies $(s,L_H(s))_{L^2,H} = 0$, then Lemma 3.3(iii) implies that s is zero. This contradicts the assumption $m_{\sigma,\tau}(H)\ne 0$.

Definition 3.2. We say that $M_{\sigma,\tau}$ satisfies the main estimate in $\operatorname{Met}^{p,0}_{2,B}$ if there are constants $C_1, C_2>0$, which only depend on B, such that $\sup|s|\le C_1M_{\sigma,\tau}(H) + C_2$, for all $H = Ke^s\in\operatorname{Met}^{p,0}_{2,B}$, $s\in L^p_2S$.

Proposition 3.2. If R is simple and $M_{\sigma,\tau}$ satisfies the main estimate in $\operatorname{Met}^{p,0}_{2,B}$, then there is a hermitian metric on R satisfying the (σ, τ)-vortex equations. This hermitian metric is unique up to multiplication by a positive constant.

Proof. This result is proved in exactly the same way as in [B, §3.14], so here we only sketch the proof. One first shows that if $M_{\sigma,\tau}(Ke^s)$ is bounded above, then the Sobolev norms $\|s\|_{L^p_2}$ are bounded. One then takes a minimising sequence $\{Ke^{s_j}\}$ for $M_{\sigma,\tau}$, with $s_j\in L^p_2S^0$; then $\|s_j\|_{L^p_2}$ are uniformly bounded, so after passing to a subsequence, $\{s_j\}$ converges weakly in $L^p_2$ to some s. One then sees that $M_{\sigma,\tau}$ is continuous in the weak topology on $\operatorname{Met}^{p,0}_{2,B}$, so $M_{\sigma,\tau}(Ke^{s_j})$ converges to $M_{\sigma,\tau}(Ke^s)$. Thus, $H = Ke^s$ minimises $M_{\sigma,\tau}$. By Proposition 3.1, $m_{\sigma,\tau}(H) = 0$, i.e. H satisfies the vortex equations. By elliptic regularity, H is smooth. The uniqueness of the solution H follows from the convexity of $M_{\sigma,\tau}$ (cf. Lemma 3.3(iii)) and the simplicity of R.

The proof of Theorem 3.1 is therefore reduced to showing that if R is (σ, τ)-stable, then $M_{\sigma,\tau}$ satisfies the main estimate in $\operatorname{Met}^{p,0}_{2,B}$ (this is the content of §3.6).

3.5. Equivalence of $C^0$ and $L^1$ estimates. The following proposition will be used in §3.6.

Proposition 3.3. There are two constants $C_1, C_2>0$, depending on B and σ, such that for all $H = Ke^s\in\operatorname{Met}^{p,0}_{2,B}$, $s\in L^p_2S^0$, $\sup|s|\le C_1\|s\|_{L^1} + C_2$.

Corollary 3.1. $M_{\sigma,\tau}$ satisfies the main estimate in $\operatorname{Met}^{p,0}_{2,B}$ if and only if there are constants $C_1, C_2>0$, which only depend on B, such that $\|s\|_{L^1}\le C_1M_{\sigma,\tau}(H) + C_2$, for all $H = Ke^s\in\operatorname{Met}^{p,0}_{2,B}$, $s\in L^p_2S^0$.

Corollary 3.1 is immediate from Proposition 3.3. To prove Proposition 3.3, we need three lemmas. The first one is due to Donaldson [D3] (see also the proof of [S, Prop. 2.1]).

Lemma 3.4. There exists a smooth function $a : [0,\infty)\to[0,\infty)$, with $a(0) = 0$ and $a(x) = x$ for $x>1$, such that the following is true: For any $\tilde B\in\mathbb{R}$, there is a constant $C(\tilde B)$ such that if f is a positive bounded function on X and $\Delta f\le b$, where b is a function in $L^p(X)$ ($p>n$) with $\|b\|_{L^p}\le\tilde B$, then $\sup|f|\le C(\tilde B)\,a(\|f\|_{L^1})$. Furthermore, if $\Delta f\le 0$, then $\Delta f = 0$.


Lemma 3.5. If $s\in L^p_2S$ and $H = Ke^s\in\operatorname{Met}^p_2$, then $([\phi,\phi^{*H}],s)\ge([\phi,\phi^{*K}],s)$.

Proof. The function $f(\varepsilon) = ([\phi,\phi^{*H_\varepsilon}],s)$ for $\varepsilon\in\mathbb{R}$, where $H_\varepsilon = Ke^{\varepsilon s}$, is increasing, as $df(\varepsilon)/d\varepsilon = |[s,\phi]|^2_{H_\varepsilon}\ge 0$ (cf. (23)). Now, $f(0) = ([\phi,\phi^{*K}],s)$, $f(1) = ([\phi,\phi^{*H}],s)$, so we are done.

Lemma 3.6. If $H = Ke^s\in\operatorname{Met}^p_2$, with $s\in L^p_2S$, then
\[ (m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K),s)\ge\tfrac12\,|\sigma^{1/2}\cdot s|\,\Delta|\sigma^{1/2}\cdot s|, \]
where $\sigma^{1/2}\cdot s\in L^p_2S$ is of course defined by $(\sigma^{1/2}\cdot s)_v = \sigma_v^{1/2}s_v$, for $v\in Q_0$.

Proof. This lemma, and its proof, are similar to (but not completely immediate from) [B, Prop. 3.7.1]. First, Lemma 3.5 and (9) imply
\[ (m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K),s)\ge\sqrt{-1}\Lambda(\sigma\cdot F_H - \sigma\cdot F_K,s) = \sqrt{-1}\Lambda(\sigma\cdot\bar\partial_E(e^{-s}\partial_Ke^s),s), \tag{25} \]
where
\[ (\sigma\cdot\bar\partial_E(e^{-s}\partial_Ke^s),s) = \bar\partial(\sigma\cdot e^{-s}\partial_Ke^s,s) + (\sigma\cdot e^{-s}\partial_Ke^s,\partial_Ks) \tag{26} \]
(for $A_K$ is the Chern connection corresponding to the metric K). To make some local calculations, we choose a local $K_v$-orthogonal basis $\{u_{v,i}\}$ of eigenvectors of $s_v$, for each vertex v, with corresponding eigenvalues $\{\lambda_{v,i}\}$, and let $\{u^{v,i}\}$ be the corresponding dual basis; thus, $s_v = \sum_i\lambda_{v,i}u_{v,i}\otimes u^{v,i}$.

As in [B, (3.36)], a local calculation gives $(e^{-s_v}\partial_{K_v}e^{s_v},s_v) = \tfrac12\partial|s_v|^2$; multiplying by $\sigma_v$ and adding for $v\in Q_0$, we get $(\sigma\cdot e^{-s}\partial_Ke^s,s) = \tfrac12\partial|s'|^2$, where $s' = \sigma^{1/2}\cdot s$. Thus,
\[ \bar\partial(\sigma\cdot e^{-s}\partial_Ke^s,s) = \tfrac12\bar\partial\partial|s'|^2 = |s'|\,\bar\partial\partial|s'| + \bar\partial|s'|\wedge\partial|s'|. \tag{27} \]
From (25), (26), (27) and the equality $\Delta = 2\sqrt{-1}\Lambda\bar\partial\partial$ for the action of the laplacian on 0-forms in a Kähler manifold, we get
\[ (m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K),s)\ge\tfrac12|s'|\,\Delta|s'| + \sqrt{-1}\Lambda(\bar\partial|s'|\wedge\partial|s'|) + \sqrt{-1}\Lambda(\sigma\cdot e^{-s}\partial_Ke^s,\partial_Ks). \]
In the proof of [B, Prop. 3.7.1], there are several local calculations which, although there they are only used for the section $s\in L^p_2S$ defining the metric $H = Ke^s$, are actually valid for any K-selfadjoint section, in particular for $s'\in L^p_2S$. Thus, [B, (3.42)] applied to $s_v$ is
\[ \sqrt{-1}\Lambda(e^{-s_v}\partial_{K_v}e^{s_v},\partial_{K_v}s_v)\ge\sqrt{-1}\Lambda\sum_i(\partial\lambda_{v,i}\wedge\bar\partial\lambda_{v,i}), \]


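and multiplying by $\sigma_v$ and adding for $v\in Q_0$, we get
\[ \sqrt{-1}\Lambda(\sigma\cdot e^{-s}\partial_Ke^s,\partial_Ks)\ge\sqrt{-1}\Lambda\sum_{v,i}(\partial\lambda'_{v,i}\wedge\bar\partial\lambda'_{v,i}), \tag{28} \]
where $\lambda'_{v,i} := \sigma_v^{1/2}\lambda_{v,i}$ are the eigenvalues of $s'_v = \sigma_v^{1/2}s_v$; similarly, [B, (3.43)] applied to $s'$ is
\[ \sqrt{-1}\Lambda\sum_{v,i}(\partial\lambda'_{v,i}\wedge\bar\partial\lambda'_{v,i})\ge\sqrt{-1}\Lambda(\partial|s'|\wedge\bar\partial|s'|) = -\sqrt{-1}\Lambda(\bar\partial|s'|\wedge\partial|s'|). \tag{29} \]
From (27), (28), (29), we obtain $(m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K),s)\ge\tfrac12|s'|\,\Delta|s'|$.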

Proof of Proposition 3.3. Let $\sigma_{\min} = \min\{\sigma_v \mid v\in Q_0\}$, $\sigma_{\max} = \max\{\sigma_v \mid v\in Q_0\}$. Given $H = Ke^s\in\operatorname{Met}^{p,0}_{2,B}$, with $s\in L^p_2S^0$, let $f = |\sigma^{1/2}\cdot s|$ and $b = \sigma_{\min}^{-1/2}(|m_{\sigma,\tau}(H)| + |m_{\sigma,\tau}(K)|)$. We now check that f and b satisfy the hypotheses of Lemma 3.4, for a certain $\tilde B$ which only depends on B. First, $\|b\|_{L^p}\le\sigma_{\min}^{-1/2}(\|m_{\sigma,\tau}(H)\|_{L^p} + \|m_{\sigma,\tau}(K)\|_{L^p})\le\tilde B := \sigma_{\min}^{-1/2}\,2B^{1/p}$. Second, we prove that
\[ \Delta f\le b. \tag{30} \]
At the points where f does not vanish, $|f|^{-1}\le\sigma_{\min}^{-1/2}|s|^{-1}$, so Lemma 3.6 gives
\[ \Delta f\le\sigma_{\min}^{-1/2}|s|^{-1}(m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K),s)\le\sigma_{\min}^{-1/2}|m_{\sigma,\tau}(H) - m_{\sigma,\tau}(K)|\le b, \]
while to consider the points where f vanishes, we just take into account that $\Delta f = 0$ almost everywhere (a.e.) in $f^{-1}(0)\subset X$, and that $b\ge 0$ by its definition, so (30) actually holds a.e. in X. The hypotheses of Lemma 3.4 are thus satisfied, so there exists a constant $C(B)>0$ such that $\sup f\le C(B)\,a(\|f\|_{L^1})$, with $a : [0,\infty)\to[0,\infty)$ as in Lemma 3.4. This estimate can also be written as $\sup f\le C_1\|f\|_{L^1} + C_2$, where $C_1, C_2>0$ only depend on B. Now, $|s|\le\sigma_{\min}^{-1/2}f$ and $f\le\sigma_{\max}^{1/2}|s|$, so
\[ \sup|s|\le\sigma_{\min}^{-1/2}(C_1\|f\|_{L^1} + C_2)\le\sigma_{\min}^{-1/2}(C_1\sigma_{\max}^{1/2}\|s\|_{L^1} + C_2). \]
The estimate is obtained by redefining the constants $C_1$, $C_2$.
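The passage from the bound involving the function a of Lemma 3.4 to a linear bound in the proof of Proposition 3.3 is left implicit; one way to see it is sketched below. The constant A is notation introduced here, and the explicit choice of $C_1$, $C_2$ is only one possibility, not necessarily the one the authors had in mind.

```latex
\begin{align*}
A &:= \max_{0\le x\le 1} a(x) \;\ge\; a(1) = 1
   \qquad(\text{by continuity, since } a(x)=x \text{ for } x>1), \\
a(x) &\le x + A \qquad\text{for all } x\ge 0
   \qquad(a\le A \text{ on } [0,1],\ a(x)=x \text{ on } (1,\infty)), \\
\sup f &\le C(B)\,a(\|f\|_{L^1})
   \;\le\; \underbrace{C(B)}_{C_1}\,\|f\|_{L^1}
        + \underbrace{C(B)\,A}_{C_2} .
\end{align*}
```

Both constants depend only on B through C(B), as required in the statement.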

3.6. Stability implies the main estimate. The following proposition, together with Proposition 3.2, are the key ingredients to complete the proof of Theorem 3.1 (cf. Definition 3.2 for the main estimate).

Proposition 3.4. If R is (σ, τ)-stable, then $M_{\sigma,\tau}$ satisfies the main estimate in $\operatorname{Met}^{p,0}_{2,B}$.

To prove this, we need some preliminaries (Lemmas 3.7–3.10). Let $\{C_j\}_{j=1}^\infty$ be a sequence of constants with $\lim_{j\to\infty}C_j = \infty$.

Lemma 3.7. If $M_{\sigma,\tau}$ does not satisfy the main estimate in $\operatorname{Met}^{p,0}_{2,B}$, then there is a sequence $\{s_j\}_{j=1}^\infty$ in $L^p_2S^0$ with $Ke^{s_j}\in\operatorname{Met}^{p,0}_{2,B}$ (which we can assume to be smooth), such that


(i) $\lim_{j\to\infty}\|s_j\|_{L^1} = \infty$,
(ii) $\|s_j\|_{L^1}\ge C_jM_{\sigma,\tau}(Ke^{s_j})$.

p,0

p,0

Proof. Let b > mσ,τ (K)Lp with b < B, so Met 2,b ⊂ Met 2,B . Thus, if Mσ,τ does p,0

not satisfy the main estimate in Met 2,B , then it does not satisfy the main estimate in

Met 2,b either. We shall prove that for any positive constant C , if there are positive p constants C and N such that sL1 ≤ C Mσ,τ (Kes ) + C whenever s ∈ L2 S 0 with p,0 p,0 Kes ∈ Met 2,b and sL1 ≥ N, then Mσ,τ satisfies the main estimate in Met 2,b . ∞ The lemma follows from this claim by choosing a sequence of constants {Nj }j =1 with p,0

Nj → ∞, and taking Cj and sj ∈ L2 S 0 with Kesj ∈ Met2,b ⊂ Met 2,B , sj L1 ≥ Nj , and sL1 > Cj Mσ,τ (Kesj ) + Cj . Let C , C , N be such that p

p,0

p,0

sL1 ≤ C Mσ,τ (Kes ) + C for sL1 ≥ N. p

p,0

Let SN = {s ∈ L2 S 0 |Kes ∈ Met 2,b and sL1 ≤ N }. By Proposition 3.3, if s ∈ SN , then sup |sv | ≤ sup |s| ≤ C1 sL1 + C2 ≤ C1 N + C2 (here C1 and C2 are not the first elements of the sequence {Cj }∞ j =1 but constants as in Proposition 3.3), so by Lemma 3.2, Mσ,τ is bounded below on SN , i.e. Mσ,τ (Kes ) ≥ −λ for each s ∈ SN , for some constant λ > 0. Thus, sL1 ≤ C (Mσ,τ (Kes ) + λ) + N for each s ∈ SN . Replacing C p by max{C , C λ+N }, we see that sL1 ≤ C Mσ,τ (Kes )+C , for each s ∈ L2 S 0 with p,0 p,0 Kes ∈ Met 2,b . By Corollary 3.1, Mσ,τ satisfies the main estimate in Met 2,b . Finally, p 0 since the set of smooth sections is dense in L2 S , we can always assume that sj is smooth p (we made the choice b < B so that if Kesj is in the boundary mσ,τ (H )Lp ,H = b of

Met2,b , we can still replace sj by a smooth sj with Kesj ∈ Met2,B ). p,0

p,0

Lemma 3.8. Assume that Mσ,τ does not satisfy the main estimate in Met 2,B . Let {sj }∞ j =1 be a sequence as in Lemma 3.7, lj = sj L1 , C(B) = C1 + C2 , where C1 , C2 are as in Proposition 3.3, and uj = sj / lj . Thus, uj L1 = 1 and sup |uj | ≤ C(B). After going p to a subsequence, uj → u∞ weakly in L21 S 0 , for some nontrivial u∞ ∈ L2 S 0 such that if F : R × R → R is a smooth non-negative function such that F (x, y) ≤ 1/(x − y) whenever x > y, and Fε : R × R → R is a smooth non-negative function with Fε (x, y) = 0 whenever x − y ≤ ε, for some fixed ε > 0, then √ (σ · −1 FK , u∞ )L2 + (σ · F (u∞ )∂¯E u∞ , ∂¯E u∞ )L2 +(Fε (s)φ, φ)L2 − (τ · id, u∞ )L2 ≤ 0. p,0

Proof. To prove this inequality, we can assume that F and Fε have compact support (for sup |uj | are bounded, by Lemma 3.3, and the definitions of F (s)∂¯E u∞ and Fε (s)φ only depend on the values of F and Fε at the pairs (λi , λj ) of eigenvalues, as seen in §3.1.6). Now, if F and Fε have compact support then, for large enough l, F (x, y) ≤ l(lx, ly),

Fε (x, y) ≤ l −1 ψ(lx, ly),

where and ψ are defined as in (15) and (20) (cf. the proof of [B, Prop. 3.9.1]). Since lj → ∞, from these inequalities we obtain that for large enough j , (F (uj,v )∂¯E uj,v , ∂¯E uj,v )L2 ≤ l((lj,v uj,v )∂¯E uj,v , ∂¯E uj,v )L2 ,


(Fε (uj )φ, φ)L2 ≤ l −1 (ψ(lj uj )φ, φ)L2 , so Lemma 3.7(iii) applied to si = lj uj , together with Lemma 3.2, give an upper bound √ φ2L2 1 + ≥ lj−1 Mσ,τ (Kelj uj ) + lj−1 φ2L2 ≥ (σ · −1 FK , uj )L2 Cj lj + (σ · F (uj )∂¯E uj , ∂¯E uj )L2 + (Fε (uj )φ, φ)L2 − (τ · id, uj )L2 . As in the proof of [B, Prop. 3.9.1], one can use this upper bound to show that the 2 sequence {uj }∞ j =1 is bounded in L1 . Thus, after going to a subsequence, uj → u∞ in L21 , for some u∞ ∈ L21 S with u∞ L1 = 1, so u∞ is non-trivial. We now prove the estimate for u∞ . First, since sup |uj | ≤ b := C(B), uj → u∞ in L20,b ; applying Lemma 3.1(iii) , one can show (as in the proof of [S, Lemma 5.4]) that √ √ (σ · −1 FK , uj )L2 +(σ ·F (uj )∂¯E uj , ∂¯E uj )L2 approaches (σ · −1 FK , u∞ )L2 + (σ · F (u∞ )∂¯E u∞ , ∂¯E u∞ )L2 as j → ∞. Second, since L21 ⊂ L2 is a compact embedding and actuallly uj ∈ L21,b S ⊂ L20,b S, applying Lemma 3.1(iv) (as in the proof of [B, Prop. 3.9.1]), Fε : L20,b S → L20,b S(End R), u → Fε (u), is continuous on L20,b S, so limj →∞ Fε (uj ) = Fε (u∞ ). Since sup |uj | are bounded, this implies that (Fε (uj )φ, φ)L2 converges to (Fε (u∞ )φ, φ)L2 as j → ∞. Finally, it is clear that (τ · id, uj )L2 → (τ · id, u∞ )L2 as j → ∞. This completes the proof. p,0

Lemma 3.9. If $M_{\sigma,\tau}$ does not satisfy the main estimate in $\operatorname{Met}^{p,0}_{2,B}$, and $u_\infty\in L^p_2S^0$ is as in Lemma 3.8, then the following happens:
(i) The eigenvalues of $u_\infty$ are constant almost everywhere.
(ii) Let the eigenvalues of $u_\infty$ be $\lambda_1,\dots,\lambda_r$. If $F : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ satisfies $F(\lambda_i,\lambda_j) = 0$ whenever $\lambda_i>\lambda_j$, $1\le i,j\le r$, then $F(u_\infty)(\bar\partial_Eu_\infty) = 0$.
(iii) If $F_\varepsilon$ is as in Lemma 3.8, then $F_\varepsilon(u_\infty)\phi = 0$.

Proof. Parts (i) and (ii) are proved as in [UY, Appendix], [S, §§6.3.4 and 6.3.5], or [B, §§3.9.2 and 3.9.3], using Lemma 3.1(ii) for part (i) and the estimate in Lemma 3.8 for part (ii). Part (iii) is similar to [B, Lemma 3.9.4], and again uses the estimate in Lemma 3.8.

We now construct a filtration of quiver subsheaves of R using L2 -subsystems, as in [B, §3.10]. p,0

Lemma 3.10. Assume that Mσ,τ does not satisfy the main estimate in Met 2,B . Let u∞ ∈ p L2 S 0 be as in Lemma 3.8. Let the eigenvalues of u∞ , listed in ascending order, be λ0 < λ1 < · · · < λr . Since u∞ is “σ -trace free” (cf. §3.1.3), there are at least two different eigenvalues, i.e. r ≥ 1. Let p0 , . . . , pr : R → R be smooth functions such that, for j < r, pj (x) = 1 if x ≤ λj , pj (x) = 0 if x ≥ λj +1 , and pr (x) = 1 if x ≤ λr . Let πv : E → Ev be the canonical projections (cf. (4)) and ∂¯E be as in (5). The operators = π ◦ π , for 0 ≤ j ≤ r, satisfy: πr = pj (u∞ ) and πj,v v j (i) πj ∈ L21 S, πj2 = πj = πj∗K and (1 − πj )∂¯E πj = 0, ) ◦ φa ◦ (πj,ta ⊗ idMa ) = 0 for each v ∈ Q0 , (ii) (id −πj,ha (iii) Not all the eigenvalues of u∞ are positive.


Proof. The proof of (i) is as in [S] (right below Lemma 5.6; see also [B, Prop. 3.10.2(i)(iii)]). Part (ii) is similar to, but more involved than, [B, Prop. 3.10.2(iv)], so we now give a detailed proof of this part. For each j , let ε > 0 be such that ε ≤ (λj +1 − λj )/2, and ϕ1 , ϕ2 : R → R be smooth non-negative functions such that ϕ1 (x) = 0 if x ≤ λj +1 −ε/2 and ϕ1 (x) = 1 if x ≥ λj +1 , in the case of ϕ1 ; and ϕ2 (y) = 1 if y ≤ λj and ϕ2 (y) = 0 if y ≥ λj + ε/2, in the case of ϕ2 . Let Fε : R × R → R be given by Fε (x, y) = ϕ1 (x)ϕ2 (y). If Fε (x, y) = 0, then x > λj +1 − ε/2 and y < λj + ε/2, so x − y > λj +1 − λj − ε ≥ ε; thus, Fε satisfies the hypothesis of Lemma 3.9 (iii), so Fε (u∞ )φ = 0. But Fε (u∞ )φ = ϕ1 (u∞ ) ◦ φ ◦ ϕ2 (u∞ ) (cf. (13)), where ϕ1 (u∞ ) = id −πj and ϕ2 (u∞ ) = πj , which completes the proof of part (ii). Finally, part (iii) follows from tr(σ · u∞ ) = 0 and the non-triviality of u∞ . p,0

Proof of Proposition 3.4. Assume that $M_{\sigma,\tau}$ does not satisfy the main estimate in $\operatorname{Met}^{p,0}_{2,B}$. We have to prove that R is not (σ, τ)-stable. By Lemma 3.10 (i), the operators $\pi'_{j,v}$ are weak holomorphic vector subbundles of $E_v$, for $v\in Q_0$ [UY, §4]. Applying the Uhlenbeck–Yau regularity theorem [UY, §7], they represent reflexive subsheaves $E'_{j,v}\subset E_v$, and by Lemma 3.10 (ii), the inclusions $E'_{j,v}\subset E_v$ are compatible with the morphisms $\phi_a$, hence define Q-subsheaves $R'_j = (E'_j,\phi'_j)$ of $R = (E,\phi)$. We thus get a filtration of Q-subsheaves,
\[ 0\to R'_0\to R'_1\to\cdots\to R'_r = R. \]
As in [B, (3.7.2)],
\[ u_\infty = \lambda_0\pi'_0 + \sum_{j=1}^r\lambda_j(\pi'_j - \pi'_{j-1}) = \lambda_r\operatorname{id}_E - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\pi'_j, \]
so the v-component $u_{\infty,v} = u_\infty\circ\pi_v$ of $u_\infty$ is
\[ u_{\infty,v} = \lambda_r\operatorname{id}_{E_v} - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\pi'_{j,v}, \tag{31} \]
(note that it may happen that $\pi'_{j,v} = \pi'_{j+1,v}$ for some v and j). From (14) and $\pi'_{j,v} = p_j(u_{\infty,v})$, $\bar\partial_{E_v}\pi'_{j,v} = dp_j(u_{\infty,v})(\bar\partial_{E_v}u_{\infty,v})$, so
\[ \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)|\bar\partial_{E_v}\pi'_{j,v}|^2 = \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\bigl((dp_j)^2(u_{\infty,v})\bar\partial_{E_v}(u_{\infty,v}),\bar\partial_{E_v}(u_{\infty,v})\bigr) = (F(u_{\infty,v})(\bar\partial_{E_v}u_{\infty,v}),\bar\partial_{E_v}u_{\infty,v}), \tag{32} \]
where $F : \mathbb{R}\times\mathbb{R}\to\mathbb{R}$, defined by $F = \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)(dp_j)^2$, satisfies the conditions of Lemma 3.8 (cf. e.g. the proof of [S, Lemma 5.7]). We make use of the previous calculations to estimate the number
\[ \chi = \operatorname{Vol}(X)\Bigl(\lambda_r\deg_{\sigma,\tau}(R) - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\deg_{\sigma,\tau}(R'_j)\Bigr). \]


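On the one hand, the degree of the subsheaf $E'_{j,v}\subset E_v$ is given by §3.1.5,
\[ \operatorname{Vol}(X)\deg(E'_{j,v}) = (\sqrt{-1}\Lambda F_{K_v},\pi'_{j,v})_{L^2} - \|\bar\partial_{E_v}\pi'_{j,v}\|^2_{L^2}, \]
and this formula, together with Eqs. (31) and (32), imply
\begin{align*}
\chi &= \sum_{v\in Q_0}\sigma_v\Bigl(\sqrt{-1}\Lambda F_{K_v},\ \lambda_r\operatorname{id}_{E_v} - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\pi'_{j,v}\Bigr)_{L^2} + \sum_{v\in Q_0}\sigma_v\sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\|\bar\partial_{E_v}\pi'_{j,v}\|^2_{L^2} \\
&\quad - \sum_{v\in Q_0}\tau_v\operatorname{Vol}(X)\Bigl(\lambda_r\operatorname{rk}(E_v) - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\operatorname{rk}(E'_{j,v})\Bigr) \\
&= (\sigma\cdot\sqrt{-1}\Lambda F_K,u_\infty)_{L^2} + (\sigma\cdot F(u_\infty)(\bar\partial_Eu_\infty),\bar\partial_Eu_\infty)_{L^2} - (\tau\cdot\operatorname{id},u_\infty)_{L^2}.
\end{align*}
It follows from Lemma 3.8 (with $F_\varepsilon = 0$, cf. Lemma 3.9 (iii)), that $\chi\le 0$. On the other hand, if R is (σ, τ)-stable, then $\mu_{\sigma,\tau}(R)>\mu_{\sigma,\tau}(R'_j)$, for $0\le j<r$, and since $u_\infty\in L^p_2S^0$ is "σ-trace free",
\[ \operatorname{tr}(\sigma\cdot u_\infty) = \sum_v\sigma_v\operatorname{tr}(u_\infty\circ\pi_v) = \lambda_r\sum_{v\in Q_0}\sigma_v\operatorname{rk}(E_v) - \sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\sum_{v\in Q_0}\sigma_v\operatorname{rk}(E'_{j,v}) = 0, \]
so we get
\begin{align*}
\chi &= \frac{\operatorname{Vol}(X)}{\sum_{v\in Q_0}\sigma_v\operatorname{rk}(E_v)}\sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\Bigl(\sum_{v\in Q_0}\sigma_v\operatorname{rk}(E'_{j,v})\deg_{\sigma,\tau}(R) - \sum_{v\in Q_0}\sigma_v\operatorname{rk}(E_v)\deg_{\sigma,\tau}(R'_j)\Bigr) \\
&= \operatorname{Vol}(X)\sum_{j=0}^{r-1}(\lambda_{j+1}-\lambda_j)\sum_{v\in Q_0}\sigma_v\operatorname{rk}(E'_{j,v})\bigl(\mu_{\sigma,\tau}(R) - \mu_{\sigma,\tau}(R'_j)\bigr) > 0.
\end{align*}
Therefore, if $M_{\sigma,\tau}$ does not satisfy the main estimate in $\operatorname{Met}^{p,0}_{2,B}$, then R cannot be (σ, τ)-stable.

3.7. Stability implies existence and uniqueness of special metric. Let R = (E, φ) be a (σ, τ)-polystable holomorphic Q-bundle on X. To prove that it admits a hermitian metric satisfying the quiver (σ, τ)-vortex equations, we can assume that R is (σ, τ)-stable, which in particular implies that it is simple. The existence and uniqueness of a hermitian metric satisfying the quiver (σ, τ)-vortex equations is now immediate from Propositions 3.2 and 3.4. Sections 3.2 and 3.7 prove Theorem 3.1.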
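To make the contradiction at the end of the proof of Proposition 3.4 concrete, it may help to write out the shortest possible filtration, r = 1, in which the limit has exactly two eigenvalues and there is a single proper subsheaf. This is an illustrative special case worked out here, not one singled out in the paper; the abbreviations $\rho$ and $\rho'_0$ for the weighted ranks are notation introduced for the sketch.

```latex
\begin{align*}
u_\infty &= \lambda_1\operatorname{id}_E - (\lambda_1-\lambda_0)\,\pi'_0,
 \qquad \lambda_0<\lambda_1, \\
\operatorname{tr}(\sigma\cdot u_\infty)=0
 &\;\Longrightarrow\;
 \lambda_1\,\rho = (\lambda_1-\lambda_0)\,\rho'_0,
 \qquad
 \rho=\sum_{v}\sigma_v\operatorname{rk}(E_v),\quad
 \rho'_0=\sum_{v}\sigma_v\operatorname{rk}(E'_{0,v}), \\
\frac{\chi}{\operatorname{Vol}(X)}
 &= \lambda_1\deg_{\sigma,\tau}(R) - (\lambda_1-\lambda_0)\deg_{\sigma,\tau}(R'_0)
  = (\lambda_1-\lambda_0)\,\rho'_0\,
    \bigl(\mu_{\sigma,\tau}(R)-\mu_{\sigma,\tau}(R'_0)\bigr).
\end{align*}
```

Since Lemma 3.8 forces $\chi\le 0$ while $\lambda_1-\lambda_0$ and $\rho'_0$ are positive, the single subsheaf in the filtration has slope at least that of R, which is incompatible with (σ, τ)-stability.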


4. Yang–Mills–Higgs Functional and Bogomolov Inequality

Let σ, τ be collections of real numbers $\sigma_v$, $\tau_v$, with $\sigma_v>0$, for $v\in Q_0$. Given a smooth complex vector bundle E, let $c_1(E)$ and $\operatorname{ch}_2(E)$ be its first Chern class and second Chern character, respectively. By Chern–Weil theory, if A is a connection on E then $c_1(E)$ (resp. $\operatorname{ch}_2(E)$) is represented by the closed form $\frac{\sqrt{-1}}{2\pi}\operatorname{tr}(F_A)$ (resp. $-\frac{1}{8\pi^2}\operatorname{tr}(F_A^2)$). Define the topological invariants of E,
\[ C_1(E) = \int_Xc_1(E)\wedge\frac{\omega^{n-1}}{(n-1)!} = \frac{1}{2\pi}\int_X\operatorname{tr}(\sqrt{-1}\Lambda F_A)\,\frac{\omega^n}{n!} \tag{33} \]
and
\[ \operatorname{Ch}_2(E) = \int_X\operatorname{ch}_2(E)\wedge\frac{\omega^{n-2}}{(n-2)!} = -\frac{1}{8\pi^2}\int_X\operatorname{tr}(F_A^2)\wedge\frac{\omega^{n-2}}{(n-2)!} \tag{34} \]
(thus, $C_1(E)$ is the degree of E, up to a normalisation factor). Given a holomorphic vector bundle E on X, we denote by $C_1(E)$ and $\operatorname{Ch}_2(E)$ the corresponding topological invariants of its underlying smooth vector bundle.

Theorem 4.1. If R = (E, φ) is a (σ, τ)-stable holomorphic Q-bundle on X, and the $q_a$-selfadjoint endomorphism $\sqrt{-1}\Lambda F_{q_a}$ of $M_a$ is positive semidefinite, for each $a\in Q_1$, then
\[ \sum_{v\in Q_0}\tau_vC_1(E_v)\ge 2\pi\sum_{v\in Q_0}\sigma_v\operatorname{Ch}_2(E_v). \tag{35} \]
If $C_1(E_v) = 0$, $\operatorname{Ch}_2(E_v) = 0$ for all $v\in Q_0$, then the connections $A_{H_v}$ are flat for each $v\in Q_0$, and
\[ \sum_{a\in h^{-1}(v)}\phi_a\circ\phi_a^{*H} - \sum_{a\in t^{-1}(v)}\phi_a^{*H}\circ\phi_a = \tau_v\operatorname{id}_{E_v} \tag{36} \]
for each $v\in Q_0$, where H is a solution of the M-twisted quiver (σ, τ)-vortex equations on R.

Thus, quiver bundles can be useful to construct flat connections. Note that when X is an algebraic variety, (36) means that R is a family of τ-stable Q-modules parametrized by X (cf. [K, §§5, 6]). This theorem is an immediate consequence of the Hitchin–Kobayashi correspondence for holomorphic Q-bundles and Proposition 4.1 below. We shall use the notation introduced in §2.2.

Definition 4.1. The Yang–Mills–Higgs functional $YMH_{\sigma,\tau} : \mathcal{A}\times\Omega^0\to\mathbb{R}$ is defined by
\[ YMH_{\sigma,\tau}(A,\phi) = \sum_{v\in Q_0}\sigma_v\|F_{A_v}\|^2_{L^2} + \sum_{a\in Q_1}\|d_{A_a}\phi_a\|^2_{L^2} + 2\sum_{v\in Q_0}\sigma_v^{-1}\Bigl\|\sum_{a\in h^{-1}(v)}\phi_a\circ\phi_a^{*H} - \sum_{a\in t^{-1}(v)}\phi_a^{*H}\circ\phi_a - \tau_v\operatorname{id}_{E_v}\Bigr\|^2_{L^2}, \]
where $A_a$ is the connection induced by $A_{ta}$, $A_{q_a}$ and $A_{ha}$ on the vector bundle $\operatorname{Hom}(E_{ta}\otimes M_a,E_{ha})$.


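In the following, $\|\cdot\|$ will mean the $L^2$-norm in the appropriate space of sections. Note that in Theorem 4.1 it is assumed that $\sqrt{-1}\,\Lambda F_{q_a}$ is positive semidefinite for each $a \in Q_1$, so it defines a positive semidefinite sesquilinear form on $\Omega^0(\operatorname{Hom}(E_{ta} \otimes M_a, E_{ha}))$ by

$$
(\phi_a, \phi'_a)_{L^2,q_a} = \int_X \operatorname{tr}\big( \phi_a \circ (\mathrm{id}_{E_{ta}} \otimes \sqrt{-1}\,\Lambda F_{q_a}) \circ \phi_a'^{\,*H_a} \big),
$$

for each $\phi_a, \phi'_a \in \Omega^0(\operatorname{Hom}(E_{ta} \otimes M_a, E_{ha}))$. Adding together, we thus get a positive semidefinite sesquilinear form on $\Omega^0$, defined by

$$
(\phi, \phi')_{R,M} = \sum_{a\in Q_1} (\phi_a, \phi'_a)_{L^2,q_a}, \qquad \text{for each } \phi, \phi' \in \Omega^0.
$$

Thus, $\|\phi\|^2_{R,M} := (\phi,\phi)_{R,M} \ge 0$ for each $\phi \in \Omega^0$.

Proposition 4.1. If $(A,\phi) \in \mathcal{A} \times \Omega^0$, with $A_v \in \mathcal{A}^{1,1}_v$ for all $v \in Q_0$, then

$$
\begin{aligned}
\mathrm{YMH}_{\sigma,\tau}(A,\phi) ={}& 4 \sum_{a\in Q_1} \|\bar\partial_{A_a}\phi_a\|^2 + 4\pi \sum_{v\in Q_0} \tau_v C_1(E_v) - 8\pi^2 \sum_{v\in Q_0} \sigma_v \mathrm{Ch}_2(E_v) - \|\phi\|^2_{R,M} \\
&+ \sum_{v\in Q_0} \sigma_v^{-1} \Big\| \sigma_v \sqrt{-1}\,\Lambda F_{A_v} + \sum_{a\in h^{-1}(v)} \phi_a \circ \phi_a^{*H} - \sum_{a\in t^{-1}(v)} \phi_a^{*H} \circ \phi_a - \tau_v\, \mathrm{id}_{E_v} \Big\|^2 .
\end{aligned}
$$

Proof. Before giving the proof, we need several preliminaries. First, note that for any $A_v \in \mathcal{A}^{1,1}_v$,

$$
\|F_{A_v}\|^2 = \|\Lambda F_{A_v}\|^2 - 8\pi^2\, \mathrm{Ch}_2(E_v) \tag{37}
$$

(cf. e.g. [B, Theorem 4.2]). Secondly, we notice that the curvature of $A_a$, for $a \in Q_1$, is given by

$$
F_{A_a}(\phi_a) = F_{A_{ha}} \circ \phi_a - \phi_a \circ (F_{A_{ta}} \otimes \mathrm{id}_{M_a} + \mathrm{id}_{E_{ta}} \otimes F_{q_a}) \tag{38}
$$

where $\phi_a$ is a section of $\operatorname{Hom}(E_{ta} \otimes M_a, E_{ha})$. Finally, since the $(0,1)$-parts of the unitary connections $A_{ta}$, $A_{ha}$ define holomorphic structures, $A_a$ also defines a holomorphic structure on the smooth vector bundle $\operatorname{Hom}(E_{ta} \otimes M_a, E_{ha})$, so it satisfies the Kähler identities

$$
\sqrt{-1}\,[\Lambda, \partial_{A_a}] = -\bar\partial^*_{A_a}, \qquad \sqrt{-1}\,[\Lambda, \bar\partial_{A_a}] = \partial^*_{A_a}.
$$

In particular, the commutator of $\sqrt{-1}\,\Lambda$ with the curvature $F_{A_a} = \partial_{A_a}\bar\partial_{A_a} + \bar\partial_{A_a}\partial_{A_a}$ is

$$
\sqrt{-1}\,[\Lambda, F_{A_a}] = \Delta'_{A_a} - \Delta''_{A_a},
$$

where $\Delta'_A = \partial_A^*\partial_A + \partial_A\partial_A^*$ and $\Delta''_A = \bar\partial_A^*\bar\partial_A + \bar\partial_A\bar\partial_A^*$. When acting on sections $\phi_a$ of $\operatorname{Hom}(E_{ta} \otimes M_a, E_{ha})$, this simplifies to $\sqrt{-1}\,\Lambda F_{A_a}\phi_a = \Delta'_{A_a}\phi_a - \Delta''_{A_a}\phi_a$, so that

$$
(\sqrt{-1}\,\Lambda F_{A_a}\phi_a, \phi_a)_{L^2} = \|\partial_{A_a}\phi_a\|^2 - \|\bar\partial_{A_a}\phi_a\|^2. \tag{39}
$$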


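To prove the proposition, we define

$$
U_v(\phi) = \sum_{a\in h^{-1}(v)} \phi_a \circ \phi_a^{*H} - \sum_{a\in t^{-1}(v)} \phi_a^{*H} \circ \phi_a
$$

for $\phi \in \Omega^0$ and $v \in Q_0$. Then

$$
\begin{aligned}
\sum_{v\in Q_0} \sigma_v^{-1} \big\| \sigma_v \sqrt{-1}\,\Lambda F_{A_v} + U_v(\phi) - \tau_v\, \mathrm{id}_{E_v} \big\|^2
={}& \sum_{v\in Q_0} \sigma_v \|\Lambda F_{A_v}\|^2 + \sum_{v\in Q_0} \sigma_v^{-1} \| U_v(\phi) - \tau_v\, \mathrm{id}_{E_v} \|^2 \\
&+ 2 \sum_{v\in Q_0} (\sqrt{-1}\,\Lambda F_{A_v}, U_v(\phi))_{L^2} - 2 \sum_{v\in Q_0} (\sqrt{-1}\,\Lambda F_{A_v}, \tau_v\, \mathrm{id}_{E_v})_{L^2},
\end{aligned}
$$

where (38), (39) give

$$
\begin{aligned}
\sum_{v\in Q_0} (\sqrt{-1}\,\Lambda F_{A_v}, U_v(\phi))_{L^2}
&= \sum_{a\in Q_1} \big( \sqrt{-1}\,\Lambda F_{A_{ha}} \circ \phi_a - \phi_a \circ (\sqrt{-1}\,\Lambda F_{A_{ta}} \otimes \mathrm{id}_{M_a}),\ \phi_a \big)_{L^2} \\
&= \sum_{a\in Q_1} (\sqrt{-1}\,\Lambda F_{A_a}\phi_a, \phi_a)_{L^2} - \|\phi\|^2_{R,M} \\
&= \sum_{a\in Q_1} \|\partial_{A_a}\phi_a\|^2 - \sum_{a\in Q_1} \|\bar\partial_{A_a}\phi_a\|^2 - \|\phi\|^2_{R,M}.
\end{aligned}
$$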

The proposition now follows from the previous equation, (37), and the definition of $C_1(E_v)$.

Proof of Theorem 4.1. Let $R = (E,\phi)$ be $(\sigma,\tau)$-stable, $H$ the hermitian metric on $R$ satisfying the $(\sigma,\tau)$-vortex equations (cf. Theorem 3.1), and $A \in \mathcal{A}$ the corresponding Chern connection. By Definition 4.1, $\mathrm{YMH}_{\sigma,\tau}(A,\phi) \ge 0$, while from Proposition 4.1, this is $4\pi \sum_{v\in Q_0} \tau_v C_1(E_v) - 8\pi^2 \sum_{v\in Q_0} \sigma_v \mathrm{Ch}_2(E_v) - \|\phi\|^2_{R,M}$, as $\bar\partial_{A_a}\phi_a = 0$ for each $a \in Q_1$ and the last term in Proposition 4.1 vanishes because $H$ satisfies the vortex equations. Since we are assuming $\|\phi\|^2_{R,M} \ge 0$, we obtain (35). Furthermore, if $C_1(E_v) = \mathrm{Ch}_2(E_v) = 0$ for each $v \in Q_0$, then $\mathrm{YMH}_{\sigma,\tau}(A,\phi) = -\|\phi\|^2_{R,M} \le 0$, but this functional is non-negative by Definition 4.1, so $\mathrm{YMH}_{\sigma,\tau}(A,\phi) = 0$. Thus, $F_{A_v} = 0$ and we also obtain (36) for each $v \in Q_0$, again by Definition 4.1.

5. Twisted Quiver Sheaves and Path Algebras

The category of $M$-twisted $Q$-sheaves is equivalent to the category of coherent sheaves of right $A$-modules, where $A$ is a certain locally free $\mathcal{O}_X$-sheaf associated to $Q$ and $M$, the so-called $M$-twisted path algebra of $Q$. This provides an alternative point of view on twisted quiver sheaves which, in certain cases, gives a more algebraic understanding of certain properties of $Q$-sheaves. In particular, it may be a better point of view from which to study the moduli space problem, which we will not address in this paper.

To fix terminology, a locally free (resp. free, coherent) $\mathcal{O}_X$-algebra is a sheaf $S$ of rings which at the same time is a locally free (resp. free, coherent) $\mathcal{O}_X$-module. Given such an $\mathcal{O}_X$-algebra $S$, a locally free (resp. free, coherent) $S$-algebra is a sheaf $A$ of (not necessarily commutative) rings over $S$ which at the same time is a locally free (resp. free, coherent) $\mathcal{O}_X$-module. A coherent right $A$-module is a sheaf of right $A$-modules which at the same time is a coherent $\mathcal{O}_X$-module.

5.1. Coherent sheaves of right A-modules.

Throughout §5.1, we assume that $Q$ is a finite quiver, that is, $Q_0$ and $Q_1$ are both finite. Let $M$ be as in §1.2.

5.1.1. Twisted path algebra.

Let $S = \bigoplus_{v\in Q_0} \mathcal{O}_X \cdot e_v$ be the free $\mathcal{O}_X$-module generated by $Q_0$, where $e_v$ are formal symbols, for $v \in Q_0$. We consider a structure of a commutative $\mathcal{O}_X$-algebra on $S$, defined by $e_v \cdot e_{v'} = e_v$ if $v = v'$, and $e_v \cdot e_{v'} = 0$ otherwise, for each $v, v' \in Q_0$. Let

$$
M = \bigoplus_{a\in Q_1} M_a
$$

be a locally free sheaf of $S$-bimodules, whose left (resp. right) $S$-module structure is given by $e_v \cdot m = m$ if $m \in M_a$ and $v = ha$ (resp. $m \cdot e_v = m$ if $m \in M_a$ and $v = ta$), and $e_v \cdot m = 0$ otherwise (resp. $m \cdot e_v = 0$ otherwise), for each $v \in Q_0$, $a \in Q_1$, $m \in M_a$. The $M$-twisted path algebra of $Q$ is the tensor $S$-algebra of the $S$-bimodule $M$, that is,

$$
A = \bigoplus_{\ell \ge 0} M^{\otimes_S \ell}.
$$

Note that $A$ is a locally free $\mathcal{O}_X$-algebra. Furthermore, since $Q$ is finite, $A$ has a unit

$$
1_A = \bigoplus_{v\in Q_0} e_v. \tag{40}
$$
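The construction above can be made concrete in the simplest situation, which is an assumption of this sketch and not the general setting of the section: $X$ is a point and every twisting bundle $M_a$ is the trivial line, so that $A$ is the ordinary path algebra of $Q$. The class and function names below are hypothetical. A basis path is stored as (head, tail, arrows), and two paths multiply to a non-zero path exactly when the tail of the left factor is the head of the right factor, matching the left and right $S$-module structures on $M$.

```python
from collections import defaultdict


class Quiver:
    """A finite quiver; `arrows` maps an arrow name to (tail, head)."""

    def __init__(self, vertices, arrows):
        self.vertices = list(vertices)
        self.arrows = dict(arrows)

    def idempotent(self, v):
        # e_v: the trivial path at the vertex v.
        return {(v, v, ()): 1}

    def arrow(self, name):
        tail, head = self.arrows[name]
        return {(head, tail, (name,)): 1}

    def unit(self):
        # 1_A is the sum of all e_v, as in (40).
        return {(v, v, ()): 1 for v in self.vertices}


def multiply(x, y):
    """Product in the path algebra; elements are dicts path -> coefficient."""
    out = defaultdict(int)
    for (hx, tx, px), cx in x.items():
        for (hy, ty, py), cy in y.items():
            if tx == hy:  # paths compose only when tail(x) = head(y)
                out[(hx, ty, px + py)] += cx * cy
    return {p: c for p, c in out.items() if c != 0}


# Linear quiver 0 --a--> 1 --b--> 2.
Q = Quiver([0, 1, 2], {"a": (0, 1), "b": (1, 2)})
a, b = Q.arrow("a"), Q.arrow("b")
print(multiply(b, a))  # the path b a from 0 to 2
print(multiply(a, b))  # {} since a and b do not compose in this order
```

With this convention `multiply(Q.unit(), a)` and `multiply(a, Q.unit())` both return `a`, and the `e_v` are orthogonal idempotents, which is the decomposition of the unit used in §5.1.2.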

5.1.2. Coherent A-modules.

We now show that the category of $M$-twisted $Q$-sheaves is equivalent to the category of coherent sheaves of right $A$-modules, or coherent right $A$-modules. This result is a direct generalisation of the corresponding equivalence of categories for quiver modules (cf. e.g. [ARS]).

We define an equivalence functor from the first to the second category. Let $R = (E,\phi)$ be an $M$-twisted $Q$-sheaf. Let $E = \bigoplus_{v\in Q_0} E_v$ as a coherent $\mathcal{O}_X$-module. The structure of right $A$-module on $E$ is given by a morphism of $\mathcal{O}_X$-modules $\mu_A : E \otimes_{\mathcal{O}_X} A \to E$ satisfying the usual axioms defining right modules over an algebra. Let

$$
\pi_v : E \otimes_{\mathcal{O}_X} S = \bigoplus_{v',v''\in Q_0} E_{v'} \otimes_{\mathcal{O}_X} \mathcal{O}_X \cdot e_{v''} \longrightarrow E_v \otimes_{\mathcal{O}_X} \mathcal{O}_X \cdot e_v \cong E_v
$$

be the canonical projection, and $\iota_v : E_v \to E$ the inclusion map, for each $v \in Q_0$. Let $\mu_v = \iota_v \circ \pi_v : E \otimes_{\mathcal{O}_X} S \to E$. The morphism $\mu_S = \sum_{v\in Q_0} \mu_v : E \otimes_{\mathcal{O}_X} S \to E$ defines a structure of right $S$-module on $E$. The tensor product of $E$ and $M$ over $S$ is $E \otimes_S M \cong \bigoplus_{a\in Q_1} E_{ta} \otimes_{\mathcal{O}_X} M_a$; let $\pi_a : E \otimes_S M \to E_{ta} \otimes_{\mathcal{O}_X} M_a$ be the canonical projection, for each $a \in Q_1$. The morphism $\mu_M = \sum_{a\in Q_1} \iota_{ha} \circ \phi_a \circ \pi_a : E \otimes_S M \to E$ is a morphism of $S$-modules. Since $A$ is the tensor $S$-algebra of $M$, $\mu_M$ induces a morphism of $\mathcal{O}_X$-modules $\mu_A : E \otimes_{\mathcal{O}_X} A \to E$ defining a structure of right $A$-module on $E$. This defines the action of the equivalence functor on the objects of the category of $M$-twisted $Q$-sheaves. It is straightforward to construct an action of the functor on morphisms of $M$-twisted $Q$-sheaves, so this defines a functor from the category of $M$-twisted $Q$-sheaves to the category of coherent right $A$-modules.

We now define a functor from the category of coherent right $A$-modules to the category of $M$-twisted $Q$-sheaves, and see that this new functor is an inverse equivalence of the previous functor. Let $E$ be a coherent right $A$-module, with right $A$-module structure morphism $\mu_A : E \otimes_{\mathcal{O}_X} A \to E$. The decomposition (40) is a sum of orthogonal idempotents in $A$ (i.e. $e_v^2 = e_v$, $e_v \cdot e_{v'} = 0$ for $v, v' \in Q_0$ with $v \ne v'$), so $E = \bigoplus_{v\in Q_0} E_v$ with $E_v = \mu_A(E \otimes_{\mathcal{O}_X} \mathcal{O}_X \cdot e_v) \subset E$, for each $v \in Q_0$, and the tensor product of $E$ and $M$ over $S$ is $E \otimes_S M = \bigoplus_{a\in Q_1} E_{ta} \otimes_{\mathcal{O}_X} M_a$. The restriction of $\mu_A$ to $E \otimes_{\mathcal{O}_X} M$ induces a morphism of $S$-modules $\mu_M : E \otimes_S M \to E$. The image of $E_{ta} \otimes_{\mathcal{O}_X} M_a$ under $\mu_M$ is therefore in $E_{ha}$, hence defines a morphism of $\mathcal{O}_X$-modules $\phi_a : E_{ta} \otimes_{\mathcal{O}_X} M_a \to E_{ha}$, for each $a \in Q_1$. This defines a functor from the category of coherent right $A$-modules to the category of $M$-twisted $Q$-sheaves. It is straightforward to define the action of this functor on morphisms and to prove that this functor and the previous one are inverse equivalences of categories. This completes the proof of the following:

Proposition 5.1. The category of coherent right $A$-modules is equivalent to the category of $M$-twisted $Q$-sheaves on $X$.

6. Examples

6.1. Higgs bundles.

Let $X$ be a Riemann surface. A Higgs bundle on $X$ is a pair $(E,\Phi)$, where $E$ is a holomorphic vector bundle over $X$ and $\Phi \in H^0(\operatorname{End}(E) \otimes K)$ is a holomorphic endomorphism of $E$ twisted by the canonical bundle $K$ of $X$. The quiver here consists of one vertex and one arrow whose head and tail coincide, and the twisting bundle is the dual of the canonical line bundle of $X$, i.e. the holomorphic tangent bundle $TX$ of $X$. This quiver, and the twisting bundle attached to its arrow, is represented in Fig. 1. The Higgs bundle $(E,\Phi)$ is stable if the usual slope stability condition $\mu(E') < \mu(E)$ is satisfied for all proper $\Phi$-invariant subbundles $E'$ of $E$.
The existence theorem of Hitchin and Simpson [H, S] says that $(E,\Phi)$ is polystable if and only if there exists a hermitian metric $H$ on $E$ satisfying

$$
F_H + [\Phi, \Phi^*] = -\sqrt{-1}\,\mu\, \mathrm{id}_E\, \omega, \tag{41}
$$

where $\omega$ is the Kähler form on $X$, $\mathrm{id}_E$ is the identity on $E$, and $\mu$ is a constant. Note that taking the trace in (41) and integrating over $X$ we get $\mu = \mu(E)$.

There are many reasons why Higgs bundles are of interest, one of the most important of which is the fact that there is a bijective correspondence between isomorphism classes of polystable Higgs bundles of degree zero on $X$ and isomorphism classes of semisimple complex representations of the fundamental group of $X$. This fact is derived from a combination of the theorem of Hitchin and Simpson mentioned above and an existence theorem for equivariant harmonic metrics proved by Donaldson [D3] and Corlette [C]. This correspondence can also be used to study representations of $\pi_1(X)$ in non-compact real Lie groups. In particular, by considering the group $\mathrm{U}(p,q)$ one obtains another interesting example of a twisted quiver bundle. To identify this quiver we observe that there is a homeomorphism between the moduli space of semisimple representations of $\pi_1(X)$ in $\mathrm{U}(p,q)$ and the moduli space of polystable zero degree Higgs bundles $(E,\Phi)$ of the form

$$
E = V \oplus W, \qquad \Phi = \begin{pmatrix} 0 & \beta \\ \gamma & 0 \end{pmatrix}, \tag{42}
$$

where $V$ and $W$ are holomorphic vector bundles on $X$ of rank $p$ and $q$, respectively, $\beta \in H^0(\operatorname{Hom}(W,V) \otimes K)$ and $\gamma \in H^0(\operatorname{Hom}(V,W) \otimes K)$.

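The corresponding quiver, with the twisting bundle attached to each arrow, is represented in Fig. 2. Now, for this twisted quiver bundle one can consider the general quiver equations. Although they only coincide with Hitchin's equations (41) for a particular choice of the parameters, it turns out that the other values are very important to study the topology of the moduli of representations of $\pi_1(X)$ into $\mathrm{U}(p,q)$ [BGG1].

Fig. 1. [Figure not reproduced: the quiver with a single vertex $E$ and a single loop arrow, with twisting bundle $TX$.]

Fig. 2. [Figure not reproduced: the quiver with two vertices $V$ and $W$ and two arrows $\beta$ and $\gamma$ between them, each with twisting bundle $TX$.]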

A very important tool to study topological properties of Higgs bundle moduli spaces, and hence moduli spaces of representations of the fundamental group, is to consider the $\mathbb{C}^*$-action on the moduli space given by multiplying the Higgs field by a non-zero scalar. A point $(E,\Phi)$ is a fixed point of the $\mathbb{C}^*$-action if and only if it is a variation of Hodge structure, that is,

$$
E = F_1 \oplus \cdots \oplus F_m \tag{43}
$$

for holomorphic vector bundles $F_i$ such that the restriction $\Phi_i := \Phi|_{F_i} \in H^0(\operatorname{Hom}(F_i, F_{i+1}) \otimes K)$. A variation of Hodge structure is therefore a twisted quiver bundle, whose twisting bundles are $M_a = TX$, for the infinite quiver represented in Fig. 3.

Fig. 3. Variations of Hodge structure. [Figure not reproduced: an infinite chain of vertices joined by consecutive arrows, each with twisting bundle $TX$.]

One can generalize the notion of Higgs bundle to consider twistings by a line bundle other than the canonical bundle. These also have very interesting geometry [GR].

6.2. Quiver bundles and dimensional reduction.

Quiver bundles and their vortex equations appear naturally in the context of dimensional reduction. To explain this, consider the manifold $X \times G/P$, where $X$ is a compact Kähler manifold, $G$ is a connected simply connected semisimple complex Lie group and $P \subset G$ is a parabolic subgroup, i.e. $G/P$ is a flag manifold. The group $G$ (and hence, its maximal compact subgroup $K \subset G$) acts trivially on $X$ and in the standard way on $G/P$. The Kähler structure on $X$ together with a $K$-invariant Kähler structure on $G/P$ define a product Kähler structure on $X \times G/P$.


We now consider a $G$-equivariant vector bundle over $X \times G/P$ and study $K$-invariant solutions to the Hermitian–Einstein equations. It turns out that these invariant solutions correspond to special solutions to the quiver vortex equations on a certain quiver bundle over $X$, where the quiver is determined by the parabolic subgroup $P$. In [AG1] we studied the case in which $G/P = \mathbb{P}^1$, the complex projective line, which is obtained as the quotient of $G = \mathrm{SL}(2,\mathbb{C})$ by the subgroup of lower triangular matrices, generalizing previous work in [G1, G2, BG]. The general case has been studied in [AG2]. We will just mention here some of the main results and refer the reader to the above-mentioned papers.

A key fact is the existence of a quiver $Q$ with relations $K$ naturally associated to the subgroup $P$. A relation of the quiver is a formal complex linear combination $r = \sum_j c_j p_j$ of paths $p_j$ of the quiver (i.e. $c_j \in \mathbb{C}$), and a path in $Q$ is a sequence $p = a_0 \cdots a_m$ of arrows $a_j \in Q_1$ which compose, i.e. with $ta_{j-1} = ha_j$ for $1 \le j \le m$:

$$
p:\quad \bullet \xrightarrow{\ a_m\ } \bullet \xrightarrow{\ a_{m-1}\ } \cdots \xrightarrow{\ a_0\ } \bullet \tag{44}
$$
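The convention in (44), where $a_m$ is traversed first and $a_0$ last, is easy to get backwards, so a small sketch may help. It assumes a quiver representation by plain matrices (lists of rows) over a point, with no twisting, and evaluates a path by composing the maps with $a_m$ applied first, which is the rule $\phi(p) = \phi_{a_0} \circ \cdots \circ \phi_{a_m}$ used later in this section for $(Q,K)$-bundles. All names and the sample square are hypothetical.

```python
def is_path(arrows, p):
    """p = [a_0, ..., a_m] is a path when tail(a_{j-1}) == head(a_j)."""
    return all(arrows[p[j - 1]][0] == arrows[p[j]][1] for j in range(1, len(p)))


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def evaluate_path(phi, arrows, p):
    """Return phi(a_0) o ... o phi(a_m), so that a_m acts first."""
    if not is_path(arrows, p):
        raise ValueError("arrows do not compose")
    result = phi[p[0]]
    for a in p[1:]:
        result = matmul(result, phi[a])
    return result


def evaluate_relation(phi, arrows, relation):
    """Evaluate sum_j c_j phi(p_j) for relation = [(c_j, p_j), ...]."""
    total = None
    for c, p in relation:
        m = [[c * x for x in row] for row in evaluate_path(phi, arrows, p)]
        total = m if total is None else [
            [s + t for s, t in zip(r1, r2)] for r1, r2 in zip(total, m)]
    return total


# A commuting square: arrows map name -> (tail, head).
arrows = {"x1": ("A", "B"), "y2": ("B", "D"), "y1": ("A", "C"), "x2": ("C", "D")}
phi = {"x1": [[1, 2]], "y2": [[3]], "y1": [[3, 6]], "x2": [[1]]}
relation = [(1, ["y2", "x1"]), (-1, ["x2", "y1"])]
print(evaluate_relation(phi, arrows, relation))  # prints: [[0, 0]]
```

Here the list `["y2", "x1"]` plays the role of $a_0 a_1$: the tail of `y2` is the head of `x1`, so `x1` is applied first. Reversing the list makes `is_path` return `False`.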

The set of vertices of the quiver associated to $P$ coincides with the set of irreducible representations of $P$. The arrows and relations are obtained by studying certain isotypical decompositions related to the nilradical of the Lie algebra of $P$. For example, for $\mathbb{P}^1$, $\mathbb{P}^1 \times \mathbb{P}^1$ and $\mathbb{P}^2$, the quiver is the disjoint union of two copies of the quivers in Figs. 4, 5 and 6, respectively.

Fig. 4. $G/P = \mathbb{P}^1$ [Figure not reproduced.]

Fig. 5. $G/P = \mathbb{P}^1 \times \mathbb{P}^1$ [Figure not reproduced: a planar quiver with horizontal and vertical arrows, labelled $a^{(1)}$ and $a^{(2)}$.]

Fig. 6. $G/P = \mathbb{P}^2$ [Figure not reproduced: a planar quiver with horizontal and vertical arrows, labelled $a^{(1)}$ and $a^{(2)}$.]

In the case of the quiver associated to $\mathbb{P}^1$, the set of relations is empty, while for the quivers associated to $\mathbb{P}^1 \times \mathbb{P}^1$ and $\mathbb{P}^2$, the relations $r_\lambda$ are given by

$$
r_\lambda = a^{(2)}_{\lambda - L_1} a^{(1)}_\lambda - a^{(1)}_{\lambda - L_2} a^{(2)}_\lambda,
$$

where $\lambda \in \mathbb{Z}^2$ is a vertex, $L_1$ and $L_2$ are the canonical basis of $\mathbb{C}^2$, and $a^{(j)}_\lambda : \lambda \to \lambda - L_j$ are the arrows going out from $\lambda$, for $j = 1, 2$.

Given a set $K$ of relations of the quiver $Q$, a holomorphic $(Q,K)$-bundle (with no twisting bundles $M_a$) is defined as a holomorphic $Q$-bundle $R = (E,\phi)$ which satisfies the relations $r = \sum_j c_j p_j$ in $K$, i.e. such that $\sum_j c_j \phi(p_j) = 0$, where $\phi(p) : E_{ta_m} \to E_{ha_0}$ is defined for any path (44) as the composition $\phi(p) := \phi_{a_0} \circ \cdots \circ \phi_{a_m}$. Let $(Q,K)$ be the quiver with relations associated to $P$. One has an equivalence of categories

$$
\left\{ \begin{array}{c} \text{coherent } G\text{-equivariant} \\ \text{sheaves on } X \times G/P \end{array} \right\} \longleftrightarrow \big\{ (Q,K)\text{-sheaves on } X \big\}.
$$

The holomorphic $G$-equivariant vector bundles on $X \times G/P$ and the holomorphic $(Q,K)$-bundles on $X$ are in correspondence by this equivalence. Thus, the category of $G$-equivariant holomorphic vector bundles on $X \times (\mathbb{P}^1)^2$ and $X \times \mathbb{P}^2$ is equivalent to the category of commutative diagrams of holomorphic quiver bundles on $X$ for the corresponding quiver $Q$ in Figs. 5 and 6.

If we now fix a total order in the set of vertices, any coherent $G$-equivariant sheaf $F$ on $X \times G/P$ admits a $G$-equivariant sheaf filtration

$$
\mathbf{F} :\ 0 \hookrightarrow F_0 \hookrightarrow F_1 \hookrightarrow \cdots \hookrightarrow F_m = F, \qquad F_s / F_{s-1} \cong p^* E_{\lambda_s} \otimes q^* \mathcal{O}_{\lambda_s}, \quad 0 \le s \le m, \tag{45}
$$

where $\{\lambda_0, \lambda_1, \ldots, \lambda_m\}$ is a finite subset of vertices, listed in ascending order, $E_0, \ldots, E_m$ are non-zero coherent sheaves on $X$ with trivial $G$-action, and $\mathcal{O}_{\lambda_s}$ is the homogeneous bundle over $G/P$ corresponding to the representation $\lambda_s$. The maps $p$ and $q$ are the canonical projections from $X \times G/P$ to $X$ and $G/P$, respectively. If $F$ is a holomorphic $G$-equivariant vector bundle, then $E_0, \ldots, E_m$ are holomorphic vector bundles.

The appropriate equation to consider on a filtered bundle [AG1] is a deformation of the Hermitian–Einstein equation which involves as many parameters $\tau_0, \tau_1, \ldots, \tau_m \in \mathbb{R}$ as there are steps in the filtration, and has the form

$$
\sqrt{-1}\,\Lambda F_h = \begin{pmatrix} \tau_0 I_0 & & & \\ & \tau_1 I_1 & & \\ & & \ddots & \\ & & & \tau_m I_m \end{pmatrix}, \tag{46}
$$

where the right-hand side is a diagonal matrix, written in blocks corresponding to the splitting which a hermitian metric $h$ defines in the filtration $\mathbf{F}$. If $\tau_0 = \cdots = \tau_m$, then (46) reduces to the Hermitian–Einstein equation. As for the ordinary Hermitian–Einstein equation, the existence of invariant solutions to the $\tau$-Hermitian–Einstein equation (46) on an equivariant holomorphic filtration is related to a stability condition for the equivariant holomorphic filtration which naturally involves the parameters.

Let $F$ be a $G$-equivariant holomorphic vector bundle on $X \times G/P$. Let $\mathbf{F}$ be the $G$-equivariant holomorphic filtration associated to $F$ and $R = (E,\phi)$ be its corresponding holomorphic $(Q,K)$-bundle on $X$, where $(Q,K)$ is the quiver with relations associated to $P$. Then $\mathbf{F}$ has a $K$-invariant solution to the $\tau$-deformed Hermitian–Einstein equations if and only if the vector bundles $E_\lambda$ in $R$ admit hermitian metrics $H_\lambda$ on $E_\lambda$, for each vertex $\lambda$ with $E_\lambda \ne 0$, satisfying

$$
\sqrt{-1}\, n_\lambda \Lambda F_{H_\lambda} + \sum_{a\in h^{-1}(\lambda)} \phi_a \circ \phi_a^* - \sum_{a\in t^{-1}(\lambda)} \phi_a^* \circ \phi_a = \tau'_\lambda\, \mathrm{id}_{E_\lambda}, \tag{47}
$$

where $n_\lambda$ is the multiplicity of the irreducible representation corresponding to the vertex $\lambda$ and the $\tau'_\lambda$ are related to the $\tau_\lambda$ by the choice of the $K$-invariant metric on $G/P$.

It is not difficult to show that the stability of the filtration coincides with the stability of the quiver bundle, where the parameters $\sigma_\lambda$ in the general stability condition for a quiver bundle equal the integers $n_\lambda$. This, together with the derivation of the equations by dimensional reduction, provides an alternative proof of the Hitchin–Kobayashi correspondence for these special quiver bundles. Although the quiver bundles obtained by dimensional reduction on $X \times G/P$ are not twisted, it seems that twisting may appear if one considers dimensional reduction on more general $G$-manifolds; this is something to which we plan to come back in the future.

Acknowledgements. This research has been partially supported by the Spanish MEC under the grants PB98–0112 and BFM2000-0024. The research of L.A. was partially supported by the Comunidad Autónoma de Madrid (Spain) under a FPI Grant, and by a UE Marie Curie Fellowship (MCFI-200100308). The authors are members of VBAC (Vector Bundles on Algebraic Curves), which is partially supported by EAGER (EC FP5 Contract no. HPRN-CT-2000-00099) and by EDGE (EC FP5 Contract no. HPRN-CT-2000-00101). We also want to thank the Erwin Schrödinger International Institute for Mathematical Physics for the hospitality and the support during the final preparation of the paper.

References

[AB] Atiyah, M.F., Bott, R.: The Yang–Mills equations over Riemann surfaces. Philos. Trans. Roy. Soc. Lond. Ser. A 308, 523–615 (1982)
[AG1] Álvarez-Cónsul, L., García-Prada, O.: Dimensional reduction, SL(2, C)-equivariant bundles and stable holomorphic chains. Internat. J. Math. 12, 159–201 (2001)
[AG2] Álvarez-Cónsul, L., García-Prada, O.: Dimensional reduction and quiver bundles. J. reine angew. Math. 556, 1–46 (2003)
[ARS] Auslander, M., Reiten, I., Smalø, S.O.: Representation Theory of Artin Algebras. Cambridge Studies in Advanced Mathematics 36, Cambridge: Cambridge Univ. Press, 1995
[B] Bradlow, S.B.: Special metrics and stability for holomorphic bundles with global sections. J. Diff. Geom. 33, 169–214 (1991)
[Ba] Banfield, D.: Stable pairs and principal bundles. Quart. J. Math. Oxford 51, 417–436 (2000)
[BG] Bradlow, S.B., García-Prada, O.: Stable triples, equivariant bundles and dimensional reduction. Math. Ann. 304, 225–252 (1996)
[BGG1] Bradlow, S.B., García-Prada, O., Gothen, P.B.: Representations of the fundamental group of a surface in PU(p, q) and holomorphic triples. C. R. Acad. Sci. Paris Sér. I Math. 333, 347–352 (2001)
[BGG2] Bradlow, S.B., García-Prada, O., Gothen, P.B.: Surface group representations, Higgs bundles, and holomorphic triples. e-print arXiv:math.AG/0206012
[BGK1] Bradlow, S.B., Glazebrook, J.F., Kamber, F.W.: Reduction of the Hermitian–Einstein equation on Kähler fiber bundles. Tohoku Math. J. 51, 81–123 (1999)
[BGK2] Bradlow, S.B., Glazebrook, J.F., Kamber, F.W.: The Hitchin–Kobayashi correspondence for twisted triples. Internat. J. Math. 11, 493–508 (1999)
[C] Corlette, K.: Flat G-bundles with canonical metrics. J. Diff. Geom. 28, 361–382 (1988)
[D1] Donaldson, S.K.: Anti self-dual Yang–Mills connections over complex algebraic surfaces and stable vector bundles. Proc. Lond. Math. Soc. 3, 1–26 (1985)
[D2] Donaldson, S.K.: Infinite determinants, stable bundles and curvature. Duke Math. J. 54, 231–247 (1987)
[D3] Donaldson, S.K.: Twisted harmonic maps and the self-duality equations. Proc. London Math. Soc. (3) 55, 127–131 (1987)
[DK] Donaldson, S.K., Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford Science Publications, Oxford: Clarendon Press, 1990
[G1] García-Prada, O.: Invariant connections and vortices. Commun. Math. Phys. 156, 527–546 (1993)
[G2] García-Prada, O.: Dimensional reduction of stable bundles, vortices and stable pairs. Internat. J. Math. 5, 1–52 (1994)
[GK] Gothen, P.B., King, A.D.: Homological algebra of quiver bundles. e-print arXiv:math.AG/0202033
[Go] Gothen, P.B.: The Betti numbers of the moduli space of stable rank 3 Higgs bundles. Internat. J. Math. 5, 861–875 (1994)
[GR] García-Prada, O., Ramanan, S.: Twisted Higgs bundles and the fundamental group of compact Kähler manifolds. Math. Res. Letts. 7, 1–18 (2000)
[H] Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. (3) 55, 59–126 (1987)
[K] King, A.D.: Moduli of representations of finite dimensional algebras. Quart. J. Math. Oxford 45, 515–530 (1994)
[KN] Kempf, G., Ness, L.: On the lengths of vectors in representation spaces. Springer LNM 732, Berlin-Heidelberg-New York: Springer, 1982, pp. 233–243
[M] Mundet i Riera, I.: A Hitchin–Kobayashi correspondence for Kaehler fibrations. J. reine angew. Math. 528, 41–80 (2000)
[NS] Narasimhan, M.S., Seshadri, C.S.: Stable and unitary vector bundles on a compact Riemann surface. Ann. Math. 82, 540–564 (1965)
[S] Simpson, C.: Constructing variations of Hodge structure using Yang–Mills theory and applications to uniformization. J. Amer. Math. Soc. 1, 867–918 (1988)
[Th] Thaddeus, M.: Stable pairs, linear systems and the Verlinde formula. Invent. Math. 117, 317–353 (1994)
[UY] Uhlenbeck, K.K., Yau, S.T.: On the existence of Hermitian–Yang–Mills connections on stable bundles over compact Kähler manifolds. Comm. Pure and Appl. Math. 39–S, 257–293 (1986); 42, 703–707 (1989)

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 238, 35–51 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0867-8


Enhanced Gauge Symmetry and Braid Group Actions

Balázs Szendrői (1, 2)

1 Department of Mathematics, Utrecht University, PO. Box 80010, 3508 TA Utrecht, The Netherlands. E-mail: [email protected]
2 Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, PO. Box 127, 1364 Budapest, Hungary

Received: 28 October 2002 / Accepted: 9 December 2002 Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: Enhanced gauge symmetry appears in Type II string theory (as well as F- and M-theory) compactified on Calabi–Yau manifolds containing exceptional divisors meeting in Dynkin configurations. It is shown that in many such cases, at enhanced symmetry points in moduli a braid group acts on the derived category of sheaves of the variety. This braid group covers the Weyl group of the enhanced symmetry algebra, which itself acts on the deformation space of the variety in a compatible way. Extensions of this result are given for nontrivial B-fields on K3 surfaces, explaining physical restrictions on the B-field, as well as for elliptic fibrations. The present point of view also gives new evidence for the enhanced gauge symmetry content in the case of a local $A_{2n}$-configuration in a threefold having global $\mathbb{Z}/2$ monodromy.

Introduction

The phenomenon that Type II string theory compactified on a Calabi–Yau manifold can exhibit enhanced gauge symmetry was first observed in the physics literature in the context of K3 surfaces [23, 1]. The existence of non-perturbatively enhanced symmetry algebras is forced by the duality between heterotic string theory on $T^4$ and the Type IIA string on K3, since the former obviously has enhanced symmetry at special points in moduli. It was found that a K3 surface can have enhanced gauge symmetry if it has rational double point (ADE) singularities, and the type of the (simply-laced) non-abelian Lie algebra that appears precisely matches that of the singularity. The argument for non-abelian gauge symmetry was later extended to Calabi–Yau threefolds in [2] and [16], for threefolds with a curve of ADE singularities. In the presence of monodromy, non-simply laced Lie algebras can also appear. These symmetries and the arising representations have also been analyzed in the context of M- and F-theory (see [13, 3] and references therein).

The purpose of the present paper is to give a mathematical interpretation of a "holomorphic shadow" of this symmetry; namely, of the parameters needed to specify a string vacuum, I will only concentrate on the complex structure and B-field parameters, ignoring the Kähler structure. In particular, by moving in the Kähler moduli space, I can resolve the singularities mentioned in the previous paragraph, and work with smooth K3 surfaces and Calabi–Yau threefolds, containing ADE configurations of rational curves and configurations of ruled surfaces respectively. The phenomenon that I will illustrate by several theorems is that enhanced gauge symmetry can occur at points in complex moduli when the derived category of the corresponding Calabi–Yau manifold has a large set of autoequivalences. Moreover, these derived equivalences always satisfy the relations of a generalized braid group, which covers the Weyl group of the enhanced gauge symmetry Lie algebra. When one deforms the complex parameters, these autoequivalences deform away to equivalences of derived categories between different manifolds; this is always governed by a Weyl group action on the deformation space. In particular, one can phrase the results of this paper as saying that the category of topological D-branes on a Calabi–Yau compactification (cf. [8]) has an extra braid group worth of symmetries at enhanced gauge symmetry points, not present at generic points in moduli.

Braid group actions for groups of Type A (and DE) on derived categories were first constructed in [19]. In Sect. 3 of the present paper I will show how to extend these actions in two dimensions (K3 surfaces) to cover deformations, and how this fits into the framework of enhanced gauge symmetry. The autoequivalences will be generalized to cover deformations with nonzero B-field; in particular I will derive the restrictions on the B-field found in [1] by a duality argument. Calabi–Yau threefolds, as mentioned before, can exhibit gauge symmetries of all types $A, \ldots, G_2$. Corresponding braid group actions are constructed in [22]. I explain in Sect.
4 the main points of the construction, referring back to the (easier) surface case. I also give some examples, including an amusing projective example exhibiting non-trivial monodromy, and make some comments related to the interpretation of the actions as enhanced gauge symmetry.

The threefolds appearing in this paper represent the simplest case of enhanced gauge symmetry, that of "uniform singularities" or geometrically ruled surfaces (no hypermultiplets in physics-speak). In case there are extra rational curves in fibers, the mathematics is more complicated (compare for example [13]); dissident curves can be flopped, there are many more autoequivalences and derived equivalences around, and it appears to be difficult to formulate a clean statement. However, for one highly singular situation studied for example in [3, Sect. 4], the ideas of the present paper are strong enough to provide supporting evidence (though alas not a proof) for the gauge symmetry content. The argument is spelled out in Remark 10.

The paper begins with two introductory sections: Section 1 recalls reflection groups and (generalized) braid groups, whereas Sect. 2 deals with (families of) equivalences of derived categories. The latter section contains a statement which may be of independent interest, connecting deformations of a Fourier–Mukai functor of a Calabi–Yau variety with its action on cohomology. Section 5 points out an extension of the results to elliptic fibrations and braid groups of affine type which may be interesting from the point of view of F-theory, whereas Sect. 6 poses a challenge for symplectic geometry via mirror symmetry.

1. Reflection Groups and Generalized Braid Groups

A Dynkin diagram $\Delta$ in this paper means an irreducible finite type diagram corresponding to a finite root system in a real Euclidean inner product space $\mathfrak{h}_{\mathbb{R}}$. It is well known that such diagrams can be of type $A_n$, $B_n$, $C_n$, $D_n$, $E_6$, $E_7$, $E_8$, $F_4$ or $G_2$. The root system defines a finite reflection group $W = \langle r_i \rangle$ acting on $\mathfrak{h}_{\mathbb{R}}$, generated by a set of reflections $r_1, \ldots, r_n$ indexed by nodes of $\Delta$, equivalently by a set of simple roots. As an abstract group,

$$
W \cong \big\langle\, r_i : i \in \mathrm{Nodes}(\Delta) \ \big|\ r_i^2 = 1,\ (r_i r_j)^{m_{ij}} = 1 \,\big\rangle
$$

with one relation for every node $i$ and one for every pair of different nodes $(i,j)$ with label $m_{ij}$. The group $W$ also acts on the complex vector space $\mathfrak{h} = \mathfrak{h}_{\mathbb{R}} \otimes \mathbb{C}$. Define the (generalized) braid group (also called Artin group) $B$ by generators and relations as

$$
B = \big\langle\, R_i : i \in \mathrm{Nodes}(\Delta) \ \big|\ \underbrace{R_i R_j \cdots}_{m_{ij}} = \underbrace{R_j R_i \cdots}_{m_{ij}} \,\big\rangle \tag{1}
$$

with one relation for every pair of different nodes $(i,j)$ of $\Delta$, the braid relation. There is a group homomorphism $B \to W$ sending $R_i$ to $r_i$. As an example, in the familiar case of type $A_n$ the group $W$ is the symmetric group on $(n+1)$ letters, whereas $B$ is the classical braid group on $(n+1)$ strings.

2. Families of Derived Equivalences

If $X$ is a smooth projective variety, let $D^b(X)$ denote the bounded derived category of coherent sheaves on $X$. A kernel (derived correspondence) between smooth projective varieties $X_1$, $X_2$ is an object $U \in D^b(X_1 \times X_2)$. Such an object defines a functor $\Phi^U : D^b(X_2) \to D^b(X_1)$ by

$$
\Phi^U(-) = Rp_{1*}\big( U \overset{L}{\otimes} p_2^*(-) \big),
$$

with $p_i : X_1 \times X_2 \to X_i$ the projections. If $\Phi^U$ is an equivalence of triangulated categories, then it is called a Fourier–Mukai functor and $U$ is said to be invertible.

Suppose that $\pi : \mathcal{X} \to S$ is a smooth family of projective varieties over a complex base $S$. A relative kernel is a pair $(U, \varphi)$, where

– $\varphi : S \to S$ is an automorphism, giving rise to the fibre product diagram

$$
\begin{array}{ccc}
\mathcal{X} \times_\varphi \mathcal{X} & \longrightarrow & \mathcal{X} \\
\big\downarrow & & \big\downarrow {\scriptstyle \pi} \\
\mathcal{X} & \xrightarrow{\ \varphi\circ\pi\ } & S
\end{array}
$$

– and $U \in D^b(\mathcal{X} \times_\varphi \mathcal{X})$ is an object in the derived category of the product.


There is a map $\mathcal{X} \times_\phi \mathcal{X} \to S$ with fibre $X_s \times X_{\phi(s)}$ over $s \in S$. The (derived) restriction of $U$ to this fibre gives a kernel $U_s = Ly_s^*(U) \in D^b(X_s \times X_{\phi(s)})$, where $y_s : X_s \times X_{\phi(s)} \to \mathcal{X} \times_\phi \mathcal{X}$ is the inclusion. Hence a relative kernel defines a family of functors $\Psi_s = \Psi^{U_s} : D^b(X_{\phi(s)}) \to D^b(X_s)$. In the present paper, a relative kernel $(U, \phi)$ will be called invertible if for all $s \in S$ the functor $\Psi_s$ is a Fourier–Mukai functor. Every invertible relative kernel gives a family of Fourier–Mukai transforms over the base $S$.

The next statement is in some sense auxiliary, but it encompasses the point of view of the present article. Let $X$ be a projective K3 surface or Calabi–Yau threefold. Let $\pi : \mathcal{X} \to S$ be a family of projective deformations of $X$ over a polydisc $S$, with $\pi^{-1}(0) \cong X$ for $0 \in S$. Assume that the Kodaira–Spencer map $\psi : T_0 S \to H^1(X, \Theta_X)$ of the family is injective. Let $U_0 \in D^b(X \times X)$ be an invertible kernel on $X$ giving rise to a Fourier–Mukai functor $\Psi = \Psi^{U_0}$ on $X$. Using the Mukai map from the derived category to cohomology (see for example [7, Sect. 3.1]), there is an induced isomorphism $\psi : H^*(X, \mathbb{C}) \to H^*(X, \mathbb{C})$ preserving Hodge structures (in the sense of Mukai for the K3 case). In particular, $H^{n,0}$ is preserved, where $n$ is the dimension of $X$; so if $\Omega \in H^0(X, \Omega^n_X)$ is a holomorphic top-form then its image $\psi(\Omega)$ is also a holomorphic top-form (a constant multiple of $\Omega$).

Theorem 1. Assume that there is an invertible relative kernel $(U, \phi)$ on $\mathcal{X} \to S$ with $\phi(0) = 0$ extending $U_0$. Then there is a commutative diagram

$$\begin{array}{ccc}
T_0(S) & \xrightarrow{\ d\phi|_0\ } & T_0(S) \\
\downarrow{\scriptstyle \psi} & & \downarrow{\scriptstyle \psi} \\
H^1(X, \Theta_X) & & H^1(X, \Theta_X) \\
\downarrow{\scriptstyle \wedge \psi(\Omega)} & & \downarrow{\scriptstyle \wedge \Omega} \\
H^1(X, \Omega^{n-1}_X) & & H^1(X, \Omega^{n-1}_X) \\
\downarrow & & \downarrow \\
H^*(X, \mathbb{C}) & \xrightarrow{\ \psi\ } & H^*(X, \mathbb{C})
\end{array}$$

where the last vertical maps are the inclusions coming from Hodge theory.

This statement may look complicated, but it says something very simple. Suppose you have a Fourier–Mukai functor $\Psi$ on $X$. The action of $\Psi$ on cohomology gives rise, via Hodge theory, to a map on the base of the local deformation space of $X$. Then the only way to extend $\Psi$ over a deformation family of $X$ is to a relative kernel whose action $\phi$ on the base is compatible with the map defined by $\Psi$. In particular, unless $\Psi$ acts trivially on the local deformation space, it will never extend to a family of autoequivalences ($\phi = \mathrm{id}_S$) in a family of deformations of $X$.


Proof of Theorem 1. Once the statement is properly formulated, the proof is not very difficult. Note that the family $\Psi^{U_s}$ of Fourier–Mukai functors gives rise to an isomorphism of local systems

$$\bigoplus_n R^n(\phi \circ \pi)_*(\mathbb{C}_{\mathcal{X}}) \cong \bigoplus_n R^n\pi_*(\mathbb{C}_{\mathcal{X}})$$

on $S$ (basically just a continuous family of cohomology isomorphisms), which preserves Hodge filtrations. Now use the fact that the period map of the family is injective (since the Kodaira–Spencer map of $\pi$ is, and $X$ is Calabi–Yau), and unwind the definition of the derivative of the period map at $0 \in S$.

3. K3 Surfaces with ADE Configurations

Let $\bar{Y}$ be a projective K3 surface with a du Val (rational double point) singularity at a point $p \in \bar{Y}$ and no other singularities. Let $g : Y \to \bar{Y}$ be its smooth K3 resolution with exceptional locus $E = E_1 \cup \dots \cup E_r$. It is well known that each component $E_i$ is a smooth rational curve of self-intersection $-2$, hence it defines a reflection

$$r_i : \omega \mapsto \omega + (E_i \cdot \omega)\, E_i \qquad (2)$$

on $H^2(Y, \mathbb{C})$. The intersection graph of the curves $\{E_i\}$ is a Dynkin diagram $\Delta$ of type ADE, and as the notation suggests, the maps $r_j$ generate an action of the reflection group $W_\Delta$ on $H^2(Y, \mathbb{C})$.

Proposition 2. There exists a family $e : \mathcal{Y} \to Z$ of projective deformations of $e^{-1}(0) \cong Y$ over a complex polydisc $0 \in Z$, with an action of the finite group $W_\Delta$ on the base $Z$, such that the following properties hold:
(i) there is a proper subset $Z_i \subset Z$ such that $s \in Z_i$ if and only if the fibre $Y_s$ contains a smooth rational curve which is a deformation of $E_i \subset Y$;
(ii) for every $s \in Z$, there is a contraction morphism $Y_s \to \bar{Y}_{i,s}$, which contracts the deformation of $E_i$ in $Y_s$ if $s \in Z_i$ and is an isomorphism otherwise;
(iii) the fixed locus of $r_i$ on $Z$ equals $Z_i$; and
(iv) for $w \in W_\Delta$ and $s \in Z$, the fibres $Y_s$, $Y_{w(s)}$ are isomorphic.

Proof. This can be proved using the language of lattice-polarized K3 surfaces [10]. Let $M$ be the orthogonal complement of $E_1, \dots, E_n$ in the Picard group of $Y$, or any sublattice thereof containing the cohomology class of an ample divisor on $Y$; since $\bar{Y}$ was assumed projective, such $M$ exist. Consider the local moduli space $\mathcal{Y} \to Z$ of $M$-polarized K3 surfaces [10] with central fibre $Y = e^{-1}(0)$ for $0 \in Z$, a smooth family of projective K3 surfaces. Since $Z$ is small, the second cohomology $H^2(Y_s, \mathbb{Z})$ can be identified across the family. The base $Z$ is isomorphic, using the Kodaira–Spencer map, to a small disc around the origin in $N \otimes \mathbb{C}$, where $N$ is the orthogonal complement of $M$ in $\mathrm{Pic}(Y)$. Since $M$ does not include the class $E_i$, the class $E_i \in H^2(Y_s, \mathbb{Z})$ is algebraic (and represented by a rational curve) if and only if $s \in Z_i$ for a subvariety $Z_i \subset Z$. It is easy to see that the $W_\Delta$-action on $H^2(Y, \mathbb{C})$ preserves $N \otimes \mathbb{C}$, and hence $W_\Delta$ can be made to act on $Z$. The isomorphisms $Y_s \cong Y_{w(s)}$ come from the Torelli theorem, since these surfaces have isomorphic Hodge structure. Finally, the fact that $Z_i$ is exactly the fixed locus of $r_i$ is just chasing definitions.
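The claim that the maps (2) satisfy the relations of the reflection group can be checked numerically on the sublattice spanned by the exceptional curves. The following is a minimal sketch in Python, not part of the original argument: it assumes only that the curves E_i are (−2)-curves with intersection number 1 for adjacent nodes of the diagram and 0 otherwise, and all function names are hypothetical. It verifies that each r_i is an involution and that r_i r_j has order 3 for adjacent nodes and order 2 otherwise, which are the relations of Sect. 1 for a simply laced diagram.

```python
def intersection_matrix(edges, n):
    """Intersection form of n (-2)-curves meeting along the given Dynkin edges."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = -2
    for i, j in edges:
        m[i][j] = m[j][i] = 1
    return m


def reflection(m, i):
    """Matrix of r_i : w -> w + (E_i . w) E_i in the basis E_1, ..., E_n.

    Column b holds the coordinates of r_i(E_b)."""
    n = len(m)
    r = [[int(a == b) for b in range(n)] for a in range(n)]
    for b in range(n):
        r[i][b] += m[i][b]  # r_i(E_b) = E_b + (E_i . E_b) E_i
    return r


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def order(mat, bound=60):
    """Multiplicative order of an integer matrix (raises if above bound)."""
    n = len(mat)
    ident = [[int(a == b) for b in range(n)] for a in range(n)]
    power, k = mat, 1
    while power != ident:
        power, k = matmul(power, mat), k + 1
        if k > bound:
            raise ValueError("order exceeds bound")
    return k


def satisfies_coxeter_relations(edges, n):
    m = intersection_matrix(edges, n)
    refl = [reflection(m, i) for i in range(n)]
    if any(order(r) != 2 for r in refl):
        return False
    for i in range(n):
        for j in range(i + 1, n):
            expected = 3 if m[i][j] == 1 else 2
            if order(matmul(refl[i], refl[j])) != expected:
                return False
    return True


# Example diagrams, nodes numbered from 0.
A3 = ([(0, 1), (1, 2)], 3)
D4 = ([(0, 1), (1, 2), (1, 3)], 4)
E6 = ([(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)], 6)
```

Calling `satisfies_coxeter_relations(*A3)` (and likewise for `D4`, `E6`) runs the check. The computation only sees the span of the E_i, so it illustrates the relations rather than the full action on H^2(Y, C).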

Next I want to define relative kernels on $\mathcal{Y} \to Z$, indexed by nodes of the diagram $\Delta$. By (ii) above, for a node $i$ of $\Delta$ and $s \in Z$ there is a contraction $Y_s \to \bar{Y}_{i,s}$ which contracts $E_i$ if $s \in Z_i$ and is an isomorphism otherwise. There is a diagram

$$\begin{array}{ccccc}
 & & \tilde{Y}_{i,s} & & \\
 & \swarrow & & \searrow & \\
Y_s & & & & Y_{r_i(s)} \\
 & \searrow & & \swarrow & \\
 & & \bar{Y}_{i,s} & &
\end{array}$$

where $\tilde{Y}_{i,s}$ is the fibre product of the two contractions. This fibre product can be thought of as a subscheme of the product $Y_s \times Y_{r_i(s)}$; it is the “correspondence variety” on the product (pairs of points mapping to the same image). If $s \in Z \setminus Z_i$, then $\tilde{Y}_{i,s}$ is simply the diagonal in $Y_s \times Y_{r_i(s)}$ with respect to the isomorphism $Y_s \cong Y_{r_i(s)}$. On the other hand, if $s \in Z_i$, then $E_i \subset Y_s$ is a rational curve, and $\tilde{Y}_{i,s}$ has two components: one is the diagonal, and the other one is $E_i \times E_i \cong \mathbb{P}^1 \times \mathbb{P}^1$. The components intersect along the diagonal $E_i$. In any case, set $U_{i,s} = \mathcal{O}_{\tilde{Y}_{i,s}} \in D^b(Y_s \times Y_{r_i(s)})$ to be the (pushforward of the) structure sheaf of this correspondence subscheme. It is possible to show (see [22, Theorem 4.1] for the case of threefolds) that the kernels $U_{i,s}$ are restrictions to the fibres of a relative kernel $(U_i, r_i)$ on $\mathcal{Y} \to Z$.

Theorem 3. For every node $i$ of $\Delta$, the relative kernel $(U_i, r_i)$ is invertible: for $s \in Z$, the kernel $U_{i,s}$ defines a Fourier–Mukai functor

$$\Psi_{i,s} = \Psi^{U_{i,s}} : D^b(Y_{r_i(s)}) \xrightarrow{\ \sim\ } D^b(Y_s)$$

such that for a pair of nodes $(i, j)$ of $\Delta$, there is an isomorphism of functors

$$\underbrace{\Psi_{i,s} \circ \Psi_{j,r_i(s)} \circ \cdots}_{m_{ij}} \cong \underbrace{\Psi_{j,s} \circ \Psi_{i,r_j(s)} \circ \cdots}_{m_{ij}} : D^b(Y_{r_{ij}(s)}) \longrightarrow D^b(Y_s), \qquad (3)$$

where $r_{ij} = \underbrace{r_i \circ r_j \circ \cdots}_{m_{ij}} = \underbrace{r_j \circ r_i \circ \cdots}_{m_{ij}} \in W_\Delta$.

Hence the derived category $D^b(Y)$ carries an action of the braid group $B_\Delta$, and this action deforms to an action of $B_\Delta$ by a family of derived equivalences over the deformation space $\mathcal{Y} \to Z$ of $Y$.

Proof. The point $s = 0 \in Z$ is fixed by all $r_i$, and in this case the theorem is a re-statement of a special case of [19, Theorem 1.2]. In more detail, as proved in [22, Lemma 4.6], for $s = 0 \in Z$ the functors $\Psi_{i,0}$ are just the twist functors of [19] with respect to the spherical sheaves $\mathcal{O}_{E_i}(-1)$ on $Y = Y_0$. The relations (3) were proved in [19]. Hence mapping the braid group generator $R_i$ to the autoequivalence $\Psi_{i,0}$ defines an action of $B_\Delta$ on $D^b(Y)$. For arbitrary $s \in Z$, the fact that $\Psi_{i,s}$ is invertible is easy: if $s \in Z_i$ then it is still a twist functor; otherwise its kernel is the structure sheaf of the diagonal in $Y_s \times Y_{r_i(s)}$ under the isomorphism $Y_s \cong Y_{r_i(s)}$, and hence it is clearly invertible. The relation (3) can be proved using the method of [22], which does the more complicated case of threefolds. The point is that the kernels for the composites on both sides of the relation (3) can be proved to


be structure sheaves; for a general point $s \in Z$ they are both isomorphic to the structure sheaf of the diagonal in $Y_s \times Y_{r_{ij}(s)}$ under the isomorphism $Y_s \cong Y_{r_{ij}(s)}$, and from this a specialization argument concludes that the two kernels are isomorphic everywhere. In particular, this gives an independent proof in this case of the braid relations on the central fibre $Y$.

It is known from [1] that (for appropriate values of the Kähler form) Type II string theory on the surface $Y$ exhibits enhanced gauge symmetry. The braid group action in Theorem 3 is a holomorphic shadow of this enhanced gauge symmetry: the derived category of $Y$ has a braid group worth of autoequivalences covering the Weyl group of the nonperturbative gauge symmetry algebra, which deform to equivalences between different varieties under a deformation of its complex structure. In other words, at the enhanced gauge symmetry points the derived automorphism group of $Y$ (the group of symmetries of the category of topological D-branes) is larger than that of its deformations.

I next extend Theorem 3 and its interpretation as enhanced gauge symmetry to gerbe deformations, also known as nonzero B-fields. I take the most simple-minded definition, according to which the B-field is a class $B \in H^2(Y, \mathbb{R}/\mathbb{Z})$. A B-field can be used to twist the derived category of coherent sheaves of the K3 surface $Y$ as follows. Consider the natural map

$$\delta : H^2(Y, \mathbb{R}/\mathbb{Z}) \to H^2(Y, \mathcal{O}_Y^*) \qquad (4)$$

coming from the exponential sequence. The class $\beta = \delta(B) \in H^2(Y, \mathcal{O}_Y^*)$ gives a gerbe on $Y$, and there is a notion of a sheaf over this gerbe (also called a $\beta$-twisted sheaf on $Y$). One wants to define the “derived category of $\beta$-twisted sheaves on $Y$” with some finiteness condition. If the class $B$ is torsion in $H^2(Y, \mathbb{R}/\mathbb{Z})$, then the usual notion of coherence generalizes, and one obtains [7] a triangulated category $D^b(Y, B)$ with properties very similar to those of $D^b(Y)$. In the general case there does not seem to be an accepted definition, though see [14, Remark 2.6] for discussion. The following statement is therefore formulated for the case of torsion B-fields; I certainly expect it to hold in general.

Theorem 4. Let $B \in H^2(Y, \mathbb{Q}/\mathbb{Z})$ be a torsion B-field. Then for every vertex $i$ of $\Delta$, there is a family of twisted Fourier–Mukai functors

$$\Psi_{i,s,B} : D^b(Y_{r_i(s)}, r_i(B)) \xrightarrow{\ \sim\ } D^b(Y_s, B) \qquad (5)$$

deforming the functor $\Psi_{i,s,0} = \Psi_{i,s}$. Here $W_\Delta$ acts on $H^2(Y, \mathbb{R}/\mathbb{Z})$ via its action on $H^2(Y, \mathbb{R})$.

Proof. Let $p_1$, $p_2$ denote the projections of $Y_s \times Y_{r_i(s)}$ onto its factors. A twisted functor (5) needs, by [7, Sect. 3.1], a kernel $V_{i,s} \in D^b(Y_s \times Y_{r_i(s)},\ p_2^*(r_i(B)) - p_1^*(B))$ (note that I am using additive notation for classes in cohomology with values in $\mathbb{Q}/\mathbb{Z}$).

Recall the correspondence variety $\tilde{Y}_{i,s}$ in $Y_s \times Y_{r_i(s)}$ with respect to the $i$th contraction. The sheaf $U_{i,s}$ was defined as the structure sheaf of this correspondence; more precisely, if $k : \tilde{Y}_{i,s} \to Y_s \times Y_{r_i(s)}$ is the inclusion, then $U_{i,s} = k_* \mathcal{O}_{\tilde{Y}_{i,s}}$. Let

$$\tilde{B} = p_2^*(r_i(B)) - p_1^*(B).$$

Note that by [7, Theorem 2.2.6], there is a twisted pushforward functor

$$k_* : D^b(\tilde{Y}_{i,s}, k^*(\tilde{B})) \to D^b(Y_s \times Y_{r_i(s)}, \tilde{B}).$$

I claim that the structure sheaf of the scheme $\tilde{Y}_{i,s}$ is naturally a sheaf on $\tilde{Y}_{i,s}$ over the gerbe defined by the class $k^*(\tilde{B})$. This implies that the kernel $U_{i,s} = k_* \mathcal{O}_{\tilde{Y}_{i,s}}$ can be thought of as a sheaf on $Y_s \times Y_{r_i(s)}$ over the gerbe corresponding to $\tilde{B}$, and hence it can be used to define the twisted functor (5).

To prove the claim, I distinguish two cases. First assume $s \in Z_i$. Then $E_i$ deforms to $Y_s$ and, as I said above, $\tilde{Y}_{i,s}$ has two components: one is $E_i \times E_i$ and the other one is $Y_s$, the diagonal. It is enough to show that the structure sheaf of either component is a sheaf over the gerbe coming from $k^*(\tilde{B})$ restricted to that component. But one component $E_i \times E_i$ is simply the quadric surface, which has a trivial Brauer group, and hence there is nothing to prove. On the other component, $k^*(\tilde{B})|_{Y_s} = (B \cdot E_i) E_i$. Now the point is that since $s \in Z_i$, $E_i$ is an algebraic class on $Y_s$, hence the class $k^*(\tilde{B})|_{Y_s}$ defines the trivial gerbe (see Remark 5 for the argument). Hence again, the structure sheaf is a sheaf over this gerbe!

Next assume that $s \in Z \setminus Z_i$. Then there is an isomorphism $Y_s \cong Y_{r_i(s)}$. It can be shown that this isomorphism induces the map $r_i$ on second cohomology. On the other hand, $\tilde{Y}_{i,s}$ is in this case irreducible and isomorphic to the diagonal; moreover, $\tilde{B}$ pulls back to the trivial gerbe over this diagonal. Hence the structure sheaf is again a sheaf over the gerbe defined by $k^*(\tilde{B})$.

The fact that the kernel $U_{i,s}$ defines an equivalence of categories can be proved using [7, Theorem 3.2.1], which generalizes the criterion of Bridgeland [5, Theorems 5.1 and 5.4]; I omit the details.

Remark 5. The statement of Theorem 4 involves a subtlety concerning the $W_\Delta$-action on gerbes. On the central fibre $Y$, all cohomology classes $E_i$ are algebraic. On the other hand, the map $H^2(Y, \mathbb{R}) \to H^2(Y, \mathcal{O}_Y^*)$ factors through $H^2(Y, \mathcal{O}_Y)$ and by Hodge theory, the image of $E_i \in H^2(Y, \mathbb{R})$ in $H^2(Y, \mathcal{O}_Y)$ is zero. This implies that $B$ and $r_i(B)$ give the same gerbe on $Y$. However, for generic $Y_s$ the classes $E_i$ are transcendental, and $B$, $r_i(B)$ are different gerbes. In fact, Theorem 4 should be complemented by a statement that there is no family of equivalences $D^b(Y_{r_i(s)}, B) \to D^b(Y_s, B)$. The family of sheaves $\{U_{i,s}\}$ is certainly not appropriate, since as the proof above shows, $\tilde{B}$ gives a nontrivial gerbe for $s \in Z \setminus Z_i$ exactly because the class $[E_i]$ is transcendental on $Y_s$. Indeed I expect that the only possible way to deform the equivalence $\Psi_{i,s}$ in the B-field direction is that compatible with its cohomology action; in other words, there is an analogue of Theorem 1 for gerbe deformations. I have no idea how to prove this statement.

I wish to offer the following interpretation of Theorem 4: Type II string theory on $(Y, B)$ has enhanced gauge symmetry (for appropriate values of the Kähler parameter) if and only if the derived category $D^b(Y, B)$ admits a set of twisted autoequivalences, which deform to twisted Fourier–Mukai functors between different points in moduli when the complex structure and B-field parameters are deformed. Theorem 4, together


with Remark 5, says that this is the case if and only if $r_i(B) = B$ for all $i$, in other words if and only if $E_i \cdot B = 0$ for all exceptional curves. Note that this condition on the B-field is identical to that of [1, p. 4], found by an analysis involving heterotic/Type II duality.

4. Calabi–Yau Threefolds Containing Ruled Surfaces

Let $\bar{X}$ be a projective threefold with a curve of singularities $B = \mathrm{Sing}(\bar{X}) \hookrightarrow \bar{X}$, such that along the curve $\bar{X}$ has du Val singularities of uniform ADE type. The iterated blowup of the singular locus $f : X \to \bar{X}$ is a resolution of singularities. Locally over a point $p \in B \subset \bar{X}$, the fibre of $f$ is a set of rational curves as before, intersecting according to the appropriate ADE type Dynkin diagram. However, globally there may be monodromy (see Fig. 1): as $p$ moves over the curve $B$, the configuration of curves may be permuted according to a diagram symmetry of the Dynkin diagram. It is well known that quotients of ADE Dynkin diagrams by (subgroups of) their automorphism groups are non-simply laced Dynkin diagrams in a well-defined sense. Concretely, the action of $\mathbb{Z}/2$ on the diagrams $A_{2n+1}$, $D_n$ and $E_6$ gives, respectively, the diagrams $C_{n+1}$, $B_{n-1}$ and $F_4$, whereas the action of $\mathbb{Z}/3$ and the symmetric group on three letters leads to the diagram $G_2$. The group $\mathbb{Z}/2$ also acts on the diagram $A_{2n}$; this is a special case which I exclude from consideration, though see Remark 10 below.

Globally therefore, the exceptional locus of $f : X \to \bar{X}$ consists of a set of smooth geometrically ruled surfaces $\{\pi_j : D_j \to B_j\}$ intersecting in a Dynkin configuration $\Delta$,

Fig. 1. Dynkin diagrams and configurations of surfaces (diagrams A2, A3, D4, D4 with resulting configurations A2, C2, B3, G2)


which may or may not be simply laced. If $\Delta$ is simply laced then each $B_j$ is isomorphic to $B$, whereas in the general case each $B_j$ is an unramified cover of $B$ of the appropriate degree.

As in the case of surfaces, I want to describe some deformations of the threefold $X$. In the local case (when one restricts attention to a neighbourhood of the exceptional surfaces), this problem is studied in detail in [22, Sect. 2]. Globally there may be some obstructions to realizing all local deformations as actual projective deformations of $X$. In simple cases (see below) it can be checked that the deformations I describe actually exist. The next proposition therefore should be considered a kind of “ideal scenario” statement.

Proposition 6. Let $X$ be the Calabi–Yau threefold constructed above, with a set of exceptional surfaces $\{\pi_j : D_j \to B_j\}$ indexed by nodes of a Dynkin configuration $\Delta$, which may or may not be simply laced. Assume that $X$ has good deformation theory. Then the universal family of (projective Calabi–Yau) deformations $e : \mathcal{X} \to S$ of $X = e^{-1}(0)$ over a polydisc $0 \in S$ carries an action of the reflection group $W_\Delta$ on its base $S$; moreover, the following properties hold.
(i) For every $s \in S$, there is a contraction $f_s : X_s \to \bar{X}_s$ deforming the contraction $f$.
(ii) There is an analytic subset $S_j \subset S$ of codimension equal to the genus of $B_j$, such that $s \in S_j$ if and only if the fibre $X_s$ contains a smooth ruled surface in the exceptional locus of $f_s$ which is a deformation of $D_j \subset X$.
(iii) The fixed locus of $r_j$ on $S$ equals $S_j$.
(iv) For $w \in W_\Delta$ and $s \in S$, the fibres $X_s$, $X_{w(s)}$ are birational.
Assume moreover that the genus $g$ of $B$ is at least one, and $s \in S$ is a general point in the base. Then
(v) The exceptional locus of $X_s \to \bar{X}_s$ consists of rational $(-1, -1)$-curves, coming in sets of $l(2g - 2)$ naturally indexed by positive roots of $\Delta$ ($l$ is the squared length of a root).
(vi) For $w \in W_\Delta$ and $s \in S$, the birational map $X_s \dashrightarrow X_{w(s)}$ flops some of these curves.
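As a small illustration of the indexing in (v), the positive roots of a diagram can be enumerated from its Cartan matrix by repeatedly applying simple reflections. The sketch below (Python; the function names are hypothetical, and the Cartan matrices are standard inputs rather than data from the text) lists the positive roots in the simple-root basis and evaluates the count l(2g − 2) appearing in (v). The normalization of the squared length l is not fixed in this section, so it is left as an explicit parameter and is an assumption of whoever calls the function.

```python
def positive_roots(cartan):
    """Positive roots as coefficient tuples in the simple-root basis.

    Convention (an assumption of this sketch): cartan[j][i] is the pairing
    of the simple root alpha_j with the coroot of alpha_i, so the simple
    reflection is s_i(beta) = beta - <beta, alpha_i^vee> alpha_i.
    """
    n = len(cartan)
    simple = [tuple(int(i == j) for j in range(n)) for i in range(n)]
    roots = set(simple)
    frontier = list(simple)
    while frontier:
        beta = frontier.pop()
        for i in range(n):
            pairing = sum(beta[j] * cartan[j][i] for j in range(n))
            new = tuple(b - pairing * int(k == i) for k, b in enumerate(beta))
            # keep only roots with non-negative coefficients
            if all(c >= 0 for c in new) and new not in roots:
                roots.add(new)
                frontier.append(new)
    return sorted(roots)


def curves_per_positive_root(squared_length, genus):
    """The number l(2g - 2) from (v); the normalization of l is left to the caller."""
    return squared_length * (2 * genus - 2)


CARTAN_A2 = [[2, -1], [-1, 2]]
CARTAN_C2 = [[2, -1], [-2, 2]]
```

For `CARTAN_A2` the function returns the three roots with coefficient vectors (0, 1), (1, 0) and (1, 1), which correspond to the classes β, α and α + β labelled in Fig. 2. For genus one the count l(2g − 2) vanishes whatever the normalization of l, in line with the remark that for g = 1 the contraction f_s is an isomorphism for general s.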
Note that in the central fibre, the exceptional locus of $f_s$ consists of a set of surfaces indexed by simple roots (nodes) of $\Delta$. In the general fibre (assuming genus at least two), the exceptional set of $f_s$ is a set of curves indexed by positive roots of $\Delta$. Figure 2 illustrates the case $\Delta = A_2$, $g = 2$. Note also that the deformation theory of $X$ is very different if the genus of $B$ is zero. In that case, the $W_\Delta$-action is trivial ($S_j = S$ for all $j$ and hence every generator fixes $S$) and the exceptional locus is always two-dimensional. For higher genus the $W_\Delta$-action is non-trivial and the general exceptional locus is one-dimensional. The case $g = 1$ is also somewhat special: in that case, for general $s \in S$, the contraction $f_s : X_s \to \bar{X}_s$ is an isomorphism, which is reminiscent of the surface case. This distinction is discussed further below.

Fig. 2. The root system of A2 and exceptional loci for g = 2 (roots α, β, α + β; curve classes [α], [β], [α + β] in the fibres X and X_s over the points 0 and s of the base S, with the loci S1 and S2)

The next statement is the exact analogue of Theorem 3.

Theorem 7. For every node $j$ of $\Delta$, there is a family of Fourier–Mukai functors

$$\Psi_{j,s} : D^b(X_{r_j(s)}) \xrightarrow{\ \sim\ } D^b(X_s)$$

such that for a pair of nodes $(i, j)$ of $\Delta$, there is an isomorphism of functors

$$\underbrace{\Psi_{i,s} \circ \Psi_{j,r_i(s)} \circ \cdots}_{m_{ij}} \cong \underbrace{\Psi_{j,s} \circ \Psi_{i,r_j(s)} \circ \cdots}_{m_{ij}} : D^b(X_{r_{ij}(s)}) \longrightarrow D^b(X_s), \qquad (6)$$

where $r_{ij} = \underbrace{r_i \circ r_j \circ \cdots}_{m_{ij}} = \underbrace{r_j \circ r_i \circ \cdots}_{m_{ij}} \in W_\Delta$.

Hence the derived category $D^b(X)$ carries an action of the braid group $B_\Delta$, and this action deforms to an action of $B_\Delta$ by a family of equivalences over the deformation space $\mathcal{X} \to S$ of $X$.


Proof. The proof, given in detail in [22, Sect. 4], is similar to that of Theorem 3. The individual functors $\Psi^{U_{j,s}}$ are defined using a diagram

$$X_s \longrightarrow \bar{X}_s \longleftarrow X_{r_j(s)}.$$

For $s \in S_j$, the functor turns out to be a special case of a functor written down by Horja in [11, (4.31)], and proved invertible in [12]. The proof of the braid relations relies, as before, on a specialization argument.

According to [2, 16, 3] and references cited in these works, threefolds $X$ of the above type (for suitable values of the Kähler form) exhibit enhanced gauge symmetry. Theorem 7 is a holomorphic shadow of this symmetry: the derived category of $X$ has a braid group worth of autoequivalences covering the Weyl group of the gauge algebra, which for genus at least one deforms away to a set of equivalences between different deformations. In particular, the derived automorphism group of $X$ is larger than generic at these enhanced symmetry points.

It is interesting to consider the case when the curve $B$ has genus zero. In this case, the projective threefold $X$ has no deformations where the surfaces deform away. The braid group still acts on the derived category of $X$, but it also acts as a set of derived autoequivalences on all deformations. Hence nothing gets “enhanced”. This phenomenon was also observed in the physics literature: as explained in [16, p. 2], enhanced gauge symmetry needs that $B$ is not rational; if $B \cong \mathbb{P}^1$ then the symmetry is only present in the limit when the area of $B$ goes to infinity [2]. The lack of deformations is also an issue in the proof of the braid relations in [22]; the proof proceeds via decomposing $X$ locally into a union of two pieces $X_1 \cup X_2$, so that both contain ruled surfaces over the affine line $\mathbb{A}^1$ and have enough deformations. Decomposing $\mathbb{P}^1$ into a union of two lines is here the mathematical equivalent of taking the area of the $\mathbb{P}^1$ to infinity.

Examples 8. Varieties $\bar{X}$ with a curve of singularities of uniform type $A_n$ can be found among hypersurfaces or complete intersections in weighted projective spaces; compare for example [16]. The resolution $X$ is then embedded in a (partial) resolution of the ambient space, typically with $n$ distinct divisors over the relevant singular locus; hence the configuration in $X$ is still of type $A_n$. It can often be shown by concrete methods that the deformation theory of these threefolds is good in the sense needed for Proposition 6 to hold. Such varieties can be systematically searched for and in low codimension classified using the graded ring method pioneered by Reid; see the $A_1$ case in [20] and the general case in [6].

I proceed to give an example of a projective Calabi–Yau threefold $X$ which contains a $C_2$ configuration of surfaces, inspired by [3, Sect. 3]; to the best of my knowledge, this is the first explicit example of this kind. Begin with an auxiliary variety

$$\bar{V} = \left\{ \begin{array}{l} x_2^4 = y_1 y_2 \\ x_1^8 + x_2^8 + y_1^4 + y_2^4 + y_3^4 + z^2 = 0 \end{array} \right\} \subset \mathbb{P}^5[1, 1, 2, 2, 2, 4].$$

$\bar{V}$ is a degenerate degree $(4, 8)$ complete intersection Calabi–Yau threefold in the indicated space. It can be checked by explicit computation that $\bar{V}$ has three curves of singularities, which are all elliptic. Along two of the curves at $\{x_1 = x_2 = y_1 = 0\}$ and $\{x_1 = x_2 = y_2 = 0\}$, $\bar{V}$ has generically $A_1$ singularities; this is a result of the


identifications on the weighted projective space. For a generic $(4, 8)$ complete intersection (which is simply an octic in $\mathbb{P}^4[1, 1, 2, 2, 2]$, since the degree four variable can be eliminated), there is one irreducible curve of $A_1$ singularities, but in the special $\bar{V}$ this part of the singular locus becomes reducible because of the first equation. The last curve is $\{x_2 = y_1 = y_2 = 0\}$, arising also because of the first equation; the singularity along the last curve is generically $A_3$. The three curves all meet at the two points $(0 : 0 : 0 : 0 : 1 : \pm i)$ of the weighted projective space. A patient calculation shows that these points are also quotient singularities, under the group $\mathbb{Z}/2 \times \mathbb{Z}/4$ acting on $\mathbb{C}^3$ by $(-1, -1, 1) \times (1, i, -i)$.

Construct a particular crepant partial resolution $V \to \bar{V}$ in two steps. First perform the blowup of both intersection points according to the right hand arrow of the toric diagram Fig. 3. This introduces two exceptional divisors over the two points, and leaves behind three disjoint curves of singularities of uniform type $A_1$, $A_1$ and $A_3$ respectively, with no dissident points. Then blow up the two disjoint $A_1$ curves to get a Calabi–Yau threefold $V$ with a single elliptic curve of uniform $A_3$ singularities.

Consider the action

$$\iota : (x_1 : x_2 : y_1 : y_2 : y_3 : z) \mapsto (x_1 : -x_2 : y_2 : y_1 : -y_3 : -z)$$

on the weighted projective space. This action fixes $\bar{V}$; since it interchanges the two $A_1$ singular curves, it extends to the partial resolution $V$. Further, $\iota$ acts by a free action on the elliptic curve of $A_3$ singularities of $V$; in the transverse coordinates $x_2$, $y_1$, $y_2$ to this curve satisfying the relation $x_2^4 = y_1 y_2$, the action interchanges $y_1$ and $y_2$, and maps $x_2$ to $-x_2$. A final check shows that $\iota$ acts freely on $\bar{V}$ and hence on $V$. Thus letting $\bar{X} = V/\iota$, the projective Calabi–Yau threefold $\bar{X}$ has an elliptic curve of $A_3$ singularities and is smooth otherwise; moreover, the local coordinates along this curve undergo $\mathbb{Z}/2$ monodromy. Hence its Calabi–Yau resolution $X \to \bar{X}$ contains a $C_2$ configuration of exceptional surfaces ruled over elliptic curves.

Remark 9. The braid group action on the derived category gives rise to actions on even and odd cohomology, using the Mukai map. The action on odd cohomology $H^3(X, \mathbb{C})$ leads, as discussed in Theorem 1, to a Weyl group action on the tangent space to the deformation space, in a compatible fashion with the way the derived equivalences deform. There is also an induced Weyl group action on the Picard group. Some of these actions were described before; e.g. [16] has a symmetric group action in the case of Type A, both on the Picard group and the deformation space. The action of the braid group on the derived category explains all these actions in a uniform way.
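Some of the elementary claims about the auxiliary variety of Examples 8 lend themselves to a quick numerical spot-check. The sketch below (Python, standard library only; all names are hypothetical) checks that the sum of the weights equals the sum of the degrees, which is the usual numerical Calabi–Yau condition for a weighted complete intersection, that the two equations are weighted homogeneous of degrees 4 and 8, that ι is an involution preserving both equations, and that the two points (0 : 0 : 0 : 0 : 1 : ±i) lie on the variety. It samples random integer points, so it is a sanity check under these assumptions and not a proof; it says nothing about the singularities or about freeness of the action.

```python
import random

WEIGHTS = (1, 1, 2, 2, 2, 4)   # weights of x1, x2, y1, y2, y3, z
DEGREES = (4, 8)               # weighted degrees of the two equations


def f1(p):
    """First equation, written as x2^4 - y1*y2."""
    x1, x2, y1, y2, y3, z = p
    return x2**4 - y1 * y2


def f2(p):
    """Second equation x1^8 + x2^8 + y1^4 + y2^4 + y3^4 + z^2."""
    x1, x2, y1, y2, y3, z = p
    return x1**8 + x2**8 + y1**4 + y2**4 + y3**4 + z * z


def iota(p):
    """The involution (x1 : -x2 : y2 : y1 : -y3 : -z)."""
    x1, x2, y1, y2, y3, z = p
    return (x1, -x2, y2, y1, -y3, -z)


def scale(p, t):
    """Weighted rescaling of the homogeneous coordinates by t."""
    return tuple(t**w * c for w, c in zip(WEIGHTS, p))


def run_checks(trials=50, seed=0):
    rng = random.Random(seed)
    ok = sum(WEIGHTS) == sum(DEGREES)
    for _ in range(trials):
        p = tuple(rng.randint(-5, 5) for _ in WEIGHTS)
        t = rng.randint(1, 4)
        ok &= f1(scale(p, t)) == t**4 * f1(p)
        ok &= f2(scale(p, t)) == t**8 * f2(p)
        ok &= f1(iota(p)) == f1(p) and f2(iota(p)) == f2(p)
        ok &= iota(iota(p)) == p
    for sign in (1, -1):
        q = (0, 0, 0, 0, 1, sign * 1j)
        ok &= f1(q) == 0 and f2(q) == 0
    return ok
```

Calling `run_checks()` returns `True` when every check passes on the sampled points.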

Fig. 3. The toric partial resolution of C3 /(Z/2 × Z/4)


Fig. 4. The A4 configuration with Z/2 monodromy

Remark 10. The case of monodromy $\mathbb{Z}/2$ acting on the Dynkin diagram $A_{2n}$ has been excluded from consideration all along. This case has caused considerable headache also in the physics literature [3, Sect. 4]. In this case, the exceptional divisors $D_i$ of $f : X \to \bar{X}$ are still indexed by vertices of a kind of quotient quiver, the $A_n$-quiver with a marked vertex at one end corresponding to the adjacent $\mathbb{Z}/2$-orbit of vertices of $A_{2n}$. However, the marked node corresponds to a singular exceptional surface. It is an irreducible nonnormal surface $\pi_n : D_n \to B$ whose double locus is a section and whose fibre over any point $b \in B$ is a line pair. I do not know whether there exists an autoequivalence $\Psi_n$ of $D^b(X)$ corresponding to this surface, but I suspect that the answer is yes; this is a contracting EZ-configuration in the sense of Horja [12], with singular $E$.

The Main Assertion of [3, Sect. 4], supported by various arguments including the analysis of the matter spectrum, claims that the enhanced gauge symmetry is $\mathfrak{sp}(n)$, or in the language of the present paper, of type $C_n$. The point of view exposed in this paper gives additional support to this claim. Namely, the derived category $D^b(X)$ is acted on by the autoequivalences $\Psi_1, \dots, \Psi_{n-1}$ coming from the smooth ruled surfaces, as well as the hypothetical autoequivalence $\Psi_n$; the question is what the relations are. One can make an educated guess based on the following argument.

In the singular threefold $B \subset \bar{X}$, take a small quasiprojective surface $\bar{Y} \subset \bar{X}$ intersecting $B$ once transversally. Let $Y \to \bar{Y}$ be its resolution in $X$, with exceptional curves $E_1, \dots, E_{2n} \subset Y$. Set $\mathcal{E}_i = \mathcal{O}_{E_i}(-1) \in D^b(Y)$ for $i = 1, \dots, 2n$. The functors $\Psi_i$ can be restricted to Fourier–Mukai functors on $Y$ (compare [22, Proof of Theorem 4.5]). The functor $\Psi_i$ for $1 \le i \le n - 1$ restricts in fact to the composite of two twist functors $T_{\mathcal{E}_i}$ and $T_{\mathcal{E}_{2n+1-i}}$. On the other hand, by [19], the twist functors $\{T_{\mathcal{E}_i} : 1 \le i \le 2n\}$ generate the braid group $B_{A_{2n}}$ acting on the derived category of $Y$. Moreover, the monodromy $\mathbb{Z}/2$ acts on this braid group, mapping $T_{\mathcal{E}_i} \mapsto T_{\mathcal{E}_{2n+1-i}}$ for $i = 1, \dots, n$. The guess I want to make is that the functors $\Psi_1, \dots, \Psi_n$ satisfy the relations of the fixed subgroup $(B_{A_{2n}})^{\mathbb{Z}/2}$. By a result in algebra [17], this fixed subgroup is generated by the composites $T_{\mathcal{E}_i} \circ T_{\mathcal{E}_{2n+1-i}}$ (note these commute) for $i = 1, \dots, n - 1$ and a final element $T_{\mathcal{E}_n} \circ T_{\mathcal{E}_{n+1}} \circ T_{\mathcal{E}_n}$ (note these braid), and the group generated by these elements is the braid group corresponding to the Weyl group $(W_{A_{2n}})^{\mathbb{Z}/2}$. This latter group can be checked by a direct argument to be isomorphic to the Weyl group of the diagram $C_n$.
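The Weyl group part of this guess can be tested by brute force for small n. The sketch below (Python, standard library only; function names are hypothetical) relies on two standard facts that are assumptions of the sketch rather than statements from the text: the Weyl group of A_{2n} is the symmetric group on 2n + 1 letters with the simple reflections acting as adjacent transpositions, and the diagram involution is induced by conjugation with the order-reversing permutation, so that the fixed subgroup is the centralizer of that permutation. It compares this centralizer with the subgroup generated by the images of the elements listed above, and its order with 2^n n!, the order of the Weyl group of C_n. Equality of orders is of course weaker than the isomorphism claimed in the remark.

```python
from itertools import permutations
from math import factorial


def compose(p, q):
    """Permutation p after q (apply q first), as tuples on 0..m-1."""
    return tuple(p[q[k]] for k in range(len(p)))


def transposition(m, a, b):
    t = list(range(m))
    t[a], t[b] = t[b], t[a]
    return tuple(t)


def generated_subgroup(gens):
    """Closure of the generators under composition (finite group)."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def fixed_subgroup_data(n):
    """Centralizer of the reversal in S_{2n+1} and the subgroup generated
    by s_i s_{2n+1-i} (i < n) together with s_n s_{n+1} s_n."""
    m = 2 * n + 1
    s = [transposition(m, i, i + 1) for i in range(2 * n)]  # s[i-1] is s_i
    w0 = tuple(reversed(range(m)))  # conjugation by w0 swaps s_i and s_{2n+1-i}
    centralizer = {p for p in permutations(range(m))
                   if compose(p, w0) == compose(w0, p)}
    gens = [compose(s[i], s[2 * n - 1 - i]) for i in range(n - 1)]
    gens.append(compose(s[n - 1], compose(s[n], s[n - 1])))
    return centralizer, generated_subgroup(gens)
```

For each of n = 1, 2, 3 one can compare `len(centralizer)` with `2**n * factorial(n)` and test whether the two returned sets coincide.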


Hence the conjectural answer is that $X$ has a set of derived equivalences $\Psi_1, \dots, \Psi_n$ satisfying the braid relations of the Dynkin diagram $C_n$. In other words, $X$ has enhanced gauge symmetry of type $C_n$ (or $\mathfrak{sp}(n)$).

Remark 11. To conclude this section, I remark that as opposed to the case of dimension two, the braid group actions of [19] can never be interpreted as enhanced gauge symmetry in dimension three. The reason is the following: it can easily be checked that if $\mathcal{E}$ is a spherical object in the sense of [19], then the corresponding twist functor acts on cohomology by $\alpha \mapsto \alpha + \langle \mathrm{ch}(\mathcal{E}), \alpha \rangle\, \mathrm{ch}(\mathcal{E})$, where $\langle\ ,\ \rangle$ is a linear combination of intersection forms on cohomology. However, $\mathrm{ch}(\mathcal{E})$ only has even components, hence the action of the twist functor on odd cohomology and so on $H^1(X, \Theta_X)$ is trivial. In particular, by Theorem 1, a twist functor always deforms to all deformations as an autoequivalence in dimension three, and hence it can never be part of an “enhanced” action.

5. Elliptic Fibrations and Braid Groups of Affine Type

Let $\sigma : X \to S$ be an elliptic fibration of a projective threefold $X$. Assume that there is a smooth component $C \subset S$ of the discriminant locus of $\sigma$, over which the fibres of $\sigma$ are of uniform Kodaira type $I_n$ with $n > 2$, $I_n^*$, $II^*$, $III^*$ or $IV^*$. These are the fibre types corresponding to the affine diagrams $\tilde{A}_{n-1}$ ($n > 2$), $\tilde{D}_{n+4}$, $\tilde{E}_6$, $\tilde{E}_7$ and $\tilde{E}_8$. In $X$, the rational curves in the fibres over $p \in C$ undergo monodromy, and trace out ruled surfaces $\pi_j : D_j \to C_j$. Assume that in the type $\tilde{A}_{n-1}$ case the monodromy is not transitive, and in the type $\tilde{D}_4$ case it does not act transitively on the outer vertices. Then the global intersections of the exceptional surfaces are described by an affine Dynkin diagram $\tilde{\Delta}$, which is the original $\tilde{A}\tilde{D}\tilde{E}$ diagram for trivial monodromy and a quotient non-simply laced $\tilde{B}\tilde{C}\tilde{F}\tilde{G}$ type diagram otherwise. The diagram $\tilde{\Delta}$ gives rise to a braid group $B_{\tilde{\Delta}}$, with one generator for every node of $\tilde{\Delta}$ and one (braid) relation for every pair of nodes as dictated by the labels of the diagram $\tilde{\Delta}$.

Fig. 5. Some ruled surface configurations in elliptic fibrations

Theorem 12. The affine type braid group $B_{\tilde{\Delta}}$ acts on the derived category $D^b(X)$.

Proof. The ruled surfaces $D_j \to C_j$ give rise to Fourier–Mukai functors $\Psi_j$ on $X$ as before. The proof of a single braid relation only concerns two surfaces and the functors defined by them. Under the assumptions made, every pair of surfaces forms an $A_1 \times A_1$, $B_2$ or $G_2$ configuration. Moreover, the computation of the composed functors can be restricted to a small neighbourhood of these two surfaces. Hence the proof of [22] applies.

Enhanced gauge symmetry for threefolds with elliptic fibrations has been discussed in the context of F-theory compactifications; see [18, 3, 9] and references therein.

6. Braiding Mirror Symplectomorphisms?

The paper [19], a direct predecessor of the present work, is directly motivated by mirror symmetry. Namely, the original motivation of that paper was to find the mirrors of certain symplectomorphisms of symplectic manifolds $(M^{2n}, \omega)$, Dehn twists in Lagrangian spheres $S^n \subset M$. The twist functors in spherical objects are natural candidates for the mirrors of Dehn twists.

As discussed in [11, 21 and 4], the derived equivalences studied in this paper, arising from ruled surfaces collapsing to curves in X, are mirror to certain diffeomorphisms of the mirror manifold, arising as monodromy transformations around certain boundary components of the complex moduli space of the mirror M. These diffeomorphisms are symplectomorphisms of $(M^{2n}, \omega)$ for special values of the symplectic form $\omega$. It would be of interest to find a direct symplectic geometric construction of these diffeomorphisms. It is tempting to speculate that they are given by some kind of twisting with respect to a fibered submanifold of M, just as the Fourier–Mukai functors of X are constructed from the ruled surfaces. [15] begins the topological study of the mirrors of some explicit Calabi–Yau manifolds X containing a single ruled surface; the situation appears to be quite intricate. It would also be interesting to see whether in appropriate cases the braid relations (1) can be proved for these symplectomorphisms.

References

1. Aspinwall, P.S.: Enhanced gauge symmetries and K3 surfaces. Phys. Lett. B 357, 329–334 (1995)
2. Aspinwall, P.S.: Enhanced gauge symmetries and Calabi–Yau threefolds. Phys. Lett. B 371, 231–237 (1996)
3. Aspinwall, P.S., Katz, S., Morrison, D.R.: Lie groups, Calabi–Yau threefolds, and F-theory. Adv. Theor. Math. Phys. 4, 95–126 (2000)
4. Aspinwall, P.S., Horja, R.P., Karp, R.L.: Massless D-branes on Calabi–Yau threefolds and monodromy, hep-th/0209161
5. Bridgeland, T.: Equivalences of derived categories and Fourier–Mukai functors. Bull. London Math. Soc. 31, 25–34 (1999)
6. Buckley, A.: Ph.D. thesis, University of Warwick (in preparation)
7. Caldararu, A.: Derived categories of twisted sheaves on Calabi–Yau manifolds. Ph.D. Thesis, Cornell University (2000)
8. Douglas, M.: D-branes, categories and N = 1 supersymmetry. J. Math. Phys. 42, 2818–2843 (2001)
9. Grassi, A., Morrison, D.R.: Group representations and the Euler characteristic of elliptically fibered Calabi–Yau threefolds. J. Alg. Geom. 12, 321–356 (2003)
10. Dolgachev, I.: Mirror symmetry for lattice polarized K3 surfaces. J. Math. Sci. 81, 2599–2630 (1996)


11. Horja, P.R.: Hypergeometric functions and mirror symmetry in toric varieties, math.AG/9912109
12. Horja, P.R.: Derived category automorphisms from mirror symmetry, math.AG/0103231
13. Intriligator, K., Morrison, D.R., Seiberg, N.: Five-dimensional supersymmetric gauge theories and degenerations of Calabi–Yau spaces. Nucl. Phys. B 497, 56–100 (1997)
14. Kapustin, A., Orlov, D.: Vertex algebras, mirror symmetry and D-branes: The case of complex tori, hep-th/0010293
15. Kachru, S., Katz, S., Lawrence, A., McGreevy, J.: Mirror symmetry for open strings. Phys. Rev. D (3) 62, (2000)
16. Katz, S., Morrison, D.R., Plesser, R.: Enhanced gauge symmetry in type II string theory. Nucl. Phys. B 477, 105–140 (1996)
17. Michel, J.: A note on words in braid monoids. J. Algebra 215, 366–377 (1999)
18. Morrison, D.R., Vafa, C.: Compactifications of F-theory on Calabi–Yau threefolds I, II. Nucl. Phys. B 473, 74–92 (1996); Nucl. Phys. B 476, 437–469 (1996)
19. Seidel, P., Thomas, R.P.: Braid group actions on derived categories of coherent sheaves. Duke Math. J. 108, 37–108 (2001)
20. Szendrői, B.: Calabi–Yau threefolds with a curve of singularities and counterexamples to the Torelli problem II. Math. Proc. Cam. Phil. Soc. 129, 193–204 (2000)
21. Szendrői, B.: Diffeomorphisms and families of Fourier–Mukai transforms in mirror symmetry. In: Applications of algebraic geometry to coding theory, physics and computation (Eilat, 2001), NATO Sci. Ser. II Math. Phys. Chem. 36, Dordrecht: Kluwer, 2001, pp. 317–337
22. Szendrői, B.: Artin group actions on derived categories of threefolds, math.AG/0210121
23. Witten, E.: String theory dynamics in various dimensions. Nucl. Phys. B 443, 85–126 (1995)

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 238, 53–93 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0828-2


Rigorous Analysis of Discontinuous Phase Transitions via Mean-Field Bounds

Marek Biskup, Lincoln Chayes

Department of Mathematics, UCLA, Los Angeles, CA 90095-1555, USA

Received: 22 July 2002 / Accepted: 12 January 2003
Published online: 5 May 2003 – © M. Biskup, L. Chayes 2003

Abstract: We consider a variety of nearest-neighbor spin models defined on the $d$-dimensional hypercubic lattice $\mathbb{Z}^d$. Our essential assumption is that these models satisfy the condition of reflection positivity. We prove that whenever the associated mean-field theory predicts a discontinuous transition, the actual model also undergoes a discontinuous transition (which occurs near the mean-field transition temperature), provided the dimension is sufficiently large or the first-order transition in the mean-field model is sufficiently strong. As an application of our general theory, we show that for $d$ sufficiently large, the 3-state Potts ferromagnet on $\mathbb{Z}^d$ undergoes a first-order phase transition as the temperature varies. Similar results are established for all $q$-state Potts models with $q \ge 3$, the $r$-component cubic models with $r \ge 4$ and the $O(N)$-nematic liquid-crystal models with $N \ge 3$.

Contents

1. Introduction
   1.1 Motivation and outline
   1.2 Models of interest
   1.3 Mean-field formalism
   1.4 Main results
   1.5 Direct argument for mean-field equation
2. Results for Specific Models
   2.1 Potts model
   2.2 Cubic model
   2.3 Nematic liquid-crystal model
3. Proofs of Mean-Field Bounds
   3.1 Convexity estimates
   3.2 Infrared bound
   3.3 Proof of Main Theorem
4. Proofs of Results for Specific Models
   4.1 General considerations
       4.1.1 Uniform closeness to global minima
       4.1.2 Monotonicity of mean-field magnetization
       4.1.3 One-component mean-field problems
   4.2 Potts model
   4.3 Cubic model
   4.4 Nematic model
5. Mean-Field Theory and Complete-Graph Models

© Copyright rests with the authors. Reproduction of the entire article for non-commercial purposes is permitted without charge.

1. Introduction

1.1. Motivation and outline. Mean-field theory has traditionally played a seminal role in the qualitative understanding of phase transitions. In fact, most practical studies of complex physical systems begin (and sometimes end) with the analysis of the corresponding mean-field theory. The central idea of mean-field theory – dating back to [15, 53] – is rather compelling: The ostensibly complicated interactions acting on a particular element of the system are replaced by the action of an effective (or mean) external field. This field causes a response at the point in question and its value has to be self-consistently adjusted so that the response matches the effective field. The practical outcome of this procedure is a set of equations, known as the mean-field equations. In contrast to the original, fully interacting system, the mean-field equations are susceptible to direct analytical or numerical methods.

There is a general consensus that mean-field predictions are qualitatively or even quantitatively accurate. However, for short-range systems, a mathematical foundation of this belief has not been presented in a general context. A number of rigorous results have related various lattice systems to their mean-field counterparts, either in the form of bounds on transition temperatures and critical exponents, see [19, 20, 52] and references therein, or in terms of limits of the free energy [48] and the magnetization [12, 41] as the dimension tends to infinity. In all of these results, the nature of the phase transition is not addressed or the proofs require special symmetries which, as it turns out, ensure that the transition is continuous. But, without special symmetries (or fine tuning) phase transitions are typically discontinuous, so generic short-range systems have heretofore proved elusive. (By contrast, substantial progress along these lines has been made for systems where the range of the interaction plays the role of a large parameter. See, e.g., [10, 11, 14, 47].)

In this paper we demonstrate that for a certain class of nearest-neighbor spin systems, namely those that are reflection positive, mean-field theory indeed provides a rigorous guideline for the order of the transition. In particular, we show that the actual systems undergo a first-order transition whenever the associated mean-field model predicts this behavior, provided the spatial dimension is sufficiently high and/or the phase transition is sufficiently strong. Furthermore, we give estimates on the difference between the values of parameters of the actual model and its mean-field counterpart at their corresponding transitions and show that these differences tend to zero as the spatial dimension tends to infinity. In short, mean-field theory is quantitatively accurate whenever the dimension is sufficiently large.


The main driving force of our proofs is the availability of the so-called infrared bound [18, 22–24], which we use for estimating the correlations between nearest-neighbor spins. It is worth mentioning that the infrared bound is the principal focus of interest in a class of rigorous results on mean-field critical behavior of various combinatorial models [13, 30–32, 37, 39] and percolation [29, 33–36, 38, 40] based on the technique of the lace expansion. However, in contrast to these results (and to the hard work that they require), our approach is more reminiscent of the earlier works on high-dimensional systems [1–3], where the infrared bound is provided as an input. In particular, for our systems this input is a consequence of reflection positivity. (As such, some of our results can also be extended to systems with long-range forces; the relevant modifications will appear in a separate publication [9].)

The principal substance of this paper is organized as follows: We devote the remainder of Sect. 1 to a precise formulation of the general class of spin systems that we consider, we then develop some general mean-field formalism and, finally, state our main theorems. Sect. 2 contains a discussion of three eminent models – Potts, cubic and nematic – with specific statements of theorems which underscore the first-order (and mean-field) nature of the phase transitions for the large-$d$ version of these models. In Sect. 3 we develop and utilize the principal tools needed in this work and provide proofs of all statements made in Sect. 1. In Sect. 4, we perform detailed analyses and collect various known results on the mean-field theories for the specific models mentioned above. When these systems are “sufficiently prepared,” we apply the Main Theorem to prove all of the results stated in Sect. 2. Finally, in Sect. 5, we show that for any model in the class considered, the mean-field theory can be realized by defining the problem on the complete graph.

1.2. Models of interest. Throughout this paper, we will consider the following class of spin systems on the $d$-dimensional hypercubic lattice $\mathbb{Z}^d$: The spins, denoted by $S_x$, take values in some fixed set $\Omega$, which is a subset of a finite-dimensional vector space $E_\Omega$. We will use $(\cdot,\cdot)$ to denote the (positive-definite) inner product in $E_\Omega$ and assume that $\Omega$ is compact in the topology induced by this inner product. The spins are weighted according to an a priori Borel probability measure $\mu$ whose support is $\Omega$. An assignment of a spin value $S_x$ to each site $x \in \mathbb{Z}^d$ defines a spin configuration; we assume that the a priori joint distribution of all spins on $\mathbb{Z}^d$ is i.i.d. Abusing the notation slightly, we will use $\mu$ to denote the joint a priori measure on spin configurations and use $\langle - \rangle_0$ to denote the expectation with respect to $\mu$. The interaction between the spins is described by the (formal) Hamiltonian

\[ \beta H = -\frac{J}{2d} \sum_{\langle x,y \rangle} (S_x, S_y) - \sum_{x} (b, S_x). \tag{1.1} \]

Here $\langle x,y \rangle$ denotes a nearest-neighbor pair of $\mathbb{Z}^d$, the quantity $b$, playing the role of an external field, is a vector from $E_\Omega$ and $\beta$, the inverse temperature, has been incorporated into the (normalized) coupling constant $J \ge 0$ and the field parameter $b$.

The interaction Hamiltonian gives rise to the concept of a Gibbs measure which is defined as follows: Given a finite set $\Lambda \subset \mathbb{Z}^d$, a configuration $S = (S_x)_{x \in \Lambda}$ in $\Omega^{\Lambda}$ and a boundary condition $S' = (S'_x)_{x \in \mathbb{Z}^d \setminus \Lambda}$ in $\Omega^{\mathbb{Z}^d \setminus \Lambda}$, we let $\beta H_\Lambda(S|S')$ be given by (1.1) with the first sum on the right-hand side of (1.1) restricted to $\langle x,y \rangle$ such that $\{x,y\} \cap \Lambda \ne \emptyset$, the second sum restricted to $x \in \Lambda$, and $S_x$ for $x \notin \Lambda$ replaced by $S'_x$. Then we define the measure $\nu_\Lambda^{(S')}$ on configurations $S$ in $\Omega^{\Lambda}$ by the expression

\[ \nu_\Lambda^{(S')}(dS) = \frac{e^{-\beta H_\Lambda(S|S')}}{Z_\Lambda(S')}\, \mu(dS), \tag{1.2} \]

where $Z_\Lambda(S')$ is the appropriate normalization constant which is called the partition function. The measure in (1.2) is the finite-volume Gibbs measure corresponding to the interaction (1.1).

In statistical mechanics, the measure (1.2) describes the thermodynamic equilibrium of the spin system in $\Lambda$. To address the question of phase transitions, we have to study the possible limits of these measures as $\Lambda$ expands to fill in $\mathbb{Z}^d$. In accord with the standard definitions, see [26], we say that the spin model undergoes a first-order phase transition at parameter values $(J, b)$ if there are at least two distinct infinite-volume limits of the measure in (1.2) arising from different boundary conditions. We will call these limiting objects either infinite-volume Gibbs measures or, in accordance with mathematical-physics nomenclature, Gibbs states. We refer the reader to [26, 52] for more details on the general properties of Gibbs states and phase transitions.

We remark that, while the entire class of models has been written so as to appear identical, the physics will be quite different depending on the particulars of $\Omega$ and $\mu$, and the inner product. Indeed, the language of magnetic systems has been adapted only for linguistic and notational convenience. The above framework can easily accommodate any number of other physically motivated interacting models such as lattice gases, ferroelectrics, etc.

1.3. Mean-field formalism. Here we will develop the general formalism needed for stating the principal mean-field bounds. The first object of interest is the logarithmic moment generating function of the distribution $\mu$,

\[ G(h) = \log \int_\Omega \mu(dS)\, e^{(S,h)}. \tag{1.3} \]

Since $\Omega$ was assumed compact, $G(h)$ is finite for all $h \in E_\Omega$. Moreover, $h \mapsto G(h)$ is continuous and convex throughout $E_\Omega$. Every mean-field theory relies on a finite number of thermodynamic functions of internal responses. For the systems with interaction (1.1), the object of principal interest is the magnetization. In general, magnetization is a quantity taking values in the closed, convex hull of $\Omega$, here denoted by $\mathrm{Conv}(\Omega)$. If $m \in \mathrm{Conv}(\Omega)$, then the mean-field entropy function is defined via a Legendre transform of $G(h)$,

\[ S(m) = \inf_{h \in E_\Omega} \bigl\{ G(h) - (m,h) \bigr\}. \tag{1.4} \]

(Strictly speaking, (1.4) makes sense even for $m \notin \mathrm{Conv}(\Omega)$ for which we simply get $S(m) = -\infty$.) In general, $m \mapsto S(m)$ is concave and we have $S(m) \le 0$ for all $m \in \mathrm{Conv}(\Omega)$. From the perspective of the large-deviation theory (see [16, 19]), the mean-field entropy function is (the negative of) the rate function for the probability that the average of many spins is near $m$.

To characterize the effect of the interaction, we have to introduce energy into the game. For the quadratic Hamiltonian in (1.1), the (mean-field) energy function is given simply by


\[ E_{J,b}(m) = -\frac{1}{2} J |m|^2 - (m, b), \tag{1.5} \]

where $|m|^2 = (m,m)$. On the basis of physical considerations, a state of thermodynamic equilibrium corresponds to a balance between the energy and the entropy. The appropriate thermodynamic function characterizing this balance is the free energy. We therefore define the mean-field free-energy function by setting $\Phi_{J,b}(m) = E_{J,b}(m) - S(m)$, i.e.,

\[ \Phi_{J,b}(m) = -\frac{1}{2} J |m|^2 - (m, b) - S(m). \tag{1.6} \]

The mean-field (Gibbs) free energy $F_{\mathrm{MF}}(J, b)$ is defined by minimizing $\Phi_{J,b}(m)$ over all $m \in \mathrm{Conv}(\Omega)$. Assuming a unique minimizer, this and (1.4–1.5) give us a definition of the mean-field magnetization, entropy and energy. A more interesting situation occurs when there is more than one minimizer of $\Phi_{J,b}$. The latter cases are identified as the points of phase coexistence while the former situation is identified as the uniqueness region.

For the sake of completeness, it is interesting to observe that every minimizer of $\Phi_{J,b}(m)$ (in fact, every stationary point) in the relative interior of $\mathrm{Conv}(\Omega)$ is a solution of the equation

\[ m = \nabla G(J m + b), \tag{1.7} \]

where $\nabla$ denotes the (canonical) gradient in $E_\Omega$. This is the mean-field equation for the magnetization, which describes the self-consistency constraint that we alluded to in Sect. 1.1. The relation between (1.7) and the stationarity of $\Phi_{J,b}$ is seen as follows: $\nabla \Phi_{J,b}(m) = 0$ implies that $J m + b + \nabla S(m) = 0$. But $h = -\nabla S(m)$ is equivalent to $m = \nabla G(h)$, and stationarity therefore implies (1.7).

We conclude with a claim that an immediate connection of the above formalism to some statistical mechanics problem is possible. Indeed, if the Hamiltonian (1.1) is redefined for the complete graph on $N$ vertices, then the quantity $\Phi_{J,b}(m)$ emerges as the rate function in a large-deviation principle for magnetization and hence $F_{\mathrm{MF}}(J, b)$ is the free energy in this model. A precise statement and a proof will appear in the last section (Theorem 5.1 in Sect. 5); special cases of this result have been known since time immemorial, see e.g. [19].

1.4. Main results. Now we are in a position to state our general results. The basic idea is simply to watch what happens when the value of the magnetization in an actual system (governed by (1.1)) is inserted into the associated mean-field free-energy function. We begin with a general bound which relies only on convexity:

Theorem 1.1. Consider the spin system on $\mathbb{Z}^d$ with the Hamiltonian (1.1) and let $\nu_{J,b}$ be an infinite-volume Gibbs measure corresponding to the parameters $J \ge 0$ and $b \in E_\Omega$ in (1.1). Suppose that $\nu_{J,b}$ is invariant under the group of translations and rotations of $\mathbb{Z}^d$. Let $\langle - \rangle_{J,b}$ denote the expectation with respect to $\nu_{J,b}$ and let $m_\star$ be the magnetization of the state $\nu_{J,b}$ defined by

\[ m_\star = \langle S_0 \rangle_{J,b}, \tag{1.8} \]

where $0$ denotes the origin in $\mathbb{Z}^d$. Then

\[ \Phi_{J,b}(m_\star) \le \inf_{m \in \mathrm{Conv}(\Omega)} \Phi_{J,b}(m) + \frac{J}{2} \Bigl[ \bigl\langle (S_0, S_x) \bigr\rangle_{J,b} - |m_\star|^2 \Bigr], \tag{1.9} \]

where $x$ denotes a nearest neighbor of the origin.
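As a sanity check of (1.9) in a case where everything is explicit, the following sketch considers the Ising chain: $d = 1$, $\Omega = \{-1, +1\}$ with uniform $\mu$, and $b = 0$. These choices are ours and do not come from the text. For this chain we assume the standard exact results $m_\star = 0$ and $\langle S_0 S_1 \rangle_{J,0} = \tanh(J/2)$ (the bond coupling in (1.1) is $J/(2d) = J/2$), and we use $G(h) = \log\cosh h$, for which (1.4) gives $S(m) = -\frac{1+m}{2}\log(1+m) - \frac{1-m}{2}\log(1-m)$. The infimum in (1.9) is only approximated on a grid, so the code illustrates the inequality rather than proving anything.

```python
import math


def entropy(m):
    """Mean-field entropy S(m) of (1.4) for spins uniform on {-1, +1}."""
    total = 0.0
    for p in ((1.0 + m) / 2.0, (1.0 - m) / 2.0):
        if p > 0.0:
            total -= p * math.log(2.0 * p)
    return total


def free_energy(J, m, b=0.0):
    """Mean-field free-energy function Phi_{J,b}(m) of (1.6)."""
    return -0.5 * J * m * m - b * m - entropy(m)


def min_free_energy(J, grid=2001):
    """Grid approximation (from above) of inf Phi_{J,0}(m) over [-1, 1]."""
    return min(free_energy(J, -1.0 + 2.0 * i / (grid - 1)) for i in range(grid))


def slack_in_bound(J):
    """Right-hand side minus left-hand side of (1.9) for the chain at b = 0."""
    m_star = 0.0               # assumed: no spontaneous magnetization in d = 1
    corr = math.tanh(J / 2.0)  # assumed exact <S_0 S_1> for bond coupling J/2
    rhs = min_free_energy(J) + 0.5 * J * (corr - m_star ** 2)
    return rhs - free_energy(J, m_star)


if __name__ == "__main__":
    for J in (0.5, 1.0, 2.0, 4.0):
        print(J, slack_in_bound(J))
```

Under these assumptions the slack stays non-negative over the range of $J$ we tried, in line with (1.9); since the grid minimum can only overestimate the true infimum, the check is slightly weaker than the inequality itself.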


Thus, whenever the fluctuations of nearest-neighbor spins have small correlations, the physical magnetization almost minimizes the mean-field free energy. The bound (1.9) immediately leads to the following observation, which, to the best of our knowledge, does not appear in the literature:

Corollary 1.2. Let $\nu_{J,b}$ and $\langle - \rangle_{J,b}$ be as in Theorem 1.1 and let $m_\star$ be as in (1.8). Then

\[ \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} \ge |m_\star|^2 \tag{1.10} \]

for any pair of nearest neighbors $x, y \in \mathbb{Z}^d$. In particular, for any model with interaction (1.1), the nearest-neighbor spins are positively correlated in any Gibbs state which is invariant under the translations and rotations of $\mathbb{Z}^d$.

Our next goal is to characterize a class of Gibbs states for which the correlation term on the right-hand side of (1.9) is demonstrably small. However, our proofs will make some minimal demands on the Gibbs states themselves and it is therefore conceivable that we may not be able to access all the extremal magnetizations. To define those values of magnetization for which our proofs hold, let $F(J, b)$ denote the infinite-volume free energy per site of the system on $\mathbb{Z}^d$, defined by taking the thermodynamic limit of $-\frac{1}{|\Lambda|} \log Z_\Lambda$, see e.g. [50]. (Note that the existence of this limit follows automatically by the compactness of $\Omega$.) The function $F(J, b)$ is concave and, therefore, has all directional derivatives. Let $K_\star(J, b)$ be the set of all pairs $[e_\star, m_\star]$ such that

\[ F(J + \Delta J, b + \Delta b) - F(J, b) \le e_\star \Delta J + (m_\star, \Delta b) \tag{1.11} \]

holds for all numbers $\Delta J$ and all vectors $\Delta b \in E_\Omega$. By a well-known result (see the discussion of the properties of subdifferential on page 215 of [51]), $K_\star(J, b)$ is a convex set; we let $M_\star(J, b)$ denote the set of all values $m_\star$ such that $[e_\star, m_\star]$ is an extreme point of the set $K_\star(J, b)$ for some value $e_\star$. Our Main Theorem is then as follows:

Main Theorem. Let $d \ge 3$ and consider the spin system on $\mathbb{Z}^d$ with the Hamiltonian (1.1). Let $n$ denote the dimension of $E_\Omega$. For $J \ge 0$ and $b \in E_\Omega$, let $m_\star \in M_\star(J, b)$. Then

\[ \Phi_{J,b}(m_\star) \le \inf_{m \in \mathrm{Conv}(\Omega)} \Phi_{J,b}(m) + J n \frac{\kappa}{2} I_d, \tag{1.12} \]

where $\kappa = \max_{S \in \Omega} (S, S)$ and

\[ I_d = \int_{[-\pi,\pi]^d} \frac{d^d k}{(2\pi)^d}\, \frac{[1 - D(k)]^2}{D(k)} \tag{1.13} \]

with $D(k) = 1 - \frac{1}{d} \sum_{j=1}^{d} \cos(k_j)$.

The bound (1.12) provides us with a powerful method for proving first-order phase transitions on the basis of a comparison with the associated mean-field theory. The key to our whole program is that the “error term”, $J n \frac{\kappa}{2} I_d$, vanishes in the $d \to \infty$ limit; in fact,

\[ I_d = \frac{1}{2d} \bigl( 1 + o(1) \bigr) \quad \text{as } d \to \infty, \tag{1.14} \]

Fig. 1. The mean-field free energy as a function of a scalar magnetization $m(J)$ for the typical model undergoing a first-order phase transition. In an interval of values of $J$, there are two local minima which switch their order at $J = J_{\mathrm{MF}}$. If the “barrier” height $\Delta(J)$ always exceeds the error term from (1.12), there is a forbidden interval of scalar magnetizations and $m(J)$ has to jump as $J$ varies. The actual plot corresponds to the 3-state Potts model for $J$ taking the values (a) 2.73, (b) 2.76, (c) 2.77 and (d) 2.8. See Sect. 2.1 for more details

see [12]. For $d$ sufficiently large, the bound (1.12) thus forces the magnetization of the actual system to be near a value of $m$ that nearly minimizes $\Phi_{J,b}(m)$. Now, recall a typical situation of the mean-field theory with a first-order phase transition: There is a $J_{\mathrm{MF}}$ such that, for $J$ near $J_{\mathrm{MF}}$, the mean-field free-energy function has two nearly degenerate minima separated by a barrier of height $\Delta(J)$, see Fig. 1. If the barrier $\Delta(J)$ always exceeds the error term in (1.12), i.e., if $\Delta(J) > J n \frac{\kappa}{2} I_d$, some intermediate values of magnetization are forbidden and, as $J$ increases through $J_{\mathrm{MF}}$, the physical magnetization undergoes a jump at some $J_t$ near $J_{\mathrm{MF}}$. See also Fig. 2.

The Main Theorem is a direct consequence of Theorem 1.1 and the following lemma:

Key Estimate. Let $J \ge 0$ and $b \in E_\Omega$ and let $m_\star \in M_\star(J, b)$. Let $n$, $\kappa$ and $I_d$ be as in the Main Theorem. Then there is an infinite-volume Gibbs state $\nu_{J,b}$ for interaction (1.1) such that

\[ m_\star = \langle S_0 \rangle_{J,b}, \tag{1.15} \]

and

\[ \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} - |m_\star|^2 \le n \kappa I_d, \tag{1.16} \]

for any nearest-neighbor pair $x, y \in \mathbb{Z}^d$. Here $\langle - \rangle_{J,b}$ denotes the expectation with respect to $\nu_{J,b}$.


The Key Estimate follows readily under certain conditions; for instance, when the parameter values $J$ and $b$ are such that there is a unique Gibbs state. Under these circumstances, the bound (1.16) is a special case of the infrared bound which can be derived using reflection positivity (see [18, 22–24]) and paying close attention to the “zero mode.” Unfortunately, at the points of non-uniqueness, the bound in (1.16) is also needed. The restriction to extreme magnetizations is thus dictated by the need to approximate the magnetizations (and the states which exhibit them) by states where the standard “RP, IRB” technology can be employed.

The Key Estimate and Theorem 1.1 constitute a proof of the Main Theorem. Thus, a first-order phase transition (for $d \gg 1$) can be established in any system of the form (1.1) by detailed analysis of the full mean-field theory. Although this sounds easy in principle, in practice there are cases where this can be quite a challenge. But, ultimately, the Main Theorem reduces the proof of a phase transition to a problem in advanced calculus where (if desperate) one can employ computers to assist in the analysis.
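The size of the error term in (1.12) is governed by the integral $I_d$ of (1.13), and a rough numerical estimate is easy to obtain. The sketch below is illustrative only and is not the method of [12]. The function names are ours. The midpoint rule is used because, for an even number of nodes per axis, no node falls on the integrable singularity at $k = 0$; the extrapolation step assumes that for $d = 3$ the leading grid error decays like $1/n$.

```python
import itertools
import math


def lattice_integral(d, n):
    """Midpoint-rule estimate of I_d in (1.13) on an n^d grid over [-pi, pi]^d."""
    if n % 2 != 0:
        raise ValueError("n must be even so that no node sits at k = 0")
    # Cosines of the midpoints of n equal subintervals of [-pi, pi].
    cosines = [math.cos(-math.pi + (i + 0.5) * 2.0 * math.pi / n) for i in range(n)]
    total = 0.0
    for combo in itertools.product(cosines, repeat=d):
        D = 1.0 - sum(combo) / d
        total += (1.0 - D) ** 2 / D
    return total / n ** d


def extrapolated_integral_d3(n):
    """One Richardson step for d = 3, assuming the grid error scales like 1/n."""
    return 2.0 * lattice_integral(3, 2 * n) - lattice_integral(3, n)


if __name__ == "__main__":
    print(extrapolated_integral_d3(16))
```

With `n = 16` the extrapolated value for $d = 3$ should land near $0.52$, well above the leading term $1/(2d) \approx 0.17$ of (1.14), which is a reminder that (1.14) is an asymptotic statement. The cost grows like $n^d$, so this direct approach is only practical in low dimensions.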

1.5. Direct argument for mean-field equation. We have stated our main results in the context of the mean-field free energy. However, many practical calculations focus immediately on the mean-field equation for magnetization (1.7). As it turns out, a direct study of the mean-field equation provides us with an alternative (albeit existential) approach to the results of this paper. The core of this approach is the variance bound for the magnetization stated as follows:

Lemma 1.3. Let $d \ge 3$ and consider the spin system on $\mathbb{Z}^d$ with the Hamiltonian (1.1). Let $n$ and $I_d$ be as in the Main Theorem. For $J \ge 0$ and $b \in E_\Omega$, let $m_\star \in M_\star(J, b)$. Then there is an infinite-volume Gibbs state $\nu_{J,b}$ for the interaction (1.1) such that $m_\star = \langle S_0 \rangle_{J,b}$ and

\[ \Bigl\langle \Bigl| \frac{1}{2d} \sum_{x \colon |x| = 1} S_x - m_\star \Bigr|^2 \Bigr\rangle_{J,b} \le n J^{-1} I_d, \tag{1.17} \]

where $\langle - \rangle_{J,b}$ denotes the expectation with respect to $\nu_{J,b}$.

Here is how the bound (1.17) can be used to prove that mean-field equations are accurate in sufficiently large dimensions: Conditioning on the spin values at the neighbors of the origin and recalling the definition of $G(h)$, the expectation $\langle S_0 \rangle_{J,b}$ can be written as

\[ \langle S_0 \rangle_{J,b} = \Bigl\langle \nabla G \Bigl( \frac{J}{2d} \sum_{x \colon |x| = 1} S_x + b \Bigr) \Bigr\rangle_{J,b}. \tag{1.18} \]

Since the right-hand side of (1.17) tends to zero as $d \to \infty$, the (spatial) average of the spins neighboring the origin – namely $\frac{1}{2d} \sum_{x \colon |x| = 1} S_x$ – is, with high probability, very close to $m_\star$. Using this in (1.18), we thus find that $m_\star$ approximately satisfies the mean-field equation (1.7). Thus, to demonstrate phase coexistence (for $d \gg 1$) it is sufficient to show that, along some curve in the parameter space, the solutions to the mean-field equations cannot be assembled into a continuous function. In many cases, this can be done dramatically by perturbative arguments. While this alternative approach has practical appeal for certain systems, the principal drawback is that it provides no clue as to the location of the transition temperature.

Fig. 2. The solutions of the mean-field equation for the scalar order parameter $m$ as a function of $J$ for the 10-state Potts model. The solid lines indicate the local minima, the dashed lines show the other solutions to the mean-field equation. The portions of these curves in the regions where $m$ is sufficiently close to zero or one can be (rigorously) controlled using perturbative calculations. These alone prove that the mean-field theory “does not admit continuous solutions” and, therefore, establish first-order transitions for $d \gg 1$. The shaded regions show the set of allowed magnetizations for the system on $\mathbb{Z}^d$ when $I_d \le 0.002$. In addition to manifestly proving a discontinuous transition, these provide tight numerical bounds on the transition temperature and reasonable bounds on the size of the jump

Indeed, as mentioned in the paragraph following the Main Theorem, secondary minima and other irrelevant solutions to the mean-field equations typically develop well below $J = J_{\mathrm{MF}}$. Without the guidance of the free energy, there is no way of knowing which solutions are physically relevant.

2. Results for Specific Models

In this section we adapt the previous general statements to three models: the $q$-state Potts model, the $r$-component cubic model and the $O(N)$-nematic liquid crystal model. For appropriate ranges of the parameters $q$, $r$ and $N$ and dimension sufficiently large, we show that these models undergo a first-order phase transition as $J$ varies. The relevant results appear as Theorems 2.1, 2.3 and 2.6.

2.1. Potts model. The Potts model, introduced in [49], is usually described as having a discrete spin space with $q$ states, $\sigma_x \in \{1, 2, \dots, q\}$, with the (formal) Hamiltonian

\[ \beta H = -J' \sum_{\langle x,y \rangle} \delta_{\sigma_x, \sigma_y}. \tag{2.1} \]

Here $\delta_{\sigma_x, \sigma_y}$ is the usual Kronecker delta and $J' = \frac{J}{2d}$. To bring the interaction into the form of (1.1), we use the so-called tetrahedral representation, see [54]. In particular, we let $\Omega = \{\hat{v}_1, \dots, \hat{v}_q\}$, where $\hat{v}_\alpha$ denote the vertices of a $(q-1)$-dimensional hypertetrahedron, i.e., $\hat{v}_\alpha \in \mathbb{R}^{q-1}$ with

\[ \hat{v}_\alpha \cdot \hat{v}_\beta = \begin{cases} 1, & \text{if } \alpha = \beta, \\ -\frac{1}{q-1}, & \text{otherwise.} \end{cases} \tag{2.2} \]

The inner product is proportional to the usual dot product in $\mathbb{R}^{q-1}$. Explicitly, if $S_x \in \Omega$ corresponds to $\sigma_x \in \{1, \dots, q\}$, then we have

\[ (S_x, S_y) = \frac{q-1}{q}\, S_x \cdot S_y = \delta_{\sigma_x, \sigma_y} - \frac{1}{q}. \tag{2.3} \]

(The reason for this rescaling of the dot product is to maintain coherence with existing treatments of the mean-field version of this model.) The a priori measure $\mu$ gives a uniform weight to all $q$ states in $\Omega$.

Let us summarize some of the existing rigorous results about the $q$-state Potts model. The $q = 2$ model is the Ising model, which in mean-field theory as well as real life has a continuous transition. It is believed that the Potts model has a discontinuous transition for all $d \ge 3$ and $q \ge 3$ (see, e.g., [54]). In any $d \ge 2$, it was first proved in [45] that for $q$ sufficiently large, the energy density has a region of forbidden values over which it must jump discontinuously as $J$ increases. On the basis of FKG monotonicity properties, see [4], this easily implies that the magnetization is also discontinuous. Such results have been refined and improved; for instance in [44, 46], Pirogov-Sinai type expansions have been used to show that there is a single point of discontinuity outside of which all quantities are analytic. However, for $d \ge 3$, the values of $q$ for which these techniques work are “astronomical,” and, moreover, deteriorate exponentially with increasing dimension.

Let $m_\star(J)$ and $e_\star(J)$ denote the actual magnetization and energy density, respectively. These quantities can be defined using one-sided derivatives of the physical free energy:

\[ m_\star(J) = \frac{\partial}{\partial b^+} F(J, b \hat{v}_1) \Big|_{b=0} \quad \text{and} \quad e_\star(J) = \frac{\partial}{\partial J'^+} F(J', 0) \Big|_{J'=J}, \tag{2.4} \]

or, equivalently, by optimizing the expectations $\langle (\hat{v}_1, S_0) \rangle$, resp., $\frac{1}{2} \langle (S_0, S_x) \rangle$, where “0” is the origin and $x$ is its nearest neighbor, over all Gibbs states that are invariant under the symmetries of $\mathbb{Z}^d$. Recalling the Fortuin-Kasteleyn representation [4, 21, 27, 28], let $P_\infty(J)$ be the probability that, in the associated random cluster model with parameters $p = 1 - e^{-J/(2d)}$ and $q$, the origin lies in an infinite cluster. Then $m_\star(J)$ and $P_\infty(J)$ are related by the equation

\[ m_\star(J) = \frac{q-1}{q}\, P_\infty(J). \tag{2.5} \]

As a consequence, the magnetization $m_\star(J)$ is a non-decreasing and right-continuous function of $J$. The energy density $e_\star(J)$ is non-decreasing in $J$ simply by concavity of the free energy. The availability of the graphical representation allows us to make general statements about the phase structure of these systems. In particular, in any $d \ge 2$ and for all $q$ under consideration, there is a $J_c = J_c(q, d) \in (0, \infty)$ such that $m_\star(J) > 0$ for $J > J_c$ while $m_\star(J) = 0$ for $J < J_c$, see [4, 28]. Whenever $m_\star(J_c) > 0$ (which, by the aforementioned results [44–46], is known for $q \gg 1$), there are at least $q + 1$ distinct extremal, translation-invariant Gibbs states at $J = J_c$.

The mean-field free energy for the model without external field is best written in terms of components of $m$: If $(x_1, \dots, x_q)$ is a probability vector, we express $m$ as

\[ m = x_1 \hat{v}_1 + \dots + x_q \hat{v}_q. \tag{2.6} \]
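The relations (2.2), (2.3) and (2.6) can be checked concretely. The sketch below builds one possible realization of the tetrahedral vectors, embedded for convenience in the hyperplane of $\mathbb{R}^q$ orthogonal to $(1, \dots, 1)$ rather than in $\mathbb{R}^{q-1}$; this embedding and the function names are our choices, not the text's. It then forms $m$ as in (2.6) and evaluates $(m, m)$ with the rescaled inner product, which by (2.3) should equal $\sum_k x_k^2 - 1/q$ for any probability vector.

```python
import math


def tetrahedral_vectors(q):
    """Unit vectors with pairwise dot product -1/(q-1), as in (2.2)."""
    scale = math.sqrt(q / (q - 1.0))
    return [[scale * ((1.0 if i == a else 0.0) - 1.0 / q) for i in range(q)]
            for a in range(q)]


def dot(u, v):
    """Ordinary dot product."""
    return sum(a * b for a, b in zip(u, v))


def inner(u, v, q):
    """Rescaled inner product of (2.3): (q-1)/q times the dot product."""
    return (q - 1.0) / q * dot(u, v)


def magnetization(x, vectors):
    """m = x_1 v_1 + ... + x_q v_q as in (2.6)."""
    q = len(vectors)
    return [sum(x[a] * vectors[a][i] for a in range(q)) for i in range(q)]


if __name__ == "__main__":
    q = 4
    vs = tetrahedral_vectors(q)
    x = [0.4, 0.3, 0.2, 0.1]
    m = magnetization(x, vs)
    print(inner(m, m, q), sum(t * t for t in x) - 1.0 / q)
```

The identity $|m|^2 = \sum_k x_k^2 - 1/q$ is what turns the energy term $-\frac{1}{2} J |m|^2$ of (1.6) into a sum over the components $x_k$, up to an additive constant.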


The interpretation of this relation is immediate: xk corresponds to the proportion of spins in the k th spin-state. In terms of the variables in (2.6), the mean-field free-energy function is (to within a constant) given by ΦJ (m) =

q

− J2 xk2 + xk log xk .

(2.7)

k=1

In (2.7) we have once and for all set the external field b to zero and suppressed it from the notation. It is well known (see [41, 54] and also Lemma 4.4 of the present paper) that, for each q ≥ 3, there is a JMF ∈ (2, q) such that ΦJ has a unique global minimizer m = 0 for J < JMF, while for J > JMF, there are q global minimizers which are obtained by permutations of a single (x1, . . . , xq) with x1 > x2 = · · · = xq. To keep the correspondence with m⋆(J), we define the scalar mean-field magnetization mMF(J) as the maximal Euclidean norm of all global minimizers of the mean-field free energy ΦJ(m). (In this parametrization, the asymmetric global minima will be given by x1 = 1/q + mMF(J) and x2 = · · · = xq = 1/q − mMF(J)/(q − 1).) Then mMF(J) is the maximal positive solution to the equation

\[ \frac{q}{q-1}\, m = \frac{e^{J \frac{q}{q-1} m} - 1}{e^{J \frac{q}{q-1} m} + q - 1}. \tag{2.8} \]

In particular, J ↦ mMF(J) is non-decreasing. We note that the explicit values of the coupling constant JMF and the magnetization mc = mMF(JMF) at the mean-field transition are known:

\[ J_{\mathrm{MF}} = 2\,\frac{q-1}{q-2}\,\log(q-1) \quad\text{and}\quad m_c = \frac{q-2}{q}, \tag{2.9} \]

see e.g. [54]. Thus, the mean-field transition is first-order for all q > 2. Our main result about the Potts model is then as follows:

Theorem 2.1 (Potts model). Consider the q-state Potts model on Zd and let m⋆(J) be its scalar magnetization. For each q ≥ 3, there exists a Jt = Jt(q, d) and two numbers ε1 = ε1(d, J) > 0 and ε2 = ε2(d) > 0 satisfying ε1(d, J) → 0, uniformly on finite intervals of J, and ε2(d) → 0 as d → ∞, such that the following holds:

\[ m_\star(J) \le \epsilon_1 \quad\text{for } J < J_t \tag{2.10} \]

and

\[ |m_\star(J) - m_{\mathrm{MF}}(J)| \le \epsilon_1 \quad\text{for } J > J_t. \tag{2.11} \]

Moreover,

\[ |J_t - J_{\mathrm{MF}}| \le \epsilon_2. \tag{2.12} \]

In particular, both the magnetization m⋆(J) and the energy density e⋆(J) undergo a jump at J = Jt whenever d is sufficiently large.
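The explicit values in (2.9) can be checked numerically against (2.7) and (2.8). The following Python sketch is only an illustration under stated assumptions: the function names are ours (they do not appear in the paper), and the free energy is evaluated only on the one-parameter family x1 = 1/q + m, x2 = · · · = xq = 1/q − m/(q − 1) described in the text.

```python
import math


def potts_phi(J, q, m):
    """Mean-field free energy (2.7) on the ray x1 = 1/q + m, x2 = ... = xq = 1/q - m/(q-1)."""
    x1 = 1.0 / q + m
    x2 = 1.0 / q - m / (q - 1)

    def xlogx(x):
        # Convention 0 * log 0 = 0.
        return x * math.log(x) if x > 0.0 else 0.0

    energy = -0.5 * J * (x1 * x1 + (q - 1) * x2 * x2)
    entropy_term = xlogx(x1) + (q - 1) * xlogx(x2)
    return energy + entropy_term


def potts_self_consistency_gap(J, q, m):
    """Left-hand side minus right-hand side of equation (2.8)."""
    a = J * q * m / (q - 1)
    return q * m / (q - 1) - (math.exp(a) - 1.0) / (math.exp(a) + q - 1.0)


def potts_j_mf(q):
    """Mean-field transition coupling from (2.9); requires q >= 3."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)


def potts_m_c(q):
    """Mean-field magnetization at the transition from (2.9)."""
    return (q - 2) / q


if __name__ == "__main__":
    for q in (3, 4, 10):
        J, m = potts_j_mf(q), potts_m_c(q)
        # Both printed differences should vanish up to rounding error.
        print(q, J, potts_phi(J, q, m) - potts_phi(J, q, 0.0),
              potts_self_consistency_gap(J, q, m))
```

At J = JMF the symmetric point m = 0 and the asymmetric point m = mc have the same free energy, and mc solves (2.8); this coexistence of two minimizers is what makes the mean-field transition first order.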


The jump in the energy density at Jt immediately implies the existence of at least q + 1 distinct extremal Gibbs measures at J = Jt. However, the nature of our proofs does not permit us to conclude that m⋆(J) = 0 for J < Jt nor can we rule out that m⋆(J) undergoes further jumps for J > Jt. (Nonetheless, the jumps for J > Jt would have to be smaller than 2ε1(d, J).) Unfortunately, we can say nothing about the continuous-q variant of the Potts model, the random cluster model, for non-integer q. In this work, the proofs lean too heavily on the spin representation. Furthermore, for non-integer q, the use of our principal tool, reflection positivity, is forbidden; see [8]. We also concede that, despite physical intuition to the contrary, our best bounds on ε2(d) and ε1(d, J) deteriorate with increasing q. This is an artifact of the occurrence of the single-spin space dimension on the right-hand side of (1.12). (This sort of thing seems to plague all existing estimates based on reflection positivity.) In particular, we cannot yet produce a sufficiently large dimension d for which the phase transition in all (q ≥ 3)-state Potts models would be provably first order.

2.2. Cubic model. Our second example of interest is the r-component cubic model. Here the spins Sx are the unit vectors in the coordinate directions of Rr, i.e., if êk are the standard unit vectors in Rr, then

\[ \Omega = \{\pm\hat{e}_k : k = 1, \dots, r\}. \tag{2.13} \]

The Hamiltonian is given by (1.1), with the inner product given by the usual dot product in Rr and the a priori measure given by the uniform measure on Ω. As in the last subsection, we set b = 0 and suppress any b-dependence from the notation. We note that the r = 1 case is the Ising model while the case r = 2 is equivalent to two uncoupled Ising models.

The cubic model was introduced (and studied) in [42, 43] as a model of the magnetism in rare-earth compounds with a cubic crystal symmetry. There it was noted that the associated mean-field theory has a discontinuous transition for r ≥ 4, while the transition is continuous for r = 1, 2 and 3. The mean-field theory is best expressed in terms of the collection of parameters ȳ = (y1, . . . , yr) and µ̄ = (µ1, . . . , µr), where yk stands for the fraction of spins that take the values ±êk and µk yk is the magnetization in the direction êk. In this language, the magnetization vector can be written as

\[ m = y_1 \mu_1 \hat{e}_1 + \cdots + y_r \mu_r \hat{e}_r. \tag{2.14} \]

To describe the mean-field free-energy function, we define

\[ K_J^{(r)}(\bar{y}, \bar{\mu}) = \sum_{k=1}^{r} \Bigl( y_k \log y_k + y_k\, \Theta_{2Jy_k}(\mu_k) \Bigr), \tag{2.15} \]

where ΘJ(µ) denotes the standard Ising mean-field free energy with bias µ; i.e., the quantity in (2.7) with q = 2, x1 = (1 + µ)/2 and x2 = (1 − µ)/2. Then ΦJ(m) is found by minimizing $K_J^{(r)}(\bar{y}, \bar{\mu})$ over all allowed pairs (ȳ, µ̄) such that (2.14) holds.

As in the case of the Potts model, the global minimizer of ΦJ(m) will be a permutation of a highly-symmetric state. However, this time the result is not so well known, so we state it as a separate proposition:


Proposition 2.2. Consider the r-component cubic model. For each J ≥ 0, the only local minima of ΦJ are m = 0 or m = ±mMF êk, k = 1, . . . , r, where mMF = mMF(J) is the maximal positive solution to the equation

\[ m = \frac{\sinh Jm}{r - 1 + \cosh Jm}. \tag{2.16} \]

Furthermore, there is a JMF ∈ (0, ∞) such that the only global minimizers of ΦJ(m) are m = 0 for J < JMF and m = ±mMF(J)êk, k = 1, . . . , r, (with mMF(J) > 0) for J > JMF.

For a system on Zd, the scalar magnetization is most conveniently defined as the norm of ⟨S0⟩J, optimized over all translation-invariant Gibbs states for the coupling constant J. The energy density e⋆(J) is defined using the same formula as for the Potts model, see (2.4). Our main result about the cubic model is then as follows:

Theorem 2.3 (Cubic model). Consider the r-component cubic model on Zd and let m⋆(J) be its scalar magnetization. Then for every r ≥ 4, there exists a Jt = Jt(r, d) and two numbers ε1 = ε1(d, J) > 0 and ε2 = ε2(d) > 0 satisfying ε1(d, J) → 0, uniformly on finite intervals of J, and ε2(d) → 0 as d → ∞, such that the following holds:

\[ m_\star(J) \le \epsilon_1 \quad\text{for } J < J_t \tag{2.17} \]

and

\[ |m_\star(J) - m_{\mathrm{MF}}(J)| \le \epsilon_1 \quad\text{for } J > J_t. \tag{2.18} \]

Moreover,

\[ |J_t - J_{\mathrm{MF}}| \le \epsilon_2. \tag{2.19} \]

In particular, both the magnetization m⋆(J) and the energy density e⋆(J) undergo a jump at J = Jt whenever d is sufficiently large.

As in the case of the Potts model, our technique does not allow us to conclude that Jt is the only value of J where the magnetization undergoes a jump. In this case, we do not even know that the magnetization is a monotone function of J; the conclusions (2.17–2.18) can be made because we know that the energy density is close to ½ m⋆(J)² and is (as always) a non-decreasing function of J. Finally, we also cannot prove that, in the state with large magnetization in the direction ê1, there will be no additional symmetry breaking in the other directions. Further analysis, based perhaps on graphical representations, is needed.
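The maximal positive solution of (2.16) is easy to approximate numerically. The sketch below is an illustration only, with hypothetical function names of our choosing. It relies on two elementary facts: the right-hand side of (2.16) is increasing in m ≥ 0 and is strictly less than 1, so iterating it from m = 1 produces a decreasing sequence that converges to the largest fixed point in [0, 1].

```python
import math


def cubic_rhs(J, r, m):
    """Right-hand side of the mean-field equation (2.16)."""
    return math.sinh(J * m) / (r - 1 + math.cosh(J * m))


def cubic_max_solution(J, r, tol=1e-14, max_iter=1_000_000):
    """Largest solution m in [0, 1] of (2.16), by monotone iteration from m = 1.

    Returns a value close to 0 when m = 0 is the only solution.
    Convergence can be slow near a bifurcation point.
    """
    m = 1.0
    for _ in range(max_iter):
        m_next = cubic_rhs(J, r, m)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m


if __name__ == "__main__":
    # r = 1 reduces (2.16) to the Ising equation m = tanh(J m).
    print(cubic_max_solution(2.0, 1))
    print(cubic_max_solution(6.0, 4))
```

The routine only solves (2.16); deciding whether the resulting state is a global minimizer (that is, locating JMF) additionally requires comparing free energies, which is not attempted here.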

2.3. Nematic liquid-crystal model. The nematic models are designed to study the behavior of liquid crystals; see the monograph [25] for more background on the subject. In the simplest cases, a liquid crystal may be regarded as a suspension of rod-like molecules which, for all intents and purposes, are symmetric around their midpoint. For the models of direct physical relevance, each rod (or a small collection of rods) is described by a three-dimensional spin and one considers only interactions that are (globally) O(3)-invariant and invariant under the (local) reversal of any spin. The simplest latticized version of such a system is described by the Hamiltonian

\[ \beta H(s) = -\frac{J}{2d} \sum_{\langle x,y \rangle} (s_x \cdot s_y)^2, \tag{2.20} \]

with sx a unit vector in R3 and x ∈ Zd with d = 2 or d = 3. We will study the above Hamiltonian, but we will consider general dimensions d (provided d ≥ 3) and spins that are unit vectors in any RN (provided N ≥ 3).

The Hamiltonian (2.20) can be rewritten into the form (1.1) as follows [25]: Let E be the space of all traceless N × N matrices with real coefficients and let Ω be the set of those matrices Q = (Qαβ) ∈ E for which there is a unit vector v = (vα) ∈ RN such that

\[ Q_{\alpha\beta} = v_\alpha v_\beta - \frac{1}{N}\, \delta_{\alpha\beta}, \qquad \alpha, \beta = 1, \dots, N. \tag{2.21} \]

Writing Qx for the matrix arising from the spin sx via (2.21), the interaction term becomes

\[ (s_x \cdot s_y)^2 = \mathrm{Tr}(Q_x Q_y) + \frac{1}{N}. \tag{2.22} \]
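The identity (2.22) follows by expanding the trace with (2.21) and using |v| = 1. As a sanity check, the following Python sketch (the helper names are ours, chosen for illustration) builds Q from random unit vectors and compares both sides.

```python
import math
import random


def unit_vector(n, rng):
    """Random unit vector in R^n obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def q_matrix(v):
    """Traceless matrix Q from (2.21): Q_ab = v_a v_b - delta_ab / N."""
    n = len(v)
    return [[v[a] * v[b] - (1.0 / n if a == b else 0.0) for b in range(n)]
            for a in range(n)]


def trace_product(A, B):
    """Tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[a][b] * B[b][a] for a in range(n) for b in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    s, t = unit_vector(3, rng), unit_vector(3, rng)
    lhs = sum(a * b for a, b in zip(s, t)) ** 2
    rhs = trace_product(q_matrix(s), q_matrix(t)) + 1.0 / 3
    print(lhs - rhs)  # zero up to rounding error
```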

Now E is a finite-dimensional vector space and (Q, Q′) = Tr(QQ′) is an inner product on E, so (2.20) indeed takes the desired form (1.1), up to a constant that has no relevance for physics. The a priori measure on Ω is a pull-back of the uniform distribution on the unit sphere in RN. More precisely, if v is uniformly distributed on the unit sphere in RN, then Q ∈ Ω is a random variable arising from v via (2.21). As a consequence, the a priori distribution is invariant under the action of the Lie group O(N, R) given by

\[ Q_x \mapsto g^{-1} Q_x\, g, \qquad g \in O(N, \mathbb{R}). \tag{2.23} \]

The parameter signaling the phase transition, the so-called order parameter, is "tensor" valued. In particular, it corresponds to the expectation of Q0. The order parameter can always be diagonalized. The diagonal form is not unique; however, we can find an orthogonal transformation that puts the eigenvalues in a decreasing order. Thus the order parameter is effectively an N-vector λ = (λ1, . . . , λN) such that λ1 ≥ λ2 ≥ · · · ≥ λN. We note that, since each Qx is traceless, we have Σk λk = 0.

The previous discussion suggests the following definition of the scalar order parameter: For J ≥ 0, we let λ⋆(J) be the value of the largest non-negative eigenvalue of the matrix ⟨Q0⟩J, optimized over all translation-invariant Gibbs states for the coupling constant J. As far as rigorous results about the quantity λ⋆(J) are concerned, we know from [6] that (in d ≥ 3) λ⋆(J) > 0 once J is sufficiently large. On the other hand, standard high-temperature techniques (see e.g. [5, 7, 17]) show that if J is sufficiently small then there is a unique Gibbs state. In particular, since this state is then invariant under the action (2.23) of the full O(N, R) group, this necessitates that λ⋆(J) ≡ 0 for J small enough. The goal of this section is to show that λ⋆(J) actually undergoes a jump as J varies.

The mean-field theory of the nematic model is formidable. Indeed, for any particular N it does not seem possible to obtain a workable expression for ΦJ(λ), even if we allow that the components of λ have only two distinct values (which is usually assumed without apology in the physics literature). Notwithstanding, this simple form of the vector minimizer and at least some of the anticipated properties can be established:

Proposition 2.4. Consider the O(N)-nematic model for N ≥ 3. Then every local minimum of ΦJ(λ) is an orthogonal transformation of the matrix

\[ \lambda = \mathrm{diag}\Bigl( \lambda, -\frac{\lambda}{N-1}, \dots, -\frac{\lambda}{N-1} \Bigr), \tag{2.24} \]

where λ is a non-negative solution to the equation

\[ \lambda = \frac{\int_0^1 dx\, (1 - x^2)^{\frac{N-3}{2}}\, e^{\frac{JN\lambda}{N-1} x^2} \bigl( x^2 - \frac{1}{N} \bigr)}{\int_0^1 dx\, (1 - x^2)^{\frac{N-3}{2}}\, e^{\frac{JN\lambda}{N-1} x^2}}. \tag{2.25} \]

In particular, there is an increasing and right-continuous function J ↦ λMF(J) such that the unique minimizer of ΦJ(λ) is λ = 0 for J < JMF, while for any J > JMF, the function ΦJ(λ) is minimized by the orthogonal transformations of

\[ \lambda = \mathrm{diag}\Bigl( \lambda_{\mathrm{MF}}(J), -\frac{\lambda_{\mathrm{MF}}(J)}{N-1}, \dots, -\frac{\lambda_{\mathrm{MF}}(J)}{N-1} \Bigr). \tag{2.26} \]

At the continuity points of λMF : (JMF, ∞) → [0, 1], these are the only global minimizers of ΦJ.

Based on the pictorial solution of the problem by physicists, see e.g. [25], we would expect that J ↦ λMF(J) is continuous on its domain and, in fact, corresponds to the maximal positive solution to (2.25). (This boils down to showing a certain convexity-concavity property of the function on the right-hand side of (2.25).) While we could not establish this fact for all N ≥ 3, we were successful at least for N sufficiently large. The results of the large-N analysis are summarized as follows:

Proposition 2.5. Consider the O(N)-nematic model for N ≥ 3 and let $\lambda^{(N)}_{\mathrm{MF}}(J)$ be the maximal positive solution to (2.25). Then there exists an N0 ≥ 3 and, for each N ≥ N0, a number JMF = JMF(N) ∈ (0, ∞) such that for each N ≥ N0, the unique minimizer of ΦJ(λ) is λ = 0 for J < JMF, while for any J > JMF, the function ΦJ(λ) is minimized only by the orthogonal transformations of (2.26), with λMF(J) > 0. The function $J \mapsto \lambda^{(N)}_{\mathrm{MF}}(J)$ is continuous and strictly increasing on its domain and has the following large-N asymptotic: For all J ≥ 2,

\[ \lim_{N\to\infty} \lambda^{(N)}_{\mathrm{MF}}(JN) = \frac{1}{2} \Bigl( 1 + \sqrt{1 - 4J^{-2}} \Bigr). \tag{2.27} \]

Moreover, there exists a $J^{(\infty)}_{\mathrm{MF}}$ (with $J^{(\infty)}_{\mathrm{MF}} \approx 2.455$) such that

\[ \lim_{N\to\infty} \frac{J_{\mathrm{MF}}(N)}{N} = J^{(\infty)}_{\mathrm{MF}}. \tag{2.28} \]
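Although no closed form is available, the right-hand side of (2.25) is straightforward to evaluate numerically. The sketch below is a minimal illustration with names of our choosing; it uses a composite Simpson rule, which is an assumption of this example and not a method taken from the paper. At λ = 0 the right-hand side vanishes, since the mean of the square of one coordinate of a uniform unit vector in RN equals 1/N, and this provides a simple consistency check.

```python
import math


def nematic_rhs(J, N, lam, panels=2000):
    """Right-hand side of (2.25), via the composite Simpson rule on [0, 1].

    `panels` must be even. Accuracy is reduced for even N, where the
    integrand has a square-root type endpoint behaviour at x = 1.
    """
    c = J * N * lam / (N - 1)
    power = (N - 3) / 2.0

    def weight(x):
        # Marginal density of one coordinate of a uniform unit vector,
        # tilted by the mean-field exponential factor.
        return (1.0 - x * x) ** power * math.exp(c * x * x)

    def simpson(func):
        h = 1.0 / panels
        total = func(0.0) + func(1.0)
        for i in range(1, panels):
            total += (4 if i % 2 else 2) * func(i * h)
        return total * h / 3.0

    numerator = simpson(lambda x: weight(x) * (x * x - 1.0 / N))
    denominator = simpson(weight)
    return numerator / denominator


if __name__ == "__main__":
    print(nematic_rhs(5.0, 3, 0.0))   # vanishes up to quadrature error
    print(nematic_rhs(5.0, 3, 0.5))
```

Iterating λ ↦ nematic_rhs(J, N, λ) from a starting value such as 1 − 1/N is one way to look for positive solutions of (2.25); whether the limit is the maximal solution is exactly the kind of convexity-concavity question mentioned above, so such an iteration should be treated as exploratory.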


Now we are ready to state our main theorem concerning O(N)-nematics. As can be gleaned from a careful reading, our conclusions are not quite as strong as in the previous cases (due to the intractability of the associated mean-field theory). Nevertheless, a bona fide first-order transition is established for these systems.

Theorem 2.6 (Nematic model). Consider the O(N)-nematic model with the Hamiltonian (2.20) and J ≥ 0. For each N ≥ 3, there exists a non-negative function J ↦ λMF(J), a constant Jt = Jt(N, d) and two numbers ε1 = ε1(d, J) > 0 and ε2 = ε2(d) > 0 satisfying ε1(d, J) → 0, uniformly on finite intervals of J, and ε2(d) → 0 as d → ∞, such that the following holds: For all J ≥ 0, the matrix λ = diag(λMF(J), −λMF(J)/(N − 1), . . . , −λMF(J)/(N − 1)) is a local minimum of ΦJ. Moreover, we have the bounds

\[ \lambda_\star(J) \le \epsilon_1 \quad\text{for } J < J_t \tag{2.29} \]

and

\[ |\lambda_\star(J) - \lambda_{\mathrm{MF}}(J)| \le \epsilon_1 \quad\text{for } J > J_t. \tag{2.30} \]

Furthermore,

\[ |J_t - J_{\mathrm{MF}}| \le \epsilon_2. \tag{2.31} \]

In particular, λ⋆(J) ≥ κ > 0 for all J > Jt and all N ≥ 3 and both the order parameter and the energy density e⋆(J) undergo a jump at J = Jt, provided the dimension is sufficiently large.

The upshot of the previous theorem is that the high-temperature region with λ⋆ = 0 and the low-temperature region with λ⋆ ≠ 0 (whose existence was proved in [6]) are separated by a first-order transition. However, as with the other models, our techniques are not sufficient to prove that λ⋆ is exactly zero for all J < Jt, nor, for J > Jt, that all states are devoid of some other additional breakdown of symmetry. Notwithstanding, general theorems about Gibbs measures guarantee that a jump of J ↦ λ⋆(J) at J = Jt implies the coexistence of a "high-temperature" state with various symmetry-broken "low-temperature" states.

3. Proofs of Mean-Field Bounds

3.1. Convexity estimates. In order to prove Theorem 1.1, we need to recall a few standard notions from convexity theory and prove a simple lemma. Let A ⊂ Rn be a convex set. Then we define the affine hull of A by the formula

\[ \operatorname{aff} A = \bigl\{ \lambda x + (1 - \lambda) y : x, y \in A,\ \lambda \in \mathbb{R} \bigr\}. \tag{3.1} \]

(Alternatively, aff A is the smallest affine subset of Rn containing A.) This concept allows us to define the relative interior, ri A, of A as the set of all x ∈ A for which there exists an ε > 0 such that

\[ y \in \operatorname{aff} A \quad\&\quad |y - x| \le \epsilon \quad\Rightarrow\quad y \in A. \tag{3.2} \]

It is noted that this definition of relative interior differs from the standard topological definition. For us it is important that the standard (topological) closure of ri A is simply the standard closure of A . We refer to [51] for more details.


Lemma 3.1. For each m ∈ ri{m′ ∈ E : S(m′) > −∞}, there exists a vector h ∈ E such that ∇G(h) = m.

Results of this sort are quite well known; e.g., with some effort this can be gleaned from Lemma 2.2.12 in [16] combined with the fact that the so-called exposed points of S(m) can be realized as ∇G(h) for some h. For completeness, we provide a full derivation which exploits the particulars of the setup at hand.

Proof. Let C abbreviate {m′ ∈ E : S(m′) > −∞} and let m ∈ ri C. Let us define the set V = {m′ − m : m′ ∈ aff C}. It is easy to see that V is in fact the affine hull of the shifted set C − m and, since 0 ∈ V, it is a closed linear subspace of E. First we claim that the infimum in (1.4) can be restricted to h ∈ V. Indeed, if h, a ∈ E, then the convexity of h ↦ G(h) gives

\[ G(h + a) - (h + a, m) \ge G(h) - (h, m) + \bigl( a, \nabla G(h) - m \bigr) \tag{3.3} \]

for any m. This implies that ∇G(h) has a finite entropy, i.e., ∇G(h) ∈ C for any h ∈ E. Now let m be as above and a ∈ V⊥. Then an inspection of the definition of V shows that the last term in (3.3) identically vanishes. Consequently, for the infimum (1.4), we will always be better off with h ∈ V.

Let hk ∈ V be a minimizing sequence for S(m); i.e., G(hk) − (hk, m) → S(m) as k → ∞. We claim that hk contains a subsequence tending to a finite limit. Indeed, if on the contrary |hk| → ∞, we let τk = hk/|hk| and suppose that τk → τ (at least along a subsequence), where |τ| = 1. Now since m ∈ ri C and τ ∈ V, we have m + ετ ∈ aff C for all ε and, by (3.2), m + ετ ∈ C for some ε > 0 sufficiently small. But we also have

\[ G(h_k) - (h_k, m + \epsilon\tau) = G(h_k) - (h_k, m) - \epsilon\, |h_k|\, (\tau_k, \tau), \tag{3.4} \]

which tends to negative infinity because (τk, τ) → 1 and |hk| → ∞. But then S(m + ετ) = −∞, which contradicts that m + ετ ∈ C. Thus hk contains a converging subsequence, hkj → h. Using that h is an actual minimizer of G(h) − (h, m), it follows that ∇G(h) = m.

Now we are ready to prove our principal convexity bound:

Proof of Theorem 1.1. Recall that FMF(J, b) denotes the infimum of ΦJ,b(m) over all m ∈ Conv(Ω). As a first step, we will prove that there is a constant C < ∞ such that for any finite Λ ⊂ Zd and any boundary condition S∂Λ, the partition function obeys the bound

\[ Z_\Lambda(S_{\partial\Lambda}) \ge e^{-|\Lambda| F_{\mathrm{MF}}(J,b) - C|\partial\Lambda|}, \tag{3.5} \]

where |Λ| denotes the number of sites in Λ and |∂Λ| denotes the number of bonds of Zd with one end in Λ and the other in Zd \ Λ. (This is an explicit form of the well known fact that the free energy is always lower than the associated mean-field free energy, see [19, 52].) To prove (3.5), let MΛ denote the total magnetization in Λ,

\[ M_\Lambda = \sum_{x \in \Lambda} S_x, \tag{3.6} \]

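and let $\langle - \rangle^{(\Lambda)}_{0,h}$ be the a priori state in Λ tilted with a uniform magnetic field h, i.e., for any measurable function f of the configurations in Λ,

\[ \langle f \rangle^{(\Lambda)}_{0,h} = e^{-|\Lambda| G(h)} \bigl\langle f\, e^{(h, M_\Lambda)} \bigr\rangle_0. \tag{3.7} \]

Fix an h ∈ E and let mh = ∇G(h). By inspection, $\nabla G(h) = \langle S_x \rangle^{(\Lambda)}_{0,h}$ for all x ∈ Λ. Then

\[ Z_\Lambda(S_{\partial\Lambda}) = e^{|\Lambda| G(h)} \bigl\langle e^{-(h, M_\Lambda) - \beta H_\Lambda(S_\Lambda | S_{\partial\Lambda})} \bigr\rangle^{(\Lambda)}_{0,h}, \tag{3.8} \]

which using Jensen's inequality gives

\[ Z_\Lambda(S_{\partial\Lambda}) \ge \exp\Bigl\{ |\Lambda| \bigl[ G(h) - (h, m_h) \bigr] - \bigl\langle \beta H_\Lambda(S_\Lambda | S_{\partial\Lambda}) \bigr\rangle^{(\Lambda)}_{0,h} \Bigr\}. \tag{3.9} \]

To estimate the expectation of βHΛ(SΛ|S∂Λ), we first discard (through a bound) the boundary terms and then evaluate the contribution of the interior bonds. Since the number of interior bonds in Λ is more than d|Λ| − |∂Λ|, this gets us

\[ -\bigl\langle \beta H_\Lambda(S_\Lambda | S_{\partial\Lambda}) \bigr\rangle^{(\Lambda)}_{0,h} \ge |\Lambda| \Bigl( \frac{J}{2}\, |m_h|^2 + (b, m_h) \Bigr) - C|\partial\Lambda|. \tag{3.10} \]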

Now G(h) − (h, mh) ≥ S(mh), so we have $Z_\Lambda(S_{\partial\Lambda}) \ge e^{-|\Lambda| \Phi_{J,b}(m_h) - C|\partial\Lambda|}$. But Lemma 3.1 guarantees that each m with S(m) > −∞ can be approximated by a sequence of mh with h ∈ E, so the bound (3.5) follows by optimizing over h ∈ E.

Next, let νJ,b be an infinite volume Gibbs state and let ⟨−⟩J,b denote expectation with respect to νJ,b. Then we claim that

\[ e^{|\Lambda| G(h)} = \bigl\langle e^{(h, M_\Lambda) + \beta H_\Lambda(S_\Lambda | S_{\partial\Lambda})}\, Z_\Lambda(S_{\partial\Lambda}) \bigr\rangle_{J,b}. \tag{3.11} \]

(Here SΛ, resp. S∂Λ denote the part of the same configuration S inside, resp., outside Λ. Note that the relation looks trivial for h = 0.) Indeed, the conditional distribution in νJ,b given that the configuration outside Λ equals S′ is $\nu_\Lambda^{(S')}$, as defined in (1.2). But then (1.2) tells us that

\[ \int e^{(h, M_\Lambda) + \beta H_\Lambda(S_\Lambda | S')}\, Z_\Lambda(S')\, \nu_\Lambda^{(S')}(dS_\Lambda) = \int e^{(h, M_\Lambda)}\, \mu(dS_\Lambda) = e^{|\Lambda| G(h)}. \tag{3.12} \]

The expectation over the boundary condition S′ then becomes irrelevant and (3.11) is proved.

Now suppose that νJ,b is the Zd-translation and rotation invariant Gibbs measure in question and recall that m⋆ = ⟨S0⟩J,b, where ⟨−⟩J,b denotes the expectation with respect to νJ,b. To prove our desired estimate, we use (3.5) on the right-hand side of (3.11) and apply Jensen's inequality to get

\[ e^{|\Lambda| G(h)} \ge \exp\bigl\{ \bigl\langle (h, M_\Lambda) + \beta H_\Lambda \bigr\rangle_{J,b} \bigr\}\, e^{-|\Lambda| F_{\mathrm{MF}}(J,b) - C|\partial\Lambda|}. \tag{3.13} \]

Using the invariance of the state νJ,b with respect to the translations and rotations of Zd, we have

\[ \bigl\langle (h, M_\Lambda) \bigr\rangle_{J,b} = |\Lambda|\, (h, m_\star) \tag{3.14} \]

while

\[ \langle \beta H_\Lambda \rangle_{J,b} \ge -|\Lambda|\, \frac{J}{2}\, \bigl\langle (S_0, S_x) \bigr\rangle_{J,b} - |\Lambda|\, (b, m_\star) - C'|\partial\Lambda|, \tag{3.15} \]

where C′ is a constant that bounds the worst-case boundary term and where x stands for any neighbor of the origin. By plugging these bounds back into (3.13) and passing to the thermodynamic limit, we conclude that

\[ -G(h) + (h - b, m_\star) - \frac{J}{2}\, \bigl\langle (S_0, S_x) \bigr\rangle_{J,b} \le F_{\mathrm{MF}}(J, b). \tag{3.16} \]

Now optimizing the left-hand side over h ∈ E allows us to replace −G(h) + (h, m⋆) by −S(m⋆). Then the bound (1.9) follows by adding and subtracting the term (J/2)|m⋆|² on the left-hand side.
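For orientation, the objects G, S and the statement of Lemma 3.1 can be made completely explicit in the simplest case. The sketch below treats the Ising model with the uniform a priori measure on {−1, +1}; this choice, the function names, and the form ΦJ,b(m) = −(J/2)|m|² − (b, m) − S(m) of the mean-field free energy (defined in Sect. 1, outside the present passage, and here inferred from (3.16)) are assumptions of this illustration.

```python
import math


def ising_G(h):
    """G(h) = log <e^{hS}>_0 for S uniform on {-1, +1}, i.e. log cosh h."""
    return math.log(math.cosh(h))


def ising_grad_G(h):
    """Gradient of G, i.e. the magnetization of a single tilted spin."""
    return math.tanh(h)


def ising_entropy(m):
    """S(m) = inf_h [G(h) - h m] in closed form, valid for |m| < 1."""
    return -0.5 * (1 + m) * math.log(1 + m) - 0.5 * (1 - m) * math.log(1 - m)


def ising_phi(J, m, b=0.0):
    """Assumed mean-field free energy -(J/2) m^2 - b m - S(m)."""
    return -0.5 * J * m * m - b * m - ising_entropy(m)


if __name__ == "__main__":
    m = 0.3
    h = math.atanh(m)  # the field promised by Lemma 3.1: grad G(h) = m
    print(ising_grad_G(h) - m, ising_G(h) - h * m - ising_entropy(m))
```

Here the relative interior of {m : S(m) > −∞} is the open interval (−1, 1), and for every such m the field h = artanh(m) realizes ∇G(h) = m, in agreement with Lemma 3.1; the endpoints m = ±1 have finite entropy but are not attained by any finite h.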

3.2. Infrared bound. Our proof of the Key Estimate (and hence the Main Theorem) requires the use of the infrared bounds, which in turn are derived from reflection positivity. The connection between infrared bounds and reflection positivity dates back (at least) to [18, 22–24]. However, the present formulation (essentially already contained in [12, 24, 41]) emphasizes more explicitly the role of the "k = 0" Fourier mode of the two-point correlation function by subtracting the square of the background average.

Reflection positivity is greatly facilitated by first considering finite systems with periodic boundary conditions. If it happens that there is a unique Gibbs state for parameter values J and b then the proof of the Key Estimate is straightforward: there is no difficulty with putting the system on a torus and taking the limit. In particular, the Key Estimate amounts (more or less) to Corollary 2.5 in [24]. But when there are several infinite-volume Gibbs states, we can anticipate trouble with the naive limits of the finite-volume torus states. Fortunately, Gibbsian uniqueness is not essential to our arguments. Below we list two properties of Gibbs states which allow a straightforward proof of the desired infrared bound. Then we show that in general we can obtain the infrared bound for states of interest by an approximation argument.

Property 1. An infinite-volume Gibbs measure νJ,b (not necessarily extremal) for the interaction (1.1) is called a torus state if it can be obtained by a (possibly subsequential) weak limit as L → ∞ of the Gibbs states in volume [−L, L]d ∩ Zd, for the interaction (1.1) with periodic boundary conditions.

Given J and b, we let M⋆(J, b) denote the subset of Conv(Ω) containing all magnetizations achieved by infinite-volume translation-invariant Gibbs states for the interaction (1.1). Next, recall the notation MΛ from (3.6) for the total magnetization in Λ ⊂ Zd.

Property 2. An infinite-volume Gibbs measure νJ,b (not necessarily extremal) for the interaction (1.1) is said to have block-average magnetization m if

\[ \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{M_\Lambda}{|\Lambda|} = m, \qquad \nu_{J,b}\text{-almost surely}. \tag{3.17} \]

Here the convergence Λ ↑ Zd is along the net of all the finite boxes Λ ⊂ Zd with partial order induced by set inclusion. (See [26] for more details.)


Our first goal is to show that every torus state with a deterministic block-average magnetization satisfies the infrared bound. Suppose d ≥ 3 and let D−1 denote the Fourier transform of the inverse lattice Laplacian with Dirichlet boundary condition. In lattice coordinates, D−1 has the representation

\[ D^{-1}(x, y) = \int_{[-\pi,\pi]^d} \frac{d^d k}{(2\pi)^d}\, \frac{1}{D(k)}\, e^{ik\cdot(x-y)}, \qquad x, y \in \mathbb{Z}^d, \tag{3.18} \]

where $D(k) = 1 - \frac{1}{d} \sum_{j=1}^{d} \cos(k_j)$. Note that the integral converges by our assumption that d ≥ 3.
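The convergence of (3.18) for d ≥ 3 can be seen concretely on the diagonal x = y in d = 3, where the integrand behaves like 6/|k|² near k = 0 and is therefore integrable. The sketch below is an illustration only (the function name and the midpoint quadrature are our choices); it approximates D−1(x, x) on a grid that avoids the singular point. The exact value is classical and is approximately 1.516.

```python
import math


def lattice_green_diagonal_d3(n=32):
    """Midpoint-rule estimate of D^{-1}(x, x) from (3.18) in d = 3.

    n is the number of grid cells per coordinate and must be even, so that
    no midpoint coincides with the integrable singularity at k = 0.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    step = 2.0 * math.pi / n
    cosines = [math.cos(-math.pi + (i + 0.5) * step) for i in range(n)]
    total = 0.0
    for c1 in cosines:
        for c2 in cosines:
            for c3 in cosines:
                # 1 / D(k) with D(k) = 1 - (cos k1 + cos k2 + cos k3) / 3.
                total += 1.0 / (1.0 - (c1 + c2 + c3) / 3.0)
    # Cell volume step^3 divided by (2 pi)^3 equals 1 / n^3.
    return total / n ** 3


if __name__ == "__main__":
    print(lattice_green_diagonal_d3(32))
```

Because of the singularity the midpoint rule converges slowly (the error decays roughly like 1/n), so the routine is meant as a plausibility check and not as a precise evaluation. In d = 1, 2 the analogous sums grow without bound as n increases, which is the lattice counterpart of the divergence of (3.18).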

Lemma 3.2. Let d ≥ 3 and suppose that νJ,b is a Gibbs state for interaction (1.1) satisfying Properties 1 and 2. Let ⟨−⟩J,b denote the expectation with respect to νJ,b and let m denote the value of magnetization in νJ,b. Then for all (vx)x∈Zd such that vx ∈ R and Σx∈Zd |vx| < ∞,

\[ \sum_{x,y \in \mathbb{Z}^d} v_x v_y \bigl\langle (S_x - m, S_y - m) \bigr\rangle_{J,b} \le n J^{-1} \sum_{x,y \in \mathbb{Z}^d} v_x v_y\, D^{-1}(x, y). \tag{3.19} \]

Here n denotes the dimension of E.

Proof. Let ΛL = [−L, L]d ∩ Zd and let $\nu^{(L)}_{J,b}$ be the finite-volume Gibbs state in ΛL for the interaction (1.1) with periodic boundary conditions. Let

\[ \Lambda_L^\star = \Bigl\{ \Bigl( \tfrac{2\pi}{2L+1}\, n_1, \dots, \tfrac{2\pi}{2L+1}\, n_d \Bigr) : -L \le n_i \le L \Bigr\} \tag{3.20} \]

denote the reciprocal lattice. Let (wx)x∈ΛL be a collection of vectors from E satisfying that wx ≠ 0 for only a finite number of x ∈ Zd and Σx∈ΛL wx = 0. Let $\langle - \rangle^{(L)}_{J,b}$ denote the expectation with respect to $\nu^{(L)}_{J,b}$. Then we have the infrared bound [22–24],

\[ \sum_{x,y \in \Lambda_L} \bigl\langle (w_x, S_x)(w_y, S_y) \bigr\rangle^{(L)}_{J,b} \le J^{-1} \sum_{x,y \in \Lambda_L} (w_x, w_y)\, D_L^{-1}(x, y), \tag{3.21} \]

where

\[ D_L^{-1}(x, y) = \frac{1}{|\Lambda_L|} \sum_{k \in \Lambda_L^\star \setminus \{0\}} \frac{1}{D(k)}\, e^{ik\cdot(x-y)}. \tag{3.22} \]

Now, let ê1, . . . , ên be an orthonormal basis in E and apply (3.21) with the vector-valued weights wx êℓ in place of wx, where (wx)x∈Zd is now a real-valued collection such that wx ≠ 0 only for a finite number of x ∈ Zd and

\[ \sum_{x \in \mathbb{Z}^d} w_x = 0. \tag{3.23} \]

Passing to the limit L → ∞ in such a way that $\nu^{(L)}_{J,b}$ converges to the state νJ,b, and then summing over ℓ = 1, . . . , n gets us the bound

\[ \sum_{x,y \in \mathbb{Z}^d} w_x w_y \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} \le n J^{-1} \sum_{x,y \in \mathbb{Z}^d} w_x w_y\, D^{-1}(x, y). \tag{3.24} \]


So far we have (3.24) only for (wx) with a finite support. But, using the fact that both quantities D−1(x, y) and ⟨(Sx, Sy)⟩J,b are uniformly bounded, (3.24) is easily extended to all absolutely-summable (wx)x∈Zd (i.e., those satisfying Σx∈Zd |wx| < ∞) which obey the constraint (3.23).

Let (vx) be as specified in the statement of the lemma and let a = Σx∈Zd vx. Fix K, let ΛK be as above and define $(w_x^{(K)})$ by

\[ w_x^{(K)} = v_x - \frac{a}{|\Lambda_K|}\, 1_{\{x \in \Lambda_K\}}. \tag{3.25} \]

Clearly, these $(w_x^{(K)})$ obey the constraint (3.23). Our goal is to recover (3.19) from (3.24) in the K → ∞ limit. Indeed, plugging this particular $(w_x^{(K)})$ into (3.24), the left-hand side opens into four terms. The first of these is the sum of vx vy ⟨(Sx, Sy)⟩J,b, which is part of what we want in (3.19). The second and the third terms are of the same form and both amount (up to an overall minus sign) to

\[ a\, \frac{1}{|\Lambda_K|} \sum_{x,y} v_x\, 1_{\{y \in \Lambda_K\}} \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} = a \sum_x v_x \Bigl\langle \Bigl( S_x, \frac{1}{|\Lambda_K|} \sum_{y \in \Lambda_K} S_y \Bigr) \Bigr\rangle_{J,b}. \tag{3.26} \]

By our assumption of a sharp block-average magnetization in νJ,b, the average of the spins in ΛK can be replaced, in the K → ∞ limit, by m. Similarly, we claim that

\[ \lim_{K\to\infty} \frac{1}{|\Lambda_K|^2} \sum_{x,y \in \Lambda_K} \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} = |m|^2, \tag{3.27} \]

so, recalling the definition of a, the left-hand side is in a good shape. As for the right-hand side of (3.24) with $(w_x) = (w_x^{(K)})$, here we invoke the fact that (for d ≥ 3)

\[ \lim_{K\to\infty} \frac{1}{|\Lambda_K|} \sum_{x \in \Lambda_K} D^{-1}(x, y) = 0, \tag{3.28} \]

uniformly in y ∈ Zd. The claim therefore follows.

Next we show that for any parameters J and b, and any m⋆ ∈ M⋆(J, b), we can always find a state with magnetization m⋆ that is a limit of states satisfying Properties 1 and 2.

Lemma 3.3. For all J > 0, all b ∈ E and all m⋆ ∈ M⋆(J, b), there are sequences (Jk), (bk) and (mk) with Jk → J, bk → b, mk → m⋆ and M⋆(Jk, bk) = {mk}. In particular, there is a sequence (νJk,bk) of infinite-volume Gibbs measures satisfying Properties 1 and 2, which weakly converge (possibly along a subsequence) to a measure νJ,b with magnetization m⋆.

Proof. The proof uses a little more of the convexity theory, so let us recapitulate the necessary background. Let f : Rn → (−∞, ∞) be a convex and continuous function. Let (·, ·) denote the inner product in Rn. For each x ∈ Rn, let S(x) be the set of all possible limits of the gradients ∇f(xk) for sequences xk ∈ Rn such that xk → x as k → ∞. Then Theorem 25.6 of [51] says that the set of all subgradients ∂f(x) of f at x,

\[ \partial f(x) = \bigl\{ a \in \mathbb{R}^n : f(y) - f(x) \ge (y - x, a),\ y \in \mathbb{R}^n \bigr\}, \tag{3.29} \]

can be written as

\[ \partial f(x) = \mathrm{Conv}(S(x)), \tag{3.30} \]

where Conv(S(x)) is the closed, convex hull of S(x). (Here we noted that since the domain of f is all of Rn, the so-called normal cone is empty at all x ∈ Rn.) But S(x) is closed and thus Conv(S(x)) is simply the convex hull of S(x). Now, by Corollary 18.3.1 of [51], we also know that if S ⊂ Rn is a bounded set of points and C is its convex hull (no closure), then every extreme point of C is a point from S. Thus, we conclude: every extreme point of ∂f(x) lies in S(x).

Now we can apply the above general facts to our situation. Let F(J, b) be the infinite-volume free energy of the model in (1.1). Noting that F(J, b) is defined for all J ∈ R and all b ∈ E, the domain of F is R × E. By well known arguments, F is continuous and concave. Moreover, a comparison of (1.11) and (3.30) shows that K⋆(J, b) is, up to a sign change, the subdifferential of F at (J, b). As a consequence of the previous paragraph, every extreme point [e⋆, m⋆] ∈ K⋆(J, b) is given by a limit limk→∞ [ek, mk], where [ek, mk] are such that K⋆(Jk, bk) = {[ek, mk]} for some Jk → J and bk → b. But m⋆ ∈ M⋆(J, b) implies that [e⋆, m⋆] is an extreme point of K⋆(J, b) for some e⋆, so the first part of the claim follows.

To prove the second part, note that any infinite-volume limit of the finite-volume Gibbs state with periodic boundary condition and parameters Jk and bk must necessarily have energy density ek and magnetization mk. By compactness of the set of all Gibbs states (which is ensured by compactness of Ω), there is at least one (subsequential) limit ⟨−⟩J,b of the torus states as Jk → J and bk → b, which is then a translation-invariant Gibbs state with parameters J and b such that

\[ e_\star = \bigl\langle (S_x, S_y) \bigr\rangle_{J,b} \quad\text{and}\quad m_\star = \langle S_x \rangle_{J,b}, \tag{3.31} \]

where x and y are any pair of nearest neighbors of Zd. However, the block-average values of both quantities must be constant almost surely, because otherwise ⟨−⟩J,b could have been decomposed into at least two ergodic states with distinct values of the energy-density/magnetization pair, which would in turn contradict that [e⋆, m⋆] is an extreme point of K⋆(J, b).

We note that the limiting measure is automatically Zd-translation and rotation invariant and, in addition, satisfies the block-average property. But, in the cases that are of specific interest to the present work (i.e., when M⋆(J, b) contains several elements), there is little hope that such a state is a torus state. Nevertheless, we can prove:

Corollary 3.4. Let J ≥ 0 and b ∈ E. Then for any m⋆ ∈ M⋆(J, b), there exists a state νJ,b with (block-average) magnetization m⋆ for which the infrared bound (3.19) holds. Moreover, the state νJ,b is Zd-translation and rotation invariant.

Proof. For J = 0 we have a unique Gibbs state and the claim trivially holds. Otherwise, all of this follows from the weak convergence of the νJk,bk discussed above.

3.3. Proof of Main Theorem. Now we have all the ingredients ready to prove Lemma 1.3:


Proof of Lemma 1.3. Fix m⋆ ∈ M⋆(J, b) and let νJ,b be the state described in Corollary 3.4. To prove our claim, it just remains to choose (vx) as follows:

\[ v_x = \begin{cases} \frac{1}{2d}, & \text{if } |x| = 1, \\ 0, & \text{otherwise}, \end{cases} \tag{3.32} \]

and recall the definition of Id from (1.13).

Having established Lemma 1.3, we are ready to give the proof of the Key Estimate:

Proof of Key Estimate. Let J ≥ 0 and b ∈ E. Let m⋆ ∈ M⋆(J, b) and let ⟨−⟩J,b be the state satisfying (1.15) and (1.17). Our goal is to prove the bound (1.16). To that end, let m0 = m0(S) denote the spatially averaged magnetization of the neighbors of the origin. The rotation symmetry of the state ⟨−⟩J,b then implies

\[ \bigl\langle (S_x, S_0) \bigr\rangle_{J,b} = \bigl\langle (m_0, S_0) \bigr\rangle_{J,b}. \tag{3.33} \]

Next, conditioning on the spin configuration in the neighborhood of the origin, we use the DLR condition for the state ⟨−⟩J,b which results in

\[ \bigl\langle (m_0, S_0) \bigr\rangle_{J,b} = \bigl\langle (m_0, \nabla G(J m_0 + b)) \bigr\rangle_{J,b}. \tag{3.34} \]

Finally, a simple calculation, which uses the fact that m⋆ = ⟨S0⟩J,b = ⟨m0⟩J,b = ⟨∇G(Jm0 + b)⟩J,b, allows us to conclude that

\[ \bigl\langle (m_0, \nabla G(J m_0 + b)) \bigr\rangle_{J,b} - |m_\star|^2 = \bigl\langle \bigl( m_0 - m_\star, \nabla G(J m_0 + b) - \nabla G(J m_\star + b) \bigr) \bigr\rangle_{J,b}. \tag{3.35} \]

To proceed with our estimates, we need to understand the structure of the double gradient of the function G(h). Recall the notation ⟨−⟩0,h for the single-spin state tilted by the external field h. Explicitly, for each measurable function f on Ω, we have $\langle f(S) \rangle_{0,h} = e^{-G(h)} \langle f(S)\, e^{(h,S)} \rangle_0$. Then the components of the double gradient correspond to the components of the covariance matrix of the vector-valued random variable S. In formal vector notation, for any a ∈ E,

\[ (a, \nabla)^2 G(h) = \bigl\langle (a, S - \langle S \rangle_{0,h})^2 \bigr\rangle_{0,h}. \tag{3.36} \]

Pick h0, h1 ∈ E. Then we can write

\[ \bigl( h_1 - h_0, \nabla G(h_1) - \nabla G(h_0) \bigr) = \int_0^1 d\lambda\, \bigl\langle \bigl( h_1 - h_0, S - \langle S \rangle_{0,h_\lambda} \bigr)^2 \bigr\rangle_{0,h_\lambda}, \tag{3.37} \]

where hλ = (1 − λ)h0 + λh1. But the inner product on the right-hand side can be bounded using the Cauchy-Schwarz inequality, and since

\[ \bigl\langle |S - \langle S \rangle_{0,h_\lambda}|^2 \bigr\rangle_{0,h_\lambda} \le \max_{S \in \Omega} (S, S) = \kappa, \tag{3.38} \]

we easily derive that

\[ \bigl( h_1 - h_0, \nabla G(h_1) - \nabla G(h_0) \bigr) \le \kappa\, |h_1 - h_0|^2. \tag{3.39} \]

This estimate, applied with h1 = Jm0 + b and h0 = Jm⋆ + b, shows that the right-hand side of (3.35) can be bounded by κJ ⟨|m0 − m⋆|²⟩J,b. But for this we have the bound from Lemma 1.3: ⟨|m0 − m⋆|²⟩J,b ≤ nJ−1 Id. Putting all the previous arguments together, (1.16) follows.

Proof of Main Theorem. This now follows directly by plugging (1.16) into (1.9).


4. Proofs of Results for Specific Models

By and large, this section is devoted to the specifics of the three models described in Sect. 2. Throughout the entire section, we will assume that b = 0 and henceforth omit b from the notation. We begin with some elementary observations which will be needed in all three cases of interest but which are also of some general applicability.

4.1. General considerations.

4.1.1. Uniform closeness to global minima. We start by showing that, for the systems under study, the magnetization is uniformly close to a mean-field magnetization. Let $M_{\mathrm{MF}}(J)$ denote the set of all local minima of $\Phi_J$. Obviously, if we know that the actual magnetization comes close to minimizing the mean-field free energy, it must be close to a minimum or a “near-minimum” of this function. A useful measure of this closeness is the following: For $J \in [0,\infty]$ and $\vartheta > 0$, we let

$$D_J(\vartheta) = \sup\bigl\{ \operatorname{dist}\bigl(m, M_{\mathrm{MF}}(J)\bigr) : m \in \operatorname{Conv}(\Omega),\ \Phi_J(m) < F_{\mathrm{MF}}(J) + \vartheta \bigr\}, \tag{4.1}$$

where $F_{\mathrm{MF}}(J)$ denotes the absolute minimum of $\Phi_J$. However, to control the “closeness” we will have to make some assumptions about the behavior of the (local) minima of $\Phi_J$. An important property ensuring the desired uniformity in all three models under study is as follows:

Uniformity Property. If $J \ge 0$ and if $m \in \operatorname{Conv}(\Omega)$ is a global minimum of $\Phi_J$, then there is an $\epsilon > 0$ and a continuous function $\tilde m\colon [J-\epsilon, J+\epsilon] \to \operatorname{Conv}(\Omega)$ such that $\lim_{J'\to J}\tilde m(J') = m$ and $\tilde m(J')$ is a local minimum of $\Phi_{J'}$ for all $J' \in [J-\epsilon, J+\epsilon]$.

In simple terms, the Uniformity Property states that every global minimum can be extended into a one-parameter family of local minima. Based on the Uniformity Property, we can state a lemma concerning the limit of $D_J(\vartheta)$ as $\vartheta \downarrow 0$:

Lemma 4.1. Suppose that $\Phi_J$ satisfies the above Uniformity Property. Then for all $J_0 > 0$,

$$\lim_{\vartheta\downarrow 0}\ \sup_{0\le J\le J_0} D_J(\vartheta) = 0. \tag{4.2}$$

Proof. This is essentially an undergraduate exercise in compactness. Indeed, if the above fails, then for some $\epsilon > 0$ we could produce sequences $\vartheta_k \downarrow 0$ and $J_k \in [0, J_0]$ such that

$$D_{J_k}(\vartheta_k) \ge 6\epsilon. \tag{4.3}$$

This, in turn, implies the existence of $m_k \in \operatorname{Conv}(\Omega)$ such that

$$\operatorname{dist}\bigl(m_k, M_{\mathrm{MF}}(J_k)\bigr) \ge 3\epsilon \quad\text{while}\quad \Phi_{J_k}(m_k) < F_{\mathrm{MF}}(J_k) + \vartheta_k. \tag{4.4}$$

Let us use $J$ and $m$ to denote the (subsequential) limits of the above sequences. Using the continuity of $\Phi_J(m)$, from the bound to the right of the “while” we would have $\Phi_J(m) = F_{\mathrm{MF}}(J)$, and $m$ is thus a global minimum of $\Phi_J$. By our hypothesis, for each $k$ sufficiently large, there is a local minimum $\tilde m(J_k)$ of $\Phi_{J_k}$ with $\tilde m(J_k)$ converging to $m$ as $k \to \infty$. Since $m_k$ is also converging to $m$, the sequences $m_k$ and $\tilde m(J_k)$ will eventually be arbitrarily close. But that contradicts the bound to the left of the “while”.

Phase Transitions and Mean-Field Theory


4.1.2. Monotonicity of mean-field magnetization. For spin systems with an internal symmetry (which, arguably, receive an inordinate share of attention), the magnetization usually serves as an order parameter. In the context of mean-field theory, what would typically be observed is an interval $[0, J_{\mathrm{MF}}]$, where $m = 0$ is the global minimizer of $\Phi_J$, while for $J > J_{\mathrm{MF}}$, the function $\Phi_J$ is minimized by a non-zero $m$. This is the case for all three models under consideration. (It turns out that whenever $\langle S\rangle_0 = 0$, the unique global minimum of $\Phi_J$ for $J$ sufficiently small is $m = 0$.) In order to prove the existence of a symmetry-breaking transition, we need to prove that the models under consideration have a unique point where the local minimum $m = 0$ ceases to be a global minimum. This amounts to showing that, once the minimizer of $\Phi_J$ has become different from zero, it will never jump back to $m = 0$. In the mean-field theory with interaction (1.1), this can be proved using the monotonicity of the energy density; an analogous argument can be used to achieve the same goal for the corresponding systems on $\mathbb{Z}^d$.

Lemma 4.2. Let $J_1 < J_2$ and let $m_1$ be a global minimizer of $\Phi_{J_1}$ and $m_2$ a global minimizer of $\Phi_{J_2}$. Then $|m_1| \le |m_2|$. Moreover, if $J \mapsto m(J)$ is a differentiable trajectory of local minima, then

$$\frac{d}{dJ}\,\Phi_J\bigl(m(J)\bigr) = -\frac12\,\bigl|m(J)\bigr|^2. \tag{4.5}$$

Proof. The identity (4.5) is a simple consequence of the fact that, if $m$ is a local minimum of $\Phi_J$, then $\nabla\Phi_J(m) = 0$. To prove the first part of the claim, let $J, J' \ge 0$ and let $m$ be a minimizer of $\Phi_J$. Let $F_{\mathrm{MF}}(J)$ be the mean-field free energy. First we claim that

$$F_{\mathrm{MF}}(J) - F_{\mathrm{MF}}(J') \ge -\frac{J - J'}{2}\,|m|^2. \tag{4.6}$$

Indeed, since $F_{\mathrm{MF}}(J) = \Phi_J(m)$, we have from the definition of $\Phi_J$ that

$$F_{\mathrm{MF}}(J) = -\frac{J - J'}{2}\,|m|^2 + \Phi_{J'}(m). \tag{4.7}$$

Then the above follows using that $\Phi_{J'}(m) \ge F_{\mathrm{MF}}(J')$. Let $J_1 < J_2$ and $m_1$ and $m_2$ be as stated. Then (4.6) for the choice $J = J_2$, $J' = J_1$ and $m = m_2$ gives

$$\frac{F_{\mathrm{MF}}(J_2) - F_{\mathrm{MF}}(J_1)}{J_2 - J_1} \ge -\frac12\,|m_2|^2, \tag{4.8}$$

while (4.6) for the choice $J = J_1$, $J' = J_2$ and $m = m_1$ gives (after dividing by $J_1 - J_2 < 0$, which reverses the inequality)

$$\frac{F_{\mathrm{MF}}(J_1) - F_{\mathrm{MF}}(J_2)}{J_1 - J_2} \le -\frac12\,|m_1|^2. \tag{4.9}$$

Combining these two bounds, we have $|m_1| \le |m_2|$ as stated.
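As a numerical illustration of Lemma 4.2 (not part of the original argument), the following Python sketch minimizes a one-component free energy $\phi_J(m) = -\tfrac12 J m^2 - s(m)$ on a grid and checks that the minimizer is non-decreasing in $J$ and that the slope of the minimal free energy is close to $-\tfrac12 |m|^2$, as in (4.5). The choice of $s$ (the entropy of a $\pm1$ spin with uniform a priori measure), the grid size, and all function names are assumptions made for this example only.

```python
import math

def entropy(m):
    """Entropy s(m) of a +/-1 spin with uniform a priori measure (assumed toy model)."""
    if abs(m) >= 1.0:
        return -math.log(2.0)
    return -0.5 * ((1 + m) * math.log1p(m) + (1 - m) * math.log1p(-m))

def phi(J, m):
    """One-component mean-field free energy phi_J(m) = -J m^2 / 2 - s(m)."""
    return -0.5 * J * m * m - entropy(m)

def global_minimizer(J, n=20001):
    """Brute-force minimizer of phi_J over a uniform grid on [0, 1]."""
    grid = (i / (n - 1) for i in range(n))
    return min(grid, key=lambda m: phi(J, m))

def free_energy(J):
    """Minimal value of phi_J on the grid (plays the role of F_MF(J))."""
    return phi(J, global_minimizer(J))

# Minimizers for an increasing sequence of couplings.
couplings = [0.5 + 0.1 * k for k in range(21)]
minimizers = [global_minimizer(J) for J in couplings]

# Central finite difference for dF/dJ at J = 2, to compare with -|m|^2 / 2.
J, dJ = 2.0, 1e-3
slope = (free_energy(J + dJ) - free_energy(J - dJ)) / (2 * dJ)
m_at_J = global_minimizer(J)
```

In this toy model the minimizer stays at zero for small couplings and then grows with $J$; it does not return to zero, which is the behavior the lemma guarantees in general.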


4.1.3. One-component mean-field problems. Often enough, the presence of symmetry brings along a convenient property that the multicomponent mean-field equation (1.7) can be reduced to a one-component problem. Since this holds for all cases under consideration and we certainly intend to use this fact, let us spend a few minutes formalizing the situation. Suppose that there is a non-zero vector $\omega \in E_\Omega$ such that $\nabla G(h\omega)$ is collinear with $\omega$ (and not identically zero) for all $h$. As it turns out, then also $\nabla S(m\omega)$ is collinear with $\omega$, provided $m\omega \in \operatorname{Conv}(\Omega)$. Under these conditions, let us restrict both $h$ and $m$ to scalar multiples of $\omega$ and introduce the functions

$$g(h) = |\omega|^{-2}\, G(h\omega) \quad\text{and}\quad s(m) = |\omega|^{-2}\, S(m\omega). \tag{4.10}$$

The normalization by $|\omega|^{-2}$ ensures that $s(m)$ is given by the Legendre transform of $g(h)$ via the formula (1.4). Moreover, the mean-field free-energy function $\Phi_J(m\omega)$ equals the $|\omega|^2$-multiple of the function

$$\phi_J(m) = -\frac12\, J m^2 - s(m). \tag{4.11}$$

The mean-field equation (1.7) in turn reads

$$m = g'(Jm). \tag{4.12}$$

In this one-dimensional setting, we can easily decide whether a solution to (4.12) is a local minimum of $\phi_J$ just by looking at the stability of the solutions under iterations of (4.12):

Lemma 4.3. Let $m$ be a solution to (4.12) and suppose $\phi_J$ is twice continuously differentiable in a neighborhood of $m$. If

$$J g''(Jm) < 1, \tag{4.13}$$

then $m$ is a local minimum of $\phi_J$.

Informally, only “dynamically stable” solutions to the (on-axis) mean-field equation can be local minima of $\phi_J$. We remark that the term “dynamically stable” stems from the attempt to find solutions to (4.12) by running the iterative scheme $m_{k+1} = g'(J m_k)$.

Proof. Let $h$ and $m$ be such that $g'(h) = m$, which is equivalent to $h = -s'(m)$. An easy calculation then shows that $g''(h) = -(s''(m))^{-1}$. Suppose now that $m$ is a solution to (4.12) such that (4.13) holds. Then $h = Jm$ and from (4.13) we have

$$s''(m) = -\bigl(g''(Jm)\bigr)^{-1} < -J. \tag{4.14}$$

But that implies

$$\phi_J''(m) = -J - s''(m) > -J + J = 0, \tag{4.15}$$

and, using the second derivative test, we conclude that $m$ is a local minimum of $\phi_J$.
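The iterative scheme $m_{k+1} = g'(J m_k)$ mentioned above is easy to try out. The sketch below is an illustration only: it assumes the toy choice $g(h) = \log\cosh h$ (a $\pm1$ spin, so $g' = \tanh$ and $g''(h) = 1/\cosh^2 h$), which is not one of the three models of this paper, and all function names are hypothetical. It runs the iteration, evaluates the stability criterion (4.13) at the limit, and compares with the sign of a finite-difference second derivative of $\phi_J$.

```python
import math

def g_prime(h):
    """g'(h) for the assumed toy model g(h) = log cosh h."""
    return math.tanh(h)

def g_second(h):
    """g''(h) = 1 / cosh(h)^2 for the same toy model."""
    return 1.0 / math.cosh(h) ** 2

def phi(J, m):
    """phi_J(m) = -J m^2 / 2 - s(m), with s the Legendre transform of log cosh."""
    s = -0.5 * ((1 + m) * math.log1p(m) + (1 - m) * math.log1p(-m))
    return -0.5 * J * m * m - s

def iterate(J, m0, steps=200):
    """Run the scheme m_{k+1} = g'(J m_k) starting from m0."""
    m = m0
    for _ in range(steps):
        m = g_prime(J * m)
    return m

def second_difference(J, m, eps=1e-4):
    """Finite-difference approximation of phi_J''(m)."""
    return (phi(J, m + eps) - 2 * phi(J, m) + phi(J, m - eps)) / eps ** 2

J = 1.5
m_star = iterate(J, 0.5)                      # limit of the iteration
stable_at_limit = J * g_second(J * m_star) < 1
stable_at_zero = J * g_second(0.0) < 1        # m = 0 also solves m = g'(J m)
```

For this coupling the iteration settles on a non-zero solution that satisfies (4.13) and has positive curvature, whereas the solution $m = 0$ violates (4.13) and is a local maximum of $\phi_J$. The lemma itself only asserts the first of these two implications.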

With Lemmas 4.1, 4.2 and 4.3 established, our account of the general properties is concluded and we can start discussing particular models. What follows in the next three subsections are the three respective models laid out in order of increasing difficulty. Our repeated – and not particularly elegant – strategy will be to pound at the various


models using internal symmetry as the mallet. The upshot is inevitably that at most one component becomes dominant while all other components act, among themselves, like a system at high temperature. Thus all subdominant components are equivalent and the full problem has been reduced to an effective scalar model. In short, there are some parallels between the various treatments. However, somewhat to our disappointment, we have not been able to find a unified derivation covering “all models of this sort.”

4.2. Potts model. In order to prove Theorem 2.1, we need to establish (rigorously) a few detailed properties of the mean-field free-energy function (2.7). In view of (2.6) we will interchangeably use the notations $m$ and $(x_1, \dots, x_q)$ to denote the same value of the magnetization.

Lemma 4.4. Consider the $q$-state Potts model with $q \ge 3$. Let $\Phi_J$ be the mean-field free-energy function as defined in (2.7). If $m \in \operatorname{Conv}(\Omega)$ is a local minimum of $\Phi_J$ then the corresponding $(x_1, \dots, x_q)$ is a permutation of a probability vector $(x_1, \dots, x_q)$ such that

$$x_1 \ge x_2 = \dots = x_q. \tag{4.16}$$

Moreover, when $x_1 > x_2$, we also have

$$J x_1 > 1 > J x_2. \tag{4.17}$$

A complete proof of the claims in Lemma 4.4 was, to the best of our knowledge, first provided in [41]. (Strictly speaking, in [41] it was only shown that the global minima of $\Phi_J$ take the above form; however, the proof in [41] can be adapted to also accommodate local minima.) We will present a nearly identical proof but with a different interpretation of the various steps. The advantage of our reinterpretation is that it is easily applied to the other models of interest in this paper.

Proof of Lemma 4.4. If $m$ corresponds to the vector $(x_1, \dots, x_q)$, we let $\Phi^{(q)}_J(x_1, \dots, x_q)$ be the quantity $\Phi_J(m)$. Suppose that $(x_1, \dots, x_q)$ is a local minimum. It is easy to verify that $(x_1, \dots, x_q)$ cannot lie on the boundary of $\operatorname{Conv}(\Omega)$, so $x_k > 0$ for all $k = 1, \dots, q$. Pick any two coordinates (for simplicity we assume that our choice is $x_1$ and $x_2$) and let $y = 1 - (x_3 + \dots + x_q)$, $z_1 = x_1/y$ and $z_2 = x_2/y$. (Note that $y = x_1 + x_2$ and, in particular, $y > 0$.) Then we have

$$\Phi^{(q)}_J(x_1, \dots, x_q) = -\frac12\, J y^2 (z_1^2 + z_2^2) + y\,(z_1 \log z_1 + z_2 \log z_2) + R^{(q)}_J(x_3, \dots, x_q), \tag{4.18}$$

where $R^{(q)}_J(x_3, \dots, x_q)$ is independent of $z_1$ and $z_2$. Examining the form of the free energy, we find that the first two terms are proportional to the mean-field free-energy function of the Ising ($q = 2$) system with reduced coupling $Jy$:

$$\Phi^{(q)}_J(x_1, \dots, x_q) = y\, \Phi^{(2)}_{Jy}(z_1, z_2) + R^{(q)}_J(x_3, \dots, x_q). \tag{4.19}$$

Since the only $z$-dependence is in the first term, the pair $(z_1, z_2)$ must be a local minimum of $\Phi^{(2)}_{Jy}$ regardless of what $x_3, \dots, x_q$ look like. But this reduces the problem to the Ising model, about which much is known and yet more can easily be derived. The properties of $\Phi^{(2)}_J(z_1, z_2)$ we will need are:


(i) $J_c = 2$ is the critical coupling. For $J \le J_c$, the free-energy function $\Phi^{(2)}_J(z_1, z_2)$ is lowest when $z_1 = z_2$, while for $J > J_c$, the free-energy function $\Phi^{(2)}_J(z_1, z_2)$ is lowest when $\rho = |z_1 - z_2|$ is the maximal (non-negative) solution to $\rho = \tanh(\tfrac12 J\rho)$.
(ii) Whenever $J > J_c$, the maximal solution to $\rho = \tanh(\tfrac12 J\rho)$ satisfies $J(1 - \rho^2) < 2$, which implies that either $J z_1 > 1$ and $J z_2 < 1$ or vice versa.
(iii) For all $J$ and $z_1 \ge z_2$, the mean-field free-energy function $\Phi^{(2)}_J(z_1, z_2)$ monotonically decreases as $\rho = z_1 - z_2$ moves towards the non-negative global minimum.

All three claims are straightforward to derive, except perhaps (ii), which is established by noting that, whenever $\rho > 0$ satisfies the (Ising) mean-field equation, we have

$$\frac12\, J (1 - \rho^2) = \frac{\tfrac12 J}{\cosh(\tfrac12 J\rho)^2} = \frac{J\rho}{\sinh(J\rho)} < 1. \tag{4.20}$$

Hence, if $J > J_c$ and $z_1 > z_2$, then $J z_2 = \tfrac12 J(1 - \rho) < \tfrac12 J(1 - \rho^2) < 1$ and thus $J z_1 > 1$ because $J(z_1 + z_2) = J > J_c = 2$.

Based on (i–iii), we can draw the following conclusions for any pair of distinct indices $x_j$ and $x_k$: If $J(x_j + x_k) \le 2$, then $x_j = x_k$, because the $(k, j)$th Ising pair is subcritical, while if $J(x_j + x_k) > 2$ then, using our observation (ii), either $J x_k > 1$ and $J x_j < 1$ or vice versa. But then we cannot have $J x_k > 1$ for more than one index $k$, because if $J x_k > 1$ and $J x_j > 1$, we would have $J(x_j + x_k) > 2$ and the $(k, j)$th Ising pair would not be at a local minimum. All the other indices must then be equal because the associated two-component Ising systems are subcritical. Consequently, only one index from $(x_1, \dots, x_q)$ can take a larger value; the other indices are equal.

Proposition 4.5. Consider the $q$-state Potts model with $q \ge 3$. Let $\Phi_J$ be the mean-field free-energy function as defined in (2.7). Then there exist $J_1$ and $J_2 = q$ with $J_1 < J_2$ such that
(1) $\boldsymbol m = 0$ is a local minimum of $\Phi_J$ provided $J < J_2$.
(2) $\boldsymbol m = x_1 \hat v_1 + \dots + x_q \hat v_q$ with $x_1 > x_2 = \dots = x_q$ is a local minimum of $\Phi_J$ provided that $J > J_1$ and $x_1 = \tfrac1q + m$, where $m$ is the maximal positive solution to Eq. (2.8).
(3) For all $J \ge 0$, there are no local minima except as specified in (1) and (2).
Moreover, if $J_{\mathrm{MF}}$ is as in (2.9), then the unique global minimum of $\Phi_J$ is as in (1) for $J < J_{\mathrm{MF}}$ while for $J > J_{\mathrm{MF}}$ the function $\Phi_J$ has $q$ distinct global minimizers as described in (2).

Proof of Proposition 4.5. Again, most of the above was proved in [41] but without the leeway for local minima. (Of course, the formulas (2.8) and (2.9) date to an earlier epoch, see e.g. [54].) What is neither easily derivable nor already proved in [41] amounts to showing that if $m$ is a “dynamically stable” solution to (2.8), the corresponding $\boldsymbol m = x_1 \hat v_1 + \dots + x_q \hat v_q$ as described in (2) is a local minimum for the full $\Phi_J(\boldsymbol m)$. The rest of this proof is spent proving the latter claim. We first observe that on the set

$$U(x) = \bigl\{ \boldsymbol m = (x, x_2, \dots, x_q) : J x_k \le 1,\ k = 2, \dots, q \bigr\} \tag{4.21}$$

the unique (strict) global minimum of $\Phi_J$ occurs at

$$\boldsymbol m(x) = \Bigl( x, \frac{1-x}{q-1}, \dots, \frac{1-x}{q-1} \Bigr). \tag{4.22}$$


Indeed, otherwise we could further lower the value of $\Phi_J$ by bringing one of the $(j, k)$th Ising pairs closer to its equilibrium, using the properties (ii–iii) above. Now, suppose that $m$ satisfying (2.8) is “dynamically stable” in the sense of Lemma 4.3. By (4.17) we have that the corresponding $x_1 = \tfrac1q + m$ satisfies $J x_1 > 1$ while the common value of $x_k$ for $k = 2, \dots, q$ is such that $J x_k < 1$. Suppose that the corresponding $\boldsymbol m$ is not a local minimum of the full $\Phi_J$. Then there exists a sequence $(\boldsymbol m_k)$ tending to $\boldsymbol m$ such that $\Phi_J(\boldsymbol m_k) < \Phi_J(\boldsymbol m)$. But then there is also a sequence $\boldsymbol m_k'$ such that $\Phi_J(\boldsymbol m_k') < \Phi_J(\boldsymbol m)$, where each $\boldsymbol m_k'$ now takes the form (4.22). This contradicts that the restriction of $\Phi_J$ to the “diagonal,” namely the function $\phi_J(m)$, has a local minimum at $m$.

Now we are ready to prove our main result about the $q$-state Potts model.

Proof of Theorem 2.1. By well-known facts from the FK representation of the Potts model, the quantities $e_\star(J)$ and $m_\star(J)$ arise from the pair $[e_{\mathrm w}, m_{\mathrm w}]$ corresponding to the state with constant boundary conditions (the wired state). Therefore, $[e_{\mathrm w}, m_{\mathrm w}]$ is an extreme point of the convex set $K_\star(J)$ and $m_{\mathrm w} \in M_\star(J)$ for all $J$. In particular, the bound (1.12) for $m_{\mathrm w}$ can be used without apology. Let $J\delta_d$ be the error bound in (1.12), so that $\delta_d$ is the part which does not depend on $J$. Explicitly, we have $\delta_d = \tfrac{1}{2q}(q-1)^2 I_d$, because $\kappa = (q-1)/q$ and $\dim E_\Omega = q - 1$. Since $I_d \to 0$ as $d \to \infty$, we have $\delta_d \to 0$ as $d \to \infty$. Let us define

$$\epsilon_1 = \epsilon_1(d, J) = \sup_{0 \le J' \le J} D_{J'}(J\delta_d), \tag{4.23}$$

where $D_J$ is as in (4.1). It is easy to check that the Uniformity Property holds. Lemma 4.1 then guarantees that every (extremal) physical magnetization $m_\star \in M_\star(J)$ has to lie within $\epsilon_1$ of a local minimum of $\Phi_J$. Since the asymmetric minima exist only for $J > J_1 > 0$ while $m = 0$ is a local minimum only for $J < J_2 = q$, we have $m_\star(J) \le \epsilon_1$ for $J \le J_1$, while $|m_\star(J) - m_{\mathrm{MF}}(J)| \le \epsilon_1$ for $J > J_2$. But from the FKG properties of the random cluster representation we know that $J \mapsto m_\star(J)$ is non-decreasing so there must be a point, $J_{\mathrm t} \in (J_1, J_2]$, such that (2.10–2.11) hold.

It remains to show that $|J_{\mathrm t} - J_{\mathrm{MF}}|$ tends to zero as $d \to \infty$. For $J \in (J_1, J_2)$, let $\varphi_S(J)$, resp., $\varphi_A(J)$ denote the value of $\Phi_J$ at the symmetric, resp., asymmetric local minima. The magnetization corresponding to the asymmetric local minimum exceeds some $\kappa > 0$ throughout $(J_1, J_2)$. Integrating (4.5) with respect to $J$ and using that $\varphi_S(J_{\mathrm{MF}}) = \varphi_A(J_{\mathrm{MF}})$ then gives us the bound

$$\bigl|\varphi_S(J) - \varphi_A(J)\bigr| \ge \frac12\, \kappa^2\, |J - J_{\mathrm{MF}}|. \tag{4.24}$$

However, in the $\epsilon_1$-neighborhood $U_S(\epsilon_1)$ of the symmetric minimum, we will have

$$\bigl|\Phi_J(m) - \varphi_S(J)\bigr| \le \epsilon_1 K, \tag{4.25}$$

where $K$ is a uniform bound on the derivative of $\Phi_J(m)$ for $m \in U_S(\epsilon_1)$ and $J \in (J_1, J_2)$. Since the asymmetric minima are well separated from the boundary of $\operatorname{Conv}(\Omega)$ for $J \in (J_1, J_2)$, a similar bound holds for the $\epsilon_1$-neighborhood of the asymmetric minimum. Comparing (4.24–4.25) and (1.12), we find that if

$$\frac12\, \kappa^2\, |J - J_{\mathrm{MF}}| - 2\epsilon_1 K > J\delta_d, \tag{4.26}$$

no value of magnetization in the $\epsilon_1$-neighborhood of the local minima with a larger value of $\Phi_J$ is allowed. In particular, $|J_{\mathrm t} - J_{\mathrm{MF}}| \le \epsilon_2$, where $\epsilon_2 = \epsilon_2(d)$ tends to zero as $d \to \infty$.
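To make the competition between the symmetric and asymmetric minima concrete, here is a hedged numerical sketch. It assumes that the Potts mean-field free energy has the form $\Phi_J(x) = -\tfrac12 J \sum_k x_k^2 + \sum_k x_k \log x_k$ suggested by (4.18), restricts attention to the “diagonal” vectors of the form (4.22), and locates by bisection the coupling at which the two minima exchange roles. The result is compared with the classical closed form $2\,\tfrac{q-1}{q-2}\log(q-1)$, which we take (as an assumption) to be the value referred to as (2.9). Function names and grid sizes are illustrative.

```python
import math

def potts_phi(J, x):
    """Assumed mean-field free energy: -J/2 * sum x_k^2 + sum x_k log x_k."""
    return -0.5 * J * sum(v * v for v in x) + sum(v * math.log(v) for v in x)

def diagonal(x1, q):
    """Probability vector (x1, (1-x1)/(q-1), ..., (1-x1)/(q-1)), cf. (4.22)."""
    rest = (1.0 - x1) / (q - 1)
    return [x1] + [rest] * (q - 1)

def gap(J, q, n=2000):
    """Lowest diagonal value with x1 > 1/q minus the value at the symmetric point."""
    symmetric = potts_phi(J, [1.0 / q] * q)
    lo = 1.0 / q
    asymmetric = min(
        potts_phi(J, diagonal(lo + (1.0 - lo) * i / n, q)) for i in range(1, n)
    )
    return asymmetric - symmetric

def transition_coupling(q, iters=30):
    """Bisection for the coupling where the asymmetric minimum starts to win."""
    lo, hi = 1.0, float(q)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid, q) < 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def closed_form(q):
    """Classical mean-field Potts transition coupling in this normalization."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)
```

Calling `transition_coupling(3)` and `closed_form(3)` should give values that agree to several digits; note that the numerical value lies strictly below $J_2 = q$, in line with the window $(J_1, J_2]$ used in the proof.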


4.3. Cubic model. Our first goal is to prove Proposition 2.2. We will begin by showing that the local minima of $\Phi_J$ and $K^{(r)}_J$ are in one-to-one correspondence. Let us introduce the notation

$$X = \Bigl\{ (\bar y, \bar\mu) : |\mu_j| \le 1,\ y_j \ge 0,\ \sum_{j=1}^{r} y_j = 1 \Bigr\} \tag{4.27}$$

and let $X(m)$ denote the subspace of $X$ where $m = y_1\mu_1 + \dots + y_r\mu_r$.

Lemma 4.6. Let $m \in \operatorname{Conv}(\Omega)$ be a local minimum of $\Phi_J$. Then there exists a $(\bar y, \bar\mu) \in X(m)$ which is a local minimum of $K^{(r)}_J$ (as defined in (2.15)).

Proof. Let $m$ be a local minimum of $\Phi_J$. Since $X(m)$ is compact and $K^{(r)}_J$ is continuous on $X$, the infimum

$$\Phi_J(m) = \inf_{(\bar y, \bar\mu) \in X(m)} K^{(r)}_J(\bar y, \bar\mu) \tag{4.28}$$

is attained at some $(\bar y, \bar\mu) \in X(m)$. We claim that this $(\bar y, \bar\mu)$ is a local minimum of $K^{(r)}_J$. Indeed, if the opposite is true, there is a sequence $(\bar y_k, \bar\mu_k) \in X$ converging to $(\bar y, \bar\mu)$ such that

$$K^{(r)}_J(\bar y_k, \bar\mu_k) < K^{(r)}_J(\bar y, \bar\mu) = \Phi_J(m). \tag{4.29}$$

Now, $(\bar y, \bar\mu)$ was an absolute minimum of $K^{(r)}_J$ on $X(m)$, so $(\bar y_k, \bar\mu_k) \notin X(m)$ and the magnetization $m_k$ corresponding to $(\bar y_k, \bar\mu_k)$ is different from $m$ for all $k$. Noting that

$$\Phi_J(m_k) \le K^{(r)}_J(\bar y_k, \bar\mu_k) \tag{4.30}$$

and combining (4.29–4.30), we thus have $\Phi_J(m_k) < \Phi_J(m)$ for all $k$. But $m_k$ tends to $m$ in $\operatorname{Conv}(\Omega)$, which contradicts the fact that $m$ is a local minimum of $\Phi_J$.

Lemma 4.6 allows us to analyze the local minima in a bigger, simpler space:

Lemma 4.7. Let $K^{(r)}_J(\bar y, \bar\mu)$ be the quantity in (2.15). Then each local minimum of $K^{(r)}_J(\bar y, \bar\mu)$ is an index-permutation of a state $(\bar y, \bar\mu)$ with $y_1 \ge y_2 = \dots = y_r$ and $\mu_2 = \dots = \mu_r = 0$. Moreover, if $y_1 > y_2$, then $\mu_1 \ne 0$.

Proof. Let $(\bar y, \bar\mu)$ be a local minimum of $K^{(r)}_J$ such that $y_1 \ge y_2 \ge \dots \ge y_r$ and fix a $k$ between $1$ and $r$. We abbreviate $y = y_k + y_{k+1}$ and introduce the variables $z_1 = y_k/y$, $z_2 = y_{k+1}/y$, $\nu_1 = \mu_k$ and $\nu_2 = \mu_{k+1}$. Then

$$K^{(r)}_J(\bar y, \bar\mu) = y\, K^{(2)}_{Jy}(\bar z, \bar\nu) + R, \tag{4.31}$$

where $K^{(2)}_{Jy}(\bar z, \bar\nu)$ is the mean-field free energy of an $r = 2$ cubic model with coupling constant $Jy$, and $R$ is a quantity independent of $(\bar z, \bar\nu)$. As was mentioned previously, the $r = 2$ cubic model is equivalent to two decoupled Ising models. Thus,

$$K^{(2)}_{Jy}(\bar z, \bar\nu) = \Theta_{Jy}(\rho_1) + \Theta_{Jy}(\rho_2), \tag{4.32}$$

where $\rho_1$ and $\rho_2$ are related to $z_1$, $z_2$, $\nu_1$ and $\nu_2$ via the equations

$$z_1 = \tfrac12(1 + \rho_1\rho_2), \qquad z_1\nu_1 = \tfrac12(\rho_1 + \rho_2), \qquad z_2 = \tfrac12(1 - \rho_1\rho_2), \qquad z_2\nu_2 = \tfrac12(\rho_1 - \rho_2). \tag{4.33}$$

Now, the local minima of $\Theta_J(\rho)$ occur at $\rho = \pm\rho(J)$, where $\rho(J)$ is the largest non-negative solution to the equation $\rho = \tanh(\tfrac12 J\rho)$. Moreover, by the properties (i–iii) from the proof of Lemma 4.4 we know that $\rho(J) = 0$ for $J \le 2$ while $\tfrac12 J(1 - \rho(J)^2) < 1$ once $J > 2$. From these observations we learn that if $y_k = y_{k+1}$, then $Jy \le 2$ and $\mu_k = \mu_{k+1} = 0$. On the other hand, if $y_k > y_{k+1}$, then $Jy > 2$, $y_k = \tfrac12 y(1 + \rho(Jy)^2)$ and $y_{k+1} = \tfrac12 y(1 - \rho(Jy)^2)$ so, in particular, $J y_k > 1 > J y_{k+1}$. However, that forces $k = 1$, because otherwise we would also have $J y_{k-1} > 1$ and $J(y_{k-1} + y_k) > 2$, implying that $(\bar y, \bar\mu)$ is not a local minimum of $K^{(r)}_J$ in the $(k-1, k)$th sector. Hence, $y_2 = \dots = y_r$ and $\mu_2 = \dots = \mu_r = 0$, while if $y_1 > y_2$, then we have $\mu_1 = \pm\rho(Jy)/z_1 \ne 0$.

The proof of Lemma 4.7 gives us the following useful observation:

Corollary 4.8. Let $m = (m_1, m_2, \dots, m_r)$ be contained in $\operatorname{Conv}(\Omega)$ and suppose that $m_1, m_2 \ne 0$. Then one of the four vectors

$$(m_1 \pm m_2, 0, m_3, \dots, m_r), \qquad (0, m_2 \pm m_1, m_3, \dots, m_r) \tag{4.34}$$

corresponds to a magnetization $m' \in \operatorname{Conv}(\Omega)$ with $\Phi_J(m') < \Phi_J(m)$.

Proof. Since $m$ is in the interior of $\operatorname{Conv}(\Omega)$, there exists $(\bar y, \bar\mu)$ where the infimum (4.28) is achieved. Let $z_1$, $z_2$, $\nu_1$ and $\nu_2$ be related to $y_1$, $y_2$, $\mu_1$ and $\mu_2$ as in (4.31–4.33). Now by (4.32) the free energy of the corresponding sector of $(\bar y, \bar\mu)$ equals the sum of the free energies of two decoupled Ising models with biases $\rho_1$ and $\rho_2$. Without loss of generality, suppose that $\rho_1 > \rho_2 \ge 0$. Recalling the property (iii) from the proof of Lemma 4.4, $\rho \mapsto \Theta_J(\rho)$ decreases when $\rho \ge 0$ gets closer to the non-negative local minimum. Thus, if $\rho_1$ is nearer to the local minimum of $\Theta_{Jy}$ than $\rho_2$, by increasing $\rho_2$ we lower the free energy by a non-trivial amount. Similarly, if $\rho_2$ is the one that is closer, we decrease $\rho_1$. By inspection of (4.33), the former operation produces a new quadruple $z_1'$, $z_2'$, $\nu_1'$ and $\nu_2'$, with $\nu_2' = 0$ and $z_1'\nu_1' = \rho_1$. But that corresponds to the magnetization vector $(m_1', m_2', m_3, \dots, m_r)$, where

$$m_1' = \rho_1 y = m_1 + m_2 \quad\text{and}\quad m_2' = 0, \tag{4.35}$$

which is what we stated above. The other situations are handled analogously.

Now we are finally ready to establish the claim about local/global minima of $\Phi_J$:

Proof of Proposition 2.2. By Lemma 4.6, every local minimum of $\Phi_J$ corresponds to a local minimum of $K^{(r)}_J$. Thus, using Lemma 4.7 we know that all local minima $m$ of $\Phi_J$ will have at most one non-zero component. Writing $\omega = (1, 0, \dots, 0)$, $\boldsymbol h = h\omega$ and $\boldsymbol m = m\omega$, we can use the formalism from Sect. 4.1. In particular, the on-axis moment generating function $g(h)$ is given by

$$g(h) = -\log(2r) + \log(r - 1 + \cosh h). \tag{4.36}$$

Differentiating this expression, (4.12) shows that every local minimum $m$ has to satisfy Eq. (2.16). Now, for $r > 2$, a little work shows that $h \mapsto g'(h)$ is convex on $h \ge 0$ for

$$(r-1)^2 - (r-1)\cosh h - 2 > 0 \tag{4.37}$$

and concave otherwise. In particular, for $r > 3$, Eq. (2.16) has either one non-negative solution $m = 0$ or three non-negative solutions, $m = 0$, $m = m_-(J)$ and $m = m_+(J)$, where $0 \le m_-(J) \le m_+(J)$. However, $m_+(J)$ is “dynamically stable” and, using Lemma 4.3, $m_-(J)$ never corresponds to a local minimum.

To finish the proof we need to show that $\boldsymbol m = (m_+(J), 0, \dots, 0)$ is a local minimum of the full $\Phi_J$. If the contrary were true, we would have a sequence $\boldsymbol m_k$ tending to $\boldsymbol m$ such that $\Phi_J(\boldsymbol m_k) < \Phi_J(\boldsymbol m)$. Then an $(r-1)$-fold use of Corollary 4.8 combined with the symmetry of $\Phi_J$ implies the existence of a sequence $\boldsymbol m_k' = (m_k, 0, \dots, 0)$ tending to $\boldsymbol m$ and satisfying $\Phi_J(\boldsymbol m_k') \le \Phi_J(\boldsymbol m_k)$ for all $k$. But that contradicts that $m_+(J)$ is a local minimum of the on-axis mean-field free-energy function. So $\boldsymbol m$ was a local minimum of $\Phi_J$ after all. The existence of a unique mean-field transition point $J_{\mathrm{MF}}$ is a consequence of Lemma 4.2 and the fact that $m = 0$ ceases to be a local minimum for $J \ge r$.

Proof of Theorem 2.3. The proof is basically identical to that of Theorem 2.1, so we will be rather sketchy. First we note that $m_\star(J)$ is achieved at some extremal translation-invariant state whose magnetization $m_\star$ is an element of $M_\star(J)$. Let $\delta_d = \tfrac12 r I_d$ and define $\epsilon_1$ as in (4.23). Then $m_\star$ has to be within $\epsilon_1$ of a local minimum of $\Phi_J$. While this time we cannot proclaim that $J \mapsto m_\star(J)$ is non-decreasing, all the benefits of monotonicity can be achieved by using the monotonicity of the energy density $e_\star(J)$. Indeed, $J \mapsto e_\star(J)$ is non-decreasing and, by Corollary 1.2 and the Key Estimate, we have

$$\Bigl| e_\star(J) - \frac12\, m_\star(J)^2 \Bigr| \le \frac{J}{2}\, r I_d = J\delta_d. \tag{4.38}$$

But then $e_\star(J)$ must undergo a unique large jump at some $J_{\mathrm t}$ from values $e_\star(J) \le 2J\delta_d$ to values that differ from $\tfrac12 m_{\mathrm{MF}}(J)^2$ by less than $2J\delta_d$. So $m_\star(J)$ has to jump at $J = J_{\mathrm t}$ as well, in order to obey (4.38). The width of the “transition region” is controlled exactly as in the case of the Potts model.
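The one-versus-three-solutions picture for the cubic model can be checked numerically. The sketch below is illustrative only: it uses $g'(h) = \sinh h / (r - 1 + \cosh h)$, obtained by differentiating (4.36), and assumes that Eq. (2.16) is the on-axis equation $m = g'(Jm)$ from (4.12). The values $r = 4$, $J = 3.8$ and all function names are choices made for this example.

```python
import math

def g_prime(h, r):
    """Derivative of g(h) = -log(2r) + log(r - 1 + cosh h), cf. (4.36)."""
    return math.sinh(h) / (r - 1 + math.cosh(h))

def g_second(h, r):
    """Second derivative of the same g."""
    return ((r - 1) * math.cosh(h) + 1) / (r - 1 + math.cosh(h)) ** 2

def positive_solutions(J, r, n=2000):
    """Strictly positive roots of m = g'(J m) on (0, 1], by grid scan plus bisection."""
    f = lambda m: g_prime(J * m, r) - m
    roots = []
    for i in range(1, n):
        a, b = i / n, (i + 1) / n
        if f(a) * f(b) < 0:          # sign change brackets a root
            for _ in range(60):
                c = 0.5 * (a + b)
                if f(a) * f(c) <= 0:
                    b = c
                else:
                    a = c
            roots.append(0.5 * (a + b))
    return roots

r, J = 4, 3.8
roots = positive_solutions(J, r)     # expected: the pair (m_minus, m_plus)
stability = [J * g_second(J * m, r) for m in roots]
```

With these parameters the scan finds two positive solutions; the criterion (4.13) holds at the larger one and fails at the smaller one, while for a small coupling such as $J = 2$ no positive solution appears.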

4.4. Nematic model. The nematic models present us with the difficulty that an explicit formula for $\Phi_J(m)$ seems impossible to derive. However, the situation improves in the dual Legendre variables. Indeed, examining (1.4–1.6), it is seen that the stationary points of $\Phi_J(m)$ are in one-to-one correspondence with the stationary points of the (Gibbs) free-energy function

$$\Psi_J(h) = \frac{1}{2J}\, |h|^2 - G(h), \tag{4.39}$$

via the relation $h = Jm$. (In the case at hand, $h$ takes values in $E_\Omega$ which was defined as the space of all $N \times N$ traceless matrices.) Moreover, if $m = \nabla G(h)$, then we have

$$\Psi_J(h) - \Phi_J(m) = \frac{1}{2J}\, |h - Jm|^2 \tag{4.40}$$

so the values $\Psi_J(h)$ and $\Phi_J(m)$ at the corresponding stationary points are the same. Furthermore, some juggling with Legendre transforms shows that if $m$ is a local minimum of $\Phi_J$, then $h = Jm$ is a local minimum of $\Psi_J$. Similarly for local maxima and saddle points of $\Phi_J$.

Lemma 4.9. Each stationary point of $\Psi_J(h)$ on $E_\Omega$ is a traceless $N \times N$ matrix $h$ with eigenvalues that can be reordered to the form $h_1 \ge h_2 = \dots = h_N$.

Proof. The claim is trivial for $N = 2$ so let $N \ge 3$. Without loss of generality, we can restrict ourselves to diagonal, traceless matrices $h$. Let $h = \operatorname{diag}(h_1, \dots, h_N)$ be such that $\sum_\alpha h_\alpha = 0$ and let $v_\alpha$, with $\alpha = 1, \dots, N$, be the components of a unit vector in $\mathbb{R}^N$. Let $\langle - \rangle_0$ be the expectation with respect to the a priori measure $\mu$ on $\Omega$ and let $\langle - \rangle_h$ be the state on $\Omega$ tilted by $h$. Explicitly, we have

$$\langle f \rangle_h = e^{-G(h)} \int \mu(dv)\, f(v)\, \exp\Bigl\{ \sum_{\alpha=1}^{N} h_\alpha v_\alpha^2 \Bigr\} \tag{4.41}$$

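for any measurable function $f$ on the unit sphere in $\mathbb{R}^N$.

As in the case of the Potts and cubic models, the proof will be reduced to the two-component problem. Let $h$ be a stationary point of $\Psi_J$ and let $\alpha$ and $\beta$ be two distinct indices between $1$ and $N$. The relevant properties of $\langle - \rangle_h$ are then as follows:
(i) If $J \langle v_\alpha^4 + v_\beta^4 \rangle_h > 3$, then $h_\alpha \ne h_\beta$.
(ii) If $h_\alpha > h_\beta$, then $J \langle v_\alpha^4 \rangle_h > \tfrac32 > J \langle v_\beta^4 \rangle_h$.

The proof of these facts involves a non-trivial adventure with modified Bessel functions $I_n(x)$, where $n$ is any non-negative integer and $I_n(x) = \frac1\pi \int_0^\pi d\theta\, e^{x\cos\theta} \cos(n\theta)$. To keep the computations succinct, we introduce the polar coordinates, $v_\alpha = r\cos\theta$ and $v_\beta = r\sin\theta$, where $\theta \in [0, 2\pi)$ and $r \ge 0$. Let $\langle - \rangle_{\alpha\beta}$ denote the expectation with respect to the $r$-marginal of the state $\langle - \rangle_{h'}$ where $h' = \operatorname{diag}(h_1', \dots, h_N')$ is related to $h$ via $h_\alpha' = h_\beta' = \tfrac12(h_\alpha + h_\beta)$, while $h_\gamma' = h_\gamma$ for $\gamma \ne \alpha, \beta$. Explicitly, if $\bar f(r, \theta)$ corresponds to $f(v_\alpha, v_\beta)$ via the above change of coordinates, then

$$\langle f(v_\alpha, v_\beta) \rangle_h = \frac{\bigl\langle \int_0^{2\pi} d\theta\, e^{\Delta r^2 \cos(2\theta)}\, \bar f(r, \theta) \bigr\rangle_{\alpha\beta}}{\bigl\langle \int_0^{2\pi} d\theta\, e^{\Delta r^2 \cos(2\theta)} \bigr\rangle_{\alpha\beta}}, \tag{4.42}$$

where $\Delta = \tfrac12(h_\alpha - h_\beta)$.

We begin by deriving several identities involving modified Bessel functions. First, a straightforward calculation shows that

$$\langle v_\alpha^2 - v_\beta^2 \rangle_h = A_{\alpha\beta}(\Delta)\, \bigl\langle r^2 I_1(\Delta r^2) \bigr\rangle_{\alpha\beta}, \tag{4.43}$$

where $A_{\alpha\beta}(\Delta)^{-1} = \langle I_0(\Delta r^2) \rangle_{\alpha\beta}$. Similarly we get

$$\langle v_\alpha^2 v_\beta^2 \rangle_h = A_{\alpha\beta}(\Delta)\, \bigl\langle \tfrac18\, r^4 \bigl[ I_0(\Delta r^2) - I_2(\Delta r^2) \bigr] \bigr\rangle_{\alpha\beta}. \tag{4.44}$$

But $I_0(x) - I_2(x) = (2/x) I_1(x)$, whereby we have the identity

$$2(h_\alpha - h_\beta)\, \langle v_\alpha^2 v_\beta^2 \rangle_h = \langle v_\alpha^2 - v_\beta^2 \rangle_h. \tag{4.45}$$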
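The Bessel-function facts used in this proof, namely the recurrence $I_0(x) - I_2(x) = (2/x) I_1(x)$ and the ordering $I_1(x) > I_2(x) > 0$ for $x > 0$, can be spot-checked directly from the integral representation quoted above. The following Python sketch evaluates that integral with the composite trapezoid rule; the function name and the number of quadrature nodes are illustrative choices, and a numerical check at a few points is of course no substitute for the proof.

```python
import math

def bessel_i(n, x, steps=2000):
    """I_n(x) = (1/pi) * integral_0^pi exp(x cos t) cos(n t) dt via the trapezoid rule."""
    h = math.pi / steps
    # Endpoint contributions at t = 0 and t = pi, each weighted by 1/2.
    total = 0.5 * (math.exp(x) + math.exp(-x) * math.cos(n * math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.exp(x * math.cos(t)) * math.cos(n * t)
    return total * h / math.pi

def recurrence_defect(x):
    """I_0(x) - I_2(x) - (2/x) I_1(x); should vanish up to quadrature error."""
    return bessel_i(0, x) - bessel_i(2, x) - (2.0 / x) * bessel_i(1, x)
```

For example, `recurrence_defect(1.0)` should be zero to within rounding, since the integrand extends to a smooth periodic function and the trapezoid rule is then very accurate.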


A similar calculation using trigonometric formulas shows that

$$\langle v_\alpha^4 \rangle_h = A_{\alpha\beta}(\Delta)\, \bigl\langle r^4 \bigl[ \tfrac38 I_0(\Delta r^2) + \tfrac12 I_1(\Delta r^2) + \tfrac18 I_2(\Delta r^2) \bigr] \bigr\rangle_{\alpha\beta}, \tag{4.46}$$

$$\langle v_\beta^4 \rangle_h = A_{\alpha\beta}(\Delta)\, \bigl\langle r^4 \bigl[ \tfrac38 I_0(\Delta r^2) - \tfrac12 I_1(\Delta r^2) + \tfrac18 I_2(\Delta r^2) \bigr] \bigr\rangle_{\alpha\beta}. \tag{4.47}$$

In particular, since $I_0(0) = 1$ while $I_1(0) = I_2(0) = 0$, we have

$$h_\alpha = h_\beta \quad\Rightarrow\quad \langle v_\alpha^4 \rangle_h = \langle v_\beta^4 \rangle_h = 3 \langle v_\alpha^2 v_\beta^2 \rangle_h. \tag{4.48}$$

The identities (4.44–4.48) will now allow us to prove (i–ii). First we note that the fact that $h$ was a stationary point of $\Psi_J$ implies that $h_\gamma - h_{\gamma'} = J \langle v_\gamma^2 - v_{\gamma'}^2 \rangle_h$ for all $\gamma, \gamma' = 1, \dots, N$. Using this in (4.45), we have the following dichotomy:

$$\text{either}\quad h_\alpha = h_\beta \quad\text{or}\quad 2J \langle v_\alpha^2 v_\beta^2 \rangle_h = 1. \tag{4.49}$$

To establish (i), suppose that $J \langle v_\alpha^4 + v_\beta^4 \rangle_h > 3$ but $h_\alpha = h_\beta$. Then (4.48) gives us $2J \langle v_\alpha^2 v_\beta^2 \rangle_h > 1$, in contradiction with (4.49). Hence, (i) must hold. To prove (ii), assume that $h_\alpha > h_\beta$ and note that then $\Delta > 0$. Applying that $I_1(x) > 0$ and $I_2(x) > 0$ for $x > 0$ in (4.46), we easily show using (4.44) that $\langle v_\alpha^4 \rangle_h > 3 \langle v_\alpha^2 v_\beta^2 \rangle_h$. Similarly, the bound $I_1(x) > I_2(x)$ for $x > 0$, applied in (4.47), shows that $\langle v_\beta^4 \rangle_h < 3 \langle v_\alpha^2 v_\beta^2 \rangle_h$. From here (ii) follows by invoking (4.49).

Now we are ready to prove the desired claim. Let $h$ be a stationary point. First let us prove that there are no three components of $h$ such that $h_\alpha > h_\beta > h_\gamma$. Indeed, if that were the case, (i–ii) would lead to a contradiction, because $h_\alpha > h_\beta$ would require that $J \langle v_\beta^4 \rangle_h < 3/2$ while $h_\beta > h_\gamma$ would stipulate that $J \langle v_\beta^4 \rangle_h > 3/2$! Thus, any stationary point $h$ of $\Psi_J$ can only have two values for $\langle v_\alpha^4 \rangle_h$. However, if (say) both $\langle v_1^4 \rangle_h$ and $\langle v_2^4 \rangle_h$ take on the larger value (implying that $h_1 = h_2$), then $J \langle v_1^4 + v_2^4 \rangle_h > 3$ and $h$ cannot be a stationary point. From here the claim follows.

The symmetry of the problem at hand allows us to restrict ourselves to the on-axis formalism from Sect. 4.1. In particular, we let $\omega = \operatorname{diag}(1, -\tfrac{1}{N-1}, \dots, -\tfrac{1}{N-1})$, $\boldsymbol h = h\omega$ and $\boldsymbol\lambda = \lambda\omega$ and define the functions $g(h)$, $s(\lambda)$ and $\phi_J(\lambda)$ as in (4.10–4.11). Lemma 4.9 in turn guarantees that all local minimizers of $\Phi_J$ appear within the domain of $\phi_J$. What remains to be proved is the converse. This can be done using some of the items established above.

Lemma 4.10. Suppose that $\lambda$ is a stationary point of the scalar free energy $\phi_J$ which satisfies $J g''(J\lambda) < 1$. Then $\boldsymbol\lambda = \lambda\omega$, with $\omega = \operatorname{diag}(1, -\tfrac{1}{N-1}, \dots, -\tfrac{1}{N-1})$, is a local minimizer of $\Phi_J$.

Proof. To simplify the exposition, we will exploit the $O(N)$-symmetry of the problem: If $g \in O(N, \mathbb{R})$ is any $N \times N$ orthogonal matrix, then

$$\Phi_J(m) = \Phi_J(g^{-1} m\, g), \tag{4.50}$$

with similar considerations applying to $\Psi_J(h)$. Thus, for all intents and purposes, we may assume that the arguments of these functions are already in the diagonal form and regard the diagonal as an $N$-component vector. (Indeed, we will transfer back and forth between the vector and matrix language without further ado.)


Again we are forced to work with the dual variables. To that end, let $\psi_J(h)$ be the quantity $|\omega|^{-2} \Psi_J(h\omega)$. Clearly, the relation between $\psi_J$ and $\phi_J$ is as for $\Psi_J$ and $\Phi_J$. First, let us demonstrate that every stationary point of the scalar free energy $\psi_J$ represents a stationary point of the full $\Psi_J$. Indeed, let $K$ be the orthogonal complement of the vector $\omega$ in $\mathbb{R}^N$. As a simple computation shows, any $k \in K$ has a zero first component. If $k = (0, k_2, \dots, k_N) \in K$ is small, then

$$G(h\omega + k) = G(h\omega) + \Bigl\langle \sum_\beta k_\beta v_\beta^2 \Bigr\rangle_{h\omega} + O\bigl(|k|^2\bigr), \tag{4.51}$$

where $\langle - \rangle_h$ is as in (4.41). Now $\langle v_\beta^2 \rangle_{h\omega}$ is the same for all $\beta = 2, \dots, N$, and in view of the fact that $\sum_\beta k_\beta = 0$, the expectation vanishes. Hence, $\nabla\Psi_J(h\omega)$ has all components corresponding to the subspace $K$ equal to zero. Now if $h$ is a stationary point of $\psi_J$, we know that $(\omega, \nabla\Psi_J(h\omega)) = 0$ and thus $\nabla\Psi_J(h\omega) = 0$ as claimed.

To prove the desired claim, it now suffices to show that the Hessian of $\Psi_J$ is positive definite at the point $h\omega$ when $h$ satisfies $J g''(h) < 1$. (Recall that the corresponding stationary points of $\psi_J$ and $\phi_J$ are related by $h = J\lambda$.) This in turn amounts to showing that $\nabla\nabla G(h\omega)$ is dominated by the $J^{-1}$-multiple of the unit matrix. Although we must confine ourselves to $E_\Omega$, it is convenient to consider the Hessian of $G(h)$ in a larger space which contains the constant vector and restrict our directional probes to vectors from $E_\Omega$. In general, the entries of the Hessian are given in terms of truncated correlation functions:

$$\bigl[\operatorname{Hess}(G)\bigr]_{\alpha\beta} = \langle v_\alpha^2 v_\beta^2 \rangle_h - \langle v_\alpha^2 \rangle_h \langle v_\beta^2 \rangle_h. \tag{4.52}$$

For the problem at hand, there are only four distinct entries:

$$\operatorname{Hess}(G) = \begin{pmatrix} A & B & \cdots & \cdots & B \\ B & C & D & \cdots & D \\ \vdots & D & \ddots & \ddots & \vdots \\ \vdots & \vdots & \ddots & C & D \\ B & D & \cdots & D & C \end{pmatrix}. \tag{4.53}$$

Clearly, $\omega$ itself is an eigenvector of $\operatorname{Hess}(G)$ with the eigenvalue $A - B$. On the other hand, if $k \in K$, then the first row and column of $\operatorname{Hess}(G)$ are irrelevant. Writing the remaining $(N-1) \times (N-1)$ block in the form $(C - D)\mathbb{1} + D\,S$, where $S$ is the matrix with all entries equal to one, it follows easily that all of $K$ is an eigenspace of $\operatorname{Hess}(G)$ with eigenvalue $C - D$. It remains to show that these eigenvalues are strictly smaller than $J^{-1}$. The first one, namely, $A - B$ is less than $J^{-1}$ by our assumption that $J g''(h) < 1$. As to the other eigenvalue, $C - D$, we note that

$$C - D = \langle v_\alpha^4 \rangle_h - \langle v_\alpha^2 v_\beta^2 \rangle_h, \qquad \alpha > \beta > 1. \tag{4.54}$$

Now, Eq. (4.48) tells us that, under our conditions, $\langle v_\alpha^2 v_\beta^2 \rangle_h$ equals $\tfrac13 \langle v_\alpha^4 \rangle_h$. So we need that $\tfrac23 \langle v_\alpha^4 \rangle_h$ is less than $J^{-1}$. But since $h_1 = h > h_\alpha$, that is exactly the condition (ii) derived in the proof of Lemma 4.9.

Now we are ready to establish our claims concerning the local minima of $\Phi_J$:


Proof of Proposition 2.4. Let $\omega$ be as above and note that $|\omega|^2 = N/(N-1)$. Then the on-axis moment generating function from (4.10) becomes

$$g(h) = \frac{N-1}{N}\, \log \int \pi_N(dv)\, \exp\Bigl\{ h\, \frac{N}{N-1} \Bigl( v_1^2 - \frac1N \Bigr) \Bigr\}, \tag{4.55}$$

where $\pi_N$ is the uniform probability measure on the unit sphere in $\mathbb{R}^N$ and $v_1$ is the first component of $v$. An argument involving the $N$-dimensional spherical coordinates then shows that

$$\pi_N(v_1 \in dx) = C(N)\, (1 - x^2)^{\frac{N-3}{2}}\, dx, \tag{4.56}$$

where $C(N)$ is the ratio of the surfaces of the unit spheres in $\mathbb{R}^{N-1}$ and $\mathbb{R}^N$. By substituting this into (4.55) and applying (4.12), we easily find that, in order for $\boldsymbol\lambda = \lambda\omega$ to be a local minimum of $\Phi_J$, the scalar $\lambda$ has to satisfy Eq. (2.25). A simple analysis of (2.25) shows that for $J \ll 1$, the only solution to (2.25) is $\lambda = 0$, while for $J \gg N^2$, the solution $\lambda = 0$ is no longer perturbatively stable. Since Lemma 4.2 guarantees that the norm of all global minimizers increases with $J$, there must be a unique $J_{\mathrm{MF}} \in (0, \infty)$ and a non-decreasing function $J \mapsto \lambda_{\mathrm{MF}}(J)$ such that $\lambda_{\mathrm{MF}}(J)$ solves (2.25) and that every global minimizer of $\Phi_J$ at any $J > J_{\mathrm{MF}}$ which is a continuity point of $J \mapsto \lambda_{\mathrm{MF}}(J)$ corresponds to $\lambda = \lambda_{\mathrm{MF}}(J)$. (At any possible point of discontinuity of $J \mapsto \lambda_{\mathrm{MF}}(J)$, the $\lambda$ corresponding to any global minimizer is sandwiched between $\lim_{J' \uparrow J} \lambda_{\mathrm{MF}}(J')$ and $\lim_{J' \downarrow J} \lambda_{\mathrm{MF}}(J')$.) The claim is proved.

In order to prove the large-$N$ part of our statements concerning the mean-field theory of the nematic model, we will need to establish the following scaling property:

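Lemma 4.11. Let $\Phi_J^{(N)}$ denote the free-energy function of the $O(N)$-nematic Hamiltonian. Introduce the matrix $\omega = \mathrm{diag}\bigl(1, -\tfrac{1}{N-1}, \dots, -\tfrac{1}{N-1}\bigr)$ and define the normalized mean-field free-energy function
\[
\phi_J^{(N)}(\lambda) = \frac1N\,|\omega|^{-2}\,\Phi_{JN}^{(N)}(\lambda\omega), \qquad \lambda < 1. \tag{4.57}
\]
Then, as $N \to \infty$, the function $\lambda \mapsto \phi_J^{(N)}(\lambda)$ converges, along with all of its derivatives, to the function
\[
\phi_J^{(\infty)}(\lambda) = -\frac{J}{2}\,\lambda^2 + \frac12 \log\frac{1}{1-\lambda}. \tag{4.58}
\]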

Proof. The proof is a straightforward application of Laplace's method to the measure on the right-hand side of (2.25). Indeed, for any $h \ge 0$, consider the measure $\rho_{h,N}$ on $[0, 1]$ defined by
\[
\rho_{h,N}(\mathrm{d}x) = \frac{(1 - x^2)^{\frac{N-3}{2}}\, e^{hNx^2}\,\mathrm{d}x}{\int_0^1 (1 - x^2)^{\frac{N-3}{2}}\, e^{hNx^2}\,\mathrm{d}x}. \tag{4.59}
\]
Noting that the function $x \mapsto (1 - x^2)^{1/2}\, e^{hx^2}$ has a unique maximum at $x = x_h$, where
\[
x_h^2 = \max\Bigl\{0,\; 1 - \frac{1}{2h}\Bigr\}, \tag{4.60}
\]
we easily conclude that
\[
\lim_{N\to\infty} \rho_{h,N}(\cdot) = \delta_{x_h}(\cdot), \tag{4.61}
\]
where $\delta_a(\cdot)$ denotes the Dirac point mass at $x = a$. Here the limit is taken in the sense of weak convergence on the space of all bounded continuous functions on $[0, 1]$. The proof of this amounts to standard estimates for the Laplace method; we leave the details to the reader.

Let $g_N(h)$ denote the function $\tfrac1N\, g(hN)$, where $g$ is as in (4.55). Since any derivative of $g_N(h)$ can be expressed as a truncated correlation function of the measure $\rho_{h,N}$, we easily conclude that $h \mapsto g_N(h)$ converges, along with all of its derivatives, to the function
\[
g_\infty(h) = \lim_{N\to\infty} g_N(h) =
\begin{cases}
h - \frac12 - \frac12\log(2h), & \text{if } h > \frac12,\\[2pt]
0, & \text{otherwise},
\end{cases} \tag{4.62}
\]
for all $h \ge 0$. Now, the function $s_N(\lambda) = \tfrac1N\,|\omega|^{-2} S(\lambda\omega)$, where $S(\cdot)$ is the entropy of the $O(N)$-nematic model, is the Legendre transform of $g_N$, so we also get
\[
s_\infty(\lambda) = \lim_{N\to\infty} s_N(\lambda) = -\frac12 \log\frac{1}{1-\lambda}. \tag{4.63}
\]
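The limits (4.60)–(4.63) can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the Laplace-method exponent $\sup_{u\in[0,1)}\{\tfrac12\log(1-u) + hu\}$ (with $u = x^2$) on a grid, compares it with the closed form $h - \tfrac12 - \tfrac12\log(2h)$ for $h \ge \tfrac12$, computes the Legendre transform under the assumed sign convention $s(\lambda) = \inf_{h\ge0}\{g(h) - \lambda h\}$, and estimates the mean of $x^2$ under $\rho_{h,N}$ from (4.59). All function names and grid sizes are illustrative choices.

```python
import math

def g_inf_numeric(h, grid=20000):
    """sup over u = x^2 in [0, 1) of (1/2) log(1 - u) + h u, on a uniform grid."""
    best = 0.0  # value at u = 0
    for k in range(1, grid):
        u = k / grid
        best = max(best, 0.5 * math.log(1.0 - u) + h * u)
    return best

def g_inf_closed(h):
    """Closed form valid for h >= 1/2 (interior maximizer x_h^2 = 1 - 1/(2h))."""
    return h - 0.5 - 0.5 * math.log(2.0 * h)

def s_inf_numeric(lam, h_max=40.0, grid=40000):
    """Assumed convention: s(lam) = inf over h >= 0 of g_inf(h) - lam * h."""
    best = 0.0  # value at h = 0
    for k in range(1, grid + 1):
        h = h_max * k / grid
        g = g_inf_closed(h) if h >= 0.5 else 0.0
        best = min(best, g - lam * h)
    return best

def s_inf_closed(lam):
    """Right-hand side of (4.63)."""
    return -0.5 * math.log(1.0 / (1.0 - lam))

def mean_x2(h, N, grid=20000):
    """Mean of x^2 under rho_{h,N} from (4.59); midpoint rule, weights in log-space."""
    xs = [(k + 0.5) / grid for k in range(grid)]
    logw = [0.5 * (N - 3) * math.log(1.0 - x * x) + h * N * x * x for x in xs]
    top = max(logw)  # subtract the maximum to avoid overflow
    w = [math.exp(v - top) for v in logw]
    return sum(wi * x * x for wi, x in zip(w, xs)) / sum(w)
```

For instance, `mean_x2(1.0, 2000)` should lie close to $x_h^2 = 1 - 1/(2h) = \tfrac12$, in line with the concentration statement (4.61).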

(Again, the convergence extends to all derivatives, provided $\lambda < 1$.) From here the claim follows by noting that $\phi_J^{(N)}(\lambda) = -\tfrac{J}{2}\lambda^2 - s_N(\lambda)$, which tends to $\phi_J^{(\infty)}(\lambda)$ in the desired sense.

Proof of Proposition 2.5. By Lemma 4.11, the scaled mean-field free-energy function $\phi_J^{(N)}$ is, along with any finite number of its derivatives, uniformly close to $\phi_J^{(\infty)}$ on compact subsets of $[0, 1)$, provided $N$ is sufficiently large. Now the local minima of $\phi_J^{(\infty)}$ will again satisfy a mean-field equation, this time involving the function $g_\infty$ from (4.62). Since
\[
g_\infty'(h) =
\begin{cases}
1 - \frac{1}{2h}, & \text{if } h > \frac12,\\[2pt]
0, & \text{otherwise},
\end{cases} \tag{4.64}
\]
there are at most two perturbatively stable solutions to the mean-field equation $\lambda = g_\infty'(J\lambda)$: one at $\lambda = 0$ and the other at
\[
\lambda = \frac12\Bigl(1 + \sqrt{1 - 2J^{-1}}\Bigr). \tag{4.65}
\]
Moreover, these local minima interchange the role of the global minimum at some finite and non-zero $J_{\mathrm{MF}}^{(\infty)}$, which is a solution of a particular transcendental equation. For $J$ near $J_{\mathrm{MF}}^{(\infty)}$, the second derivative of $\phi_J^{(\infty)}$ is uniformly positive around both local minima. The convergence stated in Lemma 4.11 ensures that all of the previously listed facts will be (at least qualitatively) satisfied by $\phi_J^{(N)}$ for $N$ large as well. Thus, $\phi_J^{(N)}$ has at most one positive local minimum, which immediately implies that $J \mapsto \lambda_{\mathrm{MF}}^{(N)}(J)$ is continuous whenever it is defined. Moreover, since the local minima of $\phi_J^{(N)}$ converge to those of $\phi_J^{(\infty)}$, we also easily recover the asymptotic statements (2.27–2.28). This finishes the proof.
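The transcendental equation for $J_{\mathrm{MF}}^{(\infty)}$ is easy to solve numerically. The sketch below is an illustration only and makes the following assumptions: it uses the limiting function (4.58) directly, it searches for the non-zero minimum on $[\tfrac12, 1)$ (setting the derivative of (4.58) to zero gives $2J\lambda(1-\lambda) = 1$, whose larger root lies in that interval when $J \ge 2$), and it brackets the transition between the hypothetical endpoints $J = 2.1$ and $J = 4$. The transition is located where the non-zero minimum value crosses $\phi_J^{(\infty)}(0) = 0$.

```python
import math

def phi_inf(J, lam):
    """Limiting free-energy function (4.58)."""
    return -0.5 * J * lam * lam + 0.5 * math.log(1.0 / (1.0 - lam))

def min_on_upper_branch(J, grid=5000):
    """Minimum of phi_inf(J, .) over lam in [1/2, 1), taken on a uniform grid."""
    return min(phi_inf(J, 0.5 + 0.5 * k / grid) for k in range(grid))

def j_mf_infinity(lo=2.1, hi=4.0, steps=50):
    """Bisection for the J at which the non-zero minimum value reaches 0.

    Relies on min_on_upper_branch being decreasing in J, positive at `lo`
    and negative at `hi` (both endpoints are illustrative assumptions).
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if min_on_upper_branch(mid) > 0.0:
            lo = mid  # non-zero minimum still above phi(0) = 0
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `j_mf_infinity()` returns the crossing point; for $J$ below it the global minimum of $\phi_J^{(\infty)}$ sits at $\lambda = 0$, and above it at the non-zero solution.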


Proof of Theorem 2.6. The proof is similar to that of the Potts and cubic models; the only extra impediment is that now we cannot take for granted that there is only one non-zero local minimum. As before, most of the difficulties will be resolved by invoking the monotonicity of the energy density $e(J)$, which is defined e.g. by optimizing $\tfrac12\langle (Q_0, Q_x)\rangle_J$ over all Gibbs states invariant under the lattice translations and rotations. In the present case, $\kappa$ and $n$ in the Main Theorem are given by $\kappa = (N-1)/N$ and $n = \tfrac12 N(N-1)$. Thus, letting $\delta_d = \tfrac14 (N-1)^2 I_d$, the quantity $J\delta_d$ is the corresponding error term on the right-hand side of (1.12). Define $\epsilon_1$ by the formula (4.23). Then Lemma 4.9 guarantees that the diagonal form $\boldsymbol\lambda$ of $\langle Q_0\rangle_J$ for any Gibbs state is an index permutation of a vector of the type
\[
\Bigl(\lambda + a_1,\; -\frac{\lambda}{N-1} + a_2,\; \dots,\; -\frac{\lambda}{N-1} + a_N\Bigr), \tag{4.66}
\]
where $\sum_i a_i = 0$, $\sum_i a_i^2 \le \epsilon_1^2$ and $\lambda$ corresponds to a local minimum of $\Phi_J$. If $\boldsymbol\lambda$ is the physical magnetization giving rise to $\lambda(J)$, we let $\lambda_{\mathrm{MF}}(J)$ be a value of $\lambda$, corresponding to a local minimum of $\Phi_J$, for which $\boldsymbol\lambda$ takes the form (4.66). Then Corollary 1.2 and the Key Estimate give
\[
\Bigl|\, e(J) - \frac12\,\frac{N}{N-1}\,\lambda_{\mathrm{MF}}(J)^2 \Bigr| \le 2J\delta_d. \tag{4.67}
\]
Now for $J \le J_0 \ll 1$, we know the only local minimum is for $\lambda_{\mathrm{MF}}(J) = 0$, while for $J \ge J_1 \gg N^2$, the zero vector is no longer a local minimum and hence $\lambda_{\mathrm{MF}}(J)$ exceeds some $\kappa > 0$. But $J \mapsto e(J)$ is non-decreasing so there must be a $J_t \in [J_0, J_1]$, where $e(J)$ jumps by at least $\kappa - 2J_t\delta_d$, which is positive once $d$ is sufficiently large. The fact that $J_t$ must be close to $J_{\mathrm{MF}}$ for large enough $d$ is proved exactly as for the Potts and cubic models.

5. Mean-Field Theory and Complete-Graph Models

Here we will show that the mean-field formalism developed in Sect. 1.2 has a very natural interpretation for the model on a complete graph. An important reason for the complete graph picture is to provide a tangible physical system to motivate some of the physical arguments. The forthcoming derivation is a rather standard exercise in large-deviation theory [16, 19], so we will keep it rather brief.

We will begin by a precise definition of the problem. Let $G_N$ be a complete graph on $N$ vertices and consider a spin system on $G_N$ with single-spin space $\Omega$ and the Hamiltonian
\[
\beta H_N(S) = -\frac{J}{N} \sum_{1 \le x < y \le N} (S_x, S_y) - \sum_{x=1}^{N} (b, S_x). \tag{5.1}
\]

(Recall that $\Omega$ is a compact subset of a finite-dimensional vector space $E_\Omega$ with inner product denoted as in the previous formula.) Let $\mu$ denote the a priori spin measure and let $\langle - \rangle_0$ denote the corresponding expectation. For each configuration $S$, introduce the empirical magnetization by the formula
\[
m_N(S) = \frac1N \sum_{x=1}^{N} S_x. \tag{5.2}
\]
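As a concrete illustration of (5.1) and (5.2), consider the simplest case, which is an assumption made only for this example: Ising spins $\Omega = \{-1, +1\} \subset \mathbb{R}$ with the uniform a priori measure. For such spins $\sum_{x<y} S_x S_y = \tfrac12(M^2 - N)$ with $M = N m_N(S)$, so the expectation $\langle e^{-\beta H_N}\rangle_0$ is a finite sum over the number of up-spins. The sketch below compares $\tfrac1N \log\langle e^{-\beta H_N}\rangle_0$ with $-\inf_m \Phi_{J,b}(m)$, where the form $\Phi_{J,b}(m) = -\tfrac{J}{2}m^2 - bm - S(m)$, with $S(m)$ the entropy relative to the uniform measure, is assumed here because Sect. 1.2 is not reproduced in this part of the text. All function names are hypothetical.

```python
import math

def log_partition_per_spin(N, J, b=0.0):
    """(1/N) log < exp(-beta H_N) >_0 for +/-1 spins on the complete graph."""
    logs = []
    for k in range(N + 1):  # k = number of +1 spins
        M = 2 * k - N       # total magnetization N * m_N(S)
        # log of binomial(N, k) / 2^N, the a priori weight of this magnetization
        log_prior = (math.lgamma(N + 1) - math.lgamma(k + 1)
                     - math.lgamma(N - k + 1) - N * math.log(2.0))
        # -beta H_N, using sum_{x<y} S_x S_y = (M^2 - N) / 2
        minus_beta_h = (J / N) * (M * M - N) / 2.0 + b * M
        logs.append(log_prior + minus_beta_h)
    top = max(logs)  # log-sum-exp for numerical stability
    return (top + math.log(sum(math.exp(v - top) for v in logs))) / N

def phi(m, J, b=0.0):
    """Assumed mean-field free energy: -(J/2) m^2 - b m - S(m)."""
    def xlogx(t):
        return t * math.log(t) if t > 0.0 else 0.0
    # entropy relative to the uniform measure on {-1, +1}
    entropy = -(xlogx((1 + m) / 2) + xlogx((1 - m) / 2)) - math.log(2.0)
    return -0.5 * J * m * m - b * m - entropy

def f_mf(J, b=0.0, grid=4000):
    """Infimum of phi over m in [-1, 1], taken on a uniform grid."""
    return min(phi(-1.0 + 2.0 * k / grid, J, b) for k in range(grid + 1))
```

Under these assumptions, `log_partition_per_spin(2000, 1.5)` and `-f_mf(1.5)` should agree up to a correction that vanishes as $N$ grows.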


If $m \in \mathrm{Conv}(\Omega)$ and $\epsilon > 0$, let $U_\epsilon(m)$ denote the $\epsilon$-neighborhood of $m$ in $\mathrm{Conv}(\Omega)$ in the metric induced by the inner product on $E_\Omega$. Then we have:

Theorem 5.1. For each $m \in \mathrm{Conv}(\Omega)$,
\[
\lim_{\epsilon\downarrow0}\,\lim_{N\to\infty}\, \frac1N \log \Bigl\langle e^{-\beta H_N(S)}\, 1_{\{m_N(S) \in U_\epsilon(m)\}} \Bigr\rangle_0 = -\Phi_{J,b}(m), \tag{5.3}
\]
where $\Phi_{J,b}(m)$ is as defined in Sect. 1.2. Moreover, if $\nu_N$ denotes the Gibbs measure obtained by normalizing $e^{-\beta H_N(S)}$ and if $F_{\mathrm{MF}}(J, b)$ denotes the infimum of $\Phi_{J,b}(m)$ over $m \in \mathrm{Conv}(\Omega)$, then
\[
\lim_{N\to\infty} \nu_N\Bigl(\Phi_{J,b}\bigl(m_N(S)\bigr) \ge F_{\mathrm{MF}}(J, b) + \epsilon\Bigr) = 0 \tag{5.4}
\]
for every $\epsilon > 0$.

Proof. By our assumption, $E_\Omega$ is a finite-dimensional vector space. Moreover, $\Omega$ is compact and thus the logarithmic generating function $G(h)$ defined in (1.3) exists for all $h \in E_\Omega$. As a consequence of Cramér's Theorem for i.i.d. random variables on $\mathbb{R}^n$, see Theorem 2.2.30 in [16], the measures
\[
\mu_N(\cdot) = \mu\bigl(m_N(S) \in \cdot\,\bigr) \tag{5.5}
\]
satisfy a large-deviation principle on $\mathbb{R}^d$ with rate function (1.4). In particular,
\[
\lim_{\epsilon\downarrow0}\,\lim_{N\to\infty}\, \frac1N \log \mu_N\bigl(U_\epsilon(m)\bigr) = S(m), \qquad m \in \mathrm{Conv}(\Omega). \tag{5.6}
\]
Now $\beta H_N$ can be written as follows:
\[
\beta H_N = N\,E_{J,b}\bigl(m_N(S)\bigr) - \frac{J}{N} \sum_{x=1}^{N} (S_x, S_x). \tag{5.7}
\]
Since the second term is bounded by a non-random constant almost surely and since $m \mapsto E_{J,b}(m)$ is uniformly continuous throughout $\mathrm{Conv}(\Omega)$, (5.3) follows by inspecting the definition of $\Phi_{J,b}(m)$.

Acknowledgement. The research of L.C. was supported by the NSF under the grant DMS-9971016 and by the NSA under the grant NSA-MDA 904-00-1-0050.

References

1. Aizenman, M.: Geometric analysis of $\varphi^4$ fields and Ising models. I, II. Commun. Math. Phys. 86, 1–48 (1982)
2. Aizenman, M., Fernández, R.: On the critical behavior of the magnetization in high-dimensional Ising models. J. Stat. Phys. 44, 393–454 (1986)
3. Aizenman, M., Barsky, D.J., Fernández, R.: The phase transition in a general class of Ising-type models is sharp. J. Stat. Phys. 47, 343–374 (1987)
4. Aizenman, M., Chayes, J.T., Chayes, L., Newman, C.M.: Discontinuity of the magnetization in one-dimensional $1/|x-y|^2$ Ising and Potts models. J. Stat. Phys. 50(1–2), 1–40 (1988)
5. Alexander, K.S., Chayes, L.: Non-perturbative criteria for Gibbsian uniqueness. Commun. Math. Phys. 189(2), 447–464 (1997)


6. Angelescu, N., Zagrebnov, V.A.: A lattice model of liquid crystals with matrix order parameter. J. Phys. A 15(11), L639–L643 (1982)
7. van den Berg, J., Maes, C.: Disagreement percolation in the study of Markov fields. Ann. Probab. 22(2), 749–763 (1994)
8. Biskup, M.: Reflection positivity of the random-cluster measure invalidated for non-integer q. J. Stat. Phys. 92, 369–375 (1998)
9. Biskup, M., Chayes, L.: Mean-field driven first-order phase transitions in systems with long-range interactions. In preparation
10. Bovier, A., Zahradník, M.: The low-temperature phase of Kac-Ising models. J. Stat. Phys. 87, 311–332 (1997)
11. Bovier, A., Zahradník, M.: Cluster expansions and Pirogov-Sinai theory for long-range Ising systems. Submitted
12. Bricmont, J., Kesten, H., Lebowitz, J.L., Schonmann, R.H.: A note on the Ising model in high dimensions. Commun. Math. Phys. 122, 597–607 (1989)
13. Brydges, D., Spencer, T.: Self-avoiding walk in 5 or more dimensions. Commun. Math. Phys. 97, 125–148 (1985)
14. Cassandro, M., Presutti, E.: Phase transitions in Ising systems with long but finite range interactions. Markov Process. Related Fields 2, 241–262 (1996)
15. Curie, P.: Propriétés magnétiques des corps à diverses températures. Ann. de Chimie et Physique 5, 289 (1885); reprinted in Œuvres de Pierre Curie. Paris: Gauthier-Villars, 1908, pp. 232–334
16. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. New York: Springer Verlag, Inc., 1998
17. Dobrushin, R.: The description of a random field by means of conditional probabilities and conditions of its regularity. Theor. Prob. Appl. 13, 197–224 (1968)
18. Dyson, F.J., Lieb, E.H., Simon, B.: Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. J. Statist. Phys. 18, 335–383 (1978)
19. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Grundlehren der Mathematischen Wissenschaften, Vol. 271. New York: Springer-Verlag, 1985
20. Fernández, R., Fröhlich, J., Sokal, A.D.: Random walks, critical phenomena, and triviality in quantum field theory. Texts and Monographs in Physics. Berlin: Springer-Verlag, 1992
21. Fortuin, C.M., Kasteleyn, P.W.: On the random cluster model. I. Introduction and relation to other models. Physica (Amsterdam) 57, 536–564 (1972)
22. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. I. General theory and long-range lattice models. Commun. Math. Phys. 62(1), 1–34 (1978)
23. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. II. Lattice systems with short-range and Coulomb interactions. J. Statist. Phys. 22(3), 297–347 (1980)
24. Fröhlich, J., Simon, B., Spencer, T.: Infrared bounds, phase transitions and continuous symmetry breaking. Commun. Math. Phys. 50, 79–95 (1976)
25. de Gennes, P.G., Prost, J.: The Physics of Liquid Crystals. New York: Oxford University Press, 1993
26. Georgii, H.-O.: Gibbs Measures and Phase Transitions. de Gruyter Studies in Mathematics, Vol. 9. Berlin: Walter de Gruyter & Co., 1988
27. Georgii, H.-O., Häggström, O., Maes, C.: The random geometry of equilibrium phases. In: Phase Transitions and Critical Phenomena. C. Domb, J.L. Lebowitz (eds), Vol. 18. New York: Academic Press, 1999, pp. 1–142
28. Grimmett, G.: The stochastic random-cluster process and the uniqueness of random-cluster measures. Ann. Probab. 23(4), 1461–1510 (1995)
29. Hara, T., van der Hofstad, R., Slade, G.: Critical two-point functions and the lace expansion for spread-out high-dimensional percolation and related models. Ann. Probab. (to appear)
30. Hara, T., Slade, G.: Self-avoiding walk in five or more dimensions. I. The critical behaviour. Commun. Math. Phys. 147, 101–136 (1992)
31. Hara, T., Slade, G.: The lace expansion for self-avoiding walk in five or more dimensions. Rev. Math. Phys. 4, 235–327 (1992)
32. Hara, T., Slade, G.: Mean-field behaviour and the lace expansion. In: Probability and Phase Transition. G. Grimmett (ed), (Cambridge, 1993), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., Vol. 420. Dordrecht: Kluwer Acad. Publ., 1994, pp. 87–122
33. Hara, T., Slade, G.: Mean-field critical behaviour for percolation in high dimensions. Commun. Math. Phys. 128, 333–391 (1990)
34. Hara, T., Slade, G.: The incipient infinite cluster in high-dimensional percolation. Electron. Res. Announc. Amer. Math. Soc. 4, 48–55 (1998)
35. Hara, T., Slade, G.: The scaling limit of the incipient infinite cluster in high-dimensional percolation. I. Critical exponents. J. Statist. Phys. 99, 1075–1168 (2000)
36. Hara, T., Slade, G.: The scaling limit of the incipient infinite cluster in high-dimensional percolation. II. Integrated super-Brownian excursion. J. Math. Phys. 41, 1244–1293 (2000)


37. van der Hofstad, R., den Hollander, F., Slade, G.: A new inductive approach to the lace expansion for self-avoiding walks. Probab. Theory Rel. Fields 111, 253–286 (1998)
38. van der Hofstad, R., den Hollander, F., Slade, G.: Construction of the incipient infinite cluster for spread-out oriented percolation above 4+1 dimensions. Commun. Math. Phys. 231, 435–461 (2002)
39. van der Hofstad, R., Slade, G.: A generalised inductive approach to the lace expansion. Probab. Theory Rel. Fields 122, 389–430 (2002)
40. van der Hofstad, R., Slade, G.: Convergence of critical oriented percolation to super-Brownian motion above 4+1 dimensions. Ann. Inst. H. Poincaré Probab. Statist. (to appear)
41. Kesten, H., Schonmann, R.: Behavior in large dimensions of the Potts and Heisenberg models. Rev. Math. Phys. 1, 147–182 (1990)
42. Kim, D., Levy, P.M., Uffer, L.F.: Cubic rare-earth compounds: Variants of the three-state Potts model. Phys. Rev. B 12, 989–1004 (1975)
43. Kim, D., Levy, P.M.: Critical behavior of the cubic model. Phys. Rev. B 12, 5105–5111 (1975)
44. Kotecký, R., Laanait, L., Messager, A., Ruiz, J.: The q-state Potts model in the standard Pirogov-Sinaĭ theory: surface tensions and Wilson loops. J. Statist. Phys. 58(1–2), 199–248 (1990)
45. Kotecký, R., Shlosman, S.B.: First-order phase transitions in large entropy lattice models. Commun. Math. Phys. 83(4), 493–515 (1982)
46. Laanait, L., Messager, A., Miracle-Solé, S., Ruiz, J., Shlosman, S.: Interfaces in the Potts model. I. Pirogov-Sinai theory of the Fortuin-Kasteleyn representation. Commun. Math. Phys. 140(1), 81–91 (1991)
47. Lebowitz, J.L., Mazel, A., Presutti, E.: Liquid-vapor phase transitions for systems with finite-range interactions. J. Statist. Phys. 94, 955–1025 (1999)
48. Pearce, P.A., Thompson, C.J.: The high density limit for lattice spin models. Commun. Math. Phys. 58, 131–138 (1978)
49. Potts, R.B.: Some generalized order-disorder transformations. Proc. Cambridge Philos. Soc. 48, 106–109 (1952)
50. Ruelle, D.: Statistical mechanics. Rigorous results. Reprint of the 1989 edition. River Edge, NJ: World Scientific Publishing Co., Inc., London: Imperial College Press, 1999
51. Rockafellar, R.T.: Convex analysis. Princeton: Princeton University Press, 1997
52. Simon, B.: The statistical mechanics of lattice gases. Vol. I. Princeton Series in Physics. Princeton, NJ: Princeton University Press, 1993
53. Weiss, P.: L'Hypothèse du champ moléculaire et la propriété ferromagnétique. J. de Physique 6, 661 (1907)
54. Wu, F.Y.: The Potts model. Rev. Modern Phys. 54, 235–268 (1982)

Communicated by J. Z. Imbrie

Commun. Math. Phys. 238, 95–118 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0819-3

Communications in

Mathematical Physics

Infinite-Dimensional Lie Superalgebras and Hook Schur Functions

Shun-Jen Cheng1,∗, Ngau Lam2,∗∗

1 Department of Mathematics, National Taiwan University, Taipei, Taiwan 106, R.O.C. E-mail: [email protected]
2 Department of Mathematics, National Cheng-Kung University, Tainan, Taiwan 701, R.O.C. E-mail: [email protected]

Received: 6 March 2002 / Accepted: 15 January 2003 Published online: 14 March 2003 – © Springer-Verlag 2003

∗ Partially supported by NSC-grant 91-2115-M-002-007 of the R.O.C.
∗∗ Partially supported by NSC-grant 90-2115-M-006-015 of the R.O.C.

Abstract: Making use of a Howe duality involving the infinite-dimensional Lie superalgebra $\widehat{gl}_{\infty|\infty}$ and the finite-dimensional group $GL_l$ of [CW3] we derive a character formula for a certain class of irreducible quasi-finite representations of $\widehat{gl}_{\infty|\infty}$ in terms of hook Schur functions. We use the reduction procedure of $\widehat{gl}_{\infty|\infty}$ to $\widehat{gl}_{n|n}$ to derive a character formula for a certain class of level 1 highest weight irreducible representations of $\widehat{gl}_{n|n}$, the affine Lie superalgebra associated to the finite-dimensional Lie superalgebra $gl_{n|n}$. These modules turn out to form the complete set of integrable $\widehat{gl}_{n|n}$-modules of level 1. We also show that the characters of all integrable level 1 highest weight irreducible $\widehat{gl}_{m|n}$-modules may be written as a sum of products of hook Schur functions.

1. Introduction

Symmetric functions have been playing an important role in relating combinatorics and representation theory of Lie groups/algebras. Interesting combinatorial identities involving symmetric functions, more often than not, have remarkable underlying representation-theoretic explanations. As an example consider the classical Cauchy identity
\[
\prod_{i,j} \frac{1}{(1 - x_i y_j)} = \sum_{\lambda} s_\lambda(x_1, x_2, \cdots)\, s_\lambda(y_1, y_2, \cdots), \tag{1.1}
\]
where $x_1, x_2, \cdots$ and $y_1, y_2, \cdots$ are indeterminates, and $s_\lambda(x_1, x_2, \cdots)$ stands for the Schur function associated to the partition $\lambda$. Here the summation over $\lambda$ is over all partitions. Now the underlying representation-theoretic interpretation of (1.1) is of course the so-called $(GL, GL)$ Howe duality [H1, H2]. Namely, let $\mathbb{C}^m$ and $\mathbb{C}^n$ be the


m- and n-dimensional complex vector spaces, respectively. We have an action of the respective general linear groups $GL_m$ and $GL_n$ on $\mathbb{C}^m$ and $\mathbb{C}^n$. This induces a joint action of $GL_m \times GL_n$ on $\mathbb{C}^m \otimes \mathbb{C}^n$, which in turn induces an action on the symmetric tensor $S(\mathbb{C}^m \otimes \mathbb{C}^n)$. As partitions of appropriate length may be regarded as highest weights of irreducible representations of a general linear group, (1.1) simply gives an identity of characters of the decomposition of $S(\mathbb{C}^m \otimes \mathbb{C}^n)$ with respect to this joint action.

It was observed in [BR] that a generalization of Schur functions, the so-called hook Schur functions (see (3.1) for definition), plays a similar role in the representation theory of a certain class of finite-dimensional irreducible modules over the general linear Lie superalgebra. To be more precise, consider the general linear Lie superalgebra $gl_{m|n}$ acting on the complex superspace $\mathbb{C}^{m|n}$ of (super)dimension $(m|n)$. We may consider its induced action on the $k$-th tensor power $T^k(\mathbb{C}^{m|n}) = \otimes^k(\mathbb{C}^{m|n})$. It turns out [BR] that the tensor algebra $T(\mathbb{C}^{m|n}) = \bigoplus_{k=0}^{\infty} T^k(\mathbb{C}^{m|n})$ is completely reducible as a $gl_{m|n}$-module and the characters of the irreducible representations appearing in this decomposition are given by hook Schur functions associated to partitions lying in a certain hook whose shape is determined by the integers $m$ and $n$. Now, as in the classical case, one may consider the joint action of two general linear Lie superalgebras $gl_{m|n} \times gl_{p|q}$ on the symmetric tensor $S(\mathbb{C}^{m|n} \otimes \mathbb{C}^{p|q})$. This action again is completely reducible and its decomposition with respect to the joint action, in a similar fashion, gives rise to a combinatorial identity involving hook Schur functions [CW1]. So here we have an interplay between combinatorics and representation theory of finite-dimensional Lie superalgebras as well. For another interplay involving Schur Q-functions and the queer Lie superalgebra see [CW2]. For further articles related to Howe duality in the Lie superalgebra settings we refer to [S1, S2, N and OP].

The purpose of the present paper is to demonstrate that symmetric functions may play a similarly prominent role relating combinatorics and representation theory of infinite-dimensional Lie superalgebras as well. It was shown in [CW3] that on the infinite-dimensional Fock space generated by $n$ pairs of free bosons and $n$ pairs of free fermions we have a natural commuting action of the finite-dimensional group $GL_n$ and the infinite-dimensional Lie superalgebra $\widehat{gl}_{\infty|\infty}$ of central charge $n$. It can be shown [CW3] that the pair $(GL_n, \widehat{gl}_{\infty|\infty})$ forms a dual pair in the sense of Howe. The irreducible representations of $GL_n$ appearing in this decomposition range over all rational representations, so that they are parameterized by generalized partitions of length not exceeding $n$. The irreducible representations of $\widehat{gl}_{\infty|\infty}$ appearing in the same decomposition are certain quasi-finite highest weight irreducible representations. In particular this relates rational representations of the finite-dimensional group $GL_n$ and a certain class of quasi-finite representations of the infinite-dimensional Lie superalgebra $\widehat{gl}_{\infty|\infty}$.

A natural question that arises is the computation of the character of these quasi-finite highest weight irreducible representations of $\widehat{gl}_{\infty|\infty}$. This is solved in the present paper by combining the Howe duality of [CW3] and a combinatorial identity involving hook Schur functions (Proposition 3.1). It turns out that the characters of these representations can be written as an infinite sum of products of two hook Schur functions. Each coefficient of these products can be determined by decomposing a certain tensor product of two finite-dimensional irreducible representations of $GL_n$. The same method applied to the classical Lie algebra $\widehat{gl}_{\infty}$ (using Lemma 3.1 now together with the dual Cauchy identity instead of Proposition 3.1) gives rise to a character formula for $\widehat{gl}_{\infty}$ involving Schur functions instead of hook Schur functions. This formula has been discovered earlier by Kac and Radul [KR2] and even earlier in the simplest case by Awata, Fukuma,


Matsuo and Odake [AFMO2]. However, our approach in these classical cases appears to be simpler. It is indeed remarkable that the same character identity obtained in [KR2] with Schur functions replaced by hook Schur functions associated to the same partitions gives rise to our character identity for $\widehat{gl}_{\infty|\infty}$. That is, the coefficients remain unchanged!

Making use of the same combinatorial identity we then proceed to compute the corresponding q-character formula for this class of highest weight irreducible representations of $\widehat{gl}_{\infty|\infty}$. We remark that when computing just the q-characters we can obtain a simpler formula, which involves a sum of just hook Schur functions, instead of a product of hook Schur functions. We note that a q-character formula in the case when the central charge is 1 has been obtained earlier by Kac and van de Leur [KL]. Our formula in this case looks rather different from theirs, thus giving rise to another combinatorial identity. It would be interesting to find a purely combinatorial proof of this identity.

We use the reduction procedure from $\widehat{gl}_{\infty|\infty}$ to $\widehat{gl}_{n|n}$ [KL] in the level 1 case to obtain a character formula for certain highest weight irreducible representations of the affine Lie superalgebra $\widehat{gl}_{n|n}$ at level 1. However, the Borel subalgebra coming from the reduction procedure is different from the standard Borel subalgebra, and hence the corresponding highest weights in general are different. Nevertheless, we show that by a sequence of odd reflections [PS] our highest weights may be transformed into highest weights corresponding to integrable $\widehat{gl}_{n|n}$-modules (in the sense of [KW]) so that we obtain a character formula for all integrable highest weight $\widehat{gl}_{n|n}$-modules of level 1. In [KW] a character formula has been obtained for level 1 integrable highest weight irreducible $\widehat{gl}_{m|n}$-modules. Our formula looks rather different.

We also show that by applying our method together with [KW] the characters of all level 1 integrable representations of $\widehat{gl}_{m|n}$, $m \ge 2$, may be written in terms of hook Schur functions as well. This seems to indicate the relevance of these generalized symmetric functions in the representation theory of affine superalgebras.

As we have obtained a character formula for certain representations of $\widehat{gl}_{\infty|\infty}$ at arbitrary positive integral level, it is our hope that our formula may provide some direction in finding a character formula for integrable $\widehat{gl}_{n|n}$-, or maybe even $\widehat{gl}_{m|n}$-modules, at higher positive integral levels.

The paper is organized as follows. In Sect. 2 we collect the definitions and notation to be used throughout. In Sect. 3 we first prove the combinatorial identity mentioned above and then use it to write the characters of certain $\widehat{gl}_{\infty|\infty}$-modules in terms of hook Schur functions. In Sect. 4 we calculate a q-character formula for these modules, while in Sect. 5 we calculate the characters of the associated affine $\widehat{gl}_{n|n}$-modules. In Sect. 6 we compute the tensor product decomposition of two $\widehat{gl}_{\infty|\infty}$-modules. It turns out that even though such a decomposition involves an infinite number of irreducible components, each irreducible component appears with a finite multiplicity. This multiplicity can be expressed via the usual Littlewood-Richardson coefficients.

2. Preliminaries

Let $\mathbb{C}^{m|n} = \mathbb{C}^{m|0} \oplus \mathbb{C}^{0|n}$ denote the $m|n$-dimensional superspace. Let $gl_{m|n}$ be the Lie superalgebra of general linear transformations on the superspace $\mathbb{C}^{m|n}$. Choosing a basis $\{e_1, \cdots, e_m\}$ for the even subspace $\mathbb{C}^{m|0}$ and a basis $\{f_1, \cdots, f_n\}$ for the odd subspace $\mathbb{C}^{0|n}$, we may regard $gl_{m|n}$ as $(m+n) \times (m+n)$ matrices of the form
\[
\begin{pmatrix} E & a \\ b & e \end{pmatrix}, \tag{2.1}
\]
where the complex matrices $E$, $a$, $b$ and $e$ are respectively $m \times m$, $m \times n$, $n \times m$ and $n \times n$. Let $X_{ij}$ denote the corresponding elementary matrix with 1 in the $i$-th row and $j$-th column and zero elsewhere, where $X = E, b, a, e$. Then $\mathfrak{h} = \sum_{i=1}^{m} \mathbb{C}E_{ii} + \sum_{j=1}^{n} \mathbb{C}e_{jj}$ is a Cartan subalgebra of $gl_{m|n}$. It is clear that any ordering of the basis $\{e_1, \cdots, e_m, f_1, \cdots, f_n\}$ that preserves the order among the even and odd basis elements themselves gives rise to a Borel subalgebra of $gl_{m|n}$ containing $\mathfrak{h}$. In particular the ordering $e_1 < \cdots < e_m < f_1 < \cdots < f_n$ gives rise to the standard Borel subalgebra. In the case when $m = n$ the ordering $f_1 < e_1 < f_2 < e_2 < \cdots < f_n < e_n$ gives rise to a Borel subalgebra that we will refer to as non-standard from now on.

Fixing the standard Borel subalgebra, we let $V_{m|n}^{\lambda}$ denote the finite-dimensional highest weight irreducible module with highest weight $\lambda$. Let $\epsilon_i \in \mathfrak{h}^*$ be defined by $\epsilon_i(E_{jj}) = \delta_{ij}$ and $\epsilon_i(e_{jj}) = 0$. Furthermore let $\delta_j$ be defined by $\delta_j(E_{ii}) = 0$ and $\delta_j(e_{ii}) = \delta_{ij}$. Then $\epsilon_i$ and $\delta_j$ are the fundamental weights of $gl_{m|n}$.

Let $\mathbb{C}[t, t^{-1}]$ be the ring of Laurent polynomials in the indeterminate $t$. Let $\widehat{gl}_{m|n} \equiv gl_{m|n} \otimes \mathbb{C}[t, t^{-1}] + \mathbb{C}C + \mathbb{C}d$ be the affine Lie superalgebra associated to the Lie superalgebra $gl_{m|n}$. Writing $A(k)$ for $A \otimes t^k$, $A \in gl_{m|n}$, the Lie (super)bracket is given by
\[
[A(k), B(l)] = [A, B](k + l) + \delta_{k+l,0}\, k\, \mathrm{Str}(AB)\, C, \qquad [d, A(k)] = kA(k),
\]
for $A, B \in gl_{m|n}$ and $k, l \in \mathbb{Z}$. Here $C$ is a central element, $d$ is the scaling element and $\mathrm{Str}$ denotes the super trace operator of a matrix, which for a matrix of the form (2.1) takes the form $\mathrm{Tr}(E) - \mathrm{Tr}(e)$.

A Cartan subalgebra of $\widehat{gl}_{m|n}$ is given by $\hat{\mathfrak{h}} = \mathfrak{h} + \mathbb{C}C + \mathbb{C}d$. We may extend respectively $\epsilon_i$ and $\delta_j$ to elements $\tilde\epsilon_i$ and $\tilde\delta_j$ in $\hat{\mathfrak{h}}^*$ in a trivial way. Furthermore we define $\tilde\Lambda_0 \in \hat{\mathfrak{h}}^*$ and $\tilde\delta \in \hat{\mathfrak{h}}^*$ by $\tilde\Lambda_0(\mathfrak{h}) = \tilde\Lambda_0(d) = 0$, $\tilde\Lambda_0(C) = 1$ and $\tilde\delta(\mathfrak{h}) = \tilde\delta(C) = 0$, $\tilde\delta(d) = 1$, respectively.

Let $B \subseteq gl_{m|n}$ be a Borel subalgebra containing $\mathfrak{h}$. Then $B + \mathbb{C}C + \mathbb{C}d + gl_{m|n} \otimes t\mathbb{C}[t]$ is a Borel subalgebra of $\widehat{gl}_{m|n}$. We define highest weight irreducible modules of $\widehat{gl}_{m|n}$ in the usual way. It is clear that any highest weight irreducible $\widehat{gl}_{m|n}$-module is completely determined by an element $\Lambda \in \hat{\mathfrak{h}}^*$. We will denote this module by $L(\widehat{gl}_{m|n}, \Lambda)$.

Consider now the infinite-dimensional complex superspace $\mathbb{C}^{\infty|\infty}$ with even basis elements labelled by integers and odd basis elements labelled by half-integers. Arranging the basis elements in strictly increasing order any linear transformation may be written as an infinite-sized square matrix with coefficients in $\mathbb{C}$. This associative algebra is naturally $\mathbb{Z}_2$-graded, so that it is an associative superalgebra, which we denote by $\tilde{M}_{\infty|\infty}$. Let
\[
M_{\infty|\infty} := \bigl\{ A = (a_{ij}) \in \tilde{M}_{\infty|\infty},\; i, j \in \tfrac12\mathbb{Z} \;\big|\; a_{ij} = 0 \text{ for } |j - i| \gg 0 \bigr\}.
\]
That is, $M_{\infty|\infty}$ consists of those matrices in $\tilde{M}_{\infty|\infty}$ with finitely many non-zero diagonals. We denote the corresponding Lie superalgebra by $gl_{\infty|\infty}$. Furthermore let


us denote by $e_{ij}$, $i, j \in \tfrac12\mathbb{Z}$, the elementary matrices with 1 at the $i$-th row and $j$-th column and 0 elsewhere. Then the subalgebra generated by $\{e_{ij} \mid i, j \in \tfrac12\mathbb{Z}\}$ is a dense subalgebra inside $gl_{\infty|\infty}$.

The Lie superalgebra $gl_{\infty|\infty}$ has a central extension (by an even central element $C$), denoted from now on by $\widehat{gl}_{\infty|\infty}$, corresponding to the following two-cocycle:
\[
\alpha(A, B) = \mathrm{Str}([J, A]B), \qquad A, B \in gl_{\infty|\infty},
\]
where $J$ denotes the matrix $\sum_{r \le 0} e_{rr}$, and for a matrix $D = (d_{ij}) \in gl_{\infty|\infty}$, $\mathrm{Str}(D)$ stands for the supertrace of the matrix $D$, which here is given by $\sum_{r \in \frac12\mathbb{Z}} (-1)^{2r} d_{rr}$. We note that the expression $\alpha(A, B)$ is well-defined for $A, B \in gl_{\infty|\infty}$.

The Lie superalgebra $\widehat{gl}_{\infty|\infty}$ has a natural $\tfrac12\mathbb{Z}$-gradation by setting $\deg E_{ij} = j - i$, for $i, j \in \tfrac12\mathbb{Z}$. Thus we have the triangular decomposition
\[
\widehat{gl}_{\infty|\infty} = (\widehat{gl}_{\infty|\infty})_- \oplus (\widehat{gl}_{\infty|\infty})_0 \oplus (\widehat{gl}_{\infty|\infty})_+,
\]
where the subscripts $+$, $0$ and $-$ respectively denote the positive, zeroth and negative graded components. Thus we have a notion of a highest weight Verma module, which contains a unique irreducible quotient, which is determined by an element $\Lambda \in (\widehat{gl}_{\infty|\infty})_0^*$. We will denote this module by $L(\widehat{gl}_{\infty|\infty}, \Lambda)$. Let $\omega_s$, $s \in \tfrac12\mathbb{Z}$, denote the fundamental weights of $\widehat{gl}_{\infty|\infty}$. That is, $\omega_s(e_{rr}) = \delta_{sr}$, $r \in \tfrac12\mathbb{Z}$, and $\omega_s(C) = 0$. Furthermore let $\Lambda_0 \in (\widehat{gl}_{\infty|\infty})_0^*$ with $\Lambda_0(e_{rr}) = 0$ and $\Lambda_0(C) = 1$.

Note that by declaring the highest weight vectors to be of degree zero, the module $L(\widehat{gl}_{\infty|\infty}, \Lambda)$ is naturally $\tfrac12\mathbb{Z}$-graded, i.e.
\[
L(\widehat{gl}_{\infty|\infty}, \Lambda) = \bigoplus_{r \in \frac12\mathbb{Z}_+} L(\widehat{gl}_{\infty|\infty}, \Lambda)_r.
\]
The module $L(\widehat{gl}_{\infty|\infty}, \Lambda)$ is said to be quasi-finite [KR1] if $\dim L(\widehat{gl}_{\infty|\infty}, \Lambda)_r < \infty$, for all $r \in \tfrac12\mathbb{Z}_+$.

3. A Character Formula for $\widehat{gl}_{\infty|\infty}$-Modules

First we recall the notion of the hook Schur function of Berele-Regev [BR]. Let $x = \{x_1, x_2, \cdots\}$ be a countable set of variables. To a partition $\lambda$ of non-negative integers we may associate the Schur function $s_\lambda(x_1, x_2, \cdots)$. We will write $s_\lambda(x)$ for $s_\lambda(x_1, x_2, \cdots)$. For a partition $\mu \subset \lambda$ we let $s_{\lambda/\mu}(x)$ denote the corresponding skew Schur function. Denoting by $\mu'$ the conjugate partition of a partition $\mu$, the hook Schur function corresponding to a partition $\lambda$ is defined by
\[
HS_\lambda(x; y) := \sum_{\mu \subset \lambda} s_\mu(x)\, s_{\lambda'/\mu'}(y), \tag{3.1}
\]

where as usual y = {y1 , y2 , · · · }. Let λ be a partition and µ ⊆ λ. We fill the boxes in µ with entries from the linearly ordered set {x1 < x2 < · · · } so that the resulting tableau is semi-standard. Recall that this means that the rows are non-decreasing, while the columns are strictly increasing. Next we fill the skew partition λ/µ with entries from the linearly ordered set {y1 < y2 < · · · } so that it is conjugate semi-standard, which means that the rows are

100

S.-J. Cheng, N. Lam

strictly increasing, while its columns are non-decreasing. We will refer to such a tableau as an (∞|∞)-semi-standard tableau (cf. [BR]). To each such tableau T we may associate a polynomial (xy)T , which is obtained by taking the products of all the entries in T . Then we have [BR] (xy)T , (3.2) H Sλ (x; y) = T

where the summation is over all (∞|∞)-semi-standard tableaux of shape λ. We have the following combinatorial identity involving hook Schur functions that is crucial in the sequel. Proposition 3.1. Let x = {x1 , x2 , · · · }, y = {y1 , y2 , · · · } be two infinite countable sets of variables and z = {z1 , z2 , · · · , zm } be m variables. Then

$$\prod_{i,j,k}(1-x_iz_k)^{-1}(1+y_jz_k)=\sum_{\lambda}HS_\lambda(x;y)\,s_\lambda(z),\tag{3.3}$$

where 1 ≤ i, j < ∞, 1 ≤ k ≤ m and λ is summed over all partitions λ with length not exceeding m. Proof. Consider the classical Cauchy identity

$$\prod_{i,j,k}(1-x_iz_k)^{-1}(1-y_jz_k)^{-1}=\sum_{\lambda}s_\lambda(x,y)\,s_\lambda(z),\tag{3.4}$$

where $\lambda$ is summed over all partitions of length not exceeding $m$. Recall that for any partition $\lambda$ one has (cf. [M] (I.5.9))
$$s_\lambda(x,y)=\sum_{\mu\subset\lambda}s_\mu(x)\,s_{\lambda/\mu}(y).\tag{3.5}$$

Let $\omega$ denote the involution of the ring of symmetric functions, which sends the elementary symmetric functions to the complete symmetric functions, so that we have $\omega(s_\lambda(x))=s_{\lambda'}(x)$. Now applying $\omega$ to the set of variables $y$ in (3.4) we obtain together with (3.5)
$$\prod_{i,j,k}(1-x_iz_k)^{-1}(1+y_jz_k)=\sum_{\lambda}\Big(\sum_{\mu\subset\lambda}s_\mu(x)\,s_{\lambda'/\mu'}(y)\Big)s_\lambda(z)=\sum_{\lambda}HS_\lambda(x;y)\,s_\lambda(z),$$
as required.
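The tableau description (3.2) and the identity (3.3) can be spot-checked numerically for finite sets of variables. The following is only a sketch under stated assumptions: the function names `hook_schur`, `partitions` and `cauchy_sides` are hypothetical helpers (not from [BR]), the variable sets are finite, and the sum over $\lambda$ is truncated at a chosen size, so the two sides of (3.3) agree only up to a small truncation error. The ordinary Schur function $s_\lambda(z)$ is obtained as the hook Schur function with an empty set of $y$-variables.

```python
from fractions import Fraction


def hook_schur(shape, xs, ys):
    """Evaluate HS_shape(xs; ys) by summing over semi-standard tableaux as in (3.2).

    Letters are indexed so that every x-letter precedes every y-letter.
    """
    a = len(xs)
    letters = list(xs) + list(ys)
    cells = [(r, c) for r, row_len in enumerate(shape) for c in range(row_len)]
    filling = {}

    def fill(k):
        if k == len(cells):
            return 1
        r, c = cells[k]
        left = filling.get((r, c - 1))
        up = filling.get((r - 1, c))
        total = 0
        for v in range(len(letters)):
            is_x = v < a
            # rows: non-decreasing in x-letters, strictly increasing in y-letters
            if left is not None and (left > v or (left == v and not is_x)):
                continue
            # columns: strictly increasing in x-letters, non-decreasing in y-letters
            if up is not None and (up > v or (up == v and is_x)):
                continue
            filling[(r, c)] = v
            total += letters[v] * fill(k + 1)
            del filling[(r, c)]
        return total

    return fill(0)


def partitions(n, max_parts, max_part=None):
    """Yield the partitions of n with at most max_parts parts, as tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest


def cauchy_sides(xs, ys, zs, max_size):
    """Return (left side of (3.3), right side truncated to |lambda| <= max_size)."""
    lhs = Fraction(1)
    for z in zs:
        for x in xs:
            lhs /= (1 - x * z)
        for y in ys:
            lhs *= (1 + y * z)
    rhs = sum(
        hook_schur(lam, xs, ys) * hook_schur(lam, zs, [])  # s_lam(z) = HS_lam(z; empty)
        for n in range(max_size + 1)
        for lam in partitions(n, len(zs))
    )
    return lhs, rhs
```

With small rational values for the variables, the difference between the two returned values shrinks as `max_size` grows, which is the behaviour one expects from (3.3).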

We note that Proposition 3.1 in the case when the sets of variables are all finite sets follows from the Howe duality ([H1, H2]) involving a general linear Lie superalgebra and a general linear Lie algebra described in [CW1]. Since we will need this result in the case when both algebras involved are Lie algebras later on we will recall it here.


Proposition 3.2. [H2] The Lie algebras $gl_d$ and $gl_m$ with their natural actions on $S(\mathbb{C}^d\otimes\mathbb{C}^m)$ form a dual pair. With respect to their joint action we have the following decomposition:
$$S(\mathbb{C}^d\otimes\mathbb{C}^m)\cong\sum_{\lambda}V_d^\lambda\otimes V_m^\lambda,$$
where the summation is over all partitions with length not exceeding $\min(d,m)$.

Below we will recall the $\widehat{gl}_{\infty|\infty}\times gl_l$ duality of [CW3]. Consider $l$ pairs of free fermions $\psi^{\pm,i}(z)$ and $l$ pairs of free bosons $\gamma^{\pm,i}(z)$ with $i=1,\cdots,l$. That is, we have
$$\psi^{+,i}(z)=\sum_{n\in\mathbb{Z}}\psi^{+,i}_nz^{-n-1},\qquad \psi^{-,i}(z)=\sum_{n\in\mathbb{Z}}\psi^{-,i}_nz^{-n},$$
$$\gamma^{+,i}(z)=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{+,i}_rz^{-r-1/2},\qquad \gamma^{-,i}(z)=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{-,i}_rz^{-r-1/2}$$
with non-trivial commutation relations $[\psi^{+,i}_m,\psi^{-,j}_n]=\delta_{ij}\delta_{m+n,0}$ and $[\gamma^{+,i}_r,\gamma^{-,j}_s]=\delta_{ij}\delta_{r+s,0}$. Let $F$ denote the corresponding Fock space generated by the vacuum vector $|0\rangle$. That is, $\psi^{+,i}_n|0\rangle=\psi^{-,i}_m|0\rangle=\gamma^{\pm,i}_r|0\rangle=0$, for $n\ge0$, $m>0$ and $r>0$. These operators are called annihilation operators.

Explicitly we have an action of $\widehat{gl}_{\infty|\infty}$ of central charge $l$ on $F$ given by ($i,j\in\mathbb{Z}$ and $r,s\in\frac12+\mathbb{Z}$)

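$$e_{ij}=\sum_{p=1}^{l}:\psi^{+,p}_{-i}\psi^{-,p}_{j}:,\qquad e_{rs}=-\sum_{p=1}^{l}:\gamma^{+,p}_{-r}\gamma^{-,p}_{s}:,$$
$$e_{is}=\sum_{p=1}^{l}:\psi^{+,p}_{-i}\gamma^{-,p}_{s}:,\qquad e_{rj}=-\sum_{p=1}^{l}:\gamma^{+,p}_{-r}\psi^{-,p}_{j}:.$$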

An action of $gl_l$ on $F$ is given by the formula
$$E_{ij}=\sum_{n\in\mathbb{Z}}:\psi^{+,i}_{-n}\psi^{-,j}_{n}:-\sum_{r\in1/2+\mathbb{Z}}:\gamma^{+,i}_{-r}\gamma^{-,j}_{r}:.$$

Here and further $:\ :$ denotes the normal ordering of operators. That is, if $A$ and $B$ are two operators, then $:AB:\,=AB$, if $B$ is an annihilation operator, while $:AB:\,=(-1)^{p(A)p(B)}BA$, otherwise. As usual, $p(X)$ denotes the parity of the operator $X$.


Before stating the duality of [CW3] we need some more notation. For $j\in\mathbb{Z}_+$ we define the matrices $X^{-j}$ as follows:
$$X^{0}=\begin{pmatrix}\psi_0^{-,l}&\psi_0^{-,l-1}&\cdots&\psi_0^{-,1}\\ \psi_0^{-,l}&\psi_0^{-,l-1}&\cdots&\psi_0^{-,1}\\ \vdots&\vdots&\cdots&\vdots\\ \psi_0^{-,l}&\psi_0^{-,l-1}&\cdots&\psi_0^{-,1}\end{pmatrix},\qquad
X^{-1}=\begin{pmatrix}\gamma_{-\frac12}^{-,l}&\gamma_{-\frac12}^{-,l-1}&\cdots&\gamma_{-\frac12}^{-,1}\\ \psi_{-1}^{-,l}&\psi_{-1}^{-,l-1}&\cdots&\psi_{-1}^{-,1}\\ \vdots&\vdots&\cdots&\vdots\\ \psi_{-1}^{-,l}&\psi_{-1}^{-,l-1}&\cdots&\psi_{-1}^{-,1}\end{pmatrix},$$
$$X^{-2}=\begin{pmatrix}\gamma_{-\frac12}^{-,l}&\gamma_{-\frac12}^{-,l-1}&\cdots&\gamma_{-\frac12}^{-,1}\\ \gamma_{-\frac32}^{-,l}&\gamma_{-\frac32}^{-,l-1}&\cdots&\gamma_{-\frac32}^{-,1}\\ \psi_{-2}^{-,l}&\psi_{-2}^{-,l-1}&\cdots&\psi_{-2}^{-,1}\\ \vdots&\vdots&\cdots&\vdots\\ \psi_{-2}^{-,l}&\psi_{-2}^{-,l-1}&\cdots&\psi_{-2}^{-,1}\end{pmatrix},\qquad\cdots,$$
$$X^{-k}\equiv X^{-l}=\begin{pmatrix}\gamma_{-\frac12}^{-,l}&\gamma_{-\frac12}^{-,l-1}&\cdots&\gamma_{-\frac12}^{-,1}\\ \gamma_{-\frac32}^{-,l}&\gamma_{-\frac32}^{-,l-1}&\cdots&\gamma_{-\frac32}^{-,1}\\ \vdots&\vdots&\cdots&\vdots\\ \gamma_{-l+\frac12}^{-,l}&\gamma_{-l+\frac12}^{-,l-1}&\cdots&\gamma_{-l+\frac12}^{-,1}\end{pmatrix},\qquad k\ge l.$$

The matrices $X^{j}$, for $j\in\mathbb{N}$, are defined similarly. Namely, $X^{j}$ is obtained from $X^{-j}$ by replacing $\psi_i^{-,k}$ by $\psi_i^{+,l-k+1}$ and $\gamma_r^{-,k}$ by $\gamma_r^{+,l-k+1}$. For $0\le r\le l$, we let $X^i_r$ ($i\ge0$) denote the first $r\times r$ minor of the matrix $X^i$ and let $X^i_{-r}$ ($i<0$) denote the first $r\times r$ minor of the matrix $X^i$.

Consider a generalized partition $\lambda=(\lambda_1,\lambda_2,\cdots,\lambda_p)$ of length not exceeding $l$ with
$$\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_i>\lambda_{i+1}=0=\cdots=\lambda_{j-1}>\lambda_j\ge\cdots\ge\lambda_l.$$
Now the irreducible rational representations of $GL_l$ are parameterized by generalized partitions, hence these may be interpreted as highest weights of irreducible representations of $GL_l$. We denote the corresponding finite-dimensional highest weight irreducible $GL_l$- (or $gl_l$-) module by $V_l^\lambda$. Let $\lambda'_j$ be the length of the $j$th column of $\lambda$. We use the convention that the first column of $\lambda$ is the first column of the partition $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_i$.


The column to the right is the second column of $\lambda$, while the column to the left of it is the zeroth column and the column to the left of the zeroth column is the $-1$st column. We also use the convention that a non-positive column has non-positive length. As an example consider $\lambda=(5,3,2,1,-1,-2)$ with $l(\lambda)=6$. We have $\lambda'_{-1}=-1$, $\lambda'_0=-2$, $\lambda'_1=4$, etc. (see (3.6)).

[Diagram of the generalized partition $\lambda=(5,3,2,1,-1,-2)$]    (3.6)
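The column-length convention for generalized partitions can be made concrete with a short computation. The sketch below is illustrative only: the function name `column_length` is a hypothetical helper, and the formula for non-positive columns encodes the stated convention that such columns have non-positive length (column $j\le0$ is occupied by the rows with $\lambda_i<j$).

```python
def column_length(lam, j):
    """Length of the j-th column of a generalized partition lam (a tuple of integers).

    Columns j >= 1 belong to the positive parts and have non-negative length;
    columns j <= 0 belong to the negative parts and have non-positive length.
    """
    if j >= 1:
        return sum(1 for part in lam if part >= j)
    return -sum(1 for part in lam if part < j)


example = (5, 3, 2, 1, -1, -2)
columns = {j: column_length(example, j) for j in range(-2, 6)}
```

For the example in the text this reproduces the three column lengths quoted there; columns outside the diagram have length zero.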

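For $\Lambda\in(\widehat{gl}_{\infty|\infty})_0^*$, we set $\Lambda_s=\Lambda(e_{ss})$, for $s\in\frac12\mathbb{Z}$. Given a generalized partition $\lambda$ with $l(\lambda)\le l$, we define $\Lambda(\lambda)\in(\widehat{gl}_{\infty|\infty})_0^*$ by:
$$\Lambda(\lambda)_i=\langle\lambda'_i-i\rangle,\qquad i\in\mathbb{N},$$
$$\Lambda(\lambda)_j=-\langle-\lambda'_j+j\rangle,\qquad j\in-\mathbb{Z}_+,$$
$$\Lambda(\lambda)_r=\langle\lambda_{r+1/2}-(r-1/2)\rangle,\qquad r\in\tfrac12+\mathbb{Z}_+,$$
$$\Lambda(\lambda)_s=-\langle-\lambda_{p+(s+1/2)}+(s-1/2)\rangle,\qquad s\in-\tfrac12-\mathbb{Z}_+,$$
$$\Lambda(\lambda)(C)=l.$$
Here for an integer $k$ the expression $\langle k\rangle\equiv k$, if $k>0$, and $\langle k\rangle\equiv0$, otherwise. We have the following theorem.

Theorem 3.1. [CW3] The Lie superalgebra $\widehat{gl}_{\infty|\infty}$ and $gl_l$ form a dual pair on $F$ in the sense of Howe. Furthermore we have the following (multiplicity-free) decomposition of $F$ with respect to their joint action
$$F\cong\sum_{\lambda}L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))\otimes V_l^\lambda,$$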

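where the summation is over all generalized partitions of length not exceeding $l$. Furthermore, the joint highest weight vector of the $\lambda$-component is given by
$$\det X^{\lambda_l+1}_{\lambda'_{\lambda_l+1}}\cdots\det X^{-1}_{\lambda'_{-1}}\cdot\det X^{0}_{\lambda'_{0}}\cdot\det X^{1}_{\lambda'_{1}}\cdot\det X^{2}_{\lambda'_{2}}\cdots\det X^{\lambda_1}_{\lambda'_{\lambda_1}}|0\rangle.$$

We compute for $i\in\mathbb{Z}$, $r\in\frac12+\mathbb{Z}$,
$$[e_{ii},\psi^{+,p}_{-n}]=\delta_{in}\psi^{+,p}_{-n},$$
$$[e_{ii},\psi^{-,p}_{-n}]=-\delta_{-i,n}\psi^{-,p}_{-n},$$
$$[e_{rr},\psi^{\pm,p}_{-n}]=[e_{ii},\gamma^{\pm,p}_{-r}]=0,$$
$$[e_{rr},\gamma^{+,p}_{-s}]=\delta_{rs}\gamma^{+,p}_{-s},$$
$$[e_{rr},\gamma^{-,p}_{-s}]=-\delta_{-r,s}\gamma^{-,p}_{-s}.$$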


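Furthermore for $i=1,\cdots,l$ we have
$$[E_{ii},\psi^{+,p}_{-n}]=\delta_{ip}\psi^{+,p}_{-n},\qquad[E_{ii},\psi^{-,p}_{-n}]=-\delta_{ip}\psi^{-,p}_{-n},$$
$$[E_{ii},\gamma^{+,p}_{-r}]=\delta_{ip}\gamma^{+,p}_{-r},\qquad[E_{ii},\gamma^{-,p}_{-r}]=-\delta_{ip}\gamma^{-,p}_{-r}.$$
Let $e$ be a formal indeterminate and set for $j\in\mathbb{Z}$, $r\in\frac12+\mathbb{Z}$, $i=1,\cdots,l$,
$$x_i=e^{\epsilon_i},\qquad y_j=e^{\omega_j},\qquad z_r=e^{\omega_r},$$
where $\epsilon_1,\cdots,\epsilon_l$ and $\omega_s$ are the respective fundamental weights of $gl_l$ and $\widehat{gl}_{\infty|\infty}$ introduced earlier. It is easy to see that the character of $F$, with respect to the abelian algebra $\sum_{s\in\frac12\mathbb{Z}}\mathbb{C}e_{ss}\oplus\sum_{i=1}^{l}\mathbb{C}E_{ii}$, is given by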

$$\mathrm{ch}\,F=\prod_{i=1}^{l}\frac{\prod_{n\in\mathbb{N}}(1+x_iy_n)\prod_{m\in\mathbb{Z}_+}(1+x_i^{-1}y_{-m}^{-1})}{\prod_{r\in1/2+\mathbb{Z}_+}(1-x_iz_r)(1-x_i^{-1}z_{-r}^{-1})}.\tag{3.7}$$

By Proposition 3.1 we can rewrite (3.7) as
$$\mathrm{ch}\,F=\sum_{\mu,\nu}HS_\mu(z;y)\,HS_\nu(z^{-1};y^{-1})\,s_\mu(x)\,s_\nu(x^{-1}),\tag{3.8}$$
where $\mu$ and $\nu$ are summed over all partitions of length not exceeding $l$. Here we use the notation $y=\{y_1,y_2,\cdots\}$, $y^{-1}=\{y_0^{-1},y_{-1}^{-1},\cdots\}$, $z=\{z_{\frac12},z_{\frac32},\cdots\}$, $z^{-1}=\{z_{-\frac12}^{-1},z_{-\frac32}^{-1},\cdots\}$, $x=\{x_1,x_2,\cdots,x_l\}$, and $x^{-1}=\{x_1^{-1},x_2^{-1},\cdots,x_l^{-1}\}$.

It is clear that $s_\nu(x^{-1})$ is just the character of the $gl_l$-module $(V_l^\nu)^*$, the module contragredient to $V_l^\nu$. Therefore we have
$$s_\mu(x)\,s_\nu(x^{-1})=\sum_{\lambda}c^\lambda_{\mu\nu}\,\mathrm{ch}\,V_l^\lambda,\tag{3.9}$$
where the summation now is over generalized partitions of length not exceeding $l$. Here the non-negative integers $c^\lambda_{\mu\nu}$ are of course just the multiplicities of $V_l^\lambda$ in the tensor product decomposition of $V_l^\mu\otimes(V_l^\nu)^*$. This combined with (3.8) allows us to write the character of $F$ as
$$\mathrm{ch}\,F=\sum_{\lambda}\Big(\sum_{\mu,\nu}c^\lambda_{\mu\nu}\,HS_\mu(z;y)\,HS_\nu(z^{-1};y^{-1})\Big)\mathrm{ch}\,V_l^\lambda.\tag{3.10}$$
On the other hand Theorem 3.1 implies that
$$\mathrm{ch}\,F=\sum_{\lambda}\mathrm{ch}\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))\,\mathrm{ch}\,V_l^\lambda.\tag{3.11}$$
Using (3.10) together with (3.11) we can prove the following character formula:

Theorem 3.2. We have
$$\mathrm{ch}\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))=\sum_{\mu,\nu}c^\lambda_{\mu\nu}\,HS_\mu(z;y)\,HS_\nu(z^{-1};y^{-1}),$$
where the summation is over all partitions $\mu$ and $\nu$ of length not exceeding $l$ and $c^\lambda_{\mu\nu}$ are determined by the tensor product decomposition $V_l^\mu\otimes(V_l^\nu)^*\cong\sum_\lambda c^\lambda_{\mu\nu}V_l^\lambda$.

Proof. The statement of the theorem would follow from the linear independence of the Schur functions in the ring of symmetric functions, if the summation in (3.10) and (3.11) were over partitions $\lambda$ of length not exceeding $l$. However, here we need to deal with summation over generalized partitions $\lambda$. The required independence in this setting follows from the following lemma.

Lemma 3.1. Let $q$ be an indeterminate and suppose that $\sum_\lambda\phi_\lambda(q)\,\mathrm{ch}\,V_l^\lambda=0$, where $\phi_\lambda(q)$ are power series in $q$ and $\lambda$ above is summed over all generalized partitions of length not exceeding $l$. Then $\phi_\lambda(q)=0$, for all $\lambda$.

Proof. We continue to use the notation from above. We let $p_l=(x_1x_2\cdots x_l)^{\frac1l}$ and $p_1=(x_1x_2^{-1})^{\frac1l}$, $p_2=(x_2x_3^{-1})^{\frac1l}$, $\cdots$, $p_{l-1}=(x_{l-1}x_l^{-1})^{\frac1l}$. Then $\mathrm{ch}\,V_l^\lambda$ may be written as
$$\mathrm{ch}\,V_l^\lambda=p_l^{\sum_{i=1}^{l}\lambda_i}\,\rho_\lambda(p_1,\cdots,p_{l-1}),\tag{3.12}$$
where $\rho_\lambda(p_1,\cdots,p_{l-1})$ are Laurent polynomials in $p_1,\cdots,p_{l-1}$. The Laurent polynomial $\rho_\lambda(p_1,\cdots,p_{l-1})$ here is of course just the corresponding character of the irreducible $sl_l$-module. We need to show that if we have $\sum_\lambda\phi_\lambda(q)\,\mathrm{ch}\,V_l^\lambda=0$, then $\phi_\lambda(q)=0$ for all $\lambda$. Using (3.12), by considering just the coefficient of $p_l^m$, $m\in\mathbb{Z}$, we deduce that
$$\sum_{\lambda,\ \sum\lambda_i=m}\phi_\lambda(q)\,\rho_\lambda(p_1,\cdots,p_{l-1})=0.\tag{3.13}$$
Now it is clear that $\sum_{i=1}^{l}\lambda_i$ and $\rho_\lambda(p_1,\cdots,p_{l-1})$ uniquely determine $\lambda$. Hence if we sum over generalized partitions $\lambda$ with $\sum_{i=1}^{l}\lambda_i$ fixed, $\rho_\lambda$ is summed over inequivalent irreducible finite-dimensional $sl_l$-characters. Thus by the Weyl character formula we may write
$$\rho_\lambda(p_1,\cdots,p_{l-1})=\frac{\sum_{w\in W}\epsilon_w\,e^{w(\bar\lambda+\rho)}}{\prod_{\alpha\in\Delta_+}(e^{\alpha/2}-e^{-\alpha/2})},$$
where $\bar\lambda$ is the corresponding $sl_l$-highest weight of $\lambda$, $W$ is the Weyl group, $\epsilon_w$ is the sign of $w\in W$, $\Delta_+$ is a set of positive roots and $\rho=\frac12\sum_{\alpha\in\Delta_+}\alpha$. Multiplying (3.13) by $\prod_{\alpha\in\Delta_+}(e^{\alpha/2}-e^{-\alpha/2})$ and using the Weyl character formula we get
$$\sum_{\lambda}\phi_\lambda(q)\sum_{w\in W}\epsilon_w\,e^{w(\bar\lambda+\rho)}=0.\tag{3.14}$$
As $\bar\lambda+\rho$ is a regular dominant weight, the coefficient of $e^{\bar\lambda+\rho}$ in (3.14) above is $\phi_\lambda(q)$. Thus $\phi_\lambda(q)=0$.


We conclude this section by applying the character formula to the case of $l=1$, that is when the central charge is 1. In this case $\mu$ and $\nu$ are non-negative integers. Furthermore, $\Lambda(\lambda)=\lambda\omega_{\frac12}+\Lambda_0$, for $\lambda\ge0$, and $\Lambda(\lambda)=-\omega_0+(\lambda+1)\omega_{-\frac12}+\Lambda_0$, for $\lambda<0$. Since obviously $V_1^\mu\otimes(V_1^\nu)^*=V_1^{\mu-\nu}$, we see that
$$\mathrm{ch}\,L(\widehat{gl}_{\infty|\infty},\lambda\omega_{\frac12}+\Lambda_0)=\sum_{\mu-\nu=\lambda}HS_\mu(z;y)\,HS_\nu(z^{-1};y^{-1}),\qquad\lambda\ge0,$$
$$\mathrm{ch}\,L(\widehat{gl}_{\infty|\infty},-\omega_0+(\lambda+1)\omega_{-\frac12}+\Lambda_0)=\sum_{\mu-\nu=\lambda}HS_\mu(z;y)\,HS_\nu(z^{-1};y^{-1}),\qquad\lambda<0,\tag{3.15}$$
where the summation is over all non-negative integers $\mu$ and $\nu$, regarded as one-row partitions.

4. A $q$-Character Formula for $\widehat{gl}_{\infty|\infty}$-Modules

We can obtain a $q$-character formula for the irreducible highest weight module $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ from Theorem 3.2. The resulting character formula will involve a sum of products of hook Schur functions. However we can use Proposition 3.1 and Lemma 3.1 to obtain a simpler formula that will only involve a sum of hook Schur functions. This we will discuss below.

Since $\widehat{gl}_{\infty|\infty}$ has a principal $\frac12\mathbb{Z}$-gradation, by declaring the highest weight vectors to be of degree 0, its irreducible highest weight modules are naturally $\frac12\mathbb{Z}$-graded. Given a quasi-finite highest weight module $V=\bigoplus_{s\in\frac12\mathbb{Z}_+}V_s$, we can define the $q$-character of $V$ to be
$$\mathrm{ch}_q\,V=\sum_{s\in\frac12\mathbb{Z}_+}\dim V_s\,q^s.$$

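It is known [CW3] that $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ is quasi-finite, so that we may compute its $q$-character. We first introduce the Virasoro field:
$$L(z)=\sum_{n\in\mathbb{Z}}L_nz^{-n-2}=\frac12\sum_{i=1}^{l}\big(:\partial\psi^{+,i}(z)\psi^{-,i}(z):-:\psi^{+,i}(z)\partial\psi^{-,i}(z):\big)+\frac12\sum_{i=1}^{l}\big(:\gamma^{+,i}(z)\partial\gamma^{-,i}(z):-:\partial\gamma^{+,i}(z)\gamma^{-,i}(z):\big).\tag{4.1}$$
Now set $\tilde d=-L_0-\frac12\alpha_0$, where $\alpha_0=\sum_{i=1}^{l}\big(\sum_{n\in\mathbb{Z}}:\psi^{+,i}_{-n}\psi^{-,i}_{n}:\big)$. We have
$$[\tilde d,\psi^{\pm,i}_n]=n\psi^{\pm,i}_n,\qquad[\tilde d,\gamma^{\pm,i}_r]=r\gamma^{\pm,i}_r.$$
It is clear that $\tilde d$ commutes with $\sum_{i=1}^{l}\mathbb{C}E_{ii}$ so that we may decompose the Fock space $F$ into its $\mathbb{C}\tilde d+\sum_{i=1}^{l}\mathbb{C}E_{ii}$-weight spaces. Let $\mathrm{ch}_q\,F$ denote the resulting character. Letting $x_i=e^{\epsilon_i}$ be as before and $e^{-\delta}=q$ with $\delta$ defined by $\delta(E_{ii})=0$ and $\delta(\tilde d)=1$ we have
$$\mathrm{ch}_q\,F=\prod_{i=1}^{l}\Big[(1+x_i^{-1})\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^nx_i)(1+q^nx_i^{-1})}{(1-q^rx_i)(1-q^rx_i^{-1})}\Big].$$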


By Proposition 3.1 we have
$$\prod_{i=1}^{l}\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^nx_i)(1+q^nx_i^{-1})}{(1-q^rx_i)(1-q^rx_i^{-1})}=\sum_{\mu}HS_\mu(q_r;q_n)\,s_\mu(x,x^{-1}),\tag{4.2}$$
where $q_r=\{q^{\frac12},q^{\frac32},\cdots\}$, $q_n=\{q,q^2,\cdots\}$. Here the summation of $\mu$ is over all partitions of length not exceeding $2l$, and $s_\mu(x,x^{-1})=s_\mu(x_1,\cdots,x_l,x_1^{-1},\cdots,x_l^{-1})$. The expression $s_\mu(x,x^{-1})$ has a simple interpretation. Consider the embedding of the Lie algebra $gl_l$ into $gl_{2l}$ given by
$$A\in gl_l\ \longrightarrow\ \begin{pmatrix}A&0\\0&-A^t\end{pmatrix}\in gl_{2l}.$$
The $gl_{2l}$-module $V_{2l}^\mu$ has $gl_{2l}$-character $s_\mu(x_1,x_2,\cdots,x_{2l})$. We may restrict $V_{2l}^\mu$ via the embedding above to a $gl_l$-module. Then its $gl_l$-character is given by $s_\mu(x,x^{-1})$. Of course the product $\prod_{i=1}^{l}(1+x_i^{-1})$ is nothing but the character of $\Lambda^\bullet(\mathbb{C}^{l*})$, the exterior algebra of the $gl_l$-module contragredient to the standard module. We may decompose $V_{2l}^\mu\otimes\Lambda^\bullet(\mathbb{C}^{l*})$ as a $gl_l$-module and obtain
$$V_{2l}^\mu\otimes\Lambda^\bullet(\mathbb{C}^{l*})\cong\sum_{\lambda}c^\mu_\lambda V_l^\lambda,$$

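where $\lambda$ are now generalized partitions of lengths not exceeding $l$ and $c^\mu_\lambda$ are the multiplicities of this tensor product decomposition. Therefore we have the corresponding character identity
$$s_\mu(x,x^{-1})\prod_{i=1}^{l}(1+x_i^{-1})=\sum_{\lambda}c^\mu_\lambda\,\mathrm{ch}\,V_l^\lambda.\tag{4.3}$$
Using (4.2) and (4.3) we obtain
$$\mathrm{ch}_q\,F=\sum_{\lambda}\Big(\sum_{\mu}c^\mu_\lambda\,HS_\mu(q_r;q_n)\Big)\mathrm{ch}\,V_l^\lambda,\tag{4.4}$$
where $\lambda$ is summed over all generalized partitions of length not exceeding $l$ and $\mu$ over all partitions of length not exceeding $2l$. On the other hand by Theorem 3.1, using the explicit formula of the joint highest weight vectors, we see that
$$\mathrm{ch}_q\,F=\sum_{\lambda}q^{h(\lambda)}\,\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))\,\mathrm{ch}\,V_l^\lambda,\tag{4.5}$$
where $h(\lambda)=\sum_{r\in\frac12\mathbb{Z}}r\Lambda(\lambda)_r$. Combining (4.4) with (4.5), using Lemma 3.1, we have thus proved the following

Theorem 4.1.
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))=q^{-h(\lambda)}\sum_{\mu}c^\mu_\lambda\,HS_\mu(q_r;q_n),$$
where the sum is over all partitions $\mu$ of length not exceeding $2l$ and the coefficients $c^\mu_\lambda$ are determined by (4.3).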


We consider the simplest case when $l=1$ and let $x=x_1$. In this case it is clear that $s_\mu(x,x^{-1})$ is just the character of the corresponding $sl_2$-module, and hence for a partition $\mu=(\mu_1,\mu_2)$ with $\mu_1\ge\mu_2\ge0$ we have $s_\mu(x,x^{-1})=x^{\mu_1-\mu_2}+x^{\mu_1-\mu_2-2}+\cdots+x^{\mu_2-\mu_1}$. Hence
$$s_\mu(x,x^{-1})(1+x^{-1})=x^{\mu_1-\mu_2}+x^{\mu_1-\mu_2-1}+\cdots+x^{\mu_2-\mu_1-1}.$$
Therefore we see that for $\lambda\in\mathbb{Z}$,
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\lambda\omega_{\frac12}+\Lambda_0)=q^{-\frac{\lambda}{2}}\sum_{\mu_1-\mu_2\ge\lambda}HS_\mu(q_r;q_n),\qquad\lambda\ge0,$$
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},-\omega_0+(\lambda+1)\omega_{-\frac12}+\Lambda_0)=q^{\frac{\lambda+1}{2}}\sum_{\mu_2-\mu_1-1\le\lambda}HS_\mu(q_r;q_n),\qquad\lambda<0.$$
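The passage from the $sl_2$-character to the summation ranges $\mu_1-\mu_2\ge\lambda$ and $\mu_2-\mu_1-1\le\lambda$ can be checked by expanding the Laurent polynomial directly. This is a small sketch for $l=1$ only; the function names `restricted_character` and `multiplicity` are hypothetical, and exponents of $x$ are used as dictionary keys.

```python
from collections import Counter


def restricted_character(mu1, mu2):
    """Exponent -> coefficient of s_mu(x, 1/x) * (1 + 1/x) for mu = (mu1, mu2)."""
    d = mu1 - mu2
    # s_mu(x, 1/x) = x^d + x^(d-2) + ... + x^(-d)
    s_mu = Counter({e: 1 for e in range(-d, d + 1, 2)})
    result = Counter()
    for exponent, coeff in s_mu.items():
        result[exponent] += coeff        # contribution of the factor 1
        result[exponent - 1] += coeff    # contribution of the factor 1/x
    return result


def multiplicity(mu1, mu2, lam):
    """Multiplicity of x^lam, i.e. of the gl_1-module V_1^lam, in the product."""
    return restricted_character(mu1, mu2)[lam]
```

Every exponent between $\mu_2-\mu_1-1$ and $\mu_1-\mu_2$ appears exactly once, which is what produces the two summation conditions, one for $\lambda\ge0$ and one for $\lambda<0$.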

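Note that the coefficient of $q^s$, $s\in\frac12\mathbb{Z}_+$, in $HS_\mu(q_r;q_n)$ can be computed as follows. Arrange $q_r=\{q^{\frac12}<q^{\frac32}<\cdots\}$ and $q_n=\{q<q^2<\cdots\}$ in increasing order. Let $T$ be an $(\infty|\infty)$-semi-standard tableau of shape $\mu$. Let $q^{m(T)}$ denote the product of all entries in $T$ so that $m(T)\in\frac12\mathbb{Z}_+$. Then for a fixed $s\in\frac12\mathbb{Z}_+$ the coefficient of $q^s$ in $HS_\mu(q_r;q_n)$ is the number of $(\infty|\infty)$-semi-standard tableaux of shape $\mu$ with $m(T)=s$. Hence we have the formula
$$HS_\mu(q_r;q_n)=\sum_{s\in\frac12\mathbb{Z}_+}\Big(\sum_{m(T)=s}1\Big)q^s.$$
For example if $\mu$ is the partition $(2,0)$, then $HS_{(2,0)}(q_r;q_n)=q+q^{\frac32}+q^{2}+2q^{\frac52}+\cdots$; for instance, the only tableau with $m(T)=2$ has entries $q^{\frac12}$, $q^{\frac32}$, since a row may not repeat the entry $q$ from $q_n$. Now by [KL],
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\lambda\omega_{\frac12}+\Lambda_0)=(1+q^{\lambda+\frac12})^{-1}\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^r)^2}{(1-q^n)^2},\qquad\lambda\ge0,$$
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},-\omega_0+(\lambda+1)\omega_{-\frac12}+\Lambda_0)=(1+q^{-\lambda-\frac12})^{-1}\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^r)^2}{(1-q^n)^2},\qquad\lambda<0.$$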
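The tableau count for the coefficients of $HS_\mu(q_r;q_n)$ lends itself to a direct enumeration, and comparing the result with the product formula quoted from [KL] gives a low-order consistency check. The sketch below makes several assumptions: it works in the variable $t=q^{1/2}$ so that all exponents are integers, it handles only $l=1$ and $\lambda\ge0$, it truncates all series at a chosen degree, and the function names (`hs_coeffs`, `tableau_side`, `product_side`, `series_mul`) are hypothetical.

```python
def hs_coeffs(shape, max_deg):
    """Coefficients of t^0..t^max_deg (t = q^(1/2)) in HS_shape(q_r; q_n).

    A letter is a pair (kind, weight): kind 0 is the entry q^(weight/2) of q_r
    (odd weight), kind 1 is the entry q^(weight/2) of q_n (even weight).
    Tuples compare with all q_r-letters before all q_n-letters.
    """
    letters = [(0, w) for w in range(1, max_deg + 1, 2)]
    letters += [(1, w) for w in range(2, max_deg + 1, 2)]
    cells = [(r, c) for r, n in enumerate(shape) for c in range(n)]
    coeffs = [0] * (max_deg + 1)
    filling = {}

    def fill(k, weight):
        if k == len(cells):
            coeffs[weight] += 1
            return
        r, c = cells[k]
        left = filling.get((r, c - 1))
        up = filling.get((r - 1, c))
        for v in letters:
            if weight + v[1] > max_deg:
                continue
            from_qr = v[0] == 0
            # rows: non-decreasing in q_r-letters, strictly increasing in q_n-letters
            if left is not None and (left > v or (left == v and not from_qr)):
                continue
            # columns: strictly increasing in q_r-letters, non-decreasing in q_n-letters
            if up is not None and (up > v or (up == v and from_qr)):
                continue
            filling[(r, c)] = v
            fill(k + 1, weight + v[1])
            del filling[(r, c)]

    fill(0, 0)
    return coeffs


def tableau_side(lam, max_deg):
    """Truncated sum of HS_mu(q_r; q_n) over mu = (mu1, mu2) with mu1 - mu2 >= lam >= 0."""
    total = [0] * (max_deg + 1)
    for mu1 in range(max_deg + 1):
        for mu2 in range(min(mu1, max_deg - mu1) + 1):
            if mu1 - mu2 >= lam:
                for d, c in enumerate(hs_coeffs((mu1, mu2), max_deg)):
                    total[d] += c
    return total


def series_mul(a, b, max_deg):
    """Product of two truncated power series given as coefficient lists."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > max_deg:
                    break
                out[i + j] += ai * bj
    return out


def product_side(lam, max_deg):
    """Truncated q^(lam/2) (1 + q^(lam+1/2))^(-1) prod (1+q^r)^2 / (1-q^n)^2, lam >= 0."""
    step = 2 * lam + 1
    series = [0] * (max_deg + 1)
    for k in range(max_deg // step + 1):
        series[k * step] = (-1) ** k           # geometric series for 1/(1 + t^step)
    for w in range(1, max_deg + 1):
        factor = [0] * (max_deg + 1)
        if w % 2:                              # (1 + t^w)^2 with w odd
            factor[0] = 1
            factor[w] += 2
            if 2 * w <= max_deg:
                factor[2 * w] += 1
        else:                                  # (1 - t^w)^(-2) with w even
            for j in range(max_deg // w + 1):
                factor[j * w] = j + 1
        series = series_mul(series, factor, max_deg)
    return ([0] * lam + series)[: max_deg + 1]
```

Calling `tableau_side(lam, D)` and `product_side(lam, D)` with the same small arguments and comparing the two lists is a quick way to see the identities of this section at work; the enumeration is exponential in the size of the shape, so this is only practical for low degrees.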

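Therefore we obtain the following interesting combinatorial identities.

Theorem 4.2. For $\lambda\in\mathbb{Z}$ and $\mu=(\mu_1,\mu_2)$ a partition of length at most 2 we have
$$\sum_{\mu_1-\mu_2\ge\lambda}HS_\mu(q_r;q_n)=q^{\frac{\lambda}{2}}(1+q^{\lambda+\frac12})^{-1}\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^r)^2}{(1-q^n)^2},\qquad\lambda\ge0,$$
$$\sum_{\mu_2-\mu_1-1\le\lambda}HS_\mu(q_r;q_n)=q^{-\frac{\lambda+1}{2}}(1+q^{-\lambda-\frac12})^{-1}\prod_{n\in\mathbb{N},\,r\in\frac12+\mathbb{Z}_+}\frac{(1+q^r)^2}{(1-q^n)^2},\qquad\lambda<0.$$

First we note that $\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\Lambda_0)=\sum_{\mu}HS_\mu(q_r;q_n)$, where the summation is over all $\mu$. Also we have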


$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\lambda\omega_{\frac12}+\Lambda_0)=\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},-\omega_0-\lambda\omega_{-\frac12}+\Lambda_0),\tag{4.6}$$
for all $\lambda\in\mathbb{Z}_+$. We have a similar $q$-character identity for higher level representations as well. In fact one has the following proposition.

Proposition 4.1. Let $l\in\mathbb{N}$ and $\lambda$ be a generalized partition of length not exceeding $l$. We have
$$\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda_1,\lambda_2,\cdots,\lambda_l))=\mathrm{ch}_q\,L(\widehat{gl}_{\infty|\infty},\Lambda(-\lambda_l-1,\cdots,-\lambda_2-1,-\lambda_1-1)).$$
In particular when $l=1$ the above identity reduces to (4.6).

Proof. We will use the Howe duality involving a $B$-type subalgebra of $\widehat{gl}_{\infty|\infty}$ and the double covering Lie group $Pin_{2l}$ of the spin group $Spin_{2l}$ on $F$ in [CW3] to prove this identity. We recall that the subalgebra $B$ is the subalgebra of $gl_{\infty|\infty}$ preserving the even nondegenerate bilinear form $(\cdot|\cdot)$ defined by
$$(e_i|e_j)=(-1)^i\delta_{i,-j},\qquad i,j\in\mathbb{Z},$$
$$(e_r|e_s)=(-1)^{r+\frac12}\delta_{r,-s},\qquad r,s\in\tfrac12+\mathbb{Z}.$$

Restricting the 2-cocycle of $gl_{\infty|\infty}$ to $B$ we obtain a central extension $\hat B$ of $B$. Obviously $\hat B$ acts on $F$ and in fact there exists an action of $Pin_{2l}$ on $F$ such that $(\hat B,Pin_{2l})$ forms a dual pair [CW3]. Hence we have a multiplicity-free decomposition
$$F\cong\sum_{\mu}L(\hat B,\Lambda^\mu)\otimes W_{2l}^\mu,$$
where $W_{2l}^\mu$ (respectively $L(\hat B,\Lambda^\mu)$) stands for an irreducible $Pin_{2l}$-module (respectively $\hat B$-module) of highest weight $\mu$ (respectively $\Lambda^\mu$) and the map $\mu\to\Lambda^\mu$ is a bijection. On the other hand by Theorem 3.1 we have another dual pair $(\widehat{gl}_{\infty|\infty}\times GL_l)$ on $F$. Now if we twist the action of $GL_l$ on $F$ by $(\det)^{\frac12}$, then these two dual pairs form a seesaw pair in the sense of Kudla [KU]. This implies that we have the following decompositions of $\hat B$- and $GL_l$-modules respectively:
$$L(\widehat{gl}_{\infty|\infty},\Lambda(\tilde\lambda))\cong\sum_{\mu}b^\mu_\lambda L(\hat B,\Lambda^\mu),\qquad W_{2l}^\mu\cong\sum_{\lambda}b^\mu_\lambda V_l^\lambda,$$
where here $\tilde\lambda=\lambda-(\frac12,\cdots,\frac12)$ due to the twist by $\det^{\frac12}$ and $b^\mu_\lambda\in\mathbb{Z}_+$. Now each of the $Pin_{2l}$-modules that appear in the decomposition of $F$ with respect to the dual pair $(\hat B,Pin_{2l})$, when regarded as a module over the Lie algebra $so_{2l}$, decomposes into two irreducible modules contragredient to each other. Hence $b^\mu_\lambda=b^\mu_{\lambda^*}$. But then as $\hat B$-modules we have
$$L(\widehat{gl}_{\infty|\infty},\Lambda(\tilde\lambda))\cong\sum_{\mu}b^\mu_\lambda L(\hat B,\Lambda^\mu)\cong\sum_{\mu}b^\mu_{\lambda^*}L(\hat B,\Lambda^\mu)\cong L(\widehat{gl}_{\infty|\infty},\Lambda(\widetilde{\lambda^*})).$$
Hence as modules over $\hat B$ we have an isomorphism
$$L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda_1,\lambda_2,\cdots,\lambda_l))\cong L(\widehat{gl}_{\infty|\infty},\Lambda(-\lambda_l-1,\cdots,-\lambda_2-1,-\lambda_1-1)).$$
Since $\hat B$ is a subalgebra of $\widehat{gl}_{\infty|\infty}$ preserving the principal $\frac12\mathbb{Z}$-gradation, the proposition follows.

5. A Character Formula for $\widehat{gl}_{m|n}$-Modules at Level 1

We first recall a method of constructing representations of affine superalgebras $\widehat{gl}_{n|n}$ from representations of $\widehat{gl}_{\infty|\infty}$ [KL]. It is a generalization of the classical reduction from $\widehat{gl}_\infty$ to $\widehat{gl}_n$. Our presentation below is somewhat different from [KL] in flavor.

Let $\psi^+(z)=\sum_{n\in\mathbb{Z}}\psi^+_nz^{-n-1}$ and $\psi^-(z)=\sum_{n\in\mathbb{Z}}\psi^-_nz^{-n}$ be a pair of free fermions and let $\gamma^\pm(z)=\sum_{r\in\frac12+\mathbb{Z}}\gamma^\pm_rz^{-r-\frac12}$ be a pair of free bosons. Let $F$ denote the corresponding Fock space generated by the vacuum vector $|0\rangle$. We have thus an action of the dual pair $(\widehat{gl}_{\infty|\infty},gl_1)$ on $F$, where the central charge of $\widehat{gl}_{\infty|\infty}$ is 1. From the pair of free fermions we may construct $n$ pairs of free fermions $\psi^{\pm,i}(z)$, for $i=1,\cdots,n$, as follows:
$$\psi^{+,i}(z)=\sum_{k\in\mathbb{Z}}\psi^{+,i}_kz^{-k-1}=\sum_{k\in\mathbb{Z}}\psi^{+}_{-i+n(k+1)}z^{-k-1},\tag{5.1}$$
$$\psi^{-,i}(z)=\sum_{k\in\mathbb{Z}}\psi^{-,i}_kz^{-k}=\sum_{k\in\mathbb{Z}}\psi^{-}_{i+n(k-1)}z^{-k}.\tag{5.2}$$
It is easy to check that the only non-zero commutation relations are
$$[\psi^{+,i}_m,\psi^{-,j}_n]=\delta_{ij}\delta_{m+n,0},\qquad m,n\in\mathbb{Z}.$$
Similarly we construct from our pair of free bosons $n$ pairs of free bosons $\gamma^{\pm,i}(z)$ for $i=1,\cdots,n$, via
$$\gamma^{+,i}(z)=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{+,i}_rz^{-r-\frac12}=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{+}_{-i+\frac12+n(r+\frac12)}z^{-r-\frac12},\tag{5.3}$$
$$\gamma^{-,i}(z)=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{-,i}_rz^{-r-\frac12}=\sum_{r\in\frac12+\mathbb{Z}}\gamma^{-}_{i-\frac12+n(r-\frac12)}z^{-r-\frac12}.\tag{5.4}$$
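The index bookkeeping behind the reduction modulo $n$ in (5.1)-(5.4) can be verified mechanically: a mode of a "+" field pairs non-trivially with a mode of a "−" field exactly when their indices in the original single pair sum to zero. The sketch below uses hypothetical helper names (`psi_plus`, `psi_minus`, `gamma_plus`, `gamma_minus`) that simply return the index in the original fermion or boson for the $k$-th (or $r$-th) mode of the $i$-th new field.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def psi_plus(i, k, n):
    """Index m with psi^{+,i}_k = psi^+_m, read off from (5.1)."""
    return -i + n * (k + 1)


def psi_minus(i, k, n):
    """Index m with psi^{-,i}_k = psi^-_m, read off from (5.2)."""
    return i + n * (k - 1)


def gamma_plus(i, r, n):
    """Index s with gamma^{+,i}_r = gamma^+_s, read off from (5.3)."""
    return -i + HALF + n * (r + HALF)


def gamma_minus(i, r, n):
    """Index s with gamma^{-,i}_r = gamma^-_s, read off from (5.4)."""
    return i - HALF + n * (r - HALF)


def fermion_modes_pair(i, m, j, mm, n):
    """True when psi^{+,i}_m and psi^{-,j}_mm have a non-zero bracket."""
    return psi_plus(i, m, n) + psi_minus(j, mm, n) == 0


def boson_modes_pair(i, r, j, s, n):
    """True when gamma^{+,i}_r and gamma^{-,j}_s have a non-zero bracket."""
    return gamma_plus(i, r, n) + gamma_minus(j, s, n) == 0
```

Since $1\le i,j\le n$, the index sum $j-i+n(m+m')$ vanishes only for $i=j$ and $m+m'=0$, which is the content of the stated commutation relations.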


Again it is easily checked that the only non-zero commutation relations are
$$[\gamma^{+,i}_r,\gamma^{-,j}_s]=\delta_{ij}\delta_{r+s,0},\qquad r,s\in\tfrac12+\mathbb{Z}.$$
We may now use these $n$ pairs of fermions and bosons to construct a copy of the affine $gl_{n|n}$ of central charge 1 in the standard way:
$$E_{ij}(z)=\sum_{m\in\mathbb{Z}}E_{ij}(m)z^{-m-1}=\,:\psi^{+,i}(z)\psi^{-,j}(z):,$$
$$e_{ij}(z)=\sum_{m\in\mathbb{Z}}e_{ij}(m)z^{-m-1}=-:\gamma^{+,i}(z)\gamma^{-,j}(z):,$$
$$a_{ij}(z)=\sum_{m\in\mathbb{Z}}a_{ij}(m)z^{-m-1}=\,:\psi^{+,i}(z)\gamma^{-,j}(z):,$$
$$b_{ij}(z)=\sum_{m\in\mathbb{Z}}b_{ij}(m)z^{-m-1}=-:\gamma^{+,i}(z)\psi^{-,j}(z):.$$
Explicitly we have the following formulas.
$$E_{ij}(m)=\sum_{k+l=m}:\psi^{+}_{-i+n(k+1)}\psi^{-}_{j+n(l-1)}:,$$
$$e_{ij}(m)=-\sum_{r+s=m}:\gamma^{+}_{-i+\frac12+n(r+\frac12)}\gamma^{-}_{j-\frac12+n(s-\frac12)}:,$$
$$a_{ij}(m)=\sum_{k+r+\frac12=m}:\psi^{+}_{-i+n(k+1)}\gamma^{-}_{j-\frac12+n(r-\frac12)}:,$$
$$b_{ij}(m)=-\sum_{k+r-\frac12=m}:\gamma^{+}_{\frac12-i+n(r+\frac12)}\psi^{-}_{j+n(k-1)}:.$$

The following lemma is straightforward.

Lemma 5.1. We have
$E_{ij}(m)\in(\widehat{gl}_{\infty|\infty})_+$, for $i<j$ with $m=0$ and for $m\ge1$,
$e_{ij}(m)\in(\widehat{gl}_{\infty|\infty})_+$, for $i<j$ with $m=0$ and for $m\ge1$,
$a_{ij}(m)\in(\widehat{gl}_{\infty|\infty})_+$, for $i<j$ with $m=0$ and for $m\ge1$,
$b_{ij}(m)\in(\widehat{gl}_{\infty|\infty})_+$, for $i\le j$ with $m=0$ and for $m\ge1$.

Now every representation of $\widehat{gl}_{\infty|\infty}$ that appears in the decomposition of $F$ is obviously invariant under the action of $\widehat{gl}_{n|n}$ constructed via reduction modulo $n$. By [KL] every irreducible $\widehat{gl}_{\infty|\infty}$-module that appears in $F$ in fact remains irreducible when restricted to $\widehat{gl}_{n|n}$. Hence it follows from the previous lemma that every irreducible representation of $\widehat{gl}_{\infty|\infty}$ that appears in the decomposition of $F$ is a highest weight irreducible $\widehat{gl}_{n|n}$-module with respect to the Borel subalgebra induced by the non-standard Borel subalgebra of $gl_{n|n}$.
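The statement of Lemma 5.1 amounts to computing the degree of each summand in the explicit formulas for $E_{ij}(m)$, $e_{ij}(m)$, $a_{ij}(m)$ and $b_{ij}(m)$, using the gradation $\deg e_{ab}=b-a$ and the realization of $e_{ab}$ as a normally ordered product of a "+" mode with index $-a$ and a "−" mode with index $b$. The following is a sketch of that computation; the function name `summand_degrees` and the finite window `span` over the summation index are assumptions made for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def summand_degrees(n, i, j, m, span=3):
    """Degrees (second index minus first index) of the e_{ab} occurring in each operator.

    Only summation indices k in [-span, span] are sampled; the degree turns out
    not to depend on k.
    """
    out = {"E": set(), "e": set(), "a": set(), "b": set()}
    for k in range(-span, span + 1):
        # E_ij(m): psi^+_{-i+n(k+1)} psi^-_{j+n(l-1)} with k + l = m
        l = m - k
        out["E"].add((j + n * (l - 1)) - (i - n * (k + 1)))
        # a_ij(m): psi^+_{-i+n(k+1)} gamma^-_{j-1/2+n(r-1/2)} with k + r + 1/2 = m
        r = m - k - HALF
        out["a"].add((j - HALF + n * (r - HALF)) - (i - n * (k + 1)))
        # b_ij(m): gamma^+_{1/2-i+n(r+1/2)} psi^-_{j+n(k-1)} with k + r - 1/2 = m
        r = m - k + HALF
        out["b"].add((j + n * (k - 1)) - (i - HALF - n * (r + HALF)))
        # e_ij(m): gamma^+_{-i+1/2+n(r+1/2)} gamma^-_{j-1/2+n(s-1/2)} with r + s = m
        r = k + HALF
        s = m - r
        out["e"].add((j - HALF + n * (s - HALF)) - (i - HALF - n * (r + HALF)))
    return out
```

In each case the set contains a single value, namely $j-i+nm$ for $E$ and $e$, $j-i-\frac12+nm$ for $a$, and $j-i+\frac12+nm$ for $b$, from which the positivity conditions of the lemma can be read off.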


Remark 5.1. In general one can construct $m$ pairs of free fermions and $n$ pairs of free bosons using the method just described. One then constructs a copy of the affine $gl_{m|n}$ of central charge 1 in the usual way. However, the resulting representations of this affine $gl_{m|n}$ on the highest weight irreducible representations of $\widehat{gl}_{\infty|\infty}$ are not highest weight representations with respect to a Borel subalgebra induced from a Borel subalgebra of $gl_{m|n}$.

We want to deduce a character formula for the $\widehat{gl}_{n|n}$-module $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ from (3.16). For this we will need to find a slightly more general formula for $\mathrm{ch}\,L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$. Set $\bar d=-L_0$ (see (4.1)), where $L_0$ is formed from the $n$ pairs $\psi^{\pm,i}$, $\gamma^{\pm,i}$ constructed above, so that we have
$$[\bar d,\psi^{\pm,i}_k]=\big(k\pm\tfrac12\big)\psi^{\pm,i}_k,\qquad[\bar d,\gamma^{\pm,i}_r]=r\gamma^{\pm,i}_r.$$
From this the following lemma is a straightforward computation.

Lemma 5.2. For all $m\in\mathbb{Z}$ we have $[\bar d,X(m)]=mX(m)$, for $X=E,e,a,b$.

By construction we see that $\bar d$ acts on each $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ as a semisimple linear operator and also $[\bar d,(\widehat{gl}_{\infty|\infty})_0]=0$. Hence we may compute the character of $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ with respect to $(\widehat{gl}_{\infty|\infty})_0\oplus\mathbb{C}\bar d$. Letting $\delta\in((\widehat{gl}_{\infty|\infty})_0\oplus\mathbb{C}\bar d)^*$ be defined by $\delta(\bar d)=1$ and $\delta((\widehat{gl}_{\infty|\infty})_0)=0$ we have ($x=x_1$, $q=e^{-\delta}$):
$$\mathrm{ch}\,F=\prod_{k\in\mathbb{N},\,s\in\frac12+\mathbb{Z}_+}\prod_{j=1}^{n}\frac{(1+xy_{j+n(k-1)}q^{k-\frac12})(1+x^{-1}y^{-1}_{j-nk}q^{k-\frac12})}{(1-xz_{j-\frac12+(s-\frac12)n}q^{s})(1-x^{-1}z^{-1}_{-j+\frac12-(s-\frac12)n}q^{s})}.$$

It follows again from Proposition 3.1 that with respect to $(\widehat{gl}_{\infty|\infty})_0\oplus\mathbb{C}\bar d$ the character of $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ equals
$$\sum_{\mu-\nu=\lambda}HS_\mu(zq;yq)\,HS_\nu(z^{-1}q;y^{-1}q),\tag{5.5}$$
where, for $k\in\mathbb{N}$, $s\in\frac12+\mathbb{Z}_+$ and $j=1,\cdots,n$, we set $yq=\{y_{j+n(k-1)}q^{k-\frac12}\}$, $y^{-1}q=\{y^{-1}_{j-nk}q^{k-\frac12}\}$, $zq=\{z_{j-\frac12+(s-\frac12)n}q^{s}\}$ and $z^{-1}q=\{z^{-1}_{-j+\frac12-(s-\frac12)n}q^{s}\}$.

The following lemma follows easily from our construction.

Lemma 5.3. For $i,j=1,\cdots,n$; $k\in\mathbb{Z}$ and $r\in\frac12+\mathbb{Z}$ we have
$$[E_{ii}(0),\psi^{\pm,j}_k]=\pm\delta_{ij}\psi^{\pm,j}_k,\qquad[E_{ii}(0),\gamma^{\pm,j}_r]=0,$$
$$[e_{ii}(0),\psi^{\pm,j}_k]=0,\qquad[e_{ii}(0),\gamma^{\pm,j}_r]=\pm\delta_{ij}\gamma^{\pm,j}_r.$$

Let us denote the scaling operator of $\widehat{gl}_{n|n}$ by $d$. It is clear from Lemma 5.2 that the linear operator $d-\bar d$ acts as a scalar on each irreducible representation of $\widehat{gl}_{n|n}$ that appears in the decomposition of $F$. The scalar can be computed from the explicit formulas of the $\widehat{gl}_{\infty|\infty}$ highest weight vectors given by Theorem 3.1. In the case of $l=1$ the highest weight vectors are given by
$$(\gamma^{+}_{-\frac12})^{\lambda}|0\rangle,\qquad\lambda\ge0,\tag{5.6}$$
$$(\gamma^{-}_{-\frac12})^{-\lambda-1}\psi^{-}_0|0\rangle,\qquad\lambda<0.\tag{5.7}$$

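Now $\bar d$ acts on the former with eigenvalue $-\frac{\lambda}{2}$ and on the latter with eigenvalue $\frac{\lambda}{2}$. Thus using Lemma 5.3 we can rewrite (5.5) in the following form.

Theorem 5.1. For $yq=\{y_jq^s\}$, $y^{-1}q=\{y_j^{-1}q^s\}$, $zq=\{z_jq^s\}$ and $z^{-1}q=\{z_j^{-1}q^s\}$ with $j=1,\cdots,n$ and $s\in\frac12+\mathbb{Z}_+$ we have ($\mu,\nu\in\mathbb{Z}_+$ and $\lambda\in\mathbb{Z}$)
$$\mathrm{ch}\,L(\widehat{gl}_{n|n},\lambda\tilde\delta_1+\tilde\Lambda_0)=q^{-\frac{\lambda}{2}}\sum_{\mu-\nu=\lambda}HS_\mu(zq;yq)\,HS_\nu(z^{-1}q;y^{-1}q),\qquad\lambda\ge0,$$
$$\mathrm{ch}\,L(\widehat{gl}_{n|n},(\lambda+1)\tilde\delta_n-\tilde\epsilon_n+\tilde\Lambda_0)=q^{\frac{\lambda}{2}}\sum_{\mu-\nu=\lambda}HS_\mu(zq;yq)\,HS_\nu(z^{-1}q;y^{-1}q),\qquad\lambda<0,$$
where $y_j=e^{\tilde\epsilon_j}$, $z_j=e^{\tilde\delta_j}$ and $q=e^{-\tilde\delta}$.

Proposition 5.1. The $\widehat{gl}_{n|n}$-modules in Theorem 5.1 are integrable. Furthermore they form a complete list of integrable $\widehat{gl}_{n|n}$-modules of level 1.

Proof. We will employ the method of simple odd reflections [PS] following [KW]. Recall that the set of simple roots of $\widehat{gl}_{n|n}$ with respect to the standard Borel subalgebra is given by $\{\alpha_0=\tilde\delta_n-\tilde\epsilon_1+\tilde\delta,\ \alpha_1=\tilde\epsilon_1-\tilde\epsilon_2,\ \alpha_2=\tilde\epsilon_2-\tilde\epsilon_3,\ \cdots,\ \alpha_n=\tilde\epsilon_n-\tilde\delta_1,\ \alpha_{n+1}=\tilde\delta_1-\tilde\delta_2,\ \cdots,\ \alpha_{2n-1}=\tilde\delta_{n-1}-\tilde\delta_n\}$. The corresponding Dynkin diagram is as follows (as usual $\otimes$ denotes an isotropic odd root):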

[Dynkin diagram with nodes $\alpha_0$, $\alpha_1$, $\alpha_2$, $\cdots$, $\alpha_{n-1}$, $\alpha_n$, $\cdots$, $\alpha_{2n-1}$]

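Now the set of simple roots with respect to the non-standard Borel subalgebra is given by $\{\beta_0=\tilde\epsilon_n-\tilde\delta_1+\tilde\delta,\ \beta_1=\tilde\delta_1-\tilde\epsilon_1,\ \beta_2=\tilde\epsilon_1-\tilde\delta_2,\ \beta_3=\tilde\delta_2-\tilde\epsilon_2,\ \cdots,\ \beta_{2n-2}=\tilde\epsilon_{n-1}-\tilde\delta_n,\ \beta_{2n-1}=\tilde\delta_n-\tilde\epsilon_n\}$, with the corresponding Dynkin diagram

[Dynkin diagram with nodes $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, $\cdots$, $\beta_{2n-2}$, $\beta_{2n-1}$]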


We can use a chain of odd reflections [PS] to bring the second diagram to the first as follows. We reflect first along the odd simple root $\beta_{2n-1}$. After that we reflect along the rightmost odd simple root in the bottom row of the Dynkin diagram which is of the form $\tilde\delta_i-\tilde\epsilon_j$. For example, after reflecting along $\beta_{2n-1}$ we obtain the diagram

[Dynkin diagram with nodes $\tilde\delta_n-\tilde\delta_1+\tilde\delta$, $\beta_1$, $\beta_2$, $\beta_3$, $\cdots$, $\gamma$, $-\beta_{2n-1}$]

where $\gamma=\tilde\epsilon_{n-1}-\tilde\epsilon_n$. As $-\beta_{2n-1}$ and $\gamma$ are not of the form $\tilde\delta_i-\tilde\epsilon_j$, the next step is to reflect along the odd simple root $\beta_{2n-3}$. Continuing this way we obtain the diagram corresponding to the standard Borel subalgebra. Now according to Lemma 1.4 of [KW] a highest weight vector $v$ of highest weight $\Lambda$ with respect to the original Borel subalgebra remains a highest weight vector of the new Borel subalgebra if and only if $\Lambda(\check\gamma)=0$, where $\check\gamma$ is the simple coroot corresponding to the odd simple root $\gamma$, along which we have reflected. Furthermore in this case the new highest weight and the original highest weight coincide. If however $\Lambda(\check\gamma)\ne0$, then $e_{-\gamma}v$ is the highest weight vector with respect to the new Borel subalgebra, where $e_{-\gamma}$ is the root vector corresponding to $-\gamma$. Furthermore the new highest weight is $\Lambda-\gamma$.

Let $0\le\lambda\le n$ and consider the highest weight $\lambda\tilde\delta_1+\tilde\Lambda_0$. It follows that when we change from the non-standard to the standard Borel subalgebra it gets transformed to $\tilde\epsilon_1+\cdots+\tilde\epsilon_\lambda+\tilde\Lambda_0$. If $\lambda>n$, then the highest weight gets transformed to $\tilde\epsilon_1+\cdots+\tilde\epsilon_n+(\lambda-n)\tilde\delta_1+\tilde\Lambda_0$. On the other hand let $\lambda<0$ and consider the highest weight of the form $(\lambda+1)\tilde\delta_n-\tilde\epsilon_n+\tilde\Lambda_0$. Changing from the non-standard Borel to the standard Borel via the sequence of odd reflections described above it follows that the highest weight gets transformed to $\lambda\tilde\delta_n+\tilde\Lambda_0$. However, the list of highest weights here for the standard Borel subalgebra coincides with the list of integrable highest weights in [KW]. Thus all our modules are integrable.

As the Cartan subalgebras of the non-standard Borel and standard Borel subalgebras coincide, it follows that our character formula agrees with the character formula of [KW]. Comparing both formulas gives rise to combinatorial identities.

Our method can also be used to obtain a character formula for integrable level 1 $\widehat{gl}_{m|n}$-modules as follows. Consider the Fock space $F$ generated by $m$ pairs of free fermions $\psi^{\pm,i}(z)$, $i=1,\cdots,m$ and $n$ pairs of free bosons $\gamma^{\pm,j}(z)$, $j=1,\cdots,n$. Then $F$ according to [KW] is completely reducible as a level 1 $\widehat{gl}_{m|n}$-module. To be more precise there is an action of $\widehat{gl}_{m|n}\times gl_1$ on $F$ and they form a dual pair. The action of $\widehat{gl}_{m|n}$ is given in the usual way, while $gl_1$ is generated by the charge operator
$$I=\sum_{i=1}^{m}\sum_{r\in\frac12+\mathbb{Z}_+}:\psi^{+,i}_{-r}\psi^{-,i}_{r}:-\sum_{j=1}^{n}\sum_{s\in\frac12+\mathbb{Z}_+}:\gamma^{+,j}_{-s}\gamma^{-,j}_{s}:.$$
We have the following decomposition with respect to this joint action [KW]:
$$F\cong\sum_{\lambda\in\mathbb{Z}}L(\widehat{gl}_{m|n},\tilde\Lambda(\lambda))\otimes V_1^\lambda,\tag{5.8}$$

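where $\tilde\Lambda(\lambda)$ is given as follows:
$$\tilde\Lambda(\lambda)=\begin{cases}\tilde\epsilon_1+\cdots+\tilde\epsilon_\lambda+\tilde\Lambda_0,&\text{for }0\le\lambda\le m,\\ \tilde\epsilon_1+\cdots+\tilde\epsilon_m+(\lambda-m)\tilde\delta_1+\tilde\Lambda_0,&\text{for }\lambda>m,\\ \lambda\tilde\delta_n+\tilde\Lambda_0,&\text{for }\lambda<0.\end{cases}\tag{5.9}$$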

We remark that (5.9) is a complete list (up to essential equivalence [KW]) of integrable highest weights for $\widehat{gl}_{m|n}$ of level 1 with $m\ge2$ [KW]. Now we can write the character of $F$ as
$$\mathrm{ch}\,F=\prod_{s\in\frac12+\mathbb{Z}_+}\frac{\prod_{i=1}^{m}(1+xy_iq^s)(1+x^{-1}y_i^{-1}q^s)}{\prod_{j=1}^{n}(1-xz_jq^s)(1-x^{-1}z_j^{-1}q^s)},\tag{5.10}$$
where again we use the notation $y_i=e^{\tilde\epsilon_i}$, $z_j=e^{\tilde\delta_j}$ and $x=e^{\epsilon}$ with $\epsilon\in(gl_1)^*$ such that $\epsilon(I)=1$. By Proposition 3.1, (5.10) can be written as ($q=e^{-\tilde\delta}$)
$$\sum_{\lambda\in\mathbb{Z}}\Big(\sum_{\mu-\nu=\lambda}HS_\mu(zq;yq)\,HS_\nu(z^{-1}q;y^{-1}q)\Big)x^{\lambda},\tag{5.11}$$
where $z^{\pm1}q=\{z_j^{\pm1}q^s\mid j=1,\cdots,n;\ s\in\frac12+\mathbb{Z}_+\}$ and $y^{\pm1}q=\{y_i^{\pm1}q^s\mid i=1,\cdots,m;\ s\in\frac12+\mathbb{Z}_+\}$. Thus we have arrived at the following description of characters by Lemma 3.1.

Theorem 5.2. Let $\lambda\in\mathbb{Z}$ and $\tilde\Lambda(\lambda)$ be the $\widehat{gl}_{m|n}$-highest weight defined by (5.9). Then
$$\mathrm{ch}\,L(\widehat{gl}_{m|n},\tilde\Lambda(\lambda))=q^{-\frac{|\lambda|}{2}}\sum_{\mu-\nu=\lambda}HS_\mu(zq;yq)\,HS_\nu(z^{-1}q;y^{-1}q),$$

where $\mu,\nu\in\mathbb{Z}_+$.

Remark 5.2. We note that we can derive character formulas in an analogous fashion for the $\widehat{gl}_\infty$-modules that appear in the Fock space decomposition [F, FKRW, KR2] (see also [W1] and [W2] for a rather elegant argument in the spirit of Howe duality). The resulting formulas will be sums of products of ordinary Schur functions instead of hook Schur functions. These formulas agree with the ones obtained in [KR2] and the one in [AFMO2] in the special case when = 0. However, when dealing with the $q$-character formulas, we can also produce a character involving just a sum of Schur functions.

Remark 5.3. By [KR1] the representation theory of $\widehat{gl}_\infty$ is closely related to the representation theory of $W_{1+\infty}$, which is the limit (in an appropriate sense) of the $W$-algebras $W_N$, as $N\to\infty$ [PRS1, PRS2]. In particular, quasi-finite irreducible highest weight representations of the latter can be constructed on a suitable tensor product of quasi-finite irreducible highest weight representations of a central extension of $gl_\infty\otimes A_n$, where $A_n\cong\mathbb{C}[t]/t^n$. Using an analogous argument it can be shown that the quasi-finite highest weight irreducible representations of the Lie superalgebra of differential operators on the super circle, the super $W_{1+\infty}$ introduced in [MR] (cf. [AFMO1] and [CW3] for definition), can be realized on a suitable tensor product of quasi-finite highest weight irreducible modules of a central extension of $gl_{\infty|\infty}\otimes A_n$. Furthermore each $L(\widehat{gl}_{\infty|\infty},\Lambda(\lambda))$ carries a structure of an irreducible representation of super $W_{1+\infty}$ [CW3]. In particular, our character formula may be modified to obtain a character formula for these quasi-finite irreducible super $W_{1+\infty}$-modules.


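6. Tensor Product Decomposition

In this section as another application of Theorem 3.1 we will compute the tensor product decomposition

$$L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\mu)) \otimes L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\nu)) \cong \bigoplus_{\lambda} a^{\mu\nu}_{\lambda}\, L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\lambda)), \qquad (6.1)$$

where $\Lambda(\mu)$ and $\Lambda(\nu)$ denote highest weights of $\widehat{\mathfrak{gl}}_{\infty|\infty}$ of level $l$ and level $r$, respectively, so that $\mu$ and $\nu$ are generalized partitions with $l(\mu) \le l$ and $l(\nu) \le r$. The summation in (6.1) is over all generalized partitions $\lambda$ of length not exceeding $l + r$, and $\Lambda(\lambda)$ is viewed as a highest weight of $\widehat{\mathfrak{gl}}_{\infty|\infty}$ of level $l + r$. We will compute the coefficients $a^{\mu\nu}_{\lambda}$ in terms of the usual Littlewood-Richardson coefficients (see e.g. [M]). To emphasize the dependence of the Fock space $\mathfrak{F}$ in Theorem 3.1 on the integer $l$ we will write $\mathfrak{F} = \mathfrak{F}_l$ and hence Theorem 3.1 reads

$$\mathfrak{F}_l \cong \bigoplus_{\lambda} L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\lambda)) \otimes V_l^{\lambda}.$$

Therefore we have

$$\mathfrak{F}_l \otimes \mathfrak{F}_r \cong \bigoplus_{\mu,\nu} L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\mu)) \otimes L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\nu)) \otimes V_l^{\mu} \otimes V_r^{\nu}.$$

Now $\mathfrak{F}_l \otimes \mathfrak{F}_r \cong \mathfrak{F}_{l+r}$ and hence using Theorem 3.1 again we have

$$\bigoplus_{\lambda} L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\lambda)) \otimes V_{l+r}^{\lambda} \cong \bigoplus_{\mu,\nu} L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\mu)) \otimes L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\nu)) \otimes V_l^{\mu} \otimes V_r^{\nu}. \qquad (6.2)$$

Now suppose that $V_{l+r}^{\lambda}$, when regarded as a $\mathfrak{gl}_l \times \mathfrak{gl}_r$-module via the obvious embedding of $\mathfrak{gl}_l \times \mathfrak{gl}_r$ into $\mathfrak{gl}_{l+r}$, decomposes as

$$V_{l+r}^{\lambda} \cong \bigoplus_{\mu,\nu} b^{\lambda}_{\mu\nu}\, V_l^{\mu} \otimes V_r^{\nu}.$$

This together with (6.1) and (6.2) gives

$$a^{\mu\nu}_{\lambda} = b^{\lambda}_{\mu\nu}. \qquad (6.3)$$

The duality between the branching coefficients and tensor products of a general dual pair is well-known [H2]. We recall that in (6.3) $\mu$, $\nu$ and $\lambda$ are generalized partitions subject to constraints on their lengths. Now Proposition 3.2 combined with an argument analogous to the one given above implies that

$$\tilde{a}^{\mu\nu}_{\lambda} = b^{\lambda}_{\mu\nu}, \qquad (6.4)$$

where here $\mu$, $\nu$, $\lambda$ are partitions of appropriate lengths and the $\tilde{a}^{\mu\nu}_{\lambda}$'s are the usual Littlewood-Richardson coefficients. We remark that there are combinatorial algorithms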


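to compute these coefficients, the most well-known probably being the celebrated Littlewood-Richardson rule (again see e.g. [M]).

Now for generalized partitions $\mu$, $\nu$ and $\lambda$ of appropriate lengths the decomposition $V_{l+r}^{\lambda} \cong \bigoplus_{\mu,\nu} b^{\lambda}_{\mu\nu}\, V_l^{\mu} \otimes V_r^{\nu}$ implies that

$$V_{l+r}^{\lambda + d1_{l+r}} \cong \bigoplus_{\mu,\nu} b^{\lambda}_{\mu\nu}\, V_l^{\mu + d1_l} \otimes V_r^{\nu + d1_r},$$

where here $1_k$ denotes the $k$-tuple $(1, 1, \dots, 1)$ regarded as a partition. Hence

$$b^{\lambda}_{\mu\nu} = b^{\lambda + d1_{l+r}}_{\mu + d1_l,\ \nu + d1_r}.$$

Now if we choose a non-negative integer $d$ so that $\lambda + d1_{l+r}$ is a partition, then

$$b^{\lambda + d1_{l+r}}_{\mu + d1_l,\ \nu + d1_r} = \tilde{a}^{\mu + d1_l,\ \nu + d1_r}_{\lambda + d1_{l+r}},$$

and hence by (6.3) and (6.4),

$$a^{\mu\nu}_{\lambda} = \tilde{a}^{\mu + d1_l,\ \nu + d1_r}_{\lambda + d1_{l+r}}.$$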

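From our discussion above we arrive at the following theorem.

Theorem 6.1. Let $\mu$ and $\nu$ be generalized partitions with $l(\mu) \le l$ and $l(\nu) \le r$ so that we may regard $\Lambda(\mu)$ and $\Lambda(\nu)$ as $\widehat{\mathfrak{gl}}_{\infty|\infty}$-highest weights of level $l$ and $r$, respectively. Then we have the following decomposition of $L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\mu)) \otimes L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\nu))$ into irreducible $\widehat{\mathfrak{gl}}_{\infty|\infty}$-highest weight modules of level $l + r$:

$$L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\mu)) \otimes L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\nu)) \cong \bigoplus_{(\lambda, d)} \tilde{a}^{\mu + d1_l,\ \nu + d1_r}_{\lambda}\, L(\widehat{\mathfrak{gl}}_{\infty|\infty}, \Lambda(\lambda - d1_{l+r})),$$

where the summation above is over all pairs $(\lambda, d)$ subject to the following three conditions:

(i) $\lambda$ is a partition of length not exceeding $l + r$ and $d$ a non-negative integer.
(ii) $\mu + d1_l$ and $\nu + d1_r$ are partitions.
(iii) If $d > 0$, then $\lambda$ is a partition with $\lambda_{l+r} = 0$.

Here the coefficients $\tilde{a}^{\mu + d1_l,\ \nu + d1_r}_{\lambda}$ are determined by the tensor product decomposition of $\mathfrak{gl}_k$-modules $V_k^{\mu + d1_l} \otimes V_k^{\nu + d1_r} \cong \bigoplus_{\lambda} \tilde{a}^{\mu + d1_l,\ \nu + d1_r}_{\lambda}\, V_k^{\lambda}$, where $k \ge l + r$.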
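As a sanity check, the smallest case $l = r = 1$ of Theorem 6.1 can be worked out by hand and compared with the branching description (6.3). The sketch below is illustrative only and rests on two standard facts that are not taken from the text: the Pieri rule, which gives $V^{(a)} \otimes V^{(b)} \cong \bigoplus_{j=0}^{\min(a,b)} V^{(a+b-j,\,j)}$ for $\mathfrak{gl}_k$ with $k \ge 2$, and the fact that the $\mathfrak{gl}_2$-module $V^{(\lambda_1, \lambda_2)}$ restricted to $\mathfrak{gl}_1 \times \mathfrak{gl}_1$ contains the weight $(m, n)$ exactly once when $m + n = \lambda_1 + \lambda_2$ and $\lambda_2 \le \min(m, n)$. The function names and the cutoff `d_max` are hypothetical helpers introduced for the example.

```python
def lr_one_row(a, b):
    """Littlewood-Richardson coefficients for V^(a) (x) V^(b) (Pieri rule).

    Returns {lambda: multiplicity} with lambda = (a + b - j, j), 0 <= j <= min(a, b).
    """
    return {(a + b - j, j): 1 for j in range(min(a, b) + 1)}


def theorem_6_1(m, n, d_max):
    """Terms of Theorem 6.1 for l = r = 1, mu = (m), nu = (n), with 0 <= d <= d_max.

    Returns {lambda - d*(1, 1): multiplicity}.
    """
    out = {}
    for d in range(d_max + 1):
        if m + d < 0 or n + d < 0:      # condition (ii): mu + d and nu + d are partitions
            continue
        for lam, coeff in lr_one_row(m + d, n + d).items():
            if d > 0 and lam[1] != 0:   # condition (iii): last part vanishes when d > 0
                continue
            key = (lam[0] - d, lam[1] - d)
            out[key] = out.get(key, 0) + coeff
    return out


def branching(m, n, d_max):
    """Generalized partitions lambda with b^lambda_{(m),(n)} = 1, truncated at lambda_2 >= -d_max."""
    return {(m + n - j, j): 1 for j in range(-d_max, min(m, n) + 1)}


if __name__ == "__main__":
    for lam in sorted(theorem_6_1(2, 1, 3), reverse=True):
        print(lam)
```

For $\mu = (2)$, $\nu = (1)$ both descriptions give the generalized partitions $(3 - j, j)$ with $j \le 1$, each with multiplicity one, which is consistent with $a^{\mu\nu}_{\lambda} = b^{\lambda}_{\mu\nu}$.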

Remark 6.1. Making use of the Howe duality between $\widehat{\mathfrak{gl}}_\infty$ and $\mathfrak{gl}_l$ mentioned earlier in Remark 5.2, one derives in a completely analogous fashion a tensor product decomposition rule for these $\widehat{\mathfrak{gl}}_\infty$-modules that is identical to that for $\widehat{\mathfrak{gl}}_{\infty|\infty}$-modules.

References

[AFMO1] Awata, H., Fukuma, M., Matsuo, Y., Odake, S.: Quasifinite highest weight modules over the super W1+∞ algebra. Commun. Math. Phys. 170, 151–179 (1995)
[AFMO2] Awata, H., Fukuma, M., Matsuo, Y., Odake, S.: Character and determinant formulae of quasifinite representation of the W1+∞ algebra. Commun. Math. Phys. 172, 377–400 (1995)
[BR] Berele, A., Regev, A.: Hook Young diagrams with applications to combinatorics and representations of Lie superalgebras. Adv. Math. 64, 118–175 (1987)
[BS] Bouwknegt, P., Schoutens, K.: W-symmetry in conformal field theory. Phys. Rep. 223, 183–276 (1993)
[CW1] Cheng, S.-J., Wang, W.: Howe duality for Lie superalgebras. Compositio Math. 128, 55–94 (2001)
[CW2] Cheng, S.-J., Wang, W.: Remarks on the Schur-Howe-Sergeev duality. Lett. Math. Phys. 52, 143–153 (2000)
[CW3] Cheng, S.-J., Wang, W.: Lie subalgebras of differential operators on the super circle. To appear in Publ. Res. Inst. Math. Sci., math.QA/0103092
[F] Frenkel, I.: Representations of affine Lie algebras, Hecke modular forms and Korteweg-de Vries type equations. Lect. Notes Math. 933, 71–110 (1982)
[FKRW] Frenkel, E., Kac, V., Radul, A., Wang, W.: W1+∞ and W(glN) with central charge N. Commun. Math. Phys. 170, 337–357 (1995)
[H1] Howe, R.: Remarks on classical invariant theory. Trans. Am. Math. Soc. 313, 539–570 (1989)
[H2] Howe, R.: Perspectives on Invariant Theory: Schur Duality, Multiplicity-free Actions and Beyond. The Schur Lectures, Israel Math. Conf. Proc. 8, Tel Aviv 1992, pp. 1–182
[KL] Kac, V., van de Leur, J.: Super boson-fermion correspondence. Ann. Inst. Fourier 37, 99–137 (1987)
[KR1] Kac, V., Radul, A.: Quasi-finite highest weight modules over the Lie algebra of differential operators on the circle. Commun. Math. Phys. 157, 429–457 (1993)
[KR2] Kac, V., Radul, A.: Representation theory of the vertex algebra W1+∞. Transf. Groups 1, 41–70 (1996)
[KW] Kac, V., Wakimoto, M.: Integrable highest weight modules over affine superalgebras and Appell's function. Commun. Math. Phys. 215, 631–682 (2001)
[KU] Kudla, S.: Seesaw reductive pairs. In: Automorphic Forms in Several Variables, Taniguchi Symposium. Katata, Boston: Birkhäuser, 1983, pp. 244–268
[M] Macdonald, I.G.: Symmetric Functions and Hall Polynomials. Oxford Math. Monogr., Oxford: Clarendon Press, 1995
[MR] Manin, Y., Radul, A.: A supersymmetric extension of the Kadomtsev-Petviashvili hierarchy. Commun. Math. Phys. 98, 65–77 (1985)
[N] Nazarov, M.: Capelli identities for Lie superalgebras. Ann. Scient. Éc. Norm. Sup. 4e série, t. 30, 847–872 (1997)
[OP] Ol'shanskii, G., Prati, M.: Extremal weights of finite-dimensional representations of the Lie superalgebra gl n|m. Il Nuovo Cimento 85 A, 1–18 (1985)
[PRS1] Pope, C., Romans, L., Shen, X.: A new higher-spin algebra and the lone-star product. Phys. Lett. B242, 401–406 (1990)
[PRS2] Pope, C., Romans, L., Shen, X.: W∞ and the Racah-Wigner algebra. Nucl. Phys. B339, 191–221 (1990)
[PS] Penkov, I., Serganova, V.: Representations of classical Lie superalgebras of type I. Indag. Math. 3, 419–466 (1992)
[S1] Sergeev, A.: An analog of the classical invariant theory for Lie superalgebras, I. Michigan Math. J. 49, 113–146 (2001)
[S2] Sergeev, A.: An analog of the classical invariant theory for Lie superalgebras, II. Michigan Math. J. 49, 147–168 (2001)
[W1] Wang, W.: Duality in infinite dimensional Fock representations. Comm. Contem. Math. 1, 155–199 (1999)
[W2] Wang, W.: Dual Pairs and Infinite Dimensional Lie Algebras. In: Recent Developments in Quantum Affine Algebras and Related Topics, N. Jing, K.C. Misra (eds), Contemp. Math. 248, 453–469 (1999)

Communicated by M. Aizenman

Commun. Math. Phys. 238, 119–129 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0821-9


Geometry of Four-Vector Fields on Quaternionic Flag Manifolds

Philip Foth, Frederick Leitner

Department of Mathematics, University of Arizona, Tucson, AZ 85721-0089, USA

Received: 3 April 2002 / Accepted: 15 January 2003
Published online: 21 March 2003 – © Springer-Verlag 2003

Abstract: The purpose of this paper is to describe certain natural 4-vector fields on quaternionic flag manifolds, which geometrically determine the Bruhat cell decomposition. These structures naturally descend from the symplectic group $Sp(n)$, and are related to the dressing action given by the Iwasawa decomposition of the general linear group over the quaternions, $GL_n(\mathbb{H})$.

1. Introduction

In this paper we wish to describe certain natural 4-vector fields on quaternionic flag manifolds. In the context of Poisson geometry, a bi-vector field is penultimate in the study of the geometry of the underlying manifold. Analogously, we make use of a 4-vector field, closed under the Schouten bracket with itself, which we call a quatrisson structure, to reveal the internal structures of certain natural spaces arising in geometry, namely quaternionic flag manifolds. A more general definition involving a multi-vector field was first given in [1]. Another generalization, the Nambu-Poisson structure, was studied in [16].

Quaternionic flag manifolds possess natural group invariant quatrisson structures, and the study of the geometry of the flag manifolds can be pursued in the natural setup of quatrisson 4-vector fields and tetraplectic structures [6]. In particular, we describe the so-called Bruhat quatrisson 4-vector fields on quaternionic flag manifolds where the leaf decompositions coincide with the Bruhat decompositions of $GL_n(\mathbb{H})$ defined purely combinatorially. We also show that the existence of the Bruhat decomposition leads to a description of the tetraplectic leaves in the group $Sp(n)$ in terms of the dressing action on the group.

Drinfeld [4], Lu and Weinstein [11], Semenov-Tian-Shansky [14], and Soibelman [15] first described this setup in the context of standard Poisson geometry, and this viewpoint has been elaborated by many others. Several important features of the Poisson geometry of flag manifolds readily translate to our situation, including Schubert


calculus and a version of generalized hamiltonian dynamics. We suggest that further studies of these structures might lead to interesting results related to the geometry and (equivariant) differential calculus on quaternionic flag manifolds as well as quantum groups.

2. Quaternionic Matrices and Flags

We begin with some generalities on quaternionic matrices, for which we define the following subgroups of $GL_n(\mathbb{H})$:

$$\mathcal{R} := \{\mathrm{diag}(r_1, \dots, r_n) \mid r_i \in \mathbb{R}_+\},$$
$$\mathcal{U} := \{\text{upper triangular matrices with 1's along the diagonal}\},$$
$$\mathcal{V}_w := \{U \in \mathcal{U} \mid P_w U P_w^{-1} \in \mathcal{U}^t\},$$
$$\mathcal{D} := \{\mathrm{diag}(d_1, \dots, d_n) \mid d_i \in \mathbb{H}^*\},$$
$$\mathcal{B} := \mathcal{U}\mathcal{D}.$$

Here $P_w$ denotes the permutation matrix $(P_w)_{i,j} = \delta_{i,w(j)}$ for $w \in S_n$, and $\mathcal{U}^t$ denotes the transposes of the elements of $\mathcal{U}$. Now for $G \in GL_n(\mathbb{H})$ we recall the strict Bruhat normal form [3] of $G$ as:

$$G = U D P_w V.$$

Here all the matrices are uniquely determined: the matrix $D = \mathrm{diag}(d_1, \dots, d_n)$ belongs to $\mathcal{D}$; $P_w$ is, as usual, the permutation matrix corresponding to $w \in S_n$; both $U$ and $V$ belong to $\mathcal{U}$; and we further require that $P_w V P_w^{-1}$ is lower triangular with 1's along the diagonal, i.e. $V \in \mathcal{V}_w$. This decomposition allows us to define the Dieudonné determinant [2] as the residue of $\mathrm{sgn}(w) \cdot \prod_i d_i$ in $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*] \cong \mathbb{R}_+$. Moreover, by means of the strict Bruhat normal form, we obtain the Bruhat decomposition:

$$GL_n(\mathbb{H}) = \bigsqcup_{w \in S_n} \mathcal{B} P_w \mathcal{V}_w.$$

Denoting $Z_w := \mathcal{B} P_w \mathcal{V}_w$, we obtain from the Bruhat decomposition a parameterization of $GL_n(\mathbb{H})$ by $S_n$. The condition that $V^w := w V w^{-1}$ is lower triangular implies in the case of $w = e$ that $V^e = V$ must be both upper and lower triangular, hence equals the identity matrix, and hence $\dim_{\mathbb{R}}(Z_e) = \dim_{\mathbb{R}}(\mathcal{B})$. Taking $w$ to be the longest permutation, $w_l = (n\ (n-1)\ \cdots\ 2\ 1)$, rotates the matrix $V$ by 180° so that it is lower triangular. As no further conditions on the entries of $V$ are imposed, we have that $\dim_{\mathbb{R}}(Z_{w_l}) = 4n^2$. In general, we define the length of a permutation $w$, $\mathrm{len}(w)$, to be the minimal number of adjacent transpositions required in a factorization of the permutation. One readily sees that the maximal number of non-zero entries allowed in $V$, so that $V^w$ is lower triangular, is exactly $\mathrm{len}(w)$, so that $\dim_{\mathbb{R}}(Z_w) = 4 \cdot \mathrm{len}(w) + \dim_{\mathbb{R}}(\mathcal{B})$. We will see later that the entries of $V$ give coordinates on the Bruhat cells of quaternionic flags.

Denoting the conjugate transpose by $(\cdot)^*$, we have the Lie group:

$$Sp(n) := \{g \in GL_n(\mathbb{H}) \mid g^* g = e\}.$$
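The dimension count above can be checked numerically. The following is a minimal sketch, assuming the standard fact that $\mathrm{len}(w)$ equals the number of inversions of $w$ in one-line notation, and counting $\dim_{\mathbb{R}}(\mathcal{B})$ as four real dimensions for each of the $n(n-1)/2$ strictly upper triangular entries plus four for each of the $n$ diagonal entries. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import permutations


def length(w):
    """Number of inversions of w (one-line notation); equals the minimal
    number of adjacent transpositions in a factorization of w."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])


def dim_B(n):
    """Real dimension of B = UD: n(n-1)/2 quaternionic entries above the
    diagonal plus n nonzero quaternions on it, four real dimensions each."""
    return 4 * (n * (n - 1) // 2) + 4 * n


def dim_Z(w):
    """Real dimension 4*len(w) + dim_R(B) of the piece Z_w."""
    return 4 * length(w) + dim_B(len(w))


def cell_dimensions(n):
    """Sorted real dimensions 4*len(w) of the Bruhat cells, one per w in S_n."""
    return sorted(4 * length(w) for w in permutations(range(1, n + 1)))


if __name__ == "__main__":
    for n in range(1, 5):
        longest = tuple(range(n, 0, -1))
        print(n, dim_Z(longest), 4 * n * n, cell_dimensions(n))
```

For the longest permutation the two contributions add up to $4n^2$, in agreement with the text, and for $n = 2$ the cells have real dimensions 0 and 4.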


We also identify the corresponding Lie algebra:

$$\mathfrak{sp}_n := \{X \in \mathfrak{gl}_n(\mathbb{H}) \mid X + X^* = 0\}$$

with $T_x Sp(n)$ for $x \in Sp(n)$ by left translation. One knows that matrices in $Sp(n)$ have Dieudonné determinant 1, and thus lie inside the semi-simple group:

$$SL(n) := \{g \in GL_n(\mathbb{H}) \mid \det(g) = 1\}.$$

We define the spheroid to be $\Sigma := Sp(1)^n \cong (S^3)^n$, whose elements are of the form $\mathrm{diag}(\exp(s_1), \dots, \exp(s_n))$ where the $s_i$ are purely quaternionic (no real component). This is a subgroup of $\mathcal{D}$, and we have, in fact, that $\mathcal{D} = \Sigma \times \mathcal{R}$. We denote the corresponding Lie algebra by $\mathfrak{s}$.

The full quaternionic flag of $\mathbb{H}^n$, which we denote $F_n$, can now be identified as $F_n \cong \Sigma\backslash Sp(n) \cong \mathcal{B}\backslash GL_n(\mathbb{H})$. Using the second identification of $F_n$, we denote $C_w := \mathcal{B}\backslash Z_w$, which we call the Bruhat cells of the flag. By our discussion above, we see that $\dim_{\mathbb{R}}(C_w) = 4 \cdot \mathrm{len}(w)$.

Example. Consider the space $\mathbb{HP}^1$ identified as $\mathbb{HP}^1 \cong \Sigma\backslash Sp(2) \cong S^4$, from which we obtain the fibration:

$$S^3 \times S^3 \cong Sp(1) \times Sp(1) \to Sp(2) \to \mathbb{HP}^1 \cong S^4.$$

The Bruhat decomposition yields a decomposition of $\Sigma\backslash Sp(2) \cong \mathbb{HP}^1$ into the cells $C_{(12)}$ and $C_e$, which have real dimensions 4 and 0 respectively. We view the cells under the identifications that $C_{(12)} \cong N_2 = \left\{\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}\right\} \cong \mathbb{H}$ and $C_e$ is the North pole. Recall that $S^4$ has neither a symplectic nor a complex structure (nor even an almost complex structure [13]). This is one of the reasons for introducing the tetraplectic structure in [6].

3. Quatrisson and Tetraplectic Structures

Recall that a symplectic manifold is a manifold equipped with a closed non-degenerate 2-form. We also recall that a Poisson manifold is a manifold equipped with a bi-vector field that induces a Lie algebra structure on the space of smooth functions, compatible with the commutative product of functions via the Leibniz rule. In the case of quaternionic flags, we make use of the following structures which reflect the underlying geometry:

Definition 3.1. [6]. Let $X$ be a real orientable manifold of dimension $4m$. A tetraplectic structure on $X$ is a four-form $\psi$ satisfying: 1) $\psi$ is closed ($d\psi = 0$), 2) $\psi^m$ is a volume form. We call the pair $(X, \psi)$ a tetraplectic manifold. A map $\phi : (X, \psi) \to (X', \psi')$ is called tetraplectic if $\phi^*\psi' = \psi$. If, in addition, $\phi$ is a diffeomorphism, then we call $\phi$ a tetraplectomorphism.


Example. Let $\psi$ be an $Sp(2)$-invariant volume form on $S^4$. Then $(\mathbb{HP}^1, \psi)$ is a tetraplectic manifold. In fact, in [6] the construction of invariant tetraplectic structures on all quaternionic flag manifolds was given.

One can define a standard Poisson structure on a manifold by giving a bi-vector field whose Schouten bracket with itself is zero. However, in order to reflect the geometry of our situation we will make use of 4-vector fields, for which we recall the following [17]:

Proposition 3.2. Denoting by $\bigwedge^i \chi(M)$ the space of $i$-vector fields on $M$, there exists a unique bracket, called the Schouten bracket:

$$[\cdot, \cdot] : \textstyle\bigwedge^p \chi(M) \times \bigwedge^q \chi(M) \to \bigwedge^{p+q-1} \chi(M)$$

which extends the usual Lie bracket of vector fields and is an $\mathbb{R}$-linear operation satisfying the following identities:

1) $[P, Q] = (-1)^{pq}[Q, P]$ (Anti-Symmetry),
2) $[P, Q \wedge R] = [P, Q] \wedge R + (-1)^{pq+q}\, Q \wedge [P, R]$ (Leibniz),
3) $(-1)^{p(r-1)}[P, [Q, R]] + (-1)^{q(p-1)}[Q, [R, P]] + (-1)^{r(q-1)}[R, [P, Q]] = 0$ (Jacobi).

We recall that in [1] the authors use the vanishing of the Schouten bracket of a $p$-vector field $\xi$ with itself, $[\xi, \xi] = 0$, to define the Generalized Poisson Structures (GPS). For etymologic-semantic reasons, we give the following definition:

Definition 3.3. Let $M$ be a manifold, and let $\xi$ be a 4-vector field on $M$ satisfying $[\xi, \xi] = 0$. We call $\xi$ a quatrisson structure on $M$ and the pair $(M, \xi)$ a quatrisson manifold.

Definition 3.4. For two quatrisson manifolds $(X, \xi)$ and $(X', \xi')$ a map $\phi : X \to X'$ is called a quatrisson map if for any quadruple of functions $f_i \in C^\infty(X')$, $1 \le i \le 4$, the following identity holds:

$$\xi(d\phi^* f_1 \wedge d\phi^* f_2 \wedge d\phi^* f_3 \wedge d\phi^* f_4) = \phi^* \xi'(df_1 \wedge df_2 \wedge df_3 \wedge df_4),$$

i.e. $\phi_*(\xi) = \xi'$.

Definition 3.5. Let $\xi$ be a 4-vector field on a $4m$-dimensional manifold $M$. Then we call $\xi$ non-degenerate if $\xi^{\wedge m}$ is a nowhere vanishing $4m$-vector field.

If $\xi$ is a non-degenerate vector field on $M$, then $\xi$ induces a surjection $\bigwedge^3 T_x^* M \to T_x M$ for all $x \in M$, obtained by contraction with $\xi$. If $(M, \xi)$ is quatrisson, we define the rank of $\xi$ at $x \in M$ as the dimension of the image of this map. One can see that a quatrisson structure $\xi$ on $M$ is non-degenerate if the rank of $\xi$ at any point of $M$ is equal to the dimension of $M$.

Lemma 3.6. Letting $(M, \xi)$ be as above, the rank of $\xi$ at any $x \in M$ is divisible by 4.

Proof. This is an easy exercise in multi-linear algebra for the reader.


Definition 3.7. Let $M$ be a manifold equipped with a 4-vector field $\xi$. We say that a smooth $4l$-dimensional submanifold $L$ is a tetraplectic leaf in $M$ if: 1) $\xi$ comes from $\bigwedge^4 \chi(L)$ at all points of $L$, 2) $\xi$ is non-degenerate on $L$, 3) $L$ is not properly included in any other such submanifold of $M$, 4) the four-form $\psi$, given by $i_\psi \xi^m = \xi^{m-1}$, defines a tetraplectic structure on $M$.

To each triple of functions $f = (f_1, f_2, f_3)$, we can associate a "hamiltonian" vector field $X_f$ given by $\iota(df_1 \wedge df_2 \wedge df_3)\xi$. Then we get the characteristic distribution of $M$. Unlike in the Poisson case, we cannot expect in general that $(M, \xi)$ is stratified as a union of smooth tetraplectic leaves, even if $\xi$ is quatrisson, see Example 8 in [7]. However, the particular case of this result for quaternionic flag manifolds will follow later.

Example. Let $F_n$ be a quaternionic flag manifold considered as a tetraplectic manifold with an $Sp(n)$-invariant 4-form $\psi$ [6]. The corresponding 4-vector field $\chi$, defined by $i_\chi(\psi^m) = \psi^{m-1}$, is quatrisson and $Sp(n)$-invariant. We refer to $\chi$ as the invariant quatrisson structure of $F_n$.

4. Quatrisson Structures on $\mathbb{HP}^1$

For our flag manifolds, we construct a Bruhat quatrisson structure explicitly by analogy to the Poisson case as in [11] or [15]. We begin by defining a quatrisson 4-vector field, $\kappa$, on $Sp(2)$, which we show descends to the quotient $\Sigma\backslash Sp(2)$. We begin by defining an element of $\wedge^4 \mathfrak{sp}_2$ in terms of the following basis for $\mathfrak{sp}_2$:

$$E = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad S_x = \begin{pmatrix} 0 & x \\ x & 0 \end{pmatrix}, \quad H_x = \begin{pmatrix} x & 0 \\ 0 & -x \end{pmatrix}, \quad M_x = \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix},$$

where $x$ is one of $\{i, j, k\}$. For convenience we denote $S_{-x} := -S_x$, $H_{-x} := -H_x$, and $M_{-x} := -M_x$. We now can note the following commutator relations, obtained by direct matrix multiplication:

$$[M_x, E] = 0, \qquad [H_x, E] = 2 \cdot S_x, \qquad [S_x, E] = -2 \cdot H_x,$$

$$[M_x, M_y] = [H_x, H_y] = [S_x, S_y] = \begin{cases} 2 \cdot M_{x \cdot y} & x \ne y, \\ 0 & x = y, \end{cases}$$

$$[S_x, H_y] = \begin{cases} 0 & x \ne y, \\ 2 \cdot E & x = y, \end{cases}$$

$$[S_x, M_y] = \begin{cases} 0 & x = y, \\ 2 \cdot S_{x \cdot y} & x \ne y, \end{cases}$$

$$[H_x, M_y] = \begin{cases} 0 & x = y, \\ 2 \cdot H_{x \cdot y} & x \ne y. \end{cases}$$

We may now define:

$$\Lambda := E \wedge S_i \wedge S_j \wedge S_k,$$

and denoting by $\Lambda^L$ and $\Lambda^R$ the left and right invariant 4-vector fields on $Sp(2)$ with value $\Lambda$ at the identity element, we let:

$$\kappa = \Lambda^L - \Lambda^R.$$

Proposition 4.1. The 4-vector field $\kappa$ is a quatrisson structure on $Sp(2)$. Moreover, $\kappa$ descends to a vector field, $\aleph$, on $\mathbb{HP}^1 \cong \Sigma\backslash Sp(2)$ inducing a $\Sigma$-invariant quatrisson structure on $\mathbb{HP}^1$ called the Bruhat quatrisson structure.

Proof. The fact that $\kappa$ is a quatrisson structure on $Sp(2)$ is nothing more than the fact that $[\kappa, \kappa] \in \bigwedge^7 \chi(Sp(2)) = 0$. To show that $\aleph$ is $\Sigma$-invariant, and hence descends, we may apply the same formalism of the Poisson case and show that for any $X \in \mathfrak{s}$ we have $\mathrm{ad}_X(\Lambda) = [X, \Lambda] = 0$. This follows readily from the above commutator relations and the Leibniz rule of the Schouten bracket. It is clear that $[\aleph, \aleph] = 0$.

We can make use of the Bruhat decomposition to describe the vector field explicitly. As above, we denote by $C_e$ and $C_{(12)}$ the cells of $F_n$ corresponding to the North pole and the $\mathbb{H}$ components. It is clear that at the North pole $\kappa$ is the zero vector. For $x \in C_{(12)}$ we choose a convenient coset representative in $Sp(2)$, namely:

$$k_x = \frac{1}{\sqrt{1 + \rho^2}} \cdot \begin{pmatrix} 1 & -\bar{v} \\ v & 1 \end{pmatrix},$$

where $\rho = |v| = \sqrt{v\bar{v}}$. In fact, the identification of $S^4$ as a natural $SO(5) \cong Sp(2)/(\mathbb{Z}/2)$-invariant submanifold of $\mathbb{R}^5$ with $\mathbb{H}$ plus the point at infinity using stereographic projection sends $v \in \mathbb{H}$ to a point in $S^4$ at the height $1 - \frac{1}{1 + |v|^2}$ and the same $Sp(1)$-angular coordinate. To compute $\aleph$ at $x$, we identify $T_x Sp(2)$ with $T_e Sp(2)$ by right translations so that we have $\Lambda^R = \Lambda$ and $\Lambda^L$ is simply conjugation of $\Lambda$ by $k_x$. We have thus expressed $\aleph = f(v)\, \partial_{v_1} \wedge \partial_{v_2} \wedge \partial_{v_3} \wedge \partial_{v_4}$ in terms of the coordinates $v = v_1 + v_2 \cdot i + v_3 \cdot j + v_4 \cdot k$. We would like more natural coordinates for $\mathbb{H}$, namely if $v = \rho \cdot \exp(\theta_1 \cdot i)\exp(\theta_2 \cdot j)\exp(\theta_3 \cdot k) \in \mathbb{H}$, then we have $\aleph = \frac{g(\rho)}{\rho^3}\, \partial_\rho \wedge \Theta$, for $\Theta = \partial_{\theta_1} \wedge \partial_{\theta_2} \wedge \partial_{\theta_3}$, the $Sp(1)$-invariant 3-vector field on $Sp(1) \cong S^3$. To find $g(\rho)$, we divide $\aleph$ by its value at $v = 0$, the South pole. After a computer assisted computation¹ we see that $g(\rho) = \frac{1 + 3\rho^4}{(1 + \rho^2)^3}$. Thus we have proved that:

Proposition 4.2. The invariant quatrisson structure on $S^4 \cong \mathbb{HP}^1$ is given by:

$$\chi := \frac{(1 + \rho^2)^4}{\rho^3}\, \partial_\rho \wedge \Theta,$$

and the Bruhat quatrisson structure is given by:

$$\aleph = \frac{(1 + \rho^2)(1 + 3\rho^4)}{\rho^3}\, \partial_\rho \wedge \Theta.$$

In particular we have that:

$$\aleph = \frac{1 + 3\rho^4}{(1 + \rho^2)^3}\, \chi.$$

¹ We thank Klaus Lux and Stephane Lafortune for help with this.
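The commutator relations for the basis $E$, $S_x$, $H_x$, $M_x$ of $\mathfrak{sp}_2$ used in this section can be verified mechanically. The sketch below is only an illustration: it assumes the matrices $E = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $S_x = \begin{pmatrix} 0 & x \\ x & 0 \end{pmatrix}$, $H_x = \begin{pmatrix} x & 0 \\ 0 & -x \end{pmatrix}$, $M_x = \begin{pmatrix} x & 0 \\ 0 & x \end{pmatrix}$, represents quaternions as integer 4-tuples, and uses helper names (`qmul`, `bracket`, `check_relations`) that are hypothetical. With these matrices, direct multiplication gives $[S_x, E] = -2H_x$ and $[S_x, H_x] = 2E$, and these are the signs the code asserts.

```python
# Quaternions as 4-tuples (a, b, c, d) = a + b*i + c*j + d*k with integer entries.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def qadd(p, q):
    return tuple(s + t for s, t in zip(p, q))


def qscale(s, p):
    return tuple(s * t for t in p)


ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
UNITS = {"i": (0, 1, 0, 0), "j": (0, 0, 1, 0), "k": (0, 0, 0, 1)}


# 2x2 quaternionic matrices as nested tuples ((a, b), (c, d)).
def mmul(A, B):
    return tuple(tuple(qadd(qmul(A[r][0], B[0][c]), qmul(A[r][1], B[1][c]))
                       for c in range(2)) for r in range(2))


def mscale(s, A):
    return tuple(tuple(qscale(s, q) for q in row) for row in A)


def madd(A, B):
    return tuple(tuple(qadd(p, q) for p, q in zip(ra, rb)) for ra, rb in zip(A, B))


def bracket(A, B):
    """Matrix commutator AB - BA."""
    return madd(mmul(A, B), mscale(-1, mmul(B, A)))


E = ((ZERO, ONE), (qscale(-1, ONE), ZERO))
def S(x): return ((ZERO, x), (x, ZERO))
def H(x): return ((x, ZERO), (ZERO, qscale(-1, x)))
def M(x): return ((x, ZERO), (ZERO, x))


def check_relations():
    """Assert every commutator relation among E, S_x, H_x, M_x; return True."""
    zero = mscale(0, E)
    for x in UNITS.values():
        assert bracket(M(x), E) == zero
        assert bracket(H(x), E) == mscale(2, S(x))
        assert bracket(S(x), E) == mscale(-2, H(x))
        for y in UNITS.values():
            same, xy = (x == y), qmul(x, y)  # xy = +/- a unit when x != y
            for X in (M, H, S):
                assert bracket(X(x), X(y)) == (zero if same else mscale(2, M(xy)))
            assert bracket(S(x), H(y)) == (mscale(2, E) if same else zero)
            assert bracket(S(x), M(y)) == (zero if same else mscale(2, S(xy)))
            assert bracket(H(x), M(y)) == (zero if same else mscale(2, H(xy)))
    return True


if __name__ == "__main__":
    print(check_relations())
```

Because $x \cdot y$ may be the negative of a unit (for example $j \cdot i = -k$), the code relies on the convention $S_{-x} = -S_x$, $H_{-x} = -H_x$, $M_{-x} = -M_x$, which holds automatically for the matrix definitions above.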


One can easily see that $\aleph$ has rank four everywhere except for at the North pole, where it vanishes. Thus the two cells are characterized by the rank of $\aleph$.

5. Quatrisson Structures on Flag Manifolds

Following [12] we will produce the quatrisson structure on the full flag of $\mathbb{H}^n$ by way of the so-called multiplication formula. By analogy to the $Sp(2)$ case, for $1 \le p < q \le n$ we denote by $E^{p,q}$ the quaternionic matrix whose entries are 0's everywhere except in the $(p, q)$th position which is 1, and the $(q, p)$th position which is $-1$. We also let $S_x^{p,q}$ denote the matrix with 0's everywhere except in the $(p, q)$th and $(q, p)$th positions where the entries are $x$, where $x$ is again chosen from $\{i, j, k\}$. Similarly, these matrices are clearly in the Lie algebra, $\mathfrak{sp}_n$, of $Sp(n)$, and correspond to "positive roots" (i.e. pairs of integers $1 \le p < q \le n$) as in [12]. We define $\Lambda \in \bigwedge^4 T_e Sp(n)$ by:

$$\Lambda = \sum_{p < q} E^{p,q} \wedge S_i^{p,q} \wedge S_j^{p,q} \wedge S_k^{p,q}.$$

Then if $\Lambda^L$ and $\Lambda^R$ are the left and right invariant 4-vector fields on $Sp(n)$ with the value $\Lambda$ at the identity element of $Sp(n)$, we let:

$$\kappa = \Lambda^L - \Lambda^R.$$

Unlike the $Sp(2)$ case, when $n > 2$, one can readily check that $\kappa$ will not be a quatrisson structure on $Sp(n)$ by making use of the Leibniz rule and commutator relations similar to those above and noting that there will be some terms that will not cancel. However, we still have:

Proposition 5.1. The 4-vector field $\kappa$ descends to $\Sigma\backslash Sp(n)$, inducing a $\Sigma$-invariant quatrisson structure, $\aleph$, called the Bruhat quatrisson structure.

Proof. For $\kappa$ to descend and be invariant we need to show that both the left and right translations by elements of the spheroid leave $\kappa$ invariant, meaning that the adjoint action by the spheroid on $\Lambda$ is trivial. This can be checked similarly to the $n = 2$ case of Proposition 4.1. One can directly check that $[\aleph, \aleph] = 0$ on $\Sigma\backslash Sp(n)$. The fact that $\aleph$ is quatrisson will also follow from Proposition 5.6.

We recall:

Definition 5.2. Let $H$ be a Lie group equipped with a multiplicative 4-vector field $\mu$, which acts on a quatrisson manifold $(P, \xi)$: $\beta : H \times P \to P$. We say that $H$ acts multiplicatively if, denoting the corresponding translation maps:

$$\beta_h : P \to P,\ y \mapsto h \cdot y, \qquad \beta_y : H \to P,\ h \mapsto h \cdot y,$$

we have:

$$\xi(h \cdot x) = \beta_{h*}\, \xi(x) + \beta_{x*}\, \mu(h).$$

We sometimes say that the action is multiplicative with respect to the direct sum 4-vector field $\mu \oplus \xi$ on $H \times P$.


Notice the following fact (cf. [11], [8]):

Lemma 5.3. The 4-vector field $\kappa$ on $Sp(n)$ is multiplicative.

Proposition 5.4. Let $\aleph$ be the Bruhat quatrisson structure on $\Sigma\backslash Sp(n)$. The action map:

$$Sp(n) \times \Sigma\backslash Sp(n) \to \Sigma\backslash Sp(n) : (g, h) \mapsto g \cdot h$$

is multiplicative with respect to the four-vector field $\kappa \oplus \aleph$ on $Sp(n) \times (\Sigma\backslash Sp(n))$.

Proof. Straightforward.
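Lemma 5.3 is stated without proof. The following is a brief sketch of the standard argument (cf. [11]), under two assumptions that are ours rather than the text's: that a 4-vector field $\mu$ on a Lie group is called multiplicative when $\mu(gh) = (l_g)_*\mu(h) + (r_h)_*\mu(g)$, with $l_g$ and $r_h$ denoting left and right translation, and that $\Lambda$ denotes the element of $\bigwedge^4 \mathfrak{sp}_n$ whose left and right translates define $\kappa$, so that $\kappa(g) = (l_g)_*\Lambda - (r_g)_*\Lambda$.

```latex
\begin{align*}
(l_g)_*\kappa(h) + (r_h)_*\kappa(g)
  &= (l_g)_*\bigl[(l_h)_*\Lambda - (r_h)_*\Lambda\bigr]
   + (r_h)_*\bigl[(l_g)_*\Lambda - (r_g)_*\Lambda\bigr] \\
  &= (l_{gh})_*\Lambda - (l_g)_*(r_h)_*\Lambda
   + (r_h)_*(l_g)_*\Lambda - (r_{gh})_*\Lambda \\
  &= (l_{gh})_*\Lambda - (r_{gh})_*\Lambda \\
  &= \kappa(gh).
\end{align*}
```

The second line uses $l_g \circ l_h = l_{gh}$ and $r_h \circ r_g = r_{gh}$, and the two middle terms cancel because left and right translations commute. Nothing in the computation depends on the particular form of $\Lambda$.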

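We will also make use of the following embeddings:

$$f_{r,r+1} : Sp(2) \to Sp(n), \qquad A \mapsto A_{r,r+1},$$

where $1 \le r < n$ and for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the matrix $A_{r,r+1}$ is given by:

$$A_{r,r+1} = \begin{pmatrix} I & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & c & d & 0 \\ 0 & 0 & 0 & I \end{pmatrix},$$

where the row containing $a$ and $b$ is the $r$th row.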

Lemma 5.5. The embeddings $f_{r,r+1} : Sp(2) \to Sp(n)$ respect the multiplicative 4-vector fields $\kappa$.

Proof. Straightforward.

Proposition 5.6. Every tetraplectic leaf $L$ of $Sp(n)$ lies entirely in some $Z_w$. If $L_w$ is a tetraplectic leaf containing the permutation matrix $P_w$ corresponding to some $w \in S_n$, and we write $w = \prod_{i=1}^{m} \tau_i$ as a minimal product of adjacent transpositions, we have a tetraplectomorphism:

$$F_w : L_{\tau_1} \times \cdots \times L_{\tau_m} \to L_w, \qquad (l_1, \dots, l_m) \mapsto l_1 l_2 \cdots l_m.$$

Moreover, for $\sigma \in \Sigma$, the tetraplectic leaf through $\sigma P_w$ equals $\sigma L_w$.

Proof (cf. [12], [15]). Immediately follows from the discussion above.

More explicitly, one can follow [12] to identify $L_w$ with the $\mathcal{V}_w$-orbit of $P_w$, and in the next section, we will define and exploit the analogues of the dressing action [14] to get a clearer picture of the tetraplectic leaves. In any case, we have the following:

Theorem 5.7. The tetraplectic leaf decomposition of the quaternionic flag manifold $F_n \cong \mathcal{B}\backslash GL_n(\mathbb{H})$ arising from the Bruhat quatrisson structure coincides with the Bruhat cell decomposition.

Proof. The important point is that any tetraplectic leaf in $Sp(n)$ under the quotient map $Sp(n) \to \Sigma\backslash Sp(n)$ maps tetraplectomorphically onto a Bruhat cell, as follows from the results in this section.


6. Quatrisson Action and Intrinsic Derivative

We elaborate on some general notions related to group actions in the quatrisson context where we recall the notation set forth in Definition 5.2 and assume that we have a multiplicative action. Denoting the Lie algebra of $H$ by $\mathfrak{h}$, we let:

$$\gamma : \mathfrak{h} \to \chi(P)$$

be the usual Lie algebra anti-homomorphism, and recall the intrinsic derivative of $\xi$ at $e$:

$$d_e\xi : \mathfrak{h} \to \textstyle\bigwedge^4 \mathfrak{h}.$$

We also define the 4-bracket $[\cdot, \cdot, \cdot, \cdot]$ on $\mathfrak{h}^*$ to be the dual of $d_e\xi$. The next statement and its proof are analogous to Theorem 2.6 of [11].

Theorem 6.1. In the above situation for each $X \in \mathfrak{h}$ we have:

$$L_{\gamma(X)}\xi = \wedge^4\gamma\,(d_e\mu)(X).$$

Moreover, for any 1-forms $\omega_i$, $1 \le i \le 4$, on $P$ we have:

$$L_{\gamma(X)}\xi(\omega_1 \wedge \omega_2 \wedge \omega_3 \wedge \omega_4) = \langle [\zeta_1, \zeta_2, \zeta_3, \zeta_4], X \rangle,$$

where $\zeta_i$ is the $\mathfrak{h}^*$-valued function on $P$ defined by $\langle \zeta_i, X \rangle = \langle \omega_i, \gamma(X) \rangle$ for $X \in \mathfrak{h}$, and $[\zeta_1, \zeta_2, \zeta_3, \zeta_4]$ denotes the point-wise 4-bracket in $\mathfrak{h}^*$.

7. Dressing Action

The Iwasawa decomposition of $GL_n(\mathbb{H}) = \mathcal{R}\mathcal{U}Sp(n) = Sp(n)\mathcal{R}\mathcal{U}$ allows us to define:

Definition 7.1. The dressing action of $\mathcal{R}\mathcal{U}$ on $Sp(n)$ is the map $\mathcal{R}\mathcal{U} \times Sp(n) \to Sp(n)$ given by $(G, K) \mapsto K'$, where $G \cdot K = K' \cdot R \cdot U$ for the unique $R \in \mathcal{R}$ and $U \in \mathcal{U}$.

Our goal in this section is to relate the orbits of the dressing action with the tetraplectic leaves of the group $Sp(n)$. Notice that we have restricted the usual dressing action to $\mathcal{R}\mathcal{U}$ since we will be only concerned with $\mathcal{R}\mathcal{U}$ orbits of the dressing action in the remainder. Finally, we can state the main result of this section.

Theorem 7.2. The tetraplectic leaves of $\kappa$ on $Sp(n)$ are the orbits of the dressing action of $\mathcal{R}\mathcal{U}$ on $Sp(n)$.

Proof. We already know that the leaves are parametrized by $S_n$ and $\Sigma$. More precisely, we define the center of any leaf as the element $P_w\sigma$, where $P_w$ as usual is the permutation matrix corresponding to $w \in S_n$, and $\sigma \in \Sigma$. The dressing action can be rewritten as $(G, K) \mapsto GKG' \in Sp(n)$, for $G, G' \in \mathcal{R}\mathcal{U}$, which leaves us in the same (open) submanifold of the Bruhat decomposition. Taking $K = \sigma P_w$, we see that the orbit of a dressing action on a cell remains in that cell as there are no permutations appearing in $G$ or $G'$. Further, the fact that the orbit is contained within a single leaf follows from $G$ and $G'$ being upper triangular with real diagonal, and thus the dressing action does not introduce any non-trivial elements of $\Sigma$.


For the opposite inclusion, suppose we are given two points, $K_1$, $K_2$, of a tetraplectic leaf. As the $K_i$ are in the same leaf, this implies that the $K_i$'s have the same permutation type, $w$, in the Bruhat decomposition, so we write $K_i = B_i P_w V_i$ for some $B_i \in \mathcal{B}$ and some $V_i \in \mathcal{V}_w$. Then we have $B_2 B_1^{-1} K_1 = K_2 V_2^{-1} V_1$ with $V_2^{-1} V_1 \in \mathcal{V}_w$. Now, as $B_i \in \mathcal{B}$, we may write:

$$B_i = \mathrm{diag}(d_1^i, \dots, d_n^i)\,\mathrm{diag}(r_1^i, \dots, r_n^i), \qquad r_j^i \in \mathbb{R}_+,\ d_j^i \in Sp(1).$$

But as the orbits are parametrized by $\Sigma$, we know that $d_j^1$ corresponds to $d_j^2$, which implies that $B_2 B_1^{-1}$ must be in $\mathcal{R}$, from which it follows that the $K_i$'s lie in the same orbit.

Another possible proof of the above result can be obtained using the infinitesimal computations near the centers of each leaf [11, 14]. Once we know that the tetraplectic leaves go along the orbits of the dressing action infinitesimally, the analyticity of the manifolds in question will provide a global coincidence. We have established that the orbits of the dressing action of $\mathcal{R}\mathcal{U}$ on $Sp(n)$ coincide with the tetraplectic leaves induced by the 4-vector field $\kappa$, and these are permuted by the action of $\Sigma$. Therefore we have obtained a geometric orbit picture for any tetraplectic leaf, or Bruhat cell, in $F_n$.

8. Further Remarks

First of all, the approach that we pursued in the present paper can be easily extended to all partial quaternionic flag manifolds, in particular the Grassmannians and projective spaces.

It would be interesting to express the dressing action as a quatrisson action, with respect to a multiplicative 4-vector field on $\mathcal{R}\mathcal{U}$. While it is clear that such a structure exists, it is not easy to write down a local expression. It seems plausible that a suitable generalization of the Lu-Ratiu construction [10] would help.

Evens and Lu [5] showed that the Kostant harmonic forms [9] on complex flag manifolds have a Poisson harmonic nature with respect to the Bruhat Poisson structure. It would be interesting to see how their ideas can be applied to our situation. One can use the operator $\partial_\aleph = -d \circ \iota_\aleph + \iota_\aleph \circ d + \iota_\sigma$ to define $Sp(n)$-harmonic forms on the quaternionic flag manifolds. Here $\sigma$ is the modular tri-vector field given by $d(\iota_\aleph \psi^m) = \iota_\sigma \psi^m$, and $\psi^m$ is an $Sp(n)$-invariant volume form on $F_n$. Analogously to the $T$-equivariant cohomology of complex flag manifolds, one can consider the $\Sigma$-equivariant cohomology.

Another possibility is to consider quaternionic flag manifolds as fixed point sets of certain natural involutions on complex partial flag manifolds, where the dimensions of the subspaces are even, and restrict a certain subalgebra of forms.

Another possible avenue to pursue is to study the hamiltonian type dynamics associated with the quatrisson structures. In particular, it seems that to determine a system subject to a $\Sigma$-action which preserves a hamiltonian, we may need fewer integrals than in the standard Poisson case. We suspect that certain symmetric spaces such as quaternionic Grassmannians will have the property that an invariant quatrisson structure is compatible with the Bruhat quatrisson structure, i.e. $[\chi, \aleph] = 0$. This would lead to generalized bi-hamiltonian type systems, which are worth investigating.

The 4-bracket on $\mathfrak{h}^*$ that we briefly mentioned in Sect. 6 gives rise to a certain deformed algebra of functions on $H$ (by way of the Kontsevich formality theorem) where


the deformation parameter $\hbar$ now has degree 2. This implies that the $m_2$ term in the operadic expansion is just the standard multiplication, $m_3$ is trivial, and $m_4$ is determined by the bracket. This is the first natural occurrence of the generalized quantum group setup that we are aware of, and thus it seems plausible that it would lead to new interesting algebraic structures.

Acknowledgements. The first author is grateful to Sam Evens and Lu Jiang-Hua for many conversations related to Poisson geometry. The first author was supported by NSF grant DMS-0072520. The second author was supported by an NSF VIGRE graduate fellowship.

References

1. de Azcárraga, J.A., Perelomov, A.M., Pérez Bueno, J.C.: The Schouten-Nijenhuis bracket, cohomology and generalized Poisson structures. J. Phys. A 29, 7993–8009 (1996)
2. Dieudonné, J.: Les déterminants sur un corps non commutatif. In French. Bull. Soc. Math. France 71, 27–45 (1943)
3. Draxl, P.K.: Skew Fields. London Math. Soc. Lect. Not. Ser. 81, Cambridge: Cambridge University Press, 1993
4. Drinfeld, V.: Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang-Baxter equation. Soviet Math. Dokl. 27, 68–71 (1983)
5. Evens, S., Lu, J.-H.: Poisson harmonic forms, Kostant harmonic forms, and the $S^1$-equivariant cohomology of K/T. Advances in Math. 142, 171–220 (1999)
6. Foth, P.: Tetraplectic structures, tri-momentum maps, and quaternionic flag manifolds. J. Geom. Phys. 41, 330–343 (2002)
7. Ibáñez, R., de León, M., Marrero, J.C., Padrón, E.: Nambu-Jacobi and generalized Jacobi manifolds. J. Phys. A: Math. Gen. 31, 1267–1286 (1998)
8. Korogodski, L., Soibelman, I.: Algebras of functions on quantum groups. Math. surveys and monographs 56, Providence, RI: AMS, 1998
9. Kostant, B.: Lie algebra cohomology and generalized Schubert cells. Ann. Math. 77, 72–144 (1963)
10. Lu, J.-H., Ratiu, T.: On the non-linear convexity theorem of Kostant. Journal of AMS 4, 349–363 (1991)
11. Lu, J.-H., Weinstein, A.: Poisson-Lie groups, dressing transformations, and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
12. Lu, J.-H.: Coordinates on Schubert cells, Kostant's harmonic forms, and the Bruhat Poisson structures on G/B. Transform. Groups 4, 355–374 (1999)
13. Massey, W.S.: Non-existence of almost complex structures on quaternionic projective spaces. Pacific J. Math. 12, 1379–1384 (1962)
14. Semenov-Tian-Shansky, M.A.: Dressing transformations and Poisson Lie group actions. Publ. RIMS Kyoto University 21, 1237–1260 (1985)
15. Soibelman, Y.: The algebra of functions on a compact quantum group and its representations. Leningrad J. Math. 2, 161–178 (1991)
16. Takhtajan, L.: On foundation of the generalized Nambu mechanics. Commun. Math. Phys. 160, 295–315 (1994)
17. Vaisman, I.: Lectures on the geometry of Poisson manifolds. Progress in Mathematics 118, Boston: Birkhäuser, 1994

Communicated by L. Takhtajan

Commun. Math. Phys. 238, 131–147 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0856-y

Communications in Mathematical Physics

R-Matrix Structure of Hitchin System in Tyurin Parameterization

V.A. Dolgushev (1, 2, 3)

1 Department of Mathematics, MIT, 77 Massachusetts Avenue, Cambridge, MA, 02139-4307, USA. E-mail: [email protected]
2 University Center, JINR, Dubna, 141980 Moscow Region, Russia
3 Institute for Theoretical and Experimental Physics, 117259 Moscow, Russia

Received: 17 September 2002 / Accepted: 16 January 2003
Published online: 7 May 2003 – © Springer-Verlag 2003

Abstract: We present a classical r-matrix for the Hitchin system without marked points on an arbitrary non-degenerate algebraic curve of genus g ≥ 2 using Tyurin parameterization of holomorphic vector bundles.

1. Introduction

The study of the moduli space of holomorphic vector bundles over an algebraic curve, motivated by the geometric Langlands conjecture [4], is now one of the most fascinating topics of modern algebraic geometry. Important tools for the investigation are integrable systems of Hitchin type [20, 21, 24–26], whose configuration spaces are defined as connected components of the moduli space of holomorphic vector bundles over compact Riemann surfaces. It is not an easy task to give a satisfactory description of a Hitchin system, since its definition is implicit and at first sight it is clear neither how to find its Lax representation nor how to write down the respective equations of motion. For algebraic curves of genus zero and one this question was solved in the papers [7, 14, 16, 26], and for Schottky curves of arbitrary higher genus a description of Hitchin systems was proposed in [9].

In the paper by A. Tyurin [34] a classification of holomorphic vector bundles over algebraic curves of arbitrary genus is obtained and a convenient parameterization of big cells of connected components of the moduli space of the bundles is suggested. Since its introduction in [34], Tyurin parameterization has been used in works on integrable differential systems [8, 19, 21, 22, 28]; two papers [8, 21], in which the Tyurin description is used to parameterize Hitchin systems, are worthy of mention. In [8] the parameterization of Hitchin systems is obtained for the case of rank 2 holomorphic vector bundles of degree 2g over algebraic curves of genus g ≥ 1, and in [21] the Tyurin description is used to parameterize an arbitrary Hitchin system and to


construct infinite-dimensional field analogues¹ of systems of Hitchin type. The results of the papers [8] and [21] show that Tyurin parameterization should enable us to achieve a rough but explicit description of quantum Hitchin systems, and the first step in this direction is a construction of classical r-matrix structures for Hitchin systems which will allow us to quantize the systems in a quantum group theoretic setting [10, 15, 18].

The concept of a classical r-matrix was originally introduced in works of the "Leningrad school" [30, 32] (see also book [12]) as a natural object that encodes the Hamiltonian structure of the Lax equation, provides the involution of integrals of motion [3], and gives a natural framework for quantizing integrable systems.

In this paper we present a classical r-matrix for the Hitchin system without marked points on an arbitrary non-degenerate algebraic curve of genus g ≥ 2 using Tyurin parameterization of the moduli space of rank n holomorphic vector bundles of degree ng. Following Tyurin [34], a generic holomorphic vector bundle B of this type over a nondegenerate curve $\Sigma$ has an n-dimensional space $H^0(\Sigma, B)$ of holomorphic sections, and for a generic point P of the curve the sections generate a basis in the respective fiber $B_P$. However, the evaluations of these sections on ng points $\gamma_a \in \Sigma$, $a = 1, \dots, ng$, are linearly dependent, and for distinct points $\gamma_a$ they determine subspaces $V_a \subset B_{\gamma_a}$ of codimension 1 or just one-dimensional linear subspaces $l_a$ in the dual space $H^0(\Sigma, B)^*$. The collection of lines $l_a \subset H^0(\Sigma, B)^*$ can be identified with nonzero vectors $\alpha_a \in \mathbb{C}^n$, which are defined up to the scalar multiples

\[ \alpha_a \to \lambda_a \alpha_a, \qquad \lambda_a \in \mathbb{C}, \quad \lambda_a \neq 0 \]

and up to the following transformations of the group $SL_n(\mathbb{C})$²

\[ \alpha_a^i \to \tilde\alpha_a^i = \alpha_a^j (G^{-1})_j^{\,i}, \qquad \det \|G^i_{\,j}\| = 1, \quad i, j = 1, \dots, n, \tag{1} \]

generated by changes of basis in $H^0(\Sigma, B)$. Thus, we arrive at the Tyurin map from an open dense set of the moduli space of rank n holomorphic vector bundles of degree ng over the curve $\Sigma$ to the following quotient:

\[ [\Sigma \times \mathbb{P}(\mathbb{C}^n)]^{(ng)} / SL_n(\mathbb{C}), \tag{2} \]

where the notation (ng) stands for the symmetric direct product. The set of points $\gamma_a \in \Sigma$ and vectors $\alpha_a \in \mathbb{C}^n$ are referred to as Tyurin parameters, and the main statement of the paper [34] we are going to use is that an open dense set of the moduli space of rank n holomorphic vector bundles of degree ng over the curve $\Sigma$ is parameterized by points of the quotient (2). Note that in our considerations we are not going to bother about the singularities of the quotient of the space $[\Sigma \times \mathbb{P}(\mathbb{C}^n)]^{ng}$ with respect to the action of the symmetric group $S_{ng}$, and in what follows we omit factorization of our phase space with respect to permutations.

¹ In this context paper [24] is also worthy of mention, in which the case of the two-dimensional version of the elliptic Gaudin system is considered in detail.
² We assume here the summation over repeated indices.

In order to parameterize the phase space of the Hitchin system without marked points one has to supplement Tyurin parameters $(\gamma_a, \alpha_a)$ with points $\kappa_a \in T^*_{\gamma_a}\Sigma$ and vectors


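$\beta_a \in \mathbb{C}^n$, which are subject to the following conditions:

\[ \sum_{i=1}^{n} \beta_a^i \alpha_a^i = 0, \tag{3} \]

\[ T_{ij} = \sum_{a=1}^{ng} \beta_a^i \alpha_a^j = 0. \tag{4} \]

Equation (3) means that the $\beta_a^i$ may be regarded as coordinates in the cotangent space $T^*_{\alpha_a}\mathbb{P}^{n-1}$, and Eqs. (4) are just the first class constraint conditions corresponding to the symplectic action of the group $SL_n(\mathbb{C})$ on the parameters $\alpha_a^i$ and $\beta_a^i$, that is

\[ \alpha_a^i \to \alpha_a^j (G^{-1})_j^{\,i}, \qquad \beta_a^i \to G^i_{\,j} \beta_a^j, \qquad G \in SL_n(\mathbb{C}). \tag{5} \]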
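As a quick consistency check on (3)–(5), the following short computation (a sketch that uses only the transformation rules (5) and the summation convention over repeated indices) shows that the matrix $T_{ij}$ transforms by conjugation, so that the constraint surface $T_{ij} = 0$ is preserved, and that the condition (3) is invariant under the group action.

```latex
% Sketch: behaviour of (3) and (4) under the SL_n(C) action (5).
% Repeated indices k, l, i are summed over.
\begin{align*}
\beta_a^i \alpha_a^j
  &\;\longmapsto\; G^i_{\,k}\,\beta_a^k\,\alpha_a^l\,(G^{-1})_l^{\,j},
  &&\text{hence}\quad T_{ij} \;\longmapsto\; G^i_{\,k}\,T_{kl}\,(G^{-1})_l^{\,j},\\[2pt]
\sum_{i=1}^{n}\beta_a^i \alpha_a^i
  &\;\longmapsto\; \beta_a^k\,\alpha_a^l\,(G^{-1})_l^{\,i}\,G^i_{\,k}
   \;=\; \beta_a^k\,\alpha_a^l\,\delta_{lk}
   \;=\; \sum_{k=1}^{n}\beta_a^k \alpha_a^k .
\end{align*}
% In particular T = 0 is mapped to T = 0, and (3) holds after the
% transformation if and only if it holds before it.
```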

In other words, the phase space of the Hitchin system in question can be obtained via symplectic reduction in the space

\[ \mathcal{P} = T^*[\Sigma \times \mathbb{P}(\mathbb{C}^n)]^{ng} \tag{6} \]

on the surface of the first class constraints³ (4).

The main statement of the paper (see Theorem 1) is that the Krichever Lax matrix of the Hitchin system, being extended to the symplectic manifold (6), admits a simple r-matrix structure, which is defined by a matrix-valued meromorphic section of the bundle $\Sigma \times T^*\Sigma$ over the direct product of curves $\Sigma \times \Sigma$. We argue that using the r-matrix structure one can easily derive the classical r-matrix for the initial Lax matrix of the Hitchin system either with the help of a gauge invariant extension of the Krichever Lax matrix to the manifold (6) or with the help of on-shell Dirac brackets between the entries of the initial extension of the Krichever Lax matrix. Note however that the r-matrix structure of the extended system is much simpler than the resulting r-matrix of the Hitchin system, and this remarkable simplification turns out to be possible due to the fact⁴ that the Krichever Lax matrix of the Hitchin system, being a meromorphic differential on the curve $\Sigma$, can be extended to the symplectic manifold (6) in such a way that the extension is also a meromorphic matrix-valued differential on $\Sigma$.

³ A similar trick is used in [7] for description of Hitchin systems associated with marked rational and elliptic curves.
⁴ I am indebted to A.M. Levin for the technical trick concerning the extension of the Krichever Lax matrix.

The organization of the paper is as follows. In the second section we present the extension of the Krichever Lax matrix for the Hitchin system without marked points on a non-degenerate algebraic curve of genus g ≥ 2 and propose that the extended system admits an r-matrix structure, which is defined as a meromorphic matrix-valued function on one copy of the curve and a meromorphic 1-form on another copy of the same curve. Then, postponing the proof of this proposition to the next section, we show how to derive the classical r-matrix for the genuine Krichever Lax matrix of the Hitchin system using the above r-matrix structure. Before presenting the proof in Sect. 3 we show that a matrix-valued differential that enters into the definition of the above r-matrix structure does exist. We also give the


properties of the differential as a function in the first variable and identify derivatives of the extended Krichever Lax matrix with respect to phase space variables as meromorphic differentials on the algebraic curve. In the concluding section we mention dynamical properties of the presented r-matrices, discuss a possibility to derive the r-matrices using an infinite-dimensional Hamiltonian reduction, and raise some other questions. In the appendix at the end of the paper we present the Krichever lemma, which is used throughout the paper as a tool that enables us to identify meromorphic vector-valued differentials by their singular parts and certain linear equations for their regular parts. Although the statement is analogous to Lemma 2.2 in [21], we present its proof in the appendix since in some respect the statement generalizes the lemma and the presented proof differs from the one given in [21].

In this paper we use standard notations for Poisson brackets between entries of a Lax matrix. For example, if L(z) is a matrix-valued function and

\[ r(z, w) = \sum_{i,j,k,l} r_{ij\,kl}(z, w)\, e_{ij} \otimes e_{kl}, \tag{7} \]

where

\[ (e_{ij})_{kl} = \delta_{ik}\delta_{jl} \]

are the elements of the standard basis in $gl_n(\mathbb{C})$, then the expression

\[ \{L_1(z), L_2(w)\} = [r(z, w), L_1(z)] - [r_{21}(w, z), L_2(w)] \]

means that Poisson brackets between the entries $L_{ij}(z)$ and $L_{kl}(w)$ take the following form:

\[ \{L_{ij}(z), L_{kl}(w)\} = \sum_{m=1}^{n} \bigl( r_{im\,kl}(z, w) L_{mj}(z) - L_{im}(z) r_{mj\,kl}(z, w) \bigr) - \sum_{m=1}^{n} \bigl( r_{km\,ij}(w, z) L_{ml}(w) - L_{km}(w) r_{ml\,ij}(w, z) \bigr). \]
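The passage from the tensor form of the bracket to its component form can be spelled out as follows. This is a sketch under the usual conventions, which are assumed here rather than stated in the text: $L_1 = L \otimes 1$, $L_2 = 1 \otimes L$, and $r_{21}$ is obtained from $r$ by interchanging the two tensor factors.

```latex
% Sketch: component form of [r(z,w), L_1(z)], using e_{ij} e_{ab} = \delta_{ja} e_{ib}.
% Assumed conventions: L_1 = L \otimes 1,  L_2 = 1 \otimes L,
%   r_{21}(w,z) = \sum r_{ij\,kl}(w,z)\, e_{kl} \otimes e_{ij}.
\begin{align*}
[r(z,w), L_1(z)]
 &= \sum_{i,j,k,l}\sum_{a,b} r_{ij\,kl}(z,w)\,L_{ab}(z)\,[e_{ij}, e_{ab}] \otimes e_{kl} \\
 &= \sum_{i,j,k,l}\sum_{a,b} r_{ij\,kl}(z,w)\,L_{ab}(z)\,
    \bigl(\delta_{ja}\,e_{ib} - \delta_{bi}\,e_{aj}\bigr) \otimes e_{kl} \\
 &= \sum_{i,j,k,l}\;\sum_{m=1}^{n}
    \bigl( r_{im\,kl}(z,w)\,L_{mj}(z) - L_{im}(z)\,r_{mj\,kl}(z,w) \bigr)\,
    e_{ij} \otimes e_{kl}.
\end{align*}
% The same computation in the second tensor factor, with the components of
% r_{21}(w,z) equal to r_{kl\,ij}(w,z), gives
\begin{align*}
[r_{21}(w,z), L_2(w)]
 = \sum_{i,j,k,l}\;\sum_{m=1}^{n}
   \bigl( r_{km\,ij}(w,z)\,L_{ml}(w) - L_{km}(w)\,r_{ml\,ij}(w,z) \bigr)\,
   e_{ij} \otimes e_{kl}.
\end{align*}
```

Reading off the coefficient of $e_{ij} \otimes e_{kl}$ in the difference of the two commutators reproduces the component formula above.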

Throughout the paper we assume that $\Sigma$ is a non-degenerate algebraic curve of genus g ≥ 2.

2. R-Matrix Structure for the Hitchin System Without Marked Points

We start with the following particular case of Lemma 2.2 in [21]:

Lemma 1. For a generic set of pairs $(\gamma_a, \kappa_a)$, $\gamma_a \in \Sigma$, $\kappa_a \in T^*_{\gamma_a}\Sigma$, $a = 1, \dots, ng$, and complex parameters $\alpha_a^i$ and $\beta_a^i$, $i = 1, \dots, n$, such that

\[ \sum_{i=1}^{n} \beta_a^i \alpha_a^i = 0 \tag{8} \]

there exists a unique matrix-valued meromorphic differential $L_{ij} = L_{ij}(z)dz$ of the third kind satisfying the following properties:

1. The differential $L_{ij}$ has poles only at the points $\gamma_a$ and at some fixed point $P \in \Sigma$.


2. On a neighborhood of the point $\gamma_a$ the differential $L_{ij}(z)dz$ behaves like

\[ L_{ij}(z) = \frac{\beta_a^i \alpha_a^j}{z - z(\gamma_a)} + L^{a,0}_{ij} + L^{a,1}_{ij}\,(z - z(\gamma_a)) + \cdots. \tag{9} \]

3. $\alpha_a$ is a left eigenvector for the matrix $\|L^{a,0}_{ij}\|$ with the eigenvalue $\kappa_a$:

\[ \sum_{i=1}^{n} \alpha_a^i L^{a,0}_{ij} = \kappa_a \alpha_a^j. \tag{10} \]

The differential $L_{ij}(z)dz$ is obviously invariant under the transformations

\[ \alpha_a \to \lambda_a \alpha_a, \qquad \beta_a \to \lambda_a^{-1} \beta_a, \qquad \lambda_a \in \mathbb{C}, \quad \lambda_a \neq 0 \tag{11} \]

and, hence, it may be regarded as a function with values in meromorphic differentials on an open dense set of the space (6), so that the components of the vector $\alpha_a$ are identified with homogeneous coordinates in $\mathbb{P}(\mathbb{C}^n)$ and the components of the vector $\beta_a$, being subject to the conditions (8), define a point in the respective cotangent space $T^*_{\alpha_a}\mathbb{P}(\mathbb{C}^n)$.

The differential $L_{ij}(z)dz$ and its natural generalizations were originally found in the paper [21] by Krichever as solutions of the momentum map equations for Hitchin systems. Although the differential $L_{ij}(z)dz$ is not a Krichever Lax matrix of the Hitchin system without marked points, since Eqs. (4) are not imposed, $L_{ij}(z)dz$ may be regarded as an extension of the above Lax matrix to the symplectic manifold $\mathcal{P}$. In what follows, we refer to L as a Krichever Lax differential.

Notice that in view of Lemma 2.1 of [21], the differential $L_{ij}(z)dz$ can be identified with a meromorphic section, with a single pole at the point P, of the bundle End(B) ⊗ K, where B is the holomorphic bundle over $\Sigma$ corresponding to the Tyurin parameters $\gamma_a$ and $\alpha_a^i$ and K is a canonical bundle of the curve $\Sigma$.

Soon we will show that the Krichever Lax differential, being considered as a function on the symplectic manifold $\mathcal{P}$, admits an r-matrix structure, but now we present an important ingredient which enters into the definition of the r-matrix structure in question.

Lemma 2. For a generic set of Tyurin parameters $\alpha_a$ and $\gamma_a$ there exists a unique matrix-valued differential $r_{jk}(z, w)dw$ such that

1. $r_{jk}(z, w)dw$ is a meromorphic function in z and a meromorphic 1-form in w,
2. $r_{jk}(z, w)dw$ is holomorphic in w everywhere on $\Sigma$ except the points $w = w(P)$ and $w = z$, where it has simple poles with residues $\delta_{jk}$ and $-\delta_{jk}$, respectively,
3. $\alpha_a$ are null vectors for the matrices $r_{jk}(z, \gamma_a)$,

\[ \sum_{k=1}^{n} r_{jk}(z, \gamma_a)\,\alpha_a^k = 0. \tag{12} \]

The existence of the meromorphic differential $r_{jk}(z, w)dw$, which is also a meromorphic function in z satisfying the above conditions, is proved in Subsect. 3.1, where a stronger statement (see Lemma 3) concerning the properties of the differential $r_{jk}(z, w)dw$ as a function in z is also formulated. The uniqueness of the differential $r_{jk}(z, w)dw$, in turn, follows directly from the Krichever lemma. We now present the main statement of the paper.


Theorem 1. For an arbitrary non-degenerate algebraic curve of genus g ≥ 2 the canonical Poisson brackets of the space (6) between the entries of the Krichever Lax differential (9) obey the Yang-Baxter relation

\[ \{L_1(z), L_2(w)\}\,dz \otimes dw = [r(z, w), L_1(z)]\,dz \otimes dw - [r_{21}(w, z), L_2(w)]\,dz \otimes dw, \tag{13} \]

where the differential $r(z, w)dw$ is given by the formula

\[ r(z, w)dw = \sum_{i,j,k} r_{jk}(z, w)\, e_{ij} \otimes e_{ki}\, dw, \tag{14} \]

and $r_{jk}(z, w)dw$ is the meromorphic 1-form defined in Lemma 2.

We will refer to the differential (14) as an r-matrix differential. In the following section we present an algebraic-geometric proof of the theorem. First we explain how to obtain the r-matrix for the Hitchin system we consider using the differential (14).

As we have mentioned in the introduction, the phase space of the Hitchin system without marked points can be identified with an open dense set of the quotient of the constraint surface (4) in the space (6) with respect to the symplectic action (5) of the group $SL_n(\mathbb{C})$. In other words, if one chooses some gauge fixing conditions

\[ \chi^{ij}(\alpha_a^k) = 0 \tag{15} \]

for the transformations (5), then the phase space of the Hitchin system can be roughly identified with an intersection of the surfaces (4) and (15) in the space (6), and the respective Krichever Lax matrix is defined as the differential (9) restricted to the intersection:

\[ l_{ij}(z)dz = L_{ij}(z)dz\,\big|_{T_{kl} = \chi^{kl} = 0}. \tag{16} \]

Obviously, the Lax matrix (16) is a meromorphic differential on the curve $\Sigma$ with the same properties (9), (10) as the Krichever Lax differential, except that the point P is now regular for the differential (16). In view of Lemma 2.1 of [21], this means that the differential (16) can be identified with a holomorphic section of the bundle End(B) ⊗ K. The gauge transformations (5) of the Krichever Lax differential (9) have a form of adjoint action

\[ L(z) \to G L(z) G^{-1}, \qquad G \in SL_n(\mathbb{C}), \tag{17} \]

and hence, whatever gauge fixing conditions are chosen, the desired r-matrix of the Hitchin system can be derived from the r-matrix differential (14) either with the help of a gauge invariant extension of the Krichever Lax matrix (16) or with the help of on-shell Dirac brackets between the entries of the differential (9). Recall that the gauge invariant extension of Lax matrices was originally used to calculate classical r-matrices for integrable systems in the works [1] and [2]. In a more general situation Dirac brackets and gauge invariant extension of Lax matrices are used for analogous calculations in the paper [5]. Finally, in the paper [13] the Dirac bracket technique is used in a specific framework to obtain new examples of Etingof-Varchenko dynamical r-matrices [11].
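The adjoint form (17) of the gauge action can be checked directly on the data that define the Krichever Lax differential in Lemma 1. The computation below is a sketch: it verifies that $G L(z) G^{-1}$ has the residue (9) and satisfies the eigenvector condition (10) for the transformed parameters (5), after which the uniqueness statement of Lemma 1 identifies it with the Lax differential of the transformed parameters.

```latex
% Sketch: G L G^{-1} satisfies (9) and (10) for the transformed parameters
%   \tilde\alpha_a^i = \alpha_a^j (G^{-1})_j^{\,i},  \tilde\beta_a^i = G^i_{\,j}\beta_a^j.
% Repeated indices are summed over.
\begin{align*}
\bigl(G\,\mathrm{Res}_{z=\gamma_a}L\;G^{-1}\bigr)_{ij}
 &= G^i_{\,k}\,\beta_a^k\,\alpha_a^l\,(G^{-1})_l^{\,j}
  = \tilde\beta_a^i\,\tilde\alpha_a^j, \\[2pt]
\tilde\alpha_a^i\,\bigl(G\,L^{a,0}\,G^{-1}\bigr)_{ij}
 &= \alpha_a^m\,(G^{-1})_m^{\,i}\,G^i_{\,k}\,L^{a,0}_{kl}\,(G^{-1})_l^{\,j}
  = \alpha_a^k\,L^{a,0}_{kl}\,(G^{-1})_l^{\,j} \\
 &= \kappa_a\,\alpha_a^l\,(G^{-1})_l^{\,j}
  = \kappa_a\,\tilde\alpha_a^j .
\end{align*}
% The poles of G L G^{-1} are at the same points \gamma_a and P as those of L,
% since G does not depend on z.
```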


To derive the classical r-matrix for the Hitchin system we use the gauge invariant extension of the Krichever Lax matrix (16) to the space (6). For example, if some n × n minor $\|\alpha_{a_i}^j\|$ of the matrix $\|\alpha_a^i\|$, where $1 \le a_1 < a_2 < \cdots < a_n \le ng$, is non-degenerate, we can choose gauge fixing conditions in the form [21]

\[ \alpha_{a_i}^j = 0, \quad i \neq j, \qquad \alpha_b^1 = \alpha_b^2 = \cdots = \alpha_b^n, \tag{18} \]

where b does not coincide with any of the indices $a_1, a_2, \dots, a_n$. On an open region of the space (6) one can define the $SL_n$-valued function $G(\alpha_a)$ such that if the vectors $\alpha_a$ do not satisfy the gauge fixing conditions (18) then the transformed vectors $\tilde\alpha_a$,

\[ \tilde\alpha_a^i = \alpha_a^j \bigl(G^{-1}(\alpha_c)\bigr)_j^{\,i}, \]

do so. Otherwise $G(\alpha_a)$ is just the identity matrix. Then the matrix-valued differential

\[ l^G(z) = G(\alpha_a)\,L(z)\,G^{-1}(\alpha_a) \tag{19} \]

turns out to be a desired gauge invariant extension of the Krichever Lax matrix (16) to the space $\mathcal{P}$, and the r-matrix in question takes the form

\[ r^H(z, w)dw = \bigl( r(z, w)dw + \{G_1(\alpha_a), L_2(w)\}dw \bigr)\big|_{\text{on shell}}, \tag{20} \]

where the notation $|_{\text{on shell}}$ means that the expression in the parentheses is considered on the surface of the constraints (4) and (18).

Example. Although Hitchin systems without marked points are non-trivial only for algebraic curves of genus g ≥ 2, the Krichever Lax differential (9) and its r-matrix structure (14) exist on an elliptic curve as well. To show this, we realize an elliptic curve as a quotient $\Sigma = \mathbb{C}/\{1, \tau\}$, $\mathrm{Im}\,\tau > 0$, and denote the parameters $\gamma_a$ and $\kappa_a$ by $q_a$ and $p_a$, respectively, where a now runs from 1 to n. Then, the Krichever Lax differential (9) and the r-matrix differential (14) can be written in terms of the standard θ-function as follows:

\[
\begin{aligned}
L_{ij}(z) &= \sum_{k,l=1}^{n} \pi_i^k\, \tilde L_{kl}(z)\, \alpha_l^j, \qquad \tilde L_{ii} = p_i, \\
\tilde L_{ij}(z) &= \sum_{k=1}^{n} \alpha_i^k \beta_j^k\; \frac{\theta(z - q_i)\,\theta(z + q_i - q_j)\,\theta(q_j)\,\theta'(0)}{\theta(z)\,\theta(z - q_j)\,\theta(q_j - q_i)\,\theta(q_i)}, \qquad i \neq j, \\
r(z, w) &= \sum_{i,j=1}^{n} \bigl( E(z - w) + E(w) \bigr)\, e_{ij} \otimes e_{ji} - \sum_{i,j,k,a=1}^{n} \pi_k^a\, \alpha_a^j\, \bigl( E(z - q_a) + E(q_a) \bigr)\, e_{ij} \otimes e_{ki},
\end{aligned}
\tag{21}
\]

where $\|\pi_i^j\|$ is the inverse matrix to $\|\alpha_k^l\|$,

\[ \sum_{k=1}^{n} \alpha_i^k \pi_k^j = \delta_i^j, \tag{22} \]


\[ \theta(z) = \sum_{m \in \mathbb{Z}} \exp\bigl( \pi i \tau (m + 1/2)^2 + 2\pi i (m + 1/2)(z + 1/2) \bigr), \]

and

\[ E(z) = \frac{\theta'(z)}{\theta(z)}. \]
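Two elementary properties of the θ-function defined by the series above are useful when reading the elliptic formulas: θ is odd, and E is periodic up to an additive constant. The following sketch derives both by shifting the summation index m; the only input beyond the definitions of θ and E is the standard fact, assumed here, that the zero of θ at z = 0 is simple.

```latex
% Sketch: parity and quasi-periodicity of
%   \theta(z) = \sum_m \exp(\pi i\tau (m+1/2)^2 + 2\pi i (m+1/2)(z+1/2)).
\begin{align*}
% substitute m -> -m-1, so that (m+1/2) -> -(m+1/2):
\theta(-z) &= \sum_{m\in\mathbb{Z}}
   \exp\bigl(\pi i\tau (m+\tfrac12)^2 + 2\pi i (m+\tfrac12)(z+\tfrac12)\bigr)\,
   e^{-2\pi i (m+\frac12)} = -\theta(z), \\
% each term acquires the factor e^{2\pi i (m+1/2)} = -1:
\theta(z+1) &= -\theta(z), \\
% complete the square, (m+1/2)^2 + 2(m+1/2) = (m+3/2)^2 - 1, then shift m -> m-1:
\theta(z+\tau) &= e^{-\pi i \tau}\,e^{-2\pi i (z+\frac12)}\,\theta(z)
             = -\,e^{-\pi i\tau - 2\pi i z}\,\theta(z).
\end{align*}
% Consequences for E = \theta'/\theta (assuming \theta'(0) \neq 0):
\begin{align*}
E(z+1) = E(z), \qquad E(z+\tau) = E(z) - 2\pi i, \qquad
E(z) = \frac{1}{z} + O(z) \quad (z \to 0).
\end{align*}
```

In particular $\theta(0) = 0$, and the combination $E(z - w) + E(w)$ is invariant under $w \to w + 1$ and $w \to w + \tau$, with simple poles at $w = 0$ and $w = z$ of residues $+1$ and $-1$, in agreement with condition 2 of Lemma 2.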

To explain the relation of the Lax matrix (21) to Lax representation of known integrable systems we have to enlarge the phase space parameterized by coordinates $q_a$, $p_a$, $\alpha_a^i$, and $\beta_a^i$ with some coadjoint orbit O of the group $SL_N$. Symplectic reduction of this space to the first class constraint surface

\[ \sum_{a=1}^{ng} \beta_a^i \alpha_a^j + \eta_{ij} = 0 \tag{23} \]

leads us to the phase space and the Lax matrix⁵ of the elliptic spin Calogero-Moser system [26, 27]. Here $\eta_{ij}$ denote conventional coordinates on the coadjoint orbit O. If we now restrict O to be the maximal coadjoint orbit we just get the particular case of one marked point of the integrable system considered in [7]. The latter system is now generally regarded as an elliptic Gaudin system [26, 33].

⁵ We note that the Lax matrix of the elliptic spin Calogero-Moser system was originally presented as a meromorphic function on the elliptic curve in the paper [21].

3. The Proof of the Yang-Baxter Relation

The proof of Theorem 1 is based on the observation that both sides of Eq. (13) satisfy the same properties, which, in turn, uniquely define them as meromorphic forms on the direct product of curves $\Sigma \times \Sigma$. Namely, it turns out that both sides of Eq. (13) have coincident singular parts while their regular parts at the points $\gamma_a$ obey the same linear inhomogeneous equations, which uniquely define the remaining arbitrariness in the holomorphic parts due to the Krichever lemma.

To calculate Poisson brackets between the entries of the Krichever Lax differential we choose the local chart of the space (6) where

\[ \alpha_a^1 = 1, \qquad \beta_a^1 = -\sum_{\mu=2}^{n} \beta_a^\mu \alpha_a^\mu, \qquad \forall\, a = 1, \dots, ng. \tag{24} \]

Note that although a choice of another local affine chart affects intermediate calculations, the Poisson bracket

\[ \{L_{ij}(z), L_{kl}(w)\}\,dz \otimes dw \tag{25} \]

is, in fact, "a function" on the space (6), and therefore the properties of the expression (25) as a form on the product of curves $\Sigma \times \Sigma$ do not depend on the choice of local coordinates on $\mathcal{P}$. Throughout this section we also assume that some local coordinates are chosen on neighborhoods of the points $\gamma_a$ on the curve $\Sigma$, and for simplicity we denote the coordinate $z(\gamma_a)$ by the same letter $\gamma_a$.


3.1. Properties of the r-matrix differential as a function of the first argument. We start this subsection with the following

Lemma 3. The differential $r_{ij}(z, w)dw$ defined in Lemma 2 exists and is holomorphic in z everywhere on $\Sigma$ except the points $\gamma_a$, where it has simple poles. The differential $r_{ij}(z, w)dw$ is also vanishing at the point $z = z(P)$,

\[ r_{ij}(z(P), w)dw = 0. \tag{26} \]

Proof. First, using the Krichever lemma we introduce auxiliary holomorphic vector-valued differentials $u_{ai}(z)dz$, which are uniquely defined by the following properties:

\[ \sum_{i=1}^{n} u_{ai}(\gamma_b)\,\alpha_b^i = \delta_{ab}. \tag{27} \]

Using standard arguments based on the Kodaira-Nakano vanishing theorem and GAGA principles one can easily show that for an arbitrary point Q ∈ there exists a matrix-valued differential ij (z, w)dw, which is holomorphic in z on some open neighborhood UQ of the point Q and holomorphic in w everywhere on except the points w = w(P ) and w = z, where the differential has simple poles with residues δij and −δij , respectively. It is easy to see that the following matrix-valued differential: U rij Q (z, w)dw = ij (z, w)dw − ik (z, γa )αak uaj (w)dw (28) a,k

is meromorphic in z on the neighborhood $U_Q$ and satisfies conditions 2 and 3 of Lemma 2. Since conditions 2 and 3 of Lemma 2 uniquely determine $r^{U_Q}_{ij}(z, w)dw$ as a 1-form in w, we can define the desired differential $r_{ij}(z, w)dw$ by its restrictions (28) to the sets $U_Q$. Equation (28) also implies that the resulting differential $r_{ij}(z, w)dw$ is holomorphic in z everywhere on $\Sigma$ except the points $\gamma_a$, and on neighborhoods of these points the differential behaves like

\[ r_{ij}(z, w)dw = -\frac{\alpha_a^i\, u_{aj}(w)dw}{z - \gamma_a} + \text{regular terms}. \tag{29} \]

Note also that, as the differential $r_{ij}(z(P), w)dw$ is holomorphic in w everywhere on $\Sigma$, Eqs. (12) imply that the differential in fact vanishes due to the Krichever lemma. Thus, the statement is proved.
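The two properties of the local expression (28) that are used in this proof can be checked in a few lines. In the sketch below the auxiliary differential on the right-hand side of (28) is written as $A_{ij}(z, w)dw$; the letter $A$ is a label chosen here purely for illustration. Only (27) and the stated residue of the auxiliary differential at $w = z$ are used.

```latex
% Sketch, with A a hypothetical label for the auxiliary differential in (28):
%   r_{ij}(z,w) = A_{ij}(z,w) - \sum_{a,k} A_{ik}(z,\gamma_a)\,\alpha_a^k\,u_{aj}(w).
% (i) Null-vector condition (12) at w = \gamma_b, using (27):
\begin{align*}
\sum_{j} r_{ij}(z,\gamma_b)\,\alpha_b^j
 &= \sum_{j} A_{ij}(z,\gamma_b)\,\alpha_b^j
  - \sum_{a,k} A_{ik}(z,\gamma_a)\,\alpha_a^k \sum_{j} u_{aj}(\gamma_b)\,\alpha_b^j \\
 &= \sum_{j} A_{ij}(z,\gamma_b)\,\alpha_b^j
  - \sum_{k} A_{ik}(z,\gamma_b)\,\alpha_b^k \;=\; 0 .
\end{align*}
% (ii) Pole in z at z = \gamma_a: since A_{ik}(z,w) \sim -\delta_{ik}/(w-z) near w = z,
\begin{align*}
A_{ik}(z,\gamma_a) \sim \frac{\delta_{ik}}{z-\gamma_a}
\quad\Longrightarrow\quad
r_{ij}(z,w) \sim -\,\frac{\alpha_a^i\,u_{aj}(w)}{z-\gamma_a}
\qquad (z \to \gamma_a),
\end{align*}
% which is the singular part stated in (29).
```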

In order to prove the Yang-Baxter relation we have to identify the next two coefficients of the Laurent expansion of the differential $r_{ij}(z, w)dw$ in the first variable z around a point $\gamma_a$. In the following lemma we identify these coefficients as meromorphic differentials on the curve $\Sigma$.

Lemma 4. The expansion coefficients $r^{a,0}_{ij}(w)dw$ and $r^{a,1}_{ij}(w)dw$ of the Laurent series

\[ r_{ij}(z, w)dw = -\frac{\alpha_a^i\, u_{aj}(w)dw}{z - \gamma_a} + r^{a,0}_{ij}(w)dw + (z - \gamma_a)\, r^{a,1}_{ij}(w)dw + o(z - \gamma_a) \tag{30} \]


of the differential $r_{ij}(z, w)dw$ on a neighborhood $U_{\gamma_a}$ of a point $\gamma_a$ are uniquely defined by the following properties⁶:

1. The 1-form $r^{a,0}_{ij}(w)dw$ is holomorphic everywhere on $\Sigma$ except the points P and $\gamma_a$, where it has simple poles with residues $\delta_{ij}$ and $-\delta_{ij}$, respectively.

2. For $b \neq a$, $\alpha_b$ is a null vector for the matrix $\|r^{a,0}_{ij}(\gamma_b)\|$,

\[ \sum_{j=1}^{n} r^{a,0}_{ij}(\gamma_b)\,\alpha_b^j = 0, \qquad b \neq a, \]

and $\alpha_a$ is a null vector for the regular part of the matrix $\|r^{a,0}_{ij}(w)\|$ at the point $w = \gamma_a$,

\[ \sum_{j=1}^{n} r^{a,0}_{ij}(w)\,\alpha_a^j \,\big|_{\text{regular part at } w = \gamma_a} = 0. \]

3. The 1-form $r^{a,1}_{ij}(w)dw$ has a single pole at the point $\gamma_a$ and on a neighborhood of the point it behaves like

\[ r^{a,1}_{ij}(w) = -\frac{\delta_{ij}}{(w - \gamma_a)^2} + \text{regular terms}. \tag{31} \]

4. For $b \neq a$, $\alpha_b$ is a null vector for the matrix $\|r^{a,1}_{ij}(\gamma_b)\|$:

\[ \sum_{j=1}^{n} r^{a,1}_{ij}(\gamma_b)\,\alpha_b^j = 0, \qquad b \neq a, \]

and $\alpha_a$ is a null vector for the regular part of the matrix $\|r^{a,1}_{ij}(w)\|$ at the point $w = \gamma_a$:

\[ \sum_{j=1}^{n} r^{a,1}_{ij}(w)\,\alpha_a^j \,\big|_{\text{regular part at } w = \gamma_a} = 0. \]

⁶ Note that the uniqueness of the differentials $r^{a,0}_{ij}(w)dw$ and $r^{a,1}_{ij}(w)dw$ satisfying the presented properties follows from the Krichever lemma.

Proof. Applying the properties of the 1-form $r_{ij}(z, w)dw$ (see Lemma 2) to the Laurent expansion (30) we get that outside the neighborhood $U_{\gamma_a}$ the differential $r^{a,0}_{ij}(w)dw$ has only a simple pole at the point $w = w(P)$ with the residue $\delta_{ij}$, the differential $r^{a,1}_{ij}(w)dw$ is holomorphic in the region $\Sigma \setminus U_{\gamma_a}$, and for $b \neq a$, $\alpha_b$ is a right null vector for the matrices $\|r^{a,0}_{ij}(\gamma_b)\|$ and $\|r^{a,1}_{ij}(\gamma_b)\|$,

\[ \sum_{j=1}^{n} r^{a,0}_{ij}(\gamma_b)\,\alpha_b^j = 0, \qquad \sum_{j=1}^{n} r^{a,1}_{ij}(\gamma_b)\,\alpha_b^j = 0, \qquad b \neq a. \]

The expansion (30) cannot be used for the case when w is in the neighborhood $U_{\gamma_a}$ because $r_{ij}(z, w)$ is irregular at the point z = w.


In order to cure the problem we consider the function

\[ \varphi_{ij}(z) = r_{ij}(z, w) + \frac{\delta_{ij}}{w - z}, \]

which is already holomorphic at the point z = w, and therefore the Laurent expansion

\[ \varphi_{ij}(z) = -\frac{\alpha_a^i\, u_{aj}(w)dw}{z - \gamma_a} + \left( r^{a,0}_{ij}(w) + \frac{\delta_{ij}}{w - \gamma_a} \right) + \left( r^{a,1}_{ij}(w) + \frac{\delta_{ij}}{(w - \gamma_a)^2} \right)(z - \gamma_a) + o(z - \gamma_a) \tag{32} \]

of the function is convergent on the neighborhood $U_{\gamma_a}$ even in the case when the point w is in the neighborhood. Hence, we can apply the remaining properties of the differential $r_{ij}(z, w)dw$ to the expansion (32) and finally get that on the neighborhood $U_{\gamma_a}$ the differentials $r^{a,0}_{ij}(w)dw$ and $r^{a,1}_{ij}(w)dw$ behave like

\[ r^{a,0}_{ij}(w)dw = -\frac{\delta_{ij}}{w - \gamma_a} + \text{regular terms}, \qquad r^{a,1}_{ij}(w)dw = -\frac{\delta_{ij}}{(w - \gamma_a)^2} + \text{regular terms}, \]

and $\alpha_a$ is a right null vector for the regular parts of the matrices $\|r^{a,0}_{ij}(w)\|$ and $\|r^{a,1}_{ij}(w)\|$ at the point $w = \gamma_a$. Thus, the lemma is proved.
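The extra terms that appear in the expansion (32) come from expanding the subtracted pole in powers of $z - \gamma_a$. The following sketch records this elementary step; it uses only the geometric series and the local coordinate convention $z(\gamma_a) = \gamma_a$ adopted above.

```latex
% Sketch: expansion of the subtracted pole around z = \gamma_a,
% valid for |z - \gamma_a| < |w - \gamma_a|.
\begin{align*}
\frac{1}{w - z}
 &= \frac{1}{(w-\gamma_a) - (z-\gamma_a)}
  = \frac{1}{w-\gamma_a}\sum_{n\ge 0}\left(\frac{z-\gamma_a}{w-\gamma_a}\right)^{n} \\
 &= \frac{1}{w-\gamma_a} + \frac{z-\gamma_a}{(w-\gamma_a)^2} + O\bigl((z-\gamma_a)^2\bigr).
\end{align*}
% Adding \delta_{ij}/(w-z) to (30) therefore shifts the coefficients of
% (z-\gamma_a)^0 and (z-\gamma_a)^1 by \delta_{ij}/(w-\gamma_a) and
% \delta_{ij}/(w-\gamma_a)^2 respectively, as in (32).
```

Since the shifted coefficients in (32) are regular at $w = \gamma_a$, the unshifted ones must carry the singular parts $-\delta_{ij}/(w - \gamma_a)$ and $-\delta_{ij}/(w - \gamma_a)^2$.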

3.2. Derivatives of the Krichever Lax differential. In this subsection we present the properties of derivatives of the differential (9) with respect to the variables $\gamma_a$ and $\kappa_a$ and with respect to the canonical coordinates $\alpha_a^\mu$ and $\beta_a^\mu$, $\mu = 2, \dots, n$, in the local chart (24) on the space $\mathcal{P}$. As will be seen, the properties uniquely define the derivatives of L as meromorphic differentials on the curve $\Sigma$.

First, we note that the differential $\partial_{\kappa_a} L_{ij}(z)dz$ can be written in the following form:

\[ \partial_{\kappa_a} L_{ij}(z)dz = \alpha_a^j\, u_{ai}(z)dz, \tag{33} \]

where $u_{ai}(z)dz$ are holomorphic differentials defined by Eq. (27).

Second, the differential $\partial_{\beta_a^\mu} L_{ij}(z)dz$ has at most simple poles at the points P and $\gamma_a$, and the residue of $\partial_{\beta_a^\mu} L_{ij}(z)dz$ at the point $\gamma_a$ equals

\[ \mathrm{Res}_{z = \gamma_a}\, \partial_{\beta_a^\mu} L_{ij}(z)dz = \delta_{i\mu}\,\alpha_a^j - \delta_{i1}\,\alpha_a^\mu \alpha_a^j. \]

For $b \neq a$, $\alpha_b$ is a left null vector for the matrix $\|\partial_{\beta_a^\mu} L_{ij}(\gamma_b)\|$,

\[ \sum_{i=1}^{n} \alpha_b^i\, \partial_{\beta_a^\mu} L_{ij}(\gamma_b) = 0, \qquad b \neq a, \tag{34} \]


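and $\alpha_a$ is a left null vector for the regular part of the matrix $\|\partial_{\beta_a^\mu} L_{ij}(z)\|$ at the point $\gamma_a$:

\[ \sum_{i=1}^{n} \alpha_a^i\, \partial_{\beta_a^\mu} L_{ij}(z) \,\big|_{\text{regular part at } z = \gamma_a} = 0. \]

Third, the differential $\partial_{\alpha_a^\mu} L_{ij}(z)dz$ also has at most simple poles at the points P and $\gamma_a$, and the residue of $\partial_{\alpha_a^\mu} L_{ij}(z)dz$ at the point $\gamma_a$ equals

\[ \mathrm{Res}_{z = \gamma_a}\, \partial_{\alpha_a^\mu} L_{ij}(z)dz = \beta_a^i\,\delta_{j\mu} - \delta_{i1}\,\beta_a^\mu \alpha_a^j. \tag{35} \]

For $b \neq a$, $\alpha_b$ is a left null vector for the matrix $\|\partial_{\alpha_a^\mu} L_{ij}(\gamma_b)\|$,

\[ \sum_{i=1}^{n} \alpha_b^i\, \partial_{\alpha_a^\mu} L_{ij}(\gamma_b) = 0, \qquad b \neq a, \]

and the regular part of the matrix $\|\partial_{\alpha_a^\mu} L_{ij}(z)\|$ at the point $\gamma_a$ satisfies the following linear inhomogeneous equation (for the definition of the matrix $\|L^{a,0}_{ij}\|$ see Eq. (9)):

\[ \sum_{i=1}^{n} \alpha_a^i\, \partial_{\alpha_a^\mu} L_{ij}(z) \,\big|_{\text{regular part at } z = \gamma_a} = \kappa_a\,\delta_{\mu j} - L^{a,0}_{\mu j}. \]

Finally, the differential $\partial_{\gamma_a} L_{ij}(z)dz$ is holomorphic everywhere on $\Sigma$ except the point $\gamma_a$, where it has a pole of the second order, and on a neighborhood of the point it behaves like

\[ \partial_{\gamma_a} L_{ij}(z)dz = \frac{\beta_a^i \alpha_a^j\, dz}{(z - \gamma_a)^2} + \text{regular terms}. \tag{36} \]

For $b \neq a$, $\alpha_b$ is a left null vector for the matrix $\|\partial_{\gamma_a} L_{ij}(\gamma_b)\|$,

\[ \sum_{i=1}^{n} \alpha_b^i\, \partial_{\gamma_a} L_{ij}(\gamma_b) = 0, \qquad b \neq a, \]

and, in addition, the regular part of the matrix $\|\partial_{\gamma_a} L_{ij}(z)\|$ at the point $\gamma_a$ satisfies the following linear inhomogeneous equation (for the definition of the matrix $\|L^{a,1}_{ij}\|$ see Eq. (9)):

\[ \sum_{i=1}^{n} \alpha_a^i\, \partial_{\gamma_a} L_{ij}(z) \,\big|_{\text{regular part at } z = \gamma_a} = -\sum_{i=1}^{n} \alpha_a^i L^{a,1}_{ij}. \]

All the properties of the derivatives $\partial_{\kappa_a} L_{ij}(z)dz$, $\partial_{\gamma_a} L_{ij}(z)dz$, $\partial_{\alpha_a^\mu} L_{ij}(z)dz$ and $\partial_{\beta_a^\mu} L_{ij}(z)dz$ can be easily derived from the definition of the Krichever Lax differential (9), and the uniqueness of the derivatives as meromorphic differentials on $\Sigma$ follows directly from the Krichever lemma.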
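To illustrate how such properties follow from the definition, the sketch below derives the residues and two of the regular-part equations by differentiating (9) and (10) in the chart (24). It assumes, as in the text, that $\gamma_a$, $\kappa_a$, $\alpha_a^\mu$, $\beta_a^\mu$ ($\mu \ge 2$) are independent coordinates.

```latex
% Sketch: properties of the derivatives obtained from (9), (10) in the chart (24),
% where \alpha_a^1 = 1 and \beta_a^1 = -\sum_{\mu\ge2}\beta_a^\mu\alpha_a^\mu.
\begin{align*}
% residues: differentiate the residue \beta_a^i\alpha_a^j of (9)
\partial_{\beta_a^\mu}\bigl(\beta_a^i\alpha_a^j\bigr)
   &= \delta_{i\mu}\,\alpha_a^j - \delta_{i1}\,\alpha_a^\mu\alpha_a^j, &
\partial_{\alpha_a^\mu}\bigl(\beta_a^i\alpha_a^j\bigr)
   &= \beta_a^i\,\delta_{j\mu} - \delta_{i1}\,\beta_a^\mu\alpha_a^j, \\[4pt]
% regular part: differentiate (10) with respect to \alpha_a^\mu
L^{a,0}_{\mu j} + \sum_{i}\alpha_a^i\,\partial_{\alpha_a^\mu}L^{a,0}_{ij}
   &= \kappa_a\,\delta_{\mu j}
   &\Longrightarrow\quad
\sum_{i}\alpha_a^i\,\partial_{\alpha_a^\mu}L^{a,0}_{ij}
   &= \kappa_a\,\delta_{\mu j} - L^{a,0}_{\mu j}, \\[4pt]
% regular part: differentiate the expansion (9) with respect to \gamma_a
\partial_{\gamma_a}L_{ij}(z)\big|_{\text{reg. at }\gamma_a}
   &= \partial_{\gamma_a}L^{a,0}_{ij} - L^{a,1}_{ij},
   &\sum_{i}\alpha_a^i\,\partial_{\gamma_a}L^{a,0}_{ij} &= 0 .
\end{align*}
% Consistency of (33) with (10) and (27): differentiating (10) in \kappa_a gives
%   \sum_i \alpha_b^i\,\partial_{\kappa_a}L^{b,0}_{ij} = \delta_{ab}\,\alpha_a^j,
% while (33) and (27) give
%   \sum_i \alpha_b^i\,\alpha_a^j\,u_{ai}(\gamma_b) = \delta_{ab}\,\alpha_a^j .
```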


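3.3. The sketch of the proof. Let us rewrite the Yang-Baxter relation (13) in the following form:

\[ D_{ij\,kl}(z, w)\,dz \otimes dw = R_{ij\,kl}(z, w)\,dz \otimes dw, \tag{37} \]

where

\[
\begin{aligned}
D_{ij\,kl}(z, w) = \{L_{ij}(z), L_{kl}(w)\} = {} & \sum_{a=1}^{ng} \bigl( \partial_{\gamma_a} L_{ij}(z)\, \partial_{\kappa_a} L_{kl}(w) - \partial_{\kappa_a} L_{ij}(z)\, \partial_{\gamma_a} L_{kl}(w) \bigr) \\
& + \sum_{a=1}^{ng} \sum_{\mu=2}^{n} \bigl( \partial_{\alpha_a^\mu} L_{ij}(z)\, \partial_{\beta_a^\mu} L_{kl}(w) - \partial_{\beta_a^\mu} L_{ij}(z)\, \partial_{\alpha_a^\mu} L_{kl}(w) \bigr),
\end{aligned}
\]

and

\[ R_{ij\,kl}(z, w) = \sum_{m=1}^{n} \delta_{il}\, r_{mk}(z, w) L_{mj}(z) - L_{il}(z)\, r_{jk}(z, w) - \sum_{m=1}^{n} \delta_{kj}\, r_{mi}(w, z) L_{ml}(w) + L_{kj}(w)\, r_{li}(w, z). \]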
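The expression for $R_{ij\,kl}$ is the general component formula for the right-hand side of the Yang-Baxter relation, specialized to the r-matrix differential (14). The sketch below makes the substitution explicit; the four-index components $r_{ab\,cd}$ refer to the notation (7).

```latex
% Sketch: from (14), r = \sum_{i,j,k} r_{jk}\, e_{ij}\otimes e_{ki}, the components
% in the notation (7) are
%   r_{ab\,cd}(z,w) = \delta_{ad}\, r_{bc}(z,w).
\begin{align*}
\sum_{m}\bigl(r_{im\,kl}(z,w)L_{mj}(z) - L_{im}(z)\,r_{mj\,kl}(z,w)\bigr)
 &= \sum_{m}\delta_{il}\,r_{mk}(z,w)\,L_{mj}(z) - L_{il}(z)\,r_{jk}(z,w), \\
\sum_{m}\bigl(r_{km\,ij}(w,z)L_{ml}(w) - L_{km}(w)\,r_{ml\,ij}(w,z)\bigr)
 &= \sum_{m}\delta_{kj}\,r_{mi}(w,z)\,L_{ml}(w) - L_{kj}(w)\,r_{li}(w,z).
\end{align*}
% Subtracting the second line from the first gives R_{ij\,kl}(z,w).
```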

Using the properties of the differentials $\partial_{\kappa_a} L_{ij}(z)dz$, $\partial_{\gamma_a} L_{ij}(z)dz$, $\partial_{\alpha_a^\mu} L_{ij}(z)dz$ and $\partial_{\beta_a^\mu} L_{ij}(z)dz$ we derive a relatively long list of properties for the form $D_{ij\,kl}(z, w)\,dz \otimes dw$:

1. The poles of the form $D_{ij\,kl}(z, w)\,dz \otimes dw$ are located at the points $\gamma_a$ and P so that the pole at the point P is simple and the poles at the points $\gamma_a$ are of the second order.

2. If w coincides neither with the point P nor with any of the points $\gamma_b$, the singular part of the component $D_{ij\,kl}(z, w)$ at the point $z = \gamma_a$ looks like

\[ D_{ij\,kl}(z, w) = \frac{D^{a,2}_{ij\,kl}(w)}{(z - \gamma_a)^2} + \frac{D^{a,1}_{ij\,kl}(w)}{z - \gamma_a} + \text{regular terms}, \tag{38} \]

where $D^{a,2}_{ij\,kl}(w)$ is a component of the holomorphic differential $\alpha_a^l \beta_a^i \alpha_a^j\, u_{ak}(w)dw$ and $D^{a,1}_{ij\,kl}(w)dw$ is a differential of the third kind defined by the following properties:

• $D^{a,1}_{ij\,kl}(w)dw$ has poles only at the points $\gamma_a$ and P with the residue at the point $\gamma_a$ being

\[ \mathrm{Res}_{w = \gamma_a}\, D^{a,1}_{ij\,kl}(w)dw = \delta_{kj}\,\beta_a^i \alpha_a^l - \delta_{il}\,\beta_a^k \alpha_a^j. \tag{39} \]

• The values of the components $D^{a,1}_{ij\,kl}(w)$ at the points $\gamma_b$, $b \neq a$, satisfy the following "null vector" conditions

\[ \sum_{k=1}^{n} \alpha_b^k\, D^{a,1}_{ij\,kl}(\gamma_b) = 0, \qquad b \neq a. \tag{40} \]


• The regular parts of $D^{a,1}_{ij\,kl}(w)$ at the point $\gamma_a$ obey the following linear inhomogeneous equations:

$$\sum_{k=1}^{n} \alpha_a^k D^{a,1}_{ij\,kl}(w)\Big|_{\text{regular part at } w=\gamma_a} = -\big(k_a \delta_{il} - L^{a,0}_{il}\big)\alpha_a^j. \qquad (41)$$

3. The regular parts of the components $D_{ij\,kl}(z,w)$ at the points $\gamma_a$ satisfy the linear inhomogeneous equations

$$\sum_{i=1}^{n} \alpha_a^i D_{ij\,kl}(z,w)\Big|_{\text{regular part at } z=\gamma_a} = D^{a}_{j\,kl}(w), \qquad (42)$$

where $D^{a}_{j\,kl}(w)$ are components of a meromorphic tensor-valued differential defined by the following properties:

• $D^{a}_{j\,kl}(w)\,dw$ is holomorphic everywhere on the curve except at the points $P$ and $\gamma_a$, where it has poles of the first and second order respectively.

• On a neighborhood of the point $w=\gamma_a$ it behaves like

$$D^{a}_{j\,kl}(w)\,dw = -\frac{\beta_a^k \alpha_a^l \alpha_a^j\, dw}{(w-\gamma_a)^2} + \frac{\alpha_a^l\big(k_a\delta_{kj} - L^{a,0}_{kj}\big)\, dw}{w-\gamma_a} + \text{regular terms}. \qquad (43)$$

• The values of the components $D^{a}_{j\,kl}(w)$ at the points $\gamma_b$, $b \neq a$, satisfy the following “null vector” conditions

$$\sum_{k=1}^{n} \alpha_b^k D^{a}_{j\,kl}(\gamma_b) = 0, \qquad b \neq a. \qquad (44)$$

• The regular parts of the components $D^{a}_{j\,kl}(w)$ at the point $\gamma_a$ obey the following linear inhomogeneous equations:

$$\sum_{k=1}^{n} \alpha_a^k D^{a}_{j\,kl}(w)\Big|_{\text{regular part at } w=\gamma_a} = \sum_{k=1}^{n}\Big(\alpha_a^j \alpha_a^k L^{a,1}_{kl} - \alpha_a^l \alpha_a^k L^{a,1}_{kj}\Big). \qquad (45)$$

An analogous detailed analysis of the components Rij kl (z, w) shows that Rij kl (z, w) dz ⊗ dw satisfies all the above properties of the form Dij kl (z, w)dz ⊗ dw. Due to the Krichever lemma these properties define a unique form Dij kl (z, w)dz ⊗ dw and, thus, the desired statement is proved.

4. Concluding Remarks In conclusion, we point out that the classical r-matrix (14) of the extended Krichever Lax matrix (9) depends only on the variables γa and αa , that is, on coordinates of the respective configuration space. Since the differential (9) is linear in the variables ka and βai , the genuine r-matrix (20) of the Hitchin system also depends only on the variables γa and αa . This forces us to assume that the classical r-matrices satisfy simple analogues of the classical dynamical Yang-Baxter equation [11], which should express the consistency of the respective Yang-Baxter relations for the Krichever Lax matrices (9) and (16).


Note also that a formal expression for the classical r-matrix of the extended system can be obtained by the method developed in the paper [5]. Following that method we have to present the system on the manifold (6) with the Krichever Lax matrix (9) via an infinite-dimensional Hamiltonian reduction on $ng$ copies of the cotangent bundle to the loop group $GL_n(\mathbb{C})[z, z^{-1}]$. Although the method allows one to express the desired r-matrix in terms of infinite series in the Krichever-Novikov type basis [23, 29, 31], it turns out to be very hard to analyze such answers and to identify the resulting r-matrix with any meromorphic object associated with the product of the curve with itself.

Finally, we mention that it would be interesting to compare the Krichever parameterization of the Lax and r-matrix structures of Hitchin systems, based on the Tyurin description, to the analogous approach [9] based on the Schottky uniformisation of Riemann curves. It would also be intriguing to explain the role of the obtained r-matrices in the context of WZNW models on Riemann surfaces [6, 17].

Appendix. The Proof of the Krichever Lemma

Lemma 5 (Krichever). Let $\nu_i(z)dz$ be a meromorphic vector-valued differential on the curve. Then, for a generic set of Tyurin parameters ($\gamma_a$ on the curve and $\alpha_a \in \mathbb{C}^n$) and for an arbitrary set of complex numbers $b_a$, there exists a unique meromorphic vector-valued differential $v_i(z)dz$ having the same singular parts as the differential $\nu_i(z)dz$ and obeying the following conditions (see footnote 7):

• If $v_i(z)dz$ is holomorphic at the point $\gamma_a$, then

$$\sum_{i=1}^{n} v_i(\gamma_a)\,\alpha_a^i = b_a, \qquad (46)$$

• and otherwise,

$$\sum_{i=1}^{n} v_i(z)\,\alpha_a^i\Big|_{\text{regular part at } z=\gamma_a} = b_a. \qquad (47)$$

Proof. The statement of the lemma is equivalent to the fact that for a generic set of Tyurin parameters $(\gamma_a, \alpha_a)$ and for an arbitrary set of complex numbers $c_a$ there exists a unique holomorphic vector-valued differential $h_i(z)dz$ satisfying the equations

$$\sum_{i=1}^{n} h_i(\gamma_a)\,\alpha_a^i = c_a, \qquad (48)$$

which are, in turn, equivalent to the following linear inhomogeneous equations:

$$\sum_{i=1}^{n}\sum_{A=1}^{g} h_i^A\, \mu_A(\gamma_a)\,\alpha_a^i = c_a \qquad (49)$$

for the expansion coefficients $h_i^A$ of the differential $h_i(z)dz$ in some basis $\{\mu_A(z)dz,\ A = 1, \dots, g\}$ of holomorphic differentials on the curve.

Footnote 7. Note that we choose some local coordinates on neighborhoods of the points $\gamma_a$, and the right-hand sides of Eqs. (46) and (47) depend on this choice.


Since the number of coefficients $h_i^A$ coincides with the number of equations in (49), the desired statement is equivalent to the fact that the following $ng \times ng$ matrix

$$M_a^{(Ai)} = \mu_A(\gamma_a)\,\alpha_a^i \qquad (50)$$

is non-degenerate. The proof of this fact turns out to be a simple task of linear algebra.
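The reduction to the linear system (49) can be illustrated with a small sketch. It is only a toy: the values standing in for $\mu_A(\gamma_a)$ and $\alpha_a^i$ are made-up rational numbers rather than data computed on an actual curve, and the function names `lemma_matrix` and `solve` are hypothetical. The example uses $g = 1$, $n = 2$ (so $ng = 2$ points), with the single holomorphic differential taken to have value 1 at both points.

```python
from fractions import Fraction


def lemma_matrix(mu_at_gamma, alpha):
    """Build M[a][(A, i)] = mu_A(gamma_a) * alpha_a^i, columns flattened as A*n + i.

    mu_at_gamma[a][A] and alpha[a][i] are illustrative inputs (assumptions),
    not values taken from a real curve.
    """
    return [[m * al for m in mu_row for al in alpha_row]
            for mu_row, alpha_row in zip(mu_at_gamma, alpha)]


def solve(matrix, rhs):
    """Gauss-Jordan elimination over exact rationals; raises if M is degenerate."""
    size = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is degenerate for these parameters")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Toy data: g = 1, n = 2, two points gamma_1, gamma_2.
mu_at_gamma = [[1], [1]]      # stand-in for mu_1(gamma_a)
alpha = [[1, 0], [1, 1]]      # linearly independent alpha_1, alpha_2
c = [3, 5]                    # arbitrary right-hand sides c_a

M = lemma_matrix(mu_at_gamma, alpha)
h = solve(M, c)               # coefficients h_i^A, flattened as A*n + i
```

In this toy case the matrix is non-degenerate exactly when the vectors $\alpha_1$ and $\alpha_2$ are linearly independent, which is the kind of genericity condition the lemma refers to.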

Acknowledgements. I would like to express my sincere thanks to I.M. Krichever and M.A. Olshanetsky for formulating the problem and for useful discussions of this topic. I acknowledge I.M. Krichever for constructive criticisms concerning the first version of this article and A.M. Levin for an important technical trick, which drastically simplifies the result of this paper. I also acknowledge H.W. Braden, A.S. Gorsky, S.V. Oblezin and A.V. Zotov for useful discussions. I am grateful to M. Ching for criticisms concerning the English language of this paper. The work is partially supported by RFBR grant 00-02-17-956, the Grant for Support of Scientific Schools 00-15-96557, and the grant INTAS 00-561.

References

1. Arutyunov, G.E., Medvedev, P.B.: Geometric construction of the classical R-matrices for the elliptic and trigonometric Calogero-Moser systems. hep-th/9511070
2. Avan, J., Babelon, O., Talon, M.: Construction of the classical R-matrices for the Toda and Calogero models. Algebra i Analiz 6(2), 67–89 (1994); PAR-LPTHE-93-31, hep-th/9306102
3. Babelon, O., Viallet, C.M.: Hamiltonian Structures and Lax Equation. Phys. Lett. B 237, 411–416 (1990)
4. Beilinson, A.A., Drinfeld, V.G.: Quantization of Hitchin’s fibration and Langlands program. Preprint, p. 3, 1993; Laumon, G.: Correspondance de Langlands geometrique pour les corps de fonctions. Duke Math. J. 54, 309–359 (1987)
5. Braden, H.W., Dolgushev, V.A., Olshanetsky, M.A., Zotov, A.V.: Classical R-matrices and Feigin-Odesskii algebra via Hamiltonian and Poisson reductions. hep-th/0301121
6. Bernard, D.: On the Wess-Zumino-Witten models on torus. Nucl. Phys. B303, 77–93 (1988); On the Wess-Zumino-Witten models on Riemann surfaces. Nucl. Phys. B309, 145–174 (1988)
7. Enriquez, B., Rubtsov, V.: Hitchin systems, higher Gaudin operators and R-matrices. Math. Res. Lett. 3(3), 343–357 (1996)
8. Enriquez, B., Rubtsov, V.: Hecke-Tyurin parametrization of the Hitchin and KZB systems. math.AG/9911087
9. Enriquez, B.: Dynamical r-matrices for Hitchin’s systems on Schottky curves. Lett. Math. Phys. 45(2), 95–104 (1998)
10. Etingof, P., Varchenko, A.: Solutions of the quantum dynamical Yang-Baxter equation and dynamical quantum groups. Commun. Math. Phys. 196(3), 591–640 (1998)
11. Etingof, P., Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang-Baxter equation. Commun. Math. Phys. 192(1), 77–120 (1998)
12. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian methods in the theory of solitons. Springer Series in Soviet Mathematics. Berlin: Springer-Verlag, 1987
13. Fehér, L., Gábor, A., Pusztai, B.G.: On dynamical r-matrices obtained from Dirac reduction and their generalizations to affine Lie algebras. J. Phys. A, Math. Gen. 34(36), 7335–7348 (2001)
14. Feigin, B., Frenkel, E., Reshetikhin, N.: Gaudin model, Bethe ansatz and critical level. Commun. Math. Phys. 166(1), 27–62 (1994)
15. Felder, G.: Elliptic quantum groups. In: Mathematical Physics. Proceedings. D. Iagolnitzer, ed. Paris, 1994, Cambridge, USA: IP, 1995, pp. 211–218
16. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proceedings of the International Congress of Mathematicians, Vol. 1, Basel: Birkhäuser, 1995, pp. 1247–1255
17. Felder, G.: The KZB equations on Riemann surfaces. Quantum symmetries. Proceedings. Les Houches, 1995, pp. 687–725
18. Gervais, J.-L., Neveu, A.: Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B238, 125–141 (1984)
19. Grinevich, P.: Rational solutions for the equations of commutation of differential operators. Funct. Anal. Appl. 16, 19–24 (1982)
20. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54(1), 91–114 (1987)
21. Krichever, I.M.: Vector bundles and Lax equations on algebraic curves. Commun. Math. Phys. 229(2), 229–269 (2002); hep-th/0108110
22. Krichever, I.M., Novikov, S.P.: Holomorphic bundles over Riemann surfaces and the KP equations. I. Funct. Anal. Appl. 12, 41–52 (1978)
23. Krichever, I.M., Novikov, S.P.: Algebras of Virasoro type, Riemann surfaces and structures of the theory of solitons. Funct. Anal. Appl. 21, 126–142 (1987); Virasoro-type algebras, Riemann surfaces and strings in Minkowski space. Funct. Anal. Appl. 21, 294–307 (1987); Virasoro-Gelfand-Fuks type algebras, Riemann surfaces, operator’s theory of closed strings. J. Geom. Phys. 5, 4, 631–661 (1988); Algebras of Virasoro type, energy-momentum tensor, and decomposition operators on Riemann surfaces. Funct. Anal. Appl. 23, 19–33 (1989)
24. Levin, A.M., Olshanetsky, M.A., Zotov, A.V.: Hitchin systems – symplectic Hecke correspondence and two-dimensional version. ITEP-TH-56-01, nlin.si/0110045
25. Markman, E.: Spectral curves and integrable systems. Compositio Math. 93(3), 255–290 (1994)
26. Nekrasov, N.: Commun. Math. Phys. 180, 587–604 (1996)
27. Olshanetsky, M.A.: Lett. Math. Phys. 42, 59–71 (1997)
28. Previato, E., Wilson, G.: Vector bundles over curves and solutions of the KP equations. In: Theta Functions. Proc. Symp. Pure Math. AMS Bowdoin, 49, 1, 1987, pp. 553–570
29. Schlichenmaier, M., Sheinman, O.K.: The Wess-Zumino-Witten-Novikov theory, Knizhnik-Zamolodchikov equations, and Krichever-Novikov algebras, I. Mannheimer Manuskripte 236, math.QA/9812083
30. Semenov-Tian-Shansky, M.A.: What is a classical r-matrix? Funct. Anal. Appl. 17, 259–272 (Russian) (1983); 17–33 (English translation)
31. Sheinman, O.K.: Elliptic affine Lie algebras. Funct. Anal. Appl. 24(3), 210–219 (1990); Highest weight modules over certain quasigraded Lie algebras on elliptic curves. Funct. Anal. Appl. 26(3), 65–71 (1992); Affine Lie algebras on Riemann surfaces. Funct. Anal. Appl. 27(4), 54–62 (1993); Highest weight modules for affine Lie algebras on Riemann surfaces. Funct. Anal. Appl. 29(1), 56–71 (1995)
32. Sklyanin, E.K.: On the complete integrability of the Landau-Lifchitz equation. Preprint LOMI E-379. Leningrad, 1979
33. Talalaev, D.: The elliptic Gaudin system with spin. Theor. Math. Phys. 130, 361–374 (2002)
34. Tyurin, A.: Classification of vector bundles over an algebraic curve of arbitrary genus. Am. Math. Soc., Translat., II. Ser. 63, 245–279 (1967)

Communicated by L. Takhtajan

Commun. Math. Phys. 238, 149–186 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0871-z

Communications in

Mathematical Physics

Global Existence of Plasma Ion-Sheaths and Their Dynamics

Seung-Yeal Ha, Marshall Slemrod

Department of Mathematics, University of Wisconsin-Madison, USA. E-mail: [email protected]; [email protected]

Received: 9 December 2002 / Accepted: 16 January 2003
Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: When a negative high voltage pulse is applied to a target material immersed in a plasma, a boundary layer (sheath) forms around the target and the motion of cold ions in the plasma is governed by the Euler-Poisson system. In this paper, by simplifying the Euler-Poisson system on suitable physical regimes, we present a theory for the existence and dynamics of time-dependent sheaths with planar, cylindrical and spherical symmetry. For the construction of ion-sheaths, we employ the method of characteristics and study the dynamic behavior of a plasma-sheath edge based on ODEs which are formally derived from the Euler-Poisson system.

1. Introduction

The purpose of this paper is to describe the motion of plasma sheaths which originate with loss of quasi-neutrality in a plasma consisting of ions and electrons. The issues can be easily understood by the examination of the Euler-Poisson (E-P) system. Consider a plasma consisting of ions and electrons confined to a domain $\Omega \subset \mathbb{R}^3$. Both ions and electrons have constant temperature, the temperature of the ions being absolute zero Kelvin. The density of ions is denoted by $n$, the density of electrons $n_e$ is taken to be $e^{-\phi}$, i.e., the Boltzmann relation [23], $-\phi$ is the potential field and $u$ is the velocity of the ions. In this case, (E-P) reads

$$\begin{cases} \partial_t n + \operatorname{div}(nu) = 0, & (x,t) \in \Omega \times (0,\infty),\\ \partial_t u + (u\cdot\nabla)u = \nabla\phi, \\ \varepsilon^2 \Delta\phi = n - n_e, \quad n_e = e^{-\phi}, \end{cases} \qquad (1.1)$$

subject to initial and boundary conditions

$$(n,u,\phi)(x,0) = (n_0,u_0,\phi_0)(x), \quad x \in \Omega, \qquad u = u_w, \quad \phi = \phi_w \quad \text{on } \partial\Omega.$$


Here $\varepsilon$ is proportional to the Debye length $\lambda_D$ [23], and $u_w$ is the velocity of ions at the boundary $\partial\Omega$. Typically, away from the boundary of $\Omega$, the formal $\varepsilon \to 0$ limit in (E-P) can be used to yield the quasi-neutral relation $n = e^{-\phi}$. However, near the boundary $\partial\Omega$, quasi-neutrality breaks down (see Sect. 2) and a boundary layer of order $\varepsilon$ forms, the plasma ion-sheath, which has essentially zero electron density.

The goal of this paper is to describe the dynamics of the plasma in both the quasi-neutral and sheath regions based on a step-sheath model which has distinct quasi-neutral and sheath regions separated by a propagating sheath edge surface. The usefulness of such models can be seen in studying material processing [23] and in particular the plasma source ion implantation (PSII) technique invented by Conrad and his collaborators [7]. In this process a negative high voltage pulse $-\phi_w$ is applied to the target material immersed in a plasma. An ion sheath develops near the target, and ions in the sheath region are accelerated by the potential difference and implanted onto the surface of the target, causing a change of surface properties. Moreover, the plasma-sheath edge is also accelerated into the bulk quasi-neutral plasma region, as the ions are embedded in the sheath region. As described in [5, 6, 21, 34], laboratory experiments indicate that sheath dynamics has three phases:

• Transient sheath: Initial stage of a sheath which consists of ions and essentially no electrons (matrix sheath).
• Dynamic sheath: Evolution to the steady sheath (Child-Langmuir sheath).
• Returning sheath: Return to the physical boundary of a target.

We briefly discuss these three phases of sheath motion in turn. When the negative high voltage is applied to a target material, on a time scale $\omega_{pe}^{-1}$ ($\omega_{pe}$: plasma-electron frequency), electrons are repelled instantaneously, which results in the electron-free region (matrix sheath). Neglecting electrons inside the sheath, a sheath edge is a sharply defined boundary with quasi-neutral plasma on one side and only ions on the other side, i.e., the electron density distribution is assumed to be a step function at the sheath edge. (For a finite electron temperature, there is some penetration by electrons into the sheath, but this effect is important only in the narrow region of the sheath edge.) On a time scale $\omega_{pi}^{-1}$ ($\omega_{pi}$: plasma-ion frequency), a high current peak is formed by the ions extracted from the matrix sheath and the sheath edge expands until it reaches the steady sheath (Child-Langmuir sheath). In this case, the location of the steady sheath is determined by the Child-Langmuir law [1–4, 10–13]:

$$|h_c| = \frac{4}{9}\,\varepsilon_0 \sqrt{\frac{2e}{m}}\;\frac{\phi_w^{3/2}}{s_\infty^2},$$

where $h_c$, $\varepsilon_0$ are the Child law current and the vacuum permittivity, $e$ and $m$ are the electron charge and mass, $-\phi_w$ is the applied potential and $s_\infty$ is the location of a steady sheath. Finally, the sheath edge returns to the boundary of a target due to the decrease of the applied potential. In this third phase, as shown experimentally in [34], depending on the geometry of a target and ion-neutral collisions, the ion-acoustic wave can separate from the sheath edge and propagate into the quasi-neutral plasma regime. In this paper, we exclude ion-neutral collisions, and the separation of an ion-acoustic wave from a sheath edge occurs in the cylindrical and spherical sheaths.

Recently, in [28] K.-U. Riemann and Th. Daube studied an analytic sheath model based on the Euler-Poisson system (1.1) and obtained an explicit solution in the homogeneous sheath


region during ion-extraction phase. In this paper, motivated by the work of [28], we generalize their results in several directions: more general data, time-asymptotic behavior of a sheath edge, and cylindrically and spherically symmetric targets. More precisely, for a planar target, we study the second phase of the sheath dynamics, i.e., the convergence to the steady sheath, and in the case of cylindrical and spherical sheaths, we study the third phase of a sheath. In this paper, we formulate the sheath problem as a free boundary problem and extract the exact dynamics for the sheath edge. Based on this new formulation, we present a theory for the existence and dynamics of prototype planar, cylindrical and spherical ion-sheaths. For the cylindrically and spherically symmetric cases, we assume the target radius is of the order of $\varepsilon$, i.e., proportional to the Debye length. Further details on the role of plasma sheaths in material processing may be found in the book by Lieberman and Lichtenberg [23]. Discussion of the mathematics of (E-P) for the pure initial value problem may be found in [8, 15, 16, 26].

The rest of this paper is organized as follows. In Sect. 2, we present a formulation of a plasma-sheath problem in the case of planar, cylindrical and spherical targets based on suitable physical assumptions. We decompose the domain into two sub-domains (a quasi-neutral region and a sheath region) and their common boundary (a plasma-sheath edge). On each sub-domain, we simplify the Euler-Poisson system according to suitable physical relations (zero electron density limit and quasi-neutral limit). In Sect. 3, we consider a planar target, construct smooth solutions to a sheath system using the method of characteristics, prove the convergence of a sheath to the Child-Langmuir sheath and determine the time-asymptotic location of the sheath edge. In Sect. 4, we study an “outer” quasi-neutral problem outside the sheath region. For cylindrical and spherical sheaths, we consider two disjoint sub-regimes of the quasi-neutral regime which are governed by dynamic solutions and steady solutions separately. Finally, in Sect. 5, we consider an “inner” sheath problem for cylindrical and spherical targets. We study the third phase of the sheath evolution and, using the ODE for the sheath edge dynamics, we heuristically derive the dynamics of the sheath edge for small and large time. In an Appendix, we formally derive a current equation from the Euler-Poisson system, and a second order ODE for the dynamics of cylindrical and spherical sheath edges.

2. Formulation of a Sheath Problem In this section, we present a formulation of a sheath problem for multi-dimensional targets with planar, cylindrical and spherical symmetry. First we discuss the simplification of the Euler-Poisson system on suitable physical regimes, and then we consider the main issues and assumptions of this paper.

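2.1. Simplification of the Euler-Poisson system. In planar, cylindrical and spherical symmetric cases, the Euler-Poisson system (1.1) becomes

$$\begin{cases} \partial_t \rho + \partial_r(\rho u) = 0, & r_0 \le r < \infty, \quad t > 0,\\ \partial_t u + \partial_r\!\left(\dfrac{u^2}{2}\right) = \partial_r \phi, \\ \varepsilon^2\, \partial_r\!\left(r^\nu \partial_r \phi\right) = \rho - \rho_e, \end{cases} \qquad (2.1)$$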


subject to initial and boundary conditions

$$(\rho,u,\phi)(r,0) = (\rho_0,u_0,\phi_0)(r), \quad r_0 \le r < \infty, \qquad (u,\phi)(r_0,t) = (u_w,\phi_w)(t), \quad t \ge 0,$$

where

r = the radial distance from the center of a target,
$r_0$ = the radius of a target,
u = the outward normal component of the velocity $\mathbf{u}$ to a symmetric surface (a plane, a cylinder and a sphere),

$$\rho = r^\nu n, \qquad \rho_e = r^\nu e^{-\phi}, \qquad \nu \in \{0,1,2\},$$

and $\nu = 0$, 1 and 2 correspond to planar, cylindrical and spherical targets respectively.

First we give a rather elementary description of the plasma sheath. Since the Debye length is a small parameter in (2.1), the Poisson equation suggests that the quasi-neutral relation ($n = e^{-\phi}$) should pervade in our problem. Substitution of this relation into (2.1) yields the quasi-neutral system (Q):

$$\begin{cases} \partial_t \rho + \partial_r(\rho u) = 0, & r_0 \le r < \infty, \quad t > 0,\\ \partial_t u + \partial_r\!\left(\dfrac{u^2}{2} + \ln\rho\right) = \dfrac{\nu}{r}, \end{cases} \qquad (2.2)$$

with prescribed initial and boundary data for $\rho$ and $u$ at $t = 0$ and $r = r_0$ respectively. The hyperbolic system (2.2) possesses two characteristic curves:

$$\frac{d\chi_1}{dt} = u - 1, \qquad \frac{d\chi_2}{dt} = u + 1,$$

which carry the prescribed data into the domain $(r_0,\infty)\times\mathbb{R}_+$. Notice that when $u$ decreases below the critical value $u = -1$, both characteristics $\chi_1$ and $\chi_2$ will run into the boundary $r = r_0$, thus making the initial-boundary value problem for (2.2) unsolvable in the class $C^1((r_0,\infty)\times(0,T)) \cap C^0([r_0,\infty)\times[0,T))$, for some positive constant $T$. Hence near the “Bohm velocity” $u = -1$, the quasi-neutrality condition breaks down and a sheath boundary layer forms. Since the Poisson equation reads

$$\varepsilon^2\, \partial_r\!\left(r^\nu \partial_r\phi\right) = r^\nu\left(n - e^{-\phi}\right),$$

the quasi-neutrality relation is violated when the left-hand side begins to become non-negligible. For steady problems [7, 17, 27, 30–32], this has been accounted for by setting the sheath edge where $\partial_r\phi \approx \varepsilon^{-\beta}$, $0 < \beta < 1$, so that the electric potential develops a large gradient near the sheath edge.

We consider two cases: either the cylindrically and spherically symmetric case ($\nu = 1, 2$), when $r_0 = \varepsilon\bar r_0$ ($\bar r_0$ is positive and independent of $\varepsilon$), or the planar case ($\nu = 0$), when $r_0 = 0$. Then we introduce fast variables $(\bar r, \bar t)$,

$$\bar r = \frac{r}{\varepsilon}, \qquad \bar t = \frac{t}{\varepsilon}, \qquad \bar r_0 \le \bar r < \infty,$$

to get a rescaled system:

$$\begin{cases} \partial_{\bar t}\rho + \partial_{\bar r}(\rho u) = 0, & \bar r_0 \le \bar r < \infty, \quad \bar t > 0,\\ \partial_{\bar t} u + \partial_{\bar r}\!\left(\dfrac{u^2}{2}\right) = \partial_{\bar r}\phi, \\ \partial_{\bar r}\!\left(\bar r^\nu \partial_{\bar r}\phi\right) = \rho - \rho_e, \end{cases} \qquad (2.3)$$


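subject to initial and boundary conditions

$$(\rho,u,\phi)(\bar r,0) = (\rho_0,u_0,\phi_0)(\bar r), \quad \bar r_0 \le \bar r < \infty, \qquad (u,\phi)(\bar r_0,t) = (u_w,\phi_w)(t), \quad t \ge 0.$$

In a sheath region, we formally set the electron density to be zero, $\rho_e = 0$, to get the rescaled sheath system (S):

$$\begin{cases} \partial_{\bar t}\rho + \partial_{\bar r}(\rho u) = 0, & \bar r_0 \le \bar r < \infty, \quad \bar t > 0,\\ \partial_{\bar t} u + \partial_{\bar r}\!\left(\dfrac{u^2}{2}\right) = \partial_{\bar r}\phi, \\ \partial_{\bar r}\!\left(\bar r^\nu \partial_{\bar r}\phi\right) = \rho, \end{cases} \qquad (2.4)$$

subject to initial and boundary conditions

$$(\rho,u,\phi)(\bar r,0) = (\rho_0,u_0,\phi_0)(\bar r), \quad \bar r_0 \le \bar r < \infty, \qquad (u,\phi)(\bar r_0,t) = (u_w,\phi_w)(t), \quad t \ge 0.$$

Next we return to the issue of the sheath edge. In the formal quasi-neutral limit ($\varepsilon \to 0+$), the sheath edge relation described above yields

$$\partial_{\bar r}\phi = \varepsilon\,\partial_r\phi \approx \varepsilon^{1-\beta} \to 0, \quad \text{as } \varepsilon \to 0+. \qquad (2.5)$$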

In the sequel, for convenience we delete the over bars in $(\bar r, \bar t)$. Combining (2.5) with the “Bohm relation” $u = -1$, we give a definition of the sheath edge for (2.3).

Definition 2.1. A sheath edge $S(t) = (s(t), t)$ separating a quasi-neutral region and an ion-sheath region is the level set of velocity and electric fields

$$\{(s(t),t) : u(s(t),t) = -1, \quad \partial_r\phi(s(t),t) = 0\},$$

and the sheath velocity $\dot s$ and sheath density $\rho_s$ satisfy the following system of ODEs:

$$\dot s = -\frac{h}{\rho_s} - 1, \qquad \dot\rho_s = -\frac{\nu h}{s},$$

where $h$ is a current in the sheath region and $\rho_s(t) =: \rho(s(t),t)$ is evaluated at the sheath edge as a limit from the sheath region where (2.4) is satisfied.

Remark 2.1. The above system of ODEs can be formally derived from the Euler-Poisson system (see Appendix A). We note that in case $\nu = 0$ we get a 1st order ODE:

$$\dot s = -\frac{h}{\rho_s} - 1, \qquad \rho_s = \text{const},$$

and in case $\nu = 1, 2$ the above ODE system can be written as a 2nd order ODE:

$$\ddot s = \frac{\dot h(\dot s + 1)}{h} - \frac{\nu(\dot s + 1)^2}{s}.$$
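The sheath edge system of Definition 2.1 can be explored numerically. The following is a minimal sketch, not the authors' method: it integrates $\dot s = -h/\rho_s - 1$, $\dot\rho_s = -\nu h/s$ with a classical fourth-order Runge-Kutta step, assuming that the current $h(t)$ is supplied by the caller as a known function. The function names and the constant currents in the usage lines are illustrative assumptions only.

```python
def sheath_edge_rhs(t, s, rho_s, h, nu):
    """Right-hand side of the sheath edge ODEs: (ds/dt, d rho_s/dt)."""
    current = h(t)
    return -current / rho_s - 1.0, -nu * current / s


def integrate_sheath_edge(s0, rho0, h, nu, t_end, steps=1000):
    """Integrate (s, rho_s) from t = 0 to t_end with a fixed-step RK4 scheme.

    h  : callable t -> current in the sheath region (assumed known a priori)
    nu : 0, 1 or 2 for planar, cylindrical or spherical symmetry
    """
    dt = t_end / steps
    t, s, rho = 0.0, s0, rho0
    for _ in range(steps):
        k1 = sheath_edge_rhs(t, s, rho, h, nu)
        k2 = sheath_edge_rhs(t + 0.5 * dt, s + 0.5 * dt * k1[0],
                             rho + 0.5 * dt * k1[1], h, nu)
        k3 = sheath_edge_rhs(t + 0.5 * dt, s + 0.5 * dt * k2[0],
                             rho + 0.5 * dt * k2[1], h, nu)
        k4 = sheath_edge_rhs(t + dt, s + dt * k3[0], rho + dt * k3[1], h, nu)
        s += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        rho += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        t += dt
    return s, rho


# Planar case (nu = 0): rho_s stays constant and ds/dt = -h/rho_s - 1.
s_planar, rho_planar = integrate_sheath_edge(1.0, 1.0, lambda t: -2.0, 0, 3.0)

# Cylindrical case (nu = 1) with an illustrative constant current h = -1.
s_cyl, rho_cyl = integrate_sheath_edge(5.0, 1.0, lambda t: -1.0, 1, 1.0)
```

With $h = -2$ and $\rho_s = 1$ in the planar case the edge moves outward at unit speed. In the cylindrical run the density at the edge grows, so $\dot s$ becomes negative and the edge moves back toward the target, which is the qualitative behavior of the third phase.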


Notice that the above definition is somewhat ad hoc: it is neither derived rigorously nor formally, e.g., via matched asymptotic expansions, from the original Euler-Poisson system. Such matching issues have been pursued for steady problems [17, 30–32], but the authors know of no asymptotic or rigorous results for derivation of the dynamic step sheath model. Nevertheless the model (2.4) is derived formally based on our understanding of the nature of the plasma and equally gives results consistent with the experiment described in [21].

Definition 2.2. The outermost characteristic curve $A(t) = (a(t), t)$ issued from $s_0$ associated with the quasi-neutral system (Q), i.e.,

$$\frac{da(t)}{dt} = u(a(t),t) + 1, \qquad a(0) = s_0,$$

is called an ion-acoustic wave.

Remark 2.2. In fact we will show in Sect. 4.1 that $\frac{da(t)}{dt} \ge 0$ and the ion-acoustic wave propagates away from the target.

Notice that in the case of cylindrical and spherical symmetries, steady data in the quasi-neutral region now evolves into a non-steady dynamic solution in the quasi-neutral region. This is a direct consequence of the fact that we are not in the special planar case and is representative of multi-dimensional dynamics. Furthermore, the splitting of the steady data into dynamic and steady quasi-neutral solutions separated by an ion-acoustic wave is exactly what is observed in [21].

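[Figure 1: diagram in the $(r, t)$-plane showing the target radius $r_0$, the initial sheath location $s_0$, the sheath edge $s$, the ion-acoustic wave, a shock wave, the sheath region $R^s$ and the quasi-neutral regions $R^q_1$ and $R^q_2$.]

Fig. 1. Schematic diagram of a physical domain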

2.2. Main issues and assumptions. Since the sheath edge $S(t)$ separates the physical domain $\Omega \times [0,\infty)$ into two parts as depicted in Fig. 1, we decompose the physical domain $\Omega \times [0,\infty)$ into three parts depending on time $t$:

$$\Omega \times [0,\infty) = R^s(t) \cup S(t) \cup R^q(t),$$

where $R^s(t)$ = a sheath region and $R^q(t)$ = a quasi-neutral region.


As a mathematical model for the sheath problem, we take the quasi-neutral system (2.2) and the sheath system (2.4) on $R^q$ and $R^s$ as governing equations for cold ions respectively, and the ODEs in Remark 2.1 as governing equations for the sheath edge. The main goals of this paper are (1) to understand the planar sheath dynamics in the second phase (convergence to the Child-Langmuir sheath) and (2) to investigate cylindrical and spherical sheaths in the third phase (return to the boundary).

Next we briefly discuss the main issues. We first consider an “outer” quasi-neutral flow in the quasi-neutral region $R^q(t)$. For a planar target, we simply take the flow given by $(n, u, \phi) = (1, -1, 0)$, which is clearly a steady solution of (2.2). In contrast, for cylindrical and spherical targets, this simple flow does not satisfy the quasi-neutral system (2.2) because of a geometric source term. Hence in the case of cylindrical and spherical targets, we take a composite flow consisting of a steady flow and dynamic solutions to (2.2). Moreover we introduce an ion-acoustic wave issued from the initial sheath location. Hence, unlike for a planar target, we decompose the quasi-neutral region $R^q(t)$ into two regions:

$$R^q(t) = R^q_1(t) \cup R^q_2(t),$$

$R^q_1(t)$ = the intermediate region between a sheath edge $S(t)$ and an ion-acoustic wave $A(t)$,
$R^q_2(t)$ = the exterior region outside an ion-acoustic wave $A(t)$.

We note that our construction as shown in Fig. 1 is qualitatively consistent with the experimental result of Kim et al. ([21], Fig. 3). We list the main assumptions for the mathematical model of the sheath problem:

• M1. The physical domain $\Omega \times (0,\infty)$ can be decomposed into several parts. For a planar sheath,
$$\Omega \times (0,\infty) = R^s(t) \cup S(t) \cup R^q(t),$$
and for cylindrical and spherical sheaths,
$$\Omega \times (0,\infty) = R^s(t) \cup S(t) \cup \big(R^q_1(t) \cup R^q_2(t)\big).$$
• M2. We use the sheath system (2.4) and the quasi-neutral system (2.2) in $R^s(t)$ and $R^q(t)$ respectively.
• M3. The sheath edge is non-characteristic in the sense that $\dot s(t) \neq -1$.
• M4. Continuity relation: $\rho$, $u$, $\phi$, $\partial_r\rho$, $\partial_r u$, $\partial_r\phi$ and $h$ are continuous across the sheath edge.

Remark 2.3. The approximation of the Euler-Poisson system (1.1) by the sheath system and the quasi-neutral system is formally adopted, although some partial results [9, 29] on the quasi-neutral limit for the Euler-Poisson system are available.

3. A Planar Ion-Sheath

In this section, we construct smooth solutions to the planar sheath system (2.4) with $\nu = 0$. As a specific example, we consider the time-evolution of a matrix sheath during an ion-extraction phase [28], and finally we study the dynamics of the sheath edge.


3.1. Global existence of smooth solutions. We consider the rescaled sheath system describing the motion of cold ions inside the sheath region [28]:

$$\begin{cases} \partial_t n + \partial_x(nu) = 0, & (x,t) \in (0,s(t)) \times \mathbb{R}_+,\\ \partial_t u + \partial_x\!\left(\dfrac{u^2}{2}\right) = \partial_x\phi, \\ \partial_x^2\phi = n, \end{cases} \qquad (3.1)$$

subject to initial and boundary data:

$$(n,u,\phi)(x,0) = (n_0(x),u_0(x),\phi_0(x)), \quad x \in [0,s_0], \qquad (u,\phi)(0,t) = (u_w(t),\phi_w(t)), \quad t \ge 0.$$

Here $s(t)$ is the sheath edge with an initial location $s_0$, and $n$, $u$ and $\phi$ denote the density, the velocity and the potential of ions inside a sheath. Now we impose some conditions on “well prepared” initial and boundary data:

• A1. (Regularity of initial and boundary data)
$$n_0 \in C^1(0,s_0), \qquad u_0, \phi_0 \in C^2(0,s_0) \quad \text{and} \quad u_w, \phi_w \in C^2(0,\infty).$$

• A2. (Compatibility and monotonicity of initial data)
$$\phi_0(s_0-) = 0, \quad n_0(s_0-) = 1, \quad u_0(s_0-) = -1,$$
$$n_0'(s_0-) = 0, \quad u_0'(s_0-) = 0, \quad \phi_0'(s_0-) = 0,$$
$$\phi_0'' = n_0, \quad \phi_0''(s_0-) = 1, \quad h(0)\,u_0''(s_0-) = 1, \quad u_0' \ge 0, \quad \phi_0 \ge 0,$$
where $h(0)$ is the initial current in the sheath region.

• A3. (Compatibility and decay condition of boundary data)
$$(u_w,\phi_w)(0) = (u_0,\phi_0)(0) \quad \text{and} \quad \dot u_w(t) \le 0, \quad \dot\phi_w(t) \ge 0,$$
$$(u_w, \dot u_w, \ddot u_w)(t) \to (u_w^\infty, 0, 0) \quad \text{and} \quad \phi_w(t) \to \phi_w^\infty \quad \text{as } t \to \infty,$$
where $u_w^\infty$ and $\phi_w^\infty$ are negative and positive constants respectively, satisfying
$$\frac{2^{5/4}}{3}\,(\phi_w^\infty)^{3/4} = \frac{1}{6}\,\big(2|u_w^\infty| - 2\big)^{3/2} + \big(2|u_w^\infty| - 2\big)^{1/2}.$$

• A4. $u_w(t)$ is given such that the total current $h(t)$ is always less than or equal to $-1$. This results in the non-decrease of the sheath edge $s(t)$.

Remark 3.1.
1. Outside the sheath region, the ions are assumed to be in the quasi-neutral state:
$$n(x,t) = 1, \quad u(x,t) = -1, \quad \phi(x,t) = 0, \qquad (x,t) \in (s(t),\infty) \times [0,\infty).$$
2. The behavior of the boundary data $u_w$ and $\phi_w$ is motivated by the description in the paper of Riemann and Daube [28].

The main results of this section are the following theorem and proposition.

Theorem 3.1. Suppose that the initial and boundary data satisfy the assumptions (A1)–(A4). Then there exist smooth solutions $n$, $u$ and $\phi$ to (3.1).


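[Figure 2: diagram in the $(x, t)$-plane showing the initial sheath location $s_0$, the sheath edge, the times $t_{E_1}$ and $t_{E_2}$, the sheath sub-regions $R^s_{11}$, $R^s_{12}$, $R^s_{21}$, $R^s_{22}$ and the quasi-neutral region $R^q$.]

Fig. 2. Sub-regions $R^s_{kl}$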

Proposition 3.1. Assume that the assumptions A1–A4 in Sect. 3.1 hold. Then the sheath edge $s(t)$ satisfies

$$\lim_{t\to\infty} s(t) = \frac{2^{5/4}}{3}\,(\phi_w^\infty)^{3/4}.$$
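As a plausibility check, the limiting location in Proposition 3.1 can be compared with the Child-Langmuir law quoted in the Introduction. The sketch below assumes nondimensional units in which the vacuum permittivity and the ratio $e/m$ are both equal to 1, and a steady current of magnitude 1 (the borderline value allowed by A4). These normalizations are assumptions made for illustration and are not stated in this form in the text; the function names are hypothetical.

```python
import math


def limiting_sheath_edge(phi_w_inf):
    """Limit of s(t) from Proposition 3.1: (2**(5/4) / 3) * phi_w_inf**(3/4)."""
    return 2.0 ** 1.25 / 3.0 * phi_w_inf ** 0.75


def child_law_edge(phi_w, current=1.0):
    """Steady sheath location s solving |h| = (4/9) * sqrt(2) * phi_w**1.5 / s**2.

    This is the Child-Langmuir law with eps_0 = 1 and e/m = 1 (assumed units).
    """
    return math.sqrt(4.0 * math.sqrt(2.0) / 9.0 * phi_w ** 1.5 / abs(current))


phi = 10.0
s_limit = limiting_sheath_edge(phi)
s_child = child_law_edge(phi)
```

Under these assumptions the two expressions agree identically, since $\sqrt{4\sqrt{2}/9} = 2^{5/4}/3$, so the limit in Proposition 3.1 is the Child-Langmuir sheath location at unit current.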

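For later use, we derive an equivalent system to (3.1) for $u$, $E = \partial_x\phi$, and $n$:

$$\begin{cases} \partial_t u + \partial_x\!\left(\dfrac{u^2}{2}\right) = E, & (x,t) \in (0,s(t)) \times \mathbb{R}_+,\\ \partial_t E + u\,\partial_x E = h, \\ \partial_x E = n, \quad h(t) \equiv [\partial_t E + (nu)](0,t). \end{cases} \qquad (3.2)$$

When the current $h(t)$ is known a priori, it is easy to calculate $u$ and $E$ by integrating (3.2) along characteristic curves. The system (3.1) can then be reduced to a system of ODEs along characteristic curves:

$$\frac{Dn}{Dt} + n\,\partial_x u = 0, \qquad \frac{D(\partial_x u)}{Dt} + (\partial_x u)^2 = n, \qquad \frac{D}{Dt}\!\left(\frac{\partial_x u}{n}\right) = \frac{\dfrac{D(\partial_x u)}{Dt}\,n - (\partial_x u)\dfrac{Dn}{Dt}}{n^2} = 1, \qquad (3.3)$$

where $\dfrac{D}{Dt} = \partial_t + u\,\partial_x$.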

Below, we construct smooth solutions to (3.1) using the method of characteristics. Before we give a detailed construction of a smooth sheath, we outline our main steps. (For the definition of $R^s_{kl}$ and $t_{E_i}$, see Fig. 2; the exact definition will be given later.)

• Step 1. We solve the system (3.1) over $R^s_{11}$ using the method of characteristics and determine the current $h(t) := \partial^2_{xt}\phi(0,t) + (nu)(0,t)$, $t \in [0, t_{E_1}]$.
• Step 2. Use the relation $\dot s(t) = -1 - h(t)$ to construct the trajectory of the sheath edge $s(t)$, $t \in [0, t_{E_1}]$.


• Step 3. We repeat Step 1 on the region $R^s_{12} \cup R^s_{21}$, and determine the current $h(t)$, $t \in [t_{E_1}, t_{E_2}]$, again.
• Step 4. We use $h(t)$ determined in Step 3 to construct the trajectory of $s(t)$, $t \in [t_{E_1}, t_{E_2}]$.
We repeat Step 3 and Step 4 successively, so that we can define smooth $C^1$ solutions on any sub-region $R^s_{kl}$, $k = 1, 2, \dots$, $l = 1, 2$, and we glue the local smooth solutions together.

Remark 3.2. Although the above construction generally gives only an implicit expression for smooth solutions, for special initial data ($n_0$ and $u_0$ constant) explicit smooth solutions can be constructed locally in time (see K.U. Riemann and Th. Daube's example in this section).

For simplicity of presentation, we introduce some notation. For a given $\alpha > 0$ at time $t_0$, the flow map $\chi(\alpha, \cdot) : [t_0, \infty) \to \mathbb{R}_+$ is defined to be the solution of the ODE
\[
\frac{d\chi(\alpha,t)}{dt} = u(\chi(\alpha,t), t), \qquad \chi(\alpha, t_0) = \alpha, \qquad t \ge t_0;
\]

and for a given point $(x,t) \in (0, s(t)) \times \mathbb{R}_+$, we define $\psi(x,t)$ and $\tau(x,t)$ as follows:
$\psi(x,t)$ = the intersection point between the line $t = 0$ and the backward characteristic curve issued from $(x,t)$,
$\tau(x,t)$ = the intersection time between the sheath edge $S(t)$ and the backward characteristic curve issued from $(x,t)$.
In other words,
\[
\chi(\psi(x,t), t) = x; \qquad
\chi(s(\tau(x,t)), \tau(x,t)) = s(\tau(x,t)), \qquad
\chi(s(\tau(x,t)), t) = x.
\]

Since $\chi$ is decreasing, such $\psi(x,t)$ and $\tau(x,t)$ exist uniquely, but in general they cannot be given explicitly in terms of $(x,t)$. However, in the case that the current $h(t)$ is known a priori, $\tau(x,t)$ can be calculated via the inverse function theorem (see Step 3 below). Let $t_{E_1}$ be the time of intersection between the characteristic curve $\chi(s_0, t)$ and the wall:
\[
\chi(s_0, 0) = s_0, \qquad \chi(s_0, t_{E_1}) = 0,
\]
and we also denote by $R^s_{11}$ the region bounded by $\chi(s_0, t)$, $\mathbb{R}_+ \times \{t = 0\}$ and $\{x = 0\} \times \mathbb{R}_+$.

Step 1. We construct smooth solutions on the region $R^s_{11}$ by the method of characteristics.

Lemma 3.1. On $R^s_{11}$, we have
\[
n(x,t) = \frac{2 n_0(\psi(x,t))}{2 + 2u_0'(\psi(x,t))\,t + n_0(\psi(x,t))\,t^2}, \qquad
\partial_x u(x,t) = \frac{2\,[u_0'(\psi(x,t)) + n_0(\psi(x,t))\,t]}{2 + 2u_0'(\psi(x,t))\,t + n_0(\psi(x,t))\,t^2}.
\]

Proof. Let $(x,t) \in R^s_{11}$. We integrate (3.3) along the characteristic curve $\chi(\psi(x,t), \cdot)$ from $(\psi(x,t), 0)$ to $(x,t)$ to obtain
\[
\frac{\partial_x u}{n}(x,t) = \frac{u_0'}{n_0}(\psi(x,t)) + t. \tag{3.4}
\]
Since $\partial_x u = -\dfrac{1}{n}\dfrac{Dn}{Dt}$ on a characteristic curve, we have
\[
\frac{D}{Dt}\!\left(\frac{1}{n}\right)(x,t) = \frac{u_0'}{n_0}(\psi(x,t)) + t. \tag{3.5}
\]
Again, we integrate (3.5) along the characteristic curve $\chi(\psi(x,t), \cdot)$ from $(\psi(x,t), 0)$ to $(x,t)$ to obtain
\[
\frac{1}{n(x,t)} = \frac{1}{n_0(\psi(x,t))} + \frac{u_0'(\psi(x,t))}{n_0(\psi(x,t))}\,t + \frac{t^2}{2}.
\]
After simplification, we obtain an implicit formula for the ion density $n$:
\[
n(x,t) = \frac{2 n_0(\psi(x,t))}{2 + 2u_0'(\psi(x,t))\,t + n_0(\psi(x,t))\,t^2}.
\]
Since $u_0' \ge 0$, $n(x,t)$ is well defined for all $(x,t) \in R^s_{11}$, and it follows from (3.4) that
\[
\partial_x u(x,t) = \frac{2\,[u_0'(\psi(x,t)) + n_0(\psi(x,t))\,t]}{2 + 2u_0'(\psi(x,t))\,t + n_0(\psi(x,t))\,t^2} \quad (\ge 0).
\]
This completes the proof.
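The closed forms of Lemma 3.1 can be checked numerically along a single characteristic. The following is a minimal sketch, not part of the original argument: it integrates the first two ODEs of (3.3) with a classical Runge-Kutta step and compares the result with the formulas of the lemma. The sample values `n0` and `du0` (standing for $n_0(\psi)$ and $u_0'(\psi)$) and the function names are assumptions chosen only for illustration.

```python
# Sketch: verify Lemma 3.1 on one characteristic by integrating (3.3).
# n0 and du0 are hypothetical sample values of n_0(psi) and u_0'(psi).

def rhs(n, w):
    # (3.3) with w = d_x u:  Dn/Dt = -n*w,  Dw/Dt = n - w**2
    return -n * w, n - w * w

def integrate(n0, du0, t_end, steps=2000):
    """Classical RK4 integration of (3.3) from t = 0 to t = t_end."""
    n, w = n0, du0
    dt = t_end / steps
    for _ in range(steps):
        k1n, k1w = rhs(n, w)
        k2n, k2w = rhs(n + 0.5 * dt * k1n, w + 0.5 * dt * k1w)
        k3n, k3w = rhs(n + 0.5 * dt * k2n, w + 0.5 * dt * k2w)
        k4n, k4w = rhs(n + dt * k3n, w + dt * k3w)
        n += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6
        w += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return n, w

def closed_form(n0, du0, t):
    """Density and velocity gradient as stated in Lemma 3.1."""
    denom = 2 + 2 * du0 * t + n0 * t * t
    return 2 * n0 / denom, 2 * (du0 + n0 * t) / denom

if __name__ == "__main__":
    print(integrate(1.0, 0.5, 3.0))
    print(closed_form(1.0, 0.5, 3.0))
```

The two printed pairs should agree up to the discretization error of the integrator.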

Based on the above lemma, the velocity $u(x,t)$ of ions is given by
\[
u(x,t) = u_w(t) + \int_0^x \partial_\xi u(\xi,t)\,d\xi. \tag{3.6}
\]
On the other hand, using $\partial_x \phi = \partial_t u + u\,\partial_x u$ and the boundary datum $\phi_w(t)$, we obtain
\[
\phi(x,t) = \phi_w(t) + \int_0^x \partial_\xi \phi(\xi,t)\,d\xi. \tag{3.7}
\]

In order to calculate $n$, $u$ and $\phi$, we need to know $\psi(x,t)$, but in some special cases, such as when $n_0$ and $u_0$ are constant, the above implicit formulas give explicit expressions (see K.U. Riemann and Th. Daube's example in this section). Next, for later use, we estimate the space and time variation of $\psi(x,t)$.

Lemma 3.2. Suppose that the current $h$ is known a priori in the time interval $[0, t_{E_1}]$. Then we have
\[
\chi(\alpha,t) = \alpha + u_0(\alpha)\,t + \phi_0'(\alpha)\frac{t^2}{2} + \int_0^t\!\!\int_0^{\sigma_1}\!\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1,
\]
\[
\partial_x \psi(x,t) = \frac{1}{1 + u_0'(\psi(x,t))\,t + \phi_0''(\psi(x,t))\,\frac{t^2}{2}},
\]
\[
\partial_t \psi(x,t) = -\,\frac{u_0(\psi(x,t)) + \phi_0'(\psi(x,t))\,t + \int_0^t\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2}{1 + u_0'(\psi(x,t))\,t + \phi_0''(\psi(x,t))\,\frac{t^2}{2}}.
\]


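Proof. By direct calculations, one has the ODE for $\chi(\alpha,t)$:
\[
\frac{d^3\chi(\alpha,t)}{dt^3} = h(t), \qquad \chi(\alpha,0) = \alpha, \qquad \frac{d\chi(\alpha,0)}{dt} = u_0(\alpha), \qquad \frac{d^2\chi(\alpha,0)}{dt^2} = \phi_0'(\alpha).
\]
We integrate the above ODE three times to get
\[
\chi(\alpha,t) = \alpha + u_0(\alpha)\,t + \phi_0'(\alpha)\frac{t^2}{2} + \int_0^t\!\!\int_0^{\sigma_1}\!\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1,
\]
and use the definition of $\psi(x,t)$ to obtain
\[
x = \psi(x,t) + u_0(\psi(x,t))\,t + \phi_0'(\psi(x,t))\frac{t^2}{2} + \int_0^t\!\!\int_0^{\sigma_1}\!\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1. \tag{3.8}
\]
Now we differentiate (3.8) with respect to $x$ to obtain
\[
\partial_x \psi(x,t) = \frac{1}{1 + u_0'(\psi(x,t))\,t + \phi_0''(\psi(x,t))\,\frac{t^2}{2}}.
\]
Again we differentiate (3.8) with respect to $t$ to get
\[
0 = \partial_t \psi + u_0'(\psi)\,t\,\partial_t \psi + u_0(\psi) + \phi_0''(\psi)\frac{t^2}{2}\,\partial_t \psi + \phi_0'(\psi)\,t + \int_0^t\!\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2,
\]
where $\psi = \psi(x,t)$. This yields
\[
\partial_t \psi(x,t) = -\,\frac{u_0(\psi(x,t)) + \phi_0'(\psi(x,t))\,t + \int_0^t\!\int_0^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2}{1 + u_0'(\psi(x,t))\,t + \phi_0''(\psi(x,t))\,\frac{t^2}{2}}.
\]
This completes the proof.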
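As an illustration of Lemma 3.2, consider the simplest situation in which the current is constant, $h(t) \equiv h_0$. This constant $h_0$ is an assumption made only for the purpose of this worked example; it is not claimed to arise from particular boundary data. The iterated integrals can then be evaluated in closed form:

```latex
% Worked example under the assumption h(t) = h_0 (constant).
\[
\int_0^t\!\!\int_0^{\sigma_1}\!\!\int_0^{\sigma_2} h_0 \,d\sigma_3\,d\sigma_2\,d\sigma_1 = \frac{h_0 t^3}{6},
\qquad
\int_0^t\!\!\int_0^{\sigma_2} h_0 \,d\sigma_3\,d\sigma_2 = \frac{h_0 t^2}{2},
\]
\[
\chi(\alpha,t) = \alpha + u_0(\alpha)\,t + \phi_0'(\alpha)\frac{t^2}{2} + \frac{h_0 t^3}{6},
\qquad
\partial_t\psi(x,t) = -\,\frac{u_0(\psi) + \phi_0'(\psi)\,t + \tfrac{1}{2}h_0 t^2}
{1 + u_0'(\psi)\,t + \phi_0''(\psi)\,\tfrac{t^2}{2}} .
\]
```

In this case the flow map is a cubic polynomial in $t$ for each fixed $\alpha$, and $\psi(x,t)$ is obtained by solving $\chi(\psi, t) = x$ for $\psi$.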

Step 2. Trajectory of $s(t)$, $t \in [0, t_{E_1}]$. By Remark 2.1, we have the ODE for the sheath edge $s$:
\[
\dot s(t) = -h(t) - 1.
\]
Since the current $h$ was determined in Step 1, we solve the above ODE with initial data $s(0) = s_0$. Next, we denote by $t_{E_2}$ the time of intersection between the wall and the characteristic curve issued from $(s(t_{E_1}), t_{E_1})$, i.e.,
\[
\chi(s(t_{E_1}), t_{E_1}) = s(t_{E_1}), \qquad \chi(s(t_{E_1}), t_{E_2}) = 0.
\]
We decompose the region bounded by $s(t)$, $\chi(s_0, t)$, $t \in [0, t_{E_1}]$, and $\chi(s(t_{E_1}), t)$, $t \in [t_{E_1}, t_{E_2}]$, as $R^s_{12} \cup R^s_{21}$:
$R^s_{12} \equiv$ the region bounded by $s(t)$, $\chi(s_0, t)$, $t \in [0, t_{E_1}]$, and the line $t = t_{E_1}$,
$R^s_{21} \equiv$ the region bounded by $\chi(s(t_{E_1}), t)$, $t \in [t_{E_1}, t_{E_2}]$, and the line $t = t_{E_1}$.

Step 3. Let $(x,t) \in R^s_{12} \cup R^s_{21}$. Since $x = s(t)$ is a non-characteristic curve in space-time coordinates, as in Step 1 we can solve the system (3.1) with initial data $(n, u, \phi)(s(t), t) = (1, -1, 0)$, $t \in [0, t_{E_1}]$, using the method of characteristics. Next, we calculate $n$, $u$ and $\phi$ for $(x,t) \in R^s_{12} \cup R^s_{21}$.


Lemma 3.3. On $R^s_{12} \cup R^s_{21}$, we have
\[
n(x,t) = \frac{2}{2 + [t - \tau(x,t)]^2}, \qquad
\partial_x u(x,t) = \frac{2\,[t - \tau(x,t)]}{2 + [t - \tau(x,t)]^2}.
\]
Proof. Let $(x,t) \in R^s_{12} \cup R^s_{21}$. We integrate (3.3) from $(s(\tau(x,t)), \tau(x,t))$ to $(x,t)$ along the characteristic curve $\chi(s(\tau(x,t)), \cdot)$ to obtain
\[
\frac{u_x}{n}(x,t) = t - \tau(x,t),
\]
where we have used $u_x(s(\tau(x,t)), \tau(x,t)) = 0$. We then use the relation $\dfrac{u_x}{n} = \dfrac{D}{Dt}\!\left(\dfrac{1}{n}\right)$ on the characteristic curve $\chi(s(\tau(x,t)), \cdot)$ to find
\[
n(x,t) = \frac{2}{2 + [t - \tau(x,t)]^2}.
\]
By the same argument as in Step 1, we obtain
\[
\partial_x u(x,t) = \frac{2\,[t - \tau(x,t)]}{2 + [t - \tau(x,t)]^2}.
\]
This completes the proof.

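Again, using boundary data at $x = 0$ and $x = s(t)$, we see
\[
u(x,t) =
\begin{cases}
u_w(t) + \displaystyle\int_0^x u_\xi(\xi,t)\,d\xi, & x \in R^s_{21},\\[8pt]
-1 - \displaystyle\int_x^{s(t)} u_\xi(\xi,t)\,d\xi, & x \in R^s_{12}.
\end{cases}
\tag{3.9}
\]
Similarly, using $\partial_x \phi = \partial_t u + u\,\partial_x u$, we derive
\[
\phi(x,t) =
\begin{cases}
\phi_w(t) + \displaystyle\int_0^x \partial_\xi \phi(\xi,t)\,d\xi, & x \in R^s_{21},\\[8pt]
- \displaystyle\int_x^{s(t)} \partial_\xi \phi(\xi,t)\,d\xi, & x \in R^s_{12}.
\end{cases}
\tag{3.10}
\]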

In the above calculations, we have assumed the existence of $\tau(x,t)$, which is obvious from the geometry of a characteristic curve. Let $\chi(s(t_0), t) \equiv \chi(t)$ be the characteristic curve passing through $(s(t_0), t_0)$:
\[
\frac{d\chi(t)}{dt} = u(\chi(t), t), \qquad \chi(t_0) = s(t_0), \qquad t \in [t_{E_1}, t_{E_2}]. \tag{3.11}
\]
Lemma 3.4. Suppose that the current $h(t)$ is known a priori. Then we have, for $(x,t) \in R^s_{12}$,
\[
x = s(\tau(x,t)) - (t - \tau(x,t)) + \int_{\tau(x,t)}^t\!\!\int_{\tau(x,t)}^{\sigma_1}\!\!\int_{\tau(x,t)}^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1,
\]
\[
\partial_x \tau(x,t) = \frac{-2}{h(\tau(x,t))\,[2 + (t - \tau(x,t))^2]}, \qquad
\partial_t \tau(x,t) = \frac{2u(x,t)}{h(\tau(x,t))\,[2 + (t - \tau(x,t))^2]}.
\]


Proof. First notice that the characteristic curve $\chi(t)$ defined above satisfies
\[
\frac{d^3\chi(t)}{dt^3} = h(t),
\]
subject to initial data
\[
\chi(t_0) = s(t_0), \qquad \frac{d\chi(t_0)}{dt} = -1, \qquad \frac{d^2\chi(t_0)}{dt^2} = 0.
\]
Now we integrate the third order ODE for $\chi$ using the initial conditions to obtain
\[
\chi(t) = s(t_0) - (t - t_0) + \int_{t_0}^t\!\!\int_{t_0}^{\sigma_1}\!\!\int_{t_0}^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1. \tag{3.12}
\]
(i) Let $(x,t) \in R^s_{12} \cup R^s_{21}$. Then by definition of $\tau(x,t)$, we have
\[
x = s(\tau(x,t)) - (t - \tau(x,t)) + \int_{\tau(x,t)}^t\!\!\int_{\tau(x,t)}^{\sigma_1}\!\!\int_{\tau(x,t)}^{\sigma_2} h(\sigma_3)\,d\sigma_3\,d\sigma_2\,d\sigma_1. \tag{3.13}
\]
(ii) Now we differentiate (3.13) with respect to $x$ to obtain
\[
1 = \partial_x \tau(x,t)\left[\dot s(\tau(x,t)) + 1 - h(\tau(x,t))\,\frac{(t - \tau(x,t))^2}{2}\right].
\]
We use the relation $h(\tau(x,t)) = -\dot s(\tau(x,t)) - 1$ to get the second identity.
(iii) We differentiate Eq. (3.13) with respect to $t$ to get
\[
0 = \partial_t \tau(x,t)\left[\dot s(\tau(x,t)) + 1 - h(\tau(x,t))\,\frac{(t - \tau(x,t))^2}{2}\right] + u(x,t).
\]
This implies the third identity.

Now we use the third identity of the above lemma to determine $h(t)$ in order to calculate the sheath edge in the time interval $[t_{E_1}, t_{E_2}]$. By definition of the current $h$, we have
\[
h(t) \equiv \partial^2_{xt}\phi(0,t) + n(0,t)\,u_w(t)
= \partial_t^2 u(0,t) + \partial_t u(0,t)\,\partial_x u(0,t) + u_w(t)\,\partial^2_{xt}u(0,t) + n(0,t)\,u_w(t),
\]
and hence, writing $\theta = t - \tau(0,t)$,
\[
h(t) = \ddot u_w(t) + 2\dot u_w(t)\,\frac{\theta}{2 + \theta^2}
+ u_w(t)\left[1 - \frac{2u_w(t)}{h(\tau(0,t))\,[2 + \theta^2]}\right]\frac{4 - 2\theta^2}{[2 + \theta^2]^2}
+ \frac{2u_w(t)}{2 + \theta^2}. \tag{3.14}
\]

Notice that $n$, $u$ and $\phi$ depend only on the travelling time $t - \tau(x,t)$ of particles located at $x = s(\tau(x,t))$ at time $t = \tau(x,t)$ along a particle path (a characteristic curve). Again using the relation $\dot s(t) = -1 - h(t)$, we determine $s(t)$ in the time interval $[t_{E_1}, t_{E_2}]$. Then we repeat Step 3 on the region $R^s_{22} \cup R^s_{31}$. In this way, we can define the solutions on any domain $R^s_{i2} \cup R^s_{(i+1)1}$, $i = 3, \dots$. Next we study how the local solutions defined on $R^s_{ij}$ can be joined smoothly.

Proof of Theorem 3.1. Since the local solutions are clearly $C^1$ on the sub-regions $R^s_{kl}$, it suffices to show that the local solutions are actually $C^1$ on the joint boundary of sub-regions. For this, we only show that the local solutions in Lemma 3.1 and Lemma 3.3 coincide up to the first derivative across the characteristic curve $\chi(s_0, \cdot)$. The other cases can be proved similarly.
(i) First, we consider the ion density $n$. It follows from Lemma 3.1 that
\[
n(\chi(s_0,t)-, t)
= \frac{2 n_0(\psi(\chi(s_0,t)-, t))}{2 + 2u_0'(\psi(\chi(s_0,t)-, t))\,t + n_0(\psi(\chi(s_0,t)-, t))\,t^2}
= \frac{2 n_0(s_0-)}{2 + 2u_0'(s_0-)\,t + n_0(s_0-)\,t^2}
= \frac{2}{2 + t^2}. \tag{3.15}
\]

Here we used the fact that $\psi(\chi(s_0,t)-, t) = s_0$ and the continuity relation (M4) in Sect. 2. On the other hand, since $\tau(\chi(s_0,t), t) = 0$, Lemma 3.3 yields
\[
n(\chi(s_0,t)+, t) = \frac{2}{2 + [t - \tau(\chi(s_0,t), t)]^2} = \frac{2}{2 + t^2}. \tag{3.16}
\]
Combining (3.15) and (3.16), we have the continuity of $n$ across the characteristic curve $\chi(s_0, \cdot)$:
\[
n(\chi(s_0,t)-, t) = n(\chi(s_0,t)+, t).
\]
Now we need to check $\partial_x n(\chi(s_0,t)-, t) = \partial_x n(\chi(s_0,t)+, t)$. It follows from Lemma 3.2 and Lemma 3.4 that
\[
\partial_x \psi(\chi(s_0,t)-, t) = \frac{2}{2 + \phi_0''(s_0-)\,t^2}, \qquad
\partial_x \tau(\chi(s_0,t)+, t) = -\frac{2}{h(0)\,(2 + t^2)}.
\]

By direct calculations, we have
\[
\partial_x n(\chi(s_0,t)+, t) = \frac{4t\,\partial_x \tau(\chi(s_0,t)+, t)}{(2 + t^2)^2} = \frac{-8t}{h(0)\,(2 + t^2)^3},
\]
\[
\partial_x n(\chi(s_0,t)-, t) = \frac{-4t\,u_0''(s_0-)\,\partial_x \psi(\chi(s_0,t)-, t)}{(2 + t^2)^2} = \frac{-8t\,u_0''(s_0-)}{(2 + t^2)^2\,(2 + \phi_0''(s_0-)\,t^2)}.
\]

Since $\phi_0''(s_0-) = 1$ and $h(0)\,u_0''(s_0-) = 1$, we have
\[
\partial_x n(\chi(s_0,t)-, t) = \partial_x n(\chi(s_0,t)+, t).
\]
(ii) Next we consider the ion velocity $u$. In Lemma 3.1, we use $\psi(\chi(s_0,t)-, t) = s_0$ to get
\[
\partial_x u(\chi(s_0,t)-, t) = \frac{2t}{2 + t^2}. \tag{3.17}
\]
On the other hand, Lemma 3.3 yields
\[
\partial_x u(\chi(s_0,t)+, t) = \frac{2\,[t - \tau(\chi(s_0,t)+, t)]}{2 + [t - \tau(\chi(s_0,t)+, t)]^2} = \frac{2t}{2 + t^2}. \tag{3.18}
\]
It follows from (3.17) and (3.18) that $\partial_x u(\chi(s_0,t)-, t) = \partial_x u(\chi(s_0,t)+, t)$. It then follows from (3.9) that
\[
u(\chi(s_0,t)+, t) = -1 - \int_{\chi(s_0,t)}^{s(t)} u_\xi(\xi,t)\,d\xi, \qquad
u(\chi(s_0,t)-, t) = u_w(t) + \int_0^{\chi(s_0,t)} u_\xi(\xi,t)\,d\xi.
\]
Since
\[
\int_0^{s(t)} u_\xi(\xi,t)\,d\xi = u(s(t), t) - u_w(t) = -1 - u_w(t),
\]

we have $u(\chi(s_0,t)-, t) = u(\chi(s_0,t)+, t)$. Hence we have shown that $n$, $\partial_x n$, $u$ and $\partial_x u$ are continuous across $\chi(s_0, t)$. From (3.1), this implies the continuity of $\partial_t n$, $\partial_t u$, $\phi$, $\partial_x \phi$ as well. This completes the proof.

Before we finish this section, we consider special initial data ($n$ and $u$ constant) and show how the implicit formulas can be used to find explicit solutions locally in $t$. Here assumption A2 will not be satisfied, and we will construct a weak but not smooth solution in the sheath region.

Example (K.U. Riemann and Th. Daube [28]). In this example, we consider the time-evolution of a sheath arising from a matrix sheath during the ion-extraction phase ($t \in [0, t_{E_1}]$). We take initial data and boundary data:
\[
n_0(x) = 1, \qquad u_0(x) = -1, \qquad \phi_0(x) = \frac{1}{2}(x - s_0)^2, \qquad s_0 = \sqrt{2\phi_0(0)}, \qquad 0 \le x \le s_0,
\]
\[
(u, \phi)(0,t) = (u_w, \phi_w)(t), \qquad t \ge 0.
\]

The last requirement of Assumption A2 is violated by this data, since $u_0''(s_0-) = 0$, so we cannot expect a smooth solution in the sheath region. Outside the sheath region, $x \in (s(t), \infty)$, the plasma is in the quasi-neutral state:
\[
n(x,t) = 1, \qquad u(x,t) = -1, \qquad \phi(x,t) = 0, \qquad t \ge 0.
\]

Then by Lemma 3.1, on $R^s_{11}$ we have
\[
n(x,t) = \frac{2}{2 + t^2}, \qquad
u(x,t) = u_w(t) + \frac{2xt}{2 + t^2}, \qquad
\phi(x,t) = \phi_w(t) + \dot u_w(t)\,x + u_w(t)\,\frac{2xt}{2 + t^2} + \frac{x^2}{2 + t^2}.
\]
Hence we can calculate the current $h(t)$, $t \in [0, t_{E_1}]$, explicitly, i.e.,
\[
h(t) = \ddot u_w(t) + \dot u_w(t)\,\frac{2t}{2 + t^2} + \frac{8u_w(t)}{(2 + t^2)^2}, \qquad \dot s(t) = -1 - h(t).
\]
Once the trajectory of $s(t)$ is known in $0 \le t \le t_{E_1}$, local solutions on $R^s_{12}$ are given as
\[
n(x,t) = \frac{2}{2 + (t - \tau(x,t))^2}, \qquad
\partial_x u(x,t) = \frac{2(t - \tau(x,t))}{2 + (t - \tau(x,t))^2}, \qquad
u(x,t) = -1 - \int_x^{s(t)} \frac{2(t - \tau(\xi,t))}{2 + (t - \tau(\xi,t))^2}\,d\xi.
\]
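The explicit current of the example on $[0, t_{E_1}]$ is easy to evaluate once the wall velocity is specified. The sketch below is only an illustration: the wall velocity $u_w(t) = -(1 + t^2/2)$ is a hypothetical choice (it satisfies $u_w(0) = u_0 = -1$ but is not taken from the paper), and the function names are ours. For this particular choice the formula for $h$ collapses to the constant $-3$, which gives a convenient sanity check.

```python
# Sketch: current h(t) and sheath-edge speed for the example with n0 = 1, u0 = -1.
# The wall data below are a hypothetical illustration, not data from the paper.

def current(t, uw, duw, dduw):
    """h(t) = u_w'' + u_w' * 2t/(2+t^2) + 8 u_w/(2+t^2)^2 on [0, t_E1]."""
    q = 2.0 + t * t
    return dduw(t) + duw(t) * 2.0 * t / q + 8.0 * uw(t) / (q * q)

def sheath_speed(t, uw, duw, dduw):
    """Sheath edge relation: ds/dt = -1 - h(t)."""
    return -1.0 - current(t, uw, duw, dduw)

# Hypothetical wall velocity u_w(t) = -(1 + t^2/2) and its derivatives.
def uw(t):
    return -(1.0 + 0.5 * t * t)

def duw(t):
    return -t

def dduw(t):
    return -1.0

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, current(t, uw, duw, dduw), sheath_speed(t, uw, duw, dduw))
```

With this wall velocity, $h(t) = -1 - \frac{2t^2 + 4}{2 + t^2} = -3$ and $\dot s(t) = 2$ for all $t$, so the sheath edge moves outward at constant speed in this illustration.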

From the relation $\partial_x \phi = \partial_t u + u\,\partial_x u$, we have
\[
\phi(x,t) = -\int_x^{s(t)} \partial_\xi \phi(\xi,t)\,d\xi,
\]
and since the current $h(t)$ is given explicitly, we can determine $\tau(x,t)$ from (3.13).

Remark 3.3. Since the initial data has $u_0''(s_0) = 0$, contrary to Assumption A2, the density $n(x,t)$, while continuous, possesses a jump in its first derivative in $x$ across the characteristic curve $\chi(s_0, \cdot)$. Hence the example of K.U. Riemann and Th. Daube provides a weak solution but not a smooth $C^1$ solution in the sheath region.

3.2. Large time dynamics of the sheath edge. In this subsection, we study the large time dynamics of the sheath edge $x = s(t)$. By our choice of boundary data $u_w$ and $\phi_w$, the current $h(t)$ is always less than or equal to $-1$. In addition, we have
\[
\dot s(t) = -1 - h(t) \ge 0, \qquad s(0) = s_0 > 0.
\]
Since the sheath edge is non-decreasing in time $t$, there are two cases: either
\[
\lim_{t\to\infty} s(t) = s_\infty < \infty \qquad \text{or} \qquad \lim_{t\to\infty} s(t) = \infty.
\]


We first show that $s_\infty = \infty$ is impossible. Let $t \gg 1$ be given and consider the characteristic curve $\chi(s(\tau(0,t)), \sigma)$ issued from $(s(\tau(0,t)), \tau(0,t))$. Since $u$ is nondecreasing, we have $1 \le |u(\chi, \sigma)| \le |u_w^\infty|$. We integrate $\dfrac{d\chi(s(\tau(0,t)), \sigma)}{d\sigma}$ from $\sigma = \tau(0,t)$ to $\sigma = t$ to get
\[
|s(\tau(0,t))| = \left|\int_{\tau(0,t)}^t \frac{d\chi(s(\tau(0,t)), \sigma)}{d\sigma}\,d\sigma\right|
\le \int_{\tau(0,t)}^t |u(\chi(s(\tau(0,t)), \sigma), \sigma)|\,d\sigma
\le |u_w^\infty|\,(t - \tau(0,t)).
\]
By assumption, the left-hand side goes to $\infty$ as $t \to \infty$. Hence we conclude
\[
\lim_{t\to\infty} (t - \tau(0,t)) = \infty.
\]
But this is impossible, because by (3.14) we would then have
\[
\lim_{t\to\infty} h(t) = 0, \qquad \lim_{t\to\infty} \dot s(t) = -1,
\]

which contradicts assumption A4. Hence the sheath edge cannot go to infinity as $t \to \infty$. In the case that the sheath edge comes to rest as $t \to \infty$, we now give the time-asymptotic location of the sheath edge $s(t)$ explicitly in terms of the asymptotic value of the wall potential $\phi_w^\infty$.

Proof of Proposition 3.2. Since the sheath edge $s(t)$ converges to $s_\infty\ (< \infty)$ monotonically, we have
\[
\lim_{t\to\infty} \dot s(t) = 0, \qquad
\lim_{t\to\infty} h(\tau(0,t)) = -1, \qquad
\lim_{t\to\infty} [t - \tau(0,t)] = T_*, \qquad
\lim_{t\to\infty} \partial_t (t - \tau(0,t)) = 0. \tag{3.19}
\]
It follows from (3.14) that
\[
1 = \frac{2|u_w^\infty|}{2 + T_*^2}, \qquad \text{or equivalently} \qquad T_* = \sqrt{2|u_w^\infty| - 2}.
\]
Below, we calculate the time-asymptotic location $s_\infty$ of the sheath edge. Recall that
\[
h(t) = -1 + o(1) \quad \text{and} \quad u_w(t) = u_w^\infty + o(1), \qquad \text{as } t \to \infty.
\]

Let $t \gg 1$ be given and denote by $\chi(\sigma)$ the characteristic curve issued from $(s(\tau(0,t)), \tau(0,t))$, i.e.,
\[
\frac{d\chi(\sigma)}{d\sigma} = u(\chi(\sigma), \sigma), \quad \sigma \in [\tau(0,t), t], \qquad \chi(\tau(0,t)) = s(\tau(0,t)), \qquad \chi(t) = 0.
\]
Along characteristic curves, we have
\[
\frac{Du}{Dt} = E, \qquad \frac{DE}{Dt} = h.
\]
We integrate the above equations along the characteristic curve to get
\[
E(\chi(\sigma), \sigma) = E(s(\tau(0,t)), \tau(0,t)) + \int_{\tau(0,t)}^{\sigma} (-1)\,d\sigma_1 + o(1) = -(\sigma - \tau(0,t)) + o(1),
\]
\[
u(\chi(\sigma), \sigma) = u(s(\tau(0,t)), \tau(0,t)) - \int_{\tau(0,t)}^{\sigma} (\sigma_1 - \tau(0,t))\,d\sigma_1 + o(1)
= -1 - \frac{1}{2}(\sigma - \tau(0,t))^2 + o(1), \qquad \text{as } t \to \infty,
\]
where we have used the sheath edge relations
\[
E(s(\tau(0,t)), \tau(0,t)) = 0, \qquad u(s(\tau(0,t)), \tau(0,t)) = -1.
\]
Now, we find the characteristic curve $\chi(\sigma)$ satisfying the boundary conditions $\chi(t) = 0$, $\chi(\tau(0,t)) = s(\tau(0,t))$. We integrate $\dfrac{d\chi(\sigma)}{d\sigma} = -1 - \dfrac{1}{2}(\sigma - \tau(0,t))^2 + o(1)$ to get
\[
\chi(\sigma) - s(\tau(0,t)) = -(\sigma - \tau(0,t)) - \frac{1}{6}(\sigma - \tau(0,t))^3 + o(1). \tag{3.20}
\]
Since $\chi(t) = 0$ and $s(\tau(0,t)) = s_\infty + o(1)$, we have
\[
s_\infty = T_* + \frac{1}{6}T_*^3 + o(1)
= (2|u_w^\infty| - 2)^{\frac{1}{2}} + \frac{1}{6}(2|u_w^\infty| - 2)^{\frac{3}{2}} + o(1), \qquad \text{as } t \to \infty.
\]
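The two limiting relations just obtained, $T_* = \sqrt{2|u_w^\infty| - 2}$ and $s_\infty = T_* + T_*^3/6$, can be evaluated directly. The following is a small sketch under the assumption that only the leading-order terms are kept (the $o(1)$ corrections are dropped); the function names and the sample value $u_w^\infty = -3$ are ours.

```python
import math

def transit_time(uw_inf):
    """Leading-order limiting transit time T_* = sqrt(2|u_w^inf| - 2)."""
    arg = 2.0 * abs(uw_inf) - 2.0
    if arg < 0.0:
        raise ValueError("requires |u_w^inf| >= 1")
    return math.sqrt(arg)

def sheath_edge_limit(uw_inf):
    """Leading-order limiting sheath edge s_inf = T_* + T_*^3 / 6."""
    T = transit_time(uw_inf)
    return T + T ** 3 / 6.0

if __name__ == "__main__":
    # Hypothetical asymptotic wall velocity, used only as an illustration.
    print(transit_time(-3.0), sheath_edge_limit(-3.0))
```

For $|u_w^\infty| = 1$ both quantities vanish, which corresponds to a wall moving at the sheath-edge speed.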

Next we determine the time-asymptotic location of the sheath edge in terms of the time-asymptotic wall potential $\phi_w^\infty$. By the fundamental theorem of calculus, we have
\[
\phi_w(\sigma) = -\int_0^{s(\sigma)} E(x, \sigma)\,dx. \tag{3.21}
\]
For given $(x, \sigma)$, $x \in [0, s(\sigma)]$, $\sigma \gg 1$, there exists the unique characteristic curve $\chi$ passing through $(x, \sigma)$ and $(0, \tau(0,t))$, for some $t \gg 1$. Then it follows from (3.20) that
\[
x = s(\sigma) - (\sigma - \tau(0,t)) - \frac{1}{6}(\sigma - \tau(0,t))^3 + o(1), \qquad \text{as } \sigma, t \to \infty,
\]
i.e.,
\[
(\sigma - \tau(0,t))^3 + 6(\sigma - \tau(0,t)) + 6(x - s(\sigma)) = o(1), \qquad \text{as } \sigma, t \to \infty,
\]
which yields
\[
\sigma - \tau(0,t) = \zeta(x, \sigma) + o(1), \qquad \text{for some } \zeta(x, \sigma) \text{ as } \sigma, t \to \infty.
\]
Hence we find
\[
E(x, \sigma) = \int_{\tau(0,t)}^{\sigma} h(\sigma_1)\,d\sigma_1 = -[\sigma - \tau(0,t)] + o(1) = -\zeta(x, \sigma) + o(1), \qquad \text{as } \sigma, t \to \infty. \tag{3.22}
\]


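Although we could solve the cubic equation (3.22) explicitly, instead we obtain $\zeta$ as follows. Consider the following cubic equation of (3.22):
\[
-(x - s(\sigma)) = (\sigma - \tau(0,t)) + \frac{1}{6}(\sigma - \tau(0,t))^3 + o(1). \tag{3.23}
\]
First, notice that if
\[
\sigma - \tau(0,t) = \frac{1}{6}(\sigma - \tau(0,t))^3,
\]
then $\sigma - \tau(0,t) = \sqrt{6}$. Let $\eta$ be a positive number such that $0 < \eta \ll 1$. Then by comparison of the two terms in (3.22), we consider the following three cases.
Case 1. $0 \le \sigma - \tau(0,t) \le \eta\sqrt{6}$, or equivalently $s(\sigma) - \eta\sqrt{6} \le x \le s(\sigma)$: We see
\[
-x + s(\sigma) = \sigma - \tau(0,t) = \zeta(x, \sigma) + o(1), \qquad \text{as } \sigma, t \to \infty.
\]
Case 2. $\eta\sqrt{6} \le \sigma - \tau(0,t) < (1 + \eta)\sqrt{6}$, or equivalently $s(\sigma) - (1 + \eta)\sqrt{6} \le x \le s(\sigma) - \eta\sqrt{6}$: We see
\[
\zeta(x, \sigma) + o(1) = \sqrt{6}, \qquad \text{as } \sigma, t \to \infty.
\]
Case 3. $(1 + \eta)\sqrt{6} \le \sigma - \tau(0,t)$, or equivalently $x < s(\sigma) - (1 + \eta)\sqrt{6}$: We see
\[
\zeta(x, \sigma) + o(1) = \big(6(-x + s(\sigma))\big)^{\frac{1}{3}}, \qquad \text{as } \sigma, t \to \infty.
\]
In (3.21), we divide the integral into three pieces according to the above three cases:
\[
\phi_w(\sigma) = \int_0^{s(\sigma)} \zeta(x, \sigma)\,dx
= \int_{s(\sigma) - \eta\sqrt{6}}^{s(\sigma)} (-x + s(\sigma))\,dx
+ \int_{s(\sigma) - (1+\eta)\sqrt{6}}^{s(\sigma) - \eta\sqrt{6}} \sqrt{6}\,dx
+ \int_0^{s(\sigma) - (1+\eta)\sqrt{6}} \big(6(-x + s(\sigma))\big)^{\frac{1}{3}}\,dx + o(1)
\]
\[
= s(\sigma)\,\eta\sqrt{6} - \frac{s(\sigma)^2 - (s(\sigma) - \eta\sqrt{6})^2}{2} + 6
+ \frac{3 \cdot 6^{\frac{1}{3}}}{4}\Big[s(\sigma)^{\frac{4}{3}} - [(1 + \eta)\sqrt{6}]^{\frac{4}{3}}\Big] + o(1), \qquad \text{as } \sigma, t \to \infty.
\]
On the other hand, by letting $\eta \to 0+$ in the above equation, we obtain
\[
\phi_w(\sigma) = 3^{\frac{4}{3}}\,2^{-\frac{5}{3}}\,s(\sigma)^{\frac{4}{3}} + o(1), \qquad \text{as } \sigma \to \infty.
\]
Hence we obtain the Child-Langmuir law for the sheath edge location for a steady sheath:
\[
\lim_{t\to\infty} s(t) = \frac{2^{\frac{5}{4}}}{3}\,(\phi_w^\infty)^{\frac{3}{4}}.
\]
This completes the proof.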
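The last two displayed relations are inverse to each other: $\phi_w = 3^{4/3} 2^{-5/3} s^{4/3}$ solved for $s$ gives $s = \frac{2^{5/4}}{3}\phi_w^{3/4}$. A minimal sketch of this round trip, keeping only the leading-order terms, is given below; the function names are ours and the numerical value of $s$ is an arbitrary sample.

```python
def wall_potential(s):
    """Leading-order relation phi_w = 3^(4/3) * 2^(-5/3) * s^(4/3)."""
    return 3.0 ** (4.0 / 3.0) * 2.0 ** (-5.0 / 3.0) * s ** (4.0 / 3.0)

def sheath_edge(phi_w):
    """Child-Langmuir law s = (2^(5/4) / 3) * phi_w^(3/4)."""
    return 2.0 ** 1.25 / 3.0 * phi_w ** 0.75

if __name__ == "__main__":
    s = 2.0  # arbitrary sample sheath-edge position
    print(wall_potential(s), sheath_edge(wall_potential(s)))
```

The second printed number should reproduce the sample value of $s$ up to rounding.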


4. The Quasi-Neutral System with Cylindrical and Spherical Symmetry

We begin our study of sheath dynamics with cylindrical and spherical symmetry. The first step is to understand the quasi-neutral system which describes the fluid motion of cold ions exterior to the sheath. Specifically, we study steady state solutions and dynamic solutions of the quasi-neutral system (Q), which is a $2 \times 2$ strictly hyperbolic system:
\[
\begin{cases}
\partial_t \rho + \partial_r(\rho u) = 0, & s(t) < r < \infty,\ t > 0,\\[4pt]
\partial_t u + \partial_r\!\left(\dfrac{u^2}{2} + \ln \rho\right) = \dfrac{\nu}{r}. &
\end{cases}
\tag{4.1}
\]
These solutions will be used to describe exterior flows of sheaths. Notice that the above system can be written as the isothermal Euler equations with pressure law $P = \rho$ and a geometric source term $\dfrac{\nu\rho}{r}$. In the following two subsections, we will consider steady state solutions and dynamic solutions of (4.1) separately.

4.1. Steady state solutions. Steady state solutions $(\rho, u) = (R, U)(r)$, which are the exterior flow in the region $R^q_2$, satisfy the following steady system (Q)$_{st}$:
\[
\begin{cases}
\dfrac{d}{dr}(RU) = 0, & a(t) < r < \infty,\\[6pt]
\dfrac{d}{dr}\!\left(\dfrac{U^2}{2} + \ln R\right) = \dfrac{\nu}{r}, &
\end{cases}
\tag{4.2}
\]
subject to boundary conditions at $r = \infty$:
\[
U \to 0, \qquad n \to 1, \qquad RU \to -A, \qquad \text{as } r \to \infty,
\]
where $a(t)$ is the trajectory of the ion-acoustic wave issued from $s_0$, and $A$ is a given positive constant. Then it follows from the first equation of (4.2) that
\[
RU = -A, \qquad \text{or} \qquad R = -\frac{A}{U}. \tag{4.3}
\]

By direct calculation, we have
\[
\frac{1}{R}\frac{dR}{dr} = -\frac{1}{U}\frac{dU}{dr}. \tag{4.4}
\]
It follows from (4.2) and (4.4) that
\[
\frac{dU}{dr} = \frac{\nu U}{r(U^2 - 1)}, \qquad U(\infty) = 0,
\]
and we obtain an implicit solution $(R, U)$:
\[
-\frac{1}{\sqrt{e}}\left(\frac{s_0}{r}\right)^{\nu} = U e^{-\frac{U^2}{2}}, \tag{4.5}
\]
defined on the maximal interval of existence $(s_0, \infty)$, where
\[
\lim_{r \to s_0+} U(r) = -1, \qquad \lim_{r \to s_0+} \frac{dU(r)}{dr} = \infty.
\]
Notice that the implicit relation (4.5) algebraically defines possibly multiple solutions $U$. However, $F(U) \equiv U e^{-\frac{U^2}{2}}$ is a monotone increasing function on the interval $[-1, 0]$, and $F(-1) = -\frac{1}{\sqrt{e}}$, $F(0) = 0$. On the other hand, the left-hand side of (4.5) satisfies
\[
-\frac{1}{\sqrt{e}} \le -\frac{1}{\sqrt{e}}\left(\frac{s_0}{r}\right)^{\nu} \le 0, \qquad r \in [s_0, \infty).
\]
Hence we take $U$ to be the unique monotone increasing solution of (4.5) with range in $[-1, 0]$. The steady state solution $\rho = R$ and $u = U$ is only defined on the exterior interval $r > s_0$. Thus $U$ and $R$ provide initial data for our problem in cylindrical and spherical symmetry when $r > s_0$.

Recall that the ion-acoustic wave is defined by the equation $\dfrac{da(t)}{dt} = U(a(t)) + 1$, where $(U, R)$ is the solution of (4.2). But as noted above, $U$ has range in $[-1, 0]$ and hence $\dfrac{da(t)}{dt} \ge 0$, and in fact $a(t)$ is only defined on the domain of $U$, i.e., $s_0 \le a(t) < \infty$, $t \ge 0$. This observation justifies Remark 2.2.

4.2. Dynamic solutions. In Subsect. 4.1, we derived initial data
\[
(\rho_0, u_0)(r) = (R, U)(r), \qquad r > s_0,
\]
for the quasi-neutral system (4.1). In this subsection, we will see how (Q) evolves for this steady initial data, and we study dynamic solutions to the system (4.1), which defines the exterior flow between the sheath edge and the ion-acoustic wave issued from the initial sheath edge position, in the region $R^q_1$. First we rewrite the system (4.1) in quasi-linear form:
\[
\begin{cases}
\partial_t \rho + u\,\partial_r \rho + \rho\,\partial_r u = 0, & (r,t) \in (s(t), a(t)) \times \mathbb{R}_+,\\[4pt]
\partial_t u + \dfrac{1}{\rho}\,\partial_r \rho + u\,\partial_r u = \dfrac{\nu}{r}, &
\end{cases}
\tag{4.6}
\]
with initial data
\[
u_0(r) = U(r), \qquad \rho_0(r) = R(r), \qquad r \in [s_0, \infty).
\]
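The steady profile $U(r)$ used as initial data is only given implicitly by (4.5). Since $F(U) = U e^{-U^2/2}$ is monotone increasing on $[-1, 0]$, the relevant root can be located by bisection. The following is a minimal sketch of that idea, not the authors' procedure; the function names, the tolerance, and the sample values of $s_0$, $\nu$ and $A$ are assumptions made for illustration.

```python
import math

def F(U):
    """F(U) = U * exp(-U^2 / 2), monotone increasing on [-1, 0]."""
    return U * math.exp(-0.5 * U * U)

def steady_velocity(r, s0, nu, tol=1e-14):
    """Solve (4.5) for U(r) in [-1, 0] by bisection; requires r >= s0."""
    if r < s0:
        raise ValueError("the steady state is only defined for r >= s0")
    target = -math.exp(-0.5) * (s0 / r) ** nu  # left-hand side of (4.5)
    lo, hi = -1.0, 0.0                          # F(lo) <= target <= F(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def steady_density(r, s0, nu, A):
    """R(r) = -A / U(r), from (4.3); undefined in the limit U -> 0."""
    return -A / steady_velocity(r, s0, nu)

if __name__ == "__main__":
    # Hypothetical parameters: s0 = 1, nu = 1 (cylindrical case), A = 1.
    for r in (1.0, 2.0, 4.0):
        print(r, steady_velocity(r, 1.0, 1), steady_density(r, 1.0, 1, 1.0))
```

Near $r = s_0$ the root is poorly conditioned because $F'(-1) = 0$, which reflects the infinite slope of $U$ at the sheath edge noted above; the bisection still converges, but only to roughly the square root of machine precision there.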

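By direct calculations, we obtain two genuinely nonlinear characteristic fields in the sense of Lax [22]:
\[
\lambda_1(\rho, u) = u - 1, \quad \lambda_2(\rho, u) = u + 1, \quad r_1(\rho, u) = (-\rho, 1), \quad r_2(\rho, u) = (\rho, 1),
\]
\[
\nabla\lambda_1(\rho, u) \cdot r_1(\rho, u) = 1 > 0, \qquad \nabla\lambda_2(\rho, u) \cdot r_2(\rho, u) = 1 > 0.
\]
Next we define some notation. Let $\alpha > s_0$ be given, and let $\chi_i(\alpha, t)$ be the $i$-th characteristic curve issued from $r = \alpha$, satisfying
\[
\frac{d\chi_i(\alpha,t)}{dt} = \lambda_i(\rho, u)(\chi_i(\alpha,t), t), \qquad \chi_i(\alpha, 0) = \alpha.
\]
Define directional derivatives along characteristic curves:
\[
F^{\prime_1} = \frac{d}{dt}F(\chi_1(\alpha,t), t), \qquad F^{\prime_2} = \frac{d}{dt}F(\chi_2(\alpha,t), t).
\]
We introduce Riemann invariants $(\Gamma_1, \Gamma_2)$:
\[
\Gamma_1 = u - \ln \rho, \qquad \Gamma_2 = u + \ln \rho.
\]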


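In terms of the Riemann invariants, the system (4.1) becomes
\[
\partial_t \Gamma_1 + \lambda_1(\Gamma_1, \Gamma_2)\,\partial_r \Gamma_1 = \frac{\nu}{r}, \qquad
\partial_t \Gamma_2 + \lambda_2(\Gamma_1, \Gamma_2)\,\partial_r \Gamma_2 = \frac{\nu}{r}. \tag{4.7}
\]
We use the simplified notation for the density on the sheath edge: $\rho_s(t) := \rho(s(t), t)$.

Lemma 4.1. Suppose there exists $T_1 > 0$ such that
\[
\dot s(t) < 0, \qquad t \in (0, T_1].
\]
Then $\Gamma_2$ satisfies
\[
\partial_r \Gamma_2(s(t), t) > 0, \qquad t \in [0, T_1].
\]
Proof. We first consider the case $t \in (0, T_1]$. Then by assumption we have
\[
\dot s(t) < 0, \qquad t \in (0, T_1].
\]
Since
\[
u(s(t), t) = -1, \qquad \rho(s(t), t) = \rho_s(t),
\]
we have
\[
\Gamma_2(s(t), t) = -1 + \ln \rho_s(t). \tag{4.8}
\]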

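Below, all partial derivatives are evaluated as limits from the quasi-neutral region. We differentiate (4.8) with respect to $t$ to get
\[
\partial_t \Gamma_2(s(t)+, t) + \dot s(t)\,\partial_r \Gamma_2(s(t)+, t) = \frac{\dot\rho_s(t)}{\rho_s(t)}. \tag{4.9}
\]
On the other hand, since $\Gamma_2$ is the second Riemann invariant, we have
\[
\Gamma_2^{\prime_2}(s(t), t) = \partial_t \Gamma_2(s(t), t) + \lambda_2(s(t), t)\,\partial_r \Gamma_2(s(t), t) = \partial_t \Gamma_2(s(t), t) = \frac{\nu}{s(t)}, \tag{4.10}
\]
where we used $\lambda_2 = 0$ on the sheath edge. Now we subtract (4.10) from (4.9) to get
\[
\dot s(t)\,\partial_r \Gamma_2(s(t), t) = \frac{\dot\rho_s(t)}{\rho_s(t)} - \frac{\nu}{s(t)}
= -\frac{\nu}{s}\left(\frac{h}{\rho_s} + 1\right) = \dot s(t)\,\frac{\nu}{s(t)}.
\]
Hence we have $\partial_r \Gamma_2(s(t), t) > 0$. Now we consider $\partial_r \Gamma_2(s_0, 0)$. By definition, we have
\[
\partial_r \Gamma_2(s_0, 0) = \partial_r u_0(s_0) + (\ln \rho)'(s_0) = 2\,\partial_r u_0(s_0) + \frac{\nu}{s_0} \ge \frac{\nu}{s_0} > 0.
\]
This completes the proof.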


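Remark 4.1. In preparation for Sect. 5, notice that initial data such that $\dot s(0) = 0$, $\dot h(0) = 0$ yields, from the 2nd order equation for $s$,
\[
\ddot s(0) = -\frac{\nu}{s(0)} < 0.
\]
Hence there exists some time $T_1$ such that
\[
\dot s(t) < 0, \qquad t \in (0, T_1].
\]
For simplicity, we use the following abbreviated notation:
$\chi_2(t : t_0) \equiv$ the 2nd characteristic curve issued from $(s(t_0), t_0)$,
$Y(t : t_0) \equiv e^{\frac{\Gamma_1 - \Gamma_2}{4}}\,\partial_r \Gamma_2(\chi_2(t : t_0), t)$, $t \ge t_0$.
Next we derive an ODE for $\partial_r \Gamma_2$ along the second characteristic curve $\chi_2(t : t_0)$, which leads to the finite-time blow up of $\partial_r \Gamma_2$. By a rather tedious calculation, we obtain
\[
(\partial_r \Gamma_2)^{\prime_2} = -(\partial_r u)^2 - (\partial_r \ln \rho)(\partial_r u) - \frac{\nu}{\chi_2^2}
= -(\partial_r u)(\partial_r \Gamma_2) - \frac{\nu}{\chi_2^2}. \tag{4.11}
\]
Here all partial derivatives are evaluated at $(\chi_2(t), t)$. Substitute
\[
\partial_r u = \frac{\partial_r \Gamma_1 + \partial_r \Gamma_2}{2}
\]
into (4.11) to obtain
\[
(\partial_r \Gamma_2)^{\prime_2} = -\left(\frac{\partial_r \Gamma_1 + \partial_r \Gamma_2}{2}\right)\partial_r \Gamma_2 - \frac{\nu}{\chi_2^2}
= -\frac{(\partial_r \Gamma_2)^2}{2} - \frac{(\partial_r \Gamma_1)(\partial_r \Gamma_2)}{2} - \frac{\nu}{\chi_2^2}. \tag{4.12}
\]
On the other hand, we have
\[
\Gamma_1^{\prime_2}(\chi_2(t), t) = \big(\partial_t \Gamma_1 + (u + 1)\,\partial_r \Gamma_1\big)(\chi_2(t), t)
= \Gamma_1^{\prime_1} + 2\,\partial_r \Gamma_1(\chi_2(t), t)
= \frac{\nu}{\chi_2} + 2(\partial_r \Gamma_1)(\chi_2(t), t)
= \Gamma_2^{\prime_2} + 2\,\partial_r \Gamma_1(\chi_2(t), t),
\]
which implies
\[
(\partial_r \Gamma_1)(\chi_2(t), t) = \frac{(\Gamma_1 - \Gamma_2)^{\prime_2}}{2}(\chi_2(t), t),
\]
and (4.12) becomes
\[
(\partial_r \Gamma_2)^{\prime_2} + \frac{1}{4}(\Gamma_1 - \Gamma_2)^{\prime_2}\,\partial_r \Gamma_2 = -\frac{(\partial_r \Gamma_2)^2}{2} - \frac{\nu}{\chi_2^2}.
\]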


Now multiply the above equation by the integrating factor $e^{\frac{\Gamma_1 - \Gamma_2}{4}}$, and we find
\[
\left(e^{\frac{\Gamma_1 - \Gamma_2}{4}}\,\partial_r \Gamma_2\right)^{\prime_2}
= -\frac{(\partial_r \Gamma_2)^2}{2}\,e^{\frac{\Gamma_1 - \Gamma_2}{4}} - \frac{\nu}{\chi_2^2}\,e^{\frac{\Gamma_1 - \Gamma_2}{4}}. \tag{4.13}
\]
With the notation $Y$ introduced above, (4.13) becomes
\[
Y^{\prime_2} = -e^{-\frac{\Gamma_1 - \Gamma_2}{4}}\,\frac{Y^2}{2} - e^{\frac{\Gamma_1 - \Gamma_2}{4}}\,\frac{\nu}{\chi_2^2} < 0. \tag{4.14}
\]
By Lemma 4.1 we know $Y(t_0 : t_0) > 0$, $t_0 \in [0, T_1]$. Hence it follows from (4.14) that
\[
Y(t : t_0) \le Y(t_0 : t_0), \qquad t \ge t_0.
\]
Then since $\Gamma_1$ and $\Gamma_2$ will be bounded in terms of data on the sheath edge and the ion-acoustic wave in the domain $R^q_1 \times [0, T_1]$, we have the following estimate:
\[
-c_1 Y^2 - c_2 \ge Y^{\prime_2} \ge -c_3 Y^2 - c_4, \tag{4.15}
\]
where $c_1$, $c_2$, $c_3$ and $c_4$ are positive constants depending on $T_1$. Since the data $Y(\chi_2(t), t)$ on the sheath edge is positive, Eq. (4.14) implies
\[
-c_1 Y^2(t : t_0) - c_2 \ge \frac{d}{dt}Y(t : t_0) \ge -c_3 Y^2(t : t_0) - c_4.
\]
Explicit integration of this relation shows that $Y(t : t_0)$ must eventually become negative, and so $-Y$ has a finite-time blow up. This observation yields:

Proposition 4.1. For the cylindrical and spherical symmetry cases, a shock wave must form in finite time in a region $R^q$ between the sheath edge and the ion-acoustic wave.

Below, we estimate a positive lower bound for the blow-up time of $\partial_r \Gamma_2$ along the characteristic curve $\chi_2(t : t_0)$ starting from $(s(t_0), t_0)$. Let $T_2(t_0)$ be a time such that $Y(T_2(t_0) : t_0) = 0$. Then since $0 < Y(t : t_0) < Y(t_0 : t_0)$ on some finite time interval $0 < t < \min\{T_2(t_0), T_1\}$, we have
\[
\frac{d}{dt}Y(t : t_0) \ge -c_3 Y^2(t_0 : t_0) - c_4.
\]
We integrate from $t_0$ to $T_2(t_0)$ to get
\[
Y(t_0 : t_0) \le [c_3 Y^2(t_0 : t_0) + c_4]\,(T_2(t_0) - t_0).
\]
Hence we have an estimate for $T_2(t_0)$:
\[
T_2(t_0) > \frac{Y(t_0 : t_0)}{c_3 Y^2(t_0 : t_0) + c_4} + t_0.
\]
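The lower bound for $T_2(t_0)$ can be compared with the exact zero-crossing time of the comparison equation $\dot Y = -c_3 Y^2 - c_4$, which is the lower envelope in (4.15) and can be integrated in closed form with an arctangent. The sketch below does this for hypothetical constants; the constants $c_3$, $c_4$ and the initial value $Y_0 = Y(t_0 : t_0)$ are not computed in the text, so the numbers are placeholders chosen only for illustration.

```python
import math

def zero_crossing_time(y0, c3, c4, t0=0.0):
    """Time at which the solution of dY/dt = -c3*Y^2 - c4, Y(t0) = y0 > 0,
    reaches zero: Y(t) = sqrt(c4/c3) * tan(theta0 - sqrt(c3*c4)*(t - t0))."""
    theta0 = math.atan(y0 * math.sqrt(c3 / c4))
    return t0 + theta0 / math.sqrt(c3 * c4)

def lower_bound(y0, c3, c4, t0=0.0):
    """The estimate from the text: Y0 / (c3*Y0^2 + c4) + t0."""
    return t0 + y0 / (c3 * y0 * y0 + c4)

if __name__ == "__main__":
    # Hypothetical constants and initial value.
    for y0 in (0.1, 1.0, 10.0):
        print(y0, lower_bound(y0, 2.0, 0.5), zero_crossing_time(y0, 2.0, 0.5))
```

Since the actual $Y$ stays above the solution of the comparison equation with the same initial value, its zero-crossing time is at least the arctangent value, and that value in turn dominates the simpler bound of the text because $\arctan z \ge z/(1+z^2)$ for $z \ge 0$.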


So, $-Y(t : t_0)$ will blow up at some time after $T_2(t_0)$. We denote the first blow up time of data on the sheath $(s(t), t)$, $t \in [0, T_1]$, by $T_*^s$. Then by the above estimate, we have
\[
T_*^s \ge \min_{t_0 \in [0, T_1]} \left[\frac{Y(t_0 : t_0)}{c_3 Y^2(t_0 : t_0) + c_4} + t_0\right]. \tag{4.16}
\]
Next we consider the possibility of finite-time blow up in $\partial_r \Gamma_1$. Since $\Gamma_1 = u - \ln \rho$, we have
\[
\partial_r \Gamma_1 = \partial_r u - \frac{\partial_r \rho}{\rho},
\]
and using $u = U(r)$, $\rho = R(r)$ and (4.2), we have
\[
(\partial_r \Gamma_1)(a(t), t) = \frac{U'(U + 1)}{U}(a(t), t) \le 0.
\]
Therefore $\partial_r \Gamma_1$ is non-positive on the ion-acoustic wave, and an argument similar to the one for $\partial_r \Gamma_2$ shows
1. $\partial_r \Gamma_1$ does not blow up in some time interval $T_*^i > t > 0$.
2. $\partial_r \Gamma_1$ eventually must blow up.
Let $T_* = \min\{T_*^s, T_*^i\}$. On the time interval $0 < t < T_*$, we have smooth solutions to our one-sided Goursat-like problem.

Proposition 4.2. There is a positive $T_*$ such that we have smooth solutions $(\rho, u)$ in a region $R^q_1 \cap (\mathbb{R}_+ \times [0, T_*])$.

5. Cylindrical and Spherical Ion-Sheaths

In this section, we study the "inner" problem for cylindrical and spherical sheaths, which are governed by the rescaled sheath system
\[
\begin{cases}
\partial_t \rho + \partial_r(\rho u) = 0, & (r,t) \in (r_0, s(t)) \times \mathbb{R}_+,\\[4pt]
\partial_t u + \partial_r\!\left(\dfrac{u^2}{2}\right) = \partial_r \phi, &\\[6pt]
\partial_r(r^\nu \partial_r \phi) = \rho, &
\end{cases}
\tag{5.1}
\]
subject to initial and boundary data:
\[
(\rho, u, \phi)(r, 0) = (\rho_0, u_0, \phi_0)(r), \quad r \in [r_0, s_0], \qquad (u, \phi)(r_0, t) = (u_w, \phi_w)(t), \quad t \ge 0.
\]
Unlike the case of the planar sheath in Sect. 3.1, we employ initial data representing the third phase of the sheath dynamics (returning sheath). Now we impose compatibility and monotonicity conditions on the initial and boundary data:
• B1. (Regularity of initial and boundary data)
\[
\rho_0 \in C^1((r_0, s_0)), \qquad u_0, \phi_0 \in C^2((r_0, s_0)), \qquad u_w \in C^4((0, \infty)), \quad \text{and} \quad \phi_w \in C^2((0, \infty)).
\]


• B2. (Compatibility and monotonicity of initial data)
\[
\rho_0(s_0-) = A, \qquad u_0(s_0-) = -1, \qquad \phi_0(s_0-) = -\ln A + \nu \ln s_0,
\]
\[
\rho_0'(s_0-) = R'(s_0+), \qquad u_0'(s_0-) = U'(s_0+), \qquad \phi_0'(s_0-) = -(\ln R)'(s_0) + \frac{\nu}{s_0}, \qquad u_0' \ge 0, \quad \phi_0 \ge 0,
\]
where $R$ and $U$ are the stationary solution defined in Sect. 4.1 and $A$ is the constant given in Sect. 4.1.
• B3. (Compatibility and decay condition of boundary data) $(u_w, \phi_w)(0) = (u_0, \phi_0)(r_0)$ and
\[
\dot u_w(t) \ge 0, \qquad \dot\phi_w(t) \le 0, \qquad (u_w(t), \dot u_w(t), \ddot u_w(t)) \to (-1, 0, 0), \qquad \partial_r(r^\nu \partial_r \phi_0) = \rho_0 \quad \text{and} \quad \phi_w(t) \to 0 \quad \text{as } t \to \infty.
\]
• B4. (Consistency between initial and boundary data) Initial and boundary data are "well prepared" so that
\[
\dot s(0) = 0, \qquad \dot s(t) > -1, \qquad h(0) = -A, \qquad \dot h(0) = 0.
\]
Remark 5.1.
1. In Proposition 5.1, we need the boundedness of the third derivative of $s$ at $t = 0+$, which implies $C^2$ regularity of the current $h$. This results in the assumption $u_w \in C^4$ in B1 (see Lemma 5.1).
2. In B2, we must have 2nd derivative relations in order to have smooth solutions, like A2 in Sect. 3.1.
3. The above compatibility and decay conditions have been formulated with the intent of setting up initial conditions mimicking the experiment described in [21]. Specifically, we wish any lack of smoothness in the initial data to reflect the (assumed) quasi-neutral and sheath initial data. Any other lack of smoothness would be contrary to our goal of reproducing the observed experimental result.

The main result of this section is the following theorem.

Theorem 5.1. Suppose that the initial and boundary data satisfy the assumptions (B1)–(B4). Then there exist piecewise smooth solutions $\rho$, $u$ and $\phi$ to (5.1) locally in time, $t \in [0, T]$, for some $T > 0$.

Proof. The proof is based on Lemma 5.1–Lemma 5.3 in the next subsection.

As we did for the planar sheaths in Sect. 3, we construct the smooth solutions using the method of characteristics.

5.1. Global existence of piecewise smooth weak solutions. We first rewrite the system (5.1) as a system of ODEs along characteristic curves:
\[
\frac{Du}{Dt} = \frac{e}{r^\nu}, \qquad \frac{De}{Dt} = h, \qquad \partial_r e = \rho, \qquad \text{where } \frac{D}{Dt} = \partial_t + u\,\partial_r. \tag{5.2}
\]
We define a characteristic curve $\chi$ to be a solution of the ODE
\[
\frac{d\chi(\alpha,t)}{dt} = u(\chi(\alpha,t), t), \qquad \chi(\alpha, 0) = \alpha, \tag{5.3}
\]


and for notational convenience, we use the simplified notation $\Lambda(\alpha, t) \equiv \partial_\alpha \chi(\alpha, t)$. Although the construction of local smooth solutions is the same in spirit as for the planar sheath, we first point out the difference. Along a characteristic curve, we have
\[
\frac{D\rho}{Dt} + \rho\,\partial_r u = 0, \qquad
\frac{D(\partial_r u)}{Dt} = -(\partial_r u)^2 + \frac{\rho}{r^\nu} - \frac{\nu}{r}\,\partial_r \phi.
\]
This results in
\[
\frac{D}{Dt}\left(\frac{\partial_r u}{\rho}\right) = \frac{1}{r^\nu} - \frac{\nu\,\partial_r \phi}{r\rho}.
\]
Since the right-hand side of the above equation involves $\partial_r \phi$ and $\rho$, this formula is not as useful as in the planar case. In the planar case ($\nu = 0$), the right-hand side is 1, so that we can integrate the above equation along characteristic curves. In the cylindrical and spherical cases we must follow a more complicated route. First let $t_{E_1}$ be the time of intersection between the characteristic $\chi(s_0, \cdot)$ starting at $r = s_0$ and the target. Then we decompose the region $[0, s(t)] \times [0, t_{E_1}]$ into two sub-regions $R^s_{11}$ and $R^s_{12}$:
$R^s_{11}$ = the region enclosed by $\chi(s_0, \cdot)$, $t = 0$ and $r = r_0$,
$R^s_{12}$ = the region enclosed by $\chi(s_0, \cdot)$, $t = t_{E_1}$ and the sheath $S(t)$.

Step 1. Construction of smooth solutions on $R^s_{11}$.

Lemma 5.1. A characteristic curve $\chi$ satisfies the following third order nonlinear ODE:
\[
\frac{d}{dt}\left(\chi^\nu \frac{d^2\chi}{dt^2}\right) = h,
\]
subject to initial conditions
\[
\chi(\alpha, 0) = \alpha, \qquad \frac{d\chi(\alpha, 0)}{dt} = u_0(\alpha), \qquad \frac{d^2\chi(\alpha, 0)}{dt^2} = \phi_0'(\alpha).
\]
Proof. By definition of $\chi$, we have
\[
\frac{d^2\chi}{dt^2} = \frac{du}{dt} = \partial_r \phi = \frac{e}{\chi^\nu}, \qquad
\frac{d^3\chi}{dt^3} = \frac{1}{\chi^\nu}\frac{de}{dt} - \nu\,\frac{e}{\chi^{\nu+1}}\frac{d\chi}{dt}
= \frac{h}{\chi^\nu} - \frac{\nu}{\chi}\frac{d\chi}{dt}\frac{d^2\chi}{dt^2},
\]
and we have
\[
\chi(\alpha, 0) = \alpha, \qquad
\frac{d\chi(\alpha, 0)}{dt} = u(\chi(\alpha, 0), 0) = u_0(\alpha), \qquad
\frac{d^2\chi(\alpha, 0)}{dt^2} = \partial_r \phi(\chi(\alpha, 0), 0) = \phi_0'(\alpha).
\]
This completes the proof.


Let τ (α) be the time which the characteristic curve χ (α, ·) starting at r = α, t = 0 reaches a wall, i.e., χ (α, τ (α)) = 0,

u(r0 , τ (α)) = uw (τ (α)).

Lemma 5.2. Let α(t) be the solution to τ (α) = t, i.e., α(t) = τ −1 (t). Then the current h satisfies the relation: ˙ = h(t), r0ν u¨ w (t) − α ν ∂r2 φ0 (α(t))α(t)

0 < t < tE1 .

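Proof. Since

$$\frac{d}{dt}\left(\chi^\nu \frac{d^2\chi}{dt^2}\right) = h,$$

integration from $t = 0$ to $t = \tau(\alpha)$ yields

$$\chi^\nu(\alpha, \tau(\alpha))\,\frac{d^2\chi}{dt^2}(\alpha, \tau(\alpha)) - \alpha^\nu\,\frac{d^2\chi}{dt^2}(\alpha, 0) = \int_0^{\tau(\alpha)} h(\sigma)\,d\sigma. \tag{5.4}$$

On the other hand, since we have

$$\frac{d^2\chi}{dt^2}(\alpha, \tau(\alpha)) = \dot{u}_w(\tau(\alpha)), \qquad \chi^\nu(\alpha, \tau(\alpha)) = r_0^\nu, \qquad \frac{d^2\chi}{dt^2}(\alpha, 0) = \phi_0'(\alpha),$$

(5.4) becomes, with $\alpha = \alpha(t)$,

$$r_0^\nu\,\dot{u}_w(t) - \alpha^\nu \phi_0'(\alpha) = \int_0^t h(\sigma)\,d\sigma, \tag{5.5}$$

and we differentiate (5.5) with respect to $t$ to get

$$r_0^\nu\,\ddot{u}_w(t) - \alpha^\nu \phi_0''(\alpha)\,\dot{\alpha}(t) = h(t),$$

where $\dot{\alpha}(t) = (\tau'(\alpha(t)))^{-1}$. This completes the proof.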

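Remark 5.2. Combining Lemma 5.1 and Lemma 5.2, we obtain the equation for $\chi$:

$$\frac{d}{dt}\left(\chi^\nu \frac{d^2\chi}{dt^2}\right) = r_0^\nu\,\ddot{u}_w(t) - \alpha^\nu \phi_0''(\alpha(t))\,\dot{\alpha}(t).$$

Lemma 5.3. Along the characteristic curve $\chi(\alpha, t)$ we have, with $\Gamma(\alpha, t) = \partial_\alpha\chi(\alpha, t)$,

$$\partial_r u(\chi(\alpha, t), t) = \frac{\partial_t \Gamma(\alpha, t)}{\Gamma(\alpha, t)}, \qquad \rho(\chi(\alpha, t), t) = \frac{\rho_0(\alpha)}{\Gamma(\alpha, t)}, \qquad e(\chi(\alpha, t), t) = e_0(\alpha) + \int_0^t h(\sigma)\,d\sigma.$$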


Proof. (i) We differentiate the characteristic equation (5.3) with respect to $\alpha$ to get

$$\partial_t \partial_\alpha \chi(\alpha, t) = (\partial_r u)(\chi(\alpha, t), t) \cdot \partial_\alpha \chi(\alpha, t). \tag{5.6}$$

With $\Gamma = \partial_\alpha\chi$, Eq. (5.6) becomes

$$\frac{\partial_t \Gamma(\alpha, t)}{\Gamma(\alpha, t)} = \partial_r u(\chi(\alpha, t), t).$$

(ii) It follows from the conservation of mass equation in (5.1) and part (i) that

$$\frac{D\rho}{Dt}\,\Gamma + \rho\,\partial_t \Gamma = 0.$$

This yields

$$\frac{d}{d\sigma}\Big[\rho(\chi(\alpha, \sigma), \sigma)\,\Gamma(\alpha, \sigma)\Big] = 0. \tag{5.7}$$

Now we integrate (5.7) from $\sigma = 0$ to $\sigma = t$ along the characteristic curve $\chi(\alpha, t)$ to get

$$\rho(\chi(\alpha, t), t) = \frac{\rho_0(\alpha)}{\Gamma(\alpha, t)}.$$

Here we used $\Gamma(\alpha, 0) = 1$.

(iii) We integrate $\frac{De}{Dt} = h$ along a characteristic curve to obtain the third identity. This completes the proof.

Step 2. We use the implicitly given current $h$ to find the trajectory of the sheath edge $s(t)$, $t \in [0, t_{E1}]$, by solving

$$\ddot{s} = \frac{\dot{h}}{h}(\dot{s} + 1) - \frac{\nu(\dot{s} + 1)^2}{s},$$

subject to initial data

$$s(0) = s_0, \qquad \dot{s}(0) = 0.$$

Once $h$ and $s$ are known in the time interval $[0, t_{E1}]$, the sheath edge density $\rho_s$ is given as

$$\rho_s(t) = A - \int_0^t \frac{\nu h(\sigma)}{s(\sigma)}\,d\sigma.$$

Step 3. Construction of solutions on $R_s^{12} \cup R_s^{21}$. Recall that on the sheath edge,

$$u = -1, \qquad \rho = \rho_s, \qquad \phi = -\ln \rho_s + \nu \ln s.$$

Using these data and the method of characteristics, we obtain smooth solutions on $R_s^{12}$. Let $t_{E2}$ be the time when the characteristic $\chi(s(t_{E1}), \cdot)$ reaches the wall. The regions $R_s^{21}$ and $R_s^{22}$ are defined similarly as in Sect. 3.1. We repeat the above steps and glue all local smooth solutions together to get local weak solutions.


5.2. Dynamic behavior of a sheath edge. In this subsection, we study the short-time and large-time dynamics of a sheath edge governed by a second-order ODE, written as the first-order system (see Appendix B)

$$\dot{s} = -\frac{h}{\rho_s} - 1, \qquad \dot{\rho}_s = -\frac{\nu h}{s}, \tag{5.8}$$

subject to initial conditions

$$s(0) = s_0, \qquad \rho_s(0) = A,$$

or equivalently

$$\ddot{s} = \frac{\dot{h}(\dot{s} + 1)}{h} - \frac{\nu(\dot{s} + 1)^2}{s}, \tag{5.9}$$

subject to the following initial data

$$s(0) = s_0, \qquad \dot{s}(0) = 0.$$

Notice that

• Sheath acceleration is determined by the current, the sheath velocity and the curvature of the sheath edge.
• If $h(t)$ converges to $h_\infty$ monotonically as $t \to \infty$, then

$$\ddot{s} = -\frac{\nu(\dot{s} + 1)^2}{s} + o(1) < 0 \quad \text{as } t \to \infty,$$

so the sheath edge is concave for $t \gg 1$, which results in the return of the sheath edge to the boundary.

5.2.1. Short-time behavior of a sheath edge. In this part, we study the behavior of the sheath edge for $0 < t \ll 1$. Recall that the equation for the sheath edge is

$$\ddot{s} = \frac{\dot{h}(\dot{s} + 1)}{h} - \frac{\nu(\dot{s} + 1)^2}{s}, \qquad s(0) = s_0, \qquad \dot{s}(0) = 0.$$

We first introduce a new dependent variable $y$: $y(t) \equiv s(t) + t$ to get

$$\ddot{y} = \frac{\dot{h}}{h}\,\dot{y} - \frac{\nu \dot{y}^2}{y - t}, \qquad y(0) = y_0, \qquad \dot{y}(0) = 1.$$

For small $t \ll 1$, the above equation is approximated by

$$\ddot{y}_1 = \frac{\dot{h}}{h}\,\dot{y}_1 - \frac{\nu \dot{y}_1^2}{y_1}.$$

Here we used the fact that $y - t \approx y$. By straightforward calculations, we get

$$y_1(t) = s_0\left[1 - \frac{\nu + 1}{A s_0}\int_0^t h(s)\,ds\right]^{\frac{1}{\nu+1}}.$$


Here we again used $h(0) = -A$. Therefore, for small time $t$, the sheath edge is approximated by

$$s \approx s_0\left[1 - \frac{\nu + 1}{A s_0}\int_0^t h(s)\,ds\right]^{\frac{1}{\nu+1}} - t.$$

In the next proposition, we show concavity of the sheath edge near $t = 0$.

Proposition 5.1. Suppose the main assumptions B1–B4 hold. Then the sheath edge $s(t)$ is concave at $t = 0$:

$$\ddot{s}(0) = -\frac{\nu}{s_0} < 0.$$

Proof. By assumption B4, we have

$$h(0) = -A, \qquad \dot{h}(0) = 0.$$

It follows from (5.9) and $\dot{s}(0) = 0$ that

$$\ddot{s}(0) = \frac{\dot{h}(0)}{h(0)} - \frac{\nu}{s_0} = -\frac{\nu}{s_0} < 0.$$

Hence the sheath edge is concave at $t = 0$ and for small time $t \ll 1$, we have

$$s(t) = s_0 - \frac{\nu t^2}{2 s_0} + O(t^3) \quad \text{as } t \to 0+.$$

This completes the proof.
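The short-time expansion in Proposition 5.1 can be checked numerically. The following is only a minimal sketch: it integrates the system (5.8) with a classical fourth-order Runge-Kutta scheme and compares the result with $s_0 - \nu t^2/(2 s_0)$. The current `h` used here is a hypothetical choice satisfying $h(0) = -A$ and $\dot{h}(0) = 0$, and the function names and parameter values are illustrative assumptions, not taken from the paper.

```python
def h(t, A):
    # Hypothetical current with h(0) = -A and h'(0) = 0 (an assumption for illustration).
    return -A * (1.0 + t * t)


def rhs(t, s, rho, nu, A):
    # Right-hand side of (5.8): s' = -h/rho_s - 1, rho_s' = -nu*h/s.
    ht = h(t, A)
    return -ht / rho - 1.0, -nu * ht / s


def integrate(t_end, s0, A, nu, steps=1000):
    # Classical RK4 for the pair (s, rho_s) with s(0) = s0, rho_s(0) = A.
    dt = t_end / steps
    t, s, rho = 0.0, s0, A
    for _ in range(steps):
        k1s, k1r = rhs(t, s, rho, nu, A)
        k2s, k2r = rhs(t + dt / 2, s + dt / 2 * k1s, rho + dt / 2 * k1r, nu, A)
        k3s, k3r = rhs(t + dt / 2, s + dt / 2 * k2s, rho + dt / 2 * k2r, nu, A)
        k4s, k4r = rhs(t + dt, s + dt * k3s, rho + dt * k3r, nu, A)
        s += dt / 6 * (k1s + 2 * k2s + 2 * k3s + k4s)
        rho += dt / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
        t += dt
    return s, rho


def short_time_approx(t, s0, nu):
    # Leading-order expansion from Proposition 5.1.
    return s0 - nu * t * t / (2.0 * s0)


if __name__ == "__main__":
    s_num, _ = integrate(0.1, s0=2.0, A=1.0, nu=1)
    print(s_num, short_time_approx(0.1, 2.0, 1))
```

For small `t_end` the two printed values should differ only at order $t^3$; the size of that remainder depends on the chosen current.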

5.2.2. Large-time behavior of a sheath edge. In this part, we consider the large-time behavior of a sheath edge under the assumption that the sheath edge $s$ has a time-asymptotic location $s_\infty$ ($\ge r_0$).

Proposition 5.2. Suppose that the sheath edge $s$ converges to the time-asymptotic location $s_\infty$ monotonically. Then we have

$$\rho_s(t) = O\!\left(e^{\nu t / s_\infty}\right) \quad \text{as } t \to \infty.$$

Proof. Assume that the sheath edge $s$ converges to $s_\infty$ monotonically; then

$$\lim_{t \to \infty} \dot{s}(t) = 0.$$

(i) It follows from (5.8) that

$$\dot{\rho}_s \approx \frac{\nu}{s_\infty}\,\rho_s \quad \text{for } t \gg 1,$$

which results in

$$\rho_s(t) = O\!\left(e^{\nu t / s_\infty}\right).$$


(ii) From the first equation in (5.8), we have

$$h(t) + \rho_s(0)\,e^{\nu t / s_\infty} \to 0 \quad \text{as } t \to \infty,$$

so that

$$\frac{\dot{h}}{h} \to \frac{\nu}{s_\infty} \quad \text{as } t \to \infty.$$

Hence Eq. (5.9) is approximated for $t \gg 1$ by

$$\ddot{s} = \frac{\nu(\dot{s} + 1)}{s_\infty} - \frac{\nu(\dot{s} + 1)^2}{s}. \tag{5.10}$$

First notice that the approximate equation (5.10) has the special solutions

$$s(t) = s_\infty, \qquad s(t) = -t + \text{const}.$$

We look for solutions with the following ansatz: $s(t) = at + b$. Substitute this ansatz into (5.10) to get

$$0 = \frac{\nu}{s_\infty}(a + 1) - \frac{\nu(a + 1)^2}{at + b}.$$

If $a \neq -1$, then we divide the above equation by $a + 1$ to get $at + b = s_\infty(a + 1)$, which implies

$$a = 0, \qquad b = s_\infty.$$

Therefore the above special solutions are the only solutions which are linear in $t$.
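The claim about linear solutions of (5.10) can be illustrated by evaluating the residual of (5.10) on the ansatz $s(t) = at + b$, for which $\ddot{s} = 0$. This is a small sketch under that reading; the function name `residual` and the sample values are assumptions chosen only for illustration.

```python
def residual(a, b, t, nu, s_inf):
    # Residual of (5.10) for s(t) = a*t + b: since s'' = 0, the residual is
    # nu*(s'+1)/s_inf - nu*(s'+1)^2/s evaluated at time t.
    s = a * t + b
    s_dot = a
    return nu * (s_dot + 1.0) / s_inf - nu * (s_dot + 1.0) ** 2 / s


if __name__ == "__main__":
    nu, s_inf = 1, 3.0
    print(residual(0.0, s_inf, 1.0, nu, s_inf))   # stationary solution s = s_inf
    print(residual(-1.0, 5.0, 1.0, nu, s_inf))    # travelling solution s = -t + const
    print(residual(0.5, 1.0, 2.0, nu, s_inf))     # a generic line, not a solution
```

The first two residuals vanish identically in `t`, while a generic line leaves a nonzero residual, in line with the argument above.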

Remark 5.3. Based on Propositions 5.1 and 5.2, we conjecture that the qualitative dynamic behavior of the sheath edge is

$$s(t) = \begin{cases} s_0 - \dfrac{\nu t^2}{2 s_0}, & t \ll 1, \\ -t + \text{const}, & t = O(1), \\ s_\infty, & t \gg 1. \end{cases}$$

6. Conclusion

In this paper we have formulated an axiomatic definition of the plasma sheath edge separating an ion-electron plasma region into two subregions, a quasi-neutral region and a boundary layer. We then described the dynamics of the evolution of the sheath for the initial-boundary value problem in planar, cylindrical and spherical geometries. We have given existence theorems for the plasma-ion sheaths and time-asymptotic estimates for the sheath edge dynamics. Most important of all, however, is that our results are qualitatively consistent with the experimental data given in the papers of Kim et al. [21] and Riemann-Daube [28].


Appendix A. Formal Derivation of the Current Equation

In this appendix, we formally derive the current equation from the Euler-Poisson system (2.1) and the defining equations of the sheath edge in Definition 1.1:

$$\underbrace{-(\rho - \rho_e)\,\dot{s}(t)}_{\text{displacement current}} + \underbrace{\rho u - \rho_e u_e}_{\text{convection current}} = h(t). \tag{A.1}$$

First we differentiate the Poisson equation in (2.1) with respect to $t$ to get

$$\partial_r\left(\varepsilon^2 r^\nu \partial_r \partial_t \phi - \rho_e u_e + \rho u\right)(r, t) = 0. \tag{A.2}$$

We integrate the above equation (A.2) from the target to get the current equation:

$$\left(\varepsilon^2 r^\nu \partial_r \partial_t \phi + \rho u - \rho_e u_e\right)(r, t) = h(t). \tag{A.3}$$

Now, we will replace the first term on the left-hand side of (A.3) by a term involving the velocity of the sheath edge $s(t)$. We claim:

$$\varepsilon^2 r^\nu \partial_r \partial_t \phi = -(\rho - \rho_e)\,\dot{s}(t).$$

By Definition 1.1 of the sheath edge, we consider a surface of zero electric field:

$$\partial_r \phi(s(t), t) = 0. \tag{A.4}$$

We differentiate the above Eq. (A.4) with respect to $t$ to obtain

$$\partial_r \partial_t \phi(s(t), t) = -\partial_r^2 \phi(s(t), t)\,\dot{s}(t).$$

On the other hand, it follows from the Poisson equation that

$$\varepsilon^2\left(\nu r^{\nu-1} \partial_r \phi + r^\nu \partial_r^2 \phi\right) = \rho - \rho_e.$$

Since $\partial_r \phi(s(t), t) = 0$, the above equation yields

$$\varepsilon^2 r^\nu \partial_r^2 \phi(s(t), t) = (\rho - \rho_e)(s(t), t).$$

Hence on the level surface $\partial_r \phi(s(t), t) = 0$, we obtain the current equation:

$$-(\rho - \rho_e)\,\dot{s}(t) + \rho u - \rho_e u_e = h(t).$$

A moving sheath edge $S(t)$ and the boundary $r = r_0$ of a target form a capacitor, so the first term on the left-hand side of (A.1) represents a displacement current, which is needed to charge a capacitor, and the last two terms on the left-hand side denote the convection currents of ions and electrons respectively. We use the quasi-neutral relation $\rho = \rho_e$ in the quasi-neutral region $R_q$ and $\rho_e = 0$ in the sheath region $R_s$ to get

$$h(t) = \begin{cases} -\rho_s(t)\{1 + u_e(s(t)+)\}, & \text{in the quasi-neutral region}, \\ -\rho_s(t)\{\dot{s}(t) + 1\}, & \text{in the sheath region}, \end{cases}$$

where we used the sheath edge condition

$$u(s(t), t) = -1, \qquad \rho(s(t), t) = \rho_s(t).$$


Since the current $h$ is continuous across the sheath edge (M4 in Sect. 2), we have

$$-\rho_s(t)\{1 + u_e(s(t)+)\} = -\rho_s(t)\{\dot{s}(t) + 1\}.$$

As long as $\rho_s \neq 0$, we obtain

$$u_e(s(t)+) = \dot{s}(t),$$

which is exactly the relation obtained through the Rankine-Hugoniot jump condition applied to the continuity equation for the electron density [14]:

$$\dot{s}\,[\rho_e] = [\rho_e u_e] \quad \text{with } \rho_e(s(t)-) = 0,$$

where $[\cdot]$ denotes the jump across the sheath edge.

Appendix B. Derivation of 2nd Order ODE for a Sheath Edge

In this appendix, we derive a second order ODE for the sheath edge for cylindrical and spherical targets. It follows from the current equation (A.1) that

$$\rho_s \dot{s} + h + \rho_s = 0. \tag{B.1}$$

Since the sheath edge density $\rho_s(t)$ is not known a priori, we need to find $\rho_s$. By Definition 1.1 of a sheath edge $r = s(t)$, we have three defining equations:

$$\rho(s(t), t) = \rho_s(t), \qquad u(s(t), t) = -1, \qquad \partial_r \phi(s(t), t) = 0. \tag{B.2}$$

Below all partial derivatives are evaluated at the sheath edge $(s(t), t)$ as limits from the quasi-neutral region. We first differentiate the first and second equations in (B.2) with respect to $t$ to get

$$\partial_t \rho + \dot{s}\,\partial_r \rho = \dot{\rho}_s, \qquad \partial_t u + \dot{s}\,\partial_r u = 0. \tag{B.3}$$

On the other hand, the Euler-Poisson system (2.1) yields

$$\partial_t \rho - \partial_r \rho + \rho_s\,\partial_r u = 0, \qquad \partial_t u - \partial_r u = 0. \tag{B.4}$$

Here all partial derivatives are evaluated at $(s(t), t)$, and we used the fact that on a sheath edge:

$$u = -1, \qquad \rho = \rho_s, \qquad \partial_r \phi = 0.$$

We use the first equation of (B.3) and the first equation of (B.4) to obtain

$$-(\dot{s} + 1)\,\partial_r \rho + \rho_s\,\partial_r u = -\dot{\rho}_s.$$

Similarly, it follows from the second equations in (B.3) and (B.4) that

$$(\dot{s} + 1)\,\partial_r u = 0. \tag{B.5}$$


Since the sheath edge is non-characteristic ($\dot{s} \neq -1$), the above equation yields $\partial_r u(s(t)+, t) = 0$. In (B.5), we then have

$$\dot{\rho}_s(t) = (\dot{s}(t) + 1)\,\partial_r \rho(s(t), t). \tag{B.6}$$

Now we use the quasi-neutral relation ($\rho = r^\nu e^{-\phi}$) to obtain

$$\partial_r \rho = \frac{\nu \rho_s}{s}, \tag{B.7}$$

where we used the fact that the electric field vanishes at the sheath edge, i.e., $\partial_r \phi(s(t), t) = 0$. Substituting (B.7) into (B.6), we get

$$\dot{\rho}_s = \frac{\nu \rho_s}{s}(\dot{s} + 1) = -\frac{\nu h}{s}. \tag{B.8}$$

Combining (B.1) and (B.8), we obtain a coupled system for the motion of a sheath edge:

$$\dot{s} = -\frac{h}{\rho_s} - 1, \qquad \dot{\rho}_s = -\frac{\nu h}{s}, \tag{B.9}$$

subject to initial conditions

$$s(0) = s_0, \qquad \rho_s(0) = \rho_0(s_0).$$

We eliminate $\dot{\rho}_s$ in (B.9) to find a single second order ODE for $s$:

$$\ddot{s} = \frac{\dot{h}(\dot{s} + 1)}{h} - \frac{\nu(\dot{s} + 1)^2}{s}.$$

Here we used

$$\frac{1}{\rho_s} = -\frac{\dot{s} + 1}{h}.$$

Acknowledgement. The research of S.Y. Ha was partially supported by the NSF grant DMS-0203858 and the research of M. Slemrod was supported in part by the NSF grant DMS-0071463. In addition, we would like to thank Professor Pierre Degond for his valuable remarks on our work.


References

1. Ben Abdallah, N.: Convergence of the Child-Langmuir asymptotics of the Boltzmann equation of semiconductors. SIAM J. Math. Anal. 27, 92–109 (1996)
2. Ben Abdallah, N.: The Child-Langmuir regime for electron transport in a plasma including a background of positive ions. Math. Models Methods Appl. Sci. 4, 409–438 (1994)
3. Ben Abdallah, N., Degond, P., Markowich, P.: The quantum Child-Langmuir problem. Nonlinear Anal. 31, 629–648 (1998)
4. Ben Abdallah, N., Degond, P.: The Child-Langmuir law for the Boltzmann equation of semiconductors. SIAM J. Math. Anal. 26, 364–398 (1995)
5. Andrews, J.G., Varey, R.H.: Sheath growth in a low pressure plasma. The Physics of Fluids 14, 339–343 (1971)
6. Cipolla, J.W., Silevitch, M.B.: On the temporal development of a plasma sheath. Plasma Physics 25, 373–389 (1981)
7. Conrad, J.R., Radtke, J.L., Dodd, R.A., Worzaka, F.J., Tran, N.C.: Plasma source ion-implantation technique for surface modification of materials. J. Appl. Phys. 62, 4591–4596 (1987)
8. Cordier, S.: Global solutions to the isothermal Euler-Poisson plasma model. Appl. Math. Lett. 8, 19–24 (1995)
9. Cordier, S., Grenier, E.: Quasi-neutral limit of an Euler-Poisson system arising from plasma physics. Comm. Partial Differ. Eqs. 25, 1099–1113 (2000)
10. Degond, P., Jaffard, S., Poupaud, F., Raviart, P.A.: The Child-Langmuir asymptotics of the Vlasov-Poisson equation for cylindrical or spherically symmetric diodes. II. (Analysis of the reduced problem and determination of the Child-Langmuir current). Math. Methods Appl. Sci. 19, 313–340 (1996)
11. Degond, P., Jaffard, S., Poupaud, F., Raviart, P.A.: The Child-Langmuir asymptotics of the Vlasov-Poisson equation for cylindrical or spherically symmetric diodes. I. (Statement of the problem and basic estimates). Math. Methods Appl. Sci. 19, 287–312 (1996)
12. Degond, P., Raviart, P.A.: On a penalization of the Child-Langmuir emission condition for the one-dimensional Vlasov-Poisson equation. Asymptotic Analysis 6, 1–27 (1992)
13. Degond, P., Raviart, R.A.: An asymptotic analysis of the one-dimensional Vlasov-Poisson system: the Child-Langmuir law. Asymptotic Analysis 4, 187–214 (1991)
14. Degond, P., Parzani, C., Vignal, M.-H.: Un modèle d'expansion de plasma dans le vide. Submitted to Comptes Rendus Academie Sciences (Paris), Mathematical problems in mechanics
15. E, W., Rykov, Yu.G., Sinai, Ya.G.: Generalized variational principles, global weak solutions and behavior with random initial data for systems of conservation laws arising in adhesion particle dynamics. Commun. Math. Phys. 177, 349–380 (1996)
16. Engelberg, S., Liu, H., Tadmor, E.: Critical thresholds in Euler-Poisson equations. Indiana Univ. Math. J. 50, 109–157 (2001)
17. Franklin, R.N., Ockendon, J.R.: Asymptotic matching of plasma and sheath in an active low pressure discharge. J. Plasma Phys. 4, 3521–3528 (1970)
18. Gierling, J., Riemann, K.-U.: Comparison of a consistent theory of radio frequency sheaths with step models. J. Appl. Phys. 83, 3521–3528 (1988)
19. Godyak, V.A., Sternberg, N.: Dynamic model of the electrode sheaths in symmetrically driven rf discharges. Phys. Rev. A 42, 2299–2312 (1990)
20. Greengard, C., Raviart, P.A.: A boundary value problem for the stationary Vlasov-Poisson equations: The plane diode. Comm. Pure Appl. Math. 43, 473–507 (1990)
21. Kim, Y.-W., Kim, G.-H., Han, S., Lee, Y., Cho, J., Rhee, S.-Y.: Measurement of sheath expansion in plasma source ion implantation. Surface and Coatings Technology 136, 97–101 (2001)
22. Lax, P.D.: Hyperbolic systems of conservation laws II. Comm. Pure Appl. Math. 10, 537–566 (1957)
23. Lieberman, M.A., Lichtenberg, A.J.: Principles of plasma discharges and materials processing. New York: Wiley, 1994
24. Liu, H., Slemrod, M.: KdV dynamics in the plasma-sheath transition. Submitted to Appl. Math. Lett.
25. Liu, H., Tadmor, E.: Spectral dynamics of the velocity gradient field in restricted flows. Commun. Math. Phys. 228, 435–466 (2002)
26. Poupaud, F., Rascle, M., Vila, J.P.: Global solutions to the isothermal Euler-Poisson system with arbitrarily large data. J. Differ. Eqs. 123, 93–121 (1995)
27. Riemann, K.-U.: The Bohm criterion and sheath formation. J. Phys. D: Appl. Phys. 24, 493–518 (1991)
28. Riemann, K.-U., Daube, Th.: Analytical model of the relaxation of a collisionless ion matrix sheath. J. Appl. Phys. 86, 1201–1207 (1999)
29. Slemrod, M., Sternberg, N.: Quasi-neutral limit for Euler-Poisson system. J. Nonlinear Sci. 11, 193–209 (2001)
30. Slemrod, M.: Shadowing and the plasma-sheath transition layer. J. Nonlinear Sci. 11, 397–414 (2001)
31. Slemrod, M.: Monotone increasing solutions of the Painlevé 1 equation $y'' = y^2 + x$ and their role in the stability of the plasma-sheath transition. Eur. J. Appl. Math. 13, 663–680 (2002)
32. Slemrod, M.: The radio-frequency driven plasma sheath: Asymptotics and analysis. To appear in SIAM J. Appl. Math.
33. Sternberg, N., Godyak, V.A.: Solving the mathematical model of the electrode sheath in symmetrically driven rf discharges. J. Comput. Phys. 111, 347–353 (1994)
34. Widner, M., Alexeff, I., Jones, W.D.: Ion acoustic wave excitation and ion sheath evolution. Phys. of Fluids 13, 2532–2540 (1970)

Communicated by P. Constantin

Commun. Math. Phys. 238, 187–209 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0855-z

Communications in

Mathematical Physics

Virtual Crystals and Kleber’s Algorithm Masato Okado1 , Anne Schilling2 , Mark Shimozono3 1

Department of Informatics and Mathematical Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan. E-mail: [email protected] 2 Department of Mathematics, University of California, One Shields Avenue, Davis, CA 95616-8633, USA. E-mail: [email protected] 3 Department of Mathematics, 460 McBryde Hall, Virginia Tech, Blacksburg, VA 24061-0123, USA. E-mail: [email protected] Received: 10 September 2002 / Accepted: 22 January 2003 Published online: 7 May 2003 – © Springer-Verlag 2003

Abstract: Kirillov and Reshetikhin conjectured what is now known as the fermionic formula for the decomposition of tensor products of certain finite dimensional modules over quantum affine algebras. This formula can also be extended to the case of q-deformations of tensor product multiplicities as recently conjectured by Hatayama et al. In its original formulation it is difficult to compute the fermionic formula efficiently. Kleber found an algorithm for the simply-laced algebras which overcomes this problem. We present a method which reduces all other cases to the simply-laced case using embeddings of affine algebras. This is the fermionic analogue of the virtual crystal construction by the authors, which is the realization of crystal graphs for arbitrary quantum affine algebras in terms of those of simply-laced type.

1. Introduction

In 1987 Kirillov and Reshetikhin [24] conjectured a formula, now known as the fermionic formula, for the decomposition of tensor products of certain finite dimensional representations over an untwisted quantum affine algebra $U_q(\mathfrak{g})$ into its $U_q(\overline{\mathfrak{g}})$ components, where $\overline{\mathfrak{g}}$ is the simple Lie algebra associated with the affine Kac-Moody algebra $\mathfrak{g}$. The conjecture is motivated by Bethe Ansatz studies. Recently, conjectures for fermionic formulas have been extended to q-deformations of tensor product multiplicities [7, 6]. In type $A_n^{(1)}$ this q-tensor multiplicity formula appeared in [25]. For a single tensor factor, the fermionic formula gives the $\overline{\mathfrak{g}}$-isotypical components of a $U_q(\mathfrak{g})$-module associated with a multiple of a fundamental weight. This conjecture was proven by Chari [3] in a number of cases. Recently, Nakajima [27] showed in the simply-laced case that the characters of such modules satisfy a certain system of algebraic relations (Q-system). Combined with the result of [7], his result completes the proof of a “weak” version of the q = 1 fermionic formula in this case.

The term fermionic formula was coined by the Stony Brook group [18, 19], who interpreted fermionic-type formulas for characters and branching functions of conformal field


theory models as partition functions of quasiparticle systems with “fractional” statistics obeying Pauli’s exclusion principle. Fermionic formulas are q-polynomials or q-series expressed as certain sums of products of q-binomial coefficients

$$\sum_{\{m_i^{(a)}\}} q^{cc(\{m_i^{(a)}\})} \prod_{i,a} \begin{bmatrix} m_i^{(a)} + p_i^{(a)} \\ m_i^{(a)} \end{bmatrix},$$

where $\begin{bmatrix} m + p \\ m \end{bmatrix} = (q)_{m+p}/((q)_m (q)_p)$ is the q-binomial coefficient with $(q)_m = \prod_{i=1}^{m}(1 - q^i)$, $cc(\{m_i^{(a)}\})$ is some function of the summation variables $m_i^{(a)}$, and $p_i^{(a)}$ is the vacancy number (see (5.2)). The summation variables are subject to constraints (4.1). Those sets $\{m_i^{(a)}\}$ satisfying (4.1) are called admissible configurations. From (4.1) alone, it is computationally difficult to find the admissible configurations, making the evaluation of the fermionic formula intractable. For simply-laced algebras $\mathfrak{g}$, Kleber [16, 17] has given an efficient algorithm to determine the admissible configurations $\{m_i^{(a)}\}$. This algorithm generates a rooted tree with nodes labelled by dominant integral weights such that the tree nodes are in bijection with the admissible configurations. For non-simply laced algebras, the algorithm fails: some admissible nodes cannot be reached.

One of our goals in this paper is to modify Kleber’s algorithm to work in all types. This is accomplished by using the well-known natural embeddings of any affine algebra into another of simply-laced type [9]:

$$\begin{aligned} C_n^{(1)},\ A_{2n}^{(2)},\ A_{2n}^{(2)\dagger},\ D_{n+1}^{(2)} &\hookrightarrow A_{2n-1}^{(1)} \\ A_{2n-1}^{(2)},\ B_n^{(1)} &\hookrightarrow D_{n+1}^{(1)} \\ E_6^{(2)},\ F_4^{(1)} &\hookrightarrow E_6^{(1)} \\ D_4^{(3)},\ G_2^{(1)} &\hookrightarrow D_4^{(1)}. \end{aligned} \tag{1.1}$$
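As an aside on the q-binomial coefficients that enter the fermionic formula above, the following minimal sketch computes $\begin{bmatrix} n \\ k \end{bmatrix}$ as a list of integer coefficients in $q$. It uses the standard q-Pascal recurrence $\begin{bmatrix} n \\ k \end{bmatrix} = \begin{bmatrix} n-1 \\ k-1 \end{bmatrix} + q^k \begin{bmatrix} n-1 \\ k \end{bmatrix}$ rather than the quotient $(q)_{m+p}/((q)_m (q)_p)$; the function name `q_binomial` is an assumption for illustration, and the sketch does not implement the fermionic formula itself.

```python
def q_binomial(n, k):
    """Coefficient list (lowest degree first) of the q-binomial [n choose k]."""
    if k < 0 or k > n:
        return [0]
    # row[j] holds the polynomial [m choose j] for the current m, starting at m = 0.
    row = [[1]]
    for m in range(1, n + 1):
        new_row = []
        for j in range(m + 1):
            # q-Pascal recurrence: [m, j] = [m-1, j-1] + q^j * [m-1, j]
            left = row[j - 1] if j >= 1 else [0]
            right = ([0] * j + row[j]) if j <= m - 1 else [0]
            poly = [0] * max(len(left), len(right))
            for i, c in enumerate(left):
                poly[i] += c
            for i, c in enumerate(right):
                poly[i] += c
            new_row.append(poly)
        row = new_row
    return row[k]


if __name__ == "__main__":
    # In the notation of the text, [m+p choose m] is q_binomial(m + p, m).
    print(q_binomial(4, 2))
```

Setting $q = 1$, that is, summing the coefficients, recovers the ordinary binomial coefficient, which gives a quick sanity check.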

It is not hard to express the fermionic formula of the smaller algebra in terms of the larger; we call this the virtual fermionic formula. Our algorithm is an adaptation of Kleber’s algorithm in the simply-laced affine algebra, which trims the tree so as not to generate nodes that cannot contribute to the virtual fermionic formula. This algorithm succeeds by using some nodes in the larger weight lattice that do not correspond to weights in the embedded weight lattice.

Fermionic formulas denoted M have crystal counterparts. Crystal bases were introduced by Kashiwara [12] and are bases of $U_q(\mathfrak{g})$-modules in the limit $q \to 0$. Let us denote the one-dimensional configuration sums, which are generating functions of highest weight elements in tensor products of finite dimensional crystals with energy statistics, by X. It was conjectured in [7, 6] that X = M.

In light of the embeddings of affine algebras (1.1), one might hope that such embeddings also exist for the quantized algebras. Unfortunately they do not. However we assert that such embeddings exist for all finite-dimensional affine crystals, and give a construction for them in terms of crystals of simply-laced type. A virtual crystal is such a realization of a crystal inside another of possibly different type. Perhaps the first instance of a virtual crystal is Kashiwara’s embedding of a crystal of highest weight $\lambda$ into that of highest weight $k\lambda$, where $k$ is a positive integer [14]. Extending Baker’s work [2], in [29] we conjectured that finite dimensional crystals of type $C_n^{(1)}$, $A_{2n}^{(2)}$, and $D_{n+1}^{(2)}$ can be realized in terms of crystals of type $A_{2n-1}^{(1)}$. We proved this for crystals associated with single columns (i.e. fundamental weights).


In this paper we establish the correctness of the virtual crystal approach for crystals associated with single rows (that is, multiples of the first fundamental weight) for the two infinite families of embeddings. The paper is organized as follows. In Sect. 2 we review the essentials of crystal theory. Virtual crystals are introduced in Sect. 3 and the characterization and validity of virtual crystals associated with single rows is proven. Sections 4 and 5 review the fermionic formulas conjectured in [7, 6] and the Kleber algorithm, respectively, and describe their virtual counterparts.

2. Crystals

2.1. Affine algebras. We adopt the notation of [6]. Let $\mathfrak{g}$ be a Kac-Moody Lie algebra of affine type $X_N^{(r)}$, that is, one of the types $A_n^{(1)}$ ($n \ge 1$), $B_n^{(1)}$ ($n \ge 3$), $C_n^{(1)}$ ($n \ge 2$), $D_n^{(1)}$ ($n \ge 4$), $E_n^{(1)}$ ($n = 6, 7, 8$), $F_4^{(1)}$, $G_2^{(1)}$, $A_{2n}^{(2)}$ ($n \ge 1$), $A_{2n}^{(2)\dagger}$ ($n \ge 1$), $A_{2n-1}^{(2)}$ ($n \ge 2$), $D_{n+1}^{(2)}$ ($n \ge 2$), $E_6^{(2)}$ or $D_4^{(3)}$. The Dynkin diagram of $\mathfrak{g} = X_N^{(r)}$ is depicted in Fig. 1 (Table Aff 1-3 in [10]). Its nodes are labelled by the set $I = \{0, 1, 2, \ldots, n\}$. Let $\overline{I} = I \setminus \{0\}$. Every affine algebra $\mathfrak{g}$ has a simple Lie subalgebra $\overline{\mathfrak{g}}$ obtained by removing the 0-node from the Dynkin diagram. This is summarized in the following table:

$$\begin{array}{c|ccccccc} \mathfrak{g} & X_n^{(1)} & A_{2n}^{(2)} & A_{2n}^{(2)\dagger} & A_{2n-1}^{(2)} & D_{n+1}^{(2)} & E_6^{(2)} & D_4^{(3)} \\ \hline \overline{\mathfrak{g}} & X_n & C_n & B_n & C_n & B_n & F_4 & G_2 \end{array} \tag{2.1}$$

Let $\alpha_i$, $h_i$, $\Lambda_i$ ($i \in I$) be the simple roots, simple coroots, and fundamental weights of $\mathfrak{g}$. Let $\delta$ and $c$ denote the generator of imaginary roots and the canonical central element, respectively. Recall that $\delta = \sum_{i \in I} a_i \alpha_i$ and $c = \sum_{i \in I} a_i^\vee h_i$, where the Kac labels $a_i$ are the unique set of relatively prime positive integers giving the linear dependency of the columns of the Cartan matrix $A$ (that is, $A\,(a_0, \ldots, a_n)^t = 0$). Explicitly,

$$\delta = \begin{cases} \alpha_0 + \cdots + \alpha_n & \text{if } \mathfrak{g} = A_n^{(1)} \\ \alpha_0 + \alpha_1 + 2\alpha_2 + \cdots + 2\alpha_n & \text{if } \mathfrak{g} = B_n^{(1)} \\ \alpha_0 + 2\alpha_1 + \cdots + 2\alpha_{n-1} + \alpha_n & \text{if } \mathfrak{g} = C_n^{(1)} \\ \alpha_0 + \alpha_1 + 2\alpha_2 + \cdots + 2\alpha_{n-2} + \alpha_{n-1} + \alpha_n & \text{if } \mathfrak{g} = D_n^{(1)} \\ \alpha_0 + \alpha_1 + 2\alpha_2 + 3\alpha_3 + 2\alpha_4 + \alpha_5 + 2\alpha_6 & \text{if } \mathfrak{g} = E_6^{(1)} \\ \alpha_0 + 2\alpha_1 + 3\alpha_2 + 4\alpha_3 + 3\alpha_4 + 2\alpha_5 + \alpha_6 + 2\alpha_7 & \text{if } \mathfrak{g} = E_7^{(1)} \\ \alpha_0 + 2\alpha_1 + 3\alpha_2 + 4\alpha_3 + 5\alpha_4 + 6\alpha_5 + 4\alpha_6 + 2\alpha_7 + 3\alpha_8 & \text{if } \mathfrak{g} = E_8^{(1)} \\ \alpha_0 + 2\alpha_1 + 3\alpha_2 + 4\alpha_3 + 2\alpha_4 & \text{if } \mathfrak{g} = F_4^{(1)} \\ \alpha_0 + 2\alpha_1 + 3\alpha_2 & \text{if } \mathfrak{g} = G_2^{(1)} \\ 2\alpha_0 + 2\alpha_1 + \cdots + 2\alpha_{n-1} + \alpha_n & \text{if } \mathfrak{g} = A_{2n}^{(2)} \\ \alpha_0 + 2\alpha_1 + \cdots + 2\alpha_{n-1} + 2\alpha_n & \text{if } \mathfrak{g} = A_{2n}^{(2)\dagger} \\ \alpha_0 + \alpha_1 + 2\alpha_2 + \cdots + 2\alpha_{n-1} + \alpha_n & \text{if } \mathfrak{g} = A_{2n-1}^{(2)} \\ \alpha_0 + \alpha_1 + \cdots + \alpha_{n-1} + \alpha_n & \text{if } \mathfrak{g} = D_{n+1}^{(2)} \\ \alpha_0 + 2\alpha_1 + 3\alpha_2 + 2\alpha_3 + \alpha_4 & \text{if } \mathfrak{g} = E_6^{(2)} \\ \alpha_0 + 2\alpha_1 + \alpha_2 & \text{if } \mathfrak{g} = D_4^{(3)}. \end{cases} \tag{2.2}$$
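The defining property of the Kac labels can be checked directly on a small example. The sketch below takes type $C_2^{(1)}$, for which (2.2) gives $(a_0, a_1, a_2) = (1, 2, 1)$, and verifies $A\,a^t = 0$ together with the row dependency for the dual labels. The Cartan matrix written here and the dual labels $(1, 1, 1)$ are assumptions based on one common convention; with the transposed convention the roles of the two label vectors swap. The helper `t_values` assumes the reading $t_i = \max(a_i/a_i^\vee, a_0^\vee)$, $t_i^\vee = \max(a_i^\vee/a_i, a_0)$ of the numbers attached to the Dynkin diagrams.

```python
from fractions import Fraction

# Assumed Cartan matrix of type C_2^(1) with nodes 0, 1, 2 (convention-dependent).
CARTAN_C2_1 = [
    [2, -1, 0],
    [-2, 2, -2],
    [0, -1, 2],
]


def mat_vec(matrix, vec):
    # Matrix times column vector: tests the column dependency A a^t = 0.
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def vec_mat(vec, matrix):
    # Row vector times matrix: tests the row dependency a^vee A = 0.
    cols = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec))) for j in range(cols)]


def t_values(a, a_dual):
    # Assumed reading: t_i = max(a_i / a_i^vee, a_0^vee), t_i^vee = max(a_i^vee / a_i, a_0).
    t = [max(Fraction(a[i], a_dual[i]), Fraction(a_dual[0])) for i in range(len(a))]
    t_dual = [max(Fraction(a_dual[i], a[i]), Fraction(a[0])) for i in range(len(a))]
    return t, t_dual


if __name__ == "__main__":
    kac = [1, 2, 1]        # from (2.2) for C_n^(1) with n = 2
    kac_dual = [1, 1, 1]   # assumed dual labels for C_2^(1)
    print(mat_vec(CARTAN_C2_1, kac), vec_mat(kac_dual, CARTAN_C2_1))
    print(t_values(kac, kac_dual))
```

Both products should be the zero vector, which is exactly the linear dependency of columns (for $a_i$) and of rows (for $a_i^\vee$).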


Fig. 1. Dynkin diagrams for $X_N^{(r)}$. The enumeration of the nodes with $I = \{0, 1, \ldots, n\}$ is specified under or to the right side of the nodes. In addition, the numbers $t_i$ (resp. $t_i^\vee$) defined in (2.3) are attached above the nodes for $r = 1$ (resp. $r > 1$) if and only if $t_i \neq 1$ (resp. $t_i^\vee \neq 1$)

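The dual Kac label $a_i^\vee$ is the label $a_i$ for the affine Dynkin diagram obtained by “reversing the arrows” of the Dynkin diagram of $\mathfrak{g}$, or equivalently, the coefficients giving the linear dependency of the rows of the Cartan matrix $A$.

Let $P = \bigoplus_{a \in I} \mathbb{Z}\Lambda_a \oplus \mathbb{Z}\delta$ be the weight lattice of $\mathfrak{g}$ and $P^+ = \bigoplus_{a \in I} \mathbb{Z}_{\ge 0}\Lambda_a$. Similarly, let $\overline{P} = \bigoplus_{a \in \overline{I}} \mathbb{Z}\overline{\Lambda}_a$ be the weight lattice of $\overline{\mathfrak{g}}$, $\overline{P}^+ = \bigoplus_{a \in \overline{I}} \mathbb{Z}_{\ge 0}\overline{\Lambda}_a$, $\overline{Q} = \bigoplus_{a \in \overline{I}} \mathbb{Z}\overline{\alpha}_a$ the root lattice of $\overline{\mathfrak{g}}$ and $\overline{Q}^+ = \bigoplus_{a \in \overline{I}} \mathbb{Z}_{\ge 0}\overline{\alpha}_a$, with simple roots and fundamental weights $\overline{\alpha}_a$, $\overline{\Lambda}_a$ for $a \in \overline{I}$. For $\lambda, \mu \in \overline{P}$ write $\lambda \trianglerighteq \mu$ if $\lambda - \mu \in \overline{Q}^+$. For $i \in I$ let

$$t_i = \max\left(\frac{a_i}{a_i^\vee},\ a_0^\vee\right), \qquad t_i^\vee = \max\left(\frac{a_i^\vee}{a_i},\ a_0\right). \tag{2.3}$$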


The values $t_i$ are given in Fig. 1. We shall only use $t_i^\vee$ and $t_i$ for $i \in \overline{I}$. For $a \in \overline{I}$ we have

$$t_a^\vee = 1 \ \text{ if } r = 1, \qquad t_a = a_0^\vee \ \text{ if } r > 1.$$

Let $(\cdot|\cdot)$ be the normalized invariant form on $P$ [10]. It satisfies

$$(\alpha_i | \alpha_j) = \frac{a_i^\vee}{a_i}\,A_{ij} \tag{2.4}$$

for $i, j \in I$. In particular

$$(\alpha_a | \alpha_a) = \frac{2r}{a_0^\vee} \tag{2.5}$$

if $\alpha_a$ is a long root.

2.2. Crystals. The quantized universal enveloping algebra $U_q(\mathfrak{g})$ associated with a symmetrizable Kac–Moody Lie algebra $\mathfrak{g}$ was introduced independently by Drinfeld [4] and Jimbo [8] in their study of two dimensional solvable lattice models in statistical mechanics. The parameter $q$ corresponds to the temperature of the underlying model. Kashiwara [11] showed that at zero temperature or $q = 0$ the representations of $U_q(\mathfrak{g})$ have bases, which he coined crystal bases, with a beautiful combinatorial structure and favorable properties such as uniqueness and stability under tensor products.

Let $\mathfrak{g}'$ be the derived subalgebra of $\mathfrak{g}$. Denote the corresponding quantized universal enveloping algebras of $\mathfrak{g} \supset \mathfrak{g}' \supset \overline{\mathfrak{g}}$ by $U_q(\mathfrak{g}) \supset U_q'(\mathfrak{g}) \supset U_q(\overline{\mathfrak{g}})$. In [7, 6] it is conjectured that there is a family of finite-dimensional irreducible $U_q'(\mathfrak{g})$-modules $\{W_i^{(a)} \mid a \in \overline{I},\ i \in \mathbb{Z}_{>0}\}$ which, unlike most finite-dimensional $U_q'(\mathfrak{g})$-modules, have crystal bases $B^{a,i}$. This family is conjecturally characterized in several different ways:

(1) Its characters form the unique solutions of a system of quadratic relations (the Q-system) [24].
(2) Every crystal graph of an irreducible integrable finite-dimensional $U_q'(\mathfrak{g})$-module is a tensor product of the $B^{a,i}$.
(3) For $\lambda \in P$ let $V(\lambda)$ be the universal extremal weight module defined in [15, Sect. 3] and $B(\lambda)$ its crystal base, with unique vector $u_\lambda \in B(\lambda)$ of weight $\lambda$. Then the affinization of $B^{a,i}$ (in the sense of [22]) is isomorphic to the connected component of $u_\lambda$ in $B(\lambda)$, for the weight $\lambda = i\overline{\Lambda}_a$.

In light of point (2) above, we consider the category of crystal graphs given by tensor products of the crystals $B^{a,i}$. We introduce notation for tensor products of $B^{a,i}$. Let

$$B = \bigotimes_{(a,i) \in \overline{I} \times \mathbb{Z}_{>0}} (B^{a,i})^{\otimes L_i^{(a)}}, \tag{2.6}$$

where only finitely many $L_i^{(a)}$ are nonzero. In type $A_n^{(1)}$ this is the tensor product of modules, which, when restricted to $A_n$, are irreducible modules indexed by rectangular


partitions. The set of classically restricted paths (or classical highest weight vectors) in $B$ of weight $\lambda \in \overline{P}^+$ is by definition

$$\mathcal{P}(B, \lambda) = \{b \in B \mid \mathrm{wt}(b) = \lambda \text{ and } e_i b \text{ undefined for all } i \in \overline{I}\}.$$

Here $e_i$ is given by the crystal graph. For $b, b' \in B^{a,i}$ we have $b' = e_i(b)$ if there is an arrow $b' \xrightarrow{\ i\ } b$ in the crystal graph; if no such arrow exists then $e_i(b)$ is undefined. Similarly, $b' = f_i(b)$ if there is an arrow $b \xrightarrow{\ i\ } b'$ in the crystal graph; if no such arrow exists then $f_i(b)$ is undefined. If $B_1$ and $B_2$ are crystals, then for $b_1 \otimes b_2 \in B_1 \otimes B_2$ the action of $e_i$ is defined as

$$e_i(b_1 \otimes b_2) = \begin{cases} e_i b_1 \otimes b_2 & \text{if } \varepsilon_i(b_1) > \varphi_i(b_2), \\ b_1 \otimes e_i b_2 & \text{else}, \end{cases}$$

where $\varepsilon_i(b) = \max\{k \mid e_i^k b \text{ is defined}\}$ and $\varphi_i(b) = \max\{k \mid f_i^k b \text{ is defined}\}$. This is the opposite of the notation used by Kashiwara [11].

2.3. Simple crystals. Let $W$ be the Weyl group of $\mathfrak{g}$, $\{s_i \mid i \in I\}$ the simple reflections in $W$. Let $B$ be the crystal graph of an integrable $U_q'(\mathfrak{g})$-module. Say that $b \in B$ is an extremal vector of weight $\lambda \in P$ provided that $\mathrm{wt}(b) = \lambda$ and there exists a family of elements $\{b_w \mid w \in W\} \subset B$ such that

(1) $b_w = b$ for $w = e$.
(2) If $\langle h_i, w\lambda \rangle \ge 0$ then $e_i(b_w) = \emptyset$ and $f_i^{\langle h_i, w\lambda \rangle}(b_w) = b_{s_i w}$.
(3) If $\langle h_i, w\lambda \rangle \le 0$ then $f_i(b_w) = \emptyset$ and $e_i^{-\langle h_i, w\lambda \rangle}(b_w) = b_{s_i w}$.

Following [1], say that a $U_q'(\mathfrak{g})$-crystal $B$ is simple if

(1) $B$ is the crystal base of a finite dimensional integrable $U_q'(\mathfrak{g})$-module.
(2) There is a weight $\lambda \in \overline{P}^+$ such that $B$ has a unique vector (denoted $u(B)$) of weight $\lambda$, and the weight of any extremal vector of $B$ is contained in $\overline{W}\lambda$, where $\overline{W}$ is the Weyl group of $\overline{\mathfrak{g}}$.

In the definition of simple crystal in [1], Condition 1 is not present. However we always want to assume both conditions, so it is convenient to include Condition 1 in the definition above.

Theorem 2.1 ([1]). (1) Simple crystals are connected. (2) The tensor product of simple crystals is simple.

For the $U_q'(\mathfrak{g})$-crystal $B$, define $\varepsilon, \varphi : B \to P$ by

$$\varepsilon(b) = \sum_{i \in I} \varepsilon_i(b)\,\Lambda_i \quad \text{and} \quad \varphi(b) = \sum_{i \in I} \varphi_i(b)\,\Lambda_i.$$

Then the level of $B$ is

$$\mathrm{lev}(B) = \min\{\langle c, \varepsilon(b) \rangle \mid b \in B\}. \tag{2.7}$$
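To make the tensor product rule of Sect. 2.2 concrete, here is a minimal sketch on the smallest possible example: a two-element crystal with a single arrow $1 \to 2$ (the vector representation in type $A_1$), tensored with itself. The dictionaries `E` and `F` and all function names are illustrative assumptions; only the rule "act on the left factor if $\varepsilon_i(b_1) > \varphi_i(b_2)$, otherwise on the right factor" is taken from the text.

```python
from itertools import product

# Two-element crystal with one arrow 1 -> 2: f(1) = 2 and e(2) = 1.
E = {2: 1}
F = {1: 2}
ELEMENTS = (1, 2)


def eps(b):
    # epsilon(b): number of times e can be applied to b.
    count = 0
    while b in E:
        b = E[b]
        count += 1
    return count


def phi(b):
    # phi(b): number of times f can be applied to b.
    count = 0
    while b in F:
        b = F[b]
        count += 1
    return count


def e_tensor(b1, b2):
    # Tensor product rule of the text: act on the left factor if
    # eps(b1) > phi(b2), otherwise on the right factor.
    # Returns None when e is undefined on b1 (x) b2.
    if eps(b1) > phi(b2):
        return (E[b1], b2)
    if b2 in E:
        return (b1, E[b2])
    return None


def highest_weight_elements():
    # Elements of the two-fold tensor product killed by e.
    return [(b1, b2) for b1, b2 in product(ELEMENTS, repeat=2)
            if e_tensor(b1, b2) is None]


if __name__ == "__main__":
    print(highest_weight_elements())
```

With this convention the four-element tensor product splits into a chain of three elements and a single isolated element, so there are two elements on which $e$ is undefined.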


2.4. Dual crystals. The notion of a dual crystal is given in [13, Sect. 7.4]. Let $B$ be a $U_q(\mathfrak{g})$-crystal. Then there is a $U_q(\mathfrak{g})$-crystal denoted $B^\vee$ obtained from $B$ by reversing arrows. That is, $B^\vee = \{b^\vee \mid b \in B\}$ with
$$\begin{aligned}
\mathrm{wt}(b^\vee) &= -\mathrm{wt}(b), \\
\varepsilon_i(b^\vee) &= \varphi_i(b), \qquad \varphi_i(b^\vee) = \varepsilon_i(b), \\
e_i(b^\vee) &= (f_i(b))^\vee, \qquad f_i(b^\vee) = (e_i(b))^\vee.
\end{aligned} \tag{2.8}$$

Proposition 2.2 ([13]). There is an isomorphism $(B_2 \otimes B_1)^\vee \cong B_1^\vee \otimes B_2^\vee$ given by $(b_2 \otimes b_1)^\vee \mapsto b_1^\vee \otimes b_2^\vee$.

2.5. One dimensional sums. In this section we recall the structure of a $U_q(\mathfrak{g})$-crystal as a graded $U_q(\mathfrak{g})$-crystal. The grading is given by the intrinsic energy function $D : B \to \mathbb{Z}$. For $b \in B$, one may define $D(b)$ as the minimum number of times $e_0$ occurs in a sequence of operators involving $e_i$, $f_i$ for $i \in I$ and $e_0$, leading from $u(B)$ to $b$. However we prefer to work with the following concrete definition when $B$ is a tensor product of crystals of the form $B^{r,s}$. This definition essentially comes from [6], but it is useful to formulate it as follows [29].

Let $B_1$, $B_2$ be simple $U_q(\mathfrak{g})$-crystals. It was shown in [22, Sect. 4] that there is a unique isomorphism of $U_q(\mathfrak{g})$-crystals $R = R_{B_2,B_1} : B_2 \otimes B_1 \to B_1 \otimes B_2$, called the combinatorial R matrix. In addition there exists a function $H : B_2 \otimes B_1 \to \mathbb{Z}$ called the local energy function, that is unique up to a global additive constant, which is constant on $I$ components and satisfies, for all $b_2 \in B_2$ and $b_1 \in B_1$ with $R(b_2 \otimes b_1) = b_1' \otimes b_2'$,
$$H(e_0(b_2 \otimes b_1)) = H(b_2 \otimes b_1) + \begin{cases}
-1 & \text{if } \varepsilon_0(b_2) > \varphi_0(b_1) \text{ and } \varepsilon_0(b_1') > \varphi_0(b_2'), \\
1 & \text{if } \varepsilon_0(b_2) \le \varphi_0(b_1) \text{ and } \varepsilon_0(b_1') \le \varphi_0(b_2'), \\
0 & \text{otherwise.}
\end{cases} \tag{2.9}$$

We shall normalize the local energy function by the condition $H(u(B_2) \otimes u(B_1)) = 0$. It was conjectured in [6] that
$$\varphi(b^\natural) = \operatorname{lev}(B^{r,s})\,\Lambda_0 \quad \text{for a unique } b^\natural \in B^{r,s}. \tag{2.10}$$
For a given crystal $B^{r,s}$, denote this element also by $u^\natural(B^{r,s})$. Define the function $D_{B^{r,s}} : B^{r,s} \to \mathbb{Z}$ by
$$D_{B^{r,s}}(b) = H(b \otimes b^\natural) - H(u(B^{r,s}) \otimes b^\natural), \tag{2.11}$$
where $H = H_{B^{r,s},B^{r,s}}$ is the local energy function. In all cases in which the $U_q(\mathfrak{g})$-module $W_s^{(r)}$ and its crystal base $B^{r,s}$ have been constructed, (2.10) holds and (2.11) agrees with the explicit grading on $B^{r,s}$ specified in a case-by-case manner in the appendices of [7, 6].


A graded simple crystal $(B, D)$ is a simple crystal $B$ together with a function $D : B \to \mathbb{Z}$. Let $(B_j, D_j)$ be a graded simple $U_q(\mathfrak{g})$-crystal and $u_j = u(B_j)$, for $1 \le j \le L$. Let $B = B_L \otimes \cdots \otimes B_1$. Following [28] define the energy function $E_B : B \to \mathbb{Z}$ by
$$E_B = \sum_{1 \le i < j \le L} H_i R_{i+1} R_{i+2} \cdots R_{j-1}, \tag{2.12}$$
where $R_i$ is the combinatorial R-matrix and $H_i$ is the local energy function, and the subscript $i$ indicates that the operators act on the $i$th and $(i+1)$st tensor factors from the right. This given, define $\widetilde{D}_B, D_B : B \to \mathbb{Z}$ by
$$\begin{aligned}
\widetilde{D}_B &= E_B + \sum_{j=1}^{L} D_j R_1 R_2 \cdots R_{j-1}, \\
D_B(b) &= \widetilde{D}_B(b) - \widetilde{D}_B(u(B)),
\end{aligned} \tag{2.13}$$
where $D_j : B_j \to \mathbb{Z}$ acts on the rightmost tensor factor. Then we say that the graded simple crystal $(B, D_B)$ is the tensor product of the graded simple crystals $(B_j, D_j)$.

Theorem 2.3 ([29]). Graded simple $U_q(\mathfrak{g})$-crystals form a tensor category.

Now suppose that for all $j$, $B_j$ has the form $B^{r,s}$ and $D_j = D_{B^{r,s}}$ is the intrinsic energy as defined above. Then the function $D_B$ is called the intrinsic energy of $B$. Let $b_j^\natural \in B_j$ be as in (2.10). Then conjecturally there is an element $b^\natural \in B$ such that $b_j^\natural$ is the leftmost tensor factor in $R_{L-1} \cdots R_{j+1} R_j\, b^\natural$. Using the Yang-Baxter equation for $R$ and the fact that $R_{B \otimes B}$ is the identity for any $B$, it follows that [6]
$$D_B(b) = H(b \otimes b^\natural) - H(u(B) \otimes b^\natural). \tag{2.14}$$

The one-dimensional sum $X(B, \lambda; q) \in \mathbb{Z}[q, q^{-1}]$ is the generating function of paths graded by the intrinsic energy
$$X(B, \lambda; q) = \sum_{b \in P(B,\lambda)} q^{D_B(b)}. \tag{2.15}$$

2.6. Crystals of type $B_n$, $C_n$, $D_n$. In this and the next section we will describe the classical highest weight crystals $B(s\Lambda_1)$ and the finite dimensional affine crystals $B^{1,s}$ for all nonexceptional types as weakly increasing words $b$ in an alphabet $\mathcal{X}$. They are also determined by $x(b) = (x_i)_{i \in \mathcal{X}}$, where $x_i$ is the number of $i$'s in $b$. Whenever an operation yields a negative value for an $x_i$ it will be undefined. According to [23], the crystal of $B(\Lambda_1)$ has the underlying set
$$\begin{aligned}
\mathcal{X} &= \{1 < 2 < \cdots < n < \circ < \bar n < \cdots < \bar 2 < \bar 1\} && \text{for } B_n, \\
\mathcal{X} &= \{1 < 2 < \cdots < n < \bar n < \cdots < \bar 2 < \bar 1\} && \text{for } C_n, \\
\mathcal{X} &= \{1 < 2 < \cdots < \begin{smallmatrix} n \\ \bar n \end{smallmatrix} < \cdots < \bar 2 < \bar 1\} && \text{for } D_n.
\end{aligned}$$


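The crystal $B(s\Lambda_1)$ is the set of weakly increasing words of length $s$ in the alphabet $\mathcal{X}$ such that, in addition, for type $B_n$ there is at most one $\circ$, and in type $D_n$, there are either no letters $n$ or no letters $\bar n$. The crystal operators $e_i$ on $B(s\Lambda_1)$ are given by
$$e_i b = \begin{cases}
(x_1, \dots, x_i + 1, x_{i+1} - 1, \dots, \bar x_1) & \text{if } x_{i+1} > \bar x_{i+1}, \\
(x_1, \dots, \bar x_{i+1} + 1, \bar x_i - 1, \dots, \bar x_1) & \text{if } x_{i+1} \le \bar x_{i+1},
\end{cases} \tag{2.16}$$
with the following exceptions:
$$\begin{aligned}
\text{Type } B_n:\quad e_n b &= \begin{cases}
(x_1, \dots, x_n, x_\circ + 1, \bar x_n - 1, \dots, \bar x_1) & \text{if } x_\circ = 0, \\
(x_1, \dots, x_n + 1, x_\circ - 1, \bar x_n, \dots, \bar x_1) & \text{if } x_\circ = 1,
\end{cases} \\
\text{Type } C_n:\quad e_n b &= (x_1, \dots, x_n + 1, \bar x_n - 1, \dots, \bar x_1), \\
\text{Type } D_n:\quad e_{n-1} b &= \begin{cases}
(x_1, \dots, x_{n-1} + 1, x_n - 1, \bar x_n, \dots, \bar x_1) & \text{if } x_n > 0, \\
(x_1, \dots, x_n, \bar x_n + 1, \bar x_{n-1} - 1, \dots, \bar x_1) & \text{if } x_n = 0,
\end{cases} \\
e_n b &= \begin{cases}
(x_1, \dots, x_{n-1} + 1, x_n, \bar x_n - 1, \dots, \bar x_1) & \text{if } \bar x_n > 0, \\
(x_1, \dots, x_n + 1, \bar x_n, \bar x_{n-1} - 1, \dots, \bar x_1) & \text{if } \bar x_n = 0.
\end{cases}
\end{aligned} \tag{2.17}$$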
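As an illustration of (2.16) together with its type $C_n$ exception in (2.17), the following Python sketch applies $e_i$ to an element of $B(s\Lambda_1)$ written in the coordinates $x(b)$. The function name `e_i_type_C` and the encoding of $b$ as two lists `x` and `xbar` (with `x[k-1]` $= x_k$ and `xbar[k-1]` $= \bar x_k$) are assumptions made for this example only; they are not notation from the text.

```python
def e_i_type_C(i, x, xbar):
    """Apply e_i (1 <= i <= n) in type C_n to coordinates (x_1..x_n, xbar_n..xbar_1).

    x[k-1] is the number of letters k and xbar[k-1] the number of letters k-bar.
    Returns the new pair (x, xbar), or None when e_i b is undefined.
    """
    n = len(x)
    x, xbar = list(x), list(xbar)      # work on copies
    if i == n:                         # type C_n exception: an n-bar becomes an n
        x[n - 1] += 1
        xbar[n - 1] -= 1
    elif x[i] > xbar[i]:               # x_{i+1} > xbar_{i+1}: an (i+1) becomes an i
        x[i - 1] += 1
        x[i] -= 1
    else:                              # otherwise: an i-bar becomes an (i+1)-bar
        xbar[i] += 1
        xbar[i - 1] -= 1
    if min(x + xbar) < 0:              # a negative coordinate means "undefined"
        return None
    return x, xbar
```

For $n = 2$, calling `e_i_type_C(1, [0, 1], [0, 0])` sends the one-letter word $2$ to the word $1$, while `e_i_type_C(1, [1, 0], [0, 0])` returns `None` because $e_1$ is undefined on the word $1$.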

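2.7. Affine crystals $B^{1,s}$. We recall the crystals $B^{1,s}$ from [21] (and [20] for type $C_n^{(1)}$). The affine algebra $\mathfrak{g}$ has a simple Lie subalgebra of type given in (2.1). There is an isomorphism of classical crystals
$$B^{1,s} \cong \begin{cases}
B(s\Lambda_1) & \text{for types } B_n^{(1)}, D_n^{(1)}, A_{2n-1}^{(2)}, \\[4pt]
\displaystyle\bigoplus_{s' \le s} B(s'\Lambda_1) & \text{for types } A_{2n}^{(2)}, D_{n+1}^{(2)}, \\[4pt]
\displaystyle\bigoplus_{\substack{s' \le s \\ s - s' \in 2\mathbb{Z}}} B(s'\Lambda_1) & \text{for type } C_n^{(1)}, A_{2n}^{(2)\dagger}.
\end{cases} \tag{2.18}$$

The crystal operators $e_i$ for $1 \le i \le n$ are given in Subsect. 2.6. The operator $e_0$ is given by
$$\begin{aligned}
\text{Type } B_n^{(1)}, D_n^{(1)}, A_{2n-1}^{(2)}:\quad e_0 b &= \begin{cases}
(x_1, x_2 - 1, \dots, \bar x_2, \bar x_1 + 1) & \text{if } x_2 > \bar x_2, \\
(x_1 - 1, x_2, \dots, \bar x_2 + 1, \bar x_1) & \text{if } x_2 \le \bar x_2,
\end{cases} \\
\text{Type } A_{2n}^{(2)}, D_{n+1}^{(2)}:\quad e_0 b &= \begin{cases}
(x_1 - 1, x_2, \dots, \bar x_2, \bar x_1) & \text{if } x_1 > \bar x_1, \\
(x_1, x_2, \dots, \bar x_2, \bar x_1 + 1) & \text{if } x_1 \le \bar x_1,
\end{cases} \\
\text{Type } C_n^{(1)}, A_{2n}^{(2)\dagger}:\quad e_0 b &= \begin{cases}
(x_1 - 2, x_2, \dots, \bar x_2, \bar x_1) & \text{if } x_1 \ge \bar x_1 + 2, \\
(x_1 - 1, x_2, \dots, \bar x_2, \bar x_1 + 1) & \text{if } x_1 = \bar x_1 + 1, \\
(x_1, x_2, \dots, \bar x_2, \bar x_1 + 2) & \text{if } x_1 \le \bar x_1.
\end{cases}
\end{aligned} \tag{2.19}$$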


3. Virtual Crystals

3.1. Embeddings of affine algebras. As given in (1.1), there are natural inclusions of the affine Lie algebras. These embeddings do not carry over to the corresponding quantum algebras. Nevertheless we expect that such embeddings exist for crystals. Note that every affine algebra can be embedded into one of type $A^{(1)}$, $D^{(1)}$ and $E^{(1)}$, which are the untwisted affine algebras whose canonical simple Lie subalgebra is simply-laced. Crystal embeddings $C_n^{(1)}, A_{2n}^{(2)}, A_{2n}^{(2)\dagger}, D_{n+1}^{(2)} \hookrightarrow A_{2n-1}^{(1)}$ are studied in [29].

Consider one of the embeddings given in (1.1) of an affine algebra with Dynkin diagram $X$ into one with diagram $Y$. We consider a graph automorphism $\sigma$ of $Y$ that fixes the 0 node. For type $A_{2n-1}^{(1)}$, $\sigma(i) = 2n - i \pmod{2n}$. For type $D_{n+1}^{(1)}$ the automorphism interchanges the nodes $n$ and $n+1$ and fixes all other nodes. There is an additional automorphism for type $D_4^{(1)}$, namely, the cyclic permutation of the nodes 1, 2 and 3. For type $E_6^{(1)}$ the automorphism exchanges nodes 1 and 5 and nodes 2 and 4.

Let $I^X$ and $I^Y$ be the vertex sets of the diagrams $X$ and $Y$ respectively, $I^Y/\sigma$ the set of orbits of the action of $\sigma$ on $I^Y$, and $\iota : I^X \to I^Y/\sigma$ a bijection which preserves edges and sends 0 to 0.

Example 3.1. If $X$ is one of $C_n^{(1)}$, $A_{2n}^{(2)}$, $A_{2n}^{(2)\dagger}$, $D_{n+1}^{(2)}$ and $Y = A_{2n-1}^{(1)}$, then $\iota(0) = 0$, $\iota(i) = \{i, 2n - i\}$ for $1 \le i < n$ and $\iota(n) = n$.
If $X = B_n^{(1)}$ or $A_{2n-1}^{(2)}$ and $Y = D_{n+1}^{(1)}$, then $\iota(i) = i$ for $i < n$ and $\iota(n) = \{n, n+1\}$.
If $X$ is $E_6^{(2)}$ or $F_4^{(1)}$ and $Y = E_6^{(1)}$, then $\iota(0) = 0$, $\iota(1) = 1$, $\iota(2) = 3$, $\iota(3) = \{2, 4\}$ and $\iota(4) = \{1, 5\}$.
If $X$ is $D_4^{(3)}$ or $G_2^{(1)}$ and $Y = D_4^{(1)}$, then $\iota(0) = 0$, $\iota(1) = 2$ and $\iota(2) = \{1, 3, 4\}$.

To describe the embedding we endow the bijection $\iota$ with additional data. For each $i \in I^X$ we shall define a multiplication factor $\gamma_i$ that depends on the location of $i$ with respect to a distinguished arrow (multiple bond) in $X$. Removing the arrow leaves two connected components. The factor $\gamma_i$ is defined as follows:
1. Suppose $X$ has a unique arrow.
(a) Suppose the arrow points towards the component of 0. Then $\gamma_i = 1$ for all $i \in I^X$.
(b) Suppose the arrow points away from the component of 0. Then $\gamma_i$ is the order of $\sigma$ for $i$ in the component of 0 and is 1 otherwise.
2. Suppose $X$ has two arrows, that is, $Y = A_{2n-1}^{(1)}$. Then $\gamma_i = 1$ for $1 \le i \le n - 1$. For $i \in \{0, n\}$, $\gamma_i = 2$ (which is the order of $\sigma$) if the arrow incident to $i$ points away from it and is 1 otherwise.

Example 3.2. For $X = B_n^{(1)}$ and $Y = D_{n+1}^{(1)}$ we have $\gamma_i = 2$ if $0 \le i \le n - 1$ and $\gamma_n = 1$. For $X = A_{2n-1}^{(2)}$ and $Y = D_{n+1}^{(1)}$ we have $\gamma_i = 1$ for all $i$.

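The embedding $\Psi : P^X \to P^Y$ of weight lattices is defined by
$$\Psi(\Lambda_i^X) = \gamma_i \sum_{j \in \iota(i)} \Lambda_j^Y .$$
As a consequence we have
$$\Psi(\alpha_i^X) = \gamma_i \sum_{j \in \iota(i)} \alpha_j^Y, \qquad \Psi(\delta^X) = a_0 \gamma_0\, \delta^Y .$$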
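The embedding of weight lattices just defined acts on fundamental weights only through the data $\iota$ and $\gamma$, so it can be sketched in a few lines of Python. The function name `embed_weight` and the dictionary encoding of weights are assumptions for illustration. The sample data are for $X = C_2^{(1)}$ and $Y = A_3^{(1)}$, with $\iota$ read off from Example 3.1 for $n = 2$ and $\gamma$ obtained from rule 2 under the assumption that both arrows of $C_2^{(1)}$ point away from the nodes 0 and 2.

```python
def embed_weight(coeffs, iota, gamma):
    """Send sum_i coeffs[i] * Lambda_i^X to its image.

    The image is returned as a dict {j: coefficient of Lambda_j^Y}.
    """
    image = {}
    for i, c in coeffs.items():
        for j in iota[i]:                       # every node j in the orbit iota(i)
            image[j] = image.get(j, 0) + gamma[i] * c
    return image

# Assumed example data: X = C_2^(1), Y = A_3^(1)
iota = {0: [0], 1: [1, 3], 2: [2]}
gamma = {0: 2, 1: 1, 2: 2}   # assumption: arrows of C_2^(1) point away from nodes 0 and 2
```

With these data, `embed_weight({1: 1}, iota, gamma)` gives the image $\Lambda_1^Y + \Lambda_3^Y$ of $\Lambda_1^X$, and `embed_weight({2: 1}, iota, gamma)` gives $2\Lambda_2^Y$.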


3.2. Virtual crystals. Suggested by the embeddings $X \hookrightarrow Y$ of affine algebras, we wish to realize crystals of type $X$ using crystals of type $Y$. Let $\widehat{V}$ be a $Y$-crystal. We define the virtual crystal operators $\hat e_i$, $\hat f_i$ for $i \in I^X$ as the composites of $Y$-crystal operators $f_j$, $e_j$ given by
$$\hat f_i = \prod_{j \in \iota(i)} f_j^{\gamma_i}, \qquad \hat e_i = \prod_{j \in \iota(i)} e_j^{\gamma_i}.$$
These are designed to simulate $X$-crystal operators $f_i$, $e_i$ for $i \in I^X$. The type $Y$ operators on the right-hand side may be performed in any order, since distinct nodes $j, j' \in \iota(i)$ are not adjacent in $Y$ and thus their corresponding raising and lowering operators commute.

A virtual crystal is a pair $(V, \widehat{V})$ such that:
(1) $\widehat{V}$ is a $Y$-crystal.
(2) $V \subset \widehat{V}$ is closed under $\hat e_i$, $\hat f_i$ for $i \in I^X$.
(3) There is an $X$-crystal $B$ and an $X$-crystal isomorphism $\Psi : B \to V$ such that $e_i$, $f_i$ correspond to $\hat e_i$, $\hat f_i$.
Sometimes by abuse of notation, $V$ will be referred to as a virtual crystal.

Let $b \in \widehat{V}$ and $i \in I^X$. We say that $b$ is $i$-aligned if
(1) $\varphi_j^Y(b) = \varphi_{j'}^Y(b)$ for all $j, j' \in \iota(i)$, and similarly for $\varepsilon$.
(2) $\varphi_j^Y(b) \in \gamma_i \mathbb{Z}$ for all $j \in \iota(i)$ and similarly for $\varepsilon$.
In this case
$$\varphi_i^X(b) = \frac{1}{\gamma_i}\, \varphi_j^Y(\Psi(b)) \quad \text{for } j \in \iota(i),\ b \in B, \tag{3.1}$$
and similarly for $\varepsilon$. Say that $b \in \widehat{V}$ is aligned if it is $i$-aligned for all $i \in I^X$, and a subset $V \subset \widehat{V}$ is aligned if all its elements are.

Proposition 3.3 ([29]). Aligned virtual crystals form a tensor category.

Say that $(V, \widehat{V})$ is simple if $V$ and $\widehat{V}$ are simple crystals. For the rest of the definitions we assume that the virtual crystals are simple and aligned. Let $(V, \widehat{V})$ and $(V', \widehat{V}')$ be virtual crystals.

Definition-Conjecture 3.4. Define the virtual R-matrix $R^v : V \otimes V' \to V' \otimes V$ as the restriction of the type $Y$ R-matrix $\widehat{R} : \widehat{V} \otimes \widehat{V}' \to \widehat{V}' \otimes \widehat{V}$.

For this definition to make sense it needs to be shown that $\widehat{R}(V \otimes V') \subset V' \otimes V$. In this case, let $\Psi : B \cong V$ and $\Psi' : B' \cong V'$ be $X$-crystal isomorphisms. By the uniqueness of the R-matrix it follows that the diagram
$$\begin{array}{ccc}
B \otimes B' & \xrightarrow{\ R\ } & B' \otimes B \\
\Big\downarrow{\scriptstyle \Psi \otimes \Psi'} & & \Big\downarrow{\scriptstyle \Psi' \otimes \Psi} \\
V \otimes V' & \xrightarrow{\ R^v\ } & V' \otimes V
\end{array} \tag{3.2}$$
commutes.


Definition 3.5. Define the virtual energy function $H^v : V \otimes V' \to \mathbb{Z}$ by
$$H^v(b \otimes b') = \frac{1}{\gamma_0}\, H_Y(b \otimes b'),$$
where $H_Y : \widehat{V} \otimes \widehat{V}' \to \mathbb{Z}$.

If Definition-Conjecture 3.4 holds, it follows that
$$H_X(b \otimes b') = H^v(\Psi(b) \otimes \Psi'(b')), \tag{3.3}$$
where $H_X : B \otimes B' \to \mathbb{Z}$ is the energy function. Similarly, define $D^v : V \to \mathbb{Z}$ as
$$D^v(b) = \frac{1}{\gamma_0}\, D_{\widehat{V}}(b).$$
If (2.14) and Definition-Conjecture 3.4 hold then
$$D_X(b) = D^v(\Psi(b)) \quad \text{for } b \in B, \tag{3.4}$$
where $D_X : B \to \mathbb{Z}$ is the intrinsic energy of $B$.

Finally, let $\lambda \in P^+$ for the algebra $X$ and
$$P(V, \lambda) = \{b \in V \mid \mathrm{wt}(b) = \Psi(\lambda) \text{ and } \hat e_i b \text{ undefined for } i \in I^X,\ i \ne 0\}.$$
Then let
$$X^v(V, \lambda) = \sum_{b \in P(V,\lambda)} q^{D^v(b)}.$$

Let us define the $Y$-crystal
$$\widehat{V}^{r,s} = \bigotimes_{j \in \iota(r)} B_Y^{j,\gamma_r s},$$
except for $A_{2n}^{(2)}$ and $r = n$, in which case $\widehat{V}^{n,s} = B_Y^{n,s} \otimes B_Y^{n,s}$.

Definition 3.6. Let $V^{r,s}$ be the subset of $\widehat{V}^{r,s}$ generated from $u(\widehat{V}^{r,s})$ using the virtual crystal operators $\hat e_i$ and $\hat f_i$ for $i \in I^X$.

Conjecture 3.7.
(V1) The pair $(V^{r,s}, \widehat{V}^{r,s})$ is a simple aligned virtual crystal.
(V2) There is an isomorphism of $X$-crystals
$$\Psi : B_X^{r,s} \cong V^{r,s}$$
such that $e_i$ and $f_i$ correspond to $\hat e_i$ and $\hat f_i$ respectively, for all $i \in I^X$.
(V3) Let $\lambda$ be a classical dominant weight for $X$, $B$ a tensor product of $X$-crystals of the form $B^{r,s}$, and $(V, \widehat{V})$ the corresponding tensor product of virtual crystals $(V^{r,s}, \widehat{V}^{r,s})$. Then
$$X^v(V, \lambda) = X(B, \lambda). \tag{3.5}$$

In [29] Conjecture 3.7 is proved for embeddings $C_n^{(1)}, A_{2n}^{(2)}, A_{2n}^{(2)\dagger}, D_{n+1}^{(2)} \hookrightarrow A_{2n-1}^{(1)}$ and tensor factors of the form $B^{r,1}$.

Theorem 3.8. Conjecture 3.7 holds when $X$ is of nonexceptional affine type and $B$ is a tensor product of crystals of the form $B^{1,s}$.

This theorem is proven in Subsects. 3.3 and 3.4.


3.3. Virtual crystals $V^{1,s}$ for $A_{2n-1}^{(2)}, B_n^{(1)} \hookrightarrow D_{n+1}^{(1)}$.

Proposition 3.9. For $X = B_n^{(1)}$ and $Y = D_{n+1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid x_i, \bar x_i \in 2\mathbb{Z} \text{ for } i < n,\ x_n + \bar x_n \in 2\mathbb{Z},\ x_{n+1} = \bar x_{n+1} = 0\}.$$
Moreover Theorem 3.8 holds.

Proof. The explicit form of $V^{1,s}$ follows from $u(\widehat{V}^{1,s}) = 1^{2s}$ and the definitions of the virtual crystal operators. It is easy to show that for $s = 1$ the map $B^{1,1} \to V^{1,1}$ defined by $i \mapsto ii$ and $\bar i \mapsto \bar i\,\bar i$ for $1 \le i \le n$ and $\circ \mapsto n \bar n$ is the desired isomorphism. Similarly, it is straightforward to show that for $s$ arbitrary, the desired isomorphism $\Psi : B^{1,s} \to V^{1,s}$ is given by replacing each letter (which is an element of $B^{1,1}$) of a word in $B^{1,s}$ by the corresponding pair of letters as in the case $s = 1$. This proves (V1) and (V2). For (V3) we need to check that
$$D^v(\Psi(b)) = D_B(b) \quad \text{for } b \in B. \tag{3.6}$$
Since $D_B$ is defined in terms of $R$, $H$ and functions $D_{B^{1,s}}$, it suffices to verify (2.14) and Definition-Conjecture 3.4. The element $u^\natural(B^{1,s})$ is given explicitly by $\bar 1^{\,s}$. By the explicit computation of $H : B^{1,s} \otimes B^{1,s} \to \mathbb{Z}$ given in [5] it follows that (2.14) holds. To check Definition-Conjecture 3.4 we consider the explicit expressions for the R-matrices of types $B_n^{(1)}$ and $D_{n+1}^{(1)}$ given in [5]. From this it suffices to show that the images of relations in the plactic monoid of type $B_n$ are relations in the plactic monoid of type $D_{n+1}$ [26]. This is straightforward.

Proposition 3.10. For $X = A_{2n-1}^{(2)}$ and $Y = D_{n+1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid x_{n+1} = \bar x_{n+1} = 0\}.$$
Moreover Theorem 3.8 holds.

Proof. The proof is similar to that of Proposition 3.9. In particular the bijection $B^{1,s} \to V^{1,s}$ is given by leaving a word unchanged.

3.4. Virtual crystals $V^{1,s}$ for $C_n^{(1)}, A_{2n}^{(2)}, A_{2n}^{(2)\dagger}, D_{n+1}^{(2)} \hookrightarrow A_{2n-1}^{(1)}$. We require some preliminaries on crystals of type $A_{2n-1}^{(1)}$.

Consider $Y = A_{2n-1}^{(1)}$ and $X$ one of $C_n^{(1)}$, $A_{2n}^{(2)}$, $A_{2n}^{(2)\dagger}$, $D_{n+1}^{(2)}$. In all these cases $\widehat{V}^{1,s} = B_Y^{2n-1,s} \otimes B_Y^{1,s}$. We introduce the alphabets
$$\mathcal{Y} = \{1 < 2 < \cdots < 2n\}, \qquad \mathcal{Y}^\vee = \{2n^\vee < (2n-1)^\vee < \cdots < 2^\vee < 1^\vee\}. \tag{3.7}$$
$\mathcal{Y}$ and $\mathcal{Y}^\vee$ are the sets of elements of $B_Y^{1,1}$ and $(B_Y^{1,1})^\vee \cong B_Y^{2n-1,1}$ respectively. The element $i^\vee \in B_Y^{2n-1,1}$ is the column of height $2n - 1$ in the alphabet $\mathcal{Y}$ with the letter $i$ missing. For $1 \le i \le 2n - 1$, $f_i((i+1)^\vee) = i^\vee$ and $f_i(b)$ is undefined otherwise. $f_0(1^\vee) = (2n)^\vee$ and $f_0(b)$ is undefined otherwise. In this notation, $B_Y^{2n-1,s}$ consists of the weakly increasing words of length $s$ in the alphabet $\mathcal{Y}^\vee$. For $b = b_1 \otimes b_2 \in \widehat{V}^{1,s}$, let $y_i$ be the number of letters $i$ in $b_2$ and $y_i^\vee$ the number of letters $i^\vee$ in $b_1$, for $1 \le i \le 2n$.

The R-matrix $R : B_Y^{1,1} \otimes B_Y^{2n-1,1} \to B_Y^{2n-1,1} \otimes B_Y^{1,1}$ is given by
$$i \otimes j^\vee \mapsto \begin{cases}
j^\vee \otimes i & \text{if } i \ne j, \\
(i+1)^\vee \otimes (i+1) & \text{if } i = j < 2n, \\
1^\vee \otimes 1 & \text{if } i = j = 2n.
\end{cases} \tag{3.8}$$
The R-matrix $R : B_Y^{1,s} \otimes B_Y^{2n-1,s} \to B_Y^{2n-1,s} \otimes B_Y^{1,s}$ is given by iterating the above R-matrix so that all of the elements of $\mathcal{Y}^\vee$ are commuted to the left. The element $1^\vee \otimes 1$ commutes with all elements of $B_Y^{1,1}$ and $B_Y^{2n-1,1}$.

To formulate the next propositions we also need an involution $* : B \to B$ on crystals of type $A_{2n-1}^{(1)}$ [29, Sect. 3.8]. Given a word $u$, let $u^*$ be the word obtained by replacing each letter $i$ by $2n + 1 - i$, and reversing the resulting word. Clearly if $u$ is a column word then so is $u^*$. If $b = c_1 c_2 \dots c_s \in B^{r,s}$, where $c_j$ is a column word for all $j$, then by definition $b^* = c_s^* \dots c_1^* \in B^{r,s}$, which is a sequence of column words. Under this map the crystal operators transform as follows:
$$f_i(b^*) = e_{n-i}(b)^*, \qquad e_i(b^*) = f_{n-i}(b)^*, \qquad \mathrm{wt}(b^*) = w_0\, \mathrm{wt}(b).$$

Proposition 3.11. For $X = C_n^{(1)}$ and $Y = A_{2n-1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid b^{\vee *} = R(b),\ \min(y_1, y_1^\vee),\ \min(y_{n+1}, y_{n+1}^\vee) \in 2\mathbb{Z}\}. \tag{3.9}$$

Moreover Theorem 3.8 holds.

Proof. We first prove (3.9). By the definition of $V^{1,s}$, it suffices to show that the right-hand side $V'$ of (3.9) contains $u(\widehat{V}^{1,s})$, and every element of $V'$ is reachable from $u(\widehat{V}^{1,s})$ using the virtual crystal operators $\hat e_i$, $\hat f_i$ for $i \in I^X$.

We first digress on the self-duality condition
$$b^{\vee *} = R(b). \tag{3.10}$$
By the proof of [29, Prop 6.8], in the set $\widehat{V}^{r,s}$, the condition (3.10) is preserved under $\hat e_0$, $\hat e_n$, $\hat e_i$ for $1 \le i \le n - 1$, and similarly for $\hat f$. For $b \in \widehat{V}^{1,s}$, using (3.8), Eq. (3.10) is equivalent to
$$y_{2n+1-i} = y_i^\vee - \min(y_i, y_i^\vee) + \min(y_{i+1}, y_{i+1}^\vee), \tag{3.11}$$
$$y_{2n+1-i}^\vee = y_i - \min(y_i, y_i^\vee) + \min(y_{i+1}, y_{i+1}^\vee) \tag{3.12}$$
for $1 \le i \le 2n$, where $y_{2n+1} = y_1$ and $y_{2n+1}^\vee = y_1^\vee$.

We deduce two consequences of (3.10). Subtracting (3.11) and (3.12) we obtain
$$y_i + y_{2n+1-i} = y_i^\vee + y_{2n+1-i}^\vee \tag{3.13}$$
for $1 \le i \le 2n$. We also have $\varepsilon_0(b) = y_1 + y_{2n}^\vee - \min(y_{2n}, y_{2n}^\vee)$. By (3.12) with $i = 2n$ and (3.13) with $i = 1$, we have
$$\varepsilon_0(b) = 2 y_1 - \min(y_1, y_1^\vee). \tag{3.14}$$


Now we show that $u = u(\widehat{V}^{1,s}) \in V'$. This element satisfies $y_i = s\,\delta_{i,1}$ and $y_i^\vee = s\,\delta_{i,2n}$ for $1 \le i \le 2n$. Comparing this with (3.11) and (3.12) it follows that $u$ satisfies (3.10). It follows that $u \in V'$.

We next check that $V'$ is aligned. Let $b \in V'$ and $i \in I^X$. Since $b$ satisfies (3.10) it is $i$-aligned if $1 \le i \le n - 1$ by [29, Prop. 6.9]. For 0-alignedness, by (3.14) we see that $\varepsilon_0(b)$ is even since $\min(y_1, y_1^\vee)$ is. The proof that $\varphi_0(b)$ is even is similar. So $b$ is 0-aligned. The proof that $b$ is $n$-aligned is similar as well. So $V'$ is aligned.

Next it is shown that the set $V'$ is closed under $\hat e_i$ and $\hat f_i$ for $i \in I^X$. Let $b \in V'$. $\hat e_i b$ is self-dual since $b$ is. Note that the quantity $\min(y_1, y_1^\vee)$ is unchanged for $i \notin \{0, 1\}$. We have $\varepsilon_1(b_1) = y_1^\vee$ and $\varphi_1(b_2) = y_1$. Hence by the tensor product rule, $\min(y_1, y_1^\vee)$ remains the same upon applying $\hat e_1$. Let $i = 0$. Since $b \in V'$, $b$ is 0-aligned, so that $\varepsilon_0(b) \in 2\mathbb{Z}$. Since $\varepsilon_0(\hat e_0 b) = \varepsilon_0(b) - 2$ is even, by (3.14), the self-dual element $\hat e_0 b$ has the property that $\min(y_1, y_1^\vee) \in 2\mathbb{Z}$. Thus $\hat e_i b$ satisfies that property for all $i$. The proof that $\min(y_{n+1}, y_{n+1}^\vee) \in 2\mathbb{Z}$ is satisfied for $\hat e_i b$ is similar. Thus $\hat e_i b \in V'$ for all $i \in I^X$. The proof that $\hat f_i b \in V'$ for all $i \in I^X$ is again similar.

Let $b \in V'$. It suffices to find a sequence of operators $\hat e_i$ and $\hat f_i$ leading from $b$ to $u$. We shall induct on the quantity $\min(y_1, y_1^\vee)$, which is invariant under $\hat e_i$ and $\hat f_i$ for $i \in I^X \setminus \{0\}$ by previous arguments. Suppose first that $\varepsilon_j(b) > 0$ for some $j \ne 0$. By alignedness it follows that we may apply a sequence of operators $\hat e_i$ for $i \in I^X \setminus \{0\}$ to $b$, thereby passing to a classical highest weight vector of $\widehat{V}^{1,s}$. The classical highest weight vectors of $\widehat{V}^{1,s}$ are given explicitly by $u_k = (2n^\vee)^{s-k}\, 1^{\vee k} \otimes 1^s$, for $0 \le k \le s$. $u_k$ satisfies $\min(y_1, y_1^\vee) = k$. By assumption $b = u_k$ for $k$ even. If $k = 0$ then $b = u_0 = u$ and we are done. If $k > 0$ then $\hat f_0 b$ satisfies $\min(y_1, y_1^\vee) = k - 2$, which is even. We are done by induction.
We have shown that (3.9) holds and that $V^{1,s}$ is aligned. The bijection $\Psi : B^{1,s} \to V^{1,s}$ is given as follows. Let $b \in B^{1,s}$. In the case $s = 1$, the map $B^{1,1} \to V^{1,1}$ is given by $i \mapsto (2n + 1 - i)^\vee \otimes i$ and $\bar i \mapsto i^\vee \otimes (2n + 1 - i)$. The map $\Psi : B^{1,s} \to V^{1,s}$ is given by the composite map
$$B^{1,s} \to (B^{1,1})^{\otimes s} \to (B_Y^{2n-1,1} \otimes B_Y^{1,1})^{\otimes s} \to (B_Y^{2n-1,1})^{\otimes s} \otimes (B_Y^{1,1})^{\otimes s}. \tag{3.15}$$
It follows from (3.8) that the image of this map is contained in $B_Y^{2n-1,s} \otimes B_Y^{1,s}$. Computing this commutation explicitly and using the notation $x_i$, $\bar x_i$ to describe $b$ for $1 \le i \le n$, and $y_i$, $y_i^\vee$ for $\Psi(b)$, we have
$$\begin{aligned}
y_1 &= x_1 - \min(x_1, \bar x_1) + s - \sum_{i=1}^{n} (x_i + \bar x_i), \\
y_1^\vee &= \bar x_1 - \min(x_1, \bar x_1) + s - \sum_{i=1}^{n} (x_i + \bar x_i), \\
y_i &= x_i - \min(x_i, \bar x_i) + \min(x_{i-1}, \bar x_{i-1}) && \text{for } i > 1, \\
y_i^\vee &= \bar x_i - \min(x_i, \bar x_i) + \min(x_{i-1}, \bar x_{i-1}) && \text{for } i > 1.
\end{aligned} \tag{3.16}$$
To recover $y_i$ and $y_i^\vee$ for $n + 1 \le i \le 2n$ one may use (3.11) and (3.12), plus the fact that the total number of letters in either $b_1$ or $b_2$ is $s$.

The composite map given in (3.15) sends $e_i$ to $\hat e_i$ for $1 \le i \le n$ [2]. It is straightforward to check that $e_0$ goes to $\hat e_0$ using (3.16), (3.11), and (3.12). Therefore $\Psi$ is a morphism of $X$-crystals. It is clearly injective. The image is $V^{1,s}$ since $\Psi(u(B^{1,s})) = u(\widehat{V}^{1,s})$ and both $B^{1,s}$ and $V^{1,s}$ are connected. Therefore $\Psi : B^{1,s} \to V^{1,s}$ is an isomorphism of $X$-crystals. This completes the proof of (V1) and (V2). (V3) follows by [29, Sect. 6.6].
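The coordinate description of the isomorphism for $X = C_n^{(1)}$ can be tried out numerically. The Python sketch below evaluates (3.16) for the letters $1, \dots, n$, recovers the remaining coordinates from (3.11), (3.12) and the letter count $s$, and checks the self-duality relations. The function names `virtual_coordinates` and `is_self_dual`, and the list encoding (`x[k-1]` $= x_k$, `y[k-1]` $= y_k$, `yv[k-1]` $= y_k^\vee$), are assumptions made for this illustration.

```python
def virtual_coordinates(x, xbar, s):
    """Coordinates (y, yv) of the image of b in B^{1,s} for X = C_n^(1), following (3.16)."""
    n = len(x)
    m = [min(a, b) for a, b in zip(x, xbar)]      # m[k-1] = min(x_k, xbar_k)
    pad = s - sum(x) - sum(xbar)                  # s minus the length of the word b
    y, yv = [0] * (2 * n), [0] * (2 * n)
    y[0], yv[0] = x[0] - m[0] + pad, xbar[0] - m[0] + pad
    for k in range(1, n):                         # letters 2..n via (3.16)
        y[k] = x[k] - m[k] + m[k - 1]
        yv[k] = xbar[k] - m[k] + m[k - 1]
    for k in range(n - 1):                        # (3.11), (3.12) with i = k + 1 < n
        common = min(y[k + 1], yv[k + 1]) - min(y[k], yv[k])
        y[2 * n - 1 - k] = yv[k] + common
        yv[2 * n - 1 - k] = y[k] + common
    y[n] = s - sum(y)                             # each tensor factor has s letters
    yv[n] = s - sum(yv)
    return y, yv


def is_self_dual(y, yv):
    """Check the relations (3.11) and (3.12), reading the index i + 1 cyclically."""
    size = len(y)
    for k in range(size):
        nxt = (k + 1) % size
        common = min(y[nxt], yv[nxt]) - min(y[k], yv[k])
        if y[size - 1 - k] != yv[k] + common or yv[size - 1 - k] != y[k] + common:
            return False
    return True
```

For $n = 2$ and $s = 1$ the word $\bar 2$ has `x = [0, 0]`, `xbar = [0, 1]`, and the sketch returns the coordinates of $2^\vee \otimes 3$, in agreement with the rule $\bar i \mapsto i^\vee \otimes (2n + 1 - i)$.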

Proposition 3.12. For $X = A_{2n}^{(2)}$ and $Y = A_{2n-1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid b^{\vee *} = R(b),\ \min(y_{n+1}, y_{n+1}^\vee) \in 2\mathbb{Z}\}. \tag{3.17}$$
Moreover Theorem 3.8 holds.

The proof is entirely similar to that of $C_n^{(1)}$.

Proposition 3.13. For $X = D_{n+1}^{(2)}$ and $Y = A_{2n-1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid b^{\vee *} = R(b)\}. \tag{3.18}$$
Moreover Theorem 3.8 holds.

Proof. For $X = D_{n+1}^{(2)}$ most of the proof is similar to that of type $C_n^{(1)}$. Here the classical subalgebra of $X$ is of type $B_n$, so the isomorphism $\Psi : B^{1,s} \to V^{1,s}$ is a bit different. It is given by $\circ \mapsto (n+1)^\vee \otimes (n+1)$, with the other letters mapped as in the $C_n^{(1)}$ case. The explicit map is given as in (3.16) except that
$$\begin{aligned}
y_1 &= x_1 - \min(x_1, \bar x_1) + s - x_\circ - \sum_{i=1}^{n} (x_i + \bar x_i), \\
y_1^\vee &= \bar x_1 - \min(x_1, \bar x_1) + s - x_\circ - \sum_{i=1}^{n} (x_i + \bar x_i).
\end{aligned} \tag{3.19}$$

Proposition 3.14. For $X = A_{2n}^{(2)\dagger}$ and $Y = A_{2n-1}^{(1)}$,
$$V^{1,s} = \{b \in \widehat{V}^{1,s} \mid b^{\vee *} = R(b),\ \min(y_1, y_1^\vee) \in 2\mathbb{Z}\}. \tag{3.20}$$

Moreover Theorem 3.8 holds. The proof is similar.

4. Fermionic Formula

4.1. Review. This subsection reviews definitions of [6, 7]. For this section we assume that $\mathfrak{g} \ne A_{2n}^{(2)\dagger}$; for that type we refer the reader to [29, Sect. 7.6]. Fix $\lambda \in P^+$ and $B$ a tensor product of crystals of the form $B^{r,s}$. Let $L_i^{(a)}$ be the number of tensor factors in $B$ that are equal to $B^{a,i}$. Set $\tilde\alpha_a = \alpha_a$ for all $a \in I$ except for type $A_{2n}^{(2)}$, in which case $\tilde\alpha_a$ are the simple roots of type $B_n$.

Let $\nu = (m_i^{(a)})$ be a matrix of nonnegative integers for $i \in \mathbb{Z}_{>0}$ and $a \in I$. Say that $\nu$ is a $(B, \lambda)$-configuration if
$$\sum_{a \in I} \sum_{i \in \mathbb{Z}_{>0}} i\, m_i^{(a)}\, \tilde\alpha_a = \sum_{a \in I} \sum_{i \in \mathbb{Z}_{>0}} i\, L_i^{(a)}\, \Lambda_a - \lambda \tag{4.1}$$


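except for type $A_{2n}^{(2)}$. In this case the right-hand side should be replaced by $\iota(\text{r.h.s.})$, where $\iota$ is a $\mathbb{Z}$-linear map from the weight lattice of type $C_n$ to the weight lattice of type $B_n$ such that
$$\iota(\Lambda_a^C) = \begin{cases}
\Lambda_a^B & \text{for } 1 \le a < n, \\
2\Lambda_a^B & \text{for } a = n.
\end{cases}$$

Say that a configuration $\nu$ is admissible if
$$p_i^{(a)} \ge 0 \quad \text{for all } a \in I \text{ and } i \in \mathbb{Z}_{>0}, \tag{4.2}$$
where
$$p_i^{(a)} = \sum_{k \in \mathbb{Z}_{>0}} \Bigl( L_k^{(a)} \min(i, k) - \frac{1}{t_a^\vee} \sum_{b \in I} (\tilde\alpha_a | \tilde\alpha_b) \min(t_b i, t_a k)\, m_k^{(b)} \Bigr). \tag{4.3}$$
Write $C(B, \lambda)$ for the set of admissible $(B, \lambda)$-configurations. Define
$$cc(\nu) = \frac{1}{2} \sum_{a,b \in I} \sum_{j,k \in \mathbb{Z}_{>0}} (\tilde\alpha_a | \tilde\alpha_b) \min(t_b j, t_a k)\, m_j^{(a)} m_k^{(b)}. \tag{4.4}$$
The fermionic formula is defined by
$$M(B, \lambda; q) = \sum_{\nu \in C(B,\lambda)} q^{cc(\nu)} \prod_{a \in I} \prod_{i \in \mathbb{Z}_{>0}} \begin{bmatrix} p_i^{(a)} + m_i^{(a)} \\ m_i^{(a)} \end{bmatrix}_{q^{t_a^\vee}}. \tag{4.5}$$
The $X = M$ conjecture of [6, 7] states that
$$X(B, \lambda; q^{-1}) = M(B, \lambda; q). \tag{4.6}$$

The fermionic formula $M(B, \lambda)$ can be interpreted using combinatorial objects called rigged configurations. Denote by $(\nu, J)$ a pair where $\nu = (m_i^{(a)})$ is a matrix and $J = (J^{(a,i)})$ is a matrix of partitions with $a \in I$ and $i \in \mathbb{Z}_{>0}$. Then a rigged configuration is a pair $(\nu, J)$ such that $\nu \in C(B, \lambda)$ and the partition $J^{(a,i)}$ is contained in a $m_i^{(a)}(\nu) \times p_i^{(a)}(\nu)$ rectangle for all $a, i$. The set of rigged $(B, \lambda)$-configurations for fixed $\lambda$ and $B$ is denoted by $\mathrm{RC}(B, \lambda)$. Then (4.5) is equivalent to
$$M(B, \lambda) = \sum_{(\nu,J) \in \mathrm{RC}(B,\lambda)} q^{cc(\nu,J)},$$
where $cc(\nu, J) = cc(\nu) + |J|$ and $|J| = \sum_{(a,i)} t_a^\vee\, |J^{(a,i)}|$.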


4.2. Virtual fermionic formula. We define virtual rigged configurations in analogy to virtual crystals.

Definition 4.1. Let $X$ and $Y$ be as in (1.1), and $\lambda$ and $B$ as in Subsect. 4.1 for type $X$. Let $(V, \widehat{V})$ be the virtual $Y$-crystal corresponding to $B$. Then $\mathrm{RC}^v(B, \lambda)$ is the set of elements $(\hat\nu, \hat J) \in \mathrm{RC}(\widehat{V}, \Psi(\lambda))$ such that:
(1) For all $i \in \mathbb{Z}_{>0}$, $\hat m_i^{(a)} = \hat m_i^{(b)}$ and $\hat J_i^{(a)} = \hat J_i^{(b)}$ if $a$ and $b$ are in the same $\sigma$-orbit in $I^Y$.
(2) For all $i \in \mathbb{Z}_{>0}$, $a \in I^X$, and $b \in \iota(a) \subset I^Y$, we have $\hat m_j^{(b)} = 0$ if $j \notin \gamma_a \mathbb{Z}$ and the parts of $\hat J_i^{(b)}$ are multiples of $\gamma_a$.

Theorem 4.2. There is a bijection $\mathrm{RC}(B, \lambda) \to \mathrm{RC}^v(B, \lambda)$ sending $(\nu, J) \mapsto (\hat\nu, \hat J)$ given as follows. For all $a \in I^X$, $b \in \iota(a) \subset I^Y$, and $i \in \mathbb{Z}_{>0}$,
$$\hat m_{\gamma_a i}^{(b)} = m_i^{(a)}, \tag{4.7}$$
$$\hat J_{\gamma_a i}^{(b)} = \gamma_a J_i^{(a)}, \tag{4.8}$$
except when $X = A_{2n}^{(2)}$ and $a = n$, in which case
$$\hat m_i^{(n)} = m_i^{(n)}, \qquad \hat J_i^{(n)} = 2 J_i^{(n)}.$$
The cocharge changes by
$$cc(\hat\nu, \hat J) = \gamma_0\, cc(\nu, J). \tag{4.9}$$

Proof. Let $\hat L$ be to $\widehat{V}$ as $L$ is to $B$ as in Subsect. 4.1. For $a \in I^X$, $b \in \iota(a)$, and $i \in \mathbb{Z}_{>0}$,
$$\hat L_{\gamma_a i}^{(b)} = L_i^{(a)}, \qquad \hat L_j^{(b)} = 0 \quad \text{for } j \notin \gamma_a \mathbb{Z},$$
except when $X = A_{2n}^{(2)}$ and $a = n$, in which case $\hat L_i^{(n)} = 2 L_i^{(n)}$ for all $i$. Using (4.3) we have, for all $b \in \iota(a)$ and $i \in \mathbb{Z}_{>0}$,
$$\hat p_{\gamma_a i}^{(b)} = \gamma_a\, p_i^{(a)},$$
except when $X = A_{2n}^{(2)}$ and $a = n$, in which case $\hat p_i^{(n)} = 2 p_i^{(n)}$. Therefore $(\nu, J) \mapsto (\hat\nu, \hat J)$ defines a bijection. Using (4.4) we see that (4.9) holds.
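The map of Theorem 4.2 is a relabelling of multiplicities and a rescaling of riggings, which the following Python sketch makes concrete for the generic case (4.7) and (4.8). It deliberately does not implement the exception for $X = A_{2n}^{(2)}$, $a = n$. The function name `virtual_rigged_configuration` and the dictionary encoding keyed by pairs `(a, i)` are assumptions for this example, and the sample values of `m` and `J` are illustrative inputs rather than a configuration verified to be admissible.

```python
def virtual_rigged_configuration(m, J, iota, gamma):
    """Apply (4.7) and (4.8).

    m maps (a, i) to the multiplicity m_i^(a); J maps (a, i) to a partition
    given as a list of parts. Both results are keyed by (b, gamma_a * i).
    """
    m_hat, J_hat = {}, {}
    for (a, i), mult in m.items():
        for b in iota[a]:                                   # every node b in iota(a)
            key = (b, gamma[a] * i)
            m_hat[key] = mult                               # (4.7)
            J_hat[key] = [gamma[a] * part for part in J.get((a, i), [])]   # (4.8)
    return m_hat, J_hat

# Assumed example data for X = C_2^(1), Y = A_3^(1), classical nodes only
iota = {1: [1, 3], 2: [2]}
gamma = {1: 1, 2: 2}
m = {(1, 1): 2, (2, 1): 1}
J = {(1, 1): [1, 0], (2, 1): [1]}
m_hat, J_hat = virtual_rigged_configuration(m, J, iota, gamma)
```

Here the row of length 1 in $\nu^{(2)}$ becomes a row of length 2 in $\hat\nu^{(2)}$ with doubled rigging, while the rows of $\nu^{(1)}$ are copied unchanged to the nodes 1 and 3 of $Y$.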


5. Algorithms for Computing the Fermionic Formula

To compute the fermionic formula $M(B, \lambda)$, one must find the set of admissible $(B, \lambda)$-configurations $C(B, \lambda)$. One direct approach would be to test the admissibility conditions (4.2) on the set of $(B, \lambda)$-configurations (4.1), which consist of all possible $n$-tuples of partitions of sizes that depend on $\lambda$ and $B$. This quickly becomes infeasible as $B$ and $\lambda$ grow.

In [16, 17] Kleber gives an efficient algorithm to compute the set of admissible configurations in the simply-laced types $A_n^{(1)}$, $D_n^{(1)}$, $E_6^{(1)}$, $E_7^{(1)}$, and $E_8^{(1)}$. It generates a rooted tree $T(B)$ whose nodes are labelled by elements of $P^+$. The tree $T(B)$ is constructed to have the property that the elements of $C(B, \lambda)$ are in bijection with the nodes of $T(B)$ labelled $\lambda$. If a node $x$ labelled $\lambda$ corresponds to a configuration $\nu$, then $\nu$ can be recovered from the unique path in $T(B)$ from $x$ to the root.

5.1. Kleber's algorithm. We review Kleber's algorithm [16, 17]. Let $X$ be the Dynkin diagram of an untwisted affine Lie algebra whose canonical simple subalgebra is of simply-laced type. Let $B$ and $L$ be as in Subsect. 4.1.

We define a tree $T(B)$ by the following algorithm. Each node $x$ is labelled by an element $\mathrm{wt}(x) \in P^+$ called its weight. It has the property that if $x$ is a node and $y$ is its child, then $\mathrm{wt}(x) \ne \mathrm{wt}(y)$ and $\mathrm{wt}(x) \trianglerighteq \mathrm{wt}(y)$. A tree edge $(x, y)$ is labelled by the element $d_{xy} = \mathrm{wt}(x) - \mathrm{wt}(y) \in Q^+ \setminus \{0\}$.
(1) Let $T_0$ be the tree consisting of a single node of weight 0 and set $\ell = 0$.
(2) Add 1 to $\ell$.
(3) Let $T'_\ell$ be obtained from $T_{\ell-1}$ by adding $\sum_{a=1}^{n} \Lambda_a \sum_{i \ge \ell} L_i^{(a)}$ to the weight of each node.
(4) Let $T_\ell$ be obtained from $T'_\ell$ as follows. Let $x$ be a node at depth $\ell - 1$ of weight $\mu$. Suppose there is a weight $\tau \in P^+$ such that $\mu \ne \tau$, $\mu \trianglerighteq \tau$, and if $x$ is not the root, $\nu - 2\mu + \tau \in Q^+$, where $\nu$ is the weight of the parent $w$ of $x$. In every such case we attach to $x$ a child $y$ of weight $\tau$. Note that if $x$ is not the root, the condition $\nu - 2\mu + \tau \in Q^+$ is equivalent to $d_{wx} \trianglerighteq d_{xy}$.
(5) If $T_\ell \ne T_{\ell-1}$ then go to Step 2.
(6) Otherwise set $T(B) = T_\ell$ and stop.
For large $\ell$ Step 3 does not change the tree. For such $\ell$, Step 4 can only be applied finitely many times since there are finitely many elements of $P^+$ dominated by a given element of $P^+$. Hence the algorithm terminates.

There is a bijection from the nodes of $T(B)$ to the configurations $C(B) = \bigcup_{\lambda \in P^+} C(B, \lambda)$ given as follows. Let $x$ be a node at depth $p$ in $T(B)$ of weight $\lambda$. Let $\lambda^{(0)}, \lambda^{(1)}, \dots, \lambda^{(p)} = \lambda$ be the weights of the nodes on the path from the root of $T(B)$ to $x$. Then the configuration $\nu \in C(B, \lambda)$ corresponding to $x$ is defined by

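$$m_i^{(a)} = (\lambda^{(i-1)} - 2\lambda^{(i)} + \lambda^{(i+1)} \mid \Lambda_a), \tag{5.1}$$
where we make the convention that $\lambda = \lambda^{(p+1)} = \lambda^{(p+2)} = \cdots$. The vacancy numbers are given by
$$p_i^{(a)} = -\sum_{j > i} (j - i)\, L_j^{(a)} + (\lambda^{(i)} \mid \alpha_a). \tag{5.2}$$
The sum accounts for the shifts that Step 3 applies to a node of depth $i$ after it has been created, so that (5.2) agrees with (4.3).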
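Steps 1 to 6 and the recovery rule (5.1) can be sketched in Python for the classical type $A_n$. This is a brute-force illustration under stated assumptions, not an optimized implementation: weights are lists of fundamental-weight coefficients, edge labels $d_{xy}$ are stored in simple-root coordinates, and candidate children are enumerated inside a bound obtained from the inverse Cartan matrix of $A_n$. The names `cartan_A`, `coefficient_bounds`, `kleber_tree` and `configuration` are hypothetical.

```python
from itertools import product


def cartan_A(n):
    """Cartan matrix of A_n; row a gives alpha_a in fundamental-weight coordinates."""
    return [[2 if a == b else -1 if abs(a - b) == 1 else 0 for b in range(n)]
            for a in range(n)]


def coefficient_bounds(weight):
    """Bound on each c[a] such that weight - sum_a c[a] * alpha_a can be dominant."""
    n = len(weight)
    # Inverse Cartan matrix of A_n: min(a, b) * (n + 1 - max(a, b)) / (n + 1).
    return [sum(min(a, b) * (n + 1 - max(a, b)) * weight[b - 1]
                for b in range(1, n + 1)) // (n + 1) for a in range(1, n + 1)]


def kleber_tree(n, L):
    """L maps (a, i) to the number of tensor factors B^{a,i}; returns all nodes."""
    C = cartan_A(n)
    top = max(i for (_, i) in L)
    nodes = [{"weight": [0] * n, "depth": 0, "parent": None, "d": None}]   # Step 1
    ell = 0
    while True:
        ell += 1                                                           # Step 2
        shift = [sum(L.get((a, i), 0) for i in range(ell, top + 1))
                 for a in range(1, n + 1)]
        for x in nodes:                                                    # Step 3
            x["weight"] = [w + s for w, s in zip(x["weight"], shift)]
        added = False
        for x in [x for x in nodes if x["depth"] == ell - 1]:              # Step 4
            bound = coefficient_bounds(x["weight"])
            if x["d"] is not None:              # non-root: d_wx - d_xy must lie in Q^+
                bound = [min(u, v) for u, v in zip(bound, x["d"])]
            for c in product(*(range(u + 1) for u in bound)):
                if not any(c):
                    continue                    # d_xy must be nonzero
                tau = [x["weight"][b] - sum(c[a] * C[a][b] for a in range(n))
                       for b in range(n)]
                if min(tau) >= 0:               # tau must be dominant
                    nodes.append({"weight": tau, "depth": ell, "parent": x, "d": list(c)})
                    added = True
        if not added and not any(shift):                                   # Steps 5, 6
            return nodes


def configuration(node):
    """Recover (m_i^(a)) via (5.1): m_i = d_i - d_{i+1} in simple-root coordinates."""
    edges = []
    while node["parent"] is not None:
        edges.append(node["d"])
        node = node["parent"]
    edges.reverse()                             # edges[i-1] labels the edge into depth i
    edges.append([0] * len(edges[0]) if edges else [])
    return [[u - v for u, v in zip(edges[i], edges[i + 1])] for i in range(len(edges) - 1)]
```

For type $A_1$ and $B = (B^{1,1})^{\otimes 4}$, `kleber_tree(1, {(1, 1): 4})` produces a root of weight $4\Lambda_1$ with children of weights $2\Lambda_1$ and $0$, and one further node of weight $0$ below $2\Lambda_1$. The two nodes of weight $0$ correspond to the configurations with $m_1^{(1)} = 2$ and with $m_2^{(1)} = 1$.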

Suppose we are only interested in finding $C(B, \lambda)$ for a particular $\lambda \in P^+$. It is wasteful to generate the entire tree $T(B)$ and then select the nodes of weight $\lambda$. Because the weight of a node dominates that of any of its children, we can prune the tree as follows. In Step 4, we only add a node of weight $\tau$ at depth $\ell$ if
$$\tau' := \tau + \sum_{a} \Lambda_a \sum_{j > \ell} (j - \ell)\, L_j^{(a)} \trianglerighteq \lambda. \tag{5.3}$$

There is another condition under which we can prune. Suppose that in the absence of pruning, we would have added a node $y$ of weight $\tau$ at depth $\ell$ in Step 4, with parent $x$. Then we do not add $y$ if there is an $a$ such that $(\tau' - \lambda \mid \Lambda_a) > 0$ and $(d_{xy} \mid \Lambda_a) = 0$. For in this case, the condition in Step 4 prevents one from reaching the weight $\lambda$ as a descendant of $\tau$.

Example 5.1. Let $B = B^{3,2} \otimes B^{2,1} \otimes B^{1,1} \otimes B^{1,1}$ of type $A_3^{(1)}$. The Kleber algorithm produces the tree $T(B)$ given in Fig. 2. The corresponding configurations are given in the following diagram, where we represent $\nu$ as a sequence of partitions $\nu^{(a)}$ with $m_i^{(a)}$ rows of length $i$. The vacancy number $p_i^{(a)}$ is placed to the right of a row of length $i$ in $\nu^{(a)}$.

Fig. 2. Tree $T(B)$

5.2. Virtual Kleber algorithm. Outside of the simply-laced case, Kleber's algorithm does not directly apply. However we use the embeddings of affine algebras into those of simply-laced type, where Kleber's algorithm can be applied. We call our method the virtual Kleber algorithm.

Let $X$ and $Y$ be as in (1.1). Theorem 4.2 defines a bijection
$$C(B, \lambda) \cong C^v(B, \lambda),$$
where $C^v(B, \lambda)$ consists of the configurations $\hat\nu \in C(\widehat{V}, \Psi(\lambda))$ constrained as in Definition 4.1, or equivalently, the $\hat\nu$ such that $(\hat\nu, \hat J) \in \mathrm{RC}^v(B, \lambda)$ for some $\hat J$.

A naive approach would be to run Kleber's algorithm to compute the set $C(\widehat{V}, \Psi(\lambda))$ and then to select the desired subset $C^v(B, \lambda)$. A more efficient way is to prune the branches that cannot contain elements of $C^v(B, \lambda)$. This results in a good algorithm to find $C^v(B, \lambda)$ and therefore $M(B, \lambda)$ for any affine type.

More precisely, one only adds the child $y$ to the node $x$ in Step 4 at depth $\ell$ if:
(1) $(\mathrm{wt}(y) \mid \alpha_a) = (\mathrm{wt}(y) \mid \alpha_b)$ if $a$ and $b$ are in the same $\sigma$-orbit of $I^Y$.
(2) If $\ell - 1 \notin \gamma_a \mathbb{Z}$, then $d_{wx} = d_{xy}$, where $w$ is the parent of $x$.
These conditions are equivalent to those in Definition 4.1. Let $\widehat{T}(B)$ be the resulting tree.

Let $\gamma = \max_a \gamma_a$. Then there is a bijection between $C^v(B, \lambda)$ and the set of nodes $y$ of weight $\lambda$ in $\widehat{T}(B)$ that satisfy either of the following conditions:
(1) $y$ is at depth $\ell$ with $\ell \in \gamma \mathbb{Z}$, or
(2) $(d_{xy} \mid \Lambda_a) = 0$ for every $a$ such that $1 < \gamma = \gamma_a$, where $x$ is the parent of $y$.

Observe that for $\ell \notin \gamma \mathbb{Z}$, there may be nodes at depth $\ell$ in $\widehat{T}$ whose weights are not in the image of the embedding $\Psi : P^X \to P^Y$, but rather in a superlattice of index $\gamma$. These weights, which cannot appear in the final tree, are necessary as they allow the virtual Kleber algorithm to reach all of the desired weights.

Example 5.2. Let $X = C_2^{(1)}$, $Y = A_3^{(1)}$, $B = B^{1,2} \otimes B^{1,1} \otimes B^{2,1}$. The virtual Kleber algorithm produces the tree $\widehat{T}(B)$ given in Fig. 3. The nodes corresponding to elements of $C^v(B, \lambda)$ are circled. We list the configurations corresponding to the circled nodes, ordered by increasing depth and then from left to right. Here we represent $\hat\nu$ as a sequence of partitions $\hat\nu^{(a)}$ with $\hat m_i^{(a)}$ rows of length $i$. The vacancy number $\hat p_i^{(a)}$ is placed to the right of a row of length $i$ in $\hat\nu^{(a)}$.

Fig. 3. Tree $\widehat{T}(B)$

Acknowledgements. Most of this work was carried out as part of the Research in Pairs program of the Mathematisches Forschungsinstitut Oberwolfach in August 2002. AS and MS would like to thank the institute for the ideal working conditions during their stay. AS also thanks the University of Wuppertal and the Max-Planck-Institut für Mathematik in Bonn for hospitality, where this work was completed. MO was partially supported by Grant-in-Aid for Scientific Research (No. 14540026), JSPS. AS was partially supported by the Humboldt Foundation and NSF grant DMS-0200774. MS was partially supported by NSF grant DMS-0100918.

References

1. Akasaka, T., Kashiwara, M.: Finite-dimensional representations of quantum affine algebras. Publ. RIMS, Kyoto Univ. 33, 839–867 (1997)
2. Baker, T.: Zero actions and energy functions for perfect crystals. Publ. Res. Inst. Math. Sci. 36(4), 533–572 (2000)
3. Chari, V.: On the fermionic formula and the Kirillov-Reshetikhin conjecture. Internat. Math. Res. Notices, no. 12, 629–65 (2001)
4. Drinfeld, V.G.: Hopf algebra and the Yang–Baxter equation. Soviet. Math. Dokl. 32, 254–258 (1985)
5. Hatayama, G., Kuniba, A., Okado, M., Takagi, T.: Combinatorial R matrices for a family of crystals: $B_n^{(1)}$, $D_n^{(1)}$, $A_{2n}^{(2)}$, and $D_{n+1}^{(2)}$ cases. J. Algebra 247(2), 577–615 (2002)
6. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Tsuboi, Z.: Paths, crystals and fermionic formulae. Prog. Math. Phys. 23, Boston, MA: Birkhäuser Boston, 2002, pp. 205–272
7. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Yamada, Y.: Remarks on fermionic formula. Contemp. Math. 248, 243–291 (1999)
8. Jimbo, M.: A q-difference analogue of $U(\mathfrak{g})$ and the Yang–Baxter equation. Lett. Math. Phys. 10, 63–69 (1985)
9. Jimbo, M., Miwa, T.: On a duality of branching rules for affine Lie algebras. Adv. Studies in Pure Math. 6, 17–65 (1985)
10. Kac, V.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
11. Kashiwara, M.: Crystalizing the q-analogue of universal enveloping algebras. Commun. Math. Phys. 133, 249–260 (1990)
12. Kashiwara, M.: On crystal bases of the q-analogue of universal enveloping algebras. Duke Math. J. 63, 465–516 (1991)
13. Kashiwara, M.: On crystal bases. In: Representations of Groups (Banff, AB, 1994), CMS Conf. Proc. 16, Providence, RI: Am. Math. Soc., 1995, pp. 155–197
14. Kashiwara, M.: Similarity of crystal bases. Contemp. Math. 194, 177–186 (1996)
15. Kashiwara, M.: On level zero representations of quantized affine algebras. Duke Math. J. 112, 117–195 (2002)
16. Kleber, M.: Combinatorial structure of finite dimensional representations of Yangians: The simply-laced case. Internat. Math. Res. Notices, no. 4, 187–201 (1997)
17. Kleber, M.: Finite dimensional representations of quantum affine algebras. Ph.D. dissertation at University of California Berkeley, 55 pages, 1998, math.QA/9809087
18. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic quasi-particle representations for characters of $(G^{(1)})_1 \times (G^{(1)})_1 / (G^{(1)})_2$. Phys. Lett. B 304(3–4), 263–270 (1993)
19. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic sum representations for conformal field theory characters. Phys. Lett. B 307(1–2), 68–76 (1993)
20. Kang, S.-J., Kashiwara, M., Misra, K.C.: Crystal bases of Verma modules for quantum affine Lie algebras. Compositio Math. 92, 299–325 (1994)
21. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Perfect crystals of quantum affine Lie algebras. Duke Math. J. 68(3), 499–607 (1992)
22. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Affine crystals and vertex models. Int. J. Mod. Phys. A7, (suppl. 1A), 449–484 (1992)
23. Kashiwara, M., Nakashima, T.: Crystal graphs for representations of the q-analogue of classical Lie algebras. J. Algebra 165(2), 295–345 (1994)
24. Kirillov, A.N., Reshetikhin, N.Y.: Representations of Yangians and multiplicities of the inclusion of the irreducible components of the tensor product of representations of simple Lie algebras. (Russian). Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 160 (1987), Anal. Teor. Chisel i Teor. Funktsii. 8, 211–221, 301; translation in J. Soviet Math. 52(3), 3156–3164 (1990)
25. Kirillov, A.N., Reshetikhin, N.Y.: The Bethe ansatz and the combinatorics of Young tableaux. (Russian) Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 155 (1986), Differentsialnaya Geometriya, Gruppy Li i Mekh. VIII, 65–115, 194; translation in J. Soviet Math. 41(2), 925–955 (1988)
26. Lecouvey, C.: Schensted-type correspondences and plactic monoids for types $B_n$ and $D_n$. Preprint
27. Nakajima, H.: t-analogs of q-characters of Kirillov-Reshetikhin modules of quantum affine algebras. Preprint math.QA/0204185
28. Nakayashiki, A., Yamada, Y.: Kostka polynomials and energy functions in solvable lattice models. Selecta Math. (N.S.) 3, 547–599 (1997)
29. Okado, M., Schilling, A., Shimozono, M.: Virtual crystals and fermionic formulas of type $D_{n+1}^{(2)}$, (2)

(1)

A2n , and Cn . Representation Theory, 7, 101–163 (2003) Communicated by L. Takhtajan

Commun. Math. Phys. 238, 211–223 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0859-8


Existence of Global Weak Solutions for a 2D Viscous Shallow Water Equations and Convergence to the Quasi-Geostrophic Model

Didier Bresch¹, Benoît Desjardins²

¹ Laboratoire de Mathématiques Appliquées, Université Blaise Pascal et C.N.R.S., 63177 Aubière cedex, France
² CEA/DIF, B.P. 12, 91680 Bruyères le Châtel, France

Received: 4 October 2002 / Accepted: 22 January 2003
Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: We consider a two-dimensional viscous shallow water model with friction term. Existence of global weak solutions is obtained, and convergence to the strong solution of the viscous quasi-geostrophic equation with free surface term is proven in the well-prepared case. The ill-prepared data case is also discussed.

1. Introduction

We consider the viscous shallow water model in a bounded two-dimensional domain with periodic boundary conditions, that is, $\Omega = \mathbb{T}^2$. This model, also called the Saint-Venant equations among the French scientific community, is commonly used in oceanography. It is aimed at describing vertically averaged flows in three-dimensional shallow domains in terms of the horizontal mean velocity field $u$ and the depth variation $h$ due to the free surface. In the rotating framework, a particular model reads as

$$
\begin{cases}
\partial_t h + \operatorname{div}(hu) = 0,\\[4pt]
\partial_t(hu) + \operatorname{div}(hu\otimes u) + \dfrac{(hu)^\perp}{\mathrm{Ro}} + r_0 u + r_1 h|u|u - \kappa h\nabla\Delta h + \dfrac{h\nabla h}{\mathrm{Fr}^2} - \nu\operatorname{div}(h\nabla u) = hf,
\end{cases}
\tag{1}
$$

where $\mathrm{Fr} > 0$ denotes the Froude number, $\mathrm{Ro} > 0$ the Rossby number, and $\kappa \ge 0$ the capillary coefficient. System (1) is supplemented with initial conditions

$$
h|_{t=0} = h_0, \qquad (hu)|_{t=0} = m_0. \tag{2}
$$

This model is derived from the three-dimensional Navier-Stokes equations with free surface, where the normal stress is determined from the air pressure and capillary effects. The drag terms $r_0 u$ in the laminar case ($r_0 \ge 0$), and $r_1 h|u|u$ in the turbulent regime ($r_1 \ge 0$) are obtained from the friction condition on the bottom, see [21]. The Saint-Venant system (1) without Coriolis force but with a friction term of the form $r_0 u/\bigl(1 + (4r_0 h/3\nu)\bigr)$ is formally derived in the one-dimensional case for laminar flows in [15]. Some numerical simulations are also given. Notice that our analysis may be performed with this kind of friction term instead of $r_0 u$. Moreover the diffusive tensor $h\nabla u$ may be replaced by $hD(u)$, where $D(u) = (\nabla u + {}^t\nabla u)/2$, without extra difficulties, see [3]. Note that the one-dimensional version of (1) may be used for the study of pollutant dispersion in rivers with free surface.

The energy inequality associated with System (1) reads as

$$
\int_\Omega \Bigl( \frac{h^2}{2\mathrm{Fr}^2} + \kappa\frac{|\nabla h|^2}{2} + h\frac{|u|^2}{2} \Bigr)
+ \int_0^t\!\!\int_\Omega \nu h|\nabla u|^2
+ \int_0^t\!\!\int_\Omega \bigl( r_0|u|^2 + r_1 h|u|^3 \bigr)
\le \int_\Omega \Bigl( \frac{h_0^2}{2\mathrm{Fr}^2} + \kappa\frac{|\nabla h_0|^2}{2} + h_0\frac{|u_0|^2}{2} \Bigr)
+ \int_0^t\!\!\int_\Omega hf\cdot u. \tag{3}
$$

In the sequel, we assume $f = 0$ without loss of generality since all the analysis can be extended to the case of regular enough $f$. The initial data are taken in such a way that

$$
h_0 \in L^2(\Omega), \quad \nabla\sqrt{h_0} \in (L^2(\Omega))^2, \quad \frac{|m_0|^2}{h_0} \in L^1(\Omega), \quad \sqrt{\kappa}\,\nabla h_0 \in (L^2(\Omega))^2, \quad -r_0\log_- h_0 \in L^1(\Omega), \tag{4}
$$

where $m_0 = 0$ on $h_0^{-1}(\{0\})$ and $\log_- g = \log\min(g, 1)$. We shall say that $(h, u)$ is a weak solution on $(0, T)$ of (1) if the following three conditions are fulfilled:

– (2) holds in $\mathcal{D}'(\Omega)$,
– (3) is satisfied for a.e. non-negative $t$,
– System (1) holds in $(\mathcal{D}'((0, T)\times\Omega))^3$ and the following regularity properties are satisfied:

$$
\begin{aligned}
&\sqrt{h}\,u \in L^\infty(0,T;(L^2(\Omega))^2), && \nabla\sqrt{h} \in L^\infty(0,T;(L^2(\Omega))^2),\\
&\sqrt{h}\,\nabla u \in L^2(0,T;(L^2(\Omega))^4), && \nabla h \in L^2(0,T;(L^2(\Omega))^2),\\
&\sqrt{r_0}\,u \in L^2(0,T;(L^2(\Omega))^2), && r_1^{1/3}h^{1/3}u \in L^3(0,T;(L^3(\Omega))^2),\\
&\sqrt{\kappa}\,\nabla^2 h \in L^2(0,T;(L^2(\Omega))^4).
\end{aligned}
\tag{5}
$$

Before investigating the case of vanishing Froude and Rossby numbers, we establish an existence result for given physical parameters.

Theorem 1. Let $m_0$, $h_0$ satisfy (4) and assume that either $\kappa > 0$ or $r_1 > 0$. Then, there exists a global weak solution of (1).

Let us recall that existence of smooth solutions for small enough time or data close to equilibrium was proven in [23] with the same viscous term but without friction or capillary effects ($r_0 = 0$, $r_1 = 0$ and $\kappa = 0$). The various possible assumptions in Theorem 1 mean that some physical effects are mathematically important: laminar friction takes care of vanishing depth $h$, whereas either quadratic friction or capillarity seems to be necessary for stability in dimension 2. Other ways of modelling viscous effects have been studied from a mathematical viewpoint, see [18] p. 251 and [2]. For instance the case $-\nu h\Delta u$ was investigated by


[20], in which existence of weak solutions with small enough data was obtained. Unlike System (1), the corresponding model allows one to divide the momentum equation by $h$. The counterpart of this simplification is some energetic inconsistency which requires the smallness assumption on the data. The reader is referred to [17] for another viscous parametrization. There also exist some intermediate models named balance models, see for instance [14]. The choice of System (1) is motivated by its energetic consistency, which has been stressed from a physical point of view in [13]. The reader is also referred to [4] for other systems with well-balanced energy estimates related to the consistency hypothesis between the stress tensor and the closure of the potential part of the velocity in terms of the density. These models concern the evolution of pollutants or the framework of combustion in the low Mach regime. For System (1), contrary to the previous work, the viscous part given by $-\nu\operatorname{div}(h\nabla u)$ and the friction term $r_0 u$ are physically justified since they can be derived from the three-dimensional Navier-Stokes equations with free surface and friction condition on the bottom, see [15]. Surprisingly these two terms turn out to be essential from a mathematical point of view not only to ensure the stability of weak solutions but also to derive the quasi-geostrophic equations with free surface term used in oceanography.

More precisely, assuming $\mathrm{Fr} = \mathrm{Ro} = \varepsilon$ and $h^\varepsilon \equiv h = 1 + \varepsilon F h^\varepsilon$, well-prepared initial data and letting $\varepsilon$ go to 0, we get the quasi-geostrophic equation with free surface term

$$
\frac{d}{dt}(\xi - F\Psi) - \nu\Delta\xi + r_0\xi + r_1\nabla^\perp\cdot(|u|u) = 0,
\qquad \text{with } \frac{d}{dt} = \partial_t + u\cdot\nabla, \quad u = \nabla^\perp\Psi, \quad \text{and } \Delta\Psi = \xi, \tag{6}
$$

where $\nabla^\perp = (-\partial_y, \partial_x)$. This equation may be written in terms of the velocity

$$
\begin{cases}
\partial_t u + (u\cdot\nabla)u - \nu\Delta u + r_0 u + r_1|u|u + \nabla p - \partial_t\Delta^{-1}u = 0,\\
\operatorname{div} u = 0,\\
u|_{t=0} = u_0.
\end{cases}
\tag{7}
$$

Recall that the rigorous derivation of the quasi-geostrophic equations from the three-dimensional Navier-Stokes equations with free surface is still open. Mathematical results are known only with the rigid lid assumption, see for instance [5, 16, 19, 7]. Here, we get the quasi-geostrophic equations with the free surface term $-F\Psi$ from the global weak solutions of the viscous shallow water equations. Such asymptotics have been performed in the inviscid case by [22, 11, 12 and 1]. The well-prepared case is studied in a bounded domain in the first paper and propagation of waves in a two-dimensional periodic domain is investigated in the last two papers. In that case, the momentum equation is divided by $h$. Here we give a result in the viscous case with well-prepared data, where the momentum equation cannot be divided by $h$ since the diffusive term is chosen to be $-\operatorname{div}(h\nabla u)$. Remark that there exist other quasi-geostrophic models either in dimension 2, see [8], or in dimension 3 if stratification is considered, see [9] and [6]. Here we shall prove the following asymptotic result:

Theorem 2. Let $u_0 \in H^2(\Omega)$ be such that $u_0 = \nabla^\perp\Psi_0$. Assuming $(m_0^\varepsilon, h_0^\varepsilon)$ satisfying (4) uniformly in $\varepsilon$ where $m_0^\varepsilon = h_0^\varepsilon u_0^\varepsilon$ and


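$$
(u_0^\varepsilon, h_0^\varepsilon) \to (u_0, 1) \ \text{in } (L^2(\Omega))^3, \qquad
(h_0^\varepsilon - 1)/\varepsilon \to \Psi_0 \ \text{in } L^2(\Omega), \qquad
\sqrt{\kappa}\,\nabla h_0^\varepsilon \to 0 \ \text{in } (L^2(\Omega))^2,
$$

then, denoting by $(u^\varepsilon, h^\varepsilon)$ a global weak solution of (1),

$$
\begin{aligned}
&u^\varepsilon \to u \ \text{in } L^\infty(0,T;(L^2(\Omega))^2), &&
(h^\varepsilon - 1)/\varepsilon \to \Psi \ \text{in } L^\infty(0,T;L^2(\Omega)),\\
&\nabla h^\varepsilon \to 0 \ \text{in } L^2(0,T;(L^2(\Omega))^2), &&
\sqrt{\kappa}\,\nabla h^\varepsilon \to 0 \ \text{in } L^\infty(0,T;(L^2(\Omega))^2)
\end{aligned}
$$

when $\varepsilon \to 0$, where $u = \nabla^\perp\Psi$ is the global strong solution of the quasi-geostrophic equation (7) with the initial data $u_0$.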
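The relation $u = \nabla^\perp\Psi$ in Theorem 2 can be motivated by a formal balance of the singular terms of (1). The following is only a heuristic sketch, not part of the proof. It assumes $\mathrm{Fr} = \mathrm{Ro} = \varepsilon$, writes $\Psi^\varepsilon = (h^\varepsilon - 1)/\varepsilon$, and uses the convention $v^\perp = (-v_2, v_1)$, which is an assumption chosen to match $\nabla^\perp = (-\partial_y, \partial_x)$.

```latex
\begin{align*}
% Singular terms of the momentum equation in (1) when Fr = Ro = \varepsilon:
\frac{(h^\varepsilon u^\varepsilon)^\perp}{\varepsilon}
  + \frac{h^\varepsilon\nabla h^\varepsilon}{\varepsilon^{2}}
  &= \frac{h^\varepsilon}{\varepsilon}\,
     \bigl( (u^\varepsilon)^\perp + \nabla\Psi^\varepsilon \bigr),
  \qquad \Psi^\varepsilon = \frac{h^\varepsilon - 1}{\varepsilon}, \\
% If all other terms stay bounded as \varepsilon \to 0, the bracket must vanish in the limit:
u^\perp + \nabla\Psi &= 0, \\
% With v^\perp = (-v_2, v_1) this is the geostrophic balance
u &= (-\partial_y\Psi,\ \partial_x\Psi) = \nabla^\perp\Psi,
  \qquad \operatorname{div} u = 0 .
\end{align*}
```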

2. Existence of Global Weak Solutions

This section is devoted to the proof of global existence of weak solutions to (1). The first step is to obtain suitable a priori bounds on $(h, u)$ and next to consider sequences $(h_n, u_n)$ of uniformly bounded weak solutions constructed from an adapted approximating process, see [18]. Such sequences may be built by using the regularization of capillary effects. Indeed, existence of global weak solutions to the so-called Korteweg model [3] has been obtained in two or three space dimensions. The final step is to obtain compactness on $(h_n, u_n)$ and prove that the limit $(h, u)$ solves Eqs. (1) in the distribution sense.

A priori bounds. The physical energy inequality (3) is obtained in a classical way by multiplying the momentum equation by $u$, using the mass equation and integrating by parts. More information is needed on $h$ to get compactness. This is the key point that has been formerly used in [3] in the framework of capillary models without Coriolis force and friction term. This additional a priori bound for the Korteweg model provided an $L^2(0,T;H^2(\Omega))$ bound on $h$ due to capillarity, which made it possible to deal with the difficulty of vanishing $h$. As we shall see, even if $L^2(0,T;H^2(\Omega))$ bounds are not always available for the shallow water model, the turbulent friction term combined with the non-capillary estimates will take care of possible concentrations on $h^{-1}(\{0\})$. The energy inequality provides the following uniform estimates:

$$
\begin{aligned}
&\|\sqrt{h_n}\,u_n\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, &&
\|\sqrt{h_n}\,\nabla u_n\|_{L^2(0,T;(L^2(\Omega))^4)} \le c,\\
&\sqrt{\kappa}\,\|\nabla h_n\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, &&
\Bigl\|\frac{h_n}{\mathrm{Fr}}\Bigr\|_{L^\infty(0,T;L^2(\Omega))} \le c,\\
&\sqrt{r_0}\,\|u_n\|_{L^2(0,T;(L^2(\Omega))^2)} \le c, &&
r_1^{1/3}\|h_n^{1/3}u_n\|_{L^3(0,T;(L^3(\Omega))^2)} \le c.
\end{aligned}
\tag{8}
$$

The additional estimate comes from Lemma 5. Integrating (17) with respect to $t$, we get

$$
\int_0^t\!\!\int_\Omega \frac{|\nabla h_n|^2}{\mathrm{Fr}^2}
+ \nu\int_\Omega |\nabla\sqrt{h_n}|^2
+ \int_0^t\!\!\int_\Omega \kappa|\nabla^2 h_n|^2
+ \int_0^t\!\!\int_\Omega \frac{u_n^\perp\cdot\nabla h_n}{\mathrm{Ro}}
+ r_1\int_0^t\!\!\int_\Omega |u_n|u_n\cdot\nabla h_n
- r_0\int_\Omega \log h_n + r_0\int_\Omega \log h_n^0 \le c
$$


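with $c$ independent of $n$. First, we observe that

$$
\Bigl| \int_0^t\!\!\int_\Omega \frac{u_n^\perp\cdot\nabla h_n}{\mathrm{Ro}} \Bigr|
\le \frac{\mathrm{Fr}^2}{2\mathrm{Ro}^2}\,\|u_n\|^2_{L^2(0,T;(L^2(\Omega))^2)}
+ \frac12\,\Bigl\|\frac{\nabla h_n}{\mathrm{Fr}}\Bigr\|^2_{L^2(0,t;(L^2(\Omega))^2)},
$$

so that the last term in the right-hand side will be absorbed and the uniform $L^2(0,T;(L^2(\Omega))^2)$ estimate on $u_n$ will be used. Next, we observe that

$$
r_1\int_0^t\!\!\int_\Omega |u_n|u_n\cdot\nabla h_n
= -r_1\int_0^t\!\!\int_\Omega h_n\Bigl( |u_n|\operatorname{div}u_n + (u_n\cdot\nabla)u_n\cdot\frac{u_n}{|u_n|} \Bigr),
$$

so that

$$
\nu r_1\Bigl| \int_0^t\!\!\int_\Omega |u_n|u_n\cdot\nabla h_n \Bigr|
\le C\,\|\sqrt{h_n}\,u_n\|_{L^\infty(0,T;(L^2(\Omega))^2)}\,\|\sqrt{h_n}\,\nabla u_n\|_{L^2(0,T;(L^2(\Omega))^4)}.
$$

Therefore we get

$$
\int_0^t\!\!\int_\Omega \frac{|\nabla h_n|^2}{2\mathrm{Fr}^2}
+ \nu\int_\Omega |\nabla\sqrt{h_n}|^2
- r_0\int_\Omega \log h_n + r_0\int_\Omega \log h_n^0
+ \int_0^t\!\!\int_\Omega \kappa|\nabla^2 h_n|^2 \le c.
$$

Now, since $h_n$ is uniformly bounded in $L^\infty(0,T;L^2(\Omega))$, we have

$$
r_0\int_\Omega \log_+ h_n \le c, \qquad \text{where } \log_+ g = \log\max(g, 1).
$$

It remains to assume that $-r_0\log_- h_n^0$ is uniformly bounded in $L^1$ to control $-\nu r_0\int_\Omega \log_- h_n$. This gives the following extra estimates on $h_n$:

$$
\|\nabla\sqrt{h_n}\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, \qquad
\Bigl\|\frac{\nabla h_n}{\mathrm{Fr}}\Bigr\|_{L^2(0,T;(L^2(\Omega))^2)} \le c, \qquad
\sqrt{\kappa}\,\|\nabla^2 h_n\|_{L^2(0,T;(L^2(\Omega))^4)} \le c.
\tag{9}
$$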
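The two bounds used above to close the estimate are elementary. The following sketch spells them out, assuming periodic boundary conditions so that no boundary terms appear; in the first line the unlabelled norms are taken in $L^2(0,t;(L^2(\Omega))^2)$, a shorthand introduced here only for readability.

```latex
\begin{align*}
% Coriolis term: Cauchy--Schwarz, then Young's inequality ab \le a^2/2 + b^2/2
% with a = (Fr/Ro)\|u_n\| and b = \|\nabla h_n\|/Fr; note |u_n^\perp| = |u_n|.
\Bigl| \int_0^t\!\!\int_\Omega \frac{u_n^\perp\cdot\nabla h_n}{\mathrm{Ro}} \Bigr|
  &\le \frac{\mathrm{Fr}}{\mathrm{Ro}}\,\|u_n\| \cdot \frac{\|\nabla h_n\|}{\mathrm{Fr}}
   \le \frac{\mathrm{Fr}^2}{2\mathrm{Ro}^2}\,\|u_n\|^2
     + \frac12\,\Bigl\|\frac{\nabla h_n}{\mathrm{Fr}}\Bigr\|^2, \\
% Turbulent drag: integrate by parts, then expand \operatorname{div}(|u_n| u_n).
\int_\Omega |u_n|u_n\cdot\nabla h_n
  &= -\int_\Omega h_n \operatorname{div}(|u_n|u_n)
   = -\int_\Omega h_n\Bigl( |u_n|\operatorname{div}u_n
       + (u_n\cdot\nabla)u_n\cdot\frac{u_n}{|u_n|} \Bigr), \\
% Each term in the bracket is bounded by C |u_n| |\nabla u_n|; split h_n = \sqrt{h_n}\sqrt{h_n}.
\Bigl| \int_\Omega |u_n|u_n\cdot\nabla h_n \Bigr|
  &\le C\int_\Omega \sqrt{h_n}\,|u_n|\;\sqrt{h_n}\,|\nabla u_n|
   \le C\,\|\sqrt{h_n}\,u_n\|_{L^2(\Omega)}\,\|\sqrt{h_n}\,\nabla u_n\|_{L^2(\Omega)} .
\end{align*}
```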

Compactness. Given the preceding a priori bounds, we are now able to study the compactness of $(h_n, u_n)$ and pass to the limit in nonlinear terms. Using the continuity equation on $h_n$, we deduce that $\partial_t h_n$ is bounded in $L^\infty(0,T;H^{-2}(\Omega))$, which combined with the uniform $L^2(0,T;(L^2(\Omega))^2)$ bound on $\nabla h_n$ gives strong compactness of $h_n$ to some $h$ in $C(0,T;H^s(\Omega))$ for all $s < 1$. In addition, since $u_n$ is uniformly bounded in $L^2(0,T;(L^2(\Omega))^2)$, it converges weakly to some $u \in L^2(0,T;(L^2(\Omega))^2)$ up to the extraction of a subsequence. The crucial step is now to prove the strong convergence of $\sqrt{h_n}\,u_n$ to $\sqrt{h}\,u$ in $L^2(0,T;(L^2(\Omega))^2)$.

For any $k \in \mathbb{N}$, the $k$-th Fourier projector $P_k$ is defined on $L^2(\Omega)$ as follows: if $\sum_{\ell\in\mathbb{Z}^2} c_\ell\exp(i\ell\cdot x)$ denotes the Fourier decomposition of $f \in L^2(\Omega)$, then $P_k f$ is given as the low frequency part of it, $\sum_{|\ell|\le k} c_\ell\exp(i\ell\cdot x)$. The following classical estimate will be useful in the sequel:

$$
\|f - P_k f\|_{L^2(\Omega)} \le \frac{C_p}{k^{2(1-1/p)}}\,\|\nabla f\|_{L^p(\Omega)}, \qquad \text{for all } p \in (1, 2).
$$
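One way to see where the exponent $2(1-1/p)$ comes from is the Sobolev embedding $W^{1,p} \subset H^s$ on the two-dimensional torus. The lines below are a sketch for orientation only, assuming that standard embedding (applied to $f$ minus its mean) and leaving constants implicit; they also check the value $p = 6/5$ used in the cutoff estimate that follows.

```latex
\begin{align*}
% Frequencies with |\ell| > k satisfy 1 \le |\ell|^{2s} k^{-2s} for any s > 0:
\sum_{|\ell|>k} |c_\ell|^2
  &\le k^{-2s} \sum_{|\ell|>k} |\ell|^{2s} |c_\ell|^2
  \quad\Longrightarrow\quad
  \|f - P_k f\|_{L^2(\Omega)} \le C\,k^{-s}\,\|f\|_{\dot H^{s}(\Omega)}, \\
% Sobolev embedding in dimension 2: 1 - 2/p = s - 2/2 fixes s.
s &= 2 - \frac{2}{p} = 2\Bigl(1 - \frac1p\Bigr) \in (0,1)
  \quad\text{for } p \in (1,2),
  \qquad \|f\|_{\dot H^{s}(\Omega)} \le C_p\,\|\nabla f\|_{L^p(\Omega)}, \\
% The value used for the cutoff estimate:
p &= \tfrac65 \ \Longrightarrow\ s = 2\Bigl(1 - \tfrac56\Bigr) = \tfrac13 .
\end{align*}
```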


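As a matter of fact, introducing $\beta \in C^\infty(\mathbb{R})$ such that $0 \le \beta \le 1$, $\beta(s) = 0$ for $s \le 1$ and $\beta(s) = 1$ for $s \ge 2$, we obtain the following estimate, denoting $\beta_\alpha(\cdot) = \beta(\cdot/\alpha)$ for any positive number $\alpha$:

$$
\begin{aligned}
\|\sqrt{h_n}\,u_n - P_k(\sqrt{h_n}\,u_n)\|_{L^2(0,T;(L^2(\Omega))^2)}
&\le \|\sqrt{h_n}\,u_n - \sqrt{h_n}\,\beta_\alpha(h_n)u_n\|_{L^2(0,T;(L^2(\Omega))^2)}\\
&\quad + \|\sqrt{h_n}\,\beta_\alpha(h_n)u_n - P_k(\sqrt{h_n}\,\beta_\alpha(h_n)u_n)\|_{L^2(0,T;(L^2(\Omega))^2)}\\
&\quad + \|P_k(\sqrt{h_n}\,u_n) - P_k(\sqrt{h_n}\,\beta_\alpha(h_n)u_n)\|_{L^2(0,T;(L^2(\Omega))^2)}\\
&\le C\sqrt{\alpha}\,\|u_n\|_{L^2(0,T;(L^2(\Omega))^2)}
+ \frac{C_p}{k^{1/3}}\,\|\nabla(\beta_\alpha(h_n)\sqrt{h_n}\,u_n)\|_{L^2(0,T;(L^{6/5}(\Omega))^2)}.
\end{aligned}
\tag{10}
$$

Indeed, in the case when $r_1 > 0$, the gradient can be estimated as follows:

$$
\begin{aligned}
\|\nabla(\beta_\alpha(h_n)\sqrt{h_n}\,u_n)\|_{L^2(0,T;(L^{6/5}(\Omega))^2)}
&\le C\,\|\beta_\alpha(h_n)\sqrt{h_n}\,\nabla u_n\|_{L^2(0,T;(L^2(\Omega))^2)}\\
&\quad + \|h_n^{1/3}u_n\|_{L^3(0,T;(L^3(\Omega))^2)}\,\|\nabla\sqrt{h_n}\|_{L^\infty(0,T;(L^2(\Omega))^2)}\\
&\qquad \times \|2\beta_\alpha'(h_n)h_n^{2/3} + \beta_\alpha(h_n)h_n^{-1/3}\|_{L^\infty((0,T)\times\Omega)},
\end{aligned}
$$

whereas in the capillary case when $\kappa > 0$, we may take advantage of the uniform $L^2(0,T;H^2(\Omega)) \cap L^\infty(0,T;H^1(\Omega))$ bound on $h_n$ to write

$$
\begin{aligned}
\|\nabla(\beta_\alpha(h_n)\sqrt{h_n}\,u_n)\|_{L^2(0,T;(L^{6/5}(\Omega))^2)}
&\le C\,\|\beta_\alpha(h_n)\sqrt{h_n}\,\nabla u_n\|_{L^2(0,T;(L^2(\Omega))^2)}\\
&\quad + C\,\|\sqrt{h_n}\,u_n\|_{L^\infty(0,T;(L^2(\Omega))^2)}\,\|\nabla h_n\|^{2/3}_{L^\infty(0,T;(L^2(\Omega))^2)}\,\|\nabla^2 h_n\|^{1/3}_{L^2(0,T;(L^2(\Omega))^4)}\\
&\qquad \times \Bigl\|\beta_\alpha'(h_n) + \frac{\beta_\alpha(h_n)}{2h_n}\Bigr\|_{L^\infty((0,T)\times\Omega)}.
\end{aligned}
$$

Therefore, the left-hand side of (10) may be estimated by

$$
C\sqrt{\alpha} + \frac{C_\alpha}{k^{1/3}},
$$

where the two above constants do not depend on $n$. It means that the high frequency part of $\sqrt{h_n}\,u_n$ is arbitrarily small in $L^2(0,T;(L^2(\Omega))^2)$ uniformly in $n$ for large enough wave number $k$. It remains to study the convergence of the product $P_k(\sqrt{h_n}\,u_n)\cdot\sqrt{h_n}\,u_n$ for a given $k \in \mathbb{N}$. Again, using cutoff functions such as $\beta_\alpha$ together with the uniform $L^2(0,T;(L^2(\Omega))^2)$ estimate on $u_n$, we only have to consider the weak limit of $P_k(\sqrt{h_n}\,u_n)\cdot h_n u_n$. Finally, observing that the momentum equation yields uniform $L^2(0,T;H^{-s}(\Omega))$ bounds on $\partial_t(h_n u_n)$ for large enough $s$, we deduce the strong $L^2(0,T;(L^2(\Omega))^2)$ convergence of $\sqrt{h_n}\,u_n$ to $\sqrt{h}\,u$.

Using the above compactness, we can pass to the limit in $h_n u_n$, $h_n u_n\otimes u_n$, $h_n^2$, $r_0 u_n$, $r_1 h_n|u_n|u_n$. It remains to prove the weak convergence of $h_n\nabla u_n$ to $h\nabla u$. First, this sequence may be rewritten as follows:

$$
h_n\nabla u_n = \nabla(h_n u_n) - 2\sqrt{h_n}\,u_n\nabla\sqrt{h_n},
$$

the first term of the right-hand side already converging weakly to $\nabla(hu)$, whereas the second one converges weakly to $2\sqrt{h}\,u\nabla\sqrt{h}$, since $\sqrt{h_n}\,u_n$ converges strongly in $L^2(0,T;(L^2(\Omega))^2)$ and $\nabla\sqrt{h_n}$ is uniformly bounded and weakly converges in $L^2(0,T;(L^2(\Omega))^2)$.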
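Two pointwise facts carry much of the compactness argument: the cutoff $\beta_\alpha$ only removes the region where $h_n$ is small, and $h_n\nabla u_n$ can be rewritten through $\sqrt{h_n}$. The sketch below checks both, written componentwise with indices $i, j$ introduced here for illustration; the explicit constant $\sqrt{2}$ in the first line is one admissible choice of the constant $C$.

```latex
\begin{align*}
% (i) 1 - \beta_\alpha(h_n) vanishes where h_n \ge 2\alpha, and 0 \le \beta \le 1, hence
\sqrt{h_n}\,\bigl|1 - \beta_\alpha(h_n)\bigr| &\le \sqrt{2\alpha}
  \quad\Longrightarrow\quad
  \|\sqrt{h_n}\,u_n - \sqrt{h_n}\,\beta_\alpha(h_n)u_n\|_{L^2(0,T;(L^2(\Omega))^2)}
  \le \sqrt{2\alpha}\,\|u_n\|_{L^2(0,T;(L^2(\Omega))^2)}, \\
% (ii) Product rule, then \nabla h_n = 2\sqrt{h_n}\,\nabla\sqrt{h_n}:
h_n\,\partial_j u_{n,i}
  &= \partial_j(h_n u_{n,i}) - u_{n,i}\,\partial_j h_n
   = \partial_j(h_n u_{n,i}) - 2\sqrt{h_n}\,u_{n,i}\,\partial_j\sqrt{h_n} .
\end{align*}
```

In (ii) the last product pairs a strongly convergent factor ($\sqrt{h_n}\,u_n$) with a weakly convergent one ($\nabla\sqrt{h_n}$), which is why the weak limit can be identified.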


In the capillary case, the nonlinear term $\kappa h_n\nabla\Delta h_n$ rewrites in a more suitable way as

$$
\kappa h_n\nabla\Delta h_n = \kappa\nabla\Bigl( \frac{\Delta h_n^2}{2} - \frac{|\nabla h_n|^2}{2} \Bigr) - \kappa\operatorname{div}(\nabla h_n\otimes\nabla h_n),
$$

so that a strong $L^2(0,T;(L^2(\Omega))^2)$ convergence of $\nabla h_n$ to $\nabla h$ suffices to pass to the limit in the above nonlinear terms. This strong compactness is a straightforward consequence of the uniform $L^2(0,T;(L^2(\Omega))^4)$ bound on $\nabla^2 h_n$ combined with the $L^2(0,T;H^{-2}(\Omega))$ uniform bound on $\partial_t h_n$.

Remarks. As emphasized in the previous compactness analysis, one of the key points to obtain the strong convergence of $\sqrt{h_n}\,u_n$ to $\sqrt{h}\,u$ is to be able to neglect the high frequency part of $\sqrt{h_n}\,u_n$ uniformly in $n$. In dimension one, we may write $\nabla(\sqrt{h_n}\,u_n) = \sqrt{h_n}\,\nabla u_n + u_n\nabla\sqrt{h_n}$, which means that $\nabla(\sqrt{h_n}\,u_n)$ is uniformly bounded in $L^2(0,T;(L^1(\Omega))^4)$. Sobolev embeddings in dimension one then make it possible to prove this crucial cutoff estimate. As a result, no surface tension or nonlinear friction is required to pass to the limit in $h_n u_n\otimes u_n$ and $h_n\nabla u_n$, the result being valid for $r_1 \ge 0$ and $\kappa \ge 0$. In particular, global existence of weak solutions is proven for the shallow water model derived in [15].

3. The Quasi–Geostrophic Limit

Recalling that $u = \nabla^\perp\Psi$, the energy equality satisfied by $u$ is

$$
\int_\Omega \bigl( |u|^2 + |\Psi|^2 \bigr) + \nu\int_0^t\!\!\int_\Omega |\nabla u|^2 + r_0\int_0^t\!\!\int_\Omega |u|^2 + r_1\int_0^t\!\!\int_\Omega |u|^3 = \int_\Omega \bigl( |u_0|^2 + |\Psi_0|^2 \bigr).
$$

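As usual we denote $\Psi^\varepsilon = (h^\varepsilon - 1)/\varepsilon$. Using the energy inequality for weak solutions (3), the energy equality for the limit solution, and the mass and momentum equations of weak solutions (1) tested against $(\Psi, u)$, we get the following estimate:

$$
\begin{aligned}
&\int_\Omega \Bigl( h^\varepsilon\frac{|u^\varepsilon - u|^2}{2} + \frac{|\Psi^\varepsilon - \Psi|^2}{2} + \kappa\frac{|\nabla h^\varepsilon|^2}{2} \Bigr)
+ \int_0^t\!\!\int_\Omega \nu h^\varepsilon|\nabla(u^\varepsilon - u)|^2\\
&\qquad + r_0\int_0^t\!\!\int_\Omega |u^\varepsilon - u|^2
+ r_1\int_0^t\!\!\int_\Omega h^\varepsilon\bigl( |u^\varepsilon|u^\varepsilon - |u|u \bigr)\cdot(u^\varepsilon - u)
\le \sum_{i=1}^{6} I_i,
\end{aligned}
$$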

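where

$$
\begin{aligned}
I_1 &= \int_\Omega h_0^\varepsilon\frac{|u_0^\varepsilon - u_0|^2}{2} + \int_\Omega \frac{|\Psi_0^\varepsilon - \Psi_0|^2}{2} - \int_\Omega (h_0^\varepsilon - 1)\frac{|u_0|^2}{2} + \kappa\int_\Omega \frac{|\nabla h_0^\varepsilon|^2}{2},\\
I_2 &= \int_0^t\!\!\int_\Omega \nu(h^\varepsilon - 1)|\nabla u|^2 + r_0\int_0^t\!\!\int_\Omega (h^\varepsilon - 1)u^\varepsilon\cdot u + r_1\int_0^t\!\!\int_\Omega (h^\varepsilon - 1)|u|^3\\
&\quad - \int_0^t\!\!\int_\Omega (h^\varepsilon - 1)\bigl( (u\cdot\nabla)u\cdot u^\varepsilon + (u^\varepsilon\cdot\nabla)u\cdot u \bigr) + \int_\Omega (h^\varepsilon - 1)\frac{|u|^2}{2}\\
&\quad + \int_0^t\!\!\int_\Omega (h^\varepsilon - 1)(u\cdot\nabla)u\cdot u - \int_0^t\!\!\int_\Omega (h^\varepsilon - 1)u^\varepsilon\cdot\partial_t\Delta^{-1}u,\\
I_3 &= \nu\int_0^t\!\!\int_\Omega (\nabla h^\varepsilon\cdot\nabla)u\cdot u^\varepsilon + \kappa\int_0^t\!\!\int_\Omega h^\varepsilon\nabla\Delta h^\varepsilon\cdot u,\\
I_4 &= -\int_0^t\!\!\int_\Omega h^\varepsilon\bigl( (u^\varepsilon - u)\cdot\nabla \bigr)u\cdot(u^\varepsilon - u),\\
I_5 &= -\int_0^t\!\!\int_\Omega (u\cdot\nabla)u\cdot(u^\varepsilon - u) - \int_0^t\!\!\int_\Omega \bigl( (u^\varepsilon - u)\cdot\nabla \bigr)u\cdot u,
\end{aligned}
$$

and

$$
I_6 = -\int_0^t\!\!\int_\Omega \Bigl( \frac{|u|^2}{2} + p \Bigr)\operatorname{div}(h^\varepsilon u^\varepsilon) - \int_0^t\!\!\int_\Omega (u^\varepsilon - \nabla^\perp\Psi^\varepsilon)\cdot(\partial_t\Delta^{-1}u).
$$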

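Let us assume for the moment that $h^\varepsilon \to 1$ in $L^\infty(0,T;L^2(\Omega))$, $\nabla h^\varepsilon \to 0$ in $L^2(0,T;(L^2(\Omega))^2)$, and $\sqrt{\kappa}\,\Delta h^\varepsilon$ is uniformly bounded in $L^2(0,T;(L^2(\Omega))^2)$, and let us prove the theorem, letting $\varepsilon$ go to 0 in the above estimate. We know that $I_1$ converges to 0 by the assumptions on the data. The group $I_2$ converges to 0 since $h^\varepsilon \to 1$ in $L^\infty(0,T;L^2(\Omega))$, $u^\varepsilon$ is uniformly bounded in $L^2(0,T;(L^2(\Omega))^2)$ and $u$ is smooth enough. The group $I_3$ converges to 0 since $\nabla h^\varepsilon \to 0$ in $L^2(0,T;(L^2(\Omega))^2)$, and $u^\varepsilon$ and $\sqrt{\kappa}\,\Delta h^\varepsilon$ are uniformly bounded in $L^2(0,T;(L^2(\Omega))^2)$. The group $I_4$ is controlled by a Gronwall-type argument since $\nabla u$ is smooth enough. The groups $I_5$ and $I_6$ converge to 0 using the weak convergence of $u^\varepsilon$ to $u$, of $u^\varepsilon - \nabla^\perp\Psi^\varepsilon$ to 0 and of $\operatorname{div}(h^\varepsilon u^\varepsilon)$ to 0, and the strong convergence of $h^\varepsilon$.

Let us now prove the uniform bounds on $(h^\varepsilon - 1)/\varepsilon$, $\nabla h^\varepsilon/\varepsilon$ and $\sqrt{\kappa}\,\Delta h^\varepsilon$. We follow exactly the same lines as in the previous section, looking carefully at the dependence with respect to the parameters $\varepsilon$, $\kappa$. From the classical energy inequality, we easily derive the following estimates:

$$
\begin{aligned}
&\|\sqrt{h^\varepsilon}\,u^\varepsilon\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, &&
\|h^\varepsilon - 1\|_{L^\infty(0,T;L^2(\Omega))} \le c\,\varepsilon,\\
&\|\sqrt{h^\varepsilon}\,\nabla u^\varepsilon\|_{L^2(0,T;(L^2(\Omega))^4)} \le c, &&
r_0^{1/2}\|u^\varepsilon\|_{L^2(0,T;(L^2(\Omega))^2)} \le c,\\
&\sqrt{\kappa}\,\|\nabla h^\varepsilon\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, &&
r_1^{1/3}\|(h^\varepsilon)^{1/3}u^\varepsilon\|_{L^3(0,T;(L^3(\Omega))^2)} \le c,
\end{aligned}
$$

with $c$ independent of $\varepsilon$. The presence of the term $r_0 u^\varepsilon$ gives the extra $L^2(0,T;(L^2(\Omega))^2)$ uniform estimate on $u^\varepsilon$, and the first estimate on $h^\varepsilon - 1$ is classical, coming from the pressure term. Using now Lemma 5 and integrating with respect to $t$, we get

$$
\begin{aligned}
&\int_0^t\!\!\int_\Omega \frac{|\nabla h^\varepsilon|^2}{\varepsilon^2}
+ \nu\int_\Omega |\nabla\sqrt{h^\varepsilon}|^2
+ \nu\int_0^t\!\!\int_\Omega \kappa|\nabla^2 h^\varepsilon|^2
+ \nu\int_0^t\!\!\int_\Omega \frac{(u^\varepsilon)^\perp\cdot\nabla h^\varepsilon}{\varepsilon}\\
&\qquad + \nu r_1\int_0^t\!\!\int_\Omega |u^\varepsilon|u^\varepsilon\cdot\nabla h^\varepsilon
- \nu r_0\int_\Omega \log h^\varepsilon + \nu r_0\int_\Omega \log h_0^\varepsilon \le c
\end{aligned}
$$

with $c$ independent of $\varepsilon$. First, we have

$$
\Bigl| \int_0^t\!\!\int_\Omega \frac{u^\varepsilon\cdot\nabla^\perp h^\varepsilon}{\varepsilon} \Bigr|
\le \frac12\,\|u^\varepsilon\|^2_{L^2(0,T;(L^2(\Omega))^2)}
+ \frac12\,\Bigl\|\frac{\nabla h^\varepsilon}{\varepsilon}\Bigr\|^2_{L^2(0,T;(L^2(\Omega))^2)},
$$

so that the last term will be absorbed by the left-hand side, and the uniform $L^2(0,T;(L^2(\Omega))^2)$ estimate on $u^\varepsilon$ will be used. We control the term $\nu r_1\int_0^t\!\int_\Omega |u^\varepsilon|u^\varepsilon\cdot\nabla h^\varepsilon$ as in the previous section, using the following uniform estimates with respect to $\varepsilon$:

$$
\|\sqrt{h^\varepsilon}\,u^\varepsilon\|_{L^\infty(0,T;(L^2(\Omega))^2)} \le c, \qquad
\|\sqrt{h^\varepsilon}\,\nabla u^\varepsilon\|_{L^2(0,T;(L^2(\Omega))^4)} \le c.
$$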


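Therefore we get

$$
\int_0^t\!\!\int_\Omega \frac{|\nabla h^\varepsilon|^2}{2\varepsilon^2}
+ \nu\int_\Omega |\nabla\sqrt{h^\varepsilon}|^2
+ \nu\int_0^t\!\!\int_\Omega \kappa|\nabla^2 h^\varepsilon|^2
- \nu r_0\int_\Omega \log h^\varepsilon + \nu r_0\int_\Omega \log h_0^\varepsilon \le c.
$$

Now, since $h^\varepsilon$ is uniformly bounded in $L^\infty(0,T;L^2(\Omega))$,

$$
r_0\int_\Omega \log_+ h^\varepsilon \le c.
$$

It remains to assume that $-r_0\log_- h_0^\varepsilon$ is uniformly bounded in $L^1$ to control $-\nu r_0\int_\Omega \log_- h^\varepsilon$. Let us observe that the assumption that solutions of the limit system are smooth enough is not restrictive, since in the framework of initial data with bounded energy, an $L^2$ stability result on solutions of the limit system (7) makes it possible to reduce to the case of smooth data.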

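The ill-prepared case. In the case of ill-prepared initial data, the propagation of waves has to be analyzed in the limit of small $\varepsilon$. It turns out that the parabolic property expressed in the preceding section as a cutoff lemma of high frequencies may again be used to reduce the problem to a finite number of modes in Fourier space (see for instance [10]). Indeed, a similar cutoff lemma on $\sqrt{h^\varepsilon}\,u^\varepsilon$ holds since the uniform estimates previously used are also uniform with respect to the Froude and Rossby numbers, and therefore uniform in $\varepsilon$. Thus, the following estimate holds uniformly in $\varepsilon$:

$$
\|\sqrt{h^\varepsilon}\,u^\varepsilon - P_k(\sqrt{h^\varepsilon}\,u^\varepsilon)\|_{L^2(0,T;(L^2(\Omega))^2)} \le C\sqrt{\alpha} + \frac{C_\alpha}{k^{1/3}},
$$

where $C$ and $C_\alpha$ do not depend on $k$ or $\varepsilon$. Notice that the same result applies to $h^\varepsilon u^\varepsilon$ instead of $\sqrt{h^\varepsilon}\,u^\varepsilon$ since for all given $\alpha > 0$, $\nabla(\beta_\alpha(h^\varepsilon)h^\varepsilon u^\varepsilon)$ is bounded uniformly in $\varepsilon$ in $L^2(0,T;(L^p(\Omega))^4)$ for all $p \in (1, 6/5)$. As a result, using the uniform $L^2(0,T;(L^2(\Omega))^2)$ bound on $\nabla\Psi^\varepsilon$, and denoting $m_k^\varepsilon = P_k(h^\varepsilon u^\varepsilon)$, $\Psi_k^\varepsilon = P_k\Psi^\varepsilon$, we may write

$$
\operatorname{div}(h^\varepsilon u^\varepsilon\otimes u^\varepsilon) + \frac{\nabla|\Psi^\varepsilon|^2}{2} - \nu\operatorname{div}(h^\varepsilon\nabla u^\varepsilon)
= \operatorname{div}(m_k^\varepsilon\otimes m_k^\varepsilon) + \frac{\nabla|\Psi_k^\varepsilon|^2}{2} - \nu\Delta m_k^\varepsilon + R_k^\varepsilon,
$$

where $R_k^\varepsilon$ satisfies:

$$
\exists s,\ \forall\eta > 0,\ \exists k,\ \exists\varepsilon_0,\ \forall\varepsilon \le \varepsilon_0, \qquad
\|R_k^\varepsilon\|_{L^2(0,T;(H^{-s}(\Omega))^2)} \le \eta.
$$

It means that the dynamics of the low frequency part of system (1) is described by

$$
\begin{cases}
\partial_t\Psi_k^\varepsilon + \dfrac{\operatorname{div} m_k^\varepsilon}{\varepsilon} = 0,\\[6pt]
\partial_t m_k^\varepsilon + \dfrac{\nabla\Psi_k^\varepsilon}{\varepsilon} + \dfrac{(m_k^\varepsilon)^\perp}{\varepsilon} - \nu\Delta m_k^\varepsilon + r_0 m_k^\varepsilon + r_1 P_k(|m_k^\varepsilon|m_k^\varepsilon)\\[6pt]
\qquad + P_k\Bigl( \dfrac{\nabla|\Psi_k^\varepsilon|^2}{2} + \operatorname{div}(m_k^\varepsilon\otimes m_k^\varepsilon) \Bigr) = P_k\tilde{R}_k^\varepsilon,
\end{cases}
$$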


where $\tilde{R}_k^\varepsilon$ satisfies the above convergence property with respect to $\varepsilon$. Notice that in the capillary case, the extra terms vanish in view of the uniform bound of $\nabla\Psi^\varepsilon$ in $L^2(0,T;(L^2(\Omega))^2)$. Finally, it remains to apply the classical filtering operator associated with the wave operator to the preceding system of ordinary differential equations. On the one hand, this shows that $P(h^\varepsilon u^\varepsilon)$ converges strongly in $L^2(0,T;(L^2(\Omega))^2)$ to a global weak solution $u$ of the limit quasi-geostrophic model (7), where $P$ denotes the projector on the kernel of the wave operator. On the other hand, the analysis of waves can be carried out in the very same way as in [11, 12], so that it will not be detailed again in the present work. As a byproduct, we obtain the global existence of weak solutions for the system of partial differential equations satisfied by the waves.

4. Appendix

The aim of this appendix is to prove technical lemmas that will yield crucial estimates for the proof of existence of global weak solutions and the proof of the convergence to the quasi-geostrophic model. These estimates have been derived in [3] for the Korteweg system without Coriolis force and friction terms.

Lemma 3. The following identity holds:

$$
\frac12\frac{d}{dt}\int_\Omega h|\nabla\log h|^2 + \int_\Omega \nabla\operatorname{div}u\cdot\nabla h + \int_\Omega h\nabla u : \nabla\log h\otimes\nabla\log h = 0. \tag{11}
$$

Proof. Differentiating the equation of mass conservation with respect to $x_i$, we get

$$
\partial_t(\partial_i\log h) + \sum_j \bigl( u_j\partial_j\partial_i\log h + \partial_i\partial_j u_j + \partial_i u_j\,\partial_j\log h \bigr) = 0.
$$

Multiplying this equation by $h\,\partial_i\log h$ and summing over $i$, this gives

$$
\frac12 h\,\partial_t|\nabla\log h|^2 + \frac12 h(u\cdot\nabla)|\nabla\log h|^2 + \nabla\operatorname{div}u\cdot\nabla h + h\nabla u : \nabla\log h\otimes\nabla\log h = 0.
$$

Integrating in space and using again the equation of mass conservation, we get (11).
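The last step of this proof can be spelled out as follows. This is a short sketch of the integration, with the abbreviation $g := |\nabla\log h|^2$ introduced here for readability and periodic boundary conditions assumed so that the integration by parts produces no boundary term.

```latex
\begin{align*}
% Integrate by parts in the transport term, then use \partial_t h + \operatorname{div}(hu) = 0.
\int_\Omega \Bigl( \tfrac12\,h\,\partial_t g + \tfrac12\,h\,(u\cdot\nabla)g \Bigr)
  &= \frac12\frac{d}{dt}\int_\Omega h\,g
     - \frac12\int_\Omega g\,\partial_t h
     - \frac12\int_\Omega g\,\operatorname{div}(hu) \\
  &= \frac12\frac{d}{dt}\int_\Omega h\,g
     - \frac12\int_\Omega g\,\bigl( \partial_t h + \operatorname{div}(hu) \bigr)
   = \frac12\frac{d}{dt}\int_\Omega h\,|\nabla\log h|^2 .
\end{align*}
```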

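Using formula (11) established in Lemma 3, we prove the following.

Lemma 4.

$$
\begin{aligned}
&\frac12\frac{d}{dt}\int_\Omega \nu^2 h|\nabla\log h|^2 - \nu r_0\frac{d}{dt}\int_\Omega \log h + \nu r_1\int_\Omega |u|u\cdot\nabla h + \nu\kappa\int_\Omega |\nabla^2 h|^2\\
&\qquad + \nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}}
= -\frac{d}{dt}\int_\Omega \nu u\cdot\nabla h + \int_\Omega \nu h\nabla u : {}^t\nabla u.
\end{aligned}
\tag{12}
$$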


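Proof. Multiplying the momentum equation by $\nu\nabla h/h$, we get

$$
\begin{aligned}
&\int_\Omega \nu h(\partial_t u + u\cdot\nabla u)\cdot\frac{\nabla h}{h} + \nu^2\int_\Omega \nabla u : h\nabla\Bigl( \frac{\nabla h}{h} \Bigr) + \nu\kappa\int_\Omega |\nabla^2 h|^2\\
&\qquad + \nu r_0\int_\Omega u\cdot\nabla\log h + \nu r_1\int_\Omega |u|u\cdot\nabla h + \nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}} = 0.
\end{aligned}
$$

That means

$$
\begin{aligned}
&\int_\Omega \nu h(\partial_t u + (u\cdot\nabla)u)\cdot\frac{\nabla h}{h} + \nu^2\int_\Omega \nabla u : \Bigl( \nabla\nabla h - \frac{\nabla h\otimes\nabla h}{h} \Bigr) + \nu\kappa\int_\Omega |\nabla^2 h|^2\\
&\qquad + \nu r_0\int_\Omega u\cdot\nabla\log h + \nu r_1\int_\Omega |u|u\cdot\nabla h + \nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}} = 0.
\end{aligned}
$$

Adding this equation to (11) multiplied by $\nu^2$ gives

$$
\begin{aligned}
&\nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \frac12\frac{d}{dt}\int_\Omega \nu^2 h|\nabla\log h|^2 + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}} + \nu r_0\int_\Omega u\cdot\nabla\log h + \nu\kappa\int_\Omega |\nabla^2 h|^2 + \nu r_1\int_\Omega |u|u\cdot\nabla h\\
&\qquad = -\int_\Omega \nu\,\partial_t u\cdot\nabla h - \nu^2\int_\Omega \nabla\operatorname{div}u\cdot\nabla h - \nu\int_\Omega (u\cdot\nabla)u\cdot\nabla h - \nu^2\int_\Omega \nabla u : \nabla\nabla h = I.
\end{aligned}
\tag{13}
$$

Let us now rewrite the right-hand side $I$ as follows:

$$
I = -\nu\frac{d}{dt}\int_\Omega u\cdot\nabla h + \nu\int_\Omega u\cdot\nabla\partial_t h - \nu^2\int_\Omega \nabla\operatorname{div}u\cdot\nabla h - \nu\int_\Omega (u\cdot\nabla)u\cdot\nabla h - \nu^2\int_\Omega \nabla u : \nabla\nabla h.
$$

Therefore, using the equation of mass conservation,

$$
I = -\nu\frac{d}{dt}\int_\Omega u\cdot\nabla h - \nu\int_\Omega u\cdot\nabla\operatorname{div}(hu) - \nu^2\int_\Omega \nabla\operatorname{div}u\cdot\nabla h - \nu\int_\Omega (u\cdot\nabla)u\cdot\nabla h - \nu^2\int_\Omega \nabla u : \nabla\nabla h. \tag{14}
$$

Integrating by parts, we get

$$
\int_\Omega (u\cdot\nabla)u\cdot\nabla h = -\int_\Omega u\cdot\nabla\operatorname{div}(hu) - \int_\Omega h\nabla u : {}^t\nabla u, \tag{15}
$$

and using $\nabla\operatorname{div} = \operatorname{curl}\operatorname{curl} + \Delta$, we obtain

$$
-\int_\Omega \nabla\operatorname{div}u\cdot\nabla h - \int_\Omega \nabla u : \nabla\nabla h = 0. \tag{16}
$$

Therefore, using (14), (15) and (16), Eq. (13) gives (12).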


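Now we give an interesting estimate on $u + \nu\nabla\log h$. More precisely, we have

Lemma 5. The following energy dissipation holds:

$$
\begin{aligned}
&\nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}} + \nu r_1\int_\Omega |u|u\cdot\nabla h\\
&\qquad + \frac{d}{dt}\int_\Omega \Bigl( \frac12 h|u + \nu\nabla\log h|^2 + \frac{h^2}{2\mathrm{Fr}^2} + \kappa\frac{|\nabla h|^2}{2} \Bigr)\\
&\qquad + \nu\kappa\int_\Omega |\nabla^2 h|^2 + \nu r_0\int_\Omega |u|^2 + \nu r_1\int_\Omega h|u|^3 - \nu r_0\frac{d}{dt}\int_\Omega \log h
\le \int_\Omega \nu h|\nabla u|^2.
\end{aligned}
\tag{17}
$$

Proof. Formula (12) reads

$$
\begin{aligned}
&\nu\int_\Omega \frac{|\nabla h|^2}{\mathrm{Fr}^2} + \nu\int_\Omega \frac{u^\perp\cdot\nabla h}{\mathrm{Ro}} - \nu r_0\frac{d}{dt}\int_\Omega \log h + \frac12\frac{d}{dt}\int_\Omega h|u + \nu\nabla\log h|^2\\
&\qquad + \nu\kappa\int_\Omega |\nabla^2 h|^2 + \nu r_1\int_\Omega |u|u\cdot\nabla h
= \frac12\frac{d}{dt}\int_\Omega h|u|^2 + \int_\Omega \nu h\nabla u : {}^t\nabla u.
\end{aligned}
$$

Finally, using the energy estimate (3), we easily conclude.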
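The way (12) turns into the formula used in this proof rests on expanding the square $h|u + \nu\nabla\log h|^2$. The sketch below records that expansion; it uses only $h\nabla\log h = \nabla h$ and introduces no new assumption.

```latex
\begin{align*}
h\,|u + \nu\nabla\log h|^2
  &= h|u|^2 + 2\nu\,u\cdot\nabla h + \nu^2 h\,|\nabla\log h|^2
  \qquad (\text{since } h\nabla\log h = \nabla h), \\
\frac12\frac{d}{dt}\int_\Omega h\,|u + \nu\nabla\log h|^2
  &= \frac12\frac{d}{dt}\int_\Omega h|u|^2
   + \frac{d}{dt}\int_\Omega \nu\,u\cdot\nabla h
   + \frac12\frac{d}{dt}\int_\Omega \nu^2 h\,|\nabla\log h|^2 .
\end{align*}
```

Moving the term $-\frac{d}{dt}\int_\Omega \nu u\cdot\nabla h$ from the right-hand side of (12) to the left and adding $\frac12\frac{d}{dt}\int_\Omega h|u|^2$ to both sides then groups the three time derivatives into the single one displayed in the proof.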

Let us remark that $\nu\nabla\log h$ has the dimension of a velocity. It means that some information on an auxiliary velocity $v = u + \nu\nabla\log h$ is derived.

References

1. Babin, A., Mahalov, A., Nicolaenko, B.: Global splitting in rotating shallow-water equations. European J. Mechanics B/Fluids 16, 725–754 (1997)
2. Bernardi, C., Pironneau, O.: On the shallow water equations at low Reynolds number. Commun. Partial Diff. Eqs. 16, 59–104 (1991)
3. Bresch, D., Desjardins, B., Lin, C.K.: On some compressible fluid models: Korteweg, lubrication and shallow water systems. To appear in Comm. Partial Diff. Eqs. 2002
4. Bresch, D., Essoufi, E.H., Sy, M.: De nouveaux systèmes de type Kazhikhov-Smagulov: modèles de propagation de polluants et de combustion à faible nombre de Mach. C.R. Acad. Sci. Paris 335, Série I, 973–978 (2002)
5. Chemin, J.-Y., Desjardins, B., Gallagher, I., Grenier, E.: Book in preparation
6. Colin, T.: The Cauchy problem and the continuous limit for the multilayer model in geophysical fluid dynamics. SIAM J. Math. Anal. 28(3), 516–529 (1997)
7. Colin, T., Fabrie, P.: Rotating fluid at high Rossby number driven by a surface stress: existence and convergence. Adv. Differ. Eqs. 2(5), 715–751 (1997)
8. Constantin, P., Wu, J.: Behavior of solutions of 2D quasi-geostrophic equations. SIAM J. Math. Anal. 30(5), 937–948 (1999)
9. Desjardins, B., Grenier, E.: Derivation of the quasigeostrophic potential vorticity equations. Adv. Diff. Eqs. 3(5), 715–752 (1998)
10. Desjardins, B., Grenier, E., Lions, P.-L., Masmoudi, N.: Incompressible limit for solutions of the isentropic Navier–Stokes equations with Dirichlet boundary conditions. J. Math. Pures Appl. (9) 78(5), 461–471 (1999)
11. Embid, P.F., Majda, A.: Low Froude number limiting dynamics for stably stratified flow with small or finite Rossby numbers. Geophys. Astrophys. Fluid Dynamics 87, 1–50 (1998)
12. Embid, P.F., Majda, A.: Averaging over fast gravity waves for geophysical flows with arbitrary potential vorticity. Comm. Partial Diff. Eqs. 21, 619–658 (1996)
13. Gent, P.: The energetically consistent shallow water equations. J. Atmos. Sci. 50, 1323–1325 (1993)
14. Gent, P., Williams, J.C.: Balanced models in isentropic coordinates in bounded and periodic domains. Dyn. Atmos. Oceans 7, 67–93 (1983)
15. Gerbeau, F., Perthame, B.: Derivation of viscous Saint-Venant system for laminar shallow water; Numerical results. Discrete and Continuous Dynamical Systems, Series B 1(1), 89–102 (2001)
16. Grenier, E., Masmoudi, N.: Ekman layers of rotating fluids, the case of well prepared initial data. Commun. Partial Differential Equations 22(5–6), 953–975 (1997)
17. Levermore, D., Sammartino, M.: A shallow water model with eddy viscosity for basins with varying bottom topography. Nonlinearity 14(6), 1493–1515 (2001)
18. Lions, P.-L.: Mathematical Topics in Fluid Dynamics. Vol. 2, Compressible Models. Oxford: Oxford University Press, 1998
19. Masmoudi, N.: Ekman layers of rotating fluids: the case of general initial data. Commun. Pure Appl. Math. 53(4), 432–483 (2000)
20. Orenga, P.: Un théorème d'existence de solutions d'un problème de shallow water. Arch. Rat. Mech. Anal. 130(2), 183–204 (1995)
21. Pedlosky, J.: Geophysical Fluid Dynamics. Berlin-Heidelberg-New York: Springer-Verlag, 1987
22. Schochet, S.: Singular limits in bounded domains for quasilinear symmetric hyperbolic systems having a vorticity equation. J. Differ. Eqs. 68(3), 400–428 (1987)
23. Sundbye, L.: Global existence for the Dirichlet problem for the viscous shallow water equations. J. Math. Anal. Appl. 202(1), 236–258 (1996)

Communicated by P. Constantin

Commun. Math. Phys. 238, 225–256 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0817-5


A Small-Scale Density of States Formula

John A. Toth

Department of Mathematics and Statistics, McGill University, Montréal, Canada

Received: 10 December 2001 / Accepted: 23 January 2003
Published online: 2 April 2003 – © Springer-Verlag 2003

Abstract: Let (M, g) be a C ∞ compact Riemann manifold with classical Hamiltonian, H ∈ C ∞ (T ∗ M). Assume that the corresponding -quantization P1 := Op (H ) is quantum completely integrable. We establish an -microlocal Weyl law on short spectral intervals of size 2− ; ∀ > 0 for various families of operators P1u ; u ∈ I containing P1 , both in the mean and pointwise a.e. for u ∈ I . The -microlocalization refers to a small tubular neighbourhood of a non-degenerate, stable periodic bicharacteristic γ ⊂ T ∗ M − 0. 0. Introduction Let (M, g) be a compact C ∞ manifold with classical Hamiltonian H ∈ C ∞ (T ∗ M) and let P1 := Op (H ) be the corresponding, self-adjoint -quantization. Let E be a regular value of H (x, ξ ) and consider λj () ∈ Spec(P1 ). Then, one way of quantitatively measuring the uniformity in the spectrum semiclassically localized near H = E is via the density of states measure: δ(x − −1 (E − λj () ). (1) dρ(x; ) := n−1 |λj ()−E|1−

Let $S_0(\mathbb{R})$ denote the space of Schwartz functions $\phi(\lambda)$ with the property that $\check{\phi}(t) \in C_0^\infty(\mathbb{R})$. Under very general assumptions on the bicharacteristic flow of the Hamilton vector field of $H$, the semiclassical trace formula [DG, GU, P] implies that for any test function $\phi \in S_0(\mathbb{R})$,
\[ \sum_{j=1}^{\infty} \phi\big(\hbar^{-1}(E - \lambda_j(\hbar))\big) = \mathrm{vol}\{H = E\}\, \check{\phi}(0)\, \hbar^{-n+1} + o(\hbar^{-n+1}). \tag{2} \]

Supported in part by an Alfred P. Sloan Research Fellowship and NSERC grant OGP0170280

So, in particular,
\[ w\text{-}\lim_{\hbar \to 0} d\rho(x;\hbar) = \mathrm{vol}\{H = E\}\, dx, \]
where $dx$ is the Lebesgue measure on $\mathbb{R}$ and the weak-$*$ limit is taken in the space $S_0(\mathbb{R})$. The identity in (2) can be taken as the definition of the asymptotically uniform distribution of eigenvalues $\lambda_j(\hbar) \in \mathrm{Spec}\, P_1$ localized near the energy level $E$ on intervals $|\lambda_j(\hbar) - E| \lesssim \hbar$ as $\hbar \to 0$. This is just another way of saying that there exists an asymptotic expansion for the spectral counting function on such an interval with a leading-order Weyl-type term. The result in (2) holds as soon as the set of periodic orbits is of measure zero [DG, GU]. Such a result does not require integrability for either the classical Hamiltonian, $H \in C^\infty(T^*M)$, or the $\hbar$-quantization $P_1 := Op_\hbar(H)$.

However, suppose one wishes to detect uniformity in $\mathrm{Spec}\, P_1$ on smaller spectral intervals. One way of doing this consists of putting $\mu(\hbar) = \hbar^\kappa$ with $\kappa > 0$ and forming the following density of states (DOS) measure:
\[ d\rho_\mu(x) := \mu(\hbar)^{-1} \hbar^{n-1} \sum_{|\lambda_j(\hbar) - E| \lesssim \hbar} \delta\big(x - \mu(\hbar)^{-1} \hbar^{-1} (\lambda_j(\hbar) - E)\big). \tag{3} \]

The asymptotics for such DOS measures have been studied before in the homogeneous case where $P_1 = \sqrt{\Delta}$ [Vo]. However, by using standard wave-trace methods it is difficult to determine the highest power of $\kappa > 0$ for which $d\rho_\mu$ has meaningful asymptotics as $\hbar \to 0$. This leads one to ask:

• Question 1: When $\hbar \to 0^+$, what is the smallest polynomial scale $\mu(\hbar) = \hbar^\kappa$; $\kappa > 0$ for which $w\text{-}\lim_{\hbar \to 0} d\rho_\mu(x) = \mathrm{const.}\, dx$?

Unfortunately, we do not know the answer to Question 1 above. However, we are able to give a partial answer to a related (simpler) question which we now describe in more detail. Let $(M, g)$ be a compact two-dimensional Riemannian manifold and assume that $P_1 = Op_\hbar(p_1)$ is quantum completely integrable (QCI). This means that there exists $P_2 = Op_\hbar(p_2)$ with $[P_1, P_2] = 0$. In addition, we assume that $P_1 = Op_\hbar(p_1)$ is self-adjoint, elliptic (in the classical sense) and that the subprincipal symbol $\sigma_{sub}(P_1) = 0$. In such a case, there exists a Hilbert basis of $L^2$-normalized joint eigenfunctions $\{\psi_j;\ j = 1, 2, \ldots\}$ of the operators $P_1$ and $P_2$. To simplify the writing a little we can, without loss of generality, assume that the relevant joint energy levels are $(E_1, E_2) = (1, 1)$ and that $\gamma \subset \{(x, \xi) \in T^*M - 0;\ p_1(x, \xi) - 1 = p_2(x, \xi) - 1 = 0\}$ is a joint, stable rank-one orbit for the Hamilton vector fields of both $p_1$ and $p_2$.

Definition 0.1. We say that the joint orbit, $\gamma$, is Eliasson non-degenerate [TZ, VN] if $\alpha_1 \neq \alpha_2$, where $e^{i\alpha_1}$ and $e^{i\alpha_2}$ are the eigenvalues of the linearized Poincaré maps of $p_1$ and $p_2$ respectively along the curve $\gamma$. The numbers $\alpha_1$ and $\alpha_2$ are called the Liapunov coefficients along $\gamma$ of $p_1$ and $p_2$ respectively.

Let $\chi(x, \xi) \in C_0^\infty(T^*M)$ be a microlocal cutoff function which is supported in a neighbourhood, $\Omega_\gamma$, of the geodesic, $\gamma$. To excise the piece of $\mathrm{Spec}\, P_1$ coming from this tubular neighbourhood, we define the following microlocal DOS measure:
\[ d\rho_\mu(x; \chi) := \mu(\hbar)^{-1} \hbar^{n-1} \sum_{|\lambda_j(\hbar) - 1| \lesssim \hbar} \langle Op_\hbar(\chi)\psi_j, \psi_j \rangle\, \delta\big(x - \mu(\hbar)^{-1} \hbar^{-1} (\lambda_j(\hbar) - 1)\big). \tag{4} \]


The question of establishing a small-scale microlocalized trace formula can be phrased in terms of these microlocal DOS measures as follows:

• Question 2: When $\hbar \to 0^+$, what is the smallest polynomial scale $\mu(\hbar) = \hbar^\kappa$; $\kappa > 0$ for which $w\text{-}\lim_{\hbar \to 0} d\rho_\mu(x; \chi) = \mathrm{const.}\, dx$?

The purpose of this paper is to give partial answers to Question 2 for certain (families of) operators, $P_1 = Op_\hbar(H)$. In Sect. 2, we show that for the transversally metaplectic model operator $H_0 = \hbar D_s + \alpha(\hbar^2 D_x^2 + x^2)$ acting on $L^2(\mathbb{R} \times S^1)$, with Diophantine Liapunov coefficient $\alpha$, we get a positive answer to Question 2 provided $\mu(\hbar) = \hbar^\kappa$ with $0 \le \kappa < 1/2$. To give a partial answer to Question 2 for the entire range of intervals corresponding to $0 \le \kappa < 1$, we average over suitable families of microlocally QCI systems for which $\gamma$ is a classical orbit. In this way (see Theorems 0.4 and 0.5) we get an "almost-everywhere" affirmative answer to Question 2 for these families of QCI Hamiltonians.

In either case, the crucial tool is the Birkhoff normal form construction (see Sect. 1 and [G, Z1, TZ] for further details). Briefly, the classical Birkhoff normal form construction (CBNF) says that, given a sufficiently small tubular neighbourhood, $\Omega$, of the joint orbit $\gamma$, there exists a model tubular neighbourhood
\[ \Omega_0 = \{(x, s, \xi, \sigma) \in T^*(\mathbb{R} \times S^1);\ x^2 + \xi^2 \le \epsilon,\ |\sigma - 1| \le \epsilon\} \]
containing the model orbit $\gamma_0 = \{(x, s, \xi, \sigma) \in T^*(\mathbb{R} \times S^1);\ x = \xi = 0,\ \sigma = 1\}$, together with a canonical diffeomorphism $\kappa : \Omega \to \Omega_0$ and smooth functions, $f_j$; $j = 1, 2$, defined near $(0, 1)$ such that $p_j \circ \kappa = f_j(x^2 + \xi^2, \sigma)$. The quantum Birkhoff normal form (QBNF) construction says that there exists a microlocally unitary $\hbar$-Fourier integral operator $F : C_0^\infty(\Omega) \to C_0^\infty(\Omega_0)$ conjugating $P_1$ and $P_2$ to a model operator acting on the product space $L^2(\mathbb{R} \times S^1)$. More precisely, there exist smooth symbols, $f_j(x_1, x_2; \hbar) \sim \sum_k f_{jk}(x_1, x_2)\hbar^k$; $j = 1, 2$ and model operators $Q_1 = \hbar D_s$ and $Q_2 = \hbar^2 D_x^2 + x^2$, where $(s, x) \in S^1 \times \mathbb{R}$, with the property that:
\[ F^{-1} f_j(Q_1, Q_2; \hbar) F =_{\Omega} P_j + O(\hbar^\infty), \tag{5} \]
where
\[ f_j(Q_1, Q_2, \hbar) = Q_1 + \alpha_j Q_2 + \ldots. \]
Here, the dots denote lower-order terms in the sense that they either vanish to high order along $\gamma_0$ or are of high order in $\hbar$ (see [Z1, G]) and following the convention in [CP], throughout the paper we denote microlocal equivalence on $\Omega$ by $=_{\Omega}$.

Our first result is deterministic: Let $\epsilon > 0$ and suppose that $\hbar^{\frac{1}{2} - \epsilon} \le \mu(\hbar) \le 1$. In Proposition 2.5 we show that for the model operator $P_1 = Q_1 + \alpha Q_2$ acting on $L^2(\mathbb{R} \times S^1)$, any cutoff $\chi$ with sufficiently small support and $\phi \in C^0([a, b])$,
\[ w\text{-}\lim_{\hbar \to 0} d\rho_\mu(\phi; \chi) = c_0 \int_{-\infty}^{\infty} \phi(x)\, dx. \]
Here, $c_0 := \int_{S^*M} \chi\, d\omega$, where $d\omega$ is Liouville measure on $H = E$.


To deal with smaller scales corresponding to $1/2 \le \kappa < 1$, it turns out to be much easier to average over a suitable "ensemble" of quantum completely integrable systems. This is consistent with, for instance, the literature on pair-correlations and level spacings distributions [Be1, 2, Bl, BT, Dy, EMM, RS, Sa, Si, UZ, V, Z3, ZZ]. We now describe the appropriate QCI ensembles over which we will average:

Definition 0.2. Let $I = [1 - \epsilon, 1 + \epsilon]$ for fixed $\epsilon > 0$. Given $u \in I^2$ we let $P_1^u, P_2^u$ be a QCI, $C^\infty$ 2-parameter joint deformation of $P_1$ and $P_2$ with classical integrals in involution $p_1^u(x, \xi)$ and $p_2^u(x, \xi)$. We say that such a joint deformation is regular provided that for $j = 1, 2$, the symbols $p_j^u \in C^\infty(T^*M)$ are smooth in $u \in I^2$ and $\gamma$ is a stable, Eliasson non-degenerate orbit (see Definition 0.1) for both $p_1^u$ and $p_2^u$ for all $u = (u_1, u_2) \in I^2$ with $\gamma \subset \{z \in T^*M - 0;\ p_1^u(z) - u_1 = p_2^u(z) - u_1 = 0\}$. This last condition (see Lemma 1.2) implies that there exist symplectic coordinates $(x, \xi, s, \sigma)$ near $\gamma$ in terms of which
\[ p_j^u(x, \xi, s, \sigma) = u_1 \sigma + \frac{\alpha_j(u_1, u_2)}{2}(x^2 + \xi^2) + O(|x, \xi|^3); \quad j = 1, 2. \]
We also impose the additional condition that $\nabla_{u_2} \alpha_1(u_1, u_2) \neq \nabla_{u_2} \alpha_2(u_1, u_2)$ for all $(u_1, u_2) \in I^2$. A one-parameter deformation of $p_1^u$ is called regular if the conditions on $p_1^u$ above are satisfied for all $u \in I$ and $z$ near $\gamma$.

Our next result shows that the asymptotics of the microlocal DOS measures for $P_1^u$ averaged over various one-parameter families all tend weakly to the uniform distribution, $c_0\, dx$, for the entire range of intervals corresponding to $\hbar \le \mu(\hbar) \le 1$:

Theorem 0.3. (i) Assume that $\hbar \le \mu(\hbar) \le 1$ and for $u \in I := [1 - \epsilon, 1 + \epsilon]$ let $P_1^u$ be a regular, one-parameter family of QCI Hamiltonians. Then, given $\phi \in S_0(\mathbb{R})$,
\[ \int_{1-\epsilon}^{1+\epsilon} \big[ d\rho_\mu(\phi; u) - \check{\phi}(0)\, c_0(u) \big]\, du = O\big(\hbar^2 \mu(\hbar)^{-1} |\log \hbar|\big)\, \|\phi\|_{L^1}. \]
(ii) Consider the (non-regular) microlocally QCI family of Hamiltonians $P_1^u$ with QBNF expansion
\[ Q_1 + \alpha u\, Q_2 + \sum_{j=2}^{\infty} g_j(Q_1, Q_2)\hbar^j. \]
Then, for $\hbar \le \mu(\hbar) \le 1$ and any $\delta > 0$,
\[ \int_{1-\epsilon}^{1+\epsilon} \big[ d\rho_\mu(\phi; u) - \check{\phi}(0)\, c_0(u) \big]\, du = O_\delta\big(\hbar^{2-\delta} \mu(\hbar)^{-1}\big)\, \|\phi\|_{L^1}. \]

The next result addresses the question of pointwise convergence of the microlocal DOS measure:

Theorem 0.4. Let $P_1^u$ be a QCI Hamiltonian for a regular, two-parameter family of deformations with microlocal QBNF expansion (near $\gamma$) of the form
\[ u_1 Q_1 + \alpha_1 u_2 Q_2 + \hbar\, \beta(Q_1, Q_2) + \sum_{j=2}^{\infty} g_j(Q_1, Q_2)\hbar^j. \tag{6} \]
(i) Let $\chi$ be supported in a sufficiently small tubular neighbourhood of $\gamma$ in which the QBNF is valid. Fix $\kappa$ with $0 \le \kappa < 1$ and let $\mu(\hbar) = \hbar^\kappa$. Then, for any $\delta > 0$, we have that:
\[ MS_\mu(\phi) := \int_{1-\epsilon}^{1+\epsilon} \int_{1-\epsilon}^{1+\epsilon} \big| d\rho_\mu(\phi; u_1, u_2) - c_0(u)\, \check{\phi}(0) \big|^2\, du_1\, du_2 = O_\delta(\hbar^{1-\kappa-\delta})\, \|\phi\|_{L^1}. \]
(ii) Let $\mu(\hbar)$ be the same as in part (i). Then, for Lebesgue almost all $(u_1, u_2) \in I^2$,
\[ w\text{-}\lim_{\hbar \to 0} d\rho_\mu(x; \chi, u_1, u_2) = c_0(u_1, u_2)\, dx, \]
provided $\hbar$ takes its values in a sequence $\{\hbar_m\}_{m=1}^{\infty}$ with $\hbar_m \ll m^{-\frac{1}{1-\kappa} - \delta}$ for any $\delta > 0$.

Finally, our main result follows from Theorem 0.4. It states that when integrating over the QCI deformations in (6) there is a microlocal Weyl law that is valid on intervals of length $\sim \hbar^{2-\epsilon}$ for any $\epsilon > 0$.

Theorem 0.5. Let $P_1^u$ be a QCI Hamiltonian for a regular, two-parameter family of deformations satisfying the conditions in Theorem 0.4 and
\[ N_u\big(1 + a\hbar\mu(\hbar),\ 1 + b\hbar\mu(\hbar)\big) := \sum_{\lambda_j \in [1 + a\hbar\mu(\hbar),\ 1 + b\hbar\mu(\hbar)]} \langle Op_\hbar(\chi)\psi_j, \psi_j \rangle. \]
Fix $\kappa$ with $0 \le \kappa < 1$ and $\mu(\hbar) = \hbar^\kappa$. Then, as $\hbar \to 0$,
\[ \big\| N_u(1 + \hbar^{1+\kappa} a,\ 1 + \hbar^{1+\kappa} b) - c_0(u)(b - a)\hbar^{-1+\kappa} \big\|_{L^2(I^2)} = o(\hbar^{-1+\kappa}). \]

The plan of the paper is as follows: In the first section, we review Birkhoff normal form constructions near $\gamma$ both at the classical and quantum levels. There are several related Birkhoff normal form constructions that are well-known [G, Z1, Sj, VN]. However, since we need a convergent normal form valid in a neighbourhood of the orbit, $\gamma$, we give a self-contained derivation of both the classical and quantum normal forms in Sect. 1. Section 2 is devoted to the proof of Proposition 2.5 for the transversally metaplectic model Hamiltonian. Here, we give an explicit analysis of the DOS measure corresponding to the simplest possible model Hamiltonian $H_0 = Q_1 + \alpha Q_2$ under a Diophantine assumption (DIO) on the Liapunov coefficient, $\alpha$. In Sect. 3, we replace the operator $P_1$ in the microlocal DOS with the model Hamiltonian $f(Q_1, Q_2; \hbar)$. Using the QBNF construction of Sect. 1, we show that
\[ d\rho_\mu(\phi) = \int_{-\infty}^{\infty} \mathrm{Tr}\, Op_\hbar(\chi_0)\, e^{it[f(Q_1, Q_2; \hbar) - 1]/\hbar}\, \check{\phi}(\mu(\hbar)t)\, dt + O(\hbar^\infty). \tag{7} \]
Consequently, the asymptotic analysis of the DOS measures $d\rho_\mu$ is reduced to the corresponding problem in the model case. In Sect. 4, we first prove Theorem 0.3 by using the model normal form in (7) together with an integration by parts argument in $u \in I$ applied to the mean
\[ d\rho_\mu^{av}(\phi) = \int_{1-\epsilon}^{1+\epsilon} \big[ d\rho_\mu(\phi; u) - c_0(u) \big]\, du, \]
where $c_0(u) := \mathrm{vol}(p_1^u = 1)$. Subsequently, again by passing to the normal form in (7), we explicitly analyze the mean-square
\[ MS_\mu(\phi) := \int_{1-\epsilon}^{1+\epsilon} \int_{1-\epsilon}^{1+\epsilon} \big| d\rho_\mu(\phi; u) - c_0(u)\, \check{\phi}(0) \big|^2\, du_1\, du_2, \]
for the 2-parameter deformations in (6) and prove Theorem 0.4. Finally, in the last section we replace $\hat{\phi}$ with the characteristic function of an interval and prove Theorem 0.5.


0.1. Remarks. (i) Consider the special case where $P_1 = \sqrt{\Delta}$. One of our main motivations in writing this paper arises from the fact that by well-known Tauberian arguments [Vo], one can relate DOS measures to error terms in the Weyl law for $P_1$. In related work [PT1, PT2], we have used the techniques of the present paper to estimate variances for the error term in the Weyl law for flat tori and Heisenberg manifolds by integrating over the appropriate moduli spaces of metrics.

(ii) The main result of this paper (Theorem 0.5) is purely microlocal and applies to quite general QCI Hamiltonians, but requires very special linear deformations of such Hamiltonians. It is an important and difficult problem to determine whether one can prove a global analogue of Theorem 0.5 for QCI Laplacians such as convex surfaces of revolution. In general, this would involve averaging over moduli spaces of integrable metrics where the deformations appear non-linearly in the Hamiltonian. We do not address this question here but hope to return to it in future work.

1. Birkhoff Normal Form

Let $(M, g)$ be a compact, two-dimensional Riemannian manifold and $\gamma$ a periodic orbit of the classical Hamiltonian $H \in C^\infty(T^*M)$ on a regular energy shell, $\{z \in T^*M : H(z) = E\}$. For notational simplicity, we henceforth assume that the primitive period of $\gamma$ is $2\pi$. Fix any point $p_0 \in \gamma$ and choose a function $G \in C^\infty(T^*M)$ with the property that
\[ G(p_0) = 0, \qquad \{G, H\}(p_0) \neq 0. \]
Consider the set
\[ \Sigma := \{z \in T^*M;\ H(z) - E = G(z) = 0\}. \tag{8} \]
Given $U$, a sufficiently small neighbourhood of $p_0$, it follows that $\Sigma \cap U$ is a smooth hypersurface in $H^{-1}(E)$. Let $\phi_H^t$ denote the Hamiltonian flow associated with $H$. An application of the implicit function theorem gives the existence of a unique smooth map $\tau : \Sigma \cap U \to \mathbb{R}$ with the property that
\[ \phi_H^{\tau(p)}(p) \in \Sigma, \qquad \tau(p_0) = 2\pi. \]
The Poincaré map of $\gamma$ is by definition the local canonical diffeomorphism
\[ \phi : \Sigma \cap U \to \Sigma : p \mapsto \phi_H^{\tau(p)}(p). \]
The linearized Poincaré map is simply $P_\gamma := d\phi(p_0) : T_{p_0}\Sigma \to T_{p_0}\Sigma$. It is not hard to show [HZ] that for different choices of $p_0 \in \gamma$ the linearized Poincaré mappings are always conjugate and so the eigenvalues of $P_\gamma$ are independent of the choice of $p_0 \in \gamma$. We say that $\gamma$ is a stable, non-degenerate, periodic geodesic provided the linearized Poincaré mapping, $P_\gamma$, has eigenvalues $e^{\pm i\alpha}$ on the unit circle with $\alpha \notin 2\pi\mathbb{Z}$.


1.1. The classical model. The phase space for the classical model is $T^*(\mathbb{R} \times S^1)$ with homogeneous canonical coordinates $(x, \xi, s, \sigma)$. The model action functions are just:
\[ p := \frac{1}{2}(\xi^2 + x^2) \quad \text{and} \quad \sigma. \tag{9} \]
The local model for the orbit $\gamma$ is
\[ \gamma_0 = \{(x, \xi, s, \sigma) \in T^*(\mathbb{R} \times S^1);\ p = \sigma - 1 = 0\}. \tag{10} \]
For fixed $\epsilon$ with $0 < \epsilon < 1$, the model tubular neighbourhood of $\gamma_0$ is taken to be
\[ \Omega_0 = \{(x, \xi, s, \sigma) \in T^*(\mathbb{R} \times S^1);\ p \le \epsilon,\ |\sigma - 1| \le \epsilon\}. \tag{11} \]

1.2. The quantum model. The quantum action operators corresponding to $p$ and $\sigma$ are the respective $\hbar$-Weyl quantizations given by
\[ Q_1 := \frac{1}{2}(\hbar^2 D_x^2 + x^2) \quad \text{and} \quad Q_2 := \hbar D_s. \tag{12} \]
The relevant Hilbert space is $L^2(\mathbb{R} \times S^1)$ equipped with a natural Hilbert basis consisting of the joint eigenfunctions, $u_j$, of $Q_1$ and $Q_2$ with
\[ u_j(x, s) = e^{i m_j s} \otimes \Phi_{n_j}(x), \tag{13} \]
where $(m_j, n_j) \in \mathbb{Z} \times (\mathbb{Z}_+ + \frac{1}{2})$ and $\Phi_n$ denotes the $n$th, $L^2$-normalized Hermite function.
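The joint spectrum attached to the basis (13) can be enumerated directly. The following is a minimal sketch, assuming the standard oscillator eigenvalues $Q_1 u_j = \hbar n_j u_j$ with $n_j \in \mathbb{Z}_+ + \frac{1}{2}$ and $Q_2 u_j = \hbar m_j u_j$ with $m_j \in \mathbb{Z}$. The function names and the truncation parameters `m_max`, `n_max` are illustrative choices and do not appear in the text; the combination $Q_2 + \alpha Q_1$ is the model Hamiltonian written in the numbering of (12).

```python
from fractions import Fraction


def joint_spectrum(hbar, m_max, n_max):
    """Pairs (eigenvalue of Q1, eigenvalue of Q2) for |m| <= m_max, 0 <= n < n_max.

    Assumption: Q1 = (hbar^2 D_x^2 + x^2)/2 has eigenvalues hbar*(n + 1/2)
    and Q2 = hbar*D_s on the circle has eigenvalues hbar*m.
    """
    pairs = []
    for m in range(-m_max, m_max + 1):
        for n in range(n_max):
            pairs.append((hbar * (n + Fraction(1, 2)), hbar * m))
    return pairs


def count_near(hbar, alpha, energy, width, m_max, n_max):
    """Number of truncated eigenvalues of Q2 + alpha*Q1 within `width` of `energy`."""
    return sum(
        1
        for q1, q2 in joint_spectrum(hbar, m_max, n_max)
        if abs(q2 + alpha * q1 - energy) <= width
    )
```

Exact rational arithmetic is used so that eigenvalues sitting on the boundary of the window are counted reliably; with floating-point input the boundary cases may round either way.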

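1.3. Pseudodifferential calculus. We now introduce the relevant symbol classes and semiclassical pseudodifferential operator calculus.

Definition 1.1. We say that $a(x, s, \xi, \sigma; \hbar) \in C^\infty(T^*S^1 \times T^*\mathbb{R})$ is in the symbol class $S_{cl}^{m,k}(S^1 \times \mathbb{R})$ provided there exists an asymptotic development:
\[ a(x, s, \xi, \sigma; \hbar) \sim \hbar^{-k} \sum_{j=0}^{\infty} a_j(x, s, \xi, \sigma)\hbar^j, \]
where
\[ |\partial_x^\alpha \partial_s^\beta \partial_\xi^\gamma \partial_\sigma^\delta a_j(x, s, \xi, \sigma)| \le C(j, \alpha, \beta)\, \hbar^{-m} (1 + |x| + |\xi| + |\sigma|)^{m - j - |\alpha| - |\gamma| - |\delta|}. \]
Given $a \in S^{m,k}(S^1 \times \mathbb{R})$ we let $Op_\hbar(a)$ denote the semiclassical Weyl pseudodifferential operator quantizing $a$, with Schwartz kernel
\[ Op_\hbar(a)(x, s, x', s') = (2\pi\hbar)^{-2} \sum_{k \in \mathbb{Z}} \int_{\mathbb{R}^2} e^{i[(x - x')\xi + (s + 2\pi k - s')\sigma]/\hbar}\, a\Big(\frac{x + x'}{2}, \frac{s + s'}{2} + \pi k, \xi, \sigma; \hbar\Big)\, d\xi\, d\sigma. \tag{14} \]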


Unless otherwise specified, we will henceforth work with the symbol classes $S_{cl}^{m,k}$ and the corresponding semiclassical Weyl operators. Following the convention in [CP], we say that $P, Q \in Op_\hbar(S^{m,k})$ are microlocally equal on $\tilde{\Omega}$ provided $\|Op_\hbar(\chi)(P - Q)\|_{(0)} = O(\hbar^\infty)$ and $\|(P - Q)Op_\hbar(\chi)\|_{(0)} = O(\hbar^\infty)$ for any cutoff function $\chi \in C_0^\infty(\tilde{\Omega})$. We denote microlocal equivalence on $\tilde{\Omega}$ by $=_{\tilde{\Omega}}$.

To analyze the long-time behaviour of the Schrödinger propagator for $P_1$, we will ultimately need to replace the quantum Hamiltonian $P_1$ appearing in the microlocal DOS measure in (4) by a suitable operator acting on the model space $L^2(\mathbb{R} \times S^1)$. This reduction depends in a crucial way on the Birkhoff normal form construction which we now discuss.

1.4. Classical Birkhoff normal form (CBNF). We will now present a version of classical Birkhoff normal form (CBNF) valid in a sufficiently small tubular neighbourhood of the stable, periodic geodesic, $\gamma$. Such constructions have been carried out in several different settings elsewhere [G, Sj, Z1, VN]. However, in light of the fact that we need a convergent, parameter-dependent version of this result valid in a tubular neighbourhood of the geodesic, $\gamma$, we give a self-contained treatment below. We only discuss one-parameter regular deformations, but the general argument follows in a similar fashion. To begin, we start with a variant of the Liouville-Arnold Theorem [AM]:

Lemma 1.2. Let $\gamma$ be a stable, joint rank-one orbit for a regular variation $p_1^u$; $u \in [1 - \epsilon, 1 + \epsilon]$ of completely integrable Hamiltonian functions. Then, there exist tubular neighbourhoods $\Omega \subset T^*M$ and $\Omega_0 \subset T^*(\mathbb{R} \times S^1)$ of $\gamma$ and $\gamma_0$ respectively, together with a family of canonical diffeomorphisms $\kappa^u : \Omega \to \Omega_0$, and canonical coordinates $(y, \eta, \theta, \sigma) \in \Omega_0 \subset T^*(\mathbb{R} \times S^1)$ such that for $j = 1, 2$,
\[ p_j^u \circ \kappa^u = f_j^u(y, \eta, \sigma). \]
Here, $f_j^u \in C^\infty(\mathbb{R}^3; (0, 0, 1))$ and $\kappa^u$ are locally smooth in the deformation parameter, $u \in (1 - \epsilon, 1 + \epsilon)$.

Proof. By assumption, we have that $\gamma \subset \{z \in T^*M;\ p_1^u(z) = u,\ p_2^u(z) = u\}$. Without loss of generality we can assume that for $z \in \gamma$, $|dp_2^u(z)| \ge \frac{1}{C} > 0$. Extend phase space to $T^*(M \times I)$, where $I = (1 - \epsilon, 1 + \epsilon)$ and let $p_3 = u$ denote the affine coordinate on the interval, $I$. Then, the functions $p_1^u$, $p_2^u$ and $u$ define an integrable system on $T^*(M \times I)$ and for $(z, u) \in \gamma \times I$, $\mathrm{rank}(\nabla p_2^u, \nabla p_3^u)(z, u) = 2$. So, by a parameter-dependent version of the Liouville-Arnold Theorem [AM], there exists a smooth family of canonical diffeomorphisms, $\kappa^u : \Omega \times T^*I \to \Omega_0 \times T^*I$, with canonical coordinates $(y, \eta, s, \sigma, u, v)$ on $\Omega_0 \times T^*I$ such that $p_2^u \circ \kappa = f_2(y, \eta, \sigma; u)$ and $p_3^u \circ \kappa = u$.


Since $\{p_1^u \circ \kappa, p_2^u \circ \kappa\} = \{p_1^u \circ \kappa, f_2(y, \eta, \sigma; u)\} = 0$, it follows that $p_1^u \circ \kappa = f_1(y, \eta, \sigma; u)$ for some $f_1 \in C^\infty(\mathbb{R}^3; (0, 0, 1))$. Here, we have used the fact that the Hamilton flow of $p_2^u \circ \kappa$ along $\gamma_0$ is given by $(0, 0, s, 1) \mapsto (0, 0, s + \nabla_\sigma p_2^u(0, 0, 1), 1)$, with $\nabla_\sigma(p_2^u \circ \kappa) \neq 0$ for $\Omega_0$ sufficiently small.

Before going on, we should point out that since $\gamma_0 = \{(y, \eta, s, \sigma) \in T^*(\mathbb{R} \times S^1);\ y = \eta = 0,\ \sigma = \sigma_0\}$ is an orbit, it follows that
\[ \frac{\partial p_1^u}{\partial \eta}(0, 0, \sigma) = \frac{\partial p_1^u}{\partial y}(0, 0, \sigma) = 0. \]
Also, since we are assuming that $\gamma$ is $2\pi/u$-periodic, it follows that:
\[ \frac{\partial p_1^u}{\partial \sigma}(0, 0, \sigma) = u. \tag{15} \]
Using the fact that $\alpha_1(u)$ is a Liapunov coefficient of $P_\gamma$, we get by a Taylor expansion argument that
\[ p_1^u \circ \kappa(y, \eta, \sigma) = u\sigma + Q^u(y, \eta) + O_\sigma(|y, \eta|^3), \tag{16} \]
where $Q^u(y, \eta)$ is a quadratic form with eigenvalues $\pm\alpha_1(u) \in C^\infty(I; \mathbb{R})$. By a standard theorem of Williamson [A], we can make a linear canonical change of coordinates in the $(y, \eta)$ variables alone and get that
\[ p_1^u \circ \kappa(y, \eta, \sigma) = u\sigma + \frac{\alpha_1(u)}{2}(y^2 + \eta^2) + O_\sigma(|y, \eta|^3). \tag{17} \]
We will henceforth assume that $p_1^u \circ \kappa$ has already been put into the form in (17). The second step involves constructing a further canonical diffeomorphism $\kappa_2$ such that $f_1 \circ \kappa_2 = f_2(x^2 + \xi^2, \sigma)$. We do this by basically implementing the standard Birkhoff construction in one degree of freedom with smooth dependence on the parameters $(u, \sigma)$.

Lemma 1.3. Let $H^u = p_1^u$ be a $C^\infty$ regular variation of integrable Hamiltonians in the sense of Definition 0.2. Let $\gamma \subset \Omega$ be a joint periodic orbit of $p_1$ and $p_2$ and assume that it is a stable, non-degenerate orbit for $p_1^u$ for all $u \in I$. Let $\alpha_j(u) \in C^\infty(I; \mathbb{R})$ denote the Liapunov coefficient of $p_j^u$ along $\gamma$. Then, there exists a $C^\infty$ family of local canonical diffeomorphisms $\kappa^u : \Omega_0 \to \Omega$ and canonical coordinates $(x, \xi, s, \sigma) \in \Omega_0 \subset T^*(\mathbb{R} \times S^1)$ such that for $j = 1, 2$,
\[ p_j^u \circ \kappa^u = f_j^u(x^2 + \xi^2, \sigma). \]
Here, $\sigma \circ \kappa^u = \sigma$ and $f_j^u \in C^\infty(\Omega_0; (0, 0, 1))$ depends locally smoothly on $u \in I$ with
\[ f_j^u(x^2 + \xi^2, \sigma) = u\sigma + \frac{\alpha_j(u)}{2}(x^2 + \xi^2) + O_\sigma(p^2). \]


Proof. By applying the Liouville-Arnold theorem as in Lemma 1.2 we get symplectic coordinates $(y, \eta, \theta, \sigma) \in \Omega_0 \subset T^*(\mathbb{R} \times S^1)$ and a locally smooth family of canonical diffeomorphisms $\kappa^u : \Omega_0 \to \Omega_0$ with the property that $p_2^u = f_2^u(y, \eta, \sigma)$ and $p_1^u = f_1^u(y, \eta, \sigma)$. We would like to construct here another family of canonical diffeomorphisms $\kappa_2^u : \Omega \to \Omega$ with the property that $p_j^u \circ \kappa^u \circ \kappa_2^u = f_j^u(x^2 + \xi^2, \sigma)$ (here, $f_j^u$ is in general not the same as in the statement of Lemma 1.2). To construct $\kappa_2^u : \Omega_0 \to \Omega_0$ consider
\[ \omega^u(y, \eta, \sigma) := p_1^u(y, \eta, \sigma) - p_1^u(0, 0, \sigma) = p_1^u(y, \eta, \sigma) - u\sigma. \tag{18} \]
In the coordinates $(y, \eta, \theta, \sigma, u, v) \in \Omega_0 \times T^*I$ the joint orbit is $\gamma_{u_0} = \{(y, \eta, \theta, \sigma, u, v) \in \Omega_0 \times T^*I;\ y = \eta = \sigma - 1 = 0,\ u = u_0\}$. Consequently,
\[ \frac{\partial \omega^u}{\partial y}(0, 0, \sigma) = \frac{\partial \omega^u}{\partial \eta}(0, 0, \sigma) = 0, \tag{19} \]

and since γ0 is stable and non-degenerate, we have 2 ∇(y,η) ωu (y, η, σ ) 0.

(20)

By a standard Moser-type isotopy argument in one degree of freedom (see, for instance [CV1] Appendix A) one shows that locally, there exists a family of diffeomorphisms (depending smoothly on $u \in I$)
\[ \kappa_1^u : (\theta, \sigma, x, \xi) \mapsto (\theta, \sigma, y^u(x, \xi, \sigma), \eta^u(x, \xi, \sigma)), \tag{21} \]
the elements of which are symplectic in the $(x, \xi)$ variables alone, together with functions $f_1^u \in C^\infty(\Omega_0; (0, 0, 1))$ such that $\omega^u \circ \kappa_1^u = f_1^u(x^2 + \xi^2, \sigma)$. Finally, to make the change of coordinates symplectic in all $(\theta, \sigma, x, \xi)$ variables, compose $\kappa_1^u$ with a further change of coordinates
\[ \kappa_2^u : (\theta, \sigma, x, \xi) \mapsto (\theta + w^u(\sigma, x, \xi), \sigma, x, \xi). \tag{22} \]
Here, using the fact that $\kappa_1^u$ is symplectic in the $(x, \xi)$ variables alone, it follows that the function $w^u(\sigma, x, \xi)$ can be locally determined by appealing to a parameter-dependent version of the Poincaré lemma. All maps (including the symplectic diffeomorphism $\kappa_2^u$) depend smoothly on the parameter, $u \in I$. From (17) and the definition of a regular deformation (see Definition 0.2) it follows that for $\Omega_0$ sufficiently small, the transformed integrals $p_1^u \circ \kappa$ and $p_2^u \circ \kappa$ have the form indicated in the statement of the lemma.

1.5. Quantum Birkhoff normal form (QBNF). We now turn to the quantum analogue of the CBNF construction in the previous section. As we have already pointed out, such results have been established in various settings: In [Sj], Sjoestrand gives a QBNF construction in the vicinity of the minimum of an approximate harmonic oscillator. In [G], Guillemin proves a microlocal QBNF result in the vicinity of a stable, non-degenerate geodesic (see also Zelditch [Z1] for a different construction). In the general case, the Birkhoff expansions do not converge and consequently, these results are a little different from what we need here. In [VN], Vu-Ngoc gives a (convergent) QBNF construction,


but only in the vicinity of a critical point. Since we need a convergent QBNF construction in a neighbourhood of a closed geodesic, we will give a complete argument below. Moreover, using a result of Eliasson [El] on singular Birkhoff normal forms, one can generalize the argument below to higher dimension and neighbourhoods of joint orbits which are tori of dimension greater than one. To simplify the writing a little, we will assume here that the Maslov index, $\sigma_\gamma \cong 0$. In the general case, the proofs are basically the same provided the Fourier basis $\{e^{iks}\}_{k \in \mathbb{Z}}$ is replaced by the shifted exponentials $\{e^{i(k + \sigma_\gamma/4)s}\}_{k \in \mathbb{Z}}$, and the space $C_0^\infty(\Omega_0)$ replaced by $C_0^\infty(\Omega_0; \pi_2^* L)$. Here, $L$ is the Maslov line bundle over $\gamma$ and $\pi_2(x, s) := s$.

Proposition 1.4. Let $P_1^u \in Op_\hbar(S^{m,k})$ be a regular family of QCI Hamiltonians with quantum integral $P_2^u \in Op_\hbar(S^{m,k})$. Let $\gamma$ be a one-dimensional, Eliasson non-degenerate (see Definition 0.1) stable joint orbit of the Hamilton vector fields of $p_1$ and $p_2$ with Liapunov coefficients $\alpha_1(u)$ and $\alpha_2(u)$ respectively. Then, there exist microlocally unitary $\hbar$-Fourier integral operators, $F^u : C_0^\infty(\Omega) \to C_0^\infty(\Omega_0)$ and $C^\infty$ symbols $f_k(x, s, \xi, \sigma; \hbar) \sim \sum_{j=0}^{\infty} f_{jk}(x, s, \xi, \sigma)\hbar^j$; $k = 1, 2$ with the property that:
\[ F^{u*} P_k^u F^u =_{\Omega_0} Op_\hbar(f_k(x, s, \xi, \sigma; \hbar)) =_{\Omega_0} Op_\hbar(f_k^u(p, \sigma)) + \hbar\, Op_\hbar(r_{k1}^u(p, \sigma)) + \hbar^2\, Op_\hbar(r_{k2}^u(p, \sigma; \hbar)) \]
in $L^2(S^1 \times \mathbb{R})$. Here,
(i) $f_k^u(p, \sigma) = u\sigma + \frac{\alpha_k(u)}{2}(x^2 + \xi^2) + O(p^2)$,
(ii) $r_{k1}^u(p, \sigma) = O(p)$,
(iii) $\|Op_\hbar(r_{k2}^u(p, \sigma; \hbar))\|_0 = O(1)$.
Moreover, all functions and operators depend in a locally smooth fashion on $u \in I$.

Proof. Since we are working in the semiclassical regime and are localizing the analysis in a tubular neighbourhood of $\gamma_0 = \{(s, 1, 0, 0) \in T^*(S^1 \times \mathbb{R})\}$, the dual variable satisfies $\sigma \sim 1$ and, unlike the situation in [G, Z1], the scaling in this variable will not play much of a role here. To simplify the writing a bit, when there is no risk of confusion, we will drop the superscript $u$ denoting dependence on $u \in I$ with the understanding that all estimates are regular with respect to this parameter. First, by the semiclassical Egorov theorem combined with the classical Birkhoff construction in Lemma 1.3, it follows that
\[ F_0^* P_1 F_0 =_{\tilde{\Omega}} Op_\hbar(f_1(p, \sigma)) + \hbar\, Op_\hbar(w_1) + O(\hbar^\infty). \tag{23} \]
To estimate $w_1(x, s, \xi, \tau)$ we argue as follows: First, note that $p_1 - u$ is doubly-characteristic along the two-dimensional symplectic Poincaré cross-section, $\Sigma$. Indeed, given $z \in \Sigma$, we have that $T_z\Sigma = \{v \in T_z T^*M;\ p_1(z) - u = p_2(z) - u = 0,\ dp_1(v) = dp_2(v) = 0\}$. In terms of the canonical Liouville-Arnold coordinates $(x, s, \xi, \sigma)$ in Lemma 1.2, we can take $\Sigma = \{(x, s, \xi, \sigma) \in \Omega_0;\ s = s_0,\ \sigma = 1\}$.


One of the main properties of the subprincipal symbol of a doubly-characteristic pseudodifferential operator is that under conjugation by an elliptic $\hbar$-Fourier integral operator, $U$, it transforms according to the canonical transformation $\kappa$ associated with $U$ [Ta1]. Recall [F], given $p(x, \xi; \hbar) \sim \sum_{j=0}^{\infty} p_j(x, \xi)\hbar^j$, the subprincipal symbol of the $\hbar$-Weyl quantization, $Op_\hbar(p)$, is just $p_1(x, \xi)$. Now, let $U$ be a microlocally unitary $\hbar$-Fourier integral operator quantizing the Liouville-Arnold canonical transformation in Lemma 1.2 and consider the inclusion $\iota : \Sigma \to T^*M$. Since $\iota^* \sigma_{prin}(P_1^u - u)$ is doubly-characteristic along the orbit, $\gamma$, it follows that, for $z \in \gamma$,
\[ w_1(0, s, 0, 1) = \sigma_{sub}(U^{-1} P_1^u U - u)(\kappa(z)) = \sigma_{sub}(P_1^u - u)(z) = 0, \tag{24} \]
since we have assumed that $\sigma_{sub}(P_1) = 0$. So, by making a Taylor expansion around $(x, \xi) = (0, 0)$, it follows from (24) that
\[ w_1(x, s, \xi, \sigma) = O_{s,\sigma}(|x, \xi|). \tag{25} \]
Here, $O(|x, \xi|)$ denotes an error term which is of total order one in the variables $(x, \xi)$.

The next step involves putting the sub-principal term $w_1(x, s, \xi, \sigma)$ into classical Birkhoff normal form in the sense of Lemma 1.3. To do this, we conjugate $F_0^* P_1 F_0$ with the microlocally unitary, $\hbar$-pseudodifferential operator, $F_1 := \exp(i\, Op_\hbar(v_0))$ for appropriate $v_0(x, s, \xi, \sigma) \in C^\infty(\Omega_0)$. An application of the symbolic calculus, combined with the Calderón-Vaillancourt theorem gives
\begin{align}
F_1^{-1} F_0^{-1} P_1 F_0 F_1 &=_{\Omega_0} [Id + i\, Op_\hbar(v_0)]\, F_0^{-1} P_1 F_0\, [Id - i\, Op_\hbar(v_0)] + O(\hbar^2) \tag{26} \\
&=_{\Omega_0} [Id + i\, Op_\hbar(v_0)]\, [Op_\hbar(f_1) + \hbar\, Op_\hbar(w_1)]\, [Id - i\, Op_\hbar(v_0)] + O(\hbar^2). \tag{27}
\end{align}
The semiclassical subprincipal symbol of the pseudodifferential operator in the first term on the RHS of (27) is $w_1 - \{v_0, f_1\}$. Since $\partial_s f_1 = 0$, the first step reduces to solving the first-order linear transport equation
\[ (\nabla_\sigma f_1) \frac{\partial v_0}{\partial s} + (\nabla_\xi f_1) \frac{\partial v_0}{\partial x} - (\nabla_x f_1) \frac{\partial v_0}{\partial \xi} = w_1(s, \sigma, x, \xi). \tag{28} \]
To solve this equation, make a Fourier series decomposition in the $s$ variable on both sides of (28) and equate Fourier coefficients. Writing $v_0 = \sum_{k \in \mathbb{Z}} \hat{v}_{0k} e^{iks}$ and $w_1 = \sum_{k \in \mathbb{Z}} \hat{r}_{1k} e^{iks}$, for each $k \in \mathbb{Z}$ we get
\[ \{f_1(p, \sigma), \hat{v}_{0k}\}_{(x,\xi)} + ik\, (\nabla_\sigma f_1)\, \hat{v}_{0k} = \hat{r}_{1k}, \tag{29} \]
where $\{\cdot, \cdot\}_{(x,\xi)}$ denotes Poisson bracket in the $(x, \xi)$ variables alone. Next, we decompose $\hat{v}_{0k}$ and $\hat{r}_{1k}$ in another Fourier series by introducing polar variables $(r, \theta) \in \mathbb{R}_+ \times S^1$ in the $(x, \xi)$ variables, where $p = r^2$. In terms of these coordinates, Eq. (29) is just
\[ (r \nabla_p f_1) \frac{\partial \hat{v}_{0k}}{\partial \theta} + ik\, (\nabla_\sigma f_1)\, \hat{v}_{0k} = \hat{r}_{1k}. \tag{30} \]


Since $r_1 \in C^\infty$, it follows that $\hat{r}_{10} = \hat{r}_{10}(p, \sigma)$. When $k = 0$ Eq. (30) reduces to
\[ (r \nabla_p f_1) \frac{\partial \hat{v}_{00}}{\partial \theta} = \hat{r}_{10}(\sigma, p). \]
By subtracting the resonant term $\hat{r}_{10}(p, \sigma)$ from $r_1$ we can solve the above equation by putting $\hat{v}_{00} = 0$. Thus, it suffices to assume from now on that the zeroth Fourier coefficient of $r_1$ is zero and consequently, $k \neq 0$. By writing $\hat{v}_{0k}(x, \xi, \sigma) = \sum_{l \in \mathbb{Z}} \hat{v}_{0kl}(r, \sigma) e^{il\theta}$, we see that Eq. (30) is equivalent to:
\[ [\, il\, (r \nabla_p f_1) + ik\, (\nabla_\sigma f_1)\, ]\, \hat{v}_{0kl} = \hat{r}_{1kl} \tag{31} \]
for the double Fourier coefficients $\hat{v}_{0kl}$ and $\hat{r}_{1kl}$. First, by subtracting a resonant term from $w_1$, we can without loss of generality assume that $k \neq 0$ in (31). In order to solve the latter equation, we are thus reduced to showing that there are no non-zero, double Fourier coefficients $\hat{r}_{1kl}(r, \sigma)$ with $l\, (r \nabla_p f_1) + k\, (\nabla_\sigma f_1) = 0$ for $(p, \sigma) \in \Omega_0$. This last fact can be proved as follows: From Lemma 1.3 it follows that both $p_1$ and $p_2$ can be simultaneously put into Birkhoff normal form. Consequently,
\[ F_0^* P_2 F_0 =_{\Omega_0} Op_\hbar(f_2(p, \sigma)) + \hbar\, Op_\hbar(r_2) + O(\hbar^\infty). \tag{32} \]
Moreover, since $[P_1, P_2] = 0$ we deduce from the pseudodifferential symbolic calculus that
\[ \{f_1(p, \sigma), w_2\} = \{f_2(p, \sigma), w_1\}, \tag{33} \]
where $f_2(p, \sigma) = \sigma + \frac{\alpha_2}{2} p + O(p^2)$. So, from (33) we get that
\[ \big[ r\, (\nabla_p f_2)\, l + k\, (\nabla_\sigma f_2) \big]\, \hat{r}_{1kl} = \big[ r\, (\nabla_p f_1)\, l + k\, (\nabla_\sigma f_1) \big]\, \hat{r}_{2kl}. \tag{34} \]
Therefore,
\[ \big( r l\, (\alpha_2 + O(p)) + k\, (u + O(p)) \big)\, \hat{r}_{1kl} = \big( r l\, (\alpha_1 + O(p)) + k\, (u + O(p)) \big)\, \hat{r}_{2kl}. \]
Since by the Eliasson non-degeneracy assumption, $|\alpha_1 - \alpha_2| \ge \frac{1}{C}$ for some $C > 0$, it follows that for $\Omega_0$ sufficiently small, there cannot be any non-zero, double Fourier coefficients with $l\, (r \nabla_p f_1) + k\, (\nabla_\sigma f_1) = 0$ for $(p, \sigma) \in \Omega_0$ as long as $k \neq 0$. Indeed, suppose there were such Fourier coefficients. Then, from (34) above and the assumption that $\hat{r}_{1kl} \neq 0$, it then follows that $r\, (\nabla_p f_2)\, l + k\, (\nabla_\sigma f_2) = 0$. However, the Eliasson non-degeneracy condition rules this out. So, after subtracting $\hat{r}_{10}(p, \sigma)$ from the RHS of Eq. (30) we can solve for $\hat{v}_{0k}$ in (30) and the first step of the inductive proof is complete.

We also note that the compatibility equations (33) imply that one can choose $\hat{v}_{0kl}$ to simultaneously solve
\[ [\, il\, (r \nabla_p f_2) + ik\, (\nabla_\sigma f_2)\, ]\, \hat{v}_{0kl} = \hat{r}_{2kl}. \]
This equation combined with (31) and (33) implies that for any $(r, \sigma) \in \Omega_0$ and $N > 0$, $|\hat{v}_{0kl}(r, \sigma)| \le C_N (1 + |k| + |l|)^{-N}$, and so, $v_0 \in C^\infty$. Furthermore, by (25) we have that $w_1(s, \sigma, x, \xi) = O(|x, \xi|)$ and so, $\sigma_{sub}(F_1^* F_0^* P_1 F_0 F_1) = \hat{r}_{10}(p, \sigma) = O(p)$. To complete the proof of Proposition 1.4, continue this process by further conjugating $F_1^* F_0^* P_1 F_0 F_1$ with unitary pseudodifferential operators, $F_j = Op_\hbar(\hbar^j v_j(x, s, \xi, \sigma))$; $j = 2, 3, \ldots$, where $v_j \in C_0^\infty(\Omega_0)$ and solving the respective analogues of the transport equation (30) after subtracting resonant errors.
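The mode-by-mode solution of (31) can be sketched numerically. The example below is a simplified illustration, assuming the frequencies $\nabla_\sigma f_1$ and $r \nabla_p f_1$ are frozen to constants `omega_s` and `omega_theta` at a fixed point $(p, \sigma)$, so that the transport equation becomes a constant-coefficient equation on the torus in $(s, \theta)$. The function names, the grid size and the tolerance are hypothetical choices, not part of the paper. Modes with vanishing divisor are returned separately, mirroring the subtraction of resonant terms in the proof.

```python
import cmath
import math


def fourier_coefficients(samples):
    """Double Fourier coefficients of samples[a][b] = w(2*pi*a/N, 2*pi*b/N), N odd."""
    n = len(samples)
    half = n // 2
    coeffs = {}
    for k in range(-half, half + 1):
        for l in range(-half, half + 1):
            total = 0j
            for a in range(n):
                for b in range(n):
                    phase = -2j * math.pi * (k * a + l * b) / n
                    total += samples[a][b] * cmath.exp(phase)
            coeffs[(k, l)] = total / (n * n)
    return coeffs


def solve_transport(samples, omega_s, omega_theta, tol=1e-12):
    """Solve omega_s * dv/ds + omega_theta * dv/dtheta = w one Fourier mode at a time.

    Returns (v_hat, resonant): the coefficients of v, and the modes of w with
    k*omega_s + l*omega_theta = 0, which cannot be removed and are left over.
    """
    v_hat, resonant = {}, {}
    for (k, l), w_kl in fourier_coefficients(samples).items():
        if abs(w_kl) < tol:
            continue  # mode absent up to rounding error
        divisor = k * omega_s + l * omega_theta
        if abs(divisor) < tol:
            resonant[(k, l)] = w_kl
        else:
            v_hat[(k, l)] = w_kl / (1j * divisor)
    return v_hat, resonant


def evaluate(coeffs, s, theta):
    """Sum the Fourier series with the given coefficients at the point (s, theta)."""
    return sum(c * cmath.exp(1j * (k * s + l * theta)) for (k, l), c in coeffs.items())
```

For instance, with $w = \cos(s + 2\theta) + \frac{1}{2}$ and frequencies $(1, \sqrt{2})$, the constant $\frac{1}{2}$ is the only resonant mode and the returned series sums to $\sin(s + 2\theta)/(1 + 2\sqrt{2})$.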


1.6. The model propagator. By passing to (QBNF), we will need to estimate the action of various semiclassical pseudodifferential operators on the metaplectic propagator, $U_0(t) = e^{itH_0/\hbar}$, where
\[ H_0 := Q_1 + \alpha Q_2 = \hbar D_s + \frac{\alpha}{2}\big(\hbar^2 D_x^2 + x^2\big). \]
The Schwartz kernel of the propagator, $U_0(t)$, is a simple metaplectic integral [F]. For $t \neq \frac{(2k+1)\pi}{2\alpha}$; $k \in \mathbb{Z}$ it is just
\[ U_0(x, x', s, s'; t) = (2\pi)^{-2} \sum_{k=-\infty}^{\infty} \int_{\mathbb{R}^2} e^{i\Psi(t, x, x', s+k, s', \sigma, \xi)/\hbar}\, (\sec \alpha t)^{\frac{1}{2}}\, d\sigma\, d\xi, \tag{35} \]
where
\[ \Psi(t, x, x', s, s', \sigma, \xi) = \phi(t, x, \xi) - x'\xi + (t - (s - s'))\sigma, \tag{36} \]
and
\[ \phi(t, x, \xi) := -\frac{1}{2}(\tan \alpha t)(x^2 + \xi^2) + (\sec \alpha t)\, x\xi. \]
Sometimes, it is convenient to rewrite the Schwartz kernel, $U_0(t)$, in terms of semiclassical Weyl quantization [Ta2]:
\[ U_0(t, x, x', s, s') = (2\pi\hbar)^{-2} \sum_{k \in \mathbb{Z}} \int_{\mathbb{R}^2} e^{i[(s + 2\pi k - s' - t)\sigma + (x - x')\xi]/\hbar}\, E_{\alpha t}\Big(\frac{x + x'}{2}, \xi; \hbar\Big)\, d\xi\, d\sigma, \tag{37} \]
where
\[ E_{\alpha t}(x, \xi; \hbar) = (\cos \alpha t)^{-1} e^{i[\tan \alpha t\, (x^2 + \xi^2)]/\hbar}. \tag{38} \]
Here, the Schwartz kernel, $U_0$, extends continuously to $U_0(t) \in S'(\mathbb{R} \times S^1 \otimes \mathbb{R} \times S^1)$ for all $t \in \mathbb{R}$. We will henceforth freely use both left-reduced and Weyl quantizations.

In the course of proving Theorems 0.3 and 0.4, we will need to understand explicitly the action of various $\hbar$-Weyl pseudodifferential operators acting on the model propagator $U_0(t)$ at the level of the Schwartz kernel for very long times $t \sim \mu(\hbar)^{-1}$. We begin with the semiclassical analogue of the construction of the standard Heisenberg pseudodifferential functional calculus (see for instance [G] and [Ta2]).

Lemma 1.5. Let $b(p, \sigma) \in S^{0,-\infty}(\mathbb{R} \times S^1)$ and define $b(Q_1, Q_2)$ by the functional calculus. Then, $b(Q_1, Q_2) = Op_\hbar(\tilde{b}(p, \sigma; \hbar))$ with
\[ \tilde{b}(p, \sigma; \hbar) = b(p, \sigma) + \sum_{j=1}^{N-1} r_j(p, \sigma)\, p^j \hbar^j + r_N(p, \sigma; \hbar)\, \hbar^N, \]
where $r_j \in S^{0,-\infty}(\mathbb{R} \times S^1)$ for $j = 1, \ldots, N - 1$.


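Proof. The proof is similar to that in [Ta2] Proposition 3.2, so we will only sketch the argument. Write $s=(s_1,s_2)\in\mathbb{R}^2$ and let $\zeta(s)\in C_0^\infty(\mathbb{R}^2)$ be a cutoff supported in the ball $|s|\le\frac{1}{2}$ and identically equal to 1 when $|s|\le\frac{1}{4}$. By the Fourier inversion formula $b(Q_1,Q_2) = I_1+I_2$, where

$$ I_1 := (2\pi\hbar)^{-2}\int_{\mathbb{R}^2} e^{i[s_1Q_1+s_2Q_2]/\hbar}\,\zeta(s_1,s_2)\,\hat b(\hbar^{-1}s_1,\hbar^{-1}s_2)\,ds_1\,ds_2, $$

and

$$ I_2 := (2\pi\hbar)^{-2}\int_{\mathbb{R}^2} e^{i[s_1Q_1+s_2Q_2]/\hbar}\,(1-\zeta(s_1,s_2))\,\hat b(\hbar^{-1}s_1,\hbar^{-1}s_2)\,ds_1\,ds_2. $$

Since $b\in\mathcal{S}(\mathbb{R}^2)$ it follows that $\|I_2\|_0 = O(\hbar^\infty)$. As far as $I_1$ is concerned, modulo $O(\hbar^\infty)$ terms in $L^2(\mathbb{R}\times S^1)$, we have by the formulas in (37) and (38) that

$$ I_1 = (2\pi\hbar)^{-2}\sum_{k\in\mathbb{Z}}\int_{\mathbb{R}^2} e^{i[(s+2\pi k-s')\sigma+(x-x')\xi]/\hbar}\, J_1\!\left(\frac{x+x'}{2},\xi,\sigma\right) d\xi\,d\sigma, \tag{39} $$

where

$$ J_1(x,\xi,\sigma) = (2\pi\hbar)^{-2}\int_{\mathbb{R}} (\cos s_2)^{-1}\, e^{i[\tan s_2\,(x^2+\xi^2)]/\hbar}\, f(\sigma,s_2;\hbar)\,ds_2, \tag{40} $$

and

$$ f(\sigma,s_2;\hbar) := \int_{\mathbb{R}} e^{-is_1\sigma/\hbar}\,\zeta(s_1,s_2)\,\hat b(\hbar^{-1}s_1,\hbar^{-1}s_2)\,ds_1. $$

Then, given the estimate

$$ \left| e^{i\theta} - \sum_{j=0}^{N-1}\frac{(i\theta)^j}{j!} \right| \le \frac{|\theta|^N}{N!}, $$

we can write

$$ J_1(x,\xi,\sigma) = (2\pi\hbar)^{-2}\int_{\mathbb{R}} e^{i[s_2(x^2+\xi^2)]/\hbar}\, a(p,s_2;\hbar)\, f(\sigma,s_2;\hbar)\,ds_2. \tag{41} $$

Here,

$$ a(p,s_2;\hbar) = 1 + \sum_{j=1}^{N-1} k_j(s_2)\left(\frac{p}{\hbar}\right)^{j} + \hbar^{-N}R_N(s_2,p;\hbar), $$

where $k_j(s) = O(s^{2j})$ and $|R_N(s,p;\hbar)| = O(s^{2N}p^N)$. The lemma then follows by making the change of variables $\tilde s_j = s_j/\hbar$; $j=1,2$ in the integral formula (41) for $J_1$.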
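The effect of the final change of variables can be spelled out term by term. The following is only a sketch of the scaling, assuming the bounds $k_j(s)=O(s^{2j})$ and $R_N(s,p;\hbar)=O(s^{2N}p^N)$ quoted in the proof; the rescaled variable $\tilde s_2$ is the one introduced there.

```latex
% Sketch: substitute s_2 = \hbar \tilde{s}_2 in the expansion of a(p, s_2; \hbar).
% Assumes k_j(s) = O(s^{2j}) and R_N(s,p;\hbar) = O(s^{2N} p^N), as stated in the proof.
\begin{align*}
k_j(\hbar\tilde s_2)\Big(\frac{p}{\hbar}\Big)^{j}
  &= O\big(\hbar^{2j}\,\tilde s_2^{\,2j}\big)\,\hbar^{-j}\,p^{j}
   = O\big(\tilde s_2^{\,2j}\big)\,\hbar^{j}p^{j}, \\
\hbar^{-N}R_N(\hbar\tilde s_2,p;\hbar)
  &= \hbar^{-N}\,O\big(\hbar^{2N}\,\tilde s_2^{\,2N}p^{N}\big)
   = O\big(\tilde s_2^{\,2N}p^{N}\big)\,\hbar^{N}.
\end{align*}
```

The polynomial growth in $\tilde s_2$ is harmless because $\hat b$ decays rapidly, which is consistent with the terms $r_j\,p^j\hbar^j$ and the remainder of order $\hbar^N$ in Lemma 1.5.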


1.7. Diophantine approximation. As we have already pointed out, our first result (Proposition 2.5) for the model Hamiltonian $H_0 = Q_1+\alpha_1Q_2$ is deterministic. In order to carry out the analysis in this case, we will need to make a standard Diophantine assumption about the Liapunov coefficient $\alpha_1\in\mathbb{R}$ of the stable, periodic joint orbit $\gamma$.

Definition 1.6. We say that $\alpha\in\mathbb{R}$ is Diophantine if for any $\epsilon>0$ there exists a constant $C_{\alpha,\epsilon}>0$ such that the inequality

$$ \left|\alpha - \frac{p}{q}\right| \ge \frac{C_{\alpha,\epsilon}}{q^2(\log q)^{1+\epsilon}} \qquad \text{(DIO)} \tag{42} $$

is satisfied for all integers $p$ and $q$ ($q>0$).

The fundamental theorem of Khinchin [K] on metric Diophantine approximation implies in particular that for Lebesgue almost all $\alpha\in\mathbb{R}$, condition (DIO) is indeed satisfied. It is important to note that the Diophantine condition (DIO) is only imposed in Proposition 2.5. Our subsequent results (Theorems 0.3, 0.4 and 0.5) are probabilistic since we average over entire families of Liapunov coefficients.

2. Deterministic Analysis of the Transversally Metaplectic Model

We now carry out a deterministic analysis of the DOS measure in the simplest possible model case. Consider the transversally metaplectic model operator $H_0 = Q_1+\alpha Q_2$, and for $\phi\in\mathcal{S}_0(\mathbb{R})$, the corresponding DOS measure

$$ d\rho_\mu(\phi) = \mathrm{Tr}\int_{-\infty}^{\infty} e^{it[H_0-1]/\hbar}\,\chi_0(Q_1,Q_2)\,\check\phi(\mu t)\,dt. \tag{43} $$

Small times $|t|\le 1/C$ can be handled using stationary phase. So, without loss of generality, we assume here that $|t|\ge 1/C>0$. To obtain a formula for the trace of the microlocalized propagator $e^{itH_0/\hbar}\circ\chi_0$ for such long times, just as in Lemma 1.5, we use the Fourier inversion formula in the functional calculus to write

$$ V_0(t) := \mathrm{Tr}\big(e^{itH_0/\hbar}\circ\chi_0\big) = (2\pi\hbar)^{-2}\,\mathrm{Tr}\int_{\mathbb{R}^2} e^{i[s_1Q_1+s_2Q_2]/\hbar}\,\hat g(\hbar^{-1}s_1,\hbar^{-1}s_2)\,ds_1\,ds_2, \tag{44} $$

with

$$ g(w_1,w_2) := \chi_0(w_1,w_2)\,e^{it[w_1+\alpha w_2]/\hbar}. \tag{45} $$

Given the explicit formula for the Mehler kernel in (37), we can rewrite (44) as

$$ V_0(t) = (2\pi\hbar)^{-2}\sum_{k\in\mathbb{Z}}\int_{\mathbb{R}^3} e^{2\pi k i\sigma/\hbar}\,K_0(x,\xi,\sigma)\,dx\,d\xi\,d\sigma, \tag{46} $$

where

$$ K_0(x,\xi,\sigma) = K_{01}(x,\xi,\sigma) + K_{02}(x,\xi,\sigma) \tag{47} $$

with

$$ K_{01}(x,\xi,\sigma) = (2\pi\hbar)^{-2}\int (\cos s_2)^{-1}\,e^{i[\tan s_2\,(x^2+\xi^2)]/\hbar}\,\chi_1(s_2)\,\hat g(\sigma,\hbar^{-1}s_2)\,ds_2. \tag{48} $$


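Here, $\hat g$ denotes the Fourier transform in the $s_2$ variable and for fixed $C>0$, we take

$$ \Omega_1 := \left\{ s_2\in\mathbb{R};\ \left|s_2 - \frac{(2k+1)\pi}{2}\right| \ge \frac{1}{C};\ k\in\mathbb{Z} \right\}. \tag{49} $$

The expression for $K_{02}$ is obtained by taking the Fourier transform of $K_{01}$ and integrating over the complement of $\Omega_1$. We carry out the analysis for $K_{01}(x,\xi,\sigma)$ here, noting that $K_{02}(x,\xi,\sigma)$ can be estimated in a similar fashion. First, note that by an integration by parts in $\hat g$ in $w_2$ it suffices, modulo $O(\hbar^\infty)$ error, to assume that for any $s_2\in\Omega_1$, we restrict ourselves to $t\in\mathbb{R}$ satisfying

$$ |\alpha t - s_2| \le \hbar^{1-\epsilon} \tag{50} $$

for $\epsilon>0$ arbitrarily small. For such $t\in\mathbb{R}$, there exists a uniform constant $C>0$ such that for $\hbar\le\hbar_0$,

$$ \frac{1}{C} \le |\cos\alpha t| \le C. $$

Then, using the estimate $\left|e^{i\theta}-\sum_{j=0}^{N-1}\frac{(i\theta)^j}{j!}\right| \le \frac{|\theta|^N}{N!}$, we write

$$ \tan s_2 - \tan\alpha t = (\sec^2\alpha t)\cdot(s_2-\alpha t) + \sum_{j=2}^{\infty} c_j(s_2,t)\,(s_2-\alpha t)^j, $$

where $c_j(s_2,t) = O_j(1)$. Inserting this expansion in the formula (48) we get that, modulo $O(\hbar^\infty)$ error, $K_{01}(x,\xi,\sigma)$ equals

$$ (2\pi\hbar)^{-2}\,E(\hbar^{-1}(\alpha t),x,\xi)\int e^{i[\sec^2\alpha t\,(s_2-\alpha t)(x^2+\xi^2)]/\hbar}\,\sum_{j=0}^{N-1} d_j(s,t)\,(s_2-\alpha t)^{2j}\left(\frac{p}{\hbar}\right)^{j}\hat g(\sigma,\hbar^{-1}s_2)\,ds_2, \tag{51} $$

where $\mathrm{supp}\,d_j(s_2,t)\subset\{(s_2,t);\ s_2\in\mathrm{supp}(\chi_1),\ |\alpha t-s_2|\le\hbar^{1-\epsilon}\}$ and $d_j(s_2,t)=O_j(1)$ uniformly for $s_2\in\Omega_1$ and $\hbar>0$ sufficiently small. We have thus proved

Lemma 2.1. For $t\in\frac{1}{\alpha}\Omega_1$ there exists a bounded family of symbols $a(\sigma,p;t,\hbar)\in S^{0,-\infty}(\mathbb{R}\times S^1)$, such that modulo an error which is $O(\hbar^\infty)$ in $L^2(\mathbb{R}\times S^1)$,

$$ V_0(t) = (2\pi\hbar)^{-2}\sum_{k\in\mathbb{Z}}\int_{\mathbb{R}^3} e^{i[t-2k\pi]\sigma/\hbar}\,E(\hbar^{-1}(\alpha t),x,\xi)\,a(p,\sigma;t,\hbar)\,dx\,d\xi\,d\sigma. $$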

By applying the Fourier transform to $E(t,x,\xi)$, one can derive an analogous expansion on the complement where $t\in\mathbb{R}-\frac{1}{\alpha}\Omega_1$. We first consider the case where $\mu(\hbar)\sim\hbar^\beta$ where $0<\beta<\frac{1}{2}$. As a consequence of the reduction to normal form, it will follow that for time-scales on this order, it is still possible to carry out a deterministic analysis of the trace under the Diophantine assumption (DIO) on the Liapunov coefficient $\alpha$. We should point out that this fact is consistent with the Ehrenfest time $T\sim\hbar^{-1/2}$ for coherent states centered on stable geodesics [CRR, Ha]. Let $\chi_1(t)$ be a cutoff function supported on the set $\pi_1(\Omega_1)$, where $\pi_1:\mathbb{R}^2\to\mathbb{R}$ denotes projection onto the first component. Then, we can write $d\rho_\mu(\phi) = d\rho_\mu^{(1)}(\phi) + d\rho_\mu^{(2)}(\phi)$, where, as a consequence of Lemma 2.1, we have

$$ d\rho_\mu^{(1)}(\phi) = (2\pi\hbar)^{-1}\sum_{k=-\infty}^{\infty}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,d\xi\,d\sigma\,dx\,dt + O(\hbar^\infty), \tag{52} $$

where

$$ \phi(t,x,\xi) = -\frac{1}{2}(\tan\alpha t)(x^2+\xi^2) + (\sec\alpha t)\,x\xi $$

and $a(p,\sigma;t,\hbar)$ is given in Lemma 2.1. We will now carry out the analysis for $d\rho_\mu^{(1)}$ in detail; $d\rho_\mu^{(2)}$ can be treated in a similar fashion. The ansatz is the usual one when dealing with the semiclassical trace formula [DG, GU]: we isolate the "big" singularity at $t=0$ and estimate the residual terms coming from the iterates of the primitive period $t=1$ of the stable geodesic $\gamma$. Of course, since the time-scale is now semiclassical, we will have to estimate these residual terms, which involve large trigonometric sums. First, let $\chi_2(t)\in C_0^\infty(\mathbb{R};[0,1])$ be a cutoff supported in $[-1/2,1/2]$ and identically equal to 1 in $[-1/4,1/4]$, and let $\epsilon>0$ be an arbitrarily small positive number. We rewrite the integral on the RHS of (52) as:

$$ \begin{aligned}
(2\pi\hbar)^{-1}&\sum_{k=-\infty}^{\infty}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,d\xi\,d\sigma\,dx\,dt \\
&= (2\pi\hbar)^{-1}\int_{\mathbb{R}^4} e^{i[t\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,\chi_2(t)\,d\xi\,d\sigma\,dx\,dt \\
&\quad + (2\pi\hbar)^{-1}\sum_{|k|\neq 0}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,\chi_2(\hbar^{\epsilon-1}(t-2\pi k))\,d\xi\,d\sigma\,dx\,dt \\
&\quad + (2\pi\hbar)^{-1}\sum_{|k|\neq 0}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,[1-\chi_2(\hbar^{\epsilon-1}(t-2\pi k))]\,d\xi\,d\sigma\,dx\,dt.
\end{aligned} \tag{53} $$

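First, we claim that in the second integral on the RHS of (53) we can take $|k|\le 2\mu^{-1-\epsilon}$ provided we choose $\epsilon$ sufficiently small. Indeed, this is an immediate consequence of the support properties of $\chi_2$ in the integrand. Second, we claim that the last integral in (53) is $O(\hbar^\infty)$. To see this, given $\epsilon>0$, we further decompose the last integral in (53) into

$$ (2\pi\hbar)^{-1}\sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,[1-\chi_2(\hbar^{\epsilon-1}(t-2\pi k))]\,d\xi\,d\sigma\,dx\,dt \tag{54} $$

$$ +\ (2\pi\hbar)^{-1}\sum_{|k|\ge 2\mu^{-1-\epsilon}}\int_{\mathbb{R}^4} e^{i[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t]/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,[1-\chi_2(\hbar^{\epsilon-1}(t-2\pi k))]\,d\xi\,d\sigma\,dx\,dt. \tag{55} $$

We claim that the integral in (55) is $O(\hbar^\infty)$. This is indeed the case, since

$$ \frac{\partial}{\partial\sigma}\big[(t-2k\pi)\sigma+\phi(t,x,\xi)-x\xi-t\big] = t-2\pi k, $$

and so by repeatedly integrating by parts with respect to $\frac{\partial}{\partial\sigma}$ it follows that, for $|k|\ge 2\mu^{-1-\epsilon}$,

$$ \left|\hbar^N(t-2\pi k)^{-N}\,\check\phi(\mu(\hbar)t)\,\nabla_\sigma^N\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,[1-\chi_2(\hbar^{\epsilon-1}(t-2\pi k))]\right| \le C_N\,\hbar^N|t|^{-N}\left|1-\frac{2\pi k}{t}\right|^{-N}|\check\phi(\mu t)|. $$

So, the integral in (55) is bounded by

$$ C_N\,\hbar^{N-1}\sum_{|k|\ge 2\mu^{-1-\epsilon}}\ \sup_{1/2\le|t|<\mu^{-1-\epsilon}}\left|1-\frac{2\pi k}{t}\right|^{-N} \le C_N\,\hbar^{N-1}\mu^{-1-\epsilon}\int_2^\infty |x-1|^{-N}\,dx. $$

Consequently, by choosing $N>0$ large enough it follows that the integral in (55) is $O(\hbar^\infty)$. To handle the integral in (54), we also integrate by parts with respect to $\frac{\partial}{\partial\sigma}$ and note that for $|k|\le 2\mu^{-1-\epsilon}$,

$$ \left|\hbar^N(t-2\pi k)^{-N}\,\check\phi(\mu(\hbar)t)\,\nabla_\sigma^N\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,[1-\chi_2(\hbar^{\epsilon-1}(t-2\pi k))]\right| \le C_N\,\hbar^{\epsilon N}\,|\check\phi(\mu t)|. $$

So, the integral in (54) is bounded by

$$ C_N\,\hbar^{\epsilon N}\sum_{k=1}^{2/\mu^{1+\epsilon}}\int_{\mathbb{R}}|\check\phi(\mu t)|\,dt \le C_N\,\hbar^{\epsilon N}\mu^{-2-\epsilon}, $$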

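and consequently, the last integral in (53) is $O(\hbar^\infty)$. Summing up, we have proved:

Proposition 2.2. For $\epsilon>0$ sufficiently small, the microlocal small-scale trace can be decomposed as follows:

$$ d\rho_\mu^{(1)}(\phi) = d\rho_{\mu,0}^{(1)}(\phi) + d\rho_{\mu,+}^{(1)}(\phi) + O(\hbar^\infty), $$

where, given $\Phi(t,x,\xi,\sigma) := \Phi(t,x,x,s,s,\sigma,\xi)$ (see (36)), we have

$$ d\rho_{\mu,0}^{(1)}(\phi) = (2\pi\hbar)^{-1}\int_{\mathbb{R}^4} e^{i\Phi(t,x,\xi,\sigma)/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,\chi_2(t)\,d\xi\,d\sigma\,dx\,dt, $$

$$ d\rho_{\mu,+}^{(1)}(\phi) = (2\pi\hbar)^{-1}\sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{\mathbb{R}^4} e^{i\Phi(t+2\pi k,x,\xi,\sigma)/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,\chi_2(\hbar^{\epsilon-1}(t-2\pi k))\,d\xi\,d\sigma\,dx\,dt. $$

A similar decomposition holds for $d\rho_\mu^{(2)}(\phi)$.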


2.1. Proof of Proposition 2.5 (Transversally metaplectic model case). We are now in a position to prove Proposition 2.5 for the model Hamiltonian $H_0 = Q_1+\alpha Q_2$. The main point here is to use the decomposition in Proposition 2.2 and estimate the terms $d\rho_{\mu,0}^{(1)}(\phi)$ coming from $t=0$ and $d\rho_{\mu,+}^{(1)}(\phi)$ coming from non-zero periods respectively. First, we have

Lemma 2.3.

$$ d\rho_{\mu,0}^{(1)}(\phi) = (2\pi\hbar)^{-1}\int_{\mathbb{R}^4} e^{i\Phi(t,x,\xi,\sigma)/\hbar}\,\check\phi(\mu(\hbar)t)\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,\chi_2(t)\,d\xi\,d\sigma\,dx\,dt = \left(\int_{H_0=1}\chi\,d\omega\right)\check\phi(0) + O(\hbar^2). $$

Proof. The proof of the lemma is a straightforward application of the method of stationary phase [DG], taking into account the fact that $\sigma_{sub}(H_0)=0$.

To estimate the integral $d\rho_{\mu,+}^{(1)}(\phi)$, we apply the Fubini theorem to get

$$ d\rho_{\mu,+}^{(1)}(\phi) = (2\pi\hbar)^{-1}\sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{\mathbb{R}^2} e^{i[(t-2\pi k)\sigma-t]/\hbar}\,I(t,\sigma)\,\check\phi(\mu(\hbar)t)\,\chi_2(\hbar^{\epsilon-1}(t-2\pi k))\,d\sigma\,dt + O(\hbar^\infty), \tag{56} $$

where

$$ I(t,\sigma) := \int_{\mathbb{R}^2} e^{i[\phi(t,x,\xi)-x\xi]/\hbar}\,\chi_1(t)\,a(p,\sigma,t;\hbar)\,dx\,d\xi. \tag{57} $$

We would clearly like to expand this integral by applying the lemma of stationary phase (with parameters) [H] in the $(x,\xi)$ variables. The following lemma shows that this term is dominated by $d\rho_{\mu,0}$ as long as $\mu(\hbar)\sim\hbar^\beta$ with $0<\beta<1/2$. We start with a simple lemma:

Lemma 2.4. Suppose $\mu(\hbar)\sim\hbar^k$ for $0<k<\frac{1}{2}$. Then, there exists a constant $C_4>0$ such that for $(x,\xi)\in\mathrm{supp}\,\chi$, $|t|\le 2\mu^{-1}$ and $\hbar\in(0,\hbar_0]$,

$$ \left|\det\nabla^2_{x,\xi}\big(\phi(t,x,\xi)-x\xi\big)\right|^{1/2} \ge C_4\,\mu(\hbar)\,|\log\mu(\hbar)|^{-1-\epsilon}. $$

Proof. Here we compute that $\left|\det\nabla^2_{x,\xi}(\phi(t,x,\xi)-x\xi)\right|^{1/2} = (1-\sec\alpha t)^{1/2}$. Since $1-\cos x\sim|x-l\pi|^2$ near $x=l\pi$; $l\in 2\mathbb{Z}$, it follows by the Diophantine assumption (DIO) that for $(k,l)\in\mathbb{Z}^2$ with $|k|,|l|\le 2\mu(\hbar)^{-1}$,

$$ 1-\cos(2k\pi\alpha) \ge C_4\,\mu^2\,|\log\mu|^{-2-\epsilon}. $$

When $|t-2\pi k| = O(\hbar^{1-\epsilon})$ for any $\epsilon>0$, it follows by Taylor expansion that $1-\cos(\alpha t) = 1-\cos(2k\pi\alpha) + O(\hbar^{1-\epsilon})$. Since $|\cos x|\le 1$, we take square roots of both sides of this inequality and the lemma follows since $\epsilon>0$ can be chosen arbitrarily small.
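The way (DIO) enters the proof of Lemma 2.4 can be made explicit as follows. This is a sketch under assumptions that are not spelled out in the text: $2\le|k|\le 2\mu^{-1}$, $\mu\le\frac{1}{2}$, $l$ is the integer nearest to $k\alpha$ (so that $|k\alpha-l|\le\frac12$), and $c$, $C'$ denote generic positive constants.

```latex
% Sketch: from the Diophantine condition (DIO) to the lower bound on 1 - cos(2 pi k alpha).
% Uses (DIO) with q = |k|, 1/|k| >= mu/2, log|k| <= 2|log mu|,
% and the elementary bound sin(pi x) >= 2x for 0 <= x <= 1/2.
\begin{align*}
|k\alpha - l| &= |k|\,\Big|\alpha - \frac{l}{k}\Big|
   \;\ge\; \frac{C_{\alpha,\epsilon}}{|k|\,(\log|k|)^{1+\epsilon}}
   \;\ge\; C'\,\mu\,|\log\mu|^{-1-\epsilon}, \\
1-\cos(2\pi k\alpha) &= 2\sin^{2}\!\big(\pi(k\alpha-l)\big)
   \;\ge\; 8\,|k\alpha-l|^{2}
   \;\ge\; c\,\mu^{2}\,|\log\mu|^{-2-2\epsilon}.
\end{align*}
```

Since $\epsilon>0$ is arbitrary, $2\epsilon$ can be relabelled as $\epsilon$, which gives the form of the bound used in the proof.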


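We can apply the lemma of stationary phase (with parameters) to the integral $I(t,\sigma)$ and get that:

$$ I(t,\sigma) = (2\pi\hbar)(1-\sec\alpha t)^{-1/2}\left[\chi_1(t)\,a(0,\sigma,t;\hbar) + O\big(\hbar\,|1-\sec\alpha t|^{-1/2}\big)\right]. \tag{58} $$

Substituting this expansion in (56) gives

$$ d\rho_{\mu,+}^{(1)}(\phi) = \sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{\mathbb{R}^2} e^{i[(t-2\pi k)\sigma-t]/\hbar}\,\chi_1(t)\,\check\phi(\mu(\hbar)t)\,(1-\cos\alpha t)^{-1/2}\,\chi_2(\hbar^{\epsilon-1}(t-2\pi k))\,d\sigma\,dt \tag{59} $$

$$ +\ \sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{\mathbb{R}} |\check\phi(\mu(\hbar)t)|\;O\big(\hbar\,|1-\cos\alpha t|^{-1}\big)\,\chi_1(t)\,\chi_2(\hbar^{\epsilon-1}(t-2\pi k))\,dt + O(\hbar^\infty). \tag{60} $$

Given Lemma 2.4, we see that the sum of the integrals in (59) and (60) is bounded by

$$ C_5\,\mu(\hbar)^{-1}|\log\mu(\hbar)|^{1+\epsilon}\sum_{|k|\neq 0}^{2\mu^{-1-\epsilon}}\int_{|t-2\pi k|\le\hbar^{1-\epsilon}} dt \le C_5\,\hbar^{1-\epsilon}\mu(\hbar)^{-2}|\log\mu(\hbar)|^{1+\epsilon}. \tag{61} $$

Again, $d\rho_\mu^{(2)}(\phi)$ can be handled in a similar fashion and we have thus proved:

Proposition 2.5. Let $H_0 := Q_1+\alpha_1Q_2$ be defined as above. For $\mu(\hbar)\sim\hbar^\beta$ with $0<\beta<1/2$ we have that:

$$ d\rho_\mu(\phi) = (2\pi)^{-n}\left(\int_{H_0=1}\chi\,d\omega\right)\check\phi(0) + O(\hbar\,\mu^{-2-\epsilon}), \qquad \forall\,\epsilon>0. $$

So, in particular, for such values of $\mu(\hbar)$,

$$ w\text{-}\lim_{\hbar\to 0} d\rho_\mu(x) = c_0\,dx. $$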
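The restriction $0<\beta<\frac12$ can be read off from the remainder term. The following short computation is a sketch, assuming the remainder in Proposition 2.5 has the form $O(\hbar\,\mu^{-2-\epsilon})$ and taking $\mu(\hbar)=\hbar^\beta$ exactly rather than up to constants.

```latex
% Sketch: why the remainder in Proposition 2.5 vanishes precisely when beta < 1/2.
% Assumes mu(hbar) = hbar^beta exactly and a remainder of the form hbar * mu^{-2-epsilon}.
\begin{align*}
\hbar\,\mu(\hbar)^{-2-\epsilon} &= \hbar^{\,1-2\beta-\epsilon\beta}, \\
\hbar^{\,1-2\beta-\epsilon\beta} \to 0 \ \text{ as } \hbar\to 0
  \quad&\Longleftrightarrow\quad \epsilon < \frac{1-2\beta}{\beta}.
\end{align*}
```

An admissible $\epsilon>0$ exists exactly when $\beta<\frac12$. At $\beta=\frac12$ the exponent is negative for every $\epsilon>0$, so this deterministic estimate gives no information there.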

3. Reduction to the Model Trace

We will now carry out the transition to the model problem in the microlocalized trace. The main step here involves replacing the propagator $U(t) = e^{itP_1/\hbar}$ by the model propagator $U_0(t) = e^{itf(Q_1,Q_2;\hbar)/\hbar}$ using the QBNF construction in Proposition 1.4. We were motivated here by the recent "coherent states" proof of the semiclassical trace formula by Combescure-Ralston-Robert [CRR] and in particular, the subsequent use of the Duhamel formula to pass to the model problem given by a transversal metaplectic Hamiltonian. However, the treatment here differs from that in [CRR] since the integrability assumption together with the corresponding convergent QBNF construction enables one to get more explicit (and better) time-dependent estimates than in the general case. This improvement is crucial for proving both Theorems 0.3 and 0.4. To begin, let $\chi\in C_0^\infty(\Omega)$, where $\Omega$ is a tubular neighbourhood of the geodesic $\gamma$. Here, we take $\Omega$ sufficiently small so that the Birkhoff construction in Lemma 1.3 is valid in $\Omega$. Let $\phi\in\mathcal{S}(\mathbb{R})$ and assume that $\check\phi\in C_0^\infty(\mathbb{R})$. We are interested in the asymptotics as $\hbar\to 0$ of the pointwise, semiclassical DOS measure given by:

$$ d\rho_\mu(\phi;\chi) := \mu(\hbar)^{-1}\sum_{j=1}^{\infty}\langle Op_\hbar(\chi)\psi_j,\psi_j\rangle\,\phi\big([\lambda_j(\hbar)-1]\,\hbar^{-1}\mu(\hbar)^{-1}\big). \tag{62} $$

Applying the Fourier inversion formula in the functional calculus and rescaling time so that $t=\mu(\hbar)s$, we get:

$$ d\rho_\mu(\phi;\chi) = \mathrm{Tr}\int_{-\infty}^{\infty} Op_\hbar(\chi)\,e^{-isH/\hbar}\,e^{is/\hbar}\,\check\phi(\mu(\hbar)s)\,ds. \tag{63} $$

The main point of invoking (QBNF) here is to rewrite the trace in (63) in terms of the model operators $Q_1 := \hbar D_s$ and $Q_2 := \frac{1}{2}(\hbar^2D_x^2+x^2)$ and then to explicitly analyze the resulting integrals. First, by essentially the same argument as in Lemma 1.5, one can show that for any $N>0$ there exists $\chi_N(x_1,x_2;\hbar)\in C_0^\infty([1-\epsilon,1+\epsilon]^2;\mathbb{R})$ such that

$$ \chi_N(P_1,P_2;\hbar) = \sum_{j=0}^{N-1}\chi_j(P_1,P_2)\,\hbar^j + O(\hbar^N) $$

in $L^2(M)$ and furthermore,

$$ Op_\hbar(\chi) = \chi_N(P_1,P_2;\hbar) + O(\hbar^N). \tag{64} $$

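Lemma 3.1. With $\chi_N(x_1,x_2;\hbar)$ as above, we have that:

$$ d\rho_\mu(\phi;\chi) = \mu(\hbar)^{-1}\sum_{j=1}^{\infty}\langle\chi_N(P_1,P_2;\hbar)\psi_j,\psi_j\rangle\,\phi\!\left(\frac{\lambda_{1j}-1}{\hbar\,\mu(\hbar)}\right) + O(\hbar^N). $$

The main result of this section is:

Proposition 3.2. Let $u_j(x,s) = e^{im_js}\,\Psi_{n_j}(\hbar^{-1/2}x)$ be the $L^2$-normalized joint eigenfunctions of the model operators $Q_1$ and $Q_2$ with eigenvalues $m_j\hbar$ and $n_j\hbar$ respectively, where $(m_j,n_j)\in\mathbb{Z}^+\times(\mathbb{Z}^++\frac{1}{2})$. Let $\chi_0 := F(\chi_N)$, where $F$ is the microlocal $\hbar$-Fourier integral operator constructed in Proposition 1.4, and let $f(x_1,x_2;\hbar)\sim\sum_{j=0}^{\infty}f_j(x_1,x_2)\,\hbar^j$ be the $C^\infty$ symbol in the QBNF construction in Proposition 1.4. Then,

$$ d\rho_\mu(\phi;\chi) = \mu^{-1}\sum_{j=1}^{\infty}\langle Op_\hbar(\chi_0)u_j,u_j\rangle\cdot\phi\big([f(m_j\hbar,n_j\hbar;\hbar)-1]\,\mu(\hbar)^{-1}\hbar^{-1}\big) + O(\hbar^\infty). $$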

Proof. Let χ˜ N ∈ C0∞ () be identically equal to 1 on supp (χN ). First, we claim that for ψj an L2 -normalized joint eigenfunction of (P1 , P2 ) with joint eigenvalues (λj 1 , λ2j ) ∈ supp (χN ), we have that χ˜ N (Q1 , Q2 ) [f (Q1 , Q2 ; ) − f (mj , nj ; )]F ∗ ψj 0 = O(∞ ),

(65)


and furthermore,

χ˜ N (Q1 , Q2 )F ∗ ψj 0 = 1 + O(∞ ). (66) Indeed, by the CBNF construction we know that in 0 ⊂ supp(χN ), we have p ≤ , |σ − 1| ≤ and moreover, pk ◦ κ = σ + αk p + O(p 2 ) k = 1, 2.

So, in 0 we have p1 ∼ 1 and p2 ∼ 1. For simplicity of notation from now on we write χN := χN (Q1 , Q2 ). From the pseudodifferential symbolic calculus and since [χN , χ˜ N ] = 0, χN ψj , ψj = χN χ˜ N2 ψj , ψj + O(∞ ) = χN χ˜ N ψj , χ˜ N ψj + O(∞ ). Now, by the QBNF construction, χN χ˜ N ψj , χ˜ N ψj = (F ∗ χN χ˜ N F ) F ∗ ψj , F˜ ∗ χN F F ∗ ψj + O(∞ ) = χ0 uj , uj + O(∞ ). Here we have put χ0 := F ∗ χN χ˜ N F and uj := F ∗ ψj , where Q1 uj =0 mj uj and Q2 uj =0 nj uj . Moreover, since χ˜ N = 1 on supp (χN ) and by construction F is microlocally unitary on , it readily follows from a microlocal parametrix construction in the pseudodifferential calculus that χ0 uj 0 = F ∗ χ˜ N F F ∗ ψj 0 + O(∞ ) = 1 + O(∞ ). From the last estimate, both (65) and (66) follow. To complete the proof, we need to uniquely note that any admissible [CP] microlocal solution uj ∈ D (R × S1 ) of the system of differential equations Q1 uj =0 mj uj , Q2 uj =0 nj uj , is uniquely characterized up to a C() multiple. To see this, note that by cutting-off in the fiber variables as in [CP], it follows that uj ∈ D (R×S1 ) extends to vj ∈ D (R×S1 ) satisfying Q1 vj = mj vj + O(∞ ), Q2 vj = (nj + 1/2)vj + O(∞ ) in L2 (R × S1 ), with vj ∼ =0 uj . Here, Spec Q1 = {j ; j ∈ Z} and Spec Q2 = {(j + 1/2); j ∈ Z+ } and so they both possess spectral gaps comparable to . Consequently, it follows that for some c() ∈ C(), vj (x, s) − c() eimj s nj (−1/2 x)0 = O(∞ ). Therefore, in particular, uj (x, s) =0 c()eimj s nj (−1/2 x). Here, mj ∈ Z+ and nj ∈ Z+ and nj is the nth j Hermite function. Finally, since uj 0 = 1 + O(∞ ), it follows that c() = 1 + O(∞ ). As a consequence of Proposition 3.2, we can rewrite the integral in (63) in terms of the model operators Q1 and Q2 as follows: ∞ ˇ dρµ (φ; χ ) = Tr [Op (χ0 ) · eis[f (Q1 ,Q2 ;)−1]/ ] φ(µ()s) ds + O(∞ ). (67) −∞


4. Analysis of the Model Problem ($\hbar\le\mu(\hbar)\le 1$)

As we have already seen in the previous section, in the course of carrying out a deterministic analysis of the small-scale trace $d\rho_\mu(\phi)$, we were compelled to choose $\mu(\hbar)\sim\hbar^\delta$ with $\delta<1/2$. The essential reason for this is that the cutoffs $\chi_2(\hbar^{\epsilon-1}(t-2\pi k))$ appearing in the trace are the sharpest that one can use and still control error terms effectively under the Diophantine assumption (DIO) on the Liapunov coefficient. Due to this limit on the sharpness of time resolution, to analyze DOS measures on scales which are smaller than $\hbar^{1/2}$, we will need to apply an averaging process over a family of models. The main point here is that by averaging over a deformation parameter $u$, we can make an integration by parts in the term $\int_{1-\epsilon}^{1+\epsilon} d\rho_{\mu,+}(\phi;u)\,du$ in the averaged microlocal DOS measure. This integration by parts improves the decay properties of the term $\int_{1-\epsilon}^{1+\epsilon} d\rho_{\mu,+}(\phi;u)\,du$ as $\hbar\to 0$ and allows us to estimate the averaged trace for time scales all the way up to $t\sim\hbar^{-1}$. Of course, the term $\int_{1-\epsilon}^{1+\epsilon} d\rho_{\mu,0}(\phi;u)\,du$ involving the trivial period $t=0$ is handled in the same way by applying stationary phase.

4.1. Proof of Theorem 0.3 (i). To make matters precise, we fix $\epsilon>0$ to be later chosen sufficiently small and let $\zeta(t)\in C_0^\infty(\mathbb{R})$ be supported in $|t|\le 2\epsilon$ and identically equal to 1 when $|t|\le\epsilon$. Let $p_1^u$ and $p_2^u$ define a regular variation with joint orbit $\gamma$. First, we decompose the averaged microlocal DOS as follows:

$$ \int_{1-\epsilon}^{1+\epsilon} d\rho_\mu(\phi;u)\,du = I_1+I_2, $$

where

$$ I_1 := \int_{1-\epsilon}^{1+\epsilon} d\rho_{\mu,0}(\phi;u)\,du = \int_0^\infty\!\!\int_{u_1}^{u_2}\mathrm{Tr}\,e^{it[f^u(Q_1,Q_2;\hbar)-1]/\hbar}\,\zeta(t)\,\chi_0(Q_1,Q_2)\,\check\phi(\mu t)\,du\,dt, \tag{68} $$

$$ I_2 := \int_{1-\epsilon}^{1+\epsilon} d\rho_{\mu,+}(\phi;u)\,du = \int_0^\infty\!\!\int_{u_1}^{u_2}\mathrm{Tr}\,e^{it[f^u(Q_1,Q_2;\hbar)-1]/\hbar}\,(1-\zeta(t))\,\chi_0(Q_1,Q_2)\,\check\phi(\mu t)\,du\,dt, \tag{69} $$

where $\zeta\in C_0^\infty(\mathbb{R})$ is a cutoff function supported in the set $|t|\le\frac{1}{2}$ and equal to 1 on the set $|t|\le\frac{1}{4}$. One can get an asymptotic expansion for $I_1$ in the usual way using small time analysis for the propagator $e^{itf(Q)/\hbar}$ together with stationary phase (see, for instance, Lemma 2.3) and get that

$$ I_1 = \int_{1-\epsilon}^{1+\epsilon}\left(\int_{p_1=u^{-1}}\chi_0\,d\omega_u\right)du + O(\hbar^2). \tag{70} $$

In (70), we use the fact that $\sigma_{sub}(P_1)=0$ and so the stationary phase expansion at $t=0$ does not contain terms of order $O(\hbar)$ [DG].


The analysis of the $I_2$ term proceeds as follows. First, note that since

$$ p_1^u(x,s,\xi,\sigma) = \sigma u + \frac{\alpha_1(u)}{2}(x^2+\xi^2) + O_u(p^2), $$

by taking the neighbourhood $\Omega_0$ of the joint orbit $\gamma_0$ sufficiently small, $\nabla_uf_1^u\in S^{1,0}$, and moreover,

$$ \nabla_uf_1^u(x,s,\xi,\sigma) = \sigma + O_u(p) \ge \frac{1}{C} \quad\text{for } (x,s,\xi,\sigma)\in\Omega_0. $$

Consequently, one can construct an $\hbar$-pseudodifferential operator $W^u(Q_1,Q_2;\hbar) = Op_\hbar(w^u)\in Op_\hbar(S^{-1,0})$ with the property that

$$ Op_\hbar(w^u)\circ\nabla_uf_1^u(Q_1,Q_2) =_{\Omega_0} Id + O(\hbar^\infty). \tag{71} $$

To exploit this parametrix construction, we integrate by parts directly in $u$ in the integral $I_2$. By the Fubini theorem, we have

$$ I_2 = \int_{-\infty}^{\infty}\!\!\int_{1-\epsilon}^{1+\epsilon}\mathrm{Tr}\Big(e^{it[f_1^u(Q_1,Q_2)-1]/\hbar}\,\chi_0(Q_1,Q_2)\Big)(1-\zeta(t))\,\check\phi(\mu t)\,du\,dt. \tag{72} $$

Since

$$ [\nabla_uf_1^u(Q_1,Q_2),\,f_1^u(Q_1,Q_2)] = 0, $$

the parametrix construction in (71) together with an integration by parts in $u\in I$ implies that the integral in (72) equals

$$ i\hbar^2\int_{-\infty}^{\infty} t^{-1}\,g(t)\,(1-\zeta(t))\,\check\phi(\mu t)\,dt, \tag{73} $$

where

$$ g(t) := -\int_{1-\epsilon}^{1+\epsilon}\mathrm{Tr}\Big(e^{it[f_1^u(Q_1,Q_2)-1]/\hbar}\,\nabla_uOp_\hbar(w^u)\Big)\,du + \mathrm{Tr}\Big(e^{it[f_1^{1+\epsilon}(Q_1,Q_2)-1]/\hbar}\circ Op_\hbar(w^{1+\epsilon})\Big) \tag{74} $$

$$ -\ \mathrm{Tr}\Big(e^{it[f_1^{1-\epsilon}(Q_1,Q_2)-1]/\hbar}\circ Op_\hbar(w^{1-\epsilon})\Big). \tag{75} $$
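The mechanism behind the integration by parts in $u$ can be sketched as follows. This is an illustration only: it assumes that $\chi_0$ does not depend on $u$, writes $f_1^u$ for $f_1^u(Q_1,Q_2)$ and $W^u$ for $Op_\hbar(w^u)$, and ignores the overall normalization, which may differ from the one in (73).

```latex
% Sketch of the integration by parts in u leading to (73)-(75).
% Assumptions: chi_0 independent of u; all operators are functions of (Q_1, Q_2) and commute;
% overall constants may differ from the normalization in (73).
\begin{align*}
\partial_u\, e^{it[f_1^u-1]/\hbar}
  &= \frac{it}{\hbar}\,(\nabla_u f_1^u)\, e^{it[f_1^u-1]/\hbar}
  && \text{since } [\nabla_u f_1^u,\, f_1^u]=0, \\
e^{it[f_1^u-1]/\hbar}
  &= \frac{\hbar}{it}\, W^u\, \partial_u\, e^{it[f_1^u-1]/\hbar} + O(\hbar^\infty)
  && \text{by (71), microlocally on } \Omega_0, \\
\int_{1-\epsilon}^{1+\epsilon} \mathrm{Tr}\big(e^{it[f_1^u-1]/\hbar}\chi_0\big)\,du
  &= \frac{\hbar}{it}\bigg\{ \Big[\mathrm{Tr}\big(e^{it[f_1^u-1]/\hbar}\, W^u \chi_0\big)\Big]_{u=1-\epsilon}^{u=1+\epsilon}
   - \int_{1-\epsilon}^{1+\epsilon} \mathrm{Tr}\big(e^{it[f_1^u-1]/\hbar}\,(\nabla_u W^u)\,\chi_0\big)\,du \bigg\}
   + O(\hbar^\infty).
\end{align*}
```

The two boundary terms and the integral of $\nabla_uW^u$ correspond to the three traces in the definition of $g(t)$. The gain is the factor $\hbar/t$, which is what makes long times accessible after averaging.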

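Next, recall that by the semiclassical Weyl law [GU],

$$ \#\{\lambda_j(\hbar);\ |\lambda_j(\hbar)-u|\le C\hbar\} \sim \hbar^{-1}. $$

This in turn implies that uniformly for all $|t|\ge\frac{1}{C}>0$, $|g(t)| = O(\hbar^{-1})$ and so,

$$ |I_2| \le C\hbar^2\int_1^{C/\mu}|t|^{-1}\,\hbar^{-1}\,dt = O(\hbar\,|\log\mu(\hbar)|). \tag{76} $$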
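The logarithm in (76) comes from an elementary integral, recorded here as a sketch. The constant $A$ below is a placeholder introduced for illustration; it stands for the product of the prefactor in (73) and the uniform bound on $|g(t)|$.

```latex
% Sketch: the source of the |log mu| factor in (76).
% A is a hypothetical placeholder for (prefactor) x sup|g(t)|.
\begin{align*}
\int_{1}^{C/\mu}\frac{dt}{t} &= \log\frac{C}{\mu} = \log C + |\log\mu| \qquad (0<\mu<1), \\
A\int_{1}^{C/\mu}\frac{dt}{t} &= O\big(A\,|\log\mu|\big) \qquad \text{as } \mu\to 0.
\end{align*}
```

The upper limit $C/\mu$ reflects the support of $\check\phi(\mu t)$, so smaller scales $\mu$ cost only a logarithm in this averaged estimate.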

Therefore, we have shown that

$$ \int_{1-\epsilon}^{1+\epsilon}\big(d\rho_\mu(\phi;u) - c_0(u)\,\check\phi(0)\big)\,du = O(\hbar\,|\log\mu(\hbar)|)\,\|\phi\|_{L^1}, \tag{77} $$

where $c_0(u) := \mathrm{vol}(p_1^u=1)$, and Theorem 0.3 (i) follows.


Remark. The argument in Theorem 0.3(i) is quite general and in particular, does not depend on the QBNF construction per se. However, we now turn to DOS measures for non-regular deformations where γ is fixed under the deformation. Here, QBNF plays a central role. Proof of Theorem 0.3(ii). To prove part (ii), we need to study the integral I2 in (72) directly. We put µ() = here noting that the intermediate cases < µ() < 1 can be analyzed in the same way. As a consequence of Proposition 3.3 and the fact that φ ∈ S, we have for (m, n) ∈ Z+ × (Z+ + 21 ),

1+ 1−

dρ+ (φ; u) du =

m,n

∞

1+

−∞ 1−

eis[f

u (m,n;)−1]/

ˇ ×χ1 (m, n; ) (1 − χ2 (s))φ(s) duds + O(∞ ). (78) Moreover, we have that: f u (m, n; ) = m + α u (n) + O(|n|2 ) + O().

(79)

Since p ≤ σ on supp χ1 (p, σ ), we get that the O(|n|2 ) term in (79) is less than 1. Also, from (79), ∂ u f (m, n; ) = α(n), ∂u where, n ∈ Z+ + 21 . So, by integrating by parts in (78) in the u vartiable, we get that 1+ modulo O(∞ ) errors, I2 := 1− dρ+ (φ; u) du equals the difference of two terms of the form: 1 ∞ 1 1± ˇ eis[f (m,n)−1]/ χ0 (m, n; ) (1 − χ2 (s)) φ(s) ds, (80) n −∞ s ± ()

where ± () := {(m, n); |f 1± (m, n; ) − 1| ≤ C1−δ and min (n, |m − 1| ) ≤ }, (81) and δ > 0 is arbitrary. So, from (80), it follows that: |I2 | ≤ C | log |

χ0 (m, n; ) . n

(82)

± ()

Fix u0 = 1 ± and consider (p, σ ) ∈ supp χ1 satisfying f u0 (p, σ ; ) = 1. From the CBNF expansion, we know that ∂σ f u0 (p, σ ; ) = 1 + O(p2 ) + O() ≥ C1 > 0. So, by the implicit function theorem, there locally exists g ∈ C ∞ with σ = g(p; u0 , ). Thus, it follows that for (m, n) ∈ ± (), m = g(n; u0 , ) + O(1−δ ), and so, m=

g(n; ) + O(−δ ).

(83)

(84)


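From this last identity, it follows that for $\hbar$ sufficiently small and fixed $n\in\mathbb{Z}^++\frac{1}{2}$ there are at most $O(\hbar^{-\delta})$ solutions $m\in\mathbb{Z}^+$ to (84). Consequently, from the estimate in (82) we finally get that:

$$ |I_2| \le C(\delta)\,\hbar\,|\log\hbar|\,\hbar^{-\delta}\sum_{n=1}^{\hbar^{-1}}\frac{1}{n} = O(\hbar^{1-3\delta}). $$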
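The last bound can be checked directly. The sketch below assumes $\hbar\le e^{-1}$, writes $N=\lfloor\hbar^{-1}\rfloor$, and uses a constant $C_\delta$ (a label introduced here) with $|\log\hbar|\le C_\delta\,\hbar^{-\delta}$.

```latex
% Sketch: harmonic sum bound behind |I_2| = O(hbar^{1-3 delta}).
% Assumes hbar <= 1/e, N = floor(1/hbar), and |log hbar| <= C_delta * hbar^{-delta}.
\begin{align*}
\sum_{n=1}^{N}\frac{1}{n} &\le 1+\log N \le 1+|\log\hbar| \le 2\,|\log\hbar|, \\
\hbar\,|\log\hbar|\,\hbar^{-\delta}\sum_{n=1}^{N}\frac{1}{n}
  &\le 2\,\hbar^{1-\delta}\,|\log\hbar|^{2}
   \le 2\,C_\delta^{2}\,\hbar^{1-3\delta}.
\end{align*}
```

Each logarithm is traded for a factor $\hbar^{-\delta}$, which accounts for the exponent $1-3\delta$.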

Since $\delta>0$ can be taken arbitrarily small, this completes the proof of Theorem 0.3 (ii).

4.2. Mean-square estimates: Proof of Theorem 0.4. We assume here that $\mu(\hbar)\sim\hbar^\kappa$, where $0<\kappa<1$. To prove Theorem 0.4, we must analyze the pointwise convergence of the measures $d\rho_\mu$. Unfortunately, the integration by parts argument for the mean in Sect. 4.1 is not available since the semiclassical pseudodifferential operator $P = Op_\hbar(p_1^u)\otimes I - I\otimes Op_\hbar(p_1^u)$ is not $\hbar$-elliptic on $T^*(M\times M)$. Therefore, to prove Theorem 0.4 we will need to estimate the mean-square difference [Sa, Z3]:

$$ MS_\mu(\phi) := \int_{1-\epsilon}^{1+\epsilon}\!\!\int_{1-\epsilon}^{1+\epsilon}|d\rho_\mu(\phi;u_1,u_2) - c_0(u)|^2\,du_1\,du_2, \tag{85} $$

by using the reduction to the model trace given by the (QBNF) construction and then making an explicit analysis of the trace of the propagator $U_0(t) = e^{itf(Q_1,Q_2;\hbar)/\hbar}$. First, we split the integral in (85) into two separate terms:

$$ MS_\mu(\phi) = \int_{1-\epsilon}^{1+\epsilon}\!\!\int_{1-\epsilon}^{1+\epsilon}\big|d\rho_{\mu,+}(\phi;u_1,u_2) + d\rho_{\mu,0}(\phi;u_1,u_2) - c_0(u_1,u_2)\,\check\phi(0)\big|^2\,du_1\,du_2. \tag{86} $$

Here, the subscripts $+$ and $0$ in the DOS denote the part of the trace coming from integration over the regions $|t|\ge\frac{1}{C}$ and $|t|\le\frac{1}{C}$ respectively. By applying the standard stationary phase argument to $d\rho_{\mu,0}$, we get

$$ MS_\mu(\phi) = \int_{1-\epsilon}^{1+\epsilon}\!\!\int_{1-\epsilon}^{1+\epsilon}|d\rho_{\mu,+}|^2\,du_1\,du_2 + O\big(\hbar\,\|d\rho_{\mu,+}\|_{L^2(I^2)}\big) + O(\hbar^2). \tag{87} $$

As a consequence, we are reduced to estimating the $L^2$ integral

$$ MS_{+,\mu}(\phi) := \int_{1-\epsilon}^{1+\epsilon}\!\!\int_{1-\epsilon}^{1+\epsilon}|d\rho_{\mu,+}(\phi;u_1,u_2)|^2\,du_1\,du_2. \tag{88} $$

Just as in the case of Theorem 0.3 (ii), we have to analyze (88) directly using Proposition 3.2. Modulo O(∞ ) errors, we must estimate: MS+,µ (φ) 1+ 2 = 1−

1+ 1−

u

f (m, n; ) − 1 ˇ ( φ · (1 − χ2 ) )ˆ m,n

2 · χ0 (m, n) du1 du2 ,


where, φ (s) := φ(κ s). Since φ ∈ S(R), modulo O(∞ ) errors, it suffices to restrict the summation over the quadruples (m1 , n1 , m2 , n2 ) with the property that for some u0 ∈ [1 − 2, 1 + 2] × [1 − 2, 1 + 2], f u0 (mj , nj ; ) = 1 + O(1−δ ), min (nj , |mj − 1|) ≤ ; j = 1, 2. Denote the set of such quadruples by S. We are thus reduced to estimating: 1+ 1+ 2 J+ := ei(s1 ,s2 ;m,n,)/ a(s; m, n, ) (m1 ,m2 ,n1 ,n2 )∈S

R2 1−

1−

∞

× du1 du2 ds1 ds2 + O( ),

(89)

where, (s1 , s2 ; m, n, ) := s1 [f u (m1 , n1 ) − 1] − s2 [f u (m2 , n2 ) − 1], and a(s; m, n, ) :=

2

ˇ κ sj ). χ1 (mj , nj ; ) (1 − χ2 (sj ))φ(

(90)

(91)

j =1

Let χ3 ∈ C0∞ ([1 − 2, 1 + 2]), where χ3 ≥ 0 and χ3 = 1 on [1 − , 1 + ]. Then, by choosing a large enough constant C > 0 it follows that: ei(s1 ,s2 ;m,n,)/ a(s; m, n, ) χ3 (u1 ) χ3 (u2 ) J+ ≤ C2 (m1 ,m2 ,n1 ,n2 )∈S

R4

× du1 du2 ds1 ds2 + O(∞ ).

(92)

Since s1 f u (m1 , n1 ) − s2 f u (m2 , n2 ) = s1 (u1 m1 + u2 n1 + ...) − s2 (u1 m2 + u2 n2 + ...), by carrying out the iterated (u1 , u2 ) integrals first, we get 2 |a(s; m, n; )| |χˇ3 (m1 s1 − m2 s2 ) |J+ | ≤ C (m1 ,m2 ,n1 ,n2 )∈S

R2

· χˇ3 (n1 s1 − n2 s2 )| × ds1 ds2 .

(93)

To get a contribution from (m1 , n1 , m2 , n2 ) ∈ S that is not O(∞ ), given any δ > 0, we need that for some (s1 , s2 ) ∈ R2 , max{ |m1 s1 − m2 s2 |, |n1 s1 − n2 s2 | } = O(−δ ), where min{ |s1 |, |s2 | } ≥ 1 on supp

2

(1 − χ2 (sj )).

(94)

(95)

j =1

It follows that m1 · n2 − m2 · n1 = O(1−δ ).

(96)


Next, consider the equation f u (σ, p; ) = 1, where |σ − 1| ≤ and p ≤ . From the QBNF expansion, we know that: 1 ∂ u f (p, σ ; ) ≥ for sufficiently small and p ∈ 0 . ∂p C So, by the implicit function theorem, there locally exists g ∈ C ∞ with p = g(σ ; u, ). Thus, for eigenvalues f u (m, n; ) satisfying f u (m, n; ) = 1 + O(1−δ ), it follows that: n = g(m; u, ) + O(1−δ ), (97) uniformly for u ∈ I 2 . Since (m1 , n1 , m2 , n2 ) ∈ S, after resubstituting the identity (97) into (96), we get that for some u0 ∈ I 2 , m1 · g(m2 ; u0 , ) − m2 · g(m1 ; u0 , ) = O(1−δ ).

(98)

Now since m1 ∼ 1 and m2 ∼ 1, to estimate (98) further, it suffices to make a first order Taylor expansion around x = y for the function ω(x, y) := x g(y) − y g(x). The first-order Taylor coefficient is just ∂ω (x, x) = x g (x) − g(x). ∂y Thus, we must estimate σ ∇σ g(σ ; ) − g(σ ). First, by implicitly differentiating the equation f u (p, σ ; ) = 1 in σ , we get that for some C1 > 0, sufficiently small and (p, σ ) ∈ 0 , 1 | ∇σ g(σ ; u, ) | ≥ > 0, C1 uniformly for u ∈ I 2 . Thus, since g(1; u, ) = O(1−δ ) it follows that for (p, σ ) ∈ 0 and possibly larger C2 > 0, | σ ∇σ g(σ ; u, ) − g(σ ; u, ) | ≥

1 > 0, C2

uniformly for u ∈ I 2 . So, by a first-order Taylor expansion argument, |m1 · g(m2 ; u, ) − m2 · g(m1 ; u, )| ≥

1 |(m2 − m1 ) |, C2

uniformly for u ∈ I 2 . Consequently, from the estimate in (98) it follows that: |(m1 − m2 ) | = O(1−δ ),

(99)

and so, for sufficiently small we get that m1 = m2 + O(−δ ) and then from (96) it follows that n1 = n2 + O(−δ ). We are thus reduced to estimating for fixed (m2 , n2 ) the expression:

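$$ \hbar^{2-2\delta}\sum_{m_1=1}^{\hbar^{-1}}\ \sum_{n_1>0}^{\hbar^{-1}}\int_{\mathbb{R}^2}|a(s;m,n,\hbar)|\,\big|\check\chi_3(m_1s_1-m_2s_2)\cdot\check\chi_3(n_1s_1-n_2s_2)\big|\,ds_1\,ds_2. \tag{100} $$

After making the change of variables $S = m_1s_1-m_2s_2$ and $T = s_1+s_2$ in Eq. (100), we get:

$$ |J_+| \le C\,\hbar^{1-2\delta}\sum_{m_1=1}^{\hbar^{-1}}\frac{1}{m_1}\int_0^{\hbar^{-\kappa}}dT = O(\hbar^{1-\kappa-2\delta}\,|\log\hbar|). \tag{101} $$

Since $\delta>0$ is arbitrary, this completes the proof of Theorem 0.4 (i). Now, Theorem 0.4 (ii) follows by a standard summation argument: by taking $\hbar\in\{b_m\}_{m=1}^\infty$ with

$$ b_m \sim m^{-\frac{1}{1-\kappa}-\delta} $$

for any $\delta>0$, we have that

$$ \sum_{m=1}^{\infty}\int_{1-\epsilon}^{1+\epsilon}\!\!\int_{1-\epsilon}^{1+\epsilon}\big|d\rho_{\mu(b_m)}(\phi) - c_0(u)\,\check\phi(0)\big|^2\,du_1\,du_2 < \infty. $$

By the monotone convergence theorem, we can interchange summation and integration and get that for almost all $u := (u_1,u_2)\in I$,

$$ w\text{-}\lim_{\hbar\to 0} d\rho_\mu(x;u_1,u_2) = c_0(u_1,u_2)\,dx, $$

provided $\hbar\in\{b_m\}_{m=1}^\infty$.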
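The summation argument can be unpacked as follows. This is a sketch under stated assumptions: $b_m = m^{-a}$ exactly with $a = \frac{1}{1-\kappa}+\delta$, and $\delta'$ (a label introduced here) denotes the arbitrarily small loss in the mean-square bound of Theorem 0.4 (i).

```latex
% Sketch of the summation argument for Theorem 0.4 (ii).
% Assumptions: b_m = m^{-a} with a = 1/(1-kappa) + delta; delta' is the small loss in Theorem 0.4 (i).
\begin{align*}
MS_{\mu(b_m)}(\phi) &\le C\, b_m^{\,1-\kappa-2\delta'}\,|\log b_m|
   = C\,a\, m^{-a(1-\kappa-2\delta')}\,\log m, \\
a(1-\kappa-2\delta') &= 1 + \delta(1-\kappa) - 2a\delta' \;>\; 1
   \quad\text{whenever}\quad \delta' < \frac{\delta(1-\kappa)}{2a}, \\
\sum_{m\ge 1} MS_{\mu(b_m)}(\phi) < \infty
   &\;\Longrightarrow\; \sum_{m\ge 1} \big|d\rho_{\mu(b_m)}(\phi;u) - c_0(u)\,\check\phi(0)\big|^2 < \infty
   \quad \text{for a.e. } u \in I^2 .
\end{align*}
```

The implication in the last line is the interchange of sum and integral for non-negative terms. Since the terms of a convergent series tend to zero, the convergence along $\{b_m\}$ follows for almost every $u$.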

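Proof of Theorem 0.5. Putting $\mu(\hbar)\sim\hbar^\kappa$ with $0\le\kappa<1$ we have that:

$$ \begin{aligned}
MS_\mu(\phi) &= \int_{I^2}|d\rho_{\mu,+}(\phi;u)|^2\,du_1\,du_2 \\
&\quad + \int_{I^2} d\rho_{\mu,+}(\phi;u)\,\big(d\rho_{\mu,0}(\phi;u) - c_0(u)\,\check\phi(0)\big)\,du_1\,du_2 \\
&\quad + \int_{I^2}\big|d\rho_{\mu,0}(\phi;u) - c_0(u)\,\check\phi(0)\big|^2\,du_1\,du_2.
\end{aligned} \tag{102} $$

Now, let $\phi_\epsilon\in\mathcal{S}$ be a family of mollifiers with $\phi_\epsilon\to\chi_{[a,b]}$ as $\epsilon\to 0^+$ in $L^p$; $p\ge 1$, where $\chi_{[a,b]}$ is the indicator function of the interval $[a,b]$. By a standard argument [DG, GU], $d\rho_{\mu,0}(\chi_{[a,b]};u) - c_0(u)(b-a) = o(1)$, uniformly for $u\in I^2$. From the proof of Theorem 0.4 it follows that for any $\delta>0$,

$$ \int_{I^2}|d\rho_{\mu,+}(\phi;u)|^2\,du_1\,du_2 \le C_\delta\,\hbar^{1-\kappa-\delta}\,\|\phi\|_{L^2}^2. \tag{103} $$

Then, substitute $\phi_\epsilon$ for $\phi$ in (103) and take the $\epsilon\to 0^+$ limit, noting that the constant $C_\delta>0$ is independent of $\epsilon>0$. The identity in (102) then implies that:

$$ MS_\mu(\chi_{[a,b]}) = o(1), \tag{104} $$

and Theorem 0.5 is proved since

$$ MS_\mu(\chi_{[a,b]})^{1/2} := \Big\|\hbar^{1-\kappa}\big[N_u(1+a\hbar^{1+\kappa},\,1+b\hbar^{1+\kappa}) - c_0(u)(b-a)\,\hbar^{-1+\kappa}\big]\Big\|_{L^2(I^2)}. $$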


References

[A] Arnol'd, V.I.: Mathematical methods of classical mechanics. Second edition, Berlin-Heidelberg-New York: Springer-Verlag, 1987
[AM] Abraham, R., Marsden, J.E.: Foundations of mechanics. Second edition, New York: Benjamin/Cummings, 1978
[AS] Abramowitz, M., Stegun, I.: Handbook of mathematical functions. London: Dover, 1970
[Be1] Berry, M.V.: Regular and irregular semiclassical wavefunctions. J. Phys. A 10(12), 2083–2091 (1977)
[Be2] Berry, M.V.: Semi-classical mechanics in phase space: A study of Wigner's function. Philos. Trans. Roy. Soc. London Ser. A 287, 237–271 (1977)
[Bl] Bleher, P.M.: The energy level spacing for two harmonic oscillators with generic ratio of frequencies. J. Stat. Phys. 63, 261–283 (1991)
[BT] Berry, M.V., Tabor, M.: Closed orbits and the regular bound spectrum. Proc. Roy. Soc. London Ser. A 349(1656), 101–123 (1976)
[B.K.S] Bleher, P., Kosygin, D., Sinai, Ya.G.: Distribution of energy levels of a quantum free particle on a Liouville surface and trace formulae. Commun. Math. Phys. 179, 375–403 (1995)
[Ch] Charbonnel, A.M.: Comportement semi-classique du spectre conjoint d'operateurs pseudodifferentiel qui commutent. Asympt. Anal. 1, 227–261 (1988)
[CP] Colin de Verdiere, Y., Parisse, B.: Equilibre instable en regime semi-classique I: Concentration microlocale. Commun. in P.D.E. 19, 1535–1563 (1994)
[CR] Combescure, M., Robert, D.: Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow. Asymptotic Anal. 14, 377–404 (1997)
[CRR] Combescure, M., Ralston, J., Robert, D.: A proof of the Gutzwiller trace formula using coherent states decomposition. Commun. Math. Phys. 202, 463–480 (1999)
[CV1] Colin de Verdiere, Y.: Spectre conjoint d'operateurs pseudodifferentiels qui commutent II: Le cas integrable. Math. Zeit. 171, 51–75 (1980)
[CV2] Colin de Verdiere, Y.: Quasi-modes sur les varietes Riemanniennes compactes. Invent. Math. 43, 15–52 (1977)
[Do] Dozias, S.: Mémoire de Magistère de l'ENS, 1993
[DG] Duistermaat, J.J., Guillemin, V.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[Dy] Dyson, F.: Statistical properties of the energy levels of complex systems I–III. J. Math. Phys. 3, 140–175 (1962)
[El] Eliasson, L.H.: Normal forms for Hamiltonian systems with Poisson commuting integrals - elliptic case. Comment. Math. Helv. 65, 4–35 (1990)
[EMM] Eskin, A., Margulis, G., Mozes, S.: Upper bounds and asymptotics in a quantitative version of the Oppenheim conjecture. Ann. of Math. 147, 93–141 (1998)
[F] Folland, G.: Harmonic Analysis in Phase Space. Annals of Math. Studies 122, Princeton, NJ: Princeton Univ. Press, 1989
[G] Guillemin, V.: Wave-trace invariants. Duke Math. J. 83(2), 287–352 (1996)
[GS] Guillemin, V., Sternberg, S.: Geometric asymptotics (second edition). Math. Surveys and Monographs 14, Providence, RI: A.M.S., 1990
[GU] Guillemin, V., Uribe, A.: Circular symmetry and the trace formula. Invent. Math. 96, 385–423 (1989)
[Ha] Hagedorn, G.A.: Semiclassical dynamics with exponentially small error estimates. Commun. Math. Phys. 297(2), 439–465 (1999)
[H] Hörmander, L.: Analysis of linear differential operators I. Berlin: Springer-Verlag, 1983
[HZ] Hofer, H., Zehnder, E.: Symplectic invariants and Hamiltonian dynamics. Birkhäuser Adv. Texts, Basel: Birkhäuser, 1994
[JZ] Jakobson, D., Zelditch, S.: Classical limits of eigenfunctions for some completely integrable systems. Emerging applications of number theory. IMA 109, 329–354 (1999)
[K] Khinchin, Y.: Continued Fractions. New York: Dover, 1997
[KMS] Kosygin, D., Minasov, A., Sinaĭ, Ya.G.: Statistical properties of the spectra of Laplace-Beltrami operators on Liouville surfaces. (Russian) Uspekhi Mat. Nauk 48(4)(292), 3–130 (1993); translation in Russ. Math. Surveys 48(4), 1–142 (1993)
[P] Popov, G.: On the contribution of degenerate periodic trajectories to the wave-trace. Commun. Math. Phys. 196, 363–383 (1998)
[PS] Paternain, G., Spatzier, R.: New examples of manifolds with completely integrable geodesic flows. Adv. in Math. 108(2), 346–366 (1994)
[PT1] Petridis, Y., Toth, J.A.: A probabilistic Weyl law for two-dimensional flat tori. Geom. Funct. Anal. 12, 756–775 (2002)
[PT2] Petridis, Y., Toth, J.A.: The remainder in Weyl's law for Heisenberg manifolds. J. Diff. Geom. 60, 455–483 (2002)
[PU] Paul, T., Uribe, A.: Pointwise limits of semiclassical measures. Commun. Math. Phys. 175, 229–258 (1996)
[RS] Rudnick, Z., Sarnak, P.: The pair correlation function of fractional parts of polynomials. Commun. Math. Phys. 1, 61–70 (1998)
[Sa] Sarnak, P.: Values at integers of binary quadratic forms. CMS Conf. Proc. 21, 181–203 (1997)
[Si] Sinai, Y.: Advances in Soviet math. AMS Publ. 3, 199–215 (1991)
[Sj] Sjostrand, J.: Semi-excited states in nondegenerate potential wells. Asymp. Anal. 6, 29–43 (1992)
[Ta1] Taylor, M.: Pseudodifferential Operators. Princeton, NJ: Princeton Univ. Press, 1981
[Ta2] Taylor, M.: Noncommutative Microlocal Analysis (Part I). Memoirs A.M.S. no. 313, Providence, RI: AMS, 1984
[T1] Toth, J.A.: Eigenfunction localization in the quantized rigid body. J. Diff. Geom. 43(4), 844–858 (1996)
[T2] Toth, J.A.: On the quantum expected values of integrable metric forms. J. Diff. Geom. 52(2), 327–374 (1999)
[TZ] Toth, J.A., Zelditch, S.: Riemannian manifolds with uniformly bounded eigenfunctions. Duke Math. J. 111(2), 97–132 (2002)
[UZ] Uribe, A., Zelditch, S.: Spectral statistics on Zoll surfaces. Commun. Math. Phys. 154, 313–346 (1993)
[V] Vanderkam, J.: Pair correlation of four-dimensional flat tori. Duke Math. J. 97(2), 313–328 (1999)
[Vo] Volovoy, A.V.: Improved two-term asymptotics for the eigenvalue distribution function of an elliptic operator on a compact manifold. Commun. P.D.E. 15(11), 1509–1563 (1990)
[VN] Vu Ngoc, S.: Formes normales semi-classiques des systemes completement integrables au voisinage d'un point critique de l'application moment. Asymptot. Anal. 24(3–4), 319–342 (2000)
[Z1] Zelditch, S.: Wave invariants at elliptic closed geodesics. Geom. Funct. Anal. 7, 145–213 (1997)
[Z3] Zelditch, S.: Level spacings for integrable quantum maps in genus zero. Commun. Math. Phys. 196, 289–318 (1998)
[ZZ] Zelditch, S., Zworski, M.: Spacings between phase shifts in a simple scattering problem. Commun.
Math. Phys. 204, 709–729 (1999)

Communicated by P. Sarnak

Commun. Math. Phys. 238, 257–285 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0851-3


Painlevé Transcendent Evaluations of Finite System Density Matrices for 1d Impenetrable Bosons

P.J. Forrester¹, N.E. Frankel², T.M. Garoni², N.S. Witte¹,²

¹ Department of Mathematics and Statistics, University of Melbourne, Victoria 3010, Australia.
² School of Physics, University of Melbourne, Victoria 3010, Australia.

Received: 24 July 2002 / Accepted: 26 January 2003 Published online: 13 May 2003 – © Springer-Verlag 2003

Abstract: The recent experimental realisation of a one-dimensional Bose gas of ultra-cold alkali atoms has renewed attention on the theoretical properties of the impenetrable Bose gas. Of primary concern is the ground state occupation of effective single particle states in the finite system, and thus the tendency for Bose-Einstein condensation. This requires the computation of the density matrix. For the impenetrable Bose gas on a circle we evaluate the density matrix in terms of a particular Painlevé VI transcendent in σ-form, and furthermore show that the density matrix satisfies a recurrence relation in the number of particles. For the impenetrable Bose gas in a harmonic trap, and with Dirichlet or Neumann boundary conditions, we give a determinant form for the density matrix, a form as an average over the eigenvalues of an ensemble of random matrices, and in special cases an evaluation in terms of a transcendent related to Painlevé V and VI. We discuss how our results can be used to compute the ground state occupations.

1. Introduction

Recent advances in the experimental physics of Bose-Einstein condensates [14, 15, 6] have led to the experimental realisation of a one-dimensional Bose gas of ultra-cold alkali atoms. One expects [39] that the microscopic forces are such that there is an effective one-body confining harmonic potential acting on each atom individually, and an effective infinitely short range contact potential acting between neighbouring atoms. Moreover, in a certain physical regime depending on the ratio of the transverse confinement width to the s-wave scattering length, it is argued in [39] that the contact potential can be well approximated by the delta function form $U(|x-y|) = g\,\delta(|x-y|)$, and furthermore $g \to \infty$ in the low energy scattering limit. The limit $g \to \infty$ of the delta function interaction Bose gas is the impenetrable Bose gas, introduced in [12, 32].
Not surprisingly, there has thus been renewed interest in the theoretical properties of the ground state of the finite system impenetrable Bose gas [39, 13]. With the 3d Bose gas exhibiting Bose-Einstein condensation, a central question is the tendency of the finite system confined to 1d to form a Bose-Einstein condensate. Attacking this question is a two step process. First, with the particles confined to the region $\Omega \subseteq \mathbb{R}$ and the ground state wave function $\psi_0$ real, it is necessary to compute the one-particle density matrix

\[ \rho_N(x;y) = N \int_\Omega dx_2 \cdots \int_\Omega dx_N \, \psi_0(x,x_2,\dots,x_N)\,\psi_0(y,x_2,\dots,x_N). \tag{1.1} \]

Second, one must solve the eigenvalue problem

\[ \int_\Omega \rho_N(x;y)\,\phi_k(y)\,dy = \lambda_k \phi_k(x), \qquad k \in \mathbb{Z}_{\ge 0}. \tag{1.2} \]

Because this integral operator is idempotent, the $\lambda_j$ are non-negative, while the trace condition $\int_\Omega \rho_N(x;x)\,dx = N$ implies $\sum_k \lambda_k = N$. Consequently the $\lambda_k$ have the interpretation as occupation numbers of effective single particle states $\phi_k(x)$. The simplest case is when $\Omega = [0,L]$ with periodic boundary conditions. The periodicity implies that $\rho_N(x;y) = \rho_N(x-y;0)$. Thus we have $\phi_k(x) = \frac{1}{\sqrt{L}}\, e^{2\pi i k x/L}$ and so

\[ \lambda_k = \int_0^L \rho_N(x;0)\, e^{2\pi i k x/L}\,dx. \tag{1.3} \]

However for other geometries and confinements there is no analogue of (1.3) and one must solve (1.2) numerically. A number of results are available on $\rho_N(x;0)$ for periodic boundary conditions. In particular Lenard [29] has given $\rho_{N+1}(x;0)$ as an $N \times N$ Toeplitz determinant (see (2.5)–(2.15) below), and subsequently obtained the $N \to \infty$ asymptotic expansion [31]

\[ \rho_N(x;0) \sim \rho_0 A \left( \frac{\pi}{N \sin(\pi \rho_0 x/N)} \right)^{1/2}, \qquad A = \frac{G^4(3/2)}{\sqrt{2\pi}}, \tag{1.4} \]

where $\rho_0$ denotes the bulk density and $G(x)$ denotes the Barnes G-function, valid for $x/N$ fixed. Although the analysis of [31] leading to (1.4) was not rigorous, the setting of the problem as belonging to the asymptotics of Toeplitz determinants with symbols having zeros on $[0, 2\pi)$ was identified, and this work inspired a subsequent rigorous proof [46]. (We remark that the asymptotic form of Toeplitz determinants of this type was first conjectured by Fisher and Hartwig [7, 16].) The result (1.4) substituted into (1.3) gives $\lambda_0 \sim c\sqrt{N}$ for a specific $c$ computable from (1.4). Thus for large $N$ the fraction of particles in the zero momentum state is proportional to $1/\sqrt{N}$. The result (1.4) can also be used to compute the large $N$ behaviour of $\lambda_k$ for any fixed $k \ge 0$ [9].

For the impenetrable Bose gas confined by a harmonic one-body potential, or indeed in other geometries such as Dirichlet or Neumann boundary conditions, no results of this type are known. All one has is the recent numerical study of Girardeau et al. [13] in the case of the harmonic well, who by a Monte Carlo study of system sizes up to $N = 10$ obtained the estimate $\lambda_0 \propto N^{0.59}$ for large $N$. If correct, this result implies the maximum effective single particle state occupation is dependent on the geometry/confining potential.

To further study this issue, we take up the first step in the procedure above to compute the $\lambda_j$, and thus provide formulas suitable for the numerical computation of $\rho_N(x;y)$. Four cases are considered: when the domain is a circle (or equivalently periodic boundary conditions); a line with the particles confined by a harmonic one-body potential; and an interval with Dirichlet or Neumann boundary conditions. The Toeplitz determinant formulation in the case of periodic boundary conditions is extended to Hankel determinant forms for $\rho_N(x;y)$ in the other cases (Sect. 2.2), and a formulation for efficient Monte Carlo evaluations by way of expressing the $\rho_N(x;y)$ as averages over the eigenvalue probability density function (p.d.f.) of certain matrix ensembles is given (Sect. 2.3). We then give a systematic Fredholm type expansion of $\rho_N(x;y)$ about the density $\rho_N(x;x)$ (Sect. 2.4).

Beginning in Sect. 3 we address the issue of closed form evaluations of $\rho_N(x;y)$. In the infinite system there are some celebrated instances of such evaluations. In particular Jimbo et al. [22] related the problem of evaluating $\rho_\infty(x;0)$ to integrable systems theory, and consequently were able to derive the formula

\[ \rho_\infty(x;0) = \rho_0 \exp\left( \int_0^{\pi\rho_0 x} \sigma_V(t)\,\frac{dt}{t} \right), \tag{1.5} \]

where $\sigma_V$ satisfies the non-linear equation

\[ (x\sigma_V'')^2 + 4(x\sigma_V' - \sigma_V - 1)\big(x\sigma_V' - \sigma_V + (\sigma_V')^2\big) = 0 \tag{1.6} \]

subject to the $x \to 0$ boundary condition

\[ \sigma_V(x) \mathop{\sim}_{x \to 0} -\frac{x^2}{3} + \frac{x^3}{3\pi} + O(x^4). \tag{1.7} \]
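As a consistency check on the reading of (1.6) and (1.7), one can substitute the truncated series (1.7) into the left-hand side of (1.6) and confirm that the terms of order $x^2$ and $x^3$ cancel, leaving a residual of order $x^4$ (which the unspecified $O(x^4)$ term in (1.7) would then have to absorb). The following is a minimal Python sketch of that check; the function names are illustrative and do not come from the paper.

```python
import math


def sigma_series(x):
    """Truncated small-x series (1.7) and its first two derivatives."""
    s = -x**2 / 3 + x**3 / (3 * math.pi)
    ds = -2 * x / 3 + x**2 / math.pi
    dds = -2 / 3 + 2 * x / math.pi
    return s, ds, dds


def residual_1_6(x):
    """Left-hand side of (1.6) evaluated on the truncated series."""
    s, ds, dds = sigma_series(x)
    return (x * dds) ** 2 + 4 * (x * ds - s - 1) * (x * ds - s + ds**2)


if __name__ == "__main__":
    for x in (0.1, 0.05, 0.025, 0.0125):
        # If the x^2 and x^3 terms cancel, residual / x^4 stays bounded.
        print(x, residual_1_6(x) / x**4)
```

If the ratio printed in the last column stays bounded as $x$ decreases, the two leading coefficients in (1.7) are compatible with (1.6).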

The differential equation (1.6) is an example of the so-called Jimbo-Miwa-Okamoto σ-form of the Painlevé V equation, the latter being essentially the differential equation obeyed by the Hamiltonian in the Hamiltonian formulation of PV [38],

\[ (t h_V'')^2 - \big(h_V - t h_V' + 2(h_V')^2\big)^2 + 4 \prod_{k=1}^{4} (h_V' + v_k) = 0 \tag{1.8} \]

with $v_1 + v_2 + v_3 + v_4 = 0$. Setting

\[ \sigma_V(x) + 1/2 = h_V(t), \qquad x = -\frac{it}{2} \tag{1.9} \]

shows that (1.6) reduces to (1.8) with $(v_1, v_2, v_3, v_4) = (1/2, -1/2, 1/2, -1/2)$. Subsequently the characterisation of $\rho_\infty(x;0)$ in terms of the solution of a differential equation was extended by Its et al. [20] (see also [28]) to the characterisation of $\rho_\infty^T(x;0)$, the density matrix of the impenetrable Bose gas at non-zero temperature $T$, as the solution of coupled partial differential equations.

In the same study in which (1.6) was obtained, Jimbo et al. evaluated the scaled probability of an eigenvalue free interval for large GUE random matrices (random Hermitian matrices) in terms of another particular case of the σ-form of PV. In recent years there has been considerable progress in the evaluation of probabilities and averages in matrix ensembles in terms of Painlevé transcendents (see e.g. [11]). Because of the close relationship between the density matrix for impenetrable bosons and gap probabilities in matrix ensembles, the random matrix results can be used to extend the density matrix Painlevé transcendent evaluation (1.5) to the exact Painlevé transcendent evaluation of $\rho_N(\iota(x);x)$ in the four cases, where $\iota(x)$ denotes the image of $x$ reflected about the centre of the system.
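The reduction of (1.6) to (1.8) under the substitution (1.9) is a purely algebraic statement, so it can be tested pointwise by treating $x$, $\sigma_V$, $\sigma_V'$ and $\sigma_V''$ as independent numbers. Under (1.9) one has $t = 2ix$, $h_V' = -\tfrac{i}{2}\sigma_V'$ and $h_V'' = -\tfrac14 \sigma_V''$, and a short calculation suggests that the left-hand side of (1.8) equals $-\tfrac14$ times the left-hand side of (1.6). The sketch below, with illustrative function names that are not part of the paper, checks this at random points.

```python
import random


def lhs_1_6(x, s, ds, dds):
    """Left-hand side of (1.6) for given values of x, sigma, sigma', sigma''."""
    return (x * dds) ** 2 + 4 * (x * ds - s - 1) * (x * ds - s + ds**2)


def lhs_1_8(t, h, dh, ddh, v):
    """Left-hand side of (1.8) for given values of t, h, h', h'' and parameters v."""
    prod = 1
    for vk in v:
        prod *= dh + vk
    return (t * ddh) ** 2 - (h - t * dh + 2 * dh**2) ** 2 + 4 * prod


def lhs_1_8_from_sigma(x, s, ds, dds):
    """Apply the change of variables (1.9): h(t) = sigma(x) + 1/2, x = -i t / 2."""
    t = 2j * x            # inverse of x = -i t / 2
    h = s + 0.5
    dh = -0.5j * ds       # dh/dt = sigma'(x) * dx/dt
    ddh = -0.25 * dds     # d^2h/dt^2 = sigma''(x) * (dx/dt)^2
    return lhs_1_8(t, h, dh, ddh, (0.5, -0.5, 0.5, -0.5))


def max_mismatch(samples=100, seed=0):
    """Largest deviation from lhs_1_8 = -(1/4) * lhs_1_6 over random inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x, s, ds, dds = (rng.uniform(-2, 2) for _ in range(4))
        diff = lhs_1_8_from_sigma(x, s, ds, dds) + 0.25 * lhs_1_6(x, s, ds, dds)
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    print(max_mismatch())
```

A mismatch at the level of rounding error indicates that any solution of (1.6) gives, via (1.9), a solution of (1.8) with the stated parameters.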


We adopt two distinct strategies to obtain the exact evaluations. In Sect. 3 we present the first approach where we work directly with the definition of $\rho_N(x;y)$ on a circle as a multidimensional integral. It turns out that this integral is one of a general class which have recently [11] been identified as τ-functions for certain PVI systems. We show that our PVI transcendent evaluation for the finite system scales to the infinite system result (1.6). As well as being a special case of the class of integrals related to PVI systems in [11], the multidimensional integral formula for $\rho_N(x;y)$ on a circle is also a special case of a class of integrals over the unitary group shown to satisfy integrable recurrence relations in [1]. We will show that these recurrences can alternatively be derived from orthogonal polynomial theory [33].

Underpinning the second of our strategies is the formulation of Lenard [29] which allows $\rho_N(x;y)$ to be expressed in terms of the Fredholm minor of $1 - \xi K_J$, where $K_J$ is the integral operator on $J = [x,y]$ with kernel $K$ of Christoffel-Darboux type. It is this formulation which also underlies the calculation of [22]. The Fredholm minor in turn can be expressed in terms of the product of the corresponding Fredholm determinant, and the resolvent kernel $R(s,t)$ evaluated at the endpoints $x, y$ of $J$. These latter two quantities have been extensively studied in the context of gap probabilities in random matrix ensembles [42, 43, 49, 48], allowing us to essentially read off from the existing literature an expression for $\rho_N(\iota(x);x)$ in terms of Painlevé transcendents in each case. This is done in Sect. 4. The significance of our results, from the viewpoint of the theory of the ground state occupation of single particle states for the impenetrable Bose gas, and from the viewpoint of the Painlevé theory, is discussed in Sect. 5.

2. Formulations of $\rho_N(x;y)$

2.1. The wave functions. We will first revise the construction of the ground state wave function for impenetrable bosons on the circle, on the line with a confining harmonic potential, and on an interval with Dirichlet or Neumann boundary conditions. The wave function and density matrix will be given a superscript “C”, “H”, “D” and “N” respectively to distinguish the four cases. In general the wave function $\psi(x_1,\dots,x_N)$ for impenetrable bosons must vanish at coincident coordinates,

\[ \psi(x_1,\dots,x_i,\dots,x_j,\dots,x_N) = 0 \quad \text{for } x_i = x_j, \ (i \ne j), \tag{2.1} \]

and satisfy the free particle Schrödinger equation otherwise. But for point particles without spin the condition (2.1) is equivalent to the Pauli exclusion principle. This means that for any fixed ordering of the particles,

\[ x_1 < x_2 < \cdots < x_N \tag{2.2} \]

say, there is no distinction between impenetrable bosons and free fermions [12]. Consequently the ground state wave function $\psi_0$ can, for the ordering (2.2), be constructed out of a Slater determinant of distinct single particle states. For other orderings $\psi_0$ is constructed from the functional form for the sector (2.2) by the requirement that it be a symmetric function of the coordinates.

Consider the case that the particles are confined to a circle of circumference length $L$. This means we require

\[ \psi(x_1,\dots,x_i + L,\dots,x_N) = \psi(x_1,\dots,x_i,\dots,x_N) \tag{2.3} \]

for each $i = 1,\dots,N$. Constructing a Slater determinant obeying (2.3) out of distinct single particle states with zero total momentum and minimum total energy gives

\[ \psi_0^C(x_1,\dots,x_N) = (N!)^{-1/2} L^{-N/2} \begin{cases} \det\big[e^{2\pi i k x_j/L}\big]_{\substack{j=1,\dots,N \\ k=-(N-1)/2,\dots,(N-1)/2}}, & N \text{ odd} \\[2mm] \det\big[e^{2\pi i (k+1/2) x_j/L}\big]_{\substack{j=1,\dots,N \\ k=-N/2,\dots,N/2-1}}, & N \text{ even} \end{cases} \; = (N!)^{-1/2} L^{-N/2} \prod_{1 \le j < k \le N} 2i \sin \pi(x_k - x_j)/L, \tag{2.4} \]

where the factor of $(N!)^{-1/2}$ is included so that

\[ \int_0^L dx_1 \cdots \int_0^L dx_N \, |\psi_0^C(x_1,\dots,x_N)|^2 = 1. \]

Excluding the (unitary) factors of $i$, and recalling (2.2), we note that this state is non-negative, a property which distinguishes the ground state in Bose systems. By the requirement that the wave function for a Bose system be symmetrical with respect to interchanges $x_j \leftrightarrow x_{j'}$ ($j \ne j'$) we see immediately from (2.4) that for general ordering of particles

\[ \psi_0^C(x_1,\dots,x_N) = L^{-N/2} (N!)^{-1/2} \prod_{1 \le j < k \le N} 2|\sin \pi(x_k - x_j)/L|. \tag{2.5} \]
In the case of impenetrable bosons on a line with a confining harmonic potential, we take as the Schrödinger operator (in reduced units)

\[ -\sum_{j=1}^{N} \frac{\partial^2}{\partial x_j^2} + \sum_{j=1}^{N} x_j^2. \tag{2.6} \]

The corresponding normalised single particle eigenstates $\{\phi_k(x)\}_{k=0,1,\dots}$ have the explicit form

\[ \phi_k(x) = \frac{2^{-k}}{c_k^H}\, e^{-x^2/2} H_k(x), \qquad (c_k^H)^2 = \pi^{1/2}\, 2^{-k}\, k!, \tag{2.7} \]

where $H_k(x)$ denotes the Hermite polynomial of degree $k$. Forming a Slater determinant from the minimal energy states ($k = 0, 1, \dots, N-1$), making use of the Vandermonde determinant formula

\[ \det[p_{j-1}(x_k)]_{j,k=1,\dots,N} = \det[x_k^{j-1}]_{j,k=1,\dots,N} = \prod_{1 \le j < k \le N} (x_k - x_j) \tag{2.8} \]

for any $\{p_j(x)\}$ with $p_j(x)$ a monic polynomial of degree $j$, and arguing as in going from (2.4) to (2.5) shows

\[ \psi_0^H(x_1,\dots,x_N) = \frac{1}{C_N^H} \prod_{j=1}^{N} e^{-x_j^2/2} \prod_{1 \le j < k \le N} |x_k - x_j|, \qquad (C_N^H)^2 = N! \prod_{l=0}^{N-1} (c_l^H)^2. \tag{2.9} \]
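The normalisation constants in (2.7) can be checked numerically. The sketch below builds $H_k$ from the standard three-term recurrence $H_{k+1}(x) = 2xH_k(x) - 2kH_{k-1}(x)$ and approximates the overlap integrals $\int \phi_j \phi_k \, dx$ with a trapezoidal rule on a truncated interval. The cutoff, step count and function names are assumptions made for illustration only.

```python
import math


def hermite(k, x):
    """Physicists' Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = 0.0, 1.0
    for n in range(k):
        h_prev, h = h, 2 * x * h - 2 * n * h_prev
    return h


def phi(k, x):
    """Single particle eigenstate (2.7) of the harmonic well."""
    c_k = math.sqrt(math.sqrt(math.pi) * 2.0 ** (-k) * math.factorial(k))
    return 2.0 ** (-k) / c_k * math.exp(-x * x / 2) * hermite(k, x)


def overlap(j, k, cutoff=10.0, steps=2000):
    """Trapezoidal approximation of the integral of phi_j * phi_k over [-cutoff, cutoff]."""
    h = 2 * cutoff / steps
    total = 0.5 * (phi(j, -cutoff) * phi(k, -cutoff) + phi(j, cutoff) * phi(k, cutoff))
    for i in range(1, steps):
        x = -cutoff + i * h
        total += phi(j, x) * phi(k, x)
    return total * h


if __name__ == "__main__":
    for j in range(4):
        print([round(overlap(j, k), 10) for k in range(4)])
```

The printed matrix should be close to the identity if the states (2.7) are orthonormal, which is what the Slater determinant construction leading to (2.9) relies on.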


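Finally we consider the case of impenetrable bosons on the interval $[0, L]$ with Dirichlet or Neumann boundary conditions, requiring that the wave function or its derivative vanishes at $x = 0, L$ respectively. The single particle eigenstates $\{\phi_k(x)\}$ in these situations are, in increasing order of energy,

\[ \phi_k^D(x) = \sqrt{\frac{2}{L}} \sin\frac{\pi k x}{L}, \ (k = 1, 2, \dots), \qquad \phi_k^N(x) = \begin{cases} \dfrac{1}{\sqrt{L}}, & k = 0 \\[2mm] \sqrt{\dfrac{2}{L}} \cos\dfrac{\pi k x}{L}, & k = 1, \dots \end{cases} \]

Recalling the C and D type Vandermonde formulas [40]

\[ \det[z_j^k - z_j^{-k}]_{j,k=1,\dots,n} = \prod_{j=1}^{n} (z_j - z_j^{-1}) \prod_{1 \le j < k \le n} (z_k - z_j)\Big(1 - \frac{1}{z_j z_k}\Big), \]

\[ \det[z_j^{k-1} + z_j^{-(k-1)}]_{j,k=1,\dots,n} = 2 \prod_{1 \le j < k \le n} (z_k - z_j)\Big(1 - \frac{1}{z_j z_k}\Big), \]

we see that the corresponding ground state wave functions are

\[ \psi_0^D(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}} \det[\phi_k^D(x_j)]_{j,k=1,\dots,N} = \frac{1}{\sqrt{N!}} \Big(\frac{1}{\sqrt{2L}}\Big)^{N} \prod_{l=1}^{N} 2\sin(\pi x_l/L) \prod_{1 \le j < k \le N} 2|\cos \pi x_k/L - \cos \pi x_j/L|, \tag{2.10} \]

\[ \psi_0^N(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}} \det[\phi_{k-1}^N(x_j)]_{j,k=1,\dots,N} = \frac{1}{\sqrt{N!}} \frac{1}{\sqrt{L}} \Big(\frac{1}{\sqrt{2L}}\Big)^{N-1} \prod_{1 \le j < k \le N} 2|\cos \pi x_k/L - \cos \pi x_j/L|. \tag{2.11} \]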
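The C and D type Vandermonde formulas used above can be spot-checked numerically. With $z_j = e^{i\pi x_j/L}$ one has $z_j - z_j^{-1} = 2i\sin(\pi x_j/L)$ and $(z_k - z_j)(1 - 1/(z_j z_k)) = 2(\cos \pi x_k/L - \cos \pi x_j/L)$, which is how the products in (2.10) and (2.11) arise. The sketch below compares both sides for a few sample points; the small Gaussian-elimination determinant routine and all names are illustrative assumptions.

```python
import cmath


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (complex entries)."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def pair_product(z):
    """Product over j < k of (z_k - z_j)(1 - 1/(z_j z_k))."""
    prod = 1
    for j in range(len(z)):
        for k in range(j + 1, len(z)):
            prod *= (z[k] - z[j]) * (1 - 1 / (z[j] * z[k]))
    return prod


def c_type_sides(z):
    """Both sides of the C type formula, det[z_j^k - z_j^{-k}]."""
    n = len(z)
    lhs = det([[zj**k - zj ** (-k) for k in range(1, n + 1)] for zj in z])
    rhs = pair_product(z)
    for zj in z:
        rhs *= zj - 1 / zj
    return lhs, rhs


def d_type_sides(z):
    """Both sides of the D type formula, det[z_j^{k-1} + z_j^{-(k-1)}]."""
    n = len(z)
    lhs = det([[zj ** (k - 1) + zj ** (-(k - 1)) for k in range(1, n + 1)] for zj in z])
    return lhs, 2 * pair_product(z)


if __name__ == "__main__":
    points = [cmath.exp(1j * theta) for theta in (0.3, 1.1, 2.0, 2.7)]
    print(c_type_sides(points))
    print(d_type_sides(points))
```

Agreement of the two entries in each printed pair, up to rounding, supports the reconstruction of the two identities as stated.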
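2.2. The density matrix as a determinant. The density matrix $\rho_{N+1}$ is defined as an $N$-dimensional integral by (1.1). In the cases of the impenetrable Bose gas wave functions of the previous section, this integral can be reduced to a computationally simpler $N$-dimensional determinant. For the circular case, this form has already been given by Lenard [29]. Thus using the general Heine identity

\[ N! \det\Big[ \int_{-1/2}^{1/2} dx \, w(z)\, z^{k-j} \Big]_{j,k=1,\dots,N} = \int_{-1/2}^{1/2} dx_1 \cdots \int_{-1/2}^{1/2} dx_N \prod_{l=1}^{N} w(z_l) \prod_{1 \le j < k \le N} |z_j - z_k|^2, \qquad z_j := e^{2\pi i x_j/L} \tag{2.12} \]

we see from (2.5) and (1.1) that

\[ \rho_{N+1}^C(x;0) = \frac{1}{L} \det[a_{j-k}^C(x)]_{j,k=1,\dots,N}, \tag{2.13} \]

\[ a_l^C(x) := \int_{-1/2}^{1/2} dt \, |e^{2\pi i x/L} + e^{2\pi i t}|\,|1 + e^{2\pi i t}|\, e^{2\pi i l t}. \tag{2.14} \]

Furthermore, the elements $a_l^C$ have the explicit evaluation [29]

\[ \begin{aligned} a_0^C &= \frac{4}{\pi}\Big[ \sin\frac{\pi x}{L} + \frac{\pi}{2}\Big(1 - \frac{2x}{L}\Big)\cos\frac{\pi x}{L} \Big], \\ a_{\pm 1}^C &= e^{\pm i\pi x/L}\, \frac{1}{\pi}\Big[ \pi\Big(1 - \frac{2x}{L}\Big) + \sin\frac{2\pi x}{L} \Big], \\ a_{\pm m}^C &= e^{\pm i m \pi x/L}\, \frac{4}{\pi}\, \frac{(-1)^{m+2}}{m(m^2-1)} \Big[ \cos\frac{\pi x}{L}\sin\frac{m\pi x}{L} - m \sin\frac{\pi x}{L}\cos\frac{m\pi x}{L} \Big], \quad |m| > 1. \end{aligned} \tag{2.15} \]

In particular, it follows that

\[ \rho_1^C(x;0) = \frac{1}{L}, \tag{2.16} \]

\[ \rho_2^C(x;0) = \frac{4}{\pi L}\Big[ \pi\Big(1/2 - \frac{x}{L}\Big)\cos\frac{\pi x}{L} + \sin\frac{\pi x}{L} \Big], \tag{2.17} \]

\[ \rho_3^C(x;0) = \frac{8}{\pi^2 L}\Big[ 2 - 1/2\,\pi^2\Big(1/2 - \frac{x}{L}\Big)^2 + 3\pi\Big(1/2 - \frac{x}{L}\Big)\sin\frac{\pi x}{L}\cos\frac{\pi x}{L} + \Big(-5/2 + 2\pi^2\Big(1/2 - \frac{x}{L}\Big)^2\Big)\cos^2\frac{\pi x}{L} + 1/2\cos^4\frac{\pi x}{L} \Big]. \tag{2.18} \]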
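The closed forms (2.15) can be compared against a direct quadrature of the defining integral (2.14), and (2.18) against the $2 \times 2$ determinant in (2.13). The following sketch does both for $0 \le x \le L$. It uses a midpoint rule, which converges only algebraically because the integrand has kinks, so agreement is expected to several digits rather than to machine precision. The names, the choice $L = 1$ and the number of quadrature points are illustrative assumptions.

```python
import cmath
import math


def a_numeric(l, x, L=1.0, n=20000):
    """Midpoint-rule approximation of the integral (2.14)."""
    u = cmath.exp(2j * math.pi * x / L)
    total = 0j
    for i in range(n):
        t = -0.5 + (i + 0.5) / n
        e = cmath.exp(2j * math.pi * t)
        total += abs(u + e) * abs(1 + e) * cmath.exp(2j * math.pi * l * t)
    return total / n


def a_closed(l, x, L=1.0):
    """Closed forms (2.15), valid for 0 <= x <= L; l may be negative."""
    s = math.pi * x / L
    m = abs(l)
    phase = cmath.exp(1j * l * s)
    if m == 0:
        return (4 / math.pi) * (math.sin(s) + (math.pi / 2) * (1 - 2 * x / L) * math.cos(s))
    if m == 1:
        return phase * (math.pi * (1 - 2 * x / L) + math.sin(2 * s)) / math.pi
    bracket = math.cos(s) * math.sin(m * s) - m * math.sin(s) * math.cos(m * s)
    return phase * (4 / math.pi) * (-1) ** m / (m * (m * m - 1)) * bracket


def rho3_from_det(x, L=1.0):
    """rho_3 from (2.13) with N = 2, i.e. (a_0^2 - a_1 a_{-1}) / L."""
    value = a_closed(0, x, L) ** 2 - a_closed(1, x, L) * a_closed(-1, x, L)
    return complex(value).real / L


def rho3_closed(x, L=1.0):
    """Closed form (2.18)."""
    s = math.pi * x / L
    w = 0.5 - x / L
    c = math.cos(s)
    inner = (2 - 0.5 * math.pi**2 * w**2 + 3 * math.pi * w * math.sin(s) * c
             + (-2.5 + 2 * math.pi**2 * w**2) * c**2 + 0.5 * c**4)
    return 8 / (math.pi**2 * L) * inner


if __name__ == "__main__":
    x = 0.3
    for l in (0, 1, -1, 2, 3):
        print(l, a_numeric(l, x), a_closed(l, x))
    print(rho3_from_det(x), rho3_closed(x))
```

At $x = 0$ the closed forms give $\rho_2^C(0;0) = 2/L$ and $\rho_3^C(0;0) = 3/L$, the particle density, which is a quick sanity check on the prefactors.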

An essential ingredient underlying the applicability of (2.12) is the factorisation

\[ \psi_0^C(x, x_1, \dots, x_N) = \frac{1}{\sqrt{N+1}} \frac{1}{\sqrt{L}} \prod_{j=1}^{N} 2\Big|\sin\frac{\pi(x - x_j)}{L}\Big| \; \psi_0^C(x_1,\dots,x_N), \]

observed from (2.5), and the subsequent use of the determinant form in (2.4) to replace $\psi_0^C$ on the right-hand side. Now we observe from (2.9), (2.10), (2.11) that $\psi_0^H$, $\psi_0^D$, $\psi_0^N$ in the case of $N+1$ particles can similarly be factorised. Using the general identity

\[ N! \det\Big[ \int_{-\infty}^{\infty} dt \, g(t)\, h_{j-1}(t)\, h_{k-1}(t) \Big]_{j,k=1,\dots,N} = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N \prod_{l=1}^{N} g(x_l) \Big( \det[h_{j-1}(x_k)]_{j,k=1,\dots,N} \Big)^2, \]

(cf. (2.12)) we thus obtain, analogous to (2.13), (2.14), the determinant formulae

\[ \rho_{N+1}^H(x;y) = \frac{1}{(c_N^H)^2}\, e^{-x^2/2 - y^2/2} \det[a_{j,k}^H(x;y)]_{j,k=1,\dots,N}, \tag{2.19} \]

\[ \rho_{N+1}^D(x;y) = \frac{2}{L} \sin\frac{\pi x}{L} \sin\frac{\pi y}{L} \det[a_{j,k}^D(x;y)]_{j,k=1,\dots,N}, \tag{2.20} \]

\[ \rho_{N+1}^N(x;y) = \frac{1}{4L} \det[a_{j,k}^N(x;y)]_{j,k=1,\dots,N}, \quad (N \ge 1), \tag{2.21} \]

where

\[ a_{j,k}^H(x;y) = \frac{2^{-j-k+2}}{c_{j-1}^H c_{k-1}^H} \int_{-\infty}^{\infty} dt \, |x - t|\,|y - t|\, H_{j-1}(t) H_{k-1}(t)\, e^{-t^2}, \tag{2.22} \]

\[ a_{j,k}^D(x;y) = 8 \int_0^1 dt \, \Big|\cos\frac{\pi x}{L} - \cos \pi t\Big| \Big|\cos\frac{\pi y}{L} - \cos \pi t\Big| \sin \pi j t \, \sin \pi k t, \tag{2.23} \]

\[ a_{j,k}^N(x;y) = 8 \int_0^1 dt \, \Big|\cos\frac{\pi x}{L} - \cos \pi t\Big| \Big|\cos\frac{\pi y}{L} - \cos \pi t\Big| \cos \pi(j-1)t \, \cos \pi(k-1)t. \tag{2.24} \]

To simplify further, we note

\[ |x - t|\,|y - t| = \begin{cases} (x - t)(y - t), & t \notin [x,y] \\ -(x - t)(y - t), & t \in [x,y], \end{cases} \tag{2.25} \]

and similarly with $|\cos\frac{\pi x}{L} - \cos \pi t|\,|\cos\frac{\pi y}{L} - \cos \pi t|$. Use of such an identity allows $a_{j,k}^H(x;y)$ to be evaluated in terms of incomplete gamma functions, and $a_{j,k}^D(x;y)$, $a_{j,k}^N(x;y)$ in a form similar to (2.15).

2.3. $\rho_{N+1}(x;y)$ and integrals over the classical groups. In general, for a many body wave function $\psi_0$, $|\psi_0|^2$ has the interpretation as a multivariable p.d.f. As first observed by Sutherland in the cases of $\psi_0^C$ and $\psi_0^H$, a feature of $|\psi_0|^2$ for each of the wave functions (2.5), (2.9), (2.10) and (2.11) is that it coincides precisely with the multivariate p.d.f. for particular classes of random matrices. Thus

\[ |\psi_0^C|^2 = \mathrm{Ev}(U(N))\big|_{\theta = 2\pi x/L}, \quad |\psi_0^H|^2 = \mathrm{Ev}(\mathrm{GUE}_N), \quad |\psi_0^D|^2 = \mathrm{Ev}(Sp(N))\big|_{\theta = \pi x/L}, \quad |\psi_0^N|^2 = \mathrm{Ev}(O^+(2N))\big|_{\theta = \pi x/L}, \]

where $\mathrm{Ev}(X)$ denotes the eigenvalue p.d.f. of the ensemble of matrices $X$, and $U(N)$ denotes the unitary group with uniform (Haar) measure, $\mathrm{GUE}_N$ the Gaussian unitary ensemble of $N \times N$ complex Hermitian matrices, $Sp(N)$ the group of symplectic unitary $2N \times 2N$ matrices with Haar measure, and $O^+(2N)$ denotes the group of real orthogonal $2N \times 2N$ matrices with determinant $+1$ and Haar measure.

Moreover, it follows from the definition (1.1) of the density matrix, and the explicit forms of the wave functions, that $\rho_{N+1}(x;y)$ in each of the cases can be written as an average over $\mathrm{Ev}(X)$ for appropriate $X$. Explicitly

\[ \rho_{N+1}^C(x;0) = \frac{1}{L} \Big\langle \prod_{l=1}^{N} 2\Big|\sin\Big(\frac{\pi x}{L} - \frac{\theta_l}{2}\Big)\Big| \, 2\Big|\sin\frac{\theta_l}{2}\Big| \Big\rangle_{\mathrm{Ev}(U(N))}, \tag{2.26} \]

\[ \rho_{N+1}^H(x;y) = \frac{1}{(c_N^H)^2}\, e^{-x^2/2 - y^2/2} \Big\langle \prod_{l=1}^{N} |x - x_l|\,|y - x_l| \Big\rangle_{\mathrm{Ev}(\mathrm{GUE}_N)}, \tag{2.27} \]

\[ \rho_{N+1}^D(x;y) = \frac{2}{L} \sin\frac{\pi x}{L} \sin\frac{\pi y}{L} \Big\langle \prod_{l=1}^{N} 2\Big|\cos\frac{\pi x}{L} - \cos\theta_l\Big| \, 2\Big|\cos\frac{\pi y}{L} - \cos\theta_l\Big| \Big\rangle_{\mathrm{Ev}(Sp(N))}, \tag{2.28} \]

\[ \rho_{N+1}^N(x;y) = \frac{1}{2L} \Big\langle \prod_{l=1}^{N} 2\Big|\cos\frac{\pi x}{L} - \cos\theta_l\Big| \, 2\Big|\cos\frac{\pi y}{L} - \cos\theta_l\Big| \Big\rangle_{\mathrm{Ev}(O^+(2N))}. \tag{2.29} \]

Because it is straightforward to generate typical members from each of these matrix ensembles (see e.g. [8]), and so compute eigenvalues from the corresponding p.d.f. $\mathrm{Ev}(X)$, these expressions are well suited to evaluation via the Monte Carlo method.

For future reference we note that the density matrices for the corresponding free Fermi systems are given by the same averages, except that the absolute value signs are to be removed. In particular

\[ \rho_{N+1}^{C,FF}(x;0) = \frac{1}{L} \Big\langle \prod_{l=1}^{N} 2\sin\Big(\frac{\theta_l}{2} - \frac{\pi x}{L}\Big) \, 2\sin\frac{\theta_l}{2} \Big\rangle_{\mathrm{Ev}(U(N))}. \tag{2.30} \]

Furthermore it is elementary to compute density matrices for free Fermi systems as sums over single particle states (a consequence of all energy states below the Fermi surface having occupation unity), and this implies the explicit evaluation

\[ \rho_{N+1}^{C,FF}(x;0) = \frac{1}{L} \frac{\sin(\pi(N+1)x/L)}{\sin(\pi x/L)}. \tag{2.31} \]
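The statement that (2.31) is a sum over occupied single particle states can be made concrete. For $N+1$ free fermions on the circle the occupied momenta are $k = -N/2, \dots, N/2$ in units of $2\pi/L$ (integers or half-integers according to the parity of $N+1$, as in (2.4)), and summing $\frac{1}{L}e^{2\pi i k x/L}$ over them should reproduce the ratio of sines. A minimal sketch, with illustrative names and the assumption $L = 1$ by default:

```python
import cmath
import math


def rho_ff_sum(n_particles, x, L=1.0):
    """Sum of (1/L) exp(2 pi i k x / L) over the n_particles lowest momentum states."""
    total = 0j
    for j in range(n_particles):
        k = j - (n_particles - 1) / 2  # symmetric about zero; half-integer if n is even
        total += cmath.exp(2j * math.pi * k * x / L)
    return total.real / L


def rho_ff_closed(n_particles, x, L=1.0):
    """Closed form (2.31), written in terms of the particle number N + 1."""
    return math.sin(math.pi * n_particles * x / L) / (L * math.sin(math.pi * x / L))


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, rho_ff_sum(n, 0.37), rho_ff_closed(n, 0.37))
```

Here `n_particles` plays the role of $N+1$ in (2.31); the imaginary parts cancel in the sum because the momenta are symmetric about zero.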

2.4. Systematic small-$|x - y|$ expansion of $\rho_N(x;y)$. According to the definition (1.1), the density matrix at coincident points $x = y$ is equal to the particle density. But the particle density for the impenetrable Bose gas is the same as for the corresponding free Fermi system and thus simple to compute. In the infinite system, the translational invariance of the state gives that the particle density is a constant. For this case Lenard [30] has shown how to make a systematic expansion of the density matrix $\rho_\infty(x;y)$ about the case of coincident points $\rho_\infty(x;x)$. Here we will present this expansion for finite Bose gas systems with ground state wave functions of the form

\[ \frac{1}{C} \prod_{l=1}^{N} g(x_l) \prod_{1 \le j < k \le N} |x_k - x_j|. \tag{2.32} \]

This form includes the case of the harmonic well (2.9), and after the change of variables $\cos \pi x_j/L \mapsto x_j$ in (2.10) and (2.11) also includes the case of Dirichlet and Neumann boundary conditions. Following Lenard [30], we note that substituting (2.32) in (1.1) and using (2.25) shows

\[ \rho_N(x;y) = \frac{N g(x) g(y)}{C^2} \Big( \int_{-\infty}^{\infty} - \xi \int_x^y \Big) dx_2 \, g^2(x_2) \cdots \Big( \int_{-\infty}^{\infty} - \xi \int_x^y \Big) dx_N \, g^2(x_N) \times \prod_{l=2}^{N} (x - x_l)(y - x_l) \prod_{2 \le j < k \le N} (x_k - x_j)^2 \Big|_{\xi = 2}. \tag{2.33} \]

One now introduces the Fermi type distribution function

\[ \rho_N^{FF}(x; y; x_2, \dots, x_n) = \frac{N g(x) g(y)}{C^2} \prod_{l=2}^{n} g^2(x_l) \int_{-\infty}^{\infty} dx_{n+1} \, g^2(x_{n+1}) \cdots \int_{-\infty}^{\infty} dx_N \, g^2(x_N) \times \prod_{l=2}^{N} (x - x_l)(y - x_l) \prod_{2 \le j < k \le N} (x_k - x_j)^2 \tag{2.34} \]

(when $n = 1$ this corresponds to the free fermion one-body density matrix). Expanding (2.33) in a power series in $\xi$ and using the definition (2.34) shows

\[ \rho_N(x;y) = \sum_{n=0}^{\infty} \frac{(-\xi)^n}{n!} \int_x^y dx_2 \cdots \int_x^y dx_{n+1} \, \rho_N^{FF}(x; y; x_2, \dots, x_{n+1}) \Big|_{\xi = 2} \tag{2.35} \]

(the summation can be extended to infinity since $\rho^{FF}(x; y; x_2, \dots, x_n) = 0$ for $n > N$). Next, let $\{p_j(x)\}_{j=0,1,\dots}$ be monic polynomials of degree $j$, orthogonal with respect to the weight function $g^2(x)$. Then writing the integrand in (2.34) as a product of Slater determinants using (2.8) and making use of the orthogonality of the $p_j(x)$, a standard calculation shows

\[ \rho_N^{FF}(x; y; x_2, \dots, x_n) = \det \begin{bmatrix} K(x,y) & [K(x_j, y)]_{j=2,\dots,n} \\ [K(x, x_k)]_{k=2,\dots,n} & [K(x_j, x_k)]_{j,k=2,\dots,n} \end{bmatrix} =: K\begin{pmatrix} x & x_2 & \cdots & x_n \\ y & x_2 & \cdots & x_n \end{pmatrix}, \tag{2.36} \]

where, with $N_j := \int_{-\infty}^{\infty} g^2(x) (p_j(x))^2 \, dx$,

\[ K(x,y) := g(x) g(y) \sum_{j=0}^{N-1} \frac{p_j(x) p_j(y)}{N_j} = \frac{g(x) g(y)}{N_{N-1}} \frac{p_N(x) p_{N-1}(y) - p_{N-1}(x) p_N(y)}{x - y} = \rho_N^{FF}(x;y). \tag{2.37} \]
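The two expressions for $K(x,y)$ in (2.37) can be compared numerically in the harmonic well case, where $g(x) = e^{-x^2/2}$, the monic orthogonal polynomials are $p_j(x) = 2^{-j}H_j(x)$, and $N_j = \pi^{1/2} 2^{-j} j!$, consistent with (2.7). The sketch below uses the monic recurrence $p_{j+1}(x) = x\,p_j(x) - (j/2)\,p_{j-1}(x)$; the function names are illustrative and not taken from the paper.

```python
import math


def monic_hermite(n, x):
    """p_n(x) = 2^{-n} H_n(x) via p_{k+1} = x p_k - (k/2) p_{k-1}."""
    p_prev, p = 0.0, 1.0
    for k in range(n):
        p_prev, p = p, x * p - 0.5 * k * p_prev
    return p


def norm(j):
    """N_j, the integral of exp(-x^2) p_j(x)^2 over the real line."""
    return math.sqrt(math.pi) * 2.0 ** (-j) * math.factorial(j)


def kernel_sum(n, x, y):
    """First form in (2.37): g(x) g(y) sum_{j<n} p_j(x) p_j(y) / N_j."""
    g = math.exp(-(x * x + y * y) / 2)
    return g * sum(monic_hermite(j, x) * monic_hermite(j, y) / norm(j) for j in range(n))


def kernel_cd(n, x, y):
    """Second (Christoffel-Darboux) form in (2.37); requires x != y."""
    g = math.exp(-(x * x + y * y) / 2)
    num = (monic_hermite(n, x) * monic_hermite(n - 1, y)
           - monic_hermite(n - 1, x) * monic_hermite(n, y))
    return g * num / (norm(n - 1) * (x - y))


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, kernel_sum(n, 0.3, -0.7), kernel_cd(n, 0.3, -0.7))
```

The summed form is the one to use at or near $x = y$, where the Christoffel-Darboux quotient becomes a 0/0 expression and loses accuracy in floating point.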

The equality in (2.37) follows from the Christoffel-Darboux summation formula, and leads to the name Christoffel-Darboux kernel (the latter term is due to a relationship with integral equations; see Sect. 4.1) for (2.37). Hence

\[ \rho_N(x;y) = \sum_{n=0}^{\infty} \frac{(-\xi)^n}{n!} \int_x^y dx_2 \cdots \int_x^y dx_{n+1} \, K\begin{pmatrix} x & x_2 & \cdots & x_{n+1} \\ y & x_2 & \cdots & x_{n+1} \end{pmatrix} \Big|_{\xi = 2} =: -\frac{1}{\xi} \Delta_{[x,y]}\Big( \begin{matrix} x \\ y \end{matrix} ; \xi \Big) \Big|_{\xi = 2}, \tag{2.38} \]

where

\[ \Delta_{[x,y]}\Big( \begin{matrix} a \\ b \end{matrix} ; \xi \Big) := \sum_{n=0}^{\infty} \frac{(-\xi)^{n+1}}{n!} \int_x^y dx_2 \cdots \int_x^y dx_{n+1} \, K\begin{pmatrix} a & x_2 & \cdots & x_{n+1} \\ b & x_2 & \cdots & x_{n+1} \end{pmatrix}. \]

Since for $|x - y|$ small each term in (2.38) is proportional to successively higher powers of $|x - y|$, this is the sought systematic small-$|x - y|$ expansion of $\rho_N(x;y)$. We will see in Sect. 4.1 that the expansion (2.38) forms the basis for Painlevé transcendent evaluations of $\rho_N(\iota(x), x)$ in the harmonic well, Dirichlet and Neumann boundary condition cases.

3. Jimbo-Miwa-Okamoto τ-Functions and Orthogonal Polynomials

In this section we will provide the finite $N$ analogue of the Jimbo, Miwa, Mori and Sato [22] Painlevé transcendent evaluation (1.5) of $\rho_\infty(x;0)$, by similarly evaluating $\rho_{N+1}^C(x;0)$, and also presenting a recurrence relation in $N$ for $\rho_{N+1}^C(x;0)$. Our Painlevé transcendent evaluation of $\rho_{N+1}^C(x;0)$ is in terms of the solution of the Painlevé VI equation in σ-form. Let us then discuss some of the theory relating to this equation.

3.1. Hamiltonian formulation of PVI and τ-function sequences. There are six Painlevé equations, labelled PI–PVI. They result (see e.g. [18]) from the project undertaken by Painlevé, Gambier and others to classify solutions to second order differential equations of the form $y'' = R(y', y, t)$, where $R$ is rational in $y'$, algebraic in $y$ and analytic in $t$, which are free from movable branch points. It was shown that the only such equations, excluding those which could be reduced to first order equations or to linear second order equations, are PI–PVI. Our interest is in the PVI equation, which has the form

\[ q'' = \frac{1}{2}\Big( \frac{1}{q} + \frac{1}{q-1} + \frac{1}{q-t} \Big)(q')^2 - \Big( \frac{1}{t} + \frac{1}{t-1} + \frac{1}{q-t} \Big) q' + \frac{q(q-1)(q-t)}{t^2 (t-1)^2} \Big( \alpha + \beta \frac{t}{q^2} + \gamma \frac{t-1}{(q-1)^2} + \delta \frac{t(t-1)}{(q-t)^2} \Big), \tag{3.1} \]

and its solution, the PVI transcendent $q(t)$. We will see that $\rho_{N+1}^C(x;0)$ can be identified with a τ-function sequence in the PVI system. The PVI system refers to the Hamiltonian system $\{q, p; H, t\}$

\[ q' = \frac{\partial H}{\partial p}, \qquad p' = -\frac{\partial H}{\partial q}, \tag{3.2} \]

where, with $\alpha_0 + \alpha_1 + 2\alpha_2 + \alpha_3 + \alpha_4 = 1$,

\[ t(t-1)H = q(q-1)(q-t)p^2 - \big[ \alpha_4 (q-1)(q-t) + \alpha_3 q(q-t) + (\alpha_0 - 1) q(q-1) \big] p + \alpha_2 (\alpha_1 + \alpha_2)(q - t). \tag{3.3} \]

It has been known since the work of Malmquist in the early 1920's [34] that the PVI equation (3.1) results from the Hamiltonian system (3.2), (3.3) by eliminating $p$ and choosing the parameters so that

\[ \alpha = 1/2\,\alpha_1^2, \quad \beta = -1/2\,\alpha_4^2, \quad \gamma = 1/2\,\alpha_3^2, \quad \delta = 1/2\,(1 - \alpha_0^2). \]

One sees that the Hamiltonian can be written as an explicit rational function of the PVI transcendent and its derivative. This follows from the fact that with $H$ given by (3.3), the first of the Hamilton equations is linear in $p$, so $p$ can be written as a rational function of $q$, $q'$ and $t$. The τ-function is defined in terms of the Hamiltonian by

\[ H = \frac{d}{dt} \log \tau(t). \tag{3.4} \]

The utility of being able to identify $\rho_{N+1}^C(x;0)$ as a τ-function for the PVI system is that $H$, and thus by integration of (3.4) $\tau(t)$, can be characterised in terms of a differential equation.

Proposition 1 [21, 37]. Rewrite the parameters $\alpha_0, \dots, \alpha_4$ of (3.3) in favour of the parameters

\[ b_1 = 1/2\,(\alpha_3 + \alpha_4), \quad b_2 = 1/2\,(\alpha_4 - \alpha_3), \quad b_3 = 1/2\,(\alpha_0 + \alpha_1 - 1), \quad b_4 = 1/2\,(\alpha_0 - \alpha_1 - 1), \tag{3.5} \]

and introduce the auxiliary Hamiltonian $h$ by

\[ h = t(t-1)H + e_2'[b]\,t - \frac{1}{2} e_2[b] = t(t-1)H + (b_1 b_3 + b_1 b_4 + b_3 b_4)\,t - \frac{1}{2} \sum_{1 \le j < k \le 4} b_j b_k, \tag{3.6} \]

where $e_j'[b]$ denotes the $j$th degree elementary symmetric function in $b_1$, $b_3$ and $b_4$ while $e_j[b]$ denotes the $j$th degree elementary symmetric function in $b_1, \dots, b_4$. The auxiliary Hamiltonian satisfies the Jimbo-Miwa-Okamoto σ-form of PVI,

\[ h' \big( t(1-t) h'' \big)^2 + \big( h'[2h - (2t-1)h'] + b_1 b_2 b_3 b_4 \big)^2 = \prod_{k=1}^{4} (h' + b_k^2). \tag{3.7} \]

A self contained derivation of this result can be found in [11].

One of the main practical consequences of the Hamiltonian formulation is that it allows for a systematic construction of special solutions via Bäcklund transformations, that is, birational mappings which leave the Hamilton equations formally unchanged [37]. The elementary Bäcklund transformations form an extended affine Weyl group of type $D_4^{(1)}$. By composing certain of these elementary operators, shift operators can be constructed which have the effect of incrementing the α parameters by ±1 or 0. For example, one such operator of this type, denoted $T_3$ in [11], has the action $T_3 \alpha = (\alpha_0 + 1, \alpha_1 + 1, \alpha_2 - 1, \alpha_3, \alpha_4)$ or equivalently, after recalling (3.5),

\[ T_3 b = (b_1, b_2, b_3 + 1, b_4). \tag{3.8} \]

Although $T_3$ acting on $p$ and $q$ is a non-trivial rational mapping, when acting on $H$, $T_3$ has the formal action of acting only on the α's,

\[ T_3 H = H \big|_{\alpha \mapsto T_3 \alpha}. \]

This motivates introducing a sequence of Hamiltonians

\[ T_3^n H = H \big|_{\alpha \mapsto T_3^n \alpha}, \]

and a corresponding sequence of τ-functions specified by

\[ T_3^n H = \frac{d}{dt} \log \tau_3[n], \qquad \tau_3[n] = \tau_3[n](t) = \tau(t; b_1, b_2, b_3 + n, b_4). \tag{3.9} \]

A crucial result due to Okamoto [37], which can be derived from the specific form of the action of $T_3$ and $T_3^{-1}$ on $H$, $p$ and $q$ [25], is that $\tau_3[n]$ satisfies a particular differential recurrence relation.

Proposition 2. The τ-function sequence (3.9) satisfies the Toda lattice equation

\[ \delta^2 \log \bar\tau_3[n] = \frac{\bar\tau_3[n-1]\, \bar\tau_3[n+1]}{\bar\tau_3^{\,2}[n]}, \qquad \delta = t(t-1) \frac{d}{dt}, \tag{3.10} \]

where

\[ \bar\tau_3[n] := \big( t(t-1) \big)^{(n + b_1 + b_3)(n + b_3 + b_4)/2} \, \tau_3[n]. \tag{3.11} \]

The significance of this is that an identity of Sylvester (see [35]) gives that if

\[ \bar\tau_3[0] = 1, \tag{3.12} \]

then the general solution of (3.10) is given by

\[ \bar\tau_3[n] = \det\big[ \delta^{j+k} \bar\tau_3[1] \big]_{j,k=0,1,\dots,n-1}. \tag{3.13} \]

Furthermore, restricting the parameter space so that α2 = 0 (which corresponds to a (1) chamber wall or reflection hyperplane in the affine D4 root system), it has been shown by Okamoto [37] that τ¯3 [1] is given in terms of a solution of the Gauss hypergeometric equation. Using integral solutions of the latter, the formula (3.13) was taken as the starting point by Forrester and Witte [11] in an extensive study of multidimensional integral forms of the τ -function sequence τ¯3 [n]. In particular, results relating to averages of the form (2.26), equivalent to the following were established.

270

P.J. Forrester, N.E. Frankel, T.M. Garoni, N.S. Witte

Proposition 3. Define AN (u; ω, µ; ξ ) = N l=1

θl 2ω −1 µ (l) iθl 2µ (1 − ξ χ[0,φ) ) 2 sin (1 − ue ) , Ev(U (N)) u=e−iφ 2 ueiθl

where zl = eiθl , 0 ≤ θl ≤ 2π, and

(l) χJ

Let b=

1/ (N +ω−µ), ω 2

(3.14)

=

1, θl ∈ J 0, θl ∈ / J.

+ 1/2(N +ω+µ), 1/2(N −ω+µ), −µ − 1/2(N +ω+µ) , (3.15)

and write

$$C_1 = e_2'[b] + \mu N, \qquad C_2 = \tfrac12 e_2[b] + \mu N$$

(recall the definition of $e_2'[b]$ and $e_2[b]$ from Proposition 1). The PVI system with parameters (3.15) permits the τ-function sequence

$$\tau_3[N] \propto u^{N\mu/2} A_N(u;\omega,\mu;\xi), \tag{3.16}$$

where the proportionality factor is independent of $u$, and furthermore

$$C_1 u - C_2 + u(u-1)\frac{d}{du}\log A_N(u;\omega,\mu;\xi) = h_{VI}(u;b), \tag{3.17}$$

where $h_{VI}(t;b)$ is an auxiliary Hamiltonian (3.6) for the PVI system with parameters (3.15). Consequently (3.17) satisfies the PVI equation in σ-form (3.7) with parameters (3.15).

To relate (3.14) to (2.26) we note that

$$\big(1 - 2\chi^{(l)}_{[0,\phi)}\big)\Big[\frac{-1}{u e^{i\theta_l}}\Big]^{1/2}\big(1 - u e^{i\theta_l}\big)\Big|_{u=e^{-i\phi}} = \Big|2\sin\frac{\theta_l-\phi}{2}\Big|.$$

Consequently

$$\rho^C_{N+1}(x;0) = \frac{1}{L} A_N\big(e^{2\pi i x/L}; 1/2, 1/2; 2\big), \tag{3.18}$$

where we have used the fact that $\rho^C_{N+1}(x;0)$ is even in $x$.

The choice of the parameters in (3.18) corresponding to $\rho^C_{N+1}$ implies a special structure to the τ-function sequence (3.16). First, substituting (3.18) in (3.15) shows we are considering the PVI system with parameters

$$b = \big(\tfrac12 N,\ 1 + \tfrac12 N,\ \tfrac12 N,\ -1 - \tfrac12 N\big). \tag{3.19}$$

Painlev´e Transcendent Evaluations of Density Matrices


As noted above, $\tau_3[1]$ satisfies the Gauss hypergeometric differential equation. The parameters in the latter are related to the parameters $b$ by $a = b_1 + b_4$, $b = 1 + b_3 + b_4$, $c = 1 + b_2 + b_4$. Substituting the special values (3.19) we see that in particular $c = 1$, which is the condition for the existence of a logarithmic solution at the origin ($u = 0$). For general $N$, $\tau_3[N]$ then corresponds to a generalisation of this logarithmic solution of the Gauss hypergeometric equation. To illustrate this point, we note that with $b$ given by (3.19), according to (3.16) and (3.18) we have

$$\tau_3[N](u) \propto u^{N/2}\rho^C_{N+1}(x;0)\big|_{u=e^{2\pi i x/L}}.$$

Recalling (2.17) and (2.18) we see

$$\tau_3[1](u) \propto (u+1)v + 2(u-1),$$

$$\tau_3[2](u) \propto 4(u^2+u+1)v^2 + 12(u-1)(u+1)v - u^{-1}(u-1)^2(u^2-14u+1),$$

where $v = \pi i - \log u$, which exhibits the further structure of being a polynomial of degree $N$ in $v$, and a Laurent polynomial in $u$ of positive degree $N$ and negative degree $N-1$.

The PVI system with parameters (3.19) also permits a τ-function sequence which is strictly a polynomial. To anticipate this we relate (3.14) to the free Fermi average (2.30) by noting

$$\Big[\frac{-1}{u e^{i\theta_l}}\Big]^{1/2}\big(1 - u e^{i\theta_l}\big)\Big|_{u=e^{-i\phi}} = 2\sin\frac{\theta_l-\phi}{2},$$

and so deducing

$$\rho^{C,FF}_{N+1}(x;0) = \frac{1}{L} A_N\big(e^{2\pi i x/L}; 1/2, 1/2; 0\big). \tag{3.20}$$

Recalling (3.16) and (2.31) we see that this corresponds to the τ-function sequence

$$\tau_3[N](u) \propto \sum_{j=0}^{N-1} u^j.$$

This class of polynomial solutions is a special case of the generalised Jacobi polynomial solutions identified in [36].

As a final remark on the theme of special classes of solutions to the PVI system, we note that the specification of the parameters (3.19) is a particular example which permits elliptic solutions [26, 5, 17]. More generally the latter occur when

$$\begin{aligned}
t_1 &= 1 + b_3 - b_4 = 2 + N \in \mathbb{Z}, \\
t_2 &= b_1 + b_2 = 1 + N \in \mathbb{Z}, \\
t_3 &= b_1 - b_2 = -1 \in \mathbb{Z}, \\
t_4 &= 1 + b_3 + b_4 = 0 \in \mathbb{Z}, \\
\sum_{k=1}^{4} t_k &= 2(N+1) \in 2\mathbb{Z}.
\end{aligned}$$
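The parameter bookkeeping in this passage is easy to check mechanically. The sketch below encodes the vector (3.15) as it is read here from the text (that reading, and the function names, are assumptions), specialises it to ω = µ = 1/2, and recomputes the hypergeometric parameters and the quantities $t_1,\dots,t_4$.

```python
from fractions import Fraction as F

def b_params(N, omega, mu):
    """Parameter vector (3.15), as read from the text."""
    half = F(1, 2)
    return (half * (N + omega - mu),
            omega + half * (N + omega + mu),
            half * (N - omega + mu),
            -mu - half * (N + omega + mu))

def hypergeometric_params(b):
    """(a, b, c) of the Gauss equation: a = b1+b4, b = 1+b3+b4, c = 1+b2+b4."""
    b1, b2, b3, b4 = b
    return (b1 + b4, 1 + b3 + b4, 1 + b2 + b4)

def t_params(b):
    """(t1, t2, t3, t4) entering the condition for elliptic solutions."""
    b1, b2, b3, b4 = b
    return (1 + b3 - b4, b1 + b2, b1 - b2, 1 + b3 + b4)
```

For example, `b_params(5, F(1, 2), F(1, 2))` reproduces the pattern of (3.19), the third hypergeometric parameter comes out as 1 (the logarithmic case), and the $t_k$ sum to $2(N+1)$.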

Substituting (3.18) into (3.17) of Proposition 3 and replacing $N$ by $N-1$ throughout gives the sought evaluation of $\rho^C_N(x;0)$ in terms of a solution of the PVI equation in σ-form.


Corollary 1. Define

$$\sigma_N(u) := u(u-1)\frac{d}{du}\log\rho^C_N(x;0)\Big|_{e^{2\pi i x/L}=u} \tag{3.21}$$

so that

$$\rho^C_N(x;0) = \rho_0\exp\Big(2\pi i\int_0^{x/L}\frac{dt}{e^{2\pi i t}-1}\,\sigma_N(e^{2\pi i t})\Big). \tag{3.22}$$

The quantity $\sigma_N(u)$ satisfies the particular PVI σ-form differential equation

$$u^2(u-1)^2(\sigma_N'')^2 + \big[\sigma_N - (u-1)\sigma_N' + 1\big]\Big\{4\sigma_N'(\sigma_N - u\sigma_N') - (N^2-1)\big[\sigma_N - (u-1)\sigma_N'\big]\Big\} = 0 \tag{3.23}$$

subject to the boundary condition

$$\sigma_N(u) \underset{u\to1}{\sim} \frac{N^2-1}{12}(u-1)^2 + \frac{(N^2-1)(iN-\pi)}{24\pi}(u-1)^3 + \cdots. \tag{3.24}$$

The formula (3.23) also holds for $\rho^{C,FF}_N(x;0)$, except that $\sigma^{FF}_N(u)$ is now subject to the boundary condition

$$\sigma^{FF}_N(u) \underset{u\to1}{\sim} \frac{N^2-1}{12}(u-1)^2 - \frac{N^2-1}{24}(u-1)^3 + \cdots.$$

Proof. This is immediate from Proposition 3 and (3.18), (3.20), except for the boundary conditions. The latter in the free Fermi case follows by substituting the exact evaluation (2.31) in (3.21). Use is also made of the free Fermi density matrix exact evaluation (2.31) to deduce the boundary condition in the impenetrable Bose gas case. Thus according to (2.36)–(2.38) we have

$$\begin{aligned}
\rho^C_N(x;0) &= \rho^{C,FF}_N(x;0) + 2\int_0^x \det\begin{bmatrix} \rho^{C,FF}_N(x;0) & \rho^{C,FF}_N(x_2;0) \\ \rho^{C,FF}_N(x;x_2) & \rho^{C,FF}_N(x_2;x_2) \end{bmatrix} dx_2 + \cdots \\
&\underset{x\to0}{\sim} \rho_0\Big(1 - \frac{(N-1)(N+1)}{6}\Big(\frac{\pi x}{L}\Big)^2 + \frac{(N-1)N(N+1)}{9\pi}\Big(\frac{\pi x}{L}\Big)^3 + \cdots\Big),
\end{aligned} \tag{3.25}$$

where the second line follows after substituting (2.31) and expanding the first term to $O(x^2)$ (this term only contains even powers of $x$), and the second term to its leading order, $O(x^3)$. Finally we substitute (3.25) in (3.21) to deduce the expansion (3.24).

One immediate consequence of Corollary 1 is that it allows the small $x$ expansion to easily be extended. Thus it follows that the corrections to (3.25) at order $x^4$ and $x^5$ are

$$+\frac{(N-1)(N+1)[3N^2-7]}{360}\Big(\frac{\pi x}{L}\Big)^4 - \frac{(N-1)N(N+1)[11N^2-29]}{1350\pi}\Big(\frac{\pi x}{L}\Big)^5. \tag{3.26}$$

The results (1.5), (1.6) of Jimbo et al. [22] follow simply from our results (3.22), (3.23). Thus defining $\sigma_V(t) = \lim_{N\to\infty}\sigma_N(e^{2it/N})$ we obtain (1.5) from (3.22), while substituting $u = e^{2it/N}$ in (3.23), replacing $\sigma_N(e^{2it/N})$ with $\sigma_V(t)$ and equating the leading order terms in $N$ (which are $O(1)$) to zero gives (1.6). The boundary condition (1.7) corresponds to the scaled limit of (3.24).
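The free Fermi statements of Corollary 1 can be checked in exact rational arithmetic. The sketch below assumes the standard closed form $\rho^{C,FF}_N(x;0) \propto \sin(\pi N x/L)/\sin(\pi x/L) = u^{-(N-1)/2}\sum_{j=0}^{N-1}u^j$ with $u = e^{2\pi i x/L}$ (equation (2.31) itself is not reproduced in this section, so this is an assumption), and it reads (3.23) as $u^2(u-1)^2(\sigma'')^2 + [\sigma - (u-1)\sigma' + 1]\{4\sigma'(\sigma - u\sigma') - (N^2-1)[\sigma - (u-1)\sigma']\} = 0$. Function names are hypothetical.

```python
from fractions import Fraction as F

# Polynomials in u: coefficient lists, lowest degree first.
# Rational functions: (numerator, denominator) pairs of polynomials.

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pmul(p, q):
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pder(p):
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

def peval(p, u):
    return sum(c * u**k for k, c in enumerate(p))

def rder(r):
    """Quotient rule: (n/d)' = (n' d - n d') / d^2."""
    num, den = r
    top = padd(pmul(pder(num), den), [-c for c in pmul(num, pder(den))])
    return top, pmul(den, den)

def reval(r, u):
    return peval(r[0], u) / peval(r[1], u)

def sigma_ff(N):
    """sigma = u(u-1) d/du log(u^(-(N-1)/2) P), with P = 1 + u + ... + u^(N-1)."""
    P = [F(1)] * N
    inner = padd(pmul([F(0), F(2)], pder(P)), [-(N - 1) * c for c in P])
    return pmul([F(-1), F(1)], inner), [2 * c for c in P]

def ode_residual(N, u):
    """Left-hand side of (3.23), in the reading stated above."""
    s0 = sigma_ff(N)
    s1 = rder(s0)
    s2 = rder(s1)
    s, sp, spp = (reval(r, u) for r in (s0, s1, s2))
    bracket = s - (u - 1) * sp + 1
    brace = 4 * sp * (s - u * sp) - (N * N - 1) * (s - (u - 1) * sp)
    return u**2 * (u - 1)**2 * spp**2 + bracket * brace

def taylor_at_one(N, order):
    """Coefficient of (u - 1)^order in the expansion of sigma about u = 1."""
    r = sigma_ff(N)
    fact = 1
    for k in range(1, order + 1):
        r = rder(r)
        fact *= k
    return reval(r, F(1)) / fact
```

With these definitions the residual vanishes identically at rational sample points, and the second and third Taylor coefficients at $u = 1$ reproduce $(N^2-1)/12$ and $-(N^2-1)/24$. The cubic coefficient is not fixed by the equation itself, which is why it can serve as the boundary datum separating the Bose and Fermi cases.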


3.2. Orthogonal polynomials on the unit circle. A feature of a number of recent studies [10, 11, 1, 3, 4] relating Hankel and Toeplitz determinants to Painlevé transcendents has been the characterisation of the former not only as the solution of nonlinear differential equations, but also as the solution of nonlinear difference equations. Here we will show a difference equation characterisation is also possible for $\rho^C_N(x;0)$.

For this purpose we adopt an orthogonal polynomial approach, similar to that used in [19]. The characterisation of the density matrix as a Toeplitz determinant with a non-negative and bounded symbol (2.13, 2.14) immediately implies an underlying orthogonal polynomial system defined on the unit circle. The weight appearing in (2.14) is the special case $a = b = 1/2$ of the generalised Jacobi weight

$$w(z) = \frac{C}{2\pi}|1+z|^{2a}|1+uz|^{2b}, \qquad a, b \in \mathbb{C},\ z \in \mathbb{T}, \tag{3.27}$$

where $C$ is the normalisation

$$\frac{C}{2\pi}\int_{\mathbb{T}}\frac{dz}{iz}|1+z|^{2a}|1+uz|^{2b} = 1. \tag{3.28}$$

Associated with (3.27) is a system of orthonormal polynomials $\{\phi_n(z)\}_{n=0,1,\dots}$,

$$\int_{\mathbb{T}}\frac{dz}{iz}\,w(z)\phi_n(z)\overline{\phi_m(z)} = \delta_{m,n}.$$

In obtaining a recurrence relation for

$$D_{N-1} := \det\Big[\int_{\mathbb{T}}\frac{dz}{iz}\,w(z)z^{k-j}\Big]_{j,k=1,\dots,N},$$

and thus, since

$$\rho^C_{N+1}(x;0) = \frac{1}{LC^N}D_{N-1}\Big|_{a=b=1/2}, \tag{3.29}$$

for the density matrix, one focuses attention on the leading two coefficients $\kappa_n$, $l_n$ in $\phi_n(z)$, and the trailing coefficient $\phi_n(0)$,

$$\phi_n(z) = \kappa_n z^n + l_n z^{n-1} + \dots + \phi_n(0). \tag{3.30}$$

The relevance of $\kappa_n$, $\phi_n(0)$ is seen from the Szegö relations [41]

$$\kappa_n^2 = \frac{D_{n-1}}{D_n}, \qquad \kappa_n^2 = \sum_{k=0}^{n}|\phi_k(0)|^2,$$

which show in particular that

$$1 - |r_N|^2 = \frac{D_{N-2}D_N}{D_{N-1}^2}, \qquad r_n := \frac{\phi_n(0)}{\kappa_n}. \tag{3.31}$$
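As a concrete instance of (3.29) and (3.31), consider the endpoint case $u = 1$, $a = b = 1/2$, where the weight is $|1+z|^2/(4\pi)$ and the normalisation works out to $C = 1/2$ (both computed here for illustration rather than quoted from the source). The Toeplitz matrix is then tridiagonal, and the sketch below evaluates the determinants exactly. Function names are hypothetical.

```python
from fractions import Fraction as F

def det(mat):
    """Exact determinant by Gaussian elimination over the rationals."""
    mat = [row[:] for row in mat]
    n = len(mat)
    d = F(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if mat[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            mat[c], mat[piv] = mat[piv], mat[c]
            d = -d
        d *= mat[c][c]
        for r in range(c + 1, n):
            f = mat[r][c] / mat[c][c]
            for k in range(c, n):
                mat[r][k] -= f * mat[c][k]
    return d

def moment(k):
    """k-th trigonometric moment of w(z) = |1 + z|^2 / (4 pi)."""
    return {0: F(1), 1: F(1, 2), -1: F(1, 2)}.get(k, F(0))

def D(m):
    """Toeplitz determinant D_m, of size m + 1; D_{-1} = 1 by convention."""
    n = m + 1
    return det([[moment(k - j) for k in range(n)] for j in range(n)])

def r_squared(N):
    """|r_N|^2 from (3.31)."""
    return 1 - D(N - 2) * D(N) / D(N - 1) ** 2

def density_times_L(N):
    """L * rho_{N+1}(0; 0) from (3.29), with C = 1/2."""
    return D(N - 1) / F(1, 2) ** N
```

In this special case the determinants give $|r_N|^2 = 1/(N+1)^2$ and $L\rho^C_{N+1}(0;0) = N+1$, in line with the fact that the density matrix at coincident points equals the particle density.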

We will see that for the weight (3.27), the Freud equations, which are recurrence relations among the successive coefficients $\kappa_n$, $\phi_n(0)$, have a special structure which leads to a recurrence equation for $r_n$, and thus according to (3.29) and (3.31), for $\rho^C_N(x;0)$.


Proposition 4. Consider the special case $a = b = 1/2$ of (3.27), in which case according to (3.29) the relation (3.31) reads

$$1 - |r_N|^2 = \frac{\rho^C_{N+2}\,\rho^C_N}{(\rho^C_{N+1})^2}. \tag{3.32}$$

The ratios $r_n$, and thus via (3.32) the successive density matrices, are determined by the third order difference equation with respect to $N$

$$2\cos\frac{\pi x}{L} + 2\tilde r_{N+1}\tilde r_N = \frac{1-\tilde r_{N+1}^2}{\tilde r_{N+1}}\big[(N+3)\tilde r_{N+2} + (N+1)\tilde r_N\big] - \frac{1-\tilde r_N^2}{\tilde r_N}\big[(N+2)\tilde r_{N+1} + N\tilde r_{N-1}\big], \tag{3.33}$$

where $r_n := e^{i\pi(1-x/L)n}\,\tilde r_n$, $\tilde r_n \in \mathbb{R}$. The initial members of this sequence of $\tilde r_n$ required to start the recurrence are

$$\tilde r_0 = 1, \tag{3.34}$$

$$\tilde r_1 = \frac14\,\frac{\pi - 2\pi x/L + \sin(2\pi x/L)}{\tfrac12(\pi - 2\pi x/L)\cos(\pi x/L) + \sin(\pi x/L)} \tag{3.35}$$

(substituting these values in (3.33) with $N = 0$ allows $\tilde r_2$ to be computed). Also, the initial members of the $\rho^C_N$ sequence are specified by (2.16) and (2.17) (with these values given, $\rho^C_3$ is computed from (3.32) with $N = 1$).

Proof. Magnus has found a recurrence relation [33] for the ratios $r_n$ applicable to the generalised Jacobi weight (3.27),

$$(n+1+a+b)\,r_{n+1} + (n-1+a+b)\,\bar u\,r_{n-1} = \frac{\bar u\,\bar l_n/\kappa_n + l_n/\kappa_n - n(\bar u+1)}{1-|r_n|^2}. \tag{3.36}$$

However this involves both $r_n$ and $l_n$ and we require a further relation to determine $l_n$. This relation is

$$\frac{l_n}{\kappa_n} - \bar u\,\frac{\bar l_n}{\kappa_n} = \frac{(a-b)n}{n+a+b}(\bar u - 1), \tag{3.37}$$

and follows from telescoping the identity

$$(n+a+b+1)\Big[\frac{l_{n+1}}{\kappa_{n+1}} - \bar u\,\frac{\bar l_{n+1}}{\kappa_{n+1}}\Big] - (n+a+b)\Big[\frac{l_n}{\kappa_n} - \bar u\,\frac{\bar l_n}{\kappa_n}\Big] = (a-b)(\bar u - 1),$$

which in turn is derived by evaluating

$$\int_{\mathbb{T}}\frac{dz}{iz}(1+z)(1+uz)\,w'(z)\,\phi_n(z)\overline{\phi_n(z)}$$

in two different ways. Then (3.33) follows after setting $a = b = 1/2$ and extracting a phase factor of $e^{i\pi(1-x/L)n}$.
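The following sketch illustrates how the recurrence can be used in practice. It reads (3.33) as $2\cos\theta + 2\tilde r_{N+1}\tilde r_N = F_{N+1} - F_N$ with $F_N = \frac{1-\tilde r_N^2}{\tilde r_N}[(N+2)\tilde r_{N+1} + N\tilde r_{N-1}]$ and $\theta = \pi x/L$; this reading is an assumption about the flattened display. It is tested against two closed forms noted later in this section, $\tilde r_N = 1/(N+1)$ at $x = 0$ and the free Fermi ratios $\tilde r_n = \sin\theta/\sin((n+1)\theta)$. Function names are hypothetical.

```python
import math

def bracket(r, N):
    """F_N = (1 - r_N^2)/r_N * ((N+2) r_{N+1} + N r_{N-1}); no r_{-1} term at N = 0."""
    prev = r[N - 1] if N > 0 else 0.0
    return (1 - r[N] ** 2) / r[N] * ((N + 2) * r[N + 1] + N * prev)

def residual(r, N, theta):
    """Left side minus right side of the recurrence at index N."""
    lhs = 2 * math.cos(theta) + 2 * r[N + 1] * r[N]
    return lhs - (bracket(r, N + 1) - bracket(r, N))

def step(r, theta):
    """Given r = [r_0, ..., r_{N+1}], solve the recurrence for r_{N+2}."""
    N = len(r) - 2
    rhs = 2 * math.cos(theta) + 2 * r[N + 1] * r[N] + bracket(r, N)
    return (rhs * r[N + 1] / (1 - r[N + 1] ** 2) - (N + 1) * r[N]) / (N + 3)

def free_fermi(n, theta):
    return math.sin(theta) / math.sin((n + 1) * theta)

def tabulate(r0, r1, theta, count):
    """Forward iteration from the two starting values."""
    r = [r0, r1]
    while len(r) < count:
        r.append(step(r, theta))
    return r
```

Starting `tabulate` from the free Fermi values $\tilde r_0 = 1$, $\tilde r_1 = 1/(2\cos\theta)$ regenerates the closed form to rounding accuracy for the first several terms. In the Bose case the same loop would be started from (3.34) and (3.35), after which (3.32) converts the ratios into successive density matrices.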


The initial conditions (3.34), (3.35) can be determined by a Gram-Schmidt type construction of the orthonormal polynomials $\{\phi_n(z)\}_{n=0,1,\dots}$. First, due to the normalisation of the weight (3.28) we have $\phi_0(z) = 1$ and thus (3.34) follows. With

$$\langle f, g\rangle := \int_{\mathbb{T}}\frac{dz}{iz}\,w(z)\Big|_{a=b=1/2} f(z)\overline{g(z)},$$

the orthogonality $\langle\phi_1(z), \phi_0(z)\rangle = 0$, explicit value $\phi_0(z) = 1$ and (3.30) give $\kappa_1\langle z, 1\rangle + \phi_1(0) = 0$. The value of $\langle z, 1\rangle$ can be read off from (2.15), thus implying (3.35).

We note that the special cases corresponding to $x$ at either the endpoints ($x = 0, L$ and thus $u = 1$) or the midpoint ($x = L/2$, and thus $u = -1$) allow simple explicit formulas for the $r_n$. Thus we have [19]

$$u = 1:\quad r_N = \frac{(-1)^N}{N+1}; \qquad u = -1:\quad r_{N=2p} = \frac{1}{N+1},\quad r_{N=2p+1} = 0, \tag{3.38}$$

which clearly satisfy (3.33). The density matrix at these points also has a closed form evaluation,

$$\rho^C_N(0;0) = \frac{N}{L},$$

$$\rho^C_{N=2p}(L/2;0) = \frac{1}{L}\Big(\frac{4}{\pi}\Big)^{N-1} 4^{(p-1)(2p-1)}\,\frac{G(p+2)\,G^6(p+1)\,G(p)}{G^2(2p+1)},$$

$$\rho^C_{N=2p+1}(L/2;0) = \frac{1}{L}\Big(\frac{4}{\pi}\Big)^{N-1} 4^{p(2p-1)}\,\frac{G^4(p+1)\,G^4(p+2)}{G^2(2p+2)},$$

where $G(x)$, the Barnes G-function, has the explicit form $G(x) = (x-2)!\,(x-3)!\cdots 1!$ for $x \in \mathbb{Z}_{\ge 2}$. Here the former evaluation follows from the general fact that at coincident points the density matrix is equal to the particle density, while the latter makes use of results from [19] on the explicit form of the $\kappa_n$'s for the weight (3.27), $\kappa_{2n} = \kappa_{2n+1} = (2n+1)!/\big(2^{2n}(n!)^2\sqrt{n+1}\big)$, in the case $u = -1$, $a = b$.

The small $x$ expansion of $\rho^C_N(x;0)$, (3.25) and (3.26), substituted into (3.32) allows the corresponding small $x$ expansion of $\tilde r_n$ to be computed up to a sign, which in turn can be determined using (3.38). This shows

$$\tilde r_n \sim (-1)^n\Big[\frac{1}{n+1} + \frac{n(n+2)}{6(n+1)}\Big(\frac{\pi x}{L}\Big)^2 - \frac{n(n+2)}{3\pi}\Big(\frac{\pi x}{L}\Big)^3 + \cdots\Big].$$

A feature of the explicit forms (3.38) is that $|r_N| \to 0$ as $N \to \infty$. According to (3.31) this is a necessary condition for the convergence of $\rho^C_N(x;0)$ as $N \to \infty$. Here we note that with $|r_N|$ small the difference equation (3.33) simplifies to read

$$2\cos\frac{\pi x}{L} = B_{N+1} - B_N, \qquad B_N := \frac{(N+2)\tilde r_{N+1} + N\tilde r_{N-1}}{\tilde r_N}. \tag{3.39}$$

It follows from (3.39) that $B_N = 2N\cos\pi x/L + C$, where $C$ is independent of $N$. Noting that this implies $B_N \sim 2(N+1)\cos\pi x/L$ for $N$ large, we thus obtain the recurrence

$$(N+2)\tilde r_{N+1} + N\tilde r_{N-1} = (N+1)\tilde r_N\,2\cos\pi x/L, \tag{3.40}$$


which with $A_N := (N+1)\tilde r_N$, $\cos\pi x/L := t$ reads

$$A_{N+1} + A_{N-1} = 2tA_N. \tag{3.41}$$
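For orientation, the linear recurrence (3.41) can be iterated directly. The sketch below starts from $A_0 = 1$, $A_1 = 2t$ (an illustrative choice of initial data, not one dictated by the text) and compares the result with the closed form $\sin((N+1)\theta)/\sin\theta$, $t = \cos\theta$. Function names are hypothetical.

```python
import math

def iterate_341(n, t):
    """Iterate A_{N+1} = 2 t A_N - A_{N-1} from A_0 = 1, A_1 = 2 t."""
    prev, cur = 1.0, 2.0 * t
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2.0 * t * cur - prev
    return cur

def closed_form(n, theta):
    """sin((n + 1) theta) / sin(theta), the matching closed-form solution."""
    return math.sin((n + 1) * theta) / math.sin(theta)
```

Other initial data select other solutions of the same recurrence, for example $A_0 = 1$, $A_1 = t$ gives $\cos(N\theta)$.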

The recurrence (3.41) is precisely the three-term recurrence satisfied by the Chebyshev polynomials [41]. Thus (3.33) can be regarded as a non-linear generalisation of the Chebyshev recurrence (3.41).

Although it is not obvious from the derivation, Eqs. (3.32) and (3.33) remain valid in the free Fermi case. This can be seen by substituting the exact evaluation (2.31) into (3.32) to deduce

$$\tilde r_n = \frac{\sin\pi x/L}{\sin\pi(n+1)x/L}$$

and then verifying that this is an exact solution of (3.33). Unlike $\tilde r_n$ in the Bose case, (3.2) does not obey the inequality $|\tilde r_n| < 1$ for all $x$.

As our final point on the difference equation, we remark that recently Adler and van Moerbeke [1] have constructed essentially the same pair of coupled recurrences (3.36), (3.37) from their theory of the Toeplitz lattice and its Virasoro algebra. In the particular case at hand their weight is specialised to $\alpha = \beta = 1$, and $\xi^{-2} = u = e^{2\pi i x/L}$. Their variables are related to ours by $x_n = \tilde r_n$, and through the use of their relation (0.0.14), which is the analogue of (3.37), one can show $y_n = x_n$. The other recurrence in their work, (0.0.15), is the analogue of (3.36) and can be shown to lead to

$$\begin{aligned}
&(1-x_n^2)\big[(n+2)x_{n+1}x_{n-1} + n + 1\big] - (1-x_{n-1}^2)\big[(n-1)x_n x_{n-2} + n\big] \\
&\quad = 1 + 2x_{n-1}x_n\cos(\pi x/L) + x_{n-1}^2x_n^2 - 1 + (1-x_1^2)(3x_2+2) - x_1\big(x_1 + 2\cos(\pi x/L)\big).
\end{aligned}$$

Now by clearing denominators and rearranging (3.33) one can recover the first five terms of the above relation. Furthermore, by using the initial conditions of the recurrence (3.34), (3.35) one can show that the sum of the last three terms is identically zero and thus the two forms are the same.

4. Painlevé-Type Evaluations of $\rho_N(\iota(x); x)$

4.1. Fredholm formulation. While the unitary average (2.26) defining $\rho^C_{N+1}(x;0)$ is a known τ-function in the Painlevé theory, the same is not true of the averages (2.27)–(2.29). Indeed the density matrices in these cases are genuinely functions of both $x$ and $y$. These variables play the role of time in the Hamiltonian formulations of the Painlevé equations, so there being more than one time variable, we are taken outside this class. However, with $y = \iota(x)$, where $\iota(x)$ denotes the reflection of $x$ about the centre of the system (thus $\iota(x) = -x$ for the harmonic well case, and $\iota(x) = L - x$ for the case of the Dirichlet and Neumann boundary conditions) we again have a function of one variable. Although this cannot be recognised as a single τ-function, it turns out that we can formulate the calculation of $\rho_N(\iota(x); x)$ so that it is expressed in terms of quantities known in terms of Painlevé transcendents from random matrix theory. For this one makes use of a classical operator theoretic interpretation of (2.38) relating to Fredholm integral equations [30].


It is the latter formulation which has been used in the pioneering work of Jimbo et al. [22] on the evaluation of the bulk density matrix in terms of a Painlevé V transcendent, and the generalisation of this result by Its, Korepin and coworkers [20, 28] to the temperature dependent bulk density matrix. The key point is that with $K_J$ denoting the integral operator on $J = [x, y]$ with kernel (2.37), and $R(a, b; \xi)$ denoting the kernel of the resolvent operator $R := \xi K_J(1 - \xi K_J)^{-1}$, it is true in general that (see e.g. [30, 22])

$$\Delta_{[x,y]}\binom{a}{b};\xi\Big) = -\xi\det(1 - \xi K_J)\,R(a, b; \xi) \tag{4.1}$$

(the quantity $\Delta_{[x,y]}$ is called the first Fredholm minor). Now in the harmonic well case and the cases of Dirichlet and Neumann boundary conditions (the latter two after the change of variables $\cos\pi x_j/L \mapsto x_j$) the wave function is of the form (2.32) with

$$g^2(x) = \begin{cases} e^{-x^2}, & \text{harmonic well} \\ (1-x^2)^{\pm1/2}, & \text{Dirichlet and Neumann.} \end{cases} \tag{4.2}$$

These weights have the property of being even in $x$. This implies a special structure to (4.1) if $J$ is also chosen to be symmetrical about the origin, $J = [-x, x]$ say. Thus a consequence of $g^2(x)$ being even is that the orthogonal polynomials $p_j(x)$ are even for $j$ even and odd for $j$ odd, and this from (2.37) implies $K(a, b) = K(-a, -b)$. Using this latter property, and with $J = [-x, x]$, it is true in general that (see e.g. [42])

$$\frac{d}{dx}\log\det(1 - \xi K_J) = -2R(x, x; \xi).$$

Antidifferentiating and substituting in (4.1) with $[x, y] \mapsto [-x, x]$, then substituting the result in (2.38) shows

$$\rho_N(-x; x) = R(-x, x; \xi)\exp\Big(-2\int_0^x R(t, t; \xi)\,dt\Big)\Big|_{\xi=2}. \tag{4.3}$$
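The logarithmic-derivative identity used to obtain (4.3) can be illustrated numerically. The sketch below is a rough midpoint-rule (Nyström) discretisation, not the method of the paper. As a stand-in for the Christoffel-Darboux kernel (2.37) it uses the sine kernel of Section 4.3, which is also even in the required sense, and it takes ξ = 1 for convenience. The quadrature rule, step sizes and function names are all assumptions.

```python
import math

def sine_kernel(a, b):
    d = a - b
    return 1.0 / math.pi if d == 0.0 else math.sin(d) / (math.pi * d)

def solve_with_det(A, rhs):
    """Gaussian elimination with partial pivoting: returns (det A, y) with A y = rhs."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if p != c:
            M[c], M[p] = M[p], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        y[i] = (M[i][n] - sum(M[i][k] * y[k] for k in range(i + 1, n))) / M[i][i]
    return det, y

def nystrom(x, xi, kernel=sine_kernel, m=80):
    """Discretise 1 - xi K on J = [-x, x]; return (det(1 - xi K_J), R(x, x; xi))."""
    w = 2.0 * x / m
    nodes = [-x + (j + 0.5) * w for j in range(m)]
    A = [[(1.0 if i == j else 0.0) - xi * w * kernel(ti, tj)
          for j, tj in enumerate(nodes)] for i, ti in enumerate(nodes)]
    # Resolvent equation R = xi K + xi K R, solved for the column R(t_j, x).
    det, col = solve_with_det(A, [xi * kernel(ti, x) for ti in nodes])
    r_xx = xi * kernel(x, x) + xi * w * sum(
        kernel(x, tj) * cj for tj, cj in zip(nodes, col))
    return det, r_xx

def compare(x=0.5, xi=1.0, h=1e-3):
    """Central-difference d/dx log det versus -2 R(x, x; xi)."""
    d_plus, _ = nystrom(x + h, xi)
    d_minus, _ = nystrom(x - h, xi)
    _, r_xx = nystrom(x, xi)
    return (math.log(d_plus) - math.log(d_minus)) / (2 * h), -2.0 * r_xx
```

The two numbers returned by `compare()` should agree to within the discretisation error. The factor 2 arises because both endpoints of $J = [-x, x]$ move with $x$ and, by the evenness of the kernel, contribute equally.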

The crucial point of the formula (4.3) is that the quantities R(−x, x) and R(t, t), for Christoffel-Darboux kernels corresponding to the weights (4.2) have previously been calculated in terms of Painlevé transcendents as part of studies into gap probabilities (interval J free of eigenvalues) for random matrix ensembles, the GUE in the case of the harmonic well, and the JUE with a = b = ±1/2 in the case of Dirichlet and Neumann boundary conditions. Although (4.2) and (4.3) have general validity, the specific integrable nature of the kernel (2.37) [20] is essential for this characterisation. In the latter case the quantities in (4.3) were studied in [49], but with J = (−1, −x] ∪ [x, 1) rather than J = [−x, x]. To overcome this difference in detail, we note we can rewrite (2.33) to read ∞ ξ x N−1 Ng(x)g(y) ρN (x; y) = (1 − ξ ) + + dx2 g 2 (x2 ) C 1−ξ −∞ y ∞ ξ x ··· dxN g 2 (xN ) + + 1−ξ −∞ y N × (x − xl )(y − xl ) (xk − xj )2 . l=2

2≤j
ξ =2


Repeating the working which led to (4.3) then shows

$$\rho_N(-x; x) = (-1)^{N-1}R(-x, x; \xi)\exp\Big(-2\int_x^\infty R(t, t; \xi)\,dt\Big)\Big|_{\xi=2}, \tag{4.4}$$

where $R$ now denotes the kernel of the resolvent operator $R = \xi K_{\bar J}(1 - \xi K_{\bar J})^{-1}$, $\bar J := (-\infty, -x] \cup [x, \infty)$.

4.2. Evaluation of $\rho^H_N(-x; x)$ and $\rho^{D,N}_N(L-x; x)$. Let us begin by specifying the quantities in (4.3) in the harmonic well case. From the above discussion, this corresponds to the interval $J$ being eigenvalue free in the GUE. For this matrix ensemble, the gap probability for both the interval $J = [-x, x]$, and the interval $J = (-\infty, -x] \cup [x, \infty)$ have been studied [43, 49], allowing us to use either (4.3) or (4.4) to deduce an exact expression for $\rho^H_N(-x; x)$. For purposes of specifying the boundary condition in the corresponding differential equation, it is most convenient to use the latter.

Proposition 5. For the impenetrable Bose gas on a line, confined by a harmonic well, we have

$$\rho^H_N(-x; x) = \tilde R(x; \xi)\exp\Big(-2\int_x^\infty R(t; \xi)\,dt\Big)\Big|_{\xi=2},$$

where with $h := \sqrt{s^2 - 2R'}$, $R$ satisfies the equation

$$sR'' + 2R' = 2s(s-h) - 2h\sqrt{(R + sR')^2 - 4s^2(s-h)R - 2Ns^2(s-h)^2},$$

while $\tilde R$ satisfies the equation

$$\big[s\tilde R'' + 2\tilde R' + 8Ns\tilde R - 24s^2\tilde R^2\big]^2 = 4(s - 2\tilde R)^2\big[(\tilde R + s\tilde R')^2 + 8Ns^2\tilde R^2 - 16s^3\tilde R^3\big].$$

The corresponding boundary conditions are

$$R(s; \xi) \underset{s\to\infty}{\sim} \tilde R(s; \xi) \underset{s\to\infty}{\sim} (\xi-1)^{N-1}\rho^H_N(s; s) \underset{s\to\infty}{\sim} (\xi-1)^{N-1}\frac{2^{N-1}}{\pi^{1/2}(N-1)!}\,s^{2N-2}e^{-s^2}.$$

The two resolvent kernels are not independent, being related by

$$\frac{d}{ds}R(s; \xi) = -2s\tilde R(s; \xi) - 2\tilde R^2(s; \xi)$$

so that $h = s + 2\tilde R(s; \xi)$. Both resolvent kernels have been reduced to a particular PV transcendent $w(x)$ with parameters $\alpha = \tfrac18 N^2$,

β = −1/8(N − )2 ,

γ = 1/2,

δ = −1/2,


where = ±1. The reductions were found to be d 1 2x w + N (w − 1)2 + (2x − 1)w + 1 R= √ 2 dx 8 xw(w − 1) d × 2x w − N (w − 1)2 − (2x + 1)w + 1 , dx √ x d N (w − 1) + R˜ = − w−w + , √ 2w(w − 1) dx 4 xw where s 2 = x. In the case of Dirichlet and Neumann boundary conditions we require R(−t, t) and R(t, t) for the symmetric JUE with (−1, −t] ∪ [t, 1) eigenvalue free. The differential equation satisfied by R(t, t) is known from [49], but the equation for R(−t, t) was not made explicit in that work. We therefore give some details of the required calculation below. Proposition 6. The impenetrable Bose gas on the finite interval [0, L] subject to Dirichlet or Neumann boundary conditions at the ends has the density matrix 1 π πx D,N D,N (L − x; x) = sin dtR D,N (t; ξ ) ξ =2 , ρN R0 (s; ξ ) exp −2 L L s s=cos π x/L where σ (s) := (1 − s 2 )R D,N (s; ξ ) satisfies s(1 − s 2 ) (N + α)2 s + 2sσ − (1 − s 2 )σ + (1 + s 2 )F + 2(N + α)s F 2 − 2(1 − s 2 )σ − 2(N + α)s 2 [(N + α)s + F ] − s[(N + α)s + F ]2

2

= −4s 2 [(N + α)s + F ]2 {N (N + 2α) + 2sσ + 2(N + α)s[(N + α)s + F ]} , (4.5) ! where α = ±1/2 and F := (N + α)2 s 2 − 2(1 − s 2 )σ and with the boundary condition D,N R D,N (s; ξ ) ∼ (ξ − 1)N−1 ρN (s; s)

(4.6)

s→1

∼ (ξ − 1)N−1

s→1

(N + α)(N + 2α + 1) (1 − s)α , − 1)!(α + 1)(α + 2)

(4.7)

2α+1 (N

and R0 := R0D,N (s; ξ ) satisfies s(1 − s 2 )2 R0 + 2(1 − s 2 )(1 − 2s 2 )R0 + 8sR0 (2sR0 − (N + α)/2)(2sR0 − N − α) − 2(1 − s 2 + 2α 2 )sR0 = 4 −(N + α)s + 2(1 + s 2 )R0 × (1 − s 2 )2 (R0 + sR0 )2 + 4(sR0 )2 [(2sR0 − N − α)2 − α 2 ] ,

2

(4.8)


with the boundary condition R0D,N (s; ξ ) ∼ (ξ − 1)N−1 KND,N (−s, s)

(4.9)

s→1

(N + 2α + 1) (1 − s)α . (4.10) s→1 − 1)! 2 (α + 1) Proof. Both cases come under the symmetric Jacobi weight discussed in [49], however this study doesn’t contain sufficient details for our purposes. From Proposition 8 of the above we can derive an integral of the system of differential equations. Combining (4.26) and (4.27) we have ∼ (ξ − 1)N−1

2α+1 (N

(1 − s 2 )(qp) − [β0 + u(2α1 −1)]p2 + [γ0 − w(2α1 +1)]q 2 = 0

(4.11)

and we can rewrite (4.23) as σ − [β0 + u(2α1 −1)]p 2 − [γ0 − w(2α1 +1)]q 2 − 2α1 sqp − 2s −1 (1 − s 2 )q 2 p 2 = 0. (4.12) Adding (4.12) to 2α1 times (4.11) and employing (4.24, 25, 28) we find the sum to be a perfect derivative, and given that R0 , σ, u, w all vanish as s → 1 the integral is [β0 + u(2α1 −1)][γ0 − w(2α1 +1)] − 2sσ − 4α1 (1 − s 2 )qp = β0 γ0 = N (N + 2α). Given this integral we can follow the procedure in the proof of Proposition 11 in [48] to derive second-order differential equations for σ and R0 , and Eqs. (4.5), (4.8) are the results. The boundary conditions are found from the kernels N !(N + 2α + 1) KN (t, t) = 2α+1 2 (1 − t 2 )α 2 (N + α) (α+1,α) (α,α+1) (α,α) (α+1,α+1) × PN−1 (t)PN−1 (t) − PN (t)PN−2 (t) , KN (−t, t) = (−)N−1 (α,β)

where the PN

N !(N + 2α + 1) (1 − t 2 )α (α,α) (α,α) PN−1 (t)PN (t), 22α+1 (N + α)(N + α + 1) t

(t) is the standard Jacobi polynomial.

Again the two resolvent kernels are connected by d σ = −2(N + α)sR0 − 2(1 − s 2 )R02 . ds This system was solved in terms of PVI transcendent w(x) with parameters α = 1/8,

β = −1/2(N + α − /2)2 ,

γ = 0,

δ = 1/2(1 − α 2 )

according to 1 d w−1 w− √ R0 (s) = √ [(w + x) − 2(N + α)x] , 2 xw dx 4 x(x − 1) w d w − (w − 1)(w + x) 2 2(x − 1) dx σ (s) = − , √ 8 xw(w − 1)(w − x) √ x(w − 1) − N (N + 2α)w − (N + α)2 x , 2w(w − x) √ where s = x and = ±1.


We remark that the thermodynamic form of $\rho^{D,N}_N(x; y)$ for $x$ and $y$ fixed but general has been studied in the spirit of the paper of Jimbo et al. by Kojima [27], although the final characterisation obtained (in terms of an integrable differential system) does not appear to be amenable to computation.

4.3. Thermodynamic limit. In the thermodynamic limit, Jimbo et al. [22] were able to identify the first Fredholm minor directly as a τ-function, and so had no need for the formula (4.3). Nonetheless it can be used to evaluate $\rho_\infty(x; 0)$ in terms of a solution of (1.8), as we will now demonstrate. First, repeating the working which led to (4.3) shows that with $K_J$ the integral operator on $J = (-x, x)$ with kernel $K(x, y) = \sin(x-y)/\pi(x-y)$ and $R = \xi K_J(1 - \xi K_J)^{-1}$,

$$\rho_\infty(-x; x) = \pi\rho_0\,R_\infty(-\pi\rho_0 x, \pi\rho_0 x; \xi)\exp\Big(-2\int_0^{\pi\rho_0 x}R_\infty(t, t; \xi)\,dt\Big)\Big|_{\xi=2}.$$

Furthermore, we know from [22] that

$$-\frac{t}{2}\,R_\infty\big(\tfrac14 t, \tfrac14 t; \xi\big) = h_V(-it; (0, 0, 0, 0)), \tag{4.13}$$

where $h_V(t; v)$ satisfies (1.8), subject to the boundary condition

$$h_V(-it; (0, 0, 0, 0)) \underset{t\to0}{\sim} -\xi\frac{t}{2\pi} - \xi^2\frac{t^2}{4\pi^2},$$

and we know too that

$$\frac{d}{dt}R_\infty(t, t; \xi) = 2\big(R_\infty(-t, t; \xi)\big)^2$$

(see e.g. [42]). Consequently $\rho_\infty(x; 0)$ can be expressed in terms of the transcendent (4.13) according to

$$\rho_\infty(x; 0) = \pi\rho_0\Big(-\frac{d}{dt}\frac{h_V(-it; (0, 0, 0, 0))}{t}\Big)^{1/2}\exp\Big(\int_0^t\frac{h_V(-iu; (0, 0, 0, 0))}{u}\,du\Big)\Big|_{t=2\pi\rho_0 x,\ \xi=2}. \tag{4.14}$$

This is to be compared against the evaluation (1.5) due to Jimbo et al. [22], with the substitution (1.9) and the boundary condition (1.7) generalised to be consistent with (2.38),

$$\rho_\infty(x; 0) = \rho_0\exp\Big(\int_0^t\frac{-1/2 + h_V(-iu; (1/2, -1/2, 1/2, -1/2))}{u}\,du\Big)\Big|_{t=2\pi\rho_0 x,\ \xi=2}, \tag{4.15}$$

$$h_V(-it; (1/2, -1/2, 1/2, -1/2)) \underset{t\to0}{\sim} \frac12 + \frac{t^2}{12} + \frac{i\xi t^3}{48\pi}. \tag{4.16}$$


Equating the logarithmic derivatives of (4.14) and (4.16) gives the identity

$$h_V(s; (1/2, -1/2, 1/2, -1/2)) = \frac12 + h_V(s; (0, 0, 0, 0)) + \frac{s}{2}\frac{d}{ds}\log\Big(-\frac{d}{ds}\frac{h_V(s; (0, 0, 0, 0))}{s}\Big).$$

It remains to derive this directly from the Painlevé theory. In this regard we note that an identity with similar characteristics for the PII system can be deduced from a classical result of Gambier, while the analogous result for the PIII system has only recently been found [47].

5. Discussion

We will conclude with a discussion of our results. In relation to $\rho^C_N(x; 0)$ the recurrence relation (3.33) allows rapid and stable tabulation for very large values of $N$ for all $x \in [0, L)$. Hence numerical evaluations of the $\lambda_k$ in (1.3) can be carried out. Although for fixed $k$ the leading behaviour in $N$ of $\lambda_k$ is known from (1.4), of interest is the convergence of the $\lambda_k$ (appropriately scaled) to their thermodynamic value. Furthermore, the differential equation characterisation of $\rho^C_N(x; 0)$ given in Corollary 1 is well suited to generating the power series expansion about $x = 0$. This relates to the behaviour of $\lambda_k$ for $k$ large. A comprehensive study of such issues will be discussed in a forthcoming publication [9].

Our results show a remarkable Fermi-Bose correspondence at a mathematical level of characterising the density. It goes back to Girardeau that up to the sign under permutation of the coordinates, the ground state wave functions of the 1d free Fermi and impenetrable Bose systems are identical. This means that quantities depending only on the absolute value squared of $\psi_0$ are the same for both systems. The density matrix is not of this type, and so distinguishes the two systems. Nonetheless, we find that the same differential and difference equations characterise $\rho^C_N(x; 0)$ for both the Bose and Fermi systems; they are only distinguished at this mathematical level by the boundary conditions. For $\rho_\infty(x; 0)$ this same property was already observed by Jimbo et al. [22]. One would like to be able to use the differential equation to see how the different prescribed small $x$ behaviours imply different behaviours at large $x$. In the case of $\rho_\infty(x; 0)$, Jimbo et al. [22] were able to do this and so obtain the expansion

$$\rho_\infty(x; 0) \sim \rho_0\frac{A}{\sqrt{\rho_0 x}}\Big[1 + \frac{1}{8(\pi\rho_0 x)^2}\Big(\cos(2\pi\rho_0 x) - \frac14\Big) + O\Big(\frac{1}{x^4}\Big)\Big], \tag{5.1}$$

which was first derived by Vaidya and Tracy [44, 45]. A challenging problem is to use Corollary 1 to deduce higher order terms in the expansion (1.4) of $\rho^C_N(x; 0)$ for large $N$, with $x/N$ fixed. Note that in the Fermi case this expansion can be read off from (2.31).

For the density matrix $\rho^H_N(x; y)$ (and similarly $\rho^{D,N}_N(x; y)$) we have presented the determinant formulation specified by (2.19), (2.22), as well as the formulation (2.27) as an average over the eigenvalues of the GUE. Both these forms are suitable for the numerical computation of $\rho^H_N(x; y)$ for general $x$ and $y$. In the special case that $(x; y) \mapsto (-x; x)$ we have given an explicit functional form in terms of the transcendents related to PV. This provides a much more efficient numerical scheme for the computation of $\rho^H_N(-x; x)$, and will provide a valuable test for the accuracy of Monte Carlo evaluation via (2.27).


With this achieved, the next step in the determination of the occupation numbers is the numerical solution of the integral equation (1.2). In addition to the numerical evaluation of the occupations, one would like to determine the exact leading asymptotic behaviour of $\lambda_i$, $i$ fixed, from an asymptotic expression for $\rho^H_N(x; y)$ analogous to (1.4). Thus we seek the asymptotic expansion of the determinant specified in (2.19), (2.22), which can be considered as a Hankel generalisation of a Fisher-Hartwig type Toeplitz determinant, or equivalently the asymptotic expansion of the random matrix average (2.27). On this latter viewpoint of the problem, we recall that the original Szegö theorem for the asymptotic form of the Toeplitz determinants with smooth, positive symbols has been proven by Johansson [23] starting from the analogue of the random matrix average (2.26) and then generalised to an analogous theorem for Hankel determinants relating to Jacobi averages [24]. We note too that some mappings of Fisher-Hartwig symbols in the Toeplitz case to analogous symbols in the Hankel case are known [2].

Another aspect of the present work which provides the beginning for future studies is our derivation of the recurrence relation (3.33) using orthogonal polynomial theory. As we have commented, the recurrence obtained via this method coincides with the recurrence obtained by Adler and van Moerbeke using methods from soliton theory. Of course one would like to understand the underlying reason for this coincidence. Also, recent works of Borodin and Boyarchenko [3, 4], starting from a formulation in terms of the discrete Riemann-Hilbert problem, provide alternative recurrences involving discrete Painlevé equations for closely related Toeplitz determinants. It remains to understand the relationship between the different approaches.

Note added. Subsequent to the completion of this work an asymptotic form analogous to (1.4) has been derived in [9] for the harmonic well case, and from this it is deduced that, as with periodic boundary conditions, the $\lambda_k$ for fixed $k$ are proportional to $\sqrt{N}$.

Acknowledgement. This research has been supported by the Australian Research Council. NSW thanks V. Korepin and A. Fetter for comments and suggestions on this work.

References 1. Adler, M., van Moerbeke, P.: Recursion relations for unitary integrals, Combinatorics and the Toeplitz lattice. math-ph/0201063 2. Basor, E.L., Ehrhardt, T.: Some identities for determinants of structured matrices. Linear Algebra Appl. 343/344, 5–19 (2002). Special issue on structured and infinite systems of linear equations 3. Borodin, A.: Discrete gap probabilities and discrete Painlevé equations. math-ph/0111008 4. Borodin, A., Boyarchenko, D.: Distribution of the first particle in discrete orthogonal polynomial ensembles. math-ph/0204001 5. Deift, P., Its, A., Kapaev, A., Zhou, X.: On the algebro-geometric integration of the Schlesinger equations. Commun. Math. Phys. 203(3), 613–633 (1999) 6. Dettmer, S., Hellweg, D., Ryytty, P., Arlt, J.J., Ertmer, W., Sengstock, K., Petrov, D.S., Shlyapnikov, G. V., Kreutzmann, H., Santos, L., Lewenstein, M.: Observation of phase fluctuations in elongated Bose-Einstein condensates. Phys. Rev. Lett. 87(16), 160406–1–160406–4 (2001) 7. Fisher, M.E., Hartwig, R.E.: Toeplitz determinants: some applications, theorems and conjectures. Ad. Chem. Phys. 15, 333–353 (1968) 8. Forrester, P.J.: Log Gases and Random Matrices. http://www.ms.unimelb.edu.au/~matpjf/matpjf.html 9. Forrester, P.J., Frankel, N.E., Garoni, T., Witte, N.S.: Finite one dimensional impenetrable Bose systems: Occupation numbers. In press Phys. Rev. A 10. Forrester, P.J., Witte, N.S.: Application of the τ -function theory of Painlevé equations to random matrices: PIV, PII and the GUE. Commun. Math. Phys. 219, 357–398 (2001)


11. Forrester, P.J., Witte, N.S.: Application of the τ -function theory of Painlevé equations to random matrices: PVI, the JUE, CyUE, cJUE and scaled limits. math-ph/0204008 12. Girardeau, M.D.: Relationship between systems of impenetrable bosons and fermions in one dimension. J. Math. Phys. 1, 516–523 (1960) 13. Girardeau, M.D., Wright, E.M., Triscari, J.M.: Ground-state properties of a one-dimensional system of hard-core bosons in a harmonic trap. Phys. Rev. A 63, 033601–1–033601–6 (2001) 14. Görlitz, A., Vogels, J.M., Leanhardt, A.E., Raman, C., Gustavson, T.L., Abo-Shaeer, J.R., Chikkatur, A.P., Gupta, S., Inouye, S., Rosenband, T., Ketterle, W.: Realization of Bose-Einstein condensates in lower dimensions. Phys. Rev. Lett. 87(13), 130402–1–130402–4 (2001) 15. Greiner, M., Bloch, I., Mandel, O., Hänsch, T.W., Esslinger, T.: Exploring phase coherence in a 2d lattice of Bose-Einstein condensates. Phys. Rev. Lett. 87(16), 160405–1–60405–4 (2001) 16. Hartwig, R.E., Fisher, M.E.: Asymptotic behavior of Toeplitz matrices and determinants. Arch. Rational Mech. Anal. 32, 190–225 (1969) 17. Hitchin, N.J.: Twistor spaces, Einstein metrics and isomonodromic deformations. J. Differ. Geom. 42(1), 30–112 (1995) 18. Ince, E.L.: Ordinary Differential Equations. New York: Dover, 1956 19. Ismail, M.E.H., Witte, N.S.: Discriminants and functional equations for polynomials orthogonal on the unit circle. J. Approx. Theory 110(2), 200–228 (2001) 20. Its, A.R., Izergin, A.G., Korepin, V.E., Slavnov, N.A.: Differential equations for quantum correlation functions. In: Proceedings of the Conference on Yang-Baxter Equations, Conformal Invariance and Integrability in Statistical Mechanics and Field Theory, volume 4, 1990, pp. 1003–1037 21. Jimbo, M., Miwa, T.: Monodromy preserving deformation of linear ordinary differential equations with rational coefficients. II. Phys. D 2(3), 407–448 (1981) 22. 
Jimbo, M., Miwa, T., Môri, Y., Sato, M.: Density matrix of an impenetrable Bose gas and the fifth Painlevé transcendent. Phys. D 1(1), 80–158 (1980) 23. Johansson, K.: On Szeg˝o’s asymptotic formula for Toeplitz determinants and generalizations. Bull. Sci. Math. (2) 112(3), 257–304 (1988) 24. Johansson, K.: On random matrices from the compact classical groups. Ann. of Math. (2) 145(3), 519–545 (1997) 25. Kajiwara, K., Masuda, T., Noumi, M., Ohta, Y., Yamada, Y.: Determinant formulas for the Toda and discrete Toda equations. Funkcialaj Ekvacioj 44, 291–307 (2001). solv-int/9908007 26. Kitaev, A.V., Korotkin, D.A.: On solutions of the Schlesinger equations in terms of θ -functions. Internat. Math. Res. Notices (17), 877–905 (1998) 27. Kojima, T.: Ground-state correlation functions for an impenetrable Bose gas with Neumann or Dirichlet boundary conditions. J. Statist. Phys. 88(3–4), 713–743 (1997) 28. Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum inverse scattering method and correlation functions. Cambridge: Cambridge University Press, 1993 29. Lenard, A.: Momentum distribution in the ground state of the one-dimensional system of impenetrable bosons. J. Math. Phys. 5(7), 930–943 (1964) 30. Lenard, A.: One-dimensional impenetrable bosons in thermal equilibrium. J. Math. Phys. 7(7), 1268–1272 (1966) 31. Lenard, A.: Some remarks on large Toeplitz determinants. Pacific J. Math. 42, 137–145 (1972) 32. Lieb, E.H., Liniger, W.: Exact analysis of an interacting Bose gas. I. The general solution and the ground state. Phys. Rev. (2) 130, 1605–1616 (1963) 33. Magnus, A.: MAPA3072A Special topics in approximation theory 1999–2000: Semi-classical orthogonal polynomials on the unit circle. http://www.math.ucl.ac.be/~magnus/ 34. Malmquist, J.: Sur les équations différentielles du second ordre dont l’intégrale générale a ses points critiques fixes. Arkiv Mat. Astron. Fys. 18(8), 1–89 (1922) 35. Muir, T.: The Theory of Determinants in the Historical Order of Development. 
New York: Dover, 1960 36. Noumi, M., Okada, S., Okamoto, K., Umemura, H.: Special polynomials associated with the Painlevé equations. II. In: Integrable Systems and Algebraic Geometry (Kobe/Kyoto, 1997), River Edge, NJ: World Sci. Publishing, 1998, pp. 349–372 37. Okamoto, K.: Studies on the Painlevé equations. I. Sixth Painlevé equation PVI . Ann. Mat. Pura Appl. (4) 146, 337–381 (1987) 38. Okamoto, K.: Studies on the Painlevé equations. II. Fifth Painlevé equation Pv . Japan. J. Math. (N.S.) 13(1), 47–76 (1987) 39. Olshanii, M.: Atomic scattering in the presence of an external confinement and a gas of impenetrable bosons. Phys. Rev. Lett. 81(5), 938–941 (1998) 40. Proctor, R.A.: Odd symplectic groups. Invent. Math. 92(2), 307–332 (1988) 41. Szegö, G.: Orthogonal Polynomials. Colloquium Publications 23. Providence, Rhode Island: American Mathematical Society, third edition, 1967

Painlev´e Transcendent Evaluations of Density Matrices

285

42. Tracy, C.A., Widom, H.: Introduction to random matrices. In: Geometric and Quantum Aspects of Integrable Systems (Scheveningen, 1992), Berlin: Springer 1993, pp. 103–130 43. Tracy, C.A., Widom, H.: Fredholm determinants, differential equations and matrix models. Commun. Math. Phys. 163(1), 33–72 (1994) 44. Vaidya, H.G., Tracy, C.A.: One-particle reduced density matrix of impenetrable bosons in one dimension at zero temperature. Phys. Rev. Lett. 42(1), 3–6 (1979) 45. Vaidya, H.G., Tracy, C.A.: One particle reduced density matrix of impenetrable bosons in one dimension at zero temperature. J. Math. Phys. 20(11), 2291–2312 (1979) 46. Widom, H.: Toeplitz determinants with singular generating functions. Am. J. Math. 95, 333–383 (1973) 47. Witte, N.S.: New transformations for Painlevé’s third transcendent. To appear Proc. Amer. Math. Soc. 48. Witte, N.S., Forrester, P.J.: Gap probabilities in the finite and scaled Cauchy random matrix ensembles. Nonl. 13, 1965–1986 (2000) 49. Witte, N.S., Forrester, P.J., Cosgrove, C.M.: Gap probabilities for edge intervals in finite Gaussian and Jacobi unitary matrix ensembles. Nonl. 13, 1439–1464 (2000) Communicated by L. Takhtajan

Commun. Math. Phys. 238, 287–304 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0846-0


The General O(n) Quartic Matrix Model and Its Application to Counting Tangles and Links

P. Zinn-Justin

Laboratoire de Physique Théorique et Modèles Statistiques, Université Paris-Sud, Bâtiment 100, 91405 Orsay Cedex, France

Received: 8 August 2001 / Accepted: 27 January 2003
Published online: 5 May 2003 – © Springer-Verlag 2003

Abstract: The counting of alternating tangles in terms of their crossing number, number of external legs and connected components is presented here in a unified framework using quantum field-theoretic methods applied to a matrix model of colored links. The overcounting related to topological equivalence of diagrams is removed by means of a renormalization scheme of the matrix model; the corresponding “renormalization equations” are derived. Some particular cases are studied in detail and solved exactly.

1. Introduction

The goal of this paper is to investigate a fairly general enumeration problem related to the theory of knots, links and tangles: we want to count objects which live in 3-dimensional space and are (loosely) made of a certain collection of “strings”, some of which open (with fixed endpoints) and some closed on themselves, intertwined together in an alternating way. As usual in knot theory, these objects will be considered up to topological equivalence (deformation or ambient isotopy), and represented by their projections on the plane; we shall then classify them according to the (minimal) number of crossings, the number of connected components, and the way the various “external legs” connect to each other, see for example Fig. 1.

Without going into too much detail for now, we see that a convenient way to keep track of the number of connected components and of the connections of the external legs is to use colors, see Fig. 2. The colors allow us to distinguish the various external legs and add an extra power series variable in the theory (the number of colors n) to count separately objects with different numbers of connected components. The idea to use colors was already suggested in [11, 10] and present in the work [12].

At this stage it is natural to define a matrix model whose Feynman diagram expansion will produce such diagrams with n colors.
Here we shall give a unified quantum field-theoretic treatment of this O(n)-invariant matrix model, which simplifies and generalizes


Fig. 1. A diagram with 3 open and 1 closed line, intertwined together with 7 crossings

Fig. 2. Coloring the diagram of Fig. 1. Open lines have fixed colors (distinct from each other), represented here by solid/dashed/dotted lines, whereas closed lines have arbitrary color

the equations obtained in [12] (Sect. 2 below). In particular, it gives a practical way to do the enumeration by computer; this procedure was recently used in the numerical work [4].

Even though the matrix model we propose is fairly natural, since as we shall see it is the most general quartic O(n)-invariant matrix model with a single trace in the action, it is in general unsolvable (or at least unsolved). It can be thought of as describing a statistical model on random dynamical lattices; more precisely, it is a model of fully packed loops drawn in n colors on random tetravalent planar diagrams with weights attached to vertices (intersections or tangencies of loops). Even the corresponding model on a regular (flat) square lattice is not fully understood. However it is tempting to speculate on the universality class of the latter; and that putting it on random lattices will correspond, in the continuum limit, to the usual coupling of two-dimensional conformal field theory to gravity, which enables us to predict the critical exponents of the theory based on the KPZ relation [6]. This in turn leads to various conjectures on the asymptotic number of large links and tangles, made in [10], which have been tested numerically in [4]. We shall not come back to these conjectures here, but instead produce exact analytic solutions of two particular cases of our matrix model (Sect. 3 below): the classical case n = 1 (no colors), with some generalizations of the results of [11]; and the case n = −2, which is interesting because its asymptotic behavior cannot be obviously guessed by the universality arguments mentioned above.

2. General Principle

We assume the reader is familiar with the concept of links and tangles. Let us recall here that once projected on a plane, they give rise to planar diagrams with tetravalent vertices which must be “decorated” to distinguish under/over-crossings. Link diagrams are closed, whereas tangle diagrams have external legs. The diagrams are said to be alternating if one meets undercrossings and overcrossings alternatingly as one follows the various closed loops of the diagram. The alternating property allows one to ignore the

Fig. 3. A planar Feynman diagram of (2.1) and the corresponding alternating link diagram

decorations of the vertices since they can be recovered from the diagram alone (up to a mirror symmetry for the closed diagrams, see below).

2.1. Definition of the O(n) matrix model. As in [10] and [12], we start with the following matrix integral over N × N hermitian matrices

$$Z^{(N)}(n,g) = \int \prod_{a=1}^{n} dM_a \, \exp\left[ N\,\mathrm{tr}\left( -\frac{1}{2}\sum_{a=1}^{n} M_a^2 + \frac{g}{4}\sum_{a,b=1}^{n} M_a M_b M_a M_b \right)\right], \tag{2.1}$$

where n is (for now) a positive integer. The integral is normalized so that $Z^{(N)}(n,0) = 1$. The partition function (2.1) displays an O(n) symmetry where the $M_a$ form a vector of O(n). Expanding in power series in g generates Feynman diagrams with double edges (“fat graphs”) drawn in n colors in such a way that colors cross each other at the vertices. By taking the large N limit one selects the planar diagrams, which are closely related to alternating link diagrams, cf. Fig. 3. More precisely, the large N “free energy”

$$F(n,g) = \lim_{N\to\infty} \frac{\log Z^{(N)}(n,g)}{N^2} \tag{2.2}$$

is a double generating function of the number $f_{k;p}$ of alternating link diagrams with k connected components and p crossings (weighted by the inverse of their symmetry factor, and with mirror images identified):

$$F(n,g) = \sum_{k=1}^{\infty}\sum_{p=1}^{\infty} f_{k;p}\, n^k g^p. \tag{2.3}$$

Note that it is clearly possible to analytically continue Z (N) (n, g) to arbitrary values of n (using, for example, a Hubbard–Stratonovitch transformation) so that Eq. (2.3) still holds. In particular, the counting of knot diagrams is given by f1;p and can be obtained by formally taking the limit n → 0, in the spirit of the replica method. Also, if n is an even negative integer one can write fermionic analogues of (2.1), see Sect. 3.2, which display Sp(|n|) symmetry. If one is interested in counting objects with a weight of 1, one cannot consider the free energy which corresponds to closed diagrams, but instead correlation functions of the

Fig. 4. Tangles of types 1 and 2

model which generate diagrams with external legs: these are essentially tangle diagrams. Typically, we shall be interested in the two-point function

$$G(n,g) \equiv \lim_{N\to\infty} \frac{1}{N}\left\langle \mathrm{tr}\, M_a^2 \right\rangle, \tag{2.4}$$

where the measure on the matrices $M_a$ is given by Eq. (2.1) and a is any fixed index, which generates tangle diagrams with two external legs; and the connected four-point functions

$$\Gamma_1(n,g) = \lim_{N\to\infty} \frac{1}{N}\left\langle \mathrm{tr}\,(M_a M_b)^2 \right\rangle, \tag{2.5.1}$$

$$\Gamma_2(n,g) = \lim_{N\to\infty} \frac{1}{N}\left\langle \mathrm{tr}\,(M_a^2 M_b^2) \right\rangle - G(n,g)^2, \tag{2.5.2}$$

where a and b are two distinct indices, which generate tangle diagrams with four external legs of type 1 and 2 (see Fig. 4). Note that the freedom to replace a link diagram with its mirror image by inverting all under/over-crossings is, in the case of correlation functions, removed by fixing conventionally the first crossing encountered starting from a given external leg. Let us briefly mention for now that the definition of G(n, g) again assumes n to be a positive integer, and has a natural continuation to any n; however the definitions of $\Gamma_i(n,g)$ are only meaningful for n integer greater than or equal to 2, and there is a difficulty associated to this, which will be explained in Sect. 2.3.

2.2. Renormalization of the O(n) model. The model presented above is not sufficient to properly count colored tangles. Essentially, this comes from the fact that there is not a one-to-one correspondence between diagrams and the objects they are obtained from by projection. This generates a redundancy in the counting since to a given knot will correspond many (an infinity of) diagrams, each counted once. In the case of alternating diagrams one can distinguish two steps to remove this redundancy. First one must find a way to restrict ourselves to reduced diagrams which contain no irrelevant crossings (Fig. 5a); such diagrams will have minimum number of crossings. It turns out to be


Fig. 5. (a) An irrelevant crossing. (b) A non-prime link

Fig. 6. A flype

convenient to introduce at this point a closely related notion: a link is said to be prime if it cannot be decomposed into two pieces in the way depicted on Fig. 5b. It is clear that at the level of diagrams, forbidding decompositions of the type of Fig. 5b automatically implies that the diagram is reduced; and we shall therefore restrict ourselves to prime links and tangles. There may still be several reduced diagrams corresponding to the same link: according to the flyping conjecture, proved in [8], two such diagrams are related by a finite sequence of flypes, see Fig. 6.

To summarize, there are two problems: a) the diagrams generated by applying Feynman rules are not necessarily reduced or prime; b) several reduced diagrams may correspond to the same knot due to the flyping equivalence. A study of Figs. 5 and 6 shows that this “overcounting” is local in the diagrams in the sense that problem a) is related to the existence of sub-diagrams with 2 external legs, whereas problem b) is related to a certain class of sub-diagrams with 4 external legs. Clearly such graphs can be cancelled by the inclusion of appropriate counterterms in the action. We are therefore led to the conclusion that we must renormalize the quadratic and quartic interactions of (2.1).

Now renormalization theory tells us that we should include in the action from the start every term compatible with the symmetries of the model, since they will be generated dynamically by the renormalization. In order to preserve connectedness we only look for terms of the form of a single trace (of a U(N)-invariant expression). A key observation is that, while there is only one such quadratic O(n)-invariant term, there are two quartic O(n)-invariant terms, which leads to a generalized model with 3 coupling constants in the action (bare coupling constants):

$$Z^{(N)}(n,t,g_1,g_2) = \int \prod_{a=1}^{n} dM_a \, \exp\left[ N\,\mathrm{tr}\left( -\frac{t}{2}\sum_{a=1}^{n} M_a^2 + \frac{g_1}{4}\sum_{a,b=1}^{n} (M_a M_b)^2 + \frac{g_2}{2}\sum_{a,b=1}^{n} M_a^2 M_b^2 \right)\right]. \tag{2.6}$$

The Feynman rules of this model now allow loops of different colors to “avoid” each other, which one can imagine as tangencies (Fig. 7).


Fig. 7. Vertices of the generalized O(n) matrix model

We define again the correlation functions $G(n,t,g_1,g_2)$ and $\Gamma_i(n,t,g_1,g_2)$ (Eqs. (2.4) and (2.5)), and want to extract from them the counting of colored alternating tangles with external legs. The idea is to find the expressions of t(g), $g_1(g)$ and $g_2(g)$ as a function of the renormalized coupling constant g, in such a way that the overcounting is suppressed and the correlation functions are generating series in g of the number of colored tangles. At leading order, we shall have t(g) = 1 + o(1), $g_1(g) = g + o(g)$ and $g_2(g) = o(g)$ so that we recover the original model (2.1). However there will be higher order corrections which correspond to the counterterms.

Let us consider t(g) first. It is clear that one must remove all two-legged subdiagrams, that is impose

$$G(n,t,g_1,g_2) = 1. \tag{2.7}$$

Let us see more explicitly how this fixes t(g). Noting that (Fig. 8)

$$G = \frac{1}{t - \Sigma}, \tag{2.8}$$

where $\Sigma$ is the generating function of 1PI (one-particle irreducible, i.e. which cannot be made disconnected by removing one edge) two-legged diagrams, one finds equivalently that

$$t(g) = 1 + \Sigma(g), \tag{2.9}$$

i.e. the counterterms generated by t (g) must cancel all 1PI two-legged subdiagrams. This is almost a tautology; notice however that one must not cancel all two-legged subdiagrams, since one-particle reducible diagrams would be subtracted multiple times. Next, we must consider the flyping equivalence. Again, it is important to notice that a flype can be made of several “elementary” flypes (Fig. 9), an elementary flype being by definition one that cannot be decomposed any more in this way. In the terminology

Fig. 8. Decomposition of the two-point function, $G = t^{-1} + t^{-1}\Sigma\, t^{-1} + t^{-1}\Sigma\, t^{-1}\Sigma\, t^{-1} + \cdots$. Reexpanding in powers of t − 1 will cancel the powers of $\Sigma$ iff $t = 1 + \Sigma$

Fig. 9. Breaking a flype into two elementary flypes

of QFT, an elementary flype consists precisely of one simple vertex connected by two edges to a non-trivial H-2PI (two-particle irreducible in the horizontal channel) tangle diagram. Non-trivial means not reduced to a single vertex; H-2PI means that the tangle diagram cannot be cut into two pieces containing the left and right external legs respectively, by removing two edges. We therefore need to introduce auxiliary generating functions $H_1(g)$, $H_2(g)$ and $V_2(g)$ for non-trivial H-2PI tangles of type 1, of type 2 and of type 2 rotated by π/2 respectively. Only these must be included in the counterterms. It is now a simple matter to consider all possible insertions of elementary flypes as tangle sub-diagrams of a diagram; taking into account the two types of tangle sub-diagrams and the two channels (horizontal and vertical), we find (Fig. 10) that the renormalization of $g_1$ and $g_2$ is simply:

$$g_1(g) = g\,(1 - 2H_2(g)), \tag{2.10.1}$$

$$g_2(g) = -g\,(H_1(g) + V_2(g)). \tag{2.10.2}$$

Fig. 10. Counterterms needed to cancel flypes

All that is left is to find the expressions of the auxiliary generating functions in terms of known quantities. They are easily obtained by decomposing the four-point functions in the horizontal and vertical channels, and will not be rederived here (the reader is referred to e.g. [12] for details).

$$H_2 \pm H_1 = 1 - \frac{1}{(1 \mp g)(1 + \Gamma_2 \pm \Gamma_1)}, \tag{2.11a}$$

$$H_2 + nV_2 + H_1 = 1 - \frac{1}{(1 - g)(1 + (n+1)\Gamma_2 + \Gamma_1)}. \tag{2.11b}$$
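As a minimal sketch of how Eqs. (2.10) and (2.11) fit together, the following Python fragment solves the three linear relations (2.11) for $H_1$, $H_2$, $V_2$ and then evaluates the bare couplings of (2.10). The function names and the use of exact rationals are illustrative assumptions, not part of the paper; the sketch also assumes n ≠ 0, since otherwise (2.11b) does not determine $V_2$.

```python
from fractions import Fraction


def h2pi_functions(n, g, gamma1, gamma2):
    """Solve Eqs. (2.11) for (H1, H2, V2) at given g, Gamma_1, Gamma_2.

    Hypothetical helper for illustration; requires n != 0.
    """
    if n == 0:
        raise ValueError("Eq. (2.11b) does not determine V2 when n = 0")
    plus = 1 - 1 / ((1 - g) * (1 + gamma2 + gamma1))    # H2 + H1 (upper signs)
    minus = 1 - 1 / ((1 + g) * (1 + gamma2 - gamma1))   # H2 - H1 (lower signs)
    total = 1 - 1 / ((1 - g) * (1 + (n + 1) * gamma2 + gamma1))  # H2 + n V2 + H1
    h2 = (plus + minus) / 2
    h1 = (plus - minus) / 2
    v2 = (total - plus) / n
    return h1, h2, v2


def bare_couplings(g, h1, h2, v2):
    """Bare couplings (g1, g2) from Eqs. (2.10.1) and (2.10.2)."""
    return g * (1 - 2 * h2), -g * (h1 + v2)


if __name__ == "__main__":
    # Arbitrary rational test values, chosen only to exercise the formulas.
    g, gam1, gam2 = Fraction(1, 10), Fraction(1, 8), Fraction(1, 50)
    h1, h2, v2 = h2pi_functions(3, g, gam1, gam2)
    print(bare_couplings(g, h1, h2, v2))
```

For n = 1 the sum $H_1 + H_2 + V_2$ returned by this helper depends on the $\Gamma_i$ only through $\Gamma_1 + 2\Gamma_2$, and for n = −2 the combination $V_2 - 2H_2 + H_1$ depends only on $\Gamma_1 - \Gamma_2$, which is the closure property used in Sect. 3.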

2.3. Summary and discussion. We shall now summarize and rewrite more explicitly the formulae found previously, as well as discuss their implications. Let us assume that for a certain n, we have computed the free energy $F(n,t,g_1,g_2)$. What can we extract from the formulae of the previous paragraph, and how? First, let us differentiate F; we find

$$G = -2\,\frac{\partial}{\partial t}\frac{F}{n}, \tag{2.12}$$

as well as two other quantities,

$$F_1 = 4\,\frac{\partial}{\partial g_1}\frac{F}{n} = \lim_{N\to\infty}\frac{1}{N}\,\frac{1}{n}\Big\langle \mathrm{tr}\sum_{a,b}(M_a M_b)^2 \Big\rangle, \tag{2.13.1}$$

$$F_2 = 2\,\frac{\partial}{\partial g_2}\frac{F}{n} = \lim_{N\to\infty}\frac{1}{N}\,\frac{1}{n}\Big\langle \mathrm{tr}\sum_{a,b} M_a^2 M_b^2 \Big\rangle. \tag{2.13.2}$$

According to the equations of motion, these three quantities are not independent:

$$tG = 1 + g_1F_1 + 2g_2F_2. \tag{2.14}$$

Comparing (2.13) with the definition (2.5) of the $\Gamma_i$, we see that there are two different choices of basis of the four-point functions;¹ using O(n)-invariance of the measure it is easy to relate them:

$$F_1 = n\Gamma_1 + 2(\Gamma_2 + G^2), \tag{2.15.1}$$

$$F_2 = \Gamma_1 + (n+1)(\Gamma_2 + G^2). \tag{2.15.2}$$

These relations also have a simple diagrammatic interpretation, which proves in particular that they are valid for any (complex) n. One observes that relations (2.15) can be inverted to extract $\Gamma_1$ and $\Gamma_2$ only if n ≠ 1, −2. These two cases will be the object of study of the next section; they are the first in the series of bosonic/fermionic matrix models; and as will be shown these are the values of n for which the model possesses only one quartic O(n)-invariant, contrary to the generic case. For now we simply observe that for n = 1, $F_1 = F_2$ and therefore according to (2.13), the free energy F is a function of $g_1 + 2g_2$ only; while for n = −2, $F_1 = -2F_2$ and F is a function of $g_1 - g_2$ only.

¹ Using the $\Gamma_i$ as the preferred basis is not only natural diagrammatically; it is also imposed by the structure of the equations such as (2.10) and (2.11).
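The change of basis (2.15) is a 2×2 linear system in $\Gamma_1$ and $\Gamma_2 + G^2$ whose determinant is $n(n+1) - 2 = (n-1)(n+2)$, which is why the inversion fails exactly at n = 1 and n = −2. A small Python sketch of both directions follows; the function names are hypothetical and exact rationals are used only to make the round trip checkable.

```python
from fractions import Fraction


def invariant_basis(n, gamma1, gamma2, G):
    """(F1, F2) from (Gamma_1, Gamma_2, G) via Eqs. (2.15)."""
    x = gamma2 + G * G
    return n * gamma1 + 2 * x, gamma1 + (n + 1) * x


def tangle_basis(n, F1, F2, G):
    """Invert Eqs. (2.15); illustrative helper, undefined at n = 1 and n = -2."""
    det = (n - 1) * (n + 2)          # determinant of the 2x2 system
    if det == 0:
        raise ValueError("relations (2.15) are not invertible for n = 1 or n = -2")
    gamma1 = ((n + 1) * F1 - 2 * F2) / Fraction(det)
    x = (n * F2 - F1) / Fraction(det)  # x = Gamma_2 + G^2
    return gamma1, x - G * G


if __name__ == "__main__":
    # Arbitrary rational inputs, used only to show the round trip.
    n, g1, g2, G = 3, Fraction(1, 7), Fraction(2, 5), Fraction(3, 2)
    F1, F2 = invariant_basis(n, g1, g2, G)
    print(tangle_basis(n, F1, F2, G))
```

At n = 1 the forward map gives $F_1 = F_2$, and at n = −2 it gives $F_1 = -2F_2$, so in both cases only one combination of the $\Gamma_i$ survives.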

Once we have computed G, $\Gamma_1$ and $\Gamma_2$, we can slightly simplify the renormalization equations using the obvious scaling property:

$$G(n,t,g_1,g_2) = \frac{1}{t}\,G(n,1,g_1/t^2,g_2/t^2), \tag{2.16a}$$

$$\Gamma_i(n,t,g_1,g_2) = \frac{1}{t^2}\,\Gamma_i(n,1,g_1/t^2,g_2/t^2). \tag{2.16b}$$

Combining this with Eq. (2.7) results in fixing t(g):

$$t(g) = G(n,1,g_1(g)/t(g)^2,g_2(g)/t(g)^2). \tag{2.17}$$

At this stage the three unknowns t(g), $g_1(g)$, $g_2(g)$ only appear through the combinations $h_1(g) \equiv g_1(g)/t(g)^2$, $h_2(g) \equiv g_2(g)/t(g)^2$; in particular we have the following expressions for the $\Gamma_i \equiv \Gamma_i(n,t(g),g_1(g),g_2(g))$:

$$\Gamma_i = \frac{\Gamma_i(n,1,h_1(g),h_2(g))}{G(n,1,h_1(g),h_2(g))^2}. \tag{2.18}$$

We only need to solve the two remaining renormalization equations (2.10), which we rewrite here:

$$h_1(g)\,G(n,1,h_1(g),h_2(g))^2 = g\,(1 - 2H_2(g)), \tag{2.19.1}$$

$$h_2(g)\,G(n,1,h_1(g),h_2(g))^2 = -g\,(H_1(g) + V_2(g)), \tag{2.19.2}$$

where the auxiliary generating functions are still given in terms of the $\Gamma_i$ by Eqs. (2.11). Finally, solving Eqs. (2.19) gives access to the $\Gamma_i$, which are the generating series of the numbers of prime alternating tangles of type i.

However, we can go further. By computing other correlation functions in the model and composing them with the solutions t(g), $g_1(g)$, $g_2(g)$ of the equations above, one can extract the generating functions of the number of alternating tangles with an arbitrary number of external legs. The correlation functions we consider are traces of non-commutative words in the $M_a$ of degree 2k (for 2k external legs). We usually restrict ourselves to connected correlation functions (free cumulants in the language of free probabilities), which exclude configurations in which some strings have no crossings with the other strings and can be pulled out altogether. This choice is only a matter of taste. For example, there are five O(n)-invariants of degree 6, except, as before, for a finite set of values of n for which there are fewer: only 4 for n = −4, 3 for n = 2, 2 for n = −2, 1 for n = 1. They are given by:

$$\Xi_1 = \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}(M_a M_b M_c M_a M_b M_c)\right\rangle - \text{disc. terms}, \tag{2.20.1}$$

$$\Xi_2 = \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}(M_a M_b M_c M_a M_c M_b)\right\rangle - \text{disc. terms}, \tag{2.20.2}$$

$$\Xi_3 = \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}(M_a M_a M_b M_c M_b M_c)\right\rangle - \text{disc. terms}, \tag{2.20.3}$$

$$\Xi_4 = \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}(M_a M_b M_b M_a M_c M_c)\right\rangle - \text{disc. terms}, \tag{2.20.4}$$

$$\Xi_5 = \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}(M_a M_a M_b M_b M_c M_c)\right\rangle - \text{disc. terms}, \tag{2.20.5}$$

(a, b, c distinct) and give rise to the various six-legged diagrams depicted on Fig. 11.

Fig. 11. The five types of tangles with 6 external legs ($\Xi_1$ to $\Xi_5$)

3. Application: Two Solvable Cases

There are currently two values of n for which the corresponding matrix model has been exactly solved: n = 1 and n = 2. The case n = 1 is particularly important since it corresponds to counting all alternating tangles regardless of the number of connected components; we shall investigate it here in detail, generalizing known results [9, 11]. The application of the O(n = 2) matrix model (also known as the six-vertex model on dynamical random lattices) to knot theory has already been made in [12], using slightly different methods than in the present paper, and we shall not come back to it. However, we have found earlier that aside from n = 1, there is another special value of n, namely −2, for which a simplification in the model occurs and we can expect some exact analytic results. We shall present below a brief analysis of this case.

3.1. The case n = 1: The usual tangles, and more. As an illustration of the general principle developed above, we present an elementary solution of the case n = 1, that is the counting of alternating tangles. Since there are no colors one cannot distinguish the way the various external legs are connected; the correlation functions available to us will be specified by the number of external legs only. Note that this solution, which generalizes the original calculation of the number of prime alternating tangles with 4 external legs found in [9], is technically different from it.

We start by setting n = 1 in the definition of the partition function (Eq. (2.6)); we find:

$$Z^{(N)}(t,g_0) = \int dM\, e^{N\,\mathrm{tr}\left(-\frac{t}{2}M^2 + \frac{g_0}{4}M^4\right)}, \tag{3.1}$$

where $g_0 \equiv g_1 + 2g_2$. The fact that the partition function only depends on a particular combination of $g_1$ and $g_2$ is consistent with what was found in Sect. 2.3 and related to the existence of only one quartic O(n)-invariant for n = 1. The most general “planar” correlation functions of the model are of the form

$$G_{2\ell}(t,g_0) \equiv \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}\,M^{2\ell}\right\rangle \tag{3.2}$$

for which we introduce the generating function:

$$\omega(\lambda) \equiv \lim_{N\to\infty}\frac{1}{N}\left\langle \mathrm{tr}\,\frac{1}{\lambda - M}\right\rangle = \frac{1}{\lambda} + \sum_{\ell=1}^{\infty}\frac{G_{2\ell}}{\lambda^{2\ell+1}} \tag{3.3}$$

and the corresponding connected correlation functions $G^c_{2\ell}$, whose generating function is the inverse function λ(ω):

$$\lambda(\omega) = \frac{1}{\omega} + \sum_{\ell=1}^{\infty} G^c_{2\ell}\,\omega^{2\ell-1}. \tag{3.4}$$

Among them we have the two-point function $G \equiv G^c_2 = G_2$ and the connected four-point function $\Gamma \equiv G^c_4 = G_4 - 2G_2^2$ which is nothing but the generating function of all tangles (regardless of type): $\Gamma = \Gamma_1 + 2\Gamma_2$. Since we do not have access to $\Gamma_1$ and $\Gamma_2$ separately, we need to recombine the equations of Sect. 2.2 so that only $\Gamma$ appears in them. Fortunately, this turns out to be possible; taking (2.10.1)+2×(2.10.2) results in

$$g_0(g) = g\,(1 - 2H(g)), \tag{3.5}$$

where $H(g) \equiv H_2(g) + H_1(g) + V_2(g)$ is the generating function of all H-2PI nontrivial tangles. Equation (3.5) can of course be derived directly in a manner similar to Eqs. (2.10), by simply disregarding the types of the tangles, i.e. how the outgoing strings are connected to each other inside the tangle. Setting n = 1 in Eq. (2.11b), we also find that

$$H(g) = 1 - \frac{1}{(1-g)(1+\Gamma)} \tag{3.6}$$

so that for n = 1 (and n = 1 only) we have a closed subset of equations.

We now turn to the solution of our matrix model. We do not repeat the calculation of the $G_{2\ell}$ here since it is a standard result of matrix models, see [1]. Starting from the following expression:

$$\omega(\lambda) = \frac{1}{2}\left[ t\lambda - g_0\lambda^3 - (t - g_0\lambda^2 - g_0A)\sqrt{\lambda^2 - 2A}\,\right] \tag{3.7}$$

with $A = \dfrac{t - \sqrt{t^2 - 12g_0}}{3g_0}$ solution of

$$3A^2g_0 - 2At + 4 = 0 \tag{3.8}$$

we find that

$$G_{2\ell} = A^{\ell}\,\frac{(2\ell-1)!!}{(\ell+2)!}\left(2(\ell+1) - \ell\,\frac{At}{2}\right). \tag{3.9}$$

In particular $G = \frac{1}{6}A\left(4 - \frac{At}{2}\right)$, and since G = 1 according to Eq. (2.7), we can express t as a function of A:

$$t = \frac{4}{A^2}(2A - 3). \tag{3.10}$$
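A quick consistency check of the moment formula (3.9), read here as $G_{2\ell} = A^\ell \frac{(2\ell-1)!!}{(\ell+2)!}\,(2(\ell+1) - \ell A t/2)$: in the Gaussian case $g_0 = 0$, $t = 1$, Eq. (3.8) gives A = 2 and the planar moments should reduce to the Catalan numbers. The Python sketch below (function names are illustrative assumptions) evaluates the formula with exact rationals and also checks that the choice (3.10) of t enforces $G_2 = 1$.

```python
from fractions import Fraction
from math import factorial


def odd_double_factorial(l):
    """(2l - 1)!! written as (2l)! / (2^l l!), so that l = 0 gives 1."""
    return factorial(2 * l) // (2 ** l * factorial(l))


def planar_moment(l, A, t):
    """G_{2l} from Eq. (3.9), as read above; A and t are exact rationals."""
    prefactor = Fraction(odd_double_factorial(l), factorial(l + 2))
    return A ** l * prefactor * (2 * (l + 1) - l * A * t / 2)


if __name__ == "__main__":
    # Gaussian model: g0 = 0, t = 1, hence A = 2 from Eq. (3.8).
    print([planar_moment(l, Fraction(2), Fraction(1)) for l in range(1, 6)])
```

The second property can be checked for any rational A by substituting $t = 4(2A-3)/A^2$ into `planar_moment(1, A, t)`.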

Similarly, using the explicit expression of $\Gamma = G_4 - 2$, plugging it into Eqs. (3.5), (3.6), and using Eqs. (3.8), (3.10) to express t and $g_0$ in terms of A results in the following fifth degree equation for A:

$$32(1-g) - 64(1-g)A + 32(1-g)A^2 - 4(1+2g-g^2)A^3 + 6g(1-g)A^4 - g(1-g)A^5 = 0. \tag{3.11}$$

A(g), specified by Eq. (3.11) and A(g = 0) = 2, is a well-defined analytic function of g in a neighborhood of g = 0. The data of A(g) is enough to recover all correlation functions since we have, combining Eqs. (3.9) and (3.10):

$$G_{2\ell} = 2A^{\ell-1}\,\frac{(2\ell-1)!!}{(\ell+2)!}\,\bigl(3\ell - (\ell-1)A\bigr). \tag{3.12}$$
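The quintic (3.11) can be solved order by order in g. The sketch below is one possible way to do this, not the procedure used in the paper: it treats A(g) as a truncated power series with exact rational coefficients and applies a simplified Newton step, using the fact that the derivative of the left-hand side with respect to A equals 16 at (A, g) = (2, 0). It then forms $\Gamma = G_4 - 2 = (A-2)(4-A)/4$, where $G_4 = A(6-A)/4$ is Eq. (3.12) at ℓ = 2. All function names are illustrative.

```python
from fractions import Fraction


def poly(order, *coeffs):
    """Power series in g with the given low-order coefficients, padded to g**order."""
    c = [Fraction(x) for x in coeffs][: order + 1]
    return c + [Fraction(0)] * (order + 1 - len(c))


def mul(a, b, order):
    """Product of two series, truncated at g**order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out


def add(*series):
    """Termwise sum of series of equal length."""
    return [sum(cs, Fraction(0)) for cs in zip(*series)]


def scale(c, a):
    """Multiply every coefficient of a series by the constant c."""
    return [c * x for x in a]


def quintic(A, order):
    """Left-hand side of Eq. (3.11), regrouped as
    32(1-g)(A-1)^2 - 4(1+2g-g^2)A^3 + g(1-g)A^4(6-A)."""
    A2 = mul(A, A, order)
    A3 = mul(A2, A, order)
    A4 = mul(A2, A2, order)
    Am1 = add(A, poly(order, -1))
    six_minus_A = add(poly(order, 6), scale(-1, A))
    t1 = scale(32, mul(poly(order, 1, -1), mul(Am1, Am1, order), order))
    t2 = scale(-4, mul(poly(order, 1, 2, -1), A3, order))
    t3 = mul(poly(order, 0, 1, -1), mul(A4, six_minus_A, order), order)
    return add(t1, t2, t3)


def solve_A(order):
    """Series A(g) with A(0) = 2; each pass fixes one more order in g."""
    A = poly(order, 2)
    for _ in range(order):
        A = add(A, scale(Fraction(-1, 16), quintic(A, order)))
    return A


def tangle_series(order):
    """Coefficients of Gamma(g) = (A - 2)(4 - A)/4 up to g**order."""
    A = solve_A(order)
    product = mul(add(A, poly(order, -2)), add(poly(order, 4), scale(-1, A)), order)
    return scale(Fraction(1, 4), product)


if __name__ == "__main__":
    print(tangle_series(6))
```

The first two nonzero coefficients, 1 and 2, can be confirmed by hand from (3.11); higher orders follow from the same iteration and can be compared with the tables in the appendix.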

Similarly, one can extract the connected correlation functions, using the fact that λ(ω) satisfies a cubic equation (cf. Eq. (3.7)); after a tedious calculation one finds

$$G^c_{2\ell} = \frac{c_\ell}{\ell!}\,(A-2)^{\ell-1}\,\bigl(3\ell - 2 - (\ell-1)A\bigr), \tag{3.13}$$

where $c_\ell$ is a constant (which already appeared in [1]):

$$c_{\ell+1} = \frac{1}{3\ell+1}\sum_{\ell/2 \le q \le \ell} (-4)^{q-\ell}\,\frac{(\ell+q)!}{(2q-\ell)!\,(\ell-q)!}. \tag{3.14}$$

This concludes the calculation of the generating series of the number of tangles with any given number of external legs. In the appendix, the first few orders of $G^c_4$, $G^c_6$, $G^c_8$ are given.
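To make the constants of Eq. (3.14) concrete, the short Python sketch below evaluates $c_\ell$ exactly and the connected functions of Eq. (3.13), both read as $c_{\ell+1} = \frac{1}{3\ell+1}\sum_{\ell/2\le q\le \ell}(-4)^{q-\ell}\frac{(\ell+q)!}{(2q-\ell)!(\ell-q)!}$ and $G^c_{2\ell} = \frac{c_\ell}{\ell!}(A-2)^{\ell-1}(3\ell-2-(\ell-1)A)$. The function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def c_const(l):
    """c_l for l >= 1, from Eq. (3.14) with the index shifted by one."""
    m = l - 1                                # Eq. (3.14) is written for c_{m+1}
    total = Fraction(0)
    for q in range((m + 1) // 2, m + 1):     # integers q with m/2 <= q <= m
        weight = Fraction(factorial(m + q), factorial(2 * q - m) * factorial(m - q))
        total += Fraction(-4) ** (q - m) * weight
    return total / (3 * m + 1)


def connected_function(l, A):
    """G^c_{2l} as a function of A, from Eq. (3.13)."""
    return c_const(l) / factorial(l) * (A - 2) ** (l - 1) * (3 * l - 2 - (l - 1) * A)


if __name__ == "__main__":
    print([c_const(l) for l in range(1, 6)])
```

Two checks tie this back to the text: `connected_function(1, A)` is identically 1, in line with G = 1, and `connected_function(2, A)` agrees with $\Gamma = G_4 - 2 = A(6-A)/4 - 2$.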

Let us now discuss the asymptotic behavior of the coefficients of the various series for which we found an exact expression. All are simple polynomials in A(g), so that we need to study the latter only. As can be easily checked, the singularity of A(g) closest to the origin is the usual singularity of 2D pure gravity, which is given by $g_{0c} = 4/27$ and $t_c = 4/3$, so that $A_c = 3$, and, plugging into Eq. (3.11),

$$g_c = \frac{\sqrt{21001} - 101}{270}. \tag{3.15}$$
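The critical values quoted above can be checked numerically. The following Python sketch evaluates the left-hand side of Eq. (3.11) and its derivative with respect to A at $A_c = 3$ and $g_c = (\sqrt{21001}-101)/270$; both should vanish, the second because the singularity is of square root type. It also recovers $g_{0c}$ and $t_c$ from $g_0 = 4(A-2)/A^3$, a relation obtained here by substituting Eq. (3.10) into Eq. (3.8), and from Eq. (3.10) itself. Function and variable names are illustrative.

```python
import math


def quintic(A, g):
    """Left-hand side of Eq. (3.11) in regrouped form."""
    return (32 * (1 - g) * (A - 1) ** 2
            - 4 * (1 + 2 * g - g * g) * A ** 3
            + g * (1 - g) * A ** 4 * (6 - A))


def quintic_dA(A, g):
    """Partial derivative of the left-hand side of Eq. (3.11) with respect to A."""
    return (64 * (1 - g) * (A - 1)
            - 12 * (1 + 2 * g - g * g) * A ** 2
            + g * (1 - g) * (24 * A ** 3 - 5 * A ** 4))


A_c = 3.0
g_c = (math.sqrt(21001) - 101) / 270
g0_c = 4 * (A_c - 2) / A_c ** 3      # from Eqs. (3.8) and (3.10)
t_c = 4 * (2 * A_c - 3) / A_c ** 2   # Eq. (3.10)

if __name__ == "__main__":
    print(g_c, quintic(A_c, g_c), quintic_dA(A_c, g_c), g0_c, t_c)
```

At A = 3 both functions reduce to the same quadratic $20 - 101g - 135g^2$, whose positive root is the value (3.15).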

We expand A around g ↑ $g_c$ and find

$$A = 3 - a\,(g_c - g)^{1/2} + b\,(g_c - g) + O\bigl((g_c - g)^{3/2}\bigr) \tag{3.16}$$

with (a > 0)

$$a^2 = \frac{2877137 + 7087\sqrt{21001}}{339696}, \qquad b = \frac{5\,(99397733 + 2127733\sqrt{21001})}{901510722}. \tag{3.17}$$

This provides the leading singular part of $G^c_{2\ell}$:

$$G^c_{2\ell} = \mathrm{reg} + \frac{c_\ell}{(\ell-2)!}\,a\left(b + \frac{\ell-2}{3}\,a^2\right)(g_c - g)^{3/2} + \cdots, \tag{3.18}$$

which finally yields the large order behavior of $G^c_{2\ell}$: if $G^c_{2\ell} = \sum_{p=1}^{\infty}\gamma_{2\ell;p}\,g^p$, then

$$\gamma_{2\ell;p} \underset{p\to\infty}{\sim} \frac{3}{4\sqrt{\pi}}\,\frac{c_\ell}{(\ell-2)!}\,a\left(b + \frac{\ell-2}{3}\,a^2\right)p^{-5/2}\,g_c^{3/2-p}. \tag{3.19}$$

For ℓ = 2 this result coincides with Theorem 1 of [9]. Note that for any ℓ the asymptotic behavior is the same up to a constant. One can of course send ℓ and p to infinity in a correlated manner to obtain a non-trivial scaling limit: here, the proper scaling is $\ell \propto p^{1/2}$. It is simpler to see this at the level of the double generating function ω(g, λ); inserting the various expansions around the critical point into Eq. (3.7) we find the following expression for the leading singularity:

$$\omega_{\mathrm{sing}} \sim \frac{2^{3/2}}{27}\left(2\sqrt{6}\,(\lambda - \lambda_c) - a\,(g_c - g)^{1/2}\right)\left(\sqrt{6}\,(\lambda - \lambda_c) + a\,(g_c - g)^{1/2}\right)^{1/2} \tag{3.20}$$

(with $\lambda_c = \sqrt{6}$). Up to a normalization of ω, $\lambda - \lambda_c$ and $g - g_c$, this result is simply the usual scaling loop function of pure gravity (see for example [2]), a manifestation of the fact that it is universal, and is not affected by the renormalization of the model.
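The coefficient a of the square root term in (3.16) can be cross-checked against Eq. (3.11). Writing P(A, g) for the left-hand side of (3.11), and using that P and ∂P/∂A both vanish at $(A_c, g_c) = (3, g_c)$, a second-order expansion gives $a^2 = 2\,\partial_g P / \partial_A^2 P$ at the critical point. The Python sketch below compares this with the closed form read from (3.17); it is a numerical illustration under that reading, with hypothetical function names, and it does not check b.

```python
import math


def dP_dg(A, g):
    """Partial derivative of the left-hand side of Eq. (3.11) with respect to g."""
    return (-32 * (A - 1) ** 2
            - 4 * (2 - 2 * g) * A ** 3
            + (1 - 2 * g) * A ** 4 * (6 - A))


def d2P_dA2(A, g):
    """Second partial derivative of the left-hand side of Eq. (3.11) with respect to A."""
    return (64 * (1 - g)
            - 24 * (1 + 2 * g - g * g) * A
            + g * (1 - g) * (72 * A ** 2 - 20 * A ** 3))


s = math.sqrt(21001)
g_c = (s - 101) / 270
a2_from_expansion = 2 * dP_dg(3.0, g_c) / d2P_dA2(3.0, g_c)
a2_closed_form = (2877137 + 7087 * s) / 339696

if __name__ == "__main__":
    print(a2_from_expansion, a2_closed_form)
```

Both numbers should agree to floating-point accuracy, which supports reading the first entry of (3.17) as a formula for $a^2$.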

3.2. The case n = −2: a fermionic matrix model. For n negative even integer, it is natural, in the spirit of supersymmetry, to look for realizations of our model of colored links under the form of a fermionic matrix model with Sp(−n) symmetry. Let us show how this works in the simplest case, that is n = −2. Our fields will be a “complex fermionic matrix”, that is a matrix $\Psi = (\Psi_{ij})$, where the $\Psi_{ij}$ are independent Grassmann variables, and its formal adjoint $\Psi^\dagger = (\bar\Psi_{ji})$, which together form the fundamental representation of Sp(2). We apply to them the usual rules of Berezin integration. We must next look for Sp(2)-invariant quadratic and quartic invariants of the form $\mathrm{tr}\,P(\Psi, \Psi^\dagger)$. Since the $\Psi$ are non-commutative objects, one must consider arbitrary tensor products of the (dual of the) fundamental representation; however the trace property combined with the anti-commutativity of the matrix elements implies that an elementary circular permutation must have eigenvalue −1 in this representation. Very explicitly, there is one quadratic invariant,

† − † , and two independant quartic invariants, say

†

† − †

† −

† † + †

†

and

† † − †

† −

† † + † †

. However it is clear that the first quartic invariant is stable by circular permutation and therefore its trace is zero. We are thus left with the following expression: † † † (3.21) Z (N) (t, g0 ) = d d † eN tr −t

+ g0

. It is no surprise that the partition function only depends on one coupling constant g0 , since the analysis of Sect. 2.3 has shown us that for n = −2 all large N quantities depend only on the combination g1 − g2 ; and indeed, by direct inspection one can identify g0 = g1 − g2 . As a side remark, we note that one interpretation of this model is that we are computing the (annealed) average of the Jones polynomial V (t) at t = 1 (which corresponds to g1 = 0; it is a well-known formula that V (1) = (−2)k−1 ). As in the case n = 1, we have access to only one four-point function 1 † † (3.22) tr

= 1 − 2 . ≡ lim N→∞ N c Again, a “miracle” happens in that a particular subset of the renormalization equations becomes closed for n = −2; namely, taking (2.10.1)–(2.10.2), g0 (g) = g(1 − 2H2 (g) + H1 (g) + V2 (g)),

(3.23)

a combination of the H-2PI diagrams appears, which can be related to $\Gamma$ via the use of Eqs. (2.11):

\[ V_2 - 2H_2 + H_1 = -2 + \frac{3}{2(1+g)(1-\Gamma)} + \frac{1}{2(1-g)(1+\Gamma)}. \tag{3.24} \]

We now briefly describe how to compute the integral (3.21) in the large N limit. We can set t = 1 without loss of generality, as explained in Sect. 2.3. We perform the standard Hubbard–Stratonovitch transformation by introducing a hermitian matrix A: √ 1 2 † † † Z (N) (t, g0 ) = d d † dA eN tr −

− 2 A + −g0 A(

+ ) . (3.25)


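The Gaussian integral over the fermionic matrix and its adjoint can then be performed:

\[ Z^{(N)}(t, g_0) = \int dA \, \det\bigl(1 \otimes 1 + \sqrt{-g_0}\,(A \otimes 1 + 1 \otimes A)\bigr)\, e^{-\frac{N}{2} \operatorname{tr} A^2}. \tag{3.26} \]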

We recognize at this point the usual O(−2) fully packed (non-intersecting)√ loops model2 [7]. We perform the change of variables: M = (A − a0 )2 with a0 = −1/2 −g0 , which absorbs the determinant, resulting in √ N 2 (3.27) Z (N) (t, g0 ) = dM e− 2 tr( M + a0 ) . One can then diagonalize M and compute the integral over eigenvalues using standard large N saddle point techniques. The resolvent of M is given by a complete elliptic integral of the third kind; in particular, 1 1 1 G− trM = + 2 = lim K 3 ((8 − 8k 2 + 3k 4 )K + 4(k 2 − 2)E), N→∞ N 4g0 96π 4 g02 (3.28) where K and E are complete elliptic integrals of the first and second kind with modulus k, and the coupling constant g0 is given by g0 = −

1 K((2 − k 2 )K − 2E) 8π 2

(3.29)

1−G (cf. also [3] for a similar solution). Finally, inserting the expression of = 2g 2 + 1 0G obtained from (3.28) into renormalization Eqs. (3.23)–(3.24) results in a g-dependent transcendental equation for the modulus k 2 . This equation is too complicated to be solved exactly; however it can be easily solved numerically order by order. Finally, is the desired generating function 1 − 2 of tangles; in the appendix we present the first few orders of the expansion of (g). We now turn to the asymptotic behavior of the coefficients of (g). It is known that the O(−2) matrix model does not have any critical point of the usual form of 2D quantum gravity; for example in [3], the solution of the O(−2) model of dense loops, equivalent to ours, is studied in detail when the elliptic modulus k is in the range [0, 1], which corresponds to g0 ∈ [−∞, 0] in our notations, and no singularity is found. This does not mean, of course, that (g) has no singularity, since g0 (and g) can move in the whole complex plane. Generically, these singularities are square root type singularities dg and given by the equation d = 0. It turns out that this equation has plenty of solutions, though one can unfortunately not write them down analytically. Numerically, one finds that the solutions with smallest modulus of g are:

\[ g_c \approx -0.239 \pm 0.135\,i. \tag{3.30} \]
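The position of this singularity already fixes the exponential growth rate of the coefficients and the rhythm of their sign changes. The following is a minimal sketch, assuming only the three-digit value quoted in (3.30); the variable names are illustrative and not taken from the paper.

```python
import cmath
import math

# Leading singularity from Eq. (3.30); the text quotes only three digits.
g_c = complex(-0.239, 0.135)

# The radius of convergence is |g_c|, so coefficients grow like (1/|g_c|)^p.
growth = 1.0 / abs(g_c)

# g_c^{-p} rotates by this angle at each order. An angle of exactly pi would
# give strictly alternating signs; the deviation from pi sets how often the
# alternation slips.
angle = cmath.phase(1.0 / g_c)
delta = math.pi - abs(angle)
slip_period = math.pi / delta

print(f"growth rate 1/|g_c| = {growth:.3f}")
print(f"rotation per order  = {angle:.4f} rad")
print(f"alternation slips roughly every {slip_period:.1f} orders")
```

Because the input is rounded, the printed numbers are only indicative; they should not be used to reproduce individual coefficients at high order.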

2 Of course this might have been expected from the start, since for any n the O(n) model of fully packed non-intersecting loops corresponds to the particular case g1 = 0 of our model, and therefore here g0 = −g2 .


They are pairs of complex conjugate solutions: this indicates oscillatory behavior of the series,3 which is due to the fact that n < 0. We therefore find that if $\Gamma = \sum_{p=1}^{\infty} \gamma_p g^p$, then

\[ \gamma_p \sim \operatorname{Re}\bigl(\mathrm{cst}\; p^{-3/2}\, g_c^{-p}\bigr), \tag{3.31} \]

where cst ≈ −0.237 ± 0.090i. It would be interesting to find a physical interpretation of this critical point à la Yang–Lee. Finally, let us note that one could combine the results of n = +2 [12] and of n = −2: this would give rise to a model of oriented tangles in which one counts separately tangles with odd and even numbers of connected components. Since the coefficients of $\Gamma(g)$ (and presumably also of $\Gamma_1(g)$ and $\Gamma_2(g)$ separately) in the case n = −2 satisfy, according to Eq. (3.31), $\gamma_p = O(3.64\ldots^p)$, whereas in the case n = +2 these coefficients are of the order $6.28\ldots^p$, we conclude that there are, up to exponentially small corrections, as many odd tangles as there are even tangles...

3 See also [4], in which this oscillatory behavior is found numerically for n < 0.

Appendix A. Tables for n = 1 and n = −2 up to 32 Crossings

Table 1. Table of the number of prime alternating tangles (n = 1) with 4, 6, 8 external legs

p   Gc4                    Gc6                     Gc8
1   1                      -                       -
2   2                      3                       -
3   4                      14                      12
4   10                     51                      90
5   29                     186                     468
6   98                     708                     2196
7   372                    2850                    10044
8   1538                   12099                   46170
9   6755                   53756                   215832
10  30996                  247911                  1029564
11  146982                 1178352                 5010192
12  715120                 5740224                 24830640
13  3552254                28535604                125073288
14  17951322               144283404               639037476
15  92045058               740126242               3306068412
16  477882876              3843972303              17292904722
17  2508122859             20180815236             91335814848
18  13289437362            106957362161            486589812240
19  71010166670            571643594646            2612379495996
20  382291606570           3078146310603           14122834373034
21  2072025828101          16686687494650          76829648302716
22  11298920776704         91009054240656          420345016423632
23  61954857579594         499101633250932         2311716994208856
24  341427364138880        2750883342029780        12773922263423472
25  1890257328958788       15231756014050908       70893591427443456
26  10509472317890690      84695579659496748       395034141129257304
27  58659056351295672      472782954018549456      2209407034450182552
28  328591560659948828     2648662349568626736     12399753592080373248
29  1846850410940949702    14888203427107319436    69813861782757325992
30  10412612510292744992   83947527137925001240    394245960540417041532
31  58877494436409193754   474714688448707647894   2232568414958638372020
32  333824674188182988872  2691749836124970938595  12675855143073018219570
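A quick consistency check on the four-leg column of Table 1 is to compare ratios of successive entries with the growth constant $(101 + \sqrt{21001})/40$ of Sundberg and Thistlethwaite [9]. The sketch below assumes, as a working hypothesis rather than a statement taken from this text, that the entries behave like const · p^(−5/2) · λ^p at large p; the function names are illustrative.

```python
import math

# Last five entries of the four-leg column of Table 1 (p = 28, ..., 32).
gc4_tail = {
    28: 328591560659948828,
    29: 1846850410940949702,
    30: 10412612510292744992,
    31: 58877494436409193754,
    32: 333824674188182988872,
}

# Growth constant for prime alternating tangles quoted from [9].
LAMBDA = (101 + math.sqrt(21001)) / 40


def ratio(p):
    """Ratio a_{p+1} / a_p of successive table entries."""
    return gc4_tail[p + 1] / gc4_tail[p]


def predicted_ratio(p, exponent=-2.5):
    """Ratio expected if a_p ~ const * p**exponent * LAMBDA**p (assumed form)."""
    return LAMBDA * ((p + 1) / p) ** exponent


for p in range(28, 32):
    print(p, round(ratio(p), 4), round(predicted_ratio(p), 4))
```

The two printed columns should approach each other as p grows; the remaining gap reflects subleading corrections that the assumed form ignores.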


Table 2. Table of the coefficients of $\Gamma$ (n = −2 tangles with 4 external legs)

p   coefficient
1   1
2   −1
3   1
4   1
5   −7
6   23
7   −51
8   50
9   212
10  −1596
11  6492
12  −19124
13  37094
14  1878
15  −437322
16  2557800
17  −10055712
18  29767944
19  −58631365
20  −4689017
21  740682974
22  −4462194156
23  18243692937
24  −57186253699
25  127394803329
26  −81353773012
27  −1062951245376
28  7538741871041
29  −33359417764221
30  112902256367630
31  −286176860146756
32  379259745656069

References

1. Brézin, E., Itzykson, C., Parisi, G., Zuber, J.-B.: Planar diagrams. Commun. Math. Phys. 59, 35–51 (1978)
2. Di Francesco, P., Ginsparg, P., Zinn-Justin, J.: 2D gravity and random matrices. Phys. Rep. 254, 1–133 (1995)
3. Eynard, B., Kristjansen, C.: More on the exact solution of the O(n) model on a random lattice and an investigation of the case |n| > 2. Nucl. Phys. B 466, 463–487 (1996) (Preprint hep-th/9512052)
4. Jacobsen, J.L., Zinn-Justin, P.: A transfer matrix approach to the enumeration of knots. J. Knot Theor. Ramif. 11, 739–758 (2002) (Preprint math-ph/0102015); A transfer matrix approach to the enumeration of colored links. J. Knot Theor. Ramif. 10, 1233–1267 (2001) (Preprint math-ph/0104009)
5. Kauffman, L.H.: Virtual knot theory. European J. Comb. 20, 663–690 (1999) (Preprint math.GT/9811028)
6. Knizhnik, V.G., Polyakov, A.M., Zamolodchikov, A.B.: Fractal structure of 2D quantum gravity. Mod. Phys. Lett. A 3, 819–826 (1988); David, F.: Conformal field theories coupled to 2D gravity in the conformal gauge. Mod. Phys. Lett. A 3, 1651–1656 (1988); Distler, J., Kawai, H.: Conformal field theory and 2D quantum gravity. Nucl. Phys. B 321, 509 (1989)
7. Kostov, I.K.: Mod. Phys. Lett. A 4, 217 (1989); Gaudin, M., Kostov, I.K.: Phys. Lett. B 220, 200 (1989); Kostov, I.K., Staudacher, M.: Nucl. Phys. B 384, 459 (1992)
8. Menasco, W.W., Thistlethwaite, M.B.: The Tait flyping conjecture. Bull. Am. Math. Soc. 25, 403–412 (1991); The classification of alternating links. Ann. Math. 138, 113–171 (1993)
9. Sundberg, C., Thistlethwaite, M.: The rate of growth of the number of prime alternating links and tangles. Pac. J. Math. 182, 329–358 (1998)
10. Zinn-Justin, P.: Some matrix integrals related to knots and links. In: Proceedings of the 1999 Semester of the MSRI “Random Matrices and their Applications”. MSRI Publications, Vol. 40, 2001 (Preprint math-ph/9910010)
11. Zinn-Justin, P., Zuber, J.-B.: Matrix integrals and the counting of tangles and links. Discr. Math. 246, 343 (2002) (Preprint math-ph/9904019)
12. Zinn-Justin, P., Zuber, J.-B.: On the counting of colored tangles. J. Knot Theor. Ramif. 9, 1127–1141 (2000) (Preprint math-ph/0002020)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 238, 305–331 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0845-1


Rationality, Quasirationality and Finite W-Algebras

Matthias R. Gaberdiel∗, Andrew Neitzke∗∗

Department of Applied Mathematics and Theoretical Physics, Cambridge University, Wilberforce Road, Cambridge CB3 0WA, United Kingdom

Received: 18 October 2000 / Accepted: 28 January 2003
Published online: 5 May 2003 – © Springer-Verlag 2003

Abstract: Some of the consequences that follow from the C2 condition of Zhu are analysed. In particular it is shown that every conformal field theory satisfying the C2 condition has only finitely many n-point functions, and this result is used to prove a version of a conjecture of Nahm, namely that every representation of such a conformal field theory is quasirational. We also show that every such vertex operator algebra is a finite W -algebra, and we give a direct proof of the convergence of its characters as well as the finiteness of the fusion rules.

1. Introduction Conformal field theory has had a major impact on modern theoretical physics as well as modern mathematics. From the point of view of physics, conformal field theory plays a central rˆole in string theory, at present the most promising candidate for a unifying theory of all forces. On the other hand, conformal field theory inspired the purely mathematical definition of vertex operator algebras, which has led to beautiful and deep results in the theory of finite groups and number theory. The significance and power of conformal field theory was first conclusively demonstrated by the work of Belavin, Polyakov and Zamolodchikov [1]. They fixed a general framework for its study which was further developed by Moore and Seiberg [29], in particular. On the other hand, the mathematical theory of vertex operator algebras is due to Borcherds [2, 3] and Frenkel, Lepowsky & Meurman [12], and was further developed by Frenkel, Huang & Lepowsky [10], Zhu [35], Kac [22] and others. Apart from this algebraic viewpoint, there exists also a geometrical approach that was directly inspired ∗ Present address: Department of Mathematics, King’s College London, Strand, London WC2R 2LS, U.K. E-mail: [email protected] ∗∗ Present address: Harvard University, Department of Mathematics, One Oxford Street, Cambridge, MA 02138, USA. E-mail: [email protected]


by string theory (in particular the work of Friedan & Shenker [13]) and that has been put on a mathematical foundation by Segal [33] and Huang [19–21]. Much has been learned about conformal field theory, but there are still a number of conceptual problems that have not been resolved so far. One of them concerns the question of how to characterise the class of “rational” theories, i.e. those theories that are in some sense finite and tractable. Various definitions of rationality have been proposed, but the interrelations between the different assumptions are not very well understood. One of the assumptions that was introduced by Zhu in [35] in order to be able to prove the convergence of the characters is the condition (sometimes referred to as the C2 condition) that a certain quotient space of the vertex operator algebra is finitedimensional. This is a slightly technical assumption; however, it has the great virtue of being easily testable in concrete examples. In this paper we analyse the consequences that follow from this condition. As we shall show, the C2 condition implies that a whole family of quotient spaces are finite-dimensional, and this in turn is sufficient to prove that the theory has only finitely many n-point functions (Theorem 11), and in particular that all fusion rules between irreducible (untwisted) highest weight representations are finite (Corollary 14). We also prove that the C2 condition implies that each highest weight representation of the theory is quasirational (Theorem 13); this proves a version of a conjecture of Nahm [30]. Finally, we show that every such vertex operator algebra is a finite W -algebra, and we give a direct proof of the convergence of its characters (see also [7, 24] for independent proofs of these results.) Most of these results hinge on finding a small spanning set of the vacuum representation (Proposition 8); we expect that this result may also be useful in other contexts and applications. 
If we assume in addition that Zhu’s algebra is semisimple, we can also find an upper bound on the (effective) central charge of the theory in terms of the dimension of the C2 quotient space (Proposition 15). The paper is organised as follows. In Sect. 2 we fix our conventions and introduce some notation. In Sect. 3 we define a class of quotient spaces that generalise the construction of the C2 space (and of Zhu’s algebra), and we prove a number of simple properties. In Sect. 4 we recall the definition of a highest weight representation and we explain in which sense Zhu’s algebra classifies these representations. Section 5 is concerned with the different definitions of rationality. The central Proposition is proven in Sect. 6, and a few simple consequences are derived. In Sect. 7 we use this result to prove that each conformal field theory satisfying the C2 condition has only finitely many n-point functions, and we show that this implies Nahm’s conjecture. Section 8 describes our bound on the central charge. Section 9 gives a more precise description of one of the quotient spaces which figure prominently in the proofs of the preceding results, and Sect. 10 contains some conclusions and outlook for further work. Finally, we have included an appendix where two quite technical calculations are described in detail. 2. Notation We assume the reader is familiar with basic notions of conformal field theory, as found for instance in [5, 15]. Some acquaintance with the language of vertex operator algebras [2, 12, 22] is also helpful. At all times in this paper we are considering a fixed chiral bosonic conformal field theory defined on the sphere P. To be precise, by a “chiral bosonic conformal field theory” we mean an object of the type discussed in [16]. From this point of view, a conformal field theory on P is defined in terms of its amplitudes. We assume that the


amplitudes are local, Möbius invariant, and satisfy the cluster decomposition property (that guarantees that the spectrum of the scaling operator $L_0$ is bounded by zero from below, with a unique state of eigenvalue h = 0). We also assume that the theory is conformal, i.e. that it possesses a stress energy tensor V(L, z) = L(z) whose modes $L_n$ satisfy the Virasoro algebra. The details of the chosen formalism are not essential to following the ideas of this paper, and indeed, the whole argument could be rewritten in terms of the standard axioms of vertex operator algebras [22]. The amplitudes that define the theory are written as $\langle \prod_{a=1}^{k} V(\psi_a, z_a) \rangle$, where the vertex operator corresponding to ψ ∈ V is denoted by V(ψ, z). The vector space V consists of quasi-primary states that generate the whole theory; for convenience we shall occasionally assume that V is the space of all quasiprimary states. We sometimes write V(ψ, z) in terms of modes as

\[ V(\psi, z) = \sum_{n \in \mathbb{Z}} V_n(\psi)\, z^{-n-h_\psi} = \sum_{m \in \mathbb{Z}} V_{(m)}(\psi)\, z^{-m-1}, \tag{1} \]

where $V_{(m)}(\psi) = V_{m+1-h_\psi}(\psi)$ is the moding that is commonly used in the mathematical literature. We will frequently consider meromorphic functions and differentials defined on the Riemann sphere P. It is convenient to use the language of “divisors” (see [18]) to classify the zeros and poles of these functions. We now give a brief review of the facts relevant for us. A divisor on P is, by definition, any formal sum of the form

\[ D = \sum_{P \in \mathbb{P}} c_P [P], \qquad c_P \in \mathbb{Z}, \quad \text{finitely many } c_P \neq 0. \tag{2} \]

Divisors can be added and subtracted in the obvious way, and we say D ≥ 0 if all $c_P \geq 0$. Now let $\nu_P(f)$ denote the order of vanishing of f at P (so $\nu_P(f)$ is negative if f has a pole at P), and define the divisor of f to be

\[ \operatorname{div} f = \sum_{P \in \mathbb{P}} \nu_P(f) [P]. \tag{3} \]
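For rational functions these definitions are easy to make concrete. The following is a minimal sketch, assuming a function is specified by its finite zeros and poles with multiplicities; the names `div_rational`, `add` and `is_effective` are illustrative and not taken from the paper.

```python
from collections import Counter

INF = "inf"  # label for the point at infinity on the Riemann sphere


def div_rational(zeros, poles):
    """Divisor of f(z) = prod (z - a)^m / prod (z - b)^n.

    zeros and poles map finite points to multiplicities. The order of
    vanishing at infinity is deg(denominator) - deg(numerator).
    """
    d = Counter()
    for point, mult in zeros.items():
        d[point] += mult
    for point, mult in poles.items():
        d[point] -= mult
    d[INF] += sum(poles.values()) - sum(zeros.values())
    return {p: c for p, c in d.items() if c != 0}


def add(d1, d2):
    """Sum of two divisors, dropping points with coefficient zero."""
    out = Counter(d1)
    for p, c in d2.items():
        out[p] += c
    return {p: c for p, c in out.items() if c != 0}


def is_effective(d):
    """True if D >= 0, i.e. all coefficients are non-negative."""
    return all(c >= 0 for c in d.values())


f = div_rational({0: 2}, {1: 1})   # f(z) = z^2 / (z - 1)
g = div_rational({1: 1}, {0: 1})   # g(z) = (z - 1) / z
fg = div_rational({0: 1}, {})      # f(z) g(z) = z

print(f)                  # divisor of f
print(add(f, g) == fg)    # divisor of a product is the sum of the divisors
print(sum(f.values()))    # coefficients of the divisor of a function sum to zero
```

In this representation a non-constant rational function always has a negative coefficient somewhere, so only constants have an effective divisor.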

Clearly div fg = div f + div g, and div f ≥ 0 just if f is holomorphic (i.e. constant). We can similarly define div ω, where ω is a meromorphic k-differential on P; explicitly, such an ω can always be written as $\omega = f\, dz^{\otimes k}$ for some f, and then we have

\[ \operatorname{div} \omega = \operatorname{div} f + k \operatorname{div} dz = \operatorname{div} f - 2k[\infty]. \tag{4} \]

(This definition expresses the fact that dz has a pole of order 2 at infinity.) The crucial analytic property which the amplitudes of the theory must possess by definition [16] is that, for any ψ ∈ V, $\langle V(\psi, z) \prod_{i=1}^{k} V(\psi_i, z_i) \rangle\, dz^{\otimes h_\psi}$ depends meromorphically on z ∈ P and has poles only for $z = z_i$. The $\psi_i$ are elements in V (i.e. meromorphic fields that correspond to states in the vacuum sector of the theory), and the resulting amplitudes are then also meromorphic in the variables $z_i$. Later on we will consider more general amplitudes with insertions of untwisted highest weight representations. These amplitudes $\langle V(\psi, z) \prod_{i=1}^{k} W(\phi_i, u_i) \rangle\, dz^{\otimes h_\psi}$, where the $\phi_i$ are states in such highest weight representations, are then still meromorphic as a function of z ∈ P. However, the functional dependence on the $u_i$ is then determined by the Knizhnik-Zamolodchikov equation, and, in particular, the amplitudes are typically non-meromorphic as


a function of the $u_i$. (At any rate, for our purposes it will be often sufficient to consider these amplitudes at fixed insertion points $u_i$.) The Fock space H of the theory is spanned by finite linear combinations of states of the form

\[ \Psi = V_{(n_1)}(\psi_1) \cdots V_{(n_k)}(\psi_k)\, \Omega, \tag{5} \]

where $\psi_i \in V$, Ω denotes the unique (vacuum) state with conformal weight h = 0, and $n_i \in \mathbb{Z}$. Any product of vertex operators $V(\phi_1, u_1) \cdots V(\phi_l, u_l)$ defines a linear functional on the Fock space by

\[ \eta_{V(\phi_1,u_1) \cdots V(\phi_l,u_l)}(\Psi) = \oint_0 dz_1\, z_1^{n_1} \cdots \oint_0 dz_k\, z_k^{n_k} \Bigl\langle V(\phi_1, u_1) \cdots V(\phi_l, u_l) \prod_{i=1}^{k} V(\psi_i, z_i) \Bigr\rangle, \tag{6} \]

where the contours are chosen so that $|z_1| > |z_2| > \cdots > |z_k|$. The Fock space is the space spanned by vectors of the form (5), modulo states that vanish in each linear functional associated to any product of vertex operators. In (6) we have considered the Fock space at 0 ∈ P; however, since the amplitudes are translation invariant, it is clear that one can similarly consider the Fock space at any other point on the Riemann sphere.

3. The Subspaces $A_n$

We begin by introducing a generalization of the quotients of H which appeared in [16, 31, 35]. For fixed $u = (u_1, \ldots, u_k) \in (\mathbb{P} - \{0\})^k$ (where we do not require that the $u_i$ be distinct), we define

\[ O_u = \operatorname{Span} \Bigl\{ \oint_0 dz\, g(z) V(\psi, z) \chi \;\Bigm|\; \chi \in H,\ \psi \in V,\ g \text{ meromorphic},\ \operatorname{div}\bigl(g\, dz^{\otimes(-h_\psi+1)}\bigr) \geq -N[0] + \sum_{i=1}^{k} h_\psi [u_i] \text{ for some } N \geq 0 \Bigr\}. \tag{7} \]

We then set

\[ A_u = H/O_u. \tag{8} \]
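The divisor condition in (7) can be checked mechanically for a simple family of weight functions. The sketch below assumes $u_1 = \infty$ with the remaining $u_j$ finite and nonzero, and takes $g(z) = z^{-M-1-(k-2)h} \prod_{j \geq 2} (z - u_j)^h$ as an illustrative choice; the function names are hypothetical.

```python
def divisor_of_g_differential(M, h, finite_points):
    """Divisor of g(z) dz^{(1-h)} for
        g(z) = z^(-M-1-(k-2)h) * prod_j (z - u_j)^h,
    where finite_points lists u_2, ..., u_k (assumed finite and nonzero)
    and u_1 is the point at infinity, so k = len(finite_points) + 1.
    """
    k = len(finite_points) + 1
    div = {}
    for u in finite_points:
        div[u] = div.get(u, 0) + h          # zero of order h at each u_j
    div[0] = -(M + 1 + (k - 2) * h)         # pole at the origin
    order_g_at_inf = -sum(div.values())     # orders of a function sum to zero
    # By Eq. (4), dz^{(1-h)} contributes -2(1-h) at infinity.
    div["inf"] = order_g_at_inf - 2 * (1 - h)
    return div


def satisfies_condition_7(M, h, finite_points):
    """Check div(g dz^{(1-h)}) >= -N[0] + sum_i h[u_i] for some N >= 0."""
    div = divisor_of_g_differential(M, h, finite_points)
    required = {"inf": h}                   # u_1 = infinity
    for u in finite_points:
        required[u] = required.get(u, 0) + h
    # The coefficient at 0 is unconstrained because N may be arbitrarily large.
    return all(div.get(p, 0) >= need for p, need in required.items())


print(satisfies_condition_7(M=1, h=2, finite_points=[1, -1]))
print(satisfies_condition_7(M=0, h=2, finite_points=[1, -1]))
```

For this family the coefficient at infinity works out to M − 1 + h, so the condition at infinity is met for M ≥ 1 and fails for M = 0 in the sample call.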

Because of the Möbius invariance of the amplitudes, we may assume that one of the $u_i$, $u_1$ say, is equal to ∞. If none of the other $u_j$ are equal to ∞, one can give an explicit description of $O_u$ as the space spanned by the states of the form $V_u^{(M)}(\psi)\chi$ with M > 0, where

\[ V_u^{(M)}(\psi) = \oint_0 \frac{d\zeta}{\zeta^{M+1}}\, V\Biggl( \biggl[ \frac{\prod_{j=2}^{k} (\zeta - u_j)}{\zeta^{k-2}} \biggr]^{L_0} \psi,\ \zeta \Biggr) \tag{9} \]

and ψ ∈ V, χ ∈ H. For the case where the $u_i$ are distinct, this space has been considered before in [16], where it was denoted by $A_k$ (but we now renounce that notation in favour of one described below). In the case where all k of the $u_i$ equal ∞, one can give a similarly explicit description: in this case $O_u$ is simply spanned by states of the form $V_{-N-(k-1)h_\psi}(\psi)\chi$ with N > 0. This choice of u is particularly convenient since the


resulting $O_u$ is spanned by states of definite conformal weight; this makes calculations significantly simpler. The original motivation for the definition of $A_u$, in the case where all $u_i$ are distinct, stemmed from the fact that the algebraic dual space $A_u^*$ describes the correlation functions involving k highest weight states at $u_1, \ldots, u_k$ (this was first observed by Zhu in [35] for u = (−1, ∞)). More generally, one finds

Theorem 1. There is a one-to-one linear correspondence between elements $\eta \in A_u^*$ and systems of correlation functions on the sphere, i.e. maps

\[ \bigl((\psi_1, z_1), \ldots, (\psi_l, z_l)\bigr),\ \psi_i \in V,\ z_i \in \mathbb{P} \ \longmapsto\ \Bigl\langle \prod_{j=1}^{l} V(\psi_j, z_j) \Bigr\rangle_\eta \in \mathbb{C} \tag{10} \]

such that the $\langle \prod_{j=1}^{l} V(\psi_j, z_j) \rangle_\eta$ (regarded only as functions of the $z_j$) obey the operator product relations of the theory defined on the sphere (see [16] for a precise definition), and have the “highest weight” property

\[ \operatorname{div}\bigl( \langle V(\psi, z) \rangle_\eta\, dz^{\otimes h_\psi} \bigr) \geq -\sum_{i=1}^{k} h_\psi [u_i]. \tag{11} \]

Proof. Given any system of correlation functions one can construct a linear functional η on H by contour integration, as discussed in Sect. 2. If we further require (11), then this functional vanishes on Ou ⊂ H, and therefore η ∈ A∗u . Conversely, any η ∈ A∗u defines formal Laurent series, whose convergence to functions with the required analytic properties was proven in [31]. (Strictly speaking the proof was only given under the additional hypothesis that the ui be distinct; however, that hypothesis is actually not required anywhere in the proof.)

It is often convenient to use a shorthand notation where we only keep track of the number of coincident points $u_i$. Let us thus define $A_n$, where n is a multi-index $n = [n_1, \ldots, n_l]$; this denotes the space $A_u$ for the case where $n_1$ of the $u_i$ are equal to $v_1$, $n_2$ of the $u_i$ are equal to $v_2 \neq v_1$, etc. We define $X_n$ to be the corresponding configuration space, namely the set of all $u \in \mathbb{P}^{|n|}$ (where $|n| = n_1 + n_2 + \cdots + n_l$) for which the first $n_1$ coordinates are coincident, the next $n_2$ coordinates are coincident, and so on. The usefulness of this notation depends on the following fact:

Theorem 2. Suppose $A_{(\infty^k)}$ is finite-dimensional and let |n| = k. Then the space $A_n$ is independent of the choice of $u \in X_n$, in the sense that choosing a homotopy class of paths from u to u′ in $X_n$ determines a natural isomorphism $A_u \cong A_{u'}$.

Proof. Using Theorem 1 we can regard $A_n^*$ as a space of correlation functions. First consider the case n = (1, 1, . . . , 1) where all $u_i$ are distinct. In that case we can introduce a more suggestive notation for the correlation functions, namely, we write

\[ \Bigl\langle \prod_{i=1}^{k} W(\phi_i, u_i) \prod_{i=1}^{l} V(\psi_i, z_i) \Bigr\rangle \equiv \Bigl\langle \prod_{i=1}^{l} V(\psi_i, z_i) \Bigr\rangle_\eta. \tag{12} \]


(Here the formal symbols $W(\phi_i, u_i)$ represent insertions of highest weight states.) Given all correlation functions at some fixed u, the Knizhnik-Zamolodchikov equation determines them at all u using the fact that the Virasoro algebra acts geometrically; more specifically, if $u_1 \neq \infty$ then

\[ \frac{\partial}{\partial u_1} \Bigl\langle \prod_{i=1}^{k} W(\phi_i, u_i) \Bigr\rangle = \oint_{u_1} dz\, \Bigl\langle \prod_{i=1}^{k} W(\phi_i, u_i)\, L(z) \Bigr\rangle. \tag{13} \]

Using this formula systematically one can construct a family of differential equations, to be solved in the space obtained by gluing together the $A_u^*$ at different points u; if these equations admit solutions, we then expect that they will define the analytic continuation from correlation functions at u to correlation functions at u′, proving the theorem. To prove that solutions actually exist one has to impose the condition that $A_{(\infty^k)}$ is finite-dimensional (specifically, what one uses is the fact that a basis for $A_{(\infty^k)}$ corresponds to a spanning set for each $A_u$, $u \in X_n$). Under this assumption it is shown in [31] that the $A_u^*$ (and hence the $A_u$) indeed fit together to form a vector bundle over $X_n$ which possesses a natural flat connection given by (13). The argument given there extends straightforwardly to the case where the $u_i$ need not be distinct.

The space $O_{(\infty,\infty)}$ is the $C_2$ space of Zhu, so if $A_{[2]}$ is finite-dimensional, the $C_2$ condition of Zhu is satisfied. On the other hand, $A_{[1,1]}$ is isomorphic to Zhu’s algebra (compare also Sect. 4). The space $A_{[k]}$ has been considered before in [31], where it was denoted by $H/C_k$. As we now show, its dimension provides an upper bound on the dimension of the spaces $A_n$ with |n| = k; this result was already used in [35] for the special case n = (1, 1) (see also [31]).

Lemma 3. $\dim A_n \leq \dim A_{[|n|]}$.

Proof. Fix $u = (\infty, u_2, \ldots, u_k)$ (by Möbius invariance this involves no loss of generality). It follows from (7) that $O_u$ is generated by the states of the form

\[ V_u^{(N)}(\psi)\chi = \sum_{s=0}^{(k-1)h_\psi} c_s\, V_{-N-(k-1)h_\psi+s}(\psi)\chi, \tag{14} \]

where N > 0 and $c_s$ are some constants (depending on u) with $c_0 = 1$. On the other hand, $O_{(\infty^k)}$ is generated by the states of the form $V_{-N-(k-1)h_\psi}(\psi)\chi$ with N > 0. Let $\{\phi_1, \ldots, \phi_M\}$ be a set of representatives for H modulo $O_{(\infty^k)}$. We claim that these vectors also span H modulo $O_u$. Suppose that this is not the case, and let Ψ be a vector of minimal conformal weight that does not differ by an element in $O_u$ from a linear combination of $\phi_1, \ldots, \phi_M$. By assumption we can write

\[ \Psi = \sum_{j=1}^{M} b_j \phi_j + \sum_{r=1}^{L} V_{-N_r-(k-1)h_r}(\psi_r)\chi_r. \tag{15} \]

But then

\[ \Psi - \sum_{j=1}^{M} b_j \phi_j - \sum_{r=1}^{L} V_u^{(N_r)}(\psi_r)\chi_r \tag{16} \]

is a linear combination of vectors whose conformal weight is strictly smaller than that of Ψ. By the minimality of Ψ it then follows that the vector (16), and hence Ψ itself, differs by an element in $O_u$ from a linear combination of $\phi_1, \ldots, \phi_M$, and we have the desired contradiction.

It should be noted that the dimension can actually decrease when we “split points”. The simplest example for this phenomenon occurs already for n = [1, 1]: the $e_8$ level 1 theory is self-dual (i.e. the only representation is the vacuum representation), and therefore has no nontrivial two-point functions, implying $\dim A_{[1,1]} = 1$; on the other hand, it is easy to see by inspection that $\dim A_{[2]} \geq 249$.

4. Representations and Zhu’s Algebra

We now shift from considering the vacuum representation H to more general representations of the conformal field theory. A representation of the conformal field theory is defined in terms of the amplitudes it induces [16, 35],

\[ \Bigl\langle W(\phi_1, u_1) W(\phi_2, u_2) \prod_{i=1}^{k} V(\psi_i, z_i) \Bigr\rangle, \tag{17} \]

where the $\psi_i \in V$ are arbitrary. The amplitudes have the crucial property that they respect the operator product relations of the meromorphic conformal field theory. Furthermore, the amplitudes are Möbius covariant, and are analytic as a function of the $z_i$, except for possible poles at $z_i = z_j$, $i \neq j$, and singularities at $z_i = u_j$. We call the representation amplitudes non-singular if the singularities at $z_i = u_j$ are poles of finite order; a non-singular representation amplitude is highest weight if the order of the pole at $z_i = u_j$ is bounded by $h_{\psi_i}$. We can construct from the amplitudes two vector spaces $H_1$ and $H_2$ which form modules for the modes $V_n(\psi)$ for ψ ∈ H. These modules are generated by the action of the modes (defined via contour integrals around the $u_i$) from $\phi_1$ and $\phi_2$, respectively. The actual module is then a quotient space of the space so obtained, where we remove “null vectors” by identifying states whose difference vanishes in all amplitudes (17). The requirement that the amplitudes respect the operator product relations implies that the action of the modes satisfies the “Jacobi identity” required in algebraic definitions of representation (given e.g. in [22]). If the representation amplitudes are non-singular, then the two representations are “weak modules” in the sense of [6], i.e. $H_i$ has the property that, for any ψ ∈ H and $\chi \in H_i$, $V_n(\psi)\chi = 0$ for n > N (where N may depend on ψ, χ). Finally, if the representation amplitudes are highest weight, then the two representations are highest weight representations, i.e. $H_i$ is generated from a single state $\phi_i$ with the property that $V_n(\psi)\phi_i = 0$ whenever n > 0. On the other hand, we can construct representation amplitudes (that have the appropriate analytic properties) from purely algebraic data. Indeed, it follows from Theorem 1 that each element in the algebraic dual of $A_{(u_1,u_2)}$ defines representation amplitudes that have the highest weight property.
It was furthermore shown by Zhu [35] (see also [15, 16] for an exposition more in line with the present point of view) that A(u1 ,u2 ) has the structure of an algebra, and that the equivalence classes of representation amplitudes (where we identify amplitudes that define equivalent modules Hi ) are in one-to-one correspondence with representations of A(u1 ,u2 ) . Finally, the irreducible representations


R of Zhu’s algebra are in one-to-one correspondence with the irreducible highest weight representations $H_R$ of the conformal field theory. The algebra structure of Zhu’s algebra is most easily understood for $A = A_{(\infty,-1)}$, whence it is defined by

\[ \psi * \chi \equiv V^{(0)}_{(\infty,-1)}(\psi)\chi, \tag{18} \]

where $V^{(0)}_{(\infty,-1)}(\psi)$ is given in (9). This product is characterized by the identity (emphasized by Brungs and Nahm in [4])

\[ V_0(\psi * \chi) = V_0(\psi) V_0(\chi) \tag{19} \]

which holds when both sides act on highest weight states, so that A is essentially the algebra of zero modes of fields in the vacuum sector acting on highest weight states. Theorem 1 states that $A^*_{(u_1,\ldots,u_k)}$ describes the space of correlation functions that correspond to k highest weight states. If we are, however, interested in understanding the different ways in which the various representations of the theory can couple in k-point functions, then this description contains a certain redundancy. In particular, we can act with zero modes $V_0(\psi)$ on any of the $\phi_i$ in (12), and this will produce another highest weight state in the same representation. It is therefore useful to study $A_u^*$ as a representation of k copies of the zero mode algebra A acting at the k points $u_i$.

Theorem 4. Fix a multi-index n. For any i with $n_i = 1$ there is a natural map of algebras, $\rho_i : A \to \operatorname{End}(A_n)$. The dual map $\rho_i^* : A \to \operatorname{End}(A_n^*)$ satisfies the identity (for $u_i \neq \infty$)

\[ \langle \cdots \rangle_{\rho_i^*(\psi)\eta} = \oint_{u_i} dz\, (z - u_i)^{h_\psi - 1} \langle V(\psi, z) \cdots \rangle_\eta, \tag{20} \]

i.e. it is the action of zero modes.

Proof. Without loss of generality we may assume that $u = (\infty, u_2, \ldots, u_k)$. Using the notation introduced in (9) we define $\rho_1(\psi) = V_u^{(0)}(\psi)$. It is shown in the appendix that for L > 0,

\[ [V_u^{(0)}(\psi_1), V_u^{(L)}(\psi_2)]\chi \in O_u, \tag{21} \]

and that for L ≥ 0,

\[ V_u^{(0)}\bigl( V^{(L)}_{(\infty,-1)}(\psi)\, \chi \bigr)\, \phi \approx V_u^{(L)}(\psi)\, V_u^{(0)}(\chi)\, \phi, \tag{22} \]

where ≈ denotes equality up to states in $O_u$. This implies that $\rho_1$ defines an algebra homomorphism $A \to \operatorname{End}(A_n)$. Using the Möbius invariance of the amplitudes, this is sufficient to prove the statement for all i. The formula (20) follows easily from the definition of the action.

If A is semisimple and if all ui are distinct, Theorem 4 allows us to decompose A∗(u1 ,...,uk ) completely into representations (R1 ⊗ · · · ⊗ Rk ) of Ak ; the multiplicity with


which (R1 ⊗ · · · ⊗ Rk ) appears in A∗(u1 ,...,uk ) then gives an upper bound on the number of different ways in which the spaces HR1 , . . . , HRk can be coupled.1 Given a representation, a rough measure of its size relative to the vacuum representation is given by the special subspace, defined by Nahm in [30] as follows: let W ⊂ Hi be defined by W = Span{Vn (ψ)χ : n ≤ −hψ < 0, ψ ∈ H, χ ∈ Hi }.

(23)

Then a special subspace, Hsi , is a subspace of Hi such that W + Hsi = Hi and W ∩ Hsi = {0}. The dimension of Hsi equals the dimension of the quotient space Hi /W , and thus is independent of the choice of Hsi . In the case of the vacuum representation, dim Hs = 1, and dim Hsi > 1 for any other representation. Representations whose special subspace is finite-dimensional play a preferred role (their fusion rules are finite), and are called quasirational. Finally, since Hi carries an action of the Vn (ψ), we note that we can define various quotients Ain of Hi just by replacing H with Hi in (7), (8). In particular, when Hi = HR , AR [1] is isomorphic to the highest weight space R, as can be seen from choosing u = (∞) in (7). The AR n obey an analogue of Theorem 1, but we will not use this fact explicitly in what follows. 5. Rationality One of the central concepts in conformal field theory is “rationality,” a condition which is supposed to express a kind of finiteness of the theory. There exist various notions of finiteness in the literature [6, 24, 28, 35] and the precise interrelations between the different assumptions are not all understood. On the other hand, most people would agree that every rational theory should have the following properties: (i) The conformal field theory has only finitely many irreducible highest weight representations. (ii) The characters χR (q) = Tr HR q L0 −c/24 are convergent for |q| < 1 and close under modular transformations. (iii) The fusion rule coefficients Nijk of three irreducible highest weight representations, Hi , Hj and Hk are all finite. There are various different conditions that imply some of these properties. For example, if Zhu’s algebra is semisimple, it follows from the Wedderburn structure theorem (see for example [9]) that A= EndVi (24) i

for a finite set of finite-dimensional vector spaces $V_i$, which form the only irreducible representations of $A$. Thus if $A$ is semisimple, (i) is satisfied. It is reasonable to conjecture that (ii) and (iii) should also follow from the semisimplicity of $A$, but this conjecture is, at least at present, still out of reach. In order to make progress, two other conditions have been proposed:

1 For theories in which every representation is completely reducible (see Section 5) this bound is sharp, i.e. every element of $A^*_u$ corresponds to an actual coupling. The reason this is not true in general is that the correlation functions coming from an element of $A^*_u$ need not respect the null-vector relations in the $H^{R_i}$.

M.R. Gaberdiel, A. Neitzke

(a) Every N-graded weak module is completely reducible. (This is the condition called rationality by Zhu and many other authors on vertex operator algebras [6, 24, 35].)
(b) The quotient space $A_{[2]}$ is finite-dimensional. (This is the $C_2$ condition of Zhu.)

It has been shown in [35] that (a) implies the semisimplicity of $A$, and therefore by the above argument (i). In the same paper it was shown that (a) together with (b) imply (ii). Zhu further conjectured that (a) implies (b), but this also seems at present out of reach. The $C_2$ condition implies that $A$ is finite-dimensional, but does not imply its semisimplicity [17]. In the following we shall mainly analyse the implications of (b). In particular we shall show that (b) implies that every highest weight representation is quasirational and that (iii) holds. We shall also give a direct argument for the convergence of the characters under the assumption of (b).

6. The Basis Lemma

First we will prove three computational results that are originally due to Borcherds [2] (see also [22]).

Lemma 5. We have
\[ [V_{(-N_1)}(\psi_1), V_{(-N_2)}(\psi_2)] = \sum_{r=1}^{h_1+h_2} V_{(-N_1-N_2+1-r)}(\chi_r), \tag{25} \]

where $h_i$ is the conformal weight of $\psi_i$, and the conformal weight of $\chi_r$ is $h_1 + h_2 - r$.

Proof. The commutator of the corresponding modes is
\[ [V_{-N_1+1-h_1}(\psi_1), V_{-N_2+1-h_2}(\psi_2)] = \sum_{s=0}^{h_1+h_2-1} V_{-N_1-N_2+2-h_1-h_2}(\chi_s), \tag{26} \]
where the conformal weight of $\chi_s$ is $h_1 + h_2 - 1 - s$. Substituting $r = s + 1$, we then obtain the above formula.

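Lemma 6. We have
\[
V_{(-N_1)}\bigl(V_{(-N_2)}(\psi)\chi\bigr) = \sum_{L\ge 0}\binom{N_2+L-1}{L}\, V_{(-N_2-L)}(\psi)\,V_{(-N_1+L)}(\chi) + (-1)^{N_2+1}\sum_{L\ge 0}\binom{N_2+L-1}{L}\, V_{(-N_1-N_2-L)}(\chi)\,V_{(L)}(\psi), \tag{27}
\]
where both sums terminate when they are evaluated on an element of $H$.

Proof. We rewrite $V_{(-N_1)}(V_{(-N_2)}(\psi)\chi)$ as
\[
\begin{aligned}
V_{(-N_1)}\bigl(V_{(-N_2)}(\psi)\chi\bigr) &= \oint_0 V\bigl(V_{(-N_2)}(\psi)\chi, \zeta\bigr)\,\zeta^{-N_1}\,d\zeta \\
&= \oint\!\!\oint_{|\zeta|>|z|} V\bigl(V(\psi,z)\chi, \zeta\bigr)\, z^{-N_2}\zeta^{-N_1}\,dz\,d\zeta \\
&= \oint\!\!\oint_{|\zeta|>|z|} V(\psi, z+\zeta)\,V(\chi,\zeta)\, z^{-N_2}\zeta^{-N_1}\,dz\,d\zeta.
\end{aligned} \tag{28}
\]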

Rationality, Quasirationality and Finite W-Algebras

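We then substitute $\omega = z + \zeta$ and find
\[
\begin{aligned}
V_{(-N_1)}\bigl(V_{(-N_2)}(\psi)\chi\bigr) &= \oint_0 \Bigl[\oint_\zeta V(\psi,\omega)\,V(\chi,\zeta)\,(\omega-\zeta)^{-N_2}\,d\omega\Bigr]\zeta^{-N_1}\,d\zeta \\
&= \oint\!\!\oint_{|\omega|>|\zeta|} V(\psi,\omega)\,V(\chi,\zeta)\,(\omega-\zeta)^{-N_2}\,\zeta^{-N_1}\,d\zeta\,d\omega \\
&\quad - \oint\!\!\oint_{|\zeta|>|\omega|} V(\chi,\zeta)\,V(\psi,\omega)\,(\omega-\zeta)^{-N_2}\,d\omega\,\zeta^{-N_1}\,d\zeta.
\end{aligned} \tag{29}
\]
In the first line we can then write
\[ (\omega-\zeta)^{-N_2} = \omega^{-N_2}\sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\Bigl(\frac{\zeta}{\omega}\Bigr)^{L}, \]
and thus obtain for this term
\[
\begin{aligned}
&\sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\oint_0 V(\psi,\omega)\,\omega^{-N_2-L}\,d\omega \oint_0 V(\chi,\zeta)\,\zeta^{-N_1+L}\,d\zeta \\
&\qquad = \sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\, V_{(-N_2-L)}(\psi)\,V_{(-N_1+L)}(\chi).
\end{aligned} \tag{30}
\]
Finally, in the second line we write
\[ (\omega-\zeta)^{-N_2} = (-1)^{N_2}\,\zeta^{-N_2}\sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\Bigl(\frac{\omega}{\zeta}\Bigr)^{L}, \]
and obtain for this term
\[
\begin{aligned}
&(-1)^{N_2+1}\sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\oint_0 V(\chi,\zeta)\,\zeta^{-N_1-N_2-L}\Bigl[\oint_0 V(\psi,\omega)\,\omega^{L}\,d\omega\Bigr]d\zeta \\
&\qquad = (-1)^{N_2+1}\sum_{L=0}^{\infty}\binom{N_2+L-1}{L}\, V_{(-N_1-N_2-L)}(\chi)\,V_{(L)}(\psi).
\end{aligned} \tag{31}
\]
This proves the claim.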
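The proof of Lemma 6 rests on the expansion of $(\omega-\zeta)^{-N_2}$ as a power series with coefficients $\binom{N_2+L-1}{L}$. The following is a minimal sketch that checks this expansion in the single variable $x = \zeta/\omega$, by multiplying the truncated series for $(1-x)^{-N_2}$ with the polynomial $(1-x)^{N_2}$. The function names are illustrative only and are not taken from the paper; the check assumes $N_2 \ge 1$.

```python
from math import comb


def negative_binomial_series(n2, terms):
    """Coefficients of (1 - x)**(-n2) up to x**(terms - 1), for n2 >= 1."""
    return [comb(n2 + L - 1, L) for L in range(terms)]


def one_minus_x_power(n2):
    """Coefficients of the polynomial (1 - x)**n2."""
    return [(-1) ** j * comb(n2, j) for j in range(n2 + 1)]


def multiply_truncated(a, b, terms):
    """Product of two power series (coefficient lists), keeping `terms` coefficients."""
    out = [0] * terms
    for i, ai in enumerate(a[:terms]):
        for j, bj in enumerate(b[:terms - i]):
            out[i + j] += ai * bj
    return out


def check_expansion(n2, terms=12):
    """True if series * (1 - x)**n2 equals 1 modulo x**terms."""
    product = multiply_truncated(
        negative_binomial_series(n2, terms), one_minus_x_power(n2), terms
    )
    return product == [1] + [0] * (terms - 1)


if __name__ == "__main__":
    for n2 in range(1, 6):
        print(n2, check_expansion(n2))
```

The second expansion used in the proof follows from the same series after factoring out $(-\zeta)^{-N_2}$, which accounts for the sign $(-1)^{N_2}$.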

Lemma 7. As an immediate corollary of Lemma 6, we have
\[
V_{(-N)}(\psi)\,V_{(-N)}(\chi) = V_{(-2N+1)}\bigl(V_{(-1)}(\psi)\chi\bigr) - \sum_{L\ge 0,\ L\ne N-1} V_{(-1-L)}(\psi)\,V_{(-2N+1+L)}(\chi) - \sum_{M\ge 0} V_{(-2N-M)}(\chi)\,V_{(M)}(\psi), \tag{32}
\]
where again both sums terminate when they are evaluated on an element of $H$.

Proof. This follows from Lemma 6 with $N_1 = 2N - 1$ and $N_2 = 1$.

The next proposition is the core of this section. Recall that $A_{[2]} \cong H/O_{(\infty,\infty)}$ and that $O_{(\infty,\infty)}$ is spanned by states of the form $V_{(-M)}(\rho)\chi$, where $\rho, \chi \in H$ and $M > 1$.

Proposition 8. Let $\{W_i\}$ be a set of representatives for $H$ modulo $O_{(\infty,\infty)}$. Then $H$ is spanned by the set of states
\[ V_{(-N_1)}(W_{i_1}) \cdots V_{(-N_n)}(W_{i_n})\,\Omega, \tag{33} \]
where $N_1 > N_2 > \cdots > N_n > 0$.

Proof. Define a filtration on $H$, reminiscent of the notion of “grade” introduced in [34]:
\[ H^{(0)} \subset H^{(1)} \subset \cdots \subset H^{(g)} \subset \cdots \subset H, \tag{34} \]
where we define $H^{(g)}$ as the subspace spanned by all states of the form
\[ V_{(-N_1)}(\psi_1) \cdots V_{(-N_n)}(\psi_n)\,\Omega \tag{35} \]
with $\sum_i h_{\psi_i} \le g$. Clearly $H = \cup_g H^{(g)}$ (since every $\Psi$ has at least the trivial representation $\Psi = V_{(-1)}(\Psi)\Omega$, so that if $\Psi$ is homogeneous we have $\Psi \in H^{(h_\Psi)}$).

Two properties of this filtration will be useful in what follows. First, commutator terms always have lower grade: more precisely, let $\Psi \in H$ be some state of the form (35), with $\sum_i h_{\psi_i} \le g$, and let $\Psi^R$ be the state obtained from $\Psi$ by exchanging two adjacent modes in (35). Then $\Psi - \Psi^R \in H^{(g-1)}$, as follows readily from Lemma 5. Second, elements of $O_{(\infty,\infty)}$ decrease the grade: again let $\Psi \in H$ be of the form (35), with $\sum_i h_{\psi_i} \le g$, but this time with the additional stipulation that some $\psi_i \in O_{(\infty,\infty)}$, i.e. $\psi_i = V_{(-M)}(\rho)\chi$, $M > 1$. Then using Lemma 6 we find that $\Psi \in H^{(g-1)}$, since the state $V_{(-M)}(\rho)\chi$ is of weight $h_\chi + h_\rho + (M - 1)$.

For any pair $(g, N)$ of nonnegative integers we now consider the proposition:

Inductive Hypothesis. The space $H^{(g)}$ is spanned by states of the form
\[ V_{(-N_1)}(W_{i_1}) \cdots V_{(-N_n)}(W_{i_n})\,\Omega, \tag{36} \]
where $N_1 \ge N_2 \ge \cdots \ge N_n > 0$, $\sum_j h_{W_{i_j}} \le g$, and $N_i = N_{i+1}$ is allowed only for $N_i > N$.

We consider pairs to be ordered lexicographically: so $(g, N) < (g', N')$ if either $g < g'$, or $g = g'$ and $N < N'$. Then the set of pairs is well ordered (every non-empty subset has a smallest member). So we can proceed by induction: fixing $(g, N)$ we assume the hypothesis holds for all smaller pairs and establish it for $(g, N)$. In particular, the inductive hypothesis means the proposition is true for $(g - 1, N)$ so that every $\Psi \in H^{(g-1)}$ can be expressed in the claimed form (this is true even for $g = 0$ since in that case $H^{(g-1)} = 0$).

As remarked above, provided we begin with monomials (35) with $\sum_i h_{\psi_i} \le g$, commutator terms and terms involving states in $O_{(\infty,\infty)}$ will always be in $H^{(g-1)}$; so in trying to reduce some state (35) with $\sum_i h_{\psi_i} \le g$ to the claimed form we are always free to reorder modes and to replace any $V_{(M)}(\psi)$ by $V_{(M)}(W)$ (here and below, we suppress the index on $W_i$, which plays no role).

We consider separately the pairs $(g, N)$ with $N = 0$. In this case, given an element of $H^{(g)}$ of the form (35), we can put it in the claimed form simply by reordering modes into descending order and replacing all $\psi_i$ by $W$. (If any mode $V_{(M)}(W)$ with $M \ge 0$ appears, it will annihilate the vacuum after the reordering.)

Now suppose $N > 0$ and consider $\Psi$ of the form (35) with $\sum_i h_{\psi_i} \le g$. Using the inductive hypothesis applied to $(g, N - 1)$ we can write $\Psi$ as a sum of states of the form
\[ V_{(-M_1)}(W) \cdots V_{(-M_m)}(W)\,[V_{(-N)}(W)]^{s}\, V_{(-L_1)}(W) \cdots V_{(-L_l)}(W)\,\Omega, \tag{37} \]
where $M_1 \ge \cdots \ge M_m > N > L_1 > \cdots > L_l > 0$, $s \ge 0$. If $s < 2$ then (37) is already a state of the desired sort. If $m \ne 0$ then the expression $[V_{(-N)}(W)]^{s} \cdots$ is in $H^{(g-1)}$ and we can use the inductive hypothesis applied to $(g - 1, N)$ to replace it, obtaining a sum of expressions which have no repeated indices at or below $N$. On the other hand, if $m = 0$ and $s \ge 2$ then we use Lemma 7 to replace the initial pair $V_{(-N)}(W)V_{(-N)}(W)$. This replacement generates two sorts of terms: first, it generates
\[ V_{(-2N+1)}(\psi)\,[V_{(-N)}(W)]^{s-2}\, V_{(-L_1)}(W) \cdots V_{(-L_l)}(W)\,\Omega, \tag{38} \]
second, it generates
\[ V_{(-N-K)}(\psi)\,V_{(-N+K)}(\chi)\,[V_{(-N)}(W)]^{s-2}\, V_{(-L_1)}(W) \cdots V_{(-L_l)}(W)\,\Omega, \tag{39} \]
where $K > 0$ (using our freedom to reorder modes). As usual we are free to replace $\psi$, $\chi$ by $W$ everywhere. Now omitting the first mode from (38) or (39) produces a state in $H^{(g-1)}$ (unless the first $W$ is actually the vacuum, which can happen in (38) in the special case $N = 1$; we treat this case separately below). Using the inductive hypothesis for $(g - 1, N)$ we then rewrite this state in terms of monomials (36) with no repeated indices at or below $N$. This yields the desired result, since $2N - 1$ and $N + K$ are both greater than $N$, so that re-attaching the omitted mode does not generate a repeat at or below $N$.

It only remains to consider (38) in the special case $N = 1$. In this case we can rewrite that term simply as $[V_{(-1)}(W)]^{s-1}\Omega$, and repeat the process until we are left with $V_{(-1)}(W)\Omega$. This completes the proof of the inductive hypothesis for all $(g, N)$.

To complete the proof of the proposition we use the fact that $H$ is graded by conformal weight, $H = \oplus_{h\ge 0} H_h$, where $H_h$ consists of states of weight $h$. It is therefore sufficient to show that each $H_h$ is spanned by states (33) with $N_1 > N_2 > \cdots > N_n > 0$. But this follows directly from the inductive hypothesis together with the fact that the conformal weight of the state in (36) is greater than or equal to $\sum_j (N_j - 1)$; thus if (36) is of weight $h$, none of the $N_i$ can be greater than $h + 1$, so the result follows by choosing $N = h + 1$ and sufficiently large $g$ in the inductive hypothesis. This completes the proof.

We remark that the spanning set given by Proposition 8 is not actually a basis; this can be seen already for the minimal model with $c = -22/5$, for which the set $\{W_i\}$ can be taken to be $\{\Omega, L_{-2}\Omega\}$. Then (33) includes both $L_{-3}L_{-2}\Omega$ and $L_{-5}\Omega$, but in fact these two states are linearly dependent. Nevertheless, Proposition 8 is a very useful tool as we shall see momentarily.

Most of the known conformal field theories are generated by a finite set of quasiprimary fields, and are indeed what is called finite W-algebras. More precisely, a vertex operator algebra is a finite W-algebra if it contains a finite set of states $W_i \in H$, $i = 1, \ldots, n$, such that $H$ is spanned by states of the form (36), where $N_1 \ge N_2 \ge \cdots \ge N_n > 0$ and $i_j \ge i_{j+1}$ whenever $N_j = N_{j+1}$. It now follows directly from Proposition 8 that

Corollary 9. If $A_{[2]}$ is finite-dimensional, then the vertex operator algebra is a finite W-algebra.

Proof. We take the {Wi } ∈ H to be a set of representatives for H modulo O(∞,∞) and apply Proposition 8.
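The remark that the spanning set of Proposition 8 is not a basis for the $c = -22/5$ minimal model can be illustrated by a simple count. The sketch below assumes two facts that are not derived in the text: that $V_{(-N)}(L_{-2}\Omega) = L_{-N-1}$, so that the monomials (33) are $L_{-m_1}\cdots L_{-m_n}\Omega$ with $m_1 > \cdots > m_n \ge 2$, and that the vacuum character of this model is given by the standard product $\prod_{n\ge 0} (1-q^{5n+2})^{-1}(1-q^{5n+3})^{-1}$. The function names are illustrative.

```python
def distinct_partitions_min_part(weight, min_part):
    """Number of partitions of `weight` into distinct parts, each >= min_part.

    With min_part = 2 this counts the monomials (33) of the given weight
    (under the assumption stated in the lead-in).
    """
    counts = [1] + [0] * weight
    for part in range(min_part, weight + 1):
        # Iterate totals downwards so that each part is used at most once.
        for total in range(weight, part - 1, -1):
            counts[total] += counts[total - part]
    return counts[weight]


def lee_yang_vacuum_coefficient(weight):
    """Coefficient of q**weight in prod_{n>=0} 1/((1-q^(5n+2))(1-q^(5n+3)))."""
    counts = [1] + [0] * weight
    for part in range(1, weight + 1):
        if part % 5 in (2, 3):
            # Iterate totals upwards so that each part may be repeated.
            for total in range(part, weight + 1):
                counts[total] += counts[total - part]
    return counts[weight]


if __name__ == "__main__":
    for weight in range(9):
        print(
            weight,
            distinct_partitions_min_part(weight, 2),
            lee_yang_vacuum_coefficient(weight),
        )
```

Under these assumptions the two counts agree up to weight 4 and first differ at weight 5, where there are two spanning monomials but only a one-dimensional weight space, in line with the linear dependence of $L_{-3}L_{-2}\Omega$ and $L_{-5}\Omega$ noted above.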

It is sometimes assumed in the definition of a vertex operator algebra that each $L_0$ eigenspace is finite-dimensional. It now follows directly from Proposition 8 that this is automatic provided that $A_{[2]}$ is finite-dimensional.

Actually, Corollary 9 has been proven before in [24]. The generating set used by Li was somewhat different, however. He defined a space $C_1 \subset H$ and then showed that $H$ is spanned by all states $V_{n_1}(\psi_1) \cdots V_{n_m}(\psi_m)\Omega$, where the $\psi_i$ range over some complementary subspace to $C_1$. This result was then refined in [23] where it was observed that the modes can actually be taken in a fixed lexicographical order; furthermore it was shown that $H/C_1$ is a “minimal” generating set in a certain sense. These results are actually stronger than our Corollary 9 because finite-dimensionality of $H/C_1$ is much weaker than our hypothesis. On the other hand, our spanning set has the significant advantage that it allows us to prove the “no repeat” condition of Proposition 8, which will be critical in the arguments of Sects. 7 and 8.

The next result has also been obtained before, in [7]:

Proposition 10. If $A_{[2]}$ is finite-dimensional then the character
\[ \chi(q) = \mathrm{Tr}_H\, q^{L_0 - c/24}, \tag{40} \]
which is defined as a formal power series, converges for $0 < |q| < 1$.

Proof. Let us denote by $Q(n, k)$ the number of partitions of $n$ into integers of $k$ colours, with no integer appearing twice in the same colour. Then Proposition 8 implies that
\[ \mathrm{Tr}_H\, q^{L_0} \le \sum_{n\ge 0} q^n\, Q(n, k) = \prod_{n>0} (1 + q^n)^k, \tag{41} \]
where the inequality holds for each coefficient of the power series and hence for real positive $q$. (We set $k = \dim A_{[2]} - 1$ rather than $k = \dim A_{[2]}$ because we can always choose one of the $W_i$ to be $\Omega$, and $V_{(-N)}(\Omega) = \delta_{N,1}\,\mathbf{1}$.) The right-hand side converges for $0 < |q| < 1$ since the modulus of its logarithm is bounded by
\[ k \sum_{n=1}^{\infty} |\log(1 + q^n)| \le k \sum_{n=1}^{\infty} \frac{|q|^n}{1 - |q|^n} \le \frac{k}{1 - |q|} \sum_{n=1}^{\infty} |q|^n. \tag{42} \]
By the comparison test this then implies the convergence of the character $\chi(q)$ for $0 < |q| < 1$.
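As a numerical illustration of the proof of Proposition 10, the following sketch computes the counting function $Q(n, k)$ as the coefficients of $\prod_{n>0}(1+q^n)^k$ and compares the logarithm of the product with the last bound in (42), which sums to $k|q|/(1-|q|)^2$. The function names and the truncation parameter `terms` are illustrative choices, and the logarithm is evaluated only for real $0 < q < 1$.

```python
import math


def coloured_distinct_partitions(max_weight, colours):
    """Q(n, k) for n = 0..max_weight: coefficients of prod_{n>0} (1 + q**n)**k."""
    coeffs = [1] + [0] * max_weight
    for part in range(1, max_weight + 1):
        for _ in range(colours):
            # Multiply the truncated series by (1 + q**part) once per colour.
            for total in range(max_weight, part - 1, -1):
                coeffs[total] += coeffs[total - part]
    return coeffs


def log_product(q, colours, terms=200):
    """Partial sum of k * sum_{n>=1} log(1 + q**n) for real 0 < q < 1."""
    return colours * sum(math.log1p(q ** n) for n in range(1, terms + 1))


def log_product_bound(q, colours):
    """Closed form of the last expression in (42): k * |q| / (1 - |q|)**2."""
    r = abs(q)
    return colours * r / (1 - r) ** 2


if __name__ == "__main__":
    print(coloured_distinct_partitions(10, 1))
    print(log_product(0.5, 3), log_product_bound(0.5, 3))
```

For $k = 1$ the coefficients are the usual numbers of partitions into distinct parts; larger $k$ corresponds to a larger generating set $\{W_i\}$.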

We remark that by similar techniques to those used in the proof of Proposition 8 one can show that $H^R$ is spanned by the states of the form (see also [23] for a similar argument)
\[ V_{-N_1}(W_{i_1}) \cdots V_{-N_n}(W_{i_n})\,U_i, \tag{43} \]
where $U_i$ runs over a basis of the highest weight space $R$ of $H^R$, and $N_1 \ge N_2 \ge \cdots \ge N_n > 0$. If the representation in question is irreducible, $\dim A_{[2]} < \infty$ implies that $R$ is finite-dimensional, and we can bound the character of the representation $H^R$ (defined in analogy to (40)) by
\[ \chi_R(q) \le (\dim R)\, q^{-c/24} \prod_{n=1}^{\infty} (1 - q^n)^{-k}. \tag{44} \]
This is again sufficient to prove the convergence of these characters for $0 < |q| < 1$.

7. Nahm’s Conjecture

In this section we will be exploring some further consequences of the assumption that $A_{[2]}$ is finite-dimensional. We remark that results similar to those appearing in this section have been proven in [24], under the assumption that $L_0$ acts semisimply on all weak modules. This assumption is somewhat difficult to check in practice, however, and in any case is strictly stronger than finite-dimensionality of $A_{[2]}$.$^2$ We shall first prove that every conformal field theory for which $A_{[2]}$ is finite-dimensional possesses only finitely many n-point functions. Given Theorem 1 this statement follows from the following observation.

Theorem 11. Suppose $A_{[2]}$ is finite-dimensional. Then all $A_u$ are finite-dimensional.

Proof. By Lemma 3 we see that it is sufficient to show that all $A_{(\infty^k)}$ are finite-dimensional. By definition,
\[ O_{(\infty^k)} = \mathrm{Span}\{V_{(-M)}(\rho)\chi : \rho \in H,\ \chi \in H,\ M > (k - 2)h_\rho + 1\}. \tag{45} \]
Now consider the spanning set for $H$ provided by Proposition 8. Since $A_{[2]}$ is assumed finite-dimensional we can choose the set $\{W_i\}$ to be finite. So $H$ is spanned by monomials
\[ V_{(-N_1)}(W_{i_1}) \cdots V_{(-N_n)}(W_{i_n})\,\Omega, \tag{46} \]
where $N_1 > \cdots > N_n > 0$. But if $N_1 > (k - 2)\max\{h_{W_i}\} + 1$ then the state (46) is in $O_{(\infty^k)}$. This leaves us only finitely many choices for the $N_i$, which gives a finite spanning set for $H/O_{(\infty^k)}$, completing the proof.
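The counting step at the end of the proof of Theorem 11 can be made explicit. The sketch below enumerates all monomials of the form (46) whose largest mode number does not exceed the cutoff $(k-2)\max\{h_{W_i}\} + 1$; with $d$ generators and cutoff $B$ there are $(1+d)^B$ of them, which is an upper bound on $\dim A_{(\infty^k)}$ rather than its actual value. The generator labels `"W1"`, `"W2"` and the function names are hypothetical placeholders.

```python
from itertools import combinations, product


def mode_cutoff(k, max_weight):
    """Largest N_1 not forced into O_(inf^k): N_1 <= (k - 2) * max h_W + 1."""
    return (k - 2) * max_weight + 1


def spanning_monomials(max_mode, generators):
    """All monomials (46) with max_mode >= N_1 > ... > N_n > 0.

    Each monomial is a tuple of (mode number, generator label) pairs;
    the empty tuple stands for the vacuum itself.
    """
    monomials = []
    for length in range(max_mode + 1):
        # combinations() of a decreasing range yields strictly decreasing tuples.
        for modes in combinations(range(max_mode, 0, -1), length):
            for labels in product(generators, repeat=length):
                monomials.append(tuple(zip(modes, labels)))
    return monomials


if __name__ == "__main__":
    generators = ["W1", "W2"]  # hypothetical generator labels
    cutoff = mode_cutoff(k=3, max_weight=2)
    print(cutoff, len(spanning_monomials(cutoff, generators)))
```

The count grows exponentially in the cutoff, so this bound is useful for establishing finiteness rather than for estimating the dimension.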

2 The triplet algebra [17] satisfies the $C_2$ condition, but it possesses representations for which $L_0$ does not act semisimply.

Now we are in a position to prove Nahm’s conjecture. Let $H^R$ be some irreducible highest weight representation of the conformal field theory. In [30] Nahm defined the special subspace $H^R_s$ (as discussed in Sect. 4) and defined $H^R$ to be quasirational if $H^R_s$ is finite-dimensional. Nahm conjectured that the rationality of the theory implies that all irreducible representations are quasirational. We shall now prove this statement under the condition that $A_{[2]}$ is finite-dimensional. In fact, we shall prove a slightly stronger statement, namely that all quotient spaces $A^R_{[n]}$ are finite-dimensional. This implies that all representations are quasirational since $\dim A^R_{[2]} \ge \dim H^R_s$, because
\[ A^R_{[2]} \cong A^R_{(\infty,\infty)} = H^R / \mathrm{Span}\{V_n(\psi)\chi : n < -h_\psi,\ \psi \in V,\ \chi \in H^R\}. \tag{47} \]

The motivation for our proof comes from the interpretation of the quotients $A_n$ as spaces of correlation functions. From Theorem 11 and Lemma 3 we know that $A_{[2]}$ finite-dimensional implies $A_{[p,1]}$ finite-dimensional for all $p \ge 1$; and from Theorem 1 we know that $A^*_{[p,1]}$ can be understood as the space of correlation functions $\langle \cdots \rangle_\eta$ with the property that
\[ \mathrm{div}\, \langle V(\psi, z) \rangle_\eta\, dz^{\otimes h_\psi} \ge -p\,h_\psi\,[u_1] - h_\psi\,[u_2]. \tag{48} \]
But this analytic structure is exactly what we would expect from correlation functions that are induced by a single highest weight state at $u_2$ and a state at $u_1$ that is annihilated

by all $V_n(\psi)$ with $n > (p - 1)h_\psi$. If we choose $u_1 = \infty$, $u_2 = 0$, the state at $u_1 = \infty$ defines a linear functional on the Fock space $H^R$ at $u_2 = 0$. The property that the state at $u_1$ is annihilated by the modes with $n > (p - 1)h_\psi$ implies then that this functional vanishes on $O^R_{(\infty^p)} \subset H^R$, and therefore defines a functional on $A^R_{(\infty^p)}$. We therefore expect that we can construct an element of $A^*_{[p,1]}$ from a highest weight state $U$ in the representation $R$, and an element $\eta \in (A^R_{[p]})^*$; more specifically, if we evaluate the linear functional in $A^*_{[p,1]}$ on $\chi \in H$ (now regarding $H$ as being placed at $1 \in \mathbb{P}$) we should have
\[ \langle \eta(\infty)\,\chi(1)\,U(0) \rangle = \sum_{n \in \mathbb{Z}} \langle \eta(\infty)\,(V_{-n}(\chi)U)(0) \rangle \tag{49} \]
\[ \phantom{\langle \eta(\infty)\,\chi(1)\,U(0) \rangle} = \sum_{n=0}^{(p-1)h_\chi} \langle \eta(\infty)\,(V_{-n}(\chi)U)(0) \rangle, \tag{50} \]
where the terms with $n < 0$ are cut off by the highest weight property of $U$ and the terms with $n > (p - 1)h_\chi$ are cut off by the assumption that $\eta$ vanishes on $O^R_{(\infty^p)}$. This formula motivates the proof of:

Lemma 12. Let $H^R$ be any representation of the conformal field theory that is generated from a highest weight state $U$. Then there is an injection
\[ \sigma : (A^R_{[p]})^* \to (A_{[p,1]})^*. \tag{51} \]

Proof. We realize $A^R_{[p]}$ as $A^R_{(\infty^p)}$ and $A_{[p,1]}$ as $A_{(\infty^p,-1)}$. Then define $\sigma$, as suggested above, by the formula
\[ [\sigma(\eta)](\chi) = \eta(V(\chi, 1)U) = \sum_{n=-(p-1)h_\chi}^{0} \eta(V_n(\chi)U). \tag{52} \]

In order to check that $\sigma(\eta)$ annihilates $O_{(\infty^p,-1)}$, we observe from (7) that $O_{(\infty^p,-1)}$ is generated by the states of the form $V^{(M)}_{(\infty^p,-1)}(\psi)\chi$, where $M > 0$ and
\[ V^{(M)}_{(\infty^p,-1)}(\psi) = \oint_0 \frac{d\zeta}{\zeta^{M+1}}\, V\!\left( \left( \frac{\zeta + 1}{\zeta^{p-1}} \right)^{L_0} \psi,\ \zeta \right). \tag{53} \]
It is therefore sufficient to show that for $M > 0$,
\[ \eta\Bigl( V\bigl(V^{(M)}_{(\infty^p,-1)}(\psi)\chi, 1\bigr)\,U \Bigr) = 0, \tag{54} \]
provided that $\eta \in (A^R_{(\infty^p)})^*$. Expanding out (53) in terms of modes we have
\[ V\bigl(V^{(M)}_{(\infty^p,-1)}(\psi)\chi, 1\bigr) = \sum_{s=0}^{h_\psi} \binom{h_\psi}{s}\, V\bigl(V_{(-(p-1)h_\psi + s - M - 1)}(\psi)\chi, 1\bigr). \tag{55} \]
Since the vertex operator is evaluated at $z = 1$, we can rewrite it in terms of a sum over all modes $V_{(r)}(\cdot)$. We then collect together all those terms that have the same conformal

weight: this amounts to choosing $r$ (as a function of $s$) as $r = p\,h_\psi + h_\chi + M - s + t$, where now $t$ labels the different values for the conformal weight of the resulting state. We then apply Lemma 6 to $V_{(h_\chi + p h_\psi + M - s + t)}\bigl(V_{(-(p-1)h_\psi + s - M - 1)}(\psi)\chi\bigr)$. The first sum contains only terms of the form $V_{(-R)}(\psi)\phi$ with $R \ge (p - 1)h_\psi - s + M + 1$, for which $\eta$ vanishes by assumption. The second sum gives rise to
\[ (-1)^{M + (p-1)h_\psi} \sum_{s=0}^{h_\psi} (-1)^s \binom{h_\psi}{s} \sum_{L \ge 0} \binom{(p-1)h_\psi + M - s + L}{L}\, V_{(h_\psi + h_\chi - 1 + t - L)}(\chi)\, V_{(L)}(\psi). \tag{56} \]
All terms with $L \ge h_\psi$ vanish since $V_{(L)}(\psi)U = 0$ as $U$ is a highest weight state. It therefore only remains to check that all the other terms vanish, i.e. that
\[ \sum_{s=0}^{h_\psi} (-1)^s \binom{h_\psi}{s} \binom{(p-1)h_\psi + M - s + L}{L} = 0 \qquad \text{for } L = 0, \ldots, h_\psi - 1. \tag{57} \]
In order to prove this identity, we observe that
\[
\begin{aligned}
\sum_{s=0}^{h_\psi} (-1)^s \binom{h_\psi}{s} \sum_{L \ge 0} \binom{(p-1)h_\psi + M - s + L}{L}\, u^L
&= \sum_{s=0}^{h_\psi} (-1)^s \binom{h_\psi}{s} \frac{1}{(1-u)^{(p-1)h_\psi + M - s + 1}} \\
&= \frac{1}{(1-u)^{(p-1)h_\psi + M + 1}} \sum_{s=0}^{h_\psi} (-1)^s \binom{h_\psi}{s} (1-u)^s \\
&= \frac{u^{h_\psi}}{(1-u)^{(p-1)h_\psi + M + 1}}.
\end{aligned} \tag{58}
\]
Thus the left-hand side of (58) does not have any powers of $u$ below $h_\psi$, and therefore (57) holds.

To complete the proof we must check that $\sigma$ is injective, i.e. that $\sigma(\eta) = 0$ implies $\eta = 0$. First we show that if $\sigma(\eta) = 0$ then in fact $\eta(V_n(\chi)U) = 0$ for all $\chi$: to do this, we use the fact that $\sigma(\eta)$ annihilates the states $(-1)^m L_{-1}^m \chi$ for all $m$. Using the formula $V_n(L_{-1}\chi) = -(n + h_\chi)V_n(\chi)$ we then have
\[ 0 = \sum_{n=-(p-1)(h_\chi + m)}^{0} \eta\bigl( V_n((-1)^m L_{-1}^m \chi)\,U \bigr) \tag{59} \]
\[ \phantom{0} = \sum_{n=-(p-1)h_\chi}^{0} \Bigl[ \prod_{r=0}^{m-1} (n + h_\chi + r) \Bigr]\, \eta(V_n(\chi)U), \tag{60} \]
where the terms with $n < -(p-1)h_\chi$ have vanished because $\eta \in A^*_{(\infty^p)}$. Using (60) for $0 \le m \le (p-1)h_\chi$ we have a system of $(p-1)h_\chi + 1$ vanishing linear combinations of

the $(p-1)h_\chi + 1$ values $\eta(V_n(\chi)U)$. The coefficient matrix is (with $0 \le m \le (p-1)h_\chi$, $-(p-1)h_\chi \le n \le 0$)
\[ C_{mn} = \prod_{r=0}^{m-1} (n + h_\chi + r), \tag{61} \]
and by subtracting from each row suitable multiples of the previous rows we can reduce this to the Vandermonde matrix
\[ C'_{mn} = n^m, \tag{62} \]
thus showing that its determinant is nonzero. Hence the only solution is $\eta(V_n(\chi)U) = 0$ for $-(p-1)h_\chi \le n \le 0$. For $n > 0$ the vanishing is automatic because $U$ is highest weight, and for $n < -(p-1)h_\chi$ the vanishing is a consequence of $\eta \in A^*_{(\infty^p)}$, so we get $\eta(V_n(\chi)U) = 0$ for all $n$. But in fact the states $V_n(\chi)U$ span $H^R$, according to Corollary 3.13 of [27] (one can also check this result directly; it essentially boils down to the assertion that if a correlation function is nonzero then some coefficient in its Laurent expansion is nonzero). So if $\eta$ annihilates all such states then $\eta = 0$. This completes the proof.
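Two computational steps in the proof of Lemma 12 lend themselves to a direct check: the binomial identity (57) and the claim that the coefficient matrix (61) has the same nonzero determinant as the Vandermonde matrix (62). The following sketch tests both for small parameter values using exact arithmetic. It is restricted to $p \ge 2$ so that every binomial coefficient in (57) has a non-negative upper entry; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import comb


def identity_57_lhs(p, h, M, L):
    """Left-hand side of (57) for h = h_psi; assumes p >= 2 and M > 0."""
    a = (p - 1) * h + M
    return sum((-1) ** s * comb(h, s) * comb(a - s + L, L) for s in range(h + 1))


def coefficient_matrix(p, h):
    """C_{mn} of (61): rows m = 0..K, columns n = -K..0, with K = (p - 1) * h."""
    K = (p - 1) * h
    rows = []
    for m in range(K + 1):
        row = []
        for n in range(-K, 1):
            entry = 1
            for r in range(m):
                entry *= n + h + r
            row.append(entry)
        rows.append(row)
    return rows


def vandermonde_matrix(p, h):
    """The matrix n**m of (62) on the same index ranges."""
    K = (p - 1) * h
    return [[n ** m for n in range(-K, 1)] for m in range(K + 1)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


if __name__ == "__main__":
    print([identity_57_lhs(2, 3, 1, L) for L in range(4)])
    print(determinant(coefficient_matrix(2, 2)), determinant(vandermonde_matrix(2, 2)))
```

The determinants agree because row $m$ of (61) is a monic polynomial of degree $m$ in $n$, which is exactly what the row reduction in the proof exploits.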

Combining Lemma 12 and Theorem 11 we now obtain the desired result:

Theorem 13. Suppose $A_{[2]}$ is finite-dimensional. Then every irreducible highest weight representation of the conformal field theory is quasirational.

Proof. Using Theorem 11 and Lemma 3 we see that $A_{[p,1]}$ is finite-dimensional for any $p \ge 1$. Then from Lemma 12 it follows that each $A^R_{[p]}$ is finite-dimensional, and the case $p = 2$ implies that the special subspaces are finite-dimensional.

Finally we observe that the tools we have developed here also allow us to prove that the $C_2$ condition implies the finiteness of the fusion rules:

Corollary 14. Suppose $A_{[2]}$ is finite-dimensional and let $H^{R_i}$, $H^{R_j}$ and $H^{R_k}$ be three highest weight representations of the conformal field theory. Then the fusion rule coefficient $N_{ij}^k$ is finite.

Proof. From the perspective of correlation functions what we are claiming is that there are only finitely many ways to couple the three highest weight representations; this follows from the finite-dimensionality of $A^*_{[1,1,1]}$, and hence is a consequence of Theorem 11. On the other hand, there are also more algebraic approaches to fusion products [11, 25]; in lieu of proving that these approaches are equivalent, we remark that it is known [11, 26] that
\[ N_{ij}^k \le \dim \mathrm{Hom}_A\bigl( A^{R_i}_{[1,1]} \otimes_A A^{R_j}_{[1]},\ A^{R_k}_{[1]} \bigr). \tag{63} \]
Since all spaces involved are finite-dimensional we get the desired result.

8. A Bound on the Central Charge

Up to now we have analysed what follows from the $C_2$ condition of Zhu. As we have seen, this assumption is already sufficient to prove Nahm’s conjecture. If we assume in addition that $A$ is semisimple, then using Zhu’s result about the modular properties of the characters (see (ii) in Sect. 5) we can in some cases derive a bound on the effective central charge of the W-algebra. If $c$ denotes the central charge of the Virasoro algebra, the effective central charge, $\tilde{c}$, is defined to be $\tilde{c} = c - 24h_{\min}$, where $h_{\min}$ is the smallest conformal weight of any state in any (irreducible) highest weight representation of the theory. It is proven in [35] that for $A$ semisimple the characters close under modular transformations: namely, if we write $q = e^{2\pi i\tau}$ and $\tilde{q} = e^{-2\pi i/\tau}$, then we have
\[ \chi_0(\tilde{q}) = \sum_R a_R\, \chi_R(q), \tag{64} \]
where the $a_R$ are some coefficients, $\chi_0 = \mathrm{Tr}_H\, q^{L_0 - c/24}$ is the character of the vacuum representation, $\chi_R$ is the character of the representation $H^R$, and the sum runs over all irreducible representations of $A$. Let us make the additional assumption that $a_{R_{\min}} \ne 0$ for the representation $R_{\min}$ attaining the minimum weight $h_{\min}$ (some arguments suggesting that this is a natural condition are given in [14, 29]). With this assumption we can prove

Proposition 15. Suppose $A_{[2]}$ is finite-dimensional, $A$ is semisimple, and $a_{R_{\min}} \ne 0$. Then
\[ \tilde{c} \le \frac{\dim A_{[2]} - 1}{2}. \tag{65} \]

Proof. As in the proof of Proposition 10, let $k = \dim A_{[2]} - 1$, and define
\[ f_2(q) = \sqrt{2}\, q^{1/24} \prod_{n=1}^{\infty} (1 + q^n). \tag{66} \]
This notation goes back to [32], although we have deviated slightly from their convention by replacing $q^2$ with $q$; we could also write $f_2$ in terms of conventional theta functions. In terms of this function we can then rewrite (41) as
\[ \mathrm{Tr}_H\, q^{L_0} \le 2^{-k/2}\, q^{-k/24}\, f_2(q)^k. \tag{67} \]
Here and in the following we shall always assume that $0 < q < 1$. Next we follow closely an argument from [8], using the modular transformation properties of characters described above. Using the modular transformation properties of $f_2$ (see for example [32]) and (67) we have
\[ \chi_0(\tilde{q}) \le \tilde{q}^{-(k+c)/24}\, 2^{-k/2}\, f_2(\tilde{q})^k \tag{68} \]
\[ \phantom{\chi_0(\tilde{q})} = \tilde{q}^{-(k+c)/24}\, 2^{-k/2}\, f_4(q)^k, \tag{69} \]
where
\[ f_4(q) = q^{-1/48} \prod_{n=1}^{\infty} \bigl(1 - q^{\,n - 1/2}\bigr). \tag{70} \]

In the limit $\tau \to i\infty$ ($q \to 0$, $\tilde{q} \to 1$), (64) implies that
\[ \chi_0(\tilde{q}) = q^{\,h_{\min} - c/24}\,(a + o(1)), \tag{71} \]
where $a \ne 0$ by assumption, while from (68) we get
\[ \chi_0(\tilde{q}) \le 2^{-k/2}\bigl(q^{-1/48} + O(q)\bigr)^k = 2^{-k/2}\, q^{-k/48}\,(1 + O(q)). \tag{72} \]
Comparing (71) and (72) we get the desired result $\tilde{c} \le k/2$.
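The bound of Proposition 15 is easy to evaluate in concrete cases. The sketch below computes the effective central charge $\tilde{c} = c - 24h_{\min}$ and the bound (65) with exact rational arithmetic. The sample values are assumptions supplied for illustration: for the $c = -22/5$ minimal model we take $h_{\min} = -1/5$ and $\dim A_{[2]} = 2$ (corresponding to the representatives $\{\Omega, L_{-2}\Omega\}$ mentioned after Proposition 8). The function names are illustrative.

```python
import math
from fractions import Fraction


def effective_central_charge(c, h_min):
    """Effective central charge c - 24 * h_min."""
    return Fraction(c) - 24 * Fraction(h_min)


def central_charge_bound(dim_a2):
    """Right-hand side of (65): (dim A_[2] - 1) / 2."""
    return Fraction(dim_a2 - 1, 2)


def minimal_dim_a2(c_eff):
    """Smallest integer dim A_[2] compatible with (65), i.e. dim >= 2 * c_eff + 1."""
    return math.ceil(2 * Fraction(c_eff) + 1)


if __name__ == "__main__":
    c_eff = effective_central_charge(Fraction(-22, 5), Fraction(-1, 5))
    print(c_eff, central_charge_bound(2), c_eff <= central_charge_bound(2))
    print(minimal_dim_a2(8))
```

Read in the other direction, (65) gives a lower bound on $\dim A_{[2]}$ once $\tilde{c}$ is known, which is what `minimal_dim_a2` evaluates.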

Incidentally, this proposition makes it clear that the dimension of $A_{[2]}$ will often be bigger than that of $A_{[1,1]}$. For example, for a self-dual theory we have $\dim A_{[1,1]} = 1$, but the proposition implies that $\dim A_{[2]} \ge 2\tilde{c} + 1$. (For the $e_8$ theory at level 1, $\tilde{c} = 8$, and thus we have that $\dim A_{[2]} \ge 17$. As a matter of fact, we have checked that $\dim A_{[2]} \ge 4124$.)

To a physicist, the above argument can be explained as follows. Recall that a basis for a theory of $k$ free R fermions is given by the states
\[ \psi^{i_1}_{-N_1} \cdots \psi^{i_n}_{-N_n}, \tag{73} \]
where $N_1 > \cdots > N_n > 0$. Comparing this with (33) one might loosely say that the number of degrees of freedom of our theory is bounded above by the number of degrees of freedom in a theory of $k$ free fermions. The effective central charge measures in essence the number of degrees of freedom; since every free fermion contributes 1/2, this explains the bound $\tilde{c} \le k/2$.

The original argument of [8] was very similar to that presented above, except that they began with a spanning set (33) where $N_1 \ge N_2 \ge \cdots \ge N_n$ and repeats are allowed. In essence, they were therefore comparing the theory to a theory of $m$ free bosons (where $m$ is the dimension of the generating set). The modular argument then involved the $\eta$ function (rather than the $f_2$ function), and the bound they obtained was $\tilde{c} < m$. For theories for which an explicit (small) generating set is known, their bound tends to be stronger than (65), although not even this is the case in general: for the $c = -22/5$ minimal model, our bound is $\tilde{c} \le 1/2$ while the bound in [8] is $\tilde{c} < 1$; in actual fact $\tilde{c} = 2/5$ for this example. At any rate, Proposition 15 gives a bound on the effective central charge in terms of an intrinsic quantity of the vertex operator algebra that can be easily determined.

9. An Interpretation of $A_{[p,1]}$

Finally we would like to give a more precise interpretation of the spaces $A_{[p,1]}$: namely, we show that any correlation function of the type described by $A^*_{[p,1]}$ is in fact obtained by inserting one highest weight state and one state annihilated by all $V_n(\psi)$ with $n > (p - 1)h_\psi$. To prove this result, strengthening Lemma 12, we will need to make a rather strong assumption on the theory: namely, we assume that every weak module is completely reducible into irreducible modules (this property has been called regularity in the literature on vertex operator algebras; in particular, it was shown in [24] that regularity actually implies $\dim A_{[2]} < \infty$). Then we can prove

Proposition 16. Suppose every weak module is completely reducible. Then
\[ \bigoplus_R (A^R_{[p]})^* \otimes R \cong A^*_{[p,1]}, \tag{74} \]
where the sum runs over all irreducible representations of Zhu’s algebra $A$.

Proof. We claim that the isomorphism is implemented by the map
\[ [\sigma(\eta \otimes U)](\chi) = \sum_{n=-(p-1)h_\chi}^{0} \eta(V_n(\chi)U). \tag{75} \]
The calculation in the proof of Lemma 12 demonstrates that $\sigma$, as given in (75), is well defined. In order to prove that $\sigma$ is injective, we note that the argument in the proof of Lemma 12 shows that $\sigma(\eta \otimes U) = 0$ only if $\eta \otimes U = 0$. Now suppose $\sigma$ is not injective. Then there exists some linear dependence
\[ \sum_{i=1}^{m} \sigma(\eta_i \otimes U_i) = 0. \tag{76} \]
Choose such a dependence with the smallest possible $m$; we have already observed that $m = 1$ is impossible. If $m > 1$ then $U_1$ and $U_2$ cannot be linearly dependent (else we could easily reduce $m$, contradicting the minimality). The complete reducibility implies that Zhu’s algebra is semisimple [35], and (24) then guarantees that there exists some $a \in A$ with $aU_1 = 0$, $aU_2 \ne 0$; equivalently, there exists some $\psi \in H$ such that $V_0(\psi)U_1 = 0$, $V_0(\psi)U_2 \ne 0$. Next we use Theorem 1 to identify $A^*_{(\infty^p,-1)}$ with a space of correlation functions. We can therefore re-express (76) as the statement that
\[ \sum_{i=1}^{m} \Bigl\langle \prod_j V(\psi_j, z_j) \Bigr\rangle_{\sigma(\eta_i \otimes U_i)} = 0, \tag{77} \]
for all $\psi_j$ and $z_j$. By taking a suitable contour integral this implies in particular that
\[ \sum_{i=1}^{m} \Bigl\langle \prod_j V(\psi_j, z_j)\, V_0(\psi) \Bigr\rangle_{\sigma(\eta_i \otimes U_i)} = 0, \tag{78} \]
and therefore that
\[ \sum_{i=2}^{m} \sigma(\eta_i \otimes aU_i) = 0, \tag{79} \]

contradicting the minimality of $m$. This completes the proof of the injectivity.

It remains to show that $\sigma$ is surjective. Because of Theorem 4 Zhu’s algebra $A$ acts on $A^*_{(\infty^p,-1)}$ via its action at $-1$, and we can therefore decompose $A^*_{(\infty^p,-1)}$ as
\[ A^*_{(\infty^p,-1)} = \bigoplus_R B^R_{[p]} \otimes R, \tag{80} \]
where $B^R_{[p]}$ denotes an as yet undetermined multiplicity space. Using Theorem 1 we can regard $A^*_{(\infty^p,-1)}$ as the space of correlation functions $\langle \prod_j V(\psi_j, z_j) \rangle_\eta$, satisfying the conditions
\[ \Bigl\langle \prod_j V(\psi_j, z_j)\, V_n(\psi) \Bigr\rangle_\eta = 0 \qquad \text{for } n > 0, \tag{81} \]
\[ \Bigl\langle V_n(\psi) \prod_j V(\psi_j, z_j) \Bigr\rangle_\eta = 0 \qquad \text{for } n < -(p - 1)h_\psi. \tag{82} \]
Using the decomposition (80), $B^R_{[p]}$ can then be regarded as the space of correlation functions for which the zero modes in (81) transform in the representation $R$ of $A$. Each element of $B^R_{[p]}$ defines a representation of the conformal field theory where the state at $-1$ is a highest weight state (that transforms in the representation $R$ under the action of the zero modes), whereas the state at $\infty$ is only annihilated by the modes $V_n(\psi)$ with $n > (p - 1)h_\psi$.

Now we would like to argue that each $\xi \in B^R_{[p]}$ actually defines a linear functional on $H^R$, the Fock space generated by the action of the modes on the highest weight state at $-1$. We might a priori worry that the correlation functions associated with $\xi$ did not respect the null-vector relations by which one quotients in the definition of $H^R$; indeed, in the definition of $H^R$ we divided out states that vanish in amplitudes involving an arbitrary number of vertex operators and a highest weight state in the (dual) representation, but now we are considering what seem to be more general amplitudes. To resolve this difficulty we use our extra assumption of complete reducibility. The condition (82) is sufficient to deduce that the Fock space that is generated by the action of the modes on the state at $\infty$ defines a weak module, and therefore must be completely reducible into a direct sum of irreducible highest weight representations. Thus in fact we are only considering amplitudes where, apart from an arbitrary number of vertex operators, we have a highest weight state at $\infty$, and therefore $\xi \in B^R_{[p]}$ indeed defines a linear functional on $H^R$. It follows from (82) that this functional vanishes on $O^R_{(\infty^p)}$, and hence that it can be regarded as a linear functional on $A^R_{(\infty^p)} \cong A^R_{[p]}$. It therefore follows that $B^R_{[p]} \cong (A^R_{[p]})^*$, and we have thus established the proposition.

Proposition 16 implies in particular that the dimension of the quotient spaces $A^R_{[p]}$ for each representation $H^R$ is bounded in terms of the quotient space $A_{[p+1]}$ of the vacuum representation. This result reflects the familiar fact that, for rational theories, the vacuum representation already contains a substantial amount of information about all representation spaces $H^R$.

10. Conclusions

In this paper we have proven the conjecture of Nahm that every representation of a rational conformal field theory is quasirational (Theorem 13). More specifically, we have shown that if the conformal field theory satisfies the $C_2$ condition of Zhu, i.e. if the space $A_{[2]}$ is finite-dimensional, then the quotient space $A^R_{[p]}$ of each highest weight representation $H^R$ is finite-dimensional for $p \ge 1$; this immediately implies that $H^R$ is quasirational. We have also shown that this implies that the theory has only finitely many n-point functions, and in particular that the fusion rules between irreducible representations are finite (Corollary 14). The main technical result of the paper is the spanning set for the vacuum representation of any conformal field theory (Proposition 8), from which

Rationality, Quasirationality and Finite W-Algebras

327

we have also been able to deduce various other properties of conformal field theories that satisfy the C2 condition of Zhu (Corollary 9 and Proposition 10). We have introduced systematically spaces Au that describe the correlation functions with k highest weight states at u1 , . . . , uk . Some of the structure of these spaces does not depend on whether the ui are pairwise distinct, and one may therefore hope that these spaces will be useful in extending the definition of conformal field theory to singular limits, as envisaged in the program of Friedan & Shenker [13]. In [31] it was shown that the finite-dimensionality of A[n] implies the existence of npoint functions satisfying the Knizhnik-Zamolodchikov equation. Given Theorem 11, it now follows that the existence of n-point functions already follows from the finite-dimensionality of A[2] . Similarly, the condition that A(∞k ) is finite-dimensional in Theorem 2 can now be relaxed to the assumption that A[2] is finite-dimensional. It may be possible to prove an inhomogeneous version of the finiteness lemma (Proposition 8). In particular, one may be able to prove that the finite dimensionality of A implies the finite dimensionality of all A[1,1,...,1] . This would go a certain way to proving (a version of) Zhu’s conjecture, that the finite dimensionality of Zhu’s algebra implies that the C2 condition is satisfied. However, it seems likely that this will require more sophisticated methods, since the conjecture apparently does not hold for meromorphic field theories (that are not conformal). Consider the theory for which V is spanned by states J a,i of grade 1, where a = 1, . . . , 248 labels the adjoint representation of e8 , and i ∈ I, where I is some countably infinite set. For any finite set of vectors in V we can define the amplitudes to be the products of the amplitudes that are associated to the different copies of the affine e8 theory at level 1. 
These amplitudes are well defined and satisfy all the conditions of [16] (except that the theory does not have a conformal structure and the weight spaces are not finite-dimensional). Since each $e_8$ level 1 theory is self-dual, it is easy to see that the same holds for the infinite tensor theory; thus $A$ is one-dimensional. However, the eigenspace at conformal weight 1 is infinite-dimensional, and Proposition 10 therefore implies that the $C_2$ condition cannot be satisfied. On the other hand, most of our arguments (in particular all of Sect. 6 and 7) do not require a conformal structure or the assumption that the $L_0$ eigenspaces are finite-dimensional.

A. The Action of Zhu's Algebra on $A_{\mathbf{u}}$

In this appendix we want to prove (21) and (22). Both these statements follow from straightforward calculations.

A.1. Proof of (21). Without loss of generality we may assume that $\psi_i$, $i = 1, 2$ are both vectors of definite conformal weight $h_i$. Using (9), we can then write the commutator $[V^{(0)}_{\mathbf{u}}(\psi_1), V^{(L)}_{\mathbf{u}}(\psi_2)]$ as

$$
\begin{aligned}
&\oint_0 \frac{d\zeta}{\zeta^{L+1}} \oint_\zeta \frac{dz}{z}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_2}
V(\psi_1,z)\,V(\psi_2,\zeta) \\
&\quad= \oint_0 \frac{d\zeta}{\zeta^{L+1}} \oint_\zeta \frac{dz}{z}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_2}
V\bigl(V(\psi_1,z-\zeta)\psi_2,\zeta\bigr) \\
&\quad= \sum_{m=0}^{h_1+h_2-1} \oint_0 \frac{d\zeta}{\zeta^{L+1}}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_2}
V\bigl(V_{m+1-h_1}(\psi_1)\psi_2,\zeta\bigr) \\
&\qquad\times \left[\oint_\zeta \frac{dz}{z}\,(z-\zeta)^{-m-1}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}\right].
\end{aligned}
\tag{83}
$$

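The integral in brackets is

$$
\frac{1}{m!}\,\frac{d^m}{dz^m}\left[\frac{1}{z}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}\right]_{z=\zeta}
= \frac{1}{m!}\sum_{s=0}^{m}\binom{m}{s}\,(-1)^s\,\frac{s!}{\zeta^{1+s}}\,
\frac{d^{m-s}}{dz^{m-s}}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}\Bigg|_{z=\zeta},
\tag{84}
$$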

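and the last derivative is of the form

$$
\frac{d^{m-s}}{dz^{m-s}}
\left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_1}\Bigg|_{z=\zeta}
= \left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_1-m+s}
\left[(m-s)!\binom{h_1}{m-s} + O(\zeta^{-1})\right],
\tag{85}
$$

where the last bracket consists of a finite sum of terms. Thus (84) becomes

$$
\begin{aligned}
&\sum_{s=0}^{m} C_s
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_1-m-1}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-1}}\right)^{1+s}
\bigl(1 + O(\zeta^{-1})\bigr) \\
&\quad= \left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_1-m-1}
\sum_{s=0}^{m} \tilde C_s \bigl(1 + O(\zeta^{-1})\bigr),
\end{aligned}
\tag{86}
$$

where $C_s$ and $\tilde C_s$ are some constants. Putting this back into (83) and observing that the conformal weight of $V_{m+1-h_1}(\psi_1)\psi_2$ is $h_1 + h_2 - m - 1$, we obtain the statement.

A.2. Proof of (22). We rewrite the left-hand side of (22) as

$$
\begin{aligned}
&\oint \frac{d\zeta}{\zeta} \oint \frac{dw}{w^{L+1}}\,(w+1)^{h_\psi}\,
V\!\left(\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{L_0} V(\psi,w)\chi,\ \zeta\right) \\
&\quad= \oint \frac{d\zeta}{\zeta} \oint \frac{dw}{w^{L+1}}\,(w+1)^{h_\psi}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_\psi+h_\chi}
V\!\left(V\!\left(\psi,\ w\,\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)\chi,\ \zeta\right) \\
&\quad= \oint_0 \frac{d\zeta}{\zeta} \oint_\zeta \frac{dz}{(z-\zeta)^{L+1}}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_\chi+L}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + (z-\zeta)\right)^{h_\psi}
V(\psi,z)\,V(\chi,\zeta),
\end{aligned}
\tag{87}
$$

where in the first two lines the integrals are taken over the region $|\zeta| > |w|$, and we have substituted, in the last line,

$$
z = w\,\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + \zeta .
$$

Using the usual contour deformation trick, the last line of (87) can be written as the difference of two contour integrals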

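$$
\begin{aligned}
&\oint\!\!\oint_{|z|>|\zeta|} d\zeta\,dz\;\frac{1}{\zeta}\,\frac{1}{(z-\zeta)^{L+1}}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_\chi+L}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + (z-\zeta)\right)^{h_\psi}
V(\psi,z)\,V(\chi,\zeta) \\
&\quad- \oint\!\!\oint_{|\zeta|>|z|} d\zeta\,dz\;\frac{1}{\zeta}\,\frac{1}{(z-\zeta)^{L+1}}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}}\right)^{h_\chi+L}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + (z-\zeta)\right)^{h_\psi}
V(\chi,\zeta)\,V(\psi,z).
\end{aligned}
\tag{88}
$$

The two terms can now be considered separately. In the second term we write

$$
\frac{1}{(z-\zeta)^{L+1}} = (-1)^{L+1}\,\frac{1}{\zeta^{L+1}}
\sum_{M=0}^{\infty}\binom{L+M}{M}\left(\frac{z}{\zeta}\right)^{M},
$$

and observe that

$$
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + (z-\zeta)\right)^{h_\psi}
= z^{h_\psi}\left(1 + \frac{c_1}{z} + O\!\left(\frac{z}{\zeta}\right)\right).
$$

The second term therefore consists of terms of the form $V^{(M)}_{\mathbf{u}}(\chi)\hat\phi$ with $M > 0$, and therefore can be dropped. In reaching this conclusion we have used that if $\phi$ is in the Fock space, only finitely many powers of $z/\zeta$ contribute. In the first term we now write

$$
\frac{dz}{(z-\zeta)^{L+1}} = \frac{dz}{z^{L+1}}
\sum_{M=0}^{\infty}\binom{L+M}{M}\left(\frac{\zeta}{z}\right)^{M},
$$

and observe that

$$
\begin{aligned}
\left(\frac{\prod_{j=2}^k (\zeta-u_j)}{\zeta^{k-2}} + (z-\zeta)\right)^{h_\psi}
&= \left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_\psi}
\left[\frac{z^{k-2}(z-\zeta)}{\prod_{j=2}^k (z-u_j)}
+ \frac{\prod_{j=2}^k (\zeta-u_j)\,/\,\zeta^{k-2}}{\prod_{j=2}^k (z-u_j)\,/\,z^{k-2}}\right]^{h_\psi} \\
&= \left(\frac{\prod_{j=2}^k (z-u_j)}{z^{k-2}}\right)^{h_\psi}
\left[1 + O\!\left(\frac{\zeta}{z}\right)\right].
\end{aligned}
\tag{89}
$$

Putting this back into (88) proves (22). Again, we have used here that if $\phi$ is in the Fock space, only finitely many powers of $\zeta/z$ contribute.

Acknowledgements. We are indebted to Peter Goddard for many useful conversations, explanations and encouragement. We also thank Terry Gannon for a helpful discussion and a careful reading of a draft version of this paper, Haisheng Li for making us aware of his important work on the subject, and Kiyokazu Nagatomo for a careful reading and several helpful discussions. M.R.G. is grateful to the Royal Society for a University Research Fellowship, and A.N. gratefully acknowledges financial support from the British Marshall Scholarship and an NDSEG Graduate Fellowship.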

References

1. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
2. Borcherds, R.E.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
3. Borcherds, R.E.: Monstrous moonshine and monstrous Lie superalgebras. Invent. Math. 109, 405–444 (1992)
4. Brungs, D., Nahm, W.: The associative algebras of conformal field theory. Lett. Math. Phys. 47, 379–383 (1999), hep-th/9811239
5. Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal field theory. Berlin-Heidelberg-New York: Springer, 1997
6. Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310(3), 571–600 (1998), q-alg/9509005
7. Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory. q-alg/9703016
8. Eholzer, W., Flohr, M., Honecker, A., Hübel, R., Nahm, W., Varnhagen, R.: Representations of W-algebras with two generators and new rational models. Nucl. Phys. B383, 249–290 (1992)
9. Farb, B., Dennis, R.K.: Noncommutative Algebra. Berlin-Heidelberg-New York: Springer, 1993
10. Frenkel, I., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104, 1–64 (1993)
11. Frenkel, I., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
12. Frenkel, I., Lepowsky, J., Meurman, A.: Vertex operator algebras and the Monster. Pure and Applied Mathematics 134, New York: Academic, 1988
13. Friedan, D., Shenker, S.: The analytic geometry of two-dimensional conformal field theory. Nucl. Phys. B281, 509–545 (1987)
14. Gannon, T.: Monstrous moonshine and the classification of CFT. math.QA/9906167
15. Gaberdiel, M.R.: An introduction to conformal field theory. Rept. Prog. Phys. 63, 607–667 (2000), hep-th/9910156
16. Gaberdiel, M.R., Goddard, P.: Axiomatic conformal field theory. Commun. Math. Phys. 209, 549–594 (2000), hep-th/9810018
17. Gaberdiel, M.R., Kausch, H.G.: A rational logarithmic conformal field theory. Phys. Lett. B386, 131–137 (1996), hep-th/9606050
18. Griffiths, P., Harris, J.: Principles of algebraic geometry. New York: Wiley, 1978
19. Huang, Y.-Z.: Geometric interpretation of vertex operator algebras. Proc. Nat. Acad. Sci. USA 88, 9964–9968 (1991)
20. Huang, Y.-Z.: Two-dimensional conformal geometry and vertex operator algebras. Progress in Mathematics 148, Boston: Birkhäuser, 1997
21. Huang, Y.-Z.: A functional-analytic theory of vertex (operator) algebras. Commun. Math. Phys. 204, 61–84 (1999)
22. Kac, V.: Vertex algebras for beginners. Providence, RI: American Mathematical Society, 1997
23. Karel, M., Li, H.: Certain generating subspaces for vertex operator algebras. J. Alg. 217, 393–421 (1999), math.QA/9807111
24. Li, H.: Some finiteness properties of regular vertex operator algebras. J. Alg. 212, 495–514 (1999), math.QA/9807077
25. Li, H.: Representation theory and tensor product theory for vertex operator algebras. Ph.D. thesis, Rutgers University, 1994, hep-th/9406211
26. Li, H.: Determining fusion rules by A(V)-modules and bimodules. J. Alg. 212, 515–556 (1999)
27. Li, H.: The regular representation, Zhu's A(V)-theory and induced modules. math.QA/9909007
28. Moore, G., Seiberg, N.: Polynomial equations for rational conformal field theories. Phys. Lett. B212, 451–460 (1988)
29. Moore, G., Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
30. Nahm, W.: Quasi-rational fusion products. Int. J. Mod. Phys. B8, 3693–3702 (1994), hep-th/9402039
31. Neitzke, A.: Zhu's algebra and an algebraic characterization of chiral blocks. hep-th/0005144
32. Polchinski, J., Cai, Y.: Consistency of open superstring theories. Nucl. Phys. B296, 91–150 (1988)
33. Segal, G.B.: Notes on conformal field theory. Unpublished manuscript
34. Watts, G.M.T.: W algebras and coset models. Phys. Lett. B245, 65–71 (1990)
35. Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 238, 333–366 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0836-2

Communications in

Mathematical Physics

Global Regularity of Wave Maps from $\mathbb{R}^{3+1}$ to Surfaces

Joachim Krieger

Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]

Received: 16 March 2002 / Accepted: 31 January 2003
Published online: 5 May 2003 – © Springer-Verlag 2003

Abstract: We consider Wave Maps with smooth compactly supported initial data of small $\dot H^{3/2}$-norm from $\mathbb{R}^{3+1}$ to certain 2-dimensional Riemannian manifolds and show that they stay smooth globally in time. Our methods are based on the introduction of a global Coulomb Gauge as in [17], followed by a dynamic separation as in [8]. We then rely on an adaptation of T. Tao's methods used in his recent breakthrough result [24].

1. Introduction

Let $M$ be a Riemannian manifold with metric $(g_{ij}) = g$. A Wave Map $u : \mathbb{R}^{n+1} \to M$, $n \ge 2$, is by definition a solution of the Euler-Lagrange equations associated with the functional $u \mapsto \int_{\mathbb{R}^{n+1}} \langle \partial_\alpha u, \partial^\alpha u \rangle_g \, d\sigma$. Here the usual Einstein summation convention is in force, while $d\sigma$ denotes the volume measure on $\mathbb{R}^{n+1}$ with respect to the standard metric. In local coordinates, $u$ is seen to satisfy the equation

$$\Box u^i + \Gamma^i_{jk}(u)\,\partial_\alpha u^j\,\partial^\alpha u^k = 0, \tag{1}$$

where $\Gamma^i_{jk}$ refer to the Riemann-Christoffel symbols associated with the metric $g$. The relevance of this model problem arises from its connections with more complex nonlinear wave equations of mathematical physics: for example, Einstein's vacuum equations under $U(1)$-symmetry attain the form of a Wave Maps equation coupled with additional elliptic equations. More specifically, Einstein's equations in this case can be cast in terms of a Wave Map $u : (M, g) \to \mathbf{H}^2$, the target being the standard hyperbolic plane with metric $h_{ij}$, as follows:

$$R_{\alpha\beta} = h_{ij}\,\partial_\alpha u^i\,\partial_\beta u^j, \qquad \Box_g u^i = -\Gamma^i_{jk}(u)\,g^{\alpha\beta}\,\partial_\alpha u^j\,\partial_\beta u^k .$$

The 2nd equation here is of Wave Maps type, on a curved background. Our model equation deals with the simpler case involving a flat background, but the hope is that the techniques for the latter problem will eventually elucidate the more complicated former problem.

We are interested in the well-posedness of the Cauchy problem for (1) with initial data $u[0] \times \partial_t u[0]$ at time $t = 0$ in $H^s \times H^{s-1}$. Classical theory relying on the energy inequality and Sobolev inequalities allows one to deduce local well-posedness in $H^s$ for $s > \frac{n}{2} + 1$. Ideally, one would like to prove local well-posedness in $H^{\frac{n}{2}}$, as this would immediately imply global in time well-posedness. The reason for this is that $\dot H^{\frac{n}{2}}$ is the Sobolev space invariant under the natural scaling associated with (1). Unfortunately, it is known that “strong well-posedness” in the sense of analytic or even $C^2$-dependence on the initial data fails at the $H^{\frac{n}{2}}$-level, $n \ge 3$ [1, 22]. Thus the best result to be hoped for is global regularity of Wave Maps with smooth initial data of small $\dot H^{\frac{n}{2}}$-norm.

In two space dimensions, the scale invariant Sobolev space coincides with the classical $\dot H^1$, and numerical data as well as the conjectured non-concentration of energy suggest global regularity for Wave Maps with arbitrary smooth initial data, provided the target is negatively curved. Non-concentration of energy has been proved by M. Struwe for rotationally symmetric smooth Wave Maps to spheres [20] after earlier work of Christodoulou-Tahvildar-Zadeh [3] establishing the corresponding result for geodesically convex targets. Also, Shatah-Tahvildar-Zadeh [21] showed the corresponding result for smooth equivariant Wave Maps to geodesically convex targets.¹ Moreover, numerical simulations of smooth equivariant Wave Maps to $S^2$ with large initial data by P. Bizon [2] suggest development of singularities. This underlines the importance of the hyperbolic plane as target manifold.
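The scaling claim above can be checked directly. The following is a short sketch, assuming the standard rescaling $u_\lambda(t,x) := u(\lambda t, \lambda x)$, which maps solutions of (1) to solutions because both terms of (1) pick up the same factor $\lambda^2$; the symbols $u_\lambda$ and $f$ are introduced here only for illustration.

```latex
% Scaling of the homogeneous Sobolev norm on R^n, with f := u(0, .):
\widehat{f(\lambda\,\cdot)}(\xi) = \lambda^{-n}\,\hat f(\xi/\lambda),
\qquad
\|f(\lambda\,\cdot)\|_{\dot H^s(\mathbb{R}^n)}^2
  = \int |\xi|^{2s}\,\lambda^{-2n}\,|\hat f(\xi/\lambda)|^2\,d\xi
  = \lambda^{2s-n}\,\|f\|_{\dot H^s(\mathbb{R}^n)}^2 .
% Hence \|u_\lambda(0)\|_{\dot H^s} = \lambda^{s - n/2}\,\|u(0)\|_{\dot H^s},
% which is independent of \lambda exactly when s = n/2.
```

For $n = 3$ this gives the exponent $s = 3/2$ appearing in the abstract.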
In the quest for reaching the critical $\frac{n}{2}$ regularity, local well-posedness for (1) with initial data in $H^{\frac{n}{2}+\epsilon}$, $\epsilon > 0$, was proved for $n \ge 3$ by Klainerman and Machedon in [6], and for $n = 2$ in [11]. Later, Tataru established global in time well-posedness for small data in the Besov space $B^{\frac{n}{2},1}$ [26, 27]. Note that $\dot B^{\frac{n}{2},1}$ has the same scaling as $\dot H^{\frac{n}{2}}$, but unlike the latter controls $L^\infty$. An important breakthrough with respect to global regularity was recently achieved by T. Tao in the case of Wave Maps to the sphere [23, 24], proving global regularity for smooth initial data small in $\dot H^{\frac{n}{2}}$. Tao's work exemplifies the importance of taking the global geometry of the target into account, an aspect largely ignored by the local formulation (1). Embedding the target sphere in an ambient Euclidean space, the Wave Maps equation considered by Tao takes the form

$$\Box u = -u\,\partial_\alpha u^t\,\partial^\alpha u = -(u\,\partial_\alpha u^t - \partial_\alpha u\,u^t)\,\partial^\alpha u. \tag{2}$$

Here $\alpha$ as usual runs over the space-time indices $0, 1, \ldots, n$. The nonlinearity encodes both geometric (skew-symmetry of $u\,\partial_\alpha u^t - \partial_\alpha u\,u^t$) as well as algebraic information (“null-form” structure). Tao manages to analyze all possible frequency interactions of the nonlinearity up to the case in which the derivatives fall on high frequency terms while the undifferentiated term has very low frequency. This bad case is then gauged away, using the skew-symmetric structure. With this method, which served as inspiration for the following developments, as well as sophisticated methods from harmonic analysis, Tao manages to go all the way to $n = 2$ (note that the smaller the dimension, the more difficult the problem on account of the increasing scarcity of available Strichartz estimates).

¹ For a nice account of these matters, see [18].

After Tao, Klainerman and Rodnianski [9] extended this result to Wave Maps from $\mathbb{R}^{n+1}$, $n \ge 5$, to more general and in particular noncompact targets. More precisely, Klainerman and Rodnianski consider parallelizable targets which are well-behaved at infinity. Upon introducing a global orthonormal frame $\{e_i\}$, they introduce new variables $\phi^i_\alpha$ defined by $u_*\partial_\alpha = \phi^i_\alpha e_i$. It turns out that these satisfy the system of equations

$$\partial_\beta \phi^i_\alpha - \partial_\alpha \phi^i_\beta = C^i_{jk}\,\phi^j_\alpha\,\phi^k_\beta, \tag{3}$$

$$\partial_\alpha \phi^{i\alpha} = -\Gamma^i_{jk}\,\phi^j_\beta\,\phi^k_\gamma\,m^{\beta\gamma}, \tag{4}$$

where $m^{\beta\gamma}$ is the standard Minkowski metric on $\mathbb{R}^{n+1}$ and $C^i_{jk}$, $\Gamma^i_{jk}$ are defined as follows:

$$[e_j, e_k] = C^i_{jk}\,e_i, \tag{5}$$

$$\nabla_{e_j} e_k = \Gamma^i_{jk}\,e_i. \tag{6}$$
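The coefficients defined in (6) are skew-symmetric in the indices $i$, $k$. A short sketch of the standard computation, assuming that $\nabla$ is the Levi-Civita connection of $g$ (as the notation suggests) and that the frame is orthonormal:

```latex
% g(e_k, e_i) = \delta_{ki} is constant, so differentiating along e_j
% and using metric compatibility of \nabla gives
0 = e_j\bigl(g(e_k, e_i)\bigr)
  = g(\nabla_{e_j} e_k,\, e_i) + g(e_k,\, \nabla_{e_j} e_i)
  = \Gamma^i_{jk} + \Gamma^k_{ji},
% i.e. \Gamma^i_{jk} = -\Gamma^k_{ji}.
```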

There is again a skew-symmetric structure present in this formulation on account of $\Gamma^i_{jk} = -\Gamma^k_{ji}$. Moreover, by contrast with Tao's formulation (2), the boundedness of $\phi$ is replaced here by the boundedness of the $C^i_{jk}$, $\Gamma^i_{jk}$. Klainerman and Rodnianski impose in addition the condition that all derivatives of these coefficients be bounded, or in their terminology that $M$ be “boundedly parallelizable”. If one now passes to the wave equation satisfied by the vector $\phi_\alpha := \{\phi^i_\alpha\}$, one obtains

$$\Box \phi_\alpha = -R_\mu\,\partial^\mu \phi_\alpha + E, \tag{7}$$

where $R_\mu$ is skew-symmetric and moreover depends linearly on $\phi$, provided we assume the $C^i_{jk}$, $\Gamma^i_{jk}$ to be constant for simplicity's sake. $E$ is a cubic polynomial in $\phi$. By contrast with (2), the leading term in the nonlinearity is “quadratic in $\phi$”. It is now possible to control all possible frequency interactions on the right hand side ($n \ge 5$) except when $R_\mu$ is localized to very low frequency while $\partial^\mu \phi$ is at large frequency. However, as Klainerman and Rodnianski observed, the curvature

$$\partial_\nu R_\mu - \partial_\mu R_\nu + [R_\mu, R_\nu] \tag{8}$$

when $R$ is reduced to low frequencies is “very small”, in the sense that it is quadratic in $\phi$, hence amenable to good Strichartz estimates. To take advantage of this, they introduce a Coulomb Gauge $\sum_{j=1}^{3} \partial_j \tilde R_j = 0$, which allows one to replace the $R_\mu$ in (7) by $\tilde R_\mu$ which is “quadratic in $\phi$”, effectively replacing the nonlinearity by a term which is trilinear in $\phi$ and hence easily handled by Strichartz estimates. The general philosophy here is that the higher the degree of the nonlinearity, the more room is available to apply Strichartz estimates. Klainerman and Rodnianski's method is thus similar to Tao's in that it utilizes a microlocal Gauge Change to deal with specific bad frequency interactions.

The last result to be mentioned in this development is the simplification and extension of the previous arguments to include the case of 4 + 1-dimensional Wave Maps to essentially arbitrary targets achieved by Shatah-Struwe [17] and (in more restrictive formulation) Uhlenbeck-Stefanov-Nahmod [14]. The former observed that using a Coulomb Gauge, in a similar fashion as above, at the beginning without carrying out a frequency decomposition allows one to reduce the nonlinearity to a form directly amenable to Strichartz estimates. This allows them to avoid the microlocal Gauge Change of Tao and leads to a remarkable simplification of the argument. In addition, they are also able to treat the case of dimension 4 + 1.

The methods in [9] and [17] run into serious difficulties for 3 + 1-dimensional Wave Maps, and even more so for 2 + 1-dimensional Wave Maps. This can be seen intuitively as follows: The global Coulomb Gauge puts the leading term of the nonlinearity roughly into the form $D^{-1}(\phi^2)D\phi$. In dimensions 4 and higher, Shatah and Struwe can estimate such terms relying on the Strichartz type inequality for Lorentz spaces

$$\|\phi\|_{L^2_t L^{2n,2}_x} \le C\,\|\Box \phi\|_{L^1_t H^{\sigma}} + C\,\|\phi[0]\|_{H^{\frac{n}{2}-1}}, \tag{9}$$

where $\sigma = \frac{n}{2} - 2$. This can be used to estimate the $L^1_t L^\infty_x$-norm of $D^{-1}(\phi^2)$.² However, in three space dimensions, the above estimate fails. In order to handle the case when $D^{-1}(\phi^2)$ has much lower frequency than $D\phi$, one would have to use an endpoint $L^2_t L^\infty_x$-Strichartz estimate, which is false, even replacing the $L^\infty_x$-norm by BMO, see [25].

The present paper starts with the basic formulation (3), (4) of Klainerman and Rodnianski applied to the simplified context of a 2-dimensional Riemannian manifold $(M, g)$, but utilizes the Coulomb Gauge right at the beginning as do Shatah and Struwe. The main innovation over the preceding then is to introduce a special null-structure into the nonlinearity by way of what we term a dynamic separation³, a method introduced first in [8]: in our context, we introduce “twisted variables” $\theta^i_\alpha := A^i_k(u)\,\phi^k_\alpha$ for suitable well-behaved functions $A^i_k(u)$, and utilize the div-curl system satisfied by these to split them into a dynamic part, which has the form of a gradient, and an elliptic part, which satisfies an elliptic div-curl system. Substituting these components into the leading term of the nonlinearity results in a fairly complicated trilinear null-structure⁴, as well as error terms at least quadrilinear. These are decomposed into quadrilinear null-forms and error terms at least quintilinear, iterating dynamic separation. In order to estimate the trilinear and quadrilinear null-structures, we have to refer to estimates in [13] which were derived using the technical framework set forth in [24]. Moreover, in order to control the “twisted variables” we have to prove a sort of “Gauge Change estimate” (Proposition 3.1) which is new for the spaces introduced in [24]. Part of what distinguishes our setup from Tao's is that we are working at the level of the derivative of the Wave Map. In particular, high-high interactions become more delicate.
The result proved in this paper certainly extends to higher-dimensional targets⁵ satisfying similar constraints as the two-dimensional ones considered in this paper. Our main theorem is the following: Let $(M, g)$ be a 2-dimensional Riemannian manifold, which satisfies one of the following technical conditions:

(1): $M$ is boundedly parallelizable⁶ and there exists an isometric embedding $i : (M, g) \to (\mathbb{R}^k, \delta_{ij})$ “which doesn't twist much” in the following sense: there exists an orthonormal frame $(e_1(x), e_2(x))$, $x \in M$, for $TM$ and an extension $(\tilde e_1(x), \tilde e_2(x))$ of $(e_1(x), e_2(x))$, $x \in i(M)$, to a neighborhood of $i(M)$ in $\mathbb{R}^k$ such that all the derivatives of the $\tilde e_i(x)$ are bounded.

(2): $M$ is a compact surface. Choose an isometric embedding $i : (M, g) \to (\mathbb{R}^k, h)$, where $h = (h_{ij})$ is a metric agreeing with the standard $(\delta_{ij})$ outside of a compact set, such that $i(M)$ is a totally geodesic submanifold of $(\mathbb{R}^k, h)$. That this is possible is shown in [3].

(3): $M = \mathbf{H}^2$, the hyperbolic plane. Use the standard coordinates $(x, y)$, $y > 0$, with respect to which the metric attains the form $dg = \frac{dx^2 + dy^2}{y^2}$.

Then the following theorem holds true:

Theorem 1.1. Let $M$ be one of the above. Then there exists a number $\epsilon > 0$ with the following property: Let $(u(0), \partial_t u(0)) : \mathbb{R}^3 \to (M, TM)$ be smooth initial data satisfying the property⁷

$$\sum_{\alpha=0}^{3} \|\partial_\alpha (i \circ u)(0)\|_{\dot H^{\frac12}} < \epsilon$$

in situations (1), (2), or

$$\sum_{\alpha=0}^{3} \left( \left\| \frac{\partial_\alpha (x \circ u)}{y} \right\|_{\dot H^{\frac12}} + \left\| \frac{\partial_\alpha (y \circ u)}{y} \right\|_{\dot H^{\frac12}} \right) < \epsilon$$

in the third situation. Then there exists a global (in time) smooth Wave Map $u : \mathbb{R}^{3+1} \to M$ with these initial conditions.

² Alternatively, as pointed out by Klainerman and Rodnianski, one can utilize an improved bilinear version of Strichartz estimates in [12] to handle these cases.
³ This terminology was suggested by S. Klainerman.
⁴ This is to be contrasted with the null-structure in [8], which is bilinear.
⁵ Our restriction on the dimension of the target ensures the commutativity of the Gauge Group. This allows us to avoid certain technicalities related to controlling the Gauge Change. However, the method of Tao (“approximate Gauge Change”) as in [24] or in [9] should handle the general case.
⁶ This notion was introduced by Klainerman-Rodnianski and means that there exists a global orthonormal frame $(e_1, e_2)$ for $TM$ such that the functions $\Gamma^k_{ij}$, $C^k_{ij}$ defined via $\nabla_{e_i} e_j = \Gamma^k_{ij} e_k$, $[e_i, e_j] = C^k_{ij} e_k$ have bounded derivatives of all orders.
⁷ The Sobolev spaces are defined by picking a point $p \in M$ and considering the functions $i \circ u - i \circ p$ instead of $i \circ u$.

2. Outline of the Argument

2.1. Basic formulation of the problem. This section will serve as an outline for the rest of the paper, explaining the strategy for proving the theorem cited at the end of the last section in the case $M = \mathbf{H}^2$. We translate the problem to the level of the derivative, utilizing the formulation (3), (4) with respect to the global orthonormal frame $\{-y\partial_x, -y\partial_y\}$ for $T\mathbf{H}^2$. More explicitly, we have

$$\phi^1_\alpha = -\frac{\partial_\alpha x}{y}, \qquad \phi^2_\alpha = -\frac{\partial_\alpha y}{y}. \tag{10}$$

The div-curl system satisfied by these quantities is then of the following form:

$$\partial_\beta \phi^1_\alpha - \partial_\alpha \phi^1_\beta = \phi^1_\alpha \phi^2_\beta - \phi^2_\alpha \phi^1_\beta, \tag{11}$$


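$$\partial_\beta \phi^2_\alpha - \partial_\alpha \phi^2_\beta = 0, \tag{12}$$

$$\partial_\alpha \phi^{1\alpha} = -\phi^1_\alpha \phi^{2\alpha}, \tag{13}$$

$$\partial_\alpha \phi^{2\alpha} = \phi^1_\alpha \phi^{1\alpha}. \tag{14}$$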
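As a consistency check, the system (11)–(14) can be recovered from the definitions (10). The sketch below assumes the convention $\Box = \partial_\alpha \partial^\alpha$, under which equation (1) for the metric $\frac{dx^2+dy^2}{y^2}$ reads $\Box x = \frac{2}{y}\,\partial_\alpha x\,\partial^\alpha y$ and $\Box y = \frac{1}{y}\,(\partial_\alpha y\,\partial^\alpha y - \partial_\alpha x\,\partial^\alpha x)$; this coordinate form is not written out in the text and is stated here as an assumption.

```latex
% Curl identities, from \phi^1_\alpha = -\partial_\alpha x / y and \phi^2_\alpha = -\partial_\alpha y / y:
\partial_\beta \phi^1_\alpha - \partial_\alpha \phi^1_\beta
  = \frac{\partial_\alpha x\,\partial_\beta y - \partial_\beta x\,\partial_\alpha y}{y^2}
  = \phi^1_\alpha \phi^2_\beta - \phi^2_\alpha \phi^1_\beta ,
\qquad
\partial_\beta \phi^2_\alpha - \partial_\alpha \phi^2_\beta
  = \frac{\partial_\alpha y\,\partial_\beta y - \partial_\beta y\,\partial_\alpha y}{y^2} = 0 .
% Divergence identities, using the assumed coordinate form of the equation:
\partial_\alpha \phi^{1\alpha}
  = -\frac{\Box x}{y} + \frac{\partial_\alpha x\,\partial^\alpha y}{y^2}
  = -\frac{\partial_\alpha x\,\partial^\alpha y}{y^2}
  = -\phi^1_\alpha \phi^{2\alpha},
\qquad
\partial_\alpha \phi^{2\alpha}
  = -\frac{\Box y}{y} + \frac{\partial_\alpha y\,\partial^\alpha y}{y^2}
  = \frac{\partial_\alpha x\,\partial^\alpha x}{y^2}
  = \phi^1_\alpha \phi^{1\alpha}.
```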

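Here $\alpha$, $\beta$ vary over the space-time indices 0, 1, 2, 3, and Einstein's summation convention is in force. Once we can show that the $\phi^i_\alpha$ stay smooth globally in time, the actual Wave Map can be obtained by integration from $\left(-\frac{\partial_t x}{y}, -\frac{\partial_t y}{y}\right) = (\phi^1_0, \phi^2_0)$. Letting $\phi_\alpha$ denote the column vector with entries $\phi^1_\alpha$, $\phi^2_\alpha$, we obtain the following wave equations:

$$\Box \phi_\alpha = M_\nu\,\partial^\nu \phi_\alpha + \text{“}\phi^3\text{”}, \tag{15}$$

where

$$M_\nu = \begin{pmatrix} 0 & -2\phi^1_\nu \\ 2\phi^1_\nu & 0 \end{pmatrix}, \tag{16}$$

and “$\phi^3$” refers to a vector with entries that are cubic polynomials in the $\phi^i_\alpha$. The fine structure of these entries will actually be relevant later on, but we leave it out for the present discussion. As explained in the introduction, this formulation does not lend itself to good estimates.

2.2. Introducing the global Coulomb Gauge. We now try to modify the matrix $M_\nu$ by adding a term of the form $2\partial_\nu A$, in such a way that the resulting matrix $\tilde M_\nu = M_\nu + 2\partial_\nu A$ has better properties. More precisely, we want this to depend “quadratically” on $\phi$. This can be achieved by utilizing the Coulomb Gauge condition $\sum_{j=1}^{3} \partial_j \tilde M_j = 0$, whence $A = -\frac12 \Delta_x^{-1} \sum_{j=1}^{3} \partial_j M_j$. Indeed, observe that the $\tilde M_\nu$ satisfy the following div-curl system:

$$\sum_{i=1}^{3} \partial_i \tilde M_i = 0, \qquad
\partial_\nu \tilde M_\mu - \partial_\mu \tilde M_\nu =
\begin{pmatrix} 0 & 2(\phi^1_\nu \phi^2_\mu - \phi^1_\mu \phi^2_\nu) \\ -2(\phi^1_\nu \phi^2_\mu - \phi^1_\mu \phi^2_\nu) & 0 \end{pmatrix}, \tag{17}$$

whence

$$\tilde M_\nu =
\begin{pmatrix}
0 & -2\sum_{i=1}^{3} \Delta^{-1}\partial_i(\phi^1_\nu \phi^2_i - \phi^1_i \phi^2_\nu) \\
2\sum_{i=1}^{3} \Delta^{-1}\partial_i(\phi^1_\nu \phi^2_i - \phi^1_i \phi^2_\nu) & 0
\end{pmatrix}, \tag{18}$$

or in a first approximation $\tilde M_\nu = \text{“}D^{-1}(\phi^2)\text{”}$. We can now set $U = e^A$ and obtain

$$U^{-1}\,\Box(U\phi_\alpha) = U^{-1}(\Box U)\,\phi_\alpha + \tilde M_\nu\,\partial^\nu \phi_\alpha + \text{“}\phi^3\text{”}. \tag{19}$$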
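The passage from (15) to (19) is a product-rule computation. The following sketch assumes $\Box = \partial_\nu \partial^\nu$ and uses only what is stated above: $U = e^A$ with $A$ taking values in a commuting family of matrices, so that $U^{-1}\partial_\nu U = \partial_\nu A$.

```latex
% Product rule for the wave operator:
U^{-1}\,\Box(U\phi_\alpha)
  = \Box\phi_\alpha + 2\,U^{-1}(\partial_\nu U)\,\partial^\nu\phi_\alpha + U^{-1}(\Box U)\,\phi_\alpha .
% With U^{-1}\partial_\nu U = \partial_\nu A and (15):
U^{-1}\,\Box(U\phi_\alpha)
  = U^{-1}(\Box U)\,\phi_\alpha + (M_\nu + 2\,\partial_\nu A)\,\partial^\nu\phi_\alpha + \text{``}\phi^3\text{''}
  = U^{-1}(\Box U)\,\phi_\alpha + \tilde M_\nu\,\partial^\nu\phi_\alpha + \text{``}\phi^3\text{''} .
% The Coulomb condition fixes A: summing \partial_j \tilde M_j over j = 1, 2, 3 gives
0 = \sum_{j=1}^{3}\partial_j M_j + 2\,\Delta_x A
\quad\Longrightarrow\quad
A = -\tfrac12\,\Delta_x^{-1}\sum_{j=1}^{3}\partial_j M_j .
```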

Of course, we use the commutativity of the Gauge group for 2-dimensional target. The difference between this wave equation for $U\phi_\alpha$ and (15) is that the nonlinearity here consists of trilinear expressions. In particular, this modification suffices to handle the case of 4 + 1-dimensional Wave Maps. For this, observe for example that one can easily estimate the $L^1_t L^2_x$-norm of $\tilde M_\nu\,\partial^\nu \phi_\alpha$ since this is morally $D^{-1}(\phi^2)D\phi$ and

$$\|D^{-1}(\phi^2)D\phi\|_{L^1_t L^2_x} \le C\,\|\phi\|^2_{L^2_t L^{8,2}_x}\,\|\phi\|_{L^\infty_t H^1_x}. \tag{20}$$

The right-hand terms are controlled by means of Strichartz' inequalities. Similarly, one can estimate the remaining terms of the nonlinearity in the $L^1_t L^2_x$-norm. This is Shatah and Struwe's method for $\mathbf{H}^2$. One can also estimate this term using the improved bilinear Strichartz estimate for $D^{-1}(\phi^2)$ in [12], as observed by Klainerman and Rodnianski. For the 3-dimensional case, Strichartz' estimates alone don't seem sufficient. This can be seen by analyzing the case when $D^{-1}(\phi^2)$ has very low frequency while $D\phi$ has large frequency; in order to recoup the exponential loss caused by $D^{-1}$, one seems to be forced to employ an $L^2_t L^\infty_x$ Strichartz estimate, which unfortunately doesn't exist. To proceed, we need to take into account more of the special structure of the nonlinear terms.

2.3. Implementing the dynamic separation. We use complex notation. Introduce the variables $\phi_\alpha = \phi^1_\alpha + i\phi^2_\alpha$. Then introduce the “twisted variables” $\psi_\alpha := \psi^1_\alpha + i\psi^2_\alpha := e^{-i\Phi}\phi_\alpha$, where $\Phi := \Delta^{-1}\sum_{k=1}^{3} \partial_k \phi^1_k$ ($\Delta$ stands for $\Delta_x$). This is of course the same Gauge Change as in the previous subsection, in complex notation. The precise wave equation satisfied by the $\psi_\alpha$ is the following:

$$\Box \psi_\alpha = 2i\,e^{-i\Phi} \sum_{k=1}^{3} \Delta^{-1}\partial_k\bigl[\phi^1_k \phi^2_\nu - \phi^2_k \phi^1_\nu\bigr]\,\partial^\nu \phi_\alpha + \text{“}\phi^3\text{”} - \bigl[i\,\Box\Phi + \partial_\nu \Phi\,\partial^\nu \Phi\bigr]\psi_\alpha .$$

The most difficult term on the right-hand side is the first summand, which we also refer to as the “leading term”. It can be cast into the more concise form (modulo quadrilinear error terms)

$$2i \sum_{k=1}^{3} \Delta^{-1}\partial_k\bigl[\psi^1_k \psi^2_\nu - \psi^2_k \psi^1_\nu\bigr]\,\partial^\nu \psi_\alpha .$$

Now observe that the $\psi_\alpha$ satisfy a special curl-system, namely the following:

$$\partial_\alpha \psi_\beta - \partial_\beta \psi_\alpha
= i\psi_\beta\,\Delta^{-1}\sum_{j=1}^{3} \partial_j\bigl(\psi^1_\alpha \psi^2_j - \psi^1_j \psi^2_\alpha\bigr)
- i\psi_\alpha\,\Delta^{-1}\sum_{j=1}^{3} \partial_j\bigl(\psi^1_\beta \psi^2_j - \psi^2_\beta \psi^1_j\bigr). \tag{21}$$

The dynamic separation consists in decomposing

$$\psi_\nu = -R_\nu \Psi + \chi_\nu := -R_\nu \sum_{k=1}^{3} R_k \psi_k + \chi_\nu ,$$

where $R_\nu$ denotes the Riesz multiplier $\sqrt{-\Delta_x}^{\,-1}\partial_\nu$, $\nu = 0, 1, 2, 3$. The $\chi_\nu$ (“elliptic part”) in turn are determined by the following elliptic div-curl system, which is easily verified:

$$\sum_{j=1}^{3} \partial_j \chi_j = 0, \qquad \partial_i \chi_\nu - \partial_\nu \chi_i = \partial_i \psi_\nu - \partial_\nu \psi_i .$$

This in addition to (21) implies that

$$\chi_\nu = i \sum_{i,j=1}^{3} \Delta^{-1}\partial_i\Bigl(\psi_\nu\,\Delta^{-1}\partial_j\bigl[\psi^1_i \psi^2_j - \psi^1_j \psi^2_i\bigr] - \psi_i\,\Delta^{-1}\partial_j\bigl[\psi^1_\nu \psi^2_j - \psi^1_j \psi^2_\nu\bigr]\Bigr). \tag{22}$$
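The curl system (21) and the formula (22) both follow from (11), (12) and the definition of the twisted variables. In the sketch below the phase is written $\Phi = \Delta^{-1}\sum_k \partial_k \phi^1_k$ and the shorthand $a_\alpha$ is notation chosen here for illustration. The computation uses that $\phi^1_\alpha \phi^2_j - \phi^1_j \phi^2_\alpha = \psi^1_\alpha \psi^2_j - \psi^1_j \psi^2_\alpha$, since both equal the imaginary part of $\bar\phi_\alpha \phi_j = \bar\psi_\alpha \psi_j$.

```latex
% Step 1: (11), (12) in complex form, with \phi_\alpha = \phi^1_\alpha + i\phi^2_\alpha
\partial_\alpha\phi_\beta - \partial_\beta\phi_\alpha
  = \phi^1_\beta\phi^2_\alpha - \phi^1_\alpha\phi^2_\beta
  = i\,(\phi^1_\alpha\,\phi_\beta - \phi^1_\beta\,\phi_\alpha).
% Step 2: twist by the phase, \psi_\alpha = e^{-i\Phi}\phi_\alpha
\partial_\alpha\psi_\beta - \partial_\beta\psi_\alpha
  = i\,a_\alpha\,\psi_\beta - i\,a_\beta\,\psi_\alpha,
\qquad a_\alpha := \phi^1_\alpha - \partial_\alpha\Phi .
% Step 3: by (11) and \Delta = \sum_j \partial_j\partial_j
a_\alpha
  = \Delta^{-1}\sum_{j=1}^{3}\partial_j\bigl(\partial_j\phi^1_\alpha - \partial_\alpha\phi^1_j\bigr)
  = \Delta^{-1}\sum_{j=1}^{3}\partial_j\bigl(\psi^1_\alpha\psi^2_j - \psi^1_j\psi^2_\alpha\bigr),
% which gives (21). Step 4: the elliptic div-curl system gives
\Delta\chi_\nu
  = \sum_{i=1}^{3}\partial_i\bigl(\partial_i\chi_\nu - \partial_\nu\chi_i\bigr)
  = \sum_{i=1}^{3}\partial_i\bigl(\partial_i\psi_\nu - \partial_\nu\psi_i\bigr),
% and inserting (21) with (\alpha, \beta) = (i, \nu) yields (22).
```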

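Passing to real and imaginary parts, we can write $\psi^1_\nu = -R_\nu \Psi^1 + \chi^1_\nu$, $\psi^2_\nu = -R_\nu \Psi^2 + \chi^2_\nu$, where $\Psi^a = \sum_{k=1}^{3} R_k \psi^a_k$. The dynamic separation now enables us to decompose the leading term of the nonlinearity into a trilinear term with a special null-structure and error terms which are at least quintilinear in the $\psi^i_\alpha$. More precisely, upon substituting the gradient parts $R_\nu \Psi$ for $\psi_\nu$, we modify the leading term to the following:

$$\sum_{j=1}^{3} \Delta^{-1}\partial_j\bigl[R_j \Psi^1\,R_\nu \Psi^2 - R_j \Psi^2\,R_\nu \Psi^1\bigr]\,\partial^\nu \psi_\alpha .$$

This expression appears to intertwine what is customarily referred to as a $Q_0$-structure (referring to $\partial_\nu u\,\partial^\nu v$) with a $Q_{\nu j}$-structure (referring to $\partial_\nu u\,\partial_j v - \partial_j u\,\partial_\nu v$). The main reason for its being amenable to good estimates (as stated in Proposition 3.5 below) is given by the following simple lemma, which exemplifies the precise underlying null-structure:

Lemma 2.4. Let $f$, $g$, $h$ be Schwartz functions. Then we have

$$
\begin{aligned}
2\sum_{j=1}^{3} \Delta^{-1}\partial_j\bigl[R_\nu f\,R_j g - R_j f\,R_\nu g\bigr]\,\partial^\nu h
&= \sum_{j=1}^{3} \Box\bigl[\Delta^{-1}\partial_j[\nabla^{-1} f\,R_j g]\,h\bigr]
 - \sum_{j=1}^{3} \Box\,\Delta^{-1}\partial_j[\nabla^{-1} f\,R_j g]\,h \\
&\quad - \sum_{j=1}^{3} \Delta^{-1}\partial_j[\nabla^{-1} f\,R_j g]\,\Box h
 - \nabla^{-1} f\,\Box\bigl((\nabla^{-1} g)\,h\bigr) \\
&\quad + \nabla^{-1} f\,\Box(\nabla^{-1} g)\,h + \nabla^{-1} f\,(\nabla^{-1} g)\,\Box h .
\end{aligned}
$$

Proof. Use the identities

$$R_\nu f\,R_j g - R_j f\,R_\nu g = \partial_\nu\bigl(\sqrt{-\Delta}^{\,-1} f\,R_j g\bigr) - \partial_j\bigl(\sqrt{-\Delta}^{\,-1} f\,R_\nu g\bigr), \qquad 2\,\partial_\nu f\,\partial^\nu g = \Box(f g) - \Box f\,g - f\,\Box g .$$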
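The two identities in the proof combine as follows. This is a sketch of the omitted computation, writing $\nabla^{-1}$ for $\sqrt{-\Delta}^{\,-1}$, so that $R_\nu = \nabla^{-1}\partial_\nu$, and using the shorthand $F_j$, which is introduced here only for illustration.

```latex
% Shorthand: F_j := \Delta^{-1}\partial_j[\nabla^{-1}f\, R_j g].
% First identity, together with \sum_j \Delta^{-1}\partial_j\partial_j = 1:
2\sum_{j=1}^{3}\Delta^{-1}\partial_j\bigl[R_\nu f\,R_j g - R_j f\,R_\nu g\bigr]\partial^\nu h
  = 2\sum_{j=1}^{3}\partial_\nu F_j\,\partial^\nu h
    \;-\; 2\,\nabla^{-1}f\;\partial_\nu(\nabla^{-1}g)\,\partial^\nu h .
% Second identity, applied to each of the two products:
2\,\partial_\nu F_j\,\partial^\nu h = \Box(F_j h) - (\Box F_j)\,h - F_j\,\Box h,
\qquad
2\,\partial_\nu(\nabla^{-1}g)\,\partial^\nu h
  = \Box\bigl((\nabla^{-1}g)\,h\bigr) - \Box(\nabla^{-1}g)\,h - (\nabla^{-1}g)\,\Box h .
```

Substituting the second line into the first reproduces the six terms of the lemma.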


Remark. The bilinear null form in [8] exhibits similar structure, though our formulation, which avoids the Fourier transform, is simpler and more explicit.

Now consider the terms arising upon substituting at least one “elliptic term” $\chi_\nu$ for $\psi_\nu$ in the leading term. Schematically, they can be represented by either of the following:

$$\nabla^{-1}\big(\nabla^{-1}(\nabla^{-1}(\psi^2)\psi)\psi\big)\nabla_{x,t}\psi,\qquad \nabla^{-1}\big(\nabla^{-1}(\nabla^{-1}(\psi^2)\psi)\,\nabla^{-1}(\nabla^{-1}(\psi^2)\psi)\big)\nabla_{x,t}\psi.$$

Both of these turn out to be significantly easier to treat than the preceding null-form term. Indeed, we won’t have to refer to an inherent null-structure anymore.

2.5. The Bootstrapping argument. In order to prove the global regularity of u, we utilize a bootstrapping argument, quite similar to the one in [24]. More precisely, we introduce certain translation invariant Banach spaces $S[k]([-T,T]\times\mathbf{R}^3)$, $N[k]([-T,T]\times\mathbf{R}^3)$, $k\in\mathbf{Z}$, $T>0$, which enjoy a list of remarkable properties. The norms $\|\cdot\|_{S[k]([-T,T]\times\mathbf{R}^3)}$ will be used to estimate the components at frequency $\sim 2^k$ of the $\phi^i_\alpha$,⁸ which are known to be smooth on the time interval $[-T,T]$, while the norms $\|\cdot\|_{N[k]([-T,T]\times\mathbf{R}^3)}$ will be used to estimate the components at frequency $\sim 2^k$ of the nonlinearity, again restricted to and smooth on the time interval $[-T,T]$. Of course, $\|\cdot\|_{S[k]}$ will have to majorize the energy $\|\cdot\|_{\dot H^{1/2}}$ as well as a certain range of Strichartz norms, all applied to functions microlocalized at frequency $\sim 2^k$.

Our goal will be to bootstrap each of the norms $\|P_k\phi^i_\alpha\|_{S[k]([-T,T]\times\mathbf{R}^3)}$. As a matter of fact, we will only have to bootstrap $\|P_0\phi^i_\alpha\|_{S[0]([-T,T]\times\mathbf{R}^3)}$, because the $S[k]$ scale appropriately with respect to “dilations” compatible with the div-curl system (11)–(14): denoting $\phi_\lambda := 2^\lambda\phi(x2^\lambda)$, we will have

$$\|P_{k+\lambda}\phi_\lambda\|_{S[k+\lambda]([-T,T]\times\mathbf{R}^3)} = \|P_k\phi\|_{S[k]([-T,T]\times\mathbf{R}^3)},\qquad k,\lambda\in\mathbf{Z}.$$

Here $P_k$ denotes the Littlewood-Paley projector to frequency $\sim 2^k$. A similar identity holds for $N[k]([-T,T]\times\mathbf{R}^3)$. The $S[k]$ and $N[k]$ (leaving out the time-parameter T for simplicity’s sake) will be related by the fundamental energy inequality:

$$\|P_k\phi\|_{S[k]([-T,T]\times\mathbf{R}^3)} \le C\Big(\|\Box P_k\phi\|_{N[k]([-T,T]\times\mathbf{R}^3)} + \|P_k\phi[0]\|_{\dot H^{1/2}\times\dot H^{-1/2}}\Big), \tag{23}$$

where C is independent of T. In order to use this inequality, we need to estimate the $N[k]$-norm of the nonlinearity. For this, it will be important to us, amongst other things, that there are:

(1) Null-form estimates of the form

$$\|P_0[R_\nu P_{k_1}\phi\,\partial^\nu P_{k_2}\psi]\|_{N[0]} \le C2^{-\delta\max\{k_1,0\}}\|P_{k_1}\phi\|_{S[k_1]}\|P_{k_2}\psi\|_{S[k_2]},\qquad \delta>0. \tag{24}$$

(2) Bilinear estimates that make up for the missing $L^2_tL^\infty_x$-estimates. These come about by using null frame spaces, and have roughly the form

$$\|P_{k_1}\phi\,P_{k_2}\psi\|_{L^2_tL^2_x} \le C2^{\frac{k_1-k_2}{2}}\|P_{k_1}\phi\|_{S[k_1]}\|P_{k_2}\psi\|_{S[k_2]} \tag{25}$$

8 Suitable dilates of these spaces will be used for the frequency components of u.

342

J. Krieger

provided $\phi$, $\psi$ are microlocalized on small caps whose distance is at least comparable to their radius, and provided their Fourier support lives fairly close to the cone.

(3) Trilinear estimates:

$$\Big\|P_0\sum_{j=1}^{3}\Delta^{-1}\partial_j\big[R_\nu P_{k_1}\psi_1\,R_jP_{k_2}\psi_2 - R_jP_{k_1}\psi_1\,R_\nu P_{k_2}\psi_2\big]\partial^\nu P_{k_3}\psi_3\Big\|_{N[0]} \le C2^{-\delta_1|k_1-k_2|}2^{-\delta_2|k_3|}\prod_{j=1}^{3}\|P_{k_j}\psi_j\|_{S[k_j]},\qquad \delta_1,\delta_2>0. \tag{26}$$

These are the crucial tools of the paper.

(4) The $S[k]$ have to be well-behaved under the Gauge Change. In particular, we need an assertion of the form that provided $\|P_k\phi\|_{S[k]}$ are small in a suitable sense, then so are $\|P_k[f(\nabla^{-1}\phi)\phi]\|_{S[k]}$, where $\nabla^{-1}$ stands for a linear combination of operators of the form $\Delta^{-1}\partial_j$, and $f(x)$ is a smooth function all of whose derivatives are bounded.

3. Technical Preparations

The spaces $S[k]$, $N[k]$ and many of their properties were considered in Tao’s seminal paper [24], although their origins can be traced back to Tataru’s [27]. Most of this section (except the trilinear inequality and the Gauge Change result) is due to these two authors; we will therefore be rather brief with the definitions. First, we introduce Tao’s concept of frequency envelope, as in [23, 24]: for any Schwartz function $\psi$ on $\mathbf{R}^3$, we consider the quantities

$$c_a := \Big(\sum_{k\in\mathbf{Z}}2^{-\sigma|a-k|}\|P_k\psi\|^2_{\dot H^{1/2}}\Big)^{1/2}. \tag{27}$$

Here $P_k$, $k\in\mathbf{Z}$, are the standard Littlewood-Paley operators that localize to frequency $\sim 2^k$, i.e. they are given by Fourier multipliers $m_k(|\xi|) = m_0(\frac{|\xi|}{2^k})$, where $m_0(\lambda)$ is a smooth function compactly supported within $\frac12\le\lambda\le 2$ with $\sum_{k\in\mathbf{Z}}m_0(\frac{\lambda}{2^k}) = 1$, $\lambda>0$. The $\sigma>0$ is chosen to be smaller than any of the exponential decays occurring later in the paper. E.g. $\frac{1}{1000}$ would suffice. We note that all of the generic constants C occurring in the sequel depend at most on this parameter $\sigma$. Note that

$$c_k2^{-\sigma|a-k|} \le c_a \le 2^{\sigma|a-k|}c_k \tag{28}$$

as well as

$$\sum_{k\in\mathbf{Z}}c_k^2 \le C\|\psi\|^2_{\dot H^{1/2}}.$$
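The envelope (27) and its slow-variation property (28) can be illustrated numerically. The following is a minimal sketch under stated assumptions: `energies` is hypothetical data standing in for the numbers $\|P_k\psi\|_{\dot H^{1/2}}$, only finitely many frequencies are kept, and the function name is invented for illustration.

```python
def frequency_envelope(energies, sigma):
    """Return c_a = (sum_k 2^{-sigma|a-k|} e_k^2)^{1/2} for every index a.

    `energies` maps an integer frequency k to e_k, a stand-in for the
    H^{1/2} norm of P_k psi (hypothetical data, finitely many entries).
    """
    ks = sorted(energies)
    return {
        a: sum(2.0 ** (-sigma * abs(a - k)) * energies[k] ** 2 for k in ks) ** 0.5
        for a in ks
    }

sigma = 1e-3
energies = {k: 1.0 / (1 + k * k) for k in range(-20, 21)}
energies[3] = 5.0  # a single spike; the envelope spreads it over all frequencies
c = frequency_envelope(energies, sigma)

# Slow variation (28): 2^{-sigma|a-k|} c_k <= c_a <= 2^{sigma|a-k|} c_k.
slowly_varying = all(
    2.0 ** (-sigma * abs(a - k)) * c[k] <= c[a] * (1 + 1e-12)
    and c[a] <= 2.0 ** (sigma * abs(a - k)) * c[k] * (1 + 1e-12)
    for a in c for k in c
)
print(slowly_varying)
```

The envelope dominates the data it is built from ($c_k \ge e_k$) while varying by at most a factor $2^{\sigma}$ between neighbouring frequencies, which is what allows sums over frequency interactions to be closed later on.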

The main reason for the usefulness of this concept is that provided we know that the frequency localized components $P_k\rho$ for some other Schwartz function $\rho$ on $\mathbf{R}^3$ (think: the time-evolved Wave Map) have $\dot H^{1/2}$-norm bounded by a multiple $Cc_k$, we can immediately bound the $\dot H^{1/2+\epsilon}$-norm of $\rho$ for $\epsilon>0$ small enough. This will allow us later to continue the Wave Map, by referring to local well-posedness of the div-curl system (11)–(14) in $H^{1/2+\epsilon}$, and finite speed of propagation.

We introduce the following norms on frequency localized Schwartz functions on $\mathbf{R}^{3+1}$ for our bootstrapping argument: for every $l>10$, choose a covering $K_l$ of $S^2$ by finitely overlapping caps $\kappa$ of radius $2^{-l}$. This is to be chosen such that the set of concentric caps with half the radius still covers the sphere. Now let

$$\|\psi\|_{S[k]} := \|\nabla_{x,t}\psi\|_{L^\infty_t\dot H_x^{-1/2}} + \|\nabla_{x,t}\psi\|_{\dot X_k^{-1/2,1/2,\infty}} + \sup_{\pm}\sup_{l>10}\Big(\sum_{\kappa\in K_l}\|\tilde P_{k,\pm\kappa}Q^{\pm}_{<k-2l}\psi\|^2_{S[k,\kappa]}\Big)^{1/2}, \tag{29}$$

where it is understood that $\psi$ lives at frequency $\sim 2^k$, $k\in\mathbf{Z}$. The operators $\tilde P_{k,\kappa}$ are given by symbols $\tilde m_k(|\xi|)a_\kappa(\frac{\xi}{|\xi|})$, where $a_\kappa: S^2\to\mathbf{R}$ is a smooth function with support contained in the concentric cap inside $\kappa$ with half the radius of $\kappa$, and $\tilde m_k$ localizes frequency to size $\sim 2^k$ and satisfies $\tilde m_km_k = m_k$, where $m_k$ is the multiplier chosen above. We also require that $\sum_{\kappa\in K_l}\tilde P_{k,\kappa} = \tilde P_k$, the latter being defined in the obvious way. The operator $Q^{\pm}_{<k-2l}$ localizes to modulation $<2^{k-2l}$ and also restricts to $\pm\tau>0$, i.e. to the upper or lower half-space; more precisely, its multiplier contains the cutoff $\chi_{>0}(\pm\tau)$. The norm $\|\phi\|_{\dot X_k^{-1/2,1/2,1}}$ refers to $2^{-\frac{k}{2}}\sum_{j\in\mathbf{Z}}2^{\frac{j}{2}}\|Q_j\phi\|_{L^2_tL^2_x}$. The definition of $S[k,\kappa]$ is a scaled-down version of the one in [24]:

$$\|\psi\|_{S[k,\kappa]} := 2^{\frac{k}{2}}\|\psi\|_{NFA^*[\kappa]} + |\kappa|^{-\frac12}2^{-\frac{k}{2}}\|\psi\|_{PW[\kappa]} + 2^{\frac{k}{2}}\|\psi\|_{L^\infty_tL^2_x}. \tag{30}$$

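The definitions of the individual ingredients in turn are as follows:

(1) $NFA^*[\kappa]$ is the Banach space obtained upon completing $S(\mathbf{R}^{3+1})$ with respect to the norm

$$\|\psi\|_{NFA^*[\kappa]} := \sup_{\omega\notin 2\kappa}\mathrm{dist}(\omega,\kappa)\,\|\psi\|_{L^\infty_{t_\omega}L^2_{x_\omega}}. \tag{31}$$

Here $(t_\omega,x_\omega)$ refer to null-frame coordinates, i.e. $t_\omega = (t,x)\cdot\frac{1}{\sqrt2}(1,\omega)$, $x_\omega = (t,x) - t_\omega\frac{1}{\sqrt2}(1,\omega)$.

(2) $PW[\kappa]$ is the atomic Banach space whose atoms are the set A of all Schwartz functions $\psi$ with $\|\psi\|_{L^2_{t_\omega}L^\infty_{x_\omega}}\le 1$ for some $\omega\in\kappa$. In other words,

$$\|\psi\|_{PW[\kappa]} = \inf\Big\{|\lambda| \;\Big|\; \exists\,\{0\le\lambda_i\le 1\},\ \{\psi_i\}\subset A,\ 1\le i\le N \text{ s.t. } \sum_i\lambda_i = 1,\ \lambda\sum_i\lambda_i\psi_i = \psi\Big\}. \tag{32}$$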

Of course, the Banach space $S[k]$ is obtained by completing the Schwartz functions on $\mathbf{R}^{3+1}$ with respect to $\|\cdot\|_{S[k]}$. Next, we will place frequency localized pieces of the nonlinearity into the following spaces $N[k]$, again introduced by Tao and implicitly present in Tataru’s work: they are the atomic Banach spaces whose atoms are

(1) Schwartz functions F at frequency between $2^{k-4}$ and $2^{k+4}$ with $\|F\|_{L^1_tL^2_x}\le 2^{\frac{k}{2}}$.

(2) Schwartz functions F with frequency between $2^{k-4}$ and $2^{k+4}$ and modulation between $2^{j-5}$ and $2^{j+5}$ such that $\|F\|_{L^2_tL^2_x}\le 2^{\frac{j}{2}}2^{\frac{k}{2}}$.

(3) Schwartz functions F for which there exists a number $l>10$ and Schwartz functions $F_\kappa$ with Fourier support in the region $\{(\tau,\xi)\mid \pm\tau>0,\ ||\tau|-|\xi||\le 2^{k-2l-100},\ 2^{k-4}\le|\xi|\le 2^{k+4},\ \Theta\in\frac12\kappa\}$ such that $F = \sum_{\kappa\in K_l}F_\kappa$ and $\big(\sum_{\kappa\in K_l}\|F_\kappa\|^2_{NFA[\kappa]}\big)^{\frac12}\le 2^{\frac{k}{2}}$. Here $\Theta = \frac{\tau\xi}{|\tau||\xi|}$ and $NFA[\kappa]$ is the dual space of $NFA[\kappa]^*$, i.e. the atomic Banach space whose atoms are Schwartz functions F which satisfy

$$\frac{1}{\mathrm{dist}(\omega,\kappa)}\|F\|_{L^1_{t_\omega}L^2_{x_\omega}}\le 1$$

for some $\omega\notin 2\kappa$.

We try to briefly explain the reason for introducing these spaces: the $PW[\kappa]$ component of $S[k]$ is to be thought of as a substitute for the missing $L^2_tL^\infty_x$-estimate. This is directly exemplified by the following first fundamental bilinear inequality:

$$\|\phi\psi\|_{NFA[\kappa]} \le C\frac{2^{\frac{k'}{2}}|\kappa'|^{\frac12}}{\mathrm{dist}(\kappa,\kappa')}\|\phi\|_{L^2_tL^2_x}\|\psi\|_{S[k',\kappa']}, \tag{33}$$

which is a direct consequence of the inclusion $S[k,\kappa]\subset 2^{\frac{k}{2}}|\kappa|^{\frac12}PW[\kappa]$. This inequality also suggests that $NFA[\kappa]$ is to be seen as a substitute for $L^1_tL^2_x$, the energy space. This may seem odd, as we are substituting a null-frame analogue for the customary version, and there is no Duhamel’s formula in that context. However, we shall only place pieces of the nonlinearity into $NFA[\kappa]$ which are microlocalized along an angular sector contained in $\kappa$, and it turns out that there is an analogue of the energy inequality then. The $NFA^*[\kappa]$-component of $S[k]$ makes certain algebra estimates work and will in particular enable us to obtain a general Gauge Change estimate cited below. This shall be a consequence of the following 2nd fundamental bilinear inequality, which is essentially dual to the first:

$$\|\phi\psi\|_{L^2_tL^2_x} \le C\frac{2^{\frac{k'}{2}}|\kappa'|^{\frac12}}{\mathrm{dist}(\kappa,\kappa')\,2^{\frac{k}{2}}}\|\phi\|_{S[k,\kappa]}\|\psi\|_{S[k',\kappa']}. \tag{34}$$
This is again an immediate consequence of the definitions; cf. also [24]. Finally, we also note that truncated free waves are naturally embedded into these spaces, which is of course crucial for an “energy inequality” (see below, (38)) to work. We exemplify this by the following inequality⁹ valid for all Schwartz functions $\phi\in S(\mathbf{R}^{3+1})$:

$$\|P_{k,\kappa}Q^{\pm}_{<k-2l}\phi\|_{S[k,\kappa]} \le C\|\phi\|_{\dot X_k^{1/2,1/2,1}}. \tag{35}$$

In the sequel, it will be important to have some Strichartz norms of the form $L^p_tL^q_x$ at our disposal. Unfortunately, the author was unable to build sharp Strichartz norms (satisfying $\frac1p+\frac1q = \frac12$) into the $S[k]$, on account of difficulties related to the energy inequality (38). This means we have to make do with a certain range of non-sharp Strichartz norms, which can be seen to be controlled by the $S[k]$. This will be the content of a theorem below.

9 Keep in mind that elements of $\dot X_k^{1/2,1/2,1}$ are weighted averages of free waves.


Since we will be implementing a bootstrapping argument, we can only assume the a priori existence of a solution on a finite time interval $[-T,T]$. We therefore need to localize the above (frequency-localized) norms to this interval. To wit

$$\|P_k\phi\|_{S[k]([-T,T]\times\mathbf{R}^3)} := \inf_{\psi\in S(\mathbf{R}^{3+1}),\ \psi|_{[-T,T]}=\phi}\|P_k\psi\|_{S[k](\mathbf{R}^{3+1})}, \tag{36}$$

$$\|P_k\phi\|_{N[k]([-T,T]\times\mathbf{R}^3)} := \inf_{\psi\in S(\mathbf{R}^{3+1}),\ \psi|_{[-T,T]}=\phi}\|P_k\psi\|_{N[k](\mathbf{R}^{3+1})}. \tag{37}$$

We can now formulate the following energy inequality, which is the essential link between the $N[k]$ and $S[k]$-norm that will allow us to finish the bootstrapping argument:

$$\|P_k\phi\|_{S[k]([-T,T]\times\mathbf{R}^3)} \le C\Big[\|\Box P_k\phi\|_{N[k]([-T,T]\times\mathbf{R}^3)} + \|\phi[0]\|_{\dot H^{1/2}\times\dot H^{-1/2}}\Big], \tag{38}$$

where C is independent of T. This is proved as in [24]; the only difference between our $S[k,\kappa]$ norm and Tao’s $S[k,\kappa]$-norm is their scaling, which doesn’t affect the proof. It is important that the $S[k]([-T,T]\times\mathbf{R}^3)$-norms of the frequency localized components of a Schwartz function are in a sense uniformly lower semicontinuous with respect to T, as demonstrated in [24]. In particular, we may assume that $T>0$ has been chosen such that the component functions $\phi$ of our Wave Map satisfy

$$\|P_k\phi\|_{S[k]([-T,T]\times\mathbf{R}^3)} \le Cc_k, \tag{39}$$

where $c_k$ is a frequency envelope associated with the initial conditions $\phi[0]\times\partial_t\phi[0]$ as above, i.e.

$$c_k := \Big(\sum_{k'}2^{-\delta|k'-k|}\big(\|P_{k'}\phi\|_{\dot H^{1/2}} + \|P_{k'}\partial_t\phi\|_{\dot H^{-1/2}}\big)^2\Big)^{1/2}. \tag{40}$$

Moreover, since we assume that $\phi$ is rapidly decaying in space directions, we can construct a Schwartz function $\tilde\phi$ with $\tilde\phi|_{[-T,T]} = \phi$ and such that $\|P_k\tilde\phi\|_{S[k]}\le 2Cc_k$. This is achieved by using a partition of unity. We will always substitute $\tilde\phi$ for $\phi$ when making actual estimates.

Notation. The Riesz operators $R_\nu$, $\nu\in\{0,1,2,3\}$, refer to operators $\partial_\nu(\sqrt{-\Delta_x})^{-1}$. We usually omit the subscript for operators like $\nabla_x$, $\Delta_x$, understanding that they refer only to space variables. The symbol $\nabla^{-1}$ is either a shorthand for an operator $\Delta^{-1}\partial_i$, or else refers to $(\sqrt{-\Delta})^{-1}$, depending on the context. We use the notation $P_{k+O(1)} = \sum_{k_1=k+O(1)}P_{k_1}$, $Q_{j+O(1)} = \sum_{j_1=j+O(1)}Q_{j_1}$. Also, $\|\phi\|_{S[k+O(1)]} = \sum_{k_1=k+O(1)}\|P_{k_1}\phi\|_{S[k_1]}$ etc.

The following terminology, introduced by T. Tao in [24], shall be useful in the future: we call a Fourier multiplier disposable if it is given by convolution with a translation invariant measure of mass $\le O(1)$; in particular, operators such as $P_k$, $P_kQ_{<j}$, $P_kQ_{>j}$, where $j\ge k+O(1)$, are disposable, see the above reference. By contrast, $Q_j$ is not disposable. However, it acts boundedly on Lebesgue spaces of the form $L^p_tL^2_x$.

Whenever we consider an expression of the form $P_0(AB[CD])$, for example, we shall refer to A, B, C, D as inputs and the whole expression as output. Also, when referring to [, ], we mean $[CD]$, while (, ) would refer to $P_0(AB[CD])$; thus the shape of brackets matters in the discussion. When considering a part of the whole expression such as $[CD]$, we may also refer to this as output, and C, D as inputs, depending on the context. In the proof of the Gauge Change estimate, we shall use the term modulation to refer to the distance of the (space time) Fourier support of a function to the light cone.

Summary of the key properties satisfied by these spaces. The paradifferential Calculus approach chosen in this paper enables us to divide the nonlinearity into different pieces (obtained upon microlocalizing all the inputs as well as the output) which can be controlled individually. However, the fact that we start out with refined information about the frequency localized components of the Wave Map forces us to retrieve the refined information via the bootstrapping argument. Thus while on the one hand we gain from the fact that we can subdivide the nonlinearity into many pieces each of which is amenable to an individual attack, we lose in that we have to recover the original frequency envelope from our estimates. For example, whenever enacting a Gauge Change of the form $\psi := f(\Delta^{-1}\sum_{k=1}^{3}\partial_k\tilde\phi^1_k)\tilde\phi$, where $\tilde\phi$, $\tilde\phi^1_k$ are Schwartz functions (the latter real valued¹⁰) agreeing with $\phi$, $\phi^1_k$ on $[-T,T]$ and for which the $S[k]$-norms of the frequency localized pieces sit under approximately the same frequency envelope, we shall need to know that the frequency modes of $\psi$ are controlled by a dilate of the same frequency envelope. Moreover, we shall have to rely on refined multilinear estimates which allow us to sum over all possible frequency interactions contributing to a fixed frequency mode of the nonlinearity, as well as to recover the original frequency envelope. We summarize here the key properties to be referred to throughout the rest of the paper:

3.1. The Gauge Change estimate.

Proposition 3.1. Let $f(x)$ be a smooth function all of whose derivatives are bounded. Also, let $\phi_i$, $i = 1,2,3,4$, be Schwartz functions satisfying the condition $\max_i\|P_k\phi_i\|_{S[k]}\le c_k$ for a ‘sufficiently flat’ frequency envelope $\{c_k\}$ (i.e. $\sigma$ in the definition sufficiently small). Then

$$\Big\|P_k\Big[f\Big(\Delta^{-1}\sum_{j=1}^{3}\partial_j\phi_j\Big)\phi_4\Big]\Big\|_{S[k]}\le Cc_k.$$

We shall give the proof later in the paper.

3.2. Bilinear estimates.

$Q_0$ null-form estimates.

Theorem 3.2. Let $\phi$, $\psi$ be Schwartz functions on $\mathbf{R}^{3+1}$. We have

$$\|P_k[R_\nu P_{k_1}\phi\,\partial^\nu P_{k_2}\psi]\|_{N[k]}\le C2^{-\delta\max\{k_1-k,0\}}\|P_{k_1}\phi\|_{S[k_1]}\|P_{k_2}\psi\|_{S[k_2]}$$

for some $\delta>0$. Also, we have

$$\|P_k\nabla_x[R_\nu P_{k_1}\phi\,R^\nu P_{k_2}\psi]\|_{N[k]}\le C\|P_{k_1}\phi\|_{S[k_1]}\|P_{k_2}\psi\|_{S[k_2]}.$$

Finally

$$\|R_\nu\phi\,R^\nu\psi\|_{L^2_tL^2_x}\le C\Big(\sum_{k_1}\|P_{k_1}\phi\|^2_{S[k_1]}\Big)^{\frac12}\Big(\sum_{k_2}\|P_{k_2}\psi\|^2_{S[k_2]}\Big)^{\frac12}.$$

10 Note that the $S[k]$, $N[k]$ are conjugation invariant. Thus we can always find real-valued extensions of our component functions with the required properties.

The first two inequalities are due (in somewhat modified form) to T. Tao [24]. We present proofs for the above versions (our spaces being scaled down with respect to Tao’s) in [13].

Theorem 3.3. Let $\phi$, F be Schwartz functions, and $k_1 = k_2+O(1)$. Then we have

$$\|P_0(P_{k_1}\phi\,P_{k_2}F)\|_{N[0]}\le C2^{-\delta k_1}\|P_{k_1}\phi\|_{S[k_1]}\|\nabla_x(P_{k_2}F)\|_{N[k_2]}$$

for some $\delta>0$. Moreover, we have the estimate

$$\|P_0\nabla_x(\phi\,P_{k_2}F)\|_{N[0]}\le C\Big(\|\phi\|_{L^\infty_tL^\infty_x}+\sup_k\|P_k\nabla_x\phi\|_{S[k]}\Big)\|\nabla_x(P_{k_2}F)\|_{N[k_2]}.$$

This is again due to Tao [24] in slightly different form. Proofs may be found in [24, 13]. Bilinear algebra and Qνj -estimate. Theorem 3.4. Let φ1 , φ2 be Schwartz functions. Then if j ≤ k, we have ∀ > 0 and 0 < δ < , |k1 −k2 | ||Pki φi ||S[ki ] , ||Pk Qj (Pk1 φ1 Pk2 φ2 )||X˙ −,,∞ ≤ C,δ 2δ min{j −min{k1 ,k2 ,k},0} 2− 2 i=1,2

1

||Pk Qj (Pk1 φ1 Pk2 φ2 )|| ˙ − 1 , 1 ,∞ ≤ C 2 2+ min{j −min{k1 ,k2 ,k},0} 2−|k1 −k2 | X

2 2

||Pki φi ||S[ki ] .

i=1,2

Also, one has the inequality µ

||Pk (Pk1 φPk2 ψ)||L2 L2+µ ≤ Cµ 2 4+2µ k 2− t

|k1 −k2 | 2

x

||Pki ψi ||S[ki ]

i=1,2 p

for any µ > 0. In particular, we can control the L4t Lx -norm, p > 4, of the k th frequency 2 component in terms of S[k], and by interpolation with L∞ t Lx , one controls all norms of p q 1 1 1 11 the form Lt Lx , p + q < 2 , p ≥ 4, at that frequency. Finally, we have ||Pk (Rν Pk1 ψ1 Rj Pk2 ψ2 − Rj Pk1 ψ1 Rν Pk2 ψ2 )||L2 L2 t x |k −k | − 1 2 2 −|k−max{k1 ,k2 }| ≤ C2 2 ||Pki ψi ||S[ki ] . i=1,2

This theorem, proved in [13], would be essentially superfluous if S[k] could be customized in such a way as to be included in L4t L4x . One can also majorize ||P0 R0 φ||L4 Lp by C||P0 φ||S[k] . For P0 Q<0 φ, this follows from the immet x diately preceding, whereas for P0 Q≥0 φ, this is a consequence of Bernstein’s inequality. 11


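3.3. Trilinear null-form estimates.

Proposition 3.5. Let $\psi_l$, $l = 1,2,3$, be Schwartz functions on $\mathbf{R}^{3+1}$. We then have the estimate

$$\Big\|P_0\Big[\sum_{j=1}^{3}\Delta^{-1}\partial_j\big[R_\nu P_{k_1}\psi_1\,R_jP_{k_2}\psi_2 - R_jP_{k_1}\psi_1\,R_\nu P_{k_2}\psi_2\big]\partial^\nu P_{k_3}\psi_3\Big]\Big\|_{N[0]} \le C2^{-\delta_1|k_1-k_2|}2^{\delta_2\min\{k_3-\max\{k_1,k_2\},0\}}2^{-\delta_3|k_3|}\prod_{l=1}^{3}\|P_{k_l}\psi_l\|_{S[k_l]} \tag{41}$$

for appropriate constants $\delta_1,\delta_2,\delta_3>0$. As a corollary, we have

$$\Big\|P_0\Big[\sum_{j=1}^{3}\Delta^{-1}\partial_j\big[R_\nu\psi_1R_j\psi_2 - R_j\psi_1R_\nu\psi_2\big]\partial^\nu\psi_3\Big]\Big\|_{N[0]}\le C\sum_{k\in\mathbf{Z}}c_k^2\,c_0$$

provided $\max_{i=1,2,3}\|P_k\psi_i\|_{S[k]}\le c_k$ for some frequency envelope $\{c_k\}$ which is “sufficiently flat”, i.e. $\sigma\ll\min\{\delta_i\}$.

Proposition 3.6. Let $\psi_i$ be as above. Then we have the inequalities

$$\|P_0[R_\nu P_{k_1}\psi_1\,R^\nu P_{k_2}\psi_2\,P_{k_3}\psi_3]\|_{N[0]}\le C2^{-\delta_1|k_1-k_2|}2^{\delta_2\min\{k_3-\max\{k_1,k_2\},0\}}2^{-\delta_3|k_3|}\prod_{l=1}^{3}\|P_{k_l}\psi_l\|_{S[k_l]},$$

$$\|P_0[\nabla^{-1}(R_\nu P_{k_1}\psi_1\,\partial^\nu P_{k_2}\psi_2)P_{k_3}\psi_3]\|_{N[0]}\le C2^{-\delta_1|k_1-k_2|}2^{\delta_2\min\{k_3-\max\{k_1,k_2\},0\}}2^{-\delta_3|k_3|}\prod_{l=1}^{3}\|P_{k_l}\psi_l\|_{S[k_l]}$$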

for appropriate $\delta_1,\delta_2,\delta_3>0$. One obtains a similar corollary as in the preceding proposition. Both of these are proved in [13]. The 2nd proposition is a simpler variant of an inequality in [24].

3.4. Quadrilinear null-form estimates.

Proposition 3.7. Let $\psi_i$, $i = 1,2,3,4$, be Schwartz functions satisfying $\|P_k\psi_i\|_{S[k]}\le c_k$ for a “sufficiently flat” frequency envelope $\{c_k\}$. Then we have the inequality

$$\Big\|P_0\Big[\sum_{i,j=1}^{3}\Delta^{-1}\partial_j\big(\Delta^{-1}\partial_i(R_\nu\psi_1R_i\psi_2 - R_i\psi_1R_\nu\psi_2)R_j\psi_3\big)\partial^\nu\psi_4 - \sum_{i,j=1}^{3}\Delta^{-1}\partial_j\big(\Delta^{-1}\partial_i(R_j\psi_1R_i\psi_2 - R_i\psi_1R_j\psi_2)R_\nu\psi_3\big)\partial^\nu\psi_4\Big]\Big\|_{N[0]}\le C\Big(\sum_{k\in\mathbf{Z}}c_k^2\Big)^{\frac32}c_0.$$

The proof of this, which implicitly relies on an identity similar to but more complicated than the one recorded in Proposition 3.5, can also be found in [13].


4. Proof of Proposition 1.1

We shall present the detailed argument provided (M, g) falls into the first category. The other cases are handled more or less identically. For a given Wave Map u, we introduce the variables $\phi^i_\alpha$, $i = 1,2$, $\alpha = 0,1,2,3$, as follows:

$$\sum_{i=1,2}\phi^i_\alpha e_i(u) = u_*(\partial_\alpha).$$

Then recall the fundamental div-curl system

$$\partial_\beta\phi^i_\alpha - \partial_\alpha\phi^i_\beta = C^i_{jk}(u)\phi^j_\alpha\phi^k_\beta, \tag{42}$$

$$\partial_\alpha\phi^{i\alpha} = -\Gamma^i_{jk}(u)\phi^j_\beta\phi^k_\gamma m^{\beta\gamma}. \tag{43}$$

We pass from these to the corresponding wave equations, which take the form

$$\Box\phi^i_\alpha = -2\Gamma^i_{kj}(u)\phi^k_\beta\partial^\beta\phi^j_\alpha + A^i_{jkl}(u)\phi^j_\beta\phi^{k\beta}\phi^l_\alpha, \tag{44}$$

where we have used the fact that $C^i_{jk} = \Gamma^i_{jk} - \Gamma^i_{kj}$, as well as

$$\partial_\lambda(f(u)) = \sum_{i=1,2}e_i(f)(u)\phi^i_\lambda$$

for any smooth function $f: M\to\mathbf{R}$ and $\lambda = 0,1,2,3$. Our assumptions in Subsect 3.1 imply that we can extend the $A^i_{jkl}$ to an open neighborhood of M in $\mathbf{R}^k$, where all their derivatives are bounded. We shall prove Theorem 1.1 via the following Bootstrapping Proposition:

Proposition 4.1. Let $T>0$, let $u: \mathbf{R}^{3+1}\to M$ be a smooth Wave Map on a time interval $[-T,T]$, and let the notation be as above; then there exist a number $\epsilon>0$ and a large constant $M>0$ independent of T, u, such that the following holds:

$$\|P_k\nabla_xu\|_{S[k]([-T,T]\times\mathbf{R}^3)} + \sup_{i,\alpha}\|P_k\phi^i_\alpha\|_{S[k]([-T,T]\times\mathbf{R}^3)} < Mc_k \;\Longrightarrow\; \|P_k\nabla_xu\|_{S[k]([-T,T]\times\mathbf{R}^3)} + \sup_{i,\alpha}\|P_k\phi^i_\alpha\|_{S[k]([-T,T]\times\mathbf{R}^3)} < \frac{M}{2}c_k$$

for all sufficiently flat¹² frequency envelopes $c_k$ satisfying $\big(\sum_{k\in\mathbf{Z}}c_k^2\big)^{\frac12}<\epsilon$.

Theorem 1.1 follows from this and the subcritical result of Klainerman-Machedon [8].¹³

12 In the sense that the $\sigma$ used in its defining property is small enough.

13 Note that the Wave Maps equation in terms of u is $\Box(i\circ u)^l = B^l_{jk}(u)(\partial_\nu(i\circ u),\partial^\nu(i\circ u))$, where $B^i_{jk}$ is the 2nd fundamental form of the embedding i. This is structurally identical to the local formulation of Wave Maps studied in [8].


Proof. We employ roughly the same strategy as the one outlined in Sect. 2. The first step consists in changing the Gauge in order to improve the leading term of the nonlinearity. For this, we employ a Coulomb Gauge of the following form: √ √ l 1 −1 3 j =1 ∂j (l2 (u)φj ) (φ 1 + −1ψα2 = e −1 −1φα2 ). α 1 (u)φ l ) = , we deduce the following Upon introducing the notation −1 3j =1 ∂j (l2 j wave equation: √ √ j ψα = Mµ ∂ µ ψα + −1[ + −1∂ν ∂ ν ]ψα + ei (A1j kl (u)φβ φ kβ φαl 3 √ √ j 1 + −1A2j kl (u)φβ φ kβ φαl ) − Mµ −1−1 ∂j ∂µ (l2 (u)φjl )ψα . (45)

ψα := ψα1 +

√

j =1

The Mµ in turn satisfy the following elliptic div-curl system: 3

∂j Mj = 0,

j =1

√ j 1 1 ∂l Mα − ∂α Ml = − −1[∂l (k2 (u)φαk ) − ∂α (k2 (u)φlk )] := Ej k (u)φl φαk , where the Ej k (.) are skew-symmetric in j, k and extend as smooth functions with bounded derivatives of all orders to a neighborhood of M in Rk .14 This system allows us easily to solve for the Mα , as follows:   3 j Mα = −1 ∂l  Ej k (u)φl φαk  . l=1

j,k=1,2

The conclusion upon substituting these expressions into (45) is that the new leading term of the nonlinearity is the following:   3 j ψα = −1 ∂l  Ej k (u)φl φµk  ∂ µ φα + ... . l=1

j,k=1,2

We need to make one more substitution, namely E12 (u)φλ2 = θλ1 . Note that by virtue of Proposition 3.1, the k th frequency mode of ψα as well as the k th frequency mode of θλ1 have their S[k]-norm bounded by a suitable dilate of {ck }. We reformulate the wave equation as follows: ψα =

3

−1 ∂l (θµ1 φl1 − θl1 φµ1 )∂ µ φα + ... .

l=1

In order to render the null-structure visible, we implement the dynamic separation associated with the curl equation (42) to decompose the φαi into a “dynamic” (gradient) part and an “elliptic” part (determined via an elliptic divergence curl system). 14

We shall from now on omit such qualifications as they are automatic from our assumptions.


It is easily checked that the θα1 satisfy an analogous curl-system, and can be similarly decomposed. More specifically, we write φαi = Rα i + φ˜ αi , i = 1, 2, θα1 = Rα 1 + θ˜α1 , where the Rα are Riesz operators as in Sect. 2, and we have set =− i

3

Rk φki ,

=− 1

k=1

3

Rk θk1 .

k=1

These “potentials” satisfy similar estimates (up to constants) as the φα . The trilinear null-form arising upon substituting the gradient parts is of an identical nature as the one discussed in Sect. 2. Moreover, taking into account the fact that we have identities of the form φ˜ αi =

3

−1 ∂l

l=1

Dji k (u)φαj φlk

j,k=1,2

for skew-symmetric Dji k (u), and similar identities for the θ˜α1 , reveals that substituting an “elliptic part” for either φαi or θαi results in terms at least quadrilinear of the following structure: 3

−1 ∂l (θl1

1 −1 ∂r (D12 (u)(φr1 φµ2 − φr2 φµ1 ))∂ µ φα

r=1

l=1

−

3

3 l=1

−1 ∂l (θµ1

3

1 −1 ∂r (D12 (u)(φr1 φl2 − φr2 φl1 ))∂ µ φα

r=1

∇ −1 (∇ −1 (C(u)φ 2 )∇ −1 (D(u)θ 2 ))∇x,t φ,

(46)

where the latter term15 is of course only recorded in schematic form (we don’t need its fine structure). As to the quadrilinear terms, we simply repeat the previous step of introducing new variables 1 ξλ = D12 (u)φλ2 .

These satisfy similar (frequency localized) estimates as the φαi and also a similar curl system, which allows us to apply dynamic separation ξλ = Rλ + ξ˜λ , ξ˜λ = ∇ −1 (A(u)(φ 2 )). Recall that we use the shorthand ∇ −1 for operators of the type −1 ∂j ; occasionally, we shall also √ −1 use this notation to denote the multiplier − . 15


Carrying out the substitution leads to a quadrilinear null-form

3 3 −1 1 −1 1 1 1 1 ∂l R l ∂r (Rr Rµ − Rµ Rr ) ∂ µ φα l=1

−

3 l=1

−1

r=1

∂l R µ

1

3

−1

∂r (Rr Rl − Rl Rr ) ∂ µ φα 1

1

1

1

r=1

as well as error terms of the following schematic form¹⁶:

$$\nabla^{-1}\big(\phi\nabla^{-1}(\phi\nabla^{-1}(A(u)\phi^2))\big)\nabla_{x,t}\phi,\qquad \nabla^{-1}\big(\nabla^{-1}(C(u)\phi^2)\nabla^{-1}(D(u)\phi^2)\big)\nabla_{x,t}\phi,$$

and similar terms of higher degree of linearity (up to degree 7). For future reference, we note that on account of Proposition 3.1, one can always replace $A(u)\phi$ by $\phi$. Thus, to summarize the preceding discussion we state

Observation 1. The leading term $M_\mu\partial^\mu\psi$ can be decomposed into the sum of trilinear null-forms¹⁷ of the type in Proposition 3.5, quadrilinear null-forms of the type contained in Proposition 3.7 and error terms at least quintilinear of the schematic form:

$$\nabla^{-1}\big(\phi\nabla^{-1}(\phi\nabla^{-1}(\phi^2))\big)\nabla_{x,t}\phi,\qquad \nabla^{-1}\big(\nabla^{-1}(\phi^2)\nabla^{-1}(\phi^2)\big)\nabla_{x,t}\phi,$$

and similar terms of higher degree of linearity.

The remaining terms in the nonlinearity of (45) are handled similarly. The third, fourth and fifth term lead to trilinear null-forms of the type contained in Proposition 3.6 upon enacting dynamic separation, as well as quadrilinear terms of the form $\nabla^{-1}(\phi^2)\phi^2$. These in turn are decomposed into quadrilinear null-forms of the schematic type

$$\nabla^{-1}(R_\nu\phi_1R_j\phi_2 - R_j\phi_1R_\nu\phi_2)\phi^2,$$

where $\phi_1$, $\phi_2$ refer to suitable expressions $A_{1,2}(u)\phi$, as well as terms at least quintilinear of the type

$$\nabla^{-1}(\nabla^{-1}(\phi^2)\phi)\phi^2,\qquad \nabla^{-1}\big(\nabla^{-1}(\phi^2)\nabla^{-1}(\phi^2)\big)\phi^2.$$

16 We are fudging the distinction between the variables $\phi^i_\alpha$, $\theta^i_\alpha$, $\xi^i_\alpha$, since they are essentially equivalent as far as estimates are concerned.

17 Whose inputs have frequency modes satisfying the same inequalities as the original $\phi^i_\alpha$ but with respect to a dilate of the frequency envelope $\{c_k\}$.


The sixth term of the nonlinearity is decomposed into terms of the exact same type as in the immediately preceding. What remains is the expression ψα contained in the 2nd term of the nonlinearity. We reformulate it using (44). One obtains the expression 3

−1 ∂l (ji k φνj ∂ ν φlk + Aij kl (u)φβ φ kβ φαl )ψα j

l=1 j

i := i φ and implementing dynamic which, upon introducing the new variables ηkν jk ν j

separation with respect to these variables (as well as the φβ for the 2nd summand), turns into a trilinear null-form (whose fine structure we have suppressed) ∇ −1 (Rν E∂ ν φ)ψ as well as quadrilinear terms of the rough form ∇ −1 (∇ −1 (φ 2 )∇x,t φ)φ, ∇ −1 (Rβ φ1 R β φ2 φ3 )φ4 , and error terms of the form ∇ −1 (∇ −1 (φ 2 )φ 2 )φ, ∇ −1 (∇ −1 (φ 2 )∇ −1 (φ 2 )φ)φ. The first kind of quadrilinear expression needs to be further decomposed into quadrilinear null-forms and error terms at least quintilinear. Reiterating dynamic separation with respect to suitable variables allows one to decompose such terms into the sum of schematically written quadrilinear null-forms: ∇

−1

3

−1

∂l (Rl φ1 Rν φ2 − Rν φ1 Rl φ2 )∂ φ3 ψα , ν

l=1

as well as error terms of the schematic form ∇ −1 (∇ −1 (∇ −1 (φ 2 )φ)∇x,t φ)φ, ∇ −1 (∇ −1 (∇ −1 (φ 2 )∇ −1 (φ 2 ))∇x,t φ)φ. We summarize this discussion as follows:


Observation 2. The remaining terms of the nonlinearity can be expressed as a sum of trilinear null-forms of the types contained in Proposition 3.6, quadrilinear null-forms of the type 3

−1 −1 ν ∂l (Rl φ1 Rν φ2 − Rν φ1 Rl φ2 )∂ φ3 φ4 , ∇ l=1

∇ −1 (Rν φ1 Rj φ2 − Rj φ1 Rν φ2 )φ 2 , ∇ −1 (Rβ φ1 R β φ2 φ3 )φ4 , as well as error terms at least quintilinear of the schematic form ∇ −1 (∇ −1 (∇ −1 (φ 2 )φ)∇x,t φ)φ, ∇ −1 (∇ −1 (φ 2 )φ)φ 2 , ∇ −1 (∇ −1 (φ 2 )φ 2 )φ, ∇ −1 (∇ −1 (∇ −1 (φ 2 )∇ −1 (φ 2 ))∇x,t φ)φ, ∇ −1 (∇ −1 (φ 2 )∇ −1 (φ 2 ))φ 2 , ∇ −1 (∇ −1 (φ 2 )∇ −1 (φ 2 )φ)φ. In order to proceed with the proof of Proposition 4.1, we need to estimate the 0th frequency component of each of the expressions recorded in Observation 1, 2, and close by means of the energy inequality (38). More precisely, for any expression F (φ1 , φ2 , · · · , φk ) occurring in Observation 1, 2, we need to establish an inequality  ||P0 F (φ1 , φ2 , · · · , φk )||N[0] ≤ CM M

1 l 2 ck2  c0

k

for some $l>0$, provided the $\phi_i$ are Schwartz functions satisfying $\|P_k\phi_i\|_{S[k]}\le CMc_k$ for a sufficiently flat frequency envelope $\{c_k\}$. This has already been achieved for the trilinear null-forms as well as the quadrilinear null-form in Observation 1 by means of Proposition 3.5, Proposition 3.6, Proposition 3.7. For the following computations, we shall make frequent use of the basic Bernstein’s inequality¹⁸, which states that for any measurable set $R\subset\mathbf{R}^n$ and $\infty\ge p\ge 2$, we have

$$\|F^{-1}(\chi_RF\phi)\|_{L^p_x}\le C|R|^{\frac12-\frac1p}\|\phi\|_{L^2_x}.$$

18 as well as simple variations thereof.
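The endpoint $p=\infty$ of Bernstein’s inequality has an exact discrete analogue that is easy to test: for a sequence of length N whose discrete Fourier transform is supported on a set R, Cauchy-Schwarz and Parseval give $\|f\|_{\ell^\infty}\le\sqrt{|R|/N}\,\|f\|_{\ell^2}$. The sketch below checks this on hypothetical data; the periodic grid, the hand-written transform and all names are illustrative assumptions, not part of the paper’s continuous setting.

```python
import cmath
import math

def inverse_dft(spectrum):
    """Inverse DFT with the 1/N normalisation, written out by hand."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n
        for x in range(n)
    ]

def bernstein_ratio(spectrum):
    """Return ||f||_inf / (sqrt(|R|/N) ||f||_2), R = support of the spectrum."""
    n = len(spectrum)
    f = inverse_dft(spectrum)
    support = sum(1 for s in spectrum if s != 0)
    sup_norm = max(abs(v) for v in f)
    l2_norm = math.sqrt(sum(abs(v) ** 2 for v in f))
    return sup_norm / (math.sqrt(support / n) * l2_norm)

# Hypothetical spectrum supported on four of N = 64 frequencies.
N = 64
spectrum = [0j] * N
for k, value in [(3, 1 + 2j), (4, -0.5j), (5, 2.0), (17, 1j)]:
    spectrum[k] = value
ratio = bernstein_ratio(spectrum)
print(ratio <= 1 + 1e-9)
```

The smaller the frequency support, the stronger the gain, which is the mechanism exploited below when one factor is localized to a low frequency.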


The 2nd quadrilinear null-form in Observation 2. Use the shorthand ∇ −1 (Rν φ1 Rl φ2 − Rl φ1 Rν φ2 ) = Qν,j (φ1 , φ2 ). Then we decompose P0 [Qν,j (φ1 , φ2 )φ3 φ4 ] =

P0 [Qν,j Pk (Pk1 φ1 , Pk2 φ2 )Pk3 φ3 Pk4 φ4 ].

k,k1,2,3,4 | max{k1 ,k2 }>k+O(1) 1 1 Now we use Theorem 3.4. Choose 2+ close to 2 and let M + 2+ = 21 . Then P0 [Qν,j Pk (Pk1 φ1 , Pk2 φ2 )Pk3 φ3 Pk4 φ4 ]||L1 L2 || t

k,k1,2,3,4 | max{k1 ,k2 }>k+O(1)

≤C

k≥0, k1,2,3,4 | max{k1 ,k2 }>k+O(1)

+

k<0, k1,2,3,4 | max{k1 ,k2 }>k+O(1)

||Pk3 φ3 Pk4 φ4 ||L2 L2+ ||Qν,j Pk (Pk1 φ1 , Pk2 φ2 )||L2 L2 x t

t

x

||Pk3 φ3 Pk4 φ4 ||L2 L2+ ||Qν,j Pk (Pk1 φ1 , Pk2 φ2 )||L2 LM x

≤ CM 4

x

t

t

2−

(1−) 2 |k|

2k−max{k1 ,k2 } 2−

|k1 −k2 | 2

2−

|k3 −k4 | 2

k,k1,2,3,4 | max{k1 ,k2 }>k+O(1)

x

ci .

i

It is straightforward to verify that the summation can be carried out to provide the desired estimate for any sufficiently flat envelope.

The first quadrilinear null-form in Observation 2. Use the shorthand

$$\sum_{j=1}^{3}\Delta^{-1}\partial_j(R_\nu\phi_1R_j\phi_2 - R_j\phi_1R_\nu\phi_2)\partial^\nu\phi_3 = N(\phi_1,\phi_2,\phi_3).$$

We use the following Littlewood-Paley trichotomy:

$$P_0[\nabla^{-1}N(\phi_1,\phi_2,\phi_3)\phi_4] = \sum_{k>10,\ k=k_4+O(1)}P_0[P_k\nabla^{-1}N(\phi_1,\phi_2,\phi_3)P_{k_4}\phi_4] + \sum_{k\in[-10,10],\ k_4\le 15}P_0[P_k\nabla^{-1}N(\phi_1,\phi_2,\phi_3)P_{k_4}\phi_4] + \sum_{k<-10,\ k_4\in[-5,5]}P_0[P_k\nabla^{-1}N(\phi_1,\phi_2,\phi_3)P_{k_4}\phi_4]. \tag{47}$$
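The three index ranges in such a trichotomy come from elementary support considerations: a product of functions at frequencies $\sim 2^k$ and $\sim 2^{k_4}$ has Fourier support where $|\xi_1+\xi_2|$ can land, and $P_0$ only sees the shell $\frac12\le|\xi|\le 2$. The following sketch enumerates the admissible pairs under the assumption that frequency $\sim 2^k$ means the closed shell $[2^{k-1},2^{k+1}]$ (the support of $m_0$ quoted earlier); the function names are invented for illustration.

```python
def shell(k):
    """Radial Fourier support [2^(k-1), 2^(k+1)] of a function at frequency ~2^k."""
    return 2.0 ** (k - 1), 2.0 ** (k + 1)

def product_reaches_frequency_zero(k, k4):
    """Can P_0 of (frequency-2^k piece) * (frequency-2^k4 piece) be nonzero?"""
    lo1, hi1 = shell(k)
    lo2, hi2 = shell(k4)
    # |xi1 + xi2| ranges over [max(0, lo1 - hi2, lo2 - hi1), hi1 + hi2].
    smallest = max(0.0, lo1 - hi2, lo2 - hi1)
    largest = hi1 + hi2
    out_lo, out_hi = shell(0)
    return smallest <= out_hi and largest >= out_lo

def admissible_k4(k, search=range(-60, 61)):
    """All k4 in the search window that can interact with k to output frequency ~1."""
    return [k4 for k4 in search if product_reaches_frequency_zero(k, k4)]

print(admissible_k4(20))   # high-high interaction: k4 = k + O(1)
print(admissible_k4(-20))  # low-high interaction: k4 = O(1)
```

With these conventions the admissible ranges are slightly narrower than the ones written in (47), which leaves room for the wider cutoffs used in the text.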

The first summand on the right-hand side is estimated by means of Proposition 3.5 as well as Theorem 3.3:

$$\Big\|\sum_{k>10,\ k=k_4+O(1)}P_0[P_k\nabla^{-1}N(\phi_1,\phi_2,\phi_3)P_{k_4}\phi_4]\Big\|_{N[0]} \le C\sum_{k>10,\ k=k_4+O(1)}2^{-\delta k_4}\|P_kN(\phi_1,\phi_2,\phi_3)\|_{N[k]}\|P_{k_4}\phi_4\|_{S[k_4]} \le CM^4\sum_rc_r^2\sum_{k>10,\ k=k_4+O(1)}2^{-\delta k_4}c_kc_{k_4} \le CM^4\sum_rc_r^2\,c_0^2$$

provided we choose the frequency envelope sufficiently flat, i.e. $\sigma\ll\delta$.
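The role of flatness in the last step can be made concrete. By (28) one has $c_k\le 2^{\sigma|k|}c_0$, so a sum of the form $\sum_{k>10}2^{-\delta k}c_k^2$ is dominated by a geometric series with ratio $2^{-(\delta-2\sigma)}$, which converges precisely when $2\sigma<\delta$. The sketch below compares a flat and a steep envelope; the numerical values of $\delta$ and $\sigma$ and the truncation `k_max` are hypothetical choices for illustration.

```python
def high_high_sum(c0, sigma, delta, k_max=2000):
    """Upper bound for sum_{k>10} 2^{-delta k} c_k^2 using c_k <= 2^{sigma k} c_0."""
    return sum(
        2.0 ** (-delta * k) * (2.0 ** (sigma * k) * c0) ** 2
        for k in range(11, k_max)
    )

delta, c0 = 0.1, 1.0
flat = high_high_sum(c0, sigma=0.001, delta=delta)
steep = high_high_sum(c0, sigma=0.06, delta=delta)

# Closed form of the full geometric series in the flat case (ratio 2^{-(delta - 2 sigma)}).
rate = delta - 2 * 0.001
geometric = 2.0 ** (-rate * 11) / (1 - 2.0 ** (-rate))

print(flat <= geometric * c0 ** 2 * (1 + 1e-9))  # flat envelope: bounded by C * c_0^2
print(steep > 1e6)                               # 2 * sigma > delta: partial sums blow up
```

This is why the condition $\sigma\ll\delta$ accompanies every summation over frequency interactions.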


The 2nd summand on the right-hand side of (47) is more of the same. As to the third, we decompose it further as follows:

P0 [Pk ∇ −1 N (φ1 , φ2 , φ3 )Pk4 φ4 ]

k<−10, k4 ∈[−5,5]

=

P0 [Pk ∇ −1 N (φ1 , φ2 , P
k<−10, k4 ∈[−5,5]

+

P0 [Pk ∇ −1 N (φ1 , φ2 , P≥k+C φ3 )Pk4 φ4 ].

k<−10, k4 ∈[−5,5]

Observe that the first summand in the immediately preceding can be schematically written as a sum of terms of the following form:

P0 [Pk ∇ −1 N (φ1 , φ2 , P
k<−10, k4 ∈[−5,5]

=

P0 [Pk ∇ −1 (Qν,l (φ1 , φ2 )∇x,t P
(48)

k<−10, k4 ∈[−5,5]

This is estimated by means of Theorem 3.4: let

||

2 4+

+

1 M

= 21 ,

P0 [Pk ∇ −1 (Qν,l (φ1 , φ2 )∇x,t P
k<−10, k4 ∈[−5,5]

≤C

2−k ||P
k<−10, k4 ∈[−5,5]

≤ CM 4 (

r

cr2 )

2

k 2+

x

||∇x,t P
k<−10, k4 ∈[−5,5]

r

The 2nd term in (48) is estimated by means of the precise formulation of Proposition 3.5:

||

k<−10, k4 ∈[−5,5]

≤

≤

P0 [Pk ∇ −1 N (φ1 , φ2 , P≥k+C φ3 )Pk4 φ4 ]||N[0]

ki , i∈{1,2,3}| max{k1 ,k2 }>k3 +O(1), k3 ≥k+C k4 ∈[−5,5] ||P0 [Pk ∇ −1 N (Pk1 φ1 , Pk2 φ2 , Pk3 φ3 )Pk4 φ4 ]||N[0] 4 CM 4 2−δ1 |k1 −k2 | 2δ2 (k3 −max{k1 ,k2 }) cki . i=1 ki , i=1,2,3| max{k1 ,k2 }>k3 +O(1) k4 ∈[−5,5]

This summation can again be carried out, provided the frequency envelope is sufficiently flat.

The third quadrilinear null-form in Observation 2. This is treated similarly to the preceding by means of Proposition 3.6 and therefore left out.


The first quintilinear term of Observation 1. We note the following elementary estimates: on account of Theorem 3.4, we have ||∇ − Pa (Pb φ1 ∇ −1 Pc (φ2 φ3 ))|| where

1 p

=

5 12

4 p Lt3 Lx

≤ C 2µ()(min{a,b,c}−max{a,b,c}) ||Pb φ1 ||S[b] ,

− 3 , > 0 very small and µ() > 0.19

Next, we note that ||Pa ∇ −1 (Pb φ∇ −(1−) Pc F )||L1 L∞ ≤ C 2λ()(min{a,b,c}−max{a,b,c}) t x ||Pb φ||S[b] ||Pc F ||

||Pa ∇ −2 (Pb φ∇ −(1−) Pc F )||L1 L3+ ≤ C 2λ()(min{a,b,c}−max{a,b,c}) t x ||Pb φ||S[b] ||Pc F || where p is as before and λ() > 0,

1 3+

=

1 3

4

p

Lt3 Lx

4

p

Lt3 Lx

,

,

− 3 . Now use the trichotomy

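\begin{align*}
\|P_0[\nabla^{-1}(\phi\nabla^{-1}(\phi\nabla^{-1}(\phi^2)))\nabla_{x,t}\phi]\|_{L_t^1L_x^2}
&\le \sum_{k_1>10,\ k_1=k_2+O(1)} \|P_0[\nabla^{-1}P_{k_1}(\phi\nabla^{-1}(\phi\nabla^{-1}(\phi^2)))\nabla_{x,t}P_{k_2}\phi]\|_{L_t^1L_x^2}\\
&\quad + \sum_{k_1\in[-10,10],\ k_2<15} \|P_0[\nabla^{-1}P_{k_1}(\phi\nabla^{-1}(\phi\nabla^{-1}(\phi^2)))\nabla_{x,t}P_{k_2}\phi]\|_{L_t^1L_x^2}\\
&\quad + \sum_{k_1<-10,\ k_2\in[-5,5]} \|P_0[\nabla^{-1}P_{k_1}(\phi\nabla^{-1}(\phi\nabla^{-1}(\phi^2)))\nabla_{x,t}P_{k_2}\phi]\|_{L_t^1L_x^2}. \tag{49}
\end{align*}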

Using the preceding calculations, we compute ||P0 [∇ −1 Pk1 (φ∇ −1 (φ∇ −1 (φ 2 )))∇x,t Pk2 φ]||L1 L2 t

k1 >10, k1 =k2 +O(1)

≤

x

C2−k1

k1 >10, k1 =k2 +O(1) ai , i=1,...4

||Pk1 (Pa1 φ∇ −1 Pa2 (Pa3 φ∇ −1 Pa4 (φ 2 )))||L1 L3+ ||∇x,t Pk2 φ||L∞ 2 t Lx t x

CM 5 cr2

≤

k1 >10, k1 =k2 +O(1) ai

≤ CM

5

2 ck2

r 1

2µ()(min{k1 ,a1 ...a4 }−max{k1 ,a1 ,...a4 }) 2−( 2 −)k1 ca1 ca3 ck1

c0 .

k

The remaining terms in (49) are estimated similarly, and left to the reader.

19 Use the fact that $\|P_0\phi\|_{L_t^4L_x^{4+}} \le C\|P_0\phi\|_{S[0]}$.


The first quintilinear expression in Observation 2. First assume that there is a high-high interaction within the outermost bracket (, ), i.e. consider the contribution

\[
\sum_{k_1\gg k} P_0[\nabla^{-1}P_k(\nabla^{-1}(\nabla^{-1}(\phi^2)\phi)\nabla_{x,t}P_{k_1}\phi)\phi].
\]

This term is morally equivalent to

\[
\sum_{k_1\gg k} P_0[\nabla^{-1}P_k(\nabla^{-1}(\phi^2)\phi P_{k_1}\phi)\phi].
\]

It is easy to see upon using Theorem 3.4 as well as an additional frequency trichotomy that for fixed k,

||Pk (∇

−1

2

(φ )φPk1 φ)||

k1 >>k

−1 L1t H˙ x 2

≤ CM

3

3 2

cr2

.

r

This implies that min{||Pk ∇ −1 (∇ −1 (φ 2 )φPk1 φ)||L1 L2 , ||Pk ∇ −1 (∇ −1 (φ 2 )φPk1 φ)||L1 L∞ } t

k1 >>k

≤ CM 4

2 cr2

t

x

x

|k|

2− 2 .

r

From this the desired estimate follows easily. Next, assume that there is no high-high interaction in the outermost (, ), i.e. k ≥ k1 + O(1). This contribution is seen to be morally equivalent to P0 [Pk (∇ −1 (∇ −1 (φ 2 )φ)Pk1 φ)φ]. k1 ≤k+O(1)

Now use reasoning similar to the previous quintilinear estimate to obtain

2 −1 −1 2 4 2 ||Pk (∇ (∇ (φ )φ)Pk1 φ)||L1 L3+ ≤ CM cr 2δk . x t

r

k1
This in conjunction with another frequency trichotomy easily implies the desired inequality. The remaining error terms of degree five or higher are either similar or simpler and therefore left out.

Having estimated all expressions in Observations 1, 2, we can now close the bootstrapping argument. Fix $M \gg 1$, then choose $\epsilon \ll 1$ such that (38) as well as Proposition 3.1 (footnote 20) imply

\[
\|P_0\nabla_x u\|_{S[k]} + \|P_0\phi_\alpha^i\|_{S[0]} \le \frac{M}{2}\,c_k.
\]

20 Recall the definition of $\phi_\alpha^i$ via $u$.


5. Proof of the Gauge Change Estimate We commence with the following simple lemma: Lemma 5.1. Let j ≥ k + O(1). Then provided f (x) : R → C as well as φi , i = 1, 2, 3 are as in the statement of Proposition 3.1, we have   3j k ||Pk Qj f  −1 ∂j φj  ||L2 L2 ≤ C2− 2 − 2 . t

x

j

Proof. Note that the operator Pk Qj −1 with symbol L2t L2x

C2−2j .

mk (|ξ |)mj (||ξ |−|τ ||) |τ |2 −|ξ |2

is bounded on

with norm ≤ Thus it suffices to show that 

 ||Pk Qj  −1 ∂ j ∂ν φj −1 ∂k ∂ ν φk f −1 ∂l φl  ||L2 L2 ≤ C, t

j,k

x

l



||Pk Qj 

−1 ∂j φj f

j

 −1 ∂l φl  ||

L2t L2x

≤ C2

j −k 2

.

l

The first inequality is immediate from Theorem 3.2. The 2nd is proved by invoking a frequency as well as modulation trichotomy. In particular, one uses the fact that provided l >> max{k1 , k2 , k3 }, we have Pk1 Q 0: ||Pk Qj φ||L2 Lpx ≤ C 2

−k (1− p2 ) min{ j2+ ,0}

t

||Pk Qj φ||L2 L2 . t

x

For a proof of this see [24]. Proceeding with the proof of the proposition, we use the frequency trichotomy

\begin{align*}
P_0[\phi f(\nabla^{-1}\phi)] &= \sum_{k_1>10,\ k_1=k_2+O(1)} P_0[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)]\\
&\quad + \sum_{k_1\in[-10,10],\ k_2<15} P_0[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)]\\
&\quad + \sum_{k_1<-10,\ k_2\in[-5,5]} P_0[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)],
\end{align*}

where we have used a schematic presentation for the exact expression in the statement of Proposition 3.1. We shall only deal with the first and second summand on the right-hand side, the third being much simpler.
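The case distinction behind such a trichotomy can be checked with a small sketch. It is an illustration under simplifying assumptions, not the paper's construction: the Littlewood-Paley pieces are idealized as sharp annuli $2^{k-1} \le |\xi| \le 2^{k+1}$, the output block $P_0$ as $1/2 \le |\xi| \le 2$, and the function name is hypothetical. The precise thresholds 10, 15 and 5 used in the text depend on the actual cutoffs and are not reproduced here.

```python
def product_reaches_unit_frequency(k1: int, k2: int) -> bool:
    """Can xi1 + xi2 have size in [1/2, 2] when |xi_i| lies in [2^(k_i-1), 2^(k_i+1)]?

    Idealized model (an assumption): sharp annuli in dimension >= 2, where
    |xi1 + xi2| sweeps out the whole interval given by the triangle inequality.
    """
    lo1, hi1 = 2.0 ** (k1 - 1), 2.0 ** (k1 + 1)
    lo2, hi2 = 2.0 ** (k2 - 1), 2.0 ** (k2 + 1)
    smallest = max(0.0, lo1 - hi2, lo2 - hi1)  # reverse triangle inequality
    largest = hi1 + hi2                          # triangle inequality
    return smallest <= 2.0 and largest >= 0.5
```

In this model two high frequencies contribute only when they are comparable (high-high), a unit-size frequency can be paired with any lower one, and two low frequencies never reach the unit block, which matches the three sums above.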


5.1. High-High interactions: The first term. Output restricted to small modulation:

\[
\sum_{k_1>10,\ k_1=k_2+O(1)} P_0Q_{<10}[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)].
\]

Freeze the output to modulation $2^j$, $j < 10$. Also, freeze $k_{1,2}$ for the time being. We replace $P_0Q_j[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)]$ by

\[
\sum_{l=1}^{3}\int_{\mathbb{R}^3} a_l(y)\,P_0Q_j[P_{k_1}\phi(x)\,P_{k_2}(R_l\phi f(\nabla^{-1}\phi))(x-y)]\,dy,
\]

where al (y) is the convolution kernel of the operator −1 ∂l P˜k2 21 Then we observe that ≤ C 2k2 2−δ()j ck2 ||Pk2 (Rl φP≥j −20 f (∇ −1 φ))||L4 L4− x t

for suitable (small) , δ(). Also, using the preceding lemma as well as Bernstein’s inequality, we have ≤ C 2k2 2−δ()j ck2 . ||Pk2 (Rl φP<j −20 Q≥j −20 f (∇ −1 φ))||L4 L4− x t

The preceding pair of inequalities implies that j 2

2 ||

3 l=1

R3

al (y)P0 Qj [Pk1 φ(x)Pk2 (Rl φ(f (∇ −1 φ) 1

−P<j −20 Q<j −20 f (∇ −1 φ)))(x − y)]dy||L2 L2 ≤ C 2( 2 −δ())j 2(−1)k2 ck2 . t

x

Provided we choose > 0 small enough, we can sum this over j < 10, and also obtain the required exponential decay in k2 . This in particular implies that we control 1 1 , ,1 the X˙ 2 2 -norm of this contribution, which is all we need, on account of the inequality 0

||Pk Q
1 , 1 ,1 2

X˙ 02

.

For the remaining term, we introduce the notation (Ty f )(x) := f (x − y) and observe that 3 l=1

=

R3

al (y)P0 Qj [Pk1 φ(x)Pk2 (Rl φP<j −20 Q<j −20 f (∇ −1 φ))(x − y)]dy

3 l=1

21

y∈R3

z∈R3

al (y)b(z)P0 Qj Qj +O(1) (Pk1 φ(x)Pk2 +O(1) Rl Ty+z φ(x))

P<j −20 Q<j −20 Ty+z f (∇ −1 φ)(x)) dydz,

Recall that P˜k2 is like Pk2 but with Pk2 P˜k2 = Pk2 .


where b(z) is the kernel representing the disposable operator Pk2 . Then we use Theorem 3.4, as well as the translation invariance of the S[k]: j

2 2 ||

y∈R3

z∈R3

al (y)b(z)P0 Qj [Qj +O(1) (Pk1 φ(x)Pk2 +O(1) Rl Ty+z φ(x))

P<j −20 Q<j −20 Ty+z f (∇ −1 φ)(x))]dydz||L2 L2 t

≤ C2

x j

−k1

sup ||Qj +O(1) [Pk1 φPk2 +O(1) Rl Ty+z φ]|| ˙ 1 , 1 ,∞ ≤ C2−k1 2 2+ ck1 ck2 . X2

y, z∈R3

2

This can be summed over $j < O(1)$ and furnishes the required exponential gain in $-k_1$.

We now turn to the case when the output is at very large modulation $2^j$, $j \ge 10$. We decompose into the case $j + 10 \ge k_1$ and its opposite. Also, we shall only consider the $\dot X_0^{\frac12,\frac12,\infty}$-component of $S[0]$, since the proposition in the case of the energy component is standard.

5.1.1. $j + 10 \ge k_1$. We apply another trichotomy with respect to modulation:

\begin{align*}
P_0Q_j(P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)) &= P_0Q_j(P_{k_1}Q_{<j-10}\phi\,P_{k_2}Q_{<j-10}f(\nabla^{-1}\phi))\\
&\quad + P_0Q_j(P_{k_1}Q_{\ge j-10}\phi\,P_{k_2}f(\nabla^{-1}\phi))\\
&\quad + P_0Q_j(P_{k_1}Q_{<j-10}\phi\,P_{k_2}Q_{\ge j-10}f(\nabla^{-1}\phi)).
\end{align*}

We observe that the second and third summand on the right-hand side are rather easy to treat on account of Lemma 5.1. For the first, note that both inputs may be assumed to be microlocalized on the same half space $\tau \gtrless 0$, and $k_1 = j + O(1)$. We need to estimate

\[
2^{\frac{3j}{2}}\|P_0Q_j(P_{k_1}Q_{<j-10}\phi\,P_{k_2}Q_{<j-10}f(\nabla^{-1}\phi))\|_{L_t^2L_x^2}
\sim 2^{\frac{3j}{2}-k_1}\|P_0Q_j(P_{k_1}Q_{<j-10}\phi\,P_{k_2}Q_{<j-10}(\phi f(\nabla^{-1}\phi)))\|_{L_t^2L_x^2}.
\]

We may assume f (∇ −1 φ) to be at frequency < 2j −10 , since otherwise, we can use ||Pk2 Q<j −10 (φP≥j −10 f (∇ −1 φ))||

4+

L4t Lx3

≤ C2−(1−)k1 ck21 .

We can also assume f (∇ −1 φ) to be at modulation < 2j −10 , on account of Lemma 5.1; of course this immediately restricts φ to modulation < 2j +O(1) . Next, assume f (∇ −1 φ) to be at frequency 2l , 0 ≤ l < j − 10. Then we have P0 Qj (Pk1 Q<j −10 φPk2 Q<j −10 ∇ −1 (Q<j +O(1) φPl Q<j −10 f (∇ −1 φ)) P0 Qj (Pk1 ,κ1 Q<j −10 φ = κ1,2 ∈Kl−k1 , dist(κ1 ,−κ2 )≤2l−k1 +O(1)

Pk2 Q<j −10 ∇ −1 (Pk2 +O(1),κ2 Q<j +O(1) φPl Q<j −C f (∇ −1 φ)).


We discard the disposable operator Pk2 Q<j −10 ∇ −1 of L1 -norm < 2−k1 +O(1) , and obtain: 3j

2 2 ||P0 Qj (Pk1 Q<j −10 φPk2 Q<j −10 ∇ −1 (φPl f (∇ −1 φ)))||L2 L2 t x j l−k1 2 ≤ C2 2 κ1,2 ∈Kl−k1 , dist(κ1 ,−κ2 )≤2l−k1 +O(1)

||Pk1 ,κ1 Q<j −10 φ||S[k1 ,κ1 ] ||Pk2 +O(1),κ2 Q<j +O(1) φ||S[k2 ,κ2 ] ||Pl f (∇ −1 φ)||L∞ 3. t Lx −l Using the inequality ||Pl f (∇ −1 φ)||L∞ 3 ≤ C2 , as well as Cauchy-Schwarz and the t Lx following inequality22 :

 

1 2

||Pk1 ,κ Q<j −10 φ||2S[k1 ,κ] 

≤ C|k1 |||Pk1 φ||S[k1 ],

κ∈Kl−k1

we obtain the estimate 3j

2 2 ||P0 Qj (Pk1 Q<j −10 φPk2 Q<j −10 ∇ −1 (φPl f (∇ −1 φ)))||L2 L2 t

k1 2

≤ C|k1 |2 2

−k1

||Pk1 φ||S[k1 ] ||Pk2 +O(1) φ||S[k2 ] ≤

x

k

1 C2− 2+ ck21 .

This can be summed over $k_1 + O(1) > l \ge 0$ and is acceptable. The case when $f(\nabla^{-1}\phi)$ is at frequency $< 0$ is almost identical, by placing $f(\nabla^{-1}\phi)$ into $L_t^\infty L_x^\infty$.

5.1.2. We are left to estimate

\begin{align*}
\sum_{k_1,k_2>j+10} P_0Q_j[P_{k_1}\phi\,P_{k_2}f(\nabla^{-1}\phi)]
&= \sum_{k_1,k_2>j+10} P_0Q_j(P_{k_1}Q_{<j-10}\phi\,P_{k_2}Q_{\ge j-10}f(\nabla^{-1}\phi))\\
&\quad + \sum_{k_1,k_2>j+10} P_0Q_j(P_{k_1}Q_{\ge j-10}\phi\,P_{k_2}Q_{<j-10}f(\nabla^{-1}\phi)).
\end{align*}

The second summand on the right-hand side is straightforward on account of the definition of $S[k]$. As to the first, we need a simple modification of Lemma 5.1, proved similarly:

\[
\|P_kQ_jf(\nabla^{-1}\phi)\|_{L_t^2L_x^2} \le C2^{-j-k},\qquad j < k + O(1).
\]

The desired inequality follows easily from this.

22 which follows easily from the definitions and Plancherel.


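High-Low interactions. We leave the estimate of the energy of the output to the reader. We commence by estimating the $\dot X_0^{\frac12,\frac12,\infty}$-norm of the output provided the modulation is low. We use the following mixed trichotomy: let $j < -10$, say,

\begin{align*}
P_0Q_j[P_{[-5,5]}\phi\,P_{<-10}f(\nabla^{-1}\phi)] &= P_0Q_j[P_{[-5,5]}\phi\,P_{-10>\cdot\ge j-10}f(\nabla^{-1}\phi)]\\
&\quad + P_0Q_j[P_{[-5,5]}Q_{\ge j-10}\phi\,P_{<j-10}Q_{<j-10}f(\nabla^{-1}\phi)]\\
&\quad + P_0Q_j[P_{[-5,5]}\phi\,P_{<j-10}Q_{\ge j-10}f(\nabla^{-1}\phi)].
\end{align*}

The second and third summand are easy on account of Lemma 5.1. For the first summand, we reformulate it as follows:

\[
P_0Q_j[P_{[-5,5]}\phi\,P_{-10>\cdot\ge j-10}f(\nabla^{-1}\phi)] = \sum_{-10>\tilde j\ge j-10} P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(\phi f(\nabla^{-1}\phi))].
\]

Freezing $\tilde j$ for the moment, we decompose further

\begin{align*}
P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}f(\nabla^{-1}\phi)]
&= P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(Q_{<j-10}\phi\,P_{<j-20}Q_{<j-20}f(\nabla^{-1}\phi))]\\
&\quad + P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(Q_{\ge j-10}\phi\,P_{<j-20}Q_{<j-20}f(\nabla^{-1}\phi))]\\
&\quad + P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(\phi\,P_{<j-20}Q_{\ge j-20}f(\nabla^{-1}\phi))]\\
&\quad + P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(\phi\,P_{j-20\le\cdot\le\tilde j+10}f(\nabla^{-1}\phi))]\\
&\quad + P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(\phi\,P_{>\tilde j+10}f(\nabla^{-1}\phi))]. \tag{50}
\end{align*}

For the first term on the right-hand side, we observe that

\begin{align*}
&P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(Q_{<j-10}\phi\,P_{<j-20}Q_{<j-20}f(\nabla^{-1}\phi))]\\
&\quad = \int_{\mathbb{R}^3} a_{\tilde j}(y)\,P_0Q_j[Q_{j+O(1)}(P_{[-5,5]}\phi\,P_{\tilde j+O(1)}Q_{<j-10}T_y\phi)\,P_{<j-20}Q_{<j-20}T_yf(\nabla^{-1}\phi)]\,dy,
\end{align*}

where $a_{\tilde j}$ is the kernel associated with the multiplier $\nabla^{-1}P_{\tilde j}$ of $L^1$-mass $\sim 2^{-\tilde j}$. Using Theorem 3.4 as well as translation invariance of the $S[k]$, we conclude that

\[
\|P_0Q_j[P_{[-5,5]}\phi\,P_{\tilde j}\nabla^{-1}(Q_{<j-10}\phi\,P_{<j-20}Q_{<j-20}f(\nabla^{-1}\phi))]\|_{\dot X_0^{\frac12,\frac12,\infty}} \le C2^{\frac{j-\tilde j}{2+}}c_0c_{\tilde j}.
\]

This can be summed over $O(1) > \tilde j > j$ to yield the desired inequality.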


For the second term on the right-hand side of (50), we use the improved Bernstein’s inequality: j

2 2 ||P0 Qj [P[−5,5] φPj˜ ∇ −1 (Q≥j −10 φP<j −20 Q<j −20 f (∇ −1 φ))]||L2 L2 t x ≤C ||[P[−5,5] φ||L∞ ||P Q φ|| 2 2 L L∞ j˜+O(1) l t Lx t

j˜>l>j −10

+

||[P[−5,5] φ||L∞ 2 ||P ˜ j +O(1) Ql φ||L2 L∞ t Lx t

l≥j˜

≤

x

l−j˜

C2 2+ 2

j −l 2

c0 cj˜ +

j˜>l>j −10

C2

j −l 2

x

j −j˜

c0 cj˜ ≤ C2 2+ c0 cj˜ .

l≥j˜

This can again be summed over j˜. For the third summand of (50), we invoke Lemma 5.1, of course, as well as Bernstein’s inequality. One computes j

2 2 ||P0 Qj [P[−5,5] φPj˜ ∇ −1 (φP<j −20 Q≥j −20 f (∇ −1 φ))]||L2 L2 t

≤ C2 ≤ C2

j ˜ 2 −j

−1 ||P[−5,5] φ||L∞ φ)||L4 L∞ 2 ||P ˜ φ||L4 L∞ ||P<j −20 Q≥j −20 f (∇ j t Lx t

j −j˜ 4

x

t

x

x

c0 cj˜ .

This can again be summed over j˜ > j − 10. The fourth term is similar to the third and therefore left out (one can place f (∇ −1 φ) −1 2 ∞ into L4t L∞ x ). Finally, for the fifth term, one places φP>j˜+10 f (∇ φ) into Lt Lx , using j˜

||Pj˜ [φP>j˜+10 f (∇ −1 φ)]||L2 L∞ ≤ C2 2 cj˜ . t

x

1 1

, ,∞ The simple details are left out. This finishes the treatment of the X˙ 02 2 component of ||.||S[0] , provided the output is at small modulation. The case when the modulation is large is dealt with similarly to the analogous situation in the high-high case. Now we estimate the “null-frame component” of ||.||S[0] , i.e.



sup sup  ± l<−10

1 2

−1 2  ||P0,±κ Q± <2l (φf (∇ φ))||S[0,κ]

.

κ∈Kl

Fix l < −10. We decompose ± ± −1 −1 P0 Q± <2l [P[−5,5] φf (∇ φ)] = P0 Q<2l [P[−5,5] Q<2l φP
We treat each term on the right-hand side:


First term. Use the disposability of P0,κ Q± <2l , see [24] ± −1 2 (||P0,κ Q± <2l [P[−5,5] Q<2l φP
=

κ∈Kl κ ∈Kl−5 , κ ⊂κ

≤ C(

κ ∈Kl−5

± −1 2 ||P0,κ Q± <2l [P[−5,5],κ Q<2l φP
2 2 ||P[−5,5],κ Q± <2l φ|| ) ≤ Cc0 . 1

Second term. This follows easily from the inequality ||Pk Q
1 , 1 ,1 2

X˙ k2

(51)

as well as the definition of S[k]. Third term. We reformulate it as follows: −1 P0 Q± <2l [P[−5,5] φP≥l−10 f (∇ φ)] ak (y)P0 Qr [P[−5,5] φ(x)Pk (φf (∇ −1 φ))(x − y)dy, = 3 r<2l k≥l−10 R

where ak (y) is the kernel representing the operator Pk ∇ −1 . Next, one decomposes Pk (φ(x − y)f (∇ −1 φ))(x − y) = Pk (φP≥r−10 f (∇ −1 φ))(x − y) + Pk (Pk+O(1) φP
=

3 r<2l k≥l−10 R

z∈R3

P
where $b_k(z)$ is the kernel representing the operator $P_k$. This is easily estimated by means of Theorem 3.4 as well as the inequality (51). The first and third summand yield contributions estimated by placing P 4, as in earlier instances. This is left to the reader, and concludes the proof of Proposition 3.1.

Acknowledgements. The author would like to thank his Ph.D. advisor Sergiu Klainerman as well as Igor Rodnianski and Terence Tao for helpful suggestions and comments as well as reading the manuscript. Special thanks are also due to the referees for pointing out an error and suggesting many improvements. The research for this paper was conducted in the fall of 2001.


References

1. D'Ancona, P., Georgiev, V.: On the continuity of the solution operator of the wave maps system. Preprint
2. Bizon, P.: Commun. Math. Phys. 45, 215 (2000)
3. Christodoulou, D., Tahvildar-Zadeh, A.: On the regularity of spherically symmetric wave maps. C.P.A.M. 46, 1041–1091 (1993)
4. Hélein, F.: Régularité des applications faiblement harmoniques entre une surface et une variété riemannienne. C.R. Acad. Sci. Paris Ser. 1, Math 312, 591–596 (1991)
5. Klainerman, S.: UCLA lectures on nonlin. wave eqns. Preprint, 2001
6. Klainerman, S., Machedon, M.: Smoothing estimates for null forms and applications. Duke Math. J. 81, 99–133 (1995)
7. Klainerman, S., Machedon, M.: On the algebraic properties of the $H^{n/2,1/2}$ spaces. I.M.R.N. 15, 765–774 (1998)
8. Klainerman, S., Machedon, M.: On the regularity properties of a model problem related to wave maps. Duke Math. J. 87, 553–589 (1997)
9. Klainerman, S., Rodnianski, I.: On the global regularity of wave maps in the critical Sobolev norm. I.M.R.N. 13, 655–677 (2001)
10. Klainerman, S., Selberg, S.: Remark on the optimal regularity for equations of wave maps type. C.P.D.E. 22, 901–918 (1997)
11. Klainerman, S., Selberg, S.: Bilinear estimates and applications to nonlinear wave equations. Preprint
12. Klainerman, S., Tataru, D.: On the optimal regularity for the Yang-Mills equations in $\mathbb{R}^{4+1}$. J. Am. Math. Soc. 12, 93–116 (1999)
13. Krieger, J.: Null-Form estimates and nonlinear waves. To appear
14. Nahmod, A., Stefanov, A., Uhlenbeck, K.: On the well-posedness of the wave maps problem in high dimensions. Preprint, (2001)
15. Selberg, S.: Multilinear space-time estimates and applications to local existence theory for nonlinear wave equations. Ph.D. thesis, Princeton University, 1999
16. Shatah, J., Tahvildar-Zadeh, A.: On the Cauchy Problem for Equivariant Wave Maps. Commun. Pure Appl. Math. 47, 719–754 (1994)
17. Struwe, M., Shatah, J.: The Cauchy problem for wave maps. I.M.R.N. 11, 555–571 (2002)
18. Struwe, M., Shatah, J.: Geometric Wave Equations. AMS Courant Lecture Notes 2
19. Struwe, M.: Equivariant Wave Maps in 2 space dimensions. Preprint
20. Struwe, M.: Radially Symmetric Wave Maps from 1+2 dimensional Minkowski space to the sphere. Math. Z. 242, (2002)
21. Shatah, J., Tahvildar-Zadeh, A.: On the Cauchy problem for equivariant Wave-Maps. Commun. Pure Appl. Math. 45, 719–754 (1994)
22. Tao, T.: Ill-posedness for one-dimensional Wave Maps at the critical regularity. Am. J. of Math. 122(3), 451–463 (2000)
23. Tao, T.: Global regularity of wave maps I. I.M.R.N. 6, 299–328 (2001)
24. Tao, T.: Global regularity of wave maps II. Commun. Math. Phys. 224, 443–544 (2001)
25. Tao, T.: Counterexamples to the n=3 endpoint Strichartz estimate for the wave equation. Preprint
26. Tataru, D.: Local and global results for wave maps I. Commun. PDE 23, 1781–1793 (1998)
27. Tataru, D.: On global existence and scattering for the wave maps equation. Am. J. Math. 123(1), 37–77 (2001)

Communicated by P. Constantin

Commun. Math. Phys. 238, 367–378 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0861-1

Communications in

Mathematical Physics

Global Weak Solutions of the Relativistic Vlasov-Klein-Gordon System

Michael Kunzinger1, Gerhard Rein1, Roland Steinbauer1, Gerald Teschl1,2

1 Institut für Mathematik, Strudlhofgasse 4, 1090 Wien, Austria. E-mail: [email protected]; [email protected]; [email protected]; [email protected]
2 International Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, 1090 Wien, Austria

Received: 16 October 2002 / Accepted: 5 February 2003
Published online: 19 May 2003 – © Springer-Verlag 2003

Abstract: We consider an ensemble of classical particles coupled to a Klein-Gordon field. For the resulting nonlinear system of partial differential equations, which we call the relativistic Vlasov-Klein-Gordon system, we prove the existence of global weak solutions for initial data satisfying a size restriction. The latter becomes necessary since the energy of the system is indefinite, and only for restricted data a-priori bounds on the solutions can be derived from conservation of energy.

1. Introduction

When considering the interaction of classical particles with classical or quantum fields, various different situations arise. On the one hand, one can consider the coupling of a single classical particle to a field. How this should properly be done for the case of a Maxwell field is a classical problem, cf. [1], and the effective dynamics and asymptotics of such systems is an active field of research, cf. [7, 9–11] and the references there.

On the other hand, in kinetic theory ensembles of classical particles are considered which interact by fields which they create collectively. There is an extensive literature on such systems, with particles interacting by non-relativistic, gravitational or electrostatic fields (the Vlasov-Poisson system), by electrodynamic fields (the Vlasov-Maxwell system), or by general relativistic gravity (the Vlasov-Einstein system). In all these systems the only interaction of the particles is via the fields which they create collectively, a situation which is sometimes referred to as the mean field limit of a many-particle system.

In the present paper we consider an ensemble of particles which can move at relativistic speeds and interact by a quantum mechanical Klein-Gordon field. Let $f = f(t,x,v) \ge 0$ denote the density of the particles in phase space, $\rho = \rho(t,x)$ their density in space, and $u = u(t,x)$ a scalar Klein-Gordon field; $t \in \mathbb{R}$, $x \in \mathbb{R}^3$, and

Partially supported by the Austrian Science Fund’s Wittgenstein 2000 Award of P. A. Markowich.


$v \in \mathbb{R}^3$ denote time, position, and momentum respectively. The system then reads as follows:

\[ \partial_t f + \hat v\cdot\partial_x f - \partial_x u\cdot\partial_v f = 0, \tag{1.1} \]

\[ \partial_t^2 u - \Delta u + u = -\rho, \tag{1.2} \]

\[ \rho(t,x) = \int f(t,x,v)\,dv. \tag{1.3} \]

Here we have set all physical constants as well as the rest mass of the particles to unity, and

\[ \hat v = \frac{v}{\sqrt{1+|v|^2}} \tag{1.4} \]

denotes the relativistic velocity of a particle with momentum $v$. We call this system the relativistic Vlasov-Klein-Gordon system. To our knowledge it has not yet been considered in the literature.

Our motivation for initiating a study of this system is the following. In [8] a single classical particle coupled to a Klein-Gordon field is considered, and the system (1.1), (1.2), (1.3) is meant as a natural generalization of this to the many-particle situation. On the other hand, the system falls within the general class of nonlinear PDE systems like the Vlasov-Poisson or Vlasov-Maxwell system, and by studying it one may hope to learn more about the general properties of this important class of problems from mathematical physics. A major issue in this area, which we also focus on in the present paper, is the question of global existence of solutions to the corresponding initial value problem.

The coupling in the system above is set up in such a way that the system is conservative;

\[ \iint \sqrt{1+|v|^2}\,f\,dx\,dv + \frac12\int\bigl(|\partial_t u|^2 + |\partial_x u|^2 + |u|^2\bigr)\,dx + \int\rho u\,dx =: E_K + E_F + E_C \tag{1.5} \]

is conserved along sufficiently regular solutions. From experience with the related systems from kinetic theory mentioned above one knows that for global existence questions it is essential to derive a-priori bounds on the solutions from conservation of energy. In this context the relativistic Vlasov-Klein-Gordon system poses the following specific difficulty: The interaction term $E_C$ in the energy need not be positive. Therefore, one has to try to estimate this term in terms of the positive quantities $E_K$ and $E_F$. By Hölder's inequality, interpolation, and Sobolev's inequality,

\[ \int\rho u\,dx \le \|\rho(t)\|_{6/5}\,\|u(t)\|_6 \le C\,E_K(t)^{1/2}\,\|\partial_x u(t)\|_2; \]

the details of a more general version of this estimate can be found below, cf. Lemma 4.1. Here $\|\cdot\|_p$ denotes the $L^p$ norm.
The problem is that, after applying the Cauchy inequality, the right hand side in this estimate is of the same order of magnitude as the positive terms in the energy, and no a-priori bound on the solution seems to follow from conservation of energy. This situation is similar to the gravitational case of the relativistic


Vlasov-Poisson system where it is known that spherically symmetric solutions with negative energy blow up in finite time [3]. Our way out of this difficulty is to observe that in the estimate above the constant $C$ depends on $\|\mathring f\|_1$ and $\|\mathring f\|_\infty$, $\mathring f$ being the initial datum for $f$, and if we require that it is smaller than an appropriate threshold then a-priori bounds on the solutions can be derived from conservation of energy. These bounds are such that one can pass to the limit along a sequence of global solutions to an appropriately regularized system, and this limit is a weak solution to the relativistic Vlasov-Klein-Gordon system. The details of these arguments are then similar to the corresponding ones for the Vlasov-Maxwell system [2], cf. also [12].

The paper proceeds as follows: In the next section we collect for easier reference some results on the linear, inhomogeneous Klein-Gordon equation. In Sect. 3 we prove a global existence and uniqueness result for smooth solutions to an appropriately regularized version of the Vlasov-Klein-Gordon system. We then proceed to derive uniform a-priori bounds on a sequence of such regularized solutions from conservation of energy for restricted data, and prove the existence of a global, weak solution to the original system. This is done in Sect. 4, where we also discuss some of the properties of these weak solutions.

We conclude this introduction with some remarks on related results in the literature; we restrict ourselves to results on the initial value problem in the three dimensional case. For the Vlasov-Poisson system global classical solutions for general initial data have been established in [16, 14, 19]. For the relativistic Vlasov-Maxwell system such global classical solutions are so far only known for small, nearly neutral, or nearly spherically symmetric data [5, 4, 17]. For general data the existence of global weak solutions was obtained in [2].
When passing from the Vlasov-Poisson to the Vlasov-Maxwell system a lot of the difficulties with classical solutions of course arise from the different field equation, which becomes hyperbolic instead of elliptic. However, it must be emphasized that already for the relativistic Vlasov-Poisson system, where the only difference to the Vlasov-Poisson system is that a $v$ in the Vlasov equation is replaced by a $\hat v$, no global existence result is known for classical solutions with general initial data, and for the gravitational case where the energy is indefinite classical solutions can blow up in finite time as mentioned above, cf. [3]. An investigation of the fully relativistic gravitational situation, i.e., of the Vlasov-Einstein system, was initiated in [18].

2. The Linear Klein-Gordon Equation

Although our notation is mostly standard or self-explaining we explicitly mention the following conventions: For a function $h = h(t,x,v)$ or $h = h(t,x)$ we denote for given $t$ by $h(t)$ the corresponding function of the remaining variables. By $\|\cdot\|_p$ we denote the usual $L^p$-norm for $p \in [1,\infty]$. The index $c$ in function spaces refers to compactly supported functions. Throughout the paper the convolution denoted by $*$ refers to the spatial variables.

For easier reference we now collect some results on the linear, inhomogeneous Klein-Gordon equation

\[ \partial_t^2 u - \Delta u + u = g \tag{2.1} \]

with a prescribed right-hand side $g = g(t,x)$. For $t > 0$ and $x \in \mathbb{R}^3$ let

\[ R(t,x) := \frac{1}{4\pi t}\,\delta(|x|-t) - \frac{1}{4\pi}\,H(t-|x|)\,\frac{J_1\bigl(\sqrt{t^2-|x|^2}\bigr)}{\sqrt{t^2-|x|^2}}, \]


where $\delta$ is the $\delta$-distribution, $H$ is the Heaviside function and $J_1$ is the Bessel function of the first kind. Given initial data

\[ u(0) = \mathring u_1,\qquad \partial_t u(0) = \mathring u_2 \tag{2.2} \]

the solution of the initial value problem (2.1), (2.2) can be written as

\[ u(t,x) = u_{\mathrm{hom}}(t,x) + u_{\mathrm{inh}}(t,x),\qquad t \ge 0,\ x \in \mathbb{R}^3. \]

Here

\begin{align*}
u_{\mathrm{hom}}(t,x) :={}& (\partial_t R * \mathring u_1)(t,x) + (R * \mathring u_2)(t,x)\\
={}& \frac{1}{4\pi t^2}\int_{|y|=t}\mathring u_1(x-y)\,dS_y - \frac{1}{4\pi t^2}\int_{|y|=t}\nabla\mathring u_1(x-y)\cdot y\,dS_y\\
& - \frac{1}{8\pi}\int_{|y|=t}\mathring u_1(x-y)\,dS_y - \frac{t}{4\pi}\int_{|y|\le t}\frac{1}{\xi}\Bigl(\frac{J_1(\xi)}{\xi}\Bigr)'\,\mathring u_1(x-y)\,dy\\
& + \frac{1}{4\pi t}\int_{|y|=t}\mathring u_2(x-y)\,dS_y - \frac{1}{4\pi}\int_{|y|\le t}\frac{J_1(\xi)}{\xi}\,\mathring u_2(x-y)\,dy
\end{align*}

with the abbreviation $\xi := \sqrt{t^2-|y|^2}$ is the solution of the homogeneous Klein-Gordon equation (2.1) with initial data (2.2), and

\begin{align*}
u_{\mathrm{inh}}(t,x) :={}& \int_0^t\!\!\int R(t-s,x-y)\,g(s,y)\,dy\,ds\\
={}& \frac{1}{4\pi}\int_0^t\!\!\int_{|x-y|=t-s} g(s,y)\,dS_y\,\frac{ds}{t-s} - \frac{1}{4\pi}\int_0^t\!\!\int_{|x-y|\le t-s} g(s,y)\,\frac{J_1(\xi)}{\xi}\,dy\,ds
\end{align*}

with the abbreviation $\xi := \sqrt{(t-s)^2-|x-y|^2}$ is the solution of the inhomogeneous Klein-Gordon equation with vanishing initial data. These formulas can be found in [15] and [20]. They can be established by observing that the substitution $w(t,x,\zeta) := u(t,x)\exp(-i\zeta)$ transforms (2.1) into a wave equation for $w$ which can be solved in the usual way.

If $\alpha \in \mathbb{N}_0^3$ denotes an arbitrary multi-index and $\partial^\alpha$ the corresponding spatial derivative, the above formulas imply the following estimate for the solution of (2.1), (2.2), which is not very sophisticated but good enough for our purpose:

\[ \|\partial^\alpha u(t)\|_\infty \le C(1+t)^4\Bigl(\|\partial^\alpha\mathring u_1\|_\infty + \|\nabla\partial^\alpha\mathring u_1\|_\infty + \|\partial^\alpha\mathring u_2\|_\infty + \|\partial^\alpha g(t)\|_\infty\Bigr). \tag{2.3} \]
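The substitution mentioned above can be spelled out in one line. The following is a sketch of what is presumably the intended computation; the symbol $\Delta_x$ for the Laplacian in $x$ alone and the reading of $\zeta$ as an additional real variable are notational assumptions made here.

```latex
\begin{align*}
w(t,x,\zeta) &:= u(t,x)\,e^{-i\zeta}, \qquad \partial_\zeta^2 w = -w,\\
\partial_t^2 w - \Delta_x w - \partial_\zeta^2 w
  &= \bigl(\partial_t^2 u - \Delta_x u + u\bigr)\,e^{-i\zeta}
   = g(t,x)\,e^{-i\zeta}.
\end{align*}
```

Under this reading the zero-order term $u$ is absorbed into the extra second derivative, so $w$ solves an inhomogeneous wave equation in the variables $(t,x,\zeta)$.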


3. Global Classical Solutions of the Regularized System

The aim of the present section is to establish the following global existence and uniqueness result for a suitably regularized Vlasov-Klein-Gordon system:

Theorem 3.1. Let $\delta \in C_c^\infty(\mathbb{R}^3)$. Consider the regularized, relativistic Vlasov-Klein-Gordon system where the right-hand side $-\rho$ in (1.2) is replaced by $-\rho * \delta$. For initial data $0 \le \mathring f \in C_c^1(\mathbb{R}^6)$, $\mathring u_1 \in C_b^3(\mathbb{R}^3)$ and $\mathring u_2 \in C_b^2(\mathbb{R}^3)$ there exists a unique solution $(f,u)$ to the regularized system with $f \in C^1([0,\infty[\times\mathbb{R}^6)$ and $u \in C^2([0,\infty[\times\mathbb{R}^3)$, satisfying the initial conditions $f(0) = \mathring f$, $u(0) = \mathring u_1$, $\partial_t u(0) = \mathring u_2$.

Proof. We begin by recursively defining iterates $f_n : [0,\infty[\times\mathbb{R}^6 \to [0,\infty[$, $u_n : [0,\infty[\times\mathbb{R}^3 \to \mathbb{R}$: Let $f_0(t,z) := \mathring f(z)$, $z = (x,v)$, and suppose $f_n$ has already been defined. Let $\rho_n(t,x) := \int f_n(t,x,v)\,dv$ and let $u_n$ denote the solution to the Klein-Gordon equation (2.1) with right-hand side $-\rho_n * \delta$ and initial data $\mathring u_1$, $\mathring u_2$, cf. the previous section. Denote by $Z_n(s,t,z) = (X_n,V_n)(s,t,x,v)$ the solution of the characteristic system

\[ \dot x = \hat v,\qquad \dot v = -\partial_x u_n(s,x) \]

with initial datum $Z_n(t,t,z) = z$. The next iterate is then defined as $f_{n+1}(t,z) := \mathring f(Z_n(0,t,z))$, i.e., as the solution of the Vlasov equation (1.1) with $u$ replaced by $u_n$ and initial datum $\mathring f$.

The flow of the characteristic system is measure preserving, hence $\|f_n(t)\|_1 = \|\rho_n(t)\|_1 = \|\mathring f\|_1$. This implies that for any $\alpha \in \mathbb{N}_0^3$, $\|\partial^\alpha(\rho_n(t) * \delta)\|_\infty$ is bounded, uniformly in $t \ge 0$ and $n \in \mathbb{N}$. Let $T > 0$ be arbitrary. In what follows constants denoted by $C$ may change from line to line and depend on the initial data, on the regularization kernel $\delta$, and on $T$, but never on $n \in \mathbb{N}$. By (2.3), $\|\partial^\alpha u_n(t)\|_\infty \le C$ for $|\alpha| \le 2$ and $t \in [0,T]$. Hence there exist $R > 0$ and $P > 0$ such that $f_n(t,x,v) = 0$ if $|v| > P$ or $|x| > R$ and $t \in [0,T]$; in particular, $f_n(t) \in C_c^1(\mathbb{R}^6)$.

Using these bounds it is now easy to show that $(f_n)_{n\in\mathbb{N}}$ is a uniform Cauchy sequence on $[0,T]\times\mathbb{R}^6$: By definition,

\[ \|f_{n+1}(t) - f_n(t)\|_\infty \le C\,\|Z_n(0,t,\cdot) - Z_{n-1}(0,t,\cdot)\|_\infty. \]

Using the characteristic system and the bound on $\partial_x^2 u_n(t)$ we obtain, abbreviating $Z_n(s,t,x,v)$ by $Z_n(s)$,

\begin{align*}
|X_n(s) - X_{n-1}(s)| &\le \int_s^t |V_n(\tau) - V_{n-1}(\tau)|\,d\tau,\\
|V_n(s) - V_{n-1}(s)| &\le C\int_s^t |X_n(\tau) - X_{n-1}(\tau)|\,d\tau + \int_s^t \|\partial_x u_n(\tau) - \partial_x u_{n-1}(\tau)\|_\infty\,d\tau,
\end{align*}

hence

\[ |Z_n(0) - Z_{n-1}(0)| \le C\int_0^t |Z_n(\tau) - Z_{n-1}(\tau)|\,d\tau + \int_0^t \|\partial_x u_n(\tau) - \partial_x u_{n-1}(\tau)\|_\infty\,d\tau. \]


By Gronwall's inequality,

\[ \|f_{n+1}(t) - f_n(t)\|_\infty \le C\int_0^t \|\partial_x u_n(\tau) - \partial_x u_{n-1}(\tau)\|_\infty\,d\tau. \]

On the other hand, the formulas for $u_n$ from the previous section, the fact that we have regularized the right-hand side of (1.2), and the above bounds on the support of $f_n$ imply that

\begin{align*}
\|\partial_x u_n(\tau) - \partial_x u_{n-1}(\tau)\|_\infty &\le C\,\|\rho_n(\tau) - \rho_{n-1}(\tau)\|_1\\
&\le C\iint |f_n(\tau,x,v) - f_{n-1}(\tau,x,v)|\,dv\,dx \le C\,\|f_n(\tau) - f_{n-1}(\tau)\|_\infty.
\end{align*}

Combining these estimates we obtain

\[ \|f_{n+1}(t) - f_n(t)\|_\infty \le C\int_0^t \|f_n(\tau) - f_{n-1}(\tau)\|_\infty\,d\tau. \]

Thus, by induction,

\[ \|f_{n+1}(t) - f_n(t)\|_\infty \le C\,\frac{C^n t^n}{n!}, \]

so $(f_n)_{n\in\mathbb{N}}$ is uniformly Cauchy on $[0,T]\times\mathbb{R}^6$. The same is true for $(\rho_n)_{n\in\mathbb{N}}$ and $(\partial^\alpha u_n)_{n\in\mathbb{N}}$ for $|\alpha| \le 2$. That the uniform limit $(f,u)$ of the iterative sequence has the required regularity and is the unique solution of our initial value problem on $[0,T]$ follows, and since $T > 0$ was arbitrary the proof is complete.

In the next section we want to obtain a global weak solution of the relativistic Vlasov-Klein-Gordon system as a limit of a sequence of solutions to systems regularized with $\delta_n$'s that converge to the $\delta$-distribution. To do so we will need energy bounds on these regularized solutions, but since we have modified the system, energy conservation takes a somewhat different form from what was stated in the introduction:

Lemma 3.2. Let $d \in C_c^\infty(\mathbb{R}^3)$ be even, $\delta = d * d$, and $\mathring u_1, \mathring u_2 \in C_c(\mathbb{R}^3)$. Let $0 \le \mathring f \in C_c^1(\mathbb{R}^6)$ and let $(f,u)$ be the unique solution to the regularized system according to Theorem 3.1 with initial conditions $f(0) = \mathring f$, $u(0) = \mathring u_1 * \delta$, $\partial_t u(0) = \mathring u_2 * \delta$. Let $\rho(t,x) = \int f(t,x,v)\,dv$ and denote by $\tilde u$ the unique solution to the initial value problem

\[ \partial_t^2\tilde u - \Delta\tilde u + \tilde u = -\rho * d,\qquad \tilde u(0) = \mathring u_1 * d,\quad \partial_t\tilde u(0) = \mathring u_2 * d. \]

Then

\[ \tilde E := \iint\sqrt{1+|v|^2}\,f\,dx\,dv + \frac12\int\bigl[|\partial_t\tilde u|^2 + |\partial_x\tilde u|^2 + |\tilde u|^2\bigr]\,dx + \int\rho u\,dx =: E_K + \tilde E_F + E_C \]

is constant in $t$.


Proof. Using the Vlasov equation and integration by parts we obtain

\[ \frac{d}{dt}E_K = -\int\partial_t\rho\,u\,dx. \]

Also,

\[ \frac{d}{dt}\tilde E_F = -\int\partial_t\tilde u\,\rho * d\,dx = -\int\partial_t(\tilde u * d)\,\rho\,dx; \]

for the last equality observe that $d$ is assumed to be even. The convolution $\tilde u * d$ satisfies the Klein-Gordon equation with right-hand side $-\rho * \delta$ and initial conditions $(\tilde u * d)(0) = \mathring u_1 * \delta$, $\partial_t(\tilde u * d)(0) = \mathring u_2 * \delta$. Hence by uniqueness, $u = \tilde u * d$. Summing up,

\[ \frac{d}{dt}E_C = \int\partial_t u\,\rho\,dx + \int u\,\partial_t\rho\,dx = -\frac{d}{dt}\tilde E_F - \frac{d}{dt}E_K. \]
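The step that uses the evenness of $d$ has a simple discrete analogue, sketched below. This is an illustration only: functions on $\mathbb{R}^3$ are replaced by lists on a periodic grid, and the helper names are hypothetical. It checks that moving an even kernel from one factor of a pairing to the other leaves the pairing unchanged, which is the discrete counterpart of $\int (a * d)\,b\,dx = \int a\,(b * d)\,dx$.

```python
def circular_convolve(a, d):
    """Periodic convolution (a * d)[i] = sum_j a[i - j] * d[j] on a grid of len(a) points."""
    n = len(a)
    return [sum(a[(i - j) % n] * d[j] for j in range(n)) for i in range(n)]


def pairing(a, b):
    """Discrete analogue of the integral of a(x) * b(x)."""
    return sum(x * y for x, y in zip(a, b))


def is_even(d):
    """True if d[j] == d[-j] for all j, indices taken modulo len(d)."""
    n = len(d)
    return all(d[j] == d[(-j) % n] for j in range(n))
```

For a kernel that is not even the two pairings generally differ, which is why the evenness of $d$ is assumed in the lemma.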

4. Weak Solutions

Based on Theorem 3.1 we now prove existence of global weak solutions to the relativistic Vlasov-Klein-Gordon system. The following auxiliary result will allow us to derive a-priori bounds from conservation of energy, at least for appropriately restricted initial data:

Lemma 4.1. Let $p \in\, ]3/2,\infty]$ and $1/p + 1/q = 1$. In addition to the assumptions of Lemma 3.2 we assume further that $d \ge 0$ with $\int d = 1$. Let $(f,u)$ be a solution as obtained in Lemma 3.2. Then

\[ \int u(t,x)\,\rho(t,x)\,dx \le C(\mathring f)\,\|\partial_x\tilde u(t)\|_2\,E_K(t)^{1/2},\qquad t \ge 0, \]

where

\[ C(\mathring f) := 3^{-7/6}\Bigl(\frac{4q}{\pi}\Bigr)^{1/2}\Bigl(\frac{q+3}{q}\Bigr)^{(q+3)/6}\,\|\mathring f\|_1^{(3-q)/6}\,\|\mathring f\|_p^{q/6}. \]

Proof. For any $R > 0$,

\begin{align*}
\rho(t,x) &= \int_{|v|\le R} f\,dv + \int_{|v|>R} f\,dv\\
&\le \Bigl(\frac{4\pi}{3}R^3\Bigr)^{1/q}\|f(t,x,\cdot)\|_p + \frac{1}{R}\int\sqrt{1+|v|^2}\,f\,dv.
\end{align*}

If we choose $R$ such that the right-hand side becomes minimal we obtain the estimate

\[ \rho(t,x) \le C_q\,\|f(t,x,\cdot)\|_p^{q/(q+3)}\Bigl(\int\sqrt{1+|v|^2}\,f\,dv\Bigr)^{3/(q+3)}, \]

374

M. Kunzinger, G. Rein, R. Steinbauer, G. Teschl

where Cq :=

4π 3

1/(q+3)

q +3 3

q/(q+3) 3 . q

We take this estimate to the power (q + 3)/(q + 2), integrate in x, apply H¨older’s ◦ inequality, and observe that f (t)p = f p to obtain ◦

q/(q+3)

ρ(t)(q+3)/(q+2) ≤ Cq f p

EK (t)3/(q+3) .

By assumption on p we have (q + 3)/(q + 2) > 6/5, and by interpolation, (3−q)/6

ρ(t)6/5 ≤ ρ(t)1

(q+3)/6

(q+3)/6

ρ(t)(q+3)/(q+2) ≤ Cq

◦

(3−q)/6

f 1

◦

q/6

f p EK (t)1/2 . (4.1)

By H¨older’s inequality, Young’s inequality, and Sobolev’s inequality, 1 ρu dx = ρ ∗ d u˜ dx ≤ u(t) ˜ ˜ 6 ρ(t) ∗ d6/5 ≤ √ ∂x u(t) 2 ρ(t)6/5 S3 1 (q+3)/6 ◦ (3−q)/6 ◦ q/6 1/2 ≤ √ Cq f 1 f p ∂x u(t) ˜ 2 EK (t) S3 with S3 := 3(π/2)4/3 , cf. [13, 8.3]; recall from the proof of Lemma 3.2 that u = u˜ ∗ d. We now proceed to the main result of the present paper. In its formulation we employ the space

$$L^1_{\mathrm{kin}}(\mathbb{R}^6):=\left\{f:\mathbb{R}^6\to\mathbb{R}\;\Big|\;\|f\|_{\mathrm{kin}}:=\iint\sqrt{1+|v|^2}\,|f(x,v)|\,dx\,dv<\infty\right\}.$$

Theorem 4.2. Let $\mathring{f}\in L^1_{\mathrm{kin}}(\mathbb{R}^6)\cap L^p(\mathbb{R}^6)$ for some $p\in[2,\infty]$, $\mathring{f}\ge 0$, $\mathring{u}_1\in H^1(\mathbb{R}^3)$, $\mathring{u}_2\in L^2(\mathbb{R}^3)$, and assume that

$$\|\mathring{f}\|_1^{(3-q)/3}\,\|\mathring{f}\|_p^{q/3}<\frac{\pi}{2q}\,3^{7/3}\left(\frac{q}{q+3}\right)^{(q+3)/3},$$

where $1/p+1/q=1$. Then there exists a global weak solution $(f,u)$ of the relativistic Vlasov-Klein-Gordon system (1.1), (1.2), (1.3) with these initial data, more precisely, $f\in L^\infty([0,\infty[,L^p(\mathbb{R}^6))$, $u\in L^\infty([0,\infty[,H^1(\mathbb{R}^3))$ with $\partial_t u\in L^\infty([0,\infty[,L^2(\mathbb{R}^3))$ such that the following holds:

(a) $(f,u)$ satisfies (1.1), (1.2), (1.3) in $\mathcal{D}'(]0,\infty[\times\mathbb{R}^6)$.

(b) The mapping $[0,\infty[\,\ni t\mapsto(f(t),u(t),\partial_t u(t))\in L^2(\mathbb{R}^6)\times L^2(\mathbb{R}^3)\times L^2(\mathbb{R}^3)$ is weakly continuous with $(f,u,\partial_t u)(0)=(\mathring{f},\mathring{u}_1,\mathring{u}_2)$.

(c) $f(t)\ge 0$ a.e., $\|f(t)\|_p\le\|\mathring{f}\|_p$, $t\ge 0$, and $\partial_t\rho+\operatorname{div}j=0$ in $\mathcal{D}'(]0,\infty[\times\mathbb{R}^3)$ where $j(t,x):=\int\hat v\,f(t,x,v)\,dv$. The weak solution conserves mass: $\|f(t)\|_1=\|\mathring{f}\|_1$ for a.a. $t\ge 0$.

Remark. Regardless of whether one considers data satisfying the restriction of the theorem as small or not, it should be emphasized that this is not a small data result of the type known for example for the Vlasov-Maxwell system, cf. [5], since it does not rely on the fields being small and corresponding dispersive effects of the free streaming Vlasov equation. In particular, it is worthwhile to note that there is no size restriction on the data for the Klein-Gordon field.
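The constants in Lemma 4.1 and the smallness condition of Theorem 4.2 fit together through a chain of elementary identities, which the following sketch cross-checks numerically. It is an illustration only, not part of the paper: the function names are ours, and the values of `A` and `B` (standing for $\|f(t,x,\cdot)\|_p$ and $\int\sqrt{1+|v|^2}f\,dv$) are arbitrary test numbers. It checks that minimizing over $R$ gives the constant $C_q$, that $S_3^{-1/2}C_q^{(q+3)/6}$ equals the numerical prefactor in $C(\mathring{f})$, and that the threshold in Theorem 4.2 corresponds to $C(\mathring{f})^2=2$.

```python
import math

S3 = 3 * (math.pi / 2) ** (4 / 3)  # Sobolev constant quoted in the proof of Lemma 4.1

def C_q(q):
    """Constant obtained from minimizing the velocity-splitting bound over R."""
    return (4 * math.pi / 3) ** (1 / (q + 3)) * ((q + 3) / 3) * (3 / q) ** (q / (q + 3))

def split_bound(R, q, A, B):
    """Right-hand side (4 pi R^3 / 3)^(1/q) * A + B / R of the splitting estimate."""
    return (4 * math.pi / 3 * R ** 3) ** (1 / q) * A + B / R

def optimal_R(q, A, B):
    """Stationary point of split_bound in R (set the R-derivative to zero)."""
    a = (4 * math.pi / 3) ** (1 / q) * A
    return (q * B / (3 * a)) ** (q / (q + 3))

def C_lemma(q, norm1=1.0, normp=1.0):
    """C(f) as stated in Lemma 4.1."""
    return (3 ** (-7 / 6) * math.sqrt(4 * q / math.pi)
            * ((q + 3) / q) ** ((q + 3) / 6)
            * norm1 ** ((3 - q) / 6) * normp ** (q / 6))

def C_proof(q, norm1=1.0, normp=1.0):
    """The same constant as assembled at the end of the proof."""
    return (C_q(q) ** ((q + 3) / 6) / math.sqrt(S3)
            * norm1 ** ((3 - q) / 6) * normp ** (q / 6))

def threshold(q):
    """Right-hand side of the smallness condition in Theorem 4.2."""
    return math.pi / (2 * q) * 3 ** (7 / 3) * (q / (q + 3)) ** ((q + 3) / 3)

for q in (1.0, 1.5, 2.0, 2.9):
    A, B = 0.7, 2.3  # arbitrary positive test values
    R = optimal_R(q, A, B)
    closed_form = C_q(q) * A ** (q / (q + 3)) * B ** (3 / (q + 3))
    print(q, split_bound(R, q, A, B), closed_form, C_lemma(q), C_proof(q), threshold(q))
```

Since $C(\mathring{f})^2$ is the prefactor squared times $\|\mathring{f}\|_1^{(3-q)/3}\|\mathring{f}\|_p^{q/3}$, the hypothesis of Theorem 4.2 is the same statement as $C(\mathring{f})^2<2$, which is the form used at the start of the proof.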

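Proof of Theorem 4.2. By our assumption on $\mathring{f}$, $C(\mathring{f})^2<2$ where the left-hand side is defined as in Lemma 4.1. We choose $\varepsilon\in\,]0,1[$ such that $C(\mathring{f})^2<2\varepsilon$. Next we choose sequences $(\mathring{f}_n)$ in $C_c^\infty(\mathbb{R}^6)$, $\mathring{f}_n\ge 0$, $(\mathring{u}_{1,n})$, $(\mathring{u}_{2,n})$ in $C_c^\infty(\mathbb{R}^3)$ such that $\mathring{f}_n\to\mathring{f}$ in $L^1_{\mathrm{kin}}(\mathbb{R}^6)\cap L^p(\mathbb{R}^6)$, $\mathring{u}_{1,n}\to\mathring{u}_1$ in $H^1(\mathbb{R}^3)$, $\mathring{u}_{2,n}\to\mathring{u}_2$ in $L^2(\mathbb{R}^3)$, and we require that

$$\sup_{n\in\mathbb{N}}C(\mathring{f}_n)^2<2\varepsilon.\qquad(4.2)$$

Let $d_n\in C_c^\infty(\mathbb{R}^3)$ be non-negative, even, of unit integral, and with $\operatorname{supp}d_n\subseteq B_{1/n}(0)$, and define $\delta_n:=d_n*d_n$. Denote by $(f_n,u_n)$ the regularized solution according to Theorem 3.1 with $\delta_n$ replacing $\delta$ and initial conditions

$$f_n(0)=\mathring{f}_n,\quad u_n(0)=\mathring{u}_{1,n}*\delta_n,\quad\partial_t u_n(0)=\mathring{u}_{2,n}*\delta_n.$$

Let $\rho_n:=\int f_n\,dv$ and denote by $\tilde u_n$ the solution to the Klein-Gordon equation with right-hand side $-\rho_n*d_n$ and initial data

$$\tilde u_n(0)=\mathring{u}_{1,n}*d_n,\quad\partial_t\tilde u_n(0)=\mathring{u}_{2,n}*d_n.$$

As in the proof of Lemma 3.2 it follows that $u_n=\tilde u_n*d_n$. Using Lemma 4.1 and Cauchy's inequality we find that (with $\tilde E$ from Lemma 3.2)

$$\tilde E\ge\|f_n(t)\|_{\mathrm{kin}}+\frac12\|\partial_t\tilde u_n(t)\|_2^2+\frac12\|\tilde u_n(t)\|_2^2+\frac12\|\partial_x\tilde u_n(t)\|_2^2-C(\mathring{f}_n)\,\|\partial_x\tilde u_n(t)\|_2\,\|f_n(t)\|_{\mathrm{kin}}^{1/2}$$

$$\ge(1-\varepsilon)\,\|f_n(t)\|_{\mathrm{kin}}+\frac12\|\partial_t\tilde u_n(t)\|_2^2+\frac12\|\tilde u_n(t)\|_2^2+\left(\frac12-\frac{1}{4\varepsilon}\sup_{k\in\mathbb{N}}C(\mathring{f}_k)^2\right)\|\partial_x\tilde u_n(t)\|_2^2.$$

Hence by conservation of energy and (4.2) there exists $C>0$ such that for all $n\in\mathbb{N}$ and $t\ge 0$,

$$\|f_n(t)\|_{\mathrm{kin}},\ \|\partial_t u_n(t)\|_2,\ \|u_n(t)\|_{H^1}\le C.$$

Also, $\|f_n(t)\|_2=\|\mathring{f}_n\|_2$ is bounded by interpolation. Since the bounds are independent of $n$ and $t$, by extracting appropriate subsequences (again denoted by the same indices) we obtain $f\in L^\infty([0,\infty[;L^2(\mathbb{R}^6))$, $u\in L^\infty([0,\infty[;H^1(\mathbb{R}^3))$ with $\partial_t u\in L^\infty([0,\infty[;L^2(\mathbb{R}^3))$ and

$$f_n\rightharpoonup f\ \text{in }L^2([0,T]\times\mathbb{R}^6),\quad u_n\rightharpoonup u,\ \ \partial_x u_n\rightharpoonup\partial_x u,\ \ \partial_t u_n\rightharpoonup\partial_t u\ \text{in }L^2([0,T]\times\mathbb{R}^3)$$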


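for all $T>0$. Moreover, $f\in L^\infty([0,\infty[;L^1_{\mathrm{kin}}(\mathbb{R}^6))$ since

$$\iint\sqrt{1+v^2}\,f\,dx\,dv=\lim_{R\to\infty}\lim_{n\to\infty}\int_{|x|\le R}\int_{|v|\le R}\sqrt{1+v^2}\,f_n\,dx\,dv\le C.$$

Since $\|f_n(t)\|_p=\|\mathring{f}_n\|_p$ is bounded as well, (4.1) implies boundedness of $\|\rho_n(t)\|_{6/5}$. Hence without loss of generality we may assume that $\rho_n\rightharpoonup\rho$ in $L^{6/5}([0,T]\times\mathbb{R}^3)$ for any $T$. Moreover, $\|u_n(t)\|_6\le C\|\partial_x u_n(t)\|_2$, so we may finally suppose that $u_n\rightharpoonup u$ in $L^6([0,T]\times\mathbb{R}^3)$ for any $T$. We claim that $u$ satisfies the Klein-Gordon equation in the weak sense. Indeed, let $\varphi\in C_c^\infty(]0,\infty[\times\mathbb{R}^3)$. Then

$$0=\iint\big(\partial_t^2u_n-\Delta u_n+u_n+\rho_n*\delta_n\big)\,\varphi\,dx\,dt=\iint\big(u_n\,\partial_t^2\varphi-u_n\,\Delta\varphi+u_n\varphi+(\rho_n*\delta_n)\varphi\big)\,dx\,dt\to\iint\big(u\,\partial_t^2\varphi-u\,\Delta\varphi+u\varphi+\rho\varphi\big)\,dx\,dt,$$

which yields the claim; note that $\int(\rho_n*\delta_n)\varphi=\int\rho_n(\delta_n*\varphi)$ and $\delta_n*\varphi\to\varphi$ in $L^6(\mathbb{R}^3)$.

Turning now to the Vlasov equation, we adapt an argument from [12, Sect. 3]. For $0<\varepsilon<T$ let $\xi_\varepsilon\in C_c^\infty(\mathbb{R})$, $0\le\xi_\varepsilon\le 1$, $\xi_\varepsilon=1$ on $[\varepsilon,T]$, $\operatorname{supp}\xi_\varepsilon\subseteq[\varepsilon/2,2T]$. Then

$$\partial_t(f_n\xi_\varepsilon)+\hat v\cdot\partial_x(f_n\xi_\varepsilon)=\operatorname{div}_v(f_n\xi_\varepsilon\,\partial_x u_n)+f_n\xi_\varepsilon'.$$

Hence by the velocity-averaging lemma of Golse, Lions, Perthame and Sentis ([6, 2]) we have: for all $R>0$ and all $\psi\in C_c^\infty(B_R(0))$ there exists $C=C(R,\psi)$ such that for all $n\in\mathbb{N}$, $\varepsilon>0$,

$$\int\xi_\varepsilon(\cdot)f_n(\cdot,\cdot,v)\psi(v)\,dv\in H^{1/4}(\mathbb{R}\times\mathbb{R}^3),$$

$$\left\|\int\xi_\varepsilon(\cdot)f_n(\cdot,\cdot,v)\psi(v)\,dv\right\|_{H^{1/4}}\le C\left(\|\xi_\varepsilon f_n\|_2^2+\|f_n\xi_\varepsilon\,\partial_x u_n\|_2^2+\|f_n\xi_\varepsilon'\|_2^2\right)^{1/2}.$$

In the case $p=\infty$ the boundedness properties derived above already assure the boundedness of the sequence $\left(\int\xi_\varepsilon(\cdot)f_n(\cdot,\cdot,v)\psi(v)\,dv\right)_{n\in\mathbb{N}}$ in $H^{1/4}(\mathbb{R}\times\mathbb{R}^3)$. The case $p<\infty$ is more delicate and will be treated below. Since the restriction operator from $H^{1/4}(\mathbb{R}\times\mathbb{R}^3)$ to $L^2([0,T]\times B_R(0))$ is compact, by a diagonal sequence argument, for each $\psi\in C_c^\infty(\mathbb{R}^3)$ we may extract a subsequence (again denoted by $(f_n)_{n\in\mathbb{N}}$) independent of $\varepsilon:=1/m$, $R=m$ ($m\in\mathbb{N}$) such that

$$\int f_n(\cdot,\cdot,v)\psi(v)\,dv\to\int f(\cdot,\cdot,v)\psi(v)\,dv\qquad(4.3)$$

in $L^2(]0,T[\times B_R(0))$. For $p<\infty$ we show the validity of (4.3) as follows: For $\eta>0$ we define $\beta_\eta(\tau):=\tau/(1+\eta\tau)$. Then $\beta_\eta\circ f_n$, in addition to satisfying Vlasov's equation (with force $\partial_x u_n$) and the same bounds as $f_n$, is bounded in $L^\infty$, and we may use the previous case. In the limit $\eta\to 0$ we obtain $L^1$-convergence in (4.3) which together with the uniform integrability of $\left\{\left(\int f_n(\cdot,\cdot,v)\psi(v)\,dv\right)^2\right\}$ implies the desired $L^2$-convergence.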


We use this property to prove that $f$ is a weak solution of the Vlasov equation. For $\varphi_1\in C_c^\infty(\mathbb{R})$, $\varphi_2,\varphi_3\in C_c^\infty(\mathbb{R}^3)$,

$$0=\int_0^T\!\!\iint f_n(t,x,v)\Big[\varphi_1'(t)\varphi_2(x)\varphi_3(v)+\hat v\cdot\partial_x\varphi_2(x)\,\varphi_1(t)\varphi_3(v)-\partial_x u_n(t,x)\cdot\partial_v\varphi_3(v)\,\varphi_1(t)\varphi_2(x)\Big]\,dv\,dx\,dt.$$

Choose a subsequence as above for $\psi=\partial_v\varphi_3$ and $R>0$ such that $\operatorname{supp}\varphi_3\subseteq B_R(0)$. Then $\int f_n(\cdot,\cdot,v)\,\partial_v\varphi_3(v)\,dv\to\int f(\cdot,\cdot,v)\,\partial_v\varphi_3(v)\,dv$ in $L^2([0,T]\times B_R(0))$ and $\partial_x u_n\,\varphi_1\varphi_2\rightharpoonup\partial_x u\,\varphi_1\varphi_2$ in $L^2((0,T)\times B_R(0))$. Thus, finally,

$$0=\int_0^T\!\!\iint f(t,x,v)\Big[\varphi_1'(t)\varphi_2(x)\varphi_3(v)+\hat v\cdot\partial_x\varphi_2(x)\,\varphi_1(t)\varphi_3(v)-\partial_x u(t,x)\cdot\partial_v\varphi_3(v)\,\varphi_1(t)\varphi_2(x)\Big]\,dv\,dx\,dt,$$

which establishes (a).

As to (b), we integrate the Vlasov equation once with respect to $t$ and define

$$\tilde f(t):=\mathring{f}-\int_0^t\hat v\cdot\partial_x f(s)\,ds-\int_0^t\partial_x u(s)\cdot\partial_v f(s)\,ds$$

(to be understood as an identity between distributions in $x$ and $v$). Then, using the construction of $f$, it is not hard to see that $f(t)=\tilde f(t)$ in $\mathcal{D}'(\mathbb{R}^6)$ for a.e. $t\in[0,\infty[$, that is, we can replace $f$ by $\tilde f$. Moreover, $\tilde f(0)=\mathring{f}$ and a straightforward application of the dominated convergence theorem shows that $t\mapsto\tilde f(t)$ is weakly continuous. A similar argument applies to $u$.

As to (c), we only prove that mass is conserved, the other assertions being standard. Let $\varepsilon>0$ and choose $R>0$ such that $\int_{|x|\ge R}\rho(0,x)\,dx<\varepsilon$. Without loss of generality we can assume the same for $\rho_n(0)$, but since for the regularized problem the particles move along characteristics with speeds $|\hat v|<1$ we have $\int_{|x|\ge R+T}\rho_n(t,x)\,dx<\varepsilon$ on any time interval $[0,T]$. Hence,

$$\int\rho(t,x)\,dx\ge\int_{|x|\le R+T}\rho(t,x)\,dx=\lim_{n\to\infty}\int_{|x|\le R+T}\rho_n(t,x)\,dx\ge\lim_{n\to\infty}\int\rho_n(t,x)\,dx-\varepsilon=\|\mathring{f}\|_1-\varepsilon.$$

Since on the other hand $\int\rho(t,x)\,dx\le\|\mathring{f}\|_1$ and $\varepsilon$ is arbitrary, this proves conservation of mass.

Acknowledgements. We thank Peter A. Markowich for several helpful discussions and the referees for their useful suggestions.


References

1. Abraham, M.: Theorie der Elektrizität, Band 2: Elektromagnetische Theorie der Strahlung. Leipzig: Teubner, 1905
2. DiPerna, R.J., Lions, P.-L.: Global weak solutions of Vlasov-Maxwell systems. Comm. Pure Appl. Math. 42(6), 729–757 (1989)
3. Glassey, R.T., Schaeffer, J.: On symmetric solutions of the relativistic Vlasov-Poisson system. Commun. Math. Phys. 101, 459–473 (1985)
4. Glassey, R.T., Schaeffer, J.: Global existence for the relativistic Vlasov-Maxwell system with nearly neutral initial data. Commun. Math. Phys. 119, 353–384 (1988)
5. Glassey, R.T., Strauss, W.A.: Absence of shocks in an initially dilute collisionless plasma. Commun. Math. Phys. 113, 191–208 (1987)
6. Golse, F., Lions, P.-L., Perthame, B., Sentis, R.: Regularity of the moments of the solution of a transport equation. J. Funct. Anal. 76(1), 110–125 (1988)
7. Imaikin, V.M., Komech, A.I., Spohn, H.: Scattering theory for a particle coupled to a scalar field. Preprint
8. Imaikin, V.M., Komech, A.I., Markowich, P.A.: Scattering of solitons of the Klein-Gordon equation coupled to a classical particle. Preprint
9. Komech, A.I., Kunze, M., Spohn, H.: Long-time asymptotics for a classical particle coupled to a scalar wave field. Commun. Part. Diff. Eqs. 22, 307–335 (1997)
10. Komech, A.I., Kunze, M., Spohn, H.: Effective dynamics for a mechanical particle coupled to a wave field. Commun. Math. Phys. 203, 1–19 (1999)
11. Komech, A.I., Spohn, H.: Soliton-like asymptotics for a classical particle interacting with a scalar wave field. Nonlin. Anal. 33, 13–24 (1998)
12. Kruse, K., Rein, G.: A stability result for the relativistic Vlasov-Maxwell system. Arch. Rational Mech. Anal. 121(2), 187–203 (1992)
13. Lieb, E.H., Loss, M.: Analysis. Providence, RI: American Mathematical Society, 1996
14. Lions, P.-L., Perthame, B.: Propagation of moments and regularity for the 3-dimensional Vlasov-Poisson system. Invent. Math. 105, 415–430 (1991)
15. Morawetz, C.S., Strauss, W.A.: Decay and scattering of solutions of a nonlinear relativistic wave equation. Commun. Pure Applied Math. 25, 1–31 (1972)
16. Pfaffelmoser, K.: Global classical solutions of the Vlasov-Poisson system in three dimensions for general initial data. J. Diff. Eqs. 95, 281–303 (1992)
17. Rein, G.: Generic global solutions of the relativistic Vlasov-Maxwell system of plasma physics. Commun. Math. Phys. 135, 41–78 (1990)
18. Rein, G., Rendall, A.D.: Global existence of solutions of the spherically symmetric Vlasov-Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992)
19. Schaeffer, J.: Global existence of smooth solutions to the Vlasov-Poisson system in three dimensions. Commun. Part. Diff. Eqs. 16, 1313–1335 (1991)
20. Sideris, T.C.: Decay estimates for the three-dimensional inhomogeneous Klein-Gordon equation and applications. Commun. Part. Diff. Eqs. 14, 1421–1455 (1989)

Communicated by H. Spohn

Commun. Math. Phys. 238, 379–410 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0877-6

Communications in

Mathematical Physics

Unextendible Product Bases, Uncompletable Product Bases and Bound Entanglement

David P. DiVincenzo¹, Tal Mor², Peter W. Shor³, John A. Smolin¹, Barbara M. Terhal¹,⁴

¹ IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
² Dept. of Electrical Engineering, UCLA, Los Angeles, CA 90095-1594, USA
³ AT&T Research, Florham Park, NJ 07932, USA
⁴ ITF, UvA, Valckenierstraat 65, 1018 XE Amsterdam, and CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands

Received: 19 October 1999 / Accepted: 2 November 2000 Published online: 11 June 2003 – © Springer-Verlag 2003

Abstract: We report new results and generalizations of our work on unextendible product bases (UPB), uncompletable product bases and bound entanglement. We present a new construction for bound entangled states based on product bases which are only completable in a locally extended Hilbert space. We introduce a very useful representation of a product basis, an orthogonality graph. Using this representation we give a complete characterization of unextendible product bases for two qutrits. We present several generalizations of UPBs to arbitrary high dimensions and multipartite systems. We present a sufficient condition for sets of orthogonal product states to be distinguishable by separable superoperators. We prove that bound entangled states cannot help increase the distillable entanglement of a state beyond its regularized entanglement of formation assisted by bound entanglement.

1. Introduction

One of the essential features of quantum information is its capacity for entanglement. When pure state entanglement is shared by two or more parties, it permits them to send quantum data with classical communication via teleportation [BBC+93]. In a more general situation two parties may not start with a set of pure entangled states, but with a noisy quantum channel. To achieve their goal of transmitting quantum data over this channel, they could use an error correcting code [Got97], or alternatively they can attempt to share entanglement through the channel and later use teleportation. In the latter case, the protocol starts with the preparation of entangled states by, say, Alice, who sends half of each entangled state through the noisy channel to her partner Bob. Since the channel is noisy these states will not directly be useful for teleportation. As a next step Alice and Bob go through a protocol of purification [BBP+96]; they try to distill as many pure entangled states as possible out of the set of noisy ones using only local operations and classical communication.
We will abbreviate such local quantum operations supplemented by classical communication hereafter as “LQ+CC” operations. Finally, they


can use these distilled states to teleport the quantum data. The amount of quantum data that can be sent via the protocol of distillation and teleportation can be higher than by “direct” quantum data transmission using error correcting codes [BDSW96]. This has been one of the motivations for studying bipartite mixed state entanglement.

Let us review the definition of entanglement and introduce some notation. A density matrix $\rho$ on a multipartite Hilbert space $\mathcal{H}$ is separable if we can find a decomposition of $\rho$ into an ensemble of pure product states in $\mathcal{H}$. Thus, for a bipartite Hilbert space a separable density matrix $\rho$ can always be written as

$$\rho=\sum_i p_i\,|\alpha_i\rangle\langle\alpha_i|\otimes|\beta_i\rangle\langle\beta_i|,\qquad(1)$$

where $p_i\ge 0$. When a density matrix is not separable, the density matrix is called entangled. In the following we use the notation $n\otimes m$ or $\mathcal{H}_n\otimes\mathcal{H}_m$ to denote the tensor product between an $n$-dimensional Hilbert space and an $m$-dimensional Hilbert space. A Trace-preserving Completely Positive linear map $S$ is abbreviated as a TCP map $S$. When a Hermitian matrix $\sigma$ has eigenvalues greater than or equal to zero, we denote this as $\sigma\ge 0$, i.e. $\sigma$ is a positive semidefinite matrix. We denote the set of linear operators on a Hilbert space $\mathcal{H}$ as $B(\mathcal{H})$.

The theory of positive linear maps has turned out to be an important tool in characterizing bipartite mixed state entanglement [HHH96]. It has been shown [HHH98] that all density matrices $\rho$ on $\mathcal{H}_n\otimes\mathcal{H}_m$ which remain positive semidefinite under the partial transposition (PT) map, i.e. $(\mathbf{1}\otimes T)(\rho)\ge 0$, where $T$ is matrix transposition¹, are not distillable. We will say that such density matrices have the PPT property or “are PPT”. Here a density matrix $\rho$ is called distillable when for all $\epsilon>0$, there exists an integer $n$ and an LQ+CC procedure $S: B(\mathcal{H}^{\otimes n})\to\mathcal{H}_2$ with $\langle\Psi^-|S(\rho^{\otimes n})|\Psi^-\rangle\ge 1-\epsilon$, where $|\Psi^-\rangle$ is a singlet state. A state which has entanglement but which is not distillable is called a bound entangled state. All entangled states which are PPT are thus bound entangled. But do such states exist? It was shown in Ref. [HHH96] that entangled states with the PPT property do not exist in Hilbert spaces $2\otimes 2$ and $2\otimes 3$. The first examples of entangled density matrices with the PPT property in higher dimensional Hilbert spaces were found by P. Horodecki [Hor97]. In Ref. [BDM+99] we presented the first method for constructing bound entangled PPT states. This method relies on the notion of an unextendible product basis or UPB. This construction has also led to a method for constructing indecomposable positive linear maps [Ter]. In Ref. [BDM+99] we have given several examples of unextendible product bases, and therefore of bound entangled states.
We showed that the notion of an unextendible product basis has another interesting feature, namely the states in the unextendible product basis are not exactly distinguishable by local quantum operations and classical communication. They form a demonstration of the phenomenon of “nonlocality without entanglement” [BDF+99]. In this paper we continue the work that was started in Ref. [BDM+99]. We will review many of the results that were presented in Ref. [BDM+99].

The paper is organized in the following way. In Sect. 2 we review some of the definitions and results that were presented in Ref. [BDM+99]. In Sect. 2.4 we present a first example and indicate a method to make bound entangled states which are based on uncompletable but not strongly uncompletable product bases. In Sect. 3 we present a sufficient condition for members of an orthogonal product basis to be distinguishable by separable superoperators. In Sect. 4 we introduce the notion of an orthogonality graph associated with a product basis; this notion helps us in establishing a complete characterization of all unextendible product bases in $3\otimes 3$. In Sect. 5 we present unextendible product bases for multipartite and bipartite high dimensional Hilbert spaces. Again we will make fruitful use of the notion of an orthogonality graph. In Sect. 6 we report several results that are obtained in considering the use of bound entangled states. We will prove a restriction on the use of bound entangled states in the distillation of entangled states. In Sect. 6.2 we relate the sharing of bound entanglement to the possession of a quantum channel, namely a binding entanglement channel.

¹ This matrix transposition can be carried out in any basis. The resulting matrices that one obtains by transposing in different bases are identical up to a local unitary transformation and therefore they have identical eigenvalues which determine whether they are positive semidefinite or not.

2. Properties of Uncompletable and Unextendible Product Bases

In this section we exhibit various properties of uncompletable and unextendible product bases, and explore their relation to local distinguishability of sets of product states and bound entanglement.

2.1. Definitions and counting lemma. We give the definitions of three kinds of sets of orthogonal product states. First we define an unextendible and an uncompletable product basis:

Definition 1. Consider a multipartite quantum system $\mathcal{H}=\bigotimes_{i=1}^m\mathcal{H}_i$ with $m$ parties. An orthogonal product basis (PB) is a set $S$ of pure orthogonal product states spanning a subspace $\mathcal{H}_S$ of $\mathcal{H}$. An uncompletable product basis (UCPB) is a PB whose complementary subspace $\mathcal{H}_S^\perp$, i.e. the subspace in $\mathcal{H}$ spanned by vectors that are orthogonal to all the vectors in $\mathcal{H}_S$, contains fewer mutually orthogonal product states than its dimension. An unextendible product basis (UPB) is an uncompletable product basis for which $\mathcal{H}_S^\perp$ contains no product state.

Thus, for an unextendible product basis $S$, it is not possible to find a product vector in $\mathcal{H}$ that is orthogonal to all the members in $S$. For an uncompletable product basis $S$, it may be possible to find product vectors that are orthogonal to all the states in $S$; however, we will never be able to find enough states so as to complete the set $S$ to a full basis for $\mathcal{H}$.

Now we give the next definition, that of a strongly uncompletable product basis, for which we will use the notion of a locally extended Hilbert space. Let $\mathcal{H}=\bigotimes_{i=1}^m\mathcal{H}_i$ be a Hilbert space of an $m$-partite system. A locally extended Hilbert space is defined as $\mathcal{H}_{\mathrm{ext}}=\bigotimes_{i=1}^m(\mathcal{H}_i\oplus\mathcal{H}_i')$, where $\mathcal{H}_i'$ is a local extension. When we are given a set of states in $\mathcal{H}$ we can consider properties of this set embedded in a locally extended Hilbert space $\mathcal{H}_{\mathrm{ext}}$.

Definition 2. Consider a multipartite quantum system $\mathcal{H}=\bigotimes_{i=1}^m\mathcal{H}_i$ with $m$ parties. A strongly uncompletable product basis (SUCPB) is a PB spanning a subspace $\mathcal{H}_S$ in a locally extended Hilbert space $\mathcal{H}_{\mathrm{ext}}$ such that for all $\mathcal{H}_{\mathrm{ext}}$ the subspace $\mathcal{H}_S^\perp$ ($\mathcal{H}_{\mathrm{ext}}=\mathcal{H}_S\oplus\mathcal{H}_S^\perp$) contains fewer mutually orthogonal product states than its dimension.

Fig. 1. Pyramid vectors $\vec v_0,\ldots,\vec v_4$ in real 3-space. The height $h$ is chosen so that $\vec v_0\perp\vec v_{2,3}$ etc.

Thus a strongly uncompletable product basis cannot be completed to a full product basis of some extended Hilbert space $\mathcal{H}_{\mathrm{ext}}$. In Sect. 2.4 we will give an example of a PB that is uncompletable but not strongly uncompletable.

We will review an example of an unextendible product basis of five states in $3\otimes 3$ (two qutrits) given in Ref. [BDM+99]. Let $\vec v_0,\vec v_1,\ldots,\vec v_4$ be five vectors in real three dimensional space forming the apex of a regular pentagonal pyramid, the height $h$ of the pyramid being chosen such that nonadjacent vectors are orthogonal (see Fig. 1). The vectors are

$$\vec v_i=N\left(\cos\frac{2\pi i}{5},\ \sin\frac{2\pi i}{5},\ h\right),\qquad i=0,\ldots,4,\qquad(2)$$

with $h=\frac12\sqrt{1+\sqrt5}$ and $N=2/\sqrt{5+\sqrt5}$. Then the following five states in a $3\otimes 3$ Hilbert space form a UPB, henceforth denoted Pyramid:

$$\vec\psi_i=\vec v_i\otimes\vec v_{2i\bmod 5},\qquad i=0,\ldots,4.\qquad(3)$$
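As a sanity check on Eqs. (2) and (3), the following sketch builds the five Pyramid states numerically and tests two facts: the states are orthonormal, and any three of the vectors $\vec v_i$ are linearly independent, which is the ingredient needed to rule out a further orthogonal product state. The code is an illustration under our own naming (`v`, `psi`, `gram`, `min_rank` are not from the paper) and uses only standard NumPy calls.

```python
from itertools import combinations

import numpy as np

# Constants from Eq. (2)
h = 0.5 * np.sqrt(1 + np.sqrt(5))
N = 2 / np.sqrt(5 + np.sqrt(5))

# Pyramid vectors v_0..v_4 as rows of a 5 x 3 array
v = np.array([
    N * np.array([np.cos(2 * np.pi * i / 5), np.sin(2 * np.pi * i / 5), h])
    for i in range(5)
])

# Pyramid states psi_i = v_i (x) v_{2i mod 5} from Eq. (3), as rows of a 5 x 9 array
psi = np.array([np.kron(v[i], v[(2 * i) % 5]) for i in range(5)])

# Gram matrix of the five product states; orthonormality means it is the identity
gram = psi @ psi.T

# Smallest rank among all triples of pyramid vectors; 3 means every triple spans R^3
min_rank = min(np.linalg.matrix_rank(v[list(t)]) for t in combinations(range(5), 3))

print(np.round(gram, 12))
print("minimum rank over triples:", min_rank)
```

A new product state would have to be orthogonal to at least three of the five vectors on one side, and a minimum triple rank of 3 means no nonzero vector in $\mathbb{R}^3$ can do that.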

To see that these five states form a UPB, note first that they are mutually orthogonal: states whose indices differ by 2 mod 5 are orthogonal for the first party (“Alice”); those whose indices differ by 1 mod 5 are orthogonal for the second party (“Bob”). For a new state to be orthogonal to all the existing ones, it would have to be orthogonal to at least three of Alice's states or at least three of Bob's states. However this is impossible, since any set of three vectors $\vec v_i$ spans the full three dimensional space. Therefore the entire four dimensional subspace $\mathcal{H}^\perp_{\mathrm{Pyramid}}$ contains no product state.

We formalize this observation by giving the necessary and sufficient condition for extendibility of a PB (the proof is given in Ref. [BDM+99]):

Lemma 1 ([BDM+99]). Let $S=\{(\psi_j\equiv\bigotimes_{i=1}^m\varphi_{i,j}): j=1\ldots n\}$ be an orthogonal product basis (PB) spanning a subspace of the Hilbert space of an $m$-partite quantum system $\mathcal{H}=\bigotimes_{i=1}^m\mathcal{H}_i$ with $\dim\mathcal{H}_i=d_i$. Let $P$ be a partition of $S$ into a number $m$ of disjoint subsets equal to the number of parties: $S=S_1\cup S_2\cup\ldots\cup S_m$. Let $r_i=\operatorname{rank}\{\varphi_{i,j}:\psi_j\in S_i\}$ be the local rank of subset $S_i$ as seen by the $i$th party. Then $S$ is extendible if and only if there exists a partition $P$ such that for all $i=1\ldots m$, the local rank of the $i$th subset is less than the dimensionality of the $i$th party's Hilbert space. That is to say, $S$ is extendible iff $\exists P\ \forall i\ \ r_i<d_i$.

The lemma provides a simple lower bound on the number of states $n$ in a UPB,

$$n\ge\sum_i(d_i-1)+1,\qquad(4)$$

since, for smaller $n$, one can partition $S$ into sets of size $|S_i|\le d_i-1$ and thus $r_i<d_i$ for all $m$ parties.

2.2. Unextendible product bases and bound entanglement. Every UPB on a bipartite or multipartite Hilbert space gives rise to a bound entangled state which has the PPT property. The construction is the following:

Theorem 1 ([BDM+99]). Let $S$ be a UPB $\{\psi_i: i=1,\ldots,n\}$ in a Hilbert space of total dimension $D$. The density matrix $\bar\rho$ that is proportional to the projector onto $\mathcal{H}_S^\perp$,

$$\bar\rho=\frac{1}{D-n}\left(\mathbf{1}-\sum_{j=1}^n|\psi_j\rangle\langle\psi_j|\right),\qquad(5)$$

is a bound entangled density matrix.

Proof. By definition, $\mathcal{H}_S^\perp$ contains no product states. Therefore $\bar\rho$ is entangled. If the UPB is a bipartite UPB then we can directly apply the PT map to $\bar\rho$ and find that $(\mathbf{1}\otimes T)(\bar\rho)\ge 0$. Then we use the fact from Ref. [HHH98] that if a bipartite density matrix has the PPT property, it is not distillable. To derive the PPT property of $\bar\rho$ we recall that the PT map is linear so we may apply it separately to the identity and to the projector onto $\mathcal{H}_S$ in $\bar\rho$. The identity is invariant under the PT map. Each projector onto a product state is of the form $|\psi_A\rangle\langle\psi_A|\otimes|\psi_B\rangle\langle\psi_B|$ and as such will be mapped onto

$$(\mathbf{1}\otimes T)\big(|\psi_A\rangle\langle\psi_A|\otimes|\psi_B\rangle\langle\psi_B|\big)=|\psi_A\rangle\langle\psi_A|\otimes T\big(|\psi_B\rangle\langle\psi_B|\big)=|\psi_A\rangle\langle\psi_A|\otimes|\psi_B^*\rangle\langle\psi_B^*|.\qquad(6)$$

The product states making up the UPB are mapped onto another set of orthogonal product states. Therefore $(\mathbf{1}\otimes T)(\bar\rho)\ge 0$. In case of a multipartite UPB the PPT condition cannot be used directly. However we can use the above argument to show that under every bipartite partitioning of the parties $\bar\rho$ is PPT. Thus no entanglement can be distilled across any bipartite cut. If any pure “global” entanglement could be distilled it could be used to create entanglement across a bipartite cut. Therefore no entanglement can be distilled and thus the density matrix $\bar\rho$ is bound entangled.

It was pointed out by C.H. Bennett [Ben] that it is a simple matter to create a set $S$ of nonorthogonal product states in a Hilbert space $\mathcal{H}$ such that no other product state can be found in $\mathcal{H}$ that is orthogonal to all the states in $S$. In fact, except for a set of measure zero, any set of randomly chosen product vectors whose number satisfies Eq. (4) will be unextendible in this sense. For every partitioning the new product vector to


be added to the set will have to be orthogonal to $d_i$ other vectors for at least one party $i$. However $d_i$ randomly chosen vectors will typically span a $d_i$-dimensional space, and therefore such a new product vector that is orthogonal to all the members in the set cannot exist. It is not clear how such a nonorthogonal set of product states could lead to a bound entangled state. The projector on $\mathcal{H}_S^\perp$, where $\mathcal{H}_S$ is now the space spanned by the nonorthogonal product vectors, is entangled, but does not necessarily have the PPT property; in the set of orthogonal vectors obtained by Gram-Schmidt orthogonalization of the nonorthogonal product vectors we might find entangled vectors and therefore $\bar\rho$ might not have the PPT property.

2.3. Local distinguishability of product bases. When two parties Alice and Bob possess one state out of an ensemble of orthogonal product states, we may ask whether it is possible for them to determine exactly which state they have by performing local quantum operations and classical communication. As the states are orthogonal, a joint measurement for Alice and Bob that exactly distinguishes the states is always possible. In Ref. [BDM+99] we found that when a set of product states in a multipartite Hilbert space is strongly uncompletable, it implies that the members in the set cannot be distinguished by LQ+CC. This result is captured in the following lemma:

Lemma 2 ([BDM+99]). Given a set $S$ of orthogonal product states on $\mathcal{H}=\bigotimes_{i=1}^m\mathcal{H}_i$. If the set $S$ is exactly distinguishable by local von Neumann measurements and classical communication then it is completable in $\mathcal{H}$. If $S$ is exactly distinguishable by local POVMs and classical communication then the set can be completed in some extended Hilbert space $\mathcal{H}_{\mathrm{ext}}=\bigotimes_{i=1}^m(\mathcal{H}_i\oplus\mathcal{H}_i')$.

The proof is given in Ref. [BDM+99]. We note that in the lemma we only allow POVMs with a finite number of outcomes and we only allow a finite number of rounds of POVM measurements. This restriction comes about because we use Neumark's theorem [Per93] to convert a POVM measurement by a party $i$ into a von Neumann measurement on a locally extended Hilbert space $\mathcal{H}_{i,\mathrm{ext}}=\mathcal{H}_i\oplus\mathcal{H}_i'$. When the number of POVM measurement outcomes is infinite, then the extended Hilbert space is infinite dimensional. It is not clear how one can speak of completing a set of states to a full product basis for an infinite dimensional Hilbert space. We avoid the same problem by excluding the possibility for an infinite number of rounds of POVM measurements.

It is possible to strengthen the lemma one step further and include measurements that do not exactly distinguish the set of states, but make an arbitrary small error $\epsilon$. By this we mean that we use a measurement and a decision scheme for which the probability of correctly deciding what state the parties were given is greater than or equal to $1-\epsilon$ for all possible states that the parties can possess. When we allow only LQ+CC by finite means, both in space and time, it is possible to prove that when measurement plus decision schemes exist that make an arbitrary small error $\epsilon$ for all $\epsilon>0$, then there will also exist a scheme that makes no error. The proof of this result is given in Ref. [Ter99] and relies on the fact that the set of measurement and decision schemes is a finite union of compact sets.

We would like to stress that the converse of Lemma 2 does not hold. There do exist sets of orthogonal product states that are not distinguishable by LQ+CC, but which are completable. A prime example is the set of states given in Ref. [BDF+99]. For this set it was proved that even by allowing an infinite number of rounds of local measurements, it was not possible to distinguish the members with arbitrary small probability of error.


2.4. Uncompletable product bases and bound entanglement. Lemma 2 relates uncompletable product bases (UCPBs and SUCPBs) to distinguishability. We may also ask how these (S)UCPBs relate to bound entanglement. First we recall a simple observation that was presented in Ref. [BDM+ 99]: Proposition 1 ([BDM+ 99]). Given a PB S on H = m i=1 Hi . If the set S is completable in H or a locally extended Hilbert space Hext , then the density matrix ρ¯S is separable. This directly implies that a UPB is strongly uncompletable, since the state ρ¯ corresponding to the UPB is always entangled. We now ask what properties the projector onto HS⊥ has when S is a UCPB or a SUCPB. Certainly, this projector has the PPT property; it will thus either be separable or have bound entanglement. In order to explore this question, we return to an example that was given in Ref. [BDM+ 99]. We consider the PB Pyr34, a curious set of states in 3 ⊗ 4 of which the members are distinguishable by local POVMs and classical communication, but not by von Neumann measurements. Pyr34 consists of the states v j ⊗ w j , j = 0, . . . , 4 with v j the states of the Pyramid UPB as in Eq. (2) and w j defined as √ √ w j = N ( cos(π/5) cos(2j π/5), cos(π/5) sin(2j π/5), √ √ cos(2π/5) cos(4j π/5), cos(2π/5) sin(4j π/5)), (7)

with normalization $N = \sqrt{2/\sqrt{5}}$. Note that $\vec w_j^{T} \vec w_{j+1} = 0$ (addition mod 5). One can show that this set, albeit extendible on 3 ⊗ 4, is not completable: One can at most add three vectors like $\vec v_0 \otimes (\vec w_0, \vec w_1, \vec w_4)^{\perp}$, $\vec v_3 \otimes (\vec w_2, \vec w_3, \vec w_4)^{\perp}$ and $(\vec v_0, \vec v_3)^{\perp} \otimes (\vec w_1, \vec w_2, \vec w_4)^{\perp}$. Therefore this set is an example of a UCPB. However it is possible to distinguish the members of this set by local POVM measurements and classical communication. With this property, Lemma 2 implies that the set is completable in a locally extended Hilbert space. The set is thus not strongly uncompletable. This again implies with Proposition 1 that the state $\bar\rho_{Pyr34}$ is a separable density matrix.

The local POVM that distinguishes the members of Pyr34 starts with a POVM performed by Bob on the four-dimensional side. Bob's POVM has five projector elements, each projecting onto a vector $\vec u_j = N(-\sin(2j\pi/5),\ \cos(2j\pi/5),\ -\sin(4j\pi/5),\ \cos(4j\pi/5))$ with j = 0, . . . , 4, and normalization $N = 1/\sqrt{2}$. Note that $\vec u_0$ is orthogonal to the vectors $\vec w_0$, $\vec w_2$ and $\vec w_3$, or, in general, $\vec u_i$ is orthogonal to $\vec w_i$, $\vec w_{i+2}$, $\vec w_{i+3}$ (addition mod 5). This means that when Bob obtains his POVM measurement outcome, three vectors are excluded from the set; then the remaining two vectors on Alice's side, $\vec v_{i+1}$ and $\vec v_{i+4}$, are orthogonal and can thus be distinguished.

The completion of the Pyr34 set is particularly simple: Bob's Hilbert space is extended to a five-dimensional space. The POVM measurement can be extended as a projection measurement in this five-dimensional space with orthogonal projections onto the states $\vec x_i = (\vec u_i, 0) + \frac{1}{2}(0, 0, 0, 0, 1)$. Then a completion of the set in 3 ⊗ 5 is given by the following ten states:

\[
\begin{aligned}
&(\vec v_1, \vec v_4)^{\perp} \otimes \vec x_0, \quad (\vec v_0, \vec v_2)^{\perp} \otimes \vec x_1, \quad (\vec v_1, \vec v_3)^{\perp} \otimes \vec x_2, \quad (\vec v_2, \vec v_4)^{\perp} \otimes \vec x_3, \quad (\vec v_0, \vec v_3)^{\perp} \otimes \vec x_4, \\
&\vec v_0 \otimes (\vec w_0^{\perp} \in \mathrm{span}(\vec x_4, \vec x_1)), \quad \vec v_1 \otimes (\vec w_1^{\perp} \in \mathrm{span}(\vec x_0, \vec x_2)), \quad \vec v_2 \otimes (\vec w_2^{\perp} \in \mathrm{span}(\vec x_1, \vec x_3)), \\
&\vec v_3 \otimes (\vec w_3^{\perp} \in \mathrm{span}(\vec x_2, \vec x_4)), \quad \vec v_4 \otimes (\vec w_4^{\perp} \in \mathrm{span}(\vec x_3, \vec x_0)).
\end{aligned}
\tag{8}
\]
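The orthogonality relations quoted for Bob's side of Pyr34 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names `w`, `u`, `x`, `dot` and `check_pyr34_bob_side` are illustrative choices, and the vectors are taken from Eq. (7), the POVM vectors $\vec u_j$, and the extended vectors $\vec x_j$ as written in the text (the $\vec x_j$ are used unnormalized, which does not affect orthogonality).

```python
import math

def w(j):
    """Bob's vector w_j of Eq. (7), with N = sqrt(2/sqrt(5))."""
    n = math.sqrt(2 / math.sqrt(5))
    a = math.sqrt(math.cos(math.pi / 5))
    b = math.sqrt(math.cos(2 * math.pi / 5))
    t = 2 * math.pi * j / 5
    return [n * a * math.cos(t), n * a * math.sin(t),
            n * b * math.cos(2 * t), n * b * math.sin(2 * t)]

def u(j):
    """Bob's POVM vector u_j, with N = 1/sqrt(2)."""
    n = 1 / math.sqrt(2)
    t = 2 * math.pi * j / 5
    return [-n * math.sin(t), n * math.cos(t),
            -n * math.sin(2 * t), n * math.cos(2 * t)]

def x(j):
    """Extended vector x_j = (u_j, 0) + (1/2)(0, 0, 0, 0, 1) in five dimensions."""
    return u(j) + [0.5]

def dot(p, q):
    """Real inner product of two coordinate lists."""
    return sum(s * t for s, t in zip(p, q))

def check_pyr34_bob_side(tol=1e-12):
    """Return True if the relations stated in the text hold numerically."""
    # w_j is normalized and orthogonal to w_{j+1} (addition mod 5).
    ok = all(abs(dot(w(j), w(j)) - 1) < tol for j in range(5))
    ok &= all(abs(dot(w(j), w((j + 1) % 5))) < tol for j in range(5))
    # u_i is orthogonal to w_i, w_{i+2}, w_{i+3}.
    ok &= all(abs(dot(u(i), w((i + d) % 5))) < tol
              for i in range(5) for d in (0, 2, 3))
    # The five extended vectors x_i are mutually orthogonal.
    ok &= all(abs(dot(x(i), x(j))) < tol for i in range(5) for j in range(i))
    return ok
```

Calling `check_pyr34_bob_side()` exercises all four relations at once; the tolerance is an arbitrary choice for floating-point comparison.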


Because the set Pyr34 is uncompletable in the Hilbert space 3 ⊗ 4, the state $\bar\rho_{Pyr34}$ has the notable property that although it is separable, it is not decomposable using orthogonal product states [DTT00]. If it were, those states would form a completion of the set Pyr34. Let us now take the set Pyr34 and add one product state, say the vector

\[ \vec v_0 \otimes (\vec w_0, \vec w_1, \vec w_4)^{\perp} \tag{9} \]

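to make it a six-state ensemble Pyr34+. The density matrix $\bar\rho_{Pyr34+}$ has rank 12 − 6 = 6. Is $\bar\rho_{Pyr34+}$ still a separable density matrix? We can enumerate the product states that are orthogonal to the members of Pyr34+, which are not all mutually orthogonal:

\[
\begin{aligned}
&\vec v_3 \otimes (\vec w_2, \vec w_3, \vec w_4)^{\perp}, \qquad \vec v_2 \otimes (\vec w_1, \vec w_2, \vec w_3)^{\perp}, \\
&(\vec v_0, \vec v_3)^{\perp} \otimes (\vec w_1, \vec w_2, \vec w_4)^{\perp}, \qquad (\vec v_0, \vec v_2)^{\perp} \otimes (\vec w_1, \vec w_3, \vec w_4)^{\perp}.
\end{aligned}
\tag{10}
\]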
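As a hedged numerical illustration of Eq. (10), the sketch below builds the six states of Pyr34+ and confirms that each of the four listed product vectors is orthogonal to all of them. It is not taken from the paper. It assumes that the Pyramid vectors have the form $(\cos(2\pi j/5), \sin(2\pi j/5), h)$ with $h^2 = -\cos(4\pi/5)$, which is the $p = 5$, $m = 2$ case of Eq. (39); all vectors are left unnormalized, and the helper names (`perp4`, `cross3`, `pyr34_plus`, `candidates`) are illustrative.

```python
import math

def dot(p, q):
    return sum(s * t for s, t in zip(p, q))

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def cross3(a, b):
    """Vector orthogonal to two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def perp4(a, b, c):
    """Vector orthogonal to three vectors in R^4 (generalized cross product)."""
    out = []
    for col in range(4):
        minor = [[row[i] for i in range(4) if i != col] for row in (a, b, c)]
        out.append((-1) ** col * det3(*minor))
    return out

def v(j):
    """Assumed (unnormalized) Pyramid vector with h^2 = -cos(4*pi/5)."""
    t = 2 * math.pi * j / 5
    return [math.cos(t), math.sin(t), math.sqrt(-math.cos(4 * math.pi / 5))]

def w(j):
    """Unnormalized w_j of Eq. (7)."""
    a = math.sqrt(math.cos(math.pi / 5))
    b = math.sqrt(math.cos(2 * math.pi / 5))
    t = 2 * math.pi * j / 5
    return [a * math.cos(t), a * math.sin(t), b * math.cos(2 * t), b * math.sin(2 * t)]

def pyr34_plus():
    """The five Pyr34 states plus the extra state of Eq. (9), as (Alice, Bob) pairs."""
    states = [(v(j), w(j)) for j in range(5)]
    states.append((v(0), perp4(w(0), w(1), w(4))))
    return states

def candidates():
    """The four product vectors listed in Eq. (10)."""
    return [
        (v(3), perp4(w(2), w(3), w(4))),
        (v(2), perp4(w(1), w(2), w(3))),
        (cross3(v(0), v(3)), perp4(w(1), w(2), w(4))),
        (cross3(v(0), v(2)), perp4(w(1), w(3), w(4))),
    ]

def overlap(s, t):
    """Inner product of two product states, as the product of local inner products."""
    return dot(s[0], t[0]) * dot(s[1], t[1])
```

Four product vectors cannot span a six-dimensional range, which is the counting step used in the text that follows Eq. (10).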

These four vectors are not enough to span the full Hilbert space $H^{\perp}_{Pyr34+}$. This means that the range of $\bar\rho_{Pyr34+}$ contains only four product states, whereas $\bar\rho_{Pyr34+}$ has rank 6. Therefore $\bar\rho_{Pyr34+}$ must be entangled. The entanglement of $\bar\rho_{Pyr34+}$ is bound by construction. Since $\bar\rho_{Pyr34+}$ is entangled, Proposition 1 implies that the set Pyr34+ is a SUCPB. So we have constructed a new bound entangled state whose range is not exempt from product states but has a product state deficit. This set is the first example of a bound entangled state related to a SUCPB, which is not a UPB. Pyr34+ shares with any UPB the fact that its members cannot be distinguished perfectly by local POVMs and classical communication.

In conclusion, we have gone from a UCPB Pyr34 to a SUCPB Pyr34+, or from a separable state $\bar\rho_{Pyr34}$ to a bound entangled state $\bar\rho_{Pyr34+}$. This construction is an example of a general way to make a bound entangled state from a UCPB:

Lemma 3. Given a UCPB S on $H = \bigotimes_{i=1}^{m} H_i$ there always exists a (possibly empty) set of mutually orthogonal product states orthogonal to S such that when added to S to make S+, the density matrix $\bar\rho_{S+}$ is bound entangled.

Proof. We consider the density matrix $\bar\rho_S$, which is either separable or bound entangled. If it is separable then there exists at least one product state in the range of $\bar\rho_S$. We add this state to S and repeat this procedure until the projector onto the complementary subspace of this augmented set is entangled. When S is uncompletable, we cannot keep adding orthogonal product states indefinitely: if we were able to add orthogonal product states until we had a full product basis for H, then the set S would be completable on the given Hilbert space H, which contradicts S being a UCPB.

The lemma leaves open the possibility that the only bound entangled density matrices $\bar\rho_{S+}$ we can find are those for which S has been extended all the way into a UPB. Our example Pyr34+ shows that this is not always the case. One question of interest which we have not been able to answer is the following: Say we have a PB S which is a SUCPB, but not a UPB, such as the set Pyr34+. Will it be necessary to add more product states to this set, as Lemma 3 suggests, to make a bound entangled state on the complementary subspace? Or is the state $\bar\rho_S$, where S is a SUCPB but not a UPB, always bound entangled?

In Figs. 2 and 3 we show the network of relations that was partially discussed in this section. In Sect. 3 we will discuss one of these relations, namely the question of when orthogonal product states are distinguishable by separable superoperators.


Fig. 2. The network of positive implications of the results discussed in Sect. 2 and 3. (Nodes of the diagram: Distinguishable using von Neumann measurements plus classical communication; Distinguishable using POVMs plus classical communications (full LQ+CC); Completable; Completable in Hext; ρ separable; Completable when 1 state left out; Completable in Hext when 1 state left out; Distinguishable with separable measurements.)

Fig. 3. The network of negative implications of the results discussed in Sect. 2 and 3. (Nodes of the diagram: Unextendible; ρ bound entangled; Strongly uncompletable when 1 state left out; Uncompletable when 1 state left out; Strongly uncompletable; Uncompletable; Not distinguishable using separable measurements; Not distinguishable using POVMs plus classical communication; Not distinguishable using von Neumann measurements plus classical communication.)


3. The Use of Separable Superoperators

In this section we address the question of what kind of measurement distinguishes the members of a PB. We are interested in finding measurements that need the least amount of resources in terms of entanglement between the two or more parties. We introduce a class of quantum operations that are close relatives of operations that can be implemented by local quantum operations and classical communication, the separable superoperators and measurements:

Definition 3 ([Rai]). Let $H = \bigotimes_{i=1}^{n} H_i$. Let $H' = \bigotimes_{i=1}^{n} H'_i$. A TCP map $S: B(H) \to B(H')$ is separable if and only if one can write the action of S on any density matrix $\rho \in B(H)$ as

\[ S(\rho) = \sum_i A_{1,i} \otimes A_{2,i} \otimes \ldots \otimes A_{n,i}\; \rho\; A^{\dagger}_{1,i} \otimes A^{\dagger}_{2,i} \otimes \ldots \otimes A^{\dagger}_{n,i}, \tag{11} \]

where the "operation element" $A_{k,i}$ is a $\dim H'_k \times \dim H_k$ matrix and

\[ \sum_i A^{\dagger}_{1,i} A_{1,i} \otimes A^{\dagger}_{2,i} A_{2,i} \otimes \ldots \otimes A^{\dagger}_{n,i} A_{n,i} = 1. \tag{12} \]

Similarly, a quantum measurement on a multipartite Hilbert space is separable if and only if for each outcome m, the operation elements $A^{m}_{i}$ for all i are of a separable form:

\[ A^{m}_{i} = A^{m}_{1,i} \otimes A^{m}_{2,i} \otimes \ldots \otimes A^{m}_{n,i}. \tag{13} \]

Testing whether or not a superoperator is separable is not a simple problem since the operation elements $A_i$ of a superoperator S are not uniquely defined. The results of Ref. [BDF+ 99] show that separable superoperators are not equivalent to local quantum operations and classical communication. There is a separable measurement for the nine states presented in Ref. [BDF+ 99]; it is the measurement whose operation elements are the projectors onto the nine states. But the nine states are not locally distinguishable by LQ+CC. The following theorem gives a sufficient condition under which a set of bipartite orthogonal product states is distinguishable with the use of separable measurements. Unfortunately, it is not known what entanglement resources are needed to implement such separable measurements. They do however form a rather restricted class of operations. Since they map product states onto product states it is not possible to use them to create entanglement where none previously existed.

Theorem 2. Let S be a bipartite PB in $H = H_A \otimes H_B$ with k members. If S has the property that it is completable in H or local extensions of H ($H_{ext}$) when any single member is removed from S, then the members of S are distinguishable by means of a separable measurement.

Proof. Denote the orthogonal rank 1 product projectors onto the states in S as $\{\Pi_m\}_{m=1}^{k}$. Let $S_i$, i = 1, . . . , k, be the set S without a particular state i. Since each set $S_i$ is completable, the (unnormalized) states

\[ \Pi_{S_i^{\perp}} = 1 - \sum_{j \neq i} \Pi_j \tag{14} \]

for i = 1, . . . , k are separable. Note that $\Pi_{S_i^{\perp}} = \Pi^{\dagger}_{S_i^{\perp}}$. The projectors $\Pi_{S_i^{\perp}}$ and $\Pi_i$ for i = 1, . . . , k can be made to sum up to the identity by choosing the right coefficients:

\[ \frac{1}{k} \sum_{i=1}^{k} \Pi^{\dagger}_{S_i^{\perp}} \Pi_{S_i^{\perp}} + \frac{k-1}{k} \sum_{i=1}^{k} \Pi^{\dagger}_i \Pi_i = 1, \tag{15} \]

using $\Pi^2 = \Pi$ for projectors. Indeed, summing Eq. (14) over i gives $\sum_{i} \Pi_{S_i^{\perp}} = k\,1 - (k-1) \sum_{i} \Pi_i$. Since the projectors $\Pi_{S_i^{\perp}}$ are separable, one can decompose them into a set of $N_i$ rank 1 product projectors $\Pi_{(S_i^{\perp}, m_i)}$, labeled by an index $m_i = 1, \ldots, N_i$. Note that one can choose mutually orthogonal projectors $\Pi_{(S_i^{\perp}, m_i)}$ (for a given i) when $S_i$ is completable in the given Hilbert space H. When $S_i$ is completable only in a local extension of H, these projectors will be non-orthogonal. In both cases the set of product projectors

\[ \left\{ \frac{1}{\sqrt{k}}\, \Pi_{(S_i^{\perp}, m_i)},\ \sqrt{\frac{k-1}{k}}\, \Pi_i \right\}_{i=1,\, m_i=1}^{k,\, N_i} \tag{16} \]

are the operation elements of a separable measurement. This measurement projects onto states in S or onto product states that are orthogonal to all but one state in S. With a slight modification of this measurement one can construct a measurement which distinguishes the states in S locally. Formally one replaces the projectors of Eq. (16) by

\[
\begin{aligned}
\Pi_i = |\alpha_i, \beta_i\rangle\langle\alpha_i, \beta_i| &\ \to\ |i_A, i_B\rangle\langle\alpha_i, \beta_i|, \\
\Pi_{(S_i^{\perp}, m_i)} = |\delta_{i,m_i}, \gamma_{i,m_i}\rangle\langle\delta_{i,m_i}, \gamma_{i,m_i}| &\ \to\ |(i, m_i)_A, (i, m_i)_B\rangle\langle\delta_{i,m_i}, \gamma_{i,m_i}|,
\end{aligned}
\tag{17}
\]

such that the set of states $|i_A\rangle$, $|(i, m_i)_A\rangle$ is an orthonormal set for A and the same for B. This modification leaves Eq. (15) unchanged, so that this new set of operation elements again corresponds to a (separable) measurement. Upon this measurement, however, Alice and Bob both get a classical record of the outcome. If they perform this measurement on states in S, their outcomes will uniquely determine which state in S they were given.

We will show in Sect. 4.1, using the method of orthogonality graphs, that all UPBs in 3 ⊗ 3 have exactly five members. Theorem 3 (see also Sect. 4) tells us that any set of four orthogonal bipartite product states is completable. Therefore Theorem 2 implies that all UPBs in 3 ⊗ 3 are distinguishable by a separable measurement.

4. The Orthogonality Graph of a Product Basis

It is convenient to describe the orthogonality structure of a set of orthogonal product states on a multipartite Hilbert space by a graph, which we will call the orthogonality graph of the PB. Essentially the same graph has appeared in connection with a problem in classical information theory [Lov79].

Definition 4. Let $H = \bigotimes_{i=1}^{m} H_i$ be an m-partite Hilbert space with $\dim H_i = d_i$. Let $S = \{\psi_j \equiv \bigotimes_{i=1}^{m} \varphi_{i,j} : j = 1, \ldots, n\}$ be an orthogonal product basis (PB) in H. We represent S as a graph $G = (V, E_1 \cup E_2 \cup \ldots \cup E_m)$, where the edges in the set $E_i$ have color i. The states $\psi_j \in S$ are represented as the vertices V. There exists an edge e of color i between the vertices $v_k$ and $v_l$, i.e. $e \in E_i$, when the states $\psi_k$ and $\psi_l$ are orthogonal on $H_i$. Since all the states in the PB are mutually orthogonal, every vertex is connected to all the other vertices by at least one edge of some color.
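Definition 4 translates directly into a small routine. The sketch below is an illustration only, with hypothetical names (`orthogonality_graph`, `is_product_basis`) and a toy three-state, two-qubit example that does not appear in the paper; a product state is represented as a list of local coordinate vectors, one per party.

```python
def inner(a, b):
    """Hermitian inner product <a|b> of two coordinate sequences."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def orthogonality_graph(states, tol=1e-12):
    """Map each vertex pair (k, l), k < l, to the set of colors (party indices)
    on which the local states of psi_k and psi_l are orthogonal."""
    edges = {}
    for k in range(len(states)):
        for l in range(k + 1, len(states)):
            edges[(k, l)] = {i for i, (a, b) in enumerate(zip(states[k], states[l]))
                             if abs(inner(a, b)) < tol}
    return edges

def is_product_basis(states, tol=1e-12):
    """True if every pair of states is joined by at least one colored edge."""
    return all(len(colors) > 0
               for colors in orthogonality_graph(states, tol).values())

# Toy example (hypothetical): |0,0>, |0,1>, |1,+> on two qubits, unnormalized.
toy = [
    [(1, 0), (1, 0)],
    [(1, 0), (0, 1)],
    [(0, 1), (1, 1)],
]
toy_graph = orthogonality_graph(toy)
```

In the toy example the pair (0, 1) carries only Bob's color (index 1), while both pairs involving state 2 carry only Alice's color (index 0); a pair with two colors would correspond to a multiply colored edge.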

Fig. 4. The orthogonality graph of any UPB on 3 ⊗ 3 (vertices 0 to 4; edge colors distinguish Alice and Bob)

Fig. 5. The possible orthogonality graphs of a multipartite PB with three members (panels (1a), (1b), (1c); edge colors distinguish Alice, Bob and Charlie)

An example of an orthogonality graph is given in Fig. 4; it is the graph for the bipartite Pyramid UPB. Note that it is also possible to have several edges of different colors between two vertices when states are orthogonal for more than one party. The representation of a PB in terms of a graph can be useful when we want to determine whether the members of the PB are distinguishable by means of local operations and classical communication. By enumerating the possible orthogonality graphs, it is not hard to prove the following Proposition 2. The members of any multipartite PB S with three or fewer members are distinguishable by local incomplete von Neumann measurements and classical communication, and the PB is completable to a full product basis. Proof. We first note that we need only show that the members of S are distinguishable by local von Neumann measurements to also show that S is completable because of Lemma 2. Now, when S has one member, there is nothing to distinguish and the statement is trivial. With two members, the states must be orthogonal for some party and that party can distinguish them. With three members the possible orthogonality graphs are depicted in Fig. 5. We have omitted graphs with multiply colored edges. A multiply colored edge can only make it easier to distinguish the members of the corresponding PB. Thus when a graph corresponds to a distinguishable set after we leave out any multiple coloring, it also corresponds to a distinguishable set with the multiple coloring. We have similarly omitted graphs which are the same as the graphs shown under interchange of parties as clearly those cases will follow the same line of reasoning.

Fig. 6. The possible orthogonality graphs of a bipartite PB with four members (panels (2a) to (2f); edge colors distinguish Alice and Bob)

In case (1a), as all the states are mutually orthogonal on Alice’s side, Alice can do a measurement that uniquely distinguishes them. In case (1b) the third state is orthogonal to both state 1 and state 2 on Alice’s side. Therefore Alice can distinguish between (1, 2) and 3. Then Bob can finish the measurement by telling apart 1 and 2 locally. In case (1c) Alice distinguishes state 2 from state 3, Bob distinguishes 1 from 2, and Charlie distinguishes 1 from 3, together determining the state. This proposition cannot be strengthened any further: a three party UPB exists with only four members, it is the set Shifts (Eq. (22) and Fig. 8a–c(a)). However, a stronger result may be obtained in the case of a bipartite PB: Theorem 3. Let S be a bipartite PB with four or fewer members, i.e. |S| ≤ 4 in any dimension (that allows for this PB). The set S is distinguishable by local incomplete von Neumann measurements and classical communication. The set S is completable to a full product basis for H. Proof. We will expand on the proof of Proposition 2. When the set S has one, two, or three members, Proposition 2 applies directly. When S has four members the six possible orthogonality graphs are as given in Fig. 6. Again we omit graphs that are identical to these six under interchanging of parties, and graphs with doubly colored edges. Case (2a) is trivial. In cases (2b), (2d), and (2e) there is always a state that is orthogonal to all the other states on one side. The measurer associated with that side can then distinguish this state from all the others. The result is that three states are left to be distinguished, which is covered by Proposition 2. In case (2c) (1, 3) can be distinguished from (2, 4) on Alice’s side after which we are left with two orthogonal states that can be locally distinguished by Bob. In Case (2f) a different type of measurement must be carried out. In the previous cases the measurements were such that none of the states were changed by the measurement. 
The set of states S was simply dissected in subsets. However, Alice and Bob can carry out a more general type of measurement, namely one that can change the states. Such a measurement must be orthogonality-preserving; by this we mean that the changed states that are left over to be distinguished in a succeeding round must remain orthogonal under the measurement. In case (2f) state 2 is orthogonal to both states 3 and 4 on Alice’s


side. Let Alice project with $\Pi_{34}$, the projector on the subspace spanned by her side of states 3 and 4, and with $\Pi_2$, the projector on her side of state 2, and possibly $\Pi_{else}$, where $\Pi_{else}\Pi_{34} = 0$, $\Pi_{else}\Pi_2 = 0$ and $\Pi_{else} + \Pi_{34} + \Pi_2 = 1$. The projector $\Pi_{else}$ is only used when the states 2, 3 and 4 do not yet span the full Hilbert space of Alice; if this outcome is obtained, state 1 has been conclusively identified. Otherwise, state 1 is mapped onto $\Pi_{34}|1\rangle$ or $\Pi_2|1\rangle$. If the outcome is 2, Bob can finish the protocol by locally distinguishing 1 and 2. If the outcome is 34 we notice that we are in a three state case again and all states are still mutually orthogonal; $\Pi_{34}|1\rangle$ and state 3 are still orthogonal on Alice's side as state 3 is invariant under this projection, $\Pi_{34}|3\rangle = |3\rangle$.

These preliminary results will now be used to give a complete characterization of UPBs in 3 ⊗ 3.

4.1. A six-parameter family of UPBs in 3 ⊗ 3. In Ref. [BDM+ 99] we presented two examples of UPBs in 3 ⊗ 3. One is the Pyramid set which was discussed in Sect. 2.1 and the second was the set Tiles. The following five states on 3 ⊗ 3 form a UPB denoted as Tiles:

\[
\begin{aligned}
|\psi_0\rangle &= \tfrac{1}{\sqrt{2}}\, |0\rangle (|0\rangle - |1\rangle), & |\psi_2\rangle &= \tfrac{1}{\sqrt{2}}\, |2\rangle (|1\rangle - |2\rangle), \\
|\psi_1\rangle &= \tfrac{1}{\sqrt{2}}\, (|0\rangle - |1\rangle) |2\rangle, & |\psi_3\rangle &= \tfrac{1}{\sqrt{2}}\, (|1\rangle - |2\rangle) |0\rangle, \\
|\psi_4\rangle &= (1/3)\, (|0\rangle + |1\rangle + |2\rangle)(|0\rangle + |1\rangle + |2\rangle).
\end{aligned}
\tag{18}
\]
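The defining properties of Tiles can be checked with exact integer arithmetic. The sketch below is illustrative rather than the authors' procedure: it uses unnormalized local vectors read off from Eq. (18), and it tests the criterion used later in this section for five states on 3 ⊗ 3, namely that any three local vectors on either side span the full three-dimensional space. The names `TILES`, `mutually_orthogonal` and `any_three_span` are hypothetical.

```python
from itertools import combinations

# (Alice, Bob) local vectors of Eq. (18), unnormalized, in the basis |0>, |1>, |2>.
TILES = [
    ((1, 0, 0), (1, -1, 0)),   # psi_0 ~ |0>(|0> - |1>)
    ((1, -1, 0), (0, 0, 1)),   # psi_1 ~ (|0> - |1>)|2>
    ((0, 0, 1), (0, 1, -1)),   # psi_2 ~ |2>(|1> - |2>)
    ((0, 1, -1), (1, 0, 0)),   # psi_3 ~ (|1> - |2>)|0>
    ((1, 1, 1), (1, 1, 1)),    # psi_4, the "stopper"
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def mutually_orthogonal(states):
    """Product states are orthogonal iff some local inner product vanishes."""
    return all(dot(s[0], t[0]) * dot(s[1], t[1]) == 0
               for s, t in combinations(states, 2))

def any_three_span(states, party):
    """True if every triple of local vectors of the given party is linearly independent."""
    return all(det3(*(s[party] for s in triple)) != 0
               for triple in combinations(states, 3))
```

Because every entry is an integer, the comparisons with zero are exact and no tolerance is needed.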

Note that the first four states are the interlocking tiles of Ref. [BDF+ 99], and the fifth state works as a "stopper" to force the unextendibility. The set can be represented with the use of tiles as in Fig. 7. A tile can represent one or more states. For example, the tile in the upper left corner of Fig. 7 represents a state which is of the form

\[ |0\rangle \otimes (\alpha_0 |0\rangle + \alpha_1 |1\rangle). \tag{19} \]

The “stopper” state is not included in the figure; as a tile it would cover the full square. These two sets, Pyramid and Tiles, are examples of a larger six-parameter family of unextendible product bases in 3 ⊗ 3. We will prove that this six-parameter family gives an exhaustive characterization of UPBs in 3 ⊗ 3. First we note that five is the smallest number of states in a UPB in 3 ⊗ 3, due to Eq. (4). We will now prove that any UPB with five members on 3 ⊗ 3 must have an orthogonality graph as in Fig. 4. We will do so by arguing that any vertex must be connected to exactly two other vertices by an edge of the same color. The argument goes as follows. If there exists a vertex that is connected to four other vertices with edges of a single color, then we can locally distinguish this state from the other four states. Theorem 3 implies that we can also distinguish the remaining four states. Now, assume that there exists a vertex, say vertex 1, that is connected to three other vertices, corresponding, say, to the states 2, 3 and 4. Then we can distinguish between 1 and (2, 3, 4) by a local projection that splits state 5 in two projected states. However this projected state 5 is still orthogonal to 1 and (2, 3, 4). Thus we are left with distinguishing a set of two or four orthogonal product states which can be done locally by Theorem 3. Finally, it is not hard to see that if all vertices have to be connected to exactly two other vertices, the orthogonality graph in Fig. 4 is the only possible graph.

Fig. 7. Tile structure of the bipartite 3 ⊗ 3 Tiles UPB (columns: Alice's basis states 0, 1, 2; rows: Bob's basis states 0, 1, 2)

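Now that we have established a unique orthogonality graph, it remains to characterize the solution set. Let $|\psi_i\rangle = |\alpha_i\rangle \otimes |\beta_i\rangle$, i = 0, . . . , 4. Let $(\gamma_A, \theta_A, \phi_A, \gamma_B, \theta_B, \phi_B)$ be a set of six angles. We set

\[
\begin{aligned}
|\alpha_0\rangle &= |0\rangle, \\
|\alpha_1\rangle &= |1\rangle, \\
|\alpha_2\rangle &= \cos\theta_A |0\rangle + \sin\theta_A |2\rangle, \\
|\alpha_3\rangle &= \sin\gamma_A \sin\theta_A |0\rangle - \sin\gamma_A \cos\theta_A |2\rangle + \cos\gamma_A e^{i\phi_A} |1\rangle, \\
|\alpha_4\rangle &= \frac{1}{N_A} \left( \sin\gamma_A \cos\theta_A e^{i\phi_A} |1\rangle + \cos\gamma_A |2\rangle \right), \\
|\beta_0\rangle &= |1\rangle, \\
|\beta_1\rangle &= \sin\gamma_B \sin\theta_B |0\rangle - \sin\gamma_B \cos\theta_B |2\rangle + \cos\gamma_B e^{i\phi_B} |1\rangle, \\
|\beta_2\rangle &= |0\rangle, \\
|\beta_3\rangle &= \cos\theta_B |0\rangle + \sin\theta_B |2\rangle, \\
|\beta_4\rangle &= \frac{1}{N_B} \left( \sin\gamma_B \cos\theta_B e^{i\phi_B} |1\rangle + \cos\gamma_B |2\rangle \right),
\end{aligned}
\tag{20}
\]

with normalizations

\[ N_{A,B} = \sqrt{\cos^2\gamma_{A,B} + \sin^2\gamma_{A,B} \cos^2\theta_{A,B}}. \tag{21} \]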
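The parametrization of Eqs. (20) and (21) can be explored numerically. The following is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the six angles passed in the usage line are arbitrary sample values, and the spanning test (any three local vectors of one party are linearly independent) is the criterion this section uses for five states on 3 ⊗ 3.

```python
import cmath
import math
from itertools import combinations

def local_vectors(gamma, theta, phi):
    """alpha_0..alpha_4 of Eq. (20) for one party, as components along |0>, |1>, |2>."""
    e = cmath.exp(1j * phi)
    sg, cg = math.sin(gamma), math.cos(gamma)
    st, ct = math.sin(theta), math.cos(theta)
    norm = math.sqrt(cg ** 2 + sg ** 2 * ct ** 2)          # Eq. (21)
    return [
        (1, 0, 0),
        (0, 1, 0),
        (ct, 0, st),
        (sg * st, cg * e, -sg * ct),
        (0, sg * ct * e / norm, cg / norm),
    ]

def upb_family(g_a, t_a, p_a, g_b, t_b, p_b):
    """The five product states of Eq. (20) as (Alice, Bob) pairs."""
    a = local_vectors(g_a, t_a, p_a)
    b = local_vectors(g_b, t_b, p_b)
    # Bob's states use the same five vector shapes in the order given in Eq. (20).
    bob = [b[1], b[3], b[0], b[2], b[4]]
    return list(zip(a, bob))

def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def mutually_orthogonal(states, tol=1e-12):
    return all(abs(inner(s[0], t[0]) * inner(s[1], t[1])) < tol
               for s, t in combinations(states, 2))

def any_three_span(states, party, tol=1e-12):
    return all(abs(det3(*(s[party] for s in triple))) > tol
               for triple in combinations(states, 3))

sample = upb_family(0.7, 1.1, 0.3, 0.9, 0.5, 1.2)   # arbitrary sample angles
```

Setting one of the sines or cosines to zero, for example `theta_A = 0`, keeps the states mutually orthogonal but makes the spanning test fail on Alice's side.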

We have taken $|\alpha_{0,1}\rangle$ to define the first two vectors $|0\rangle$, $|1\rangle$ of the Alice Hilbert space; the overall phase of $|\alpha_2\rangle$ and $|\alpha_3\rangle$ and the phase of the $|2\rangle$ vector are chosen so that $|\alpha_2\rangle$, and the first two terms of the above expression for $|\alpha_3\rangle$, are real. Also, the overall phase of $|\alpha_4\rangle$ is fixed so that the coefficient of $|2\rangle$ is real. All the same remarks apply correspondingly to the Bob states. In order for this set of states to be a UPB we require that $\cos\theta_{A,B} \neq 0$, $\cos\gamma_{A,B} \neq 0$, $\sin\theta_{A,B} \neq 0$ and $\sin\gamma_{A,B} \neq 0$. If this restriction is made, we see that any set of three different vectors for Alice or for Bob spans a three dimensional space. The Pyramid UPB is obtained from Eq. (20) with the parameter choices $\phi_{A,B} = 0$, $\theta_{A,B} = \gamma_{A,B} = \cos^{-1}((\sqrt{5} - 1)/2)$. The parameters for the Tiles UPB are $\phi_{A,B} = 0$, $\theta_{A,B} = \gamma_{A,B} = 3\pi/4$. We find that all the solutions having the orthogonality graph of Fig. 4 correspond to UPBs. If we had lifted the restriction on the angles, say, setting $\sin\theta_A = 0$, then the set would no longer be a UPB as $|\alpha_2\rangle \in \mathrm{span}(|\alpha_0\rangle, |\alpha_1\rangle)$. At the same time the set would no longer correspond to the graph of Fig. 4, as now state $|\alpha_2\rangle$ is orthogonal to $|\alpha_4\rangle$. This suggests that UPBs can be characterized by their orthogonality graphs; when a set of states S has an orthogonality graph G and S is a UPB then all the sets with graph G are UPBs. If this conjecture were true, it would imply that we can classify UPBs by their orthogonality graphs, leading to an important simplification. But in Sect. 5.3 we present a counterexample to this conjecture for three parties and seven states.

Finally, to finish the characterization, we prove that any PB with six or more members in 3 ⊗ 3 is completable. We give the proof excluding a six member UPB in Appendix A. The density matrix $\bar\rho_{PB}$ of a PB with seven or eight states has rank two and rank one respectively. By construction this density matrix is either a bound entangled state or a separable state, as follows from Theorem 1. It can be shown by different arguments that there exists no bound entangled state with rank less than or equal to two [HSTT]. The state must therefore be separable. To the seven state PB we therefore can add a product vector to make it an eight state PB which is again extendible.

5. Multipartite and High Dimensional Bipartite UPBs

In this section we introduce several examples and families of UPBs. In Sect. 5.1 we present UPBs on multi-qubit Hilbert spaces. 
In Sect. 5.2 we give two constructions of high dimensional bipartite UPBs based on tiling patterns such as in Fig. 7. In Sect. 5.3 we give multipartite UPBs based on a generalization of the orthogonality graph of the Pyramid UPB (Fig. 4). In Sect. 5.4 we present a bipartite high dimensional UPB which is based on quadratic residues. Finally, in Sect. 5.5 we prove that tensor products of UPBs are again UPBs.

5.1. GenShifts and other UPBs in qubit Hilbert spaces. We first give a theorem that was proved in Ref. [BDM+ 99]:

Theorem 4 ([BDM+ 99]). Any set of orthogonal product states $\{|\alpha_i\rangle \otimes |\beta_i\rangle\}_{i=1}^{k}$ in 2 ⊗ n for any n ≥ 2 is distinguishable by local measurements and classical communication and therefore completable to a full product basis for 2 ⊗ n.

Even though any bipartite product basis involving a qubit Hilbert space is completable, we found in Ref. [BDM+ 99] that a tripartite UPB involving three qubits is possible. This was the set Shifts given by the states

\[ \{ |0, 0, 0\rangle,\ |+, 1, -\rangle,\ |1, -, +\rangle,\ |-, +, 1\rangle \}. \tag{22} \]

It follows from Theorem 4 that when we make any bipartite split of the three parties, say we join parties BC, the set Shifts is completable to a full product basis for $H_A \otimes H_{BC}$. Thus the bound entangled state that we construct from Shifts as in Eq. (5) must be separable over any bipartite split. Therefore this bound entangled state could have been made without any entanglement between A and BC, or B and AC, or C and AB. However, the state is entangled. One may say that the entanglement is delocalized over the three parties.

Our construction of Shifts can be generalized to multipartite UPBs, which we will call GenShifts. Again, because of Theorem 4, the bound entangled states based on GenShifts have a form of delocalized entanglement. The bound entangled states could have been made without entanglement between any single party and all the other remaining parties. We do not know whether the bound entangled states are separable over a split in two or more parties and all the other parties. GenShifts is a UPB on $\bigotimes_{i=1}^{2k-1} H_2$ with 2k members, the minimal number for a UPB, Eq. (4). The first state is $|0, \ldots, 0, 0\rangle$. The second is

\[ |1, \psi_1, \psi_2, \ldots, \psi_{k-1}, \psi_{k-1}^{\perp}, \ldots, \psi_2^{\perp}, \psi_1^{\perp}\rangle. \tag{23} \]

The states $|\psi_i\rangle$ and $|\psi_j\rangle$ for all $i \neq j$ are chosen to be neither orthogonal nor identical. Also, $|\psi_i\rangle$ is neither orthogonal nor identical to the state $|0\rangle$ for all i. The other states in the UPB are obtained by (cyclic) right shifting the second state, i.e. the third state is

\[ |\psi_1^{\perp}, 1, \psi_1, \psi_2, \ldots, \psi_{k-1}, \psi_{k-1}^{\perp}, \ldots, \psi_2^{\perp}\rangle. \tag{24} \]

These states are all orthogonal in the following way: The state $|0, \ldots, 0, 0\rangle$ is special and it is orthogonal to all the other states as they all have a $|1\rangle$ for some party. Leaving this special state aside, all states are orthogonal to the next state, their first right-shifted state, by the orthogonality of $|\psi_{k-1}\rangle$ and $|\psi_{k-1}^{\perp}\rangle$. All states are orthogonal to the 2nd right-shifted state by the orthogonality of $|\psi_1\rangle$ and $|\psi_1^{\perp}\rangle$. The 3rd right-shifted state is made orthogonal with $|\psi_{k-2}\rangle$ and $|\psi_{k-2}^{\perp}\rangle$. We can continue this until the last, (2k − 2)th, right-shifted state and we are done. As there are no states repeated on one side of the UPB, all sets of two states span a two dimensional space; Lemma 1 implies that the set is a UPB.

The orthogonality graph for the first example of GenShifts, which is just the set Shifts, is shown in Fig. 8a–c(a). For k = 2 and k = 3, the graph for GenShifts is the only orthogonality graph possible for a UPB in this Hilbert space. For k > 3 graphs other than the one corresponding to GenShifts are possible. It is simple to argue that, as in the 3 ⊗ 3 UPB (Sect. 4.1), all PBs having the GenShifts orthogonality graph are UPBs: The orthogonality graph of GenShifts is partially characterized by the fact that there is only one edge emanating from every vertex of a particular color. This implies that no states in a set corresponding to this orthogonality graph are repeated, that is, the same state for a party i is not used more than once in the set. As the set is minimal, this implies that such an orthogonality graph directly fulfills the conditions of Lemma 1; every pair of two states spans a two dimensional space. Also, when we consider a minimal PB (having n + 1 members for n parties) and its orthogonality graph has a doubly colored edge, then the PB cannot be a UPB. This is because the property of having a doubly colored edge directly translates into some pair of states not spanning a two dimensional Hilbert space. 
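The shift construction of Eqs. (23) and (24) is easy to generate and test. The sketch below is an illustration under assumptions: the qubit states $|\psi_i\rangle$ are taken to be real, with angles $\pi(i+1)/(2k)$ chosen only so that they are distinct and lie strictly between $|0\rangle$ and $|1\rangle$; the function names are hypothetical. It checks mutual orthogonality and the absence of repeated local states, the two properties the text combines with Lemma 1.

```python
import math

def qubit(angle):
    """Real qubit state cos(angle)|0> + sin(angle)|1>."""
    return (math.cos(angle), math.sin(angle))

def perp(state):
    """The state orthogonal to a real qubit state."""
    return (-state[1], state[0])

def gen_shifts(k):
    """The 2k states of a GenShifts-type set on 2k - 1 qubits, as lists of local states."""
    psis = [qubit(math.pi * (i + 1) / (2 * k)) for i in range(k - 1)]
    # Second state of the set, Eq. (23): |1, psi_1, ..., psi_{k-1}, psi_{k-1}^perp, ..., psi_1^perp>.
    base = [(0.0, 1.0)] + psis + [perp(p) for p in reversed(psis)]
    n = len(base)                                   # n = 2k - 1 parties
    states = [[(1.0, 0.0)] * n]                     # |0, ..., 0>
    for shift in range(n):                          # cyclic right shifts of the base state
        states.append([base[(pos - shift) % n] for pos in range(n)])
    return states

def is_product_basis(states, tol=1e-12):
    """Every pair of states must be orthogonal for at least one party."""
    def overlap(a, b):
        return a[0] * b[0] + a[1] * b[1]
    return all(any(abs(overlap(a, b)) < tol for a, b in zip(s, t))
               for i, s in enumerate(states) for t in states[:i])

def no_local_repeats(states, tol=1e-12):
    """For every party, any two local states must be linearly independent."""
    n = len(states[0])
    return all(abs(s[p][0] * t[p][1] - s[p][1] * t[p][0]) > tol
               for p in range(n)
               for i, s in enumerate(states) for t in states[:i])
```

For k = 2 the routine returns four three-qubit states with the same shift structure as the set Shifts of Eq. (22), up to the labeling of the states $|+\rangle$ and $|-\rangle$.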
Using the orthogonality-graph construction, we can prove that if the number of parties n is even, then qubit UPBs with n + 1 states, the minimal possible number by Lemma 1, do not exist. We show this by demonstrating that some states would have to be repeated; but repeated states permit a partitioning as in Lemma 1 which allows the introduction of another orthogonal product state. Figure 8a–c(b) illustrates the idea: Considering the lines of just one color, we note that two cannot emanate from the same node (otherwise there would be a repeat), but after joining them up pairwise there will be one left over, since the number of states is odd. But since this last node has no lines of the first color coming into it, it will have to have at least two of some other color emerging from it, which would again force a repeat. Therefore, the basis would be extendible.

On the other hand, non-minimal UPBs for even numbers of qubits do exist; Fig. 8a–c(c) shows the graph for one with six states in a space of four qubits. See Ref. [AL] for results on the existence of minimal UPBs in multipartite Hilbert spaces of arbitrary dimension.

Fig. 8a–c. Orthogonality graphs for qubit UPBs. (a) Shifts, i.e. GenShifts for k = 2. (b) Demonstration of the nonexistence of a minimal UPB with an even number of qubits. (c) A six-state, 4-partite UPB

5.2. GenTiles. We introduce a bipartite product basis GenTiles1 in n ⊗ n where n is even. These states have a tile structure which in the case of 6 ⊗ 6 is shown in Fig. 9. The general construction goes as follows: We label a set of n orthonormal states as $|0\rangle, \ldots, |n-1\rangle$. We define the set of "vertical tile" states

\[ |V_{mk}\rangle = |k\rangle \otimes |\omega_{m,k+1}\rangle = |k\rangle \otimes \sum_{j=0}^{n/2-1} \omega^{jm}\, |j + k + 1 \bmod n\rangle, \qquad m = 1, \ldots, n/2 - 1,\quad k = 0, \ldots, n - 1, \tag{25} \]

where $\omega = e^{i4\pi/n}$. Similarly, we define the set of "horizontal tile" states:

\[ |H_{mk}\rangle = |\omega_{m,k}\rangle \otimes |k\rangle, \qquad m = 1, \ldots, n/2 - 1,\quad k = 0, \ldots, n - 1. \tag{26} \]

Finally we add a "stopper" state

\[ |F\rangle = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} |i\rangle \otimes |j\rangle. \tag{27} \]
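The construction of Eqs. (25) to (27) can be generated directly. The following is a minimal sketch with hypothetical function names; states are kept unnormalized, as in the equations, and each product state is stored as an (Alice, Bob) pair of coordinate lists. The check covers only the count of states and their mutual orthogonality, not unextendibility.

```python
import cmath

def ket(n, k):
    """Basis vector |k mod n> in n dimensions."""
    return [1 if i == k % n else 0 for i in range(n)]

def omega_vec(n, m, k):
    """|omega_{m,k}> = sum_{j=0}^{n/2-1} omega^{jm} |j + k mod n>, omega = exp(4*pi*i/n)."""
    w = cmath.exp(4j * cmath.pi / n)
    vec = [0j] * n
    for j in range(n // 2):
        vec[(j + k) % n] = w ** (j * m)
    return vec

def gen_tiles1(n):
    """All states of GenTiles1 on n x n (n even), as (Alice, Bob) pairs."""
    states = []
    for m in range(1, n // 2):
        for k in range(n):
            states.append((ket(n, k), omega_vec(n, m, k + 1)))   # vertical tile V_{mk}
            states.append((omega_vec(n, m, k), ket(n, k)))       # horizontal tile H_{mk}
    states.append(([1] * n, [1] * n))                            # stopper F
    return states

def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def is_orthogonal_set(states, tol=1e-9):
    """Product states are orthogonal iff the product of local inner products vanishes."""
    return all(abs(inner(s[0], t[0]) * inner(s[1], t[1])) < tol
               for i, s in enumerate(states) for t in states[i + 1:])
```

For n = 4 and n = 6 the routine produces 9 and 25 states respectively, matching the count $n^2 - 2n + 1$ quoted in the text.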

The stopper state is not depicted in Fig. 9; as a tile it would cover the whole 6 by 6 square. The representation of the set as an arrangement of tiles informs us about the orthogonalities among some of its members. It is not hard to see that nonoverlapping tiles are orthogonal. The orthogonality of the states $|V_{mk}\rangle$ and $|V_{m'k}\rangle$ for $m \neq m'$ follows from the identity

\[ \langle \omega_{m,k} | \omega_{m',k} \rangle \propto \delta_{mm'}. \tag{28} \]

Fig. 9. Tile structure of the bipartite 6 ⊗ 6 UPB (columns: Alice's basis states 0 to 5; rows: Bob's basis states 0 to 5)

With the same identity we can prove that the states $|H_{mk}\rangle$ and $|H_{m'k}\rangle$ for $m \neq m'$ are mutually orthogonal. Finally, every state $|H_{mk}\rangle$ or $|V_{mk}\rangle$ is orthogonal to the "stopper" $|F\rangle$ as

\[ \sum_{j=0}^{n/2-1} \omega^{jm} \propto \delta_{m0}, \tag{29} \]

and $m \neq 0$. The set has $n^2 - 2n + 1$ states, much more than the minimum number in a UPB on n ⊗ n, which is 2n − 1. We can prove that this construction is a UPB in 4 ⊗ 4 and 6 ⊗ 6 by exhaustive checking of all partitions (see Lemma 1). This procedure is hard to implement computationally for arbitrary high dimension, but one may conjecture (and prove, see Ref. [DT]) that

Theorem 5. The set of states GenTiles1 forms a UPB on n ⊗ n for all even n ≥ 4.

Another tile construction which we call GenTiles2 can be made in dimensions m ⊗ n for n > 3, m ≥ 3 and n ≥ m. The construction is illustrated in Fig. 10. The small tiles which cover two squares are given by

\[ |S_j\rangle = \frac{1}{\sqrt{2}} \left( |j\rangle - |j + 1 \bmod m\rangle \right) \otimes |j\rangle, \qquad 0 \leq j \leq m - 1. \tag{30} \]

Fig. 10. Tile structure of the m ⊗ n GenTiles2 PB (columns: Alice's basis states 0 to m − 1; rows: Bob's basis states 0 to n − 1)

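These short tiles are mutually orthogonal on Bob's side. The long tiles (in general not contiguous) that stretch out in the vertical direction in Fig. 10 are given by

\[ |L_{jk}\rangle = |j\rangle \otimes \frac{1}{\sqrt{n-2}} \left( \sum_{i=0}^{m-3} \omega^{ik}\, |i + j + 1 \bmod m\rangle + \sum_{i=m-2}^{n-3} \omega^{ik}\, |i + 2\rangle \right), \qquad 0 \leq j \leq m - 1,\quad 1 \leq k \leq n - 3, \tag{31} \]

with $\omega = e^{i 2\pi/(n-2)}$. Lastly we add a "stopper" state

\[ |F\rangle = \frac{1}{\sqrt{nm}} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} |i\rangle \otimes |j\rangle. \tag{32} \]

The total number of states is mn − 2m + 1.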
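Equations (30) to (32) can likewise be turned into a short generator. The sketch below is illustrative, with hypothetical names, and leaves all states unnormalized; it checks the number of states and their mutual orthogonality only, and makes no claim about unextendibility.

```python
import cmath

def gen_tiles2(m, n):
    """All states of GenTiles2 on m x n, as (Alice, Bob) pairs of coordinate lists."""
    w = cmath.exp(2j * cmath.pi / (n - 2))

    def ket(dim, idx):
        return [1 if i == idx else 0 for i in range(dim)]

    states = []
    for j in range(m):                              # short tiles S_j, Eq. (30)
        alice = [0] * m
        alice[j] += 1
        alice[(j + 1) % m] -= 1
        states.append((alice, ket(n, j)))
    for j in range(m):                              # long tiles L_{jk}, Eq. (31)
        for k in range(1, n - 2):
            bob = [0j] * n
            for i in range(m - 2):                  # i = 0, ..., m - 3
                bob[(i + j + 1) % m] = w ** (i * k)
            for i in range(m - 2, n - 2):           # i = m - 2, ..., n - 3
                bob[i + 2] = w ** (i * k)
            states.append((ket(m, j), bob))
    states.append(([1] * m, [1] * n))               # stopper F, Eq. (32)
    return states

def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def is_orthogonal_set(states, tol=1e-9):
    return all(abs(inner(s[0], t[0]) * inner(s[1], t[1])) < tol
               for i, s in enumerate(states) for t in states[i + 1:])
```

For example, `gen_tiles2(3, 4)` returns seven states, in agreement with the count mn − 2m + 1.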

We can show that these states form a PB. For $j \neq j'$ the states $|L_{jk}\rangle$ and $|L_{j'k'}\rangle$ are orthogonal on Alice's side. We also have

\[ \langle L_{jk} | L_{jk'} \rangle = \frac{1}{n-2} \sum_{p=0}^{n-3} e^{i 2\pi p (k - k')/(n-2)} = \delta_{kk'}. \tag{33} \]

The states $|L_{jk}\rangle$ and $|S_p\rangle$ with $p \neq j$ and $p \neq j - 1 \bmod m$ are orthogonal on Alice's side. The long tiles $|L_{jk}\rangle$ are constructed such that they are orthogonal to the states $|j\rangle$ and $|j - 1 \bmod m\rangle$ on Bob's side, see Fig. 10. Therefore $|L_{jk}\rangle$ is orthogonal to $|S_j\rangle$ and $|S_{j-1 \bmod m}\rangle$. The states $|S_l\rangle$ are orthogonal to the stopper $|F\rangle$ on Alice's side. Finally, the states $|L_{jk}\rangle$ are orthogonal to $|F\rangle$ as

\[ \langle F | L_{jk} \rangle = \frac{1}{\sqrt{nm(n-2)}} \sum_{p=0}^{n-3} e^{i 2\pi p k/(n-2)} \propto \delta_{k0}, \tag{34} \]

and $k \neq 0$. We conjecture that this PB GenTiles2 is a UPB (the proof of the conjecture has been given in Ref. [DT]):

Theorem 6. The set of states GenTiles2 forms a UPB on m ⊗ n for n > 3, m ≥ 3 and n ≥ m.

Note that GenTiles2 with m = n = 3 does not form a UPB. We will now give some UPBs corresponding to generalizations of the orthogonality graph in Fig. 4. The first generalization is a UPB on 3 ⊗ 3 ⊗ . . . ⊗ 3 (Sect. 5.3). The second generalization (Sect. 5.4) is another bipartite UPB in arbitrary high dimension.

5.3. Sept and GenPyramid. Let us first consider a generalization to 3 ⊗ 3 ⊗ 3. Define the following states:

\[ \vec v_i = N \left( \cos\frac{2\pi i}{7},\ \sin\frac{2\pi i}{7},\ h \right), \qquad i = 0, \ldots, 6, \tag{35} \]

with $h = \sqrt{-\cos\frac{4\pi}{7}}$ and $N = 1/\sqrt{1 + |\cos\frac{4\pi}{7}|}$. The following seven states in 3 ⊗ 3 ⊗ 3 form the UPB Sept:

\[ \vec p_i = \vec v_i \otimes \vec v_{2i \bmod 7} \otimes \vec v_{3i \bmod 7}, \qquad i = 0, \ldots, 6. \tag{36} \]

The orthogonality graph of these vectors $\vec p_i$ is shown in Fig. 11. To prove that these states form a UPB, we must show that any subset of three of them on one of the three sides (Lemma 1) spans the full three dimensional space. As the vectors $\vec v_i$ form the apex of a regular septagonal pyramid, there is no subset of three of them that lies in a two dimensional plane. It is not known whether the complementary state $\bar\rho_{Sept}$ is separable over bipartite cuts, as with $\bar\rho_{Shifts}$ (see Eq. (22)), or whether it is bound entangled over the bipartite cuts.

This construction can be extended to $3^{\otimes n}$; the minimal UPB thus constructed we will call GenPyramid. We have n parties and p = 2n + 1 states where p is a prime number. Thus one can have (n, p) = (2, 5), (3, 7), (5, 11), etc. The states in the polygonal pyramid with p vertices are defined as

\[ \vec v_i = N_p \left( \cos\frac{2\pi i}{p},\ \sin\frac{2\pi i}{p},\ h_p \right), \qquad i = 0, \ldots, 2n. \tag{37} \]


D.P. DiVincenzo, T. Mor, P.W. Shor, J.A. Smolin, B.M. Terhal

[Figure: orthogonality graph on the vertices 0 to 6, with edges labelled Alice, Bob and Charlie.]

Fig. 11. The Sept UPB on 3 ⊗ 3 ⊗ 3

In Sept and Pyramid, $h_p$ was chosen such that nonadjacent vertices were orthogonal. For larger primes p one has to make a choice of which vectors to make orthogonal that depends on p: in order for the vectors $\vec{v}_i$ and $\vec{v}_{i+m}$ to be made orthogonal by lifting these vectors out of the plane of the polygon, we must have

$$\frac{\pi}{2} \le \frac{2\pi m}{p} \ (\le \pi), \tag{38}$$

i.e. the angle between the vectors in the plane must be larger than 90 degrees. One can always find such an m given a p, for example, for p = 7, m = 2 or 3. With the choice of m one fixes $h_p$ and $N_p$ as

$$h_p = \sqrt{-\cos\frac{2\pi m}{p}}, \qquad N_p = 1/\sqrt{1 + \left|\cos\frac{2\pi m}{p}\right|}. \tag{39}$$

Finally, the UPB GenPyramid is

$$\vec{p}_i = \vec{v}_i \otimes \vec{v}_{2i \bmod p} \otimes \ldots \otimes \vec{v}_{ni \bmod p}, \quad i = 0, \ldots, 2n. \tag{40}$$
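The construction of Eqs. (37) to (40) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the vectors are stored as plain Python tuples, and the primality of p = 2n + 1 is assumed rather than tested. It builds the product states and verifies that every pair is orthogonal for at least one party and that any three local vectors are linearly independent.

```python
import math
from itertools import combinations


def pyramid_vectors(p, m):
    """Vectors v_0..v_{p-1} of Eq. (37), with h_p and N_p taken from Eq. (39)."""
    c = math.cos(2 * math.pi * m / p)
    if c >= 0:
        raise ValueError("need cos(2*pi*m/p) < 0, cf. Eq. (38)")
    h = math.sqrt(-c)
    norm = 1 / math.sqrt(1 + abs(c))
    return [
        (norm * math.cos(2 * math.pi * i / p),
         norm * math.sin(2 * math.pi * i / p),
         norm * h)
        for i in range(p)
    ]


def gen_pyramid(n, m):
    """Product states p_i of Eq. (40): a list of n local 3-vectors per state."""
    p = 2 * n + 1  # assumed to be prime
    v = pyramid_vectors(p, m)
    return [[v[(k * i) % p] for k in range(1, n + 1)] for i in range(p)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def mutually_orthogonal(states, tol=1e-9):
    """True if every pair of product states is orthogonal for at least one party."""
    return all(
        any(abs(dot(a, b)) < tol for a, b in zip(s, t))
        for s, t in combinations(states, 2)
    )


def triples_have_full_rank(states, tol=1e-9):
    """True if, for each party, any three local vectors are linearly independent."""
    parties = range(len(states[0]))
    return all(
        abs(det3(s[k], t[k], u[k])) > tol
        for k in parties
        for s, t, u in combinations(states, 3)
    )


sept = gen_pyramid(3, 2)  # Sept: n = 3 parties, p = 7, m = 2
print(mutually_orthogonal(sept), triples_have_full_rank(sept))
```

Calling `gen_pyramid(2, 2)` gives the five states of Pyramid in the same representation; the floating-point tolerance `tol` is an arbitrary choice for this sketch.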

The primality of p ensures that there are no states repeated on one side: there is no k in the range 1 ≤ k ≤ 2n such that $ki \bmod p = kj \bmod p$ for some integers $i \neq j$ if p is prime. Orthogonality is also ensured by primality. As in Fig. 11 there will be a party for whom next neighbor states are orthogonal, there will be a party for whom all second neighbor states are orthogonal, etc. up to the nth neighbor. This implies that all vertices in the orthogonality graph are mutually connected (orthogonal), so the orthogonality graph is complete. From basic three dimensional geometry it follows that any set of three vectors has full rank when $h_p \neq 0$ and thus these generalized sets form UPBs.

It was noted by A. Peres [Per] that this construction is quite general: instead of the vectors of Eq. (37), we take any set of vectors $|r_i\rangle$ such that $\langle r_i | r_{i+1 \bmod p} \rangle = 0$ and such that any triplet of vectors $(|r_i\rangle, |r_k\rangle, |r_l\rangle)$ with distinct i, k, l spans the full three dimensional space. We construct the vectors $\vec{p}_i$ as in Eq. (40) with $\vec{v}_{mi \bmod p} = \vec{r}_i$, with m given in Eq. (38). This set can form a UPB. But in this more general construction a more complete check is required to make sure that any three different vectors are linearly independent. If we restrict ourselves to just requiring that adjacent vectors be orthogonal, we find that

Product Bases and Bound Entanglement


there are sets with the same orthogonality graph as, for example, Sept, but which are not UPBs. An example of such a set is the following:

$$|r_1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\quad |r_2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\quad |r_3\rangle = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\quad |r_4\rangle = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix},$$
$$|r_5\rangle = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix},\quad |r_6\rangle = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\quad |r_7\rangle = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}. \tag{41}$$

For these states we have $\langle r_i | r_{i+1 \bmod 7} \rangle = 0$ and no other states are orthogonal. We can construct a PB by replacing the states in Eq. (40) by these vectors, $\vec{v}_{2i \bmod 7} = \vec{r}_i$. This PB has the same orthogonality graph as Sept. However, the vectors $|r_1\rangle$, $|r_2\rangle$ and $|r_5\rangle$ lie in a two dimensional plane. This implies that we can add a new product vector to the PB thus constructed. This provides the counterexample to the idea that there is a straightforward correspondence between orthogonality graphs and UPBs.
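The two properties claimed for the vectors of Eq. (41) are easy to confirm with exact integer arithmetic. The sketch below is illustrative only; the helper names are hypothetical and the vectors are stored with 0-based indices, so `R[0]` stands for $|r_1\rangle$.

```python
from itertools import combinations

# Vectors |r_1>, ..., |r_7> of Eq. (41), unnormalized, 0-based in this list.
R = [
    (1, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, -1),
    (1, -1, 0), (1, 1, 1), (0, 1, -1),
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def orthogonal_pairs(vectors):
    """Index pairs (i, j), i < j, whose vectors are orthogonal."""
    idx = range(len(vectors))
    return {(i, j) for i, j in combinations(idx, 2)
            if dot(vectors[i], vectors[j]) == 0}


def coplanar_triples(vectors):
    """Index triples whose vectors fail to span three dimensions."""
    idx = range(len(vectors))
    return [t for t in combinations(idx, 3)
            if det3(*(vectors[i] for i in t)) == 0]


# Only cyclically adjacent vectors should be orthogonal.
cycle = {(i, i + 1) for i in range(6)} | {(0, 6)}
print(orthogonal_pairs(R) == cycle)
print((0, 1, 4) in coplanar_triples(R))  # r_1, r_2, r_5 share the plane z = 0
```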

5.4. UPBs based on quadratic residues. QuadRes is a family of UPBs, which are based on quadratic residues [HW79]. The UPB is a set of orthogonal product states on n ⊗ n, where n is such that 2n − 1 is a prime p of the form 4m + 1. The set contains p = 2n − 1 members, which is the minimal number for a UPB. Thus we can have (m, p, n) = (1, 5, 3), (3, 13, 7), (4, 17, 9), etc. The first triple (1, 5, 3) is the Pyramid UPB.

Let $Z_p^*$ be $Z_p \setminus \{0\}$. Let $Q_p$ be a group of quadratic residues, that is, elements $q \in Z_p^*$ such that

$$q = x^2 \bmod p, \tag{42}$$

for some integer x. $Q_p$ is a group under multiplication. The order of the group is $\frac{p-1}{2}$. The following properties hold: when $q_1 \in Q_p$ and $q_2 \notin Q_p$, a quadratic nonresidue, then $q_1 q_2 \notin Q_p$. Also, if $q_1 \notin Q_p$ and $q_2 \notin Q_p$, then $q_1 q_2 \in Q_p$ [HW79]. The states of the UPB are

$$|Q(a)\rangle \otimes |Q(xa)\rangle \quad \text{for } a \in Z_p,\ x \in Z_p^*,\ x \notin Q_p, \tag{43}$$

where

$$|Q(a)\rangle = (N, 0, \ldots, 0) + \sum_{q \in Q_p} e^{2\pi i q a/p}\, \hat{e}_q, \tag{44}$$

where N is a real normalization constant to be fixed for orthogonality and $\hat{e}_q$ are unit vectors of the form (0, 1, 0, . . . , 0), (0, 0, 1, . . . , 0), etc. The dimension n of the Hilbert space is $\frac{p+1}{2}$, one more than the order of $Q_p$. One can prove that these vectors can be made orthogonal by an appropriate choice of N, for $a \neq b$:

$$\langle Q(a)|Q(b)\rangle \langle Q(xa)|Q(xb)\rangle = \Big( |N|^2 + \sum_{q \in Q_p} e^{2\pi i q(b-a)/p} \Big) \Big( |N|^2 + \sum_{q \in Q_p} e^{2\pi i q x(b-a)/p} \Big) = 0. \tag{45}$$


One uses the properties of $Q_p$ to find that for $b - a \neq 0$:

$$\sum_{q \in Q_p} e^{2\pi i q(b-a)/p} + \sum_{q \in Q_p} e^{2\pi i q x(b-a)/p} = \sum_{z \in Z_p^*} e^{2\pi i z(b-a)/p} = -1. \tag{46}$$

Thus the orthogonality relation of Eq. (45) for $b \neq a$ is of the form

$$(|N|^2 + s)\,(|N|^2 - 1 - s) = 0, \tag{47}$$

where

$$s = \sum_{q \in Q_p} e^{2\pi i q(b-a)/p}. \tag{48}$$

The value of s as a function of b − a ($b \neq a$) only depends on whether $b - a \in Q_p$, because of the group property of $Q_p$. Call this value $\bar{s}$ when $b - a \in Q_p$. Then when $b - a \notin Q_p$, $s = -1 - \bar{s}$ because of Eq. (46). Finally we also need to establish that s is real. One considers $s^*$ in which one sums over −q. As $q \in Q_p$ and $-1 \in Q_p$ when p is of the form 4m + 1 (see Theorem 82, [HW79]), $-q \in Q_p$. Thus $s = s^*$. Both for negative as well as positive s, Eq. (47) has a solution for N, and thus Eq. (45) is satisfied for all $a \neq b$.

Theorem 7. The states given in Eq. (43) and Eq. (44) on n ⊗ n with 2n − 1 a prime of the form 4m + 1 with the appropriate value of N determined by the solution of Eq. (47) form a UPB.

Proof. The proof requires the application of Lemma 1, that is, one must show that any set of n states on either side spans the full n-dimensional Hilbert space. To do this, we need to show that any subset of n = (p + 1)/2 of the p vectors defined in Eq. (44) has full rank. Checking whether a subset T of these has full rank is easily seen to be equivalent to checking whether the determinant of a matrix $M_{Q_p T}$ does not vanish, where $M_{Q_p T}$ is the (p + 1)/2 × (p + 1)/2 matrix whose j, k entry is $e^{2\pi i q_j t_k/p}$, $q_j$ being the jth element of the set $Q_p$ and $t_k$ the kth element of the set T. However, a theorem of Čebotarev [New76] shows that the matrix $M_{ST}$ is of full rank for any two arbitrary sets S, T, subsets of {0, 1, . . . , p − 1}, proving Theorem 7.

Drawn as orthogonality graphs as in Fig. 4, these UPBs are regular polygons, with a prime number p (of the form 4m + 1) of vertices. The elements of the quadratic residue group $Q_p$ correspond to the periodicity of the vectors that are orthogonal on one side. For example, when p = 13, one has quadratic residues 1, 3, 4, 9, 10 and 12. Thus on, say, Alice’s side, every vertex is connected to its first neighbor (1), every vertex is connected with the 3rd neighbor (3), etc.
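As a numerical illustration of Eqs. (43) to (48), the sketch below builds the QuadRes product states for a given prime p and checks their mutual orthogonality. It is a hedged example rather than the authors' procedure: the function names are hypothetical, the states are left unnormalized, p is assumed to be a prime of the form 4m + 1, and the choice $N = \sqrt{1 + \bar{s}}$ is simply one root of Eq. (47) that makes both factors work (the second factor vanishes when $b - a \in Q_p$, the first when it is not).

```python
import cmath
import math
from itertools import combinations


def quadratic_residues(p):
    """Nonzero quadratic residues modulo p, sorted."""
    return sorted({(x * x) % p for x in range(1, p)})


def quadres_states(p):
    """Unnormalized states |Q(a)> (x) |Q(xa)> of Eqs. (43)-(44), a = 0..p-1."""
    residues = quadratic_residues(p)
    # s_bar is the value of s in Eq. (48) when b - a is a quadratic residue.
    s_bar = sum(cmath.exp(2j * math.pi * q / p) for q in residues).real
    big_n = math.sqrt(1 + s_bar)  # assumed choice of N solving Eq. (47)
    x = next(z for z in range(1, p) if z not in residues)  # a nonresidue

    def q_vec(a):
        return [big_n] + [cmath.exp(2j * math.pi * q * a / p) for q in residues]

    return [(q_vec(a), q_vec((x * a) % p)) for a in range(p)]


def inner(u, v):
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def max_overlap(states):
    """Largest |<psi_a|psi_b>| over distinct product states."""
    return max(
        abs(inner(a1, a2) * inner(b1, b2))
        for (a1, b1), (a2, b2) in combinations(states, 2)
    )


print(quadratic_residues(13))
print(max_overlap(quadres_states(13)) < 1e-9)
```

The full-rank property required by Lemma 1 is not tested here; the proof above obtains it from Čebotarev's theorem.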
On Bob’s side the orthogonality pattern follows from the quadratic nonresidues.

5.5. Tensor powers of UPBs. When we have found two UPBs, we may ask whether the tensor product of them is again a UPB. The answer is yes, as indicated by the following theorem:

Theorem 8. Given two bipartite UPBs $S_1$ with members $|\psi_i^1\rangle$, $i = 1, \ldots, l_1$ on $n_1 \otimes m_1$ and $S_2$ with members $|\psi_i^2\rangle$, $i = 1, \ldots, l_2$ on $n_2 \otimes m_2$, the PB $\{|\psi_i^1\rangle \otimes |\psi_j^2\rangle\}_{i,j=1}^{l_1,l_2}$ is a bipartite UPB on $n_1 n_2 \otimes m_1 m_2$.

[Figure: four rectangles of size l1 by l2 filled with the letters A and B, labelled P1, P2, P3 and P4.]

Fig. 12. The succession of partitions of the tensor power of UPBs used to prove Theorem 8. The As and Bs denote on what side a hypothetical product state is orthogonal to the members of PB2

Proof. Assume the contrary, i.e. there is a product state that is orthogonal to this new ensemble which we call $\mathrm{PB}_2$. The idea is to show that this leads to a contradiction and thus $\mathrm{PB}_2$ is a UPB. Note first that for any UPB a partition P into a set with 0 states for Bob and all states for Alice gives rise to $r_A^P = \dim \mathcal{H}_A$ (see Lemma 1); the states on Alice’s side together must span the entire Hilbert space of Alice. Also note that if one takes a tensor product of two UPBs (defined on $\mathcal{H}_{A_1} \otimes \ldots$ and $\mathcal{H}_{A_2} \otimes \ldots$) this partition in which all states are assigned to Alice still leads to $r_A^P = \dim \mathcal{H}_{A_1} \dim \mathcal{H}_{A_2}$.

The set $\mathrm{PB}_2$ has $l_1 l_2$ members. The new hypothetical product state to be added to the set has to be orthogonal to each member either on Bob’s side or on Alice’s side, or on both sides. One can represent this hypothetical orthogonality pattern as a rectangle of size $l_1$ (number of columns) by $l_2$ (number of rows) filled with the letters A and B, depending on how the new state is orthogonal to a member of the $\mathrm{PB}_2$. This is illustrated as partition $P_1$ in Fig. 12. When the hypothetical state is orthogonal on both sides, we are free to choose an A or B in the corresponding square. Consider a row of this rectangle, for example the first one. The pattern of As and the Bs can be viewed as a partition of the $S_1$ UPB and therefore one knows that one of these sets, either the A or the B set, must have full local rank. But if the A set is the one with full local rank, then the state is also orthogonal to $\mathrm{PB}_2$ with respect to partition $P_2$ (Fig. 12) in which the whole row is filled with As. This is so since the local rank $r_A$ is not changed (because the A rank was already full for the row) and the rank $r_B$ cannot increase (since states are removed from the B set). This is true for every row in turn, which leads us to partition $P_3$. Then doing the same to columns, since all columns are identical they will all be full rank for either A or B, so we will obtain the unanimous partition $P_4$. This partition $P_4$ however contradicts the fact that both sets were UPBs.
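The object that Theorem 8 is about, the tensor product of two bipartite PBs, can be written down directly. The sketch below is an illustration under stated assumptions: the input set `TILES` is the five-state Tiles set on 3 ⊗ 3 written here from the standard example (it is not defined in this section), the vectors are left unnormalized so that integer arithmetic is exact, and the function names are hypothetical. The code only confirms that the $l_1 l_2$ product states are mutually orthogonal on $n_1 n_2 \otimes m_1 m_2$; unextendibility is what the proof above establishes.

```python
from itertools import combinations

# Assumed input: the Tiles set on 3 (x) 3 as (Alice vector, Bob vector) pairs.
TILES = [
    ((1, 0, 0), (1, -1, 0)),
    ((1, -1, 0), (0, 0, 1)),
    ((0, 0, 1), (0, 1, -1)),
    ((0, 1, -1), (1, 0, 0)),
    ((1, 1, 1), (1, 1, 1)),
]


def kron(a, b):
    """Kronecker product of two vectors given as tuples."""
    return tuple(x * y for x in a for y in b)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tensor_product_pb(pb1, pb2):
    """All states psi_i^1 (x) psi_j^2, regrouped as Alice part and Bob part."""
    return [
        (kron(a1, a2), kron(b1, b2))
        for a1, b1 in pb1
        for a2, b2 in pb2
    ]


def is_product_basis(pb):
    """True if every pair is orthogonal on Alice's side or on Bob's side."""
    return all(
        dot(a1, a2) == 0 or dot(b1, b2) == 0
        for (a1, b1), (a2, b2) in combinations(pb, 2)
    )


pb2 = tensor_product_pb(TILES, TILES)
print(len(pb2), len(pb2[0][0]), is_product_basis(pb2))
```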


The theorem has the consequence that arbitrary tensor powers of bipartite UPBs are again UPBs. A generalization of the theorem for multipartite states is true as well; in the proof we replace A/B partitions with partitions of As, Bs, Cs, etc.

6. The Use of Bound Entanglement

It has been shown [HHH99a] that bound entangled PPT states are not a useful resource in the teleportation of quantum states. On the other hand it has also been shown [HHH99b] that bound entangled PPT states can have a catalytic effect in the (quasi) distillation of a single entangled state. In the next two sections we discuss the use of bound entangled states: in a protocol of distillation of mixed entangled states and in defining a binding entanglement channel.

6.1. Distillation of mixed entangled states. We will prove that bound entangled states cannot be used to increase the distillable entanglement of a state ρ beyond its regularized entanglement of formation assisted by bound entanglement $E_b(\rho)$. By a bound entangled state $\rho_b$ we mean a state that cannot be distilled, i.e. if we are given many copies of this density matrix we cannot distill any pure entanglement out of this set. This set of states includes the bound entangled PPT states and also possibly some NPT entangled states [DSS+00, DCLB00]. We denote the density matrix of a set of n pure EPR pairs or other maximally entangled states as $\Pi_{\mathrm{EPR}}^{\otimes n}$. The precise definition of distillable entanglement uses a limit in which the number of copies n of the state to be distilled goes to infinity while at the same time the fidelity of the distilled states with respect to a maximally entangled state goes to 1. In the notation we use here we omit these limits for the sake of clarity. We refer the reader to Ref. [Rai99] for a treatment and discussion of various equivalent definitions of distillable entanglement. We start with the following lemma:

Lemma 4. For no integer k and bound entangled state $\rho_b$ does there exist an LQ+CC TCP map $S_1$ such that

$$S_1(\Pi_{\mathrm{EPR}}^{\otimes n} \otimes \rho_b^{\otimes k}) = \Pi_{\mathrm{EPR}}^{\otimes 3n}. \tag{49}$$

Proof. Suppose Eq. (49) were true. Expanding the density matrix proportional to the identity on the $2^n \otimes 2^n$ dimensional Hilbert space, we obtain

$$\frac{1}{4^n} I = \frac{1}{4^n} \Pi_{\mathrm{EPR}}^{\otimes n} + \frac{4^n - 1}{4^n}\, \delta\rho. \tag{50}$$

By the linearity of $S_1$ it follows that

$$S_1\Big(\frac{1}{4^n} I \otimes \rho_b^{\otimes k}\Big) = \frac{1}{4^n} \Pi_{\mathrm{EPR}}^{\otimes 3n} + \frac{4^n - 1}{4^n}\, S_1(\delta\rho \otimes \rho_b^{\otimes k}). \tag{51}$$

The fidelity of the output state in Eq. (51) with respect to $\Pi_{\mathrm{EPR}}^{\otimes 3n}$ is $F \ge 1/4^n$. If the output is projected into the Hilbert space of dimension $d \otimes d = 2^{3n} \otimes 2^{3n}$ inhabited by the $\Pi_{\mathrm{EPR}}^{\otimes 3n}$ term of Eq. (51) this fidelity can only increase or remain the same. It has been shown by Horodecki et al. [HH99] that a state for which F > 1/d is distillable, so the output state is distillable (as $1/4^n > 1/2^{3n}$). But this is a contradiction, since the


input state of Eq. (51) has only bound entanglement, and the TCP map is LQ+CC and therefore it cannot create any free entanglement. This proves that such an LQ+CC $S_1$ does not exist.

Lemma 5. For no integer k and bound entangled state $\rho_b$ and α > 1 does there exist an LQ+CC TCP map $S_2$ such that

$$S_2(\Pi_{\mathrm{EPR}}^{\otimes n} \otimes \rho_b^{\otimes k}) = \Pi_{\mathrm{EPR}}^{\otimes \alpha n}. \tag{52}$$

Proof. If $S_2$ existed, iterated application of it, $S_2(S_2(S_2(\ldots(\Pi_{\mathrm{EPR}}^{\otimes n} \otimes \rho_b^{\otimes k})\ldots)))$, log 3/ log α times would produce the map $S_1$ of Lemma 4. However this $S_1$ cannot exist and therefore $S_2$ does not exist.
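The iteration count in this proof can be made explicit. The following worked step is a sketch under two assumptions that the text leaves implicit: each round of $S_2$ is supplied with fresh copies of $\rho_b$, and surplus EPR pairs may be discarded locally at the end.

```latex
% t rounds of S_2 multiply the number of EPR pairs by \alpha each time:
\[
  n \;\longmapsto\; \alpha n \;\longmapsto\; \alpha^{2} n
    \;\longmapsto\; \cdots \;\longmapsto\; \alpha^{t} n ,
  \qquad
  \alpha^{t} \ge 3
  \iff
  t \ge \frac{\log 3}{\log \alpha}.
\]
% Example with \alpha = 3/2: \log 3 / \log(3/2) \approx 2.71, so t = 3 rounds
% suffice, since (3/2)^3 = 3.375 \ge 3.
```

Any α > 1 therefore reaches the factor 3 of Lemma 4 after finitely many rounds, which is what the proof needs.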

The distillable entanglement of a state ρ assisted by bound entanglement, $D_b(\rho)$, is defined by optimizing over all LQ+CC TCP maps and bound-entangled states $\rho_b$ and values for k such that

$$S_3(\rho^{\otimes n} \otimes \rho_b^{\otimes k}) = \Pi_{\mathrm{EPR}}^{\otimes D_b n}. \tag{53}$$

Proposition 3. $D(\rho) \le D_b(\rho) \le E_b(\rho) \le E(\rho)$, where ($E_b(\rho)$) $E(\rho)$ is (the BE-assisted) regularized entanglement of formation of ρ.

Proof. The BE-assisted regularized entanglement of formation $E_b(\rho)$ of a density matrix ρ is defined by the optimal LQ+CC TCP map $S_{E_b}$ and optimal choice for k and $\rho_b$ such that

$$S_{E_b}(\Pi_{\mathrm{EPR}}^{\otimes E_b n} \otimes \rho_b^{\otimes k}) = \rho^{\otimes n}. \tag{54}$$

Suppose $D_b(\rho) > E_b(\rho)$. This leads to a contradiction, because the composed map

$$S_3(S_{E_b}(\Pi_{\mathrm{EPR}}^{\otimes E_b n} \otimes \rho_b^{\otimes k}) \otimes \rho_b^{\otimes l}) = \Pi_{\mathrm{EPR}}^{\otimes D_b n}, \tag{55}$$

cannot exist by Lemma 5.

These results also provide some partial answers to the questions raised in the discussion of Ref. [HHH00]; they bound the use that bound entanglement can have in the distillation of mixed states. The result does leave room for nonadditivity though; for states which have $D(\rho) < E(\rho)$ it could still be that $D(\rho) < D_b(\rho)$.

6.2. Binding entanglement channels. As noted independently by Horodecki et al. [HHH00], there exist quantum channels through which entanglement can be shared, but only entanglement of the bound variety. These binding entanglement channels are discussed in Ref. [HHH00]. Here we present a simple physical argument for their existence based on bound entangled states, both of the PPT kind as well as the NPT kind (if these exist, see Ref. [DSS+00]). Consider any bound entangled (BE) state ρ on m ⊗ m. With this state we define a channel which takes an m-dimensional input and measures it, along with one half of ρ, in a basis of maximally entangled states. The output of the channel is the other half of ρ and the classical result of the measurement, see Fig. 13(a). No pure entanglement can ever be shared through such a channel, since any procedure which

[Figure: two circuit panels, (a) and (b), with points A, B, C, D and E, the measurement M, the BE state ρ, the maximally entangled state Ψ and the rotation R.]

Fig. 13a,b. Binding Entanglement Channels: (a) The input (from point A) is measured along with half of the BE state ρ in a maximally entangled basis. The measurement M produces classical information represented by the heavy lines. The classical results along with the other half of ρ are returned at the output B. (b) Alice sends half of a maximally entangled pair through the channel. Bob sends the classical data back to Alice, who then performs rotation R on the remaining half of Ψ as determined by the data. The result is teleportation from C to D

could would also be able to distill entanglement from the BE state ρ itself: If Alice and Bob share many copies of ρ, they can simulate actually having the channel by having Alice measure each of her inputs to the channel along with her half of a copy of ρ in the basis of maximally entangled states, and telling Bob the classical result, just as the channel itself would have done. By plugging their simulated channel into a procedure that could share pure entanglement through the channel, Alice and Bob would have distilled entanglement from the bound entangled state ρ. This does not yet establish the existence of a BE channel as our channel might only be able to share separable states. Now suppose Alice, whose lab is at the top of Fig. 13(b), creates a maximally entangled state in m ⊗ m and sends half of it into the channel. Bob, whose lab is at the bottom right of the figure, sends the classical output data to Alice, who does some unitary operation R depending on those data. If the set of possible R’s is chosen correctly, the result is precisely quantum teleportation [BBC+93] of the half of ρ at point C to point D. So a bound entangled state has been shared through the channel. Finally, we note that the actual transmission of the classical data from Bob to Alice, while a simplifying idea, is not strictly needed. That communication along with rotation R is an LQ+CC operation and therefore cannot create entanglement where there was none. So even before the classical communication the state shared between points B and E must have been bound entangled.


7. Conclusion

We have shown some of the mathematical richness of the concept of unextendible and uncompletable product bases, their relation to graph theory and number theory. By exhibiting some of this structure we have uncovered a large family of bound entangled states. We have presented the first example of a new construction for bound entangled states. It would be interesting to try to understand the geometry of uncompletable product bases in a more general way; some of the interesting open questions in this respect have been mentioned in the paper. For example, from every multipartite UPB we can derive bipartite PBs by considering the UPB over bipartite cuts. Do these PBs have any special properties; can they correspond to UCPBs when the local Hilbert spaces have dimension more than 2? The question that this work only partially addresses is one concerning the fruitful use of bound entangled states and the resources needed to implement separable superoperators. Further investigations into this matter will be worthwhile.

A. No Six Member UPB in 3 ⊗ 3

In this appendix we prove that there cannot exist a UPB with six members in 3 ⊗ 3. We use some elementary graph theory to simplify the argument. We denote the complete graph on n vertices as $K_n$, i.e. in this graph all pairs of vertices are connected by an edge. The Ramsey number R(s, t) (cf. Ref. [Bol79]) is defined as the smallest number n such that every coloring of the edges of $K_n$ with 2 colors, say red and blue, contains either a red $K_s$ or a blue $K_t$. The Ramsey number R(3, 3) = 6. This implies that the graph of any product basis with six members contains at least three states which are mutually orthogonal either on Alice’s or Bob’s side; they form an orthogonal triad. Let us assume that this occurs on Bob’s side. We label these states as $|\beta_1\rangle$, $|\beta_2\rangle$, $|\beta_3\rangle$.
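The Ramsey fact used here, R(3, 3) = 6, is small enough to confirm by exhaustive search. The sketch below is only an illustration with hypothetical function names: it enumerates every 2-coloring of the edges of $K_n$ (colors 0 and 1 standing for orthogonality on Alice's or on Bob's side) and looks for a monochromatic triangle.

```python
from itertools import combinations, product


def has_mono_triangle(n, colour):
    """True if some triangle of K_n has all three edges of the same colour."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_colouring_has_mono_triangle(n):
    """Exhaustively check all 2-colourings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    for colours in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, colours))):
            return False
    return True


# K_5 admits a colouring without a monochromatic triangle, K_6 does not.
print(every_colouring_has_mono_triangle(5), every_colouring_has_mono_triangle(6))
```

For n = 6 the search covers $2^{15}$ colorings, so it finishes quickly; the approach does not scale to larger Ramsey numbers.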
Before considering some special cases we establish a simple rule which follows from the fact that the states are defined on 3 ⊗ 3; it is depicted in graph language in Fig. 14. Figure 14 says that when we have a connected square of one color, there will be a repeated state, denoted by the equality “=” sign. Let $|\beta_1\rangle = |1\rangle$ and $|\beta_2\rangle = |0\rangle$, then $|\beta_3\rangle = |1^\perp\rangle \in \mathrm{span}(|0\rangle, |2\rangle)$ and $|\beta_4\rangle = |0^\perp\rangle \in \mathrm{span}(|1\rangle, |2\rangle)$. Orthogonality of $|\beta_3\rangle$ and $|\beta_4\rangle$ implies that either $|\beta_3\rangle = |0\rangle$ or $|\beta_4\rangle = |1\rangle$. Now we consider some subcases. In these cases the non-UPB character of the set is derived, either by directly showing how to extend the set or by showing that the states can be distinguished by LQ+CC (see Lemma 2). We have depicted the cases in Fig. 15a–f:

(a) There exists a $|\beta_i\rangle \in \{|\beta_1\rangle, |\beta_2\rangle, |\beta_3\rangle\}$ such that this vertex i is connected to two out of $|\beta_{4,5,6}\rangle$, say $|\beta_4\rangle$, $|\beta_5\rangle$ on Bob’s side. Then state $|(\alpha_i, \alpha_6)^\perp\rangle \otimes |\beta_i\rangle$ is orthogonal to all the members of the PB and thus the PB is extendible.

(b) None of the states $|\beta_i\rangle$ is orthogonal to any of $|\beta_{4,5,6}\rangle$; then Alice can perform a dissection of the set into (1, 2, 3) and (4, 5, 6). Proposition 2 then applies.

(c) There exists one state $|\beta_i\rangle \in \{|\beta_1\rangle, |\beta_2\rangle, |\beta_3\rangle\}$ such that this vertex i is connected to exactly one out of $|\beta_{4,5,6}\rangle$ on Bob’s side. For example, i = 1. This means that Alice can do a von Neumann measurement with $\mathrm{span}(\alpha_2, \alpha_3)$ and $\mathrm{span}(\alpha_4, \alpha_5, \alpha_6)$. This will split the state $|\alpha_1\rangle$, but as we have seen before a von Neumann measurement that cuts a single state is orthogonality preserving. After the measurement three or four orthogonal states are left to be distinguished. They can be distinguished (Proposition 2 and Theorem 3) and thus all six states can be distinguished.

(d) Here we consider the case in which two vertices, say, $|\beta_1\rangle$ and $|\beta_2\rangle$ are connected to two different vertices out of $|\beta_{4,5,6}\rangle$. Notice that there is a square on the vertices 2,4,3,6

[Figure: the square on vertices 1, 2, 3, 4 and the two repeated-state configurations it implies.]

Fig. 14. The square rule for an orthogonality graph in 3 ⊗ 3

[Figure: panels (a), (b), (c), (d), (e), (f1), (f2) and (f3), each on the vertices 1 to 6, with Bob and Alice edges distinguished.]

Fig. 15a–f. The orthogonality graphs of PBs with six members on 3 ⊗ 3

on Alice’s side. This implies (see Fig. 14) that either 2 is equal to 3 on Alice’s side, which implies that 2 is also orthogonal to 5 on Alice’s side, which results in case (c), or 4 is equal to 6 on Alice’s side which implies that 4 is orthogonal to 1 on Alice’s side which also results in a variant of case (c).

(e) Here we consider the case in which all three vertices $|\beta_{1,2,3}\rangle$ are connected to the three different vertices $|\beta_{4,5,6}\rangle$. When we try to connect, say, vertices 4 and 5 on Bob’s side, we create a square and extra orthogonalities, such that we find examples of case (a) on Bob’s side. If we connect all three vertices 4,5,6 on Alice’s side, we get examples of case (a) on Alice’s side.

(f) When two vertices, say 1 and 2, are connected to the same vertex, say 4, on Bob’s side, it must be that state 3 is equal to state 4 on Bob’s side. Then there are three subcases. In case (f1) state 3 and therefore 4 is not connected to 5 or 6 on Bob’s side. Let us consider how we can connect 1 to 5 and 2 to 5. With any choices of coloring of these edges we create examples of case (a) on either Alice’s or Bob’s side. In case (f2) 3 and therefore 4 is only connected to, say, state 5 on Bob’s side. Then to avoid case (a) on Bob’s side we put Alice’s edges between 1 and (5, 6) and 2 and (5, 6) and 3 and 4. But then a case (a) occurs on Alice’s side. In case (f3) both 3 and 4 are connected to 5 and 6 on Bob’s side; this creates a case (a) again on Bob’s side.

This establishes the no-go result for a 6-member UPB in 3 ⊗ 3.


Acknowledgements. Part of this work was completed during the 1998 Elsag-Bailey-I.S.I. Foundation research meeting on quantum computation. We would like to thank Charles Bennett, Asher Peres, Danny Terno, Ashish Thapliyal and John Tromp for discussion. We would like to thank Noga Alon for bringing Čebotarev’s theorem to our attention, thus proving Theorem 7. JAS and DPD acknowledge support from the Army Research Office under contract number DAAG55-98-C-0041. The work of TM was supported in part by grant #961360 from the Jet Propulsion Lab, and grant #530-1415-01 from the DARPA Ultra program.

References

[AL] Alon, N., Lovász, L.: Unextendible product bases. J. Combinatorial Theory, Ser. A 95, 169–179 (2001)
[BBC+93] Bennett, C.H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A., Wootters, W.K.: Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
[BBP+96] Bennett, C.H., Brassard, G., Popescu, S., Schumacher, B., Smolin, J.A., Wootters, W.K.: Purification of noisy entanglement and faithful teleportation via noisy channels. Phys. Rev. Lett. 76, 722–725 (1996)
[BDF+99] Bennett, C.H., DiVincenzo, D.P., Fuchs, C.A., Mor, T., Rains, E.M., Shor, P.W., Smolin, J.A., Wootters, W.K.: Quantum nonlocality without entanglement. Phys. Rev. A 59, 1070–1091 (1999), quant-ph/9804053
[BDM+99] Bennett, C.H., DiVincenzo, D.P., Mor, T., Shor, P.W., Smolin, J.A., Terhal, B.M.: Unextendible product bases and bound entanglement. Phys. Rev. Lett. 82, 5385–5388 (1999), quant-ph/9808030
[BDSW96] Bennett, C.H., DiVincenzo, D.P., Smolin, J.A., Wootters, W.K.: Mixed state entanglement and quantum error correction. Phys. Rev. A 54, 3824–3851 (1996)
[Ben] Bennett, C.H.: Private communication
[Bol79] Bollobás, B.: Graph Theory. New York: Springer, 1979
[DCLB00] Dür, W., Cirac, J.I., Lewenstein, M., Bruss, D.: Distillability and partial transposition in bipartite systems. Phys. Rev. A 61, 062313 (2000), quant-ph/9910022
[DSS+00] DiVincenzo, D.P., Shor, P.W., Smolin, J.A., Terhal, B.M., Thapliyal, A.V.: Evidence for bound entangled states with negative partial transpose. Phys. Rev. A 61, 062312 (2000), quant-ph/9910026
[DT] DiVincenzo, D.P., Terhal, B.M.: Product bases in quantum information theory. Proceedings of the XIII International Congress on Mathematical Physics, London, July 2000
[DTT00] DiVincenzo, D.P., Terhal, B.M., Thapliyal, A.V.: Optimal decompositions of barely separable states. J. Modern Optics 47(2/3), 377–385 (2000), quant-ph/9904005
[Got97] Gottesman, D.: Stabilizer Codes and Quantum Error Correction. PhD thesis, CalTech, 1997, quant-ph/9705052
[HH99] Horodecki, M., Horodecki, P.: Reduction criterion of separability and limits for a class of distillation protocols. Phys. Rev. A 59, 4206–4216 (1999), quant-ph/9708015
[HHH96] Horodecki, M., Horodecki, P., Horodecki, R.: Separability of mixed states: Necessary and sufficient conditions. Phys. Letts. A 223, 1–8 (1996), quant-ph/9605038
[HHH98] Horodecki, M., Horodecki, P., Horodecki, R.: Mixed state entanglement and distillation: Is there a “bound” entanglement in nature? Phys. Rev. Lett. 80, 5239–5242 (1998), quant-ph/9801069
[HHH99a] Horodecki, M., Horodecki, P., Horodecki, R.: General teleportation channel, singlet fraction, and quasidistillation. Phys. Rev. A 60, 1888–1898 (1999), quant-ph/9807091
[HHH99b] Horodecki, P., Horodecki, M., Horodecki, R.: Bound entanglement can be activated. Phys. Rev. Lett. 82, 1056–1059 (1999), quant-ph/9806058
[HHH00] Horodecki, P., Horodecki, M., Horodecki, R.: Binding entanglement channels. J. Modern Optics 47(2/3), 347–354 (2000), quant-ph/9904092
[Hor97] Horodecki, P.: Separability criterion and inseparable mixed states with positive partial transposition. Phys. Lett. A 232, 333–339 (1997), quant-ph/9703004
[HSTT] Horodecki, P., Smolin, J.A., Terhal, B.M., Thapliyal, A.V.: Rank two bound entangled states do not exist. J. Theor. Comp. Sci. 292(3), 589–596 (2003). See also [Ter99], quant-ph/9910122
[HW79] Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers. New York: Oxford University Press, Fifth edition, 1979
[Lov79] Lovász, L.: On the Shannon capacity of a graph. IEEE Trans. on Inf. Theory 25, 1–7 (1979)
[New76] Newman, M.: On a theorem of Čebotarev. Linear and Multilinear Algebra 3, 259–262 (1976)
[Per] Peres, A.: Private communication
[Per93] Peres, A.: Quantum Theory: Concepts and Methods. Amsterdam: Kluwer Academic Publishers, 1993
[Rai] Rains, E.M.: Entanglement purification via separable superoperators. quant-ph/9707002
[Rai99] Rains, E.M.: Rigorous treatment of distillable entanglement. Phys. Rev. A 60, 173–178 (1999)
[Ter] Terhal, B.M.: A family of indecomposable positive linear maps based on entangled quantum states. Lin. Alg. and Its Appl. 323, 61–73 (2001)
[Ter99] Terhal, B.M.: Quantum Algorithms and Quantum Entanglement. PhD thesis, University of Amsterdam, 1999

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 238, 411–427 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0870-0


On “Time-Periodic” Black-Hole Solutions to Certain Spherically Symmetric Einstein-Matter Systems

Mihalis Dafermos

Mathematics Department, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.

Received: 9 December 2002 / Accepted: 22 January 2003
Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: This paper explores black hole solutions of various Einstein-wave matter systems admitting a time-orientation preserving isometry of their domain of outer communications taking some point to its future. In the first two parts, it is shown that such solutions, assuming in addition that they are spherically symmetric and the matter has a certain structure, must be Schwarzschild or Reissner-Nordström. Non-trivial examples of matter for which the result applies are a wave map and a massive charged scalar field interacting with an electromagnetic field. The results thus generalize work of Bekenstein [1] and Heusler [13] from the static to the periodic case. In the third part, which is independent of the first two, it is shown that Dirac fields preserved by an isometry of a spherically symmetric domain of outer communications of the type described above must vanish. It can be applied in particular to the Einstein-Dirac-Maxwell equations or the Einstein-Dirac-Yang/Mills equations, generalizing work of Finster, Smoller and Yau [10, 8, 9 and also 7].

For equations of evolution, time-periodic or stationary solutions often correspond to the late time behavior of solutions for a large class of initial data. In the general theory of relativity, time-periodic “black hole” solutions, if they exist, seem to provide reasonable candidates for the final state of gravitational collapse. Such solutions can be defined as those invariant with respect to an isometry of the domain of outer communications which takes every point to its future, or more generally, such that points sufficiently close to infinity are mapped to their future. In the case of a continuous family of isometries (i.e. stationary and static solutions), this problem has a long history and goes under the name “no hair” conjecture. See [4] for a survey of classical results and recent important refinements. 
Current proofs depend on various extra assumptions and truly satisfactory theorems have only been obtained in the vacuum and electrovacuum static case. The aim of this paper is to try to generalize some results from the static case to the spherically symmetric “time-periodic” case. The study of periodic solutions to the Einstein equations was initiated in Papapetrou [14, 15]; see also [12]. The analyses indicate


M. Dafermos

that vacuum solutions which are periodic near null infinity should in fact be static there, but they are far from complete, and depend very much on analyticity assumptions on the nature of null infinity, assumptions which do not appear to be physically valid. The present paper appears to be the first to address the issue of the existence of periodic solutions in general relativity in a non-analytic setting, in particular, in a setting compatible with the evolutionary hypothesis. After briefly setting some basic assumptions (Sect. 1) regarding spherical symmetry, we shall show in Sect. 2 that for a certain class of matter, non-trivial spherically-symmetric black-hole phenomena cannot be described by solutions invariant with respect to a map taking some point to its future. In Sect. 3, we shall enlarge the class of matter for which the result applies by taking another approach, which in effect reduces the problem to the static case. The method of Sect. 3 is related to the arguments of [11, 16]. In the spherically symmetric context, the above two sections generalize in particular results of [1] and [13], and Sect. 2, where it applies, provides a new and easier approach for the static case. Moreover, no assumption of invariance of the matter with respect to the isometry is necessary, nor is any real understanding of the behavior of the isometry on the event horizon. In fact, the results apply equally well when the “periodic” assumption is weakened to an appropriate notion of “almost periodicity”. Key to the results are the monotonicity properties of the area radius or the Hawking mass.1 In Sect. 4, which is independent of Sects. 2 and 3, we shall show that Dirac fields preserved by an isometry of the form described above must vanish. The method exploits conservation of the Dirac current. 
There has been a series of recent works [10, 8, 9] in which static spherically symmetric solutions of various coupled Einstein-Dirac-matter systems are considered, as well as work [7] in which periodic solutions of the Dirac equation on a fixed Reissner-Nordström background are considered. Modulo differences in regularity assumptions, all this previous work follows as a special case of the result of this section, which furthermore excludes non-trivial periodic solutions to a large class of coupled Einstein-Dirac-matter systems. The assumptions in this paper have been laid out so as to refer to the details of the matter fields as little as possible. In particular situations, however, many of the geometric assumptions of the next section can be retrieved from more “primitive” ones, provided one makes explicit assumptions regarding the relationship of the isometry and the matter. For this, the results of [16] and [17] are essential, and we refer the reader there. A brief discussion of this issue, along with a comparison to [7–10] for the case of the Dirac equation, is included at the end (Sect. 5).

1. Some Basic Assumptions

Let (M, g) be a spacetime on which SO(3) acts by isometry and let Q be the quotient manifold. We recall that for sufficiently regular g, Q inherits a Lorentzian metric which in local coordinates can be written $-\Omega^2\,du\,dv$, together with the area radius function r, which retrieves a constant multiple of the area of each group orbit. We also recall the Hawking mass $m = \frac{r}{2}\,(1 - |\nabla r|^2)$.

¹ We prefer to use the Hawking mass due to its special significance in spherical symmetry. One can equally well work only with area radius.
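As a sanity check on the definition of the Hawking mass, the following sketch evaluates it on the Schwarzschild solution. The mass parameter $M$ and the coordinates $(t, r)$ are assumptions introduced only for this illustration.

```latex
% Schwarzschild metric restricted to the quotient Q, in (t, r) coordinates.
% M is an assumed mass parameter, used only in this example:
%   g_Q = -(1 - 2M/r) dt^2 + (1 - 2M/r)^{-1} dr^2 .
\begin{align*}
|\nabla r|^2 &= g^{rr} = 1 - \frac{2M}{r}, \\
m &= \frac{r}{2}\bigl(1 - |\nabla r|^2\bigr)
   = \frac{r}{2}\cdot\frac{2M}{r} = M .
\end{align*}
```

On Schwarzschild the Hawking mass is therefore constant and equal to the mass parameter, and $|\nabla r|^2$ vanishes exactly on the horizon $r = 2M$.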


Our geometric assumptions on Q are best formulated with the aid of a conformal diagram. We assume that some open subset of Q can be conformally mapped onto the region R depicted below:

[Figure: conformal diagram of the region R, showing the domain of outer communications D, the event horizon components H+ and H−, null infinity, spacelike infinity, and future and past timelike infinity.]

The dotted curve enclosing the above figure and the three points labelled future and past² timelike infinity and spacelike infinity are not part of R, but form a boundary to which, however, causal relations can still be applied. We assume that null geodesics whose endpoint lies on null infinity have infinite affine length. Thus, null infinity is a boundary “at infinity”. The lighter shaded region $D = J^+(\text{Past Null Infinity}) \cap J^-(\text{Future Null Infinity})$ will be called the domain of outer communications. We assume that $\overline{D} \setminus D = H^+ \cup H^-$, where $H^+$ is a future affine complete null ray and $H^-$ is a past affine complete null ray, both emanating from a single point³ of Q. $H^+ \cup H^-$ will be called the event horizon. There are no “holes” in D: all inextendible null rays emanating from points of D either cross the event horizon and leave D or have infinite affine length. We also assume r is bounded below on D by a positive constant, and that $r \to \infty$ at spacelike and null infinity.

We will assume furthermore that the domain of outer communications of M, i.e. the subset of M that projects to D, admits an isometry τ which descends to Q, i.e. such that it induces an isometry on D with $\tau^* r = r$, etc. Moreover, we will assume that τ preserves the time orientation, and takes some point $p \in D$ to its future, i.e. $\tau(p) \in I^+(p)$.

Consider the orbit $\{\tau^n(p)\}$. Since τ preserves the time orientation, it also preserves the causal relation between two points. Thus for $\infty > i > j > -\infty$, $\tau^i(p) \in I^+(\tau^j(p))$. Since in addition the distance between $\tau(p)$ and $p$ is non-zero and must equal the distance between $\tau^{i+1}(p)$ and $\tau^i(p)$, it follows that the orbit $\{\tau^n(p)\}$ has no limit points in $D \cup H^+ \cup H^-$. Thus, in view of the fact that $r \to \infty$ at spacelike and null infinity while $r(\tau^i(p)) = r(p)$, it follows that as $i \to \infty$, $\tau^i(p)$ approaches future timelike infinity, and $\tau^{-i}(p)$ approaches past timelike infinity. In particular, given any point $q \in D$, there will be a $\tau^i(p)$ in $I^+(q)$, and a $\tau^{-i}(p)$ in $I^-(q)$.

Note now that the metric $-g$ on D, where space and time are reversed, can also be time-oriented, and τ must preserve causal relations with respect to $-g$. This is because “future”-pointing geodesics with respect to $-g$ terminate at spacelike infinity, while “past”-pointing geodesics terminate at $H^+ \cup H^-$. It follows that if $q$ and $\tau(q)$ can be connected by an achronal curve (with respect to g), then so can $q$ and $\tau^j(q)$ for all j. But then $\tau^{2i}(q)$ could never be in the future of $\tau^i(p)$, a contradiction. Thus $\tau(q) \in I^+(q)$ for all $q \in D$, and moreover, there are no limit points of the orbits $\{\tau^n(q)\}$ in $D \cup H^+ \cup H^-$.

The key behind all our arguments will be to show that the existence of the isometry implies that certain quantities vanish on the event horizon. In the simplest cases, covered by the following section, this will then uniquely determine the wave matter to be constant on the event horizon, and then constant everywhere, by application of a uniqueness theorem for solutions of the characteristic initial value problem. This illustrates one of the key points of this paper: in the spherically symmetric context, one can interchange the notions of spacelike and timelike on the quotient manifold Q, making $D = J^+_{(-g)}(H^+ \cup H^-)$, and thus, for appropriate equations, data on the event horizon determines solutions throughout D, by applying standard theorems [3]. In the case where the matter satisfies the weak energy condition, the vanishing of the above-mentioned quantities is proven using the monotonicity properties of the area radius and the Hawking mass.

² In various cases, one can in fact completely avoid making any assumptions on the past. See Sect. 5.

³ Note that this is the familiar restrictive assumption from the “no-hair” theorems; it excludes in particular the case of the critical e = m Reissner-Nordström solution. See, however, Sect. 5, for a sense in which one can get around this assumption.

2. Exploiting the Coupling with Gravity

We refer the reader to [2] for a derivation of the Einstein equations in spherical symmetry with a general energy-momentum tensor. We will assume here that these equations are satisfied pointwise (in particular all functions that appear are bounded) in the null coordinate charts of our atlas for an induced energy-momentum tensor $T_{ab}$ on Q which satisfies the energy condition $T_{uu} \ge 0$, $T_{uv} \ge 0$, $T_{vv} \ge 0$. Here we always select v such that null geodesic rays from points of D generated by $\partial_v$ are future-directed and have infinite affine length (i.e. “terminate” on null infinity). It follows from

$$\nabla_a \nabla_b r = \frac{1}{2r}\,(1 - \partial^c r\,\partial_c r)\,g_{ab} - r\,(T_{ab} - g_{ab}\,\mathrm{tr}\,T) \tag{1}$$

that $\partial_u r \le 0$ and $\partial_v r \ge 0$ in D,⁴ and then, from

$$\partial_a m = r^2\,(T_{ab} - g_{ab}\,\mathrm{tr}\,T)\,\partial^b r \tag{2}$$

that $\partial_u m \le 0$ and $\partial_v m \ge 0$. We then have the following

Proposition 1. Let τ be an isometry of the domain of outer communications D as described in the previous section. It follows that $\partial_v r = 0$ and $\partial_v m = 0$ on $H^+$ and $\partial_u r = 0$, $\partial_u m = 0$ on $H^-$.

⁴ The arguments are similar to those of [2]. Assuming one of the inequalities does not hold, one argues by integrating (1) that r will have to become zero after a finite affine length in the direction of a null geodesic that terminates at the event horizon, a contradiction.
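For orientation, the v-component of Eq. (2) can be written out in null coordinates. The sketch below rests on two assumptions that are not spelled out in this form in the text: that the metric on Q is written $-\Omega^2\,du\,dv$, and that $\mathrm{tr}\,T$ denotes the trace taken over Q.

```latex
% Assumptions: g_{uv} = -\Omega^2/2, g^{uv} = -2/\Omega^2, g_{vv} = 0, and
% tr T = g^{ab} T_{ab} = 2 g^{uv} T_{uv} (trace over the quotient Q).
\begin{align*}
T_{uv} - g_{uv}\,\mathrm{tr}\,T &= T_{uv} - 2T_{uv} = -T_{uv}, \qquad
T_{vv} - g_{vv}\,\mathrm{tr}\,T = T_{vv}, \\
\partial^u r &= g^{uv}\,\partial_v r, \qquad
\partial^v r = g^{uv}\,\partial_u r, \\
\partial_v m &= r^2\bigl[(T_{vu} - g_{vu}\,\mathrm{tr}\,T)\,\partial^u r
               + T_{vv}\,\partial^v r\bigr] \\
             &= \frac{2r^2}{\Omega^2}\,
                \bigl(T_{uv}\,\partial_v r - T_{vv}\,\partial_u r\bigr).
\end{align*}
```

Under the energy condition and the signs $\partial_u r \le 0$, $\partial_v r \ge 0$, both terms are non-negative, consistent with $\partial_v m \ge 0$. At a point where $\partial_v r = 0$ and $\partial_v m = 0$, the formula reduces to $T_{vv}\,\partial_u r = 0$, so $T_{vv}$ vanishes there whenever $\partial_u r \neq 0$.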


Proof. The proof is by contradiction. Suppose p and q are two points on $H^+$ such that $r(q) = r(p) + \epsilon$ for $\epsilon > 0$. By continuity of r there exists a point $q' \in D$ on the ray generated by $-(\partial_u)_q$ such that $r(q') > r(p)$.

[Figure: conformal diagram showing the points p and q on H+, the points p′ and q′ in D, and an image τⁿ(p′), together with H− and null infinity.]

It follows from the inequalities $\partial_u r \le 0$, $\partial_v r \ge 0$ that $r > r(p)$ in $D \cap J^+(q')$. Now consider the point $p'$ at the intersection of the null ray generated by $-(\partial_v)_{q'}$ and the null ray generated by $-(\partial_v)_p$. Again, by the relation $\partial_u r \le 0$ it follows that $r(p') \le r(p)$. Now the assumption on τ implies that there exists an N such that $\tau^n(p') \in D \cap J^+(q')$ for all $n > N$. But since τ is an isometry, $r(\tau^n(p')) = r(p') \le r(p)$. This is a contradiction. One can then apply the same argument with m in place of r, and then for $H^-$ replacing $H^+$, thus completing the proof.

By virtue of Eq. (2) and the boundedness of $T_{uv}$, it follows from the above proposition that since $\partial_v m = 0$ and $\partial_v r = 0$ on $H^+$ and similarly $\partial_u m = \partial_u r = 0$ on $H^-$, we have $T_{vv} = 0$ on $H^+$ and $T_{uu} = 0$ on $H^-$.

We now proceed to outline the more restrictive assumptions on the structure of the matter which will be necessary for our results. The first set of assumptions reflects the structure of the energy-momentum tensor itself. These are:

1. $T = T(\Psi, F, g)$, where F is a skew symmetric 2-tensor and Ψ takes values in some space endowed with a connection $\tilde\nabla$, such that if $\tilde\nabla_X \Psi = 0$ identically for all $X \in T^*M$, then T corresponds to the energy momentum tensor of a spherically symmetric electric field $F_{\mu\nu}$ satisfying the source-free Maxwell equations. Here $\tilde\nabla_X$ is the induced connection on M.
2. $T_{vv} = 0$ should imply $\tilde\nabla_v \Psi = 0$, and $T_{uu} = 0$ should imply $\tilde\nabla_u \Psi = 0$. Here $\tilde\nabla_v = \tilde\nabla_{\partial_v}$, etc.

One example of such a T is the energy momentum tensor generated by a wave map $\phi : (M, g) \to (N, h)$ interacting (via the gravitational field only, since it does not carry charge) with an electromagnetic field $F_{\mu\nu}$:

$$T_{\mu\nu} = F_{\mu\lambda}F_{\nu\rho}\,g^{\lambda\rho} - \frac14\,g_{\mu\nu}\,F_{\lambda\rho}F_{\sigma\tau}\,g^{\lambda\sigma}g^{\rho\tau} + h_{AB}\Bigl(\phi^A_{;\mu}\phi^B_{;\nu} - \frac12\,g_{\mu\nu}\,g^{\rho\sigma}\phi^A_{;\rho}\phi^B_{;\sigma}\Bigr).$$

A special case of the above is of course when N is $\mathbb{R}^n$ with the flat metric and φ is then a collection of n real scalar fields.
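Condition 2 can be checked by hand for the wave map example. The following is a sketch under stated assumptions: null coordinates on Q (so $g_{vv} = 0$ and $g^{uu} = 0$), a spherically symmetric $F$ whose only non-vanishing components are $F_{uv}$ and the angular ones, and a positive definite target metric $h$ (a Riemannian target is an assumption made here for the illustration).

```latex
% vv-component of the wave-map energy-momentum tensor in null coordinates.
% The g_{vv} terms drop out, and F_{v\lambda} is non-zero only for \lambda = u.
\begin{align*}
T_{vv} &= F_{v\lambda}F_{v\rho}\,g^{\lambda\rho}
          - \tfrac14\,g_{vv}\,F_{\lambda\rho}F_{\sigma\tau}\,
            g^{\lambda\sigma}g^{\rho\tau}
          + h_{AB}\bigl(\phi^A_{;v}\phi^B_{;v}
          - \tfrac12\,g_{vv}\,g^{\rho\sigma}\phi^A_{;\rho}\phi^B_{;\sigma}\bigr) \\
       &= F_{vu}F_{vu}\,g^{uu} + h_{AB}\,\partial_v\phi^A\,\partial_v\phi^B \\
       &= h_{AB}\,\partial_v\phi^A\,\partial_v\phi^B \;\ge\; 0 .
\end{align*}
```

With $h$ positive definite, $T_{vv} = 0$ forces $\partial_v\phi^A = 0$ for every $A$, which is what Condition 2 asks for in this example. The same computation with u and v exchanged handles $T_{uu}$.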
Another example of a T satisfying 1 and 2 above is the energy momentum tensor generated by a massive complex scalar field φ in an electromagnetic field $F_{\mu\nu}$ with electromagnetic potential $A_\mu$. Here the connection $D_\mu$ is defined as $D_\mu = \partial_\mu + ieA_\mu$, and T takes the form:

$$T_{\mu\nu} = F_{\mu\lambda}F_{\nu\rho}\,g^{\lambda\rho} - \frac14\,g_{\mu\nu}\,F_{\rho\sigma}F_{\lambda\kappa}\,g^{\rho\lambda}g^{\sigma\kappa} - \frac12\,g_{\mu\nu}\,M^2\phi\bar\phi + \frac12\bigl(D_\mu\phi\,\overline{D_\nu\phi} + \overline{D_\mu\phi}\,D_\nu\phi\bigr) - \frac12\,g_{\mu\nu}\,g^{\rho\sigma}D_\rho\phi\,\overline{D_\sigma\phi}.$$

Supposing we have a τ as in Proposition 1, for T satisfying the above assumptions it follows that $\tilde\nabla_v\Psi = 0$ on $H^+$ and $\tilde\nabla_u\Psi = 0$ on $H^-$. Since the restriction of a connection to a one-dimensional set is trivial, we can then choose coordinates for a space representing the degrees of freedom for Ψ such that Ψ is in fact constant on the event horizon. If in local coordinates $x_a \in Q$ the system of equations for Ψ, with $F_{\mu\nu}$, $g_{ab}$, and r fixed, can be written in the form

$$\partial^a\partial_a\Psi = F(\nabla\Psi, \Psi, x_a) \tag{3}$$

with F a sufficiently regular function,⁵ then the characteristic initial value problem with initial data on the event horizon is locally well posed, provided Ψ is assumed sufficiently regular [3].⁶ If this equation admits the solution $\Psi = \Psi(H^+ \cup H^-)$, then this must be the only solution in the vicinity of the horizon, and by a continuity argument, this domain of dependence property can be extended to guarantee uniqueness throughout D. A sufficient condition for this is clearly $F(0, \Psi, x_a) = 0$. Thus, in view of the fact that spherically symmetric solutions of the Einstein-Maxwell equations are necessarily Reissner-Nordström, we have

⁵ Note that for fields which couple directly to the $F_{\mu\nu}$ tensor, there is a regularity assumption on F as well as on g implicit in (3).

⁶ Recall the comment from before that to apply [3], one should first redefine the metric $g_{ab}$ to be its negative, so D becomes $J^+(H^+ \cup H^-)$.

Theorem 1. If $T_{\mu\nu}$ satisfies Conditions 1 and 2, and Ψ satisfies a system of the form (3), with $F(0, \Psi, x_\alpha)(p) = 0$, and if τ is as in Proposition 1, it follows that D coincides with the domain of outer communications of a Reissner-Nordström solution.

Note that the above theorem applies to the wave map system discussed above, which can be written $\partial^\alpha\partial_\alpha\Psi = \Gamma(\Psi)(|\nabla\Psi|^2)$, where Γ is an expression involving the Christoffel symbols of (N, h). (Compare with [13]. The above argument reproves, in particular, the static result, and seems considerably easier, as it does not depend on the geometry of the target.)

There is another interesting feature of the above argument that should be noted. Nowhere have we assumed that τ preserves Ψ, only that it preserves the metric. Moreover, the fact that τ is an isometry is only used to obtain the result of Proposition 1. Examination of the proof of that proposition reveals that our previous assumptions on τ can be replaced by the following:

1. For all p, as $n \to \infty$, $\tau^n(p)$ tends to future timelike infinity, and as $n \to -\infty$, $\tau^n(p)$ tends to past timelike infinity.
2. Given any point p, $\liminf_{n\to\pm\infty}\bigl(|r(\tau^n(p)) - r(p)| + |m(\tau^n(p)) - m(p)|\bigr) = 0$.⁷

Such solutions can be called “almost periodic”.

3. Constructing a Killing Vector and Reducing to the Static Case

Unfortunately, as it stands, the argument of the previous section cannot be applied in the case of a complex scalar field or a massive scalar field, for then the dependence of F on Ψ is not of the type described above. In particular, there do not exist constant non-zero solutions. It is perhaps instructive to compare here with the static case. The argument of Bekenstein [1], say for the scalar field $\Box\phi = M\phi$ with $M > 0$, goes roughly as follows: Integrating the equality $\nabla_\alpha(\phi\nabla^\alpha\phi) = \nabla_\alpha\phi\nabla^\alpha\phi + M\phi^2$ using Gauss’s theorem, the boundary contributions along the event horizon vanish, while the contributions along two spacelike curves which are carried to one another by the isometry cancel. Moreover, the divergence is non-negative, since in a static solution $\nabla\phi$ is spacelike, and 0 only if the solution vanishes. Thus, either the solution is identically 0, or there must be a boundary contribution at infinity, i.e. the solution does not decay as $r \to \infty$. The latter would imply that the curvature does not decay, and thus the solution would not be asymptotically flat. Thus asymptotically flat static solutions of $\Box\phi = M\phi$ must vanish identically.

The above Bochner-type method and arguments based on it cannot be applied directly in the periodic case, as $\nabla\phi$ may have negative length. With slightly more effort than in Sect. 2, one can show that for various examples of matter (including the case of a charged massive scalar field, for instance) our initial data determine the solution to be static, and then apply the above argument to show that this solution must thus not decay at infinity. The idea is similar in spirit to Theorem 1, except that now we will apply the uniqueness theorem for the characteristic initial value problem to a system of second order hyperbolic equations derived from Killing’s equation.
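The identity behind Bekenstein's argument, as summarized above, can be written out in two lines. In this sketch the region name $\mathcal{U}$ and the boundary conormal $n_\alpha$ are hypothetical labels introduced only for the illustration.

```latex
% Static-case identity for a scalar field obeying \Box\phi = M\phi, M > 0.
\begin{align*}
\nabla_\alpha(\phi\,\nabla^\alpha\phi)
  &= \nabla_\alpha\phi\,\nabla^\alpha\phi + \phi\,\Box\phi \\
  &= \nabla_\alpha\phi\,\nabla^\alpha\phi + M\phi^2 .
\end{align*}
% Integrate over a region \mathcal{U} (hypothetical name) bounded by the
% horizon, two isometric spacelike curves, and a curve near infinity;
% n_\alpha (hypothetical name) is the conormal of the boundary.
\begin{equation*}
\int_{\mathcal{U}} \bigl(\nabla_\alpha\phi\,\nabla^\alpha\phi + M\phi^2\bigr)
  = \oint_{\partial\mathcal{U}} \phi\,\nabla^\alpha\phi\; n_\alpha .
\end{equation*}
```

In the static case the integrand on the left is non-negative because $\nabla\phi$ is spacelike. In the periodic case $\nabla_\alpha\phi\,\nabla^\alpha\phi$ can be negative, which is why the argument does not carry over directly.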
⁷ In fact the term involving m can be dropped.

First, we introduce the following new assumptions. Letting $x^A$ denote coordinates on $S^2$, we assume

1. $\partial_v r = 0$ on $H^+$ and $\partial_u r = 0$ on $H^-$.
2. $T_{vv} = 0$, $\nabla_u T_{vv} = 0$ on $H^+$ and $T_{uu} = 0$, $\nabla_v T_{uu} = 0$ on $H^-$.
3. $\nabla_v T_{AB} = 0$ on $H^+$ and $\nabla_u T_{AB} = 0$ on $H^-$.

Given an isometry τ as before, or alternatively an “almost isometry”, Assumptions 1, 2 and 3 above follow for a large class of matter, including the case of a complex scalar field interacting in an electromagnetic field. (See the Appendix, where it is shown explicitly that this particular example satisfies all assumptions in this section. In fact, the reader can deduce that all assumptions of this section are valid for a wide class of matter, including Yang-Mills fields, from the results of [16].)

We define v and u on the event horizon so as to yield an affine distance as measured from the point $H^+ \cap H^-$ on $H^+$ and $H^-$ respectively, i.e. we will be assuming that $g_{uv} = -1$ on $H^+ \cup H^-$. We now will define a particular null vector field K on the event horizon, and extend it to D as the unique solution of the initial value problem, with initial data on the event horizon,⁸ for the equation

$$\Box K^\alpha = -K^\beta R_{\beta\gamma}\,g^{\gamma\alpha}. \tag{4}$$

The choice of the definition will be to ensure that $\mathcal{L}_K g_{\mu\nu} = 0$ on $H^+ \cup H^-$. For now write $K|_{H^+} = K^v(v)\,\partial_v$ and $K|_{H^-} = K^u(u)\,\partial_u$, where we will determine immediately following what $K^v(v)$ and $K^u(u)$ have to be.

⁸ Recall the comment in Sect. 1 about the well-posedness of this problem in spherical symmetry, because of the symmetry between timelike and spacelike directions.

Let us concentrate first on $H^+$. In the null coordinates defined above (where in addition $x^A$ are taken to be normal coordinates on $S^2$), the only non-vanishing Christoffel symbols are $\Gamma^u_{uu}$, $\Gamma^B_{uA}$ and $\Gamma^v_{AB}$. Outside of $H^+$, $\Gamma^u_{uu}$, $\Gamma^v_{vv}$, $\Gamma^B_{uA}$, $\Gamma^B_{vA}$, $\Gamma^u_{AB}$, $\Gamma^v_{AB}$ are the only non-vanishing components. Note also that on $H^+$,

$$\begin{aligned}
R &= 2g^{uv}R_{uv} + g^{AB}R_{AB} \\
  &= -2\bigl(-\partial_u\Gamma^v_{vv} - \partial_u\Gamma^A_{vA}\bigr) + g^{AB}\bigl(\partial_u\Gamma^u_{AB} + \partial_v\Gamma^v_{AB}\bigr) \\
  &= 2\,\partial_u\Gamma^v_{vv} + 4g^{AB}\partial_u\partial_v g_{AB} \\
  &= 2\,\partial_u\Gamma^v_{vv} + 2(R + 2R_{uv}),
\end{aligned}$$

and thus, we have that

$$\partial_u\Gamma^v_{vv} = -\frac12 R - 2R_{uv}.$$

We compute

$$\Box K^u = -2\,\partial_u\partial_v K^u, \qquad \Box K^v = -2\,\partial_u\partial_v K^v + \partial_v K^v\bigl(-g^{AB}\Gamma^v_{AB}\bigr) + K^v\bigl(-\partial_u\Gamma^v_{vv}\bigr),$$

and thus (4) gives

$$\partial_u\partial_v K^u = 0, \tag{5}$$

$$\begin{aligned}
\partial_v\partial_u K^v &= -\frac12\,\partial_v K^v\,g^{AB}\Gamma^v_{AB} - \frac12\bigl(\partial_u\Gamma^v_{vv} + R_{uv}\bigr)K^v \\
 &= -\frac12\,\partial_v K^v\,g^{AB}\Gamma^v_{AB} + \frac14(R + 2R_{uv})K^v \\
 &= -\frac14\,\partial_v K^v\int_0^v (R + 2R_{uv})\,dv + \frac14\,K^v(R + 2R_{uv}).
\end{aligned} \tag{6}$$

Recalling $(\mathcal{L}_K g)_{\alpha\beta} = K_{\alpha;\beta} + K_{\beta;\alpha}$, in view of our knowledge of the Christoffel symbols, and the fact that $K_v = 0$ on $H^+$, we obtain

$$(\mathcal{L}_K g)_{uv} = \partial_u K_v + \partial_v K_u = -\partial_u K^u - \partial_v K^v, \tag{7}$$

$$(\mathcal{L}_K g)_{vv} = 2\,\partial_v K_v = -2\,\partial_v K^u = 0, \qquad (\mathcal{L}_K g)_{vA} = 0, \qquad (\mathcal{L}_K g)_{uA} = 0,$$

$$(\mathcal{L}_K g)_{AB} = -\Gamma^u_{AB}K_u - \Gamma^v_{AB}K_v = 0,$$

$$(\mathcal{L}_K g)_{uu} = 2\bigl(\partial_u K_u - K_u\Gamma^u_{uu}\bigr) = -2\,\partial_u K^v. \tag{8}$$

Thus, if we are to have $(\mathcal{L}_K g)_{\alpha\beta} = 0$ on $H^+$, it follows from (7) that

$$\partial_u K^u = -\partial_v K^v. \tag{9}$$

Rewriting (5) as $\partial_v\partial_u K^u = 0$, it follows from (9) that $\partial_v\partial_v K^v = 0$, and thus that $K^v = Cv$. Using (9) again, and the same argument, it follows that $K^u = -Cu$ on $H^-$. Of course, to show that indeed we have $(\mathcal{L}_K g)_{\alpha\beta} = 0$ on $H^+$ for the K defined above, it remains to show, in view of (8), that $\partial_u K^v = 0$. Since the above equation is indeed true at $H^+ \cap H^-$, it is enough to show that $\partial_v\partial_u K^v = 0$, or, by (6), that

$$-\frac14\,\partial_v K^v\int_0^v (R + 2R_{uv})\,dv + \frac14\,K^v(R + 2R_{uv}) = 0.$$

Assumption 2 together with the conservation of energy-momentum implies that $\partial_v R_{uv} = 0$, and Assumption 3 implies that $\partial_v R = 0$, and thus $R + 2R_{uv} = c$, where $c = (R + 2R_{uv})|_{H^+ \cap H^-}$. Thus we compute

$$-\frac14\,\partial_v K^v\int_0^v (R + 2R_{uv})\,dv + \frac14\,K^v(R + 2R_{uv}) = -\frac14\,C(cv) + \frac14\,(Cv)c = 0.$$

We can then take $C = 1$, and we have found a nontrivial vector field vanishing at $H^+ \cap H^-$ satisfying $\mathcal{L}_K g = 0$ on $H^+$, and similarly, $\mathcal{L}_K g = 0$ on $H^-$ as well.

Denote now the totality of matter by Ψ. In terms of this K defined, we assume further

4. $\mathcal{L}_K\Psi = 0$ on $H^+ \cup H^-$.⁹
5. The quantities $\mathcal{L}_K g_{\mu\nu}$ and $\mathcal{L}_K\Psi$ satisfy a system of equations which, when everything else is treated as fixed, only admits the zero solution if they vanish on $H^+ \cup H^-$.

The above assumptions are motivated by [17], and the fact that equations which reduce to hyperbolic systems in 1 + 1 dimensions are well-posed with data on $H^+ \cup H^-$, since the notion of space and time can be reversed. All our assumptions taken together imply that we have produced a vector field K such that $\mathcal{L}_K g = 0$, i.e. a Killing field K on the domain of outer communications. Note that a similar argument ensures that K is also a Killing field “downstairs”, i.e. that in addition, $Kr = 0$. From this, it follows that K must be timelike. Since K does not vanish identically on the event horizon, it follows that there exists a $p \in D$ such that $K(p) \neq 0$.
Thus K does not vanish along the curve $r = r(p)$, which must be the orbit $\varphi_t(p)$, where $\varphi_t$ denotes the one parameter group of isometries generated by K. Since all future directed constant-u null rays must intersect the curve $r = r(p)$, it follows that K can nowhere vanish. For if it did at some point q, then choosing a point s on $r = r(p)$ which can be connected to q by a spacelike curve, $\varphi_t(s)$ for large enough t is in the future of $\varphi_t(q) = q$, which contradicts the fact that $\varphi_t$ is an isometry. We have thus proven

Theorem 2. For an Einstein-matter system satisfying Assumptions 1, 2, 3, 4, and 5 above, the domain of outer communications is static.

⁹ The expression $\mathcal{L}_K\Psi$ can be tricky to define if the equations have a gauge invariance. Typically, this will mean that there is a choice of gauge for which the matter can be expressed by some Ψ for which $\mathcal{L}_K\Psi = 0$. See the Appendix for the case of a complex scalar field.

4. Exploiting a Conserved Current: The Case of the Dirac Equation

In the case of Dirac fields, the arguments outlined above do not apply because this matter does not satisfy the positive energy condition. This is related to the fact that the Dirac field probably provides a reasonable model only after second quantization. But in fact, considerations regarding periodic solutions are even easier than in the previous section, and can be studied without applying the coupling with gravity, which played a central role in the previous arguments. We refer the reader to [7] for background on this problem in the uncoupled case, and to [10, 8, 9] in the coupled static case. In particular, we recall the Dirac matrices $G^\alpha$, which operate on Dirac fields Ψ, which in turn are sections of an appropriate spinor bundle. The precise form of the Dirac equation could depend on the other matter fields present to which the field is coupled. We will simply assume that Ψ satisfies in local coordinates, after fixing the metric and the other matter fields, a linear equation of the form

$$iG^\alpha\partial_\alpha\Psi = F(\Psi, x_\alpha), \tag{10}$$

where $F(0) = 0$. Note that by squaring the Dirac operator, it follows that Ψ satisfies a system

$$\Box\Psi = \tilde F(\Psi, \nabla\Psi, x_\alpha). \tag{11}$$

We further assume that the vector field $\bar\Psi G^\alpha\Psi$ provides a positive current, i.e.

$$\bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} \ge 0$$

when $X^\beta$ is future directed and timelike, with equality only in the case where Ψ vanishes, and moreover, this current is conserved:

$$\nabla_\alpha(\bar\Psi G^\alpha\Psi) = 0. \tag{12}$$

We do not assume that Ψ is spherically symmetric, but we do assume that it is defined on a spherically symmetric manifold M, which satisfies the assumptions of Sect. 1. We assume that Ψ is locally regular in M, in the sense that in a neighborhood of every point of M, there exists a regular representation of the Dirac matrices such that Ψ are suitably regular functions. We will assume further that

$$\tau_*(\bar\Psi G^\alpha\Psi) = \bar\Psi G^\alpha\Psi. \tag{13}$$

Note that in this section, in general, Greek indices will indicate that the relation is to be interpreted “upstairs”, i.e. on M.¹⁰ Moreover, we assume that $\tilde F$ is such that (11) descends to a linear wave equation on Q for each spherical harmonic of Ψ.

¹⁰ Recall that the notation τ is also used for the isometry upstairs.

Consider a spacelike curve γ, terminating at spacelike infinity, which divides D into two connected components, and intersects the event horizon at $H^+ \cap H^-$. Let X be the future normal vector field to γ. We will also denote by X the vector upstairs which projects to X and which is orthogonal to the group orbits. Fix a point q on the event horizon and p on γ. We denote the part of γ connecting p with spacelike infinity by $\gamma_p$. Then for an isometry τ as in Proposition 1, there exists an n such that $\tau^n(p)$ and q can be connected by a spacelike curve $\tilde\gamma$:

[Figure: conformal diagram of D showing the curve γ through p, its image τⁿ(γ_p) through τⁿ(p), the spacelike curve γ̃ joining τⁿ(p) to q on H+, together with H− and null infinity.]

We will assume that¹¹

$$\int_\gamma \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} < \infty. \tag{14}$$

The above assumption is quite reasonable in view of the fact that this integral should be equal to the probability of observing the particle on γ, which should be normalizable to something less than 1. Now integrating the conservation law (12) and applying Gauss’ theorem, and since

$$\int_{\tilde\gamma} \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} \ge 0,$$

it follows that

$$\begin{aligned}
\int_{H^+ \cap J^-(q)} \bar\Psi G^\alpha\Psi\,N^\beta g_{\alpha\beta}
 &\le \int_\gamma \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} - \int_{\tau^n(\gamma_p)} \bar\Psi G^\alpha\Psi\,(\tau^n_* X)^\beta g_{\alpha\beta} \\
 &= \int_\gamma \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} - \int_{\gamma_p} \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} \\
 &= \int_{\gamma\setminus\gamma_p} \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta},
\end{aligned}$$

where the second line follows from the fact that τ is an isometry and (13). Here $N^\beta$ is an appropriately normalized null vector generating $H^+$ upstairs. As $p \to H^+ \cap H^-$, the term on the right hand side approaches 0. Thus, since the left hand side is nonnegative, it follows that

$$\int_{H^+ \cap J^-(q)} \bar\Psi G^\alpha\Psi\,N^\beta g_{\alpha\beta} = 0$$

for all q, and consequently $\bar\Psi G^\alpha\Psi\,N^\beta g_{\alpha\beta} = 0$ identically on $H^+$, and similarly, $\bar\Psi G^\alpha\Psi\,N_-^\beta g_{\alpha\beta} = 0$ on $H^-$, where $N_-$ denotes a null vector generating $H^-$ in M. Since $N_- + N$ at $H^- \cap H^+$ is timelike, it follows by the positivity of the current that Ψ in fact vanishes there.

¹¹ All integrals below should of course be interpreted on M, i.e. $\int_\gamma$ denotes the integral over the 3-hypersurface in M represented by γ with respect to its volume form.

It turns out that the behavior of Ψ on the event horizon, deduced above, together with the Dirac equation, imply that Ψ vanishes identically on the event horizon. Choose coordinates u, v, $x^1$, and $x^2$ in a neighborhood of $H^+ \cap H^-$, such that

$$g = -\Omega^2\,du\,dv + \tilde g_{ij}\,dx^i\,dx^j.$$

It follows from the properties deduced above that a spinor representation can be chosen such that $G^u\Psi = 0$, $G^u\partial_v\Psi = 0$, and $G^u\partial_{x^i}\Psi = 0$ on $H^+$, while $G^v\Psi = 0$, $G^v\partial_u\Psi = 0$, and $G^v\partial_{x^i}\Psi = 0$ on $H^-$. From the anticommutation relations it follows that $G^vG^v = 0$, $G^uG^u = 0$. Multiplying the Dirac equation (10) by $G^u$, and restricting to $H^+$, one obtains

$$i\bigl(G^uG^v\partial_v\Psi + G^uG^{x^i}\partial_{x^i}\Psi\bigr) = G^u(F(\Psi)).$$

Since $G^uG^{x^i} = -G^{x^i}G^u$ by the anticommutation relations, it follows from $G^u\partial_{x^i}\Psi = 0$ that $iG^uG^v\partial_v\Psi = G^uF(\Psi)$. Again, from the anticommutation relations, one obtains that $G^uG^v = 2g^{uv} - G^vG^u$, and thus, since $G^u\partial_v\Psi = 0$,

$$\partial_v\Psi = \tilde f(\Psi) \tag{15}$$

for a well-behaved function $\tilde f$ with $\tilde f(0) = 0$. From the fact shown above that $\Psi = 0$ at $H^+ \cap H^-$, it now follows immediately from (15) that Ψ must vanish identically on $H^+$. One argues in the same way to obtain that Ψ vanishes identically on $H^-$. This condition completely determines initial data for the characteristic initial value problem¹² for each of the spherical harmonics of Ψ, and thus in view of our assumption on $\tilde F$ of (11), assuming that Ψ is in a space sufficiently regular, all the spherical harmonics must vanish identically in D by uniqueness of the solution of the characteristic initial value problem. We thus have

Theorem 3. For M, τ, Ψ satisfying the assumptions outlined in this section, $\Psi = 0$ throughout the domain of outer communications of M.

In particular, this result immediately implies that “time-periodic” solutions of the coupled Einstein-Dirac equations are Schwarzschild, and “time-periodic” solutions of the coupled Einstein-Dirac-Maxwell equations are Reissner-Nordström. Combining the above theorem with the results of Sects. 2 and 3, one can prove for a wide class of “matter” that time-periodic solutions of coupled Einstein-Dirac-“matter” systems must be static solutions of Einstein-“matter”, and this in turn, in several cases, again implies that they are in fact Schwarzschild or Reissner-Nordström.

¹² See the remark in the previous section about the change of sign of the metric $g_{ab}$.


Again, it is clear from the proof that one can replace the previous assumptions on τ with Assumption 1 from the end of Sect. 2, together with the assumption that for all p, ε, there exists an $N(\epsilon, p)$ such that $|n| \ge N$ implies

$$\left|\int_{\gamma_p} \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta} - \int_{\gamma_{\tau^n(p)}} \bar\Psi G^\alpha\Psi\,X^\beta g_{\alpha\beta}\right| < \epsilon.$$

There is thus a sense in which the above theorem holds for “almost periodic solutions” as well.

5. A Note on the Assumptions

As discussed in the beginning, the motivation for considering “time-periodic” solutions (Q, g) is as “limiting” final states of gravitational collapse. Thus, a priori, perhaps it makes more sense to assume that Q be defined to the future of a spacelike hypersurface S.

[Figure: conformal diagram of the region to the future of the spacelike hypersurface S, bounded by H+ and null infinity.]

In view of our assumptions on the existence of an isometry τ, however, given a fundamental domain F such that $S \subset \partial F$, one can construct in an obvious way $\tilde Q$ and $\tilde\tau$ an isometry of $\tilde Q$, such that $\tilde Q = Q \cup \bigcup_i \tilde\tau^i F$. In the spherically symmetric case, if the energy momentum tensor satisfies the energy condition, it follows that this spacetime will also have a past boundary. For, otherwise, it is clear by the arguments of Sect. 2 that since $\partial_v r > 0$ in $\tilde Q$, r must become less than its own infimum within finite affine length in the direction $-\partial_v$, a contradiction. Moreover, by arguments similar in spirit to the proof of Proposition 2, this boundary can be shown to have a natural null structure and we can denote it as before by $H^-$.

[Figure: conformal diagram showing the fundamental domain F, bounded by H+, H− and null infinity.]

This kind of construction, in a slightly different setting, has in fact appeared in [11, 16]. In particular, the results of [16] imply that, in the case of the coupled Einstein-matter system, for a wide class of matter, $H^-$ is regular and the matter fields extend regularly there. Moreover, if a certain quantity $\kappa_0$ defined in [16] is nonzero at any point of the event horizon, then $H^+$ and $H^-$ intersect, and we retrieve in particular the assumptions of Sect. 1. Thus for a wide class of matter, one can indeed formulate the assumptions to refer only to the future of a spacelike S. Of course, for the above, one would have to explicitly assume that the matter fields are preserved by τ, something not necessary in Sects. 3 and 4, as well as drop any hope of an “almost periodic” extension of the result. It should also be noted that with a little more work, this procedure can be carried out for the Dirac equation, even though it violates the positive energy condition. Thus, a theorem can be stated analogous to Theorem 3, but where all assumptions are made on $J^+(S)$.

As noted in the introduction, the results of Sect. 4 can be thought of as a generalization of previous work of Finster, Smoller, and Yau. It is important to note, however, that there are some differences in the basic assumptions between the present work and [8–10]. These differences basically arise from a difference in point of view: in the aforementioned series of papers, the (r, t) coordinates themselves (and an associated gauge they define) seem to be assigned a fundamental physical ontology even where they are singular, whereas in the present paper, it is observations made by local observers which are considered fundamental. On the one hand, this leads the authors of [8–10] to a much more stringent notion of periodicity and staticity that in particular refers explicitly to a certain choice of gauge. On the other hand, this leads them to consider solutions of the Dirac equation for which the condition (14) is violated for spacelike hypersurfaces crossing the event horizon; in particular, a priori Ψ is allowed to blow up identically on the event horizon at a very specific rate. To deal with the possibility of such a blow-up, however, [8–10] must then introduce very strong auxiliary regularity assumptions on the rest of the matter and the metric, against which one must measure this specific blow-up: assumptions at the level of $C^\infty$ of the metric at the horizon, various auxiliary coordinate-dependent conditions and power-law assumptions, etc. Moreover, [7] depends on an assumption on the vanishing of a certain flux over $H^-$, and the global behavior of Ψ in the interior of the Reissner-Nordström black hole, a geometry which is widely thought to be unstable (see [5]).

Acknowledgement. I thank Piotr Chruściel and István Rácz for some very useful discussions on a preliminary version of this paper.

Appendix We will show that a complex scalar field indeed satisfies the assumptions of Sect. 3. To reduce the equations to a determined system, we will have to set a gauge. We can always choose a gauge so that Av = 0 on H+ and Au = 0 on H− , and the components AB = 0 as well. We will also recall the notation D for the covariant derivative defined by the connection A, i.e. we have Dµ φ = φ,µ + ieAµ φ. Note that the only non-vanishing components of the electromagnetic tensor Fµν are Fuv and the collection FAB , where A and B range over coordinates on S 2 . Thus, Tvv = Dv φDv φ,

(16)

Tuu = Du φDu φ,

(17)

Black-Hole Solutions to Einstein-Matter Systems

425

1 1 Tuv = − guv FAB FCD g AC g BD + Fuv Fvu g vu , 4 2 1 TAB = FAC FBD g CD − gAB FMN FCD g MC g ND . 4 In particular, the existence of τ implies that Tvv = 0 on H+ , Tuu = 0 on H− , and thus ∂v φ = Dv φ = 0 on H+ and ∂u φ = Du φ = 0 on H− . Moreover, since ∇u Tvv = ∂u Tvv , applying ∂u to (16) yields ∇u Tvv = 0 on H+ and similarly ∇v Tuu = 0 on H− . Assumptions 1 and 2 of Sect. 3 thus hold. Now, Maxwell’s equations Fµν;ρ g νρ − ieφDµ φ + ieφDµ φ = 0, restricted to H+ , yield the equation ∂v Fvu = 0, and on H− , the equation ∂u Fvu = 0. Similarly, the equation F[AB,v] = 0 yields ∂v FAB = 0, and ∂u FAB = 0, throughout D. Thus we have ∂v TAB = 0 on H+ and ∂u TAB = 0 on H− , and this implies Assumption 3. To write a determined system of equations, we impose the equation ∇ α Aα = 0. Note that this equation, together with the condition that Au = 0 on H+ and Av = 0 on H− implies that LK Aµ = 0 on H+ ∪ H− . For, on H+ we obtain that Au,v = 21 Fuv , and thus µ (LK A)u = K µ Aµ,u + K,u Aµ u = K v Au,v + K,u Au v = K v Au,v − K,v Au

=

1 v 1 K Fuv − ∂v K v Fuv v 2 2

=

1 1 CvFuv − CvFuv = 0, 2 2

u v Au + K,v Av = 0 + 0 + 0 = 0. (LK A)v = K v Av,v + K,v


Our equations for the matter = (Aµ , Fµν , φ) thus now become ∇ α Aα = 0, A(µ,ν) = Fµν , ∇ α Fαµ = ie(φDµ φ − φDµ φ), g µν Dµ Dν φ = 0. Applying LK to these equations, and using Eq. (4) yields ∇ α (LK A)α = L(∇LK g) + L(LK g), (LK A)(µ,ν) = L(LK φ) + L(LK g), 2(LK φ) = L(∇LK (g)) + L(LK A) + L(LK g) + L(LK F ). Here, the notation L(x) means terms linear in x. Noting that LK T = L(g) + L(φ) + L(A) + L(F ), and that LK Fµν = (LK A)(µ,ν) , we have that given g, A, F , and K, the above system coupled with the equation 2LK g = L(LK T ) + L(LK g) can be written as a closed linear hyperbolic system in 1+1 dimensions for LK A, (LK φ), and (LK g), with vanishing initial data on H+ ∪ H− , and for which 0 is a solution. Since 0 is a solution of this system it must be the only solution, by uniqueness of this initial value problem, i.e. the final assumption of Sect. 3 is also verified. The argument clearly also applies to the massive case, and to more general so-called Higgs fields. It can be adapted to more general gauge theories. See [16]. References 1. Bekenstein, J.: Nonexistence of baryon number for static black holes. I. Phys. Rev. D (3) 5, 1239– 1246 (1972) 2. Christodoulou, D.: Self-gravitating relativistic fluids: A two-phase model. Arch. Rational Mech. Anal. 130(4), 343–400 (1995) 3. Christodoulou, D., zum Hagen, H.M.: Probl`eme de valeur initiale caract´eristique pour des syst`emes quasi lin´eaires du second ordre. C. R. Acad. Sci. Paris S´er. I Math. 293(1), 39–42 (1981) 4. Chru´sciel, P.: “No hair” theorems—folklore, conjectures, results. Differential geometry and mathematical physics (Vancouver, BC, 1993), Contemp. Math., 170, Providence RI: Amer. Math. Soc., 1994, pp. 23–49 5. Dafermos, M.: Stability and Instability of the Cauchy horizon for the spherically-symmetric Einstein-Maxwell-Scalar Field equations. To appear in Ann. of Math. 6. 
Finster, F., Kamran, N., Smoller, J., Yau, S.-T.: Nonexistence of time-periodic solutions of the Dirac equation in an axisymmetric black hole geometry. Commun. Pure Appl. Math. 53(7), 902–929 (2002) 7. Finster, F., Smoller, J., Yau, S.-T.: Non-existence of time-periodic solutions of the Dirac equation in a Reissner-Nordstr¨om black hole background. J. Math. Phys. 41(4), 2173–2194 (2000) 8. Finster, F., Smoller, J., Yau, S.-T.: Absence of stationary, spherically symmetric black hole solutions for Einstein-Dirac-Yang/Mills equations with angular momentum of the fermions. Adv. Theor. Math. Phys. 4, 1231–1257 (2000) 9. Finster, F., Smoller, J., Yau, S.-T.: The interaction of Dirac particles with non-abelian gauge fields and gravity–black holes. Mich. Math. J. 47, 199–208 (2000)


10. Finster, F., Smoller, J., Yau, S.-T.: Non-existence of black hole solutions for a spherically symmetric, static Einstein-Dirac-Maxwell system. Commun. Math. Phys. 205(2), 249–262 (1999) 11. Friedrich, H., R´acz, I., Wald, R.M.: On the rigidity theorem for spacetimes with a stationary event horizon or a compact cauchy horizon. Commun. Math. Phys. 204, 691–707 (1999) 12. Gibbons, G.W., Stewart, J.M.: Absense of asymptotically flat solutions of Einstein’s equations which are periodic and empty near infinity. Classical general relativity (London, 1983), Cambridge: Cambridge Univ. Press, 1984, pp. 77–94 13. Heusler, M.: A no-hair theorem for self-gravitating nonlinear sigma models. J. Math. Phys. 33(10), 3497–3502 (1992) ¨ 14. Papapetrou, A.: Uber periodische nichtsingul¨are L¨osungen in der allgemeinen Relativit¨atstheorie. Ann. Physik 20, 399–411 (1957) 15. Papapetrou, A.: Non-existence of periodically varying non-singular gravitational fields. In: Les th´eo´ ries relativistes de la gravitation (Royaumont, 1959), Editions du Centre National de la Recherche Scientifique, Paris, 1962, pp. 193–198 16. R´acz, I.: On further generalization of the rigidity theorem for spacetimes with a stationary event horizon or a compact Cauchy horizon. Class. Quant. Grav. 17, 153–78 (2000) 17. R´acz, I.: Symmetries of spacetime and their relation to initial value problems. Class. Quant. Grav. 18, 5103–5113 (2001) Communicated by H. Nicolai

Commun. Math. Phys. 238, 429–479 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0833-5


Ostwald Ripening for Dilute Systems Under Quasistationary Dynamics

Nicholas D. Alikakos1,2, Giorgio Fusco3

1 Department of Mathematics, University of Tennessee, Knoxville, TN 37996-1300, USA
2 University of Athens, Panepistimiopolis, 15484 Greece
3 Dipartimento di Matematica, Università di L'Aquila, Italy

Received: 14 November 2000 / Accepted: 26 January 2003 Published online: 16 June 2003 – © Springer-Verlag 2003

Abstract: We consider a dilute mixture in 3D of a finite number of particles initially close to spherical, but of varying size, evolving according to quasistationary dynamics. Under the scaling hypotheses that (1) size/distance and (2) deviation from sphericity/size are both initially small, we show robustness of the almost spherical shape by associating centers and radii to each particle. We then rigorously derive a set of O.D.E.'s for the radii, which we relate to the Lifschitz-Slyozov-Wagner theory of coarsening, and also establish an estimate for the speed of the centers.

0. Introduction

We begin with a description of the phenomenon and of the basic physics behind it, and then we introduce the mathematical model we will be analyzing. We refer the reader to [7, 13, 17]. We also point out some of the underlying simplifying assumptions. Consider a binary solution of species A and B (Al-Ni is an example) with a phase diagram which for simplicity we take as in Fig. 1. The axes represent the relative concentration $X_B = \frac{[B]}{[A]+[B]}$ and the temperature T. The mixture exists as liquid above the "liquidus" line and as solid below the "solidus" line. $T_1$, $T_2$ are the melting temperatures of A and B respectively. When the mixture, initially at M, is quenched at Q between the immiscibility lines, it decomposes, and ultimately reaches a thermodynamically stable state, partly liquid and partly solid, with constitution given by the phase diagram. The ratio of the solid phase α to the liquid phase β is $QR/PQ$, while the compositions are given by $X_S$ and $X_L$ respectively. The separation into two phases is a complicated phenomenon. It involves distinct mechanisms like spinodal decomposition and nucleation. During these stages and under the hypothesis of dilution, a large number of small particles β is generated.
In the present work we are not concerned with the initial separation phenomenon, but rather with a later stage, called Ostwald ripening, coarsening or aging, that is characterized by a decrease of the interfacial energy, via a transfer of mass away from the small β particles towards the larger β particles.

Fig. 1. The Phase Diagrams. The axes represent the relative concentration of B and the temperature. QR/PQ gives the ratio of solid phase α to liquid phase β with compositions $X_S$ and $X_L$ respectively

We now explain more carefully the physical mechanism. We will assume isotropic growth and so we consider almost spherical particles. The reader may recall that the Gibbs free energy contains the mechanical term $PV$. If the α phase is acted upon by a pressure of 1 atm, the β phase is subjected to an extra pressure $\Delta P$ due to the curvature of the α/β interface. If γ is the α/β interfacial energy and the particles are spherical with radius r, then $\Delta P = \frac{2\gamma}{r}$ (see footnote 1). We note that there are differences with two space dimensions. We refer to [48], where other references for work in 2D can also be found. In Fig. 2 we show the effect on the free energy due to the curvature. This free energy increase is known as the Gibbs-Thomson effect. A consequence of this is that the solubility of β is sensitive to the size of the β particles. From the common tangent construction (Fig. 2) we see that the concentration $X_B$, denoted by µ, of the solute B in α, in equilibrium with β across a curved interface is proportional to the (mean) curvature. It follows that mass will flow away from the small particles and towards the large ones. Thus the normal velocity V of the interface should be proportional to the concentration gradient $\frac{\partial\mu}{\partial n}$. This discussion ([7, pp 44–47, 314–317]) suggests that the concentration µ(x, t) satisfies:

\[
\begin{cases}
c\,\dfrac{\partial\mu}{\partial t} - \kappa\,\Delta\mu = 0, & \text{off } \Gamma(t) := \alpha/\beta \text{ interface},\\[4pt]
\mu = \lambda H, & \text{on } \Gamma(t) \text{ (Gibbs-Thomson)},\\[4pt]
\dfrac{\partial\mu}{\partial\nu} = 0, & \text{on } \partial\Omega,\\[4pt]
V = \left[\dfrac{\partial\mu}{\partial n}\right], & \text{on } \Gamma(t).
\end{cases}
\tag{0.1}
\]

Here Ω is a bounded smooth domain in $\mathbb{R}^3$, the container, H is the mean curvature of Γ(t), which is taken positive for a shrinking sphere, ν is the outward normal to ∂Ω, $\left[\frac{\partial\mu}{\partial n}\right] = \frac{\partial\mu_e}{\partial n_e} + \frac{\partial\mu_i}{\partial n_i}$, where $\mu_e$, $\mu_i$ are the restrictions of µ on $\Omega_e(t)$ and $\Omega_i(t)$, the exterior and interior of Γ(t) in Ω, $n_e$, $n_i$ are the unit exterior normals to $\Omega_e(t)$, $\Omega_i(t)$, V is the normal velocity taken positive for a shrinking sphere, and c, κ, λ > 0 are various physical constants.

Partially supported by NSF-DMS 9622791, and ENE 99/527 (funded by Greece and the European Community).
1 Consider an expanding balloon and equate the work due to the surface tension to the work done by the difference in pressure.

Under appropriate hypotheses the problem (0.1) can be replaced by

Fig. 2. The origin of particle coarsening. β with a small radius of curvature ($r_2$) has a higher molar free energy than β with a large radius of curvature ($r_1$). The concentration of solute is therefore highest outside the smallest particles (from [7])

the simpler quasistationary approximation which in dimensionless units takes the form

\[
\begin{cases}
-\Delta\mu(x,t) = 0, & \text{off } \Gamma(t), \text{ in } \Omega,\\[2pt]
\mu = H, & \text{on } \Gamma(t),\\[2pt]
\dfrac{\partial\mu}{\partial\nu} = 0, & \text{on } \partial\Omega,\\[4pt]
V = \left[\dfrac{\partial\mu}{\partial n}\right], & \text{on } \Gamma(t).
\end{cases}
\tag{0.2}
\]

Equation (0.2) in the context of binary alloys is known as the Mullins-Sekerka problem [12]. This evolution free boundary problem is the object of study in the present paper. We mention that (0.2) is also the sharp interface limit of the Cahn-Hilliard equation ([8–11]), which is an evolution equation for $X_B = \frac{[B]}{[A]+[B]}$ and describes well the various separation and coarsening phenomena on the line PQR (Fig. 1). We remark that it would be more appropriate to think of µ as the chemical potential in (0.2) above. There are various approximations involved in the derivation of (0.1), the most important by far being the assumption that the density is the same in both phases and that there is no convection. Also the elastic energy is completely ignored. Equation (0.2) is a volume preserving, perimeter shortening law:

\[
\frac{d}{dt}\,\mathrm{Per}(\Gamma(t)) = -2\int_{\Gamma} H V = -2\int_{\Omega} |\nabla\mu|^2 \le 0,
\qquad
\frac{d}{dt}\,\mathrm{Vol}(\Omega_i(t)) = -\int_{\Gamma} V = 0.
\tag{0.3}
\]

These calculations presuppose well posedness of (0.2). We refer the reader to [14, 15, 19] for the local theory, and to [36, 18] for global weak solutions.

In the present paper we consider an initial configuration of N almost spherical particles. Under the scaling hypotheses

\[
\frac{\text{size of particles}}{\text{distance between particles}} = O(\varepsilon),
\qquad
\frac{\text{size of particles}}{\text{distance of particles from } \partial\Omega} = O(\varepsilon),
\tag{0.4}
\]

\[
\frac{\text{deviation from sphericity}}{\text{size of particles}} = O(\varepsilon),
\tag{0.5}
\]

we obtain rigorous results without imposing any restriction or modification on (0.2). Specifically we establish that the particles continue (see footnote 2) to be close to the spherical shape, for all time. This global robustness result (note that patterns of unequal spherical particles are far from equilibrium) allows us to associate to each particle a center $\xi_i$ and a radius $\rho_i$. Our main result, stated loosely, says that for $0 < \varepsilon < \bar\varepsilon$, for some $\bar\varepsilon > 0$, after rescaling time by $t = \varepsilon^3\tau$,

\[
\frac{d\rho_i}{d\tau} = \frac{1}{\rho_i}\left(\frac{1}{\bar\rho} - \frac{1}{\rho_i}\right) + O(\varepsilon), \qquad i = 1, \dots, N \text{ (see footnote 3)},
\tag{0.6}
\]

where

\[
\bar\rho = \frac{1}{N}\sum_{i=1}^{N} \rho_i.
\tag{0.7}
\]

Equations (0.6) are justified for all τ > 0, and beyond singularities. Under the assumption that the initial radii $\rho_{0i}$, $i = 1, \dots, N$, satisfy the condition $\rho_{01} < \rho_{02} < \cdots < \rho_{0N}$, it is shown that there exist times $T_1 < T_2 < \dots < T_{N-1}$ such that $\rho_i \to 0$ as $\tau \to T_i$, and moreover that particles shrink like spheres. Equation (0.6) can be applied in the interval $[0, T_1)$. At time $T_1$ a singularity occurs and particle 1 shrinks to a point. Equation (0.6) can then be applied in $[T_1, T_2)$ for $i = 2, \dots, N$ and $\bar\rho = \frac{1}{N-1}\sum_{i=2}^{N}\rho_i$, and then in $[T_2, T_3)$ for $i = 3, \dots, N$ and $\bar\rho = \frac{1}{N-2}\sum_{i=3}^{N}\rho_i$, and so on until for $\tau > T_{N-1}$ only the largest particle survives. In the limit ε → 0, Eqs. (0.6) state that at each time any given particle shrinks or expands according to whether it is smaller or larger than the average size. In the same limit solutions to (0.6) satisfy

\[
\frac{d}{dt}\sum_{i=1}^{N}\rho_i^3 = 0 \quad \text{(Volume Conservation)},
\qquad
\frac{d}{dt}\sum_{i=1}^{N}\rho_i^2 \le 0 \quad \text{(Perimeter Shortening)}.
\tag{0.8}
\]

2 The initial configuration of exact but unequal spheres gets immediately distorted. The spherical class is not invariant.
3 Equations (0.6) were brought to our attention by P. Bates and P. Fife at a meeting in 1991 in Crete.

Relations (0.8) are inherited from (0.3). Equations (0.6) provide an enormous reduction of (0.2) to finite dimensions. They are also related to the celebrated Lifschitz-Slyozov-Wagner (LSW) theory of coarsening ([3, 4]). These works were the first where a successful quantitative analysis of the phenomenon of Ostwald ripening (known since 1901, [1]) was carried out. Under various assumptions the LSW theory derives formally

effective equations of the evolution of a spherical particle coupled with an external field representing the effect of the rest of the particles. The point of departure in [3, 4] is (0.2). They derive the equation

\[
\frac{\partial n}{\partial t} + \frac{\partial}{\partial R}\bigl(J(R)\,n\bigr) = 0, \qquad n = n(R, t),
\tag{0.9}
\]

where

\[
\frac{dR}{dt} = J(R) = \frac{1}{R}\left(\frac{1}{\bar R} - \frac{1}{R}\right),
\tag{0.10}
\]

\[
\bar R = \frac{\int_0^\infty R\,n(R,t)\,dR}{\int_0^\infty n(R,t)\,dR}.
\tag{0.11}
\]

Here n(R, t) is a density, giving the percentage of particles between $R_1$ and $R_2$: $\int_{R_1}^{R_2} n(R,t)\,dR$. System (0.9)–(0.11) then is analyzed and it is argued that it possesses a self-similar solution

\[
n_s(R, t) \cong \frac{1}{t^{4/3}}\; g\!\left(\frac{R}{\bar R(t)}\right)
\tag{0.12}
\]

that captures the typical behavior of the system for intermediate times. Based on this the following temporal laws are derived for the average radius

\[
\bar R(t) \cong \left(\bar R^3(0) + \frac{4}{9}\,t\right)^{1/3}
\tag{0.13}
\]

and for the total number of particles

\[
N(t) = \left(\bar R^3(0) + \frac{4}{9}\,t\right)^{-1}.
\tag{0.14}
\]
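The temporal laws (0.13) and (0.14) can be tabulated directly. The following is only an illustrative sketch: the function names and the sample times are our own choices, not part of the LSW theory. It shows that, in these normalized units, the product $N(t)\,\bar R^3(t)$ stays equal to 1, which is the volume conservation hidden in the two laws.

```python
def lsw_mean_radius(r0, t):
    """Average radius from (0.13): (R(0)^3 + 4t/9)^(1/3)."""
    return (r0 ** 3 + 4.0 * t / 9.0) ** (1.0 / 3.0)


def lsw_particle_number(r0, t):
    """Total number of particles from (0.14): (R(0)^3 + 4t/9)^(-1)."""
    return 1.0 / (r0 ** 3 + 4.0 * t / 9.0)


if __name__ == "__main__":
    # Hypothetical sample times; the initial mean radius 1.0 is an assumption.
    for t in (0.0, 1.0, 10.0, 100.0):
        rbar = lsw_mean_radius(1.0, t)
        n = lsw_particle_number(1.0, t)
        print(f"t={t:6.1f}  Rbar={rbar:.4f}  N={n:.4f}  N*Rbar^3={n * rbar ** 3:.4f}")
```

The mean radius grows like $t^{1/3}$ while the number of particles decays like $t^{-1}$.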

We now present the beautiful idea of LSW ([37]) that leads to the kinetic equation

\[
\frac{dR}{dt} = \frac{1}{R}\left(\frac{1}{\bar R} - \frac{1}{R}\right).
\tag{0.15}
\]

Applying (0.2) for Γ(t) a sphere of radius R(t), and Ω the whole space with the boundary condition $\mu_\infty(t)$ at infinity representing the effect of the rest of the "particles" we obtain

\[
\begin{cases}
-\Delta\mu(r, t) = 0, & r \ne R(t),\\[2pt]
\mu(R, t) = \dfrac{1}{R},\\[6pt]
\lim\limits_{r\to\infty} \mu(r, t) = \mu_\infty(t),
\end{cases}
\tag{0.16}
\]

\[
V = -\left[\frac{\partial\mu}{\partial r}\right]_{r=R}.
\tag{0.17}
\]

Solving (0.16) we obtain

\[
\mu(r, t) = \frac{1}{R(t)}, \quad r \le R(t),
\qquad
\mu(r, t) = \frac{1 - \mu_\infty(t) R(t)}{r} + \mu_\infty(t), \quad r \ge R(t).
\tag{0.18}
\]
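As a sanity check on (0.18), one can verify numerically that the exterior branch is harmonic and that its radial derivative at r = R reproduces the growth rate $\frac{1}{R}\bigl(\mu_\infty - \frac{1}{R}\bigr)$. The sketch below uses plain finite differences; the helper names, step sizes and sample values are assumptions made for illustration only.

```python
def mu_exterior(r, radius, mu_inf):
    """Exterior branch of (0.18): (1 - mu_inf * R) / r + mu_inf."""
    return (1.0 - mu_inf * radius) / r + mu_inf


def radial_laplacian(f, r, h=1e-3):
    """Central-difference approximation of f'' + (2/r) f' for a radial function."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / h ** 2
    return d2 + 2.0 * d1 / r


def growth_rate(radius, mu_inf, h=1e-6):
    """dR/dt = -V: the interior branch is constant, so the jump of the radial
    derivative at r = R is the one-sided exterior derivative."""
    return (mu_exterior(radius + h, radius, mu_inf)
            - mu_exterior(radius, radius, mu_inf)) / h


if __name__ == "__main__":
    radius, mu_inf = 0.5, 1.0  # hypothetical values: a particle smaller than 1/mu_inf
    print(radial_laplacian(lambda r: mu_exterior(r, radius, mu_inf), 2.0))
    print(growth_rate(radius, mu_inf), (1.0 / radius) * (mu_inf - 1.0 / radius))
```

A particle with $R < 1/\mu_\infty$ has negative growth rate and shrinks, while one with $R > 1/\mu_\infty$ grows.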

Thus (0.17) gives

\[
J(R) = \frac{dR}{dt} = -V = -\frac{1}{R^2}\,(1 - \mu_\infty R).
\tag{0.19}
\]

Utilizing the conservation of volume we have

\[
0 = \frac{d}{dt}\int_0^\infty n R^3\,dR = \int_0^\infty n_t R^3\,dR
= -\int_0^\infty \frac{\partial}{\partial R}\bigl(J(R)\,n\bigr) R^3\,dR
= -3\int_0^\infty n\,(1 - \mu_\infty R)\,dR.
\tag{0.20}
\]

Hence

\[
\mu_\infty = \frac{\int_0^\infty n\,dR}{\int_0^\infty n R\,dR},
\tag{0.21}
\]

that is, $\mu_\infty = 1/\bar R$ by (0.11), from which (0.15) follows. Niethammer [6] has given a rigorous argument for (0.15). However she had to modify (0.2) by replacing (0.2)(iv) by $V = \frac{1}{|\Gamma|}\int_{\Gamma}\left[\frac{\partial\mu}{\partial n}\right]$. This isotropic approximation allows her to restrict attention to the class of spheres. [6] also requires that the centers $\xi_i$ remain fixed. As is pointed out in [41] the centers do move but in general slower than the radii. In the present paper we show that for $0 < \varepsilon < \bar\varepsilon$, for some $\bar\varepsilon > 0$,

\[
\frac{d\xi_i}{d\tau} = O(\varepsilon^3)
\tag{0.22}
\]

globally in time. We remark that in a companion paper to the present paper [49] we obtain precise expressions for the O-terms in (0.6), (0.22) which render a correction to the equations that depends on the distance between the particles. These results were announced in [20].

The present paper is structured as follows. In Sect. 1 we give a precise statement of the main result (Theorem 1.1) and analyze the principal term in (0.6). The relationship with Eqs. (0.6) is made precise in Proposition 10.1, and in Theorem 1.2. The rest of the paper is devoted to the proof of Theorem 1.1. The main steps of the proof are as follows:

1 – Recall (cfr. [33]) that classical potential theory [46] implies that the quasi-static free boundary problem (0.2) can be reformulated as an integro-differential equation that only involves the interface Γ = Γ(t):

\[
S(V) = H - \bar H,
\tag{0.23}
\]

where, as in (0.2), V is the normal velocity of Γ, H is the mean curvature of Γ, $\bar H$ the average of H on Γ and S is an integral operator: the Neumann-Dirichlet operator. This is done in Sect. 3 (cfr. Proposition 3.1). The operator S defined by the left-hand side of (3.3) is introduced in Sect. 6.


2 – Introduce a suitable representation for = (t). As described in Sect. 0 the scope of the paper is to study the dynamics of (0.2) under the assumption that the initial interface 0 is the union of N ≥ 1 small and well separated quasi-spherical surfaces 0i , i = 1, · · · , N : 0 = N i=1 0i . Therefore a proper choice of the “coordinates” used to represent a quasi-spherical surface i is essential for the analysis. It is natural to associate to i a center ξi , a radius ρi and a function ri that describes the “distortion from sphericity”. We show in Sect. 4 (cfr. Proposition 4.1) that there are uniquely determined ξi , ρi , ri such that i = x/x = ξi + ρi (1 + ri (u))u, u ∈ S 2 , (0.24) where S 2 ⊂ R3 is the unit sphere, provided we require that ri : S 2 → R satisfies the orthogonality conditions ri (u)du = 0; ri (u) u, ej du = 0, j = 1, 2, 3, (0.25) S2

S2

3 where ej j =1 is the standard basis in R3 . With the representation (0.24) the unknowns of the problem become the 3N functions ξi = ξi (t), ρi = ρi (t), ri = ri (u, t), i = 1, · · · , N, that through (0.24) describe the time evolution of = N i=1 i . We work under the scalings (0.4) and (0.5) where it is assumed that ε > 0 is a small parameter. We introduce the parameter ε in the problem by replacing ρi with ερi and ri with εri in (0.24): iε = x/x = ξi + ερi (1 + εri (u))u, u ∈ S 2 . (0.26) 3 – Transform (0.23) into a system of evolution equations for the unknowns ξ = (ξ1 , · · · , ξN ), ρ = (ρ1 , · · · , ρN ), r = (r1 , · · · , rN ). This is quite an elaborate process which is accomplished in Sect. 5–9 and is divided in several steps: a): In Sect. 5 we give the expression of H in terms of ρ and r (cfr. Proposition 5.1). This is a standard computation that yields 1 Hi = (1 − εLri ) + · · · , (0.27) ερi where Hi is the restriction of H to i , L = s + 2I is the Jacobi operator, s being Laplace-Beltrami on S 2 , and where the dots denote higher order terms in ε. b): In Sect. 6 we discuss the linear operator S and its inverse T , the Dirichlet-Neumann operator which is needed to solve (0.23) for V . For small ε > 0 the operator T is well approximated by the operator T0 which is the analog of T for the special case = R3 , = S 2 and plays an important role in the analysis that follows. We list in Proposition 6.2 the main properties of T0 . The analysis in Sect. 6 also includes a discussion of the operator A = T0 L (cfr. Theorem 6.3) which will be the main term in the linear part of the evolution equation for r. c): In Sect. 7, cfr. 
Proposition 7.1, we derive the expression of Vi , the restriction of V i to i , as a function of the time derivatives ξ˙i , ρ˙i , ∂r ∂t , of the unknowns ξi , ρi , ri and show that, provided ε > 0 is sufficiently small, given a function Zi : S 2 → R, the equation ∂ri Vi (ξi , ρi , ri , ξ˙i , ρ˙i , ) = Zi , (0.28) ∂t uniquely determines ξ˙i , ρ˙i , ∂ri as a function of ξi , ρi , ri , Zi . ∂t


d): In Sect. 8, using the result in Sect. 6 and in particular the properties of the operator T0 , we show (cfr. Proposition 8.1) that given a function W = (W1 , · · · , WN ), Wi ∈ C 1+α (S 2 ) the equation S(V ) = W,

(0.29)

can be solved for V = (V1 , · · · , VN ), Vi ∈ C α (S 2 ). This requires elaborate computations and some regularity results for integral operators for which we use results from [27]. The main result in Sect. 8 is (cfr. 8.5) 3ri (v) − ri (· ) ε ερi Vi = T0 Wi + T0 (T0 Wi )(v)dv + K + · · · , (0.30) 2 2 4π |· −v| S where K is a constant which depends on T0 Wh , h = 1, · · · , N and is explicitly given. The dots denote higher order terms in ε which are estimated. e): In Sect. 9 we set Wi = Hi − H¯ in (0.30) and using the expression of Hi derived in Sect. 5 we obtain an explicit formula giving Vi as a function of ξ, ρ, r (cfr. Proposition 9.2). This function is finally inserted in the place of Zi in the expressions i for ξ˙i , ρ˙i , ∂r ∂t derived in Sect. 7 so obtaining the sought system of evolution equations for ξ, ρ, r. After rescaling time by t = ε 3 τ this system takes the form (cfr. Proposition 10.1)  dξ 3 ξ i   dτ = ε fi ,  dρ ρ ρi 1 i = − 1 + ε 3 fi , (0.31) dτ ρi2 ρ¯    ∂ri = 13 Ari + l.o.t + ε 2 f r , i ∂t ρ i

ξ

ρ

where l.o.t denotes linear lower order terms and fi , fi , fir are functions of ξ, ρ, r which are estimated and are bounded if r is bounded. 4 – Derive a bound for r. The main step in the proof of Theorem 1.1 is to prove that r is bounded. This is done in Proposition 10.6. Then by Proposition 10.1 the justification of the equations for the radii follows. To obtain this bound we use in Sect. 10 a suitable functional-analytic setting for the evolution equation for r in (0.31). Here it is essential to use the optimal regularity theory of Da Prato and Grisvard [29, 32] which provides the appropriate semigroup setting and makes available the variation of constants formula. In fact, as discussed in Sect. 6 the operator A maps C 3+α (S 2 ) functions into C α (S 2 ) functions and, on the other hand, from the analysis in Sect. 9 (for fixed ξ, ρ) fir is a smooth function from C 3+α (S 2 ) to C α (S 2 ). The key result here is a linear fact stated in Sect. 6 and proved in [19] and independently in [14] that the operator A has the optimal regularity property with respect to the pair E0 = hα (S 2 ), E1 = h3+α (S 2 ), where hk+α space of Da Prato and Grisvard that is the completion with respect is the little H older ¨ k+α to the C norm of the set of C ∞ functions. This bound on r implies the robustness of the spherical shape. Xinfu Chen [15] and independently Constantine and Pugh [16] have established the stability of the single sphere equilibrium. Later Escher and Simonett [19] reestablished this result in a different more general way. Our result, in contradistinction to the above, addresses systems of arbitrary number of components where the situation is very different. First we note that the equilibrium states are equal sphere configuration, which for two or more spheres are unstable. A configuration of unequal spheres is generally far from equilibrium. What


makes the analysis harder is that the class of unequal spheres is not positively invariant under the dynamical system. Nevertheless we show, for arbitrary initial data of unequal spheres, that the distortion away from sphericity is small, globally in time. A statement of the results of the present paper with a sketch of proofs, figures, and other relevant facts was first presented in [40].

1. Statement of the Main Results

The proofs of the theorem and the corollary below are the main purpose of this paper.

Theorem 1.1. Let $\Omega \subset \mathbb{R}^3$ be a bounded connected and smooth domain. Assume that $\Gamma^\circ = \cup_{i=1}^{N}\Gamma_i^\circ$, N ≥ 2, where $\Gamma_i^\circ$ is as in (0.26), with $\xi_{i0} \in \Omega$, $\rho_{i0} > 0$, and $r_{i0} \in C^{3+\alpha}(S^2)$ with α ∈ (0, 1) satisfying (0.25). Assume $\xi_{i0} \ne \xi_{j0}$ for i ≠ j, and $\rho_{10} < \rho_{20} < \cdots < \rho_{N0}$. There is $\bar\varepsilon > 0$ such that, if $0 < \varepsilon < \bar\varepsilon$, then the solution $t \mapsto \Gamma(t)$ of the Mullins-Sekerka problem for this class of initial conditions $\Gamma^\circ$ can be represented in ξ, ρ and r as in (0.26) above and exists globally as a weak solution in the sense of Chen [18, 36]. Moreover the following hold:

(i) The solution is classical except at the times $T_1 < T_2 < \dots < T_{N-1}$ which are characterized via

\[
\lim_{t\to T_i} \rho_i(t) = 0, \qquad i = 1, \dots, N-1.
\]

Moreover

\[
T_i = O(\varepsilon^3), \qquad i = 1, \dots, N-1.
\tag{1.1}
\]

(ii) There exist constants $C_\rho, C_r > 0$, depending only on $\frac{\rho_{i0}}{\rho_{10}}$, i = 2, . . . , N, such that

\[
\varepsilon\dot\rho_h = \frac{1}{\varepsilon\rho_h}\left(\frac{1}{\varepsilon\bar\rho} - \frac{1}{\varepsilon\rho_h} + \frac{g_h}{\varepsilon\rho_h}\right),
\tag{1.2}
\]

with $|g_h| < C_\rho\,\frac{\rho_h}{\bar\rho}\,\varepsilon + O\bigl(\varepsilon^2 \|r_h\|^2_{C^{3+\alpha}(S^2)}\bigr)$, $i \le h \le N$; $T_{i-1} < t < T_i$; $i = 1, \dots, N$, where $\bar\rho = \frac{1}{N-i+1}\sum_{h=i}^{N}\rho_h$,

\[
\|r_i(t)\|_{C^{3+\alpha}(S^2)} < C_r,
\tag{1.3}
\]

as long as $r_i$ is defined. Moreover

\[
\lim_{t\to T_i} r_i(u, t) = 0, \quad i = 1, \dots, N-1, \qquad \lim_{t\to+\infty} r_N(u, t) = 0.
\tag{1.4}
\]

Finally

\[
\frac{d\xi_i}{dt} = O(1)
\tag{1.5}
\]

globally in time.

Theorem 1.2. Consider the ODE's,

\[
\begin{cases}
\varepsilon\dot R_h = \dfrac{1}{\varepsilon R_h}\left(\dfrac{1}{\varepsilon\bar R} - \dfrac{1}{\varepsilon R_h}\right), \quad i \le h \le N;\ \hat T_{i-1} < t < \hat T_i; \qquad \varepsilon\bar R = \dfrac{1}{N-i+1}\sum\limits_{h=i}^{N}\varepsilon R_h,\\[8pt]
\varepsilon R_i(0) = \varepsilon\rho_{i0}, \quad i = 1, \dots, N,
\end{cases}
\tag{1.6}
\]

where $\hat T_1 \le \hat T_2 \le \dots \le \hat T_{N-1}$ are the extinction times for the system (1.6), defined by $\lim_{t\to\hat T_i} R_i(t) = 0$. Then there are constants $C_T$, $C_R$ depending only on $\frac{\rho_{i0}}{\rho_{10}}$, i = 2, . . . , N, such that

\[
\begin{aligned}
|\hat T_i - T_i| &< \varepsilon^4 C_T, && i = 1, \dots, N-1,\\
|\varepsilon\rho_i(t) - \varepsilon R_i(t)| &< \varepsilon^{1+\frac13} C_R, && t \in [0, \hat T_i - \varepsilon^4 C_T),\ i = 1, \dots, N-1,\\
|\varepsilon\rho_N(t) - \varepsilon R_N(t)| &< \varepsilon^{1+\frac13} C_R, && t \ge 0.
\end{aligned}
\tag{1.7}
\]

Remarks on the Scaling. As we have indicated above there are three characteristic length scales in the problem: the size of the particle, the distortion from the spherical shape and the distance between the particles or between the particle and ∂Ω. For simplicity we have taken all three ratios (see (0.4), (0.5) above) to be of the same order of magnitude. The equations and the estimates are all invariant under the homothetic transformation $\Omega \to \frac{1}{\varepsilon}\Omega$. The size hypotheses on the initial conditions involve ratios and are invariant under this transformation. There are two possible, equivalent normalizations that are equally natural to adopt.

a) size of particle O(ε), size of domain O(1), distortion like $O(\varepsilon^2)$. This corresponds to $\xi_i \to \xi_i$, $\rho_i \to \varepsilon\rho_i$, $r_i \to \varepsilon r_i$ in (0.24).
b) size of particle O(1), size of domain $O(\frac{1}{\varepsilon})$, distortion as O(ε). This corresponds to $\xi_i \to \frac{\xi_i}{\varepsilon}$, $\rho_i \to \rho_i$, $r_i \to \varepsilon r_i$ in (0.24).

Consider Eqs. (1.2), which correspond to the first scaling: $g_i = O(\varepsilon)$ and so $\frac{g_i}{\varepsilon\rho_i}$ is O(1), small in relation to the main term which is like $\frac{1}{\varepsilon}$. Under this scaling the extinction of the particle happens at $O(\varepsilon^3)$ time. Under the second scaling $\frac{g_i}{\rho_i}$ is O(ε), which is small compared to the main term that is O(1). In the second scaling the extinction time is O(1). For definiteness we will work with the scaling a).

2. The ODE's – The Equation of Ostwald Ripening

We begin by analyzing the system of ODE's associated to (1.6) which after rescaling time by setting $t = \varepsilon^3\tau$ can be rewritten in the form

\[
\begin{cases}
\dfrac{dR_h}{d\tau} = \dfrac{1}{R_h}\left(\dfrac{1}{\bar R} - \dfrac{1}{R_h}\right), \quad i \le h \le N,\ \hat T_{i-1} < \tau < \hat T_i,\ i = 1, \dots, N,\\[8pt]
R_i(0) = \rho_{i0}.
\end{cases}
\tag{2.1}
\]

We assume $\rho_{10} \le \rho_{20} \le \cdots \le \rho_{N0}$.

Proposition 2.1. The system (2.1) has the following properties:

(i) If $\rho_{i0} < \rho_{j0}$, then $R_i(\tau) < R_j(\tau)$ on their common domain of existence.

(ii) $\sum_{i=1}^{N} R_i^3(\tau) = \sum_{i=1}^{N}\rho_{i0}^3$ (Volume conservation).

(iii) $\sum_{i=1}^{N} R_i^2(\tau)$ is decreasing (Perimeter Shortening). Moreover away from extinction times we have $\frac{d^2}{d\tau^2}\sum_{i=1}^{N} R_i^2(\tau) \le 0$.

(iv) $R_1(\tau)$ is nonincreasing in time and $R_N(\tau)$ is nondecreasing.

(v) Assume that $\rho_{N-1,0} < \rho_{N0}$. Then all except the N-th particle get extinct in finite times $\hat T_1 \le \hat T_2 \le \cdots \le \hat T_{N-1}$. Moreover we have the estimates

\[
\frac13 R_1^3(0) \le \hat T_1 \le \frac13 R_1^3(0)\,\frac{N^{\frac23}\left(\frac{3}{4\pi}V\right)^{\frac13}}{R_N(0) - R_1(0)},
\]

\[
\frac13 R_i^3(\hat T_{i-1}) \le \hat T_i - \hat T_{i-1} \le \frac13 R_i^3(\hat T_{i-1})\,\frac{(N-i+1)^{\frac23}\left(\frac{3}{4\pi}V\right)^{\frac13}}{R_N(\hat T_{i-1}) - R_i(\hat T_{i-1})},
\]

where $V = \frac43\pi\sum_{i=1}^{N}\rho_{i0}^3$.

(vi) The system is invariant under the scaling $R_i(\tau) \to \lambda^{-1/3} R_i(\lambda\tau)$.

(vii) $\frac{d\bar R}{d\tau} \le 0$ away from the extinction times.

Proof. (i) Suppose that there is $\tau^* > 0$ such that $R_i(\tau^*) = R_j(\tau^*) > 0$. Then we note that $R_i(\tau)$ and $R_j(\tau)$ satisfy the same equation. By uniqueness we conclude that $R_i(0) = R_j(0)$, contradicting the assumption.

(ii) Between extinction times

\[
\frac{d}{d\tau}\,\frac13\sum_{i=1}^{N} R_i^3(\tau) = \sum_{i=1}^{N} R_i^2\,\frac{dR_i}{d\tau} = \sum_{i=1}^{N} R_i\left(\frac{1}{\bar R} - \frac{1}{R_i}\right) = \frac{1}{\bar R}\sum_{i=1}^{N} R_i - N = 0.
\]

The conservation of volume follows from this and the continuity of volume.

(iii) Between extinction times we have

\[
\frac{d}{d\tau}\,\frac12\sum_{i=1}^{N} R_i^2(\tau) = \sum_{i=1}^{N} R_i\,\frac{dR_i}{d\tau} = \sum_{i=1}^{N}\left(\frac{1}{\bar R} - \frac{1}{R_i}\right) = \frac{N}{\bar R} - \sum_{i=1}^{N}\frac{1}{R_i} \le 0.
\]

The statement now follows by the continuity of the perimeter. Between extinction times we have

\[
\frac{d^2}{d\tau^2}\,\frac12\sum_{i=1}^{N} R_i^2(\tau) = \frac{d}{d\tau}\left(\frac{N}{\bar R} - \sum_{i=1}^{N}\frac{1}{R_i}\right) = -\sum_{i=1}^{N}\left(\frac{1}{\bar R^2} - \frac{1}{R_i^2}\right)\frac{dR_i}{d\tau}
= -\sum_{i=1}^{N}\frac{1}{R_i}\left(\frac{1}{\bar R} + \frac{1}{R_i}\right)\left(\frac{1}{R_i} - \frac{1}{\bar R}\right)^2 \le 0.
\]

(iv) By utilizing Eq. (2.1) and the fact that $R_1(\tau) \le \bar R(\tau) \le R_N(\tau)$, we obtain the desired result.

(v) We have

\[
\frac{dR_1}{d\tau} = \frac{1}{R_1}\left(\frac{1}{\bar R} - \frac{1}{R_1}\right) \ge -\frac{1}{R_1^2}.
\]

Integrating we obtain $R_1^3(\tau) \ge -3\tau + R_1^3(0)$. From this we obtain the lower bound. Next notice that

\[
R_1 - \bar R = \frac1N\bigl[(R_1 - R_2) + \cdots + (R_1 - R_N)\bigr] \le \frac1N (R_1 - R_N).
\]

We will be utilizing this estimate as follows:

\[
\frac{dR_1}{d\tau} = \frac{1}{R_1^2}\,\frac{R_1 - \bar R}{\bar R} \le \frac1N\,\frac{R_1 - R_N}{R_1^2\,\bar R} \le \frac1N\,\frac{R_1(0) - R_N(0)}{R_1^2\,\bar R} = \frac{R_1(0) - R_N(0)}{R_1^2\sum_{i=1}^{N} R_i}.
\]

In the above we have utilized also (iv). By

\[
\frac1N\sum_{i=1}^{N} R_i \le \left(\frac1N\sum_{i=1}^{N} R_i^3\right)^{\frac13}
\]

we obtain

\[
\frac{dR_1}{d\tau} \le \frac{R_1(0) - R_N(0)}{R_1^2\,N^{\frac23}\left(\frac{3}{4\pi}V\right)^{\frac13}},
\]

which after integrating gives the upper bound in (v). The rest of (v) is straightforward and is omitted.

(vi) This is a straightforward verification.

(vii) Between extinction times we have

\[
\frac{d\bar R}{d\tau} = \frac1N\left[\frac{1}{\bar R}\sum_{i=1}^{N}\frac{1}{R_i} - \sum_{i=1}^{N}\frac{1}{R_i^2}\right] \le \frac1N\left[\frac1N\left(\sum_{i=1}^{N}\frac{1}{R_i}\right)^2 - \sum_{i=1}^{N}\frac{1}{R_i^2}\right] \le 0
\]

(by Jensen's Inequality). The proof of the proposition is complete.
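Properties (ii), (iii) and (iv) can be observed numerically by integrating (2.1) on an interval that stays before the first extinction time. The following is a minimal sketch, not part of the proof: the classical Runge-Kutta scheme, the step size and the initial radii are our own assumptions, and the lower bound $\hat T_1 \ge \frac13 R_1^3(0)$ from (v) is used only to pick a safe final time.

```python
def mean_field_rhs(radii):
    """Right-hand side of (2.1) for the particles that are still present."""
    rbar = sum(radii) / len(radii)
    return [(1.0 / r) * (1.0 / rbar - 1.0 / r) for r in radii]


def rk4_step(radii, dt):
    """One classical fourth-order Runge-Kutta step for (2.1)."""
    def shifted(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]

    k1 = mean_field_rhs(radii)
    k2 = mean_field_rhs(shifted(radii, k1, dt / 2.0))
    k3 = mean_field_rhs(shifted(radii, k2, dt / 2.0))
    k4 = mean_field_rhs(shifted(radii, k3, dt))
    return [r + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for r, a, b, c, d in zip(radii, k1, k2, k3, k4)]


def integrate(radii, t_end, dt=1e-3):
    """Integrate (2.1) up to t_end, assumed smaller than the first extinction time."""
    history = [(0.0, list(radii))]
    steps = int(round(t_end / dt))
    for n in range(1, steps + 1):
        radii = rk4_step(radii, dt)
        history.append((n * dt, list(radii)))
    return history


if __name__ == "__main__":
    # Hypothetical initial radii; the first extinction time is at least 1/3 here.
    for tau, radii in integrate([1.0, 1.2, 1.5], 0.1)[::25]:
        volume = sum(r ** 3 for r in radii)
        perimeter = sum(r ** 2 for r in radii)
        print(f"tau={tau:.3f}  sum R^3={volume:.10f}  sum R^2={perimeter:.6f}")
```

In such a run the sum of cubes stays constant up to integration error while the sum of squares decreases, and the smallest radius decreases while the largest increases.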

Note. At time $\hat T_i$, $\bar R$ has a positive jump:

\[
\bar R(\hat T_i^{+}) - \bar R(\hat T_i^{-}) = \frac{1}{(N-i)(N-i+1)}\sum_{j=i+1}^{N} R_j, \qquad i = 1, \dots, N-1.
\]

As a result $\bar R$ eventually increases in spite of the fact that $\frac{d\bar R}{d\tau} \le 0$ during a smooth phase.
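The jump formula is a one-line computation: just before $\hat T_i$ the average is taken over $N-i+1$ particles, one of which has radius zero, and just after it is taken over the $N-i$ survivors. A small sketch (function names and sample radii are our own, chosen only for illustration) compares the direct difference of the two averages with the closed form.

```python
def mean_radius_jump(survivors):
    """Difference of the averages after and before the extinction of one particle.

    `survivors` holds the radii R_{i+1}, ..., R_N at the extinction time.
    """
    k = len(survivors)                  # k = N - i
    before = sum(survivors) / (k + 1)   # vanished particle contributes radius 0
    after = sum(survivors) / k
    return after - before


def jump_formula(survivors):
    """Closed form from the Note: sum of survivors / ((N - i)(N - i + 1))."""
    k = len(survivors)
    return sum(survivors) / (k * (k + 1))


if __name__ == "__main__":
    radii = [1.2, 1.5]  # hypothetical surviving radii
    print(mean_radius_jump(radii), jump_formula(radii))
```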

Note. The estimate

\[
c_1 (\hat T_1 - \tau)^{\frac13} < R_1(\tau) < C_1 (\hat T_1 - \tau)^{\frac13}
\]

holds, together with analogous estimates for $R_2, \dots, R_{N-1}$. It is established in the course of the proof of Proposition 10.1.

3. The Integral Equation Formulation

In this section we reduce (0.2) to a problem that lives entirely on the interface Γ. This is mostly known ([33]). We give details and related facts for completeness.

Let g be the Green's function associated to

\[
\begin{cases}
-\Delta u = f, & \text{in } \Omega,\\[2pt]
\dfrac{\partial u}{\partial n} = 0, & \text{on } \partial\Omega,\\[6pt]
\displaystyle\int_\Omega u\,dx = \int_\Omega f\,dx = 0,
\end{cases}
\tag{3.1}
\]

where Ω is as in (0.1). In what follows Ω is taken, more generally, to be an open, bounded set, connected and smooth in $\mathbb{R}^m$, m ≥ 3. g is given by

\[
\begin{cases}
-\Delta_y g(x, y) = \delta_x(y) - \dfrac{1}{|\Omega|}, & \text{in } \Omega,\ x \in \Omega,\\[6pt]
\dfrac{\partial g}{\partial n_y} = 0, & \text{on } \partial\Omega,\\[6pt]
\displaystyle\int_\Omega g(x, y)\,dx = 0,
\end{cases}
\tag{3.2}
\]

where $\delta_x(y)$ is the Dirac δ supported at x ∈ Ω.

Proposition 3.1. The Mullins-Sekerka problem (0.2) can be formulated as an integral equation in the class of $C^{3+\alpha}$ interfaces:

\[
\int_{\Gamma(t)} g(z, y) V(y)\,dS_y - \frac{1}{|\Gamma(t)|}\int_{\Gamma(t)}\int_{\Gamma(t)} g(x, y) V(y)\,dS_y\,dS_x = H(z) - \bar H,
\tag{3.3}
\]

where |Γ| stands for the (m−1)-dimensional Hausdorff measure of Γ, and $\bar H = \frac{1}{|\Gamma|}\int_\Gamma H\,dS_y$.

Proof. We begin by assuming that x ∉ Γ. We denote by $\Omega_i$ and $\Omega_e$ the interior and the exterior of Γ in Ω. Multiplying (3.2) by µ(y) and (0.2) by g(x, y), integrating with respect to y over $\Omega_i$ and $\Omega_e$, and subtracting the equations we obtain via Green's theorem

\[
\int_\Gamma g(x, y) V(y)\,dS_y = \mu(x) - \bar\mu,
\tag{3.4}
\]

where $\bar\mu$ denotes the average of µ over the set Ω on which it is defined. Taking now the limit as x → z ∈ Γ, we obtain

\[
\int_\Gamma g(z, y) V(y)\,dS_y = H(z) - \bar\mu.
\tag{3.5}
\]

Finally integrating (3.5) in z over Γ we obtain

\[
\int_\Gamma\int_\Gamma g(z, y) V(y)\,dS_y\,dS_z = |\Gamma|\bar H - \bar\mu|\Gamma|.
\tag{3.6}
\]

Solving (3.6) for $\bar\mu$ and substituting in (3.5) renders (3.3).

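Remarks. Note that

\[
g(x, y) =: \frac{1}{4\pi|x - y|} + \gamma(x, y),
\tag{3.7}
\]

where γ satisfies

\[
\begin{cases}
-\Delta_y\gamma(x, y) = -\dfrac{1}{|\Omega|}, & x \in \Omega_l,\ y \in \Omega,\\[6pt]
\dfrac{\partial\gamma(x, y)}{\partial n_y} = -\dfrac{\partial}{\partial n_y}\left(\dfrac{1}{4\pi|x - y|}\right), & x \in \Omega_l,\ y \in \partial\Omega,\\[8pt]
\displaystyle\int_\Omega \gamma(x, y)\,dy = -\int_\Omega \frac{1}{4\pi|x - y|}\,dy,
\end{cases}
\tag{3.8}
\]

where $\Omega_l = \{x \in \Omega \mid d(x, \partial\Omega) \ge l > 0\}$, l fixed. Clearly γ is regular and the following estimates hold

\[
\max_y |\gamma(x, y)| < \frac{c}{l}, \qquad \max_y |\nabla_y\gamma(x, y)| < \frac{c}{l^2},
\]

where l emphasizes that γ is like $\frac{1}{\text{length}}$. Notice that under the scaling $\Omega \to \frac{1}{\varepsilon}\Omega$, $\gamma_\varepsilon(x, y) = \varepsilon\gamma(\varepsilon x, \varepsilon y)$ and so

\[
\max_y |\gamma_\varepsilon(x, y)| < \varepsilon C, \qquad \max_y |\nabla_y\gamma_\varepsilon(x, y)| < \varepsilon^2 C.
\]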

This fact is important in keeping track of the effect of the boundary under the change of scaling from a) to b). Here we could have suppressed $l$. However, we retain it so that in the main estimates we can see that they are dimensionally correct, which is a useful check.

4. The Co-ordinate System

Given an interface close to spherical, we would like to associate to it a unique sphere and view the interface as a small perturbation of this sphere. In particular, if the interface is already a sphere we would like this procedure to associate the same sphere. So to each interface in a certain class we will associate a center and a radius. There are many different coordinate systems involving moments that accomplish this. The special coordinates we single out can be interpreted as the application to the particular case in hand of the familiar idea of coordinatizing a neighborhood of an equilibrium point with the projections on the center-unstable and stable subspaces. The spherical shape is an equilibrium for a class of geometric operators related to the mean curvature, and the coordinate system we use is defined by projecting on the eigenfunctions of the relevant linearized operator. We denote by $S_{\xi,\rho} \subset \mathbb{R}^3$ the sphere of center $\xi$ and radius $\rho$.

Proposition 4.1. Given a surface $\Gamma$ in a sufficiently small $C^1$ neighborhood of a sphere $S_{\bar\xi,\bar\rho}$, there are unique $\xi \in \mathbb{R}^3$, $\rho > 0$, $r \in C^1(S^2)$ such that
\[ \Gamma = \{x \mid x = \xi + \rho(1 + r(u))u,\ u \in S^2\}, \tag{4.1} \]
\[ \int_{S^2} r(u)\,du = 0, \tag{4.2} \]
\[ \int_{S^2} r(u)\langle u, e_i\rangle\,du = 0, \qquad i = 1, 2, 3, \tag{4.3} \]
where $e_1, e_2, e_3$ are the standard basis vectors in $\mathbb{R}^3$.


Remarks. 1. Conditions (4.2), (4.3) introduce 4 constraints. These are satisfied by the four parameters $\xi \in \mathbb{R}^3$ and $\rho > 0$.

2. Condition (4.2) implies that, to principal order, $\rho$ is the average radius of $\Gamma$. To see the meaning of condition (4.3), we note that the translate $S_{\bar\xi+\delta e_i,\bar\rho}$ of $S_{\bar\xi,\bar\rho}$ by $\delta$ in the direction $e_i$ is given by
\[ S_{\bar\xi+\delta e_i,\bar\rho} = \{x \mid x = \bar\xi + (\bar\rho + \delta\langle u, e_i\rangle + O(\delta^2))u,\ u \in S^2\}. \]
Therefore if $N$ is an operator which has $S_{\bar\xi,\bar\rho}$ and all its translates as equilibria, then $\langle u, e_i\rangle$ has to be a zero eigenfunction of the linearization $DN_0$ of $N$ at $S_{\bar\xi,\bar\rho}$. As we have pointed out in the introduction, the Mullins-Sekerka operator has this property and so (4.3) are orthogonality conditions. Note that by imposing on $r$ condition (4.3) we take away from it the component corresponding to translation, and so $r$ represents properly the distortion of $\Gamma$ away from sphericity. The spectrum of the restricted operator is also stable. This reflects the stability of the spherical shape and also suggests that the coordinate system will be preserved along the evolution. A different co-ordinate system would have to be updated along the evolution.

3. If $\xi$ in Proposition 4.1 is replaced by the "centroid" of $\Gamma$, then (4.2), (4.3) are satisfied to principal order. In spite of the fact that the centroid has a simpler definition, we require $\xi$ to satisfy (4.2) and (4.3), which simplifies some of the computations that follow.

Proof of Proposition 4.1. 1. All $\Gamma$'s in a $C^1$ neighborhood of $S_{\bar\xi,\bar\rho}$ can be represented in the form $\Gamma = \{x \mid x = \bar\xi + \bar\rho(1 + \bar r(\bar u))\bar u,\ \bar u \in S^2\}$ and also alternatively by $\Gamma = \{x \mid x = \xi + \rho(1 + r(u))u,\ u \in S^2\}$ for all $\xi$ in the neighborhood of $\bar\xi$, where $\xi, \rho, r$ are related to $\bar\xi, \bar\rho, \bar r$ via
\[ \bar\xi + \bar\rho(1 + \bar r(\bar u))\bar u = \xi + \rho(1 + r(u))u. \tag{4.4} \]
From (4.4) we draw two separate relations:
\[ \rho(1 + r(u)) = |\bar\xi - \xi + \bar\rho(1 + \bar r(\bar u))\bar u|, \tag{4.5} \]
\[ u = \frac{\bar\xi - \xi + \bar\rho(1 + \bar r(\bar u))\bar u}{|\bar\xi - \xi + \bar\rho(1 + \bar r(\bar u))\bar u|}. \tag{4.6} \]
So, we have two free variables left, $\rho$ and $\xi$. Imposing condition (4.2) is equivalent to taking
\[ \rho = \frac{1}{4\pi}\int_{S^2} |\bar\xi - \xi + \bar\rho(1 + \bar r(\bar u(u)))\bar u(u)|\,du, \]
and so $\xi$ is the remaining free variable.

2. What remains to be shown is that we can choose $\xi$ properly so that (4.3) is satisfied, or equivalently
\[ F_i(\xi, \bar\rho, \bar r) := \int_{S^2} |\bar\xi - \xi + \bar\rho(1 + \bar r(\bar u))\bar u|\,\langle u, e_i\rangle\,du = 0, \qquad i = 1, 2, 3, \tag{4.7} \]
where $\bar u = \bar u(u, \xi, \bar\rho, \bar r)$ is implicitly defined by (4.6). We observe that $F_i(\bar\xi, \bar\rho, 0) = 0$, which corresponds to the statement that for a sphere the new and old co-ordinates coincide.


Moreover, denoting by $D_\xi$ the gradient of $F_i$ with respect to the first entry, we have
\[ D_\xi F_i(\bar\xi, \bar\rho, 0)\hat\xi = \int_{S^2} \langle -\hat\xi + \bar\rho D_\xi\bar u\,\hat\xi, u\rangle \langle u, e_i\rangle\,du, \qquad \hat\xi \in \mathbb{R}^3, \tag{4.8} \]
where we have used $D_\xi|\bar\xi - \xi + \bar\rho\bar u|\,\hat\xi = \langle -\hat\xi + \bar\rho D_\xi\bar u\,\hat\xi, \bar u\rangle$ and (4.6), which implies that $\bar u(u, \bar\xi, \bar\rho, 0) = u$, and we have set $D_\xi\bar u = D_\xi\bar u(u, \bar\xi, \bar\rho, 0)$. By implicit differentiation of Eq. (4.6) with $\bar r = 0$ we get, for $\xi = \bar\xi$,
\[ -\hat\xi + \bar\rho D_\xi\bar u\,\hat\xi - \langle -\hat\xi + \bar\rho D_\xi\bar u\,\hat\xi, u\rangle u = 0. \]
Since $|\bar u| = 1$ implies $\langle D_\xi\bar u\,\hat\xi, u\rangle = 0$, it follows that
\[ \bar\rho D_\xi\bar u\,\hat\xi = \hat\xi - \langle\hat\xi, u\rangle u. \]
From this and Eq. (4.8) we obtain
\[ D_\xi F_i(\bar\xi, \bar\rho, 0)\hat\xi = -\int_{S^2} \langle u, \hat\xi\rangle\langle u, e_i\rangle\,du = -\frac{4}{3}\pi\hat\xi_i. \]
For $|\bar\xi - \xi|$ small the map $\bar u \to u$ defined by Eq. (4.6) is a $C^1$ diffeomorphism. Therefore the implicit function theorem implies that, provided $\|\bar r\|_{C^1(S^2)} < \delta$ for some $\delta > 0$, Eq. (4.7) has a unique $C^1$ solution $\xi = \xi(\bar r, \bar\xi, \bar\rho)$ such that $\xi(0, \bar\xi, \bar\rho) = \bar\xi$. The proof of Proposition 4.1 is complete.

Note. A similar co-ordinate system was introduced in [23] in two dimensions and in [22] in three dimensions. A related notion of barycenter was employed in [24] and especially in [25].
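The value $\frac{4}{3}\pi\hat\xi_i$ of the surface integral $\int_{S^2}\langle u,\hat\xi\rangle\langle u,e_i\rangle\,du$ used in the proof can be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule in colatitude and longitude; the function names, the grid sizes and the sample vector `xi_hat` are hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sphere_integral(f, n_theta=100, n_phi=200):
    # Midpoint-rule approximation of the integral of f(u) over the unit
    # sphere, with u = (sin t cos p, sin t sin p, cos t) and weight sin t.
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        s, c = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            u = (s * math.cos(phi), s * math.sin(phi), c)
            total += f(u) * s
    return total * d_theta * d_phi


xi_hat = (0.3, -1.2, 0.7)  # arbitrary test direction
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
values = [
    sphere_integral(lambda u, e=e: dot(u, xi_hat) * dot(u, e))
    for e in basis
]
```

Each entry of `values` should be close to $\frac{4}{3}\pi$ times the corresponding component of `xi_hat`, so the linearized map is a nonzero multiple of the identity and the implicit function theorem applies.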

5. The Mean Curvature in Special Co-ordinates

Proposition 5.1. Assume $\Gamma = \{x \mid x = X(u) := \xi + \varepsilon\rho(1 + \varepsilon r(u))u,\ u \in S^2\}$ with $r \in C^{2+\alpha}(S^2)$. Then the mean curvature $H(X(u))$ of $\Gamma$ at the point $X(u)$ is given by
\[ H(X(u)) = \frac{1}{\varepsilon\rho}(1 - \varepsilon Lr + B), \tag{5.1} \]
where $L$ is the Jacobi operator on $S^2$, that is
\[ Lr = \Delta_S r + 2r, \tag{5.2} \]
$\Delta_S$ being the Laplace-Beltrami operator on $S^2$, and $B$ is of the form $B = b(\varepsilon r, \varepsilon Dr, \varepsilon D^2 r)$ with $b(z, p, P)$ a smooth function which is linear in $P$ and, under the assumption $|z| < \delta$, satisfies the estimate
\[ |b(z, p, P)| \le C(|z|^2 + |p|^2 + (|z| + |p|)|P|). \tag{5.3} \]


Proof. The argument is a standard application of well known facts. A direct proof and an explicit formula for b(z, p, P ) can be derived from the expression of H that follows. This expression is obtained from the formula of the mean curvature of a surface in general coordinates by coordinatizing S with ϕ, ϑ (longitude and colatitude) and regarding r as a function of ϕ, ϑ. Do Carmo [50, p. 57, p. 136]. Set

J = (1 + εr) + ε 2

2

rϑ2

r 2

21 ϕ + , sin ϑ

(5.4)

then H =

1 1 cos ϑ 2 + 2εr − εrϑϑ − εr − εrϕϕ 2 1/2 2ερ(1 + εr)J sin ϑ sin ϑ rϕ rϕ 1 2 + εrϑ ε(1 + εr)rϑ + ε (rϑ rϑϑ + J sin ϑ sin ϑ ϑ !! rϑϕ rϕ rϕ rϕ rϕϕ 2 . + ε ε(1 + εr) + ε rϑ + sin ϑ sin ϑ sin ϑ sin ϑ sin2 ϑ

(5.5)

The proposition follows from (5.5) and the fact that in the special representation of $S^2$ adopted the operator $\Delta_S r$ becomes
\[ \frac{1}{\sin\vartheta}(r_\vartheta\sin\vartheta)_\vartheta + \frac{r_{\varphi\varphi}}{\sin^2\vartheta}. \]
The proof of Proposition 5.1 is complete.

Remark 5.2. For later reference we note that from the expression of $b$ that is implicitly given in (5.5), and in particular from (5.3), we have that $r \in C^{3+\alpha}(S^2)$ implies
\[ \|B\|_{C^{1+\alpha}(S^2)} \le C\varepsilon^2 \|r\|_{C^{1+\alpha}(S^2)} \|r\|_{C^{3+\alpha}(S^2)}. \tag{5.6} \]
It follows from (5.1) that the linearization of the mean curvature on the sphere along normal perturbations is given by the Jacobi operator $L$.

This is a general fact that holds for general interfaces. In the general case $L = \Delta_\Gamma + \sum_i k_i^2$, where $k_1, \dots, k_{n-1}$ are the principal curvatures.

6. The Operators $T$, $L$, and $A$

Let $\Omega \subset \mathbb{R}^3$ be a bounded, connected, smooth set in $\mathbb{R}^3$, and let $\Gamma$ be a $C^{1+\alpha}$ closed, orientable surface in $\Omega$. As before we denote by $\Omega_i$ the part of $\Omega$ enclosed by $\Gamma$, and $\Omega_e = \Omega \setminus \bar\Omega_i$. We will consider the following linear operators defined for sufficiently regular functions $\chi, \psi : \Gamma \to \mathbb{R}$.

The operator $T$. Given $\chi$ consider the Dirichlet problems
\[ \begin{cases} -\Delta u_i = 0, & x \in \Omega_i, \\ u_i = \chi, & x \in \Gamma, \end{cases} \qquad \begin{cases} -\Delta u_e = 0, & x \in \Omega_e, \\ u_e = \chi, & x \in \Gamma, \\ \dfrac{\partial u_e}{\partial n} = 0, & x \in \partial\Omega. \end{cases} \]


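Set
\[ T\chi = \frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e} =: \left[\!\left[\frac{\partial u}{\partial n}\right]\!\right], \qquad x \in \Gamma, \tag{6.1} \]
where $n_i$, $n_e$ are exterior normals to $\partial\Omega_i$, $\partial\Omega_e$. So $T$ is the Dirichlet-Neumann operator
\[ \chi \longrightarrow u \longrightarrow \left[\!\left[\frac{\partial u}{\partial n}\right]\!\right]. \]
Note that a standard application of Green's identities yields
\[ \int_\Gamma \left[\!\left[\frac{\partial u}{\partial n}\right]\!\right] = 0. \]
On the other hand we have
\[ \chi = \text{const} \ \Rightarrow\ T\chi = 0. \]
In particular $T$ maps functions with zero average into functions with zero average, and on this set it is invertible. Indeed set
\[ (S\psi)(x) = \int_\Gamma g(x,y)\psi(y)\,dy - \frac{1}{|\Gamma|}\int_\Gamma\int_\Gamma g(x,y)\psi(y)\,dy\,dx. \tag{6.2} \]
Then the operator $S$ can be interpreted as the inverse of the restriction of $T$ to the set of functions satisfying $\int_\Gamma \chi = 0$. In fact if
\[ u(x) = \int_\Gamma g(x,y)\psi(y)\,dy - \frac{1}{|\Gamma|}\int_\Gamma\int_\Gamma g(x,y)\psi(y)\,dy\,dx, \qquad x \in \Omega, \tag{6.3} \]
then $u$ is harmonic in $\Omega_i$, $\Omega_e$, and $\psi = [\![\partial u/\partial n]\!]$ by classical potential theory, and so $S$ is the Neumann-Dirichlet operator $\psi = [\![\partial u/\partial n]\!] \longrightarrow u \longrightarrow u|_\Gamma = \chi$. Moreover if $\langle\,\cdot\,,\cdot\,\rangle_{L^2(\Gamma)}$ stands for the inner product in $L^2(\Gamma)$, then if $\int_\Gamma\psi = 0$ we have
\[ \begin{aligned} \langle S\psi, \psi\rangle_{L^2(\Gamma)} &= \int_\Gamma\int_\Gamma g(x,y)\psi(y)\psi(x)\,dy\,dx = \int_\Gamma u\left(\frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e}\right)dx \\ &= \int_{\Omega_i} |\nabla u_i|^2\,dx + \int_{\Omega_e} |\nabla u_e|^2\,dx = \int_\Omega |\nabla u|^2 = \langle\chi, T\chi\rangle_{L^2(\Gamma)}. \end{aligned} \tag{6.4} \]
Define $\mathcal{H}^0 = \{\chi \in L^2(\Gamma) \mid \int_\Gamma\chi = 0\}$. We obtain the spaces $\mathcal{H}^{-1/2}$, $\mathcal{H}^{1/2}$ by completing $\{\chi \in C^1(\Gamma) \mid \int_\Gamma\chi = 0\}$ under the norms
\[ \|\cdot\|_{-1/2} = \sqrt{\langle S\,\cdot\,, \cdot\,\rangle_{L^2(\Gamma)}}, \qquad \|\cdot\|_{1/2} = \sqrt{\langle T\,\cdot\,, \cdot\,\rangle_{L^2(\Gamma)}}. \tag{6.5} \]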


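These spaces coincide with the Sobolev spaces $W^{-1/2,2}(\Gamma)$, $W^{1/2,2}(\Gamma)$ of the functions with zero average. Equations (6.4), (6.5) imply that
\[ \|\chi\|_{1/2} = \|\psi\|_{-1/2}, \tag{6.6} \]
if $\psi = T\chi$ or $\chi = S\psi$. Thus $T$, $S$ are isometries as maps between the two completed spaces,
\[ T : \mathcal{H}^{1/2} \longrightarrow \mathcal{H}^{-1/2}, \qquad S : \mathcal{H}^{-1/2} \longrightarrow \mathcal{H}^{1/2}, \]
and
\[ ST = \mathrm{id}_{\mathcal{H}^{1/2}}, \qquad TS = \mathrm{id}_{\mathcal{H}^{-1/2}}. \tag{6.7} \]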

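The operator $L$ (for the case $\Gamma = S^2$). We will be considering the Jacobi operator introduced in (5.2) on the two dimensional unit sphere. So in spherical coordinates we have the representation
\[ L\chi = \frac{(\sin\vartheta\,\chi_\vartheta)_\vartheta}{\sin\vartheta} + \frac{\chi_{\varphi\varphi}}{\sin^2\vartheta} + 2\chi, \tag{6.8} \]
\[ \langle\chi, \psi\rangle_{L^2(S^2)} = \int_0^\pi\int_0^{2\pi} \chi\psi\sin\vartheta\,d\varphi\,d\vartheta. \tag{6.9} \]
$L$ is self adjoint in $L^2(S^2)$. Formally this follows from
\[ -\int_0^\pi\int_0^{2\pi} (L\chi)\psi\sin\vartheta\,d\varphi\,d\vartheta = \int_0^\pi\int_0^{2\pi} \left(\frac{\chi_\varphi\psi_\varphi}{\sin^2\vartheta} + \chi_\vartheta\psi_\vartheta - 2\chi\psi\right)\sin\vartheta\,d\varphi\,d\vartheta =: B(\chi, \psi) \tag{6.10} \]
via integration by parts. By direct calculation we obtain some information on the spectrum of $-L$:
\[ \mu_0 = -2, \qquad \mu_1 = \mu_2 = \mu_3 = 0, \qquad \mu_4 > 0, \tag{6.11} \]
with the first few eigenfunctions given by
\[ \begin{aligned} w_0 &= \frac{1}{2\sqrt{\pi}}, \\ w_1 &= \frac{1}{2}\sqrt{\frac{3}{\pi}}\,\langle u, e_1\rangle = \frac{1}{2}\sqrt{\frac{3}{\pi}}\cos\vartheta\sin\varphi, \\ w_2 &= \frac{1}{2}\sqrt{\frac{3}{\pi}}\,\langle u, e_2\rangle = \frac{1}{2}\sqrt{\frac{3}{\pi}}\sin\vartheta\sin\varphi, \\ w_3 &= \frac{1}{2}\sqrt{\frac{3}{\pi}}\,\langle u, e_3\rangle = \frac{1}{2}\sqrt{\frac{3}{\pi}}\cos\varphi. \end{aligned} \tag{6.12} \]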

Notice that, aside from the normalization coefficient the eigenfunctions wj coincide with the functions ·, ej , j = 1, 2, 3 used in the orthogonality conditions imposed on r in Proposition 4.1.
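The first part of the spectrum of $L$ can be checked by applying the representation (6.8) to a few explicit functions with finite differences. This is only a sketch: it assumes that $\vartheta$ is the colatitude and $\varphi$ the longitude, so that $u = (\sin\vartheta\cos\varphi, \sin\vartheta\sin\varphi, \cos\vartheta)$, which is the convention matching the weight $\sin\vartheta$ in (6.9). The step size and the sample point are arbitrary.

```python
import math


def jacobi_L(chi, theta, phi, h=1e-3):
    # Central-difference approximation of
    # (sin(t) chi_t)_t / sin(t) + chi_pp / sin(t)^2 + 2 chi, as in (6.8).
    def flux(t):
        return math.sin(t) * (chi(t + h, phi) - chi(t - h, phi)) / (2.0 * h)

    theta_part = (flux(theta + h) - flux(theta - h)) / (2.0 * h) / math.sin(theta)
    phi_part = (chi(theta, phi + h) - 2.0 * chi(theta, phi)
                + chi(theta, phi - h)) / h ** 2
    return theta_part + phi_part / math.sin(theta) ** 2 + 2.0 * chi(theta, phi)


def constant(t, p):
    return 1.0


def u1(t, p):  # <u, e_1> under the assumed convention
    return math.sin(t) * math.cos(p)


def u2(t, p):  # <u, e_2>
    return math.sin(t) * math.sin(p)


def u3(t, p):  # <u, e_3>
    return math.cos(t)


def degree_two(t, p):  # a spherical harmonic of degree 2
    return 3.0 * math.cos(t) ** 2 - 1.0


theta0, phi0 = 1.1, 0.7  # arbitrary sample point away from the poles
```

At the sample point, `jacobi_L` returns approximately $2$ for the constant, approximately $0$ for the three coordinate functions, and approximately $-4$ times the function value for the degree-two harmonic, in line with (6.11).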


The operator $A$.
\[ A = TL. \tag{6.13} \]
We will consider $-A$ on $\mathcal{H}^{-1/2}$, where it is self-adjoint. Formally this follows from
\[ \langle -A\chi, \psi\rangle_{-1/2} = \langle -SA\chi, \psi\rangle_{L^2(S^2)} = \langle -L\chi, \psi\rangle_{L^2(S^2)} = B(\chi, \psi). \tag{6.14} \]
Note that $A$ leaves invariant the set of functions with zero average. The eigenvalues of $-A$ are determined via the quotient
\[ \frac{\langle -A\chi, \chi\rangle_{-1/2}}{\langle\chi, \chi\rangle_{-1/2}} = \frac{B(\chi, \chi)}{\langle S\chi, \chi\rangle_{L^2(S^2)}}. \tag{6.15} \]
It is easy to see that the eigenvalues satisfy
\[ \lambda_1 = \lambda_2 = \lambda_3 = 0, \qquad \lambda_4 > 0. \tag{6.16} \]
Observe that we have numbered the $\lambda$'s so that they correspond to the $\mu$'s, and note that the zero average constraint (conservation) eliminates the analog of $\mu_0$. Indeed the triple zero eigenvalue follows from (6.11) with the same eigenfunctions as in (6.12). For showing that the rest of the spectrum is positive we argue as follows. By the variational characterization,
\[ \lambda_4 = \max_{M :\, \mathrm{CO}(M) \ge 3}\ \min_{\chi \in M,\ \langle S\chi, \chi\rangle_{L^2(S^2)} = 1} B(\chi, \chi), \tag{6.17} \]
where $M \subset L^2(S^2)$ are linear spaces of co-dimension $\mathrm{CO}(M)$ greater than or equal to three. Hence
\[ \lambda_4 \ge \min_{\langle S^{1/2}\chi, q_i\rangle_{L^2(S^2)} = 0} B(\chi, \chi), \]
where $q_i = T^{1/2}w_i$, $i = 1, 2, 3$. Now $\langle S^{1/2}\chi, q_i\rangle_{L^2(S^2)} = \langle S^{1/2}\chi, T^{1/2}w_i\rangle_{L^2(S^2)} = \langle\chi, w_i\rangle_{L^2(S^2)}$, and so (6.16) follows from (6.11).

It is easy to see that the domain of $-A$ coincides with $W_0^{5/2,2}(\Gamma)$, with norm
\[ \|\cdot\|_{5/2} = \sqrt{\langle -TL\,\cdot\,, \cdot\,\rangle_{L^2(S^2)}}. \]
Note that (0.2), via (3.3), can be restated in the form
\[ V = T(H - \bar H) \tag{6.18} \]
and can be viewed as a gradient flow of the perimeter functional in $\mathcal{H}^{-1/2}$ [21]: $\operatorname{grad}_{-1/2}\mathrm{Per}(\Gamma) = -T(H - \bar H)$. Also notice that Proposition 3.1 can be stated in the form
\[ S(V) = H - \bar H. \tag{6.19} \]


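Next we consider the linearization of the evolution law (6.18) for the special case $\Gamma = S^2$. Given $r \in C^{3+\alpha}(S^2)$ define $X^s : S^2 \to \mathbb{R}^3$ by
\[ X^s(u) = (1 + sr(u))u, \qquad u \in S^2. \]
Then $X^0 = \mathrm{Id}_{S^2}$ and for each $s \in (-\bar s, \bar s)$, for some small $\bar s > 0$, $\Gamma^s = \{x \in \mathbb{R}^3 \mid x = X^s(u),\ u \in S^2\}$ is a $C^{3+\alpha}$ perturbation of $S^2$. We let $T^s$ be the operator $T$ defined in (6.1) when $\Gamma = \Gamma^s$. Similarly we let $H^s$ be the mean curvature of $\Gamma^s$. We simply write $T$, $H$ for $T^0$, $H^0$.

Proposition 6.1. If $\Gamma = S^2$ then
\[ \frac{d}{ds}\,T^s\!\left(H^s - \bar H^s\right)\Big|_{s=0} = -Ar, \tag{6.20} \]
where $A := TL$.

Proof.
\[ \frac{d}{ds}\,T^s\!\left(H^s - \bar H^s\right)\Big|_{s=0} = \left(\frac{d}{ds}T^s\right)\Big|_{s=0}\left(H - \bar H\right) + T\,\frac{d}{ds}\left(H^s - \bar H^s\right)\Big|_{s=0} = T\,\frac{d}{ds}\left(H^s - \bar H^s\right)\Big|_{s=0}, \]
where we have used $H = \bar H$ in the case at hand ($\Gamma = S^2$). From (5.1) with $\rho = 1$ it follows that
\[ \frac{dH^s}{ds}\Big|_{s=0} = -Lr. \]
Therefore we get
\[ \frac{d}{ds}\,T^s\!\left(H^s - \bar H^s\right)\Big|_{s=0} = -Ar - T\left(\frac{1}{|S^2|}\int_{S^2} Lr\right) = -Ar, \]
where we have used the fact that, as remarked earlier, $T\chi = 0$ if $\chi = \text{const}$. The proof of the proposition is complete.

Note. The linearization of the one phase Mullins-Sekerka operator for general interfaces has been determined by [47]. For the sphere it can also be found in [17]. It appears that the linearization of the two-phase Mullins-Sekerka operator for general $\Gamma$ is not equal to $TL$.

In the following we use systematically the operator $T_0$ defined by
\[ T_0 X = \frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e} \quad \text{in } S^2, \qquad X \in C^{1+\alpha}(S^2), \]
where $u_i$, $u_e$ are the harmonic functions determined by
\[ \begin{cases} -\Delta u_i = 0 & \text{on } B_1 = \{x \in \mathbb{R}^3 \mid |x| < 1\}, \\ u_i = X & \text{on } S^2, \end{cases} \tag{6.21} \]
\[ \begin{cases} -\Delta u_e = 0 & \text{on } \mathbb{R}^3 \setminus \bar B_1, \\ u_e = X & \text{on } S^2, \\ \lim_{x\to\infty} u_e = 0. \end{cases} \tag{6.22} \]
$T_0$ is the analog, in the case $\Omega = \mathbb{R}^3$, $\Gamma = S^2$, of the operator $T$ considered above.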


Proposition 6.2.
(i) $\|T_0\|_{L(C^{1+\alpha}(S^2), C^\alpha(S^2))} < C$,
(ii) $T_0(1) = 1$,
(iii) $T_0\left(\int_{S^2} \frac{X(v)}{4\pi|\cdot - v|}\,dv\right) = X$, $\forall X \in C^{1+\alpha}(S^2)$.

Proof. From Theorem 2.I in [27], given $X \in C^{1+\alpha}(S^2)$, the solution $u_i$ of (6.21) ($u_e$ of (6.22)) can be extended to $\bar B_1$ (and $\mathbb{R}^3 \setminus B_1$) as a $C^{1+\alpha}$ function. Therefore $\frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e}$ is a $C^\alpha(S^2)$ function, and the above mentioned theorem also yields that the map $X \to \frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e}$ is bounded as a linear map from $C^{1+\alpha}(S^2)$ to $C^\alpha(S^2)$, $\alpha \in (0, 1)$, that is $T_0 \in L(C^{1+\alpha}(S^2), C^\alpha(S^2))$. To show (ii) we note that $X \equiv 1$ implies $u_i \equiv 1$, $u_e(x) = \frac{1}{|x|}$, and therefore $\frac{\partial u_i}{\partial n_i} = 0$, $\frac{\partial u_e}{\partial n_e} = 1$. (iii) is a classical result from potential theory. Indeed the function $u \to \int_{S^2} \frac{X(v)}{4\pi|u - v|}\,dv$ is trivially extended to $\mathbb{R}^3$ by $x \to \int_{S^2} \frac{X(v)}{4\pi|x - v|}\,dv$, which is continuous on $\mathbb{R}^3$ and has the property that its restrictions to $B_1$ and $\mathbb{R}^3 \setminus \bar B_1$ satisfy (6.21), (6.22); on the other hand it is well known that the normal derivative of this function has a jump across $S^2$ given exactly by $X$.

We consider $A$ as an operator on $E_0 = h^\alpha(S^2)$ with domain $E_1 = h^{3+\alpha}(S^2)$, $0 < \alpha < 1$. Here $h^{\kappa+\alpha}$ is the "little Hölder" space defined as the completion of the set of $C^\infty$ functions with respect to the $C^{\kappa+\alpha}$ norm. We note that this space is different from $C^{\kappa+\alpha}$. In the following theorem we list, among other things, certain properties of $A$ which were proved in [19] and independently in [14] and are basic for the proof of the main results in Sect. 1.

Theorem 6.3. (i) The operator $A$ is the generator of an analytic semigroup which has the optimal regularity property with respect to the pair $E_0$, $E_1$. In particular if $g : [0, \bar t] \to E_0$ is a continuous function, then
\[ \sup_{[0,\bar t]} \left\| \int_0^t e^{A(t-\theta)}g(\theta)\,d\theta \right\|_{E_1} \le c_{\bar t} \sup_{t\in[0,\bar t]} \|g(t)\|_{E_0}. \tag{6.23} \]
(ii) The spectrum of $A = T_0L$ on the subspace of $h^\alpha$ functions with zero average can be determined explicitly and is given by
\[ k_n = (2n + 1)\,[2 - n(n + 1)], \qquad n = 1, 2, \dots, \tag{6.24} \]
$k_n$ with multiplicity $2n + 1$ and corresponding eigenspace spanned by the $2n + 1$ spherical harmonics $Y_n(\theta, \varphi)$ of degree $n$.

Proof. For the proof of (i) we refer to [19, 14], where more general results are discussed. To show (ii) denote the Laplace-Beltrami operator on the sphere $S^2$ by $\Delta_S$. Then
\[ \Delta_S Y_n = -n(n + 1)Y_n \tag{6.25} \]
with multiplicity $2n + 1$ [46]. Notice also that the spherical harmonics are eigenfunctions of $T_0$. In fact the functions
\[ u_i(x) = |x|^n\,Y_n\!\left(\frac{x}{|x|}\right), \qquad u_e(x) = \frac{1}{|x|^{n+1}}\,Y_n\!\left(\frac{x}{|x|}\right) \]
are the appropriate harmonic extensions of $Y_n$ (related via Kelvin's transform) off $S^2$. It is easy to see that
\[ T_0Y_n = \frac{\partial u_i}{\partial n_i} + \frac{\partial u_e}{\partial n_e} = (2n + 1)Y_n. \tag{6.26} \]
From this fact and the completeness of the set of spherical harmonics the formula for $k_n$ follows. The proof of the theorem is complete.
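The factor $2n + 1$ in (6.26) comes only from the radial dependence of the two harmonic extensions, so it can be checked with one-dimensional finite differences. The sketch below does this and then evaluates (6.24); the function names and the step size are hypothetical, and the normal on the exterior side is taken to point into $B_1$, as the exterior normal of $\mathbb{R}^3 \setminus \bar B_1$.

```python
def radial_derivative(f, r=1.0, h=1e-5):
    # Central difference of a radial profile at r = 1.
    return (f(r + h) - f(r - h)) / (2.0 * h)


def t0_eigenvalue(n):
    # Interior extension |x|^n Y_n: outward normal of B_1 is the radial direction.
    interior = radial_derivative(lambda r: r ** n)
    # Exterior extension |x|^-(n+1) Y_n: the exterior normal of the outer
    # region points towards the origin, hence the minus sign.
    exterior = -radial_derivative(lambda r: r ** (-(n + 1)))
    return interior + exterior


def k(n):
    # Eigenvalue of A = T_0 L on degree-n spherical harmonics, as in (6.24).
    return (2 * n + 1) * (2 - n * (n + 1))


spectrum = [k(n) for n in range(1, 5)]
```

For $n = 0$ the same computation gives $T_0(1) = 1$, consistent with Proposition 6.2 (ii). The list `spectrum` starts with the zero eigenvalue for $n = 1$ (translations) and is strictly negative afterwards.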


Remark. As indicated in the introduction (cfr. Step 4) to derive a bound for the unknown r in Sect. 10 we need to use the optimal regularity theory of Da Prato and Grisvard and in particular the estimate (6.23) in Theorem 6.3 above. To do this we need estimates of the type fir (ξ, ρ, r)E0 ≤ ConstrE1 , i = 1, · · · , N. For obtaining such H¨older estimates we utilize Th 2I, in [27]. That result covers a class of potential operators U (x) = K(y − x)f (y) dy, (6.27) with K modeled after

∂ ∂xi

∂

1 |x − y|

and provides estimates of the type

U (·)C 1+α ( ) ≤ f (·)C α (∂ ) .

(6.28)

7. The Projections and the Decomposition of $V$

In this section we give a decomposition result for a general $V$ in terms of $\dot\rho$, $\dot\xi$ and $\rho r_t$ for interfaces with the representation (7.1) below. As stated in the main theorem in Sect. 1, we are interested in the solution of small interfaces with small deviation from sphericity. Therefore, on the basis of the discussion in Sect. 4 and as indicated in the introduction (cf. Step 2), we consider the map $t \to \Gamma(t)$ with
\[ \Gamma(t) = \{x \mid x = \xi(t) + \varepsilon\rho(t)(1 + \varepsilon r(u, t))u,\ u \in S^2\}, \tag{7.1} \]
where $\xi(\cdot)$, $\rho(\cdot)$ are $C^1$ functions and $t \to r(\cdot, t) \in C^{1+\alpha}(S^2)$ is a $C^1$ map. We let $V = V(u, t)$ be the speed of $\Gamma(t)$ in the direction orthogonal to $\Gamma(t)$ at the point $x \in \Gamma(t)$. We take $V$ positive for a shrinking sphere. We assume that $r$ in (7.1) satisfies conditions (4.2), (4.3) and study the relationship between $V$ and $\dot\rho$, $\rho r_t$, $\dot\xi$.

Proposition 7.1. Assume that $\|r\|_{C^{1+\alpha}(S^2)} < \frac{\delta}{\varepsilon}$, $\delta > 0$ a small fixed number, so that Proposition 4.1 applies. Then $V$ is a linear combination of $\varepsilon\dot\rho$, $\varepsilon^2\rho r_t$, $\dot\xi$. Moreover, the equation
\[ V = Z, \tag{7.2} \]
where $Z \in C^\alpha(S^2)$ is a given function, uniquely determines $\varepsilon\dot\rho$, $\varepsilon^2\rho r_t$, $\dot\xi$, and the following estimates hold:
\[ \begin{cases} \left|2\sqrt{\pi}\,\varepsilon\dot\rho + \langle Z, w_0\rangle_{L^2(S^2)}\right| \le \text{error}, \\ \left|2\sqrt{\tfrac{\pi}{3}}\,\dot\xi_j + \langle Z, w_j\rangle_{L^2(S^2)}\right| \le \text{error}, \\ \left\|\varepsilon^2\rho r_t + Z - \sum_{j=0}^{3}\langle Z, w_j\rangle_{L^2(S^2)}w_j - \varepsilon\langle Z, w_0\rangle_{L^2(S^2)}w_0\,r\right\|_{C^\alpha(S^2)} \le \text{error}, \end{cases} \tag{7.3} \]
where
\[ \text{error} = C\varepsilon\left(\varepsilon\|r\|^2_{C^{1+\alpha}(S^2)}\|Z\|_{C^\alpha(S^2)} + \|r\|_{C^{1+\alpha}(S^2)}\sum_{h=1}^{3}\left|\langle Z, w_h\rangle_{L^2(S^2)}\right|\right) \]
for some constant $C > 0$ and $w_j$, $j = 0, 1, 2, 3$, defined in (6.12).


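Proof. Let $X(u, t) := \xi(t) + \varepsilon\rho(t)(1 + \varepsilon r(u, t))u$. Then, by definition,
\[ V(u, t) = \langle X_t(u, t), n(u, t)\rangle, \tag{7.4} \]
where $n(u, t)$ is the interior normal to $\Gamma(t)$ at $X(u, t)$ and
\[ X_t = \dot\xi + \varepsilon\left(\dot\rho(1 + \varepsilon r) + \varepsilon\rho r_t\right)u. \tag{7.5} \]
We consider $u \in S^2$ as a function of the longitude $\varphi$ and the colatitude $\vartheta$, and therefore also $X$ as a function of $\varphi$, $\vartheta$ through $u$. Then we have
\[ X_\varphi = \varepsilon\rho\left((1 + \varepsilon r)u_\varphi + \varepsilon r_\varphi u\right), \qquad X_\vartheta = \varepsilon\rho\left((1 + \varepsilon r)u_\vartheta + \varepsilon r_\vartheta u\right). \tag{7.6} \]
From this and $n = X_\varphi \wedge X_\vartheta / |X_\varphi \wedge X_\vartheta|$ it follows that
\[ n = \frac{-(1 + \varepsilon r)u + \varepsilon\left(r_\vartheta u_\vartheta + \frac{r_\varphi}{\sin\vartheta}\frac{u_\varphi}{\sin\vartheta}\right)}{\left[(1 + \varepsilon r)^2 + \varepsilon^2\left(r_\vartheta^2 + \left(\frac{r_\varphi}{\sin\vartheta}\right)^2\right)\right]^{1/2}}. \tag{7.7} \]
From Eqs. (7.4), (7.5) and (7.7), we obtain
\[ V = -\varepsilon\dot\rho - \varepsilon^2\rho r_t - \sum_{j=1}^{3}\dot\xi_j\langle u, e_j\rangle + q(\varepsilon r)\left(\varepsilon\dot\rho, \varepsilon^2\rho r_t, \dot\xi_1, \dot\xi_2, \dot\xi_3\right)^T, \tag{7.8} \]
where $x^T$ is the transpose of $x$ and $q(\varepsilon r) = (q_1(\varepsilon r), \dots, q_5(\varepsilon r))$ is defined by
\[ \begin{aligned} q_1 &= -\frac{(1 + \varepsilon r)^2}{J} + 1, \\ q_2 &= -\frac{(1 + \varepsilon r)}{J} + 1, \\ q_{j+2} &= -\frac{1}{J}\left\langle (1 + \varepsilon r)u - \varepsilon\left(r_\vartheta u_\vartheta + \frac{r_\varphi}{\sin\vartheta}\frac{u_\varphi}{\sin\vartheta}\right), e_j\right\rangle + \langle u, e_j\rangle, \qquad j = 1, 2, 3, \end{aligned} \tag{7.9} \]
with $J$ as in (5.4). The assumption $\|r\|_{C^{1+\alpha}(S^2)} < \frac{\delta}{\varepsilon}$ implies
\[ \begin{aligned} q_1 &= -\varepsilon r + \frac{\varepsilon^2}{2}\left(r_\vartheta^2 + \left(\frac{r_\varphi}{\sin\vartheta}\right)^2\right) + \varepsilon^3 O_{C^\alpha(S^2)}\left(\|r\|^3_{C^{1+\alpha}(S^2)}\right) = -\varepsilon r + \varepsilon^2 O_{C^\alpha(S^2)}\left(\|r\|^2_{C^{1+\alpha}(S^2)}\right), \\ q_2 &= \frac{\varepsilon^2}{2}\left(r_\vartheta^2 + \left(\frac{r_\varphi}{\sin\vartheta}\right)^2\right) + \varepsilon^3 O\left(\|r\|^3_{C^{1+\alpha}(S^2)}\right) = \varepsilon^2 O_{C^\alpha(S^2)}\left(\|r\|^2_{C^{1+\alpha}(S^2)}\right), \\ q_{j+2} &= \varepsilon\,O\left(\|r\|_{C^{1+\alpha}(S^2)}\right), \qquad j = 1, 2, 3, \end{aligned} \tag{7.10} \]
where here and in the following we employ the notation $f = O_{C^\alpha(S^2)}(\|r\|_{C^{3+\alpha}(S^2)})$, meaning that the $C^\alpha(S^2)$ norm of $f$ is $O(\|r\|_{C^{3+\alpha}(S^2)})$. Projecting Eq. (7.2) on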


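$w_0$, $w_j$, $j = 1, 2, 3$, and on the complement of the span of $\{w_j\}_{j=0}^{3}$ in $L^2(S^2)$, and using also the expression (7.8) for $V$, yields
\[ \begin{aligned} 2\sqrt{\pi}\,\varepsilon\dot\rho + \langle Z, w_0\rangle_{L^2(S^2)} &= \langle q\zeta^T, w_0\rangle_{L^2(S^2)}, \\ \varepsilon^2\rho r_t + Z - \sum_{j=0}^{3}\langle Z, w_j\rangle_{L^2(S^2)}w_j &= q\zeta^T - \sum_{j=0}^{3}\langle q\zeta^T, w_j\rangle_{L^2(S^2)}w_j, \\ 2\sqrt{\tfrac{\pi}{3}}\,\dot\xi_j + \langle Z, w_j\rangle_{L^2(S^2)} &= \langle q\zeta^T, w_j\rangle_{L^2(S^2)}, \qquad j = 1, 2, 3, \end{aligned} \tag{7.11} \]
where $\zeta = (\varepsilon\dot\rho, \varepsilon^2\rho r_t, \dot\xi_1, \dot\xi_2, \dot\xi_3)$. The estimates (7.10) imply that, provided $\varepsilon > 0$ is sufficiently small, system (7.11) for $\zeta$ is diagonally dominant and therefore has a unique solution. This solution can be computed by iteration: the approximate solution $\zeta^{(n)}$ at step $n$ is computed by solving system (7.11) with $\zeta = \zeta^{(n-1)}$ in the right hand side, starting with $\zeta^{(0)} = 0$. From this and the estimates (7.10), which imply $q = O_{C^\alpha(S^2)}(\varepsilon\|r\|_{C^{1+\alpha}(S^2)})$, it follows that the solution $\zeta^{(\infty)}$ satisfies
\[ \zeta^{(\infty)} = \zeta^{(n)} + \varepsilon^n O_{C^\alpha(S^2)}\left(\|r\|^n_{C^{1+\alpha}(S^2)}\|Z\|_{C^\alpha(S^2)}\right). \tag{7.12} \]
For our purpose it suffices to estimate $\zeta^{(2)}$. From (7.11) it follows that
\[ \zeta^{(1)} = \left(-\frac{1}{2\sqrt{\pi}}\langle Z, w_0\rangle_{L^2(S^2)},\ -Z + \sum_{j=0}^{3}\langle Z, w_j\rangle_{L^2(S^2)}w_j,\ -\frac{1}{2}\sqrt{\frac{3}{\pi}}\langle Z, w_1\rangle_{L^2(S^2)},\ -\frac{1}{2}\sqrt{\frac{3}{\pi}}\langle Z, w_2\rangle_{L^2(S^2)},\ -\frac{1}{2}\sqrt{\frac{3}{\pi}}\langle Z, w_3\rangle_{L^2(S^2)}\right). \tag{7.13} \]
Inserting this expression for $\zeta^{(1)}$ in the right hand side of (7.11) and recalling that $\langle r, w_j\rangle_{L^2(S^2)} = 0$, $j = 0, 1, 2, 3$, we obtain
\[ \begin{aligned} 2\sqrt{\pi}\,\zeta_1^{(2)} &= -\langle Z, w_0\rangle_{L^2(S^2)} + O\left(\varepsilon^2\|r\|^2_{C^{1+\alpha}(S^2)}\|Z\|_{C^\alpha(S^2)} + \varepsilon\|r\|_{C^{1+\alpha}(S^2)}\sum_{h=1}^{3}\left|\langle Z, w_h\rangle_{L^2(S^2)}\right|\right), \\ \zeta_2^{(2)} &= -Z + \sum_{j=0}^{3}\langle Z, w_j\rangle_{L^2(S^2)}w_j + \frac{\varepsilon r}{2\sqrt{\pi}}\langle Z, w_0\rangle_{L^2(S^2)} + O\left(\varepsilon^2\|r\|^2_{C^{1+\alpha}(S^2)}\|Z\|_{C^\alpha(S^2)} + \varepsilon\|r\|_{C^{1+\alpha}(S^2)}\sum_{h=1}^{3}\left|\langle Z, w_h\rangle_{L^2(S^2)}\right|\right), \\ 2\sqrt{\tfrac{\pi}{3}}\,\zeta_{j+2}^{(2)} &= -\langle Z, w_j\rangle_{L^2(S^2)} + O\left(\varepsilon^2\|r\|^2_{C^{1+\alpha}(S^2)}\|Z\|_{C^\alpha(S^2)} + \varepsilon\|r\|_{C^{1+\alpha}(S^2)}\sum_{h=1}^{3}\left|\langle Z, w_h\rangle_{L^2(S^2)}\right|\right), \qquad j = 1, 2, 3. \end{aligned} \tag{7.14} \]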
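The iteration behind (7.12) is a standard fixed-point scheme: when the coupling term is of size $\varepsilon$, each step gains one power of $\varepsilon$. The following is a toy sketch of that mechanism on a hypothetical two-dimensional linear system $\zeta = b + \varepsilon M\zeta$; the vector `b`, the matrix `m` and the value of `eps` are illustrative stand-ins and are not taken from (7.11).

```python
def fixed_point_iterates(b, m, eps, steps):
    # Iterates zeta <- b + eps * M zeta, starting from zeta = 0,
    # and returns the list [zeta^(1), zeta^(2), ...].
    size = len(b)
    zeta = [0.0] * size
    iterates = []
    for _ in range(steps):
        zeta = [
            b[i] + eps * sum(m[i][j] * zeta[j] for j in range(size))
            for i in range(size)
        ]
        iterates.append(zeta)
    return iterates


b = [1.0, -2.0]                 # plays the role of the Z-dependent data
m = [[0.0, 1.0], [1.0, 0.0]]    # plays the role of the coupling q / eps
eps = 0.1

iterates = fixed_point_iterates(b, m, eps, 60)
limit = iterates[-1]            # converged to machine precision
errors = [
    max(abs(x - y) for x, y in zip(z, limit))
    for z in iterates[:5]
]
```

In this toy setting the error of the $n$-th iterate is bounded by $\varepsilon^n$ times the size of the limit, which is the pattern expressed by (7.12); this is why stopping at $n = 2$ already gives the accuracy needed in (7.3).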


From this and Eq. (7.12) with $n = 2$, the estimates (7.3) follow. The proof of the proposition is complete.

8. Solving the Linear Equation $S(V) = H - \bar H$ for Given $H$

In Sect. 3 we reformulated the Mullins-Sekerka problem as an integral equation (3.3) ((6.19)):
\[ S(V) = H - \bar H. \tag{8.1} \]
This can be viewed as an equation in $V$, and it is ultimately nonlinear in $\Gamma$ due to the nonlinear dependence of $H$ on the representation of the interface. For systematizing the presentation we break the solution of the Mullins-Sekerka problem into two stages. In the first, which we carry out in this section, we replace $H$ by a given function $W$ of class $C^{1+\alpha}$, and after solving the linear equation (8.1) we obtain an expression for $V$ in terms of $W$, explicit up to $O(\varepsilon)$ terms, and with an estimate of the order $O(\varepsilon^2)$. In this section $t$ is suppressed and we write $\Gamma$ instead of $\Gamma(t)$. We take
\[ \Gamma = \bigcup_{i=1}^{N}\Gamma_i \tag{8.2} \]
with $\Gamma_i = \{x \mid x = X^i(u) := \xi_i + \varepsilon\rho_i(1 + \varepsilon r_i(u))u,\ u \in S^2\}$. If $\varepsilon > 0$ is small, the map $X^i : S^2 \to \Gamma_i$ is a diffeomorphism with the same regularity as $r_i$. We let $u^i : \Gamma_i \to S^2$ be the inverse of $X^i$. Under the above assumption Eq. (3.3) can be written in the form (cf. (7.4))
\[ \sum_{h=1}^{N}\int_{\Gamma_h} g(x, y)V_h\!\left(u^h(y)\right)dy = H(x) - E, \qquad x \in \Gamma_i,\ i = 1, \dots, N, \tag{8.3} \]
where $E = \bar H - \frac{1}{|\Gamma|}\int_\Gamma\int_\Gamma g(x, y)V(y)\,dy\,dx$, and $V_h(u^h(\cdot))$ is the restriction of $V$ to $\Gamma_h$.

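Proposition 8.1. Let $\xi_i \in \Omega$, $\rho_i > 0$, $r_i \in C^{1+\alpha}(S^2)$, $W_i \in C^{1+\alpha}(S^2)$, $i = 1, \dots, N$, be given and assume $\xi_i \ne \xi_j$ for $i \ne j$. Then, for small $\varepsilon > 0$, the system
\[ \sum_{h=1}^{N}\int_{\Gamma_h} g(x, y)V_h\!\left(u^h(y)\right)dy = W_i\!\left(u^i(x)\right), \qquad x \in \Gamma_i,\ i = 1, \dots, N, \tag{8.4} \]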

has a unique solution Vi ∈ C α (S 2 ). Moreover + + 3 + + ri (v) − 21 ri ( · ) 2 +ερi Vi − T0 Wi + εT0 (T0 Wi )(v) dv + Ki + + + 4π | · −v| S2 C α (S 2 )   

N N ερ ερ k k  ≤C + + εri C 1+α (S 2 ) + εrh C 1+α (S 2 )  |ξh − ξk | l h=1 k=1, k=h

ερh ερh × + To Wh C α (S 2 ) + ε 2 ri 2C 1+α (S 2 ) To Wi C α (S 2 ) , (8.5) |ξh − ξi | l


where Ki is a constant given by Ki =

N h=1

ερh γ (ξi , ξh )

S2

T0 Wh +

N h=1 h=i

ερh 4π |ξi − ξh |

S2

T0 Wh ,

(8.6)

where γ is the regular part of g and T0 is the Dirichlet-Neumann type operator on the sphere S 2 with replaced by R3 (see Proposition 6.2). Note. In (8.5) we are thinking of V as a function of u ∈ S 2 . This proposition amounts to estimating the error in replacing with S 2 , and thus reducing the problem to the sphere. Note that K = O(ε) and that the right-hand side of (8.5) is O(ε 2 ). Proof. 1. We have g(x, y)Vh uh (y) dy = h

1 Vh uh (y) dy 4π |x − y| h + γ (x, y)Vh uh (y) dy.

(8.7)

h

Let h = {z|z = λu; 0 ≤ λ < 1 + εrh (u), u ∈ S 2 }. We first consider the case x ∈ i , h = i. Let U i : i → R be the function defined by 1 U i (z) := Vi ui (y) dy i 4π |ξi + ερi z − y| 1 i = ερi (ξ + ερ z ) dz . (8.8) u V i i i | 4π |z − z ∂ i Since ri ∈ C 1+α (S 2 ), ∂ i is a surface of class C 1+α (S 2 ) and therefore by Theorem 2.I p. 307 in [27] applied to the derivatives of U i , U i can be extended as a C 1+α function ¯ i of i and with the estimate to the closure U i (·)C 1+α ( ¯ i ) ≤ ερi CVi ui (ξi + ερi ·) C α (∂ i ) , (8.9) where C is O 1 + εri C 1+α (S 2 ) and can be considered as a constant independent of r δ under the standing assumption ri C 1+α (S 2 ) < . The map ∂ i z → ui (ξi + ερi z) ∈ ε S 2 is a C 1+α diffeomorphism and ui (ξi + ερi ·) C 1+α (∂ i ) < Const 1 + εri C 1+α (S 2 ) < C and a similar statement holds true for the inverse map u → z. It follows that Vi ui (ξi + ερi ·) C α (∂ i ) ≤ CVi C α (S 2 ) .

(8.10)

From (8.8) and the discussion after it and in particular from (8.9) and (8.10) we have a map Vi ∈ C α (S 2 ) → U i |∂ i ∈ C 1+α (S 2 ). From this and the properties of the


Xi (u) − ξi discussed above we can define a map I1i : ερi C α (S 2 ) → C 1+α (S 2 ) by setting i

X (u) − ξi , (8.11) ερi I1i Vi (u) = U i ερi

diffeomorphism u → z(u) :=

and that I1i Vi C 1+α (S 2 ) ≤ CVi C α (S 2 ) .

(8.12)

2. Besides this estimate we also need to compute the main term in I1i Vi . From (8.8) and dz = 1 + 2εri + OC α (S 2 ) ε 2 ri 2C 1+α (S 2 ) du (8.13) it follows that

I1i Vi

1 + 2εri (v) + OC α (S 2 ) ε 2 ri 2C 1+α (S 2 ) (v) i (u) = Vi (v) dv i (v) S2 4π X (u)−X ερi 1 2ri (v)Vi (v) = Vi (v) dv + ε dv 2 2 4π |u − v| S S 4π |u − v| O α 2 ri 2 (v)Vi (v) C (S ) C 1+α (S 2 ) +ε2 dv 4π |u − v| S2 i  i (v)  |u − v| − X (u)−X ε ρi 1   i +ε i (v) S 2 4π |u − v| ε X (u)−X ερi 2 × 1 + 2εri (v) + OC α (S 2 ) ε ri 2C 1+α (S 2 ) (v) Vi (v) dv. (8.14)

By the result in [27] quoted above, I1i Vi as well as the first 3 integrals on the right-hand side of (8.14) are C 1+α (S 2 ) functions. Therefore also the last integral on the right-hand side of (8.14) belongs to C 1+α (S 2 ). Let ε (Vi ) (u) be this last integral. We have (Vi )(u)

1 |u − v| − |u − v + ε (ri (u)u − ri (v)v)| = Vi (v) dv ε |u − v + ε (ri (u)u − ri (v)v)| S 2 4π|u − v|

1 |u − v| − |u − v + ε (ri (u)u − ri (v)v)| + ε |u − v + ε (ri (u)u − ri (v)v)| S 2 4π|u − v| × OC α (S 2 ) ε ri C 1+α (S 2 ) (v)Vi (v) dv =: (1 Vi ) (u) + (2 Vi ) (u). (8.15)

From (8.15) it follows that (2 Vi ) C 1+α (S 2 ) ≤ Cε ri 2C α (S 2 ) Vi C α (S 2 ) ,

(8.16)


where we have used the fact that
\[ |u - v| - |u - v + \varepsilon(r_i(u)u - r_i(v)v)| = -\varepsilon\,\frac{\langle r_i(u)u - r_i(v)v,\ u - v\rangle}{|u - v|} + \varepsilon^2 O\left(|r_i(u)u - r_i(v)v|^2\right). \tag{8.17} \]
For $|u| = |v| = 1$ we have
\[ \langle r_i(u)u - r_i(v)v,\ u - v\rangle = \frac{1}{2}\left(r_i(u) + r_i(v)\right)|u - v|^2, \tag{8.18} \]

and so (8.18) implies 1 1 Vi = − 2

ri (·) + ri (v) Vi (v) dv S 2 4π | · −v| +ε OC 1+α (S 2 ) ri 2C 1+α (S 2 ) Vi C α (S 2 ) .

(8.19)
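The algebraic identity (8.18) for unit vectors can be spot-checked numerically. The sketch below evaluates both sides for random pairs of points on $S^2$; the function `r` is a hypothetical smooth stand-in for $r_i$, and the seed and sample count are arbitrary.

```python
import math
import random


def random_unit_vector(rng):
    # Normalized Gaussian vector, uniformly distributed on the unit sphere.
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x]


def r(u):
    # Hypothetical smooth function on the sphere, standing in for r_i.
    return 0.3 * u[0] + u[1] * u[2]


def both_sides(u, v):
    diff = [a - b for a, b in zip(u, v)]
    weighted = [r(u) * a - r(v) * b for a, b in zip(u, v)]
    lhs = sum(w * d for w, d in zip(weighted, diff))
    rhs = 0.5 * (r(u) + r(v)) * sum(d * d for d in diff)
    return lhs, rhs


rng = random.Random(0)
pairs = [(random_unit_vector(rng), random_unit_vector(rng)) for _ in range(100)]
gaps = [abs(lhs - rhs) for lhs, rhs in (both_sides(u, v) for u, v in pairs)]
```

The identity holds because both sides equal $(r_i(u) + r_i(v))(1 - \langle u, v\rangle)$ when $|u| = |v| = 1$, so every entry of `gaps` is at rounding level.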

3. We now turn to the analysis of the second integral in the right-hand side of (8.7) for the case x ∈ i , h = i. We have γ Xi (u), y Vi ui (y) dy i = ε2 ρi2 γ Xi (u), ξi + ερi z Vi ui (ξi + ερi z) dz =:

∂ i 2 2 i ε ρi (I2 Vi )(u).

From this and (8.10) it follows that + + + i + +I2 Vi + 1+α C

(8.20)

(S 2 )

≤ CVi C α (S 2 ) .

(8.21)

We also need an explicit expression for the principal part of I2i Vi . To derive this expression we note that from the definition of X i (u) it follows that ε ρ i γ X i (u), ξi + ερi z = γ (ξi , ξi ) + O , (8.22) l2 where l is defined after formula (3.8). From this and (8.13) it follows that

εri C 1+α (S 2 ) ερi I2i Vi = γ (ξi , ξi ) + 2 Vi C α (S 2 ) . (8.23) Vi + OC 1+α (S 2 ) l l S2 4. From the above analysis and in particular from Eqs. (8.14), (8.16), (8.19), (8.21) it follows that 1 Vi (v) dv g Xi (u), y Vi ui (y) dy = ερi i S2 4π |u− v| +ερi I ii Vi (u), (8.24)


where I ii is a linear operator that satisfies4 + + + ii + +I Vi + 1+α 2 ≤ CVi C α (S 2 ) , C

(8.25)

(S )

where C = O ε ri + ε lρi . 5. We now consider the case x ∈ i , h = i in Eq. (8.7). Clearly for h = i and x = X i (u) both integrals on the right-hand side of (8.7) have, as functions of u ∈ S 2 , the same smoothness as Xi . We only need to analyze how the C 1+α (S 2 ) norm of these functions depends on ε, ρ, r. We can write 1 Vh uh (y) dy i h 4π X (u) − y 1 + OC α (S 2 ) εrh C 1+α (S 2 ) (v) 2 2 Vh (v) dv = ε ρh 4π Xi (u) − X h (v) S2 = ε2 ρh2 ×

1 4π |ξi − ξh |

S2

Vh + ε 2 ρh2 OC 1+α (S 2 )

εrh C 1+α (S 2 ) ε(ρi + ρh ) Vh C α (S 2 ) . + |ξi − ξh | |ξi − ξh |2

(8.26)

For the other integral on the right-hand side of (8.7) we have γ Xi (u), y Vh uh (y) dy h 2 2 = ε ρh γ Xi (u), X h (v) 1 + OC α (S 2 ) εrh C 1+α (S 2 ) (v) Vh (v) dv.

(8.27)

S2

From this and the definition of X h , h = 1, . . . , N that implies

ε (ρi + ρh ) i h γ X (u), X (v) = γ (ξi , ξh ) + O l2 it follows that i h h γ X (u), y Vh u (y) dy h) + = ε 2 ρh2 γ (ξi , ξh ) S 2 Vh + ε 2 ρh2 OC 1+α (S 2 ) ε(ρil+ρ 2

εri C 1+α (S 2 ) l

From Eqs. (8.26), (8.28) it follows that g Xi (u), y Vh uh (y) dy = ερh I ih Vh (u),

Vh C α (S 2 ) . (8.28)

(8.29)

h

where I ih is a linear operator that satisfies + + + ih + +I Vh + 1+α 2 ≤ CVh C α (S 2 ) , C

(S )

h = i,

(8.30)

+ + We mention that if d(ξi , ∂ ) = O(εη ), then +I ii Vi +C 1+α (S 2 ) ≤ Cε 1−(2+α)η (ρi + ri C 1+α (S 2 ) ) × Vi C α (S 2 ) . This estimate is given for future reference. 4


ερi +ερh ερh h where Cε = |ξiερ −ξh | [1 + εrh C 1+α (S 2 ) + O |ξi −ξh | ] + l [1 + εri C 1+α (S 2 ) + h ]. Here we have used again the assumption εrh C 1+α (S 2 ) < δ and also O ερi +ερ l the fact that ερh < δ. 6. Using again Theorem 2.I in [27] we see that the function I ih Vh , h = 1, . . . , N has a harmonic extension both to the interior B1 = {x| |x| < 1} and to the exterior R3 \B1 of S 2 and these harmonic functions can be extended as C 1+α functions to B¯ 1 and R3 \B1 with the estimate + + + + + ih − + + ih + + I Vh + ≤ C V + 1+α 2 , +I h + 1+α ¯ + C (S ) C (B1 ) + + + + + ih + + + + + I Vh + ≤ C +I ih Vh + 1+α 2 , + + 1+α 3 C (S ) C

(R \B1 )

where the subscripts ∓ denote the extensions, and C is a universal constant. From these inequalities and Proposition 6.2 it follows that + + + + + + + + (8.31) +T0 I ih Vh + α 2 ≤ C +I ih Vh + 1+α 2 . C (S )

C

(S )

7. From the above discussion and the observation (cf. Proposition 6.2)

1 T0 V (v) dv = V , S 2 4π | · −v|

(8.32)

it follows that system (8.4) is equivalent to ερi Vi = T0 Wi −

N

ερh T0 I ih Vh .

(8.33)

h=1

From the estimates (8.26), (8.30), (8.33) it follows that + +N + + + ih + ερh T0 I Vh + + + + h=1 C α (S 2 ) . N

ερh ερh 2 Vh C α (S 2 ) + ε ρi ri C 1+α (S 2 ) Vi C α (S 2 ) . ≤C ερh + |ξi − ξh | l h=1

(8.34)

This estimate shows that, provided the scaling parameter $\varepsilon > 0$ is sufficiently small, system (8.33) has a unique solution that can be computed by iteration. The approximate solution $(\rho V)^{(n)} = (\rho_1 V_1^{(n)}, \dots, \rho_N V_N^{(n)})$ at step $n$ is computed by solving Eq. (8.33) with $\rho V = (\rho V)^{(n-1)}$ in the right-hand side, starting with $(\rho V)^{(0)} = 0$. It results, for $i_0 = 1, \dots, N$, in

\[ \varepsilon\rho_{i_0} V_{i_0}^{(1)} = T_0 W_{i_0}, \]

\[ \varepsilon\rho_{i_0} V_{i_0}^{(n)} = T_0 W_{i_0} + \sum_{k=1}^{n-1} (-1)^k \sum_{i_1,\dots,i_k=1}^{N} \Big( \prod_{j=0}^{k-1} T_0 I^{i_j i_{j+1}} \Big) T_0 W_{i_k}, \qquad n \ge 2. \tag{8.35} \]
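The iteration behind (8.35) can be illustrated in finite dimensions. The following is only a sketch under stated assumptions: the operators $T_0 I^{ih}$ are replaced by the entries of a small hypothetical matrix `K` of size $O(\varepsilon)$, the vector `b` stands in for $(T_0 W_i)_i$, and the unknown `x` stands in for $(\varepsilon\rho_i V_i)_i$. None of these numbers come from the paper.

```python
# Sketch: fixed-point iteration x^(n) = b - K x^(n-1), x^(0) = 0,
# a finite-dimensional stand-in for solving (8.33) by iteration.
# K, b and eps are hypothetical illustration data.

def matvec(K, x):
    """Multiply the square matrix K (list of rows) by the vector x."""
    return [sum(kij * xj for kij, xj in zip(row, x)) for row in K]

def fixed_point_iterates(K, b, steps):
    """Return [x^(0), x^(1), ..., x^(steps)] for x^(n) = b - K x^(n-1)."""
    x = [0.0] * len(b)
    history = [x]
    for _ in range(steps):
        Kx = matvec(K, x)
        x = [bi - kxi for bi, kxi in zip(b, Kx)]
        history.append(x)
    return history

eps = 0.05  # plays the role of the small scaling parameter
K = [[0.0, eps * 1.0, eps * 0.5],
     [eps * 0.8, 0.0, eps * 0.3],
     [eps * 0.2, eps * 0.6, 0.0]]
b = [1.0, -0.5, 0.25]

iterates = fixed_point_iterates(K, b, 30)
# iterates[1] is b itself and iterates[2] is b - K b, mirroring the
# first two approximations obtained from (8.35).
```

Because the maximal row sum of `K` is below one, the iterates converge geometrically to the solution of $(I + K)x = b$, which is the finite-dimensional analogue of the smallness condition on $\varepsilon$.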

In (8.35) we have renamed the index $i$ as $i_0$ to write the formula in a more compact way. From this and the estimates (8.24), (8.30) and (8.31) it follows that the solution $\rho V^{\infty}$ of (8.33) satisfies

\[ \varepsilon\rho_{i_0} V_{i_0}^{\infty} = \varepsilon\rho_{i_0} V_{i_0}^{(n)} + R_{i_0}^{(n)}, \qquad i_0 = 1, \dots, N, \tag{8.36} \]

with the estimate + + + (n) + +Ri0 +

C α (S 2 )

≤

∞

(εC)

k

N

k−1 0

ρij +1

+

ρij +1

|ξij − ξij +1 | l

+ + + + +δij ij +1 +rij +1 +C 1+α (S 2 ) +T0 Wik +C α (S 2 ) , i1 ,··· ,ik =1 j =0

k=n

where δhk is Kronecker’s symbol. In particular for n = 2, after setting i0 = i, i1 = h, i2 = k, we have + + + 2+ +Ri0 +

C α (S 2 )

≤C

1N N h=1

×

k=1

ερk ερk + |ξh − ξk | l

ερh ερh + |ξh − ξi | l

+ εri C 1+α (S 2 ) + εrh C 1+α (S 2 )

2

To Wh C α (S 2 ) + ε

2

ri 2C 1+α (S 2 ) To Wi C α (S 2 )

.

From (8.35) it follows

\[ \varepsilon\rho_i V_i^{(2)} = T_0 W_i - \sum_{h=1}^{N} T_0 I^{ih}\, T_0 W_h. \tag{8.37} \]

On the other hand from Eqs. (8.14), (8.16), (8.19), (8.24) and (8.26), it follows + + + ii + 3ri (v) − ri ( · ) +I V i − ε Vi (v) dv − ερi γ (ξi , ξi ) Vi (v) dv + + + 1+α 2 2 S 2 4π | · −v| S2 C (S ) 2 2 ε ρi ε ρi εri =O + (8.38) + ε 2 ri 2C 1+α (S 2 ) Vi C α (S 2 ) , l2 l and Eqs. (8.27), (8.29) imply + + + + ih +I Vh − ερh 4π|ξ1i −ξh | + γ (ξi , ξh ) S 2 Vh (v) dv + 1+α 2 C (S ) ερh εrh C 1+α (S 2 ) ερh εri C 1+α (S 2 ) ερh (ερi +ερh ) ερh (ερi +ερh ) Vh C α (S 2 ) . =O + + + 2 2 |ξ −ξ | l |ξ −ξ | l i h i

h

(8.39) Inserting these expressions of I ih Vh , h = 1, · · · , N into Eq. (8.37) and using the estimate for R2i yields (8.5). The proof of Proposition 8.1 is complete.

9. The Evolution Equation for ρ, r, ξ

We are now in a position to show that the integral equation (3.3) in §3 is equivalent to a system of evolution equations for the $N + N + 3N$ unknowns $\rho = (\rho_1, \dots, \rho_N)$, $r = (r_1, \dots, r_N)$, $\xi = (\xi_1, \dots, \xi_N)$. We assume $r_i \in C^{3+\alpha}(S^2)$ and set in (8.5),

\[ W_i = H|_{\Gamma_i} - E = \frac{1}{\varepsilon\rho_i}\,(1 - \varepsilon L r_i + B_i) - E, \qquad i = 1, \dots, N, \tag{9.1} \]

where we have utilized the expressions for $H$ given in Proposition 5.1. Then we use Proposition 8.1 to solve for $\varepsilon\rho_i V_i$, $i = 1, \dots, N$. The expression for $\varepsilon\rho_i V_i$ obtained through Proposition 8.1 contains, as does the expression (9.1) of $W_i$, the constant $E$ which, as we show in the following proposition, is determined by the conservation requirement

\[ \sum_{h=1}^{N} \int_{\Gamma_h} V = \int_{\Gamma} V = 0, \tag{9.2} \]

which holds automatically for the solution of (8.3). Once $V_i$ is known, we employ Proposition 7.1 with $Z = V$ to compute $\varepsilon\dot\rho_i$, $\varepsilon^2\rho_i r_{it}$, $\dot\xi_i$.

Proposition 9.1. The condition (9.2) uniquely determines the constant $E$. Moreover

\[ E = \frac{1}{\varepsilon\bar\rho} + O\Big( 1 + \frac{\|r\|_{C^{3+\alpha}(S^2)}}{\bar\rho} \Big), \tag{9.3} \]

where $\bar\rho = \frac{1}{N}\sum_{i=1}^{N} \rho_i$ and $\|r\|_{C^{3+\alpha}(S^2)}$ is the norm of $r = (r_1, \dots, r_N)$.

Proof. From (9.1), using Proposition 5.1, Remark 5.2, and the properties of the operator T0 given in Proposition 6.2, we have, making also use of the standing assumption εri C 1+α (S 2 ) < δ, 1 1 ε −E+ T0 Lri + OC α (S 2 ) ri C 1+α (S 2 ) ri C 3+α (S 2 ) ερi ερi ρi 1 1 = − E + OC α (S 2 ) ri C 3+α (S 2 ) . ερi ρi From (9.4), utilizing again Proposition 6.2 it follows

ri (u) − ri (·) ε T0 (T0 Wi )(u)du 2 S 2 4π | · −u|

1 (T0 Wi )(u) 1 = εri T0 Wi − εT0 ri du 2 2 | · −u| S2 1 ri 3 ri = − εri E + ε OC α (S 2 ) (ri C 3+α (S 2 ) ) 2 ρi 2 ρi

1 + − εE OC 2+α (S 2 ) (ri C 3+α (S 2 ) ) ρi 1 + OC α (S 2 ) εri C 1+α (S 2 ) ri C 3+α (S 2 ) ρ i

1 = − εE OC α (S 2 ) (ri C 3+α (S 2 ) ), ρi T 0 Wi =

(9.4)

(9.5)

0 Wi )(u) where, for estimating S 2 (T4π|·−u| du we have used again Theorem 2.I in [27] which implies + + + + X (u) + + + 2 4π | · −u| du+ 1+α 2 ≤ CX C α (S 2 ) . S

C

(S )

We also note that from (9.4) and (8.6) it follows N N pih − ε ρh pih E + ε Ki = h=1

N

O rh C 3+α (S 2 ) ,

h=1, h=i

h=1

1 where we have set pih = 4πγ (ξi , ξh ) + |ξi −ξ , h = i; pii = 0. By inserting (9.4), h| (9.5) and the above expression of Ki into (8.5) yields + + N N + + ri C 3+α (S 2 ) +ερi Vi − 1 + E + ≤ C p + + ε rh C 3+α (S 2 ) ih + + ρi ερi C α (S 2 ) h=1

N N

+ ε

ρk pˆ ih pˆ hk + ε 2

h=1 k=1

N N h=1

+ ε ri C 3+α (S 2 ) +

N

h=1

ρk pˆ ih pˆ kh rh C 3+α (S 2 )

k=1

ρh pih +ε

h=1

N N

ρh ρk pˆ ih pˆ hk E ,

h=1 k=1

(9.6) where we have set pˆ hk = 0=

N i

i=1

= 4π

Vi =

N

1 l

+

N

ε 2 ρi2

i=1

S2

(1 − ερi E) + O ε

N

ri C 1+α (S 2 ) + ε

i=1

N N

ρi pih + ε

i=1 h=1

+ε 2

Vi 1 + OC 1+α (S 2 ) εri C 1+α (S 2 ) du

i=1

+ε

h = k, pˆ hh = 0. From (9.6) and (9.2) it follows

1 |ξh −ξk | ,

N

rh C 3+α (S 2 ) + ε

+ ε2

N i=1

2

ρi ρk pˆ ih pˆ hk + ε 3

ρi ri C 3+α (S 2 ) +

N i=1

i=1 h=1 k=1

N

ρi ri C 1+α (S 2 ) E

i=1

i=1

N N N

2

N N N

ρi

N

rh C 3+α (S 2 )

h=1

ρi ρk pˆ ih pˆ hk rh C 3+α (S 2 )

i=1 h=1 k=1 N h=1

ρh pih + ε

N N

ρh ρk pˆ ih pˆ hk E ,

h=1 k=1

(9.7) and therefore, provided that we assume εrh C 3+α (S 2 ) < δ, ερh < δ,

rC 3+α (S 2 ) 1 E= +O + 1 + ε rC 3+α (S 2 ) + ρ¯ E . ε ρ¯ ρ¯

(9.8)

If ε > 0 is smaller than some ε¯ > 0 this equation can be solved for E yielding the estimate (9.3).

Proposition 9.2. Equation (9.1) implies

1 ε 1 1 Vi = − 2 2 T0 Lri − ερi ερi ε ρ¯ ε

ρi

3 1 ε 1 1 ri − T 0 r i − − ερi ερi ε ρ¯ 2 2

Ki ερh ε|ρ| 1 + + + εrC 1+α (S 2 ) T0 Wh C α (S 2 ) ερi ερi dist dist h 1 1 + O ε2 ri C 1+α (S 2 ) ε ρ¯ ερi 1 + 2 2 O ε2 ri 2C 3+α (S 2 ) , ε ρi

(9.9)

where $K_i(\rho, r)$ is a constant such that

\[ K_i = \frac{1}{\varepsilon\bar\rho}\, O(\varepsilon), \tag{9.10} \]

and where

\[ \frac{1}{\mathrm{dist}} := \frac{1}{|\xi_h - \xi_k|} + \frac{1}{l}. \]

Proof. From Eqs. (9.3), (9.4) it follows

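\[ T_0 W_h = \frac{1}{\varepsilon}\Big( \frac{1}{\rho_h} - \frac{1}{\bar\rho} \Big) - \frac{\varepsilon}{\varepsilon\rho_h}\, T_0 L r_h + K_h + \frac{1}{\varepsilon\rho_h}\, O\big( \varepsilon^2 \|r_h\|^2_{C^{3+\alpha}(S^2)} \big), \tag{9.11} \]

where $K_h = O\big( 1 + \|r\|_{C^{3+\alpha}(S^2)} / \bar\rho \big)$. By inserting Eq. (9.11) into the estimate (8.5) we get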

1 1 1 ε ερi Vi = − T0 Lri + Kh − Ki + O ε2 ri 2C 3+α (S 2 ) − ερi ε ρ¯ ερi ερi

3 1 1 1 2 ri (v) − 2 ri ( · ) −ε T0 − ερi ε ρ¯ 4π | · −v| S2 3 1

1 ε 2 2 2 ri (v) − 2 ri ( · ) −εT0 T0 Lri (v) + Ki + O ε ri C 3+α (S 2 ) − 2 4π | · −v| ερi ερi S

ε|ρ| ερh + + εrC 1+α (S 2 ) T0 Wi C α (S 2 ) dist dist h + O ε2 ri 2C 1+α (S 2 ) T0 Wi C α (S 2 ) . In what follows we will need

Lemma 9.1. The following estimates hold true: 3 1 1 2 ri (v) − 2 ri ( · ) ε T0 Lri (v) = OC α (S 2 ) ε 2 ri 2C 3+α (S 2 ) , εT0 4π| · −v| ερi ερi S2 3 1 1 2 ri (v) − 2 ri ( · ) 1 εT0 OC α (S 2 ) ε 2 ri 2C 3+α (S 2 ) = OC α (S 2 ) ε 2 ri 2C 3+α (S 2 ) , 4π| · −v| ερi ερi S2 3 1 1 2 ri (v) − 2 ri ( · ) εT0 Ki = OC α (S 2 ) ε 2 ri C 1+α (S 2 ) , 4π| · −v| ε ρ¯ S2 ερh T0 Wh C α (S 2 ) = O(1). dist Proof. From Proposition 6.2(i) and the fact that L is a second order operator we have T0 Lri C α (S 2 ) ≤ Cri C 3+α (S 2 ) . From this and Proposition 6.2(iii) it follows that + + + ri (v)T0 Lri (v) + +T0 dv + + α 2 = ri T0 Lri C α (S 2 ) ≤ Cri C α (S 2 ) ri C 3+α (S 2 ) . + 4π| · −v| S2 C (S ) Moreover the above estimate for T0 Lr1 and the same argument used in the proof of Proposition 9.1 imply + + + T0 Lri (v) + + + + 2 4π | · −v| dv + 1+α 2 ≤ Cri C 3+α (S 2 ) . S C (S ) This and Proposition 6.2(i) implies + + + T Lri (v) + +T0 ri dv + + α 2 ≤ Cri C 1+α (S 2 ) ri C 3+α (S 2 ) . + S 2 4π | · −v| C (S ) This concludes the proof of the first estimate. The second estimate is proved by the same argument. From Proposition 6.2 it follows 3 1 1 3 2 ri (v) − 2 ri (·) T0 dv = ri − T0 ri . 2 2 2 4π | · −v| S This and Ki = (9.11).

O(ε) ερ¯

imply the third estimate. The last estimate follows by inserting

Equation (9.9) follows from Lemma 9.1 and the above expression of ερi Vi . The proof of the proposition is complete. Proposition 9.3. There is an ε¯ > 0 such that, provided 0 < ε < ε¯ , the integral equation (8.3) is equivalent to the following system of evolution equations:  ρ dρi 1 1 1  = − ε  ερi ερ¯ ερi + fi (ρ, ξ, r),  dt ∂ri ˜ i − 3 12 3 ri − 1 T0 ri + f r (ρ, ξ, r), i = 1, · · · , N (9.12) = ε31ρ 3 Ar i ∂t 2 2 ε ρ ρ ¯  i i   dξi ξ = fi (ρ, ξ, r), dt

where

1 3 A˜ = T0 s − I + I, 2 2 ρ

(9.13)

ξ

where fi , fir , fi are smooth functions of ρ = (ρ1 , · · · , ρN ) , ξ = (ξ1 , · · · , ξN ) and r = (r1 , · · · , rN ) , r ∈ C 3+α (S 2 ) satisfying the following estimates: 1 1 1 ρ fi = (9.14) O (ε) + 2 2 O ε2 ri 2C 3+α (S 2 ) , ερi ε ρ¯ ε ρi

+ r+ 1 ε|ρ| +f + α 2 = 1 + εrC 1+α (S 2 ) O(1) i C (S ) ε 2 ρi ερi dist . 1 1 1 + O ε2 ri C 1+α (S 2 ) + 2 2 O ε2 ri 2C 3+α (S 2 ) , (9.15) ε ρ¯ ερi ε ρi ξ

fi =

ε|ρ| 1 + εrC 1+α (S 2 ) O(1) ερi dist 1 1 1 + O ε2 ri C 1+α (S 2 ) + 2 2 O ε2 ri 2C 3+α (S 2 ) . ε ρ¯ ερi ε ρi

(9.16)

Remark. Notice that A and A˜ coincide up to higher order terms. Proof. Equations (9.12) as well as (9.13), (9.14) follow from Eqs. (7.4) in Proposition 7.1 when one identifies Z with the right-hand side of Eq. (9.9). With this expression for Z we have

√ 1 1 1 Z, w0 L2 (S 2 ) = 2 π 2 − ε ρi ρi ρ¯

√ K ε|ρ| 1 +2 π + εrC 1+α (S 2 ) (1) + ερi ερi dist 1 1 + O ε2 ri C 1+α (S 2 ) ε ρ¯ ερi 1 + 2 2 O ε2 ri 2C 3+α (S 2 ) , (9.17) ε ρi

% & ε|ρ| 1 Z, wj L2 (S 2 ) = + εrC 1+α (S 2 ) (1) ερi dist 1 1 + O ε2 ri C 1+α (S 2 ) ε ρ¯ ερi 1 j = 1, 2, 3, (9.18) + 2 2 O ε2 ri 2C 3+α (S 2 ) , ε ρi where we have used the requirement ri , wj L2 (S 2 ) = 0, j = 0, 1, 2, 3 and also that, on the basis of Theorem 6.3 the orthogonal complement of span wj j =0,1,2,3 is invariant % & under T0 : ri , wj L2 (S 2 ) = 0 ⇒ T0 ri , wj L2 (S 2 ) = 0, j = 0, 1, 2, 3.

We also note that, under the standing assumption εri C 1+α (S 2 ) < δ, the expression 3 Z, wh L2 is estimated by the right-hand side of Eq. (9.9). On the εri C 1+α (S 2 ) h=1

other hand setting Z = V and using Eq. (9.9) ε2 ri 2C 1+α (S 2 ) ZC α (S 2 )

ρi 1 ρi ≤ Cε2 ri 2C 1+α (S 2 ) 2 2 1 − + εri C 3+α (S 2 ) + εri C 1+α (S 2 ) 1 − ρ¯ ρ¯ ε ρi 1 1 + O(ε) + err ερi ε ρ¯

εri C 1+α (S 2 ) 1 1 O(ε) err ≤ err, (9.19) ≤ C ε 2 ri 2C 1+α (S 2 ) 2 2 + ερi ε ρ¯ ε ρi where err stands for an expression that can be estimated like the right-hand side of Eq. (9.16). Using Eqs. (9.15), (9.16), (9.17), Lemma 9.1, the estimate (9.10) of K , the previ3 % & Z, wj ≤ err, and recalling that L = s + 2I , ous observation that εri C 1+α (S 2 ) j =1

the proof is concluded by a standard computation.

10. The ρ, r, ξ Estimates

After rescaling time by $t = \varepsilon^3 \tau$, system (9.12) becomes

\[ \begin{cases} \dfrac{d\rho_i}{d\tau} = \dfrac{1}{\rho_i}\Big( \dfrac{1}{\bar\rho} - \dfrac{1}{\rho_i} \Big) + \varepsilon^2 f_i^{\rho}(\rho, r, \xi), \\[2mm] \dfrac{\partial r_i}{\partial\tau} = \dfrac{1}{\rho_i^3}\, \tilde A r_i - \dfrac{1}{\rho_i^2 \bar\rho}\,(3 r_i - T_0 r_i) + \varepsilon^3 f_i^{r}(\rho, r, \xi), \\[2mm] \dfrac{d\xi_i}{d\tau} = \varepsilon^3 f_i^{\xi}(\rho, r, \xi), \end{cases} \tag{10.1} \]

that is to be considered with initial conditions

\[ \rho_i(0) = \rho_{i0}, \qquad r_i(0) = r_{i0}, \qquad \xi_i(0) = \xi_{i0}. \tag{10.2} \]

We now analyze Eqs. (10.1) and derive the estimates needed for the proof of Theorem 1. In references [14] and [17] it has been shown that, given a smooth $\Gamma_0$, there exists a $T > 0$ such that the Mullins-Sekerka problem has a unique classical solution $t \to \Gamma(t)$, $t \in [0, T)$, which satisfies $\Gamma(0) = \Gamma_0$. As discussed in Sect. 9, if $\Gamma_0$ is of the form $\Gamma_0 = \cup_{i=1}^{N} \Gamma_{i0}$ with $\Gamma_{i0} = \{ x \mid x = \xi_{i0} + \varepsilon\rho_{i0}(1 + \varepsilon r_{i0}(u))\,u,\ u \in S^2 \}$ and $\varepsilon > 0$ is sufficiently small, then (4.1), (4.3) is equivalent to Eqs. (9.12) and therefore to Eqs. (10.1). Thus we can assume that Eqs. (10.1) with the data (10.2) have a classical solution $\rho_i = \rho_i(\tau)$, $r_i = r_i(\tau)$, $\xi_i = \xi_i(\tau)$ in some maximal interval of existence $[0, T)$. The proposition below links Eqs. (10.1) with (0.15) by showing that $\rho_i$ can be approximated well by $R_i$, the solutions of (2.1). The proposition focuses on the interval $(0, T_1)$, where $T_1$ is the first extinction time. By repeating the argument we obtain the analogous results over $(T_1, T_2), \dots, (T_{N-2}, T_{N-1})$.

Proposition 10.1. Assume $N \ge 2$. Assume there exists $\zeta > 0$ such that

\[ \|r_i(\tau)\|_{C^{3+\alpha}(S^2)} < \zeta, \qquad \tau \in [0, T). \tag{10.3} \]

Assume also that

\[ \rho_1(0) < \rho_2(0) < \dots < \rho_{N-1}(0) < \rho_N(0), \qquad \rho_i(0) = R_i(0), \quad i = 1, \dots, N, \tag{10.4} \]

and let $R_i$ be defined on $[0, \hat T_1)$, where $\hat T_1$ is characterized by $R_1(\hat T_1) = 0$. Then there exist $\bar\varepsilon > 0$ and positive constants $a, C_\tau, c_1, C_1, c_\rho, C_\rho, C_R$ depending only on $\rho_i(0)$, $i = 1, \dots, N$, such that for $\varepsilon \in (0, \bar\varepsilon)$ the following estimates hold:

(i) $|T - \hat T_1| < \varepsilon C_\tau$,
(ii) $c_1 (T - \tau)^{1/3} < \rho_1(\tau) < C_1 (T - \tau)^{1/3}$,
(iii) $c_\rho < \rho_i(\tau) < C_\rho$, $\tau \in [0, T)$, $i > 1$,
(iv) $|\rho_i(\tau) - R_i(\tau)| < \varepsilon^{1/3} C_R$, $\tau \in [0, \hat T_1 - \varepsilon C_\tau]$,
(v) $1 - \dfrac{\rho_1}{\bar\rho}(\tau) > a$, $\tau \in [0, T)$,
(vi) $\bar\rho(\tau) > c$, $\tau \in [0, T)$.

Also $T = T_1$, the extinction time of $\rho_1$, and $T_1 < T_2 < \dots < T_{N-1}$, where $\rho_i(T_i) = 0$, $i = 1, 2, \dots, N - 1$.

Proof. Step 1. Equation (10.1)$_1$ can be rewritten in the form

\[ \frac{d\rho_i}{d\tau} = \frac{1}{\rho_i^2}\Big( \frac{\rho_i}{\bar\rho} - 1 + g_i^{\rho} \Big), \qquad i = 1, \dots, N, \tag{10.5} \]

where $g_i^{\rho}$ satisfies the estimate

\[ g_i^{\rho} = \frac{\rho_i}{\bar\rho}\, O(\varepsilon) + \varepsilon^2\, O\big( \|r_i\|^2_{C^{3+\alpha}(S^2)} \big). \tag{10.6} \]

It follows that for $\varepsilon$ small enough $0 < \rho_1(\tau) < \rho_2(\tau) < \dots < \rho_N(\tau)$ on $[0, T)$, from which we deduce that $T = T_1$. At this point we have not excluded the possibility $T_1 = +\infty$. From (10.3), (10.6) we have

\[ |g_i^{\rho}| < C\varepsilon, \qquad C > 0. \tag{10.7} \]
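To make the leading-order dynamics of (10.5) concrete, here is a minimal numerical sketch. It assumes the perturbation $g_i^{\rho}$ is set to zero, uses a classical fourth-order Runge-Kutta step, and stops before the smallest radius reaches the singular value zero. The initial radii, step size and stopping threshold are hypothetical choices, not data from the paper.

```python
# Sketch: integrate d(rho_i)/d(tau) = (rho_i / rho_bar - 1) / rho_i**2,
# i.e. (10.5) with g_i^rho = 0, where rho_bar is the arithmetic mean.
# All numerical parameters below are illustrative assumptions.

def mean_field_rhs(rho):
    """Right-hand side of the unperturbed system for a list of radii."""
    rho_bar = sum(rho) / len(rho)
    return [(r / rho_bar - 1.0) / (r * r) for r in rho]

def rk4_step(rho, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = mean_field_rhs(rho)
    k2 = mean_field_rhs([r + 0.5 * dt * k for r, k in zip(rho, k1)])
    k3 = mean_field_rhs([r + 0.5 * dt * k for r, k in zip(rho, k2)])
    k4 = mean_field_rhs([r + dt * k for r, k in zip(rho, k3)])
    return [r + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for r, a, b, c, d in zip(rho, k1, k2, k3, k4)]

def evolve(rho0, dt=1e-4, rho_min=0.3, max_steps=200000):
    """Integrate until the smallest radius drops to rho_min (or max_steps)."""
    rho, tau = list(rho0), 0.0
    for _ in range(max_steps):
        if min(rho) <= rho_min:
            break
        rho = rk4_step(rho, dt)
        tau += dt
    return tau, rho

def total_cube(rho):
    """Sum of cubes, proportional to the total volume of the particles."""
    return sum(r ** 3 for r in rho)

rho_start = [1.0, 1.5, 2.0]
tau_end, rho_end = evolve(rho_start)
```

In this unperturbed model the sum of cubes is conserved exactly by the differential equation, since $\sum_i (\rho_i/\bar\rho - 1) = 0$. The sketch reproduces this up to discretization error, while the smallest particle shrinks and the largest grows.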

Step 2. Given a compact time interval $[0, A]$, $A < T_1$, the following estimates hold on $[0, A]$, with the constants in the estimates degenerating possibly only if $A \to +\infty$.

(a) $\sum_{i=1}^{N} \big( \rho_i^3(\tau) - \rho_i^3(0) \big) = O(\varepsilon)$,
(b) $\sum_{i=1}^{N} \big( \rho_i^3(\tau) - R_i^3(\tau) \big) = O(\varepsilon)$,
(c) $\rho_i$ bounded uniformly in $\varepsilon$,
(d) $\rho_i$ equicontinuous in $\varepsilon$,
(e) $\bar\rho(\tau) \ge c > 0$.

Verification: (b) follows from (a) via the conservation of $\sum_{i=1}^{N} R_i^3(\tau)$. (c) follows from (a) trivially. (e) follows also from (a) since

\[ \sum_{i=1}^{N} \rho_i \ge \Big( \sum_{i=1}^{N} \rho_i^3 \Big)^{1/3}. \]

We argue (a):

\[ \frac{d}{d\tau}\Big( \frac{1}{3} \sum_{i=1}^{N} \rho_i^3 \Big) = \sum_{i=1}^{N} \Big( \frac{\rho_i}{\bar\rho} - 1 + O(\varepsilon) \Big) = O(\varepsilon) \]

by (10.5), (10.7); the leading terms cancel because $\sum_{i=1}^{N} \rho_i = N\bar\rho$. Integrating we obtain

\[ \sum_{i=1}^{N} \rho_i^3 - \sum_{i=1}^{N} \rho_i^3(0) \le C\varepsilon. \]

From this we have estimate (a). To establish (d) note that

\[ \frac{1}{3}\, \frac{d\rho_i^3}{d\tau} = \frac{\rho_i}{\bar\rho} - 1 + g_i^{\rho}. \]

By (e), (c) it follows that $(\rho_i^3)'$ is bounded on $[0, A]$, uniformly in $\varepsilon$, and so $\rho_i^3$ and $\rho_i$ are continuous on $[0, A]$, uniformly in $\varepsilon$.

Step 3. Let $T^* > \hat T_1$. Then

\[ \lim_{\varepsilon \to 0}\ \min_{[\hat T_1, T^*]} \rho_1 = 0. \]

We argue this by contradiction. Assume that $\liminf_{\varepsilon \to 0} \inf_{[\hat T_1, T^*]} \rho_1 > 0$. Then $\liminf_{\varepsilon \to 0} \inf_{[\hat T_1, T^*]} \rho_j \ge c > 0$, $j = 1, \dots, N$. Since there is no singularity in $[\hat T_1, T^*]$, by continuous dependence we can pass to the limit in (10.5) and deduce that $R_1(\tau) \ge c > 0$ on $[\hat T_1, T^*]$, contradicting that $\hat T_1 < T^*$. Finally the $\liminf$, etc., can be replaced by $\lim$ by the equicontinuity of $\rho_1$.

Step 4. (iii), (v), (vi). We show that $T_1 < \infty$, $\rho_1 > 0$ on $[0, T_1)$, $T_1 < T_2 < \dots < T_{N-1}$. We argue by contradiction. Assume that $T_1 = \infty$, hence $T = \infty$ (see Step 1). By (10.5), (10.7), for $i = 1$ we have

\[ \frac{d}{d\tau}\Big( \frac{1}{3} \rho_1^3 \Big) \le \frac{\rho_1}{\bar\rho} - 1 + C\varepsilon. \]

Fix a $T^* > \hat T_1$. By Step 3, for $\varepsilon$ sufficiently small $\rho_1 / \bar\rho$ is, say, less than $\frac{1}{2}$ at $T^*$ and hence over an interval $[T^*, T^* + \delta]$ which can be chosen uniformly in $\varepsilon$ (by the equicontinuity of $\rho_1$). It follows that $\rho_1$ decreases on this interval by a fixed amount. By equicontinuity we can repeat the argument over $[T^* + \delta, T^* + 2\delta]$, etc. This establishes that $T_1 < \infty$. It follows that the estimates in Step 2 above hold with $[0, A]$ replaced by $[0, T_1)$. In particular (vi) is established. Next we consider Eqs. (10.5) for $i = 1, 2$. By subtracting we have

\[ \frac{1}{3}\, \frac{d}{d\tau}\big( \rho_2^3 - \rho_1^3 \big) \ge \frac{1}{\bar\rho}\,(\rho_2 - \rho_1) - C\varepsilon \ge -C\varepsilon \quad \text{on } [0, T_1), \]

where we used that $\rho_2 > \rho_1$ and (10.7). Integrating we obtain

\[ \rho_2^3(\tau) \ge \rho_2^3(0) - \rho_1^3(0) - 3C\varepsilon\tau \]

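from which it follows that $\rho_2^3(T_1) \ge (\rho_2^3(0) - \rho_1^3(0)) - 3C\varepsilon T_1$, hence $T_2 > T_1$ if $\varepsilon$ is small. Similarly we show that $T_{N-1} > T_{N-2} > \dots > T_2 > T_1$. From these (iii) and (v) follow.

Step 5. (ii). From (10.5), with $i = 1$, we obtain

\[ \frac{d}{d\tau}\Big( \frac{1}{3} \rho_1^3 \Big) = \frac{\rho_1}{\bar\rho} - 1 + O(\varepsilon) \le \frac{\rho_1}{c} - 1 + C\varepsilon < -\frac{1}{2} \]

on some $[T_1 - \delta, T_1)$, where we have utilized the lower bound on $\bar\rho$. Integrating this inequality from $\tau$ to $T_1$ we obtain the lower bound in (ii). To obtain the other half we utilize the upper bound on $\bar\rho$:

\[ \frac{d}{d\tau}\Big( \frac{1}{3} \rho_1^3 \Big) \ge \frac{\rho_1}{c} - 1 - C\varepsilon \ge -C. \]

Integrating this from $\tau$ to $T_1$ we obtain the other inequality.

Step 6. (iv).

\[ \frac{1}{3}\, \frac{d}{d\tau}\big( R_i^3 - \rho_i^3 \big) = \frac{R_i}{\bar R} - \frac{\rho_i}{\bar\rho} + O(\varepsilon). \]

Thus integrating

\[ |R_i^3(\tau) - \rho_i^3(\tau)| \le 3 \int_0^{\tau} \Big| \frac{R_i}{\bar R} - \frac{\rho_i}{\bar\rho} \Big|\, dt + C\varepsilon. \]

By utilizing $\bar R \ge c_1 > 0$, $\bar\rho \ge c_2 > 0$, $\rho_i < c_3$ and $R_i \ge R_1$ we obtain

\[ R_1^2 \sum_{i=1}^{N} |R_i - \rho_i| \le C \int_0^{\tau} \sum_{i=1}^{N} |R_i - \rho_i|\, dt + C\varepsilon. \]

Set

\[ y(\tau) = R_1^2(\tau) \sum_{i=1}^{N} |R_i(\tau) - \rho_i(\tau)|. \]

Then

\[ y(\tau) \le C \int_0^{\tau} \frac{y(s)}{R_1^2(s)}\, ds + C\varepsilon. \]

Utilizing that

\[ C (\hat T_1 - \tau)^{1/3} \ge R_1(\tau) \ge c\, (\hat T_1 - \tau)^{1/3} \]

(this estimate can be shown by an argument identical to the one establishing (ii) above), which implies that the singularity is integrable, we obtain by a Gronwall argument $y(\tau) \le C\varepsilon$, hence

\[ R_1^2(\tau) \sum_{i=1}^{N} |R_i(\tau) - \rho_i(\tau)| \le C\varepsilon. \]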

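So,

\[ \sum_{i=1}^{N} |R_i(\tau) - \rho_i(\tau)| \le \frac{C\varepsilon}{R_1^2(\tau)}. \]

Now $R_1(\tau) \ge C\varepsilon^{1/3}$ if $\tau \le \hat T_1 - C_\tau \varepsilon$ by the analog of (ii) for $R_1$ mentioned above, and so

\[ \sum_{i=1}^{N} |R_i(\tau) - \rho_i(\tau)| \le C_R\, \varepsilon^{1/3}, \qquad \tau \in [0, \hat T_1 - C_\tau \varepsilon]. \]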
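The Gronwall step in Step 6 relies on the weight $1/R_1^2$ being integrable up to $\hat T_1$. As a hedged illustration only, assume the model weight $1/(c^2 (\hat T_1 - s)^{2/3})$ coming from the lower bound on $R_1$, and an inequality of the form $y(\tau) \le C \int_0^{\tau} w(s)\, y(s)\, ds + C\varepsilon$. The function names and the sample constants are hypothetical.

```python
import math

# Sketch: Gronwall bound y(tau) <= C*eps*exp(C * int_0^tau w(s) ds)
# with the model weight w(s) = 1 / (c**2 * (T_hat - s)**(2/3)).
# T_hat, c, C and eps are illustrative assumptions.

def weight_integral(tau, T_hat, c):
    """Closed form of int_0^tau ds / (c^2 (T_hat - s)^(2/3))."""
    return 3.0 * (T_hat ** (1.0 / 3.0) - (T_hat - tau) ** (1.0 / 3.0)) / c ** 2

def gronwall_bound(eps, tau, T_hat, c, C):
    """Upper bound for y(tau) given by Gronwall's lemma."""
    return C * eps * math.exp(C * weight_integral(tau, T_hat, c))

T_hat, c, C = 1.0, 0.8, 2.0
bound_mid = gronwall_bound(1e-3, 0.5, T_hat, c, C)
bound_end = gronwall_bound(1e-3, T_hat, T_hat, c, C)
```

The bound stays finite all the way to $\tau = \hat T_1$ because the exponent $2/3$ is below one, and it is linear in $\varepsilon$, which is the content of $y(\tau) \le C\varepsilon$ with a new constant.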

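The verification of (iv) is complete.

Step 7. (i). First we note that (ii), (iv) imply a lower bound on $T_1$. Indeed, if $T_1 \le \hat T_1 - C\varepsilon$, $C \ge C_\tau$, then by (iv) $|\rho_1(T_1) - R_1(T_1)| \le C_R\, \varepsilon^{1/3}$, hence $|R_1(T_1)| \le C_R\, \varepsilon^{1/3}$. On the other hand, by the lower bound on $R_1$,

\[ c\, (\hat T_1 - T_1)^{1/3} \le C_R\, \varepsilon^{1/3}. \]

Hence, if $c\, C^{1/3} > C_R$ we arrive at a contradiction. Thus $T_1 \ge \hat T_1 - C\varepsilon$, $C$ appropriate. To obtain an upper bound we argue as follows: From (iv), for $C > C_\tau$,

\[ |\rho_1(\hat T_1 - C\varepsilon) - R_1(\hat T_1 - C\varepsilon)| < C_R\, \varepsilon^{1/3}, \]

we obtain $\rho_1(\hat T_1 - C\varepsilon) < C^* \varepsilon^{1/3}$. For $\tau \ge \hat T_1 - C\varepsilon$ we have $\dfrac{d\rho_1}{d\tau} \le (-1 + \delta)\, \dfrac{1}{\rho_1^2}$, where we utilized the equicontinuity of $\rho_1$. Integrating we obtain

\[ 0 \le \frac{1}{3} \rho_1^3(\tau) \le \frac{1}{3} \rho_1^3(\hat T_1 - C\varepsilon) + (-1 + \delta)\big( \tau - (\hat T_1 - C\varepsilon) \big) \le \frac{1}{3} (C^*)^3 \varepsilon + (-1 + \delta)\big( \tau - (\hat T_1 - C\varepsilon) \big), \]

from which we obtain an upper bound for $\tau$ of the form $\hat T_1 + C\varepsilon$. The proof of Proposition 10.1 is complete.

Proposition 10.2. Assume the same as in Proposition 10.1. Then, for $\varepsilon \in (0, \bar\varepsilon)$, we have the estimate

\[ |\xi_i(\tau) - \xi_{i0}| < C_\xi\, \varepsilon^3, \qquad \tau \in [0, T), \tag{10.8} \]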

for some constants Cξ > 0 independent of ε. Proof. If we assume that the inequality (10.3) holds and also utilize Proposition 10.1 we have for any τ ∈ [0, C), 1 ξ |f1 | < C(T − τ )− 3 , (10.9) ξ |fi | < C, i > 1, where ξ

fi =

1 2 ε ρi2

ε|ρ| + εrC 3+α (S 2 ) dist

1 1 ερi + O(ε 2 ri C 3+α (S 2 ) ) = O(ε2 ) dist ρ¯ ε ρ¯ ερi

for some constant C > 0. This and Eq. (10.1)3 imply the result.

Proposition 10.3. Let $s_i$, $i = 1, \dots, N$, be defined by

\[ s_i := \int_0^{\tau} \frac{d\tau}{\rho_i^3(\tau)}, \qquad i = 1, \dots, N, \quad \tau \in [0, T), \tag{10.10} \]

and assume the same as in Proposition 10.1. Then

\[ s_1 \to \infty \quad \text{as } \tau \to T^-, \tag{10.11} \]

\[ s_i \to \bar s_i < \frac{\hat T_1 + \varepsilon C_\tau}{c_\rho^3} \quad \text{as } \tau \to T^-, \qquad i > 1. \tag{10.12} \]

Proof. This follows from Proposition 10.1 (ii), (iii).
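The dichotomy in Proposition 10.3 can be checked on a model profile. The sketch below assumes $\rho_1(\tau) = c\,(T - \tau)^{1/3}$ exactly (the two-sided bound (ii) with equal constants), for which $s_1(\tau) = c^{-3} \ln\big( T / (T - \tau) \big)$, and a constant radius for the bounded case. The quadrature routine and all parameter values are hypothetical illustration choices.

```python
import math

# Sketch: pseudo-time s(tau) = int_0^tau dt / rho(t)**3 by the midpoint rule.
# The profiles below are model assumptions, not solutions of (10.1).

def pseudo_time(rho, tau_end, n=50000):
    """Midpoint-rule approximation of the pseudo-time integral."""
    h = tau_end / n
    return sum(h / rho((k + 0.5) * h) ** 3 for k in range(n))

T, c = 1.0, 1.0

def rho_shrinking(t):
    """Model radius of the particle that disappears at time T."""
    return c * (T - t) ** (1.0 / 3.0)

def rho_bounded(t):
    """Model radius of a particle that stays away from zero."""
    return 2.0

s1_near_T = pseudo_time(rho_shrinking, 0.999)
s1_closed_form = math.log(T / (T - 0.999)) / c ** 3
si_near_T = pseudo_time(rho_bounded, 0.999)
```

For the shrinking profile the pseudo-time grows like a logarithm of $1/(T - \tau)$ and is therefore unbounded, whereas for a radius bounded below it never exceeds $\tau$ divided by the cube of the lower bound.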

It is convenient to utilize the “pseudo-times” $s_i$ in estimating $r_i$ in order to handle the factor $\rho_i^{-3}$ in front of $\tilde A r_i$ in Eq. (10.1)$_2$. We also need the following lemmas.

Lemma 10.4. Let $\tilde A$ be the operator defined in Eq. (9.13); then $\tilde A$ is a self-adjoint operator on $X^{-\frac{1}{2}}$. The eigenvalues of $\tilde A$ are given by

\[ \nu_n = (1 - n(n+1))\,(2n+1) + 3, \qquad n = 1, 2, \dots, \tag{10.13} \]

and $\nu_n$ has multiplicity $2n + 1$ and the corresponding eigenspace is the span of the $2n + 1$ spherical harmonics $Y_n(\varphi, \vartheta)$. Moreover all the eigenvalues of $\tilde A$ restricted to the subspace $X = \{ r \in X^{-\frac{1}{2}} \mid \langle r, w_i \rangle_{L^2} = 0,\ i = 0, 1, 2, 3 \}$ satisfy

\[ \nu \le -22. \tag{10.14} \]

Proof. It follows immediately from Theorem 6.3.
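The spectral bound (10.14) can be read off directly from formula (10.13). The short sketch below evaluates $\nu_n$ for a range of degrees. It assumes, as an interpretation not stated explicitly here, that the orthogonality conditions defining $X$ remove exactly the degrees below $n = 2$, so that the restricted spectrum is $\{\nu_n : n \ge 2\}$.

```python
# Sketch: evaluate nu_n = (1 - n(n+1)) (2n+1) + 3 from (10.13).
# Restricting to n >= 2 is an assumption about the subspace X.

def nu(n):
    """Eigenvalue nu_n of (10.13) for spherical-harmonic degree n."""
    return (1 - n * (n + 1)) * (2 * n + 1) + 3

def multiplicity(n):
    """Multiplicity 2n + 1 of nu_n."""
    return 2 * n + 1

eigenvalues = {n: nu(n) for n in range(1, 8)}
largest_restricted = max(nu(n) for n in range(2, 200))
```

Here `nu(1)` vanishes and `nu(2)` equals $-22$. The sequence is strictly decreasing in $n$, so the largest eigenvalue on degrees $n \ge 2$ is $\nu_2 = -22$, in agreement with (10.14).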

Lemma 10.5. There exist constants $\mu > 0$, $M > 0$, $\beta \in (0, 1)$ such that the semigroup generated by $\tilde A$ in $E_1$ satisfies

\[ \| e^{\tilde A s} \varphi \|_{C^{3+\alpha}(S^2)} \le M e^{-\mu s} \| \varphi \|_{C^{3+\alpha}(S^2)}, \qquad \varphi \in E_1, \tag{10.15} \]

\[ \| e^{\tilde A s} \varphi \|_{C^{3+\alpha}(S^2)} \le \frac{M}{s^{\beta}}\, e^{-\mu s} \| \varphi \|_{C^{2+\alpha}(S^2)}, \qquad \varphi \in h^{2+\alpha}(S^2) \cap E_0, \quad \beta = \frac{1}{3}. \tag{10.16} \]

Moreover if $\varphi : (0, \bar s] \to E_0$ is continuous,

\[ \sup_{0 < s \le \bar s} \Big\| \int_0^{s} e^{\tilde A (s - \sigma)} \varphi(\sigma)\, d\sigma \Big\|_{C^{3+\alpha}(S^2)} \le \bar C \sup_{0 < s \le \bar s} \| \varphi(s) \|_{C^{\alpha}(S^2)}, \tag{10.17} \]

where $\bar C > 0$ is a constant independent of $\bar s$.

Proof. Estimates (10.15)–(10.16) are well-known consequences of the spectral bound, the fact that $\tilde A$ generates an analytic semigroup on $E_1$, and of basic interpolation properties of the “little” Hölder spaces. Estimate (10.16) states that $\tilde A$ belongs to $M_1(E_0, E_1)$. We note that generally for an operator in $M_1(E_0, E_1)$ the inequality (10.17) holds with $\bar C$ replaced by a $C_{\bar s}$ which grows with $\bar s$. In our special case we can take $\bar C$ independent of $\bar s$ because, as we have seen in Eq. (10.14), the spectrum of $\tilde A$ is bounded above by a negative number $-\mu$, with $\mu < 22$ as close to 22 as desired. The proof of Lemma 10.5 is complete.

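In the following we shall assume, as we can, that the constant $M$ in Lemma 10.5 satisfies

\[ M > 1. \tag{10.18} \]

Proposition 10.6. Assume $N \ge 2$. Then there exist $\bar\varepsilon > 0$ and $\zeta > 0$, independent of $\varepsilon$, such that, for $\varepsilon \in (0, \bar\varepsilon)$, the inequality (10.3) holds.

Proof. 1. If we introduce the “pseudo-times” $s_i$ defined by Eq. (10.10), system (10.1) can be written in the form

\[ \begin{cases} \dfrac{d\rho_i}{ds_i} = \rho_i \Big( \dfrac{\rho_i}{\bar\rho} - 1 + g_i^{\rho} \Big), \\[2mm] \dfrac{dr_i}{ds_i} = \tilde A r_i - \dfrac{\rho_i}{\bar\rho}\,(3 r_i - T_0 r_i) + \varepsilon g_i^{r}, \\[2mm] \dfrac{d\xi_i}{ds_i} = \varepsilon g_i^{\xi}, \end{cases} \tag{10.19} \]

where $g_i^{\rho} = \varepsilon^2 \rho_i^2 f_i^{\rho}$, $g_i^{r} = \varepsilon^2 \rho_i^3 f_i^{r}$, $g_i^{\xi} = \varepsilon^2 \rho_i^3 f_i^{\xi}$ and estimates (9.14) imply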

ρ g < C0 ερi O(ε) + 1 O(ε 2 ri 2 3+α 2 ) , i C (S ) ερi 1

N + r+ 1 ρh +g + α 2 < C0 ρ 2 1 ε|ρ| + εrC 3+α (S 2 ) (1 − ) i C (S ) i ρi dist dist ρ h=1 2 1 , + 2 O(ε 2 r2C 3+α (S 2 ) ) ε ρi

ρi2 ξ 2 ε|ρ| 2 + εrC 3+α (S 2 ) + O(ε ri C 3+α (S 2 ) ) , (10.20) gi < C0 ερi dist ρ¯ for some constant C0 > 0. 2. Let ζ > 0 be any number that satisfies ri0 C 3+α (S 2 ) < ζ,

i = 1, · · · , N.

(10.21)

Then by continuity there exists T ≤ T such that the inequality (10.3) holds in [0, T ). Then, for ε << 1, the estimates (iii), (v), (vi) in Proposition (10.1) hold in [0, T ) and we can also assume ρ |gi | < εC0 Cρ + ζ + εζ 2 < a. Therefore Eq. (10.19)1 implies ρ1 (τ (s1 )) ≤ ρ10 e−as1 .

(10.22)

Let sˆ1 be chosen large enough so that, M(3 + T0 )N for β fixed as in (10.16).

ρ10 −a sˆ1 e sup ρN0 s1 ≥ˆs1

s1

sˆ1

e−µ(s1 −s) 1 ds < β (s1 − s) 4

(10.23)

3. In the interval $[0, T)$ we have from (10.10), (10.22) and estimate (iii),

\[ s_i = \int_0^{s_1} \frac{\rho_1^3}{\rho_i^3}\, ds_1 \le \Big( \frac{\rho_{10}}{c_\rho} \Big)^3 \int_0^{s_1} e^{-3 a s_1}\, ds_1 \le \frac{1}{3a} \Big( \frac{\rho_{10}}{c_\rho} \Big)^3 = \hat s, \qquad i > 1. \tag{10.24} \]

Let s¯1 = max{ˆs , sˆ1 }. ρi 4. The map si → (3ri − T0 ri ) (τ (si )) is a continuous map from si ([0, T ))−1 into ρ¯ C 2+α (S 2 ) and we have + + + ρi + N Cρ + (3ri − T0 ri )+ (10.25) + ρ¯ + 2+α 2 ≤ ρ (3 + T0 )ri C 3+α (S 2 ) . N0 C (S ) From this and (10.16) it follows that + s + + i A(s −s) ρi + i + + e − T r ) (τ (s)) ds (3r i 0 i + + 3+α 2 ρ¯ 0 C (S ) N Cρ 1−β ≤M (3 + T0 )si sup ri (τ (s))C 3+α (S 2 ) . ρN0 0<s≤si

(10.26)

We fix σ > 0 small, so that MN

Cρ 1 (3 + T0 )σ 1−β < . ρN0 4

(10.27)

s¯1 + 1, where s¯ν , σ are defined in 3 and 4 and where [] stands for the 5. Let k¯ = σ integer part. 6. From Eq. (10.20) and the estimate (iii) in Proposition 10.1 we have for τ ∈ [0, T ), + r+ +g + α 2 ≤ C0 Cρ ζ + ζ ri C 3+α (S 2 ) . i C (S ) Therefore from (10.17) it follows that + s + + ˜ (s−σ ) r + A + sup + e εgi + + 3+α 2 0<s≤si 0 C (S ) ¯ ¯ 0 ζ sup ri (τ (s))C 3+α (S 2 ) = ε CC0 Cρ Cρ + ζ + ε CC 0<s≤si

≤ εC1 (1 + ζ ) + εC1 ζ sup ri (τ (s))C 3+α (S 2 ) ,

(10.28)

0<s≤si

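where $C_1$ is a suitably chosen constant.

7. Assume $\bar\varepsilon > 0$ so small that for $\varepsilon \in (0, \bar\varepsilon)$,

\[ 2\varepsilon C_1 \zeta < \frac{1}{4}, \tag{10.29} \]

\[ 2\varepsilon C_1 \sum_{k=0}^{\bar k - 1} (2M)^k < \frac{1}{8M}. \tag{10.30} \]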

We now make a definite choice for $\zeta$ (cf. (10.21)):

\[ \zeta = 8M (2M)^{\bar k}\, r_0 + 1, \tag{10.31} \]

where $\bar k$ is defined in 5 and $r_0 = \max_i \| r_{i0} \|_{C^{3+\alpha}(S^2)}$.

8. From the variation of constants formula applied to Eq. (10.19)$_2$ it follows via (10.26), (10.28) that

\[ z_i \le M z_{i0} + C_2\, s_i^{1-\beta} z_i + \varepsilon C_1 (1 + \zeta) + \varepsilon C_1 \zeta z_i, \tag{10.32} \]

where we have set

\[ z_i = \sup_{0 < s \le s_i} \| r_i(\tau(s)) \|_{C^{3+\alpha}(S^2)}, \qquad z_{i0} = \| r_{i0} \|_{C^{3+\alpha}(S^2)} \tag{10.33} \]

and

\[ C_2 = M N\, \frac{C_\rho}{\rho_{N0}}\, (3 + \|T_0\|). \]

Equation (10.32) is valid in the interval $I_i = [0, s_i(T))$. From (10.27), (10.29) it follows that

\[ z_i < 2M z_{i0} + 2\varepsilon C_1 (1 + \zeta), \qquad s_i \in I_i \cap [0, \sigma]. \tag{10.34} \]

If we replace in (10.32) $z_{i0}$ by $z_i(\sigma)$ and $s_i$ by $(s_i - \sigma)$ and use Eq. (10.34) to estimate $z_i(\sigma)$ we get

\[ z_i < (2M)^2 z_{i0} + 2\varepsilon C_1 (1 + \zeta)(2M + 1), \qquad s_i \in I_i \cap [0, 2\sigma]. \tag{10.35} \]

By iterating this procedure we get

\[ z_i < (2M)^k z_{i0} + 2\varepsilon C_1 (1 + \zeta) \sum_{h=0}^{k-1} (2M)^h, \qquad s_i \in I_i \cap [0, k\sigma]. \tag{10.36} \]
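The passage from (10.34) to (10.36) is the unrolling of a linear recursion. As a small sanity check, the sketch below iterates $z \mapsto 2M z + c$ and compares the result with the closed form on the right-hand side of (10.36), where `c` abbreviates $2\varepsilon C_1 (1 + \zeta)$. The numerical values of `M`, `z0` and `c` are hypothetical.

```python
import math

# Sketch: unroll z_k = 2*M*z_{k-1} + c and compare with
# (2M)^k z_0 + c * sum_{h=0}^{k-1} (2M)^h, the form of (10.36).
# M, z0 and c are illustrative values only.

def iterate_bound(z0, M, c, k):
    """Apply the one-step bound z -> 2*M*z + c exactly k times."""
    z = z0
    for _ in range(k):
        z = 2.0 * M * z + c
    return z

def closed_form(z0, M, c, k):
    """Right-hand side of (10.36) with c = 2*eps*C1*(1 + zeta)."""
    return (2.0 * M) ** k * z0 + c * sum((2.0 * M) ** h for h in range(k))

M, z0, c = 1.5, 0.2, 0.01
pairs = [(iterate_bound(z0, M, c, k), closed_form(z0, M, c, k)) for k in range(11)]
```

For $k = 1$ and $k = 2$ the closed form reduces to the right-hand sides of (10.34) and (10.35) respectively, which is why the geometric sum appears after $k$ repetitions.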

Equation (10.36) is one of the basic estimates needed to complete the proof. In the particular case i = 1, besides Eq. (10.36) we also need another estimate, that we establish next. 9. The variation of constants formula applied to Eq. (10.19)1 between sˆ1 , the value of s1 defined in 2, and a generic value of s1 ≥ sˆ1 yields ˜ r1 (τ (s1 )) = eA(s1 −ˆs1 ) r1 (τ (ˆs1)) s1 ˜ (s1 −s) ρ1 A − e (3r1 − T0 r1 ) (τ (s)) ds ρ¯ sˆ1 s1 e˜ A(s1 −s) g1r ds. +ε sˆ1

From this and Eqs. (10.16), (10.27) and (10.28) it follows r1 (τ (s1 ))C 3+α (S 2 ) ≤ Me−µ(s1 −ˆs1 ) r1 (τ (ˆs1 ))C 3+α (S 2 )

(10.37)

ρ10 −a sˆ1 s1 e−µ(s1 −s) +M(3 + T0 )N e r (τ (s))C 3+α (S 2 ) ds β 1 ρN0 sˆ1 (s1 − s) +εC1 (1 + ζ ) + εC1 ζ sup r1 (τ (s))C 3+α (S 2 ) (10.38) sˆ1 <s≤s1

which is valid under the assumption that s1 ∈ I1 . From Eq. (10.38) and Eq. (10.23) that defines sˆ1 and Eq. (10.29) it follows z˜ (s1 ) < 2Mz1 (ˆs1 ) + 2εC1 (1 + ζ ),

(10.39)

where we have set z˜ (s1 ) = sup r1 (τ (s))C 3+α (S 2 ) , and that Eqs. (10.38), (10.39) are sˆ1 <s≤s1

valid under the condition that s1 ≥ sˆ1 belongs to the interval I1 . 10. We now prove that sˆ1 ∈ I1 . This follows from Eq. (10.36). In fact for k ≤ k¯ we have (2M)k zi0 + 2εC1 (1 + ζ )

k−1

(2M)h

h=0

¯

≤ (2M)k r0 + 2εC1

¯ k−1

(2M)h + 2εC1

h=0 ¯

≤ (2M)k r0 +

¯ k−1

(2M)ζ

h=0

ζ 1 ζ + < , 8M 8M 4M

(10.40)

where we have utilized the definition (10.31) of ζ and (10.36). This inequality, recalling the definition of k¯ in 5, shows that Eq. (10.36) implies zi (si ) <

ζ < ζ, 4M

for si ∈ [0, s¯i ], where s¯1 is defined in 3 and s¯i 3 ρ1 s¯i = ds1 , 3 0 ρi

i = 1, · · · , N

(10.41)

if i > 1.

(10.42)

Since, by definition s¯i ≥ sˆi , the claim is proved. 11. We now utilize Eqs. (10.36), (10.39) to show that I1 = [0, ∞). For s1 > sˆ1 , si = s1 3 ρ1 ds1 , i > 1, we have from (10.36), (10.39) 3 0 ρi z˜ 1 < 2Mz1 (ˆs ) + 2εC1 (1 + ζ ) < ζ, ζ zi < 4M , i > 1.

(10.43)

In fact, the first inequality is a consequence of 2Mz1 (ˆs1 ) < ζ /2 which follows from Eq. (10.41), and of 2εC1 (1 + ζ ) < 4εC1 ζ < 1/2 < ζ /2 which follows from Eqs. (10.29), (10.31). The second inequality is a consequence of Eq. (10.40) and of the observation that for i > 1, si is bounded above by sˆ defined in Eq. (10.24) and sˆ ≤ s¯1 . 12. From Eqs. (10.41), (10.43) it follows that the condition ¯

r(τ )C 3+α (S 2 ) < 3 = 8M(2M)k r0 + 1

(10.44)

is satisfied in the time interval [0, T ) and that τ dτ = ∞, lim s1 = 3 τ →C 0 ρ1 (τ )

(10.45)

by sˆ1 ∈ I1 that was established above and by the fact that sˆ1 can be chosen arbitrarily large. Notice that T cannot be infinity because, as we have seen from Proposition 10.1 (ii), condition (10.44) implies ρ1 (τ ) → 0 in finite time. Therefore T must be finite and Eq. (10.45) implies that lim ρ1 (τ ) = 0.

τ →T

$[0, T)$ is thus the maximal interval of existence of the solution of system (10.1) and therefore $T = T_1$. The proof of the proposition is complete.

Proposition 10.7. Let $T_1$ be the first extinction time, $\lim_{t \to T_1} \rho_1(t) = 0$. Then

\[ \lim_{\tau \to T_1} \| r_1(t) \|_{C^{3+\alpha}(S^2)} = 0. \tag{10.46} \]

Moreover we have the estimate

\[ \| r_1(t) \|_{C^{3+\alpha}(S^2)} \le C (T_1 - t)^{1/3} \tag{10.47} \]

(cf. Proposition 10.1).

Proof. We consider the rescaled time $t = \varepsilon^3 \tau$ (cf. (10.1)). We treat $r_1$, which we write as $r$. Equation (10.19)$_2$ gives

\[ \frac{dr}{ds_1} = \tilde A r - \frac{\rho_1}{\bar\rho}\,(3r - T_0 r) + \varepsilon g_1^{r}. \tag{10.48} \]

From the variation of constants formula we obtain 1 −ϑ0 ) r(ϑ ) r(ϑ1 )1 ≤ Me+µ(ϑ 0 1 + + ϑ1 −A + ˜ (s−ϑ0 ) ρ1 + ++ e (3r − T0 r) ds + + ρ¯ ϑ0 1 + ϑ1 + + + ˜ + ε+ e−A(ϑ1 −ϑ0 ) g1r (s)ds + + +

ϑ0

= I + I I + I I I, ∀ ϑ1 > ϑ0 ,

1

where .1 stands for the norm. ϑ1 1 −µ(s−ϑ0 ) II ≤ C e r(s)1 ds sup ρ1 (s) 1 [ϑ0 ,ϑ1 ] ϑ0 (s − ϑ0 ) 3

(10.49)

C 3+α (S 2 )

≤C

sup r(s)1 e−

ϑ0 3

(by Proposition 10.1(ii), (iii)),

[ϑ0 ,ϑ1 ]

where C is a generic constant. By (10.17) I I I ≤ εC sup g1r (s)0 [ϑ0 ,ϑ1 ]

(by (10.16), (10.25)) (10.50)

≤ εC

sup ρ1 (s) + sup r(s)1 , by (10.20) [ϑ0 ,ϑ1 ]

[ϑ0 ,ϑ1 ]

≤ εC e

−

ϑ0 3

+ sup r(s)1 , by Proposition 10.1 (ii).

(10.51)

[ϑ0 ,ϑ1 ]

Putting these estimate together, we conclude sup r(s)1 ≤ Me−µ(ϑ1 −ϑ0 ) sup r1 (s)1 s≥ϑ1

s≥ϑ0

+ Ce

−

ϑ0 3

sup r1 (s)1 s≥ϑ0

+ εC e

−

ϑ0 3

+ sup r1 (s)1 .

(10.52)

s≥ϑ0

Let ϑ1 = ϑ0 + σ, where σ is fixed so that Me−µσ <

1 . 2

(10.53)

We introduce the monotone function f (t) = sup r(s)1 . Then taking the limit in (10.52) s≥t

we obtain lim f (ϑ) ≥

ϑ→∞

1 lim f (ϑ) + εC lim f (ϑ), ϑ→∞ 2 ϑ→∞

(10.54)

from which we conclude that (10.46) holds. To obtain the rate we define the sequence of times ϑ1 = ϑ0 + σ, ϑ2 = ϑ1 + σ, · · · , ϑn = ϑn−1 + σ. It follows from (10.52) that sup r(s)1 ≤ Me−µ(ϑn −ϑn−1 ) sup r1 (s)1 s≥ϑn

s≥ϑn−1 − ϑ3n

+ Ce + Cε sup r1 (s)1 s≥ϑn−1

+ εC e

− ϑ3n

+ ε sup r1 (s)1 ,

(10.55)

s≥ϑn−1

where we utilized µ > 13 , (10.14). By selecting σ appropriately we have

ϑn−1 f (ϑn ) ≤ C e− 3 + εf (ϑn−1 ) , C < 1.

(10.56)

Finally we take ε = e− 3 . Iterating (10.56) we obtain ϑn−1 f (ϑn ) ≤ e− 3 C + C 2 + · · · + C n + εn C n f (ϑ0 )

(10.57)

σ

from which we conclude

\[ f(\vartheta_n) \le A\, e^{-\frac{\vartheta_{n-1}}{3}}, \qquad A \text{ constant}, \tag{10.58} \]

and thus

\[ \| r(s) \|_1 \le A\, e^{-\frac{1}{3} s}, \]

which is equivalent to (10.47). The proof of Proposition 10.7 is complete.

Acknowledgements. We are grateful to J. Lebowitz and E. Presutti for many valuable discussions. We also would like to thank G. Karali for working out several of the estimates in Sect. 8. Finally, we want to thank the referee for a very careful reading of the previous manuscript.

References 1. Ostwald, W.: Z. Phys. Chem. 37, 385 (1901) 2. Rubinstein, J., Sternberg, P.: Nonlocal reaction-diffusion equations and nucleation. IMA J. Appl. Math. 48, 249–264 (1992) 3. Lifschitz, I.M., Slyosov, V.V.: The kinetics of precipitation from supersaturated solid solutions. J. Phys. Chem. Solids 19, 35–50 (1961) 4. Wagner, C.: Theorie der Alterung von Niederschlagen dursch Umlosen. Z. Electrochem. 65, 581–594 (1961) 5. Alikakos, N.D., Fusco, G.: Preprint, 1998 6. Niethammer, B.: Derivation of the LSW-theory for Ostwald ripening by homegenization methods. Arch. Rat. Mech. Anal. 147(2), 119–178 (1999) 7. Porter, D.A., Easterling, K.E.: Phase Tranformations in Metals and Alloys. 2nd Edition, New York: Chapman and Hall, 1992 8. Cahn, J.W.: On the spinodal decomposition. Acta Metall. 9, 795–801 (1961) 9. Cahn, J.W., Hilliard, J.E.: Free energy of a nonuniform system I. Interfacial free energy. J. Phys. Chem. 28, 258–267 (1958) 10. Pego, R.L.: Front migration in the nonlinear Cahn-Hilliard. Proc. Roy. Soc. London, Ser. A 422, 261–278 (1989) 11. Alikakos, N.D., Bates, P., Chen, X.: The convergence of solutions of the Cahn-Hilliard equation to the solution of the Hele-Shaw model. Arch. Rat. Mech. Anal. 128, 165–205 (1994) 12. Mullins, W.W., Sekerka, R.F.: Morphological stability of a particle growing by diffusion and heat flow. J. Appl. Phys. 34, 323–329 (1963) 13. Alexiades, V., Solomon, A.D.: Mathematical Modeling of Melting and Freezing Processes. New York: Hemisphere Publishing, 1993 14. Chen, X., Hong, X., Yi, F.: Existence, uniqueness and regularity of solutions of Mullins-Sekerka problem. Commun. Partial Differ. Eqs. 21, 1705–1727 (1996) 15. Chen, X.: The Hele-Shaw problem and area-preserving curve shortening motion. Arch. Rat. Mech. Anal. 123, 117–151 (1993) 16. Constantine, P., Pugh, M.: Global solutions for small data to the Hele-Shaw problem. Nonlineariry 6, 393–415 (1993) 17. 
Escher, J., Simonett, G.: A center manifold analysis for the Mullins-Sekerka model. J.D.E. 143, 267–292 (1998) 18. Chen, X.: Global asymptotic limit of solutions of the Cahn-Hilliard equation. J. Diff. Geom. 44, 262–311 (1996) 19. Escher, J., Simonett, G.: Classical solutions for Hele-Shaw models with surface tension. Adv. Differ. Eqs. 2, 619–642 (1997) 20. Alikakos, N.D., Fusco, G.: The effect of distribution in space in Ostwald ripening. Centre de Recherches Mathematiques, CRM Proceedings and Lecture Notes, Vol. 27 (2001) 21. Fife, P.: Dynamical Aspects of the Cahn-Hilliard Equations. Barrett lectures, University of Tennessee, Spring 1991 22. Bellettini, G., Fusco, G.: Some aspect of the dynamics of S(V ) = H − H . J. D. E 157, 206–246 (1999) 23. Alikakos, N.D., Bates, P.W., Chen, X., Fusco, G.: Mullins-Sekerka motion of small droplet on a fixed boundary. J. Geom. Anal. 4, 575–596 (2000)


24. Huisken, G., Yau, S.-T.: Definition of center of mass for isolated physical systems and unique foliations by stable spheres with constant mean curvature. Invent. Math. 124, 281–311 (1996) 25. Ye, R.: Foliation by constant mean curvature spheres. Pacific J. Math. 147, 381–396 (1991) 26. Huisken, G.: The volume preserving mean curvature flow. J. Reine Angew. Math. 382, 35–48 (1987) 27. Miranda, C.: Lincei-Memorie Sc. Fisiche, ecc. - S.VIII, vol. VII, Sez. I, 9, 303–336 (1965) 28. Agmon, S., Douglis, A., Nirenberg, L.: Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions. Commun. Pure Appl. Math. 12, 623–727 (1959) 29. da Prato, G., Grisvard, P.: Equations d’evolution abstraites non lineaires de type parabolique. Ann. Mat. Pura Appl. 120(4), 329–396 (1979) 30. Lunardi, S.: Analytic Semigroups and Optimal Regularity for Parabolic Problems. Basel: Birkhauser, 1995 31. Amann, H.: Linear and Quasilinear Parabolic. Vol. I, Basel: Birkhauser, 1995 32. Angement, S.B.: Nonlinear analytic semigroups. Proc. Roy. Soc. Edinburgh A 115, 91–107 (1990) 33. Zhu, J., Chen, X., Hou, T.X.: An efficient boundary integral method for the Mullins-Sekerka problem. J. Comput. Phy. 126(2), 246–267 (1996) 34. Penrose, O., Lebowitz, J.L., Murro, J., Kallos, M.H., Sur, A.: Growth of clusters in a first-order phase transition. J. Stat. Phy. 19, 243–267 (1978) 35. Kalos, M., Lebowitz, J.L., Penrose, O., Sur, A.: J. Stat. Phys. 18, 39–51 (1978) 36. Soner, H.M.: Convergence of the Phase-Field Equations to the Mullins-Sekerka Problem with Kinetic Undercooling. Arch. Rat. Mech. Anal. 131, 139–197 (1995) 37. Voorhees, P.W.: The theory of Ostwald ripening. J. Stat. Phys. 38, 231–252 (1985); Ostwald ripening of two-phase mixtures. Ann. Rev. Mater. Sci. 22, 197–215 (1992) 38. Voorhees, P.W., Glicksman, M.E.: Solution to the multi-particle diffusion problem with applications to Ostwald ripening I. Theory. Acta Metall. 
32, 2001–2011 (1984) 39. Niethammer, B., Pego, R.: Non-self-similar behavior in the LSW theory of Ostwald ripening. J. Stat. Phys. 95, 867–902 (1999) 40. Alikakos, N.D., Fusco, G.: The equations of Ostwald Ripening for dilute systems. J. Stat. Phys. 95, 851–866 (1999) 41. Voorhees, P.W., Schaffer, R.J.: In situ observation of particle motion and diffusion interactions during coarsening. Acta Metal. 33, 327–339 (1987) 42. Alikakos, N.D., Fusco, G., Karali, G.: Continuum limits of particles interacting via diffusion. Preprint 43. Vel´azquez, J.J.: On the effect of stochastic fluctuations in the dynamics of Lifshitz-Slyozov-Wagner model. J. Stat. Phy. 99(1–2), 57–113 (2000) 44. Weins, J.J., Cahn, J.W.: The effect of size and distribution on second phase particles and voids on sintering. In: Sintering and Related Phenomena. Kuczynski (ed). New York: Plenum, 1973 45. Xun, J.J.: Interfacial Wave theory of Pattern Formation. Berlin-Heidelberg-New York: Springer, 1998 46. Folland, G.B.: Introduction to Partial Differential Equations. Princeton, N.J.: Princeton University Press and University of Tokyo Press, 1976 47. Kimura, M.: An application of the shape derivative to quasistationary Stefan Problem. Preprint 48. Alikakos, N.D., Fusco, G., Karali, G.: Ostwald ripening in two dimensions-the rigorous derivation of the equations from the Mullins-Sekerka dynamics. Preprint 49. Alikakos, N.D., Fusco, G., Karali, G.: The Effect of the Geometry of the Particle Distribution in Ostwald Ripening. Commun. Math. Phys., to appear; DoI 10.1007/s00220-003-0834-4 50. Do Carmo, M.P.: Differential Geometry of curves and surfaces. New York: Prentice-Hall, 1976 Communicated by J.L. Lebowitz

Commun. Math. Phys. 238, 481–488 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0834-4

Communications in

Mathematical Physics

The Effect of the Geometry of the Particle Distribution in Ostwald Ripening

Nicholas D. Alikakos1,2,∗, Giorgio Fusco3, Georgia Karali1,∗

1 University of Athens, Panepistimiopolis, 15484 Greece
2 Department of Mathematics, Brigham Young University, 292 TMCB, Provo, UT 84602, USA
3 Dipartimento di Matematica, Università di L'Aquila, Via Vetoio, 67010 Coppito (L'Aquila), Italy

Received: 9 January 2001 / Accepted: 26 January 2003 Published online: 16 June 2003 – © Springer-Verlag 2003

Abstract: Based on [1], we derive equations for the radii and the centers that we relate to the Lifshitz-Slyozov-Wagner theory.

∗ N.D.A. and G.K. were partially supported by a ENE 99/527 interdisciplinary grant in Materials, and by a grant from the University of Athens. N.D.A. would like to thank also the people at BYU for providing extraordinary hospitality and a stimulating environment during his visit in the Fall of 2000.

Introduction

In this note, based on the estimates in [1], we rigorously derive corrections to the equations of the Lifshitz-Slyozov-Wagner theory of coarsening (cf. (1.2) in [1]). Specifically, we correct the equations for the radii by taking into account the distance and the size of the neighboring particles. We also provide equations for the motion of the centers of the particles. These corrections amount to carrying out (rigorously) the asymptotic expansion to a higher order. Because all this can be achieved by elaborating the main estimates in [1], and to keep the size of the present paper under control, we did not opt for a self-contained exposition; instead we took the liberty of referring to the various formulae in [1] with a minimum of explanation. We establish the following refinement of the main theorem in [1]:

Theorem 1. Let $\Omega \subset \mathbb{R}^3$ be a bounded, smooth and connected domain. Assume that $\Gamma^0 = \cup_{i=1}^{N} \Gamma_i^0$, $N \ge 2$, where $\Gamma_i^0$ satisfies $\Gamma_i^0 = \{x \mid x = \xi_i^0 + \varepsilon\rho_i^0(1 + \varepsilon r_i^0(u))u,\ u \in S^2\}$ with $\xi_i^0 \in \Omega$, $\rho_i^0 > 0$, and $r_i^0 \in C^{3+\alpha}(S^2)$ satisfying $\int_{S^2} r_i^0(u)\,du = 0$, $\int_{S^2} r_i^0(u)\langle u, e_j\rangle\,du = 0$, $j = 1, 2, 3$. Assume $\xi_i^0 \neq \xi_j^0$ for $i \neq j$, $\rho_1^0 < \dots < \rho_N^0$. There is $\bar\varepsilon > 0$ such that, if $0 < \varepsilon < \bar\varepsilon$, then the solution $t \to \Gamma(t)$ of the Mullins-Sekerka problem for this class of initial conditions $\Gamma^0$ can be represented in $\xi$, $\rho$ and $r$ coordinates and exists globally as a weak solution. There exist constants $C_\rho, C_\xi > 0$


such that 1 1 ρh 1 ρk 1 + − −1 ρi ρ¯ ρi N ρ¯ |ξh − ξk | ρ¯ h,k

ρj 1 ρi ρh 1 − 4π −1 + γ (ξi , ξh ) −1 |ξj − ξi | ρ¯ N ρ¯ ρ¯ i,h j gi ρh − , with |gi | < 2 Cρ , −1 + γ (ξi , ξh )4π ρ¯ ρi h 1 ξk − ξ i 1 ˙ξi = −3 ρk − ρ¯ ρk |ξk − ξi |3 k ∂γ (ξi , ξh ) 1 1 + φi , with |φi | < 3 Cξ , −3 ρh − ∂x ρ¯ ρh

ρ˙i =

(1)

h

where $\bar\rho = \frac{1}{N-i+1}\sum_{h=1}^{N}\rho_h$ and $\|r_i(t)\|_{C^{3+\alpha}(S^2)} < C_r$ as long as $r_i$ is defined.

The summation symbol in (1) means summation avoiding equal indices. Here $\gamma$ is the smooth part of the Green's function (cf. (3.8)). Notice that $\rho$ and $\xi$ form a closed system of equations if the highest order terms are ignored. Equations (1) are derived formally in [2] for the case when the boundary $\partial\Omega$ is further away ($\gamma = 0$). Special cases of (1) have appeared in the literature before. The $\rho$-equations with $\gamma = 0$ are derived in [5]. The $\xi$-equations for 2 particles were derived in [4] by the method of images (which does not extend to more particles). Also, in two dimensions and for two particles, a simpler analog of system (1) can be found in [3]. It is worth mentioning that Eqs. (1) provide a correction to the classical Lifshitz-Slyozov-Wagner theory of coarsening by taking into account the size and the distance of the neighboring particles and the effect of the boundary. If we consider the $\rho$-equations, we observe that the main term is $\frac{1}{\rho_i}\left(\frac{1}{\bar\rho} - \frac{1}{\rho_i}\right)$ and the rest of the terms are smaller in relation to the main term which is like 1 . Moreover, the centers do move, but in general more slowly than the radii.

In what follows we denote by $V$ the normal velocity, which is taken positive for a shrinking sphere, $H$ the mean curvature, $\bar H = \frac{1}{|\Gamma|}\int_\Gamma H\,dS_y$ the average mean curvature, $|\Gamma|$ the surface area, $W_i = H(x) - E$ with $E = \bar H - \frac{1}{|\Gamma|}\int_\Gamma\int_\Gamma g(x, y)V(y)\,dy\,dx$, and $T$ the Dirichlet-Neumann operator (cf. Sect. 6 in [1]). We take $\Gamma = \cup_{i=1}^{N}\Gamma_i$ with $\Gamma_i = \{x \mid x = X^i(u) := \xi_i + \varepsilon\rho_i(1 + \varepsilon r_i(u))u,\ u \in S^2\}$. If $\varepsilon > 0$ is small, the map $X^i : S^2 \to \Gamma_i$ is a diffeomorphism with the same regularity as $r_i$. We let $u_i : \Gamma_i \to S^2$ be the inverse of $X^i$. Under the above assumption, Eq. (3.3) in [1] is written in the form

$$\sum_{h=1}^{N}\int_{\Gamma_h} g(x, y)V_h(u_h(y))\,dy = H(x) - E, \qquad x \in \Gamma_i,\ i = 1, \dots, N.$$
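As a rough numerical illustration of the leading-order behavior noted above, the following Python sketch evaluates only the main term of the radius equations, (1/ρ_i)(1/ρ̄ − 1/ρ_i), and drops every interaction, boundary and higher-order term. The function names, the time normalization (any ε-dependent prefactor is set to 1), the use of the plain arithmetic mean for ρ̄ and the sample radii are all assumptions made for illustration; this is not the system (1) itself.

```python
def mean_radius(radii):
    """Arithmetic mean of the radii (assumed stand-in for rho-bar)."""
    return sum(radii) / len(radii)


def lsw_rates(radii):
    """Leading-order rate (1/rho_i) * (1/rho_bar - 1/rho_i) for each particle.

    Interaction terms, boundary terms and remainders are deliberately omitted.
    """
    rho_bar = mean_radius(radii)
    return [(1.0 / r) * (1.0 / rho_bar - 1.0 / r) for r in radii]


def euler_step(radii, dt):
    """One explicit Euler step of the truncated radius equations."""
    return [r + dt * v for r, v in zip(radii, lsw_rates(radii))]


if __name__ == "__main__":
    radii = [1.0, 1.5, 2.0]  # hypothetical sample radii
    for r, v in zip(radii, lsw_rates(radii)):
        print(f"rho = {r:.2f}, leading-order rate = {v:+.4f}")
```

Under these assumptions, particles smaller than the mean shrink and larger ones grow, and the weighted sum of ρ_i² times the rate vanishes, which reflects conservation of total volume at leading order.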

We begin by presenting a refinement of Proposition 8.1 in [1] from which the proof of Theorem 1 follows.


Proposition 1. Let $\xi_i \in \Omega$, $\rho_i > 0$, $r_i \in C^{1+\alpha}(S^2)$, $W_i \in C^{1+\alpha}(S^2)$, $i = 1, \dots, N$, be given and assume that $\xi_i \neq \xi_j$ for $i \neq j$. Then the system

$$\sum_{h=1}^{N}\int_{\Gamma_h} g(x, y)V_h(u_h(y))\,dy = W_i(u_i(x)), \qquad x \in \Gamma_i,\ i = 1, \dots, N \tag{2}$$

has a unique solution Vi ∈ C α (S 2 ). Moreover the following estimate holds true: 3 1 2 ri (v) − 2 ri (·) ρh γ (ξi , ξh ) T0 Wh (T0 Wi )(v)dv + ρi Vi − T0 Wi + T0 4π | · −v| S2 S2 h

ρh ri (·) + ri (v) + T0 W h − 2 T0 ri (v)(T0 Wi )(v)dv 4π|ξi − ξh | S 2 S 2 4π | · −v| h,i

2 ρ 2 ξi − ξh h + v, (T0 Wh )(v) T0 |ξi − ξh |2 |ξi − ξh | S2 h

2 ρh ρi ξi − ξh 2 − 3 u, T W + 2 ρ r γ (ξ , ξ ) T0 W h 0 h h h i h 2 |ξh − ξi |2 |ξi − ξh | S 2 S h h

∂γ (ξi , ξh ) ∂γ (ξi , ξh ) 2 2 2 2 v, + ρh 3 u, T0 Wh + ρh (T0 Wh ) 2 2 ∂x ∂y S S h h C α (S 2 )

ρµ ρµ ≤C + + ri C 1+α (S 2 ) + rk C 1+α (S 2 ) |ξk − ξµ | l µ h k ρk ρk + + ri C 1+α (S 2 ) + rh C 1+α (S 2 ) l |ξh − ξk | ρh ρh 3 3 (3) + T0 Wi C α (S 2 ) + ri C 1+α (S 2 ) T0 Wi C α (S 2 ) . |ξi − ξh | l Proof. We have that g(x, y)Vh (uh (y))dy = h

h

1 Vh (uh (y))dy + 4π |x − y|

γ (x, y)Vh (uh (y))dy h

and arguing as in [1] we obtain 1 |u − v| − |u − v + (ri (u)u − ri (v)v)| (J Vi )(u) = Vi (v)dv 2 4π |u − v| S |u − v + (ri (u)u − ri (v)v)| |u − v| − |u − v + (ri (u)u − ri (v)v)| 1 + 2ri Vi (v)dv |u − v + (ri (u)u − ri (v)v)| S 2 4π|u − v| 1 |u − v| − |u − v + (ri (u)u − ri (v)v)| + 2 |u − v + (ri (u)u − ri (v)v)| S 4π|u − v| O 2 ri 2C 1+α (S 2 ) Vi (v)dv = (I1 Vi )(u) + (I2 Vi )(u) + (I3 Vi )(u). (4) From (8.16), (8.17), (8.18) in [1] we obtain I3 Vi C 1+α (S 2 ) ≤ C 2 ri 3C 1+α (S 2 ) Vi C 1+α (S 2 ) .

(5)


So,

ri (·) + ri (v) ri (·) + ri (v) 1 J Vi = − Vi (v)dv − ri (v)Vi (v)dv 2 S 2 4π | · −v| S 2 4π | · −v| + 2 O ri 3C 1+α (S 2 ) Vi C α (S 2 ) ,

∂γ (ξi , ξi ) ∂γ (ξi , ξi ) i + ρi v, γ (X (u), ρi z) = γ (ξi , ξi ) + ρi u, ∂x ∂y 1 1 +O 2 ρi2 3 + ρi ri C 1+α (S 2 ) 2 . l l

From the above, it follows that i I2 Vi = γ (ξi , ξi ) Vi + 2ri γ (ξi , ξi ) Vi S 2

S2 ∂γ (ξi , ξi ) ∂γ (ξi , ξi ) v, Vi + ρi u, Vi + ρi ∂y S2 S2 2 ∂x 2 ri C 1+α (S 2 ) 2 ρi ri C 1+α (S 2 ) + ρi O + l l2 1 1 + 2 ρi2 3 + ρi ri C 1+α (S 2 ) 2 Vi C α (S 2 ) . l l From the analysis above it follows that g(X i (u), y)Vi (ui (y))dy = ρi i

S2

1 Vi (v)dv + ρi (I ii Vi )(v), 4π |u − v|

(6)

(7)

(8)

(9)

where $I^{ii}$ is a linear operator that satisfies

$$\|I^{ii}V_i\|_{C^{1+\alpha}(S^2)} \le C\,\|V_i\|_{C^{\alpha}(S^2)}, \tag{10}$$

where 3ri (v) − ri (·) ri (·) + ri (v) C= dv − 2 ri (v)dv + 3 O ri 3C 1+α (S 2 ) 2 S 2 4π| · −v| S 2 4π | · −v|

∂γ (ξi , ξi ) 2 2 2 + ρi γ (ξi , ξi ) + 2 ri ρi γ (ξi , ξi ) + ρi u, ∂x 3 ρi ri 2C 1+α (S 2 ) 3 ρ 2 ri C 1+α (S 2 ) 1 i 3 21 , +O + ρi ρi 3 + ρi ri C 1+α (S 2 ) 2 + l l2 l l 1 Vh (uh (y))dy i h 4π|X (u) − y|

2 ρh2 2 3 ρh2 rh 3 ρh2 ξi − ξh = ρi u, Vh + Vh − Vh |ξi − ξh | S 2 |ξi − ξh |2 |ξi − ξh | S 2 4π|ξi − ξh | S 2

3 ρh2 ξi − ξh ρh v, Vh (v)dv + |ξi − ξh |2 S 2 |ξi − ξh | 2 2 rh C 1+α (S 2 ) 2 rh C 1+α (S 2 ) 2 2 + ρh O + (ρi + ρh ) |ξi − ξh | |ξi − ξh |2 ρi ri C 1+α (S 2 ) + ρh rh C 1+α (S 2 ) ρi2 + ρh2 2 + Vh C α (S 2 ) . + (11) |ξi − ξh |3 |ξi − ξh |2


From the definition of X h , h = 1, . . . , N, that implies

∂γ (ξi , ξh ) ∂γ (ξi , ξh ) γ (X i (u), X h (v)) = γ (ξi , ξh ) + ρi u, + ρh v, ∂x ∂y 2 r 2 1 h C 1+α (S 2 ) +2rh γ (ξi , ξh ) + O + 2 ρi2 + ρh2 3 l l 1 + 2 (ρi ri C 1+α (S 2 ) + ρh rh C 1+α (S 2 ) ) 2 . l It follows that γ (X i (u), y)Vh (uh (h))dy h

∂γ (ξi , ξh ) 2 2 3 2 = ρh γ (ξi , ξh ) Vh + ρi ρh u, Vh ∂x S2 S2 ∂γ (ξi , ξh ) v, + 3 ρh3 Vh Vh + 2 3 ρh2 rh γ (ξi , ξh ) 2 ∂y S S2 1 + 2 ρh2 O 2 ρi2 + ρh2 3 + 2 ρi ri C 1+α (S 2 ) + ρh rh C 1+α (S 2 ) l 1 1 +ρi rh C 1+α (S 2 ) 2 + 2 rh 2C 1+α (S 2 ) (12) Vh C α (S 2 ) . l l Also we have the estimate I ih Vh C 1+α (S 2 ) ≤ CVh C α (S 2 ) , where C=

h = i,

(13)

ρh ξi − ξh 2 2 ρh rh 2 ρh ρi u, + − |ξi − ξh | |ξi − ξh | |ξi − ξh |2 |ξi − ξh | 2 ρh2 + + ρh γ (ξi , ξh ) + 2 2 ρh rh γ (ξi , ξh ) |ξi − ξh |2

∂γ (ξi , ξh ) ∂γ (ξi , ξh ) 2 2 2 + ρi ρh u, + ρh v, ∂x ∂y 2 rh 2C 1+α (S 2 ) 2 rh C 1+α (S 2 ) + + ρh O (ρi + ρh ) |ξi − ξh | |ξi − ξh |2 2 + ρ2 ρ r 1+α (S 2 ) + ρh rh C 1+α (S 2 ) ρ i i C i h + 2 + |ξi − ξh |3 |ξi − ξh |2 1 1 + 2 (ρi2 + ρh2 ) 3 + 2 (ρi ri C 1+α (S 2 ) + ρh rh C 1+α (S 2 ) + ρi rh C 1+α (S 2 ) ) 2 l l 1 2 2 . + rh C 1+α (S 2 ) l

System (2) is equivalent to ρi Vi = T0 Wi −

N h=1

ρh T0 I ih Vh .

(14)


From (12), (14) it follows that N ih ρh T0 I Vh h=1 C α (S 2 ) ρk ρk ρh ρh ≤C + + ρh Vh C α (S 2 ) |ξh − ξk | l |ξi − ξh | l h k + 3 ri 2C 1+α (S 2 ) ρi Vi C α (S 2 ) .

(15)

Thus, we have the estimates ii 3ri (v) − ri (·) I Vi − V (v)dv − ρ γ (ξ , ξ ) Vi (v)dv i i i i 2 S 2 4π | · −v| S2 ri (·) + ri (v) + 2 Vi (v)dv ri (v)Vi (v)dv − 2 2 ri ρi γ (ξi , ξi ) S 2 4π| · −v| S2

∂γ (ξi , ξi ) ∂γ (ξi , ξi ) 2 2 2 2 − ρi u, v, Vi (v)dv − ρi Vi (v)dv ∂x ∂y S2 S2 C 1+α (S 2 ) 2 2 ρi ri C 1+α (S 2 ) ρ ri C 1+α (S 2 ) 1 1 = 3O + ρi3 3 + ρi2 ri C 1+α (S 2 ) 2 + i l l2 l l Vi C α (S 2 ) , and (16) ih 1 I Vh − ρh Vh (v)dv + γ (ξi , ξh ) 4π|ξi − ξh | S2

2 2 ρh rh ξi − ξh 2 ρi ρh − u, Vh + Vh (v)dv |ξi − ξh | S 2 |ξi − ξh |2 |ξi − ξh | S 2

2 ρh2 ξi − ξh 2 − v, Vh Vh (v)dv − 2 ρh rh γ (ξi , ξh ) |ξi − ξh |2 S 2 |ξi − ξh | S2

∂γ (ξi , ξh ) ∂γ (ξi , ξh ) 2 2 2 ∂γ Vh (v)dv v, − ρi ρh u, Vh (v)dv − ρh ∂x ∂ξh S 2 ∂y S2 C 1+α (S 2 ) ρ r 2 h h C 1+α (S 2 ) ρh (ρi + ρh )rh C 1+α (S 2 ) ρh (ρi2 + ρh2 ) + + = 3O |ξi − ξh | |ξi − ξh |2 |ξi − ξh |3 +ri 3C 1+α (S 2 )

+

ρh ρi ri C 1+α (S 2 ) + ρh2 rh C 1+α (S 2 )

ρh rh 2C 1+α (S 2 )

+ |ξi − ξh |2 l 1 + ρh (ρi2 + ρh2 ) 3 + ρh (ρi ri C 1+α (S 2 ) + ρh rh C 1+α (S 2 ) l 1 + ρi rh C 1+α (S 2 ) ) 2 Vh C α (S 2 ) . l The proof of Proposition 1 is complete.

(17)


Consider the conservation condition (9.2) in [1],

$$\int_{\Gamma} V = \sum_{h}\int_{\Gamma_h} V = 0, \tag{18}$$

and recall that

$$W_h = H_h - E. \tag{19}$$

Proposition 2. The conservation condition (18) determines uniquely the constant E. Moreover we have the following expression: 1 1 ρh E= − 4πρh γ (ξi , ξh ) 1 − ρ¯ N ρ¯ ρ¯ h

ρi 1 ρh 1 − 1− + O(ri C 1+α (S 2 ) ) N ρ¯ |ξi − ξh | ρ¯ N ρ¯ h 1 1 1 O( 2 ri 2C 1+α (S 2 ) ) + O(ri C 1+α (S 2 ) ) + O( 2 ), (20) + N ρ¯ N ρ¯ N ρ¯ where ρ = (ρ1 , . . . , ρN ), r = (r1 , . . . , rN ),

$\bar\rho = \frac{1}{N}\sum_{i}\rho_i.$

Proof. The proof is analogous to that of Proposition 9.1 in [1] and is omitted.

Proposition 3. We have that 1 1 1 − Vi = ρi ρi ρ¯ 1 1 ρh − 2 2 T0 Lri − −1 4πρi γ (ξi , ξh ) ρi N ρ¯ ρ¯ ρi h

ρh 1 ρi 1 − −1 ρi N ρ¯ |ξi − ξh | ρ¯ i,h 1

1 ρh − −1 ρi |ξi − ξh | ρ¯ h 1 ρh 1 1 γ (ξi , ξh )4π − −1 + O(ri C 1+α (S 2 ) ) ρi ρ¯ ρi N ρ¯ h 1 1

1 O 2 ri 2C 1+α (S 2 ) + Ki − K i + ρi N ρ¯ ρ i 3 1 1 1 1 − ri − T0 ri + 2 2 O 2 ri 2C 1+α (S 2 ) − ρi ρi ρ¯ 2 2 ρi 1 1 − O 2 ri C 1+α (S 2 ) ρi N ρ¯ 3 1 ρh 1 + 4πρh γ (ξi , ξh ) −1 r i − T 0 ri ρi N ρ¯ ρ¯ 2 2 3 1

ρi 1 ρh + −1 ri − T 0 r i ρi N ρ¯ |ξi − ξh | ρ¯ 2 2 h


1 2 ρ h ρ i ξi − ξh 1 1 1 + 3 u, − ρi |ξi − ξh |2 |ξi − ξh | ρh ρ¯ h

2 2 ∂γ (ξi , ξh ) 1 1 1 1 ρh 3 u, − − ρi ∂x ρh ρ¯ h ri (·) + ri (v) 1 2 ri (v)(T0 Wi )(v)dv + T0 ρi S 2 4π | · −v| 2 ρh |ρ| + ri C 1+α (S 2 ) T0 Wh C α (S 2 ) + dist dist h + O 3 ri 3C 1+α (S 2 ) T0 Wi ,

(21)

where Ki

satisfies the estimate Ki

=

1 1 O( 2 ri ) + O( 2 ). N ρ¯ N ρ¯

Proof. The proof is analogous to that of Proposition 9.2 in [1] and is omitted.

(22)

From (21), (22) we obtain (1) in the same way as (1.2) in [1] is obtained from (9.9).

References
1. Alikakos, N.D., Fusco, G.: Ostwald Ripening for dilute systems under quasistationary dynamics. Commun. Math. Phys., to appear; DOI 10.1007/s00220-003-0833-5
2. Alikakos, N.D., Fusco, G.: The effect of distribution in space in Ostwald Ripening. In: Centre de Recherches Mathématiques. CRM Proceedings and Lecture Notes, Vol. 27, 2001
3. Zhu, J., Chen, X., Hou, T.X.: An Efficient Boundary Integral Method for the Mullins-Sekerka Problem. J. Comput. Phys. 126, (1996)
4. Voorhees, P.W., Schaeffer, R.J.: In Situ Observation of Particle Motion and Diffusion Interactions During Coarsening. Acta Metal. 33, 327–339 (1987)
5. Weins, J.J., Cahn, J.W.: The effects of size and distribution of second phase particles and voids on sintering. In: Sintering and Related Phenomena, G.C. Kuczynski (ed). New York: Plenum, 1973, pp. 151–162

Communicated by J. L. Lebowitz

Commun. Math. Phys. 238, 489–504 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0848-y

Communications in

Mathematical Physics

Frobenius Modules and Hodge Asymptotics

Eduardo Cattani1,∗, Javier Fernandez2

1 Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003, USA. E-mail: [email protected]
2 Department of Mathematics, University of Utah, Salt Lake City, UT 84112, USA. E-mail: [email protected]

Received: 15 June 2001 / Accepted: 27 January 2003
Published online: 7 May 2003 – © Springer-Verlag 2003

Abstract: We exhibit a direct correspondence between the potential defining the $H^{1,1}$ small quantum module structure on the cohomology of a Calabi-Yau manifold and the asymptotic data of the A-model variation of Hodge structure. This is done in the abstract context of polarized variations of Hodge structure and Frobenius modules.

1. Introduction

The even cohomology of a compact smooth manifold is a Frobenius algebra with respect to the cup product and the intersection form. For a compact Kähler manifold $X$, multiplication by a Kähler class defines a representation of the Lie algebra $\mathfrak{sl}(2)$ on the full cohomology $H^*(X, \mathbb{C})$, whose semisimple element induces the standard $\mathbb{Z}$-grading. This is the content of the Hard Lefschetz Theorem. Beginning with the formulation of the Mirror Symmetry phenomenon [5], there has been considerable interest in studying the simultaneous action on cohomology of the Kähler cone $K$ of $X$. Looijenga and Lunts [22] have shown that the copies of $\mathfrak{sl}(2)$ associated with the elements of $K$ generate a semisimple Lie algebra and have studied some of their properties. Another point of view, introduced in [10], consists in studying $H^*(X, \mathbb{C})$ as a mixed Hodge structure which splits over $\mathbb{R}$ and is polarized by the action of every Kähler class. Hence, the crucial information is contained in the structure of $H^*(X, \mathbb{C})$ as a $\mathrm{Sym}\, H^{1,1}$-module. In particular, it follows from [9, Prop. 4.66] that we may define a polarized variation of Hodge structure on $H^*(X, \mathbb{C})$ parametrized by the complexified Kähler cone of $X$. If a polyhedral cone of Kähler classes is chosen, this variation becomes a nilpotent orbit in the sense of Schmid [27]. This approach has proved fruitful in the study of mixed Lefschetz theorems [10].

Quantum cohomology is a deformation of the cup product on $H^*(X, \mathbb{C})$ defined in terms of the Gromov-Witten potential – a generating function for certain enumerative

E. Cattani was partially supported by NSF Grant DMS-0099707


invariants. If X is a Calabi-Yau manifold, the action of H 1,1 on ⊕H p,p (X), with respect to the small quantum product, leads to a variation of Hodge structure, called the A-model variation by Morrison [25]. A local variation of Hodge structure is described by an algebraic component – the nilpotent orbit – and an analytic part described by a holomorphic map with values in a graded component of a nilpotent Lie algebra. For the A-model variation the nilpotent orbit is the one described in the previous paragraph. Both Frobenius algebras and polarized variations of Hodge structure have been extensively studied in the recent physics literature. Variations of Hodge structure appear, for instance, in connection with the tree level amplitudes of twisted N = 2 theories – the B-model– and, for Calabi-Yau threefolds, as special geometry ([4, 11, 12]). On the other hand, 2D topological field theories are equivalent to Frobenius algebras. Families of these algebras were also considered: the tangent bundle of the moduli space of topological conformal field theories has, on each fiber, a Frobenius algebra structure ([17, 18]). A relation between the two objects arises in mirror symmetry via the equivalence of the A and B model correlation functions ([5, 20, 14, 25]). What is perhaps not so well known is a direct construction due to Morrison of a variation of Hodge structure based on the A-model [25]. In this paper we show a correspondence between any polarized variation of Hodge structure with appropriate degenerating behavior and a certain sub-structure of a family of Frobenius algebras. Our main result is to exhibit a simple, direct correspondence between the holomorphic data of the variation and the (small) quantum potential in such a way that the horizontality equation of a variation of Hodge structure corresponds to a graded component of the WDVV equations. We will work throughout in the setting of abstract variations of Hodge structure. 
The analogous abstract notion on the “quantum” side is that of a Frobenius module, introduced in Sect. 3, together with its deformations, which are defined by potentials encoding the essential properties of a graded portion of the Gromov-Witten potential. The paper is organized as follows. In Sect. 2 we review the asymptotic description of variations. Theorem 2.2 contains the algebraic and analytic characterization of local variations. We also recall the notion of maximally unipotent boundary points and of canonical coordinates [5, 23, 16]. In Sect. 3 we define Frobenius modules and their deformations. Sect. 4 is devoted to the proof of our main result, Theorem 4.1, which establishes an equivalence between local variations with appropriate behavior at the boundary and quantum potentials. Finally, in Sect. 5 we review the construction of the A-model variation and show that it coincides with the one constructed in Theorem 4.1. As a byproduct, we obtain a direct proof that the A-model variation is indeed a polarized variation of Hodge structure. We note that the A-model variation involves only the small quantum module structure. In the case of Hodge structures of weights 3, 4 and 5, corresponding to threefolds, fourfolds and fivefolds, the module structure suffices to recover the full quantum algebra, so that our results extend the previously known correspondences ([26, 6]) in weights 3 and 4. Also, the full quantum algebra can be recovered if it is assumed to be generated, in the geometric context, by $H^{1,1}$. In this last case, the family of Frobenius algebras obtained from a variation of Hodge structure can be seen as a Frobenius manifold. These matters will be analyzed elsewhere [19]. S. Barannikov [1–3] has introduced the notion of semi-infinite variations of Hodge structure to deal with the full quantum algebra. 
He has also shown that, for projective complete intersections, the A-model variation is of geometric origin and coincides with the polarized variation of Hodge structure of the mirror family. Finally, we wish to thank Gregory Pearlstein for his very helpful comments.


2. Hodge Theory Preliminaries

In this section we briefly review the asymptotic description of variations of Hodge structure. We refer to [21, 27, 8, 6] for details and proofs. A (real) variation of Hodge structure $\mathcal{V}$ over a connected complex manifold $M$ consists of a holomorphic vector bundle $\mathcal{V} \to M$, a flat connection $\nabla$ on $\mathcal{V}$ with quasi-unipotent monodromy, a flat real form $\mathcal{V}_{\mathbb{R}} \subset \mathcal{V}$, and a finite decreasing filtration $\mathcal{F}$ of $\mathcal{V}$ by holomorphic subbundles – the Hodge filtration – satisfying

$$\nabla \mathcal{F}^p \subset \Omega^1_M \otimes \mathcal{F}^{p-1} \quad \text{(Griffiths' Transversality)} \tag{2.1}$$

and

$$\mathcal{V} = \mathcal{F}^p \oplus \overline{\mathcal{F}^{k-p+1}} \tag{2.2}$$

for some integer $k$ – the weight of the variation – and where barring denotes conjugation relative to $\mathcal{V}_{\mathbb{R}}$. As a $C^\infty$-bundle, $\mathcal{V}$ may then be written as a direct sum

$$\mathcal{V} = \bigoplus_{p+q=k} \mathcal{V}^{p,q}, \qquad \mathcal{V}^{p,q} := \mathcal{F}^p \cap \overline{\mathcal{F}^q}; \tag{2.3}$$

the integers $h^{p,q} := \dim \mathcal{V}^{p,q}$ are the Hodge numbers. A polarization of the variation is a flat non-degenerate bilinear form $\mathcal{Q}$ on $\mathcal{V}$, defined over $\mathbb{R}$, of parity $(-1)^k$, whose associated flat Hermitian form $\mathcal{Q}^h(\,\cdot\,, \cdot\,) := i^{-k}\mathcal{Q}(\,\cdot\,, \bar{\cdot}\,)$ makes the decomposition (2.3) orthogonal and such that $(-1)^p \mathcal{Q}^h$ is positive definite on $\mathcal{V}^{p,k-p}$.

Via parallel translation to a fixed fiber $V$ we may describe a polarized variation of Hodge structure by a holomorphic period map $\Phi : M \to D/\Gamma$, where $D$ is the classifying space of polarized Hodge structures on $V$ and $\Gamma$ is the monodromy group. We recall that $D$ is Zariski open in the smooth projective variety $\check{D}$ consisting of all filtrations $F$ in $V$, with $\dim F^p = \sum_{r \ge p} h^{r,k-r}$, satisfying $Q(F^p, F^{k-p+1}) = 0$, where $Q$ denotes the restriction of $\mathcal{Q}$ to $V$. The complex Lie group $G_{\mathbb{C}} := \mathrm{Aut}(V, Q)$ acts transitively on $\check{D}$, and $D$ is an open orbit of $G_{\mathbb{R}} := \mathrm{Aut}(V_{\mathbb{R}}, Q)$.

Let $\mathfrak{g}$ and $\mathfrak{g}_{\mathbb{R}}$ denote the Lie algebras of $G_{\mathbb{C}}$ and $G_{\mathbb{R}}$, respectively. The choice of a base point $F \in \check{D}$ defines a filtration $F^a\mathfrak{g} := \{\, T \in \mathfrak{g} : T F^p \subset F^{p+a} \,\}$ compatible with the Lie bracket. In particular, $F^0\mathfrak{g}$ is the isotropy subalgebra at $F$ and since $[F^0\mathfrak{g}, F^{-1}\mathfrak{g}] \subset F^{-1}\mathfrak{g}$, the quotient $F^{-1}\mathfrak{g}/F^0\mathfrak{g}$ defines a $G_{\mathbb{C}}$-invariant subbundle of the holomorphic tangent bundle of $\check{D}$ – the horizontal tangent bundle. Because of (2.1), the differential of $\Phi$ or, more precisely, of any local lifting of $\Phi$ takes values on the horizontal bundle. Such maps are called horizontal.

Suppose now that $M$ has a smooth compactification $\overline{M}$ such that $X := \overline{M} \setminus M$ is a normal crossings divisor. Around a point of $X$, the local variation may be described by a horizontal map

$$\Phi : (\Delta^*)^r \times \Delta^m \to D/\Gamma, \tag{2.4}$$

where $\Delta$ is the unit disk in $\mathbb{C}$ and $\Delta^*$ the punctured disk. We shall also denote by $\Phi$ its lifting to the universal covering $U^r \times \Delta^m$, where $U$ is the upper-half plane. We let $z = (z_j)$, $t = (t_l)$ and $s = (s_j)$ be the coordinates on $U^r$, $\Delta^m$ and $(\Delta^*)^r$ respectively. By definition, we have $s_j = e^{2\pi i z_j}$.
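The dimension condition on the filtrations parametrized by the compact dual, dim F^p = Σ_{r≥p} h^{r,k−r}, can be illustrated with a short Python sketch. The function name and the indexing convention (a list whose entry p holds h^{p,k−p}) are assumptions made here for illustration, and the sample Hodge numbers (1, 101, 101, 1), the standard values for the middle cohomology of a quintic threefold, serve only as example input.

```python
def filtration_dims(hodge_numbers):
    """Return [dim F^0, dim F^1, ..., dim F^k].

    hodge_numbers[p] is assumed to hold h^{p, k-p} for p = 0, ..., k,
    so dim F^p is the sum of the entries with index r >= p.
    """
    return [sum(hodge_numbers[p:]) for p in range(len(hodge_numbers))]


if __name__ == "__main__":
    # Sample weight-3 input: h^{0,3}, h^{1,2}, h^{2,1}, h^{3,0}.
    sample = [1, 101, 101, 1]
    for p, dim in enumerate(filtration_dims(sample)):
        print(f"dim F^{p} = {dim}")
```

The dimensions decrease with p, as expected for a decreasing filtration, and dim F^0 equals the rank of the bundle.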


Asymptotically, a period map has an algebraic component – the nilpotent orbit – encoding the singularities of the connection $\nabla$, and an analytic part described by a holomorphic map with values in a nilpotent Lie algebra. Assuming, for simplicity, that the local monodromy of the variation is unipotent, let $N_1, \dots, N_r$ denote the monodromy logarithms. Our convention is such that $\Phi(z + e_i, t) = \exp(N_i)\,\Phi(z, t)$, where $e_i$ denotes the $i$-th standard vector. It follows from Schmid's Nilpotent Orbit Theorem [27] that the $\check{D}$-valued map

$$\Psi(s, t) := \exp\Big(-\sum_{j=1}^{r} \frac{\log s_j}{2\pi i}\, N_j\Big) \cdot \Phi(s, t)$$

extends holomorphically to the origin. The limiting Hodge filtration is $F_0 := \Psi(0, 0) \in \check{D}$. The map

$$\theta(z) := \exp\Big(\sum_{j=1}^{r} z_j N_j\Big) \cdot F_0 \in \check{D} \tag{2.5}$$

is holomorphic, horizontal, and $D$-valued for $\mathrm{Im}(z_j) \gg 0$; i.e., $\theta$ is the period map of a local variation.

A nilpotent linear transformation $N \in \mathfrak{gl}(V_{\mathbb{R}})$ defines an increasing filtration, the weight filtration, $W(N)$ of $V$, defined over $\mathbb{R}$ and uniquely characterized by requiring that $N(W_l(N)) \subset W_{l-2}(N)$ and that $N^l : \mathrm{Gr}^{W(N)}_l \to \mathrm{Gr}^{W(N)}_{-l}$ be an isomorphism. It follows from [7, Theorem 3.3] that if $N_1, \dots, N_r$ are the monodromy logarithms of a local variation, then the weight filtration $W(\sum_j \lambda_j N_j)$, $\lambda_j \in \mathbb{R}_{>0}$, is independent of the choice of $\lambda_1, \dots, \lambda_r$ and, therefore, is associated with the positive real cone $\mathcal{C} \subset \mathfrak{g}_{\mathbb{R}}$ spanned by $N_1, \dots, N_r$. The shifted weight filtration $W = W(\mathcal{C})[-k]$ and the limiting Hodge filtration $F_0 \in \check{D}$ define a mixed Hodge structure on $V$; i.e. $F_0$ induces a Hodge structure of weight $\ell$ on $\mathrm{Gr}^W_\ell$ for each $\ell$. Recall ([9, Theorem 2.13]) that mixed Hodge structures are equivalent to (canonical) bigradings of $V$, $I^{*,*}$, satisfying I p,q ≡ I q,p mod (⊕a
$N^{k+1} = 0$, $W = W(N)[-k]$, where $W[-k]_j = W_{j-k}$, $Q(F^a, F^{k-a+1}) = 0$ and the Hodge structure of weight $k + l$ induced by $F$ on $\ker(N^{l+1} : \mathrm{Gr}^W_{k+l} \to \mathrm{Gr}^W_{k-l-2})$ is polarized by $Q(\cdot, N^l \cdot)$.
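Because the monodromy logarithms are nilpotent, the exponentials appearing in the nilpotent orbit are finite sums, and the convention relating a shift of z by one to the action of exp(N) can be checked by hand or by machine. The following Python sketch does this with exact rational arithmetic for a single 3×3 Jordan block; the helper names and the choice of matrix are assumptions made for illustration only, and no Hodge filtration is modelled.

```python
from fractions import Fraction


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]


def exp_nilpotent(n_mat, z):
    """exp(z * N) for a nilpotent square matrix N, as a finite sum.

    For an n x n nilpotent matrix, N^n = 0, so terms of order >= n vanish.
    """
    size = len(n_mat)
    result = identity(size)
    term = identity(size)
    for k in range(1, size):
        # term now holds (z N)^k / k!
        term = scale(Fraction(z) / k, mat_mul(term, n_mat))
        result = mat_add(result, term)
    return result


if __name__ == "__main__":
    N = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]  # single Jordan block, N^3 = 0
    z = Fraction(2, 3)
    shifted = exp_nilpotent(N, z + 1)
    composed = mat_mul(exp_nilpotent(N, 1), exp_nilpotent(N, z))
    print(shifted == composed)
```

For commuting nilpotent matrices the same identity holds in each variable separately, which is the matrix-level content of the shift convention stated above.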

It follows from Schmid's $SL_2$-orbit theorem [27] that the mixed Hodge structure $(W(\mathcal{C})[-k], F_0)$ associated with a local variation is polarized by every $N \in \mathcal{C}$. Conversely, given commuting nilpotent elements $N_1, \dots, N_r \in \mathfrak{g}_{\mathbb{R}}$ so that the weight filtration $W(\sum_j \lambda_j N_j)$, $\lambda_j \in \mathbb{R}_{>0}$, is independent of the choice of $\lambda_1, \dots, \lambda_r$, and $F_0 \in \check{D}$ such


that $(W(\mathcal{C}), F_0)$ is polarized by every element $N \in \mathcal{C}$, the map (2.5) is a period mapping for $\mathrm{Im}(z_j)$ sufficiently large [9, Prop. 4.66]. Moreover, if $(W(\mathcal{C}), F_0)$ splits over $\mathbb{R}$, then $\theta(z) \in D$ for $\mathrm{Im}(z_j) > 0$. We refer to the map $\theta$, or equivalently, to $\{N_1, \dots, N_r; F_0\}$ as a nilpotent orbit.

The following example shows the relationship between nilpotent orbits (equivalently, polarized mixed Hodge structures) and the Lefschetz structure on the cohomology of a compact Kähler manifold. This point of view was introduced in [10], where it was used to obtain relations between the Lefschetz decompositions corresponding to different Kähler classes.

Example 2.1. If $X$ is a compact Kähler manifold of dimension $k$, the bigrading $I^{p,q} := H^{k-q,k-p}(X)$ defines a mixed Hodge structure $(W, F)$ on $H^*(X, \mathbb{C})$ that splits over $\mathbb{R}$. The interest of this construction lies in the fact that this mixed Hodge structure is polarized by the Kähler cone. Indeed, the Hard Lefschetz Theorem is equivalent to the statement that if $\omega$ is a Kähler class and $L_\omega$ denotes multiplication by $\omega$, then $W = W(L_\omega)[-k]$; while the Hodge-Riemann bilinear relations imply that $L_\omega$ polarizes $(W, F)$ relative to the intersection form. The restriction of $(W, F)$ to $V := \bigoplus_{p=0}^{k} H^{p,p}$ defines a mixed Hodge structure of Hodge-Tate type.

We now describe the analytic component of a local variation. The bigrading associated with the limiting mixed Hodge structure $(W, F_0)$ defines a bigrading $I^{*,*}\mathfrak{g}$ of the Lie algebra $\mathfrak{g}$ by $I^{a,b}\mathfrak{g} := \{X \in \mathfrak{g} : X(I^{p,q}) \subset I^{p+a,q+b}\}$. Set

$$\mathfrak{p}_a := \bigoplus_{q} I^{a,q}\mathfrak{g} \qquad \text{and} \qquad \mathfrak{g}_- := \bigoplus_{a \le -1} \mathfrak{p}_a. \tag{2.6}$$

The nilpotent subalgebra $\mathfrak{g}_-$ is a complement of the stabilizer subalgebra at $F_0$. Hence $(\mathfrak{g}_-, X \mapsto \exp(X) \cdot F_0)$ provides a local model for the $G_{\mathbb{C}}$-homogeneous space $\check{D}$ near $F_0$. Thus, locally around the origin, we may write $\Psi(s, t) = \exp(\Gamma(s, t)) \cdot F_0$, where $\Gamma(s, t)$ is a holomorphic $\mathfrak{g}_-$-valued map with $\Gamma(0, 0) = 0$. We also write

$$\Phi(s, t) = \exp\Big(\sum_{j=1}^{r} \frac{1}{2\pi i}\log(s_j)\, N_j\Big) \cdot \exp(\Gamma(s, t)) \cdot F_0 = \exp X(s, t) \cdot F_0,$$

where $X(s, t) \in \mathfrak{g}_-$. The horizontality of $\Phi$ now translates, in terms of the gradings (2.6), into:

$$\exp(-X(s, t))\, d \exp X(s, t) = dX_{-1} \in \mathfrak{p}_{-1} \otimes T^*((\Delta^*)^r \times \Delta^m), \tag{2.7}$$

where $X_{-1}$ denotes the $\mathfrak{p}_{-1}$-graded part of $X$. In particular,

$$dX_{-1} \wedge dX_{-1} = 0, \tag{2.8}$$

where $X_{-1} = \frac{1}{2\pi i}\sum_{j=1}^{r} \log(s_j)\, N_j + \Gamma_{-1}$. The following result, which follows from [8, Thm. 2.8] and [6, Thm. 2.7], shows that the nilpotent orbit together with the $\mathfrak{p}_{-1}$-valued holomorphic function $\Gamma_{-1}$ completely determine the local variation:


E. Cattani, J. Fernandez

Theorem 2.2. Let $\{N_1, \dots, N_r; F_0\}$ be a nilpotent orbit and $R\colon \Delta^r \times \Delta^m \to \mathfrak{p}_{-1}$ be a holomorphic map with $R(0, 0) = 0$. Define $X_{-1}(z, t) := \sum_{j=1}^{r} z_j N_j + R(s, t)$, $s_j = e^{2\pi i z_j}$, and suppose that the differential equation (2.8) holds. Then, there exists a unique period mapping

$$\Phi(s, t) = \exp\Big(\frac{1}{2\pi i}\sum_{j=1}^{r} \log(s_j) N_j\Big) \cdot \exp(\Gamma(s, t)) \cdot F_0,$$

defined in a neighborhood of the origin in $\Delta^{r+m}$ such that $\Gamma_{-1} = R$.

In the ensuing sections we will be concerned with a special type of maximally degenerating variation. These are relevant to the study of mirror symmetry and, from a Hodge theoretic perspective, they have the advantage of allowing us to use a canonical system of coordinates on the parameter space of the variation. Following Morrison [24, Def. 3], we consider

Definition 2.3. Given a polarized variation of Hodge structure of weight $k$ over $(\Delta^*)^r$ whose monodromy is unipotent, we say that $0 \in \Delta^r$ is a maximally unipotent boundary point if
1. $\dim I^{k,k} = 1$, $\dim I^{k-1,k-1} = r$ and $\dim I^{k,k-1} = \dim I^{k-2,k} = 0$, where $I^{*,*}$ is the bigrading associated to the limiting mixed Hodge structure and,
2. $\mathrm{Span}_{\mathbb{C}}\{N_1(I^{k,k}), \dots, N_r(I^{k,k})\} = I^{k-1,k-1}$, where $N_j$ are the monodromy logarithms of the variation.

The limiting Hodge filtration $F_0$ and the holomorphic function $\Gamma$ of a local variation depend on the choice of coordinates on $(\Delta^*)^r$. However, in the maximally unipotent case we may normalize our choices as follows.

Proposition 2.4. Let $\Phi = \exp\big(\sum_{j=1}^{r} \frac{1}{2\pi i}\log(s_j) N_j\big) \cdot \exp(\Gamma(s)) \cdot F_0$ be a polarized variation of Hodge structure that has a maximally unipotent boundary point at $0 \in \Delta^r$. Then, there is a coordinate system on $\Delta^r$, unique up to scaling, where $\Gamma$ satisfies $\Gamma_{-1}(I^{1,1}) = 0$.

For a proof of Proposition 2.4, see [6, §3]. We will refer to these as canonical coordinates. They are standard in the physics literature and their Hodge-theoretic interpretation is due to D. Morrison [23] and P. Deligne [16].

3. Frobenius Modules

The cohomology of even degree of a compact manifold is a graded Frobenius algebra relative to cup product and the intersection form. When $X$ is Kähler, the Hard Lefschetz Theorem and the Hodge-Riemann bilinear relations describe the action of $H^{1,1}(X)$ on the full cohomology. We abstract these properties in the notion of a (framed) Frobenius module.
Let $V = \bigoplus_{p=0}^{k} V_{2p}$ be a graded $\mathbb{C}$-vector space and $B$ a symmetric nondegenerate bilinear form on $V$ pairing $V_{2p}$ with $V_{2(k-p)}$. Let $\{T_a\}_{0 \le a \le m}$ be a $B$-self dual, graded basis of $V$. We will refer to $\{T_a\}$ as an adapted basis. For $0 \le a \le m$ define $\delta(a)$ by $B(T_{\delta(a)}, T_b) = \delta_{ab}$ for all $b = 0, \dots, m$. We also set $\tilde{a} := p$ if and only if $T_a \in V_p$ and assume that the map $\sim\colon \{0, \dots, m\} \to \{0, \dots, 2k\}$ is increasing.

Frobenius Modules and Hodge Asymptotics


Definition 3.1. $(V, B, e, *)$ is a graded $V_2$-Frobenius module of weight $k$ if
1. $e \neq 0$ and $V_0 = \langle e \rangle$.
2. $V$ is a graded $\mathrm{Sym}\, V_2$-module under $*$.
3. For all $v_1, v_2 \in V$ and $w \in V_2$,
$$B(w * v_1, v_2) = B(v_1, w * v_2). \tag{3.1}$$
4. $w * e = w$ for all $w \in V_2$.

Since $T_0 \in V_0$, it must be a non-zero multiple of $e$ and we assume that an adapted basis satisfies $T_0 = e$. Clearly, the fact that $V$ is a $\mathrm{Sym}\, V_2$-module is equivalent to

$$T_j * (T_l * T) = T_l * (T_j * T) \quad\text{for all } T_j, T_l \in V_2 \text{ and } T \in V. \tag{3.2}$$

We say that $V$ is real if $V$ has a real structure, $V_{\mathbb{R}}$, compatible with its grading, $*$ is real, $e \in V_{\mathbb{R}}$, and $B$ is defined over $\mathbb{R}$.

Example 3.2. If $X$ is a compact Kähler manifold of dimension $k$, let $V_{2p} := H^{p,p}(X)$, $B_{\mathrm{int}}$ the intersection pairing on $V := \bigoplus_{p=0}^{k} V_{2p}$, and $\cup$ the restriction of the cup product to $V$. Then, $(V, B_{\mathrm{int}}, 1, \cup)$ defines a real Frobenius module. The real structure is induced by $H^*(X, \mathbb{R})$.

As in the case of the cohomology of a compact Kähler manifold, to any real Frobenius module we can associate a Hodge-Tate mixed Hodge structure:

$$I^{p,p} := V_{2(k-p)}. \tag{3.3}$$
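As a minimal illustration of Definition 3.1 and Example 3.2, one may take $X = \mathbb{P}^k$, where each $H^{p,p}$ is spanned by $T_p = \omega^p$ and $B(T_a, T_b) = 1$ exactly when $a + b = k$. The sketch below is not taken from the paper; the function names are hypothetical and the choice of $\mathbb{P}^k$ is an assumption made only to keep the matrices small. It checks the invariance condition (3.1) in matrix form and the nilpotency of the multiplication operator $L_\omega$.

```python
def lefschetz_matrix(k):
    """Matrix of L = multiplication by omega in the basis T_p = omega^p, p = 0..k."""
    size = k + 1
    # Column p holds the image of T_p, namely T_{p+1} (zero when p = k).
    return [[1 if row == col + 1 else 0 for col in range(size)] for row in range(size)]


def pairing_matrix(k):
    """Gram matrix of the intersection form: B(T_a, T_b) = 1 iff a + b = k."""
    size = k + 1
    return [[1 if a + b == k else 0 for b in range(size)] for a in range(size)]


def matmul(left, right):
    """Plain integer matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*right)] for row in left]


def transpose(matrix):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def satisfies_invariance(k):
    """Check (3.1) in matrix form: B(L v1, v2) = B(v1, L v2) iff L^T B = B L."""
    L, B = lefschetz_matrix(k), pairing_matrix(k)
    return matmul(transpose(L), B) == matmul(B, L)


def nilpotency_index(k):
    """Smallest n with L^n = 0; on P^k, L^k is nonzero and L^(k+1) vanishes by degree."""
    L = lefschetz_matrix(k)
    power, n = L, 1
    while any(any(row) for row in power):
        power, n = matmul(power, L), n + 1
    return n
```

With a single generator of $V_2$ the commutativity condition (3.2) is automatic, so only (3.1) and nilpotency carry information in this example.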

The multiplication operator $L_w \in \mathrm{End}(V)$, $w \in V_2$, is an infinitesimal automorphism of the bilinear form

$$Q(v_a, v_b) := (-1)^{k + \tilde{a}/2}\, B(v_a, v_b), \tag{3.4}$$

as well as a $(-1, -1)$-morphism of the associated mixed Hodge structure. We will say that $w \in V_2 \cap V_{\mathbb{R}}$ polarizes $V$ if the mixed Hodge structure $(I^{*,*}, Q, L_w)$ is polarized. A real Frobenius module $V$ is said to be polarizable if it contains a polarizing element. Given a polarizing element $w$, the set of polarizing elements is an open cone in $V_2 \cap V_{\mathbb{R}}$. We can then choose a basis $T_1, \dots, T_r$ of $V_2 \cap V_{\mathbb{R}}$ spanning a simplicial cone $C$ contained in the closure of the polarizing cone and with $w \in C$. Such a choice of a basis of $V_2$ will be called a framing of the polarized Frobenius module.

Given an adapted basis $\{T_0, \dots, T_m\}$ of $V$, let $z_0, \dots, z_m$ be the corresponding linear coordinates on $V$ and set $q_j := \exp(2\pi i z_j)$ for $j = 1, \dots, r := \dim V_2$. We may identify $U^r \cong (V_2 \cap V_{\mathbb{R}}) \oplus i\,C$ and view the correspondence

$$\sum_{j=1}^{r} z_j T_j \in (V_2 \cap V_{\mathbb{R}}) \oplus i\,C \;\mapsto\; (q_1, \dots, q_r) \in (\Delta^*)^r$$

as the natural covering map.

Proposition 3.3. Framed, real Frobenius modules of weight $k$ are equivalent to nilpotent orbits of weight $k$ whose limiting mixed Hodge structure is of Hodge-Tate type, split over $\mathbb{R}$, have a marked real element in $F^k$, and have the origin as a maximally unipotent boundary point.


Proof. Let $(V, B, e, *)$ be a real Frobenius module with framing $T_1, \dots, T_r$. Set $N_j := L_{T_j}$ and $F^p := \bigoplus_{a \ge p} I^{a,a}$. Then $\{N_1, \dots, N_r; F\}$ is a nilpotent orbit. The element $e \in I^{k,k} = F^k$ is a distinguished real element and the conditions of Definition 2.3 are clearly satisfied. Conversely, suppose $\{N_1, \dots, N_r; F\}$ is a nilpotent orbit whose limiting mixed Hodge structure is of Hodge-Tate type, split over $\mathbb{R}$ and satisfies both conditions of Definition 2.3. Set $V_{2p} := I^{k-p,k-p}$; in particular, the marked element $e \in F^k = I^{k,k} = V_0$ and it follows from (2) in Definition 2.3 that the map $N \in \mathrm{Span}_{\mathbb{C}}\{N_1, \dots, N_r\} \mapsto N(e)$ identifies the polynomial algebra $\mathbb{C}[N_1, \dots, N_r]$ with $\mathrm{Sym}\, V_2$ and defines a $\mathrm{Sym}\, V_2$ action on $V$. Let $B$ be defined from the polarization $Q$ as in (3.4), then since the monodromy transformations $N_j$ are infinitesimal automorphisms of $Q$, (3.1) is satisfied. Thus, $(V, B, e, *)$ is a Frobenius module. The equivalence between nilpotent orbits and polarized mixed Hodge structures implies that $T_j = N_j(e)$, $j = 1, \dots, r$, are a framing of $V$ and the fact that $N_1, \dots, N_r$ are real implies that the Frobenius structure is real.

A Frobenius module structure may also be encoded in a polynomial of degree 3 in the variables $z_0, \dots, z_m$. Indeed, if we let

$$\phi_0(z_0, \dots, z_m) := \sum_{\tilde{j} = 2,\; 0 \le \tilde{a}, \tilde{b} \le 2k} z_j z_a z_b\, C(\tilde{a})\, B(T_j * T_a, T_b),$$

with

$$C(\tilde{a}) := \begin{cases} \tfrac{1}{6} & \text{if } k = 3 \text{ and } \tilde{a} = 2, \\ \tfrac{1}{4} & \text{if } k \neq 3 \text{ and } \tilde{a} = 2 \text{ or } \tilde{a} = 2k - 4, \\ \tfrac{1}{2} & \text{otherwise,} \end{cases}$$

then we recover the $\mathrm{Sym}\, V_2$-action by:

$$T_j * T_a := \sum_{\tilde{c} = \tilde{a} + 2} \frac{\partial^3 \phi_0}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c; \qquad j = 1, \dots, r.$$

The polynomial $\phi_0$ is called a (classical) potential for the Frobenius module. We may generalize this construction by considering deformations of the classical potential. This is motivated by the construction of the quantum product as a deformation of the cup product on the cohomology. We assume, for simplicity, that $k > 3$. Let $R := \mathbb{C}\{q_1, \dots, q_r\}_0$ denote the ring of convergent power series vanishing for $q_1 = \dots = q_r = 0$ and $R'$ be its image under the map induced by $q_j \mapsto e^{2\pi i z_j}$ for $1 \le j \le r$.

Definition 3.4. Let $(V, B, e, *)$ be a Frobenius module of weight $k > 3$ with classical potential $\phi_0$. A quantum potential on $V$ is a function $\phi\colon V \to \mathbb{C}$ of the form $\phi = \phi_0 + \phi'$, where

$$\phi'(z) := \sum_{\tilde{a} = 2k-4} z_a\, \phi_h^{a}(z_1, \dots, z_r) + \sum_{\substack{2 < \tilde{a} < 2k-4 \\ \tilde{a} + \tilde{b} = 2k-2}} z_a z_b\, \phi_h^{ab}(z_1, \dots, z_r), \tag{3.5}$$

with $\phi_h^{a}, \phi_h^{ab} \in R'$ and such that

$$\sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi}{\partial z_l\, \partial z_a\, \partial z_{\delta(c)}}\, \frac{\partial^3 \phi}{\partial z_j\, \partial z_c\, \partial z_{\delta(d)}} = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, \frac{\partial^3 \phi}{\partial z_l\, \partial z_c\, \partial z_{\delta(d)}} \tag{3.6}$$

holds for all $a$, $\tilde{j} = \tilde{l} = 2$ and $\tilde{d} = \tilde{a} + 4$.

Given a quantum potential $\phi$ on $(V, B, e, *)$, we can define a deformation of the module structure by

$$T_j \cdot_q T_a := \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c, \qquad\text{with } q = (q_1, \dots, q_r) \in \Delta^r. \tag{3.7}$$

We should stress that, even though the right side of (3.7) depends explicitly on the variables $z_0, \dots, z_m$, (3.5) implies that it is actually a function of $q_1, \dots, q_r$. Condition (3.6) guarantees that (3.7) defines an action of $\mathrm{Sym}\, V_2$ for all $q$. Moreover, $(V, B, T_0, \cdot_q)$ is a Frobenius module of weight $k$ for all $q$, and $\cdot_0 = *$. We will say that a deformation of the Frobenius module $V$ is framed if $V$ is framed.

Remark 3.5. Definition 3.4 abstracts the properties of the graded portion of the Gromov-Witten potential needed to describe the action of $H^{1,1}(X, \mathbb{C})$ in the small quantum cohomology ring of a Calabi-Yau manifold $X$. In particular, (3.6) is a graded component of the WDVV equations. We refer to [14, §8.2, §8.3] and [6, §5] for details. We can extend the definition of quantum potential to the weight 3 case by taking $\phi = \phi_0 + \phi'$ for $\phi' \in R'$. With this notion, all the results from Sects. 4 and 5 extend to this weight. For $V$ of weight 1 or 2, the Frobenius module is determined by $B$ and $e$; hence no deformations are possible.

4. Correspondence

In this section we will prove the main result of this paper, namely the correspondence between deformations of framed Frobenius modules and degenerating polarized variations of Hodge structures. In §5 we will show that when the deformation arises from the quantum product of a Calabi-Yau manifold, the associated variation of Hodge structure is the so-called A-model variation.

Theorem 4.1. There is a one to one correspondence between
– Quantum potentials $\phi$ on a framed Frobenius module $(V, B, e, *)$ of weight $k$, and
– Germs of polarized variations of Hodge structure of weight $k$ on $V$ degenerating at a maximally unipotent boundary point to a limiting mixed Hodge structure of Hodge-Tate type, split over $\mathbb{R}$, and together with a marked real point $e \in F^k$.
Under this correspondence, classical potentials – equivalently, framed Frobenius modules – correspond to nilpotent orbits as in Proposition 3.3.

Proof.
Let $(V, B, e, *)$ be a framed Frobenius module of weight $k$, $\{T_0, \dots, T_m\}$ an adapted basis, and let $\{N_1, \dots, N_r; F\}$ be the nilpotent orbit associated by Proposition 3.3. Given a quantum potential $\phi = \phi_0 + \phi'$ on $V$ define

$$\Gamma_{-1}(q)(T_a) := \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^2 \phi'(q)}{\partial z_a\, \partial z_{\delta(c)}}\, T_c. \tag{4.1}$$


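Notice that because of (3.5), $\Gamma_{-1}$ is holomorphic on some open neighborhood of $q = 0 \in \Delta^r$, $\Gamma_{-1}(0) = 0$, and it takes values on $\mathfrak{p}_{-1}$ relative to the grading (2.6) defined by the limiting mixed Hodge structure of $\{N_1, \dots, N_r; F\}$. As before, we set $X_{-1}(q) := \frac{1}{2\pi i}\sum_{j=1}^{r} \log(q_j) N_j + \Gamma_{-1}(q) \in \mathfrak{p}_{-1}$ and note that the deformed Frobenius structure may be recovered from $X_{-1}(q)$ by

$$T_j \cdot_q T_a = \frac{\partial X_{-1}}{\partial z_j}(T_a); \qquad \tilde{j} = 2,\; 0 \le a \le m. \tag{4.2}$$

Equations (3.6) imply that $X_{-1}$ satisfies the integrability condition (2.8). Indeed,

$$dX_{-1} \wedge dX_{-1} = 0 \iff \frac{\partial X_{-1}}{\partial z_j}\frac{\partial X_{-1}}{\partial z_l} = \frac{\partial X_{-1}}{\partial z_l}\frac{\partial X_{-1}}{\partial z_j} \iff T_j \cdot_q (T_l \cdot_q T_a) = T_l \cdot_q (T_j \cdot_q T_a), \tag{4.3}$$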

which, by (3.6), holds whenever $\tilde{j} = \tilde{l} = 2$ and all $a$. Theorem 2.2 now implies that $X_{-1}$ defines a unique polarized variation of Hodge structure on a neighborhood of $0 \in \Delta^r$ whose nilpotent orbit is $\{N_1, \dots, N_r; F\}$. Hence the origin is a maximally unipotent boundary point and the limiting mixed Hodge structure is of Hodge-Tate type.

Conversely, let $\Phi$ be the period map of a local variation having a maximally unipotent boundary point at the origin. Let $\{N_1, \dots, N_r; F\}$ be the corresponding nilpotent orbit and $I^{*,*}$ the limiting mixed Hodge structure, which we assume to be of Hodge-Tate type. Let $(V, B, e, *)$ be the real, framed Frobenius module given by Proposition 3.3 and $\phi_0$ the corresponding classical potential. Let $\{T_0, \dots, T_m\}$ be an adapted basis such that $T_j = N_j(e)$, $j = 1, \dots, r$. Using canonical coordinates $q$ on $\Delta^r$, we define a quantum potential from the holomorphic function $\Gamma\colon \Delta^r \to \mathfrak{g}_-$ associated with $\Phi$ by:

$$\phi_h^{ab}(q) := \tfrac{1}{2}\, B(\Gamma_{-1}(T_a), T_b) \quad\text{for } 2 < \tilde{a} < 2k - 4 \text{ and } \tilde{a} + \tilde{b} = 2k - 2,$$
$$\phi_h^{a}(q) := B(-\Gamma_{-2}(T_a), T_0) \quad\text{for } \tilde{a} = 2k - 4,$$
$$\phi' := \sum_{\tilde{a} = 2k-4} z_a\, \phi_h^{a} + \sum_{\substack{2 < \tilde{a} < 2k-4 \\ \tilde{a} + \tilde{b} = 2k-2}} z_a z_b\, \phi_h^{ab},$$
$$\phi := \phi_0 + \phi'.$$

Clearly, $\phi$ is as in (3.5). In order to verify that (3.6) is satisfied we consider the associated deformation (3.7) of the Frobenius module structure

$$T_j \cdot_q T_a := \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c$$

and show that it may also be given as

$$T_j \cdot_q T_a = \frac{\partial X_{-1}}{\partial z_j}(T_a). \tag{4.4}$$


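Indeed, for $2 < \tilde{a} < 2k - 4$ we have

$$\Gamma_{-1}(T_a) = \sum_{\tilde{c} = \tilde{a}+2} \phi_h^{a\delta(c)}\, T_c,$$

so that

$$\frac{\partial \Gamma_{-1}}{\partial z_j}(T_a) = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial}{\partial z_j}\phi_h^{a\delta(c)}\, T_c = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\Big(\sum_{\tilde{u} + \tilde{v} = 2k-2} \tfrac{1}{2}\, z_u z_v\, \phi_h^{uv}\Big) T_c = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi'}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c,$$

where we have used that $\phi_h^{ab} = \phi_h^{ba}$. Then

$$\frac{\partial X_{-1}}{\partial z_j}(T_a) = N_j(T_a) + \frac{\partial \Gamma_{-1}}{\partial z_j}(T_a) = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi_0}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c + \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi'}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c = T_j \cdot_q T_a.$$

In order to verify (4.4) when $\tilde{a} = 2k - 4$ we first prove the identity

$$\Gamma_{-1}(T_a) = \sum_{\tilde{c} = 2k-2} \frac{\partial}{\partial z_{\delta(c)}} B(-\Gamma_{-2}(T_a), T_0)\, T_c, \qquad \tilde{a} = 2k - 4, \tag{4.5}$$

as a consequence of the horizontality condition (2.7). If this condition is rewritten in terms of $G(q) := \exp \Gamma(q)$ and $\Theta = \sum_j N_j\, dz_j$ we get $dG = [G, \Theta] + G\, d\Gamma_{-1}$. This equation is graded by (2.6) and its homogeneous pieces are

$$dG_{-\ell} = [G_{-\ell+1}, \Theta] + G_{-\ell+1}\, d\Gamma_{-1}, \qquad \ell \ge 2. \tag{4.6}$$

In particular, for $\ell = 2$ we obtain

$$d\Gamma_{-2} = [\Gamma_{-1}, \Theta + \tfrac{1}{2}\, d\Gamma_{-1}].$$

Evaluating at $T_a$ and given that the canonical coordinates $(q_1, \dots, q_r)$ are characterized by $\Gamma_{-1}(T_b) = 0$ for all $\tilde{b} = 2k - 2$, we obtain

$$d\Gamma_{-2}(T_a) = -\Theta(\Gamma_{-1}(T_a)). \tag{4.7}$$

By the $B$-self-duality of the basis $\{T_0, \dots, T_m\}$, we can write

$$\Gamma_{-1}(T_a) = \sum_{\tilde{c} = 2k-2} B(\Gamma_{-1}(T_a), T_{\delta(c)})\, T_c. \tag{4.8}$$

Now, if $\tilde{c} = 2k - 2$ and $j = 1, \dots, r$, then $N_j(T_c) = \delta_{j\delta(c)}\, T_m$ and, therefore, $\Theta(T_c) = T_m\, dz_{\delta(c)}$ and (4.7), (4.8) imply

$$d\Gamma_{-2}(T_a) = -\sum_{\tilde{c} = 2k-2} dz_{\delta(c)}\, B(\Gamma_{-1}(T_a), T_{\delta(c)})\, T_m,$$

so that,

$$\frac{\partial}{\partial z_{\delta(c)}} \Gamma_{-2}(T_a) = -B(\Gamma_{-1}(T_a), T_{\delta(c)})\, T_m,$$

implying that

$$\frac{\partial}{\partial z_{\delta(c)}} B(\Gamma_{-2}(T_a), T_0) = -B(\Gamma_{-1}(T_a), T_{\delta(c)}). \tag{4.9}$$

Finally, (4.5) follows from applying (4.9) to (4.8). Thus, if $\tilde{a} = 2k - 4$,

$$\frac{\partial \Gamma_{-1}}{\partial z_j}(T_a) = \frac{\partial}{\partial z_j} \sum_{\tilde{c} = 2k-2} \frac{\partial}{\partial z_{\delta(c)}} B(-\Gamma_{-2}(T_a), T_0)\, T_c = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial}{\partial z_j} \frac{\partial}{\partial z_{\delta(c)}} \frac{\partial}{\partial z_a}\Big(\sum_{\tilde{b} = 2k-4} z_b\, \phi_h^{b}(q)\Big) T_c = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi'}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c,$$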

and (4.4) follows as before. Given (4.4), the equivalences in (4.3) show that the integrability condition (2.8) implies that the quantum potential $\phi$ satisfies (3.6). Finally, we note that (4.4) and (4.2) imply that these correspondences are inverses of each other.

5. A-Model Variation

Here we will show that the polarized variation of Hodge structure associated to a quantum potential by Theorem 4.1 agrees with the A-model variation defined, in the case of the cohomology on a Calabi-Yau manifold, by the Gromov-Witten potential, as in, for example, [14, Chapter 8]. As a byproduct we give a different proof of the fact that the A-model variation associated with a general potential, in the sense of Definition 3.4, is a polarized variation of Hodge structure.

We begin by recalling the definition of the A-model variation. Let $\phi = \phi_0 + \phi'$ be a quantum potential on the framed Frobenius module $(V, B, e, *)$. Let $\{T_0, \dots, T_m\}$ be an adapted basis of $V$ and $(z_0, \dots, z_m)$ the corresponding linear coordinates on $V$; set $q_j = \exp(2\pi i z_j)$ for $j = 1, \dots, r$. We view $(q_1, \dots, q_r)$ as coordinates on $(\Delta^*)^r$. Let $\nabla$ be the connection on the vector bundle $\mathcal{V} := (\Delta^*)^r \times V$ defined on a constant section $T$ by

$$\nabla_{\frac{\partial}{\partial q_j}} T := \frac{1}{2\pi i\, q_j}\, T_j \cdot_q T. \tag{5.1}$$

Proposition 5.1. The connection $\nabla$ is flat. It has a simple pole at $q_j = 0$ and its residue is the nilpotent operator

$$\mathrm{Res}_{q_j = 0}(\nabla)(T_a) = \frac{1}{2\pi i}\Big(\sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi_0}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c\Big). \tag{5.2}$$

Proof. Given the definition of the quantum product (3.7) and (5.1), if $T_a$ denotes a constant section,

$$\nabla_{\frac{\partial}{\partial q_j}} T_a = \frac{1}{2\pi i\, q_j}\Big(\sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^3 \phi_0}{\partial z_j\, \partial z_a\, \partial z_{\delta(c)}}\, T_c\Big) + H_{ja}(q)$$

for some function $H$, which extends holomorphically to $0 \in \Delta^r$. This implies the residue assertion. The curvature of $\nabla$ reduces to

$$R_\nabla\Big(\frac{\partial}{\partial q_j}, \frac{\partial}{\partial q_l}\Big)(T_a) = \frac{1}{2\pi i}\Big(\frac{1}{q_l}\nabla_{\frac{\partial}{\partial q_j}}(T_l \cdot T_a) - \frac{1}{q_j}\nabla_{\frac{\partial}{\partial q_l}}(T_j \cdot T_a)\Big).$$

A straightforward computation shows that this last expression vanishes since $\phi$ satisfies (3.6).

Remark 5.2. It follows from (5.2) that the operators $\mathrm{Res}_{q_j = 0}(\nabla)$ agree, up to a constant, with the morphisms $L_{T_j}$ of left multiplication by $T_j$ in the Frobenius module $(V, B, e, *)$.

Consider the flags of subbundles of $\mathcal{V}$:

$$\mathcal{F}^p := (\Delta^*)^r \times \Big(\bigoplus_{a \ge p} V_{2(k-a)}\Big) \quad\text{and}\quad \mathcal{U}_\ell := (\Delta^*)^r \times \Big(\bigoplus_{b \ge \ell} V_{2b}\Big).$$

Proposition 5.3. The subbundles $\mathcal{F}^p$ satisfy Griffiths' horizontality (2.1). Moreover, for any given $\hat{q} \in (\Delta^*)^r$, there is a (multivalued) flat frame of $\mathcal{V}$, $\{T_a^\flat\}$, such that $T_a^\flat(q) \equiv T_a \bmod \mathcal{U}_{\tilde{a}+1}$ and $T_a^\flat(\hat{q}) = T_a$.

Proof. Since the maps $T \mapsto T_j \cdot_q T$ are homogeneous of degree 2, the horizontality follows directly from (5.1). Since $\nabla$ defines a connection on the bundle $\mathcal{U}_\ell$ inducing a trivial connection on $\mathcal{U}_\ell / \mathcal{U}_{\ell+1}$, the second statement follows.

Next, we want to compute the monodromy of $\nabla$. We fix all the coordinates $q_i$ for $i \neq j$ and consider the one-dimensional problem around $q_j = 0$. The flat sections $T_a^\flat$ can be written in terms of the constant sections as $T_a^\flat = \sum_b f_{ba} T_b$, and the flatness condition leads to the ODE with a regular singularity at the origin

$$\frac{\partial f_{ba}}{\partial q_j} = -\sum_c \Big(\frac{1}{q_j}\big(\mathrm{Res}_{q_j = 0}(\nabla)\big)_{bc} + H_{jc\delta(b)}\Big) f_{ca}, \tag{5.3}$$

where $H_{jcd}$ are holomorphic at $q_j = 0$. Therefore, classical results for such an equation (see [13, Ch. 4, Thm. 4.1]) imply that the coefficients $f_{ba}$ are of the form

$$f_{ba}(q) = \Big(G(q_j)\exp\big(-\log(q_j)\, \mathrm{Res}_{q_j = 0}(\nabla)\big)\Big)_{ba} \tag{5.4}$$

for some function $G$, holomorphic at $q_j = 0$, with $G(0) = \mathrm{Id}_n$. Parallel transport around $q_j = 0$, in the anti-clockwise direction, gives that the monodromy of $\nabla$, written relative to the frame $\{T_a^\flat\}$, is $M_j := \exp\big(-2\pi i\, \mathrm{Res}_{q_j = 0}(\nabla)\big)$. We let $N_j := -\log(M_j) = 2\pi i\, \mathrm{Res}_{q_j = 0}(\nabla)$. Notice that, in view of (5.2), the monodromy in a flat frame can be computed purely in terms of the classical potential. All together we conclude:


Proposition 5.4. The matrix of the local monodromy logarithm operator $N_j$ with respect to the frame $\{T_a^\flat\}$ coincides with the matrix of left $*$-multiplication by $T_j$, $L_{T_j}$, with respect to the basis $\{T_a\}$.

The fact that $\nabla$ has a simple pole at $q_j = 0$ with nilpotent residue $L_{T_j}$ allows us to construct Deligne's canonical extension $(\mathcal{V}^c, \nabla^c)$ [15] which is characterized by the fact that

$$\tilde{T}_a := \exp\Big(\sum_{j=1}^{r} \frac{\log(q_j)}{2\pi i}\, N_j\Big) T_a^\flat, \qquad a = 0, \dots, m, \tag{5.5}$$

are a flat frame of $(\mathcal{V}^c, \nabla^c)$.

Proposition 5.5. For $a = 0, \dots, m$, $\tilde{T}_a$ is the unique $\nabla^c$-flat section of $\mathcal{V}^c$ such that $\tilde{T}_a \equiv T_a \bmod \mathcal{U}_{\tilde{a}+1}$, and $\tilde{T}_a(\hat{q}) = T_a$. The matrix of $N_j$ acting on the frame $\{\tilde{T}_a\}$ equals the matrix of the classical product $*$ acting on $\{T_a\}$.

Proof. The first statement follows from Proposition 5.3 and (5.5). Since $[N_j, N_l] = 0$ for all $1 \le j, l \le r$, we have

$$N_l(\tilde{T}_a) = N_l\Big(\exp\Big(\sum_{j=1}^{r} \frac{\log(q_j)}{2\pi i}\, N_j\Big) T_a^\flat\Big) = \exp\Big(\sum_{j=1}^{r} \frac{\log(q_j)}{2\pi i}\, N_j\Big) N_l(T_a^\flat) = \sum_b (L_{T_l})_{ba}\, \tilde{T}_b,$$

and the second statement follows.

Remark 5.6. In the context of the Gromov-Witten potential, the previous result reduces to [14, Prop. 8.5.4] whose proof involves the formalism of gravitational correlators. The elementary proof given above shows that it is a direct consequence of the definition of the connection and, in particular, of the homogeneity of the operators $L_{T_j}$.

Because of Propositions 5.3 and 5.5, we know the first (graded) component of the sections $T_a^\flat$ and $\tilde{T}_a$. A lengthy but straightforward computation yields the second component of both $T_a^\flat$ and $\tilde{T}_a$.

Lemma 5.7. The $\nabla$-flat sections $T_a^\flat$ satisfy

$$T_a^\flat(q) \equiv T_a - \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^2 \phi}{\partial z_a\, \partial z_{\delta(c)}}\, T_c \mod \mathcal{U}_{\tilde{a}+2}.$$

Lemma 5.8. The $\nabla^c$-flat sections $\tilde{T}_a$ satisfy the following formulas, for $k > 3$.
For $\tilde{a} \ge 2k - 2$, $\tilde{T}_a = T_a$.
For $\tilde{a} = 2k - 4$, $\tilde{T}_a = T_a - \sum_{\tilde{c} = \tilde{a}+2} 2\pi i\, q_{\delta(c)} \frac{\partial}{\partial q_{\delta(c)}} \phi_h^{a}\, T_c + \phi_h^{a}\, T_m$.
For $2 < \tilde{a} < 2k - 4$, $\tilde{T}_a \equiv T_a - \sum_{\tilde{c} = \tilde{a}+2} \phi_h^{a\delta(c)}\, T_c \mod \mathcal{U}_{\tilde{a}+2}$.
For $\tilde{a} = 2$, $\tilde{T}_a \equiv T_a - \sum_{\tilde{c} = \tilde{a}+2} 2\pi i\, q_a \frac{\partial}{\partial q_a} \phi_h^{\delta(c)}\, T_c \mod \mathcal{U}_{\tilde{a}+2}$.
For $\tilde{a} = 0$, $\tilde{T}_0 \equiv T_0 \mod \mathcal{U}_{\tilde{a}+2}$.


We can now extend trivially the form $Q$, defined by (3.4), to a form $\mathcal{Q}$ on $\mathcal{V}$. $\mathcal{Q}$ is flat because of (3.2). To define a flat real structure $\mathcal{V}_{\mathbb{R}}$ on $\mathcal{V}$ we proceed as follows. Let

$$\tilde{\mathcal{V}} := \Delta^r \times V \quad\text{and}\quad \tilde{\nabla} := \nabla - \frac{1}{2\pi i}\sum_{j=1}^{r} N_j\, \frac{dq_j}{q_j}.$$

Then $\tilde{\nabla}$ is a flat connection on the bundle $\tilde{\mathcal{V}}$; for $v \in V$ we define $\tilde{\sigma}_v$ to be the $\tilde{\nabla}$-flat section of $\tilde{\mathcal{V}}$ such that $\tilde{\sigma}_v(0) = v$. Then $\mathcal{V}_{\mathbb{R}}$ is the local system generated by the sections $\exp\big(-\frac{1}{2\pi i}\sum_{j=1}^{r} \log(q_j) N_j\big)\, \tilde{\sigma}_v(q)$, for all $v \in V_{\mathbb{R}}$.

Definition 5.9. Let $\phi = \phi_0 + \phi'$ be a quantum potential on the framed, real Frobenius module $(V, B, e, *)$. Then $(\mathcal{V}, \nabla, \mathcal{F}, \mathcal{V}_{\mathbb{R}}, \mathcal{Q})$ is the A-model variation of the potential.

Theorem 5.10. The A-model variation is a polarized variation of Hodge structure. Moreover, it is the variation associated to the potential $\phi$ by Theorem 4.1.

Proof. Let $\Phi$ be the "period map" of $(\mathcal{V}, \nabla, \mathcal{F}, \mathcal{V}_{\mathbb{R}}, \mathcal{Q})$ defined by parallel transport to the fiber $\mathcal{V}_{\hat{q}}$, $\hat{q} \in (\Delta^*)^r$. By Proposition 5.4 the local monodromy logarithms $N_j$ are the left multiplication operators $L_{T_j}$ and, by Proposition 5.5, the limiting Hodge filtration becomes $F^p := \bigoplus_{a \ge p} V_{2(k-a)}$. Thus, Proposition 3.3 implies that $\{N_1, \dots, N_r; F\}$ is a nilpotent orbit. Let now $\exp\big(-\sum_j z_j N_j\big) \cdot \Phi(q) = \exp \Gamma(q) \cdot F$, where $\Gamma$ is a holomorphic, $\mathfrak{g}_-$-valued map defined locally around $0 \in \Delta^r$. Since the map $\Phi$ is horizontal, the $\mathfrak{p}_{-1}$-valued map $X_{-1} = \sum_j z_j N_j + \Gamma_{-1}$ satisfies the integrability condition (2.8) and it follows from Theorem 2.2 that $(\mathcal{V}, \nabla, \mathcal{F}, \mathcal{V}_{\mathbb{R}}, \mathcal{Q})$ is a polarized variation of Hodge structure.

In order to prove that this variation agrees with the one defined in Theorem 4.1 we appeal to the uniqueness statement in Theorem 2.2. Hence, it suffices to show that $\Gamma_{-1}$ is related to the potential $\phi$ by (4.1). But, the matrix presentation of $\exp(-\Gamma(q))$ in the basis $\{T_a\}$ is the matrix expressing the $\nabla^c$-flat frame $\{\tilde{T}_a\}$ in terms of the constant frame $\{T_a\}$. Thus, it follows from Lemma 5.8, that

$$\Gamma_{-1}(T_a) = \sum_{\tilde{c} = \tilde{a}+2} \frac{\partial^2 \phi'}{\partial z_a\, \partial z_{\delta(c)}}\, T_c,$$

as desired.

References
1. Barannikov, S.: Semi-infinite Hodge structures and mirror symmetry for projective spaces. arXiv:math.AG/0010157, 2000
2. Barannikov, S.: Quantum periods. I. Semi-infinite variations of Hodge structures. Internat. Math. Res. Notices 23, 1243–1264 (2001). Also, arXiv:math.DG/0006193
3. Barannikov, S.: Non-commutative periods and mirror symmetry in higher dimensions. Commun. Math. Phys. 228(2), 281–325 (2002). Also, arXiv:math.AG/9903124
4. Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165(2), 311–427 (1994). Also, arXiv:hep-th/9309140
5. Candelas, P., de la Ossa, X., Green, P., Parkes, L.: A pair of Calabi-Yau manifolds as an exactly soluble superconformal theory. Nucl. Phys. B 359(1), 21–74 (1991)
6. Cattani, E., Fernandez, J.: Asymptotic Hodge theory and quantum products. In: Advances in Algebraic Geometry Motivated by Physics, E. Previato (ed.), Contemp. Math., Vol. 276, pp. 115–136, Providence, RI: Amer. Math. Soc., 2001. Also, arXiv:math.AG/0011137
7. Cattani, E., Kaplan, A.: Polarized mixed Hodge structures and the local monodromy of a variation of Hodge structure. Invent. Math. 67(1), 101–115 (1982)
8. Cattani, E., Kaplan, A.: Degenerating variations of Hodge structures. Astérisque 179–180, 67–96 (1989)
9. Cattani, E., Kaplan, A., Schmid, W.: Degeneration of Hodge structures. Ann. Math. 123, 457–535 (1986)
10. Cattani, E., Kaplan, A., Schmid, W.: L2 and intersection cohomologies for a polarizable variation of Hodge structure. Invent. Math. 87(2), 217–252 (1987)
11. Cecotti, S.: N = 2 supergravity, type IIB superstrings, and algebraic geometry. Commun. Math. Phys. 131(3), 517–536 (1990)
12. Cecotti, S., Vafa, C.: Topological–anti-topological fusion. Nucl. Phys. B 367(2), 359–461 (1991)
13. Coddington, E., Levinson, N.: Theory of ordinary differential equations. International Series in Pure and Applied Mathematics, New York: McGraw-Hill, 1955
14. Cox, D., Katz, S.: Mirror Symmetry and Algebraic Geometry. Providence, RI: American Mathematical Society, 1999
15. Deligne, P.: Équations différentielles à points singuliers réguliers. Lecture Notes in Mathematics, Vol. 163, Berlin: Springer-Verlag, 1970
16. Deligne, P.: Local behavior of Hodge structures at infinity. Mirror symmetry, II, Providence, RI: Amer. Math. Soc., 1997, pp. 683–699
17. Dijkgraaf, R.: A Geometrical Approach to Two-dimensional Conformal Field Theory. Ph.D. thesis, Utrecht, 1989
18. Dijkgraaf, R., Verlinde, H., Verlinde, E.: Topological strings in d < 1. Nucl. Phys. B 352(1), 59–86 (1991)
19. Fernandez, J., Pearlstein, G.: Opposite filtrations, variations of Hodge structure and Frobenius modules. arXiv:math.AG/0301342
20. Greene, B.: String theory on Calabi-Yau manifolds. In: Fields, Strings and Duality (Boulder, CO, 1996), River Edge, NJ: World Sci. Publishing, 1997, pp. 543–726. Also, arXiv:hep-th/9702155
21. Griffiths, P.: Periods of integrals on algebraic manifolds. I. Construction and properties of the modular varieties. Am. J. Math. 90, 568–626 (1968)
22. Looijenga, E., Lunts, V.: A Lie algebra attached to a projective variety. Invent. Math. 129(2), 361–412 (1997). Also, arXiv:alg-geom/9604014
23. Morrison, D.: Picard-Fuchs equations and mirror maps for hypersurfaces. In: Essays on Mirror Manifolds. Hong Kong: Internat. Press, 1992, pp. 241–264. Also, arXiv:alg-geom/9111025
24. Morrison, D.: Compactifications of moduli spaces inspired by mirror symmetry. Astérisque, no. 218, 243–271 (1993), Journées de Géométrie Algébrique d'Orsay (Orsay, 1992). Also, arXiv:alg-geom/9304007
25. Morrison, D.: Mathematical aspects of mirror symmetry. In: Complex Algebraic Geometry (Park City, UT, 1993), Providence, RI: Amer. Math. Soc., 1997, pp. 265–327. Also, arXiv:alg-geom/9609021
26. Pearlstein, G.: Variations of mixed Hodge structure, Higgs fields, and quantum cohomology. Manuscripta Math. 102(3), 269–310 (2000). Also, arXiv:math.AG/9808106
27. Schmid, W.: Variation of Hodge structure: the singularities of the period mapping. Invent. Math. 22, 211–319 (1973)

Communicated by R. Dijkgraaf

Commun. Math. Phys. 238, 505–524 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0854-0

Communications in

Mathematical Physics

Regular Spacings of Complex Eigenvalues in the One-Dimensional Non-Hermitian Anderson Model Ilya Ya. Goldsheid, Boris A. Khoruzhenko School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, U.K. Received: 26 August 2002 / Accepted: 27 January 2003 Published online: 28 May 2003 – © Springer-Verlag 2003

Abstract: We prove that in dimension one the non-real eigenvalues of the non-Hermitian Anderson (NHA) model with a selfaveraging potential are regularly spaced. The class of selfaveraging potentials which we introduce in this paper is very wide and in particular includes stationary potentials (with probability one) as well as all quasi-periodic potentials. It should be emphasized that our approach here is much simpler than the one we used before. It allows us a) to investigate the above mentioned spacings, b) to establish certain properties of the integrated density of states of the Hermitian Anderson models with selfaveraging potentials, and c) to obtain (as a by-product) much simpler proofs of our previous results concerned with non-real eigenvalues of the NHA model.

1. Introduction

The non-Hermitian Anderson model (NHA model) was introduced by N. Hatano and D. Nelson in 1996. It arises in the physics of vortex matter, [5, 6], and in many other contexts, see e.g. [4, 12]. This model is described by the following operator:

$$(H_n^g \varphi)_k = -e^{g}\varphi_{k+1} + q_k \varphi_k - e^{-g}\varphi_{k-1}, \qquad 1 \le k \le n, \tag{1}$$

with periodic boundary conditions

$$\varphi_0 = \varphi_n, \qquad \varphi_1 = \varphi_{n+1}. \tag{2}$$

Here $g$ is a real parameter, $g \ge 0$. The Hilbert space is $l_2(1, n)$ with the standard inner product: if $\varphi = \{\varphi_j\}_{j=1}^{n}$ and $\psi = \{\psi_j\}_{j=1}^{n}$ are two vectors from $l_2(1, n)$, then $(\varphi, \psi) = \sum_{j=1}^{n} \varphi_j \bar{\psi}_j$. Hatano and Nelson considered the case when the values $q_j$ of the potential are taken as a realization of a sequence of independent identically distributed random variables. By conducting a numerical experiment they discovered a number of remarkable properties both of the spectrum and the eigenfunctions of the operator $H_n^g$.

Fig. 1. The eigenvalues of $H_n^g$ are presented by dots plotted in the complex plane. Here $n = 50$ and $\{q_k\}$ is a fixed realization of independent samples from the uniform distribution on $[-4, -3] \cup [3, 4]$; the values of $g$ are: (a) $g = 0.2$, (b) $g = 1.1$, (c) $g = 1.4$

It turns out that
the asymptotic behavior of the eigenvalues depends strongly on the value of the parameter $g$. To demonstrate this statement we present in Fig. 1 results of a similar numerical experiment. These pictures are not so easily predictable in the following sense. It is a consequence of the Weyl criterion that the spectrum of the limiting random operator ($n = \infty$) contains with probability one the union of spectra of operators with constant potentials, $q_j \equiv q$, for any real $q$ belonging to the support of the random variable $q_1$. A very simple calculation shows then that the spectrum of $H_\infty^g$ would typically (i.e. with probability 1) contain a two-dimensional subset of the complex plane (see [3] for more on the spectrum of $H_\infty^g$). However numerical experiments reproduce pictures like that in Fig. 1 with remarkable stability also for large values of $n$ (in [5, 6] $n = 1000$). They clearly show that the eigenvalues of $H_n^g$ have no tendency to spread over any two-dimensional region but rather tend to belong to smooth curves.

Our attempt to understand whether this phenomenon would indeed persist as $n \to \infty$ or is due to the fact that $n$ is not large enough stimulated the appearance of two papers [8, 9] where the analysis of the spectra of operators $H_n^g$ for finite but large values of $n$ was carried out. We shall now briefly describe some of our results restricting ourselves to the case of bounded potentials. Namely it turns out that there are critical values $\underline{g}_{\mathrm{cr}}$ and $\bar{g}_{\mathrm{cr}}$ depending only on the distribution of the random potential and such that:
1. If $0 \le g < \underline{g}_{\mathrm{cr}}$ then the eigenvalues of $H_n^g$ are "asymptotically" real (see Theorem 3.2 for the exact meaning of this statement). Moreover their limiting distribution does not depend on $g$ and is the same as in the Hermitian case (that is with $g = 0$).
2. If $\underline{g}_{\mathrm{cr}} < g < \bar{g}_{\mathrm{cr}}$ then a finite proportion of eigenvalues moves out of the real axis and places itself on very smooth curves in the complex plane. These curves converge to non-random limiting curves as $n \to \infty$. Moreover we found the limiting density of states for non-real eigenvalues and proved that the "asymptotically" real eigenvalues have the same limiting distribution as in the case of the self-adjoint model.
3. If $g > \bar{g}_{\mathrm{cr}}$ then virtually all eigenvalues leave the real axis.

One thus concludes that the phenomenon described above persists as $n$ grows. The fact that the spectrum of the limiting operator is two-dimensional means that on this spectrum (but off the above mentioned curves) the resolvent of $H_n^g$ exists but its norm tends to infinity as $n \to \infty$. (See [14] where the spectrum and the norm of the resolvent of random bidiagonal matrices were studied.)

In the present paper we investigate the spacings between neighboring non-real eigenvalues. We do this for models with selfaveraging (deterministic!) potentials which are introduced below. The class of selfaveraging potentials is very wide and includes, in particular, random stationary, quasi-periodic and many other potentials.
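The "very simple calculation" for constant potentials can be sketched as follows. This is an illustration under stated assumptions rather than the authors' numerical code: sites are indexed $0, \dots, n-1$ (a relabelling of $1, \dots, n$), the helper names are hypothetical, and only the constant potential $q_k \equiv q_0$ is treated. In that case the plane waves $\varphi_k = e^{i\theta k}$, $\theta = 2\pi m/n$, are eigenvectors of the operator (1) with the boundary conditions (2), with eigenvalue $q_0 - 2\cosh(g + i\theta)$, which traces an ellipse as $\theta$ varies.

```python
import cmath
import math


def apply_nha(phi, q, g):
    """Apply the operator of Eq. (1) to phi, using the periodic conditions (2)."""
    n = len(phi)
    return [
        -math.exp(g) * phi[(k + 1) % n] + q[k] * phi[k] - math.exp(-g) * phi[(k - 1) % n]
        for k in range(n)
    ]


def plane_wave(n, m):
    """phi_k = exp(2 pi i m k / n); periodic in k with period n."""
    return [cmath.exp(2j * math.pi * m * k / n) for k in range(n)]


def constant_potential_eigenvalue(n, m, q0, g):
    """Eigenvalue q0 - 2 cosh(g + i theta) for the m-th plane wave, theta = 2 pi m / n."""
    theta = 2.0 * math.pi * m / n
    return q0 - 2.0 * cmath.cosh(g + 1j * theta)


def residual(n, m, q0, g):
    """Largest entry of |H phi - lambda phi| for the m-th plane wave."""
    phi = plane_wave(n, m)
    lam = constant_potential_eigenvalue(n, m, q0, g)
    h_phi = apply_nha(phi, [q0] * n, g)
    return max(abs(h - lam * p) for h, p in zip(h_phi, phi))
```

For $g = 0$ the eigenvalues $q_0 - 2\cos\theta$ are real; for $g > 0$ they lie on the ellipse with semi-axes $2\cosh g$ and $2\sinh g$ centered at $q_0$, and letting $q_0$ range over an interval sweeps out a two-dimensional set.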

Complex Eigenvalues in the 1D Non-Hermitian Anderson Model


The approach used in [8, 9] was based on the theory of products of random matrices and on the potential theory. We would like to emphasize that the present paper is selfcontained and that the technique we use here is much simpler and more straightforward than what we used before. It allows us: To prove that the non-real eigenvalues of the NHA model behave in a very regular way: not only do they belong to very smooth curves as n → ∞ but also for any two neighboring non-real eigenvalues zk+1 and zk , 2πi 1 zk+1 − zk = +o , (3) nf (zk ) n where f (z) is an analytic function of z (see Theorem 3.5). This is our principal new result. To establish the log-H¨older continuity of the density of states for Hermitian Anderson models with selfaveraging potentials. To obtain (essentially as a by-product) much simpler proofs of our previous results listed above. Remarks. 1. In [8, 9] we considered tri-diagonal matrices with off-diagonal elements H (j, j + 1) and H (j, j − 1) depending on j . All results of this paper can be extended to these models. Themain additional condition that is needed is the existence of finite limits limn→∞ n−1 nj=1 ln H (j, j + 1) and limn→∞ n−1 nj=2 ln H (j, j − 1). 2. We have already mentioned above that, apart from the fact that stationary potentials provide natural examples of self-averaging potentials, randomness does not play any role in this paper. However as soon as one wishes to understand the next order term in (3) randomness becomes crucial. The very same thing applies to the properties of asymptotically real eigenvalues. Namely, it is natural to conjecture that the asymptotically real eigenvalues in fact are real for sufficiently large values of n. However there are reasons to believe that in order to prove this conjecture one should restrict himself to the class of random potentials with good properties. 3. 
Though the approach based on the theory of products of random matrices (TPRM) is more difficult than that of the present work, it should be said that all main results of this paper can be obtained within the framework of this theory. Moreover, this is what we did first. The additional advantage of the TPRM approach is that the case of one-dimensional differential Schrödinger operators can be treated by this method in exactly the same way as the discrete case. The attempt to extend the approach of this paper to the differential case would lead to the necessity of a certain regularization procedure (see [2]).

The paper is organized as follows. We introduce the class of selfaveraging potentials and discuss the log-Hölder continuity of the relevant density of states in Sect. 2. Section 3 contains the statement of the main results, which are proved in Sect. 4. In Appendix A we prove a well known property of Lyapunov exponents. We do this in order (a) to make our exposition even more self-contained and (b) to demonstrate one more application of the technique we use. Appendix B contains an example of calculation of the Lyapunov exponent for models whose potentials have rare high peaks. We believe that these examples are of interest on their own, but the initial purpose of finding them was to provide a natural completion for the study of selfaveraging potentials.

2. The Selfaveraging Potentials

We introduce now a class of deterministic potentials for which the distribution of the eigenvalues of the NHA model (1) – (2) will be investigated in this paper.


I.Ya. Goldsheid, B.A. Khoruzhenko

Given an infinite sequence of real numbers $q \equiv \{q_k\}_{k=1}^{\infty}$, we consider the sequence of selfadjoint operators $H_n^0$, n = 2, 3, . . . , with potential q. Let $N_n(E)$ be the distribution function of the eigenvalues of $H_n^0$,

$$N_n(E) = \frac{1}{n}\,\#\{E_i : E_i \in \text{spectrum of } H_n^0 \text{ and } E_i < E\}. \tag{4}$$

Definition. We say that a real potential q is selfaveraging if the integrated density of states of the self-adjoint Anderson model with this potential exists, i.e. there exists a non-decreasing function N(E) such that $N_n(E) \to N(E)$ as n → ∞ at the points of continuity of N(E).

Remarks. 1. We have borrowed the name “selfaveraging” from the theory of random operators, where it is normally associated with convergence of $N_n(E)$ to a nonrandom limiting function.

2. The class of selfaveraging potentials is very wide. For example, it contains decaying potentials, periodic and almost periodic potentials, and stationary random potentials¹; see e.g. the book [11] for proofs and more examples.

3. We do not require $\int dN(E) = 1$. However, in many cases (and in particular in those mentioned above) the sequence of measures $dN_n(E)$ is tight and hence cannot lose mass.

4. The choice of periodic boundary conditions for $H_n^0$ is not essential; for example, one could use the Dirichlet boundary conditions $\varphi_0 = \varphi_{n+1} = 0$ instead. This is because changing boundary conditions amounts to a rank two perturbation of $H_n^0$, and hence has no effect on N(E) provided the perturbed finite interval operators remain selfadjoint. Our preference for the periodic boundary conditions will become apparent in the next section.

Let

$$U_n(x, y) \overset{\mathrm{def}}{=} \int_{-\infty}^{+\infty} \ln|x + iy - E|\, dN_n(E) \equiv \frac{1}{n}\sum_{j=1}^{n} \ln|x + iy - E_j|, \tag{5}$$

where the summation is over all eigenvalues of $H_n^0$, and

$$U(x, y) \overset{\mathrm{def}}{=} \int_{-\infty}^{+\infty} \ln|x + iy - E|\, dN(E), \qquad x, y \in \mathbb{R}. \tag{6}$$

Obviously the logarithmic integrals U(x, y) and $U_n(x, y)$ are the real parts of the analytic functions

$$F(z) \overset{\mathrm{def}}{=} \int_{-\infty}^{+\infty} \ln(E - z)\, dN(E) = U(x, y) + iV(x, y), \tag{7}$$

$$F_n(z) \overset{\mathrm{def}}{=} \int_{-\infty}^{+\infty} \ln(E - z)\, dN_n(E) = U_n(x, y) + iV_n(x, y). \tag{8}$$

Here and below we consider analytic functions defined in the upper half-plane $\mathbb{C}_+ = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, and the branch of ln(E − z) is chosen so that $\ln(-i) = -i\pi/2$. The functions defined in (5) – (8) play an important role in this paper. In this section we study their properties under the following conditions:

¹ If $\{q_k\}_{k=1}^{\infty}$ is a strictly stationary sequence of random variables then $N_n(E)$ is weakly converging for almost all realizations of q with respect to the corresponding probability measure.

C1 The potential q is selfaveraging.

C2 $\sup_{n\ge 1} \frac{1}{n}\sum_{k=1}^{n} \ln(1 + |q_k|) \le C < +\infty$.

The main role of Condition C2 is to ensure that the functions U(x, y) and F(z) are well defined. If q is bounded then Condition C2 is trivially valid and Condition C1 immediately implies that $F_n(z)$ converges to F(z) uniformly on compact sets in $\mathbb{C}\setminus\mathbb{R}$. In this case the technical statements of this section can be omitted.

Proposition 2.1. Assume C1–C2. Then for every x, y ∈ R the integral in (6) is converging.

Proof. Suppose first that y > 0 and let z = x + iy. Note that

$$\ln y \le U_n(x, y) \le \ln(2 + |z|) + C \quad \text{for all } x \in \mathbb{R} \text{ and } y > 0. \tag{9}$$

The LHS inequality is trivial, and the RHS inequality is ensured by Condition C2. Indeed,

$$U_n(x, y) \le \frac{1}{n}\sum_{j=1}^{n} \ln(2 + |z| + |q_j|),$$

where the last inequality follows, e.g., from the representation $U_n(x, y) = \frac{1}{n}\ln|\det(zI_n - H_n^0)|$ and Hadamard’s inequality for determinants, see e.g. [7]. By Condition C2,

$$\frac{1}{n}\sum_{j=1}^{n} \ln(2 + |z| + |q_j|) \le \ln(2 + |z|) + C.$$

If A > −∞ and B < +∞ are points of continuity of N(E) then, in view of Condition C1,

$$\int_A^B \ln|z - E|\, dN(E) = \lim_{n\to\infty}\int_A^B \ln|z - E|\, dN_n(E) \le \lim_{n\to\infty}\int_A^B \ln|z + i - E|\, dN_n(E) \le \limsup_{n\to\infty} U_n(x, y + 1).$$

Applying (9), we obtain that

$$\int_A^B \ln|z - E|\, dN(E) \le \ln(3 + |z|) + C. \tag{10}$$

As y ≠ 0, the function ln |z − E| is bounded from below, and (10) implies that the integral in (6) is converging for y > 0. By the symmetry, it is also converging for y < 0. We shall now make use of the following inequality, which will be proved later (Theorem 2.6): U(x, y) ≥ −c0 for some c0 > 0 and all x and y ≠ 0. This inequality together with the monotone convergence theorem yields that

$$\lim_{y\downarrow 0} U(x, y) = \int_{-\infty}^{+\infty} \ln|x - E|\, dN(E) \ge -c_0, \tag{11}$$

hence the integral in (6) is also converging for y = 0.

In view of the inequalities

$$Q_n - 2 \le H_n^0 \le Q_n + 2, \tag{12}$$

where $Q_n$ is the operator of multiplication by q, $(Q_n\varphi)_j = q_j\varphi_j$, j = 1, . . . , n, Condition C2 also ensures that the sequence of measures $dN_n(E)$ is tight:

Proposition 2.2. Suppose that q satisfies Condition C2. Then for any ε > 0 there exists B > 0 such that $N_n(B) - N_n(-B) > 1 - \varepsilon$ for all n.

Proof. Denote by $\chi_B(E)$ the indicator-function of the interval [B, +∞), and let $E_1, \ldots, E_n$ be the eigenvalues of $H_n^0$. Then for any B > 0,

$$1 - N_n(B) = \frac{1}{n}\sum_{j=1}^{n} \chi_B(E_j) \le \frac{1}{n}\sum_{j=1}^{n} \chi_B(E_j)\,\frac{\ln(1 + E_j)}{\ln(1 + B)} \le \frac{1}{n}\sum_{j=1}^{n} \chi_B(2 + q_j)\,\frac{\ln(1 + 2 + q_j)}{\ln(1 + B)},$$

where the last inequality follows from (12). We use here the following fact²: if f(E) is a non-decreasing function and A ≤ B then tr f(A) ≤ tr f(B). Since $\chi_B(2 + q_j)\ln(1 + 2 + q_j) \le \ln(1 + |2 + q_j|)$ we conclude, in view of Condition C2, that $N_n(B) \ge 1 - C/\ln(1 + B)$ for some constant C > 0 and all n. Similarly, $N_n(-B) \le C'/\ln(1 + B)$ for some constant C′ > 0 and all n.

Remark. It follows from (11) that N(·) is a continuous function. This, together with the just proved tightness of the sequence $dN_n(E)$, implies that, under Conditions C1 and C2, $N_n(E)$ converges pointwise to N(E) as n → ∞, and the convergence is uniform in E, −∞ < E < ∞.

We shall now investigate the relation between $F_n(z)$ and F(z) in the limit n → ∞.

Proposition 2.3. Assume C1–C2. Then for every real x and y ≠ 0,

$$\liminf_{n\to\infty} U_n(x, y) \ge U(x, y). \tag{13}$$

Proof. For all B large enough (so that [x − 1, x + 1] ⊆ [−B, B]) we have that for any n,

$$U_n(x, y) = \int_{-\infty}^{+\infty} \ln|x + iy - E|\, dN_n(E) \ge \int_{-B}^{B} \ln|x + iy - E|\, dN_n(E)$$

and, by Condition C1,

$$\liminf_{n\to\infty} U_n(x, y) \ge \int_{-B}^{B} \ln|x + iy - E|\, dN(E).$$

By letting B → ∞, we obtain (13).

² This fact is a straightforward consequence of the Courant-Fisher minimax theorem [7].

Under Conditions C1 and C2, the sequence $\{U_n(x, y)\}$ is not necessarily converging, even for y ≠ 0, and for some selfaveraging potentials the inequality in (13) is strict; for examples see Appendix B. In view of (9), for any compact set $K \subset \mathbb{C}_+$, $\sup_K |F_n(z)| \le M(K)$ for some constant M(K) < +∞ and all n. Hence the sequence $\{F_n(z)\}$ has a uniformly converging subsequence. We shall now describe all limit points of $\{F_n(z)\}$:

Theorem 2.4. Assume C1–C2. Suppose that $F_{n_j}(z)$ is a converging subsequence of $\{F_n(z)\}$. Then necessarily

$$\lim_{j\to\infty} F_{n_j}(z) = F(z) + c \tag{14}$$

for all $z \in \mathbb{C}_+$ and some real constant c satisfying the inequality 0 ≤ c ≤ ln 3 + C, where C is the constant defined in Condition C2.

Proof. We note that $F_n'(z) = \int_{-\infty}^{+\infty} (z - E)^{-1}\, dN_n(E)$ and $F'(z) = \int_{-\infty}^{+\infty} (z - E)^{-1}\, dN(E)$. Since the function $h(E) \overset{\mathrm{def}}{=} (z - E)^{-1}$ decays to zero at infinity, Condition C1 ensures that

$$\lim_{n\to\infty} F_n'(z) = F'(z)$$

uniformly in z on compact sets in $\mathbb{C}_+$. Therefore there exists a sequence of complex constants $c_n$ such that $F_n(z) - c_n \to F(z)$ as n → ∞ for all $z \in \mathbb{C}_+$. Passing on to the converging subsequence $F_{n_j}(z)$, we have that $c_{n_j}$ is also converging. Putting $c = \lim_j c_{n_j}$ we arrive at (14). It remains to prove that c is real, non-negative and satisfies the inequality c ≤ ln 3 + C.

Due to our choice of the branch of the log-function we have that −π ≤ Im ln(E − z) ≤ 0 for all $z \in \mathbb{C}_+$. Thus as n → ∞,

$$\operatorname{Im} F_n(z) = \int \operatorname{Im}\ln(E - z)\, dN_n(E) \longrightarrow \operatorname{Im} F(z),$$

because the integrand is bounded, the sequence of measures $dN_n(E)$ is tight and we have Condition C1. Therefore the constant c is real. The fact that it is non-negative follows from Proposition 2.3. To complete the proof, note that $U_{n_j}(x, y)$ converges to U(x, y) + c in the upper half of the xy-plane. Therefore, because of (9), U(x, y) + c ≤ ln(2 + |z|) + C. Putting here x = 0 and y = 1, we obtain c ≤ ln 3 + C − U(0, 1) ≤ ln 3 + C. (Obviously, U(x, 1) ≥ 0 for all x.)

The following condition:

C2* For any ε > 0 there is a B > 0 such that $\frac{1}{n}\sum_{j=1}^{n} \chi_B(|q_j|)\ln(1 + |q_j|) < \varepsilon$ for all n,

guarantees the convergence of $F_n(z)$ to F(z). It is obvious that Condition C2* is somewhat more restrictive than C2. On the other hand, it is satisfied by many popular classes of potentials. For example, the random stationary potentials with finite expectation of $\ln(1 + |q_j|)$ satisfy Condition C2* with probability one. It is also satisfied if

$$\frac{1}{n}\sum_{j=1}^{n} \ln^{1+\delta}(1 + |q_j|) \le C^* < +\infty$$

for some δ > 0.

Proposition 2.5. Assume C1 and C2*. Then

$$F_n(z) \underset{n\to\infty}{\longrightarrow} F(z) \quad \text{uniformly in } z \text{ on compact sets in } \mathbb{C}_+ \tag{15}$$

and, in particular,

$$U_n(x, y) \underset{n\to\infty}{\longrightarrow} U(x, y) \quad \text{uniformly in } z = x + iy \text{ on compact sets in } \mathbb{C}_+. \tag{16}$$

Proof. If the potential q is bounded then the statement of Proposition 2.5 is a straightforward corollary of Condition C1 and the fact that the functions $F_n(z)$ are equicontinuous on any compact subset of $\mathbb{C}_+$. If q is unbounded then one needs to show additionally that the contribution of the tails of $dN_n(E)$ to $F_n(z)$ is negligible in the limit n → ∞. Obviously, it will suffice to prove that for any ε > 0 there is a B > 0 such that

$$\frac{1}{n}\sum_{k=1}^{n} \chi_B(|E_k|)\ln(1 + |E_k|) < \varepsilon \quad \text{for all } n, \tag{17}$$

where the summation in (17) is effectively over all eigenvalues of $H_n^0$ such that $|E_k| \ge B$. To complete the proof note that, in view of the inequalities in (12), it is apparent that Condition C2* implies (17).
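The convergence of the logarithmic potentials can be checked numerically in the simplest situation. The following is a minimal sketch, assuming the free case q = 0 (an illustrative choice that is not singled out in the text), for which the eigenvalues of $H_n^0$ with periodic boundary conditions are $E_m = -2\cos(2\pi m/n)$ and the limiting potential is U(x, y) = ln |λ|, where λ is the root of λ² + zλ + 1 = 0 with |λ| ≥ 1. The function names `U_n` and `U_limit` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math


def U_n(n, z):
    """(1/n) * sum_m ln|z - E_m| with E_m = -2 cos(2 pi m / n) (free case q = 0)."""
    return sum(
        math.log(abs(z + 2.0 * math.cos(2.0 * math.pi * m / n))) for m in range(n)
    ) / n


def U_limit(z):
    """Limiting potential for q = 0: ln|lam|, lam the root of lam^2 + z*lam + 1 = 0 with |lam| >= 1."""
    disc = cmath.sqrt(z * z - 4.0)
    lam = (-z + disc) / 2.0
    if abs(lam) < 1.0:
        lam = (-z - disc) / 2.0  # the two roots multiply to 1, so this one has |lam| >= 1
    return math.log(abs(lam))


z = 0.3 + 0.5j
gap_small_n = abs(U_n(10, z) - U_limit(z))
gap_large_n = abs(U_n(200, z) - U_limit(z))

# Two-sided bound (9) with C = 0, which is admissible in Condition C2 when q = 0.
lower, value, upper = math.log(z.imag), U_n(200, z), math.log(2.0 + abs(z))
print(gap_small_n, gap_large_n, lower <= value <= upper)
```

In this special case the difference between $U_n$ and U decays exponentially in n, which is much faster than what Proposition 2.5 guarantees in general.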

We finish this section with a proof of the log-Hölder continuity of N(E). This property is well known for random potentials, and in this case it follows from the fact that U(x, y) ≥ 0 [2]. In turn, this inequality is a consequence of the Thouless formula, according to which U(x, y) coincides with the Lyapunov exponent of $H_n^0$ with n = ∞, see e.g. [2, 11, 1]. In our case Conditions C1 and C2 are too weak to guarantee the existence of the Lyapunov exponent even for non-real values of the spectral parameter, see examples in Appendix B. However, these two conditions ensure that the function U(x, y) is bounded from below which, in turn, implies (very much in the same way as in [2]) that N(E) is log-Hölder continuous.

Theorem 2.6. Assume C1–C2. Then:

(i) U(x, y) ≥ −c0 for some c0 > 0 and all real x and y.

(ii) N(E) is log-Hölder continuous: for any E and |σ| ≤ 1/2,

$$|N(E + \sigma) - N(E)| = c(E, \sigma)\,|\ln|\sigma||^{-1} \quad \text{where } \lim_{\sigma\to 0} c(E, \sigma) = 0. \tag{18}$$

If E belongs to a compact set then $c(E, \sigma) \le c_1$ with the constant c1 depending only on this compact set and the constant C in Condition C2.

Proof. Part (i). In view of (11) and the symmetry in y, it will suffice to prove the inequality for y > 0 only. It follows from Theorem 2.4 that

$$\liminf_{n\to\infty} U_n(x, y) = U(x, y) + c_0 \tag{19}$$

for some c0 ≥ 0 and all x and y ≠ 0. To finish the proof, it is sufficient to show that the LHS in (19) is non-negative. If $\lim_{n\to\infty} U_n(x, y)$ exists then it coincides with the Lyapunov exponent, and hence is non-negative. The general case can be treated similarly (see Appendix A).

Part (ii). Since $\int \ln|x - E|\, dN(E) \ge -c_0$, we have that

$$\int_{|x-E|\le 1} \ln|x - E|\, dN(E) + \int_{|x-E|>1} \ln|x - E|\, dN(E) \ge -c_0,$$

and

$$\int_{|x-E|\le 1} \ln\frac{1}{|x - E|}\, dN(E) \le \int_{|x-E|>1} \ln|x - E|\, dN(E) + c_0 \le U(x, 1) + c_0.$$

Therefore, for any |δ| ≤ 1/2,

$$U(x, 1) + c_0 \ge \int_{|x-E|\le|\delta|} \ln\frac{1}{|x - E|}\, dN(E) \ge |N(x + \delta) - N(x - \delta)|\,\ln\frac{1}{|\delta|},$$

and

$$|N(x + \delta) - N(x - \delta)| \le [U(x, 1) + c_0]\,|\ln|\delta||^{-1}. \tag{20}$$

Note that for any compact set K ⊂ R, $\max_K U(x, 1) < +\infty$. This is because U(x, 1) is continuous in x. Now, define for |δ| ≤ 1/2,

$$c(x, \delta) = \int_x^{x+\delta} \ln\frac{1}{|x - E|}\, dN(E).$$

Obviously,

$$|N(x + \delta) - N(x)| \le \frac{c(x, \delta)}{|\ln|\delta||}.$$

To complete the proof, note that (20) implies that the measure dN(E) has no atoms, and therefore c(x, δ) → 0 when δ → 0.

3. Main Results

For the sake of convenience and clarity of exposition, we shall formulate and prove our results for the class of potentials q satisfying Conditions C1 and C2*. We emphasize however that our main results hold true, modulo trivial modifications, under Conditions C1 and C2, and the corresponding proofs are identical to those given in Sect. 4. This is a mere reflection of Theorem 2.4 and the fact that our proofs are based on the convergence of $F_n'(z)$ to $F'(z)$.

3.1. Notations and auxiliary statements. Let us fix any finite interval [a, b] of the real axis (a < b). Most of our results apply to the part of the spectrum of $H_n^g$ belonging to the strip {z : Re z ∈ [a, b]} in the complex plane C. We define several critical values of the parameter g:

$$\underline{g}_{\mathrm{cr}} = \inf_{x\in S} U(x, 0), \qquad \bar{g}_{\mathrm{cr}} = \sup_{x\in S} U(x, 0), \tag{21}$$

where we have introduced the notation S for the support of the measure dN(E), and

$$\underline{g}_{\mathrm{cr}}(a, b) = \inf_{x\in S\cap[a,b]} U(x, 0), \qquad \bar{g}_{\mathrm{cr}}(a, b) = \sup_{x\in S\cap[a,b]} U(x, 0). \tag{22}$$

It may happen that $\bar{g}_{\mathrm{cr}} = +\infty$, and it is obvious, in view of Propositions 2.5 and A.1, that $\underline{g}_{\mathrm{cr}} \ge 0$ for any potential satisfying Conditions C1 and C2*. For every g ∈ R, define

$$\Lambda_g = \{x \in \mathbb{R} : U(x, 0) < g\}. \tag{23}$$

Since U(x, 0) is upper-semicontinuous, see e.g. [2], $\Lambda_g$ is an open set. If $g \le \underline{g}_{\mathrm{cr}}$ then $\Lambda_g = \emptyset$, otherwise $\Lambda_g$ consists of (possibly infinitely many) disjoint open intervals:

$$\Lambda_g = \bigcup_j (a_j, b_j). \tag{24}$$

We note also that U(x, y) = U(x, −y) and that

$$\frac{\partial}{\partial y} U(x, y) > 0 \quad \text{for any } x \in \mathbb{R},\ y > 0. \tag{25}$$

Proposition 3.1. Suppose that $g > \underline{g}_{\mathrm{cr}}$ and let $(a_j, b_j)$ be the intervals defined in (24). Then the level set $L_g = \{(x, y) : U(x, y) = g,\ y > 0\}$ consists of disjoint analytic arcs

$$y = y_j(x), \qquad a_j < x < b_j, \tag{26}$$

whose end-points lie on the real axis, i.e. $y_j(a_j + 0) = y_j(b_j - 0) = 0$, if $-\infty < a_j,\ b_j < +\infty$.

Proof. If $x_0 \notin \Lambda_g$ then $U(x_0, y) > g$ for all y ≠ 0. Therefore the equation $U(x_0, y) = g$ cannot be solved for y > 0. Consider now any of the intervals $(a_j, b_j)$. If $x_0 \in (a_j, b_j)$ then $U(x_0, 0) < g$, and in view of (25) and U(x, +∞) = +∞, there exists a unique positive solution $y_0 \overset{\mathrm{def}}{=} y_j(x_0) > 0$ of the equation $U(x_0, y) = g$. As U(x, y) can be analytically continued into a neighborhood of $(x_0, y_0)$ in $\mathbb{C}^2$, the implicit function theorem asserts that $y_j(x)$ is analytic in a disk $|x - x_0| < \delta$ in the complex x-plane. The union of all such disks, when $x_0$ runs through $(a_j, b_j)$, covers $(a_j, b_j)$. Therefore the function $y_j(x)$ can be analytically continued into a domain in the complex x-plane that contains $(a_j, b_j)$, and, for any closed interval $[\alpha, \beta] \subset (a_j, b_j)$, this domain contains

$$D_{\alpha,\beta} = \{x \in \mathbb{C} : \alpha - h \le \operatorname{Re} x \le \beta + h,\ |\operatorname{Im} x| \le h\} \tag{27}$$

for some h > 0. If $a_j > -\infty$ then $y_j(a_j + 0) = 0$. For, if not, then $\bar{y} := \limsup_{x\to a_j,\, x>a_j} y_j(x) > 0$. But then $U(a_j, \bar{y}) = g$ and hence U(x, 0) < g for every x from some neighborhood of $a_j$, which contradicts the definition of $a_j$ as the end point of our interval. The same argument proves that if $b_j < +\infty$ then $y_j(b_j - 0) = 0$.
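The uniqueness argument in the proof (monotonicity of U in y) suggests a simple numerical way of tracing a level arc: for fixed x, solve U(x, y) = g by bisection in y. The following minimal sketch does this under the assumption q = 0, which is our illustrative choice and not part of the text. In that case the level curve is known in closed form: it is the ellipse $(x/2\cosh g)^2 + (y/2\sinh g)^2 = 1$. The finite-n potential $U_n$ with n = 400 is used as a stand-in for U, and `level_curve_y` is a hypothetical helper name.

```python
import math


def U_n(E, x, y):
    """U_n(x, y) = (1/n) * sum_j ln|x + iy - E_j|."""
    return sum(0.5 * math.log((x - e) ** 2 + y * y) for e in E) / len(E)


def level_curve_y(E, x, level, y_lo=1e-9, y_hi=50.0, iters=100):
    """Solve U_n(x, y) = level for y > 0 by bisection, using that U_n is increasing in y."""
    for _ in range(iters):
        y_mid = 0.5 * (y_lo + y_hi)
        if U_n(E, x, y_mid) < level:
            y_lo = y_mid
        else:
            y_hi = y_mid
    return 0.5 * (y_lo + y_hi)


n, g = 400, 0.5
# Eigenvalues of H_n^0 for q = 0 with periodic boundary conditions (assumed example).
E = [-2.0 * math.cos(2.0 * math.pi * m / n) for m in range(n)]

errors = []
for x in (-1.5, 0.0, 0.7, 2.1):  # all inside (-2 cosh g, 2 cosh g)
    y_num = level_curve_y(E, x, g)
    y_ellipse = 2.0 * math.sinh(g) * math.sqrt(1.0 - (x / (2.0 * math.cosh(g))) ** 2)
    errors.append(abs(y_num - y_ellipse))
print(max(errors))
```

Bisection is chosen here only because (25) makes it robust; near the end-points $a_j$, $b_j$, where the arc meets the real axis, a finer lower bracket would be needed.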

3.2. Statement of results. We are now in a position to formulate our main results.

Theorem 3.2. For any g > 0 all the eigenvalues of $H_n^g$ belong to the level lines of the function $U_n(x, y)$ defined by the equation

$$U_n(x, y) = g + \frac{2}{n}\ln(1 - e^{-ng}). \tag{28}$$

Theorem 3.3. (i) Suppose that $g \le \underline{g}_{\mathrm{cr}}(a, b)$. Then for any ε > 0 there exists $n_0 = n_0(\varepsilon, g, q, a, b)$ such that for any $n > n_0$ all the eigenvalues $z_j$ of $H_n^g$ with Re $z_j$ ∈ [a, b] belong to the ε-neighborhood of the real axis: $|\operatorname{Im} z_j| \le \varepsilon$.

(ii) Suppose that $g > \underline{g}_{\mathrm{cr}}$ and $(a_j, b_j)$ is one of the intervals comprising $\Lambda_g$. Then for any $[\alpha, \beta] \subset (a_j, b_j)$ there exists $n_1 = n_1(q, g, \alpha, \beta)$ such that for any $n > n_1$ there exists a solution $y_{j,n}(x)$ to Eq. (28) which is analytic in the domain $D_{\alpha,\beta}$ defined in (27) and

$$\lim_{n\to\infty} y_{j,n}(x) = y_j(x), \quad \text{uniformly in } x \in D_{\alpha,\beta}. \tag{29}$$

The function $y_{j,n}(x)$, for $n > n_1$, is the only solution of (28) which is non-negative when x ∈ [α, β].

Remarks. 1. The previous two theorems imply that if $H_n^g$ has eigenvalues in the strip $a_j < \alpha \le \operatorname{Re} z \le \beta < b_j$, then, for $n > n_1$, they must lie on the analytic arc $A_n(\alpha, \beta) = \{(x, y) : y = y_{j,n}(x),\ \alpha \le x \le \beta\}$ and on its reflection with respect to the real axis.

2. Relation (29) implies that the arcs $A_n(\alpha, \beta)$ converge to the level lines of U(x, y) when n → ∞ together with all their derivatives.

The next two theorems describe the asymptotic distribution of the eigenvalues of $H_n^g$ along the arc $A_n(\alpha, \beta)$. In particular they prove that $H_n^g$, for large n, has eigenvalues on $A_n(\alpha, \beta)$. By $\nu_n(\alpha, \beta)$ we denote the number of complex eigenvalues of $H_n^g$ lying on $A_n(\alpha, \beta)$.

Theorem 3.4. For any closed interval $[\alpha, \beta] \subset (a_j, b_j)$,

$$\lim_{n\to\infty} \frac{1}{n}\,\nu_n(\alpha, \beta) = \frac{1}{2\pi}[\theta(\beta) - \theta(\alpha)],$$

where $\theta(x) = -V(x, y_j(x))$ and V(x, y) is the imaginary part of F(z).

Remark. Let l be the natural parameter on the curve $y = y_j(x)$, that is the length of the part of this curve contained between, say, $(\alpha, y_j(\alpha))$ and $(x, y_j(x))$. A simple calculation involving the Cauchy-Riemann equations for F(z) shows that dθ = |f(z(l))| dl, where

$$f(z) = F'(z) = \int \frac{dN(E)}{z - E}. \tag{30}$$

Hence

$$\lim_{n\to\infty} \frac{1}{n}\,\nu_n(\alpha, \beta) = \frac{1}{2\pi}\int_\alpha^\beta \theta'(x)\, dx = \frac{1}{2\pi}\int_{\alpha + iy_j(\alpha)}^{\beta + iy_j(\beta)} |f(z(l))|\, dl, \tag{31}$$

where the integration is carried out along the path $y = y_j(x)$ from $\alpha + iy_j(\alpha)$ to $\beta + iy_j(\beta)$.

Theorems 3.3 and 3.4 are not entirely new and can be inferred from Theorems 2.1 and 2.2 in [9]. We are now going to formulate our principal new result. Let [α, β] be the same as before. Let us label the eigenvalues $z_k = x_k + iy_k$ of $H_n^g$ lying on the arc $A_n(\alpha, \beta)$ so that $\alpha \le x_1 \le x_2 \le \ldots \le x_m \le \beta$ (we note that in fact the multiplicity of these eigenvalues is one and the inequalities here are strict; this follows from the inequality $\theta_n'(x) \ge C_0 > 0$, which is a part of the proof of Theorem 3.4).

Theorem 3.5. For any two consecutive eigenvalues $z_k$ and $z_{k+1}$ of $H_n^g$ lying on $A_n(\alpha, \beta)$,

$$n(z_{k+1} - z_k) = \frac{2\pi i}{f(z_k)} + \delta_n(z_k, z_{k+1}), \tag{32}$$

where

$$\lim_{n\to\infty} \delta_n(z_k, z_{k+1}) = 0 \quad \text{uniformly in } z_k, z_{k+1} \in A_n(\alpha, \beta). \tag{33}$$

4. Proofs

The eigenvalues and eigenfunctions of $H_n^g$ are determined by the equation

$$-e^{g}\varphi_{k+1} + q_k\varphi_k - e^{-g}\varphi_{k-1} = z\varphi_k, \quad 1 \le k \le n, \tag{34}$$

where

$$\varphi_0 = \varphi_n, \qquad \varphi_1 = \varphi_{n+1}. \tag{35}$$

The parameter g can be eliminated from (34) by making use of the standard substitution $\varphi_k = e^{-kg}\psi_k$, which transforms (34) into

$$-\psi_{k+1} + q_k\psi_k - \psi_{k-1} = z\psi_k, \quad 1 \le k \le n, \tag{36}$$

and boundary conditions (35) into

$$\psi_0 = e^{-ng}\psi_n, \qquad \psi_1 = e^{-ng}\psi_{n+1}. \tag{37}$$

Note that the transformed boundary conditions are asymmetric (unless g = 0). To solve Eq. (36) we shall follow the standard routine and rewrite it in the matrix form:

$$\begin{pmatrix}\psi_{k+1}\\ \psi_k\end{pmatrix} = A_k\begin{pmatrix}\psi_k\\ \psi_{k-1}\end{pmatrix}, \quad 1 \le k \le n, \quad \text{where } A_k = \begin{pmatrix} q_k - z & -1\\ 1 & 0\end{pmatrix}.$$

Then

$$\begin{pmatrix}\psi_{n+1}\\ \psi_n\end{pmatrix} = S_n(z)\begin{pmatrix}\psi_1\\ \psi_0\end{pmatrix}, \quad \text{where } S_n(z) = A_nA_{n-1}\cdots A_1.$$

On the other hand,

$$\begin{pmatrix}\psi_{n+1}\\ \psi_n\end{pmatrix} = e^{ng}\begin{pmatrix}\psi_1\\ \psi_0\end{pmatrix}$$

because of boundary conditions (37). Therefore the eigenvalues of $H_n^g$ are determined by the equation

$$\det[S_n(z) - e^{ng}I] = 0. \tag{38}$$

Since $\det S_n(z) = 1$ for all z, we have that $\det[S_n(z) - e^{ng}I] = 1 - e^{ng}\operatorname{tr} S_n(z) + e^{2ng}$. Hence:

Lemma 4.1. z is an eigenvalue of $H_n^g$ iff $\operatorname{tr} S_n(z) = e^{ng} + e^{-ng}$.

The trace of the matrix $S_n(z)$ is a polynomial in z of degree n. The following representation of this polynomial, which is well known in the context of the discrete Hill equation (see e.g. [13] or [10]), is useful for our purposes.

Lemma 4.2. Let $E_j$, j = 1, . . . , n, be the eigenvalues of $H_n^0$. Then

$$\operatorname{tr} S_n(z) = \prod_{j=1}^{n} (E_j - z) + 2. \tag{39}$$

Proof. For g = 0, Lemma 4.1 asserts that the polynomials $\prod_{j=1}^{n}(E_j - z)$ and $\operatorname{tr} S_n(z) - 2$ have the same set of zeros. It is easy to verify that both polynomials have the same coefficient, $(-1)^n$, in front of the highest power of z, hence they must coincide.
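Both lemmas are easy to test numerically. The following minimal sketch multiplies the transfer matrices $A_k$ directly and checks Lemmas 4.1 and 4.2 under the assumption q = 0, an illustrative choice for which the spectra are explicit: the eigenvalues of $H_n^0$ are $E_m = -2\cos(2\pi m/n)$, and those of $H_n^g$ are $z_m = -(\lambda_m + \lambda_m^{-1})$ with $\lambda_m = e^{g + 2\pi i m/n}$ (plane-wave solutions of (34) – (35)). The helper name `transfer_trace` and the test point `z_test` are hypothetical.

```python
import cmath
import math


def transfer_trace(q, z):
    """Trace of S_n(z) = A_n ... A_1 with A_k = [[q_k - z, -1], [1, 0]]."""
    s11, s12, s21, s22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j  # start from the identity
    for qk in q:
        a = qk - z
        # Left-multiply the running product by A_k.
        s11, s12, s21, s22 = a * s11 - s21, a * s12 - s22, s11, s12
    return s11 + s22


n, g = 12, 0.3
q = [0.0] * n  # free case, assumed only because its spectrum is known explicitly

# Lemma 4.1: tr S_n(z_m) = e^{ng} + e^{-ng} at every eigenvalue z_m of H_n^g.
target = 2.0 * math.cosh(n * g)
residuals = []
for m in range(n):
    lam = cmath.exp(g + 1j * (2.0 * math.pi * m / n))
    z = -(lam + 1.0 / lam)
    residuals.append(abs(transfer_trace(q, z) - target) / target)

# Lemma 4.2: tr S_n(z) = prod_m (E_m - z) + 2 at an arbitrary test point.
z_test = 0.7 + 0.4j
prod = 1.0 + 0j
for m in range(n):
    prod *= (-2.0 * math.cos(2.0 * math.pi * m / n)) - z_test
lemma42_gap = abs(transfer_trace(q, z_test) - (prod + 2.0))
print(max(residuals), lemma42_gap)
```

For a general potential the same `transfer_trace` routine applies unchanged, but the eigenvalues would then have to be supplied by a numerical eigenvalue solver.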

Here is our main technical lemma:

Lemma 4.3. Suppose that g > 0. Then z is an eigenvalue of $H_n^g$ iff

$$F_n(z) = \frac{2}{n}\ln\!\left(e^{\frac{ng}{2}} - e^{-\frac{ng}{2}}\right) + \frac{i\pi}{n} \mod \frac{2\pi i}{n}, \tag{40}$$

where $F_n(z)$ is the function defined in (8).

Proof. It follows from Lemmas 4.1 and 4.2 that z is an eigenvalue of $H_n^g$ iff

$$\prod_{j=1}^{n} (E_j - z) = -\left(e^{\frac{ng}{2}} - e^{-\frac{ng}{2}}\right)^2. \tag{41}$$

Since $F_n(z) = \frac{1}{n}\sum_{j=1}^{n} \ln(E_j - z)$, Eq. (41) is equivalent to (40), provided g ≠ 0.

We are now in a position to prove Theorems 3.2 and 3.3.

Proof of Theorem 3.2. This theorem is a straightforward corollary of Lemma 4.3.
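As an illustration of Theorem 3.2, the level-line equation (28) can be verified directly in a case where everything is explicit. The sketch below again assumes q = 0 (our choice, not the paper's), so that the eigenvalues of $H_n^0$ are $E_m = -2\cos(2\pi m/n)$ and those of $H_n^g$ are $z_m = -(\lambda_m + \lambda_m^{-1})$ with $\lambda_m = e^{g + 2\pi i m/n}$. It only tests the real part of the eigenvalue condition, i.e. the value of $U_n$ at each eigenvalue.

```python
import cmath
import math


def U_n(E, z):
    """U_n at z = x + iy: (1/n) * sum_j ln|z - E_j|."""
    return sum(math.log(abs(z - e)) for e in E) / len(E)


n, g = 40, 0.25
E = [-2.0 * math.cos(2.0 * math.pi * m / n) for m in range(n)]  # spectrum of H_n^0, q = 0

# Right-hand side of Eq. (28).
level = g + (2.0 / n) * math.log(1.0 - math.exp(-n * g))

deviations = []
for m in range(n):
    lam = cmath.exp(g + 1j * (2.0 * math.pi * m / n))
    z = -(lam + 1.0 / lam)  # eigenvalue of H_n^g in the free case
    deviations.append(abs(U_n(E, z) - level))
print(max(deviations))
```

The correction term $(2/n)\ln(1 - e^{-ng})$ is exponentially small in n, which is why the eigenvalue curves are so close to the level lines of $U_n$ at height g.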

Proof of Theorem 3.3. (i) Let $g(\varepsilon) = \min_{x\in[a,b]} U(x, \varepsilon)$. According to Proposition 2.5 one can find $n_0$ such that for all x ∈ [a, b],

$$|U_n(x, \varepsilon) - U(x, \varepsilon)| \le \frac{1}{2}\,[g(\varepsilon) - \underline{g}_{\mathrm{cr}}(a, b)]$$

if $n > n_0$. (Note that $g(\varepsilon) > \underline{g}_{\mathrm{cr}}(a, b)$ because of (25).) Thus, for all a ≤ x ≤ b and y ≥ ε,

$$U_n(x, y) \ge U_n(x, \varepsilon) \ge U(x, \varepsilon) - \frac{1}{2}\,[g(\varepsilon) - \underline{g}_{\mathrm{cr}}(a, b)] \ge \frac{1}{2}\,U(x, \varepsilon) + \frac{1}{2}\,\underline{g}_{\mathrm{cr}}(a, b) \ge \underline{g}_{\mathrm{cr}}(a, b).$$

Recall that by the assumption $\underline{g}_{\mathrm{cr}}(a, b) \ge g$. Since $g > g + \frac{2}{n}\ln(1 - e^{-ng})$ for any n > 0, we conclude that, for all $n > n_0$, Eq. (28) has no solutions in the half-strip a ≤ x ≤ b, y ≥ ε. To complete the proof remember that $U_n(x, -y) = U_n(x, y)$.

(ii) First, note that for every real x Eq. (28) has one non-negative solution at most. Now, let $g > \underline{g}_{\mathrm{cr}}$ and $[\alpha, \beta] \subset (a_j, b_j)$, where $(a_j, b_j)$ is one of the intervals comprising $\Lambda_g$. Recall that $y_j(x)$ is analytic in $D_{\alpha,\beta}$, see Proposition 3.1. Because of the compactness of [α, β], it will suffice to prove the existence of the solution $y_{j,n}(x)$ to Eq. (28) and its convergence to $y_j(x)$ as n → ∞ in a small neighborhood of every point $(x, y_j(x))$, where x runs through [α, β].

Fix $\tilde{x} \in [\alpha, \beta]$ and consider the point $(\tilde{x}, \tilde{y})$, where $\tilde{y} = y_j(\tilde{x})$. It follows from the integral representations for $U_n(x, y)$ and U(x, y) that these two functions are analytic in the domain

$$\tilde{D} \overset{\mathrm{def}}{=} \left\{(x, y) : |x - \tilde{x}| < \frac{\tilde{y}}{2},\ |y - \tilde{y}| < \frac{\tilde{y}}{2}\right\}.$$

We shall use the following general lemma. Put $D_r \overset{\mathrm{def}}{=} \{(x, y) : |x - \tilde{x}| < r,\ |y - \tilde{y}| < r\}$.

Lemma 4.4. Let $\Phi(x, y)$ and $\tilde{\Phi}(x, y)$ be two functions analytic in $D_r$ and such that for all $(x, y) \in D_r$,

$$|\Phi_x'(x, y)| \le c_1, \qquad 0 < c_2 \le |\Phi_y'(x, y)| \le c_3, \qquad |\tilde{\Phi}(x, y)| \le 1.$$

Suppose that $\Phi(\tilde{x}, \tilde{y}) = 0$. Then there is a positive $\varepsilon_0$ which depends only on $c_1$, $c_2$, $c_3$, and r (but not on $\Phi(\cdot, \cdot)$, $\tilde{\Phi}(\cdot, \cdot)$!) such that the equation

$$\Phi(x, y) + \varepsilon\,\tilde{\Phi}(x, y) = 0 \tag{42}$$

has a unique solution y = y(x, ε) which is analytic in (x, ε) in the domain $\{(x, \varepsilon) : |x - \tilde{x}| < 2\varepsilon_0,\ |\varepsilon| < 2\varepsilon_0\}$ and $y(\tilde{x}, 0) = \tilde{y}$.

Proof of Lemma 4.4. Consider the function $G(x, \varepsilon, y) \overset{\mathrm{def}}{=} \Phi(x, y) + \varepsilon\,\tilde{\Phi}(x, y)$ of three complex variables x, ε, and y. In the domain $D_{r/2}$ we have:

$$|\tilde{\Phi}_x'(x, y)| = \left|(2\pi)^{-1}\oint_{|u|=r} \tilde{\Phi}(u, y)(x - u)^{-2}\, du\right| \le \frac{2}{r},$$

and similarly $|\tilde{\Phi}_y'| \le \frac{2}{r}$. Hence, for ε sufficiently small, $G_x'$ and $G_y'$ are close to $\Phi_x'$ and $\Phi_y'$ correspondingly. It is clear that $|G_\varepsilon'| \le 1$. The implicit function theorem for an analytic function of three variables x, ε, and y implies now the existence of the solution y = y(x, ε) to the equation G(x, ε, y) = 0. It should be emphasized that the domain where this solution exists and is analytic depends only on the corresponding estimates of $G_x'$, $G_y'$, and $G_\varepsilon'$.

To finish our proof of Theorem 3.3, note that in our case U(x, y) plays the role of $\Phi(x, y)$ and Eq. (42) has the form

$$U(x, y) + \varepsilon\,\frac{U_n(x, y) - U(x, y)}{\varepsilon_0} = 0.$$

Here we first choose $\varepsilon_0$ so as to satisfy the conditions of Lemma 4.4, and then choose $n_0$ such that $|U_n(x, y) - U(x, y)| \le \varepsilon_0$ for all $(x, y) \in \tilde{D}$ and $n > n_0$. The wanted result follows from our lemma when $\varepsilon = \varepsilon_0$.

Define $\theta_n(x) = -V_n(x, y_{j,n}(x))$ and $\theta(x) = -V(x, y_j(x))$ for $x \in [\alpha, \beta] \subset (a_j, b_j)$ and $n > n_1$ with $n_1$ as in part (ii) of Theorem 3.3. As before, $V_n(x, y)$ and V(x, y) are the imaginary parts of the analytic functions $F_n(z)$ and F(z), see (7) and (8). In view of Theorem 3.3 and Proposition 3.1, we have that

$$\lim_{n\to\infty} \theta_n(x) = \theta(x) \quad \text{uniformly in } x \in [\alpha, \beta]. \tag{43}$$

It follows from the Cauchy-Riemann equations for $F_n(z)$ and F(z) that

$$\theta_n'(x) = \left.\frac{|\nabla U_n(x, y)|^2}{\frac{\partial}{\partial y}U_n(x, y)}\right|_{y = y_{j,n}(x)} \quad \text{and} \quad \theta'(x) = \left.\frac{|\nabla U(x, y)|^2}{\frac{\partial}{\partial y}U(x, y)}\right|_{y = y_j(x)}. \tag{44}$$

Therefore we also have that

$$\lim_{n\to\infty} \theta_n'(x) = \theta'(x) \quad \text{uniformly in } x \in [\alpha, \beta]. \tag{45}$$

As $\frac{\partial}{\partial y}U_n(x, y)$ and $\frac{\partial}{\partial y}U(x, y)$ are positive in the upper half of the xy-plane, the functions $\theta_n(x)$ and $\theta(x)$ are monotone increasing. Moreover, in view of (45), it is apparent that there is a constant $C_0 > 0$ such that

$$\theta_n'(x) \ge C_0 > 0 \quad \text{for every } x \in [\alpha, \beta] \text{ and } n > n_1. \tag{46}$$

Now we are in a position to prove Theorems 3.4 and 3.5.

Proof of Theorem 3.4. On the arc $A_n(\alpha, \beta)$, i.e. when $z = x + iy_{j,n}(x)$, α ≤ x ≤ β, the eigenvalue equation (40) reduces to

$$e^{in\theta_n(x)} = -1. \tag{47}$$

When x runs through [α, β] in the positive direction, $\theta_n(x)$ gets a positive increment and $w = e^{in\theta_n(x)}$ moves anticlockwise along the unit circle |w| = 1. Obviously, $\nu_n(\alpha, \beta)$, the number of eigenvalues of $H_n^g$ on $A_n(\alpha, \beta)$, is, up to ±1, equal to the number of circuits completed by w when x completes its run. Thus

$$\nu_n(\alpha, \beta) = \frac{n[\theta_n(\beta) - \theta_n(\alpha)]}{2\pi} + \kappa,$$

where |κ| ≤ 1. When n → ∞, $\theta_n(x)$ converges to $\theta(x) = -V(x, y_j(x))$, and, therefore, $\frac{1}{n}\nu_n(\alpha, \beta)$ converges to $\frac{1}{2\pi}[\theta(\beta) - \theta(\alpha)]$.

Proof of Theorem 3.5. Let $z_l = x_l + iy_{j,n}(x_l)$ and $z_{l+1} = x_{l+1} + iy_{j,n}(x_{l+1})$ be two consecutive eigenvalues of $H_n^g$ on $A_n(\alpha, \beta)$. We assume that $x_{l+1} > x_l$. It follows from Eq. (47) that

$$\theta_n(x_{l+1}) - \theta_n(x_l) = \frac{2\pi}{n},$$

and therefore

$$x_{l+1} - x_l = \frac{2\pi}{n}\,\frac{1}{\theta_n'(x^*)} \tag{48}$$

for some $x^* \in (x_l, x_{l+1})$. In view of (46),

$$0 < x_{l+1} - x_l \le \frac{2\pi}{C_0}\,\frac{1}{n}$$

for all n large. Hence $x^* \to x_l$ as n → ∞, and (48) and (45) imply that

$$n(x_{l+1} - x_l) = \frac{2\pi}{\theta'(x_l)} + \delta_n(x_l, x_{l+1}), \tag{49}$$

where

$$\lim_{n\to\infty} \delta_n(x_l, x_{l+1}) = 0 \quad \text{uniformly in } x_l, x_{l+1} \in [\alpha, \beta]. \tag{50}$$

To prove (32) – (33), note that

$$z_{l+1} - z_l = x_{l+1} - x_l + iy_{j,n}'(x^{**})(x_{l+1} - x_l)$$

for some $x^{**} \in (x_l, x_{l+1})$. By making use of (49), one obtains that

$$n(z_{l+1} - z_l) = \frac{2\pi}{\theta'(x_l)}\,[1 + iy_{j,n}'(x^{**})] + \delta_n(x_l, x_{l+1}).$$

Now (32) – (33) easily follow from Theorem 3.3 and the following relation:

$$\frac{1 + iy_j'(x)}{\theta'(x)} = \left.\frac{1}{iF'(z)}\right|_{z = x + iy_j(x)}. \tag{51}$$

To verify this relation, make use of the equation $\int \log|x + iy_j(x) - E|\, dN(E) = g$ to obtain

$$y_j'(x) = -\left.\frac{U_x'(x, y)}{U_y'(x, y)}\right|_{y = y_j(x)}.$$

Now, in view of (44),

$$\frac{1 + iy_j'(x)}{\theta'(x)} = \left.\frac{1}{U_y'(x, y) + iU_x'(x, y)}\right|_{y = y_j(x)},$$

and (51) follows by the Cauchy-Riemann equations for F(z).
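The regular spacing asserted in Theorem 3.5 can be observed numerically. The sketch below is a rough illustration under the assumption q = 0 (our choice), where the eigenvalues $z_m = -(\lambda_m + \lambda_m^{-1})$, $\lambda_m = e^{g + 2\pi i m/n}$, lie on an ellipse. It checks only the modulus of the spacing relation, namely that $n|z_{m+1} - z_m|\,|f(z_m)|/2\pi$ is close to 1, and it replaces f by its finite-n counterpart $f_n(z) = \frac{1}{n}\sum_j (z - E_j)^{-1}$. The names `eigenvalue`, `f_n` and `ratios` are hypothetical.

```python
import cmath
import math

n, g = 400, 0.5
E = [-2.0 * math.cos(2.0 * math.pi * m / n) for m in range(n)]  # spectrum of H_n^0, q = 0


def eigenvalue(m):
    """m-th eigenvalue of H_n^g in the free case, ordered along the ellipse."""
    lam = cmath.exp(g + 1j * (2.0 * math.pi * m / n))
    return -(lam + 1.0 / lam)


def f_n(z):
    """Finite-n analogue of f(z) = F'(z): (1/n) * sum_j 1 / (z - E_j)."""
    return sum(1.0 / (z - e) for e in E) / n


ratios = []
for m in range(n):
    z0, z1 = eigenvalue(m), eigenvalue(m + 1)
    ratios.append(n * abs(z1 - z0) * abs(f_n(z0)) / (2.0 * math.pi))
print(min(ratios), max(ratios))
```

The ratios approach 1 at rate O(1/n), in line with the o(1) error term in (32); the sketch deliberately ignores the phase of the spacing and the two real eigenvalues at the tips of the ellipse are included only for simplicity.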

A. Appendix

Proposition A.1. For all real x and y ≠ 0 we have

$$\liminf_{n\to\infty} U_n(x, y) \ge 0. \tag{52}$$

Proof. This result can be proved in many ways. We present here a proof based on (39). It follows from (39) that

$$\frac{1}{n}\ln\operatorname{tr} S_n(z) - F_n(z) = \frac{1}{n}\ln\left[1 + \frac{2}{\prod_{j=1}^{n}(E_j - z)}\right]. \tag{53}$$

Therefore for every $z \in \mathbb{C}_+$ such that Im z > 1,

$$\lim_{n\to\infty}\left[\frac{1}{n}\ln\operatorname{tr} S_n(z) - F_n(z)\right] = 0. \tag{54}$$

The two functions in the LHS in (53) are analytic and uniformly bounded in n on compact sets in $\mathbb{C}_+$. Therefore, by the Vitali theorem, (54) must hold for every $z \in \mathbb{C}_+$. Consider now the eigenvalue equation for $S_n(z)$. If $z \notin \mathbb{R}$ then $S_n(z)$ has no eigenvalues on the unit circle. As $\det S_n(z) = 1$, we then have that for every non-real z the 2 × 2 matrix $S_n(z)$ has one eigenvalue, $\lambda_n(z)$, in the exterior of the unit circle, i.e. $|\lambda_n(z)| > 1$, and the other one, $1/\lambda_n(z)$, in the interior of the unit circle. Thus $\operatorname{tr} S_n(z) = \lambda_n(z) + \lambda_n^{-1}(z)$ and

$$\frac{1}{n}\ln\lambda_n(z) - \frac{1}{n}\ln\operatorname{tr} S_n(z) = -\frac{1}{n}\ln[1 + \lambda_n^{-2}(z)]. \tag{55}$$

It follows from (39) that $|\operatorname{tr} S_n(z)|$ grows exponentially fast with n provided |Im z| > 1, and then so does the dominant eigenvalue of $S_n(z)$. This is because $|\operatorname{tr} S_n(z)| \le |\lambda_n(z)| + 1$. (In fact, $\lambda_n(z)$ grows exponentially fast with n for every non-real z.) Hence, for every $z \in \mathbb{C}_+$ such that Im z > 1,

$$\lim_{n\to\infty}\left[\frac{1}{n}\ln\lambda_n(z) - \frac{1}{n}\ln\operatorname{tr} S_n(z)\right] = 0. \tag{56}$$

The two functions in the LHS in (55) are analytic and uniformly bounded in n on compact sets in $\mathbb{C}_+$. Therefore, by the Vitali theorem, (56) holds for every $z \in \mathbb{C}_+$. From (54) and (56) we have that

$$\lim_{n\to\infty}\left[\frac{1}{n}\ln\lambda_n(z) - F_n(z)\right] = 0$$

for every $z \in \mathbb{C}_+$. Taking the real part,

$$\lim_{n\to\infty}\left[\frac{1}{n}\ln|\lambda_n(x + iy)| - U_n(x, y)\right] = 0,$$

and therefore (recall that $|\lambda_n(z)| \ge 1$)

$$\liminf_{n\to\infty} U_n(x, y) = \liminf_{n\to\infty} \frac{1}{n}\ln|\lambda_n(x + iy)| \ge 0.$$

B. Appendix Obviously the integrated density of states, N (E), depends on the potential q. To make this dependence explicit, we shall write in this section Nn (E; q) and N (E; q) instead of Nn (E) and N(E). Let k1 , k2 , . . . be an increasing (infinite) sequence of natural numbers such that #{j : kj ≤ n} −→ 0 n

as n → ∞,

(57)

and let v = {vk }∞ k=1 be a potential supported by the sequence kj , i.e. vk = 0 unless k ∈ {kj }. Proposition B.1. If q is a selfaveraging potential then so is q˜ = q + v, and N (E; q) ˜ = N(E, q). Proof. According to the well known theorem from linear algebra, if A and B are two selfadjoint n × n matrices then the number of eigenvalues of the matrix A + B in interval differs from that of the matrix A by rank(A − B) at most. Hence |Nn (E; q) ˜ − Nn (E; q)| ≤

#{j : kj ≤ n} , n

which proves the proposition.

Let us now assume that the potential v satisfies Condition C2 and |vkj | → ∞ as j → ∞. Define 1 sn = ln(1 + |vkj |), n j : kj ≤n

and U˜ n (x, y) =

+∞

−∞

ln |x + iy − E|dN(E; q), ˜

y > 0.

Complex Eigenvalues in the 1D Non-Hermitian Anderson Model

523

Theorem B.2. Let $q$ be a selfaveraging potential with $\sup_k |q_k| = M$, and $\tilde q = q + v$. Then

\[ \lim_{n \to \infty} \left[ \tilde U_n(x, y) - s_n \right] = U(x, y). \]

Proof. We prove this theorem under the following additional condition on $v$: $|v_{k_{j+1}}| > |v_{k_j}| + 2 + M$ for all $j$. The proof for the general case requires minor but cumbersome modifications. Let $z = x + iy$. By definition,

\[ \tilde U_n(x, y) = \frac{1}{n} \sum_{k=1}^{n} \ln|z - \tilde E_k| = \frac{1}{n} \sum_{|\tilde E_k| \le 2+M} \ln|z - \tilde E_k| + \frac{1}{n} \sum_{|\tilde E_k| > 2+M} \ln|z - \tilde E_k|, \]

where the $\tilde E_k$ are the eigenvalues of $H_n^0$ with the potential $\tilde q$. By Proposition B.1 the first sum converges to $U(x, y)$ when $n \to \infty$. We note next that the eigenvalues $\tilde E_k$ with $|\tilde E_k| > 2 + M$ have the following property: for all but maybe a finite number of them there is a unique $j$ such that $|\tilde E_k - v_{k_j}| \le 2 + M$. This is due to the fact that the eigenvalues of the operator of multiplication by $v$ differ from the eigenvalues of $H_n^0$ with the potential $\tilde q$ by $2 + M$ at most. Hence, taking into account that

\[ \ln \frac{|z - \tilde E_k|}{1 + |v_{k_j}|} \to 0 \]

when $|\tilde E_k| \to \infty$, we obtain that

\[ \lim_{n \to \infty} \left[ \frac{1}{n} \sum_{|\tilde E_k| > 2+M} \ln|z - \tilde E_k| - \frac{1}{n} \sum_{k_j \le n,\ |v_{k_j}| > 2+M} \ln(1 + |v_{k_j}|) \right] = 0. \]

It is apparent that

\[ \lim_{n \to \infty} \left[ \frac{1}{n} \sum_{k_j \le n,\ |v_{k_j}| > 2+M} \ln(1 + |v_{k_j}|) - s_n \right] = 0, \]

which completes the proof.

It is easy now to construct examples showing that the statement of Proposition 2.3 cannot be improved.

Example 1. Let $k_j = j^2$ and $v_{j^2} = e^{j}$, $j = 1, 2, \ldots$. Then $\lim_{n\to\infty} s_n = \frac12$. Hence $\lim_{n\to\infty} \tilde U_n(x, y) = U(x, y) + \frac12$.

Example 2. Let $k_j = 2^j$ and $v_{2^j} = e^{2^j}$, $j = 1, 2, \ldots$. Then $\limsup_{n\to\infty} s_n = 2$ and $\liminf_{n\to\infty} s_n = 1$. Hence $\limsup_{n\to\infty} \tilde U_n(x, y) = U(x, y) + 2$ and $\liminf_{n\to\infty} \tilde U_n(x, y) = U(x, y) + 1$. It is easy to check that for every $1 \le c \le 2$ there is a subsequence $\tilde U_{n_j}(x, y)$ converging to $U(x, y) + c$ when $j \to \infty$.
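The limits of $s_n$ quoted in the two examples can be checked directly. The sketch below is an illustration only; the helper names `log1p_exp`, `s_n`, `example_1` and `example_2` are hypothetical. It evaluates $\ln(1 + e^{t})$ in a form that does not overflow for large $t$.

```python
import math


def log1p_exp(t):
    """ln(1 + e^t), evaluated without overflow for large positive t."""
    if t > 0:
        return t + math.log1p(math.exp(-t))
    return math.log1p(math.exp(t))


def s_n(n, site, log_abs_v):
    """s_n = (1/n) * sum over k_j <= n of ln(1 + |v_{k_j}|).

    site(j) returns k_j and log_abs_v(j) returns ln|v_{k_j}|.
    """
    total = 0.0
    j = 1
    while site(j) <= n:
        total += log1p_exp(log_abs_v(j))
        j += 1
    return total / n


def example_1(n):
    """k_j = j^2, v_{j^2} = e^j."""
    return s_n(n, lambda j: j * j, lambda j: float(j))


def example_2(n):
    """k_j = 2^j, v_{2^j} = e^{2^j}."""
    return s_n(n, lambda j: 2 ** j, lambda j: float(2 ** j))


print(example_1(10 ** 6))      # close to 1/2
print(example_2(2 ** 20))      # close to 2 (n just reached a site)
print(example_2(2 ** 21 - 1))  # close to 1 (n just before the next site)
```

In Example 2 the value of $s_n$ jumps to about 2 each time $n$ reaches a site $2^j$ and then decays towards 1 until the next site, which is why every $c \in [1, 2]$ occurs as a subsequential limit.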


References

1. Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
2. Craig, W., Simon, B.: Subharmonicity of the Lyapunov Index. Duke Math. Journ. 50, 551–560 (1983)
3. Davies, E.B.: Spectral theory of pseudo-ergodic operators. Commun. Math. Phys. 216, 687–704 (2001)
4. Efetov, K.B.: Directed quantum chaos. Phys. Rev. Lett. 79, 491–494 (1997)
5. Hatano, N., Nelson, D.R.: Localization transitions in non-Hermitian quantum mechanics. Phys. Rev. Lett. 77, 570–573 (1996)
6. Hatano, N., Nelson, D.R.: Vortex pinning and non-Hermitian quantum mechanics. Phys. Rev. B56, 8651–8673 (1997)
7. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge: Cambridge University Press, 1986
8. Goldsheid, I.Ya., Khoruzhenko, B.A.: Distribution of eigenvalues in non-Hermitian Anderson models. Phys. Rev. Lett. 80, 2897–2900 (1998)
9. Goldsheid, I.Ya., Khoruzhenko, B.A.: Eigenvalue curves of asymmetric tridiagonal random matrices. Electronic J. Probability 5(16), 26 (2000)
10. Last, Y.: On the measure of gaps and spectra for discrete 1D Schrödinger operators. Commun. Math. Phys. 149, 347–360 (1992)
11. Pastur, L.A., Figotin, A.L.: Spectra of random and almost-periodic operators. Berlin, Heidelberg, New York: Springer, 1992
12. Shnerb, N.M., Nelson, D.R.: Non-Hermitian localization and population biology. Phys. Rev. B58, 1383–1403 (1998)
13. Toda, M.: Theory of non-linear lattices. Berlin, Heidelberg, New York: Springer, 1981
14. Trefethen, L.N., Contendini, M., Embree, M.: Spectra, pseudospectra, and localization for random bidiagonal matrices. Commun. Pure Appl. Math. 54, 595–623 (2001)

Communicated by B. Simon

Commun. Math. Phys. 238, 525–543 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0850-4


Flows on Quaternionic-Kähler and Very Special Real Manifolds

Dmitri V. Alekseevsky (1), Vicente Cortés (2), Chandrashekar Devchand (2), Antoine Van Proeyen (3)

1 Department of Mathematics, University of Hull, Cottingham Road, Hull, HU6 7RX, UK
2 Mathematisches Institut, Universität Bonn, Beringstr. 1, 53115 Bonn, Germany
3 Instituut voor Theoretische Fysica, Katholieke Universiteit Leuven, Celestijnenlaan 200D, 30001 Leuven

Received: 12 September 2001 / Accepted: 28 January 2002
Published online: 13 May 2003 – © Springer-Verlag 2003

Abstract: BPS solutions of 5-dimensional supergravity correspond to certain gradient flows on the product $M \times N$ of a quaternionic-Kähler manifold $M$ of negative scalar curvature and a very special real manifold $N$ of dimension $n \ge 0$. Such gradient flows are generated by the "energy function" $f = P^2$, where $P$ is a (bundle-valued) moment map associated to $n + 1$ Killing vector fields on $M$. We calculate the Hessian of $f$ at critical points and derive some properties of its spectrum for general quaternionic-Kähler manifolds. For the homogeneous quaternionic-Kähler manifolds we prove more specific results depending on the structure of the isotropy group. For example, we show that there always exists a Killing vector field vanishing at a point $p \in M$ such that the Hessian of $f$ at $p$ has split signature. This generalizes results obtained recently for the complex hyperbolic plane (universal hypermultiplet) in the context of 5-dimensional supergravity. For symmetric quaternionic-Kähler manifolds we show the existence of non-degenerate local extrema of $f$, for appropriate Killing vector fields. On the other hand, for the non-symmetric homogeneous quaternionic-Kähler manifolds we find degenerate local minima.

Contents

1. Introduction
2. Quaternionic-Kähler Moment Map
3. The Hessian of the Energy at a Critical Point
4. Very Special Real Manifolds
5. The Dressed Moment Map
6. The Hessian of the Energy
7. Conclusions
A. Remarks on the Notation

This work was supported by the priority programme "String Theory" of the Deutsche Forschungsgemeinschaft.


1. Introduction

Theories of 5-dimensional supergravity have recently attracted increased attention in the context of the AdS/CFT correspondence (for a review, see [1]) and for a supersymmetrisation of the Randall–Sundrum (RS) scenario [2, 3]. In both cases one eventually uses a 5-dimensional metric of the form

\[ ds^2 = a(x^5)^2 \, dx^\mu dx^\nu \eta_{\mu\nu} + (dx^5)^2, \tag{1.1} \]

where $\mu, \nu = 0, 1, 2, 3$. We thus have a flat 4-dimensional space with a warp factor $a$ that depends on the fifth coordinate $x^5$. The warp factor is interpreted as the energy scale of the renormalization group flow in the comparison between 5-dimensional AdS theories and 4-dimensional conformal theories. For this application it should therefore run from a low value (infrared: IR) to a high value (ultraviolet: UV). For the RS scenario it should have a maximum at the value of $x^5$ which we want to associate with the position of a domain wall, and should drop off towards zero at both infinities $x^5 = \pm\infty$. We are then considering a scenario with one domain wall, which has been called "smooth". This essentially means that the configuration is a solution of the field equations of a 5-dimensional (matter-coupled) supergravity theory without extra sources (which would be singular insertions of a brane in the bulk theory).

Supersymmetric theories in 5 dimensions with the minimal number of supersymmetries (8 real supercharges) have scalars which occur in vector multiplets and in hypermultiplets. Other multiplets, like tensor multiplets, could be added, but would not change anything below. The kinetic terms of the scalars define a metric on the target manifold $M \times N$ that is a direct product of a quaternionic-Kähler manifold $M$ of dimension $4r$ and negative scalar curvature¹, parametrized by the scalars of the $r$ hypermultiplets, and a very special real manifold $N$ of dimension $n$, parametrized by the scalars of the $n$ vector multiplets [4, 5]. The general actions have been conveniently written down in [6]. We will recall the notion of quaternionic-Kähler manifolds in Sect. 2 and of very special real manifolds in Sect. 4.

For the above-mentioned applications, one has to look for supergravity solutions for which the only non-zero fields are the scalars and the warp factor $a$ in (1.1), and these depend only on $x^5$. The kinematics is then determined by the kinetic terms, encoded in the geometry of the target manifold, and by the scalar potential. In supersymmetric theories with 8 or more real supercharges the potential is determined by the gauging of (infinitesimal) isometries of the manifold. An isometry can be gauged if there is a vector in the theory that can serve as a connection. The theory contains $n + 1$ vectors if $n$ is the dimension of the very special real manifold. Indeed, pure supergravity already contains 1 vector, the "graviphoton", while the other $n$ originate from the vector multiplets. For every isometry there is a moment map, as we shall recall in Sect. 2. When we gauge $n + 1$ of these, the potential depends on the "dressed moment map" (see Sect. 5), which is a linear combination of $n + 1$ moment maps on $M$ with functions on $N$ as coefficients. This is an $sp(1) = su(2)$ triplet $P^\alpha$. The scalars of supersymmetry-preserving (BPS) solutions of the theory have to take values in the submanifold determined by the condition [7]

\[ \frac{\partial}{\partial \phi^x} \left( \frac{P^\alpha}{|P|} \right) = 0, \tag{1.2} \]

where $\phi^x$, with $x = 1, \ldots, n$, are the coordinates of $N$, and $\alpha = 1, 2, 3$. Under this condition², the scalar potential depends only on the "energy function" [11, 7]

\[ f = \tfrac32 W^2 = P^\alpha P^\alpha. \tag{1.3} \]

¹ We consider local supersymmetry. For rigid supersymmetry, the scalar curvature of $M$ would be zero, implying that it is (locally) a hyper-Kähler manifold. Furthermore, we always consider here theories in a 5-dimensional space with Minkowski signature.

The solutions are then determined by corresponding "flow equations" which determine the scalars as functions of $x^5$. These equations are (with prime denoting the derivative with respect to $x^5$)

\[ \frac{a'}{a} = \pm W, \qquad (\phi^x)' = \mp 3 g^{xy} \frac{\partial}{\partial \phi^y} W, \qquad (q^X)' = \mp 3 g^{XY} \frac{\partial}{\partial q^Y} W, \tag{1.4} \]

where $W$ is the positive root in (1.3), and the ± sign can be chosen according to the sign of $a'/a$, but then has to be used consistently in the other equations. The sign can flip when $W$ reaches a zero. Analogous to the coordinates $\phi^x$ for the real manifold, we see here the coordinates $q^X$ of the quaternionic-Kähler manifold (scalars of the hypermultiplets), where $X = 1, \ldots, 4r$ for $r$ the quaternionic dimension of $M$. These equations thus determine a gradient line on the product of the manifolds $N$ and $M$ parametrized by $x^5$, which runs from $-\infty$ to $\infty$. The equations imply that $(\ln a)'' \le 0$. The essential properties of the flow can therefore be seen by analyzing fixed points of the gradient flow and zeros of $W$. The former are the stationary values of the scalars. We look for solutions that have "fixed points" at $x^5 = \pm\infty$. These fixed points can be found from algebraic equations [7]. The behaviour of a solution near a fixed point $p$ is determined by whether $W$ increases or decreases when we approach $p$ in a certain direction. This can be read off from the $(n + 4r) \times (n + 4r)$ matrix

\[ U_\Sigma{}^{\Lambda} \equiv \frac{3}{W} \, g^{\Lambda\Xi} \partial_\Sigma \partial_\Xi W \Big|_{\partial W = 0} = \frac{3}{2f} \, g^{\Lambda\Xi} \partial_\Sigma \partial_\Xi f \Big|_{\partial f = 0}, \tag{1.5} \]

where $\Sigma, \Lambda$ enumerate all the scalars, and thus $\partial_\Sigma$ contains derivatives with respect to $\phi^x$ and $q^X$. If the flow is along a direction corresponding to a positive part of this matrix, the scalars flow to this fixed point with large values of the warp factor $a$, and the point is called a UV attractor. If the flow is along a direction where the matrix $U$ is negative, the scalars are attracted to this point for small values of $a$, and this point is called an IR attractor or IR fixed point (see e.g. [12, 7] for more details). The eigenvalues of $U$ are also the conformal weights of the corresponding operators in the conformal dual to the supergravity theory.

The purpose of this paper is to derive general properties of such flows. These are mostly determined from knowledge of the matrix $U$. Our main result is a suitable formula for this matrix. Furthermore we can derive general results on the possibility of UV and IR critical points in symmetric or homogeneous quaternionic-Kähler manifolds.

The paper is organized as follows. In Sect. 2 we recall the basic properties of quaternionic-Kähler manifolds and the moment map. We then analyse the part of the matrix $U$ (Hessian of the energy) for a pure quaternionic-Kähler manifold in Sect. 3, and derive properties of attractor points. The very special real manifolds are introduced in Sect. 4 and the adapted moment map in Sect. 5. This allows us to find the properties of the full Hessian in Sect. 6. Finally we give conclusions in Sect. 7. An effort is made to translate mathematical formulae into notation readable to physicists and vice versa. In particular, we have given a presentation of very special real geometry which is accessible to mathematicians. Some remarks on our notation are gathered in the Appendix.

² This condition can be relaxed if the 4-dimensional part of the metric is generalized from the flat one in (1.1) to a curved one [8–10].

2. Quaternionic-Kähler Moment Map

We start by recalling the notion of quaternionic-Kähler manifold. Let $(M, g)$ be a Riemannian manifold. A quaternionic-Kähler structure $Q$ on $M$ is a rank 3 subbundle $Q \subset \mathrm{End}(TM)$ invariant under parallel transport such that locally $Q = \mathrm{span}\{J_1, J_2, J_3 = J_1 J_2\}$, where the $J_\alpha$ are locally defined skew-symmetric almost complex structures on $M$. With respect to local coordinates $q^X$, with $X = 1, \ldots, 4r = \dim M$, the almost complex structures $J_\alpha$ have components $J_{\alpha X}{}^{Y}$ satisfying³

\[ J_{\alpha X}{}^{Y} J_{\beta Y}{}^{Z} = -\delta_{\alpha\beta} \, \delta_X{}^{Z} + \varepsilon_{\alpha\beta\gamma} J_{\gamma X}{}^{Z}, \tag{2.1} \]

where $\varepsilon_{\alpha\beta\gamma}$ is completely antisymmetric with $\varepsilon_{123} = 1$. The invariance of $Q$ under the Levi-Civita connection $\nabla$ is tantamount to the existence of a triplet of one-forms $\omega_\alpha$ such that

\[ \nabla J_\alpha = -2 \varepsilon_{\alpha\beta\gamma} \, \omega_\beta J_\gamma. \tag{2.2} \]

The $\omega_\alpha$ may be determined from this equation, which can be rewritten as the full covariant constancy of the complex structures; in components:

\[ D_Z J_{\alpha X}{}^{Y} := \partial_Z J_{\alpha X}{}^{Y} - \Gamma^{W}_{ZX} J_{\alpha W}{}^{Y} + \Gamma^{Y}_{ZW} J_{\alpha X}{}^{W} + 2 \varepsilon_{\alpha\beta\gamma} \, \omega_{\beta Z} J_{\gamma X}{}^{Y} = 0. \tag{2.3} \]

Here $\Gamma$ are the Christoffel symbols of $\nabla$ and $2\varepsilon_{\alpha\beta\gamma}\omega_\beta$ is the connection matrix of the connection induced by $\nabla$ in the bundle $Q$. We note that $D$ is here a connection in $Q \otimes \mathbb{R}^3$ induced by the Levi-Civita connection on $Q$ and the connection on the trivial rank 3 bundle over $M$ defined by $D e_\alpha = 2\varepsilon_{\alpha\beta\gamma} \omega_\beta e_\gamma$, where $e_\alpha$ is the standard basis of $\mathbb{R}^3$. Note that Eq. (2.3) means that the section $J := J_\alpha \otimes e_\alpha$ is parallel with respect to $D$. In general, $D$ contains the gauge field of all the transformations of the object on which it acts (see Appendix).

A Riemannian manifold admits a quaternionic-Kähler structure if and only if its holonomy group is a subgroup of $Sp(r)Sp(1)$, where $4r = \dim M$. The group $Sp(r)Sp(1)$ is the linear group normalizing a quaternionic structure on $\mathbb{R}^{4r}$ and preserving a compatible Euclidean scalar product. A quaternionic-Kähler manifold of $\dim M = 4r > 4$ is a Riemannian manifold endowed with a quaternionic-Kähler structure. In dimension 4 (the case $r = 1$) this definition would correspond simply to the notion of oriented Riemannian 4-fold. Instead we will assume in addition that $Q$ annihilates the curvature tensor of the manifold $(M, g)$, i.e.

\[ J_{\alpha X}{}^{V} R_{VYZW} + J_{\alpha Y}{}^{V} R_{XVZW} + J_{\alpha Z}{}^{V} R_{XYVW} + J_{\alpha W}{}^{V} R_{XYZV} = 0. \tag{2.4} \]

This condition is automatically satisfied if $r > 1$. Then the following result holds in all dimensions [13].

³ Here and below, a sum over repeated indices is understood.
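The algebra (2.1) can be checked on the flat model $\mathbb{R}^4 = \mathbb{H}$. The sketch below is an illustration, not part of the paper: it assumes that $J_1, J_2, J_3$ are realized as left multiplication by the quaternion units $i, j, k$, and the helper names are hypothetical. With integer matrices the check is exact.

```python
def left_mult(q):
    """4x4 real matrix of x -> q*x on the quaternions, basis (1, i, j, k)."""
    a, b, c, d = q
    return [[a, -b, -c, -d],
            [b, a, -d, c],
            [c, d, a, -b],
            [d, -c, b, a]]


def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[r][m] * B[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


def eps(a, b, c):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


# Assumed model: J_1, J_2, J_3 act as left multiplication by i, j, k.
J = [left_mult((0, 1, 0, 0)), left_mult((0, 0, 1, 0)), left_mult((0, 0, 0, 1))]


def rhs(alpha, beta):
    """Matrix of -delta_{alpha beta} * id + eps_{alpha beta gamma} J_gamma."""
    delta = 1 if alpha == beta else 0
    return [[-delta * (1 if r == c else 0)
             + sum(eps(alpha, beta, g) * J[g][r][c] for g in range(3))
             for c in range(4)] for r in range(4)]


def relation_holds():
    """True if J_alpha J_beta equals the right-hand side of (2.1) for all pairs."""
    return all(matmul(J[a], J[b]) == rhs(a, b)
               for a in range(3) for b in range(3))


def is_skew(A):
    """True if A is skew-symmetric for the standard scalar product."""
    return all(A[r][c] == -A[c][r] for r in range(4) for c in range(4))


print(relation_holds(), all(is_skew(A) for A in J))
```

In this model $J_3 = J_1 J_2$ holds by construction, matching the local description $Q = \mathrm{span}\{J_1, J_2, J_3 = J_1 J_2\}$.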


Theorem 1. The curvature tensor $R$ of a quaternionic-Kähler manifold of dimension $4r$ is of the form

\[ R = \nu R_0 + W, \]

where $R_0$ is the curvature tensor of the quaternionic projective space, $\nu = \frac{\mathrm{scal}}{4r(r+2)}$ is the reduced scalar curvature and $W$ is an algebraic curvature tensor of type $sp(r)$ (the "Weyl curvature"). This means that the components can be written as

\[ R_{XYZW} = \nu \left( \tfrac12 g_{Z[X} g_{Y]W} + \tfrac12 J^\alpha_{XY} J^\alpha_{ZW} - \tfrac12 J^\alpha_{Z[X} J^\alpha_{Y]W} \right) + f_X^{iA} f_Y^{jB} \varepsilon_{ij} f_Z^{kC} f_W^{\ell D} \varepsilon_{k\ell} \, \mathcal{W}_{ABCD}, \tag{2.5} \]

where the antisymmetrization of a tensor $T_{XY}$ is defined as $T_{[XY]} := \frac12 (T_{XY} - T_{YX})$, the $f_X^{iA}$ are the vielbeins of the manifold ($A = 1, \ldots, 2r$ and $i, j = 1, 2$) and $\mathcal{W}_{ABCD}$ is completely symmetric.

We do not use vielbeins in this paper, but the interested reader may find a discussion of quaternionic-Kähler manifolds in terms of vielbeins in the article on quaternionic-Kähler manifolds in [14]; a definition of quaternionic-Kähler manifolds that starts from vielbeins has been given in [15], see also [16, 17, 7]. To avoid confusion we emphasize that here and in all coordinate expressions we use $X, Y, \ldots$ to denote indices from $1, \ldots, 4r$, following the convention in the 5d supergravity literature. In coordinate free formulas the same letters, in sans-serif font, X, Y, ..., will denote vector fields.

It follows from Theorem 1 that quaternionic-Kähler manifolds are Einstein, in fact, $\mathrm{Ric} = \nu (r + 2) g$. Only in the case $\nu = 0$ can we choose the three local almost complex structures spanning the quaternionic structure $Q$ to be parallel and, in particular, integrable. This case corresponds to locally hyper-Kählerian manifolds and is excluded in the following discussion. So from now on $\nu \ne 0$. Supergravity fixes $\nu = -1$, but we will keep $\nu$ general below.

To any quaternionic-Kähler manifold we can associate the parallel 4-form $\Omega := \rho_\alpha \wedge \rho_\alpha$, where the $\rho_\alpha = g(\cdot, J_\alpha \cdot)$ are the 2-forms ("Kähler forms") associated to any choice of three local almost complex structures $(J_1, J_2, J_3 = J_1 J_2)$ spanning $Q$ (its components are just the components of the almost complex structures with indices lowered by the metric).

Let $K$ be a Killing vector field on a quaternionic-Kähler manifold $(M, g, Q)$. Then $K$ normalizes $Q$. Indeed, if $(M, g)$ is locally symmetric, then the Lie algebra of all Killing vector fields is well known. If $(M, g)$ is not locally symmetric and $\dim M = 4r > 4$ then the holonomy Lie algebra is $\mathrm{hol} = sp(r) \oplus sp(1)$ and any Killing vector field normalizes the holonomy Lie algebra and in particular its $sp(1)$-factor, which defines the quaternionic-Kähler structure $Q$. This proves that $K$ normalizes $Q$, i.e.⁴ $\mathcal{L}_K J^\alpha = b^{\alpha\beta} J^\beta$ for some $b^{\alpha\beta}(q^X)$. This amounts to

\[ (D_{[X} K^{Z}) J^{\alpha}_{Y]Z} = -\nu \, \varepsilon^{\alpha\beta\gamma} J^{\beta}_{XY} P^{\gamma}, \tag{2.6} \]

for some $P^\gamma(q^X)$ with normalization chosen for later convenience.

⁴ Here $\mathcal{L}_K$ is the Lie derivative, i.e. $\mathcal{L}_K X = [K, X]$ for all vector fields X.

The 3-form obtained from the 4-form $\Omega$ by contraction,

\[ \iota_K \Omega = \rho_\alpha(K, \cdot) \wedge \rho_\alpha \]


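is closed: $d\iota_K \Omega = \mathcal{L}_K \Omega - \iota_K d\Omega = 0$, where $\mathcal{L}_K$ denotes the Lie derivative.

Proposition 1. The three-form $\iota_K \Omega$ is exact:

\[ \nu \, \iota_K \Omega = d\rho, \qquad \rho := \langle \nabla K, J_\alpha \rangle \rho_\alpha, \]

where $\langle \cdot, \cdot \rangle$ is the canonical scalar product on $\mathrm{End}\, TM$ normalized such that $\langle J_\alpha, J_\beta \rangle = \delta_{\alpha\beta}$.

Proof. Let us compute $d\rho = \mathrm{alt}\, \nabla\rho$:

\[ \nabla\rho = \langle \nabla^2 K, J_\alpha \rangle \rho_\alpha - 2\varepsilon_{\alpha\beta\gamma} \left( \langle \nabla K, \omega_\beta J_\gamma \rangle \rho_\alpha + \langle \nabla K, J_\alpha \rangle \, \omega_\beta \otimes \rho_\gamma \right). \tag{2.7} \]

Here, the last two terms on the right hand side cancel each other. The first term is computed using the following lemma.

Lemma 1. Let $K$ be a Killing vector field on a Riemannian manifold with curvature tensor $R$. Then the second covariant derivative of $K$ is given by⁵

\[ \nabla^2 K = R(\cdot, K). \tag{2.8} \]

Proof. We prove first that the tensor $\nabla^2 Z - R(\cdot, Z)$ is symmetric for any vector field Z. This follows from the Bianchi identity:

\[ \nabla^2_{X,Y} Z - R(X, Z)Y - \left( \nabla^2_{Y,X} Z - R(Y, Z)X \right) = R(X, Y)Z - R(X, Z)Y + R(Y, Z)X = 0. \]

So it is sufficient to check that $\langle \nabla^2_{X,X} K, Y \rangle = \langle R(X, K)X, Y \rangle$ for all vector fields X and Y. We can assume that $[X, Y] = 0$, since we are checking an identity between tensors; and we use the Killing equation $\langle \nabla_X K, Y \rangle = -\langle \nabla_Y K, X \rangle$:

\begin{align*}
\langle \nabla^2_{X,X} K, Y \rangle &= \langle \nabla_X \nabla_X K - \nabla_{\nabla_X X} K, Y \rangle \\
&= X \langle \nabla_X K, Y \rangle - \langle \nabla_X K, \nabla_X Y \rangle + \langle \nabla_Y K, \nabla_X X \rangle \\
&= -X \langle \nabla_Y K, X \rangle - \langle \nabla_X K, \nabla_X Y \rangle + \langle \nabla_Y K, \nabla_X X \rangle \\
&= -\langle \nabla_X \nabla_Y K, X \rangle - \langle \nabla_X K, \nabla_X Y \rangle \\
&= -\langle \nabla_X \nabla_Y K, X \rangle - \langle \nabla_X K, \nabla_Y X \rangle \\
&= -\langle \nabla_X \nabla_Y K, X \rangle + \langle \nabla_Y \nabla_X K, X \rangle \\
&= -\langle R(X, Y)K, X \rangle = \langle R(X, K)X, Y \rangle.
\end{align*}

⁵ In equations such as the one below, where vectors are not explicitly written, they should be understood as appearing consistently from left to right, e.g. below: $\nabla^2_{X,Y} K := \nabla_X \nabla_Y K - \nabla_{\nabla_X Y} K = R(X, K)Y$. Using the covariant derivative $D$ instead, this equation may be written in a coordinate basis in the form $D_X D_Y K^Z = R^{Z}{}_{YXW} K^W$.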


Now, using this lemma we obtain

\[ \nabla\rho = \langle \nabla^2 K, J_\alpha \rangle \rho_\alpha = \langle R(\cdot, K), J_\alpha \rangle \rho_\alpha = \frac{\nu}{2} \, \rho_\alpha(\cdot, K) \otimes \rho_\alpha. \tag{2.9} \]

Here, we use the fact that the $sp(1)$-part of $R = \nu R_0 + W$ is given by the middle term of the first line of (2.5) (the other terms vanish under contraction with $J^{\beta XY}$):

\[ R^{sp(1)} = \nu R_0^{sp(1)} = \frac{\nu}{2} \, \rho_\alpha J_\alpha, \tag{2.10} \]

see Theorem 1. For the exterior derivative we thus obtain

\[ d\rho = \frac{\nu}{2} \, \rho_\alpha(\cdot, K) \wedge \rho_\alpha = \nu \, \iota_K \Omega, \tag{2.11} \]

proving the proposition.

The moment map associated to the Killing vector $K$ is the section $P = P^\alpha J_\alpha \in \Gamma(Q)$ related to the two-form $\rho$ by $\rho = \nu g(\cdot, P \cdot)$, cf. [18]. It follows that

\[ P^\alpha = \nu^{-1} \langle \nabla K, J_\alpha \rangle. \tag{2.12} \]

This is consistent with the use of $P^\alpha$ in (2.6). We are interested in the gradient flow generated by the function

\[ f := P^2 := \langle P, P \rangle = \nu^{-2} \langle \nabla K, J_\alpha \rangle^2, \tag{2.13} \]

which we call the energy.

Proposition 2. The covariant derivative of the moment map $P$ is given by

\[ \nabla P = \tfrac12 \rho_\alpha(\cdot, K) \otimes J_\alpha. \tag{2.14} \]

The gradient of the energy is

\[ \mathrm{grad}\, f = PK = P^\alpha J_\alpha K = \nu^{-1} \langle \nabla K, J_\alpha \rangle J_\alpha K. \tag{2.15} \]

Proof. The formula (2.14) is an immediate consequence of (2.9). Using (2.12) and (2.14) we compute the differential $df$:

\[ df = 2 \langle \nabla P, P \rangle = P^\alpha \rho_\alpha(\cdot, K) = \nu^{-1} \langle \nabla K, J_\alpha \rangle \rho_\alpha(\cdot, K). \tag{2.16} \]

This implies the formula for the gradient.

Corollary 1. The set of critical points of $P$ is

\[ \mathrm{Crit}(P) = \{K = 0\}. \tag{2.17} \]

The set of critical points of the energy $f$ is the union

\[ \mathrm{Crit}(f) = \{K = 0\} \cup \{f = 0\}. \tag{2.18} \]


The formula (2.14) appears in supergravity as the definition of the moment map or "prepotential"⁶ in the component form:

\[ -\nu D_X P^\alpha = R^{\alpha}_{XY} K^Y = \frac{\nu}{2} J^{\alpha}_{XY} K^Y. \tag{2.19} \]

In supergravity $\nu = -1$. Here,

\[ R^{\alpha}_{XY} = 2 \partial_{[X} \omega^{\alpha}_{Y]} + 2 \omega^{\beta}_X \omega^{\gamma}_Y \varepsilon^{\alpha\beta\gamma} \tag{2.20} \]

are the components of the $sp(1)$ curvature (2.10). They are clearly proportional to the Kähler forms $\rho_\alpha$, yielding the formula for the gradient

\[ \partial_X f = -P^\alpha J^{\alpha}_{XY} K^Y. \tag{2.21} \]

⁶ Note that the form $\rho_\alpha(\cdot, K) = g(\cdot, J_\alpha K)$ has components $-J^{\alpha}_{XY} K^Y$.

Remark. The set $\{K = 0\}$ is a union of totally geodesic submanifolds. This follows from the fact that a connected component of the fixed point set of a group of isometries is totally geodesic since isometries transform geodesics to geodesics and there exists a unique geodesic through two sufficiently close points. If $(M, g)$ is complete and has non-positive sectional curvature (e.g. if $(M, g)$ is a symmetric space of non-compact type or, more generally, a Riemannian manifold covered by such a space) then $\{K = 0\}$ is connected since in the universal covering of $M$ any two points are joined by a unique geodesic. This generalizes to any symmetric space allowed in supergravity (which has to be non-compact due to the $\nu = -1$ condition) the result found in the toy model (universal hypermultiplet) in [7]. Namely, if there is an isolated critical point, then there are no other critical points, or, if there are two critical points then, as explained above, they are connected by a geodesic which consists of critical points.

3. The Hessian of the Energy at a Critical Point

In this section we compute the Hessian of the energy $f$ at critical points and study its spectrum. For this we need the following lemma.

Lemma 2. The second covariant derivative of the moment map is given by:

\[ \nabla^2_{X,Y} P = \tfrac12 \rho_\alpha(Y, \nabla_X K) J_\alpha. \tag{3.1} \]

The Hessian of the energy is:

\[ \mathrm{Hess}_f(X, Y) := \nabla^2_{X,Y} f = P^\alpha \rho_\alpha(Y, \nabla_X K) + \tfrac12 \rho_\alpha(X, K) \rho_\alpha(Y, K). \tag{3.2} \]

Proof. Using (2.14) we compute

\begin{align*}
\nabla^2_{X,Y} P &= \tfrac12 g(Y, \nabla_X(J_\alpha) K) J_\alpha + \tfrac12 g(Y, J_\alpha \nabla_X K) J_\alpha + \tfrac12 g(Y, J_\alpha K) \nabla_X J_\alpha \\
&= \tfrac12 \rho_\alpha(Y, \nabla_X K) J_\alpha - \varepsilon_{\alpha\beta\gamma} \left( \omega_\beta(X) g(Y, J_\gamma K) J_\alpha + g(Y, J_\alpha K) \omega_\beta(X) J_\gamma \right) \\
&= \tfrac12 \rho_\alpha(Y, \nabla_X K) J_\alpha.
\end{align*}


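For the Hessian of $f = P^2$ we get:

\begin{align*}
\nabla^2_{X,Y} f &= 2 \langle \nabla^2_{X,Y} P, P \rangle + 2 \langle \nabla_X P, \nabla_Y P \rangle \\
&= \nu^{-1} \rho_\alpha(Y, \nabla_X K) \langle \nabla K, J_\alpha \rangle + \tfrac12 \rho_\alpha(X, K) \rho_\alpha(Y, K) \\
&= P^\alpha \rho_\alpha(Y, \nabla_X K) + \tfrac12 \rho_\alpha(X, K) \rho_\alpha(Y, K).
\end{align*}

Let us decompose the operator

\[ L_K := \nabla K = \langle \nabla K, J_\alpha \rangle J_\alpha + \bar L_K = \nu P^\alpha J_\alpha + \bar L_K. \tag{3.3} \]

Then $\bar L_K$ is a skew-symmetric operator commuting with $Q$. This follows from the fact that $\nabla$ and $K$ preserve $Q$ using the formula

\[ \nabla K = \nabla_K - \mathcal{L}_K, \tag{3.4} \]

where $\mathcal{L}_K$ denotes the Lie derivative. The operators $\langle \nabla K, J_\alpha \rangle J_\alpha$ and $\bar L_K$ are called the $sp(1)$-part and the $sp(r)$-part of $L_K$, respectively. The important properties of $\bar L_K$ are

\[ (S_K)_{\alpha XY} := J_{\alpha X}{}^{Z} (\bar L_K)_{ZY} = (S_K)_{\alpha YX}, \qquad (S_K)_{\alpha X}{}^{X} = 0. \tag{3.5} \]

Theorem 2. The set of critical points of the energy $f$ is given in (2.18). At a point $p \in M$ where $f = 0$ the Hessian of $f$ is given by

\[ \mathrm{Hess}_f(X, X) = \tfrac12 \sum_\alpha \rho_\alpha(X, K)^2, \tag{3.6} \]

and hence it is positive semi-definite. Its kernel is

\[ \ker \mathrm{Hess}_f = \mathrm{span}\{J_1 K, J_2 K, J_3 K\}^{\perp} \subset T_p M. \tag{3.7} \]

At a point where $K = 0$ the Hessian is given by

\[ \mathrm{Hess}_f(X, X) = P^\alpha \rho_\alpha(X, \nabla_X K) = -\nu f \, g(X, X) + g(X, SX), \tag{3.8} \]

where $S$ is the symmetric operator $S := P \bar L_K = P^\alpha J_\alpha \bar L_K = P^\alpha (S_K)_\alpha$.

Proof. This follows immediately from (2.16), (3.2) and the decomposition of $L_K$ into its $sp(1)$ and $sp(r)$-parts.

As this is a main result of this section, we give also its component form:

\[ \partial_X \partial_Y f \big|_{K=0} = -\nu f \, g_{XY} + P^\alpha (S_K)_{\alpha XY}. \tag{3.9} \]

Recall that the Hesse operator $H_f$ is defined by $g(H_f X, Y) = \mathrm{Hess}_f(X, Y)$. We are now in a position to prove:

Theorem 3. At a point where $K = 0$ there exists an eigenbasis for the Hesse operator $H_f$ of the form

\[ e_1, J_1 e_1, e_2, J_1 e_2, \ldots, e_r, J_1 e_r;\ J_2 e_1, J_3 e_1, J_2 e_2, J_3 e_2, \ldots, J_2 e_r, J_3 e_r. \]

The corresponding eigenvalues are

\[ \lambda_1 - \nu f, \lambda_1 - \nu f, \lambda_2 - \nu f, \lambda_2 - \nu f, \ldots, \lambda_r - \nu f, \lambda_r - \nu f;\ -\lambda_1 - \nu f, -\lambda_1 - \nu f, -\lambda_2 - \nu f, -\lambda_2 - \nu f, \ldots, -\lambda_r - \nu f, -\lambda_r - \nu f. \]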
534

D. V. Alekseevsky, V. Cort´es, C. Devchand, A. Van Proeyen

Proof. The eigenvectors of $H_f = -\nu f \, \mathrm{id} + S$ coincide with the eigenvectors of the operator $S = P^\alpha J_\alpha \bar L_K$. Without loss of generality we can assume that $P^\alpha J_\alpha$ is proportional to $J_1$: $P^\alpha J_\alpha = c J_1$, $c \in \mathbb{R}$. Then the operator $S = P^\alpha J_\alpha \bar L_K = c J_1 \bar L_K$ commutes with $J_1$ and anticommutes with $J_2$ and $J_3$. Let $v$ be an eigenvector of $S$, $Sv = \lambda v$. Then

\[ S J_1 v = J_1 S v = \lambda J_1 v, \qquad S J_2 v = -J_2 S v = -\lambda J_2 v, \qquad \text{and} \qquad S J_3 v = -\lambda J_3 v. \]
Now one can easily prove by induction that there is an eigenbasis of $S$ of the form $e_1, J_1 e_1, e_2, J_1 e_2, \ldots, e_r, J_1 e_r, J_2 e_1, J_3 e_1, J_2 e_2, J_3 e_2, \ldots, J_2 e_r, J_3 e_r$ with eigenvalues $\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_r, \lambda_r, -\lambda_1, -\lambda_1, \ldots, -\lambda_r, -\lambda_r$.

We will say that the $sp(1)$-part of $\nabla K$ is small (at a point where $K = 0$) if $|\nu f| < |\lambda_i|$ for all $i$. We will say that the $sp(r)$-part is regular if $\bar L_K$ is invertible. This is the case if and only if the $\lambda_i \ne 0$.

Corollary 2. Let $p \in M$ be a point where $K = 0$ and let the $sp(1)$-part of $\nabla K$ be small (and therefore the $sp(r)$-part is regular). Then the Hessian $\mathrm{Hess}_f$ has $r$ positive and $r$ negative eigenvalues, each of double multiplicity.

We recall that all the known non-flat homogeneous quaternionic-Kähler manifolds fall into two classes: the Wolf spaces [19] and the Alekseevsky spaces [20–22]. The Wolf spaces are symmetric spaces of positive scalar curvature. Their isometry group is compact. The Alekseevsky spaces are precisely the homogeneous quaternionic-Kähler manifolds of negative scalar curvature which admit an R-splittable simply transitive solvable group of isometries. This class contains the non-compact duals of the Wolf spaces, which are symmetric spaces of negative scalar curvature, together with 3 series of non-symmetric quaternionic-Kähler manifolds. The following result characterizes the symmetric quaternionic-Kähler manifolds.

Theorem 4 [23]. A homogeneous quaternionic-Kähler manifold is symmetric if and only if it admits a smooth compact quotient by a discrete group of isometries.

Theorem 5. Let $(M = G/H, g, Q)$ be one of the known non-flat homogeneous quaternionic-Kähler manifolds. Then there exists a Killing vector field $K$ vanishing at a point $p \in M$ such that the Hessian $\mathrm{Hess}_f$ has split signature at $p$.

Proof. By Corollary 2 it is sufficient to prove that the isotropy Lie algebra $\mathfrak{h}$ contains a vector with small $sp(1)$-part and regular $sp(r)$-part. For the symmetric quaternionic-Kähler manifolds the isotropy Lie algebra splits as a direct sum of ideals:

\[ \mathfrak{h} = sp(1) \oplus \mathfrak{h}', \qquad \mathfrak{h}' \subset sp(r). \tag{3.10} \]

(For most of the symmetric quaternionic-Kähler manifolds $\mathfrak{h}' = [\mathfrak{h}', \mathfrak{h}']$. The only exception is when $G$ is of type $A_{n+1}$, with $\mathfrak{h}' = u(n)$ and $[u(n), u(n)] = su(n)$.) Let $\mathfrak{t} \subset \mathfrak{h}'$ be a Cartan subalgebra and $T \in \mathfrak{t}$ a regular element. Then $T$ has regular


$sp(r)$-part (has only non-zero eigenvalues under the isotropy representation) since the isotropy representation of $\mathfrak{h}'$ has no trivial submodule. By adding a small vector from $sp(1)$ we obtain a Killing vector with the desired properties.

The non-symmetric case is more involved since the isotropy Lie algebra $\mathfrak{h} \subset sp(1) \oplus sp(r)$ does not admit a splitting of the type (3.10). The isometry and isotropy groups of these spaces were found in [24] (see also summary in [25]). We use the description of the Alekseevsky spaces given in [26]. This does not use the full isometry group. The metric-preserving group in the centralizer of the Clifford algebra, which consists of the antisymmetric matrices commuting with the gamma matrices, is not included. This group is part of the full isometry and the full isotropy group. Let $M = G/H$ be an Alekseevsky space of dimension $4r$. The Lie group $G = G(\Pi)$ is defined by a $\mathrm{spin}(V)$-equivariant map $\Pi : \wedge^2 W \to V$, where $V$ is a pseudo-Euclidean vector space and $W$ is a module for the even Clifford algebra $C\ell^0(V)$. The isotropy Lie algebra has the form⁷

\[ \mathfrak{h} = so(3) \oplus so(p, q + 3), \tag{3.11} \]

where $(p + 3, q + 3)$ is the signature of $V$. Note that the $so(3)$ here is not the one that acts as the quaternionic structure, which we consistently denote as $sp(1)$. We denote by $\pi : \mathfrak{h} \to sp(r)$ the $sp(r)$-projection of $\mathfrak{h}$. It is faithful and has parts from both subalgebras of (3.11). The $sp(1)$-projection of $\mathfrak{h}$ has kernel $so(p, q + 3)$ and defines an isomorphism $so(3) \to sp(1) = Q$. The isotropy module splits under $\pi(\mathfrak{h}) \cong so(3) \oplus so(p, q + 3)$ as follows:

\[ \mathbb{C}^2 \oplus \mathbb{C}^2 \otimes \mathbb{R}^{p, q+3} \oplus W. \tag{3.12} \]

Here $so(3)$ acts on $\mathbb{C}^2 = \mathbb{H}$ by the standard representation of $su(2) \cong so(3)$ commuting with the quaternionic structure and $so(p, q + 3)$ acts trivially on $\mathbb{C}^2$ and in the standard way on $\mathbb{R}^{p, q+3}$. The action on the $C\ell^0(V)$-module $W$ is induced by the inclusion

\[ so(3) \oplus so(p, q + 3) \subset so(V) = so(p + 3, q + 3) \cong \mathrm{spin}(V) \subset C\ell^0(V). \]

From this description we see that the isotropy module of $\pi(\mathfrak{h})$ has no trivial submodules. This proves that $\mathfrak{h}$ contains elements with regular $sp(r)$-part. Also it is easy to see that the $sp(1)$-part of such an element has to be non-zero (due to the submodule $\mathbb{C}^2$) and can be chosen to be arbitrarily small.

For the symmetric quaternionic-Kähler manifolds we can prove the existence of Killing vector fields $K$ vanishing at a point $p \in M$ such that $\mathrm{Hess}_f$ is definite at $p$ (non-degenerate local extremum of $f$).

Theorem 6. Let $(M = G/H, g, Q)$ be a symmetric quaternionic-Kähler manifold of reduced scalar curvature $\nu$. If $\nu > 0$ (respectively, $\nu < 0$), there exists a Killing vector field $K$ vanishing at a point $p \in M$ such that the Hessian $\mathrm{Hess}_f$ is negative (respectively, positive) definite, i.e. the energy $f$ has a non-degenerate local maximum (respectively, minimum) at $p$.

⁷ The $q$ here is in agreement with the notation in [20, 22, 21, 24, 25]. The $P$ or $\dot P$ values in these papers determine the choice of the module $W$. In [26] a generalization is made to homogeneous spaces of non-positive signature. This is reflected in the extra parameter $p$, which is 0 in the positive-signature case.


Proof. It follows from the decomposition (3.10) that there exist Killing vector fields K vanishing at a point p ∈ M with zero sp(r)-part at p. For such a field K the Hesse operator of f at p is given by Hf = −νf id. So Hf < 0 if ν > 0 and Hf > 0 if ν < 0. (The same is true for any Killing vector with sufficiently small sp(r)-part.) Note that for the non-symmetric Alekseevsky spaces there are no non-zero Killing vector fields in the isotropy Lie algebra with zero sp(r)-part. Nevertheless we can prove the following result. Theorem 7. Let (M = G/H, g, Q) be an Alekseevsky space. It is defined by a spin(V )equivariant map : ∧2 W → V , where V = Rp+3,q+3 and W is a C0 (V )-module. Then there exists a Killing vector field K vanishing at a point p ∈ M such that the Hessian Hessf is positive semi-definite at p. More precisely, the spectrum of Hessf consists of three eigenvalues: λ := −νf > 0, of multiplicity 4(r − p − q + 2), the eigenvalue 2λ, of multiplicity 2(p + q + 4), and the eigenvalue 0, of multiplicity 2(p + q + 4). Proof. It is sufficient to choose K ∈ so(3) ⊂ h = so(3)⊕ so(p, q +3). At the canonical base point we can assume without loss of generality that P = J1 (and hence f = 2 ). Then the Hesse operator at the canonical base point is Hf = −νf id+S = −ν 2 id+S = − id + S, where S = J1 L¯ K . Recall that the decomposition (3.12) of the isotropy module splits under the sp(r)-projection π(h). With respect to that decomposition S acts trivially on W and acts only on the first factor C2 of C2 ⊕ C2 ⊗ Rp,q+3 = C2 ⊗ Rp,q+4 with eigenvalues ± of double multiplicity. This shows that Hf has eigenvalues − (of multiplicity dim W ), −2 (of multiplicity 2r − dim W/2) and 0 (of multiplicity 2r − dim W/2). In the next sections we want to extend our discussion to the manifolds which are allowed targets for the scalars of 5-dimensional supergravity theories. 
The most general such manifold for a theory with $r$ hypermultiplets and $n$ vector multiplets is a product $M \times N$ of a quaternionic-Kähler manifold $M$ of dimension $4r$ and a very special manifold $N$ of dimension $n$.

4. Very Special Real Manifolds

The geometry connected to vector multiplets in 5 dimensions was uncovered in an old and beautiful paper [4]. The manifolds were placed in the context of the family of special geometries in [5]. A very special manifold is a connected immersed hypersurface $N \to \{C = 1\} \subset \mathbb{R}^{n+1}$ defined by a homogeneous cubic polynomial $C$ which is non-singular on a neighborhood of the image of the immersion. For simplicity of our exposition, we assume, without loss of generality, that $N \subset \mathbb{R}^{n+1}$ is embedded. Then we do not need to distinguish between points of $N$ and their images in $\mathbb{R}^{n+1}$. The radial vector field $\xi$ defined by $\xi(p) = p$ is always transversal to the hypersurface $N$. It gives rise to a pseudo-Riemannian metric $g = g_N$ and to a torsionfree connection $D$ on $N$. They are defined by the formula (see footnote 8)

$$\partial_X Y = D_X Y + \tfrac{2}{3}\, g(X,Y)\,\xi. \tag{4.1}$$

8 The factor 2/3 is introduced for consistency of the notation with the supergravity action. It guarantees that the corresponding scalars have the same normalization in the kinetic energy as the scalars of the quaternionic manifold. The translation of these formulae to the familiar supergravity language will be given at the end of this section.


Here $X$ and $Y$ are tangent to $N$ and $\partial$ denotes the canonical connection of $\mathbb{R}^{n+1}$. Usually one assumes that the metric is positive definite. Let us denote by $C(\cdot,\cdot,\cdot)$ the completely symmetric trilinear form whose associated cubic form is $C = C(\cdot)$. They are related by polarization: $C(X,X,X) = C(X)$.

Proposition 3. The metric $g$ is related to the Hessian of $C$ by

$$g = -\tfrac{1}{2}\,\mathrm{Hess}\,C\,|_N. \tag{4.2}$$

More explicitly, for all $X, Y \in T_pN$:

$$g(X,Y) = -3\,C(p,X,Y). \tag{4.3}$$
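As a sanity check on Proposition 3, the following Python sketch evaluates both sides of (4.3) for one sample cubic. The cubic $C(h) = h^0h^1h^2$, the base point $p = (1,1,1)$, the tangent vectors and all function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations

def C3(X, Y, Z):
    """Symmetric trilinear form polarizing the sample cubic C(h) = h0*h1*h2,
    normalized so that C3(X, X, X) = C(X)."""
    total = sum(X[i] * Y[j] * Z[k] for i, j, k in permutations(range(3)))
    return Fraction(total, 6)

def hess_C(p, X, Y):
    """Euclidean Hessian of C(h) = h0*h1*h2 at p, applied to (X, Y):
    d^2C/dh_i dh_j equals p_k for {i, j, k} = {0, 1, 2} and vanishes for i = j."""
    return sum(p[3 - i - j] * X[i] * Y[j]
               for i in range(3) for j in range(3) if i != j)

def metric(p, X, Y):
    """g(X, Y) = -(1/2) Hess C (X, Y), the right-hand side of (4.2)."""
    return Fraction(-1, 2) * hess_C(p, X, Y)

p = (1, 1, 1)     # a point on the hypersurface {C = 1}
X = (1, -1, 0)    # tangent at p: C3(p, p, X) = 0
Y = (0, 1, -1)    # tangent at p as well

# Both tangent vectors are annihilated by the linear form C(p, p, .).
assert C3(p, p, p) == 1 and C3(p, p, X) == 0 and C3(p, p, Y) == 0
# Formula (4.3): g(X, Y) = -3 C(p, X, Y).
assert metric(p, X, Y) == -3 * C3(p, X, Y)
assert metric(p, X, X) == -3 * C3(p, X, X)
```

For this particular cubic and base point the metric is positive on the chosen tangent vector $X$, which matches the usual assumption that $g$ is positive definite.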

Proof. Equation (4.2) easily implies the explicit formula (4.3) since $C$ is a homogeneous function of degree 3. Let $X$ and $Y$ be vector fields tangent to $N$. We can extend them (locally) to vector fields in the ambient space $\mathbb{R}^{n+1}$ satisfying $XC = YC = 0$. Hence, using the homogeneity of $C$ we obtain at a point $p \in \{C = 1\}$:

$$\mathrm{Hess}\,C(X,Y) = XYC - (\partial_XY)C = 0 - 3\,C(p,p,\partial_XY) = -3\,C(p,p,\partial_XY). \tag{4.4}$$

The right-hand side of (4.4) equals $-2g(X,Y)$ because the linear form $C(p,p,\cdot)$ vanishes precisely on the tangent space of $N$ at $p$ and equals 1 on $p = \xi(p)$, so that by (4.1) $C(p,p,\partial_XY) = \tfrac{2}{3}\,g(X,Y)$.

We define a (1, 2)-tensor $S^C$ on $N$ by

$$g(S^C_XY, Z) = \tfrac{3}{2}\,C(X,Y,Z) \quad \text{for all } X, Y, Z \in T_pN. \tag{4.5}$$

Lemma 3. The Levi-Civita connection of $g$ is given by

$$\nabla = D - S^C. \tag{4.6}$$

Proof. The connection $\nabla = D - S^C$ is torsionfree because $D$ is torsionfree and $S^C_XY = S^C_YX$. We check that it is a metric connection. Let $X, Y, Z$ be tangent vectors to $N$, (locally) extended to vector fields in $\mathbb{R}^{n+1}$ satisfying $XC = YC = ZC = 0$. Then, with the normalizations (4.1), (4.2) and (4.5), we compute

$$\begin{aligned}
(\nabla_Xg)(Y,Z) &= Xg(Y,Z) - g(\nabla_XY,Z) - g(Y,\nabla_XZ)\\
&= Xg(Y,Z) - g(D_XY,Z) - g(Y,D_XZ) + g(S^C_XY,Z) + g(Y,S^C_XZ)\\
&\overset{(4.1)}{=} Xg(Y,Z) - g(\partial_XY,Z) - g(Y,\partial_XZ) + g(S^C_XY,Z) + g(Y,S^C_XZ)\\
&= (\partial_Xg)(Y,Z) + 3\,C(X,Y,Z)\\
&\overset{(4.2)}{=} -\tfrac{1}{2}(\partial_X\mathrm{Hess}\,C)(Y,Z) + 3\,C(X,Y,Z)\\
&= -\tfrac{1}{2}(\partial^3C)(X,Y,Z) + 3\,C(X,Y,Z) = -3\,C(X,Y,Z) + 3\,C(X,Y,Z) = 0.
\end{aligned}$$

It is customary to denote the standard coordinates of $\mathbb{R}^{n+1}$ by $h^I$, with $I = 0, 1, \dots, n$, and the cubic polynomial is then $C = C_{IJK}h^Ih^Jh^K$. The conjugate coordinates are $h_I := C_{IJK}h^Jh^K$. One may choose local coordinates $\phi^x$ with $x = 1, \dots, n$ on the hypersurface $N$. Vector fields tangent to $N$ are those $Y$'s for which

$$Y = Y^I\partial_I = Y^x\partial_x, \quad\text{i.e.}\quad Y^I = Y^x(\partial_xh^I). \tag{4.7}$$


Equation (4.1) is then the decomposition of the derivative

∂x Yy (∂y hI ) = (Dx Yy )(∂y hI ) + hI gxy Yy ,

(4.8)

and the orthogonality of hI and ∂y hI implies that this defines the metric gxy . Lemma 3 corresponds to the equation (where the semi-colon indicates covariant differentiation w.r.t. φ x using the Christoffel connection calculated from the metric gxy ) [4] 3/2 I z (∂z hI ) + hI gxy , Txyz := 23 CI J K (∂x hI )(∂y hJ )(∂z hK ), (∂y h );x = − 23 Txy (4.9) such that the derivative D on a vector tangent to the hypersurface corresponds to y Dx Yy = Yy ;x − 23 Txz Yz . (4.10) 5. The Dressed Moment Map Let (M, gM , Q) be a quaternionic-K¨ahler manifold of dimension 4r and (N, gN ) a very special manifold of dimension n, N ⊂ Rn+1 . We denote by πM and πN the projections ∗ g + π ∗ g the product metric. We assume that of the product M × N and by g = πM M N N we are given a Lie algebra γ spanned by n + 1 Killing vector fields KI , I = 0, 1, . . . , n, on M, with the corresponding moment maps being PI : M → Q. We define the dressed moment map P : M × N → Q by P := P α Jα := hI PI

or

PXY := P α Jα XY .

(5.1)

∗ Q over M × N , where π It is a section of the bundle πM M : M × N → M is the 9 projection. Let us also define K := hI KI . (5.2) ∗ T M. We want to study the It is an N-dependent vector field on M, a section of πM 2 gradient flow generated by the energy function f := P .

Proposition 4. The covariant derivative of the dressed moment map $P$ is given by

$$\nabla P = dh^I \otimes P_I + h^I\nabla P_I = dh^I \otimes P_I + \tfrac{1}{2}\,\rho_\alpha(\cdot,K) \otimes J_\alpha. \tag{5.3}$$

The differential of the energy is

$$df = 2\langle\nabla P, P\rangle = 2\langle P_I, P\rangle\,dh^I + P^\alpha\rho_\alpha(\cdot,K). \tag{5.4}$$

The gradient of the energy is

$$\mathrm{grad}\,f = 2\langle P_I, P\rangle\,\mathrm{grad}\,h^I + PK. \tag{5.5}$$

Proof. This follows easily from Proposition 2 applied to the $P_I$.

9 In Sect. 2 and 3 we considered one isometry whose Killing vector we denoted by K and whose moment map we denoted by P . This can be considered in the context of this section as the case of a trivial very special real manifold, i.e. n = 0, with hI having only the component h0 = 1.


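Corollary 3. The set of critical points of $P$ is [7]

$$\mathrm{Crit}(P) = \{dh^I \otimes P_I = 0\} \cap \{K = 0\} = \{P_I = h_IP\} \cap \{K = 0\}. \tag{5.6}$$

The set of critical points of the energy $f$ is

$$\mathrm{Crit}(f) = \big(\{\langle P, P_I\rangle\,dh^I = 0\} \cap \{K = 0\}\big) \cup \{f = 0\} = \big(\{\langle P, P_I\rangle = f\,h_I\} \cap \{K = 0\}\big) \cup \{f = 0\}. \tag{5.7}$$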

In the supergravity context, critical points are points with constant scalars for solutions that preserve supersymmetry. The preservation of supersymmetry imposes for the $N$-sector the condition $P_I = h_IP$, i.e. the critical points of $P$.

6. The Hessian of the Energy

In this section we carry over the calculations from Sect. 3 to the case of the dressed moment map.

Lemma 4. The second covariant derivative of the dressed moment map is given by:

$$\nabla^2_{X,Y}P = \mathrm{Hess}\,h^I(X,Y)\,P_I + \tfrac{1}{2}\rho_\alpha(X,K_I)\,Y(h^I)\,J_\alpha + \tfrac{1}{2}X(h^I)\,\rho_\alpha(Y,K_I)\,J_\alpha + \tfrac{1}{2}h^I\rho_\alpha(Y,\nabla_XK_I)\,J_\alpha. \tag{6.1}$$

The Hessian of the energy is:

$$\mathrm{Hess}\,f(X,Y) = 2\langle P, P_I\rangle\,\mathrm{Hess}\,h^I(X,Y) + P^\alpha\rho_\alpha(X,K_I)\,Y(h^I) + P^\alpha X(h^I)\,\rho_\alpha(Y,K_I) + P^\alpha h^I\rho_\alpha(Y,\nabla_XK_I) + 2\langle\nabla_XP, \nabla_YP\rangle. \tag{6.2}$$

Proof. Let us compute from (5.3):

$$\nabla^2P = (\nabla dh^I) \otimes P_I + \nabla P_I \otimes dh^I + dh^I \otimes \nabla P_I + h^I\nabla^2P_I. \tag{6.3}$$

This finishes the proof in view of Lemma 2.

We put $\bar L_K := h^I\bar L_{K_I}$.

Theorem 8. The Hessian of the energy is given by

$$\mathrm{Hess}\,f = \tfrac{4}{3}f\,\pi_N^*g_N + 2\langle P, P_I\rangle\,dh^I S^C + dh^I \otimes g_{PK_I} + g_{PK_I} \otimes dh^I + \pi_M^*g_M(-\nu f\,\mathrm{id} + S) + 2\langle\nabla P \otimes \nabla P\rangle,$$

where $S^C$ is the (1, 2)-tensor defined by (4.5) and $S$ is the symmetric operator $S := P\bar L_K$, $g_A = g(A\cdot,\cdot)$ for an endomorphism $A$ and $g_X = g(X,\cdot)$ for a vector $X$. The last term is written with the factor 2 that appears in (6.2) and in the proof below.


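Proof. Using Lemma 4 we obtain

$$\mathrm{Hess}\,f = 2\langle P, P_I\rangle\,\mathrm{Hess}\,h^I + dh^I \otimes g_{PK_I} + g_{PK_I} \otimes dh^I + h^Ig_{P\nabla K_I} + 2\langle\nabla P \otimes \nabla P\rangle.$$

The decomposition (see (3.3))

$$\nabla K_I = \nu P_I + \bar L_{K_I} \tag{6.4}$$

shows that

$$h^Ig_{P\nabla K_I} = -\nu f\,\pi_M^*g_M + g_S. \tag{6.5}$$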

To simplify the first term we need to calculate the Hessian $\mathrm{Hess}\,h^I$ of the function $h^I|_N$ with respect to the Levi-Civita connection $\nabla$ of $N$. Of course, the Hessian $\partial^2h^I$ of the linear function $h^I$ with respect to the standard connection of $\mathbb{R}^{n+1}$ is zero. This means that $XYh^I = (\partial_XY)h^I$ for all vector fields $X$ and $Y$. So, using (4.1) and Lemma 3, we get for all vector fields $X$ and $Y$ tangent to $N$:

$$\mathrm{Hess}\,h^I(X,Y) = XYh^I - (\nabla_XY)h^I = (\partial_XY - \nabla_XY)h^I = \big(D_XY + \tfrac{2}{3}g_N(X,Y)\xi - \nabla_XY\big)h^I = \tfrac{2}{3}g_N(X,Y)\,h^I + (S^C_XY)h^I.$$

In components the Hessian of $h^I$ follows from (4.9). The expression $g_{PK_I}$ is the one-form with components

$$(g_{PK_I})_X = -P_{XY}K_I^Y. \tag{6.6}$$

On the other hand, the components of $S$ (symmetric and traceless) are

$$S_{XY} = P^\alpha J_{\alpha X}{}^{Z}\,\bar L_{K\,ZY}. \tag{6.7}$$

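The result of the theorem can therefore be written as

$$\partial_x\partial_yf = \tfrac{4}{3}f\,g_{xy} - \tfrac{2}{3}P^\alpha P_I^\alpha\,T_{xy}{}^{z}\,\partial_zh^I, \qquad \partial_x\partial_Xf = -P_{XY}K_I^Y\,\partial_xh^I, \qquad \partial_X\partial_Yf = -\nu\,g_{XY}f + S_{XY}. \tag{6.8}$$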

Corollary 4. At a critical point of P the Hesse operator of the energy, defined as Hf (·, g·) = Hessf (·, ·),

(6.9)

is given by

$$H_f = \tfrac{4}{3}f\,\pi_{N*} + dh^I \otimes PK_I + g_{PK_I} \otimes \mathrm{grad}\,h^I + (-\nu f\,\mathrm{id} + S) \circ \pi_{M*}. \tag{6.10}$$

Here $\pi_{M*}$ and $\pi_{N*}$ are the differentials of the canonical projections.

Proof. It suffices to remark that at a critical point $\nabla P = 0$ and $P_I\,dh^I = 0$, see Corollary 3.


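7. Conclusions

We have considered the properties of flows governed by (1.2) and (1.4). In particular, we have obtained a formula for the Hessian matrix $U$ defined in (1.5):

$$U := \frac{3}{2f}\,H_f\Big|_{\partial f = 0} = \begin{pmatrix} 2\,\delta_x{}^{y} & \dfrac{1}{W^2}(\partial_xK^Z)\,P_Z{}^{Y} \\[2mm] -\dfrac{1}{W^2}\,P_{XZ}\,\partial^yK^Z & -\nu\,\dfrac{3}{2}\,\delta_X{}^{Y} + \dfrac{1}{W^2}\,P_X{}^{Z}\,\bar L_{K\,Z}{}^{Y} \end{pmatrix}, \tag{7.1}$$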

where the first entries are for the vector multiplets and the second for the hypermultiplets. ν is the reduced scalar curvature, see Theorem 1, which is −1 in supergravity. P and L¯ K select respectively the sp(1) and sp(r) parts of the dressed gauged isometry, defined by (6.6) and (7.2) DX KY = νPXY + L¯ K XY . Thus, in comparison with [7, 15], P is −J and L¯ K is L. Note that if only vector multiplets are present, we have only the upper-left entry of (7.1), and thus only positive eigenvalues (UV attractors) [27, 28]. However, including the quaternionic-K¨ahler manifold (hypermultiplets) opens the possibility of having negative eigenvalues as well [11, 7]. The formula (7.1) implies that such eigenvalues are only possible if either the gauged isometries are mainly in the sp(r) direction (L¯ K big enough) or one gauges generators that are not in the isotropy group of the fixed point (such that KIX = 0 and the off-diagonal elements are non-zero). The lower-right 4r × 4r part of (7.1) has eigenvalues −ν 23 + λ1 , 23 + λ1 , 23 − λ1 , (7.3) 3 3 3 3 3 3 2 − λ1 , 2 + λ2 , 2 + λ2 , 2 − λ2 , 2 − λ2 , . . . , 2 − λr . In various situations with trivial very special real manifold, we have found more detailed results on the structure of the eigenvalues, and the number of possible critical points. In particular, for complete quaternionic-K¨ahler manifolds of non-positive sectional curvature (which include the locally symmetric quaternionic-K¨ahler manifolds of negative scalar curvature) we find that the fixed point set, if non-empty, is either a point or a connected totally geodesic submanifold of non-zero dimension. Note that here, the fixed point set is non-empty if and only if K is a compact generator, i.e. if the closure of the group generated by K is compact. 
For homogeneous quaternionic-K¨ahler manifolds we prove the existence of Killing fields K such that the spectrum of U has some specific UV/IR-properties, for example: (i) For all known quaternionic-K¨ahler manifolds we exhibit a Killing vector field K vanishing at a point p ∈ M such that U has split signature at p. (ii) For the symmetric quaternionic-K¨ahler manifolds of positive scalar curvature we find a Killing vector field K vanishing at a point p ∈ M such that U is negative definite, i.e. the energy f has a non-degenerate local maximum at p. (iii) For the symmetric quaternionic-K¨ahler manifolds of negative scalar curvature we find a Killing vector field K vanishing at a point p ∈ M such that U is positive definite, i.e. the energy f has a non-degenerate local minimum at p. (iv) For the known non-symmetric homogeneous quaternionic-K¨ahler manifolds (Alekseevsky spaces), which have negative scalar curvature, we find a Killing vector field K vanishing at a point p ∈ M such that U is positive semi-definite, i.e. the energy f has a degenerate local minimum at p. Moreover we calculate the eigenvalues of U.


We remark that supergravity selects negative scalar curvature (ν = −1), so situation (ii) never occurs. We thus have either complete UV fixed points as in (iii), or zero directions as in (iv), or split signatures as in (i). These results will be useful for the investigation of possibilities for flow lines from IR to UV critical points, suitable for AdS/CFT dual pairs, or from IR to IR critical points, relevant for the investigation of possible smooth supersymmetric domain-wall solutions. We also expect that the main results of this paper can also be applied to other dimensions where quaternionic-K¨ahler manifolds occur, e.g. 4-dimensional N = 2 supergravity theories.

A. Remarks on the Notation Throughout this paper Sp(n) denotes the compact real form of the symplectic group in 2n variables, which is sometimes denoted as U Sp(2n). In particular Sp(1) = U Sp(2) = SU (2). Generically, D is a derivative that is covariant under all existing local symmetries. In other words, it is a connection in all bundles that are active on the field on which D acts. This means that D is an extension of the Levi-Civita connection (when acting on tangent vectors) adding a term − (gauge field one-form) × gauge transformation, for any gauge transformation under which the object transforms. E.g. the latter gives rise to the last term in (2.3), containing the sp(1) gauge field ωα as it acts there on an object that transforms under sp(1). Equation (2.3) determines ωα and gives our convention for the sp(1) transformation on triplets. The curvature tensor of the Levi-Civita connection ∇ is defined by (denoting vector fields X in a coordinate basis as X ∂ ) ∇X ∇Y Z − ∇Y ∇X Z = ∇[X,Y] Z + R(X, Y)Z, R(X, Y)Z, W = R ϒ X Y Z Wϒ .

(A.1)

∇ 2 is defined as 2 ∇X,Y ≡ ∇X ∇Y − ∇∇X Y .

(A.2)

The curvature of a connection D acting on tangent scalars in a space whose components are denoted by indices , , . . . is given by [D , D ] = −Ra Ta

(A.3)

for any gauge symmetry denoted by indices a, and whose action is indicated here by Ta . See (2.20) for the sp(1) curvature. Acting on a vector field X with components X in a local coordinate basis (neutral under other gauge transformations), ∇Y X = Y (D X )∂ , and the curvature components of the Levi-Civita connection are given by D D X − D D X = R ϒ X .

(A.4)

The Ricci tensor and scalar curvature are Ric = R ,

scal = g Ric .

(A.5)


References

1. Aharony, O., Gubser, S.S., Maldacena, J., Ooguri, H., Oz, Y.: Large N field theories, string theory and gravity. Phys. Rept. 323, 183–386 (2000) hep-th/9905111
2. Randall, L., Sundrum, R.: A large mass hierarchy from a small extra dimension. Phys. Rev. Lett. 83, 3370–3373 (1999) hep-ph/9905221
3. Randall, L., Sundrum, R.: An alternative to compactification. Phys. Rev. Lett. 83, 4690–4693 (1999) hep-th/9906064
4. Günaydin, M., Sierra, G., Townsend, P.K.: The geometry of N = 2 Maxwell–Einstein supergravity and Jordan algebras. Nucl. Phys. B242, 244 (1984)
5. de Wit, B., Van Proeyen, A.: Broken sigma model isometries in very special geometry. Phys. Lett. B293, 94–99 (1992) hep-th/9207091
6. Ceresole, A., Dall'Agata, G.: General matter coupled N = 2, D = 5 gauged supergravity. Nucl. Phys. B585, 143–170 (2000) hep-th/0004111
7. Ceresole, A., Dall'Agata, G., Kallosh, R., Van Proeyen, A.: Hypermultiplets, domain walls and supersymmetric attractors. Phys. Rev. D64, 104006 (2001) hep-th/0104056
8. Lopes Cardoso, G., Dall'Agata, G., Lüst, D.: Curved BPS domain wall solutions in five-dimensional gauged supergravity. JHEP 07, 026 (2001) hep-th/0104156
9. Chamseddine, A.H., Sabra, W.A.: Curved domain walls of five dimensional gauged supergravity. Nucl. Phys. B630, 326–338 (2002) hep-th/0105207
10. Chamseddine, A.H., Sabra, W.A.: Einstein brane-worlds in 5D gauged supergravity. Phys. Lett. B517, 184–190 (2001) hep-th/0106092
11. Behrndt, K., Herrmann, C., Louis, J., Thomas, S.: Domain walls in five dimensional supergravity with non-trivial hypermultiplets. JHEP 01, 011 (2001) hep-th/0008112
12. Behrndt, K., Gukov, S., Shmakova, M.: Domain walls, black holes, and supersymmetric quantum mechanics. Nucl. Phys. B601, 49–76 (2001) hep-th/0101119
13. Alekseevsky, D.V.: Riemannian spaces with exceptional holonomy groups. Funct. Anal. Applic. 2, 97–105 (1968)
14. Bagger, J., Duplij, S., Siegel, W. (eds.): Concise Encyclopedia of Supersymmetry. Dordrecht: Kluwer Academic Publishers, 2003
15. Van Proeyen, A.: The scalars of N = 2, D = 5 and attractor equations. In: 'New developments in fundamental interaction theories', proceedings of 37th Karpacz Winter School, Feb 2001; eds. J. Lukierski and J. Rembieliński, AIP proceedings 589, 2001, pp. 31–45 hep-th/0105158
16. Frè, P.: Gaugings and other supergravity tools of p-brane physics. hep-th/0102114, Lectures given at EC-RTN Workshop on Latest Development in M-Theory, Paris, France, Feb 2001
17. D'Auria, R., Ferrara, S.: On fermion masses, gradient flows and potential in supersymmetric theories. JHEP 05, 034 (2001) hep-th/0103153
18. Galicki, K., Lawson Jr., H.B.: Quaternionic reduction and quaternionic orbifolds. Math. Ann. 282, 1–21 (1988)
19. Wolf, J.A.: Complex homogeneous contact manifolds and quaternionic symmetric spaces. J. Math. Mech. 14, 1033–1047 (1965)
20. Alekseevsky, D.V.: Classification of quaternionic spaces with a transitive solvable group of motions. Math. USSR Izvestija 9, 297–339 (1975)
21. de Wit, B., Van Proeyen, A.: Special geometry, cubic polynomials and homogeneous quaternionic spaces. Commun. Math. Phys. 149, 307–334 (1992) hep-th/9112027
22. Cortés, V.: Alekseevskian spaces. Diff. Geom. Appl. 6, 129–168 (1996) Available from http://www.math.uni-bonn.de/people/vicente/
23. Alekseevsky, D.V., Cortés, V.: Homogeneous quaternionic Kähler manifolds of unimodular group. Boll. Un. Mat. Ital. B(7) 11(2), suppl. 217–229 (1997) Available from http://www.math.uni-bonn.de/people/vicente/
24. de Wit, B., Vanderseypen, F., Van Proeyen, A.: Symmetry structure of special geometries. Nucl. Phys. B400, 463–524 (1993) hep-th/9210068
25. de Wit, B., Van Proeyen, A.: Isometries of special manifolds. hep-th/9505097, Proceedings of the Meeting on Quaternionic Structures in Mathematics and Physics, Trieste, September 1994; available on http://www.emis.de/proceedings/QSMP94/
26. Cortés, V.: A new construction of homogeneous quaternionic manifolds and related geometric structures. Memoirs Amer. Math. Soc. 147(700), 1–63 (2000) math.DG/9908058
27. Kallosh, R., Linde, A.: Supersymmetry and the brane world. JHEP 02, 005 (2000) hep-th/0001071
28. Behrndt, K., Cvetič, M.: Anti-de Sitter vacua of gauged supergravities with 8 supercharges. Phys. Rev. D61, 101901 (2000) hep-th/0001159

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 238, 545–562 (2003)
Digital Object Identifier (DOI) 10.1007/s00220-003-0868-7

Communications in Mathematical Physics

Variational Estimates for Discrete Schrödinger Operators with Potentials of Indefinite Sign

D. Damanik (1,*), D. Hundertmark (2,**), R. Killip (1), B. Simon (1,***)

1 Mathematics 253-37, California Institute of Technology, Pasadena, CA 91125, USA
2 Institut Mittag-Leffler, Auravägen 17, 182 60 Djursholm, Sweden

* Supported in part by NSF grant DMS-0227289
** On leave from Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green Street, Urbana, IL 61801-2975, USA
*** Supported in part by NSF grant DMS-0140592

Received: 5 November 2002 / Accepted: 31 January 2003
Published online: 3 June 2003 – © Springer-Verlag 2003

Abstract: Let $H$ be a one-dimensional discrete Schrödinger operator. We prove that if $\sigma_{\mathrm{ess}}(H) \subset [-2,2]$, then $H - H_0$ is compact and $\sigma_{\mathrm{ess}}(H) = [-2,2]$. We also prove that if $H_0 + \frac{1}{4}V^2$ has at least one bound state, then the same is true for $H_0 + V$. Further, if $H_0 + \frac{1}{4}V^2$ has infinitely many bound states, then so does $H_0 + V$. Consequences include the fact that for decaying potential $V$ with $\liminf_{|n|\to\infty}|nV(n)| > 1$, $H_0 + V$ has infinitely many bound states; the signs of $V$ are irrelevant. Higher-dimensional analogues are also discussed.

1. Introduction

Let $H$ be a Schrödinger operator on $\ell^2(\mathbb{Z})$,

$$(Hu)(n) = u(n+1) + u(n-1) + V(n)u(n) \tag{1.1}$$

with bounded potential $V : \mathbb{Z} \to \mathbb{R}$. The free Schrödinger operator, $H_0$, corresponds to the case $V = 0$. One of our main results in this paper is

Theorem 1. If $\sigma_{\mathrm{ess}}(H) \subset [-2,2]$, then $V(n) \to 0$ as $|n| \to \infty$, that is, $H - H_0$ is compact.

Remark. By Weyl's Theorem, we have the immediate corollary that $\sigma_{\mathrm{ess}}(H) = [-2,2]$ if and only if $V(n) \to 0$.

Our motivation for this result came from two sources:


Theorem 2 (Killip-Simon [7]). If $\sigma(H) \subset [-2,2]$, then $V = 0$.

Theorem 3 (Rakhmanov [12]; see also Denisov [5], Nevai [11], and references therein). Let $J$ be a general half-line Jacobi matrix on $\ell^2(\mathbb{Z}_+)$,

$$(Ju)(n) = a_nu(n+1) + b_nu(n) + a_{n-1}u(n-1), \tag{1.2}$$

where $a_n > 0$ and $\mathbb{Z}_+ = \{1, 2, \dots\}$. Suppose that $[-2,2]$ is the essential support of the a.c. part of the spectral measure and also the essential spectrum. Then $\lim_{n\to\infty}|a_n - 1| + |b_n| = 0$, that is, $J$ is a compact perturbation of $J_0$, the Jacobi matrix with $a_n \equiv 1$, $b_n \equiv 0$.

While Theorem 3 motivated our thoughts, it is not closely related to the result. Not only are the methods different, but it holds for any a priori $a_n$; whereas our results require some a priori estimates like $a_n \to 1$ as $|n| \to \infty$. For example, if $a_n \equiv \frac{1}{2}$ and $b_n$ takes values $+1$ and $-1$ over longer and longer intervals, it is not hard to see that $\sigma(J) = [-2,2]$, but clearly, $J - J_0$ is not compact. Thus Theorem 1, unlike Theorem 3, is essentially restricted to discrete Schrödinger operators. For continuum Schrödinger operators, consideration of sparse positive nondecaying potentials shows that $\sigma(H) = [0,\infty)$ is possible even when $(H+1)^{-1} - (H_0+1)^{-1}$ is not compact. The reason is that our proof depends essentially, as does Theorem 2, on the fact that $\sigma(H)$ has two sides in the discrete case.

Theorem 1 has an interesting corollary:

Corollary 4. Let $H$ be an arbitrary one-dimensional discrete Schrödinger operator. Then

$$\sup\sigma_{\mathrm{ess}}(H) - \inf\sigma_{\mathrm{ess}}(H) \ge 4$$

with equality if and only if $V(n) \to V_\infty$, a constant, as $|n| \to \infty$.

Proof. Let $a_+ = \sup\sigma_{\mathrm{ess}}(H)$, $a_- = \inf\sigma_{\mathrm{ess}}(H)$. If $a_+ - a_- \le 4$, then $H - \frac{1}{2}(a_+ + a_-)$ is a Schrödinger operator with essential spectrum in $[-2,2]$. So Theorem 1 implies the original $V(n) \to \frac{1}{2}(a_+ + a_-)$. Hence, $a_+ - a_- = 4$ and $\sigma_{\mathrm{ess}} = [a_-, a_+]$.

Remarks. (a) A similar argument combined with Theorem 2 implies that if $\sup\sigma(H) - \inf\sigma(H) \le 4$, then $V$ is a constant.
(b) If $V(n) = (-1)^n\lambda$ and $\lambda$ is large, standard Floquet theory arguments show that $\sigma(H)$ has two bands centered about $\pm(\lambda + O(\frac{1}{\lambda}))$ and of width $O(\frac{1}{\lambda})$. Thus, while the size of the convex hull of $\sigma(H)$ is of size at least 4, the size of $\sigma(H)$ can be arbitrarily small. Indeed, by results of Deift-Simon [4], if $H$ has purely a.c. spectrum (e.g., $V$ periodic), the total size of $\sigma(H)$ is at most 4.

While Theorem 1 is our main motivating result, the ideas behind it yield many other results about the absence of eigenvalues and about the finiteness or infinitude of their number for Schrödinger operators not only on the line, but also on the half-line or in higher dimensions. Included in our results are

(i) Theorem 1 holds in two dimensions and is false in three or more dimensions (see Theorems 4.1 and 4.2). This is connected to the fact that Schrödinger operators in one and two dimensions always have a bound state for nontrivial attractive potentials (see [9, pp. 156–157] and [8, 15]), whereas in three and more dimensions, small attractive potentials need not have bound states by the Cwikel-Lieb-Rozenblum bound [1, 10, 14].


(ii) For a half-line discrete Schrödinger operator, $H$, if $\sigma(H) = [-2,2]$ (i.e., no bound states), then (see Theorem 5.2)

$$|V(n)| \le 2n^{-1/2}. \tag{1.3}$$

On the other hand (see Theorem 5.2), there are examples, $V_k(n)$, with no bound states and $\lim_k\sup_nn^{1/2}|V_k(n)| = 1$. This shows that the power $\frac{1}{2}$ in (1.3) cannot be made larger. It also shows that the constant, 2, cannot be made smaller than 1. (The optimal constant is $\sqrt{2}$. This is proved in [3].)

(iii) The examples in (ii) are necessarily sparse in that if $|V(n)| \ge Cn^{-\alpha}$ and $H$ has only finitely many bound states, then $\alpha \ge 1$. Indeed, we will prove (see Theorem 5.6) that if $\alpha = 1$ and $C > 1$ or $\alpha < 1$ and $C > 0$, then $H$ has an infinity of bound states. This will follow from the very general theorem:

Theorem 5. Let $V(n) \to 0$. If $H_0 + \frac{1}{4}V^2$ has at least one (resp., infinitely many) eigenvalues outside $[-2,2]$, then $H_0 + V$ has at least one (resp., infinitely many) eigenvalues outside $[-2,2]$.

Theorem 3.1 extends this result to all dimensions.

(iv) If $|V(n)| \ge Cn^{-\alpha}$ and $\alpha < 1$, we will prove suitable eigenvalue moments diverge.

The starting point of the present paper is the discussion at the end of Sect. 10 of [7] that it should be possible to prove Theorem 2 variationally with suitable second-order perturbation trial functions. Second-order eigenvalue perturbation theory has a change of the first-order eigenfunction by a term proportional to $V$. Thus, our variational trial function will have two pieces: $\varphi$ and an extra piece, proportional to $V\varphi$.

The second key idea is to make use of the fact that the spectrum of $H_0$ has two sides, and we can use a pair of trial functions: one to get an eigenvalue below $-2$ and one to get an eigenvalue above $+2$. By combining them, we will have various cancellations that involve terms whose sign is uncertain. Explicitly, given a pair of trial vectors $\varphi_+$ and $\varphi_-$, we define

$$\Delta(\varphi_+, \varphi_-; V) = \langle\varphi_+, (H-2)\varphi_+\rangle + \langle\varphi_-, (-H-2)\varphi_-\rangle, \tag{1.4}$$

where $H$ is given by (1.1). If $\Delta > 0$, either $\langle\varphi_+, (H-2)\varphi_+\rangle > 0$ or $\langle\varphi_-, (H+2)\varphi_-\rangle < 0$, that is, there is either an eigenvalue above 2 or below $-2$! In choosing $\varphi_-$ relative to $\varphi_+$, it will help to use the unitary operator $U$ on $\ell^2(\mathbb{Z})$ given by

$$(U\varphi)(n) = (-1)^n\varphi(n) \tag{1.5}$$

so that

$$UH_0U^{-1} = -H_0, \qquad UVU^{-1} = V. \tag{1.6}$$

The key calculation in Sect. 2 will be that

$$\Delta\big(\varphi + \tfrac{1}{4}V\varphi,\ U(\varphi - \tfrac{1}{4}V\varphi)\big) \ge 2\langle\varphi, [H_0 + \tfrac{1}{4}V^2 - 2]\varphi\rangle. \tag{1.7}$$

For example, this immediately implies the "at least one bound state" part of Theorem 5. If $H_0 + \frac{1}{4}V^2$ has a bound state, $\varphi$, we must have $\langle\varphi, (H_0 + \frac{1}{4}V^2)\varphi\rangle > 2\langle\varphi,\varphi\rangle$, so $\Delta > 0$.
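The inequality (1.7) can be tested numerically on finitely supported trial vectors. The sketch below is only an illustration under stated assumptions: the function names (`quad_form`, `delta`, `both_sides`), the random test data and the use of exact rational arithmetic are choices made here, not part of the paper. It evaluates the quantity defined in (1.4) for the pair $\varphi + \frac14 V\varphi$, $U(\varphi - \frac14 V\varphi)$ and compares it with the right-hand side of (1.7).

```python
from fractions import Fraction
import random

def quad_form(u, V):
    """<u, H u> for H = H0 + V on l^2(Z), with u supported on sites 0..len(u)-1."""
    hopping = 2 * sum(u[n] * u[n + 1] for n in range(len(u) - 1))
    potential = sum(V[n] * u[n] ** 2 for n in range(len(u)))
    return hopping + potential

def norm_sq(u):
    return sum(x * x for x in u)

def delta(phi_plus, phi_minus, V):
    """The quantity (1.4): <phi+, (H - 2) phi+> + <phi-, (-H - 2) phi->."""
    return (quad_form(phi_plus, V) - 2 * norm_sq(phi_plus)
            - quad_form(phi_minus, V) - 2 * norm_sq(phi_minus))

def both_sides(phi, V):
    """Return (left, right) sides of (1.7) for a finitely supported phi."""
    g = [Fraction(1, 4) * V[n] * phi[n] for n in range(len(phi))]
    plus = [phi[n] + g[n] for n in range(len(phi))]
    # U multiplies the value at site n by (-1)^n, as in (1.5).
    minus = [(-1) ** n * (phi[n] - g[n]) for n in range(len(phi))]
    quarter_V_sq = [Fraction(1, 4) * v * v for v in V]
    right = 2 * (quad_form(phi, quarter_V_sq) - 2 * norm_sq(phi))
    return delta(plus, minus, V), right

# Random finitely supported trial vectors and potentials, exact arithmetic.
rng = random.Random(0)
for _ in range(200):
    size = rng.randint(1, 8)
    phi = [Fraction(rng.randint(-5, 5)) for _ in range(size)]
    V = [Fraction(rng.randint(-6, 6), 2) for _ in range(size)]
    left, right = both_sides(phi, V)
    assert left >= right
```

Whenever the second returned value is positive, so is the first, and then one of the two trial vectors witnesses spectrum of $H_0 + V$ outside $[-2,2]$, which is the mechanism behind the "at least one bound state" part of Theorem 5.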


The current paper complements [2]. That paper provided upper bounds on the distance from $[-2,2]$ of eigenvalues of discrete Schrödinger operators with oscillatory potentials. This paper provides lower bounds. In particular, there it was shown that the Jacobi matrix with $a_n \equiv 1$, $b_n = \frac{\beta(-1)^n}{n}$ has finitely many eigenvalues if $|\beta| \le \frac{1}{2}$. Here, we prove infinitely many (see Theorem 5.7) if $|\beta| > 1$. We also show, by ad hoc methods, that there are no eigenvalues for $|\beta| \le 1$ (see Proposition 5.9).

In Sect. 2, we prove variational estimates, including (1.7). In Sect. 3, we prove Theorem 5. In Sect. 4, we prove Theorem 1 and provide a new proof of Theorem 2. Sections 2–4 also discuss higher dimensions. In Sect. 5, we study the one-dimensional situation more closely.

We thank Andrej Zlatoš for useful discussions.

2. Variational Estimates

On $\ell^2(\mathbb{Z}^\nu)$, define $H_0$ by

$$(H_0u)(n) = \sum_{|j|=1}u(n+j) \tag{2.1}$$

so

$$-2\nu \le H_0 \le 2\nu. \tag{2.2}$$

For $V$, a bounded function on $\mathbb{Z}^\nu$, let

$$H = H_0 + V. \tag{2.3}$$

We are interested in the spectrum of $H$ outside $[-2\nu, 2\nu] = \sigma(H_0)$. If we define $U$ on $\ell^2(\mathbb{Z}^\nu)$ by

$$(U\varphi)(n) = (-1)^{|n|}\varphi(n), \tag{2.4}$$

where $|n| = |n_1| + \cdots + |n_\nu|$, then

$$UH_0U^{-1} = -H_0, \qquad UVU^{-1} = V. \tag{2.5}$$

We define, for $\varphi_+, \varphi_- \in \ell^2(\mathbb{Z}^\nu)$,

$$\Delta(\varphi_+, \varphi_-; V) = \langle\varphi_+, (H - 2\nu)\varphi_+\rangle + \langle\varphi_-, (-H - 2\nu)\varphi_-\rangle. \tag{2.6}$$

$\Delta > 0$ implies that $H$ has spectrum outside $[-2\nu, 2\nu]$ and, as we will see, $\Delta(\varphi_+^{(n)}, \varphi_-^{(n)}; V) > 0$ for suitable $\varphi_\pm^{(n)}$ implies the spectral projection $\chi_{\mathbb{R}\setminus[-2\nu,2\nu]}(H)$ has infinite dimension. Note first that

Proposition 2.1. If $f, g \in \ell^2(\mathbb{Z}^\nu)$, then

$$\Delta(f+g, U(f-g); V) \ge 2\langle f, (H_0 - 2\nu)f\rangle - 8\nu\|g\|^2 + 4\,\mathrm{Re}\langle f, Vg\rangle. \tag{2.7}$$


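Proof. By (2.5),

$$\begin{aligned}
\Delta(f+g, U(f-g); V) &= \langle(f+g), (H_0 - 2\nu + V)(f+g)\rangle + \langle(f-g), (H_0 - 2\nu - V)(f-g)\rangle\\
&= 2\langle f, (H_0 - 2\nu)f\rangle + 2\langle g, (H_0 - 2\nu)g\rangle + 4\,\mathrm{Re}\langle f, Vg\rangle.
\end{aligned}$$

By (2.2), $H_0 \ge -2\nu$, so $\langle g, (H_0 - 2\nu)g\rangle \ge -4\nu\|g\|^2$. This yields (2.7).

One obvious choice is to take $f = \varphi$, $g = \gamma V\varphi$. The $V$-terms on the right side of (2.7) are then

$$\|V\varphi\|^2(-8\nu\gamma^2 + 4\gamma) \tag{2.8}$$

which is maximized at $\gamma = \frac{1}{4\nu}$, where $-8\nu\gamma^2 + 4\gamma = \frac{1}{2\nu}$. Thus we have a generalization of (1.7).

Theorem 2.2. For any $\varphi \in \ell^2(\mathbb{Z}^\nu)$,

$$\Delta\big((1 + \tfrac{1}{4\nu}V)\varphi,\ U(1 - \tfrac{1}{4\nu}V)\varphi;\ V\big) \ge 2\langle\varphi, (H_0 - 2\nu + \tfrac{1}{4\nu}V^2)\varphi\rangle. \tag{2.9}$$

In some applications, we will want to be able to estimate $\|f \pm g\|$ in terms of $\|f\|$, and so want to cut off $Vg$. We have

Theorem 2.3. For any $F \in \ell^\infty$ with $0 \le F \le 1$, we have

$$\Delta\big((1 + (4\nu)^{-1}FV)\varphi,\ U(1 - (4\nu)^{-1}FV)\varphi;\ V\big) \ge 2\langle\varphi, (H_0 - 2\nu + (4\nu)^{-1}FV^2)\varphi\rangle. \tag{2.10}$$

Proof. By taking $g = \gamma FV\varphi$, $f = \varphi$, the $V$-terms in (2.7) are

$$-8\nu\gamma^2\|FV\varphi\|^2 + 4\gamma\langle V\varphi, FV\varphi\rangle \tag{2.11}$$

in place of (2.8). Since $0 \le F \le 1$, we have $-F^2 \ge -F$, so $-\|FV\varphi\|^2 \ge -\langle V\varphi, FV\varphi\rangle$ and (2.10) results.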
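The cut-off estimate (2.10) can be checked in the same spirit on a finite piece of the two-dimensional lattice. In the following sketch the dimension $\nu = 2$, the box size, the random data and every function name are assumptions made for illustration only; vectors are stored as dictionaries from lattice sites to exact rational values, and taking $F \equiv 1$ reduces the check to (2.9).

```python
from fractions import Fraction
import random

NU = 2  # lattice dimension; this sketch is hard-wired to Z^2

def neighbors(n):
    """The 2*NU nearest neighbors n + j, |j| = 1, of a site n in Z^2."""
    x, y = n
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def quad_H0(u):
    """<u, H0 u> for a finitely supported u stored as {site: value}, H0 as in (2.1)."""
    return sum(val * u.get(m, 0) for n, val in u.items() for m in neighbors(n))

def quad_mult(u, W):
    """<u, W u> for multiplication by the function W."""
    return sum(W[n] * val * val for n, val in u.items())

def norm_sq(u):
    return sum(val * val for val in u.values())

def delta(phi_plus, phi_minus, V):
    """The quantity (2.6): <phi+, (H - 2 nu) phi+> + <phi-, (-H - 2 nu) phi->."""
    h_plus = quad_H0(phi_plus) + quad_mult(phi_plus, V)
    h_minus = quad_H0(phi_minus) + quad_mult(phi_minus, V)
    return (h_plus - 2 * NU * norm_sq(phi_plus)
            - h_minus - 2 * NU * norm_sq(phi_minus))

def both_sides(phi, V, F):
    """Left and right sides of (2.10) for a cutoff 0 <= F <= 1."""
    gamma = Fraction(1, 4 * NU)
    plus = {n: (1 + gamma * F[n] * V[n]) * val for n, val in phi.items()}
    # U multiplies the value at n by (-1)^{|n_1| + |n_2|}, as in (2.4).
    minus = {n: (-1) ** (abs(n[0]) + abs(n[1])) * (1 - gamma * F[n] * V[n]) * val
             for n, val in phi.items()}
    W = {n: gamma * F[n] * V[n] ** 2 for n in phi}
    right = 2 * (quad_H0(phi) + quad_mult(phi, W) - 2 * NU * norm_sq(phi))
    return delta(plus, minus, V), right

def random_instance(rng, side=4):
    """Random phi, V and cutoff F on the box [-side, side)^2 (illustrative data)."""
    sites = [(x, y) for x in range(-side, side) for y in range(-side, side)]
    phi = {n: Fraction(rng.randint(-3, 3)) for n in sites}
    V = {n: Fraction(rng.randint(-8, 8), 2) for n in sites}
    F = {n: Fraction(rng.randint(0, 4), 4) for n in sites}
    return phi, V, F

rng = random.Random(1)
for _ in range(50):
    left, right = both_sides(*random_instance(rng))
    assert left >= right
```

Only (2.2) and (2.5) enter the proof, and the code reflects this: the sign flip under $U$ works because nearest neighbors on $\mathbb{Z}^2$ always have opposite parity of $|n|$.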

The properties of $H_0$ needed above are only (2.2) and (2.5). If $J$ is a Jacobi matrix (1.2) and $J_1$ is the Jacobi matrix with the same values of $a_n$ but with $b_n = 0$, then $UJ_1U^{-1} = -J_1$. Equation (2.2) is replaced by

$$J_1 \ge -\alpha, \tag{2.12}$$

where

$$\alpha = \max_n(a_n + a_{n+1}). \tag{2.13}$$

One has

Theorem 2.4. For any $\varphi \in \ell^2(\mathbb{Z}_+)$, with $\varphi_\pm = (1 \pm \gamma V)\varphi$ (where $\gamma = (2+\alpha)^{-1}$), we have

$$\langle\varphi_+, (J-2)\varphi_+\rangle + \langle U\varphi_-, (-2-J)U\varphi_-\rangle \ge 2\langle\varphi, (J_1 - 2 + \gamma b^2)\varphi\rangle. \tag{2.14}$$


3. A $V^2$ Comparison Theorem

Our goal in this section is to prove the following extension of Theorem 5:

Theorem 3.1. Let $V$ be defined on $\mathbb{Z}^\nu$. Let $V(n) \to 0$ as $|n| \to \infty$. If $H_0 + (4\nu)^{-1}V^2$ has at least one eigenvalue (resp., infinitely many) outside $[-2\nu, 2\nu]$, then so does $H_0 + V$.

The key to this will be Theorem 2.2, but we will also need

Lemma 3.2. Let $W \ge 0$ on $\mathbb{Z}^\nu$ with $W(n) \to 0$ as $|n| \to \infty$. If $H_0 + W$ has infinitely many eigenvalues in $(2\nu, \infty)$, then we can find $\{\varphi_n\}_{n=1}^\infty$ with $\langle\varphi_n, (H_0 + W)\varphi_n\rangle > 2\nu\|\varphi_n\|^2$, so that each $\varphi_n$ has finite support and

$$\mathrm{dist}\big(\mathrm{supp}(\varphi_n), \mathrm{supp}(\varphi_m)\big) \ge 2 \tag{3.1}$$

for all $n \ne m$.

Proof. Let $\Lambda_k = \{n \in \mathbb{Z}^\nu \mid \max_{i=1,\dots,\nu}|n_i| \le k\}$. We first claim that for every $k$, there exists $\psi$ with $\psi = 0$ on $\Lambda_k$ so that $\langle\psi, (H_0 + W)\psi\rangle > 2\nu\|\psi\|^2$. Let $\tilde H_0$ be $H_0$ with Dirichlet boundary conditions on $\partial\Lambda_k$, that is, dropping off-diagonal terms $H_{0,ij}$ with $i \in \Lambda_k$, $j \notin \Lambda_k$ or vice-versa. $\tilde H_0 - H_0$ is finite rank, so $\tilde H_0 + W$ has infinitely many eigenvalues in $(2\nu, \infty)$. But $\tilde H_0 + W$ is a direct sum of an operator on $\ell^2(\Lambda_k)$ and one on $\ell^2(\mathbb{Z}^\nu\setminus\Lambda_k)$. Since $\dim\ell^2(\Lambda_k) < \infty$, we can find $\psi \in \ell^2(\mathbb{Z}^\nu\setminus\Lambda_k)$ so $\langle\psi, (H_0 + W)\psi\rangle = \langle\psi, (\tilde H_0 + W)\psi\rangle > 2\nu\|\psi\|^2$.

Now pick $\varphi_n$ inductively as follows. After picking $\{\varphi_n\}_{n=1}^N$, we have each $\varphi_n$ has finite support, so there is a $k$ with each $\varphi_n = 0$ on $\mathbb{Z}^\nu\setminus\Lambda_k$, $n = 1, \dots, N$. By the initial argument, pick $\psi_{N+1}$ vanishing on $\Lambda_{k+1}$ so that $\langle\psi_{N+1}, (H_0 + W)\psi_{N+1}\rangle > 2\nu\langle\psi_{N+1}, \psi_{N+1}\rangle$. Let $\psi_{N+1}^{(m)}$ be finitely supported approximations to $\psi_{N+1}$ which vanish on $\Lambda_{k+1}$. By continuity, for some $m$, $\langle\psi_{N+1}^{(m)}, (H_0 + W)\psi_{N+1}^{(m)}\rangle > 2\nu\langle\psi_{N+1}^{(m)}, \psi_{N+1}^{(m)}\rangle$. Pick $\varphi_{N+1} = \psi_{N+1}^{(m)}$.

Proof of Theorem 3.1. If $H_0 + (4\nu)^{-1}V^2$ has at least one eigenvalue outside $[-2\nu, 2\nu]$, there exists $\varphi$ with $\langle\varphi, (H_0 + \frac{1}{4\nu}V^2 - 2\nu)\varphi\rangle > 0$. By (2.9), $H_0 + V$ has some eigenvalue outside $[-2\nu, 2\nu]$. If $H_0 + (4\nu)^{-1}V^2$ has infinitely many eigenvalues, by Lemma 3.2, there exist $\varphi_n$ obeying (3.1) so that $\langle\varphi_n, (H_0 + \frac{1}{4\nu}V^2)\varphi_n\rangle > 2\nu\|\varphi_n\|^2$. By (2.9), we can find $\psi_n$ with either

$$\langle\psi_n, (H_0 + V)\psi_n\rangle > 2\nu\|\psi_n\|^2 \quad\text{or}\quad \langle\psi_n, (H_0 + V)\psi_n\rangle < -2\nu\|\psi_n\|^2$$

and $\mathrm{supp}(\psi_n) \subset \mathrm{supp}(\varphi_n)$. By (3.1), we have

$$\langle\psi_n, \psi_m\rangle = 0 \quad\text{and}\quad \langle\psi_n, (H_0 + V)\psi_m\rangle = 0 \quad\text{for } n \ne m.$$

Thus, by the min-max principle, $H_0 + V$ has an infinity of eigenvalues in either $(2\nu, \infty)$ or $(-\infty, -2\nu)$.

Using Theorem 2.4 in place of Theorem 2.2, we get

Theorem 3.3. Let $J(\{a_n\}, \{b_n\})$ be the Jacobi matrix (1.2). Suppose $a_n \to 1$ and $b_n \to 0$ so $\sigma_{\mathrm{ess}}(J) = [-2,2]$. Let $\alpha$ be given by (2.13) and $\gamma = (2+\alpha)^{-1}$. If $J(\{a_n\}, \{\gamma b_n^2\})$ has at least one eigenvalue (resp., infinitely many) in $(2, \infty)$, then $J(\{a_n\}, \{b_n\})$ has at least one eigenvalue (resp., infinitely many) in $(-\infty, -2) \cup (2, \infty)$.

Remark. In particular, if $J(\{a_n\}, \{b_n = 0\})$ has an infinity of eigenvalues, they cannot be destroyed by a crazy choice of $\{b_n\}$.
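The role of the separation condition (3.1) in the proof of Theorem 3.1 can be illustrated in one dimension: $H_0$ couples only nearest neighbors, so trial vectors whose supports are at distance at least 2 are orthogonal and have vanishing cross terms $\langle\psi_n, (H_0 + V)\psi_m\rangle$. The sketch below is a hedged illustration; the sample potential, the three vectors and the function names are assumptions introduced here.

```python
from fractions import Fraction

def pairing(u, v, V):
    """<u, (H0 + V) v> on l^2(Z) for finitely supported u, v stored as {site: value}."""
    return sum(un * (v.get(n + 1, 0) + v.get(n - 1, 0) + V(n) * v.get(n, 0))
               for n, un in u.items())

def inner(u, v):
    """<u, v> for finitely supported real vectors."""
    return sum(un * v.get(n, 0) for n, un in u.items())

def support_distance(u, v):
    """Distance between the supports of u and v."""
    return min(abs(n - m) for n in u for m in v)

def V(n):
    """A sample decaying potential (an arbitrary choice for this sketch)."""
    return Fraction(1, 1 + abs(n))

psi_1 = {0: Fraction(1), 1: Fraction(2)}
psi_2 = {3: Fraction(1), 4: Fraction(-1)}   # distance 2 from the support of psi_1
psi_3 = {2: Fraction(1)}                    # distance 1 from the support of psi_1

# Supports at distance >= 2: no overlap and no hopping term between them.
assert support_distance(psi_1, psi_2) == 2
assert inner(psi_1, psi_2) == 0 and pairing(psi_1, psi_2, V) == 0

# Supports at distance 1: still orthogonal, but H0 couples them.
assert support_distance(psi_1, psi_3) == 1
assert inner(psi_1, psi_3) == 0 and pairing(psi_1, psi_3, V) != 0
```

The second pair shows why orthogonality alone is not enough: without the distance-2 separation the quadratic form of $H_0 + V$ would not be diagonal on the span of the trial vectors, and the min-max count would not follow directly.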

Variational Estimates for Discrete Schr¨odinger Operators


4. Essential Spectra and Compactness in Dimension 1 and 2

Our goal in this section is to prove

Theorem 4.1. Let $\nu = 1$ or $2$. If $\sigma_{\mathrm{ess}}(H_0 + V) \subset [-2\nu, 2\nu]$, then $V(n) \to 0$ as $|n| \to \infty$.

Theorem 4.2. If $\nu \ge 3$, there exist potentials $V$ in $\ell^\infty(\mathbb{Z}^\nu)$ so that $\sigma(H_0 + V) = [-2\nu, 2\nu]$ and so that $\limsup_{n \to \infty} |V(n)| > 0$.

We will also provide a new proof of Theorem 2.

The key to the dimension dependence is the issue of finding $\varphi_n \in \ell^2(\mathbb{Z}^\nu)$ so that $\varphi_n(0) = 1$ and $\langle \varphi_n, (2\nu - H_0)\varphi_n \rangle \to 0$. We will see that this can be done in dimension 1 and 2. It cannot be done in three or more dimensions, essentially because $(2\nu - H_0)^{-1}$ exists, not as a bounded operator on $\ell^2$ but as a matrix defined on vectors of finite support. To minimize $\langle \varphi, (2\nu - H_0)\varphi \rangle$ subject to $\varphi(0) = 1$, by the method of Lagrange multipliers, one takes $\varphi = (2\nu - H_0)^{-1}\delta_0 / \langle \delta_0, (2\nu - H_0)^{-1}\delta_0 \rangle$. This is not in $\ell^2$ but has $\ell^2$ approximations.

In fact, let $\varphi \in \ell^2$ with $\langle \delta_0, \varphi \rangle = \varphi(0) = 1$. By the Cauchy-Schwarz inequality, $1 \le \|(2\nu - H_0)^{1/2}\varphi\| \, \|(2\nu - H_0)^{-1/2}\delta_0\|$, that is,

$$\langle \varphi, (2\nu - H_0)\varphi \rangle \ge \langle \delta_0, (2\nu - H_0)^{-1}\delta_0 \rangle^{-1} > 0$$

for $\nu \ge 3$. So any $\ell^2$ sequence $\varphi$ with $\varphi(0) = 1$ has a minimal kinetic energy in dimension $\nu \ge 3$.

A different way of thinking about this is as follows: If $\varphi$ has compact support in a box of size $L$ and $\varphi(0) = 1$, then, on average, $\nabla\varphi$ is at least $L^{-1}$, so $\|\nabla\varphi\|^2 = \langle \varphi, (2\nu - H_0)\varphi \rangle \sim L^\nu L^{-2}$. If $\nu \ge 3$, one does not do better by taking big boxes. In $\nu = 1$, one certainly does; and in $\nu = 2$, a careful analysis will give $(\ln L)^{-1}$ decay.

Proposition 4.3. Let $L_1, L_2 \ge 1$. There exists $\varphi_{L_1,L_2} \in \ell^2(\mathbb{Z})$, supported in $[-L_1, L_2]$, so that
(i) $\varphi_{L_1,L_2}(0) = 1$,
(ii) $\langle \varphi_{L_1,L_2}, (2 - H_0)\varphi_{L_1,L_2} \rangle = (L_1 + 1)^{-1} + (L_2 + 1)^{-1}$,
(iii) for suitable constants $c_1 > 0$ and $c_2 < \infty$,

$$c_1 (L_1 + L_2) \le \|\varphi_{L_1,L_2}\|^2 \le c_2 (L_1 + L_2). \tag{4.1}$$

Proof. Define

$$\varphi_{L_1,L_2}(n) = \begin{cases} 1 - \dfrac{n}{L_2 + 1}, & 0 \le n \le L_2 + 1, \\[1ex] 1 - \dfrac{|n|}{L_1 + 1}, & 0 \le -n \le L_1 + 1, \\[1ex] 0, & n \ge L_2 + 1 \text{ or } n \le -L_1 - 1; \end{cases} \tag{4.2}$$

then (i) and (iii) are easy. As

$$\langle \psi, (2 - H_0)\psi \rangle = \sum_{j=-\infty}^{\infty} |\psi(j + 1) - \psi(j)|^2$$

for any $\psi \in \ell^2(\mathbb{Z})$, we have

$$\langle \varphi_{L_1,L_2}, (2 - H_0)\varphi_{L_1,L_2} \rangle = \sum_{j=1}^{L_2 + 1} \Bigl(\frac{1}{L_2 + 1}\Bigr)^2 + \sum_{j=1}^{L_1 + 1} \Bigl(\frac{1}{L_1 + 1}\Bigr)^2 = (L_1 + 1)^{-1} + (L_2 + 1)^{-1}, \tag{4.3}$$

which proves (ii).
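The computation in (4.3) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it builds the tent function (4.2) in exact rational arithmetic and sums the squared nearest-neighbour differences. The function names `tent` and `kinetic_energy` are illustrative choices.

```python
from fractions import Fraction


def tent(n, L1, L2):
    """Tent function (4.2): equals 1 at n = 0 and decreases linearly
    to 0 at n = -(L1 + 1) and at n = L2 + 1."""
    if 0 <= n <= L2 + 1:
        return 1 - Fraction(n, L2 + 1)
    if -(L1 + 1) <= n < 0:
        return 1 - Fraction(-n, L1 + 1)
    return Fraction(0)


def kinetic_energy(L1, L2):
    """<phi, (2 - H0) phi> written as the sum over j of |phi(j+1) - phi(j)|^2.
    Only bonds touching the support contribute, so a finite range suffices."""
    return sum(
        (tent(j + 1, L1, L2) - tent(j, L1, L2)) ** 2
        for j in range(-(L1 + 2), L2 + 2)
    )


if __name__ == "__main__":
    for L1, L2 in [(1, 1), (3, 7), (10, 25)]:
        expected = Fraction(1, L1 + 1) + Fraction(1, L2 + 1)
        print(L1, L2, kinetic_energy(L1, L2), expected)
```

Exact fractions are used so that the identity in (ii) is tested as an equality rather than up to rounding.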


D. Damanik, D. Hundertmark, R. Killip, B. Simon

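Remark. If $\psi(0) = 1$ and $\psi$ is supported in $[-L_1, L_2]$, then

$$\sum_{j=1}^{L_2 + 1} \bigl[\psi(j) - \psi(j - 1)\bigr] = -1,$$

so, by the Schwarz inequality,

$$1 \le (L_2 + 1) \sum_{j=1}^{L_2 + 1} |\psi(j) - \psi(j - 1)|^2 .$$

Applying the same argument on the left of the origin, we get

$$\langle \psi, (2 - H_0)\psi \rangle \ge (L_1 + 1)^{-1} + (L_2 + 1)^{-1},$$

which shows that (4.2) is an extremal function.

Proposition 4.4. Let $L \ge 1$. There exists $\varphi_L \in \ell^2(\mathbb{Z}^2)$ supported in $\{(n_1, n_2) \mid |n_1| + |n_2| \le L\}$ so that
(i) $\varphi_L(0) = 1$,
(ii) $0 \le \langle \varphi_L, (4 - H_0)\varphi_L \rangle \le c[\ln(L + 1)]^{-1}$ for some $c > 0$,
(iii) $(L^{-1}\ln(L))^2 \|\varphi_L\|^2 \to d > 0$.

Remark. It seems clear that one cannot do better than $\ln(L)^{-1}$ in the large $L$ asymptotics of $\langle \varphi_L, (4 - H_0)\varphi_L \rangle$ for any test function obeying (i) and the support condition.

Proof. Define

$$\varphi_L(n_1, n_2) = \begin{cases} \dfrac{-\ln[(1 + |n_1| + |n_2|)/(L + 1)]}{\ln(L + 1)}, & \text{if } |n_1| + |n_2| \le L, \\[1ex] 0, & \text{if } |n_1| + |n_2| \ge L; \end{cases}$$

then (i) is obvious. As

$$\Bigl| \ln\frac{a}{L + 1} - \ln\frac{a + 1}{L + 1} \Bigr| = \ln\Bigl(1 + \frac{1}{a}\Bigr) \le a^{-1},$$

we have that

$$\langle \varphi_L, (4 - H_0)\varphi_L \rangle = \sum_{n_1, n_2} \Bigl( |\varphi_L(n_1 + 1, n_2) - \varphi_L(n_1, n_2)|^2 + |\varphi_L(n_1, n_2 + 1) - \varphi_L(n_1, n_2)|^2 \Bigr) \le \ln(L + 1)^{-2} \sum_{\substack{n_1, n_2 \\ |n_1| + |n_2| \le L}} (1 + |n_1| + |n_2|)^{-2} \le c \ln(L + 1)^{-1},$$

since the sum diverges as $\ln L$. This proves (ii). To prove (iii), we note that, by a simple approximation argument,

$$\ln(L)^2 L^{-2} \|\varphi_L\|^2 \to \int_{|x| + |y| \le 1} [\ln(|x| + |y|)]^2 \, dx \, dy$$

as $L \to \infty$.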
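The logarithmic decay in (ii) can be observed numerically. The sketch below is only an illustration under stated assumptions: it evaluates the lattice sum for $\langle \varphi_L, (4 - H_0)\varphi_L \rangle$ by brute force in floating point, and the names `phi` and `kinetic_energy_2d` are hypothetical helpers, not notation from the text.

```python
import math


def phi(n1, n2, L):
    """Test function of Proposition 4.4 on Z^2; it depends only on |n1| + |n2|."""
    r = abs(n1) + abs(n2)
    if r > L:
        return 0.0
    return -math.log((1 + r) / (L + 1)) / math.log(L + 1)


def kinetic_energy_2d(L):
    """Brute-force sum over all horizontal and vertical bonds that can touch
    the support {|n1| + |n2| <= L} of the squared differences of phi."""
    total = 0.0
    for n1 in range(-L - 1, L + 1):
        for n2 in range(-L - 1, L + 1):
            here = phi(n1, n2, L)
            total += (phi(n1 + 1, n2, L) - here) ** 2
            total += (phi(n1, n2 + 1, L) - here) ** 2
    return total


if __name__ == "__main__":
    for L in (10, 40, 160):
        e = kinetic_energy_2d(L)
        # The product e * ln(L + 1) should stay bounded as L grows.
        print(L, e, e * math.log(L + 1))
```

The energy itself decreases as the box grows, while the product with $\ln(L + 1)$ stays of order one, which is the content of (ii).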


Proof of Theorem 4.1. Consider first the case $\nu = 1$. Suppose $\limsup |V(n)| = a > 0$. Pick $L$ so that $2(L + 1)^{-1} < \tfrac{1}{8}\min(a^2, 2a)$. Pick a sequence $n_1, \dots, n_j, \dots$ with $|V(n_j)| \to a$ so that $|n_j| - \max_{1 \le \ell \le j - 1} |n_\ell| \ge 2(L + 2)$. Thus, $|n_j - n_\ell| \ge 2(L + 2)$ for all $j \ne \ell$. Define

$$F(n) = \min\Bigl(1, \frac{2}{|V(n)|}\Bigr) \tag{4.4}$$

and let $\psi_j(n) = \varphi_{L,L}(n - n_j)$. Then

$$\langle \psi_j, (H_0 - 2 + \tfrac{1}{4} F V^2)\psi_j \rangle \ge -2(L + 1)^{-1} + \tfrac{1}{4} F(n_j) V(n_j)^2 \ge -\tfrac{1}{8}\min(a^2, 2a) + \tfrac{1}{4}\min(|V(n_j)|^2, 2|V(n_j)|).$$

Thus we have that

$$\liminf \langle \psi_j, (H_0 - 2 + \tfrac{1}{4} F V^2)\psi_j \rangle \ge \tfrac{1}{8}\min(a^2, 2a).$$

As $|FV| \le 2$, if $\varphi_{\pm,j} = (1 \pm \tfrac{1}{4} F V)\psi_j$, we have

$$\tfrac{1}{2}\|\psi_j\| \le \|\varphi_{\pm,j}\| \le \tfrac{3}{2}\|\psi_j\| \le C_L, \tag{4.5}$$

where $C_L$ is independent of $j$; compare (4.1). By (2.9), we have a subsequence of $j$'s so that either

$$\liminf \langle \varphi_{+,j}, (H_0 + V - 2)\varphi_{+,j} \rangle \ge \tfrac{1}{16}\min(a^2, 2a)$$

or

$$\liminf \langle \varphi_{-,j}, (-H_0 - V - 2)\varphi_{-,j} \rangle \ge \tfrac{1}{16}\min(a^2, 2a).$$

Moreover, the $\varphi$'s are orthogonal. Thus $H$ has essential spectrum in either

$$[2 + \tfrac{1}{16} d^{-1}\min(a^2, 2a), \infty) \quad\text{or}\quad (-\infty, -2 - \tfrac{1}{16} d^{-1}\min(a^2, 2a)].$$

The proof for $\nu = 2$ is similar, using Proposition 4.4 in place of Proposition 4.3.

Proof of Theorem 4.2. We will give an example with $V \ge 0$. Thus the only spectrum that $H_0 + V$ can have outside $[-2\nu, 2\nu]$ is in $(2\nu, \infty)$. As $\nu \ge 3$, the operator $(2\nu - H_0)^{-1}$ has finite matrix elements despite being unbounded. We denote the $n, m$ matrix element, the Green function, by $G_\nu(n - m)$. By the Birman-Schwinger principle [18, Sect. 3.5], if the matrix

$$M_{nm} = V(n)^{1/2} G_\nu(n - m) V(m)^{1/2}$$

defines an operator on $\ell^2(\mathbb{Z}^\nu)$ with norm strictly less than 1, then $H_0 + V$ has no spectrum in $(2\nu, \infty)$. Since $G_\nu(n) \to 0$ as $n \to \infty$ (indeed, it decays as $|n|^{-(\nu - 2)}$), we can find a sequence in $\mathbb{Z}^\nu$ with $|n_j| \to \infty$ and

$$\sum_{j \ne k} |G_\nu(n_j - n_k)| < \tfrac{1}{2}. \tag{4.6}$$

For example, pick $n_k$ inductively so that the sums over $j < k$ of $|G_\nu(n_j - n_k)|$ are sufficiently small. (In fact, $G_\nu(n - m) > 0$ for all $n$ and $m$, so the absolute value sign is redundant.) Choose $\lambda > 0$ so that

$$\lambda G_\nu(0) < \tfrac{1}{2} \tag{4.7}$$

and define $V$ by

$$V(n) = \begin{cases} \min(1, \lambda), & n = \text{some } n_j, \\ 0, & \text{otherwise.} \end{cases}$$

In this way, $\limsup_{|n| \to \infty} |V(n)| = \min(1, \lambda) > 0$. However, by Schur's lemma, $\|M\| < 1$, so $H_0 + V$ has no eigenvalues.

The ideas in the first part of this section allow us to reprove Theorem 2 and, more importantly, extend it to two dimensions.

Theorem 4.5. Let $\nu = 1$ or $2$. If $\sigma(H_0 + V) \subset [-2\nu, 2\nu]$, then $V = 0$.

Proof. By Theorems 4.1 and 4.2, $V(n) \to 0$. By Theorem 3.1, if $H_0 + V$ has no bound states, neither does $H_0 + \tfrac{1}{4\nu} V^2$. Since $V = 0$ if and only if $V^2 = 0$, we may as well consider the case $V \ge 0$. Let $\varphi_L$ be the function guaranteed by Proposition 4.3 or 4.4. Then

$$\langle \varphi_L, (H_0 + V - 2\nu)\varphi_L \rangle \ge V(0) + \langle \varphi_L, (H_0 - 2\nu)\varphi_L \rangle .$$

Since $\langle \varphi_L, (H_0 - 2\nu)\varphi_L \rangle \to 0$, we must have $V(0) = 0$. By translation invariance, $V(n) = 0$ for all $n$.

Theorem 4.6. Let $J$ be the Jacobi matrix (1.2). Suppose $\liminf a_n \ge 1$ and $\sigma_{\mathrm{ess}}(J) \subset [-2, 2]$. Then $b_n \to 0$ as $n \to \infty$.

Proof. Since $\liminf a_n \ge 1$, we can suppose $a_n \ge 1$, since the change from $a_n$ to $\max(a_n, 1)$ is a compact perturbation. By the lemma below, $\sigma_{\mathrm{ess}}(J)$ can only shrink if $a_n \ge 1$ is replaced by $a_n = 1$. Thus we can suppose $a_n = 1$ in what follows. Let $\tilde H = H_0$ on $\ell^2(\mathbb{Z} \setminus \mathbb{Z}^+)$ with a Dirichlet boundary condition at 0, $\tilde H = J$ on $\ell^2(\mathbb{Z}^+)$, and

$$V(n) = \begin{cases} 0, & n \le 0, \\ b_n, & n \ge 1. \end{cases}$$

Then $H = H_0 + V$ differs from $\tilde H$ by a finite rank perturbation. Thus $H$ has essential spectrum in $[-2, 2]$. The proof is completed by using Theorem 4.1.

Lemma 4.7. If $J(\{a_n\}, \{b_n\})$ is the Jacobi matrix given by (1.2), then $\sup \sigma_{\mathrm{ess}}(J(\{a_n\}, \{b_n\}))$ and $-\inf \sigma_{\mathrm{ess}}(J(\{a_n\}, \{b_n\}))$ are monotone increasing as $a_n$ increases.

Proof. As noted in Sect. 3 of Hundertmark-Simon [6], for each $N$, the sum of the $N$ largest positive eigenvalues, $\sum_{j=1}^N E_j^+(J(\{a_n\}, \{b_n\}))$, is monotone in $\{a_n\}$. But

$$\sup \sigma_{\mathrm{ess}}\bigl(J(\{a_n\}, \{b_n\})\bigr) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^N E_j^+\bigl(J(\{a_n\}, \{b_n\})\bigr).$$

The proof for $-\inf \sigma_{\mathrm{ess}}$ is similar.


5. Decay and Bound States for Half-Line Discrete Schrödinger Operators

While whole-line discrete Schrödinger operators have bound states if $V \not\equiv 0$ (Theorem 2), this is not true for half-line operators. Indeed, the discrete analogue of Bargmann's bound [6] implies that

$$\sum_{n=1}^\infty n|V(n)| < 1 \;\Rightarrow\; \sigma(J_0 + V) = [-2, 2], \tag{5.1}$$

where $J_0$ is the free Jacobi operator, that is, (1.2) with $a_n \equiv 1$, $b_n \equiv 0$. One can also include the endpoint case: If a sequence of selfadjoint operators $A_k$ converges strongly to $A$, then

$$\sigma(A) \subseteq \bigcap_n \overline{\bigcup_{k \ge n} \sigma(A_k)};$$

see [13, Theorem VIII.24]. This shows that (5.1) can be extended to

$$\sum_{n=1}^\infty n|V(n)| \le 1 \;\Rightarrow\; \sigma(J_0 + V) = [-2, 2]. \tag{5.2}$$

In this section, we explore what the absence of bound states tells us about the decay of $V$. We begin with the case $V \ge 0$:

Theorem 5.1. Suppose $V(n) \ge 0$ and that $J_0 + V$ has no bound states. Then

$$|V(n)| \le n^{-1}. \tag{5.3}$$

Moreover, (5.3) cannot be improved in that for each $n_0$, there exists $V_{n_0}$ so that $V_{n_0}(n_0) = n_0^{-1}$ and $J_0 + V_{n_0}$ has no bound states.

Proof. Let $W_{n_0}$ be

$$W_{n_0}(n) = \begin{cases} 1, & n = n_0, \\ 0, & n \ne n_0. \end{cases}$$

We claim $J_0 + \lambda W_{n_0}$ has a bound state if and only if $|\lambda| > n_0^{-1}$. By (1.6), we can suppose $\lambda > 0$. In that case, by a Sturm oscillation theorem [17], there is a bound state in $(2, \infty)$ if and only if the solution of

$$u(n + 1) + u(n - 1) + \lambda W_{n_0}(n) u(n) = 2u(n), \qquad u(0) = 0, \; u(1) = 1 \tag{5.4}$$

has a negative value for some $n \in \mathbb{Z}^+$. The solution of (5.4) is

$$u(n) = \begin{cases} n, & n \le n_0, \\ n_0 + (1 - \lambda n_0)(n - n_0), & n \ge n_0, \end{cases}$$

which takes negative values if and only if $\lambda n_0 > 1$. This proves the claim. In particular, $n_0^{-1} W_{n_0} = V_{n_0}$ is a potential where equality holds in (5.3) and $\sigma(J_0 + V_{n_0}) = [-2, 2]$.

On the other hand, if $V(n_0) > n_0^{-1}$, then since $V \ge 0$, $V(n) \ge V(n_0) W_{n_0}(n)$ for all $n$ and so, by a comparison theorem and the fact that we have shown $J_0 + V(n_0) W_{n_0}$ has a bound state, we have that $J_0 + V$ has a bound state. The contrapositive of $V(n_0) > n_0^{-1} \Rightarrow \sigma(J_0 + V) \ne [-2, 2]$ is the first assertion of the theorem.
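The single-site claim in the proof of Theorem 5.1 is easy to test by iterating the recursion (5.4) directly. The sketch below is illustrative only; `solution_single_site` and `takes_negative_value` are hypothetical helper names, and exact rationals are used so that the borderline case $\lambda n_0 = 1$ is not blurred by rounding.

```python
from fractions import Fraction


def solution_single_site(lam, n0, n_max):
    """Iterate u(n+1) = (2 - lam*W(n)) u(n) - u(n-1) with u(0) = 0, u(1) = 1,
    where W is 1 at the single site n0 and 0 elsewhere. Returns u(0..n_max)."""
    u = [Fraction(0), Fraction(1)]
    for n in range(1, n_max):
        w = 1 if n == n0 else 0
        u.append((2 - lam * w) * u[n] - u[n - 1])
    return u


def takes_negative_value(lam, n0, n_max):
    """True if the solution dips below zero somewhere in 0..n_max."""
    return any(v < 0 for v in solution_single_site(lam, n0, n_max))


if __name__ == "__main__":
    n0 = 5
    for lam in (Fraction(1, 10), Fraction(1, 5), Fraction(2, 5)):
        u = solution_single_site(lam, n0, 40)
        # Compare with the closed form n0 + (1 - lam*n0)(n - n0) for n >= n0.
        closed = [n0 + (1 - lam * n0) * (n - n0) for n in range(n0, 41)]
        print(lam, u[n0:] == closed, takes_negative_value(lam, n0, 40))
```

For couplings only slightly above $n_0^{-1}$ the slope beyond $n_0$ is small, so `n_max` has to be taken large before the sign change becomes visible.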


Remark. Notice that Theorem 5.1 says (5.2) is optimal in the very strong sense that if $\sum_{n=1}^\infty \alpha_n |V(n)| \le 1 \Rightarrow \sigma(J_0 + V) = [-2, 2]$ for all potentials $V$, then each $\alpha_n \ge n$.

Positivity of the potential made the proof of Theorem 5.1 elementary. Because of the magic of Theorem 5, we can deduce a result for $V$'s of arbitrary sign:

Theorem 5.2. If $J_0 + V$ has no bound states, then

$$|V(n)| \le 2n^{-1/2}. \tag{5.5}$$

Moreover, (5.5) cannot be improved by more than a factor of 2 in that for each $n_0$, there exists $V_{n_0}$ so that $J_0 + V_{n_0}$ has no bound states and

$$\lim_{n_0 \to \infty} n_0^{1/2} |V_{n_0}(n_0)| = 1.$$

Remarks. (a) The proof shows

$$V_{n_0}(n_0) = \sqrt{\frac{1}{n_0} + \frac{1}{4n_0^2}} - \frac{1}{2n_0} \equiv \beta_{n_0},$$

so (5.5) cannot be improved to a value better than $\beta_{n_0} \sim n_0^{-1/2} - \tfrac{1}{2} n_0^{-1}$.
(b) In [3] it is shown that the absence of bound states implies $|V(n)| \le \sqrt{2}\, n^{-1/2}$ and that there are examples $V_{n_0}$ with $V_{n_0}(n_0) = \sqrt{2}\, n_0^{-1/2}$ and no bound states.

Proof. Theorem 5 extends to the situation where $H_0$ is replaced by $J_0$ since the mapping $\varphi \mapsto \varphi(1 \pm FV)$ is local. Thus if $J_0 + V$ has no bound states, neither does $J_0 + \tfrac{1}{4} V^2$. Since $V^2 \ge 0$, Theorem 5.1 applies, and thus $\tfrac{1}{4}|V(n)|^2 \le n^{-1}$, which is (5.5).

For the other direction, let $W_{n_0}$ be

$$W_{n_0}(n) = \begin{cases} 1, & n = n_0, \\ -1, & n = n_0 + 1, \\ 0, & n \ne n_0, n_0 + 1. \end{cases}$$

A direct solution of (5.4) is

$$u(n) = \begin{cases} n, & n \le n_0, \\ (1 - \lambda)n_0 + 1 + (1 + \lambda - \lambda^2 n_0)(n - n_0 - 1), & n \ge n_0 + 1. \end{cases} \tag{5.6}$$

Thus $u(n)$ has a negative value if and only if $1 + \lambda - \lambda^2 n_0 < 0$. Define

$$\lambda_\pm^{\mathrm{crit}} = \pm\sqrt{\frac{1}{4n_0^2} + \frac{1}{n_0}} - \frac{1}{2n_0}. \tag{5.7}$$

If $|\lambda| > \min(|\lambda_+^{\mathrm{crit}}|, |\lambda_-^{\mathrm{crit}}|)$, $u$ takes negative values for either $u(n, \lambda)$ or $u(n, -\lambda)$. By (1.6), $J_0 + V$ has eigenvalues in $(-\infty, -2)$ if and only if $J_0 - V$ has eigenvalues in $(2, \infty)$. Thus since $|\lambda_+^{\mathrm{crit}}| < |\lambda_-^{\mathrm{crit}}|$, $J_0 + \lambda W_{n_0}$ has no eigenvalues if $|\lambda| \le \lambda_+^{\mathrm{crit}}$.
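The closed form (5.6) and the size of the critical coupling can be checked in the same way as for the single-site potential. This is a hedged sketch: the helper names are hypothetical, and the choice $n_0 = 6$ is made only because then $4n_0 + 1 = 25$ is a perfect square, so that $\lambda_+^{\mathrm{crit}} = 1/3$ is rational.

```python
from fractions import Fraction
import math


def solution_two_site(lam, n0, n_max):
    """Iterate (5.4) with W = +1 at n0, -1 at n0 + 1 and 0 elsewhere,
    starting from u(0) = 0, u(1) = 1. Returns the list u(0..n_max)."""
    u = [Fraction(0), Fraction(1)]
    for n in range(1, n_max):
        w = 1 if n == n0 else (-1 if n == n0 + 1 else 0)
        u.append((2 - lam * w) * u[n] - u[n - 1])
    return u


def closed_form(lam, n0, n):
    """Right-hand side of (5.6)."""
    if n <= n0:
        return Fraction(n)
    return (1 - lam) * n0 + 1 + (1 + lam - lam ** 2 * n0) * (n - n0 - 1)


def critical_coupling(n0):
    """lambda_+^crit from (5.7), in floating point."""
    return math.sqrt(1 / n0 + 1 / (4 * n0 ** 2)) - 1 / (2 * n0)


if __name__ == "__main__":
    n0 = 6
    for lam in (Fraction(-2, 5), Fraction(-1, 3), Fraction(-3, 10)):
        u = solution_two_site(lam, n0, 60)
        agrees = all(u[n] == closed_form(lam, n0, n) for n in range(61))
        print(lam, agrees, min(u) < 0)
    for n0 in (10, 10 ** 3, 10 ** 6):
        print(n0, math.sqrt(n0) * critical_coupling(n0))
```

With this sign convention the final slope $1 + \lambda - \lambda^2 n_0$ vanishes at $\lambda = -\lambda_+^{\mathrm{crit}}$, which is why negative couplings are used in the loop; the last lines illustrate $n_0^{1/2} \lambda_+^{\mathrm{crit}} \to 1$.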

One can also say something about infinitely many bound states:


Theorem 5.3. (i) If $V \ge 0$ and

$$\limsup_{n \to \infty} |V(n)| n > 1, \tag{5.8}$$

then $J_0 + V$ has infinitely many bound states.
(ii) For general $V$, if $\limsup_{n \to \infty} |V(n)| n^{1/2} > 2$, then $J_0 + V$ has infinitely many bound states.

Proof. (ii) follows from (i) by Theorem 5. To prove (i), suppose $J_0 + V$ has only finitely many bound states. Then the solution $u$ of $(J_0 + V - 2)u = 0$ has only finitely many sign changes, so there is $N_0$ with $u(n)u(n + 1) > 0$ if $n > N_0$. It follows that $J_0 + \tilde V$ with $\tilde V(n) = V(n + N_0)$ has no bound states. Thus $|\tilde V(n)| \le n^{-1}$, so $\limsup_{n \to \infty} n|V(n)| \le 1$. Thus, by contrapositives, (5.8) implies $J_0 + V$ has infinitely many bound states.

Example 5.4. Let $N$ be a positive integer and $n_k = N^{2k}$. We consider the sequence $u(n)$ which has slope $u(n + 1) - u(n) = N^{-k}$ for $n \in [n_k, n_{k+1})$ and then determine the potential $V$ at the sites $n_k$ so that $u$ is the generalized eigenfunction at energy 2. (Constancy of the slope in the intervals $(n_k, n_{k+1})$ implies that the potential vanishes there.) We have

$$u(n_k) = n_1 + (n_2 - n_1)N^{-1} + \cdots + (n_k - n_{k-1})N^{-(k-1)} = (1 - N^{-1})\{N^2 + N^3 + \cdots + N^k\} + N^{k+1} = N^{k+1}\{1 + N^{-1} - N^{-k}\},$$

and so

$$V(n_k) = \frac{2u(n_k) - u(n_k + 1) - u(n_k - 1)}{u(n_k)} = \frac{N^{1-k} - N^{-k}}{N^{k+1}\{1 + N^{-1} - N^{-k}\}} = \frac{1 - N^{-1}}{1 + N^{-1} - N^{-k}} \, \frac{1}{n_k}.$$

As $u$ is monotone, there are no sign flips. We may conclude that $J_0 + V$ has no bound states because $V(n) \ge 0$. Therefore, taking $N \to \infty$, we see that the 1 in (5.8) is optimal.

A similar argument [19] shows there are examples with $\limsup n^{1/2}|V(n)| = 1 - \varepsilon$ and no bound states for each $\varepsilon > 0$. Basically, $V(n) = 0$ for $n \ne n_k$ or $n_k + 1$ and $V(n_k) = -V(n_k + 1) = n_k^{-1/2}(1 - \varepsilon_k)$ with $\varepsilon_k \to \varepsilon$. Again, $n_k$ must grow at least geometrically.

The examples that saturate Theorems 5.1 and 5.3 are sparse, that is, mainly zero. If $V$ is mainly nonzero and comparable in size, the borderlines change from $n^{-1}$ to $n^{-2}$ for positive $V$'s and from $n^{-1/2}$ to $n^{-1}$ for $V$'s of arbitrary sign.

Theorem 5.5. Let $V \ge 0$. Suppose there exists $\varepsilon > 0$ and $n_k \to \infty$ so that
(i)
$$\frac{2}{n_k} \sum_{j = n_k/2}^{n_k} V(j) \ge \varepsilon V(n_k), \tag{5.9}$$
(ii) $\limsup_{k \to \infty} \varepsilon n_k^2 V(n_k) > 48$.
Then $J_0 + V$ has infinitely many bound states.

Proof. For notational simplicity, we suppose each $n_k$ is a multiple of 4. By passing to a subsequence, we can suppose that

$$\frac{\varepsilon n_k}{8} V(n_k) > \frac{6}{n_k}, \tag{5.10}$$

$$\tfrac{1}{4} n_{k+1} > \tfrac{3}{2} n_k + 2. \tag{5.11}$$

Let $u_k$ be the function which is 1 at $n_k$, has constant slope on the intervals $[\tfrac{n_k}{4} - 1, n_k]$ and $[n_k, \tfrac{3n_k}{2} + 1]$, and vanishes at $n = \tfrac{n_k}{4} - 1$ and $n = \tfrac{3n_k}{2} + 1$. By Proposition 4.3,

$$\langle u_k, (2 - J_0)u_k \rangle \le \frac{6}{n_k}.$$

On $[\tfrac{n_k}{2}, n_k]$, we have $|u_k(j)| \ge \tfrac{1}{2}$, so

$$\langle u_k, V u_k \rangle \ge \frac{1}{4} \sum_{j = n_k/2}^{n_k} V(j) \ge \frac{\varepsilon n_k V(n_k)}{8} \qquad \text{(by (5.9))}.$$

By (5.10), $\langle u_k, (J_0 + V - 2)u_k \rangle > 0$ for all $k$. By (5.11), for $k \ne \ell$,

$$\langle u_k, u_\ell \rangle = \langle u_k, (J_0 + V)u_\ell \rangle = 0,$$

so, by the min-max principle, $J_0 + V$ has infinitely many eigenvalues in $(2, \infty)$.

Theorem 5 and Theorem 5.5 immediately imply

Theorem 5.6. Suppose there exists $\varepsilon > 0$ and $n_k \to \infty$ so that
(i) $\frac{2}{n_k} \sum_{j = n_k/2}^{n_k} |V(j)|^2 \ge \varepsilon^2 |V(n_k)|^2$,
(ii) $\limsup_{k \to \infty} \varepsilon n_k |V(n_k)| > 8\sqrt{3}$.
Then $J_0 + V$ has infinitely many bound states.

In this regard, here is another application of Theorem 5:

Theorem 5.7. If $|V(n)| \ge \frac{\beta}{n}$ with $\beta > 1$ and $V(n) \to 0$, then $J_0 + V$ has infinitely many bound states.

Proof. It is known (see [2, Theorem A.7]) that if $\beta^2 > 1$, then the operator with potential $\frac{\beta^2}{4n^2}$, and hence the operator with potential $\tfrac{1}{4} V(n)^2 \ge \frac{\beta^2}{4n^2}$, has infinitely many bound states. The assertion now follows from Theorem 5.

Corollary 5.8. If $V(n) \to 0$ but $\liminf_{|n| \to \infty} |nV(n)| > 1$, then $J_0 + V$ has infinitely many bound states. The same result holds in the whole-line setting.

Proof. We begin with the half-line case. By hypothesis, there exists a $\beta > 1$ such that $|V(n)| \ge \frac{\beta}{n}$ for all but finitely many $n$. Therefore the claim follows from the previous theorem because a finite rank perturbation can remove at most finitely many eigenvalues. The whole-line case follows by Dirichlet decoupling.

Remark. It is known (see [2]) that if $V(n) = \frac{1}{4n^2}$ or $V(n) = \beta\frac{(-1)^n}{n}$ with $|\beta| < \tfrac{1}{2}$, then $J_0 + V$ has finitely many bound states. Thus the powers $n^{-2}$ and $n^{-1}$ in the previous results are optimal.

The optimal constant in Theorem 5.7 is 1, as we now show.

Proposition 5.9. For $\beta \in [-1, 1]$, the operator $J_0 + V$ with potential $V(n) = \beta\frac{(-1)^n}{n}$ has no bound states.

Proof. We will show that the operator with potential $V(n) = \frac{(-1)^n}{n}$ has no bound states. As the absolute value of a bound state eigenvalue is an increasing function of the coupling constant, this implies that potentials of the form $V(n) = \beta\frac{(-1)^n}{n}$ have no bound states for $\beta \in [0, 1]$. Equation (2.5) shows that $J_0 + V$ is unitarily equivalent to $-(J_0 - V)$. Thus, the proposition for $\beta \in [-1, 0]$ follows from the $\beta \in [0, 1]$ case.

By the unitary equivalence of $J_0 + V$ and $-(J_0 - V)$, it suffices to show that for $V_0(n) = (-1)^n/n$, $J_0 + V_0$ and $J_0 - V_0$ have no eigenvalues in $(2, \infty)$. We look at solutions of

$$u(n + 1) + u(n - 1) = (2 \mp V_0(n))u(n). \tag{5.12}$$

By Sturm oscillation theory, the number of eigenvalues of $J_0 \pm V_0$ in $(2, \infty)$ is equal to the number of zeros, in $(0, \infty)$, of the linear interpolation of the generalized eigenfunction, that is, the solution of (5.12) with $u(0) = 0$. Moreover, the Sturm separation theorem implies that if (5.12) has a solution with $u(n) > 0$ for $n = 0, 1, 2, \dots$, then the generalized eigenfunction must be positive for $n \ge 1$ (and not $\ell^2$; see remark below).

We are able to write down positive solutions explicitly, but rather than pull such a rabbit out of a hat, we provide some explanation. Motivated by calculations in Maple, we look for solutions with $u(n) = u(n + 1)$ for either all odd $n$ or all even $n$. This is equivalent to asking if

$$\begin{pmatrix} x & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} y & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} xy - 1 & -x \\ y & -1 \end{pmatrix} \tag{5.13}$$

has $\binom{1}{1}$ as an eigenvector. If this is true for $y = E - V(n)$, $x = E - V(n + 1)$ for all odd (resp. even) $n$, then the Schrödinger equation has a solution with $u(n) = u(n - 1)$ for all odd (resp. even) $n$, and for such $n$,

$$u(n + 2) = [E - V(n) - 1]u(n). \tag{5.14}$$

The matrix in (5.13) has $\binom{1}{1}$ as an eigenvector if and only if

$$xy = x + y. \tag{5.15}$$

If $x = 2 + a$, $y = 2 + b$, then (5.15) becomes

$$ab = -a - b. \tag{5.16}$$

This is solved by $b = \frac{1}{m}$, $a = -\frac{1}{m + 1}$ with $y - 1 = 1 + \frac{1}{m}$. Since $-V(n)$ appears in the transfer matrix for $V_0$, we take $m = 2n + 1$, $n = 0, 1, 2, \dots$, and find a solution with

$$u(0) = u(1) = 1, \qquad u(2n) = u(2n + 1), \qquad u(2n + 2) = \Bigl(1 + \frac{1}{2n + 1}\Bigr)u(2n),$$

which is a positive solution with $u(n) \to \infty$ as $n^{1/2}$ as $n \to \infty$. For $-V_0$, we take $m = 2n$, $n = 1, 2, \dots$, and find a solution with

$$u(0) = 0, \qquad u(1) = u(2) = 1, \qquad u(2n) = u(2n - 1), \qquad u(2n + 2) = \Bigl(1 + \frac{1}{2n}\Bigr)u(2n),$$

so again, $u(n) \to \infty$ as $n^{1/2}$. We have thus found the required solution to show $J_0 + V_0$ has no eigenvalues in $(2, \infty)$.

Remarks. (a) It follows from the proof that the generalized eigenfunctions at energies $\pm 2$ are not square summable. This shows that $\pm 2$ are not eigenvalues.
(b) Choosing $y = -\frac{1}{m}$, $x = \frac{1}{m + 1}$ in the arguments given above shows that there are solutions $u_\pm$ of $(J_0 + V_0)u = 0$ with $|u_\pm(n)| \sim |n|^{\pm 1/2}$ as $n \to \infty$. This shows that 0 is not an eigenvalue of $J_0 + V_0$, but it suggests that for $J_0 + (1 + \varepsilon)V_0$, there are solutions $\ell^2$ at infinity for $\varepsilon > 0$. That is, just as coupling 1 is the borderline for eigenvalues outside $[-2, 2]$, it is the borderline for an eigenvalue at $E = 0$, similar to the Wigner-von Neumann phenomenon.

As our final topic, we want to discuss divergence of eigenvalue moments if $|V(n)| \sim n^{-\alpha}$ with $\alpha < 1$.

Lemma 5.10. Let $A$ be a bounded selfadjoint operator. Let $\{\varphi_j\}_{j=1}^\infty$ be an orthonormal set with

$$\langle \varphi_j, A\varphi_k \rangle = \alpha_j \delta_{jk}. \tag{5.17}$$

If $F$ is a nonnegative even function on $\mathbb{R}$ that is monotone nondecreasing on $[0, \infty)$, then

$$\operatorname{Tr} F(A) \ge \sum_j F(\alpha_j). \tag{5.18}$$

Remarks. (a) As $F(A) \ge 0$, it follows that $\operatorname{Tr}(F(A))$ is always defined although it may be infinite.
(b) In particular, if $\varphi_j$ is a family of nonzero vectors in $\ell^2(\mathbb{Z}^+)$ with $\operatorname{dist}(\operatorname{supp}(\varphi_j), \operatorname{supp}(\varphi_k)) \ge 2$ for $j \ne k$, then for $J = J_0 + V$,

$$\operatorname{Tr} F(J) \ge \sum_j F\Bigl(\frac{\langle \varphi_j, J\varphi_j \rangle}{\langle \varphi_j, \varphi_j \rangle}\Bigr). \tag{5.19}$$

Proof. Let $E_1 \ge E_2 \ge \cdots$ be the eigenvalues of $|A|$. By min-max and max-min for $A$, we have $E_j \ge |\alpha_j|^*$, where $|\alpha_j|^*$ is the decreasing rearrangement of $|\alpha_j|$. So (5.18) follows.

Lemma 5.11. Let $|V| \le 4\nu$ on $\operatorname{supp}(\varphi)$. Then there exists $\psi$ with $\operatorname{supp}(\psi) = \operatorname{supp}(\varphi)$ so that

$$\|\psi\|^{-2} \langle \psi, (H_0 + V)\psi \rangle - 2\nu \ge \tfrac{1}{4}\Bigl[\|\varphi\|^{-2} \langle \varphi, (H_0 + \tfrac{1}{4\nu} V^2)\varphi \rangle - 2\nu\Bigr]. \tag{5.20}$$

Proof. Let $\psi_\pm = (1 \pm (4\nu)^{-1} V)\varphi$. Since $|V| \le 4\nu$, $\|\psi_\pm\|^2 \le 4\|\varphi\|^2$. The result now follows from (2.9) by choosing $\psi$ to be either $\psi_+$ or $U\psi_-$.
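The explicit positive solutions constructed in the proof of Proposition 5.9 can be verified by running the recursion (5.12) in exact arithmetic. The following is a minimal sketch under the sign convention of (5.12); `solve_at_energy_two` is a hypothetical helper name, with `sign = +1` for $J_0 + V_0$ and `sign = -1` for $J_0 - V_0$.

```python
from fractions import Fraction


def V0(n):
    """The alternating potential V0(n) = (-1)^n / n for n >= 1."""
    return Fraction((-1) ** n, n)


def solve_at_energy_two(sign, u0, u1, n_max):
    """Iterate u(n+1) = (2 - sign*V0(n)) u(n) - u(n-1), which is (5.12)
    for the operator J0 + sign*V0 at energy 2. Returns u(0..n_max)."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(1, n_max):
        u.append((2 - sign * V0(n)) * u[n] - u[n - 1])
    return u


if __name__ == "__main__":
    # J0 + V0: expect u(2n) = u(2n+1) and u(2n+2) = (1 + 1/(2n+1)) u(2n).
    u = solve_at_energy_two(+1, 1, 1, 41)
    print(all(u[2 * n] == u[2 * n + 1] for n in range(20)))
    print(all(u[2 * n + 2] == (1 + Fraction(1, 2 * n + 1)) * u[2 * n] for n in range(20)))

    # J0 - V0: expect u(2n) = u(2n-1) and u(2n+2) = (1 + 1/(2n)) u(2n), n >= 1.
    w = solve_at_energy_two(-1, 0, 1, 41)
    print(all(w[2 * n] == w[2 * n - 1] for n in range(1, 20)))
    print(all(w[2 * n + 2] == (1 + Fraction(1, 2 * n)) * w[2 * n] for n in range(1, 20)))
```

Both sequences are nondecreasing and positive for $n \ge 1$, so neither has a sign change, in line with the Sturm oscillation argument of the proof.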


Theorem 5.12. Let $J$ be a Jacobi matrix of the form $J_0 + V$, where

$$|V(n)| \ge Cn^{-\alpha} \tag{5.21}$$

for some $\alpha < 1$ and $V(n) \to 0$. Then

$$\sum_j \bigl(|E_j| - 2\bigr)^\gamma = \infty \tag{5.22}$$

for

$$\gamma < \frac{1 - \alpha}{2\alpha}, \tag{5.23}$$

where $E_j$ are the eigenvalues of $J$ outside $[-2, 2]$.

Remark. In particular, the eigenvalue sum $\sum_{j=1}^\infty (|E_j| - 2)^{1/2}$ critical to Szegő-type sum rules [7, 16] diverges if $\alpha < \tfrac{1}{2}$. This illuminates results in [2, 16].

Proof. Fix $p > 0$. Let $\varphi_m$ be supported near $m^{p+1}$ on an interval $[m^{p+1} - C_1 m^p, m^{p+1} + C_1 m^p]$, where $C_1$ is picked to arrange that supports are separated by at least 2. Taking the slopes fixed on each half-interval and using Proposition 4.3, we see

$$\langle \varphi_m, (2 - H_0)\varphi_m \rangle \le \frac{C_2}{m^p}, \tag{5.24}$$

$$\langle \varphi_m, \tfrac{1}{4} V^2 \varphi_m \rangle \ge \frac{C_3 m^p}{m^{2\alpha(p+1)}}, \tag{5.25}$$

$$\langle \varphi_m, \varphi_m \rangle \ge C_4 m^p. \tag{5.26}$$

So long as $\alpha(p + 1) < p$ (i.e., $p > \frac{\alpha}{1 - \alpha}$), (5.25) beats out (5.24) for large $m$, and we find

$$\langle \varphi_m, \varphi_m \rangle^{-1} \langle \varphi_m, (H_0 + \tfrac{1}{4} V^2 - 2)\varphi_m \rangle \ge C_5 m^{-2\alpha(p+1)}. \tag{5.27}$$

As $p \downarrow \frac{\alpha}{1 - \alpha}$, $2\alpha(p + 1) \downarrow \frac{2\alpha}{1 - \alpha}$. By the lemma with $F(x) = \operatorname{dist}(x, [-2, 2])^\gamma$, we see that we have divergence if (5.23) holds.

Remarks. (a) If the constant $C$ in (5.21) is large enough, we can take $p = \frac{\alpha}{1 - \alpha}$ and get divergence if $\gamma = \frac{1 - \alpha}{2\alpha}$.
(b) One can extend this result as well as Theorems 5.3 and 5.5 to higher dimensions.

References

1. Cwikel, M.: Weak type estimates for singular values and the number of bound states of Schrödinger operators. Ann. Math. 106, 93–100 (1977)
2. Damanik, D., Hundertmark, D., Simon, B.: Bound states and the Szegő condition for Jacobi matrices and Schrödinger operators. To appear in J. Funct. Anal.
3. Damanik, D., Killip, R.: Half-line Schrödinger operators with no bound states. Preprint
4. Deift, P., Simon, B.: Almost periodic Schrödinger operators, III. The absolutely continuous spectrum in one dimension. Commun. Math. Phys. 90, 389–411 (1983)
5. Denisov, S.: On Rakhmanov's theorem for Jacobi matrices. To appear in Proc. Amer. Math. Soc.
6. Hundertmark, D., Simon, B.: Lieb-Thirring inequalities for Jacobi matrices. J. Approx. Theory 118, 106–130 (2002)
7. Killip, R., Simon, B.: Sum rules for Jacobi matrices and their applications to spectral theory. To appear in Ann. of Math.
8. Klaus, M.: On the bound state of Schrödinger operators in one dimension. Ann. Phys. 108, 288–300 (1977)
9. Landau, L.D., Lifshitz, E.M.: Quantum Mechanics: Non-relativistic Theory. Course of Theoretical Physics, Vol. 3. Reading, Mass: Addison-Wesley, 1958
10. Lieb, E.H.: Bounds on the eigenvalues of the Laplace and Schrödinger operators. Bull. Amer. Math. Soc. 82, 751–753 (1976); see also The number of bound states of one-body Schrödinger operators and the Weyl problem, in "Geometry of the Laplace Operator" (Proc. Sympos. Pure Math., Univ. Hawaii, Honolulu, Hawaii, 1979), pp. 241–252, Proc. Sympos. Pure Math., XXXVI, Amer. Math. Soc., Providence, R.I., 1980
11. Nevai, P.: Weakly convergent sequences of functions and orthogonal polynomials. J. Approx. Theory 65, 322–340 (1991)
12. Rakhmanov, E.A.: On the asymptotics of the ratio of orthogonal polynomials, II. Math. USSR Sb. 46, 105–117 (1983)
13. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. I. Functional Analysis. New York: Academic Press, 1980
14. Rozenblum, G.V.: Distribution of the discrete spectrum of singular differential operators. Dokl. Akad. Nauk SSSR 202, 1012–1015 (1972); Izv. VUZaved. Matematika 1, 75–86 (1976) [Russian]
15. Simon, B.: The bound state of weakly coupled Schrödinger operators in one and two dimensions. Ann. Phys. 97, 279–288 (1976)
16. Simon, B., Zlatoš, A.: Sum rules and the Szegő condition for orthogonal polynomials on the real line. To appear in Commun. Math. Phys.
17. Teschl, G.: Jacobi Operators and Completely Integrable Nonlinear Lattices. In: Mathematical Surveys and Monographs. Vol. 72, Providence, R.I.: American Mathematical Society, 2000
18. Thirring, W.: A Course in Mathematical Physics. Vol. 3. Quantum Mechanics of Atoms and Molecules, Lecture Notes in Physics, 141. New York-Vienna: Springer-Verlag, 1981
19. Zlatoš, A.: Private communication

Communicated by M. Aizenman

Commun. Math. Phys. 238, 563–582 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0835-3

Communications in

Mathematical Physics

Asymptotics of Karhunen-Loeve Eigenvalues and Tight Constants for Probability Distributions of Passive Scalar Transport Jared C. Bronski Department of Mathematics, University of Illinois Urbana-Champaign, 1409 W. Green St., Urbana, IL 61820, USA. E-mail: [email protected] Received: 22 January 2002 / Accepted: 2 February 2003 Published online: 7 May 2003 – © Springer-Verlag 2003

Abstract: In this paper we study the asymptotics of the probability distribution function for a certain model of freely decaying passive scalar transport. In particular we prove rigorous large n, or semiclassical, asymptotics for the eigenvalues of the covariance of a fractional Brownian motion. Using these asymptotics, along with some standard large deviations results, we are able to derive tight asymptotics for the rate of decay of the tails of the probability density for a generalization of the Majda model of scalar intermittency originally due to Vanden Eijnden. We are also able to derive asymptotically tight estimates for the closely related problem of small L2 ball probabilities for a fractional Brownian motion.

1. Introduction

The problem of turbulence and fluid intermittency is an important one in mathematical physics, and one that is not completely understood even in the case of passive scalar intermittency, where the equations are linear. While there are a number of papers in the physical literature which address the problem with non-rigorous asymptotics calculations, such as instanton or steepest descents calculations on path integrals, there are still relatively few rigorous results [16, 17, 21, 5, 18, 7, 30]. However the basic problem, calculating the rate of decay of the tails of a probability distribution, is one which is well suited to large deviations ideas. In this paper we consider Vanden Eijnden’s generalization [30] of Majda’s model of passive scalar transport by a random linear shear flow [16]. We give only a brief outline of the model here: more details can be found in the original paper of Majda [16] and later work on the model [5, 30].


J.C. Bronski

The generalized Majda model consists of a random linear shear flow with a random initial condition:

$$dT + x\,dB_H(t)\,\frac{\partial T}{\partial y} = \Delta T\,dt, \tag{1}$$

$$T(x, y, 0) = \int \exp(2\pi i k y)\, k^{\alpha/2} \phi(k)\, dW(k), \qquad \alpha > -1. \tag{2}$$

Here $B_H(t)$ is a fractional Brownian motion, $dW(k)$ a white noise measure, $\langle dW(k)\,dW(k') \rangle = \delta(k + k')\,dk\,dk'$, $\alpha$ a parameter which sets the scale of the initial data, and $\phi$ a smooth cut-off function. In the original paper of Majda only the white noise case, $H = \tfrac{1}{2}$, is considered. The generalization to the fractional Brownian case is interesting physically, as well as mathematically, since much of the work in the area of passive scalar transport has assumed velocity distributions which are white-noise in time. The temporal correlations of real fluid turbulence are, of course, not white-noise, but exhibit long range correlations. Since the increments of a fractional Brownian motion exhibit a similar slow algebraic decay, this is a physically appealing model. Physical arguments suggest that the effect of the long-range correlations in the velocity field should be to increase intermittency.

Vanden Eijnden studied the above model and was able to relate the long-time tails of the velocity distributions in the model to the probability that a fractional Brownian motion remains in a small ball in the $L^2$ norm [30]. The connection to the small ball problem can be seen as follows. Using the method of characteristics, the generalized Majda model can be solved explicitly for a fixed realization to give the following representation for the scalar field $T$:

$$T = \int k^{\alpha/2} \phi(k)\, e^{2\pi i k (y - B_H(t)x) - 4\pi^2 k^2 t - 2\pi^2 k^2 \int_0^t B_H^2(s)\,ds}\, dW(k).$$

The scaling properties of the fractional Brownian motion imply that (in law) $\int_0^t B_H^2(s)\,ds = t^{2H+1} \int_0^1 B_H^2(u)\,du$, and thus for typical realizations of the fractional Brownian motion there is extremely strong damping, much stronger than for the standard heat equation. For this reason one expects that, in the long-time limit, the probability of observing a large value of the scalar field is directly related to the probability that $\int_0^1 B_H^2(u)\,du$ remains small, of the order of $t^{-2H}$, so that only the standard heat decay is observed.

Using existing probabilistic results on small ball estimates for the $L^2$ norm of fractional Brownian motions, along with rigorous scaling arguments of the above sort, Vanden Eijnden was able to establish bounds on the tails of the limiting probability distribution of the scalar field $T$, of the form

$$-c x^\beta \le \log P\{|T| \ge x\} \le -C x^\beta, \qquad x \gg 1. \tag{3}$$

The bounds derived were not tight, however, since the optimal constants in the small ball estimates were not known.

It is interesting to note that essentially the same intuition for the origin of intermittency is put forward in the paper of Bernard, Gawędzki, and Kupiainen [3], and in the survey article of Falkovich, Gawędzki and Vergassola [8]. Bernard, Gawędzki, and Kupiainen study the closely related Kraichnan model, in particular the partial differential equations

Karhunen-Loeve Eigenvalues and Tight Constants


governing the time evolution of the N-point functions. For the white-noise Majda model the analogous PDEs would be

$$\frac{\partial P_n}{\partial t} = \frac{1}{2} (\vec{x} \cdot \nabla_{\vec{y}})^2 P_n + \Delta_{\vec{x}, \vec{y}} P_n .$$

As Bernard, Gawędzki, and Kupiainen point out, the above equation represents a super-diffusion, where the square distance between particles typically grows faster than linearly in time. They also argue that the dominant contribution to the N-point functions comes from those modes which exhibit atypically slow growth in time.

In this paper we give sharp estimates of the tails of the probability distribution of the scalar field, as well as the related small ball estimates, through a careful asymptotic analysis of the quantity

$$(D_H(\lambda))^{-1/2} \equiv E\bigl(e^{-\lambda \int_0^1 B_H^2(u)\,du}\bigr).$$

It is this quantity, with $\lambda = 4\pi^2 k^2 t^{2H+1}$, which controls the decay rate of the scalar in the generalized Majda model. This quantity admits several other interpretations: it is the Laplace transform of the distribution of the $L^2$ norm of the fractional Brownian motion, as well as being the Fredholm determinant of a certain operator. To establish our results we consider the eigenvalues of the following integral operator:

$$\lambda_n \phi_n = \frac{1}{2} \int_0^1 \bigl(x^{2H} + y^{2H} - |x - y|^{2H}\bigr) \phi_n(y)\,dy = A\phi_n .$$

This operator is positive, self-adjoint and compact, and thus the eigenvalues $\lambda_n$ converge to zero as $n$ approaches infinity. We prove rigorous asymptotics on the rate of decay of these eigenvalues, of the form

$$\lambda_n = \frac{\nu_H}{n^{2H+1}} + o\Bigl(n^{-\frac{(2H+2)(4H+3)}{4H+5} + \delta}\Bigr), \qquad n \gg 1, \tag{4}$$

with an explicit expression for the constant $\nu_H$.

This problem is connected with the problems mentioned above through the well-known Karhunen-Loeve expansion [2]. The Karhunen-Loeve theorem implies that the eigenfunctions of the above problem provide an orthogonal decomposition of the fractional Brownian motion process $B_H(t)$:

$$B_H(t) = \sum_{k=1}^\infty \sqrt{\lambda_k}\, \beta_k \phi_k(t), \tag{5}$$

where $\lambda_k$, $\phi_k$ are the eigenvalues and eigenfunctions respectively, and $\beta_k$ are i.i.d. Gaussian random variables with unit variance. As will be seen, all of the relevant quantities for the Majda model and the related small ball problem can be computed in terms of the Karhunen-Loeve eigenvalues.
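As a sanity check on the integral operator $A$, one can look at the case $H = \tfrac{1}{2}$, where the kernel $\tfrac{1}{2}(x + y - |x - y|)$ reduces to $\min(x, y)$, the covariance of ordinary Brownian motion. Its Karhunen-Loeve eigenpairs are classical: $\lambda_n = ((n - \tfrac{1}{2})\pi)^{-2}$ with eigenfunctions proportional to $\sin((n - \tfrac{1}{2})\pi x)$. The sketch below is an illustration only and not the method of the paper: it approximates the Rayleigh quotient $\langle \phi, A\phi \rangle / \langle \phi, \phi \rangle$ by a midpoint rule with $N$ nodes, and the names `fbm_covariance` and `rayleigh_quotient` are hypothetical.

```python
import math


def fbm_covariance(x, y, H):
    """Covariance kernel of fractional Brownian motion on [0, 1]."""
    return 0.5 * (x ** (2 * H) + y ** (2 * H) - abs(x - y) ** (2 * H))


def rayleigh_quotient(trial, H, N=200):
    """Midpoint-rule approximation of <phi, A phi> / <phi, phi>, where A is
    the integral operator with kernel fbm_covariance and phi = trial."""
    h = 1.0 / N
    pts = [(i + 0.5) * h for i in range(N)]
    vals = [trial(x) for x in pts]
    num = 0.0
    for i in range(N):
        for j in range(N):
            num += vals[i] * fbm_covariance(pts[i], pts[j], H) * vals[j]
    num *= h * h
    den = sum(v * v for v in vals) * h
    return num / den


if __name__ == "__main__":
    for n in (1, 2, 3):
        omega = (n - 0.5) * math.pi
        approx = rayleigh_quotient(lambda x: math.sin(omega * x), 0.5)
        print(n, approx, 1.0 / omega ** 2)
```

For other values of $H$ the same routine gives variational lower bounds on the top eigenvalue for any trial function, since $A$ is positive and self-adjoint; no closed-form eigenfunctions are assumed in that case.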


2. A Preliminary Calculation and Motivation As a motivation for the sequel we present the following purely formal Feynman-Kac integral calculation on Vanden Eijnden’s generalization of the Majda model. We begin with the generalized Majda model, ∂T ∂T = x B˙ h (t) + T , ∂t ∂y and take the Fourier transform in y to get ∂ Tˆ = ikx B˙ h (t)Tˆ + T . ∂t The Green’s function for this problem is given via the Feynman-Kac formula t

ˆ G(x, x , k, t) = Eβ (eik

0

βx,x (s)dBH (s)

).

Taking the expectation over the statistics of the fractional Brownian motion, and using the following formula for the covariance of the increments,

$$E_{B_H}\big(dB_H(s)\,dB_H(s')\big) = H(2H-1)|s-s'|^{2H-2}\,ds\,ds' = R(s,s')\,ds\,ds',$$

leads to the following formula for the averaged Green's function:

$$E_{B_H}\big(\hat{G}(x,x',k,t)\big) = E_\beta\!\left(e^{-|k|^2\int_0^t\!\int_0^t R(s,s')\,\beta(s)\beta(s')\,ds\,ds'}\right).$$

Similarly the expectation of the product of $n$ copies of the Green's function, the $n$-point function, is given by

$$E_{B_H}\big(\hat{G}(x_1,x_1',k_1,t)\,\hat{G}(x_2,x_2',k_2,t)\cdots\hat{G}(x_n,x_n',k_n,t)\big) = E_\beta\!\left(e^{-\sum_{i,j} k_i k_j\int_0^t\!\int_0^t R(s,s')\,\beta_i(s)\beta_j(s')\,ds\,ds'}\right).$$

As in the original Majda model the orthogonal invariance of the Brownian motion implies that the quadratic form $\sum_{i,j} k_i k_j \beta_i \beta_j$ can be diagonalized by a suitable rotation. The quadratic form $\vec{k} \otimes \vec{k}$ is rank one, with $\vec{k}$ being an eigenvector with eigenvalue $|\vec{k}|^2$, and the subspace orthogonal to $\vec{k}$ being the null-space. Thus the problem is "free" in the $n-1$ directions orthogonal to $\vec{k}$, and we need only consider Eq. (2), with $k^2$ replaced by $|\vec{k}|^2$. Note that integrating over the random velocity field has introduced a non-local term in the action of the Feynman-Kac integral. Nonlocal terms of this kind are fairly common in functional integral calculations. In Feynman's original treatment of the polaron problem [10] such terms arise in the process of integrating out the lattice coordinates, and the asymptotics of similar objects has been treated by Donsker and Varadhan [6]. Also note that since the Wiener measure itself is Gaussian, being formally given by

$$e^{-\int_0^t \frac{\dot{\beta}^2}{2}\,ds}\,d\beta_{x,x',t},$$

the effect of the fluctuating velocity field is to replace the usual quadratic form in the Wiener integral with a modified quadratic form:

$$\langle\dot{\beta},\dot{\beta}\rangle \;\to\; \langle\dot{\beta},\dot{\beta}\rangle + 2k^2\langle\beta, B\beta\rangle,$$

Karhunen-Loeve Eigenvalues and Tight Constants


where $B$ is the integral operator with a weakly singular kernel: $B\phi = \int_0^t \phi(s')R(s,s')\,ds' = H(2H-1)\int_0^t \phi(s')|s-s'|^{2H-2}\,ds'$. Intuitively this can be thought of as producing an additional diffusion which is nonlocal, due to the fact that $B$ is not diagonal. The intermittency in the Majda model arises from the interplay between the scale of the initial data and the additional diffusion due to the nonlocal term induced by the fluctuating velocity field. When the initial data is large scale, and thus concentrated near $k = 0$, this term is relatively unimportant. When the initial data is supported on small scales, this $k$ dependent dissipation becomes important, and contributes to intermittency. In particular one expects that it is generally the large $k$ asymptotics that should be important in determining intermittency, since it is the large $k$ modes which are most strongly damped. It should also be clear that the intermittency in this model arises from an interplay between the initial data, which selects the "range" of relevant $k$, and the additional diffusion due to the fluctuating velocity field, which enters through the covariance operator $B$. Since the above functional integral is Gaussian it can be evaluated by diagonalizing the quadratic form associated with the operator $B$. In fact there is the following (formal) Mehler type formula for the Green's function:

$$\bar{G}(x,x',|k|^2,t) = e^{-L(x,x',k^2,t)}\,D(k^2,t)^{-\frac{1}{2}},$$

where $L$ is the minimum of the action functional

$$L = \inf_{y(0)=x,\; y(t)=x'} \left[\int_0^t |\dot{y}(s)|^2\,ds + 2k^2\int_0^t\!\int_0^t y(s)\,y(s')\,R(s,s')\,ds\,ds'\right],$$

and $D(|k|^2,t)$ is the Van Vleck determinant:

$$D(|k|^2,t) = \det(I + 2|k|^2 D^{-1}R) = \prod_{i=1}^{\infty}\big(1 + 2k^2\lambda_i(t)\big).$$

Here $\lambda_i$ are the eigenvalues of the operator $D^{-1}R$, with $D$ the second derivative operator. Thus we are led to the eigenvalue problem $D^{-1}B\phi = \mu\phi$ or, equivalently, $B\phi = \mu D\phi$. First we note that, by scaling, it is enough to consider $t = 1$ since $\lambda_i(t) = t^{2H+1}\lambda_i(1)$. We also note that for large $n$ and $x \neq 0, 1$ the functions $\phi_n(x) = \cos((n+\tfrac12)\pi x)$ are approximate eigenfunctions of this problem, since (letting $n^* = (n+\tfrac12)\pi$) we have

$$\begin{aligned}
\frac{1}{H(2H-1)}\,B\cos(n^*x) &= \int_0^1 |x-y|^{2H-2}\cos(n^*y)\,dy \\
&= \int_0^x (x-y)^{2H-2}\cos(n^*y)\,dy + \int_x^1 (y-x)^{2H-2}\cos(n^*y)\,dy \\
&= (n^*)^{1-2H}\left[\int_0^{n^*x} u^{2H-2}\cos(n^*x-u)\,du + \int_0^{n^*(1-x)} u^{2H-2}\cos(n^*x+u)\,du\right].
\end{aligned} \tag{6}$$

When $n$ is large and $x \neq 0, 1$ the above reduces to

$$\frac{1}{H(2H-1)}\,B\cos(n^*x) \approx 2\cos(n^*x)\int_0^\infty u^{2H-2}\cos(n^*u)\,du \tag{7}$$

$$\approx \frac{2\,\Gamma(2H-1)\sin(\pi H)}{(n^*)^{2H-1}}\cos(n^*x). \tag{8}$$


Since we also have $D\cos(n^*x) = -(n^*)^2\cos(n^*x)$ it follows that $\cos(n^*x)$ is an approximate eigenfunction of this generalized eigenvalue problem:

$$B\cos(n^*x) = -\frac{2H(2H-1)\,\Gamma(2H-1)\sin(\pi H)}{(n^*)^{2H+1}}\,D\cos(n^*x) = -\frac{\Gamma(2H+1)\sin(\pi H)}{(n^*)^{2H+1}}\,D\cos(n^*x).$$

Since this is a purely formal calculation we have been somewhat cavalier about the domain of definition of the second derivative operator $D$ or, equivalently, the boundary conditions. The phase $\pi/2$ was chosen largely for convenience. Since we are only interested in the leading order asymptotics it should be clear that this factor will not change them. We will see in the following section that the tails of the PDFs in the generalized Majda model can be calculated explicitly in terms of the eigenvalues of a closely related problem, and the eigenvalues of this related problem have the same large $n$ asymptotics as the calculation above would indicate.

The paper is organized as follows. In the next section we state the main results, followed by the applications of the results to the Majda model and the problem of small ball estimates. The proofs of the main results are deferred to an appendix. In this paper all operators are self-adjoint, and are compact unless otherwise noted. The expression $\lambda_n(\mathbf{H})$ denotes the $n$th eigenvalue of the operator $\mathbf{H}$ ordered by decreasing absolute value. We also use the relation $\sim$ to denote standard asymptotic equality, while we use $\asymp$ to denote log asymptotics:

$$f(x) \asymp g(x),\ x \to x_0 \quad \text{iff} \quad \log(f(x)) \sim \log(g(x)),\ x \to x_0. \tag{9}$$

3. Main Results

The main result of this paper is the following rigorous asymptotic calculation on the Karhunen-Loeve eigenvalues of a fractional Brownian motion. The rest of the results follow in a fairly straightforward way from this.

Theorem 1. The Karhunen-Loeve eigenvalues of fractional Brownian motion with Hurst exponent $H \in (0, 1)$,

$$\lambda_n f_n = \frac{1}{2}\int_0^1 \left(x^{2H} + y^{2H} - |x-y|^{2H}\right) f_n(y)\,dy = A f_n, \tag{10}$$

satisfy the large $n$ asymptotics

$$\lambda_n = \frac{\sin(\pi H)\,\Gamma(2H+1)}{(n\pi)^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right) = \frac{\nu_H}{n^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right), \qquad \forall \delta > 0,\ n \gg 1,$$

where $\Gamma$ denotes the usual Euler gamma function.

Sketch of Proof. A simple calculation shows that the above eigenvalue problem is, modulo some boundary terms, the integrated form of the eigenvalue problem considered in the last section which was obtained by a formal Feynman-Kac calculation. The reason that the integrated form arises is obvious: the kernel of the above integral operator is the


covariance of the fractional Brownian motion, while the operator in the previous section was the covariance of the increments. Thus the two are related, modulo some boundary terms, by an integration by parts. The eigenvalue problem in the previous section had $\cos((n+\tfrac12)\pi x)$ as approximate eigenfunctions in the large $n$ limit. This suggests that, because of the integration, we consider the action of the above operator on the basis $\{\sin(n^*x)\}_{n=0}^{\infty}$, where again $n^* = (n+1/2)\pi$. As before the phase $\pi/2$ is not really necessary to prove the result, but is included for convenience, and for agreement with the exact result in the Brownian case $H = 1/2$. In this basis we find that the above operator can be written as a diagonal piece, with diagonal entries $\nu_H n^{-(2H+1)}$, plus an off-diagonal part which can be shown to be relatively compact, from which the above estimate follows. The details are presented in Appendix A.

We note several facts. In the limiting cases $H \to 0$, $H \to 1$ the constant $\nu_H$ vanishes. This is clearly the correct limiting behavior since for $H = 0, 1$ the operator $A$ has rank one, being a projection onto the subspace of constant and linear functions respectively, and there is only one non-zero eigenvalue. In these cases the random process reduces to a random constant and a line with random slope respectively. In the Brownian case, $H = \tfrac12$, the result agrees with the exact result $\lambda_n = \frac{1}{((n+\frac12)\pi)^2}$, though with an error term which is better than indicated above. In fact, as can be seen from the appendix, in the Brownian case the leading order error vanishes. This indicates that the next order correction to the above asymptotics, the analog of the Keller-Maslov index for this problem, differs in scaling between the Brownian case and the non-Brownian case. This is a potentially physically interesting difference between the two cases, which will be discussed in the conclusions. Also note that the above eigenvalue estimates are good for all $H \in (0, 1)$ though it is only $H \in (1/2, 1)$ which is important for the fluid model. The important quantity from the point of view of both the Majda model and the small-ball estimates is the determinant function $D_H(\lambda)$, defined by

$$D_H(\lambda) = \prod_{i=1}^{\infty} (1 + 2\lambda\lambda_i).$$

This quantity is not the same as the Van Vleck determinant discussed in Sect. 2, although it is closely related and has the same leading order asymptotics. From the point of view of large deviations theory $D_H(\lambda)$ is the exponential moment generating function for the $L^2$ norm of the process $B_H(t)$. The exponential moment generating function is, of course, the fundamental object for large deviations calculations. In the Brownian case, $H = 1/2$, the eigenvalues are given by $\lambda_i = (i + 1/2)^{-2}\pi^{-2}$, and $D_{1/2}(\lambda) = \cosh(\sqrt{2\lambda})$. From the definition of $D_H(\lambda)$ and the eigenvalue asymptotics it is clear that the canonical product converges for all $\lambda$ and defines an entire function of order $\frac{1}{2H+1}$ and genus zero [1]. There is a very close connection between the rate of growth of entire functions of fractional order and the distribution of their zeroes. A result of Polya [29] states that, if $f(\lambda)$ is an entire function of fractional order, all of whose zeroes are real and negative, then we have the Stieltjes integral formula

$$\log(f(\lambda)) = \lambda\int_0^\infty \frac{n(t)\,dt}{t(\lambda + t)}$$


with $n(t)$ the zero counting function (the number of zeroes of $f$ with modulus less than $t$). Further, if the zero counting function grows like $n(t) \sim \alpha t^{\rho}$, with $\rho \in (0, 1)$, then $f(\lambda)$ has the following asymptotics:

$$\log(f(\lambda)) \sim \frac{\alpha\pi}{\sin(\pi\rho)}\,\lambda^{\rho}, \qquad \arg(\lambda) \in (-\pi, \pi).$$

In this case the $n$th zero is located at $-(2\lambda_n)^{-1} \sim -n^{2H+1}/(2\nu_H)$, and the zero counting function $n(t)$ scales like $n(t) \sim (2t\nu_H)^{\frac{1}{2H+1}}$. Thus we have the following estimate on the asymptotics of the log-determinant:

Corollary 1. The asymptotics of the log determinant $D_H(\lambda)$ are given by

$$\log(D_H(\lambda)) = (\nu_H'\lambda)^{\frac{1}{2H+1}} + o\!\left(\lambda^{\frac{4H+4}{(4H+5)(2H+1)}+\delta}\right), \qquad \forall \delta > 0,\ \lambda \gg 1, \tag{11}$$

where the constant $\nu_H'$ is given by

$$\nu_H' = \frac{2\sin(\pi H)\,\Gamma(2H+1)}{\left(\sin\!\left(\frac{\pi}{2H+1}\right)\right)^{2H+1}}. \tag{12}$$
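The two constants can be evaluated directly. The sketch below is an illustrative check, not part of the paper: the function names are hypothetical, and the formulas are the expressions for $\nu_H$ in Theorem 1 and for the constant in Eq. (12). It also compares the Brownian case against the closed form $D_{1/2}(\lambda) = \cosh(\sqrt{2\lambda})$ quoted above, for which the constant in Eq. (12) equals 2.

```python
import math


def nu(hurst):
    """Leading eigenvalue constant: sin(pi H) Gamma(2H+1) / pi^(2H+1)."""
    return (math.sin(math.pi * hurst) * math.gamma(2 * hurst + 1)
            / math.pi ** (2 * hurst + 1))


def nu_prime(hurst):
    """Constant in the log-determinant asymptotics, as in Eq. (12)."""
    rho = 1.0 / (2 * hurst + 1)
    return (2 * math.sin(math.pi * hurst) * math.gamma(2 * hurst + 1)
            / math.sin(math.pi * rho) ** (2 * hurst + 1))


def log_det_brownian(lam):
    """log cosh(sqrt(2 lam)), written to avoid overflow for large lam."""
    z = math.sqrt(2 * lam)
    return z + math.log1p(math.exp(-2 * z)) - math.log(2)
```

For $H = 1/2$ one finds `nu(0.5)` equal to $1/\pi^2$, and the ratio of `log_det_brownian(lam)` to `math.sqrt(nu_prime(0.5) * lam)` approaches 1 as `lam` grows, in line with Corollary 1.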

For completeness we give an elementary proof of this in the appendix.

4. The Majda Model

As mentioned before the function $D_H(\lambda)$ is the fundamental quantity for understanding the Majda model. It can be shown (see Vanden Eijnden [30]) that if $T(x, y, t)$ is the scalar field satisfying the generalized Majda model:

$$dT + x\,\frac{\partial T}{\partial y}\,dB_H(t) = \Delta T\,dt, \tag{13}$$

$$T(x, y, 0) = \int \exp(2\pi i k y)\,k^{\alpha/2}\,\phi(k)\,dW(k), \qquad \alpha > -1, \tag{14}$$

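then in the large time limit the normalized moments are given by

$$\mu_{2n} = \lim_{t\to\infty}\frac{E(T^{2n}(x,y,t))}{(E(T^{2}(x,y,t)))^{n}} = \frac{(2n)!}{2^n n!}\,\frac{\left(\Gamma\!\left(\frac{\alpha+1}{2}\right)\right)^{n}}{\Gamma\!\left(\frac{n(\alpha+1)}{2}\right)}\,\frac{\displaystyle\int_0^\infty \frac{\lambda^{n\frac{\alpha+1}{2}-1}}{\sqrt{D_H(\lambda)}}\,d\lambda}{\left(\displaystyle\int_0^\infty \frac{\lambda^{\frac{\alpha+1}{2}-1}}{\sqrt{D_H(\lambda)}}\,d\lambda\right)^{n}}, \tag{15}$$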

where $E$ denotes expectation over both the initial data and the fractional Brownian motion. Since the factor $\frac{(2n)!}{2^n n!}\sigma^{2n}$ represents the $2n$th moment of a Gaussian distribution with variance $\sigma^2$ it is the factor

$$\frac{\displaystyle\int_0^\infty \frac{\lambda^{n\frac{\alpha+1}{2}-1}}{\sqrt{D_H(\lambda)}}\,d\lambda}{\Gamma\!\left(\frac{n(\alpha+1)}{2}\right)}$$

which is responsible for the intermittency in the model. Note that since $D_H(\lambda)$ is an entire function of fractional order it grows more slowly than exponential, and thus the


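above ratio grows faster than exponentially with $n$ for large $n$. By a standard Laplace asymptotics calculation we have that

$$\log\int_0^\infty \frac{\lambda^{n\frac{\alpha+1}{2}-1}}{\sqrt{D_H(\lambda)}}\,d\lambda = \log\int_0^\infty e^{\left(\frac{n(\alpha+1)}{2}-1\right)\log(\lambda) - \frac12\log(D_H(\lambda))}\,d\lambda \sim D_H^*(n),$$

where $D_H^*(n)$ is related to $D_H(\lambda)$ by

$$D_H^*(n) = \sup_{\lambda}\left[\left(\frac{n(\alpha+1)}{2} - 1\right)\log(\lambda) - \frac12\log(D_H(\lambda))\right].$$

Note that, under the change of variables $\rho = \log(\lambda)$ it becomes apparent that $D_H^*(n)$ is the Legendre transform of $-\frac12\log(D_H(e^{\rho}))$. Since $\log(D_H(\lambda)) \sim (\nu_H'\lambda)^{\frac{1}{2H+1}}$ we have that the large $n$ behavior of $D_H^*$ is given by

$$D_H^*(n) \sim \frac{(\alpha+1)(2H+1)}{2}\,n\log(n) + n\,\frac{(\alpha+1)}{2}\log\!\left(\frac{(\alpha+1)^{2H+1}(2H+1)^{2H+1}}{e^{2H+1}\,\nu_H'}\right) + o(n)$$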

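from which it follows that the large $n$ asymptotics of the $n$th moment is given by

$$\mu_{2n} \asymp \frac{(2n)!}{2^n n!}\,\frac{\Gamma\!\left(\frac{n(\alpha+1)(2H+1)}{2}\right)}{\Gamma\!\left(\frac{n(\alpha+1)}{2}\right)}\left(\frac{2^{(H+\frac12)(\alpha+1)}}{\sigma\,(\nu_H')^{(\alpha+1)/2}}\right)^{n} \tag{16}$$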

with $\sigma = \int_0^\infty \lambda^{(\alpha-1)/2}(D_H(\lambda))^{-\frac12}\,d\lambda\,/\,\Gamma((\alpha+1)/2)$, and $\nu_H'$ the constant from Corollary 1 in the previous section. From this asymptotic expression for the moments it is straightforward, though tedious, to calculate the asymptotic rate of decay of the probability measure. For details see the paper of Bronski and McLaughlin [5]. Alternatively one could use the expression for the small ball probabilities derived in the next section, along with the arguments of Vanden Eijnden [30]. Either way one is led to the following result:

Corollary 2. The tails of the long time normalized probability distribution for the scalar $T$ in the generalized Majda model satisfy the asymptotics

$$\log(P\{T > X\}) \sim -X^{\frac{2}{1+H(\alpha+1)}}\,\beta^{-\frac{1}{1+H(\alpha+1)}}, \qquad X \gg 1, \tag{17}$$

where the constant $\beta$ is given by

$$\beta = \frac{2\,(\alpha+1)^{H(\alpha+1)}\,(2H+1)^{\frac{(2H+1)(\alpha+1)}{2}}\,2^{\frac{\alpha+1}{2}}}{(1+H(\alpha+1))^{1+H(\alpha+1)}\,\sigma\,(\nu_H')^{(\alpha+1)/2}}. \tag{18}$$
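The tail law of Corollary 2 can be evaluated numerically once $\sigma$ and the constant of Corollary 1 are known. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and `sigma` and `nu_prime` are taken as inputs because $\sigma$ is defined through an integral of $D_H$ that is not computed here. The formula for $\beta$ follows Eq. (18) as read from the text.

```python
import math


def tail_exponent(hurst, alpha):
    """Stretching exponent 2 / (1 + H (alpha + 1)) appearing in Eq. (17)."""
    return 2.0 / (1.0 + hurst * (alpha + 1.0))


def tail_constant(hurst, alpha, sigma, nu_prime):
    """Constant beta of Eq. (18); sigma and nu_prime must be supplied."""
    q = 1.0 + hurst * (alpha + 1.0)
    half = (alpha + 1.0) / 2.0
    numerator = (2.0 * (alpha + 1.0) ** (hurst * (alpha + 1.0))
                 * (2.0 * hurst + 1.0) ** ((2.0 * hurst + 1.0) * half)
                 * 2.0 ** half)
    denominator = q ** q * sigma * nu_prime ** half
    return numerator / denominator


def log_tail(x, hurst, alpha, sigma, nu_prime):
    """Leading-order log P{T > X} for large X, following Eq. (17)."""
    q = 1.0 + hurst * (alpha + 1.0)
    beta = tail_constant(hurst, alpha, sigma, nu_prime)
    return -(x ** (2.0 / q)) * beta ** (-1.0 / q)
```

In the Brownian case `tail_exponent(0.5, alpha)` reduces to $4/(3+\alpha)$, the stretched exponential exponent of the original Majda model.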


5. Applications to Small L2 Ball Estimates for Fractional Brownian Motions

There are a number of results in probability theory that fall under the rubric of small ball estimates [13, 14, 31]. Basically these are estimates of the probability that some random process $B(t)$ will lie inside a ball of radius $\epsilon$ in some given norm $\|\cdot\|_X$. These estimates are often important for statistical mechanical models. For instance, the question raised by Sinai [28] and answered by Molchan [22, 23] regarding the large time behavior of the supremum of a fractional Brownian motion is an example of a small ball problem. Generally these small ball probabilities vanish faster than algebraically as $\epsilon$ approaches zero. For instance, the paper of Li and Shao [14] gives the following estimate for the small ball probabilities of the fractional Brownian motion under $L^p$ norms:

$$-c\,\epsilon^{-\frac{1}{Hp}} \le \log P\left\{\int_0^1 |B_H(t)|^p\,dt \le \epsilon\right\} \le -C\,\epsilon^{-\frac{1}{Hp}} \tag{19}$$

for some positive constants $c$, $C$. It was this result (with $p = 2$) that Vanden Eijnden used in his original analysis of the generalized Majda model. For the $L^2$ case the asymptotics for the determinant function $D_H(\lambda)$ which were proven in Sect. 3 provide a simple way to calculate the best constant in the above estimate. Specifically we have the following asymptotics:

Corollary 3. The fractional Brownian motion satisfies the following small $L^2$ ball estimate:

$$\log(P\{\|B_H(t)\|_2^2 \le \epsilon\}) = -\frac{H}{2H+1}\left(\frac{\sin(\pi H)\,\Gamma(2H+1)}{(2H+1)\left(\sin\!\left(\frac{\pi}{2H+1}\right)\right)^{2H+1}}\right)^{\frac{1}{2H}} \epsilon^{-\frac{1}{2H}} + o\!\left(\epsilon^{-\frac{1}{2H}}\right), \qquad \epsilon \ll 1. \tag{20}$$

Proof. The proof of this result is a straightforward calculation. Because the fractional Brownian motion is a Gaussian process we have the identity

$$E\left[\exp\left(-\lambda\int_0^1 |B_H(t)|^2\,dt\right)\right] = \frac{1}{\sqrt{D_H(\lambda)}} \tag{21}$$

with $D_H(\lambda)$ the determinant function studied earlier. If we let $\eta$ be the $\mathbb{R}$-valued random variable given by $\eta = \int |B_H(t)|^2\,dt$, with probability measure $\mu(d\eta)$, we have that

$$\int \exp(-\lambda\eta)\,\mu(d\eta) = \mathcal{L}(\mu)(\lambda) = \frac{1}{\sqrt{D_H(\lambda)}}, \tag{22}$$

so that $\frac{1}{\sqrt{D_H(\lambda)}}$ is the Laplace transform of the probability measure, and the probability that the motion lies in a ball of radius $x$ is given by an inverse Laplace transform

$$P\{\eta \le x\} = \mathcal{L}^{-1}\left(\frac{1}{\lambda\sqrt{D_H(\lambda)}}\right)(x). \tag{23}$$

When $x \ll 1$ the asymptotics of this probability can be calculated by a straightforward large deviations calculation, with rate function given by the Legendre transform of $-\log(D_H(\lambda))/2$. In particular it follows from de Bruijn's exponential Tauberian theorem [4, 15] that

$$\log(P\{\eta \le x\}) \sim D_H^*(x), \qquad x \ll 1, \tag{24}$$


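where $D_H^*(x)$ is given by

$$D_H^*(x) = \inf_{\lambda}\left[x\lambda - \frac12\log(D_H(\lambda))\right].$$

We have shown that the moment generating function has the asymptotic behavior

$$\frac12\log(D_H(\lambda)) = \frac12(\nu_H'\lambda)^{\frac{1}{2H+1}} + o\!\left(\lambda^{\frac{4H+4}{(4H+5)(2H+1)}+\delta}\right), \qquad \forall\delta > 0,\ \lambda \gg 1, \tag{25}$$

with

$$\nu_H' = \frac{2\sin(\pi H)\,\Gamma(2H+1)}{\left(\sin\!\left(\frac{\pi}{2H+1}\right)\right)^{2H+1}}, \tag{26}$$

so it follows that the Legendre transform of $\frac12\log(D(\lambda))$, and thus the small $L^2$ ball probability, satisfy

$$\log(P\{\|B_H\|_2^2 \le \epsilon\}) = -H\left(\frac{\sin(\pi H)\,\Gamma(2H+1)}{\left((2H+1)\sin\!\left(\frac{\pi}{2H+1}\right)\right)^{2H+1}}\right)^{\frac{1}{2H}}\epsilon^{-\frac{1}{2H}} + o\!\left(\epsilon^{-\frac{4H+4}{(4H+5)2H}+\delta}\right), \qquad \forall\delta > 0,\ \epsilon \ll 1. \tag{27}$$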
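The constant multiplying $\epsilon^{-1/(2H)}$ in Eq. (27) is easy to tabulate. The following sketch (a hypothetical helper, not code from the paper) evaluates it for a given Hurst exponent.

```python
import math


def small_ball_constant(hurst):
    """Constant c_H with log P{||B_H||_2^2 <= eps} ~ -c_H * eps**(-1/(2H))."""
    base = (2 * hurst + 1) * math.sin(math.pi / (2 * hurst + 1))
    ratio = (math.sin(math.pi * hurst) * math.gamma(2 * hurst + 1)
             / base ** (2 * hurst + 1))
    return hurst * ratio ** (1.0 / (2 * hurst))
```

For `hurst = 0.5` the function returns 0.125, the classical Brownian value $1/8$.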

We make several remarks. First, note that in the Brownian case $H = \tfrac12$ the above constant reduces to the familiar value 1/8. Secondly, we note that the error bounds derived for the asymptotics of the eigenvalues are not sufficient to conclude a stronger statement like $P\{\|B_H\|_2^2 \le \epsilon\} \sim C\exp(-m\,\epsilon^{-\frac{1}{2H}})$. Finally we note that the above estimates extend in a straightforward way to integrals of fractional Brownian motions (i.e. $H > 1$). It should also be straightforward to extend to the case of Sobolev norms $\|\cdot\|_{H^s}$ for small enough $s$, since the leading order diagonal part of the operator commutes with differentiation. This result is closely related to a result on small ball estimates for integrals of Brownian motions which appears in the recent work of Li and Shao [15]. Since the fractional Brownian motion can be thought of as a fractional integral of a Brownian motion it is interesting to compare Eq. (3.14) in the work of Li and Shao to the above estimate, with $m$ in the Li/Shao result corresponding to $H + \tfrac12$ in the above, $\epsilon^2$ in the Li/Shao result equal to $\epsilon$ above, and $m!X_m$ in the Li/Shao result equal to $B_H$ above. We emphasize, however, that the Li/Shao result is only valid for $m$ integer ($H$ a half-integer) and one cannot simply take the integer index $m$ in the Li/Shao paper to be a fraction, since there is the additional factor of $|\sin(\pi H)|^{\frac{1}{2H}}$ (which, of course, reduces to 1 when $H$ is a half-integer).

6. Conclusions

In this paper we have calculated detailed asymptotic behavior for the probability distribution for the Majda model of passive scalar transport by computing the leading order asymptotics of an associated eigenvalue problem. It seems clear that one could compute more detailed information by computing the next order asymptotics of the Karhunen-Loeve eigenvalues. One reason for doing so would be to establish the following, potentially


physically interesting fact which is suggested by the current analysis. In the Brownian case, the original Majda model, the decay of the tails of the limiting probability distribution is, to within logarithmic terms, purely a "stretched exponential":

$$-\log(P\{T > X\}) \sim CX^{\frac{4}{3+\alpha}} + O(\log(X)).$$

This basically follows from the fact that the eigenvalues satisfy $\lambda_n \sim (n\pi)^{-2} + O(n^{-3})$. The present analysis suggests that, in the non-Brownian case, the eigenvalue asymptotics behave like $\lambda_n \sim (n\pi)^{-(2H+1)} + O(n^{-\beta})$ with $\beta$ strictly less than $(2H+2)$ for $H \neq 1/2$, since this is the scaling predicted by the leading order error. It is only in the case where $H = 1/2$ that the leading order error happens to vanish. This would imply that the next order correction to the tails of the probability distribution is an algebraic term of lower order,

$$-\log(P\{T > X\}) \sim CX^{\frac{2}{1+H(\alpha+1)}} + O(X^{\gamma}),$$

which is potentially physically interesting. It is far from clear, however, that the error bounds derived in this paper are the best possible, so this point requires further study. The same calculation also establishes the best constants for the small $L^2$ ball estimate for fractional Brownian motion. The same basic technique should work for establishing the best constant for Sobolev norms, $H^{\alpha}$ for $\alpha < H$, since the leading order part of the operator $A$ commutes with the second derivative. Unfortunately the present technique does not seem to shed any light on other norms such as the $L^p$ norms, which require more detailed information than eigenvalues alone contain.

A. Calculation of the Eigenvalue Asymptotics for the fBM Kernel

In this section we prove the following theorem, which was stated in Sect. 3.

Theorem 1. The Karhunen-Loeve eigenvalues of fractional Brownian motion with Hurst exponent $H \in (0, 1)$,

$$\lambda_n f_n = \frac{1}{2}\int_0^1 \left(x^{2H} + y^{2H} - |x-y|^{2H}\right) f_n(y)\,dy = A f_n,$$

satisfy the large $n$ asymptotics

$$\lambda_n = \frac{\sin(\pi H)\,\Gamma(2H+1)}{(n\pi)^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right) = \frac{\nu_H}{n^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right), \qquad \forall\delta > 0,\ n \gg 1, \tag{28}$$

where $\Gamma$ denotes the usual Euler gamma function.


Proof. The formal asymptotics calculation presented in Sect. 2 suggests that in the large $n$ limit the eigenvalues and eigenfunctions are given approximately by

$$\lambda_n \sim \frac{\nu_H}{n^{2H+1}}, \quad n \gg 1, \qquad \phi_n(x) \sim \sqrt{2}\sin\!\left(\left(n+\tfrac12\right)\pi x\right), \quad n \gg 1. \tag{29}$$

Note that the basis above diagonalizes the problem in the Brownian case $H = \tfrac12$. We prove the rigorous estimates of the eigenvalues by considering the operator in the orthonormal basis $\{\sqrt{2}\sin((n+\tfrac12)\pi x)\}_{n=0}^{\infty}$. In this basis the covariance operator $A$, now considered as an operator from $l_2$ to $l_2$, has matrix elements

$$A_{n,m} = \langle\phi_n, A\phi_m\rangle = \int_0^1\!\int_0^1 \left(x^{2H} + y^{2H} - |x-y|^{2H}\right)\sin(n^*x)\sin(m^*y)\,dx\,dy, \tag{30}$$

where $n^* = (n+\tfrac12)\pi$ and likewise $m^*$. The idea is to show that in this basis the operator can be written as a diagonal piece, with diagonal entries $D_{i,i}$ given by the first term in the asymptotic expression above, and an off-diagonal piece which is relatively compact, from which the result follows easily. We show the result here for $H \in (\tfrac12, 1)$, which is the case relevant for the Majda model. The result for $H \in (0, \tfrac12)$ follows with only slight modifications. It can be shown (see Appendix B) that $A$ can be written as a leading order diagonal piece plus a higher order off-diagonal piece, $A = D + O$, where the diagonal and off-diagonal entries are given by

$$\begin{aligned}
D_{n,m} &= \left(\frac{\sin(\pi H)\,\Gamma(2H+1)}{(n^*)^{2H+1}} + O(n^{-2(H+1)})\right)\delta_{n,m}, \\
O_{n,m} &= \frac{\cos(\pi H)\,\Gamma(2H+1)}{n^*m^*(n^*-m^*)}\left(\frac{1}{(n^*)^{2H-1}} - \frac{1}{(m^*)^{2H-1}}\right) + O\!\left(\frac{1}{(n^*)^2(m^*)^2}\right), && m+n \text{ even}, \\
&= \frac{\cos(\pi H)\,\Gamma(2H+1)}{n^*m^*(n^*+m^*)}\left(\frac{1}{(n^*)^{2H-1}} + \frac{1}{(m^*)^{2H-1}}\right) + O\!\left(\frac{1}{(n^*)^2(m^*)^2}\right), && m+n \text{ odd},
\end{aligned} \tag{31}$$

respectively. Note the leading order contribution to the off-diagonal term vanishes for $H = \tfrac12$, the Brownian case. This is to be expected, since the basis is exactly the eigenbasis for the Brownian case. We would like to show that the off-diagonal part of the operator is relatively compact to the diagonal part, and thus that eigenvalues of the off-diagonal part of the operator go to zero faster than the eigenvalues of the diagonal part. This, together with the mini-max principle for compact self-adjoint operators, implies that the eigenvalues of the full operator are asymptotically those of the diagonal piece, with error term as above. Since the operator $D$ is self-adjoint and positive we can define fractional powers, and thus write $O$ in the form

$$O = D^{\beta}D^{-\beta}\,O\,D^{-\beta}D^{\beta} = D^{\beta}\,\tilde{O}\,D^{\beta}.$$

The off-diagonal terms $O_{m,n}$ decay like $Cn^{-2}m^{-2H}$ for $n > m \gg 1$, while the (diagonal) elements of $D^{-\beta}$ grow like $n^{\beta(2H+1)}$, and thus the matrix elements $\tilde{O}_{n,m}$ decay


like $n^{(2H+1)\beta-2}\,m^{(2H+1)\beta-2H}$. It follows that $\tilde{O}$ is Hilbert-Schmidt (and thus compact) for $\beta \in (0, \tfrac12)$. In the borderline case $\beta = \tfrac12$ the Hilbert-Schmidt norm of $\tilde{O}$ diverges logarithmically. Taking $\beta = \tfrac12 - \delta$ we have that $\tilde{O}$ is Hilbert-Schmidt, and that the eigenvalues of $\tilde{O}$ are square summable and thus (arranged in order of decreasing magnitude) satisfy

$$|\lambda_n(\tilde{O})| \le Cn^{-\frac12} \tag{32}$$

for some $C$. Finally we need the following two results, which follow as a straightforward consequence of the mini-max principle; for proof see the text of Porter and Stirling [24].

Lemma 1 (Porter and Stirling). If $T$, $K$ are compact and $K$ is self-adjoint then the eigenvalues of $T^{\dagger}KT$ satisfy

$$|\lambda_n(T^{\dagger}KT)| \le \min_{j \in 1\ldots n} |\lambda_j(K)|\,\lambda_{n-j+1}(T^{\dagger}T). \tag{33}$$

Since $\tilde{O}$ and $D^{\beta}$ are compact and self-adjoint, and $\tilde{O}$ is Hilbert-Schmidt, this implies that

$$\begin{aligned}
|\lambda_n(O)| = |\lambda_n(D^{\beta}\tilde{O}D^{\beta})| &\le |\lambda_{n-j}(\tilde{O})|\,|\lambda_j(D^{2\beta})| \\
&\le Cn^{-\frac12}\,n^{-2\beta(2H+1)} \qquad \forall\beta < \tfrac12 \\
&\le Cn^{-(2H+3/2-\delta)} \qquad \forall\delta > 0,
\end{aligned}$$

where the second follows from taking $j$ of the order of $n$. Finally we need to show that the more rapid decay of the eigenvalues of $O$ implies that the eigenvalues of $A$ are asymptotic to those of $D$. This also follows from a second lemma.

Lemma 2 (Porter and Stirling). If $K_1$, $K_2$ are compact and self-adjoint then we have

$$|\lambda_n(K_1 + K_2)| \le \min_{j \in 1\ldots n} |\lambda_{n-j+1}(K_1)| + |\lambda_j(K_2)|. \tag{34}$$

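Taking $K_1 = D$ and $K_2 = O$ and $j = n^{\beta}$ it follows that

$$\begin{aligned}
\lambda_n(A) &\le |\lambda_{n-n^{\beta}}(D)| + |\lambda_{n^{\beta}}(O)|, \\
\lambda_n(A) &\le \frac{\nu_H}{n^{2H+1}} + O(n^{-(2H+2-\beta)}) + o(n^{-(\beta(2H+3/2)-\delta)}) \qquad \forall\delta > 0,
\end{aligned} \tag{35}$$

where the first error term comes from $D$ and the second from $O$. Taking these two errors to be of the same order gives $\beta(2H+3/2) = 2H+2-\beta$, or $\beta = \frac{2H+2}{2H+5/2}$. Thus we have

$$\lambda_n(A) \le \frac{\nu_H}{n^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right) \qquad \forall\delta > 0. \tag{36}$$

Repeating the argument with $K_1 = A$ and $K_2 = -O$ gives

$$\lambda_n(A) \ge \frac{\nu_H}{n^{2H+1}} + o\!\left(n^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right) \qquad \forall\delta > 0 \tag{37}$$

and the result follows.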
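The exponent bookkeeping in the last step can be checked with exact rational arithmetic. The sketch below uses a hypothetical helper name and Python's `fractions` module. It solves $\beta(2H+3/2) = 2H+2-\beta$ and returns the resulting common error exponent, so that the error term is $o(n^{-\text{decay}+\delta})$.

```python
from fractions import Fraction


def balanced_exponents(hurst):
    """Return (beta, decay) for a rational Hurst exponent.

    beta solves beta * (2H + 3/2) = 2H + 2 - beta, and decay is the
    common exponent of the two balanced error terms.
    """
    h = Fraction(hurst)
    beta = (2 * h + 2) / (2 * h + Fraction(5, 2))
    decay = beta * (2 * h + Fraction(3, 2))
    return beta, decay
```

For any rational $H$ the returned `decay` equals $(2H+2)(4H+3)/(4H+5)$ and exceeds $2H+1$ by $1/(4H+5)$, which is why the error term is smaller than the leading term $\nu_H n^{-(2H+1)}$.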


B. Asymptotics of the Matrix Elements

In this section we derive the asymptotic behavior of the matrix elements $A_{n,m}$. The following integral identities, which are easily derived using contour integration, are useful:

$$\begin{aligned}
\int_0^1 \cos(mx)\,x^{a-1}\,dx &= \frac{\Gamma(a)\cos(\frac{\pi}{2}a)}{m^{a}} + O(m^{-1}), && a \in (0,1),\ m \gg 1, \\
\int_0^1 \sin(mx)\,x^{a-1}\,dx &= \frac{\Gamma(a)\sin(\frac{\pi}{2}a)}{m^{a}} + O(m^{-1}), && a \in (0,1),\ m \gg 1.
\end{aligned} \tag{38}$$

Again we are looking at the matrix elements of the operator in the basis $\phi_n = \sqrt{2}\sin((n+\tfrac12)\pi x)$. For convenience we denote $(n+\tfrac12)\pi = n^*$ in the sequel, and likewise $m^*$. Thus we have

$$\begin{aligned}
A_{n,m} &= \int_0^1\!\int_0^1 \left(x^{2H} + y^{2H} - |x-y|^{2H}\right)\sin(n^*x)\sin(m^*y)\,dx\,dy \\
&= \frac{2H(2H-1)}{n^*m^*}\int_0^1\!\int_0^1 |x-y|^{2H-2}\cos(n^*x)\cos(m^*y)\,dx\,dy,
\end{aligned} \tag{39}$$

the second of which follows from integrating by parts in both $x$ and $y$. Notice that the singularity lies along the diagonal $x = y$. Intuition suggests that the leading order contribution to the integral should be from the singularity along the diagonal. In fact this leading order contribution is actually diagonal in $n$, $m$. This is most easily seen by the method used above: we rewrite the integral for the matrix element in the following way:

$$A_{n,m} = \frac{2H(2H-1)}{n^*m^*}\left[\iint_{I_1} |y-x|^{2H-2}\cos(m^*y)\cos(n^*x)\,dx\,dy - \iint_{I_2\cup I_3} |y-x|^{2H-2}\cos(m^*y)\cos(n^*x)\,dx\,dy\right], \tag{40}$$

where $I_{2,3}$ are the triangles denoted in Fig. 1. Using the identity in Eq. (38) it is easy to calculate that the contribution from the integral over the parallelogram $I_1$ is

$$\int_0^1\!\int_{-1+y}^{1+y} |y-x|^{2H-2}\cos(n^*x)\cos(m^*y)\,dx\,dy = \left(\frac{\sin(\pi H)\,\Gamma(2H-1)}{(n^*)^{2H-1}} + O(n^{-2H})\right)\delta_{n,m}. \tag{41}$$

We now proceed to estimate the contributions from regions $I_{2,3}$. It is intuitively clear that the contributions from these regions should be, in some sense, smaller, since the integrands are singular only at the points $(0, 0)$ and $(1, 1)$, rather than along a line (see Fig. 1). To estimate this contribution we first note that the substitution $x \to 1-x$, $y \to 1-y$ maps the region $I_3$ onto the region $I_2$, and the integral over regions $I_2 \cup I_3$ can be expressed as

$$\int_0^1\!\int_{-1+y}^{0} |x-y|^{2H-2}\left(\cos(n^*x)\cos(m^*y) + (-1)^{m+n}\sin(n^*x)\sin(m^*y)\right)dx\,dy. \tag{42}$$


[Figure 1: schematic showing the unit square, with corners (0,0) and (1,1), written as the region I1 minus the regions I2 and I3.]

Fig. 1. The regions of integration for calculating the matrix elements $A_{n,m}$. The singularity of the kernel lies along the line $x = y$ denoted in dashed bold

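The integral over this region simplifies greatly under the change of variables $u = x + y$, $v = x - y$, giving the result

$$\begin{aligned}
&\iint_{I_2\cup I_3} |x-y|^{2H-2}\left(\cos(n^*x)\cos(m^*y) + (-1)^{m+n}\sin(n^*x)\sin(m^*y)\right) \\
&\quad = \frac12\int_0^1\!\int_{-v}^{v} |v|^{2H-2}\left[\cos\!\left(\frac{n^*}{2}(u+v)\right)\cos\!\left(\frac{m^*}{2}(u-v)\right) + (-1)^{n+m}\sin\!\left(\frac{n^*}{2}(u+v)\right)\sin\!\left(\frac{m^*}{2}(u-v)\right)\right]du\,dv \\
&\quad = \int_0^1 |v|^{2H-2}\,\frac{\sin(n^*v) - (-1)^{n+m}\sin(m^*v)}{n^* - (-1)^{n+m}m^*}\,dv.
\end{aligned} \tag{43}$$

From the identity in Eq. (38) it follows that

$$\frac{2H(2H-1)}{n^*m^*}\iint_{I_2} |x-y|^{2H-2}\left(\cos(n^*x)\cos(m^*y) \pm \sin(n^*x)\sin(m^*y)\right) = -\frac{\Gamma(2H+1)\cos(\pi H)}{n^*m^*(n^*\mp m^*)}\left(\frac{1}{(n^*)^{2H-1}} \mp \frac{1}{(m^*)^{2H-1}}\right) + O\!\left(\frac{1}{(n^*)^2(m^*)^2}\right), \tag{44}$$

and thus that

$$A_{n,m} = C_n\,\delta_{n,m} + O_{n,m} + E_{m,n}. \tag{45}$$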

C. Asymptotics of the Log Determinant

In this section we present an elementary proof of the asymptotics of the log determinant which were derived in Sect. 4 using the entire function result of Polya.

Corollary 1. The log asymptotics of the determinant are given by

$$\log(D_H(\lambda)) = (\nu_H'\lambda)^{\frac{1}{2H+1}} + o\!\left(\lambda^{\frac{4H+4}{(4H+5)(2H+1)}+\delta}\right), \qquad \forall\delta > 0,\ \lambda \gg 1, \tag{46}$$

where the constant $\nu_H'$ is given by

$$\nu_H' = \frac{2\sin(\pi H)\,\Gamma(2H+1)}{\left(\sin\!\left(\frac{\pi}{2H+1}\right)\right)^{2H+1}}. \tag{47}$$


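Proof. First we consider the product

$$\tilde{D}_H(\lambda) = \prod_{i=1}^{\infty}\left(1 + \frac{2\lambda\nu_H}{i^{2H+1}}\right), \tag{48}$$

where the exact eigenvalues have been replaced by their asymptotic values. Obviously the log determinant satisfies

$$\log(\tilde{D}_H(\lambda)) = \sum_{i=1}^{\infty}\log\left(1 + \frac{2\lambda\nu_H}{i^{2H+1}}\right). \tag{49}$$

If we define $M^{2H+1} = 2\nu_H\lambda$ we have the expression

$$\log(\tilde{D}_H(\lambda)) = \sum_{i=1}^{\infty}\log\left(1 + \left(\frac{M}{i}\right)^{2H+1}\right) = M\sum_{i=1}^{\infty}\frac{1}{M}\log\left(1 + \left(\frac{M}{i}\right)^{2H+1}\right). \tag{50}$$

The latter sum is obviously the Riemann sum for the integral

$$\int_0^\infty \log\left(1 + \frac{1}{x^{2H+1}}\right)dx. \tag{51}$$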

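Since the integrand is monotonic, standard arguments show that the difference between the integral and the Riemann sum can be controlled as follows:

$$M\sum_{n=1}^{\infty}\frac{1}{M}\log\left(1 + \left(\frac{M}{n}\right)^{2H+1}\right) = M\int_0^\infty \log\left(1 + \frac{1}{x^{2H+1}}\right)dx + O(\log(M)).$$

The above integral is explicitly computable using the following Fubini argument:

$$\begin{aligned}
\int_0^\infty \log\left(1 + \frac{1}{x^{2H+1}}\right)dx &= \int_0^\infty\!\int_0^1 \frac{1}{y + x^{2H+1}}\,dy\,dx \\
&= \int_0^1\!\int_0^\infty \frac{1}{y + x^{2H+1}}\,dx\,dy \\
&= \int_0^1 y^{-\frac{2H}{2H+1}}\int_0^\infty \frac{1}{1 + x^{2H+1}}\,dx\,dy \\
&= \frac{\pi}{(2H+1)\sin\!\left(\frac{\pi}{2H+1}\right)}\int_0^1 y^{-\frac{2H}{2H+1}}\,dy \\
&= \frac{\pi}{\sin\!\left(\frac{\pi}{2H+1}\right)},
\end{aligned}$$

where the identity $\int_0^\infty \frac{dx}{1+x^{\alpha}} = \pi/(\alpha\sin(\pi/\alpha))$ follows from a straightforward contour integral. This implies that

$$\log(\tilde{D}_H(\lambda)) = M\,\frac{\pi}{\sin\!\left(\frac{\pi}{2H+1}\right)} + O(\log(M)) = \frac{\left(2\sin(\pi H)\,\Gamma(2H+1)\right)^{\frac{1}{2H+1}}}{\sin\!\left(\frac{\pi}{2H+1}\right)}\,\lambda^{\frac{1}{2H+1}} + O(\log(\lambda)). \tag{52}$$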
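In the Brownian case $H = 1/2$ the Riemann sum estimate can be checked against a closed form, since the classical product formula gives $\prod_{i\ge1}(1 + M^2/i^2) = \sinh(\pi M)/(\pi M)$. The sketch below (hypothetical function names, and a truncated sum standing in for the infinite one) compares the sum in Eq. (50) with the leading term $M\pi/\sin(\pi/(2H+1))$.

```python
import math


def log_det_model(big_m, hurst, terms=200_000):
    """Truncated sum of log(1 + (M/i)^(2H+1)) over i = 1..terms."""
    power = 2 * hurst + 1
    return sum(math.log1p((big_m / i) ** power) for i in range(1, terms + 1))


def leading_term(big_m, hurst):
    """Leading asymptotic term M * pi / sin(pi / (2H+1))."""
    return big_m * math.pi / math.sin(math.pi / (2 * hurst + 1))
```

For $H = 1/2$ the difference between `leading_term(M, 0.5)` and the exact value $\log(\sinh(\pi M)/(\pi M))$ is $\log(2\pi M)$ up to exponentially small terms, which is consistent with the $O(\log(M))$ error quoted above.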


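It is straightforward to show (see the claim below) that

$$\sum_{i=1}^{\infty}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right) = o\!\left(\lambda^{\frac{4H+4}{(4H+5)(2H+1)}+\delta}\right) \qquad \forall\delta > 0, \tag{53}$$

and by combining these results we have the asymptotics

$$\log(D_H(\lambda)) = \frac{\left(2\sin(\pi H)\,\Gamma(2H+1)\right)^{\frac{1}{2H+1}}}{\sin\!\left(\frac{\pi}{2H+1}\right)}\,\lambda^{\frac{1}{2H+1}} + o\!\left(\lambda^{\frac{4H+4}{(4H+5)(2H+1)}+\delta}\right) \qquad \forall\delta > 0. \tag{54}$$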

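Claim. $\log(D(\lambda))$ and $\log(\tilde{D}(\lambda))$ have the same leading order asymptotics.

Proof. We know from Sect. 3 that the actual eigenvalues and the approximate eigenvalues satisfy the estimate

$$\lambda_i = \tilde{\lambda}_i + o\!\left(i^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\right). \tag{55}$$

We have that

$$\log(D(\lambda)) - \log(\tilde{D}(\lambda)) = \sum_{i}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right). \tag{56}$$

We break the sum up into two sums:

$$\log(D(\lambda)) - \log(\tilde{D}(\lambda)) = \sum_{i}^{N}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right) + \sum_{i=N}^{\infty}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right). \tag{57}$$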

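The first sum can be estimated as follows: we have the elementary inequalities $|\log((1+a)/(1+b))| \le |\log(a/b)|$ $(a, b > 0)$ and $\log(1+a) \le a$ $(a > 0)$. Combining these we have

$$\sum_{i=1}^{N}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right) \le \sum_{i=1}^{N}\left|1 - \frac{\lambda_i}{\tilde{\lambda}_i}\right|. \tag{58}$$

From the eigenvalue estimate in Theorem 1 we have that

$$\left|1 - \frac{\lambda_i}{\tilde{\lambda}_i}\right| \le Ci^{-\frac{1}{4H+5}+\delta} \qquad \forall\delta > 0, \tag{59}$$

and thus it follows that

$$\sum_{i}^{N}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right) \le \sum_{i}^{N}\left|\log\left(\frac{\lambda_i}{\tilde{\lambda}_i}\right)\right| \le CN^{\frac{4H+4}{4H+5}+\delta} \qquad \forall\delta > 0. \tag{60}$$

In the second sum we have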

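$$\begin{aligned}
\sum_{i=N}^{\infty}\log\left(\frac{1 + 2\lambda\lambda_i}{1 + 2\lambda\tilde{\lambda}_i}\right) &\le \sum_{i=N}^{\infty}\frac{\lambda C\,i^{-\left(2H+1+\frac{1}{4H+5}\right)+\delta}}{1 + \lambda C\,i^{-(2H+1)}} \\
&\le \sum_{i=N}^{\infty} C\lambda\,i^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta} \\
&\le C\lambda N^{-\frac{8H^2+10H+1}{4H+5}+\delta}.
\end{aligned} \tag{61}$$

Comparing the error terms for the two pieces of the sum we find that the error is minimized when the two are of the same order of magnitude, which occurs when $N = O(\lambda^{\frac{1}{2H+1}})$. This leads to the estimate

$$\log(D(\lambda)) - \log(\tilde{D}(\lambda)) = o\!\left(\lambda^{\frac{4H+4}{(2H+1)(4H+5)}+\delta}\right) \qquad \forall\delta > 0,\ \lambda \gg 1. \tag{62}$$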

Note that because the error estimates for the difference between the exact and the approximate eigenvalues decay relatively slowly we cannot conclude that $D(\lambda) \sim c\tilde{D}(\lambda)$, as would be the case if $\sum_i \log(\lambda_i/\tilde{\lambda}_i)$ were convergent.

Acknowledgements. The author would like to recognize NSF support under grants DMS-9972869/0203938, as well as a Sloan Foundation fellowship. The author would also like to thank Eric Vanden Eijnden, Richard Sowers and Richard McLaughlin for useful and interesting conversations.

References

1. Ahlfors, L.V.: Complex Analysis: An Introduction to the Theory of Analytic Functions of One Complex Variable, 3rd edn. New York: McGraw-Hill, 1979
2. Ash, R.B., Gardner, M.F.: Topics in Stochastic Processes. New York: Academic Press, 1975
3. Bernard, D., Gawędzki, K., Kupiainen, A.: Slow modes in passive advection. J. Stat. Phys. 90(3/4), 519–569 (1998)
4. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge: Cambridge University Press, 1987
5. Bronski, J.C., McLaughlin, R.M.: Rigorous estimates of the tails of the probability distribution function for the random linear shear model. J. Stat. Phys. 98(3/4), 897–915 (2000)
6. Donsker, M.D., Varadhan, S.R.S.: Asymptotic Evaluation of certain Wiener integrals for large time. In: Functional Integration and its Applications. Oxford: Oxford University Press, 1975
7. Eyink, G.L., Xin, J.: Self-similar decay in the Kraichnan model of a passive scalar. J. Stat. Phys. 100, 679–741 (2000)
8. Falkovich, G., Gawędzki, K., Vergassola, M.: Particles and fields in fluid turbulence. Rev. Mod. Phys. 73, 913–975 (2001)
9. Fannjiang, A., Komorowski, T.: Fractional Brownian motions and enhanced diffusion in a unidirectional wave-like turbulence. J. Stat. Phys. 100, 1071–1095 (2000)
10. Feynman, R., Hibbs, A.: Quantum Mechanics and Path Integrals. New York: McGraw-Hill, 1965
11. Isichenko, M.: Percolation, statistical topography, and transport in random media. Rev. Mod. Phys. 64, 961–1043 (1992)
12. Juneja, A., Lathrop, D.P., Sreenivasan, K.R., Stolovitzky, G.: Synthetic Turbulence. Phys. Rev. E 49, 5179–5194 (1994)
13. Li, W.V., Linde, W.: C. R. Acad. Sci. Paris Sér. I Math 326, 1329–1334 (1998)
14. Li, W.V., Shao, Q.M.: Small ball estimates for Gaussian processes under Sobolev type norms. J. Theoret. Prob. 12, 699–720 (1999)
15. Li, W.V., Shao, Q.M.: To Appear: Stochastic Processes: Theory and Methods, Handbook of Statistics, 19
16. Majda, A.J.: The random uniform shear layer: An explicit example of turbulent diffusion with broad tail probability distributions. Phys. Fluids A 5, 1963–1970 (1993)
17. Majda, A.J.: Explicit inertial range renormalization theory in a model for turbulent diffusion. J. Stat. Phys. 73, 515–542 (1993)
18. Majda, A., Kramer, P.: Simplified models for turbulent diffusion: Theory, numerical modelling, and physical phenomena. Phys. Rep. 314, 237–574 (1999)
19. Mandelbrot, B.B., Van Ness, J.W.: Fractional Brownian Motions, Fractional Noises, and Applications. SIAM Rev. 10, 422–437 (1968)
20. McLaughlin, R.M., Majda, A.J.: An explicit example with non-Gaussian probability distribution for nontrivial scalar mean and fluctuation. Phys. Fluids 8, 536 (1996)
21. McLaughlin, R.M.: Turbulent Diffusion. Ph.D. thesis, Program in Applied and Computational Mathematics, Princeton University, 1994
22. Molchan, G.M.: Maximum of a Fractional Brownian Motion: Probabilities of Small Values. Commun. Math. Phys. 205, 97–111 (1999)
23. Molchan, G.M.: Burgers equation with self-similar Gaussian initial data: Tail probabilities. J. Stat. Phys. 88, 1139–1150 (1997)
24. Porter, D., Stirling, D.S.G.: Integral Equations. Cambridge: Cambridge University Press, 1990
25. Pumir, A.: A numerical study of the mixing of a passive scalar in three dimensions in the presence of a mean gradient. Phys. Fluids 6, 2118–2132 (1994)
26. Pumir, A., Shraiman, B., Siggia, E.: Exponential tails and random advection. Phys. Rev. Lett. 66, 2984 (1991)
27. Reade, R.: The Statistics of Burgers Turbulence Initialized with Fractional Brownian Noise Data. Commun. Math. Phys. 191, 71–86 (1998)
28. Sinai, Ya.G.: The Statistics of Shocks in Solutions of Inviscid Burgers equation. Commun. Math. Phys. 148, 601–621 (1992)
29. Titchmarsh, E.C.: The Theory of Functions. Oxford: Oxford University Press, 1986
30. Vanden Eijnden, E.: Non-Gaussian invariant measures for the Majda model of decaying turbulent transport. Comm. Pure Appl. Math. 54, 1146–1167 (2001)
31. Wang, Y.: Small Ball Problem via Wavelets for Gaussian Processes. Stat. Prob. Lett. 32, 133–139 (1997)

Communicated by A. Kupiainen